How Vulnerable is our Interlinked Infrastructure?
Critical Infrastructure: Interlinked and Vulnerable
Computers and communications are boosting performance, but interconnection increases the risk of a technological domino effect.
The infrastructure of the United States-the foundations on which the nation is built-is a complex system of interrelated elements. Those elements-transportation, electric power, financial institutions, communications systems, and oil and gas supply-reach into every aspect of society. Some are so critical that if they were incapacitated or destroyed, an entire region, if not the nation itself, could be debilitated. Continued operation of these systems is vital to the security and well-being of the country.
Once these systems were fairly independent. Today they are increasingly linked and automated, and the advances enabling them to function in this manner have created new vulnerabilities. What in the past would have been an isolated failure caused by human error, malicious deeds, equipment malfunction, or the weather, could today result in widespread disruption.
Among certain elements of the infrastructure (for example, the telecommunications and financial networks), the degree of interdependency is especially strong. But they all depend upon each other to varying degrees. We can no longer regard these complex operating systems as independent entities. Together they form a vast, vital-and vulnerable-system of systems.
The elements of infrastructure themselves are vulnerable to physical and electronic disruptions, and a dysfunction in any one may produce consequences in the others. Some recent examples:
- The western states power outage of 1996. One small predictable accident of nature-a power line shorting after it sagged onto a tree-cascaded into massive unforeseen consequences: a power-grid collapse that persisted for six hours and very nearly brought down telecommunications networks as well. The system was unable to respond quickly enough to prevent the regional blackout, and it is not clear whether measures have been taken to prevent another such event.
- The Northridge, California, earthquake of January 1994 affecting Los Angeles. First-response emergency personnel were unable to communicate effectively because private citizens were using cell phones so extensively that they paralyzed emergency communications.
- Two major failures of AT&T communications systems in New York in 1991. The first, in January, created numerous problems, including airline flight delays of several hours, and was caused by a severed high-capacity telephone cable. The second, in September, disrupted long distance calls, caused financial markets to close and planes to be grounded, and was caused by a faulty communications switch.
- The satellite malfunction of May 1998. A communications satellite lost track of Earth and cut off service to nearly 90 percent of the nation's approximately 45 million pagers, which affected not only ordinary business transactions but also physicians, law enforcement officials, and others who provide vital services. It took nearly a week to restore the system.
Failures such as these have many harmful consequences. Some are obvious, but others are subtle-for example, the loss of public confidence that results when people are unable to reach a physician, call the police, contact family members in an emergency, or use an ATM to get cash.
The frequency of such incidents and the severity of their impact are increasing, in part because of vulnerabilities that exist in the nation's information infrastructure. John Deutch, then director of the CIA, told Congress in 1996 that he ranked information warfare as the second most serious threat to U.S. national security, just below weapons of mass destruction in terrorist hands. Accounts of hacking into the Pentagon's computers and breakdowns of satellite communications have been reported in the press. These incidents suggest wider implications for similar systems.
Two major issues confront the nation as we consider how best to protect critical elements of the infrastructure. The first is the need to define the roles of the public and private sectors and to develop a plan for sharing responsibility between them. The second is the need to understand how each system in the infrastructure functions and how it affects the others so that its interdependencies can be studied. Both issues involve a multitude of considerations.
In 1996, the President's Commission on Critical Infrastructure Protection was established. It included officials concerned with the operation and protection of the nation and involved in energy, defense, commerce, the CIA, and the FBI, as well as 15 people from the private sector. The commission conducted a 15-month study of how each element of the infrastructure operates, how it might be vulnerable to failures, and how it might affect the others. Among its conclusions: 1) the infrastructure is at serious risk, and the capability to do harm is readily available; 2) there is no warning system to protect the infrastructure from a concerted attack; 3) government and industry do not efficiently share information that might give warning of an electronic attack; and 4) federal R&D budgets do not include the study of threats to the component systems in the infrastructure. (Information on the commission, its members, its tasks, and its goals, as well as the text of the presidential directive, is available on the Web at http://www.pccip.gov.)
A major question that faced the commission, and by implication the nation, is the extent to which the federal government should get involved in infrastructure protection and in establishing an indications and warning system. If the government is not involved, who will ensure that the interdependent systems function with the appropriate reliability for the national interest? There is at present no strategy to protect the interrelated aspects of the national infrastructure; indeed, there is no consensus on how its various elements actually mesh.
We believe that protecting the national infrastructure must be a key element of national security in the next few decades. There is obviously an urgent and growing need for a way to detect and warn of impending attacks on, and system failures within, critical elements of the national infrastructure. If we do not develop such an indications and warning capability, we will be exposed and easily threatened.
The presidential commission's recommendations also resulted in the issuance on May 22, 1998, of Presidential Decision Directive 63 (PDD 63) on Critical Infrastructure Protection. PDD 63 establishes lines of responsibility within the federal government for protecting each of the infrastructure elements and for formulating an R&D strategy for improving the surety of the infrastructure.
PDD 63 has already triggered infrastructure-protection efforts by all federal agencies and departments. For example, not only is the Department of Energy (DOE) taking steps to protect its own critical infrastructure, but it is also developing a plan to protect the key components of the national energy infrastructure. Energy availability is vital to the operations of other systems. DOE will be studying the vulnerabilities of the nation's electric, gas, and oil systems and trying to determine the minimum number of systems that must be able to continue operating under all conditions, as well as the actions needed to guarantee their operation.
Achieving public-private cooperation
A major issue in safeguarding the national infrastructure is the need for public-private cooperation. Private industry owns 85 percent of the national infrastructure, and the country's economic well-being, national defense, and vital functions depend on the reliable operation of these systems.
Private industry's investment in protecting the infrastructure can be justified only from a business perspective. Risk assessments will undoubtedly be performed to compare the cost of options for protection with the cost of the consequences of possible disruptions. For this reason, it is important that industry have all the information it needs to perform its risk assessments. The presidential commission reported that private owners and operators of the infrastructure need more information on threat and vulnerabilities.
Much of the information that industry needs may be available from the federal government, particularly from the law enforcement, defense, and intelligence communities. In addition, many government agencies have developed the technical skills and expertise required to identify, evaluate, and reduce vulnerabilities to electronic and physical threats. This suggests that the first and primary focus of industry-government cooperation should be to share information and techniques related to risk management assessments, including incident reports, identification of weak spots, plans and technology to prevent attacks and disruptions, and plans for how to recover from them.
Sharing information can help lessen damage and speed recovery of services. However, such sharing is difficult for many reasons. Barriers to collaboration include classified and secret materials, proprietary and competitively sensitive information, liability concerns, fear of regulation, and legal restrictions.
There are two cases in which the public and private sectors already share information successfully. The first is the collaboration between the private National Security Telecommunications Advisory Committee and the government's National Communications System. The former comprises the leading U.S. telecommunications companies; the latter is a confederation of 23 federal government entities. The two groups are charged jointly with ensuring the robustness of the national telecommunications grid. They have been working together since 1984 and have developed the trust that allows them to share information about threats, vulnerabilities, operations, and incidents, which improves the overall surety of the telecommunications network. Their example could be followed in other infrastructure areas, such as electric power.
The second example of successful public-private cooperation is the federally run Centers for Disease Control's (CDC's) epidemiological databases. The CDC has over the years developed a system for acquiring medical data to analyze for the public good. The CDC collaborates with state agencies and responsible individuals to obtain information that has national importance. CDC obtains it as anonymous data, thus protecting the privacy of individual patients. The way CDC gathers, analyzes, and reports data involving an enormous number of variables from across the nation is a model for how modern information technology can be applied to fill a social need while minimizing harm to individuals. Especially relevant to information-sharing is the manner in which the CDC is able to eliminate identifiable personal information from databases, a concern when industry is being asked to supply the government with proprietary information.
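The idea of removing identifying details before information is shared can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the field names, the sample incident reports, and the choice of which fields count as identifying are all hypothetical and do not describe the CDC's actual procedures or any existing reporting system.

```python
from collections import Counter

# Hypothetical field names; a real sharing agreement would define its own.
IDENTIFYING_FIELDS = {"company", "site", "contact"}

def anonymize(report):
    """Drop fields that could identify the reporting organization."""
    return {k: v for k, v in report.items() if k not in IDENTIFYING_FIELDS}

def summarize(reports):
    """Count anonymized incidents by (sector, incident type)."""
    return Counter((r["sector"], r["incident_type"]) for r in map(anonymize, reports))

# Hypothetical incident reports, invented for illustration only.
reports = [
    {"company": "Utility X", "site": "Plant 1", "contact": "ops@example.com",
     "sector": "electric power", "incident_type": "intrusion attempt"},
    {"company": "Utility Y", "site": "Plant 7", "contact": "noc@example.com",
     "sector": "electric power", "incident_type": "intrusion attempt"},
    {"company": "Carrier Z", "site": "Switch 3", "contact": "sec@example.com",
     "sector": "telecommunications", "incident_type": "equipment failure"},
]

shared = [anonymize(r) for r in reports]   # what would leave the company
summary = summarize(reports)               # what an analysis center would see
print(shared[0])
print(summary.most_common())
```

Dropping named fields is the simplest possible form of anonymization. In practice, combinations of the remaining fields may still point to a particular organization, so a working system would likely need stronger safeguards than this sketch shows.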
The ultimate goal is to develop a real-time ability to share information on the current status of all systems in the infrastructure. It would permit analysis and assessment to determine whether certain elements were under attack. As the process of risk assessment and development of protection measures proceeds, a national center for analysis of such information should be in place and ready to facilitate cooperation between the private and public sectors. To achieve this goal, a new approach to government-industry partnerships will be needed.
Assessing system adequacy
We use the term "infrastructure surety" to describe the protection and operational assurance that is needed for the nation's critical infrastructure. Surety is a term that has long been associated with complex high-consequence systems, such as nuclear systems, and it encompasses safety, security, reliability, integrity, and authentication, all of which are needed to ensure that systems are working as expected in any situation.
A review of possible analytical approaches to this surety problem suggests the need for what is known as a consequence-based assessment in order to understand and manage critical elements of the systems. It begins by defining the consequences of disruptions and then identifies critical nodes-elements that are so important that severe consequences would result if they could not operate. Finally, it outlines protection mechanisms and the associated costs of protecting those nodes. This approach is used to assess the safety of nuclear power plants, and insurance companies use it in a variety of ways. It permits the costs and benefits of each protection option to be assessed realistically and is particularly attractive in situations in which the threat is difficult to quantify, because it allows the costs of disruptions to be defined independently of what causes the disruptions. Industry can then use these results in assessing risks and in establishing a business case for protecting assets.
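The three steps of a consequence-based assessment (define consequences, identify critical nodes, weigh protection costs) can be sketched in a few lines of Python. This is a minimal illustration, not an actual assessment method: the node names, the dollar figures, and the cutoff that marks a node as critical are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    consequence: float      # estimated cost if the node is lost (hypothetical, $M)
    protection_cost: float  # estimated cost to protect the node (hypothetical, $M)

def critical_nodes(nodes, threshold):
    """Step 2: keep nodes whose loss would cost at least `threshold`, worst first."""
    crit = [n for n in nodes if n.consequence >= threshold]
    return sorted(crit, key=lambda n: n.consequence, reverse=True)

def protection_options(nodes):
    """Step 3: pair each critical node with its consequence-to-protection-cost ratio."""
    return [(n.name, n.consequence / n.protection_cost) for n in nodes]

# Step 1: define the consequences of disruption, independent of what causes it.
nodes = [
    Node("substation A", consequence=400.0, protection_cost=20.0),
    Node("switching center B", consequence=150.0, protection_cost=50.0),
    Node("pipeline pump C", consequence=30.0, protection_cost=5.0),
]

crit = critical_nodes(nodes, threshold=100.0)   # threshold is an arbitrary cutoff
options = protection_options(crit)
for name, ratio in options:
    print(f"{name}: consequence avoided per dollar of protection = {ratio:.1f}")
```

Note that no probability of attack appears anywhere in the sketch. The ranking depends only on what a disruption would cost and what protection would cost, which is why the approach remains usable when the threat itself is hard to quantify.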
One area of particular concern, and one that must be faced in detail with private industry, is the widespread and increasing use of supervisory control and data acquisition systems-networks of information systems that interconnect the business, administrative, safety, and operational sections within an element of the infrastructure. The presidential commission identified these supervisory control systems as needing attention because they control the flow of electricity, oil and gas, and telecommunications throughout the country and are also vulnerable to electronic and physical threats. Because of its long-term involvement with complex and burgeoning computer networks, DOE could work with industry to develop standards and security methods for supervisory control and data acquisition protocols and to develop the means to monitor vital parts in the system.
The need for a warning center
The commission recognized the need for a national indications and warning capability to monitor the critical elements of the national infrastructure and determine when and if they are under attack or are the victim of destructive natural occurrences. It favors surveillance through a national indications and warning center, which would be operated by a new National Infrastructure Protection Center (NIPC). The center would be a follow-on to the Infrastructure Protection Task Force, headed by the FBI and created in 1996. It had representatives from the Justice, Transportation, Energy, Defense, and Treasury Departments, the CIA, FBI, Defense Information Systems Agency, National Security Agency, and National Communications System. The task force was charged with identifying and coordinating existing expertise and capabilities in the government and private sector as they relate to protecting the critical infrastructure from physical and electronic threats. A national center would receive and transmit data across the entire infrastructure, warning of impending attack or failure, providing for physical protection of a vital system or systems, and safeguarding other systems that might be affected. This would include a predictive capability. The center would also allow proprietary industry information to be protected.
Timely warning of attacks and system failures is a difficult technical and organizational challenge. The key remaining questions are 1) which data should be collected to provide the highest probability that impending attacks can be reliably predicted, sensed, and/or indicated to stakeholders? and 2) how can enormous volumes of data be efficiently and rapidly processed?
Securing the national infrastructure depends on understanding the relationships of its various elements. Computer models are an obvious choice to simulate interactions among infrastructure elements. One in particular is proving to be extremely effective for this kind of simulation. It is an approach in which the participants are modeled individually by computer programs called intelligent agents. Each program is designed to represent an entity of some kind, such as a bank, an electrical utility, or a telecommunications company. The agents are allowed to interact. As they do so, they learn from their experience, alter their behavior, and interact differently in subsequent encounters, much as a person or company would do in the real world.
The behavior of the independent systems then becomes apparent. This makes it possible to simulate a large number of possible situations and to analyze their consequences. One way to express the consequences of disruption is by analyzing the economic impact of an outage on a city, a region, and the nation. The agent-based approach can use thousands of agent programs to model very complex systems. In addition, the user can set up hypothetical situations (generally disruptive events, such as outages or hacking incidents) to determine system performance. In fact, the agent-based approach can model the effects of an upset to a system without ever knowing the exact nature of the upset. It offers advantages over traditional techniques for modeling the interdependencies of elements of the infrastructure, because this approach can use rich sources of micro-level data (demographics, for example) to develop forecasts of interactions, instead of using macro-scale information, such as flow models for electricity.
The agent-based approach can exploit the speed, performance, and memory of massively parallel computers to develop computer models as tools for security planning and for counterterrorism measures. It will allow critical nodes to be mapped and it provides a method to quantify the physical and economic consequences of large and small disruptions.
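A toy version of the agent-based idea can be written in Python. It is a sketch under stated assumptions and does not represent ENERGY 2020, ASPEN, or any other existing model: the three agents, their dependencies, the one-hour lag before a failure propagates, and the learning rule (add one hour of backup capacity after each hour lost to a supplier failure) are all hypothetical.

```python
class Agent:
    """One infrastructure entity in a hypothetical toy model."""

    def __init__(self, name, suppliers):
        self.name = name
        self.suppliers = suppliers  # names of the agents this one depends on
        self.up = True
        self.backup = 0             # learned capacity: hours it can run unsupplied
        self.reserve = 0            # backup hours currently available
        self.lost_hours = 0

    def step(self, prev_up, forced):
        if self.name in forced:
            self.up = False                      # directly hit by the disruption
        elif all(prev_up[s] for s in self.suppliers):
            self.up = True
            self.reserve = self.backup           # suppliers are up: recharge backup
        elif self.reserve > 0:
            self.up = True
            self.reserve -= 1                    # ride through on backup
        else:
            self.up = False
            self.backup += 1                     # adapt: invest in more backup
        if not self.up:
            self.lost_hours += 1


def simulate(agents, outage, hours):
    """Run one scenario; `outage` maps an hour to the set of agents forced down.

    Returns the hours each agent lost during this run.
    """
    before = {name: a.lost_hours for name, a in agents.items()}
    for hour in range(hours):
        forced = outage.get(hour, set())
        prev_up = {name: a.up for name, a in agents.items()}  # synchronous update
        for a in agents.values():
            a.step(prev_up, forced)
    return {name: a.lost_hours - before[name] for name, a in agents.items()}


agents = {
    "utility": Agent("utility", []),
    "telecom": Agent("telecom", ["utility"]),
    "bank": Agent("bank", ["utility", "telecom"]),
}
outage = {1: {"utility"}, 2: {"utility"}}  # utility forced down for two hours

first = simulate(agents, outage, hours=6)
second = simulate(agents, outage, hours=6)  # same disruption, after agents adapted
print("first run: ", first)
print("second run:", second)
```

In the first run the utility outage spreads to the telecom and bank agents. In the second run the same outage costs them nothing, because they changed their behavior after the first experience. The disruption itself is specified only as "utility down for two hours", which reflects the point that the effects of an upset can be modeled without knowing its exact cause.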
A few agent-based models are in existence. One example is ENERGY 2020 for the electric power and gas industries. ENERGY 2020 can be combined with a powerful commercially available economic model, such as that of Regional Economic Models, Inc., or with Sandia's ASPEN, which models the banking and finance infrastructure.
In conjunction with these agent-based models, multiregional models encompassing the entire U.S. economy can evaluate regional effects of national policies, events, or other changes. The multiregional approach incorporates key economic interactions among regions and allows for national variables to change as the net result of regional changes. It is based on the premise that national as well as local markets determine regional economic conditions, and it incorporates interactions among these markets. By ignoring regional markets and connections, other methods may not accurately account for regional effects or represent realistic national totals.
At Sandia, we modeled two scenarios, both involving the electricity supply to a major U.S. city. The first assumed a sequence of small disruptions over one year that resulted from destruction of electricity substations servicing a quarter of the metropolitan area. This series of small outages had the long-term effect of increasing labor and operating costs, and thus the cost of electricity, making the area less apt to expand economically and so less attractive to a labor force.
In the second scenario, a single series of short-lived and well-planned explosions destroyed key substations and then critical transmission lines. We timed and sequenced the simulated explosions so that they did significant damage to generating equipment. Subsequent planned damage to transmission facilities exacerbated the problem by making restoration of power more difficult.
Yet our findings were the opposite of what might have been expected. Scenario 1, which was less than half as destructive as scenario 2, was five times more costly, both to business and in the cost of maintaining the supply of electricity. Thus it had a long-lasting and substantial effect on the area. The United States as a whole feels the effects of scenario 1 more than it does those of scenario 2. A series of small disruptions provides a strong signal about the risk of doing business in a geographic area, and companies tend to relocate. With a single disruption, even a large one, economic uncertainty is short-lived and local, and the rest of the country tends to be isolated from the problem. This example gives an idea of what computer simulations can accomplish and the considerations they generate. Validating the simulations will, of course, require additional work. One clear advantage of such simulations is the ability to explore nonintuitive outcomes.
Eventually, it may be possible to combine models to picture the critical national infrastructure in toto. An understanding of the fundamental feedback loops in modeling the national infrastructure is critical to analyzing and predicting the response of the infrastructure to unexpected perturbations. With further development, such computer models could analyze the impact of disruptive events in the infrastructure anywhere in the United States. They could identify critical nodes, assess the susceptibility of all the remaining systems in the infrastructure, and determine cost-effective and timely counter-measures.
These simulations can even determine winners and losers in an event and predict long-term consequences. For example, Sandia's Teraflop computer system could allow such events to be analyzed as they happen to provide the information flow and technical support required for subsequent responses and for long-term considerations, such as remediation and prevention. Such capabilities could be the backbone of a national indications and warning center.
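How a combined model might identify critical nodes can be shown with a small dependency-graph sketch in Python. The dependency map below is hypothetical, and the failure rule (an element stops as soon as any one thing it depends on stops, with no backup) is a deliberately pessimistic assumption chosen to keep the example short.

```python
from collections import deque

# Hypothetical dependency map: each element lists what it needs in order to operate.
DEPENDS_ON = {
    "electric power": ["oil and gas"],
    "oil and gas": [],
    "telecommunications": ["electric power"],
    "finance": ["electric power", "telecommunications"],
    "transportation": ["oil and gas", "telecommunications"],
}

def cascade(failed, depends_on):
    """Return every element that stops once `failed` goes down, itself included."""
    down = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for element, needs in depends_on.items():
            # Assumption: losing any single dependency stops the element.
            if element not in down and current in needs:
                down.add(element)
                queue.append(element)
    return down

# Rank elements by how many others their failure would take down.
impact = {e: len(cascade(e, DEPENDS_ON)) for e in DEPENDS_ON}
most_critical = max(impact, key=impact.get)
for element, count in sorted(impact.items(), key=lambda kv: -kv[1]):
    print(f"{element}: {count} element(s) down")
```

With this particular map, the element whose loss spreads furthest is the one at the bottom of the dependency chain rather than the one with the most direct connections, a small instance of the nonintuitive outcomes that such models are meant to expose.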
The U.S. infrastructure will continue to be reconfigured because of rapid advances in technology and policy. It will change with the numbers of competing providers and in response to an uncertain regulatory and legal framework. Yet surety is easiest to engineer in discrete well-understood systems. Indeed, the exceptional reliability and enviable security of the current infrastructure were achieved in the regulated systems-engineering environment of the past. The future environment of multiple providers, multiple technologies, distributed control, and easy access to hardware and software is fundamentally different. The solutions that will underlie the security of the future infrastructure will be shaped by this different environment and may be expected to differ considerably from the solutions of the past.
Some current policy discussions tend to treat infrastructure surety as an expected product of market forces. Where there are demands for high reliability or high surety, there will be suppliers-at a price. In this view, customers will have an unprecedented ability to protect themselves by buying services that can function as a back-up, demanding services that support individual needs for surety, and choosing proven performers as suppliers.
But the surety of the nation's infrastructure is not guaranteed. We who have long been in national security question the ability of the marketplace to anticipate and address low-probability but high-consequence situations. We are moving from an era in which the surety of the infrastructure was generally predictable and controlled to one in which there are profound uncertainties.
Generally, the private sector cannot expect the market to provide the level of security and resilience that will be required to limit damage from a serious attack on, or escalating breakdown of, the infrastructure or one of its essential systems. The issue of private and public sector rights and responsibilities as they relate to the surety of the infrastructure remains an unresolved part of the national debate.
In the United States, government is both regulator and concerned customer. Essential governmental functions include continuity, emergency services, and military operations, and they all depend on energy, communications, and computers. The government has a clear role in working with private industry to improve the surety of the U.S. infrastructure.
Because of its responsibilities in areas involving state-of-the-art technologies, such as those common to national defense and electrical power systems, DOE is a national leader in high-speed computation, in computer modeling and simulation, and in the science of surety assessment and design. Among its capabilities:
- Computer modeling of the complex interactions among infrastructure systems. Various models of individual elements of the infrastructure have been developed in and outside DOE, although there are currently no models of their interdependencies.
- Risk assessment tools to protect physical assets. In the 1970s, technologies were developed to prevent the theft of nuclear materials transported between DOE facilities. Recently major improvements have been made in modeling and simulating physical protection systems.
- Physical protection for plants and facilities in systems determined to be crucial to the operation of the nation. This involves technology for sensors, entry control, contraband detection, alarms, anti-failure mechanisms, and other devices to protect these systems. Some of the technology and the staff to develop new protection systems are available; the issue is what level of protection is adequate and who will bear the costs.
- Architectural surety, which calls for enhanced safety, reliability, and security of buildings. Sandia is formulating a program that encompasses computational simulation of structural responses to bomb blasts for prediction and includes other elements, such as computer models for fragmentation of window glass, for monitoring instruments, and for stabilization of human health.
- Data collection and surety. DOE already has technical capability to contribute, but what is needed now is to define and acquire the necessary data, develop standards and protocols for data sharing, design systems that protect proprietary data, and develop analytical tools to ensure that rapid and correct decisions will emerge from large volumes of data.
The report by the President's Commission on Critical Infrastructure Protection urged that a number of key actions be started now. In particular, these recommendations require prompt national consideration:
- Establishment of a National Indications and Warning Center, with corrective follow-up coordinated by the National Infrastructure Protection Center.
- Development of systems to model the national critical infrastructure, including consequence-based assessment, probabilistic risk assessment, modeling of interdependencies, and other similar tools to enhance our understanding of how the infrastructure operates. Close cooperation among government agencies and private-sector entities will be vital to the success of this work.
- Development of options for the protection of key physical assets using the best available technology, such as architectural surety and protection of electronic information through encryption and authentication, as developed in such agencies as DOE, the Department of Defense, and the National Aeronautics and Space Administration.
Adequate funding will be needed for these programs. Increasing public awareness of the vulnerability of the systems that the critical national infrastructure comprises and of the related danger to national security and the general welfare of the nation will generate citizen support for increased funding. We believe these issues are critical to the future of the country and deserve to be brought to national attention.
C. Paul Robinson is president of Sandia National Laboratories, Joan B. Woodard is vice president of Sandia's Energy, Information and Infrastructure Technology Division, and Samuel G. Varnado is director of Sandia's Energy and Critical Infrastructure Technology Center.