
IEEE TRANSACTIONS ON RELIABILITY, VOL. 40, NO. 4, 1991 OCTOBER

Using Distributed Topology Update and Preplanned Configurations to Achieve Trunk Network Survivability

Brian A. Coan, Senior Member IEEE, Bellcore, Morristown

Will E. Leland, Member IEEE, Bellcore, Morristown

Mario P. Vecchi, Senior Member IEEE, Bellcore, Morristown

Abel Weinrib, Bellcore, Morristown

Liang T. Wu, Bellcore, Morristown

Key Words - Telephone trunk networks, Network survivability, Distributed fault-tolerant protocols, Preplanned network reconfiguration, Applications of topology update protocols

Reader Aids -
Purpose: Present a new approach for trunk network survivability
Special math needed for explanation: None
Special math needed to use results: None
Results useful to: Telephone network engineers

Abstract - The extensive deployment of high-bandwidth fiber-optic transmission facilities in the telephone trunk network has increased the amount of damage that can be caused by a single physical failure. The increased vulnerability of the trunk network has inspired a wide range of recent work on survivability, which is the problem of improving the ability of the trunk network to provide service after a physical failure. We propose a new modular approach to survivability. It is intended for a trunk network consisting of high-bandwidth fiber-optic links connected through reconfigurable digital cross-connect nodes. It works for both node and link failures. Our approach comprises a distributed protocol with two parts. First, cause the surviving digital cross-connect nodes to converge to an agreement on the topology (i.e., what is up and what is down). Second, based on the agreed topology and on a precomputed plan for that topology, reconfigure the digital cross-connect nodes to restore as much call-carrying capacity as possible. The modularity of our approach comes from separating the problem of devising a distributed fault-tolerant protocol to determine what the failure is from the problem of designing a network reconfiguration for that failure.

1. INTRODUCTION

Two technological advances have had a radical impact on the telephone trunk network. They are the extensive deployment over the past decade of high-bandwidth fiber-optic transmission facilities and the ongoing deployment of reconfigurable digital cross-connect (DCS) nodes, also known as facility switches. The aggregation of traffic, which has been accelerated by the deployment of fiber-optic transmission facilities, has increased the amount of damage that can be caused by a single physical failure. The accidental severing of a single fiber-optic cable can disrupt tens of thousands of telephone calls. Fortunately, the abundant spare capacity of fiber-optic links together with the deployment of reconfigurable DCS nodes greatly increases the range of strategies available to recover from failures. After a fiber-optic cable has been cut or a DCS node fails, the surviving DCS nodes may be used to reconfigure the trunk network, possibly restoring much or all of the lost call-carrying capacity.

Alternative survivability techniques which do not utilize the ability of DCS nodes to reconfigure the trunk network are also possible. One such traditional survivability technique is called 1:1 protection switching with physically diverse routes (PPDR for short). Physical fiber-optic connections are duplicated, with each alternate following a different geographical route. Electronics are provided at each end of a connection to find and use the alternate that is functional. PPDR is primarily a hardware technique, although sophisticated software tools [23] have been developed to design low-cost PPDR networks. Because PPDR-based networks reduce the role of DCS nodes, they may be more difficult than DCS-based networks to rearrange in response to changes in traffic patterns. The final section of this paper reports a comparison of the cost of PPDR with the cost of NETSPAR.

Recently there has been much research on DCS-based survivability techniques. The solutions that have been proposed include completely centralized [7,16] and completely distributed [9,10,11,14,18]. The centralized solutions require a reliable controller that is reliably connected to all of the DCS nodes. After a failure the controller determines what happened (based on messages from the DCS nodes), decides what rerouting should be applied, and implements this decision by down-loading instructions to the DCS nodes. The principal advantage is the simplicity of centralized control. The principal disadvantage is the cost of making the centralized controller and its communication network reliable (although, in some networks, this has already been done for other reasons).

The completely distributed solutions eliminate the need for a centralized controller to manage reconfiguration. In these solutions the response to a failure is that the DCS nodes run a distributed protocol which determines what happened, decides what rerouting should be done, and implements this decision. The messages of the distributed protocol are sent over the same links that carry telephone traffic between DCS nodes. These links may fail. A key technical difficulty that must be overcome in a satisfactory distributed solution is maintaining correct operation despite link failures. The principal advantage of distributed survivability techniques is their elimination of the need for a reliable controller. Their principal disadvantage is the inherent complexity of designing correct distributed fault-tolerant protocols, a notoriously hard problem.

(A preliminary version of this paper was presented at the XIII International Switching Symposium [5].)

In this paper we present a new hybrid survivability technique called NETSPAR. It makes use of a centralized controller for certain network configuration operations. Nevertheless, the response to a failure is completely independent of the controller; it is accomplished by a distributed protocol executed by the DCS nodes. Because NETSPAR removes the centralized controller from the immediate failure response, it may well reduce the reliability requirements (and hence the cost) of the controller and its communication network. Still we preserve the desirable property that the rerouting policy is determined in a centralized location.

NETSPAR is a modular approach to the survivability problem. Specifically, the problem of devising a distributed fault-tolerant protocol to determine what failure has happened is separated from the problem of designing a network reconfiguration for that failure. NETSPAR operates as follows. When a failure occurs the surviving DCS nodes begin by converging to an agreement on the topology (i.e., what is up and what is down). Based on the new topology and on a precomputed plan for that topology, each surviving DCS node loads a new configuration table, in real time establishing a new pattern of cross-connections. NETSPAR builds on known techniques for agreeing on the topology and for designing configuration tables for DCS nodes, synthesizing these techniques into a solution to the problem of improving the survivability of trunk networks. We believe that its modularity, its simplicity, and its ability to build on known fault-tolerant protocols are the main contributions of NETSPAR.

In the literature the problem of causing nodes to agree on the topology is called the topology update problem. Many variants of this problem have been studied [1,15,17,20]. Several have been used in the ARPANET [15,17] as part of its decentralized adaptive routing subsystem. For survivability applications, we give a fast and simple solution to a restricted variant of the topology update problem.

The NETSPAR survivability technique works for any failures that can be detected and classified, including single or multiple link failures and single or multiple node failures. The primary constraints are: the amount of memory required to store configuration tables, the amount of time necessary to compute and down-load the configuration tables, and the amount of spare capacity required in the physical network. For each covered failure, one additional configuration table is required at each DCS node. It would clearly be too expensive to compute and store configuration tables for all possible failures since the number of possible failures is huge. It should, however, be feasible to compute and store configuration tables for all single component failures and for those multiple component failures that are most likely to occur. In the case of failures not included in the precomputed configuration tables, the NETSPAR strategy degrades gracefully: it performs no worse, and usually somewhat better, than having taken no action.

In Section 2 we outline how NETSPAR achieves survivability in DCS-based trunk networks. Then in the following two sections we give technical details of NETSPAR: in Section 3 we discuss the construction of planned reconfigurations for covered failures and in Section 4 we give a distributed fault-tolerant protocol that causes the DCS nodes to converge to the correct view of what failure has happened. Finally in Sections 5, 6, and 7 we analyze the cost and performance of NETSPAR.

2. RECONFIGURABLE TRUNK NETWORKS

In this section we review reconfigurable (DCS-based) trunk networks and sketch how our survivability techniques operate in these networks. We begin by contrasting current reconfigurable trunk networks with traditional non-reconfigurable trunk networks. Then we discuss the normal operation of a reconfigurable trunk network. Finally we discuss the desired failure response of such a network. The technical details necessary to achieve this failure response are presented in subsequent sections.

In the telephone network, central-office switches perform the routing for each individual telephone call. The loop plant connects customers to central-office switches and the trunk network provides the interconnection between central-office switches. The recent deployment of fiber-optic technology has led to fundamental changes in the design of the trunk network. Traditionally, central-office switches have been directly connected by physical trunks, each providing one voice circuit between a pair of switches. With the deployment of high-bandwidth fiber-optic links and reconfigurable DCS nodes, the trunk network has been evolving to a two-layer architecture: the physical layer which consists of DCS nodes and the links that connect adjacent DCS nodes, and the logical layer which consists of logical trunks routed through the physical network. A basic design philosophy of this new architecture is that the change should be completely transparent to the central-office switches. Specifically, from the point of view of the central-office switches, the logical trunks of the new architecture should behave just like the direct physical trunks of the old architecture.

This traditional approach is illustrated in Fig. 1. Central-office switches are drawn as circles. Each of the lines connecting central-office switches indicates 672 voice circuits, or one SONET [2,3] STS1 worth of capacity. The physical layer of the new architecture is illustrated in Fig. 2. Central-office switches are shown as circles and DCS nodes are shown as squares. Fiber-optic links connecting the DCS nodes are shown as groups of parallel lines, with each line representing one STS1 worth of capacity. The logical network is established by the DCS nodes by the way they cross-connect these circuits, creating logical trunks between the central-office switches. (No cross-connections are shown in Fig. 2.) The pattern of cross-connections at each DCS node is stored in a configuration table. A particular logical network is established when a consistent set of configuration tables is loaded at all of the DCS nodes.
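To make the notion of a configuration table concrete, the following sketch (an illustration, not part of the original paper) models the table at one DCS node as a map from an incoming (link, channel) pair to an outgoing (link, channel) pair; the class name, the dictionary layout, and the STS1 channel numbering are assumptions made only for the example.

    # Illustrative sketch only: a configuration table as a cross-connect map at
    # one DCS node.  Names and data layout are assumptions, not the paper's.
    class ConfigurationTable:
        """Maps (incoming link, channel) to (outgoing link, channel) at one DCS node."""

        def __init__(self):
            self.cross_connects = {}   # {(in_link, in_channel): (out_link, out_channel)}

        def connect(self, in_link, in_channel, out_link, out_channel):
            # A cross-connection joins two link/channel endpoints in both directions.
            self.cross_connects[(in_link, in_channel)] = (out_link, out_channel)
            self.cross_connects[(out_link, out_channel)] = (in_link, in_channel)

        def route(self, link, channel):
            # Where does traffic arriving on this (link, channel) leave the node?
            return self.cross_connects.get((link, channel))

    # Example: at a hypothetical DCS node X, pass channel 1 of link "W-X" through
    # to channel 1 of link "X-Z", realizing one STS1 pipe that transits node X.
    table_at_X = ConfigurationTable()
    table_at_X.connect("W-X", 1, "X-Z", 1)
    print(table_at_X.route("W-X", 1))   # -> ("X-Z", 1)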

The granularity of cross-connection of a DCS node is usually larger than a single voice circuit. The smallest unit that can be independently routed through the physical network is of this granularity. Thus, we define a pipe to be a collection of logical trunks that all connect the same pair of central-office switches and all follow the same route through the physical network. A pipe is defined to have capacity equal to the granularity of reconfiguration. For concreteness, throughout this section the size of each pipe is taken to be one STS1, but our techniques are applicable independent of the size of a pipe.

We emphasize that the functionality provided by DCS nodes is quite different from that provided by central-office switches. There are two main differences, each of which makes the DCS unsuitable for processing individual telephone calls. First, DCS nodes do not have access to the signaling and routing information for the individual calls that they carry. Second, as mentioned above, the granularity of reconfiguration by DCS nodes is generally greater than the individual voice circuit.

Given a particular physical network, there are two steps in designing a logical network to lay out over that physical network. The first step, which is outside the scope of this paper, is deciding how many, if any, logical trunks should connect each pair of central-office switches, and expressing this requirement in terms of pipes. A possible product of this step is shown in Table 1, which gives a collection of pipes of STS1 capacity to be implemented in the physical network of Fig. 2. The second step is laying out these pipes on the physical facilities. The layout of a pipe is the complete end-to-end path that the pipe takes through the physical network. In general there are many possible ways of laying out a collection of pipes on a physical network. One way to lay out the pipes from Table 1 on the physical network of Fig. 2 is illustrated in Fig. 3, the contents of each DCS node's configuration table being represented by the pattern of connections drawn in that DCS. From the point of view of the central-office switches this network is the same as the one in Fig. 1.

Figure 1. The Physical Network - Old View

Figure 2. The Physical Network - Current View

TABLE 1
Required Pipes between Switches

        A   B   C   D   E
    A   -   2   1   0   0
    B   2   -   0   1   0
    C   1   0   -   1   1
    D   0   1   1   -   1
    E   0   0   1   1   -

Figure 3. A Layout of the Logical Network

Suppose that we have a failure of a physical link (between two DCS nodes) in our two-layered trunk network. Any pipe that uses the failed physical link is disconnected by the failure. Were the physical link from DCS X to DCS Z to fail, the pipe from switch B to switch D, which is routed over this physical link, would be disconnected. Service could be restored by rerouting the disconnected pipe from failed physical facilities to the spare capacity on surviving facilities, as shown in Fig. 4. As we will see, NETSPAR can achieve the failure response that is illustrated in Fig. 4.


Figure 4. The Reconfigured Network after a Failure

The network in this section is too small to illustrate one important aspect of NETSPAR. Two alternative approaches to reconfiguration are possible: end-to-end and backhauling. In NETSPAR we use the more global approach: end-to-end reconfiguration in which the entire pipe can be relocated. In contrast, many of the completely distributed survivability algorithms use backhauling in which only the portion of the pipe that is actually routed over a failed link is rerouted. The end-to-end approach has the potential to make better use of spare capacity; a quantitative comparison is given in Subsection 5.3.

Figure 5. Proposed Operation of a Trunk Network

In Fig. 5 we give a schematic representation of a way to administer a trunk network that uses the NETSPAR protocol for survivability. Ovals represent modes of operation and rectangles represent events that take the network from one mode of operation to another. In a network that is operating normally, a failure event can happen without warning. Immediately after the failure the network is in a degraded mode with some connections lost. The first DCS node to detect the failure automatically initiates a reconfiguration using NETSPAR, quickly bringing the network to a mode in which some or all of the lost connections have been restored. Automatic reconfiguration is followed by physical repair of the failed components, which may take hours or days. After the physical repairs have been completed the network is manually reconfigured (under the control of the network operations center) to the normal configuration which utilizes all facilities including those that were just repaired. The goal of NETSPAR is to restore as much service as possible in the interval from a failure event to the return to normal operation; it is not a substitute for the repair of failed components.

3. CONFIGURATION TABLES

In this section we explain how to construct a set of configuration tables to be used in conjunction with the reconfiguration protocol given in Section 4 and how to select one of these tables to employ for a given failure. We define a failure to be the set of components that are down. The first step is to choose those failures for which configuration tables will be computed (we call these covered failures). If a covered failure occurs, then NETSPAR will reconfigure the trunk network according to the preplanned configuration table for the specific covered failure; otherwise, NETSPAR will exhibit graceful degradation, selecting a table that performs at least as well as, and usually better than, retaining the no-failure configuration. The decision of which failures should be covered is an engineering tradeoff. The desire for as much failure coverage as possible must be balanced against the cost of the necessary spare facilities deployed in the network and the cost of computing, installing, and storing a large number of configuration tables.

NETSPAR works no matter which collection of failures is covered. The approach presented here for constructing configuration tables can be applied to whatever collection of failures is chosen to be covered. Similarly, the general method described here for selecting one of these tables to load in the event of some failure (not necessarily in the covered set) is always applicable. To illustrate the selection procedure, we present a simple selection function for the specific case where the covered set consists of all single-link failures.

Throughout this section both the logical network (the desired pipes connecting the central-office switches) and the physical network (the topology and capacities of the physical links interconnecting the DCS nodes and connecting the DCS nodes to the central-office switches) are fixed. The configuration-table design problem is to find the set of configuration tables for each covered failure that maximizes the number of pipes carried by the surviving physical network, subject to the constraint that the capacity of each link is fixed. In Section 5 we turn to the related problem of finding the least-cost physical network that guarantees complete recovery from each of the covered failures.

The remainder of this section is organized as follows. In Subsection 3.1 we discuss currently known algorithms for generating configuration tables to lay out a particular logical network on a particular physical network with a predetermined set of covered failures. Then in Subsection 3.2 we show how the DCS nodes can make a mutually consistent selection of the table to employ for whatever failure arises.

3.1 Constructing Configuration Tables

A set of configuration tables gives the layout of a particular logical network on a particular physical network. All of the desired pipes will be operational in the normal configuration, and the objective is for as many as possible of the pipes to be operational after each covered failure. We require that the post-failure restoration be done without rerouting those pipes that are not cut by the failure. This requirement ensures that no in-progress telephone calls are unnecessarily disrupted by a reconfiguration procedure, even if the reconfiguration time is large.

The problem of designing configuration tables for DCS nodes is mathematically equivalent to the integer-constrained multi-commodity flow problem on undirected networks, which we shall refer to as the IMF problem. Each commodity models a single pipe. Solutions are constrained to be integers because the granularity of reconfiguration is the single pipe. Exact solution of this problem is believed to be computationally infeasible for large networks [8]; nevertheless, heuristics are available that yield good approximate solutions. A standard heuristic [12] is to use linear programming to get an exact solution to the problem with the integer constraint removed, and then to adjust that solution to get an approximate solution to the original constrained problem.

A set of configuration tables can be constructed using any algorithm that yields a good solution to the IMF problem. First, configuration tables are produced for the fully-operational network where all components (links and DCS nodes) are operating correctly. Based on this configuration for the no-failure case, configuration tables are then produced for each covered failure. For each failure, the pipes unaffected by the failure are left undisturbed. Proceeding independently for each covered failure, the design finds new routes for the pipes disconnected by that failure, using only the spare capacity on the surviving links and the capacity previously used by the disconnected pipes.
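As an illustration of the per-failure step just described, the sketch below greedily reroutes each pipe disconnected by one covered failure along a shortest surviving path that still has spare capacity. It is a simplified stand-in, under assumed data structures, for the IMF-based heuristic of [12], not the actual table-design algorithm.

    from collections import deque

    # Illustrative sketch only: greedy per-failure rerouting of disconnected pipes
    # over spare capacity on surviving links.  The real design tool solves an IMF
    # problem; this shows only the shape of the per-failure computation.
    # Assumed formats: routes[pipe] is a list of DCS nodes, spare maps an edge
    # tuple (u, v) to its spare units, failed_links is a set of frozenset({u, v}).

    def shortest_spare_path(spare, src, dst):
        # Breadth-first search over links that still have spare capacity.
        adj = {}
        for (u, v), cap in spare.items():
            if cap > 0:
                adj.setdefault(u, []).append(v)
                adj.setdefault(v, []).append(u)
        parent, queue = {src: None}, deque([src])
        while queue:
            u = queue.popleft()
            if u == dst:
                path, node = [], dst
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return list(reversed(path))
            for w in adj.get(u, []):
                if w not in parent:
                    parent[w] = u
                    queue.append(w)
        return None

    def reroute_for_failure(pipes, routes, spare, failed_links):
        # Returns new end-to-end routes for the pipes cut by this failure only;
        # pipes unaffected by the failure are left undisturbed, as in the text.
        new_routes = {}
        for pipe in pipes:
            hops = list(zip(routes[pipe], routes[pipe][1:]))
            if not any(frozenset(h) in failed_links for h in hops):
                continue
            src, dst = routes[pipe][0], routes[pipe][-1]
            usable = {e: c for e, c in spare.items() if frozenset(e) not in failed_links}
            path = shortest_spare_path(usable, src, dst)
            if path is None:
                continue               # this pipe cannot be restored for this failure
            for hop in zip(path, path[1:]):
                key = hop if hop in spare else (hop[1], hop[0])
                spare[key] -= 1        # consume one unit of spare capacity per hop
            new_routes[pipe] = path
        return new_routes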

This technique can be applied to any desired collection of covered failures. Recall that a failure is represented by the set of links that are down. Any failure, node or link, single or multiple, can be represented this way. For our purposes, a DCS node failure is equivalent to the failure of all of the links incident to that node.

Because the configuration tables for each failure depend on the no-failure configuration chosen, the formulation of the table design problem as multiple independent instances of the IMF problem is a simplifying approximation. There is no guarantee that the combination of these independently solved optimization problems will yield the true overall optimum, even if exact solutions for each IMF problem are available. A motivation for making this approximation is to allow the use of known, simple methods for table construction. Better recovery may potentially be realized by an algorithm that determines the no-failure tables in concert with the covered-failure tables. This combined problem is addressed in Section 5.

3.2 Selecting a Configuration Table

If all components (links and DCS nodes) are operating correctly, each DCS node uses the configuration table designed for normal operation. If one of the covered failures occurs, then the network is reconfigured using the table designed for this failure. If the network encounters a noncovered failure, then all DCS nodes must use some consistent rule to select a table. By “consistent rule” we mean that all DCS nodes in the system use exactly the same rule and that this rule is reliably loaded into all DCS nodes at the same time as the configuration tables. The rule is expressed as a function called SELECT that maps each possible failure to the configuration table to be used if that failure happens. Causing all DCS nodes to have a consistent (meaning exactly equal, except in the case of actual network partitions) view of what failure has occurred is the topic of Section 4.

If F is a covered failure, then SELECT(F) is the table for failure F; otherwise, SELECT(F) is the table for some failure F′, where F′ is a fixed maximal subset of F for which a configuration table exists. If the function SELECT is defined in this way, then our protocol will exhibit graceful degradation. In the special case that the set of covered failures is exactly the collection of all single-link failures, then an acceptable SELECT function is easily defined: let SELECT(F) be the table for the highest-numbered down link in F.
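Written out, the selection rule for the single-link covered set is only a few lines. In the sketch below the observed failure is a set of down link numbers and the precomputed tables are assumed to be keyed by the covered failure they handle; the dictionary layout and names are illustrative, not the paper's.

    # Illustrative sketch only: SELECT for the case where the covered set is all
    # single-link failures.  `tables` is assumed to map frozensets of down links
    # to configuration tables; link labels are integers 1..L.
    def select(failure, tables, no_failure_table):
        failure = frozenset(failure)
        if not failure:
            return no_failure_table            # nothing is down
        if failure in tables:
            return tables[failure]             # covered failure: use its exact plan
        # Graceful degradation: fall back to the plan for the highest-numbered
        # down link, a covered single-link subset of the actual failure.
        return tables[frozenset({max(failure)})]

    # Example (hypothetical network with links 1..5, single-link coverage only).
    tables = {frozenset({i}): "table-for-link-%d" % i for i in range(1, 6)}
    print(select({2}, tables, "no-failure-table"))      # -> table-for-link-2
    print(select({2, 5}, tables, "no-failure-table"))   # -> table-for-link-5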

We emphasize that SELECT(F) depends only on the value of F and that all DCS nodes have exactly the same SELECT function. This function is placed in the DCS nodes at the same time as the consistent collection of configuration tables, using the reprovisioning technique discussed in Subsection 4.5. This consistency condition ensures that all DCS nodes take the same action in response to a failure. After any failure X, covered or not, consider all DCS nodes in the same connected component of the surviving network. There will be some covered failure X′ such that all of these DCS nodes load the configuration table for X′. The value of X′ will be the same for all DCS nodes in the same connected component because the reconfiguration protocol of Section 4 guarantees that all DCS nodes converge to the view that X is the failure that occurred. All DCS nodes eventually apply the same function SELECT to the same failure X to obtain the same covered failure X′. They then load the configuration tables for X′.

4. DYNAMICS OF RECONFIGURATION

We have discussed which configuration tables should be loaded after a failure. Now we discuss the process by which the DCS nodes converge to a consistent view of what failure has actually occurred. In Subsection 4.1 we review our assumptions about hardware capabilities and failure modes. In Subsection 4.2 we characterize the way our protocol keeps track of the changing topology of the network, causing each DCS node locally to maintain a correct view of the topology of its connected component of the network. In Subsection 4.3 we give the pseudocode for our reconfiguration protocol and explain how this protocol causes each DCS node to select and load the appropriate configuration table for the current topology. In Subsection 4.4 we prove that the local view of the topology maintained at each DCS node in the reconfiguration protocol converges to the true topology. In Subsection 4.5 we explain how to reprovision the network (i.e., switch to a completely different collection of configuration tables and accompanying SELECT function). Finally in Subsection 4.6 we discuss extensions to handle unanticipated failure modes.

4.1 Hardware Capabilities and Failure Modes

We make certain assumptions about hardware capabilities and the way in which hardware components fail. We believe that our assumptions are realistic and consistent with the emerging SONET [2,3] standard. Some of our assumptions (like reliable message delivery) can be ensured by lower level protocols not presented in this paper.

- Links have only fail-stop failures. That is, a link operates correctly for some time and then fails completely. Links do not corrupt messages. (Links that have failed and been repaired and links with intermittent failures are treated as faulty from the time of their first failure. Reintegration of repaired links is done by the procedure illustrated in Fig. 5.)
- DCS nodes have only fail-stop failures. That is, a node operates correctly for some time and then fails completely. An operating node never makes an erroneous computation or deviates from the protocol which it is running. (Nodes that have failed and been repaired and nodes with intermittent failures are treated as faulty from the time of their first failure. Reintegration of repaired nodes is done by the procedure illustrated in Fig. 5.)
- DCS nodes can reliably send messages over those links that are up. Each link, in addition to carrying telephone circuits between DCS nodes, also carries a signaling channel (e.g., a portion of the overhead channel of the SONET standard could be used). The signaling channel and its associated telephone circuits fail together: they are all faulty or they all work.
- Each DCS node can reliably detect the failure of each of its adjacent links. The failure of a DCS node appears to each of its neighboring DCS nodes as a failure of the link that connects the surviving node to the failed node. (Variants on this assumption would be possible. All we require is some means by which a DCS node can detect the failure of an adjacent DCS node.)
- We make no timing assumptions. Specifically, we make no assumption about the availability of reliable clocks, we make no assumption on message delivery time, and we make no assumption about relative processor speeds. Of course, the running time depends on the details of message delay and processor scheduling.

4.2 Definition of the Topology Update Problem

If there are L links in the trunk network, then each DCS node running our protocol maintains an L-element array LINKS, each of whose elements is either UP or DOWN. Initially all elements are UP. The purpose of this array is to track the true state of the network. While the topology is changing the LINKS array may give an inaccurate or outdated picture of the network. After the topology stops changing, all of the LINKS arrays should converge to the correct view. (There is one technical detail. If a failure disconnects the network, i.e., splits it into two or more pieces, then the DCS nodes in different pieces may have different views of the topology. This limitation is unimportant for our work because the only inaccurate information that a DCS ever has is about hardware that is inaccessible to it.) Our protocol ensures that the following three conditions are satisfied by the link arrays after any sufficiently long interval with no further failures.

Termination: For any surviving DCS node there is some time after which that node stops changing its LINKS array. We refer to the value of the LINKS array after this time as the eventual value.

Agreement: If n and n′ are any surviving nodes in the same connected component of the surviving network, then the eventual value of the array LINKS at node n is equal to the eventual value of the array LINKS at node n′.

Accuracy: If n is a surviving node and is in the same connected component of the surviving network as link l, then the eventual value of LINKS(l) at node n is UP if and only if l is a surviving link.

4.3 The Reconfiguration Protocol

The key mechanism used in our protocol is intelligent flooding of the network with an announcement of each link failure. The number of messages sent per failure is limited because each node forwards information about each failure at most once. Thus, in the worst case, each link failure causes one message to be sent in each direction on each link in the physical network.

In our reconfiguration protocol each DCS node continually executes its program, examining in turn each of its incident links. (It skips those links that it knows are down.) When a DCS node examines a link, the two events of interest are directly detecting that the link is down or receiving a message on the link. (There is only one kind of message used in our protocol, the announcement of some link failure.) In either event the node begins flooding the network with the information if it is new.

For use in the pseudocode we define two functions and three primitive procedures. The functions are called DEGREE and MAP. For any node n the value of DEGREE(n) is the degree (i.e., number of neighbors) of node n. For any node n and for any d, where 1 ≤ d ≤ DEGREE(n), the value of MAP(n,d) is the label of the d-th link incident to node n, where the links incident to node n have been given some arbitrary order.

The primitive procedures are called SEND, RECEIVE, and WORKING. At an arbitrary DCS node, say node n, these procedures operate as follows. The procedure SEND(d,b) requires that 1 ≤ d ≤ DEGREE(n) and that b is an integer. The parameter d designates one of the links incident to node n, and the parameter b designates the proposed contents of a message to be sent along this link. The procedure SEND sends the message b on channel d, i.e., the message is placed in the buffer of the node at the far end of the named link. The message is lost if the link fails before delivery. The procedure RECEIVE(d) requires that 1 ≤ d ≤ DEGREE(n). The parameter d designates one of the links incident to node n. If the named link is up and has one or more buffered messages, then the procedure RECEIVE will return either the first message in the buffer (removing the message from the buffer) or NULL. If there are no buffered messages it returns NULL. The procedure WORKING(d) requires that 1 ≤ d ≤ DEGREE(n). The parameter d designates one of the links incident to node n. The procedure WORKING returns logical true if link MAP(n,d) is up; otherwise, it returns logical false.

The pseudocode for our protocol is given in Fig. 6. A copy of this pseudocode runs at each of the DCS nodes in the network.

Initialization:
    let n be the label of this DCS node
    for l = 1 to L do
        LINKS(l) = UP
    end for

Main Driving Loop:
    while true do
        for d = 1 to DEGREE(n) do
            if LINKS(MAP(n, d)) = UP then
                if not WORKING(d) then
                    call UPDATE(MAP(n, d))
                else
                    m = RECEIVE(d)
                    if m ≠ NULL then
                        call UPDATE(m)
                    end if
                end if
            end if
        end for
    end while

Subroutine:
    procedure UPDATE(b)
    comment: b is the label of a link that is down
        if LINKS(b) = UP then
            LINKS(b) = DOWN
            for d = 1 to DEGREE(n) do
                SEND(d, b)
            end for
            t = SELECT(LINKS), where SELECT is the function that yields the
                configuration table for the set of DOWN links recorded in LINKS
            load configuration table t
        end if
    end procedure

Figure 6. The Reconfiguration Protocol
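The protocol of Fig. 6 is small enough to exercise in a short simulation. The sketch below (an illustration with assumed data structures and an arbitrary FIFO delivery order, consistent with the paper's lack of timing assumptions) floods link-failure announcements on a four-node ring and then checks the agreement and accuracy conditions of Subsection 4.2.

    from collections import deque

    # Illustrative simulation only: flooding of link-failure announcements as in
    # Fig. 6.  Links are integers; `topology` maps each link to its two endpoints.
    def simulate(topology, failed_links):
        nodes = {n for ends in topology.values() for n in ends}
        links_view = {n: {l: "UP" for l in topology} for n in nodes}   # per-node LINKS
        inbox = {n: deque() for n in nodes}

        def update(n, b):
            # UPDATE(b): mark link b DOWN and announce it once on every working
            # incident link (messages on failed links are lost).
            if links_view[n][b] == "UP":
                links_view[n][b] = "DOWN"
                for l, (u, v) in topology.items():
                    if l in failed_links or n not in (u, v):
                        continue
                    inbox[v if n == u else u].append(b)

        # Each endpoint of a failed link detects the failure directly (WORKING false).
        for l in failed_links:
            for n in topology[l]:
                update(n, l)

        # Deliver queued messages until quiescence; termination is guaranteed
        # because every LINKS entry changes from UP to DOWN at most once.
        progressed = True
        while progressed:
            progressed = False
            for n in nodes:
                while inbox[n]:
                    update(n, inbox[n].popleft())
                    progressed = True
        return links_view

    # Example: a ring of four DCS nodes; link 3 fails.
    topology = {1: ("A", "B"), 2: ("B", "C"), 3: ("C", "D"), 4: ("D", "A")}
    views = simulate(topology, failed_links={3})
    assert all(v[3] == "DOWN" for v in views.values())                  # agreement/accuracy
    assert all(v[l] == "UP" for v in views.values() for l in (1, 2, 4))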

As we will show in the next subsection, all nodes in each connected component converge to the same view of the network. Thus, the configuration tables which they have installed after convergence are a consistent set of tables that were selected for the failure situation which has occurred.

While the protocol is converging, those pipes which were disconnected by the failure may not yet be completely restored. The new routes for these pipes are constructed incrementally as each DCS node in the network installs its new configuration tables.

4.4 Correctness of the Reconfiguration Protocol

In this subsection we show that the reconfiguration protocol (Fig. 6) solves the topology update problem. Thus the installation of configuration tables, which is done locally and asynchronously at each DCS node, results in a consistent and appropriate set being used network-wide, and this desirable behavior is achieved even if there are failures while the reconfiguration protocol is running.

Theorem 1: The reconfiguration protocol solves the topology update problem.

Proof: We show that the protocol satisfies the three conditions: termination, agreement, and accuracy.

Termination Condition: There are only two types of events that cause a DCS node to change its LINKS array: detection of the first failure of an adjacent link and receipt of a message from a neighboring DCS. We show termination by showing that only a bounded number of these events can occur. For the detection of first failures of adjacent links, the bound follows immediately from the bounded number of incident links at each DCS node. Now we show that there is a bound on the number of messages that can be sent (and hence a bound on the number of messages that can be received). A DCS node only sends messages after changing a LINKS entry from UP to DOWN. A bounded number of messages are sent for each change. Network-wide there are a bounded number of entries in LINKS arrays. DCS nodes never change LINKS entries from DOWN to UP. Thus there are a bounded number of such changes that can take place and therefore a bounded number of messages that can be sent.

Agreement Condition: The proof is by contradiction. Suppose Y and Z are two surviving DCS nodes in the same component of the surviving network and that they have different views of the status of some link l. Because Y and Z are in the same component, there must be a path n_1, n_2, ..., n_x such that Y = n_1, Z = n_x, and for 1 ≤ i ≤ x−1, n_i is connected to n_{i+1} by a surviving link. Choose the minimum j such that DCS nodes n_1 and n_j have different views of the status of link l. (Such a j clearly must exist because n_1 and n_x have different views.) DCS nodes n_{j−1} and n_j are adjacent nodes with different views on the status of link l. Without loss of generality, suppose that DCS node n_{j−1} thinks it is UP. When n_j first changed its view to DOWN, it sent messages to all of its neighbors, including n_{j−1}. Because the link between them is in the surviving network, n_{j−1} got this message and must have changed the status of link l to DOWN; thus we have a contradiction.

Accuracy Condition: There are two cases. Suppose link l is a surviving link. Link-failure messages are sent in response to two types of events: the receipt of a similar link-failure message or the physical detection of a link's failure. The first link-failure message for any link is sent in response to the physical detection of the failure of that link. Because link l is a surviving link, no DCS node detects the failure of link l. Thus no link-failure messages are sent for l and no DCS node changes its view from UP to DOWN. Suppose instead that link l is not a surviving link. The two DCS nodes at the ends of l detect the failure and update their LINKS arrays. By the agreement condition, node n has the same view, that l is DOWN. □

4.5 Reprovisioning the Network

Long-term operation of a trunk network will necessitate a mechanism for changing the tables for both the no-failure case and the covered failures. We refer to this process as reprovisioning. In this subsection we outline a method for reprovisioning the network. This method allows for the changing of the no-failure tables, the tables for the covered failures, and the SELECT function. (Once new tables are installed, a distributed flooding protocol can be used to activate the new no-failure tables.) General reprovisioning entails the rearrangement of in-progress calls. Thus, to make complete use of our reprovisioning method one must be willing to reroute stable calls under the control of our distributed protocols.

The method allows the network operator to switch to a completely different collection of configuration tables (together with their accompanying SELECT function). The key technical problem here is ensuring that the DCS nodes take consistent action even if a failure occurs in the middle of the reprovisioning process. The solution is to adapt a 2-phase commit protocol from the area of distributed databases [19]. This method requires enough memory in each DCS node to hold two complete collections of configuration tables, the old one and the new one. It also requires some centralized source to down-load new tables. This source need not always be available. A failure of the centralized source halts the reprovisioning, but does not leave the network vulnerable to inconsistency.

Each DCS node maintains two complete collections of configuration tables with one SELECT function for each. Call them A and B. Each DCS node always has one collection (either A or B) which is safe for it to use. We call this the preference of that DCS node. A node that detects a failure will broadcast its preference along with notification of the failure. At times when both collections are safe to use, different DCS nodes can have conflicting preferences, so each DCS node also has a rule for how to resolve conflicting preferences (received from different DCS nodes).

To load new tables, the central controller proceeds in two phases. In the first phase the central controller updates one of the two collections and the rule for resolving conflicting preferences. This phase can be done safely when the controller knows that all of the DCS nodes in the network prefer the other (unchanged) collection of configuration tables. In the second phase the central controller updates the preferences of all of the DCS nodes. This can be done safely when both collections of tables are in usable condition and when all DCS nodes in the system have exactly the same rule for resolving conflicting preferences. The protocol maintains the following three invariants, thus guaranteeing correct operation during reprovisioning. (1) There is always at least one collection (A or B) which is in a usable condition at all DCS nodes in the network. (2) No DCS node prefers a collection that is in an unusable condition anywhere. (3) If there are any two DCS nodes with conflicting preferences, then all DCS nodes have exactly the same rule for resolving conflicts.
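The two phases and their preconditions can be summarized in a short sketch. The per-node fields, the controller functions, and the small test below are assumptions made for illustration; they follow the description above but are not the paper's implementation.

    # Illustrative sketch only of the two-phase reprovisioning state.  Each DCS
    # node holds collections A and B, a preference, and a shared conflict rule.
    class DcsNode:
        def __init__(self, tables_a, tables_b):
            self.collections = {"A": tables_a, "B": tables_b}
            self.preference = "A"        # the collection this node considers safe
            self.conflict_rule = "A"     # collection used if preferences conflict

    def phase1_install(nodes, target, new_tables, new_rule):
        # Safe only while every node prefers the other (unchanged) collection,
        # so an interruption never leaves a node preferring a half-written plan.
        assert all(n.preference != target for n in nodes)
        for n in nodes:
            n.collections[target] = new_tables
            n.conflict_rule = new_rule

    def phase2_switch(nodes, target):
        # Safe once both collections are usable and all nodes share one conflict rule.
        assert len({n.conflict_rule for n in nodes}) == 1
        for n in nodes:
            n.preference = target

    # Example: move three nodes from collection A to a freshly designed collection B.
    nodes = [DcsNode(tables_a="plan-A", tables_b="unused") for _ in range(3)]
    phase1_install(nodes, "B", new_tables="plan-B", new_rule="B")
    phase2_switch(nodes, "B")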

4.6 Extensions for Self-stabilization

Although the topology update protocol presented in this section is simple and converges rapidly, it relies on certain assumptions: no corrupted messages and no corruption of DCS processor state, among others. If these assumptions are violated then the protocol may fail. If the network faces a significant chance of message or state corruption, a self-stabilizing [6] topology-update protocol may be used instead. A self-stabilizing protocol exhibits correct behavior regardless of the state in which any of its processors are started, thus allowing for automatic recovery from any unforeseen error. Self-stabilizing topology update protocols are known [4,20]. The substitution of one of these protocols in NETSPAR would increase the degree of robustness, at the cost of greater complexity and potentially longer convergence times.

5. SPARE-CAPACITY REQUIREMENTS

In this section we begin our evaluation of NETSPAR by presenting a simulated-annealing-based network-design tool that allows us to quantify the spare-capacity requirements in a typical network. First we evaluate the amount of spare capacity required by NETSPAR for complete recovery from any single-link failure. Then we compare the spare-capacity requirements of end-to-end recovery, which is used in NETSPAR, to the spare-capacity requirements of backhauling, which is used in many of the completely distributed survivability protocols.

Given a fixed physical network topology and a fixed set of pipes, our network-design tool determines a minimum-cost survivable network design. The network design that is produced specifies the amount of capacity that is required on each physical link and the sets of configuration tables for normal operation and for each covered failure. In contrast with the approach taken in Section 3, where link capacity was fixed and the degree of restoration was to be maximized, our network-design tool seeks to minimize the amount of spare capacity necessary for a predetermined level of survivability. Throughout this section the level of survivability that we seek is the complete recovery from any single-link failure. Another contrast with the approach of Section 3 is that we abandon the partition into separate optimization problems for the normal case and each covered failure; rather, our network-design tool seeks to minimize a global cost function.

The cost to be minimized in the network design is the sum of the capacities on all of the physical links (the topology of available physical links is assumed to be fixed). The total required capacity on each link includes the capacity needed to carry the traffic for the normal no-failure case plus the spare capacity on that link required to maintain all traffic in the case that any covered failure occurs. This cost function makes the simplifying assumption that bandwidth is equally expensive on all links and that incremental costs to add capacity are constant. While we only consider this simple linear cost function, the simulated annealing approach makes it straightforward to use more general cost functions, including nonlinear or stepwise.

The remainder of this section is organized as follows. In Subsection 5.1 we briefly explain how simulated annealing can be applied to the design of survivable networks. The results of applying simulated annealing to two sample networks are presented in Subsection 5.2. Finally, Subsection 5.3 compares the spare-capacity requirements of end-to-end recovery and backhauling.

5.1 The Simulated Annealing Technique

Simulated annealing [13,22] is a well-known heuristic technique that can be useful for solving complex combinatorial optimization problems. It is particularly attractive when the problem exhibits many local minima and when the state space is so large that traditional techniques become computationally infeasible. The approach is based on an analogy with “cooling down” a physical system (for example, cooling down a melt to grow a single crystal). If the cooling is done slowly enough, the system will end up in its ground state of minimum energy.

When applying simulated annealing, we define a cost function C which is to be minimized, as well as “local moves” that will take the system from one configuration to another. Simulated annealing has the attractive property that it yields global solutions based on the iterative repetition of many simple local computations. The temperature is a control parameter that determines the probability of accepting the changes produced by a local move. At a given temperature T, a move is accepted if it decreases C; otherwise, it is accepted with probability e^(−ΔC/T), where ΔC is the amount by which the move increases C and e is the base for the natural logarithms. To find a global minimum of the objective function one typically starts at high temperatures where essentially all moves are accepted. One then follows a suitable annealing schedule to cool the system to zero temperature, with each decrease in temperature reducing the probability of accepting a move that increases the cost function. Details of the simulated annealing heuristic (such as choice of annealing schedule) are outside the scope of this paper and can be found in the literature [13,22].

In applying simulated annealing to the problem of survivable network design in this paper, the cost function we chose is C = Σ_{l=1}^{L} x_l, where x_l is the capacity of physical link l and L is the total number of links. This sum is the actual cost we wish to minimize (the total capacity of the network). To improve the convergence rate a second low-order term is added to the cost function. This term is chosen to be small enough to guarantee that the first term dominates, so the cost function with and without the second term have minima in the same places.
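The acceptance rule and the capacity cost function translate directly into a short annealing loop. In the sketch below, propose_move is an assumed callback supplied by a network-design tool (for example, rerouting one pipe and adjusting link capacities); only the cost C as the sum of link capacities and the exp(−ΔC/T) acceptance rule come from the text, and the schedule parameters are placeholders.

    import math
    import random

    # Illustrative sketch only: Metropolis acceptance for the capacity-minimization
    # problem.  `x` maps each physical link to its provisioned capacity.
    def total_capacity(x):
        return sum(x.values())               # C = sum over links l of x_l

    def anneal(x, propose_move, t_start=100.0, t_end=0.01, cooling=0.95, steps=200):
        temperature, cost = t_start, total_capacity(x)
        while temperature > t_end:
            for _ in range(steps):
                candidate = propose_move(x)   # a local rearrangement of the design
                delta = total_capacity(candidate) - cost
                # Accept every improvement; accept an increase of delta with
                # probability exp(-delta / T).
                if delta <= 0 or random.random() < math.exp(-delta / temperature):
                    x, cost = candidate, cost + delta
            temperature *= cooling            # annealing schedule: geometric cooling
        return x, cost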

The computational performance of the simulated annealing approach appears attractive for reasonable-sized networks. For instance, on the 26-node network studied in the next two subsections, good results were obtained within about 10 minutes on a high-end workstation.

5.2 Spare-Capacity Requirements of NETSPAR

In this subsection, we present our survivable network design results for two model networks: a small example 7-node network and a larger 26-node network. The first physical network is illustrated in Fig. 7. Table 2 gives the number of STS1 pipes required between the DCS nodes. (In this subsection we consider the aggregate traffic between DCS nodes rather than between central-office switches since it is the capacity of the backbone physical network interconnecting DCS nodes that is of interest.) Because this network has a small number of nodes and a regular topology, we were able to find the true least-cost survivable network by hand. The numbers on the links in Fig. 7 give this solution, with the first number indicating the normal capacity used when there are no failures and the second number indicating the spare capacity used to recover from any single-link failure. Our simulated annealing tool obtained the same least-cost design.


Figure 7. Required Capacity for the 7-Node Network

TABLE 2
Required Pipes for the 7-Node Network

        A   B   C   D   E   F   G
    A   -   1   1   1   1   1   2
    B   1   -   1   1   1   1   2
    C   1   1   -   1   1   1   2
    D   1   1   1   -   1   1   2
    E   1   1   1   1   -   1   2
    F   1   1   1   1   1   -   2
    G   2   2   2   2   2   2   -

Page 10: Using distributed topology update and preplanned configurations to achieve trunk network survivability

COAN ET AL.: TOPOLOGY UPDATE AND PREPLANNED CONFIGURATION TO ACHIEVE TRUNK NETWORK SURVIVABILITY 413

Figure 8. Required Capacity for the 26-Node Network

The second network we study is a 26-node network derived from data on the placement of physical facilities and the logical trunk requirements in a portion of an existing trunk network. Fig. 8 illustrates the physical network and Table 3 gives the number of DS1 pipes required between the DCS nodes. (Here the granularity of facility reconfiguration is the DS1, which is 24 voice circuits, rather than the STS1.) The results of applying our network-design tool are indicated in Fig. 8. Again, the first number on a link is the capacity used under normal operation and the second number is the spare capacity used for failure cases. The sum of all link capacities required to carry the traffic for the no-failure case is 9,879 DS1 circuits. The capacity required in our survivable network design is 16,567 DS1 circuits. Thus, 68% extra capacity is required to recover from any single-link failure for this realistic network.

Throughout this paper we assume that the only pipes that may be rerouted after a failure are those pipes that have been disconnected by the failure. As we will discuss in Section 6.2, if the reconfiguration protocol is guaranteed to complete in a sufficiently short time, then it would be a reasonable alternative to allow any pipes to be rerouted after a failure. At least for the 26-node network, allowing all of the pipes to be rerouted gives little advantage compared with allowing only the disconnected pipes to be rerouted, saving only 27 DS1 circuits overall.

TABLE 3 Required Pipes for the 26-Node Network

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A - 9 2 8 1 9 0 1 5 1 1 5 3 1 3 0 1 0 1 0 0 5 0 1 4 2 4 0 1 2 B 92 - 4 8 3 4 0 9 2 4 6 5 4 1 1 1 8 6 0 5 0 6 0 0 26 2 2 1 9 1 0 1 5 0 3 7 C 8 4 8 - 14 0 2 6 1 1 2 3 2 4 9 1 0 2 1 0 2 1 0 6 0 1 1 0 3 5 D 19 34 14 - 0 7 1 8 5 3 5 1 4 2 2 5 2 8 0 3 1 0 10 1 3 2 0 1 1 1 0 2 6 E O O O O - 0 0 0 0 0 1 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 F l 9 2 7 0 - 16 6 4 5 3 8 4 5 4 1 1 4 0 8 1 3 44 2 7 1 3 3 9 1 3 5 G 5 2 4 6 1 8 0 1 6 - 12 59 40 31 52 54 1 15 1 5 45 6 12 26 3 18 0 6 8 H 1 6 1 5 0 6 1 2 - 36 23 32 41 27 26 13 5 3 32 13 12 4 3 13 1 3 4 I 15 54 12 35 0 45 59 36 - 90 50 42 42 3 26 3 8 123 22 35 38 43 27 12 13 25 J 3 11 3 14 0 38 40 23 90 - 42 70 49 3 14 4 8 20 10 30 18 2 21 3 9 15 K 1 18 2 22 19 45 31 32 50 42 - 49 66 7 37 8 3 64 7 16 18 13 23 5 11 11 L 30 60 49 52 0 41 52 41 42 70 49 - 42 7 37 10 0 0 0 0 52 0 59 0 32 34 M 1 5 1 8 0 14 54 27 42 49 66 42 - 16 19 5 9 46 15 22 10 5 14 1 9 10 N 0 0 0 0 0 0 1 2 6 3 3 7 7 1 6 - 1 0 0 2 1 1 0 1 0 0 0 0 0 1 6 2 3 0 8 1 5 1 3 2 6 1 4 3 7 3 7 1 9 1 - 2 4 22 8 1 0 5 0 1 1 1 7 7 P O 0 1 1 0 1 1 5 3 4 8 1 0 5 0 2 - 1 3 3 1 1 1 1 0 1 1 Q O O O O O 3 5 3 8 8 3 0 9 0 4 1 - 30 4 0 0 6 0 6 0 0 R 5 26 2 10 0 44 45 32 123 20 64 0 46 2 22 3 30 - 30 50 24 8 37 4 5 8 S 0 2 1 1 0 2 6 1 3 2 2 1 0 7 0 1 5 1 8 3 4 3 0 - 9 2 3 1 2 3 2 2 T 1 2 0 3 0 7 1 2 1 2 3 5 3 0 1 6 0 2 2 1 1 0 1 0 50 9 - 9 4 5 4 8 1 1 U 4 1 9 6 2 0 0 1 3 2 6 4 3 8 1 8 1 8 5 2 1 0 0 5 1 0 24 2 9 - 4 29 I 6 10 V 2 1 0 0 1 0 3 3 3 43 2 1 3 0 5 1 0 1 6 8 3 1 4 5 4 - 1 0 0 1 W 4 15 11 11 0 9 18 13 27 21 23 59 14 0 11 1 0 37 2 4 29 1 - 0 28 18 X 0 0 0 0 0 1 0 1 1 2 3 5 0 1 0 1 0 6 4 3 8 1 0 0 - 0 0 Y 1 3 3 2 0 3 6 3 13 9 11 32 9 0 7 1 0 5 2 1 6 0 2 8 0 - 8 Z 2 7 5 6 0 5 8 4 2 5 1 5 1 1 3 4 1 0 0 7 1 0 8 2 1 1 0 1 1 8 0 8 -


5.3 Comparison with Backhauling



In this subsection we compare the spare-capacity requirements of end-to-end recovery as it is done in NETSPAR to the spare-capacity requirements of a greedy variant of backhauling that is used in many completely distributed survivability protocols. (Recall that end-to-end recovery and backhauling are defined in Section 2.) The quantity that we compare is the total capacity required for complete recovery from any single-link failure.

Because the NETSPAR response to covered failures is preplanned, it is straightforward for that response to be end-to-end. Compared to NETSPAR, completely distributed techniques devise more of their recovery plan after a failure has occurred. To minimize reconfiguration time they often use a greedy variant of backhauling which is especially well suited to distributed computation. After a single-link failure, greedy backhauling routes as much as possible of the disrupted traffic over the shortest available path in the surviving network. Remaining disrupted traffic is then routed over the next shortest available path. This process is repeated until either all disrupted traffic is rerouted or there is no path with remaining spare capacity.
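A minimal sketch of this greedy procedure is given below, assuming the simplest form of backhauling in which disrupted circuits are rerouted between the two endpoints of the failed link. The graph representation and function names are ours and are not part of any of the cited protocols.

```python
from collections import deque

def shortest_spare_path(spare, src, dst):
    """Breadth-first search for a fewest-hops path that uses only links
    with remaining spare capacity. `spare` maps frozenset({u, v}) -> circuits."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for link, cap in spare.items():
            if node in link and cap > 0:
                (neighbor,) = link - {node}
                if neighbor not in parent:
                    parent[neighbor] = node
                    queue.append(neighbor)
    return None  # no surviving path with spare capacity remains

def greedy_backhaul(spare, failed_link, disrupted):
    """Reroute `disrupted` circuits around `failed_link`, one shortest
    surviving path at a time, until done or no spare path remains."""
    spare = dict(spare)
    spare.pop(failed_link, None)  # the failed link can carry nothing
    u, v = tuple(failed_link)
    restored = 0
    while disrupted > 0:
        path = shortest_spare_path(spare, u, v)
        if path is None:
            break
        links = [frozenset(hop) for hop in zip(path, path[1:])]
        amount = min([spare[l] for l in links] + [disrupted])
        for l in links:
            spare[l] -= amount  # consume spare capacity along the chosen path
        disrupted -= amount
        restored += amount
    return restored
```

The distributed protocols cited above obtain a comparable result by flooding restoration messages rather than by a centralized graph search; the sketch mirrors only the greedy capacity accounting.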

Because the use of greedy backhauling constrains the network-design search space, it has the potential to increase the cost of attaining a given level of survivability. In this subsection we quantify that difference for the 26-node network presented in the previous subsection. To perform the calculations for greedy backhauling, we have again used simulated annealing.

Recall that to make the 26-node network survivable using end-to-end recovery requires a total of 16,567 DS1 circuits, with the ratio of spare to normal circuits equal to 68%. In contrast, using greedy backhauling requires a total of 17,293 DS1 circuits, with the ratio of spare to normal circuits equal to 75%.
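These percentages follow directly from the circuit totals; a quick check (figures taken from the text, with the 9,879-circuit no-failure total from Subsection 5.2):

```python
normal = 9_879  # DS1 circuits needed when there are no failures
totals = {"end-to-end recovery": 16_567, "greedy backhauling": 17_293}

for scheme, total in totals.items():
    spare = total - normal
    print(f"{scheme}: spare/normal = {spare / normal:.0%}")
# end-to-end recovery: spare/normal = 68%
# greedy backhauling: spare/normal = 75%
```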

Suppose instead we fix the total capacity available to greedy backhauling to be that needed for end-to-end recovery, 16,567 DS1 circuits, and optimize the placement of this capacity for greedy backhauling using simulated annealing. Averaged over all single-link failures, the fraction of the disrupted capacity that is recoverable is 97%. The recovery of 97% of the disrupted traffic assumes that no limit is placed on the number of rounds of finding shortest paths. If the required recovery time limits the number of rounds, then the amount of recovery may be reduced. For instance, if only one round is allowed then 69% of capacity is recoverable. For two rounds 84% is recoverable, and for three rounds 92% is recoverable.

6. PERFORMANCE OF RECONFIGURATION

In this section we calculate the performance of the reconfiguration protocol of Subsection 4.3 on the 26-node network of Fig. 8. The specific performance measures that we calculate are the memory requirement at each DCS node and the running time (i.e., time from failure to reconfiguration).

6.1 Memory Requirements

The memory requirements for our protocol are dominated by the amount of space required to store configuration tables. This amount, in turn, depends on the types of failure to be tolerated. The amount of memory required at each DCS node to tolerate various types of failures in our 26-node network is given in Table 4. The table entries are calculated as follows. The memory required at each node is the product of the space needed to store a single configuration table and the number of configuration tables required at that node. Each configuration table at a DCS node must specify a destination DCS port address for each incident DS1 channel. We assume that each destination address takes two bytes. This assumption yields a size of 6840 bytes for each table at node M, the node with the largest memory requirement because it has the largest number of DS1 terminations, 3420. The number of tables stored at each node is equal to the total number of failures to be tolerated. For example, in our 26-node network there are 42 links and therefore 42 possible single-link failures, yielding a maximum memory requirement of 281 Kbyte at each node to cover all single-link failures.
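A worked version of this calculation for the single-link case, using the figures stated above (the variable names are ours):

```python
BYTES_PER_ADDRESS = 2           # destination DCS port address
TERMINATIONS_AT_NODE_M = 3_420  # largest number of incident DS1 channels
SINGLE_LINK_FAILURES = 42       # one preplanned configuration table per failure

table_bytes = BYTES_PER_ADDRESS * TERMINATIONS_AT_NODE_M  # 6,840 bytes per table
worst_case_bytes = table_bytes * SINGLE_LINK_FAILURES     # 287,280 bytes in total
print(f"{worst_case_bytes / 1024:.0f} Kbyte")             # about 281 Kbyte
```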

TABLE 4
Memory Requirements for the 26-Node Network

Type of failure tolerated        Number of such failures    Memory required
All single-link failures                  42                  281 Kbyte
All single-node failures                  26                  174 Kbyte
All single-component failures             68                  454 Kbyte
All double-link failures                 861                  5.6 Mbyte
All double-node failures                 325                  2.1 Mbyte
All double-component failures           1186                  7.7 Mbyte

6.2 Running Time

The running time of the reconfiguration protocol is the time for the most distant DCS node to find out about a failure plus the time for that node to load the configuration table that corresponds to that failure. Let P be the path along which news of the failure propagates to the most distant node. The time that it takes this node to find out about the failure is the sum of the propagation time along P (due to the finite speed of light and the finite bit rate of the signaling channels) and the time for the news to pass through the nodes on path P. In the reconfiguration protocol, as soon as a DCS node gets a message announcing a failure, it sends the news to all of its neighboring nodes. (Here we consider a protocol that is slightly modified to improve the performance: a message is not sent back on the link that it was just received on.) In the worst case, at each node on path P, the last neighbor to be notified might be the next node along the path P. Thus the time for the information to pass through a node n can be bounded by (DEGREE(n) - 1) · Send time + Receive time, where DEGREE(n) is the degree of the node and Receive time and Send time are the times it takes a node to receive and send a message (due to, e.g., protocol processing). We assume that the send and receive times are the same and call their common value Msg time.
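This bound can be written as a small helper, shown here under our own naming; with Send time = Receive time = Msg time the per-node term reduces to DEGREE(n) · Msg time. The degrees and times in the usage line are hypothetical values for illustration only.

```python
def passthrough_bound(degree, msg_time):
    """Worst-case time for failure news to pass through one node:
    (DEGREE(n) - 1) sends before the next node on the path is reached,
    plus one receive; with send time = receive time this is degree * msg_time."""
    return (degree - 1) * msg_time + msg_time

def relay_bound(relay_degrees, num_links, msg_time, link_signal_time,
                light_latency, load_time):
    """Upper bound on failure-to-reconfiguration time along one path:
    per-node relay delay, per-link signalling time, speed-of-light latency,
    and the time to load the preplanned configuration table."""
    relay = sum(passthrough_bound(d, msg_time) for d in relay_degrees)
    return relay + num_links * link_signal_time + light_latency + load_time

# Illustrative use (times in msec): a path whose relaying nodes have degrees
# 3, 4, 3, and 2, crossed by 5 links, with 5 msec message time, 1 msec
# per-link signalling time, 2.5 msec light latency, 2 msec table-load time.
print(relay_bound([3, 4, 3, 2], 5, 5.0, 1.0, 2.5, 2.0))  # 69.5 msec
```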


We now make the above calculation for the network of Fig. 8. Some existing DCS nodes are able to load configuration tables in 2 msec. We assume that the longest distance in the network is 500 km, so that the speed-of-light latency is about 2.5 msec. Assuming a signaling channel of 64 Kbit/sec and 8-byte messages, the propagation time is 1 msec on each link. We have calculated an upper bound on the worst-case relay delay across the network. For instance, when there is a failure detected at node A it takes at most 26 · Msg time for the information to reach node Q, traveling over 9 links. (The shortest-delay path is different from the shortest-hops path, being determined by the degree of the nodes as well as the number of hops.) Combining these pieces yields the following calculation of the worst-case running time of the reconfiguration protocol.

Run time ≤ 26 · Msg time + 9 · Propagation time + Latency + Load time
         ≤ 26 · Msg time + 9 · 1 msec + 2.5 msec + 2 msec

A digital central-office switch waits approximately 150 msec before assuming that a connection has failed and initiating test and diagnostic procedures on the affected trunk. Thus, our recovery protocol can be completely transparent to the central-office switches using the trunk network if Run time < 150 msec, which will be the case as long as Msg time < 5.25 msec.
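A one-line check of the message-time budget implied by this timeout, using the values from the calculation above:

```python
trunk_timeout = 150.0              # msec before a switch reacts to a failed trunk
fixed_delay = 9 * 1.0 + 2.5 + 2.0  # link signalling + light latency + table load
max_msg_time = (trunk_timeout - fixed_delay) / 26
print(max_msg_time)                # 5.25 msec, matching the bound quoted above
```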

The foregoing calculation is based on technological parameters which we believe are achievable, given sufficient effort. While short, a message time of 5 msec is realizable on currently available technology such as high-speed workstations, especially if the number of layers of protocol for processing the NETSPAR messages is limited. To achieve fast enough loading of configuration tables and DCS reconfiguration, it is necessary to use DCS nodes with fast enough real-time performance. (DCS nodes with much slower reconfiguration times exist but are more suited to an environment where reconfiguration requests are entered manually by a craftsperson in response to a service order.) If slower technology is used, then the NETSPAR approach is still useful for restoring service after a physical failure, but the reconfiguration time may be greater than the 150 msec trunk timeout interval, resulting in the loss of those in-progress calls routed over failed components.

Several of the design choices made in NETSPAR accommodate the possibility of slower technology. Specifically, the decision not to move those in-progress calls unaffected by a failure and the decision not to automatically reintegrate repaired physical links were made to ensure the usefulness of NETSPAR in an environment where reconfiguration is too slow to be done without disrupting in-progress calls. If a user of NETSPAR were sufficiently confident of the speed of reconfiguration, then two optimizations could be made. First, the rearrangement of pipes after a failure could be global, moving both those pipes disrupted by the failure and those not, which might allow for more efficient use of spare capacity. Second, a more general topology update protocol could be used, allowing for the automatic reintegration of repaired physical facilities, a process that inherently involves the rearrangement of pipes that carry in-progress calls.

7. COST OF RECONFIGURATION

Tsai, Coan, Kerner, and Vecchi [21] evaluated the cost-effectiveness of NETSPAR by comparing the installed first-cost of NETSPAR with the installed first-cost of traditional PPDR. Installed first-cost includes expenses for equipment and installation, but does not include operational and maintenance cost. Two sample physical network topologies were considered, one ring and one mesh, and several network parameters were varied in order to quantify the sensitivity of these two approaches to physical link length, network connectivity, and repeaterless span length. For some networks the traditional PPDR approach has a lower first-cost, while for other networks NETSPAR has a cost advantage.

Before we begin our comparison we need a note on terminology. Physical connections in the transmission plant run between buildings called central offices. DCS nodes are located in central offices. A fiber-optic link that enters a central office can be connected to the DCS node in that office. This arrangement is the one assumed throughout this paper. Alternatively, the fiber-optic link can be directly connected (on a piece of equipment called a frame) to an outgoing fiber-optic link, bypassing the DCS node and creating a physical path consisting of one or more links. In the PPDR strategy, the network that handles the no-failure case traffic is a collection of physical paths, many of which consist of more than one link.

It is possible to identify three characteristics of NETSPAR which have a significant impact on the cost. In comparison to PPDR, two have a favorable (cost-reducing) impact and one has an unfavorable impact [21].

(1) NETSPAR takes a global (network-wide) approach to using spare capacity on links, whereas traditional PPDR designs allocate a specific spare physical path for each normal physical path. A global approach results in increased efficiencies: NETSPAR reduces the total amount of spare capacity required on links and it reduces the number of physical fiber strands used. The reduction in the amount of spare capacity required is reflected primarily in the cost of the required transmission equipment. The reduction in the number of fiber strands used has an impact on the cost of the required fiber cables and repeaters.

(2) NETSPAR requires DCS terminations at all nodes in order to flexibly reconfigure the network. These DCS terminations provide the repeater function at no added cost. This is an advantage with respect to traditional PPDR in those networks with large internode distances.

(3) In traditional PPDR designs it is common for physical paths (both spare and normal) to pass through central offices without terminating on the DCS node, whereas in NETSPAR it is common for physical paths to terminate on almost all of the DCS nodes that they traverse. The cost of these added DCS terminations has an unfavorable effect for NETSPAR. This effect is less important as the connectivity of the network increases, and the number of spare physical paths (relative to the total number of physical paths) required by NETSPAR decreases.

The overall conclusion is that the hardware first-cost of NETSPAR is relatively lower in networks with high connectivity or long internode distances. No significant effect of traffic volume on the relative benefit of NETSPAR was seen. There are many other factors, besides the installed first-cost of the hardware, that have to be considered when evaluating alternative survivability strategies. The quantification of these factors is beyond the scope of the present paper.

ACKNOWLEDGMENT

We thank Marty Kerner and Ethen Tsai for assistance with evaluating the cost-effectiveness of our approach to trunk network survivability, Jo Klem of the Strategic Planning District of Southern Bell for providing us with the network model used in Subsection 5.2, Bülent Yener for assistance with the implementation of simulated annealing, and the anonymous referees for numerous comments which helped us improve the presentation of this paper.

REFERENCES

[1] Y. Afek, B. Awerbuch, and E. Gafni, "Applying Static Network Protocols to Dynamic Networks," In Proc. 28th IEEE Symposium on Foundations of Computer Science, pp. 358-370, October 1987.

[2] Bellcore, "SONET Transport Systems: Common Generic Criteria," Bellcore Technical Advisory TA-TSY-000253, July 1988.

[3] R. J. Boehm, Y.-C. Ching, C. G. Griffith, and F. A. Saal, "Standardized Fiber Optic Transmission Systems - A Synchronous Optical Network View," IEEE J. Select. Areas Commun., vol. SAC-4, pp. 1424-1431, December 1986.

[4] B. A. Coan and M. P. Herlihy, manuscript in preparation, 1991.

[5] B. A. Coan, M. P. Vecchi, and L. T. Wu, "A Distributed Protocol to Improve the Survivability of Trunk Networks," In Proc. XIII International Switching Symposium, vol. 4, pp. 173-179, May 1990.

[6] E. W. Dijkstra, "Self-Stabilizing Systems in Spite of Distributed Control," Communications of the ACM, vol. 17, pp. 643-644, November 1974.

[7] D. K. Doherty, W. D. Hutcheson, and K. K. Raychaudhuri, "High Capacity Digital Network Management and Control," In Proc. IEEE Global Telecommunications Conference, pp. 301.3.1-301.3.5, December 1990.

[8] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Co., New York, 1979.

[9] W. D. Grover, "The Self-Healing Network: A Fast Distributed Restoration Technique for Networks Using Digital Crossconnect Machines," In Proc. IEEE Global Telecommunications Conference, pp. 28.2.1-28.2.6, December 1987.

[10] W. D. Grover, B. D. Venables, J. H. Sandham, and A. F. Milne, "Performance Studies of a Selfhealing Network Protocol in Telecom Canada Long Haul Networks," In Proc. IEEE Global Telecommunications Conference, pp. 403.3.1-403.3.7, December 1990.

[11] C. Han Yang and S. Hasegawa, "FITNESS: Failure Immunization Technology for Network Service Survivability," In Proc. IEEE Global Telecommunications Conference, pp. 47.3.1-47.3.6, November/December 1988.

[12] T. C. Hu, Combinatorial Algorithms, Addison-Wesley, Reading, MA, 1982.

[13] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by Simulated Annealing," Science, vol. 220, pp. 671-680, May 1983.

[14] H. Komine, T. Chujo, T. Ogura, K. Miyazaki, and T. Soejima, "A Distributed Restoration Algorithm for Multiple-Link and Node Failures of Transport Networks," In Proc. IEEE Global Telecommunications Conference, pp. 403.4.1-403.4.5, December 1990.

[15] J. M. McQuillan, I. Richer, and E. C. Rosen, "The New Routing Algorithm for the ARPANET," IEEE Trans. Commun., vol. COM-28, pp. 711-719, May 1980.

[16] C. Palmer and F. Hummel, "Restoration in a Partitioned Multi-Bandwidth Cross-Connect Network," In Proc. IEEE Global Telecommunications Conference, pp. 301.7.1-301.7.5, December 1990.

[17] R. Perlman, "Fault-Tolerant Broadcast of Routing Information," Comput. Networks, vol. 7, pp. 395-405, December 1983.

[18] H. Sakauchi, Y. Nishimura, and S. Hasegawa, "A Self-Healing Network with an Economic Spare-Channel Assignment," In Proc. IEEE Global Telecommunications Conference, pp. 403.1.1-403.1.6, December 1990.

[19] D. Skeen, "Crash Recovery in a Distributed Database System," Ph.D. Thesis, Department of Electrical Engineering and Computer Science, University of California, Berkeley, 1982. (Also available as technical report UCB/BRL M82145.)

[20] J. M. Spinelli and R. G. Gallager, "Event Driven Topology Broadcast without Sequence Numbers," IEEE Trans. Commun., vol. COM-37, pp. 468-474, May 1989.

[21] E. I. Tsai, B. A. Coan, M. Kerner, and M. P. Vecchi, "A Comparison of Strategies for Survivable Network Design: Reconfigurable and Conventional Approaches," In Proc. IEEE Global Telecommunications Conference, pp. 301.1.1-301.1.7, December 1990.

[22] P. J. M. van Laarhoven and E. H. L. Aarts, Simulated Annealing: Theory and Applications, D. Reidel Publishing Company, Dordrecht, Holland, 1987.

[23] T.-H. Wu, D. J. Kolar, and R. H. Cardwell, "Survivable Network Architectures for Broadband Fiber Optic Networks: Model and Performance Comparisons," IEEE J. Lightwave Tech., vol. 6, pp. 1698-1709, November 1988.

AUTHORS

Brian A. Coan; Bellcore; 445 South Street; Morristown, New Jersey 07962-1910; USA.

Brian A. Coan received the B.S.E. degree in electrical engineering and computer science from Princeton University, Princeton, NJ, in 1977, the M.S. degree in computer engineering from Stanford University, Stanford, CA, in 1979, and the Ph.D. degree in computer science from the Massachusetts Institute of Technology, Cambridge, MA, in 1987. He has worked for Amdahl Corporation and AT&T Bell Laboratories. Since 1987 he has been a member of the technical staff at Bellcore. His main research interests are in distributed systems and fault tolerance.

Will E. Leland; Bellcore; 445 South Street; Morristown, New Jersey 07962-1910; USA.

Will E. Leland received the B.A. and B.S. degrees in mathematics and physics from the Massachusetts Institute of Technology in 1973 and 1974, and the M.S. and Ph.D. degrees in computer sciences from the University of Wisconsin-Madison in 1978 and 1982. Since 1984 he has been a member of the technical staff at Bellcore. His research interests include computer and communications network performance measurement and modeling, high-speed communications networks, network management, and distributed systems. He is a member of the Association for Computing Machinery, the IEEE, and Sigma Xi.



Mario P. Vecchi; Bellcore; 445 South Street; Morristown, New Jersey 07962-1910; USA.

Mario P. Vecchi is currently a District Manager in the Applied Research Area of Bellcore. Prior to joining Bellcore, he lived in Caracas, Venezuela, where he worked at the Instituto Venezolano de Investigaciones Cientificas, he was professor of electrical engineering at the Universidad Metropolitana, and he was the founder and Executive Director of Xynertek C.A. He received a B.S. in electrical engineering from Cornell University and a Ph.D. from the Massachusetts Institute of Technology.

Abel Weinrib; Bellcore; 445 South Street; Morristown, New Jersey 07962-1910; USA.

Abel Weinrib received the S.B. degree from the Massachusetts Institute of Technology in 1979 and the Ph.D. degree from Harvard University in 1983, both in Physics. Since 1985 he has been a member of the Technical Staff in the Applied Research Area at Bellcore. His current research interests include control issues in circuit and packet networks and algorithms for distributed systems. He is a member of the Association for Computing Machinery, the IEEE Computer Society, Phi Beta Kappa, and Sigma Xi.

Liang T. Wu; Bellcore; 445 South Street; Morristown, New Jersey 07962-1910; USA.

Liang T. Wu is currently District Manager of the Network Systems Architecture Research group at Bellcore. The work of his group is focused on studies to establish the technical feasibility of open distributed software to support multimedia communications. His prior work at Bellcore was on the architecture, design, and prototyping of fiber-based exchange networks, for both wideband and broadband applications. Prior to joining Bellcore he worked for three years at AT&T Bell Laboratories in Switching Systems Engineering, where he was a principal systems engineer for the "Fast Packet Network" exploratory study. He graduated from the University of Michigan, Ann Arbor, with a Ph.D. degree in Computer and Communication Sciences.

Manuscript TR91-317 received 1990 November 26; revised 1991 March 18.

IEEE Log Number 02224.