
IEEE TRANSACTIONS ON COMPUTERS, VOL. 37, NO. 9, SEPTEMBER 1988

Design and Analysis of Dynamic Redundancy Networks

Menkae Jeng and Howard Jay Siegel

Abstract-Most previous work in the fault-tolerant design of multistage interconnection networks (MIN's) has been based on improving the reliabilities of the networks themselves. For parallel systems containing a large number of processing elements (PE's), the capability to recover from a PE fault is also important. The dynamic redundancy (DR) network is investigated in this paper. It can tolerate faults in the network and support a system to tolerate PE faults without degradation by adding spare PE's, while retaining the full capability of a multistage cube network. The DR network can also be controlled by the same routing tags used for the multistage cube. Hence, with a recovery procedure added in the operating system, programs which can be executed in a system based on a multistage cube can be executed in a system based on the proposed network before and after a fault without any modification. A variation of the DR network, the reduced DR network, is also considered, which can be implemented more cost effectively than the DR while retaining most of the advantages of the DR. The reliabilities of DR-based systems with one spare PE and the reliabilities of systems with no spare PE's are estimated and compared, and the effect of adding multiple spare PE's is analyzed. It is shown that no matter how much redundancy is added into an MIN, the system reliability cannot exceed a certain bound; however, using the DR and spare PE's, this bound can be exceeded.

Index Terms-Dynamic redundancy, fault tolerance, interconnection network, multistage cube, parallel processing, partitioning, PASM, reconfiguration, reliability analysis.

I. INTRODUCTION

Large-scale parallel systems which employ a large number of processing elements (PE's) and an interconnection network for inter-PE communications have received increasing interest for applications which need fast computing power. These systems are vulnerable to failures because of the large number of components involved. One of the major design problems with such systems is to make these systems fault tolerant and reliable.

Fault-tolerant design for multistage interconnection network (MIN) based systems has been intensively studied.

Manuscript received August 28, 1986; revised April 27, 1987. This work was supported by the Supercomputing Research Center under Contract 695 and the Rome Air Development Center under Contract F30602-83-K-0119. Preliminary versions of parts of this paper were presented at the Sixth International Conference on Distributed Computing Systems, May 1986, and at the 1986 Real-Time Systems Symposium, December 1986. A topology similar to the DR was described by Balasubramanian et al., in J. Parallel Distributed Comput., Aug. 1988.

M. Jeng is with the Department of Computer Science, University of Houston, Houston, TX 77204.

H. J. Siegel is with the Parallel Processing Laboratory, School of Electrical Engineering, Purdue University, West Lafayette, IN 47907.

IEEE Log Number 8718421.

Most previous work has focused on designing fault-tolerant MIN's. An MIN is said to be fault-tolerant if, under certain types of faults, it can continue to provide a fault-free connection for any input-output pair. There are many methods to make an MIN fault tolerant. One example is to use error correcting codes to tolerate bit errors in the data and/or control paths [23]. Another example is to introduce redundant connection paths for any input-output pair such that at least one fault-free path is available in the presence of switch failures or link failures (e.g., [2]). Many fault-tolerant MIN's have been proposed recently (e.g., [3], [10], [14], [20], [22], [27], [30], [31], [34]-[36], [46], [48]).

Surveys and comparisons of fault-tolerant MIN's are given in [1] and [4].

Adding redundancy into the network can increase the system reliability. However, the overall system reliability will be bounded by the reliabilities of other components such as the PE's. One possibility to further enhance system reliability is to add spare PE's. Most fault-tolerant MIN's do not have extra I/O ports for the spares. Some networks, such as the ANC [35] and ABN [20], contain more than N I/O ports. However, they are primarily designed for systems containing N processors (or PE's) where a single processor is connected to multiple I/O ports. The issue of adding spare PE's to these networks has not been addressed.

This paper investigates a fault-tolerant variation of the generalized cube (GC) which contains extra I/O ports for adding spare PE's, and analyzes how much reliability improvement can be obtained by using this approach. The GC network is chosen as a representative of the topologically equivalent class of multistage cube networks, which include the baseline [47], the indirect binary n-cube [32], the omega [21], the Flip [7], the SW-banyan (S = F = 2) [18], and the multistage shuffle-exchange [45]. Multistage cube-type networks have been used or proposed for use in many systems, such as STARAN [8], the BBN Butterfly [11], the IBM RP3 [33], PASM [43], the Ultracomputer [19], the Ballistic Missile Defense Agency distributed processing test bed [26], [41], the Flow Model Processor of the Numerical Aerodynamic Simulator [6], and data flow machines [13].

In this paper, it is assumed that each PE is attached to an input port and an output port of an MIN. However, the results presented are also applicable if processors and memories are attached to different sides of an MIN. A PE which participates in the execution of tasks will be called a functioning PE; otherwise it is a spare PE. A spare PE will become functioning when a faulty functioning PE is detected and isolated.



Fault detection/location is not of direct concern here. There are procedures in the literature for detecting and locating faults in an MIN and an MIN-based system [12], [15], [24]. It is assumed that a faulty component can be successfully detected and located.

In Section II, the dynamic redundancy (DR) network and its properties will be discussed in detail. A variation of the DR network, the reduced DR (RDR) network, is considered in Section III. In Section IV, the reliabilities of systems based on the DR or RDR are estimated and compared to an upper bound to which the system reliability can be improved by adding redundancy into MIN's only, and the sufficient condition to obtain reliability improvement by using the DR or RDR and adding spare PE's is given. Finally, in Section V, the complexity of the DR network is discussed.

II. THE DR NETWORK

A. Definition of the DR Network

The design of the DR network is based on the interconnection graph of the GC. A GC for N = 8 is shown in Fig. 1. In general, a GC with N = 2^m I/O ports consists of m stages, where each stage consists of N/2 2 x 2 interchange boxes. Fig. 2 shows an equivalent SW-banyan graph of the GC [25], [40]. Let p = g_{m-1} ... g_1 g_0 be the binary representation of an arbitrary I/O port label. Then the m cube interconnection functions can be defined as

cube_i(g_{m-1} ... g_{i+1} g_i g_{i-1} ... g_1 g_0) = g_{m-1} ... g_{i+1} \bar{g}_i g_{i-1} ... g_1 g_0,   (2.1)

where 0 ≤ i < m, and \bar{g}_i denotes the complement of g_i [38]. Stage i of a GC can implement the cube_i function. Using the representation in Fig. 1, at stage i, link j and link cube_i(j) can exchange data. Using the representation in Fig. 2, at stage i, switch j and switch cube_i(j) can exchange data; i.e., switch j in stage i is connected to switch cube_i(j) in stage i - 1, 0 ≤ j < N, 0 ≤ i < m.

The DR network contains m stages, where N = 2^m. Stages are ordered from m - 1 to 0 from the input side to the output side of the network. Each stage has N + S switches followed by 3(N + S) links, as shown in Fig. 3 for N = 8 and S = 2. In addition, there are N + S output switches. This allows for an initial set of N functioning PE's and S spares. PE's and switches of the network are physically numbered from 0 to N + S - 1. PE j of the system is connected to the input of switch j of stage m - 1 and to the network output switch j. Each switch j at stage i of the network has three output links to stage i - 1. The first link f_{-i} is connected to switch (j - 2^i) mod (N + S) of stage i - 1, the second link f_0 is connected to switch j of stage i - 1, and the third link f_{+i} to switch (j + 2^i) mod (N + S) of stage i - 1. Switches at stage 0 are connected to the output switches.

A row of a DR network contains all the network switches having the same address, all links incident out of them, and the associated network input link. A row has the same address as its switches. Two rows of the DR network are said to be adjacent if their addresses are consecutive (modulo N + S).
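To make the cube_i function and the DR interstage wiring concrete, a minimal sketch of our own follows (the names cube and dr_output_links are assumptions, not from the paper). It computes cube_i of a label and the stage-(i-1) switches reached by the three output links of a DR switch.

    def cube(i, p):
        # cube_i interconnection function: complement bit i of the label p
        return p ^ (1 << i)

    def dr_output_links(j, i, N, S):
        # Stage-(i-1) switches reached by links f_{-i}, f_0, f_{+i} of switch j at stage i
        rows = N + S
        return {"f-": (j - 2**i) % rows, "f0": j, "f+": (j + 2**i) % rows}

    # Examples: cube_2(5) = 1; for N = 8, S = 2 (Fig. 3), switch 7 at stage 1 reaches 5, 7, and 9
    assert cube(2, 5) == 1
    assert dr_output_links(7, 1, 8, 2) == {"f-": 5, "f0": 7, "f+": 9}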

Fig. 1. Generalized cube network for N = 8. The four legitimate states of an interchange box (straight, exchange, upper broadcast, lower broadcast) are shown.

Fig. 2. (a) Graphical interpretation of the multistage cube network for N = 8. (b) Relationship between graphical interpretation and interchange box representation.

B. Reconfiguring the DR Network

It is assumed that the system assigns PE 0 to PE N - 1 as the functioning PE's at the beginning. When PE j or row j is detected faulty, physical PE p and row p of the network, 0 ≤ p < N + S, will logically be renumbered t(p):

t(p) = (p - k) mod (N + S),   (2.2)

where k = (j + S) mod (N + S). PE's with logical addresses between 0 and N - 1 will become the new functioning PE's. Let Π denote the subnetwork which contains all rows of the network with logical addresses between 0 and N - 1; then Π will be the subnetwork that provides the inter-PE communications for the new functioning PE's. It will be shown that no matter what j is, Π can logically perform all functions that a GC can perform.

Theorem 1: Stage i of Π, 0 ≤ i ≤ m - 1, can perform cube_i based on the logical addresses.

Proof: For any logical switch t(p) at stage i of Π, 0 ≤ i < m, if t(q) = cube_i(t(p)), then it is obvious that 0 ≤ t(q) < N. Hence, it is sufficient to show that there always exists a link from physical switch p to physical switch q. From (2.2),

t(p) = (p - k) mod (N + S)   (2.3)


Fig. 3. A DR network for N = 8 and S = 2.

t(q) = (q - k) mod (N + S).   (2.4)

Case 1: Bit i of t(p) is 0: Since bit i of t(p) is 0, from (2.3),

0 ≤ [t(p) + 2^i = (p - k) mod (N + S) + 2^i] < N + S.

So, (p - k) mod (N + S) + 2^i = (p - k + 2^i) mod (N + S). Since

(q - k) mod (N + S) = t(q) = cube_i(t(p)) = t(p) + 2^i,

it follows that (q - k) mod (N + S) = (p - k + 2^i) mod (N + S). So, q = (p + 2^i) mod (N + S). From the definition of the network, there is a link f_{+i} which connects switch p at stage i to switch (p + 2^i) mod (N + S) at stage i - 1.

Case 2: Bit i of t(p) is 1: The proof is similar to Case 1. □

Corollary: In a DR network with no faults, stage i of Π can perform cube_i, 0 ≤ i ≤ m - 1.

Proof: It follows from Theorem 1 when t(p) is the identity function (k = 0). □

Fig. 4 shows that Π contains a cube network for N = 8 and S = 2 with PE 7 faulty and t(p) = (p - k) mod (N + S) = (p - 9) mod 10. The solid lines show the cube subgraph of the DR (compare to Figs. 2 and 3). The DR network can tolerate any single PE or network fault. Multiple faults in S adjacent rows of the network and the associated S PE's can be tolerated by selecting Π to exclude those S rows and PE's.
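As a concrete illustration of the renumbering in (2.2), the following sketch of our own (identifier names are assumptions, not from the paper) computes the logical addresses and the surviving set of functioning PE's when PE j or row j fails.

    def dr_renumber(j, N, S):
        # Logical address t(p) of every physical row p after PE/row j fails (eq. 2.2)
        rows = N + S
        k = (j + S) % rows
        t = {p: (p - k) % rows for p in range(rows)}
        # Rows with logical addresses 0 .. N-1 form the subnetwork Pi of functioning PE's
        functioning = sorted(p for p, lp in t.items() if lp < N)
        return t, functioning

    # Example of Fig. 4: N = 8, S = 2, PE 7 faulty, so k = 9 and t(p) = (p - 9) mod 10;
    # rows 7 and 8 are excluded and row 9 joins the functioning set
    t, functioning = dr_renumber(7, 8, 2)
    assert functioning == [0, 1, 2, 3, 4, 5, 6, 9]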

C. Control of the DR Network

The DR network can be operated in either the circuit switched mode or the packet switched mode. When a PE wants to transfer data, it will generate a routing tag as the header of a message to establish a connection path. Each switch is set independently. The Exclusive-OR [41] and destination [21] routing tag schemes used for the GC can be used to control the DR network.

Fig. 4. Reconfiguration of a DR network with N = 8 and S = 2 when PE 7 is faulty. The solid lines show the cube subgraph of the DR network.

Control of the DR network is based on the logical addresses. Given a source PE with logical address X and a destination PE with logical address Y, the Exclusive-OR tag E = e_{m-1} ... e_1 e_0 can be derived by taking the bitwise Exclusive-OR of X and Y. Each switch in stage i of Π will examine e_i to determine which link to use. Let W = w_{m-1} ... w_1 w_0 be the logical address of the switch. If e_i = 0, then link f_0 is used. If e_i = 1 and w_i = 0, switch W will use link f_{+i}. Otherwise (e_i = 1 and w_i = 1) it will use link f_{-i}. Thus, for e_i = 1, W is connected to cube_i(W).

This approach necessitates adding a one-bit flag in each switch to store the ith bit of the logical address of the switch, w_i. Observe that at stage i a connection path from a source logical PE X to a destination logical PE Y will use the switch with logical address W = y_{m-1} ... y_{i+1} x_i ... x_0 (this is shown for the GC in [40]). Thus, w_i = x_i. Therefore, these flags can be set by each source PE sending its logical address during system initialization and after each reconfiguration due to a fault. Furthermore, as an alternative scheme that does not require a switch to store w_i, the source PE logical address X can be sent with each message, and x_i is used in place of w_i.

The DR network can also be controlled by using a destination tag, which is the destination PE logical address Y. A switch W in stage i can examine y_i and w_i and use link f_{-i} when y_i < w_i, link f_0 when y_i = w_i, or link f_{+i} when y_i > w_i. As in the case of the Exclusive-OR scheme, x_i can be used in place of w_i.

For the GC, either routing scheme can include an m-bit broadcast mask B, where b_i = 1 means broadcast at stage i [41]. At a stage i switch in the DR network, when b_i = 1, if w_i = 0, links f_0 and f_{+i} will be used (upper broadcast); otherwise, links f_0 and f_{-i} will be used (lower broadcast). As before, x_i can be used in place of w_i.
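A minimal sketch of the Exclusive-OR routing decision just described (our own; the function names are assumptions, not the authors' implementation): given tag bit e_i and the stage-i bit w_i of the switch's logical address, one of the three output links is selected.

    def make_xor_tag(X, Y):
        # Exclusive-OR routing tag E for source logical address X and destination Y
        return X ^ Y

    def xor_route(e_i, w_i):
        # Link chosen by a stage-i switch of Pi under the Exclusive-OR tag scheme
        if e_i == 0:
            return "f0"                       # straight
        return "f+" if w_i == 0 else "f-"     # realize cube_i on the logical address

    # Example: X = 2, Y = 5 gives E = 7, so every stage performs cube_i
    assert make_xor_tag(2, 5) == 7
    assert xor_route(1, 0) == "f+" and xor_route(1, 1) == "f-" and xor_route(0, 1) == "f0"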

D. Partitionability of the DR Network

The partitionability of an interconnection network is the ability to divide the network into independent subnetworks


such that each subnetwork of size N has all of the interconnection capabilities of a complete network of that same type with size N [39]. When S is even, the DR network can be partitioned into two independent subnetworks π_0 and π_1 by setting all switches in stage 0 to f_0, where π_0 contains all even rows and π_1 contains all odd rows. Both π_0 and π_1 are of size (N/2) + (S/2). The theory underlying this is similar to that for partitioning the ADM network, as discussed in [39] and [40]. An example of partitioning the DR network with N = 8 and S = 2 is shown in Fig. 5. Each PE in a partition has a partition address between 0 and (N/2) + (S/2) - 1. Let h_{π_i} be an address transformation which maps the physical addresses of switches in π_i to the partition addresses. Then h_{π_0}(j) = j/2 and h_{π_1}(j) = (j - 1)/2.

Since π_0 and π_1 have all the interconnection capabilities of a complete DR network, both π_0 and π_1 can be partitioned again if both N/2 and S/2 are even. The network does not have to be partitioned into subnetworks of the same size. For example, consider a DR network with S ≥ 4. It could first be partitioned into odd and even halves, and then just the even half is partitioned again, resulting in one subnetwork of size (N/2) + (S/2) and two subnetworks of size (N/4) + (S/4). In general, the physical addresses of all the switches in a subnetwork (partition) of size (N + S)/2^u must agree in their low-order u bit positions, and each partition (subsystem) can tolerate faults independently.
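The even/odd partitioning and the address transformations h_{π_0} and h_{π_1} can be illustrated with the following sketch (our own, under the assumptions stated in the text).

    def partition_rows(N, S):
        # Split the N + S rows into the even subnetwork pi_0 and the odd subnetwork pi_1,
        # mapping each physical row to its partition address
        rows = range(N + S)
        pi0 = {j: j // 2 for j in rows if j % 2 == 0}         # h_pi0(j) = j/2
        pi1 = {j: (j - 1) // 2 for j in rows if j % 2 == 1}   # h_pi1(j) = (j - 1)/2
        return pi0, pi1

    # Example of Fig. 5: N = 8, S = 2, each half is a DR subnetwork of size 4 + 1
    pi0, pi1 = partition_rows(8, 2)
    assert pi0 == {0: 0, 2: 1, 4: 2, 6: 3, 8: 4}
    assert pi1 == {1: 0, 3: 1, 5: 2, 7: 3, 9: 4}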

The partitionability of the DR network provides the necessary capabilities for multiprocessor systems which can operate in multiple-SIMD mode and use spare PE's in each SIMD subsystem to enhance system reliability. Control and reconfiguration of each subnetwork for each SIMD subsystem will be the same as discussed in the previous section. The partitionability property of the DR network can also be exploited to establish virtual MIMD subsystems in an MIMD system. It can also be used to support partitionable SIMD/MIMD systems, i.e., systems capable of being partitioned into independent SIMD or MIMD machines of various sizes. PASM is such a system [43], [42]. The DR study was motivated by an investigation of incorporating fault tolerance into PASM.

E. Fault Recovery

In the recovery process, to ensure that the data for a task to be restarted are not polluted by faults, the task has to be rolled back to the beginning or to a point where a copy of clean data has been stored. The rollback distance depends on the fault detection techniques used and error latency [37] and will not be discussed here.

Unlike most other fault-tolerant MIN's, the proposed DR network does not provide multiple connection paths for every input-output pair. Fault tolerance of the DR network, as well as the system based on it, is achieved by reconfiguration. After reconfiguring the system, programs and data must be reloaded into the new functioning PE's to restart the task. Since the DR network eliminates network rows and PE's rather than finding a second connection path, there is no need to determine if a connection path is faulty and no need to modify the routing tags to reroute paths. In an SIMD environment, permuting data would need only one pass through the network after the recovery.

Fig. 5. Partitioning a DR network with N = 8 and S = 2 into two independent subnetworks. The solid lines show the subnetwork containing only odd-numbered switches. The dotted lines show the subnetwork containing only even-numbered switches.

Hence, with a recovery procedure added in the operating system, programs which can be executed in a system based on a GC can be executed in a DR-based system before and after a fault without any modification. The performance of the DR-based system will not be degraded after recovery.

III. THE REDUCED DR NETWORK

Consider the proof of Theorem 1 in Section II-B. In stage i of the DR network, if bit i of t(p), the logical address of a physical switch p, is 0, then link f_{+i} is used to perform cube_i(t(p)); if bit i of t(p) is 1, then link f_{-i} is used to perform cube_i(t(p)). Therefore, if the function t(p) is modified such that bit i of t(p) is always equal to bit i of p for some i, then switches in these stages will need only two links: f_0 and f_{+i}, or f_0 and f_{-i}. Thus, the complexity of the DR can be reduced without losing the capability to emulate the multistage cube network.

The graph of an RDR network of size N + S, with N = 2^m and S = 2^s, is a subgraph of a DR. The procedure to construct an RDR network is as follows.

1) At stages m - 1 to s, the RDR network has the same interstage connections as the DR network.

2) At stage i, 0 ≤ i < s, a switch with physical address p, 0 ≤ p < N + S, has two output links. If bit i of p is 0, then switch p has output links f_0 and f_{+i}; otherwise, it has output links f_0 and f_{-i}.
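The two construction steps above can be summarized in a short sketch (our own, not the authors' implementation); it returns the stage-(i-1) targets of the output links of a switch with physical address p in an RDR network with N = 2^m and S = 2^s.

    def rdr_output_links(p, i, m, s):
        # Output-link targets of switch p at stage i of an RDR network (N = 2**m, S = 2**s)
        N, S = 2**m, 2**s
        rows = N + S
        if i >= s:
            # Stages m-1 .. s: the same three links as in the DR network
            return {"f-": (p - 2**i) % rows, "f0": p, "f+": (p + 2**i) % rows}
        # Stages s-1 .. 0: only two links, selected by bit i of the physical address
        if (p >> i) & 1 == 0:
            return {"f0": p, "f+": (p + 2**i) % rows}
        return {"f0": p, "f-": (p - 2**i) % rows}

    # Example of Fig. 6 (N = 8, S = 4): at stage 1, switch 3 (bit 1 = 1) has links f0 and f-
    assert rdr_output_links(3, 1, 3, 2) == {"f0": 3, "f-": 1}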

When S = 1 (s = 0), the RDR network is identical to the DR network. If S > 1 (s > 0), the RDR has fewer links in stages s - 1 to 0. The interconnections f_{+i} and f_{-i} in stages s - 1 to 0 of the RDR network connect the switch physically numbered p to switch cube_i(p). An RDR network with N = 8 and S = 4 is shown in Fig. 6.

Since the RDR network contains fewer links than the DR, not any N adjacent rows of the RDR network can act as a GC.


Fig. 6. An RDR network with N = 8 and S = 4.

When a fault occurs, the set of new functioning PE's must be carefully selected so that the RDR network can still provide the full GC interconnection capabilities for the new functioning PE's. When a PE and/or a row of the RDR with physical address j fails, PE p and row p, 0 ≤ p < N + S, will be logically renumbered t'(p):

t'(p) = (p - k') mod (N + S),   (3.1)

where k' = ⌊(j + S)/S⌋ x S. The notation ⌊x⌋ represents the floor function, the greatest integer smaller than or equal to x. Consider Fig. 6 for example. If PE 1 is faulty, then k' = 4, PE's 0, 1, 2, and 3 will be isolated, and the remaining PE's will become the new functioning PE's.
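A small sketch (our own) of the RDR renumbering (3.1); it reports which physical PE's are isolated when PE j (or row j) fails.

    def rdr_renumber(j, N, S):
        # Logical address t'(p) of every physical row p after PE/row j fails (eq. 3.1)
        rows = N + S
        k = ((j + S) // S) * S                 # k' is always a multiple of S
        t = {p: (p - k) % rows for p in range(rows)}
        isolated = sorted(p for p, lp in t.items() if lp >= N)
        return t, isolated

    # Example of Fig. 6: N = 8, S = 4, PE 1 faulty, so k' = 4 and PE's 0, 1, 2, 3 are isolated
    _, isolated = rdr_renumber(1, 8, 4)
    assert isolated == [0, 1, 2, 3]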

The RDR network has the same interstage connections as the DR network in stages m - 1 to s. In stages s - 1 to 0 of the RDR, because k' is a multiple of S and S = 2^s, bit i of t'(p) will be equal to bit i of p for 0 ≤ i < s. Therefore, Theorem 2 follows.

Theorem 2: Given the mapping function t', the RDR network can provide all the interconnection capabilities of a GC before and after a single fault occurs.

Proof: The proof is similar to that for Theorem 1 and hence is omitted here. □

Fault tolerance in the DR and RDR networks is different. In the DR network, multiple faults are tolerable if they are contained within S adjacent rows of the network and their associated PE's. In the RDR, multiple faults can be tolerated when each of them causes the same PE's to be isolated. Let p and q be two physical addresses of any two faulty components. If ⌊(p + S)/S⌋ = ⌊(q + S)/S⌋, then these faults are tolerable. For example, in Fig. 6, multiple faults in both PE 0 and PE 3

are tolerable because both of them cause PE’s 0, 1, 2, and 3 to be isolated, while multiple faults in both PE 3 and PE 4 are not tolerable because they cause different PE’s to be isolated.

The RDR network has the same partitionability as the DR network. Even with the reduction of the number of links in stages s - 1 to 0, the RDR network can be partitioned into S independent subnetworks, each of which provides single fault tolerance for each subsystem.

One advantage of the RDR network is that it can be constructed from smaller DR and GC networks. First consider stages m - 1 to s of an RDR network. Recall that the RDR network can be partitioned into S independent subnetworks of size (N/S) + 1 by setting switches in stages 0 through s - 1 to straight; each subnetwork contains switches whose physical addresses agree in the low-order s bit positions. Based on reasoning similar to that for partitioning, stages m - 1 to s of an RDR can be decomposed into S DR subnetworks of size (N/S) + 1. Each DR subnetwork contains switches whose physical addresses agree in the low-order s bit positions. Second, in stages s - 1 to 0 of the RDR network, a switch with physical address p has links f_0 and f_{+i} if bit i of p is 0, or links f_0 and f_{-i} if bit i of p is 1. So the interconnection function f_{-i} or f_{+i} in stage i, 0 ≤ i < s, is equivalent to cube_i. Therefore, stages s - 1 to 0 of the RDR network can be decomposed into (N/S) + 1 GC subnetworks of size S. Each GC subnetwork contains switches of the RDR whose physical addresses agree in bit m to bit s. Therefore, an RDR network of size N + S can be constructed by using S DR networks of size (N/S) + 1 and (N/S) + 1 GC networks of size S. Fig. 7 shows an example of constructing an RDR network of size 8 + 4 from four DR networks of size 2 + 1 and three GC networks of size 4.
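A small numerical illustration of this decomposition (our own sketch): for given N = 2^m and S = 2^s it reports how many DR and GC building blocks an RDR network of size N + S requires.

    def rdr_building_blocks(m, s):
        # An RDR of size N + S is built from S DR networks of size (N/S) + 1
        # and (N/S) + 1 GC networks of size S
        N, S = 2**m, 2**s
        return {"DR": (S, N // S + 1),    # (count, size) of the DR subnetworks
                "GC": (N // S + 1, S)}    # (count, size) of the GC subnetworks

    # Example of Fig. 7: an RDR of size 8 + 4 uses four DR's of size 2 + 1 and three GC's of size 4
    assert rdr_building_blocks(3, 2) == {"DR": (4, 3), "GC": (3, 4)}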

Previously, the fault tolerance of the DR and RDR networks has been considered based on an implementation with N + S switches per stage. Consider an RDR network implemented by DR and GC subnetworks as described above. If 2 x 2 interchange boxes are used in the GC subnetworks, then any single interchange box failure in the GC subnetworks can be tolerated. For example, consider the RDR network in Fig. 6. If the interchange box which corresponds to switches 1 and 3 in stages 1 and 0 (as well as the associated links) fails, the failure can be tolerated no matter how the RDR network is partitioned. If the RDR network is not partitioned, then failure of the interchange box can be considered as multiple faults which cause the same PE’s (PE’s 0, 1, 2, and 3 ) to be isolated and hence can be tolerated. If the RDR network is partitioned into two subnetworks, one containing even rows and the other containing odd rows, then the interchange box failure will cause PE’s 1 and 3 in the odd partition to be isolated. So this failure is also tolerable. When the RDR network is partitioned into four subnetworks, the failure of the interchange box will affect two partitions, one containing rows 1, 5 , and 9, and the other containing rows 3 , 7, and 11. Each partition contains faults in one row of the subnetwork. Both partitions have to be reconfigured to tolerate this failure.

In general, since a 2 x 2 interchange box corresponds to switches in two rows of the RDR network, at most two partitions will be affected if an interchange box is faulty. If these two partitions are reconfigured as described above, the interchange box failure can be tolerated.


IV. RELIABILITY ANALYSIS

The system reliability R_0 of a multistage cube based system with N PE's is

R_0 = C_1 x R_p^N x R_n,   (4.1)

where R_p is the reliability of an individual PE, R_n is the reliability of the interconnection network, and C_1 is a factor which represents the reliability degradation due to other subsystems, such as the secondary memory storage and those modules which control and sequence the system.

Using a fault-tolerant MIN (not a DR) would increase R_n and therefore increase R_0. However, no matter how reliable an MIN is used, R_n is always less than one. Hence, R_0 can never exceed C_1 R_p^N. Let

R = C_1 R_p^N.   (4.2)

Then R can be considered as an upper bound of the system reliability that can be obtained by adding redundancy to the MIN only.

Now consider a DR-based system which contains N + 1 homogeneous PE's and a DR network of size N + 1. Let

R_1 be the reliability of the overall system with S = 1;
λ_p be the failure rate of an individual PE;
λ_s be the failure rate of a network switch and all links incident out of it;
ρ be the ratio between λ_s and λ_p, i.e., ρ = λ_s/λ_p;
R_s be the reliability of a network switch and the associated links;
R_r be the reliability of an individual row of the DR network.

The failure rate of a component depends on the complexity of the component, the maturity of the fabrication process, and other factors such as operating temperature. Both λ_p and λ_s are in terms of number of failures per million hours. Typical component failure rates are in the range of 0.1 to tens per million hours [44]. If both λ_p and λ_s are constant during the operational life of the system, then given a mission time T, the number of failures which may occur in a single PE during this period is λ_p T. For example, if λ_p = 1 failure per million hours and T = 1000 h, then λ_p T = 10^{-3}. The failure probabilities will follow the exponential distribution [44]. Thus, for a mission time T, R_p = e^{-λ_p T} and R_s = e^{-λ_s T}. R_r is approximately equal to R_s^{m+1} because there are m switches plus one output switch in a row of the DR network. Replacing λ_s by ρλ_p, R_r can be expressed as

R_r = (e^{-ρ λ_p T})^{m+1} = R_p^{ρ(m+1)} = R_p^α,   (4.3)

where α = ρ(m + 1). Since a network switch is much smaller than a PE, λ_s is far less than λ_p; hence, ρ ≪ 1. For a PE or a row of the DR to be usable in the system, both of them must be fault free, if they have the same physical address. The reliability R_1 of the overall system then is

R_1 = C_2[(R_p R_r)^{N+1} + (N + 1)(R_p R_r)^N (1 - R_p R_r)]
    = C_2 e^{-(1+α)N λ_p T}[1 + N(1 - e^{-(1+α)λ_p T})],   (4.4)

where C_2 is a factor similar to C_1. To compare R_1 to the upper bound R, it is assumed that C_1 = C_2 = 1. Table I shows R_1 and R versus N for three different values of λ_p T at ρ = 0.1.


TABLE I
R AND R_1 VERSUS N FOR THREE VALUES OF λ_p T (C_1 = C_2 = 1, ρ = 0.1)

          λ_p T = 10^{-4}      λ_p T = 5 x 10^{-4}    λ_p T = 10^{-3}
  N       R        R_1         R        R_1           R        R_1
  4       0.9996   1.0000      0.9980   0.9999        0.9960   0.9999
  16      0.9984   0.9999      0.9920   0.9999        0.9841   0.9996
  64      0.9936   0.9999      0.9684   0.9985        0.9379   0.9944
  256     0.9747   0.9988      0.8798   0.9747        0.7740   0.9135
  1024    0.9026   0.9799      0.5992   0.7078        0.3589   0.3662
  4096    0.6639   0.7569      0.1289   0.0513        0.0166   0.0008

In Table I, when λ_p T = 10^{-3} or 5 x 10^{-4}, R_1 > R for N ≤ 1024; for λ_p T = 10^{-4}, R_1 > R for N ≤ 4096. In the following, it will be shown that R_1 can exceed the bound R for a wide range of N.

Theorem 3: If N ≤ [1 - (1 + α)^2 λ_p T]/[α(1 + α/2)λ_p T], then R_1 > R.

Proof: Let y = λ_p T. Because R = e^{-Ny} and R_1 = e^{-(1+α)Ny}[1 + N(1 - e^{-(1+α)y})], proving R_1 > R is equivalent to proving

e^{-αNy}[1 + N(1 - e^{-(1+α)y})] > 1.   (4.5)

Since e^{-αNy} = Σ_{k=0}^{∞} (-αNy)^k / k! and 1 + N[1 - e^{-(1+α)y}] = 1 - N Σ_{k=1}^{∞} [-(1+α)y]^k / k!, (4.5) becomes

( Σ_{k=0}^{∞} (-αNy)^k / k! ) ( 1 - N Σ_{k=1}^{∞} [-(1+α)y]^k / k! ) > 1.   (4.6)

Because N ≤ [1 - (1+α)^2 y]/[α(1 + α/2)y],

αNy + α^2 Ny/2 + (1+α)^2 y ≤ 1.   (4.7)

Since all the terms on the left side of (4.7) are positive, each term must be less than one. Hence, αNy < 1, and (1+α)y < (1+α)^2 y < 1. Both series in (4.6) are therefore alternating series whose terms decrease in magnitude, so each factor can be bounded from below by truncating its series after a negative term; these truncated bounds are (4.8) and (4.9). From (4.5), (4.6), (4.8), and (4.9), it is sufficient to prove that the product of the two truncated bounds exceeds one, which is inequality (4.10). Since 1 - αNy - α^2 Ny/2 - (1+α)^2 y ≥ 0 (see (4.7)),

Ny - α(Ny)^2 - (αNy)^2/2 - (1+α)^2 Ny^2/2 ≥ 0.   (4.11)

Grouping the remaining terms of the expanded product in (4.10) in the same way yields the further nonnegative quantities (4.12), (4.13), and (4.14). By adding (4.11), (4.12), (4.13), and (4.14) together and incrementing both sides by 1, (4.10) is attained. Therefore, R_1 > R. □

When Nλ_p T is large, both R and R_1 become small. For example, if λ_p T = 10^{-3} and N = 1024, then Nλ_p T = 1, R = 0.359, and R_1 = 0.366. In practice, for a system to be usable, the system reliability must be near one, i.e., Nλ_p T must be far less than one. Thus, in a practical system, (1 + α)^2 λ_p T ≪ 1, and the condition in Theorem 3 can be approximated by the following inequality:

N ≤ 1/[α(1 + α/2)λ_p T].   (4.15)

When N satisfies the condition above, a reliability improvement can be made by using a DR network of size N + 1 and one spare PE in a system. For example, if λ_p T = 5 x 10^{-4} and ρ = 0.1, then a reliability improvement can be made for any system with N ≤ 1024. The reliability improvement in this case can be measured as

RIF = (1 - R)/(1 - R_1),   (4.16)

where RIF is the reliability improvement factor [29]. For example, if N = 256 and λ_p T = 10^{-4}, RIF = (1 - 0.9747)/(1 - 0.9988) = 21. If N is very large, R_1 may become smaller than R (see Table I). The reason is that when N is very large, the effect of adding one spare becomes small and the reliability enhancement is offset by the decrease in the network reliability. However, R_1 < R does not imply R_1 < R_0, because the reliability of an N x N fault-tolerant MIN will also decrease as N increases and hence R_0 will become much less than R.
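As an illustrative check of (4.2), (4.4), and (4.16) (a sketch of our own, with C_1 = C_2 = 1 and ρ = 0.1 as in Table I; the function names are assumptions), the following computes R, R_1, and the reliability improvement factor.

    import math

    def reliability_bound(N, lam_p_T):
        # Upper bound R = R_p**N obtainable by adding redundancy to the MIN only (eq. 4.2, C_1 = 1)
        return math.exp(-N * lam_p_T)

    def dr_reliability_one_spare(N, lam_p_T, rho=0.1):
        # R_1 for a DR-based system with one spare PE (eq. 4.4, C_2 = 1)
        alpha = rho * (int(math.log2(N)) + 1)          # alpha = rho * (m + 1), eq. 4.3
        x = (1 + alpha) * lam_p_T
        return math.exp(-N * x) * (1 + N * (1 - math.exp(-x)))

    # N = 256, lambda_p*T = 1e-4: close to the Table I entries (R = 0.9747, R_1 = 0.9988);
    # RIF is about 21 when computed from the rounded table values (about 22 at full precision)
    R = reliability_bound(256, 1e-4)
    R1 = dr_reliability_one_spare(256, 1e-4)
    print(round(R, 4), round(R1, 4), (1 - R) / (1 - R1))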

B. Effect of Multiple Spares in a Nonpartitionable System

In a DR-based or RDR-based system, using more spares (S > 1) may tolerate more faults. However, it will be shown


that if a system is not designed to operate in a partitionable environment, then little reliability improvement can be made by adding additional spares.

First, consider the reliability of a DR-based system. Let R_S denote the reliability of a nonpartitionable system containing N + S homogeneous PE's and a DR network of size N + S for inter-PE communication, and define RIF_{1,S} as follows:

RIF_{1,S} = (1 - R_1)/(1 - R_S).   (4.17)

The following theorem can be derived.

Theorem 4: For 2 ≤ S < N, a) R_S > R_1; b) RIF_{1,S} ≈ 1 if (1 + α)Sλ_p T ≪ 1.

Proof: A DR-based system of size N + S is available for applications which require N PE's if at least N adjacent PE's and their associated rows of the DR network are fault-free. Thus, if there are exactly N + S, N + S - 1, ..., or N adjacent PE's and associated rows of the DR which are fault-free, then the system is usable. Let R_c = R_p R_r = e^{-(1+α)λ_p T}. The probability that all PE's and all rows of the DR are fault-free is R_c^{N+S}; the probability that exactly N + S - 1 adjacent PE's and the associated rows of the DR are fault-free is (N + S)R_c^{N+S-1}(1 - R_c). For a system to contain exactly i fault-free adjacent PE's and their associated rows of the DR, N ≤ i ≤ N + S - 2, there must be a faulty PE or component associated with each row on the "top and bottom (mod N + S)" of the i fault-free rows. For example, if i = N + 1 and PE's 2 to N + 2 and their associated rows of the DR are fault-free, then PE 1 or row 1 of the DR must be faulty, and PE N + 3 or row N + 3 of the DR must be faulty; otherwise i will be N + 2 or N + 3 instead of N + 1. The PE's and rows numbered N + 4 to N + S - 1, and 0, may or may not be faulty. Thus, the probability that exactly i adjacent PE's and their associated rows of the DR are fault-free is (N + S)R_c^i(1 - R_c)^2, where N ≤ i ≤ N + S - 2. Since all these conditions are mutually exclusive, the reliability R_S of the system is

R_S = C(R_c^{N+S} + (N + S)[R_c^{N+S-1}(1 - R_c) + R_c^{N+S-2}(1 - R_c)^2 + ... + R_c^{N+1}(1 - R_c)^2 + R_c^N(1 - R_c)^2])
    = C[R_c^{N+S} + (N + S)R_c^{N+S-1}(1 - R_c) + (N + S)R_c^N(1 - R_c)^2 Σ_{i=0}^{S-2} R_c^i],   (4.18)

where C is a factor similar to C_1. Using Σ_{i=0}^{S-2} R_c^i = (1 - R_c^{S-1})/(1 - R_c) and rearranging terms in (4.18),

R_S = C R_c^N [R_c^S + (N + S)(1 - R_c)].   (4.19)

a) R_S > R_1 for 2 ≤ S ≤ N:

R_S/R_1 = [R_c^S + (N + S)(1 - R_c)]/[R_c + (N + 1)(1 - R_c)]
        = 1 + (1 - R_c)[S - Σ_{i=0}^{S-1} R_c^i]/[R_c + (N + 1)(1 - R_c)],

where Σ_{i=0}^{S-1} R_c^i = (1 - R_c^S)/(1 - R_c). Since R_c < 1, S - Σ_{i=0}^{S-1} R_c^i > 0 and 1 - R_c > 0. Hence, R_S/R_1 > 1. Therefore, R_S > R_1 for 2 ≤ S < N.

b) RIF_{1,S} ≈ 1 if (1 + α)Sλ_p T ≪ 1:

RIF_{1,S} = (1 - R_1)/(1 - R_S) = {1 - C R_c^N [R_c + (N + 1)(1 - R_c)]}/{1 - C R_c^N [R_c^S + (N + S)(1 - R_c)]}.

For (1 + α)Sλ_p T ≪ 1, R_c^S = e^{-(1+α)Sλ_p T} ≈ 1 - (1 + α)Sλ_p T and R_c ≈ 1 - (1 + α)λ_p T. Let F_c = (1 + α)λ_p T. Then

RIF_{1,S} ≈ {1 - C R_c^N [1 - F_c + (N + 1)F_c]}/{1 - C R_c^N [1 - SF_c + (N + S)F_c]} = {1 - C R_c^N (1 + NF_c)}/{1 - C R_c^N (1 + NF_c)} = 1. □

In general, (1 + α)S < N, so (1 + α)Sλ_p T < Nλ_p T < 1. Therefore, in practice, RIF_{1,S} ≈ 1 holds for most systems. Table II shows the system reliabilities for different values of S and N. It shows that for a nonpartitionable system, a significant enhancement can be obtained by adding the first spare (S = 1), while only little gain can be made by further increasing S.

In an RDR-based system, not any N adjacent fault-free PE's and their associated rows of the RDR can be used. Therefore, R_S for an RDR-based system will be less than R_S for a DR-based system, for S = 2^s. So RIF_{1,S} ≈ 1 holds for the RDR also.

C. Effect of Multiple Spares in a Partitionable System

The partitionability of the DR (or RDR) can provide the necessary capabilities for a partitionable system to incorporate spare PE's. In a system designed to operate in multiple-SIMD mode, virtual multiple-MIMD mode, or partitionable SIMD/MIMD mode, according to Theorem 4, only one spare is needed in each subsystem. If a system of size N is to be partitioned into Q (= 2^q) subsystems of size 2^{m-q}, Q spare PE's are needed. The DR (or RDR) network to be used in the system will be of size N + Q. Because each partition can operate independently, the overall system reliability is the product of all the reliabilities of these partitions. Let R_Q be the overall system reliability when the system is partitioned into Q subsystems. Then

R_Q = C_3 [R_c^{(N/Q)+1} + ((N/Q) + 1) R_c^{N/Q} (1 - R_c)]^Q,   (4.20)

where C_3 is a factor similar to C_1 and C_2, and R_c = R_p R_r. Table III shows R_Q and R_1 at three different values of λ_p T for C_3 = 1, q = m/2, and ρ = 0.1. It shows that the system reliability can be greatly improved by adding multiple spares in a partitionable system.
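To illustrate (4.19) and (4.20) numerically (our own sketch, with C = C_3 = 1 and ρ = 0.1; function names are assumptions), the following reproduces the behavior summarized in Tables II and III: spares beyond the first add almost nothing in a nonpartitionable system, while a partitioned system with one spare per subsystem benefits substantially.

    import math

    def R_c(N, lam_p_T, rho=0.1):
        # Reliability of a PE together with its DR row: R_c = R_p * R_r = e**(-(1 + alpha) * lam_p * T)
        alpha = rho * (int(math.log2(N)) + 1)
        return math.exp(-(1 + alpha) * lam_p_T)

    def R_S(N, S, lam_p_T):
        # Nonpartitionable DR-based system with S spares (eq. 4.19, C = 1)
        rc = R_c(N, lam_p_T)
        return rc**N * (rc**S + (N + S) * (1 - rc))

    def R_Q(N, Q, lam_p_T):
        # Partitionable system with Q subsystems and one spare each (eq. 4.20, C_3 = 1)
        rc = R_c(N, lam_p_T)
        n = N // Q
        return (rc**(n + 1) + (n + 1) * rc**n * (1 - rc))**Q

    # N = 256, lambda_p*T = 1e-3: extra spares barely change R_S (Table II),
    # while partitioning into Q = 16 subsystems raises the reliability markedly (Table III)
    print(R_S(256, 1, 1e-3), R_S(256, 3, 1e-3))   # both about 0.91
    print(R_Q(256, 16, 1e-3))                     # about 0.99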

V. COMPLEXITY AND COMPARISON

An approach to adding spares is used in the GF11 SIMD machine project [9]. There a 576 input/output Benes network is used, where there are 512 "primary processors" and 64 spares. While the GF11 design is good for its intended applications, its overall architecture and centralized control Benes network were not designed for, and are therefore inappropriate for, emulating GC networks in multiple-SIMD, MIMD, and partitionable SIMD/MIMD systems.


TABLE II
RELIABILITIES OF NONPARTITIONABLE SYSTEMS AT DIFFERENT VALUES OF S; WHEN S ≥ 1, A DR NETWORK OF SIZE N + S IS USED (C_1 = C = 1, ρ = 0.1)

          λ_p T = 10^{-3}       λ_p T = 5 x 10^{-4}
  S       N = 64    N = 256     N = 64    N = 256
  0       0.9379    0.7740      0.9684    0.8798
  1       0.9944    0.9135      0.9985    0.9747
  2       0.9944    0.9135      0.9985    0.9747
  3       0.9944    0.9135      0.9985    0.9747

TABLE I11 SYSTEM RELIABILITIES: R , IS THE RELIABILITY OF A DR-BASED NONPARTITIONABLE SYSTEM OF SIZE N + 1. R, IS THE RELIABILITY OF A DR-BASED OR RDR-BASED PARTITIONABLE SYSTEM OF SIZE N + Q

N

4

16

64

256

1024

4096

1.0000 1 . m

0.9999 0.9999

0.9999 0.9999

0.9988 0.9999

0.9799 0.9992

0.7569 0.9930

a T = 5x10-‘

Rl R,

0.9999 0.9999

0.9999 0.9999

0.9985 0.9997

0.9747 0.9980

0.7078 0.9819

0.0513 0.8454

&T =

Rl R,

0.9999 0.9999

0.9996 0.9999

0.9944 0.9991

0.9135 0.9923

0.3662 0.9311

0.0008 0.5261

* (C, = c, = 1, p = 0.1)


Crossbar networks can readily accommodate spare processors and can much more easily be reconfigured. However, they are seldom considered for use in large scale multiprocessor systems because of their high cost complexity, i.e., a cost of O(N^2) switches and links for a size N network. A crossbar network can be a cost-effective solution when the entire network can be built in one chip [17]. However, current technology cannot put a large (e.g., N = 2') crossbar network into one chip or even a small set of chips. Hence, for large N, an MIN with a lower cost complexity of O(N log N) switches and links is much more favored. This is evidenced by the use of multistage cube-type networks in university projects such as PASM [43] and the Ultracomputer [19], and industrial projects such as the Goodyear STARAN [8], the BBN Butterfly [11], and the IBM RP3 [33].

Using a standard unique path MIN (in our case it is a GC) with size 2N for applications which require N PE’s is another straightforward method of providing spare PE’s. However, using a size 2N network requires adding N spare PE’s to continue functioning as a multistage cube after a network fault occurs. The total hardware required in this approach will be more than double that for one network of size N and double the number of PE’s. The extra N PE’s will not be used until there is a fault (otherwise, this method will be a degraded recovery method for applications requiring 2N PE’s). The cost overhead due to the N spare PE’s will make this method become more expensive than the RDR approach.

The graph representation of a DR network is similar to that of an ADM network [27]. The ADM network was developed for applications which need network performance or permutation capabilities beyond that of the multistage cube. In order to increase network performance, the ADM adopts a routing tag scheme other than the destination tag or the Exclusive-OR tag. The ADM employs a dynamic rerouting tag scheme with which, in some cases, faulty switches or links can be avoided by taking an alternative path. However, it is not a fault-tolerant network; e.g., there is only one disjoint path if both the source address and the destination address are even or both are odd. The ADM network was never intended to support the use of spare PE's. On the other hand, the DR network is designed for applications which need a fault-tolerant multistage cube type of network and spare PE's. With the DR network, any faulty PE, switch, or link can be tolerated by replacing the entire faulty row of the DR with a spare row, maintaining N PE's and N parallel paths through the network. Programs which can run in ADM-based systems cannot be executed in DR-based systems. The routing scheme used in the ADM network cannot be used to control the DR network, and many permutation patterns which can be done in the ADM may not be realized in the DR network. Thus, while the ADM and the DR are topologically related, there are significant differences, which include design goals, network control schemes, the way in which the networks are used, permutation capability, network fault tolerance and recovery procedures, and impact on system fault tolerance.

The complexity of the DR approximately equals that of the ADM network when S is much smaller than N. Compare the ADM and the GC based on their graph representations first: the link ratio between the ADM and the GC is 3/2. If links of their graphs are interpreted as interchange boxes and switching nodes are interpreted as links [Fig. 2(b)], then the ADM needs N more links than the GC, which interconnect interchange boxes within each stage of the ADM [28]. Hence, the link ratio between the ADM and the GC is two. The link ratios between the DR and the GC are similar. Considering the reliability improvement obtained by using the DR, from (4.16), the reliability improvement factor is about 21, compared to the upper bound R, for λ_p T = 10^{-4}, and about five for λ_p T = 5 x 10^{-4}.

The link ratio between the RDR and the DR is (3m - s)/3m. However, the RDR network is more cost effective than the DR because it can be implemented more easily, as discussed in Section III. When a system is operating in a partitionable environment, the reliability improvement factor obtained by using the RDR network is about 250, compared to R, for λ_p T = 10^{-4}, and about 60 for λ_p T = 5 x 10^{-4}.

VI. CONCLUSION

The purposes of this paper are to investigate the possibility of adding redundancy to MIN's as well as to other subsystems to enhance the overall system reliability, and to analyze the improvement in reliability that can be obtained. While many other fault-tolerant multistage cube types of networks have been proposed [1], the network described here differs in that, in addition to being fault tolerant, it also supports the inclusion of spare processors into the system. The DR network would be


more expensive than the GC, but it can significantly improve system reliability over a large range of N.

The DR and RDR networks are designed to be partitionable into up to S independent subnetworks (subsystems), each of which is single-fault tolerant in terms of network faults or PE faults. Furthermore, the DR and RDR networks retain the same multistage cube capabilities for one-to-one, broadcast, and permutation connections before and after reconfiguration due to partitioning or a fault.

With a recovery procedure added in the operating system, the application programs can be executed before and after a fault without any modification (after being reloaded into the functioning PE's). The RDR network can be implemented more cost effectively than the DR while retaining most of the capabilities of the DR network. Fault-tolerant capabilities of the DR and RDR are obtained by reconfiguration, and no component of these networks is assumed to be fault-free.

REFERENCES

[1] G. B. Adams III, D. P. Agrawal, and H. J. Siegel, "A survey and comparison of fault-tolerant multistage interconnection networks," Computer, pp. 14-27, June 1987.
[2] G. B. Adams III and H. J. Siegel, "The extra stage cube: A fault-tolerant interconnection network for supersystems," IEEE Trans. Comput., vol. C-31, pp. 443-454, May 1982.
[3] -, "Modifications to improve the fault tolerance of the extra stage cube interconnection network," in Proc. 1984 Int. Conf. Parallel Processing, Aug. 1984, pp. 169-173.
[4] D. P. Agrawal and D. Kaur, "Fault tolerant capabilities of redundant multistage interconnection networks," in Proc. 1983 Real-Time Syst. Symp., Dec. 1983, pp. 119-127.
[5] I. A. Baqai and T. Lang, "Reliability aspects of the Illiac IV computer," in Proc. 1976 Int. Conf. Parallel Processing, Aug. 1976, pp. 123-131.
[6] G. H. Barnes, "Design and validation of a connection network for many-processor multiprocessor systems," in Proc. 1980 Int. Conf. Parallel Processing, Aug. 1980, pp. 79-80.
[7] K. E. Batcher, "The flip network in STARAN," in Proc. 1976 Int. Conf. Parallel Processing, Aug. 1976, pp. 65-71.
[8] -, "STARAN series E," in Proc. 1977 Int. Conf. Parallel Processing, Aug. 1977, pp. 140-143.
[9] J. Beetem, M. Denneau, and D. Weingarten, "The GF11 supercomputer," in Proc. 12th Symp. Comput. Architecture, June 1985, pp. 108-115.
[10] L. Ciminiera and A. Serra, "A fault-tolerant connecting network for multiprocessor systems," in Proc. 1982 Int. Conf. Parallel Processing, Aug. 1982, pp. 113-122.
[11] W. Crowther, J. Goodhue, E. Starr, R. Thomas, W. Milliken, and T. Blackadar, "Performance measurements on a 128-node butterfly parallel processor," in Proc. 1985 Int. Conf. Parallel Processing, Aug. 1985, pp. 531-540.
[12] N. J. Davis IV, W. T-Y. Hsu, and H. J. Siegel, "Fault location techniques for distributed control interconnection networks," IEEE Trans. Comput., vol. C-34, pp. 902-910, Oct. 1985.
[13] J. B. Dennis, G. A. Boughton, and C. K. L. Leung, "Building blocks for data flow prototypes," in Proc. 7th Symp. Comput. Architecture, May 1980, pp. 1-8.
[14] D. M. Dias and J. R. Jump, "Augmented and pruned N log N multistage networks: Topology and performance," in Proc. 1982 Int. Conf. Parallel Processing, Aug. 1982, pp. 10-11.
[15] T-Y. Feng and Q. Zhang, "Fault diagnosis of multistage interconnection networks with four valid states," in Proc. 5th Int. Conf. Distributed Comput. Syst., May 1985, pp. 218-226.
[16] M. J. Flynn, "Very high-speed computing systems," Proc. IEEE, vol. 54, pp. 1901-1909, Dec. 1966.
[17] M. A. Franklin, "VLSI performance comparison of banyan and crossbar communication networks," IEEE Trans. Comput., vol. C-30, pp. 283-291, Apr. 1981.
[18] L. R. Goke and G. J. Lipovski, "Banyan networks for partitioning multiprocessor systems," in Proc. 1st Symp. Comput. Architecture, Dec. 1973, pp. 175-189.
[19] A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and M. Snir, "The NYU Ultracomputer - Designing an MIMD shared memory parallel computer," IEEE Trans. Comput., vol. C-32, pp. 175-189, Feb. 1983.
[20] V. P. Kumar and S. M. Reddy, "Design and analysis of fault-tolerant multistage interconnection networks with low link complexity," in Proc. 12th Symp. Comput. Architecture, June 1985, pp. 376-385.
[21] D. H. Lawrie, "Access and alignment of data in an array processor," IEEE Trans. Comput., vol. C-24, pp. 1145-1155, Dec. 1975.
[22] C-T. Lea, "A load-sharing banyan network," in Proc. 1985 Int. Conf. Parallel Processing, Aug. 1985, pp. 317-324.
[23] J. E. Lilienkamp, D. H. Lawrie, and P-C. Yew, "A fault-tolerant interconnection network using error correcting codes," in Proc. 1982 Int. Conf. Parallel Processing, Aug. 1982, pp. 123-125.
[24] J. Maeng, "Self-diagnosis of multistage network-based computer systems," in Proc. 1983 Int. Fault-Tolerant Comput. Symp., June 1983, pp. 324-331.
[25] M. Malek and W. W. Myre, "A description method of interconnection networks," Distributed Processing Quarterly, IEEE Computer Society Tech. Comm. Distributed Processing Newsletter, vol. 1, pp. 1-6, Feb. 1981.
[26] W. C. McDonald and J. M. Williams, "The advanced data processing testbed," COMPSAC, pp. 346-351, Mar. 1978.
[27] R. J. McMillen and H. J. Siegel, "Performance and fault-tolerance improvements in the inverse augmented data manipulator network," in Proc. 9th Symp. Comput. Architecture, Apr. 1982, pp. 63-72.
[28] -, "Evaluation of cube and data manipulator networks," J. Parallel Distributed Comput., vol. 2, pp. 79-107, Feb. 1985.
[29] Y-W. Ng and A. Avizienis, "ARIES - An automated reliability estimation system for redundant digital structures," in Proc. 1977 Reliability Maintainability Symp., Jan. 1977, pp. 108-113.
[30] K. Padmanabhan and D. H. Lawrie, "A class of redundant path multistage interconnection networks," IEEE Trans. Comput., vol. C-32, pp. 1099-1108, Dec. 1983.
[31] -, "Fault tolerance schemes in shuffle-exchange type interconnection networks," in Proc. 1983 Int. Conf. Parallel Processing, Aug. 1983, pp. 71-75.
[32] M. C. Pease III, "The indirect binary n-cube microprocessor array," IEEE Trans. Comput., vol. C-26, pp. 458-473, May 1977.
[33] G. F. Pfister, W. C. Brantley, D. A. George, S. L. Harvey, W. J. Kleinfelder, K. P. McAuliffe, E. A. Melton, V. A. Norton, and J. Weiss, "The IBM research parallel processor prototype (RP3): Introduction and architecture," in Proc. 1985 Int. Conf. Parallel Processing, Aug. 1985, pp. 764-771.
[34] C. S. Raghavendra and A. Varma, "INDRA: A class of interconnection networks with redundant paths," in Proc. 1984 Real-Time Syst. Symp., Dec. 1984, pp. 153-164.
[35] S. M. Reddy and V. P. Kumar, "On fault-tolerant multistage interconnection networks," in Proc. 1984 Int. Conf. Parallel Processing, Aug. 1984, pp. 155-164.
[36] J. P. Shen and J. P. Hayes, "Fault tolerance of a class of connecting networks," in Proc. 7th Symp. Comput. Architecture, May 1980, pp. 61-71.
[37] K. G. Shin and Y. H. Lee, "Analysis of the impact of error detection on computer performance," in Proc. 1983 Int. Fault-Tolerant Comput. Symp., June 1983, pp. 356-359.
[38] H. J. Siegel, "Analysis techniques for SIMD machine interconnection networks and the effects of processor address masks," IEEE Trans. Comput., vol. C-26, pp. 153-161, Feb. 1977.
[39] -, "The theory underlying the partitioning of permutation networks," IEEE Trans. Comput., vol. C-29, pp. 791-801, Sept. 1980.
[40] -, Interconnection Networks for Large-Scale Parallel Processing: Theory and Case Studies. Lexington, MA: Lexington Books, D. C. Heath, 1985.
[41] H. J. Siegel and R. J. McMillen, "The multistage cube: A versatile interconnection network," Computer, vol. 14, pp. 65-76, Dec. 1981.
[42] H. J. Siegel, T. Schwederski, J. T. Kuehn, and N. J. Davis IV, "An overview of the PASM parallel processing system," in Computer Architecture, D. D. Gajski, V. M. Milutinovic, H. J. Siegel, and B. P. Furht, Eds. Washington, DC: IEEE Computer Society Press, 1987, pp. 387-407.
[43] H. J. Siegel, L. J. Siegel, F. C. Kemmerer, P. T. Mueller, Jr., H. E. Smalley, Jr., and S. D. Smith, "PASM: A partitionable SIMD/MIMD system for image processing and pattern recognition," IEEE Trans. Comput., vol. C-30, pp. 934-947, Dec. 1981.
[44] D. P. Siewiorek and R. S. Swarz, The Theory and Practice of Reliable System Design. Bedford, MA: Digital, 1982, pp. 17-62.
[45] S. Thanawastien and V. P. Nelson, "Interference analysis of shuffle/exchange networks," IEEE Trans. Comput., vol. C-30, pp. 545-556, Aug. 1981.
[46] N-F. Tzeng, P-C. Yew, and C-Q. Zhu, "A fault-tolerant scheme for multistage interconnection networks," in Proc. 12th Symp. Comput. Architecture, June 1985, pp. 368-375.
[47] C-L. Wu and T-Y. Feng, "On a class of multistage interconnection networks," IEEE Trans. Comput., vol. C-29, pp. 694-702, Aug. 1980.
[48] C-L. Wu, T-Y. Feng, and M-C. Lin, "Star: A local network system for real-time management of imagery data," IEEE Trans. Comput., vol. C-31, pp. 923-933, Oct. 1982.

Menkae Jeng (M'88) was born in Kaohsiung, Taiwan, on April 20, 1957. He received the B.S.E.E. degree from National Taiwan University, Taipei, Taiwan, in 1978, the M.S.E.E. degree from Mississippi State University, Starkville, in 1984, and the Ph.D. degree from the School of Electrical Engineering, Purdue University, West Lafayette, IN, in 1987.

While at Purdue, he was a Research Assistant on the PASM Parallel Processing Project. He is currently an Assistant Professor in the Department of Computer Science, University of Houston, Houston, TX. His research interests include computer architecture, parallel processing, fault-tolerant computing, design and analysis of algorithms, and operating systems for parallel computers.

Dr. Jeng is a member of the Tau Beta Pi and Eta Kappa Nu honorary societies.

Howard Jay Siegel (M'77-SM'82) was born in New Jersey on January 16, 1950. He received the S.B. degree in electrical engineering and the S.B. degree in management from the Massachusetts Institute of Technology, Cambridge, in 1972, the M.A. and M.S.E. degrees in 1974, and the Ph.D. degree in 1977, all in electrical engineering and computer science from Princeton University, Princeton, NJ.

In June 1976, he joined the School of Electrical Engineering, Purdue University, West Lafayette, IN, where he is a Professor and Director of the PASM Parallel Processing Project. He authored the book Interconnection Networks for Large-Scale Parallel Processing, and has consulted, given tutorials, coedited four books, and coauthored over 120 papers on parallel processing.

Dr. Siegel has been a guest editor of the IEEE TRANSACTIONS ON COMPUTERS (twice), an IEEE Computer Society Distinguished Visitor, a NATO Advanced Study Institute Lecturer, Chair of the IEEE Computer Society Technical Committee on Computer Architecture (TCCA), Chair of the ACM Special Interest Group on Computer Architecture (SIGARCH), Chair of the ACM/IEEE Workshop on Interconnection Networks for Parallel and Distributed Processing (1980), General Chair of the 3rd International Conference on Distributed Computing Systems (1982), Program Co-chair of the 1983 International Conference on Parallel Processing, and General Chair of the 15th Annual International Symposium on Computer Architecture (1988). He is currently an associate editor of the Journal of Parallel and Distributed Computing and a member of the Eta Kappa Nu and Sigma Xi honorary societies.