
Performance of fault tolerant ATM switches


P.U. Tagle and N.K. Sharma

Indexing terms: ATM switching, Mean time to failure, Banyan networks, Fault tolerance

Abstract: The authors propose a new, high performance, self-routing, fault tolerant asynchronous transfer mode (ATM) switching network for B-ISDN which employs an enhanced scheme of the banyan network. It consists of two banyan networks with links provided at every stage to allow cell transfer to and from each banyan plane, thereby offering multiple paths between each input-output pair, giving it a high degree of fault tolerance and overcoming the single path limitation of banyan networks. Using simulations, the proposed network is shown to offer better performance than other networks in terms of cell loss rates with or without the presence of faults in the network. Its architecture offers high throughput, a low and limited amount of delay, maintenance of cell sequence, cost effectiveness, simple routing, modularity and priority of messages. Furthermore, the authors examine the reliability, mean time to failure and availability of the proposed network vis-a-vis other switching networks. Numerical results show the proposed network to be better than the other switching networks on these measures.

1 Introduction

Broadband ISDN (integrated services digital network) or B-ISDN will support various communication services which include voice, data and video. The CCITT has recommended the use of asynchronous transfer mode (ATM) as the transport technique for B-ISDN [1]. ATM is a very high bandwidth, low delay, cell switching and multiplexing technique which utilises small, fixed length cells of 53 bytes.

Multistage interconnection networks (MINs) have emerged as viable alternatives to the low cost but inefficient time shared bus and the efficient but costly crossbar switch as the transport technique of choice for B-ISDN. Interconnection is provided at a modest cost and complexity of $O(N \log_2 N)$ with a sufficiently large bandwidth. Among the various characteristics an interconnection network must have to handle B-ISDN are: high throughput, low delay, low cell loss probability, high degree of fault tolerance, maintenance of cell sequence, cost effectiveness, simple routing and modularity.

© IEE, 1996. IEE Proceedings online no. 19960669. Paper first received 25th August 1995 and in revised form 31st May 1996. The authors are with La Trobe University, Department of Computer Science and Computer Engineering, Bundoora, Victoria 3083, Australia.

IEE Proc.-Commun., Vol. 143, No. 5, October 1996

High throughput and low delay are the two most important requirements as the amount of traffic expected is huge; high cell loss probability implies increased delays and network traffic due to resent cells. The availability of multiple paths at every stage of the network is needed not only to provide a high degree of fault tolerance but also to reduce the chances of blocking. Resequencing needs to be avoided as this means increased delays and the requirement for extra logic at the outputs of the MIN. A simple routing algorithm is desired so that the network can be self-routing while keeping the switch complexity low. Lastly, modularity is a desired trait for implementation in VLSI (very large scale integration).

Various types of switching networks have been proposed to handle B-ISDN using ATM [2-11]. Most of these employ a type of banyan network as the basic element which, although allowing for a very simple routing algorithm, is highly susceptible to blocking because of its single path structure [4]. Furthermore, a single fault in either a switching element (SE) or link may render the network incapable of performing its intended function. Hence, the various networks that have been proposed usually involve steps to enhance the basic banyan network so as to allow for multiple paths, which in turn gives it a degree of fault tolerance [3-11].

One example of a fault tolerant network is that proposed by D. Agrawal [2], which is based on the Benes network [4, 5]. The number of stages in an N × N Benes network is $2n - 1$ (where $n = \log_2 N$), which means increased delays as the network gets larger. Moreover, fault tolerance is only available until the middle stage, after which a cell has no choice but to follow a single path.

Another fault tolerant network is the parallel banyan network, wherein a second plane (or more) is added in order to create more paths between any input-output pair [5]. Although this approach offers increased throughput, the amount of hardware and its complexity increase accordingly. More importantly, fault tolerance is offered only at the first (demultiplexer) stage as each plane is essentially just a baseline network.

Here we present a new switching network that provides multiple paths at each stage of the network. Throughput is high as the two banyan planes can both be used for transmitting messages. Delay is kept low with or without faults as the number of stages encountered by a cell is fixed to $\log_2 N$. Sequencing of cells is assured as the cells follow the same path whether or not there are faults. Fault tolerance is offered at all stages. Routing requires only some minor modifications to the basic banyan routing scheme. Furthermore, its modular design makes it ideal for VLSI implementation.

Simulations are used to ascertain the cell loss rates for the proposed network in fault free and faulty environments. In addition, expressions for the time dependent reliability (R(t)), the mean time to failure (MTTF), and the availability (A) are derived to assess its suitability for real time applications. The proposed network is compared with banyan, parallel banyan and Benes networks in terms of redundancy, cost, cell loss rate, reliability, mean time to failure and availability.

2 Review of fault tolerant switches

In this Section, the fault tolerance of baseline, parallel and Benes networks is briefly reviewed since these networks will be used to compare the performance of the proposed network.

Fig. 1 Channel graphs for N = 16 networks: a Baseline network; b Parallel banyan network; c Benes network

Channel graphs for the baseline, parallel and Benes networks, all of size N = 16, are shown in Fig. 1. A cell in a baseline network has to go through $\log_2 N$ stages but the single path structure makes it susceptible to faults. This is clearly illustrated using the channel graph in Fig. 1a.

Like the baseline banyan, a cell in a parallel banyan has to go through $\log_2 N$ stages (Fig. 1b) before it reaches an output port. However, the availability of an extra plane gives it a certain degree of fault tolerance. The number of planes can be increased further but the hardware cost would increase tremendously. Note in particular that this network is still susceptible to faults that occur within a plane while the cell is traversing that plane.

A Benes network requires $2n - 1$ stages as shown in Fig. 1c. One characteristic of a Benes network is that it is only fault tolerant up to stage n. Once in stage n a cell has only one path to traverse to reach a particular output line. In fact, stages n to $2n - 1$ are actually a baseline banyan network in reverse.

3 Proposed network architecture

The proposed network employs not one but two banyan networks, labelled 0 and 1. Unlike the parallel banyan network, the two banyan planes in the proposed network are closely linked. In the proposed network it is possible to transfer to and from the two planes to bypass faults and conflicts; this is not the case in the parallel banyan network. In the following subsections we describe the proposed network's architecture and routing.

Fig. 2 Proposed network: a 8 × 8 network; b Channel graph for N = 16 network (legend: SE of plane 0, SE of plane 1)

3.1 Network architecture

An 8 × 8 proposed network is shown in Fig. 2a. It has two complete 8 × 8 banyan networks in planes 0 and 1. 4 × 4 SEs are used to link the switch in stage i with two SEs in stage i + 1 in the same plane and two SEs in stage i + 1 in the other plane. The exceptions to this are the SEs in the first and last stages, which are composed of 2 × 4 and 4 × 2 SEs respectively. As in the parallel banyan network, there is the potential to allow for twice the throughput of the baseline network as there are two physical networks. Unlike the parallel banyan, however, a cell always has two paths to choose from in each stage, i.e. the SE in plane 0 or plane 1. Unlike a Benes network, the proposed network allows for fault tolerance in every stage of the network. This can be clearly seen in the channel graph shown in Fig. 2b.

The stages for both planes are labelled from 0 to $n - 1$ and SEs are labelled from 0 to N/2. Hence, a particular SE can be given a label [a, i, s] where a = 0, 1 (to represent the plane), i is the ith stage of SEs and s is the switch number. In Fig. 2, the white and black switches represent the switches of plane 0 and plane 1 of the banyan networks, respectively.

3.2 Routing

There are various ways in which routing for the proposed network can be done, largely because there are various ways in which the second plane can be utilised. Assume that network inputs are labelled $S = s_{n-1}s_{n-2}\ldots s_0$ and outputs $D = d_{n-1}d_{n-2}\ldots d_0$. The basic routing algorithm is similar to that used in baseline banyan networks. The algorithm relies solely on the destination address: at each stage $i$, where $0 \le i \le n - 1$, bit $i$ of the destination address is examined to determine whether the upper path should be taken (if $d_i = 0$) or the lower (if $d_i = 1$) within a particular plane.

Take the simplest case, that of point to point routing. For this first case, assume that plane 1 is only going to be used for fault tolerance purposes. In short, plane 1 is on standby mode and will not come into use until a fault in plane 0 is detected. A routing tag, RT, is simply the destination output D with a = 0 appended to the front, i.e. as bit n. If the cell encounters a fault while traversing plane 0, it is deflected to plane 1 to bypass the fault in plane 0. Thus, the cell can bounce to and from different planes to bypass faults. Note that the number of stages traversed in the faulty network is still $\log_2 N$ since the cells always move forward, and the sequence of cells is maintained. Routing for this case could then be done as follows:

    ROUTE
    begin
      RT = 0D = 0 d_{n-1} d_{n-2} ... d_i ... d_0
      for i = 0 to n - 1
        case d_i:   /* examine bit d_i at switch [a, i, s] */
          1 : choose lower output of SE
          0 : choose upper output of SE
        endcase
        if [a, i + 1, next s] is faulty then
          send cell to [ā, i + 1, next s]   /* ā denotes the other plane */
        else
          send cell to [a, i + 1, next s]
    end

In the above algorithm, a cell stays in plane 0 until a fault is encountered. Under no faults, destination D is reached as in a baseline network. If a fault is encountered then the cell is sent to the alternate plane. The cell then proceeds to its destination using the alternate plane.
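The ROUTE procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the inter-stage wiring is not specified in the text, so `toy_next_switch` is a hypothetical stand-in, faults are modelled as a set of `(plane, stage, switch)` labels, and the names `route`, `toy_next_switch` and `Switch` are introduced here for illustration only.

```python
from typing import Callable, List, Set, Tuple

Switch = Tuple[int, int, int]  # (plane a, stage i, switch number s)


def toy_next_switch(i: int, s: int, bit: int, switches_per_stage: int = 4) -> int:
    """Hypothetical inter-stage wiring (NOT taken from Fig. 2): shift in the routing bit."""
    return ((s << 1) | bit) % switches_per_stage


def route(dest: int, n: int, start_switch: int, faulty: Set[Switch],
          next_switch: Callable[[int, int, int], int] = toy_next_switch) -> List[Switch]:
    """Return the SEs [a, i, s] visited by one cell whose routing tag is RT = 0D."""
    plane = 0                      # a = 0: the cell enters on plane 0
    s = start_switch
    path = [(plane, 0, s)]
    for i in range(n - 1):         # hop from stage i to stage i + 1
        bit = (dest >> i) & 1      # d_i: 0 = upper output, 1 = lower output
        nxt = next_switch(i, s, bit)
        if (plane, i + 1, nxt) in faulty:
            plane ^= 1             # deflect to the same-numbered SE in the other plane
            if (plane, i + 1, nxt) in faulty:
                raise RuntimeError("both planes faulty at stage %d" % (i + 1))
        s = nxt
        path.append((plane, i + 1, s))
    return path                    # bit d_{n-1} then selects the output line of the last SE


if __name__ == "__main__":
    print(route(0b101, 3, 0, faulty=set()))
    print(route(0b101, 3, 0, faulty={(0, 1, 1)}))
```

In this sketch the cell visits exactly one SE per stage whether or not a fault is present, and once deflected it stays on the alternate plane, mirroring the behaviour described above.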

This approach also frees up the other plane for other traffic and leads to a better utilisation of plane 1. In a second case, a processor connected to an input S can alternately send cells to planes 0 and 1 by making a = 0 or a = 1. This would allow the processor to issue cells at a faster rate, effectively doubling the throughput. However, this would also require the output to receive two cells simultaneously.

A third case designates plane 1 as the plane of choice for high priority traffic while maintaining plane 0 for use by other cells. Thus, the bit a in the routing tag RT can be used to designate priority, 1 for high priority and 0 otherwise. This third case is based on the assumption that the number of cells with a = 1 (i.e. high priority) is less than or equal to the number of cells with a = 0 (i.e. low priority). If faults are present in either plane then cells can be deflected to the alternate plane. Higher priority cells can be deflected back to the high priority plane in the subsequent stages once the faults have been bypassed. The same approach can be applied to lower priority cells.

Multicasting is an essential service for B-ISDN as well. A fourth case can therefore be where plane 1 is used primarily for multicasting. As in case 3, on the assumption that multicast cells are fewer in number, such cells would be able to get to their destination without much interference with cells in plane 0.

For all of the above cases, a cell gets deflected to the other plane if faults are encountered or output contention occurs. Such decisions would be dependent on each switch [a, i, s].

4 Performance

This section is divided into two parts. The first comprises the first three subsections, which describe the number of redundant paths, their effect on cost in terms of switching elements and links and, lastly, the cell loss rate in both fault-free and faulty environments. The second part completes the performance and reliability study of the proposed network by dealing with the networks' time-dependent reliability, the mean time to failure (MTTF) and the steady state availability. Numerical results for these parameters are presented in the last subsection, where the proposed network is compared with the 2-plane parallel banyan and Benes networks.

4.1 Redundancy

The single path structure of the baseline banyan makes it highly susceptible to any single fault. A higher number of redundant paths allows a multistage interconnection network to maintain a path between any input-output pair in case of faults. The redundant paths for each of the networks discussed are:

$$
\begin{aligned}
\text{redundant paths}_{\text{Benes}} &= \frac{2^n}{2} \\
\text{redundant paths}_{\text{parallel}} &= 2 \\
\text{redundant paths}_{\text{proposed}} &= 2^n
\end{aligned}
$$

where N is the network size and $n = \log_2 N$. Note the higher number of paths in the proposed network, which is always double that of a Benes network.
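As a quick worked check of the redundancy expressions above, the following Python sketch evaluates them for a given network size. The function name `redundant_paths` and the dictionary keys are illustrative choices, not notation from the paper.

```python
def redundant_paths(N: int) -> dict:
    """Redundant paths between any input-output pair for an N x N network."""
    if N < 2 or N & (N - 1):
        raise ValueError("N must be a power of two")
    n = N.bit_length() - 1          # n = log2(N)
    return {
        "benes": 2 ** n // 2,       # 2^n / 2
        "parallel": 2,              # one path per plane, two planes
        "proposed": 2 ** n,         # always double the Benes figure
    }


if __name__ == "__main__":
    for size in (8, 16, 256):
        print(size, redundant_paths(size))
```

For N = 16 this gives 8, 2 and 16 paths for the Benes, parallel banyan and proposed networks respectively.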

4.2 Cost analysis

In this section, we compare the amount of hardware needed to implement the proposed network in terms of switching elements and links. To determine the number of SEs, the following formulas were used for Benes, parallel and proposed networks respectively:

$$
\begin{aligned}
SE_{\text{Benes}} &= \frac{N}{2}(2n - 1) \\
SE_{\text{parallel}} &= Nn \\
SE_{\text{proposed}} &= \text{same as } SE_{\text{parallel}}
\end{aligned}
$$

The proposed network and the parallel banyan have the same number of switching elements, though the former has far more redundant paths as shown in the preceding discussion. However, since the SEs are of different sizes, we now consider the number of links in each of the networks. The numbers of links are as follows:

$$
\begin{aligned}
\text{links}_{\text{Benes}} &= N(2n - 2) \\
\text{links}_{\text{proposed}} &= 8N(n - 1) \\
\text{links}_{\text{parallel}} &= 4N(n - 1)
\end{aligned}
$$

Therefore, a higher number of links is needed in the implementation of the proposed network. However, the succeeding discussion will show that the benefit far outweighs this slight increase in hardware cost.
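The cost expressions in this subsection can be tabulated with a short Python sketch. It simply evaluates the SE and link formulas as printed in the text; the function name `hardware_cost` and the returned dictionary layout are assumptions made for illustration.

```python
def hardware_cost(N: int) -> dict:
    """Evaluate the SE and link counts of Section 4.2 for an N x N network."""
    if N < 2 or N & (N - 1):
        raise ValueError("N must be a power of two")
    n = N.bit_length() - 1                      # n = log2(N)
    return {
        "benes":    {"SEs": (N // 2) * (2 * n - 1), "links": N * (2 * n - 2)},
        "parallel": {"SEs": N * n,                  "links": 4 * N * (n - 1)},
        "proposed": {"SEs": N * n,                  "links": 8 * N * (n - 1)},
    }


if __name__ == "__main__":
    for name, cost in hardware_cost(16).items():
        print(f"{name:9s} SEs={cost['SEs']:4d} links={cost['links']:4d}")
```

For N = 16 the proposed and parallel banyan networks both use 64 SEs, while the proposed network uses twice as many links as the parallel banyan.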


4.3 Cell loss rate

The cell loss rate of the proposed network is compared with that of other networks using simulations. Several comparisons are made for each of two cases. The first case assumes that each input processor is issuing one cell every clock cycle, i.e. 100% loading. The second case is for 50% loading.

The first comparison is done in the context of no faults occurring in the network. Comparison is done with Benes and parallel banyan networks, and the destination of each cell is generated randomly. For the parallel banyan and the proposed network, a cell is sent randomly to either of the two planes. No buffers are available in any of the SEs. Also, the output ports in the proposed and parallel banyan networks can only receive one cell at a time. Simulations are done for network sizes n = 2 to n = 10, with the results shown in Fig. 3.
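As a rough analytic reference point for why cell loss grows with network size in unbuffered networks, the sketch below uses the well-known stage-by-stage approximation for a single unbuffered banyan plane built from 2 × 2 SEs under uniform random traffic, $p_{i+1} = 1 - (1 - p_i/2)^2$. This is a textbook approximation brought in here as an assumption; it is not the simulation used for Fig. 3 and it does not model the Benes, parallel banyan or proposed networks. The function name `banyan_cell_loss` is illustrative.

```python
def banyan_cell_loss(n: int, load: float = 1.0) -> float:
    """Approximate cell loss of an unbuffered n-stage banyan of 2 x 2 SEs.

    Assumes uniform random destinations and independent inputs; `load` is the
    probability that an input issues a cell in a given clock cycle.
    """
    if n < 1 or not 0.0 < load <= 1.0:
        raise ValueError("need n >= 1 and 0 < load <= 1")
    p = load
    for _ in range(n):
        # probability that a given SE output carries a cell after this stage
        p = 1.0 - (1.0 - p / 2.0) ** 2
    return 1.0 - p / load            # fraction of offered cells that are lost


if __name__ == "__main__":
    for stages in range(2, 11):
        print(stages, round(banyan_cell_loss(stages, 1.0), 3),
              round(banyan_cell_loss(stages, 0.5), 3))
```

Under this approximation the loss rises monotonically with n and is lower at 50% loading than at 100% loading, which is the qualitative trend the simulations examine.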

Fig. 3 Cell loss rate versus network size at 100% and 50% loading (x-axis: network size n; curves: Benes, parallel banyan and proposed networks, each at 100% and 50% loading)

The graph clearly shows that, although all three networks exhibit an increase in cell loss probability for increasing network size, the rate of increase for the proposed network is lower than for the other two. Moreover, even for large network sizes (n = 8, 9, 10) the cell loss probability of the proposed network is in the same range as the cell loss probability of either Benes or parallel banyan for smaller network sizes (n = 2, 3).

For the second comparison, faults are injected randomly into the networks. It is assumed that no faults can occur in the first and last stages of the Benes network. The same is assumed for the parallel banyan and proposed networks, where the first and last stages are the demultiplexers and multiplexers, respectively. Faults are assumed to affect an entire switching element.

Ten different fault distributions are generated at random. The networks are subjected to various numbers of faulty SEs, ranging from 1 to as many as 20 faulty switches. Again, cells that are unable to win contention for the links to non-faulty SEs are considered lost. Simulations are done for network sizes n = 6, 8 and 10.

Evident from the graphs shown in Figs. 4 and 5 is the superior performance of the proposed network even in the presence of a high number of faults in the network. The cell loss probabilities for the proposed network are well below those of the Benes network. Moreover, the proposed network's cell loss stays below 40 percent even in the presence of up to 20 faults in the system. The exception to this observation is the network of size n = 6 which, being the smaller network, is more susceptible to faults than the bigger networks.

Fig. 4 Cell loss rate versus number of faults for 100% loading (x-axis: number of faults, 0 to 20; curves: Benes, parallel banyan and proposed networks for n = 6, 8 and 10)

Fig. 5 Cell loss rate versus number of faults for 50% loading (x-axis: number of faults; curves: Benes, parallel banyan and proposed networks for n = 6, 8 and 10)

A third comparison was based on the premise that 20 percent of the cells being sent are given higher priority. Hence, higher priority cells are only lost if there are contentions between two or more high priority cells. The results for network sizes n = 6, 8 and 10 show that even as the number of faulty switches increases, the priority cell loss probability of the proposed network stays at around 50 to 60 percent of that of Benes [12].

4.4 Reliability, MTTF and availability

In this subsection, the reliability, MTTF and availability of the proposed network are discussed using hierarchical composition. The approach is summarised in Section 7.1. The corresponding equations for Benes and parallel banyan networks are discussed in Section 7.2.

The proposed network will, in its worst case, mimic a parallel banyan network; hence its lower bound reliability equations will be similar to those shown in Section 7.2.2. However, a more representative block diagram is shown in Fig. 6. At any stage in the network, with the exception of the demultiplexer and multiplexer stages, a cell always has two SEs to go to and hence we have SE pairs in series.

Fig. 6 Reliability block diagram for the proposed network (SE pairs in series, labelled "log2 N pairs" in the figure)

Like the parallel banyan, the N demultiplexers and N multiplexers are in series as a fault in any one of them will cause system failure. Unlike the parallel banyan, the proposed network utilises 4 × 4 switching elements (SEs), hence each SE will have a different failure rate, $\lambda_4$. As a 4 × 4 SE is much more complex than a 2 × 2 SE, the failure rate $\lambda_4$ of the former is assumed to be equal to three times $\lambda_2$, the failure rate of a 2 × 2 SE. For the sake of simplicity, we assume that all SEs in the proposed network are 4 × 4 SEs, although those in the first and last stages are the simpler 2 × 4 and 4 × 2 SEs respectively. The expression for each SE pair can now be derived using eqns. 5 to 8 with $\lambda = \lambda_4$. There are $K = (N/2)\log_2 N$ pairs so, using eqn. 9, the reliability for the SE pairs is simply the product of the reliability of each pseudo component composed of a pair of 4 × 4 SEs. The reliability is therefore as follows:

$$
R_{\text{proposed}}(t) = e^{-2N\lambda_m t}\left[B_4 e^{-\alpha_4 t} + C_4 e^{-\beta_4 t}\right]^{K} \qquad (1)
$$

where $K = (N/2)\log_2 N$, $\lambda_m$ is the failure rate of a multiplexer/demultiplexer, and $B_4$, $C_4$, $\alpha_4$ and $\beta_4$ are determined using eqns. 5 to 8 with $\lambda = \lambda_4$.

The MTTF equation for the proposed network is derived from eqn. 1 using eqn. 10, where, again, $K = (N/2)\log_2 N$ and $B_4$, $C_4$, $\alpha_4$ and $\beta_4$ are determined using eqns. 5 to 8 with $\lambda = \lambda_4$.
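To illustrate how an MTTF follows from a reliability expression of this kind, the Python sketch below assumes that the network reliability has the form $R(t) = e^{-2N\lambda_m t}[B e^{-\alpha t} + C e^{-\beta t}]^K$ with $K = (N/2)\log_2 N$, and that the MTTF is the integral of $R(t)$ from 0 to infinity. Expanding the bracket with the binomial theorem and integrating term by term gives a finite sum. The coefficients B, C, α and β come from appendix equations not reproduced in this section, so they are plain inputs here; the example values are those of a textbook pair of independent, non-repairable parallel components ($2e^{-\lambda t} - e^{-2\lambda t}$) and are an assumption, as are the function names.

```python
import math


def _num_pairs(N: int) -> int:
    """K = (N/2) * log2(N) SE pairs for an N x N network."""
    if N < 2 or N & (N - 1):
        raise ValueError("N must be a power of two")
    return (N // 2) * (N.bit_length() - 1)


def reliability(t, N, lam_m, B, C, alpha, beta):
    """Assumed form: exp(-2*N*lam_m*t) * (B*exp(-alpha*t) + C*exp(-beta*t))**K."""
    K = _num_pairs(N)
    pair = B * math.exp(-alpha * t) + C * math.exp(-beta * t)
    return math.exp(-2 * N * lam_m * t) * pair ** K


def mttf(N, lam_m, B, C, alpha, beta):
    """Integral of reliability() over t >= 0 via binomial expansion of the bracket."""
    K = _num_pairs(N)
    total = 0.0
    for j in range(K + 1):
        coeff = math.comb(K, j) * B ** j * C ** (K - j)
        total += coeff / (2 * N * lam_m + j * alpha + (K - j) * beta)
    return total


if __name__ == "__main__":
    lam = 0.03                       # illustrative failure rate of one SE in a pair
    print(mttf(4, 0.005, 2.0, -1.0, lam, 2 * lam))
```

The alternating binomial terms can lose floating-point precision for large K, so this closed form is best treated as a check for small networks; numerical integration of `reliability` is a safer route for large N.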

The steady state availability equation for the proposed network is the product of the availabilities of the 2N demultiplexers/multiplexers and the $K = (N/2)\log_2 N$ pairs of 4 × 4 SEs. The first term in the availability equation is the steady state availability of 2N components (i.e. multiplexers/demultiplexers with failure rate $\lambda_m$) while the second term is obtained using eqn. 11 with $\lambda = \lambda_4$ and K as given above.
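The product structure described above can be sketched in Python as follows. Two assumptions are made and should be read as such: the availability of a single multiplexer/demultiplexer is taken as the standard steady state value μ/(λ_m + μ) for one repairable component with repair rate μ, and the availability of one SE pair is passed in as a number because eqn. 11 is not reproduced in this section. The function name `steady_state_availability` and the parameter names are illustrative.

```python
def steady_state_availability(N: int, lam_m: float, mu: float,
                              pair_availability: float) -> float:
    """Availability of 2N mux/demux units in series with K SE pairs in series."""
    if N < 2 or N & (N - 1):
        raise ValueError("N must be a power of two")
    n = N.bit_length() - 1
    K = (N // 2) * n                      # number of 4 x 4 SE pairs
    a_m = mu / (lam_m + mu)               # assumed single-component availability
    return a_m ** (2 * N) * pair_availability ** K


if __name__ == "__main__":
    # Illustrative numbers only; they are not the values used for Fig. 9.
    print(steady_state_availability(8, lam_m=1e-6, mu=1e-2, pair_availability=0.999999))
```

Because both factors are raised to powers that grow with N, even availabilities very close to 1 for individual components compound noticeably for large networks.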

4.5 Numerical results

Numerical results for the proposed network are compared with those for Benes and parallel banyan networks. The equations used for the latter two are listed in the Appendix.

Fig. 7 Reliability comparison for N = 256 (x-axis: time, h; curves: banyan, parallel banyan, Benes (ub), Benes (lb), proposed)

Fig. 7 shows the time dependent reliability for all four networks for network size N = 256. Both the upper and lower bound models of the Benes network were used in the analysis. Perfect coverage and no repair were assumed for these graphs as the intent is to show the effect of fault tolerance. A failure rate of $\lambda_2$ = 10 for a 2 × 2 SE is used, and as stated earlier $\lambda_m = \lambda_2/2$ and $\lambda_4 = 3\lambda_2$, where $\lambda_m$ and $\lambda_4$ are the failure rates for multiplexers/demultiplexers and 4 × 4 SEs, respectively. It can be clearly seen from the graphs that the proposed network exhibits higher reliability than the other networks. This is because the proposed network provides fault tolerance at every stage of the network.

Similarly, perfect coverage and no repair were assumed for the computation of the MTTF as the intent is to show the effect of fault tolerance as a function of network size. The same failure rates used above were used to compute the MTTF. Fig. 8 shows the MTTF graph (multiplied by a scaling factor) for network sizes from n = 3 (N = 8) to n = 10 (N = 1024). The simplicity of the baseline banyan allows it to have a better MTTF for small sized networks (N = 8) but the fault tolerant characteristics of the proposed network give it a better MTTF for increasing network size ($N \ge 16$). The Benes network provides a low MTTF because only half the network provides fault tolerance.

Fig. 8 Mean time to failure (scaled) versus network size (x-axis: network size, n; curves: banyan, parallel banyan, Benes (ub), Benes (lb), proposed)

Fig. 9 Steady state availability versus network size (x-axis: network size, n; curves: banyan, parallel banyan, Benes (ub), Benes (lb), proposed)

Fig. 9 shows the steady state availabilities of the networks under study as a function of network size. For this graph, perfect coverage was once again assumed but repair was permitted. The same values for $\lambda_2$, $\lambda_m$ and $\lambda_4$ were used, while the repair parameters were set to $10^4\lambda_2$ and one tenth of that value. The graph shows that the proposed network has better results for increasing network size; its steady state availability only falls below that of the baseline banyan and the parallel banyan for very small network sizes.

5 Conclusions

This paper has described a new high performance self-routing fault tolerant switching network for B-ISDN. The network consists of two closely linked banyan networks. Links are provided at every stage to allow cells to transfer to and from each plane, thereby offering multiple paths between input-output pairs. The network’s two planes can be used purely for fault tolerance purposes, for doubling throughput and/or as a plane for high priority cells. For all of these cases, cells are deflected to the other plane if faults are encountered or during switch output contentions.

Using simulations, the proposed network was shown to have far lower cell loss probabilities than the other networks for both fault free and faulty environments. The proposed network was also shown to be more adept at allowing higher priority cells to reach their destination whether or not faults were present in the network.

Furthermore, this paper has addressed the derivation of expressions for the time dependent reliability, mean time to failure and steady state availability of the proposed network in order to ascertain its applicability for real time applications. These have been compared with similar expressions for the baseline banyan, parallel banyan and Benes networks.

Numerical results show that the proposed network is better than both the parallel banyan and Benes networks. This is mainly due to the ability of the proposed network to offer fault tolerance at every stage of the network. The parallel banyan only offers fault tolerance at the initial stage while the Benes network only offers fault tolerance up to the middle stage. Results for time dependent reliability show that the proposed network is more reliable than the other two networks. The advantage of the proposed network’s fault tolerance features is likewise seen in the results for mean time to failure, as it gave better results for all but the smaller network sizes (N = 8). Lastly, the proposed network exhibited far better results for availability than the parallel banyan and Benes networks, with the gap widening as the network size increases.

6 References

1 MINZER, S.: ‘Broadband ISDN and asynchronous transfer mode (ATM)’, IEEE Commun. Mag., September 1989, pp. 17-24

2 AGRAWAL, D.: ‘Testing and fault-tolerance of multistage interconnection networks’, Computer, April 1982, pp. 41-53

3 AHMADI, H., and DENZEL, W.: ‘A survey of modern high-performance switching techniques’, IEEE J. Sel. Areas Commun., 7, (7), pp. 1091-1103

4 MCMILLEN, R.: ‘A survey of interconnection networks’. Proceedings of IEEE GLOBECOM, 1984, pp. 105-113

5 ITOH, A.: ‘A fault-tolerant switching network for B-ISDN’, IEEE J. Sel. Areas Commun., 1991, 9, (8), pp. 1218-1226

6 TOBAGI, F., KWOK, T., and CHIUSSI, F.: ‘Architecture, performance, and implementation of the tandem banyan fast packet switch’, IEEE J. Sel. Areas Commun., 1991, 9, (8), pp. 1173-1193

7 ZEGURA, E.: ‘Architectures for ATM switching systems’, IEEE Commun. Mag., February 1993, pp. 28-37

8 PATTAVINA, A.: ‘Nonblocking architectures for ATM switching’, IEEE Commun. Mag., February 1993, pp. 38-48

9 LEE, T., and LIEW, S.: ‘Broadband packet switches based on dilated interconnection networks’. Proceedings of IEEE ICC, 1992, pp. 255-261

10 HUI, J., and LEE, T.: ‘A large scale ATM switching network with sort-banyan switch modules’. Proceedings of IEEE GLOBECOM, 1992, pp. 133-137

11 ADAMS, G. III, AGRAWAL, D., and SIEGEL, H.: ‘A survey and comparison of fault-tolerant multistage interconnection networks’, Computer, June 1987, pp. 14-27

12 TAGLE, P., and SHARMA, N.: ‘A high-performance fault-tolerant switching network for B-ISDN’. Proceedings of IEEE IPCCC, 1995, pp. 599-606

13 BLAKE, J.T., and TRIVEDI, K.S.: ‘Reliability analysis of interconnection networks using hierarchical composition’, IEEE Trans. Reliab., 1989, 38, (1), pp. 111-120

14 RAMAKUMAR, R.: ‘Engineering reliability: fundamentals and applications’ (Prentice-Hall, Inc., 1993)

15 BLAKE, J.T., and TRIVEDI, K.S.: ‘Multistage interconnection network reliability’, IEEE Trans. Comput., 1989, 38, (11), pp. 1600-1604

16 JOHNSON, B.: ‘Design and analysis of fault tolerant digital systems’ (Addison-Wesley Publishing Company, 1989)

7 Appendix

7.1 Hierarchical composition

Hierarchical composition is a technique which overcomes the difficulty of both model construction and solution by avoiding the use of a large one level model [13]. Blake and Trivedi used a subsystem of two components in parallel as a pseudo component (PC) and developed a reliability and steady state availability model (which incorporates imperfect coverage) for such a pseudo component [13].

The reliability of a pseudo component (PC) is as follows [13]:

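$$R_{PC}(t) = B e^{-\alpha t} + C e^{-\beta t} \qquad (4)$$

where

$$\alpha = \frac{1}{2}\left[\mu + 3\lambda - \sqrt{\mu^2 + \lambda\mu(8c - 2) + \lambda^2}\right] \qquad (5)$$

$$\beta = \frac{1}{2}\left[\mu + 3\lambda + \sqrt{\mu^2 + \lambda\mu(8c - 2) + \lambda^2}\right] \qquad (6)$$

$$B = \frac{1}{\beta - \alpha}\left(\beta + 2\lambda(c - 1)\right) \qquad (7)$$

$$C = -\frac{1}{\beta - \alpha}\left(\alpha + 2\lambda(c - 1)\right) \qquad (8)$$

Here $\lambda$ is the failure rate of each of the two components, $\mu$ is the repair rate and $c$ is the coverage.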
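As a minimal sketch of how eqns. 4 to 8 can be evaluated, the Python below computes the pseudo component reliability for given rates. The function and parameter names (`pc_coefficients`, `pc_reliability`, `lam`, `mu`, `c`) are illustrative and do not come from the paper. The coefficient C is taken as 1 - B, which is an assumption based on the requirement that the reliability equals 1 at t = 0.

```python
import math

def pc_coefficients(lam, mu, c):
    """Return (alpha, beta, B, C) for a two-component pseudo component.

    lam: failure rate of each component, mu: repair rate, c: coverage.
    Assumes the square root term is non-zero (it vanishes only for
    c = 0 and mu = lam), otherwise beta - alpha would be zero.
    """
    root = math.sqrt(mu**2 + lam * mu * (8 * c - 2) + lam**2)
    alpha = 0.5 * (mu + 3 * lam - root)
    beta = 0.5 * (mu + 3 * lam + root)
    B = (beta + 2 * lam * (c - 1)) / (beta - alpha)
    C = 1.0 - B  # assumption: follows from R_PC(0) = 1
    return alpha, beta, B, C

def pc_reliability(t, lam, mu, c):
    """Reliability of the pseudo component at time t (form of eqn. 4)."""
    alpha, beta, B, C = pc_coefficients(lam, mu, c)
    return B * math.exp(-alpha * t) + C * math.exp(-beta * t)
```

With no repair (`mu = 0`) and perfect coverage (`c = 1`) the expression reduces to the familiar result for two units in parallel, 2e^(-λt) - e^(-2λt), which is a convenient sanity check.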

The reliability of n such pseudo components in series is [13, 141:

$$R(t) = \prod_{i=1}^{n} R_{PC_i}(t) \qquad (9)$$

The mean time to failure (MTTF) is obtainable from R(t) and is equal to the following [14-161:

$$MTTF = \int_0^\infty R(t)\,dt \qquad (10)$$
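The series product and the MTTF integral can be checked numerically. The sketch below multiplies block reliabilities for a series system and approximates the integral with the trapezoidal rule on a truncated horizon. The helper names, the example block (two units in parallel with no repair and perfect coverage) and the failure rate value are illustrative assumptions, not values taken from the paper.

```python
import math

def series(reliabilities):
    """Reliability of independent blocks in series: product of block reliabilities."""
    return lambda t: math.prod(r(t) for r in reliabilities)

def mttf_numeric(reliability, horizon, steps=100_000):
    """Approximate the integral of R(t) over [0, horizon] by the trapezoidal rule.

    The horizon must be long enough that R(horizon) is negligible.
    """
    h = horizon / steps
    total = 0.5 * (reliability(0.0) + reliability(horizon))
    for k in range(1, steps):
        total += reliability(k * h)
    return total * h

# Illustrative block: two units in parallel, no repair, perfect coverage.
lam = 1e-3  # assumed failure rate per hour (hypothetical value)

def pair(t):
    return 2 * math.exp(-lam * t) - math.exp(-2 * lam * t)

single = mttf_numeric(pair, horizon=20_000)
chain = mttf_numeric(series([pair] * 4), horizon=20_000)
```

Placing four such blocks in series shortens the MTTF considerably compared with a single block, which is the effect the series product captures.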

The expression for the steady state availability of the same pseudo component is as follows [13]:

7.2 Reliability, MTTF and availability

7.2.1 Baseline banyan: As a baseline banyan is a unique path MIN, a single faulty SE will cause system failure. Hence, from the reliability point of view it can be represented as $(N/2)\log_2 N$ SEs in series. Assuming a constant failure rate for each component, let $\lambda_2$ be the failure rate of a 2 x 2 SE. Hence, the time dependent reliability, MTTF and steady state availability of an N x N baseline banyan network are:

$$R_{BB}(t) = e^{-M\lambda_2 t} \qquad (12)$$

$$MTTF_{BB} = \frac{1}{M\lambda_2} \qquad (13)$$

where $M = (N/2)\log_2 N$ and $\mu$ is the repair rate.
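For a concrete feel of the series model, the following sketch counts the SEs of an N x N baseline banyan and evaluates its reliability and MTTF under the constant failure rate assumption. The function names and the numeric failure rate in the usage note are hypothetical. The steady state availability is left out because its expression is not reproduced here.

```python
import math

def banyan_se_count(n_ports):
    """Number of 2 x 2 SEs in an N x N banyan: (N/2) * log2(N)."""
    stages = n_ports.bit_length() - 1
    if n_ports < 2 or 2**stages != n_ports:
        raise ValueError("n_ports must be a power of two")
    return (n_ports // 2) * stages

def baseline_banyan_reliability(t, n_ports, lam2):
    """All SEs in series, each with constant failure rate lam2."""
    return math.exp(-banyan_se_count(n_ports) * lam2 * t)

def baseline_banyan_mttf(n_ports, lam2):
    """MTTF of a series system with total failure rate M * lam2."""
    return 1.0 / (banyan_se_count(n_ports) * lam2)
```

For example, `banyan_se_count(8)` gives 12 SEs, so with an assumed `lam2 = 1e-6` failures per hour the MTTF is one twelfth of the MTTF of a single SE.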

7.2.2 Parallel banyan: Each of the separate networks can be represented as $(N/2)\log_2 N$ SEs in series. As any faulty demultiplexer or multiplexer will cause the system to fail, these are both represented as N devices in series. The failure rate of the demultiplexers and multiplexers, being much simpler devices, is assumed to be half that of a 2 x 2 SE. The reliability block diagram for the parallel banyan is illustrated in Fig. 10.

Fig. 10 Reliability block diagram for two plane parallel banyan network

From the reliability block diagram it is now straightforward to derive the equation for reliability. For a two-plane parallel banyan network, the two banyan networks can be considered as a pseudo component but with a failure rate $\lambda = (N/2)\log_2 N\,\lambda_2$. Therefore, using eqns. 5 to 8 the values for $B$, $C$, $\alpha$ and $\beta$ can be determined. The expressions for reliability, MTTF and steady state availability can now be given as follows:

where $M = (N/2)\log_2 N$, $\mu$ is the repair rate and $c$ is the coverage.
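As a hedged sketch of how the block diagram of Fig. 10 translates into formulas, the expressions below follow directly from the stated assumptions: N demultiplexers and N multiplexers in series, each with failure rate $\lambda_2/2$, and the two planes treated as one pseudo component with $\lambda = M\lambda_2$. The labels $R_{PB}$ and $MTTF_{PB}$ are introduced here for illustration, and the result is a derivation under those assumptions rather than the printed equations.

```latex
\begin{align*}
R_{PB}(t) &= \underbrace{e^{-N\lambda_2 t/2}}_{\text{demultiplexers}}
             \;\underbrace{e^{-N\lambda_2 t/2}}_{\text{multiplexers}}
             \left[ B e^{-\alpha t} + C e^{-\beta t} \right]
           = e^{-N\lambda_2 t}\left[ B e^{-\alpha t} + C e^{-\beta t} \right] \\
MTTF_{PB} &= \int_0^\infty R_{PB}(t)\,dt
           = \frac{B}{\alpha + N\lambda_2} + \frac{C}{\beta + N\lambda_2}
\end{align*}
```

Here $B$, $C$, $\alpha$ and $\beta$ are evaluated from eqns. 5 to 8 with $\lambda = M\lambda_2$.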

7.2.3 Benes: It is difficult to construct a single reliability block diagram to represent a Benes network. The last $n = \log_2 N$ stages are basically a baseline banyan network, and hence could be represented as $(N/2)\log_2 N$ SEs in series. The first stage can be represented as $N/2$ SEs in series as a single fault in any of these will cause the system to fail. Stages 2 to $n - 1$, however, can be viewed from two different perspectives. When a cell goes through these stages, it always has two path options, hence it is possible to view it as always having two SEs in parallel. But as these SEs are shared among various input-output connections, it is also possible to view these as two parallel devices each having $(N/2)(\log_2 N - 2)$ SEs in series. The former is treated as the upper bound for the Benes network and is shown in Fig. 11 while the latter, shown in Fig. 12, is considered as the lower bound representation of a Benes network.

Fig. 11 Reliability block diagram for Benes network, upper bound

Fig. 12 Reliability block diagram for Benes network, lower bound

The reliability, MTTF and steady state availability for upper bound and lower bound models are therefore listed as follows:

where $M = (N/2)(\log_2 N - 2)$, $D = (N/2) + (N/2)\log_2 N$.
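To illustrate the lower bound model, the sketch below treats the D series SEs as one exponential block and the two parallel devices (each with M SEs in series) as a pseudo component with failure rate M times the SE failure rate. This combination is an assumption made by analogy with the parallel banyan treatment, and the function name and parameters are hypothetical. The coefficient C is taken as 1 - B so that the reliability equals 1 at t = 0.

```python
import math

def benes_lower_bound_reliability(t, n_ports, lam2, mu, c):
    """Lower bound reliability of an N x N Benes network (sketch).

    lam2: failure rate of a 2 x 2 SE, mu: repair rate, c: coverage.
    Requires N >= 8 so that the middle stages (2 to n - 1) exist.
    """
    stages = n_ports.bit_length() - 1
    if n_ports < 8 or 2**stages != n_ports:
        raise ValueError("n_ports must be a power of two, at least 8")
    M = (n_ports // 2) * (stages - 2)            # SEs per parallel device
    D = n_ports // 2 + (n_ports // 2) * stages   # SEs in series
    lam = M * lam2                               # failure rate of one device
    # Pseudo component coefficients; assumes the root is non-zero.
    root = math.sqrt(mu**2 + lam * mu * (8 * c - 2) + lam**2)
    alpha = 0.5 * (mu + 3 * lam - root)
    beta = 0.5 * (mu + 3 * lam + root)
    B = (beta + 2 * lam * (c - 1)) / (beta - alpha)
    C = 1.0 - B  # assumption: pseudo component reliability is 1 at t = 0
    pc = B * math.exp(-alpha * t) + C * math.exp(-beta * t)
    return math.exp(-D * lam2 * t) * pc
```

For N = 8 this gives M = 4 and D = 16, so the series part dominates the failure behaviour even for the smallest network where the middle stages exist.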
