Deadlock Preventive Adaptive Wormhole Routing on k-ary n-cube Interconnection Networks

Franck Binard

December 7, 2003

Abstract

A primary concern in all adaptive networks is the cost of deadlock prevention. While wormhole routing can be considered superior to other routing schemes in terms of low latency combined with low buffer requirements, it makes the deadlock issue more complex to resolve, because a single packet can block several links at the same time. Because planar-adaptive routing limits routing freedom to two dimensions at a time, it makes it possible to prevent deadlock with only a fixed number of virtual channels, independent of the number of network dimensions. In this essay, I study two planar-adaptive schemes (Chien and Kim’s algorithm and the turn model) in the context of cost effectiveness and deadlock prevention.

Contents

1 Introduction
2 k-ary n-cube Interconnection Networks
3 Wormhole routing
4 Adaptive Routing
5 Livelock Prevention
6 Deadlock Prevention
6.1 The Cost of Deadlock Prevention
7 Minimal Routing
8 Virtual Channels
8.1 Virtual Channel Utilization
8.1.1 Advantages of Using Virtual Channel
8.1.2 Disadvantages of Virtual Channel Usage
8.2 Using Virtual Channels for Deadlock Avoidance
9 Planar Adaptive Wormhole Routing
9.1 The Turn Model for Adaptive Routing
9.1.1 The Algorithm
9.1.2 The West-First Algorithm
9.1.3 The North-Last Algorithm
9.1.4 The Negative-First Algorithm
9.1.5 Deadlock Freedom
9.1.6 Performance
9.2 Chien and Kim’s Partially Adaptive Routing Algorithm
9.2.1 Notation and Terminology
9.2.2 The Algorithm
9.2.3 Deadlock Freedom
9.2.4 Performance Comparisons
9.2.5 Fault Tolerance
9.3 The Modified Chien and Kim’s Partially Adaptive Routing Algorithm
10 Conclusion

List of Figures

1 Examples of k-ary n-cube interconnection network topologies
2 Adaptive vs Deterministic Routing Algorithms
3 A deadlock situation: four messages have entered the network through different switches, and are blocked by each other in a cycle after having each acquired the first-hop link
4 Virtual Channel Router Architecture Diagram
5 Virtual Channel Deadlock Control
6 Turns prohibited in the west-first algorithm
7 Turns prohibited in the north-last algorithm
8 Turns prohibited in the negative-first algorithm
9 Chien and Kim’s planar-adaptive routing in a 2-ary 3-cube
10 Numbering of virtual channels with respect to node A in dimension i, i + 1
11 Increasing and Decreasing Virtual Networks of an Adaptive Plane
12 Deactivation Algorithm
13 Comparison of Chien and Kim’s planar adaptive scheme and of the modified algorithm in terms of buffer requirements and channel utilization

1 Introduction

Routing algorithms are crucial to the efficient operation of interconnection networks as they

specify the paths packets will take when messages are being sent among the processors of

the network. A good routing algorithm will reduce the latency of the network by mini-

mizing the number of hops that are required for packets to reach their destination, ideally

forcing packets to advance closer to their destination with every hop. It should also be

able to handle deadlock and livelock situations. Ideally, it would have features that would

allow it to route around network faults. Finally, it should be able to balance the load of

the routing traffic on the interconnection network’s routing resources.

There are two categories of routing approaches in the context of interconnection networks: deterministic and adaptive[13]. With deterministic routing, a packet follows a path that is determined exclusively by its source and its destination. Adaptive routers, in contrast, choose the path of each packet based on the current dynamic conditions of the network. Adaptive routing schemes can increase network performance, but they also increase the cost and complexity of the network and make deadlock and livelock prevention and correction more difficult. Restricting adaptivity is one way to reduce some of the problems associated with adaptive routing.

Planar-adaptive routing is a limited adaptive scheme. By restricting the set of possible

paths that a message can take, planar-adaptive routing resolves some of the difficulties as-

sociated with full adaptivity at a reasonable cost. Because planar-adaptive routing also

has most of the desirable routing properties that are found in fully-adaptive routing (such

as load balancing properties), it constitutes a compromise between deterministic routing

and fully-adaptive routing.

In this essay, I provide the reader with a survey of planar-adaptive wormhole routing schemes and a study of deadlock-free planar-adaptive wormhole routed networks in the context of deadlock prevention, performance benefits, cost effectiveness and fault tolerance. In terms of topology, the discussion is restricted to k-ary n-cube networks.

Figure 1: Examples of k-ary n-cube interconnection network topologies (panels: 3-ary 2-cube (torus), 2-ary 3-cube, 4-ary 3-cube)

2 k-ary n-cube Interconnection Networks

Definition 2.1 (k-ary n-cube Interconnection Network) A k-ary n-cube interconnection network is an n-dimensional network in which each dimension has k nodes. k is also referred to as the radix. The relation

$$N = k^n, \quad (n = \log_k N)$$

holds, where N is the total number of nodes in the network. The network is composed of $k^n$ nodes, each connected to its neighbors by $2n$ physical channels.

Figure 1 presents possible architectures for three different k-ary n-cube topologies. Nodes in k-ary n-cubes are identified by an n-digit radix-k address $(a_0, \ldots, a_{n-1})$, where the digit $a_i$ represents the node’s position in the ith dimension. There is always a physical channel between the node with address $(a_0, \ldots, a_i, \ldots, a_{n-1})$ and its upper neighbor in the ith dimension, $(a_0, \ldots, (a_i + 1) \bmod k, \ldots, a_{n-1})$.
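The addressing rule above can be sketched in a few lines of Python. This is only an illustration: the function name `neighbors` and the representation of addresses as tuples are assumptions made here, not part of the definition. Note that for k = 2 the upper and lower neighbors in a dimension coincide, so a node then has n distinct neighbors rather than 2n.

```python
def neighbors(addr, k):
    """Return the distinct neighbors of node `addr` in a k-ary n-cube.

    `addr` is a tuple (a_0, ..., a_{n-1}) of radix-k digits (hypothetical
    representation chosen for this sketch).
    """
    result = []
    for i, a in enumerate(addr):
        for step in (1, -1):  # upper and lower neighbor in dimension i
            nb = addr[:i] + ((a + step) % k,) + addr[i + 1:]
            if nb != addr and nb not in result:
                result.append(nb)
    return result


print(neighbors((0, 0), 3))     # node (0, 0) of a 3-ary 2-cube (torus)
print(neighbors((0, 0, 0), 2))  # node (0, 0, 0) of a 2-ary 3-cube
```

In the 3-ary 2-cube every node has 2n = 4 neighbors, while in the 2-ary 3-cube the wraparound makes both steps in a dimension reach the same node.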

3 Wormhole routing

Wormhole routing is a routing technique by which a switch immediately forwards an in-

coming message to the desired output link when that link is available [11].

In wormhole routing, each packet is divided into a sequence of small units (typically 8-32

bits[16]) of data called flits. Once a communication channel has started to send the first

flits of a packet, it must transmit all the flits of that packet before it can be used for any

other messages. The header flit contains routing information and governs the route. The

remaining data flits follow in a pipelined fashion. If the header is blocked, the data flits are

also blocked.

Because wormhole routing does not make it necessary for a node to allocate an entire

packet buffer before accepting each packet, the buffers do not need to be as voluminous as

they are with other routing schemes such as virtual cut-through[7]. Thus, wormhole routing

allows the construction of fast and inexpensive interconnection networks.

When compared with cut-through and store-and-forward techniques, wormhole routing reduces the buffer requirements of the network. However, it may lower network throughput, because a blocked packet remains in the network, holding the channels it has acquired, instead of being buffered in a node and removed from the network. When a message’s output link is used by another message, a collision takes place and the incoming message is blocked, which increases the probability of deadlock.

4 Adaptive Routing

An adaptive routing scheme allows routers to choose from several possible paths based on

channel loading, network faults and other dynamic information. A fully adaptive minimal

routing algorithm will permit any possible minimal path between a source and destination

to be used when messages are routed through the network.

Deterministic routing algorithms avoid deadlocks by defining a single possible path between source and destination. However, this means that the interconnection network cannot make effective use of its routing resources[15]. As a result, deterministic routing often delivers non-maximal performance under some specific traffic patterns. Adaptive routing, on the other hand, allows the effective use of the interconnection network’s resources, but presents a new set of challenges in terms of deadlock and livelock prevention. There is another advantage to using adaptively routed systems: since path choices can be made on the basis of any local information, adaptive routers can more easily be made fault tolerant.

Full adaptivity is expensive, as it requires a large number of additional hardware resources. The Linder-Harden algorithm, a fully adaptive routing algorithm for k-ary n-cubes, requires a number of virtual channels that is exponential in n ($(n + 1)2^{n-1}$ virtual channels per physical channel[12]). The algorithm of Berman et al., another fully adaptive algorithm for k-ary n-cubes, uses as many as $10(n - 1) + 6$ virtual channels per physical channel. Because buffers and switching hardware are among the most expensive parts of an interconnection network, limiting the required number of virtual channels is an important factor in the choice of the routing algorithm.
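To make the difference in growth concrete, the two formulas quoted above can be evaluated for a few values of n and set against the constant three virtual channels per physical channel that Chien and Kim’s planar-adaptive scheme requires (section 9.2). The function names below are hypothetical and exist only for this sketch.

```python
def linder_harden_vcs(n):
    """Virtual channels per physical channel, (n + 1) * 2^(n - 1)."""
    return (n + 1) * 2 ** (n - 1)


def berman_vcs(n):
    """Upper figure quoted in the text, 10 * (n - 1) + 6."""
    return 10 * (n - 1) + 6


PLANAR_ADAPTIVE_VCS = 3  # constant, independent of n (see section 9.2)

for n in range(2, 7):
    print(n, linder_harden_vcs(n), berman_vcs(n), PLANAR_ADAPTIVE_VCS)
```

For n = 6, for example, the formulas give 224 and 56 virtual channels per physical channel respectively, against 3 for the planar-adaptive scheme.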

5 Livelock Prevention

Livelock represents a state in which one or more messages could be forever denied the resources they require to progress towards their destinations.

Unlike deadlock or indefinite postponement (a packet waiting forever to acquire a network resource for which other packets are always competing successfully), livelock does not stop a packet’s movement, but rather its progress towards its final destination.

Figure 2: Adaptive vs Deterministic Routing Algorithms

Livelock-freedom can, in general, be ensured by assigning resources (channels or buffers) to

waiting messages in a FIFO manner[12].

Both Chien and Kim’s algorithm and the turn algorithm are livelock-free. The proofs are

given in [15] and in [8] respectively.

6 Deadlock Prevention

When a message’s output link is used by another message, a collision takes place, and the

incoming message is blocked. Deadlock occurs when messages are blocked by messages

that are themselves blocked, and the sequence of blocking forms a cycle[15]. Figure 3 il-

lustrates such a cycle. In Figure 3, four color labelled messages have entered the network

through different switches. Each message has acquired the first-hop link, but is blocked on

the second-hop link. A deadlock situation is created. Ensuring deadlock-freedom is more

difficult than ensuring livelock-freedom, and depends heavily on the design of the routing

algorithm[12].

Wormhole routing makes deadlock avoidance a more difficult problem to solve because:

1. In wormhole routing, once a communication channel has started to send the first flits

of a packet, it must transmit all the flits of that packet before it can be used for any

other messages[11]. This means that a packet could be blocking several network links

at the same time.

2. When deadlock situations occur, it is generally hard to assemble the incoming message

in a buffer for later transmission. This is because in wormhole routing the message

lengths are not limited, and the buffers are small[17].

The deterministic version of the wormhole algorithm has only two ways of handling deadlock

situations:

1. By using a technique called backpressure flow [17], where a control signal is sent up-

stream to the reverse link to stop or resume transmission. This however doesn’t pre-

vent message blocking, which increases the message transfer time. This also requires

efficient deadlock detection and resolution mechanisms, which might be expensive in

terms of added hardware and complexity. For example, deadlocks can be detected

using timers. The timer is started when the message arrives at the switch. The timer

eventually expires, allowing the detection of the messages that are not progressing.

2. By using an acyclic routing scheme, in which the sequence of links that can be used contains no cycles[3]. This, however, leads to non-minimal paths and to the concentration of traffic in some nodes of the interconnection network. It might also be difficult to design.

Planar-adaptive wormhole routing approaches deadlock prevention by restricting the paths available to a message in a way that ensures that no cycle of blocked packets can form. We will see how this is done when we look at planar-adaptive schemes.

6.1 The Cost of Deadlock Prevention

In general, supporting both adaptivity and deadlock prevention is expensive because it requires additional virtual channels and larger crossbar switches[13]. Increasing routing flexibility multiplies the possibilities for deadlock situations, which in turn increases the cost of deadlock prevention[6].

Figure 3: A deadlock situation: four messages have entered the network through different switches, and are blocked by each other in a cycle after having each acquired the first-hop link

Constraining the routing freedom to a few dimensions at a time greatly reduces the hardware requirements for deadlock avoidance. While planar-adaptive approaches sacrifice some routing freedom, they also drastically reduce the possibilities of deadlock[6], at a much lower hardware cost than the deadlock-freedom mechanisms used with fully adaptive schemes.

7 Minimal Routing

We distinguish between adaptive routers which route messages using only minimal paths

(wasting no work) and those that will consider nonminimal paths, potentially wasting rout-

ing work in exchange for increased routing freedom.

In minimal routing algorithms, messages will get closer to their destination with each hop

taken. Since message latencies increase with the number of hops[6], minimal routing makes

it possible to utilize the full wire capacity of the network productively.

Figure 4: Virtual Channel Router Architecture Diagram

The Chien and Kim planar-adaptive scheme is a minimal, adaptive routing algorithm[15].

It will however allow misrouting in its fault tolerant version. The turn model can be either

minimal or not[8].

8 Virtual Channels

Virtual channels are used in wormhole routed interconnection networks to avoid deadlocks and to improve link utilization and network throughput ([7],[3],[16]). Deadlock-free planar-adaptive routing relies on the use of virtual channels.

Definition 8.1 (Virtual Channels [7]) A virtual channel is a pair of flit buffers (one in each of two connected nodes) connected by a shared physical channel. The physical channel is timeshared by the virtual channels.

Figure 4 depicts four virtual channels sharing a single physical channel. Each virtual channel has its own flit queue, but shares the bandwidth of the associated physical channel with the other virtual channels in a time-multiplexed fashion.
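The time-multiplexing described above can be sketched as a simple round-robin arbiter. This is a minimal illustration under stated assumptions: the round-robin policy, the function name `multiplex` and the way blocked lanes are passed in are choices made for this sketch and are not taken from [7].

```python
from collections import deque


def multiplex(lanes, blocked, cycles):
    """Carry at most one flit per cycle over a shared physical channel.

    `lanes` is a list of flit queues (one per virtual channel) and
    `blocked` is the set of lane indices that cannot advance. Lanes are
    served round-robin; None records an idle cycle.
    """
    sent = []
    start = 0
    for _ in range(cycles):
        for offset in range(len(lanes)):
            lane = (start + offset) % len(lanes)
            if lanes[lane] and lane not in blocked:
                sent.append(lanes[lane].popleft())
                start = (lane + 1) % len(lanes)
                break
        else:
            sent.append(None)  # every lane is empty or blocked
    return sent


# Packet A (lane 0) is blocked downstream; packet B (lane 1) still advances.
lanes = [deque(["A0", "A1"]), deque(["B0", "B1"])]
print(multiplex(lanes, blocked={0}, cycles=3))  # ['B0', 'B1', None]
```

The physical channel only goes idle in the third cycle, once the single unblocked lane has drained.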


8.1 Virtual Channel Utilization

A network that uses virtual channels for flow control purposes organizes the flit buffers associated with each channel into lanes. The buffers in each lane can then be allocated independently of the buffers in the other lanes. This increases channel utilization and, by extension, throughput[7]. As virtual channels are an integral part of deadlock-free partially adaptive routing, it is worthwhile to consider the advantages and inconveniences of using virtual channels when looking at planar-adaptive routing.

8.1.1 Advantages of Using Virtual Channel

Below, I outline some of the major advantages of using virtual channels.

1. Adding lanes to the network allows blocked packets to be passed. This in turn increases network throughput[7] and facilitates deadlock prevention.

2. Virtual channels provide an additional degree of freedom in the allocation of net-

work resources for the routing of packets in the network. This facilitates the use of

scheduling strategies, reducing the variance of the network latency[7].

3. Because buffer memory tends to be cheaper than physical channel bandwidth, adding

virtual channels to a network provides a cost effective way to increase bandwidth as

it permits the decoupling of wire resources.

4. Physical channel idle time is reduced because, when using virtual channels, a physical channel is idle only when all of its virtual channels are idle or blocked. [7] shows that the probability of this happening is small.

5. Virtual channels make it easier to build virtual topologies. This results in easier

network separation.

8.1.2 Disadvantages of Virtual Channel Usage

Below, I outline some of the major inconveniences of using virtual channels.

1. Adding virtual channels to a physical channel is less expensive than adding new phys-

ical channels, but it is not free. Adding buffer space and control logic will contribute


to increasing the cost of the underlying hardware of the interconnection network, as

well as its complexity.

2. Each extra virtual channel will reduce the bandwidth of the other virtual channels

that are already sharing the physical channel.

3. Virtual channels increase the signaling overhead of the interconnection network.

4. Virtual channels increase the cycle time and scheduling overhead of the interconnec-

tion network.

5. Virtual channels increase the scheduling complexity of the system (packet stretching

problem).

6. Preserving packet transmission order is difficult in any interconnection network that

uses multipath routing[15]. Using virtual channels makes it harder, as some packets

might be buffered while others might be progressing. It is difficult and expensive to

use packet sequence numbering schemes in large networks, and reassembly schemes

are expensive. It is however possible to modify the planar-adaptive routing presented in [15] to make it order-preserving. The modified version works by restricting the routing paths even further, essentially reducing planar-adaptive routing to dimension-order routing.

8.2 Using Virtual Channels for Deadlock Avoidance

Planar-adaptive routing uses virtual channels primarily for deadlock avoidance. Any cyclic network can be made deadlock-free by restricting routing in such a way that there are no cycles in the channel dependency graph. Virtual channels are then added to reconnect the network[7].
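As an illustration of this idea, the sketch below builds the channel dependency graph of a small unidirectional ring, first with a single channel per link and then with two virtual channels per link. The assignment rule used here (a packet moves to virtual channel 1 once it has crossed the wraparound link) is a dateline-style rule chosen for illustration; it is an assumption of this sketch, not a scheme described in this essay, and the function names are hypothetical.

```python
def ring_dependencies(k, two_virtual_channels):
    """Channel dependencies of a unidirectional k-node ring.

    A channel is (node it leaves, virtual channel). Every source and
    destination pair is routed and each pair of consecutive channels on
    a route is recorded as a dependency.
    """
    deps = set()
    for src in range(k):
        for dst in range(k):
            node, vc, prev = src, 0, None
            while node != dst:
                chan = (node, vc)
                if prev is not None:
                    deps.add((prev, chan))
                prev = chan
                if two_virtual_channels and node == k - 1:
                    vc = 1  # the packet has crossed the wraparound link
                node = (node + 1) % k
    return deps


def is_acyclic(deps):
    """Kahn's algorithm: True if the dependency graph has no cycle."""
    nodes = {c for edge in deps for c in edge}
    indegree = dict.fromkeys(nodes, 0)
    for _, head in deps:
        indegree[head] += 1
    ready = [c for c in nodes if indegree[c] == 0]
    removed = 0
    while ready:
        c = ready.pop()
        removed += 1
        for tail, head in deps:
            if tail == c:
                indegree[head] -= 1
                if indegree[head] == 0:
                    ready.append(head)
    return removed == len(nodes)


print(is_acyclic(ring_dependencies(4, False)))  # False: deadlock is possible
print(is_acyclic(ring_dependencies(4, True)))   # True: the cycle is broken
```

With one channel per link the dependencies close around the ring; the second virtual channel lets packets that wrap around continue without depending on the channels they started on.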

Figure 5(a) shows how a packet A blocked between routers 3 and 4 also blocks the packet B when the network is not equipped with virtual channels. In Figure 5(b), the network is equipped with virtual channels, allowing dual utilization of the physical channel between node 3 and node 4. Packet B can now pass A.

Figure 5: Virtual Channel Deadlock Control (panels (a) and (b))

While virtual channels are expensive, the good news is that planar-adaptive routing requires

only a constant number of virtual channels to be provably deadlock-free, independently of

network size and dimension[15]. In contrast, the virtual channel requirement of deadlock-

free fully-adaptive routing schemes is much higher.

9 Planar Adaptive Wormhole Routing

A primary concern in all adaptive networks is the cost of deadlock prevention. Because

planar-adaptive routing limits routing freedom, it makes it possible to prevent deadlock with

only a fixed number of virtual channels, independent of the number of network dimensions.

Planar-adaptive routing supports full adaptivity, but only at the 2-dimensional plane level.

The routing dimensions change as the packet progresses towards its destination[15]. Though

there is less routing freedom than with fully adaptive routing, planar-adaptive routing still

allows choice from a large number of paths from source to destination[6].


9.1 The Turn Model for Adaptive Routing

Proposed in [8], the turn model is deadlock-free and livelock-free. While it can be applied to networks with extra channels, unlike Chien and Kim’s algorithm, presented in section 9.2 of this essay, it is not based on the addition of virtual channels. It is based instead on the analysis of the directions in which packets can turn in the network and of the cycles that these turns can form. The algorithm works by prohibiting only those turns (changes of direction) that could cause deadlock.

9.1.1 The Algorithm

The term channel is used to designate both physical channels and virtual channels. The

steps of the algorithm for a 2-dimensional mesh are as follows:

1. The first step partitions the channels in the network into sets according to the directions in which they route packets. When there are v channels in a physical direction, they are treated as being in v distinct virtual directions and are divided into v distinct sets. Wraparound channels are in a separate set and are used in step 5 of the algorithm.

2. The possible turns from one virtual direction to another are identified. 180-degree and 0-degree turns are ignored. A 0-degree turn represents a transition from one set of channels to another in the same physical direction, and is only possible when there are multiple channels in one direction.

3. The cycles that these turns can form are then identified. [8] indicates that, in general, identifying the simplest cycles in each plane of the topology is enough.

4. One turn in each abstract cycle is prohibited so as to prevent deadlock. The prohibited

turns are chosen carefully so as to break every possible cycle, including complex cycles

not identified in step 3.

5. Turns originating from the wraparound channels are then added back, but only after checking that they do not reintroduce cycles. [8] indicates that at least one turn for each wraparound channel can always be incorporated.

6. All the 180-degree and 0-degree turns that do not reintroduce cycles are added back.

Figure 6: Turns prohibited in the west-first algorithm

Figure 7: Turns prohibited in the north-last algorithm
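The effect of prohibiting turns can be checked mechanically on a small example. The sketch below builds the channel dependency graph of a 2-dimensional mesh without wraparound channels, for a given set of prohibited 90-degree turns, and tests it for cycles. It is an illustration only: the direction names, the mesh size and the function names are assumptions of this sketch, and checking a dependency graph for cycles is a general technique rather than the procedure of [8]. The three prohibited-turn sets correspond to the west-first, north-last and negative-first variants described in the following subsections.

```python
from itertools import product

DIRS = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}
OPPOSITE = {"E": "W", "W": "E", "N": "S", "S": "N"}


def dependency_graph(size, prohibited):
    """Map each channel (x, y, direction) to the channels it may wait on."""
    def inside(x, y):
        return 0 <= x < size and 0 <= y < size

    graph = {}
    for x, y, d in product(range(size), range(size), DIRS):
        hx, hy = x + DIRS[d][0], y + DIRS[d][1]  # node the channel enters
        if not inside(hx, hy):
            continue
        graph[(x, y, d)] = [
            (hx, hy, d2)
            for d2, (ex, ey) in DIRS.items()
            if d2 != OPPOSITE[d]            # 180-degree turns are ignored
            and (d, d2) not in prohibited
            and inside(hx + ex, hy + ey)
        ]
    return graph


def has_cycle(graph):
    """Iterative depth-first search for a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = dict.fromkeys(graph, WHITE)
    for root in graph:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(graph[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:
                    return True
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                colour[node] = BLACK
                stack.pop()
    return False


WEST_FIRST = {("N", "W"), ("S", "W")}      # no turns to the west
NORTH_LAST = {("N", "W"), ("N", "E")}      # no turns while heading north
NEGATIVE_FIRST = {("N", "W"), ("E", "S")}  # no positive-to-negative turns

print(has_cycle(dependency_graph(3, set())))  # True: all turns allowed
for turns in (WEST_FIRST, NORTH_LAST, NEGATIVE_FIRST):
    print(has_cycle(dependency_graph(3, turns)))  # False each time
```

In each variant only two of the eight 90-degree turns are prohibited, which is enough to leave the dependency graph of this mesh without cycles.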

9.1.2 The West-First Algorithm

One way to prevent cycles is to prohibit the two turns to the west (Figure 6). This ensures that a packet that needs to go west does so at the beginning of its path.
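A minimal version of the resulting routing function might look as follows. This sketch assumes a 2-dimensional mesh in which x grows to the east and y grows to the north, and it only returns minimal (productive) directions; the function name and coordinate convention are hypothetical.

```python
def west_first_outputs(current, destination):
    """Directions a packet may take next under minimal west-first routing."""
    (cx, cy), (dx, dy) = current, destination
    if dx < cx:
        return ["W"]  # all westward hops must be taken first
    outputs = []
    if dx > cx:
        outputs.append("E")
    if dy > cy:
        outputs.append("N")
    if dy < cy:
        outputs.append("S")
    return outputs  # empty when the packet has arrived


print(west_first_outputs((2, 2), (0, 3)))  # ['W']
print(west_first_outputs((1, 1), (3, 0)))  # ['E', 'S']
```

A packet that must travel west has no choice until its westward hops are done; afterwards it may choose adaptively among the remaining productive directions.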

9.1.3 The North-Last Algorithm

Another possibility is to prohibit the two turns that a packet travelling north could make (Figure 7). Doing this forces a packet that needs to go north to do so at the end of its path.


Figure 8: Turns prohibited in the negative-first algorithm

9.1.4 The Negative-First Algorithm

The last variant of the turn model is the negative-first algorithm. Here, the prohibited

turns will be the two from a positive to a negative direction, forcing the packet routing to

proceed west and south first, and then east and north.

9.1.5 Deadlock Freedom

All the turn prohibition based algorithms presented above are deadlock-free. This comes

from the existence of a channel numbering system for each algorithm in which packets

can be shown to always be routed along channels with strictly decreasing (or increasing)

numbers.

9.1.6 Performance

In [8] it is proven that the turn model will require the prohibition of at least a quarter of all

possible turns in order to prevent deadlock. Turn prohibiting has an impact on performance,

as it reduces adaptivity.

9.2 Chien and Kim’s Partially Adaptive Routing Algorithm

This scheme is presented in [15]. In this version of planar-adaptive wormhole routing, three bidirectional virtual channels must be provided for each physical channel. It is fault tolerant and deadlock-free; however, two faults may prevent many packets from being routed by this method[1]. The algorithm works by dividing a k-ary n-cube topology into $n - 1$ virtual planes, routing adaptively within each plane and deterministically from one plane to the next. This is repeated until the packet header has reached its final destination.

Figure 9: Chien and Kim’s planar-adaptive routing in a 2-ary 3-cube

Figure 9 gives a high-level idea of the way the algorithm works for a 2-ary 3-cube. In Figure 9, a packet that needs to be routed through each of the network’s three dimensions is first routed in the X-Z plane, then in the Y-Z plane.

9.2.1 Notation and Terminology

The three virtual channels of each physical channel are labelled from 0 to 2. The virtual channels of dimension $i$ can now be written as the union of three sets of virtual channels: $\{d_{i,0} + d_{i,1} + d_{i,2}\}$ (see Figure 10).

Definition 9.1 (Adaptive Plane) An adaptive plane $A_i$ is defined formally as a set of virtual channels:

$$A_i = d_{i,2} + d_{i+1,0} + d_{i+1,1}$$

over two dimensions $i$ and $i + 1$. Within the plane, messages are routed adaptively with respect to these two dimensions.

Given a k-ary n-cube with n dimensions, the algorithm starts by creating $n - 1$ such adaptive virtual planes.
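The construction of the planes can be written out directly from Definition 9.1. In the sketch below a set of virtual channels is represented by a (dimension, label) pair; this encoding and the function names are assumptions made for illustration.

```python
def adaptive_planes(n):
    """Return the planes A_0 .. A_{n-2} as sets of (dimension, label) pairs."""
    return [{(i, 2), (i + 1, 0), (i + 1, 1)} for i in range(n - 1)]


def channels_outside_planes(n):
    """Virtual channel sets that belong to no adaptive plane."""
    every = {(dim, label) for dim in range(n) for label in range(3)}
    return every - set().union(*adaptive_planes(n))


print(adaptive_planes(3))
print(sorted(channels_outside_planes(3)))  # [(0, 0), (0, 1), (2, 2)]
```

For n = 3 the channel sets left outside the planes are $d_{0,0}$, $d_{0,1}$ and $d_{2,2}$, that is $d_{0,0}$, $d_{0,1}$ and $d_{n-1,2}$ in general.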

Given a bidirectional virtual channel $d_{i,j}$ in the dimension $i$ of the network, we differentiate between the two directions of the data flow passing through the virtual channel by writing $d_{i,j}+$ (for the increasing traffic), and $d_{i,j}-$ (for the decreasing traffic).

Figure 10: Numbering of virtual channels with respect to node A in dimension i, i + 1 (labels: $d_{i,2}$ along dimension $i$; $d_{i+1,0}$ and $d_{i+1,1}$ along dimension $i + 1$)

We can now separate each adaptive plane’s virtual channels into two separate virtual networks:

1. The increasing network, which routes increasing traffic and is defined as the union of the two sets: $d_{i,2}+$, $d_{i+1,0}$

2. The decreasing network, which routes decreasing traffic and is defined as the union of the two sets: $d_{i,2}-$, $d_{i+1,1}$

Figure 11 shows a plane logically decoupled into disjoint sets of virtual channels (the increasing and the decreasing network).

9.2.2 The Algorithm

1. At the adaptive plane level, messages are routed adaptively by looping through the adaptive planes, starting with $A_0$, all the way to $A_{n-2}$. Within each adaptive plane $A_i$, packets may use any channels leading toward their destination until the $d_i$ address is correct. Within each adaptive plane $A_i$, if the $d_i$ address of the destination is lower than the current $d_i$ address of the packet, the packet is routed through the decreasing network only; conversely, if the $d_i$ address of the destination is higher than the current $d_i$ address of the packet, the packet is routed through the increasing network only.

2. At the loop’s exit, only the $d_{n-1}$ address of the packet might be different from its final address. If that is the case, the packet is routed to its final destination using the $d_{n-1,2}$ channel.

Figure 11: Increasing and Decreasing Virtual Networks of an Adaptive Plane
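The two steps above can be sketched as a small simulation of one packet. Several simplifications are assumed here and are not part of [15]: the network is treated as a mesh (no wraparound channels), a random choice stands in for the router’s selection among the permitted channels, and the function name and hop encoding are hypothetical.

```python
import random


def planar_adaptive_route(src, dst, rng=random):
    """Return the hops of one packet as (dimension, step, virtual channel)."""
    cur = list(src)
    n = len(cur)
    hops = []
    for i in range(n - 1):  # adaptive plane A_i covers dimensions i and i + 1
        while cur[i] != dst[i]:
            increasing = dst[i] > cur[i]
            # d_{i,2}+ or d_{i,2}-, depending on the network in use
            options = [(i, 1 if increasing else -1, 2)]
            if cur[i + 1] != dst[i + 1]:
                step = 1 if dst[i + 1] > cur[i + 1] else -1
                # d_{i+1,0} in the increasing network, d_{i+1,1} otherwise
                options.append((i + 1, step, 0 if increasing else 1))
            dim, step, vc = rng.choice(options)
            cur[dim] += step
            hops.append((dim, step, vc))
    while cur[n - 1] != dst[n - 1]:  # final dimension uses d_{n-1,2}
        step = 1 if dst[n - 1] > cur[n - 1] else -1
        cur[n - 1] += step
        hops.append((n - 1, step, 2))
    return hops


hops = planar_adaptive_route((0, 3, 1), (2, 0, 4), random.Random(0))
print(len(hops))  # 8 hops: the route is minimal
```

Whatever choices are made inside a plane, every hop moves the packet closer to its destination, so the number of hops always equals the distance between source and destination.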

9.2.3 Deadlock Freedom

It is easy to see how the adaptive routing that is done at the plane level is deadlock-free. As each adaptive plane is divided into two completely separate networks, input routing in $d_{i,2}+$ can only depend on output routing from $d_{i,2}+$ and $d_{i+1,0}$ channels. Similarly, input routing in $d_{i,2}-$ can only depend on output routing from $d_{i,2}-$ and $d_{i+1,1}$ channels. This prevents any cycles from forming, as the routing flow is always unidirectional in the $i$ dimension.

Across the planes, deadlock freedom also follows directly, as the loop always routes from the lower to the higher dimension, again making cycles impossible.


9.2.4 Performance Comparisons

[6] provides an evaluation of the performance of planar-adaptive routing in comparison to deterministic and fully adaptive routing with similar resources. The simulation studies presented show that planar-adaptive routers can increase the robustness of network throughput for nonuniform communication patterns.

Chien and Kim’s Planar-Adaptive Routing vs Deterministic Routing

The performance of planar-adaptive routing is first evaluated against the performance of

deterministic routing under three different traffic patterns:

1. Random (uniform) Each node sends with equal probability to all other nodes in

the system.

2. Dimension-reversal Nodes send messages to nodes with address of reversed dimen-

sion index. (x, y) sends to (y, x), (x, y, w, z) sends to (y, x, z, w) and so on.

3. Bit-reversal A node with address $abcd_2$ sends messages to a node with address $dcba_2$.
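Two of the destination functions above are easy to state in code. The sketch below covers bit-reversal and the 2-dimensional case of dimension-reversal only; the function names are hypothetical and the address width is passed explicitly as an assumption of the sketch.

```python
def bit_reversal(address, bits):
    """Destination of `address` under bit-reversal traffic (abcd -> dcba)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (address & 1)
        address >>= 1
    return result


def dimension_reversal_2d(node):
    """Destination of node (x, y) under dimension-reversal traffic."""
    x, y = node
    return (y, x)


print(format(bit_reversal(0b0011, 4), "04b"))  # 1100
print(dimension_reversal_2d((1, 3)))           # (3, 1)
```

Both patterns are permutations, so every node sends to exactly one destination, which concentrates traffic on particular paths under deterministic routing.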

The comparisons show that while the latency of planar-adaptive routing is similar to or slightly worse than that of deterministic routing when traffic loads are uniform, non-uniform traffic conditions give a clear advantage to planar-adaptive routing, as the planar-adaptive routed networks saturate much later than the deterministically routed networks. In general, planar-adaptive routed networks tend to be more consistent in terms of performance when confronted with different traffic patterns.

Chien and Kim’s Planar-Adaptive Routing vs Fully Adaptive Routing

Comparisons with fully adaptive routers show that planar-adaptive routers can give superior performance[15] with fewer resources.

9.2.5 Fault Tolerance

The scheme can be augmented with misrouting to support fault tolerance. The resulting networks tolerate faults by routing around them. The approach taken in [6] complements the one presented in [5], where all faulty regions are augmented until they are convex (deactivation algorithm given in [6]). Requiring the faulty regions to be convex allows a larger fraction of the nodes to remain in service for a given pattern of faults[6]. If the faulty regions are not naturally convex, augmentation marks good nodes and channels as faulty until the regions become convex. Planar-adaptive routing then routes packets within the parts of the machine that remain connected. The flexibility of the adaptive routing algorithm is used to circumvent faulty channels. The algorithm is modified to support fault tolerance as follows:

1. Where the algorithm previously didn’t make use of the d_{n−1,2}, d_{0,0} and d_{0,1} channels,

we now define a new adaptive plane using these channels (A_{n−1}).

2. Where we routed using n−1 adaptive planes, we now route using n planes (A_0 . . . A_{n−1}),

using the same high-level algorithm as before in between the planes, but using the

following algorithm in the adaptive planes:

(a) If there are no faults, route as in the first algorithm.

(b) If the packet is blocked by a fault in d_i, route in d_{i+1}. If blocked by a fault in

d_{i+1}, route in d_i.

(c) If the packet is blocked by a fault in d_i and the d_{i+1} offset has already been

reduced to 0, then misrouting occurs. If we were routing in the d_{i+1} direction,

we continue in the same direction. If we were routing in the d_i direction, we pick

any d_{i+1} channel and we start in that direction. As soon as possible, we go back

to the d_i direction, going back to the first step of the algorithm.
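The in-plane rules (a) to (c) above can be sketched as a small decision function. This is only an illustrative sketch, not the implementation of [6]: the function name, its parameters, the choice of +1 as the initial misrouting direction, and the assumption that a plane is finished once the offset in its first dimension reaches 0 are all hypothetical.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def plane_hops(off_i, off_j, fault_i, fault_j, misroute_dir=None):
    """Permitted next hops inside one adaptive plane (hypothetical sketch).

    off_i, off_j   -- remaining signed offsets in the plane's two dimensions
    fault_i        -- True if the productive channel in the first dimension is faulty
    fault_j        -- True if the productive channel in the second dimension is faulty
    misroute_dir   -- +1/-1 while the packet is being misrouted, else None
    Returns a list of (dimension, direction) pairs; "i" is the first
    dimension of the plane, "j" the second.
    """
    if off_i == 0:
        # Assumption: the plane is complete once the first offset is 0.
        return []
    want_i = ("i", sign(off_i))
    if misroute_dir is not None:
        # Rule (c), continued: resume the first dimension as soon as its
        # channel is usable, otherwise keep misrouting in the same direction.
        return [want_i] if not fault_i else [("j", misroute_dir)]
    hops = []
    if not fault_i:
        hops.append(want_i)
    if off_j != 0 and not fault_j:
        hops.append(("j", sign(off_j)))
    if hops:
        # Rules (a) and (b): any non-faulty productive channel may be used.
        return hops
    if off_j == 0:
        # Rule (c): blocked in the first dimension with nothing left to do
        # in the second one, so start misrouting (+1 is an arbitrary pick).
        return [("j", +1)]
    # Both productive channels faulty: not covered by the rules above.
    return []
```

For example, `plane_hops(2, -1, True, False)` leaves only the hop in the second dimension, while `plane_hops(2, 0, True, False)` starts misrouting.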

Circumventing faulty regions with only local information requires packet misrouting.

While allowing misrouting also introduces the possibility of livelock, it is shown in [6] that

planar-adaptive routing forces all packets to always make progress toward their destinations,

ensuring that the resulting networks are livelock-free.
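The augmentation of faulty regions described earlier in this section can be illustrated on a two-dimensional mesh. The sketch below is not the deactivation algorithm of [6]. It assumes a simple rule chosen for illustration only: a good node is marked faulty when it has a faulty neighbour in both dimensions, and the rule is repeated until nothing changes. The function name and the mesh representation (a set of coordinate pairs, no wrap-around links) are hypothetical.

```python
def augment_faults(faulty, width, height):
    """Grow a set of faulty nodes on a width x height mesh (illustrative rule).

    A good node is marked faulty when it has at least one faulty neighbour
    along x and at least one along y. The loop stops at a fixed point.
    """
    faulty = set(faulty)
    changed = True
    while changed:
        changed = False
        for x in range(width):
            for y in range(height):
                if (x, y) in faulty:
                    continue
                # Neighbours outside the mesh are never in the set.
                x_hit = (x - 1, y) in faulty or (x + 1, y) in faulty
                y_hit = (x, y - 1) in faulty or (x, y + 1) in faulty
                if x_hit and y_hit:
                    faulty.add((x, y))
                    changed = True
    return faulty
```

With two diagonal faults at (1, 1) and (2, 2), the two good nodes (1, 2) and (2, 1) are sacrificed and the region becomes a 2 x 2 block. A single isolated fault is left unchanged.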


Figure 12: Deactivation Algorithm

9.3 The Modified Chien and Kim’s Partially Adaptive Routing Algorithm

[2] proposes a modified version of Chien and Kim’s algorithm, intended to correct the low

channel utilization of the original algorithm. The modification determines whether

the packet will need to go through the wrap-around links. When a packet will

not need to use the wrap-around links, it is marked as a free packet. A free packet that

cannot find a virtual channel by the previous assignment is allowed to use the higher level

of virtual channels. The effect is to improve channel utilization by widening

the set of virtual channels available to free packets.
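The free-packet test can be sketched for a k-ary n-cube as follows. This is a hedged illustration, not the procedure of [2]: it assumes minimal routing, with ties between the two directions of a ring resolved in favour of the path that avoids the wrap-around link. The function names and the two-level virtual channel labels are hypothetical.

```python
def uses_wraparound(s, d, k):
    """True if the minimal path from coordinate s to d on a k-node ring
    crosses the wrap-around link (assumption: ties avoid the wrap-around)."""
    return 2 * abs(d - s) > k


def is_free_packet(src, dst, k):
    """A packet is free if no dimension needs a wrap-around link."""
    return not any(uses_wraparound(s, d, k) for s, d in zip(src, dst))


def allowed_virtual_channels(assigned, higher, free):
    """Virtual channels a packet may request: its normal assignment, plus
    the higher level when the packet is free (hypothetical labels)."""
    return [assigned, higher] if free else [assigned]
```

In an 8-ary 2-cube, a packet from (1, 2) to (3, 4) is free, whereas a packet from (1, 2) to (6, 4) is not, because its shortest path in the first dimension goes from node 1 back through node 0 to node 7.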

[2] compares the two schemes in terms of performance, and shows that while the modified

scheme will improve the performance of the original one, its buffer requirements are larger.

The throughput versus buffer utilization of the two algorithms is given in Figure 13.

Figure 13: Comparison of Chien and Kim’s planar-adaptive scheme (Wormadp) and of the modified

algorithm (Wormadpmod) in terms of buffer requirements and channel utilization. Axes: number

of buffers occupied (x10^4) and throughput (packets/cycle).

10 Conclusion

All adaptive routing algorithms increase the hardware complexity of the network in order

to support the additional routing flexibility ([12],[13]). Because the most expensive

parts of routing networks (after the wires for the physical channels) are the buffers and the

switching hardware [15], increasing the underlying hardware complexity of the network results

in a substantial increase in the cost of the network. In addition, increased hardware

complexity can significantly reduce router speed, decreasing total network performance [15].

[15] shows that planar-adaptive routers outperform deterministic routers with equal hardware

resources. Further, adding virtual lanes to planar-adaptive routers increases this advantage.

In this essay, we have seen not only that the structure of planar-adaptive routers is easy

to implement efficiently [6], but also that planar-adaptive routing provides a simple type of

support for deadlock-free adaptive routing in k-ary n-cubes of more than two dimensions [6].

In addition, planar-adaptive routing has some nice fault-tolerant features [9], as it allows

messages to be routed around failed channels and nodes.

In terms of load balancing, by allowing more freedom in the paths that are taken by messages,

planar-adaptive routing spreads network load over physical channels more evenly,

thus improving the performance of the interconnection network ([15],[9]). Simulations show

a clear advantage for planar-adaptive routing under uneven network load conditions. This

is true of all adaptive routing algorithms; however, by restricting adaptivity, planar-adaptive

schemes also reduce the hardware complexity of the interconnection network [6], which will

have a positive impact on its performance. Planar-adaptive routing allows routing flexibility

at a lower hardware cost than full adaptivity.

References

[1] Jau-Der Shih, “Adaptive Fault Tolerant Wormhole Routing Algorithms for Hypercube and

Mesh”.

[2] Yen-Wen Lu, Kallol Bagchi, James B. Burr, Allen M. Peterson, “A Comparison of Different

Wormhole Routing Schemes”.

[3] W. J. Dally and C. L. Seitz, “Deadlock-free message routing in multiprocessor

interconnection networks”, IEEE Trans. Comput., vol. C-36, no. 5, pp. 547–553, May 1987.


[4] William J. Dally, “Performance Analysis of k-ary n-cube Interconnection Networks”, 1988,

IEEE Transactions on Computers.

[5] J. Y. Ngai and C. L. Seitz, “A Framework for Adaptive Routing in Multicomputer

Networks”, Proc. Symp. on Parallel Algorithms and Architectures, 1989, pp. 1–9.

[6] J. H. Kim and A. A. Chien, “An evaluation of planar-adaptive routing (PAR)”, in Proc.

Fourth IEEE Symp. on Par. and Distr. Processing, 1992.

[7] W. J. Dally, “Virtual channel Flow Control”, IEEE Trans. Parallel and Distributed Systems,

1992.

[8] Christopher J. Glass, Lionel M. Ni, “The Turn Model for Adaptive Routing”, 1992, 25 Years

ISCA: Retrospectives and Reprints.

[9] W. J. Dally and H. Aoki, “Deadlock-free adaptive routing in multicomputer networks using

virtual channels”, IEEE Trans. Parallel and Distributed Systems, 4:466–475, 1993.

[10] R. V. Boppana and S. Chalasani, “New Wormhole Routing Algorithms for Multicomputers”,

in International Parallel Processing Symposium, pages 419–423, 1993.

[11] L. M. Ni and P. K. McKinley, “A survey of wormhole routing techniques in direct networks”,

IEEE Computer Magazine 26 (1993), no. 2, 62–76.

[12] Rajendra V. Boppana and Suresh Chalasani, “A Comparison of Adaptive Wormhole Routing

Algorithms”, ISCA, 351–360, 1993.

[13] Kazuhiro Aoyama and Andrew A. Chien, “The Cost of Adaptivity and Virtual Lanes in a

Wormhole Router”, Journal of VLSI Design, 1993.

[14] W. Dally, L. Dennison, D. Harris, K. Kan and T. Xanthopoulos, “Architecture

and Implementation of the Reliable Router”, in Hot Interconnects II, 1994.

[15] A. A. Chien and J. H. Kim, “Planar-adaptive routing: Low-cost adaptive networks for

multiprocessors”, Journal of the ACM, vol. 42, pp. 91–123, January 1995.

[16] Akhilesh Kumar, Laxmi N. Bhuyan, “Effect of Virtual Channels and Memory Organization

on Cache-Coherent Shared-Memory Multiprocessor”, 1996.

[17] Arne Folkestad and Christian Roche, “Deadlock Probability in Unrestricted Wormhole

Routing Networks”, ICC (3), 1401–1405, 1997.
