A Survey on Multi-net Global Routing for Integrated Circuitspeople.ece.umn.edu/users/sachin/jnl/integration01.pdfsequential routing and rip-up-and-reroute, and then discusses multicommodity

A Survey on Multi-net Global Routing for Integrated Circuits ∗

Jiang Hu and Sachin S. Sapatnekar

{jhu, sachin}@mail.ece.umn.edu

Department of Electrical and Computer Engineering

University of Minnesota

Minneapolis, MN 55455, USA

Tel: 612-625-0025, Fax: 612-625-4583

Abstract

This paper presents a comprehensive survey on global routing research over about the last two

decades, with an emphasis on the problems of simultaneously routing multiple nets in VLSI circuits

under various design styles. The survey begins with a coverage of traditional approaches such as

sequential routing and rip-up-and-reroute, and then discusses multicommodity flow based methods,

which have attracted a good deal of attention recently. The family of hierarchical routing techniques

and several of its variants are then overviewed, in addition to other techniques such as move-based

heuristics and iterative deletion. While many traditional techniques focus on the conventional ob-

jective of managing congestion, newer objectives have come into play with the advances in VLSI

technology. Specifically, the focus of global routing has shifted so that it is important to augment

the congestion objective with metrics for timing and crosstalk. In the later part of this paper, we

summarize the recent progress in these directions. Finally, the survey concludes with a summary of

possible future research directions.

∗This work is supported in part by the NSF under contract CCR-9800992 and the SRC under contract 98-DJ-609.

Contents

1 Introduction 4

2 Problem background and formulation 6

3 Basic techniques 9

3.1 Maze routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3.2 Steiner tree construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3.3 0-1 integer linear programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3.4 Network flow model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

4 Sequential routing techniques 14

4.1 Force-directed routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

4.2 Sequential routing through Steiner min-max tree construction . . . . . . . . . . . . . . . 16

4.3 Minimum weighted Steiner tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

5 Region-wise routing 19

5.1 Unique pattern first and outer rim first routing . . . . . . . . . . . . . . . . . . . . . . . 19

5.2 Routing in order of wire orientations and in terms of rows . . . . . . . . . . . . . . . . . 22

6 Move-based heuristics 23

7 Rip-up and reroute 24

8 Multicommodity flow based approach 28

8.1 The Shragowitz-Keel algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

8.2 The Raghavan-Thompson rounding method . . . . . . . . . . . . . . . . . . . . . . . . . 31

8.3 Application of the Shahrokhi-Matula algorithm . . . . . . . . . . . . . . . . . . . . . . . 33

8.4 Application of Garg-Könemann algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 34

9 Hierarchical methods 36

9.1 Top down successive refinement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2

9.2 Bottom up merging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

9.3 Hybrid hierarchical method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

9.4 Hierarchical routing for custom design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

9.5 Hierarchical bisection and linear assignment . . . . . . . . . . . . . . . . . . . . . . . . . 46

9.6 Four bend hierarchical routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

10 Iterative deletion 49

11 Timing driven global routing 51

11.1 Multicommodity flow based approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

11.2 Iterative deletion based routing for standard cells . . . . . . . . . . . . . . . . . . . . . . 53

11.3 Hierarchical bisection and assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

12 Crosstalk driven global routing 56

13 Conclusions and future directions 59

3

1 Introduction

Eighteen years ago, a then new journal, the IEEE Transactions on Computer-Aided Design, pre-

sented a special issue on routing in microelectronics that contained several landmark papers. Another

significant publication at about the same time was a collection of papers edited by Hu and Kuh [40]. In

the years that have elapsed since the publication of these two works, with the scaling of feature sizes,

VLSI technology and circuits have undergone dramatic progress, as characterized by Moore’s law. Par-

ticularly in recent years, this has resulted in interconnect delay accounting for a major portion of circuit

clock cycle, as a result of which VLSI physical design has grown to be a critical factor. Moreover, the

growing circuit complexity has enlarged the size of the design automation problems in physical design

and has brought forth a new set of challenges.

Within the physical design flow, one of the most critical steps is global routing, a stage where signal

nets are connected coarsely under a given placement so that wire/via spaces are allocated to each signal

net. The quality of the global routing solution directly affects chip area, speed, power consumption and

the number of iterations required to complete the design cycle, and hence this step plays an important

role in determining circuit performance. On the other hand, global routing is a notoriously difficult

problem: even the most simple version of the problem, where a set of two-pin nets is to be routed under

congestion constraints, is an NP-complete problem [49].

As a result of both the importance of the problem and its difficulty, a great deal of research has been

carried out on global routing during the last two decades, covering a variety of design styles including

gate arrays, sea of gates, standard cell-based designs and custom circuits. Various techniques and

strategies have been proposed, including rip-up-and-reroute, hierarchical methods, multi-commodity

flow techniques and iterative deletion. However, even with all of these efforts, it is not entirely accurate

to imply that the global routing problem has been solved satisfactorily. In particular, the newest

advances in VLSI technology have raised a new set of issues to be solved and have further complicated

the requirements on global routing.

The purpose of this survey is to provide a comprehensive overview of research in global routing,

with specific emphasis on the problem of the simultaneous global routing of multiple nets in integrated

circuits. This problem requires competent resource management as the global nets compete for a

restricted set of global resources such as routing resources, crosstalk budgets and net delays. The

4

nature of this problem has changed remarkably over the last two decades, with several design styles

gaining favor and then falling out of favor, but many of the fundamental techniques that were introduced

are useful to other more modern design paradigms. In reading this survey, the reader is cautioned not

to set too much store on the specific technology being discussed, but rather, to focus on the underlying

algorithms in an attempt to determine how best they may be extended to the specific routing problem

du jour.

There are several prior surveys that complement the material presented here. Two early surveys on

global routing are presented in a paper by Kuh and Marek-Sadowska [50], and in a chapter of the book

by Lengauer [56]. The books by Sherwani [78], Sait and Youssef [73] and Sarrafzadeh and Wong [75],

present a more updated coverage of progress in global routing. The book by Kahng and Robins [45]

and the survey paper by Cong et al. [16] focus on global routing issues for a single net. In most of these

sources, the attention paid to the problems of simultaneously routing multiple global nets is limited,

and the objective of this work is to attempt to bridge that gap. In the remainder of this paper, the

phrase “global routing” will implicitly imply, unless otherwise stated, that the routing of all nets is being

considered. For a survey of this type, it is appropriate to also list a set of related routing problems that

have not been covered in this survey in order to limit its scope. These include MCM/PCB routing,

FPGA routing, single layer routing, and parallel algorithms for global routing.

This paper is organized as follows. At the outset, the problem background and formulation are

described in Section 2, and the basic techniques that are frequently used in global routing are summarized

in Section 3. Section 4 introduces the first set of global routing algorithms, namely, sequential routing

techniques. Edge-wise routing methods are covered in Section 5, followed by a brief review of move-

based heuristics in Section 6. Next, rip-up-and-reroute approaches are described in Section 7 and

multicommodity flow based methods in Section 8. Hierarchical global routing procedures and the

iterative deletion technique are discussed in Sections 9 and 10, respectively. Most of the methods

described until this section use conventional metrics to manage resource contention. Towards the end,

in Sections 11 and 12, we present ways in which some objectives related to timing driven and crosstalk

driven global routing are incorporated. Finally, Section 13 concludes this survey and suggests directions

for future research.

5

2 Problem background and formulation

For a given location for every cell or macro-block in the layout, each set of electrically equivalent pins

that belong to different cells or macro-blocks, called a net, must be wired together in a step commonly

referred to as routing. Due to technological or methodological restrictions, it is often the case that

some areas of the layout are prohibited from allowing wire routes, corresponding to wiring blockages.

The fundamental goal of the routing step is to connect every net successfully and to resolve resource

contentions. In modern VLSI design, this could be an exceedingly complex problem if it were done in a

single step, since there could be millions of elements and nets integrated on a single chip. One common

approach is to divide the routing procedure into two stages: global routing and detailed routing. In

global routing, the chip area is divided into a set of coarsely-defined regions, and wires that cross

the boundaries of these regions are allocated a coarse route that determines which regions they must

traverse. Following this step, the routing within each such region is carried out in the detailed routing

stage.

(b)(a)

Figure 1: Tessellation for (a) gate array/sea of gates and (b) standard cell design.

Depending on the design style, different forms of routing graphs are constructed to define the coarse

regions for global routing. For gate array, sea of gates and standard cell designs, the entire routing area

is typically tessellated into a grid array, as shown in Figure 1. Note that the horizontal grid line usually

goes through the middle of a row of cells in the standard cell design. The dual graph of this tessellation

is the routing graph G = (V, E), as shown in Figure 2(a). Each vertex v ∈ V represents a grid cell and

each edge e ∈ E corresponds a boundary between two adjacent grid cells. In custom design, on the

6

(a) (b)

Figure 2: The construction of a routing graph from the tessellation for a two-layer routing scheme. Viaconstraints are explicitly modeled in (b) but not in (a).

other hand, the cell placement is not regular as in gate array or standard cell design styles. Therefore,

the routing graph is more appropriately based on the floorplan of the building blocks. This is illustrated

for an example floorplan in Figure 3 where each rectangle represents a building block. Like the routing

graph for gate array and standard cell designs, this routing graph is also a dual to the floorplan graph.

Figure 3: Routing graph for custom design.

The global routing problem requires a set of nets N = {N1, N2, ..., Nk} to be routed over the routing

graph G. A net Ni, 1 ≤ i ≤ k is a set of pins (or terminals) {vi,0, vi,1, vi,2, ...} ⊆ V , among which vi,0 is

the source pin and the others are sink pins. Under the coarse assumptions made for global routing, each

pin is often assumed to lie at the center of the grid cell that contains the pin. The routing problem for a

7

net Ni is to find an additional subset of vertices Vi,Steiner ⊂ V and a set of edges Ei = {ei,1, ei,2, ...} ⊂ E

to form a rectilinear minimum spanning tree Ti = (Vi, Ei), where Vi = Ni ∪ Vi,Steiner.

As multiple nets are routed over the routing graph G, a common boundary between two neighboring

grid cells may be crossed by wires belonging to different nets. In terms of the graph representation, this

implies that the routing trees for these nets utilize a common edge e ∈ E. The number of wires that

utilize an edge e ∈ E is called the flow (or demand), f(e), on the edge. The number of available routing

tracks, which forms an upper bound on the allowable demand, is referred to as the boundary capacity

(or supply), u(e). The ratio of flow to capacity f(e)/u(e) for an edge is referred to as the density or

congestion, λ(e). If f(e) > u(e), then the overflow on the edge is defined as φ(e) = f(e)−u(e); otherwise

the overflow is zero. The most fundamental statement of the global routing problem is to route all of

the nets to ensure that for each edge e corresponding to a cell boundary, the number of wires across it

does not exceed its routing capacity, i.e., φ(e) = 0,∀e ∈ E.

Another important consideration in routing global nets is to manage the number of vias on the net,

and it is relatively easy to modify the above formulation to address this objective. Let us consider the

routing problem under the reserved layer model, where the routing direction for all wires in a layer is

identical. If a routing path makes a change in direction, i.e., a bend is induced, then a layer change is

necessitated, and wires along different directions must be connected through a via at the bend position.

For several reasons, such as considerations of reliability, area, and signal delay and quality, it is desirable

to minimize or control the number of vias during routing. One way to do so is to limit the number

of allowable vias in each grid cell. Under this scenario, the available via spaces in a grid cell may

also be modeled by an edge, and we can obtain a routing graph G of the type shown in Figure 2(b)

for a two-layer routing problem. This routing graph is identical to that defined earlier, except for the

introduction of via edges along the grid. For a via edge e, we define f(e) as the number of vias utilized

in the corresponding grid cell and u(e) as the number of available via spaces in the cell. In the literature,

some works explicitly consider this via constraint while others do not, but these can often be extended

fairly easily to do so.

The specific constraints that are of particular importance vary with the design style, but congestion

management is a uniformly important objective across all design styles. For gate array/sea of gates

designs, the locations of the gates are fixed, and global routing is an effort to route every net without any

8

overflow. The global router on this type of design is typically evaluated through metrics that measure

the number of nets routed without overflow, or the total overflow with all of the nets routed. In custom

design, the objective is similar, and is sometimes extended to minimize the maximum congestion λ̂ over

all edges e ∈ E, even if no wiring overflow occurs. The motivation for doing so is to provide a greater

flexibility to a subsequent detailed routing step within a grid cell by evenly spreading out the global

wires over the grid. For traditional standard cell design in a two-layer environment, only horizontal

wiring channels, which are the spaces between two adjacent row of cells, are available. If a net crosses

two different channels, then feedthrough cells must be inserted into each row of cells between the two

channels to allow the the inter-channel connection for this net. The problem of assigning feedthroughs

to improve the routing quality constitutes a separate problem that is not treated in this survey.

In addition to the congestion objective, wire length is an important concern: large wire lengths imply

a larger power consumption and often, greater delays. It is not hard to see that optimizing congestion

and wire length can often be competing objectives, and some global routing works have attempted

to combine these objectives together. Another important consideration in the deep submicron era is

that the interconnect delay, which consumes a major portion of the clock cycle today and is projected

to continue to do so in the future. Therefore, timing performance must be included into the set of

objectives considered during global routing. In addition, crosstalk has become an increasingly vexing

issue and an attempt at crosstalk management during global routing is of great value in an overall

strategy to manage coupling effects between wires.

3 Basic techniques

In this section, we summarize some of the basic techniques that are frequently used to solve sub-

problems in global routing. These include maze routing, Steiner tree construction, 0-1 integer linear

programming, and network flows.

3.1 Maze routing

One basic subproblem that is commonly encountered in global routing is that of finding a shortest path

connecting two pins in the presence of wiring blockages. Perhaps the most well known solution to this

problem is the maze routing [54,66] algorithm, which works on a routing graph similar to the grid graph

9

in Figure 2(a). Each edge e has a cost c(e) associated with it, and this cost may be different for different

routing directions. If an edge e is within the wiring blockage area, then its cost c(e) = ∞; otherwise its

cost is finite and specified. A common cost metric is the rectilinear edge length, counted in terms of

the number of grid cells.

The maze routing algorithm can be regarded as an implementation of Dijkstra’s shortest path algo-

rithm [26] on the routing graph that minimizes the path cost. For a given graph G = (V, E), where each

edge e ∈ E has a cost c(e), Dijkstra’s algorithm is able to find a minimum cost path p(vs, vt), which is

set of consecutive edges, connecting a source vertex vs ∈ V and a target vertex vt ∈ V . This algorithm

consists of a cost labeling step followed by a path tracing phase. In the cost labeling phase, starting

from the source vertex vs, the accumulated cost from the source to each vertex is labeled one by one in

a “wave expansion” manner until the target vertex is reached. The minimum cost path is then traced

back from the target vertex to the source by retrieving the bookkeeping information that is maintained

in the cost labeling phase. The runtime for this algorithm is O(|E| + |V | log |V |) through a Fibonacci

heap implementation, where |E| is the number of edges and |V | is the number of vertices.

If there is an optimal shortest path, maze routing is guaranteed to find it. However, in practice, it is

found that the maze routing algorithm is slow and has large memory requirements, and many efforts [78]

have attempted to improve its speed and memory usage. Another well known class of methods is the

category of line probe-based algorithms (see, for example, [34]), which do not rely on a grid graph.

While these methods are faster than maze routing, they are not guaranteed to find a solution, even if

one exists. For a good description on maze routing and related algorithms, the reader is referred to [78].

The maze routing algorithm has been extended to find a path connecting two pins in such a way that

it favors a path that passes through less congested areas [68]. Since maze routing inherently considers

only one net at a time, this extension requires nets to be considered one at a time, with the consequent

dependence on the order in which nets are processed. The procedure computes the edge cost c(e), e ∈ E,

so that it reflects the current congestion at its corresponding cell boundary, instead of a distance metric

such as the rectilinear length. There are several variations on this cost definition: for example, one

10

could use the density λ(e), or λ2(e). Another effective cost function, in our experience, is given by

c(e) =

f(e)+1u(e)−f(e) : f(e) < u(e)

∞ : f(e) ≥ u(e)

An extension to maze routing that considers congestion costs explicitly can achieve the purpose of

not only helping the path to avoid wiring blockages, but also of distributing the routing congestion,

and therefore this technique is frequently used in global routing. A technique for timing driven maze

routing has been proposed recently in [43,44]

3.2 Steiner tree construction

source

sink

sink

(b)(a)

Steiner node

Figure 4: A three pin net connected by (a) a spanning tree and (b) a Steiner tree.

Both maze routing and line probe methods are designed for connecting two pin nets. However, in

practice, nets with more than two pins are often encountered in the routing problem. A common

approach in dealing with a multi-pin net is to decompose it into a set of two-pin nets. One way of

performing this decomposition is to begin with constructing a minimum spanning tree (MST) over the

pins, and to maze-route each pair of pins that corresponds to an edge in the MST. In Figure 4(a), a

possible solution using this procedure for routing a three-pin net, decomposed into two two-pin subnets,

is shown. It is easily seen that such a procedure may not lead to the best routing quality. The wire

length of a routing tree can often be reduced by introducing extra nodes in addition to the given pins,

and constructing an MST over all of these nodes. The added nodes are called Steiner nodes, and an

illustration is shown in an example in Figure 4(b).

In most VLSI routing problems, since all of the wires are either horizontal or vertical, only rectilinear

11

Steiner trees (RST) are considered. One basic RST construction formulation is to minimize the wire

length of the tree, i.e., to construct a rectilinear Steiner minimum tree (RSMT). RSMT construction is a

well-known NP-complete problem and many approximation algorithms have been developed. Several of

these are based on results that relate the wire length of the optimal RSMT to that of the optimal MST.

When timing is considered, merely minimizing wire length is not adequate. As a first approximation,

it can be observed that the length of the source-sink routing path length plays an important role in

determining the signal delay. Several early works [5, 17] are dedicated to finding a good compromise

between wire length and the maximum source-sink path length (also known as the radius). Later

research efforts have directly considered delay metrics in the tree construction procedure [7, 19, 22, 36,

37, 58, 85]. For surveys on Steiner tree construction techniques for VLSI routing, the reader is referred

to [11,16,45].

Many of the Steiner tree construction algorithms that have been proposed in the literature focus on

the optimization of a single net, and do not consider wire congestion issues explicitly. Nevertheless, these

algorithms can be applied to serially route the nets, with the most critical nets being routed in advance

of other non-critical nets. When the edge cost is defined according to congestions, the Steiner minimum

tree algorithms may be applied directly to even out congestion while simultaneously restraining the wire

length [13]. Steiner minimum tree algorithms are also often employed in providing a set of candidate

routes to multicommodity flow based methods [3,9] and other iterative techniques. These methods will

be described in detail later in this paper.

3.3 0-1 integer linear programming

Global routing may be formulated as a special type of optimization problem, called a zero-one integer

linear programming (0-1 ILP) problem [61]. For a set of candidate routing trees Ti = {Ti,1, Ti,2, ...} for

net Ni, we use variable xi,j to indicate if tree Ti,j is selected for net Ni. The global routing problem

12

can then be formulated as:

Minimize λ̂

Subject to:∑

Ti,j∈Ti xi,j = 1, ∀Ni ∈ N

∑

i,j:e∈Ti,j xi,j ≤ λ̂u(e), ∀e ∈ E

xi,j = {0, 1}, ∀Ni ∈ N ,∀Ti,j ∈ Ti

(1)

The first constraint, along with the restriction of the xi,j ’s to {0, 1}, requires that one tree be chosen

for each net. The second constraint and the objective together ensure that the maximum congestion is

minimized.

One straightforward approach to this problem is to first solve the continuous linear programming

relaxation, obtained by replacing the third constraint with xi,j ∈ [0, 1], since practical solutions to

linear programming problems can be found in polynomial time [46], but the 0-1 ILP problem is NP-

complete. The fractional solution thus obtained may then be transformed to integer solutions through

rounding techniques such as randomized rounding, as in [71]. An alternative approach in [83] applies

an interior point method in conjunction with column generating techniques [39] to solve the 0-1 integer

linear programming problem.

In practice, the global routing problem is seldom solved entirely using the 0-1 ILP formulation since

the problem size can grow to be very large. More often, the ILP technique is embedded into a larger

overall global routing strategy, such as solving a subproblem at one hierarchical level of a hierarchical

routing procedure [8,33,41,62,81], where the complexity of the computing the optimal solution to a 0-1

ILP is manageable.

3.4 Network flow model

The objective of the global routing problem is to allocate a limited set of resources evenly to a given

set of demands. Intuitively speaking, the nature of this problem is quite coherent with the problem

of finding optimal flows in a network [52], and several research efforts have pursued this as solution

technique.

A network is a connected graph consisting of a set of vertices and edges; for the global routing

problem, this graph is a minor modification of the routing graph in Figures 2 and 3. In a basic network

13

flow model, there are two special vertices: one called the source and another called the target or sink.

A certain amount of flow, called the demand, for one commodity must be shipped from the source to

the target vertex. Each edge has a flow capacity, which represents the upper bound for the flow that

is allowed to pass through the edge. One version of the problem requires the transport of as much

flow as possible through the network without exceeding any edge capacity, and this is referred to as

the max-flow problem. Another version assigns a cost per unit flow to each edge, and sets the problem

objective to be the minimization of the total transportation cost through the network for a given flow

from the source to the sink; this formulation is called the min-cost flow problem. The appealing feature

of the network flow problem is that it can be solved in polynomial time to obtain an optimal integer

solution when edge capacities are integers. A good description of network flow methods can be found

in the book by Ahuja et al. [2]. While it is not possible to model the entire global routing problem as

a single commodity network flow problem, this technique can be employed to solve some subproblems

in global routing [14,38,65] and can lead to high quality solutions.

There is a special type of network flow model that can be applied directly to the global routing

problem. This is the multicommodity flow problem, where many commodities must be shipped on a

common network, and each commodity has its own sources and targets that may be different from

that for other commodities. In the mapping to the global routing problem, each net can be treated

as a commodity. An advantage of this approach is that the multicommodity flow problem can be

formulated as a linear programming problem. Since linear program solvers are slow for the sizes of

problems encountered in global routing, research on the multicommodity flow problem has mostly

focused on heuristics and combinatorial approximation algorithms. These methods will be introduced

in the following sections.

4 Sequential routing techniques

Perhaps the oldest and most straightforward strategy for routing multiple nets is to select a specific

order and to then route the nets sequentially in that order. The major advantage of this approach is

that the congestion information for previously routed nets can be taken into consideration while routing

a given net. For example, in early algorithms that operated by decomposing multi-pin nets into two-pin

nets, each net was routed using techniques such as the obstacle avoidance version of maze routing or

14

the line probe method. In these approaches, a cell boundary is said to be open to path searching until

all of the tracks have been occupied by previously considered nets; after that point, the boundary was

treated as an obstacle.

The drawback of this sequential approach is that the quality of the solution depends greatly on the

order in which nets are processed, and that it is hard to find a good net ordering. Under any net

ordering, it is often more difficult to route the nets that are considered later since they are subject to

more blockages. Moreover, there is no feedback mechanism that permits these nets to feed information

back to the nets routed earlier with directions on regions that should be left free for their routes. Early

work by Abel [1] concluded that there is no single net ordering technique that consistently performs

better than any other ordering method. Despite the controversial net ordering issue, there are several

good research results on sequential routing that have been reported, mostly through the use of iterative

loops that feed back congestion information from the later routed nets to the earlier routed nets.

4.1 Force-directed routing

In [31], the global routing problem for two-pin nets is solved by emulating a particle movement in a force

field. This approach assumes that a net order is given and routes each net sequentially. For each net, a

particle departs from from the source pin and moves toward the target pin under the field generated by

the source pin, target pin, unrouted pins in other nets, and routed wires. The trajectory of this particle

motion forms the routing path connecting its source and target pin.

attractive force repulsive force circular force

Figure 5: Various types of forces in force-directed routing.

There are three types of forces, as illustrated in Figure 5. When a particle moves to a certain position,

an unrouted pin in another net will exert a repulsive force along the direction of the line joining the

15

net being routed

particle

unrouted pin of other nets

routed wire

(b)

target

source

A

B

source

target

(a)

Figure 6: An example of the forces exerted on a particle: (a) a repulsive force from an unrouted pinin another net that does not lie in same row or column as the particle. (b) a circular force from anunrouted pin in another net in the same row or column as the particle, or from a routed wires in thesame row or column.

particle and the unrouted pin, as shown in Figure 6(a). The magnitude of the force is 1pr2

, where p

is the number of the total unrouted pins and r is the distance between the particle and the unrouted

pin. The source pin generates a similar repulsive force with a magnitude of 1r2

. To a particle in motion,

the only attractive force is from the target pin along the line joining the particle and the target, and

the magnitude of this force is 1r1.5

, which is generally greater than any other kind of force, so that the

particle can be guaranteed to reach the target. The direction of the combined force from the source and

the target is snapped to the direction of motion of the particle direction if the two directions are not

too different, so that the number of bends can be reduced.

If an unrouted pin or a segment of routed wire from a different net is in the same row or column as

the article, it will exert a circular force of 1r2

. If this force is horizontal (vertical), it will take the same

direction as the horizontal (vertical) component of the combined force from the source and the target,

as illustrated in Figure 6(b). This force-directed method can be parallelized by letting the particle for

every net move simultaneously.

4.2 Sequential routing through Steiner min-max tree construction

A Steiner min-max tree (SMMT) is a Steiner tree whose maximum-weight edge is minimized over all

Steiner trees. In [12], Chiang et al. solve the global routing problem by constructing Steiner min-

max trees for each net sequentially, defining each edge weight according to estimates of the routing

congestion.

16

The algorithm sorts the nets according to the perimeter of their bounding box and routes one net at a

time, with the nets that have a smaller bounding box being routed earlier. The routing tree construction

is carried out on a routing graph of the type described in Section 2, with the edge weights on the graph

being proportional to the wire congestion. For a certain net, a cell is called terminal cell if it contains a

pin of this net; otherwise, it is called non-terminal cell. An optimal Steiner min-max tree that spans the

net on the routing graph is obtained through a two-step procedure in polynomial time. In the first step,

a minimum spanning tree over all of the cells in the routing graph is constructed. In the second step,

any degree-one non-terminal cells are eliminated from this MST. It can be proven that the resulting

tree is an optimal Steiner min-max tree. The edge weights are updated dynamically after the routing

of each net, reflecting changes in the congestion.

Intuitively speaking, a Steiner min-max tree attempts to minimize the maximum edge weight, i.e.,

to minimize the wire congestion. However, the procedure does not consider wire length in constructing

the SMMT. Finding a Steiner min-max tree with minimum wire length (MSMMT) is shown to be

an NP-complete problem, and the authors of [12] propose a heuristic algorithm, based on the SMMT

algorithm, that tries to find the MSMMT. In this heuristic, a wire length limit ratio ρ > 1 is defined,

and the routing is performed over several iterations. Initially, the limit ratio is set to be tight, in a

range that lies in the interval (1, 2). In each iteration, the nets are routed one at a time using the

SMMT algorithm, in the ascending order of the semiperimeter of the bounding box for the net. After

this SMMT solution is obtained, its wire length is evaluated to check whether it is less than ρ times a

lower bound on the wire length, given by the semiperimeter of the bounding box for the net. If not,

this routing solution is rejected, and this net will be considered for routing in the next iteration. The

value of ρ is increased from each iteration to the next so that every net can eventually be routed.

At the end of this iterative SMMT phase, this method iteratively reroutes each net using the minimum

spanning tree algorithm in the same constant order. A rerouting result is accepted only when it is better

than the solution of SMMT phase.

4.3 Minimum weighted Steiner tree

In the SMMT based global routing procedure, the primary objective is to minimize the maximal con-

gestion and the secondary objective is to minize wire length. In [13], another sequential global routing

17

method is developed in an effort to simultaneously minimize congestion and wire length. This method

routes the nets sequentially using a minimum weighted rectilinear Steiner tree (WRST) approximation

algorithm.

The example shown in the work of [13] is based on a custom design such as that shown in Figure

7(a), where the routing area is divided into a set of small regions. Each region is assigned a weight

that is defined according to the complexity and wire congestion of the region. The shaded rectangles

correspond to macros that act as wire blockages, and the weights in these regions are set to ∞.

This work constructs a routing graph for a net as the union of the Hanan grid for the net and the

grid formed by extending the borders of each region until they encounter a blockage or the boundary.

For example, the routing graph for the design shown in Figure 7(a) is illustrated using dashed lines

in Figure 7(b). For a wire that is routed along a border between two regions with different weights,

it is assumed that the weight with a smaller value is chosen. The weight of an edge is the product of

the edge length and the weight of the region the edge belongs to, and the objective of the procedure

is to minimize the cost of the weighted rectilinear Steiner tree. It can be shown that there is always a

minimum weight path connecting two nodes using edges that lie exclusively on this routing graph.

(a) (b)

4

5 3

3

Infty

Infty

22

2

Infty

9

2

4

Figure 7: (a) A set of regions in a layout, each with a specified weight. (b) A routing graph and aminimum weighted Steiner tree for this layout.

As in the SMMT-based approach, the nets are first sorted in increasing order of the bounding box

size and are routed sequentially in this order. The weights for all regions are updated after each net is

routed.

18

Each net is routed by means of a minimum weighted Steiner tree over the net. Based on a proof

that the weight of a minimum spanning tree (MST) is at most two times the weight of optimal WRST

solution, the tree construction proceeds by beginning with an MST skeleton and heuristically attempting

to maximize the overlaps in its embedding on to the routing grid. For the net in Figure 7(b) whose pins

are indicated by the three solid squares, the MST is shown by a set of thickened dashed lines. After the

MST is obtained, each edge in the MST is instantiated into an actual wire route one by one, with the

shortest edges being embedded first. Wherever possible, a wire route is merged with already embedded

wire routes from the same net to reduce the wire length. If there is more than one minimum weight

path for an edge in the SMT, the one that can deliver the largest wire length reduction is chosen. Since

this process can only decrease the wire length from the MST result, the final result has at most twice

the wire length of the optimal WRST’s in the worst case.

5 Region-wise routing

One way to avoid the net ordering problem is to route one region at a time instead of one net at a

time. This alters the problem to one of determining the order in which regions should be considered,

and this may be decided according to net distributions, as in the method described in Section 5.1. In

an alternative approach outlined in Section 5.2, inter-row wires are routed first, and the intra-row nets

are then routed in the order of horizontal wires, vertical wires and wires with both orientations.

5.1 Unique pattern first and outer rim first routing

In an investigation of the properties of net distributions, Li and Marek-Sadowska [57] made two obser-

vations. Firstly, some nets or subnets may have to take certain unique patterns, regardless of what the

final feasible routing is, and it is useful to identify and route these first. Secondly, in typical layouts

the wiring is crowded at the center of the chip and relatively sparse at the outer rim. Therefore, it is

better to start routing from the outer rim of the entire routing area and then shift the routing toward

the center step by step. Although this work was specifically directed towards gate array designs, the

key ideas can be extended to other problems.

The following definitions are useful in describing the algorithm and are related to an abstraction of

the problem in terms of a routing graph, G.

19

Definition 5.1 (Non-Pass-Through (NPT) cell) A cell is said to be a non-pass-through (NPT)

cell if the number of unconnected pins of different nets inside it is equal to or less by one than the total

remaining channel capacities on its four boundaries.

Definition 5.2 (Outermost and inner meshes) In a particular planar drawing of the cell graph G,

where the boundary of the exterior face of a planar embedding of G corresponds to cells at the boundary

of the chip, the outermost mesh is a subset of nodes and edges in the routing graph G that bounds the

exterior face from G, and an inner mesh is a subset of nodes and edges in the routing graph G that

bounds an area where edges are missing.

An inner mesh exists when there is a grid edge whose wiring capacity is zero. An example illustrating

the idea of the outermost mesh and the inner mesh is shown in Figure 8.

meshinner

(a)

meshoutermost

capacity > 0capacity = 0

(b)

Figure 8: An example of outermost mesh and inner mesh.

The algorithm is based on the following properties when the global routing problem has a feasible

solution with no wiring overflow.

Lemma 5.1 If there is a two-pin net, Nk, with pins in two adjacent cells with non-zero wiring capacity

at their common boundary, then there exists a feasible routing solution in which Nk is wired through the

common boundary of the two adjacent cells.

Lemma 5.2 If the total wiring capacity on a closed region boundary is U , and there are P pins inside

such that P > U , then at least P − U of them must be connected within this region.

20

Lemma 5.3 If there are two adjacent NPT cells, ga and gb, with a non-zero wiring capacity uab on the

common boundary, and there are uab or fewer nets with pins in both ga and gb, then there is a feasible

solution in which these pins are connected by the shortest path between ga and gb.

This algorithm proceeds by first finding pairs of pins in adjacent cells and connects them directly

according to Lemma 5.1. Next, it identifies the NPT cells from this partial routing result and determines

unique routes according to following rules:

1. Nets with pins in the neighborhood of NPT cells must find unique routes that do not pass through

the neighboring NPT cells.

2. By Lemma 5.3, nets that have two pins in adjacent NPT cells are connected through the common

boundary of the adjacent NPT cells.

3. If there is a net with one pin located in an NPT cell and the other in its non-NPT neighbor, they

are connected directly across their common boundary.

After these unique patterns have been routed, some new NPT cells may be generated due to the

consequent reduction in the wiring capacity. These cells are identified and the process is repeated until

no such new cells are created.

In the next phase, the routing proceeds along the outermost mesh and the inner meshes. Any nets

with more than two pins within a mesh are connected within the mesh, as long as such a connection

does not lead to barriers separating the routing graph into disjoint regions. Those pins that remain

unconnected are connected to neighboring non-mesh cells. After one layer of outermost mesh is routed,

the outermost mesh is expanded inwards. Similarly, a routed inner mesh is expanded outward. If

a routing decision at an outer mesh is found to be incorrect when the inner meshes are routed, the

algorithm resorts to backtracking to correct the previously made decision. This procedure is repeated

until all of the cells are routed. For a routing instance with k nets to be routed on a p × q grid, this

algorithm has a complexity of O(k2(log k + pq)).

21

terminal GRC

via GRC

possible path segment

Figure 9: Candidate paths for inter-row connections in the algorithm proposed by Thaik et al.

5.2 Routing in order of wire orientations and in terms of rows

In [81], Thaik et al. designed a global routing algorithm for a sea-of-gates environment that proceeds

edge-wise instead of net-wise. They consider the global routing problem for three layers such that top

and bottom layers are reserved for vertical wires and middle layer is devoted to horizontal wires. The

routing region is tessellated into a tile graph as shown in Figure 9, and each row is further divided into

two subrows separated by the thin dashed lines.

The algorithm consists of three phases. In the first, each net that spans multiple rows is connected

by a vertical path in the top layer that covers the vertical span of the entire net. The end point of the

path may either be a terminal GRC (global routing cell) that contains a pin of the net, or a via GRC.

For a specific net, there could be several paths that cover its vertical span, as illustrated in Figure 9,

and only one such path needs to be chosen. This choice should be made while ensuring that there are no

wiring capacity overflows. This decision process is implemented in [81] by means of a zero-one integer

linear programming (ILP) formulation that is solved using a 0-1 ILP package.

In the second phase, connections within each row are considered. Recall that each row consists of

two subrows. If two pins lie in two adjacent subrows of the same row and are in the same column,

they are connected by a straight vertical edge in the lowest routing layer as long as no overflow occurs.

Next, straight horizontal connections are made on the middle layer between pairs of terminal-GRC’s,

via-GRC’s and/or metal-3 paths. These problems are also formulated and solved as 0-1 ILP problems.

22

After phase two, there may still be nets whose routing has not been completed due to wiring con-

straints. These remaining connections are completed in the third phase, which is also constrained to

routing within a row. In this step, multi-bend routes are considered, and the route for each row is solved

as a 2×N routing problem using a hybrid hierarchical approach, where the routing at each hierarchical

level is solved as a 0-1 ILP problem.

6 Move-based heuristics

Move-based heuristics are commonly used to find the optimal solutions to computationally difficult

problems. Arguably, the most well known move-based heuristic is simulated annealing, developed by

Kirkpatrick et al. [47]. The motivation for this approach is that in solving computationally difficult

combinatorial optimization problems, greedy heuristics are easily trapped at local minima because only

those moves that reduce the value of the cost function are accepted. In simulated annealing, a certain

amount of hill-climbing is permitted, providing an increased chance of finding the global minimum.

More precisely, any cost-decreasing move is always accepted, and a cost increasing move is accepted

with a probability of e−∆C/T , where ∆C is the increase in the cost and T is a parameter known as the

temperature. It can be seen that the less the increase on the cost, the more likely the move is accepted.

The procedure works iteratively, with an outer loop starting from a high temperature where virtually

all moves are accepted, and gradually “cooling” the temperature so that the likelihood of accepting

cost-increasing moves is progressively diminished. At each temperature, a number of moves are carried

out, which are either accepted or rejected according to the criterion outlined above. This process is said

to be analogous to the cooling process during annealing of metals.

The simulated annealing technique is applied to global routing in [84], where only two-pin nets are

considered and the number of bends for each net is no more than two. In the well-known TimberWolf

package [76], simulated annealing is applied to both placement and global routing. In this package, a set

of candidate routing trees is created for each net and one of the trees is randomly chosen as the initial

solution. Each move implies a switch from one tree to another for a net. The cost to be minimized is

the total wiring overflow over the entire routing graph. The net whose topology is to be changed by a

move is randomly chosen from a grid cell boundary where an overflow exists.

Other move-based techniques applied to global routing include simulated evolution [10], genetic al-

23

gorithm [28] and tabu search [91].

7 Rip-up and reroute

Another common approach to avoid the net ordering dilemma is the rip-up-and-reroute method. This

approach starts by routing each net individually without considering congestion, usually constructing

Steiner minimum trees for each net. After all of the nets have been routed, the congested areas are

identified, and the nets in those areas are ripped up and rerouted through less congested areas. This

rerouting is often based on the maze routing algorithm. Although this method sounds very simple, it

is surprisingly effective and has long been the most commonly used global routing method in industry.

Moreover, it can always be combined with other global routing methods as a post-processing step

to further improve the routing quality. The degrees of freedom in this approach are related to, for

example, the different strategies for choosing the net to be rerouted, or the order in which boundaries

with overflow are processed.

An early and influential rip-up-and-reroute global routing method was proposed by Ting and Tien [82].

Broadly speaking, their procedure first selects a set of congested grid cell boundaries and then chooses

the subset of nets that pass across these boundaries to be rerouted.

1

1

2

2

3

3

boundary full/overflow

before reroute

after reroute

Figure 10: An example showing the overflow along a full loop. This overflow can be reduced only whenthere is at least one net that crosses the loop twice.

A special case corresponds to the situation where the set of saturated or oversaturated boundaries

forms a closed loop, as illustrated by the thickened dashed square in Figure 10. Along this loop, each

boundary is either full or has an overflow, and at least one boundary has an overflow. In this case, if a

24

wire crosses this loop only once, as is the case for net 1, rerouting this net will not satisfy the congestion

constraints. More generally, if all of the nets pass through the loop only once, then no feasible solution

exists. On the other hand, if a net crosses this loop twice and all of its pins lie either entirely within or

entirely outside the loop, as is the case for nets 2 and 3, respectively, then we can reroute these nets to

ensure that the devious routes taken by these double-crossing nets are altered to lie entirely within or

without the loop. The first step of Ting and Tien’s algorithm is to identify such loops and to reroute

the two-crossing nets.

N1N2

N3

N4

N5N6

N7N8

N9N10

BoundaryNet

N4

N3

N5

N6

N7

N8

b1

b2

w1 = 1

b3w3 = 1

w2 = 1b1

b2

b3

(a) (b)

Figure 11: Mapping the problem of choosing nets to be rerouted on to the problem of finding a coveron a bipartite graph.

When no such loop exists, Ting and Tien’s algorithm selects a set of k > 1 most congested boundaries

to reroute. There are typically many nets that run across these boundaries, and the technique employed

to choose the nets to be rerouted is illustrated through the example in Figure 11(a). In this example,

the wiring capacity on each boundary is assumed to be two, and it is easily verified that there are

three boundaries, b1, b2 and b3, with overflow. A bipartite graph is then constructed as shown in Figure

11(b). The group of vertices to the right correspond to the overflow boundaries, and those to the

left correspond to the nets that cross those boundaries. An arc is set up between a net vertex and a

boundary vertex if the net crosses this boundary. Each boundary vertex bi has a weight wi equal to the

overflow of the corresponding boundary. A minimum cardinality cover set from the group of vertices to

the left is selected to cover every boundary vertex wi times, and the nets in this cover set are selected

to be rerouted.

25

Another variant of rip-up-and-reroute method is due to Nair [67]. In this procedure, every net is

ripped up and rerouted based on the observation that a wire in a non-overflow area may be able to

move further and to a less congested area to leave some room for wires in adjacent congested areas.

Another feature is that every net is rerouted in the same constant order iteratively. The rationale

for this is that the routes that are chosen early in an iteration are based on less accurate congestion

information compared to the routes that are determined later, and therefore, these early routed nets

should be corrected first in the next iteration as well.

(a)

subtree T2

v’

v1

v2

v’ = source

(b) (c)

v1

v2

v’ = source

targettarget

subtree T1

Figure 12: An example of rerouting using a network flow model.

In [65], Meixner and Lauther reroute a set of nets simultaneously using a single commodity min-

cost network flow formulation, so that the net ordering problem in rerouting is avoided. The method

successively considers each node v′ ∈ V for a routing graph G = (V, E) and reroutes the nets that have

either a sink node or a Steiner node with degree greater than two at v′. For each such net, its routing

edges from v′ to the next sink or Steiner node with degree greater than two are ripped up, leaving

behind a set of partial routing trees. For example, two nets N1 and N2 may leave behind the trees T1

and T2, respectively, shown in Figure 12(a).

The problem is now to connect nodes at v′ to subtrees T1 and T2 for nets N1 and N2 without causing

a wiring overflow, and simultaneously minimizing the wire length. To solve this problem, the partial

routing graph is transformed into a network flow model as shown in Figure 12(b). Each undirected

edge is mapped on to a pair of directed edges whose capacity is the corresponding wiring capacity and

whose cost is the edge length. Two pseudo nodes v1 and v2 are added for nets N1 and N2, respectively,

26

together with a target node. Every node in partial tree Ti is connected to the node vi, i = 1, 2, by means

of a directed edge, and finally, each vi is connected to the target node. These edges all have a cost

of zero and capacity of one. The node v′ serves as the source node, and a minimum cost flow for this

network is determined for a total flow of p units through the network, where p is the number of partial

routing trees. Such a flow can be used to determine an optimal solution, if it exists, that satisfies wiring

capacities and minimizes the wire length. A sample solution for our example is illustrated in Figure

12(c).

In addition to avoiding net ordering problem, this method has the advantage that an optimal integer

solution to this single commodity flow problem can be obtained in polynomial time and no rounding

procedure is required, since the single commodity network flow problem has a polynomial time optimal

integer solution if each edge capacity is an integer [2]. However, the quality of the solution depends on

the order in which the nodes in G are processed, though this node ordering problem is less severe than

the original net ordering problem.

pin

grid line

wire

switchable segment

two possible configurations

Figure 13: An example of a switchable segment.

In [55], Lee and Sechen proposed a rip-up-and-reroute strategy using a multi-stage refinement proce-

dure, instead of applying maze routing immediately after initial tree construction. As in other rip-up-

and-reroute methods, they initially construct Steiner trees for each net without considering congestion.

Next, wires are ripped up and rerouted to reduce congestion in four successive stages. One scenario is

related to the idea of switchable segments, illustrated in Figure 13. In a gate array or a standard cell

design, an equivalent pair of pins at the top and bottom of a cell is treated as a single pin located at

the middle of the row where a horizontal grid line goes through. In stage one of the rerouting process,

27

one of these two options for each switchable segment is chosen in a way such that the congestion is

minimized. In stage two, the L-shaped wires in the initial Steiner trees are allowed to be rerouted as

Z-shaped connections to reduce congestion; note that there is no wire length increase in either stage

one or stage two. In stage three, connections that were originally straight in the initial Steiner trees are

allowed to detour and become U-shaped connections, with a bound on the permissible increase in the

wire length. During this stage-wise refinement, the algorithm attempts to control the wire length and

the number of bends. If some overflow still remains after these three stages of rerouting, traditional

maze routing is finally employed to further reduce the congestion in stage four.

In [60], a customized routing graph is constructed for global routing using multilayer macrocells, or

building blocks. This routing graph is similar to the graph in Figure 7(b), except that it is in three

dimensions to reflect the multilayer design. The authors extend Wang’s heuristic [87] to construct

Steiner trees for each net separately at the beginning. After the initial tree construction, their method

starts by inspecting every grid cell to see if the number of wires (vias) that pass through it exceeds the

available routing tracks (via spaces). Next, every over-congested cell is processed in order of the most

congested cell first. For each such over-congested cell, a new route is found for every net in it that can

avoid all the over-congested or full cells, if such a route exists. These routes are maintained in a priority

queue with the top route having the minimum wire length increase compared to its corresponding old

route. After all of the possible new routes have been found, the top new route is popped out from the

queue and used to replace its corresponding old route repeatedly until the over-congestion problem for

this cell is resolved.

8 Multicommodity flow based approach

Although the sequential, rip-up-and-reroute and other heuristics may be effective in practice, they

cannot provide a certain answer as to whether or not a feasible solution exists. In other words, if they

fail to find a feasible solution, it is not clear whether this is attributable of the non-existence of a feasible

solution or because of shortcomings of the heuristic. Moreover, when a heuristic does find a feasible

solution, it is not known whether or not this solution is optimal, or how far it is from the optimal

solution.

These questions may be answered if we formulate and exactly solve the global routing as a multicom-

28

modity flow problem. A multicommodity flow operates on a network that is a graph G = (V, E), where

V = {v1, v2, ..., vn} is a set of n vertices and E = {e1, e2, ..., em} is a set of m edges. Some researchers

define this network as a directed graph while some others define it as a undirected graph. Generally

speaking, this difference affects only the form of the problem formulation. For multi-pin net global

routing, an undirected graph formulation is more convenient.

Over this network, k commodities must be transported from some vertices to other vertices. In global

routing, we can treat each net as a commodity and we can say that there is a set of N = {N1, N2, ..., Nk}

commodities that are to be shipped. For each commodity Ni, a certain amount di, namely the demand,

is required to be shipped. In global routing, each Ni corresponds to a net with more than one pin

and each di is always one. Each edge has a flow capacity u(e) and cost c(e), which have the same

interpretation as in the global routing problem formulation. The routing can be expressed in terms

of either edges or trees. In the edge-based expression, a variable fi(e) represents the amount of flow

passing through edge e ∈ E. Note that fi(e) is always non-negative in an undirected network while it

can be any real number for a directed network. The tree-based expression assumes that there is a set

of possible routing trees Ti = {Ti,1, Ti,2, ...} for each net Ni. The binary variable xi,j is set to one if

Ti,j ∈ Ti is selected for net Ni; otherwise, xi,j is zero. Of course, it is impractical to enumerate all of

the possible trees and these trees are usually generated on the fly in practice.

Typically, there are two constraints that must be satisfied in a multicommodity flow problem. The

first is the demand constraint, which requires that the amount of flow shipped for each commodity

should be equal to its demand, and the second is the bundle constraint, which states that the total

amount of flow f(e) passing through each edge e ∈ E should not exceed its capacity u(e). In the

fractional flow version, the decision variables fi(e) or xi,j may be any non-negative real number. In the

zero-one integer flow version, there is the third zero-one integer constraint that regulates that fi(e) and

xi,j must take a value of either zero or one. The global routing problem maps on to the zero-one integer

version. Due to computational complexity issues related to integer programming and the large size of

the global routing problem, the integer flow problem here is often relaxed to the fractional formulation,

whose solution is transformed to an integer solution after a rounding procedure.

Besides these constraints, there are several variations on the objective function for the multicommod-

ity flow problem. One formulation tries to minimize the total cost of transportation, and this is called

29

the min-cost multicommodity flow problem. Another formulation attempts to minimize the maximum

edge density (as defined in Section 2) and is called the concurrent flow problem.

8.1 The Shragowitz-Keel algorithm

The work by Shragowitz and Keel in [79] is perhaps the first reported work on global routing using

the multicommodity flow model. Unlike many subsequent efforts using this formulation, they did not

employ an off-the-shelf multicommodity flow algorithm, but instead, developed their own polynomial

time algorithm. Superficially, this algorithm looks similar to the rip-up-and-reroute method and the

authors investigated the feasibility of convergence and the convergence rate. However, no statements

about the optimality are made with the analysis.

Their approach is based on a directed network and is restricted to two-pin nets. For each net Ni, one

of its pins vs ∈ V is selected as source where a unit of flow is generated, i.e., the net flow of net Ni at

this vertex di(vs) is 1. The other pin vt is the sink where a unit of flow dissipated, i.e., the net flow of

net Ni is di(vt) = −1. For simplicity, in this description, a directed edge from vertex v to v′ is denoted

by the vertex pair (v, v′). The Shragowitz-Keel formulation of the global routing problem is as follows:

minimize∑

∀e∈E

∑

∀Ni∈N c(e)|fi(e)|

subject to:∑

v′:(v,v′)∈E fi(v, v′) −

∑

v′:(v′,v)∈E fi(v′, v) = di(v), ∀Ni ∈ N ,∀v ∈ V

∑

∀Ni∈N |fi(e)| ≤ u(e), ∀e ∈ E

fi(e) ∈ {0,±1}, ∀Ni ∈ N ,∀e ∈ E

(2)

Initially, this algorithm discards the bundle constraint to obtain a min-cost solution where the edge

cost c(e) is defined to be the rectilinear edge length. Next, it iteratively reduces the violations on the

bundle constraint while maintaining a minimum feasible cost. The first step is solved individually for

each net and any shortest path algorithm may be used. As in other methods, this step corresponds to

finding the minimum cost route for each net while ignoring the effects of the other nets. The algorithm

then identifies the subset of edges Eφ with maximum overflow φ = max(φ(e),∀ ∈ E) and the subset of

edges Eφ,φ−1 with an overflow of either φ or φ − 1. If φ = 0, then the algorithm is done. The cost of

each edge in Eφ,φ−1 is updated to infinity and all other edges keep their cost unchanged. The subset

Nφ is defined such that each commodity or net in it has at least one fi(e) = 1 such that e ∈ Eφ. The

30

subset of Nφ,φ−1 is defined in a similar manner. Based on the updated cost, the shortest path algorithm

is executed for every commodity in Nφ. If there is a new path with a cost of infinity, this algorithm

checks a feasibility condition which can verify whether there is a feasible solution for this problem. If the

problem remains infeasible, this algorithm perturbs the flows and reruns the iterations again. If there

is no new path with a cost of infinity, this algorithm chooses the commodity Ni that has the smallest

increase in cost from the old path to the new path, and replaces its old path with the new path. This

process is repeated until there is no overflow.

8.2 The Raghavan-Thompson rounding method

One interesting formulation for multi-terminal nets routing to multicommodity flow model is conducted

by Raghavan and Thompson in [72], where the exposition of the algorithm is restricted to nets with

exactly three pins, although the authors state that generalizations to multiple pins are possible. This

formulation is also based on a routing graph G = (V, E), and the objective is to minimize the maximum

flow among all edges, i.e., minimize f̂ = max(f(e),∀e ∈ E).

Consider a situation where each net Ni ∈ N consists of three terminals vi1, vi2 and vi3. In the

formulation, Raghavan and Thompson assume that one Steiner node is needed for each net and an

indicator variable si(v) ∈ {0, 1} is used to denote whether or not each vertex v ∈ V is the Steiner node

for net Ni. The global routing problem is then formulated as a multicommodity flow problem in which

si(v) units of flow are to be shipped from each of vi1, vi2 and vi3 to vertex v ∈ V .

This problem is first relaxed to be a linear programming problem and solved by any linear program

solver. The next crucial task is to obtain the integer solution from the optimal linear program relaxation

solution. We use f̃(e) and s̃i(v) to represent the fractional results from the linear programming and

denote the optimal max-flow as f̃ . Then these values must satisfy the following conditions.

∑

v∈V

s̃i(v) = 1, ∀Ni ∈ N (3)

f̃(e) ≤ f̃ , ∀e ∈ E (4)

Here all of the f̃(e), s̃i(v) ∈ [0, 1]. We express a set of solutions as S and start with the initial solution

31

S0 as:

S0 = {f̃ , f̃(e), s̃i(v),∀Ni ∈ N ,∀v ∈ V, ∀e ∈ E}.

The rounding procedure proceeds for k stages to get a sequence of solutions S1, S2, ..., Sk, where k = |N |.

In each stage i, the flow of one net Ni is rounded to integers and will not be changed later, and the

solution proceeds from Si−1 to Si. For each solution S, a potential function is defined as:

Ψ(S) =∑

e∈E

∏

Ni∈N

[fi(e)ω + 1 − fi(e)], (5)

where the parameter ω > 1 will be defined later.

Each stage consists of two phases. The Steiner node for the net being processed is determined in

phase one and the integer flow to the Steiner node is obtained in phase two. In phase one of stage i, a

vertex vp is identified so that si(vp) is forced to 1 and si(v) = 0, v 6= vp, v ∈ V . If we denote the flow

from any vertex vij ∈ Ni to vertex v ∈ V as fi(e, v), selecting vp as Steiner node for Ni also implies that

we let fi(e, vp) = f̃i(e, vp)/s̃i(vp) and fi(e, v) = 0 for all other vertices v. This corresponds to picking

paths from each pin of the net to the Steiner point. If the solution after Steiner node selection is S′i,

then the Steiner must be chosen in a way so that Ψ(S′i) is minimized. The authors have proved that

this procedure ensures that Ψ(S′i) ≤ Ψ(Si−1).

After the Steiner node is selected, the fractional flows from net Ni to vertex vp, f′i(e, vp), are rounded

to be either 0 or 1 in phase two. Let us consider how to round the flow from one of the pins vi1 to

vp. Through a so-called path stripping procedure, the fractional flow from vi1 to vp is organized into a

set of paths {P1, P2, ...}, each with a certain amount of flow. The summation of flow over all of these

paths must be one. The flow along path Pl is now selected to be one such that the solution from this

choice S′i(Pl) minimizes the potential Ψ(S′i(Pl)). After the flows from all three pins of net Ni have been

rounded, we can obtain the solution Si, and it is proven that Ψ(Si) ≤ Ψ(Si−1).

If the parameter ω is chosen so that it satisfies the relation

[eω−1

ωω]f̃ =

1

m,

32

then the upper bound of the final integer max-flow f̂ is as follows:

f̂ ≤

f̃ + (e − 1)√

f̃ lnm : f̃ ≥ lnm

e ln mln e ln m

f̃

: f̃ < lnm(6)

where m is the number of edges in G and e is the base of the natural logarithm.

8.3 Application of the Shahrokhi-Matula algorithm

In [9], Carden et al. developed the first reported global router with a theoretical bound from the

optimal solution, based on a multicommodity flow algorithm. They applied Shahrokhi and Matula’s

two-terminal multicommodity fractional flow algorithm [77] followed by randomized rounding to obtain

a multi-terminal multicommodity integer flow solution. Shahrokhi and Matula’s algorithm is an ǫ-

optimal approximation algorithm. In contrast with Shragowitz and Keel’s approach, their method is

directed towards a concurrent multicommodity flow formulation instead of a min-cost multicommodity

flow formulation.

The integer linear programming formulation used here is the same as formulation (1). Its linear pro-

gramming relaxation is obtained by omitting the integer constraint in (1) and we denote this relaxation

as LP. The dual linear programming (DLP) of the LP is:

maximize∑

Ni∈N θi

subject to:∑

e∈E c(e)l(e) = 1

∑

e∈Ti,j l(e) ≥ θi, ∀Ni ∈ N ,∀Ti,j ∈ Ti

l(e) ≥ 0, ∀e ∈ E

(7)

The variable l(e) is the dual variable corresponding to each edge e and is also referred to here as the

edge weight. The variable θi represents the throughput of the flow from net Ni. According to the

theory of duality in linear programming, a feasible solution to DLP provides a lower bound for the

optimal solution of LP, and the LP solution reaches its optimum when it equals the DLP solution.

Based on this property, Shahrokhi and Matula design an approximation algorithm in which the LP and

DLP solutions are pushed closer after each iteration, and the final difference between the LP and DLP

33

solutions provides an upper bound on how far the LP solution away from the optimal solution. An

ǫ-optimal algorithm implies that the resulting λ̂ is at most (1 + ǫ) times the optimal solution λ̂∗, where

the value of ǫ is a specified parameter between (0, 1]. The smaller the value of ǫ is, the more optimal

the solution is and the longer is the computation time.

Initially, the edge weight is defined as l(e) = 1/u(e)∀e ∈ E. The algorithm begins by constructing

a minimum weight Steiner tree Ti for each net Ni to obtain a solution satisfying the first constraint in

the formulation (1) without considering the bundle constraint. As a consequence, the value of λ̂ may

initially be inordinately large. In subsequent steps, the algorithm iteratively recomputes the Steiner

trees to decrease the value of λ̂ until the optimality condition is satisfied. In order to avoid iterations on

trees with very small flows, a parameter σ0 is defined as the smallest allowable fraction of flow changes.

If U =∑

e∈E u(e) and u∗ = min(u(e),∀e ∈ E), then σ0 is chosen to beu2∗ǫ

16αU , where the constant α,

which is a user-tunable parameter, is typically between 0.01 and 100.

Next, the weights of the edges that whose flows were altered in the last iteration are updated according

to l(e) = eαλ(e). Using this updated weight, new minimum weight Steiner trees are recomputed for each

net. If θ =

∑

e∈El(e)u(e)

∑

∀Ni∈N

∑

∀e∈Til(e)

, then ǫ-optimality is satisfied if θ− 1λ̂θ

≤ ǫ, If ǫ-optimality is not satisfied,

then the net with the largest cost reduction from the old Steiner tree to the new Steiner tree is rerouted

so that a portion of flow in the old Steiner tree is switched to the new Steiner tree, and this cost

update and rerouting process is repeated until the ǫ-optimality condition is satisfied. At this point,

the actual flow for each tree is scaled to x̄i,j = xi,j/λ̂. Finally, the fractional solution is integerized

in to integer solution through randomized rounding, and any remaining overflow is resolved through a

rip-up-and-reroute process.

8.4 Application of Garg-Könemann algorithm

The latest reported global routing procedure based on multicommodity flow algorithm is due to Albrecht

[3]. This work is an application of Garg and Könemann’s multicommodity flow approximation algorithm

[29], which is simpler and faster than Shahrokhi and Matula’s algorithm [77]. Besides congestion, wire

length is also considered in this method.

Albrecht’s algorithm works on a routing graph G = (V, E) shown in Figure 2(b), which represents

a two layer global routing problem in which one layer is reserved for horizontal wires and the other

34

for vertical wires. The edges between two layers represent vias and their capacities are the maximum

number of vias allowed in a grid cell. The algorithm permits different wire widths, and uses the symbol

wi(e) to represent the wire width of edge e ∈ E belonging to net Ni. The resulting integer linear

programming formulation is slightly different from that in (1). Assuming that all of the demands

di = 1,∀Ni ∈ N , the formulation is:

minimize λ̂

subject to:∑

Ti,j∈Ti xi,j = 1, ∀Ni ∈ N

∑

i,j:e∈Ti,j wi(e)xi,j ≤ λ̂u(e), ∀e ∈ E

xi,j ∈ {0, 1}, ∀Ni ∈ N ,∀Ti,j ∈ Ti.

(8)

1. l(e) ← δ/u(e),∀e ∈ E2. xi,j ← 0,∀Ti,j ∈ Ti,∀Ni ∈ N3. ji ← 0, i = 1, ..., k4. While (

∑

e∈E l(e)u(e) < 1)5. For i = 1 to k6. If ji == 0 or

∑

e∈T i,j wi(e)l(e) > (1 + γǫ)θi7. Find a minimal Steiner tree Ti,ji ∈ Ti

for net Ni with edge weight wi(e)l(e), e ∈ E8. θi ←

∑

e∈Ti,jiwi(e)l(e)

9. xi,ji ← xi,ji + 110. l(e) ← (1 + ǫwi(e)/u(e))l(e),∀e ∈ Ti,ji

Figure 14: An approximation algorithm for fractional global routing.

As before, the ILP problem is relaxed to the a linear programming problem which is solved by the

approximation algorithm in Figure 14. The symbols δ, γ and ǫ are constants, and k is the number of nets

to be routed. In the Shahrokhi-Matula algorithm, a fraction of flow is switched from a higher weight

(more congested) tree to a lower weight (less congested) tree in each iteration. In Garg-Könemann’s

algorithm, a flow of di is added to a low weight tree Ti,j ∈ Ti in each iteration with the previously placed

flows remaining unchanged. Finally, the amount of flow on each edge is scaled back by the number of

iterations. The following significant results are reproduced from [3].

35

Theorem 8.1 Given any non-negative values l(e) for each edge e ∈ E, the expression:

∑ki=1 minT∈Ti

∑

e∈T wi(e)l(e)∑

e∈E l(e)u(e)

provides a lower bound on the optimum value of the fractional global routing problem. Moreover, there

exist nonnegative values ye, e ∈ E, such that the expression above is equal to the optimum value of the

fractional global routing problem.

Theorem 8.2 If there exists a solution for the fractional global routing problem with maximum relative

congestion smaller than 1, the algorithm finds a ρ−approximation in at most

1 +1

ǫ′λ̂∗ln1+ǫ′(

m

1 − ǫ′)

phases, if ρ = 1/(1− ǫ′)3 and ǫ′ = ǫ(1 + γǫ). Moreover, the variables ye, e ∈ E, converge to the solution

of the dual linear program.

9 Hierarchical methods

The first prominent hierarchical method for routing was proposed by Burstein and Pelavin in [8].

This method recursively divides the routing regions into successively smaller sub-regions, and nets at

each hierarchical level are routed simultaneously and refined in the subsequent levels. The sub-regions

are referred to as super cells, compared to the global routing cells (GRC) defined in the global routing

formulation. This is a systematic divide-and-conquer approach and therefore transforms the large and

complicated global routing problem into a series smaller and simpler sub-problems. As a result, this

method is inherently faster than flat routing methods. In fact, this approach is quite versatile and can

be applied to detailed routing as well. Moreover, it reduces the problem of net-ordering in sequential

routing since higher level decisions are used to guide the solution at lower levels of the hierarchy. While

this approach does not entirely remove the net ordering issue, this is an issue of diminished importance

in comparison with flat approaches. The simplicity of this method carries with it limitations on the

quality of the routing solution, since a routing decision at one hierarchical level neglects the requirements

at subsequent level and may make a choice that is suboptimal for subsequent levels. Thus, a hierarchical

method is often combined with other global routing techniques, such as rip-up-and-reroute.

36

Hierarchical methods form a rich family of global routing approaches and numerous works [8, 14,

32, 33, 41, 51, 53, 59, 62–64, 69, 70, 89] have been reported. These methods can proceed in a top-down,

bottom-up or hybrid manner and have been applied to gate array/sea of gates, standard cell and custom

design. Typical partitioning methods may proceed by bisection or quadrisection, and may be either

uniform or non-uniform. In this section, we will describe some major representative works in this family

of methods.

9.1 Top down successive refinement

The original method proposed by Burstein and Pelavin in 1983 was applicable to gate arrays, switchbox

routing and channel routing. The method is based on the assumption of uniformity of the wiring

substrate, a restriction removed by subsequent works. In addition to minimizing wire congestion, this

method tries to minimize via congestion and wire length in terms of the number of grid cells traversed

by the wires.

(a) (b)

(d)(c)

Figure 15: An example of routing by top-down hierarchical refinement.

As shown in Figure 15, the routing region is recursively bisected into smaller super cells, and at each

level, the routing is performed in terms of the super cells at that level. In the example of Figure 15, the

routing at each level is represented by the shaded super cells for a 3-pin net. This process is performed

37

in a top-down manner until the super cells reduce to the actual grid cells for global routing.

The initial routing is trivial and the critical part of this algorithm is related to refining the routing

solution from one hierarchical level (a 1×N super cell) to a lower level (a 2×N (super) cell) after each

bisection. For the example in Figure 15, a routing solution for the set of 1 × 4 cells that lie inside the

thickened box in (c) is refined to a routing solution for the 2 × 4 cells inside the thickened box in (d)

in the next step. Thus, the critical problem is to determine routes for multiple nets in a 2 × N cell to

minimize congestion and wire length. Burstein and Pelavin proposed two heuristics to solve the 2 × N

routing problem: (1) divide-and-conquer, and (2) dynamic programming.

(a) (b)

(c) (d)

(f)(e)

Figure 16: An example of divide and conquer in 2 × N routing.

The divide-and-conquer heuristic is illustrated in the example in Figure 16. The 2 × N (N = 8)

routing problem in Figure 16(a) is first reduced to a 2 × 2 routing problem and solved as in Figure

16(b). Next, the cells are partitioned in halves and two pseudo pins (black dots in Figure 16(c)) are

introduced at the two neighboring cells across the cutline and on the routing path of solution in (b).

Now the problem is reduced to routing a 2 × N1(N1 = 4) and a 2 × N2(N2 = 4) problems, where

N1 +N2 = N . The two new sub-problems can be recursively solved as a series of 2×2 routing problems

as shown in Figure 16(d) – (f).

38

Figure 17: An enumeration of all patterns for 2 × 2 routing.

Through this divide-and-conquer process, the 2 × N problem is reduced to a series of 2 × 2 routing

problems, each of which is relatively simple since all of the possible routing patterns can be enumerated

easily as shown in Figure 17. Therefore, the problem of routing each net in a 2× 2 grid is equivalent to

choose a routing pattern for each net among the patterns in Figure 17. Burstein and Pelavin formulate

this problem as integer programming problem to minimize overflow and maximize wiring and via slacks;

the wiring slack is defined as the available wiring tracks on a grid boundary minus the number of wires

across this boundary, while the via slack is the number of via spaces in a grid cell minus the number of

bends in this grid cell. The integer programming is first relaxed and solved as a linear program, after

which the integer solution is obtained by rounding the fractional solutions.

39

T1

T2

T3

T4

Figure 18: Examples for different tree configurations representing optimal subsolutions for a dynamicprogramming based solution to the problem of 2 × N routing.

The second approach that was proposed is based on dynamic programming. For a set of 2×N cells,

the procedure starts with the leftmost column k, with a terminal cell containing a pin, and grows a tree

towards the right by incrementing k by one column at a time until the N th column is reached. At each

step, there are a set of optimal subsolutions of partial trees that are maintained for the 2 × k cells to

the left. These trees can be enumerated as:

• T 1(k): a minimum cost tree connecting all of the terminal cells in the left k columns to cell g(1, k).

• T 2(k): a minimum cost tree connecting all of the terminal cells in the left k columns to cell g(2, k).

• T 3(k): a minimum cost tree connecting all of the terminal cells in the left k columns, to both

g(1, k) and g(2, k).

• T 4(k): two disjoint minimum cost trees T ∗ and T ∗∗, g(1, k) ∈ T ∗ and g(2, k) ∈ T ∗∗. Each terminal

cell is in either T ∗ or T ∗∗.

An example of a set of such trees is shown in Figure 18. When the trees grow from column k to k+1,

a new set of trees T i(k +1), 1 ≤ i ≤ 4 are obtained from T i(k), 1 ≤ i ≤ 4, and the minimum cost tree of

40

each type is stored for each value of k. The dynamic programming property arises from the fact that the

optimal subsolution for a value of k can be created from the optimal subsolution trees for column k−1.

The above 2 × N grid routing procedure is performed for each net sequentially with the wiring cost

dynamically updated according to the changes on congestions. After all of the nets have been routed,

each net is ripped up and rerouted by repeating the same procedure, but using the complete congestion

information. The computational complexity for the presented algorithms is O(k(p + q) log(pq)) for k

nets to be routed on a p × q grid.

9.

Documents

A Survey on Multi-net Global Routing for Integrated Circuitspeople.ece.umn.edu/users/sachin/jnl/integration01.pdfsequential routing and rip-up-and-reroute, and then discusses multicommodity