    A Survey on Multi-net Global Routing for Integrated Circuits ∗

    Jiang Hu and Sachin S. Sapatnekar

    {jhu, sachin}@mail.ece.umn.edu

    Department of Electrical and Computer Engineering

    University of Minnesota

    Minneapolis, MN 55455, USA

    Tel: 612-625-0025, Fax: 612-625-4583

    Abstract

    This paper presents a comprehensive survey on global routing research over about the last two

    decades, with an emphasis on the problems of simultaneously routing multiple nets in VLSI circuits

    under various design styles. The survey begins with a coverage of traditional approaches such as

    sequential routing and rip-up-and-reroute, and then discusses multicommodity flow based methods,

    which have attracted a good deal of attention recently. The family of hierarchical routing techniques

    and several of its variants are then overviewed, in addition to other techniques such as move-based

    heuristics and iterative deletion. While many traditional techniques focus on the conventional

    objective of managing congestion, newer objectives have come into play with the advances in VLSI

    technology. Specifically, the focus of global routing has shifted so that it is important to augment

    the congestion objective with metrics for timing and crosstalk. In the later part of this paper, we

    summarize the recent progress in these directions. Finally, the survey concludes with a summary of

    possible future research directions.

    ∗This work is supported in part by the NSF under contract CCR-9800992 and the SRC under contract 98-DJ-609.

    Contents

    1 Introduction
    2 Problem background and formulation
    3 Basic techniques
      3.1 Maze routing
      3.2 Steiner tree construction
      3.3 0-1 integer linear programming
      3.4 Network flow model
    4 Sequential routing techniques
      4.1 Force-directed routing
      4.2 Sequential routing through Steiner min-max tree construction
      4.3 Minimum weighted Steiner tree
    5 Region-wise routing
      5.1 Unique pattern first and outer rim first routing
      5.2 Routing in order of wire orientations and in terms of rows
    6 Move-based heuristics
    7 Rip-up and reroute
    8 Multicommodity flow based approach
      8.1 The Shragowitz-Keel algorithm
      8.2 The Raghavan-Thompson rounding method
      8.3 Application of the Shahrokhi-Matula algorithm
      8.4 Application of Garg-Könemann algorithm
    9 Hierarchical methods
      9.1 Top down successive refinement
      9.2 Bottom up merging
      9.3 Hybrid hierarchical method
      9.4 Hierarchical routing for custom design
      9.5 Hierarchical bisection and linear assignment
      9.6 Four bend hierarchical routing
    10 Iterative deletion
    11 Timing driven global routing
      11.1 Multicommodity flow based approach
      11.2 Iterative deletion based routing for standard cells
      11.3 Hierarchical bisection and assignment
    12 Crosstalk driven global routing
    13 Conclusions and future directions

    1 Introduction

    Eighteen years ago, a then new journal, the IEEE Transactions on Computer-Aided Design, presented

    a special issue on routing in microelectronics that contained several landmark papers. Another

    significant publication at about the same time was a collection of papers edited by Hu and Kuh [40]. In

    the years that have elapsed since the publication of these two works, with the scaling of feature sizes,

    VLSI technology and circuits have undergone dramatic progress, as characterized by Moore’s law.

    Particularly in recent years, this has resulted in interconnect delay accounting for a major portion of the

    circuit clock cycle, as a result of which VLSI physical design has grown to be a critical factor. Moreover, the

    growing circuit complexity has enlarged the size of the design automation problems in physical design

    and has brought forth a new set of challenges.

    Within the physical design flow, one of the most critical steps is global routing, a stage where signal

    nets are connected coarsely under a given placement so that wire/via spaces are allocated to each signal

    net. The quality of the global routing solution directly affects chip area, speed, power consumption and

    the number of iterations required to complete the design cycle, and hence this step plays an important

    role in determining circuit performance. On the other hand, global routing is a notoriously difficult

    problem: even the simplest version of the problem, where a set of two-pin nets is to be routed under

    congestion constraints, is an NP-complete problem [49].

    As a result of both the importance of the problem and its difficulty, a great deal of research has been

    carried out on global routing during the last two decades, covering a variety of design styles including

    gate arrays, sea of gates, standard cell-based designs and custom circuits. Various techniques and

    strategies have been proposed, including rip-up-and-reroute, hierarchical methods, multi-commodity

    flow techniques and iterative deletion. However, even with all of these efforts, it is not entirely accurate

    to imply that the global routing problem has been solved satisfactorily. In particular, the newest

    advances in VLSI technology have raised a new set of issues to be solved and have further complicated

    the requirements on global routing.

    The purpose of this survey is to provide a comprehensive overview of research in global routing,

    with specific emphasis on the problem of the simultaneous global routing of multiple nets in integrated

    circuits. This problem requires competent resource management as the global nets compete for a

    restricted set of global resources such as routing resources, crosstalk budgets and net delays. The

    nature of this problem has changed remarkably over the last two decades, with several design styles

    gaining favor and then falling out of favor, but many of the fundamental techniques that were introduced

    are useful to other more modern design paradigms. In reading this survey, the reader is cautioned not

    to set too much store on the specific technology being discussed, but rather, to focus on the underlying

    algorithms in an attempt to determine how best they may be extended to the specific routing problem

    du jour.

    There are several prior surveys that complement the material presented here. Two early surveys on

    global routing are presented in a paper by Kuh and Marek-Sadowska [50], and in a chapter of the book

    by Lengauer [56]. The books by Sherwani [78], Sait and Youssef [73] and Sarrafzadeh and Wong [75],

    present a more updated coverage of progress in global routing. The book by Kahng and Robins [45]

    and the survey paper by Cong et al. [16] focus on global routing issues for a single net. In most of these

    sources, the attention paid to the problems of simultaneously routing multiple global nets is limited,

    and the objective of this work is to attempt to bridge that gap. In the remainder of this paper, the

    phrase “global routing” will imply, unless otherwise stated, that the routing of all nets is being

    considered. For a survey of this type, it is appropriate to also list a set of related routing problems that

    have not been covered in this survey in order to limit its scope. These include MCM/PCB routing,

    FPGA routing, single layer routing, and parallel algorithms for global routing.

    This paper is organized as follows. At the outset, the problem background and formulation are

    described in Section 2, and the basic techniques that are frequently used in global routing are summarized

    in Section 3. Section 4 introduces the first set of global routing algorithms, namely, sequential routing

    techniques. Region-wise routing methods are covered in Section 5, followed by a brief review of

    move-based heuristics in Section 6. Next, rip-up-and-reroute approaches are described in Section 7 and

    multicommodity flow based methods in Section 8. Hierarchical global routing procedures and the

    iterative deletion technique are discussed in Sections 9 and 10, respectively. Most of the methods

    described up to that point use conventional metrics to manage resource contention. Towards the end,

    in Sections 11 and 12, we present ways in which some objectives related to timing driven and crosstalk

    driven global routing are incorporated. Finally, Section 13 concludes this survey and suggests directions

    for future research.

    2 Problem background and formulation

    For a given location for every cell or macro-block in the layout, each set of electrically equivalent pins

    that belong to different cells or macro-blocks, called a net, must be wired together in a step commonly

    referred to as routing. Due to technological or methodological restrictions, it is often the case that

    some areas of the layout are prohibited from allowing wire routes, corresponding to wiring blockages.

    The fundamental goal of the routing step is to connect every net successfully and to resolve resource

    contentions. In modern VLSI design, this could be an exceedingly complex problem if it were done in a

    single step, since there could be millions of elements and nets integrated on a single chip. One common

    approach is to divide the routing procedure into two stages: global routing and detailed routing. In

    global routing, the chip area is divided into a set of coarsely-defined regions, and wires that cross

    the boundaries of these regions are allocated a coarse route that determines which regions they must

    traverse. Following this step, the routing within each such region is carried out in the detailed routing

    stage.

    Figure 1: Tessellation for (a) gate array/sea of gates and (b) standard cell design.

    Depending on the design style, different forms of routing graphs are constructed to define the coarse

    regions for global routing. For gate array, sea of gates and standard cell designs, the entire routing area

    is typically tessellated into a grid array, as shown in Figure 1. Note that the horizontal grid line usually

    goes through the middle of a row of cells in the standard cell design. The dual graph of this tessellation

    is the routing graph G = (V, E), as shown in Figure 2(a). Each vertex v ∈ V represents a grid cell and

    each edge e ∈ E corresponds to a boundary between two adjacent grid cells. In custom design, on the

    other hand, the cell placement is not regular as in gate array or standard cell design styles. Therefore,

    the routing graph is more appropriately based on the floorplan of the building blocks. This is illustrated

    for an example floorplan in Figure 3 where each rectangle represents a building block. Like the routing

    graph for gate array and standard cell designs, this routing graph is also a dual to the floorplan graph.

    Figure 2: The construction of a routing graph from the tessellation for a two-layer routing scheme. Via constraints are explicitly modeled in (b) but not in (a).

    Figure 3: Routing graph for custom design.

    The global routing problem requires a set of nets N = {N1, N2, ..., Nk} to be routed over the routing

    graph G. A net Ni, 1 ≤ i ≤ k is a set of pins (or terminals) {vi,0, vi,1, vi,2, ...} ⊆ V , among which vi,0 is

    the source pin and the others are sink pins. Under the coarse assumptions made for global routing, each

    pin is often assumed to lie at the center of the grid cell that contains the pin. The routing problem for a

    net Ni is to find an additional subset of vertices Vi,Steiner ⊂ V and a set of edges Ei = {ei,1, ei,2, ...} ⊂ E

    to form a rectilinear minimum spanning tree Ti = (Vi, Ei), where Vi = Ni ∪ Vi,Steiner.

    As multiple nets are routed over the routing graph G, a common boundary between two neighboring

    grid cells may be crossed by wires belonging to different nets. In terms of the graph representation, this

    implies that the routing trees for these nets utilize a common edge e ∈ E. The number of wires that

    utilize an edge e ∈ E is called the flow (or demand), f(e), on the edge. The number of available routing

    tracks, which forms an upper bound on the allowable demand, is referred to as the boundary capacity

    (or supply), u(e). The ratio of flow to capacity f(e)/u(e) for an edge is referred to as the density or

    congestion, λ(e). If f(e) > u(e), then the overflow on the edge is defined as φ(e) = f(e)−u(e); otherwise

    the overflow is zero. The most fundamental statement of the global routing problem is to route all of

    the nets to ensure that for each edge e corresponding to a cell boundary, the number of wires across it

    does not exceed its routing capacity, i.e., φ(e) = 0,∀e ∈ E.
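    The quantities defined above can be made concrete with a small sketch. The Python below is

    illustrative only: the edge representation (a frozenset of two grid-cell coordinates), the function

    name edge_stats, and the example data are assumptions made here and do not come from any of the

    surveyed works.

```python
from collections import Counter

def edge_stats(routes, capacity):
    """Return {edge: (flow, congestion, overflow)} for a set of routed nets.

    routes   -- dict: net name -> iterable of edges used by that net's tree
    capacity -- dict: edge -> u(e), the number of available tracks (> 0)
    An edge is a frozenset of the two adjacent grid cells it separates.
    """
    flow = Counter()
    for edges in routes.values():
        for e in set(edges):              # count each net once per boundary
            flow[e] += 1
    stats = {}
    for e, u in capacity.items():
        f = flow[e]
        stats[e] = (f, f / u, max(0, f - u))   # f(e), lambda(e), phi(e)
    return stats

def edge(a, b):
    return frozenset((a, b))

# Hypothetical 1x3 strip of grid cells with two boundaries.
e01 = edge((0, 0), (1, 0))
e12 = edge((1, 0), (2, 0))
capacity = {e01: 2, e12: 1}
routes = {"N1": [e01, e12], "N2": [e12], "N3": [e01]}

stats = edge_stats(routes, capacity)
total_overflow = sum(phi for _, _, phi in stats.values())
max_congestion = max(lam for _, lam, _ in stats.values())
```

    In this toy instance the boundary e12 carries two wires on one track, so the routing is infeasible

    under the basic formulation even though e01 is exactly at capacity.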

    Another important consideration in routing global nets is to manage the number of vias on the net,

    and it is relatively easy to modify the above formulation to address this objective. Let us consider the

    routing problem under the reserved layer model, where the routing direction for all wires in a layer is

    identical. If a routing path makes a change in direction, i.e., a bend is induced, then a layer change is

    necessitated, and wires along different directions must be connected through a via at the bend position.

    For several reasons, such as considerations of reliability, area, and signal delay and quality, it is desirable

    to minimize or control the number of vias during routing. One way to do so is to limit the number

    of allowable vias in each grid cell. Under this scenario, the available via spaces in a grid cell may

    also be modeled by an edge, and we can obtain a routing graph G of the type shown in Figure 2(b)

    for a two-layer routing problem. This routing graph is identical to that defined earlier, except for the

    introduction of via edges along the grid. For a via edge e, we define f(e) as the number of vias utilized

    in the corresponding grid cell and u(e) as the number of available via spaces in the cell. In the literature,

    some works explicitly consider this via constraint while others do not, but these can often be extended

    fairly easily to do so.

    The specific constraints that are of particular importance vary with the design style, but congestion

    management is a uniformly important objective across all design styles. For gate array/sea of gates

    designs, the locations of the gates are fixed, and global routing is an effort to route every net without any

    overflow. The global router on this type of design is typically evaluated through metrics that measure

    the number of nets routed without overflow, or the total overflow with all of the nets routed. In custom

    design, the objective is similar, and is sometimes extended to minimize the maximum congestion λ̂ over

    all edges e ∈ E, even if no wiring overflow occurs. The motivation for doing so is to provide a greater

    flexibility to a subsequent detailed routing step within a grid cell by evenly spreading out the global

    wires over the grid. For traditional standard cell design in a two-layer environment, only horizontal

    wiring channels, which are the spaces between two adjacent rows of cells, are available. If a net crosses

    two different channels, then feedthrough cells must be inserted into each row of cells between the two

    channels to allow the inter-channel connection for this net. The problem of assigning feedthroughs

    to improve the routing quality constitutes a separate problem that is not treated in this survey.

    In addition to the congestion objective, wire length is an important concern: large wire lengths imply

    a larger power consumption and often, greater delays. It is not hard to see that optimizing congestion

    and wire length can often be competing objectives, and some global routing works have attempted

    to combine these objectives. Another important consideration in the deep submicron era is

    the interconnect delay, which consumes a major portion of the clock cycle today and is projected

    to continue to do so in the future. Therefore, timing performance must be included in the set of

    objectives considered during global routing. In addition, crosstalk has become an increasingly vexing

    issue and an attempt at crosstalk management during global routing is of great value in an overall

    strategy to manage coupling effects between wires.

    3 Basic techniques

    In this section, we summarize some of the basic techniques that are frequently used to solve sub-

    problems in global routing. These include maze routing, Steiner tree construction, 0-1 integer linear

    programming, and network flows.

    3.1 Maze routing

    One basic subproblem that is commonly encountered in global routing is that of finding a shortest path

    connecting two pins in the presence of wiring blockages. Perhaps the most well known solution to this

    problem is the maze routing [54,66] algorithm, which works on a routing graph similar to the grid graph

    in Figure 2(a). Each edge e has a cost c(e) associated with it, and this cost may be different for different

    routing directions. If an edge e is within the wiring blockage area, then its cost c(e) = ∞; otherwise its

    cost is finite and specified. A common cost metric is the rectilinear edge length, counted in terms of

    the number of grid cells.

    The maze routing algorithm can be regarded as an implementation of Dijkstra’s shortest path algo-

    rithm [26] on the routing graph that minimizes the path cost. For a given graph G = (V, E), where each

    edge e ∈ E has a cost c(e), Dijkstra’s algorithm is able to find a minimum cost path p(vs, vt), which is

    a set of consecutive edges, connecting a source vertex vs ∈ V and a target vertex vt ∈ V. This algorithm

    consists of a cost labeling step followed by a path tracing phase. In the cost labeling phase, starting

    from the source vertex vs, the accumulated cost from the source to each vertex is labeled one by one in

    a “wave expansion” manner until the target vertex is reached. The minimum cost path is then traced

    back from the target vertex to the source by retrieving the bookkeeping information that is maintained

    in the cost labeling phase. The runtime for this algorithm is O(|E| + |V | log |V |) through a Fibonacci

    heap implementation, where |E| is the number of edges and |V | is the number of vertices.
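    The two phases described above (cost labeling by wave expansion, then path tracing) can be sketched

    as follows. This is a minimal illustration under assumptions made here: unit edge costs, a rectangular

    grid, blockages given as a set of cells, and a binary heap from the Python standard library rather

    than a Fibonacci heap, so its runtime is O(|E| log |V|) rather than the bound quoted above. The

    function name maze_route is hypothetical.

```python
import heapq

def maze_route(width, height, blocked, source, target):
    """Minimum-cost path on a grid graph; returns (cost, path) or None.

    Every step between adjacent unblocked cells costs 1 (rectilinear length).
    Stepping into a blocked cell is treated as infinite cost, i.e. disallowed.
    The source and target cells are assumed to be unblocked.
    """
    dist = {source: 0}
    parent = {source: None}           # bookkeeping for the path tracing phase
    heap = [(0, source)]
    while heap:                       # cost labeling phase (wave expansion)
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                  # stale heap entry
        if v == target:
            break
        x, y = v
        for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= w[0] < width and 0 <= w[1] < height) or w in blocked:
                continue
            nd = d + 1
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                parent[w] = v
                heapq.heappush(heap, (nd, w))
    if target not in dist:
        return None
    path, v = [], target              # path tracing phase
    while v is not None:
        path.append(v)
        v = parent[v]
    return dist[target], path[::-1]

# Hypothetical 4x3 grid with a partial wall between the two pins.
result = maze_route(4, 3, {(1, 0), (1, 1)}, (0, 0), (2, 0))
walled = maze_route(4, 3, {(1, 0), (1, 1), (1, 2)}, (0, 0), (2, 0))
```

    In the first call the only gap in the wall is at the top row, so the path detours through it; in the

    second call the wall is complete and no path exists.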

    If a path between the two pins exists, maze routing is guaranteed to find a shortest one. However, in practice, it is

    found that the maze routing algorithm is slow and has large memory requirements, and many efforts [78]

    have attempted to improve its speed and memory usage. Another well known class of methods is the

    category of line probe-based algorithms (see, for example, [34]), which do not rely on a grid graph.

    While these methods are faster than maze routing, they are not guaranteed to find a solution, even if

    one exists. For a good description of maze routing and related algorithms, the reader is referred to [78].

    The maze routing algorithm has been extended to find a path connecting two pins in such a way that

    it favors a path that passes through less congested areas [68]. Since maze routing inherently considers

    only one net at a time, this extension requires nets to be considered one at a time, with the consequent

    dependence on the order in which nets are processed. The procedure computes the edge cost c(e), e ∈ E,

    so that it reflects the current congestion at its corresponding cell boundary, instead of a distance metric

    such as the rectilinear length. There are several variations on this cost definition: for example, one

    could use the density λ(e), or λ²(e). Another effective cost function, in our experience, is given by

    \[
    c(e) =
    \begin{cases}
      \dfrac{f(e)+1}{u(e)-f(e)}, & f(e) < u(e) \\[1ex]
      \infty, & f(e) \ge u(e)
    \end{cases}
    \]

    An extension to maze routing that considers congestion costs explicitly can achieve the purpose of

    not only helping the path to avoid wiring blockages, but also of distributing the routing congestion,

    and therefore this technique is frequently used in global routing. A technique for timing driven maze

    routing has been proposed recently in [43,44].
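    As a rough illustration of how such a congestion-based edge cost steers a path, the sketch below

    evaluates the cost (f(e)+1)/(u(e)−f(e)) on two hypothetical candidate paths. The function name

    congestion_cost and the example flow and capacity values are assumptions made for this example.

```python
import math

def congestion_cost(f, u):
    """Edge cost (f + 1) / (u - f) while tracks remain; infinite once full."""
    return (f + 1) / (u - f) if f < u else math.inf

# Hypothetical comparison: a short path over two nearly full boundaries
# versus a longer detour over four empty ones (all with u = 4 tracks).
short_path = [(3, 4), (3, 4)]                 # (f, u) per edge
detour = [(0, 4), (0, 4), (0, 4), (0, 4)]
cost_short = sum(congestion_cost(f, u) for f, u in short_path)
cost_detour = sum(congestion_cost(f, u) for f, u in detour)
```

    Here the detour costs 1.0 against 8.0 for the short path, so a minimum-cost search would prefer the

    longer but less congested route, and a boundary with no remaining tracks behaves like a blockage.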

    3.2 Steiner tree construction

    Figure 4: A three-pin net connected by (a) a spanning tree and (b) a Steiner tree.

    Both maze routing and line probe methods are designed for connecting two-pin nets. However, in

    practice, nets with more than two pins are often encountered in the routing problem. A common

    approach in dealing with a multi-pin net is to decompose it into a set of two-pin nets. One way of

    performing this decomposition is to begin with constructing a minimum spanning tree (MST) over the

    pins, and to maze-route each pair of pins that corresponds to an edge in the MST. In Figure 4(a), a

    possible solution using this procedure for routing a three-pin net, decomposed into two two-pin subnets,

    is shown. It is easily seen that such a procedure may not lead to the best routing quality. The wire

    length of a routing tree can often be reduced by introducing extra nodes in addition to the given pins,

    and constructing an MST over all of these nodes. The added nodes are called Steiner nodes, and an

    illustration is shown in an example in Figure 4(b).
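    The wire length saving from a Steiner node can be checked on a small example. The sketch below

    builds an MST over the pins with Prim's algorithm under rectilinear distance, then repeats the

    construction with one added node. The pin coordinates are hypothetical, and the choice of the

    coordinate-wise median as the extra node is a shortcut that is valid for three pins only; it is not

    a general Steiner tree algorithm.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def mst_edges(points):
    """Prim's algorithm over the complete graph with rectilinear distances."""
    in_tree = [points[0]]
    rest = list(points[1:])
    edges = []
    while rest:
        # Cheapest connection from the current tree to a node outside it.
        p, q = min(((a, b) for a in in_tree for b in rest),
                   key=lambda pq: manhattan(*pq))
        edges.append((p, q))
        in_tree.append(q)
        rest.remove(q)
    return edges

def length(edges):
    return sum(manhattan(p, q) for p, q in edges)

pins = [(0, 0), (4, 0), (2, 3)]          # hypothetical three-pin net
mst = mst_edges(pins)

# For three pins, the coordinate-wise median is an optimal Steiner point.
steiner = (sorted(x for x, _ in pins)[1], sorted(y for _, y in pins)[1])
steiner_tree = mst_edges(pins + [steiner])
```

    For these pins the spanning tree has length 9, while the tree through the added node (2, 0) has

    length 7, which equals the half-perimeter of the bounding box and is therefore optimal.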

    In most VLSI routing problems, since all of the wires are either horizontal or vertical, only rectilinear

    Steiner trees (RST) are considered. One basic RST construction formulation is to minimize the wire

    length of the tree, i.e., to construct a rectilinear Steiner minimum tree (RSMT). RSMT construction is a

    well-known NP-complete problem and many approximation algorithms have been developed. Several of

    these are based on results that relate the wire length of the optimal RSMT to that of the optimal MST.

    When timing is considered, merely minimizing wire length is not adequate. As a first approximation,

    it can be observed that the length of the source-sink routing path plays an important role in

    determining the signal delay. Several early works [5, 17] are dedicated to finding a good compromise

    between wire length and the maximum source-sink path length (also known as the radius). Later

    research efforts have directly considered delay metrics in the tree construction procedure [7, 19, 22, 36,

    37, 58, 85]. For surveys on Steiner tree construction techniques for VLSI routing, the reader is referred

    to [11,16,45].

    Many of the Steiner tree construction algorithms that have been proposed in the literature focus on

    the optimization of a single net, and do not consider wire congestion issues explicitly. Nevertheless, these

    algorithms can be applied to serially route the nets, with the most critical nets being routed in advance

    of other non-critical nets. When the edge cost is defined according to congestion, the Steiner minimum

    tree algorithms may be applied directly to even out congestion while simultaneously restraining the wire

    length [13]. Steiner minimum tree algorithms are also often employed in providing a set of candidate

    routes to multicommodity flow based methods [3,9] and other iterative techniques. These methods will

    be described in detail later in this paper.

    3.3 0-1 integer linear programming

    Global routing may be formulated as a special type of optimization problem, called a zero-one integer

    linear programming (0-1 ILP) problem [61]. For a set of candidate routing trees Ti = {Ti,1, Ti,2, ...} for

    net Ni, we use variable xi,j to indicate if tree Ti,j is selected for net Ni. The global routing problem

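    can then be formulated as:

    \[
    \begin{aligned}
    \text{Minimize} \quad & \hat{\lambda} \\
    \text{Subject to:} \quad & \sum_{T_{i,j} \in \mathcal{T}_i} x_{i,j} = 1, && \forall N_i \in \mathcal{N} \\
    & \sum_{i,j \,:\, e \in T_{i,j}} x_{i,j} \le \hat{\lambda}\, u(e), && \forall e \in E \\
    & x_{i,j} \in \{0, 1\}, && \forall N_i \in \mathcal{N},\ \forall T_{i,j} \in \mathcal{T}_i
    \end{aligned}
    \tag{1}
    \]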

    The first constraint, along with the restriction of the xi,j ’s to {0, 1}, requires that one tree be chosen

    for each net. The second constraint and the objective together ensure that the maximum congestion is

    minimized.
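    For a very small instance, formulation (1) can be solved by simply enumerating every assignment

    of one candidate tree per net. The sketch below does this to make the roles of xi,j and λ̂ concrete.

    It is exponential in the number of nets and is not how any of the surveyed methods solve the problem;

    the function name, the edge names and the candidate trees are all hypothetical.

```python
from itertools import product

def min_max_congestion(candidates, capacity):
    """Exhaustively solve formulation (1) for a tiny instance.

    candidates -- list over nets; candidates[i] is a list of candidate trees,
                  each tree given as a set of edge names
    capacity   -- dict: edge name -> u(e)
    Returns (best lambda_hat, tuple of chosen tree indices, one per net).
    """
    best = (float("inf"), None)
    for choice in product(*(range(len(c)) for c in candidates)):
        flow = dict.fromkeys(capacity, 0)
        for i, j in enumerate(choice):      # x_{i,j} = 1 for the chosen j only
            for e in candidates[i][j]:
                flow[e] += 1
        lam = max(flow[e] / capacity[e] for e in capacity)
        if lam < best[0]:
            best = (lam, choice)
    return best

capacity = {"a": 1, "b": 1, "c": 2}
candidates = [
    [{"a"}, {"b", "c"}],    # net 0: two candidate trees
    [{"a"}, {"c"}],         # net 1: two candidate trees
]
solution = min_max_congestion(candidates, capacity)
```

    If both nets take their first tree, edge "a" carries two wires on one track (λ̂ = 2); any other

    selection brings the maximum congestion down to 1.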

    One straightforward approach to this problem is to first solve the continuous linear programming

    relaxation, obtained by replacing the third constraint with xi,j ∈ [0, 1], since linear programming

    problems can be solved in polynomial time [46], whereas the 0-1 ILP problem is

    NP-complete. The fractional solution thus obtained may then be transformed into an integer solution through

    rounding techniques such as randomized rounding, as in [71]. An alternative approach in [83] applies

    an interior point method in conjunction with column generating techniques [39] to solve the 0-1 integer

    linear programming problem.
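    The basic idea of randomized rounding can be sketched in a few lines: each fractional value xi,j is

    interpreted as the probability of selecting tree Ti,j for net Ni. The code below shows only this

    sampling step under the assumption that the fractional values for each net sum to one; the analysis

    and any refinements in [71] are not reproduced, and the function name and input data are hypothetical.

```python
import random

def randomized_round(fractional, rng=random):
    """Pick one tree per net, choosing tree j of net i with probability x[i][j].

    fractional -- list over nets; fractional[i][j] is the LP value of x_{i,j},
                  with the values for each net summing to 1
    rng        -- the random module or a random.Random instance
    """
    return [rng.choices(range(len(x)), weights=x, k=1)[0] for x in fractional]

# Hypothetical LP solution for three nets with two candidate trees each.
fractional = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
rounded = randomized_round(fractional, random.Random(0))
```

    Nets whose LP solution is already integral keep their tree, while the net with a 0.5/0.5 split receives

    either tree with equal probability. The congestion of the rounded solution must still be checked

    afterwards, since rounding can push some edges above the fractional optimum.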

    In practice, the global routing problem is seldom solved entirely using the 0-1 ILP formulation since

    the problem size can grow to be very large. More often, the ILP technique is embedded into a larger

    overall global routing strategy, such as solving a subproblem at one hierarchical level of a hierarchical

    routing procedure [8,33,41,62,81], where the complexity of computing the optimal solution to a 0-1

    ILP is manageable.

    3.4 Network flow model

    The objective of the global routing problem is to allocate a limited set of resources evenly to a given

    set of demands. Intuitively speaking, this problem closely resembles the problem

    of finding optimal flows in a network [52], and several research efforts have pursued this as a solution

    technique.

    A network is a connected graph consisting of a set of vertices and edges; for the global routing

    problem, this graph is a minor modification of the routing graph in Figures 2 and 3. In a basic network

    flow model, there are two special vertices: one called the source and another called the target or sink.

    A certain amount of flow, called the demand, for one commodity must be shipped from the source to

    the target vertex. Each edge has a flow capacity, which represents the upper bound for the flow that

    is allowed to pass through the edge. One version of the problem requires the transport of as much

    flow as possible through the network without exceeding any edge capacity, and this is referred to as

    the max-flow problem. Another version assigns a cost per unit flow to each edge, and sets the problem

    objective to be the minimization of the total transportation cost through the network for a given flow

    from the source to the sink; this formulation is called the min-cost flow problem. The appealing feature

    of the network flow problem is that it can be solved in polynomial time to obtain an optimal integer

    solution when edge capacities are integers. A good description of network flow methods can be found

    in the book by Ahuja et al. [2]. While it is not possible to model the entire global routing problem as

    a single commodity network flow problem, this technique can be employed to solve some subproblems

    in global routing [14,38,65] and can lead to high quality solutions.
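    As an illustration of the single-commodity max-flow problem described above, the following sketch

    uses the Edmonds-Karp augmenting path method, one of several standard polynomial-time algorithms.

    It is chosen here for brevity and is not claimed to be the method used in [14,38,65]. The network,

    vertex names and capacities are hypothetical, and the source and sink are assumed to be distinct.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along a shortest residual path.

    capacity -- dict: (u, v) -> integer capacity of the directed edge u -> v
    Returns the value of a maximum flow from source to sink.
    """
    residual = dict(capacity)
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)      # reverse edges start empty
    adj = {}
    for (u, v) in residual:
        adj.setdefault(u, []).append(v)
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:     # breadth-first search
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total                        # no augmenting path remains
        # Bottleneck capacity along the path found, then push that much flow.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= push
            residual[(v, u)] += push
        total += push

network = {("s", "a"): 2, ("s", "b"): 1, ("a", "b"): 1,
           ("a", "t"): 1, ("b", "t"): 2}
value = max_flow(network, "s", "t")
```

    Because every capacity is an integer, each augmentation pushes an integer amount, which is the

    integrality property mentioned above.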

    There is a special type of network flow model that can be applied directly to the global routing

    problem. This is the multicommodity flow problem, where many commodities must be shipped on a

    common network, and each commodity has its own sources and targets that may be different from

    those of other commodities. In the mapping to the global routing problem, each net can be treated

    as a commodity. An advantage of this approach is that the multicommodity flow problem can be

    formulated as a linear programming problem. Since linear program solvers are slow for the sizes of

    problems encountered in global routing, research on the multicommodity flow problem has mostly

    focused on heuristics and combinatorial approximation algorithms. These methods will be introduced

    in the following sections.

    4 Sequential routing techniques

    Perhaps the oldest and most straightforward strategy for routing multiple nets is to select a specific

    order and to then route the nets sequentially in that order. The major advantage of this approach is

    that the congestion information for previously routed nets can be taken into consideration while routing

    a given net. For example, in early algorithms that operated by decomposing multi-pin nets into two-pin

    nets, each net was routed using techniques such as the obstacle avoidance version of maze routing or

    the line probe method. In these approaches, a cell boundary is open to path searching until

    all of its tracks have been occupied by previously routed nets; after that point, the boundary is

    treated as an obstacle.
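
To make the obstacle rule concrete, the following is a minimal Python sketch of sequential routing with a breadth-first (Lee-style) maze search. It is an illustration under stated assumptions, not the implementation of any router surveyed here: the function name `maze_route`, the `usage` dictionary keyed by pairs of adjacent cells, and the uniform `capacity` are hypothetical choices, and only two-pin nets on a small grid are handled.

```python
from collections import deque


def maze_route(width, height, source, target, usage, capacity):
    """Route one two-pin net by breadth-first search over grid cells.

    usage maps a cell boundary, stored as frozenset({cell_a, cell_b}), to the
    number of tracks already taken. A boundary whose usage has reached
    capacity is treated as an obstacle. Returns the list of cells on the
    path, or None if the target cannot be reached.
    """
    parent = {source: None}
    queue = deque([source])
    while queue:
        cell = queue.popleft()
        if cell == target:
            break
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            if nxt in parent or usage.get(frozenset((cell, nxt)), 0) >= capacity:
                continue
            parent[nxt] = cell
            queue.append(nxt)
    if target not in parent:
        return None
    path = []
    cell = target
    while cell is not None:
        path.append(cell)
        cell = parent[cell]
    path.reverse()
    # Commit the route so that later nets see the updated congestion.
    for a, b in zip(path, path[1:]):
        key = frozenset((a, b))
        usage[key] = usage.get(key, 0) + 1
    return path


usage = {}
nets = [((0, 0), (2, 0))] * 3  # three identical two-pin nets, routed in order
routes = [maze_route(3, 2, s, t, usage, capacity=1) for s, t in nets]
for route in routes:
    print(route)
```

In this toy instance the first net takes the direct path, the second is forced onto a detour because the direct boundaries are saturated, and the third cannot be routed at all, which shows how strongly the outcome depends on the processing order.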

    The drawback of this sequential approach is that the quality of the solution depends greatly on the

    order in which nets are processed, and that it is hard to find a good net ordering. Under any net

    ordering, it is often more difficult to route the nets that are considered later since they are subject to

    more blockages. Moreover, there is no feedback mechanism that permits these nets to feed information

    back to the nets routed earlier with directions on regions that should be left free for their routes. Early

    work by Abel [1] concluded that there is no single net ordering technique that consistently performs

    better than any other ordering method. Despite the unresolved net ordering issue, several

    good results on sequential routing have been reported, mostly through the use of iterative

    loops that feed back congestion information from the later routed nets to the earlier routed nets.

    4.1 Force-directed routing

    In [31], the global routing problem for two-pin nets is solved by emulating the movement of a particle in a force

    field. This approach assumes that a net order is given and routes each net sequentially. For each net, a

    particle departs from the source pin and moves toward the target pin under the field generated by

    the source pin, target pin, unrouted pins in other nets, and routed wires. The trajectory of this particle

    motion forms the routing path connecting its source and target pins.

    Figure 5: Various types of forces in force-directed routing: attractive force, repulsive force, and circular force.

    Figure 6: An example of the forces exerted on a particle: (a) a repulsive force from an unrouted pin in another net that does not lie in the same row or column as the particle; (b) a circular force from an unrouted pin in another net in the same row or column as the particle, or from a routed wire in the same row or column. The legend marks the net being routed, the particle, unrouted pins of other nets, and routed wires.

    There are three types of forces, as illustrated in Figure 5. When a particle moves to a certain position,

    an unrouted pin in another net will exert a repulsive force along the direction of the line joining the

    particle and the unrouted pin, as shown in Figure 6(a). The magnitude of the force is $1/(pr^2)$, where p

    is the total number of unrouted pins and r is the distance between the particle and the unrouted

    pin. The source pin generates a similar repulsive force with a magnitude of $1/r^2$. To a particle in motion,

    the only attractive force is from the target pin along the line joining the particle and the target, and

    the magnitude of this force is $1/r^{1.5}$, which is generally greater than any other kind of force, so that the

    particle can be guaranteed to reach the target. The direction of the combined force from the source and

    the target is snapped to the direction of motion of the particle if the two directions are not

    too different, so that the number of bends can be reduced.

    If an unrouted pin or a segment of routed wire from a different net is in the same row or column as

    the particle, it will exert a circular force of $1/r^2$. If this force is horizontal (vertical), it will take the same

    direction as the horizontal (vertical) component of the combined force from the source and the target,

    as illustrated in Figure 6(b). This force-directed method can be parallelized by letting the particle for

    every net move simultaneously.
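
As a small numerical illustration of the force magnitudes quoted above, the Python sketch below evaluates them at a single distance. The function names and the sample values of r and p are hypothetical, and directions are ignored entirely. Since $1/r^{1.5} > 1/r^2$ exactly when $r > 1$, the attraction dominates only at distances greater than one unit, which is consistent with the statement that it is "generally" the largest force.

```python
def unrouted_pin_repulsion(r, p):
    """Repulsion from an unrouted pin of another net: 1 / (p * r^2)."""
    return 1.0 / (p * r ** 2)


def source_repulsion(r):
    """Repulsion from the source pin of the net being routed: 1 / r^2."""
    return 1.0 / r ** 2


def target_attraction(r):
    """Attraction toward the target pin: 1 / r^1.5."""
    return 1.0 / r ** 1.5


r, p = 4.0, 10  # assumed distance and number of unrouted pins
print(target_attraction(r), source_repulsion(r), unrouted_pin_repulsion(r, p))
```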

    4.2 Sequential routing through Steiner min-max tree construction

    A Steiner min-max tree (SMMT) is a Steiner tree whose maximum-weight edge is minimized over all

    Steiner trees. In [12], Chiang et al. solve the global routing problem by constructing Steiner min-

    max trees for each net sequentially, defining each edge weight according to estimates of the routing

    congestion.

    The algorithm sorts the nets according to the perimeter of their bounding box and routes one net at a

    time, with the nets that have a smaller bounding box being routed earlier. The routing tree construction

    is carried out on a routing graph of the type described in Section 2, with the edge weights on the graph

    being proportional to the wire congestion. For a given net, a cell is called a terminal cell if it contains a

    pin of this net; otherwise, it is called a non-terminal cell. An optimal Steiner min-max tree that spans the

    net on the routing graph is obtained through a two-step procedure in polynomial time. In the first step,

    a minimum spanning tree (MST) over all of the cells in the routing graph is constructed. In the second step,

    degree-one non-terminal cells are eliminated from this MST until none remain. It can be proven that the resulting

    tree is an optimal Steiner min-max tree. The edge weights are updated dynamically after the routing

    of each net, reflecting changes in the congestion.
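
The two-step construction can be sketched in a few lines of Python. This is a minimal illustration rather than the code of [12]: the function name `steiner_min_max_tree`, the edge-list input format, and the five-node example graph are all assumptions made here, and Kruskal's algorithm is simply one convenient way to obtain the MST.

```python
def steiner_min_max_tree(nodes, edges, terminals):
    """Return the edges (u, v, weight) of a Steiner min-max tree.

    nodes: iterable of hashable, orderable node names
    edges: list of (weight, u, v) tuples for the routing graph
    terminals: set of nodes that contain a pin of the net
    """
    # Step 1: Kruskal's algorithm builds an MST over all nodes.
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    adj = {n: {} for n in parent}
    for w, u, v in sorted(edges, key=lambda e: e[0]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            adj[u][v] = adj[v][u] = w

    # Step 2: repeatedly strip degree-one nodes that are not terminals.
    leaves = [n for n in adj if len(adj[n]) == 1 and n not in terminals]
    while leaves:
        n = leaves.pop()
        if len(adj[n]) != 1:
            continue
        (m,) = adj[n]
        del adj[m][n]
        adj[n].clear()
        if len(adj[m]) == 1 and m not in terminals:
            leaves.append(m)

    return sorted((u, v, w) for u in adj for v, w in adj[u].items() if u < v)


nodes = ["a", "b", "c", "d", "e"]
edges = [(1, "a", "b"), (1, "b", "c"), (5, "a", "c"), (2, "b", "d"), (1, "d", "e")]
tree = steiner_min_max_tree(nodes, edges, terminals={"a", "c"})
print(tree)
```

In the example, the terminals a and c are joined through b, so the largest edge weight on the tree is 1 rather than the weight 5 of the direct edge, and the dangling branch through d and e is pruned away.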

    Intuitively speaking, a Steiner min-max tree attempts to minimize the maximum edge weight, i.e.,

    to minimize the wire congestion. However, the procedure does not consider wire length in constructing

    the SMMT. Finding a Steiner min-max tree with minimum wire length (MSMMT) is shown to be

    an NP-complete problem, and the authors of [12] propose a heuristic algorithm, based on the SMMT

    algorithm, that tries to find the MSMMT. In this heuristic, a wire length limit ratio ρ > 1 is defined,

    and the routing is performed over several iterations. Initially, the limit ratio is set to a tight value

    in the interval (1, 2). In each iteration, the nets are routed one at a time using the

    SMMT algorithm, in ascending order of the semiperimeter of the bounding box for the net. After

    this SMMT solution is obtained, its wire length is evaluated to check whether it is less than ρ times a

    lower bound on the wire length, given by the semiperimeter of the bounding box for the net. If not,

    this routing solution is rejected, and this net will be considered for routing in the next iteration. The

    value of ρ is increased from each iteration to the next so that every net can eventually be routed.

    At the end of this iterative SMMT phase, the method iteratively reroutes each net using the minimum

    spanning tree algorithm, in the same fixed order. A rerouting result is accepted only when it is better

    than the solution of the SMMT phase.

    4.3 Minimum weighted Steiner tree

    In the SMMT-based global routing procedure, the primary objective is to minimize the maximal congestion

    and the secondary objective is to minimize wire length. In [13], another sequential global routing

    method is developed in an effort to simultaneously minimize congestion and wire length. This method

    routes the nets sequentially using a minimum weighted rectilinear Steiner tree (WRST) approximation

    algorithm.

    The example shown in the work of [13] is based on a custom design such as that shown in Figure

    7(a), where the routing area is divided into a set of small regions. Each region is assigned a weight

    that is defined according to the complexity and wire congestion of the region. The shaded rectangles

    correspond to macros that act as wire blockages, and the weights in these regions are set to ∞.

    This work constructs a routing graph for a net as the union of the Hanan grid for the net and the

    grid formed by extending the borders of each region until they encounter a blockage or the boundary.

    For example, the routing graph for the design shown in Figure 7(a) is illustrated using dashed lines

    in Figure 7(b). For a wire that is routed along a border between two regions with different weights,

    it is assumed that the weight with a smaller value is chosen. The weight of an edge is the product of

    the edge length and the weight of the region the edge belongs to, and the objective of the procedure

    is to minimize the cost of the weighted rectilinear Steiner tree. It can be shown that there is always a

    minimum weight path connecting two nodes using edges that lie exclusively on this routing graph.

    Figure 7: (a) A set of regions in a layout, each with a specified weight (blocked regions carry a weight of ∞). (b) A routing graph and a minimum weighted Steiner tree for this layout.

    As in the SMMT-based approach, the nets are first sorted in increasing order of the bounding box

    size and are routed sequentially in this order. The weights for all regions are updated after each net is

    routed.

    Each net is routed by means of a minimum weighted Steiner tree over the net. Based on a proof

    that the weight of a minimum spanning tree (MST) is at most two times the weight of the optimal WRST

    solution, the tree construction proceeds by beginning with an MST skeleton and heuristically attempting

    to maximize the overlaps in its embedding on to the routing grid. For the net in Figure 7(b) whose pins

    are indicated by the three solid squares, the MST is shown by a set of thickened dashed lines. After the

    MST is obtained, each edge in the MST is instantiated into an actual wire route one by one, with the

    shortest edges being embedded first. Wherever possible, a wire route is merged with already embedded

    wire routes from the same net to reduce the wire length. If there is more than one minimum weight

    path for an edge in the MST, the one that delivers the largest wire length reduction is chosen. Since

    this process can only decrease the wire length from the MST result, the final result has at most twice

    the wire length of the optimal WRST in the worst case.

    5 Region-wise routing

    One way to avoid the net ordering problem is to route one region at a time instead of one net at a

    time. This alters the problem to one of determining the order in which regions should be considered,

    and this may be decided according to net distributions, as in the method described in Section 5.1. In

    an alternative approach outlined in Section 5.2, inter-row wires are routed first, and the intra-row nets

    are then routed in the order of horizontal wires, vertical wires and wires with both orientations.

    5.1 Unique pattern first and outer rim first routing

    In an investigation of the properties of net distributions, Li and Marek-Sadowska [57] made two obser-

    vations. Firstly, some nets or subnets may have to take certain unique patterns, regardless of what the

    final feasible routing is, and it is useful to identify and route these first. Secondly, in typical layouts

    the wiring is crowded at the center of the chip and relatively sparse at the outer rim. Therefore, it is

    better to start routing from the outer rim of the entire routing area and then shift the routing toward

    the center step by step. Although this work was specifically directed towards gate array designs, the

    key ideas can be extended to other problems.

    The following definitions are useful in describing the algorithm and are related to an abstraction of

    the problem in terms of a routing graph, G.

    Definition 5.1 (Non-Pass-Through (NPT) cell) A cell is said to be a non-pass-through (NPT)

    cell if the number of unconnected pins of different nets inside it is equal to, or one less than, the total

    remaining channel capacities on its four boundaries.

    Definition 5.2 (Outermost and inner meshes) In a particular planar drawing of the cell graph G,

    where the boundary of the exterior face of a planar embedding of G corresponds to cells at the boundary

    of the chip, the outermost mesh is a subset of nodes and edges in the routing graph G that bounds the

    exterior face from G, and an inner mesh is a subset of nodes and edges in the routing graph G that

    bounds an area where edges are missing.

    An inner mesh exists when there is a grid edge whose wiring capacity is zero. An example illustrating

    the idea of the outermost mesh and the inner mesh is shown in Figure 8.

    Figure 8: An example of an outermost mesh and an inner mesh. The legend distinguishes grid edges with capacity > 0 from those with capacity = 0.

    The algorithm is based on the following properties, which hold when the global routing problem has a feasible

    solution with no wiring overflow.

    Lemma 5.1 If there is a two-pin net, $N_k$, with pins in two adjacent cells with non-zero wiring capacity

    at their common boundary, then there exists a feasible routing solution in which $N_k$ is wired through the

    common boundary of the two adjacent cells.

    Lemma 5.2 If the total wiring capacity on a closed region boundary is U , and there are P pins inside

    such that P > U , then at least P − U of them must be connected within this region.

    Lemma 5.3 If there are two adjacent NPT cells, $g_a$ and $g_b$, with a non-zero wiring capacity $u_{ab}$ on the

    common boundary, and there are $u_{ab}$ or fewer nets with pins in both $g_a$ and $g_b$, then there is a feasible

    solution in which these pins are connected by the shortest path between $g_a$ and $g_b$.

    The algorithm proceeds by first finding pairs of pins in adjacent cells and connecting them directly

    according to Lemma 5.1. Next, it identifies the NPT cells from this partial routing result and determines

    unique routes according to the following rules:

    1. Nets with pins in the neighborhood of NPT cells must find unique routes that do not pass through

    the neighboring NPT cells.

    2. By Lemma 5.3, nets that have two pins in adjacent NPT cells are connected through the common

    boundary of the adjacent NPT cells.

    3. If there is a net with one pin located in an NPT cell and the other in its non-NPT neighbor, they

    are connected directly across their common boundary.

    After these unique patterns have been routed, some new NPT cells may be generated due to the

    consequent reduction in the wiring capacity. These cells are identified and the process is repeated until

    no such new cells are created.

    In the next phase, the routing proceeds along the outermost mesh and the inner meshes. Any nets

    with more than two pins within a mesh are connected within the mesh, as long as such a connection

    does not lead to barriers separating the routing graph into disjoint regions. Those pins that remain

    unconnected are connected to neighboring non-mesh cells. After one layer of outermost mesh is routed,

    the outermost mesh is expanded inwards. Similarly, a routed inner mesh is expanded outward. If

    a routing decision at an outer mesh is found to be incorrect when the inner meshes are routed, the

    algorithm resorts to backtracking to correct the previously made decision. This procedure is repeated

    until all of the cells are routed. For a routing instance with k nets to be routed on a p × q grid, this

    algorithm has a complexity of $O(k^2(\log k + pq))$.

    Figure 9: Candidate paths for inter-row connections in the algorithm proposed by Thaik et al. The legend marks terminal GRCs, via GRCs, and possible path segments.

    5.2 Routing in order of wire orientations and in terms of rows

    In [81], Thaik et al. designed a global routing algorithm for a sea-of-gates environment that proceeds

    edge-wise instead of net-wise. They consider the global routing problem for three layers such that the top

    and bottom layers are reserved for vertical wires and the middle layer is devoted to horizontal wires. The

    routing region is tessellated into a tile graph as shown in Figure 9, and each row is further divided into

    two subrows separated by the thin dashed lines.

    The algorithm consists of three phases. In the first, each net that spans multiple rows is connected

    by a vertical path in the top layer that covers the vertical span of the entire net. The end point of the

    path may either be a terminal GRC (global routing cell) that contains a pin of the net, or a via GRC.

    For a specific net, there could be several paths that cover its vertical span, as illustrated in Figure 9,

    and only one such path needs to be chosen. This choice should be made while ensuring that there are no

    wiring capacity overflows. This decision process is implemented in [81] by means of a zero-one integer

    linear programming (ILP) formulation that is solved using a 0-1 ILP package.

    In the second phase, connections within each row are considered. Recall that each row consists of

    two subrows. If two pins lie in two adjacent subrows of the same row and are in the same column,

    they are connected by a straight vertical edge in the lowest routing layer as long as no overflow occurs.

    Next, straight horizontal connections are made on the middle layer between pairs of terminal GRCs,

    via GRCs and/or metal-3 paths. These problems are also formulated and solved as 0-1 ILP problems.

    After phase two, there may still be nets whose routing has not been completed due to wiring constraints.

    These remaining connections are completed in the third phase, which is also constrained to

    routing within a row. In this step, multi-bend routes are considered, and the route for each row is solved

    as a 2×N routing problem using a hybrid hierarchical approach, where the routing at each hierarchical

    level is solved as a 0-1 ILP problem.

    6 Move-based heuristics

    Move-based heuristics are commonly used to find good, though not necessarily optimal, solutions to computationally difficult

    problems. Arguably, the best-known move-based heuristic is simulated annealing, developed by

    Kirkpatrick et al. [47]. The motivation for this approach is that in solving computationally difficult

    combinatorial optimization problems, greedy heuristics are easily trapped at local minima because only

    those moves that reduce the value of the cost function are accepted. In simulated annealing, a certain

    amount of hill-climbing is permitted, providing an increased chance of finding the global minimum.

    More precisely, any cost-decreasing move is always accepted, and a cost-increasing move is accepted

    with a probability of $e^{-\Delta C/T}$, where $\Delta C$ is the increase in the cost and T is a parameter known as the

    temperature. It can be seen that the smaller the increase in the cost, the more likely the move is to be accepted.

    The procedure works iteratively, with an outer loop starting from a high temperature where virtually

    all moves are accepted, and gradually “cooling” the temperature so that the likelihood of accepting

    cost-increasing moves is progressively diminished. At each temperature, a number of moves are carried

    out, which are either accepted or rejected according to the criterion outlined above. This process is said

    to be analogous to the cooling process during annealing of metals.
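
The acceptance rule and the cooling loop can be sketched as follows in Python. This is a generic illustration, not the annealer of [47], [84] or TimberWolf: the function names, the geometric cooling factor `alpha`, the temperature bounds, the number of moves per temperature, and the one-dimensional toy cost function are all assumptions chosen for brevity.

```python
import math
import random


def accept_move(delta_cost, temperature, rng=random):
    """Accept cost-decreasing moves always, others with probability exp(-dC/T)."""
    if delta_cost <= 0:
        return True
    return rng.random() < math.exp(-delta_cost / temperature)


def anneal(initial, cost, neighbor, t_start=10.0, t_end=0.01,
           alpha=0.9, moves_per_temp=50, seed=0):
    """Outer loop cools the temperature; inner loop tries moves at each one."""
    rng = random.Random(seed)
    current = best = initial
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            if accept_move(cost(candidate) - cost(current), t, rng):
                current = candidate
                if cost(current) < cost(best):
                    best = current
        t *= alpha  # assumed geometric cooling schedule
    return best


# Toy problem (hypothetical): minimize (x - 3)^2 over the integers.
cost = lambda x: (x - 3) ** 2
step = lambda x, rng: x + rng.choice((-1, 1))
best = anneal(20, cost, step)
print(best, cost(best))
```

In a global router, a state would instead be a choice of routing tree for every net and a move would switch one net to another candidate tree, as described next for TimberWolf.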

    The simulated annealing technique is applied to global routing in [84], where only two-pin nets are

    considered and the number of bends for each net is no more than two. In the well-known TimberWolf

    package [76], simulated annealing is applied to both placement and global routing. In this package, a set

    of candidate routing trees is created for each net and one of the trees is randomly chosen as the initial

    solution. Each move implies a switch from one tree to another for a net. The cost to be minimized is

    the total wiring overflow over the entire routing graph. The net whose topology is to be changed by a

    move is randomly chosen from a grid cell boundary where an overflow exists.

    Other move-based techniques applied to global routing include simulated evolution [10], genetic

    algorithms [28] and tabu search [91].

    7 Rip-up and reroute

    Another common approach to avoid the net ordering dilemma is the rip-up-and-reroute method. This

    approach starts by routing each net individually without considering congestion, usually constructing

    Steiner minimum trees for each net. After all of the nets have been routed, the congested areas are

    identified, and the nets in those areas are ripped up and rerouted through less congested areas. This

    rerouting is often based on the maze routing algorithm. Although this method sounds very simple, it

    is surprisingly effective and has long been the most commonly used global routing method in industry.

    Moreover, it can always be combined with other global routing methods as a post-processing step

    to further improve the routing quality. The degrees of freedom in this approach are related to, for

    example, the different strategies for choosing the net to be rerouted, or the order in which boundaries

    with overflow are processed.

    An early and influential rip-up-and-reroute global routing method was proposed by Ting and Tien [82].

    Broadly speaking, their procedure first selects a set of congested grid cell boundaries and then chooses

    the subset of nets that pass across these boundaries to be rerouted.

    Figure 10: An example showing the overflow along a full loop, with nets 1, 2 and 3 drawn before and after rerouting. This overflow can be reduced only when there is at least one net that crosses the loop twice.

    A special case corresponds to the situation where the set of saturated or oversaturated boundaries

    forms a closed loop, as illustrated by the thickened dashed square in Figure 10. Along this loop, each

    boundary is either full or has an overflow, and at least one boundary has an overflow. In this case, if a

    wire crosses this loop only once, as is the case for net 1, rerouting this net will not satisfy the congestion

    constraints. More generally, if all of the nets pass through the loop only once, then no feasible solution

    exists. On the other hand, if a net crosses this loop twice and all of its pins lie either entirely within or

    entirely outside the loop, as is the case for nets 2 and 3, respectively, then we can reroute these nets to

    ensure that the devious routes taken by these double-crossing nets are altered to lie entirely within or

    without the loop. The first step of Ting and Tien’s algorithm is to identify such loops and to reroute

    the two-crossing nets.

    Figure 11: Mapping the problem of choosing nets to be rerouted onto the problem of finding a cover on a bipartite graph: (a) nets N1 to N10 and the overflowed boundaries b1, b2 and b3; (b) the bipartite graph of net vertices and boundary vertices, with weights w1 = w2 = w3 = 1.

    When no such loop exists, Ting and Tien’s algorithm selects the k > 1 most congested boundaries

    to reroute. There are typically many nets that run across these boundaries, and the technique employed

    to choose the nets to be rerouted is illustrated through the example in Figure 11(a). In this example,

    the wiring capacity on each boundary is assumed to be two, and it is easily verified that there are

    three boundaries, b1, b2 and b3, with overflow. A bipartite graph is then constructed as shown in Figure

    11(b). The group of vertices to the right correspond to the overflow boundaries, and those to the

    left correspond to the nets that cross those boundaries. An arc is set up between a net vertex and a

    boundary vertex if the net crosses this boundary. Each boundary vertex $b_i$ has a weight $w_i$ equal to the

    overflow of the corresponding boundary. A minimum cardinality cover set from the group of vertices to

    the left is selected to cover every boundary vertex $b_i$ $w_i$ times, and the nets in this cover set are selected

    to be rerouted.
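
Finding a minimum cardinality cover of this kind is a set multicover problem, which is hard in general, so the Python sketch below substitutes a simple greedy rule: repeatedly pick the net that relieves the largest number of still-overflowed boundaries. It is not claimed to be the selection procedure of [82]. The function name and the net-to-boundary crossings in the example are hypothetical and are not transcribed from Figure 11.

```python
def greedy_reroute_cover(crossings, overflow):
    """Pick nets to rip up so that each boundary b is covered overflow[b] times.

    crossings: dict mapping a net name to the set of overflowed boundaries
               that the net crosses
    overflow:  dict mapping a boundary name to its overflow
    Returns the chosen nets in selection order (greedy, not guaranteed minimum).
    """
    remaining = dict(overflow)
    candidates = {net: set(bs) for net, bs in crossings.items()}
    chosen = []

    def gain(net):
        # Number of still-overflowed boundaries that this net would relieve.
        return sum(1 for b in candidates[net] if remaining.get(b, 0) > 0)

    while any(w > 0 for w in remaining.values()):
        best = max(sorted(candidates), key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("remaining overflow cannot be covered")
        chosen.append(best)
        for b in candidates.pop(best):
            if remaining.get(b, 0) > 0:
                remaining[b] -= 1
    return chosen


# Hypothetical instance: three boundaries, each with an overflow of one.
crossings = {"N3": {"b1"}, "N4": {"b1", "b2"}, "N5": {"b2", "b3"}, "N6": {"b3"}}
overflow = {"b1": 1, "b2": 1, "b3": 1}
chosen = greedy_reroute_cover(crossings, overflow)
print(chosen)
```

For this instance two nets suffice, and no single net crosses all three boundaries, so the greedy answer happens to be a minimum cover; in general the greedy rule only approximates the minimum.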

    Another variant of the rip-up-and-reroute method is due to Nair [67]. In this procedure, every net is

    ripped up and rerouted, based on the observation that a wire in a non-overflow area may be able to

    move further, to a less congested area, to leave some room for wires in adjacent congested areas.

    Another feature is that the nets are rerouted in the same fixed order in every iteration. The rationale

    for this is that the routes that are chosen early in an iteration are based on less accurate congestion

    information compared to the routes that are determined later, and therefore, these early routed nets

    should be corrected first in the next iteration as well.

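    Figure 12: An example of rerouting using a network flow model: (a) the partial routing trees (subtrees T1 and T2) left after rip-up around v′; (b) the network flow model with source v′, pseudo nodes v1 and v2, and a target node; (c) a sample solution.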

    In [65], Meixner and Lauther reroute a set of nets simultaneously using a single commodity min-

    cost network flow formulation, so that the net ordering problem in rerouting is avoided. The method

    successively considers each node v′ ∈ V for a routing graph G = (V, E) and reroutes the nets that have

    either a sink node or a Steiner node with degree greater than two at v′. For each such net, its routing

    edges from v′ to the next sink or Steiner node with degree greater than two are ripped up, leaving

    behind a set of partial routing trees. For example, two nets N1 and N2 may leave behind the trees T1

    and T2, respectively, shown in Figure 12(a).

    The problem is now to connect nodes at v′ to subtrees T1 and T2 for nets N1 and N2 without causing

    a wiring overflow, and simultaneously minimizing the wire length. To solve this problem, the partial

    routing graph is transformed into a network flow model as shown in Figure 12(b). Each undirected

    edge is mapped on to a pair of directed edges whose capacity is the corresponding wiring capacity and

    whose cost is the edge length. Two pseudo nodes v1 and v2 are added for nets N1 and N2, respectively,

    together with a target node. Every node in partial tree $T_i$ is connected to the node $v_i$, i = 1, 2, by means

    of a directed edge, and finally, each $v_i$ is connected to the target node. These edges all have a cost

    of zero and capacity of one. The node v′ serves as the source node, and a minimum cost flow for this

    network is determined for a total flow of p units through the network, where p is the number of partial

    routing trees. Such a flow can be used to determine an optimal solution, if it exists, that satisfies wiring

    capacities and minimizes the wire length. A sample solution for our example is illustrated in Figure

    12(c).
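
The construction above can be tried out on a toy instance with the Python sketch below. It is only an illustration: the solver is a textbook successive-shortest-path method with Bellman-Ford searches, which is not necessarily the solver used in [65], and the function name, the node numbering, and the four-edge routing graph are hypothetical. Node 0 plays the role of v′, nodes 2 and 3 belong to the partial trees T1 and T2, nodes 4 and 5 are the pseudo nodes v1 and v2, and node 6 is the target.

```python
def min_cost_flow(num_nodes, arcs, source, sink, flow_target):
    """Successive shortest paths; arcs is a list of (u, v, capacity, cost).

    Returns (flow_sent, total_cost, flow_per_arc).
    """
    to, cap, cost = [], [], []
    out = [[] for _ in range(num_nodes)]
    for u, v, c, w in arcs:  # residual edges 2i (forward) and 2i + 1 (reverse)
        out[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        out[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)
    inf = float("inf")
    sent = total = 0
    while sent < flow_target:
        dist = [inf] * num_nodes
        prev_edge = [None] * num_nodes
        dist[source] = 0
        for _ in range(num_nodes - 1):  # Bellman-Ford on the residual graph
            changed = False
            for u in range(num_nodes):
                if dist[u] == inf:
                    continue
                for e in out[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        prev_edge[to[e]] = e
                        changed = True
            if not changed:
                break
        if prev_edge[sink] is None:
            break  # no augmenting path: the requested flow is infeasible
        push, v = flow_target - sent, sink
        while v != source:  # find the bottleneck along the path
            push = min(push, cap[prev_edge[v]])
            v = to[prev_edge[v] ^ 1]
        v = sink
        while v != source:  # augment along the path
            e = prev_edge[v]
            cap[e] -= push
            cap[e ^ 1] += push
            total += push * cost[e]
            v = to[e ^ 1]
        sent += push
    return sent, total, [cap[2 * i + 1] for i in range(len(arcs))]


arcs = []
# Each undirected routing edge (u, v, wiring capacity, length) becomes two arcs.
for u, v, capacity, length in [(0, 1, 1, 1), (1, 2, 1, 1), (0, 3, 1, 1), (3, 2, 1, 1)]:
    arcs += [(u, v, capacity, length), (v, u, capacity, length)]
# Tree nodes to pseudo nodes, pseudo nodes to target: cost zero, capacity one.
arcs += [(2, 4, 1, 0), (3, 5, 1, 0), (4, 6, 1, 0), (5, 6, 1, 0)]
sent, total, flows = min_cost_flow(7, arcs, source=0, sink=6, flow_target=2)
print(sent, total)
```

Because all capacities are integers, the flow found is integral, so each unit of flow traces one connection from v′ to one of the partial trees.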

    In addition to avoiding the net ordering problem, this method has the advantage that an optimal integer

    solution to this single commodity flow problem can be obtained in polynomial time when every edge

    capacity is an integer [2], so that no rounding procedure is required. However, the quality of the solution depends on

    the order in which the nodes in G are processed, though this node ordering problem is less severe than

    the original net ordering problem.

    Figure 13: An example of a switchable segment, showing a pin, a grid line, a wire, the switchable segment, and its two possible configurations.

    In [55], Lee and Sechen proposed a rip-up-and-reroute strategy using a multi-stage refinement proce-

    dure, instead of applying maze routing immediately after initial tree construction. As in other rip-up-

    and-reroute methods, they initially construct Steiner trees for each net without considering congestion.

    Next, wires are ripped up and rerouted to reduce congestion in four successive stages. One scenario is

    related to the idea of switchable segments, illustrated in Figure 13. In a gate array or a standard cell

    design, an equivalent pair of pins at the top and bottom of a cell is treated as a single pin located at

    the middle of the row where a horizontal grid line goes through. In stage one of the rerouting process,

    one of these two options for each switchable segment is chosen in a way such that the congestion is

    minimized. In stage two, the L-shaped wires in the initial Steiner trees are allowed to be rerouted as

    Z-shaped connections to reduce congestion; note that there is no wire length increase in either stage

    one or stage two. In stage three, connections that were originally straight in the initial Steiner trees are

    allowed to detour and become U-shaped connections, with a bound on the permissible increase in the

    wire length. During this stage-wise refinement, the algorithm attempts to control the wire length and

    the number of bends. If some overflow still remains after these three stages of rerouting, traditional

    maze routing is finally employed to further reduce the congestion in stage four.

    In [60], a customized routing graph is constructed for global routing using multilayer macrocells, or

    building blocks. This routing graph is similar to the graph in Figure 7(b), except that it is in three

    dimensions to reflect the multilayer design. The authors extend Wang’s heuristic [87] to construct

    Steiner trees for each net separately at the beginning. After the initial tree construction, their method

    starts by inspecting every grid cell to see if the number of wires (vias) that pass through it exceeds the

    available routing tracks (via spaces). Next, every over-congested cell is processed in order of the most

    congested cell first. For each such over-congested cell, a new route is found for every net in it that can

    avoid all the over-congested or full cells, if such a route exists. These routes are maintained in a priority

    queue with the top route having the minimum wire length increase compared to its corresponding old

    route. After all of the possible new routes have been found, the top route is repeatedly popped from the

    queue and used to replace its corresponding old route until the over-congestion problem for

    this cell is resolved.

    8 Multicommodity flow based approach

    Although the sequential, rip-up-and-reroute and other heuristics may be effective in practice, they

    cannot provide a certain answer as to whether or not a feasible solution exists. In other words, if they

    fail to find a feasible solution, it is not clear whether this is attributable to the non-existence of a feasible

    solution or to shortcomings of the heuristic. Moreover, when a heuristic does find a feasible

    solution, it is not known whether or not this solution is optimal, or how far it is from the optimal

    solution.

    These questions may be answered if we formulate and exactly solve the global routing as a multicom-

    28

  • modity flow problem. A multicommodity flow operates on a network that is a graph G = (V, E), where

    V = {v1, v2, ..., vn} is a set of n vertices and E = {e1, e2, ..., em} is a set of m edges. Some researchers

    define this network as a directed graph while some others define it as an undirected graph. Generally

    speaking, this difference affects only the form of the problem formulation. For multi-pin net global

    routing, an undirected graph formulation is more convenient.

    Over this network, k commodities must be transported from some vertices to other vertices. In global

    routing, we can treat each net as a commodity and we can say that there is a set of N = {N1, N2, ..., Nk}

    commodities that are to be shipped. For each commodity Ni, a certain amount di, namely the demand,

    is required to be shipped. In global routing, each Ni corresponds to a net with more than one pin

    and each di is always one. Each edge has a flow capacity u(e) and cost c(e), which have the same

    interpretation as in the global routing problem formulation. The routing can be expressed in terms

    of either edges or trees. In the edge-based expression, a variable fi(e) represents the amount of flow

    passing through edge e ∈ E. Note that fi(e) is always non-negative in an undirected network while it

    can be any real number for a directed network. The tree-based expression assumes that there is a set

    of possible routing trees Ti = {Ti,1, Ti,2, ...} for each net Ni. The binary variable xi,j is set to one if

    Ti,j ∈ Ti is selected for net Ni; otherwise, xi,j is zero. Of course, it is impractical to enumerate all of

    the possible trees and these trees are usually generated on the fly in practice.

    Typically, there are two constraints that must be satisfied in a multicommodity flow problem. The

    first is the demand constraint, which requires that the amount of flow shipped for each commodity

    should be equal to its demand, and the second is the bundle constraint, which states that the total

    amount of flow f(e) passing through each edge e ∈ E should not exceed its capacity u(e). In the

    fractional flow version, the decision variables fi(e) or xi,j may be any non-negative real number. In the

    zero-one integer flow version, there is a third, zero-one integer constraint, which requires that fi(e) and

    xi,j take a value of either zero or one. The global routing problem maps onto the zero-one integer

    version. Due to computational complexity issues related to integer programming and the large size of

    the global routing problem, the integer flow problem here is often relaxed to the fractional formulation,

    whose solution is transformed to an integer solution after a rounding procedure.
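
    As a small illustration of the tree-based expression, the following sketch checks the demand and bundle constraints for a given assignment of the variables xi,j. The function names and the dictionary layout are assumptions made for this example; every demand di is taken to be one, as in global routing, and the same check applies to fractional values of xi,j.

```python
def edge_flow(candidates, x):
    """Total flow f(e) on each edge for tree selection variables x."""
    flow = {}
    for net, trees in candidates.items():
        for tree, amount in zip(trees, x[net]):
            for e in tree:
                flow[e] = flow.get(e, 0.0) + amount
    return flow

def check_constraints(candidates, x, capacity, tol=1e-9):
    """Return (demand_ok, bundle_ok).

    candidates -- net -> list of candidate trees, each a list of edges
    x          -- net -> list of x_{i,j} values, aligned with candidates[net]
    capacity   -- edge -> capacity u(e)
    """
    # Demand constraint: the flow shipped for each net equals its demand (one).
    demand_ok = all(abs(sum(x[net]) - 1.0) <= tol for net in candidates)
    # Bundle constraint: the total flow on each edge stays within its capacity.
    flow = edge_flow(candidates, x)
    bundle_ok = all(f <= capacity[e] + tol for e, f in flow.items())
    return demand_ok, bundle_ok
```

    For two nets that compete for a unit-capacity edge, selecting the shared edge for both satisfies the demand constraint but violates the bundle constraint.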

    Besides these constraints, there are several variations on the objective function for the multicommodity

    flow problem. One formulation tries to minimize the total cost of transportation, and this is called

    the min-cost multicommodity flow problem. Another formulation attempts to minimize the maximum

    edge density (as defined in Section 2) and is called the concurrent flow problem.

    8.1 The Shragowitz-Keel algorithm

    The work by Shragowitz and Keel in [79] is perhaps the first reported work on global routing using

    the multicommodity flow model. Unlike many subsequent efforts using this formulation, they did not

    employ an off-the-shelf multicommodity flow algorithm, but instead, developed their own polynomial

    time algorithm. Superficially, this algorithm looks similar to the rip-up-and-reroute method and the

    authors investigated the feasibility of convergence and the convergence rate. However, no statements

    about the optimality are made with the analysis.

    Their approach is based on a directed network and is restricted to two-pin nets. For each net Ni, one

    of its pins vs ∈ V is selected as source where a unit of flow is generated, i.e., the net flow of net Ni at

    this vertex di(vs) is 1. The other pin vt is the sink where a unit of flow is dissipated, i.e., the net flow of

    net Ni is di(vt) = −1. For simplicity, in this description, a directed edge from vertex v to v′ is denoted

    by the vertex pair (v, v′). The Shragowitz-Keel formulation of the global routing problem is as follows:

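    \[
    \begin{aligned}
    \text{minimize} \quad & \sum_{e \in E} \sum_{N_i \in N} c(e)\,|f_i(e)| \\
    \text{subject to:} \quad & \sum_{v' : (v, v') \in E} f_i(v, v') - \sum_{v' : (v', v) \in E} f_i(v', v) = d_i(v), && \forall N_i \in N,\ \forall v \in V \\
    & \sum_{N_i \in N} |f_i(e)| \le u(e), && \forall e \in E \\
    & f_i(e) \in \{0, \pm 1\}, && \forall N_i \in N,\ \forall e \in E
    \end{aligned}
    \tag{2}
    \]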

    Initially, this algorithm discards the bundle constraint to obtain a min-cost solution where the edge

    cost c(e) is defined to be the rectilinear edge length. Next, it iteratively reduces the violations on the

    bundle constraint while maintaining a minimum feasible cost. The first step is solved individually for

    each net and any shortest path algorithm may be used. As in other methods, this step corresponds to

    finding the minimum cost route for each net while ignoring the effects of the other nets. The algorithm

    then identifies the subset of edges Eφ with maximum overflow φ = max(φ(e), ∀e ∈ E) and the subset of

    edges Eφ,φ−1 with an overflow of either φ or φ − 1. If φ = 0, then the algorithm is done. The cost of

    each edge in Eφ,φ−1 is updated to infinity and all other edges keep their cost unchanged. The subset

    Nφ is defined such that each commodity or net in it has at least one fi(e) = 1 such that e ∈ Eφ. The

    subset Nφ,φ−1 is defined in a similar manner. Based on the updated cost, the shortest path algorithm

    is executed for every commodity in Nφ. If there is a new path with a cost of infinity, this algorithm

    checks a feasibility condition which can verify whether there is a feasible solution for this problem. If the

    problem remains infeasible, this algorithm perturbs the flows and reruns the iterations. If there

    is no new path with a cost of infinity, this algorithm chooses the commodity Ni that has the smallest

    increase in cost from the old path to the new path, and replaces its old path with the new path. This

    process is repeated until there is no overflow.
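
    The bookkeeping in one iteration of this procedure can be sketched as follows. The sketch assumes that the overflow φ(e) is the number of nets using edge e minus its capacity u(e), so that an overflow of zero denotes a full edge; this definition, the function names and the data layout are assumptions of the example. The feasibility check and the perturbation step are not shown.

```python
import math
from collections import Counter

def overflow_sets(paths, capacity):
    """Return (phi, E_phi, E_phi_phi1, N_phi) for the current two-pin routes.

    paths    -- net -> list of edges on its current path
    capacity -- edge -> capacity u(e)
    """
    usage = Counter(e for path in paths.values() for e in path)
    # Assumed definition: overflow = usage - capacity (zero means "full").
    overflow = {e: usage[e] - capacity[e] for e in capacity}
    phi = max(overflow.values())
    e_phi = {e for e, o in overflow.items() if o == phi}
    e_phi_phi1 = {e for e, o in overflow.items() if o in (phi, phi - 1)}
    # Nets with at least one edge among the maximally overflowed edges.
    n_phi = {net for net, path in paths.items() if e_phi.intersection(path)}
    return phi, e_phi, e_phi_phi1, n_phi

def penalized_cost(cost, blocked):
    """Set the cost of every blocked edge to infinity; keep the others."""
    return {e: (math.inf if e in blocked else c) for e, c in cost.items()}
```

    A shortest path computation with the penalized costs then yields the candidate new path for each net in Nφ.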

    8.2 The Raghavan-Thompson rounding method

    One interesting formulation of multi-terminal net routing as a multicommodity flow problem is given

    by Raghavan and Thompson in [72], where the exposition of the algorithm is restricted to nets with

    exactly three pins, although the authors state that generalizations to multiple pins are possible. This

    formulation is also based on a routing graph G = (V, E), and the objective is to minimize the maximum

    flow among all edges, i.e., minimize f̂ = max(f(e),∀e ∈ E).

    Consider a situation where each net Ni ∈ N consists of three terminals vi1, vi2 and vi3. In the

    formulation, Raghavan and Thompson assume that one Steiner node is needed for each net and an

    indicator variable si(v) ∈ {0, 1} is used to denote whether or not each vertex v ∈ V is the Steiner node

    for net Ni. The global routing problem is then formulated as a multicommodity flow problem in which

    si(v) units of flow are to be shipped from each of vi1, vi2 and vi3 to vertex v ∈ V .

    This problem is first relaxed to be a linear programming problem and solved by any linear program

    solver. The next crucial task is to obtain the integer solution from the optimal linear program relaxation

    solution. We use f̃(e) and s̃i(v) to represent the fractional results from the linear programming and

    denote the optimal max-flow as f̃ . Then these values must satisfy the following conditions.

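    \[ \sum_{v \in V} \tilde{s}_i(v) = 1, \quad \forall N_i \in N \tag{3} \]

    \[ \tilde{f}(e) \le \tilde{f}, \quad \forall e \in E \tag{4} \]

    Here all of the f̃(e), s̃i(v) ∈ [0, 1]. We express a set of solutions as S and start with the initial solution S0 as:

    \[ S_0 = \{\tilde{f},\ \tilde{f}(e),\ \tilde{s}_i(v),\ \forall N_i \in N,\ \forall v \in V,\ \forall e \in E\}. \]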

    The rounding procedure proceeds for k stages to get a sequence of solutions S1, S2, ..., Sk, where k = |N |.

    In each stage i, the flow of one net Ni is rounded to integers and will not be changed later, and the

    solution proceeds from Si−1 to Si. For each solution S, a potential function is defined as:

    \[ \Psi(S) = \sum_{e \in E} \prod_{N_i \in N} \left[ f_i(e)\,\omega + 1 - f_i(e) \right], \tag{5} \]

    where the parameter ω > 1 will be defined later.

    Each stage consists of two phases. The Steiner node for the net being processed is determined in

    phase one and the integer flow to the Steiner node is obtained in phase two. In phase one of stage i, a

    vertex vp is identified so that si(vp) is forced to 1 and si(v) = 0, v ≠ vp, v ∈ V . If we denote the flow

    from any vertex vij ∈ Ni to vertex v ∈ V as fi(e, v), selecting vp as Steiner node for Ni also implies that

    we let fi(e, vp) = f̃i(e, vp)/s̃i(vp) and fi(e, v) = 0 for all other vertices v. This corresponds to picking

    paths from each pin of the net to the Steiner point. If the solution after Steiner node selection is S′i,

    then the Steiner node must be chosen in a way so that Ψ(S′i) is minimized. The authors have proved that

    this procedure ensures that Ψ(S′i) ≤ Ψ(Si−1).

    After the Steiner node is selected, the fractional flows from net Ni to vertex vp, f′i(e, vp), are rounded

    to be either 0 or 1 in phase two. Let us consider how to round the flow from one of the pins vi1 to

    vp. Through a so-called path stripping procedure, the fractional flow from vi1 to vp is organized into a

    set of paths {P1, P2, ...}, each with a certain amount of flow. The summation of flow over all of these

    paths must be one. The flow along path Pl is now selected to be one such that the solution from this

    choice S′i(Pl) minimizes the potential Ψ(S′i(Pl)). After the flows from all three pins of net Ni have been

    rounded, we can obtain the solution Si, and it is proven that Ψ(Si) ≤ Ψ(Si−1).
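
    The path stripping step mentioned above can be sketched as follows: a fractional flow from a pin to the Steiner node is decomposed into paths by repeatedly finding a path along edges that still carry flow and removing the bottleneck amount from it. The sketch assumes that the flow is given on directed vertex pairs, obeys flow conservation, and that the source and sink are distinct; the function names are hypothetical.

```python
def _find_path(residual, source, sink):
    """Depth-first search for a source-to-sink path over edges with flow left."""
    adj = {}
    for (u, v) in residual:
        adj.setdefault(u, []).append(v)
    stack, seen = [(source, [])], {source}
    while stack:
        node, path = stack.pop()
        if node == sink:
            return path
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [(node, nxt)]))
    return None

def strip_paths(flow, source, sink, tol=1e-12):
    """Decompose a fractional flow into a list of (path, amount) pairs.

    flow -- dict mapping a directed edge (u, v) to the flow it carries
    """
    residual = {e: f for e, f in flow.items() if f > tol}
    paths = []
    while True:
        path = _find_path(residual, source, sink)
        if path is None:
            break
        amount = min(residual[e] for e in path)   # bottleneck flow
        for e in path:
            residual[e] -= amount
            if residual[e] <= tol:
                del residual[e]
        paths.append((path, amount))
    return paths
```

    For a unit flow, the stripped amounts sum to one, and phase two then selects a single path from this set.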

    If the parameter ω is chosen so that it satisfies the relation

    \[ \left[ \frac{e^{\omega - 1}}{\omega^{\omega}} \right]^{\tilde{f}} = \frac{1}{m}, \]

    then the upper bound of the final integer max-flow f̂ is as follows:

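    \[
    \hat{f} \le
    \begin{cases}
    \tilde{f} + (e - 1)\sqrt{\tilde{f} \ln m} & : \tilde{f} \ge \ln m \\
    \dfrac{e \ln m}{\ln \dfrac{e \ln m}{\tilde{f}}} & : \tilde{f} < \ln m
    \end{cases}
    \tag{6}
    \]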

    where m is the number of edges in G and e is the base of the natural logarithm.
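
    The parameter ω has no closed form, but it can be found numerically. Reading the defining relation as [e^(ω−1)/ω^ω]^f̃ = 1/m and taking logarithms gives f̃(ω ln ω − ω + 1) = ln m, whose left-hand side increases monotonically for ω > 1. The following sketch solves this by bisection; the function name and tolerance are assumptions of the example, and it requires f̃ > 0 and m > 1.

```python
import math

def solve_omega(f_tilde, m, tol=1e-12):
    """Find omega > 1 with (e**(omega - 1) / omega**omega) ** f_tilde == 1 / m."""
    target = math.log(m)

    def g(w):
        # Negative logarithm of the left-hand side, minus ln m.
        return f_tilde * (w * math.log(w) - w + 1.0) - target

    lo, hi = 1.0, 2.0
    while g(hi) < 0:          # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```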

    8.3 Application of the Shahrokhi-Matula algorithm

    In [9], Carden et al. developed the first reported global router with a theoretical bound from the

    optimal solution, based on a multicommodity flow algorithm. They applied Shahrokhi and Matula’s

    two-terminal multicommodity fractional flow algorithm [77] followed by randomized rounding to obtain

    a multi-terminal multicommodity integer flow solution. Shahrokhi and Matula’s algorithm is an ε-optimal

    approximation algorithm. In contrast with Shragowitz and Keel’s approach, their method is

    directed towards a concurrent multicommodity flow formulation instead of a min-cost multicommodity

    flow formulation.

    The integer linear programming formulation used here is the same as formulation (1). Its linear programming

    relaxation is obtained by omitting the integer constraint in (1) and we denote this relaxation

    as LP. The dual linear program (DLP) of the LP is:

    \[
    \begin{aligned}
    \text{maximize} \quad & \sum_{N_i \in N} \theta_i \\
    \text{subject to:} \quad & \sum_{e \in E} c(e)\,l(e) = 1 \\
    & \sum_{e \in T_{i,j}} l(e) \ge \theta_i, && \forall N_i \in N,\ \forall T_{i,j} \in T_i \\
    & l(e) \ge 0, && \forall e \in E
    \end{aligned}
    \tag{7}
    \]

    The variable l(e) is the dual variable corresponding to each edge e and is also referred to here as the

    edge weight. The variable θi represents the throughput of the flow from net Ni. According to the

    theory of duality in linear programming, a feasible solution to DLP provides a lower bound for the

    optimal solution of LP, and the LP solution reaches its optimum when it equals the DLP solution.

    Based on this property, Shahrokhi and Matula design an approximation algorithm in which the LP and

    DLP solutions are pushed closer after each iteration, and the final difference between the LP and DLP

    solutions provides an upper bound on how far the LP solution is from the optimal solution. An

    ε-optimal algorithm implies that the resulting λ̂ is at most (1 + ε) times the optimal solution λ̂∗, where

    ε is a specified parameter in (0, 1]. The smaller the value of ε, the closer the solution is to the

    optimum and the longer the computation time.

    Initially, the edge weight is defined as l(e) = 1/u(e), ∀e ∈ E. The algorithm begins by constructing

    a minimum weight Steiner tree Ti for each net Ni to obtain a solution satisfying the first constraint in

    the formulation (1) without considering the bundle constraint. As a consequence, the value of λ̂ may

    initially be inordinately large. In subsequent steps, the algorithm iteratively recomputes the Steiner

    trees to decrease the value of λ̂ until the optimality condition is satisfied. In order to avoid iterations on

    trees with very small flows, a parameter σ0 is defined as the smallest allowable fraction of flow changes.

    If U = ∑e∈E u(e) and u∗ = min(u(e), ∀e ∈ E), then σ0 is chosen to be

    \[ \sigma_0 = \frac{u_*^2\,\epsilon}{16\,\alpha\,U}, \]

    where the constant α, which is a user-tunable parameter, is typically between 0.01 and 100.

    Next, the weights of the edges whose flows were altered in the last iteration are updated according

    to l(e) = e^{αλ(e)}. Using this updated weight, new minimum weight Steiner trees are recomputed for each

    net. If

    \[ \theta = \frac{\sum_{e \in E} l(e)\,u(e)}{\sum_{N_i \in N} \sum_{e \in T_i} l(e)}, \]

    then ε-optimality is satisfied if

    \[ \frac{\theta - 1/\hat{\lambda}}{\theta} \le \epsilon. \]

    If ε-optimality is not satisfied,

    then the net with the largest cost reduction from the old Steiner tree to the new Steiner tree is rerouted

    so that a portion of flow in the old Steiner tree is switched to the new Steiner tree, and this cost

    update and rerouting process is repeated until the ε-optimality condition is satisfied. At this point,

    the actual flow for each tree is scaled to x̄i,j = xi,j/λ̂. Finally, the fractional solution is converted

    into an integer solution through randomized rounding, and any remaining overflow is resolved through a

    rip-up-and-reroute process.
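
    The final rounding step can be illustrated by the following sketch, which assumes the usual form of randomized rounding: for each net, exactly one candidate tree is chosen, with tree Ti,j selected with probability equal to its scaled fractional flow x̄i,j. The function name and data layout are hypothetical, and the subsequent rip-up-and-reroute repair is not shown.

```python
import random

def randomized_rounding(candidates, x_bar, rng=None):
    """Pick one tree per net with probability given by its fractional flow.

    candidates -- net -> list of candidate trees
    x_bar      -- net -> list of fractional flows, aligned with candidates[net]
    rng        -- optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    chosen = {}
    for net, trees in candidates.items():
        # random.choices draws an index with probability proportional to weight.
        idx = rng.choices(range(len(trees)), weights=x_bar[net], k=1)[0]
        chosen[net] = trees[idx]
    return chosen
```

    Because every net receives exactly one tree, the demand constraint holds by construction; only the bundle constraint may be violated after rounding.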

    8.4 Application of the Garg-Könemann algorithm

    The latest reported global routing procedure based on a multicommodity flow algorithm is due to Albrecht

    [3]. This work is an application of Garg and Könemann’s multicommodity flow approximation algorithm

    [29], which is simpler and faster than Shahrokhi and Matula’s algorithm [77]. Besides congestion, wire

    length is also considered in this method.

    Albrecht’s algorithm works on a routing graph G = (V, E) shown in Figure 2(b), which represents

    a two layer global routing problem in which one layer is reserved for horizontal wires and the other

    for vertical wires. The edges between two layers represent vias and their capacities are the maximum

    number of vias allowed in a grid cell. The algorithm permits different wire widths, and uses the symbol

    wi(e) to represent the wire width of edge e ∈ E belonging to net Ni. The resulting integer linear

    programming formulation is slightly different from that in (1). Assuming that all of the demands

    di = 1,∀Ni ∈ N , the formulation is:

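    \[
    \begin{aligned}
    \text{minimize} \quad & \hat{\lambda} \\
    \text{subject to:} \quad & \sum_{T_{i,j} \in T_i} x_{i,j} = 1, && \forall N_i \in N \\
    & \sum_{i,j\,:\,e \in T_{i,j}} w_i(e)\,x_{i,j} \le \hat{\lambda}\,u(e), && \forall e \in E \\
    & x_{i,j} \in \{0, 1\}, && \forall N_i \in N,\ \forall T_{i,j} \in T_i.
    \end{aligned}
    \tag{8}
    \]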

    1.  l(e) ← δ/u(e), ∀e ∈ E
    2.  xi,j ← 0, ∀Ti,j ∈ Ti, ∀Ni ∈ N
    3.  ji ← 0, i = 1, ..., k
    4.  While (∑e∈E l(e)u(e) < 1)
    5.      For i = 1 to k
    6.          If ji == 0 or ∑e∈Ti,ji wi(e)l(e) > (1 + γε)θi
    7.              Find a minimal Steiner tree Ti,ji ∈ Ti for net Ni with edge weight wi(e)l(e), e ∈ E
    8.              θi ← ∑e∈Ti,ji wi(e)l(e)
    9.          xi,ji ← xi,ji + 1
    10.         l(e) ← (1 + εwi(e)/u(e))l(e), ∀e ∈ Ti,ji

    Figure 14: An approximation algorithm for fractional global routing.

    As before, the ILP problem is relaxed to a linear programming problem, which is solved by the

    approximation algorithm in Figure 14. The symbols δ, γ and ε are constants, and k is the number of nets

    to be routed. In the Shahrokhi-Matula algorithm, a fraction of flow is switched from a higher weight

    (more congested) tree to a lower weight (less congested) tree in each iteration. In Garg-Könemann’s

    algorithm, a flow of di is added to a low weight tree Ti,j ∈ Ti in each iteration with the previously placed

    flows remaining unchanged. Finally, the amount of flow on each edge is scaled back by the number of

    iterations. The following significant results are reproduced from [3].

    Theorem 8.1 Given any non-negative values l(e) for each edge e ∈ E, the expression:

    \[ \frac{\sum_{i=1}^{k} \min_{T \in T_i} \sum_{e \in T} w_i(e)\,l(e)}{\sum_{e \in E} l(e)\,u(e)} \]

    provides a lower bound on the optimum value of the fractional global routing problem. Moreover, there

    exist nonnegative values ye, e ∈ E, such that the expression above is equal to the optimum value of the

    fractional global routing problem.
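
    The lower bound of Theorem 8.1, i.e., the sum over all nets of the weight of the lightest tree divided by the sum of l(e)u(e) over all edges, can be evaluated directly once the candidate trees are known. The following sketch does so for explicitly enumerated candidate sets; the function name is hypothetical, and the wire width is assumed to be a single value per net rather than a value per edge.

```python
def congestion_lower_bound(candidates, width, length, capacity):
    """Evaluate the lower bound on the optimal fractional congestion.

    candidates -- net -> list of candidate trees, each a list of edges
    width      -- net -> wire width (assumed uniform over the edges of a net)
    length     -- edge -> non-negative value l(e)
    capacity   -- edge -> capacity u(e)
    """
    # Numerator: for each net, the minimum weighted length over its trees.
    numerator = sum(
        min(sum(width[net] * length[e] for e in tree) for tree in trees)
        for net, trees in candidates.items()
    )
    # Denominator: total of l(e) * u(e) over all edges (must be positive).
    denominator = sum(length[e] * capacity[e] for e in capacity)
    return numerator / denominator
```

    For two nets that can each use either of two unit-capacity edges, the choice l(e) = 1 on both edges gives a bound of 1, which matches the optimum in which each net takes its own edge.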

    Theorem 8.2 If there exists a solution for the fractional global routing problem with maximum relative

    congestion smaller than 1, the algorithm finds a ρ-approximation in at most

    \[ 1 + \frac{1}{\epsilon' \hat{\lambda}^*} \log_{1+\epsilon'}\!\left( \frac{m}{1 - \epsilon'} \right) \]

    phases, if ρ = 1/(1 − ε′)^3 and ε′ = ε(1 + γε). Moreover, the variables ye, e ∈ E, converge to the solution

    of the dual linear program.
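
    The approximation scheme of Figure 14 can be sketched in code under simplifying assumptions. The sketch below is restricted to two-pin nets, for which a minimal Steiner tree is simply a shortest path, and uses unit wire widths. The default value of δ, the function names and the graph representation are assumptions of the example and are not taken from [3].

```python
import heapq

def edge_key(u, v):
    """Canonical key for an undirected edge."""
    return (u, v) if u <= v else (v, u)

def shortest_path(adj, length, src, dst):
    """Dijkstra's algorithm; returns the path as a tuple of edge keys."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist[u]:
            continue
        for v in adj[u]:
            nd = d + length[edge_key(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while node != src:
        path.append(edge_key(prev[node], node))
        node = prev[node]
    return tuple(path)

def fractional_routing(capacity, nets, eps=0.1, gamma=1.0, delta=None):
    """Fractional routing of two-pin nets; returns net index -> {path: flow}."""
    capacity = {edge_key(u, v): c for (u, v), c in capacity.items()}
    adj = {}
    for u, v in capacity:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    if delta is None:
        delta = 1.0 / (10 * len(capacity))     # assumed default, not from [3]
    length = {e: delta / capacity[e] for e in capacity}            # step 1
    x = {i: {} for i in range(len(nets))}                          # step 2
    current = [None] * len(nets)                                   # step 3
    theta = [0.0] * len(nets)
    phases = 0
    while sum(length[e] * capacity[e] for e in capacity) < 1.0:    # step 4
        phases += 1
        for i, (src, dst) in enumerate(nets):                      # step 5
            tree = current[i]
            if tree is None or sum(length[e] for e in tree) > (1 + gamma * eps) * theta[i]:
                tree = shortest_path(adj, length, src, dst)        # step 7
                current[i] = tree
                theta[i] = sum(length[e] for e in tree)            # step 8
            x[i][tree] = x[i].get(tree, 0) + 1                     # step 9
            for e in tree:                                         # step 10
                length[e] *= 1 + eps / capacity[e]
    # Scale back by the number of phases so that each net ships one unit.
    return {i: {t: c / phases for t, c in x[i].items()} for i in x}
```

    Each phase adds one unit of flow per net, so dividing by the number of phases yields a fractional solution in which the flows of every net sum to one.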

    9 Hierarchical methods

    The first prominent hierarchical method for routing was proposed by Burstein and Pelavin in [8].

    This method recursively divides the routing regions into successively smaller sub-regions, and nets at

    each hierarchical level are routed simultaneously and refined in the subsequent levels. The sub-regions

    are referred to as super cells, compared to the global routing cells (GRC) defined in the global routing

    formulation. This is a systematic divide-and-conquer approach and therefore transforms the large and

    complicated global routing problem into a series of smaller and simpler sub-problems. As a result, this

    method is inherently faster than flat routing methods. In fact, this approach is quite versatile and can

    be applied to detailed routing as well. Moreover, it reduces the problem of net-ordering in sequential

    routing since higher level decisions are used to guide the solution at lower levels of the hierarchy. While

    this approach does not entirely remove the net ordering issue, this is an issue of diminished importance

    in comparison with flat approaches. The simplicity of this method carries with it limitations on the

    quality of the routing solution, since a routing decision at one hierarchical level neglects the requirements

    at subsequent levels and may make a choice that is suboptimal for those levels. Thus, a hierarchical

    method is often combined with other global routing techniques, such as rip-up-and-reroute.

    Hierarchical methods form a rich family of global routing approaches and numerous works [8, 14,

    32, 33, 41, 51, 53, 59, 62–64, 69, 70, 89] have been reported. These methods can proceed in a top-down,

    bottom-up or hybrid manner and have been applied to gate array/sea of gates, standard cell and custom

    design. Typical partitioning methods may proceed by bisection or quadrisection, and may be either

    uniform or non-uniform. In this section, we will describe some major representative works in this family

    of methods.

    9.1 Top down successive refinement

    The original method proposed by Burstein and Pelavin in 1983 was applicable to gate arrays, switchbox

    routing and channel routing. The method is based on the assumption of uniformity of the wiring

    substrate, a restriction removed by subsequent works. In addition to minimizing wire congestion, this

    method tries to minimize via congestion and wire length in terms of the number of grid cells traversed

    by the wires.

    Figure 15: An example of routing by top-down hierarchical refinement, panels (a)-(d).

    As shown in Figure 15, the routing region is recursively bisected into smaller super cells, and at each

    level, the routing is performed in terms of the super cells at that level. In the example of Figure 15, the

    routing at each level is represented by the shaded super cells for a 3-pin net. This process is performed

    in a top-down manner until the super cells reduce to the actual grid cells for global routing.

    The initial routing is trivial and the critical part of this algorithm is related to refining the routing

    solution from one hierarchical level (a 1×N super cell) to a lower level (a 2×N (super) cell) after each

    bisection. For the example in Figure 15, a routing solution for the set of 1 × 4 cells that lie inside the

    thickened box in (c) is refined to a routing solution for the 2 × 4 cells inside the thickened box in (d)

    in the next step. Thus, the critical problem is to determine routes for multiple nets in a 2 × N cell to

    minimize congestion and wire length. Burstein and Pelavin proposed two heuristics to solve the 2 × N

    routing problem: (1) divide-and-conquer, and (2) dynamic programming.

    Figure 16: An example of divide and conquer in 2 × N routing, panels (a)-(f).

    The divide-and-conquer heuristic is illustrated in the example in Figure 16. The 2 × N (N = 8)

    routing problem in Figure 16(a) is first reduced to a 2 × 2 routing problem and solved as in Figure

    16(b). Next, the cells are partitioned in halves and two pseudo pins (black dots in Figure 16(c)) are

    introduced at the two neighboring cells across the cutline and on the routing path of solution in (b).

    Now the problem is reduced to routing a 2 × N1 (N1 = 4) and a 2 × N2 (N2 = 4) problem, where

    N1 +N2 = N . The two new sub-problems can be recursively solved as a series of 2×2 routing problems

    as shown in Figure 16(d) – (f).

    Figure 17: An enumeration of all patterns for 2 × 2 routing.

    Through this divide-and-conquer process, the 2 × N problem is reduced to a series of 2 × 2 routing

    problems, each of which is relatively simple since all of the possible routing patterns can be enumerated

    easily as shown in Figure 17. Therefore, the problem of routing each net in a 2 × 2 grid is equivalent to

    choosing a routing pattern for each net among the patterns in Figure 17. Burstein and Pelavin formulate

    this problem as an integer programming problem to minimize overflow and maximize wiring and via slacks;

    the wiring slack is defined as the available wiring tracks on a grid boundary minus the number of wires

    across this boundary, while the via slack is the number of via spaces in a grid cell minus the number of

    bends in this grid cell. The integer program is first relaxed and solved as a linear program, after

    which the integer solution is obtained by rounding the fractional solutions.

    Figure 18: Examples of different tree configurations (T1 to T4) representing optimal subsolutions for a dynamic programming based solution to the problem of 2 × N routing.

    The second approach that was proposed is based on dynamic programming. For a set of 2×N cells,

    the procedure starts with the leftmost column k, with a terminal cell containing a pin, and grows a tree

    towards the right by incrementing k by one column at a time until the Nth column is reached. At each

    step, there is a set of optimal subsolutions of partial trees that are maintained for the 2 × k cells to

    the left. These trees can be enumerated as:

    • T^1(k): a minimum cost tree connecting all of the terminal cells in the left k columns to cell g(1, k).

    • T^2(k): a minimum cost tree connecting all of the terminal cells in the left k columns to cell g(2, k).

    • T^3(k): a minimum cost tree connecting all of the terminal cells in the left k columns to both g(1, k) and g(2, k).

    • T^4(k): two disjoint minimum cost trees T∗ and T∗∗, with g(1, k) ∈ T∗ and g(2, k) ∈ T∗∗. Each terminal cell is in either T∗ or T∗∗.

    An example of a set of such trees is shown in Figure 18. When the trees grow from column k to k+1,

    a new set of trees T^i(k + 1), 1 ≤ i ≤ 4, is obtained from T^i(k), 1 ≤ i ≤ 4, and the minimum cost tree of

    each type is stored for each value of k. The dynamic programming property arises from the fact that the

    optimal subsolution for a value of k can be created from the optimal subsolution trees for column k−1.
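
    A minimal sketch of such a column-by-column recurrence is given below. It computes only the cost of the best tree, not the tree itself, and the recurrences are one plausible reading of how the four tree types extend from column k to column k + 1; they are not taken from [8]. Rows are indexed 0 and 1 rather than 1 and 2, the edge cost arrays h and v are hypothetical inputs, and the sweep stops at the rightmost column that contains a terminal rather than at column N.

```python
import math

def min_tree_cost_2xn(h, v, terminals):
    """Minimum cost of a tree connecting all terminal cells of a 2 x N grid.

    h[r][k]   -- cost of the horizontal edge between columns k and k+1 in row r
    v[k]      -- cost of the vertical edge between the two cells of column k
    terminals -- set of (row, column) pairs with row in {0, 1}
    """
    cols = [k for _, k in terminals]
    first, last = min(cols), max(cols)

    def only_row(value, row, k):
        # A tree may leave out the other cell of column k only if that cell
        # is not a terminal; otherwise this option is infeasible.
        return value if (1 - row, k) not in terminals else math.inf

    # Leftmost terminal column: t1/t2 reach one cell, t3 both, t4 is a split.
    t3 = v[first]
    t1 = min(only_row(0.0, 0, first), t3)
    t2 = min(only_row(0.0, 1, first), t3)
    t4 = 0.0
    for k in range(first + 1, last + 1):
        h0, h1 = h[0][k - 1], h[1][k - 1]
        # Type 3: both cells of column k are in a single tree.
        n3 = min(t1 + h0 + v[k], t2 + h1 + v[k],
                 t3 + h0 + h1, t4 + h0 + h1 + v[k])
        # Types 1 and 2: extend along one row, or fall back to type 3.
        n1 = min(only_row(t1 + h0, 0, k), n3)
        n2 = min(only_row(t2 + h1, 1, k), n3)
        # Type 4: two disjoint trees, one reaching each cell of column k.
        n4 = min(t4 + h0 + h1, t1 + h0, t2 + h1)
        t1, t2, t3, t4 = n1, n2, n3, n4
    return min(t1, t2)
```

    With unit edge costs, a net with terminals in the top-left and bottom-right cells of a 2 × 4 grid needs three horizontal edges and one vertical edge, for a cost of 4. Raising the cost of congested vertical edges steers the tree towards the split configuration of type 4.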

    The above 2 × N grid routing procedure is performed for each net sequentially with the wiring cost

    dynamically updated according to the changes in congestion. After all of the nets have been routed,

    each net is ripped up and rerouted by repeating the same procedure, but using the complete congestion

    information. The computational complexity for the presented algorithms is O(k(p + q) log(pq)) for k

    nets to be routed on a p × q grid.

    9.