1351IEEL TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 13, NO 11. NOVEMBER 1994
Closing the Gap: Near-OptimalSteiner Trees in Polynomial Time
Jeff Griffith, Gabriel Robins, Member, IEEE, Jeffrey S. Salowe, and Tongtong Zhang
Abstract-The minimum rectilinear Steiner tree (MRST) prob-lem arises in global routing and wiring estimation, as well asin many other areas. The MRST problem is known to be NP-hard, and the best performing MRST heuristic to date is theIterated 1-Steiner (IIS) method recently proposed by Kahng andRobins. In this paper, we develop a straightforward, efficient im-plementation of IIS, achieving a speedup factor of three orders ofmagnitude over previous implementations. We also give a parallelimplementation that achieves near-linear speedup on multipleprocessors. Several performance-improving enhancements enableus to obtain Steiner trees with average cost within 0.257 ofoptimal, and our methods produce optimal solutions in up to90t/' of the cases for typical nets. We generalize 11S and itsvariants to three dimensions, as well as to the case where allthe pins lie on k parallel planes, which arises in, e.g., multi-layer routing. Motivated by the goal of reducing the runningtimes of our algorithms, we prove that any pointset in theManhattan plane has a minimum spanning tree (MST) withmaximum degree 4, and that in three-dimensional Manhattanspace every pointset has an MST with maximum degree of 14(the best previous upper bounds on the maximum MST degreein two and three dimensions are 6 and 26, respectively); theseresults are of independent theoretical interest and also settle anopen problem in complexity theory.
I. INTRODUCTION
T HE MINIMUM rectilinear Steiner tree problem is centralto VLSI physical design phases such as global routing
and wiring estimation, where we seek low-cost topologies tointerconnect the pins of signal nets [30], [37]:
The Minimum Rectilinear Steiner Tree (MRST) prob-lem: Given a set P of n points, find a set S of Steiner points
such that the minimum rectilinear spanning tree (MST) over
P U S has minimum cost.The cost of a tree edge is the Manhattan distance between
its endpoints, and the cost of a tree is the sum of its edgecosts. The terms points and pins are synonymous and areused interchangeably, depending on the context (a net is aset of pins). Fig. I shows an MST and an MRST for the samefour-pin net.
Research on the MRST problem has been guided by severalfundamental results. First, Hanan [211 has shown that there
Manuscript received July 27, 1993; revised May 27, 1994. This work wassupported in part by NSF Young Investigator Award MIP-9457412 (awardedto G. Robins) and by NSF grants MIP-9107717 and CCR-9224789. This paperwas recommended by Associate Editor M. Sarrataideh.
I Griffith is with the Department of Computer Science, University ofTennessee, Knoxville, TN 37996-1301 USA.
G. Robins and T. Zhang are with the Department of Computer Science,University of Virginia, Charlottesville, VA 22903-2442 USA
J. S. Salowe is with Questecb, Inc., Falls Church, VA 22043 USA.IEEE Log Number 9404037.
Fig. 1. A minimum spanning tree (left) and MRST (rght) for a fixed net;hollow dots represent the original pointset P, while solid dots represent theset S Of added Steiner points.
Fig. 2. Hanan's theorem: There always exists an MRST with Steiner pointschosen from the intersections of all the horizontal and vertical lines passingthrough all the points.
always exists an MRST with Steiner points chosen from theintersections of all the horizontal and vertical lines passing
through all the points in P (see Fig. 2), and this result wasgeneralized by Snyder [42] to all higher dimensions. However,a second major result by Garey and Johnson [16] establishesthat despite this restriction on the solution space, the MRSTproblem remains NP-complete; this has given rise to numerousheuristics as surveyed by Hwang, Richards, and Winter [27].
In solving intractable problems, we often seek provablygood heuristics having bounded worst-case error from optimal.
Thus, a third important result is the discovery by Hwang [25]that the rectilinear MST is a fairly good approximation to the
MRST, with a worst-case performance ratio of cnst(Ts ) <cost(MRSTI) -
2. This implies that any MST-based strategy that improvesupon an initial MST topology will also enjoy a performanceratio of at most 3. This has prompted a large number of Steiner
tree heuristics that resemble classic MST construction methods[23], [24], [26], [31], [32], all producing Steiner trees with
average cost 7 to 9% smaller than the MST cost [27].Unfortunately, all MST-based MRST constructions were
recently shown by Kahng and Robins [29] to have a worst-
case performance ratio of exactly 2. This negative result
has motivated research into alternate schemes for MRSTapproximation, with the best performing among these beingthe Iterated I-Steiner (IIS) algorithm [28], [38]. 11S always
0278-0070/94$04.00 4) 1994 IEEE
-0-
N �
I
1352 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN I
performs strictly better than 3 times optimal,1 and also per-
forms very well in practice, achieving almost 11% average
improvement over MST cost, which is on average less than
0.5% away from optimal [40]. The Iterated 1-Steiner methodwas generalized to arbitrary weighted graphs by Alexander
and Robins [1], [2], and is thus a suitable basis of a practical
global router that must handle congestion, obstacles, etc. [15].
The performance success of 11 S was achieved at the expense
of a high time complexity: although a more efficient variant ofIIS can be implemented to run within time 0(o2 log ol) [28],
the computational geometric methods employed to achieve
this time bound hide large constant factors and are also
difficult to code. Thus, actual previous implementations of 11 S
typically use a more straightforward approach that requirestime 0(i 4 logn).
The first contribution of this paper is a practical implemen-
tation of IIS that runs within time 0(n3 ). Our method is based
on a dynamic minimum spanning tree update scheme and es-
tablishes the practicality and viability of the iterated I -Steiner
approach. For 100 points, our new implementation is about
three orders of magnitude faster than the naive implementation,and the speedup over the naive implementation increases with
the number of points. This dramatic improvement in speed
enables for the first time the testing of IIS on nets containing
several hundred pins.Since a typical CAD environment consists of a network of
workstations, exploiting the available large-grain parallelismprovides a natural and effective means of improving the
performance of CAD algorithms. With this in mind, a second
contribution of our work is a parallel version of 11S that
achieves high parallel speedups. Since Steiner tree constructionis a computationally expensive part of global routing, our
parallel implementation may be viewed as an important firststep toward obtaining a "Steiner engine," i.e. an efficient toolfor producing near-optimal Steiner trees.
Our third contribution entails several performance-
improving enhancements to the IIS method. Our methodsrely on an approach that deviates from pure greed and
instead employs randomness in order to break ties during
Steiner point selection. Although asymptotically no slower
than the original IIS variants, our methods afford improved
average performance. Extensive simulations indicate that for
uniformly distributed random nets of up to 8 pins, the average
performance of our enhanced IuS algorithm is only 0.25%
away trom optimal. Moreover, for 8-pin nets, our method
produces the optimal Steiner tree in 90% of all instances.We also propose a method of improving performance at the
expense of running time, allowing a smooth tradeoff betweensolution quality and efficiency.
1 Recently, Berman and Ramaiyer [7] and Foessmneier, Kaufmann, and
Zelikovsky [6], 1131 have extended the fundamental work of Zelikovsky [461,[471 to yield a method similar to 0S (specifically, to the 'batched" 11S method
descnbed below) with performance ratio bounded by 'S; this work settles
in the affirmative the longstanding open question of whether there exists
a polynomial-time rectilinear Steiner tree heuristic with performance ratio
strictly smaller than 3 [25]. At the time of this writing, Berman. Foessmeier,Karpinski, Kaufmann, and Zelikovsky [ 131 further improved the performance
bound of their polynomial-time rectilinear Steiner heuristic to Lt = 1 271.
OF INTEGRATED CIRCUITS AND SYSTEMS, VOi 13. NO. I1, NOVEMBER 1994
Next, we generalize 1lS and its variants to three dimensions,as well as to the intermediate case where all pins lie on
A, parallel planes. This formulation has several applications,including multi-layer routing [9], [20], [22] and the design of
buildings [41]. Empirical testing suggests that this approachis effective for three-dimensional Steiner routing, yielding
up to 15% average improvement over MST cost in three-
dimensional Manhattan space.Finally, in order to reduce the running time of the dynamic
MST-maintenance component of our algorithms, we prove the
following results under the Manhattan metric: I) every two-dimensional pointset has an MST with maximum degree of
at most 4; and 2) every three-dimensional pointset has an
MST with maximum degree of at most 14 (the best previously
known bounds for two and three dimensions were 6 and 26,
respectively). Our results and algorithms on degree-bounded
minimum spanning trees are of significant independent theoret-
ical interest [39] and settle several open issues in complexitytheory.
2
The rest of the paper is organized as follows. In SectionII we review the IHS method. Section III outlines our more
efficient implementation of u1S, and Section IV describes a
variant of IIS with enhanced performance. Section V gen-eralizes IIS to three dimensions and proposes an efficient
method for three-dimensional MST-maintenance. Section VI
proves tight bounds for the maximum MST degree in two and
three dimensions under the Manhattan metric and discusses the
implications of our results to a problem in complexity theory.Section VII outlines the parallel implementation, and SectionVIII presents extensive simulation results regarding perfor-mance, running times, and parallel speedups. We conclude in
Section IX with directions for further research. A preliminary
version of this work appeared in [3] and in [4].
Il. REVIEW OF THE ITERATED I-STEINER METHOD
We begin with a review of the Iterated 1-Steiner method
of Kahng and Robins [28]. For two pointsets P and S,
we define the MST savings of S with respect to P as
AMST(P, S) = cost(MST(P)) - cost(MST(PUS)). We use
H(P) to denote the set of Hanan Steiner point candidates (i.e.,the intersections of all horizontal and vertical lines passingthrough points of P). For a pointset P, a I-Steiner point
x c H(P) maximizes AMST(P, {x}) > 0. The IIS method
repeatedly finds 1-Steiner points and includes them into S. Thecost of the MST over P U S will decrease with each added
point, and the construction terminates when there is no x withAMST(P U S. ax}) > 0. Although a Steiner tree may contain
at most n - 2 Steiner points [18], IIS may add more than
n - 2 Steiner points; therefore, at each step we eliminate any
extraneous Steiner points having degree 2 or less in the MST
over P U S. Fig. 3 illustrates a sample execution of Il S. and
Fig. 4 describes the algorithm formally.Although a single I-Steiner point may be found in 0(n2)
time using complicated techniques from computational geom-
2 Robins and Salowe [39] investigate the maximum MST degree for higher
diiiensions and other L, norms, and relate the iiiaxiiuui MST degree to the
so called "Hadwiger" numbers.
GRIFFITH et al.: NEAR OPTIMAL STEINER TREES IN PO0YNOMIAL TIME
(b)
(d) (e)
Fig. 3. Execution of Iterated 1-Steiner (II S) on a 4-pin net. Note that in step(d) a degree-2 Steiner point is formed, and is eliminated from the topology.
Algorithm Iterated 1-Steiner (IIS) [28]
Input: A set P of n pointsOutput: A rectilinear Steiner tree over P
While T = {x e H(P)l AMST(P U S, {s}) > O} 3 i 0 Do
Find x E T with maximum AMST(P U S, {x})
S - S U {j}Remove from S points with degree < 2 in ATST(P U S)
Output MST(P US)
Fig. 4. The Iterated I Steiner algonthm.
etry [17], [28], such methods suffer from large constants in
their time complexities, and they are notoriously difficult to
implement. Thus, a batched variant of IIS is usually favored
that efficiently adds an entire set of "independent" Steiner
points in a single round, thereby affording both practicality
and reduced time complexity [28], [38].
Following Hanan's result, for each candidate Steiner point
£ E H(P) the Batched 1-Steiner (BIS) variant computes
the induced MST savings AMST(Px{}). Next, we select
a maximal "independent" set of Steiner points, where the
criterion for independence is that no candidate Steiner point
is allowed to reduce the potential MST cost savings of any
other candidate. More formally, a set S of Steiner points is
independent if AMST(P, S) > AEZs AMST(P, {x}). The
weight of set S is AMST(P, S), and our goal is to find an
independent set of maximum weight; the Steiner points in that
independent set are grouped together during a round of B IS.
Using a reduction from the NP-complete problem of finding
a maximum independent set in a graph, it is easy to show that
our maximization problem is NP-complete also, even if the
independent sets obey the "inheritance property," one of the
axioms for matroids (see Cormen et al. [ 11]). Our independent
sets do not necessarily obey the inheritance property or the
exchange property for matroids. Nevertheless, we can use a
greedy approximation, described in Fig. 5, to approximate the
weight of a maximum independent set. Note that the greedy
algorithm would find a maximum independent set if the subset
system was a matroid. This approximation is efficient and
seems to work well in practice; it would he interesting to
prove nontrivial performance bounds.
1353
Algorithm Batched 1-Steiner (BIS) 1281Input: A set P of n pointsOutput: A rectilinear Steiner tree over P
While T ={ p H(P)fc MST(P, {r}) > O} =S 0 DoS-=t0For x E {T in order of non-increasing AMST} Do
If AMST(P U S. {x}) > AAYSi'(P, {j) Then S= U 4}
P =PuSRemove from P Steiner points with degree < 2 in MST(P)
Output MST(P)
Fig. 5. The Batched 1-Steiner (BIS) algorithm.
Once an approximate maximum independent set S is deter-mined, it is inserted into P and we iterate this process withP set to P U S until we reach a round that fails to induce aSteiner point. The total time required by BIS is Q(r0 log n)per round (the number of rounds in practice is a small constantindependent of net size, i.e., less than 3 on average [281). TheBIS algorithm is summarized in Fig. 5.
III. A NEW, FASTER IMPLEMENTATION
In speeding up the MST-savings computations, a key ob-servation is that once we have computed an MST over thepointset P. the addition of a single new point x into P can onlyinduce a small constant number of changes between MST(P)and MST(P U {x}). This follows from the observation thateach point can have at most eight neighbors in a rectilinearplanar MST, i.e. at most one per octant [24]. Thus, to updatean MST with respect to a newly added point x, it sufficesto consider only the closest point to x in each of the eightplane octants with respect to x (below we show that for eachpoint it suffices to examine at most four potential candidatesfor connection in the MST).
Thus, we have the following linear-time algorithm fordynamic MST maintenance: connect the new point x to eachof its potential neighbors (i.e. the closest point to x in each ofthe 0(1) octants around x), then delete the longest edge onany resulting cycle. This dynamic MST maintenance schemereduces the time complexity of each round of BIS fromO(r4 log n) to 0(n'), a substantial savings. An executionexample of this method is given in Fig. 6, and Fig. 7 describesit formally. Note that dynamic MST maintenance can alsobe achieved in sub-linear time [14], but such methods seemimpractical due to their complicated description and largehidden constants. A similar method was also used by Yao [451to obtain a sub-quadratic MST algorithm in higher dimensions,but no attempt was made to optimize the number of necessaryregions, whereas we also optimize the number of regions.
We now show that only four regions suffice for dynamicMST maintenance, namely the four regions defined by thetwo lines oriented at 45 and -45 degrees (Fig. 8(a)); wecall this the diagonal partition. We couch the discussion ingeneral terms so that we can later extend our terminology andtechniques to the three-dimensional case. We begin by definingthe following key property for regions and partitions:
The Uniqueness Property: Given a point p, a region R hasthe uniqueness property with respect to p if for every pair of
I -- t l
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOl. 1, NO. I1. NOVEMBER 1994
l.a (Id
K ½
longest\edge in 4
cyc~e
(d)
Fig. 6 Dynamic MST maintenance, adding a point to an existing MSTentails connecting the point to its closest neighbor in each octant and deletingthe longest edge on each resulting cycle (the Euclidean metric has been usedfor clarity in this example).
Dynamic MST Maintenance (DMSTM)Input: A set P of o points, 51.51 (1), a neo" pointOutput: MST(P L x))T =5fST(P)For i = i to #regions do
Find in region R,(xi) the point p c P closest to a
Add to T the edge tp.)If T contains a enete Then remote from T the longest edge on the cye
Output P
Fig. 7. Linear-time dynamic MST maintenance.
points t, uc E R, either dist(w, u) < dist(w.p) or dist(t'. ti;) <
dist(u, p).
We use dist(u, w) to denote the Manhattan distance between
the two points u and w. A partition of space (i.e., into a
finite set of mutually disjoint regions whose union covers
the space) is said to have the uniqueness property if each of
its regions has the uniqueness property. Clearly, any partition
scheme having the uniqueness property can be used in dynamic
MST maintenance, since each region having the uniqueness
property can contain at most one candidate for connection
in the MST. Naturally, simpler partition schemes (i.e., ones
containing a smaller number of regions) are preferable to more
complicated ones. We now prove that the diagonal partition
has the uniqueness property.
Lemma 3.1: Given a point p in the Manhattan plane, each re-
gion of the diagonal partition with respect to p has the uniqueness
property.
Proof The two diagonal lines through p partition the
plane into four disjoint regions RI through R 4 (see Fig. 8(a)).
The boundary points between two neighboring regions may
be arbitrarily assigned to either region. Consider one of
the four regions, say RI, and let a, w E R1 (Fig. 8(b)).
Assume without loss of generality that dist(u, p) < dist(w, p)
(otherwise swap u and w in this proof). Consider the diamond
D in R1 with one corner at p, and with u on the boundary
of D (see Fig. 8(c)). Let c be the center of D, so that c is
equidistant from all points on the boundary of D, and let a
ray starting at p and passing through lt intersect the boundary
of D at the point b. By the triangle inequality, dist(w, u) <
dist(w, b) + dist (bc) + dist(c, u) = dist(w.b) + dist(b,c)
+ dist(c, p) = dist(w, p). Thus, w is not closer to p than it is
to u, and the region R1 therefore has the uniqueness property.
The other three regions are handled similarly. It follows that
the diagonal partition has the uniqueness property. El
A natural question is how to determine an optimal parti-
tioning scheme for a given dimension and metric, i.e., finding
a partition scheme that contains the smallest possible number
of regions, yet still possesses the uniqueness property. The
existence of pointsets in the Manhattan plane where the MST
is forced to have degree 4 (i.e., the five points corresponding
to the center and four corners of a diamond) establishes the
optimality of the diagonal partition, in the sense that no
partition of the Manhattan plane into less than four regions
can have the uniqueness property. Section V addresses the
problem of finding an optimal partition for three-dimensional
Manhattan space.We have shown above that in the Manhattan plane. the
degree of any particular single MST node can be made to
be 4 or less. But note that this does not imply that the degrees
of all nodes can be made 4 or less simultaneously, since
decreasing the degree of one node can increase the degree
of neighboring nodes. Thus, it is not immediately obvious
that in the Manhattan plane there always exists an MST with
maximum degree 4; this requires additional proof as detailed
in Section VI below.
IV. A VARIANT WITH IMPROVED PERFORMANCE
At each iteration, the basic IIS heuristic uses pure greed
to select a I -Steiner point, and this may unfortunately pre-
clude additional savings in subsequent iterations. A similar
phenomenon may occur due to tie-breaking among 1-Steiner
candidates that induce equal savings. For example, in Fig. 9
we observe that an unfortunate choice for a 1-Steiner point
can interfere with the savings of future potential 1-Steiner
candidates, resulting in a suboptimal solution.
Empirical tests indicate that ties in MST savings for the
various 1-Steiner point candidates occur very often. Therefore,
in order to avoid breaking ties in ways that would preclude
possible future savings, we propose the following scheme:
when an MST savings tie occurs among a number of 1-Steiner
candidates, rather than using a deterministic tie-breaking rule,
we instead randomly select one of the I-Steiner candidates
and proceed with the execution. We then run this randomized
variant of 11S m times on the same input and select the best
solution (i.e., the least costly of the m trees produced), where
nt is an input parameter.In order to further avoid the pitfalls of a purely greedy
strategy (i.e., getting trapped in local minima), we also propose
a mechanism that allows IIS to select a 1-Steiner candidate
if its MST savings is within 6 units from that of the best
candidate, where 6 is again an input parameter. This strategy
1354
GRIFFITH et t: NEAR OPTIMAL STEINER TREES IN POLYNOMIAL TIME
(a) (b) (c)
Fig. 8. The diagonal partition of the plane into four regions with respect to a point p (a) has the uniqueness property: for every two points u and it, thatlie in the same region (b), either dist(u u) < dist(u .ip) or else dist(it. w) < dist( i,p) (c).
(a)0 0
0 0 0
Fig. 9. An unlucky tie-breaking choice for a I-Steiner point may interferewith the savings of other potential I-Steiner candidates (a). If PI is selectedin the first iteration (b), then the MST savings of both P2 and P3 vanishduring the second iteration, yielding a suboptimal tree of cost 7 (c); on theother hand, if P2 is selected in the first iteration, then P3 may be selected inthe second iteration, yielding an optimal tree of cost 6 (d).
Fig. 10. The Enhanced Iterated k-Steiner (ElkS) method.
would enable the acceptance of a slightly suboptimal choice(with respect to the best immediate possible savings), with thepossibility of realizing greater savings in future iterations.
Finally, we note that performance may be further improvedif instead of looking for individual 1-Steiner points, we searchfor pairs of Steiner candidates that offer maximum savingswith respect to other candidates or pairs of candidates. Forexample, such an Iterated 2-Steiner algorithm (12S) will op-
timally solve the pointset of Fig. 9. Combining these threetechniques of 1) nondeterministic tie-breaking, 2) near-greedysearch, and 3) k-Stetner selection, we obtain a new EnhancedIterated k-Steiner (ElkS) algorithm, as shown in Fig. 10.
Note that the original IIS algorithm of Kahng and Robins[28] (see Fig. 4) is equivalent to our new ElkS algorithmwith k = 1. rn = 1, and 6 = 0. Our ElkS scheme can alsobe extended using a '"noninterfering" criterion as in [28], to
yield an enhanced batched k-Steiner (EBkS) algorithm, wherea maximal number of Steiner points are added during eachround. The time complexity of EBkS is 0(m n 2(k-1) 1T( ,where T(n) is the time complexity of B13S. For fixed mn, EBISruns asymptotically within the same time as BIS, namely0(n 3 ) per round. The EBkS method improves the quality ofthe solutions at the expense of running time, allowing a smoothtradeoff between performance and efficiency. Although EBkSis guaranteed to always yield optimal solutions for <A: + 2pins, its time complexity increases exponentially with k: thus,tin order to remain within polynomial time, k must be fixed.
Finally, it was observed empirically that only a smallfraction of the Hanan candidates have positive MST savingsin a given round of BIS; moreover, only candidates withpositive MST savings in an earlier round are likely to producepositive MST savings in subsequent rounds. Therefore, ratherthan examine the MST savings of all Hanan candidates ina given round, we may consider only the candidates thatproduced positive savings in the previous round. The empiricalsimulations described in Section VIII indicate that this strategysignificantly reduces the time spent dunng a round, withoutdegrading solution quality. We call this streamlined version ofBIS the modified batched 1-Steiner (MB IS) algorithm.
V. STEINER ROUTING IN THREE DIMENSIONS
Three-dimensional packaging is emerging as a viable VLSIdesign technology [9], [20], [22]; however, most existing CADrouting tools and techniques still implicitly address two dimen-sions only. In contrast, the ElkS method readily generalizesto arbitrary dimensions. We distinguish between the generalthree-dimensional version of the Steiner problem and the lessgeneral (but more realistic) multi-layer formulation, where thepoints of P lie on L parallel planes. Note that the unrestrictedthree-dimensional version of the Steiner problem occurs inthe limit when L = on, and the standard two-dimensionalformulation is the case [ = 1. The cost of routing between onelayer and another (i.e., using vias) is likely to be substantiallyhigher than staying on the same single layer (i.e., in terms ofmanufacturing expense, signal propagation delay, etc.), andthis may be modeled by varying the distance between thelayers. In Section VIII, we present simulation data for severalcombinations of values for L and n.
Algorithm Enhanced Iterated k-Steiner (ElkS)Input: A set P of n points, parameters 6i >0, k > 1. and to > 1Output: A rectilnear Steiuer tree over 'T -MST(P)Do mn times
S-= 0WhileC= fX C H(P) I XI < k,AMST(PLS,X) >}# 7i0 Do
Find Y E C with maximom AMST(P u S. Y)Randomly sele-if E C with AMST(P U S, .) > AMST(1'uS S. )S =SUZRemove from S points with degree < 2 in MST(P u Si
If cos(MST(IP J S) < cost(T) Then T =MST(P U S)Output T
1 355
PI
CLT�_�n
PI(0 Fo_�� 0 P2 PI
1356 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN
Our three-dimensional ElkS method may be implementedefficiently by using Snyder's [42] generalization of Hanan'stheorem to higher dimensions. In particular, there always existsan optimal Steiner tree whose Steiner points are chosen fromthe O(ro) intersections of all orthogonal planes (i.e., planesparallel to the coordinate axis) passing through all points inP. The three-dimensional analog of Hwang's result suggeststhat the maximum MST/MRST ratio for three dimensions is atmost 3 (this is a consequence of a more general conjecture forhigher dimensions [19]), although there is currently no knownproof of this. An example consisting of six points located in themiddles of the faces of a rectilinear cube establishes that I is
a lower bound for the worst MST/MRST performance ratio inthree dimensions. Thus, we expect the average performance ofour heuristics, expressed as percent improvement over MST,to be higher in three dimensions than it is in two dimensions;this is indeed confirmed by our experimental results in SectionVIII. Also as expected, the average performance improves asthe number of layers L increases.
As noted above, once we have computed an MST over apointset P, the addition of a single new point p into P canonly induce a small constant number of topological changesbetween MST(P) and MST(P U {p}). This follows from thefact that in a fixed dimension, each point can have at most0 (1) neighbors in a rectilinear MST, as noted above in SectionIII. Thus, MST savings in three dimensions may be efficientlycalculated by partitioning the space with respect to the newpoint p into 0(1) mutually disjoint regions Ri(p) having theuniqueness property, namely that only the closest point to pin each region Ri(p) may be connected to p in the MST(P).This implies that in three-dimensional Manhattan space, lineartime suffices to compute the MST savings of each 1-Steinercandidate.
Using similar arguments to those of Lemma 3.1, we canpartition three-dimensional Manhattan space into 14 regions,with the uniqueness property holding for each region. Sucha partition corresponds to the faces of a truncated cube(Fig. I I (a)), i.e., a solid obtained by chopping off the cornersof a cube, yielding six square faces and eight equilateral trian-gle faces (Fig. 1(b)): this solid is known as a "cuboctahedron"[34]. The 14 solid regions of this partition are induced by the14 faces of the cuboctahedron, namely the six pyramids withsquare cross-section (Fig. I l(c)) and the eight pyramids withtriangular cross-section (Fig. I1(d)). Again, points located onregion boundaries may be arbitrarily assigned to any of theadjacent regions. We call this particular partition of spacethe cuboctahedral partition, and we refer to the two types of
induced regions as square pyramids and triangular pyramids,
respectively. We now show that the cuboctahedral partitionhas the uniqueness property.
Theorem 5.1: Given a point i) in three-dimensional space
under the Manhattan metric, each of the 14 regions of the cuboc-tahedral partition of space with respect to p has the uniquenessproperty.
Proof: To show the uniqueness property for the squarepyramids, consider one of the square pyramids R with respect
to p (Fig. II(c)), and let isw E R. Assume without lossof generality that dist(u,p) < dist(w.p) (otherwise swap the
OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 13, NO 11, NOVEMBER 1994
roles of u and w in this proof). Consider the locus of points
D c R that are distance dist(u.p) from p. (Fig. I1(e)); Dis the upper half of the boundary of an octahedron. Let cbe the center of the octahedron determined by D. so that e
is equidistant from all points of D. Let b be the intersectionof the surface of D with a ray starting from p and passingthrough w. By the triangle inequality, dist(w, u) < dist(w, b)+ dist(b. c) + dist(c, o) = dist(rw, b) + dist(b, c) + dist(cp) =dist(w.p). Thus, iw is not closer to p than it is to u, andtherefore the region R has the uniqueness property. The othersquare pyramids are handled similarly.
To show the uniqueness property for the triangular pyra-mids, consider one of the triangular pyramids R with respectto p (Fig. 1 1(d)), and let u. C R. Assume without loss of
generality that dist(a.p) < dist(w,p) (otherwise swap the
roles of u and in). Consider the locus of points D in Athat are distance dist(up) from p (Fig. 11(f)). Let b be theintersection of D with a ray starting from p and passing
through w. By the triangle inequality, dist(ii, o) < dist(w. b)
+ dist(bu) < dist(wbl) + dist(Lp) = dist(wp). Thus, w
is not closer to p than it is to u and therefore the region R
has the uniquenes property. The other triangular pyramids arehandled similarly. H
We thus have the following:Corollary 5.2: Given a poin/set P in three-dimensional Man-
hattan space and an additional new point p, there exists an MSTover P U {p} where p has degree of at most 14 in the MST.
Proof: The cuboctahedral partition of space with respectto p yields 14 regions, each possessing the uniqueness prop-
erty. This implies that for any two points inside a region, oneis closer to the other rather than to the origin; thus, only onepoint inside each region is a viable candidate for an MSTneighbor of p. Therefore, the degree of p (or any other singlepoint) in the MST can be made 14 or less. C:
It is still an open question whether for three dimensions thecuboctahedral partition is optimal (i.e., whether the cuboctahedral partition has a minimum number of regions among allpartitions having the uniqueness property); we conjecture thatit is. On the other hand, we can show that 13 is a lower boundon the maximum MST degree in three-dimensional ManhattanSpace:
Theorem 5.3: There are three-diimensional pain/sets for whichthe maximum degree of any MST is at least 13.
Proof.' Consider the pointset P {(0. 0. 0),(+100 .0 0), (0 100. 0), (0. 0. ±100), (47. 4, 49),
(-6. -49.15), (-49,8,43), (-4.47, -49), ( 49-6. -45),(8, -49, -43), (49. 49. 2)1.
The distance between every point and the origin is exactly100 units, but the distance between any two non-origin pointsis strictly greater than 100 units. Therefore, the MST overP is unique with a star topology (i.e., all 13 points mustconnect to the origin in the MST), and thus the origin pointhas degree 13 in the MST. El
A remaining open question is whether there exists a three-dimensional pointset where the maximum MST degree isforced to be as high as 14. Given that each point can connectto at most 14 neighbors in the MST, we obtain the fol-lowing linear-time algorithm for dynamic MST maintenance
1357GRIFFITH et a: NEAR-OPUIMAL STEINER TREES IN POLYNOMIAL TIME
(a) (b)
(c) (d)
* wo -.
......... , ll......:\ :4,
(e) (f)
Fig. I1. A truncated cube (a)-(b) induces a three-dimensional cuboctahedral space partition where each region has the uniqueness property. The 14induced regions consist of six square pyramids (c) and eight triangular pyramids (d). Using the triangle inequality, the uniqueness property may be
shown to hold for each region (e)-(f).
(DMSTM) in three dimensions: connect the new point in VI. ON THE MAXIMUM MST DEGREE
turn to each of its < 14 potential neighbors, then deletethe longest edge on each resulting cycle. This is a three- As noted above, Lemma 3.1 does not imply that in the
dimensional analog of the two-dimensional method given in Manhattan plane the maximum MST degree is 4, nor does
Figs. 6 and 7. Corollary 5.2 imply that the maximum MST degree in three
IEEE TRANSACTIONS ON COMPUTER AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS. VOL. 13. NO. 11, NOVEMBER 1994
Algorithm Parallel Iterated -Steiner (PI1S)Input: A set P ole go its (assuming p processors available)Output: A Sr uilineas Stenes iree tor PMaster Process:H -0Do Forever
For i =tI to p D.Send the i" pie-sor P U S and Offl different H.-an candidates frto H(P)
Fosi- I topD.Receive Ioo bhe 7th grocessoi the candidate ,i e ith the largest \MST
Let a be th h whith thhe Iargest AMMST(P Li SI (hi) to oal 1 I< < pIf AMST(f S,. {}) > 0 Then
Remove fern S points with degree < 2 in MUS el U S)Else Output MST(P S anid Stop
Slave Process:Receive fern the nster pie.e.s P J S aed s of It a .n .ordidoies hl... /1hReturn th candidates h, sith tie largest AITtST(P US.}hi)
Fig. 12. The Parallel Iterated 1-Steiner algorithm. The master process sendseach one of p slave processors 1 of the Hanan candidates: a slave processthen determines the candidate with the best AMST and returns it to the masterprocess. Out of all the p candidates received, the master process then usesthe candidate with the highest MST savings for inclusion into the pointset.To further increase the parallel execution speed, slower processes are sentsmaller sets of candidates, in proportion to their observed speed (for clanty,this simple load-balancing scheme is not shown in the above template).
dimensions is 14, since reducing the MST degree of any one
point may increase the MST degree of other points. It turns
out, however, that ties for connection during MST construction
may always be broken appropriately so as to keep the overall
MST degree low. We begin by defining a strict version of the
uniqueness property:
The Strict Uniqueness Property: Given a point p, a space
region R has the strict uniqueness property with respect to
p if for every pair of points u,7'tf E R, either dist(wv, u) <
dist(w, p) or dist(u, w) < dist(u, p).
Note that if a region has the strict uniqueness property it
also has the (nonstrict) uniqueness property. Clearly each d-
dimensional region satisfying the strict uniqueness property
may contribute at most 1 to the maximum MST degree. We
now prove that by breaking ties judiciously, the maximum
MST degree can be made no larger than the number of d-
dimensional regions in a partition having the strict uniqueness
property:
Theorem 6.1: Given a partition of d-dimensional space intor regions, and given that only r' < r of these regions are d-dimensional and have the strict uniqueness property (the rest ofthe r -r' regions being lower-dimensional, and are not requiredto have the uniqueness property), then the maximum MST degreein this space is r' or less.
Proof: Given a pointset P. perturb the coordinates of
each point by a tiny amount so that the lower-dimensional
regions with respect to each point do not contain any other
points. This is always possible to do, and yields a new
perturbed pointset P'. Clearly, interpoint distances in P' only
differ by a tiny amount from the corresponding interpoint
distances in P, and therefore the cost of the MST's over
P' and P can differ by only a similarly tiny amount, which
can be made arbitrarily small. But the MST over P' has
maximum degree r', since only the r' d-dimensional regions
of the partition are nonempty with respect to the points of P'.
We now use the topology of the MST for P' to connect the
corresponding points of P; this would induce an MST over P
having maximum degree r'. H
1358
Applications of this technique to 2 and 3 dimensions areimmediate:
Corollary 6.2: Every pointser in the Manhattan plane has anMST with maximum degree 4.
Proof Modify the diagonal partition into a strict di-agonal partition having a total of eight regions: four two-dimensional open wedges (i.e., not containing any of theirown boundary points), and four one-dimensional rays (i.e.,the boundaries between the wedges). By arguments similar tothose of Lemma 3. 1, each of the open wedges possesses thestrict uniqueness property, and thus by Theorem 6.1 pointslying on the boundaries between wedges can be perturbed intothe interiors of the wedges themselves. Thus, the maximumMST degree given such a strict diagonal partitioning schemeis 4. H
The bound of 4 on the maximum MST degree in theManhattan plane is tight, since there are examples that achievethis bound (e.g., the center and vertices of a diamond). Notethat the best previously known upper bound for the maximumMST degree in the Manhattan plane was 6, as was statedwithout proof in [24].
Corollary 6.3: Every pointset in three-dimensional Manhat-tan space has an MST with maximum degree 14.
Proof: Modify the cuboctahedral partition into a strict
cuboctahedral partition having a total of 38 regions: 14 three-dimensional open pyramids (i.e., 8 triangular pyramids and6 square pyramids, each not containing any of their ownboundary points), and 24 two-dimensional regions (i.e.. cor-responding to all the boundaries between the pyramids). Byarguments identical to those of Theorem 5. 1. each of the openpyramids possesses the strict uniqueness property, and thus byTheorem 6.1, points lying on the boundaries between the 14pyramids can be perturbed into the interiors of the pyramidsthemselves. Thus, the maximum MST degree given such astrict cuboctahedral partition scheme is equal to the numberof three-dimensional regions of the cuboctahedral partition, i.e.14. C]
While Theorem 5.3 gives an example that illustrates thatthe maximum MST degree in three-dimensional Manhattanspace can be as large as 13, it is still open whether there existexamples in three-dimensional Manhattan space that force anMST degree of 14. Note that the best previously known boundfor the maximum MST degree in three-dimensional Manhattanspace was 26, as implied by a result from the theory of spherepacking [12), [441.
Our results regarding MST bounds also settle some openquestions in complexity theory, since it is known that the prob-lem of finding a degree-bounded MST is NP-complete, evenwhen the degree bound is fixed at 2 (yielding the TravelingSalesman Problem) or at 3, as shown by Papadimitriou andVazirani [36]. Corollary 6.2 implies that the degree-boundedMST problem in the Manhattan plane is solvable in polynomialtime when the degree bound is fixed at 4 or more, sincewe have shown how to efficiently find an MST that meetsthese maximum degree constraints; this was previously anopen problem. Similarly, Corollary 6.3 implies that the degree-bounded MST problem in three-dimensional Manhattan spaceis solvable in polynomial time when the degree bound is fixed
1359GRIFFITH et al.: NEAR-OPT1IMAL STEINER TREES IN POLYNOMIAL TIME
Fig. 13. An example of the output of Batched I-Steiner (BIS) on a random
300-point set (hollow dots). The Steiner points produced by BIS are denoted
by dark solid dots.
at 14 or more (since we have shown how to efficiently find an
ordinary MST that meets such maximum degree constraints).
Monma and Suri [35] used a similar perturbation argument to
prove that for any poinset in the Euclidean plane, there is an
MST with maximum degree of 5.
VII. A PARALLEL IMPLEMENTATION
Since a typical CAD environment consists of a network of
workstations, exploiting the available large-grain parallelism
provides a natural and effective means of improving the
performance of CAD algorithms. We therefore propose a
parallel implementation of the Iterated 1-Steiner method that
achieves high parallel speedups. The IIS algorithm is highly
parallelizable because each one of p processors can compute
independently and in parallel the MST savings of 0( r)
of the Steiner candidates. In our parallel implementation, all
processors send their best candidate to a master processor,
which selects the best of these candidates for inclusion into the
pointset. This procedure is iterated until no improving candi-
dates can be found (the B IS algorithm parallelizes similarly).
We used the Parallel Virtual Machine (PVM) system [5],
[431 to control remote processors. PVM is a widely available
software package that allows a heterogeneous network of
parallel and serial computers to be used as a single com-
putational resource. The PVM system consists of two parts,
a daemon process, and a user library, providing mechanisms
for initiating processes on other machines and for controlling
synchronization and communication among processes. Using
PVM as a framework for parallelism alleviates the need to
hand-code the synchronization and communication protocols
from scratch; for the specific details on how PVM manages
the underlying system resources, see [5], [431.
Each of the processors in our parallel implementation isa SUN workstation, communicating over an Ethernet. The
master processor sends to the p available processors equal-
sized subsets of the Hanan candidate set H(P); each slave
process then determines from among the candidates that it
received the one with the best AMST and returns the winner
to the master process. Out of all the p candidates received, the
master process then determines the candidate with the highest
MST savings and includes it into the pointset. This parallel
version of the Iterated 1-Steiner algorithm is shown in Fig. 12.
To further increase the parallel execution speed, we im-
plemented a load-balancing scheme in order to mitigate any
significant variations that may exist between the speeds of dif-
ferent processors (due to a heavy CPU load, the underlying ar-
chitecture type, swapping behavior, etc.) To this end, the wall-
clock response time of each processor is tracked. If any indi-
vidual processor is determined to be considerably slower than
the rest, it is henceforth given smaller tasks to perform (i.e., it
is sent less Hanan candidates, in an amount proportional to its
observed speed). If a processor does not complete a task within
reasonable time, it is sent an abort message and its computation
task is reassigned altogether to the fastest idle processor avail-
able. This prevents individual slow (or crashed) processorsfrom seriously impeding the speed of the overall computation.
VIII. EXPERIMENTAL RESULTS
We have implemented both serial and parallel enhanced
versions of the BIS algorithm, using C in the SUN workstation
environment. Our code is available upon request. The serial
B IS heuristic has been benchmarked on up to 10000 instances
of each net size. The instances were generated randomly by
independently choosing the coordinates of each point from a
uniform distribution in a 10000 x 10000 grid; such instances
are statistically similar to the pin locations of actual VLSI nets
and are the standard testbed for Steiner tree heuristics [27]. As
is the convention in the Steiner approximation literature [27],
we evaluate the performance of our method by comparing
the cost of our solutions to the MST cost over the same
inputs. Performance results are summarized in Table I and are
illustrated in Fig. 14(a)-14(c). BIS yields Steiner trees with
cost averaging almost 1% less than the MST cost.3
We timed the execution of the serial and parallel versions
of B I S, using both the naive 0(n' log n) implementation [28]
and our new 0(n 3 ) implementation, which incorporates the
efficient, dynamic MST maintenance as described in Section
III. We observe that the number of rounds required by BIS is
always very small, and may therefore be considered a constant;
for example, when n - 300, the average number of rounds is
only about 2.5, and we never observed BIS to require more
than 5 rounds.The parallel implementation uses nine SUN 4/40 (IPC)
workstations, with a SUN 4/75 (Sparc2) as the master proces-
sor. The improvements in running time are dramatic: for n =
100, our fast serial implementation is 247 times faster than the
naive serial implementation (as measured by CPU time), and
3Recently, other Steiner heunstics with performance approaching that of
IIS were proposed by Borah, Owens. and Irwin 181, Chao and Hsu [10], and
by Lewis, Pong, and VanCleavc [33].
IEEE TRANSACTIONS ON COMPUTER AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 13, NO 11, NOVEMBER 1994
TABLE IBATCHED 1-STEINER (B IS) STATISTICS: THE PERFORMANCE FIGURES DENOTE PERCENT IMPROVEMENT OVER MST COST. ALSO GIVEN ARE STATISTICS REGARDING
THE NUMBER OF STEINER POINTS PRODUCED, THE NUMBER OF ROUNDS, AND THE NUMBER OF STEINER POINTS PRODUCED IN EACH ROUND
Batched 1-Steiner (H1S) Performance ResultsPerformance Number of SPs Number of rounds Ave #SPs per round
n Min Ave Max Min Ave Max Min Ave Max 1 2 3 44 0.00 8.83 27.19 0 2.10 4 0 0.97 2 1.14 0.37 0.00 0.005 0.00 9.35 23.59 0 2.59 5 0 1.08 3 1.51 0.46 0.06 0.006 0.00 9.83 25.99 0 3.09 6 0 1.15 3 1 91 1.42 0.03 0.007 0.41 9.82 23.44 2 3.52 7 1 1.17 3 2.32 0.59 0.02 0.008 0 00 9.92 20 93 0 3.99 8 0 1.23 3 2.73 0.96 0.17 0.009 1.04 10.07 20.62 2 4.48 7 1 1.31 3 3-12 1.16 0.19 0.00
10 1.26 10.24 20.03 3 4.95 8 1 1.34 4 3.56 1.11 0.10 0.0111 2.01 10.32 20.81 3 5.42 10 1 1.33 3 4.01 1.01 0.03 0.0012 1.85 10.32 18.83 2 5.88 10 1 1.43 4 4.31 1.28 0.09 0.0013 2.06 10.27 19.74 3 6 39 11 1 1.46 4 4.80 1.23 0.09 0.0414 3.01 10.28 19.56 3 6.82 11 1 1.48 4 5.18 1.25 0.14 0.0715 2.78 10.41 18.10 4 7.27 12 1 1.49 4 5.59 1.33 0.12 0.0016 2.47 10.42 18.45 4 7 76 13 1 1.52 4 6.01 1.33 0.13 0.0017 3.00 10.58 17.94 5 8.33 14 1 1.56 3 6.50 1.38 0.27 0.0018 3.62 10.46 18.19 o 8.70 14 1 1.58 4 6.85 1.52 0.26 0.0119 3.71 10.37 18.05 5 9.14 14 1 1.61 5 7.21 1.58 0.10 0.0120 4.35 10.39 17.90 5 9.63 11 1 1.60 4 7.68 1.58 1.00 0.00
30 6.49 10.64 14.73 9 13.47 20 1 1.85 4 11.61 1.75 0.21 0.0140 6.28 10.68 14.55 12 18.10 26 1 1.98 5 15.95 2.47 0.23 0.0050 7.25 10.72 15.28 15 23.70 33 1 2.04 4 19.97 2.78 0.56 0.0060 8.67 10.77 14.29 19 27T13 37 1 2 12 5 23.88 3.81 0.58 0.0070 7.80 10.82 13.55 23 32.26 42 1 2.17 4 28.02 3.52 0.62 0.0080 7.34 10.83 12.88 28 36.99 50 1 2.20 4 32.07 4.66 0.64 0.0190 7.99 10.84 13.52 32 4161 56 1 2.24 4 36.18 5.14 1.17 0.04100 8.78 10.86 13.60 35 46.30 59 1 2.25 4 40.21 5.55 1.31 0.01110 8.04 10.86 12.86 40 30.98 67 1 2.28 4 44.96 5.77 0.53 0.08120 7.83 10.86 12.62 44 o5.59 68 1 2.29 5 48.64 7.15 1.02 0.00130 8.25 10.85 12.63 47 60.03 73 1 2.34 4 52.87 7.22 0.82 0.00140 8.45 10.84 12.57 51 64.97 79 1 2.38 5 56.94 8.12 1.07 0.22150 9.07 10.92 12.63 57 69.83 87 1 2.39 4 59.92 8.28 1.13 0.00200 9.39 10.97 12.32 79 93 10 107 2 247 4 80.04 11.68 1.48 0.50250 10.26 10.98 11.68 103 116 32 131 2 2.57 4 100.00 14.53 0.82 0.00300 9.76 10.97 12.18 131 137.67 145 2 2.50 4 121.85 16.85 1.33 0.00
our parallel implementation running on 10 processors is 1163times faster than the naive implementation (as measured byelapsed wall-clock time). It should be noted that for the serialimplementation, the CPU time and the elapsed wall-clock
time are essentially identical, since we report averages overhundreds of cases, and all of our algorithms are CPU-bound
(i.e., swapping and 1/0 time is negligible). Detailed executiontimings for various cardinalities are given in Table II. andare illustrated in Fig. 14(d). Note that most of the observed
speedup over the naive implementation stems from the fastserial implementation using the dynamic MST update scheme(i.e., as described in Section III). The parallel implementationonly adds a speedup factor somewhat less than the numberof processors/workstations available (10 SUN's in our case).The serial speedup grows as a function of the net size, sinceour new serial implementation is asymptotically faster than
the naive serial implementation.As noted by Ganley and Cohoon [15], most nets in actual
VLSI designs have a small number of pins (i.e., less than 10).It is therefore of particular interest to observe the behavior
of our algorithms on small nets. Even for small pointsets, ournew implementations are considerably faster than the previous,naive ones; for example, for n = 5, the new serial BIS is onaverage twice as fast as the naive implementation, while for
n = 1il it is 7 times as fast. Our parallel speedup increaseswith problem size, reaching about 7.2 for n = 250 runningon 10 processors. This enables us to examine the asymptoticbehavior of B IS for much larger pointsets than was previouslypossible: the output of Batched 1-Steiner (B1lS) for a randompointset of size 300 is shown in Fig. 13.
We have also implemented the EMkS and the EBkS algo-rithms, and the following algorithms were tested side-by-sideon the same inputs:
* BIS-The Batched 1-Steiner method, with the code fur-ther streamlined for speed;
* MBIS-The "modified" B IS variant that uses only win-ning candidates from previous rounds;
* EBIS-The enhanced version of BIS:* 12S-The Iterated 2-Steiner algorithm;* EI2S-The enhanced version of 12S;* META(B1S,MB1S)-The metaheuristic over heuristics
BIS and MB 1S, i.e., the best solution found by either ofthese heuristics;
* META(EB1S,t2S,EI2S) -The metaheuristic over EB I S,12S, and E12S;
* META(BlS,MBlS,EBlS,12S,EI2S)-The metaheuristicover BOS, MBIS, EBIS, 12S, and E12S;
* OPT-The optimal Steiner tree algorithm [40].
1 360
1361GRIFFITH t al.. NEAR OPTIMAL STEINER TREES IN POI YNOMIAL TIME
2 11.4
I-
10.
g9 10.S0.
A10
.< 9.:,.
Naive B IS(serial)
Fast BIS(serial) Fast B IS
. i/(parallel
10o 200 300
Pointset Size(d)
Fig. 14. (a) Average performance of BIS, shown as percent improvement over MST cost. (b) Average number of rounds for BIS. (c) Average number of
Steiner points induced by BIIS (vertical bars indicate the range of the minimum and maximum number of Steiner points added). (d) Average execution times
(in elapsed wall-clock seconds) for the serial and parallel BIS, using both the naive MST implementation and our new incremental MST maintenance scheme,
over a wide range of net sizes; note that for the seinal versions, the elapsed wall-clock times are essentially the same as the CPU time, since the benchmarks
represent average inning times over hundreds of cases (the algorithms are CPU-bound, so that swapping and 110 have negligible effect on the running time).
Recall from the discussion at the end of Section IV that MB I S
is a more efficient version of BIS, since it only examines
a fraction of the Hanan candidates in a typical round (i.e.,
only the ones with positive MST savings in the previous
round). OPT is the fastest known optimal rectilinear Steiner
tree algorithm [40]. All variants have been benchmarked on up
to 10000 random instances of each net size. Fig. 15(a) shows
the performance comparison of MB IS, E12S, and OPT, while
Table III gives more detailed performance data. We observe
that the average performance of our methods is extremely close
to optimal: for n - 8, E12S is on average only about 0.11%
away from optimal and solutions are optimal in about 90%
of the cases. Even for n = 30, MBIS is only about 0.30%
away from optimal, and yields optimal solutions in about one
quarter of all cases. Table IV tracks the percentage of cases
where the various heuristics find the optimal solution, and this
data is also depicted in Fig. 15(b).
Table V gives the average CPU times, in seconds, for each
heuristic and net size. Our most time-efficient algorithm is
MBIS, requiring an average of 0.009 CPU seconds per 8-pin
net, and an average of 0.375 seconds per 30-pin net. Using
ElkS (or EBkS) with values of k greater than 2 improves
the performance, but slows down the algorithm; it is easy to
see that for arbitrary k, ElkS (EBkS) always yields optimal
solutions for <k + 2 pins, but has time complexity greater by
a factor of n2(k-1) than that of EBIS. This allows a smooth
tradeoff between performance and efficiency. However, the
performance of the EBkS algorithm with k = 2 is already so
close to optimal, that in most applications increasing k further
is not likely to justify the incurred time penalty.
In three dimensions, we observed that the average perfor-
mance of EBIS approaches 15% improvement over MST cost
and the performance increases with the number of planes L.
It is not surprising that the average savings over MST cost
in three dimensions is higher than it is in two dimensions,
since the worst-case performance ratio in three dimensions
is higher also (i.e., 3 for three dimensions versus 3 for two
dimensions). Fig. 15(c) shows the performance of our method
in three dimensions for various values of the number of parallel
planes L, including the unrestricted three-dimensional case,
corresponding to the limit when L approaches oc. Table VII
gives statistics for three dimensions on the number of Steiner
..CA
50 1oo 150 200 250 300Pointset Size
(b)
3.0 -
2.5 -
2.0 -
1.5 -
00
0
.0S0
U.1
M
M
(a)
A.U9
U)C
Us3 000 -
r
0W
0-
.1
Pointset Size(c)
OI .U
)Pi
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 13, NO. 11, NOVEMBER 1994
PCr _
W
C6a-
.S
C ].cr
aC
PFr
e 15Q
C
t 14
a
C.EE 12
ate
~ 0'.C
S060
W0W
'C.
Pointset Size(a)
a2iiC
Cr
15
C
It
41
t;*
Pointset Size(c)
Pointset Size(b)
Pointset Size(d)
Fig. 15. (a) Average performance iii two dimensions of E12S, B IS. and OPT; note that E12S is only 0.25%/ (or less) away from optimal. (b) Percentage of allcases when the heuristics find the optimal solution (note that E12S yields optimal solutions a large percentage of the time). (c) Average performance of EBISil three dimensions for various values of L =number of parallel planes. (d) Average number of Steiner points added by BIS in three dimensions for L = oc.
points induced by BIS (see Fig. 15(d)), as well as on the
number of rounds that occur in BIS before termination. Asis the case in two dimensions, the number of rounds for BiSin three dimensions is on average very small. Table VI givesmore detailed performance data. In all cases, the L parallelplanes were uniformly spaced in the unit cube (i.e., they wereseparated by G units, where G - 10000 is the gridsize).Unfortunately, the OPT algorithm of Salowe and Warme [40]does not easily generalize to higher dimensions; thus, we werenot able to compare our three-dimensional version of EB IS tooptimal.
IX. CONCLUSION
We have proposed enhanced serial and parallel implemen-tations of the Batched I-Steiner heuristic (BiS), achievingspeeds of up to three orders of magnitude faster than previousimplementations. Moreover, the speedup increases with thenumber of points. This has enabled the testing of BIS onseveral hundred points for the first time, and we observed thatfor such large pointsets BIS consistently improves lI5c overMST cost.
Next, we enhanced BlS by using a near-greedy approach
with random tie-breaking. Our method enjoys the same asymp-totic time complexity as B IS, yet offers improved average per-formance. We also allow performance increase at the expenseof running time, creating a smooth tradeoff between solutionquality and computational efficiency. Extensive simulationsindicate that for typical nets, the average performance ofour methods is less than 0.25% away from optimal, and oursolutions are actually optimal for up to 90% of uniformlydistributed nets of typical sizes.
We generalized B 1S and its variants to three dimensions, aswell as to the case where all the pins lie on L parallel planes,which arises in, e.g., three-dimensional VLSI and multi-layerrouting. Our methods are highly parallelizable and generalizeto arbitrary weighted graphs; thus, they are suitable to supporta multi-layer global router, where obstruction and congestionconsiderations affect routing. Since Steiner tree constructionis a computationally expensive component of global routing,our techniques suggest the feasibility of a "Steiner engine" forefficiently computing near-optimal Steiner trees.
We reduced the running time of our algorithms througha dynamic MST-maintenance scheme, and we proved that
1 362
iq-
en
GRIFFITH et 1.: NEAR-OPTIMAL STEINER TREES IN POLYNOMIAL TIME
TABLE 11EXECUTION TIMES FOR BATCHED 1-STEINER (BIS) THE SERIAL EXECUTION
TIMES ARE GIVEN IN CPU SECONDS, WHILE THE PARALLELEXECUTION TIMES ARE ELAPSED WALL CLOCK TIMES BOTH THE
SERIAL AND PARALLEL VERSIONS WERE TESTED WITH THE OLD NAIVEIMPLEMENNIAIION, AS WELL AS THE NEW FASTER IMPLEMENTATION THE
OVERALL GAIN OS THE RATIO OF THE OLD SERIAL TIME TO THE NEWPARALLEL TIME. THE PARALLEL IMPLEMENTATION USES 9 SUN 4/40 (IPC)
WORKSTATIONS, WItH A SUN 4/75 (SPARC2) AS THE MASTER PROCESSOR
n43
6
7
91012] 41G1820305050607n
8090100120140160180200250300
Batched Sterio PB verage Execulion Speeds (PU Second' I
Serial Par. l 1dold 11l ratioA B A.B
0.010.020.040.060.000.120.160.260.440.610.8110.93.808.5916.128.141.962.287.5114232:3443765718011528
1 002.002.503.334.115.256.30" 17710.6613.771801122.0839 7460.7770.1997.69131.74166.40176.57246.84
1800
0.04
0.10
0.200370 631 0220284.69804014.6724.071515221130274;5520103,501.n4,5028140
o.d nle ItioC_ DB C -/B
0.200.210.210.270.390.51I .06I 572.074.145 540.2I262007531002208425824748
0.190.200.220 240 270.290.350.410.470.57
0 61l.793.235.038.0312.117.116.524 240.5
79.6i7103129212
447
1.051 .051.091.131.141.763.033.834.407.269.13
22.4639.0139.7693.7782.81121 .7156.48196.20
old new 0 il*A/C RD A/D
0070.060.200.48
0.511.371.622.002.152.994.063.544.323l.754.1156;
3,655.514.974.975.93
0 070.110.200.270 380.440..550.741.071.301421792.122.663203 501 .463.645.304.715.735.874.725.546.217.214.03
0.070.210.;'30.911.542.'333.52
6.51
17.87
25.73
39.46
84.36
161.61
221.65
34 1.84
456.20
605.26
936 36116.81
TABLE HIPERFORMANCE STATISrlcS: THE FIGURES DENOTE
AVERAGE PERCENT IMPROVEMENT OVER MST COST
Average Performance Results in Two Dimensions ('c Improvement over 5MST)# ( (21 (3) (4) 5) Meta Meta Set.
n cts BIS MB1S EBiS 12S E12S (1-2) (35) (1-5) OPT4 10000 851 .54 8.54 8.54 8.54 8.54 8.54 854 3 54o 10000 9.34 9 34 9.39 9.44 9045 9.37 9.(6 9046 9.476 10000 9.79 9.78 9.85 9 90 9 92 9 82 9093 993 9.97,7 5000 1004 100)3 1012 16.15 10.19 10.08 10.21 1021 10.278 5000 1006 100.0 10.15 10.17 10.22 10.11 10.24 10.24 10.339 5001) 10.16 10.14 10.25 10.27 10.33 10.20 10.34 10.34 10.47160 5000 19 1 0.16 10.27 10.
28 1034 10.22 10.36 10.36 10.52
12 5000 10.24 10.24 10.34 10.33 1039 1029 10441 1HA1 I 1.5R
11 0000 10.33 10.33 10.42 10.40 1047 1037 10.49 1049 10.70
16 5000 10.33 10.32 10.42 10.40 10.47i 10.37 10.49 10.49 10.7318 4000 10.51 10 51 10.61 10.58 10.66 10.56 10.67 10.67 10.93
20 3000 10.51 [IH51 loo.t 10.56 10.64 10.5. 10.660 066 10.9225 2000 10.47 10.48 10.07 10.53 10.61 10.52 10.62 10 62 10.90:30 1100 10.45 10.59 10.76 1055 10.62 10.51 10.63 1063 10.89
50 500 10.89 10.89 10.99 10.03 11.03 10.93 1 .04 11 05
[70 00 1 0.'] 1077 10.66 1086 1080 1091 1091
under the Manhattan metric: 1) in two dimensions we can
always find an MST with maximum degree 4; and 2) inthree dimensions we can always find an MST with maximumdegree 14. The best previously known bounds for two andthree dimensions were 6 and 26, respectively. These resultswere used to decrease the running times of our algorithms,and moreover they have independent theoretical significance;for example, our bounds on the maximum MST degree wereused to settle an open problem in complexity theory [39].
1363
---------- dl
TABLE IVOPTIMALITY PERCENTAGES: FIGURES DENOTE THlE PERCEN I OF THE CASES
WhERE THE VARIOUS HEURISTICS FOUND THE OPTIMAL SOLUTION
Pcrcnt o theCd9O Whe Sovton aptimal(I) i2) (3) () () Mt eaMt
n-nts BIS MBIS EBIS 12S ES 2 ( ; (I) OP*1 10000 149.0 4020 6 00.00I3 co 00 40 1.3 002 1006 10000 94.29 93.96 96-53 984 sn ;3 9.11 1 J2 0"ooo
0 30 90 34 20 U 93168 4310 304, 4100 5v33 o7.2 10000
20266 10 10441 31 40 3500 140.9 430104
3 00 8 w20 84 7 15 89 9 4 0 40 11 90 4 0 4 1000081 000 70 22 78.707 85.26 84373 0.107 '03 00 0268916no 75.80 703 88 82 20 0372 0. 50 007, 10.12 0400
1603 01I .4 0. 00 027 14461.6807
20 so 72103 70 .144 790.32 77lo.0 092 7 23ns84.04 24.49 4 oo 112 5000 0320 6194 13 20 40.7709768 9 700 72 70204 13.06
300 'I03 09 0 01 0.94 100. 102 490
4 U 5865.73 66.69 62 R4 70, 73 62.3l 723 233 oos50 n 3 36 49.92 60.17 54.1 763092 . 4.4 60 73 084 7: lIS 4 00 44.90 45 02 55 6.S 49.33 .8 49.45 61.02 61 02 lo OQ
20 300 42 87 42.00 53. 20 45.70 55 47 47 007 1823 ss.23 6085 o25 2000 31 25 31 70 41.45 34.15 44.3 356 59 4.0loo
30 100) 27.oo 26.00 41.00 32 0o45 en 3]0U 47, 0 47 00 100.00
TABLE VAVERAGE EXECUTION TIMES, IN CPU SECONDS. FOR EACH OF THE HEURISTICS
average CoT tEle Per Net CPU recon soet-(1) (2) (3) (4) Meta Meta.et
n BIS MB1S EBIS 32S EC2S 7 C2) (3-5) (1-5) OPT4 0.002 91 92 001 0.072 0.003 0.127 0 130 0.0065 0.004t 0.002 0.107 0.003 0.068 O.O[16 0.178 0.184 0.0106 0.006 0.006 4 0.006 0.128 01010 0.317 0328 0.0187 0.009 0.006 0.287 0.012 0.231 0.015 0.530 0.545 0.0318 0.013 0.009 0.432 0.019 0.384 0.022 0.83oi n 857 0.0469 0.019 0.012 0.571 0.030 0.634 0.031 14.2 1 266 0.06610 0.025 9.016 0.806 16 04 0.951 0.04 1.802 1844 0.09012 0 043 0.026 1.324 0.096 1.967 0.069 3.387 1 456 4.1814 0.066 0.041 2.127 0.180 3.623 0.107 5.930 6,037, 0.26816 0.096 0.059 3.288 0.317 6.372 0.155 9.971 10.13 0.40518 2 .034 O16 4212 041 .140 9.700 0.215 14.46 14.68 0177420 0.181 0.115 6.440 0.833 15.87 0.296 23.14 23.44 1.61825 0.341 0.220 1194 2.0150 48.09 0.561 62.08 62.64 13.8830 0.569 0.375 1S.93 4.184 86.10 0.9,41 109.2 110.1 495.850 2.732 1.694 79.78 35.9 1 764.64 4426 8.80.3 884.770 6.661 4.236 255.4 150.3 5 668L 10.90 -604 68
TABLE VIAVERAGE PERCENT IMPROVEMENT OF EB IS OVER MST
COST IN THREE DIMENSIONS FOR L PLANES EQUALLY SPACEDIN T(oE UNIT CUBE. THE UNRESTRICTED THREE-DIgnENSIONAL
CASE OCCURS WHFN L = cO (LAST COLUMN)
Avbeter erfo rformance bu that rns Imopre ovem t ovsc MST)- L2 13 L=_4 L= = -0 L=14 L-20 L=-
.1 .1 9 72 _H)1 1.1106 1.2-6 10.63 10.664 9.7 1 6 11.14 11.80 12.33 12.39 12.17 12.01 12.575 10.42 11.46 12.46 1 2.48 13.12 13.26 13.48 13.45 13.617 10.66 12.16 IJ 07 13.24 13.95 13.91 14.19 14.20 14.23
10 10.14 12.86 I1'3,64 14 22 I1.s.54 15.33 14.26 14.34t 15.2614 10.53 12.20 13.25 13,77 14 38 14.7 14.27 14.77 14.8420 10.19 12.10 13.48 13.37 14.2n 14.95 14.74 14.84 14.9730 10.33 11.23 11.92 13.21 13.82 14.52 15.02 15.04 15.24
50 10.28 10.20 11.31 13.01 13.49 14,49 15.14 15.08 15.16
Remaining open research questions include finding anMRST heuristic with performance consistently higher thanEBIS (or MBIS), but with a significantly better runningtime (note that it would not suffice to just find a heuristic withbetter performance, but that runs more slowly, since increasingthe parameter k in EBkS will achieve exactly that). On thetheoretical front, it would be interesting to determine whetherthe maximum MST degree in three-dimensional Manhattanspace is 13 or 14 and extend such results to higher dimensionsand to different Lp norms.
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL 13, NO. 11, NOVEMBER 1994
TABLE VIISTATISTiCS REGARDING THE NUMBER OF STFINER POINTS INDUCED BY
B IS iN THE UNRESTRICTED THREE DIMENSIONAL CASE (i.e.,
L = ac). ALSO GIVEN ARE THE NUMBER OF ROUNDs ExECUTED BY
B IS. SHOWN ARE THE MINIMUM, AVERAGE, AND MAXIMUM VALUES
ACKNOWLEDGMENT
We would like to thank Mr. Tim Barrera for his help withthe coding, and the referees for their helpful comments.
REFERENCES
[1] M. J. Alexander and G. Robins, "An architecture-independent unifiedapproach to FPGA routing," Dept. of Computer Science, U niv ofVirginia, Tech. Rep. CS 93-51, Oct. 1993.
[2] M. J. Alexander and G. Robins, "A new approach to FPGA routingbased on multi-weighted graphs." in Proc. ACM/SIGDA Int. Workshopon Field-Programmable Gate Arrays, Berkeley, CA, Feb. 1994.
[3] T. Barrera, J. Griffith, S. A. McKee, G. Robins, and T. Zhang, "Towarda Steiner engine: Enhanced serial and parallel implementations ofthe iterated 1-Steiner algorithm," in Proc- Great Lakes Symp. VLSI,
Kalamazoo, MI. Mar. 1993, pp. 90-94.[4] T. Barrera, J. Griffith, G. Robins, and T. Zhang, "Narrowing the gap:
Near-optimal Steiner trees in polynomial time," in Proc. IEEE Int. ASIC
Conf:, Rochester, NY, Sept. 1993, pp. 87-90.[5] A. Beguelin, J. J. Dongarra, G. A. Geist. R. Manchek, and V. S.
Sunderam, "A user's guide to PVM: Parallel virtual machine," OakRidge National Laboratory, Tech. Rep. ORNU/TR 11826, 1991.
[6] P. Berman, U. Foessmeier, M. Karpinski, M. Kaufmann, and A. Z. Zelikovsky, "Approaching the 5/4-Approximation for rectilinear Steinertrees," Wilhelm Schiekard Institut fur Informatik, Tech. Rep. WSI-94-06, 1994.
[7] P. Berman and V. Ramaiyer, "Improved approximations for the Steinertree problem," in Proc. ACM/SIAM Symp. Discrete Algorithms, SanFrancisco, CA, Jan 1992, pp. 325-334.
8] M. Borah, R. M. Owens, and M. J. Irwin, 'An edge-based heuristicfor rectilinear Steiner trees," Dept. of Computer Science, PennsylvaniaState Univ., Tech. Rep. CS-93-003, 1993.
[9] D. Braun, J. Bums, S. Devadas, H. K. Ma, K. M. F. Romeo, andA. Sangiovanm-Vincentelli, "Chameleon: A new multi-layer channelrouter," in Proc. ACM/IEEE Design Automation Conf, 1986, pp.495-502
[10] T. H. Chao and Y. C. Hsu, "Rectilinear Steiner tree construction bylocal and global refinement," IEEE Trans. Computer-Aided Design, pp.303 309, 1994.
[11] T. H. Cormen, C. E. Leiserson, and R. Rivest, Introduction to Algo-rithms. Cambridge, MA: MIT Press, 1990.
[12] J. T. Croft, K. J. Falconer and R. K. Guy, Unsolved Problems inGeometriy New York: Springer-Verlag, 1991.
[13] U. Foessmeier, M. Kaufmann, and A. Z. Zelikovsky, "Fast approxi-
mation algorithms for the rectilinear Steiner tree problem," Wilhelm
Schickard-lnstitut fur Informatik, Tech Rep. WSI-93-14, 1993.[14] G. N. Fredrickson, "Data structures for on-line updating of minimum
spanning trees," SIAM J. Comput., vol. 14, pp. 781-798, 1985.
[15] J. L. Ganley and J. P. Cohoon, "Routing a multi-termrinal critical net:Steiner tree construction in the presence of obstacles," in Proc. IEEE
Intl. Symp. Circuits Syst., London, England, May 1994.
#SPs and #rounds for B1S in 3D#SPs #rounds
n min ave max min ave max3 0 0.90 1 1 1.90 24 0 1.53 2 1 2.18 35 1 2.25 3 2 2.33 47 1 3.63 6 2 2.60 5
10 2 5.63 8 2 2.92 714 5 8.22 11 2 3.18 620 9 12.14 15 2 3.26 530 15 18.99 22 3 3.53 650 29 32.86 35 3 4.14 6
[16] M. Garey and D. S. Johnson, "The rectilinear Steiner problem is NP-complete," SIAM J Applied Math., vol. 32, pp. 826-834, 1977.
[17] G. Georgakopoulos and C. 11. Papadimtriou, "The I Steiner tree prob-lem," J Algorithms, vol. 8, pp. 122-130, 1987.
[18] E. N. Gilbert and H. 0. Pollak, "Steiner minimal trees," SIAM J. AppliedMath., vol. 16, pp. 1-29, 1968.
[19] R. L. Graham and F. K. Hwang, "A remark on Steiner minimal trees,"Bull. Inst Math. Acad. Sinica, vol. 4, pp. 177-182, 1976.
[201 A. Hanafusa, Y. Yamashita. and M. Yasuda, "Three-dimensional routing
for multilayer ceramic printed circuit boards," in Proc. IEEE itn. ConfComputer-Aided Design, Santa Clara, CA, Nov. 1990, pp. 386-389.
1211 M. Hanan, "On Steiner's problem with rectilinear distance," SIAM .Applied Math., vol. 14, pp. 255-265, 1966.
[22] A. C. Harier, Three-Dimensional integrated Circuit Laiyout. New York:Cambridge University Press, 1991,
[231 N. Hasan, G. Vijayan, and C. K. Wong, "A neighborhood improvementalgorithm for rectilinear Steiner trees," in Proc. IEEE Int. Symp. CircuitsSyst, New Orleans, LA, 1990.
[24] J.-M. Ho, G. Vijavan, and C. K. Wong, "New algorithms for therectilinear Steiner tree problem," IEEE Trans. Computer-Aided Design,vol. 9, pp, 185-193 1990.
[25] F. K. Hwang, "On Steiner minimal trees with rectilinear distance,"' SIAMJ. Applied Math., vol. 30, pp. 104-114, 1976.
[26] F. K. Hwang, "An 0(n log Ti) algorithm for rectilinear minimal span-ning trees," J. ACM., vol. 26, pp. 177 182, 1979.
[27] F. K. Hwang, D. S. Richards, and P. Winter, The Steiner Tree Problem.North Holland. 1992.
[28] A. B. Kahng and G. Robins, "A new class of iterative Steiner tree heuris-tics with good performance," [FEE Trans. Computer-Aided Design, vol.11, pp. 893 902, 1992.
[29] A. B. Kahng and G. Robins, "On performance bounds for a class ofrectilinear Steiner tree heuristics in arbitrary dimension," IEEE Trans.Computer Aided Design, vol. 11, pp. 1462-1465. 1992.
[30] A. B. Kahng and G. Robins, On Optimal Interconnections for VLSILayout. Boston, MA' Kluwer Academic Publishers (in press).
[31] 1. H. Lee, N. K. Bose, and F. K. Hwang, "Use of Steiner's problemin sub-optimal routing in rectilinear metric," IEEE Trans Circuits Eyst_vol. 23. pp. 470-476, 1976.
[32] K. W. Lee and C. Sechen, "A new global router for row-based layout," inPoc. IEEE Int Conf Computer-Aided Design, Santa Clara, CA, Nov.1990, pp. 180-183
[33] F. D. Lewis, W. C. Pong, and N. VanCleave, "Local improvement inSteiner trees," in Proc. Great Lakes Synip. VLSI, Kalamazoo, MI, Mar,1993, pp. 105-106.
[34] A. L. Loeb, Space Structures. Their Harmony and Counterpoint. NewYork: Birkhauser. 1991.
[35] C. Monma and S. Sun, "Transitions in geometric minimum spanningtrees," Discrete & Computational Geometry, vol. 8, pp. 265-293, 1992.
[36] C. H. Papadinsitriou and U. V. Vazirani, "On two geometric problemsrelating to the traveling salesman problem," J. Algorithms, vol. 5, pp.231-246, 1984.
[37] B. T. Preas and M. J. Lorenzetti, Physical Design Automation of VLSISvstems. Menlo Park, CA: Benjainiii/Cummings, 1988.
[38] G. Robins, "On optimal interconnections," Ph.D. Dissertation, CSD-TR-920024, Dept. of Computer Science, UCLA, 1992.
[39] G. Robins and J. S. Salowe. "On the maximum degree of minimumspanning trees," in Pmc. ACM Symp. Computational Geometry, StonyBrook, NY, June 1994, pp, 250-258
[40] J. S. Salowe and D. M. Warme, "An exact rectilinear Steiner treealgorithm," in Proc. IEEE Inc Conf Computer Design, Cambridge, MA,Oct. 1993, pp. 472-475.
[41] J. M. Smith and J. S. Liebman, "Steiner trees, Steiner circuits and theinterference problem in building design. Engineering Optimization, vol.4, pp. 15-36. 1979.
[42] T. L. Snyder, "On the exact location of Steiner points in generaldimension," SIAM. J. Compuc, vol. 21, pp. 163-180, 1992
[431 V S. Sunderam, "PVM: A framework for parallel distributed comput-ing,' Concurrency: Practice and Erperience, vol. 2, pp. 315-339, 1990.
[44] G. F. Tdth, "New results in the theory of packing and covenng,"Convexity and its Applications, 1983.
[451 A. C. C. Yao, "On constructing minimum spanning trees in k-dimensional spaces and related problems," SIAM I Comput., vol.11, pp. 721-736, 1982.
[461 A. Z. Zelikovsky, "An 11/6 approximation algorithm for the networkSteiner problem," Algorithmica, vol. 9, pp. 463-470, 1993.
[47] A. Z. Zelikovsky, "A faster approximation algorithm for the Steiner tree
problem in graphs," Information Processing Letters, vol. 46, pp. 79-83,1993.
1364
GRIFFITH ft el.: NEAR-OPTIMAL STEINER TREES IN POLYNOMIAL TIME
Jeff Griffith received his Bachelor's degree inmathematics in 1992 from the University of Vir-ginia in Charlottesville. He is presently workingon his Ph.D. in computer science at the Universityof Tennessee in Knoxville. His areas of interestinclude physical design for VLSI circuits, as wellas scientific computing.
Gabriel Robins (S'91-M'91) received the Ph.D.degree in 1992 from the UCLA Computer Sci-ence Department, where he won the DistinguishedTeaching Award and held an IBM Graduate Fellow-ship. Currently, Dr. Robins is assistant professorof computer science at the University of Virginiain Charlottesville, where he recently won the NSFYoung Investigator Award. His primary areas ofresearch are VLSI CAD and geometric algorithms,with recent work focusing on performance-driven
roumng, sterner tree heuristics, curnputanonal ge-ometry, motion planning, pattern recognition, and computational biology. Dr.Robins is the author of a Distinguished Paper at the 1990 IEEE InternationalConference on Computer-Aided Design, and his Ph.D. dissertation wasnominated for the ACM Distinguished Dissertation Award. He is a memberof ACM, IEEE, MAA, and SIAM.
Jeffrey S. Salowe received the B.A. Degree inmathematics from the University of Virginia, Char-lottesville, VA, in 1983, and the MS. and Ph.D. de-grees in computer science from Rutgers University,New Brunswick, NJ, in 1985 and 1987, respectively.Currently, he is a consultant for Questech, Inc. Hisresearch interests include the analysis of algorithms,computational geometry, data structures, and graphalgorithms.
Tongtong Zhang studied at the University of Bei-jing, China for three years and received her Bach-elor's degree in 1992 from Bridgewater College,Virginia. She is currently working on her Ph.D.in computer science at the University of Virginiain Charlottesville. Her areas of interest includephysical design for VLSI circuits and computationalbiology.
1365