
On the Skew-Bounded Minimum Buffer Routing Tree Problem

C. Albrecht (Synopsys), A.B. Kahng, B. Liu, I. Mandoiu (UCSD), A. Zelikovsky (GSU)

ABSTRACT

We consider the problem of buffering a given tree with the minimum number of buffers under load cap and buffer skew constraints. Our contributions include:

• A proof that the greedy algorithm proposed by Tellez and Sarrafzadeh (TCAD’97) is suboptimal for all non-zero skew bounds

• An optimal dynamic programming algorithm for the problem

• Experimental results on test cases extracted from recent industrial designs showing that the dynamic programming algorithm has practical run time and saves up to 20% of the buffers inserted by the algorithm of Tellez and Sarrafzadeh

Motivation

• In order to initiate meaningful placement and timing optimizations, every design flow requires early elimination of all electrical violations (e.g., load cap and slew violations), even for non-critical nets. Bounds on load caps

- Serve as proxies for signal slew rate bounds

- Improve coupling noise immunity

- Reduce delay uncertainty due to coupling noise

- Improve reliability with respect to hot-carrier and AC self-heating effects

- Facilitate technology migration since designs are more balanced

- Guarantee bounded input rise/fall times at buffers and sinks

• For clock and test distribution, an additional design requirement is bounding the buffer skew, i.e., the difference between the maximum and the minimum number of buffers over all source-to-sink paths in a routing tree, since buffer skew is one of the main factors affecting the actual delay skew

• To make progress with any methodology, it is crucial to have a fast and resource-efficient method for fixing load cap and buffer skew violations. Of particular interest are practical methods for buffering non-critical nets that have up to tens of thousands of sinks (e.g., scan enable)

Minimum-Buffered Routing Problem

Given:

– Net N with source r and set of sinks S

– Binary routing tree T = (r, V, E) for N

– Input capacitance cs for each sink s ∈ S

– Buffer input capacitance Cb

– Unit-length wire capacitance Cw

– Capacitive load upper bound CU

– Buffer-skew bound Δ

Find: a buffering of the routing tree T such that

– The load cap of each buffer and of the source r is at most CU

– The buffer skew is at most Δ

– The number of inserted buffers is minimized

[Figure: tree with bounded buffer load cap; sink caps CU, 0.75CU, 0.75CU; Cw = Cb = 0]

[Figure: tree with bounded buffer load cap and zero buffer skew; sink caps CU, 0.75CU, 0.75CU; Cw = Cb = 0, Δ = 0]
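The two feasibility conditions of the problem (bounded load cap for every buffer and for the source, and bounded buffer skew) can be checked with one bottom-up pass. The following is a minimal sketch, assuming zero wire capacitance as in the slide examples; the `Node` class, the `check` function, the name `delta` for the buffer-skew bound, and the tree built by `example_tree` are all hypothetical and are not taken from the original slides (in particular, the topology is made up, since the original figure is not available).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical tree node; a node without children is a sink."""
    sink_cap: float = 0.0        # input capacitance (used for sinks only)
    buffered: bool = False       # True if a buffer drives this node's subtree
    children: List["Node"] = field(default_factory=list)


def check(root: Node, CU: float, Cb: float, delta: int) -> Tuple[bool, int]:
    """Return (feasible, number_of_buffers), assuming zero wire capacitance."""
    feasible = True
    count = 0

    def visit(node: Node) -> Tuple[float, int, int]:
        # Returns (cap seen from above, fewest, most buffers on a path to a sink).
        nonlocal feasible, count
        if node.children:
            below = [visit(child) for child in node.children]
            cap = sum(b[0] for b in below)
            fewest = min(b[1] for b in below)
            most = max(b[2] for b in below)
        else:
            cap, fewest, most = node.sink_cap, 0, 0
        if node.buffered:
            count += 1
            feasible = feasible and cap <= CU   # load cap of this buffer
            cap, fewest, most = Cb, fewest + 1, most + 1
        return cap, fewest, most

    cap, fewest, most = visit(root)
    # The source must also respect CU, and the buffer skew must respect delta.
    feasible = feasible and cap <= CU and most - fewest <= delta
    return feasible, count


def example_tree(buffer_x: bool) -> Node:
    """Made-up topology: the source drives sink u and a node with sinks v, x."""
    u = Node(sink_cap=1.0, buffered=True)
    v = Node(sink_cap=0.75, buffered=True)
    x = Node(sink_cap=0.75, buffered=buffer_x)
    return Node(children=[u, Node(children=[v, x])])
```

With CU = 1 and Cb = 0, the two-buffer tree `example_tree(False)` passes the check for a skew bound of 1 but fails for a skew bound of 0, where the third sink has to be buffered as well.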

The Greedy Algorithm

• Proposed by Tellez and Sarrafzadeh (IEEE Trans. on CAD, vol. 16, 1997, pp. 333-342)

Bounded load cap w/o buffer skew bound

For each u ∈ V, in bottom-up order, do

– A. packNode(u): Let v and w be the two children of u. If cap(Tv) + cap(Tw) > CU, add a buffer at the topmost position of the child branch with the largest cap (the greedy choice), then remove the subtree driven by the buffer

– B. packEdge(u): While cap(Tu) > CU, add a buffer on edge (u, parent(u)) at the highest possible position still meeting the load cap bound CU

packNode(u) w/ buffer skew bound Δ

– A.0 If l(Tv) < l(Tw) (the longest path of v is shorter than the longest path of w), then swap v and w

– A.1 If l(Tv) − l(Tw) > Δ, then insert l(Tv) − l(Tw) − Δ buffers at the topmost position of (u,w); exit if cap(Tu) < CU

– A.2 Perform packNode(u) excluding child branches with maximum longest path; exit if cap(Tu) < CU

– A.3 Insert buffers at the topmost position of child branches with shortest path equal to l(u) − Δ

– A.4 Perform packNode(u) considering only child branches with maximum longest path
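The load-cap-only variant (steps A and B, without the buffer skew bound) can be sketched as a single recursion. This is an illustrative reading of the two steps, not the implementation of Tellez and Sarrafzadeh: the tuple-based tree encoding and the function name `greedy` are hypothetical, packNode is repeated until the node fits, and the sketch assumes that every sink cap is at most CU, that Cb < CU, and that 2·Cb ≤ CU.

```python
def greedy(node, CU, Cb=0.0, Cw=0.0):
    """Greedy load-cap buffering (no skew bound); a sketch under the stated assumptions.

    node is (edge_len, payload): edge_len is the wire length of the edge to the
    parent; payload is a float (sink input cap) or a list of child nodes.
    Returns (cap seen at the top of the edge to the parent, buffers inserted).
    """
    edge_len, payload = node
    buffers = 0
    if isinstance(payload, list):
        caps = []
        for child in payload:
            cap, nb = greedy(child, CU, Cb, Cw)
            caps.append(cap)
            buffers += nb
        # packNode: buffer the child branch with the largest cap first.
        caps.sort(reverse=True)
        i = 0
        while sum(caps) > CU and i < len(caps):
            caps[i] = Cb          # the branch is now hidden behind a buffer
            buffers += 1
            i += 1
        cap = sum(caps)
    else:
        cap = payload
    # packEdge: walk up the edge, placing each buffer as high as CU allows.
    remaining = edge_len
    while cap + Cw * remaining > CU:
        step = (CU - cap) / Cw    # wire length the current load can still absorb
        remaining -= step
        cap = Cb
        buffers += 1
    return cap + Cw * remaining, buffers


# Made-up example with zero wire and buffer cap and CU = 1:
# the root drives a sink of cap 1.0 and a node with two sinks of cap 0.75.
tree = (0.0, [(0.0, 1.0), (0.0, [(0.0, 0.75), (0.0, 0.75)])])
root_cap, num_buffers = greedy(tree, CU=1.0)
```

On this made-up tree the sketch inserts two buffers and leaves a load of 0.75 at the root. The skew-bounded steps A.0 to A.4 are deliberately left out.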

The Greedy Algorithm is Suboptimal

• The greedy algorithm of Tellez and Sarrafzadeh finds the optimum buffering when the buffer-skew bound Δ = 0

• However, the algorithm is suboptimal for any buffer-skew bound Δ > 0

Counterexample 1.

Buffer-skew bound Δ = 1, sink input caps cu = CU, cv = cx = 0.75CU

Interconnect and buffer have zero cap

[Figure: greedy buffering vs. optimum buffering of a tree with sink caps CU, 0.75CU, 0.75CU]

Why No Greedy Algorithm Will Work

• To guarantee optimality, arbitrarily many solutions may be needed for a subtree in any bottom-up algorithm

• Counterexample 2: buffer-skew bound Δ = 1, Cw = Cb = 0, cu = CU, and cv satisfies cv·2^(d−2) < CU and cv·(2^(d−2) + 1) > CU, where d is the depth of Ta

• Greedy buffers one of the two branches into node a; this triggers the insertion of arbitrarily many buffers upstream due to the skew constraint

• Optimum: buffers as many of the ‘v’ nodes as needed in one of the two subtrees of node a

• To guarantee optimality, solutions w/ different longest path lengths may be required for a subtree in any bottom-up algorithm

• Counterexample 3: Cw = Cb = 0, ‘u’ leaves, each with cu = CU − ε, and one ‘v’ leaf with cv = ε

• Optimum: depending on upstream tree topology, each of the following bufferings may be the only way to complete the optimum solution

[Figure: example trees for Counterexamples 2 and 3, with leaves labeled u and v and internal node a]

Dynamic Programming Algorithm

• Initialize solution set L(u) = ∅ for all u ∈ V

• For each u ∈ V, in bottom-up order, do

  (1) Let v and w be the children of u

  (2) For each buffering X ∈ L(v) and Y ∈ L(w), with l(X) ≥ l(Y), do

    (a) Let Z be X ∪ Y with max{0, l(X) − s(Y) − Δ} buffers added at the top, where Δ is the buffer-skew bound

    (b) For each i = 0, …, min{max{0, s(X) − s(Y)}, l(X) − l(Y)} do

      – Let Zi be Z with i buffers added at the top of edge (w,u)

      – EdgeBuffering(Zi, u)

  (3) Remove from L(u) all bufferings with more than NB buffers

  (4) For each buffering with (nb, l, s) buffers in total, on the longest path, and on the shortest path, respectively, remove from L(u) all bufferings with parameters (nb+k, l+k, s+k) where k ≥ 2

• Return the buffering X ∈ L(v) with minimum number of buffers

Procedure EdgeBuffering(X, u):

• While cap(X) > CU, add a buffer on edge (u, parent(u)) at the highest position meeting the load cap bound CU

• L(u) ← L(u) + X

• If cap(X) > Cb then L(u) ← L(u) + X′, where X′ is X with an additional buffer just below parent(u)
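Steps (3) and (4) of the dynamic programming algorithm keep each solution list small. A minimal sketch of that pruning is given below; it assumes each buffering is summarized only by its triple (nb, l, s) of total buffers, buffers on the longest path, and buffers on the shortest path, whereas a full implementation would also carry the load cap and the buffer positions. The function name `prune` is hypothetical.

```python
def prune(solutions, NB):
    """Apply steps (3) and (4) to a collection of (nb, l, s) triples.

    Step (3): drop bufferings with more than NB buffers.
    Step (4): drop every triple (nb+k, l+k, s+k) with k >= 2 whenever
              (nb, l, s) is also in the list.
    """
    kept = {t for t in solutions if t[0] <= NB}
    dominated = set()
    for nb, l, s in kept:
        for nb2, l2, s2 in kept:
            k = nb2 - nb
            if k >= 2 and l2 - l == k and s2 - s == k:
                dominated.add((nb2, l2, s2))
    return sorted(kept - dominated)


# (7, 2, 1) exceeds NB = 6; (5, 4, 3) equals (3, 2, 1) shifted by k = 2;
# (4, 3, 2) is only shifted by k = 1, so it is kept.
pruned = prune([(3, 2, 1), (5, 4, 3), (4, 3, 2), (7, 2, 1)], NB=6)
```

Here `pruned` is `[(3, 2, 1), (4, 3, 2)]`. The quadratic scan is chosen for clarity; since dominance by a shift of k ≥ 2 is transitive, comparing against triples that are themselves dominated does not change the result.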

Analysis

Correctness:

• By induction: for each buffering X of the branch driven by (u, parent(u)) there exist k > 0 and a buffering Y ∈ L(u) such that X is dominated by Y with k buffers added at the top

⇒ The dynamic programming algorithm returns an optimum feasible buffering

Runtime:

• For each node u ∈ T, the solution set L(u) computed by the dynamic programming algorithm contains at most 2(Δ+1)NB bufferings

⇒ The running time of the algorithm is O(n(Δ+1)³NB²), where n, Δ, and NB are the number of sinks, the given skew bound, and a given upper bound on the optimum number of buffers, respectively

• The bound is not known to be tight; in practice the runtime is much better

Experimental Results

Number of inserted buffers and runtime (in seconds) on a 2676-sink test case, for the algorithm of Tellez and Sarrafzadeh (TS97) and the dynamic programming algorithm (DP), as a function of the load cap bound CU and the buffer-skew bound Δ

CU                Δ=0           Δ=1                   Δ=2                   Δ=3                   Δ=4                   LB
                TS97   DP     TS97   DP     Gain    TS97   DP     Gain    TS97   DP     Gain    TS97   DP     Gain
500   buffers   266    266    238    211    11.3%   229    204    10.9%   226    198    12.4%   227    196    13.7%   196
      time (s)  0.04   0.14   0.03   0.33           0.04   0.60           0.03   0.82           0.04   1.02
1000  buffers   125    125    117    104    11.1%   109    99     9.2%    106    98     7.5%    106    98     7.5%    97
      time (s)  0.03   0.10   0.04   0.27           0.03   0.50           0.04   0.71           0.04   0.87
2000  buffers   64     64     55     50     9.1%    52     49     5.8%    52     48     7.7%    52     48     7.7%    48
      time (s)  0.03   0.10   0.04   0.29           0.03   0.50           0.04   0.69           0.04   0.86
4000  buffers   34     34     30     26     13.3%   29     23     20.7%   28     22     21.4%   28     22     21.4%   22
      time (s)  0.03   0.10   0.04   0.28           0.04   0.50           0.04   0.70           0.04   0.88
8000  buffers   15     15     15     12     20.0%   13     11     15.4%   13     11     15.4%   13     11     15.4%   11
      time (s)  0.04   0.11   0.04   0.28           0.06   0.48           0.04   0.66           0.05   0.81

• DP has practical runtime (less than 1 second for the above 2676-sink test)

• DP saves up to 20% of the buffers inserted by the Tellez-Sarrafzadeh algorithm

• Compared to zero-skew buffering, DP achieves a significant reduction in the number of inserted buffers even with a very small buffer skew (Δ = 1 or 2)
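The Gain columns appear to be the relative reduction in buffer count, (TS97 − DP) / TS97, rounded to one decimal place. That formula is an assumption, but it reproduces every reported percentage. The short check below recomputes the gains from the buffer counts for skew bounds 1 to 4; the names `rows`, `gain`, and `gains` are illustrative.

```python
# Buffer counts (TS97, DP) for skew bounds 1..4, keyed by load cap bound CU,
# copied from the results table.
rows = {
    500:  [(238, 211), (229, 204), (226, 198), (227, 196)],
    1000: [(117, 104), (109, 99), (106, 98), (106, 98)],
    2000: [(55, 50), (52, 49), (52, 48), (52, 48)],
    4000: [(30, 26), (29, 23), (28, 22), (28, 22)],
    8000: [(15, 12), (13, 11), (13, 11), (13, 11)],
}


def gain(ts97: int, dp: int) -> float:
    """Percentage of TS97 buffers saved by DP, rounded to one decimal."""
    return round(100.0 * (ts97 - dp) / ts97, 1)


gains = {cu: [gain(t, d) for t, d in pairs] for cu, pairs in rows.items()}
```

For example, `gains[500]` evaluates to `[11.3, 10.9, 12.4, 13.7]`, matching the first row of the table.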


Minimum-Buffered Routing

• Early elimination of load cap and slew violations is needed for all nets, even for non-critical ones. Bounds on load caps

- Serve as proxies for signal slew rate bound

- Improve coupling noise immunity

- Reduce delay uncertainty due to coupling noise

- Improve reliability with respect to hot-carrier and AC self-heating effects

- Facilitate technology migration since designs are more balanced

- Guarantee bounded input rise/fall times at buffers and sinks

• For clock and test distribution an additional design requirement is bounding the buffer skew, i.e., the difference between the maximum and the minimum number of buffers over all source-to-sink paths in a routing tree

• Minimum-Buffered Routing Problem: Given a routed net, sink/buffer input caps, and unit-wire cap, insert the minimum number of buffers to satisfy given load cap and buffer skew constraints

• Introduced by Tellez and Sarrafzadeh (IEEE TCAD’97) who gave a greedy algorithm


Our Contributions

• A proof that the greedy algorithm of Tellez and Sarrafzadeh is suboptimal for all non-zero skew bounds

- We give examples showing that no greedy algorithm can achieve optimality

• An optimal dynamic programming algorithm for the problem

- The algorithm computes lists of undominated feasible solutions for all subtrees, in bottom-up order

- Worst-case runtime is O(n(Δ+1)³NB²), where n, Δ, and NB are the number of sinks, the skew bound, and a given upper bound on the optimum number of buffers, respectively

- Runtime is much better in practice

• Experimental study of buffering algorithms on test cases extracted from recent industrial designs

- The dynamic programming algorithm uses significantly fewer buffers than the algorithm of Tellez and Sarrafzadeh
