Optimality, Scalability and Stability study of Partitioning and Placement Algorithms. Jason Cong, Michail Romesis, Min Xie. UCLA Computer Science Department. This work is partially supported by Semiconductor Research Corporation and National Science Foundation.

Page 1: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Jason Cong, Michail Romesis, Min Xie

UCLA Computer Science Department

This work is partially supported by Semiconductor Research Corporation and National Science Foundation

Page 2: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Overview

Motivation and related work

Our contribution

  Construction of Partitioning Examples with Known Upper Bound

  Construction of Placement Examples with Known Upper Bound

  Optimality, Scalability and Stability study

Conclusions and future work

Page 3: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Overview (current section: Motivation and related work)

Page 4: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Motivation: Partitioning

[Figure: bar chart comparing the partitioners FM (1982), PANZA (1995), CLIP (1996), LSR (1997), and hMetis (1997) on the MCNC and ISPD benchmarks; vertical axis from 0 to 120]

Significant progress in partitioning during the mid-to-late 90’s

No significant improvement in the last 5 years

Have we reached a plateau?

Page 5: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Motivation: Placement

Lack of significant progress in wirelength reduction

Rate of reduction is about 5-10% every 2-3 years

Latest developments in placement differ mainly in runtime

  Capo [A. Caldwell et al, 2000]
  Dragon [M. Wang et al, 2000]
  Mongrel [S. Hur et al, 2000]
  mPL [T. Chan et al, 2000]
  mPG [C. Chang et al, 2002]

How much room is there for further improvement?

Page 6: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Motivation

Most work compares only with known heuristics

  Uses benchmarks based on real designs: ISPD98 [C. Alpert 1998], WSI [D. Ghosh et al, 1997]

  Uses synthetic benchmarks: circ and gen [M. D. Hutton et al, 1998], gnl [D. Stroobandt et al, 2000]

Little understanding of the divergence from the optimal

Page 7: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Related Work

Quantified Suboptimality of VLSI Layout Heuristics [L. Hagen et al, 1995]

Construct scaled instance with known upper bound from an initial problem

[Figure: diagram of the scaled-instance construction]

Over 10% area suboptimality in TimberWolf

Notable wirelength suboptimality in GORDIAN-L

Significant improvement was possible for placement and partitioning

But test cases are small, the largest netlist is less than 40K

Page 8: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Related Work

Optimality and Scalability of Existing Placement Algorithms [C. Chang et al, 2003]

Construct instances with known optimal using the characteristics of the original problem

Existing placement algorithms can be 70% to 150% away from the optimal

Average solution quality deteriorates by an additional 4% to 25% when the problem size increases by a factor of 10

All the connections are local, no global connections

Page 9: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Overview (current section: Our contribution)

  Construction of Partitioning Examples with Known Upper Bound

  Construction of Placement Examples with Known Upper Bound

  Optimality, Scalability and Stability study

Page 10: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

BEKU Construction Example

Input: t = 16, D = {12, 8}, B = 5

Create two partitions of size 8 (P1 and P2)

Generate 9 2-pin nets that do not cross the partition line

Generate 3 2-pin nets that cross the partition line

Generate 6 3-pin nets that do not cross the partition line

Generate 2 3-pin nets that cross the partition line

Cutsize = 5

Cutsize improved to 4 after FM

[Figure: the 16 nodes split between P1 and P2, with nodes A, B, C, and D labeled]
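The construction steps on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' generator: the names `build_beku` and `cutsize`, the uniform random choice of pins, and the `crossing` argument (how many of the B cut nets have each degree) are all hypothetical. It reads D = {12, 8} as twelve 2-pin nets and eight 3-pin nets, which matches the net counts listed on the slide (9 + 3 and 6 + 2).

```python
import random


def build_beku(t, degree_counts, crossing, seed=0):
    """Build a bipartitioning instance whose planted partition cuts a known number of nets.

    degree_counts: {net degree: total number of nets of that degree}
    crossing:      {net degree: how many of those nets cross the partition line}
    Returns (p1, p2, nets), where nets is a list of pin tuples.
    """
    rng = random.Random(seed)
    half = t // 2
    p1 = list(range(half))
    p2 = list(range(half, t))
    nets = []
    for degree, total in sorted(degree_counts.items()):
        for index in range(total):
            if index < crossing.get(degree, 0):
                # Crossing net: at least one pin on each side of the line.
                k = rng.randint(1, degree - 1)
                pins = rng.sample(p1, k) + rng.sample(p2, degree - k)
            else:
                # Local net: all pins inside a single partition.
                side = rng.choice((p1, p2))
                pins = rng.sample(side, degree)
            nets.append(tuple(pins))
    return p1, p2, nets


def cutsize(nets, p1):
    """Count the nets that have pins on both sides of the partition."""
    left = set(p1)
    return sum(1 for net in nets if 0 < sum(pin in left for pin in net) < len(net))


# The slide's example: t = 16, twelve 2-pin nets (3 crossing), eight 3-pin nets (2 crossing).
p1, p2, nets = build_beku(16, {2: 12, 3: 8}, {2: 3, 3: 2})
print(len(nets), cutsize(nets, p1))
```

Because the planted partition cuts exactly B nets, B is an upper bound on the optimal cutsize. A refinement pass such as FM may still find a smaller cut, as the slide's improvement from 5 to 4 shows.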

Page 11: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Construction of Multiway Partitioning Examples with Known Upper Bounds (MEKU)

Divide the nodes into m partitions of equal size

Create B nets that cross at least two partitions. The remaining nets stay in one partition

Improve by multiway FM
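The first two steps of the multiway construction can be sketched as follows. This is an illustrative sketch only: `build_meku` and `multiway_cut` are hypothetical names, t is assumed to be divisible by m, and the way pins are drawn at random is an assumption rather than the authors' procedure. The multiway FM improvement step is not shown.

```python
import random


def build_meku(t, m, net_degrees, B, seed=0):
    """Plant an m-way partition in which exactly B nets span two or more parts.

    net_degrees: one pin count per net; the first B nets become crossing nets.
    Returns (part_of, nets), where part_of maps each node to its part index.
    """
    rng = random.Random(seed)
    size = t // m
    parts = [list(range(k * size, (k + 1) * size)) for k in range(m)]
    part_of = {node: k for k, part in enumerate(parts) for node in part}
    nets = []
    for index, degree in enumerate(net_degrees):
        if index < B:
            # Crossing net: start with one pin in each of two distinct parts,
            # then draw the remaining pins from anywhere.
            a, b = rng.sample(range(m), 2)
            pins = {rng.choice(parts[a]), rng.choice(parts[b])}
            while len(pins) < degree:
                pins.add(rng.randrange(t))
        else:
            # Local net: every pin comes from one randomly chosen part.
            pins = set(rng.sample(rng.choice(parts), degree))
        nets.append(tuple(sorted(pins)))
    return part_of, nets


def multiway_cut(nets, part_of):
    """Count the nets whose pins fall in more than one part."""
    return sum(1 for net in nets if len({part_of[p] for p in net}) > 1)


part_of, nets = build_meku(32, 4, [2, 3] * 8, B=5)
print(multiway_cut(nets, part_of))
```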

Page 12: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

BEKU and MEKU Suite

In 2-way instances, each partition occupies 45-55% of the total area

In 8-way instances, each partition occupies 11.8-13.3% of the total area

# of nodes   # of nets    # of parts   Upper bound
500,000      530,705      2            92,343
500,000      530,705      2            111,873
1,000,000    1,061,410    2            184,714
1,000,000    1,061,410    2            223,520
1,500,000    1,592,114    2            276,670
1,500,000    1,592,114    2            335,242
2,000,000    2,122,819    2            369,526
2,000,000    2,122,819    2            447,781
500,000      530,705      8            139,943
500,000      530,705      8            160,163
1,000,000    1,061,410    8            279,975
1,000,000    1,061,410    8            320,457
1,500,000    1,592,114    8            420,279
1,500,000    1,592,114    8            479,971
2,000,000    2,122,819    8            560,275
2,000,000    2,122,819    8            640,459

URL : http://cadlab.cs.ucla.edu/~pubbench/partitioning/
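To compare instances of different sizes, each tabulated upper bound can be expressed as a share of the net count. The short calculation below does this for the four 500,000-node rows of the table; the function name `bound_percent` is made up for the example, and the figures are copied from the table rather than recomputed from the benchmark files.

```python
# (nodes, nets, parts, upper bound) for the 500,000-node rows of the table
rows = [
    (500_000, 530_705, 2, 92_343),
    (500_000, 530_705, 2, 111_873),
    (500_000, 530_705, 8, 139_943),
    (500_000, 530_705, 8, 160_163),
]


def bound_percent(num_nets, upper_bound):
    """Upper bound on the cutsize as a percentage of the number of nets."""
    return 100.0 * upper_bound / num_nets


percents = [bound_percent(nets, ub) for _, nets, _, ub in rows]
for (nodes, nets, parts, ub), pct in zip(rows, percents):
    print(f"{nodes:,} nodes, {parts}-way: bound is {pct:.1f}% of nets")
```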

Page 13: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Tested Three State-of-the-Art Partitioning Tools

hMetis [G. Karypis et al, 1997]
  Based on multilevel framework
  MHEC and FC clustering algorithms
  Variations of FM for refinement at each level

MLPart [A. Caldwell et al, 2000]
  Based on multilevel framework
  Different algorithms for coarsening (PinEC) and refinement (VRW)

Flare [J. Cong et al, 2000]
  Two-level hierarchy created by the ESC clustering algorithm
  Based on the LR bipartitioning engine and the PM multiway partitioning framework

Page 14: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Experimental Results on BEKU

MLPart produces the best results (very close to our estimated upper bound), and Flare the worst

The value of the bound (as a percentage of nets) influences the quality of hMetis and Flare

[Figure: Quality Ratio (0.9 to 1.4) versus Bound (% of nets, 15% to 25%) for MLPart, hMetis, and Flare]

Page 15: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Experimental Results on BEKU

The runtimes scale well (almost linearly)

Flare runs out of memory when problem size exceeds 1M nodes

[Figure: runtime in minutes (0 to 40) versus circuit size (500,000 to 2,000,000) for hMETIS, MLPart, and Flare]

Page 16: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Experimental Results on MEKU

hMetis is worse than the upper bound by only 2% when the initial bound is 30%, but the gap increases to 18% for a bound of 35%

MLPart does not support multiway partitioning

[Figure: Quality Ratio (0 to 2) versus Bound (% of nets, 30% and 35%) for hMetis and Flare]

Page 17: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Placement Examples with Global Connections

circuit   height   width    WL of longest net   WL contribution of longest 10%
ibm01     8158     4530     7148                51%
ibm02     8158     6430     14224               46%
ibm03     8158     6740     10624               58%
ibm04     8158     9140     15171               53%
ibm05     8158     11055    19064               47%
ibm06     8158     8715     13966               61%
ibm07     8158     14605    14051               51%
ibm08     8158     15895    16142               60%
ibm09     8158     16395    13780               55%
ibm10     8158     27890    30755               53%
ibm11     16350    10925    19234               59%
ibm12     16350    15545    26748               52%
ibm13     16350    12230    19539               59%
ibm14     16350    25475    26370               61%
ibm15     16350    23785    27284               63%
ibm16     16350    34015    42860               59%
ibm17     16283    38895    45686               56%
ibm18     16350    37065    52846               64%

Produced by Dragon on ISPD98

The wirelength contribution from global connections can be significant!

Need to consider the impact of global connections
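The last column of the table can be read as the share of total wirelength that comes from the longest tenth of the nets. The sketch below shows one way to compute such a share from a list of per-net wirelengths. That reading of the column is an assumption, and the function name `top_share` and the sample lengths are invented for illustration; they are not data from the ISPD98 placements.

```python
import math


def top_share(net_lengths, fraction=0.10):
    """Share of total wirelength contributed by the longest `fraction` of nets."""
    if not net_lengths:
        raise ValueError("need at least one net")
    ordered = sorted(net_lengths, reverse=True)
    # Always include at least one net, and round the net count up.
    count = max(1, math.ceil(fraction * len(ordered)))
    return sum(ordered[:count]) / sum(ordered)


# Hypothetical placement: one long global net and nine short local nets.
lengths = [100] + [10] * 9
print(f"{top_share(lengths):.0%} of the wirelength comes from the longest 10% of nets")
```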

Page 18: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Placement Examples with Global Connections Only

Each net connects either a row or a column

Obvious upper bound: sum the length of each row and column

Similar to datapath examples
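The upper bound for these instances can be sketched by building the row and column nets on a regular grid and summing their lengths in that known placement. Several details here are assumptions, since the slide does not give them: modules have equal size and abut, pins sit at module centers, and net length is measured as half-perimeter wirelength. The names `grid_nets`, `hpwl`, and `upper_bound` are hypothetical.

```python
def grid_nets(rows, cols):
    """One net per row and one net per column; a module is identified by (row, col)."""
    row_nets = [[(r, c) for c in range(cols)] for r in range(rows)]
    col_nets = [[(r, c) for r in range(rows)] for c in range(cols)]
    return row_nets + col_nets


def hpwl(net, cell_width=1.0, cell_height=1.0):
    """Half-perimeter wirelength of a net when every module sits at its grid position."""
    xs = [c * cell_width for _, c in net]
    ys = [r * cell_height for r, _ in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def upper_bound(rows, cols, cell_width=1.0, cell_height=1.0):
    """Total wirelength of the grid placement, an upper bound on the optimum."""
    return sum(hpwl(net, cell_width, cell_height) for net in grid_nets(rows, cols))


print(upper_bound(2, 3))
```

Any placer that returns a longer total wirelength than this value on such an instance is provably away from the optimum by at least the difference.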

Page 19: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Placement Examples with Non-local Connections

Extend PEKO [C. Chang et al, 2003] by introducing non-local nets to mimic global connections

All the modules are of equal size, and there is no space between rows and adjacent modules

For nets of degree i, α*d_i of them (a fraction α of the d_i nets of that degree) are generated by randomly connecting i modules; the rest are generated optimally as in PEKO

Page 20: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Placement Examples with Non-local Connections

Input: t = 64, D = {d2=34, d3=20, d4=7, d5=4, d6=2, d7=1}, α = 0.2 (fraction of non-local nets)

Total WL = 160

Generate 28 2-pin nets optimally

Generate 6 2-pin nets randomly

Generate 16 3-pin nets optimally

Generate 4 3-pin nets randomly

Generate 6 4-pin nets optimally

Generate 1 4-pin net randomly

Generate 4 5-pin nets optimally

Generate 2 6-pin nets optimally

Generate 1 7-pin net optimally
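The split between optimally and randomly generated nets in this example can be reproduced with a short calculation. The sketch assumes that the number of random nets of each degree is the non-local fraction (0.2 here, called `alpha` in the code) times the net count, rounded down; the slide does not state the rounding rule, but rounding down gives, for example, 28 optimal and 6 random 2-pin nets. The function name `split_net_counts` is hypothetical.

```python
import math


def split_net_counts(degree_counts, alpha):
    """Return {degree: (optimal nets, random nets)} for a non-local fraction alpha."""
    split = {}
    for degree, total in degree_counts.items():
        # Round down; the small epsilon guards against floating-point error
        # in products that should be whole numbers.
        random_count = math.floor(alpha * total + 1e-9)
        split[degree] = (total - random_count, random_count)
    return split


D = {2: 34, 3: 20, 4: 7, 5: 4, 6: 2, 7: 1}
for degree, (optimal, rand) in split_net_counts(D, 0.2).items():
    print(f"{degree}-pin nets: {optimal} optimal, {rand} random")
```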

Page 21: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

G-PEKU Suite

circuit   #cell    #net   #row   UB
GPeku01   12506    224    113    7.93E+05
GPeku05   28146    336    169    1.79E+06
GPeku10   68685    525    263    4.38E+06
GPeku15   161187   803    402    1.03E+07
GPeku18   210341   918    460    1.34E+07

Module number extracted from ISPD98

URL: http://cadlab.cs.ucla.edu/~pubbench/peku.htm

Page 22: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

PEKU Suite

Module number t and NDVs extracted from ISPD98

Remove connections with pads

Vary the percentage of non-local nets from 0 to 10%

15% white space by expanding one dimension of the chip

Page 23: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

PEKU Suite

% non-local nets   circuit   #cell    #net     #row   Row utilization   LB         UB
0                  Peku01    12506    14111    113    85%               8.14E+05   8.14E+05
0                  Peku05    28146    28446    169    85%               1.91E+06   1.91E+06
0                  Peku10    68685    75196    263    85%               4.73E+06   4.73E+06
0                  Peku15    161187   186608   402    85%               1.15E+07   1.15E+07
0                  Peku18    210341   201920   460    85%               1.32E+07   1.32E+07
0.25%              Peku01    12506    14111    113    85%               8.14E+05   9.23E+05
0.25%              Peku05    28146    28446    169    85%               1.91E+06   2.24E+06
0.25%              Peku10    68685    75196    263    85%               4.73E+06   6.17E+06
0.25%              Peku15    161187   186608   402    85%               1.15E+07   1.71E+07
0.25%              Peku18    210341   201920   460    85%               1.32E+07   2.01E+07
0.50%              Peku01    12506    14111    113    85%               8.14E+05   1.02E+06
0.50%              Peku05    28146    28446    169    85%               1.91E+06   2.63E+06
0.50%              Peku10    68685    75196    263    85%               4.73E+06   7.52E+06
0.50%              Peku15    161187   186608   402    85%               1.15E+07   2.30E+07
0.50%              Peku18    210341   201920   460    85%               1.32E+07   2.75E+07

The suite continues up to 10% non-local nets

URL: http://cadlab.cs.ucla.edu/~pubbench/peku.htm

Page 24: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Tested Four State-of-the-Art Placers

Capo [A. Caldwell et al, 2000]
  Based on multilevel partitioner
  Aims to enhance the routability

Dragon [M. Wang et al, 2000]
  Uses hMetis for initial partition
  SA with bin-based swapping

mPL [T. Chan et al, 2000]
  Nonlinear programming on the coarsest level
  Goto based relaxation

mPG [C. Chang et al, 2002]
  Uses FC clustering and hierarchical density control
  Incremental A-tree for routability

Page 25: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Experimental Results on G-PEKU

The gap between their solutions and the upper bound varies between 79% and 102% in the worst case

Another validation that there is significant room for improvement for the placement problem

circuit   Dragon v.2.20 QR   Capo v.8.5 QR   mPG v.1.0 QR   mPL v.2.0 QR
GPeku01   1.98               1.56            1.91           1.69
GPeku05   2.01               1.69            1.97           1.83
GPeku10   2.02               1.72            1.98           1.94
GPeku15   1.99               1.79            1.97           1.97
GPeku18   2.02               1.78            1.98           1.98
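The 79% to 102% range quoted on this slide can be recovered from the table if QR (quality ratio) is read as a placer's wirelength divided by the known upper bound, so that a QR of 2.02 means a 102% gap. That definition is an assumption, as is the mapping of the four values in each row to the placers in header order. The function name `worst_gap_percent` is made up for the example.

```python
# Quality ratios per placer for GPeku01, 05, 10, 15, and 18, copied from the table.
quality_ratios = {
    "Dragon v.2.20": [1.98, 2.01, 2.02, 1.99, 2.02],
    "Capo v.8.5": [1.56, 1.69, 1.72, 1.79, 1.78],
    "mPG v.1.0": [1.91, 1.97, 1.98, 1.97, 1.98],
    "mPL v.2.0": [1.69, 1.83, 1.94, 1.97, 1.98],
}


def worst_gap_percent(ratios):
    """Largest gap above the upper bound, in percent, over all circuits."""
    return round((max(ratios) - 1.0) * 100)


worst = {placer: worst_gap_percent(qr) for placer, qr in quality_ratios.items()}
for placer, gap in worst.items():
    print(f"{placer}: worst-case gap {gap}%")
```

Under these assumptions, the smallest worst-case gap belongs to Capo and the largest to Dragon, which matches the two ends of the quoted range.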

Page 26: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Experimental Results on PEKU

[Figure: Quality Ratio (1 to 2.2) versus % of non-local nets (0.00%, 0.25%, 0.50%, 0.75%, 1.00%, 2.00%, 5.00%, 10.00%) for Capo v.8.5, Dragon v.2.20, mPG v.1.0, and mPL v.2.0]

mPL’s QR increases when the percentage of non-local nets is increased from 0 to 0.75%, while for the other three placers, QRs are steadily decreasing

Absolute value of the QRs may not be meaningful, but it helps to identify the technique that works best under each scenario

Page 27: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Overview (current section: Conclusions and future work)

Page 28: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Conclusions

Bipartitioning techniques seem fairly mature
  The best available algorithms perform and scale very well on examples by our construction

The best available multiway partitioning algorithms do not perform equally well
  The worst divergence from the upper bound is 18% by hMetis

There is still significant room for improvement in circuit placement
  Existing placement algorithms may produce solutions far away from the optimal (or upper bound)
  Their effectiveness depends much on the characteristics of circuits

Page 29: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Future Work

Construction of more synthetic examples
  Measure routability optimality
  Measure timing optimality

Understand the deficiencies of existing algorithms using these examples

Guide the development of new VLSI CAD algorithms

Page 30: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms

Acknowledgement

Prof. I. Markov for providing Capo’s latest version

Prof. S. Lim for providing Flare’s latest version

X. Yuan for providing the data of mPG

J. Shinnerl and K. Sze for providing the experimental data of mPL

Page 31: Optimality, Scalability and Stability study of Partitioning and Placement Algorithms


THE END

THANK YOU