

Recursive Bisection Placement: Feng Shui 5.0 Implementation Details

Ameya R. Agnihotri
SUNY Binghamton CSD
Binghamton, NY
[email protected]

Satoshi Ono
SUNY Binghamton CSD & University of Kitakyushu
Binghamton, NY
[email protected]

Patrick H. Madden
SUNY Binghamton CSD & University of Kitakyushu
Binghamton, NY
[email protected]

ABSTRACT
In this paper, we summarize circuit placement techniques and algorithms developed by the BLAC CAD research group; these have been integrated into our recursive bisection based placement tool feng shui. We also briefly describe current research interests.

Categories and Subject Descriptors: J.6 [Computer-Aided Engineering]: CAD

General terms: Algorithms

Keywords: Placement, floorplanning, mixed block design

1. INTRODUCTION
Circuit placement is a well-studied area of physical design. Simulated annealing[13, 22, 26], recursive bisection[3, 7, 4, 2], and analytic methods[14, 25, 8, 24] are widely used. In this paper, we summarize research performed at Binghamton University, resulting in improved techniques and algorithms for placement. Due to space constraints, we describe our work briefly; a longer version of this paper, related publications, and binary releases of our tools are available on the web at http://vlsicad.cs.binghamton.edu.

The primary vehicle for testing research ideas has been our recursive bisection based placement tool feng shui; the overall flow of our tool is shown in Figure 1. Following the release of version 5.0, development based around recursive bisection will be put on hold, as our focus is now on a new approach.

This paper is organized into four main sections; we consider global placement, legalization, detailed placement, and future directions for placement research.

Copyright is held by the author/owner. ISPD’05, April 3–6, 2005, San Francisco, California, USA. ACM 1-59593-021-3/05/0004.

2. GLOBAL PLACEMENT
The overall flow of our bisection based placement approach follows traditional methods[3, 7]. Starting from an initial circuit netlist and placement region, we repeatedly divide the logic elements using a partitioner (either hMetis or MLPart). Terminal propagation normally results in an interdependence of partitioning results for different regions. A common approach to address the interdependence is through “cycling”[23, 10, 11]; a region may be partitioned multiple times before terminal propagation becomes stable. As an alternative to cycling, we investigated large-scale multi-way partitioning in [27], using a method based on iterative deletion[18]. While we were unable to eliminate cycling from our flow, and wire length improvements were modest, using the initial partitioning solutions provided by iterative deletion reduced the number of cycles needed for convergence.
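The cycling idea can be illustrated with a minimal Python sketch. It is not the feng shui implementation: `terminal_side` and `cycle_until_stable` are hypothetical names, the `repartition` callback stands in for a call to a real partitioner, and the default of three passes is taken from the flow shown in Figure 1.

```python
from typing import Callable, Optional, Tuple, TypeVar

S = TypeVar("S")


def terminal_side(pin_x: float, cut_x: float) -> str:
    """Propagate an external pin to one side of a vertical cut line.

    Assumption: the pin is treated as fixed on whichever side of the
    cut it lies, which is the usual form of terminal propagation.
    """
    return "left" if pin_x < cut_x else "right"


def cycle_until_stable(
    repartition: Callable[[Optional[S]], S], max_passes: int = 3
) -> Tuple[Optional[S], int]:
    """Repartition repeatedly until the solution stops changing.

    `repartition` receives the previous solution (None on the first
    pass) so it can redo terminal propagation, and returns a new one.
    Returns the final solution and the number of passes performed.
    """
    solution: Optional[S] = None
    for passes in range(1, max_passes + 1):
        updated = repartition(solution)
        if updated == solution:
            # Terminal propagation is stable; stop early.
            return solution, passes
        solution = updated
    return solution, max_passes
```

Passing the previous solution into the callback is a design choice made here so that a better starting point, such as one from iterative deletion, could reduce the number of passes needed.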

For bisection based placement, the direction of cut lines can influence results considerably. Early methods for cut line selection were essentially ad hoc; in [28], we performed dynamic programming based analysis using Rent’s Rule[20]. This analysis showed that an aspect ratio based methodology is appropriate; adjustment of the aspect ratios used resulted in improved results for both feng shui and Capo.
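As a rough illustration of aspect ratio based cut line selection, the Python sketch below picks a cut direction from the shape of a region. The function name and the `ratio_threshold` parameter are assumptions made for this example; the tuned aspect ratios from [28] are not reproduced here.

```python
def choose_cut_direction(
    width: float, height: float, ratio_threshold: float = 1.0
) -> str:
    """Pick a cut direction from the aspect ratio of a region.

    Returns "vertical" (split left/right) when width / height exceeds
    the threshold, otherwise "horizontal" (split top/bottom).
    The threshold value is a hypothetical tuning knob.
    """
    if width <= 0 or height <= 0:
        raise ValueError("region dimensions must be positive")
    return "vertical" if width / height > ratio_threshold else "horizontal"
```

Raising or lowering `ratio_threshold` biases the cut sequence towards one direction, which is the kind of adjustment the analysis above suggests is worth tuning.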

The bisection approach, cycling for terminal propagation, and different possible cut sequences are illustrated in Figure 2.

While results of feng shui were good on MCNC benchmarks, pathological behavior was observed on more recent benchmarks, in particular the synthetic PEKO examples[6]. The degradation in performance was due to the narrow region problem; this problem was addressed in a novel way by the introduction of fractional cut lines[2]. Rather than aligning horizontal cut lines with row boundaries, they are instead allowed to float freely; this requires a legalization step, which we describe in the next section. While the need for legalization introduced a new problem to be solved, we find that it can in fact be solved relatively easily; the net result of fractional cut lines is roughly a 5% reduction in wire length, and a greater degree of stability in standard cell placement. A second, and perhaps greater, benefit of fractional cut lines is a simplification in placement for circuits that contain both macro blocks and standard cells. Without a constraint on row alignment or legality during global placement, macro blocks and standard cells can be handled simultaneously by recursive bisection[12].
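The difference between a fractional and a row-aligned horizontal cut can be sketched in a few lines of Python. This is an illustrative model, not code from feng shui: the function names are hypothetical, and the cut is assumed to be positioned so that each subregion's height is proportional to the cell area assigned to it.

```python
def fractional_cut(
    y_bottom: float, y_top: float, area_below: float, area_above: float
) -> float:
    """Place a horizontal cut so subregion heights match the area split."""
    total = area_below + area_above
    if total <= 0:
        raise ValueError("total cell area must be positive")
    return y_bottom + (y_top - y_bottom) * area_below / total


def row_aligned_cut(
    y_bottom: float,
    y_top: float,
    area_below: float,
    area_above: float,
    row_height: float,
) -> float:
    """Same cut, snapped to the nearest row boundary for comparison."""
    y = fractional_cut(y_bottom, y_top, area_below, area_above)
    rows = round((y - y_bottom) / row_height)
    return y_bottom + rows * row_height
```

For a region three rows tall with a 40/60 area split, the fractional cut sits at 1.2 row heights while the row-aligned cut snaps to 1.0, so the lower subregion would receive 40% of the area but only a third of the space. In narrow regions this mismatch becomes severe.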

3. LEGALIZATION
To obtain a legal placement after fractional cut based bisection, a number of techniques were investigated.

Dynamic Programming Based Legalization: our first approach to legalization used a row-by-row dynamic programming algorithm[2]. The approach minimizes displacement for cells assigned to a row, and is similar in many respects to other DP based optimization techniques. We are currently investigating methods to legalize multiple rows at a time, using cost functions other than cell displacement.

Figure 1: Overall flow of feng shui 5.0. (Flow diagram; its stages and annotations are listed below.)

Global Placement: bisection based, with fractional cut lines to obtain uniform area utilization. The initial region size matches the total area of cells and macro blocks; partitioning attempts to split each region roughly in half. Iterative deletion provides early terminal propagation information, which is refined through cycling. Recent collaboration with Purdue and UCLA has shown that cut lines can be shifted to adjust placement density without degrading wire lengths.
Steps: aspect ratio based cut line selection [28]; iterative deletion [18, 27]; multi-level partitioning; fractional-cut based region adjustment [2]; cycle (by default, three passes); optional cut line shifting for routability or gate sizing [16, 17].

Legalization: a dynamic programming method provides good results for standard cell designs, while a greedy method is used for mixed size.
Steps: single or multi-row dynamic programming (standard cell design) [2]; greedy Tetris (mixed size) [9, 15, 12].

Detailed Placement: standard branch-and-bound methods are used to optimize standard cell locations; recent experiments indicate a great deal of potential in this area.
Steps: sliding window branch-and-bound.

The method operates as follows: we first sort all cells by their vertical location, based on the result of global placement. A subset of these cells (the “highest” locations, with a total area slightly in excess of the desired area of the cell row) is selected. The dynamic programming formulation then assigns a cost for either inserting the cell into the top row (the cost being equal to displacement), or for deferring the legalization of the cell for a subsequent row (the cost being equal to the distance to the next row). By processing cells from left to right, the optimal solution (for a single row) can be found quickly. After placing cells into the top row, the process moves to the next row down.
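A simplified version of the single-row dynamic program described above can be sketched in Python. Several details here are assumptions made for illustration and are not taken from [2]: cell widths are integer site counts, cells inserted into the row are packed from the left edge, displacement is measured as Manhattan distance, and the names `Cell` and `legalize_row` are hypothetical.

```python
from typing import Dict, List, Tuple

# (name, x, y, width in sites); candidate cells for the current row.
Cell = Tuple[str, float, float, int]


def legalize_row(
    cells: List[Cell], row_y: float, next_row_y: float, capacity: int
) -> Tuple[float, List[Tuple[str, int]]]:
    """Choose which candidate cells go into the row at minimum cost.

    State: number of sites already used in the row. For each cell,
    processed left to right, either insert it at the next free site
    (cost = displacement) or defer it (cost = distance to next row).
    Returns the total cost and a list of (cell name, site) pairs.
    """
    best: Dict[int, Tuple[float, List[Tuple[str, int]]]] = {0: (0.0, [])}
    for name, x, y, width in sorted(cells, key=lambda c: c[1]):
        nxt: Dict[int, Tuple[float, List[Tuple[str, int]]]] = {}
        for used, (cost, placed) in best.items():
            # Option 1: defer the cell to a subsequent row.
            defer_cost = cost + abs(y - next_row_y)
            if used not in nxt or defer_cost < nxt[used][0]:
                nxt[used] = (defer_cost, placed)
            # Option 2: insert the cell at the next free site, if it fits.
            if used + width <= capacity:
                place_cost = cost + abs(x - used) + abs(y - row_y)
                key = used + width
                if key not in nxt or place_cost < nxt[key][0]:
                    nxt[key] = (place_cost, placed + [(name, used)])
        best = nxt
    return min(best.values(), key=lambda entry: entry[0])
```

With three two-site cells competing for a four-site row, the sketch keeps the two cells that are cheapest to insert and defers the third. Deferred cells would become candidates when the procedure moves to the next row down.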

Greedy Legalization: for mixed size benchmarks, our early attempts to adapt our dynamic programming based method were not satisfactory. Based on a preprint of a paper by Li and Koh[15], we began investigating a simple greedy approach; we later learned that a similar standard cell legalization method had previously been patented by Hill[9]. We extended this method to handle macro blocks, which were not addressed in the patent. The method is surprisingly fast, and produces excellent results in designs where the circuit elements are distributed uniformly. On average, the placements produced by feng shui 2.6[12] improved wire lengths by 26% over mPG[5], and by 29% over a combination of Capo and Parquet[1].
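The flavor of a greedy, Tetris-like legalizer can be conveyed with a short Python sketch for standard cells only. It is not the method of [9], [15], or [12]: the row-frontier bookkeeping, the Manhattan cost, and the name `greedy_legalize` are assumptions, and macro block handling is omitted.

```python
from typing import Dict, List, Tuple

# (name, x, y, width); positions come from global placement.
Cell = Tuple[str, float, float, float]


def greedy_legalize(
    cells: List[Cell], row_ys: List[float], row_width: float
) -> Dict[str, Tuple[float, float]]:
    """Assign each cell, in left-to-right order, to its cheapest row.

    Each row keeps a frontier: the leftmost x that is still free.
    A cell is placed at max(frontier, desired x) in the row that
    minimizes its Manhattan displacement.
    """
    frontier = [0.0] * len(row_ys)
    result: Dict[str, Tuple[float, float]] = {}
    for name, x, y, width in sorted(cells, key=lambda c: c[1]):
        best = None
        for r, row_y in enumerate(row_ys):
            px = max(frontier[r], x)
            if px + width > row_width:
                continue  # cell does not fit in this row
            cost = abs(px - x) + abs(row_y - y)
            if best is None or cost < best[0]:
                best = (cost, r, px)
        if best is None:
            raise ValueError(f"no row has space for cell {name}")
        _, r, px = best
        frontier[r] = px + width
        result[name] = (px, row_ys[r])
    return result
```

Because every cell is placed once and never moved, the running time is proportional to the number of cells times the number of rows examined, which is consistent with the speed noted above. The quality depends on the cells being evenly spread beforehand.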

With fractional cut bisection, distribution for standard cells is quite uniform, making the legalization step trivial. For cases where the macro blocks are large and rectangular, there can be significant amounts of overlap; this results in a degradation of wire length. We are addressing this problem in our current work.

4. DETAILED PLACEMENT
When the PEKO[6] benchmarks were first presented, there appeared to be a large gap between optimal placements and the results of current placement tools. The degree of suboptimality, with wire lengths from 1.5X to more than 2.5X away from optimal, generated a flurry of interest.

Figure 2: Cut line orientation and terminal propagation are key issues in bisection based placement; our tool employs cycling, and aspect ratio based cut line orientation.
(a) The overall recursive bisection approach: the first cut line creates subregions A and B; the second and third cut lines split these into A1, A2 and B1, B2.
(b) Cycling of partitioning to address terminal propagation changes: the two halves are repartitioned in turn (A1, A2 become A1', A2'; then B1, B2 become B1', B2'; and so on).
(c) Many cut orders and directions are possible; aspect ratio works well.

Our recent research[19, 21], however, indicates that for the PEKO benchmarks, this suboptimality is almost entirely “local.” Version 5.0 of feng shui supports a graphical user interface that can display placements in a variety of ways. For the “pixmap” display method, a graphical image is mapped onto the cells of a reference placement, and then the cells are rearranged to match placements produced by other tools.
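The “pixmap” idea can be mimicked on a small grid with the Python sketch below. It is only a toy model of the display method: one cell per pixel, integer grid coordinates, and the function names are all assumptions for this example.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, Tuple[int, int]]  # cell name -> (column, row)


def map_image_to_cells(
    reference: Placement, image: List[List[int]]
) -> Dict[str, int]:
    """Give each cell the pixel value found at its reference location."""
    return {cell: image[row][col] for cell, (col, row) in reference.items()}


def render(
    placement: Placement,
    colours: Dict[str, int],
    width: int,
    height: int,
    background: int = 0,
) -> List[List[int]]:
    """Draw the cells at their locations in another placement."""
    canvas = [[background] * width for _ in range(height)]
    for cell, (col, row) in placement.items():
        canvas[row][col] = colours[cell]
    return canvas
```

If a second placement is a mirror image of the reference, the rendered picture comes out mirrored but intact. A placement that scrambles cells locally blurs the picture while preserving its overall shape.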

In Figure 3, we show the feng shui 5.0 interface displaying the optimal placement for PEKO01, with an image of one of the authors mapped to the cells. Below this are rearrangements of cells to match the results of the recursive bisection engine of feng shui, a structural placement technique[19] also included in feng shui, and the results of the tools mPL[6], Dragon[26], Capo[4], and Kraftwerk[8]. While the images are rotated or stretched, it is clear that the “global” placements of all tools are quite good, and thus the bulk of observed suboptimality is “local.” Despite the lack of external pads, the placement of the analytic tool Kraftwerk also displays a recognizable pattern.

The detailed placement engine of feng shui is based on sliding-window branch-and-bound optimization. Optimal rearrangements of small groups of cells (usually 6 or fewer) are found repeatedly. The images in Figure 3 show that while the placement results are “close” to optimal, the windows used for detail placement are not large enough to eliminate the small scale suboptimality.
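A sliding-window pass over a single row can be sketched in Python as follows. To keep the example short, the optimal ordering inside each window is found by exhaustive enumeration rather than branch-and-bound; both give the same answer, but branch-and-bound prunes the search. The one-dimensional wire length model, the packed row, and all function names are assumptions for illustration.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple


def centers(
    order: Sequence[str], widths: Dict[str, float], start: float
) -> Dict[str, float]:
    """Center x of each cell when the row is packed from `start`."""
    x, pos = start, {}
    for cell in order:
        pos[cell] = x + widths[cell] / 2.0
        x += widths[cell]
    return pos


def hpwl_x(pos: Dict[str, float], nets: Sequence[Tuple[str, ...]]) -> float:
    """Horizontal half-perimeter wire length over all nets."""
    return sum(
        max(pos[p] for p in net) - min(pos[p] for p in net) for net in nets
    )


def optimize_windows(
    order: Sequence[str],
    widths: Dict[str, float],
    fixed_pins: Dict[str, float],
    nets: Sequence[Tuple[str, ...]],
    start: float = 0.0,
    window: int = 3,
) -> List[str]:
    """Slide a window along the row, reordering its cells optimally."""
    order = list(order)
    for i in range(len(order) - window + 1):
        best_cost, best_perm = None, None
        for perm in permutations(order[i:i + window]):
            trial = order[:i] + list(perm) + order[i + window:]
            pos = {**fixed_pins, **centers(trial, widths, start)}
            cost = hpwl_x(pos, nets)
            if best_cost is None or cost < best_cost:
                best_cost, best_perm = cost, list(perm)
        order[i:i + window] = best_perm
    return order
```

The number of orderings grows factorially with the window size, which is why windows of about 6 cells are a practical limit for this style of search.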

For this reason, we are currently exploring methods to perform optimal placement for windows that are much larger than 6 elements. A method based on branch-and-price integer linear programming[21] has been developed, and can solve problems with up to 100 cells; at present, however, the method is too slow to be used across an entire design, and this has not yet been fully integrated into the feng shui code base.

5. FUTURE DIRECTIONS
The research focus of the group has moved away from recursive bisection. Observations made with PEKO benchmarks suggest that from a global perspective, results from annealing, analytic, and bisection are surprisingly similar. While there are many groups working on traditional global placement methods, only relatively marginal improvements seem likely; thus, we are investigating a fundamentally new global placement approach.

Figure 3: The feng shui 5.0 user interface, displaying an image mapped onto the optimal placement of PEKO01. By rearranging the cells to match the results of other placement tools, it is clear that a great deal of placement suboptimality for these benchmarks is “local” in nature. (Panels: feng shui 5.0 GUI with an image mapped to the optimal placement of PEKO01; feng shui recursive bisection placement; feng shui structural mode placement; mPL placement; Dragon placement; Capo placement; Kraftwerk placement, not legalized.)

Our new approach attempts to extract structure by topological analysis of a circuit[19]. This works extremely well on the PEKO benchmarks, where our results are now within 22% of optimal. We believe that the method can be extended to more realistic structures, and that to address rising interconnect delay, future circuit structures will transition towards more “PEKO-like” topologies.

We are also actively pursuing incremental placement techniques[16, 17]. To facilitate convergence of physical synthesis, we strongly believe that transforming an existing placement to support sizing and inserted buffers is superior to attempting to gain stability in a separate run of the placement tools. Our work on placement legalization complements this. With our white space allocation methods, cut line shifting, and an improvement on the cell shifting technique presented in [24], we are able to regain placement legality without wide changes to net lengths.

Finally, we are actively pursuing the integration of physical design steps into a cohesive optimization approach. The global and detail routing methods developed by the group are currently integrated into the development version of feng shui, and will likely be released publicly in the summer of 2005. The next release may also feature a novel delay analysis approach, as well as support for Synopsys PowerARC delay libraries. We expect to have our gate sizing and buffer insertion methods fully integrated shortly.

Acknowledgements: a number of students have worked on feng shui. In particular, Prof. Mehmet Can Yildiz is responsible for a large portion of the early work. Ateen Khatkhate and Udayan Deshpande influenced the development of the mixed size placement approach. We actively collaborate with our colleagues at Purdue, Dr. Cheng-Kok Koh, and his student Chen Li; they have been instrumental in the development of both the mixed size legalization strategy, and the incremental placement techniques. The research performed by the group would not have been possible without the generous support of IBM, Intel, SRC, NYSTAR/MDC, IEEC, and the SUNY Binghamton Research Foundation.

6. REFERENCES
[1] S. N. Adya, I. L. Markov, and P. G. Villarrubia. On whitespace in mixed-size placement and physical synthesis. In Proc. Int. Conf. on Computer Aided Design, pages 311–318, 2003.

[2] A. Agnihotri, M. C. Yildiz, A. Khatkhate, A. Mathur, S. Ono, and P. H. Madden. Fractional cut: Improved recursive bisection placement. In Proc. Int. Conf. on Computer Aided Design, pages 307–310, 2003.

[3] M. A. Breuer. A class of min-cut placement algorithms. In Proc. Design Automation Conf., pages 284–290, 1977.

[4] A. E. Caldwell, A. B. Kahng, and I. L. Markov. Can recursive bisection alone produce routable placements? In Proc. Design Automation Conf., pages 477–482, 2000.

[5] C. C. Chang, J. Cong, and X. Yuan. Multi-level placement for large-scale mixed-size IC designs. In Proc. Asia South Pacific Design Automation Conf., pages 325–330, 2003.

[6] C. C. Chang, J. Cong, and M. Xie. Optimality and scalability study of existing placement algorithms. In Proc. Asia South Pacific Design Automation Conf., pages 621–627, 2003.

[7] A. E. Dunlop and B. W. Kernighan. A procedure for placement of standard-cell VLSI circuits. IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, CAD-4(1):92–98, January 1985.

[8] H. Eisenmann and F. M. Johannes. Generic global placement and floorplanning. In Proc. Design Automation Conf., pages 269–274, 1998.

[9] D. Hill. US patent 6,370,673: Method and system for high speed detailed placement of cells within an integrated circuit design, 2002.

[10] D. J.-H. Huang and A. B. Kahng. Partitioning based standard cell global placement with an exact objective. In Proc. Int. Symp. on Physical Design, pages 18–25, 1997.

[11] A. B. Kahng and S. Reda. Placement feedback: A concept and method for better min-cut placements. In Proc. Design Automation Conf., pages 357–362, 2004.

[12] A. Khatkhate, C. Li, A. R. Agnihotri, M. C. Yildiz, S. Ono, C.-K. Koh, and P. H. Madden. Recursive bisection based mixed block placement. In Proc. Int. Symp. on Physical Design, pages 84–89, 2004.

[13] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, May 1983.

[14] J. Kleinhans, G. Sigl, F. Johannes, and K. Antreich. GORDIAN: VLSI placement by quadratic programming and slicing optimization. IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 10(3):356–365, 1991.

[15] C. Li and C.-K. Koh. On improving recursive bipartitioning-based placement. Technical Report TR-ECE-03-14, Purdue University ECE, 2003.

[16] C. Li, C.-K. Koh, and P. H. Madden. Routability-driven placement and white space allocation. In Proc. Int. Conf. on Computer Aided Design, pages 394–401, 2004.

[17] C. Li, C.-K. Koh, and P. H. Madden. Floorplan management: Incremental placement for gate sizing and buffer insertion. In Proc. Asia South Pacific Design Automation Conf., pages 349–354, 2005.

[18] P. H. Madden. Partitioning by iterative deletion. In Proc. Int. Symp. on Physical Design, pages 83–89, 1999.

[19] S. Ono and P. H. Madden. On structure and suboptimality in placement. In Proc. Asia South Pacific Design Automation Conf., pages 331–336, 2005.

[20] C. E. Radke. A justification of, and an improvement on, a useful rule for predicting circuit-to-pin ratios. In Proc. Design Automation Conf., pages 257–267, 1969.

[21] P. Ramachandran, A. R. Agnihotri, S. Ono, P. Damodaran, H. Srihari, and P. H. Madden. Optimal placement by branch-and-price. In Proc. Asia South Pacific Design Automation Conf., pages 337–341, 2005.

[22] C. Sechen and A. Sangiovanni-Vincentelli. Timberwolf3.2: A new standard cell placement and global routing package. In Proc. Design Automation Conf., pages 432–439, 1986.

[23] P. R. Suaris and G. Kedem. An algorithm for quadrisection and its application to standard cell placement. IEEE Trans. on Circuits and Systems, 35(3):294–303, 1988.

[24] N. Viswanathan and C. C.-N. Chu. Fastplace: Efficient analytical placement using cell shifting, iterative local refinement and a hybrid net model. In Proc. Int. Symp. on Physical Design, pages 26–33, 2004.

[25] J. Vygen. Algorithms for large-scale flat placement. In Proc. Design Automation Conf., pages 746–751, 1997.

[26] M. Wang, X. Yang, and M. Sarrafzadeh. Dragon2000: Standard-cell placement tool for large industry circuits. In Proc. Int. Conf. on Computer Aided Design, pages 260–263, 2000.

[27] M. C. Yildiz and P. H. Madden. Global objectives for standard cell placement. In Proc. Great Lakes Symposium on VLSI, pages 68–72, 2001.

[28] M. C. Yildiz and P. H. Madden. Improved cut sequences for partitioning based placement. In Proc. Design Automation Conf., pages 776–779, 2001.