Power-Aware Placement Yongseok Cheon, Pei-Hsin Ho Advanced Technology Group, Synopsys, Inc. {cheon,pho}@synopsys.com Andrew B. Kahng, Sherief Reda and Qinke Wang UCSD CSE Department


  • Slide 1
  • Power-Aware Placement Yongseok Cheon, Pei-Hsin Ho Advanced Technology Group, Synopsys, Inc. {cheon,pho}@synopsys.com Andrew B. Kahng, Sherief Reda and Qinke Wang UCSD CSE Department {abk,sreda,qiwang}@cs.ucsd.edu
  • Slide 2
  • Outline: Introduction; Activity-based register clustering; Activity-based net weighting; Experiments; Conclusions
  • Slide 3
  • IC Power Consumption. Switching power: the largest source of power dissipation, usually accounting for 40% to 80% of total power; the switching power of a net is proportional to the product of its capacitance and its signal switching rate. Short-circuit power: dissipation due to the short-circuit current that flows briefly while a CMOS gate switches. Leakage power: dissipation due to spurious currents in the non-conducting state of a transistor.
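The proportionality on this slide can be illustrated with a minimal sketch. It assumes the standard CMOS relation P = 0.5 · C · Vdd² · (transitions per second); the function name, the supply voltage, and the example net values are hypothetical and not taken from the slides.

```python
def net_switching_power(capacitance_f, switching_rate_hz, vdd=1.0):
    """Switching power of one net in watts.

    Assumes P = 0.5 * C * Vdd^2 * (transitions per second); the slide only
    states that power is proportional to capacitance times switching rate.
    """
    return 0.5 * capacitance_f * vdd ** 2 * switching_rate_hz


# Hypothetical nets: (capacitance in farads, switching rate in transitions/s).
nets = {
    "clk": (2.0e-12, 1.0e9),    # large capacitance, highest activity
    "data0": (0.1e-12, 1.0e8),  # small capacitance, lower activity
}

per_net = {name: net_switching_power(c, r) for name, (c, r) in nets.items()}
total_power = sum(per_net.values())
```

With these made-up numbers the clock net dominates the total, which matches the motivation given on the next slide.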
  • Slide 4
  • Clock Power Consumption. The clock net is a major contributor to dynamic power: it has much larger capacitance than most signal nets and the highest switching activity, and it typically consumes up to 40% of total dynamic power across a variety of design types. Traditional placement methodologies treat registers no differently from combinational cells, which leads to placements that are sub-optimal in terms of power.
  • Slide 5
  • Power-Aware Placement Method. Activity-based register clustering reduces the capacitance of clock nets and hence clock power. Activity-based net weighting reduces the capacitance of high-activity signal nets and hence total net switching power.
  • Slide 6
  • Outline: Introduction; Activity-based register clustering; Activity-based net weighting; Experiments; Conclusions
  • Slide 7
  • Large Weight for Clock Net? Not a good idea: it may only affect registers close to the boundaries, and it introduces hot spots and highly congested areas.
  • Slide 8
  • Distribution of Clock Tree Capacitance. Observation: most of the clock tree capacitance (e.g., 80%) is at the leaf level.
  • Slide 9
  • Register Clustering. Goal: reduce the capacitance of a clock net. Method: clump the registers within the same leaf cluster of the clock tree into a smaller area. Result: reduced leaf-level clock tree capacitance and potentially reduced clock skew.
  • Slide 10
  • Flow of Register Clustering. 1. Quick CTS algorithm: group registers into clusters such that each cluster can become a leaf cluster of the actual clock tree. 2. Group bounds: constrain the placement of each cluster of registers to a smaller bounding box.
  • Slide 11
  • Quick Clock-Tree Synthesis Algorithm. Decide a range of target cluster sizes heuristically, based on the size of the clock net, the design rule constraints (max fanout and max load), and user configuration. Perform clustering for each sweep direction (from the left, right, top, and bottom) and each target cluster size. Select the clustering with the best CTS objective, e.g., minimum clock skew, minimum clock delay, or minimum number of clock buffers.
  • Slide 12
  • Quick CTS Algorithm (cont'd). Start with the leftmost (or rightmost, highest, or lowest) unclustered clock pin. Add the clock pin with the shortest Manhattan distance to the capacitance-weighted centroid of the current cluster. Grow the cluster until it reaches the target cluster size. Repeat growing clusters until all clock pins are clustered.
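The greedy growth step described on this slide can be sketched as follows. This is an illustrative reading, not the authors' implementation: it covers only the left-to-right sweep, measures the target cluster size as a pin count (the slides also mention max fanout and max load), and the names `quick_cluster`, `weighted_centroid`, and `manhattan` are hypothetical.

```python
def manhattan(a, b):
    """Manhattan distance between two points given as (x, y, ...) tuples."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def weighted_centroid(pins):
    """Capacitance-weighted centroid of pins given as (x, y, capacitance)."""
    total_cap = sum(cap for _, _, cap in pins)
    cx = sum(x * cap for x, _, cap in pins) / total_cap
    cy = sum(y * cap for _, y, cap in pins) / total_cap
    return (cx, cy)


def quick_cluster(pins, target_size):
    """Greedy clustering of clock pins, sweeping from the left.

    pins: list of (x, y, capacitance); target_size: assumed to be a pin count.
    """
    remaining = list(pins)
    clusters = []
    while remaining:
        # Seed each cluster with the leftmost unclustered pin.
        seed = min(remaining, key=lambda p: (p[0], p[1]))
        remaining.remove(seed)
        cluster = [seed]
        # Grow by repeatedly adding the pin nearest to the current centroid.
        while remaining and len(cluster) < target_size:
            centroid = weighted_centroid(cluster)
            nearest = min(remaining, key=lambda p: manhattan(p, centroid))
            remaining.remove(nearest)
            cluster.append(nearest)
        clusters.append(cluster)
    return clusters


example_pins = [(0, 0, 1), (1, 0, 1), (10, 0, 1), (11, 1, 1), (0, 1, 1)]
example_clusters = quick_cluster(example_pins, target_size=3)
```

A fuller version would repeat this for the other three sweep directions and for each candidate cluster size, then keep the clustering that scores best on the chosen CTS objective.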
  • Slide 13
  • Group Bounds. Control the bounding box of a cluster and reduce it while still fitting the registers. Compute the current bounding box of the registers. Shrink the bounding box proportionally. The shrink ratio p is determined by the specified shrinking factor p_0, the switching rate SR of the clock net, and the maximum switching rate MSR.
  • Slide 14
  • Aspect Ratio of Bounding Box. When the shrink ratio p is close to 1, the aspect ratio stays close to the original bounding box aspect ratio AR_old, which avoids a serious increase in signal net length. When p is close to 0, the box is close to square, which reduces clock skew. The new aspect ratio is a linear function of AR_old and p.
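One way to read the group-bound shrinking and the aspect-ratio rule is sketched below. The slides give only the qualitative behavior, so several choices here are assumptions: aspect ratio is taken as width/height, the new aspect ratio is the linear interpolation p · AR_old + (1 − p) · 1, the box area is scaled by p, the box stays centered, and `min_area` is a hypothetical lower bound standing in for "still fitting the registers".

```python
import math


def shrink_group_bound(registers, p, min_area=0.0):
    """Return a shrunken bounding box (xmin, ymin, xmax, ymax).

    registers: list of (x, y) register locations; p: shrink ratio in (0, 1].
    All formulas below are assumptions consistent with the slide text.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    xs = [x for x, _ in registers]
    ys = [y for _, y in registers]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    if width <= 0 or height <= 0:
        raise ValueError("degenerate bounding box")

    ar_old = width / height
    # Linear in AR_old and p: equals AR_old at p = 1, tends to 1 (square) as p -> 0.
    ar_new = p * ar_old + (1.0 - p)
    # Assumed area scaling, never below the area needed to hold the registers.
    area = max(p * width * height, min_area)

    new_h = math.sqrt(area / ar_new)
    new_w = ar_new * new_h
    cx, cy = (min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0
    return (cx - new_w / 2.0, cy - new_h / 2.0, cx + new_w / 2.0, cy + new_h / 2.0)


example_box = shrink_group_bound([(0.0, 0.0), (4.0, 1.0)], p=0.5)
```

For a 4 × 1 box and p = 0.5, this yields a box of area 2 with aspect ratio 2.5, halfway between the original shape and a square.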
  • Slide 15
  • Outline: Introduction; Activity-based register clustering; Activity-based net weighting; Experiments; Conclusions
  • Slide 16
  • Pros and Cons of Register Clustering. Pro: effectively reduces the capacitance of the leaf-level clock tree. Con: increases the length of some signal nets, which can cancel out the clock power reduction.
  • Slide 17
  • Activity-Based Net Weighting. Goal: reduce the capacitance of signal nets by assigning larger weights to signal nets with higher switching rates. Combining register clustering with activity-based net weighting further reduces the total net switching power.
  • Slide 18
  • Activity-Based Net Weighting. Assign larger weights to nets with higher switching rates. T: threshold for selecting high-activity nets. MSSR: maximum signal net switching rate. W: controls the range of the power weights.
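The weighting formula itself did not survive in this transcript, so the following is only a sketch of one function that fits the three parameters named on the slide. The assumed form gives weight 1 to nets at or below the threshold T and grows linearly to 1 + W at the maximum signal switching rate MSSR; the function name `power_weight` and this exact shape are assumptions, not the authors' formula.

```python
def power_weight(switching_rate, t, mssr, w):
    """Assumed activity-based net weight.

    switching_rate: switching rate of the net
    t:    threshold T for selecting high-activity nets
    mssr: maximum signal net switching rate MSSR
    w:    W, controls the range of the power weights
    """
    if switching_rate <= t or mssr <= t:
        return 1.0  # low-activity nets keep the default weight
    rate = min(switching_rate, mssr)
    # Linear ramp from 1 at the threshold to 1 + w at the maximum rate.
    return 1.0 + w * (rate - t) / (mssr - t)


example_weights = [power_weight(sr, t=0.2, mssr=1.0, w=4.0) for sr in (0.1, 0.6, 1.0)]
```

Under this assumed form, a larger W widens the gap between high-activity and low-activity nets, and a larger T restricts the extra weight to fewer nets.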
  • Slide 19
  • Compatibility with Timing Weights. The net weight is a linear combination of the power weight and the timing weight. The power ratio (0 to 1) controls the share of the power weight and serves as the knob for trading off timing against power.
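A linear combination controlled by a power ratio between 0 and 1 can be sketched as a convex combination of the two weights. The slide does not show the exact expression, so the form below and the name `combined_weight` are assumptions.

```python
def combined_weight(power_w, timing_w, power_ratio):
    """Blend a power-driven and a timing-driven net weight.

    power_ratio = 0 uses only the timing weight; power_ratio = 1 uses only
    the power weight. The convex-combination form is an assumption.
    """
    if not 0.0 <= power_ratio <= 1.0:
        raise ValueError("power_ratio must be in [0, 1]")
    return power_ratio * power_w + (1.0 - power_ratio) * timing_w


# Hypothetical weights for one net, blended at a power ratio of 0.8.
example_combined = combined_weight(power_w=3.0, timing_w=1.0, power_ratio=0.8)
```

Sweeping `power_ratio` from 0 to 1 moves the placer's objective from purely timing-driven toward purely power-driven weighting.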
  • Slide 20
  • Outline: Introduction; Activity-based register clustering; Activity-based net weighting; Experiments; Conclusions
  • Slide 21
  • Experimental Setup. Implemented on Synopsys IC Compiler. Eight industry circuits: 20k to 186k cells and 2.3k to 44.2k registers; clock power is 32% of total power and net switching power is 39% of total power. Power-aware placement uses a shrink ratio and a power ratio of around 0.8.
  • Slide 22
  • Experimental Flow. Commercial IC implementation flow: Place → CTS → Route → Extract RC → STA → Power Analysis. Power analysis (IC Compiler): switching rates of primary inputs are specified; net switching rates are estimated by probabilistic simulation.
  • Slide 23
  • Clock Net Switching Power 11.2%
  • Slide 24
  • Total Net Switching Power 25.4%
  • Slide 25
  • Results
  • Slide 26
  • Summary. Reduction: clock net switching power 11.3% (1.6% to 34.5%); total net switching power 25.3% (10.5% to 47.1%); total power 11.4% (6.5% to 18.8%); clock WL 10.1%; clock skew: random. Impact: WNS (worst negative slack) 2.0%; total cell area 1.2%; runtime 11.5%.
  • Slide 27
  • Power-Timing Trade-Off with Power Ratio
  • Slide 28
  • Power-Timing Trade-Off with Shrink Ratio
  • Slide 29
  • Conclusions. We have presented a power-aware placement method that performs activity-based net weighting and register clustering to reduce the capacitance of high-activity signal and clock nets. We evaluated the method on eight real designs through a complete industrial physical design flow. Our approach achieved average reductions of 25.3% in net switching power and 11.4% in total power, with degradations of 2.0% in timing, 1.2% in total cell area, and 11.5% in runtime.
  • Slide 30
  • Thank You!