A Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction
Kwangsoo Han, Andrew B. Kahng, Jongpil Lee, Jiajia Li and Siddhartha Nath
VLSI CAD Laboratory, UC San Diego




Outline

Motivation
Related Work
Our Optimization Framework
Experimental Setup and Results
Conclusions


Motivation

Many signoff PVT corners in modern SoCs
Clock skew variation across corners
– "Ping-pong" effect: fixing timing issues at one corner leads to timing violations at others
Our goal: minimize clock skew variation

Corner            Launch latency  Capture latency  Skew
SS, 0.7V, -25°C   1.0             1.1              -0.1
FF, 1.1V, -25°C   0.9             0.7              +0.2

Low voltage: gate delay dominates
High voltage: wire delay dominates
Skew reversal
Power/area overheads

[Figure: launch path and capture path around a datapath, annotated with the clock latencies from the table; skew = -0.1/+0.2 across the two corners.]


Related Work

Skew minimization at multiple corners
– [Cho05] perform temperature-aware skew reduction based on an improved DME
– [Lung10] minimize the worst clock skew across corners with delay correlation factors

Skew variation minimization across corners
– [Restle01] propose a two-level non-tree structure, in which a mesh is applied at the bottom level
– [Su01] use a mesh for the top level of the clock network
– [Rajaram04] insert crosslinks in a clock tree to minimize skew variation

Our work: a systematic optimization framework for minimizing clock skew variation in a clock tree


Skew Variation Reduction Problem

Clock skew between sink pair (i, j) at corner C: difference between delays from r to sinks i and j at corner C
Skew variation between corner pair (C, C’)
Maximum skew variation for sink pair (i, j)
Skew variation reduction problem: given a routed clock tree, minimize the sum over all sink pairs of maximum skew variation

[Figure: clock tree with root r and sinks i, j, showing Skew_{i,j} at corner C and at corner C’, and the maximum taken over the corner pairs formed from C, C’ and C’’.]
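
The slide's formulas were not preserved, so the following is only a minimal sketch of the objective as it is described in words. It assumes that the skew variation between two corners is the plain absolute difference of the two skews (no per-corner normalization), which may differ from the authors' exact definition. The function names and the corner labels are hypothetical; the latencies are the launch/capture values from the Motivation slide.

```python
from itertools import combinations


def skew(latency, corner, i, j):
    """Skew between sinks i and j at one corner: latency(i) - latency(j)."""
    return latency[corner][i] - latency[corner][j]


def max_skew_variation(latency, i, j):
    """Largest skew change for sink pair (i, j) over all corner pairs.

    Assumption: variation between two corners is the absolute difference
    of the two skews. Needs at least two corners.
    """
    return max(
        abs(skew(latency, c1, i, j) - skew(latency, c2, i, j))
        for c1, c2 in combinations(latency, 2)
    )


def total_skew_variation(latency):
    """Objective: sum over all sink pairs of the maximum skew variation."""
    sinks = list(next(iter(latency.values())))
    return sum(
        max_skew_variation(latency, i, j) for i, j in combinations(sinks, 2)
    )


# Clock latencies per corner and sink, from the Motivation slide.
latency = {
    "SS_0.7V": {"launch": 1.0, "capture": 1.1},
    "FF_1.1V": {"launch": 0.9, "capture": 0.7},
}

print(round(total_skew_variation(latency), 3))  # prints 0.3
```

With a single sink pair and two corners, the objective is just the gap between the two skews: |-0.1 - (+0.2)| = 0.3.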


Our Optimization Framework

Incremental optimization of a CTS solution
Perform both global and local optimization
– Global optimization uses LP to determine delta delays on arcs
– Local optimization performs iterative local moves

[Figure: three views of the clock tree (root, last-stage buffers, sinks): original routed clock tree, after global optimization (with a target buffer marked), and after local optimization.]

Flow:
Routed clock tree database
-> Global Optimization (buffer insertion/removal, routing detour)
-> Local Optimization (local moves, e.g., sizing/displacement)
-> Optimized database


Global Optimization: LP

Formulate a linear program to minimize skew variation
– Determine the delta delay on each arc at each corner
– Use LUTs to insert/remove buffers and detour wires
Discreteness of buffer delays
ECO feasibility is important

LP objective and constraints, numbered as on the slide:
(1) Objective: minimize the number of ECO changes, expressed through the delta delay of arc k at corner C
(2) Maximum skew variation bounded by U; sweep U for the solution with minimum skew variation
(3) Ensure no skew degradation
(4) Maximum clock latency constraint on the clock latency to sink i at corner C
(5) Minimum delay without wire detour, in terms of the arc delay
(6) Range of delay ratio from LUTs
(1), (5) and (6) together improve ECO feasibility
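
The LP itself did not survive extraction, so what follows is only a hedged sketch of one linear program that is consistent with annotations (1) to (6); it should not be read as the authors' exact formulation. Every symbol except U is an assumption introduced here: \Delta_k^C is the delta delay of arc k at corner C, D_k^C the current arc delay, L_i^C the current latency to sink i, \hat{L}_i^C = L_i^C + \sum_{k \in \mathrm{path}(i)} \Delta_k^C the latency after the change, s_{i,j}^C = L_i^C - L_j^C and \hat{s}_{i,j}^C = \hat{L}_i^C - \hat{L}_j^C the skews before and after, V_{i,j} an auxiliary variable for the maximum skew variation of pair (i, j), and \beta_{\min}, \beta_{\max} the delay-ratio bounds read from the LUTs.

```latex
\begin{align}
\min_{\Delta}\quad
  & \sum_{C}\sum_{k}\bigl|\Delta_k^{C}\bigr| \tag{1}\\
\text{s.t.}\quad
  & \sum_{(i,j)} V_{i,j} \le U,\qquad
    V_{i,j} \ge \bigl|\hat{s}_{i,j}^{C}-\hat{s}_{i,j}^{C'}\bigr|
    \quad \forall\,(C,C') \tag{2}\\
  & \bigl|\hat{s}_{i,j}^{C}\bigr| \le \bigl|s_{i,j}^{C}\bigr| \tag{3}\\
  & \hat{L}_i^{C} \le L_{\max}^{C} \tag{4}\\
  & D_k^{C}+\Delta_k^{C} \ge D_{k,\min}^{C} \tag{5}\\
  & \beta_{\min}^{C,C'}\bigl(D_k^{C'}+\Delta_k^{C'}\bigr)
    \le D_k^{C}+\Delta_k^{C}
    \le \beta_{\max}^{C,C'}\bigl(D_k^{C'}+\Delta_k^{C'}\bigr) \tag{6}
\end{align}
```

Each absolute value can be linearized in the usual way (an auxiliary variable bounded below by the term and by its negation), and (6) is written in multiplied-out form so that the ratio bound stays linear. Sweeping U then amounts to solving this program for a sequence of bounds and keeping the solution with the smallest skew variation.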




Local Optimization: Moves

Iterative local moves to minimize skew variation
Three types of local moves
1. Displacement {N, S, E, W, NE, NW, SE, SW} by 10μm x one-step sizing
2. Displacement by 10μm x one-step sizing on child buffer
3. Reassign to a new driver (i) at the same level, (ii) within bounding box of 50μm x 50μm

[Figure: illustrations of move types (1), (2) and (3); the 10μm displacement is marked in (1) and (2).]

Each move is expensive (= legalization, ECO routing, RC extraction, STA)
Each buffer has ~100 candidate moves
Which move is the best? Our solution: learning-based model
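
As a rough illustration of how the candidate set for one buffer could be enumerated, here is a minimal sketch for move types 1 and 3. Several details are assumptions not stated on the slide: "one-step sizing" is taken to mean one step up or down in an ordered size list (or keeping the size), a diagonal move shifts 10μm along each axis, and the 50μm x 50μm box is centred on the buffer. The function names and the driver record layout are hypothetical.

```python
from itertools import product

# Unit steps for the eight compass directions named on the slide.
DIRECTIONS = {
    "N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0),
    "NE": (1, 1), "NW": (-1, 1), "SE": (1, -1), "SW": (-1, -1),
}
STEP_UM = 10  # displacement per move, from the slide


def type1_moves(x, y, size_index, num_sizes):
    """Enumerate (direction, new_x, new_y, new_size_index) candidates.

    Assumption: sizing changes the index into an ordered size list by
    -1, 0 or +1; out-of-range sizes are skipped.
    """
    moves = []
    for (name, (dx, dy)), dsize in product(DIRECTIONS.items(), (-1, 0, 1)):
        new_size = size_index + dsize
        if 0 <= new_size < num_sizes:
            moves.append((name, x + dx * STEP_UM, y + dy * STEP_UM, new_size))
    return moves


def type3_drivers(buffer_xy, level, drivers, box_um=50):
    """Candidate new drivers: same tree level, inside a box around the buffer.

    Assumption: the bounding box is centred on the buffer location.
    """
    bx, by = buffer_xy
    half = box_um / 2
    return [
        d["name"]
        for d in drivers
        if d["level"] == level
        and abs(d["x"] - bx) <= half
        and abs(d["y"] - by) <= half
    ]


# Hypothetical example data.
drivers = [
    {"name": "d0", "level": 3, "x": 110, "y": 95},
    {"name": "d1", "level": 3, "x": 180, "y": 100},  # outside the box
    {"name": "d2", "level": 2, "x": 105, "y": 100},  # different level
]
print(len(type1_moves(100, 100, 2, 5)))        # prints 24
print(type3_drivers((100, 100), 3, drivers))   # prints ['d0']
```

Under these assumptions a buffer in the middle of the size range gets 8 x 3 = 24 type-1 candidates; type-2 moves would repeat the same enumeration once per child buffer.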


Machine Learning-Based Model

Predict driver-to-fanout latency change due to local moves

Flow:
Local move
-> Analytical models (Routing: FLUTE, STST; Cell delay: Liberty LUTs; Wire delay: Elmore, D2M)
-> Delta delays
-> Learning-based model
-> Delta delays

[Figure: % of buffers identified to have the best move (0% to 100%) vs. #attempts (0 to 12), for Flute+ED, Flute+D2M, STST+ED, STST+D2M and Model.]

Each attempt is a local move
114 buffers
45 candidate moves for each buffer
Learning-based model identifies best moves for more buffers with fewer attempts
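
The comparison in the plot can be read as a ranking metric: try moves in the order a model predicts, and count how many attempts it takes to hit the move that is actually best. The sketch below implements that reading; it is an interpretation of the plot's axes, not the authors' evaluation code, and the function names and gain values are hypothetical.

```python
def attempts_to_best(predicted_gain, actual_gain):
    """Moves tried, in predicted order, until the truly best move is hit."""
    order = sorted(predicted_gain, key=predicted_gain.get, reverse=True)
    best = max(actual_gain, key=actual_gain.get)
    return order.index(best) + 1


def fraction_identified(buffers, max_attempts):
    """Share of buffers whose best move is found within max_attempts tries."""
    hits = sum(
        1
        for predicted, actual in buffers
        if attempts_to_best(predicted, actual) <= max_attempts
    )
    return hits / len(buffers)


# Hypothetical data: (predicted gain, actual gain) per candidate move,
# one pair of dictionaries per buffer.
buffers = [
    # Best move m1 is also ranked first by the predictor: 1 attempt.
    ({"m1": 3.0, "m2": 1.0, "m3": 2.0}, {"m1": 2.5, "m2": 0.5, "m3": 2.0}),
    # Best move m3 is ranked second by the predictor: 2 attempts.
    ({"m1": 1.0, "m2": 4.0, "m3": 2.0}, {"m1": 0.2, "m2": 1.0, "m3": 3.0}),
]

print(fraction_identified(buffers, 1))  # prints 0.5
print(fraction_identified(buffers, 2))  # prints 1.0
```

Because each attempt costs a full legalization, ECO routing, RC extraction and STA pass, a predictor whose curve rises faster directly reduces the number of expensive evaluations per buffer.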


Experimental Setup

Technology: foundry 28nm LP
Initial clock tree from Synopsys IC Compiler
Testcases: (a) high-speed application processor, (b) memory controller

[Figure: layouts of the two testcases with clock ports marked; clock nets/cells and sinks are shown in yellow.]

Corners

Corner  Process  Voltage  Temperature  BEOL  Apply to which testcase
C0      SS       0.90V    -25°C        Cmax  (a), (b)
C1      SS       0.75V    -25°C        Cmax  (a), (b)
C2      FF       1.10V    125°C        Cmin  (b)
C3      FF       1.32V    125°C        Cmin  (a)


Experimental Results (1)

Up to 22% reduction on sum of skew variation over all sink pairs
No skew degradation at all corners
Negligible area and power overhead

Testcase  Flow          Variation (ns)  Skew C0 (ps)  Skew C1 (ps)  Skew C2/C3 (ps)  #Cells  Power (mW)  Area (μm²)
(a)       Original      512             214           530           226              2515    0.355       3615
(a)       Global-local  399             175           387           188              2553    0.356       3706
(b)       Original      972             179           192           282              5568    0.865       8556
(b)       Global-local  841             176           192           232              5574    0.866       8557
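
The headline percentage can be re-derived from the Variation and Area columns of the table. The short check below does only that arithmetic; the helper names are hypothetical.

```python
# Variation (ns) and area (μm²) per testcase, copied from the results table.
results = {
    "(a)": {"variation": (512, 399), "area": (3615, 3706)},
    "(b)": {"variation": (972, 841), "area": (8556, 8557)},
}


def reduction_pct(original, optimized):
    """Relative reduction of a metric, in percent."""
    return 100.0 * (original - optimized) / original


def overhead_pct(original, optimized):
    """Relative increase of a metric, in percent."""
    return 100.0 * (optimized - original) / original


for name, row in results.items():
    print(
        name,
        round(reduction_pct(*row["variation"]), 1),
        round(overhead_pct(*row["area"]), 2),
    )
# prints:
# (a) 22.1 2.52
# (b) 13.5 0.01
```

Testcase (a) gives the "up to 22%" figure, testcase (b) improves by about 13.5%, and the area increase stays at roughly 2.5% or below.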


Experimental Results (2)

Figure shows comparison of skew variation on (a)
Our optimization significantly reduces the large skew variation between corner pairs

[Figure: two scatter plots, one for corner pair (C0, C1) and one for corner pair (C0, C3); axes: optimized skew variation (ns) and original skew variation (ns).]


Conclusion and Future Works

First framework to minimize sum of skew variation over all sink pairs in a clock tree
Up to 22% reduction of the sum of skew variation
Future works
– Study resultant power and area benefits
– Model to predict a buffer location for minimum skew over a continuous range of possible locations

Thank You!


Backup Slides


Experimental Results (3)

Figure shows distribution of skew ratios between C0 and C1
Our optimization significantly reduces the variation of skew ratios between corner pairs

[Figure: two histograms of #sink pairs vs. ratio (= skew at C1 / skew at C0), for Original and Global-local. Panel annotations, in order of appearance: μ = 1.34, σ² = 3.21; μ = 2.26, σ² = 2.26.]