
METHODOLOGY FOR HIGH-SPEED CLOCK TREE IMPLEMENTATION IN LARGE CHIPS

Ravinder Rachala, Aaron Grenat, Prashanth Vallur, Christopher Ang

January 31, 2013


ADVANTAGES OF CUSTOM CLOCK DISTRIBUTION

Low skew

Smaller AOCV timing uncertainty compared to full CTS

Custom buffers are more tolerant to OCV, IR drop, supply noise

The plot shows a scenario in which increased skew requires a higher supply voltage to reach the target Fmax. In effect, skew translates into higher power (dynamic and leakage) for a given target frequency.

[Plot: Fmax (y-axis, 1.0e9 to 3.0e9) versus supply voltage (x-axis, 0.75 to 1.15), one curve per clock skew value from 0ps to 100ps in 10ps steps, annotated from "Low Skew" to "High Skew".]
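The skew-versus-voltage trade-off described above can be illustrated with a minimal sketch. It assumes skew is charged directly against the cycle time and that path delay scales with supply voltage according to the alpha-power law. The function names and the values of `v_nom`, `v_th`, `alpha`, the 400 ps path delay and the 2 GHz target are illustrative assumptions, not data from the plot.

```python
def fmax_hz(path_delay_ps: float, skew_ps: float) -> float:
    """Fmax when skew is charged directly against the cycle time."""
    return 1e12 / (path_delay_ps + skew_ps)


def delay_at_voltage(delay_nom_ps, v, v_nom=1.0, v_th=0.3, alpha=1.3):
    """Scale a nominal delay with the alpha-power law: delay ~ V / (V - Vth)^alpha.

    v_nom, v_th and alpha are illustrative values, not process data.
    """
    def factor(x):
        return x / (x - v_th) ** alpha

    return delay_nom_ps * factor(v) / factor(v_nom)


def min_voltage_for_target(target_hz, delay_nom_ps, skew_ps,
                           v_lo=0.75, v_hi=1.15, step=0.005):
    """Lowest voltage on a grid of candidates that meets target_hz, or None."""
    n_steps = int(round((v_hi - v_lo) / step))
    for i in range(n_steps + 1):
        v = v_lo + i * step
        if fmax_hz(delay_at_voltage(delay_nom_ps, v), skew_ps) >= target_hz:
            return v
    return None


if __name__ == "__main__":
    # Assumed example: 400 ps nominal path delay, 2 GHz target.
    for skew in (0, 50, 100):
        v = min_voltage_for_target(2.0e9, delay_nom_ps=400.0, skew_ps=skew)
        print(f"skew={skew:3d} ps -> minimum voltage ~ {v}")
```

Under these assumptions the minimum voltage rises with skew, which is the effect the plot is meant to convey. Because dynamic power grows roughly with the square of voltage, the extra voltage is paid for in power.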

OLD METHODOLOGY – CLOCK SPINE FRIENDLY FPLAN

• A regular, repetitive floorplan like the one shown is conducive to long, thin clock macro structures. Here we build 2 unique types of clock macros and stamp them, so the custom macro effort is relatively small compared to more complex floorplans.

Clock Spine Macros

OLD FLOW - CLOCK SPINE TOPOLOGY IN COMPLEX FLOORPLAN

• In more complex floorplans like the one shown, we would end up needing too many custom clock spine macros, which are resource intensive and hard to converge in time for chip tapeout.

• The traditional clock spine macro style is not scalable for today’s complex chips.

ISSUES WITH OLD METHODOLOGY

Very resource intensive. The increasing number of SOCs on the roadmap makes this even more challenging

Area taken by the clock trees is badly utilized: <10%

The increasing size of the macros (on the order of 20mm) risks non-convergence in the custom macro/IP build flow

Accommodating the clock macros in the floorplan and minimizing the number of unique macros typically consume a lot of resource energy and time

Re-use of clock macros across projects is heavily restricted by even small floorplan changes between projects

Clock Macro

TMAC FLOW : NEW METHODOLOGY

Clock macros are broken down into cells (called TMACs: Tiny-MACros) that are instantiated flat at the IP level

Connection between the TMACs is done in overlay (or RDL - Route Distribution Layers)

TMAC cells

TMAC FLOW : NEW METHODOLOGY – SAMPLE CLOCK SPINE + MESH TOPOLOGY

MH (Horizontal Low-Res Layer)

MV (Vertical Low-Res Layer)

CTS Root buffer or Clock Gater

PRIOR WORK: EXAMPLE

Tile/RLM

Conduit - 1

Vtree - 1

Htree - 8

Total unique clock macros = 10

IP floorplan

(All 8 flavors are delay-matched)

Bad skew

Driving large areas of the design from a corner (i.e., huge cap on the buffer, big current through the wire) causes EM, self-heating issues

Long distribution wire susceptible to ringing/reflections (parasitic inductance)

Tile/RLM

CLOCK COVERAGE IS BETTER IN NEW METHODOLOGY

1 clock spine

All TMAC cells connected in overlay

More clock coverage

Overlay

IP floorplan

TMAC FLOW : NEW METHODOLOGY BENEFITS

Entire distribution is contained in one clock spine

Reduces number of circuit and layout resources

Frees up area between the TMACs for RLMs/Tiles

TMAC library of cells built once per technology node (e.g. GF 28nm), reused across all projects in that process technology

Floorplan changes can be easily accommodated even in late stages of design cycle

Provides more complete and robust clock coverage. Bad skew zones are avoided, reliability concerns minimized

Instance swapping (sizing clock mesh drivers for power and performance optimization) can be done easily based on the clock mesh load

Creates full-custom quality clock spine network with significantly “less” effort
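The instance-swapping benefit listed above can be sketched as a simple selection: for each mesh driver instance, pick the lowest-power library cell able to drive the load it sees. The cell names, load limits and power ranking below are hypothetical placeholders, not the actual TMAC library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriverCell:
    name: str           # hypothetical TMAC library cell name
    max_load_ff: float  # largest load the cell may drive, in fF (assumed)
    power_rank: int     # lower means less power (assumed ordering)


# Hypothetical library; real cell names and limits are not given in the source.
LIBRARY = [
    DriverCell("tmac_drv_x4", 200.0, 1),
    DriverCell("tmac_drv_x8", 400.0, 2),
    DriverCell("tmac_drv_x16", 800.0, 3),
]


def pick_driver(load_ff, library=LIBRARY):
    """Return the lowest-power cell whose load limit covers load_ff."""
    candidates = [c for c in library if c.max_load_ff >= load_ff]
    if not candidates:
        raise ValueError(f"no driver can handle {load_ff} fF")
    return min(candidates, key=lambda c: c.power_rank)


def swap_instances(mesh_loads_ff):
    """Map each driver instance name to the cell chosen for its mesh load."""
    return {inst: pick_driver(load).name for inst, load in mesh_loads_ff.items()}


if __name__ == "__main__":
    print(swap_instances({"drv_0": 150.0, "drv_1": 650.0}))
```

Because the cells are flat instances connected in overlay, a swap of this kind changes only the cell reference at an existing location. That is presumably why it is described as easy.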

GRID CAP OPTIMIZATION, SDF FOR SKEW ANNOTATION

Clock grid optimization techniques - reduced clock metal capacitance (by ~45%)

– Classic clock mesh pruning methods like on-demand-grid

– Pushing VIA stack into the MPCTS (Multi-Point CTS) buffer.

Providing clock arrival times at each MPCTS entry point on the mesh (SDF file) for full-chip timing flow

CLK (M2)

Standard MPCTS buffer cell. The auto-router builds the connection from the ‘CLK’ pin to the ‘MH’ clock grid route.

Clock mesh (MH Layer)

New MPCTS buffer cell. Connection from M2 pin to MH layer is built into the cell. Pin is elevated to MH layer. New cell is the same size as standard cell.

CLK (M2)

CLK (MH)

Clock mesh (MH Layer)

All of this route cap is saved. Skew from circuitous route is avoided.
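The clock arrival times at each MPCTS entry point, mentioned above as the input to the full-chip timing flow, reduce to a skew figure and per-point offsets. The sketch below shows that reduction only. The entry-point names and times are made up, and it does not attempt to write actual SDF syntax.

```python
def skew_report(arrivals_ps):
    """Summarize clock arrival times.

    arrivals_ps maps an entry-point name to its clock arrival time in ps.
    Returns the global skew (latest minus earliest arrival) and each
    entry point's offset from the earliest arrival.
    """
    if not arrivals_ps:
        raise ValueError("no arrival times given")
    earliest = min(arrivals_ps.values())
    latest = max(arrivals_ps.values())
    offsets = {name: t - earliest for name, t in arrivals_ps.items()}
    return {"skew_ps": latest - earliest, "offsets_ps": offsets}


if __name__ == "__main__":
    # Hypothetical entry points and arrival times.
    report = skew_report({"mpcts_a": 120.0, "mpcts_b": 125.5, "mpcts_c": 118.0})
    print(report["skew_ps"], report["offsets_ps"])
```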

TMAC METHODOLOGY : FLOW CHART

Import IP/SOC floorplan (DEF or GDS) into Cadence Virtuoso layout XL

Draw full clock spine in Cadence Virtuoso XL (schematic, layout)

Export entire clock spine layout to a DEF file (using internal flow)

Merge clock spine DEF with other overlay DEF (top layer power grid + clock mesh etc.) – First Encounter

Push down clock design (distribution + mesh/grid) into floorplan views for RLM/tiles to see for CTS buffer placement etc.

Extract clock routes (StarRCXT) at IP/SOC top level and run timing using PrimeTime.

CUSTOM DESIGN DATA TO DEF CONVERSION FLOW CHART

def writer

gdsii, cdl

annotated gdsii file

cross reference

internal database

data processing tools

component cell list

CLOCK GRID INSERTION AND SDF GENERATION: FLOW CHART

Draw clock mesh/grid routes in FE (Spec from clock circuit team – route width, space, shielding)

Push down the mesh into the tiles. CTS buffer placement flow is run. Tiles close placement, routing and timing.

All tile DEFs are exported for full clock mesh extraction and spice simulation flow (CES)

A top-level script prunes the MH route completely and re-inserts the shortest possible segment to connect each CTS entry buffer to the nearest MV layer route

Run CES flow. Skew (clock arrival times – SDF file) is reported to full-chip timing flow. Here clock routes are analyzed for EM pass/fail criteria as well.

Extract clock distribution routes at IP/SOC level and run full-chip STA timing (PrimeTime).
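The pruning step in the flow above (remove the MH route, then re-insert only the shortest segment from each CTS entry buffer to the nearest MV route) can be sketched geometrically. The sketch assumes MV routes are vertical stripes at known x positions and that each buffer sits on a horizontal MH track at its own y. The function names and coordinates are hypothetical, and overlapping stubs on the same track are not merged.

```python
from bisect import bisect_left


def nearest_mv(x, mv_xs_sorted):
    """Return the MV stripe x position closest to x (list must be sorted)."""
    i = bisect_left(mv_xs_sorted, x)
    neighbours = mv_xs_sorted[max(i - 1, 0): i + 1]
    return min(neighbours, key=lambda m: abs(m - x))


def prune_mh(buffers, mv_xs):
    """Replace full MH tracks by one stub per CTS entry buffer.

    buffers is a list of (name, x, y); the result is a list of
    (name, y, x_start, x_end) MH segments reaching the nearest MV stripe.
    """
    mv_sorted = sorted(mv_xs)
    if not mv_sorted:
        raise ValueError("no MV stripes given")
    stubs = []
    for name, x, y in buffers:
        mv = nearest_mv(x, mv_sorted)
        stubs.append((name, y, min(x, mv), max(x, mv)))
    return stubs


def total_mh_length(stubs):
    """Total MH wire length left after pruning."""
    return sum(x_end - x_start for _, _, x_start, x_end in stubs)


if __name__ == "__main__":
    # Hypothetical buffer locations and MV stripe positions (same length unit).
    stubs = prune_mh([("b0", 30, 10), ("b1", 160, 10), ("b2", 250, 40)],
                     [200, 0, 100])
    print(stubs, total_mh_length(stubs))
```

Comparing `total_mh_length` with the length of the unpruned tracks gives a rough view of how much MH metal, and hence capacitance, such pruning removes.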

BENEFITS PROVEN IN RECENT AMD SOCS

Less resource needs

– 32nm SOI APU Graphics IP: 7 clocks. ~30 clock macros. 4 circuit and 4 layout resources

– 28nm APU Graphics IP: 9 clocks. 1 clock spine DEF. 1.5 circuit and 1 layout resource

Area savings

– 32nm SOI APU Graphics IP area: 98 mm2, clock macro area: 1.21 mm2 (1.23%)

– 28nm APU Graphics IP area: 131 mm2, clock macro area: 0.18 mm2 (0.12%)
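The area figures above can be checked with a short calculation. From the rounded areas quoted, the 32nm case reproduces 1.23%, while the 28nm case comes out near 0.14% rather than the quoted 0.12%. The difference may simply reflect rounding of the quoted areas. The function name is an illustrative choice.

```python
def clock_area_fraction(ip_area_mm2, clock_area_mm2):
    """Fraction of the IP area occupied by clock macros."""
    return clock_area_mm2 / ip_area_mm2


if __name__ == "__main__":
    old = clock_area_fraction(98.0, 1.21)   # 32nm SOI APU Graphics IP
    new = clock_area_fraction(131.0, 0.18)  # 28nm APU Graphics IP
    print(f"32nm: {old:.2%}  28nm: {new:.2%}")
    print(f"absolute clock macro area reduced by {1.21 / 0.18:.1f}x")
```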

Floorplan flexibility

– With the new methodology (TMAC flow), high-speed clock distribution can be designed to fit into any floorplan.

– E.g.: We were able to deliver clock distribution design to a server SOC in ¼ the time it takes in the old clock spine macro flow.

Reuse across projects

– The TMAC library (clock buffer cells etc.) developed for a technology process is being leveraged for multiple APU projects.

Thank You

Trademark Attribution

AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. Other names used in this presentation are for identification purposes only and may be trademarks of their respective owners.
