
RAPID Standard Cell Library Evaluation

by David Artz & Cory Krug

Oracle Labs
November 2011


Introduction

• Standard Cell (Logic, ECO, Power) Library Evaluation Comparison Criteria
– Performance
– Power
– Layout Architecture (routability, area, power rails, tapless, etc.)
– Features
– Drive Strengths
– Supported Views (CCS/NLDM, APL, DFT, etc.)
– Documentation, Support


<Vendor B> vs. <Vendor A> Standard Cell Comparisons

• Two libraries were compared, <Vendor B> and <Vendor A>. Both vendors offer what they call a general purpose (or low power) variant built around a 9 track layout architecture and a high performance (12 track) variant.

• Both vendors supply the typical mix of combinational cells (simple & complex), sequential cells (latches/flops), I/Os, ECO cells, power management cells (header/footer switches, level shifters, isolation cells), etc. This is where the similarities end.

• <Vendor A> has a much richer library in terms of drive strengths, beta ratios, and device lengths, whereas <Vendor B> offers only a Vt mix.


<Vendor B> vs. <Vendor A> Richness (Vt, Leff, General Purpose and High Performance)

[Table: availability of the Standard Cell Libraries, Power Management Kit, and ECO Kit for <Vendor B> and <Vendor A>, by Vt (Low, Std, Hi) and Leff (40, 50), across the High Density (SC9, SC9MC) and High Performance (SC12, SC12MC) libraries]


<Vendor B> vs. <Vendor A> Richness (Functionality & Drive Strengths/Beta Ratios)

• For the purpose of comparison the standard cell libraries are categorized as follows:

Standard Cells
    Clocking
    Datapath
    Combinational
        Complex
        Simple
    Physical
    Storage
        Latch
        Flip-Flop
        Register File
        Scannable Flip-Flop
    Special
    Power
        Voltage Island Switches
        Level Shifter/Isolation Cells
        Storage
    ECO


<Vendor B> vs. <Vendor A> Richness (Functionality & Drive Strengths/Beta Ratios, Cont.)

9 track standard cell library comparison
(6/20 indicates 6 functions, with 20 drive strengths total)

Category                              <Vendor A>  <Vendor B>  Descriptions
Clocking                              3/180       8/213       Special cells for clock distribution, e.g., balanced nand, clock gates, etc.
Datapath                              11/126      17/126      Full and half adders, booth encoders, etc.
Combinational: Complex                46/1197     52/699      And-Or-Inverts, inverted input(s) simple combinationals, etc.
Combinational: Simple                 19/942      21/489      Inv, buff, nand, nor, xor, etc.
Physical Design                       4/45        8/87        Antenna tie downs, decaps, filler cells
Storage: Latch                        12/162      16/252
Storage: Flip-Flop                    9/114       24/222      D flip-flops
Storage: Register File                4/36        0/0
Storage: Scannable                    18/222      28/375      Scannable versions of flip-flops
Special                               1/24        2/45        Bus holder, delay cells
Low Power Design: Power Island Gates  6/48        6/20        Header/footer switches
Low Power Design: Interface Logic     10/114      4/48        Level shifters, isolation cells
Low Power Design: Retention Storage   24/303      18/150      Retention flops, etc.
Total                                 167/3513    204/2726
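
The totals of the 9 track comparison (167/3513 for <Vendor A>, 204/2726 for <Vendor B>) can be turned into relative differences with a few lines of arithmetic. The sketch below is only an illustration; the helper name `percent_more` is hypothetical and not part of the evaluation.

```python
# Totals of the 9 track comparison, as (functions, drive strengths).
totals = {
    "<Vendor A>": (167, 3513),
    "<Vendor B>": (204, 2726),
}


def percent_more(larger: float, smaller: float) -> float:
    """Return how much larger `larger` is than `smaller`, in percent."""
    return (larger / smaller - 1.0) * 100.0


funcs_a, drives_a = totals["<Vendor A>"]
funcs_b, drives_b = totals["<Vendor B>"]

# Relative difference in function count (<Vendor B> over <Vendor A>).
more_functions_b = percent_more(funcs_b, funcs_a)
# Relative difference in drive-strength count (<Vendor A> over <Vendor B>).
more_drives_a = percent_more(drives_a, drives_b)
# Average number of drive strengths offered per function.
drives_per_function_a = drives_a / funcs_a
drives_per_function_b = drives_b / funcs_b

print(f"<Vendor B> functions over <Vendor A>:       +{more_functions_b:.0f}%")
print(f"<Vendor A> drive strengths over <Vendor B>: +{more_drives_a:.0f}%")
print(f"Drive strengths per function: A={drives_per_function_a:.1f}, "
      f"B={drives_per_function_b:.1f}")
```

The drive-strengths-per-function figure is an extra view on the same totals: it shows how finely each function is sized on average, independent of how many functions a library lists.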


<Vendor B> vs. <Vendor A> Richness (Functionality & Drive Strengths/Beta Ratios, Cont.)

• The previous chart shows 22% more functions in the <Vendor B> library than in <Vendor A>. This can be misleading: in my opinion many of these functions are of little use (e.g., many dubiously useful flavors of scannable flip-flops with both Q and QN outputs, etc.).

• Despite the richer feature set of the <Vendor B> library, the <Vendor A> library has 29% more drive strengths, which consist of differing beta ratios and finer drive granularity.

• Beta Ratios: device P/N sizing to adjust timing arc performance, e.g.
– Max finger size which fits in the cell
– Minimize the average delay
– Equalize the delays
– Equalize the output slews (e.g., on clock cells)
– Minimize the maximum delay
– Minimize the delay for rising output

• <Vendor A> libraries use all of the above beta ratios (where they make sense). <Vendor B> uses only minimize(max(tplh, tphl)).

• The finer granularity and sizings allow optimization approaches to better fine-tune for power, performance, and area goals.

• The “multi-channel” libraries from <Vendor A> afford even more optimization opportunities for improving leakage and minimizing process variation (especially important in clocking).
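
To make the beta ratio objectives above concrete, here is a minimal sketch using a toy first-order model. Everything in it is an assumption for illustration, not vendor data or either vendor's sizing method: a fixed total device width is split between PMOS and NMOS, delay is taken as inversely proportional to the driving device's width, and the constants `K_P` and `K_N` are hypothetical. The "max finger size" and "equalize output slews" objectives are not modeled.

```python
# Toy first-order model (an assumption for illustration, not vendor data):
# a fixed total device width is split between PMOS and NMOS, beta = Wp / Wn.
K_P = 2.0          # hypothetical relative PMOS resistance per unit width
K_N = 1.0          # hypothetical relative NMOS resistance per unit width
TOTAL_WIDTH = 1.0  # normalized Wp + Wn available in the cell


def delays(beta):
    """Return (t_plh, t_phl) in arbitrary units for a given beta ratio."""
    w_n = TOTAL_WIDTH / (1.0 + beta)
    w_p = TOTAL_WIDTH - w_n
    t_plh = K_P / w_p  # rising output is driven by the PMOS pull-up
    t_phl = K_N / w_n  # falling output is driven by the NMOS pull-down
    return t_plh, t_phl


# Each objective maps (t_plh, t_phl) to a cost that should be minimized.
objectives = {
    "minimize average delay": lambda r, f: (r + f) / 2.0,
    "equalize delays": lambda r, f: abs(r - f),
    "minimize maximum delay": lambda r, f: max(r, f),
    "minimize rising-output delay": lambda r, f: r,
}

betas = [0.5 + 0.01 * i for i in range(351)]  # candidate ratios 0.5 .. 4.0


def best_beta_for(cost):
    """Grid-search the candidate beta ratios for the lowest cost."""
    return min(betas, key=lambda b: cost(*delays(b)))


best_beta = {name: best_beta_for(cost) for name, cost in objectives.items()}

for name, beta in best_beta.items():
    t_plh, t_phl = delays(beta)
    print(f"{name:30s} beta={beta:.2f}  t_plh={t_plh:.2f}  t_phl={t_phl:.2f}")
```

Even in this simplified model the objectives pick different ratios: minimizing the average delay favors a smaller beta than equalizing the delays, and minimizing the rising-output delay pushes beta to the top of the allowed range. Here "equalize the delays" and "minimize the maximum delay" happen to coincide; in real cells with more timing arcs they may not.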


<Vendor B> vs. <Vendor A> Documentation

• <Vendor A> documentation is more readable and concise.
– Truth tables show “don’t care” conditions rather than explicitly listing out all permutations of input/output states.
– Detailed descriptions are given of the operating conditions and constraints over which the cells were characterized (e.g., surrounding dummy metal included at representative densities, etc.).
– BKMs on routability, power strapping, etc. within commercial tools are documented.
– Gate level schematic diagrams are included, not just a cell symbol.


<Vendor B> vs. <Vendor A> Layout

• <Vendor B> cell pitch is 0.14 um while <Vendor A> is 0.18 um.
• Power rails in <Vendor A> are M2 while <Vendor B> is classical M1. In my experience, M2 rails afford better IR drop robustness with little, if any, impact on routability.
• All library offerings come with a standard tech.lef defining the BEOL for various stackups.
• Both libraries are tapless, allowing for back biasing to reduce leakage.


<Vendor B> vs. <Vendor A> Views

• Both vendors offer all typical library views (schematic symbols, place & route LEF, Verilog, pre & post SPICE decks, DFT, etc.).

• <Vendor A> has some pre-compiled views (e.g., Milkyway), whereas <Vendor B> does not.

• Timing & Power Views
– Synopsys .lib files in NLDM and CCS are supplied. <Vendor B> libraries elicit warnings when checked with the semantic checker; <Vendor A>’s do not.
– The number of indices in the NLDM tables is the same, but interestingly <Vendor A> characterizes over a much broader range than <Vendor B> (e.g., on small inverters, a 50% wider range of input slews and a 280% wider range of loads).
– <Vendor B> appears to characterize more robustly for power than <Vendor A>, e.g., internal nodal currents are captured for header/footer switches.
– APL (Apache Power Libraries) views are supposed to be available from both vendors (note: it appears <Vendor A> only offers APL for the 12 track libraries).
– When comparing the closest matching cells across libraries (functions, drive strength, PVT, input slew rate, output loading, etc.), the <Vendor B> cells appear to be on average 3% faster than <Vendor A>’s. I feel this is more of a characterization discrepancy than an actual difference in performance (what conditions did <Vendor B> assume in the neighborhood used for characterizing these cells, was the input an ideal voltage source or a properly shaped waveform, was the output a passive cap or another representative DUT, etc.).
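
The characterization range matters because NLDM delay tables are indexed by input slew and output load, and values between index points are commonly obtained by interpolation, while operating points outside the indices have to be extrapolated. The following is a minimal sketch of a bilinear lookup, assuming a made-up 3x3 table; the index values, delays, and function names are hypothetical and do not come from either vendor's library.

```python
from bisect import bisect_right

# Hypothetical 3x3 NLDM-style delay table (all values made up for illustration).
slew_index = [0.01, 0.10, 0.50]     # input transition, ns
load_index = [0.001, 0.010, 0.050]  # output load, pF
delay_table = [                     # delay in ns, one row per slew_index entry
    [0.020, 0.045, 0.150],
    [0.035, 0.060, 0.170],
    [0.090, 0.120, 0.240],
]


def bracket(index, x):
    """Return the position of the lower grid point of the segment used for x."""
    i = bisect_right(index, x) - 1
    # Clamp so that points outside the table reuse the nearest edge segment.
    return min(max(i, 0), len(index) - 2)


def in_range(slew, load):
    """True if the operating point lies inside the characterized range."""
    return (slew_index[0] <= slew <= slew_index[-1]
            and load_index[0] <= load <= load_index[-1])


def lookup_delay(slew, load):
    """Bilinearly interpolate the table; outside the range this extrapolates."""
    i = bracket(slew_index, slew)
    j = bracket(load_index, load)
    u = (slew - slew_index[i]) / (slew_index[i + 1] - slew_index[i])
    v = (load - load_index[j]) / (load_index[j + 1] - load_index[j])
    d00, d01 = delay_table[i][j], delay_table[i][j + 1]
    d10, d11 = delay_table[i + 1][j], delay_table[i + 1][j + 1]
    return ((1 - u) * (1 - v) * d00 + (1 - u) * v * d01
            + u * (1 - v) * d10 + u * v * d11)


for slew, load in [(0.055, 0.0055), (0.80, 0.080)]:
    status = "interpolated" if in_range(slew, load) else "extrapolated"
    print(f"slew={slew} ns, load={load} pF -> "
          f"{lookup_delay(slew, load):.4f} ns ({status})")
```

With the same number of indices, a wider characterized range means fewer operating points fall into the extrapolated case, at the cost of coarser spacing between index points.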


<Vendor B> vs. <Vendor A> Misc. Observations

• <Vendor B> offers thick gate oxide decoupling caps; <Vendor A> appears not to. Thicker oxides reduce leakage. (Note: I’m dubious whether any of the decaps have the frequency response to supply instantaneous current at our higher frequency goals.)
• The power saving library from <Vendor B> (i.e., header/footer switches) appears to have more functionality in that it affords a pre-trickle charge phase signal before the final charge phase, thus supplying out-of-the-box finer control over the ramp time of voltage islands.
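
The benefit of a trickle charge phase can be illustrated with a first-order RC model of a voltage island powering up through its switches. This is a rough sketch under stated assumptions: the supply voltage, island capacitance, switch resistances, and handover threshold below are all hypothetical values, and the model ignores load current and switch non-linearity.

```python
import math

# All values are hypothetical and chosen only to illustrate the trend.
VDD = 0.9         # supply voltage, V
C_ISLAND = 1e-9   # total capacitance of the voltage island, F
R_MAIN = 0.5      # combined on-resistance of the main switches, ohm
R_TRICKLE = 20.0  # combined on-resistance of the weak trickle switches, ohm
HANDOVER = 0.9    # fraction of VDD reached before the main switches turn on


def single_phase_peak():
    """Peak inrush current when the main switches close on a discharged island."""
    return VDD / R_MAIN


def two_phase():
    """Return (peak current, trickle phase duration) for trickle-then-main power-up."""
    trickle_peak = VDD / R_TRICKLE
    # Main switches only see the remaining voltage difference at handover.
    main_peak = VDD * (1.0 - HANDOVER) / R_MAIN
    # Time for an RC charge to reach the handover fraction of VDD.
    trickle_time = -R_TRICKLE * C_ISLAND * math.log(1.0 - HANDOVER)
    return max(trickle_peak, main_peak), trickle_time


peak_single = single_phase_peak()
peak_two_phase, trickle_time = two_phase()

print(f"single phase peak current: {peak_single:.2f} A")
print(f"two phase peak current:    {peak_two_phase:.2f} A")
print(f"trickle phase duration:    {trickle_time * 1e9:.1f} ns")
```

In this model the two-phase sequence trades a longer ramp time for a lower peak inrush current, and the handover threshold and trickle switch strength are the knobs that set that trade-off.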


<Vendor B> vs. <Vendor A> Synthesis Observations

[Figure: four synthesis plots of Area (um^2) versus Speed (MHz), each comparing TSMC and ARM: 32-bit multiply, 16-bit multiply, 32-bit add, and 32-bit decode]


<Vendor B> vs. <Vendor A> Recommendation

• <Vendor A> offers a superior library in terms of performance, functionality, power, and integration. We saw no area penalty despite the difference in cell pitch. This was shown through a systematic comparison of individual library elements as well as on synthesized representative blocks, where the <Vendor A> implementation (with all things being equal, e.g., wire load model, constraints, etc.) on average outperformed <Vendor B> by 3%-5%.

• The 9 track <Vendor A> library gives us good power savings and reasonable performance that should meet RAPID targets. The high performance (12 track) library could be used in functional units requiring higher performance (at the cost of power and area).