43
-1- UC San Diego / VLSI CAD Laboratory On Potential Design Impacts of Electromigration Awareness Andrew B. Kahng, Siddhartha Nath and Tajana S. Rosing VLSI CAD LABORATORY, UC San Diego

On Potential Design Impacts of Electromigration Awareness

  • Upload
    tareq

  • View
    46

  • Download
    1

Embed Size (px)

DESCRIPTION

On Potential Design Impacts of Electromigration Awareness. Andrew B. Kahng, Siddhartha Nath and Tajana S. Rosing VLSI CAD LABORATORY, UC San Diego. Outline. Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power - PowerPoint PPT Presentation

Citation preview

Page 1: On Potential Design Impacts of  Electromigration  Awareness

-1-UC San Diego / VLSI CAD Laboratory

On Potential Design Impacts of Electromigration

AwarenessAndrew B. Kahng, Siddhartha Nath

and Tajana S. RosingVLSI CAD LABORATORY, UC San Diego

Page 2: On Potential Design Impacts of  Electromigration  Awareness

-2-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. Fmax

Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions

Page 3: On Potential Design Impacts of  Electromigration  Awareness

-3-

Electromigration in Interconnects Electromigration (EM) is the

gradual displacement of metal atoms in an interconnect

Iavg causes DC EM and affects power delivery networks

Irms causes AC EM and affects clock and logic signals

Page 4: On Potential Design Impacts of  Electromigration  Awareness

-4-

𝑡50=𝐴∗

𝐽𝑛∙𝑒

𝐸𝑎

𝑘𝑇

EM Lifetime

t50 – median time to failure (= loge 2 x MTTF) A* – geometry-dependent constant J – current density in interconnect segment n – constant ( = 2) Ea – activation energy of metal atoms k – Boltzmann’s constant T – temperature of the interconnect

EM degrades interconnect lifetime Black’s Equation calculates lifetime of interconnect

segment due to EM degradation

Page 5: On Potential Design Impacts of  Electromigration  Awareness

-5-

Parameters Affecting EM MTTF

A B Inverse relation; if A increases then B decreases

A B Direct relation; if A increases then B increases

Design parameters

α

Jrms

Temp

Wwire

MTTF

Driver size

Vdd

Fanout

Freq

Runtime parameters

Page 6: On Potential Design Impacts of  Electromigration  Awareness

-6-

Why Is EM Important Now? ITRS 2011 data shows that EM will be a

significant reliability issue Physical design teams trade off

performance and/or resources to meet EM MTTF

What values of MTTF do we really need?– In the US, people replace

Cell phones every 2 years Laptops every 3 – 5 years Servers every 3 – 7 years

Devices can be designed with small EM lifetimes

0

0.51

1.5

2

2.53

Year

Curr

ent

Dens

ity

(MA/

cm2)

10 9 8 7 6 5 4 3 2 10

0.5

1

1.5

2

2.5

MTTF (years)

Curr

ent

Den

sity

(MA

/cm

2)

Jrms

MTTF

Page 7: On Potential Design Impacts of  Electromigration  Awareness

-7-

Examples of EM Guardband

To meet EM MTTF margin at given wire width upper bound– Reduce Jrms reduce driver size slower circuit

To meet EM MTTF margin at given performance requirement– Increase Wwire increase capacitance, dynamic power

α

Jrms

Temp

Wwire

MTTF

Driver size

Vdd

Fanout

Freq

Page 8: On Potential Design Impacts of  Electromigration  Awareness

-8-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. Fmax

Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions

Page 9: On Potential Design Impacts of  Electromigration  Awareness

-9-

To Meet EM Lifetime Requirements Three major categories of prior work EM MTTF modeling

– Black69 (Black’s Equation)– Liew89 (AC lifetime models) – Lu07 and Wu12 (Joule heating)

Architecture changes to mitigate EM– Srinivasan04 (RAMP)– Romanescu08 (core cannibalization)

Synthesis and physical design (PD) techniques to reduce current density violations– Dasgupta96 (limit Jrms violation at synthesis)– Jerke04 (limit Jrms violation at PD)– Lienig03 (post-route Jrms fixes)

Page 10: On Potential Design Impacts of  Electromigration  Awareness

-10-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. Fmax

Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions

Page 11: On Potential Design Impacts of  Electromigration  Awareness

-11-

Key Idea We quantify impact of EM guardband on

performance (Fmax), area and power

[Black’s Equation]

FmaxArea / Power

(Irms,limit)2 = (Irms,default)2 x MTTFdefault/MTTFreduced

Decrease MTTFreduced increase Irms,limit

We study impacts on Fmax, area and power

𝑀𝑇𝑇𝐹=𝐴∗ (𝑊𝐻 )2

𝐼𝑟𝑚𝑠2 ∙𝑒

𝐸𝑎

𝑘𝑇

Page 12: On Potential Design Impacts of  Electromigration  Awareness

-12-

Approach We conduct two studies

1. MTTF vs. Fmax tradeoffs with fixed resource budget

2. MTTF vs. resources tradeoffs with fixed performance requirement

Assumptions–10 years = example default EM MTTF–Six testcases

Report three representative (AES, DMA, JPEG)

Page 13: On Potential Design Impacts of  Electromigration  Awareness

-13-

Key Contributions We are the first to quantify impacts of EM guardband on

performance and resources by using PD flows We introduce EM slack as an accurate measure of potential

performance improvements in different circuits at reduced MTTF requirements – Black’s Equation cannot accurately quantify the impacts of

EM-awareness in circuits We study how tightness vs. looseness of timing constraints

determine area and power trends at reduced MTTF Our study flow/methodology can potentially be used by

– architects and front-end designers to improve performance at no area cost

– physical designers whose levers are conventional SI and EM fixing methods

Page 14: On Potential Design Impacts of  Electromigration  Awareness

-14-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. Fmax

Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions

Page 15: On Potential Design Impacts of  Electromigration  Awareness

-15-

EM SlackWhen EM violations occur

Black’s Equation

Theoretical limit of Irms,net

Basic Concept: EM slack of a net (units: mA)

𝑀𝑇𝑇𝐹=𝐴∗ (𝑊𝐻 )2

𝐼𝑟𝑚𝑠2 ∙𝑒

𝐸 𝑎

𝑘𝑇

𝐼 𝑟𝑚𝑠 ,𝑛𝑒𝑡≤ 𝐼 𝑟𝑚𝑠 ,𝑙𝑖𝑚𝑖𝑡 ∙√ 𝑀𝑇𝑇𝐹 𝑑𝑒𝑓𝑎𝑢𝑙𝑡

𝑀𝑇𝑇𝐹 𝑟𝑒𝑑𝑢𝑐𝑒𝑑

𝐼 𝑟𝑚𝑠 ,𝑛𝑒𝑡=𝐶𝑙𝑜𝑎𝑑 ∙𝑉 𝑑𝑑∙√𝛼 ∙𝐹𝑚𝑎𝑥 ∙( 1𝑡𝑟𝑖𝑠𝑒+1𝑡 𝑓𝑎𝑙𝑙 )>𝐼 𝑟𝑚𝑠 ,𝑙𝑖𝑚𝑖𝑡

𝐸𝑀 𝑠𝑙𝑎𝑐𝑘 ,𝑛𝑒𝑡=𝐼 𝑟𝑚𝑠 ,𝑛𝑒𝑡− 𝐼𝑟𝑚𝑠 , 𝑙𝑖𝑚𝑖𝑡≤ 𝐼 𝑟𝑚𝑠 ,𝑙𝑖𝑚𝑖𝑡 ∙(√ 𝑀𝑇𝑇𝐹 𝑑𝑒𝑓𝑎𝑢𝑙𝑡

𝑀𝑇𝑇𝐹 𝑟𝑒𝑑𝑢𝑐𝑒𝑑−1)

Page 16: On Potential Design Impacts of  Electromigration  Awareness

-16-

Significance of EM Slack Positive EM slack potential for improved Fmax

If EM slack > 0, a part of it can be used to increase Irms,limit by reducing MTTF (from Black’s

Equation), and improve Fmax by using SP&R knobs (e.g., gate

sizing) without causing EM violations

𝐼 𝑟𝑚𝑠 ,𝑛𝑒𝑡=𝐶𝑙𝑜𝑎𝑑 ∙𝑉 𝑑𝑑∙√𝛼 ∙𝐹𝑚𝑎𝑥 ∙( 1𝑡𝑟𝑖𝑠𝑒 +1𝑡 𝑓𝑎𝑙𝑙 )>𝐼 𝑟𝑚𝑠 ,𝑙𝑖𝑚𝑖𝑡

Page 17: On Potential Design Impacts of  Electromigration  Awareness

-17-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. Fmax Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions

Page 18: On Potential Design Impacts of  Electromigration  Awareness

-18-

Study 1: MTTF vs. Fmax Study MTTF vs. Fmax tradeoffs given upper bounds

on area, temperature and #EM violations Setup

– Three testcases: AES, DMA and JPEG– Two technology libraries: TSMC 45GS and 65GPLUS– Upper bounds

temperature = 378 K area = 66% utilization #EM violations = 25

– Synopsys DesignCompiler and Cadence SOC Encounter flows

– Thermal analysis using Hotspot

Page 19: On Potential Design Impacts of  Electromigration  Awareness

-19-

Automated Flow to Determine Fmax

Synthesis (DC)

RTL LIB SDC

NETLIST

Place and Route (SOCE)LEF

MTTFreduced

Tech. LEF

αPI

UTILtar

AR

Check constraints

Met all constraint

s?

Decrease frequency

Y

N

Increase frequency

Frequency tried before?

N

YEXIT

-- timing slack-- peak

temperature-- UTILeff-- #EM violations

Page 20: On Potential Design Impacts of  Electromigration  Awareness

-20-

Automated Flow to Determine Fmax

Synthesis (DC)

RTL LIB SDC

NETLIST

Place and Route (SOCE)LEF

MTTFreduced

Tech. LEF

αPI

UTILtar

AR

Check constraints

Met all constraint

s?

Decrease frequency

Y

N

Increase frequency

Frequency tried before?

N

YEXIT

-- timing slack-- peak

temperature-- UTILeff-- #EM violations

Page 21: On Potential Design Impacts of  Electromigration  Awareness

-21-

Derating LEF We derate current density limits in technology

Library Exchange Format (LEF) file

Reduced lifetime (ratio > 1)

increases Jrms limit

Reduced toggle rate (ratio > 1) increases Jrms

limit

Increased maximum

temperature increases Jrms

limit

 

Page 22: On Potential Design Impacts of  Electromigration  Awareness

-22-

Automated Flow to Determine Fmax

Synthesis (DC)

RTL LIB SDC

NETLIST

Place and Route (SOCE)LEF

MTTFreduced

Tech. LEF

αPI

UTILtar

AR

Check constraints

Met all constraint

s?

Decrease frequency

Y

N

Increase frequency

Frequency tried before?

N

YEXIT

-- timing slack-- peak

temperature-- UTILeff-- #EM violations

Page 23: On Potential Design Impacts of  Electromigration  Awareness

-23-

Binary Search for Fmax Increase frequency by step until some

constraint is violated Perform binary search between the

current F and the last feasible F to find Fmax

Page 24: On Potential Design Impacts of  Electromigration  Awareness

-24-

Automated Flow to Determine Fmax

Synthesis (DC)

RTL LIB SDC

NETLIST

Place and Route (SOCE)LEF

MTTFreduced

Tech. LEF

αPI

UTILtar

AR

Check constraints

Met all constraint

s?

Decrease frequency

Y

N

Increase frequency

Frequencytried

before?

N

YEXIT

-- timing slack-- peak

temperature-- UTILeff-- #EM violations

Page 25: On Potential Design Impacts of  Electromigration  Awareness

-25-

Flow to Fix EM Violations

Group nets depending on the extent of Irms,limit violations

Create nondefault rules (NDR) for each

net group

All EM violations fixed?

Downsize drivers

Timing met?

N

Yreturn Y

Perform ECO route

Decrease fanout

Perform timing analysis

Y

Netlist from Fmax determination

flow

Nreturn N

Page 26: On Potential Design Impacts of  Electromigration  Awareness

-26-

Automated Flow to Determine Fmax

Synthesis (DC)

RTL LIB SDC

NETLIST

Place and Route (SOCE)LEF

MTTFreduced

Tech. LEF

αPI

UTILtar

AR

Check constraints

Met all constraint

s?

Decrease frequency

Y

N

Increase frequency

Frequencytried

before?

N

YEXIT

-- timing slack-- peak

temperature-- UTILeff-- #EM violations

Page 27: On Potential Design Impacts of  Electromigration  Awareness

-27-

Observation 1

10 9 8 7 6 5 4 3 2 10%10%20%30%40%50%60%70%80%90% DMA AES JPEG

% In

crea

se in

Per

form

ance

Fmax scaling is not uniform across designs and at reduced MTTF as suggested by Black’s Equation

Fmax scaling is determined by the EM slack in each design at each MTTF requirement

Large Fmax improvements may be setup artifacts

45nm

Page 28: On Potential Design Impacts of  Electromigration  Awareness

-28-

Observation 2

10 9 8 7 6 5 4 3 2 10%10%20%30%40%50%60%70%80%90%

DMA AES JPEG%

EM

sla

ck

EM slack determines Fmax at fixed resources % of positive EM slack is usable to improve Fmax by

reducing MTTF requirement EM violations in critical paths lead to positive EM slack

10 9 8 7 6 5 4 3 2 10%

20%

40%

60%

80%

100%

120% DMA AES JPEG%

of E

M v

iola

tion

s

EM slack (not timing slack) limits performance scaling due to AC EM

Page 29: On Potential Design Impacts of  Electromigration  Awareness

-29-

10 9 8 7 6 5 4 3 2 175%

80%

85%

90%

95%

100%

105%

110%DMA AES JPEG%

of t

arge

t ut

i-liz

atio

n

Area limits Fmax scaling for MTTF 7 years (DMA) Area upper bounds are violated for MTTF 6 years;

Temperature upper bounds are violated for MTTF 3 years

10 9 8 7 6 5 4 3 2 1

EM AREA

TEM

P

Area and temperature can be dominating constraints at lower MTTF requirements

Observation 3

Page 30: On Potential Design Impacts of  Electromigration  Awareness

-30-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. Fmax

Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions

Page 31: On Potential Design Impacts of  Electromigration  Awareness

-31-

Study 2: MTTF vs. Area, Power Study MTTF vs. area and power tradeoffs at

a fixed performance requirement Setup

– DMA at 2000 MHz (2ps slack after SP&R at 45nm)– AES at 1100 MHz (1.6ps slack after SP&R at 45nm)– JPEG at 850 MHz (93ps slack after SP&R at 45nm)– Two technology libraries: TSMC 45GS and 65GPLUS

Page 32: On Potential Design Impacts of  Electromigration  Awareness

-32-

Observation 4

0 1 2 3 4 5 6 7 8 9 1048750

48800

48850

48900

48950

49000

Are

a (µ

m2)

Large positive timing slack at MTTF = 10 years can lead to smaller area when MTTF requirement is reduced

Large positive timing slack at MTTF = 10 years can lead to smaller power when MTTF requirement is reduced

0 1 2 3 4 5 6 7 8 9 1035

35.2

35.4

35.6

35.8

36

36.2

36.4

36.6

Pow

er (

mW

)

Page 33: On Potential Design Impacts of  Electromigration  Awareness

-33-

Observation 5

0 1 2 3 4 5 6 7 8 9 1013560135801360013620136401366013680137001372013740

Area

(µm

2)

Small positive timing slack at MTTF = 10 years can lead to increase in area as MTTF requirement is reduced

Small positive timing slack at MTTF = 10 years can lead to increase in power as MTTF requirement is reduced

0 1 2 3 4 5 6 7 8 9 1014.5

15

15.5

16

16.5

17Po

wer

(mW

)

Area and power can decrease as MTTF requirement is reduced for designs with loose

timing constraints

Page 34: On Potential Design Impacts of  Electromigration  Awareness

-34-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. Fmax

Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions

Page 35: On Potential Design Impacts of  Electromigration  Awareness

-35-

Conventional EM Fixes and MTTF Study how conventional SI and EM fixing methods

affect area and performance at reduced MTTF requirements.

Setup– Sweep MTTF from 10 years down to 1 year– Apply per-net NDRs, driver downsizing and

fanout reduction fixes– Study using AES, JPEG and DMA testcases– Two technology libraries: TSMC 45GS and

65GPLUS– Insights are very instance-, technology/library-

and flow-specific

Page 36: On Potential Design Impacts of  Electromigration  Awareness

-36-

Observation 6

AES JPEG DMA0%0.5%

1%1.5%

2%2.5%

3% Fmax Area

Fixing EM violations using NDRs can be effective in improving Fmax only till MTTF = 7 years

% increase in Fmax is less than 5% % increase in area is ~2%

Page 37: On Potential Design Impacts of  Electromigration  Awareness

-37-

Observation 7

8 11 14 17 20 23 26 290%0.5%

1%1.5%

2%2.5%

3%3.5% Fmax Area

Fanout

Fanout reductions to fix EM can increase Fmax by 3% at the cost of 1.86% increase in area

Drive downsizing to fix EM can increase Fmax by 2.5% at the cost of 2% increase in area

12 16 20 24 28 320%0.5%

1%1.5%

2%2.5%

3% Fmax Area

Driver sizeNDRs can be more effective knobs to increase Fmax with less increase in area

Page 38: On Potential Design Impacts of  Electromigration  Awareness

-38-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. Fmax

Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions

Page 39: On Potential Design Impacts of  Electromigration  Awareness

-39-

Conclusions We study and quantify potential impacts of improved

EM-awareness in designs through two basic studies Our key observations

– Study 1: Available performance scaling (up to 80%) from MTTF reduction is dependent on EM slack

– Study 2: Area and power can decrease when MTTF is reduced in designs with loose timing constraints

– Additional studies: NDRs can be more effective in increasing performance ~5% at the cost of 2% increase in area for MTTF up to 7 years

Ongoing work – EM reliability requirements in multiple operating modes– Combined impacts of EM and other back end of the line

reliability mechanisms on interconnect lifetime

Page 40: On Potential Design Impacts of  Electromigration  Awareness

-40-

Acknowledgments Work supported by IMPACT, SRC, NSF,

Qualcomm Inc. and NXP Semiconductors

Page 41: On Potential Design Impacts of  Electromigration  Awareness

-41-

Thank You!

Page 42: On Potential Design Impacts of  Electromigration  Awareness

-42-

Backup

Page 43: On Potential Design Impacts of  Electromigration  Awareness

-43-

Hotspot Setup We use Hotspot5.0 calibrated with thermal

package from Qualcomm Inc. We perform two kinds of modeling

– Without heat spread and heat sink when profiling single block of AES, JPEG or DMA (area in µm2)

– With heat spreader and heat sink when profiling 50x50 blocks of AES, JPEG, or DMA in an area of ~5mm2

We get same values of temperature for a single block from both these methods