41st DAC Tuesday Keynote
Giga-scale Integration for Tera-Ops Performance: Opportunities and New Frontiers
Pat Gelsinger, Senior Vice President & CTO
Intel Corporation
June 8, 2004
Why Bother?
[Chart: Litho Cost. Litho tool cost ($K), $1 to $100,000 (log scale), 1960-2010. Source: G. Moore, ISSCC 03]
[Chart: FAB Cost. Fab cost ($M), $1 to $10,000 (log scale), 1960-2010. Source: www.icknowledge.com]
[Chart: Test Capital. Test capital ($) per chip, 1.E-04 to 1.E+03 (log scale), 1980-2010. Based on SIA roadmap]
Scaling dead at 130-nm, says IBM technologist
By Peter Clarke, Silicon Strategies, May 04, 2004 (2:28 PM EDT)
PRAGUE, Czech Republic: The traditional scaling of semiconductor manufacturing processes died somewhere between the 130- and 90-nanometer nodes, Bernie Meyerson, IBM's chief technology officer, told an industry forum.
Why Bother?
[Charts repeated: Litho Cost (G. Moore, ISSCC 03) and FAB Cost (www.icknowledge.com)]
"No exponential is forever, but you can delay forever…" (Gordon Moore)
Believe in the Law
[Chart: $ per MIPS. $/MIPS, 1.E-02 to 1.E+04 (log scale), 1960-2010]
[Chart: $ per Transistor. $/transistor, 1.E-06 to 1.E-01 (log scale), 1960-2010]
Direction For The Future
CMOS Outlook
High Volume Manufacturing    2004  2006  2008  2010  2012  2014  2016  2018
Technology Node (nm)         90    65    45    32    22    16    11    8
Integration Capacity (BT)    2     4     8     16    32    64    128   256
Moore’s Law Is Alive & Well …
However …
CMOS Outlook
High Volume Manufacturing    2004  2006  2008  2010  2012  2014  2016  2018
Technology Node (nm)         90    65    45    32    22    16    11    8
Integration Capacity (BT)    2     4     8     16    32    64    128   256
Delay = CV/I scaling: 0.7, ~0.7, >0.7 (delay scaling will slow down)
Energy/Logic Op scaling: >0.35, >0.5, >0.5 (energy scaling will slow down)
Bulk Planar CMOS: High Probability, then Low Probability
Alternate, 3G etc: Low Probability, then High Probability
Variability: Medium, High, Very High
ILD (K): ~3, <3, reduce slowly towards 2-2.5
RC Delay: 1, 1, 1, 1, 1, 1, 1, 1
Metal Layers: 6-7, 7-8, 8-9, then 0.5 to 1 layer per generation
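The node and capacity rows of the outlook table follow two familiar rules of thumb. The sketch below assumes a 0.7x linear shrink and a 2x capacity increase per two-year generation; those factors, the function name `roadmap`, and the constants are illustrative assumptions, not formulas given on the slide.

```python
# Sketch: rebuild the roadmap rows from two rules of thumb.
# Assumptions (illustrative, not stated as formulas on the slide):
#   - each generation shrinks linear dimensions by ~0.7x
#   - each generation doubles integration capacity
#   - generations arrive every two years, starting from 90 nm / 2 BT in 2004

START_YEAR = 2004
START_NODE_NM = 90.0
START_CAPACITY_BT = 2  # billions of transistors


def roadmap(generations=8, shrink=0.7):
    """Return a list of (year, node_nm, capacity_bt) tuples."""
    rows = []
    for g in range(generations):
        year = START_YEAR + 2 * g
        node_nm = START_NODE_NM * shrink ** g
        capacity_bt = START_CAPACITY_BT * 2 ** g
        rows.append((year, node_nm, capacity_bt))
    return rows


for year, node_nm, capacity_bt in roadmap():
    print(f"{year}: ~{node_nm:.0f} nm, {capacity_bt} BT")
```

The computed feature sizes land close to the rounded node names in the table (65, 45, 32, and so on), and the capacity column matches exactly.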
Guiding Observations
Transistors (and silicon) are free
Power is the only real limiter
Optimizing for frequency AND/OR area may achieve neither
MOS Transistor Scaling
[Diagram: MOS transistor cross-sections before and after scaling. Labels: GATE, SOURCE, DRAIN, BODY, Xj, Tox, D, Leff]
Dimensions scale down by 30%: doubles transistor density
Oxide thickness scales down: faster transistor, higher performance
Vdd & Vt scaling: lower active power
Technology has scaled well, and will continue…
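The link between a 30% dimension shrink and a doubling of density is simple area arithmetic. A minimal sketch, assuming a transistor's footprint scales with the square of its linear dimensions (variable names are illustrative):

```python
# Sketch: why a 30% linear shrink roughly doubles transistor density.
shrink = 0.7                     # each dimension scales to 70% of its old size
area_scale = shrink ** 2         # footprint scales with length x width
density_gain = 1.0 / area_scale  # transistors per unit area

print(f"area per transistor: {area_scale:.2f}x, density: {density_gain:.2f}x")
# prints "area per transistor: 0.49x, density: 2.04x"
```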
Delivering Performance in Power Envelope
[Chart: Mobile, Power Envelope ~20-30W. Relative performance, 130nm vs. 90nm (MobileMark): 17%]
[Chart: Desktop, Power Envelope ~60-90W. Relative performance, 130nm vs. 90nm (Spec 2000): 21%]
[Chart: Server, Power Envelope ~100-130W. Relative performance, 130nm vs. 90nm (Spec 2000): 23%]
Strained Silicon – 90nm+
[Diagram: PMOS and NMOS transistor cross-sections (S, G, D). Labels: Tensile Si3N4 Cap; SiGe S-D creates strain]
10-25% higher ON current
84-97% leakage current reduction
OR 15% active power reduction
Source: Mark Bohr, Intel
Gate & Source-Drain Leakage
Gate Leakage Solutions: High-K + Metal Gate
[Chart: Ioff (na/u), 1 to 10000 (log scale), vs. Temp (C), 30 to 130. Curve labels: 0.25u, 45nm]
[Image: 90nm MOS Transistor. Labels: Gate, 1.2 nm SiO2, Silicon substrate; scale bar 50nm]
New Transistors: Tri-Gate…
[Diagram: Tri-gate transistor. Labels: Gate 1, Gate 2, Gate 3, Source, Drain, WSi, Lg, TSi]
[Image: Tri-gate device. Labels: Source, Drain, Gate]
Improved short-channel effects
Higher ON current for lower SD Leakage
Manufacturing control: research underway
Source: Intel
[Chart: Delay (ps), 1 to 10000 (log scale), x-axis 350 to 65. Series: Clock Period; RC delay of 1mm interconnect. Annotation: Copper Interconnect]
[Chart: Line Cap (Relative), 0 to 1, x-axis 500 to 32. Annotation: Low-K ILD]
Metal Interconnects
[Chart: Line Res (Relative), 1 to 1000 (log scale), x-axis 500 to 32]
[Chart: RC Delay (Relative), 1 to 100 (log scale), x-axis 500 to 32. Annotation: 0.7x Scaled RC Delay]
Interconnect RC Delay
New Challenge: Variations
Static & Dynamic
Random Dopant Fluctuations
[Chart: Mean Number of Dopant Atoms, 10 to 10000 (log scale), vs. Technology Node (nm), 1000 to 32]
[Diagram labels: Uniform, Non-uniform]
Sub-wavelength Lithography Adds Variations
[Chart: micron axis (0.01 to 1) and nm axis (10 to 1000), log scale, 1980-2020. Lithography Wavelength: 365nm, 248nm, 193nm, 13nm EUV. Generation: 180nm, 130nm, 90nm, 65nm, 45nm, 32nm. Annotation: Gap]
Impact of Static Variations
[Chart: Normalized Frequency, 0.9 to 1.4, vs. Normalized Leakage (Isb), 1 to 5, at 130nm. Annotations: 30%, 5X]
Frequency: ~30%
Leakage Power: ~5-10X
Dynamic Variations: Vdd & Temperature
[Chart: Heat Flux (W/cm2), 0 to 250. Results in Vcc variation]
[Chart: Temperature Variation (°C), 40 to 110. Hot spots]
Technology Challenges
Power: Active + Leakage
Interconnects (RC Delay)
Variations
Design Methodology Is Changing…
Active Power Reduction
Multiple Vdd
[Diagram: Slow, Fast, and Slow paths, supplied from Low Supply Voltage and High Supply Voltage]
• Vdd scaling will slow down
• Mimic Vdd scaling with multiple Vdd
• Challenges:
– Interface between low & high Vdd
– Delivery and distribution
Leakage Control
Body Bias: 2-10X Reduction
[Diagram labels: Vdd, Vbp, Vbn, +Ve, -Ve]
Sleep Transistor: 2-1000X Reduction
[Diagram labels: Logic Block]
Stack Effect: 5-10X Reduction
[Diagram labels: Equal Loading]
Adaptive Body Biasing
[Chart: Number of dies vs. Frequency. Regions: too slow, ftarget, too leaky. Bias applied: FBB, RBB]
[Chart: Number of dies vs. Frequency with ABB (FBB, RBB). Axis mark: f]
Adaptive Body Biasing
[Chart: Accepted Die, 0% to 100%, split into Low Frequency Bin and High Frequency Bin, for No BB, ABB, and Within die ABB. Annotations: 100% yield; 97% highest bin]
97% highest freq bin with ABB for within die variability
100% yield with Adaptive Body Biasing
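The per-die decision implied by the ABB charts (forward bias for slow dies, reverse bias for leaky ones) can be sketched as a small selection function. The function name, thresholds, and sample dies below are hypothetical; the slides give no numeric limits.

```python
# Sketch of the adaptive-body-bias (ABB) decision suggested by the charts.
# All names, thresholds, and sample values are hypothetical illustrations.

def choose_body_bias(freq, f_target, leakage, leakage_limit):
    """Return 'FBB', 'RBB', or 'none' for one die."""
    if freq < f_target:
        # Too slow: forward body bias lowers Vt and speeds the die up.
        return "FBB"
    if leakage > leakage_limit:
        # Too leaky: reverse body bias raises Vt and cuts leakage.
        return "RBB"
    return "none"


# Normalized frequency and leakage for three made-up dies.
dies = [
    {"freq": 0.95, "leakage": 0.8},
    {"freq": 1.05, "leakage": 1.0},
    {"freq": 1.20, "leakage": 3.5},
]
for die in dies:
    bias = choose_body_bias(die["freq"], f_target=1.0,
                            leakage=die["leakage"], leakage_limit=2.0)
    print(die, "->", bias)
```

In practice the bias would be chosen from measured silicon data rather than fixed thresholds; the sketch only captures the direction of each adjustment.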
RC Delay Mitigation
[Diagram: one Logic Block at Vdd]
Freq = 1, Vdd = 1, Throughput = 1, Power = 1, Area = 1, Power Den = 1
[Diagram: two Logic Blocks at Vdd/2]
Freq = 0.5, Vdd = 0.5, Throughput = 1, Power = 0.25, Area = 2, Power Den = 0.125
Throughput Oriented Design
RC Delay Tolerant Design
Lower Power And Power Density
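The numbers on this slide can be checked with the usual dynamic-power model. The sketch below assumes power per block is proportional to C x Vdd^2 x f in normalized units and that throughput scales with block count times frequency; the function `scaled_design` is an illustrative helper, not something from the talk.

```python
# Sketch: trade frequency and voltage for parallel blocks at equal throughput.
# Assumption: dynamic power per block ~ C * Vdd^2 * f (normalized units).

def scaled_design(n_blocks, freq, vdd, cap=1.0):
    """Return normalized throughput, power, area, and power density."""
    power = n_blocks * cap * vdd ** 2 * freq
    area = n_blocks
    throughput = n_blocks * freq
    return {
        "throughput": throughput,
        "power": power,
        "area": area,
        "power_density": power / area,
    }


baseline = scaled_design(n_blocks=1, freq=1.0, vdd=1.0)
parallel = scaled_design(n_blocks=2, freq=0.5, vdd=0.5)
print("baseline:", baseline)
print("parallel:", parallel)
```

Two half-speed, half-voltage blocks keep throughput at 1 while power drops to 0.25 and power density to 0.125, matching the slide's figures at the cost of twice the area.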
Variation Tolerant Circuit Design
[Chart: power and target frequency probability, 0 to 2, vs. Transistor size (small to large)]
[Chart: power and target frequency probability, 0 to 2, vs. Low-Vt usage (low to high)]
Higher probability of target frequency with:
1. Larger transistor sizes
2. Higher Low-Vt usage
But with power penalty
µ-architecture Is Also Changing…
Variations and µ-architecture
[Chart: Mean clock frequency, 1.1 to 1.4, vs. # of critical paths, 1 to 25]
[Chart: Number of dies, 0% to 60%, vs. Clock frequency, 0.9 to 1.5, for different # critical paths]
[Chart: # of samples (%), 0% to 40%, vs. Variation (%), -16% to 16%. Series: Delay; Device ION (NMOS, PMOS)]
[Chart: Ratio of delay- to Ion-, 0.0 to 1.0, vs. Logic depth, 16 to 49]
Variation Tolerant µ-architecture
Decrease variability in the design:
1. Deeper logic depth
2. Smaller number of critical paths
[Chart: frequency and target frequency probability, 0 to 1.5, vs. Logic depth (Large to Small)]
[Chart: 0 to 1.5, vs. # uArch critical paths (More to Less)]
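Both effects can be illustrated with a small Monte Carlo experiment. The sketch below assumes purely random, independent gate-delay variation with a 10% sigma, which is an assumption made for illustration; the talk's charts come from measured and simulated silicon data, and `mean_frequency` is a hypothetical helper.

```python
import random
import statistics

# Sketch: how critical-path count and logic depth shape the frequency distribution.
# Assumption: gate delays are independent Gaussians (mean 1.0, sigma 0.1).


def mean_frequency(n_critical_paths, logic_depth, trials=2000,
                   gate_sigma=0.1, seed=0):
    """Return (mean, stdev) of normalized clock frequency over simulated dies."""
    rng = random.Random(seed)
    freqs = []
    for _ in range(trials):
        # Each path delay is the average of `logic_depth` independent gate
        # delays, so the nominal path delay is 1.0 regardless of depth.
        worst_delay = max(
            sum(rng.gauss(1.0, gate_sigma) for _ in range(logic_depth)) / logic_depth
            for _ in range(n_critical_paths)
        )
        # The slowest critical path sets the die's clock frequency.
        freqs.append(1.0 / worst_delay)
    return statistics.mean(freqs), statistics.stdev(freqs)


for paths in (1, 9, 25):
    mean, spread = mean_frequency(paths, logic_depth=16)
    print(f"{paths:2d} critical paths: mean f = {mean:.3f}, spread = {spread:.3f}")

for depth in (16, 49):
    mean, spread = mean_frequency(1, logic_depth=depth)
    print(f"logic depth {depth}: mean f = {mean:.3f}, spread = {spread:.3f}")
```

Under these assumptions, more critical paths pull the mean frequency down because the slowest path dominates, while deeper logic averages out per-gate variation and narrows the spread.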
Implications For CAD
Logic & Circuits
Layout
Test
Probabilistic Design
[Chart: Frequency vs. Leakage Power, Deterministic vs. Probabilistic. Annotations: 10X variation; ~50% total power]
[Chart: Probability vs. Path Delay. Delay varies due to variations in Vdd, Vt, and Temp]
[Chart: # of Paths vs. delay, with Delay Target marked. Deterministic]
[Chart: # of Paths vs. delay, with Delay Target marked. Probabilistic]
Deterministic design techniques inadequate in the future
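The difference between the two views can be sketched with a simple timing-yield estimate. The sketch assumes each path delay is an independent normal variable with a shared sigma; the path values, the sigma, and the helper names are made up for illustration and do not come from the talk.

```python
import math

# Sketch: deterministic pass/fail vs. probabilistic timing yield.
# Assumption: path delays are independent normal variables with a common sigma.


def normal_cdf(x, mean, sigma):
    """P(X <= x) for X ~ Normal(mean, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def timing_yield(path_means, sigma, delay_target):
    """Probability that every path meets the delay target."""
    p = 1.0
    for mean in path_means:
        p *= normal_cdf(delay_target, mean, sigma)
    return p


paths = [0.90, 0.95, 0.98]   # nominal path delays, normalized to the target
delay_target = 1.0

# Deterministic view: every nominal delay is under the target, so the design "passes".
deterministic_ok = all(d <= delay_target for d in paths)

# Probabilistic view: with variation, paths near the target often miss it.
yield_estimate = timing_yield(paths, sigma=0.05, delay_target=delay_target)

print(f"deterministic check passes: {deterministic_ok}")
print(f"estimated timing yield: {yield_estimate:.2f}")
```

A design that passes the deterministic check can still miss its target on a large share of dies once variations in Vdd, Vt, and temperature spread the path delays.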
Shift in Design Paradigm
• Multi-variable design optimization for:
– Yield and bin splits
– Parameter variations
– Active and leakage power
– Performance
Today: Local Optimization, Single Variable
Tomorrow: Global Optimization, Multi-variate
Today’s Freelance Layout
[Layout diagram labels: Vdd, Vss, Ip, Op]
No layout restrictions
Future Transistor Orientation Restrictions
[Layout diagram labels: Vdd, Vss, Ip, Op]
Transistor orientation restricted to improve manufacturing control
Future Transistor Width Quantization
[Layout diagram labels: Vdd, Vss, Ip, Op]
Today’s Unrestricted Routing
Future Metal Restrictions
Today’s Metric: Maximizing Transistor Density
Dense layout causes hot-spots
Tomorrow’s Metric: Optimizing Transistor & Power Density
Balanced Layout
Other Challenges …
Test & Debug
Test Challenges
[Chart: Test Capital. Test capital ($) per chip, 1.E-04 to 1.E+03 (log scale), 1980-2010. Based on SIA roadmap]
Understandable …
[Chart: Test Capital/Transistor. Test capital ($) per transistor, 1.E-08 to 1.E-04 (log scale), 1980-2010. Based on SIA roadmap]
Disturbing …
On Die Test Methodology
ISSCC 2003: 8Gb/s Differential Simultaneous Bidirectional Link with 4mV, 9ps Waveform Capture Diagnostic Capability
[Chart: Voltage (V), -0.25 to 0.25, vs. Time (ps), 0 to 416. Legend: <1E-8, 1E-7, 1E-6, 1E-5, >1E-4]
[Chart: On-Die Scope Waveform. Differential Voltage (V), -0.4 to 0.4, vs. Time (ns), 0.0 to 12.5]
• Move from external to on-die “self testing”
• High-speed test & debug hardware on each die
• Low speed, low cost, interface to external tester
On die debug & test of 8Gb/sec IO interface
Other Challenges …
Mixed-signal Design
System-level Design
Correctness
Multi-clock domains
Resiliency
Business As Usual Is NOT An Option For CAD…
Summary
BELIEVE: CMOS scaling will continue, transistors become free
SHIFT: Deterministic to Probabilistic, Single to Multi
EMBRACE: local to global optimization: power,…