Introduction to CMOS VLSI Design
Scaling
Lecture by Peter Kogge, University of Notre Dame
Fall 2011, 2015, 2018
Modified from presentation by Jay Brockman in 2008
Based on lecture slides by David Harris, Harvey Mudd College
http://www.cmosvlsi.com/coursematerials.html
Scaling Slide 1

Outline
Moore’s Law & ITRS Roadmap
Ideal Scaling
Real World Scaling
2004
Scaling in the World of Multi-core

Moore’s Law
In 1965, Gordon Moore predicted the exponential growth of the number of transistors on an IC
Transistor count doubled
every year since invention
Predicted > 65,000
transistors by 1975!
Growth limited by power
[Moore65]

ITRS
ITRS: International Technology Roadmap for Semiconductors
International group of experts from
– Industry, Research Labs, Academia
Run by the Semiconductor Industry Association (SIA)
Every 3 years, produces a huge document describing projections of how future commercial technology will improve
In years without a full document, the tables are updated

ITRS Feature Size
Feature size used to be the minimum gate length/width
Now based on interconnect pitch, by product category: DRAM, Logic, Flash
And based on what is in widespread use
Also distinguishes between “Drawn” & “Physical” gate length

Scaling
Feature size used to shrink by 30% every 2-3 years
– Transistors became smaller, faster
– Circuits got smaller and faster
– More transistors fit on chip
– Double gain in performance: speed & density
Define scale factor S
– Applied to feature size
– Each step shrinks the feature size by a factor of about 0.7 (= 1/S)
• Corresponds to S = √2
• Called a technology node
[Figure: feature size (µm, log scale from 0.1 to 10) vs. year, 1965 to 2005. Technology nodes shown: 10, 6, 3, 1.5, 1, 0.8, 0.6, 0.35, 0.25, 0.18, 0.13, 0.09 µm]
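The 0.7 shrink per technology node can be sketched numerically. This is a minimal illustration, not part of the original slides; the function name `technology_nodes` and the 10 µm starting point are assumptions chosen for the example.

```python
def technology_nodes(start_um, count, shrink=0.7):
    """Return `count` feature sizes, each `shrink` times the previous one."""
    return [start_um * shrink ** i for i in range(count)]


# Hypothetical starting point of 10 um; each node is ~0.7x the previous one.
nodes = technology_nodes(10.0, 5)
for size in nodes:
    print(f"{size:.2f} um")

# One linear shrink of 0.7 shrinks area by 0.7**2 = 0.49, i.e. roughly 1/2,
# which is why S = 1/0.7 is treated as sqrt(2).
area_ratio = 0.7 ** 2
scale_factor = 1 / 0.7
```

Two successive nodes roughly halve the linear feature size, and a single node roughly halves the area.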

Scaling Effects
Area: if all dimensions shrink by 1/S, area shrinks by 1/S²
– S = √2 => area shrinks by 1/2
Speed or clock rate of a circuit
– Function of how fast a logic gate can change the input voltage of another downstream gate
– Depends on transistor gate capacitance and saturation current
• Smaller W & L reduce capacitance
• W/L and Vdd affect saturation current
Power drawn by a circuit
– Function of power lost charging and discharging a capacitor
– Depends on transistor gate capacitance, Vdd, & clock

CMOS Energy 101
[Figure: CMOS gate with currents Idsp and Idsn driving capacitance C from input Vin; callout: focus on this logic]
Basic Equations:
• Energy lost = ∫ Power dt = ∫ V(t)·I(t) dt = Vdd·∫ I(t) dt
• C = sum of εLW/tox
• Idsat = β(Vgs−Vt)²/2
• Vacross_cap = Q/C, Q = ∫ I(t) dt
• When charging C:
  • Power dissipated in P-type
  • Energy stored in C
• When discharging C:
  • Power dissipated in N-type
Actually there’s more than just gate capacitance
If we define an activity cycle as Vin going from 0 to V and back to 0, then the energy lost in one activity cycle is CV².
If the logic has a clock of f cycles/sec and α activity cycles per clock period, then the power dissipated by this gate is αfCV²
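The power expression αfCV² can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names and the sample values (1 fF, 1 V, 1 GHz, α = 0.1) are assumptions, not figures from the lecture.

```python
def switching_energy(c_farads, vdd):
    """Energy lost in one full activity cycle (charge plus discharge): C*V^2."""
    return c_farads * vdd ** 2


def dynamic_power(alpha, f_hz, c_farads, vdd):
    """Average switching power: alpha * f * C * V^2."""
    return alpha * f_hz * switching_energy(c_farads, vdd)


# Hypothetical gate: 1 fF load, 1 V supply, 1 GHz clock, 10% activity.
p = dynamic_power(alpha=0.1, f_hz=1e9, c_farads=1e-15, vdd=1.0)
print(f"{p * 1e6:.3f} microwatts")

# Halving Vdd cuts power by 4x, because voltage enters squared.
p_half = dynamic_power(alpha=0.1, f_hz=1e9, c_farads=1e-15, vdd=0.5)
```

The quadratic dependence on Vdd is the reason supply voltage matters so much in the scaling discussion that follows.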

CMOS Energy 101
[Figure: Vin and Vg (voltage, 0 to 1.2) vs. time (0 to 20), with the threshold to switch the next logic gate marked; callout: focus on this logic]
• Charging C: dissipate CV²/2 and store CV²/2
• Discharging C: dissipate the CV²/2 from the capacitance
• One clock cycle dissipates C·V²
• Smaller C => less power, faster transition

Constant Field (a.k.a. Dennard) Scaling

The Scaling of the Transistor
What can change?
• Doping levels in diffusion
• Physical dimensions: W, L, tox
• Electrical parameters: Vgs, Vds
[Figure: transistor labeled with W, L, tox, Vgs, Vds]
A key parameter: the electric field across the gate, proportional to Vgs/tox

Scaling Variations
What changes between technology nodes?
Constant Field Scaling (aka Dennard Scaling)
– All dimensions (W, L, tox) scale down by the same S
– Voltage (VDD) scales down by S
– Doping levels change
Constant Voltage Scaling (Today)
– Everything but voltage changes
Lateral Scaling
– Only gate length L shrinks
– Often done as a quick gate shrink (S = 1.05)

Why Is It Called “Constant Field”?
• If V decreases by a factor of 1/S
• And tox decreases by a factor of 1/S
• Then the E field (V/tox) remains constant!
• So mobility & the like remain constant
• And then if L (and W) get shorter by 1/S
• Then capacitance (LW/tox) drops as (1/S)*(1/S)/(1/S) = 1/S
• Then the transistor is faster by a factor of S
• And energy per cycle (CV²) goes down as (1/S)³
[Figure: transistor labeled with W, L, tox, Vgs, Vds]
S = ratio by which feature size (i.e. L) decreases. E.g. S = 2 => L is ½ of its previous value

Translating into Core Parameters
Core Clock ≈ S
Core Capacitance ≈ 1/S
Core Voltage ≈ 1/S
Core Power ≈ Capacitance*Clock*Vdd² ≈ 1/S²
Cores/Constant Area Die ≈ S²
Power Density ≈ Constant
Compute Cycles/Die ≈ S³
Energy/Cycle = Power/Clock ≈ 1/S³
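The core-level consequences of constant field scaling follow from three inputs: clock, capacitance, and voltage. The sketch below derives the remaining quantities from them; the function name and dictionary keys are illustrative assumptions, and all values are relative to the previous node.

```python
def dennard_core_parameters(s):
    """Relative core metrics after one constant-field (Dennard) step of factor s."""
    clock = s                # gates switch s times faster
    capacitance = 1 / s      # LW/tox
    voltage = 1 / s
    core_power = capacitance * clock * voltage ** 2
    core_area = 1 / s ** 2
    cores_per_die = 1 / core_area          # constant-area die
    return {
        "core_power": core_power,
        "cores_per_die": cores_per_die,
        "power_density": core_power / core_area,
        "cycles_per_die": cores_per_die * clock,
        "energy_per_cycle": core_power / clock,
    }


# S = 2 means the feature size is half its previous value.
for name, value in dennard_core_parameters(2.0).items():
    print(f"{name}: {value}")
```

With S = 2 this gives a quarter of the power per core, four times as many cores on the same die, unchanged power density, eight times the compute cycles per die, and one eighth of the energy per cycle.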

Perfect Dennard Trends
[Figure: log-log plot of values relative to a 100 nm core (1.0E-05 to 1.0E+05) vs. feature size (100 nm down to 1 nm). Curves: Core Area, Core Power, Core Clock, Core Power Density, Core Energy/Cycle, Cores/Die, Cycles/Die, Die Power Density]
Assume no change to core design

Device Scaling
Parameter | Sensitivity | Dennard Scaling | Constant Voltage | Lateral Scaling
L: length | | 1/S | 1/S | 1/S
W: width | | 1/S | 1/S | 1
tox: gate oxide thickness | | 1/S | 1/S | 1
VDD: supply voltage | | 1/S | 1 | 1
Vt: threshold voltage | | 1/S | 1 | 1
NA: substrate doping | | S | S | 1
β | W/(L·tox) | S | S | S
Ion: ON current | β(VDD−Vt)² | 1/S | S | S
R: effective resistance | VDD/Ion | 1 | 1/S | 1/S
C: gate capacitance | WL/tox | 1/S | 1/S | 1/S
τ: gate delay | RC | 1/S | 1/S² | 1/S²
f: clock frequency | 1/τ | S | S² | S²
E: switching energy / gate | C·VDD² | 1/S³ | 1/S | 1/S
P: switching power / gate | E·f | 1/S² | S | S
A: area per gate | WL | 1/S² | 1/S² | 1
Switching power density | P/A | 1 | S³ | S
Switching current density | Ion/A | S | S³ | S
Green: “Better”; Yellow: “Neutral”; Shades of Red: “Worse”
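The rows of the device scaling table can be regenerated from the sensitivity column. The sketch below is an illustration under stated assumptions: VDD and Vt are taken to scale by the same factor (as they do in all three regimes), so (VDD−Vt) scales like VDD; area per gate is passed in explicitly because the table lists it as unchanged under lateral scaling; and the function and key names are hypothetical.

```python
def device_scaling(length, width, tox, voltage, area):
    """Relative device metrics given relative L, W, tox, voltage and gate area."""
    beta = width / (length * tox)
    ion = beta * voltage ** 2            # assumes Vt scales with VDD
    resistance = voltage / ion
    capacitance = width * length / tox
    delay = resistance * capacitance
    energy = capacitance * voltage ** 2
    power = energy / delay               # E * f, with f = 1 / delay
    return {
        "Ion": ion, "R": resistance, "C": capacitance, "delay": delay,
        "f": 1 / delay, "E": energy, "P": power,
        "power_density": power / area,
    }


S = 2.0
regimes = {
    "dennard": device_scaling(1 / S, 1 / S, 1 / S, 1 / S, area=1 / S ** 2),
    "constant_voltage": device_scaling(1 / S, 1 / S, 1 / S, 1.0, area=1 / S ** 2),
    "lateral": device_scaling(1 / S, 1.0, 1.0, 1.0, area=1.0),
}
for name, row in regimes.items():
    print(name, row)
```

For S = 2 the switching power density comes out as 1, 8 (= S³) and 2 (= S) for the three regimes, matching the table.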

Interconnect (aka Wires)
Parameter | Sensitivity | Scale Factor
w: width | | 1/S
s: spacing | | 1/S
t: thickness | | 1/S
h: height | | 1/S
Dc: die size | | Dc
Rw: wire resistance / unit length | 1/(w·t) | S²
Cwf: fringing capacitance / unit length | t/s | 1
Cwp: parallel plate capacitance / unit length | w/h | 1
Cw: total wire capacitance / unit length | Cwf + Cwp | 1
twu: unrepeated RC delay / unit length | Rw·Cw | S²
twr: repeated RC delay / unit length | sqrt(RC·Rw·Cw) | sqrt(S)
Crosstalk noise | w/h | 1
Ew: energy per bit / unit length | Cw·VDD² | 1/S²

Parameter | Sensitivity | Local / Semiglobal | Global
l: length | | 1/S | Dc
Unrepeated wire RC delay | l²·twu | 1 | S²·Dc²
Repeated wire delay | l·twr | sqrt(1/S) | Dc·sqrt(S)
Energy per bit | l·Ew | 1/S³ | Dc/S²
More on this in later lecture
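The local versus global wire rows can be reproduced from the per-unit-length entries. This is a rough sketch assuming Dennard scaling of the gate (gate RC delay and VDD both scale as 1/S); the function name and the choice of Dc = 1 for an unchanged die size are assumptions.

```python
import math


def wire_scaling(s, dc=1.0):
    """Relative wire delay and energy for local (length 1/s) and global (length dc) wires."""
    rw = s ** 2                         # resistance per unit length, 1/(w*t)
    cw = 1.0                            # capacitance per unit length is unchanged
    twu = rw * cw                       # unrepeated RC delay per unit length
    twr = math.sqrt((1 / s) * rw * cw)  # repeated: sqrt(gate RC * Rw * Cw)
    ew = cw * (1 / s) ** 2              # energy per bit per unit length, Cw * VDD^2

    def for_length(length):
        return {
            "unrepeated_delay": length ** 2 * twu,
            "repeated_delay": length * twr,
            "energy_per_bit": length * ew,
        }

    return {"local": for_length(1 / s), "global": for_length(dc)}


result = wire_scaling(4.0)
print(result["local"])
print(result["global"])
```

With S = 4 a local unrepeated wire keeps the same delay while a global unrepeated wire becomes 16 times slower, which is the reason global wires become the problem.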

Interconnect Observations
Capacitance per micron is remaining constant
– About 0.2 fF/µm
– Roughly 1/10 of gate capacitance
Local wires are getting faster
– Not quite tracking transistor improvement
– But not a major problem
Global wires are getting slower
– No longer possible to cross chip in one cycle

Scaling in the “Old” Real World

Feature Size Improves
[Figure: feature size (nm, log scale from 1 to 10000) vs. year, 1970 to 2025. Series: MPU M1 1/2 Pitch, DRAM 1/2 Pitch, Flash 1/2 Pitch, Printed Gate, Physical Gate]

Feature Size In Detail
[Figure: feature size (nm, log scale from 1 to 100) vs. year, 2000 to 2025. Series: MPU M1 1/2 Pitch, DRAM 1/2 Pitch, Flash 1/2 Pitch, Printed Gate, Physical Gate]

Leading Edge Fabs Are Doing Better
[Figure: feature size (nm, 0 to 80) vs. year, 2005 to 2030. Series: 2013 ITRS, Xeon Data, Top10, 2006 ITRS, Xeon Line, Top10 Projection]

More Moore
Transistor counts have doubled periodically for the past three decades
[Figure: transistors per chip (millions, log scale from 0.01 to 100000) vs. date, 1972 to 2024. Series: Actual, 2X every 20 months, ITRS Projections. The 1 billion level is marked.]

Transistor Density
[Figure: transistor density (MT/mm², log scale from 0.0001 to 100000) vs. date, 1972 to 2024. Series: Actual, 2X every 24 months, ITRS Projections]
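The trend lines in the two plots above are stated as doubling periods (2X every 20 months for transistors per chip, 2X every 24 months for density). A short sketch converts a doubling period into a yearly growth factor; the function names are illustrative assumptions.

```python
def annual_growth(doubling_months):
    """Growth factor per year for a quantity that doubles every `doubling_months`."""
    return 2 ** (12 / doubling_months)


def growth_over(years, doubling_months):
    """Total growth factor accumulated over `years`."""
    return 2 ** (12 * years / doubling_months)


for months in (20, 24):
    print(f"2X every {months} months -> {annual_growth(months):.3f}x per year, "
          f"{growth_over(10, months):.0f}x per decade")
```

The gap between the two doubling periods compounds: over a decade, a 20-month doubling yields 64X while a 24-month doubling yields 32X.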

Inherent Delay Better Than Dennard
[Figure: inverter delay (ps, log scale from 1 to 100) vs. feature size (nm, 1/2 M1 pitch, log scale from 10 to 1000). Series: ITRS Data, fit Delay = 0.04*Size^1.37]
Dennard predicted delay proportional to F; ITRS data says F^1.37

2004: The End of the World as We Knew It

The World Changed in 2004
[Figure: compound annual growth rate (CAGR, 0.50 to 2.50) vs. date, 1996 to 2012. Series: Rmax (Gflop/s), Total Cores, Ave Cycles/sec per core (MHz), Mem/Core (GB), Ave. Cores/Socket, TC: Total Concurrency (Rmax)]
2004: the watershed point
• Rmax continues at 2X
• While clocks flattened or fell
• Total cores & cores/socket skyrocketed
• As did flops per machine cycle
• While memory per core fell
(Data taken from top 10 supercomputers over last 20 years)

The 2004 Event
S = scale factor from one technology node to the next
Assume we port the same design unchanged
In constant field scaling
– Area goes down by 1/S²
– Vdd goes down by 1/S
– Capacitance goes down by 1/S
– Clock goes up by S
– So power goes down by (1/S)*S*(1/S²) = 1/S²
– And power per unit area = (1/S²)/(1/S²) = constant
– But if area increases, chip power increases
In 2004 Vdd stopped decreasing and we maxed out our ability to cool chips

The Origins of 2004
Entering “Constant Voltage” Scaling
With minimal change in tox
If clock continued to go up by S
– Power/core stays constant: (1/S)*S*(1) = 1
– And power per unit area = (1)/(1/S²) = S²!
Once we max out chip power
– We cannot allow clocks to increase
– We use all remaining tricks to keep power constant
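The difference between the two regimes shows up when the per-node factors are compounded over several technology nodes. A minimal sketch, assuming S = √2 per node and the same unchanged core design; the function names are hypothetical.

```python
def power_trends(s, voltage_scale, clock_scale):
    """Return (relative core power, relative power density) after one node."""
    capacitance = 1 / s
    area = 1 / s ** 2
    power = capacitance * clock_scale * voltage_scale ** 2
    return power, power / area


def density_after(nodes, s, voltage_scale, clock_scale):
    """Power density after several successive nodes, relative to the start."""
    _, per_node = power_trends(s, voltage_scale, clock_scale)
    return per_node ** nodes


s = 2 ** 0.5
constant_field = density_after(4, s, voltage_scale=1 / s, clock_scale=s)
constant_voltage = density_after(4, s, voltage_scale=1.0, clock_scale=s)
print(f"after 4 nodes: constant field {constant_field:.2f}x, "
      f"constant voltage {constant_voltage:.2f}x")
```

Under constant voltage with clocks still rising, power density doubles at every node, so four nodes give roughly 16 times the heat per unit area; this is the pressure that forces clocks to stop increasing.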

Vdd Has Flattened
[Figure: operating voltage Vdd (0 to 5 V) vs. date, 1980 to 2028. Series: Actual MPU, High Performance Logic, Low Power Logic, DRAM. “Now” is marked.]

Real Scaling
(Source: CMOS VLSI Design 4th Ed., 15: Scaling and Economics)
tox scaling has slowed since 65 nm
– Limited by gate tunneling current
– Gates are only about 4 atomic layers thick!
– High-k dielectrics have helped continued scaling of effective oxide thickness
VDD scaling has slowed since 65 nm
– SRAM cell stability at low voltage is challenging
Dennard scaling predicts cost, speed, power all improve
– Below 65 nm, some designers find they must choose just two of the three

Real World Chip Power
[Figure: total die power (watts, log scale from 1 to 1,000) vs. date, 1980 to 2012]

Real World Rise in Power Density
[Figure: power density (watts/cm², log scale from 1 to 1,000) vs. date, 1980 to 2024. Series: Historical, ITRS Projections. Reference levels: 100 W light bulb, hot plate, nuclear reactor]

Die Size Became Constant
[Figure: die size (mm², log scale from 10 to 1,000) vs. date, 1980 to 2024. Series: Actual MPU, ITRS Chip Size at Introduction]

Microprocessor System Clocks
[Figure: clock (MHz, log scale from 1 to 10,000) vs. date, 1972 to 2012]

Supercomputer System Clocks
[Figure: clock (MHz, log scale from 1.E+01 to 1.E+06) vs. date, 1992 to 2016. Series: Heavyweight, Lightweight, Hybrid, Trend: CAGR=1.42. Annotation: FLAT!]

Scaling Time Periods
[Figure: Vdd (volts, 0 to 5) vs. MPU feature size (1/2 M1 pitch, 10000 down to 1). Series: High Performance, Low Power. Regions marked: Constant Voltage, Constant Field, Mostly Constant Voltage]

How Did CV² Improve With Time?
[Figure: feature size (log scale from 10 to 1000) vs. year, 1985 to 2025]
Assume capacitance of a circuit scales as feature size
[Figure: CV² relative to 90 nm high-performance logic (log scale from 0.01 to 1000) vs. date, 1988 to 2024. Series: High Perf Logic, Low Operating Power Logic, Memory Process. Annotations: 330X, 15X]
90 nm picked as breakpoint because that’s when Vdd and thus clocks flattened
[Figure: Vdd (0 to 6) vs. year, 1970 to 2020, with “Today” marked]

Looking Forward: ITRS Projections (from 2013)
http://www.itrs.net/
Year | 2000 | 2005 | 2010 | 2015 | 2020 | 2022 | 2024
Feature Size (nm) | 82.9 | 32.0 | 27.0 | 16.8 | 10.7 | 8.9 | 7.4
Microprocessor MTransistors/sq. cm | 26 | 97 | 564 | 2548 | 8080 | 12840 | 20400
DRAM Gbits per chip | 0.25 | 1 | 2 | 8 | 16 | 32 | 32
Production MPU Chip Size (sq. mm) | 170 | 111 | 99 | 88 | 111 | 140 | 88
Production DRAM Chip Size (sq. mm) | 129 | 88 | 47 | 29 | 37 | 23 | 15
Max ASIC Signal pins/chip | 900 | 2000 | 2400 | 2800 | 3100 | 3420 | 3420
Max On-Chip Clock Rate (GHz) | 1.0 | 5.0 | 5.9 | 4.4 | 5.3 | 5.8 | 6.2
Max Logic Wiring Levels | 7 | 10 | 12 | 13 | 14 | 15 | 15
High Perf MPU Supply Voltage (V) | 1.8 | 1.1 | 1.0 | 0.8 | 0.8 | 0.7 | 0.7
Max Watts/sq. mm of chip | 0.50 | 0.66 | 0.96 | 1.19 | 1.24 | 1.73 | #N/A

ITRS Clock Predictions
[Figure: on-chip clock rate (GHz, log scale from 1 to 100) vs. year, 1995 to 2025. Series by ITRS edition: 1999, 2001, 2004, 2006, 2008, 2010, 2011, 2012, plus TOP500 top clock]

Scaling in the World of Multi-Core Chips

Multi-Core Chips
Place many copies of the same core on a chip, and run parallel programs

Pure Dennard Trends
[Figure: log-log plot of values relative to a 100 nm core (1.0E-05 to 1.0E+05) vs. feature size (100 nm down to 1 nm). Curves: Core Area, Core Power, Core Clock, Core Power Density, Core Energy/Cycle, Cores/Die, Cycles/Die, Die Power Density]

We See Vdd Flattening More Clearly vs. F
[Figure: Vdd (0 to 5) vs. feature size (nm, 1000 down to 1). Series: High Perf, Low Power, DRAM]
Dennard scaling said V varies as F

Relating the Regimes: No Change to Clocks
Parameter | Dennard | Pre 2004 | Post 2004
Voltage scales as | 1/S | 1/S^0.7 | 1/S^0.2
Area | 1/S^2 | 1/S^2 | 1/S^2
Capacitance | 1/S | 1/S | 1/S
Clock | S | S | S
Core Power | 1/S^2 | 1/S^1.4 | 1/S^0.4
Power Density | 1 | S^0.6 | S^1.6
Energy/Cycle | 1/S^3 | 1/S^2.4 | 1/S^1.4
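The three columns differ only in the voltage exponent, so the remaining rows follow by adding exponents of S. The sketch below works with exact fractions to avoid rounding; the function name is an assumption, and it takes capacitance as 1/S, clock as S and area as 1/S^2 in every regime, as the table does.

```python
from fractions import Fraction


def regime_exponents(voltage_exp):
    """Exponents of S when voltage scales as 1/S**voltage_exp."""
    v = Fraction(str(voltage_exp))
    capacitance, clock, area = Fraction(-1), Fraction(1), Fraction(-2)
    core_power = capacitance + clock - 2 * v      # C * f * V^2
    return {
        "core_power": core_power,
        "power_density": core_power - area,       # P / A
        "energy_per_cycle": core_power - clock,   # P / f
    }


for name, v in [("Dennard", "1"), ("Pre 2004", "0.7"), ("Post 2004", "0.2")]:
    exps = regime_exponents(v)
    print(name, {key: float(value) for key, value in exps.items()})
```

A positive exponent means the quantity grows with S; only the Dennard column keeps power density flat (exponent 0).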

Comparative Trends: Pre 2004
[Figure: log-log plot of values relative to a 100 nm core (1.0E-04 to 1.0E+05) vs. feature size (100 nm down to 1 nm). Curves: Core Area, Core Power, Core Clock, Core Power Density, Core Energy/Cycle, Cores/Die, Cycles/Die, Die Power Density]
Vdd scales as F^0.7, not F
Clocks still grow as 1/F

Comparative Trends: Post 2004
[Figure: log-log plot of values relative to a 100 nm core (1.0E-05 to 1.0E+05) vs. feature size (100 nm down to 1 nm). Same curves as above]
Vdd scales as F^0.2, not F
Clocks still grow as 1/F

Comparative Trends
[Figure: log-log plot of values relative to a 100 nm core (1.0E-05 to 1.0E+05) vs. feature size (100 nm down to 1 nm). Same curves as above]
Clocks fixed at 2004 levels
Should We Expect Dennard Scaling of Cores?
Effective Core Area = Die area /# cores
How does # cores affect effective core area?
How many cores may fit on constant area die?
Core itself
– Growth in aggregate cache
– Modifications to microarchitecture
– Short SIMD additions
Multi-core chip
– Routing usually grows faster than linear
– Off chip interfaces may be same

Core Area Not Following “Moore’s Law”
[Figure: area (mm²) per core or per MB of cache (log scale from 1 to 100) vs. feature size (nm, log scale from 10 to 1000). Series and fits: Core Area (mm²) = 1.54*F^.66; Cache Density (mm²/MB) = 0.08*F^1.2]
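The two fitted curves can be evaluated to see how slowly core area shrinks compared with the 1/S² that ideal scaling would give. The pairing of each fit with its series follows the legend order in the plot, and the 100 mm² die used below is a hypothetical value; function names are illustrative.

```python
def core_area_mm2(f_nm):
    """Fitted core area in mm^2 at feature size f_nm (fit: 1.54 * F^0.66)."""
    return 1.54 * f_nm ** 0.66


def cache_area_mm2_per_mb(f_nm):
    """Fitted cache area in mm^2 per MB at feature size f_nm (fit: 0.08 * F^1.2)."""
    return 0.08 * f_nm ** 1.2


def cores_on_die(die_mm2, f_nm):
    """Effective core count: die area divided by fitted core area."""
    return die_mm2 / core_area_mm2(f_nm)


for f_nm in (100.0, 32.0, 10.0):
    print(f"{f_nm:5.0f} nm: core {core_area_mm2(f_nm):6.2f} mm^2, "
          f"cache {cache_area_mm2_per_mb(f_nm):6.2f} mm^2/MB, "
          f"cores on a 100 mm^2 die {cores_on_die(100.0, f_nm):5.1f}")
```

A 10X shrink in feature size reduces the fitted core area by only about 10^0.66, far short of the 100X that a pure geometric shrink would give.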

Looking Forward
Vdd varies as 1/S^0.2 (not as 1/S)
Clock remains constant
Core area varies as 1/S^0.66 (not as 1/S^2)
Parameter | Dennard | Looking Forward
Voltage | 1/S | 1/S^0.2
Area | 1/S^2 | 1/S^0.66
Capacitance | 1/S | 1/S
Clock | S | 1
Core Power | 1/S^2 | 1/S^1.4
Power Density | 1 | 1/S^0.74
Energy/Cycle | 1/S^3 | 1/S^0.74
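The core power and power density entries for the forward-looking case can be reproduced from the three stated assumptions (voltage as 1/S^0.2, constant clock, core area as 1/S^0.66). The sketch below is illustrative; the function name, the default exponents as parameters, and the choice of S = 10 are assumptions.

```python
def looking_forward(s, voltage_exp=0.2, area_exp=0.66):
    """Relative core metrics with constant clock and slowly shrinking core area."""
    capacitance = 1 / s
    voltage = s ** -voltage_exp
    clock = 1.0                           # clock no longer increases
    core_area = s ** -area_exp
    core_power = capacitance * clock * voltage ** 2
    cores_per_die = 1 / core_area         # constant-area die
    return {
        "core_power": core_power,
        "power_density": core_power / core_area,
        "cores_per_die": cores_per_die,
        "cycles_per_die": cores_per_die * clock,
    }


# S = 10 corresponds to, for example, going from 100 nm to 10 nm.
for name, value in looking_forward(10.0).items():
    print(f"{name}: {value:.3f}")
```

With the clock fixed, compute cycles per die grow only as fast as the core count, that is as S^0.66 rather than the S³ of ideal Dennard scaling.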

Comparative Trends
[Figure: log-log plot of values relative to a 100 nm core (1.0E-05 to 1.0E+05) vs. feature size (100 nm down to 1 nm). Curves: Core Area, Core Power, Core Clock, Core Power Density, Core Energy/Cycle, Cores/Die, Cycles/Die, Die Power Density]
Clocks fixed at baseline
Core area scales as F^0.66

Core Energy/Cycle
[Figure: core energy/cycle relative to a 100 nm core (1.0E-05 to 1.0E+00) vs. feature size (100 nm down to 1 nm). Curves: Dennard, Pre 2004, Post 2004, Constant Clock, Area Adjusted. Annotation: 100X]

Compute Cycles/Die
[Figure: compute cycles/die relative to a 100 nm core (1.0E+00 to 1.0E+05) vs. feature size (100 nm down to 1 nm). Curves: Dennard (same for both Pre 2004 and Post 2004), Constant Clock, Area Adjusted. Annotation: 1,000X]

Conclusions
We have seen the end of Dennard Scaling
– No more “faster and faster chips”
Multi-core has taken over processor chips
– Forcing parallel programming for more performance
Power has become the #1 design issue
– With power of interconnect becoming dominant