inst.eecs.berkeley.edu/~ee241b
Borivoje Nikolić
EE241B : Advanced Digital Circuits
Lecture 24 – DTS, Clock
EECS241B L24 CLOCK
The Promise and Pitfalls of Neuromorphic Computers, by Sunny Bains, EE Times, April 22, 2020.
https://www.eetimes.com
Announcements
• Assignment 4 due on Friday.
• Quiz on Tuesday
• Last lecture on Tuesday
• Final on Thursday, April 30
• Project presentations on Monday, May 4
• Reading: Wong, JSSC, 2006
Outline
• Module 5
• Dynamic threshold scaling
• Optimal thresholds and supplies
• Module 6
• Clock generation
5.M Dynamic Threshold Scaling
Dynamic Body Bias (Bulk)
Switches between active and sleep
• Active mode: forward body bias (FBB), 450mV; local VCC tracking
• Idle mode: reverse body bias (RBB), 500mV; triple well needed
• Dual-VT core
[Figure: body-bias switching circuit. Labeled nodes: VCC, VSS, PMOS body, NMOS body, PMOS bias, NMOS bias, VHIGH, VLOW.]
Tschanz, ISSCC’03
Body Bias Layout
[Figure: ALU layout showing the ALU core LBGs and the sleep transistor LBGs.]
Number of ALU core LBGs: 30
Number of sleep transistor LBGs: 10
PMOS device width: 13mm
Area overhead: 8%
Total Active Power Savings (fixed activity factor: 0.05)
[Figure: total power savings (0% to 20%) versus number of consecutive idle cycles (TOFF) and number of consecutive active cycles (TON). Curves: body bias (1.28V; active: FBB, idle: ZBB) and PMOS sleep transistor (1.32V). Annotated maxima: 18% and 8%.]
Reference: 450mV FBB to core with clock gating, 1.28V, 4.05GHz, 75°C
Power savings for TOFF > ~100 idle cycles
Generating Back-Bias
• Tradeoff – speed of charging and discharging well caps
• Often measure VBB indirectly (leakage)
• Challenge: Generating –VSS
• 28nm FDSOI implementation
D. Jacquet, VLSI 2013
Generating Back Bias
• Fast and wide voltage range back-bias in FDSOI
M. Blagojevic, VLSI 2016
Switched capacitors generate negative bias and pump substrate
[Figure: back-bias generator schematic. Level shifters (VDD1V0 to VDD1V8) and negative-bootstrap stages drive the nwell and pwell chargers and dischargers (signals chN, dchN, chP, dchP); a flying capacitor Cfly is switched in two phases (1, 2) between VDD1V8 and GND.]
Supply/Process Compensation
• Able to track ~200mV supply droops and maintain constant frequency (measured by a replica) by back-bias adjustments
5.N Dynamic Threshold Scaling and Variations
Body Biasing and Variations
• Body biasing with a local control loop can be used to lower the impact of process variations
• Used to limit die-to-die and within-die variations
Self-Adjusting Threshold-Voltage Scheme (SATS)
• Older bulk technologies had stronger body effect
Dynamic Frequency Loop in FDSOI
Quelen, ISSCC’18
Techniques Summary (around 130nm node)
[Figure: normalized leakage current Ileak (0 to 100) versus VDD (0 to 1 V), showing the off-transistor load line and the effect of reduced VDD.]
• Sleep transistor: up to ~25x leakage reduction
• Standby supply reduction: ~3-4x leakage reduction
• Reverse bias: ~3x leakage reduction
• Standby supply + reverse bias: ~10x leakage reduction
Power/Energy Optimization Space
Column groups: Constant Throughput/Latency, Variable Throughput/Latency

Energy  | Design Time | Sleep Mode | Run Time
Active  | Logic design, scaled VDD, transistor sizing, multi-VDD | Clock gating | DFS, DVS
Leakage | Stack effects, transistor sizing, scaling VDD, + multi-VTh | Sleep transistors, multi-VDD, variable VTh, + input control | + Variable VTh
5.O Optimal VDD, VTh
Dynamic Voltage Scaled Microprocessor
External VDD: 3.3V ±10%; internal VDDL: 0.8V to 2.9V ±5%
[Figure: TX3900 block diagram (user logic, PLL, blocks labeled VT and VS) and a plot of power dissipation (mW, 0 to 300) versus operating frequency (MHz, 0 to 40), comparing theory with measurement.]
Courtesy: Prof. Kuroda
Adapting VDD and VTH
• Adapting both VDD and VTh during runtime
• VTh is much less sensitive
Miyazaki, ISSCC’02
Adapting VDD and VTH
Miyazaki, ISSCC’02
Optimal VDD, VTh
• Adjusting VDD, VTh trades off energy and delay
• We studied energy-limited design
• And alternate ways of optimizing energy and delay together
• E.g. energy-delay product (EDP)
• Or E^n·D^m, with n, m > 1
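A minimal sketch of how the choice of metric changes which design looks best. The two design points and their normalized energy and delay values are hypothetical, chosen only for illustration; ED² is used as one example of weighting delay more heavily than EDP does.

```python
def energy_delay_metric(energy, delay, n=1, m=1):
    """Return E^n * D^m; n = m = 1 is the energy-delay product (EDP)."""
    return (energy ** n) * (delay ** m)


def best_design(designs, n=1, m=1):
    """Name of the design with the lowest E^n * D^m."""
    return min(designs, key=lambda name: energy_delay_metric(*designs[name], n, m))


# Hypothetical design points: (normalized energy, normalized delay).
DESIGNS = {
    "low_vdd": (0.5, 1.6),   # slower, but half the energy
    "nominal": (1.0, 1.0),
}

if __name__ == "__main__":
    for n, m in [(1, 1), (1, 2)]:
        print(f"E^{n} D^{m}: best design is {best_design(DESIGNS, n, m)}")
```

With these assumed numbers the low-VDD point wins on EDP, while the nominal point wins once delay is squared.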
Optimal EDP Contours
• Plot of EDP curves in VDD, VTh plane
Gonzalez, JSSC 8/97
Sizing, Supply, Threshold Optimization
Reference design: Dref (Vddmax, Vthref)
Topology:     Inverter | Adder | Decoder
(ELk/ESw)ref: 0.1%     | 1%    | 10%
• Technology parameters (Vddmax, Vthref) rarely optimal
• Large variation in optimal circuit parameters Vddopt, Vthopt, wopt
[Figure: optimization region in the (Vdd, Vth) plane, bounded by Vddmin, Vddmax, Vthmin, Vthmax.]
Result: E-D Tradeoff in an Adder
[Figure: energy (Eref) versus delay (Dref), showing the energy-efficient curve f(W, Vdd, Vth), the reference design (Dref, Eref), and the points (Dref, Emin) and (Dmin, Eref); annotations at -80% and -40%.]
• 80% of energy saved without delay penalty
• 40% delay improvement without energy penalty
Sensitivity (W, Vdd, Vth) at each design point:
(Dref, Eref): 1.5, 0.2
(Dref, Emin): 1
(Dmin, Eref): 22, 16, 22
Energy-constrained delay
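• Active power
$$P_{act} = \alpha f C V_{DD}^{2}, \qquad f = \frac{1}{L_D\, t_p}$$
• Leakage power
$$P_{leak} = I_0\, e^{-(V_{Th} - \gamma V_{DD})/S}\, V_{DD}$$
where γ is the drain-induced barrier lowering (DIBL) coefficient
• Eliminate one variable (VTh) and find Pmin(VDD)
Nose, ASP-DAC’00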
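A rough sketch of the "eliminate VTh, then find Pmin(VDD)" procedure at a fixed frequency target. Everything numeric here is an assumption for illustration: the alpha-power-law delay model, the lumped constants `k_delay`, `p_sw_coeff`, `i0`, the swing parameter `s`, and the sweep range are not taken from the slides, and the leakage term ignores any VDD dependence in the exponent.

```python
import math


def required_vth(vdd, k_delay, a=1.3):
    """VTh that just meets the delay target at this VDD.

    Assumed alpha-power law: t_p = k * VDD / (VDD - VTh)**a.
    Setting t_p equal to the target and lumping k / t_target into
    k_delay gives VTh = VDD - (k_delay * VDD)**(1/a).
    """
    return vdd - (k_delay * vdd) ** (1.0 / a)


def total_power(vdd, k_delay, p_sw_coeff=1.0, i0=10.0, s=0.035, a=1.3):
    """Return (P_act + P_leak, VTh) with VTh eliminated by the delay constraint."""
    vth = required_vth(vdd, k_delay, a)
    p_act = p_sw_coeff * vdd ** 2              # stands for alpha * f * C * VDD^2
    p_leak = i0 * math.exp(-vth / s) * vdd     # I0 * exp(-VTh / S) * VDD
    return p_act + p_leak, vth


def find_min_power(k_delay=0.3, lo=0.15, hi=1.0, steps=86):
    """Sweep VDD on a uniform grid and return (P_min, VDD_opt, VTh_opt)."""
    best = None
    for i in range(steps):
        vdd = lo + (hi - lo) * i / (steps - 1)
        power, vth = total_power(vdd, k_delay)
        if best is None or power < best[0]:
            best = (power, vdd, vth)
    return best


if __name__ == "__main__":
    p_min, vdd_opt, vth_opt = find_min_power()
    print(f"P_min = {p_min:.4f} at VDD = {vdd_opt:.2f} V, VTh = {vth_opt:.3f} V")
```

Lowering VDD cuts the switching term but forces a lower VTh to hold the frequency, so leakage rises; the sweep finds where the two effects balance under these assumed constants.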
Minimum energy: ESw = 2ELk
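[Figure: energy per operation relative to the nominal reference, EOp / EOp,ref (0 to 1), versus ELeakage/ESwitching (10^-2 to 10^1) for nominal, parallel, and pipeline designs. Annotated optima: (Vthref - 180mV, 0.81 Vddmax), (Vthref - 95mV, 0.57 Vddmax), (Vthref - 140mV, 0.52 Vddmax).]
$$\left(\frac{E_{Lk}}{E_{Sw}}\right)_{opt} = \frac{2}{\ln\left(\frac{L_d}{\alpha_{avg}}\right) - K}$$
• Optimal designs have high leakage (ELk/ESw ≈ 0.5)
• Large (ELk/ESw)opt
• Flat EOp minimum
• Topology dependent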
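A small sketch that evaluates the optimal leakage-to-switching ratio, assuming the slide's expression has the form (ELk/ESw)opt = 2 / (ln(Ld/αavg) − K). The logic depth, activity, and fitting constant used in the example are hypothetical values picked so the result lands near the 0.5 quoted on the slide.

```python
import math


def optimal_leakage_to_switching_ratio(logic_depth, avg_activity, k_fit):
    """Evaluate 2 / (ln(L_d / alpha_avg) - K).

    logic_depth  : L_d, logic depth of the critical path
    avg_activity : alpha_avg, average switching activity
    k_fit        : K, a technology/topology fitting constant (assumed given)
    """
    denom = math.log(logic_depth / avg_activity) - k_fit
    if denom <= 0:
        raise ValueError("ln(L_d / alpha_avg) must exceed K for this expression")
    return 2.0 / denom


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    for alpha in (0.05, 0.1, 0.5):
        ratio = optimal_leakage_to_switching_ratio(20, alpha, 1.3)
        print(f"alpha_avg = {alpha}: (E_Lk / E_Sw)_opt = {ratio:.2f}")
```

Because the dependence is logarithmic, the optimal ratio moves slowly with logic depth and activity, which fits the flat minimum noted on the slide.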
Subthreshold Optimum
f = 30kHz
Minimum is independent of VT
Calhoun, JSSC 9/05
6. Clocks and Supplies
Clock Subsystem
• Clock Generation
• Clock Distribution
• Synchronization
Clock Subsystem
• Intel Xeon – Bowhill, ISSCC’15
• Independent clocks for 4-18 cores
• Self-biased (SB) and LC PLLs
6.B Clock Generation
Clock Generation
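Delay-Locked Loop (Delay Line Based)
[Figure: the phase detector compares fREF with fO; its U/D outputs drive the charge pump, then the filter, which controls the delay line that produces fO from fREF.]
Phase-Locked Loop (VCO/DCO-Based)
[Figure: the phase detector compares fREF with the ÷N-divided output; its U/D outputs drive the charge pump, then the filter, which controls the VCO that produces fO.]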
PLL Signals
[Figure: waveforms versus time of fRef, fO, the PD output, and the LPF output, for the loop phase detector, charge pump, filter, VCO, with ÷N feedback.]
Loop Performance
• Phase offset, peak-to-peak jitter, RMS jitter
• Bandwidth, locking time, frequency range
[Figure: time-domain view of an ideal clock and a clock with jitter, with a phase histogram marking the phase offset and the worst-case peak-to-peak jitter.]
Phase Detector
• Detects the phase difference
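[Figure: phase detector inputs versus time and output Vout(t); the characteristic of Vout versus phase difference Δφ has slope KPD.]
$$V_{out} = K_{PD} \cdot \Delta\phi$$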
Delay-Locked Loop
• First order loop: inherently stable
• No filtering of input jitter
• Constant frequency (no synthesis)
• No phase error accumulation
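[Figure: DLL block diagram. Phase detector (gain KPD) with U/D outputs, charge pump (current ICP), filter (gain KF), and delay line (gain KDL); input fREF, output fO.]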
DLL Locking
Courtesy of IEEE Press, New York, 2000
Delay-Locked Loop
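[Figure: DLL block diagram. Phase detector (gain KPD) with U/D outputs, charge pump (current ICP), filter (gain KF), and delay line (gain KDL); input fREF, output fO.]
Open loop transfer function
$$D_O(s) = \left(D_I(s) - D_O(s)\right) F_{REF}\, K_{PD}\, I_{CP}\, \frac{1}{sC}\, K_{DL} = \left(D_I(s) - D_O(s)\right) \frac{K_{PD} K_F K_{DL}}{s}$$
Closed loop transfer function
$$H(s) = \frac{D_O(s)}{D_I(s)} = \frac{K_{PD} K_F K_{DL}}{s + K_{PD} K_F K_{DL}}$$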
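A minimal sketch of the first-order DLL response, assuming the closed-loop form H(s) = ωN / (s + ωN) with ωN = KPD·KF·KDL. The gain values in the example are hypothetical and lumped so that their product comes out in rad/s.

```python
import math


def dll_loop_bandwidth(k_pd, k_f, k_dl):
    """Loop bandwidth omega_N = K_PD * K_F * K_DL (rad/s, with lumped units)."""
    return k_pd * k_f * k_dl


def dll_magnitude(omega, omega_n):
    """|H(j*omega)| for the first-order closed loop H(s) = omega_N / (s + omega_N)."""
    return omega_n / math.hypot(omega, omega_n)


def dll_step_response(t, omega_n):
    """Normalized output delay after a unit step in input delay."""
    return 1.0 - math.exp(-omega_n * t)


if __name__ == "__main__":
    # Hypothetical gains, for illustration only.
    omega_n = dll_loop_bandwidth(k_pd=2.0, k_f=5.0e6, k_dl=1.0)
    print(f"omega_N = {omega_n:.3g} rad/s")
    print(f"|H| at omega_N = {dll_magnitude(omega_n, omega_n):.3f}")
    print(f"settled fraction after 3 time constants = {dll_step_response(3 / omega_n, omega_n):.3f}")
```

A single real pole means the response never overshoots, which is the sense in which the first-order loop is inherently stable.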
Delay-Locked Loop
• ωN more than an order of magnitude below FREF
• Use of DLLs requires low-jitter input
• VCDL must span adequate delay range + reset to min delay
• Noise sources:
• Delay line (supply sensitivity)
• Clock buffers that follow
• Device noise (small)
[Figure: |H(s)| versus frequency, with the corner at ωN.]
$$\omega_N = K_{PD} K_F K_{DL}$$
Voltage-Controlled Delay Line
• Delay controlled by voltage with proportionality KDL
DLL Use
6.C Clock Generation: PLLs
Phase-Locked Loop
• PLL is locked when the phase difference is zero
• Second/third order loop
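• ÷N for frequency synthesis (and × M)
• Filters input jitter
• Accumulates phase error
[Figure: PLL block diagram. fIn enters the PD, whose U/D outputs drive the CP, then the filter, then the VCO, which produces fOut; ÷N sits in the feedback path.]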
Voltage-Controlled Oscillator
• Oscillation frequency controlled by voltage
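$$\omega_{out} = \omega_{FR} + K_{VCO} V_{ctrl}$$
[Figure: ωout versus Vctrl, a line with slope KVCO and intercept ωFR.]
$$y_{out}(t) = A \cos\left(\omega_{FR}\, t + K_{VCO} \int^{t} V_{ctrl}\, dt\right)$$
ωFR – free-running frequency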
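A short sketch of how a VCO accumulates phase, assuming the model ωout = ωFR + KVCO·Vctrl and a simple forward-Euler sum in place of the integral. The free-running frequency, gain, time step, and control voltage are hypothetical.

```python
import math


def vco_phase(v_ctrl_samples, dt, omega_fr, k_vco):
    """Phase at each sample time: phi[k+1] = phi[k] + (omega_FR + K_VCO * v[k]) * dt."""
    phase = 0.0
    phases = []
    for v in v_ctrl_samples:
        phases.append(phase)
        phase += (omega_fr + k_vco * v) * dt
    return phases


def vco_output(phases, amplitude=1.0):
    """Output waveform y = A * cos(phase)."""
    return [amplitude * math.cos(p) for p in phases]


if __name__ == "__main__":
    # Hypothetical values: 1 GHz free-running, 100 MHz/V gain, 1 ps step.
    omega_fr = 2 * math.pi * 1.0e9
    k_vco = 2 * math.pi * 1.0e8
    dt = 1.0e-12
    phases = vco_phase([0.5] * 1001, dt, omega_fr, k_vco)
    cycles = phases[-1] / (2 * math.pi)
    print(f"cycles completed in 1 ns with Vctrl = 0.5 V: {cycles:.3f}")
```

A constant control voltage produces a phase that grows as a ramp, which is the integration behind the KVCO/s term in the loop dynamics.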
Example VCO
• Ring-oscillator-based VCO: RC loaded
• Ring-oscillator-based VCO: current-starved
[Figure: the two ring-oscillator schematics, each with control input Vctrl and output Vout.]
PLL vs. DLL Dynamics
• The key difference is in the VCDL vs. VCO transfer characteristics
• VCO integrates (accumulates) phase
HVCO(s) = KVCO/s
Charge Pump
• Push/pull current source operation
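[Figure: charge pump schematic. A current source from VDD switched by UP and a current sink switched by DN share a node that goes to the VCO control input.]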
Charge-Pump PLL
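[Figure: charge-pump PLL. The PFD's UP/DN outputs switch the charge pump (from VDD) onto capacitor CP, which sets the VCO control voltage.]
Phase transfer function
$$H(s) = \frac{\frac{K_{PFD}}{s}\,\frac{K_{VCO}}{s}}{1 + \frac{K_{PFD}}{s}\,\frac{K_{VCO}}{s}} = \frac{K_{PFD} K_{VCO}}{s^{2} + K_{PFD} K_{VCO}}$$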
Charge Pump PLL with a Zero
• Charge pump PLL has a stability problem
• Compensation by adding a zero
[Figure: loop filter with resistor R in series with capacitor CP at the charge pump output.]
Charge Pump PLL with a Zero
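[Figure: charge-pump PLL with PFD, UP/DN charge pump from VDD, loop filter of R in series with CP, and VCO.]
$$H(s) = \frac{\frac{I_P K_{VCO}}{2\pi C_P}\left(R C_P s + 1\right)}{s^{2} + \frac{I_P}{2\pi} K_{VCO} R\, s + \frac{I_P K_{VCO}}{2\pi C_P}}$$
$$\omega_n = \sqrt{\frac{I_P K_{VCO}}{2\pi C_P}}$$
$$\zeta = \frac{R}{2}\sqrt{\frac{I_P C_P K_{VCO}}{2\pi}}$$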
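A small sketch that computes the natural frequency and damping factor of the charge-pump PLL with a zero, assuming the standard second-order expressions ωn = sqrt(IP·KVCO / (2π·CP)) and ζ = (R/2)·sqrt(IP·CP·KVCO / (2π)). The component values are hypothetical.

```python
import math


def pll_natural_frequency_and_damping(i_p, k_vco, c_p, r):
    """Return (omega_n, zeta) for a charge-pump PLL with series R-C_P filter.

    i_p   : charge pump current (A)
    k_vco : VCO gain (rad/s per V)
    c_p   : loop filter capacitor (F)
    r     : series resistor that adds the stabilizing zero (ohm)
    """
    omega_n = math.sqrt(i_p * k_vco / (2 * math.pi * c_p))
    zeta = (r / 2.0) * math.sqrt(i_p * c_p * k_vco / (2 * math.pi))
    return omega_n, zeta


if __name__ == "__main__":
    # Hypothetical values: 100 uA pump, 500 MHz/V VCO, 100 pF capacitor.
    for r in (0.0, 632.0, 1.0e3):
        omega_n, zeta = pll_natural_frequency_and_damping(
            i_p=100e-6, k_vco=2 * math.pi * 500e6, c_p=100e-12, r=r
        )
        print(f"R = {r:6.0f} ohm: omega_n = {omega_n:.3g} rad/s, zeta = {zeta:.3f}")
```

With R = 0 the damping factor is zero, which matches the stability problem of the uncompensated loop; increasing R raises ζ without moving ωn.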
Higher Order Loops
• Another pole naturally exists
• Filters the control voltage VCTRL
• Lowers phase margin
• Reduces the lock range
[Figure: charge-pump PLL with PFD, UP/DN charge pump from VDD, loop filter with R, CP1, and CP2, and VCO.]
Phase Noise at the PLL Input
• Low-pass characteristic
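$$H(s) = \frac{\phi_{out}}{\phi_{in}}(s) = \frac{\omega_n^{2}}{s^{2} + 2\zeta\omega_n s + \omega_n^{2}}$$
[Figure: log |φout/φin| versus frequency; loop model with φin, LPF, VCO, and φout.]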
VCO Phase Noise
• High-pass characteristic
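$$\frac{\phi_{out}}{\phi_{VCO}}(s) = \frac{s^{2}}{s^{2} + 2\zeta\omega_n s + \omega_n^{2}}$$
[Figure: log |φout/φVCO| versus frequency; loop model with φin = 0, LPF, VCO, and the VCO phase noise φVCO added at the output φout.]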
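A brief sketch comparing how the loop shapes input phase noise and VCO phase noise. It assumes second-order forms ωn² / (s² + 2ζωn·s + ωn²) for the input and s² / (s² + 2ζωn·s + ωn²) for the VCO; the values of ωn and ζ are hypothetical.

```python
def _denominator(s, omega_n, zeta):
    """Shared second-order denominator s^2 + 2*zeta*omega_n*s + omega_n^2."""
    return s * s + 2 * zeta * omega_n * s + omega_n ** 2


def input_noise_transfer(omega, omega_n, zeta):
    """phi_out / phi_in at s = j*omega (low-pass)."""
    s = 1j * omega
    return omega_n ** 2 / _denominator(s, omega_n, zeta)


def vco_noise_transfer(omega, omega_n, zeta):
    """phi_out / phi_VCO at s = j*omega (high-pass)."""
    s = 1j * omega
    return (s * s) / _denominator(s, omega_n, zeta)


if __name__ == "__main__":
    omega_n, zeta = 1.0e7, 0.707  # hypothetical loop parameters
    for ratio in (0.01, 1.0, 100.0):
        omega = ratio * omega_n
        h_in = abs(input_noise_transfer(omega, omega_n, zeta))
        h_vco = abs(vco_noise_transfer(omega, omega_n, zeta))
        print(f"omega/omega_n = {ratio:6.2f}: |in| = {h_in:.4f}, |vco| = {h_vco:.4f}")
```

Well below ωn the loop tracks the input and suppresses VCO noise; well above ωn the roles reverse, which is why loop bandwidth trades input jitter filtering against VCO noise rejection.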
Next Lecture
• Finish clocks and supplies
• Finale