EE241 - Spring 2008
Advanced Digital Integrated Circuits
Lecture 26: Multipliers, Latches
Announcements
Homework 5: due today
Wrapping up the class:
Final presentations: May 8, 1-5pm, BWRC
Final reports due May 7
Final exam: Monday, May 12, 3-4:30pm, 241 Cory
Presentations: 12 minutes (max 10 slides) + 3 minutes for Q & A
Agenda
Wrap up multipliers
Latches and flip-flops

Multipliers
Generalized Counters
Stenzel, Trans. on Comp. 10/77
32x32b using (5,5,4) counters, with (3,2) in the last stage
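The (5,5,4) notation can be made concrete with a small behavioral model. The sketch below assumes the usual reading of a generalized counter: five input bits in each of two adjacent columns (weights 1 and 2) and a four-bit binary output. The function names `counter_554` and `counter_32` are made up for this illustration and do not come from the cited paper.

```python
def counter_554(col0, col1):
    """(5,5,4) generalized counter (behavioral model, assumed weighting).

    col0: five bits of weight 1, col1: five bits of weight 2.
    Returns the four output bits, least significant bit first.
    """
    col0, col1 = list(col0), list(col1)
    assert len(col0) == 5 and len(col1) == 5
    assert all(b in (0, 1) for b in col0 + col1)
    total = sum(col0) + 2 * sum(col1)  # at most 5 + 10 = 15, so 4 bits suffice
    return [(total >> i) & 1 for i in range(4)]


def counter_32(a, b, c):
    """(3,2) counter, i.e. a full adder: three bits in, (sum, carry) out."""
    total = a + b + c
    return total & 1, total >> 1
```

The maximum input count of 15 is why two columns of five bits fit exactly in a four-bit output.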
4:2 Counters (Compressors)
4-2 carry-save module
Weinberger, IBM J. Res. Dev. 1/81
Santoro, Horowitz, JSSC 4/89

4:2 Compressors
Built of CSAs
Pipelined version compresses 8 partial products per cycle
Interconnect can be more regular than in Wallace tree
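A 4:2 compressor "built of CSAs" can be sketched as two chained full adders. This is a minimal bit-level model under that assumption, not the transistor-level circuits from the cited papers; the names `full_adder`, `compressor_42` and `check_compressor` are illustrative.

```python
from itertools import product


def full_adder(a, b, c):
    """Carry-save adder cell: returns (sum, carry) for three input bits."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_42(x1, x2, x3, x4, cin):
    """4:2 compressor from two full adders.

    Returns (sum, carry, cout). cout goes to the neighbouring column and
    depends only on x1..x3, so there is no carry ripple along a row.
    """
    s1, cout = full_adder(x1, x2, x3)
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout


def check_compressor():
    """Exhaustively verify x1+x2+x3+x4+cin == sum + 2*(carry + cout)."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_42(*bits)
        assert sum(bits) == s + 2 * (carry + cout)
        # cout must not change when only cin changes
        assert cout == compressor_42(*bits[:4], 0)[2]
    return True
```

Calling `check_compressor()` walks all 32 input combinations and confirms the counting identity.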
Three Dimensional Optimization
Oklobdzija, Villeger, Liu, Trans. on Comp. 3/96
Vertical Slices in TDM
Final Addition
Example: CPL Multiplier
Yano, JSSC 4/90
[Figure: block diagram and critical path]
Example: DPL Multiplier
Ohkubo, JSSC 3/95
[Figure: Booth encoder and partial product generator]
[Figure: FA-based 4:2 and modified 4:2]
Latches and flip-flops

Latches: Reading
Rabaey et al, Chapters 7 and 10
Chapter 10 in Chandrakasan et al, by Partovi
Stojanovic, Oklobdzija, JSSC 4/99
Latch vs. Flip-Flop
Latch: stores data when clock is low
Flip-Flop (register): stores data when clock rises
[Figure: latch and flip-flop symbols with D, Clk and Q waveforms]
Courtesy of IEEE Press, New York. © 2000
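The difference between the two elements can be seen in a cycle-by-cycle behavioral model. The sketch below assumes a positive latch (transparent while the clock is high, holding while it is low) and a rising-edge flip-flop, with both outputs starting at 0; the function name `simulate` and the sampled-waveform representation are choices made for this illustration.

```python
def simulate(clk, d):
    """Return (latch_q, ff_q) for sampled clk and d waveforms of equal length.

    Assumptions: positive latch, rising-edge flip-flop, initial state 0,
    clock low before the first sample.
    """
    latch_q, ff_q = [], []
    lq = fq = 0
    prev_clk = 0
    for c, x in zip(clk, d):
        if c == 1:                    # latch is transparent while clock is high
            lq = x
        if prev_clk == 0 and c == 1:  # flip-flop samples only on the rising edge
            fq = x
        latch_q.append(lq)
        ff_q.append(fq)
        prev_clk = c
    return latch_q, ff_q


clk = [0, 1, 1, 0, 0, 1, 1, 0]
d   = [1, 1, 0, 0, 0, 0, 1, 1]
latch_q, ff_q = simulate(clk, d)
# latch_q == [0, 1, 0, 0, 0, 0, 1, 1]
# ff_q    == [0, 1, 1, 1, 1, 0, 0, 0]
```

The latch output follows every change of D while the clock is high, whereas the flip-flop output changes only at the two rising edges.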
Latch Pair vs. Flip-Flop
Performance metrics
Delay metrics: delay penalty, clock skew penalty, inclusion of logic, inherent race immunity
Power/energy metrics: power/energy, PDP, EDP
Design robustness
Latches
Transmission-Gate Latch, C2MOS Latch
[Figure: transmission-gate latch and C2MOS latch schematics with D, Clk and Q]
Courtesy of IEEE Press, New York. © 2000
Latch Pair as a Flip-Flop
Requirements for the Flip-Flop Design
• High speed of operation:
» Small Clk-Output delay
» Small setup time
» Small hold time → inherent race immunity
• Low power
• Small clock load
• High driving capability
• Integration of logic into flip-flop
• Multiplexed or clock scan
• Robustness
Sources of Noise
Courtesy of IEEE Press, New York. © 2000
Gate Isolation
Courtesy of IEEE Press, New York. © 2000
Flip-Flop Robustness
Robustness of the storage node
Input isolation
Data stored statically, max resistance limit
Min capacitance limit
Preventing storage node exposure
Types of Flip-Flops
Latch Pair (Master-Slave)
Pulse-Triggered Latch
[Figure: latch pair L1, L2 and pulse-triggered latch L, with Data, D, Clk and Q labels]
Flip-Flop Delay
The sum of setup time and Clk-output delay is the true measure of performance with respect to system speed:
$T = T_{Clk-Q} + T_{Logic} + T_{setup}$ (ignoring skew)
[Figure: two flip-flops separated by a logic block, with TClk-Q, TLogic and TSetup marked against the clock]
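The cycle-time relation above is simple arithmetic, which the following sketch makes explicit. The numeric values are hypothetical and chosen only to show the calculation; they are not measurements from the lecture. The function names are also illustrative.

```python
def min_cycle_time(t_clk_q, t_logic, t_setup):
    """T = T_Clk-Q + T_Logic + T_setup, ignoring clock skew (all in ps)."""
    return t_clk_q + t_logic + t_setup


def flip_flop_overhead(t_clk_q, t_logic, t_setup):
    """Fraction of the cycle spent in the flip-flop rather than in logic."""
    return (t_clk_q + t_setup) / min_cycle_time(t_clk_q, t_logic, t_setup)


# Hypothetical numbers for illustration only.
T = min_cycle_time(t_clk_q=150, t_logic=800, t_setup=50)          # 1000 ps
overhead = flip_flop_overhead(t_clk_q=150, t_logic=800, t_setup=50)  # 0.2
```

With these assumed values the flip-flop accounts for a fifth of the cycle, which is why the sum of setup time and Clk-output delay, rather than either term alone, is the figure of merit.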
Delay vs. Setup/Hold Times
[Figure: Clk-Output delay [ps] (0 to 350) vs. Data-Clk [ps] (-200 to 200), with the setup and hold regions and the minimum Data-Output delay marked]
Master-Slave Latch Pairs
Positive setup times
Two clock phases:
» distributed globally
» generated locally
Small penalty in delay for incorporating MUX
Some circuit tricks needed to reduce the overall delay
Master-Slave Latch Pairs
Case 1: PowerPC 603 (Gerosa, JSSC 12/94)
[Figure: master-slave latch schematic with D, Q, Clk and Clkb]
T-G Master-Slave Latch
• Feedback added for static operation
• Unbuffered input
» input capacitance depends on the phase of the clock
» over-shoot and under-shoot with long routes
» wirelength must be restricted at the input
• Clock load is high
• Low power
• Small clk-output delay, but positive setup
Master-Slave Latches
Case 2: C2MOS
[Figure: C2MOS master-slave latch schematic with D, Q, Clk and locally generated Ck/Ckb]
Feedback added for static operation
Locally generated clock
Poor driving capability
Pulse-Triggered Latches
• First stage is a pulse generator
» generates a pulse (glitch) on a rising edge of the clock
• Second stage is a latch
» captures the pulse generated in the first stage
• Pulse generation results in a negative setup time
• Frequently exhibit a soft edge property
• Note: power is always consumed in the pulse generator
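The two-stage structure and the negative setup time can be illustrated with a discrete-time model. The sketch assumes one common pulse-generator form, the AND of the clock with a delayed, inverted copy of itself; the slides do not specify the generator, so treat this as one possible realization. The names `pulse_generator` and `pulsed_latch` are illustrative.

```python
def pulse_generator(clk, delay):
    """Assumed form: pulse[t] = clk[t] AND NOT clk[t - delay].

    Produces a pulse `delay` samples wide on each rising clock edge.
    The clock is assumed low before the first sample.
    """
    pulse = []
    for t, c in enumerate(clk):
        delayed = clk[t - delay] if t >= delay else 0
        pulse.append(c & (1 - delayed))
    return pulse


def pulsed_latch(clk, d, delay):
    """Second stage: a latch that is transparent only during the pulse."""
    q, out = 0, []
    for p, x in zip(pulse_generator(clk, delay), d):
        if p:
            q = x
        out.append(q)
    return out


clk = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
d   = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pulse = pulse_generator(clk, delay=2)   # [0,0,1,1,0,0,0,0,1,1,0,0]
q = pulsed_latch(clk, d, delay=2)       # [0,0,0,1,1,1,1,1,0,0,0,0]
```

In this example D rises one sample after the clock edge and is still captured because the pulse is still open, which is the behavior behind the negative setup time and the soft clock edge. Note also that the pulse is produced on every rising edge regardless of the data, matching the remark about power in the pulse generator.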
Pulsed Latch
Simple pulsed latch
Kozu, ISSCC’96
Intel/HP Itanium 2
Naffziger, ISSCC’02
Pulse-Triggered Latches
Hybrid Latch Flip-Flop, AMD K-6
Partovi, ISSCC’96
[Figure: HLFF schematic with D, Clk and Q]
HLFF Operation
1-0 and 0-1 transitions at the input with 0ps setup time
Hybrid Latch Flip-Flop
Skew absorption
Partovi et al, ISSCC’96
Pulse-Triggered Latches
AMD K-7
Courtesy of IEEE Press, New York. © 2000
Pulse-Triggered Latches
Semi-Dynamic Flip-Flop (SDFF), Sun UltraSparc III, Klass, VLSI Circuits’98
[Figure: SDFF schematic with D, Clk and Q]
Pulse generator is dynamic, cross-coupled latch is added for robustness
Loses soft edge on rising transition
Latch has one transistor less in stack - faster than HLFF, but 1-1 glitch exists
Small penalty for adding logic
Pulse-Triggered Latches
7474, from early 1960’s
[Figure: 7474 schematic with D, Clk, S, R and Q]
Pulse-Triggered Latches
Case 4: Sense-amplifier-based flip-flop, Matsui 1992. DEC Alpha 21264, StrongARM 110
First stage is a sense amplifier, precharged to high, when Clk = 0
After rising edge of the clock sense amplifier generates the pulse on S or R
The pulse is captured in S-R latch
Cross-coupled NAND has different propagation delays of rising and falling edges
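The unequal rising and falling delays of the cross-coupled NAND latch can be seen by stepping the two gates one unit gate delay at a time. This is a simplified unit-delay model that assumes active-low S and R (consistent with a sense amplifier precharged high); the names `nand` and `settle` are illustrative.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)


def settle(s_n, r_n, q, qb, steps=4):
    """Step a cross-coupled NAND S-R latch with active-low inputs.

    Each step is one unit gate delay; both gates evaluate on the values
    from the previous step. Returns the (q, qb) pair after each step.
    """
    trace = []
    for _ in range(steps):
        q, qb = nand(s_n, qb), nand(r_n, q)
        trace.append((q, qb))
    return trace


# Latch holds Q = 0; the sense amplifier pulls S low (set).
trace = settle(s_n=0, r_n=1, q=0, qb=1)
# trace == [(1, 1), (1, 0), (1, 0), (1, 0)]
```

In this model Q rises after one gate delay, but the complementary output falls only after two, because it has to wait for Q to rise first.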
Sense Amplifier-Based Flip-Flop
Courtesy of IEEE Press, New York. © 2000
Sampling Window Comparison
Naffziger, JSSC 11/02
Local Clock Gating
[Figure: ‘clock on demand’ flip-flop with data-transition look-ahead (XNOR), pulse generator, and internal clocks CKI/CKIB derived from CP; transistor sizes annotated]
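The idea behind the ‘clock on demand’ flip-flop can be sketched behaviorally: the internal clock pulse is generated only when the input differs from the stored value. This model is inferred from the figure labels (data-transition look-ahead, XNOR, pulse generator) and is an assumption, not a description of the actual circuit; the name `clock_on_demand` is illustrative.

```python
def clock_on_demand(d_stream, q0=0):
    """Behavioral model with one entry of d_stream per clock cycle.

    The XNOR of D and Q acts as the data-transition look-ahead: when it
    is 1 (no change) the internal pulse is suppressed.
    Returns (q_per_cycle, number_of_internal_pulses).
    """
    q = q0
    pulses = 0
    outputs = []
    for d in d_stream:
        xnor = 1 - (d ^ q)   # 1 when the input equals the stored value
        if not xnor:         # transition detected: fire the pulse generator
            pulses += 1
            q = d
        outputs.append(q)
    return outputs, pulses


outputs, pulses = clock_on_demand([0, 0, 1, 1, 1, 0, 0, 0])
# outputs == [0, 0, 1, 1, 1, 0, 0, 0], pulses == 2
```

Here only two of the eight cycles produce an internal clock pulse, so the saving grows as the input data activity drops.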
Next Lecture
Timing