Skewed Flip-Flop Transformation for Minimizing Leakage in Sequential Circuits
Jun Seomun, Jaehyun Kim, Youngsoo Shin
Dept. of Electrical Engineering, KAIST, Korea
Leakage Power in Technology Scaling
[Figure: dynamic power and leakage power (W, axis 0 to 250) versus technology node (0.25 µm, 0.18 µm, 0.13 µm, 0.10 µm, 0.07 µm). Source: Intel Corporation, 2002.]
Overview of Mixed Vt Technique
Mixed Vt CMOS
– Low Vt: fast but high leakage
– High Vt: low leakage but slow
Value of mixed Vt is limited
– It considers only the combinational portion of circuits
[Figure: initially all gates are low Vt; high Vt gates can then be assigned on some non-critical paths, while the critical path stays low Vt.]
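The mixed Vt idea above can be illustrated with a minimal sketch. It assumes a toy timing model in which each path is just a list of gates and every gate has the same delay and leakage for a given Vt. The delay and leakage numbers, the gate names, and the greedy order are all hypothetical and are not taken from the slides.

```python
# Hypothetical per-gate delay (ps) and leakage (nA); illustrative values only.
LOW = {"delay": 10.0, "leak": 100.0}
HIGH = {"delay": 14.0, "leak": 10.0}


def path_delay(path, vt):
    """Sum the gate delays along one path for the current Vt assignment."""
    return sum((HIGH if vt[g] == "high" else LOW)["delay"] for g in path)


def assign_mixed_vt(gates, paths, period):
    """Greedy sketch: start all low Vt, keep a high Vt swap only if timing holds."""
    vt = {g: "low" for g in gates}  # initially all low Vt
    for g in gates:
        vt[g] = "high"
        if any(path_delay(p, vt) > period for p in paths):
            vt[g] = "low"  # the swap breaks timing, so revert it
    return vt


gates = ["g1", "g2", "g3", "g4"]
paths = [["g1", "g2", "g3"], ["g4"]]  # the first path is critical at 30 ps
vt = assign_mixed_vt(gates, paths, period=30.0)
print(vt)
```

In this toy case only g4, which sits on the non-critical path, ends up with high Vt.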
Motivation
Leakage of sequential elements
– Sequential elements take a large proportion in many controllers
[Figure: flip-flop versus combinational share (0% to 100%) per benchmark (s298, s344, s349, s382, s400, s444, s526, s641, s713, s838, s9234); label: Mixed Vt.]
Why Not High Vt Flip-Flop?
Large effects on the slack
– The delay overhead of high Vt flip-flops is larger than that of high Vt combinational gates
– A flip-flop typically affects more than one of the timing paths in a circuit
[Figure: delay of high Vt gate minus delay of low Vt gate (axis 0 to 25) for F/F, INV, NAND2, NOR2, NAND3, NAND4.]
[Figure: ratio of the average number of fanout timing paths on F/Fs to that on combinational gates (axis 0 to 4) for s298, s344, s349, s400, s444, s526, s641, s713, s838, s9234.]
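As a small worked illustration of the second point, the sketch below uses the standard setup slack relation, slack = period - (Tc-q + combinational delay + Tsu), for three paths launched from one flip-flop. The clock period, delays, and path names are hypothetical; the point is only that one slower flip-flop lowers the slack of every path it launches.

```python
def slack(period, t_cq, t_comb, t_su):
    """Setup slack of a register-to-register path (all values in ps)."""
    return period - (t_cq + t_comb + t_su)


# Hypothetical combinational delays (ps) of three paths launched from ff1.
COMB = {"ff1->ff2": 60.0, "ff1->ff3": 75.0, "ff1->ff4": 40.0}


def slacks(t_cq, period=100.0, t_su=10.0):
    """Slack of every path that starts at ff1 for a given clock-to-q delay."""
    return {p: slack(period, t_cq, d, t_su) for p, d in COMB.items()}


before = slacks(t_cq=15.0)        # original flip-flop
after = slacks(t_cq=15.0 + 5.0)   # same flip-flop with 5 ps extra Tc-q
print(before)
print(after)
```

With these numbers the path to ff3 goes from zero slack to a 5 ps violation, and the other two paths lose 5 ps each.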
Skewed Flip-Flops
Mixed Lgate flip-flop
– Larger Lgate transistor
    Smaller delay overhead than high Vt transistor
    Footprint of gate remains almost the same
– Selective assignment of larger Lgate in flip-flop
    Smaller delay overhead than assigning larger Lgate to the entire flip-flop
    Leakage reduction can reach the same amount as when all gates use larger Lgate
Unequal leakage depending on the values of D and Q
– Four kinds of SFFs
    Characterized to minimize leakage corresponding to four states (D & Q)
    SF00, SF01, SF10 and SF11
Delay: 32%, Leakage: 72%
cf. high Vt inverter: Delay: 81%, Leakage: 92%
[Figure: leakage [nA] and delay [ps] versus gate length (45 nm to 50 nm).]
Skewed Flip-Flops
Design of an SFF (in case of SF00)
– Assume CK = 0 in idle state (clock gating)
[Figure: flip-flop schematic (D, Q, CK, clk) annotated with idle-state logic values; transistors with larger Lgate are marked.]
Skewed Flip-Flops
[Figure: schematics of the four skewed flip-flops SF00, SF01, SF10 and SF11.]
Leakage Characteristic of SFFs
45-nm PTM, 4 nm biasing
[Figure: leakage current [nA] (axis 0 to 1200) versus idle state D/Q (0/0, 0/1, 1/0, 1/1), original flip-flop (Orig.) versus each SFF. Panels: (a) SF00, (b) SF01, (c) SF10, (d) SF11.]
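Because each SFF leaks least in one particular D/Q state, a natural way to use idle state probabilities is to pick the variant with the lowest expected leakage. The sketch below is only an illustration of that idea: the leakage table is hypothetical (400 nA in the matching state, 900 nA elsewhere, 1000 nA for the original flip-flop) and does not reproduce the measured values in the figure.

```python
STATES = ["00", "01", "10", "11"]  # idle-state values of D and Q

# Hypothetical leakage table [nA]: LEAK[variant][state]. Illustrative only.
LEAK = {
    f"SF{v}": {s: (400.0 if s == v else 900.0) for s in STATES}
    for v in STATES
}
LEAK["orig"] = {s: 1000.0 for s in STATES}


def expected_leakage(variant, prob):
    """Probability-weighted leakage of one flip-flop variant."""
    return sum(prob[s] * LEAK[variant][s] for s in STATES)


def best_variant(prob):
    """Variant with the smallest expected leakage for this flip-flop."""
    return min(LEAK, key=lambda v: expected_leakage(v, prob))


# Hypothetical idle state probabilities of one flip-flop (they sum to 1).
prob = {"00": 0.1, "01": 0.1, "10": 0.7, "11": 0.1}
choice = best_variant(prob)
print(choice, expected_leakage(choice, prob))
```

Here the flip-flop idles mostly with D = 1 and Q = 0, so SF10 is selected.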
Timing Characteristic of SFFs
45-nm PTM, 4 nm biasing
[Figure: delay [ps] (axis 0 to 40) for rising Tsu, falling Tsu, rising Tc-q and falling Tc-q, original flip-flop (Orig.) versus each SFF. Panels: (a) SF00, (b) SF01, (c) SF10, (d) SF11.]
[Figure: voltage [V] (0 to 0.9) versus time for D and clk around the rising edge of CK, original flip-flop (Orig.) versus SF00, with Tsu, Tsu', T1 and T1' marked. Panels: (a) rising Tsu, (b) falling Tsu.]
SFF Transformation
Utilize SFFs while maintaining timing constraints
– Input: netlist & idle state probabilities of flip-flops
– Output: new netlist with skewed flip-flops
[Flow diagram boxes: Netlist & idle state probabilities; Skewed flip-flop transformation under timing constraints, comprising initial SFF assignment and flip-flop transformation (find critical path, find candidate, substitute); Mixed Vt assignment on combinational subcircuits.]
Half Skewed Flip-Flops (HSFs)
For a smoother transition
– HSF0: unchanged setup time delay
– HSF1: unchanged clock-to-q delay
[Figure: schematics of (a) HSF0 and (b) HSF1.]
SFF Transformation Algorithm
Select a flip-flop to be transformed
– Find critical path
– Find candidate
    Both ends of the most critical path
    Larger timing improvement
Substitute
– (1) Most effective SFFs in terms of delay, given position and phase of transition
– (2) If (1) fails, try HSFs
– (3) If (2) fails, use the original flip-flops
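The three-step fallback in the Substitute step can be sketched as follows. This is a simplified illustration, not the authors' algorithm: the timing check is replaced by a hypothetical per-cell delay penalty compared against the slack available at each flip-flop, and critical path search and candidate selection are left out. The penalty values and flip-flop names are made up for the example.

```python
# Hypothetical timing penalty (ps) of each cell relative to the original
# flip-flop; real values would come from library characterization.
PENALTY = {"SFF": 8.0, "HSF": 4.0, "ORIG": 0.0}


def substitute(slack_ps):
    """Try SFF first, then HSF, then fall back to the original flip-flop."""
    for cell in ("SFF", "HSF", "ORIG"):
        if PENALTY[cell] <= slack_ps:  # stand-in for a real timing check
            return cell
    return "ORIG"


def transform(slacks):
    """Apply the fallback to every flip-flop; slacks are in ps."""
    return {ff: substitute(s) for ff, s in slacks.items()}


# Hypothetical slack available at three flip-flops.
result = transform({"ff_a": 10.0, "ff_b": 5.0, "ff_c": 1.0})
print(result)
```

Ordering the candidates from lowest leakage to lowest delay means each flip-flop keeps the most leakage-efficient cell that its timing allows.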
Experimental Results
For ISCAS benchmark circuits (45-nm PTM library)
(SE: sequential elements; x: ratio to the mixed Vt only result)

Benchmark                  Mixed Vt only                        SFX + Mixed Vt
Name    # Gates  # FFs     Comb. (uA)  SE (uA)  Total (uA)      Comb. (x)  SE (x)  Total (x)
s298    130      14        30          13       43              0.97       0.44    0.81
s344    144      15        31          15       46              0.99       0.54    0.86
s349    142      15        31          15       46              1.00       0.54    0.86
s382    185      21        38          19       57              1.06       0.38    0.84
s400    198      21        38          19       57              1.12       0.36    0.87
s444    199      21        49          19       68              1.12       0.36    0.91
s526    258      21        41          19       60              0.99       0.55    0.85
s641    206      19        30          18       48              0.99       0.45    0.79
s713    206      19        34          18       52              1.00       0.45    0.81
s838    416      32        70          30       100             1.03       0.37    0.83
s5378   1534     163       244         155      399             1.07       0.42    0.82
s9234   1457     135       280         121      401             1.03       0.36    0.83
Avg.                                                            1.04       0.44    0.84
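The average total figure can be checked directly from the table. The short script below recomputes it from the Total (x) column, reading (x) as the ratio to the mixed Vt only result; that reading is an assumption based on the column headers.

```python
# (benchmark, total leakage with mixed Vt only [uA], Total (x) with SFX + mixed Vt)
ROWS = [
    ("s298", 43, 0.81), ("s344", 46, 0.86), ("s349", 46, 0.86),
    ("s382", 57, 0.84), ("s400", 57, 0.87), ("s444", 68, 0.91),
    ("s526", 60, 0.85), ("s641", 48, 0.79), ("s713", 52, 0.81),
    ("s838", 100, 0.83), ("s5378", 399, 0.82), ("s9234", 401, 0.83),
]

# Arithmetic mean of the normalized totals over the 12 benchmarks.
avg_ratio = sum(ratio for _, _, ratio in ROWS) / len(ROWS)
saving_pct = (1.0 - avg_ratio) * 100.0

print(f"average Total (x): {avg_ratio:.2f}")
print(f"average saving: {saving_pct:.0f}%")
```

The mean of the twelve ratios rounds to 0.84, which corresponds to an average saving of about 16% relative to mixed Vt alone.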
Comparison of Mixed Vt Flip-Flop
[Figure: per-benchmark comparison (axis 0.6 to 1.0) of "Mixed Vt FFs + Mixed Vt comb." versus "SFX + Mixed Vt comb." for s298, s344, s349, s382, s400, s444, s526, s641, s713, s838, s9234.]
[Figure: ratio of the average number of fanout timing paths of F/Fs to that of combinational gates (axis 0 to 4) for s298, s344, s349, s400, s444, s526, s641, s713, s838, s9234.]
Conclusion
Proposed Skewed Flip-Flops
– The set of mixed Lgate flip-flops
– Skewed characteristics in terms of leakage and delay
A heuristic algorithm that substitutes SFFs
– An average leakage saving of 16% is achieved, compared to the use of mixed Vt alone