Upload
jvandome
View
221
Download
1
Embed Size (px)
Citation preview
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
1/37
CpE 690: Digital System DesignFall 2013
Lecture 9Logical Effort and Multi-stage Logic Networks
1
Bryan AcklandDepartment of Electrical and Computer Engineering
Stevens Institute of TechnologyHoboken, NJ 07030
Adapted from Lecture Notes, David Mahoney Harris CMOS VLSI Design
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
2/37
2
So far, we have seen how to estimate the delay of a gate
in terms of its topology and fanout:
or, in units of = delay of unloaded, unit size inverter ( = 3RC):(~3ps in 65nm process, ~60 ps in 0.6 process)
How can we use this to optimize a network of gates?
Linear Delay Model
delay = (3+3h)RC delay = (9+5h)RC
d inv = 1+h d nand3 = 3+(5/3)h
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
3/37
3
Chip designers face a bewildering array of
choices What is the best circuit topology for a function? How many stages of logic give least delay? How wide should the transistors be?
Simulating or analyzing all the alternatives isimpractical
Logical effort is a method to make thesedecisions
Builds on our simple linear (RC) model of delay Allows back-of-the-envelope calculations Helps make rapid comparisons between alternatives Provides intuitive understanding of network delay
Logical Effort
? ? ?
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
4/37
4
Decoder specifications:
16 word register file Each word is 32 bits wide Each bit presents load of 3 unit-sized transistors True and complementary address inputs A[3:0] Each (address) input may present a load of 10 unit-sized
transistors We need to decide:
How many stages to use? How large should each gate be?
How fast can decoder operate?
Example
A[3:0] A[3:0]
16
32 bits
1 6 w
or d
s
4 : 1
6 D
e c o d er
Register File
Suppose we need to design
an address decoder for aregister file:
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
5/37
5
Gate delay takes the form: d = p + g.h = p + f
p: parasitic delay Represents delay of gate driving no load (doing no useful work) Set by internal parasitic capacitance
f : effort delay = g. h (a.k.a. stage effort) Result of doing useful work (driving other gates)
g: logical effort Measures complexity of a gate (=1 for inverter) input capacitance (relative to inverter ) to deliver unit output current
h: electrical effort = C out / C in Ratio of output (load) capacitance to input capacitance Sometimes called fanout
Delay in a Logic Gate
d inv = 1+h (in units of ) dnand2 = 2+(4/3)h
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
6/37
6
Delay Plots
Electrical Effort:h = C
out / C
in
N o r m a
l i z e
d D e
l a y : d
Inverter 2-inputNAND
g =p =d =
g =p =d =
0 1 2 3 4 5
0
1
2
3
4
5
6d = f + p= g.h + p
( )
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
7/37
7
Delay Plots
Electrical Effort:h = C
out / C
in
N o r m a
l i z e
d D e
l a y : d
Inverter 2-inputNAND
g = 1p = 1d = h + 1
g = 4/3p = 2d = (4/3)h + 2
Effort Delay: f
Parasitic Delay: p
0 1 2 3 4 5
0
1
2
3
4
5
6d = f + p= g.h + p
What about NOR2?
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
8/37
8
Logical effort of a gate is the ratio of the input capacitance
of a gate to the input capacitance of an inverter deliveringthe same output current . Property of topology of the gate Independent of size of the gate
Can measure from delay vs. fanout plots Or estimate by counting transistor widths
Logical Effort
A Y A
B
Y A
B Y1
2
1 1
2 2
2
2
4
4
C in = 3
g = 3/3
C in = 4
g = 4/3
C in = 5
g = 5/3
C C C
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
9/37
9
Logical Effort of Common Gates
Gate type Number of inputs
1 2 3 4 n
Inverter 1
NAND 4/3 5/3 6/3 (n+2)/3
NOR 5/3 7/3 9/3 (2n+1)/3
Tristate / mux 2 2 2 2 2
XOR, XNOR 4, 4 6, 12, 6 8, 16, 16, 8
LogicalEffort g
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
10/37
10
Parasitic delay of a gate is the delay of the gate when it
drives zero load Delay due to internal diffusion capacitance Property of topology of the gate Independent of size of the gate
Usually just count diffusion capacitance on output node but beware of high fan-in gates
Parasitic Delay
A Y AB
Y A
B Y1
2
1 1
2 2
22
4
4
C in = 3
g = 3/3
C in = 4
g = 4/3
C in = 5
g = 5/3
Cdiff = 3
p = 3/3
Cdiff = 6
p = 2
Cdiff = 6
p = 2
C C C
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
11/37
11
Parasitic Delay of Common Gates
Gate type Number of inputs
1 2 3 4 n
Inverter 1
NAND 2 3 4 n
NAND (Elmore) 7/3 4 6 (n 2+5n)/6
NOR 2 3 4 n
Tristate / mux 2 4 6 8 2n
XOR, XNOR 4 6 8
ParasiticDelay p
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
12/37
12
Estimate the frequency of an N-stage ring oscillator:
Logical Effort: g =Electrical Effort: h =Parasitic Delay: p =Stage Delay: d =Frequency: f osc =
Example: Ring Oscillator
111g.h+p = 2
1/(2*N*d) = 1/(4N. )
31 stage ring oscillator in 0.6 m process: = 60ps, f = 135 MHz
65nm process: = 3ps, f = 2.7GHz
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
13/37
13
Estimate the delay of a fanout-of-4 (FO4) inverter
Logical Effort: g =Electrical Effort: h =Parasitic Delay: p =Stage Delay: d =
Example: FO4 Inverter
14
1g.h+p = 5
d
The FO4 delay is about
300 ps in 0.6 m process
60 ps in a 180 nm process
15 ps in an 65 nm process
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
14/37
14
What if we have a multi-stage path with
a mix of different gate types along the path a known load capacitance at the end of the path a limitation on the input capacitance to the path
How can we size these gates to minimize the delay? We generalize the concept of logical effort from single
gate to a multistage path
Multistage Logic Networks
20
10* x* y* z*
* represents size of gate as measured by its input capacitance
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
15/37
15
Define Path Logical Effort
Define Path Electrical Effort
Define Path Effort
Can we write F = G.H ?
Path Effort
iG g= out-path
in-path
C H
C =
i i iF f g h= =
20
10 x y z
g1 = 1h1 = x/10f 1 = x/10
g2 = 5/3h2 = y/xf 2 = 5y/3x
g3 = 4/3h3 = z/yf 3 = 4z/3y
g4 = 1h4 = 20/zf 4 = 20/z
= 20/9
= 20/10 = 2
= 40/9
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
16/37
16
No! Consider paths that branch:
G =
H =
G.H =h1 =
h2 =
F =
Paths that Branch
5
15
1590
90
1
90 / 5 = 18
18(15 +15) / 5 = 6
90 / 15 = 6
(g 1.h 1).(g 2.h 2) = 36 = 2G.H
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
17/37
17
Introduce branching effort
Accounts for branching between stages in path
Define
Define path branching effort
Now we compute the path effort F = G.B.H
Branching Effort
i B b=
ih BH =Note:
=: _ + : _
: _
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
18/37
18
Now, including branching effort:
G =
H =
B =B.G.H =
h1 =
h2 =F =
Paths that Branch
5
15
1590
90
1
90 / 5 = 18
36
(15 +15) / 5 = 6
90 / 15 = 6(g 1.h 1).(g 2.h 2) = 36 = B.G.H
2.1 = 2
b1 = 2
b2 = 1
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
19/37
19
Now, remember that delay of each gate is:
d = g.h + p = f + p
So, total path delay is:
Define Path Effort Delay:
Define Path Parasitic Delay:
Path Delay
Multistage Delays
= +
F i D f = iP p=
i F D d D P= = +
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
20/37
20
We want to set the size of the individual gates along the
path so as to minimize:
Remember P is independent of gate size.
We want to minimize given
Delay is smallest when each stage bears same effort
Thus minimum delay of N stage path is
Minimizing Path Delay
i F D d D P= = +
= (a constant)
1 N D NF P= +
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
21/37
21
This is the key result of logical effort:
Find fastest delay doesnt require calculating gate sizes
Can estimate D knowing : number of stages (N) path effort (depends on stage topologies) parasitic gate delay's (depends on stage topologies)
Compare to optimization by simulation
Minimum Path Delay
1
N D NF P= +
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
22/37
22
How wide should the gates be for minimum delay?
Working backward, apply capacitance transformation tofind input capacitance of each gate given load it drives. Input capacitance determines gate size
Check work by verifying input cap spec is met.
Setting Gate Sizes
out
in
i
i
C C
i out in
f gh g
g C C
f
= =
=
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
23/37
23
Select gate sizes x and y for least delay from A to B
Example: 3-stage path
8 x
x
x
y
y
45
45
AB
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
24/37
24
Logical Effort G =
Electrical Effort H =
Branching Effort B =
Path Effort F =Best Stage Effort
Parasitic Delay P =
Minimum Delay D =
Example: Minimum Path Delay
8 x
x
x
y
y
45
45
AB
(4/3)*(5/3)*(5/3) = 100/27
45/8
3*2 = 6
G.B.H = 125
2+3+2 = 7
3 5 f F = =
3*5 + 7 = 22 (=4.4 FO4)
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
25/37
25
Work backward for sizes:
y =
x =
Example: Calculating Gate Sizes
8 x
x
x
y
y
45
45
AB
((5/3) * 45) / 5 = 15
((5/3) * (15+15)) / 5 = 10
A load = ((4/3)*(10+10+10))/5 = 8 (agrees with original spec.)
P:12
N:3
P:4N:6
P:4N:4
transistor widthsassume C in units of C g
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
26/37
26
How many stages should a path use? Minimizing number of stages is not always fastest
Example: control signal to drive 64-bit datapath with signalsourced by a unit inverter
Optimizing Number of Stages
1 1 1 1
64 64 64 64
InitialDriver
Datapath Load
N:f:
D:
1 2 3 4
G = 1
H = 64B = 1F = 64P = Nf i = (64) 1/N
D = NF 1/N + P= N(64) 1/N + N
64
65
8
18
4
15
2.8
15.3
8
8
4
16
23
2.8
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
27/37
27
Separate out optimization of logic function from theoptimization of number of stages by adding extrainverters to output
Add (N - n 1) inverters to output to create N-stage network Extra inverters do not change path logic effort F, but they
do add parasitic delay Optimum delay of extended path is:
Adding Stages to Logical Function
N - n 1 Extra InvertersLogic Block:
n1 StagesPath Effort F
( )11
1
1
N
n
i inv
i
D NF p N n p=
= + +
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
28/37
28
Differentiate with respect to N and set to zero todetermine optimum value for N:
Define best stage effort = 1 (note ) then delay minimized when:
No closed form solution for as a function of p inv Neglecting parasitics ( p inv = 0 ), we find = 2.718 ( )
For p inv = 1 , solve numerically for = 3.59
Optimizing Extended Path
( )11
11
N
n
i invi
D NF p N n p=
= + +
1 1 1
ln 0 N N N
inv
DF F F p N
= + + =
( )1 ln 0inv p + =
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
29/37
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
30/37
30
Decoder specifications:
16 word register file Each word is 32 bits wide Each bit presents load of 3 unit-sized transistors True and complementary address inputs A[3:0] Each input may present a load of 10 unit-sized transistors
We need to decide: How many stages to use? How large should each gate be? How fast can decoder operate?
Revisit Decode Design Example
A[3:0] A[3:0]
16
32 bits
1 6 w
or d
s
4 : 1
6 D
e c o d er
Register File
Suppose we need to design
an address decoder for aregister file:
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
31/37
31
Decoder specifications: 16 word register file
Each word is 32 bits wide Each bit presents load of 3 unit-
sized transistors Each input may drive 10 unit-sized
transistors
Estimate Number of Stages A[3:0] A[3:0]
16
32 bits
1 6 w
or d
s
4 : 1
6 D
e c o d er
Register File
H = (3 x 32)/10 = 9.6Each input address line drives 8 decoder gates, soB= 8
There will be something like NAND4 to decode each address, so assume
G = 2then, path effort P = G.B.H = 153.6
assuming =4, N = log 4(153.6) = 3.63
Lets try a 3-stage design!
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
32/37
32
Complete the Design
Logical Effort: G = 1 * 6/3 * 1 = 2
Path Effort: F = GBH = 153.6Stage Effort:Path Delay:Gate sizes: z = y =
1/ 3 5.36 f F = == 3. + 1 + 4 + 1 = 22.1
96*1/5.36 = 18 18*2/5.36 = 6.7
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
33/37
Compare Alternative Solutions
Compare many alternatives with a spreadsheet
D = N(76.8 G) 1/N + P
33
Design N G P D
NOR4 1 3 4 234
NAND4-INV 2 2 5 29.8
NAND2-NOR2 2 20/9 4 30.1
INV-NAND4-INV 3 2 6 22.1
NAND4-INV-INV-INV 4 2 7 21.1
NAND2-NOR2-INV-INV 4 20/9 6 20.5
NAND2-INV-NAND2-INV 4 16/9 6 19.7
INV-NAND2-INV-NAND2-INV 5 16/9 7 20.4
NAND2-INV-NAND2-INV-INV-INV 6 16/9 8 21.6
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
34/37
Review of Definitions
34
Term Stage Path
number of stages
logical effort
electrical effortbranching effort
effort
effort delay
parasitic delay
delay
iG g= out-path
in-path
C
C H =
N
i B b= F GBH =
F i D f =
iP p= i F D d D P= = +
out
in
C C h =
on-pa th off-path
on-path
C C C b
+=
f gh=
f
p
d f p= +
g
1
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
35/37
Review Method of Logical Effort
1) Compute path effort
2) Estimate best number of stages
3) Sketch path with N stages
4) Estimate least delay
5) Determine best stage effort
6) Find gate sizes
35
F GBH =
4log N F =
1 N D NF P= +
1 N f F =
i
i
i out in
g C C
f =
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
36/37
Limits of Logical Effort
Chicken and egg problem
Need path to compute G But dont know number of stages without G
Simplistic delay model
Neglects input rise time effects
Interconnect Iteration required in designs with wire
Maximum speed only Not minimum area/power for constrained delay
36
8/13/2019 Lecture 9 - CpE 690 Introduction to VLSI Design
37/37
Summary
Logical effort is useful for thinking about delay in circuits
Numeric logical effort characterizes gates NANDs are faster than NORs in CMOS Paths are fastest when effort delays are ~4 Path delay is weakly sensitive to stages, sizes But using fewer stages doesnt mean faster paths Delay of path is about log 4F FO4 inverter delays Inverters and NAND2 best for driving large caps
Provides language for discussing fast circuits requires practice to master
37