VLSI DESIGN
LECTURE 5: DELAY ESTIMATION AND LOGICAL EFFORT
Waqar Ahmad
Department of Electrical Engineering
SWITCH-LEVEL RC MODELS
Use equivalent circuits for MOS transistors
• Ideal switch + capacitance and ON resistance
• Unit nMOS has resistance R, capacitance C
• Unit pMOS has resistance 2R, capacitance C
Capacitance is proportional to width
Resistance is inversely proportional to width
[Figure: equivalent RC circuits for an nMOS and a pMOS transistor of width k. Each has capacitance kC on the gate, source, and drain; the nMOS has ON resistance R/k and the pMOS 2R/k.]
VLSI Design
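INVERTER RC DELAY ESTIMATE
Estimate the delay of a fanout-of-1 inverter in response to a step input function.
[Figure: unit inverter (nMOS width 1, pMOS width 2) driving an identical inverter, with its equivalent RC circuit: resistance R driving 6C in total, from 3C of drain diffusion and 3C of load gate capacitance.]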
tpd = 6RC
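The 6RC figure can be checked with a few lines of arithmetic. The sketch below assumes the unit inverter of the switch-level model (nMOS width 1, pMOS width 2, effective resistance R for either transition); the function name and the generalization to h identical loads are illustrative, not part of the slides.

```python
def inverter_delay_rc(h):
    """Delay of a unit inverter driving h identical inverters, in units of RC."""
    diffusion = 1 + 2      # own drain diffusion: C (nMOS) + 2C (pMOS)
    load = (1 + 2) * h     # each driven inverter presents C + 2C of gate capacitance
    resistance = 1         # effective resistance R for either transition
    return resistance * (diffusion + load)

# Fanout-of-1 case from the slide: 3C of diffusion + 3C of load, driven through R
print(inverter_delay_rc(1))
```

With h = 1 this gives 6, matching tpd = 6RC.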
EXAMPLE: 3-INPUT NAND
Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R).
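[Figure: 3-input NAND. The three series nMOS transistors each have width 3 and the three parallel pMOS transistors each have width 2, so both the pull-down and pull-up paths have effective resistance R.]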
EXAMPLE: 3-INPUT NAND GATE
Annotate the 3-input NAND gate with gate and diffusion capacitance
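[Figure: 3-input NAND annotated with capacitances. Each transistor contributes its width times C at its gate, source, and drain: 2C for each pMOS terminal and 3C for each nMOS terminal.]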
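ELMORE DELAY
ON transistors look like resistors
Pullup or pulldown network modeled as RC ladder
Elmore delay of RC ladder:
[Figure: RC ladder with series resistors R1, R2, R3, ..., RN and node capacitances C1, C2, C3, ..., CN]
$$t_{pd} \approx \sum_{\text{nodes } i} R_{i\text{-to-source}}\, C_i = R_1 C_1 + (R_1 + R_2) C_2 + \dots + (R_1 + R_2 + \dots + R_N) C_N$$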
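The Elmore sum for a ladder translates directly into a running total of resistance. The following is a minimal sketch; the function name `elmore_delay` and the list-based interface are assumptions made for illustration, and the sample values are arbitrary.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] is the resistor between node i-1 (or the source, for i = 0)
    and node i; capacitances[i] is the capacitance from node i to ground.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one resistor per capacitor")
    total = 0.0
    r_to_source = 0.0
    for r, c in zip(resistances, capacitances):
        r_to_source += r          # resistance between this node and the source
        total += r_to_source * c  # each node contributes R_(i-to-source) * C_i
    return total

# Hypothetical two-node ladder in arbitrary units: R1*C1 + (R1 + R2)*C2
print(elmore_delay([1.0, 2.0], [3.0, 4.0]))  # 1*3 + (1+2)*4 = 15.0
```

Passing three resistors of R/3 and the node capacitances of a pull-down stack evaluates the same sum for a series nMOS network.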
EXAMPLE: 3-INPUT NAND
Estimate worst-case rising and falling delay of 3-input NAND driving h identical gates.
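[Figure: 3-input NAND driving h identical gates. Output node Y carries 9C of diffusion capacitance plus 5hC of load; internal nodes n1 and n2 carry 3C each.]
Rising (one pMOS of resistance R charges Y):
$$t_{pdr} = (9 + 5h)RC$$
Falling (three series nMOS of resistance R/3 each discharge n1, n2, and Y):
$$t_{pdf} = (3C)\left(\frac{R}{3}\right) + (3C)\left(\frac{R}{3} + \frac{R}{3}\right) + \left[(9 + 5h)C\right]\left(\frac{R}{3} + \frac{R}{3} + \frac{R}{3}\right) = (12 + 5h)RC$$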
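COMPUTING THE RISE AND FALL DELAYS
Estimate rising and falling propagation delays of a 2-input NAND driving h identical gates.
[Figure: 2-input NAND (inputs A and B, all transistors of width 2) driving h copies. Output node Y carries 6C of diffusion capacitance plus 4hC of load; internal node x carries 2C. Equivalent circuits: R driving (6 + 4h)C for the rising case; two R/2 resistors in series with 2C at x and (6 + 4h)C at Y for the falling case.]
$$t_{pdr} = (6 + 4h)RC$$
$$t_{pdf} = (2C)\left(\frac{R}{2}\right) + \left[(6 + 4h)C\right]\left(\frac{R}{2} + \frac{R}{2}\right) = (7 + 4h)RC$$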
DELAY COMPONENTS
Delay has two components:
• Parasitic delay (due to the gate's own diffusion capacitance)
  • Fixed: n RC
  • For a 2-input NAND, n = 6 (rising) or 7 (falling)
  • For a 3-input NAND, n = 9 (rising) or 12 (falling)
  • Independent of load
• Effort delay
  • x h RC, where x = input gate capacitance (in units of C) and h = fan-out
  • Proportional to load capacitance
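The split into a fixed parasitic term and a load-dependent effort term can be tabulated for the two NAND examples. This is a small sketch using the coefficients from the slides (n = 6, 7 with x = 4 for the 2-input NAND; n = 9, 12 with x = 5 for the 3-input NAND); the function name and tuple layout are illustrative choices.

```python
def gate_delay_rc(parasitic_n, input_cap_x, h):
    """Return (parasitic, effort, total) delay in units of RC."""
    parasitic = parasitic_n        # fixed, independent of load
    effort = input_cap_x * h       # grows linearly with fan-out h
    return parasitic, effort, parasitic + effort

# Coefficients from the NAND examples: (n_rising, n_falling, x)
GATES = {"NAND2": (6, 7, 4), "NAND3": (9, 12, 5)}

for name, (n_rise, n_fall, x) in GATES.items():
    for h in (1, 4):
        rise = gate_delay_rc(n_rise, x, h)[2]
        fall = gate_delay_rc(n_fall, x, h)[2]
        print(f"{name} h={h}: rise {rise} RC, fall {fall} RC")
```

Only the effort term changes between h = 1 and h = 4; the parasitic term is the same in both rows for a given gate.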
CONTAMINATION DELAY
Best-case (contamination) delay can be substantially less than propagation delay.
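Example: for a 3-input NAND, if all three inputs fall simultaneously, the three parallel pMOS transistors (R/3 in total) charge the output together:
$$t_{cdr} = \left[(9 + 5h)C\right]\left(\frac{R}{3}\right) = \left(3 + \frac{5}{3}h\right)RC$$
[Figure: 3-input NAND with 9C + 5hC at output Y and 3C at each of internal nodes n1 and n2.]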
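CONTAMINATION DELAY (2-INPUT NAND)
If both inputs fall simultaneously:
$$t_{cdr} = \left[(6 + 4h)C\right]\left(\frac{R}{2}\right) = (3 + 2h)RC$$
[Figure: 2-input NAND with 6C + 4hC at output Y and 2C at internal node x; equivalent circuit with (6 + 4h)C at Y.]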
• The order in which inputs arrive also affects propagation delay. Which is better: AB = 10 -> 11 or AB = 01 -> 11?
DIFFUSION CAPACITANCE
We assumed contacted diffusion on every source/drain.
Good layout minimizes diffusion area.
Example: a NAND3 layout that shares one diffusion contact
• Reduces output capacitance by 2C (from 9C to 7C)
• Merged uncontacted diffusion might help too
[Figure: NAND3 layout with 7C at the output and 3C at each internal node; diffusion regions labeled Isolated Contacted Diffusion, Merged Uncontacted Diffusion, and Shared Contacted Diffusion.]
LAYOUT COMPARISON
Which layout is better?
[Figure: two candidate layouts of a gate with inputs A and B and output Y, each drawn between VDD and GND rails.]
IMPACT OF TRANSISTOR SIZING
What happens to the delay if we increase the transistor sizes by K?
Is it the case that increasing the size of the transistor always reduces delay?
IMPACT OF SIZING IN A PATH
[Figure: a gate within a path, scaled by K; label Cout×K.]
• Lower output resistance and higher output capacitance → delay reduces (parasitic delay stays the same)
• Larger input capacitance → increases the delay of the previous stage!
What is the final outcome? Should we size? By how much?
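EXPRESSING DELAY AS A LINEAR MODEL
C is the capacitance of a unit-width transistor.
For the 2-input NAND scaled by k and driving a load of 4h'C:
$$d = \frac{R}{k}\left(4h'C + 6kC\right) = RC\left(\frac{4h'}{k} + 6\right)$$
The first term is the effort delay and the second is the parasitic delay.
Normalize with respect to 3RC (the delay of an unloaded inverter):
$$d = \frac{4}{3}\cdot\frac{h'}{k} + 2$$
• 4/3: logical effort (affected by gate type or geometry)
• h'/k: electrical effort
• 2: parasitic delay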
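The normalization step can be verified numerically: the absolute delay in units of RC, divided by 3, should equal g·h + p. The sketch below uses exact fractions; the function name and the sample values h' = 6, k = 2 are hypothetical and chosen only to give round numbers.

```python
from fractions import Fraction

TAU_RC = 3  # unloaded inverter delay 3RC, used as the normalization unit

def nand2_delay(h_prime, k):
    """Return (absolute delay in RC, normalized delay) for the scaled NAND2."""
    absolute = Fraction(4 * h_prime, k) + 6   # RC * (4h'/k + 6)
    g = Fraction(4, 3)                        # logical effort
    h = Fraction(h_prime, k)                  # electrical effort
    p = 2                                     # parasitic delay
    normalized = g * h + p
    assert normalized == absolute / TAU_RC    # both forms agree
    return absolute, normalized

absolute, normalized = nand2_delay(h_prime=6, k=2)
print(absolute, normalized)
```

For these sample values the absolute delay is 18 RC and the normalized delay is 6.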
SUMMARY OF LINEAR DELAY MODEL
• g: logical effort = ratio of the gate's input capacitance to the input capacitance of an inverter that would deliver the same current
• h: electrical effort = ratio of load capacitance to the gate's input capacitance (sometimes called fanout)
• p: parasitic delay
  • represents the delay of the gate driving no load
  • set by internal parasitic capacitance
IMPACT OF GATE SIZING
[Figure: 3-input NAND (nMOS width 3, pMOS width 2) annotated with 5C at each input, 9C at the output, and 3C at each internal node.]
If you decide to increase everything by a factor of k
How about an inverter?
Unloaded delay = 3RC
• 12 ps in a 180 nm process
• 40 ps in a 0.6 μm process
LOGICAL EFFORT OF AN INVERTER
Logical effort is the ratio between input capacitance of the gate to the input capacitance of the inverter that would deliver the same current
Thus, logical effort of an inverter is 1
COMPUTING LOGICAL EFFORT OF NAND GATE
2-input NAND g = (2+2)/ (1+2) = 4/3
For 3 input NAND gate g = (3+2)/ (1+2) =5/3
For n input NAND gate g = (n+2)/ 3
COMPUTING LOGICAL EFFORT OF NOR GATE
2-input NOR g = (1+4)/ (1+2) = 5/3
For 3-input NOR g = (1+6)/ (1+2) = 7/3
For n input NOR gate g = (1+2n)/3
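The NAND and NOR formulas follow the same pattern: input capacitance of the gate divided by the 1 + 2 = 3 units of a unit inverter. A short sketch with exact fractions reproduces the values above; the function names are illustrative.

```python
from fractions import Fraction

INVERTER_CIN = 1 + 2  # unit inverter: nMOS width 1 + pMOS width 2

def nand_logical_effort(n):
    # n series nMOS of width n each, parallel pMOS of width 2 each
    return Fraction(n + 2, INVERTER_CIN)

def nor_logical_effort(n):
    # parallel nMOS of width 1 each, n series pMOS of width 2n each
    return Fraction(1 + 2 * n, INVERTER_CIN)

for n in (2, 3, 4):
    print(n, nand_logical_effort(n), nor_logical_effort(n))
```

For every n the NOR value exceeds the NAND value, since the widened series pMOS transistors add more input capacitance than widened series nMOS transistors.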
COMPUTING LOGICAL EFFORT OF COMPLEX GATE
gA = (2+4)/(1+2) = 2
gB = (2+4)/(1+2) = 2
gC = (1+4)/(1+2) = 5/3
COMPUTING PARASITIC DELAY
EXAMPLE: RING OSCILLATOR
• Estimate the frequency of an N-stage ring oscillator
Logical Effort: g = 1
Electrical Effort: h = 1
Parasitic Delay: p = 1
Stage Delay: d = 2
Frequency: fosc = 1/(2*N*d) = 1/(4N)
A 31-stage ring oscillator in a 0.6 μm process has a frequency of ~200 MHz
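The ~200 MHz figure can be reproduced from the normalized stage delay once a value for the delay unit is chosen. The sketch below assumes the unit is the unloaded inverter delay of 40 ps quoted earlier for the 0.6 μm process; that pairing, the constant name, and the function name are assumptions made for illustration.

```python
def ring_oscillator_frequency(n_stages, tau_seconds, g=1, h=1, p=1):
    """Frequency of an N-stage ring oscillator; each edge travels the ring twice per period."""
    stage_delay = (g * h + p) * tau_seconds   # d = gh + p, converted to seconds
    return 1.0 / (2 * n_stages * stage_delay)

TAU_0P6UM = 40e-12  # assumption: delay unit of 40 ps in the 0.6 um process

f_osc = ring_oscillator_frequency(31, TAU_0P6UM)
print(f"{f_osc / 1e6:.1f} MHz")
```

Under that assumption the result lands close to the 200 MHz quoted on the slide.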
EXAMPLE: FO4 INVERTER
Estimate the delay of a fanout-of-4 (FO4) inverter
Logical Effort: g = 1
Electrical Effort: h = 4
Parasitic Delay: p = 1
Stage Delay: d = 5
The FO4 delay is about
• 200 ps in a 0.6 μm process
• 60 ps in a 180 nm process
• f/3 ns in an f μm process
MULTISTAGE LOGIC NETWORKS
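Logical effort generalizes to multistage networks
• Path Logical Effort: $G = \prod g_i$
• Path Electrical Effort: $H = \frac{C_{\text{out-path}}}{C_{\text{in-path}}}$
• Path Effort: $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitance 10, internal gate capacitances x, y, z, and load 20]
g1 = 1, h1 = x/10
g2 = 5/3, h2 = y/x
g3 = 4/3, h3 = z/y
g4 = 1, h4 = 20/z
Can we write F = GH?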
CAN WE WRITE F = GH?
No! Consider paths that branch:
[Figure: a gate with input capacitance 5 drives two gates of input capacitance 15, each of which drives a load of 90]
G = 1
H = 90 / 5 = 18
GH = 18
h1 = (15 + 15) / 5 = 6
h2 = 90 / 15 = 6
F = g1g2h1h2 = 36 = 2GH
How to fix this problem?
BRANCHING EFFORT
Introduce branching effort
• Accounts for branching between stages in path
$$b = \frac{C_{\text{on path}} + C_{\text{off path}}}{C_{\text{on path}}}$$
$$B = \prod b_i$$
Note: $\prod h_i = BH$
Now we compute the path effort: F = GBH
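The branching example with capacitances 5, 15, 15, 90, 90 gives a quick check that including B restores the identity. The sketch below recomputes G, H, B, and F with exact fractions; the helper name `branching_effort` is illustrative.

```python
from fractions import Fraction
from math import prod

def branching_effort(c_on_path, c_off_path):
    """b = (C_on_path + C_off_path) / C_on_path for one stage."""
    return Fraction(c_on_path + c_off_path, c_on_path)

# Branching example: input 5 -> two gates of 15 -> each drives a load of 90
g = [1, 1]
h = [Fraction(15 + 15, 5), Fraction(90, 15)]              # stage electrical efforts
b = [branching_effort(15, 15), branching_effort(90, 0)]   # only stage 1 branches

G = prod(g)
H = Fraction(90, 5)
B = prod(b)
F = prod(gi * hi for gi, hi in zip(g, h))

print(G, H, B, F)
```

Here B = 2, so GBH = 1 · 2 · 18 = 36, which equals the product of the stage efforts.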
LOGICAL EFFORT CAN HELP US ANSWER TWO KEY QUESTIONS
1. How large should each stage in a multistage network be to achieve the minimum delay?
2. What is the optimal number of stages to achieve the minimum delay?
1. WHAT IS THE OPTIMAL SIZE OF EACH STAGE?
Delay is minimized when each stage bears the same effort
[Figure: two-stage path, Gate1 driving Gate2]
The answer generalizes: for N stages, minimum delay is achieved when each stage bears the same effort
$$\hat{f} = g_i h_i = F^{1/N}$$
EXAMPLE: 3-STAGE PATH
Select gate sizes x and y for least delay from A to B
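[Figure: three-stage path from A to B. The first gate has input capacitance 8 and drives three gates of input capacitance x; each of these drives two gates of input capacitance y; the final gates drive loads of 45.]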
EXAMPLE: 3-STAGE PATH
Logical Effort: G = (4/3)*(5/3)*(5/3) = 100/27
Electrical Effort: H = 45/8
Branching Effort: B = 3 * 2 = 6
Path Effort: F = GBH = 125
Best Stage Effort: $\hat{f} = \sqrt[3]{F} = 5$
Parasitic Delay: P = 2 + 3 + 2 = 7
Delay: D = 3*5 + 7 = 22 = 4.4 FO4
EXAMPLE: 3-STAGE PATH
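Work backward for sizes
y = 45 * (5/3) / 5 = 15
x = (15*2) * (5/3) / 5 = 10
Resulting transistor widths:
• First stage (input capacitance 8): P: 4, N: 4
• Second stage (input capacitance x = 10): P: 4, N: 6
• Third stage (input capacitance y = 15): P: 12, N: 3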
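The numbers in this example can be re-derived end to end. The following sketch uses the stage efforts, branching factors, and parasitic delays stated on the slides; the variable names are illustrative, and the integer cube root works only because F happens to be a perfect cube here.

```python
from fractions import Fraction
from math import prod

g = [Fraction(4, 3), Fraction(5, 3), Fraction(5, 3)]  # stage logical efforts
b = [3, 2, 1]      # stage branching efforts (last stage does not branch)
p = [2, 3, 2]      # stage parasitic delays
c_in, c_load = 8, 45

G, B, H = prod(g), prod(b), Fraction(c_load, c_in)
F = G * B * H                                   # path effort
f_hat = Fraction(round(float(F) ** (1 / 3)))    # cube root, rounded to an integer
assert f_hat ** 3 == F                          # exact only because F = 125 here
D = len(g) * f_hat + sum(p)                     # path delay in normalized units

# Work backward from the load: C_in = g * (b * C_next) / f_hat
y = g[2] * (b[2] * c_load) / f_hat
x = g[1] * (b[1] * y) / f_hat
first = g[0] * (b[0] * x) / f_hat               # should reproduce c_in

print(F, f_hat, D, float(D) / 5, y, x, first)
```

The last value is a consistency check: working backward all the way to the first stage returns the given input capacitance of 8.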
2. WHAT IS THE OPTIMAL NUMBER OF STAGES?
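Consider adding inverters to the end of a path
• How many give least delay?
[Figure: logic block of n1 stages with path effort F, followed by N - n1 extra inverters]
$$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\,p_{inv}$$
$$\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$$
Define best stage effort $\rho = F^{1/N}$:
$$p_{inv} + \rho\,(1 - \ln \rho) = 0$$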
OPTIMAL NUMBER OF STAGES
$p_{inv} + \rho\,(1 - \ln \rho) = 0$ has no closed-form solution
• Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
• For pinv = 1, solve numerically for ρ = 3.59
A path achieves least delay by using $\hat{N} = \log_{\rho} F$ stages
How sensitive is delay to using exactly the best number of stages?
[Figure: D(N)/D(N̂) versus N/N̂ for N/N̂ from 0.5 to 2.0; annotated values 1.15, 1.26, 1.51; markers (ρ = 2.4) and (ρ = 6)]
• ρ = 4 is reasonable
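The numerical solution for the best stage effort can be obtained with a few lines of bisection. This is one possible approach, not necessarily the one used to produce the slide; the function name, bracket [1, 20], and tolerance are assumptions, and the bracket is valid only while the residual is negative at the upper end (true for the small p_inv values considered here).

```python
from math import e, log

def best_stage_effort(p_inv, lo=1.0, hi=20.0, tol=1e-9):
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection."""
    def residual(rho):
        return p_inv + rho * (1.0 - log(rho))
    # residual is positive at rho = 1 and decreases for rho > 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(best_stage_effort(0.0), 3))  # no parasitics: approaches e
print(round(best_stage_effort(1.0), 2))  # p_inv = 1
```

The two calls correspond to the two cases listed above, pinv = 0 and pinv = 1.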
BEST NUMBER OF STAGES
How many stages should a path use?
• Minimizing number of stages is not always fastest
Example: drive 64-bit datapath with unit inverter
$$D = N F^{1/N} + P = N(64)^{1/N} + N$$
[Figure: initial driver of size 1 driving a datapath load of 64 through 1, 2, 3, or 4 inverter stages, with stage sizes 1; 1, 8; 1, 4, 16; 1, 2.8, 8, 23]
N    f     D
1    64    65
2    8     18
3    4     15   (fastest)
4    2.8   15.3
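The table for the 64-bit datapath example can be regenerated from the delay formula. A minimal sketch, assuming an all-inverter path with p_inv = 1 per stage as in the example; the function name and defaults are illustrative.

```python
def path_delay(n_stages, path_effort=64, p_inv=1):
    """Return (stage effort f, path delay D) for an N-stage inverter chain."""
    stage_effort = path_effort ** (1.0 / n_stages)       # f = F^(1/N)
    delay = n_stages * stage_effort + n_stages * p_inv   # D = N*f + N*p_inv
    return stage_effort, delay

for n in range(1, 5):
    f, d = path_delay(n)
    print(f"N={n}  f={f:.1f}  D={d:.1f}")

best_n = min(range(1, 5), key=lambda n: path_delay(n)[1])
print("fastest:", best_n)
```

Extending the range beyond 4 shows the delay rising again, which is why a few more stages than the minimum can help but many more do not.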
REVIEW OF DEFINITIONS
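Term               Stage                                                                Path
number of stages   1                                                                    N
logical effort     g                                                                    $G = \prod g_i$
electrical effort  $h = \frac{C_{out}}{C_{in}}$                                         $H = \frac{C_{\text{out-path}}}{C_{\text{in-path}}}$
branching effort   $b = \frac{C_{\text{on-path}} + C_{\text{off-path}}}{C_{\text{on-path}}}$   $B = \prod b_i$
effort             $f = gh$                                                             $F = GBH$
effort delay       $f$                                                                  $D_F = \sum f_i$
parasitic delay    $p$                                                                  $P = \sum p_i$
delay              $d = f + p$                                                          $D = \sum d_i = D_F + P$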
METHOD OF LOGICAL EFFORT
1. Compute path effort: $F = GBH$
2. Estimate best number of stages: $N = \log_4 F$
3. Sketch path with N stages
4. Estimate least delay: $D = N F^{1/N} + P$
5. Determine best stage effort: $\hat{f} = F^{1/N}$
6. Find gate sizes: $C_{in_i} = \frac{g_i C_{out_i}}{\hat{f}}$
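The steps above can be sketched as two small helpers: one for the stage-count estimate and one that works backward from the load to find each stage's input capacitance. The function names and list-based interface are assumptions; the sample path reuses the four-stage example (input 10, load 20, g = 1, 5/3, 4/3, 1, no branching) purely as an illustration.

```python
from math import log, prod

def estimated_stage_count(path_effort):
    """Best number of stages, N = log4(F), rounded to a whole number of stages."""
    return max(1, round(log(path_effort, 4)))

def size_path(g, b, c_load, c_in):
    """Return (F, f_hat, input capacitances) for stages with efforts g and branching b."""
    n = len(g)
    path_effort = prod(g) * prod(b) * (c_load / c_in)   # F = GBH
    f_hat = path_effort ** (1.0 / n)                     # best stage effort
    sizes = []
    c_out = c_load
    for g_i, b_i in zip(reversed(g), reversed(b)):
        c_in_i = g_i * (b_i * c_out) / f_hat             # C_in = g * C_out / f_hat
        sizes.append(c_in_i)
        c_out = c_in_i
    return path_effort, f_hat, sizes[::-1]

F, f_hat, sizes = size_path([1, 5 / 3, 4 / 3, 1], [1, 1, 1, 1], c_load=20, c_in=10)
print(F, f_hat, sizes)
print(estimated_stage_count(64))
```

The first entry of `sizes` should come back equal to the given input capacitance, which serves as a sanity check on the backward pass.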
LIMITS OF LOGICAL EFFORT
Chicken and egg problem
• Need path to compute G
• But don't know number of stages without G
Simplistic delay model
• Neglects input rise time effects
Interconnect
• Iteration required in designs with wire
Maximum speed only
• Not minimum area/power for constrained delay
SUMMARY
Logical effort is useful for thinking of delay in circuits
• Numeric logical effort characterizes gates
• NANDs are faster than NORs in CMOS
• Paths are fastest when effort delays are ~4
• Path delay is weakly sensitive to stages, sizes
• But using fewer stages doesn't mean faster paths
• Delay of path is about log4 F FO4 inverter delays
• Inverters and NAND2 best for driving large caps
Provides language for discussing fast circuits
• But requires practice to master