1
EE241 - Spring 2005
Advanced Digital Integrated Circuits
Lecture 7: Logic Families for Performance
2
Admin
Homework is due on Wednesday. A new assignment is on its way.
You will get feedback on the projects within a week.
3
Logical Effort: Summary
Term                          Stage            Path
Logical effort                g                G = ∏ g_i
Electrical effort (fanout)    f = Cout/Cin     F = Cout/Cin
Branching effort              n/a              B = ∏ b_i
Effort                        h = fg           H = FGB
Effort delay                  h                D_H = ∑ h_i
Number of stages              1                N
Intrinsic delay               p                P = ∑ p_i
Delay                         d = h + p        D = D_H + P
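As a worked illustration of the summary, the sketch below computes the path effort H = FGB and the delay D = D_H + P for a hypothetical three-stage path, assuming every stage carries the same effort h = H^(1/N). The gate values (g, p) are the usual textbook numbers for a 2:1 PMOS/NMOS ratio; the path, the load ratio, and the function names are assumptions made for this example, not part of the lecture.

```python
from math import prod

# (g, p) per gate type, in units of the inverter delay. Textbook values for a
# 2:1 PMOS/NMOS width ratio (an assumption, not taken from the slides).
INV = (1.0, 1.0)
NAND2 = (4.0 / 3.0, 2.0)
NOR2 = (5.0 / 3.0, 2.0)


def path_effort(stages, branching, c_out, c_in):
    """Path effort H = F * G * B (slide notation: f is fanout, h is effort)."""
    G = prod(g for g, _ in stages)
    B = prod(branching)
    F = c_out / c_in
    return F * G * B


def min_path_delay(stages, branching, c_out, c_in):
    """D = D_H + P when all N stages carry the same effort h = H**(1/N)."""
    N = len(stages)
    H = path_effort(stages, branching, c_out, c_in)
    h = H ** (1.0 / N)
    D_H = N * h
    P = sum(p for _, p in stages)
    return D_H + P


# Hypothetical path: NAND2 -> NOR2 -> inverter, no branching, Cout/Cin = 9.
stages = [NAND2, NOR2, INV]
H = path_effort(stages, [1, 1, 1], c_out=9.0, c_in=1.0)
D = min_path_delay(stages, [1, 1, 1], c_out=9.0, c_in=1.0)
print(f"H = {H:.2f}, D = {D:.2f} inverter delays")
```

Here G = (4/3)(5/3)(1) = 20/9 and F = 9, so H = 20; the delay is then 3 times the cube root of 20, plus P = 5.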
4
Increasing Performance
Scaling technology
Circuit/logic level:
1. Logic optimizations
2. Transistor sizing, buffering
3. Wire optimization, repeaters
4. Supply voltage
5. Threshold voltage
6. Logic styles
7. Timing, latches
Microarchitecture level
5
Design Techniques
Performance does not come for free
[Figure: performance versus design effort for the design styles below]
ASIC/RTL
‘Enhanced’ ASIC
Structured ASIC
Custom design
Dynamic custom
6
RTL Design Flow
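[Figure: RTL design flow. HDL, RTL synthesis, netlist, logic optimization (using a library), netlist, physical design, layout; module generators and manual design are shown as additional inputs. Example netlists with signals a, b, s, d, clk, q are shown before and after optimization.]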
[from K. Keutzer]
7
RTL/ASIC Design
Design description in Verilog/VHDL RTL
Synthesized logic
- Standard cells
- Pre-defined macros
Static timing verification, pre- and post-layout
- Statistical vs. extracted wire loads
Physical design
- Top-level floorplan
- Automatic place and route
- Clock tree synthesized
Post-layout optimization, verification
8
Logic Optimization
Perform a variety of transformations and optimizations
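- Structural graph transformations
- Boolean transformations
- Mapping into a physical library
smaller, faster, less power
[Figure: logic optimization takes a netlist and a library and produces an optimized netlist; example netlists with signals a, b, s, d, clk, q shown before and after]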
[from K. Keutzer]
9
Combinational Logic Optimization
Input:
• Initial Boolean network
• Timing characterization for the module
- input arrival times and drive factors
- output loading factors
• Optimization goals
- output required times
• Target library description
Output:
• Minimum-area netlist of library gates which meets timing constraints
A very difficult optimization problem! [from K. Keutzer]
10
Logic Optimization
[Figure: logic optimization (netlist and library in, netlist out) split into a technology-independent part (2-level and multilevel logic optimization, generic library) and a technology-dependent part (real library)]
[from K. Keutzer]
11
Modern Approach to Logic Optimization
Divide logic optimization into two subproblems:
• Technology-independent optimization
- determine overall logic structure
- estimate costs (mostly) independent of technology
- simplified cost modeling
• Technology-dependent optimization (technology mapping)
- binding onto the gates in the library
- detailed technology-specific cost model
Orchestration of various optimization/transformation techniques for each subproblem
12
Logic Level Optimizations
[Figure: logic between registers (R), annotated with the logic depth]
Techniques: Restructuring, pipelining, retiming, technology mapping
Well covered by today’s logic and sequential synthesis
13
Logic Optimizations (2)
Technique: Removal of common sub-expressions
Start from tree structure/output
Fanout
Tp = O(FO); fanout also affects wiring capacitance
Late arriving
14
Technology mapping
[Figure: propagation delay tp (ns, 0 to 4) versus fan-in (1 to 9), with curves for tpHL, tp and tpLH and linear and quadratic trends marked]
AVOID LARGE FAN-IN GATES! (Typically FI < 4)
Tp = O(FI^2)!
Observation: only true if the fan-in translates into series devices (e.g. NAND pull-down, NOR pull-up); otherwise the dependence is linear.
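The quadratic dependence on fan-in can be seen from a first-order Elmore model of a series transistor stack, sketched here. The unit resistance and capacitance values are arbitrary assumptions chosen only to show the trend, and the model ignores the growth of the output capacitance with fan-in.

```python
def stack_elmore_delay(fan_in, r_on=1.0, c_int=1.0, c_load=2.0):
    """Elmore delay of `fan_in` series transistors discharging the output.

    r_on, c_int and c_load are illustrative unit values (assumptions).
    """
    # Internal node k (counted from ground) sees k on-resistances to ground.
    internal = sum(k * r_on * c_int for k in range(1, fan_in))
    # The output node discharges through the whole stack.
    output = fan_in * r_on * c_load
    return 0.69 * (internal + output)


delays = {n: stack_elmore_delay(n) for n in (1, 2, 4, 8)}
for n, t in delays.items():
    print(f"fan-in {n}: delay {t:.2f} (arbitrary units)")
```

The internal-node term sums to r_on * c_int * n(n-1)/2, which is what makes the delay grow quadratically once the stack gets deep.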
15
Technology Mapping for Performance
Alternative coverings
Use low FI modules on critical path(s)
Library composition?
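To make the comparison of alternative coverings concrete, the sketch below estimates the delay of three ways of building an 8-input AND using logical effort. The gate models (g = (n+2)/3 for an n-input NAND, g = (2n+1)/3 for an n-input NOR, p = n) are the standard textbook estimates for a 2:1 PMOS/NMOS ratio; the covering names and the chosen load ratios are assumptions for this example.

```python
from math import prod


def nand(n):
    """(g, p) of an n-input NAND, textbook estimate."""
    return ((n + 2) / 3.0, float(n))


def nor(n):
    """(g, p) of an n-input NOR, textbook estimate."""
    return ((2 * n + 1) / 3.0, float(n))


INV = (1.0, 1.0)

# Three hypothetical coverings of an 8-input AND function.
COVERINGS = {
    "nand8_inv": [nand(8), INV],
    "nand4_nor2": [nand(4), nor(2)],
    "nand2_nor2_nand2_inv": [nand(2), nor(2), nand(2), INV],
}


def delay(stages, fanout):
    """Minimum path delay N * (G * F)**(1/N) + P, with no branching."""
    n = len(stages)
    G = prod(g for g, _ in stages)
    P = sum(p for _, p in stages)
    return n * (G * fanout) ** (1.0 / n) + P


def fastest(fanout):
    """Name of the covering with the smallest estimated delay."""
    return min(COVERINGS, key=lambda name: delay(COVERINGS[name], fanout))


for F in (1.0, 12.0):
    print(F, fastest(F))
```

Under these assumptions the single 8-input NAND is never the fastest option, and the deeper tree of 2-input gates wins once the electrical effort is large.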
16
CMOS Logic Styles
CMOS tradeoffs:
- Speed
- Power (energy)
- Area
Design tradeoffs:
- Robustness, scalability
- Design time
Many styles: don’t try to remember the names, remember the principles
Changing the logic style: can it be done without breaking the synthesis flow?
17
CMOS Logic Styles
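Complementary
[Figure: pull-up network (PUN) and pull-down network (PDN) between VDD and GND, both driven by inputs A, B, C, with output OUT]
- robust
- scales
- large and slow
Pass Transistor Logic
[Figure: logic network with inputs A, B, C and output OUT]
- simple and fast
- not always very efficient
- versatile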
18
CMOS Logic Styles
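Ratioed Logic
[Figure: load between VDD and OUT, PDN with inputs A, B, C between OUT and GND]
- small & fast
- static power
- RPDN << RLOAD
Dynamic Logic
[Figure: PDN with inputs In1, In2, In3, clocked by φ, between VDD and GND, with output Out and load capacitance CL]
- small & fastest!
- noise issues
- scales?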
19
Others
Current-mode logic
Adiabatic logic
20
Pulsed Static CMOS
RH – Reset high
RL – Reset low
Fast pull-up, fast pull-down
Chen, Ditlow, US Pat. 5,495,188 Feb. 1996.
21
PS-CMOS
Evaluation and reset waves: reset is 1.5x slower
22
PS-CMOS
Advantages:
No dynamic nodes – good noise immunity
Reset delay slower than evaluation
No data dependent delay (worst case gets better)
No false transitions
Disadvantages
Width of reset wave limits logic depth
Margin in design
23
Skewing Gates
Different rising and falling delays
[Figure: gate with transistor widths W and W]
LE =
24
Skewing Gates
[Figure: gate with transistor widths 4W and W]
LE =
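One common way to quantify a skewed gate is to compute a separate logical effort for each transition, by comparing its input capacitance with that of an unskewed inverter that has the same drive for that transition. The sketch below does this for an inverter. It assumes a PMOS/NMOS mobility ratio of 2 and assumes that the wider (4W) device is the PMOS; neither assumption is stated in the slide text, and the function name is hypothetical.

```python
def skewed_inverter_effort(w_p, w_n, mobility_ratio=2.0):
    """Return (g_up, g_down) for an inverter with PMOS width w_p, NMOS width w_n.

    mobility_ratio is the assumed NMOS/PMOS drive ratio per unit width.
    """
    c_in = w_p + w_n
    # Unskewed inverter with the same pull-up drive: PMOS w_p, NMOS w_p / ratio.
    g_up = c_in / (w_p + w_p / mobility_ratio)
    # Unskewed inverter with the same pull-down drive: NMOS w_n, PMOS ratio * w_n.
    g_down = c_in / (w_n + mobility_ratio * w_n)
    return g_up, g_down


# Assumed example: PMOS = 4W, NMOS = W (W normalized to 1).
g_up, g_down = skewed_inverter_effort(w_p=4.0, w_n=1.0)
print(f"rising: {g_up:.3f}, falling: {g_down:.3f}")
```

Under these assumptions the favored (rising) transition has a logical effort below 1 and the other transition pays for it, which is the different rising and falling delay that skewing exploits.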
25
Ratioed Logic
[Figure: three ratioed gates, each with a PDN (inputs In1, In2, In3) between output F and VSS and a load to VDD: (a) resistive load RL, (b) depletion-load NMOS (VT < 0), (c) pseudo-NMOS with PMOS load]
Goal: to reduce the number of devices over complementary CMOS
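For the resistive-load case (a), the output levels follow from a simple voltage divider, which is what makes the style ratioed. Treating the conducting PDN as an equivalent resistance R_PDN (a simplifying assumption), the levels are:

```latex
V_{OH} = V_{DD}, \qquad
V_{OL} = \frac{R_{PDN}}{R_{PDN} + R_L}\, V_{DD}
```

A low V_OL therefore requires R_PDN to be much smaller than R_L, while a large R_L slows the pull-up of the output.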
26
Pseudo-NMOS
[Figure: pseudo-NMOS gate (PMOS load to VDD, PDN with inputs In1, In2, In3, output F) and its voltage transfer characteristic, Vout (V) versus Vin (V), for W/Lp = 4, 2, 1, 0.5 and 0.25]
Trade-off between performance and power + noise margins
27
Differential Logic
28
Differential Logic
Differential Cascode Voltage Switch (DCVS)
Differential Split-Level Logic (DSL)
Regenerative Push-Pull Cascode Logic (PPCL)
Pass transistor logic families
Dynamic logic families
29
Differential Logic
+ implicit invert, higher logic density
30
Cascode Voltage Switch Logic
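[Figure: two pull-down networks, PDN1 and PDN2, between the outputs and VSS, driven by inputs A, B and their complements, with load transistors M1 and M2 to VDD producing Out and its complement]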
Cascode Voltage Switch Logic (CVSL)
Sometimes called Differential Cascode Voltage Switch Logic (DCVSL)
31
CVSL
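[Figure: CVSL gate example (inputs A, B and their complements; transistors M1 to M4; outputs Out and its complement) and its transient response, voltage (V) versus time (0 to 1.0 ns), with the level VDD - Vth marked]
- Fast (but hysteresis due to latch function)
- No static power dissipation
- BUT: large cross-over current!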
32
CVSL
Full adder design
How to design for reduced transistor count?
33
Karnaugh Map Technique
34
Karnaugh Map Technique
[Figure: Karnaugh map over x1 and x2x3, and the resulting differential trees with LOAD, inputs x1, x2, x3 and their complements, and outputs Q and its complement]
Build shared cubes first!
Add other cubes next
35
Example
Q = x1x2x3x4 + x1(x2+x3+x4)
36
Push-Pull Cascode Logic
Gieseke et al, U.S. Patent 5,023,480 June 1991.
37
DSL Differential Split-Level Logic
38
Simulation Results for Different Adders
39
Pass-Transistor Logic
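[Figure: inputs driving a switch network that produces Out and its complement; example gate built from NMOS pass transistors with inputs A, B and the constant 0, F = AB]
• N transistors
• No static consumption
• Transistor implementation using NMOS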
40
Pass-Transistor Logic
Performance of PTL:
- Advantage over CMOS in implementing XOR, MUX
- Disadvantage in implementing AND, OR
Datapaths, arithmetic circuits are examples of use:
- Adders and multipliers use XOR, MUX
- Advantage of complementary implementation
Comparisons:
- When a new logic family is introduced, the examples are chosen to show its advantages (not its disadvantages)
- Comparison papers sometimes point to the disadvantages
Full-custom design
41
Examples of PTL Styles
Complementary Pass-Transistor Logic
- NMOS-only pass-transistor network
Transmission-gate logic
- NMOS+PMOS pass gates
Double Pass-Transistor Logic
- NMOS+PMOS network
Numerous other logic families
42
NMOS-only switch
[Figure: NMOS pass transistor Mn with A = 2.5 V and C = 2.5 V driving node B (load CL), followed by an inverter (M1, M2); transient response of In, x and Out, voltage (V, 0 to 3) versus time (0 to 2 ns)]
Threshold voltage loss causes static power consumption
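The size of the threshold loss can be estimated by solving Vx = VDD - VTn(Vx), where the threshold itself rises with Vx through the body effect. The sketch below does this by fixed-point iteration. The device parameters (VT0, gamma, 2*phi_F) are typical textbook values for a 0.25 µm process and are assumptions here, as is the function name.

```python
from math import sqrt


def nmos_pass_high_level(vdd=2.5, vt0=0.43, gamma=0.4, two_phi_f=0.6,
                         iterations=50):
    """Highest voltage an NMOS pass transistor can pass, body effect included.

    All default parameter values are assumed, not taken from the slides.
    """
    vx = vdd - vt0  # first guess: ignore the body effect
    for _ in range(iterations):
        # The source sits at vx, so the source-to-bulk voltage is vx.
        vtn = vt0 + gamma * (sqrt(two_phi_f + vx) - sqrt(two_phi_f))
        vx = vdd - vtn
    return vx


vx = nmos_pass_high_level()
print(f"node x settles near {vx:.2f} V instead of 2.5 V")
```

With these numbers the internal node stops well below VDD, so the PMOS of the following inverter is not fully turned off, which is the source of the static power consumption noted on the slide.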
43
Solutions
Transmission gates – adding complexity
Low-threshold switches – leakage!
Level-restoration
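[Figure: level restorer. Pass transistor Mn (inputs A, B) drives node X; an inverter (M1, M2) produces Out; level-restoring transistor Mr connects X to VDD]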
44
Single-Ended Level Restoring
[Figure: single-ended level restoring, showing the input, output, feedback inverter, output inverter, and level-restoration transistor]
45
Differential Level Restoring
[Figure: differential NMOS logic tree with inputs and complementary outputs f and its complement]
Different level restoration leads to different logic families
46
Different Restoration Schemes
Swing-Restored Pass-Transistor Logic
[Figure: differential NMOS logic tree with inputs and complementary outputs f and its complement]
Parameswar et al., CICC’94, JSSC 6/96
47
Other Level-Restoring Schemes
Energy Economized Pass-Transistor Logic
DCVS with Pass Gates (DCVS-PG)
[Figure: for each scheme, a differential NMOS logic tree with inputs and complementary outputs f and its complement]
48
Pass-Transistor Logic Families
49
Complementary Pass-Transistor Logic (CPL)
[Figure: two pass-transistor networks driven by A, B and their complements, producing F and its complement]
• Complementary functions
• Reduced number of logic levels
• Fewer transistors than CMOS
• Fast: reduced load
• Complementary inputs, complementary outputs
• VT drop: several solutions
50
CPL
Level restoration
Yano et al, CICC’89, JSSC 4/90
51
CPL
Same topology of networks
Just different signal arrangements
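The point about one topology with different signal arrangements can be illustrated with a purely logical model: each CPL output is a pair of NMOS pass transistors, one gated by a signal and one by its complement, so it behaves like a 2:1 multiplexer. The sketch below is such a model; the function names are hypothetical, and it ignores electrical effects such as the VT drop.

```python
def cpl_gate(sel, pass_when_1, pass_when_0):
    """Two pass transistors: one gated by sel, the other by its complement."""
    return pass_when_1 if sel else pass_when_0


def cpl_and(a, b):
    """Return (AND, NAND) using the same network with different inputs."""
    return cpl_gate(b, a, b), cpl_gate(b, 1 - a, 1 - b)


def cpl_or(a, b):
    """Return (OR, NOR)."""
    return cpl_gate(b, b, a), cpl_gate(b, 1 - b, 1 - a)


def cpl_xor(a, b):
    """Return (XOR, XNOR)."""
    return cpl_gate(b, 1 - a, a), cpl_gate(b, a, 1 - a)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, cpl_and(a, b), cpl_or(a, b), cpl_xor(a, b))
```

All three functions reuse `cpl_gate` unchanged; only the signals wired to the pass inputs differ.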
52
Complementary Pass-Transistor Logic (CPL)
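[Figure: CPL circuits built from an nFET logic network, labeled (a) XOR and (b) Sum, with inputs A, B, C and their complements, outputs S, Q and their complements, and internal nodes n1 to n4]
- Fast
- VT drop
- Efficient implementation of arithmetic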
53
CPL Karnaugh Maps
[Figure: Karnaugh map of A·B over A and B, with cubes C1 and C2, and the corresponding CPL pass-transistor networks]
54
CPL vs. CMOS
55
Skewing Output Inverter
56
Differential vs. Single-Ended
57
Leap Cell Library
Yano et al, CICC’94, JSSC 6/96
Goal: Implement full logic functionality with a small library
Rely on automated design methodology
58
Various Logic Functions of the Leap Library
59
LEAP Comparison
60
Double Pass-Transistor Logic (DPL)
[Figure: DPL gates with inputs A, B and their complements, a VDD connection, and outputs O and its complement: AND/NAND and XOR/XNOR]
61
Designing DPL Gates
[Figure: Karnaugh map of A·B over A and B, with cubes C1 to C4, and the corresponding DPL gate]
62
Designing DPL Gates (2)
[Figure: Karnaugh maps of A ⊕ B over A and B, with two alternative groupings of cubes C1 to C4, and the corresponding DPL gates]
63
Applications of DPL
1.5ns 32-bit ALU in 0.25µm CMOS
Full adder:
Suzuki, ISSCC’93, JSSC 11/93
64
Comparison of Logic Styles
Zimmermann, Fichtner, JSSC 7/97
65
Comparison of Logic Styles
66
Comparison of Logic Styles
67
Results
68
Results
69
Results