Upload
adam-grant
View
218
Download
1
Tags:
Embed Size (px)
Citation preview
Full Chip Analysis
Chung-Kuan Cheng
Computer Science and Engineering Department
University of California, San Diego
La Jolla, CA 92093-0114
2
Outlines
I. IntroductionII. Circuit Level AnalysisIII. Logic Level Analysis
I. Timing AnalysisII. Functional Analysis
IV. Mixed Signal AnalysisV. Research DirectionsVI. Conclusion
3
I. Introduction
1. Trends of On-Chip Technologies
2. Statistics About Design Flaws
3. Spectrum of Analysis
4
I.1 Trends of On-Chip Technologies
System: Huge Numbers of Devices and Wires
Power/Ground Distribution: Low Voltage, High Current
Wires: Lateral Coupling, Fragmented ParasiticsDevices: Modeling, NoiseMixed Signal Design: RF+Analog+Digital
5
6
7
Power/Ground Distribution (ITRS)
2002 2003 2004 2005
Supply
Voltage(V)
1.5 1.5 1.2 1.2
Max
Power
130 140 150 160
On-Chip
Freq(MHz)
1,600 1,724 1,857 2,000
Off-Chip
Freq(MHz)
885 932 982 1,035
Lower V margin: Higher I & Inductance x Freq.
8
Static Vs. Dynamic Voltage Drop
Current envelope Dynamic - peak current
nT (n+1)T (n+2)T
Static- average current
Wire sizing can be used to control static drop Precise de-cap insertion filters peak current spikes Courtesy of Apache
9
Flip Chip Dynamic Effects
F:MHz
IRStatic vs.Dynamic
Ri(t) Ldi/dt
9.9 mV
17.3 mV
1.2 mV
16.2 mV
29.3 mV
12.3 mV
19.3 mV
36 mV
28.5 mV
22 mV
38.8 mV
41.6 mV
0.5%1.02%
1.08%2.77%
1.6%5.37%
2.2%8.0%
DynamicTotal18.5 mV
1.8
1.5
1.2
1.0
41.6 mV
64.5 mV
80.4 mV
250
500
750
1,000
Static
Dynamic
Vdd
Courtesy of Apache
10
Wire-bond Dynamic Effects
IR R i(t) Ldi/dt
103 mV
147 mV
3 mV
181 mV
275 mV
13 mV
200 mV
276 mV
75 mV
5.7% 8.3%
12% 19.2%
16.6%29.2%
DynamicTotal
150 mV
1.8
1.5
1.2
288 mV
351 mV
133
250
400
Static
Dynamic
Static vs.Dynamic
F:MHz Volt
Courtesy of Apache
11
12
Increasing System Complexity
Courtesy of Mentor
RF front end
DSP
Memory
Complexconverters
13
I.2 Statistics about Design Flaws Percent of Total Flaws Fixed in IC/ASIC Designs Having Two or More Silicon Spins
2%
3%
4%
4%
5%
5%
7%
12%
13%
47%
0% 10% 20% 30% 40% 50%
Other flaws
IR Drops
Mixed-Signal Interface
Power
Race Condition
Clocking
Yield
Noise
Slow Path
Logical or Functional
Collett Intl. 2000 Survey
4%
13%
17%
17%
20%
21%
23%
25%
28%
29%
35%
67%
0% 10% 20% 30% 40% 50% 60% 70% 80%
Other flaws
Firmware
Power
Race Condition
IR Drops
Mixed-Signal Interface
Yield
Clocking
Slow Path
Noise
Analog Circuit
Logical or Functional
Collett Intl. 2001 Survey
Logical or Functional Analog Noise Slow Path Mixed-signal interface Clock, Power/Ground Firmware
Logical or Functional Slow Path Noise
14
I.3 Spectrum of Analysis
Device
Circuit Logic System
Electrical Behavior Timing Switches Gate RTL
Physics EngineeringEE CS
Circuit theoryAlgorithmsDatabaseProgramming
Extraction
PrecisionMath DiscreteComplex, Real
15
I.3 Spectrum of Analysis(flow)
System
Software Hardware
Floorplan
Logic
Layout
Library
IP blocks
Analog
Architect
Chip
FunctionTimingCircuit Anal.
emulation
powerclock
critical paths
function
characterization
freq
power noise
global wirescross talk
mixed signal
16
I.3 Spectrum of Analysis(coverage)
Circuit Size
Coverage
Circuit Analysis
Logic: Static Timing (sign off)
Logic: FunctionalMixed Signal
Spice Hspice
ASX EldoApache
NassdaIota
Cadence
Synopsys
Mentor
Celestry
Cadence
Synopsys
Mentor
Axis
IBM
Aptix
CadenceSynopsys Mentor
Mentor
Cadence
17
I.3 Spectrum of Analysis(trend)
Layout Dominated Analysis Power/Ground, Clock Wires Pre-layout, Post-layout
Layout Oriented Analysis EE + CS
EE=> CS High Complexity CS=>EE Deep Submicron Effect Accuracy and Efficiency
18
II. Circuit Level Analysis
1. Circuit Analysis Advancement
2. Circuit Analysis Techniques
3. Examples
4. Tasks
19
1st Generation (SPICE) 2nd Generation (Fast SPICE) Next Generation (HSIM)
Circuit Size
Memory Usage
CPU Time
512M Bytes
Circuit Size
2 hrs
300M elements
Circuit Size
Memory Usage
CPU Time
512M Bytes
Circuit Size
2 hrs
300M elements
Circuit Size
Circuit Size
Memory Usage
CPU Time
1G Bytes
2M elements
20 hrs
Circuit Size
Circuit Size
Memory Usage
CPU Time
Bytes
20 hrs
2M elements
Circuit Size
Memory Usage
CPU Time
100M
Bytes
100K elements
100 hrs
Circuit Size
Circuit Size
Memory Usage
CPU Time
100 hrs
Circuit Size
100K elements
II.1 Circuit Analysis Advancement
Courtesy of Nassda
20
II.2 Circuit Analysis Techniques
Memory: Hierarchical Database Circuit Size: Parasitic Reduction Device Complexity: Table Model Simulation:
Backward Euler, Trapezoidal Integration Hierarchical Flow Event Driven (ignoring miller effect) Mixed Rate, Multiple step sizes (partition)
21
Circuit Type
(#MOS, #R,#C,#L)
Total
Elements
Memory
Usage
CPU
Time (hrs)
Memory A
(159M, 159M, 155M,0)
473M 775MB 1.65
Memory B
(3.1M, 5.4M, 4.5M, 88)
13M 195MB 0.69
D/A
(9K,65K,47K,0)
121K 42MB 1.11
PLL
(2K, 8K, 23K, 0)
51K 15MB 0.21
Analog
(119K, 175K, 232K,0)
525K 111MB 0.37
II.3 Examples (HSIM)
Courtesy of Nassda
22
II.4. Tasks
Convergence
Matrix SolverIntegration Partition
Hierarchy DatabaseInput Patterns
EE CSSpeed
Accuracy
Circuit Red.Device Mod.
Event DrivenHierarchical Flow
Math
23
III. Logic Level Analysis
1.Separation of Timing and Function
2.Static Timing Analysis
Algorithms, Gate Models, Path, Cross Talks
3.Functional Analysis
Event Driven, Cycle Based
4.Tasks
24
III.1 Separation of Timing and Function
FunctionalAnalysis
Simple timingmodel
Timing Analysis
Input vectordriven
Inputindependent
Slew, RC treecross talk
Function +Timing
High Complexity!
25
III.2 Static Timing Analysis
i. Algor.: Shortest and Longest Paths Search
ii. Gate Model: • Logic: Unate, Binate Signal Propagations• Timing: functions of Input Slope and Output Load
iii. Path Model:• Logic: False Path, Multiple Cycle Path, Cycles of
Combinational Logic, Multiple Clock Frequencies• Timing: RC Tree
iv. Cross Talks: Timing Window, ATPG
v. Tasks
26
III.2.i Algor.: Path Search
ArrivalTime,Slew Rate
Required Arrival Time
Static Timing Analysis: Worst Case Analysis, Independent of Input Patterns
0->1 slew rate window1->0 slew rate window
0->1 arrival time window1->0 arrival time window
PI1
PI2
PI3
A
G
F
B
H
E
C
D
J PO
Longest &Shortest Paths
27
III.2.I Algor: Path Search(cont)
0/0
0/0
0/0
1/2PI1
PI2
PI3
A
G
F
B
H
E
C
D
J PO3
21
2 2
1
1
2
3 21
2
32
23/3
2/4min/max
28
III.2.I Algor: Path Search(cont)
aminj, amaxj
amini,amaxidji
amini=minj aminj+dji
amaxi=maxj amaxj+dji
29
III.2i Algor.: Path Search
PI1
PI2
PI3
A
G
F
B
H
E
C
D
J PO3
21
2 2
1
1
2
3 21
2
32
2
0/0
0/0
0/0
1/2
3/3
2/4
4/5 6/7
4/4
4/6 6/8
6/10 8/12
min/maxLongest: PI2,G,F,E,D,J,POShortest: PI2,G,H,J,PO
30
III.2.ii Gate Logic Model: Unate & Binate Signals
a y
a y
a yNAND
Unateness: a 0->1 => y 1->0
XNOR
Binateness: a 0->1 => y 0->1 & 1->0
BDD
Check unateness based on BDD
31
III.2.ii Gate Timing Model
Slew rate of y Delay of y
Slew rate of a Slew rate of a
CeffCeff
a y
Ceff
Interconnect
32
III.2.iii Path Logic Model: False Path
4 bit adder 4 bit adder c0
c4c8
p[0,3]
y
Carry skip adderz
101101001111
C0=11
P[0:3]=110000 Z=1
+
33
III.2.iii Path Logic Model: False Path
False path: c0->y->c4 ->c8
Assumption: z->c4 ->c8 derives results faster
If we erase all false paths, we can identify the true critical paths and the corresponding input patterns
4 bit adder 4 bit adder c0
c4c8
p[0,3]
y
Carry skip adderz
34
III.2.iii Path Logic Model: False Path
False path b->c->d->e
3
10
410
2
a
b
c d
e
f
3,10 7,14
17,24
16
red+red=>redred+blue=>blueblue +blue=>blue
35
III.2.iv Cross Talk
WCN: worst-case noise: Delay & GlitchNoise with maximum
pulse height Fixed circuit structure
and parameters Fixed transition time of
input signals Variable arrival time of
input signals
WCN
36
Aggressor / Victim Input Victim Output
III.2.iv Cross Talk: Timing Window
P1
P2
P3
P4
P5
Aligned arrival time Skewed peak noise
A1
A2
A3
A4
V0
37 Aggressor Alignment WITHOUT Timing Constraints
III.2.iv Cross Talk: Timing Window
Skewed arrival time Aligned peak noise
Victim Output
P1
P2
P3
P4
P5
A1
A2
A3
A4
V0Sweep line
Aggressor / Victim Input
38 Aggressor Alignment WITH Timing Constraints
III.2.iv Cross Talk: Timing Window
Victim Output
A1
A2
A3
A4
V0
P1
P2
P5
Sweep Line
Aggressor / Victim Input
39
III.2.iv Cross Talk: Effective Timing Window
Timing window for aggressor input
Earliest arrival time
Timing window for victim output
Latest peak noise occurring time
aT bTLaT
RbT at bt
Lat
Rbt
maxV
iP
aT bTiA
Latest arrival time
at bt
iP
Earliest peak noise occurring time
40
Aggressor Alignment with Timing Constraints -- Reformulation
A1
A2
A3
A4
V0
(a) Original timing window (b) Shifted timing window
Old Sweep Line New Sweep Line
(c) Expanded timing window
New Sweep Line
41
III.2.v Tasks
Path model: special cases
ATPG
Path search in hierarchy
Gate model: power, noise
Path model: RCLK reduction
Cross talk
Timing window+pattern
EE CSSpeed
Accuracy
Math
42
III.3 Logic Level: Functional Analysis
i. Functional Analysis Techniques
ii. Event Driven Analysis
iii. Cycle Based Analysis
iv. Tasks
43
III.3.i Functional Analysis Techniques
Event Driven Simulation VCS, Verilog-XL, VSS, ModelSim
Cycle Based Simulation Frontline, Speedsim, Cyclone
Domain Specific Simulation SPW, COSSAP
44
III.3.ii Event Driven Analysis
Event Wheel Maintains schedules of events Enables sub-cycle timing
Advantages Timing accuracy Good Debug Capability Handles asynchronous
Disadvantages Performance
45
III.3.iii Cycle Based Analysis
RTL Description All gates evaluated every cycle Schedule is determined at compile time No timing No asynchronous feedback, latches Regression Phase High Performance High Capacity
46
III.3.iv Tasks
Pattern generation
Dynamic timing model
coverage
Hardware acceleration
EE CSSpeed
Accuracy
Math
47
IV. Mixed Signal Analysis
RF front-end
RF Simulator
Analog IP
VHDL-AMSVerilog-AMS
DSP
HDLVHDL/Verilog
Embedded
MemoriesHierarchical Fast SPICE
Complex
AnalogTraditional
SPICE
Custom Logic & Mixed-SignalFast SPICE
Single-Kernel simulator for full-chip SoC Verification
Courtesy of Mentor
48
IV. Mixed Signal Analysis: Interface
Digital AnalogD/A
A/D
D/A A/D
Analog Signal 0, 1, X
Threshold Detector
Rise, Fall TimeRise, Fall Resistance
49
Spice VHDL-AMS VHDL Verilog Verilog-AMS
VerilogSpice VHDLVHDL-AMS
VHDL-AMS Verilog-ASpice Spice Verilog-A
CVerilog VHDL
VHDL-AMS SpiceC Verilog-A
Mixed Signal: Mixed Languages
•Single Kernel Architecture•Single Netlist Hierarchy•Automatic D/A and A/D converter insertion
Courtesy of Mentor
50
IV. Tasks
EE CSSpeed
Accuracy
LanguageRF, Analog,
Power, Noise,
Convergence Partition
Interface Compiler
Math
51
IV. Research Directions
Hierarchy Management Analysis + Optimization Layout Oriented Analysis Circuit Reduction Spice
52
Physical objAlgor
Floorplan
Partition,Bus
LogicRep, buffer
PlaceWire-length,Congestion,
Block,alignment,match,size
RoutePattern,vias
AnalysisSignal integrity
Ir drop, didtmanufacturability
verification
CircuitComp,dynamic,
pass
Power, clockpackaging
View, hierarchy
Hierarchy Management
53
Hierarchy management
logic layout
Design process
algor
54
Hierarchy Management (cont.)
•Hierarchy Tree Construction•Hierarchy Tree Transformation•Incremental Changes•Graph Process on Tree Structure
55
Analysis and Optimization
i. Circuit Reductionii. Transient Analysisiii. Optimization of
• power/ground: pads, decoup caps, network
• clock networks: topology, shield, decoup caps
• Buses: shield, topology
56
•Huge Circuitry•Millions of nodes
•Whole Chip Analysis•Power/Ground, Substrate, Analog
•Guaranteed Accuracy •Accuracy vs Execution Time
•Construction or Incremental Changes
Layout Oriented Analysis
57
Layout Based Signal Analysis
•Generalized Y-Delta Transformation
•R,C,L,Coupling, Sources•Natural Frequency•Realizability•Hierarchical Circuit Analysis
58
2112 ggy
1 21g
2g
1 212y
Conductance in parallel
59
21
2112 gg
ggy
1 212y1 21g 2g0
Conductance in series
60
1
2 3
12y 13y
23y
321 ggg
ggy jiij
1g
2g 3g
1
2 3
0
Conductance in Y-structure
61
212 11
1
clsgls
g
csls
g
lsg
y
1
2 3
12y 13y
23y
e.g.
g
ls
1cs
1
2 3
0
Admittance in Y-structure
62
1
2 3
12y 13y
23y
22 11
1
clsgls
I
csls
g
IlsI
is the same , and
g
ls
1cs
1
2 3
I
0
44
2I
12y
Admittance in Y-structure, with current source
63
skm1
sk11
1
4
31
2
skm1
skm1
sk22
1
skm1
221
11
121
11
22
11
IVV
IVV
sksk
sksk
m
m
4
31
2
sk11
1
skm1
sk22
11V
2V
1I 2I
K-element
64
Reduction example
65
Waveform Estimation
Transient response evaluated using Y-Δ transformation with Hurwitz polynomial approximation.
8th order stabilized Y-Δ models are used for near-end and far-end node waveform evaluation.
Only a 3rd order AWE model is obtained.
66
Efficiency Comparison
Circuit type
Elements CPU time (s)
#R #L #C Stabilized Y-Δ *
Spice3f4 Efficiency
Tree-like 1035 1034 1001 0.34 3.94 11.58
16397 16394 14299 11.42 134.05 11.74
Mesh-like 1675 2439 733 5.22 73.19 14.02
8035 0 8038 2.07 25.95 12.54
66941 0 67119 41.25 1536.77 37.26
*15th order Y-Δ transformation is used.
67
Delay Accuracy Vs. Efficiency
Model order50% delay 90% delay
Delay Accuracy Delay Accuracy Efficiency
3rd 28.9 95% 33.4 95% 29.3
6th 27.9 99% 31.9 99.6% 27.9
3rd 49.5 96% 72.5 96% 14.7
6th 52.7 97% 71.5 97% 11.5
12th 51.2 99.8% 69.9 99.7% 10.1
RC*
RLC**
* 50% delay is 27.6ps, and 90% delay is 31.7ps.** 50% delay is 51.3ps, and 90% delay is 69.7ps.
.10,10,10 pHLfFCR
68
Observe Overshooting
Model Order * Overshooting ** Accuracy Efficiency
3rd 2.627 97.8% 11.1
6th 2.699 99.5% 10.7
12th 2.683 99.9% 10.3
* Mesh-like RLC circuit is tested, with ** Waveform will converge to DC 2.5v.
.10,10,10 pHLfFCR
69
Pole Analysis
Both AWE and Y-Δ transformation have artificial positive poles;
High order AWE tends to collapse approximate poles, hiding other less dominant ones.
Y-Δ transformation with model stabilization yields no positive poles, and has broader band in pole estimation.
70
V. Conclusion
Layout Oriented Analysis Unified tools combining EE and CS
with Math as foundation New Methodologies
Larger Circuits, Shorter Product Turnaround
71
References
M. Marek-Sadowska, UCSB L.T. Pileggi, CMU CK Cheng, et al, Interconnect Analysis and
Synthesis, John Wiley ACM/IEEE Design Automation Conf. IEEE/ACM Int. Conf. On CAD Apache, Nassda, Mentor, Synopsys,
Cadence, Celestry, IBM, and etc.