ECE 260B – CSE 241A Intro and ASIC Flow .1 http://vlsicad.ucsd.edu
ECE260B – CSE241A Winter 2005
Introduction and ASIC Flow
Instructor: Bao Liu
Website: http://vlsicad.ucsd.edu/courses/ece260b-w05
Slides courtesy of Prof. Andrew B. Kahng
Why not a Silicon Compiler?
[Figure: Spec/Matlab/VHDL → ? → Circuit on Silicon, with synthesis, placement, routing, and verification as the intermediate steps]
Ideal vs. reality:
- Ideal: silicon compiler; simple; no human interaction
- Reality: design methodology; complex; lots of human interaction
Teams in a Design Process
[Figure: the Spec/Matlab/VHDL → Circuit on Silicon flow (synthesis, placement, routing, verification), annotated with the teams involved: CAD developers, VLSI designers, process people, testing team]
Class Objectives
Learn about the ASIC implementation flow: Verilog → GDSII
- Semi-custom implementation of CMOS digital circuits, and optimization with respect to different constraints: area, speed, power, reliability, cost
Understand impact of constraints, tradeoffs, technology scaling
Get some feel for each phase of the implementation flow
Learn about building blocks: wires, gates, memories
Prepare for future design experiences
Get some feel for industry-standard design tools, libraries
- Will mostly use Cadence BuildGates and SOC Encounter, and Artisan TSMC 0.18/0.13um libraries
Synthesize small cores from RTL into GDSII
Outline
Introduction
Technology Evolution: Silicon Complexity, System Complexity
Design Flows: Traditional, State of the Art
- Design Metrics
- Design Closure
Technology Evolution: Cost and Integration Drivers
Moore’s Law is about cost
Increased integration, decreased cost → more possibilities for semiconductor-based products
Pentium 4 die shot: 2.2cm
Slide courtesy of Mary Jane Irwin, PSU
Sense of Scale (Scaling)
What fits on a VLSI chip today?
State-of-the-art logic chip
- 20 mm on a side (400 mm²)
- 0.13 µm drawn gate length
- 0.5 µm wire pitch
- 8-level metal
For comparison
32b RISC processor
- 8Kλ x 16Kλ
SRAM
- about 32λ x 32λ per bit
- 8Kλ x 16Kλ is 128Kb, 16KB
DRAM
- 8λ x 16λ per bit
- 8Kλ x 16Kλ is 1Mb, 128KB
[Figure: chip 20 mm on a side (40,000 wire pitches, 320,000λ), with 0.13 µm (2λ) gate length and 0.5 µm (8λ) wire pitch marked; 32b RISC processor and 64b FP processor blocks drawn to scale]
Slide courtesy of Ken Yang, UCLA
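The numbers on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes that the slide's length unit λ equals 1/8 of the 0.5 µm wire pitch (0.0625 µm), which is an interpretation of the figure labels; the function and variable names are illustrative only.

```python
# Sense-of-scale arithmetic for a 20 mm x 20 mm chip.
# Assumption: lambda is taken as 1/8 of the 0.5 um wire pitch.
CHIP_SIDE_UM = 20_000.0   # 20 mm expressed in micrometers
WIRE_PITCH_UM = 0.5
LAMBDA_UM = WIRE_PITCH_UM / 8
K = 1024


def bits_in_block(block_w, block_h, cell_w, cell_h):
    """Number of memory cells that tile a block; all sizes in lambda."""
    return (block_w // cell_w) * (block_h // cell_h)


wire_pitches_per_side = CHIP_SIDE_UM / WIRE_PITCH_UM
lambdas_per_side = CHIP_SIDE_UM / LAMBDA_UM

# Memory that fits in the footprint of the 32b RISC processor (8K x 16K lambda)
sram_bits = bits_in_block(8 * K, 16 * K, 32, 32)
dram_bits = bits_in_block(8 * K, 16 * K, 8, 16)

print("wire pitches per side:", wire_pitches_per_side)
print("lambda per side:", lambdas_per_side)
print("SRAM:", sram_bits // K, "Kb =", sram_bits // (8 * K), "KB")
print("DRAM:", dram_bits // K, "Kb =", dram_bits // (8 * K), "KB")
```

Under this assumption the script reproduces the slide's figures: the same area holds one 32b RISC core, 16KB of SRAM, or 128KB of DRAM.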
MOS Transistor Scaling (1974 to present)
S=0.7 [0.5x per 2 nodes]
Source: 2001 ITRS - Exec. Summary, ORTC Figure
[Figure: pitch definitions: poly pitch (typical MPU/ASIC) and metal pitch (typical DRAM)]
Decreased transistor/feature sizes
Increased variability (tox, BEOL, DFM, SEU, etc.)
Short channel effect, leakage power
HP / LOP / LSTP Device Roadmaps
Parameter      Type   99     01     03     05     07     10     13     16
Vdd            MPU    1.5    1.2    1.0    0.9    0.7    0.6    0.5    0.4
               LOP    1.3    1.2    1.1    1.0    0.9    0.8    0.7    0.6
               LSTP   1.3    1.2    1.2    1.2    1.1    1.0    0.9    0.9
Vth (V)        MPU    0.21   0.19   0.13   0.09   0.05   0.021  0.003  0.003
               LOP    0.34   0.34   0.36   0.33   0.29   0.29   0.25   0.22
               LSTP   0.51   0.51   0.53   0.54   0.52   0.49   0.45   0.45
Ion (uA/um)    MPU    1041   926    967    924    1091   1250   1492   1507
               LOP    636    600    600    600    700    700    800    900
               LSTP   300    300    400    400    500    500    600    800
CV/I (ps)      MPU    2.00   1.63   1.16   0.86   0.66   0.39   0.23   0.16
               LOP    3.50   2.55   2.02   1.58   1.14   0.85   0.56   0.35
               LSTP   4.21   4.61   2.96   2.51   1.81   1.43   0.91   0.57
Ioff (uA/um)   MPU    0.00   0.01   0.07   0.30   1.00   3      7      10
               LOP    1e-4   1e-4   1e-4   3e-4   7e-4   1e-3   3e-3   1e-2
               LSTP   1e-6   1e-6   1e-6   1e-6   1e-6   3e-6   7e-6   1e-5
SEMATECH Prototype BEOL stack, 2000
[Figure: cross-section of the BEOL stack. Wiring tiers: global (up to 5), intermediate (up to 4), local (2). Labeled elements: wire, via, passivation, dielectric, etch stop layer, dielectric capping layer, copper conductor with barrier/nucleation layer, pre-metal dielectric, tungsten contact plug]
Reverse-scaled global interconnects
Growing interconnect complexity
Performance critical global interconnects
Intel 130nm BEOL Stack
Intel 6LM 130nm process with vias shown (connecting layers)
Aspect ratio = thickness / minimum width
Interconnect Capacitance: Parallel Plate Model
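[Figure: wire of length L, width W, and thickness T, separated from the substrate by an SiO2 layer of height H]
ILD = interlevel dielectric
Bottom plate of cap can be another metal layer
$C_{int} = \varepsilon_{ox} \cdot (W \cdot L / t_{ox})$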
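A minimal numeric sketch of the parallel plate formula follows. The relative permittivity of SiO2 (3.9) and the vacuum permittivity are standard constants; the wire dimensions are hypothetical values chosen only for illustration, and the model ignores fringing and lateral capacitance.

```python
EPS0_F_PER_M = 8.854e-12   # vacuum permittivity
EPS_R_SIO2 = 3.9           # relative permittivity of SiO2


def parallel_plate_cap(width_m, length_m, t_ox_m, eps_r=EPS_R_SIO2):
    """Parallel plate capacitance in farads: eps_ox * (W * L / t_ox)."""
    eps_ox = eps_r * EPS0_F_PER_M
    return eps_ox * (width_m * length_m / t_ox_m)


# Hypothetical wire: 0.25 um wide, 1 mm long, 0.5 um above the bottom plate
c_int = parallel_plate_cap(0.25e-6, 1e-3, 0.5e-6)
print(f"C_int = {c_int * 1e15:.2f} fF")
```

Because the formula is linear in W and L and inversely proportional to the dielectric thickness, halving the thickness doubles the capacitance in this model.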
Line Dimensions and Fringing Capacitance
[Figure: adjacent wires of width w separated by spacing S]
Capacitive coupling
Crosstalk effect
Signal integrity
Lateral cap
Interconnect Evolution and Modeling Needs
Before 1990, wires were thick and wide while devices were big and slow
- Large wiring capacitances and device resistances
- Wiring resistance << device resistance
- Model wires as capacitances only
In the 1990s, scaling (by scale factor S) led to smaller and faster devices and smaller, more resistive wires
- Reverse scaling of properties of wires
- RC models became necessary
In the 2000s, frequencies are high enough that inductance has become a major component of total impedance
Evolving Interconnects Affect Timing
Interconnect capacitance > gate input capacitance
- Better prediction
Interconnect resistance no longer ignorable
- Better modeling: distributed R(L)C network, AWE, etc.
- Effective capacitance < total load capacitance
Interconnect delay > gate delay for sub-micron technologies
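To see why a distributed model matters once wire resistance is no longer negligible, the sketch below computes the Elmore delay of a wire split into N equal RC segments. Elmore delay is not named on the slide; it is used here as a simple first-moment estimate, and the resistance and capacitance values are hypothetical.

```python
def elmore_delay(r_wire, c_wire, n_segments, r_driver=0.0, c_load=0.0):
    """Elmore delay (seconds) of a wire modeled as n equal RC segments.

    Each segment capacitor is charged through all resistance upstream of it.
    """
    r_seg = r_wire / n_segments
    c_seg = c_wire / n_segments
    delay = 0.0
    r_upstream = r_driver
    for _ in range(n_segments):
        r_upstream += r_seg
        delay += r_upstream * c_seg
    # The load at the far end sees the full driver plus wire resistance.
    return delay + r_upstream * c_load


# Hypothetical wire: 100 ohm total resistance, 1 pF total capacitance
lumped = elmore_delay(100.0, 1e-12, 1)
distributed = elmore_delay(100.0, 1e-12, 1000)
print(f"lumped: {lumped * 1e12:.1f} ps, distributed: {distributed * 1e12:.2f} ps")
```

With one segment the estimate is R·C; as the number of segments grows it approaches R·C/2, so a single lumped RC overestimates the wire delay by about a factor of two in this model.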
Sub-Wavelength Optical Lithography
What are implications of this picture?
Slide courtesy of Numerical Technologies, Inc.
…Complexity of Photomasks
How many wafers, on average, are printed with a mask set?
Summary of Technology Scaling
Scaling of 0.7x every three (two?) years
  Node:          .25u   .18u   .13u   .10u   .07u   .05u
  Year:          1997   1999   2002   2005   2008   2011
  Metal layers:  5LM    6LM    7LM    7LM    8LM    9LM
Interconnect delay dominates system performance
- consumes up to 70% of clock cycle
Cross coupling capacitance is dominating
- cross capacitance → 100%, ground capacitance → 0%
- ground capacitance is 90% in .18u
- huge signal integrity implications (e.g., guardbands in static analysis approaches)
Multiple clock cycles required to cross chip
- whether 3 or 15 not as important as fact of “multiple” > 1
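The 0.7x rule can be checked against the node names listed on this slide. The sketch below applies an ideal 0.7x shrink per node starting from 0.25 µm; the helper name is illustrative, and the comparison only shows how closely the named nodes track the ideal sequence.

```python
S = 0.7  # linear scale factor per technology node


def ideal_nodes(start_um, count, s=S):
    """Feature sizes obtained by applying the scale factor once per node."""
    return [start_um * s**i for i in range(count)]


roadmap_um = [0.25, 0.18, 0.13, 0.10, 0.07, 0.05]   # node names from the slide
ideal_um = ideal_nodes(roadmap_um[0], len(roadmap_um))

# One node shrinks area by S^2; two nodes shrink linear dimensions by S^2.
two_node_linear_factor = S**2

for named, ideal in zip(roadmap_um, ideal_um):
    print(f"named {named:.2f} um, ideal {ideal:.3f} um")
print(f"0.7^2 = {two_node_linear_factor:.2f}")
```

Since 0.7² ≈ 0.49, linear dimensions roughly halve every two nodes, and area roughly halves every node.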
New Materials Implications
Lower dielectric permittivity
- reduces total capacitance
- doesn’t change cross-coupled / grounded capacitance proportions
Copper metallization
- reduces RC delay
- avoids electromigration (factor of 4-5 ?)
- thinner deposition reduces cross cap
Multiple layers of routing
- enabled by planarization; 10% extra cost per layer
- reverse-scaled top-level interconnects
- relative routing pitch may increase
- room for shielding
Technical Issues
Manufacturability (chip can't be built)
- antenna rules
- minimum area rules for stacked vias
- CMP (chemical mechanical polishing) area fill rules
- layout corrections for optical proximity effects in subwavelength lithography; associated verification issues
Signal integrity (failure to meet timing targets)
- crosstalk induced errors
- timing dependence on crosstalk
- IR drop on power supplies
Reliability (design failures in the field)
- electromigration on power supplies
- hot electron effects on devices
- wire self heat effects on clocks and signals
Courtesy Hormoz/Muddu, ASIC99
Noise
Analog design concerns are due to physical noise sources
- caused by the discreteness of electronic charge and the stochastic nature of electronic transport processes
- examples: thermal noise, flicker noise, shot noise
Digital circuits, because of their large, abrupt voltage swings, create deterministic noise that is several orders of magnitude higher than stochastic physical noise
- still, digital circuits are prevalent because they are inherently immune to noise
Technology scaling and performance demands make noisiness of digital circuits a big problem
Silicon Complexity Challenges
Silicon Complexity = impact of process scaling, new materials, new device/interconnect architectures
Non-ideal scaling (leakage, power management, circuit/device innovation, current delivery)
Coupled high-frequency devices and interconnects (signal integrity analysis and management)
Manufacturing variability (library characterization, analog and digital circuit performance, error-tolerant design, layout reusability, static performance verification methodology/tools)
Scaling of global interconnect performance (communication, synchronization)
Decreased reliability (soft error uncertainty, gate insulator tunneling and breakdown, joule heating and electromigration)
Complexity of manufacturing handoff (reticle enhancement and mask writing/inspection flow, manufacturing NRE cost)
If you don’t know a term, ask…
In a PDA…
Reference Design: personal digital assistant (PDA)
Composed of CPU, DSP, peripheral I/O, and memory
Required Performance for Multi-Media Processing
GOPS: Giga Operations Per Second
[Figure: required performance in GOPS (log scale, 0.01 to 100) by application category: video, audio, voice, communication, recognition, graphics. Applications plotted: FAX, modem, VoIP modem, SW defined radio; JPEG, MPEG, MPEG1 extraction, MPEG2 extraction (MP/ML, MP/HL), compression, MPEG4; Dolby-AC3; 2D graphics, 3D graphics (10Mpps, 100Mpps); word recognition, sentence translation, voice auto translation, face recognition, voice print recognition, moving picture recognition]
…Implemented With an SoC
0.18um / 400MHz / 470mW (typical), 6.5MTrs.
[Figure: SoC block diagram with three clock areas: Processor Area (max 400MHz), Data Transfer Area (100MHz), Peripheral Area (4 – 48MHz). Blocks shown: CPU, I-cache 32KB, D-cache 32KB, DMA controller, LCD Cnt., MEM Cnt., PWR, CPG, I2C, FICP, USB, MMC, UART, AC97, I2S, OST, GPIO, SSP, PWM, RTC; memories and display: SDRAM 64MB, Flash 32MB, LCD]
MM Application: MP3, JPEG, Simple Moving Picture
Available Time: 6-10Hr
Specification: USB, MMC, KEY, Sound
If the PDA must have 200h standby time with a 120g battery… ?
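One way to approach the question is to turn the battery mass into an average power budget. The sketch below does this under a loudly stated assumption: the battery's specific energy (150 Wh/kg here) is a hypothetical placeholder, not a figure from the slides, and the function name is illustrative.

```python
def average_power_budget_mw(battery_mass_g, specific_energy_wh_per_kg, hours):
    """Average power (mW) that drains the whole battery in the given time."""
    energy_wh = battery_mass_g / 1000.0 * specific_energy_wh_per_kg
    return energy_wh / hours * 1000.0


# ASSUMPTION: hypothetical specific energy; substitute the real battery data.
ASSUMED_SPECIFIC_ENERGY_WH_PER_KG = 150.0

standby_budget_mw = average_power_budget_mw(
    battery_mass_g=120.0,
    specific_energy_wh_per_kg=ASSUMED_SPECIFIC_ENERGY_WH_PER_KG,
    hours=200.0,
)
print(f"standby budget: {standby_budget_mw:.0f} mW for the whole device")
```

Whatever specific energy is used, the budget scales linearly with it and inversely with the standby time, so the result can be compared directly with the 470mW typical figure quoted for the SoC.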
System Complexity Challenges
System Complexity = exponentially increasing transistor counts, with increased diversity (mixed-signal SOC, …)
Reuse (hierarchical design support, heterogeneous SOC integration, reuse of verification/test/IP)
Verification and test (specification capture, design for verifiability, verification reuse, system-level and software verification, AMS self-test, noise-delay fault tests, test reuse)
Cost-driven design optimization (manufacturing cost modeling and analysis, quality metrics, die-package co-optimization, …)
Embedded software design (platform-based system design methodologies, software verification/analysis, codesign w/HW)
Reliable implementation platforms (predictable chip implementation onto multiple fabrics, higher-level handoff)
Design process management (team size / geog distribution, data mgmt, collaborative design, process improvement)
Outline
Introduction
Technology Evolution: Silicon Complexity, System Complexity
Design Flows: Traditional, State of the Art
- Design Metrics
- Design Closure
Levels of VLSI Design in a Traditional Flow
Specification: what the system (or component) is supposed to do
Architecture: high-level design of component
- state defined
- logic partitioned into major blocks
Logic: gates, flip-flops, and the connections between them
Circuit: transistor circuits to realize logic elements
Device: behavior of individual circuit elements
Layout: geometry used to define and connect circuit elements
Process: steps used to define circuit elements
[Figure: design flow boxes: Architecture Design, High Level Synthesis, RTL, Synthesis, Placement, Routing, Extraction and Timing Verification, GDSII, Manufacturing, Verification]
Design Principles (Traditional)
Partition the problem (hierarchical design)
Different abstraction levels: RTL, gate-level, switch-level, transistor-level
Orthogonalize concerns
- Abstraction vs. implementation
- Logic vs. timing
Constrain the design space to simplify the design process
- Balance between design complexity and performance
- E.g., standard-cell methodology
VLSI Design Flow Evolution
Expanding in two directions
- System-on-Chip (SoC) Design
- Design for Manufacturability (DFM)
More design metrics
- Area
- Timing
- Power
- Signal Integrity
- Reliability
Tighter Integration
- Design closure
- RTL/GDSII sign-off re-defined
[Figure: design flow boxes: Architecture Design, High Level Synthesis, RTL, Synthesis, Placement, Routing, Extraction and Timing Verification, GDSII, Manufacturing, Verification]
Design Procedure and Tools
Behavior modeling
- Matlab/C/VHDL
Logic synthesis
- DesignCompiler, BuildGates, …
- Verification of synthesis
  - Formal Verification (Verplex)
  - Static timing analysis (PrimeTime)
Place and route
- Astro, SOCE, …
- Verification of layout
  - DRC, ERC, LVS (Calibre)
  - Extraction (SignalStorm)
  - Delay Calculation (CeltIC)
  - Simulation (SPICE)
DFM
[Figure: design flow boxes: Architecture Design, High Level Synthesis, RTL, Synthesis, Placement, Routing, Extraction and Timing Verification, GDSII, Manufacturing, Verification]
Design Principles (State of the Art)
Integrate the problem (design closure)
Back-annotation, predictability
Balance design metrics
Area/timing/power/signal integrity/reliability
Explore the design space
Balance between design complexity and performance
Platform-based SoC design
Design Methodologies (+ business models)
Full-Custom (high effort, leading-edge performance, high-volume)
Semi-Custom (strong infrastructure, economical in lower volumes)
ASIC (Application-Specific Integrated Circuit)
Standard Cell/Gate Array/Via Programmable/Structured ASIC
FPGA
Special
Analog (custom layout, I/Os and sense amps)
Mixed-Signal / RF (unique to each process, no scaling)
System-on-Chip ( System-in-Package)
Various components: IP blocks, ASIC, FPGA, memory, uP, RF, etc.
Define implementation platform, hardware-software co-design
Performance vs. complexity
Flow
[Figure: library and design flow diagram. Elements shown: Schematic Entry, Layout Entry, Cell Characterization, Standard Cell Library, 3-D RLC Modeling Tool, Wire Model, Device Model, Layout Rules, Layers; Synthesis Library (Timing/Power/Area), Parasitic Extraction Library, Place & Route Library (Ports); C-Model, Verilog Behavioral Model, Verilog Structural RTL, Structural Model; Floorplan, Synthesis, P & R, Clock Routing/Analysis, Global Layout, Block Layout; checks: Functional, Static Timing, Power/Area, Scan/Testability, DRC/ERC/LVS, Static/Dynamic Timing w/extract]
Slide courtesy of Mary Jane Irwin, PSU
Traditional Taxonomy
[Figure: flow steps grouped into Front End and Back End. Steps shown: Behavioral Level Design, Logic Design and Simulation, Logic Synthesis, Logic Partitioning, Design Verification, Timing Verification, Simulation, Test Generation, Die Planning, Floorplanning, IO Pad Placement, Global Placement, Detail Placement, Clock Tree Synthesis and Routing, Power/Ground Stripes, Rings Routing, Global Routing, Detail Routing, Extraction and Delay Calc., Timing Verification, LVS, DRC, ERC]
Generic Flow Steps
Library preparation
Library data preparation
Design data preparation
Logic design
Specification to RTL
RTL simulation
Hierarchical floorplanning
Synthesis
Formal verification
Gate level simulation
Static timing analysis
Physical design
- Physical floorplanning
- Place and route
- RC extraction
- Formal verification
- Physical verification
- Release to manufacturing
Design for test
Engineering change order
Library and Design Data
Models and technology data required to execute the design flow
Power, timing: ALF, DCL, OLA, .lib, STAMP
Layout: LEF, DEF, GDSII
Delays and path timing, parasitics: SDF, GCF, SDC, DSPF, RSPF, SPEF, SPICE
Layout rules: Dracula, Calibre “deck”
Architecture Design
Platform-based SoC Design
- Platform is a library of design resources
- Helps design space exploration
- Meet in the middle
Embedded system
- Hardware-software co-design
[Figure: application space, application instance, system platform, platform instance, architecture space, platform specification, platform design-space exploration]
Figure courtesy of Alberto Sangiovanni-Vincentelli, UCB
High-Level Synthesis (Behavior → RTL)
Scheduling
- Assignment of each operation to a time slot corresponding to a clock cycle or time interval
Resource allocation
- Selection of the types of hardware components and the number for each type to be included in the final implementation
Module binding
- Assignment of operation to the allocated hardware components
Controller synthesis
- Design of control style and clocking scheme
Compilation of the input specification language to the internal representation
Parallelism extraction
- usually via data flow analysis techniques
…
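As a small illustration of scheduling and resource allocation, the sketch below applies as-soon-as-possible (ASAP) scheduling to a data flow graph and then counts how many units of each type the schedule needs. ASAP is one common textbook choice, not something the slide prescribes; the graph, the operation names, and the one-cycle-per-operation assumption are all hypothetical.

```python
from collections import Counter


def asap_schedule(deps):
    """Map each operation to the earliest cycle in which it can run.

    deps maps an operation to the operations whose results it needs.
    Assumes every operation takes exactly one clock cycle.
    """
    schedule = {}

    def visit(op, stack=()):
        if op in schedule:
            return schedule[op]
        if op in stack:
            raise ValueError(f"dependency cycle through {op!r}")
        start = 0
        for pred in deps.get(op, []):
            start = max(start, visit(pred, stack + (op,)) + 1)
        schedule[op] = start
        return start

    for op in deps:
        visit(op)
    return schedule


def units_needed(schedule, op_type):
    """Resource allocation: peak number of same-type operations per cycle."""
    per_cycle = Counter((op_type[op], cycle) for op, cycle in schedule.items())
    needed = {}
    for (kind, _cycle), count in per_cycle.items():
        needed[kind] = max(needed.get(kind, 0), count)
    return needed


# Hypothetical data flow graph for y = (a * b) + (c * d) - e
dfg = {"m1": [], "m2": [], "add": ["m1", "m2"], "sub": ["add"]}
kinds = {"m1": "mul", "m2": "mul", "add": "alu", "sub": "alu"}

schedule = asap_schedule(dfg)
print(schedule)
print(units_needed(schedule, kinds))
```

In this example both multiplications land in the first cycle, so two multipliers are needed; delaying one of them would trade a longer schedule for fewer components, which is the kind of tradeoff scheduling and allocation explore together.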
Architecture Level Floorplanning
Defines the basic chip layout architecture
Define the standard cell rows and I/O placement locations
Place RAMs and other macros
Separate gate array, memory, analog, RF blocks
Define power distribution structures such as rings and stripes
Allow space for clock, major buses, etc.
Rules of thumb for cell density are used to initially calculate design size
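A rule-of-thumb size estimate of this kind can be sketched as follows. The area per gate, the utilization (cell density), and the gate count are hypothetical inputs, and the square core is an assumption made only to keep the example short.

```python
import math


def estimate_core_side_um(gate_count, area_per_gate_um2, utilization,
                          macro_area_um2=0.0):
    """Side length (um) of a square core that fits the cells and macros.

    utilization is the fraction of standard cell row area occupied by cells.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    cell_area = gate_count * area_per_gate_um2
    core_area = cell_area / utilization + macro_area_um2
    return math.sqrt(core_area)


# Hypothetical design: 1M gates, 10 um^2 per gate, 70% utilization, no macros
side_um = estimate_core_side_um(1_000_000, 10.0, 0.70)
print(f"estimated core: {side_um / 1000:.2f} mm on a side")
```

Lowering the assumed utilization leaves more room for routing and buffers at the cost of a larger die, which is why the initial density figure is usually revisited after trial placement.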
Logic Synthesis
Conversion of RTL to gate-level netlist
Targeted to a foundry-specific library
Can be performed hierarchically (block by block)
Timing-driven
- Clock information
- Primary input arrival times, primary output required times
- Input driving cells, output loading
- False paths, multi-cycle paths
Interconnect delay may be calculated based on a “wireload model” which uses fanout to estimate delay
Clock parameters (insertion delay, skew, jitter, etc.) are assumed to be attainable later in place and route
Formal Verification
RTL description and gate level netlist are compared to verify functional equivalence, thereby verifying the synthesis results
Formal methods
- Graph isomorphism
- Binary Decision Diagram (BDD)
Emerging technology that supplements the more traditional gate-level simulation approach
FV also performed after place-and-route (if gate netlist changes)
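The idea of functional equivalence can be shown on a toy example. The sketch below compares an RTL-style description with a gate-level version by exhaustive truth-table enumeration; this brute-force approach is a stand-in for the graph and BDD methods listed above and only works for a handful of inputs. All function names are hypothetical.

```python
from itertools import product


def equivalent(f, g, n_inputs):
    """Compare two Boolean functions on every input combination.

    Returns (True, None) if they agree, else (False, counterexample).
    """
    for bits in product((False, True), repeat=n_inputs):
        if bool(f(*bits)) != bool(g(*bits)):
            return False, bits
    return True, None


def rtl_mux(sel, a, b):
    """RTL-style description of a 2:1 multiplexer."""
    return a if sel else b


def gate_mux(sel, a, b):
    """Gate-level version: two ANDs, one inverter, one OR."""
    return (sel and a) or ((not sel) and b)


def buggy_gate_mux(sel, a, b):
    """Gate-level version with the inverter and one AND dropped."""
    return (sel and a) or b


print(equivalent(rtl_mux, gate_mux, 3))
print(equivalent(rtl_mux, buggy_gate_mux, 3))
```

Unlike simulation with a chosen set of test vectors, the check covers every input combination, and a failing comparison returns a concrete counterexample that can be used for debugging.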
RTL Simulation
RTL code, written in Verilog, VHDL or a combination of both, is simulated to verify functional correctness
Testbenches apply input stimulus to the design
Several methods are used to verify the outputs
- Self-checking testbenches automatically verify output correctness and report mismatches
- Results can be stored in a file and compared to previous results
- Waveform displays can be used to interactively verify the outputs
Gate-Level Simulation
Covers both functionality and timing
Correctness is only as good as the test vectors used
Especially critical for non-synchronous designs, verification of false path and multi-cycle path constraints
Cell timing is included in the simulation models and interconnect delay is passed from the synthesis run
Worst case PVT conditions are used to analyze for setup violations, and best case PVT conditions are used to analyze for hold violations
PVT = Process, Voltage, Temperature
Static Timing Analysis
Verifies that design operates at desired frequency
Implicitly assumes correct timing constraints (!), e.g., boundary conditions
Timing constraints are similar to those used by logic synthesis
Verifies setup and hold times at FF inputs; can also check timing from and to PIs and POs; can also check point-to-point delay values (with blocking of pins, etc.)
As with gate-level simulation, both best- and worst-case analysis is performed
Typically performed on full-chip (not block) basis
- May require modified constraints for inter-block issues: multiple clock domains, multi-cycle paths, etc.
For compatibility with timing-driven layout flow, helps to have simple / single set of constraints
Other issues: incremental analysis, …
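The setup and hold checks at a flip-flop input reduce to two slack calculations, sketched below for a single register-to-register path. The formulas are the standard ones; the delay numbers are hypothetical, in nanoseconds, with maximum delays standing in for the worst-case corner and minimum delays for the best-case corner.

```python
def setup_slack(clock_period, clk_to_q_max, logic_delay_max, setup_time,
                skew=0.0):
    """Setup slack; skew = capture clock arrival - launch clock arrival."""
    arrival = clk_to_q_max + logic_delay_max
    required = clock_period + skew - setup_time
    return required - arrival


def hold_slack(clk_to_q_min, logic_delay_min, hold_time, skew=0.0):
    """Hold slack; data must not change until hold_time after the capture edge."""
    arrival = clk_to_q_min + logic_delay_min
    required = hold_time + skew
    return arrival - required


# Hypothetical path at 400 MHz (2.5 ns period)
s = setup_slack(clock_period=2.5, clk_to_q_max=0.3, logic_delay_max=1.9,
                setup_time=0.2)
h = hold_slack(clk_to_q_min=0.1, logic_delay_min=0.05, hold_time=0.1,
               skew=0.1)
print(f"setup slack {s:+.2f} ns, hold slack {h:+.2f} ns")
```

A negative slack is a violation. Note that clock skew toward the capturing flip-flop helps setup but hurts hold, which is one reason both analyses are needed.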
Block-Level Physical Floorplanning
Reconcile logical and physical hierarchies
Cells that are interconnected want to be close together
- Take advantage of RTL hierarchy
- Generate a physical hierarchy
- RTL hierarchy = best physical hierarchy?
Often bundled within the same cockpit as the place and route tool
Give placement some initial clues to reduce complexity
Place and Route
Automatically place the standard cells
Generate clock trees
Add any remaining power bus connections
Route clock lines
Route signal interconnects
Design rule checks on the routes and cell placements
Timing driven tools
- Require timing constraints and analysis algorithms similar to those used during the static timing analysis step
RC(L) Extraction
Calculate resistance and capacitance (and inductance) of interconnects
- Based on placement of cells
- Routing segments
Calculate capacitive (inductive) effects of adjacent segments
- Extract capacitance between metal segments
RC(L) data transferred back to
- Static timing analysis (back annotation)
- Gate level simulation
- Replaces wire load model used in synthesis
Drive delay calculation, signal integrity analysis (crosstalk, other noise), static timing
Q: How do parasitics and noise affect performance?
Physical Verification
DRC – Design Rule Check
- Spacing, min dimension rules
LVS – Layout Versus Schematic
- Verifies that layout and netlist are equivalent at the transistor level
ERC – Electrical Rule Check
- Dangling nets, floating nodes
GDSII (Stream Format)
- Final merge of layout, routing and placement data for mask production
Release to Manufacturing
Final edits to the layout are made
Metal fill and metal stress relief rules are checked
Manufacturing information such as scribe lanes, seal rings, mask shop data, part numbers, logos and pin 1 identification information for assembly are also added
DRC and LVS are run to verify the correctness of the modified database
‘Tapeout’ documentation is prepared prior to release of the GDSII to the foundry
Pad location information is prepared, typically in a spreadsheet
Cadence’s Virtuoso is used for custom-manual edits of the mask layers
Manufacturing steps
- generation of masks
- silicon processing
- wafer testing
- assembly and packaging
- manufacturing test
A More Detailed Design Flow
[Figure: flow boxes: Design Specs, Reqmts., Constraints, Lib.+CWLM, Fnl. Design, Synthesis, Floor-plan & PG, Placement, Physical re-synth, Clock distribution, Route / scan re-order, Timing analysis / IPO, Fnl., pwr., SI ECO, ERC / DRC / LVS, Tape-out]
Activities annotated on the flow:
- Architectural optimization (timing); inter-group buses, bandwidth; clock, SI, test; validation
- Floorplanning and custom WLM; power distribution (internal, I/O); I/O driver, padring design; board-level timing, SI
- Row definitions; placement of cells; congestion analysis
- Placement-based re-synthesis; noise minimization, isolation; clock distribution
- Full routing; scan stitching, re-ordering
- Full RC back-annotation; hierarchical timing, electrical and SI analysis and IPO/ECO
A. Khan, Simplex/Altius
Outline
Introduction
Technology Evolution: Silicon Complexity, System Complexity
Design Flows: Traditional, State of the Art
- Design Metrics
- Design Closure
More Design Metrics and Techniques
Metrics:
- Area: cell area, wirelength
- Timing: gate, interconnect
- Power: dynamic, static, leakage
- Signal Integrity: crosstalk (capacitive, inductive), supply voltage drop (IR drop, LdI/dt)
- Reliability: variation (Vdd, thermal, process variation (tox, BEOL)), electromigration, hot electron effect (SEU)
Techniques:
- Cost minimization: synthesis (technology mapping), placement, routing
- Performance optimization: logic transformation, transistor sizing, buffering, re-routing
- Power minimization: gating (sleep transistors), variant Vdd, process optimization, dual-Vth
- Signal Integrity: sizing, net ordering, shielding, P/G design, placement, synthesis
- Reliability: statistical design optimization, design margin
Design Flow Evolution (ITRS-2003)
[Figure: design flow evolution across three technology generations: Past (250–180nm), Present (130–90nm), Future (65nm –). Elements shown: SPEC, System Design, System Model, HW/SW optimization, SW optimization, performance model, RTL, SW; file-based Synthesis + Timing Analysis + Placement Opt and Place/Wire + Timing Analysis + Logic Opt; a shared Data Model repository with optimize tools (Hw/Sw, Logic, Circuit, Place, Wire, other) and analyze tools (Perf., Timing, Power, Noise, Test, Mfg., other) under a Cockpit / Auto-Pilot with a Comm. layer; Functional Verification, Performance/Testability Verification, EQ check; MASKS]
Multiple design files are converged into one efficient Data Model
Disk accesses are eliminated in critical methodology loops
Verification of function, performance, testability and other design criteria all move to earlier, higher levels of abstraction, followed by
- Equivalence checking
- Assertion-driven design optimizations
Industry standard interfaces for data access and control
Incremental modular tools for optimization and analysis
Design Convergence Drivers and Approaches
Wireload Model
Helps delay estimation at synthesis stage
- Gate delay = f(input slew, load cap)
- Wire cap = f’(fanout number)
Empirical
- Different for each technology, library, tool, design, and design stage
- Statistical (from library), custom (multiple iterations), structural (look at adjacent nets) …
Large deviation remains
- Routing obstacles (hard IP blocks, macros, etc.)
- Routing algorithms/implementations (timing driven, net ordering, details)
[Figure: two plots: % estimation error (-10 to 15) per design, and capacitance vs. number of pins (2 to 15)]
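The two functions on this slide can be sketched as a table lookup plus a delay model. Every number below is hypothetical: the fanout-to-capacitance table, the extrapolation slope, the pin capacitance, and the linear gate delay coefficients are placeholders, not values from any real library.

```python
# HYPOTHETICAL wireload table: fanout -> estimated wire capacitance (fF)
WIRELOAD_CAP_FF = {1: 2.0, 2: 4.5, 3: 7.5, 4: 11.0}
SLOPE_FF_PER_FANOUT = 4.0   # hypothetical extrapolation beyond the table


def wire_cap_ff(fanout):
    """Wire cap = f'(fanout): table lookup with linear extrapolation."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    max_fanout = max(WIRELOAD_CAP_FF)
    if fanout <= max_fanout:
        return WIRELOAD_CAP_FF[fanout]
    extra = fanout - max_fanout
    return WIRELOAD_CAP_FF[max_fanout] + SLOPE_FF_PER_FANOUT * extra


def gate_delay_ps(input_slew_ps, load_cap_ff,
                  intrinsic_ps=20.0, k_slew=0.1, k_load=3.0):
    """Gate delay = f(input slew, load cap); a hypothetical linear model."""
    return intrinsic_ps + k_slew * input_slew_ps + k_load * load_cap_ff


def stage_delay_ps(fanout, input_slew_ps, pin_cap_ff=1.5):
    """Pre-layout delay estimate for a gate driving `fanout` input pins."""
    load = wire_cap_ff(fanout) + fanout * pin_cap_ff
    return gate_delay_ps(input_slew_ps, load)


print(stage_delay_ps(fanout=2, input_slew_ps=50.0))
```

Because the wire capacitance depends only on fanout, two nets with the same fanout get the same estimate regardless of where their pins end up, which is one source of the large deviations noted above.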
Interconnect Statistics
Local Interconnect
Global Interconnect
What are some implications?
S_Local = S_Technology
S_Global = S_Die
Rent’s Rule
Power law distribution
- $N = G^p$
- N: number of nets
- G: number of gates
- p: Rent exponent between 0 and 1
Foundation of statistical interconnect prediction
Empirical, unclear theoretical root
[Figure: log-log plot of lg N vs. lg G]
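On a log-log plot a power law becomes a straight line whose slope is the Rent exponent, so p can be estimated with a least-squares line fit. The sketch below does this on synthetic data generated with p = 0.6; the data and the function name are illustrative, and the fit also returns a multiplicative coefficient k (equal to 1 in the slide's form of the rule) as an added generalization.

```python
import math


def fit_power_law(samples):
    """Least-squares fit of log10(N) = log10(k) + p * log10(G).

    samples is a list of (gates, nets) pairs; returns (p, k).
    """
    xs = [math.log10(g) for g, _ in samples]
    ys = [math.log10(n) for _, n in samples]
    count = len(samples)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx
    k = 10 ** (mean_y - p * mean_x)
    return p, k


# SYNTHETIC data: block sizes G with N generated from N = G^0.6
samples = [(g, g ** 0.6) for g in (16, 64, 256, 1024, 4096)]
p, k = fit_power_law(samples)
print(f"fitted Rent exponent p = {p:.3f}, coefficient k = {k:.3f}")
```

With measured data the points scatter around the line, and the fitted exponent is only as meaningful as the partitioning used to collect the (G, N) pairs.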
Constructive Interconnect Prediction
Statistical models have their limitations
Critical paths and the law of small numbers
- Statistical properties, e.g., average wirelength
- Extreme statistical properties, e.g., critical path length
Implementation details
- Routing congestion, e.g., horizontal effect
- Timing optimization, e.g., layer assignment
- Via blockage, pin accessibility, wrong-way routing, etc.
Predict by construction (physical synthesis)
- try a fast (global) router
Scheffer and Nequist, Proc. ACM SLIP 2000, pp. 139-144
Goal: Design Convergence
What must converge?
- logic, timing, power, SI, reliability in a physical embedding
- support front-end signoff with a predictable back-end
Achieve Convergence through Predictability
- correct by construction (“assume, then enforce”)
  - constraints and assumptions passed downstream; not much goes upstream
  - ignores concerns via guardbanding
  - separates concerns as able (e.g., FE logic/timing vs. BE spatial embedding)
- construct by correction (“tight loops”)
  - logic-layout unification; synthesis-analysis unification, concurrent optimization
- elimination of concerns
  - reduced degrees of freedom, pre-emptive design techniques
  - e.g., power distribution, layer assignment / repeater rules
“Physical Prototyping Philosophy”
Prototype delivers accurate physical data
Levels of accuracy
- Placement-acknowledgeable synthesis (PKS)
- Including global route
- Post-detailed-route (In-Place Optimization, i.e., IPO)
Hierarchical timing budgeting:
- Chip-level CTS, top-level route and IPO, power analysis and grid design
- Block-level synthesis, placement, IPO, routing
“Handoff with enough physical information to ensure correct results”
[Figure: RTL, Gates, Physical Prototype (Floorplan / Placement, Routing); annotations: functionality known, timing / routability known]
M. Courtoy, Silicon Perspective
Coarse Placement Drives Partitioning, Coarse Routing Drives Pin Assignment / Timing Opt
Full-chip prototype results in optimal pin placement
Results in narrower channels and reduced die size
Reduces the routing congestion
Improves the chip timing
Accurate timing budgets result in predictable timing convergence
[Figure: Physical Prototype Partitioning into Block 1, Block 2, Block 3, with block-level timing budgets and block-level pin assignments]
M. Courtoy, Silicon Perspective
Cool Pictures of the Pieces…
“Tape Out Every Day”
[Figure: Full Chip Physical Prototype flow: Partition, Place, Detailed Trial Route, RC Extraction, Delay Calc / STA, IPO. Panels: Power IR Drop Analysis, Hierarchical Clock Tree Synthesis (block skews of 150ps, 120ps, 50ps, 50ps, 100ps, 130ps), Full Chip Power Planning, Block-Level Optimization, Timing Closure]
M. Courtoy, Silicon Perspective