VLSI – Design of Integrated Circuits
Prof. Dr. Dr. h.c. mult. M. Glesner
Dipl.-Ing. M.-D. Doan
Dipl.-Inf. M. Gasteier
Dipl.-Ing. H. Genther
Dipl.-Ing. T. Hollstein
Dipl.-Ing. P. Pochmuller
Dr.-Ing. N. Wehn
Dipl.-Ing. P. Windirsch
Darmstadt University of Technology
Contents
List of Figures 0-12
List of Tables 0-29
1 Basics of CMOS Circuit Design 1-1
1.1 pn Junction Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
1.1.1 pn Junction Space Charge Area and Electric Field . . . . . . . . . . . . 1-1
1.1.2 pn Junction Built-in Potential . . . . . . . . . . . . . . . . . . . . . . . . 1-2
1.1.3 pn Junction Depletion Width . . . . . . . . . . . . . . . . . . . . . . . . 1-3
1.1.4 pn Junction with External Voltage . . . . . . . . . . . . . . . . . . . . . 1-4
1.1.5 pn Junction Capacitance . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
1.1.6 pn Junction Current Flow . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
1.2 MOS Transistor Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
1.2.1 MOSFET Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
1.2.2 MOS Capacitor and Threshold Voltage . . . . . . . . . . . . . . . . . . 1-7
1.2.3 MOSFET Operation Modes . . . . . . . . . . . . . . . . . . . . . . . . . 1-13
1.2.4 MOSFET current characteristic . . . . . . . . . . . . . . . . . . . . . . . 1-16
1.2.5 Biased MOSFET Current Equations . . . . . . . . . . . . . . . . . . . . 1-21
1.2.6 Measurement of device parameters . . . . . . . . . . . . . . . . . . . . . 1-22
1.2.7 The Complete MOSFET GCA Analysis . . . . . . . . . . . . . . . . . . 1-23
1.2.8 Depletion mode n–channel MOSFET . . . . . . . . . . . . . . . . . . . 1-24
1.2.9 p–channel MOSFET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-28
1.2.10 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-29
1.2.11 Modelling the MOS Transistor for Circuit simulation . . . . . . . . . . . 1-30
1.3 DC Characteristics of MOS Inverters . . . . . . . . . . . . . . . . . . . . . . . . 1-32
1.3.1 Basic Inverter characteristics . . . . . . . . . . . . . . . . . . . . . . . . 1-33
1.3.2 Inverter with Linear Resistor Load . . . . . . . . . . . . . . . . . . . . . 1-38
VLSI Design Course
Darmstadt University of Technology, Institute of Microelectronic Systems
1.3.3 Inverter Design: Resistor Model . . . . . . . . . . . . . . . . . . . . . . 1-42
1.3.4 Inverter with Saturated Enhancement Load . . . . . . . . . . . . . . . . 1-44
1.3.5 Inverter with Nonsaturated Enhancement Load . . . . . . . . . . . . . . 1-45
1.3.6 Inverter with Depletion mode MOSFET Load . . . . . . . . . . . . . . . 1-46
1.3.7 CMOS inverter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-51
1.4 Switching of MOS Inverters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-54
1.4.1 The output High-to-Low Time tHL . . . . . . . . . . . . . . . . . . . . . 1-54
1.4.2 Rise Time tLH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-54
1.4.3 NMOS Propagation Delay Time . . . . . . . . . . . . . . . . . . . . . . 1-56
1.4.4 CMOS Inverter Transient Response . . . . . . . . . . . . . . . . . . . . 1-57
1.4.5 Propagation Delay Time tp of CMOS Inverters . . . . . . . . . . . . . . 1-57
1.4.6 Power-Delay-Product (PDP) . . . . . . . . . . . . . . . . . . . . . . . . 1-65
1.4.7 MOSFET Capacitances . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-70
1.4.8 Inverter Output Capacitance . . . . . . . . . . . . . . . . . . . . . . . . 1-75
1.4.9 Scaled Inverter Performance . . . . . . . . . . . . . . . . . . . . . . . . . 1-78
1.5 CMOS Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-79
1.5.1 CMOS Process Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-79
1.5.2 The Latch-Up Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-80
2 Static CMOS Logic Design and Combinational Circuits 2-1
2.1 Overview: Combinational Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
2.2 Complex nMOS Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
2.2.1 nMOS NOR Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
2.2.2 nMOS NAND Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
2.2.3 nMOS Complex Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
2.3 Complex Static CMOS Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-10
2.3.1 CMOS NAND and NOR Gates . . . . . . . . . . . . . . . . . . . . . . . 2-10
2.3.2 Static CMOS Logic Design . . . . . . . . . . . . . . . . . . . . . . . . . 2-13
2.3.3 Pseudo nMOS Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-23
2.4 Passtransistor and Transmission Gate Logic . . . . . . . . . . . . . . . . . . . . 2-24
2.4.1 Passtransistor Charging Characteristics . . . . . . . . . . . . . . . . . . 2-25
2.4.2 Passtransistor Discharging Characteristics . . . . . . . . . . . . . . . . . 2-26
2.4.3 CMOS Transmission Gates . . . . . . . . . . . . . . . . . . . . . . . . . 2-28
3 Synchronous MOS Logic 3-1
3.1 Clocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
3.1.1 Single and Multiple Clock Signals . . . . . . . . . . . . . . . . . . . . . 3-2
3.2 Clocked Static Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5
3.3 Charge Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
3.4 Dynamic Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
3.4.1 Dynamic nMOS Inverter . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
3.4.2 Dynamic pMOS Inverter . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
3.4.3 Dynamic CMOS Properties and Conditions . . . . . . . . . . . . . . . . 3-15
3.4.4 Complex Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
3.4.5 Dynamic Cascades . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-17
3.5 Domino CMOS Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-18
3.5.1 Domino Logic Properties . . . . . . . . . . . . . . . . . . . . . . . . . . 3-22
3.5.2 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-23
3.5.3 Charge Leakage and Charge Sharing . . . . . . . . . . . . . . . . . . . . 3-24
3.6 NORA Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26
3.6.1 NORA Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26
3.6.2 The Signal Race Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26
3.6.3 NORA Structuring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-28
3.7 Memory Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-31
3.7.1 Principle of CMOS Information Storage . . . . . . . . . . . . . . . . . . 3-31
3.7.2 Dynamic Flip-Flops: Pseudo 2-Phase Clocking . . . . . . . . . . . . . . 3-33
3.7.3 Pseudo 2-Phase Memory Structures . . . . . . . . . . . . . . . . . . . . 3-34
3.7.4 Dynamic Flip-Flop with reduced Transistor Count and Clock Connection 3-37
3.7.5 Dynamic D-Latches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-38
3.7.6 Pseudo 2-Phase Logic Structures . . . . . . . . . . . . . . . . . . . . . . 3-39
3.7.7 Pseudo 2-Phase Logic Structures: Domino Logic . . . . . . . . . . . . . 3-40
3.7.8 2-Phase Memory Structures: Skew Reduction . . . . . . . . . . . . . . . 3-41
3.7.9 2-Phase Memory Structures: Chain Latch . . . . . . . . . . . . . . . . . 3-42
3.7.10 2-Phase Memory Structures: Static Flip-Flops . . . . . . . . . . . . . . 3-43
3.7.11 2-Phase Memory Structures: Static D Flip-Flops . . . . . . . . . . . . . 3-44
3.7.12 Static D Flip-Flop with Set and Reset . . . . . . . . . . . . . . . . . . . 3-47
4 Performance 4-1
4.1 Signal Delay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
4.1.1 Resistance Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
4.1.2 Capacitance Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
4.1.3 RC-line model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
4.2 CMOS Gate Transistor Sizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
4.3 Power Dissipation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
4.3.1 Static power dissipation . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
4.3.2 Dynamic power dissipation . . . . . . . . . . . . . . . . . . . . . . . . . . 4-10
4.3.3 Power delay product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-11
4.4 Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-15
4.4.1 Scaling principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-15
4.4.2 Interconnect layer scaling . . . . . . . . . . . . . . . . . . . . . . . . . . 4-17
4.5 Power and Clock Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-18
4.5.1 Power distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-18
4.5.2 Clock distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-19
4.5.3 Clock and Timing Circles . . . . . . . . . . . . . . . . . . . . . . . . . . 4-20
4.5.4 Clock Generation Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . 4-21
4.5.5 Clock Drivers and Distribution Techniques . . . . . . . . . . . . . . . . 4-22
4.6 Input Protection Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-23
4.7 Static Gate Sizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-25
4.8 Off-Chip Driver Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-29
4.8.1 Basic Off-Chip Driver Design . . . . . . . . . . . . . . . . . . . . . . . . 4-29
4.8.2 Tri-State and Bidirectional I/O . . . . . . . . . . . . . . . . . . . . . . . 4-30
5 CMOS Process and Layout Design of Integrated Circuits 5-1
5.1 Processing Steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
5.1.1 Wafer Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
5.1.2 The n-Well CMOS Process . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
5.1.3 The p-Well CMOS Process . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
5.1.4 The Twin-Tub Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
5.1.5 Isolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
5.1.6 Latchup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
5.2 Design Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
5.2.1 Lithography and Fabrication . . . . . . . . . . . . . . . . . . . . . . . . 5-14
5.2.2 Basic Design Rule Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-15
5.3 Circuit Extraction and Electrical Process Parameters . . . . . . . . . . . . . . . 5-22
5.3.1 Connectivity Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-23
5.3.2 Parasitic Capacitance Extraction . . . . . . . . . . . . . . . . . . . . . . 5-24
5.3.3 Transistor Size Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . 5-24
5.3.4 Parasitic Resistance Extraction . . . . . . . . . . . . . . . . . . . . . . . 5-25
5.3.5 Process Parameter and Technology Description . . . . . . . . . . . . . . 5-25
5.4 Basic Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-26
5.4.1 IC Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-27
5.4.2 General Layout Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . 5-27
5.4.3 Equivalent Load Concept . . . . . . . . . . . . . . . . . . . . . . . . . . 5-28
5.4.4 Latch-Up Prevention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-31
5.4.5 Static Gate Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-31
5.4.6 Transistor-Gate-Based Logic . . . . . . . . . . . . . . . . . . . . . . . . 5-32
5.5 Layout Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-34
6 VLSI Device Packaging 6-1
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
6.2 Package Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
6.2.1 24-pin Packaging Evolution . . . . . . . . . . . . . . . . . . . . . . . . . 6-6
6.3 Design Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
6.3.1 VLSI Design Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
6.3.2 Thermal Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
6.3.3 Electrical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
6.3.4 Mechanical Design Considerations . . . . . . . . . . . . . . . . . . . . . 6-11
6.4 Assembly Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
6.4.1 Wafer Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
6.4.2 Die Bonding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
6.4.3 Wire Bonding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-15
6.5 Package Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-17
6.5.1 Ceramic Package Technology . . . . . . . . . . . . . . . . . . . . . . . . 6-17
6.5.2 Glass-Sealed Refractory Technology . . . . . . . . . . . . . . . . . . . . 6-19
6.5.3 Plastic Molding Technology . . . . . . . . . . . . . . . . . . . . . . . . . 6-20
6.5.4 Molding Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-21
6.6 IC Package Market Share . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-22
6.7 Packaging Trends . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-23
6.7.1 MultiChip Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-23
6.7.2 Comparison of Packaging Alternatives . . . . . . . . . . . . . . . . . . . 6-27
7 Computer Aided Design of Integrated Circuits 7-1
7.1 CAD Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
7.2 Full Custom Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7.3 Cell Based Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
7.4 Design Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.4.1 Physical Design Rule Check . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.4.2 Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7.4.3 LVS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7.4.4 Schematic / Electrical Rule Check (SRC / ERC) . . . . . . . . . . . . . 7-11
7.5 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
7.5.1 Goal of Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
7.5.2 Simulator Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
7.5.3 Signal Modelling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
7.5.4 Signal States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
7.5.5 Circuit and Delay Modelling . . . . . . . . . . . . . . . . . . . . . . . . 7-13
7.5.6 Advanced Logic Simulators . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7.5.7 Simulation Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7.5.8 Switch Level Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-15
7.6 Hardware Description with VHDL . . . . . . . . . . . . . . . . . . . . . . . . . 7-18
8 Digital Subsystem Design 8-1
8.1 Weinberger Structuring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
8.2 Gate Matrix Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
8.2.1 Creating a Gate Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
8.2.2 Example: Half-Adder . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-9
8.2.3 Character Definitions for Symbolic Layout . . . . . . . . . . . . . . . . . 8-10
8.2.4 Summary of Gate Matrix Properties . . . . . . . . . . . . . . . . . . . . 8-13
8.3 Optimal CMOS Complex Gate Layout . . . . . . . . . . . . . . . . . . . . . . . 8-14
8.3.1 CMOS Functional Cells . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-15
8.3.2 Basic Layout Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-18
8.3.3 Graph Theoretical Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 8-21
8.3.4 Problem Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-22
8.3.5 Algorithm for Calculating Minimal Interlace . . . . . . . . . . . . . . . . 8-24
8.3.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
8.4 Standard Cell Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-30
8.5 Programmable Logic Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-32
8.5.1 Floor Plan for PLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
8.5.2 Static nMOS and Pseudo-nMOS PLA . . . . . . . . . . . . . . . . . . . 8-36
8.5.3 Static CMOS PLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-39
8.5.4 Dynamic CMOS PLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-39
8.5.5 Noise in PLAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-41
8.5.6 Optimization of PLAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-41
8.5.7 Timing and Power Dissipation of a Static PLA . . . . . . . . . . . . . . 8-44
8.5.8 Automatic PLA Layout Generation . . . . . . . . . . . . . . . . . . . . 8-45
8.6 Finite-State Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
8.6.1 Introduction into Finite State Machines . . . . . . . . . . . . . . . . . . 8-49
8.6.2 Realization of Finite-State Machines . . . . . . . . . . . . . . . . . . . . 8-51
8.6.3 Synchronous FSM Circuit Models . . . . . . . . . . . . . . . . . . . . . 8-54
8.6.4 States and Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-56
8.6.5 Equivalence of FSMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-57
8.6.6 Regular Expressions and Nondeterministic FSMs . . . . . . . . . . . . . 8-59
8.6.7 Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-62
9 ASIC Design Concepts 9-1
9.1 ASIC Design Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
9.1.1 The VLSI Design Process as a Transformation from Higher to Lower Descriptive Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
9.1.2 Phases of Electronic System Design . . . . . . . . . . . . . . . . . . . . 9-2
9.1.3 Application Architectural Properties . . . . . . . . . . . . . . . . . . . . 9-3
9.1.4 Synthesis Steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-3
9.2 ASIC Design Styles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
9.2.1 ASIC Technology Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
9.3 Gate Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6
9.3.1 Introduction to Gate Arrays . . . . . . . . . . . . . . . . . . . . . . . . . 9-6
9.3.2 IMI Grid Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-8
9.3.3 CDI Grid Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-13
9.3.4 Gate Array Design Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-14
9.3.5 Personalization Examples for IMI and CDI Gate Array . . . . . . . . . 9-15
9.3.6 Qualification of Gate Array Design Style . . . . . . . . . . . . . . . . . . 9-17
9.3.7 Gate Array Market . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-18
9.4 Standard Cell Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-19
9.4.1 Introduction to Standard Cells . . . . . . . . . . . . . . . . . . . . . . . 9-19
9.4.2 Qualification of Standard Cell Design Style . . . . . . . . . . . . . . . . 9-21
9.4.3 Standard Cell Market . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-22
9.5 Macro Cell Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-23
9.5.1 Introduction to the Macro Cell Concept . . . . . . . . . . . . . . . . . . 9-23
9.6 Mixed Design Styles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-24
9.6.1 Introduction: Mixed Design Styles . . . . . . . . . . . . . . . . . . . . . 9-24
9.6.2 Features of Mixed-Mode ASICs . . . . . . . . . . . . . . . . . . . . . . . 9-24
9.7 Programmable Logic Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-25
9.7.1 Classical PLD Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-25
9.7.2 Advanced PLD Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-29
9.7.3 PLA-based Device Properties . . . . . . . . . . . . . . . . . . . . . . . . 9-33
9.8 Field Programmable Gate Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . 9-34
9.8.1 The FPGA Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-34
9.8.2 FPGA Categories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-35
9.8.3 Programming Technologies . . . . . . . . . . . . . . . . . . . . . . . . . 9-36
9.8.4 Overview: Commercially Available FPGAs . . . . . . . . . . . . . . . . 9-38
9.8.5 Xilinx Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-39
9.8.6 Actel Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-41
9.8.7 CAD for FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-44
9.8.8 Economical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . 9-46
9.9 Overview on Logic Design Alternatives . . . . . . . . . . . . . . . . . . . . . . . 9-47
10 Arithmetic Units 10-1
10.1 Adders / Subtracters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1
10.1.1 Basic Adder Cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1
10.1.2 Adders / Subtracters for Binary Coded Integers . . . . . . . . . . . . . 10-1
10.2 Multipliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6
11 Microarchitectures 11-1
11.1 Datapath Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.1.1 Bit-slice ALU AMD 2901 . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.2 Controller Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-5
11.2.1 Microprogrammed Controllers . . . . . . . . . . . . . . . . . . . . . . . . 11-6
12 ASIC Design Guidelines 12-1
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1
12.2 Synchronous Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-1
12.2.1 Non-Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-1
12.2.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
12.3 Clock Buffering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
12.3.1 Non-Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
12.3.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5
12.4 Gated Clocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.4.1 Non-Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.4.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.5 Double-edged Clocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8
12.5.1 Non-Recommended Circuit . . . . . . . . . . . . . . . . . . . . . . . . . 12-8
12.5.2 Recommended Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-8
12.6 Asynchronous Resets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
12.6.1 Non-Recommended Circuit . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
12.6.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
12.7 Shift-Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-10
12.7.1 Non-recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-10
12.7.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-10
12.8 Asynchronous Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-11
12.8.1 Non-Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-11
12.8.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-11
12.9 Delay Lines and Monostables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-14
12.9.1 Non-Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-14
12.9.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-14
12.10 Bistable Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-14
12.10.1 Non-Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-14
12.10.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-16
12.11 RAMs and ROMs in Synchronous Circuits . . . . . . . . . . . . . . . . . . . . . 12-17
12.11.1 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-17
12.12 Tristates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-19
12.12.1 Non-Recommended Circuit . . . . . . . . . . . . . . . . . . . . . . . . . 12-19
12.12.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-20
12.12.3 Multiplexer ↔ Tristates . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-20
12.13 Parallel Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-21
12.13.1 Non-Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-21
12.13.2 Recommended Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-21
12.14 Fanout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-22
12.14.1 Non-Recommended Circuit . . . . . . . . . . . . . . . . . . . . . . . . . 12-22
12.14.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-23
12.15 Design for Speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-25
12.16 Design for Testability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-27
12.16.1 Non-Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 12-27
12.16.2 Recommended Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-29
13 Testing and Design for Testability 13-1
13.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-1
13.2 Economical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
13.2.1 Average Quality Level (AQL) . . . . . . . . . . . . . . . . . . . . . . . . 13-2
13.2.2 Correlation: Fault Coverage and Defective Parts . . . . . . . . . . . . . 13-3
13.3 Design Flow: Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5
13.3.1 Chip Test after Manufacturing . . . . . . . . . . . . . . . . . . . . . . . 13-6
13.4 Fundamental Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
13.5 Fault Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-7
13.6 Fault Tolerant Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-12
13.7 Test Pattern Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-15
13.7.1 The D-Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-15
13.8 Fault Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-24
13.8.1 Algorithms: Serial Fault Simulation . . . . . . . . . . . . . . . . . . . . 13-24
13.8.2 Improved Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-24
13.9 Design for Testability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-25
13.9.1 Ad-Hoc Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-25
13.9.2 Scan-Path Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-28
13.9.3 Built-In Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-29
13.9.4 Evaluation of Testing Data . . . . . . . . . . . . . . . . . . . . . . . . . 13-33
13.9.5 Built-In Logic Block Observation . . . . . . . . . . . . . . . . . . . . . . 13-37
13.9.6 Example: Self-testing Circuit . . . . . . . . . . . . . . . . . . . . . . . . 13-38
14 Boundary-Scan Architecture – JTAG Standard 14-1
14.1 Classical Board Test Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . 14-2
14.2 Introduction to Boundary Scan . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-4
14.3 The IEEE Standard 1149.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-7
14.3.1 IEEE Std 1149.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . 14-7
14.3.2 Test Access Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-8
14.3.3 TAP-Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-10
14.3.4 The Instruction Register . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-11
14.3.5 Test Data Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-13
15 Analog VLSI systems 15-1
15.1 Analog Signal Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-1
15.1.1 Signal Bandwidths in Analog VLSI . . . . . . . . . . . . . . . . . . . . . 15-2
15.1.2 A/D and D/A Conversion in Signal Processing Systems . . . . . . . . . 15-3
15.2 Digital-To-Analog Converters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4
15.2.1 Current Scaling D/A Converters . . . . . . . . . . . . . . . . . . . . . . 15-4
15.2.2 Voltage Scaling D/A Converters . . . . . . . . . . . . . . . . . . . . . . 15-8
15.3 Analog-To-Digital Converters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-9
15.3.1 Serial A/D Converters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-10
15.3.2 Successive Approximation A/D Converters . . . . . . . . . . . . . . . . 15-10
15.3.3 Parallel A/D Converters . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-11
15.3.4 Sigma-Delta A/D Converter . . . . . . . . . . . . . . . . . . . . . . . . . 15-14
Bibliography 16-1
List of Figures
1.1 Step-profile of pn junction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
1.2 n-channel enhancement-mode MOSFET . . . . . . . . . . . . . . . . . . . . . . 1-6
1.3 The basic MOS structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7
1.4 MOS accumulation state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
1.5 MOS fields and potentials for positive gate voltages . . . . . . . . . . . . . . . . 1-9
1.6 Depletion in the MOS system . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
1.7 Surface inversion in the MOS system . . . . . . . . . . . . . . . . . . . . . . . . 1-10
1.8 Increase in depletion charge from body bias VB . . . . . . . . . . . . . . . . . . 1-12
1.9 Basic MOSFET channel formation . . . . . . . . . . . . . . . . . . . . . . . . . 1-13
1.10 MOSFET in cutoff mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-14
1.11 MOSFET in nonsaturation mode . . . . . . . . . . . . . . . . . . . . . . . . . . 1-15
1.12 MOSFET in saturation mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-15
1.13 MOSFET geometry used in GCA (MOSFET in linear/nonsaturated region) . . 1-16
1.14 Geometry for GCA current analysis . . . . . . . . . . . . . . . . . . . . . . . . 1-17
1.15 Nonsaturated MOS current . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-18
1.16 Basic MOSFET characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-19
1.17 Start of Saturation in a MOSFET . . . . . . . . . . . . . . . . . . . . . . . . . 1-19
1.18 Channel length modulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-20
1.19 MOSFET characteristics with channel length modulation . . . . . . . . . . . . 1-21
1.20 General MOSFET bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-21
1.21 Body bias effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-22
1.22 Device parameter measurement (a) . . . . . . . . . . . . . . . . . . . . . . . . . 1-22
1.23 Device parameter measurement (b) . . . . . . . . . . . . . . . . . . . . . . . . . 1-23
1.24 Comparison of circuit equations with the complete GCA model . . . . . . . . . 1-24
1.25 Comparison of modified circuit equations with the complete GCA model . . . 1-24
1.26 Depletion-mode MOSFET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-25
1.27 Simplified depletion-mode MOSFET model . . . . . . . . . . . . . . . . . . . . 1-26
1.28 Depletion-mode MOSFET characteristics . . . . . . . . . . . . . . . . . . . . . 1-27
1.29 Square root of saturated depletion-mode MOSFET current . . . . . . . . . . . 1-28
1.30 p-channel MOSFET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-28
1.31 Ideal inverter properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-32
1.32 Basic nMOS inverter structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-33
1.33 Voltage transfer curve of an nMOS inverter . . . . . . . . . . . . . . . . . . . . 1-34
1.34 Definition: Noise margins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-35
1.35 Base for NM definitions: cascaded inverter stages . . . . . . . . . . . . . . . . . 1-35
1.36 Model for transmission network problem . . . . . . . . . . . . . . . . . . . . . . 1-36
1.37 Simplified AC circuit model for noise margins . . . . . . . . . . . . . . . . . . . 1-37
1.38 Inverter transient response definitions . . . . . . . . . . . . . . . . . . . . . . . 1-38
1.39 Physical reason for transition times . . . . . . . . . . . . . . . . . . . . . . . . . 1-39
1.40 Inverter with linear resistor load . . . . . . . . . . . . . . . . . . . . . . . . . . 1-40
1.41 VTC for linear resistor load nMOS inverter . . . . . . . . . . . . . . . . . . . . 1-41
1.42 VOH resistor model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-42
1.43 VOL resistor model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-43
1.44 Saturated enhancement load nMOS inverter . . . . . . . . . . . . . . . . . . . . 1-44
1.45 VTC for saturated enhancement load nMOS inverter . . . . . . . . . . . . . . . 1-45
1.46 Nonsaturated enhancement load nMOS inverter . . . . . . . . . . . . . . . . . . 1-46
1.47 VTC for nonsaturated enhancement load nMOS inverter . . . . . . . . . . . . . 1-47
1.48 Symbol for depletion mode MOSFET . . . . . . . . . . . . . . . . . . . . . . . 1-47
1.49 Depletion mode MOSFET load . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-48
1.50 VTC for inverter with depletion mode MOSFET load . . . . . . . . . . . . . . 1-49
1.51 Driver-load ratio for depletion-load inverter . . . . . . . . . . . . . . . . . . . . 1-50
1.52 βR for various VOL choices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-58
1.53 Basic CMOS inverter structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-59
1.54 CMOS inverter characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-60
1.55 Output high to low time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-61
1.56 Rise time circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-62
1.57 Depletion load rise time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-62
1.58 Propagation delay time definitions . . . . . . . . . . . . . . . . . . . . . . . . . 1-63
1.59 CMOS transient analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-64
1.60 PDP: input signal waveforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-65
1.61 PDP for inverter with resistor load . . . . . . . . . . . . . . . . . . . . . . . . . 1-66
1.62 Power supply currents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-68
1.63 Capacitances: basic MOSFET structure . . . . . . . . . . . . . . . . . . . . . . 1-70
1.64 MOSFET capacitor model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-71
1.65 MOSFET gate capacitances in the three operational regions . . . . . . . . . . . 1-72
1.66 Gate capacitances as functions of gate-source voltage . . . . . . . . . . . . . . . 1-73
1.67 Expanded view of an n+ drain or source region for computing depletion capacitances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-73
1.68 Approximation used for Cout in cascaded nMOS inverters . . . . . . . . . . . . 1-75
1.69 Simplified interconnect scheme for line capacitance . . . . . . . . . . . . . . . . 1-75
1.70 Capacitance calculation for FO = 3 . . . . . . . . . . . . . . . . . . . . . . . . . 1-76
1.71 Approximation used for Cout in cascaded CMOS inverters . . . . . . . . . . . . 1-77
1.72 CMOS process flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-79
1.73 Latch-up in n-tub CMOS inverter . . . . . . . . . . . . . . . . . . . . . . . . . . 1-80
1.74 Guard rings for latch-up prevention . . . . . . . . . . . . . . . . . . . . . . . . . 1-81
2.1 Example for random logic: adder . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
2.2 Complex gate logic primitive: CMOS inverter . . . . . . . . . . . . . . . . . . . 2-2
2.3 MOS transistors viewed as switches . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
2.4 A complementary switch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
2.5 Example for regular design: gate-matrix layout . . . . . . . . . . . . . . . . . . 2-4
2.6 nMOS 2-input NOR gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
2.7 nMOS N-input NOR gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
2.8 nMOS 2-input NAND gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
2.9 Example of a complex nMOS circuit . . . . . . . . . . . . . . . . . . . . . . . . 2-7
2.10 Evolution of an nMOS XOR circuit . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
2.11 Direct NOT XOR complex gate implementation . . . . . . . . . . . . . . . . . . 2-9
2.12 CMOS NAND gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-10
2.13 CMOS NAND gate layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-11
2.14 CMOS NOR gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
2.15 General CMOS static logic gate . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13
2.16 CMOS complex gate construction . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
2.17 Systematic function construction . . . . . . . . . . . . . . . . . . . . . . . . . . 2-18
2.18 Combinational adder schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-21
2.19 Combinational adder layout possibilities for one adder circuit . . . . . . . . . . 2-22
2.20 Pseudo nMOS logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-23
2.21 Pass transistor logic model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-24
2.22 Pass transistor structure for NXOR function . . . . . . . . . . . . . . . . . . . . 2-24
2.23 Pass transistor charging characteristics . . . . . . . . . . . . . . . . . . . . . . . 2-25
2.24 Pass transistor discharge characteristics . . . . . . . . . . . . . . . . . . . . . . 2-27
2.25 nMOS pass characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-28
2.26 CMOS transmission gate symbols . . . . . . . . . . . . . . . . . . . . . . . . . . 2-28
2.27 CMOS transmission gate realisation . . . . . . . . . . . . . . . . . . . . . . . . 2-29
2.28 pMOS pass transistor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-29
2.29 pMOS pass characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-30
2.30 CMOS transmission gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-30
2.31 MOSFET operational states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-31
2.32 Transmission gate: resistor switch model . . . . . . . . . . . . . . . . . . . . . . 2-31
2.33 Transmission gate: RC switch logic transfer . . . . . . . . . . . . . . . . . . . . 2-32
2.34 Transmission gate: equivalent resistances . . . . . . . . . . . . . . . . . . . . . 2-32
2.35 Transmission gate: basic layout . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-33
2.36 Transmission gate logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-34
2.37 TG-logic: 2-input path selector . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-34
2.38 TG-logic: OR gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-35
2.39 TG-logic: XOR and equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-36
2.40 TG-logic: alternate equivalence logic circuit . . . . . . . . . . . . . . . . . . . . 2-36
2.41 Half adder logic symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-37
2.42 TG-logic: Half adder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-37
2.43 TG-logic: Full adder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-38
2.44 Multiplex/Demultiplex operations . . . . . . . . . . . . . . . . . . . . . . . . . 2-39
2.45 TG-logic: 4-to-1 multiplexer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-39
2.46 TG-logic: Split-Array MUX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-40
2.47 Pass transistor logic with pMOS pull-up . . . . . . . . . . . . . . . . . . . . . . 2-40
3.1 Ideal nonoverlapping 2-phase clocks . . . . . . . . . . . . . . . . . . . . . . . . 3-1
3.2 Basic 2-phase clocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3.3 Single clock 2-phase timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3.4 Generation of inverted clock phase . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
3.5 TG delay circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
3.6 Pseudo 2-φ clocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
3.7 Shift register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5
3.8 Clocked shift register circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5
3.9 Leakage path in a CMOS TG . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
3.10 Charge leakage problem in CMOS TG . . . . . . . . . . . . . . . . . . . . . . . 3-7
3.11 Charge leakage circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
3.12 Transmission gate capacitance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
3.13 Basic charge sharing circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
3.14 Transient voltage behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11
3.15 Basic dynamic nMOS inverter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
3.16 Dynamic nMOS inverter: precharge and evaluate . . . . . . . . . . . . . . . . . 3-13
3.17 Precharge network for worst case . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
3.18 Evaluation discharge network . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
3.19 Basic dynamic pMOS inverter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
3.20 Complex dynamic logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
3.21 Cascaded nMOS-nMOS glitch problem . . . . . . . . . . . . . . . . . . . . . . . 3-17
3.22 Dynamic cascades . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-17
3.23 Basic domino logic circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-18
3.24 Domino AND gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-19
3.25 Cascaded domino logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-19
3.26 Visualization of domino effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-19
3.27 Domino timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-20
3.28 Cascaded domino circuit with fanout = 2 . . . . . . . . . . . . . . . . . . . . . 3-21
3.29 Cascaded domino logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-22
3.30 Domino AND4 gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-23
3.31 Domino stage with pull-up MOSFET . . . . . . . . . . . . . . . . . . . . . . . . 3-24
3.32 Charge sharing in a domino chain . . . . . . . . . . . . . . . . . . . . . . . . . . 3-25
3.33 Use of feedback to control a pull-up MOSFET for charge sharing problem . . . 3-25
3.34 Signal race problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26
3.35 Clock skew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-27
3.36 NORA structuring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-28
3.37 NORA φ and φ̄ sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-29
3.38 C2MOS latch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-30
3.39 NORA pipelined logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-30
3.40 Connection of components for a simple CMOS flip-flop . . . . . . . . . . . . . . 3-31
3.41 Physical Construction of a CMOS flip-flop . . . . . . . . . . . . . . . . . . . . . 3-32
3.42 Pseudo 2-phase clocking: (a) waveforms and simple latch, (b) clock skew, and (c) slow clock edges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-33
3.43 Pseudo 2-phase latches (note the charge redistribution problem in (b)) . . . . . 3-34
3.44 Pseudo 2-phase latch layouts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-35
3.45 Shift register array layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-36
3.46 Reduced transistor count latch . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-37
3.47 Reduced transistor count latch with high impedance sustainer transistor . . . . 3-37
3.48 Dynamic D-Latches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-38
3.49 Pseudo 2-phase dynamic logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-39
3.50 Pseudo 2-phase domino logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-40
3.51 2-phase flip-flop and skew reduction . . . . . . . . . . . . . . . . . . . . . . . . 3-41
3.52 Chain latch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-42
3.53 2-phase static flip-flops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-43
3.54 2-phase static D flip-flops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-44
3.55 2-phase static D flip-flops (continued) . . . . . . . . . . . . . . . . . . . . . . . 3-45
3.56 2-phase D flip-flops layouts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-46
3.57 Static D flip-flop with set and reset . . . . . . . . . . . . . . . . . . . . . . . . . 3-47
4.1 Basic LOCOS MOSFET structure. . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
4.2 MOSFET capacitor model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
4.3 Expanded view of an n+ drain or source region for computing depletion capacitances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
4.4 Representation of long wire in terms of distributed RC sections . . . . . . . . . 4-6
4.5 Segmentation of polysilicon line . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
4.6 Simple model for RC delay calculation . . . . . . . . . . . . . . . . . . . . . . . . 4-7
4.7 CMOS inverter pair timing response . . . . . . . . . . . . . . . . . . . . . . . . 4-9
4.8 Waveforms for determination of dynamic power dissipation . . . . . . . . . . . 4-11
4.9 Input voltage waveforms for the power-delay products . . . . . . . . . . . . . . 4-12
4.10 Power-delay product in a resistively loaded inverter. . . . . . . . . . . . . . . . 4-13
4.11 Current waveforms for the power-delay product calculations. . . . . . . . . . . 4-14
4.12 Layout pattern for VDD and VSS lines. . . . . . . . . . . . . . . . . . . . . . . . 4-19
4.13 Pseudo 2-Phase Clocking Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-20
4.14 Pseudo 2-Phase Overlap Times . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-20
4.15 Clock Skew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-21
4.16 Clock Generator With a TG Delay . . . . . . . . . . . . . . . . . . . . . . . . . 4-22
4.17 Latch-Based Clock Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-22
4.18 Clock Skew Due to Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-23
4.19 Clock Line Capacitance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-24
4.20 Clock Line Capacitance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-24
4.21 Input Protection Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-25
4.22 Thin Oxide MOSFET Protection Circuit . . . . . . . . . . . . . . . . . . . . . . 4-26
4.23 Capacitive Loading Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-26
4.24 Inverter Sizing Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-27
4.25 Double-Inverter Off-Chip Driver Circuit . . . . . . . . . . . . . . . . . . . . . . 4-30
4.26 Tri-State Output Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-31
4.27 Bi-Directional I/O Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-32
5.1 Czochralski process for manufacturing silicon ingots . . . . . . . . . . . . . . . 5-2
5.2 The n-Well Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
5.3 The Active Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
5.4 The Poly Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
5.5 The n+ Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
5.6 The p+ Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
5.7 The Contact Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
5.8 The Metalisation Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
5.9 An Example of a p-Well CMOS Process . . . . . . . . . . . . . . . . . . . . . . 5-7
5.10 continued . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8
5.11 Twin-tub process cross-section and layout of an inverter . . . . . . . . . . . . . 5-9
5.12 LOCOS Isolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
5.13 Encroachment in LOCOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11
5.14 Trench Isolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
5.15 Trench Capacitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13
5.16 Origin of CMOS Latchup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13
5.17 Trench-isolated CMOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
5.18 Active Area Encroachment in LOCOS . . . . . . . . . . . . . . . . . . . . . . . 5-21
5.19 Effective Channel Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-22
5.20 Design-mask transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-23
5.21 Contact Cuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-23
5.22 A region with eight terminals has 28 interconnection resistances. Making the cross-hatched junctions into new nodes splits the region into 10 electrically isolated regions and reduces the number of interconnection resistances to 10 . . . 5-25
5.23 Design-mask transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-28
5.24 General Layout Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-29
5.25 Complementary Transistor/Logic Blocks . . . . . . . . . . . . . . . . . . . . . . 5-29
5.26 Equivalent Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-30
5.27 Guard Ring Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-31
5.28 Complement Static Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-32
5.29 Transmission Gate Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-33
5.30 Layout of an inverter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-34
5.31 Layout of a 2-input NAND gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-35
5.32 Layout of a 2-input NOR gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-35
5.33 Layout of an EXOR gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-36
5.34 Layout of a RAM cell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-36
5.35 Layout of a pad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-37
5.36 Layout of an RS-latch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-37
5.37 Layout of a D-latch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-38
5.38 Layout of a comparator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-38
5.39 Layout of a 1-bit full adder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-39
6.1 Continuous growth in DRAM complexity and size places little demand on package size and number of I/Os . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
6.2 Comparison of I/O requirements for DRAM, logic and microprocessor devices . 6-3
6.3 Examples of packages and PWB mounting techniques: (a) TH: Dual-in-line (DIL) package. (b) TH: Pin-grid-array (PGA) package. (c) SM: "J"-leaded packages, leaded chip carrier or small-outline. (d) SM: Gull-wing-leaded packages, chip-carrier or small-outline. (e) SM: Butt-leaded package, small-outline dual-in-line type. (f) Leadless type, ceramic chip carrier mounted to a matching ceramic substrate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
6.4 IC package types as a function of I/Os and attachment type . . . . . . . . . . . 6-5
6.5 Package history . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-6
6.6 Comparison: 24-pin SO package and 48-pin SSO package . . . . . . . . . . . . 6-6
6.7 Bonding-pad pitch versus chip lead count for several chip sizes . . . . . . . . . 6-7
6.8 Arrangement of staggered bonding pads: → lower pitch than with single line of bonding pads. (a) Bonding pads size and spacing. (b) Maximum wire angle with respect to die edge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
6.9 CAD template for positioning bonding pads (assures that wire span length meets the design rules) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-8
6.10 CAD template for checking adherence to wire-span guidelines. The template also provides an extended zone (beyond the optimum shown in Fig. 6.9) for cases where location in optimum zone is not compatible with the device layout. 6-8
6.11 CAD template for checking the maximum distance that wire spans over silicon. Here: violation of the guidelines. The circle must be at minimum tangent to the step-and-repeat centerline (case of maximum distance) or cross it . . . . . . 6-9
6.12 Lead inductances for various package sizes . . . . . . . . . . . . . . . . . . . . . 6-10
6.13 TCE of materials for semiconductor devices, (C) . . . . . . . . . . . . . . . . . 6-11
6.14 Plastic package: composite structure consisting of silicon chip, metal leadframe and plastic moulding compound . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
6.15 Generic assembly sequence for plastic and ceramic packages . . . . . . . . . . . 6-13
6.16 Eutectic die bonding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-14
6.17 Epoxy die bonding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-15
6.18 Tailless ball-and-wedge bonding cycle . . . . . . . . . . . . . . . . . . . . . . . . 6-16
6.19 Thermosonic ball wire bonds on a gate array VLSI chip . . . . . . . . . . . . . 6-17
6.20 Process sequence to create a laminated refractory-ceramic product from a ceramic slurry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-18
6.21 Cross-sectional sketches of several package types . . . . . . . . . . . . . . . . . 6-18
6.22 Structures of CERDIP and quad CERPAC . . . . . . . . . . . . . . . . . . . . 6-19
6.23 Ball-and-wedge-bonded silicon die in a plastic DIP . . . . . . . . . . . . . . . . 6-20
6.24 Molding processing system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-21
6.25 IC package market share . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-22
6.26 Worldwide IC package market share by material . . . . . . . . . . . . . . . . . 6-22
6.27 Pin count versus usable gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-23
6.28 Plastic IC package material costs . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24
6.29 Ceramic IC package material costs . . . . . . . . . . . . . . . . . . . . . . . . . 6-25
6.30 MCM: microprocessor performance . . . . . . . . . . . . . . . . . . . . . . . . . 6-26
7.1 Cell orientations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
7.2 Full custom layout (hand crafted or generated from a stick diagram or a layout description) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-4
7.3 Corresponding geometrical specification file and schematic diagram . . . . . . . 7-4
7.4 Memory cell schematic and corresponding stick diagram . . . . . . . . . . . . . 7-5
7.5 Full Custom Design Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7.6 Standard Cell Design Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8
7.7 Example of a design rules set checked during design verification . . . . . . . . . 7-10
7.8 Competing drivers at a bus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-15
7.9 Example: compiler driven simulation . . . . . . . . . . . . . . . . . . . . . . . . 7-16
8.1 NOR gate reduction for Weinberger structuring . . . . . . . . . . . . . . . . . . 8-2
8.2 Weinberger structuring for 3-to-8 decoder . . . . . . . . . . . . . . . . . . . . . 8-3
8.3 Weinberger structuring for 3-to-8 decoder (continued) . . . . . . . . . . . . . . 8-4
8.4 Function representation in random logic . . . . . . . . . . . . . . . . . . . . . . 8-5
8.5 Weinberger NOR array representation . . . . . . . . . . . . . . . . . . . . . . . 8-5
8.6 Weinberger stick diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-5
8.7 Weinberger array structure: (a) schematic (b) layout . . . . . . . . . . . . . . . 8-6
8.8 Gate matrix layout: (a) schematic (b) layout (c) optimized layout of n part . . 8-8
8.9 Half adder NAND/INV representation . . . . . . . . . . . . . . . . . . . . . . . 8-9
8.10 Half adder realizations: (a) standard cell (b) gate matrix . . . . . . . . . . . . . 8-10
8.11 Typical gate matrix layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-11
8.12 Gate matrix row and column spacings . . . . . . . . . . . . . . . . . . . . . . . 8-12
8.13 (a) CMOS complex gate schematic and (b) corresponding layout . . . . . . . . 8-14
8.14 Implementation of an EXOR function: (a) Logic diagram. (b) Circuit. (c) Layout 8-15
8.15 Example of row-based layout scheme . . . . . . . . . . . . . . . . . . . . . . . . 8-16
8.16 Alternative complex gate implementation of EXOR function: (a) Logic diagram. (b) Circuit. (c) Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
8.17 Basic layout of the functional cell: (a) Logic diagram. (b) Circuit. (c) Graph model. (d) Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-18
8.18 Layout optimization: (a) Diffusion connection of adjacent transistors. (b) Optimal arrangement (reordered input lines) . . . . . . . . . . . . . . . . . . . . . 8-19
8.19 Alternative optimal circuit layout: (a) Logic diagram. (b) Circuit. (c) Graph model. (d) Optimal layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
8.20 Reduction of odd numbers of edges . . . . . . . . . . . . . . . . . . . . . . . . . 8-22
8.21 Application of reduction rule: (a) Logic diagram. (b) Graph model and its reduction. (c) Reconstruction of an Euler path . . . . . . . . . . . . . . . . . . 8-23
8.22 Application of the heuristic algorithm: (a) New inputs p1 and p2 are added. (b) Optimal sequence of inputs without the interlace of p1 or p2. (c) Circuit with the dual path p1,2,3,1,4,5,p2 . . . . . . . . . . . . . . . . . . . . . . . . 8-24
8.23 Minimal interlace algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-25
8.24 Application example for minimal interlace algorithm . . . . . . . . . . . . . . . 8-26
8.25 Carry look-ahead circuit (this representation has no Euler path) . . . . . . . . 8-27
8.26 Alternative topology for carry look-ahead circuit (with possibility of constructing an Euler path) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
8.27 Comparison of space: (a) Functional cell realization. (b) Conventional NAND realization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-29
8.28 Standard cell architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-30
8.29 Synchronous counter schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-30
8.30 Synchronous counter floorplan using standard cells . . . . . . . . . . . . . . . . 8-31
8.31 AND-OR-PLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-32
8.32 Programmable logic approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-33
8.33 PLA realization for given example . . . . . . . . . . . . . . . . . . . . . . . . . 8-34
8.34 PLA generic floor plan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
8.35 NOR-NOR PLA structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
8.36 Pseudo nMOS NOR-NOR PLA circuit . . . . . . . . . . . . . . . . . . . . . . . 8-37
8.37 PLA implementation in pseudo nMOS logic . . . . . . . . . . . . . . . . . . . . 8-37
8.38 Stick diagram of nMOS PLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-38
8.39 PLA NAND-INV-INV-NAND implementation . . . . . . . . . . . . . . . . . . . 8-39
8.40 CMOS PLA layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-40
8.41 Dynamic 2-phase PLA circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-41
8.42 Noise problem in dynamic PLAs . . . . . . . . . . . . . . . . . . . . . . . . . . 8-42
8.43 Multiple sided input/output access . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
8.44 PLA before folding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
8.45 Row-folded PLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
8.46 Column-folded PLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-44
8.47 Automatic PLA layout generation . . . . . . . . . . . . . . . . . . . . . . . . . 8-45
8.48 Datapath and controller block . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
8.49 Sequential circuit example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
8.50 State-transition diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
8.51 State-transition diagram for the divide-by-5 counter . . . . . . . . . . . . . . . 8-52
8.52 State-transition diagram of an arbitrary FSM . . . . . . . . . . . . . . . . . . . 8-52
8.53 FSM-realization for first encoding scheme . . . . . . . . . . . . . . . . . . . . . 8-54
8.54 FSM-realization for second encoding scheme . . . . . . . . . . . . . . . . . . . . 8-55
8.55 FSM (Moore automata) implementation . . . . . . . . . . . . . . . . . . . . . . 8-55
8.56 Treatment of asynchronous inputs in a Moore machine . . . . . . . . . . . . . . 8-56
8.57 Five-state FSM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-58
8.58 Reduced equivalent FSM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-59
8.59 Example FSM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-60
8.60 Nondeterministic FSM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-60
8.61 NFSM that recognizes strings of form (A | (AB))∗ . . . . . . . . . . . . . . . . 8-61
9.1 Gate array floorplan with row structure . . . . . . . . . . . . . . . . . . . . . . 9-6
9.2 Floorplan for a sea of gates array . . . . . . . . . . . . . . . . . . . . . . . . . . 9-7
9.3 IMI gate array structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-8
9.4 Corner of IMI gate array die . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-9
9.5 Grid representation of IMI gate array . . . . . . . . . . . . . . . . . . . . . . . 9-10
9.6 Explanations of grid: (a) basic cell. (b) internal interconnects. (c) basic cell and crossover (poly) block. (d) XR = transistor. (e) crossover block interconnects 9-11
9.7 Symbolic IMI cell structure representation . . . . . . . . . . . . . . . . . . . . . 9-12
9.8 CMOS matrixcell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-12
9.9 CDI single metal layer gate array structure . . . . . . . . . . . . . . . . . . . . 9-13
9.10 Gate array design flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-14
9.11 Personalization for inverter: (a) schematic. (b),(c) IMI layout. (d) CDI layout . 9-15
9.12 NOR gate on IMI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-16
9.13 Layout of transmission gates: (a) single TG. (b) pair of TGs with common output 9-16
9.14 Gate array market by process technology . . . . . . . . . . . . . . . . . . . . . 9-18
9.15 Worldwide gate array market by user sector . . . . . . . . . . . . . . . . . . . . 9-18
9.16 Circuit and corresponding standard cell . . . . . . . . . . . . . . . . . . . . . . 9-19
9.17 Standard cell scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-20
9.18 Standard cell floorplan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-20
9.19 Standard cell market by process technology . . . . . . . . . . . . . . . . . . . . 9-22
9.20 Standard cell market by application . . . . . . . . . . . . . . . . . . . . . . . . 9-22
9.21 Floor plan for macro cell design style (= building block approach) . . . . . . . 9-23
9.22 Mixed design style structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-24
9.23 Combinational PAL devices: AMD 16L2 . . . . . . . . . . . . . . . . . . . . . . 9-26
9.24 Sequential PAL devices: AMD PAL16R4 . . . . . . . . . . . . . . . . . . . . . . 9-27
9.25 Arithmetic PAL devices: AMD PAL16A4 . . . . . . . . . . . . . . . . . . . . . 9-28
9.26 Advanced PLD devices: Altera EP1800 . . . . . . . . . . . . . . . . . . . . . . 9-29
9.27 Local macro cell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-30
9.28 Global macro cell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-30
9.29 Synchronous clock, output enabled by product term . . . . . . . . . . . . . . . 9-31
9.30 Asynchronous clock, output permanently enabled . . . . . . . . . . . . . . . . . 9-31
9.31 Block diagram of MAX7000 family . . . . . . . . . . . . . . . . . . . . . . . . . 9-32
9.32 MAX7000 macrocell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-33
9.33 Principal FPGA structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-34
9.34 Four classes of commercially available FPGAs . . . . . . . . . . . . . . . . . . . 9-35
9.35 SRAM programming technology . . . . . . . . . . . . . . . . . . . . . . . . . . 9-36
9.36 Actel PLICE anti-fuse structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-37
9.37 Quicklogic ViaLink Anti-Fuse . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-37
9.38 EEPROM programming technology . . . . . . . . . . . . . . . . . . . . . . . . . 9-38
9.39 General architecture of XILINX FPGAs . . . . . . . . . . . . . . . . . . . . . . 9-39
9.40 Xilinx XC4000 CLB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-39
9.41 Xilinx XC4000 single length lines . . . . . . . . . . . . . . . . . . . . . . . . . . 9-40
9.42 Xilinx XC4000 double length lines and long lines . . . . . . . . . . . . . . . . . 9-40
9.43 General architecture of Actel FPGAs . . . . . . . . . . . . . . . . . . . . . . . . 9-41
9.44 Act-1 logic module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-41
9.45 Act-1 programmable interconnection architecture . . . . . . . . . . . . . . . . . 9-42
9.46 Act-2 logic cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-43
9.47 FPGA CAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-44
9.48 The Xilinx design flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-45
9.49 Cost per Chip (Dollars) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-46
9.50 Logic design alternatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-47
9.51 Relative merits of various ASIC implementation styles . . . . . . . . . . . . . . 9-48
10.1 Serial adder principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
10.2 Ripple carry adder principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3
10.3 Carry lookahead adder for 4 bits . . . . . . . . . . . . . . . . . . . . . . . . . . 10-3
10.4 Clustered carry lookahead adder for 16 bits . . . . . . . . . . . . . . . . . . . . 10-4
10.5 Carry select adder for 16 bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
10.6 Carry save adder for summation of 4 operands (V, W, X, Y) . . . . . . . . . . 10-5
10.7 Structure of SAA multipliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-6
10.8 Structure of CSM multipliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-7
10.9 Architecture of the block multiplier . . . . . . . . . . . . . . . . . . . . . . . . . 10-8
11.1 Microarchitecture blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-1
11.2 Datapath example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
11.3 Corresponding layout scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.4 2901 4-bit ALU slice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.5 2901 µ-OPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-3
11.6 16-bit bit-sliced ALU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-4
11.7 Basic controller structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-5
11.8 ROM based controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6
11.9 PLA based controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-6
11.10 Horizontal microinstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7
11.11 Vertical microinstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-7
11.12 A microcode/nanocode controller . . . . . . . . . . . . . . . . . . . . . . . . . . 11-8
12.1 Flip-flop driving clock input of another Flip-flop . . . . . . . . . . . . . . . . . 12-1
12.2 Gated clock line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12.3 Double-edged clocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
12.4 Flip-flop driving asynchronous reset of another Flip-flop . . . . . . . . . . . . . 12-2
12.5 Unequal depth of clock buffering . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
12.6 Unbalanced fanout of clock buffers . . . . . . . . . . . . . . . . . . . . . . . . . 12-4
12.7 Balanced clock tree buffering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-5
12.8 Combined geometric/tree buffering . . . . . . . . . . . . . . . . . . . . . . . . . 12-6
12.9 Multiplexer on clock line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.10 Enabled (E-type) flip-flop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.11 Toggle (T-type) flip-flop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-7
12.12 Pipelined logic with double-edged clocking . . . . . . . . . . . . . . . . . . . . . 12-8
12.13 Pipelined logic with single-edged clocking . . . . . . . . . . . . . . . . . . . . . 12-8
12.14 Flip-flop driving asynchronous reset of another flip-flop . . . . . . . . . . . . . . 12-9
12.15 Global asynchronous reset by external signal . . . . . . . . . . . . . . . . . . . 12-9
12.16 Flip-flop driving synchronous reset of flip-flop . . . . . . . . . . . . . . . . . . . 12-9
12.17 Shift register with forward chain of clock buffers . . . . . . . . . . . . . . . . . 12-10
12.18 Shift register with balanced tree of clock buffers . . . . . . . . . . . . . . . . . . 12-10
12.19 Series D-type flip-flops for capturing asynchronous input . . . . . . . . . . . . . 12-11
12.20 4-bit register used as shift register to capture an asynchronous input . . . . . . 12-11
12.21 Asynchronous handshake circuit . . . . . . . . . . . . . . . . . . . . . . . . . . 12-12
12.22 Operation of asynchronous handshake circuit . . . . . . . . . . . . . . . . . . . 12-13
12.23 Monostable pulse generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-14
12.24 Pulse generator using flip-flop . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-14
12.25 Multivibrator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-14
12.26 Synchronous pulse generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-15
12.27 Bistable storing element formed by cross-coupled NAND gates . . . . . . . . . 12-15
12.28 Bistable storing element formed by cross-coupled NOR gates . . . . . . . . . . 12-15
12.29 Asynchronous RS flip-flop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-16
12.30 Latch configured as RS flip-flop . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-16
12.31 ME and WEbar RAM/DPRAM timing scheme . . . . . . . . . . . . . . . . . . 12-17
12.32 Interfacing RAM into synchronous circuit: ME and WEbar generation . . . . . 12-17
12.33 Using flip-flop for WEbar generation: timing scheme . . . . . . . . . . . . . . . 12-18
12.34 Avoiding floating RAM/DPRAM output propagation . . . . . . . . . . . . . . 12-18
12.35 Tristate bus with non-central enable control . . . . . . . . . . . . . . . . . . . . 12-19
12.36 Tristate bus with central control of tristate enables and additional driver activated on non-controlled states . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-20
12.37 Wired-OR part used to create higher fanout . . . . . . . . . . . . . . . . . . . . 12-21
12.38 High-fanout buffer replacing wired OR part . . . . . . . . . . . . . . . . . . . . 12-21
12.39 Excessive fanout on control signal . . . . . . . . . . . . . . . . . . . . . . . . . . 12-22
12.40 Geometric buffering on control signal . . . . . . . . . . . . . . . . . . . . . . . . 12-23
12.41 Tree buffering on control signal . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-24
12.42 4-input AND gate and 2-input NAND/NOR equivalent . . . . . . . . . . . . . 12-25
12.43 Multiplexer using AOI logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-25
12.44 Late changing input fed late into combinational logic . . . . . . . . . . . . . . . 12-25
12.45 4-stage Johnson counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-26
12.46 Using duplicate logic for reducing fanout . . . . . . . . . . . . . . . . . . . . . . 12-26
12.47 Circuit with inaccessible internal logic: only first block is controllable and only last block is directly observable . . . . . . . . . . . . . . . . . . . . . . . . . . 12-27
12.48 Chain of counters: first counter is not directly observable and second counter is not directly controllable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-27
12.49 Counter with closed feedback loop: initial state not known . . . . . . . . . . . . 12-28
12.50 Circuit with test inputs and outputs . . . . . . . . . . . . . . . . . . . . . . . . 12-29
12.51 Chain of counters broken by test input and output signals . . . . . . . . . . . . 12-29
12.52 Counter with feedback loop opened by test control and output signals . . . . . 12-30
12.53 Compiled megacell with compiled inputs/outputs . . . . . . . . . . . . . . . . . 12-30
12.54 E-type scan path flip-flop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-31
12.55 Circuit with scan path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-31
12.56 JTAG test circuitry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-32
13.1 Defect level as function of yield and fault coverage . . . . . . . . . . . . . . . . 13-4
13.2 A typical synthesis flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5
13.3 Relationship between faults, errors and failures . . . . . . . . . . . . . . . . . . 13-6
13.4 Three-universe model of a system . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
13.5 Examples for physical faults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-7
13.6 Fault detection by duplication with complementary logic . . . . . . . . . . . . . 13-12
13.7 4-by-4 array with one spare column . . . . . . . . . . . . . . . . . . . . . . . . . 13-13
13.8 Reconfigured array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-14
13.9 Basic concept of D-algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-16
13.10 Primitive D-cube of fault (pdcf) for two-input NAND gate . . . . . . . . . . . . 13-16
13.11 Propagation-D-cube (pdc) for two-input NAND gate . . . . . . . . . . . . . . . 13-17
13.12 Singular cover for two-input NAND gate . . . . . . . . . . . . . . . . . . . . . . 13-17
13.13 Singular covers for several basic logic gates . . . . . . . . . . . . . . . . . . . . 13-18
13.14 Construction of the singular cover of a logic module . . . . . . . . . . . . . . . 13-19
13.15 Example circuit illustrating D-algorithm . . . . . . . . . . . . . . . . . . . . . . 13-20
13.16 Serial fault simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-24
13.17 Design for testability: complex gate (a) not testable with stuck-at model. (b) fully testable with stuck-at model . . . . . . . . . . . . . . . . . . . . . . . . . 13-25
13.18 Testability: ad-hoc techniques (partitioning for testability) . . . . . . . . . . . . 13-26
13.19 Testability: ad-hoc techniques (a) insertion of register in order to limit logic depth to a given maximum value. (b) test shift registers for PLA test (increasing PLA area). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-27
13.20 Feedback logic with scanpath . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-28
13.21 Examples for built-in test pattern generators . . . . . . . . . . . . . . . . . . . 13-30
13.22 Pseudo random pattern generator . . . . . . . . . . . . . . . . . . . . . . . . . . 13-31
13.23 Example for pseudo random pattern generator . . . . . . . . . . . . . . . . . . 13-32
13.24 Counting techniques for test data evaluation . . . . . . . . . . . . . . . . . . . . 13-33
13.25 Test data evaluation by signature analysis . . . . . . . . . . . . . . . . . . . . . 13-34
13.26 Parallel signature register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-36
13.27 BILBO registers: 1. full circuit 2. normal use 3. scan-path use 4. signature analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-37
13.28 Example: self-testing circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-38
14.1 In-circuit test using bed-of-nails . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-2
14.2 Functional test using board connector . . . . . . . . . . . . . . . . . . . . . . . 14-2
14.3 Combined use of in-circuit and functional test . . . . . . . . . . . . . . . . . . . 14-3
14.4 Scan design at the board level . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-4
14.5 Testing for interconnection faults . . . . . . . . . . . . . . . . . . . . . . . . . . 14-5
14.6 Testing on-chip logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-6
14.7 IEEE Std 1149.1 test logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-7
14.8 Test data registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-8
14.9 Serial connection of IEEE Std 1149.1-compatible ICs . . . . . . . . . . . . . . . 14-9
14.10 Parallel connection of IEEE Std 1149.1-compatible ICs . . . . . . . . . . . . . . 14-9
14.11 Use of bus master chip to control IEEE Std 1149.1 chips . . . . . . . . . . . . . 14-10
14.12 Daisy-chain connection of instruction registers . . . . . . . . . . . . . . . . . . . 14-11
14.13 Instruction register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-11
14.14 An example instruction register cell (stage) . . . . . . . . . . . . . . . . . . . . 14-12
14.15 Example design for bypass register . . . . . . . . . . . . . . . . . . . . . . . . . 14-13
14.16 Use of bypass register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-13
14.17 Provision of boundary-scan cells . . . . . . . . . . . . . . . . . . . . . . . . . . 14-14
14.18 Basic boundary-scan cell for input pin . . . . . . . . . . . . . . . . . . . . . . . 14-15
14.19 Basic boundary scan cell for output pin . . . . . . . . . . . . . . . . . . . . . . 14-15
15.1 Block diagram of a typical signal processing system . . . . . . . . . . . . . . . . 15-1
15.2 Bandwidths of signals used in signal processing applications . . . . . . . . . . . 15-2
15.3 Signal bandwidths that can be processed by present day (1989) technologies . . 15-2
15.4 Converters in signal processing systems: (a) A/D, (b) D/A . . . . . . . . . . . 15-3
15.5 (a) Conceptual block diagram of a D/A converter, (b) Clocked D/A converter . 15-4
15.6 (a) Sample-and-hold circuit, (b) Waveforms illustrating the operation of the sample-and-hold circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-5
15.7 Block diagram of a D/A converter . . . . . . . . . . . . . . . . . . . . . . . . . 15-5
15.8 Ideal input-output characteristics for a 3-bit D/A converter . . . . . . . . . . . 15-6
15.9 (a) Conceptual illustration of a current-scaling D/A converter, (b) Implementation of (a) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7
15.10 A current-scaling D/A converter using an R-2R ladder . . . . . . . . . . . . . . 15-7
15.11 Illustration of a voltage-scaling D/A converter . . . . . . . . . . . . . . . . . . . 15-8
15.12 Block diagram of a general analog-to-digital converter . . . . . . . . . . . . . . 15-9
15.13 Ideal input-output characteristics for a 3-bit A/D converter . . . . . . . . . . . 15-9
15.14 Example of a successive approximation A/D converter architecture . . . . . . . 15-10
15.15 The successive approximation process . . . . . . . . . . . . . . . . . . . . . . . 15-11
15.16 A 3-bit parallel A/D converter . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-12
15.17 A time-interleaved A/D converter array . . . . . . . . . . . . . . . . . . . . . . 15-13
15.18 Basic structure of a sigma-delta converter . . . . . . . . . . . . . . . . . . . . . 15-14
15.19 First-order sigma-delta modulator block diagram . . . . . . . . . . . . . . . . . 15-14
15.20 Output of first-order sigma-delta modulator . . . . . . . . . . . . . . . . . . . . 15-15
15.21 Frequency domain linearized model of a sigma-delta modulator . . . . . . . . . 15-15
15.22 Noise-shaping filter function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-16
List of Tables
3.1 Static D flip-flop set/reset truth table . . . . . . . . . . . . . . . . . . . . . . . 3-47
4.1 Approximation of intrinsic MOS gate capacitance . . . . . . . . . . . . . . . . 4-4
4.2 Influence of first-order scaling on MOS device characteristics . . . . . . . . . . 4-16
4.3 Influence of scaling on interconnect media . . . . . . . . . . . . . . . . . . . . . 4-18
5.1 CMOS 1.5-Micron Design Rule Example . . . . . . . . . . . . . . . . . . . . . . 5-16
5.2 Basic n-well CMOS Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-20
5.3 Layer capacitances of an n-well CMOS process . . . . . . . . . . . . . . . . . . 5-26
5.4 Layer resistances of an n-well CMOS process . . . . . . . . . . . . . . . . . . . 5-26
7.1 Simplified geometrical specification language . . . . . . . . . . . . . . . . . . . 7-3
7.2 MOS layer definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
7.3 Rotations of geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
8.1 State-transition truth table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-50
8.2 State-transition table for divide-by-5 counter . . . . . . . . . . . . . . . . . . . 8-52
8.3 State-transition truth table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
13.1 Propagation D-cube table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-20
13.2 Singular cover table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-20
13.3 D-cube intersection table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-21
Chapter 1
Basics of CMOS Circuit Design
1.1 pn Junction Properties
Figure 1.1: Step-profile of pn junction
Figure 1.1 shows the profile of a pn junction:
p-type region (x < 0): doping $N_a$ [cm$^{-3}$]
n-type region (x > 0): doping $N_d$ [cm$^{-3}$]
The following analysis is done for the pn junction without external voltage (V = 0).
1.1.1 pn Junction Space Charge Area and Electric Field
Diffusion (a statistical phenomenon) of mobile carriers across the junction leaves ionized dopants behind, so that space charge regions arise. The diffusion is limited by the electric field caused by the space charge (displaced electrons/holes). The relation between the space charge density $\rho(x)$, the depletion electric field $E(x)$ and the potential $\phi(x)$ is given by the Poisson equation
\[
-\frac{d^2\phi(x)}{dx^2} = \frac{dE(x)}{dx} = \frac{\rho(x)}{\varepsilon_{Si}}. \tag{1.1}
\]
$\rho(x)$ is the volume charge density of the ionized dopants; in idealized form it can be written as
\[
\rho(x) =
\begin{cases}
+qN_d & x \in [0, x_n], \\
-qN_a & x \in [-x_p, 0].
\end{cases} \tag{1.2}
\]
The electric field is calculated by integration:
\[
E(x) = \int_{x_0}^{x} \frac{\rho(x')}{\varepsilon_{Si}} \, dx' \tag{1.3}
\]
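Integrating with the boundary conditions $E(x_n) = 0 = E(-x_p)$ (due to V = 0) gives
\[
E(x) =
\begin{cases}
-\dfrac{qN_d}{\varepsilon_{Si}} (x_n - x) & x \in [0, x_n], \\[2ex]
-\dfrac{qN_a}{\varepsilon_{Si}} (x + x_p) & x \in [-x_p, 0].
\end{cases} \tag{1.4}
\]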
The maximum of the field is at x = 0, and E(x = 0) has a magnitude of
\[
E_{max} = \frac{qN_d x_n}{\varepsilon_{Si}} = \frac{qN_a x_p}{\varepsilon_{Si}}. \tag{1.5}
\]
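The piecewise field of Eq. (1.4) and the peak value of Eq. (1.5) can be evaluated numerically. The following is a minimal sketch, not part of the original script: the doping levels, the depletion edge `x_n`, and the relative permittivity of silicon (11.7) are assumptions chosen for illustration, and `x_p` is obtained from the charge balance $N_a x_p = N_d x_n$.

```python
# Sketch of Eqs. (1.4) and (1.5); all numeric values are illustrative assumptions.
Q = 1.602e-19               # elementary charge [C]
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon [F/cm] (eps_r = 11.7 assumed)


def electric_field(x, n_a, n_d, x_p, x_n):
    """Depletion field E(x) in V/cm following Eq. (1.4); zero outside [-x_p, x_n]."""
    if 0.0 <= x <= x_n:
        return -Q * n_d / EPS_SI * (x_n - x)
    if -x_p <= x < 0.0:
        return -Q * n_a / EPS_SI * (x + x_p)
    return 0.0


n_a, n_d = 1e15, 1e17       # assumed doping levels [cm^-3]
x_n = 1e-6                  # assumed n-side depletion edge [cm]
x_p = n_d * x_n / n_a       # charge balance: N_a * x_p = N_d * x_n

e_max = Q * n_d * x_n / EPS_SI   # Eq. (1.5)
print(f"x_p = {x_p:.3e} cm, E_max = {e_max:.3e} V/cm")
print(f"E(0) = {electric_field(0.0, n_a, n_d, x_p, x_n):.3e} V/cm")
```

The field is negative throughout the depletion region (it points from the n-side to the p-side) and reaches its largest magnitude at the metallurgical junction x = 0.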
1.1.2 pn Junction Built-in Potential
The built-in potential is characteristic of the doping. Since $E(x) = -d\phi/dx$ is negative inside the depletion region, it is found to be
\[
\phi_0 = -\int_{-x_p}^{x_n} E(x) \, dx. \tag{1.6}
\]
The built-in potential can be derived from the following equations. The diffusion hole-current density $J_{p,diff}(x)$ is proportional to the gradient of the positive charge carriers and is given by
\[
J_{p,diff}(x) = -q D_p \frac{dp(x)}{dx} \tag{1.7}
\]
where $D_p$ is the diffusion constant for holes and $p(x)$ is the density of holes at x. Diffusion and charge carrier mobility $\mu$ are statistical phenomena, and the relationship between them is given by the Einstein equation
\[
\frac{D_p}{\mu_p} = \frac{D_n}{\mu_n} = V_T = \frac{kT}{q} \tag{1.8}
\]
where k is the Boltzmann constant (in joules per kelvin) and T the temperature (in K). The electric field $E(x)$ in the junction is not zero for all x, which means that a drift current density $J_{drift}$ also exists. For positive charge carriers it is
\[
J_{p,drift}(x) = q \mu_p p(x) E(x) \tag{1.9}
\]
The resulting hole current density is
\[
J_p(x) = q \mu_p p(x) E(x) - q D_p \frac{dp(x)}{dx} \quad \left[ \frac{\mathrm{A}}{\mathrm{m}^2} \right] \tag{1.10}
\]
and, equivalently, for electrons
\[
J_n(x) = q \mu_n n(x) E(x) + q D_n \frac{dn(x)}{dx} \quad \left[ \frac{\mathrm{A}}{\mathrm{m}^2} \right]. \tag{1.11}
\]
Setting $J_p = 0$ (equilibrium condition) and using the Einstein relationship $D_p = \mu_p V_T$ we obtain
\[
E(x) = -\frac{d\phi(x)}{dx} = \frac{V_T}{p(x)} \frac{dp(x)}{dx} \tag{1.12}
\]
and we can calculate the potential as
\[
dV = -V_T \frac{dp(x)}{p(x)}. \tag{1.13}
\]
Integration from $x_1$ (with concentration $p_1$ and potential $V_1$) to a point $x_2$ (with $p_2$ and $V_2$) yields
\[
V_{21} = V_2 - V_1 = V_T \ln \frac{p_1}{p_2}. \tag{1.14}
\]
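The step from Eq. (1.13) to Eq. (1.14) can be written out explicitly. The lines below are a worked version of that integration, reading $V_{21}$ as the potential difference $V_2 - V_1$ between the two points (an interpretation of the notation, not stated in the text):

```latex
\begin{align*}
\int_{V_1}^{V_2} dV &= -V_T \int_{p_1}^{p_2} \frac{dp}{p} \\
V_2 - V_1 &= -V_T \left( \ln p_2 - \ln p_1 \right) \\
V_{21} &= V_T \ln \frac{p_1}{p_2}
\end{align*}
```

A lower hole concentration at $x_2$ than at $x_1$ therefore corresponds to a higher potential at $x_2$, which is the situation on the n-side of the junction.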
For the built-in potential $\phi_0$ we obtain
\[
\phi_0 = V_T \ln \frac{p(-x_p)}{p(x_n)}. \tag{1.15}
\]
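With
\[
p(-x_p) = N_a \tag{1.16}
\]
and
\[
np = n_i^2 \tag{1.17}
\]
\[
\Longrightarrow \quad p(x_n) = \frac{n_i^2}{N_d} \tag{1.18}
\]
we get the final expression for $\phi_0$:
\[
\phi_0 = V_T \ln \frac{N_a N_d}{n_i^2}. \tag{1.19}
\]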
Note: Equation 1.17 is valid independent of the amount of donor and acceptor impurity doping.
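As a numerical illustration of Eqs. (1.8) and (1.19), the sketch below computes the thermal voltage and the built-in potential. The doping levels and the temperature of 300 K are assumptions chosen for the example; the intrinsic density $n_i = 1.45 \cdot 10^{10}$ cm$^{-3}$ is the value quoted in Section 1.2.2 and is treated here as a constant, although in reality it depends on temperature.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q = 1.602176634e-19     # elementary charge [C]
N_I = 1.45e10           # intrinsic carrier density of Si [cm^-3], value from Section 1.2.2


def thermal_voltage(temp_k=300.0):
    """V_T = kT/q from Eq. (1.8), in volts."""
    return K_B * temp_k / Q


def built_in_potential(n_a, n_d, temp_k=300.0, n_i=N_I):
    """phi_0 = V_T * ln(N_a * N_d / n_i^2) from Eq. (1.19), in volts."""
    return thermal_voltage(temp_k) * math.log(n_a * n_d / n_i**2)


# Assumed example doping levels [cm^-3]
phi_0 = built_in_potential(n_a=1e15, n_d=1e17)
print(f"V_T = {thermal_voltage() * 1e3:.2f} mV, phi_0 = {phi_0:.3f} V")
```

Because the doping enters only through a logarithm, changing either doping level by a factor of ten shifts $\phi_0$ by about $V_T \ln 10$, roughly 60 mV at room temperature.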
1.1.3 pn Junction Depletion Width
\[
W = x_p + x_n \tag{1.20}
\]
With $N_a x_p = N_d x_n$ it follows that
\[
x_n = \frac{N_a}{N_a + N_d} W \tag{1.21}
\]
\[
x_p = \frac{N_d}{N_a + N_d} W. \tag{1.22}
\]
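From
\[
-E(x) = \frac{d\phi}{dx} =
\begin{cases}
\dfrac{qN_d}{\varepsilon_{Si}} (x_n - x) & x \in [0, x_n], \\[2ex]
\dfrac{qN_a}{\varepsilon_{Si}} (x + x_p) & x \in [-x_p, 0]
\end{cases} \tag{1.24}
\]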
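and the integration
\[
\begin{aligned}
\phi_0 &= \frac{q}{\varepsilon_{Si}} \left[ \int_{-x_p}^{0} N_a (x + x_p) \, dx + \int_{0}^{x_n} N_d (x_n - x) \, dx \right] \\
&= \frac{q}{\varepsilon_{Si}} \left[ \frac{N_a}{2} \left( x^2 + 2 x x_p \right) \Big|_{-x_p}^{0} + \frac{N_d}{2} \left( 2 x x_n - x^2 \right) \Big|_{0}^{x_n} \right] \\
&= \frac{q}{2\varepsilon_{Si}} \left[ N_a x_p^2 + N_d x_n^2 \right] \\
&= \frac{q}{2\varepsilon_{Si}} \left[ N_a \left( \frac{N_d}{N_a + N_d} W \right)^2 + N_d \left( \frac{N_a}{N_a + N_d} W \right)^2 \right] \\
&= \frac{q}{2\varepsilon_{Si}} \left[ \frac{N_a N_d^2}{(N_a + N_d)^2} + \frac{N_d N_a^2}{(N_a + N_d)^2} \right] W^2 \\
&= \frac{q}{2\varepsilon_{Si}} \left[ \frac{(N_d + N_a) N_a N_d}{(N_a + N_d)^2} \right] W^2 \\
&= \frac{q}{2\varepsilon_{Si}} \left( \frac{1}{N_a} + \frac{1}{N_d} \right)^{-1} W^2
\end{aligned} \tag{1.25}
\]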
we obtain for the depletion width W the following equation:
\[
W = \sqrt{ \frac{2 \varepsilon_{Si} \phi_0}{q} \left( \frac{1}{N_a} + \frac{1}{N_d} \right) }. \tag{1.26}
\]
A one-sided junction is obtained if $N_d \gg N_a$ or vice versa. In this case
\[
W \simeq \sqrt{ \frac{2 \varepsilon_{Si} \phi_0}{qN} }, \tag{1.27}
\]
where $N = \min(N_a, N_d)$.
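To give a feeling for the magnitudes involved, the sketch below evaluates Eqs. (1.21), (1.22), (1.26) and (1.27). The doping levels, the built-in potential of 0.695 V, and the relative permittivity of silicon (11.7) are assumptions chosen for illustration and do not come from the script.

```python
import math

Q = 1.602e-19               # elementary charge [C]
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon [F/cm] (eps_r = 11.7 assumed)


def depletion_width(phi_0, n_a, n_d):
    """Zero-bias depletion width W in cm, Eq. (1.26)."""
    return math.sqrt(2.0 * EPS_SI * phi_0 / Q * (1.0 / n_a + 1.0 / n_d))


def one_sided_width(phi_0, n_a, n_d):
    """One-sided approximation of W in cm, Eq. (1.27), with N = min(N_a, N_d)."""
    return math.sqrt(2.0 * EPS_SI * phi_0 / (Q * min(n_a, n_d)))


# Assumed example values: doping in cm^-3, built-in potential in V
n_a, n_d, phi_0 = 1e15, 1e17, 0.695

w = depletion_width(phi_0, n_a, n_d)
x_n = n_a / (n_a + n_d) * w   # Eq. (1.21)
x_p = n_d / (n_a + n_d) * w   # Eq. (1.22)

print(f"W   = {w * 1e4:.3f} um")
print(f"x_n = {x_n * 1e4:.4f} um, x_p = {x_p * 1e4:.3f} um")
print(f"one-sided estimate: {one_sided_width(phi_0, n_a, n_d) * 1e4:.3f} um")
```

With a doping ratio of 100, almost the entire depletion region lies in the lightly doped p-side, and the one-sided estimate differs from Eq. (1.26) only by the factor $\sqrt{1 + N_a/N_d}$.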
1.1.4 pn Junction with External Voltage
Assuming that the positive side of an external voltage V is attached to the p-type area and the negative side to the n-type area (V > 0: forward bias; V < 0: reverse bias), we can modify the equilibrium equations by the transformation $\phi_0 \rightarrow (\phi_0 - V)$ and obtain for the depletion width:
\[
W = \sqrt{ \frac{2 \varepsilon_{Si}}{q} (\phi_0 - V) \left( \frac{1}{N_a} + \frac{1}{N_d} \right) }. \tag{1.28}
\]
1.1.5 pn Junction Capacitance
The junction capacitance originates from the depletion charge. It is important in reverse bias (V < 0), where it is given by
\[
C_j(V) = \frac{\varepsilon_{Si}}{W(V)} \quad [\mathrm{F/cm^2}] \tag{1.29}
\]
$C_j$ is nonlinear since it changes with the voltage V.
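The voltage dependence of the depletion width (Eq. 1.28) and of the junction capacitance (Eq. 1.29) can be tabulated with a few lines of code. This is a sketch under assumed values: the doping levels, the built-in potential of 0.695 V, and the relative permittivity of silicon (11.7) are illustrative and not taken from the script.

```python
import math

Q = 1.602e-19               # elementary charge [C]
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon [F/cm] (eps_r = 11.7 assumed)


def depletion_width(v, phi_0, n_a, n_d):
    """Depletion width W(V) in cm, Eq. (1.28); only defined for V < phi_0."""
    if v >= phi_0:
        raise ValueError("Eq. (1.28) requires V < phi_0")
    return math.sqrt(2.0 * EPS_SI / Q * (phi_0 - v) * (1.0 / n_a + 1.0 / n_d))


def junction_capacitance(v, phi_0, n_a, n_d):
    """Junction capacitance per unit area C_j(V) in F/cm^2, Eq. (1.29)."""
    return EPS_SI / depletion_width(v, phi_0, n_a, n_d)


# Assumed example values: built-in potential in V, doping in cm^-3
phi_0, n_a, n_d = 0.695, 1e15, 1e17

for v in (0.0, -1.0, -5.0):
    w = depletion_width(v, phi_0, n_a, n_d)
    c_j = junction_capacitance(v, phi_0, n_a, n_d)
    print(f"V = {v:+.1f} V: W = {w * 1e4:.3f} um, C_j = {c_j * 1e9:.2f} nF/cm^2")
```

Increasing the reverse bias widens the depletion region and lowers the capacitance, which is the nonlinearity referred to in the text.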
1.1.6 pn Junction Current Flow
Current flow through the junction is established by tracking the minority carriers:
• electron current In on the p-side
• hole current Ip on the n-side
• recombination-generation current originating from the depletion region
In and Ip combine to give the ideal diode equation
\[
I = I_0 \left( e^{qV/kT} - 1 \right), \tag{1.30}
\]
where
\[
I_0 = qA \left( \frac{D_n n_{p0}}{L_n} + \frac{D_p p_{n0}}{L_p} \right) \tag{1.31}
\]
is the reverse saturation current. The reverse generation current (V < 0) is found as
\[
I_{gen} \simeq -\frac{q A n_i}{2 \tau_0} W(V), \tag{1.32}
\]
while the forward recombination current assumes the form
\[
I_{rec} \simeq \frac{q A n_i W(V)}{2 \tau_0} e^{qV/2kT}, \tag{1.33}
\]
where $\tau_0$ is the average carrier lifetime. These contributions must be added to the ideal diode current.
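The ideal diode equation (1.30) is easy to explore numerically. The sketch below covers only the ideal term; the generation and recombination contributions of Eqs. (1.32) and (1.33) are left out. The saturation current `i_0` and the temperature of 300 K are hypothetical values chosen for illustration.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q = 1.602176634e-19     # elementary charge [C]


def ideal_diode_current(v, i_0, temp_k=300.0):
    """Ideal diode current I = I_0 * (exp(qV/kT) - 1), Eq. (1.30), in amperes."""
    # math.expm1(x) computes exp(x) - 1 accurately for small x
    return i_0 * math.expm1(Q * v / (K_B * temp_k))


i_0 = 1e-14   # hypothetical reverse saturation current [A]

for v in (-1.0, 0.0, 0.3, 0.6):
    print(f"V = {v:+.1f} V: I = {ideal_diode_current(v, i_0):.3e} A")
```

For reverse voltages of more than a few $kT/q$ the current saturates at $-I_0$, while in forward bias it grows exponentially with V.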
1.2 MOS Transistor Theory
1.2.1 MOSFET Structure
Figure 1.2: n-channel enhancement-mode MOSFET
Quantity   Meaning                Process parameter   Design (layout) parameter
Xox        gate oxide thickness   ×
L          channel length                             ×
W          channel width                              ×

⇒ the aspect ratio W/L is the characteristic transistor design parameter

MOSFET type:             n-channel                     p-channel
Substrate material:      weakly doped p-type silicon   weakly doped n-type silicon
Drain, source material:  heavily doped n+ silicon      heavily doped p+ silicon
Gate material:           heavily doped polysilicon → low resistance
1.2.2 MOS Capacitor and Threshold Voltage
Figure 1.3: The basic MOS structure
n-channel MOSFET:
• p-type wafer (single-crystal p-type silicon), uniformly doped with an acceptor (e.g. boron) concentration $N_a$ ($N_a \simeq 10^{15}$ cm$^{-3}$)
• Close to the bulk electrode, the majority and minority thermal equilibrium concentrations are approximated by
\[
p_{p0} \simeq N_a \quad \text{and} \quad n_{p0} \simeq \frac{n_i^2}{N_a} \tag{1.34}
\]
where $n_i$ is the intrinsic carrier density ($n_i \simeq 1.45 \cdot 10^{10}$ cm$^{-3}$)
• An oxide layer (SiO2 = quartz glass) is used as the insulating dielectric between the metal and semiconductor layers, with a resistivity > $10^{15}$ Ωcm.
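• State-of-the-art MOS processes use polysilicon as gate material. The gate capacitance per unit area is given by
\[
C_{ox} = \frac{\varepsilon_{ox}}{x_{ox}} \quad [\mathrm{F/cm^2}] \tag{1.35}
\]
with $\varepsilon_{ox} = 3.9\,\varepsilon_0$ and $\varepsilon_0 = 8.854 \cdot 10^{-14}$ F/cm.
$x_{ox} \simeq 50$ nm ⇒ $C_{ox} \simeq 6.9 \cdot 10^{-8}$ F/cm$^2$
$C_{ges} = C_{ox} \cdot A$ [F] (total capacitance of a gate with area A)
• The top layer of metal is used for low-resistance connections of transistor structures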
Varying the gate voltage gives three modes of operation for the MOS capacitor:
1. accumulation (VG < 0),
2. depletion (VG > 0, VG small) and
3. inversion (VG > 0, VG large).
Accumulation
Positively charged majority carriers (holes) accumulate at the Si-SiO2 interface (Fig. 1.4). The MOS system behaves as a capacitor (Eq. 1.35). This state is only useful for measuring some basic MOS properties; it is not an operating region of the transistor.
Figure 1.4: MOS accumulation state
Depletion
Figure 1.5: MOS fields and potentials for positive gate voltages
MOS field effect: An externally applied gate voltage $V_G$ controls the semiconductor electric field $E(x)$ and the semiconductor potential $\phi(x)$, and therefore the silicon carrier densities $p$, $n$.
$$E(x) = -\frac{d\phi(x)}{dx} \tag{1.36}$$
Potential boundary condition: $\phi(x) \to V_B = 0$ at the bulk electrode. The total voltage across the semiconductor is equal to the surface potential
$$\phi_S = \phi(x = 0). \tag{1.37}$$
Applying KVL leads to
$$V_G = V_{ox} + \phi_S \tag{1.38}$$
Connection between $V_G$ and $E_S$:
$$E_S = E(x = 0) = -\left.\frac{d\phi}{dx}\right|_{x=0} \tag{1.39}$$
$E_S$ is the maximum value of the semiconductor field. It is controlled (via the Poisson equation) by the voltage $V_G$ and influences the surface carrier concentrations
$$p_S = p_p(x = 0) \quad\text{and}\quad n_S = n_p(x = 0) \tag{1.40}$$
⇒ negatively charged acceptor ions are termination points for the electric field lines.
If $V_G$ is increased to a point where $p_S \ll N_a$ (induced by the electric field $E_S$) is satisfied, the depletion region extends from $x = 0$ to $x = x_d$. The depletion phenomenon in the MOS system is analogous to the p-side of a one-sided n+p junction, with the difference that the voltage across the depletion region is $\phi_S$.
Replacing the built-in voltage $\phi_0$ by the surface potential $\phi_S$ leads to an equation for the depletion width:
$$x_d = \sqrt{\frac{2\varepsilon_{Si}}{qN_a}\,\phi_S}, \qquad \varepsilon_{Si} = 11.8\,\varepsilon_0. \tag{1.41}$$
Figure 1.6: Depletion in the MOS system
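The bulk depletion charge per unit area is
$$Q_{B0} = -qN_ax_d \quad [\mathrm{C/cm^2}] \tag{1.42}$$
where
$$Q_{B0} = Q_B\big|_{V_B=0} = -\sqrt{2q\varepsilon_{Si}N_a\phi_S}. \tag{1.43}$$
MOS capacitor: ⇒ $Q_S = Q_B < 0$
$$V_{ox} = -\frac{Q_S}{C_{ox}} \tag{1.44}$$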
Inversion
Figure 1.7: Surface inversion in the MOS system
Increasing $V_G$ increases $\phi_S$ and drives $x_d$ deeper towards the bulk electrode. When $V_G$ reaches a critical threshold value $V_{T0}$ (assuming $V_B = 0$), inversion occurs: a layer of minority carriers accumulates at the surface ($x = 0$), and the depth of the depletion region remains constant ($x_d = x_{dm}$) because the inversion-layer electrons shield the bulk substrate from the increasing field at the surface.
The inversion condition is given by
$$V_G \ge V_T \tag{1.45}$$
with
$$\phi_S(V_G = V_T) = 2|\phi_F| \tag{1.46}$$
where
$$|\phi_F| = \frac{kT}{q}\ln\left(\frac{N_a}{n_i}\right) \quad\text{(bulk Fermi potential)}. \tag{1.47}$$
The maximum depletion width is
$$x_{dm} = \sqrt{\frac{2\varepsilon_{Si}}{qN_a}\,(2|\phi_F|)}, \tag{1.48}$$
the bulk depletion charge density
$$Q_{B0} = -\sqrt{2q\varepsilon_{Si}N_a\,(2|\phi_F|)}, \tag{1.49}$$
and the total surface charge density
$$Q_S(V_G) = Q_{B0} + Q_I(V_G) \tag{1.50}$$
(where $Q_I(V_G)$ is the electron inversion-layer charge).
At the onset of inversion, $|Q_I| \ll |Q_{B0}|$, so the ideal threshold voltage is
$$V_{T0}^{ideal} = -\frac{Q_{B0}}{C_{ox}} + 2|\phi_F| \tag{1.51}$$
$$= \underbrace{\frac{\sqrt{2q\varepsilon_{Si}N_a\,(2|\phi_F|)}}{C_{ox}}}_{\text{voltage drop across oxide}} + \underbrace{2|\phi_F|}_{\text{voltage drop across substrate}} \tag{1.52}$$
In reality there is an additional term $V_{FB}$ (the flatband voltage) in the oxide voltage drop:
$$V_{FB} = \Phi_{GS} - \frac{1}{C_{ox}}\left(Q_{ox} + Q_{SS}\right) \tag{1.53}$$
• $\Phi_{GS} = \Phi_G - \Phi_S$ represents the difference in work functions $\Phi$ between the gate and substrate materials (material-specific contact voltages which can be taken from tables).
• $Q_{ox}$ is the oxide charge density (unwanted positive ions)
• $Q_{SS}$ is the surface state density
Since $Q_{ox}$ and $Q_{SS}$ are positive, $V_{FB}$ may become negative, resulting in a negative threshold voltage.
To ensure a positive $V_{T0}$, an additional acceptor ion implantation with an ion dose $D_I$ [ions/cm²] is introduced in the MOS process.
Final threshold voltage for $V_B = 0$:
$$V_{T0} = V_{FB} + \frac{\sqrt{2q\varepsilon_{Si}N_a\,(2|\phi_F|)}}{C_{ox}} + 2|\phi_F| + \frac{qD_I}{C_{ox}} \tag{1.54}$$
The electron charge density in the inversion layer is
$$Q_I = -C_{ox}(V_G - V_{T0}) \tag{1.55}$$
MOS Transistor Threshold Voltage for Nonzero Bulk-Source Voltages
Figure 1.8: Increase in depletion charge from body bias VB
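A nonzero bulk voltage reverse-biases the pn junction. The depletion charge is
$$Q_B = -\sqrt{2q\varepsilon_{Si}N_a\,(2|\phi_F| + V_B)} \tag{1.56}$$
Threshold shift, with $V_{T0} = V_T(V_B = 0)$:
$$\Delta V_T = V_T(V_B) - V_{T0} = \frac{\sqrt{2q\varepsilon_{Si}N_a}}{C_{ox}}\left(\sqrt{2|\phi_F| + V_B} - \sqrt{2|\phi_F|}\right) \tag{1.57}$$
$$V_T = V_{T0} + \gamma\left(\sqrt{2|\phi_F| + V_B} - \sqrt{2|\phi_F|}\right) \tag{1.58}$$
with the body effect constant
$$\gamma = \frac{\sqrt{2q\varepsilon_{Si}N_a}}{C_{ox}} \quad [\mathrm{V^{1/2}}] \tag{1.59}$$
The n-channel inversion charge is given by
$$Q_I = -C_{ox}(V_G - V_T) \tag{1.60}$$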
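As a worked numerical sketch of Eqs. (1.35), (1.47), (1.58) and (1.59), the following Python fragment evaluates $C_{ox}$, $|\phi_F|$, $\gamma$ and the body-biased threshold for the doping and oxide thickness quoted earlier in this section. The thermal voltage kT/q = 0.0259 V (about 300 K), the value V_T0 = 1.0 V and all function names are assumptions made for illustration; they are not taken from the course text.

```python
import math

Q = 1.602e-19            # elementary charge [C]
EPS0 = 8.854e-14         # vacuum permittivity [F/cm]
EPS_SI = 11.8 * EPS0     # silicon permittivity [F/cm]
EPS_OX = 3.9 * EPS0      # oxide permittivity [F/cm]
NI = 1.45e10             # intrinsic carrier density [cm^-3]
KT_Q = 0.0259            # thermal voltage kT/q [V], assumed (about 300 K)


def fermi_potential(na):
    """|phi_F| = (kT/q) ln(Na/ni), Eq. (1.47)."""
    return KT_Q * math.log(na / NI)


def oxide_capacitance(xox_cm):
    """Cox = eps_ox / xox in F/cm^2, Eq. (1.35); xox in cm."""
    return EPS_OX / xox_cm


def body_effect_constant(na, cox):
    """gamma = sqrt(2 q eps_Si Na) / Cox in V^(1/2), Eq. (1.59)."""
    return math.sqrt(2.0 * Q * EPS_SI * na) / cox


def threshold(vt0, gamma, phi_f, v_bias):
    """VT = VT0 + gamma (sqrt(2|phi_F| + V) - sqrt(2|phi_F|)), Eq. (1.58)."""
    return vt0 + gamma * (math.sqrt(2.0 * phi_f + v_bias) - math.sqrt(2.0 * phi_f))


na = 1e15                # acceptor doping [cm^-3]
xox = 50e-7              # 50 nm expressed in cm
vt0 = 1.0                # assumed zero-bias threshold [V]

phi_f = fermi_potential(na)
cox = oxide_capacitance(xox)
gamma = body_effect_constant(na, cox)
vt_biased = threshold(vt0, gamma, phi_f, 2.0)
```

With these inputs the sketch gives $|\phi_F|$ of roughly 0.29 V and $\gamma$ of roughly 0.26 V^(1/2); the threshold rises above V_T0 as soon as the bias term is positive.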
1.2.3 MOSFET Operation Modes
Figure 1.9: Basic MOSFET channel formation
n-channel MOSFET:
• Source electrode (n+ region) is at the lowest potential
• Source potential is the reference potential for all voltages:
VDS = VD − VS , VGS = VG − VS , VSB = (VS − VB) (1.61)
• VSB > 0 because VB must be more negative than VS to make sure that the pn-junctionfrom bulk to source is reverse biased.
MOSFET operation Modes: Cutoff, Nonsaturation, Saturation
Cutoff : VGS < VT
Figure 1.10: MOSFET in cutoff mode
Nonsaturation : VGS ≥ VT and VDS ≤ (VGS − VT)
Saturation : VGS ≥ VT and VDS ≥ (VGS − VT)
Figure 1.11: MOSFET in nonsaturation mode
Figure 1.12: MOSFET in saturation mode
1.2.4 MOSFET current characteristic
The Gradual Channel Approximation
• analysis with the gradual channel approximation (GCA) = reduction of the three-dimensional problem to a one-dimensional current-flow problem
• the approximation describes "large devices" very well
• the analysis is first done for VS = 0
• assumption for the derivation of the GCA equations: the depletion charge is supported entirely by the vertical electric field Ex(y) (assume VT0(QB0) independent of V(y))
Figure 1.13: MOSFET geometry used in GCA (MOSFET in linear/nonsaturated region)
The channel electric field $E_y(y)$ established by the drain-source voltage $V_{DS}$ is
$$E_y(y) = -\frac{dV(y)}{dy} \tag{1.62}$$
with $V(y = 0) = V_S = 0$ and $V(y = L) = V_{DS}$.
The depletion depth has its maximum at the drain electrode because $V(y)$ has its maximum at $y = L$:
$$X_{dm}(y) \simeq \sqrt{\frac{2\varepsilon_{Si}}{qN_a}\left[2|\Phi_F| + V(y)\right]} \tag{1.63}$$
The inversion charge density as a function of the position $y$ is given by
$$Q_I(y = 0) = -C_{ox}\left[V_{GS} - V_T\right] \tag{1.64}$$
$$Q_I(y) = -C_{ox}\left[V_{GS} - V_T - V(y)\right] \tag{1.65}$$
The resistance of a differential channel increment $dy$ is
$$dR = -\frac{dy}{\mu_nWQ_I(y)} = \frac{dy}{\sigma A} \quad [\Omega] \tag{1.66}$$
Figure 1.14: Geometry for GCA current analysis
with
A : channel cross section
µn : electron surface mobility
W : channel width
σ : conductivity
Rearranging
$$dV = I_D\,dR = -\frac{I_D\,dy}{\mu_nWQ_I(y)} \tag{1.67}$$
$$\Leftrightarrow\quad I_D\int_0^L dy = -\mu_nW\int_0^{V_{DS}} Q_I(V)\,dV \tag{1.68}$$
and integration yields
$$I_D = \mu_nC_{ox}\frac{W}{L}\int_0^{V_{DS}}\left(V_{GS} - V_T - V\right)dV \tag{1.69}$$
$$= k'\,\frac{W}{L}\left[(V_{GS} - V_T)V_{DS} - \frac{1}{2}V_{DS}^2\right] \tag{1.70}$$
with the process transconductance parameter $k' = \mu_nC_{ox}$ [A/V²] and the device transconductance parameter $\beta = k'\,W/L$ [A/V²].
MOSFET Current Equations
The equation resulting from the GCA for the nonsaturated current is, in a convenient form,
$$I_D = \frac{\beta}{2}\left[2(V_{GS} - V_T)V_{DS} - V_{DS}^2\right] \tag{1.71}$$
Figure 1.15: Nonsaturated MOS current
At the onset of saturation the current $I_D$ reaches a peak value and remains constant in the saturation region:
$$\frac{\partial I_D}{\partial V_{DS}} = 0 = \beta\left(V_{GS} - V_T - V_{DS}\right) \tag{1.72}$$
Evaluating the derivative yields
$$V_{DS,SAT} = V_{GS} - V_T \tag{1.73}$$
$$\Rightarrow\quad I_{D,SAT} = I_D(V_{DS} = V_{DS,SAT}) = \frac{\beta}{2}(V_{GS} - V_T)^2 \tag{1.74}$$
⇒ parabolic border between saturation and nonsaturation.
Figure 1.16: Basic MOSFET characteristics
Figure 1.17: Start of Saturation in a MOSFET
Channel length modulation in saturation
The effective channel length in saturation is $L' = L - \Delta L$.
From GCA:
$$\Rightarrow\quad Q_I(L') = 0 \tag{1.75}$$
$$\Rightarrow\quad V(L') \simeq V_{DS,SAT} \tag{1.76}$$
($V_{DS,SAT} = V_{GS} - V_{T0}$ ⇒ no inversion charge is induced).
$\Delta L$ may be approximated as the depletion region of a one-sided pn junction with a voltage $V_{DS} - V_{DS,SAT}$ across it:
$$\Delta L \simeq \sqrt{\frac{2\varepsilon_{Si}}{qN_a}\left[V_{DS} - V_{DS,SAT}\right]} \tag{1.77}$$
Figure 1.18: Channel length modulation
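The saturated current is modified to
$$I_D \simeq \frac{k'}{2}\,\frac{W}{L'}\,(V_{GS} - V_T)^2 \tag{1.78}$$
$$\simeq \frac{I_{D0}}{1 - \frac{\Delta L}{L}} \tag{1.79}$$
with
$$I_{D0} = \frac{\beta}{2}(V_{GS} - V_{T0})^2.$$
Using the empirical relation
$$1 - \frac{\Delta L}{L} \simeq 1 - \lambda V_{DS} \tag{1.80}$$
with $\lambda$ [V⁻¹] the channel length modulation factor, and assuming that $\lambda V_{DS} \ll 1$, the current can be represented by
$$I_D = \frac{I_{D0}}{1 - \lambda V_{DS}} = \frac{I_{D0}}{1 - \lambda V_{DS}}\cdot\frac{1 + \lambda V_{DS}}{1 + \lambda V_{DS}} = \frac{I_{D0}\,(1 + \lambda V_{DS})}{1 - (\lambda V_{DS})^2}, \qquad (\lambda V_{DS})^2 \ll 1 \tag{1.81}$$
$$\Rightarrow\quad I_D \simeq I_{D0}(1 + \lambda V_{DS}) = \frac{\beta}{2}(V_{GS} - V_{T0})^2(1 + \lambda V_{DS}) \tag{1.82}$$
$\lambda$ has typical values from 0.01 to 0.1 V⁻¹ and represents the influence of $V_{DS}$ on $I_D$ in saturation. $\lambda$ is important in small-geometry devices. In the following exercises we will neglect $\lambda$.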
Figure 1.19: MOSFET characteristics with channel length modulation
1.2.5 Biased MOSFET Current Equations
Figure 1.20: General MOSFET bias
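$$V_T = V_{T0} + \gamma\left(\sqrt{2|\phi_F| + V_{SB}} - \sqrt{2|\phi_F|}\right) \tag{1.83}$$
$$I_D \simeq 0 \qquad (V_{GS} < V_T) \tag{1.84}$$
$$I_D = \frac{\beta}{2}\left[2(V_{GS} - V_T)V_{DS} - V_{DS}^2\right] \qquad (V_{GS} > V_T,\ V_{DS} < V_{DS,sat}) \tag{1.85}$$
$$V_{DS,sat} = V_{GS} - V_T \tag{1.86}$$
$$I_D = \frac{\beta}{2}(V_{GS} - V_T)^2(1 + \lambda V_{DS}) \qquad (V_{GS} > V_T,\ V_{DS} \ge V_{DS,sat}) \tag{1.87}$$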
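The piecewise model of Eqs. (1.83) to (1.87) can be sketched as a single Python function. This is only an illustration of the hand-calculation equations; the function name, the keyword arguments and the example values (β = 100 µA/V², V_T0 = 1 V, γ = 0.37 V^(1/2), 2|φF| = 0.6 V) are assumptions chosen for the example.

```python
import math


def drain_current(vgs, vds, vsb=0.0, *, beta, vt0, gamma=0.0, phi2f=0.6, lam=0.0):
    """n-channel drain current after Eqs. (1.83)-(1.87); all voltages in volts.

    beta  : device transconductance parameter k' W/L [A/V^2]
    phi2f : 2|phi_F| [V]
    lam   : channel length modulation factor [1/V]
    """
    # Body-biased threshold voltage, Eq. (1.83)
    vt = vt0 + gamma * (math.sqrt(phi2f + vsb) - math.sqrt(phi2f))
    if vgs <= vt:
        return 0.0                      # cutoff, Eq. (1.84)
    vds_sat = vgs - vt                  # Eq. (1.86)
    if vds < vds_sat:
        # nonsaturation, Eq. (1.85)
        return 0.5 * beta * (2.0 * (vgs - vt) * vds - vds ** 2)
    # saturation with channel length modulation, Eq. (1.87)
    return 0.5 * beta * (vgs - vt) ** 2 * (1.0 + lam * vds)


i_cut = drain_current(0.5, 5.0, beta=1e-4, vt0=1.0)
i_lin = drain_current(3.0, 1.0, beta=1e-4, vt0=1.0)
i_sat = drain_current(3.0, 5.0, beta=1e-4, vt0=1.0)
i_sat_body = drain_current(3.0, 5.0, 2.0, beta=1e-4, vt0=1.0, gamma=0.37)
```

Note that with λ ≠ 0 the two branches, taken literally from (1.85) and (1.87), do not meet exactly at V_DS = V_DS,sat; the sketch reproduces the equations as stated and does not smooth this step.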
Figure 1.21: Body bias effects
1.2.6 Measurement of device parameters
Figure 1.22: Device parameter measurement (a)
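Get
(1) $V_{T0}$ from the intercept
(2) $k = k'\,\frac{W}{L}$ from the slope:
$$\sqrt{k} = \frac{\sqrt{2I_D}}{V_{GS} - V_T}$$
(3)
$$\gamma = \frac{V_T(V_{SB}) - V_{T0}}{\sqrt{2|\phi_F| + V_{SB}} - \sqrt{2|\phi_F|}}$$
and $\lambda$ from
(4)
$$\frac{I_{D2}}{I_{D1}} = \frac{1 + \lambda V_{D2}}{1 + \lambda V_{D1}}$$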
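The four extraction steps can be sketched in Python. The fragment fits a straight line to √I_D versus V_GS for a saturated device (so that √I_D = √(k/2)·(V_GS − V_T0)), and solves relation (4) for λ. The measurement data here are synthetic values generated from assumed parameters (k = 100 µA/V², V_T0 = 1 V, λ = 0.02 V⁻¹); the helper names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def extract_vt0_k(vgs_list, id_list):
    """Steps (1) and (2): VT0 from the VGS-axis intercept, k from the slope."""
    slope, intercept = fit_line(vgs_list, [math.sqrt(i) for i in id_list])
    vt0 = -intercept / slope          # sqrt(ID) = 0 at VGS = VT0
    k = 2.0 * slope ** 2              # slope = sqrt(k / 2)
    return vt0, k


def extract_gamma(vt, vt0, vsb, phi2f):
    """Step (3): body effect constant from one body-biased threshold."""
    return (vt - vt0) / (math.sqrt(phi2f + vsb) - math.sqrt(phi2f))


def extract_lambda(id1, vd1, id2, vd2):
    """Step (4): ID2/ID1 = (1 + lam VD2)/(1 + lam VD1) solved for lam."""
    return (id2 - id1) / (id1 * vd2 - id2 * vd1)


# Synthetic "measurements" from assumed parameters (illustration only)
k_true, vt0_true, lam_true = 1e-4, 1.0, 0.02
vgs_pts = [2.0, 3.0, 4.0, 5.0]
id_pts = [0.5 * k_true * (v - vt0_true) ** 2 for v in vgs_pts]

vt0_fit, k_fit = extract_vt0_k(vgs_pts, id_pts)
lam_fit = extract_lambda(2e-4 * (1 + lam_true * 3.0), 3.0,
                         2e-4 * (1 + lam_true * 5.0), 5.0)
gamma_fit = extract_gamma(1.31, 1.0, 2.0, 0.6)
```

On real data the fit should be restricted to points that are clearly in saturation; the least-squares line is a design choice of this sketch, the course text only speaks of intercept and slope.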
Figure 1.23: Device parameter measurement (b)
1.2.7 The Complete MOSFET GCA Analysis
• includes additional depletion charge created by the channel voltage V (y), which is reversebias across the n+p junction at the channel-substrate boundary
• assume VS = 0 = VB
• calculation for nonsaturated MOSFET
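$$V_{T0}(V) = V_{FB} + 2|\phi_F| + \frac{qD_I}{C_{ox}} + \frac{1}{C_{ox}}\sqrt{2q\varepsilon_{Si}N_a\,(2|\phi_F| + V)} \tag{1.88}$$
The basic GCA integral
$$I_D = \beta\int_0^{V_{DS}}\left[V_{GS} - V_{T0}(V) - V\right]dV \tag{1.89}$$
is modified to (now $V_{T0}$ is not constant but depends on $Q_{B0}$)
$$I_D = \beta\int_0^{V_{DS}}\left\{\left[V_{GS} - V_{FB} - 2|\phi_F| - \frac{qD_I}{C_{ox}}\right] - V - \frac{1}{C_{ox}}\sqrt{2q\varepsilon_{Si}N_a\,(2|\phi_F| + V)}\right\}dV \tag{1.90}$$
which gives for the nonsaturated drain current
$$I_D = \beta\left\{\left(V_{GS} - V_{FB} - 2|\phi_F| - \frac{qD_I}{C_{ox}}\right)V_{DS} - \frac{1}{2}V_{DS}^2 - \frac{2}{3C_{ox}}\sqrt{2q\varepsilon_{Si}N_a}\left[(2|\phi_F| + V_{DS})^{3/2} - (2|\phi_F|)^{3/2}\right]\right\}. \tag{1.91}$$
Introducing a "reduction factor" $M < 1$ modifies the nonsaturated current equation to
$$I_D = M\,\frac{\beta}{2}\left[2(V_{GS} - V_{T0})V_{DS} - V_{DS}^2\right]. \tag{1.92}$$
The saturated current is then given by
$$I_{D,sat} = M\,\frac{\beta}{2}(V_{GS} - V_{T0})^2 \tag{1.93}$$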
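As a plausibility check of the step from (1.90) to (1.91), the closed form can be compared with a numerical integration of the integrand. In the sketch below the abbreviation `v_drive` stands for V_GS − V_FB − 2|φF| − qD_I/C_ox and `gamma` for √(2qε_Si N_a)/C_ox as in (1.59); these names, Simpson's rule and the numerical values are assumptions made for the example.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def id_numeric(vds, beta, v_drive, two_phi_f, gamma):
    """Drain current by numerically integrating Eq. (1.90)."""
    def integrand(v):
        return v_drive - v - gamma * math.sqrt(two_phi_f + v)
    return beta * simpson(integrand, 0.0, vds)


def id_closed_form(vds, beta, v_drive, two_phi_f, gamma):
    """Drain current from the closed form, Eq. (1.91)."""
    bulk = (2.0 / 3.0) * gamma * ((two_phi_f + vds) ** 1.5 - two_phi_f ** 1.5)
    return beta * (v_drive * vds - 0.5 * vds ** 2 - bulk)


# Assumed example values (illustration only)
args = dict(vds=1.0, beta=1e-4, v_drive=3.0, two_phi_f=0.6, gamma=0.37)
i_num = id_numeric(**args)
i_cf = id_closed_form(**args)
```

Both routes should agree to within the integration error, which supports the 3/2-power bulk-charge term in (1.91).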
Figure 1.24: Comparison of circuit equations with the complete GCA model
Figure 1.25: Comparison of modified circuit equations with the complete GCA model
1.2.8 Depletion mode n–channel MOSFET
⇒ only used in NMOS as load device.
With the donor implant dose $D_I$, $V_T$ is modified to
$$V_T = V_{FB} + 2|\Phi_F| + \frac{1}{C_{ox}}\sqrt{2q\varepsilon_{Si}N_a\,(2|\Phi_F| + V_{SB})} - \frac{qD_I}{C_{ox}} \tag{1.94}$$
so that $V_T$ of a depletion MOSFET is negative. The n-type layer resulting from donor doping is modeled by
$$(N_d - N_a) > 0. \tag{1.95}$$
Figure 1.26: Depletion-mode MOSFET
The current $I_D$ can be modeled by
$$I_D = -\mu_n\left(\frac{W}{L}\right)\int_0^{V_{DS}} Q_C(V)\,dV \tag{1.96}$$
with $Q_C(V)$ the channel charge density
$$Q_C(V) = Q_n + Q_S(V) + Q_j(V) \tag{1.97}$$
Figure 1.27: Simplified depletion-mode MOSFET model
Qn: total charge density of electrons in the n-type layer
QS : MOS surface charge density (VFB gives the voltage necessary to create a charge-neutralflatband state at the surface of the semiconductor)
Qj : amount of depletion charge on the n-side of the pn junction n-type layer ←→ substrate
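$$Q_n = -q(N_d - N_a)\,a \tag{1.98}$$
$$Q_S(V) = -C_{ox}\left[V_{GS} - V_{FB} - V\right] \tag{1.99}$$
$$Q_j(V) = \sqrt{2q\varepsilon_{Si}N\,(\phi_0 + V)} \tag{1.100}$$
$$\phi_0 \simeq \left(\frac{kT}{q}\right)\ln\left[\frac{(N_d - N_a)N_a}{n_i^2}\right] \quad\text{(built-in voltage)} \tag{1.101}$$
$$N = \frac{(N_d - N_a)N_a}{(N_d - N_a) + N_a} = \frac{N_a}{N_d}(N_d - N_a) \tag{1.102}$$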
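Using these charge densities gives
$$I_D = \mu_n\left(\frac{W}{L}\right)\int_0^{V_{DS}}\left[q(N_d - N_a)a + C_{ox}(V_{GS} - V_{FB} - V) - \sqrt{2q\varepsilon_{Si}N\,(\phi_0 + V)}\right]dV$$
$$= \beta\left\{\frac{q(N_d - N_a)a}{C_{ox}}V_{DS} + \left[(V_{GS} - V_{FB})V_{DS} - \frac{1}{2}V_{DS}^2\right] - \frac{2}{3C_{ox}}\sqrt{2q\varepsilon_{Si}N}\left[(\phi_0 + V_{DS})^{3/2} - (\phi_0)^{3/2}\right]\right\}. \tag{1.103}$$
This equation is too complicated for hand calculations, so the D-mode MOSFET is usually described by
$$I_D = \frac{\beta}{2}\left[2(V_{GS} - V_{T0})V_{DS} - V_{DS}^2\right], \tag{1.104}$$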
where $V_{T0} < 0$. Saturation current:
$$I_{D,sat} = \frac{\beta}{2}\left(V_{GS} + |V_{T0}|\right)^2 \tag{1.105}$$
D-mode MOSFETs are often used as depletion loads (operated in the saturation region):
Figure 1.28: Depletion-mode MOSFET characteristics
Figure 1.29: Square root of saturated depletion-mode MOSFET current
1.2.9 p–channel MOSFET
Figure 1.30: p-channel MOSFET
The source electrode is connected to VDD.
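Threshold voltage:
$$V_{Tp} = \Phi_{GS} - 2\Phi_{Fn} - \frac{1}{C_{ox}}\left(Q_{SS} + Q_{ox}\right) - \frac{1}{C_{ox}}Q_{Bn}$$
$$\Phi_{Fn} = \frac{kT}{q}\ln\left(\frac{N_d}{n_i}\right) > 0, \qquad N_d:\ \text{n-type substrate doping}$$
$$Q_{Bn} = \sqrt{2q\varepsilon_{Si}N_d\left[2\Phi_{Fn} + V_{BSp}\right]}$$
$$V_{Tp} = V_{T0p} - \gamma_p\left(\sqrt{V_{BSp} + 2\Phi_{Fn}} - \sqrt{2\Phi_{Fn}}\right) \quad\text{with}\quad \gamma_p = \frac{\sqrt{2qN_d\varepsilon_{Si}}}{C_{ox}} \tag{1.106}$$
$V_{Tp}$ is negative for an enhancement p-channel MOSFET. The current equations are similar to those of the n-channel MOSFET, but all signs are opposite.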
1.2.10 Conclusions
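Comparison of n-channel and p-channel transistors

Fermi potential
  n-channel: $\Phi_{Fp} = \frac{kT}{q}\ln\left(\frac{n_i}{N_a}\right) < 0$
  p-channel: $\Phi_{Fn} = \frac{kT}{q}\ln\left(\frac{N_d}{n_i}\right) > 0$

Threshold voltage
  n-channel (positive): $V_{Tn} = V_{T0n} + \gamma_n\left(\sqrt{|2\Phi_{Fp}| - V_{BS}} - \sqrt{|2\Phi_{Fp}|}\right)$, with $\gamma_n = \sqrt{2qN_a\varepsilon_{Si}}/C_{ox}$
  p-channel (negative): $V_{Tp} = V_{T0p} - \gamma_p\left(\sqrt{V_{BSp} + 2\Phi_{Fn}} - \sqrt{2\Phi_{Fn}}\right)$, with $\gamma_p = \sqrt{2qN_d\varepsilon_{Si}}/C_{ox}$

Current equations

Cutoff
  n-channel: $V_{GS} < V_{Tn}$, $I_D = 0$
  p-channel: $|V_{GS}| < |V_{Tp}|$, $I_D = 0$

Nonsaturation
  n-channel: $V_{GS} > V_{Tn}$ and $V_{DS} \le (V_{GS} - V_{Tn})$
             $I_D = \frac{\beta_n}{2}\left[2(V_{GSn} - V_{Tn})V_{DSn} - V_{DSn}^2\right]$
  p-channel: $|V_{GSp}| > |V_{Tp}|$ and $|V_{DSp}| \le |V_{GSp} - V_{Tp}|$
             $I_{Dp} = \frac{\beta_p}{2}\left[2(V_{SGp} + V_{Tp})V_{SDp} - V_{SDp}^2\right]$

Saturation
  n-channel: $V_{GS} > V_{Tn}$ and $V_{DS} \ge (V_{GS} - V_{Tn})$
             $I_D = \frac{\beta_n}{2}(V_{GS} - V_{Tn})^2$
  p-channel: $|V_{GSp}| > |V_{Tp}|$ and $|V_{DSp}| \ge |V_{GSp} - V_{Tp}|$
             $I_D = \frac{\beta_p}{2}(V_{SGp} + V_{Tp})^2$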
1.2.11 Modelling the MOS Transistor for Circuit simulation
MOSFET SPICE Parameters
SPICE = Simulation Program with Integrated Circuit Emphasis
Symbol | Name | Parameter | Units | Default | Example
- | LEVEL | Model index | - | 1 | -
VT0 | VTO | Zero-bias threshold voltage | V | 0.0 | 1.0
k′ | KP | Transconductance parameter | A/V² | 2.0E-5 | 3.1E-5
γ | GAMMA | Bulk threshold parameter | V^(1/2) | 0.0 | 0.37
2|φF| | PHI | Surface potential | V | 0.6 | 0.65
λ | LAMBDA | Channel-length modulation | 1/V | 0.0 | 0.02
rd | RD | Drain ohmic resistance | Ω | 0.0 | 1.0
rs | RS | Source ohmic resistance | Ω | 0.0 | 1.0
Cbd | CBD | Zero-bias B-D junction capacitance | F | 0.0 | 2.0E-14
Cbs | CBS | Zero-bias B-S junction capacitance | F | 0.0 | 2.0E-14
Is | IS | Bulk junction saturation current | A | 1.0E-14 | 1.0E-15
φ0 | PB | Bulk junction potential | V | 0.8 | 0.87
- | CGSO | Gate-source overlap capacitance per meter channel width | F/m | 0.0 | 4.0E-11
- | CGDO | Gate-drain overlap capacitance per meter channel width | F/m | 0.0 | 4.0E-11
- | CGBO | Gate-bulk overlap capacitance per meter channel length | F/m | 0.0 | 2.0E-10
- | RSH | Drain and source diffusion sheet resistance | Ω/□ | 0.0 | 10.0
Cj0 | CJ | Zero-bias bulk junction bottom capacitance per square meter of junction area | F/m² | 0.0 | 2.0E-4
m | MJ | Bulk junction bottom grading coefficient | - | 0.0 | 0.5
- | CJSW | Zero-bias bulk junction sidewall capacitance per meter of junction perimeter | F/m | 0.0 | 1.0E-9
m | MJSW | Bulk junction sidewall grading coefficient | - | 0.33 | -
- | JS | Bulk junction saturation current per square meter of junction area | A/m² | - | 1.0E-8
tox | TOX | Oxide thickness | m | 1.0E-7 | 1.0E-7
NA or ND | NSUB | Substrate doping | 1/cm³ | 0.0 | 4.0E15
QSS/q | NSS | Surface state density | 1/cm² | 0.0 | 1.0E10
- | NFS | Fast surface state density | 1/cm² | 0.0 | 1.0E10
- | TPG | Type of gate material (+1: opposite to substrate, -1: same as substrate, 0: Al gate) | - | 1.0 | -
Xj | XJ | Metallurgical junction depth | m | 0.0 | 1.0E-6
LD | LD | Lateral diffusion | m | 0.0 | 0.8E-6
µ | UO | Surface mobility | cm²/Vs | 600 | 700
1.3 DC Characteristics of MOS Inverters
Figure 1.31: Ideal inverter properties
1.3.1 Basic Inverter characteristics
Figure 1.32: Basic nMOS inverter structure
The voltage transfer curve for DC voltages is defined as $V_{out}(V_{in})$. The DC equation for the load current $I_L$ is
$$I_L = I_D(V_{in}, V_{out}) \tag{1.107}$$
and for the output voltage we get
$$V_{out} = V_{DD} - V_L(I_L). \tag{1.108}$$
If $V_{in}$ is increased from 0 ($V_{out} = V_{DD}$ initially) to values greater than $V_T$, then $V_{DS} = V_{out} > (V_{GS} - V_T)$, so the driver changes from cutoff mode to saturation:
$$\frac{\beta_D}{2}(V_{in} - V_T)^2(1 + \lambda V_{out}) = I_L(V_L) = I_L(V_{DD} - V_{out}) \tag{1.109}$$
If $V_{in}$ is increased further until $V_{out} < (V_{GS} - V_T)$, the driver is in ohmic mode:
$$\frac{\beta_D}{2}\left[2(V_{in} - V_T)V_{out} - V_{out}^2\right] = I_L(V_L) = I_L(V_{DD} - V_{out}) \tag{1.110}$$
Figure 1.33: Voltage transfer curve of an nMOS inverter
Characteristic points of the voltage transfer curve:
VOL : output low voltage of the inverter
VOH : output high voltage of the inverter
VIL : input low voltage of the inverter, at the point where $dV_{out}/dV_{in} = -1$
VIH : input high voltage of the inverter, at the point where $dV_{out}/dV_{in} = -1$
VTH : inverter threshold voltage, at $V_{out} = V_{in}$
Figure 1.34: Definition: Noise margins
Noise margins:
  NMH = VOH − VIH
  NML = VIL − VOL
Input voltage ranges for
  Logic 1 : VIH to VDD
  Logic 0 : 0 to VIL
Output voltage ranges for
  Logic 1 : VOH to VDD
  Logic 0 : 0 to VOL
Figure 1.35: Base for NM definitions: cascaded inverter stages
Figure 1.36: Model for transmission network problem
Figure 1.37: Simplified AC circuit model for noise margins
Inverter Transient Response
Current equation for a change of $V_{in}$ from $V_{OL}$ to $V_{OH}$:
$$I_D = -C_{out}\frac{dV_{out}}{dt} + I_L \tag{1.111}$$
and for a change of $V_{in}$ from $V_{OH}$ to $V_{OL}$:
$$I_L(V_{out}) = C_{out}\frac{dV_{out}}{dt} \tag{1.112}$$
Figure 1.38: Inverter transient response definitions
1.3.2 Inverter with Linear Resistor Load
The load is described by
$$V_{out} = V_{DD} - I_LR_L \tag{1.113}$$
$$I_L = \frac{V_{DD} - V_{out}}{R_L} \tag{1.114}$$
To obtain the VTC, the load current must be set equal to the driver current ($I_L = I_D$), assuming a slow change of $V_{in}$.
When the driver is in cutoff ($V_{in} < V_T$ ⇒ $I_D = 0$), there is zero voltage drop across $R_L$ and $V_{out} = V_{DS} = V_{OH}$. When $V_{in}$ is increased, the driver starts conducting in saturation mode, because the output voltage is initially high, so $V_{out} = V_{DS} > (V_{GS} - V_T)$. In this case, the VTC equation is
$$V_{out} = V_{DD} - \frac{\beta}{2}R_L(V_{in} - V_T)^2. \tag{1.115}$$
When $V_{in}$ is increased further, $V_{out} = V_{DS}$ drops to the value $(V_{in} - V_T)$ and the driver changes to ohmic mode, where the VTC equation is given by
$$V_{out} = V_{DD} - \frac{\beta}{2}R_L\left[2(V_{in} - V_T)V_{out} - V_{out}^2\right]. \tag{1.116}$$
Figure 1.39: Physical reason for transition times
Calculation of VOH
VOH = VDD because ID = 0 when driver in cutoff.
Calculation of VOL
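$V_{in} = V_{OH}$ and the driver is nonsaturated, because $V_{out} < V_{in} - V_T$.
$$\frac{V_{DD} - V_{OL}}{R_L} = \frac{\beta_D}{2}\left[2(V_{OH} - V_T)V_{OL} - V_{OL}^2\right] \tag{1.117}$$
$$\Leftrightarrow\quad V_{OL}^2 - 2\left(\frac{1}{\beta_DR_L} + V_{DD} - V_T\right)V_{OL} + \frac{2V_{DD}}{\beta_DR_L} = 0 \tag{1.118}$$
Solving this quadratic equation yields $V_{OL}$.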
Figure 1.40: Inverter with linear resistor load
Calculation of VIL
For $V_{in} = V_{IL}$ the driver transistor is saturated, because $V_{out}$ is slightly below $V_{OH}$. From $I_D = I_L$ follows:
$$\frac{\beta_D}{2}(V_{in} - V_T)^2 = \frac{V_{DD} - V_{out}}{R_L} \tag{1.119}$$
$V_{IL}$ is defined as the point where
$$\frac{dV_{out}}{dV_{in}} = -1 \tag{1.120}$$
Differentials of both sides of $I_D(V_{in}) = I_L(V_{out})$:
$$\frac{dI_D}{dV_{in}}\,dV_{in} = \frac{dI_L}{dV_{out}}\,dV_{out} \tag{1.121}$$
$$\Leftrightarrow\quad \frac{dV_{out}}{dV_{in}} = \frac{dI_D/dV_{in}}{dI_L/dV_{out}} = \frac{\beta_D(V_{in} - V_T)}{-\frac{1}{R_L}} \tag{1.122}$$
$$= -\beta_DR_L(V_{in} - V_T) = -1 \tag{1.123}$$
Figure 1.41: VTC for linear resistor load nMOS inverter
With $V_{in} = V_{IL}$:
$$V_{IL} = V_T + \frac{1}{\beta_DR_L} \tag{1.124}$$
Replacing $V_{in}$ in equation 1.119 by this expression yields for $V_{out}$:
$$V_{out}(V_{IL}) = V_{DD} - \frac{1}{2\beta_DR_L} \tag{1.125}$$
Calculation of VIH
For $V_{in} = V_{IH}$, $V_{out} < (V_{GS} - V_T)$, so the driver is in the ohmic (nonsaturated) mode. Equating $I_D$ and $I_L$ gives
$$\frac{\beta_D}{2}\left[2(V_{IH} - V_T)V_{out} - V_{out}^2\right] = \frac{1}{R_L}(V_{DD} - V_{out}) \tag{1.126}$$
Evaluating the condition $(dV_{out}/dV_{in}) = -1$ for $I_D(V_{in}, V_{out}) = I_L(V_{out})$ gives
$$\frac{\partial I_D}{\partial V_{in}}\,dV_{in} + \frac{\partial I_D}{\partial V_{out}}\,dV_{out} = \frac{dI_L}{dV_{out}}\,dV_{out}. \tag{1.127}$$
Rearranging,
$$\frac{dV_{out}}{dV_{in}} = \frac{\partial I_D/\partial V_{in}}{dI_L/dV_{out} - \partial I_D/\partial V_{out}} \tag{1.128}$$
computing the derivatives and setting the slope to $-1$,
$$\frac{\beta_DV_{out}}{\frac{1}{R_L} + \beta_D(V_{in} - V_T - V_{out})} = 1 \tag{1.129}$$
rearranging and setting $V_{in} = V_{IH}$ yields
$$V_{out} = \frac{1}{2}(V_{IH} - V_T) + \frac{1}{2\beta_DR_L} \tag{1.130}$$
Substituting this expression for $V_{out}$ in equation 1.126 gives
$$(V_{IH} - V_T)^2 + \frac{2}{\beta_DR_L}(V_{IH} - V_T) - \left(\frac{8V_{DD}}{3\beta_DR_L} - \frac{1}{\beta_D^2R_L^2}\right) = 0 \tag{1.131}$$
$V_{IH}$ can be computed by solving this quadratic equation and selecting the physically meaningful root.
Calculation of Vth
The inverter threshold voltage is defined as the VTC point where $V_{in} = V_{out}$. The current equation can be written as (with $V_{th} = V_{in} = V_{out}$):
$$\frac{\beta_D}{2}(V_{th} - V_T)^2 = \frac{V_{DD} - V_{th}}{R_L} \tag{1.132}$$
Rearranging and solving the equation
$$V_{th}^2 - 2\left(V_T - \frac{1}{\beta_DR_L}\right)V_{th} + \left(V_T^2 - \frac{2V_{DD}}{\beta_DR_L}\right) = 0 \tag{1.133}$$
yields $V_{th}$.
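The critical voltages of the resistor-load inverter follow from the quadratics (1.118), (1.131) and (1.133) and from (1.124). The Python sketch below collects them in one function; the root selection (smaller root for V_OL, positive root for V_IH − V_T and for V_th) is the physically meaningful choice under the region assumptions stated in the text, and the function name and the example values (V_DD = 5 V, V_T = 1 V, β_D = 100 µA/V², R_L = 40 kΩ) are assumptions for illustration.

```python
import math


def resistor_load_inverter(vdd, vt, beta_d, r_l):
    """Critical VTC points of an nMOS inverter with linear resistor load."""
    a = 1.0 / (beta_d * r_l)                  # recurring term 1/(beta_D R_L) [V]

    voh = vdd                                 # driver in cutoff, ID = 0
    # Eq. (1.118): smaller root, so that VOL < VOH - VT (driver nonsaturated)
    b = a + vdd - vt
    vol = b - math.sqrt(b * b - 2.0 * vdd * a)
    vil = vt + a                              # Eq. (1.124)
    # Eq. (1.131): (x + a)^2 = 8 VDD a / 3 with x = VIH - VT, positive root
    vih = vt - a + math.sqrt(8.0 * vdd * a / 3.0)
    # Eq. (1.133): root with Vth > VT (driver conducting)
    vth = (vt - a) + math.sqrt(a * a + 2.0 * a * (vdd - vt))

    return {"VOH": voh, "VOL": vol, "VIL": vil, "VIH": vih, "Vth": vth,
            "NML": vil - vol, "NMH": voh - vih}


pts = resistor_load_inverter(vdd=5.0, vt=1.0, beta_d=1e-4, r_l=40e3)
```

The noise margins NML = VIL − VOL and NMH = VOH − VIH are returned alongside, so the effect of changing β_D R_L on both margins can be explored directly.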
1.3.3 Inverter Design: Resistor Model
In this approach $V_{OH}$ and $V_{OL}$ are of primary and $V_{IH}$ and $V_{IL}$ of secondary importance. The inverter is modeled as a series resistive voltage divider.
Figure 1.42: VOH resistor model
$$V_{OH} = \frac{R_{off}}{R_{off} + R_L}\,V_{DD} \tag{1.134}$$
Figure 1.43: VOL resistor model
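For $R_L \ll R_{off}$, $V_{OH} \simeq V_{DD}$. Current equation for $V_{OL}$ (assuming $V_{in} = V_{OH}$):
$$\frac{\beta_D}{2}\left[2(V_{OH} - V_T)V_{OL} - V_{OL}^2\right] = \frac{V_{DD} - V_{OL}}{R_L} \tag{1.135}$$
Rearrangement yields
$$R_L\left(\frac{W}{L}\right)_D = \frac{2(V_{DD} - V_{OL})}{k'\left[2(V_{OH} - V_T)V_{OL} - V_{OL}^2\right]} \tag{1.136}$$
with $\beta_D = k'(W/L)$. This equation gives the product $R_L(W/L)$ needed for a given output low voltage $V_{OL}$. The driver on-resistance can be written as follows:
$$R_{on} = \frac{V_{OL}}{I_D} = \frac{1}{k'\left(\frac{W}{L}\right)_D\left[(V_{OH} - V_T) - \frac{1}{2}V_{OL}\right]} \tag{1.137}$$
$R_{on}$ → as small as possible ⇒ $(W/L)$ → as high as possible (ratio logic)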
1.3.4 Inverter with Saturated Enhancement Load
Figure 1.44: Saturated enhancement load nMOS inverter
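With $V_{GSL} = V_{DSL}$ ⇒ $V_{DSL} > (V_{GSL} - V_{TL})$ the load is automatically saturated and the current is given by
$$I_L = \frac{k'}{2}\left(\frac{W}{L}\right)_L(V_{GSL} - V_{TL})^2 \tag{1.138}$$
Since $V_{GSL} = (V_{DD} - V_{out})$ and $V_{out} = V_{DSD}$,
$$I_D = I_L = \frac{k'}{2}\left(\frac{W}{L}\right)_L\left[V_{DD} - V_{DSD} - V_{TL}(V_{DSD})\right]^2 \tag{1.139}$$
$V_{SBL} = V_{out}$, so
$$V_{TL} = V_{T0L} + \gamma\left(\sqrt{V_{out} + 2|\phi_F|} - \sqrt{2|\phi_F|}\right) \tag{1.140}$$
The driver is in cutoff for $V_{in} < V_{TD}$ ⇒ $V_{out} = V_{OH}$. As $V_{in}$ increases above $V_{TD}$ the driver is saturated, so
$$\frac{\beta_D}{2}(V_{in} - V_{TD})^2 = \frac{\beta_L}{2}\left[V_{DD} - V_{out} - V_{TL}(V_{out})\right]^2 \tag{1.141}$$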
Figure 1.45: VTC for saturated enhancement load nMOS inverter
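When $V_{in}$ is increased further and the condition $V_{out} < (V_{in} - V_{TD})$ becomes true, the driver is nonsaturated and the current balance is
$$\frac{\beta_D}{2}\left[2(V_{in} - V_{TD})V_{out} - V_{out}^2\right] = \frac{\beta_L}{2}\left[V_{DD} - V_{out} - V_{TL}(V_{out})\right]^2. \tag{1.142}$$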
1.3.5 Inverter with Nonsaturated Enhancement Load
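The condition for the load being in the nonsaturated region is
$$V_{DSL} < V_{GSL} - V_{TL}(V_{BSL}) \tag{1.143}$$
$$\Leftrightarrow\quad V_{GG} > V_{DD} + V_{TL}(V_{DD}) \tag{1.144}$$
This extra bias ensures that $V_{OH} = V_{DD}$.
Writing $V_{DSL} = (V_{DD} - V_{out})$ and $V_{GSL} = (V_{GG} - V_{out})$, the nonsaturated load current is given by
$$I_L = \frac{\beta_L}{2}\left[2(V_{GG} - V_{out} - V_{TL})(V_{DD} - V_{out}) - (V_{DD} - V_{out})^2\right]. \tag{1.145}$$
The load line is obtained from this equation by setting $I_D = I_L$, $V_{out} = V_{DSD}$ and rearranging:
$$I_D = \frac{\beta_L}{2}\left(2V_{GG} - V_{DD} - 2V_{TL} - V_{DSD}\right)\left(V_{DD} - V_{DSD}\right). \tag{1.146}$$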
Figure 1.46: Nonsaturated enhancement load nMOS inverter
1.3.6 Inverter with Depletion mode MOSFET Load
The ideal load line in Fig. 1.49 applies to the case in which the body-bias effects of the load transistor are ignored.
Because $V_{GSL} = 0 > V_{TL}$ is always satisfied, there always exists a conducting channel in the depletion load.
$$V_{DSL,sat} = (V_{GSL} - V_{TL}) = |V_{TL}| \tag{1.147}$$
Border between saturated and nonsaturated load region:
$$V_{DD} - V_{out} = |V_{TL}|. \tag{1.148}$$
$$V_{TL}(V_{out}) = V_{T0L} + \gamma_L\left(\sqrt{V_{out} + 2|\phi_{F,L}|} - \sqrt{2|\phi_{F,L}|}\right) \tag{1.149}$$
Condition for the load being in saturation: $V_{out}$ small ⇒ $(V_{DD} - V_{out}) > |V_{TL}(V_{out})|$
$$I_L = \frac{\beta_L}{2}\left[V_{GSL} - V_{TL}(V_{out})\right]^2 = \frac{\beta_L}{2}\left[-V_{TL}(V_{out})\right]^2 \tag{1.150}$$
Figure 1.47: VTC for nonsaturated enhancement load nMOS inverter
Figure 1.48: Symbol for depletion mode MOSFET
Condition for the load being in nonsaturation: $(V_{DD} - V_{out}) < |V_{TL}(V_{out})|$
$$I_L = \frac{\beta_L}{2}\left[2|V_{TL}(V_{out})|(V_{DD} - V_{out}) - (V_{DD} - V_{out})^2\right] \tag{1.151}$$
For the following discussion it is assumed that
$$V_{TD} < |V_{TL}| < V_{DD} \tag{1.152}$$
When $V_{in} < V_{TD}$, the driver is in cutoff and the load provides a conduction path between $V_{DD}$ and $V_{out}$, so $V_{out} \simeq V_{OH} \simeq V_{DD}$.
When $V_{in}$ is increased above $V_{TD}$, the driver enters the saturation region while the load remains ohmic ($V_{DD} - V_{out} < |V_{TL}|$):
$$\frac{\beta_D}{2}(V_{in} - V_{TD})^2 = \frac{\beta_L}{2}\left[2|V_{TL}(V_{out})|(V_{DD} - V_{out}) - (V_{DD} - V_{out})^2\right] \tag{1.153}$$
Figure 1.49: Depletion mode MOSFET load
When $V_{in}$ is increased further, either the driver or the load changes its operating region. If
$$V_{out} < V_{DD} - |V_{TL}| \tag{1.154}$$
is satisfied first, the load changes to saturation while the driver remains in saturation; otherwise
$$V_{out} < V_{in} - V_{TD} \tag{1.155}$$
is satisfied first and the driver becomes nonsaturated while the load is still nonsaturated.
When $V_{in}$ is further increased to a voltage slightly below $V_{DD}$, the driver is nonsaturated and the load is in the saturation region:
$$\frac{\beta_D}{2}\left[2(V_{in} - V_{TD})V_{out} - V_{out}^2\right] = \frac{\beta_L}{2}\left[-V_{TL}(V_{out})\right]^2 \tag{1.156}$$
Calculation of VOH
Usually taken:
$$V_{OH} \simeq V_{DD} \tag{1.157}$$
Figure 1.50: VTC for inverter with depletion mode MOSFET load
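Taking into consideration the resistance of the load device:
$$V_{OH} = V_{DD} - V_{DSL}\big|_{V_{in}=0} \tag{1.158}$$
The current $I_L$ is in this case the driver leakage current. The conductance of the nonsaturated load is:
$$G_{DSL} = \frac{I_L}{V_{DSL}} = \frac{\beta_L}{2}\left[2|V_{TL}(V_{OH})| - (V_{DD} - V_{OH})\right] \tag{1.159}$$
With $V_{DSL} = I_L/G_{DSL}$ results:
$$V_{OH} = V_{DD} - \frac{I_L}{\frac{k'_L}{2}\left(\frac{W}{L}\right)_L\left[2|V_{TL}(V_{OH})| - (V_{DD} - V_{OH})\right]} \tag{1.160}$$
Calculation of VOL
$$\frac{\beta_D}{2}\left[2(V_{in} - V_{TD})V_{out} - V_{out}^2\right] = \frac{\beta_L}{2}\left[-V_{TL}(V_{out})\right]^2 \tag{1.161}$$
Setting $V_{in} = V_{OH}$ and $V_{out} = V_{OL}$ yields, with the driver-load ratio $\beta_R = \beta_D/\beta_L$,
$$\beta_R\left[2(V_{OH} - V_{TD})V_{OL} - V_{OL}^2\right] = |V_{TL}(V_{OL})|^2 \tag{1.162}$$
Rearranging
$$V_{OL}^2 - 2(V_{OH} - V_{TD})V_{OL} + \frac{1}{\beta_R}|V_{TL}(V_{OL})|^2 = 0 \tag{1.163}$$
and solving this quadratic equation (body bias is ignored at this step) yields
$$V_{OL} = (V_{OH} - V_{TD}) - \sqrt{(V_{OH} - V_{TD})^2 - \frac{1}{\beta_R}|V_{TL}(V_{OL})|^2}. \tag{1.164}$$
⇒ final result for $V_{OL}$ by iteration of this equation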
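The iteration mentioned for Eq. (1.164) can be sketched as a fixed-point loop: start with V_OL = 0 (body bias ignored), evaluate V_TL(V_OL) from (1.149), insert it into (1.164) and repeat until the value no longer changes. The function names, the stopping tolerance and the example values (V_OH = 5 V, V_TD = 1 V, β_R = 4, V_T0L = −3 V, γ_L = 0.37 V^(1/2), 2|φ_F,L| = 0.6 V) are assumptions chosen for illustration.

```python
import math


def load_threshold(vout, vt0l, gamma_l, two_phi_f):
    """Body-biased threshold of the depletion load, Eq. (1.149)."""
    return vt0l + gamma_l * (math.sqrt(vout + two_phi_f) - math.sqrt(two_phi_f))


def solve_vol(voh, vtd, beta_r, vt0l, gamma_l, two_phi_f, tol=1e-9, max_iter=100):
    """Fixed-point iteration of Eq. (1.164) for the output low voltage."""
    vol = 0.0                                   # first pass: body bias ignored
    for _ in range(max_iter):
        vtl = load_threshold(vol, vt0l, gamma_l, two_phi_f)
        disc = (voh - vtd) ** 2 - vtl ** 2 / beta_r
        if disc < 0.0:
            raise ValueError("no real solution: beta_R too small for these thresholds")
        new_vol = (voh - vtd) - math.sqrt(disc)
        if abs(new_vol - vol) < tol:
            return new_vol
        vol = new_vol
    raise RuntimeError("VOL iteration did not converge")


vol_dep = solve_vol(voh=5.0, vtd=1.0, beta_r=4.0,
                    vt0l=-3.0, gamma_l=0.37, two_phi_f=0.6)
```

Because V_OL is small, the body-bias correction to V_TL is small as well, so a handful of iterations is normally enough; the explicit error for a negative discriminant flags a driver-load ratio that cannot reach the nonsaturated driver region.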
Design of Depletion Mode Inverters
The output voltages $V_{OL}$ and $V_{OH}$ are tuned to predefined values by adjusting the $(W/L)$ ratios. For $V_{OH}$ the following equation has been given before:
$$V_{OH} = V_{DD} - \frac{I_L}{\frac{k'_L}{2}\left(\frac{W}{L}\right)_L\left[2|V_{TL}(V_{OH})| - (V_{DD} - V_{OH})\right]}, \tag{1.165}$$
where $I_L$ is constrained by the driver leakage current.
The difference $V_{DD} - V_{OH}$ may be decreased by
• increasing $(W/L)_L$ by the designer (more chip area required)
• adjusting a proper process parameter $V_{T0L}$
$$V_{TL}(V_{OH}) = V_{T0L} + \gamma_L\left(\sqrt{V_{OH} + 2|\phi_{F,L}|} - \sqrt{2|\phi_{F,L}|}\right) \tag{1.166}$$
Setting $V_{OL}$: rearranging the current equation for $V_{OL}$ gives
$$\beta_R = \frac{|V_{TL}(V_{OL})|^2}{2(V_{OH} - V_{TD})V_{OL} - V_{OL}^2} \tag{1.167}$$
where the driver-load ratio is
$$\beta_R = \frac{\beta_D}{\beta_L} = \frac{k'_D\left(\frac{W}{L}\right)_D}{k'_L\left(\frac{W}{L}\right)_L} \tag{1.168}$$
Figure 1.51: Driver-load ratio for depletion-load inverter
If the design problem is described by a simplified resistive network, the driver on-resistance can be written as
$$R_{on} = \frac{V_{OL}}{I_D} = \frac{1}{\frac{k'_D}{2}\left(\frac{W}{L}\right)_D\left[2(V_{OH} - V_{TD}) - V_{OL}\right]} = \frac{1}{G_{on}} \tag{1.169}$$
With the load in saturation, the drain-source resistance of the depletion-mode MOSFET is
$$R_{DSL} = \frac{1}{G_{DSL}} = \frac{V_{DD} - V_{OL}}{\frac{k'_L}{2}\left(\frac{W}{L}\right)_L|V_{TL}(V_{OL})|^2} \tag{1.170}$$
The equation
$$V_{OL} = \frac{R_{on}}{R_{on} + R_{DSL}}\,V_{DD} \tag{1.171}$$
implies that the value of $V_{OL}$ is lowered by increasing $\beta_R$. The transistor conductances are proportional to their $(W/L)$ ratios.
1.3.7 CMOS inverter
Advantages of CMOS:
• CMOS circuits dissipate power only during switching events. When the inputs are stable, only leakage currents are required from the power supply. (NMOS: current flows when the driver is on.)
• VOH = VDD and VOL = 0 V
• the voltage transfer curve of a CMOS inverter exhibits a sharp transition
Disadvantages of CMOS:
• processing is more complex than for NMOS: extra processing steps must be added to create n-tub areas for p-transistor realizations (including an extra step for adjusting the threshold voltage of the p-channel device)
• additional processing steps for latch-up prevention: guard rings prevent unwanted forward-biased pn junctions
• CMOS realizations of circuits generally require more transistors than equivalent NMOS designs
CMOS Inverter Characteristics
$$V_{in} = V_{GSn} = V_{DD} + V_{GSp} \tag{1.172}$$
$$V_{out} = V_{DSn} = V_{DD} + V_{DSp} \tag{1.173}$$
For $V_{in} < V_{Tn}$ ⇒ $V_{out} = V_{DD}$ the nMOS transistor is in cutoff while the pMOS transistor is in nonsaturation ($|V_{DSp}| = |V_{out} - V_{DD}| < |V_{GSp} - V_{Tp}| = |V_{in} - V_{DD} - V_{Tp}|$).
When $V_{in}$ is increased to values above $V_{Tn}$, the nMOS transistor starts conducting in saturation mode while the pMOS transistor is still in the ohmic region:
$$\frac{\beta_n}{2}(V_{in} - V_{Tn})^2 = \frac{\beta_p}{2}\left[2(V_{DD} - V_{in} - |V_{Tp}|)(V_{DD} - V_{out}) - (V_{DD} - V_{out})^2\right] \tag{1.174}$$
As $V_{in}$ is increased further, $V_{out}$ decreases. When the point is reached where
$$(V_{DD}-V_{out}) > (V_{DD}-V_{in}-|V_{Tp}|), \tag{1.175}$$
both transistors are in saturation:
$$\frac{\beta_n}{2}(V_{in}-V_{Tn})^2 = \frac{\beta_p}{2}(V_{DD}-V_{in}-|V_{Tp}|)^2 \tag{1.176}$$
When $V_{out}$ falls to a level where
$$V_{out} < (V_{in}-V_{Tn}), \tag{1.177}$$
the nMOS transistor becomes nonsaturated:
$$\frac{\beta_n}{2}\left[2(V_{in}-V_{Tn})V_{out}-V_{out}^2\right] = \frac{\beta_p}{2}(V_{DD}-V_{in}-|V_{Tp}|)^2 \tag{1.178}$$
When the point is reached where
$$(V_{DD}-V_{in}) < |V_{Tp}| \tag{1.179}$$
the pMOS transistor goes into cutoff (⇒ $I_{Dn} = I_{Dp} = 0$, $V_{out} = 0$).
Calculation of VOH
$V_{OH} \simeq V_{DD}$ when $V_{in} < V_{Tn}$ (n-channel transistor in cutoff, current is leakage current only)
Calculation of VOL
$V_{OL} \simeq 0$ when $(V_{DD}-V_{in}) < |V_{Tp}|$ (p-channel transistor in cutoff)
Calculation of VIL
Equating the currents of the saturated nMOS and the nonsaturated pMOS device:
$$\frac{\beta_n}{2}(V_{IL}-V_{Tn})^2 = \frac{\beta_p}{2}\left[2(V_{DD}-V_{IL}-|V_{Tp}|)(V_{DD}-V_{out})-(V_{DD}-V_{out})^2\right] \tag{1.180}$$
Evaluating the condition $(dV_{out}/dV_{in}) = -1$ for $I_{Dn}(V_{in}) = I_{Dp}(V_{in},V_{out})$:
$$\frac{dV_{out}}{dV_{in}} = \frac{(dI_{Dn}/dV_{in})-(\partial I_{Dp}/\partial V_{in})}{\partial I_{Dp}/\partial V_{out}} = -1 \tag{1.181}$$
Evaluating the derivatives gives
$$V_{IL}\left(1+\frac{\beta_n}{\beta_p}\right) = 2V_{out}+\frac{\beta_n}{\beta_p}V_{Tn}-V_{DD}-|V_{Tp}| \tag{1.182}$$
This equation has to be solved together with equation 1.180 ⇒ $V_{IL}$.
Calculation of VIH
At this point of the VTC the nMOS device is nonsaturated and the pMOS transistor is saturated.
$$\frac{\beta_n}{2}\left[2(V_{IH}-V_{Tn})V_{out}-V_{out}^2\right] = \frac{\beta_p}{2}(V_{DD}-V_{IH}-|V_{Tp}|)^2 \tag{1.183}$$
The derivative condition $(dV_{out}/dV_{in}) = -1$ has to be evaluated for $I_{Dn}(V_{in},V_{out}) = I_{Dp}(V_{in})$:
$$\frac{dV_{out}}{dV_{in}} = \frac{(dI_{Dp}/dV_{in})-(\partial I_{Dn}/\partial V_{in})}{\partial I_{Dn}/\partial V_{out}} = -1 \tag{1.184}$$
which gives
$$V_{IH}\left(1+\frac{\beta_p}{\beta_n}\right) = 2V_{out}+V_{Tn}+\frac{\beta_p}{\beta_n}(V_{DD}-|V_{Tp}|) \tag{1.185}$$
Together with equation 1.183, this equation forms a quadratic in $V_{IH}$ which has to be solved.
Calculation of Vth
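For $V_{th} = V_{in} = V_{out}$ both transistors are saturated.
$$\frac{\beta_n}{2}(V_{th}-V_{Tn})^2 = \frac{\beta_p}{2}(V_{DD}-V_{th}-|V_{Tp}|)^2 \tag{1.186}$$
Solving for $V_{th}$ yields:
$$V_{th} = \frac{V_{Tn}+\sqrt{\beta_p/\beta_n}\,(V_{DD}-|V_{Tp}|)}{1+\sqrt{\beta_p/\beta_n}} \tag{1.187}$$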
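The threshold expression (1.187) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the device values (5 V supply, 1 V thresholds, the β values) are illustrative assumptions. It evaluates (1.187) and confirms that the two saturation currents of (1.186) are equal at the computed point.

```python
import math

def inverter_threshold(vdd, vtn, vtp_abs, beta_n, beta_p):
    """Inverter threshold V_th according to equation (1.187)."""
    r = math.sqrt(beta_p / beta_n)
    return (vtn + r * (vdd - vtp_abs)) / (1.0 + r)

def saturation_currents(vth, vdd, vtn, vtp_abs, beta_n, beta_p):
    """Left and right side of equation (1.186) at V_in = V_out = V_th."""
    i_n = 0.5 * beta_n * (vth - vtn) ** 2
    i_p = 0.5 * beta_p * (vdd - vth - vtp_abs) ** 2
    return i_n, i_p

# Illustrative (assumed) values, not taken from the notes.
vth_symmetric = inverter_threshold(5.0, 1.0, 1.0, 40e-6, 40e-6)
vth_strong_n = inverter_threshold(5.0, 1.0, 1.0, 100e-6, 40e-6)
```

With equal β values and equal threshold magnitudes the result sits at VDD/2; a stronger nMOS device pulls the threshold below that value.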
Design
While in nMOS design considerable effort has to be made to optimize the levels of $V_{OH}$ and $V_{OL}$, the ratio (W/L) in CMOS design is used to set the level of $V_{th}$ ($V_{OH} = V_{DD}$, $V_{OL} = 0$).
$$\frac{\beta_p}{\beta_n} = \frac{\mu_p\left(\frac{W}{L}\right)_p}{\mu_n\left(\frac{W}{L}\right)_n} \tag{1.188}$$
The ratio required to establish a given inverter threshold voltage is
$$\sqrt{\frac{\beta_n}{\beta_p}} = \frac{V_{DD}-V_{th}-|V_{Tp}|}{V_{th}-V_{Tn}} \tag{1.189}$$
To get a symmetrical VTC, $V_{th}$ is set to $V_{DD}/2$:
$$\sqrt{\frac{\beta_n}{\beta_p}} = \frac{\frac{1}{2}V_{DD}-|V_{Tp}|}{\frac{1}{2}V_{DD}-V_{Tn}} \tag{1.190}$$
If a process sets $|V_{Tp}| = V_{Tn}$ ⇒ $\beta_p = \beta_n$, then the device aspect ratios are related by
$$\frac{\left(\frac{W}{L}\right)_p}{\left(\frac{W}{L}\right)_n} = \frac{\mu_n}{\mu_p} \tag{1.191}$$
Since $\mu_n/\mu_p \simeq 2.5$, a minimum-area CMOS inverter will have $(W/L)_n \simeq 1$ and $(W/L)_p \simeq 2.5$. In this case the VTC is completely symmetric.
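The sizing rule can be sketched as a small calculation that combines (1.188) and (1.189). The function names and the default mobility ratio of 2.5 are assumptions for illustration; the example only restates the two equations above.

```python
def beta_n_over_beta_p(vth, vdd, vtn, vtp_abs):
    """beta_n / beta_p required for a target V_th, from (1.189) squared."""
    root = (vdd - vth - vtp_abs) / (vth - vtn)
    return root ** 2

def aspect_ratio_p_over_n(vth, vdd, vtn, vtp_abs, mobility_ratio=2.5):
    """(W/L)_p / (W/L)_n from (1.188); mobility_ratio is mu_n / mu_p."""
    return mobility_ratio / beta_n_over_beta_p(vth, vdd, vtn, vtp_abs)

# Symmetric case: V_th = VDD/2 with equal threshold magnitudes (assumed values).
ratio_symmetric = aspect_ratio_p_over_n(2.5, 5.0, 1.0, 1.0)
# Lower threshold target: the nMOS must be relatively stronger.
ratio_low_vth = aspect_ratio_p_over_n(2.0, 5.0, 1.0, 1.0)
```

For the symmetric case this reproduces the ratio of (1.191); moving the target threshold below VDD/2 shrinks the required pMOS aspect ratio.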
1.4 Switching of MOS Inverters
1.4.1 The output High-to-Low Time tHL
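$$I_{DS}(V_{OUT}) = -C_{OUT}\frac{dV_{OUT}}{dt}, \qquad t_{HL} = t_2 - t_1 \tag{1.192}$$
The driver goes from cutoff through saturation into the nonsaturation region. The border between saturation and nonsaturation is reached at time $t_x$, at the output voltage $V_{out} = V_{OH} - V_{TD}$.
To simplify the final expressions, the following integrations for computing $t_{HL}$ run from $V_{OH}$ to $V_{OL}$ (the correct limits would be from $V_1 = V_{OL} + 0.9(V_{OH}-V_{OL})$ to $V_0 = V_{OL} + 0.1(V_{OH}-V_{OL})$). Here $V_T$ and $\beta$ denote the driver quantities $V_{TD}$ and $\beta_D$.
$$\text{Saturation:}\quad t_x - t_1 = -C_{OUT}\int_{V_{OH}}^{V_{OH}-V_T}\frac{dV_{OUT}}{\frac{\beta}{2}(V_{OH}-V_T)^2} \tag{1.193}$$
$$\text{Nonsaturation:}\quad t_2 - t_x = -C_{OUT}\int_{V_{OH}-V_T}^{V_{OL}}\frac{dV_{OUT}}{\frac{\beta}{2}\left[2(V_{OH}-V_T)V_{OUT}-V_{OUT}^2\right]} \tag{1.194}$$
$$\Rightarrow\ t_x - t_1 = \frac{2C_{OUT}V_T}{\beta(V_{OH}-V_T)^2}, \tag{1.195}$$
$$\text{with}\quad \int\frac{dx}{x(a+bx^n)} = \frac{1}{an}\ln\left(\frac{x^n}{a+bx^n}\right) \tag{1.196}$$
$$\text{follows:}\quad t_2 - t_x = \frac{C_{OUT}}{\beta(V_{OH}-V_T)}\ln\left[\frac{2(V_{OH}-V_T)}{V_{OL}}-1\right]; \tag{1.197}$$
$$\Rightarrow\ t_{HL} = \tau\left\{\frac{2V_T}{V_{OH}-V_T}+\ln\left[\frac{2(V_{OH}-V_T)}{V_{OL}}-1\right]\right\} \tag{1.198}$$
$$\text{with}\quad \tau = \frac{C_{OUT}}{\beta(V_{OH}-V_T)} \tag{1.199}$$
and, including the interconnection resistance $R_{LINE}$, (1.200)
$$\tau = C_{OUT}\left[\frac{1}{\beta(V_{OH}-V_T)}+R_{LINE}\right] \tag{1.201}$$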
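As a sanity check on the closed form (1.198), the discharge integral can also be evaluated numerically. The sketch below is an illustration under assumed values (100 fF load, β = 50 µA/V², VOH = 5 V, VOL = 0.3 V, VT = 1 V) and uses a plain midpoint rule; the function names are hypothetical. The numeric version ignores RLINE, so the comparison is made with RLINE = 0.

```python
import math

def t_hl_closed_form(c_out, beta, v_oh, v_ol, v_t, r_line=0.0):
    """High-to-low time from (1.198) with tau from (1.199)/(1.201)."""
    a = v_oh - v_t
    tau = c_out * (1.0 / (beta * a) + r_line)
    return tau * (2.0 * v_t / a + math.log(2.0 * a / v_ol - 1.0))

def driver_current(v_out, beta, v_oh, v_t):
    """Driver current with V_GS = V_OH: saturation above V_OH - V_T, else nonsaturation."""
    a = v_oh - v_t
    if v_out >= a:
        return 0.5 * beta * a * a
    return 0.5 * beta * (2.0 * a * v_out - v_out * v_out)

def t_hl_numeric(c_out, beta, v_oh, v_ol, v_t, steps=20000):
    """Midpoint-rule evaluation of C_OUT * integral of dV / I_D(V) from V_OL to V_OH."""
    dv = (v_oh - v_ol) / steps
    total = 0.0
    for k in range(steps):
        v = v_ol + (k + 0.5) * dv
        total += dv / driver_current(v, beta, v_oh, v_t)
    return c_out * total

t_closed = t_hl_closed_form(100e-15, 50e-6, 5.0, 0.3, 1.0)
t_numeric = t_hl_numeric(100e-15, 50e-6, 5.0, 0.3, 1.0)
```

The two results should agree to within the integration error, which confirms that the logarithm argument uses VOH − VT.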
1.4.2 Rise Time tLH
$$I_L(V_{OUT}) = C_{OUT}\frac{dV_{OUT}}{dt} \tag{1.202}$$
$$t_{LH} = \int_{t_1}^{t_2}dt = C_{OUT}\int_{V_0}^{V_1}\frac{dV_{OUT}}{I_L(V_{OUT})} \tag{1.203}$$
NMOS Rise Time for Resistor Load
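$$I_L = \frac{V_{DD}-V_{out}}{R_L} \tag{1.204}$$
$$t_{LH} = R_LC_{OUT}\int_{V_0}^{V_1}\frac{dV_{OUT}}{V_{DD}-V_{OUT}} = R_LC_{OUT}\ln\left(\frac{V_{DD}-V_0}{V_{DD}-V_1}\right) \tag{1.205}$$
with $V_0$ = 10% of the whole voltage swing, (1.206)
$$V_0 \simeq V_{OL}+0.1(V_{DD}-V_{OL}), \tag{1.207}$$
and $V_1$ = 90% of the whole swing, (1.208)
$$V_1 \simeq V_{OL}+0.9(V_{DD}-V_{OL}) \tag{1.209}$$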
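Inserting the 10% and 90% points into (1.205) makes the logarithm argument equal to 9 regardless of VOL, so the rise time reduces to RL·COUT·ln 9. A minimal sketch of that observation follows; the function name and the component values are illustrative assumptions.

```python
import math

def rise_time_resistor_load(r_l, c_out, v_dd, v_ol):
    """10%-90% rise time of a resistor-load inverter, equations (1.205)-(1.209)."""
    v0 = v_ol + 0.1 * (v_dd - v_ol)
    v1 = v_ol + 0.9 * (v_dd - v_ol)
    return r_l * c_out * math.log((v_dd - v0) / (v_dd - v1))

# Assumed example: 10 kOhm load, 100 fF output capacitance.
t_lh_a = rise_time_resistor_load(1e4, 100e-15, 5.0, 0.2)
t_lh_b = rise_time_resistor_load(1e4, 100e-15, 5.0, 0.5)
```

Both calls return the same value, since the swing-relative limits cancel VOL out of the ratio.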
NMOS Rise Time for Depletion Load Inverter
First approximation: $I_L = \frac{\beta_L}{2}|V_{TL}(V_{OUT})|^2$
$$t_{LH} = C_{OUT}\cdot\frac{\Delta V}{I_L} = \frac{2C_{OUT}(V_1-V_0)}{\beta_L|V_{TL}|^2} \tag{1.210}$$
With more accuracy: $V_{TL}$ is not constant because of the substrate effect (body-bias effect). The depletion MOSFET changes from saturation to nonsaturation if $V_{DD}-V_{OUT} < |V_{TL}|$.
Nonsaturation:
$$I_L = \frac{\beta_L}{2}\left[2|V_{TL}(V_{OUT})|(V_{DD}-V_{OUT})-(V_{DD}-V_{OUT})^2\right] \tag{1.211}$$
$$t_{LH} = C_{OUT}\int_{V_0}^{V_{DD}-|V_{TL}|}\frac{dV_{OUT}}{I_L(\mathrm{SAT})}+C_{OUT}\int_{V_{DD}-|V_{TL}|}^{V_1}\frac{dV_{OUT}}{I_L(\mathrm{nonSAT})} \tag{1.212}$$
$$= \frac{C_{OUT}}{\beta_L|V_{TL}|}\left\{\frac{2(V_{DD}-|V_{TL}|-V_0)}{|V_{TL}|}+\ln\left[\frac{2|V_{TL}|-(V_{DD}-V_1)}{V_{DD}-V_1}\right]\right\} \tag{1.213}$$
Load charge time constant: $\tau_L = \frac{C_{OUT}}{\beta_L|V_{TL}|}$; with interconnect line resistance $R_{LINE}$: $\tau_L = \left(\frac{1}{\beta_L|V_{TL}|}+R_{LINE}\right)C_{OUT}$
Maximum switching frequency: $f_{max} = \frac{1}{t_{HL}+t_{LH}}$
1.4.3 NMOS Propagation Delay Time
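$$t_P = \frac{1}{2}(t_{PHL}+t_{PLH}), \quad\text{with}\quad V_{1/2} = \frac{1}{2}(V_{OL}+V_{OH}) \tag{1.214}$$
$$t_{PHL} = -C_{OUT}\int_{V_{OH}}^{V_{1/2}}\frac{dV_{OUT}}{I_D(V_{OUT})} \tag{1.215}$$
$$= -C_{OUT}\int_{V_{OH}}^{V_{OH}-V_{TD}}\frac{dV_{OUT}}{\frac{\beta_D}{2}(V_{OH}-V_{TD})^2}-C_{OUT}\int_{V_{OH}-V_{TD}}^{V_{1/2}}\frac{dV_{OUT}}{\frac{\beta_D}{2}\left[2(V_{OH}-V_{TD})V_{OUT}-V_{OUT}^2\right]} \tag{1.216}$$
$$= \tau_D\left\{\frac{2V_{TD}}{V_{OH}-V_{TD}}+\ln\left[\frac{4(V_{OH}-V_{TD})}{V_{OH}+V_{OL}}-1\right]\right\} \tag{1.217}$$
$$\text{with}\quad \tau_D = \frac{C_{OUT}}{\beta_D(V_{OH}-V_{TD})} \tag{1.218}$$
Depletion load
$$t_{PLH} = C_{OUT}\int_{V_{OL}}^{V_{DD}-|V_{TL}|}\frac{dV_{OUT}}{I_L(\mathrm{SAT})}+C_{OUT}\int_{V_{DD}-|V_{TL}|}^{V_{1/2}}\frac{dV_{OUT}}{I_L(\mathrm{nonSAT})} \tag{1.220}$$
$$t_{PLH} = \frac{C_{OUT}}{\beta_L|V_{TL}|}\left\{\frac{2(V_{DD}-|V_{TL}|-V_{OL})}{|V_{TL}|}+\ln\left[\frac{2|V_{TL}|-(V_{DD}-V_{1/2})}{V_{DD}-V_{1/2}}\right]\right\} \tag{1.221}$$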
1.4.4 CMOS Inverter Transient Response
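The CMOS inverter has a full supply voltage swing:
$$V_{OH} = V_{DD}\ \text{and}\ V_{OL} = 0\,\mathrm{V} \tag{1.222}$$
$$\Rightarrow\ V_0 = 0.1V_{DD}\ \text{and}\ V_1 = 0.9V_{DD}. \tag{1.223}$$
The high-to-low time $t_{HL}$ is similar to that of the NMOS inverter:
$$t_{HL} = \frac{C_{OUT}}{\beta_n(V_1-V_{Tn})}\left\{\frac{2V_{Tn}}{V_1-V_{Tn}}+\ln\left[\frac{2(V_1-V_{Tn})}{V_0}-1\right]\right\} \tag{1.224}$$
From symmetry ($V_{Tn}\to|V_{Tp}|$; $\beta_n\to\beta_p$) follows:
$$t_{LH} = \frac{C_{OUT}}{\beta_p(V_1-|V_{Tp}|)}\left\{\frac{2|V_{Tp}|}{V_1-|V_{Tp}|}+\ln\left[\frac{2(V_1-|V_{Tp}|)}{V_0}-1\right]\right\} \tag{1.225}$$
If $V_{Tn} = |V_{Tp}|$ and $\beta_n = \beta_p$ ⇒ $t_{HL} = t_{LH}$.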
1.4.5 Propagation Delay Time tp of CMOS Inverters
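$$t_{PHL} = \tau_n\left\{\frac{2V_{Tn}}{V_{OH}-V_{Tn}}+\ln\left[\frac{4(V_{OH}-V_{Tn})}{V_{OH}+V_{OL}}-1\right]\right\}. \tag{1.226}$$
From symmetry follows:
$$t_{PLH} = \tau_p\left\{\frac{2|V_{Tp}|}{V_{OH}-|V_{Tp}|}+\ln\left[\frac{4(V_{OH}-|V_{Tp}|)}{V_{OH}+V_{OL}}-1\right]\right\}. \tag{1.227}$$
$$t_p = \frac{1}{2}(t_{PHL}+t_{PLH})\quad\text{with} \tag{1.228}$$
$$\tau_n = \left[\frac{1}{\beta_n(V_1-V_{Tn})}+R_{LINE}\right]C_{OUT}\quad\text{and} \tag{1.229}$$
$$\tau_p = \left[\frac{1}{\beta_p(V_1-|V_{Tp}|)}+R_{LINE}\right]C_{OUT}. \tag{1.230}$$
For the CMOS inverter $V_{OL} = 0$, so the denominator $V_{OH}+V_{OL}$ (twice the midpoint voltage $V_{1/2}$ of equation 1.214) reduces to $V_{DD}$.
Symmetrical CMOS inverter ($V_{Tn} = |V_{Tp}| = V_T$ and $\beta_n = \beta_p$):
$$t_{PHL} = t_{PLH} = t_P = \tau_n\left\{\frac{2V_T}{V_{DD}-V_T}+\ln\left[\frac{4(V_{DD}-V_T)}{V_{DD}}-1\right]\right\} \tag{1.232}$$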
Figure 1.52: βR for various VOL choices
Figure 1.53: Basic CMOS inverter structure
Figure 1.54: CMOS inverter characteristics
Figure 1.55: Output high to low time
Figure 1.56: Rise time circuit
Figure 1.57: Depletion load rise time
Figure 1.58: Propagation delay time definitions
Figure 1.59: CMOS transient analysis
1.4.6 Power-Delay-Product (PDP)
The power-delay product characterizes the overall performance of a digital circuit:
$$PDP = P_{av}t_p \tag{1.233}$$
where $P_{av}$ is the average power dissipated by the circuit and $t_p$ is the average propagation delay time. ⇒ A small PDP is desirable.
For PDP computation the input signal waveform must be taken into consideration (Fig. 1.60).
Figure 1.60: PDP: input signal waveforms
For the following PDP analysis, simplified versions of the propagation delay time equations will be used:
$$t_{PHL} \simeq \tau_D = R_{on}C_{out} \tag{1.234}$$
$$t_{PLH} \simeq \tau_L = R_LC_{out} \tag{1.235}$$
$$\Rightarrow\ t_p \simeq \frac{1}{2}(R_{on}+R_L)C_{out}\quad\text{(average propagation delay)} \tag{1.236}$$
PDP for Resistor-Load Inverter
Figure 1.61: PDP for inverter with resistor load
With $I_{av}$ (average power supply current) the power dissipated by the circuit is
$$P_{av} = I_{av}V_{DD} \tag{1.237}$$
Static State Contribution to the PDP:
$$P_{av} \simeq \frac{V_{DD}^2}{2(R_{on}+R_L)} \tag{1.238}$$
(The factor 1/2 appears because the resistively loaded inverter is assumed to spend half of the time in the output low state; in the output high state no power is dissipated.)
$$(PDP)_{DC} \simeq \frac{1}{4}C_{out}V_{DD}^2 \tag{1.239}$$
Output Rise Interval Contribution to the PDP:
With the driver in cutoff ⇒
$$I_{av} \simeq C_{out}\frac{\Delta V}{\Delta t} = C_{out}\frac{V_l}{t_{LH}} \tag{1.240}$$
with $V_l = V_{DD}$ ($V_{out}: 0\to V_{DD}$).
$$(PDP)_{LH} \simeq C_{out}V_{DD}^2\,\frac{t_p}{t_{LH}} \tag{1.241}$$
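Output Fall Interval Contribution to the PDP:
$$I_{av} \simeq \frac{1}{2}(I_{initial}+I_{final}) \tag{1.242}$$
where
$$I_{initial} = \frac{1}{R_L}(V_{DD}-V_{OH}),\qquad I_{final} = \frac{1}{R_L}(V_{DD}-V_{OL}) \tag{1.243}$$
Assuming $V_{OL} \ll V_{OH} = V_{DD}$ ⇒
$$I_{av} \simeq \frac{V_{DD}}{2R_L} \tag{1.244}$$
With $t_{PHL} \simeq \tau_D$, $t_{HL}$ can be estimated as
$$t_{HL} \simeq 2\tau_D = 2R_{on}C_{out} \tag{1.245}$$
$$(PDP)_{HL} \simeq C_{out}V_{DD}^2\,\frac{R_{on}}{R_L}\,\frac{t_p}{t_{HL}} \tag{1.246}$$
The final PDP expression is obtained by adding all contributions:
$$PDP \simeq C_{out}V_{DD}^2\left(\frac{1}{4}+\frac{t_p}{t_{LH}}+\frac{R_{on}}{R_L}\frac{t_p}{t_{HL}}\right). \tag{1.247}$$
For well-designed inverters $R_{on} \ll R_L$. The propagation delay time is then $t_p \simeq (\tau_L/2)$. With the approximations $t_{LH} = 2\tau_L$ and $t_{HL} = 2\tau_D$ follows:
$$PDP \simeq \frac{3}{4}C_{out}V_{DD}^2 \tag{1.248}$$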
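The step from (1.247) to (1.248) can be traced numerically. The sketch below uses assumed component values and a hypothetical function name, and keeps $t_p$ in the unsimplified form (1.236). With that form the bracket evaluates to 3/4 + Ron/(2RL), which approaches 3/4 for Ron ≪ RL.

```python
def pdp_resistor_load(c_out, v_dd, r_on, r_l):
    """PDP of a resistor-load inverter from (1.247) with t_LH = 2*tau_L, t_HL = 2*tau_D."""
    t_p = 0.5 * (r_on + r_l) * c_out   # (1.236)
    t_lh = 2.0 * r_l * c_out           # t_LH = 2 * tau_L
    t_hl = 2.0 * r_on * c_out          # (1.245)
    bracket = 0.25 + t_p / t_lh + (r_on / r_l) * (t_p / t_hl)
    return c_out * v_dd ** 2 * bracket

# Assumed example: R_on = 1 kOhm, R_L = 100 kOhm, C_out = 100 fF, V_DD = 5 V.
c_out, v_dd = 100e-15, 5.0
normalized = pdp_resistor_load(c_out, v_dd, 1e3, 100e3) / (c_out * v_dd ** 2)
```

The normalized value is slightly above 0.75; the excess is the Ron/(2RL) term that (1.248) neglects.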
PDP for Depletion-Load nMOS Inverter
Static State Contribution to the PDP:
Average DC power dissipation:
$$(P_{av})_{DC} \simeq \frac{1}{2}I_{max}V_{DD} \tag{1.249}$$
where $I_{max}$ is the maximum power supply current (this occurs for $V_{out} = V_{OL}$ ⇒ $I_{max} = I_D(V_{out} = V_{OL})$). Assuming that the probability of the inverter being in this state is 50% ⇒
$$(P_{av})_{DC} \simeq \frac{\beta_D}{4}\left[2(V_{OH}-V_{TD})V_{OL}-V_{OL}^2\right]V_{DD} \tag{1.250}$$
$$\simeq \frac{\beta_L}{4}\left[V_{TL}(V_{OL})\right]^2V_{DD} \tag{1.251}$$
Output Rise and Fall Interval Contributions to the PDP:
$$(I_{av})_{LH} = \frac{1}{T}\int_0^{t_{LH}}I_L(t)\,dt \tag{1.252}$$
with
$$I_L(t) = I_D(t)+C_{out}\frac{dV_{out}}{dt} \tag{1.253}$$
Figure 1.62: Power supply currents
⇒
$$(I_{av})_{LH} = \frac{1}{T}\int_0^{t_{LH}}I_D(t)\,dt+\frac{1}{T}C_{out}\int_0^{t_{LH}}\frac{dV_{out}}{dt}\,dt \tag{1.254}$$
The first term may be rewritten using
$$I_{D,LH} \equiv \frac{1}{t_{LH}}\int_0^{t_{LH}}I_D(t)\,dt. \tag{1.255}$$
$I_{D,LH} = 0$ if $V_{in}$ is an ideal square wave (driver in cutoff). The second term can be evaluated as follows:
$$\int_0^{t_{LH}}\frac{dV_{out}}{dt}\,dt = \int_{V_0}^{V_1}dV_{out} \tag{1.256}$$
⇒
$$(I_{av})_{LH} = \frac{1}{T}\left[I_{D,LH}\,t_{LH}+C_{out}(V_1-V_0)\right] \tag{1.257}$$
The equations for the discharge time are similar:
$$(I_{av})_{HL} = \frac{1}{T}\int_0^{t_{HL}}I_L(t)\,dt \tag{1.258}$$
$$I_L(t) = I_D(t)+C_{out}\frac{dV_{out}}{dt} \tag{1.259}$$
$$(I_{av})_{HL} = \frac{1}{T}\left[I_{D,HL}\,t_{HL}-C_{out}(V_1-V_0)\right] \tag{1.260}$$
$$I_{D,HL} \equiv \frac{1}{t_{HL}}\int_0^{t_{HL}}I_D(t)\,dt. \tag{1.261}$$
So the transient power supply current is
$$(I_{av})_{transient} = \frac{1}{T}\left(I_{D,LH}\,t_{LH}+I_{D,HL}\,t_{HL}\right) \tag{1.262}$$
For the total PDP of the depletion-load inverter follows:
$$PDP \simeq \frac{1}{2}I_{max}V_{DD}t_p+\left(I_{D,LH}\,t_{LH}+I_{D,HL}\,t_{HL}\right)V_{DD}\frac{t_p}{T} \tag{1.263}$$
To understand this expression, assume that $I_{D,LH} = I_{D,HL} = I_{D,av}$. The logic switching frequency is $f = 1/T$ and the maximum switching frequency is
$$f_{max} = \frac{1}{t_{HL}+t_{LH}}. \tag{1.264}$$
⇒
$$PDP \simeq \frac{1}{2}I_{max}V_{DD}t_p+I_{D,av}V_{DD}t_p\frac{f}{f_{max}} \tag{1.265}$$
For $f \ll f_{max}$ the DC term of this equation dominates. When $f = f_{max}$, the inverter is never in the stable state where $V_{out} = V_{OL}$, so
$$PDP \simeq I_{D,av}V_{DD}t_p \tag{1.266}$$
PDP dependence:
$$PDP \sim C_{out}V_{DD}\times(\text{Voltage}) \tag{1.267}$$
PDP for CMOS Inverter
Current flows only during a switching event, so the average current in a logic cycle T can be written as
$$I_{av} = \frac{1}{T}\left[I_{Dn,LH}\,t_{LH}+I_{Dn,HL}\,t_{HL}\right]. \tag{1.268}$$
In this equation
$$I_{Dn,LH} \equiv \frac{1}{t_{LH}}\int_0^{t_{LH}}I_{Dn}(t)\,dt \tag{1.269}$$
gives the average current during the rise time, while
$$I_{Dn,HL} \equiv \frac{1}{t_{HL}}\int_0^{t_{HL}}I_{Dn}(t)\,dt \tag{1.270}$$
is the average fall time current. For a completely symmetric CMOS inverter $I_{Dn,LH} = I_{Dn,HL} = I_{Dn,av}$, so the power-delay product is given by
$$PDP_{CMOS} = I_{Dn,av}V_{DD}t_p\frac{f}{f_{max}} \tag{1.271}$$
1.4.7 MOSFET Capacitances
• MOSFET capacitances are complicated functions of the fabrication processes and the layout geometry
• nonlinear, voltage-dependent capacitances
• exact analysis not possible (⇒ computer simulation)
• here: hand computations/estimations in an average sense
Figure 1.63: Capacitances: basic MOSFET structure
MOS Overlap Capacitors
Referring to Fig. 1.64, the physical length of a polysilicon gate is given by
$$L' = L_s+L+L_d \tag{1.272}$$
The gate overlap is necessary to ensure contact between the channel and the n+ regions. The overlap capacitances are given by
$$C_{ols} = C_{ox}WL_s,\qquad C_{old} = C_{ox}WL_d \tag{1.273}$$
with $C_{ox} = \varepsilon_{ox}/x_{ox}$ (gate capacitance per unit area).
Self-aligned process: the polysilicon gate is employed as a mask to define the n+ source and drain regions. The overlaps occur because the subsequent processing steps require heating of the wafer (→ lateral diffusion).
The designer can influence the overlap capacitances only by varying the channel width W. In design rule sets the overlap capacitance is often defined by:
$$C_o = C_{ox}L_o\ \Rightarrow\ C_{ols} = C_{old} = C_oW \tag{1.274}$$
Figure 1.64: MOSFET capacitor model
MOSFET Gate Capacitances
$$C_{gs} = C_{ox}WL\,f_1(V_{GS},V_{GD}) \tag{1.275}$$
$$C_{gd} = C_{ox}WL\,f_2(V_{GS},V_{GD}) \tag{1.276}$$
$$C_{gb} = C_{ox}WL\,f_3(V_{GS},V_{GD},V_{SB}) \tag{1.277}$$
The gate-bulk capacitance consists of the gate capacitance in series with the depletion capacitance of the depletion region.
1. Cutoff: no inversion layer channel ⇒
$$C_{gb} \simeq C_{ox}WL \tag{1.278}$$
$$C_{gs} \simeq 0 \tag{1.279}$$
$$C_{gd} \simeq 0 \tag{1.280}$$
Figure 1.65: MOSFET gate capacitances in the three operational regions
2. Nonsaturation: the channel shields the bulk electrode from the gate, since the inversion layer acts as a conductor between drain and source ⇒ $C_{gb} = 0$
$$C_{gb} \simeq 0 \tag{1.281}$$
$$C_{gs} \simeq \frac{1}{2}C_{ox}WL\left(1+\frac{V_{DS}}{3V_{DS,sat}}\right) \tag{1.282}$$
$$C_{gd} \simeq \frac{1}{2}C_{ox}WL\left(1-\frac{V_{DS}}{V_{DS,sat}}\right) \tag{1.283}$$
3. Saturation: the channel shields the bulk electrode from the gate, since the inversion layer acts as a conductor between drain and source ⇒ $C_{gb} = 0$. The channel is pinched off and does not contact the drain n+ region.
$$C_{gb} \simeq 0 \tag{1.284}$$
$$C_{gs} \simeq \frac{2}{3}C_{ox}WL \tag{1.285}$$
$$C_{gd} \simeq 0 \tag{1.286}$$
Combination of the gate capacitances with the overlap contributions:
$$C_G = C_{ox}WL' \quad\text{where}\quad L' = L+2L_o \tag{1.287}$$
$$C_{GS} = C_{ols}+C_{gs} \tag{1.288}$$
$$C_{GD} = C_{old}+C_{gd} \tag{1.289}$$
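The three operating regions can be collected in one small lookup. The following sketch is an illustration only: the function name, the region labels and the example dimensions are assumptions, and the formulas are the hand-calculation approximations (1.278) to (1.286). It also shows that the nonsaturation expressions join the saturation values continuously at VDS = VDS,sat.

```python
def gate_capacitances(c_ox, w, l, region, v_ds=0.0, v_ds_sat=1.0):
    """Approximate C_gb, C_gs, C_gd for one operating region, (1.278)-(1.286)."""
    c_gate = c_ox * w * l
    if region == "cutoff":
        return {"cgb": c_gate, "cgs": 0.0, "cgd": 0.0}
    if region == "nonsaturation":
        x = v_ds / v_ds_sat
        return {"cgb": 0.0,
                "cgs": 0.5 * c_gate * (1.0 + x / 3.0),
                "cgd": 0.5 * c_gate * (1.0 - x)}
    if region == "saturation":
        return {"cgb": 0.0, "cgs": (2.0 / 3.0) * c_gate, "cgd": 0.0}
    raise ValueError("region must be cutoff, nonsaturation or saturation")

# Assumed example: C_ox in F/cm^2, W and L in cm.
edge = gate_capacitances(7e-8, 10e-4, 2e-4, "nonsaturation", v_ds=4.0, v_ds_sat=4.0)
sat = gate_capacitances(7e-8, 10e-4, 2e-4, "saturation")
```

At VDS = 0 the nonsaturation formulas split the gate capacitance equally between source and drain.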
Figure 1.66: Gate capacitances as functions of gate-source voltage
The Bulk Junction Capacitances
Figure 1.67: Expanded view of an n+ drain or source region for computing depletion capaci-tances
The reverse-biased depletion capacitance per unit area of a pn junction is given by
$$C = \frac{C_{j0}}{\left(1+\frac{V_r}{\phi_0}\right)^{1/2}} \tag{1.290}$$
where $V_r$ is the magnitude of the reverse-bias voltage applied to the junction. $\phi_0$ is the built-in potential
$$\phi_0 = \left(\frac{kT}{q}\right)\ln\left(\frac{N_dN_a}{n_i^2}\right) \tag{1.291}$$
and $C_{j0}$ is the zero-bias ($V_r = 0$) capacitance per unit area.
$$C_{j0} = \sqrt{\frac{q\varepsilon_{Si}}{2\left(\frac{1}{N_a}+\frac{1}{N_d}\right)\phi_0}} \tag{1.292}$$
The bottom capacitance can be computed simply using the doping concentrations $N_d$ and $N_a$ for the pn junction:
$$C_{bottom} = \frac{C_{j0}WY}{\left(1+\frac{V_r}{\phi_0}\right)^{1/2}} \tag{1.293}$$
For computing the sidewall capacitance the p+ channel stop doping must be taken into consideration (−→ see also the technology description later on). The sidewall capacitance is usually computed by first taking the sidewall capacitance per unit area as
$$C_{j0sw} = \sqrt{\frac{q\varepsilon_{Si}}{2\left(\frac{1}{N_{a,sw}}+\frac{1}{N_d}\right)\phi_{0sw}}} \tag{1.294}$$
where
$$\phi_{0sw} = \left(\frac{kT}{q}\right)\ln\left(\frac{N_dN_{a,sw}}{n_i^2}\right) \tag{1.295}$$
is the sidewall built-in potential. Because the n+ area has a junction depth of $x_j$, the sidewall capacitance per unit length $C_{jsw}$ is taken as
$$C_{jsw} = C_{j0sw}x_j \tag{1.296}$$
The total sidewall capacitance is then given by
$$C_{sw} = \frac{C_{jsw}\,l}{\left(1+\frac{V_r}{\phi_{0sw}}\right)^{1/2}} \tag{1.297}$$
where $l$ is the total sidewall perimeter length ($2W+2Y$). Assuming $\phi_0 = \phi_{0sw}$, the total depletion capacitance for a drain or source area is given by
$$C_d(V_r) = C_{bottom}+C_{sw} = \frac{C_{j0}WY+C_{jsw}\,l}{\left(1+\frac{V_r}{\phi_0}\right)^{1/2}}. \tag{1.298}$$
For drain regions $V_r = V_{DB}$ and for source regions $V_r = V_{SB}$ ⇒ the depletion capacitance depends on the actual voltages.
An average depletion capacitance may be defined by
$$C_{av} = \frac{1}{V_2-V_1}\int_{V_1}^{V_2}C_d(V_r)\,dV_r \tag{1.299}$$
$$= \frac{2\phi_0C_T}{(V_2-V_1)}\left[\left(1+\frac{V_2}{\phi_0}\right)^{1/2}-\left(1+\frac{V_1}{\phi_0}\right)^{1/2}\right] \tag{1.300}$$
where
$$C_T = C_{j0}WY+C_{jsw}\,l. \tag{1.301}$$
Defining a dimensionless voltage factor
$$K(V_1,V_2) = \frac{C_{av}}{C_T} = \frac{2\phi_0}{(V_2-V_1)}\left[\left(1+\frac{V_2}{\phi_0}\right)^{1/2}-\left(1+\frac{V_1}{\phi_0}\right)^{1/2}\right] < 1 \tag{1.302}$$
yields
$$C_{av} = K(V_1,V_2)\,C_T \tag{1.303}$$
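The voltage factor K of (1.302) is the closed-form average of the bias dependence in (1.298). A minimal sketch, with assumed values for the built-in potential and the voltage range and with hypothetical function names, compares the closed form against a direct midpoint-rule average.

```python
import math

def voltage_factor(v1, v2, phi0):
    """Dimensionless factor K(V1, V2) from equation (1.302)."""
    return (2.0 * phi0 / (v2 - v1)) * (
        math.sqrt(1.0 + v2 / phi0) - math.sqrt(1.0 + v1 / phi0))

def voltage_factor_numeric(v1, v2, phi0, steps=10000):
    """Midpoint-rule average of (1 + Vr/phi0)^(-1/2) over [V1, V2], cf. (1.299)."""
    dv = (v2 - v1) / steps
    total = 0.0
    for k in range(steps):
        vr = v1 + (k + 0.5) * dv
        total += dv / math.sqrt(1.0 + vr / phi0)
    return total / (v2 - v1)

# Assumed example: output swing 0.3 V to 5 V, built-in potential 0.7 V.
k_closed = voltage_factor(0.3, 5.0, 0.7)
k_numeric = voltage_factor_numeric(0.3, 5.0, 0.7)
```

The average capacitance then follows as K times the zero-bias total CT of (1.301).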
1.4.8 Inverter Output Capacitance
Figure 1.68: Approximation used for Cout in cascaded nMOS inverters
Figure 1.69: Simplified interconnect scheme for line capacitance
$$C_{out} = C_{GD1}+C_{GD2}+K(V_{OL},V_{OH})\left[C_{db1}+C_{sb2}\right]+C_{line}+C_{G3} \tag{1.304}$$
For computation of the line capacitance, transmission line theory should be used (parasitic capacitances and structures must be treated in a distributed manner). The problem can be reduced by a lumped-element approximation:
$$C_{line} \simeq C_{int}A_{line} \tag{1.305}$$
with
$$C_{int} = \frac{\varepsilon_{ox}}{x_{int}}\quad[\mathrm{F/cm^2}]. \tag{1.306}$$
$C_{int}$ is the capacitance per unit area formed between the line and the substrate; $x_{int}$ is the oxide thickness between line and substrate. The line resistance can be estimated in a similar manner by
$$R_{line} = nR_\square\quad[\Omega] \tag{1.307}$$
where $n = (d/w)$ is the number of squares (□) with area $w^2$ as seen in the direction of current flow. Fig. 1.70 gives an example for cascaded stages with a fanout of three:
Figure 1.70: Capacitance calculation for FO = 3
$$C \to C_{G3}+C_{G4}+C_{G5}+(\Delta C_{line}) \tag{1.308}$$
The output capacitance of CMOS inverters can be computed using similar techniques. In Fig. 1.71 two cascaded CMOS inverters are shown.
Figure 1.71: Approximation used for Cout in cascaded CMOS inverters
$$C_{out} \simeq C_{GDn}+C_{GDp}+K(V_{OL},V_{OH})(C_{dbp}+C_{dbn})+C_{line}+C_G \tag{1.309}$$
with $C_G$ the input capacitance of the next stage, which is given by
$$C_G = C_{Gn}+C_{Gp} \tag{1.310}$$
1.4.9 Scaled Inverter Performance
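Assume that device dimensions are scaled with $S > 1$, such that
$$\text{Length}' = \frac{\text{Length}}{S} \tag{1.311}$$
This length reduction applies to all geometries in the chip.
nMOS high-to-low time:
$$t_{HL} = \tau_D\left\{\frac{2V_{TD}}{V_{OH}-V_{TD}}+\ln\left[\frac{2(V_{OH}-V_{TD})}{V_{OL}}-1\right]\right\} \tag{1.312}$$
Scaling: also voltage reduction by $V' = (V/S)$. The term enclosed by curly brackets in the previous equation remains constant, but $\tau_D$ is modified:
$$\tau_D = \frac{C_{out}}{\beta_D(V_{OH}-V_{TD})} \tag{1.313}$$
$$(V_{OH}-V_{TD})' = \frac{(V_{OH}-V_{TD})}{S} \tag{1.314}$$
$$C'_{ox} = SC_{ox}\ \Rightarrow\ \beta'_D = S\beta_D \tag{1.315}$$
$C_{out}$ consists of oxide and depletion capacitances:
$$(C')_{oxide} = C'_{ox}(\text{Area})' = \frac{(C)_{oxide}}{S} \tag{1.316}$$
$$(C')_{junction} \simeq \frac{(C)_{junction}}{S}\quad\text{(approximation)} \tag{1.317}$$
$$\Rightarrow\ C'_{out} = \frac{C_{out}}{S} \tag{1.318}$$
$$\Rightarrow\ \tau'_D \simeq \frac{\tau_D}{S} \tag{1.319}$$
The maximum switching frequency is
$$f'_{max} = \frac{1}{t'_{HL}+t'_{LH}} \simeq Sf_{max} \tag{1.320}$$
If the voltage is kept constant (only lengths are scaled):
$$\tau'_D \simeq \frac{\tau_D}{S^2} \tag{1.321}$$
$$f'_{max} = S^2f_{max} \tag{1.322}$$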
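The two scaling results follow directly from substituting the scaled quantities into (1.313). The sketch below applies (1.314), (1.315) and (1.318) term by term; the function names and the numeric values are illustrative assumptions.

```python
def tau_d(c_out, beta_d, v_oh, v_td):
    """Driver time constant, equation (1.313)."""
    return c_out / (beta_d * (v_oh - v_td))

def scaled_tau_d(c_out, beta_d, v_oh, v_td, s, scale_voltage=True):
    """Time constant after scaling all lengths by 1/S (and optionally voltages by 1/S)."""
    c_scaled = c_out / s              # (1.318)
    beta_scaled = s * beta_d          # (1.315)
    overdrive = v_oh - v_td
    if scale_voltage:
        overdrive = overdrive / s     # (1.314)
    return c_scaled / (beta_scaled * overdrive)

# Assumed example: S = 2.
base = tau_d(100e-15, 50e-6, 5.0, 1.0)
full = scaled_tau_d(100e-15, 50e-6, 5.0, 1.0, 2.0, scale_voltage=True)
const_v = scaled_tau_d(100e-15, 50e-6, 5.0, 1.0, 2.0, scale_voltage=False)
```

With full scaling the time constant drops by S; at constant voltage it drops by S squared, matching (1.319) and (1.321).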
1.5 CMOS Technology
1.5.1 CMOS Process Flow
Figure 1.72: CMOS process flow
1.5.2 The Latch-Up Effect
Figure 1.73: Latch-up in n-tub CMOS inverter
−→ significant problem in CMOS circuits
If the base-emitter junction of the pnp transistor becomes forward biased, the transistor is switched on and I begins to flow, causing the npn transistor to be forward biased. The collector current of the npn transistor forces the pnp transistor to conduct more current. This feedback leads to latch-up, and the circuit will be destroyed by heat.
Figure 1.74: Guard rings for latch-up prevention
The circuit can be protected from latch-up by placing heavily doped guard rings around the MOSFETs. This reduces the effectiveness of the base and emitter regions in both transistors.
Chapter 2
Static CMOS Logic Design and Combinational Circuits
2.1 Overview: Combinational Logic
Several kinds of combinational logic:
• Random Logic: Circuit design using NAND gates, NOR gates and Inverters (often called“AOI Logic Gate Representation” = AND-OR-Inverter Logic)
Figure 2.1: Example for random logic: adder
• Complex MOS Logic: A Boolean function is realized by a pull-up network (realizes the product terms for logic ’1’) and a pull-down network (realizes logic ’0’). Product-term realization is done by parallel/serial combinations of MOS transistors whose inputs are controlled by the literals of the Boolean equation.
Figure 2.2: Complex gate logic primitive: CMOS inverter
• Passtransistor Logic: transistors are used as switches which are controlled by inputliterals.
Figure 2.3: MOS transistors viewed as switches
Figure 2.4: A complementary switch
• Logic Arrays: PLA (programmable logic arrays), gate-matrix layout, Weinberger Arraysand regular layout achieved by application of the Euler-Graph method
Figure 2.5: Example for regular design: gate-matrix layout
2.2 Complex nMOS Logic
2.2.1 nMOS NOR Gates
Figure 2.6: nMOS 2-input NOR gate
Figure 2.7: nMOS N-input NOR gate
2.2.2 nMOS NAND Gates
Figure 2.8: nMOS 2-input NAND gate
2.2.3 nMOS Complex Gates
Figure 2.9: Example of a complex nMOS circuit
Figure 2.10: Evolution of a nMOS XOR circuit
Figure 2.11: Direct NOT XOR complex gate implementation
2.3 Complex Static CMOS Logic
2.3.1 CMOS NAND and NOR Gates
Figure 2.12: CMOS NAND gate
Figure 2.13: CMOS NAND gate layout
Figure 2.14: CMOS NOR gate
2.3.2 Static CMOS Logic Design
Figure 2.15: General CMOS static logic gate
Static CMOS Complex Gate Logic Properties
• Build logic gates as shown in figure 2.15 where transistors are represented as switches
• The pMOS pull-up network replaces resistive or depletion loads used in nMOS technique
• Configure so that for each input combination:
– either a p-chain pulls the output up
– or an n-chain pulls the output down
⇒ pull-up and pull-down networks implement complementary functions; when one conducts, the other does not
• No quiescent current through the gate means zero or very low static power dissipation
• Active pull-up chains are faster than resistive loads
• Switching time is the same for both kinds of output changes
Figure 2.16: CMOS complex gate construction
Design Method
• nMOS devices pull the output to ’0’ when the gate inputs are ’1’
• pMOS devices pull the output to ’1’ when the gate inputs are ’0’
Consider a function to be realized: $F(A,B,C,\ldots)$
• The nMOS pull-down network must realize the pull-down function
$$F_{PD} = \overline{F}(A,B,C,\ldots)$$
• The pMOS pull-up network must realize the pull-up function
$$F_{PU} = F(A,B,C,\ldots)$$
The literals in $F_{PU}$ have to be inverted, because the p-channel transistors conduct if their gate input is ’0’ (low).
• Example: Realization of $F = \overline{A+B+C}$ (NOR)
$$F_{PD} = A+B+C$$
$$F_{PU} = \overline{A+B+C} = \overline{A}*\overline{B}*\overline{C}$$
(The Boolean expression is transformed by applying the Shannon inversion theorem, i.e. De Morgan’s law.)
⇒ Synthesis can use conventional logic design techniques (Boolean functions, Karnaugh maps, logic minimization) and express the results in AND/OR form for realization as series and parallel connections of devices
Rules for Logic Formation
Rule 1: nMOS transistors in series implement the AND operation
Rule 2: nMOS transistors in parallel implement the OR operation
Rule 3: Logic functions in series are ANDed together
Rule 4: Parallel nMOS branches OR the individual branch functions
First the nMOS logic transistors are structured according to the rules above. The output of the function is the complement of the nMOS logic. The pMOS transistor network then has to be structured according to the following rules:
Rule 5: Parallel connections of nMOS transistors have to be transformed to serial connections of pMOS transistors. The input literals applied to the pMOS transistors are identical to the gate inputs of the nMOS transistors (no inversion needed).
Rule 6: Serial connections of nMOS transistors have to be transformed to parallel connections of pMOS transistors. Input literals remain unchanged.
Rule 7: Parallel-connected logic blocks of the nMOS network −→ serial connection in the pMOS network
Rule 8: Serial-connected logic blocks of the nMOS network −→ parallel connection in the pMOS network
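Rules 5 to 8 amount to swapping every series connection for a parallel one and vice versa while keeping the gate inputs. The sketch below illustrates this on a pull-down network that conducts for A(BC + D). The nested-tuple encoding ("S" for series, "P" for parallel), the function names and the chosen example network are assumptions made for illustration, not notation from the notes.

```python
from itertools import product

def dual(network):
    """Swap series and parallel connections (rules 5-8); input names are kept."""
    if isinstance(network, str):
        return network
    kind, *children = network
    return ("P" if kind == "S" else "S", *[dual(child) for child in children])

def conducts(network, inputs, on_level):
    """True if the switch network conducts; a device is on when its input equals on_level."""
    if isinstance(network, str):
        return inputs[network] == on_level
    kind, *children = network
    states = [conducts(child, inputs, on_level) for child in children]
    return all(states) if kind == "S" else any(states)

def is_complementary(pull_down, pull_up, names):
    """Check that exactly one of the two networks conducts for every input combination."""
    for values in product([0, 1], repeat=len(names)):
        inputs = dict(zip(names, values))
        down = conducts(pull_down, inputs, on_level=1)   # nMOS: on for '1'
        up = conducts(pull_up, inputs, on_level=0)       # pMOS: on for '0'
        if down == up:
            return False
    return True

# nMOS network conducting when A(BC + D) = 1.
pull_down = ("S", "A", ("P", ("S", "B", "C"), "D"))
pull_up = dual(pull_down)
```

Because exactly one network conducts for each input combination, the output is always driven and no static path exists between the supply rails.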
Figure 2.17: Systematic function construction
Example: Basic AOI Logic Realization as Complex Gate
F = A(BC +D) (2.1)
Example: 4-to-1 Multiplexer
F = D0S0 S1 +D2S0S1 +D1S0S1 +D3S0S1 (2.2)
⇒ several kinds of complex gate realisations possible:
• hierarchical connection of three 2-input MUX complex gates
• full complex gate realization of one 4-input MUX
⇒ . . .
Example: AOI Logic Circuit
F = AB + (A+B)C (2.3)
Example: Combinational Adder
$$\mathrm{CARRY} = AB+AC+BC = AB+C(A+B) \tag{2.4}$$
$$\mathrm{SUM} = ABC+A\overline{B}\,\overline{C}+\overline{A}B\overline{C}+\overline{A}\,\overline{B}C$$
$$= ABC+\overline{(AB+BC+AC)}\,(A+B+C) = ABC+\overline{\mathrm{CARRY}}\,(A+B+C) \tag{2.5}$$
Figure 2.18: Combinational adder schematic
Figure 2.19: Combinational adder layout possibilities for one adder circuit
2.3.3 Pseudo nMOS Logic
Figure 2.20: Pseudo nMOS logic
• Substitute the pMOS network by one single pMOS load transistor
• Consists of a single pMOS load per gate (emulating the nMOS depletion load, withoutbody effect) and a nMOS pull-down network
• Needs ratioed devices
• Dissipates static power, when pull-down network is on
• Provides a method of emulating nMOS circuits in CMOS
• Reduced noise margin
2.4 Passtransistor and Transmission Gate Logic
Figure 2.21: Pass transistor logic model
Example: Pass Transistor NXOR Realisation
A    B    $\overline{A \oplus B}$    Pass Function
0    0    1    $\overline{A}+\overline{B}$
0    1    0    $A+\overline{B}$
1    0    0    $\overline{A}+B$
1    1    1    $A+B$
Figure 2.22: Pass transistor structure for NXOR function
2.4.1 Passtransistor Charging Characteristics
Figure 2.23: Pass transistor charging characteristics
$$V_{GS,P} = V_{DD}-V_{in} = V_{DS,P} \tag{2.6}$$
Since the pass transistor is always saturated, the charging current equation can be written as:
$$C_{in}\frac{dV_{in}}{dt} = \frac{\beta_P}{2}(V_{DD}-V_{in}-V_{TP})^2 \tag{2.7}$$
where
$$\beta_P = (\mu_nC_{ox})\left(\frac{W}{L}\right)_P \tag{2.8}$$
Ignoring the body bias effect, the solution of this differential equation is given by (initial condition: $V_{in}(0) = 0$):
$$V_{in}(t) = (V_{DD}-V_{TP})-\frac{(V_{DD}-V_{TP})}{\left[1+\frac{\beta_P\,t}{2C_{in}}(V_{DD}-V_{TP})\right]}. \tag{2.9}$$
With
$$\tau_{ch} \equiv \frac{2C_{in}}{\beta_P(V_{DD}-V_{TP})} \tag{2.10}$$
this solution may be written as
$$V_{in}(t) = (V_{DD}-V_{TP})\left[\frac{(t/\tau_{ch})}{1+(t/\tau_{ch})}\right]. \tag{2.11}$$
The maximum load voltage is given by
$$V_{in}(t\to\infty) = (V_{DD}-V_{TP}) = V_{max} \tag{2.12}$$
or, taking into account the body bias with
$$V_{TP}(V_{in}) = V_{T0P}+\gamma\left(\sqrt{2|\phi_F|+V_{in}}-\sqrt{2|\phi_F|}\right) \tag{2.13}$$
for the maximum voltage follows:
$$V_{max} = V_{DD}-V_{TP}(V_{max}) = (V_{DD}-V_{T0P})-\gamma\left(\sqrt{2|\phi_F|+V_{max}}-\sqrt{2|\phi_F|}\right). \tag{2.14}$$
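Equation (2.14) defines Vmax implicitly, because the threshold voltage itself depends on Vmax. One way to solve it is a simple bisection on V + VTP(V) = VDD, which works because the left side increases monotonically with V. The sketch below is a hedged illustration: the function names and the parameter values (γ = 0.4 √V, |φF| = 0.3 V, VT0P = 1 V) are assumptions, not data from the notes.

```python
import math

def threshold_with_body_bias(v_in, v_t0, gamma, phi_f_abs):
    """Pass-transistor threshold including body bias, equation (2.13)."""
    return v_t0 + gamma * (math.sqrt(2.0 * phi_f_abs + v_in)
                           - math.sqrt(2.0 * phi_f_abs))

def max_pass_voltage(v_dd, v_t0, gamma, phi_f_abs, tol=1e-12):
    """Solve V_max = V_DD - V_TP(V_max), equation (2.14), by bisection on [0, V_DD]."""
    lo, hi = 0.0, v_dd
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + threshold_with_body_bias(mid, v_t0, gamma, phi_f_abs) < v_dd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed example values.
v_max = max_pass_voltage(5.0, 1.0, 0.4, 0.3)
```

The body effect raises the threshold at the output node, so the result lies below the simple estimate VDD − VT0P.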
Consequences of the Passtransistor Charging Characteristics for the Design of Passtransistor Networks
1. Cascaded passtransistor chain: $V_{chain,out} = V_{max} = (V_{DD}-V_{TP})$ ⇒ $V_{max}$ is propagated through the passtransistor chain
2. Pass transistor driving another pass transistor: $V_{1,max} = (V_{DD}-V_{TP1})$ and $V_{2,max} = (V_{1,max}-V_{TP2})$ ⇒ reduction of $V_{max}$!
2.4.2 Passtransistor Discharging Characteristics
$$V_{GS,P} = V_{DD}-V_P = V_{DD} \tag{2.15}$$
Since the pass transistor is always nonsaturated, the discharge current differential equation can be written as:
$$-C_{in}\frac{dV_{in}}{dt} = \frac{\beta_P}{2}\left[2(V_{DD}-V_{TP})V_{in}-V_{in}^2\right] \tag{2.16}$$
Figure 2.24: Pass transistor discharge characteristics
Ignoring the body bias effect, the solution of this differential equation is given by:
$$V_{in}(t) = (V_{DD}-V_{TP})\left(\frac{2e^{-t/\tau_{dis}}}{1+e^{-t/\tau_{dis}}}\right). \tag{2.17}$$
where
$$\tau_{dis} \equiv \frac{C_{in}}{\beta_P(V_{DD}-V_{TP})} \tag{2.18}$$
$$\tau_{dis} \simeq \frac{1}{2}\tau_{ch} \tag{2.19}$$
⇒ Discharging is much faster than charging.
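The difference between the two transients is larger than the factor of two in the time constants suggests, because charging follows a slow algebraic law (2.11) while discharging follows an exponential law (2.17). The sketch below inverts both expressions to compare the time to charge to 90% of Vmax with the time to discharge to 10% of Vmax. The function names and the time constant are illustrative assumptions.

```python
import math

def v_charge(t, v_max, tau_ch):
    """Charging transient, equation (2.11)."""
    x = t / tau_ch
    return v_max * x / (1.0 + x)

def v_discharge(t, v_max, tau_dis):
    """Discharging transient, equation (2.17), starting from V_max."""
    e = math.exp(-t / tau_dis)
    return v_max * 2.0 * e / (1.0 + e)

def time_to_charge(fraction, tau_ch):
    """Time at which (2.11) reaches fraction * V_max."""
    return tau_ch * fraction / (1.0 - fraction)

def time_to_discharge(fraction, tau_dis):
    """Time at which (2.17) has fallen to fraction * V_max."""
    return -tau_dis * math.log(fraction / (2.0 - fraction))

# Assumed example: tau_dis = 1 ns, so tau_ch = 2 ns by (2.19).
tau_dis = 1e-9
tau_ch = 2.0 * tau_dis
t_up = time_to_charge(0.9, tau_ch)
t_down = time_to_discharge(0.1, tau_dis)
```

Reaching 90% while charging takes nine charging time constants, whereas falling to 10% takes only ln 19 discharge time constants.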
Figure 2.25: nMOS pass characteristics
2.4.3 CMOS Transmission Gates
Figure 2.26: CMOS transmission gate symbols
Figure 2.27: CMOS transmission gate realisation
Figure 2.28: pMOS pass transistor
pMOS Transmission Characteristics
It is not possible to discharge the capacitor to 0 V because
$$V_{out}(t \to \infty) = |V_{TP}| = V_{min} \tag{2.20}$$
Transmission Gate Model
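Logic Level    nMOS           pMOS     CMOS
Logic 0        0              |VTp|    0
Logic 1        (VDD − VTn)    VDD      VDD

$$I_{Dn} + I_{Dp} = C_{out}\frac{dV_{out}}{dt} \tag{2.21}$$
Logic 1 transfer:
$$V_{out}(t) = V_{DD}\left[1 - e^{-(t/\tau_{TG})}\right] \tag{2.22}$$
with
$$\tau_{TG} = R_{TG}C_{out} \tag{2.23}$$
Logic 0 transfer:
$$V_{out}(t) = V_{DD}\,e^{-(t/\tau_{TG})} \tag{2.24}$$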
Figure 2.29: pMOS pass characteristics
Figure 2.30: CMOS transmission gate
Equivalent Resistance
$$R_{TG} = \frac{V_{TG}}{I_{Dn} + I_{Dp}} \tag{2.25}$$
$$R_n = \frac{1}{\beta_n(V_{DD} - V_{Tn})} \tag{2.26}$$
$$R_p = \frac{1}{\beta_p(V_{DD} - |V_{Tp}|)} \tag{2.27}$$
Figure 2.31: MOSFET operational states
Figure 2.32: Transmission gate: resistor switch model
Figure 2.33: Transmission gate: RC switch logic transfer
Figure 2.34: Transmission gate: equivalent resistances
Figure 2.35: Transmission gate: basic layout
TG-Based Logic Gates
S = 1 : B ← A (2.28)
Figure 2.36: Transmission gate logic
Path Selector
$$F = A\,S + B\,\overline{S} \tag{2.29}$$
S = 1 : F = A
S = 0 : F = B
Figure 2.37: TG-logic: 2-input path selector
OR Gate
$$F = A + \overline{A}\,B = A + B \tag{2.30}$$
Figure 2.38: TG-logic: OR gate
XOR and Equivalence
$$F_1 = A \oplus B = A\,\overline{B} + \overline{A}\,B \tag{2.31}$$
$$F_2 = \overline{A \oplus B} = A\,B + \overline{A}\,\overline{B} \tag{2.32}$$
Figure 2.39: TG-logic: XOR and equivalence
Figure 2.40: TG-logic: alternate equivalence logic circuit
Adders
$$S_0 = A_0 \oplus B_0 \tag{2.33}$$
$$C_0 = A_0 B_0 \tag{2.34}$$
Figure 2.41: Half adder logic symbol
Figure 2.42: TG-logic: Half adder
Full adder equations:
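$$S_n = (A_n \oplus B_n)\,\overline{C_{n-1}} + \overline{(A_n \oplus B_n)}\,C_{n-1} \tag{2.35}$$
$$C_n = (A_n \oplus B_n)\,C_{n-1} + \overline{(A_n \oplus B_n)}\,A_n \tag{2.36}$$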
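The full adder equations can be checked exhaustively against ordinary integer addition. The following Python sketch assumes the usual reading of Eqs. (2.35) and (2.36), in which the second product term of each equation uses the complement of (An ⊕ Bn) and the first term of the sum uses the complement of Cn−1; the function names are illustrative.

```python
from itertools import product

def full_adder(a, b, c_prev):
    """Evaluate Eqs. (2.35) and (2.36) on 0/1 inputs; returns (sum, carry)."""
    p = a ^ b                                        # A_n XOR B_n
    s = (p & (1 - c_prev)) | ((1 - p) & c_prev)      # Eq. (2.35)
    c = (p & c_prev) | ((1 - p) & a)                 # Eq. (2.36)
    return s, c

def matches_integer_addition():
    """Compare all eight input combinations with a + b + c_prev."""
    for a, b, c_prev in product((0, 1), repeat=3):
        s, c = full_adder(a, b, c_prev)
        if 2 * c + s != a + b + c_prev:
            return False
    return True
```

The carry equation selects Cn−1 when the inputs differ and An when they are equal, which is exactly the behaviour of a 2-input path selector controlled by An ⊕ Bn.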
Figure 2.43: TG-logic: Full adder
Array Logic
Multiplexers/Demultiplexers
Figure 2.44: Multiplex/Demultiplex operations
4-to-1 Multiplexer:
F = D0(AB) +D1(AB) +D2(AB) +D3(AB) (2.37)
⇒ Multiplexers can be used as function generators
Figure 2.45: TG-logic: 4-to-1 multiplexer
(example: for D0=1, D1=0, D2=0, D3=0 an AND function is realized)
Split Arrays
⇒ improvement of the layout efficiency by separating pMOS and nMOS transistors into two distinct areas (physical separation)
Figure 2.46: TG-logic: Split-Array MUX
Pass Transistor Logic with pMOS Pull-Up
To reduce device count and area, an nMOS version with pMOS pull-up can also be useful (→ a kind of pseudo nMOS).
Figure 2.47: Pass transistor logic with pMOS pull-up
Chapter 3
Synchronous MOS Logic
3.1 Clocking
Clock Signal:
• used to synchronize data flow through a digital network ⇒ clocked static or dynamic circuits
• problems: clock skew (delay caused by clock distribution wires)
Figure 3.1: Ideal nonoverlapping 2-phase clocks
Condition for nonoverlapping clock signals φ1(t) and φ2(t):
φ1(t)φ2(t) = 0 ∀t (3.1)
Figure 3.2: Basic 2-phase clocking
3.1.1 Single and Multiple Clock Signals
Figure 3.3: Single clock 2-phase timing
⇒ For nonoverlapping clock phases $\phi$ and $\overline{\phi}$, fine-tuned and well-designed delay lines (realized as transmission gates) have to be inserted in order to avoid overlapping of $\phi$ and $\overline{\phi}$.
Figure 3.4: Generation of inverted clock phase
Figure 3.5: TG delay circuit
Figure 3.6: Pseudo 2-φ clocking
3.2 Clocked Static Logic
⇒ Synchronized data transfer
Figure 3.7: Shift register
Upper Frequency Limitation: Charging and Discharging Times
Figure 3.8: Clocked shift register circuit
Time constant for charging and discharging:
$$\tau_{TG} = R_{TG}C_L \tag{3.2}$$
where
$$C_L = C_{TG} + C_{in} + C_{line} \tag{3.3}$$
$V_A = V_{DD}$ (with $V_{in}(0) = 0$):
$$V_{in}(t) \simeq V_{DD}\left[1 - e^{-t/\tau_{TG}}\right] \tag{3.4}$$
The inverter switches when $V_{in} = V_{IH}$, which occurs after
$$t_1 \simeq -\tau_{TG}\ln\left[1 - \frac{V_{IH}}{V_{DD}}\right] \tag{3.5}$$
$$C_{in} = C_{ox}\left[(WL)_n + (WL)_p\right] \tag{3.6}$$
$V_A = 0$ (with $V_{in}(0) = V_{DD}$):
$$V_{in}(t) \simeq V_{DD}\,e^{-t/\tau_{TG}} \tag{3.7}$$
The time until $V_{in}$ reaches $V_{IL}$ is given by
$$t_0 = -\tau_{TG}\ln\left(\frac{V_{IL}}{V_{DD}}\right) \tag{3.8}$$
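The two switching times of Eqs. (3.5) and (3.8) are easy to evaluate numerically. The sketch below is a minimal illustration; the function name and any component values passed to it are assumptions and would have to come from the actual process and layout.

```python
import math

def tg_switch_times(r_tg, c_l, vdd, v_ih, v_il):
    """Return (tau_TG, t1, t0) for a transmission gate driving an inverter input."""
    tau = r_tg * c_l                              # Eq. (3.2)
    t1 = -tau * math.log(1.0 - v_ih / vdd)        # Eq. (3.5): rise from 0 to V_IH
    t0 = -tau * math.log(v_il / vdd)              # Eq. (3.8): fall from V_DD to V_IL
    return tau, t1, t0
```

For VIH = VIL = VDD/2 both times equal τTG·ln 2, so the upper clock frequency is set by a small multiple of the RC time constant.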
Lower Frequency Limitation: Charge Leakage
Figure 3.9: Leakage path in a CMOS TG
The load capacitance seen by the transmission gate (TG) is
$$C_L = C_{TG} + C_{line} + C_{in} \tag{3.9}$$
The depletion capacitance contributions to $C_L$ are due to the reverse-biased pn junctions in the MOS transistors. As shown in fig. 3.9, a leakage current flows across the reverse-biased pn junctions. The influence of this leakage current on the charge stored in $C_L$ depends on the values of $I_{Lp}$ and $I_{Ln}$. With
$$I_L = I_{Ln} - I_{Lp} \tag{3.10}$$
the leakage current influence on $V_{in}$ is given by
$$C_L\frac{dV_{in}}{dt} = -I_L \tag{3.11}$$
If $I_{Lp} > I_{Ln}$ the capacitance is charged by $I_L$; otherwise it is discharged, or remains constant when the ideal condition $I_{Lp} = I_{Ln}$ holds.
$$\frac{dQ_{store}}{dt} = I_{Lp} - I_{Ln} \tag{3.12}$$
$$C_{store} = \frac{dQ_{store}}{dV} \tag{3.13}$$
Assuming that the leakage currents $I_{Lp}$ and $I_{Ln}$ are constant and that the node charge-voltage relation is linear, of the form
$$Q_{store} = C_{store}V \tag{3.14}$$
Figure 3.10: Charge leakage problem in CMOS TG
follows (because Cstore is const.)
$$C_{store}\frac{dV}{dt} = I_{Lp} - I_{Ln}. \tag{3.15}$$
The solution of this equation is
$$V(t) = \frac{(I_{Lp} - I_{Ln})}{C_{store}}\,t + V(0) \tag{3.16}$$
If $\Delta V$ is the maximum allowed voltage change:
$$t_{max} = \frac{C_{store}\,\Delta V}{I_L} \tag{3.17}$$
Figure 3.11: Charge leakage circuit
With $T_{max} = 2t_{max}$ (the longest allowed clock period), the minimum frequency follows as
$$f_{min} \simeq \frac{1}{2t_{max}} \simeq \frac{I_L}{2C_{store}\,\Delta V} \tag{3.18}$$
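A short Python sketch of Eqs. (3.17) and (3.18) follows. Using the magnitude of the net leakage current and returning an infinite hold time when the two leakage currents cancel are assumptions made here for robustness; the function name is illustrative.

```python
import math

def leakage_limits(c_store, delta_v, i_lp, i_ln):
    """Return (t_max, f_min) for constant leakage currents i_lp and i_ln."""
    i_l = abs(i_ln - i_lp)                 # magnitude of the net leakage, Eq. (3.10)
    if i_l == 0.0:
        return math.inf, 0.0               # ideal case: the stored charge never drifts
    t_max = c_store * delta_v / i_l        # Eq. (3.17)
    f_min = 1.0 / (2.0 * t_max)            # Eq. (3.18)
    return t_max, f_min
```

For example, a 50 fF storage node with a net leakage of 1 pA and an allowed drift of 1 V holds its level for 50 ms, which corresponds to a minimum clock frequency of 10 Hz.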
The transmission gate capacitance is
Figure 3.12: Transmission gate capacitance
CT ' CG + Cline + Cols + Cold + CSBp(V ) + CDBn(V ) . (3.19)
So the storage capacitance can be estimated by voltage averaging of this expression:
Cstore ' CG + Cline + Cols + Cold +K(0, VDD)[CSBp + CDBn] (3.20)
For a realistic analysis of the charge leakage problem, the dependence of the leakage currents on the reverse bias voltage has to be taken into consideration (see [25]).
3.3 Charge Sharing
Figure 3.13: Basic charge sharing circuit
t < 0 (TG switched off):
$$V_1(t < 0) = V_{DD} \tag{3.21}$$
$$V_2(t < 0) = 0 \tag{3.22}$$
$$Q_T = C_1 V_{DD} \tag{3.23}$$
t > 0 (TG switched on):
$$Q_T = (C_1 + C_2)V_f \tag{3.24}$$
$$V_f = V_1(t > 0) = V_2(t > 0) = \frac{C_1}{C_1 + C_2}V_{DD} = \frac{1}{1 + (C_2/C_1)}V_{DD} \tag{3.25}$$
Charge sharing among N TG-connected capacitors
Initial charge:
$$Q_T = \sum_{i=1}^{N} C_i V_i(0) \tag{3.26}$$
After connecting nodes:
$$Q_T = \left(\sum_{i=1}^{N} C_i\right)V_f \tag{3.27}$$
Final voltage:
$$V_f = \frac{\sum_{i=1}^{N} C_i V_i(0)}{\sum_{i=1}^{N} C_i} \tag{3.28}$$
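Equation (3.28) is a capacitance-weighted average of the initial node voltages. A minimal Python sketch, with an illustrative function name and made-up example values:

```python
def final_voltage(capacitances, initial_voltages):
    """Eq. (3.28): common node voltage after all capacitors are connected."""
    if len(capacitances) != len(initial_voltages):
        raise ValueError("need one initial voltage per capacitor")
    total_charge = sum(c * v for c, v in zip(capacitances, initial_voltages))  # Eq. (3.26)
    return total_charge / sum(capacitances)
```

With two equal capacitors, one charged to 5 V and one empty, the result is 2.5 V, which is the two-capacitor case of Eq. (3.25) with C1 = C2.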
Figure 3.14: Transient voltage behaviour
3.4 Dynamic Logic
• Pull-up (pull-down) network of static CMOS is replaced by a single precharge (discharge) transistor. The remaining network then conditionally discharges (charges up) the output in a second operation phase
• One logic level is held by dynamic charge storage
• Transistor count is reduced from 2n (static CMOS) to n+2 for dynamic precharged CMOS (but now: 2 phases of operation)
3.4.1 Dynamic nMOS Inverter
Figure 3.15: Basic dynamic nMOS inverter
Precharge Phase
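If $V_{in} = 0$ then
$$\tau_{ch} = \frac{C_{out}}{\beta_p(V_{DD} - |V_{Tp}|)} = R_pC_{out} \tag{3.29}$$
Worst case ($V_{in} = V_{DD}$):
$$\tau_{ch,max} = R_p(C_{out} + C_n) \tag{3.30}$$
$$t_{ch,max} = \tau_{ch,max}\left[\frac{2|V_{Tp}|}{(V_{DD} - |V_{Tp}|)} + \ln\left(\frac{2(V_{DD} - |V_{Tp}|)}{V_0} - 1\right)\right] \tag{3.31}$$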
Figure 3.16: Dynamic nMOS inverter: precharge and evaluate
Evaluation Phase
For the case that M1 is switched on and M1 and Mn are designed with identical channel width W, the discharge time constant is given by
$$\tau_{dis} = \frac{(L_1 + L_n)C_{out}}{k'_nW(V_{DD} - V_{Tn})} \tag{3.32}$$
Figure 3.17: Precharge network for worst case
Figure 3.18: Evaluation discharge network
$$t_{dis} = \tau_{dis}\left[\frac{2V_{Tn}}{(V_{DD} - V_{Tn})} + \ln\left(\frac{2(V_{DD} - V_{Tn})}{V_0} - 1\right)\right] \tag{3.33}$$
Maximum Clock Frequency
$$t_M = \max(t_{ch,max}, t_{dis}) \tag{3.34}$$
$$f_{max} \simeq \frac{1}{2t_M} \tag{3.35}$$
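Equations (3.31) and (3.33) share the same bracketed form, so one helper can serve both. The Python sketch below is a hedged illustration: the function names are hypothetical, and V0 is simply passed through as the reference voltage level appearing in the formulas, since this section does not define it further.

```python
import math

def switch_time(tau, vdd, vt, v0):
    """Bracketed delay expression of Eqs. (3.31) and (3.33); vt is |V_Tp| or V_Tn."""
    return tau * (2.0 * vt / (vdd - vt) + math.log(2.0 * (vdd - vt) / v0 - 1.0))

def f_max(t_ch_max, t_dis):
    """Eqs. (3.34) and (3.35): the slower of precharge and evaluation limits the clock."""
    t_m = max(t_ch_max, t_dis)
    return 1.0 / (2.0 * t_m)
```

Calling `switch_time` once with the pMOS values and τch,max and once with the nMOS values and τdis gives the two arguments of `f_max`.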
3.4.2 Dynamic pMOS Inverter
Figure 3.19: Basic dynamic pMOS inverter
3.4.3 Dynamic CMOS Properties and Conditions
• single phase clock
• input should change during precharge only
• input must be stable at the end of the precharge phase
• in the evaluation phase the output remains HIGH (LOW) or is optionally discharged (charged)
3.4.4 Complex Logic
Figure 3.20: Complex dynamic logic
3.4.5 Dynamic Cascades
pMOS blocks and nMOS blocks have to be alternated in order to avoid glitches
Figure 3.21: Cascaded nMOS-nMOS glitch problem
Figure 3.22: Dynamic cascades
3.5 Domino CMOS Logic
Figure 3.23: Basic domino logic circuit
• Domino Logic: design method for glitch-free cascading of nMOS logic blocks
• Each stage is driven by φ
– Precharge during φ = 0
– Evaluation when φ = 1
• Domino logic blocks consist of a precharge/evaluation block and an output inverter
Precharge Phase: The gate output is precharged to logic 1 and the inverter output goes to logic 0. Logic transmission errors are avoided by providing a logic 0 at the inverter output (avoiding discharge of the next logic stage).
Evaluation Phase: Depending on the actual input values, the inverter output stays at logic 0 or is set to logic 1. The correct result signal is provided at the end of the domino cascade after stabilization of all stages.
Figure 3.24: Domino AND gate
Figure 3.25: Cascaded domino logic
Figure 3.26: Visualization of domino effect
Figure 3.27: Domino timing (signals A, B, C, D and clock $\phi_N$)
Figure 3.28: Cascaded domino circuit with fanout = 2
3.5.1 Domino Logic Properties
Figure 3.29: Cascaded domino logic (n-channel only and p-channel only blocks, clocked by clk and $\overline{\rm clk}$)
• Domino logic consists of either n-type or p-type blocks
• small load capacity to be driven by logic (one inverter only) ⇒ small transistor dimensions
• only one clock signal required
• only positive logic realizations possible because of the input inverters ⇒ domino logic is noninverting. Functions such as
$$F_1 = A \oplus B = A\,\overline{B} + \overline{A}\,B$$
$$F_2 = \overline{A \oplus B} = A\,B + \overline{A}\,\overline{B} \tag{3.36}$$
cannot be directly realized in a domino chain
3.5.2 Analysis
Figure 3.30: Domino AND4 gate
Precharge
Assuming that all $A_i$ (coming from previous stages) are zero, the capacitance $C_X$ is charged, where
$$C_X = C_0 + C_T \tag{3.37}$$
$$C_X \simeq (C_{GDn1} + C_{BDn1}) + (C_{GDp1} + C_{BDp1}) + C_G + C_{line} \tag{3.38}$$
Evaluate
If all inputs Ai are set to logic 1, the worst case delay time can be estimated by
$$t_D \simeq R_nC_n + (R_n + R_3)C_3 + (R_n + R_3 + R_2)C_2 + (R_n + R_3 + R_2 + R_1)C_1 + (R_n + R_3 + R_2 + R_1 + R_0)C_X \tag{3.39}$$
with
$$R_j = \frac{1}{k'_n(W/L)_j(V_{DD} - V_{Tn})} \tag{3.40}$$
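Equation (3.39) has a regular structure: each node capacitance is multiplied by the total resistance between that node and ground. The following Python sketch computes this sum for a chain of any length; the function name and the list ordering convention (starting at the transistor closest to ground) are assumptions made for illustration.

```python
def chain_delay(resistances, capacitances):
    """Eq. (3.39): sum of (resistance accumulated from ground) * (node capacitance).

    Both lists are ordered from the grounded end of the chain towards the
    output node, e.g. [Rn, R3, R2, R1, R0] and [Cn, C3, C2, C1, CX].
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one node capacitance per series resistance")
    accumulated_r = 0.0
    delay = 0.0
    for r, c in zip(resistances, capacitances):
        accumulated_r += r
        delay += accumulated_r * c
    return delay
```

Because the accumulated resistance grows with every series transistor, the delay rises roughly quadratically with the number of stacked devices, which limits the practical fan-in of a domino gate.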
3.5.3 Charge Leakage and Charge Sharing
Figure 3.31: Domino stage with pull-up MOSFET
Figure 3.32: Charge sharing in a domino chain
Figure 3.33: Use of feedback to control a pull-up MOSFET for charge sharing problem
3.6 NORA Logic
(NORA = NO RAce)
3.6.1 NORA Properties
• NORA is very insensitive to clock delay
• one clock signal and the inverted clock signal, both with short rise and fall times, are sufficient
• no inverter is needed between the logic stages because of the alternate use of n-type and p-type blocks
• the last stage is a clocked inverter, a C²MOS latch
3.6.2 The Signal Race Problem
Figure 3.34: Signal race problem
From fig. 3.34 the signal race problem can be seen: A signal race can arise when both transmission gates conduct at the same time. If the new input from TG1 reaches the input of TG2 while TG2 is still transmitting the output, the output information will be lost. Imperfect TG synchronization occurs because of normal transition intervals or clock skew.
Figure 3.35: Clock skew
3.6.3 NORA Structuring
Figure 3.36: NORA structuring (signals in and out; clocks clk1, clk2 and their inverses $\overline{\rm clk1}$, $\overline{\rm clk2}$)
Figure 3.37: NORA $\phi$ and $\overline{\phi}$ sections
Figure 3.38: C²MOS latch
Figure 3.39: NORA pipelined logic
3.7 Memory Structures
3.7.1 Principle of CMOS Information Storage
Figure 3.40: Connection of components for a simple CMOS flip-flop
Behaviour:
LD = 1 : $Q \leftarrow D$, $\overline{Q} \leftarrow \overline{D}$
LD = 0 : store current state
Figure 3.41: Physical Construction of a CMOS flip-flop
3.7.2 Dynamic Flip-Flops: Pseudo 2-Phase Clocking
Figure 3.42: Pseudo 2-phase clocking (a) waveforms and simple latch, (b) clock skew, and (c) slow clock edges
3.7.3 Pseudo 2-Phase Memory Structures
Figure 3.43: Pseudo 2-phase latches (! charge redistribution problem in (b))
Figure 3.44: Pseudo 2-phase latch layouts
Figure 3.45: Shift register array layout
3.7.4 Dynamic Flip-Flop with reduced Transistor Count and Clock Connection
(Reduced Noise Margins – Poor “1” in the Slave)
Figure 3.46: Reduced transistor count latch
Better with a high impedance sustainer transistor (accurate simulation is required for correct function):
Figure 3.47: Reduced transistor count latch with high impedance sustainer transistor (data D, output Q, clocks $\phi_1$ and $\phi_2$)
3.7.5 Dynamic D-Latches
Figure 3.48: Dynamic D-Latches
Characteristic Equation:
Q(t) = D(t)      and LD = 1
     = Q(t − 1)  and LD = 0
where
D(t) is the state of the data at time t
Q(t) is the state of the latch at time t
Q(t − 1) is the state of the latch at time t − 1
3.7.6 Pseudo 2-Phase Logic Structures
Figure 3.49: Pseudo 2-phase dynamic logic
3.7.7 Pseudo 2-Phase Logic Structures: Domino Logic
a number of logic stages may be cascaded before latching the result
Figure 3.50: Pseudo 2-phase domino logic
3.7.8 2-Phase Memory Structures: Skew Reduction
Figure 3.51: 2-phase flip-flop and skew reduction
3.7.9 2-Phase Memory Structures: Chain Latch
Figure 3.52: Chain latch
3.7.10 2-Phase Memory Structures: Static Flip-Flops
Figure 3.53: 2-phase static flip-flops
3.7.11 2-Phase Memory Structures: Static D Flip-Flops
Figure 3.54: 2-phase static D flip-flops
Figure 3.55: 2-phase static D flip-flops (continued)
2-Phase D Flip-Flops layouts of Fig 3.53a, 3.54a and 3.54b
Figure 3.56: 2-phase D flip-flops layouts
3.7.12 Static D Flip-Flop with Set and Reset
Figure 3.57: Static D flip-flop with set and reset
INPUTS              OUTPUT
CL   D    R    S    Q
X    X    1    0    0
X    X    0    1    1
X    X    1    1    NA
Table 3.1: Static D flip-flop set/reset truth table
Chapter 4
Performance
4.1 Signaldelay
4.1.1 Resistance Estimation
The resistance of a uniform slab of conducting material may be expressed as
$$R = \left(\frac{\rho}{t}\right)\left(\frac{l}{w}\right)$$
where
ρ = resistivity
t = thickness
l = conductor length
w = conductor width
This expression may be rewritten as $R = R_s\left(\frac{l}{w}\right)$, where $R_s$ is the sheet resistance having units of Ω/□ (ohms per square). Thus, to obtain the resistance of a layer, one simply multiplies the sheet resistance $R_s$ by the ratio of the length to the width of the conductor. Note that for metal having a given thickness t, the resistivity is known, while for poly and diffusion the resistivities are significantly influenced by the concentration of the impurities that have been introduced into the conducting regions during implantation. This means that the process parameters have to be known to accurately estimate these quantities.
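In practice the sheet resistance form amounts to counting squares along the wire. A minimal Python sketch, with an illustrative function name and a made-up sheet resistance value in the usage note:

```python
def wire_resistance(r_sheet, length, width):
    """R = Rs * (l / w); length and width must use the same unit."""
    if width <= 0:
        raise ValueError("width must be positive")
    return r_sheet * (length / width)
```

For example, a 100 µm long and 2 µm wide line corresponds to 50 squares, so an assumed sheet resistance of 30 Ω/□ gives 1500 Ω.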
Although the voltage-current characteristic of a MOS transistor is generally nonlinear, it is sometimes useful to approximate its behavior in terms of a channel resistance to estimate performance. The channel resistance may be expressed by
$$R_c = k\left(\frac{L}{W}\right)$$
with
$$k = \left[\mu\left(\frac{\varepsilon_0\varepsilon_r}{t_{ox}}\right)(V_{gs} - V_t)\right]^{-1}$$
Figure 4.1: Basic LOCOS MOSFET structure.
For both the n-channel and p-channel devices, k may take a value within the range 50,000 to 30,000 Ω/□. The equation for k as given above demonstrates the dependence of channel resistance on the surface mobility µ of the majority carriers. Since the mobility is also a function of temperature, the channel resistance and therefore switching time parameters, as well as power dissipation, change with temperature variations. The increase in channel resistance may be approximated by +0.25% per °C for an increase in temperature above 25 °C.
4.1.2 Capacitance Estimation
The dynamic response of MOS systems is very much dependent on the parasitic capacitances associated with the MOS device and the interconnection capacitances that are formed by metal, poly, and diffusion wires in concert with transistor and conductor resistances. The total load capacitance on the output of a MOS gate is the sum of:
• gate capacitance (of other inputs connected to the output of the gate)
• diffusion capacitance (of the drain regions connected to the output)
• routing capacitance (of connections between the output and other inputs).
Gate Capacitances
The large-signal MOSFET capacitance model that will be used to compute Cgate is based on the self-aligned, poly gate LOCOS (local oxidation of silicon) structure depicted in Fig. 4.1.
Figure 4.2: MOSFET capacitor model: (a) basic model showing Cgs, Cgd, Cgb, Csb, Cdb, Cols and Cold in the device cross section (oxide thickness xox), (b) top view geometry with W, L, Ls, Ld, Ys and Yd.
Although the LOCOS MOSFET has been singled out for the analysis, the model developed here is generally applicable to any MOSFET regardless of the technology base. Figure 4.2a shows the basic lumped-element capacitances and their physical origins in terms of the device cross section. This particular model is chosen because it allows the capacitors to be divided into contributions that may be computed directly from the device and processing parameters.
1. The overlap capacitances Cols and Cold are parasitic elements that originate from the basic fabrication steps. In the self-aligned process, the polysilicon gate is employed as a mask to define the n+ drain and source regions. Directly after this step, Ls = Ld = 0 and L′ = L. The overlaps occur because the remaining steps require heating of the wafer. This gives rise to lateral diffusion of the n+ dopants. Typically, these overlap distances are less than a few tenths of a micron.
$$C_{ols} = C_{ox}WL_s, \qquad C_{old} = C_{ox}WL_d$$
where $C_{ox} = \varepsilon_{ox}/t_{ox}$
2. The gate-source capacitance Cgs is really the gate-to-channel capacitance as seen between the gate and source; similarly, Cgd represents the gate-drain capacitance when the channel is acting as a conductor to the drain n+ region. The voltage-dependent nature of the channel implies that these elements are nonlinear. Cgb is the gate-bulk capacitance and consists of the gate capacitance in series with the depletion capacitance established by the p-type space charge region. Table 4.1 shows approximate values of these three capacitances in various states of the MOS transistor.

Capacitance               Off        Linear           Saturation
Cgb                       εA/tox     0                0
Cgs                       0          (1/2) εA/tox     (2/3) εA/tox
Cgd                       0          (1/2) εA/tox     0
Cg = Cgb + Cgs + Cgd      εA/tox     εA/tox           (2/3) εA/tox
Table 4.1: Approximation of intrinsic MOS gate capacitance
Diffusion Capacitances
The two remaining capacitors in the model of Fig. 4.2a are Csb and Cdb. These represent the voltage-dependent depletion capacitances that result from the pn junctions at the drain and source regions. The problem of determining these elements is aided by using the expanded drawing in Fig. 4.3. This shows an n+ well in a p-type bulk region and is representative of either a drain or a source; note that a p+ region surrounds the n+ sidewalls. The actual doping profile around the pn junction is generally quite complicated. A step doping will be assumed for simplicity.
The total depletion capacitance $C_d$ can be represented by
$$C_d = C_{ja}\cdot(W\cdot Y_d) + C_{jp}\cdot(2W + 2Y_d)$$
where
Cja = junction capacitance per µm²
Cjp = periphery capacitance per µm
W = width of diffusion region
Yd = extent of diffusion region
Figure 4.3: Expanded view of an n+ drain or source region for computing depletion capacitances.
Since the thickness of the depletion layer depends on the voltage across the junction, both Cja and Cjp are functions of the junction voltage Vj. A general expression that describes the junction capacitance is
$$C_j = C_{j0}\left(1 - \frac{V_j}{\Phi_B}\right)^{-m}$$
where $V_j$ is the junction voltage (negative for reverse bias), $C_{j0}$ is the zero-bias capacitance ($V_j = 0$), and $\Phi_B$ is the built-in junction potential (∼ 0.6 V). m is a constant that depends on the distribution of impurities near the junction and has a value of the order of 0.3 to 0.5.
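The two diffusion capacitance formulas translate directly into code. The Python sketch below is illustrative only: the function names are hypothetical, and the default values for ΦB and m are simply picked from the ranges quoted in the text.

```python
def junction_capacitance(cj0, vj, phi_b=0.6, m=0.5):
    """Cj = Cj0 * (1 - Vj / PhiB) ** (-m); vj is negative for reverse bias."""
    if vj >= phi_b:
        raise ValueError("expression is only valid for vj < phi_b")
    return cj0 * (1.0 - vj / phi_b) ** (-m)

def depletion_capacitance(cja, cjp, w, yd):
    """Cd = Cja * (W * Yd) + Cjp * (2W + 2Yd): bottom area plus sidewall periphery."""
    return cja * (w * yd) + cjp * (2.0 * w + 2.0 * yd)
```

With these defaults, a reverse bias of 1.8 V halves the zero-bias junction capacitance, which illustrates why a voltage-averaged value is used for large-signal estimates.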
Routing Capacitances:
Routing capacitances between metal and poly layers and the substrate can be approximated using a parallel plate model ($C = \frac{\varepsilon}{t}A$), where A is the area of the plate capacitor, t is the insulator thickness, and ε is the dielectric constant of the insulating material between the plates. The parallel-plate approximation, however, ignores fringing fields. The effect of fringing fields is to increase the effective area of the plates. Consequently, poly and metal lines will actually have a higher capacitance (up to twice as large) than that predicted by the model. Interlayer capacitance such as metal-poly capacitance is also enhanced by fringing. As line widths are scaled, the widths (w) and heights of wires tend to reduce less than their separations (l). Accordingly, this fringing effect increases in importance. For current processes, a factor of 1.5 − 3 should be used. Another factor, which should be taken into account for small geometries when using the parallel plate model, is that a drawn shape (on mask) will not be the same as the actual physical shape produced on silicon.
Figure 4.4: Representation of long wire in terms of distributed RC sections
4.1.3 RC-line model
The propagation of a signal along a wire depends on many factors, including the distributed resistance and capacitance of the wire, the impedance of the driving source, and the load impedance. For very long wires, propagation delays caused by distributed resistance-capacitance (RC) in the wiring layer tend to dominate. This transmission line effect is particularly severe in poly wires because of the relatively high resistance of this layer. A long wire can be represented in terms of several RC sections, as shown in Fig. 4.4.
The response at node $V_j$ with respect to time is then given by
$$C\frac{dV_j}{dt} = (I_{j-1} - I_j) = \frac{(V_{j-1} - V_j)}{R} - \frac{(V_j - V_{j+1})}{R}$$
As the number of sections in the network becomes large (and the sections become small), the above expression reduces to the differential form:
$$rc\,\frac{dV}{dt} = \frac{d^2V}{dx^2}$$
where
x = distance from input
r = resistance per unit length
c = capacitance per unit length
Solution of this differential form yields an approximate signal delay of:
$$t_l = \frac{rcl^2}{2}$$
where l is the length of the wire and r and c are defined as above.
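Figure 4.5: Segmentation of polysilicon line (two 1 mm segments between input and output, separated by a buffer with delay τbuf)
Figure 4.6: Simple model for rc delay calculation (driver resistance Rs, line resistance Rt, line capacitance Ct, load capacitance Cl)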
The $l^2$ term in the equation above shows that signal delay will be totally dominated by this RC effect for very long signal paths. In order to optimize speed in a long poly line, one possible strategy is to segment the line into several sections and insert buffers within these sections as shown in Fig. 4.5.
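The benefit of segmentation follows from the quadratic length dependence and can be estimated with a few lines of Python. In this sketch the function names are illustrative, and the model of n equal segments separated by n − 1 identical buffers with a fixed delay `t_buf` is a simplifying assumption.

```python
def wire_delay(r, c, length):
    """Distributed RC delay t = r * c * l**2 / 2 of an unbuffered line."""
    return r * c * length ** 2 / 2.0

def segmented_delay(r, c, length, n_segments, t_buf):
    """Delay of the same line cut into n equal segments with n - 1 buffers (assumed model)."""
    if n_segments < 1:
        raise ValueError("need at least one segment")
    per_segment = wire_delay(r, c, length / n_segments)
    return n_segments * per_segment + (n_segments - 1) * t_buf
```

Splitting a line into two halves cuts the wire contribution to half of its original value, so segmentation pays off whenever the buffer delay is smaller than the wire delay saved.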
A model for the distributed RC delay, which takes driver and receiver loading into account, is shown in Fig. 4.6. Rs is the output resistance of the driver. Cl is the receiver input capacitance. Rt and Ct are the total lumped resistance and capacitance of the line. τ is the RC delay calculated using the equation $\tau = \frac{rcl^2}{2}$. The concept of using RC time constants for delay estimations is based upon the assumption that the time taken for a signal to reach 63% of its final value approximates the switching point of an inverter.
Wire length design guide
For the purpose of timing analysis, an electrical node may be defined as that region of connected paths in which the delay associated with signal propagation is small in comparison with gate delays. For sufficiently small wire lengths, RC delays can be ignored. Wires can then be treated as one electrical node and modeled as simple capacitive loads. It is therefore useful to define simple electrical rules that can be used as a guide in determining the maximum length of communication paths for the various interconnect levels. To do this, we require that wire delay $\tau_w$ and gate delay $\tau_g$ satisfy the following condition:
$$\tau_w \ll \tau_g$$
To fulfil this condition, the maximum length of the wire is given by:
$$l \ll \sqrt{\frac{2\tau_g}{rc}}$$
This establishes an upper bound on the allowable length of the interconnects where the above approximations are valid.
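The length bound follows from setting the distributed wire delay rcl²/2 equal to the gate delay. A minimal Python sketch; the function name is illustrative, and the returned value is the length at which wire delay equals gate delay, so usable wires must be considerably shorter.

```python
import math

def critical_wire_length(tau_g, r, c):
    """Length at which r * c * l**2 / 2 equals the gate delay tau_g."""
    return math.sqrt(2.0 * tau_g / (r * c))
```

Since the bound scales with the inverse square root of the product rc, a layer with four times the resistance per unit length tolerates only half the wire length.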
4.2 CMOS Gate Transistor Sizing
To have the same rise and fall times for an inverter, we must make
Wp = 2Wn
where Wp is the channel width of the p-device and Wn is the channel width of the n-device. This, of course, increases layout area and dynamic power dissipation. In some cascaded structures it is possible to use minimum size devices without compromising the switching response. This is illustrated in the following analysis, in which the delay response for an inverter pair (Fig. 4.7a) with Wp = 2Wn is given by
$$t_{\mathrm{inv.pair}} = t_{fall} + t_{rise} = R\cdot 3C_{eq} + 2\left(\frac{R}{2}\right)3C_{eq} = 3RC_{eq} + 3RC_{eq} = 6RC_{eq}$$
where R is the effective on resistance of a unit-sized n-transistor and $C_{eq} = C_g + C_d$ is the capacitance of a unit-size gate and drain region. The inverter pair delay with Wp = Wn is
$$t_{\mathrm{inv.pair}} = 4RC_{eq} + 2RC_{eq} = 6RC_{eq}$$
Thus we find similar responses are obtained for the two different conditions.
4.3 Power Dissipation
There are two components that establish the amount of power dissipated in a CMOS circuit.These are:
1. Static dissipation − due to leakage current.
2. Dynamic dissipation − due to:
(a) switching transient current
(b) charging and discharging of load capacitances
Figure 4.7: CMOS inverter pair timing response: (a) Wp = 2Wn, (b) Wp = Wn
4.3.1 Static power dissipation
Considering a complementary CMOS gate, if the input=‘0’, the associated n-device is ‘OFF’ and the p-device is ‘ON’. The output voltage is VDD or logic ‘1’. When the input=‘1’, the associated n-channel device is biased ‘ON’ and the p-channel device is ‘OFF’. The output voltage is 0V (VSS). Note that one of the transistors is always ‘OFF’ when the gate is in either of these logic states. Since no current flows into the gate terminal and there is no D.C. current path from VDD to VSS, the steady-state current, and hence the static power Ps, is zero.
However, there is some small static dissipation due to reverse bias leakage between diffusion regions and the substrate. The source-drain diffusion and the p-well diffusion form parasitic diodes. Since the diodes are reverse biased, only their leakage current contributes to static power dissipation. The leakage current is described by the diode equation
$$i_0 = i_s\left(e^{V/(kT/q)} - 1\right)$$
where
is = reverse saturation current
V = diode voltage
q = electronic charge
k = Boltzmann’s constant
T = temperature
The static power dissipation is the product of the device leakage current and the supply voltage. A useful estimate is to allow a leakage current of 0.1 nA to 0.5 nA per gate at room temperature. Then total static power dissipation Ps is obtained from
$$P_s = \left(\sum_{1}^{n} \text{leakage current}\right) \times \text{supply voltage}$$
For example, typical static power dissipation due to leakage for an inverter operating at 5 V is between 1 and 2 nW (nanowatts).
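The estimate can be written out as a small Python sketch. The gate count is a hypothetical value; the per-gate leakage range and the 5 V supply are the figures quoted above.

```python
def static_power(leakage_currents, vdd):
    """Total static power: sum of the per-gate leakage currents times the supply voltage."""
    return sum(leakage_currents) * vdd

VDD = 5.0        # supply voltage in volts
N_GATES = 1000   # hypothetical number of gates on the chip

low = static_power([0.1e-9] * N_GATES, VDD)    # every gate leaks 0.1 nA
high = static_power([0.5e-9] * N_GATES, VDD)   # every gate leaks 0.5 nA
print(f"static power between {low * 1e6:.2f} uW and {high * 1e6:.2f} uW")
```

Per gate this corresponds to 0.5 nW to 2.5 nW, a range that brackets the 1 to 2 nW quoted for a single inverter.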
4.3.2 Dynamic power dissipation
During transition from either ‘0’ to ‘1’ or, alternatively, from ‘1’ to ‘0’, both n- and p-transistors are on for a short period of time. This results in a short current pulse from VDD to VSS. Current is also required to charge and discharge the output capacitive load. This latter term is generally the dominant term. The current pulse from VDD to VSS results in a “short circuit” dissipation which is dependent on the load capacitance and the gate design. This is of relevance to I/O buffer design.
The dynamic dissipation can be modeled by assuming the rise and fall time of the step input is much less than the repetition period. The average dynamic power, Pd, dissipated during switching for a square-wave input Vin, having a repetition frequency of fp = 1/tp, as shown by Fig. 4.8, is given by
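$$P_d = \frac{1}{t_p}\int_{0}^{t_p/2} i_n(t)\,V_o\,dt + \frac{1}{t_p}\int_{t_p/2}^{t_p} i_p(t)\,(V_{DD} - V_o)\,dt$$
where
in = n-device transient current
ip = p-device transient current
For a step input with in(t) = CL dVo/dt (CL = load capacitance)
$$P_d = \frac{C_L}{t_p}\int_{0}^{V_{DD}} V_o\,dV_o + \frac{C_L}{t_p}\int_{V_{DD}}^{0} (V_{DD} - V_o)\,d(V_{DD} - V_o) = \frac{C_L V_{DD}^2}{t_p}$$
with fp = 1/tp, resulting in
$$P_d = C_L V_{DD}^2 f_p$$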
Figure 4.8: Waveforms for determination of dynamic power dissipation (Vin, Vo, and the device currents plotted against time t, with period tp and fall and rise times tf and tr).
Thus for the repetitive step input the average power that is dissipated is proportional to the energy required to charge and discharge the circuit capacitance. The important factor to be noted here is that the last equation shows power to be proportional to switching frequency but independent of the device parameters.
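A minimal Python sketch of the result Pd = CL VDD² fp follows. The load capacitance and the frequencies are hypothetical values chosen for illustration only.

```python
def dynamic_power(c_load, vdd, f_p):
    """Average dynamic power for a repetitive step input: C_L * VDD^2 * f_p."""
    return c_load * vdd ** 2 * f_p

C_L = 0.1e-12   # hypothetical load capacitance: 0.1 pF
VDD = 5.0       # supply voltage in volts

p_10mhz = dynamic_power(C_L, VDD, 10e6)
p_20mhz = dynamic_power(C_L, VDD, 20e6)
print(f"Pd at 10 MHz: {p_10mhz * 1e6:.1f} uW")
print(f"Pd at 20 MHz: {p_20mhz * 1e6:.1f} uW")
```

Doubling the switching frequency doubles the dissipation, while no transistor parameter appears in the expression.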
4.3.3 Power delay product
The power delay product (PDP) is used to characterize the overall performance of a digital gate circuit. It is given by
$$PDP = P_{av}\,t_p$$
where Pav is the average power dissipated by the gate and tp is the average propagation delay time. Typically, MOS-based digital gates display power-delay products on the order of a few picojoules (pJ). The PDP is commonly used to compare the performance of various logic families or processing technologies. A small PDP is desirable, as this implies both low power consumption and fast switching speeds.
Figure 4.9: Input voltage waveforms for the power-delay products. Panel (a): ideal square wave; panel (b): finite rise and fall time waveform. Both panels show Vin(t) between VOL and VOH over one logic cycle of period T.
As a first step towards understanding the meaning of the PDP, suppose that an ideal square wave Vin(t) (Fig. 4.9a) is applied to the resistively loaded nMOS inverter shown in Fig. 4.10a; the output voltage Vout(t) then assumes the form drawn in Fig. 4.10b. The average propagation delay is
$$t_p \approx \frac{1}{2}(R_{on} + R_L)\,C_{out}$$
using the approximations
$$t_{PHL} \approx \tau_D = R_{on}C_{out}, \qquad t_{PLH} \approx \tau_L = R_L C_{out}$$
where Ron is the on-resistance of the driver; note that Ron = RDS.
The average power dissipated by the circuit is given by
$$P_{av} = I_{av}V_{DD}$$
Iav is the average power supply current and is separated into two contributions: the constant (DC) current flow when the output is stable with Vout = VOL and the transient current that flows during the rise and fall times. Using Ohm’s law, the average DC power dissipation during the period T is
$$P_{av} = \frac{V_{DD}^2}{2(R_{on} + R_L)}$$
The PDP that results from the constant DC current flow only is given by
$$(PDP)_{DC} \approx \frac{1}{4}C_{out}V_{DD}^2$$
Figure 4.10: Power-delay product in a resistively loaded inverter. Panel (a): basic inverter; panel (b): output voltage; panel (c): resistor analogy for Vout = VOL.
The total power-delay product for the circuit must also account for the average power consumed by the gate during the rise and fall time intervals. Consider first the charging current supplied by VDD during the rise time tLH. Since the driver is in cutoff, this can be estimated by
$$I_{av} \approx C_{out}\frac{\Delta V}{\Delta t} = C_{out}\frac{V_l}{t_{LH}}$$
with Vl = VDD being the logic swing. The resulting PDP contribution due to this current is then
$$(PDP)_{LH} \approx C_{out}V_{DD}^2\,\frac{t_p}{t_{LH}}$$
The power supply current used by the inverter during the discharge time tHL is approximated by
$$I_{av} \approx \frac{1}{2}(I_{initial} + I_{final}) = \frac{1}{2}\left(\frac{V_{DD} - V_{OH}}{R_L} + \frac{V_{DD} - V_{OL}}{R_L}\right)$$
Iinitial and Ifinal give the current at the beginning and end of the discharging event. Thus, assuming $V_{OL} \ll V_{OH} = V_{DD}$,
$$I_{av} \approx \frac{V_{DD}}{2R_L}$$
Figure 4.11: Current waveforms for the power-delay product calculations. Panel (a): input voltage waveform; panel (b): power supply current for an NMOS inverter; panel (c): power supply current for a CMOS inverter.
Now, noting that tPHL ≈ τD, a first-order estimate for the discharge time tHL is
$$t_{HL} \approx 2\tau_D = 2R_{on}C_{out}$$
Forming the power-delay product for this time interval gives the term
$$(PDP)_{HL} \approx C_{out}V_{DD}^2\,\frac{R_{on}}{R_L}\,\frac{t_p}{t_{HL}}$$
The complete expression for the PDP is obtained by summing all contributions:
$$PDP \approx C_{out}V_{DD}^2\left(\frac{1}{4} + \frac{t_p}{t_{LH}} + \frac{R_{on}}{R_L}\,\frac{t_p}{t_{HL}}\right)$$
This can be simplified by noting that $R_{on} \ll R_L$ will be valid in a well-designed inverter. The propagation delay time is then tp ≈ (τL/2). Using this in conjunction with the approximations tLH ≈ 2τL and tHL ≈ 2τD gives
$$PDP \approx \frac{3}{4}C_{out}V_{DD}^2$$
as the lowest-order approximation for the total PDP.
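The three contributions can be evaluated numerically to see how close the bracket comes to 3/4. The Python sketch below uses the time-constant approximations from the text; the component values and the function name are hypothetical.

```python
def pdp_resistive_inverter(c_out, vdd, r_on, r_l):
    """Return (PDP, bracket factor) for the resistively loaded inverter."""
    tau_d = r_on * c_out            # discharge time constant
    tau_l = r_l * c_out             # charge time constant
    t_p = 0.5 * (tau_d + tau_l)     # average propagation delay
    t_lh = 2.0 * tau_l              # rise time estimate
    t_hl = 2.0 * tau_d              # fall time estimate
    factor = 0.25 + t_p / t_lh + (r_on / r_l) * (t_p / t_hl)
    return c_out * vdd ** 2 * factor, factor

# Hypothetical values: 0.1 pF load, 5 V supply, Ron = 1 kOhm, RL = 100 kOhm
pdp, factor = pdp_resistive_inverter(0.1e-12, 5.0, 1e3, 100e3)
print(f"bracket factor: {factor:.4f}")
print(f"PDP: {pdp * 1e12:.3f} pJ")
```

With Ron/RL = 0.01 the bracket evaluates to 0.755, close to the limiting value 3/4 obtained for Ron ≪ RL.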
The power-delay product for the CMOS inverter is computed by using the current waveform in Fig. 4.11c. Since current flows only during a switching event, the average power supply current required during a single logic cycle T can be written as
$$I_{av} = \frac{1}{T}\left[I_{Dn,LH}\,t_{LH} + I_{Dn,HL}\,t_{HL}\right]$$
In this equation IDn,LH gives the average current during the rise time, while IDn,HL is the average fall time current. For a completely symmetric CMOS inverter, the two currents are the same, so the power-delay product is given by
$$PDP_{CMOS} = I_{Dn,av}V_{DD}\,t_p\,\frac{f}{f_{max}}$$
4.4 Scaling
Very large-scale integration (VLSI) requires dense circuit layouts on silicon. The level of integration depends on the smallest-size feature permitted by the fabrication processes. To obtain the highest packing density, the size of the transistors must be made as small as possible. This, however, changes the internal operating physics of the MOSFETs. Phenomena that are negligible in “large” devices become limiting factors as device geometries are reduced.
This section discusses some of the important aspects involved in describing small MOSFETs. The level is introductory, with emphasis on parameters that affect circuit design. The model we use is a simple first-order constant field scaling.
4.4.1 Scaling principles
First-order MOS scaling theory indicates that the characteristics of an MOS device can be maintained and the basic operational characteristics preserved if the critical parameters of a device are scaled in accordance with a given criterion. Such an approach has been shown to be very effective in scaling from the range 5 µm to 10 µm minimum features to the range 1 µm to 3 µm minimum feature size.
Although first-order scaling does not give optimized device performance at small dimensions, the technique is very powerful in providing the necessary guidelines to identify the improvements (or otherwise) that can be expected as processes are scaled.
Basically the scaled device is obtained by applying a dimensionless factor α to
• all dimensions, including those vertical to the surface
• device voltages
• the concentration densities.
The resultant effect of the first-order scaling process is illustrated in Table 4.2. Table 4.2 shows that if device dimensions (which include channel length L, channel width W, oxide thickness Tox, junction depth Xj, applied voltages, and substrate concentration density N) are scaled by the constant parameter α, then the depletion thickness d, the threshold voltage Vt, and drain-to-source current Ids are also scaled. One of the important factors to be noted is that since the voltage is scaled, electric field E in the device remains constant. This has the desirable effect that many nonlinear factors essentially remain unaffected. A further point is that reduction in oxide thickness would require the fabrication process to provide thinner oxides with comparable yield to conventional oxide thicknesses.
| | PARAMETER | SCALING FACTOR |
|---|---|---|
| DEVICE PARAMETER | Length; L | 1/α |
| | Width; W | 1/α |
| | Gate oxide thickness; tox | 1/α |
| | Junction depth; Xj | 1/α |
| | Substrate doping; Na or Nd | α |
| | Supply voltage; VDD | 1/α |
| RESULTANT INFLUENCE | Electric field across gate oxide; E | 1 |
| | Depletion layer thickness; d | 1/α |
| | Parasitic capacitance; WL/tox | 1/α |
| | Gate delay; (VC/I) | 1/α |
| | DC power dissipation; Ps | 1/α² |
| | Dynamic power dissipation; Pd | 1/α² |
| | Power speed product | 1/α³ |
| | Gate area | 1/α² |
| | Power density; (VI/A) | 1 |
| | Current density; (I/A) | α |
| | Transconductance; gm | 1 |
Table 4.2: Influence of first-order scaling on MOS device characteristics
The depletion regions associated with the pn junctions of the source and drain determine how small we can make the channel. As a rule, the source-drain distance must be greater than the sum of the widths of the depletion layers to ensure that the gate is able to exercise control over the conductance of the channel. Thus in order to reduce the length of the channel one needs to reduce the width of the depletion layers. This is accomplished by increasing the doping level of the substrate silicon. As we scale device dimensions by 1/α, the drain-to-source current Ids per transistor reduces by a factor of α, while the number of transistors per unit area (that is, the circuit density) scales up by α², which subsequently results in the current density scaling linearly with α. Thus wider metal conductors will be necessary for densely packed structures.
A second characteristic illustrated in Table 4.2 is power density. Both the static power dissipation Ps and frequency dependent dissipation Pd decrease by 1/α² as the result of scaling. However, since the number of devices per unit area increases by α², the resultant effect is that the power density remains constant.
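The derived rows of Table 4.2 follow from the scaled device parameters by simple arithmetic, which the Python sketch below reproduces. It is only an illustration under the first-order assumptions stated in the text (dimensions, voltage and current each scaled by 1/α); the function and key names are hypothetical.

```python
def constant_field_scaling(alpha):
    """Relative change of device quantities under first-order scaling by alpha."""
    s = 1.0 / alpha
    length = width = t_ox = s      # all dimensions scale by 1/alpha
    voltage = s                    # supply voltage scales by 1/alpha
    current = s                    # Ids per transistor scales by 1/alpha
    capacitance = width * length / t_ox
    gate_delay = voltage * capacitance / current
    power = voltage * current
    gate_area = width * length
    return {
        "capacitance": capacitance,
        "gate_delay": gate_delay,
        "power": power,
        "power_speed_product": power * gate_delay,
        "gate_area": gate_area,
        "power_density": power / gate_area,
        "current_density": current / gate_area,
    }

factors = constant_field_scaling(2.0)
for name, value in factors.items():
    print(f"{name:20s} {value:g}")
```

For α = 2 the power per gate falls to 1/4 and the gate area to 1/4, so the power density stays at 1 while the current density doubles.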
An estimation of the limit in power density is derived from the thermodynamic relationship given by
$$T_j = T_{amb} + \theta_{jA}\,P$$
where
Tj = temperature of silicon chip
Tamb = ambient temperature
θjA = thermal resistance of the package
P = power dissipation.
Generally the thermal resistance is expressed in °C per watt, that is, as the temperature rise (in °C) produced by one watt of dissipated heat.
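As a small worked example of this relationship, the following Python sketch computes the chip temperature for a given dissipation and, conversely, the power budget for a given temperature limit. The package thermal resistance and the temperature limit are hypothetical values.

```python
def junction_temperature(t_amb, theta_ja, power):
    """Chip temperature: ambient temperature plus thermal resistance times power."""
    return t_amb + theta_ja * power

def max_power(t_j_max, t_amb, theta_ja):
    """Largest dissipation that keeps the chip at or below t_j_max."""
    return (t_j_max - t_amb) / theta_ja

T_AMB = 25.0      # ambient temperature in degrees C
THETA_JA = 40.0   # hypothetical package thermal resistance in degrees C per watt

t_j = junction_temperature(T_AMB, THETA_JA, 1.5)
p_max = max_power(125.0, T_AMB, THETA_JA)
print(f"chip temperature at 1.5 W: {t_j:.1f} C")
print(f"power budget for a 125 C limit: {p_max:.2f} W")
```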
As the temperature increases, the carrier mobility falls, thus reducing the gain of devices. This, in turn, would reduce the speed of circuits. If high temperature, high speed circuits are required, then special consideration during design is necessary.
One of the limitations of first-order scaling is that it gives the wrong impression of being able to scale proportionally to zero dimension, or to zero threshold voltages. In reality, both theoretical and practical considerations do not permit such behavior. This is highlighted when the surface concentrations become larger than $1 \times 10^{19}\ \mathrm{cm^{-3}}$, above which the gate oxide breaks down before surface inversion can take place for the formation of the channel.
4.4.2 Interconnect layer scaling
Although constant-field (first-order) scaling gives a number of improvements, there are a number of circuit parameters such as voltage drop, line propagation delay, current density, and contact resistance that exhibit significant degradation with scaling. For example, scaling the thickness and width of a conductor by α reduces the cross-sectional area by α². The scaled line resistance R′ is given by
$$R' = \frac{\rho}{t/\alpha}\left[\frac{L/\alpha}{W/\alpha}\right] = \alpha R$$
where ρ is the resistivity of the conductor and t is the conductor thickness. The voltage drop along such a line can now be expressed as
$$V'_d = (I/\alpha)(\alpha R) = IR$$
which is a constant. However, for constant chip size, the length of some of the signal paths that traverse the chip, as a rule, does not scale down. This gives the principal result that voltage drops along communication paths are larger by a factor of α with respect to the scaled voltages. In a similar manner, we can derive the line response time as
$$\tau'_s = (\alpha R)(C/\alpha) = RC$$
which is a constant. However, as before, for a constant chip size many of the communication paths do not scale. Thus the line response time normalized to the scaled line response is larger by a factor of α. The significance of this result is that it is somewhat difficult to take full advantage of the higher switching speeds inherent in scaled devices when signals are required to propagate over long paths. Thus the distribution and organization of clocking signals becomes a major problem as geometries are scaled.
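The two normalized results can be reproduced with a few lines of Python. The sketch applies the scaling relations quoted above to one line and compares the voltage drop with the scaled supply voltage and the line response with the scaled gate delay; the function name and all numerical values are hypothetical.

```python
def scale_line(R, C, I, V, t_gate, alpha):
    """Apply first-order scaling by alpha to a line and its surrounding circuit."""
    R_s = alpha * R            # scaled line resistance R' = alpha * R
    C_s = C / alpha            # scaled line capacitance
    I_s = I / alpha            # scaled current
    V_s = V / alpha            # scaled supply voltage
    t_gate_s = t_gate / alpha  # scaled gate delay
    v_drop = I_s * R_s         # V'_d = I * R, unchanged by scaling
    tau = R_s * C_s            # line response R * C, unchanged by scaling
    return {
        "v_drop": v_drop,
        "tau": tau,
        "normalized_v_drop": v_drop / V_s,
        "normalized_tau": tau / t_gate_s,
    }

# Hypothetical line: 100 ohm, 0.1 pF, carrying 1 mA in a 5 V circuit with 1 ns gates
before = scale_line(100.0, 0.1e-12, 1e-3, 5.0, 1e-9, alpha=1.0)
after = scale_line(100.0, 0.1e-12, 1e-3, 5.0, 1e-9, alpha=2.0)
print(after["normalized_v_drop"] / before["normalized_v_drop"])
print(after["normalized_tau"] / before["normalized_tau"])
```

Both ratios equal α: the absolute drop and response time stay constant, but relative to the scaled voltage and gate delay they are worse by a factor of α.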
The influence of scaling on interconnection paths is summarized in Table 4.3. As seen from Table 4.3, metal lines must carry a higher current with respect to cross-sectional area; thus electromigration becomes a major factor to consider. The second problem relates to an increase in the capacitance of wiring. As the level of integration increases, the average line length on a chip tends to increase also. However, the power dissipation per gate decreases, which diminishes the ability of gates to drive wiring capacitances. Under such conditions, average gate delay is determined by the interconnection rather than the gate itself.
| PARAMETER | SCALING FACTOR |
|---|---|
| Line resistance; r | α |
| Line response; rc | 1 |
| Normalized line response | α |
| Line voltage drop; Vd | 1 |
| Normalized line voltage drop | α |
| Current density; J | α |
| Normalized contact voltage drop; Vc/V | α² |
Table 4.3: Influence of scaling on interconnect media
Many of these limitations are being overcome by scaling lateral dimensions while keeping vertical dimensions approximately constant.
4.5 Power and Clock Distribution
4.5.1 Power distribution
One of the most important issues in chip planning is the routing of power. In technologies in which there is only one level of metal, VDD and ground are routed in interdigitated trees. This is illustrated in Fig. 4.12. Crossunders are very difficult. When necessary, these are done in low resistance interconnect (poly over buried contact over active area) with a multiplicity of contact cuts. Consider the extreme case of a crossunder that must carry 100 mA. One square of low resistance interconnect might have a maximum resistance of, say, 10 Ω/□. Thus a square crossunder would drop 1 volt. Over 50 contact cuts of 2 µm to the metal on each side would be needed because of metal migration limits. Obviously, 100 mA is an awful lot of current to squeeze through a crossunder. Even 10 mA can be difficult, and 10 mA corresponds only to about twenty nMOS inverters.
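The crossunder arithmetic can be written as a short Python sketch. The sheet resistance and currents are the figures quoted above; the geometry arguments and the function name are hypothetical.

```python
def crossunder_drop(current, sheet_resistance, length, width):
    """Voltage drop across a crossunder: current times sheet resistance times number of squares."""
    squares = length / width
    return current * sheet_resistance * squares

R_SHEET = 10.0   # ohm per square, as in the example above

drop_100ma = crossunder_drop(0.1, R_SHEET, length=2e-6, width=2e-6)   # one square
drop_10ma = crossunder_drop(0.01, R_SHEET, length=2e-6, width=2e-6)
print(f"drop at 100 mA: {drop_100ma:.2f} V")
print(f"drop at 10 mA:  {drop_10ma:.2f} V")
```

Making the crossunder wider than it is long reduces the number of squares and therefore the drop, at the cost of area.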
Power is usually distributed locally in diffusion since it must get to the sources and drains anyway. For low-power gates, this local power distribution is not too bad, but for high performance devices, great care must be taken. When two levels of metal are available the general power distribution is much easier, though by no means trivial.
Clearly, one of the worst scenarios for power supply noise is when large segments of the chip transition simultaneously. One strategy, therefore, is to distribute power in such a way that parts of the chip that are likely to transition all at once are routed separately. If power is distributed across these simultaneously switching segments, we would expect large surges on the power lines, but if power is distributed along the signal lines, then surge currents should be much smaller.
Figure 4.12: Layout pattern for VDD and VSS lines (labels in the figure: Vdd, Vss, output pads).
A major problem of high performance chips is bringing power onto the chip. Bonding wires can have anywhere from 0.25 to 2 nH of inductance (about 0.5 to 1 nH/mm). VDD and ground are often double-bonded (two wires to the bonding pad) but while this lowers the inductance somewhat, it does not give the expected factor of two unless the wires are kept far apart. This is because there is mutual coupling between the wires. Separate power pins might be used for the output driver, since these drivers cause huge switching transients and can tolerate more power supply noise than the internal circuitry.
4.5.2 Clock distribution
Synchronizing machine operations and data transfers with clock pulses provides us with a structured framework for dealing with the complexities of large system designs. Clocking is a global control technique which provides the “glue” for system operation. It is equally important at the circuit level, particularly in a dynamic logic stage.
4.5.3 Clock and Timing Circles
System level timing can be described using circular timing charts. Consider an ideal pseudo 2-phase scheme with mutually-exclusive pulses φ1 and φ2:
φ1(t) · φ2(t) = 0
System timing can be described by constructing the chart shown in Fig. 4.13. Time increases in a counter-clockwise direction with one full rotation corresponding to the clock period T. Segments are labeled according to time intervals when a clock signal is high. In this example, φ1 = 1 during the first half-period, while φ2 = 1 during the last half-period.
Figure 4.13: Pseudo 2-Phase Clocking Chart
A more realistic clocking arrangement is depicted by the clocking circle in Fig. 4.14. If both clocks have 50% duty cycles, normal operation gives
φ1(t) · φ2(t) = 0
except during the transition times. Mutually-exclusive clock signals provide timing intervals for logical operations, and are used to allow for normal gate delay times. Overlapped segments are avoided to prevent ill-defined movement of data, instructions, or control signals. Transition times can be made small by proper clock generator design.
Figure 4.14: Pseudo 2-Phase Overlap Times
Clock skew is represented by rotating one of the clocks as shown in Fig. 4.15. The skew time ts is defined as the time interval where
φ1(t) · φ2(t) = 1
and indicates the possibility of unwanted simultaneous bit transfers. This may lead to severe conflict problems in the operation.
Figure 4.15: Clock Skew
4.5.4 Clock Generation Circuits
A basic 2-phase clock generator circuit is designed to generate $\phi$ and its complement $\bar{\phi}$ from a single input CLK signal. This is often a matter of convenience to the user: requiring only a single external clock makes the chip’s usage more attractive to the board designer.
Various circuits have been developed for use in clock generation. Fig. 4.16 provides a CMOS generator/driver which uses a transmission gate as a delay element. MOSFETs Mn1 and Mp1 form an inverter which acts as the first driver for the chain. The upper branch of the circuit consists of two cascaded inverters and generates the signal $\bar{\phi} = \overline{CLK}$, while the lower branch only has a single inverter and gives $\phi = CLK$. Transmission gate TG is used as a delay element to minimize clock skew between $\phi$ and $\bar{\phi}$. Since it is biased into active conduction, we will model it using an equivalent resistance RTG, and introduce the time constant
$$t_D \simeq R_{TG}\,C_{in}$$
If the propagation delay through an inverter is tp, then choosing
$$t_D \simeq t_p$$
equalizes the delay between the upper and lower branches. Recalling that the transmission gate conductance can be approximated by
$$G_{TG} \simeq \beta_n(V_{DD} - V_{Tn}) + \beta_p(V_{DD} - |V_{Tp}|)$$
we see that clocking skew can be controlled by adjusting the size of the TG transistors.
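A minimal Python sketch of this sizing argument follows. It evaluates the approximate transmission gate conductance and the resulting delay tD = RTG Cin; the device parameters, the input capacitance and the function names are hypothetical values for illustration.

```python
def tg_conductance(beta_n, beta_p, vdd, vtn, vtp_abs):
    """Approximate transmission gate conductance G_TG."""
    return beta_n * (vdd - vtn) + beta_p * (vdd - vtp_abs)

def tg_delay(beta_n, beta_p, vdd, vtn, vtp_abs, c_in):
    """Time constant t_D = R_TG * C_in with R_TG = 1 / G_TG."""
    return c_in / tg_conductance(beta_n, beta_p, vdd, vtn, vtp_abs)

# Hypothetical device values
VDD, VTN, VTP_ABS = 5.0, 0.8, 0.8   # volts
C_IN = 50e-15                       # input capacitance of the following inverter

g = tg_conductance(50e-6, 50e-6, VDD, VTN, VTP_ABS)
t1 = tg_delay(50e-6, 50e-6, VDD, VTN, VTP_ABS, C_IN)
t2 = tg_delay(100e-6, 100e-6, VDD, VTN, VTP_ABS, C_IN)   # TG transistors twice as wide
print(f"G_TG = {g * 1e3:.2f} mS, t_D = {t1 * 1e12:.1f} ps")
print(f"with doubled beta: t_D = {t2 * 1e12:.1f} ps")
```

Since β is proportional to W/L, doubling the width of both TG transistors halves tD, which is the adjustment used to match tD to the inverter delay tp.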
Another straightforward approach uses an SR latch as shown in Fig. 4.17. The clocking signal CLK is inverted, and CLK and $\overline{CLK}$ are used to drive the SR circuit. The 2-phase clock signals $\phi$ and $\bar{\phi}$ are taken from the latch outputs. This logic can also be used to generate pseudo 2-phase clocks φ1 and φ2 by redefining the outputs.
Figure 4.16: Clock Generator With a TG Delay
Figure 4.17: Latch-Based Clock Generator
4.5.5 Clock Drivers and Distribution Techniques
Once the clocking pulses are generated they must be distributed throughout the chip in a manner which minimizes clock skew. Fig. 4.18 illustrates the problem in a pseudo 2-phase circuit by showing timing circles at various points on a chip. Skew problems originate mostly from
• Unbalanced loads at the driver,
• Unequal RC line delays,
so that the driver circuits and associated distribution schemes are important in maintaining the synchronous logic design. A related problem is that the drive capability of the circuit must be able to handle large capacitive loads at the required clock frequency.
Figure 4.18: Clock Skew Due to Distribution
One approach to designing a clock distribution network is to use a cascaded chain of inverting buffers that matches the clock generator to the distribution line. Careful global planning and structured distribution patterns can also be used to solve the problem.
Clock distribution can also be accomplished by using a balanced tree network with multiple fanouts as shown in Fig. 4.19. Identical drivers can be used within a given stage. Moreover, the drive requirements of the output circuits are reduced from the single inverter design since the fan-out (FO) has been split into groups. Each inverter reshapes the clocking waveform, making the performance less sensitive to variations in the interconnect routing.
Clock skew problems can be minimized by using symmetrical geometries for the clock distribution lines. An example is the “H-tree” network shown in Fig. 4.20. Every clock distribution point O is the same distance from the driver D, giving equal delay times. If the load capacitance is the same at every O-point, then the clocks will all be in phase with one another. Other geometrical patterns can be used so long as the general design criteria are unchanged.
4.6 Input Protection Circuits
Input pads connect data, control, or clocking signals to on-chip logic gates. When the pads are directly connected to the gate electrodes of MOSFETs, care must be taken to ensure that excessive static electrical charge does not destroy the transistor. Protection circuits are designed to drain excessive charge away from the MOS capacitance to avoid static burnout.
To understand the origin of the problem, recall that a MOSFET gate is basically a capacitor of value
Cg = CoxWL
Figure 4.19: Clock Line Capacitance
Figure 4.20: Clock Line Capacitance
With a gate-substrate voltage VG applied to the transistor, the internal oxide electric field is given by
$$E_{ox} \simeq \frac{V_G}{x_{ox}}$$
where we have ignored any trapped oxide or surface charge. Breakdown occurs because silicon dioxide has a breakdown field value of approximately
$$E_{BD} \sim 7.5 \times 10^{6}\ \mathrm{V/cm}$$
If Eox exceeds this value, the oxide insulating properties break down and charge is transported through the material. This usually results in destruction of the device. Since xox is usually less than about 450 Å, the maximum gate voltage $V_{G,max} \simeq E_{BD} \cdot x_{ox}$ which can be applied to the device is a relatively small number.
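To put a number on this limit, the Python sketch below evaluates VG,max ≃ EBD · xox for the breakdown field and oxide thickness quoted above, plus one thinner oxide as a hypothetical comparison. Only the unit conversion from ångström to centimetre is added.

```python
E_BD = 7.5e6               # oxide breakdown field in V/cm (value from the text)
ANGSTROM_TO_CM = 1e-8      # 1 angstrom = 1e-8 cm

def max_gate_voltage(x_ox_angstrom, e_bd=E_BD):
    """Maximum gate voltage V_G,max = E_BD * x_ox for an oxide thickness in angstrom."""
    return e_bd * x_ox_angstrom * ANGSTROM_TO_CM

v_450 = max_gate_voltage(450.0)
v_200 = max_gate_voltage(200.0)   # hypothetical thinner oxide
print(f"x_ox = 450 A: V_G,max = {v_450:.2f} V")
print(f"x_ox = 200 A: V_G,max = {v_200:.2f} V")
```

A few tens of volts is far below the potentials that static charge can build up on an unprotected pad, which is why the protection circuits described next are needed.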
The basic idea of an input protection circuit is to allow for alternate charge flow paths when the input voltage gets too large. Diode structures are very useful in this application since they have breakdown voltages which can be controlled. Moreover, reverse breakdown in a pn-junction is non-destructive, so that the protection circuit is reusable. Junctions which are purposely used at the reverse-bias breakdown voltage are generally termed Zener diodes.
Fig. 4.21 illustrates a simple input protection circuit for a CMOS IC. Reverse biased pn-junctions are used as protection diodes, and a series connected resistor is included to drop some of the voltage. Both diode pairs (D1, D2) and (D3, D4) are designed to undergo breakdown for positive or negative voltage surges. R is designed to reduce the voltage that reaches (D3, D4); this effectively increases the level of protection to the transistor gate.
Figure 4.21: Input Protection Circuit
One problem that exists with this input protection circuit is the introduction of parasitic RC time constants into the network.
Other input protection schemes are used. Fig. 4.22 shows a common circuit based on the properties of a thick field oxide MOSFET. The transistor has a threshold voltage of VT,F > VDD and is in cutoff during normal operation. A large input voltage V > VT,F drives the transistor into conduction, providing a path to ground to drain off the excessive charge. The breakdown voltage of the FOX MOSFET is large enough to withstand the high voltages since XFOX is large.
4.7 Static Gate Sizing
An interesting and useful problem is that of optimizing a chain of static gates to minimize the overall propagation delay. This type of problem arises in many different situations and is important to high-performance circuits. In particular, it is relevant to the output drivers and clocking circuits.
Figure 4.22: Thin Oxide MOSFET Protection Circuit
A classic example is shown in Fig. 4.23 where the objective is to design the fastest network for driving a large capacitance. For the problem at hand, we will assume a series of inverting buffers for the driving network. At first sight, it may appear that we would want the fewest possible gates between the input and the load. This simple solution, however, ignores the effect of capacitive loading on successive stages. Accounting for these factors shows that the sizing of the transistors in the chain allows for minimization of the delay. This gives the interesting result that additional logic gates are often inserted to reduce the overall propagation delay between two points.
Figure 4.23: Capacitive Loading Problem
Consider the scaled inverter chain shown in Fig. 4.24. Each gate is characterized by a sizing factor Sj which is normalized to the first stage such that S1 = 1, while Sj > 1 for (j > 1). By definition, the first stage has a MOSFET conduction factor
$$\beta_1 = k'\left(\frac{W}{L}\right)_1$$
while the j-th stage is described by
$$\beta_j = S_j\,\beta_1$$
The values of Ci and Co are determined by gate 1, and scaled for successive gates. Note that an additional capacitive component Cw has been added between stages. This represents the wiring contribution. We assume that the wiring capacitance between two stages is proportional to the sizing factor of the second stage. The capacitance between the j-th gate and the (j + 1)-st gate can be summarized as follows:
• SjCo, output capacitance from gate j
• Sj+1Ci, input capacitance to gate (j + 1)
• Sj+1Cw, wiring capacitance into gate (j + 1).
The time delay through gate j is thus estimated by
$$t_{D,j} = \left(\frac{R}{S_j}\right)\left[S_jC_o + S_{j+1}(C_i + C_w)\right]$$
Our calculation is to determine the values of Sj for (j = 2, ...) which minimize the total delay through the chain.
Figure 4.24: Inverter Sizing Problem
Suppose that there are N stages in the chain. The total time delay is given by
$$T_D = \sum_{j=1}^{N} \frac{R\left[S_jC_o + S_{j+1}(C_i + C_w)\right]}{S_j}$$
To minimize TD, we differentiate with respect to Sj and look for zero slope points via
$$\frac{\partial T_D}{\partial S_j} = 0;$$
this results in the recursion relation
$$\frac{S_{j+1}}{S_j} = \frac{S_j}{S_{j-1}}$$
for j = 2, 3, ..., N. If this is to hold for arbitrary values of j, then
$$\frac{S_{j+1}}{S_j} = K = \text{constant}$$
must be true. Now then, the boundary conditions of the problem are
$$S_1 = 1, \qquad S_{N+1} = \frac{C_L}{C_i}$$
at the ends of the chain. Forming the product
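$$\frac{S_2}{S_1} \cdot \frac{S_3}{S_2} \cdot \frac{S_4}{S_3} \cdots \frac{S_{N+1}}{S_N} = K^N$$
and using the boundary conditions gives
$$K^N = \frac{C_L}{C_i}$$
Thus, we obtain the scaling ratio in the form
$$K = \left(\frac{C_L}{C_i}\right)^{1/N}$$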
which is our final result. Explicitly, the scaling factors are given by
$$S_1 = 1, \quad S_2 = K, \quad S_3 = K^2, \quad \ldots, \quad S_N = K^{N-1}$$
as the scaling required to optimize the chain. The minimum delay is then
$$T_{D,min} = \sum_{j=1}^{N} R\left[C_o + K(C_i + C_w)\right] = NR\left[C_o + K(C_i + C_w)\right]$$
as verified by direct substitution.
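The direct substitution can also be carried out numerically. The Python sketch below sizes a chain geometrically, evaluates the delay sum term by term, and compares it with an arbitrarily sized chain of the same length; all component values and function names are hypothetical.

```python
def size_chain(c_load, c_i, n_stages):
    """Return the scaling ratio K and the sizing factors S_1 .. S_N."""
    k = (c_load / c_i) ** (1.0 / n_stages)
    return k, [k ** j for j in range(n_stages)]

def chain_delay(sizes, c_load, r, c_o, c_i, c_w):
    """Total delay T_D, using the boundary condition S_{N+1} = C_L / C_i."""
    s = list(sizes) + [c_load / c_i]
    return sum(
        r * (s[j] * c_o + s[j + 1] * (c_i + c_w)) / s[j]
        for j in range(len(sizes))
    )

# Hypothetical values: R = 1 kOhm, Co = 5 fF, Ci = 10 fF, Cw = 2 fF, CL = 10 pF
R, C_O, C_I, C_W, C_L = 1e3, 5e-15, 10e-15, 2e-15, 10e-12

k, sizes = size_chain(C_L, C_I, n_stages=3)
t_opt = chain_delay(sizes, C_L, R, C_O, C_I, C_W)
t_closed_form = 3 * R * (C_O + k * (C_I + C_W))
t_other = chain_delay([1.0, 5.0, 100.0], C_L, R, C_O, C_I, C_W)

print(f"K = {k:.3f}, sizes = {[round(x, 2) for x in sizes]}")
print(f"geometric sizing: {t_opt * 1e12:.1f} ps (closed form {t_closed_form * 1e12:.1f} ps)")
print(f"arbitrary sizing: {t_other * 1e12:.1f} ps")
```

For these values K is 10 and the geometric chain is faster than the arbitrarily sized one, in line with the minimization above.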
One important point which is obtained from the above analysis deals with the delay time. The equation $K = S_{j+1}/S_j$ says physically that the minimum chain delay occurs when every stage has the same individual time delay tD.
The final question which must be answered is the number of stages N needed to optimize the delay. To calculate this, we differentiate TD with respect to N and set the result to 0. This gives the general equation
$$RC_o + R(C_i + C_w)\left(\frac{C_L}{C_i}\right)^{1/N}\left[1 - \frac{\ln(C_L/C_i)}{N}\right] = 0$$
If Co is small, this reduces to the well-publicized result
$$N = \ln\left(\frac{C_L}{C_i}\right)$$
which is rounded to the nearest integer for given values of Ci and CL.
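The stage count can be cross-checked by brute force. The Python sketch below evaluates the minimum chain delay NR[Co + K(Ci + Cw)] for a range of N with Co = 0 and compares the best N with the logarithmic rule; the capacitance ratio and the function name are hypothetical, and delays are expressed in units of R(Ci + Cw).

```python
import math

def min_chain_delay(n, ratio, r=1.0, c_o=0.0, c_i=1.0, c_w=0.0):
    """Minimum delay of an N-stage chain with K = ratio**(1/N), ratio = C_L / C_i."""
    k = ratio ** (1.0 / n)
    return n * r * (c_o + k * (c_i + c_w))

ratio = 1000.0                       # hypothetical C_L / C_i
n_formula = round(math.log(ratio))   # N = ln(C_L / C_i), rounded
n_sweep = min(range(1, 15), key=lambda n: min_chain_delay(n, ratio))

print(f"N from the formula: {n_formula}")
print(f"N from the sweep:   {n_sweep}")
print(f"stage ratio K at that N: {ratio ** (1.0 / n_sweep):.3f}")
```

For a ratio of 1000 both approaches give N = 7, and the corresponding stage ratio K is close to e, as expected when Co is neglected.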
4.8 Off-Chip Driver Circuits
Off-chip driver circuits are critical to the overall chip design. Much effort is put into speeding up internal switching networks. Careful output design ensures that the high-performance specifications apply to the external characteristics as well. Some important problems which must be addressed include
• Efficient buffer circuitry between internal and off-chip drivers
• Minimization of transmission line effects
• Fast switching
• Static charge protection
as well as interface-specific items such as a CMOS-TTL level converter.
An inverter circuit can be used as a basic off-chip driver. The dominant performance factors are the transient switching times $t_{LH}$ and $t_{HL}$. Transmission line effects also enter into the problem; this is complicated by the fact that the line characteristics such as $Z_0$ depend on the specifics of the mounting and circuit traces.
4.8.1 Basic Off-Chip Driver Design
The simplest off-chip driver circuit consists of an inverter chain which is designed to handle a large capacitive load. $C_{out}$ includes contributions from the bonding pad, the package wiring, and the circuit board trace. Since this easily amounts to tens or a few hundred picofarads depending on the interface specifications, the transistors must be relatively large.
Consider the 2-stage off-chip driver network shown in Fig. 4.25. We may use time constants to obtain first-order design estimates for the sizes of the output transistors Mn2 and Mp2 by writing
$$\left(\frac{W}{L}\right)_{n2} = \frac{C_{out}}{\tau_n k'_n (V_{DD} - V_{Tn})}$$
$$\left(\frac{W}{L}\right)_{p2} = \frac{C_{out}}{\tau_p k'_p (V_{DD} - |V_{Tp}|)}$$
where $\tau_n$ and $\tau_p$ are the high-to-low and low-to-high time constants, respectively. Since the output capacitance seen by an off-chip driver can be large, the MOSFET aspect ratios are also quite large. These are obtained using several parallel-connected transistors to aid in layout and parasitic control. Sizing theory may be used to determine the sizes of the first stage transistors Mn1 and Mp1.
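As a rough illustration of how large these aspect ratios become, the first-order estimate can be evaluated directly. This is a sketch only: the load, time constants, process transconductances, and threshold voltages below are assumed values, not data from this course.

```python
def aspect_ratio(c_out, tau, k_prime, v_dd, v_t):
    """First-order W/L estimate: C_out / (tau * k' * (V_DD - |V_T|))."""
    return c_out / (tau * k_prime * (v_dd - abs(v_t)))


# Assumed example values (hypothetical process and load).
C_OUT = 50e-12      # 50 pF off-chip load
TAU = 2e-9          # 2 ns target time constant for both edges
V_DD = 5.0

wl_n = aspect_ratio(C_OUT, TAU, k_prime=40e-6, v_dd=V_DD, v_t=0.8)
wl_p = aspect_ratio(C_OUT, TAU, k_prime=16e-6, v_dd=V_DD, v_t=-0.8)
print(f"(W/L)_n2 = {wl_n:.1f}, (W/L)_p2 = {wl_p:.1f}")
```

Ratios in the hundreds are typical of this estimate, which is why the output devices are built from several parallel-connected transistors.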
Figure 4.25: Double-Inverter Off-Chip Driver Circuit
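The actual values of the fall and rise times can be estimated from
$$t_{HL} = \tau_n\left[\frac{2V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left(\frac{2(V_{DD} - V_{Tn})}{V_o} - 1\right)\right]$$
$$t_{LH} = \tau_p\left[\frac{2|V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left(\frac{2(V_{DD} - |V_{Tp}|)}{V_o} - 1\right)\right]$$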
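The two expressions have the same form, so a single helper can evaluate both. The sketch below uses assumed numbers; in particular, V_o is simply passed in as a parameter because this section does not define it, and the value 0.5 V is an assumption for illustration.

```python
import math


def switching_time(tau, v_dd, v_t, v_o):
    """Evaluate tau * [2|V_T| / (V_DD - |V_T|) + ln(2 (V_DD - |V_T|) / V_o - 1)]."""
    vt = abs(v_t)
    drive = v_dd - vt
    return tau * (2.0 * vt / drive + math.log(2.0 * drive / v_o - 1.0))


# Assumed example values.
t_hl = switching_time(tau=2e-9, v_dd=5.0, v_t=0.8, v_o=0.5)
t_lh = switching_time(tau=2e-9, v_dd=5.0, v_t=-0.8, v_o=0.5)
print(f"t_HL = {t_hl * 1e9:.2f} ns, t_LH = {t_lh * 1e9:.2f} ns")
```

With equal time constants and symmetric thresholds the two edges come out identical, and each is roughly three time constants long.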
4.8.2 Tri-State and Bidirectional I/O
Tri-state off-chip driver circuits are constructed by splitting the input signal to individually control each output transistor. Normal operation gives high and low voltages, while the high-impedance state is obtained by driving both the nMOS and pMOS devices into cutoff. An inverting tri-state circuit is shown in Fig. 4.26. When the tri-state variable Z = 1, pMOSFETs Mp1 and Mp2 are off, while nMOSFET Mn conducts. This gives normal circuit operation. If Z = 0, then the gate voltages to output transistors are given by
Figure 4.26: Tri-State Output Circuit
$$V_p = V_{DD}, \qquad V_n = 0$$
so that both are in cutoff. A condition of Z = 0 thus provides the necessary high-impedance state.
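At the logic level, the behaviour described above can be summarized in a few lines. This is a behavioural sketch only, with a hypothetical function name; it models the inverting tri-state output as 0, 1, or `None` for high impedance and says nothing about the transistor-level circuit.

```python
from typing import Optional


def inverting_tristate(z: int, data_in: int) -> Optional[int]:
    """Return the driven output (0 or 1), or None for the high-impedance state."""
    if z == 0:
        # Vp = VDD and Vn = 0: both output transistors are in cutoff.
        return None
    # Z = 1: normal operation of the inverting driver.
    return 1 - data_in


for z in (0, 1):
    for d in (0, 1):
        print(f"Z={z} in={d} -> out={inverting_tristate(z, d)}")
```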
Bi-directional input/output (I/O) circuits are also quite useful. An example is shown in Fig. 4.27. The tri-state section of the circuit is a non-inverting buffer with an enable control E, where E = 0 gives the High-Z state. Operation is straightforward and easily understood by examining the circuit.
Figure 4.27: Bi-Directional I/O Circuit
Chapter 5
CMOS Process and Layout Design of Integrated Circuits
5.1 Processing Steps
The fabrication of an integrated circuit consists of a series of steps carried out in a specific order. These steps convert the circuit design into an operable silicon integrated circuit chip.
The way in which individual IC fabrication steps are carried out is of critical importance to the outcome of the manufacturing process. The main objective is to minimize the departure of geometrical features of the processed circuit from those determined during the design. To achieve this, a high degree of control over the parameters of each processing step is required. Equally rigid requirements apply to the physical and chemical properties of materials used for IC fabrication as well as to the cleanliness of the production environment.
5.1.1 Wafer Processing
The basic raw material used in semiconductor plants is a wafer or disk of silicon, which varies from 75 mm to 150 mm in diameter and is less than 1 mm thick. Wafers are cut from ingots of single crystal silicon that have been pulled from a crucible melt of pure molten polycrystalline silicon. Controlled amounts of impurities are added to the melt to provide the crystal with the required electrical properties. The crystal orientation is determined by a seed crystal that is dipped into the melt to initiate single crystal growth. The seed is then gradually withdrawn vertically from the melt while simultaneously being rotated.
Slicing into wafers is usually carried out using internal cutting edge diamond blades.
5.1.2 The n-Well CMOS Process
A common approach to n-well CMOS fabrication has been to start with a moderately doped p-type substrate (wafer), create the n-type well for the p-channel devices, and build the n-channel transistors in the native p-substrate. The mask that is used in each process step is shown in addition to a sample cross-section through an n-device and a p-device.
Figure 5.1: Czochralski process for manufacturing silicon ingots
1. The first mask defines the n-well (or n-tub). p-channel transistors will be fabricated in this well. Field oxide is etched away to allow a deep diffusion.
2. The next mask is called the “thin oxide” or “thinox” mask, as it defines where areas of thin oxide are needed to implement transistor gates and allow implantation to form p- or n-type diffusions for transistor source/drain regions. The field oxide areas are etched to the silicon surface and then the thin oxide is grown on these areas. Other terms for this mask include active area, island, and mesa.
3. Polysilicon gate definition is then completed. This involves covering the surface with polysilicon and then etching the required pattern. In a self-aligned process, the poly gate regions lead to aligned source-drain regions.
4. An n+-mask is then used to indicate those thin-oxide areas (and polysilicon) that are to be implanted n+. Hence the thin-oxide area exposed by the n+-mask will become an n+ diffusion area. If the n+ area is in the p-substrate, then an n-channel transistor or n-type wire may be constructed. If the n+ area is in the n-well, then an ohmic contact to the n-well may be constructed. An ohmic contact is one which is only resistive in nature and is not rectifying (as in the case of a diode). In other words, there is no junction and current can flow in both directions in an ohmic contact. This type of mask is sometimes called the select mask as it selects those transistor regions that are to be n-type.
5. The next step usually uses the complement of the n+-mask, although an extra mask is
Figure 5.2: The n-Well Mask
Figure 5.3: The Active Mask
Figure 5.4: The Poly Mask
Figure 5.5: The n+ Mask
normally not needed. The “absence” of an n+-region over a thin oxide area indicates that the area will be a p+-diffusion. A p+-diffusion in the n-well defines possible p-transistors and wires. A p+-diffusion in the p-substrate allows an ohmic contact to be made. Following this step, the surface of the chip is covered with a layer of SiO2.
Figure 5.6: The p+ Mask
6. Contact cuts are then defined. This involves etching any SiO2 down to the contacted surface. These allow metal to contact diffusion regions or polysilicon regions.
7. Metallization is then applied to the surface and selectively etched.
8. As a final step, the wafer is passivated and openings to the bond pads are etched to allow for wire bonding. Passivation protects the silicon surface against the ingress of contaminants that can modify circuit behavior in deleterious ways.
Additional steps might include threshold adjust steps to set the threshold voltages of the n- and p-devices.
In current fabrication processes the polysilicon is normally doped n+. The p+ doping phase reduces the poly doping such that the polysilicon inside the p+ regions has a higher sheet resistance than the polysilicon outside the p+ regions. The extent of this reduction may influence the quality of metal-poly contacts within p+ regions.
Figure 5.7: The Contact Mask
Figure 5.8: The Metalisation Mask
5.1.3 The p-Well CMOS Process
Typical p-well fabrication steps are similar to an n-well process, except that a p-well is used. The first masking step defines the p-well regions. This is followed by a low-dose boron implant driven in by a high-temperature step for the formation of the p-well. The next steps are to define the devices and other diffusions, to grow field oxide, and to form contact cuts and metallization. A p-well mask is used to define the p-well regions, as opposed to an n-well mask in an n-well process. A p+-mask may be used to define the p-channel transistors and VSS contacts. Alternatively, we could use an n+-mask to define the n-channel transistors, as the masks usually are the complement of each other.
Figure 5.9: An Example of a p-Well CMOS Process
5.1.4 The Twin-Tub Process
Twin-tub CMOS technology provides the basis for separate optimization of the p-type and n-type transistors, thus making it possible for threshold voltage, body effect, and the gain associated with n- and p-devices to be independently optimized. Generally the starting material is either an n+ or p+-substrate with a lightly doped epitaxial or epi layer, which is
Figure 5.10: continued
used for protection against latch-up. The aim of epitaxy (which means “arranged upon”) is to grow high purity silicon layers of controlled thickness with accurately determined dopant concentrations distributed homogeneously throughout the layer. The electrical properties for this layer are determined by the dopant and its concentration in the silicon.
The process sequence, which is similar to the p-well process apart from the tub formation where both p-well and n-well are utilized, entails the following steps:
• tub formation
• thin oxide etching
• source and drain implantations
• contact cut definition
• metallization.
Fig. 5.11 illustrates the cross-sections of the 3 processes on an example of an inverter.
Figure 5.11: Twin-tub process cross-section and layout of an inverter
5.1.5 Isolation
Device isolation deals with electrically decoupling neighboring transistors on a densely-packed integrated circuit. Unwanted conduction channels must be eliminated by preventing both direct and indirect current flow paths. The most common isolation techniques used in bulk CMOS are LOCOS and trench isolation.
LOCOS
The Local Oxidation of Silicon (LOCOS) achieves device isolation by selective oxide growth. A typical LOCOS process starts by growing a thin stress relief thermal oxide (SiO2) layer on the silicon surface. Next, silicon nitride (Si3N4) is deposited and patterned, keeping nitride in the areas where transistors will be built. The entire surface is then exposed to an oxidizing ambient. Nitride does not oxidize, but any exposed silicon will react to form SiO2. The resulting LOCOS structure is illustrated in Fig. 5.12.
Figure 5.12: LOCOS Isolation
Simple analysis shows that
$$X_R = 0.46\,X_{FOX}$$
where $X_R$ is the depth of recession and $X_{FOX}$ is the thickness of the grown field oxide (FOX) which separates device locations. In general, the patterned nitride regions are called active areas, while the oxide growth defines the field regions between active transistor sections.
LOCOS is a widely used isolation technique in many processing lines. However, a major limitation is the problem of active area encroachment which occurs during the FOX growth process and reduces the usable size of the region. The problem is illustrated in Fig. 5.13. Even though the nitride protects the silicon surface, oxygen diffuses through the sides of the stress-relief oxide layer during the FOX growth. SiO2 is thus formed around the edges, lifting the nitride upwards and forming a characteristic bird’s beak transition region between the active area and the field oxide. Encroachment cannot be avoided and affects the integration density.
Figure 5.13: Encroachment in LOCOS
Trench Isolation
Trench isolation uses reactive ion etching (RIE) to form small trenches in the silicon. The trenches are then filled with oxide and polysilicon to electrically isolate neighboring device regions from one another. High integration levels are possible since the trench widths can be reduced to the order of a few microns. Trench isolation is illustrated in Fig. 5.14. A field implant may be used to increase the trench threshold voltage $V_{T,Tr}$. Small trench dimensions make this approach particularly important for high-density integration.
Figure 5.14: Trench Isolation
The vertical trench regions may also be used to create large-value capacitors without consuming valuable surface real estate. An example geometry which uses doped poly and p+ as capacitor plates is shown in Fig. 5.15.
Trench capacitors are commonly used in advanced dynamic RAM (DRAM) cell design since they conserve surface real estate. Trench isolation has been developed to the point where it is a viable production line technique. It almost eliminates the problem of active area encroachment found in LOCOS and is useful when increasing the logic integration density.
5.1.6 Latchup
Bulk CMOS technologies are susceptible to latchup. This condition occurs when a parasitic conducting path is established between VDD and ground, directing current away from the circuit. Once latchup occurs, it can only be stopped by removing the power supply and restarting the circuit. In addition to halting the circuit operation, latchup may induce catastrophic failure from heating.
Fig. 5.16 shows the cross-section of an n-well CMOS substrate region where the latchup problem originates. To understand the origin of the latchup problem, note that the voltage across parasitic resistor Rw1 acts to forward bias the emitter-base junction of Q2. If VEB2 reaches the turn-on voltage of about 0.7 volts, IC2 flows. This current flowing through Rs1 develops a forward bias VBE1 across the base-emitter junction of Q1, causing IC1 to increase. The
Figure 5.15: Trench Capacitor
transistor pair Q1 and Q2 is connected to form a positive feedback loop, so that the buildup continues.
Figure 5.16: Origin of CMOS Latchup
Latchup triggering may occur anytime the circuit voltages exceed normal levels. Causes include
• Voltage overshoot/undershoot
• Avalanche breakdown
• Punchthrough
• Parasitic MOSFETs
• Photocurrent
and others. Although careful circuit design may reduce the possibility of inducing latchup, it is generally worthwhile to take extra precautions.
There are two main approaches to dealing with the latchup problem: (a) reduce the transistor current gains, or (b) decouple the transistor feedback loop; it is common to use both in practice. Deep trench isolation can also be used to reduce the possibility of latchup. Fig. 5.17 illustrates adjacent nMOS and pMOS transistors separated by deep trenches. Parasitic bipolar transistors are not found in the structure since the isolating pn-junctions have been replaced by an oxide barrier.
Figure 5.17: Trench-isolated CMOS
Latchup prevention is an important aspect of CMOS chip layout and design. One should always check to ensure that all suggested rules have been followed to guard against the problem.
5.2 Design Rules
Design rules are sets of geometrical specifications which govern chip design for a given fabrication process. The layout rules are statements of the geometrical limits placed on the mask patterns and include items such as minimum widths, dimensions, and spacings. Violating the design rules can lead to a geometry which cannot be replicated in the fabrication line, yielding a non-functional circuit. Designers are often saved from simple mistakes by the omnipotent design rule checker (DRC) used to find layout violations. Another important fact is that parasitic circuit component values are a direct consequence of the layout geometry. Since the layout is an integral part of the circuit design, it is important to examine how a design rule set affects the overall performance.
5.2.1 Lithography and Fabrication
Microelectronic lithography is the science of transferring a pattern to each layer of material in an integrated circuit. The resolution of the lithography limits the smallest line dimension
and constitutes a metric for the surface dimensions. The most common approach is optical lithography which uses an ultraviolet light source through a patterned mask to selectively expose a light-sensitive photoresist layer. Alternate approaches include electron-beam and X-ray sources; these offer finer resolution but introduce other problems. X-ray lithography currently appears to be the likely winner in the next generation, but recent advances in e-beam systems still look promising.
Regardless of the approach, the resolution is limited by diffraction effects which occur whenever a wave passes by an opaque edge. This results in the minimum linewidth specification in the design rule set and may be viewed as the smallest mask dimension which can be reliably transferred to the chip surface. UV optical lithography has a minimum linewidth on the order of about 0.5 microns; e-beam systems can pattern down to one-tenth of a micron or less.
Diffraction also limits how small we can make the spacing between two lines; this consideration gives a set of minimum spacing allowances in the design rule set. Minimum spacings also are needed to account for misaligned masking steps, lateral spreading, and other problems which occur during the many weeks it takes to fabricate a wafer. Yield enhancement plays an important role in setting the final numbers.
5.2.2 Basic Design Rule Set
Design rules are best illustrated by example. We consider a 1.5-micron n-well, single-poly, double-metal process which uses 10 masks. The process flow description in Table 5.2 lists the major steps in the fabrication and indicates each mask in proper sequence.
Geometrical layout rules specify minimum mask feature sizes. Rules are provided for each masking layer, and also for spacings between different layers. The former originate from lithographic constraints or physical considerations. Bloats and shrinks may be applied to selected layers during the fabrication process, but the resulting physical overlay for the structure is still represented by the layout drawing.
Table 5.1 provides a listing of design rules for a 1.5-micron CMOS process. These consist of minimum widths or dimensions, minimum spacings between features on the same or other layers, overlap distances, and other items of importance to the chip layout. Some examples of the rules are shown below. Ground rules are usually accompanied by a complete set of drawings to illustrate each specification.
Mask          Value (µ)   Description
01 NWELL      3.0         Minimum width
              2.0         Minimum spacing (same polarity)
              13.0        Minimum spacing (different polarity)
02 ACTIVE     1.5         Minimum width (diffusion line)
              2.25        Minimum width under POLY
              2.25        Minimum spacing (same polarity)
              2.50        Minimum spacing (different polarity)
              3.0         p-ACTIVE inside of NWELL to NWELL-edge: pMOSFET
              3.75        p-ACTIVE outside of NWELL to NWELL-edge: substrate contact
              0.0         n-ACTIVE inside of NWELL to NWELL-edge: well contact
              6.0         n-ACTIVE outside of NWELL to NWELL-edge: nMOSFET
03 POLY       1.5         Minimum width
              2.0         Minimum spacing
              1.25        Gate overlap with ACTIVE
              0.75        POLY outside of ACTIVE to ACTIVE edge
              2.25        POLY inside of ACTIVE to ACTIVE edge
04 NPLUS      1.5         Minimum spacing
              1.25        Spacing to ACTIVE
05 PPLUS                  PPLUS is reverse of NPLUS
06 CONTACT    1.5×1.5     Size
              1.5         Minimum spacing
              1.25        Spacing to POLY edge (from inside)
              1.75        Spacing to POLY (contact outside of POLY)
              1.0         Spacing to ACTIVE edge (from inside)
              1.75        Spacing to ACTIVE (contact outside of POLY)
07 METAL1     2.0         Minimum width
              2.25        Minimum spacing
08 VIA        1.5         Size
              2.0         Minimum spacing
              1.0         Overlap with METAL1
              1.5         Overlap with METAL2
              2.0         Spacing to POLY or ACTIVE
              1.5         Spacing to CONTACT
09 METAL2     2.75        Minimum width
              3.0         Minimum spacing
10 PAD        100×100     Dimensions
              5           Spacing to glass edge
Table 5.1: CMOS 1.5-Micron Design Rule Example
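To make the idea of a rule check concrete, the same-layer width and spacing entries of Table 5.1 can be applied to a list of rectangles. The following is a minimal sketch under simplifying assumptions: only axis-aligned rectangles, only three layers, no inter-layer rules, and touching shapes treated as one merged shape. The `Rect` class and function names are hypothetical and do not describe any particular DRC tool.

```python
import math
from dataclasses import dataclass

# Same-layer rules in microns, copied from Table 5.1.
MIN_WIDTH = {"POLY": 1.5, "METAL1": 2.0, "METAL2": 2.75}
MIN_SPACING = {"POLY": 2.0, "METAL1": 2.25, "METAL2": 3.0}


@dataclass(frozen=True)
class Rect:
    layer: str
    x0: float
    y0: float
    x1: float
    y1: float


def width_violations(rects):
    """Rectangles whose smaller dimension is below the layer's minimum width."""
    return [r for r in rects
            if min(r.x1 - r.x0, r.y1 - r.y0) < MIN_WIDTH[r.layer]]


def gap(a, b):
    """Edge-to-edge distance between two rectangles (0 if they touch or overlap)."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return math.hypot(dx, dy)


def spacing_violations(rects):
    """Pairs on the same layer that are separated by less than the minimum spacing."""
    found = []
    for i, a in enumerate(rects):
        for b in rects[i + 1:]:
            if a.layer != b.layer:
                continue
            d = gap(a, b)
            if 0.0 < d < MIN_SPACING[a.layer]:
                found.append((a, b, d))
    return found


layout = [
    Rect("POLY", 0.0, 0.0, 1.0, 5.0),        # 1.0 wide: below the 1.5 minimum
    Rect("METAL1", 10.0, 0.0, 12.0, 10.0),
    Rect("METAL1", 14.0, 0.0, 16.0, 10.0),   # 2.0 from its neighbour: below 2.25
]
print(width_violations(layout))
print(spacing_violations(layout))
```

The pairwise loop is quadratic in the number of shapes; it is adequate for a sketch but not for a full chip.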
An integrated circuit may be viewed as a set of overlaid geometric patterns. Each layer is
[Figure: Design Rules for 1.50 µm CMOS Process (layout drawings for the n-well, active area, and poly rules, including scribe lane spacings)]
Minimum channel length for VDD = 5 V is 1.5 µ and for VDD > 5 V is 2.25 µ.
[Figure: Design Rules for 1.50 µm CMOS Process, continued (layout drawings for the n+ select and contact rules; p+ is the reverse of n+)]
[Figure: Design Rules for 1.50 µm CMOS Process, continued (layout drawings for the metal1, via, and metal2 rules)]
Maximum metal1 line width: 30 µ. Maximum current density: 0.5 mA/µ.
STEP NO.   MASK NO.   LAYER NAME    PROCESS STEP
0          -          -             Start with p-type wafer
1          01         NWELL         n-tub diffusion
2          02         ACTIVE        Active area definition
3          -          THIN OXIDE    Grow gate oxide
4          -          POLY          Deposit polysilicon
5          03         POLY          Pattern polysilicon
6          04         NPLUS         n+ implant
7          05         PPLUS         p+ implant
8          -          -             Deposit oxide
9          06         CONTACT       Pattern poly contacts
10         -          METAL1        Deposit metal 1
11         07         METAL1        Pattern metal 1
12         -          -             Deposit CVD oxide
13         08         VIA           Pattern metal 2 contacts
14         -          METAL2        Deposit metal 2
15         09         METAL2        Pattern metal 2
16         -          GLASS         Nitride passivation
17         10         PAD           Pattern pad openings
Table 5.2: Basic n-well CMOS Process
shaped to provide the proper characteristics when referenced to every other layer. High-density circuit design requires compacting the geometrical patterns into a small area without violating the design rules.
Active Areas
Dimensional specifications for active device areas are larger than the minimum permitted by the lithography to account for encroachment from the isolation. As shown in the sequence of Fig. 5.18, growth of the field oxide creates the bird’s beak region which must be avoided when patterning the device.
Gate Dimensions
Basic self-aligned MOSFETs are fabricated using the polysilicon gate as a mask for an n+ or p+ drain/source ion implant. Lateral doping effects give effective channel lengths which are smaller than the drawn values shown on the poly mask.
Gate Overhang
Self-aligned MOSFETs use the gate polysilicon as a mask to the drain and source implants. To ensure a functional MOSFET we require that the masks are drawn so that the poly gate extends further than required in the W direction. Fig. 5.20 shows the geometry. Providing
Figure 5.18: Active Area Encroachment in LOCOS
for a gate overhang allowance compensates for mask misalignment between the poly and n+ or p+ regions. If the gate overhang is reduced to zero, then even a minor registration error would result in a shorted transistor.
Contacts and Vias
Contact and via etches in the oxide can be troublesome failure points in a high-density layout. If the contact windows are too large, nonuniform coverage may result in void formation and other problems. The same comment also applies to oxide cuts which are too small. To avoid inducing contact-related failure modes, it is common practice to allow only one size for contact windows; large areas are connected by multiple contacts. This is illustrated in Fig. 5.21.
Metal Dimensions
Metal layers are deposited at the end of the fabrication sequence. They generally encounter a very rugged terrain due to patterning of the previous layers. Owing to this fact, the design rule widths and spacings must be large to ensure electrical current flow. Another reason for
Figure 5.19: Effective Channel Length
increased widths is to allow larger current flow levels for power and ground connections.
5.3 Circuit Extraction and Electrical Process Parameters
The term circuit extraction covers a broad class of layout analysis problems. The fundamental problem is connectivity extraction, which derives a list of interconnections among the terminals from a layout description. There are also several parameter extraction problems, which augment the basic connectivity information with measurements of features that are related to the (analog) electrical characteristics of the chip.
Consider the problem of finding transistors. Transistors are formed by intersecting the polysilicon and diffusion layers; their type depends on the presence or absence of different kinds of implant or tub.
Most circuit extractors treat two points (on the same or different layers) as electrically connected if they lie in the same region of a single layer or if they can be joined by a sequence of regions on several layers that are connected explicitly by contact windows. A common circuit extraction operation is to find maximal regions of electrically connected points, more commonly called nodes. This operation involves labeling the contents of each layer so that items belong to the same node if and only if they have the same label.
Figure 5.20: Design-mask transformation
Figure 5.21: Contact Cuts
5.3.1 Connectivity Extraction
The output of connectivity extraction is a list of transistors on the chip, together with node numbers on each transistor’s gate, source, and drain. This transistor list is adequate for
checking the logical correctness of the circuit. In order to check analog characteristics of the circuit, it is necessary to extract parasitic capacitances and resistances and transistor size information.
The first step in connectivity extraction is to create derived layers that correspond to transistors of different kinds and to electrically connected regions on single layers. To illustrate the creation of derived layers using the edge representation, suppose the artwork for an nMOS chip includes the following six levels: Dmask (the diffusion mask), Pmask (the polysilicon mask), Mmask (the metal mask), Cmask (contact windows from metal to underlying layers), Bmask (buried contact windows between polysilicon and diffusion), and Imask (the depletion transistor implant). Then we could create five derived layers as follows:
    trans   ← Dmask and Pmask and not Bmask
    dwires  ← Dmask and not trans
    PDcuts  ← Pmask and Dmask and Bmask
    MPcuts  ← Mmask and Pmask and Cmask
    MDcuts  ← Mmask and Dmask and Cmask and not Pmask
Regions in layer trans are transistor channels, that is, places where polysilicon crosses diffusion outside of a buried contact region. Conducting diffusion regions are represented in layer dwires. The layers PDcuts, MPcuts, and MDcuts contain precisely the places where materials of the appropriate types make electrical contact.
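The Boolean layer operations can be tried out on a toy layout. The sketch below is an illustration only: it represents each mask as a set of unit grid cells rather than the edge representation mentioned in the text, and it assumes that a metal-to-diffusion cut excludes polysilicon. The helper `rect_cells` and the coordinates are hypothetical.

```python
def rect_cells(x0, y0, x1, y1):
    """Unit grid cells covered by the rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


# Toy layout: a horizontal diffusion strip crossed by a vertical poly strip,
# with a metal contact to the diffusion at the right-hand end.
dmask = rect_cells(0, 2, 8, 4)    # diffusion
pmask = rect_cells(3, 0, 5, 6)    # polysilicon
mmask = rect_cells(6, 1, 8, 5)    # metal
cmask = rect_cells(6, 2, 8, 4)    # contact window
bmask = set()                     # no buried contacts in this example

# Derived layers; "and" becomes set intersection, "and not" set difference.
trans = (dmask & pmask) - bmask
dwires = dmask - trans
pd_cuts = pmask & dmask & bmask
mp_cuts = mmask & pmask & cmask
md_cuts = (mmask & dmask & cmask) - pmask   # assumption: cut must not lie over poly

print("channel cells:", sorted(trans))
print("diffusion wire cells:", len(dwires))
print("metal-diffusion contact cells:", sorted(md_cuts))
```

The channel splits the diffusion strip into two separate wire regions, which become the source and drain of the transistor.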
The next step is to assign globally consistent signal labels to the items on each conducting layer that belong to a node, using the contact windows to merge signals between layers. The final step in connectivity extraction is to find for each transistor the signal labels on the nodes that are its terminals. This requires examining all regions that abut a transistor region.
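One common way to obtain globally consistent labels is a disjoint-set (union-find) structure, in which every contact window merges the two regions it joins. This is a sketch of that general technique, not the method used by any specific extractor; the region names and the `label_nodes` helper are hypothetical.

```python
class DisjointSet:
    """Union-find with path halving."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def label_nodes(regions, contacts):
    """Map each conducting region to a node label; contacts merge regions."""
    ds = DisjointSet()
    for region in regions:
        ds.find(region)
    for a, b in contacts:
        ds.union(a, b)
    return {region: ds.find(region) for region in regions}


# Hypothetical regions on the conducting layers and the contacts joining them.
regions = ["metal_a", "poly_g", "diff_s", "diff_d", "metal_b"]
contacts = [("metal_a", "poly_g"), ("metal_b", "diff_d")]
labels = label_nodes(regions, contacts)

# Terminal lookup for one transistor whose channel abuts these regions.
transistor = {"gate": "poly_g", "source": "diff_s", "drain": "diff_d"}
print({terminal: labels[region] for terminal, region in transistor.items()})
```

Two regions end up with the same label exactly when a chain of contacts connects them, which matches the definition of a node given above.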
5.3.2 Parasitic Capacitance Extraction
To extract capacitance we still treat each node as equipotential but also consider it as the terminal of one or more capacitors. Each region has a capacitance between itself and the chip substrate and also internodal capacitances between itself and other overlapping or nearby nodes.
Substrate capacitance can be accurately approximated as a function of the area and perimeter of each region on each layer. Capacitance between two nodes of the circuit is much harder to compute accurately. Internodal capacitance is not a simple function of area and perimeter.
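A simple area-plus-perimeter model illustrates the substrate capacitance calculation. This is a sketch assuming a rectangular region and a linear model; the function name is hypothetical, and the two coefficients are the n+ diffusion bottom and sidewall entries of Table 5.3.

```python
UM_TO_CM = 1e-4


def substrate_capacitance(width_um, height_um, c_area_nf_per_cm2, c_perim_pf_per_cm):
    """Capacitance in farads of a rectangle: area term plus perimeter (sidewall) term."""
    area_cm2 = (width_um * UM_TO_CM) * (height_um * UM_TO_CM)
    perimeter_cm = 2.0 * (width_um + height_um) * UM_TO_CM
    c_bottom = area_cm2 * c_area_nf_per_cm2 * 1e-9
    c_sidewall = perimeter_cm * c_perim_pf_per_cm * 1e-12
    return c_bottom + c_sidewall


# 10 µm x 10 µm n+ diffusion region: 25 nF/cm^2 bottom, 4 pF/cm sidewall.
c_diff = substrate_capacitance(10.0, 10.0, 25.0, 4.0)
print(f"C = {c_diff * 1e15:.1f} fF")
```

For this small square the sidewall term (16 fF) is comparable to the bottom term (25 fF), so the perimeter cannot be neglected.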
5.3.3 Transistor Size Extraction
Analog characteristics such as the drive of an MOS transistor are a function of its channel length and width. For a rectangular transistor formed by polysilicon that completely overlaps diffusion, length is one-half of the transistor’s perimeter with polysilicon, and width is one-half of the transistor’s perimeter with diffusion.
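The perimeter rule translates directly into code. The sketch below assumes the extractor has already measured how much of the channel boundary abuts polysilicon and how much abuts diffusion; the function name and the numbers are hypothetical.

```python
def transistor_size(perimeter_with_poly, perimeter_with_diffusion):
    """Return (length, width) of a rectangular channel from its shared perimeters."""
    length = perimeter_with_poly / 2.0       # two edges of length L border poly
    width = perimeter_with_diffusion / 2.0   # two edges of width W border diffusion
    return length, width


# Channel drawn 1.5 µ long and 6 µ wide: 3 µ of boundary with poly, 12 µ with diffusion.
length, width = transistor_size(3.0, 12.0)
print(f"L = {length} µ, W = {width} µ, W/L = {width / length}")
```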
5.3.4 Parasitic Resistance Extraction
When we consider the problem of extracting resistances from a layout, the abstraction of transistors connected by equipotential nodes breaks down completely. It does not make sense to associate resistance with a node: resistance is defined between pairs of points. Thus, a node attached to the terminals of $k$ transistors gives rise to $k(k-1)/2$ resistances, one between each pair of terminals.
One idea is to reduce the number of resistances we must compute by chopping the region into electrically isolated regions. If we add the appropriate $k-2$ junctions to a node attached to $k$ terminals, then we need to compute only $O(k)$ resistances, instead of $k(k-1)/2$. (See Fig. 5.22)
Figure 5.22: A region with eight terminals has 28 interconnection resistances. Making the cross-hatched junctions into new nodes splits the region into 10 electrically isolated regions and reduces the number of interconnection resistances to 10
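The quadratic growth of the pairwise count is easy to tabulate. This short sketch only evaluates the $k(k-1)/2$ formula; the function name is hypothetical.

```python
def pairwise_resistances(k):
    """Number of terminal pairs, k(k-1)/2, for a node with k terminals."""
    return k * (k - 1) // 2


for k in (2, 4, 8, 16, 32):
    print(f"k = {k:2d}: {pairwise_resistances(k)} pairwise resistances")
```

For the eight-terminal region of Fig. 5.22 the formula gives 28, and the count roughly quadruples each time the number of terminals doubles.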
A second way to reduce the number of resistances is to break nodes into rectangles by introducing artificial junctions at corners. Thus, resistances can be more easily computed.
Careful resistance extraction is the hardest and most expensive problem. Indeed, most chips are manufactured without ever undergoing a complete resistance extraction because such an extraction would result in a prohibitively large network of resistors.
5.3.5 Process Parameter and Technology Description
The technology description file contains all information specific to a particular technology. Among this information, and of particular importance for the extractor, is the specification of the layers that can be used in a process and the electrical parameters of that process.
Layers are specified by their name and their type. The type of a layer distinguishes between auxiliary layers, implantation layers, and interconnect layers. Auxiliary layers are ignored by the extractor. Interconnect layers form the conducting patterns in a chip layout, so in a chip all interconnections will always be made via such layers. If a layer is of type interconnect, an associated terminal layer must be specified for it. Given the interconnect layers, the extractor is able to determine where the nodes of an element are located. Another important part of the technology description is the specification of the elements to be extracted.
For extraction of parasitic elements, electrical process parameters must be known. The layer capacitances and the layer and contact resistances are necessary for exact modelling of parasitic capacitances and resistances on a wafer, e.g. for calculating load capacitances (gates) and coupling capacitances (between wires). Process parameters must also be taken into account during the design: polysilicon lines, for example, should not be made too long because of the high capacitance and resistance of the polysilicon layer. Example parameters of an n-well CMOS process are listed in Tables 5.3 and 5.4.
Capacitance                         Value (nF/cm²)
Gate-Oxide                          135
n+–diff to substrate (bottom)       25
n+–diff to substrate (sidewall)     4 (pF/cm)
p+–diff to n-well (bottom)          38
p+–diff to n-well (sidewall)        4 (pF/cm)
Poly–substrate                      5.9
Metal1–substrate                    3.2
Metal2–substrate                    2
Metal1–metal2                       3.9
Metal1–poly                         5.4
Metal2–poly                         2.5
Metal1–n+-diff.                     5.2
Metal1–p+-diff.                     5.5
Metal2–n+-diff.                     2.4
Metal2–p+-diff.                     2.5
Table 5.3: Layer capacitances of an n-well CMOS process
Resistance       Value
n-well           2.5 kΩ/□
n+-diffusion     50 Ω/□
p+-diffusion     150 Ω/□
Poly             50 Ω/□
Metal1           60 mΩ/□
Metal2           40 mΩ/□
Contact          100 Ω/contact
Via              1 Ω/via
Table 5.4: Layer resistances of an n-well CMOS process
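The remark about long polysilicon lines can be made concrete with a rough estimate. The sketch below is a first-order illustration only: it uses the poly and metal1 entries of Tables 5.3 and 5.4, treats the wire as one lumped resistor and one lumped area capacitor to substrate, and ignores sidewall and coupling capacitance. The wire dimensions (1000 µm × 2 µm) and the function name `wire_rc` are assumptions chosen for the example.

```python
# Sheet resistance in ohm per square (Table 5.4) and
# area capacitance to substrate in F/cm^2 (Table 5.3).
SHEET_RES = {"poly": 50.0, "metal1": 0.060}
AREA_CAP = {"poly": 5.9e-9, "metal1": 3.2e-9}


def wire_rc(layer, length_um, width_um):
    """Return (R in ohm, C in farad, R*C in seconds) of a straight wire."""
    squares = length_um / width_um                     # number of squares in series
    resistance = SHEET_RES[layer] * squares
    area_cm2 = (length_um * 1e-4) * (width_um * 1e-4)  # 1 um = 1e-4 cm
    capacitance = AREA_CAP[layer] * area_cm2
    return resistance, capacitance, resistance * capacitance


for layer in ("poly", "metal1"):
    r, c, rc = wire_rc(layer, length_um=1000.0, width_um=2.0)
    print(f"{layer:7s} R = {r:9.1f} ohm  C = {c * 1e15:6.1f} fF  RC = {rc * 1e12:8.2f} ps")
```

Under these assumptions the polysilicon line has a lumped RC product roughly three orders of magnitude larger than the metal1 line of the same geometry, which is the reason long signal runs are normally kept out of polysilicon.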
5.4 Basic Layout
Transforming schematics into physical circuits occurs during the layout process. All aspects of the circuit performance are structured by the patterning. Parasitics, interconnect coupling, and logic integration density are also determined by the geometries used in the layout artwork. Although layout is easy to learn, the interplay between the geometrical shapes and the resulting electrical behavior makes it difficult to master.
5.4.1 IC Design
IC design is a very complex process that involves hundreds of decisions dealing with a variety of IC performance and manufacturing-related issues. The final phase in the design is the creation of an IC layout, i.e. the creation of the drawing representing the geometry of the designed circuit. For a given process such a drawing uniquely defines the IC geometry and therefore the performance of the designed circuit.
The layout of an IC is defined as a set of polygons that determines the presence or absence of regions in a number of conducting and isolating layers. In other words, an IC layout shows from which part of the IC surface such materials as metal, silicon dioxide, photoresist, and so on should be removed, and where other materials should be deposited.
During the design the IC is represented by a set of numbers that can be manipulated to create a composite drawing of IC masks on the screen of the terminal or on the color plotter. In the manufacturing process a “hard copy” of this layout is needed in the form of photolithographic masks.
Typically, the IC design is transformed into a set of masks in a sequence of steps illustrated in Fig. 5.23. First, coordinates of all elements of the IC composite drawing are computed. Then data representing different layers are separated (Fig. 5.23 (c) and (d)) and an image of each IC layer is produced. Typically, such images are engraved on the surface of glass plates covered with chromium, using a photographic technique and a pattern generator or E-beam equipment. Masks created in this way are called master masks.
Next, the master masks are scaled down (Fig. 5.23 (e-f)) and duplicated (Fig. 5.23 (g-h)) so that working masks made in this way contain a couple of tens to a couple of hundreds of the same images as the master masks. The size of the working mask is such that with a single exposure the entire area of a single manufacturing wafer can be covered.
In newer lithography techniques, working masks are not needed and the image from the mask is transferred directly onto the surface of the wafer (the master mask is then called a reticle). Special high-precision optical step-and-repeat cameras are used for this purpose.
Data that describe a single IC layer can also be used to project an image directly onto the surface of the manufacturing wafer using an electron beam technique. In this technique a deflected beam of electrons exposes appropriate regions directly on the surface of the photoresist.
5.4.2 General Layout Strategies
Structured layout is based on the idea of grids and cells. The simplest approaches start with the power distribution lines $V_{DD}$ and $V_{SS}$ and structure the circuits as needed. Each gate is placed in a semi-rectangular cell, and cascaded logic is achieved using adjacent cells. Fig. 5.24 illustrates the general idea. Both signal and power lines run horizontally in the network. Logical gates are built between the metal $V_{DD}$ and $V_{SS}$ lines, while the signals may move between poly and metal layers when necessary. Minimization of the area is achieved by creative placement and shaping of the MOSFETs, interconnects, and cells in the overall grid structure. It is important to remember that the dimensions set the electrical characteristics and must adhere to the design rule set. CMOS has the added complications of complementary nMOS/pMOS logic blocks and physical separation of nMOS and pMOS transistors, which affect the layout.
Figure 5.23: Design-mask transformation
Complementary structuring is illustrated in Fig. 5.25. Each input is connected to both nMOS and pMOS transistors which are physically separated from one another due to the opposite background polarity requirements.
5.4.3 Equivalent Load Concept
High-speed switching requires large currents and a small $C_{out}$ to ensure small charging and discharging time constants. This leads to a design problem: to increase current flow, we must use large (W/L) values for the MOSFETs, which in turn increases the transistor capacitances. Increasing the aspect ratios in a CMOS circuit gives larger values for both $C_{in}$ and $C_{out}$, affecting the performance of the entire logic chain. In bottom-up design, we attempt to optimize each gate, both intrinsically and with respect to its nearest neighbors.
Figure 5.24: General Layout Grid
Figure 5.25: Complementary Transistor/Logic Blocks
The concept of the equivalent load helps with the initial layout problem by defining “standard” transistor or logic gate capacitances which are used as a reference. All loads are then specified by the number of equivalent loads. A common choice is a minimum-area transistor as shown in Fig. 5.26. Assuming drawn gate dimensions of (W × L), the gate input capacitance is approximated by
$$C_G \approx C_{ox} W L.$$
An inverter made using minimum-area nMOS and pMOS transistors has an input capacitance of approximately
$$C_{in} = 2 C_G$$
which becomes our reference value.
To use the equivalent load concept, we assume that the circuit we are designing must drive a load of value
$$C_L = n C_{in},$$
where n is a scaling factor indicating the size of the transistors used in the next gate. For example, n = 2 may imply a single gate with MOSFETs which are twice as large as the reference, or a fan-out FO = 2 into two minimum size gates. The circuit is designed according to the assumed load value. After the design of the logic chain is completed, we recheck the circuit to ensure that the actual switching performance is acceptable.
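As a numerical illustration of the equivalent load, the sketch below evaluates the three relations for the gate-oxide capacitance of Table 5.3 (135 nF/cm², which equals 1.35 fF/µm²). The reference gate dimensions W = 3 µm and L = 1 µm are hypothetical values picked for the example, as are the function names.

```python
C_OX = 1.35  # fF/um^2, the 135 nF/cm^2 gate-oxide value of Table 5.3


def gate_cap(w_um, l_um):
    """C_G ~ C_ox * W * L, in fF."""
    return C_OX * w_um * l_um


def inverter_input_cap(w_um, l_um):
    """C_in = 2 * C_G for an inverter built from two equal reference transistors."""
    return 2 * gate_cap(w_um, l_um)


def load_cap(n, w_um, l_um):
    """C_L = n * C_in: load expressed as n equivalent (reference) loads."""
    return n * inverter_input_cap(w_um, l_um)


W, L = 3.0, 1.0  # hypothetical reference gate dimensions in um
print(f"C_G  = {gate_cap(W, L):.2f} fF")
print(f"C_in = {inverter_input_cap(W, L):.2f} fF")
for n in (1, 2, 4):
    print(f"n = {n}: C_L = {load_cap(n, W, L):.2f} fF")
```

Expressing loads as multiples of one reference value keeps the early design phase independent of the final transistor sizes. The actual capacitances still have to be rechecked after layout, as the text notes.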
Figure 5.26: Equivalent Load
Optimization of the circuit performance can also be specified at the system level and then applied to each gate. This type of top-down approach has been used to estimate gate sizing rules to speed up the response of a static logic chain. In general, combining the two views offered by bottom-up (circuit level) and top-down (system level) design provides the most powerful approach to high-performance design. Large digital networks contain both critical and non-critical logic paths, so that intermixing design philosophies is often required.
5.4.4 Latch-Up Prevention
Circuits which are fabricated in bulk CMOS require additional safeguards to avoid latch-up. A common approach is to use guard rings, which are heavily doped n+ or p+ regions around MOSFETs as shown in Fig. 5.27. Guard rings reduce the transistor current gain and offset the potential and are effective in preventing latch-up. Another common preventative measure is providing substrate bias contacts next to every MOSFET which is connected to the power supply or ground.
Figure 5.27: Guard Ring Example
5.4.5 Static Gate Layout
Static CMOS gates are based on complementary nMOS/pMOS logic blocks. Cell design can be split into two tasks: transistor placement and interconnect routing. Real estate budgets often have priority status, so that some thought may be required to fit the subsystem into the allocated area. The main limitations are usually due to design rule spacings and the complexity of the interconnect topology. Other considerations which may come into play include the shape of the allocated area, location of input and output lines relative to neighboring logic units, and clock distribution.
Some of the more interesting designs are based on the complementary placement of opposite polarity MOSFETs. Consider a NOR2 gate. This circuit uses 2 nMOS transistors in parallel and 2 pMOS transistors in series. Fig. 5.28 shows how the complementary arrangement can be implemented by using similar transistor arrays with different interconnect patterning. Reversing the transistors in the NOR2 gate in Fig. 5.28(a) directly yields the NAND2 gate shown in Fig. 5.28(b).
Although some layouts are based on the schematic patterning, these do not generally yield minimum-area circuits. Thoughtful use of transistor arrays and interconnect routing is usually required; intelligent CAD/CAE tools may also prove helpful.
Figure 5.28: Complement Static Gates
5.4.6 Transmission-Gate-Based Logic
The layout of transmission-gate logic circuits is complicated by the transmission gate itself. The switch uses parallel-connected nMOS and pMOS transistors which reside in opposite-polarity backgrounds. Consider, for example, a p-well process. The p-channel transistor is located on the n-substrate, while the nMOS is in a p-well region. Two extreme layout philosophies are (a) to use a p-well for every transmission gate, or (b) to use a single p-well for all transmission gates in the circuit. These are illustrated in Fig. 5.29. Approach (a) reduces integration density due to the p-well spacing requirement, but is easy to replicate on a CAD system; approach (b), on the other hand, may provide higher logic density, but has a larger capacitance from the extra interconnect. Although both are used in practice, minimizing the number of wells is usually the preferred strategy. Since each well requires a connection to either $V_{DD}$ or $V_{SS}$, this also aids in power distribution.
A critical aspect of high-speed CMOS layout is control of the parasitic capacitance values.
Figure 5.29: Transmission Gate Layout
5.5 Layout Examples
Figure 5.30: Layout of an inverter
Figure 5.31: Layout of a 2-input nand gate
Figure 5.32: Layout of a 2-input nor gate
Figure 5.33: Layout of an exor gate
Figure 5.34: Layout of a ram cell
Figure 5.35: Layout of a pad
Figure 5.36: Layout of a RS-latch
Figure 5.37: Layout of a D-latch
Figure 5.38: Layout of a comparator
Figure 5.39: Layout of a 1-bit fulladder
Chapter 6
VLSI Device Packaging
6.1 Introduction
Packaging significantly affects, and in some cases dominates, the overall chip cost ([22]). The increase in packaging cost with an increasing number of gates on the chip differs between memory and logic/microprocessor devices:
Memory devices:
Due to multiplexing techniques on the chip, the I/O requirements remain essentially constant.
Logic and microprocessor devices:
The number of required I/O terminals increases in proportion to the number of gates on the chip. An empirical estimation for the number of I/O terminals needed for logic devices is known as Rent's Rule:
$$\#\mathrm{I/O} = \alpha \, (\#\mathrm{Gates})^{\beta} \qquad (6.1)$$
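A small numerical sketch shows how Eq. (6.1) behaves. The text gives no values for the empirical constants, so α = 2.0 and β = 0.5 below are purely hypothetical placeholders, as is the function name `rent_io`.

```python
def rent_io(gates, alpha, beta):
    """Rent's Rule, Eq. (6.1): estimated number of I/O terminals."""
    return alpha * gates ** beta


ALPHA, BETA = 2.0, 0.5  # hypothetical constants, for illustration only
for gates in (1_000, 10_000, 100_000):
    print(f"{gates:7d} gates -> about {rent_io(gates, ALPHA, BETA):.0f} I/O terminals")
```

With any exponent below one, the pin count grows more slowly than the gate count but still keeps rising, in contrast to the roughly constant I/O requirement of memory devices.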
Package design has to provide:
• good heat dissipation
• good electrical performance
• high reliability
• package must be easy to inspect after assembly
• package must be compatible with a variety of assembly, test and handling systems
Figure 6.1: Continuous growth in DRAM complexity and size places little demand on package size and number of I/Os
Figure 6.2: Comparison of I/O requirements for DRAM, logic and microprocessor devices
6.2 Package Types
In principle, there are two ways of mounting devices on printed wiring boards (PWBs):
1. through-hole (TH) mounting:
• Dual-in-line packages (DIP)
• Pin-grid-array (PGA)
(available in hermetic plastic and ceramic types)
(pitches: 2.54, 1.78 and 1.27 mm)
2. surface mounting (SM)
• up to 48 terminals:
– small outline (SO) (available in plastic only):
SOP: small outline package
SSOP: shrink small outline package
– quad types: chip carriers (CC) and flatpacks (available in ceramic and plastic)
• above 48 terminals: quad types only
– leaded plastic (PLCC)
– leaded ceramic (LDCC)
– leadless ceramic (LLCC)
(pitches: 1.37 or 0.635 mm)
Figure 6.3: Examples for packages and PWB mounting techniques: (a) TH: Dual-in-line (DIL) package. (b) TH: Pin-grid-array (PGA) package. (c) SM: "J"-leaded packages, leaded chip carrier or small-outline. (d) SM: Gull-wing-leaded packages, chip-carrier or small-outline. (e) SM: Butt-leaded package, small-outline dual-in-line type. (f) Leadless type, ceramic chip carrier mounted to a matching ceramic substrate
Figure 6.4: IC package types as a function of I/Os and attachment type
6.2.1 24-pin Packaging Evolution
Figure 6.5: Package history
Figure 6.6: Comparison: 24-pin SO package and 48-pin SSO package
6.3 Design Considerations
6.3.1 VLSI Design Rules
Figure 6.7: Bonding-pad pitch versus chip lead count for several chip sizes
Figure 6.8: Arrangement of staggered bonding pads: → lower pitch than with a single line of bonding pads. (a) Bonding pad size and spacing. (b) Maximum wire angle with respect to die edge
Figure 6.9: CAD template for positioning bonding pads (assures that wire span length meets the design rules)
Figure 6.10: CAD template for checking adherence to wire-span guidelines. The template also provides an extended zone (beyond the optimum shown in Fig. 6.9) for cases where location in the optimum zone is not compatible with the device layout.
Figure 6.11: CAD template for checking the maximum distance that a wire spans over silicon. Here: violation of the guidelines. The circle must be at minimum tangent to the step-and-repeat centerline (case of maximum distance) or cross it
6.3.2 Thermal Considerations
• Objective: keep the temperature of the silicon die low enough to prevent an excessive failure rate
• Conductive thermal resistance: function of package materials, geometry and orientation.
6.3.3 Electrical Considerations
Increased operation speed and reduced noise margins demand a more careful consideration of package design. Performance criteria:
• low ground resistance (minimum power-supply voltage drop)
• short signal leads (minimum self-inductance)
• minimum power supply spiking due to signal lines simultaneously switching
• short parallel signal runs (cross talk)
• short signal runs near a ground plane (minimum capacitive loading)
Figure 6.12: Lead inductances for various package sizes
The inductances of SM packages are significantly lower than the inductances of TH packages due to their shorter lead traces.
Most important problem: noise reduction. The noise induced in the ground line when one line is switching is given by
$$V_i = L_g \frac{di}{dt} \qquad (6.2)$$
where $L_g$ is the inductance of the ground lead. If j lines are switching:
$$V_i = L_g \sum_j \frac{di_j}{dt} \qquad (6.3)$$
If m ground leads are used, the total inductance is approximately $L_g/m$. In practical designs, up to 25% of the leads often have to be grounded in order to keep noise within the desired limits (large-area power and ground planes within the package are also used).
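The effect of simultaneous switching and of additional ground leads can be estimated directly from Eq. (6.3) and the $L_g/m$ approximation. In the sketch below all numbers (5 nH lead inductance, eight outputs each switching 10 mA in 1 ns) are hypothetical, and the function name `ground_noise` is an assumption for this illustration.

```python
def ground_noise(l_ground, di_dt_per_line, ground_leads=1):
    """Induced ground noise V_i = (L_g / m) * sum_j(di_j/dt), in volts."""
    effective_inductance = l_ground / ground_leads
    return effective_inductance * sum(di_dt_per_line)


L_G = 5e-9                       # hypothetical ground-lead inductance: 5 nH
DI_DT = 10e-3 / 1e-9             # hypothetical edge: 10 mA in 1 ns = 1e7 A/s
switching_lines = [DI_DT] * 8    # eight outputs switching simultaneously

for m in (1, 2, 4):
    v = ground_noise(L_G, switching_lines, ground_leads=m)
    print(f"{m} ground lead(s): V_i = {v * 1e3:.0f} mV")
```

Under these assumed numbers a single ground lead would see about 0.4 V of noise, which illustrates why a substantial fraction of the package leads may have to be dedicated to ground.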
6.3.4 Mechanical Design Considerations
• Ideally: prefer to use materials that are matched in physical properties, especially those which have the same TCE (Thermal Coefficient of Expansion)
Figure 6.13: TCE of materials for semiconductor devices, (C)
• Tradeoff between TCE, thermal conductivity and elastic modulus
Figure 6.14: Plastic package: composite structure consisting of silicon chip, metal leadframe and plastic moulding compound
6.4 Assembly Technologies
Figure 6.15: Generic assembly sequence for plastic and ceramic packages
6.4.1 Wafer Preparation
• Wafer sawing with diamond blade technology
• In some cases: wafer thinning using highly automated backgrinding processes
• The sawed wafer is still mounted on a tape frame fixture (to which it has been attached before sawing and which is not destroyed by the sawing step) and is loaded into an automatic die bonder that picks only the good chips from the tape
6.4.2 Die Bonding
The back of the die is mechanically attached to a mount medium, such as a ceramic substrate, a multilayer-ceramic-package piece part or a metal leadframe. This attachment sometimes enables an electrical connection to the back of the die to be made.
Two common methods of die bonding:
1. Eutectic die bonding
2. Epoxy die bonding
Eutectic Die Bonding (Hard solders)
Figure 6.16: Eutectic die bonding
• The die is metallurgically attached to a substrate material
• Substrate material: metal leadframe made of Alloy 42 or ceramic material (usually 90…95% Al₂O₃)
• Melting preform: thin sheet of the appropriate solder-bonding alloy
• Substrate: metallization with Ag (leadframes) or Au (leadframes or ceramic)
• Bonding temperature: about 400 °C
Epoxy Die Bonding
Figure 6.17: Epoxy die bonding
• Bond material: silver-filled adhesives
• Advantage: less expensive than the high-gold-content hard solders and easy to process
6.4.3 Wire Bonding
Typically gold-wire is ball-wedge bonded (thermosonic or thermocompression).
• ball-bonding to the chip bond pad (typically Al)
• wedge-bonding to the package substrate (typically Ag or Au)
Description of the bonding cycle steps as seen in Fig. 6.18:
(a) targeting the capillary on the die’s bond pad
(b) the capillary presses the ball on the pad. In a thermosonic system ultrasonic vibration is then applied
(c) the clamp opens and the capillary rises
(d) the lead of the device is positioned under the capillary, which is then lowered on the lead
(e) the capillary deforms the wire against the lead. In a thermosonic system ultrasonic vibration is applied
(f) the capillary rises and the wire clamp closes at a predefined height
(g) a new ball is formed using a hydrogen flame or an electronic spark
Figure 6.18: Tailless ball-and-wedge bonding cycle
Figure 6.19: Thermosonic ball wire bonds on a gate array VLSI chip
6.5 Package Technologies
6.5.1 Ceramic Package Technology
• very effective for constructing complex packages with many signal, power, ground, bonding and sealing layers
Figure 6.20: Process sequence to create a laminated refractory-ceramic product from a ceramic slurry
Figure 6.21: Cross-sectional sketches of several package types
6.5.2 Glass-Sealed Refractory Technology
Figure 6.22: Structures of CERDIP and quad CERPAC
Lower cost ceramic technology applicable to single-chip DIPs and quad CERPACs. This technology relies on glass-sealing a leadframe between two pressed ceramic units.
6.5.3 Plastic Molding Technology
Figure 6.23: Ball-and-wedge-bonded silicon die in a plastic DIP
Postmolding
• low cost
• state-of-the-art plastic package technology
• thermosetting epoxy resins are molded around the leadframe-chip subassembly after the chip has been wire-bonded to the leadframe
Premolding
• avoids exposure of die and wire bond to viscous molding material
• package is molded first and then chip-leadframe compound is added
6.5.4 Molding Process
Figure 6.24: Molding processing system
→ the preheated molding compound flows under pressure to fill the cavities containing leadframe strips with their attached ICs.
6.6 IC Package Market Share
Figure 6.25: IC package market share
Figure 6.26: Worldwide IC package market share by material
6.7 Packaging Trends
Figure 6.27: Pin count versus usable gates
6.7.1 MultiChip Modules
• multiple dies are mounted on multilayer ceramic packages
• increasing performance by reducing the inter-die line length
Figure 6.28: Plastic IC package material costs
Figure 6.29: Ceramic IC package material costs
Figure 6.30: MCM: microprocessor performance
6.7.2 Comparison of Packaging Alternatives
Packaging approaches, with their features and limitations:

Single Chip Package (SCP)
  Features:
    • Mature
    • Reliable
    • Low risk
  Limitations:
    • Low density
    • Speed: <30 MHz
    • Increased PWB complexity
    • Low PWA producibility
    • Requires automated assembly equipment

MultiChip Modules (MCM)
  Features:
    • Increased density
    • Speed: 30…100 MHz
    • Average PWB complexity
    • Good PWA producibility
  Limitations:
    • Cost
    • Test and burn-in of bare chips required

Chip-on-Board (COB)
  Features:
    • Good 'middle ground' between MCMs and MWSI
    • High density
    • Speed: GHz range
  Limitations:
    • Environmental protection of bare die
    • TCE effects of coatings and/or PWBs
    • Difficult repairability

Monolithic Wafer Scale Integration (MWSI)
  Features:
    • Extreme density
    • Speed: High GHz range
    • Potential for low cost
    • Simplicity (once fabrication processes are fully developed)
  Limitations:
    • Available 1995 - 1999
    • Defect density of wafers requires redundancy
    • Thermal management
    • TCE effects
    • Vibration/shock environments
    • No repairability
Chapter 7
Computer Aided Design of Integrated Circuits
7.1 CAD Tools
The following list shows some important CAD tools used for the design of integrated circuits:
• graphics editor (drawing schematic diagrams, physical layout, stick layout diagrams, ...; used for displaying results from simulations, layout verifications (like design rule checks), placement and routing, ...)
• language based circuit capture tools (for hardware description languages like VHDL, Verilog, EDIF, ...)
• physical design verification tools (design rule checker, extractor, LVS, schematic and electrical rule checker, ...)
• simulation tools (analog simulation: circuit level; digital simulations: circuit level, switch level, logic level, register transfer level, architectural level, behavioural level; thermal simulation: displaying heat dissipation on chip)
• layout compilers (stick2layout, macrocell generators, datapath compilers)
• layout synthesizer, layout compactor
• logic optimizer
• database interfaces (file input / output from / to standardized interchange formats)
• database management (to keep different versions (current, backup1, backupn) and views of a design object [schematic, simulation netlist, stick diagram, physical layout, ...] in the design database)
7.2 Full Custom Design
With Full Custom Design techniques, the designer is able to individually specify the geometrical layout of the integrated circuit (transistor size [channel length, channel width, shape, ...], transistor placement, wire width, ...). The designer has the option to manually optimize the layout → the most dense layouts can be generated using the full custom design styles.
Hand Crafted Layout
• The layout is drawn in form of rectangles and polygons on different layers using agraphics editor.
• The designer has to know a large set of process dependent design rules.
• The mask layout is generated as drawn on the screen → direct influence on component placement and on important parameters such as W and L of transistors, wire widths, ...
Stick Diagram
• The layout is drawn in the form of lines and polygons on different layers using a graphics editor. A stick-to-layout converter together with a compactor and a description of the process design rules is then used to generate the rectangle based layout.
• The designer can draw symbolic layouts that are almost independent of process and design rules. Process adaptation is done by the converter/compactor.
• Converter constraints (cell dimensions, channel widths / lengths of transistors, ...) can be specified.
Geometrical Specification Language
• The layout is specified in textual form giving either the position and layer of rectangles (similar to hand crafted layout) or lines (as in stick diagrams).
• Since programming language constructs like parameterized macros (to be used for layout segments such as cells, ...), loops (while, repeat, for, ...), and conditional statements (if, case, ...) may be available, parameterized layouts (e.g. a generic transistor with W and L as parameters, cells for different bit-widths, ...) can be described using geometrical specification languages.
• Used in a large number of macrocell compilers.
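The idea of a parameterized layout can be illustrated with a tiny generator that emits a transistor macro in the simplified language of Table 7.1, using the CMOS layer numbers of Table 7.2 (1 = n-diffusion, 3 = polysilicon). This is only a sketch: the function name and the extension values `poly_ext` and `diff_ext` are hypothetical and do not represent the design rules of any real process.

```python
def transistor_macro(n, w, l, poly_ext=2, diff_ext=3):
    """Return macro number n describing an n-channel transistor.

    The channel occupies the box (0, 0) .. (l, w); l is measured along x.
    poly_ext / diff_ext are hypothetical overlaps beyond the channel.
    """
    return "\n".join([
        f"M {n}",
        "L 1",                                       # n-diffusion
        f"B {-diff_ext} 0 {l + 2 * diff_ext} {w}",   # source, channel, drain
        "L 3",                                       # polysilicon gate
        f"B 0 {-poly_ext} {l} {w + 2 * poly_ext}",
        "E",
    ])


# Generate a family of transistors with different widths in a loop.
for number, width in enumerate((4, 6, 12), start=1):
    print(transistor_macro(number, w=width, l=2))
```

A macrocell compiler applies the same principle on a larger scale, deriving whole cells or arrays from a few parameters such as bit-width.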
Command        Meaning
B x y dx dy    Box with length dx, width dy, and lower left-hand corner placed at (x, y).
L n            Layout level (layer) for the box definitions that follow.
M n            Start of macro definition n.
E              End of macro definition.
C n x y m      Call for macro number n with translation x, y and orientation m.
Q              End layout file.
Table 7.1: Simplified geometrical specification language
Layer   CMOS          NMOS
1       n-diffusion   n-diffusion
2       p-diffusion   ion implant
3       polysilicon   polysilicon
4       metal         metal
5       contact       contact
8       n-well        —
9       overglass     overglass
Table 7.2: MOS layer definitions
Figure 7.1: Cell orientations
Orientation   Description
1             no rotation
2             rotate 90° counterclockwise
3             rotate 180° counterclockwise
4             rotate 270° counterclockwise
5             mirror about y-axis
6             rotate 90° counterclockwise and mirror about y-axis
7             rotate 180° counterclockwise and mirror about y-axis
8             rotate 270° counterclockwise and mirror about y-axis
Table 7.3: Rotations of geometry
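How a tool might read such a file can be sketched as follows. This is a minimal illustration under stated assumptions, not the format's reference implementation: it assumes one command per line with integer arguments, treats the current layer as global state, and assumes that for orientations 6 to 8 the rotation is applied before the mirroring (Table 7.3 does not fix the order). The names `Box`, `orient`, `parse` and `flatten` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    layer: int
    x: int
    y: int
    dx: int
    dy: int


def orient(x, y, m):
    """Apply orientation code m (1..8, Table 7.3) to the point (x, y)."""
    for _ in range((m - 1) % 4):
        x, y = -y, x              # rotate 90 degrees counterclockwise
    if m > 4:
        x = -x                    # mirror about the y-axis (assumed: after rotation)
    return x, y


def parse(text):
    """Return (top_level_items, macros) for a file in the Table 7.1 language."""
    macros, top = {}, []
    target, layer = top, None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        cmd, args = tokens[0], [int(t) for t in tokens[1:]]
        if cmd == "L":
            layer = args[0]
        elif cmd == "B":
            target.append(("box", Box(layer, *args)))
        elif cmd == "M":
            target = macros.setdefault(args[0], [])
        elif cmd == "E":
            target = top
        elif cmd == "C":
            target.append(("call", tuple(args)))
        elif cmd == "Q":
            break
        else:
            raise ValueError(f"unknown command: {line!r}")
    return top, macros


def flatten(items, macros):
    """Expand macro calls recursively into a flat list of boxes."""
    boxes = []
    for kind, value in items:
        if kind == "box":
            boxes.append(value)
            continue
        n, tx, ty, m = value
        for b in flatten(macros[n], macros):
            x0, y0 = orient(b.x, b.y, m)
            x1, y1 = orient(b.x + b.dx, b.y + b.dy, m)
            boxes.append(Box(b.layer, min(x0, x1) + tx, min(y0, y1) + ty,
                             abs(x1 - x0), abs(y1 - y0)))
    return boxes


SOURCE = """M 1
L 3
B 0 0 2 10
E
C 1 5 5 1
C 1 0 0 2
Q"""

top, macros = parse(SOURCE)
for box in flatten(top, macros):
    print(box)
```

The example defines one polysilicon box as macro 1 and places it twice, once translated and once rotated by 90°. Flattening of this kind is what a design rule checker or extractor needs before it can work on plain geometry.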
Figure 7.2: Full custom layout (hand crafted or generated out of a stick diagram or a layout description)
Figure 7.3: Corresponding geometrical specification file and schematic diagram
Figure 7.4: Memory cell schematic and corresponding stick diagram
[Flowchart with the blocks: Schematic Entry; Symbol Generation; Floorplanning; Placement; Routing; Layout Editor; Stick Diagram Editor; stick2layout Converter / Compactor; Block Layout; Design Analysis (DRC, ERC, Circuit Extraction, LVS); Extraction; Simulation Netlist; Circuit Simulation; Timing Analysis; Mask Layout Data; Fabrication Test Pattern; Fabrication]
Figure 7.5: Full Custom Design Flow
7.3 Cell Based Design
The Cell Based Design approaches rely on layout components predefined and provided by the silicon foundry. Several implementation styles can be distinguished:
Gate Array
• pre-fabricated diffusion and poly layers (regular structures e. g. transistors)
• customized interconnect structures (wires in metal 1 and metal 2)
• fixed size interconnect areas (channels)
Sea of Gate Array
• pre-fabricated diffusion and poly layers (regular structures e. g. transistors)
• customized interconnect structures (wires in metal 1 and metal 2)
• variable size interconnect areas (channels) over unused transistors
Standard Cell
• layout blocks predefined by silicon foundry
• full process sequence for chip fabrication required
[Flow diagram. Blocks shown: Specification / Schematic Entry, Symbol Generation, Cell Library, Graphical Data, Simulation Models, Layout Data, Macrocell Compilation, Compiled Macrocell, Extraction, Simulation Netlist, Logic Simulation, Fault Simulation, Timing Analysis, Placement (Macrocells, IO-Cells, Standard-Cells), P & R Optimization, Routing (Channel Generation, Global Routing, Detailed Routing), Design Analysis (DRC, Parasitics Extraction), Parasitic Wire Capacitances / Delay Backannotation, Mask Layout Data, Fabrication Test Pattern, Fabrication]
Figure 7.6: Standard Cell Design Flow
7.4 Design Verification
7.4.1 Physical Design Rule Check
Physical design rule checks (DRCs) are performed to guarantee the conformity of a layout design to the silicon vendor’s set of design rules. Design rules are defined between objects on the same layer (minimum width, minimum spacing) as well as for objects on different layers (minimum spacing, overlapping, extension).
Minimum width
Minimum spacing
Overlapping
Extension
Design rule violations are usually reported in the physical layout using a graphics editor. Sometimes a tabular report indicating the location and type of each design rule violation can also be generated.
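The same-layer rules (minimum width, minimum spacing) can be sketched for axis-aligned boxes as follows. This is only a minimal illustration: the rule values, the layer names, and the tuple layout are hypothetical, and a production DRC also handles rules between different layers (overlapping, extension) and non-rectangular shapes.

```python
from itertools import combinations

# Hypothetical rule values in layout grid units; a real rule set comes from the vendor.
MIN_WIDTH = {"metal": 3, "poly": 2}
MIN_SPACING = {"metal": 3, "poly": 2}


def gap(a, b):
    """Distance between two axis-aligned boxes (x, y, dx, dy); 0 if they touch or overlap."""
    ax, ay, adx, ady = a
    bx, by, bdx, bdy = b
    gx = max(bx - (ax + adx), ax - (bx + bdx), 0)
    gy = max(by - (ay + ady), ay - (by + bdy), 0)
    return (gx ** 2 + gy ** 2) ** 0.5


def check(boxes):
    """boxes: list of (layer, (x, y, dx, dy)). Returns a list of violation tuples."""
    violations = []
    for i, (layer, (x, y, dx, dy)) in enumerate(boxes):
        if min(dx, dy) < MIN_WIDTH[layer]:
            violations.append(("width", layer, i))
    for (i, (la, a)), (j, (lb, b)) in combinations(enumerate(boxes), 2):
        # Boxes that touch or overlap (gap 0) are treated as one connected shape.
        if la == lb and 0 < gap(a, b) < MIN_SPACING[la]:
            violations.append(("spacing", la, i, j))
    return violations


layout = [
    ("metal", (0, 0, 10, 3)),
    ("metal", (0, 5, 10, 3)),   # only 2 units above the first metal box
    ("poly", (20, 0, 1, 8)),    # narrower than the minimum poly width
]
report = check(layout)
```

The returned list corresponds to the tabular form of violation reporting mentioned above: each entry names the rule type, the layer, and the indices of the offending boxes.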
7.4.2 Extraction
Circuit Level Extraction: can be used to create a netlist for circuit level simulations (e.g. SPICE). The netlist consists of MOS transistors (including geometrical parameters such as W/L, and parasitic capacitances), resistors, capacitances, diodes, etc.
Switch Level Extraction: can be used to create a netlist which can be processed by a switch level simulator. The resulting netlist consists of MOS transistors and parasitic capacitances (to model storage effects in MOS circuits).
Parasitics Extraction: is used in conjunction with cell based design techniques. Since wire delay depends on the parasitic capacitance of a wire, the parasitic capacitances of nets and the input capacitances of the gates connected to an output can be used to estimate the extrinsic delays (Note: intrinsic delays, i.e. the delays of unloaded gates, are fetched from the cell library’s simulation model data).
Schematic Extraction: is executed to generate the connectivity data out of a graphical representation (schematic diagram) of a circuit module. The connectivity data is forwarded to a netlister which provides the information required e.g. by simulation tools (the simulators cannot operate on graphical data, they require netlists in a textual format). This kind of extraction is usually required in pre-layout design specification phases.
Figure 7.7: Example of a design rules set checked during design verification
7.4.3 LVS
The layout-versus-schematic (LVS) comparison tool checks the equivalence of the layout and its schematic. The tool can be used to find wrong connections or parameter mismatches (such as W/L of transistors) between a schematic and its physical layout representation.
7.4.4 Schematic / Electrical Rule Check (SRC / ERC)
To verify schematics used e.g. in cell based designs, a schematic rule checker can find schematic rule violations such as the following:
Warnings:
• unconnected (floating) wire segments
• open outputs
• exceeded fanout
Errors:
• open inputs (undefined input value!)
• number of bits differ for 2 buses connected together
• number of input/output pins in a schematic differs from its symbol representation (→ pins are not accessible / not present at higher levels of schematic hierarchy)
• more than one active driver connected to a net at the same time
7.5 Simulation
7.5.1 Goal of Simulation
• Validation of the system, logic timing, and electrical behaviour
• Verify testability aspects
• Software development
7.5.2 Simulator Classification
Level        Primitives                                       Observable values      Timing model
RT           registers, user coded primitives, busses, etc.   bit strings, vectors   discrete time set
Gate         gates                                            bits                   continuous or discrete
Switch       transistors, capacitors                          bits                   continuous or discrete
Electrical   capacitors, resistors, inductors, diodes etc.    real values            continuous time set
7.5.3 Signal Modelling
• values which exist in real circuits (0, 1, high impedance, oscillation, . . .)
• values which exist only in the simulator (unknown, transition, . . .)
• boolean logic set not sufficient
7.5.4 Signal States
3-valued logic:
log. zero = 0
log. one = 1
unknown = U
Example:
AND   0   1   U
0     0   0   0
1     0   1   U
U     0   U   U
4-valued logic: additional state Z (= high impedance) is introduced
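The 3-valued AND table can be written as a small function. The sketch below is only illustrative; the inverter `not3` is not given in the notes and is an assumption (the usual choice that an unknown input yields an unknown output). The last line shows why the U value is pessimistic: x AND (NOT x) is 0 in a real circuit, but evaluates to U when x is unknown.

```python
def and3(a, b):
    """AND over {'0', '1', 'U'}: a 0 on either input forces the output to 0."""
    if a == "0" or b == "0":
        return "0"
    if a == "1" and b == "1":
        return "1"
    return "U"


def not3(a):
    """Assumed 3-valued inverter: unknown in, unknown out."""
    return {"0": "1", "1": "0", "U": "U"}[a]


VALUES = "01U"
and_table = {(a, b): and3(a, b) for a in VALUES for b in VALUES}

# x AND (NOT x) with x unknown: the simulator cannot see that both inputs are correlated.
pessimistic = and3("U", not3("U"))
```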
Problems:
• Pessimism of U-value (for example: circuit initialisation, spikes)
• logic values are often not sufficient (value strength needed)
7.5.5 Circuit and Delay Modelling
• Circuit is built up by simulator primitives
• Modelling of the timing/delay behaviour:
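[Diagram: gate with input x, output y and delay $\tau(n)$]
$y_t = x_{t - \tau(n)}$
$\Delta$ : basic time unit
$\tau(n) = n \cdot \Delta$ : delay of the gate
$t_1, t_2, t_3, \ldots$ : clock times of the synchronous circuit ($t_{\nu+1} - t_\nu = \Delta t = m \cdot \Delta$)
Timing models: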
• Zero-Delay: ∆ = 0
• Unit-Delay: τ(n) = constant
• Nominal-Delay: τ(n) = user specified
7.5.6 Advanced Logic Simulators
• Introduction of signal strength in addition to logic values for driver and bus modelling
A : active, e.g. low impedance driver
P : passive, e.g. high impedance driver (depletion load)
S : storing, e.g. capacitive stored state
X : active indeterminate (e.g. active or storing)
Y : passive indeterminate (e.g. passive or storing)
Z : high impedance
• Instead of simple logical values, signals are used for simulation. A signal consists of a logical value and a strength.
• Logical Values = 0,1,X
• 16 states
Overview on Signal Combinations
Only the upper triangle of the table is listed.
     A0 A1 AX P0 P1 PX S0 S1 SX X0 X1 XX Y0 Y1 YX ZZ
A0   A0 AX AX A0 A0 A0 A0 A0 A0 A0 AX AX A0 A0 A0 A0
A1      A1 A1 A1 A1 A1 A1 A1 A1 AX A1 AX A1 A1 A1 A1
AX         AX AX AX AX AX AX AX AX AX AX AX AX AX AX
P0            P0 PX PX P0 P0 P0 X0 XX XX P0 PX PX P0
P1               P1 PX P1 P1 P1 XX X1 XX PX P1 PX P1
PX                  PX PX PX PX XX XX XX PX PX PX PX
S0                     S0 SX SX X0 XX XX Y0 YX YX S0
S1                        S1 SX XX X1 XX YX Y1 YX S1
SX                           SX XX XX XX YX YX YX SX
X0                              X0 XX XX X0 X0 XX X0
X1                                 X1 XX X1 XX XX X1
XX                                    XX XX XX XX XX
Y0                                       Y0 YX YX Y0
Y1                                          Y1 YX Y1
YX                                             YX YX
ZZ                                                ZZ
Example: Driver Modelling
7.5.7 Simulation Techniques
• Compiler driven technique
Problems:
– Feedbacks
– Sorting of gate netlist
– Zero delay model
– Entire circuit is simulated
• Event driven simulation . . .
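The compiler driven technique and its problems can be illustrated with a short sketch: the gate netlist is sorted once (levelized), and then the entire circuit is evaluated in that fixed order with a zero delay model. The netlist format and the helper names are hypothetical; the sorting uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later), which raises `CycleError` when the netlist contains a feedback loop.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical netlist: output net -> (gate type, input nets).
netlist = {
    "n1": ("NAND", ["a", "b"]),
    "n2": ("NOR", ["n1", "c"]),
    "y": ("NOT", ["n2"]),
}

GATES = {
    "NAND": lambda ins: int(not all(ins)),
    "NOR": lambda ins: int(not any(ins)),
    "NOT": lambda ins: int(not ins[0]),
}


def levelize(netlist):
    """Sort gate outputs so that every gate comes after the gates driving it."""
    deps = {out: [i for i in ins if i in netlist] for out, (_, ins) in netlist.items()}
    return list(TopologicalSorter(deps).static_order())


def simulate(netlist, order, inputs):
    """Zero-delay evaluation of the entire circuit in the precomputed order."""
    values = dict(inputs)
    for out in order:
        kind, ins = netlist[out]
        values[out] = GATES[kind]([values[i] for i in ins])
    return values


order = levelize(netlist)
result = simulate(netlist, order, {"a": 1, "b": 1, "c": 0})

# A cross-coupled NOR latch has a feedback loop and cannot be levelized.
loop = {"q": ("NOR", ["r", "qn"]), "qn": ("NOR", ["s", "q"])}
try:
    levelize(loop)
    has_feedback = False
except CycleError:
    has_feedback = True
```

An event driven simulator avoids the last two problems by re-evaluating only those gates whose inputs have changed, at the time given by the delay model.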
[Four bus configurations, each with drivers A and B connected to bus C:]
Driver A   Driver B   Bus C   Remark
A1         P0         A1      A stronger than B
P0         S1         P0      P stronger than S
A1         A0         AX      short circuit
X1         P0         XX      short circuit possible
Figure 7.8: Competing drivers at a bus
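For the definite strengths A, P, S and the high impedance state ZZ, the combination rule can be summarized as "the stronger driver wins; equal strengths with different values give an indeterminate value". The sketch below implements only this subset, with inputs restricted to signals such as A0, P1, S0 or ZZ; the indeterminate strengths X and Y are deliberately left out, and the function name and strength ranking are assumptions made for illustration.

```python
# Strength ranking for the definite strengths only (A > P > S > Z).
RANK = {"A": 3, "P": 2, "S": 1, "Z": 0}


def resolve(s1, s2):
    """Combine two signals such as 'A1', 'P0' or 'ZZ' that drive the same net."""
    (k1, v1), (k2, v2) = s1, s2        # split each signal into strength and value
    if RANK[k1] != RANK[k2]:
        return s1 if RANK[k1] > RANK[k2] else s2   # the stronger driver wins
    if v1 == v2:
        return s1
    return k1 + "X"                                # equal strength, conflicting values


bus_cases = {
    ("A1", "P0"): resolve("A1", "P0"),
    ("P0", "S1"): resolve("P0", "S1"),
    ("A1", "A0"): resolve("A1", "A0"),
}
```

The three cases reproduce the first three configurations of Figure 7.8; the fourth one (X1 against P0) needs the indeterminate strengths and is not covered by this sketch.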
7.5.8 Switch Level Simulation
• well suited to simulate digital MOS circuits
Figure 7.9: Example: compiler driven simulation
• no fixed direction of signal flow
• transistor modeled as a switch with three states: open, closed, unknown
• algebraic or RC models
MOS Transistor Model
Ideal Switch Transistor Model
[Symbol: ideal switch between Drain and Source, controlled by Gate]
Logic (Gate)   n-Channel Enhancement   p-Channel Enhancement   Depletion
1              Closed                  Open                    Weak
0              Open                    Closed                  Weak
X              Unknown                 Unknown                 Weak
remarks:
• Switch transition time is assumed to be zero or some nominal value.
• Unknown states can cause problems.
Linear Switch Transistor Model
[Symbol: switch in series with resistor REFF between Drain and Source, controlled by Gate]
Logic (Gate)   n-Channel Enhancement   p-Channel Enhancement   Depletion
1              REFF                    ∞                       REFF
0              ∞                       REFF                    REFF
X              [REFF, ∞]               [REFF, ∞]               REFF
remarks:
• In the linear model, node capacitance and device resistance are used to compute output logic levels and transition times.
• Ratio errors can be detected.
7.6 Hardware Description with VHDL
Chapter 8
Digital Subsystem Design
8.1 Weinberger Structuring
Weinberger structuring is a structured approach that simplifies physical layout and improves layout density. The method was presented by Weinberger in 1967.
Weinberger Arrays
• are created by placing transistors on the chip in a geometrically regular manner. Horizontal and vertical interconnect patterns are used to wire the devices together.
• use one type of gate; for example, NOR gates form a complete logic set for nMOS circuits
• are regular, which makes them very suitable for automatic layout generation
Example: $F = \overline{(A + B + C)} = \bar{A}\,\bar{B}\,\bar{C}$ (8.1)
Figure 8.1: NOR gate reduction for Weinberger structuring
• empty squares denote input connections
• filled squares denote output connections
Example: 3-to-8 decoder
Figure 8.2: Weinberger structuring for 3-to-8 decoder
Figure 8.3: Weinberger structuring for 3-to-8 decoder (continued)
Example: Z = U + V + W + X + Y (8.2)
Figure 8.4: Function representation in random logic
Figure 8.5: Weinberger NOR array representation
Figure 8.6: Weinberger stick diagram
Figure 8.7: Weinberger array structure: (a) schematic (b) layout
8.2 Gate Matrix Layout
Gate matrix layout is a character based layout style for custom CMOS circuitry. It is a regular design style, employing a matrix of intersecting transistor diffusion rows and polysilicon columns such that intersections are potential transistor sites.
8.2.1 Creating a Gate Matrix
The starting point is a representational line drawing or stick figure using the levels of interconnection available (e.g. polysilicon gate technology: polysilicon, metal, diffusion).
• first draw a series of parallel poly lines corresponding to the number of inputs to the circuit (more lines may be needed if an output is chosen to be polysilicon)
• subsequent transistor placements are determined by two factors, i.e. the input column and the serial or parallel association among transistors
• after row definition, further interconnections may be made with horizontal and vertical metal interconnection tracks
• final improvements
Figure 8.8: Gate matrix layout: (a) schematic (b) layout (c) optimized layout of n part
8.2.2 Example: Half-Adder
Figure 8.9: Half adder NAND/INV representation
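$C = AB = \overline{\overline{AB}}$ (8.3)
$S = \bar{A}B + A\bar{B}$
$= (\bar{A} + \bar{B})B + (\bar{A} + \bar{B})A$
$= \overline{(AB)}\,B + \overline{(AB)}\,A$
$= \overline{\overline{(\overline{AB}\,B)}\;\overline{(\overline{AB}\,A)}}$ (8.4)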
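The NAND/INV form of equations (8.3) and (8.4) can be checked exhaustively with a few lines of Python. The helper names below are chosen for this illustration only.

```python
def nand(*xs):
    """NAND of any number of inputs; with a single input it acts as an inverter."""
    return int(not all(xs))


def half_adder(a, b):
    n = nand(a, b)                     # shared term: NOT(AB)
    s = nand(nand(n, b), nand(n, a))   # sum output, equation (8.4)
    c = nand(n)                        # carry output: NOT(NOT(AB)), equation (8.3)
    return s, c


results = {(a, b): half_adder(a, b) for a in (0, 1) for b in (0, 1)}
```

Each entry of `results` is the pair (S, C); the sum equals A XOR B and the carry equals A AND B, using four NAND gates and one inverter.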
Figure 8.10: Half adder realizations: (a) standard cell (b) gate matrix
8.2.3 Character Definitions for Symbolic Layout
N   n-channel transistor
P   p-channel transistor
+   metal-poly or metal-diffusion crossover
∗   contact
|   polysilicon or n-diffusion wire
!   p-diffusion wire
:   vertical metal
–   horizontal metal
Figure 8.11: Typical gate matrix layout
The following rules summarise the gate-matrix technique:
1. Polysilicon runs only in one direction and is of constant width and pitch.
2. Diffusion wires (of constant width) may run vertically between polysilicon columns.
3. Metal may run horizontally and vertically. Any pitch departures from a minimum (e.g. power rails) are manually specified.
4. Transistors can only exist on polysilicon columns.
Wide transistors may be specified by abutting two or more N or P symbols.
Figure 8.12: Gate matrix row and column spacings
8.2.4 Summary of Gate Matrix Properties
+ regular design style
+ technology updatable
+ modularity is encouraged by the block nature of the layout style
+ circuit extraction may be done at the symbolic level or at the mask level by conventional circuit extraction
– character symbolic description is not hierarchical ⇒ modules must be assembled in their entirety and "pasted" together at the mask level
– no freedom to locally optimize geometry, e.g. transistor size
8.3 Optimal CMOS Complex Gate Layout
In MOS circuit design, complex functional cells can be used to achieve better performance. In this section the implementation of a random logic function on an array of CMOS transistors will be discussed. The method was presented by Uehara and van Cleemput in 1981. A graph theoretical approach for systematic and efficient layout generation minimizes the required chip area.
Figure 8.13: (a) CMOS complex gate schematic and (b) corresponding layout
8.3.1 CMOS Functional Cells
Figure 8.14: Implementation of an EXOR function: (a) Logic diagram. (b) Circuit. (c) Layout
Advantages of complex gate approach:
+ better performance
+ smaller size
Figure 8.15: Example of row-based layout scheme
In the following, the consideration is limited to AND/OR networks realized in complex gate CMOS by means of series/parallel connections of transistors. The topologies of the nMOS network and the pMOS network are assumed to be dual.
The delay of a complex CMOS cell mainly depends on the maximum number of series transistors between VDD or VSS and the cell output, which is called the level of the complex cell. This quantity has a direct influence on the charging or discharging resistance of the cell. Generally, cells with fewer than four levels are desirable. The number of cells with parallel/serial topology is given by the following table:
number of levels   number of cells
1                  1
2                  6
3                  80
4                  3434
It is therefore reasonable to use mainly cells with up to three levels, and cells with four levels only occasionally, in order to get sufficient performance.
Figure 8.16: Alternative complex gate implementation of EXOR function: (a) Logic diagram. (b) Circuit. (c) Layout
8.3.2 Basic Layout Strategy
Figure 8.17: Basic layout of the functional cell: (a) Logic diagram. (b) Circuit. (c) Graph model. (d) Layout
Layout properties (from Fig. 8.17(d)):
• two rows of transistors, implementing the pMOS and nMOS part of the circuit
• equal number of transistors in both rows
Figure 8.18: Layout optimization: (a) Diffusion connection of adjacent transistors. (b) Optimal arrangement (reordered input lines)
Fig. 8.18 shows layout improvements for the circuit in Fig. 8.17. If the metal connections between adjacent transistors are replaced by diffusion (the designer should be careful in doing this for high-speed circuits), the layout of Fig. 8.18(a) is achieved. A more sophisticated layout arrangement which further reduces the required area is shown in Fig. 8.18(b).
The best layout is achieved by the transistor arrangement of Fig. 8.19, which is logically equivalent to the previous figures.
Figure 8.19: Alternative optimal circuit layout: (a) Logic diagram. (b) Circuit. (c) Graph model. (d) Optimal Layout.
Generally the area cost of a functional cell can be calculated by
$\text{area} = \text{width} \cdot \text{height}$ (8.5)
with
$\text{height} = \text{const.}$ (8.6)
and
$\text{width} = \text{basic grid size} \cdot (\#\text{inputs} + \#\text{separations} + 1)$ (8.7)
A separation is required when there is no connection between physically adjacent transistors. An optimal layout is obtained by reducing the number of separations.
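As a small worked example of equations (8.5) to (8.7), the following sketch computes the width and area of a cell for a given number of separations. The grid size, cell height, and input count are made-up numbers, chosen only to show how each separation costs one additional grid column.

```python
def cell_width(grid, n_inputs, n_separations):
    """Equation (8.7): width in the same unit as the basic grid size."""
    return grid * (n_inputs + n_separations + 1)


def cell_area(grid, height, n_inputs, n_separations):
    """Equation (8.5) with the constant height of equation (8.6)."""
    return cell_width(grid, n_inputs, n_separations) * height


# Hypothetical cell: 5 inputs, grid size 1, height 10.
area_no_separation = cell_area(1, 10, 5, 0)     # width 6
area_two_separations = cell_area(1, 10, 5, 2)   # width 8
```

With these numbers, removing two separations reduces the cell area from 80 to 60 units, i.e. by a quarter.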
8.3.3 Graph Theoretical Algorithm
The p-side and the n-side of the circuit can be formulated as graphs, which are defined as follows:
$G_P = (V_P, E_P)$ p-side network (8.8)
$G_N = (V_N, E_N)$ n-side network (8.9)
Graph properties:
• the graphs are series/parallel graphs (CMOS complex gate property/assumption)
• every source/drain potential is represented by a vertex V
• every transistor is represented by an edge E, connecting the vertices representing source and drain
• edges are labeled by the corresponding transistor gate input signal
• $G_P$ and $G_N$ are dual
If two edges $E_i$ and $E_j$ are adjacent in the graph model, then it is possible to place the corresponding gates in physically adjacent positions of an array and hence connect them by a diffusion area. In order to minimize the number of separations, a minimum-size set of paths has to be found, which corresponds to chains of transistors in the array.
Definition 1 An Euler path is a path on a graph that covers every edge of the graph exactly once.
If there exist Euler paths for $G_N$ and $G_P$, then all transistors can be chained by diffusion areas. Otherwise the graphs have to be partitioned into subgraphs which have Euler paths.
It is necessary to find a pair of paths for $G_P$ and $G_N$ with the same sequence of labels, because p- and n-type transistors corresponding to the same input have to be positioned at the same horizontal position (poly line).
General algorithm:
1. enumerate all possible decompositions of the graph model to find the minimum numberof Euler paths that cover the graph
2. chain the gates by means of a diffusion area according to the order of the edges in eachEuler path and
3. if two or more Euler paths are necessary to cover the graph model, then provide a separation area between each pair of adjacent chains
=⇒ Search of minimal number of Euler paths is NP-complete
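For very small cells, a common Euler path for the n-side and p-side graphs can simply be found by trying all orderings of the edge labels. The sketch below does this for a hypothetical cell computing NOT(a·b + c); the graph encoding (label mapped to a pair of vertex names), the vertex names, and the function names are assumptions made for this illustration. The exhaustive search grows factorially with the number of transistors, so it is only meant to make the definitions concrete.

```python
from itertools import permutations


def is_trail(graph, order):
    """graph: label -> (u, v). True if the labels in this order form one continuous path."""
    ends = set(graph[order[0]])            # possible current vertices after the first edge
    for label in order[1:]:
        u, v = graph[label]
        # Continue from any vertex where the path may currently end.
        ends = {v for e in ends if e == u} | {u for e in ends if e == v}
        if not ends:
            return False
    return True


def common_euler_path(gn, gp):
    """Brute-force search for one label sequence that is an Euler path in both graphs."""
    for order in permutations(sorted(gn)):
        if is_trail(gn, order) and is_trail(gp, order):
            return order
    return None


# n-side: a and b in series between the output and VSS, c in parallel to them.
GN = {"a": ("out", "n1"), "b": ("n1", "vss"), "c": ("out", "vss")}
# p-side (dual): a and b in parallel between VDD and p1, c in series to the output.
GP = {"a": ("vdd", "p1"), "b": ("vdd", "p1"), "c": ("p1", "out")}

path = common_euler_path(GN, GP)
```

For this cell the input order a, b, c is found, so all three transistor pairs can be chained by diffusion without a separation. The order a, c, b is an Euler path of the n-side graph only, which shows why both graphs must be checked with the same label sequence.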
8.3.4 Problem Reduction
An odd number of series or parallel edges can be reduced to a single edge:
Figure 8.20: Reduction of odd numbers of edges
Definition 2 The reduced graph is obtained by iteratively replacing an odd number of series (parallel) edges by a single edge, until no further reduction is possible.
Theorem 1 If there is an Euler path in the reduced graph, then there exists an Euler path in the original graph.
Proof: It is possible to reconstruct an Euler path in the original graph by replacing each edge of the Euler path in the reduced graph by a sequence of the original odd number of edges.
Theorem 2 If the number of inputs to every AND/OR element is odd, then
1. the corresponding graph model has a single Euler path
2. there exists a graph model such that the sequence of edges on an Euler path corresponds to the vertical order of inputs on a planar representation of the logic diagram.
If there are gates in the logic diagram with an even number of inputs, additional “pseudo” inputs have to be introduced in order to guarantee an odd number of inputs. Theorem 2 guarantees that there exists an Euler path for this modified problem. However, the pseudo edges in the Euler path have to be removed afterwards, and they can then cause diffusion separations. An algorithm for minimizing separations caused by pseudo edges is given in the next section (⇒ minimal interlace of normal and pseudo inputs).
The heuristic algorithm for generating an Euler path is given by:
Figure 8.21: Application of reduction rule: (a) Logic Diagram. (b) Graph model and its reduction. (c) Reconstruction of an Euler path
1. To every gate with an even number of inputs a “pseudo” input is added
2. Add this new input to the gate such that the planar representation of the logic diagram shows a minimal interlace of “pseudo” and real inputs. It should be noted that a “pseudo” input at the top or at the bottom of the logic diagram does not contribute to the separation areas as shown in Fig. 8.22(b) and (c).
3. Construct the graph model such that the sequence of edges corresponds to the vertical order of inputs on the planar logic diagram.
4. Chain together the gates by means of diffusion areas, as indicated by the sequence of edges on the Euler path. “Pseudo” edges indicate separation areas.
5. The final circuit topology can be derived by deleting “pseudo” edges in parallel with other edges and by contracting “pseudo” edges in series with other edges.
This heuristic algorithm does not necessarily give the optimal layout, but if the resulting sequence has no separation areas, it is indeed the optimal solution.
Figure 8.22: Application of the heuristic algorithm: (a) New inputs p1 and p2 are added. (b) Optimal sequence of inputs without the interlace of p1 or p2. (c) Circuit with the dual path p1,2,3,1,4,5,p2
8.3.5 Algorithm for Calculating Minimal Interlace
Figure 8.23: Minimal interlace algorithm
Figure 8.24: Application example for minimal interlace algorithm
8.3.6 Examples
Figure 8.25: Carry look-ahead circuit (this representation has no Euler path)
Figure 8.26: Alternative topology for carry look-ahead circuit (with possibility of constructing an Euler path)
Figure 8.27: Comparison of space: (a) Functional cell realization. (b) Conventional NAND realization
8.4 Standard Cell Layout
Figure 8.28: Standard cell architecture
Figure 8.29: Synchronous counter schematic
Figure 8.30: Synchronous counter floorplan using standard cells
8.5 Programmable Logic Arrays
A programmable logic array (PLA) maps a set of Boolean functions in canonical, two-level sum-of-products form into a geometrical structure. A PLA consists of an AND-plane and an OR-plane. For every input variable in the Boolean equations, there is an input signal to the AND-plane. The AND-plane produces a set of product terms by performing an AND operation. The OR-plane generates output signals by performing an OR operation on the product terms fed by the AND-plane.
Figure 8.31: AND-OR-PLA
PLA: AND array and OR array are both programmable. Product term sharing: every product term of the AND array can be connected to any of the OR output gates.
PAL: AND array is programmable and OR array has fixed connection points (OR gates)
PROM: AND array hardwired, OR array programmable (→ the set of all possible product terms is realized)
Figure 8.32: Programmable logic approaches
Example:
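x0 x1 x2   z0 z1
0  0  0    1  1
0  0  1    1  1
0  1  0    0  0
0  1  1    0  0
1  0  0    0  0
1  0  1    0  0
1  1  0    1  0
1  1  1    0  1
$z_0 = \bar{x}_0 \bar{x}_1 \bar{x}_2 + \bar{x}_0 \bar{x}_1 x_2 + x_0 x_1 \bar{x}_2$
$= \bar{x}_0 \bar{x}_1 + x_0 x_1 \bar{x}_2$ (8.10)
$z_1 = \bar{x}_0 \bar{x}_1 \bar{x}_2 + \bar{x}_0 \bar{x}_1 x_2 + x_0 x_1 x_2$
$= \bar{x}_0 \bar{x}_1 + x_0 x_1 x_2$ (8.11)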
here:
• PROM implementation realizes all of the 8 product terms
• PLA implementation needs only 3 terms
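The three-term PLA of this example can be modelled by two small programming tables, one per plane, and checked against the truth table. The data layout below (tuples with `None` for an unconnected input) is an assumption made for this sketch, not a standard PLA file format.

```python
# AND-plane rows: required value per input (x0, x1, x2); None means "not connected".
AND_PLANE = [
    (0, 0, None),   # NOT x0 AND NOT x1
    (1, 1, 0),      # x0 AND x1 AND NOT x2
    (1, 1, 1),      # x0 AND x1 AND x2
]
# OR-plane rows: which outputs (z0, z1) each product term feeds.
OR_PLANE = [
    (1, 1),
    (1, 0),
    (0, 1),
]


def pla(x):
    """Evaluate the outputs (z0, z1) for the input tuple x = (x0, x1, x2)."""
    terms = [all(want is None or want == bit for want, bit in zip(row, x))
             for row in AND_PLANE]
    return tuple(int(any(t and row[k] for t, row in zip(terms, OR_PLANE)))
                 for k in range(2))


TRUTH_TABLE = {
    (0, 0, 0): (1, 1), (0, 0, 1): (1, 1), (0, 1, 0): (0, 0), (0, 1, 1): (0, 0),
    (1, 0, 0): (0, 0), (1, 0, 1): (0, 0), (1, 1, 0): (1, 0), (1, 1, 1): (0, 1),
}
matches = all(pla(x) == z for x, z in TRUTH_TABLE.items())
```

The first product term is shared by both outputs, which is the product term sharing that distinguishes a PLA from a PROM realizing all 8 terms.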
AND plane (x0 x1 x2)   OR plane (z0 z1)
0 0 X                  1 1
1 1 0                  1 0
1 1 1                  0 1
Figure 8.33: PLA realization for given example
8.5.1 Floor Plan for PLA
Figure 8.34: PLA generic floor plan
A     AND plane programming cell
O     OR plane programming cell
AO    AND-OR communication cell
IN    AND plane input cell
OUT   OR plane output cell
LA    Left AND plane cell
RO    Right OR plane cell
BL    Bottom left cell
BM    Bottom middle cell
BR    Bottom right cell
TL    Top left cell
TA    Top AND cell
TM    Top middle cell
TO    Top OR cell
TR    Top right cell
8.5.2 Static nMOS and Pseudo-nMOS PLA
nMOS PLA: Pull-up network realized by a single nMOS depletion transistor
Pseudo nMOS PLA: Pull-up by a high resistance pMOS transistor with permanently grounded gate input
Since the AND-OR structure is not suited to MOS circuit technology, both the AND and OR planes are implemented using distributed NOR or NAND gate structures based on De Morgan's law:
1. INV-NOR-NOR-INV structure:
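$a b + c d = \overline{\overline{\overline{(\bar{a} + \bar{b})} + \overline{(\bar{c} + \bar{d})}}}$
$= \underbrace{\overline{\underbrace{\overline{\overline{(\underbrace{\bar{a}}_{\text{INV}} + \bar{b})} + \underbrace{\overline{(\bar{c} + \bar{d})}}_{\text{NOR}}}}_{\text{NOR}}}}_{\text{INV}}$ (8.13)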
Example:
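$z_0 = \bar{x}_0 \bar{x}_1 + x_0 x_1 \bar{x}_2$
$= \overline{\overline{(\bar{x}_0 \bar{x}_1 + x_0 x_1 \bar{x}_2)}}$ (8.14)
$= \overline{\overline{\overline{(x_0 + x_1)} + \overline{(\bar{x}_0 + \bar{x}_1 + x_2)}}}$ (8.15)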
Properties:
• high static power dissipation
• small area
• useful if high speed is not required
Figure 8.35: NOR-NOR PLA structure
Figure 8.36: Pseudo nMOS NOR-NOR PLA circuit
Figure 8.37: PLA implementation in pseudo nMOS logic
Figure 8.38: Stick diagram of nMOS PLA
2. NAND-NAND structure:
$a b + c d = \overline{\overline{a b + c d}}$
$= \overline{\overline{(a b)}\;\overline{(c d)}}$ (8.16)
Example:
$z_0 = \bar{x}_0 \bar{x}_1 + x_0 x_1 \bar{x}_2$
$= \overline{\overline{(\bar{x}_0 \bar{x}_1)}\;\overline{(x_0 x_1 \bar{x}_2)}}$ (8.17)
Properties:
• NAND-NAND approach not recommended:
• decreasing performance at increasing number of inputs (because of series connection of nMOS transistors)
• high static power dissipation
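The two gate-level structures above can be checked exhaustively against the plain AND-OR form. The following is a minimal sketch in Python; the helper names (`inv`, `nor`, `nand`, `equivalent`) are illustrative assumptions and not part of the course material.

```python
from itertools import product

def inv(x):
    """Inverter on a 0/1 value."""
    return 1 - x

def nor(*xs):
    """NOR gate with any number of 0/1 inputs."""
    return inv(int(any(xs)))

def nand(*xs):
    """NAND gate with any number of 0/1 inputs."""
    return inv(int(all(xs)))

def and_or(a, b, c, d):
    """Reference function a*b + c*d."""
    return int((a and b) or (c and d))

def inv_nor_nor_inv(a, b, c, d):
    """Input inverters, first NOR plane, second NOR plane, output inverter."""
    p1 = nor(inv(a), inv(b))      # equals a*b
    p2 = nor(inv(c), inv(d))      # equals c*d
    return inv(nor(p1, p2))

def nand_nand(a, b, c, d):
    """Two NAND planes without any additional inverters."""
    return nand(nand(a, b), nand(c, d))

def equivalent(f, g, n):
    """Compare two n-input functions on all 2**n input vectors."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=n))

print(equivalent(and_or, inv_nor_nor_inv, 4), equivalent(and_or, nand_nand, 4))
```

Both comparisons hold for all 16 input vectors, which is the content of equations (8.13) and (8.16).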
8.5.3 Static CMOS PLA
NOR gates with a large number of inputs should be avoided in CMOS because the p-channel devices are in series. Static CMOS PLAs are therefore usually realized with a NAND-INV-INV-NAND structure in order to avoid long chains of pMOS transistors.
Figure 8.39: PLA NAND-INV-INV-NAND implementation
Properties:
• no static power dissipation
• area increase becomes unacceptable for large PLAs
• fast operation
8.5.4 Dynamic CMOS PLA
• smaller than static CMOS
• fast
• 2-phase clocking
• states of φ1:
φ1 = 1:
– no path to ground
– inputs change
– both NOR planes are precharged
φ1 = 0:
– first NOR plane discharges
– dummy row: worst-case discharge (prevents the second NOR plane from discharging prematurely)
– after the first NOR plane has settled, the second NOR plane evaluates
Figure 8.40: CMOS PLA layout
Figure 8.41: Dynamic 2-phase PLA circuit
• φ2 is used to latch the second stage → an intermediate clock is required to precharge the OR plane. As mentioned above, it is generated by the cells TL, TA and TM, using a dummy product row that discharges at the worst-case rate according to the loading of the AND array
8.5.5 Noise in PLAs
• dynamic PLAs suffer from noise on switched supply lines
• the discharge current generates transients on the power supply bus
• to reduce noise: ground the PLA locally and use metal lines for the power supply wherever possible (reduced impedance)
8.5.6 Optimization of PLAs
Logic Minimization
• optimization (minimization) of the boolean equations in order to reduce the number of product terms or literals
Figure 8.42: Noise problem in dynamic PLAs
• if a term is needed in both positive and negative form, a reduction can sometimes be achieved using negative logic
Example:
\begin{align*}
z &= x_1 + x_0 x_1' x_2' + x_0' x_1' x_2 && \Longrightarrow \text{3 product terms}\\
z' &= (x_1 + x_0 x_1' x_2' + x_0' x_1' x_2)'\\
&= x_1'\,(x_0 x_1' x_2')'\,(x_0' x_1' x_2)'\\
&= x_1'\,(x_0' + x_1 + x_2)\,(x_0 + x_1 + x_2')\\
&= (x_1' x_0' + x_1' x_2)\,(x_0 + x_1 + x_2')\\
&= x_0 x_1' x_2 + x_0' x_1' x_2' && \Longrightarrow \text{2 product terms}
\end{align*}
• decoder in front of the AND plane to generate combined input variables
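The negative-logic example in the list above can be confirmed by brute force: the two-term expression must be the exact complement of the three-term one. The sketch below is an illustration only; the function names `z`, `z_neg` and `is_complement` are assumptions introduced here.

```python
from itertools import product

def z(x0, x1, x2):
    """Positive-logic form with three product terms."""
    return bool(x1 or (x0 and not x1 and not x2) or (not x0 and not x1 and x2))

def z_neg(x0, x1, x2):
    """Negative-logic form with two product terms."""
    return bool((x0 and not x1 and x2) or (not x0 and not x1 and not x2))

def is_complement(f, g, n=3):
    """True if g is the complement of f on every input vector."""
    return all(f(*v) != g(*v) for v in product((False, True), repeat=n))

print(is_complement(z, z_neg))
```

Realizing z' in the AND plane and inverting at the output buffer therefore saves one product row in this example.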
Multiple Sided Access
Figure 8.43: Multiple sided input/output access
Folding
Figure 8.44: PLA before folding
Figure 8.45: Row-folded PLA
The advantage of multiple-sided access and folding is the decreased layout area. The drawback is that the regular layout structure changes and the wiring becomes more difficult.
Figure 8.46: Column-folded PLA
8.5.7 Timing and Power Dissipation of a Static PLA
Delay is determined by
• (W/L) of the AND/OR load
• (W/L) of the AND/OR cells
Minimum Delay:
• large load current I_load
• (W/L)_OR plane = e · (W/L)_AND plane
Limitations:
• I_load is limited by:
– the total power of the PLA
– the internal logic '0' level: the voltage drop I_load · R_nMOS must stay below V_T
• the stage sizing factor e for successive stages cannot always be realized due to the floorplan
8.5.8 Automatic PLA Layout Generation
Figure 8.47: Automatic PLA layout generation
(Blocks shown in the figure: Input: boolean equations, structure of PLA; logical optimization; truth table = matrix; floorplanner; Cells: input/output buffer, clock driver, VDD/VSS cells, Schmitt trigger, ...; Output: layout with mask data)
Example: PLA generator input file
PLA adderpla;
INPUT: I1,I2,I3;
OUTPUT: O1,O2;
PRODUCT: P1,P2,P3,P4,P5,P6,P7;
AND_BEGIN
  P1 := I1 * I2;
  P2 := I1 * I3;
  P3 := I2 * I3;
  P4 := I1 * I2' * I3';
  P5 := I1' * I2 * I3';
  P6 := I1' * I2' * I3;
  P7 := I1 * I2 * I3;
AND_END
OR_BEGIN
  O1 := P1 + P2 + P3;
  O2 := P4 + P5 + P6 + P7;
OR_END
Truth table matrix: optimized intermediate result
11X 10
1X1 10
X11 10
100 01
010 01
001 01
111 01
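To see how such a matrix is interpreted, the following Python sketch evaluates it row by row: the left pattern is the AND plane (X meaning "not connected"), the right pattern selects the OR-plane outputs. Reading O1 as the carry and O2 as the sum of a full adder is an assumption based on the name `adderpla`; the function names are hypothetical.

```python
from itertools import product

# Rows copied from the truth table matrix: (AND-plane pattern, OR-plane pattern).
PERSONALITY = [
    ("11X", "10"), ("1X1", "10"), ("X11", "10"),
    ("100", "01"), ("010", "01"), ("001", "01"), ("111", "01"),
]

def row_matches(pattern, inputs):
    """A product row fires if every non-X position agrees with the input bit."""
    return all(p == "X" or int(p) == bit for p, bit in zip(pattern, inputs))

def evaluate_pla(inputs, personality=PERSONALITY):
    """OR together the output patterns of all product rows that fire."""
    outputs = [0] * len(personality[0][1])
    for and_pattern, or_pattern in personality:
        if row_matches(and_pattern, inputs):
            for k, connected in enumerate(or_pattern):
                if connected == "1":
                    outputs[k] = 1
    return tuple(outputs)

def full_adder(i1, i2, i3):
    """Reference model (assumed meaning of the outputs): (carry, sum)."""
    total = i1 + i2 + i3
    return (total // 2, total % 2)

for v in product((0, 1), repeat=3):
    print(v, evaluate_pla(v), full_adder(*v))
```

Under this reading the seven product rows reproduce the full adder on all eight input combinations.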
Finite-State Machine
8.6 Finite-State Machine
A typical digital circuit architecture for computation-intensive applications consists of a data-path and a controller. The data-path is formed by a number of arithmetic units such as adders, ALUs and multipliers, connected through a network of connections, busses, multiplexers and registers. Registers are required to separate computational stages from each other (to synchronize computations) or to feed back data for further arithmetic operations (to break up circuit loops).
However, no circuit can be realized through a data-path alone, since this circuit part has to be controlled to perform actual computations. Signals are required, e.g., to select the functionality of an ALU, to steer data through multiplexers to a dedicated input of an arithmetic unit, or to control the reading of values into registers. These signals are provided by a control unit, or controller for short. To support a hierarchical design approach, data-path and controller are always regarded separately, as shown in figure 8.48.
Figure 8.48: Datapath and controller block
The control section provides the control signals required for data-path control and, in turn, reads status information such as overflow flags or comparator results (to control loop execution etc.). A typical example of a control task is the instruction set execution of standard microprocessors. In simplified form, the controller can work in the following way:
step 1: initialize processor
step 2: fetch instruction (address)
step 3: decode instruction

instr 1 (add)            instr 2 (move)           instr 3 ...
step 4: load operand 1   step 4: load operand     .
step 5: load operand 2   step 5: store operand    .
step 6: add op1 op2                               .
step 7: store result

step n: address = address + 1
step n+1: goto step 2
Different steps are required to fetch, decode and execute an instruction. Depending on the decoding of the instruction, a dedicated sequence of steps is executed. During each step an output vector is produced to control the data-path (e.g. to switch a multiplexer to select a certain operand or to determine the ALU operation to be performed on the operands). In this example the controller also receives signals from the data-path, e.g. the instruction decoding information in step 3, so that it can branch into the corresponding instruction execution sequence.
The question now arises how such a controller can be specified and designed. Combinational circuit specification through boolean equations provides a good model for the behaviour of memoryless digital circuits. However, a controller realization cannot be memoryless, because the controller passes through a sequence of steps which is generally influenced by signals read from the data-path. During each step an output vector has to be produced to control the data-path. A controller can therefore be regarded as a black box with an input and an output vector, where the values of the output vector depend on the current step. A certain step is reached through a sequence of preceding steps, which means that the value of an output vector depends on the history of the circuit. Such behaviour is only possible when memory is available.
Synchronous digital circuits which comprise memory elements are called sequential circuits, since the results produced at the primary outputs generally depend on the values at the primary inputs and on the history of the circuit. History in this context means the values of all registers in the current step (or state), which received those values before the current clock cycle (in the past). During its operation the circuit therefore runs through a sequence of states represented by the register contents.
Each sequential circuit can be represented as depicted for the controller on the left side of figure 8.48 if all registers are collected into the state register, all combinational logic producing the contents of the registers into the next-state logic, and all combinational logic producing the primary output values into the output function.
Due to the existence of memory, combinational circuit theory is not a well-suited model for the description of controllers or any other sequential logic. Since a controller can be regarded as a special case of sequential logic (and one is interested in a general approach that copes with all sequential logic circuits), the more general term sequential logic will be used in the remainder of this section. Figure 8.49 shows a small example of a sequential circuit. Although it is possible in principle to replace the registers by the corresponding combinational circuits and to open the feedback loop so that combinational circuit theory can be applied, a more abstract behavioural description is desirable. This is especially true for complex controllers, where a designer does not want to be concerned with too many circuit details. Fortunately, the theory of finite state machines provides an abstract basis for the modelling of sequential logic.
Figure 8.49: Sequential circuit example
8.6.1 Introduction into Finite State Machines
In this section we will show how a sequential circuit can be seen as one of several possible implementations of a particular finite state machine (FSM). Each FSM has a finite set of discrete states as well as a finite set of digital inputs and outputs and a set of rules that govern its behaviour. An FSM operates in discrete time, so its behaviour can be characterized as a sequence of steps that occur at regular intervals (all registers are clocked synchronously). An FSM's inputs, outputs, and state are assumed to be constant during each interval, changing only at the boundaries between consecutive intervals (the registers are triggered by the rising or falling clock edge). In summary, an FSM is defined in the following way:
A finite state machine is a digital device having
• a finite set of states S_1, S_2, ..., S_k (where k is the number of states). Optionally one of these, S_I, is distinguished as the initial state of the FSM
• a finite number of binary inputs I_1, I_2, ..., I_m (where m is the number of inputs)
• a finite number of binary outputs O_1, O_2, ..., O_n (where n is the number of outputs)
• a set of state-transition rules specifying, for each choice of current state S_s and input values I_1, I_2, ..., I_m, a next state S_s'
• a set of output rules specifying, for each choice of current state S_s and input values I_1, I_2, ..., I_m, the binary value at each output
One distinguishes between two types of finite state machines, namely the Moore machine and the Mealy machine. The two types differ in the last item of the definition above, the output rules. In a Moore machine the outputs are functions of the current state only. In figure 8.48 this would mean that the control inputs go only into the next-state function block and not into the output function block.
The alternative Mealy machine model allows outputs to reflect the current inputs as well as the current state. Figure 8.48 therefore represents a Mealy machine. The behaviour of every FSM can be described using either model, although the number of states and the timing details will generally differ.
The Moore machine has some advantages for theoretical reasoning and is therefore generally used in proofs. The Mealy machine type, however, is preferred in actual circuit implementations, since it generally requires fewer states (which means less logic for its realization) and can respond immediately to changes of the input vector (a Moore machine first has to branch into a new state, since its output values depend only on the state information). Practical FSM implementations typically have a reset input which returns the FSM to a well-defined initial state, so that the automaton can be reset before a new input sequence is applied (e.g. when the system containing the FSM is turned on).
Returning to the circuit of figure 8.49, one can identify the discrete states by tabulating the combinations of values of its state variables. If q0 and q1 denote the values of the state variables in the current state, and n0 and n1 the values in the succeeding state, the following equations describe this circuit:
\begin{align*}
n_0 &= in \cdot \overline{q_1}\\
n_1 &= q_0\\
out &= q_1 \cdot q_0
\end{align*}
The state-transition and output rules are shown in the truth table of table 8.1, which lists all possible combinations of current state and input variables on the left side, and the next state which the machine should enter on the right side along with the corresponding output.

Current state q1q0   Input   Next state n1n0   Output
00                   0       00                0
00                   1       01                0
01                   0       10                0
01                   1       11                0
10                   0       00                0
10                   1       00                0
11                   0       10                1
11                   1       10                1

Table 8.1: State-transition truth table
Such tables can easily be obtained from the implementation of the FSM. For example, if in the circuit above q1 = 0, q0 = 0 and in = 0, then the resulting next state is q1 = 0, q0 = 0. If in = 1, the next state will be q1 = 0, q0 = 1.
The state-transition table immediately suggests a ROM implementation of the FSM: the left-hand side of the table is the address of the ROM and the right-hand columns are its data outputs.
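The ROM view can be sketched in a few lines of Python. The sketch assumes the next-state equation n0 = in AND (NOT q1), which is the form that reproduces every row of table 8.1; the names `ROM`, `next_state_and_output` and `run` are illustrative and not taken from the course.

```python
from itertools import product

def next_state_and_output(q1, q0, inp):
    """Combinational logic of the example circuit (assumed form, see lead-in)."""
    n0 = inp & (1 - q1)   # n0 = in AND NOT q1
    n1 = q0               # n1 = q0
    out = q1 & q0         # out = q1 AND q0
    return n1, n0, out

# ROM view: address = (q1, q0, in), stored word = (n1, n0, out).
ROM = {(q1, q0, i): next_state_and_output(q1, q0, i)
       for q1, q0, i in product((0, 1), repeat=3)}

def run(inputs, state=(0, 0)):
    """Clock the machine once per input bit; return the outputs and the final state."""
    outputs = []
    for i in inputs:
        n1, n0, out = ROM[(state[0], state[1], i)]
        outputs.append(out)     # output belongs to the current state
        state = (n1, n0)        # state register loads the next state
    return outputs, state

print(run([1, 1, 0]))
```

Each clock cycle is one ROM lookup followed by a register update, which is exactly the structure of a state register plus combinational logic.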
The final and most abstract representation of a finite-state machine is the state-transition diagram. In such a diagram, states are shown as circles. Outputs associated with a state are given inside the circle. Transitions between states are represented as directed arcs from one circle to another. The input combination that causes a given transition is written along the arc. Since we are dealing with clocked sequential machines, transitions occur only on clock edges, and for this reason the clock is not shown explicitly in state-transition diagrams. Figure 8.50 gives the state-transition diagram for the FSM discussed above.
Figure 8.50: State-transition diagram
8.6.2 Realization of Finite-State Machines
The realization of clocked sequential circuits is a fairly straightforward process with four main steps.
The first step is to draw a state-transition diagram for the FSM. This is often a very difficult step, since it requires thinking very precisely about what the FSM is supposed to do. Next, determine the number of state variables (and therefore registers) from the number of states in the state-transition diagram and assign a binary encoding to each state. This assignment can be done arbitrarily, but an arbitrary choice may result in an inefficient solution. The state assignment strongly influences the amount of combinational circuitry required to implement the FSM. Unfortunately, finding an optimal assignment is NP-hard, which means that no algorithm is known that avoids exponentially growing computation time as the problem size increases. The importance of an appropriate state encoding is illustrated at the end of this subsection.
Then, based on the state-transition diagram, a state-transition table has to be built. It is important that the table covers all possible input combinations for each possible state (if a combination does not occur, don't cares should be inserted, which can be exploited during combinational logic minimization). From the table, the circuit can be implemented directly with ROMs. If another implementation is required (logic gates, for example), Karnaugh maps have to be developed from the state-transition table for each next-state variable. Finally, a reduced sum-of-products expression has to be found for each variable, which can then be implemented through appropriate combinational logic.
To illustrate these steps, consider the design of a simple FSM whose single output goes high every fifth clock cycle and remains high for one clock period. The frequency of the output pulses is one-fifth of the clock frequency. This type of circuit is called a divide-by-5 counter. The machine has no external inputs. Its state-transition diagram is shown in figure 8.51. A state assignment and a state-transition table for this counter are given in table 8.2. Note that the number of 3 bits for the state encoding as well as the actual encoding of each state were chosen arbitrarily. A larger number of bits or another encoding could have been selected! The table can now be realized, e.g., through ROMs or using explicit combinational logic (realized as two-level or multilevel gates, PLA etc.).
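The counter can be simulated directly from its table. The sketch below uses the encoding A = 000 ... E = 100 from table 8.2; the dictionary and function names are assumptions made for this illustration.

```python
# Encoding from table 8.2: A=000, B=001, C=010, D=011, E=100.
NEXT_STATE = {0b000: 0b001, 0b001: 0b010, 0b010: 0b011, 0b011: 0b100, 0b100: 0b000}
OUTPUT = {0b000: 0, 0b001: 0, 0b010: 0, 0b011: 0, 0b100: 1}

def divide_by_5(n_clocks, state=0b000):
    """Return the output observed in each of n_clocks consecutive clock periods."""
    outputs = []
    for _ in range(n_clocks):
        outputs.append(OUTPUT[state])   # Moore output of the current state
        state = NEXT_STATE[state]       # state register update on the clock edge
    return outputs

print(divide_by_5(10))
```

The output is high in exactly one of every five clock periods, as required for a divide-by-5 counter.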
Figure 8.51: State-transition diagram for the divide-by-5 counter

State   Current state   Next state   Output
A       000             001          0
B       001             010          0
C       010             011          0
D       011             100          0
E       100             000          1

Table 8.2: State-transition table for divide-by-5 counter
Figure 8.52 shows another example, which will be used in the following to illustrate the importance of state encoding. The behaviour of the FSM is given in transition table 8.3.
Figure 8.52: State-transition diagram of an arbitrary FSM

Current state   Input   Next state   Output
S0              0       S0           0
S0              1       S1           0
S1              0       S2           0
S1              1       S1           0
S2              0       S0           0
S2              1       S1           1

Table 8.3: State-transition truth table
State encoding is the process of assigning a unique bit vector to each state of the FSM; e.g. the following two encodings can be selected:

Encoding 1   Encoding 2
S0 = 00      S0 = 00
S1 = 01      S1 = 11
S2 = 11      S2 = 01
There are s possible encodings with
\[
s = \frac{k!}{(k-m)!}
\]
with k = 2^n (n is the number of selected state bits) and m being the number of states to be encoded. Typically n is chosen as n = \lceil \log_2(m) \rceil. However, other values are possible for n, e.g. one bit per state!
In the example above, k = 2^2 = 4 and m = 3. With these constraints the number of possible encodings is 4!/(4−3)! = 24. Each encoding generally results in a realization of different complexity.
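The count of possible encodings can be reproduced numerically. This is a small sketch using `math.perm` from the Python standard library (available from Python 3.8); the function names are chosen here for illustration.

```python
import math

def min_state_bits(m):
    """Smallest n with 2**n >= m, i.e. ceil(log2(m)), and at least one bit."""
    return max(1, (m - 1).bit_length())

def num_encodings(m, n=None):
    """Number of ways to assign distinct n-bit codes to m states: k!/(k-m)!."""
    if n is None:
        n = min_state_bits(m)
    k = 2 ** n
    return math.perm(k, m)

print(num_encodings(3))        # three states, two state bits
print(num_encodings(3, n=3))   # same machine with one bit per state
```

Already for three states the search space grows from 24 to 336 encodings when a third state bit is allowed, which illustrates why exhaustive search is not practical for larger machines.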
The first state encoding was
S0 = 00, S1 = 01, S2 = 11
The corresponding output function is
out = abc
resulting in the state transition functions
y1 = abc
y2 = ab+ bc+ ac
The number of product terms is 5 and the number of literals is 12. A resulting hardware implementation using combinational logic is shown in figure 8.53.
Figure 8.53: FSM-realization for first encoding scheme
The second encoding was
S0 = 00, S1 = 11, S2 = 01
The corresponding output function is
out = abc
and the transfer functions are
y1 = ab
y2 = ab+ bc
The number of product terms is now 4 and the number of literals is 9. Figure 8.54 shows a corresponding realization.
As one can see, state encoding is crucial for the efficiency of the final solution. Unfortunately, no algorithm is known that finds an optimal assignment with a complexity bounded by a polynomial expression. A good heuristic is simply to select an encoding in which only one bit changes when sequencing from state to state (Gray code). Another good approach can be one-hot encoding (where a single bit represents each state), which is practical only for a small number of states.
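Both heuristics are easy to generate mechanically. The following Python sketch produces a reflected Gray code and a one-hot code for a given number of states; the function names are assumptions, and the code only generates the bit vectors, it does not evaluate the resulting logic cost.

```python
def gray_code(i):
    """i-th word of the reflected binary Gray code."""
    return i ^ (i // 2)

def gray_encoding(num_states):
    """Gray-coded bit strings for states 0 .. num_states-1 (minimum bit width)."""
    bits = max(1, (num_states - 1).bit_length())
    return [format(gray_code(i), f"0{bits}b") for i in range(num_states)]

def one_hot_encoding(num_states):
    """One flip-flop per state: exactly one bit is set in every code word."""
    return [format(2 ** i, f"0{num_states}b") for i in range(num_states)]

def hamming(a, b):
    """Number of bit positions in which two code words differ."""
    return sum(x != y for x, y in zip(a, b))

print(gray_encoding(5))
print(one_hot_encoding(3))
```

Consecutive Gray code words differ in one bit. For a cyclic sequence whose length is not a power of two (such as the five states of the divide-by-5 counter), the wrap-around transition may still change more than one bit.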
8.6.3 Synchronous FSM Circuit Models
Although it is possible to base FSM realizations on self-timed or other timing disciplines, most FSM implementations are based on a synchronous, single-clock scheme. As already mentioned in connection with figure 8.48, a general sketch of an implementation strategy using the Moore machine model (outputs are functions of the current state only, independent of the current inputs) is shown in figure 8.55. One should note the use of a clocked register to hold the current state information. All other blocks are combinational logic components which can be realized in different ways (PLA, ROM, dedicated logic circuits etc.).
Figure 8.54: FSM-realization for second encoding scheme
Figure 8.55: FSM (Moore automata) implementation
The inputs of such a circuit have to be synchronous with the FSM's clock, because all outputs of the next-state logic have to be settled before the values are loaded into the registers at the rising clock edge. With asynchronous input transitions, invalid values might be loaded or the registers might enter meta-stable states.
Asynchronous inputs can be treated as shown in figure 8.56. The synchronisation through additional clocked registers guarantees that the inputs to the state register are stable at each active clock edge, assuming of course that the propagation delay along the combinational path through the logic plus the setup time of the state registers is shorter than the clock period. Moreover, although meta-stable behaviour of the input register remains a possibility, its output has a clock period (minus the next-state propagation delay) to become valid before it can corrupt the contents of the state register. Thus, for sufficiently long clock periods, this latter design can be made arbitrarily reliable.
Figure 8.56: Treatment of asynchronous inputs in a Moore machine
It is important to recognize that the implementations of figures 8.55 and 8.56 behave slightly differently, owing to the extra clock delay in the inputs of figure 8.56. Given identical next-state logic, identical input sequences will yield output sequences that are delayed by one clock cycle in the second approach.
8.6.4 States and Bits
Most real digital systems are finite-state machines, yet the view and techniques introduced in this chapter are not appropriate in every circumstance. The binary encoding of an FSM's state allows at most 2^k states to be represented in k bits of state variables, and in general about k flip-flops are required to hold the state of a 2^k-state machine. Adding a single flip-flop to a machine potentially doubles its number of states. This exponential relationship between the number of states and the amount of physical hardware in a sequential circuit makes the FSM model awkward for sequential circuits having more than a few bits of storage. A 10-bit register, for example, would be quite difficult to characterize by a state-transition diagram; the number of states of a supercomputer is inconceivably large.
Typically, such systems are viewed in terms of memory cells and registers, partitioning the enormous state into more tractable units. It is important to recognize that sequential circuits may be viewed either in state or in bit terms, that the two are exponentially related, and that it is often useful to change between these views.
The reader should therefore be aware that it makes no sense to apply the FSM model to every type of sequential circuit. However, the FSM model is very well suited to support the design of controllers, since their number of states is reasonably small.
8.6.5 Equivalence of FSMs
The input/output behaviour of two FSMs may be identical even though the machines have different transition and output rules or even different numbers of states. As a degenerate example, consider two single-input FSMs whose output remains constant, independent of their state. From external observations it is impossible to distinguish between the states of such machines: one might have one state and the other nine, yet the machines are externally indistinguishable. We call FSMs equivalent if they are indistinguishable; for all practical purposes, equivalent FSMs are interchangeable. The equivalence of FSMs is therefore important for their construction, since the designer is interested in transforming an initial FSM specification into an equivalent machine which can be realized most efficiently on silicon while meeting all required constraints. It is therefore useful to develop the notion of equivalence together with engineering tools for reducing a specified FSM to a simpler equivalent.
The terms state equivalence and FSM equivalence are defined in the following way:
State equivalence: Let s1 and s2 be particular states of FSMs M1 and M2. State s1 of M1 is equivalent to state s2 of M2 if and only if, for every finite sequence of inputs, the outputs resulting from the application of that sequence to M1 in s1 are identical to the outputs resulting from the application of the same sequence to M2 in s2.
Thus two states are not equivalent only if there exists a finite input sequence that leads them to produce distinct outputs. The notation M : s will be used to specify state s of machine M.
FSM equivalence: Let s1 and s2 be the initial states of FSMs M1 and M2. Then the machines M1 and M2 are equivalent if and only if M1 : s1 is equivalent to M2 : s2.
Given an FSM that solves some practical problem, one is often interested in finding the smallest equivalent FSM in order to minimize costs. While several measures of 'smallest' might be proposed, a natural candidate (and the usual choice) is the number of FSM states. Thus one seeks to perform a state reduction on a given FSM M1 to yield an equivalent M2 having fewer states. In general, this may be done by detecting and merging equivalent states within M1.
For example, one can look for pairs of states M1 : si and M1 : sj that are equivalent. When such a pair is found, the two states can simply be combined into a single state, yielding an equivalent FSM with one state fewer. This process of looking for equivalent states can be continued in the new FSM and terminates when a pair of equivalent states can no longer be found. This is an example of a relaxation algorithm, in which a set of reduction rules is repeatedly applied to reduce a structure until it can be reduced no more. It begins with a pessimistic but working model of the desired FSM and iteratively improves the cost while maintaining equivalence.
This approach has the disadvantage that the equivalence of two states can be difficult to detect. Rather than incrementally improving an initial pessimistic model, the optimistic relaxation approach begins with the assumption that all of the states of M1 are equivalent (yielding a one-state machine). The relaxation iteratively discovers pairs of presumed equivalent states that cannot in fact be equivalent and grudgingly splits them into their components. This scheme is based on the detection of state non-equivalence through the following two rules:
• If states si and sj have different outputs, then they are nonequivalent
• If, for some input combination v1, v2, ..., vm, state si1 goes to state si2 and state sj1 goes to state sj2, where si2 and sj2 are nonequivalent, then si1 and sj1 are nonequivalent
Beginning with the unrealistic assumption that all states are equivalent, iteration of the above rules will uncover more and more nonequivalent pairs of states until every pair that has not been shown to be nonequivalent is in fact equivalent.
Consider, e.g., the FSM diagrammed in figure 8.57.
Figure 8.57: Five-state FSM
The search for a reduced equivalent starts by constructing a truth table of output and transition rules for a one-state equivalent:

New state                  Output   Transition on 0   Transition on 1
S0 = S1 = S2 = S3 = S4     X
In the course of building the table, it has to be checked that each output and next-state value for a merged state is consistent with each of the component states from the original FSM. In this first step, an inconsistency is detected immediately: it is impossible to put a value into the output column for the single combined state that is consistent with all five component states. Thus the aggregate state has to be split into two new states for the next iteration, with output values of 0 and 1. One partitions the five-state aggregate into one state corresponding to the original states S0 and S3 with a 1 output, and a second state corresponding to the original states with a 0 output. Then one attempts to fill out the truth table again:

New state        Output   Transition on 0   Transition on 1
S0 = S3          1        S1 = S4           S0
S1 = S2 = S4     0        S2                X
This time the table can nearly be completed. A single inconsistency is encountered when trying to assign a transition for the S1 = S2 = S4 state on a 1 input: in the original machine, S1 and S4 both go to S3 in this case, while S2 goes to S4. Since the respective next states S3 and S4 are not equivalent, S2 has to be split off into a separate state. This results in:

New state   Output   Transition on 0   Transition on 1
S0 = S3     1        S1 = S4           S0
S1 = S4     0        S2                S3
S2          0        S2                S4
The corresponding state-transition diagram is shown in figure 8.58. The reader may verify that this reduced FSM is equivalent to the original.
Figure 8.58: Reduced equivalent FSM
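The optimistic relaxation can be sketched as a partition refinement in Python. Since figure 8.57 is not reproduced here, the transition dictionary below is a hypothetical five-state machine chosen to agree with the tables above (outputs, and the arcs mentioned in the text); which of the merged states each individual arc targets is an assumption. The sketch starts directly from the split on output values, which corresponds to the first iteration described in the text.

```python
# Hypothetical machine consistent with the tables above (exact arcs are assumed).
OUTPUT = {"S0": 1, "S1": 0, "S2": 0, "S3": 1, "S4": 0}
NEXT = {
    "S0": {0: "S1", 1: "S0"},
    "S1": {0: "S2", 1: "S3"},
    "S2": {0: "S2", 1: "S4"},
    "S3": {0: "S4", 1: "S0"},
    "S4": {0: "S2", 1: "S3"},
}

def group(block_of):
    """Turn a state -> block-id map into a sorted list of sorted state lists."""
    blocks = {}
    for state, block in block_of.items():
        blocks.setdefault(block, []).append(state)
    return sorted(sorted(members) for members in blocks.values())

def reduce_states(output, nxt, inputs=(0, 1)):
    """Split presumed-equivalent states until no nonequivalence rule applies."""
    block_of = dict(output)   # rule 1: different outputs -> different blocks
    while True:
        # rule 2: states whose successors lie in different blocks are split
        signature = {s: (block_of[s],) + tuple(block_of[nxt[s][i]] for i in inputs)
                     for s in output}
        ids = {}
        new_block_of = {s: ids.setdefault(signature[s], len(ids)) for s in output}
        if len(ids) == len(set(block_of.values())):
            return group(new_block_of)   # nothing was split: partition is stable
        block_of = new_block_of

print(reduce_states(OUTPUT, NEXT))
```

For this machine the refinement stops with the three blocks {S0, S3}, {S1, S4} and {S2}, matching the final table above.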
While simple optimistic relaxation gives optimal reductions in the case of completely specified FSMs, optimal solutions to interesting variations of the FSM reduction problem are known to be computationally intractable. For example, optimal reduction of an incompletely specified FSM (one with don't cares, in the sense that any values are acceptable for certain outputs and/or transitions) is NP-hard. The development of good heuristic approaches to this and related optimization problems remains a topic of research.
8.6.6 Regular Expressions and Nondeterministic FSMs
Regular expressions are a commonly used notation for describing simple classes of strings of symbols. For the purpose of this subsection the following regular-expression syntax for describing strings of uppercase letters will be used:
1. Finite strings of symbols (letters), including the empty string (which will be written as ε), are regular expressions. Thus, A, ε and ABCAABCAAABB are valid regular expressions, each denoting a set containing only the specified string of zero or more letters
2. If p and q are regular expressions, then pq is a regular expression denoting the set of strings formed by concatenating a string from p with a string from q
3. If p and q are regular expressions, then p | q is a regular expression denoting the set of strings that includes both the strings denoted by p and the strings denoted by q. Thus A | B is a regular expression defining a set containing the strings A and B
4. If p is a regular expression, then (p) is a regular expression denoting the same set of strings; parentheses are used to disambiguate, for example to distinguish (AB) | C from A(B | C)
5. If p is a regular expression, then p∗ is a regular expression denoting all strings that are concatenations of finitely many (zero or more) strings denoted by p. Thus A∗ denotes the set of strings containing the empty string as well as every string consisting of finitely many As; A(A | B)∗B denotes the set of all strings of As and Bs that begin with A and end with B
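The five rules above can be tried out with Python's standard `re` module, whose pattern syntax contains concatenation, `|`, parentheses and `*` with the same meaning (it offers many further operators that are not used here). The helper `denotes` is a name introduced for this sketch; `re.fullmatch` is used so that the whole string, not just a prefix, has to be matched.

```python
import re

def denotes(pattern, string):
    """True if the whole string belongs to the set described by the pattern."""
    return re.fullmatch(pattern, string) is not None

# Rule 5: A* contains the empty string and every finite run of As.
print(denotes("A*", ""), denotes("A*", "AAA"), denotes("A*", "AB"))

# A(A|B)*B: strings of As and Bs that begin with A and end with B.
print(denotes("A(A|B)*B", "AABAB"), denotes("A(A|B)*B", "BA"))

# Rule 4: parentheses change the meaning of the alternative.
print(denotes("(AB)|C", "C"), denotes("A(B|C)", "AC"), denotes("(AB)|C", "AC"))
```

The last line shows the difference made by the parentheses: AC belongs to A(B | C) but not to (AB) | C.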
An interesting property of regular expressions is that each regular expression defines a set of strings that can be recognized by a finite state machine. It is assumed that the input to the FSM is a sequence of symbols (in this case, encoded uppercase letters) and that each consecutive input symbol can cause a transition from the current FSM state to a new state. At any time when the sequence of input symbols corresponds to a string to be recognized, the FSM is in a distinguished state marked R; several states may be marked in this way. The starting state will be marked S. The FSM of figure 8.59, for example, recognizes the strings B(AB)∗. Note that transitions corresponding to input strings that are not recognized (such as those containing the letter C) are omitted. The selected convention is that such strings cause implicit transitions to a BAD state, which causes the entire input sequence to be rejected.
Figure 8.59: Example FSM
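The recognizer convention described above can be sketched in a few lines of Python. The state name T and the transition table are assumptions made for illustration; they are one possible deterministic FSM for B(AB)∗ and are not copied from figure 8.59.

```python
# Hypothetical transition table of a deterministic recognizer for B(AB)*.
# S is the starting state, R the recognizing state, T an assumed helper state.
TRANSITIONS = {
    ("S", "B"): "R",  # the leading B reaches the recognizing state
    ("R", "A"): "T",  # an A must be followed by another B
    ("T", "B"): "R",
}


def recognizes(text: str) -> bool:
    """Return True if text belongs to the set denoted by B(AB)*."""
    state = "S"
    for symbol in text:
        # Every transition missing from the table leads implicitly to BAD.
        state = TRANSITIONS.get((state, symbol), "BAD")
        if state == "BAD":
            return False
    return state == "R"
```

Omitted transitions are modelled by the default value of the dictionary lookup, which mirrors the implicit BAD state of the text.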
Although every regular expression denotes a set of strings recognizable by an FSM, the systematic derivation of an FSM recognizer from a regular expression is not entirely trivial. A useful conceptual tool in dealing with regular expressions is the nondeterministic FSM (NFSM), whose state-transition diagram is ambiguous in the sense that it may indicate several possible transitions on a given input symbol. The simple NFSM in figure 8.60 recognizes the strings A | (AB).
Figure 8.60: Nondeterministic FSM
One can view the NFSM as being in several states simultaneously. Its behaviour can be emulated by hand, using tokens that are moved about on the state-transition diagram to record active states. One begins with a token on the starting state. At each input symbol, a token is placed on every state at the arrow end of a transition on that symbol from a state currently holding a token, and the previous tokens are removed. Note that at most one token has to be placed in each state. Whenever one or more states marked R contain a token, the input string is accepted (recognized) by the NFSM.
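The token procedure can be emulated directly with a set of active states. The following is a minimal sketch; the NFSM table is an assumed encoding of A | (AB) with a helper state T, not a transcription of figure 8.60.

```python
# Hypothetical NFSM for A | (AB): two transitions leave S on the symbol A.
NFSM = {
    ("S", "A"): {"R", "T"},  # ambiguous: recognize A directly, or expect a B
    ("T", "B"): {"R"},
}
RECOGNIZING = {"R"}


def run_tokens(nfsm, recognizing, text, start="S"):
    """Emulate the token model and report whether text is accepted."""
    tokens = {start}  # one token on the starting state
    for symbol in text:
        moved = set()
        for state in tokens:
            moved |= nfsm.get((state, symbol), set())
        tokens = moved  # the previous tokens are removed
    return bool(tokens & recognizing)
```

Because the active states are kept in a set, each state carries at most one token, as noted in the text.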
It is possible to construct a deterministic FSM that recognizes any regular expression, but the construction becomes cumbersome when an expression of the form α | β is encountered. In effect, the FSM under construction must entertain the two alternative forms α and β as possible inputs until some input symbol rules one or both forms out; this may require a number of states, each corresponding to some combination of a tentative parse of form α or an alternative parse of form β. In contrast, the NFSM provides direct accommodation for alternative input forms by means of ambiguous transitions. The dual paths between the S and R states of figure 8.60, for example, correspond directly to the alternative input forms A and AB.
As a further convenience in the construction of NFSMs from regular expressions, the use of transitions on the empty input string is allowed; such transitions are taken spontaneously by the NFSM. In the token model, whenever there is an empty transition from a state marked by a token, the target of the empty transition will be marked as well. Figure 8.61 shows how one might use empty transitions, designated by ε, to convert the A | (AB) NFSM, for example, to recognize (A | (AB))∗.
Figure 8.61: NFSM that recognizes strings of form (A | (AB))∗
Nondeterministic FSMs are, in an important sense, no more powerful than deterministic FSMs: The same set of strings (the ones that can be described by regular expressions) can be recognized by each. NFSMs, however, provide a primitive model for parallelism because of their ability to model several discrete states simultaneously. While NFSMs and FSMs perform the same computations, a deterministic FSM may require exponentially many states compared to the equivalent NFSM.
The nondeterministic FSM, although not directly realizable in hardware, can be an important tool in the synthesis of realizable deterministic FSMs that perform useful computations. The synthesis of an FSM to recognize strings described by the regular expression (A | (AB))∗, for example, might be approached by the straightforward synthesis of the NFSM of figure 8.61 followed by the derivation of an equivalent (but less intuitive) deterministic FSM using a computer-based algorithm.
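One common computer-based algorithm for this derivation is the subset construction, in which every state of the deterministic FSM stands for a set of simultaneously active NFSM states. The sketch below is an illustration only: it uses an assumed NFSM table for A | (AB) and, for brevity, does not handle empty (ε) transitions, so it does not cover the NFSM of figure 8.61 as it stands.

```python
from collections import deque


def subset_construction(nfsm, start, alphabet):
    """Return (dfsm, start_set); dfsm maps (frozenset, symbol) to a frozenset."""
    start_set = frozenset({start})
    dfsm = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            target = frozenset(
                nxt for state in current for nxt in nfsm.get((state, symbol), ())
            )
            if not target:
                continue  # no active state left: implicit transition to BAD
            dfsm[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfsm, start_set


# Hypothetical NFSM for A | (AB) with an assumed helper state T.
NFSM = {("S", "A"): {"R", "T"}, ("T", "B"): {"R"}}
dfsm, start_set = subset_construction(NFSM, "S", "AB")
```

A deterministic state recognizes the input whenever its set contains a state marked R. Since the deterministic states are subsets of the NFSM states, their number can grow exponentially in the worst case.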
8.6.7 Context
Finite-state machines are simultaneously a mathematical abstraction that has received considerable attention from theorists and a practical engineering tool of enormous consequence to the designer of digital systems. These roles are not independent; the formal study of FSMs has significantly enriched the repertoire of optimizations and techniques available to the engineer, while their practical significance stimulates continued attention by theorists.
Chapter 9
ASIC Design Concepts
9.1 ASIC Design Process
9.1.1 The VLSI Design Process as a Transformation from Higher to Lower Descriptive Levels
9.1.2 Phases of Electronic System Design
9.1.3 Application Architectural Properties
Data path oriented:
• Microprocessors, DSPs and Co-Processors
• Data are processed by a number of functional units interconnected by a word-sized data path
• Functional units: ALU, Multiplier, . . .
Control-dominated:
• Sequencers, Protocol Engines
• no arithmetic structures
• no or small data path
• decentralized
• set of coupled controllers
9.1.4 Synthesis Steps
1. Architectural Synthesis (= Behavioural Synthesis)
• translation of a source description into a data flow graph
• scheduling the events in the flow graph
• allocation of functional units in the machine
• binding the functional units to real components in a specific technology
2. Logic Synthesis
• translation of a register-transfer level description of a circuit into combinational logic and registers
• finite state machine synthesis
• technology-independent logic optimization
• mapping the result on a suitable target technology (Gate Arrays, Standard Cells, Sea of Gates, . . .)
• circuit retiming to meet performance requirements
3. Layout Synthesis
• module generators automatically generate a dense layout of specific modules
• typical modules: functional units of data paths (ALU, register, shifter, adder, . . .)
• greatest leverage for data path oriented design
• PLA-generators for control logic
• most useful in the design of application specific DSPs and generic components such as microprocessors
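The scheduling step of architectural synthesis can be illustrated with the simplest strategy, as-soon-as-possible (ASAP) scheduling, in which every operation is placed in the first control step after all of its predecessors. This is a sketch under assumptions: the data flow graph, the operation names and the unit delay per operation are invented for illustration, and the graph is assumed to be acyclic.

```python
def asap_schedule(predecessors):
    """Map each operation to its earliest control step (unit delay assumed)."""
    step = {}

    def visit(op):
        if op not in step:
            # An operation starts one step after its latest predecessor.
            step[op] = 1 + max((visit(p) for p in predecessors[op]), default=0)
        return step[op]

    for op in predecessors:
        visit(op)
    return step


# Hypothetical data flow graph for y = (a + b) * (c + d) - e
graph = {"add1": [], "add2": [], "mul": ["add1", "add2"], "sub": ["mul"]}
schedule = asap_schedule(graph)
```

Here both additions fall into the same control step, so allocation would have to provide two adders or the schedule would have to be relaxed.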
9.2 ASIC Design Styles
9.2.1 ASIC Technology Tree
9.3 Gate Arrays
9.3.1 Introduction to Gate Arrays
Gate Arrays (Masterslices):
• Prefabricated active elements (master)
• Construction of logic functions by personalization (wiring macros from a cell library, intra-cell routing)
• Connection of functional blocks by inter-cell routing in 1 . . . 3 layers + contact/via layers
• Arrangement of gate arrays:
– Row Structure
– Island Structure
– Matrix of structures (= sea of gates)
• Mixed analog/digital gate arrays
Figure 9.1: Gate array floorplan with row structure
Figure 9.2: Floorplan for a sea of gates array
9.3.2 IMI Grid Structure
Figure 9.3: IMI gate array structure
Fig. 9.3 shows the principal structure of the gate arrays of International Microcircuits Inc. (IMI) (single metal layer). The real circuit has 1440 cells. In the figure a reduced number of 40 cells is drawn in order to improve the clarity of the representation.
The gate array consists of the following elements:
• Pad (connection to outside world)
• Buffer devices (drive off-chip load capacitances)
• Distributed power and ground buses
• Underpasses to cross under the power and ground buses without contacting them
• Each point represents a contact (potential interconnection point)
From Fig. 9.4 the following features can be seen:
Figure 9.4: Corner of IMI gate array die
• Cells containing transistors are clustered around the VDD and VSS buses.
• In each cell four horizontal bars (crossing VDD and VSS) can be seen. The thick bar represents a poly underpass while the three thin bars are common poly input lines to an nMOS/pMOS transistor pair
• Between cell columns a column of short horizontal poly underpasses is placed
Figure 9.5: Grid representation of IMI gate array
Figure 9.6: Explanations of grid: (a) basic cell. (b) internal interconnects. (c) basic cell and crossover (poly) block. (d) XR = transistor. (e) crossover block interconnects
In Fig. 9.6 (b) the internal gate (long horizontal poly lines) and internal diffusion (short horizontal diffusion lines) are shown. From Fig. 9.6 (d) it can be seen that adjacent nMOS or pMOS transistors have a common drain/source connection. Contacts for the nMOS source and drain connections are at both sides of the VSS bus (the same holds for the pMOS transistors and the VDD bus).
Figure 9.7: Symbolic IMI cell structure representation
Figure 9.8: CMOS matrixcell
9.3.3 CDI Grid Structure
Figure 9.9: CDI single metal layer gate array structure
9.3.4 Gate Array Design Flow
Figure 9.10: Gate array design flow
9.3.5 Personalization Examples for IMI and CDI Gate Array
Figure 9.11: Personalization for inverter: (a) schematic. (b),(c) IMI layout. (d) CDI layout
Figure 9.12: NOR gate on IMI
Figure 9.13: Layout of transmission gates: (a) single TG. (b) pair of TGs with common output
9.3.6 Qualification of Gate Array Design Style
Advantages:
• Lower number of individual masks needed
• Higher number of pieces for uncustomized master (cost reduction)
• Many others for masters, second source fabrication, libraries and design systems
Disadvantages:
• Area overhead (by unused transistor cells)
• Overdimensioned routing channels
• Larger cell size
=⇒ Advantages dominate for smaller production volumes
9.3.7 Gate Array Market
Figure 9.14: Gate array market by process technology
Figure 9.15: Worldwide gate array market by user sector
9.4 Standard Cell Design
9.4.1 Introduction to Standard Cells
Figure 9.16: Circuit and corresponding standard cell
Figure 9.17: Standard cell scheme
Figure 9.18: Standard cell floorplan
Standard Cells:
• No prefabrication: all cell layouts from a system library
• Cells in rows: VDD/VSS lines connected by cell abutment; uniform cell height, variable width; I/O connections at top and bottom
• Cell rows alternating with routing channels
• Width of routing channel adaptable to design needs
• Crossing of cells possible: feed-through cells, electrically equivalent pins
9.4.2 Qualification of Standard Cell Design Style
Advantages:
• substantial saving of chip area compared to Gate Arrays (typically 40%)
• thereby reduction of fabrication costs per chip
• higher flexibility in cell design
Disadvantages:
• all masks are made individually for each design (high initial cost and turn-around time)
• very complex or large-area functional blocks like RAM, ROM or PLA cannot be inserted
=⇒ Advantages dominate with a higher number of pieces (> 10000)
9.4.3 Standard Cell Market
Figure 9.19: Standard cell market by process technology
Figure 9.20: Standard cell market by application
9.5 Macro Cell Concept
9.5.1 Introduction to the Macro Cell Concept
• Rectangular cells, any form and size
• Free cell arrangement
• Wiring channels between the cells
• Width of wiring channels according to routing needs
• Power/ground routing not separated from signal routing
Figure 9.21: Floor plan for macro cell design style (= building block approach)
9.6 Mixed Design Styles
9.6.1 Introduction: Mixed Design Styles
Figure 9.22: Mixed design style structures
9.6.2 Features of Mixed-Mode ASICs
• Mixed analog/digital macros
• EEPROM cells
• Power components:
– High-current analog buffer
– Power MOSFET driver
• ASIC-Hybrid combinations
• Subsystem Cells: 555 Timer, 4046 PLL, . . .
• SC-Filter
• Biquad units
• Temperature sensors
9.7 Programmable Logic Devices
9.7.1 Classical PLD Devices
• PROM (Programmable Read-Only Memory): device with a fixed AND array and a programmable OR array
  1. mask programmable
     + superior speed performance due to internal connections hardwired during manufacture
     + cheap at high volumes
     – can only be programmed by the manufacturer
     – development cycle = weeks or months
  2. field programmable
     + immediately programmable
     + at low volumes less expensive than mask-programmable devices
     – resistance of programmable routing switches lowers signal performance
• EPROM (Erasable Programmable Read-Only Memory)
• EEPROM (Electrically Erasable Programmable Read-Only Memory)
  =⇒ additional advantage of being erasable and re-programmable
=⇒ structures of PROMs are best suited for the implementation of memories
• PLA: AND array and OR array are both programmable
  product term sharing: every product term of the AND array can be connected to any of the OR output gates
• PAL: AND array is programmable and OR array has fixed connection points
  – combinational PAL devices: used for implementation of logic functions
  – sequential PAL devices: used for implementation of sequential logic (finite state machines)
  – arithmetic PAL devices: sums of product terms may be combined by EXOR gates at the input of the macrocell D flip-flop
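The AND/OR array structure of a PLA, including product term sharing, can be modelled by two small tables. This is a sketch for illustration; the data layout, the signal names a, b, c and the example personalization are assumptions and do not describe any particular device.

```python
def pla_eval(and_plane, or_plane, inputs):
    """Evaluate a PLA: and_plane lists product terms, or_plane lists outputs."""
    # A product term is true if every listed input has the required level.
    products = [
        all(inputs[name] == level for name, level in term.items())
        for term in and_plane
    ]
    # Each output is the OR of the product terms it is connected to.
    return [any(products[i] for i in row) for row in or_plane]


# Hypothetical personalization: f0 = a·b + c, f1 = a·b + (not a)·(not c)
AND_PLANE = [{"a": 1, "b": 1}, {"c": 1}, {"a": 0, "c": 0}]
OR_PLANE = [[0, 1], [0, 2]]  # product term 0 (a·b) is shared by both outputs
```

In a PAL the rows of `OR_PLANE` would be fixed by the manufacturer, so the shared term a·b would have to be programmed once per output.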
Figure 9.23: Combinational PAL devices: AMD 16L2
Figure 9.24: Sequential PAL devices: AMD PAL16R4
Figure 9.25: Arithmetic PAL devices: AMD PAL16A4
9.7.2 Advanced PLD Devices
• EPLD (Erasable Programmable Logic Devices)
• EEPLD (Electrically Erasable Programmable Logic Devices)
=⇒ these devices use EPROM cells or EEPROM cells instead of fuses as programmable connections
=⇒ tendency: instead of large global logic planes, a block-oriented architecture with local logic blocks and macrocells and an interconnection network between the blocks is used
Example: Altera EP1800
Figure 9.26: Advanced PLD devices: Altera EP1800
Figure 9.27: Local macro cell
Figure 9.28: Global macro cell
• each EP1800 quadrant contains 12 macrocells and has a local bus with 24 lines (for normal and inverted macrocell outputs) and a local clock
• the global bus has 64 lines and runs through all four quadrants: true and complement signals of 12 inputs (= 24 lines) + true and complement of 4 clocks (= 8 lines) + true and complement of the I/O pins of the 4 global macrocells in each quadrant (= 32 lines)
• macrocells: combinational or registered data output; the flip-flop is configurable (D, T, JK or SR)
Figure 9.29: Synchronous clock, output enabled by product term
Figure 9.30: Asynchronous clock, output permanently enabled
Example: Altera MAX7000 Family
Figure 9.31: Block diagram of MAX7000 family
Figure 9.32: MAX7000 macrocell
9.7.3 PLA-based Device Properties
1. Easy to map Espresso/MIS style logic into sum of products
2. Easy to route, very fast turnaround
3. Performance independent of netlist
4. Wide designer acceptance
5. Relatively mature technology, but some innovation still ongoing
9.8 Field Programmable Gate Arrays
9.8.1 The FPGA Concept
Figure 9.33: Principal FPGA structure
• Logic blocks
• Routing resources that can connect the logic blocks
The routing resources are both the greatest strength and the greatest weakness of FPGAs
9.8.2 FPGA Categories
1. Block organized, SRAM based (internal block structure not restricted to AND–OR)
• Xilinx
• Altera (FLEX)
• Plessey
• AT&T
• . . .
2. Cell organized, anti-fuse based
• Actel
• Quicklogic
• . . .
Figure 9.34: Four classes of commercially available FPGAs
9.8.3 Programming Technologies
Static RAM (SRAM) Programming Technology
• connection elements are controlled by SRAM cells
Figure 9.35: SRAM programming technology
Basic Technology Issues: SRAM
1. Function unit and routing are controlled by SRAM cells
2. These cells are located adjacent to the logic they control (not in a separate chip)
3. SRAM cells are configured at power-up and potentially reconfigured during operation
4. Configuration is a non-destructive process
5. SRAM cells are large (5 transistors), require connection to power, ground, data and select lines
6. . . . but they can be intimately intermixed with CMOS logic
7. SRAM memory design is highly refined
Anti-Fuse Programming Technology
• Anti-fuses are made with a modified CMOS process involving an extra step
• This step creates a very thin insulating layer that separates two conducting layers
• This insulator is penetrated by applying a high voltage to the two conducting layers (this process is not reversible)
• The programming voltage must be much higher than the logic threshold, otherwise the chip would program itself during operation
• Such high voltages can be destructive for CMOS logic circuitry
• Large isolation devices may be required to protect logic gates from the programming voltage
Actel PLICE Anti-Fuse Programming Technology
Figure 9.36: Actel PLICE anti-fuse structure
• Actel PLICE anti-fuses are programmed by placing a relatively high voltage (18 V) across the anti-fuse terminals; a driving current of about 5 mA heats and melts the dielectric and forms a conductive link between poly-Si and n+ diffusion
• bottom and top layer of the anti-fuse are connected to metal; the overall resistance of a programmed anti-fuse (from metal to metal) is about 300 Ω – 500 Ω
• manufactured by 3 additional masks to a normal CMOS process
Quicklogic ViaLink Anti-Fuse Programming Technology
Figure 9.37: Quicklogic ViaLink Anti-Fuse
• amorphous silicon antifuse
• a low resistance path (80 Ω) between two metal wires is created by a 10 V programming voltage at the terminals of the anti-fuse
EEPROM Programming Technology
Figure 9.38: EEPROM programming technology
• used in FPGAs (PLDs) manufactured by Altera and Plus Logic
• static charge on floating gate turns the transistor permanently off
• EPROM transistors are used to pull bit lines to ground
• disadvantage of EPROM technology: static power dissipation
9.8.4 Overview: Commercially Available FPGAs
Company     General Architecture  Logic Block Type                   Programming Technology
Xilinx      Symmetrical Array     Look-up Table                      Static RAM
Actel       Row-based             Multiplexer-Based                  Anti-Fuse
Altera      Hierarchical PLD      PLD Block                          EPROM/SRAM
Plessey     Sea-of-gates          NAND-gate                          Static RAM
Plus        Hierarchical PLD      PLD Block                          EPROM
AMD         Hierarchical PLD      PLD Block                          EEPROM
Quicklogic  Symmetrical Array     Multiplexer-Based                  Anti-Fuse
Algotronix  Sea-of-gates          Multiplexers and Basic Gates       Static RAM
Concurrent  Sea-of-gates          Multiplexers and Basic Gates       Static RAM
Crosspoint  Row-based             Transistor Pairs and Multiplexers  Anti-Fuse
9.8.5 Xilinx Architecture
Figure 9.39: General architecture of XILINX FPGAs
Series   Number of CLBs  Equivalent Gates
XC2000   64 . . . 100    1200 . . . 1800
XC3000   64 . . . 320    2000 . . . 9000
XC4000   64 . . . 900    2000 . . . 20000
Figure 9.40: Xilinx XC4000 CLB
• two-stage look-up tables: two functions of four variables or one function of five variables can be implemented
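A look-up table implements a logic function by storing its truth table in configuration memory and using the inputs as the read address. The following sketch models a single 4-input table; the helper names `make_lut` and `lut_read` and the bit ordering are assumptions for illustration and do not describe the actual XC4000 configuration format.

```python
def make_lut(function, n_inputs):
    """Fill the 2**n_inputs configuration bits from a Boolean function."""
    table = []
    for index in range(2 ** n_inputs):
        # Input k is taken from bit k of the table address.
        bits = [(index >> k) & 1 for k in range(n_inputs)]
        table.append(int(bool(function(*bits))))
    return table


def lut_read(table, *bits):
    """Read the stored function value; the inputs form the address."""
    index = sum(bit << k for k, bit in enumerate(bits))
    return table[index]


# Example: 4-input parity stored in a 16-bit table
parity4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
```

Any function of four variables fits into the same 16 bits, which is why the delay of such a block does not depend on the function implemented.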
Figure 9.41: Xilinx XC4000 single length lines
• XC4000 routing architecture: Single-length Lines and Double-length lines
• high CLB connectivity to wiring segments
Figure 9.42: Xilinx XC4000 double length lines and long lines
9.8.6 Actel Architecture
Figure 9.43: General architecture of Actel FPGAs
Series  Number of LMs     Equivalent Gates
Act-1   295 . . . 546     1200 . . . 2000
Act-2   430 . . . 1232    6250 . . . 20000
• rows of programmable Logic Modules (LM)
• horizontal routing channels between rows
Figure 9.44: Act-1 logic module
Figure 9.45: Act-1 programmable interconnection architecture
Figure 9.46: Act-2 logic cells
9.8.7 CAD for FPGAs
Figure 9.47: FPGA CAD
Figure 9.48: The Xilinx design flow
9.8.8 Economical Considerations
Figure 9.49: Cost per Chip (Dollars)
Economics and Performance of FPGAs compared to MPGAs:
FPGA:
+ no overhead cost ⇒ less cost intensive for low volumes
+ short turnaround time ⇒ short time to market
+ high designers flexibility (short turnaround time), low redesign costs
– relatively low speed of operation caused by the resistance and capacitance of programmable switches in the routing network
– decreased logic density: programmable switches and configuration network require chip area
MPGA:
+ low per chip costs at high volumes
+ metal connection layers hardwired during fabrication ⇒ fast operation
+ high logic density
– very high costs for low volumes (for example prototypes)
– no redesign flexibility
9.9 Overview on Logic Design Alternatives
Figure 9.50: Logic design alternatives
Figure 9.51: Relative merits of various ASIC implementation styles
Chapter 10
Arithmetic Units
In the following chapter, basic arithmetic units like adders, subtracters, or multipliers are discussed. These components are widely used in VLSI circuits, e. g. for the digital signal processing application domain. More detailed descriptions of arithmetic units can be found e. g. in [13] or [3].
10.1 Adders / Subtracters
10.1.1 Basic Adder Cells
Half Adder The circuit realizing the function
\[ C = A_1 A_2 \tag{10.1} \]
\[ S = A_1 \oplus A_2 \tag{10.2} \]
is called half–adder and can be used to calculate the sum $S$ of two bits $A_1$ and $A_2$. A possible carry is set at the $C$ output.
Full Adder For adding binary numbers having a bitwidth of more than one single bit, the concept of the half–adder has to be extended. The carry outputs of less significant bits in the addition process have to be taken into account in the more significant bits. For that, a new circuit structure called full–adder is used which is based on the following functional equations:
\[ C_{out} = C_{in}(A_1 + A_2) + A_1 A_2 \tag{10.3} \]
\[ S_{out} = A_1 \oplus A_2 \oplus C_{in} \tag{10.4} \]
These equations can be realized either by logic gates (AND, OR, XOR) or by two half–adders and an OR gate.
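The realization with two half–adders and an OR gate can be checked against equations (10.3) and (10.4) with a short bit-level model. This is a minimal sketch; the function names are chosen for illustration.

```python
def half_adder(a1, a2):
    """Return (C, S) according to equations (10.1) and (10.2)."""
    return a1 & a2, a1 ^ a2


def full_adder(a1, a2, cin):
    """Full-adder built from two half-adders and an OR gate; returns (Cout, Sout)."""
    c1, s1 = half_adder(a1, a2)     # first half-adder adds the operand bits
    c2, sout = half_adder(s1, cin)  # second half-adder adds the carry input
    return c1 | c2, sout            # the two carries are never 1 at the same time
```

The OR gate is sufficient because c1 = 1 forces s1 = 0 and therefore c2 = 0.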
10.1.2 Adders / Subtracters for Binary Coded Integers
The following section introduces the basic arithmetic components used in VLSI designs. First, adder and subtracter architectures are discussed. Since addition and subtraction for binary numbers can be calculated by almost the same hardware (by selecting the appropriate complement representation first), the term “adder” is used as a synonym for both adder and subtracter in the following section.
Serial Adders
The principle of serial adders is shown in Fig. 10.1:
[Figure: two n-bit shift registers hold operand A and operand B and drive the X and Y inputs of a full–adder; its S output is shifted into the n-bit sum shift register, and its Cout output is fed back to Cin through the carry register]
Figure 10.1: Serial adder principle
At the beginning of the operation, the two n–bit operands A and B are loaded into the shift registers. The carry register is cleared or set to the value of the carry input. During the next n clock cycles (if a wordlength of n bits for each operand is assumed), the operands are added bitwise in the full–adder and the result is stored in the sum register. For that, the operand shift registers apply the least significant bit to the full–adder inputs, whereas the sum shift register reads the current sum output of the full–adder at its serial input and shifts its contents by one bit to the right each clock cycle. The carry output of an addition is stored in the carry register for use in the next clock cycle. The n-bit sum and the carry output are available after (n+1) clock cycles [1 operand load, n calculation].
The serial adder has the smallest hardware complexity, which is independent of the wordlength (if the shift registers are not considered), but requires the highest computation time of all adder implementations.
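The cycle-by-cycle behaviour can be illustrated with a small behavioural model in which one loop iteration corresponds to one calculation clock cycle. This is a sketch only; the shift registers are modelled by bit selection on Python integers rather than by explicit register objects.

```python
def serial_add(a, b, n, cin=0):
    """Add two n-bit operands bit-serially, least significant bit first."""
    carry = cin  # carry register, cleared or set to the carry input
    total = 0    # contents of the sum shift register
    for cycle in range(n):  # n calculation cycles
        x = (a >> cycle) & 1  # bit delivered by the operand A shift register
        y = (b >> cycle) & 1  # bit delivered by the operand B shift register
        s = x ^ y ^ carry                    # full-adder sum output
        carry = (x & y) | (carry & (x | y))  # full-adder carry, kept for next cycle
        total |= s << cycle
    return total, carry
```

Only one full–adder is evaluated per iteration, which reflects the small hardware effort and the computation time that grows linearly with n.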
Parallel Adders
Ripple Carry Adder Chained full–adders which form an adder of the required wordlength are called ripple carry adder since during addition the carry “ripples” through the whole chain from the least significant to the most significant bit as shown in Fig. 10.2:
The addition time is therefore dependent on the wordlength of the operands.
[Figure: chain of n full–adders with inputs A[i] and B[i] and outputs Sum[i]; the carry output Cout[i] of each stage feeds the next more significant stage, from Cin at bit 0 to Cout at bit n-1]
Figure 10.2: Ripple carry adder principle
Carry Lookahead Adder To speed up the addition process, lookahead methods can be applied to reduce the time associated with carry propagation. The carry input of a stage i is calculated directly from the inputs of the preceding stages i − 1, i − 2, . . . , i − k rather than allowing carries to ripple from stage to stage. To perform that task, the carry outputs of ordinary full–adders are replaced by the generate and propagate signals defined by
\[ g_i = a_i b_i \tag{10.5} \]
\[ p_i = a_i + b_i. \tag{10.6} \]
The carry input signal of stage i + 1 is defined by the equation
\[ c_{in,i+1} = c_i = g_i + p_i c_{i-1} \tag{10.7} \]
and by recursive substitution in an example of a 4 bit adder
\begin{align}
c_0 = c_{in,1} &= g_0 + p_0 c_{in} \tag{10.8} \\
c_1 = c_{in,2} &= g_1 + p_1 g_0 + p_1 p_0 c_{in} \tag{10.9} \\
c_2 = c_{in,3} &= g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_{in} \tag{10.10} \\
c_3 = c_{out} &= g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_{in}. \tag{10.11}
\end{align}
As can be seen in the equations above, the carry lookahead logic circuits can be realized by a two level logic implementation; that means the whole addition is performed in constant time (without influence of the wordlength). The implementation of the carry lookahead corresponding to the above equations is shown in Fig. 10.3.
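Equations (10.5) to (10.11) can be written down directly as a bit-level model of a 4 bit adder. The sketch below is an illustration; the function name `cla4` and the integer encoding of the operands are assumptions.

```python
def cla4(a, b, cin):
    """4-bit carry lookahead addition following equations (10.5)-(10.11)."""
    abits = [(a >> i) & 1 for i in range(4)]
    bbits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(abits, bbits)]  # generate signals (10.5)
    p = [x | y for x, y in zip(abits, bbits)]  # propagate signals (10.6)
    # Two-level carry equations (10.8)-(10.11); & binds tighter than |.
    c0 = g[0] | p[0] & cin
    c1 = g[1] | p[1] & g[0] | p[1] & p[0] & cin
    c2 = g[2] | p[2] & g[1] | p[2] & p[1] & g[0] | p[2] & p[1] & p[0] & cin
    c3 = (g[3] | p[3] & g[2] | p[3] & p[2] & g[1] | p[3] & p[2] & p[1] & g[0]
          | p[3] & p[2] & p[1] & p[0] & cin)
    carries_in = [cin, c0, c1, c2]  # carry input of each bit position
    total = 0
    for i in range(4):
        total |= (abits[i] ^ bbits[i] ^ carries_in[i]) << i
    return total, c3
```

Every carry is computed from the operand bits and cin alone, so no carry depends on another carry. The longest product term has five inputs, which hints at why the gate fan-in limits the wordlength of a single lookahead block.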
[Figure: four full–adders for bits 3 . . . 0 with inputs A[i], B[i] and outputs Sum[i]; the signals g[3:0], p[3:0] and Cin feed the carry lookahead circuit, which supplies the carry inputs Cin[3] . . . Cin[0] of the adders and Cout]
Figure 10.3: Carry lookahead adder for 4 bits
The number of gate inputs is restricted due to technological constraints. That means the wordlength of a carry lookahead adder cannot be increased arbitrarily. For that reason, adders for a large wordlength are split into smaller groups processed by single carry lookahead adders with reasonable wordlengths as shown in Fig. 10.4.
[Figure: four 4 bit CLA adders for A[15:12]/B[15:12], A[11:8]/B[11:8], A[7:4]/B[7:4] and A[3:0]/B[3:0] producing Sum[15:12] . . . Sum[3:0]; the group carries C[3], C[7] and C[11] connect neighbouring groups, Cin enters the least significant group and C[15] is Cout]
Figure 10.4: Clustered carry lookahead adder for 16 bits
The carry signal produced by a group is forwarded to the next group so that, if the group is considered as a single block, the carry ripples through the different blocks as in the carry ripple adder. Alternatively, a hierarchical approach might be chosen in which a group-generate as well as a group-propagate signal are generated for each group and evaluated by a second level carry lookahead circuit.
Carry Select Adder In the following adder type, the wordlength of the operands is again subdivided into clusters (see Fig. 10.5). The cluster subwordlength is chosen to balance the time required for intra-cluster carry ripple additions and carry calculation of the preceding clusters. The additions are all performed in parallel for the following two cases: the carry input of a cluster is ’0’ and the carry input is ’1’. The results (cluster carry out and partial sum C/Sum[i : j]) are forwarded to multiplexors which select the appropriate value depending on the carry output of the preceding stages. Since the time to switch a multiplexor is almost negligible compared to the time required for the carry ripple additions, the overall addition time is almost independent of the wordlength.
[Diagram: the least significant cluster A[3:0]/B[3:0] uses one 4-bit carry ripple (CR) adder fed by Cin. Each of the clusters [7:4], [11:8] and [15:12] uses two 4-bit CR adders, one with carry-in 0 and one with carry-in 1, producing C/Sum0[i:j] and C/Sum1[i:j]. Multiplexors controlled by C[3], C[7] and C[11] select Sum[7:4], Sum[11:8], Sum[15:12] and Cout.]
Figure 10.5: Carry select adder for 16 bits
Since the carry select adder requires two carry ripple adder chains for each cluster (except the least significant one), the hardware cost is almost twice that of a simple ripple carry adder. It is slower than a carry lookahead adder, but it is more regular and therefore better suited for VLSI implementation.
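A behavioural sketch of the carry select principle, assuming 4-bit clusters as in Fig. 10.5. The function names are illustrative only. In hardware both ripple additions of a cluster run concurrently, whereas the model simply evaluates them one after the other.

```python
def ripple_add(a, b, cin, width):
    """Carry ripple adder built from full adders; returns (sum, carry_out)."""
    s, c = 0, cin
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        s |= (x ^ y ^ c) << i
        c = (x & y) | (c & (x ^ y))
    return s, c

def carry_select_add(a, b, cin=0, width=16, cluster=4):
    """Add two width-bit words cluster by cluster; returns (sum, carry_out)."""
    mask = (1 << cluster) - 1
    total, carry = 0, cin
    for k in range(0, width, cluster):
        ak, bk = (a >> k) & mask, (b >> k) & mask
        if k == 0:
            # The least significant cluster needs only one adder chain.
            s, c = ripple_add(ak, bk, carry, cluster)
        else:
            s0, c0 = ripple_add(ak, bk, 0, cluster)  # assumes carry-in 0
            s1, c1 = ripple_add(ak, bk, 1, cluster)  # assumes carry-in 1
            # Multiplexor: select by the carry of the preceding cluster.
            s, c = (s1, c1) if carry else (s0, c0)
        total |= s << k
        carry = c
    return total, carry
```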
Carry Save Adder For the addition of many addends (e. g. in parallel multipliers), the time required for full carry propagation might be too high for some applications, even if carry lookahead adders are used. To achieve a constant addition time per stage, carries are not propagated within a stage. Instead, both the S vector and the Cout vector are connected to the appropriate adders of the succeeding stage. This concept requires a final addition to merge the sum vector and the carry vector of the final stage into a single sum vector, which can be realized using any of the adders discussed above (in Fig. 10.6 a carry ripple adder has been chosen for simplicity). In a carry save adder, the adder delay increases by one full-adder delay for each additional operand.
[Diagram: a carry save adder array built from rows of full adders takes the operands V[n-1:0], W[n-1:0], X[n-1:0] and Y[n-1:0]. It is followed by a final carry propagation stage, i.e. the stages required to evaluate the carry outputs of the preceding stages, which delivers Sum[0] ... Sum[n+1] and Cout.]
Figure 10.6: Carry save adder for summation of 4 operands (V, W, X, Y)
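The carry save principle can be modelled on whole words with bitwise operations. The sketch below is a minimal illustration with invented function names: one call of `carry_save_stage` corresponds to one row of full adders, and the final `+` stands for the carry propagating adder of the last stage.

```python
def carry_save_stage(x, y, z):
    """One row of full adders without carry propagation.

    Every bit position adds three bits; the sum bits stay in place and the
    carry bits are handed one position to the left for the next stage.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c

def carry_save_sum(operands):
    """Sum any number of non-negative integers with carry save stages."""
    operands = list(operands)
    s = operands[0] if operands else 0
    c = operands[1] if len(operands) > 1 else 0
    for op in operands[2:]:
        s, c = carry_save_stage(s, c, op)  # one extra full-adder delay per operand
    return s + c  # final carry propagation
```

For the four operands of Fig. 10.6 the loop runs twice, so two carry save stages precede the final addition.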
10.2 Multipliers
Shift and Add Multiplier The most common multiplier is the Shift and Add Multiplier (SAA Mult.). Two binary unsigned integer words X and Y of bit-size $N_x$ and $N_y$, respectively, can be written using their binary representation:
\[ X = \sum_{i=0}^{N_x-1} x_i 2^i \qquad Y = \sum_{j=0}^{N_y-1} y_j 2^j \tag{10.12} \]
The product Z = X ∗ Y can now be computed:
\[ Z = \sum_{i=0}^{N_x-1} x_i Y 2^i = (\ldots((x_{N_x-1} Y)\,2 + x_{N_x-2} Y)\,2 + \ldots)\,2 + x_0 Y \tag{10.13} \]
The following recurrence can be derived from formula 10.13:
\[ D_0 = 0 \qquad D_{i+1} = D_i\, 2^{-1} + x_i Y \qquad Z = D_{N_x}\, 2^{N_x-1} \tag{10.14} \]
In each step of the recurrence one bit of X is multiplied (a simple AND operation) with Y and added to the intermediate result $D_i$, which is shifted by one bit. Figure 10.7 shows the general structure of the Shift and Add multiplier with bit-sizes $N_x$ and $N_y$.
Figure 10.7: Structure of SAA multipliers
For this multiplier type it takes $N_x$ clock cycles to complete the multiplication, since one bit of X is processed in each step. The delay of the combinatorial circuit (which determines the maximum clock frequency) is approximately $N_y \delta_{FA}$, where $\delta_{FA}$ is the delay of a full adder; register delays are not considered.
The cost of a Shift and Add Multiplier is $(3N_y + 2N_x)\gamma_{FA}$ (the cost of a full adder, $\gamma_{FA}$, is assumed to be equal to the cost of a register).
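The recurrence 10.14 can be tried out in software. The sketch below is an assumption-laden model rather than a description of Fig. 10.7: to stay with integers it keeps the register content scaled by $2^{N_x-1}$, so the division by two becomes a right shift and the final scaling of 10.14 is already contained in the result.

```python
def shift_add_multiply(x, y, nx):
    """Multiply unsigned x (nx bits) by y following recurrence 10.14.

    r holds D_i * 2**(nx - 1). Each loop pass models one clock cycle:
    shift the intermediate result right by one bit and add x_i * Y.
    """
    r = 0
    for i in range(nx):
        xi = (x >> i) & 1                      # bit of X processed in this cycle
        r = (r >> 1) + ((xi * y) << (nx - 1))  # the AND with x_i selects Y or 0
    return r
```

The loop runs exactly `nx` times, which mirrors the $N_x$ clock cycles stated above.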
Carry Save Multiplier In contrast to the SAA multiplier, the Carry Save Multiplier (CSM) calculates the result in one step. Every bit of the first argument is multiplied with every bit of the second argument concurrently. The results are added up according to the position of the source bits.
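The CSM consists of combinatorial logic only. The multiplication of two 4-bit binary numbers can be written as
\[
\begin{array}{cccccccc}
    &        &        &        & X_3    & X_2    & X_1    & X_0    \\
    &        &        &        & Y_3    & Y_2    & Y_1    & Y_0    \\ \hline
    &        &        &        & P_{30} & P_{20} & P_{10} & P_{00} \\
    &        &        & P_{31} & P_{21} & P_{11} & P_{01} &        \\
    &        & P_{32} & P_{22} & P_{12} & P_{02} &        &        \\
    & P_{33} & P_{23} & P_{13} & P_{03} &        &        &        \\ \hline
Z_7 & Z_6    & Z_5    & Z_4    & Z_3    & Z_2    & Z_1    & Z_0
\end{array}
\]
where $P_{ij} = X_i \wedge Y_j$. The addition of all $P_{ij}$ terms can be done in an array of full adders.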
Figure 10.8 shows the general structure of a Carry Save Multiplier assuming $N_x \geq N_y$. Part II is omitted if $N_x$ and $N_y$ are equal. The Carry In of each full adder is supplied at the upper right corner. Not every full adder needs a Carry In; for some positions half adders are sufficient. The adder Carry Out is depicted at the lower left corner.
Figure 10.8: Structure of CSM multipliers
The delay of this type of multiplier is $(N_x + N_y - 2)\delta_{FA}$. The cost is $(N_x - 1)N_y\gamma_{FA}$ plus $(2N_y + 2N_x)\gamma_{FA}$ if the X, Y and Z registers are accounted for as in the shift and add case above.
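To make the partial product scheme and the cost figures concrete, the following sketch forms all $P_{ij}$ terms and evaluates the delay and cost expressions quoted in the text for both multiplier types. The function names are illustrative. The total SAA time is an approximation derived here as $N_x$ cycles of about $N_y \delta_{FA}$ each, with register delays ignored as in the text.

```python
def partial_products(x, y, nx, ny):
    """P[i][j] = X_i AND Y_j for all bit pairs."""
    return [[((x >> i) & 1) & ((y >> j) & 1) for j in range(ny)]
            for i in range(nx)]

def array_product(x, y, nx, ny):
    """Add every partial product with the weight 2**(i + j) of its position."""
    p = partial_products(x, y, nx, ny)
    return sum(p[i][j] << (i + j) for i in range(nx) for j in range(ny))

def csm_delay(nx, ny):
    """Carry save multiplier delay in units of the full-adder delay."""
    return nx + ny - 2

def csm_cost(nx, ny):
    """Carry save multiplier cost in units of the full-adder cost."""
    return (nx - 1) * ny + 2 * ny + 2 * nx

def saa_total_delay(nx, ny):
    """Shift and add multiplier: nx cycles of roughly ny full-adder delays."""
    return nx * ny

def saa_cost(nx, ny):
    """Shift and add multiplier cost in units of the full-adder cost."""
    return 3 * ny + 2 * nx
```

For two 8-bit operands these expressions give a cost of 88 versus 40 units and a delay of 14 versus roughly 64 full-adder delays, which shows the area/time trade-off between the two architectures.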
Block Multiplier A combination of the fully parallel Carry Save Multiplier and the serial Shift and Add Multiplier leads to a flexible architecture which can be configured from working fully serial to working fully parallel. Many combinations in between are possible, thus allowing the adaptation to given specifications and restrictions.
The basic idea of the block multiplication is to divide each argument into blocks of the same size. Each block of the first argument is multiplied with each block of the second argument in a fast Carry Save Multiplier. All calculated block products are added up taking into account the positions of the current argument blocks. Therefore, as in the Shift and Add Multiplier, the arguments and the intermediate result have to be shifted in an appropriate way.
[Diagram: the X register and the Y register supply blocks of $n_x$ and $n_y$ bits to the Carry Save Multiplier. Its output of $n_x + n_y$ bits is added to the content of the Z register by the Adder. A Controller drives the registers.]
Figure 10.9: Architecture of the block multiplier
Figure 10.9 shows the architecture of the block multiplier. The argument registers and the Carry Hold Register are simple shift registers. The intermediate result has to be shifted in both directions, thus requiring a bidirectional shift register. Signals for controlling the shift directions are generated by a controller, which can be realized using a simple counter.
The multiplier can be configured by varying the block sizes of the arguments. With increasing block sizes the multiplier becomes more parallel, thus reducing the number of clock cycles needed to perform a multiplication. Larger block sizes, however, require a larger Carry Save Multiplier, which increases the area needed to realize the multiplier. Assuming that the first argument is separated into $k_x$ blocks of size $n_x$ and the second argument into $k_y$ blocks of size $n_y$, the multiplier needs $k_x \cdot k_y$ clock cycles to perform a multiplication. The delay of the multiplier is determined by the size of the ripple carry adder, which has a width of $n_x + n_y$ bits.
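A behavioural sketch of the block multiplication, with invented names and under the assumption that the block product itself may be modelled by an ordinary integer multiplication (it stands in for the Carry Save Multiplier). The explicit shift by the block positions replaces the register shifting of Fig. 10.9.

```python
def block_multiply(x, y, kx, nx, ky, ny):
    """Multiply x (kx blocks of nx bits) by y (ky blocks of ny bits).

    Returns (product, clock_cycles); one block product is formed per cycle.
    """
    mask_x, mask_y = (1 << nx) - 1, (1 << ny) - 1
    z, cycles = 0, 0
    for i in range(kx):
        xb = (x >> (i * nx)) & mask_x
        for j in range(ky):
            yb = (y >> (j * ny)) & mask_y
            block_product = xb * yb                   # done by the CSM in hardware
            z += block_product << (i * nx + j * ny)   # weight of the two blocks
            cycles += 1
    return z, cycles
```

With `kx = ky = 1` the model corresponds to the fully parallel case and needs a single cycle; with one-bit blocks of X it degenerates into the serial shift and add scheme.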
Chapter 11
Microarchitectures
The term microarchitecture describes the domain between the macroarchitecture (the lowest-level hardware visible to the user) and the implementation technology (MOS VLSI) [27]. For better analysis, microarchitectures are usually divided into 3 parts: the data path, which performs the data manipulations and calculations; the control path, which applies the correct sequences of control signals to the data path; and the input/output unit, which provides access from/to the external world (see Fig. 11.1).
[Diagram: blocks Data Path, Control Path and Input/Output Path. Control signals run from the control path to the data path, status flags run back, and the input/output path carries the external I/O data.]
Figure 11.1: Microarchitecture blocks
The control path, which can be interpreted as a more or less complex finite state machine (FSM), can be either hardwired (used in fixed applications like a controller for the serial adder in Fig. 10.1) or programmable (microprocessor with downloadable microcode). The microarchitecture scheme as shown in Fig. 11.1 can represent quite simple circuits (like a traffic light controller) as well as complex microprocessors.
11.1 Datapath Design
In the datapath of a microarchitecture, the operations and data manipulations are performed. For that, control signals are generated by the control path depending on the operation(s) to be executed. By forwarding information about the status of the data path (e. g. exceptional conditions, underflow, overflow, division by zero, . . . ), the data path enables the control path to react correctly to the actual needs. The state signals (flags) can be used to enable conditional branching depending on the state of the data path. Data processing is usually performed by typical components like ALUs, shifters, register files, . . . .
The following section shows how datapath structures are usually implemented in larger VLSI designs. For that, we assume the following simple datapath structure:
[Diagram: inputs Ain, Bin and Cin; control signals Clock, OP-Sel, Sel and Shift; status signals carrying the status flags; output Rout.]
Figure 11.2: Datapath example
The datapath consists of 2 input registers for the input operands Ain and Bin, an arithmetic-logic unit (ALU), a multiplexor to select between the Cin input and the ALU output, a shifter unit, and an output register. The datapath structure could be implemented based on standard cells, where basic library cells (like gates, muxes, registers, . . . ) are selected and interconnected, or, if a datapath compiler is used, based on a set of several layout tiles as shown in Fig. 11.3.
A datapath compiler creates a regular layout depending on the wordlength of the operands by stacking the appropriate number of tiles in the layout. The horizontal structure consisting of a set of tiles performing all functions for a single bit is called a bit slice. If we apply vertical cuts to the layout structure, the whole layout is subdivided into layout blocks, each corresponding to a single implemented function. These layout stripes are called functional slices.
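The behaviour of the example datapath can be sketched as a small Python model. Several details are assumptions made only for illustration, since the text does not define them: the operation names ("add", "and"), the shift modes ("none", "left", "right"), the two status flags, and the rule that the output register captures the result computed from the register contents of the previous cycle.

```python
from dataclasses import dataclass

@dataclass
class Datapath:
    """Behavioural model: two input registers, ALU, multiplexor, shifter, output register."""
    width: int = 8
    areg: int = 0
    breg: int = 0
    rreg: int = 0

    def clock(self, ain, bin_, cin, op_sel, sel, shift):
        """Evaluate the combinational logic, then load all registers (one clock edge)."""
        mask = (1 << self.width) - 1
        # ALU operates on the current contents of the input registers.
        if op_sel == "add":
            alu = self.areg + self.breg
        elif op_sel == "and":
            alu = self.areg & self.breg
        else:
            raise ValueError(f"unknown operation: {op_sel}")
        carry = (alu >> self.width) & 1
        alu &= mask
        # Multiplexor: Cin input (sel = 1) or ALU output (sel = 0).
        mux = (cin & mask) if sel else alu
        # Shifter.
        if shift == "left":
            shifted = (mux << 1) & mask
        elif shift == "right":
            shifted = mux >> 1
        elif shift == "none":
            shifted = mux
        else:
            raise ValueError(f"unknown shift mode: {shift}")
        # Clock edge: all registers load at the same time.
        self.rreg = shifted
        self.areg, self.breg = ain & mask, bin_ & mask
        return {"carry": carry, "zero": int(alu == 0)}
```

A controller model would call `clock` once per cycle with the control signals of the current step and use the returned flags for conditional branching.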
[Diagram: the functional slices AReg, BReg, ALU, MUX, Shifter and RReg run across the bit slices Bit[0], Bit[1], ..., Bit[n-1], together with control signal buffers and status buffers.]
Figure 11.3: Corresponding layout scheme
11.1.1 Bit-slice ALU AMD 2901
As an example of a discrete datapath implementation, the 2901 bit-slice is discussed in the following section (→ [10]).
The 2901 integrated circuit contains, besides a 16-word register set, a Q register (used within add-shift multiplications or divisions), an arithmetic-logic unit (ALU), a shifter, and an instruction decoder (see Fig. 11.4). All operations and the registers are designed for 4-bit operands. The set of instructions which can be executed by the 2901 IC is shown in Fig. 11.5. The instructions are encoded in a 9-bit I vector which is provided by an external microcode controller. The first of these tables shows the selection of the sources for both ALU inputs (R and S), the second lists the ALU functions, and the third indicates the destination of the ALU results.
Figure 11.4: 2901 4-bit ALU slice
Figure 11.5: 2901 µ-OPs
To form an ALU for wordlengths with multiples of 4 bits, the 2901 ICs can be cascaded as shown in Fig. 11.6. In the example, a simple carry propagation scheme has been selected. As an additional option, carry-lookahead circuits (AMD 2902) could be used to enhance the speed of carry propagation.
Figure 11.6: 16-bit bit-sliced ALU
The 2901 IC has been widely used for applications in digital signal processing and for minicomputers. It is available as a stand-alone IC, and some silicon manufacturers also provide macrocells with the functionality of the 2901 (for different wordlengths) that can be included in ASIC designs.
11.2 Controller Implementations
Controllers are used to apply a sequence of control signals to the datapath components. These control signals are chosen to perform the desired operation(s) within the datapath. The datapath is able to interact with the controller unit by sending appropriate status signals (e. g. overflow flag when an addition is performed, equal flag as a result of a comparison, . . . ). The controller can be designed to change the sequence of control signals depending on these flags (used e. g. in microprocessors to perform conditional branches).
The general structure of such a controller can be found in Fig. 11.7.
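[Diagram: a combinational logic block receives the environmental inputs and the content of the state register; it drives the control outputs and the next content of the state register.]
Figure 11.7: Basic controller structure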
It consists of a combinational logic block and a register. From the input signals (e. g. an instruction word defining the sequence of control signals to be generated, state flags, . . . ) and parts of the previous register content, the combinational logic block generates the control output signals as well as the information about which step in the sequence of control signals is to be executed in the next cycle. The controller can be seen as a realization of the abstract model of a finite state machine.
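The structure of Fig. 11.7 maps directly onto a few lines of Python: a pure function plays the role of the combinational logic and a closure variable plays the role of the state register. The traffic light table is a made-up example with hypothetical state and output names; it is not taken from the course material.

```python
def make_controller(logic, initial_state):
    """Return a step function: one call models one clock cycle of the controller."""
    state = initial_state

    def step(inputs):
        nonlocal state
        outputs, next_state = logic(state, inputs)  # combinational logic block
        state = next_state                          # state register loads at the clock edge
        return outputs

    return step

def traffic_logic(state, inputs):
    """Hypothetical combinational logic: (state, inputs) -> (control outputs, next state)."""
    table = {
        "red": ("stop", "green"),
        "green": ("go", "yellow"),
        "yellow": ("caution", "red"),
    }
    outputs, next_state = table[state]
    if inputs.get("hold"):   # an environmental input changes the sequence
        next_state = state
    return outputs, next_state
```

Calling `step = make_controller(traffic_logic, "red")` and then `step({})` repeatedly walks through the control sequence; passing `{"hold": 1}` keeps the controller in its current state.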
To get a high level of regularity in the design of a controller, regular layout structures (like ROMs or PLAs) are very often used to implement the combinational logic block rather than directly implementing the logic functions in separate gates (random logic). The random logic approach was chosen in the control unit of many early microprocessors (≤ 8 bit) and in RISC (Reduced Instruction Set Computer) processors, whereas the regular layout structures are used in CISC (Complex Instruction Set Computer) processors to simplify their controller design. Regular structures simplify the design process because, if modifications in the control sequences are required, only the contents of a PLA or a ROM have to be redefined instead of designing a whole combinational gate network. Since the design process for the latter approach can be compared with programming the contents of a memory rather than with circuit design, that approach is called microprogramming and will be considered in detail in the sequel.
11.2.1 Microprogrammed Controllers
Microprogrammed controllers mainly consist of a control memory and a microinstruction register. The control memory is implemented using ROM (Fig. 11.8) or PLA (Fig. 11.9) structures. For special applications, RAM based control memories are also used, e. g. if the instruction set of a processor has to be changed for special purposes. That flexibility is not available when using hardwired logic. On the other hand, the price for that flexibility can be extra hardware cost compared to random logic, due to address decoding (in the ROM based controller) and sparse control matrices, and a performance penalty due to larger internal delays in the PLA or ROM. The control memory contains both the control signals to be forwarded through the microinstruction register to the datapath and some sequencing information giving the address (NA, next address) of the subsequent microinstruction. The concatenation of the control signals and the next address is called a microinstruction.
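A minimal sketch of a ROM based controller, assuming a single environmental input bit that is decoded together with the next address. The memory contents and the names `CONTROL_ROM` and `run_microprogram` are hypothetical and chosen only to show how control field and NA field work together.

```python
# Control memory: (address, environmental input) -> (control word, next address NA).
# At address 1 the controller waits until the input flag becomes 1.
CONTROL_ROM = {
    (0, 0): (0b0001, 1), (0, 1): (0b0001, 1),
    (1, 0): (0b0110, 1), (1, 1): (0b0110, 2),
    (2, 0): (0b1000, 0), (2, 1): (0b1000, 0),
}

def run_microprogram(rom, start, flags):
    """Apply one environmental input per cycle; return the emitted control words."""
    address, trace = start, []
    for flag in flags:
        control, address = rom[(address, flag)]  # microinstruction = control + NA
        trace.append(control)
    return trace
```

Changing the behaviour of this controller means editing the table only, which corresponds to reprogramming the ROM instead of redesigning a gate network.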
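[Diagram: an address decoder, fed by the environmental inputs and the NA field, addresses the ROM; the ROM word consists of the Control field, which drives the control outputs, and the NA field.]
Figure 11.8: ROM based controller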
[Diagram: a PLA with AND plane and OR plane; the AND plane is fed by the environmental inputs and the NA field, the OR plane delivers the Control field, which drives the control outputs, and the NA field.]
Figure 11.9: PLA based controller
Depending on how the control signals are generated, two types of microinstructions can be distinguished:
Horizontal Microinstructions. The control word from the microinstruction register is directly applied to the circuit which is to be controlled (see Fig. 11.10). Each elementary control point has a corresponding entry in the control word. That results in a very long control word and therefore large control memories. On the other hand, very specific encoding and a high degree of parallelism in the operations are possible.
Vertical Microinstructions. This type of microinstruction is based on a different approach: an n-bit control word allows $2^n$ configurations, most of which are hardly ever used by the controller. The wordlength of the control word in the control memory is therefore reduced by encoding the smaller number M of control vectors actually used into a vector of $\lceil \log_2 M \rceil$ bits. In a second step, the n-bit control word is fetched from a secondary memory used as control vector decoder (implemented e. g. as ROM or PLA) and forwarded to the datapath (see Fig. 11.11). It is also possible to encode the control vector in groups for different hardware units (one group for ALU control, the next for shifter control, . . . ), which are decoded group by group instead of using a single large control vector decoder.
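The encoding step of vertical microinstructions can be illustrated as follows. The function name and the example control words are assumptions for illustration; the list `decoder` plays the role of the secondary memory that maps a short code back to the full control vector.

```python
def encode_vertical(horizontal_words):
    """Encode the distinct control vectors of a microprogram.

    Returns (codes, decoder, bits): the encoded program, the decoder table
    (code -> n-bit control vector) and the code width ceil(log2(M)).
    """
    decoder = sorted(set(horizontal_words))
    index = {word: code for code, word in enumerate(decoder)}
    # (M - 1).bit_length() equals ceil(log2(M)) for M >= 2; use at least 1 bit.
    bits = max(1, (len(decoder) - 1).bit_length())
    codes = [index[word] for word in horizontal_words]
    return codes, decoder, bits
```

In this sketch, a program that uses only three different 8-bit control vectors needs 2 bits per microinstruction in the control memory, plus a decoder with three entries of 8 bits each.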
[Diagram: the control bits in the microinstruction drive the control lines directly.]
Figure 11.10: Horizontal microinstruction
[Diagram: the control bits in the microinstruction feed a control bit decoder, which drives the control lines.]
Figure 11.11: Vertical microinstruction
In controller design, one can proceed one step further: if a microinstruction itself can be represented as a sequence of 'sub'microinstructions (so-called nanoinstructions), the structure shown in Fig. 11.12 can be used. The simplest approach, which has already been mentioned under vertical microcode, is a single-step 'sequence' of nanoinstructions, namely the decoding of the control outputs from an encoded control vector supplied by the microcode control memory.
If feedback is introduced in the decoder PLA (via the NNA [nanocode next address] register), control sequences can be generated by the nanocode PLA. As long as a nanocode sequence is running, the MNA [microcode next address] register is halted. If many microinstructions use the same nanocode sequences, significant savings in implementation area for the whole controller can be achieved.
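The interplay of the two address registers can be sketched as follows. The memory contents, the control names and the convention that `None` marks the end of a nanocode sequence are assumptions made for this illustration only.

```python
# Nanocode memory: address -> (control outputs, next nano address or None at sequence end).
NANO = {
    0: ("load", 1), 1: ("add", 2), 2: ("store", None),
    3: ("shift", None),
}
# Microcode memory: address -> (nanocode entry point, next micro address).
MICRO = {0: (0, 1), 1: (3, 2), 2: (0, 0)}

def run_nanocoded(micro, nano, start, cycles):
    """Return the control outputs emitted during the given number of clock cycles."""
    mna, nna, trace = start, None, []
    for _ in range(cycles):
        if nna is None:
            # No nanocode sequence is running: fetch the next microinstruction.
            nna, mna = micro[mna]
        # While a sequence is running, only NNA advances and MNA keeps its value.
        control, nna = nano[nna]
        trace.append(control)
    return trace
```

Microinstructions 0 and 2 share the nanocode sequence starting at address 0, so the three-step sequence is stored only once; this is the area saving mentioned in the text.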
[Diagram: a microcode PLA (AND and OR planes) with the MNA register, fed by the environmental inputs, drives a nanocode PLA (AND and OR planes) with the NNA register; the nanocode PLA delivers the control outputs.]
Figure 11.12: A microcode/nanocode controller
Chapter 12
ASIC Design Guidelines
12.1 Introduction
The following design guidelines have been adapted from [5]. These recommendations are useful in order to avoid functional faults and to obtain the desired functionality.
12.2 Synchronous Circuits
• all data storage elements are clocked
• the same active edge of a single clock is applied at precisely the same time to all storage elements
12.2.1 Non-Recommended Circuits
Figure 12.1: Flip-flop driving clock input of another Flip-flop
→ The clock input of the second FF is skewed by the clock-to-q delay of the first FF and is not activated at every active clock edge (e.g. ripple counter)
Figure 12.2: Gated clock line
→ Clock skew caused by gating the clock line (e.g. multiplexer in clock line)
Figure 12.3: Double-edged clocking
→ FFs are clocked on the opposite edges of the clock signal
→ Insertion of scan-path impossible
→ Difficulties in determining critical path lengths
Figure 12.4: Flip-flop driving asynchronous reset of another Flip-flop
→ The synchronous design principle that all FFs change state at exactly the same time is not fulfilled
12.2.2 Recommended Circuits
Recommended circuits for synchronous circuit design are described in the subsequent sections.
12.3 Clock Buffering
12.3.1 Non-Recommended Circuits
Figure 12.5: Unequal depth of clock buffering
→ Clock skew
Figure 12.6: Unbalanced fanout of clock buffers
→ Clock skew by different load-dependent delays
→ Excessive clock fanouts should be avoided (slow edges)
12.3.2 Recommended Circuits
Figure 12.7: Balanced clock tree buffering
→ Same depth of buffering
→ Same fanout
→ Limited fanout in order to achieve sharp clock edges
Figure 12.8: Combined geometric/tree buffering
→ Using intermediate buffer of suitable strength at each fanout point
12.4 Gated Clocks
12.4.1 Non-Recommended Circuits
Figure 12.9: Multiplexer on clock line
→ Signal change at multiplexer input can cause a glitch at the clk input (FF captures invalid data)
→ Gating the clock line introduces clock skew
12.4.2 Recommended Circuits
Figure 12.10: Enabled (E-type) flip-flop
Figure 12.11: Toggle (T-type) flip-flop
12.5 Double-edged Clocking
12.5.1 Non-Recommended Circuit
Figure 12.12: Pipelined logic with double-edged clocking
→ Not recommended in context with scan-path methods
12.5.2 Recommended Circuit
Figure 12.13: Pipelined logic with single-edged clocking
12.6 Asynchronous Resets
12.6.1 Non-Recommended Circuit
Figure 12.14: Flip-flop driving asynchronous reset of another flip-flop
12.6.2 Recommended Circuits
Figure 12.15: Global asynchronous reset by external signal
Figure 12.16: Flip-flop driving synchronous reset of flip-flop
12.7 Shift-Registers
12.7.1 Non-recommended Circuits
Shift register with forward or reverse chain of clock buffers:
Figure 12.17: Shift register with forward chain of clock buffers
→ Internal clock skew can cause data fallthrough
12.7.2 Recommended Circuits
Figure 12.18: Shift register with balanced tree of clock buffers
12.8 Asynchronous Inputs
12.8.1 Non-Recommended Circuits
→ Circuits with complicated feedback loops to capture asynchronous inputs (very sensitive to noise; functionality can be influenced by placement and routing delays)
12.8.2 Recommended Circuits
1. Chain of two or more D-type registers (reducing the probability of metastability)
2. Use of 4-bit register as shift register
3. Asynchronous handshake circuit
Figure 12.19: Series D-type flip-flops for capturing asynchronous input
→ The probability of propagating a metastable state decreases with an increasing number of register stages
Figure 12.20: 4-bit register used as shift register to capture an asynchronous input
Figure 12.21: Asynchronous handshake circuit
The asynchronous handshake circuit works as follows:
• the first flip-flop is reset asynchronously when the r input is zero or when the qb outputs of the second and the third FF both have the value 0
• the q-output of the first FF is asynchronously set to high when a positive edge arrives at its ck-input
• the high output of the first FF is propagated through the second and the third FF in the two following cycles. The q-outputs of these FFs are set to zero and the reset logic for the first FF is activated. Now the first FF is ready to receive another edge at its input.
• Three cases of metastability caused by simultaneously rising edges of the asynchronous input and the system clock:
1. the second FF stabilizes to q=1 before the next rising clock edge (circuit works as desired)
2. the second FF settles to q=0 and the third FF remains in its state. Since the output q of the first FF is high, the propagation of this output works correctly, but it needs one cycle more than in the first case.
3. The metastable state of the second FF is still present at the next rising edge of the clock signal. Then the third FF also becomes metastable. The probability of receiving a metastable d (internal) signal can be reduced by increasing the length of the register chain.
Figure 12.22: Operation of asynchronous handshake circuit
12.9 Delay Lines and Monostables
12.9.1 Non-Recommended Circuits
In general, it cannot be recommended to build circuits whose functionality relies on delays.
Figure 12.23: Monostable pulse generator
Figure 12.24: Pulse generator using flip-flop
Figure 12.25: Multivibrator
12.9.2 Recommended Circuits
→ Use a higher clock speed and build a synchronous pulse generator
→ Minimum time resolution is given by clock cycle
12.10 Bistable Elements
12.10.1 Non-Recommended Circuits
→ Cross-coupled flip-flops and RS flip-flops
Figure 12.26: Synchronous pulse generator
Figure 12.27: Bistable storing element formed by cross-coupled NAND gates
Figure 12.28: Bistable storing element formed by cross-coupled NOR gates
Figure 12.29: Asynchronous RS flip-flop
12.10.2 Recommended Circuits
→ Use D-types with set/reset
→ Use latch configured as RS flip-flop
Figure 12.30: Latch configured as RS flip-flop
12.11 RAMs and ROMs in Synchronous Circuits
Problem: RAMs are double-edge triggered. The address is latched on the opposite edge to the data.
Figure 12.31: ME and WEbar RAM/DPRAM timing scheme
12.11.1 Recommended Circuits
Figure 12.32: Interfacing RAM into synchronous circuit: ME and WEbar generation
Figure 12.33: Using flip-flop for WEbar generation: timing scheme
Figure 12.34: Avoiding floating RAM/DPRAM output propagation
12.12 Tristates
12.12.1 Non-Recommended Circuit
Figure 12.35: Tristate bus with non-central enable control
12.12.2 Recommended Circuits
Figure 12.36: Tristate bus with central control of tristate enables and additional driver activated on non-controlled states
12.12.3 Multiplexer ↔ Tristates
Disadvantages of Tristates:
• large area
• limited buffering
• large routing load, → slow
Advantages of Multiplexers:
• small area
• efficient routing
Control decoding expense is the same for tristates and multiplexers.
12.13 Parallel Signals
12.13.1 Non-Recommended Circuits
Figure 12.37: Wired-OR part used to create higher fanout
12.13.2 Recommended Circuit
Figure 12.38: High-fanout buffer replacing wired OR part
12.14 Fanout
12.14.1 Non-Recommended Circuit
Figure 12.39: Excessive fanout on control signal
12.14.2 Recommended Circuits
Figure 12.40: Geometric buffering on control signal
Figure 12.41: Tree buffering on control signal
12.15 Design for Speed
1. Use a maximum of 2 inputs on all combinational logic gates
Figure 12.42: 4-input AND gate and 2-input NAND/NOR equivalent
2. Use AOI logic (complex cells from standard cell library) where possible
Figure 12.43: Multiplexer using AOI logic
3. Feed late changing inputs late into combinational logic
Figure 12.44: Late changing input fed late into combinational logic
4. Use shift (Johnson) counters instead of binary counters
Figure 12.45: 4-stage Johnson counter
q0 q1 q2 q3
0  0  0  0
1  0  0  0
1  1  0  0
1  1  1  0
1  1  1  1
0  1  1  1
0  0  1  1
0  0  0  1
0  0  0  0
5. Use duplicate logic to reduce fanout
Figure 12.46: Using duplicate logic for reducing fanout
6. Use fast library cells where available
7. Reduce length of critical signal paths
8. Use Schmitt trigger inputs in noisy environments
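The state table of the 4-stage Johnson counter given under rule 4 can be reproduced with a short behavioural model. It assumes the usual shift counter structure in which the inverted output of the last stage is fed back into the first stage; the function name is illustrative.

```python
def johnson_sequence(stages=4, steps=None):
    """Return the successive states (q0, ..., q[stages-1]) of a Johnson counter.

    By default one full period plus the repeated initial state is returned.
    """
    steps = 2 * stages + 1 if steps is None else steps
    q = [0] * stages
    states = []
    for _ in range(steps):
        states.append(tuple(q))
        q = [1 - q[-1]] + q[:-1]  # shift by one stage, feed back inverted last output
    return states
```

Only one output changes per clock cycle and no carry chain is needed, which is why such a counter can run faster than a binary counter, at the price of using only 2n of the 2^n possible states of n flip-flops.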
12.16 Design for Testability
Testability:
1. Controllability
2. Observability
12.16.1 Non-Recommended Circuits
Figure 12.47: Circuit with inaccessible internal logic: only first block is controllable and only last block is directly observable
Figure 12.48: Chain of counters: first counter is not directly observable and second counter is not directly controllable
Figure 12.49: Counter with closed feedback loop: initial state not known
12.16.2 Recommended Circuits
1. Insert test inputs and outputs
Figure 12.50: Circuit with test inputs and outputs
2. Break long counter/shift register chains
Figure 12.51: Chain of counters broken by test input and output signals
3. Open feedback loops
Figure 12.52: Counter with feedback loop opened by test control and output signals
4. Use BIST (Built-In Self Test) with compiled megacells
Figure 12.53: Compiled megacell with compiled inputs/outputs
5. Scan path testing
Figure 12.54: E-type scan path flip-flop
Figure 12.55: Circuit with scan path
6. Use of JTAG boundary scan path
Figure 12.56: JTAG test circuitry
Chapter 13
Testing and Design for Testability
13.1 Motivation
• Stable chip manufacturing costs
• Increasing testing costs:
– Increasing number of gates/device
– Limited number of pins
→ Increasing number of internal states
→ Increasing logical and sequential depth
Example: Exhaustive testing (all 2^n input patterns) of a combinational circuit with n inputs (10 MHz, one test per cycle)
n    time for test
25   3 s
30   107 s
40   1 day
50   3.5 years
60   3656 years
• Testability has to be considered in all phases of design
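The test times in the example follow directly from the number of input patterns. The sketch below assumes exhaustive testing (2^n patterns) at one pattern per 10 MHz clock cycle; the function name is illustrative and a 365-day year is assumed for the conversion.

```python
def exhaustive_test_time_s(n_inputs, test_rate_hz=10e6):
    """Seconds needed to apply all 2**n input patterns at the given rate."""
    return 2 ** n_inputs / test_rate_hz


SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

for n in (25, 30, 40, 50, 60):
    t = exhaustive_test_time_s(n)
    print(f"n = {n}: {t:.3g} s = {t / SECONDS_PER_YEAR:.3g} years")
```

Each additional input doubles the test time, which is why exhaustive testing stops being practical well below n = 50.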
13.2 Economical Considerations
13.2.1 Average Quality Level (AQL)
$$\mathrm{AQL} = \frac{\#\,\text{DefectiveParts}}{\#\,\text{AcceptedParts}} \tag{13.1}$$
13.2.2 Correlation: Fault Coverage and Defective Parts
• DL (= AQL): defect level, i.e. the fraction of defective circuits which have been classified as working correctly (when testing with fault coverage T)
• Y: yield
• T: fault coverage
$$DL = 1 - Y^{\,1-T} \tag{13.2}$$
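Equation 13.2 is easy to explore numerically. The following sketch simply evaluates it; the function name and the sample yield and coverage values are illustrative assumptions, not data from the lecture.

```python
def defect_level(yield_, coverage):
    """Defect level DL = 1 - Y**(1 - T) for yield Y and fault coverage T (both in 0..1)."""
    return 1.0 - yield_ ** (1.0 - coverage)


for t in (0.0, 0.9, 0.99, 1.0):
    print(f"Y = 0.5, T = {t:.2f}: DL = {defect_level(0.5, t):.4f}")
```

With zero coverage the defect level equals 1 − Y (every bad part ships), and it falls to zero only as the fault coverage approaches 100%.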
Figure 13.1: Defect level as function of yield and fault coverage
13.3 Design Flow: Testing
Figure 13.2: A typical synthesis flow
13.3.1 Chip Test after Manufacturing
Manufacturing Process
↓
Parametric Test (current/power dissipation)
(erroneous chips are marked with color points and removed after sawing)
↓
Chip Test on Tester
13.4 Fundamental Definitions
Figure 13.3: Relationship between faults, errors and failures
• fault: physical defect, imperfection or flaw which occurs in a hardware or software component
• error: manifestation of a fault (erroneous information on a hardware line or in a program, caused by a fault)
• failure: malfunction of a system
Figure 13.4: Three-universe model of a system
13.5 Fault Models
Basis: physical phenomena
• Oxide defects
• Missing implants
• Lithographic defects
• Junction defects
• Metal shorts & opens
• Moisture accumulation
• Impurities/Contaminants
• Static discharge
Figure 13.5: Examples for physical faults
13.6 Fault Tolerant Design
Fault tolerance achieved by redundancy techniques:
• Duplication with Complementary Logic
Figure 13.6: Fault detection by duplication with complementary logic
• Self-Checking Logic
• Reconfigurable Array Structures
Figure 13.7: 4-by-4 array with one spare column
Figure 13.8: Reconfigured array
13.7 Test Pattern Generation
• manually
• pseudo-random (yields up to 60% fault coverage)
• algorithmic
• special test patterns for RAMs
Is the fault coverage sufficient? ⇒ fault simulation
13.7.1 The D-Algorithm
Every test generation procedure has to solve the following problems
1. Creation of a change at the faulty line
2. Propagation of the change to the primary output line
In the D-algorithm the symbols D and D̄ are used to refer to the changes. D and D̄ are used as follows:
D: used if a line has the value 1 in the absence of a fault and the value 0 if the fault occurs
D̄: used if a line has the value 0 if no fault occurs and the value 1 otherwise
The D-algorithm method for path sensitization consists of two principal phases:
1. forward drive (propagation) of a D-value to a primary output
2. backward trace (consistency operation)
These two steps are iterated for different propagation paths for the D-value from one dedicated internal point i to one dedicated primary output point o until the backward trace phase is finished without any contradiction (a test vector for a fault at i has been found) or until all possible paths from i to o have been examined.
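The D notation can be made concrete by treating every line value as a pair (fault-free value, faulty value), so that D = (1, 0) and D̄ = (0, 1). The following minimal sketch uses this common encoding as an assumption (it is not taken from the lecture) to show when a two-input NAND gate propagates a D and when it blocks it.

```python
# Each value is a pair (value in the fault-free circuit, value in the faulty circuit).
ZERO, ONE, D, DBAR = (0, 0), (1, 1), (1, 0), (0, 1)
NAMES = {ZERO: "0", ONE: "1", D: "D", DBAR: "D-bar"}


def nand(a, b):
    """Evaluate a 2-input NAND separately on the fault-free and the faulty value."""
    return (1 - (a[0] & b[0]), 1 - (a[1] & b[1]))


for other in (ZERO, ONE):
    out = nand(D, other)
    print(f"NAND(D, {NAMES[other]}) = {NAMES[out]}")
```

A 1 on the other input sensitizes the gate (the D appears inverted at the output), whereas a 0 forces the output to 1 in both circuits and the change is lost. This is the information that a propagation D-cube records.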
Figure 13.9: Basic concept of D-algorithm
1. A primitive D-cube of a failure is a D-cube associated with a fault l/α on the output line l of a gate G. This produces the value D or D̄ on l, and the input lines have values which would produce ᾱ (the complement of α) on l in the fault-free case.
Figure 13.10: Primitive D-cube of fault (pdcf) for two-input NAND gate
2. A propagation D-cube of a failure specifies the propagation of changes at one (or more) inputs of a gate G to its output l.
Figure 13.11: Propagation-D-cube (pdc) for two-input NAND gate
3. A singular cover of a gate G is a 0,1,X truth table representation of G
Figure 13.12: Singular cover for two-input NAND gate
Figure 13.13: Singular covers for several basic logic gates
Figure 13.14: Constructing the singular cover of a logic module
In the following, the D-algorithm is illustrated using the example circuit of Fig. 13.15.
Figure 13.15: Example circuit illustrating D-algorithm
Table 13.1: Propagation D-cube table
Table 13.2: Singular cover table
Table 13.3: D-cube intersection table
Running the D-algorithm for generating a test for line 5/0:
1. Start with D-cube for the fault 5/0:
2. The D on line 5 is automatically propagated to lines 6 and 7 by cube j.
3. Now the propagation along the path 6 → 9 → 11 is considered: D on line 6 is propagated to line 9 by cube d. Combining d and k yields cube l:
4. If cube i is used with D and D̄ interchanged, the propagation to the output can be done:
5. Now the consistency phase is started and a value for line 4 has to be found. From the singular cover table it can be seen that a 0 on line 10 implies that both line 7 and line 8 are 1. In cube m, line 7 is a D (as is line 5, which is connected to line 7 by cube j), and this D would now have to be set to 1. This is a contradiction, which rules out the path sensitization 5 → 6/7 → 9 → 11.
⇒ Start test vector generation using another path
6. Starting the propagation along 5 → 7 → 10 → 11 leads to the following cube:
7. From the singular cover table we see that a 1 on line 8 corresponds to a 0 on line 4. Additionally, it can be seen that the 0 on line 9 can be obtained by a 1 on line 1.
8. This yields the final cube
1 1 1 0 D D D 1 0 D D
9. ⇒ a test vector for line 5/0 is given by
1 1 1 0
13.8 Fault Simulation
13.8.1 Algorithms: Serial Fault Simulation
Figure 13.16: Serial fault simulation
13.8.2 Improved Algorithms
• Parallel Fault Simulation
• Concurrent Fault Simulation
⇒ discussed in CAD lecture
13.9 Design for Testability
• Circuit level: restriction of physically possible faults
• Logic level: restrict possibilities of realizations
• System level: restrict size of components and number of states
Testability:
• controllability
• observability
• → additional chip area required
• → shorter design cycle
Methods to improve controllability and observability:
• ad-hoc techniques
• structured approaches
Figure 13.17: Design for testability: complex gate (a) not testable with stuck-at model, (b) fully testable with stuck-at model
13.9.1 Ad-Hoc Techniques
• developed for a specific design
• less silicon area
• design automation almost impossible
• partitioning (test of circuit components by use of dedicated multiplexers)
Figure 13.18: Testability: ad-hoc techniques (partitioning for testability)
Figure 13.19: Testability: ad-hoc techniques (a) insertion of a register in order to limit logic depth to a given maximum value. (b) test shift registers for PLA test (increasing PLA area).
13.9.2 Scan-Path Methods
Scan-Path:
• Main idea: test of sequential network is reduced to test of combinational network
• applicable to circuits consisting of logic with some feedback paths
• can be realized by reconfiguring the latches as shift registers (two modes of operation)
Figure 13.20: Feedback logic with scanpath
Test scan-path/register function first:
• Flush test (0 . . . 010 . . . 0) or
• shift test (00110011 . . .) (each register transfer is tested by this combination: 0 → 0, 0 → 1, 1 → 1 and 1 → 0).
Cycle for testing combinational logic function:
1. Scan mode: Preload Y and set PI
2. System operation mode: Wait until inputs of Y are steady. Clock new state into Y.
3. Shift state out. Compare PO and state values with expected responses
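The three-step test cycle can be sketched in a few lines of Python. Everything in this example is an illustrative assumption: the combinational block `comb_logic` is a hypothetical 2-bit counter with an enable input, and the state register Y is modelled as a list that is shifted one bit per clock in scan mode.

```python
def comb_logic(pi, state):
    """Hypothetical combinational block: 2-bit counter with enable, carry as PO."""
    en = pi[0]
    value = state[0] + 2 * state[1]
    nxt = (value + en) % 4
    po = [1 if (en and value == 3) else 0]
    return po, [nxt & 1, (nxt >> 1) & 1]


def scan_test_cycle(logic, pi, preload):
    """Run one scan test: preload Y, capture one system clock, shift Y out."""
    # 1. Scan mode: shift the preload pattern serially into the register chain.
    chain = [0] * len(preload)
    for bit in reversed(preload):
        chain = [bit] + chain[:-1]
    # 2. System operation mode: apply PI and clock the new state into Y.
    po, chain = logic(pi, chain)
    # 3. Scan mode: shift the captured state out, last stage first.
    observed = []
    for _ in range(len(chain)):
        observed.append(chain[-1])
        chain = [0] + chain[:-1]
    observed.reverse()  # observed[i] is now the captured value of stage i
    return po, observed


print(scan_test_cycle(comb_logic, pi=[1], preload=[1, 1]))
```

The tester only ever sees a combinational function from (PI, preloaded state) to (PO, captured state), which is the main idea stated at the start of this section.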
Advantages:
• Testability of clocked circuits is improved and guaranteed at design stage
• Consistent with good VLSI design practice (rules, abstraction, modularity . . .)
• Does not require special CAD
Disadvantages:
• Wastes silicon
• Constrains the designer to design according to given conditions
• Additional Complexity
Overhead
≈ 2% for a fundamentally 'structured' design
≈ 30% for 'wild' logic
13.9.3 Built-In Tests
• The system generates its test vectors on its own
• Analysis and evaluation of the test results are also done automatically
• Compromise: silicon ↔ testability
Test Pattern Generators
• Test patterns are generated inside the circuit to be tested
• Short testing time, simple test programs, self-test
• Examples: test pattern memories, deterministic generators, counters
Figure 13.21: Examples for built-in test pattern generators
Pseudo Random Number Generators
Figure 13.22: Pseudo random pattern generator
Example:
Figure 13.23: Example for pseudo random pattern generator
13.9.4 Evaluation of Testing Data
• Evaluation of testing results inside the circuit
• Counting techniques, signature analysis
Example: Counting Techniques
Figure 13.24: Counting techniques for test data evaluation
Signature Analysis
• Communication technique: coding theory
• Code words: data stream D, polynomial P(x), division modulo 2
$$\frac{D}{P} = Q + \frac{R}{P}$$
where Q is the quotient and the remainder R is the signature
→ Evaluation of testing data
Figure 13.25: Test data evaluation by signature analysis
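The modulo-2 division D/P = Q + R/P can be carried out bit by bit, which is what a signature register does in hardware. The sketch below is a software illustration under stated assumptions: bit lists are written with the highest power first, the polynomial's leading bit is 1, and the data stream 110101 with P(x) = x^3 + x + 1 is an arbitrary example rather than one from the lecture.

```python
def gf2_divmod(data_bits, poly_bits):
    """Divide data by poly over GF(2); returns (quotient bits, remainder bits)."""
    rem = list(data_bits)
    deg = len(poly_bits) - 1
    quotient = []
    for i in range(len(rem) - deg):
        q = rem[i]
        quotient.append(q)
        if q:
            # Subtraction modulo 2 is an XOR with the shifted polynomial.
            for j, p in enumerate(poly_bits):
                rem[i + j] ^= p
    return quotient, rem[-deg:]


POLY = [1, 0, 1, 1]                       # x^3 + x + 1 (example choice)
good = gf2_divmod([1, 1, 0, 1, 0, 1], POLY)
bad = gf2_divmod([0, 1, 0, 1, 0, 1], POLY)  # same stream with the first bit flipped
print("fault-free signature:", good[1])
print("faulty signature:    ", bad[1])
```

Only the remainder has to be compared with the expected value, so an arbitrarily long response stream is compressed into n bits.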
Signature Analysis: Degree of Fault Recognition
1. Length of sequence: m bit → $2^m$ sequences possible
2. One sequence contains no faults → number of erroneous sequences is $2^m - 1$
3. Length of signature register: n bit → $2^n$ signatures
4. $2^m$ sequences are mapped onto $2^n$ signatures → number of nondetectable faults:
$$\frac{2^m}{2^n} - 1 = 2^{m-n} - 1$$
5. Probability of not detecting an erroneous sequence: number of nondetectable faults divided by number of possible faults:
$$N = \frac{2^{m-n} - 1}{2^m - 1}$$
6. ⇒ Fault detection rate:
$$F = 1 - \frac{2^{m-n} - 1}{2^m - 1}$$
$$F \approx 1 - 2^{-n} \quad \text{for } m \gg n$$
Interpretation:
• all faults recognized if m < n (trivial)
• for long sequences, only n matters
• n = 16 bit → F ≈ 1 − 2^{-16} ≈ 99.9985%
Parallel Signature Register with k Inputs
Figure 13.26: Parallel signature register
Fault recognition rate:
$$F = 1 - \frac{2^{mk-n} - 1}{2^{mk} - 1}$$
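The detection-rate formulas for the serial and the k-input parallel signature register can be evaluated exactly with rational arithmetic. In this sketch the function name and the sample values of m, n and k are illustrative assumptions; the case mk ≤ n is handled separately because every erroneous sequence is then detected.

```python
from fractions import Fraction


def detection_rate(m, n, k=1):
    """Fault detection rate F for sequence length m, register length n, k inputs."""
    bits = m * k
    if bits <= n:
        return Fraction(1)  # fewer sequences than signatures: nothing is masked
    undetected = Fraction(2 ** (bits - n) - 1, 2 ** bits - 1)
    return 1 - undetected


for n in (8, 16, 32):
    print(f"n = {n}: F = {float(detection_rate(1000, n)) * 100:.7f} %")
```

For long sequences the result is governed almost entirely by the register length n, in line with the approximation F ≈ 1 − 2^{-n}.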
13.9.5 Built-In Logic Block Observation
A BILBO register is a universal element for use in either a scan-path environment or a self-test (signature analysis) environment.
Figure 13.27: BILBO registers: 1. full circuit 2. normal use 3. scan-path use 4. signature analysis
Advantages:
• Versatility
– Normal operation
– Scan-path test: enhances testability
– Test vector generation via LFSR
– Data compression via LFSR
– Combined scan-path/self-test using same LFSRs
Disadvantages:
• Silicon area
– BILBO latch can be ≈ 50% larger than ordinary latch
13.9.6 Example: Self-testing Circuit
Figure 13.28: Example: self-testing circuit
Chapter 14
Boundary-Scan Architecture – JTAG Standard
• miniaturization of electronic components and multilayer and surface-mount techniques make the testing of boards more complicated
⇒ requirement for design-integrated test structures
• 1985: first meeting of a small group from European electronics companies
• later, North American companies joined the group (→ Joint Test Action Group = JTAG)
• result: IEEE Standard Test Access Port and Boundary-Scan Architecture
14.1 Classical Board Test Approaches
Figure 14.1: In-circuit test using bed-of-nails
Figure 14.2: Functional test using board connector
Figure 14.3: Combined use of in-circuit and functional test
Disadvantages of classical approach:
• high costs for test hardware
• increased density
• not suited for surface mount technology
• modern chip testing techniques such as
– scan path techniques
– built-in self-test techniques (BIST)/BILBO
are not exploited well
14.2 Introduction to Boundary Scan
Scan-testing at the board-level:
• permits use of automatic test pattern generation tools
• simplification of the hardware of the test equipment
Figure 14.4: Scan design at the board level
Figure 14.5: Testing for interconnection faults
Input          Expected output   Actual output
x1x1x0xxxxxx   xxxxxxxx01x1      xxxxxxxx11x0
x0x0x1xxxxxx   xxxxxxxx10x0      xxxxxxxx11x0
Figure 14.6: Testing on-chip logic
Input      Expected output
x10xxxxx   xxxxx1xx
x01xxxxx   xxxxx1xx
x11xxxxx   xxxxx0xx
Boundary scan application properties and limitations
• each test vector has to be shifted into the scan path
⇒ not very suitable for testing the chips themselves because of the reduced test rate compared to stand-alone chip testing
• well suited for interconnection testing
• testing of dynamic behaviour impossible
• self-testing ICs: boundary scan can be used to trigger the self-test procedure
14.3 The IEEE Standard 1149.1
14.3.1 IEEE Std 1149.1 Architecture
Figure 14.7: IEEE Std 1149.1 test logic
• TAP Controller: responds to the control sequences supplied through the test access port (TAP) and generates the clock and control signals required for the operation of the other circuit blocks
• Instruction Register: shift register which is serially loaded with the instruction for the test
• Test Data Registers: bank of shift registers. The stimulus values required for a test are serially loaded into the test data register selected by the current instruction. After execution, the results can be shifted out for examination
Figure 14.8: Test data registers
14.3.2 Test Access Port
• Test Clock Input (TCK): independent of the system clock; used for synchronization of test operations between various chips on a board
• Test Mode Select Input (TMS): Input for controlling the test logic
• Test Data Input (TDI): Serial input for instruction and test register data
• Test Data Output (TDO): Serial output of instruction or test register data (source selected by TMS code)
• Optional Test Reset Input (TRST∗): For test initialization
Figure 14.9: Serial connection of IEEE Std 1149.1-compatible ICs
Figure 14.10: Parallel connection of IEEE Std 1149.1-compatible ICs
Control of the test signals
• by external automatic test equipment (ATE) or
• by on-board bus master chip
Figure 14.11: Use of bus master chip to control IEEE Std 1149.1 chips
14.3.3 TAP-Controller
• 16-state FSM which controls data register (DR) and instruction register (IR) operations
• input signals:
– TRST∗
– TCK
– TMS
– last state (stored in internal FFs)
• output signals:
– Reset*
– Select
– Enable
– ShiftIR
– ClockIR
– UpdateIR
– ShiftDR
– ClockDR
– UpdateDR
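The 16-state FSM can be written down as a transition table indexed by the TMS value sampled on each rising TCK edge. The sketch below uses the state names and transitions of the IEEE Std 1149.1 state diagram, which are not spelled out in these notes; the helper `tap_walk` is a hypothetical name, and the output signals listed above are not modelled.

```python
# state -> (next state if TMS = 0, next state if TMS = 1)
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan": ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR": ("Shift-DR", "Exit1-DR"),
    "Shift-DR": ("Shift-DR", "Exit1-DR"),
    "Exit1-DR": ("Pause-DR", "Update-DR"),
    "Pause-DR": ("Pause-DR", "Exit2-DR"),
    "Exit2-DR": ("Shift-DR", "Update-DR"),
    "Update-DR": ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan": ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR": ("Shift-IR", "Exit1-IR"),
    "Shift-IR": ("Shift-IR", "Exit1-IR"),
    "Exit1-IR": ("Pause-IR", "Update-IR"),
    "Pause-IR": ("Pause-IR", "Exit2-IR"),
    "Exit2-IR": ("Shift-IR", "Update-IR"),
    "Update-IR": ("Run-Test/Idle", "Select-DR-Scan"),
}


def tap_walk(state, tms_bits):
    """Return the state reached after clocking in the given TMS sequence."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


print(tap_walk("Test-Logic-Reset", [0, 1, 0, 0]))     # path into Shift-DR
print(tap_walk("Test-Logic-Reset", [0, 1, 1, 0, 0]))  # path into Shift-IR
```

A useful property of this table is that holding TMS at 1 for five TCK cycles reaches Test-Logic-Reset from any state, which is why the TRST∗ pin can be optional.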
14.3.4 The Instruction Register
Figure 14.12: Daisy-chain connection of instruction registers
Figure 14.13: Instruction register
Figure 14.14: An example instruction register cell (stage)
14.3.5 Test Data Registers
Test data registers:
• bypass register (mandatory)
• boundary scan register (mandatory)
• device identification register (optional)
Bypass Register
Figure 14.15: Example design for bypass register
Figure 14.16: Use of bypass register
Basic Boundary Cells
Figure 14.17: Provision of boundary-scan cells
Figure 14.18: Basic boundary-scan cell for input pin
Figure 14.19: Basic boundary scan cell for output pin
Chapter 15
Analog VLSI systems
15.1 Analog Signal Processing
Typical signal processing applications require mixed analog/digital implementations. These mainly consist of
• Preprocessing of the signals, e.g. filtering and A/D conversion
• Digital signal processing, e.g. digital filtering, calculation of FFT
• Postprocessing, e.g. D/A conversion
as shown in Fig. 15.1.
The aim of development is to integrate all these functions on a single chip.
Figure 15.1: Block diagram of a typical signal processing system
15.1.1 Signal Bandwidths in Analog VLSI
Figure 15.2: Bandwidths of signals used in signal processing applications
Figure 15.3: Signal bandwidths that can be processed by present day (1989) technologies
15.1.2 A/D and D/A Conversion in Signal Processing Systems
Fig. 15.4 illustrates how analog-to-digital (A/D) and digital-to-analog (D/A) converters are used in data systems. In general, an A/D conversion process converts a sampled and held analog signal into a digital word that represents the analog signal. The D/A conversion process is essentially the inverse of the A/D process: digital words are applied to the input of the D/A converter, which creates from a reference voltage an analog output signal that represents the digital word.
Figure 15.4: Converters in signal processing systems: (a) A/D, (b) D/A
15.2 Digital-To-Analog Converters
The inputs to a D/A converter are
(a) a digital word of N bits $(b_1, b_2, b_3, \dots, b_N)$
(b) a reference voltage $V_{ref}$
The output voltage can be expressed as
$$V_{OUT} = K V_{ref} D \tag{15.1}$$
where K is a scaling factor and D is given as
$$D = \frac{b_1}{2^1} + \frac{b_2}{2^2} + \frac{b_3}{2^3} + \dots + \frac{b_N}{2^N} \tag{15.2}$$
Thus, the output of a D/A converter can be expressed by
$$V_{OUT} = K V_{ref} \sum_{i=1}^{N} b_i 2^{-i} \tag{15.3}$$
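Equation 15.3 translates directly into code. The following sketch evaluates it for an ideal converter; the function name and the example bit patterns and reference voltages are illustrative assumptions, with b1 (the MSB) first in the list.

```python
def dac_output(bits, v_ref, k=1.0):
    """Ideal D/A output V_OUT = K * V_ref * sum(b_i * 2**-i), with bits = [b1, ..., bN]."""
    return k * v_ref * sum(b * 2.0 ** -i for i, b in enumerate(bits, start=1))


print(dac_output([1, 0, 0], v_ref=5.0))  # MSB only: half of V_ref
print(dac_output([1, 1, 1], v_ref=8.0))  # full-scale code of a 3-bit converter
```

Note that the full-scale output is V_ref(1 − 2^{-N}), one LSB below the reference voltage.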
Figure 15.5: (a) Conceptual block diagram of a D/A converter, (b) Clocked D/A converter
In most cases, the digital input of the D/A converter is synchronously clocked. It is therefore necessary to provide a latch to hold the word for conversion and a sample-and-hold circuit at the output, as shown in Fig. 15.5(b).
The basic architecture of the D/A converter without an output sample-and-hold circuit is shown in Fig. 15.7. Fig. 15.8 shows the ideal input-output characteristics for such a D/A converter.
15.2.1 Current Scaling D/A Converters
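The output voltage of a current-scaling D/A converter as shown in Fig. 15.9 can be expressed as
$$V_{out} = -\frac{R}{2} I_0 = -\frac{R}{2}\left(\frac{b_1}{R} + \frac{b_2}{2R} + \frac{b_3}{4R} + \dots + \frac{b_N}{2^{N-1}R}\right) V_{ref} \tag{15.4}$$
$$V_{out} = -V_{ref}\left(b_1 2^{-1} + b_2 2^{-2} + b_3 2^{-3} + \dots + b_N 2^{-N}\right) \tag{15.5}$$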
Figure 15.6: (a) Sample-and-hold circuit, (b) Waveforms illustrating the operation of the sample-and-hold circuit
Figure 15.7: Block diagram of a D/A converter
Figure 15.8: Ideal input-output characteristics for a 3-bit D/A converter
The major disadvantage of this approach is the large ratio of component values. For example, the ratio of the resistor for the MSB to the resistor for the LSB is given by
$$\frac{R_{MSB}}{R_{LSB}} = \frac{1}{2^{N-1}} \tag{15.6}$$
For an 8-bit converter, this gives a ratio of 1/128.
An alternative to this approach is the use of an R-2R ladder as shown in Fig. 15.10. Using the fact that the resistance to the right of any of the vertical 2R resistors is 2R, we see that the currents $I_1, I_2, I_3, \dots, I_N$ are binary-weighted and given as
$$I_1 = 2I_2 = 4I_3 = \dots = 2^{N-1} I_N \tag{15.7}$$
Thus, the output voltage of the R-2R D/A converter is given by Eq. 15.5.
Figure 15.9: (a) Conceptual illustration of a current-scaling D/A converter, (b) Implementation of (a)
Figure 15.10: A current-scaling D/A converter using an R-2R ladder
15.2.2 Voltage Scaling D/A Converters
A voltage-scaling D/A converter is shown in Fig. 15.11. Its output voltage at any tap i can be expressed as
$$V_i = \frac{V_{ref}}{8}\,(i - 0.5) \tag{15.8}$$
The output voltage of the D/A converter is then determined by the values of the inputs $b_1$, $b_2$ and $b_3$.
Figure 15.11: Illustration of a voltage-scaling D/A converter
The structure of this voltage-scaling D/A converter is very regular and thus well suited for MOS technology. A problem with this type of D/A converter is the accuracy requirements of the resistors used. This makes it difficult to build D/A converters of this type with more than 8 bit resolution.
15.3 Analog-To-Digital Converters
The objective of an A/D converter is the determination of the digital word corresponding to the analog input signal. Usually a sample-and-hold circuit (see Fig. 15.6) is required at the input of the A/D converter because it is not possible to convert a changing analog signal. A block diagram of a general A/D converter is shown in Fig. 15.12. The ideal input-output characteristics for an A/D converter are shown in Fig. 15.13.
Figure 15.12: Block diagram of a general analog-to-digital converter
Figure 15.13: Ideal input-output characteristics for a 3-bit A/D converter
15.3.1 Serial A/D Converters
Two possible implementations of serial A/D converters are single-slope and dual-slope A/D converters. Neither will be discussed in detail here. The main advantage of these converters is their simplicity; their main disadvantage is the long conversion time required.
15.3.2 Successive Approximation A/D Converters
This type of A/D converter converts an analog input into an N-bit digital word in N clock cycles. Consequently, the conversion time is shorter than for the serial converters, without much increase in the complexity of the circuit. Fig. 15.14 shows an example of a successive approximation A/D converter architecture.
Figure 15.14: Example of a successive approximation A/D converter architecture
The successive approximation process is shown in Fig. 15.15.
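The N-cycle conversion can be illustrated with a behavioural model. This is a minimal sketch assuming an ideal comparator and an ideal internal D/A converter with the weighting of Eq. 15.3 (K = 1); the function name and test voltages are illustrative. One bit is decided per loop iteration, MSB first.

```python
def sar_adc(v_in, v_ref, n_bits):
    """Successive approximation: returns [b1, ..., bN], deciding one bit per clock cycle."""
    bits = []
    v_dac = 0.0
    for i in range(1, n_bits + 1):
        trial = v_dac + v_ref * 2.0 ** -i  # tentatively set bit i
        if v_in >= trial:                  # comparator decision
            bits.append(1)
            v_dac = trial                  # keep the bit
        else:
            bits.append(0)                 # reset the bit
    return bits


print(sar_adc(0.7, v_ref=1.0, n_bits=3))
print(sar_adc(0.3, v_ref=1.0, n_bits=4))
```

The loop is a binary search over the code range, which is why N bits need exactly N comparisons.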
Figure 15.15: The successive approximation process
15.3.3 Parallel A/D Converters
In many applications, it is necessary to have a smaller conversion time than is possible with the previously described A/D converter architectures. Parallel A/D converters, also known as flash A/D converters, can require as little as one clock cycle for conversion. An architecture of a 3-bit parallel A/D converter is shown in Fig. 15.16.
Parallel A/D converters can typically reach up to 20 MHz in CMOS technology. The sample-and-hold time may, however, be larger than 50 ns and could prevent this conversion time from being realised. Another problem is that the number of comparators required is $2^N - 1$. For N greater than 8, too much area is required.
One method of achieving small system conversion times is to use slower A/D converters in parallel, which is called time-interleaving and is shown in Fig. 15.17. Here M successive approximation A/D converters are used in parallel to complete the N-bit conversion of one analog signal per clock cycle. The sample-and-hold circuits consecutively sample and apply the input analog signal to their respective A/D converters. N clock cycles later, the A/D converter provides a digital word output. If M = N, then a digital word is given out every clock cycle. If one examines the chip area for an N-bit A/D converter using the parallel A/D converter architecture (M = 1) compared with the time-interleaved architecture for M = N, the minimum area will occur for a value of M between 1 and N.
Figure 15.16: A 3-bit parallel A/D converter
Figure 15.17: A time-interleaved A/D converter array
15.3.4 Sigma-Delta A/D Converter
Introduction
The basic structure of a sigma-delta converter is shown in Fig. 15.18. The sigma-delta converter can be referred to as an oversampling converter, although oversampling is just one of the techniques contributing to the performance of a sigma-delta converter. The sigma-delta converter shown in Fig. 15.18 quantizes an analog signal with very low resolution (1 bit) and a very high sampling rate (2 MHz). With the use of oversampling techniques and digital filtering, the sampling rate is reduced (8 kHz) and the resolution is increased (16 bits).
Figure 15.18: Basic structure of a sigma-delta converter
A more detailed block diagram of the sigma-delta modulator is shown in Fig. 15.19. It consists of an integrator, a quantizer (comparator for 1 bit) and a feedback loop with a D/A converter (switch for 1 bit). The output of the sigma-delta modulator is shown in Fig. 15.20 for a sine wave input. The single-bit conversion will result in an output which is either '1' or '0'. When the signal is near plus full scale, the output is positive during most of the clock cycles. The opposite is true for near minus full scale signals. When the output is followed by a digital filter as shown in Fig. 15.18 which can perform sophisticated averaging functions, the 1-bit sequence is transformed into a much more meaningful signal.
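The behaviour described above can be reproduced with a small discrete-time model of the first-order loop. This is a sketch under stated assumptions: the input is normalised to a full scale of ±1, the integrator is an ideal accumulator, the 1-bit D/A converter feeds back ±1, and a plain average stands in for the digital filter. The function names are illustrative.

```python
def sigma_delta(samples):
    """First-order sigma-delta modulator: returns the 1-bit output stream."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback           # summing node followed by integrator
        bit = 1 if integrator >= 0 else 0    # 1-bit quantizer (comparator)
        feedback = 1.0 if bit else -1.0      # 1-bit D/A converter (switch)
        bits.append(bit)
    return bits


def decode(bits):
    """Crude stand-in for the digital filter: average of the +/-1 output levels."""
    return 2.0 * sum(bits) / len(bits) - 1.0


stream = sigma_delta([0.5] * 1000)
print("fraction of ones:", sum(stream) / len(stream))
print("decoded value:   ", decode(stream))
```

For a constant input the density of ones tracks the input level, so an input near plus full scale yields an output that is '1' during most clock cycles.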
Figure 15.19: First-order sigma-delta modulator block diagram
Noise Shaping
One feature that makes the sigma-delta converter so powerful is its noise shaping capability. To understand how this works, the analysis of the sigma-delta modulator in the frequency domain is appropriate. Fig. 15.21 shows the frequency domain linearized model of a sigma-delta modulator.
Figure 15.20: Output of first-order sigma-delta modulator
Figure 15.21: Frequency domain linearized model of a sigma-delta modulator
The integrator is represented as an analog filter. For an integrator, the transfer function has an amplitude which is inversely proportional to the input frequency ($1/f$ relationship). The quantizer is modelled as a gain stage followed by the addition of quantization noise.
Thus, the output y of the sigma-delta converter can be expressed by
$$y = (x - y)\,\frac{1}{f} + q \qquad (15.9)$$
where $(x - y)$ is the difference signal from the summing node at the input and $q$ is the quantization noise. Applying some algebraic rearrangement yields
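$$
\begin{aligned}
y &= \frac{x}{f} - \frac{y}{f} + q \\
\left(1 + \frac{1}{f}\right) y &= \frac{x}{f} + q \\
y &= \frac{x/f}{1 + 1/f} + \frac{q}{1 + 1/f} \\
y &= \frac{x}{f + 1} + \frac{q\,f}{f + 1} \qquad (15.10)
\end{aligned}
$$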
At a frequency f = 0, the output signal equals x with no noise element q. At higher frequencies, the value of x is reduced and the influence of q increases. In essence, the sigma-delta modulator has a low pass effect on the signal and a high pass effect on the noise. As a result, the modulator can be thought of as a noise shaping filter where noise in the signal pass band is reduced and noise energy is pushed into the higher frequency region. The effect of this shaping on quantization noise that would otherwise be spread evenly over frequency (white noise) is shown in Fig. 15.22.
Figure 15.22: Noise-shaping filter function
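The low pass and high pass behaviour can be checked numerically by evaluating the two factors of Eq. (15.10) at a few frequencies. The sketch below assumes, as the linearized model does, that f is a normalized, dimensionless frequency; the function names are made up for the example.

```python
def signal_gain(f):
    """Factor multiplying x in Eq. (15.10): 1 / (f + 1)."""
    return 1.0 / (f + 1.0)


def noise_gain(f):
    """Factor multiplying q in Eq. (15.10): f / (f + 1)."""
    return f / (f + 1.0)


def modulator_output(x, q, f):
    """Output y of the linearized model according to Eq. (15.10)."""
    return x * signal_gain(f) + q * noise_gain(f)


def residual_15_9(x, q, f):
    """Difference between y and the right-hand side of Eq. (15.9).

    Should be zero up to rounding for any f > 0, which confirms that
    Eq. (15.10) is a rearrangement of Eq. (15.9).
    """
    y = modulator_output(x, q, f)
    return y - ((x - y) / f + q)


if __name__ == "__main__":
    for f in (0.0, 0.1, 1.0, 10.0, 100.0):
        print(f"f = {f:6.1f}   signal gain = {signal_gain(f):.3f}   "
              f"noise gain = {noise_gain(f):.3f}")
```

At f = 0 the signal gain is 1 and the noise gain is 0; as f grows the signal gain falls towards 0 while the noise gain approaches 1, which is the shaping sketched in Fig. 15.22.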
Digital Filtering
The sigma-delta modulator described so far produces a stream of single-bit digital values at a very high rate. The modulator's output bit stream is fed into the converter's digital filter, which performs several different functions. All of these functions, however, are integrated into a single filter implementation. The functions of the filter are:
• sophisticated averaging (low pass filtering)
• removing high frequency noise (quantization noise)
• reducing sampling rate
The sampling rate reduction is done by averaging over a number of cycles of the input bit stream. This produces an output data stream that is reduced in sampling rate but increased in resolution (i.e. number of bits per sample).
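The simplest form of such a rate reduction is a block average, sketched below. It is an illustration only: the filter in a real converter performs the more sophisticated averaging mentioned above, and the function name, the mapping of the bit density to the range -1.0 to +1.0 and the example bit pattern are assumptions. The ratio of 250 follows from the figures given in the introduction (2 MHz / 8 kHz).

```python
def decimate_by_averaging(bits, ratio):
    """Average non-overlapping blocks of `ratio` bits.

    bits:  sequence of 0/1 values at the high (modulator) rate.
    ratio: number of input bits per output sample.
    Returns one value per block, scaled to the assumed range [-1.0, +1.0].
    Trailing bits that do not fill a whole block are dropped.
    """
    outputs = []
    for start in range(0, len(bits) - ratio + 1, ratio):
        ones = sum(bits[start:start + ratio])
        # Density of ones in [0, 1] mapped to a value in [-1, +1].
        outputs.append(2.0 * ones / ratio - 1.0)
    return outputs


if __name__ == "__main__":
    # Hypothetical bit stream with three ones in every four bits,
    # i.e. a density of 0.75, which corresponds to a value of +0.5.
    bit_stream = [1, 1, 0, 1] * 500           # 2000 one-bit samples
    ratio = 2_000_000 // 8_000                # 2 MHz / 8 kHz = 250
    for value in decimate_by_averaging(bit_stream, ratio):
        print(f"{value:+.3f}")
```

Each output sample of this sketch can take `ratio + 1` distinct values, so the output carries more bits per sample than the 1-bit input while the sample rate drops by the factor `ratio`.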
Advantages of Sigma-Delta Converters
The advantages of the sigma-delta converter technology are:
• Sigma-delta converters are a complete conversion and filtering system; additional digital filtering functions may easily be implemented in the digital output filter of the converter.
• Very low-cost and high-performance conversion is possible, as the analog part of the converter is very simple and need not be as accurate as in other A/D converters. The main part of the converter is the digital filter, which can be integrated more easily in MOS technology.
• Excellent signal-to-noise performance, which makes high-resolution converters possible.
• No sample-and-hold circuit preceding the converter is necessary, as sampling rates are very high.
Bibliography
[1] M. Annaratone. Digital CMOS Circuit Design. Kluwer Academic Publishers, 1986.
[2] Stephen D. Brown, Robert J. Francis, Jonathan Rose, and Zvonko G. Vranesic. Field-Programmable Gate Arrays. Kluwer Academic Publishers, 1992.
[3] Joseph J. F. Cavanagh. Digital Computer Arithmetic - Design and Implementation. McGraw-Hill, Inc., 1985.
[4] Murray Disman. The Programmable Logic IC Market. Electronic Trend Publications, 1992.
[5] European Silicon Structures (ES2), Zone Industrielle, 13106 Rousset, France. Solo 2030 User Guide, e02a02 edition, June 1992.
[6] Daniel D. Gajski. Silicon Compilation. Addison-Wesley Publishing Company, Inc., 1988.
[7] Randall L. Geiger, Phillip E. Allen, and Noel R. Strader. VLSI Design Techniques for Analog and Digital Circuits. McGraw-Hill, Inc., 1990.
[8] Abhijit Ghosh, Srinivas Devadas, and A. Richard Newton. Sequential Logic Testing and Verification. Kluwer Academic Publishers, 1992.
[9] Lance A. Glasser and Daniel W. Dobberpuhl. The Design and Analysis of VLSI Circuits. Addison-Wesley Publishing Company, 1985.
[10] John P. Hayes. Computer Architecture and Organization. McGraw-Hill, Inc., 1988.
[11] David A. Hodges and Horace G. Jackson. Analysis and Design of Digital Integrated Circuits. McGraw-Hill, 1983.
[12] Ernest E. Hollis. Design of VLSI Gate Array ICs. Prentice-Hall, 1987.
[13] Kai Hwang. Computer Arithmetic – Principles, Architectures, and Design. John Wiley and Sons, 1979.
[14] Barry W. Johnson. Design and Analysis of Fault-Tolerant Digital Systems. Addison-Wesley Publishing Company, 1989.
[15] Parag K. Lala. Digital System Design using Programmable Logic Devices. Prentice-Hall, 1990.
[16] W. Maly. Atlas of IC Technologies: An Introduction to VLSI Processes. The Benjamin/Cummings Publishing Company, 1987.
[17] Colin M. Maunder and Rodham E. Tulloss. The Test Access Port and Boundary Scan Architecture. IEEE Computer Society Press, 1990.
[18] John Mavor, Mervyn A. Jack, and Peter B. Denyer. Introduction to MOS LSI Design. Addison Wesley, 1983.
[19] William J. McClean (Editor). ASIC Outlook 1993. ICE (Integrated Circuit Engineering Corporation), 1993.
[20] Dhiraj K. Pradhan, editor. Fault-Tolerant Computing: Theory and Techniques, volume I. Prentice-Hall, 1986.
[21] Bryan T. Preas and Michael J. Lorenzetti. Physical Design Automation of VLSI Systems. The Benjamin/Cummings Publishing Company, 1988.
[22] S. M. Sze. VLSI Technology. McGraw-Hill, Inc., 1988.
[23] Takao Uehara and William M. van Cleemput. Optimal Layout of CMOS Functional Arrays. In IEEE Transactions on Computers, pages 305–312, May 1981.
[24] John P. Uyemura. Fundamentals of MOS Digital Integrated Circuits. Addison Wesley, 1988.
[25] John P. Uyemura. Circuit Design for CMOS VLSI. Kluwer Academic Publishers, 1992.
[26] Stephen A. Ward and Robert H. Halstead. Computation Structures. MIT-Press, 1990.
[27] Neil Weste and Kamran Eshraghian. Principles of CMOS VLSI design. Addison-Wesley Publishing Company, 1985.
[28] T.W. Williams, editor. VLSI Testing, volume 5 of Advances in CAD for VLSI. Elsevier Science Publishers B.V., 1986.