Memory Characterization, Validation and Front End Tool Evaluation
SALIL CHATRATH BE 071195
Under the guidance of:
RITA MAHAJAN, Assistant Professor, PECUT
NITIN GUPTA, Design Engineer, ST Microelectronics, Greater Noida
ABOUT THE COMPANY
STMicroelectronics is an Italian-French electronics and semiconductor manufacturer headquartered in Geneva, Switzerland.
STMicroelectronics was created in 1987 by the merger of SGS Microelettronica of Italy and Thomson Semiconductors of France with the aim of becoming a world leader in the sub-micron era.
The Noida site was launched in 1992 to conduct software engineering activities. The site hosts mainly design teams. It is now primarily involved with the design of home video products (Set-Top Box, DVD), GPS and Wireless LAN chips, and accompanying software. The employee strength in Greater Noida is around 2400.
PROJECT ABSTRACT
Characterization of a memory means obtaining information about its behavior in terms of different timings (access time, setup time, hold time, etc.), powers (dynamic, static and leakage power) and capacitances when a specified set of inputs is applied to the memory.
Characterization needs different kinds of simulators, each with different accuracy and performance. It is therefore necessary to constantly evaluate simulators and select the one to use according to the performance required for the specific compiler.
TYPES OF MEMORIES
SMEM Group: responsible for the development of SRAM memory compilers in CMOS technology.
COMPARISON BETWEEN MEMORIES
Type | Volatile | Writeable | Erase Size | Max Erase Cycles | Cost (per byte) | Speed
SRAM | Yes | Yes | Byte | Unlimited | Expensive | Fast
DRAM | Yes | Yes | Byte | Unlimited | Moderate | Moderate
Masked ROM | No | No | n/a | n/a | Inexpensive | Fast
PROM | No | One time | n/a | n/a | Moderate | Fast
EPROM | No | Yes | Entire chip | Limited | Moderate | Fast
EEPROM | No | Yes | Byte | Limited | Expensive | Fast to read, slow to erase/write
FLASH | No | Yes | Sector | Limited | Moderate | Fast to read, slow to erase/write
NVRAM | No | Yes | Byte | Unlimited | Expensive (SRAM + battery) | Fast
FUNCTIONAL SRAM CHIP MODEL
• The address latch block receives the address.
• The higher-order bits of the address are connected to the row decoder, which selects a row in the memory cell array.
• The lower-order address bits go to the column decoder, which selects the required columns. The number of columns selected depends on the data width of the chip, that is, the number of data lines of the chip, which determines how many bits can be accessed during a read or write operation.
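The row/column split described above can be sketched in a few lines. This is only an illustration: the function name `split_address` and the bit widths are assumptions, not values taken from the slides.

```python
def split_address(address: int, row_bits: int, col_bits: int) -> tuple[int, int]:
    """Split a flat address into (row, column) indices."""
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address >> col_bits              # higher-order bits feed the row decoder
    col = address & ((1 << col_bits) - 1)  # lower-order bits feed the column decoder
    return row, col


# Hypothetical example: 8 row bits and 3 column bits (an 11-bit address).
row, col = split_address(0b10110011101, row_bits=8, col_bits=3)
print(row, col)
```

Each selected column position corresponds to one group of bitlines whose width equals the data width of the chip.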
FUNCTIONAL SRAM CHIP MODEL (Contd.)
• When the read/write line indicates a read operation, the contents of the selected cells in the memory cell array are amplified by the sense amplifiers, loaded into the data register and presented on the data-out line(s).
• During a write operation the data on the data-in line(s) are loaded into the data register and written into the memory cell array through the write driver. Usually the data-in and data-out lines are combined to form bidirectional data lines, thus reducing the number of pins on the chip.
• The chip-select line enables the data register and, together with the read/write line, the write driver.
SRAM CORE ARRAY
• Wordline
• Bitline (b and b bar)
• Due to normal variations in device parameters and operating conditions, it is difficult to obtain reliable operation at full speed using a single access line. Therefore, the symmetrical data paths b and ~b are usually used.
MEMORY ARCHITECTURE
Basic Architecture:
• A single row decoder is connected to the entire memory array.
• As the core size increases, the wordline delay increases (because the wordline capacitance increases).
[Figure: Basic 256 x 256 SRAM architecture. Blocks: row decoder, 64K core (memory array), control, mux, sense amplifier, I/O circuitry]
Split Core Architecture:
In this type of architecture, the wordline delay is reduced by splitting the matrix into smaller blocks.
• Wordline capacitance is reduced.
• A reduction in the RC delay is observed because of the split bank.
• However, activating a wordline still activates the cells in both core areas.
• There is therefore no advantage in terms of power dissipation.
[Figure: Split core architecture for 64K SRAM. Blocks: a row decoder between two 32K core (memory array) blocks, control, two muxes, two sense amplifiers, two I/O circuitry blocks]
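The speed benefit of halving the wordline can be estimated with a rough calculation. The sketch below assumes the wordline behaves as a distributed RC line whose 50% delay is about 0.38·R·C; the per-cell resistance and capacitance values are hypothetical and only the ratio matters.

```python
def wordline_delay(cells: int, r_per_cell: float, c_per_cell: float) -> float:
    """Approximate 50% delay of a wordline modeled as a distributed RC line."""
    r_total = cells * r_per_cell   # total wire resistance grows with length
    c_total = cells * c_per_cell   # total capacitance grows with length
    return 0.38 * r_total * c_total


# Hypothetical per-cell values (ohms, farads).
full = wordline_delay(256, r_per_cell=2.0, c_per_cell=1e-15)
half = wordline_delay(128, r_per_cell=2.0, c_per_cell=1e-15)
print(f"full wordline: {full:.3e} s, split wordline: {half:.3e} s")
print(f"delay ratio: {full / half:.2f}")
```

Because both R and C scale with length, the delay scales with the square of the wordline length, which is why splitting the core helps speed even though it does not reduce the number of activated cells.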
Page Type Architecture:
In the split-core architecture, although the bank is split into two parts, the wordline still activates the cells in both parts, so no gain in power is observed. To reduce the run length of the wordlines, a new architecture is analyzed.
[Figure: Page type architecture for 64K SRAM. Blocks: global row decoder, two local row decoders, two 32K core (memory array) blocks, global control, two local control blocks, two muxes, two sense amplifiers, two I/O circuitry blocks]
Bank Architecture:
This technique reduces the run length of the bitlines, and the divided core structure gives a gain in both speed and power.
[Figure: Bank type architecture for 64K SRAM. Blocks: two banks, each with a row decoder, a 32K core (memory array), local control and local I/O circuitry; global control, mux, sense amplifier and global I/O circuitry]
SRAM CELL DESCRIPTION
What is a memory cell?
A memory cell is an electronic device that can retain the state of a node even after the input is removed. The most basic memory cell is a latch.
The basic static RAM cell consists of two cross-coupled inverters and two access transistors. The access transistors are connected to the wordline at their gate terminals and to the bitlines at their source/drain terminals.
[Figure: 6T memcell consisting of cross-coupled inverters and pass transistors]
6T MEMORY CELL
* It is a bistable circuit.
* It is capable of being driven into one of two states.
* This cell is “static” since it does not need to have its data refreshed as long as DC power is applied.
[Figures: schematic of memcell; layout of memcell]
6T MEMORY CELL (Contd.)
Read Cycle
[Figure: 6T cell with internal nodes A and B]
The bitlines are precharged to VDD.
Then the wordline is made high.
This causes one of the bitlines to discharge while the other remains high.
The bitlines have a much higher capacitance than a single cell.
As a result, the level of BL becomes lower than BL\, and this differential signal is detected by the differential amplifier (sense amplifier) connected to the BL and BL\ lines.
Condition: V1 < Vtn2
Write Circuitry of SRAM
• The operation of writing 0 or 1 is accomplished by forcing one bitline, either b or b bar, low while the other bitline remains at about VDD.
SENSE AMPLIFIER
During the read operation one bitline discharges while the other remains at VDD. After some time there is a detectable difference between the voltage levels of bitline and bitline bar. A circuit is therefore needed that can detect this small difference, saving time by not waiting for the bitline to discharge fully.
[Figure: sense amplifier circuitry, with signals SAE, A, B, BL and BL\]
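The time saved by sensing a small differential can be estimated from t = C·ΔV / I. The sketch below assumes the cell discharges the bitline with a constant current; the capacitance, current and voltage values are hypothetical.

```python
def discharge_time(c_bitline: float, delta_v: float, i_cell: float) -> float:
    """Time for a constant cell current to move the bitline by delta_v volts."""
    return c_bitline * delta_v / i_cell


C_BITLINE = 200e-15   # hypothetical bitline capacitance, farads
I_CELL = 50e-6        # hypothetical cell read current, amperes

t_sense = discharge_time(C_BITLINE, delta_v=0.1, i_cell=I_CELL)  # small differential
t_full = discharge_time(C_BITLINE, delta_v=1.2, i_cell=I_CELL)   # full discharge from VDD
print(f"time to sense: {t_sense:.2e} s, time to full discharge: {t_full:.2e} s")
```

Under this simple model the read time scales linearly with the voltage swing the sense amplifier needs.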
PERFORMANCE ASPECTS OF MEMORY CELL
1. GROUND BOUNCE:
At node ‘A’ there will be a voltage, which can rise due to wrong transistor size selection and cause the memory cell to flip. This voltage is known as GROUND BUMP or GROUND BOUNCE. From the design point of view it should be kept low.
[Figure annotation: W/L should be less than that of M1]
PERFORMANCE ASPECTS OF MEMORY CELL (Contd.)
2. STATIC NOISE MARGIN (SNM)
Noise margin describes the amount of variation in signal levels that can be tolerated while a signal is being transmitted. In short, static noise margin describes the noise tolerance of the memory cell during the read operation. To retain data while reading in a noisy environment, some margin must be kept when designing the memory cell, and it should be more than 10% of Vdd.
Condition for a safe read: VA (ground bounce) + VNOISE < VTpulldown (threshold voltage of the pull-down transistor)
Violating this condition leads to a destructive ‘Read’.
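The two checks stated above (the 10% of Vdd guideline and the ground-bounce inequality) can be written as a small helper. The function names and the numeric values are hypothetical; only the two conditions come from the slide.

```python
def read_is_safe(v_ground_bounce: float, v_noise: float, vt_pulldown: float) -> bool:
    """True when ground bounce plus noise stays below the pull-down threshold."""
    return v_ground_bounce + v_noise < vt_pulldown


def snm_meets_guideline(snm: float, vdd: float) -> bool:
    """True when the static noise margin exceeds 10% of Vdd."""
    return snm > 0.10 * vdd


# Hypothetical values in volts.
print(read_is_safe(v_ground_bounce=0.15, v_noise=0.10, vt_pulldown=0.35))
print(read_is_safe(v_ground_bounce=0.30, v_noise=0.10, vt_pulldown=0.35))
print(snm_meets_guideline(snm=0.15, vdd=1.2))
```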
PROCEDURE FOR FINDING OUT SNM
The noise margin of the memory cell can be analyzed by adding noise sources (VCVS) of opposite polarity.
EFFECT OF TRANSISTOR SIZING ON SNM
Effect of pass transistor aspect ratio variation on SNM: SNM decreases as the (W/L) ratio of the pass transistor increases.
Effect of pull-up transistor aspect ratio variation on SNM: SNM increases as the (W/L) ratio of the pull-up transistor increases, because Vdd is maintained on node B.
WORST CASE PVT CONDITIONS FOR STATIC NOISE MARGIN
SNM is found to be worst at the FS corner (where NMOS is fast and PMOS is slow), high voltage and high temperature.
• FAST NMOS: The current-carrying capacity of a fast NMOS is higher because its resistance is low, so it pulls down the voltage at node B quickly.
• SLOW PMOS: The current-carrying capacity of a slow PMOS is lower because its resistance is high, so it cannot hold the value at node B for long.
CHARACTERIZATION OF MEMORY
A memory cut, or cut, is an independent memory unit of fixed size with a particular combination of the required blocks. A generator (or compiler) can generate many cuts from the feasible combinations of parameters within a specified range.
[Figure: a generator producing a cut made of row decoder, core (memory array), control, mux, sense amplifier and I/O circuitry]
For example:
words | bits | mux | bank | redundancy | dft | design | power_supply | e-switch
2048 | 72 | 4 | 4 | no | yes | bb | adv | no
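How a generator's parameter ranges expand into a list of cuts can be sketched as follows. The parameter ranges and the feasibility rule (`MIN_ROWS`) are hypothetical; a real generator defines its own ranges and constraints.

```python
from itertools import product

# Hypothetical parameter ranges for illustration only.
WORDS = [512, 1024, 2048]
BITS = [36, 72]
MUX = [4, 8]
MIN_ROWS = 128  # hypothetical feasibility rule: rows = words / mux


def enumerate_cuts(words, bits, mux, min_rows):
    """Return every feasible (words, bits, mux) combination as a dict."""
    cuts = []
    for w, b, m in product(words, bits, mux):
        if w % m == 0 and w // m >= min_rows:
            cuts.append({"words": w, "bits": b, "mux": m})
    return cuts


cuts = enumerate_cuts(WORDS, BITS, MUX, MIN_ROWS)
print(len(cuts), "feasible cuts")
```

Even three small ranges produce ten cuts here, which suggests why characterizing every cut of a real generator is expensive and why simulator runtime matters.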
FUNDAMENTALS OF CHARACTERIZATION
Memory characterization covers:
• Timing
• Power
• PinCap
• Leakage
Characterization of a memory means obtaining information about its behavior in terms of different timings (access time, setup time, hold time, etc.), powers (dynamic, static and leakage power) and capacitances when a specified set of inputs is applied to the memory. It is done by running simulations and then taking measurements from the simulated values.
Timing Characterization
The timing measurement task is to characterize all the timings defined in the memory generator's specification. This requires defining the constraint timings between the signals applied to the pins. Typically these are the setup and hold times of each signal, such as address, data, chip select, write enable and mask.
Leakage Characterization
The leakage power is characterized to give the customer the power consumed by the memory when it is idle. The memory is selected (CSN low) but all signals are inactive.
Pincap Characterization
The capacitance of all the external pins needs to be characterized. To measure the capacitance of a pin, a voltage source is connected to its node, and the current passing through this voltage source is integrated to calculate the capacitance of the node.
Power Characterization
The power measurement characterizes the various powers consumed by the memory. All measurements should be aligned to the Power Estimation Methodology. Care must be taken that the power measured is that of the actual memory only.
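The pin capacitance measurement described above amounts to C = (∫ i dt) / ΔV. A minimal sketch, assuming the simulator output is available as sampled time and current lists; the waveform below is synthetic (an ideal 10 fF load driven by a 1 V ramp over 1 ns), not measured data.

```python
def integrate_trapezoid(times, currents):
    """Trapezoidal integral of sampled current over time (charge in coulombs)."""
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (currents[k] + currents[k - 1]) * (times[k] - times[k - 1])
    return total


def pin_capacitance(times, currents, delta_v):
    """Capacitance seen by a voltage source that swings by delta_v volts."""
    return integrate_trapezoid(times, currents) / delta_v


# Synthetic waveform: i = C * dV/dt = 10e-15 * (1 V / 1 ns) = 10 uA, constant.
times = [k * 1e-10 for k in range(11)]   # 0 to 1 ns
currents = [1e-5] * len(times)
c_pin = pin_capacitance(times, currents, delta_v=1.0)
print(f"pin capacitance: {c_pin:.3e} F")
```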
CHARACTERIZATION OF MEMORY (Contd.)
MCF takes an input setup and characterizes the instances defined in the cutlist file at the PVTs (Process, Voltage and Temperature) defined in the measure.setup file. It starts the process by first preparing a netlist for the exact instance and then simulates it with the help of the other input files defined in the MCF setup.
Basic Requirements for MCF
• The cutlist contains the cuts that are to be generated for the given generator.
• The measure.setup file contains the process, supply voltage and temperature at which the simulation should run.
• The analysis file specifies which analyses will be run on the netlist of the circuit.
• The stimuli file contains the input waveforms applied to the circuit, which is a memory.
• The header file contains global statements that are independent of PVT, analysis, cut, option, etc.
• The option file contains the simulator options, including the accuracy the simulator should follow.
• The measure.script file contains the measurements that need to be made.
• The spi file contains the netlist of the memory under characterization.
[Figure: Flow of MCF. Blocks shown: CDL, GDS, PRE_LAYOUT, STAR RC EXTRACTION, SPICE.SPI, CUTLIST, PPS, SIMULATOR OPTIONS, STIMULI, OPERATING PVTS, LIBRARY (MODELS), MCF, NETLISTS, SIMULATOR, OUTPUTS (.fsdb, .cou), MEASUREMENT TOOL, RESULT FILES]
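The way a characterization flow fans out over cuts and PVTs can be sketched as below. This is not MCF's actual interface: the `PVT` tuple, the `build_runs` helper, the run-name format and the corner values are all hypothetical.

```python
from itertools import product
from typing import NamedTuple


class PVT(NamedTuple):
    process: str       # e.g. "SS", "TT", "FF"
    voltage: float     # supply voltage in volts
    temperature: int   # degrees Celsius


def build_runs(cut_names, pvts):
    """Pair every cut with every PVT and give each simulation run a name."""
    runs = []
    for cut, pvt in product(cut_names, pvts):
        run_name = f"{cut}_{pvt.process}_{pvt.voltage}V_{pvt.temperature}C"
        runs.append((run_name, cut, pvt))
    return runs


cut_names = ["cut_2048x72m4"]   # hypothetical cut name
pvts = [PVT("SS", 0.9, 125), PVT("TT", 1.0, 25), PVT("FF", 1.1, -40)]
runs = build_runs(cut_names, pvts)
for name, _, _ in runs:
    print(name)
```

The number of simulations is the product of the number of cuts and the number of PVTs, before counting the separate timing, power, leakage and pincap analyses.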
SIMULATORS
Simulation is the imitation of some real thing, state of affairs, or process. Key issues in simulation include:
• The use of simplifying approximations and assumptions within the simulation.
• The fidelity and validity of the simulation outcomes.
NEED OF SPICE SIMULATORS
• High costs of photolithographic masks and other manufacturing prerequisites.
• Characterization of the circuit while taking into account the behavior of other IP connected to the memory.
• For integrated circuits, probing the behavior of internal signals is extremely difficult without the use of circuit simulators.
TYPES OF SIMULATORS
• True SPICE simulators (first-generation circuit simulators)
• Fast SPICE simulators (second-generation circuit simulators)
TRUE SPICE SIMULATORS
• Highly accurate.
• Performance is an issue (large run time).
• Used as the benchmark simulator.
A true SPICE simulator takes a text netlist describing the circuit elements (transistors, resistors, capacitors, etc.) and their connections, and translates this description into equations to be solved. The general equations produced are nonlinear differential algebraic equations, which are solved using implicit integration methods, Newton's method and sparse matrix techniques.
At each time step, SPICE builds a small-signal model (i.e. a linear model) at the operating point.
BASIC OPERATION:
Constructs and solves the matrix equation A.x = b, where:
• the rows of A are the KCLs, the KVLs and the I-V equations;
• x holds the nodal voltages, branch voltages and branch currents.
[Figure: structure of A, x and b; entries marked "Linear" are fixed, the others are marked "Nonlinear"]
Before transient or other analysis is performed, a stable DC operating point must be set; this can either be calculated by the simulator or given by the user.
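As an illustration of how a DC operating point can be calculated with Newton's method, the sketch below solves a single nonlinear nodal equation: a voltage source feeding a diode through a resistor. The circuit, the component values and the function names are assumptions chosen for the example, not taken from any simulator.

```python
import math


def kcl_residual(v, v_source=1.0, r=1e3, i_sat=1e-14, v_t=0.02585):
    """Sum of currents leaving the diode node (zero at the operating point)."""
    return (v - v_source) / r + i_sat * (math.exp(v / v_t) - 1.0)


def kcl_conductance(v, r=1e3, i_sat=1e-14, v_t=0.02585):
    """Derivative of the residual: the linearized conductance at voltage v."""
    return 1.0 / r + i_sat * math.exp(v / v_t) / v_t


def dc_operating_point(v_guess=0.6, tol=1e-12, max_iter=50):
    """Newton iteration: linearize at v, solve the linear problem, repeat."""
    v = v_guess
    for _ in range(max_iter):
        step = kcl_residual(v) / kcl_conductance(v)
        v -= step
        if abs(step) < tol:
            return v
    raise RuntimeError("Newton iteration did not converge")


v_op = dc_operating_point()
print(f"diode node voltage: {v_op:.4f} V")
```

A full simulator does the same thing with a matrix of conductances in place of the single derivative, which is where the sparse matrix techniques come in.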
Once a stable DC operating point has been found, SPICE can proceed to transient (time-domain) analysis. During transient analysis, the simulator attempts to compute an accurate approximation to the analytical solution for the given circuit at discrete time points using a numerical integration method.
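A minimal sketch of one implicit integration method, backward Euler, applied to an RC discharge (dv/dt = -v / RC) and compared with the analytical solution. The component values and step count are hypothetical, and production simulators also use other methods and adapt the step size.

```python
import math


def backward_euler_rc(v0, r, c, t_end, steps):
    """Integrate dv/dt = -v/(r*c) from 0 to t_end with backward Euler."""
    h = t_end / steps
    v = v0
    for _ in range(steps):
        # Implicit update: v_next = v + h * (-v_next / (r*c)), solved for v_next.
        v = v / (1.0 + h / (r * c))
    return v


R, C = 1e3, 1e-12                       # hypothetical 1 kohm, 1 pF
v_numeric = backward_euler_rc(1.0, R, C, t_end=R * C, steps=100)
v_exact = math.exp(-1.0)                # analytical value after one time constant
print(f"numeric: {v_numeric:.5f}, exact: {v_exact:.5f}")
```

Because this circuit is linear, the implicit update can be solved in closed form; with nonlinear devices each time step needs a Newton iteration instead.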
STRENGTHS :
The main benefit True SPICE simulators offer over other simulators is precision. Because of this, they are used as the reference or benchmarking simulators.
LIMITATIONS:
1. While True SPICE offers superior precision, it can do so only for circuits of limited size. The bigger the circuit, the larger and more complex the matrix equations to be solved, which naturally takes more time.
2. Convergence can be an issue for some classes of circuits, particularly sensitive or high-gain ones. This is purely an issue with numerical integration, and is worse for some integration methods than others. As a circuit gets larger and includes more nonlinear behavior, convergence becomes more difficult.
TRUE SPICE SIMULATORS (Contd.)
Example of True-SPICE simulators
• ELDO (Mentor Graphics)
• SPECTRE (CADENCE)
• HSPICE (SYNOPSYS)
INCREASING NEED FOR FAST-SPICE
[Figure: Need of Fast SPICE simulators. Number of coupling capacitors and number of CMOS elements (increasing complexity) across CMOS technologies C90, C65, C45 and C32]
INCREASING NEED FOR FAST-SPICE (Contd.)
• Performance tuning options
• Yield enhancement features
Increasing features mean more validation cost.
FAST-SPICE SIMULATORS
Features
• Fast-SPICE simulators are transistor-level circuit simulators that achieve faster simulation runtime than SPICE.
• Fast-SPICE is generally used for functional verification of large designs that cannot be simulated in SPICE.
• Fast-SPICE simulators take “short-cuts” based on knowledge of present technologies to improve simulation performance.
Common technologies used in Fast-SPICE
• Circuit Partitioning
• Table Look-up Model
• Dynamic Time Step Control
Circuit Partitioning
Circuit partitioning cuts a single large system into smaller “independent” groups, so that many smaller matrices are solved instead of one large one.
[Fig 4.5 - Circuit Partitioning (Block Diagram): elements numbered 1 to 9 divided into groups]
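The idea of splitting a system into independent groups can be illustrated by finding the connected components of a node graph. This is a simplified sketch: the node numbers and connections are hypothetical, and real Fast-SPICE tools use more elaborate criteria to decide where to cut.

```python
def partition(nodes, edges):
    """Group nodes into connected components using union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]   # path halving keeps trees shallow
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b         # merge the two groups

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical circuit: nine nodes with three strongly connected clusters.
nodes = list(range(1, 10))
edges = [(1, 2), (2, 3), (4, 5), (5, 6), (7, 8), (8, 9)]
print(partition(nodes, edges))
```

Each group can then be solved as its own small matrix, and the cost of solving three 3-node systems is lower than that of one 9-node system.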
Table Lookup Model
A table lookup model replaces the analytical device model in order to speed up simulation.
• It describes the characteristics of the specified MOS model based on the device size.
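A table lookup can be sketched by precomputing a device curve on a grid and interpolating between grid points at simulation time. The square-law model below stands in for an expensive analytical model; its parameters, the grid and the function names are assumptions for illustration.

```python
def square_law_current(v_gs, k=1e-3, v_th=0.4):
    """Stand-in analytical model: saturation drain current of a MOS device."""
    return 0.5 * k * (v_gs - v_th) ** 2 if v_gs > v_th else 0.0


def build_table(model, v_min, v_max, points):
    """Evaluate the model once on a uniform grid."""
    step = (v_max - v_min) / (points - 1)
    return v_min, step, [model(v_min + k * step) for k in range(points)]


def lookup(table, v):
    """Linear interpolation between the two nearest table entries."""
    v_min, step, values = table
    pos = (v - v_min) / step
    k = min(max(int(pos), 0), len(values) - 2)   # clamp to a valid segment
    frac = pos - k
    return values[k] + frac * (values[k + 1] - values[k])


table = build_table(square_law_current, 0.0, 1.2, 121)   # 10 mV grid
approx = lookup(table, 0.835)
exact = square_law_current(0.835)
print(f"table: {approx:.6e} A, analytical: {exact:.6e} A")
```

The lookup costs a few arithmetic operations regardless of how complex the original model is, at the price of a small interpolation error set by the grid spacing.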
FAST-SPICE SIMULATORS (Contd.)
Examples of Fast-SPICE simulators:
• XA, NANOSIM, HSIM
• ULTRASIM
• ADiT
• FINESIM
DETAILED STUDY OF XA
• XA is a high-capacity, easy-to-use tool for transistor-level circuit simulation. XA supports simulation of netlists in both the pre-layout and post-layout domains. It provides a scalable accuracy vs. speed trade-off.
• Time to results (setup/configuration plus simulation time) is typically much better than with other Fast-SPICE simulators.
• It combines the best Fast-SPICE technologies from NanoSim, HSIM and Star-SimXT, plus new technologies.
NEED OF XA - DEGRADING PERFORMANCE OF NANOSIM
Average NANOSIM simulation time in seconds for a sample of 35 simulations:
Technology | Average simulation time (s)
C90 | 19.375
C65 | 34.25
C45 | 141
LSF CONSIDERATION WITH SIMULATORS
LSF queue | Runtime limit | Priority
Short | 30 mins | High
Medium | 4 hours | Medium
Infinite | No limit | Low
[Figure: runtime and queue wait time for Simulator A and Simulator B on the SHORT and MEDIUM LSF queues; one simulator has an average runtime greater than 30 mins, the other less than 15 mins]
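The effect of runtime on queue choice can be sketched as a simple rule: a job goes to the highest-priority queue whose limit it fits. The limits below assume the queue table reads short = 30 mins, medium = 4 hours, infinite = no limit; the `pick_queue` helper is hypothetical and not part of LSF.

```python
# (queue name, runtime limit in minutes); None means no limit.
# Ordered from highest to lowest priority.
QUEUES = [("short", 30), ("medium", 240), ("infinite", None)]


def pick_queue(expected_runtime_min: float) -> str:
    """Return the highest-priority queue whose limit covers the runtime."""
    for name, limit in QUEUES:
        if limit is None or expected_runtime_min <= limit:
            return name
    raise ValueError("no queue available")


for runtime in (15, 45, 600):
    print(f"{runtime} min -> {pick_queue(runtime)}")
```

Under this rule a simulator averaging under 15 mins stays in the short queue, while one averaging over 30 mins is pushed to the medium queue with lower priority.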
HISTORY OF FASTSPICE SIMULATORS IN ST MICROELECTRONICS
[Figure: timeline. Years: 2000, 2002, 2005, 2007, 2010. Technology nodes: 120nm, 90nm, 65nm, 45nm, 32nm. Simulators shown: HSIM, NANOSIM, XA]
XA PERFORMANCE DATA - TIMING
Option provided in XA for the accuracy-performance trade-off: SET_SIM_LEVEL

Accuracy (25944 samples)
Setting | Absolute Diff (Max) | Absolute Diff (Min) | % Diff (Max) | % Diff (Min)
SET_SIM_LEVEL 6 | 27ps | 0ps | 11.43% | 0%
SET_SIM_LEVEL 5 | 254ps | 1ps | 18.31% | 0.12%

Performance (606 samples)
Metric | Max | Min | Avg
Gain over True SPICE | 77.95X | 3.71X | 24.87X
Absolute value | 2152 sec | 127 sec | 333.23 sec
[Figure: two plots, one for SET_SIM_LEVEL 6 and one for SET_SIM_LEVEL 5; axis labels not recoverable]
XA PERFORMANCE DATA - POWER

Accuracy (25944 samples)
Setting | Absolute Diff (Max) | Absolute Diff (Min) | % Diff (Max) | % Diff (Min)
SET_SIM_LEVEL 6 | 1.347μA/MHz | 0μA/MHz | 1674.7% | 0%
SET_SIM_LEVEL 5 | 1.695μA/MHz | 0.001μA/MHz | 1865.3% | 0.01%

Performance (606 samples)
Metric | Max | Min | Avg
Gain over True SPICE | 53.73X | 1.7X | 17.89X
Absolute value | 2911 sec | 121 sec | 505.87 sec
[Figure: two plots, one for SET_SIM_LEVEL 6 and one for SET_SIM_LEVEL 5; axis labels not recoverable]
XA PERFORMANCE DATA - LEAKAGE
Metric | Max | Min | Avg
Leakage absolute diff | 22uA | 0.5uA | n/a
Leakage % diff | 8% | 0.04% | n/a
Runtime gain | 7.15X | 1.6X | 4.7X
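The kinds of figures reported in these tables (absolute difference, percentage difference and runtime gain against a reference simulator) can be computed as sketched below. The sample values are made up, and taking the average gain as the mean of per-sample ratios is an assumption; the slides do not say how the averages were computed.

```python
def accuracy_summary(reference, candidate):
    """Max/min absolute and percentage difference of candidate vs reference."""
    abs_diff = [abs(c - r) for r, c in zip(reference, candidate)]
    pct_diff = [100.0 * d / abs(r) for d, r in zip(abs_diff, reference) if r != 0]
    return {
        "abs_max": max(abs_diff), "abs_min": min(abs_diff),
        "pct_max": max(pct_diff), "pct_min": min(pct_diff),
    }


def runtime_gain(reference_runtimes, fast_runtimes):
    """Max, min and mean of the per-sample speed-up ratios."""
    gains = [r / f for r, f in zip(reference_runtimes, fast_runtimes)]
    return {"max": max(gains), "min": min(gains), "avg": sum(gains) / len(gains)}


# Made-up sample data: two timing measurements (ps) and two runtimes (sec).
accuracy = accuracy_summary(reference=[100.0, 200.0], candidate=[110.0, 190.0])
gain = runtime_gain(reference_runtimes=[1000.0, 2000.0], fast_runtimes=[100.0, 500.0])
print(accuracy)
print(gain)
```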
WORK DONE IN SIMULATORS
1. ACCURACY CHECKS
Many of the new features of Fast SPICE simulators were checked at different design corners and PVTs, and their accuracy was compared with True SPICE results.
2. RUN TIME IMPROVEMENT
Many of the options were tweaked to adjust the speed-accuracy trade-off (for both True SPICE and Fast SPICE simulators), and the effect on runtime was analyzed while keeping accuracy within permissible limits.
3. TESTCASE TRANSFER
Many of the violations and accuracy issues observed in the simulator over several runs (during timing, leakage and power characterization) were compiled and reported as testcases to the simulator vendors, mentioning the cause and location of the faults, which were then resolved at their end.
?QUESTIONS?
Thank you for your time & attention