A Novel Low-Leakage 8T Differential SRAM Cell
Khawar Sarfraz
Department of Electrical Engineering
School of Science & Engineering
Lahore University of Management Sciences (LUMS)
Opp. Sector U, DHA, Lahore 54792, Pakistan
Abstract—A novel low-leakage 8T differential SRAM cell is
presented in 65nm bulk CMOS technology with the aim to
address standby power consumption of embedded memories. The
proposed cell supports differential read and write operation.
HSPICE simulations incorporating process variation indicate
that the proposed cell exhibits 84.8%, 44.8%, 42.6% and 27.0%
less leakage under worst case conditions compared to 5T, 6T, 8T
and 9T cells reported earlier in the literature. With approximately
60% and 33% more area than the conventional 5T and the 6T
cells and 11% less area than the 9T cell, the proposed cell
provides 3900.5x and 1.3x improvement in read static power
noise margin compared to its 5T and 6T counterparts,
respectively, under worst case conditions.
Keywords-8T SRAM cell; low-leakage; differential; read-SNM-free; process variation; low-voltage; cache
I. INTRODUCTION
Embedded memories constitute a large fraction of the total area budget of modern microprocessors – up to 90% as reported for the Montecito processor in [1]. It therefore becomes imperative to keep the leakage current of large memory instances in check, since it directly governs battery lifetime in handheld devices. To this end, numerous SRAM cell topologies have been proposed in the literature, each of which excels in area, performance or power.
This work presents a novel low-leakage 8T SRAM cell topology in 65nm bulk CMOS technology [2]. The proposed cell supports differential read and write operation, and leaks less than the 5T cell presented in [3], the standard 6T cell, the 8T cell presented in [4] and the 9T cell presented in [5]. The behavior of the cell under various operating modes is explained and supported with HSPICE simulations incorporating process variation over an operating temperature range of -10°C to +125°C. A comprehensive cell stability analysis reveals improved read stability compared to the 5T and 6T cells.
This paper is organized as follows. An overview of the proposed cell is presented in Section II. Cell write operation is presented in Section III and cell read operation in Section IV. Section V focuses on the standby state of the cell, Section VI deals with the device sizing methodology, and Section VII investigates the cell’s read stability. The paper concludes in Section VIII with a summary of important results and a discussion of the benefits of the new topology.
II. THE PROPOSED 8T SRAM CELL
The design of the proposed cell is based on two existing cell topologies, the 5T cell proposed in [3] and the 9T cell proposed in [5]. These two cells are presented in Fig. 1.
Figure 1. (a) 5T cell. (b) 9T cell
[Schematics: storage nodes Q and Qb; bitlines BL and BLb; 5T cell with transistors M1-M5 and access signal AXS; 9T cell with transistors M1-M9 and control signals WR and RD.]
Figure 2. Proposed 8T SRAM Cell
[Schematic: transistors M1-M8; storage nodes Q and Qbar; bitlines BL and BLbar; wordlines RWL and WWL. Annotated device sizes: 65/280 (one device), 75/75 (four devices), 100/75 (three devices).]
The proposed cell is presented in Fig. 2. It takes power directly from the bitlines and does not have the conventional access transistors. It consists of a cross-coupled inverter pair, M1-M2 and M3-M4, which acts like a latch. A coupling transistor M5, connected between the two storage nodes Q and Qbar, is driven by the write wordline (WWL) only during the write mode. Transistors M6 and M7 are driven by the potential at the storage nodes of the cell, and together with transistor M8 they help create a path for the read current to flow from one of the bitlines to ground without disturbing the potential of the storage nodes. M8 is driven by the read wordline (RWL) only during the read mode. In comparison to a standard 6T cell, the RWL is the only additional signal required in this topology.
978-1-4577-0170-2/11/$26.00 ©2011 IEEE
2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip
III. CELL WRITE OPERATION
The write operation is illustrated in detail in Fig. 3. A standard column precharge and equalization circuit consisting of transistors M9, M10 and M11 is presented to highlight the supply of the cell. A conceptual view of write drivers M12 and M13 at the bottom of the column is also presented, which are only driven during the write mode. A write operation is initiated by asserting the write wordline and simultaneously lowering the supply voltage of the inverter storing logic one at its output. The PRE and RWL signals are maintained at logic zero during the entire write operation. During a successful write to a particular cell, non-accessed cells in the same column must retain their state. The sizing ratio of M9 and M12 (or M10 and M13) governs the bit-line voltage during write and hence the minimum allowed Q potential of non-accessed cells in the same column, which store logic one. Proper cell sizing ensures the cell has a high write margin keeping in view 10% variation on the supply rail. That is necessary because the non-accessed cells should be able to bear the reduced bit-line voltage (i.e. not flip) with their coupling transistors switched off. Similarly, all cells in a row share the same write wordline. Cells in non-accessed columns of the same row have their write drivers turned off and hence do not flip their state on the activation of the write wordline.
The initiation of a write operation is presented in Fig. 3a, while the flow of write currents after the cell has flipped its state is presented in Fig. 3b. The coupling transistor M5 and the write driver transistor M12 are turned on simultaneously, as illustrated in Fig. 3a. Turning on M12 pulls BL down to 2/3 VDD [3]. If Q is initially high, then turning on M5 in conjunction with M12 provides a path for the write current IWRITE-A to flow through the cell, thus reducing the potential difference between nodes Q and Qbar [3]. The initially high Q node is therefore pulled down to a level below the trip point VM of inverter M3-M4, as illustrated in Fig. 3b. Consequently, node Qbar rises to a voltage above VM. Keeping M5 on after the cell has flipped its state allows three different currents to flow through the cell. Since both inverters are partially on, short circuit currents IWRITE-L and IWRITE-R flow from BL and BLbar respectively to ground. In addition, the direction of the write current through the cell, labeled IWRITE-B, is now reversed since the cell has flipped its state. When M5 and M12 are turned off at the end of a write operation, nodes Q and Qbar assume their new logic states of zero and one respectively. Also, write current IWRITE-B and short circuit currents IWRITE-L and IWRITE-R cease to flow.
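As a rough first-order illustration of how the M9/M12 sizing ratio sets the bitline level during a write, the precharge device and the write driver can be treated as two resistors in series between VDD and ground. This is a simplifying assumption (real devices are nonlinear and the cell itself draws current from the bitline); the function name and the on-resistance values are hypothetical and are not taken from the paper.

```python
def bitline_voltage(vdd, r_precharge, r_driver):
    """Bitline voltage for a resistive divider: VDD -> r_precharge -> BL -> r_driver -> GND."""
    return vdd * r_driver / (r_precharge + r_driver)

# Hypothetical on-resistances in ohms (assumed, not from the paper).
# A 1:2 ratio of precharge to driver resistance yields BL = 2/3 VDD.
VDD = 1.2
R_M9 = 1_000.0    # precharge device (on, since PRE = 0 during write)
R_M12 = 2_000.0   # write driver

v_bl = bitline_voltage(VDD, R_M9, R_M12)
print(f"BL during write = {v_bl:.3f} V ({v_bl / VDD:.3f} x VDD)")
```

Under this model, a stronger write driver (smaller R_M12) lowers the bitline further, which eases the write but reduces the margin of non-accessed cells in the same column.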
A simulated write timing diagram illustrating key signals of interest is presented in Fig. 4, in which node Q has been assumed to be logic one initially. A comparative cell write operation is also shown in Fig. 5, where the reduction in potential between the internal storage nodes for the 5T and the proposed 8T cell is clearly visible under process variation. The wordline has to be kept on until the voltage on the storage nodes reaches the trip point of the opposite inverter. This is the reason the write operation is slower for the 5T and the proposed 8T cells than for the 6T, original 8T and 9T cells.
If the supply voltage is lowered below the device threshold voltage, turning on the write wordline is sufficient to flip the state of the cell without the need to turn on write drivers in certain process corners. This is not acceptable because cells in non-accessed columns of the same row would flip without intentionally being written to.
Figure 3. Cell write operation. (a) Initiation of write. (b) Write currents through the cell after the cell has flipped its state
[Schematics: cell transistors M1-M8 with column devices M9-M11 and write drivers M12-M13; bitline capacitances Cbl and Cblb; PRE = 0 and RWL = 0 in both panels. (a) Q = 1, Qbar = 0, write current IWRITE-A. (b) WWL = 1, WR-0 = 1, WR-1 = 0, Q < VM, Qbar > VM, BL = 2/3 VDD, BLbar = VDD, currents IWRITE-B, IWRITE-L and IWRITE-R.]
Figure 4. Write timing diagram (T=-10°C to 125°C, Process=NOM, VDD=1.2V)
[Waveform panels, voltage (V) versus time (ns, 0.4 to 1.6): WWL, WR-0, BL and BLb; Q and Qb of the accessed cell; Q and Qb of a non-accessed cell in the same row; Q and Qb of a non-accessed cell in the same column. Arrows mark the direction of increasing temperature.]
Figure 5. Cell write operation for different cells in SLOW corner (T=125°C, Process=SNSP, VDD=1.08V), Typical corner (T=25°C, Process=NOM, VDD=1.2V) and FAST corner (T=-10°C, Process=FNFP, VDD=1.32V), legend
[Waveform panels for the SLOW, TYPICAL and FAST corners: Q and Qbar (V) versus time (ns, 0.4 to 1.2). Legend: 5T cell, 6T cell, 8T original, 8T proposed, 9T cell, WL.]
IV. CELL READ OPERATION
The read operation of the proposed cell is illustrated in Fig. 6 together with a matrix column view, which was designed and implemented to read data out of the cell. At the start of a read operation, the precharged bitlines are left floating. Soon thereafter, the read wordline is asserted, which allows read current IREAD to flow from BLbar through M7 and M8 to ground, assuming node Q is logic one. The potential at Qbar therefore does not change during a read operation. The write wordline is not asserted during a read operation. The read operation presented here is very different from that in [3] where read current flows through node Qbar, hence disturbing its potential and reducing the cell’s read Static Noise Margin (SNM). Similar to a conventional 6T SRAM cell, a differential bitline voltage of 200mV is allowed to develop before the sense amplifier in Fig. 6 is triggered. A simulated read timing diagram illustrating key signals is presented in Fig. 7. The magnitude of bitline capacitance and the size of transistors M6, M7 and M8 (in Fig. 2) determine the rate of discharge of the bitline. Since all cells in a row share the same read wordline, cells in non-accessed columns of the same row experience a parasitic read operation.
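The dependence of the discharge rate on bitline capacitance and read current can be sketched with a constant-current approximation, t = C·ΔV / I. The 200 mV differential comes from the text; the capacitance and current values below are hypothetical placeholders, and the sketch assumes the other bitline stays at its precharged level so that the differential equals the drop on the discharging bitline.

```python
def sensing_delay(c_bitline_f, delta_v, i_read_a):
    """Time (s) for a floating bitline to drop by delta_v at a constant read current."""
    return c_bitline_f * delta_v / i_read_a

C_BL = 100e-15   # hypothetical bitline capacitance (F), assumed for illustration
I_READ = 20e-6   # hypothetical read current through M7 and M8 (A), assumed
DELTA_V = 0.2    # 200 mV differential required before sensing (from the text)

t = sensing_delay(C_BL, DELTA_V, I_READ)
print(f"Time to develop 200 mV: {t * 1e9:.2f} ns")
```

In this approximation the delay scales linearly with bitline capacitance and inversely with read current, which is why the sizes of M6, M7 and M8 set the read performance.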
If a read operation is attempted on a cell that stores a logic one at node Q in the sub-threshold operating region, the non-accessed cells in the same column which store a logic one at node Qbar would flip well before the development of a sufficient differential bitline voltage, which is necessary for reliable sensing. The situation gets worse in the presence of process variation.
Figure 6. A column view illustrating read operation
[Schematic: accessed cell (M1-M8, Q = 1, Qbar = 0, RWL = 1, WWL = 0) and non-accessed cells on bitlines BL and BLbar (capacitances Cbl and Cblb) with read current IREAD; column devices M9-M13; devices M14-M21 including the sense amplifier; data lines DL and DLbar (capacitances CDL and CDLb); control signals PRE, LPRE, YMUX and SAE.]
Figure 7. Read timing diagram (T=-10°C to 125°C, Process=NOM, VDD=1.2V)
[Waveform panels, voltage (V) versus time (ns, 0.3 to 1.5): L/PRE and RWL; BL and BLbar; YMUX and SAE; DL and DLbar; Q and Qbar. Arrows mark the direction of increasing temperature.]
V. CELL IN STANDBY
During standby, i.e. in the absence of a read or write operation, the cell holds its state. The bitlines are tied to the supply rail via the precharge circuit, while the WWL and RWL signals are maintained at logic zero. Leakage currents thus flow through the cell as illustrated in Fig. 8, consisting primarily of sub-threshold leakage, gate leakage and gate induced drain leakage.
Figure 8. Cell leakage currents
[Schematic of the cell in standby (Q = 1, Qbar = 0, RWL = 0, WWL = 0, BL = 1, BLbar = 1) marking sub-threshold leakage, gate leakage and gate induced drain leakage paths.]
Figure 9. Worst case cell leakage current comparison (Process=FNFP, VDD=1.32V)
[Plot: cell leakage current (nA, axis 10 to 80) versus temperature (deg C, axis -15 to 125) for the 5T cell, 6T cell, 8T cell, proposed 8T cell and 9T cell.]
In Fig. 9, a comparison of leakage currents of different SRAM cells is presented over the entire operating temperature range, under worst case leakage conditions (FNFP, 1.32V). The compared cells were sized for the same read performance and similar write performance. Only supply and ground rails were employed in the simulation of the cells. WL (6T cell), WWL and RWL (this work), AXS (5T cell [3]), RWL (8T cell [4]), and WR and RD signals (9T cell [5]) were maintained at logic zero. Additional leakage reduction techniques, including but not limited to negative gate voltage [5], source biasing [6] and forward body-biasing [7], were not employed. It can be seen that the proposed cell exhibits lower levels of leakage over the entire temperature scale, leaking 84.8%, 44.8%, 42.6% and 27.0% less than its 5T, 6T, 8T and 9T counterparts respectively, primarily due to the absence of access transistor M6, shown in Fig. 1b. The 5T cell leaks the most due to its much larger size.
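The quoted percentage reductions can also be read as leakage ratios. The short sketch below converts each one, assuming that "X% less" means the proposed cell's leakage equals (1 - X) times the reference cell's leakage; the dictionary and function names are illustrative only.

```python
# Leakage reductions of the proposed cell relative to each reference cell (from the text)
reductions = {"5T": 0.848, "6T": 0.448, "8T": 0.426, "9T": 0.270}

def leakage_ratio(reduction):
    """How many times more the reference cell leaks than the proposed cell."""
    return 1.0 / (1.0 - reduction)

for cell, r in reductions.items():
    print(f"{cell} cell leaks {leakage_ratio(r):.2f}x the proposed cell")
```

Under that reading, the 5T cell leaks roughly 6.6 times as much as the proposed cell, while the 9T cell leaks roughly 1.4 times as much.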
VI. TRANSISTOR SIZING METHODOLOGY
As mentioned in Section V, leakage current comparison is based on all cells being sized for the same read and similar write performance. All cell transistors, with the exception of the coupling transistor in the proposed 8T cell, were sized with length greater than the minimum feature size in order to reduce cell leakage currents [8]. In 65nm technology [2], NMOS and PMOS leakage currents are reduced by a factor of 10 and 18 respectively when channel length is increased by 10nm over the minimum feature size. The increase in length may not be necessary in certain technology nodes where low-leakage dedicated SRAM transistors are employed. Hence a trade-off exists between standby power consumption and cell area.
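To make the quoted reduction factors concrete, the following sketch applies them to per-device leakage values. The factors of 10 (NMOS) and 18 (PMOS) for a 10nm increase in channel length come from the text; the starting currents and the function name are hypothetical and chosen only for illustration.

```python
NMOS_REDUCTION = 10.0  # leakage reduction factor for NMOS (from the text)
PMOS_REDUCTION = 18.0  # leakage reduction factor for PMOS (from the text)

def lengthened_leakage(i_nmos_min_na, i_pmos_min_na):
    """Return (NMOS, PMOS) leakage in nA after a 10 nm increase in channel length."""
    return i_nmos_min_na / NMOS_REDUCTION, i_pmos_min_na / PMOS_REDUCTION

# Hypothetical minimum-length leakage values in nA (assumed, not from the paper)
i_n, i_p = lengthened_leakage(20.0, 9.0)
print(f"NMOS: {i_n:.2f} nA, PMOS: {i_p:.2f} nA")
```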
Transistor sizes for the compared cells are explicitly shown for reference in Fig. 10. The critical transistor in the proposed cell is the coupling transistor M5 (in Fig. 2). M5 has to be sized carefully such that it is strong enough to flip the accessed cell during a write operation when the bitline connected to the node storing a logic 1 is lowered to 2/3 VDD. Bitlines of non-accessed cells in the same row are maintained at VDD, hence M5 must at the same time be weak enough to prevent a cell flip when the write wordline is activated. Moreover, it must also be weak enough to avoid a cell flip in all non-accessed cells which lie in the same column, and which experience reduced voltage on one of the bitlines. These aspects are illustrated graphically in Fig. 4. Furthermore, the coupling transistor must be able to perform under process variation effects.
The cell read operation is governed by transistors M6, M7 and M8 (in Fig. 2). They have been sized to match the read performance of compared SRAM cells. A specific performance level has not been targeted here. The emphasis has been placed on standby power consumption of different cells that exhibit same read and similar write performance.
Figure 10. Transistor sizes for the compared memory cells. (a) 5T cell. (b) 6T cell. (c) Proposed 8T cell. (d) Original 8T cell. (e) 9T cell.
[Schematics annotated with device sizes. (a) 5T cell: 210/75 (one device), 900/75 (two), 500/75 (two). (b) 6T cell: 75/75 (two), 170/75 (two), 80/75 (two). (c) Proposed 8T cell: 65/280 (one), 75/75 (four), 100/75 (three). (d) Original 8T cell: 75/75 (six), 100/75 (two). (e) 9T cell: 75/75 (six), 100/75 (three).]
VII. CELL READ STABILITY ANALYSIS
The storage nodes of the proposed cell are completely isolated from the bitlines during a read operation. If node Q is initially high, then the fact that read current does not flow through node Qbar enhances the read SNM of the cell in comparison to the original 5T cell in [3].
A complete picture of a cell’s read stability is presented by both its Static Voltage Noise Margin (SVNM) and its Static Current Noise Margin (SINM) [9]-[10]. A comparison of read SVNM, SINM and Static Power Noise Margin (SPNM) of different SRAM cells is presented in Fig. 11 in the worst case (T=125°C, VDD=1.08V, Process=FNSP). The proposed cell, the 8T cell and the 9T cell exhibit equal SVNM, SINM and hence SPNM values due to the cell topology they display during a read operation. If the off transistors are removed from the SRAM cell circuit as is done for analysis in [11]-[12], the resultant cell circuit is identical for all three cells. This is theoretically verified by the fact that SVNM and SINM are functions of VDD, VT and β cell ratios [1], and all three cells possess equal values for these parameters.
Figure 11. (a) Worst case read SVNM comparison. (b) Worst case read SINM comparison. (c) Worst case read SPNM comparison (T=125°C, Process=FNSP, VDD=1.08V)
[Bar charts by cell type. (a) Read SVNM (mV): 5T cell 30; 6T cell 325; proposed 8T cell, 9T cell and 8T cell 397 each. (b) Read SINM (µA): 5T cell 0.1; 6T cell 40.5; proposed 8T cell, 9T cell and 8T cell 30.9 each. (c) Read SPNM (nW): 5T cell 2; 6T cell 6165; proposed 8T cell, 9T cell and 8T cell 7801 each.]
Since the SVNM and SINM values of the 5T and 6T cells were different, it was necessary to compare them on an SPNM criterion. The results in Fig. 11c show that the proposed 8T cell matches the read stability of the 8T and 9T cells and is more stable than the 5T and 6T cells. The SINM value of the 6T cell is higher because the cell was sized to match the read performance of the other cells, as already outlined in Section VI. The SVNM and SINM of the 5T cell are considerably lower because a heavy read current flows through the zero storage node of the cell during a read operation, as discussed in Section IV.
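The improvement factors quoted for read SPNM follow directly from the bar values in Fig. 11c. The sketch below reproduces that arithmetic, taking the values as read from the figure labels (2 nW for the 5T cell, 6165 nW for the 6T cell and 7801 nW for the proposed 8T cell); the variable and function names are illustrative.

```python
# Worst case read SPNM in nW, as read from the bar labels of Fig. 11c
spnm_nw = {"5T": 2.0, "6T": 6165.0, "proposed 8T": 7801.0}

def improvement(new_value, reference_value):
    """Ratio of the proposed cell's SPNM to a reference cell's SPNM."""
    return new_value / reference_value

over_5t = improvement(spnm_nw["proposed 8T"], spnm_nw["5T"])
over_6t = improvement(spnm_nw["proposed 8T"], spnm_nw["6T"])
print(f"Improvement over 5T: {over_5t:.1f}x")
print(f"Improvement over 6T: {over_6t:.1f}x")
```

These ratios agree with the 3900.5x and 1.3x figures stated in the abstract and conclusion.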
VIII. CONCLUSION
A novel 8-transistor, low-leakage differential SRAM cell has been proposed to address the standby power consumption problem of high-performance embedded caches, which see extensive use in Systems on Chip (SoC) employed in modern handheld audio-visual and navigation equipment. The proposed cell is not only larger than the standard 6T cell but also requires an extra control signal, the RWL. However, it leaks 84.8%, 44.8%, 42.6% and 27.0% less than its 5T, 6T, 8T and 9T counterparts under worst case conditions (T=125°C, VDD=1.32V, Process=FNFP). The cell, which has approximately 60% and 33% more area than its 5T and 6T counterparts and about 11% less area than its 9T counterpart, provides 3900.5x and 1.3x improvement in read SPNM over its 5T and 6T counterparts in the worst case (T=125°C, VDD=1.08V, Process=FNSP).
The proposed cell demonstrates superior performance in terms of leakage current levels but it is inappropriate for use in the sub-threshold operating region due to stability concerns. This is the most significant difference between the proposed cell and the 8T and 9T cells, which support full sub-threshold operation. Dual-port operation of the proposed topology is also possible if the bitlines associated with each column are split into separate read and write bitlines. Such an arrangement would also provide faster memory access due to reduced bitline capacitance per memory cell, possibly at the cost of increased cell area.
REFERENCES
[1] A. Pavlov, M. Sachdev, “CMOS SRAM Circuit Design and Parametric Test in Nano-Scaled Technologies,” Springer, 2008
[2] Predictive Technology Model (PTM). Available: http://ptm.asu.edu/
[3] M. Wieckowski, M. Margala, “A Novel Five-Transistor (5T) SRAM Cell for High Performance Cache,” International SoC Conference 2005, Herndon, VA, USA, pp. 101-102
[4] L. Chang, D.M. Fried, J. Hergenrother, J.W. Sleight, R.H. Dennard, R.K. Montoye, L. Sekaric, S.J. McNab, et al, “Stable SRAM Design for the 32nm Node and Beyond,” Symposium on VLSI Technology 2005, pp. 128-129
[5] Z. Liu, V. Kursun, “Characterization of a Novel Nine-Transistor SRAM Cell,” IEEE Transactions on VLSI Systems 2008, Vol. 16, Issue 4, pp. 488-492
[6] K. Sarfraz, “Comparison of two SRAM Matrix Leakage Reduction Techniques in 45nm Technology,” 22nd International Conference on Microelectronics (ICM) 2010, Cairo, Egypt, pp. 367-370
[7] C.H. Kim, J. Kim, S. Mukhopadhyay, K. Roy, “A Forward Body-Biased Low-Leakage SRAM Cache: Device, Circuit and Architecture Considerations,” IEEE Transactions on VLSI Systems 2005, Vol. 13, Issue 3, pp. 349-357
[8] J. Rabaey, “Low Power Design Essentials,” Springer, 2009
[9] C. Wann, R. Wong, D.J. Frank, R. Mann, Shang-Bin Ko, et al, “SRAM Cell Design for Stability Methodology,” International Symposium on VLSI Technology 2005, pp. 21-22
[10] E. Grossar, M. Stucchi, K. Maex, W. Dehaene, “Read Stability and Write-Ability Analysis of SRAM Cells for Nanometer Technologies,” IEEE Journal of Solid State Circuits 2006, Vol. 41, Issue 11, pp. 2577-2588
[11] F. J. List, “The Static Noise Margin of SRAM Cells,” 12th IEEE European Solid-State Circuits Conference, ESSCIRC 1986, pp. 16-18
[12] E. Seevinck, F. J. List, J. Lohstroh, “Static-Noise Margin Analysis of MOS SRAM Cells,” IEEE Journal of Solid State Circuits 1987, Vol. 22, Issue 5, pp. 748-754