

Design of Ternary Content Addressable Memory (TCAM) with 180 nm

Sampath Kumar1, Arti Noor2, Brajesh Kumar Kaushik3 and Brijesh Kumar4

1J.S.S. Academy of Technical Education, Noida, INDIA
2Centre for Development of Advance Computing (CDAC), Noida, INDIA
3,4Department of Electronics and Computer Engineering, Indian Institute of Technology-Roorkee, Roorkee, INDIA


Abstract— This paper deals with the design and analysis of a Ternary Content Addressable Memory (TCAM) in 180 nm technology. The main aim of the TCAM is to perform the search operation using the match line (ML). TCAMs are hardware-based parallel lookup tables with bit-level masking capability. They are attractive for applications such as packet forwarding and classification in network routers. Despite these attractive features, high power consumption is one of the most critical challenges faced by TCAM designers. TCAMs are popular because they search on the basis of content, unlike a RAM cell, which is accessed on the basis of address. The main contribution of this work is the testing of the ML, which is needed to check the search condition in the TCAM cell. To accomplish this, new circuitry was added to the existing circuit in order to test the masking condition in a TCAM cell. The work was started from scratch. First, a RAM cell was designed with the goal of using it in the TCAM cell, and various parameters were calculated for the stability of the SRAM cell. The SRAM cell was then used in the design of a binary CAM. After the binary CAM was completed, the cell was modified into the TCAM cell. The additional circuitry was used to test the working of the match line during the search operation of the TCAM cell. Finally, the design and testing of a complete TCAM cell are presented.

Keywords- SRAM, Binary CAM (Content Addressable Memory), TCAM, Static Noise Margin (SNM).

I. INTRODUCTION

CAM is an outgrowth of random access memory (RAM). In addition to the conventional READ and WRITE operations, CAMs also support SEARCH operations. A CAM stores a number of data words and compares a search key with all the stored entries in parallel. If a match is found, the corresponding memory location is retrieved. In the presence of multiple matches, a priority encoder resolves the highest priority match. CAM-based table lookup is very fast due to the parallel nature of the SEARCH operation.

Phenomenal growth in the number of Internet users and the increasing popularity of bandwidth-hungry real-time applications have resulted in a demand for very high-speed networks.

The Internet is a mesh of routers and switches, which process data packets and forward them toward their destinations. Each router [1] maintains a routing table and forwards incoming packets based on the information stored in the routing table. Routers also communicate with one another to update their routing tables. The growing demand for high-speed networks [2] is pushing the existing solutions to their limits in order to meet the increasing packet processing rates. The current version of the Internet protocol (IP), commonly known as IPv4, supports only 32-bit IP addresses. Due to the rapid increase in the number of Internet users, there is a growing shortage of IPv4 addresses, which are needed by all new machines added to the Internet. Hence, a new version of IP (IPv6) has been introduced that supports 128-bit addresses. IPv6 is expected to gradually replace IPv4. The increasing number of network nodes supported by IPv6 significantly increases the capacity and word size of the routing table used for packet forwarding.

An efficient hardware solution for table lookup is the content addressable memory (CAM). A CAM can be used as a co-processor for the network processing unit to offload the table lookup tasks. Besides networking equipment, CAMs are also attractive for other key applications such as translation look-aside buffers (TLBs) [4] in virtual memory systems, tag directories in associative cache memories, database accelerators, data compression, and image processing. Some of the recent applications of CAMs include real-time pattern matching in virus/intrusion-detection systems and gene pattern searching in bioinformatics.
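The SEARCH operation described in this introduction (compare the key against every stored word, then let a priority encoder pick one match) can be sketched behaviorally. The following Python model is only an illustration: the function name `cam_search`, the example table, and the choice that the lowest address has the highest priority are assumptions, not details given in the paper.

```python
from typing import List, Optional

def cam_search(entries: List[int], key: int) -> Optional[int]:
    """Return the address of the highest-priority matching entry, or None."""
    # In hardware every entry is compared at the same time; this list plays
    # the role of the per-entry match lines.
    match_lines = [stored == key for stored in entries]
    # Priority encoder (assumption: the lowest address wins).
    for address, matched in enumerate(match_lines):
        if matched:
            return address
    return None

# Hypothetical table contents, for illustration only.
table = [0x0A, 0x3C, 0x0A, 0x7F]
print(cam_search(table, 0x0A))
print(cam_search(table, 0x55))
```

A software loop visits the entries one after another, whereas the CAM evaluates all match lines in a single search cycle; the model only reproduces the result, not the timing.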

II. CIRCUIT DESIGN

A. SRAM

The memory cell used is the 6T SRAM cell [3] shown in Fig. 1. The circuit consists of two cross-coupled inverters and uses two pass transistors instead of one. The six-transistor cell is simple and reliable, consumes less standby power, and has better noise immunity. 6T SRAM designs [7] that use substrate-biased transistors draw significant current only when they are switching and are easier to scale when migrating to next-generation devices. The bit value stored in the cell is preserved as long as the cell is connected to a supply voltage greater than the Data Retention Voltage (DRV). This feature, which is due to the cross-coupled inverters inside the 6T SRAM cell, holds independent of the amount of leakage current.

978-1-4244-9190-2/11/$26.00 ©2011 IEEE


Figure 1. Schematic of SRAM cell

Ideally, the cell transistors are kept at a low W/L ratio, but careful sizing is necessary to avoid accidentally writing a 1 into the cell. When the word line is enabled for reading, the series combination of two NMOS transistors pulls the BL’ line down to ground. For a small cell, the transistors should be sized as small as possible. Analytically, the cell ratio should be greater than 1.2.

SRAM memory arrays are arranged in rows and columns of memory cells, called word lines and bit lines respectively. The intersection of a row and a column forms a unique location, or address, and each of these locations is linked to a data input/output pin. The number of arrays on a memory chip is determined by the total memory size, the number of data input/output pins, the required operating speed, and the power budget.

B. BINARY CAM CELL

A binary CAM is a memory in which data is searched with respect to the content stored in the memory. A binary CAM cell [9] is shown in Fig. 2.

Figure 2. Binary CAM cell Structure

The CAM cell shown is a 10T CAM cell. It contains a 6T SRAM cell, in which the data is stored. In addition to the SRAM cell, it has matching circuitry that distinguishes it from a conventional SRAM cell. The READ and WRITE operations of the binary CAM are similar to those of an SRAM cell, but the search operation is implemented using XNOR logic. Transistors N1-N4 implement the XNOR logic, as shown in Fig. 2. The data to be searched is placed on the search line (SL) and compared with the entry stored in the SRAM cell.

TABLE 1: SYMBOLS USED

Symbol    Meaning
BL1       Bit Line
BL1c      Bit Line-Complement
WL        Word Line
ML        Match Line
SL        Search Line
SLc       Search Line-Complement

C. TCAM CELL

A typical 16T static TCAM cell [10] is shown in Fig. 3. It is similar to the binary CAM cell except that it has two SRAM cells to store ternary data. The READ, WRITE and SEARCH operations in this cell are performed in the same way as described for binary CAM cell. For the given circuit style, masking can be achieved by turning off both ML-to-GND pull-down paths. For example, global masking is performed by SL1 = SL2 = ‘0’, and local masking is achieved by Vx = Vy = ‘0’.
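The masking behavior described above can be checked with a small truth-table model. This is a sketch under stated assumptions: the pairing of SL1 with Vy and SL2 with Vx in the two pull-down paths is inferred from the simulation case reported later in the paper (‘0’ stored as Vx = 0, Vy = 1, with SL1 = ‘1’ giving a mismatch), and the function name `tcam_cell_match` is hypothetical.

```python
def tcam_cell_match(vx: int, vy: int, sl1: int, sl2: int) -> bool:
    """Return True if the match line stays high for one TCAM cell.

    Each ML-to-GND path is two series NMOS transistors, so a path conducts
    only when both of its gate signals are high.
    """
    path_a = sl1 and vy  # assumed pairing: SL1 in series with Vy
    path_b = sl2 and vx  # assumed pairing: SL2 in series with Vx
    return not (path_a or path_b)

# Stored '0' is (Vx, Vy) = (0, 1) as in the paper; stored '1' is assumed to
# be the complement (1, 0); local masking (don't care) is Vx = Vy = 0.
for name, (vx, vy) in {"0": (0, 1), "1": (1, 0), "X": (0, 0)}.items():
    for key, (sl1, sl2) in {"0": (0, 1), "1": (1, 0), "masked": (0, 0)}.items():
        result = "MATCH" if tcam_cell_match(vx, vy, sl1, sl2) else "MISMATCH"
        print(f"stored {name}, search {key}: {result}")
```

With Vx = Vy = 0 or SL1 = SL2 = 0, neither series path can conduct, which is why both forms of masking always report a match.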

Figure 3. TCAM cell

Additional circuitry was added to the TCAM cell above to test the ML operation: a transmission gate (T.G.) and a capacitor. The modified circuit, which is used to test the search operation, is shown in Fig. 4.


Figure 4. Modified TCAM cell

In Fig. 4, an input pulse is applied at the input terminal. When the input pulse is high, the ML and the capacitor are pre-charged to VDD through the transmission gate (T.G.). When the input pulse goes low, the capacitor holds its charge, keeping the ML voltage at VDD. The search operation is performed in this phase, while the input pulse is low. When no match is found, the ML is discharged to ground. To perform the search operation again, a high-to-low pulse must therefore be applied again to pre-charge the capacitor. In this way the search operation of a TCAM cell can be tested.
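The pre-charge and evaluate sequence of the test circuit can be modeled as a small state machine. The sketch below is purely behavioral and rests on assumptions: the class name `MatchLineTestBench`, the 1.8 V supply value, and the ideal (lossless, instantaneous) capacitor are illustrative choices, not values taken from the paper.

```python
VDD = 1.8  # assumed supply voltage in volts

class MatchLineTestBench:
    """Behavioral model of the ML with a transmission gate and hold capacitor."""

    def __init__(self) -> None:
        self.ml_voltage = 0.0     # voltage on the ML and the capacitor
        self.pulse_high = False   # state of the input pulse

    def set_pulse(self, high: bool) -> None:
        self.pulse_high = high
        if high:
            # The transmission gate conducts and charges ML and capacitor.
            self.ml_voltage = VDD

    def search(self, mismatch: bool) -> bool:
        """Evaluate one search; return True if the ML still reads as a match."""
        if self.pulse_high:
            raise RuntimeError("search is performed only while the pulse is low")
        if mismatch:
            self.ml_voltage = 0.0  # a pull-down path discharges the ML
        return self.ml_voltage == VDD

bench = MatchLineTestBench()
bench.set_pulse(True)    # pre-charge phase
bench.set_pulse(False)   # capacitor holds VDD
print(bench.search(mismatch=True))   # ML is discharged
print(bench.search(mismatch=False))  # stale: ML was never re-charged
```

The second search in the usage lines reports no match even though no pull-down path conducts, which illustrates why a new pulse is needed before every search.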

III. RESULTS AND DISCUSSIONS

A. Core cell SNM Simulation

The Static Noise Margin (SNM) [5] serves as a figure of merit in stability evaluation of SRAM cells. Noise margin can be defined using the input to output voltage transfer characteristic (VTC). In general, Noise Margin (NM) [11] is the maximum spurious signal that can be accepted by the device when used in a system while still maintaining the correct operation [6].

Figure 5. Static noise margin

Fig. 5 shows the simulated SNM of the designed SRAM cell. Figs. 6 and 7 show the read and write margin simulation results, respectively. After the schematic and layout designs were completed, they were verified with the DRC and LVS procedures.

The figure plots the voltage transfer characteristic (VTC) of Inverter 2 from Fig. 3 and the inverse VTC from Inverter 1. The resulting two-lobed curve is called a “butterfly curve” and is used to determine the SNM. The SNM is defined as the length of the side of the largest square that can be embedded inside the lobes of the butterfly curve. The internal node of the bit cell that represents a zero gets pulled upward through the access transistor due to the voltage dividing effect across the access transistor and drive transistor. This increase in voltage severely degrades the SNM during the read operation (read SNM).
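The largest-square definition of the SNM can be evaluated numerically once the two VTCs are known. The following sketch is an illustration only: it uses a hypothetical tanh-shaped inverter VTC with an assumed 1.8 V supply instead of the simulated curves of the paper, and it assumes both VTCs are monotonically decreasing and that the cell is bistable.

```python
import math

VDD = 1.8  # assumed supply voltage in volts

def inverter_vtc(v_in: float, gain: float = 10.0, v_m: float = VDD / 2) -> float:
    """Hypothetical smooth inverter VTC that switches around v_m."""
    return 0.5 * VDD * (1.0 - math.tanh(gain * (v_in - v_m) / VDD))

def largest_square_in_lobe(f1, f2, steps: int = 400) -> float:
    """Side of the largest square in the lobe y <= f1(x), x >= f2(y).

    Because both curves decrease, a square fits if its bottom-left corner
    (x0, y0) satisfies x0 >= f2(y0) and its top-right corner satisfies
    y0 + s <= f1(x0 + s).
    """
    best = 0.0
    for i in range(steps + 1):
        y0 = VDD * i / steps
        x0 = f2(y0)              # push the square against the mirrored VTC
        if y0 > f1(x0):
            continue             # this corner lies outside the lobe
        lo, hi = 0.0, VDD        # bisection on the side length s
        for _ in range(50):
            mid = 0.5 * (lo + hi)
            if y0 + mid <= f1(x0 + mid):
                lo = mid
            else:
                hi = mid
        best = max(best, lo)
    return best

def static_noise_margin(f1, f2) -> float:
    """SNM is the smaller of the largest squares of the two lobes."""
    upper_left = largest_square_in_lobe(f1, f2)
    lower_right = largest_square_in_lobe(f2, f1)  # same lobe with axes swapped
    return min(upper_left, lower_right)

snm = static_noise_margin(inverter_vtc, inverter_vtc)
print(f"SNM of the model cell: {snm:.3f} V")
```

In this model a steeper VTC (larger `gain`) widens the lobes and therefore increases the computed SNM; applying the same routine to simulated VTC data would require interpolating the sampled curves.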

Figure 6. Read margin

Figure 7. Write margin


TABLE 2: CR VS. SNM

Technology (nm)    CR     SNM (mV)
180                0.8    380
180                1.0    440
180                1.2    480
180                1.4    540
180                1.6    580

The SRAM cell ratio (CR), i.e. the ratio of the driver transistor’s W/L to the access transistor’s W/L, was introduced to simplify consideration of SNM optimization. Table 2 shows the variation of SNM with CR.

Fig 8: CR vs. SNM (180 nm) [plot of SNM (mV) against CR]

Fig. 8 plots the static noise margin against the cell ratio; the SNM increases as the cell ratio increases. As the cell ratio is increased, the average value of the SNM increases because the driver transistor now has higher drive strength and is less susceptible to noise. At the same time, the variation in SNM reduces with increasing cell ratio. This is expected because a wider driver transistor contains a higher number of dopants, so a small variation in the number or location of these dopants has a smaller effect on the overall device characteristics.
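The trend in the CR vs. SNM data can be summarized with a straight-line fit. The sketch below uses the five points of Table 2; the least-squares fit itself is an added illustration and is not an analysis reported in the paper.

```python
# CR and SNM (mV) values from Table 2 (180 nm).
cell_ratio = [0.8, 1.0, 1.2, 1.4, 1.6]
snm_mv = [380, 440, 480, 540, 580]

def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x

slope, intercept = linear_fit(cell_ratio, snm_mv)
print(f"SNM = {slope:.1f} mV per unit CR * CR + {intercept:.1f} mV")
```

For these five points the fitted slope is about 250 mV per unit of cell ratio, which quantifies the increase visible in the table. The fit should not be extrapolated outside the simulated CR range.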

B. BINARY CAM Simulation

The output waveforms of the binary CAM cell [13] are shown in Fig. 9. Three operations can be performed on this cell: Read, Write and Search. The Read and Write operations are very similar to those of the SRAM cell, but the Search operation is done using the content stored in the cell.

Fig 9: Wave form output of Binary CAM cell

The Search operation is performed in three steps. In the first step, SL1 and SL1C are set to ground. In the second step, the ML is pre-charged [14] to VDD. In the third step, the search key bit and its complement are placed on SL1 and SL1C, respectively. If the search key bit and the stored value are identical, both ML-to-GND pull-down paths remain OFF, and the ML stays at VDD, indicating a “MATCH”. Otherwise, if the search key bit differs from the stored value, one of the pull-down paths conducts and discharges the ML to GND, indicating a “MISMATCH”.
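The three search steps can be traced in a few lines of code. This is a minimal logical sketch: the function name `binary_cam_search` is hypothetical, and the pairing of each search line with the opposite storage node in the pull-down paths is an assumption chosen so that the cell behaves as the XNOR comparison described in the paper.

```python
def binary_cam_search(stored_bit: int, key_bit: int) -> str:
    """Trace the three-step search of one binary CAM cell."""
    d, d_c = stored_bit, 1 - stored_bit  # stored value and its complement

    # Step 1: both search lines are grounded so no pull-down path conducts.
    sl, sl_c = 0, 0

    # Step 2: the match line is pre-charged to VDD (logic 1).
    ml = 1

    # Step 3: the key bit and its complement are driven onto the search lines.
    sl, sl_c = key_bit, 1 - key_bit

    # Assumed pairing: SL with the complement node, SLc with the true node.
    if (sl and d_c) or (sl_c and d):
        ml = 0  # a conducting path discharges the ML to GND

    return "MATCH" if ml == 1 else "MISMATCH"

for stored in (0, 1):
    for key in (0, 1):
        print(f"stored {stored}, key {key}: {binary_cam_search(stored, key)}")
```

Grounding the search lines before the pre-charge keeps both pull-down paths off, so the ML is not fighting a conducting path while it is being charged.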

C. TCAM Simulation

The Read, Write and Search operations are very similar to those of the binary CAM cell. Fig. 10 shows the output waveform for the search operation, and Fig. 11 shows the timing analysis, i.e., the time required to discharge the ML to ground, of the TCAM cell. When ‘0’ is stored in the cell (Vx = 0 and Vy = 1), SL1 = ‘1’ discharges the ML to ‘0’, indicating a “MISMATCH”. Similarly, for SL1 = ‘0’, the ML remains at ‘1’, indicating a “MATCH”.


Fig 10: Output wave of TCAM cell

Fig 11: Timing Analysis

TABLE 3: RESULT COMPARISON OF CAM AND TCAM

Parameter                  CAM Results    TCAM Results
Total Power Dissipation    714.1031 µW    686.407 µW
Search Time                1.921 ns       1.83 ns
Temperature                27 °C          27 °C

D. CONCLUSION

This paper presents a 6T-based SRAM cell, which addresses the critical issues in designing a low-power static RAM in deep sub-micron (DSM) technologies. Techniques to optimize delay and power in the critical path were then investigated and implemented. The SNM of the SRAM structure is determined using the butterfly curve of the SRAM bitcell. The bitcell operates properly with a static noise margin of 0.466 V, a read margin of 0.3985 V and a write margin of 0.5028 V. TCAMs are gaining importance in high-speed lookup-intensive applications. However, the high power consumption of TCAMs is limiting their popularity and versatility. This work proposed several circuit techniques to test the ML operation. The search operation was successfully performed using the additional circuitry that was added to the TCAM cell. The output obtained was satisfactory and in accordance with the expected results.

REFERENCES

[1] P. Gupta, “Algorithms for routing lookups and packet classification,” Ph.D. Thesis, Department of Computer Science, Stanford University, CA, 2000.

[2] R. Sangireddy, and A. K. Somani, “High-speed IP routing with binary decision diagrams based hardware address lookup engine,” IEEE Journal on Selected Areas in Communications, vol. 21, no. 4, pp. 513-521, May 2003.

[3] J. M. Rabaey, Digital Integrated Circuits: A Design Perspective, First Edition, Prentice-Hall Inc., Upper Saddle River, New Jersey, 1996.

[4] M. Sumita, “A 800 MHz single cycle access 32 entry fully associative TLB with a 240ps access match circuit,” Digest of Technical Papers of the Symposium on VLSI Circuits, pp. 231-232, Jun. 2001.

[5] Benton H. Calhoun and Anantha Chandrakasan, “Analyzing Static Noise Margin for Sub-threshold SRAM in 90nm CMOS,” IEEE International Solid-State Circuits Conference, No. 7, pp. 234-255, 2006.

[6] Hill, “Definitions of noise margin in logic systems,” Mullard Tech. Commun., 89, 239–245, September 1967.

[7] Y. Nakagome, “Design of low power 6T SRAM with reduced leakage currents,” IBM J. Res. & Dev., vol. 47, no. 5/6, pp. 82, 2003.

[8] A. Kumar, H. Qin, P. Ishwar, J. Rabaey, K. Ramchandran, “Fundamental Bounds on Power Reduction during Data-Retention in Standby SRAM,” IEEE International Symposium, May 27-30, 2007, pp. 1867-1870.

[9] Nitin Mohan, Wilson Fung, Derek Wright and Manoj Sachdev, “Design Techniques and Test Methodology for Low-Power TCAMs,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 6, June 2006.

[10] Nitin Mohan and Manoj Sachdev, “Novel Ternary Storage Cells And Techniques For Leakage Reduction In Ternary Cam,” IEEE SOC Conference, Sept. 2006.

[11] J. Hauser, “ Noise margin criteria for digital logic circuits,” IEEE Transactions on Education, vol. 36 pp. 363–368, November 1993.

[12] I. Arsovski, T. Chandler, and A. Sheikholeslami, “A ternary content-addressable memory (TCAM) based on 4T static storage and including a current-race sensing scheme,” IEEE Journal of Solid-state Circuits, vol. 38, no. 1, pp. 155-158, Jan. 2003.

[13] A. Roth, D. Foss, R. McKenzie, and D. Perry, “Advanced ternary CAM circuits on 0.13μm logic process technology,” Proceedings of the IEEE Custom Integrated Circuits Conference (CICC), pp. 465-468, Oct. 2004.

[14] C. A. Zukowski, and S.-Y. Wang, “Use of selective precharge for low-power content-addressable memories,” Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1788-1791, Jun. 1997.