
[IEEE Comput. Soc Proceedings. 1999 IEEE International Conference on Computer Design: VLSI in Computers and Processors. ICCD'99 - Austin, TX, USA (10-13 Oct. 1999)] Proceedings 1999


An Investigation of Power Delay Trade-offs for Dual Vt CMOS Circuits

Qi Wang
Cadence Design Systems, Inc.
2500 Augustine Dr., Suite 200
Santa Clara, CA 95054
email: [email protected]

Sarma B.K. Vrudhula
Center for Low Power Electronics
Department of Electrical and Computer Engineering
The University of Arizona, Tucson, AZ 85721
email: [email protected]

Abstract

The availability of a dual Vt CMOS process provides a practical way to achieve high performance and low leakage power dissipation in current deep submicron technology. Early work on leakage power optimization of digital circuits using dual Vt devices shows some promising results [4, 6, 5]. However, due to the lack of real dual Vt process models and parameters, these works are based on simple power and delay analysis of dual Vt devices. For example, the impact of dual Vt on the short circuit power dissipation is ignored in all these works. In this paper we provide extensive HSPICE simulation results on CMOS gates and circuits from a commercial dual Vt CMOS process. The experimental results show that optimization of dual Vt circuits involves complex trade-offs between leakage power, short circuit power and performance. For example, it is observed that using lower Vt devices does not always result in a faster circuit. One of the main contributions of this paper is that it reveals some new challenges and opportunities offered by the dual Vt technology to both circuit designers and CAD software developers for circuit optimization.

1 Introduction

The emergence of deep submicron devices has given rise to new challenges in the optimization of CMOS circuits. The increasing integration of complex functions onto a single chip has fueled an explosive growth in the market for portable computing and communication devices. One of the difficult problems faced by circuit and system designers is to reduce power dissipation while maintaining high performance.

Power dissipation in CMOS circuits can be expressed as the sum of the (average) switching power (Psw), the short-circuit power (Psc) and the leakage power (Pleak). Psw is due to the charging and discharging of load capacitances as logic gates transition between 0 and 1. It is typically expressed as $C_L V_{dd}^2 E(t)$, where CL is the load capacitance, Vdd is the supply voltage and E(t) is the expected number of times that the gate switches. Psc is due to the existence of a conducting path between Vdd and ground during the brief period when a gate switches, and Pleak is due to the leakage current caused by the stored charge in the drain junctions leaking away and due to sub-threshold conduction.
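As a rough illustration of the switching power expression, the sketch below evaluates it for one operating point. The values (Vdd = 2.5 V, a 100 MHz clock, a 1000 fF load) are taken from the inverter experiments reported later in the paper. Treating E(t) as one charge and discharge cycle per clock period is an assumption made here, and the function name is hypothetical.

```python
def switching_power(c_load_farads, vdd_volts, switch_rate_hz):
    """Switching power C_L * Vdd^2 * E(t), with E(t) given as a rate in Hz.

    Assumption: E(t) counts one full charge/discharge cycle of the load
    per clock period, so it equals the clock frequency.
    """
    return c_load_farads * vdd_volts ** 2 * switch_rate_hz


# 1000 fF load, 2.5 V supply, 100 MHz clock (values quoted in the paper)
p_sw = switching_power(1000e-15, 2.5, 100e6)
print(f"Psw = {p_sw * 1e6:.1f} uW")
```

Under these assumptions the estimate is 625 µW for the external load alone, which does not include any internal capacitance of the gate.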

Small devices must be operated at low voltages. The reduction of Vdd combined with the reduced capacitances of optimized circuits results in a significant reduction in the switching power (Psw). Reducing Vdd also helps avoid velocity saturation and hot carrier effects in short channel devices [1]. As Vdd is reduced, Vt must also be reduced so as not to adversely impact delay [2]. However, reducing Vt in short channel devices results in an exponential increase in the leakage power (Pleak). At values of Vt ≤ 0.2 V, the leakage power can dominate the switching power [3, 5].

To achieve high performance and low leakage power dissipation, one relatively recent development is the use of dual or multiple threshold voltage CMOS (DTCMOS/MTCMOS) [4]. Most of the published work on circuit optimization for dual Vt circuits has focused on the trade-offs between leakage power dissipation and performance [4, 5, 6]. Also, due to the lack of real dual Vt process models and parameters, results reported in these works were obtained using power and delay models that may be oversimplified for a real dual Vt CMOS process. In this paper, extensive HSPICE simulations were conducted to investigate the impact of dual Vt MOSFETs on the power and delay of CMOS gates and circuits. The device models and process parameters were obtained from a commercial 0.25 µm dual Vt CMOS process. The objective of this paper is to highlight some important issues which circuit designers and CAD software developers should take into consideration to utilize this new technology in circuit optimization.

The rest of the paper is organized as follows. Section 2 shows the experimental results on leakage power dissipation of single and series connected MOSFETs. The impact of the different Vt devices on the power dissipation of CMOS gates is discussed in Section 3. Gate delays of low Vt, high Vt and mixed Vt inverters are examined in Section 4. The material presented in Sections 3 and 4 is brought together through a detailed analysis of a CMOS D flip-flop. This is presented in Section 5. Finally, conclusions are presented in Section 6.

As stated earlier, data from a newly developed commercial 0.25 µm dual Vt CMOS process is used throughout this investigation. The nominal high Vt (Vt,high) for the p and n devices is ±0.6 V and the low Vt (Vt,low) for the p and n devices is ±0.4 V. The supply voltage (Vdd) for the process is 2.5 V.

2 Leakage Current in MOSFETs With Different Vt

In this section we examine the leakage currents in series connected (1, 2 and 3 transistors) p and n transistors of different widths at two different threshold voltages, Vt,low and Vt,high. The widths vary from 0.5 µm to 20 µm in the HSPICE simulation and the results are shown in Table 1.

Average Ileak (pA)
                  1          2          3
p Vt,high         6.1        4.5        4.3
p Vt,low        325.2       25.7       14.5
n Vt,high        10.2        4.6        4.4
n Vt,low       1979.8      204.6      110.3

Table 1. Average leakage current of different numbers of series connected p/n transistors with Vt,high/Vt,low.

In Table 1, columns 2 to 4 correspond to the number of transistors connected in series and rows 3 to 6 correspond to different combinations of p/n transistors and threshold voltages. For example, the number at the intersection of the third column and the fourth row is the average leakage current for a circuit with 2 Vt,low p transistors connected in series, where both transistors have the same width, which varies from 0.5 µm to 20 µm. Several important conclusions can be drawn from the results.

1. The leakage currents of the p and n devices with Vt,low are significantly larger than those with Vt,high. The difference is about 50X and 200X for the p and n devices, respectively.

2. For the same channel width, the leakage current of an n Vt,low device is significantly larger than that of a p Vt,low device, by a factor of 6.

3. The leakage current decreases significantly as the number of MOSFETs in series increases. For example, for the p and n devices at Vt,low, the leakage current of a single MOSFET is roughly 10 times the leakage current of two MOSFETs in series, which in turn is about twice the leakage current of three MOSFETs in series, with all transistors having the same channel width. This result is consistent with the theoretical results presented in [3].
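The ratios quoted in the three conclusions can be recomputed directly from Table 1. The following is a minimal sketch; the dictionary layout and variable names are chosen here for illustration and are not part of the original study.

```python
# Average leakage current in pA from Table 1, indexed by (device, Vt);
# each tuple holds the values for 1, 2 and 3 transistors in series.
ileak_pa = {
    ("p", "high"): (6.1, 4.5, 4.3),
    ("p", "low"): (325.2, 25.7, 14.5),
    ("n", "high"): (10.2, 4.6, 4.4),
    ("n", "low"): (1979.8, 204.6, 110.3),
}

# Conclusion 1: low Vt versus high Vt for a single device
low_vs_high_p = ileak_pa[("p", "low")][0] / ileak_pa[("p", "high")][0]
low_vs_high_n = ileak_pa[("n", "low")][0] / ileak_pa[("n", "high")][0]

# Conclusion 2: n versus p at low Vt for a single device
n_vs_p_low = ileak_pa[("n", "low")][0] / ileak_pa[("p", "low")][0]

# Conclusion 3: stacking effect at low Vt (1 vs 2 and 2 vs 3 in series)
stack_1_vs_2 = {d: ileak_pa[(d, "low")][0] / ileak_pa[(d, "low")][1] for d in "pn"}
stack_2_vs_3 = {d: ileak_pa[(d, "low")][1] / ileak_pa[(d, "low")][2] for d in "pn"}

print(f"low/high: p {low_vs_high_p:.1f}X, n {low_vs_high_n:.1f}X")
print(f"n/p at low Vt: {n_vs_p_low:.2f}X")
print(f"1 vs 2 in series: {stack_1_vs_2}")
print(f"2 vs 3 in series: {stack_2_vs_3}")
```

With the table values, the single-to-two-in-series ratio comes out near 12.7 for the p devices and near 9.7 for the n devices.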

3 Impact of Low Vt MOSFETs on Power Dissipation of CMOS Gates

In this section, we examine the power dissipation of CMOS logic gates that are implemented using various combinations of Vt,low and Vt,high devices. A logic gate in which all devices have threshold voltages equal to Vt,low (Vt,high) is referred to as a low (high) Vt gate. A mixed Vt gate is one where the threshold voltages of some of the devices are equal to Vt,low while those of the others are equal to Vt,high. We first examine the leakage power dissipation Pleak, i.e., the power dissipated due to the transistors conducting in the off state.

3.1 Leakage Power Dissipation

Tables 2 and 3 show the leakage power dissipation (Pleak) of a CMOS inverter, a 2-input NAND gate and a 2-input NOR gate. As expected, Pleak depends strongly on the gate function, the input state and the threshold voltage. There are, however, several important observations that are not so obvious.

First consider the inverter. The low Vt inverter exhibits significantly more Pleak (47.8X) than the high Vt inverter. This is to be expected since the off current increases exponentially as Vt is reduced. Moreover, the difference in Pleak between the input states is also much greater (10X) for the low Vt inverter. On the other hand, the mixed Vt inverter with a low Vt p-device and a high Vt n-device exhibits a much smaller increase (7X) in Pleak when compared to the high Vt inverter. As we shall see in Section 4, the delay characteristics of the mixed Vt inverter are closer to those of the low Vt inverter, which are significantly better than those of the high Vt inverter.

Inverter (nW)                      in=(0)      in=(1)      Relative Value
High Vt                            0.048       0.028       1
Low Vt                             3.149       0.487       47.8X
Mixed Vt (p Vt,low, n Vt,high)     0.048       0.487       7X
Mixed Vt (p Vt,high, n Vt,low)     3.149       0.028       41.8X
Leakage Path                       n device    p device

Table 2. Leakage power dissipation of low, high and mixed Vt CMOS inverters. Channel length L = 0.24 µm, Wp = 7.2 µm and Wn = 5.6 µm.
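The "Relative Value" column of Table 2 can be reproduced by summing Pleak over the two input states and normalizing to the high Vt inverter. That reading is an inference from the numbers rather than a definition given in the text, and the names below are illustrative.

```python
# Pleak in nW for in=(0) and in=(1), from Table 2
pleak_nw = {
    "high": (0.048, 0.028),
    "low": (3.149, 0.487),
    "mixed_p_low_n_high": (0.048, 0.487),
    "mixed_p_high_n_low": (3.149, 0.028),
}

# Assumption: the relative value compares the sum (equivalently the
# average over equally likely inputs) against the high Vt inverter.
baseline = sum(pleak_nw["high"])
relative = {name: sum(values) / baseline for name, values in pleak_nw.items()}

for name, value in relative.items():
    print(f"{name}: {value:.1f}X")
```

Under this assumption the computed ratios round to the 47.8X, 7X and 41.8X entries of the table.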

Now consider Table 3, which shows Pleak for a 2-input NOR and a 2-input NAND. The low Vt NOR has a maximum value of Pleak when both inputs are low. This is due to the fact that the leakage path consists of two n-devices in parallel. For the other input states, the leakage path consists of a single n-device or two p-devices in series. Note that the difference in Pleak among the different input states can be as much as 100X for the NOR gate. Using the same argument, one would expect that the maximum value of Pleak for a 2-input NAND gate would occur when both inputs are high, since the leakage path in this case consists of two p-devices in parallel. However, Table 3 shows that this is not the case and the maximum value of Pleak occurs when the input closer to the output is low and the other is high. This is because the Pleak of a single n-device exceeds the Pleak of two p-devices (see Section 2). While the maximum value of Pleak of the NAND (over all inputs) is about 33% less than that of the NOR gate, the average value of Pleak (assuming equally likely inputs) of the NAND is 17% greater than that of the NOR.

2-input NAND (nW)    in=(0,0)    in=(0,1)    in=(1,0)    in=(1,1)
High Vt              0.011       0.024       0.029       0.028
Low Vt               0.157       1.551       1.215       0.346
Leakage Path         2 ser n     top n       bot n       2 par p

2-input NOR (nW)     in=(0,0)    in=(0,1)    in=(1,0)    in=(1,1)
High Vt              0.049       0.023       0.014       0.011
Low Vt               2.315       0.201       0.244       0.025
Leakage Path         2 ser n     bot p       top p       2 par p

Table 3. Leakage power dissipation of low and high Vt CMOS NAND and NOR gates. The channel length L = 0.24 µm. For the 2-input NAND gate, Wp = 2.8 µm and Wn = 2.8 µm; for the 2-input NOR gate, Wp = 3.6 µm and Wn = 2 µm.
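The NAND versus NOR comparison in the text can be checked against the low Vt rows of Table 3. This is a small sketch with illustrative variable names; the average assumes equally likely input states, as the text does.

```python
from statistics import mean

# Low Vt Pleak in nW for inputs (0,0), (0,1), (1,0), (1,1), from Table 3
nand_low_nw = (0.157, 1.551, 1.215, 0.346)
nor_low_nw = (2.315, 0.201, 0.244, 0.025)

# Worst-case comparison over all input states
max_ratio = max(nand_low_nw) / max(nor_low_nw)

# Average comparison, assuming the four input states are equally likely
avg_ratio = mean(nand_low_nw) / mean(nor_low_nw)

# Spread between the largest and smallest NOR leakage states
nor_spread = max(nor_low_nw) / min(nor_low_nw)

print(f"NAND max is {100 * (1 - max_ratio):.0f}% below the NOR max")
print(f"NAND average is {100 * (avg_ratio - 1):.0f}% above the NOR average")
print(f"NOR max/min over input states: {nor_spread:.0f}X")
```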

3.2 Short Circuit Power Dissipation

Traditional power optimization and estimation techniques for CMOS logic circuits have focused on the dynamic power dissipation caused by the charging and discharging of load capacitances. In deep submicron circuits operating at low voltages, the short circuit power, Psc, is no longer a negligible factor. In fact, when the ratio of the input transition time to the output transition time is large (slow input or small load capacitances), Psc can far exceed Psw. As we will see shortly, Psc can have a significant impact on the delay. This is particularly important for low Vt or dual Vt CMOS gates.

To see why Psc increases as Vt is lowered, consider an output falling transition of a low Vt CMOS inverter. The n-device is the driving device which removes the charges stored at the output load capacitor. Since the input slew is not zero in practice, the p-device will also be on during the period of the input and output transition. The p-device starts in the linear region and transitions into the saturation region until it is finally turned off. With low Vt gates, the period during which both devices are on is increased and this leads to greater total current flowing from Vdd to the ground. Additionally, first order drain current models predict higher linear and saturation region drain currents given the same Vdd and Vds for lower Vt. Both these effects will increase the average short circuit current of CMOS gates implemented with low Vt devices.

Input slew    Load     High Vt    High Vt    Low Vt     Low Vt
                       Ptotal     Pshort     Ptotal     Pshort
(ns)          (fF)     (µW)       (µW)       (µW)       (µW)
0.5           10       33.83      13.10      60.53      38.72
1             10       57.05      36.23      114.63     92.90
2             10       106.25     85.48      224.53     202.83
3             10       156.45     135.68     335.25     313.43
0.5           50       54.70      8.91       77.58      30.89
1             50       74.95      29.16      128.03     81.33
2             50       120.68     74.85      234.70     187.98
3             50       168.73     122.95     343.50     296.83
0.5           100      83.50      6.54       102.95     25.01
1             100      101.05     24.12      149.38     71.50
2             100      143.08     66.13      251.75     173.85
3             100      188.70     111.70     358.25     280.28
0.5           200      143.55     4.07       158.63     18.18
1             200      158.03     18.44      199.20     58.70
2             200      194.60     55.05      293.75     153.28
3             200      235.98     96.43      395.25     254.70
0.5           1000     639.75     0.73       645.00     4.38
1             1000     646.00     6.40       666.00     25.61
2             1000     666.00     26.53      727.00     86.55
3             1000     692.00     52.78      801.00     160.83

Table 4. Total and short circuit power of a CMOS inverter with low and high Vt. Lp = Ln = 0.24 µm, Wp = 7.2 µm and Wn = 5.6 µm. Vdd = 2.5 V, f = 100 MHz.

Table 4 shows HSPICE simulations for Psc of a high Vt and a low Vt CMOS inverter. Each row in Table 4 corresponds to a different combination of input signal transition time and load capacitance. The first column shows the rising and falling transition times at the input of the inverter. The second column is the external load capacitance at the gate output. The third and fourth columns are the average total power and average short circuit power for the high Vt inverter. The fifth and sixth columns are the average total power and average short circuit power for the low Vt inverter. One clock cycle with a period of 10 ns was simulated for each situation.

Figure 1(a) shows plots of the data in Table 4. For comparison purposes, Figure 1(b) shows the ratio of the short circuit power Psc to the total switching power Psw for each inverter. Each data point in the plots corresponds to a row in Table 4. Each group of points corresponds to a value of CL. Each point within a group corresponds to a given value of the input transition time. Several important observations can be made from these plots. Psc for a low Vt inverter far exceeds that of the high Vt inverter. For a given CL, the difference in Psc between the low and high Vt inverters increases as the input transition time increases. For a given input transition time, as CL increases, the difference in Psc decreases. This last point is shown clearly in Figure 1(b). These observations indicate that it is the ratio of the input transition time to the output transition time that has the greatest impact on Psc. Thus, Psc of the low Vt inverter with a slow input and small load capacitance becomes more significant when compared to Psw. This translates to a significant impact on the gate delay. This aspect will be explored further in the following section where gate delays are examined.
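The Psc/Psw ratio plotted in Figure 1(b) can be approximated from Table 4. The sketch below assumes that Psw is the total power minus the short circuit power, neglecting leakage (which is in the nW range). That decomposition is an assumption made here, not a formula stated in the paper, and only four corner rows of the table are used.

```python
# (input slew in ns, load in fF) -> (Ptotal_high, Psc_high, Ptotal_low, Psc_low)
# Values in uW, copied from four corner rows of Table 4.
table4_uw = {
    (0.5, 10): (33.83, 13.10, 60.53, 38.72),
    (3.0, 10): (156.45, 135.68, 335.25, 313.43),
    (0.5, 1000): (639.75, 0.73, 645.00, 4.38),
    (3.0, 1000): (692.00, 52.78, 801.00, 160.83),
}


def psc_over_psw(p_total, p_sc):
    """Ratio Psc/Psw, assuming Psw = Ptotal - Psc (leakage neglected)."""
    return p_sc / (p_total - p_sc)


# For each operating point: (ratio for high Vt, ratio for low Vt)
ratios = {
    point: (psc_over_psw(row[0], row[1]), psc_over_psw(row[2], row[3]))
    for point, row in table4_uw.items()
}

for (slew, load), (high, low) in ratios.items():
    print(f"slew {slew} ns, load {load} fF: high Vt {high:.3f}, low Vt {low:.3f}")
```

With these rows the ratio is largest for the slow input with the small load and drops well below 1 for the 1000 fF load, which matches the trend described in the text.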

[Figure 1 plots. Panel (a): "Psc of a CMOS inverter with different Vt"; x-axis: data points; y-axis: short circuit power (µW). Panel (b): "Psc/Psw of a CMOS inverter with different Vt"; x-axis: data points; y-axis: Psc/Psw. Markers: x low Vt, o high Vt.]

Figure 1. (a) Psc and (b) ratio of Psc/Psw for the low Vt and high Vt inverters.

4 Impact of Low Vt MOSFETs on Gate Delay

In this section we examine the impact of low Vt MOSFETs on the gate delays of CMOS inverters. The observations obtained from the experiments on the inverters can be applied to other more complex CMOS gates.

4.1 The Low Vt and High Vt CMOS Inverter

In general, it is assumed that a low Vt CMOS gate exhibits less delay than a high Vt gate. However, the experimental results presented here indicate that this is not always the case. The primary reason for this seems to be the short circuit current.

Figure 2 shows the propagation delay (footnote 1) for low and high Vt inverters. In this figure, each data point corresponds to the same combination of input transition times and load capacitances given in Table 4. From Figure 2, we see that for a falling transition on the output, the propagation delay of the low Vt inverter is always less than that of the high Vt inverter, but larger for a rising transition on the output. Along with the propagation delay, the output transition time was also measured but it is not shown here due to page limitations. Overall, there was much difference in the output transition time between the low Vt and high Vt inverters.

Footnote 1. Propagation delay is measured as the time between the midpoints of the input and output transitions.

[Figure 2 plots. First panel: "Propagation delay for output falling (s)"; second panel: "Propagation delay for output rising (s)"; x-axis: data points; y-axis: delay (seconds). Markers: o high Vt, x low Vt.]

Figure 2. Propagation delay of output falling and rising transitions for low and high Vt inverters.

To understand why the low Vt inverter may be slower than a high Vt inverter for a rising transition on the output, an HSPICE simulation was conducted for a particular input transition time and load capacitance. It was observed that during the early phase of the output rising transition, the charging current to the output capacitor of the high Vt inverter is larger than that of the low Vt inverter. The reason is that during the early period of the transition, the low Vt inverter has a higher short circuit current (flowing through the NMOS) than the high Vt inverter (see also Section 3), while the PMOS currents of the two inverters were comparable (footnote 2). Since the current available to charge the load capacitor is the PMOS current minus the NMOS current, we find a smaller charging current for the low Vt inverter during this period.

Footnote 2. Note that, during the early transition period, the NMOS transistor is in the strong linear region and the PMOS transistor is in the weak saturation region. Therefore a small change at the gate input will lead to a larger change of the NMOS drain current than that of the PMOS. The fact that the low Vt device has a larger drain current under the same gate bias makes the difference larger.

Figure 3 contains the plot that shows the correlation between the increase in Psc and propagation delay. The points connected by solid lines show the difference in the propagation delay between the low Vt and high Vt inverter as a percentage of the propagation delay of the high Vt inverter. The points connected by the broken line indicate the difference in the ratio of the short circuit power to the switching power, i.e., Psc/Psw, between the low Vt and high Vt inverter. These plots show a strong correlation between the increase in the short circuit current and the increase in the propagation delay for a rising transition on the output. These plots also show that the increase in the propagation delay for the rising transition on the output for the low Vt inverter over the high Vt inverter is reduced with increased load capacitances. The reason is that as the load capacitance is increased, the short circuit power as a percentage of switching power is reduced (see Figure 1), i.e., the short circuit current has less of an impact on the charging current of the load capacitor.

In contrast, the low Vt inverter is 41% faster than the high Vt inverter for the falling transition on the output. This is because the discharging current (through the NMOS) is much larger than the short circuit current (through the PMOS) as a result of the superior conductivity of the NMOS and the lower Vt. Consequently, the short circuit current has less of an impact when the load capacitor is discharging.

[Figure 3 plot. x-axis: data points; y-axis: percentage (%). Legend: o percentage of propagation delay increase; x Psc increase as percentage of Pd.]

Figure 3. Correlations between the percentage increase of the output rising propagation delay of the low Vt inverter and the percentage increase of the Psc as a percentage of the Pd.

4.2 Mixed Vt Inverter

Similar experiments were conducted to study the gate delays of mixed Vt inverters. Due to page limitations, detailed results are not shown here. Instead, a summary of the results is given in Table 5, which shows the (average) differences in propagation delay and leakage power for the different types of inverters. In terms of magnitudes, the mixed Vt inverter with a low Vt n-device and a high Vt p-device resulted in a 42% improvement in the propagation delay over the high Vt inverter for a falling transition on the output. This is to be compared with a 32% improvement in the propagation delay of the low Vt inverter over the high Vt inverter for the same transition. For the rising transition on the output, the mixed Vt inverter with a low Vt p-device and a high Vt n-device showed a 15% improvement in the propagation delay over the high Vt inverter. This is in contrast to the low Vt inverter, which showed a 1% increase in propagation delay over the high Vt inverter. The result is consistent with the observation in the previous section that the short circuit current has a significant impact on the delay when considering low Vt circuits (see also Figure 3). The last column shows the leakage power. It should be noted that the mixed Vt inverter with a low Vt p-device and a high Vt n-device provides the best compromise.

                                   Propagation delay               Leakage
                                   Output Falling   Output Rising  Power
High Vt                            1                1              1
Low Vt                             32% less         1% more        47X more
Mixed Vt (n Vt,low, p Vt,high)     42% less         12% more       42X more
Mixed Vt (p Vt,low, n Vt,high)     10% more         15% less       7X more

Table 5. Summary of propagation delay and leakage power for inverters with different Vt.

5 Power-Delay Tradeoffs in Dual Vt CMOS D Flip-Flop

In this section we investigate the impact of dual Vt on the power and delay of a CMOS D flip-flop (DFF). The DFF was selected from a commercial library and its schematic is shown in Figure 4. The channel lengths of the devices are 0.24 µm and the channel widths are obtained by multiplying the size ratio by 0.4 µm. For example, for the inverter with size ratios of 3/1:4/1, the channel widths of the PMOS and NMOS are 1.2 µm and 1.6 µm respectively.

[Figure 4 schematic: DFF with inputs D and CLK and outputs Q and QN. Size ratios annotated on the gates: 4/1:5/1, 3/1:3/1, 7/1:7/1, 3/1:4/1, 5/1:7/1 and 7/1:9/1.]

Figure 4. A simple DFF from a commercial synthesizable library.

The experiments conducted on the DFF employed the same combinations of input transition times and load capacitances used in the experiments described in Sections 3 and 4. That is, four different output load capacitance values were used (footnote 3) and, for each load capacitance value, four different data input signal transition times were simulated. A ramp input was applied to the data signal and a step signal was applied to the clock input. The clock frequency was chosen to be 100 MHz, which is sufficient for all signals to stabilize before the next transition.

The term high Vt (low Vt) DFF describes the DFF in which the threshold voltages of all devices are Vt,high (Vt,low). The term mixed Vt DFF is used to refer to the DFF in which both high Vt and low Vt devices are used. The motivation here is to find a mixed DFF implementation which is as fast as the low Vt DFF but with the least increase in leakage power dissipation compared to the high Vt DFF.

The selection of which transistors were to be high Vt and which were to be low Vt was done as follows. The p and n devices within an inverter have the same Vt. All pass transistors were selected to be low Vt since they never have a leakage path from Vdd to the ground. The inverters on the clock line were low Vt since they determine the overall speed of the DFF. In addition, the two forward inverters in the master stage were low Vt, since they are also on the critical path. The remainder of the gates were high Vt.

Figure 5 shows the propagation delay (Clock to Q) of the three DFF implementations. The mixed Vt DFF has nearly the same propagation delay as that of the low Vt DFF and is about 20% faster than the high Vt DFF. In terms of the total dynamic power (Psc + Psw), the mixed Vt and low Vt DFF have the same power dissipation (see Figure 6). This is about 20% more than that of the high Vt DFF. As to the leakage power, Table 6 shows that the mixed Vt DFF reduces the leakage power over the low Vt DFF from 6.4 nW to about 2.2 nW, a 66% reduction. Note that the leakage power of the low Vt DFF is almost 30 times more than that of the high Vt DFF. The results on the propagation delay from Clock to QN exhibit similar patterns to those in Figure 5.

Unlike the combinational CMOS gates, the overall delay of a DFF must include the setup time, in addition to the propagation delay. In Table 7 the setup times for the DFF are shown for an input transition time of 1 ns and a load capacitance of 25 fF. Interestingly, the low Vt DFF and mixed Vt DFF have a larger setup time for storing a 0 when compared with the high Vt DFF. The reason for this is that the setup time is determined by both the propagation delay of the input gates and the logic threshold value. The logic threshold value of the low Vt input gates is smaller than that of the high Vt input gates. Therefore, for the low Vt DFF the input signal may need to be stable earlier than for the high Vt DFF. As a result, a larger setup time is required for a low Vt DFF in some cases.

Footnote 3. The outputs QN and Q of the DFF are assumed to have the same load capacitance.

[Figure 5 plots. x-axis: data points; y-axis: delay (seconds). Panel "CLK rising to Q rising propagation delay": all low Vt average = 6.1923e-10 s, mixed Vt average = 6.8384e-10 s, all high Vt average = 8.0133e-10 s; average improvement 17.30%. Panel "CLK rising to Q falling propagation delay": all high Vt average = 6.9307e-10 s, all low Vt average = 5.2063e-10 s, mixed Vt average = 5.6649e-10 s; average improvement 19.37%.]

Figure 5. Propagation delay of three different implementations of the DFF (Clock to Q output).

               Pleak
All high Vt    0.23 nW
All low Vt     6.4 nW
Mixed Vt       2.2 nW

Table 6. Leakage power dissipation of three different implementations of the DFF.
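The leakage figures quoted for the DFF follow directly from Table 6. The short sketch below recomputes them; the variable names are illustrative only.

```python
# Leakage power of the three DFF implementations in nW, from Table 6
dff_pleak_nw = {"high": 0.23, "low": 6.4, "mixed": 2.2}

# Reduction achieved by the mixed Vt DFF relative to the all low Vt DFF
mixed_reduction = 1 - dff_pleak_nw["mixed"] / dff_pleak_nw["low"]

# Leakage of each implementation relative to the all high Vt DFF
low_vs_high = dff_pleak_nw["low"] / dff_pleak_nw["high"]
mixed_vs_high = dff_pleak_nw["mixed"] / dff_pleak_nw["high"]

print(f"mixed vs low reduction: {100 * mixed_reduction:.0f}%")
print(f"low vs high: {low_vs_high:.1f}X, mixed vs high: {mixed_vs_high:.1f}X")
```

The mixed Vt DFF therefore still leaks close to ten times more than the all high Vt DFF, which is the price paid for keeping the low Vt devices on the critical path.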

6 Conclusions

In this paper, we presented simulation results using data from a commercial submicron dual Vt process to gauge the impact of using two threshold voltages on power dissipation and delay. We believe that these observations will serve as a valuable aid to circuit designers and CAD software developers. The results raise some new challenges and opportunities for circuit design and optimization.

6.1 Challenges to Circuit Designers

In the design of a cell library, the key challenge is to achieve the best power-delay trade-off. From the experiments, it is clear that by using lower Vt transistors the circuit performance can be improved by as much as 50%, but the leakage power dissipation may increase by as much as 50 to 100 times. In addition to the leakage power dissipation, the short circuit power dissipation is also significantly increased. As the power supply voltage continues to be scaled down, the threshold voltage will also have to be reduced. This will increase


[Figure 6 plot: Total dynamic power dissipation. x-axis: data points (0 to 20); y-axis: power (µW). Legend averages: all low Vt 304.26 µW; mixed Vt 301.21 µW; all high Vt 268.51 µW. Average power increase 18.17 %.]

Figure 6. Total dynamic power of three different implementations of the DFF.

               setup0 (ns)    setup1 (ns)
All high Vt    0.23           0.14
All low Vt     0.26           0.12
Mixed Vt       0.26           0.12

Table 7. Setup time of three different implementations of the DFF.

the leakage power exponentially. Although for current technology the leakage power dissipation is still much smaller than the switching power dissipation, this will become a very serious problem for battery operated portable systems, which are in standby mode for long periods of time. Dual Vt offers an excellent solution to this problem.
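To make the standby argument concrete, the following sketch estimates the share of total energy that is due to leakage when a circuit is active only for a fraction of the time. The model is a simplification introduced here for illustration: leakage power is dissipated continuously, dynamic power only while the circuit is active. The active fractions are hypothetical, and combining the average dynamic power of Figure 6 with the leakage power of Table 6 assumes that the two were obtained under comparable conditions.

```python
def leakage_energy_fraction(p_dynamic_w, p_leak_w, active_fraction):
    """Share of total energy due to leakage for a given active duty fraction.

    Simplified model (an assumption, not from the paper): leakage power is
    dissipated all the time, dynamic power only while the circuit is active.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must lie in [0, 1]")
    dynamic_avg = active_fraction * p_dynamic_w  # time-averaged dynamic power
    total = p_leak_w + dynamic_avg
    return p_leak_w / total if total > 0.0 else 0.0


if __name__ == "__main__":
    # Average dynamic power from Figure 6 and leakage power from Table 6.
    designs = {
        "all_low_vt": (304.26e-6, 6.4e-9),
        "all_high_vt": (268.51e-6, 0.23e-9),
    }
    for active in (1.0, 1e-3, 1e-5):  # hypothetical duty fractions
        for name, (p_dyn, p_leak) in designs.items():
            share = leakage_energy_fraction(p_dyn, p_leak, active)
            print(f"active={active:g} {name:12s} leakage share = {share:.4%}")
```

Under this model the leakage share is negligible for a continuously active DFF, but for the all low Vt DFF it exceeds one half once the circuit is active for only about one part in 100,000 of the time, whereas the all high Vt DFF stays below one tenth at the same duty fraction.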

The experiments also show that the impact of low Vt MOSFETs on the rising and falling delays of CMOS gates is different. By using low Vt NMOS transistors, the output fall time is significantly improved, but with a slight increase in the output rise time. From the analysis of Section 4, this is mainly due to the large short circuit current when the output rises. Regardless of the choice of Vt for the p and n devices, either the rise or the fall delay will increase in comparison with the case when both PMOS and NMOS are high Vt. On the other hand, the experiments also show that, given the same channel width, the low Vt NMOS consumes significantly more leakage power than the low Vt PMOS. These two facts may imply that design styles other than static CMOS may be better suited to the dual or multiple Vt technology. For example, in NMOS domino logic, the performance is determined only by the NMOS transistors. Consequently, the performance could be improved by using low Vt NMOS transistors. On the other hand, since NMOS consumes more leakage power, this provides more room for power delay trade-offs. Finally, there is no short circuit power dissipated in dynamic logic.

The experimental results in Section 5 show that good power delay trade-offs can be achieved by selecting subsets of the transistors in the circuit to have a different Vt. An interesting observation from these experiments is that using all low Vt gates (all devices at Vt,low) does not necessarily result in the fastest D flip-flop, due to the possible increase in the setup time.

6.2 Challenges to CAD Software Developers

Dual Vt technology also poses important challenges to CAD software developers. Several algorithms have recently been reported that optimally assign one of two threshold voltages to the logic gates with the intent of minimizing the leakage power under delay constraints [6, 5]. However, most of these works are based on simplified power and delay models. For example, the short circuit power, which has been demonstrated here to have a significant impact on the delay, has not been considered in [6, 5]. The idea of selecting transistors on the non-critical paths to be high Vt is also promising. But the design style, e.g. static versus dynamic, should affect the algorithms for circuit optimization for the reasons pointed out in Section 6.1. The availability of dual Vt will also have a strong impact on technology mapping algorithms. The new algorithms will have to consider leakage power dissipation as well.
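The general idea of assigning high Vt to gates off the critical path can be sketched with a simple greedy heuristic. This is not the algorithm of [5] or [6]; it is a minimal illustration under strong assumptions: each gate has one fixed delay and one fixed leakage value per threshold voltage, the gates are listed in topological order, and short circuit power and load-dependent delay changes (which the experiments here show to matter) are ignored. All names and numbers in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    name: str
    fanin: list   # names of driving gates (empty for gates fed by primary inputs)
    delay: dict   # {"low": ns, "high": ns}, hypothetical fixed delays
    leak: dict    # {"low": nW, "high": nW}, hypothetical leakage values


def critical_delay(gates, assignment):
    """Longest path delay; `gates` must be listed in topological order."""
    arrival = {}
    for g in gates:
        start = max((arrival[f] for f in g.fanin), default=0.0)
        arrival[g.name] = start + g.delay[assignment[g.name]]
    return max(arrival.values())


def total_leakage(gates, assignment):
    return sum(g.leak[assignment[g.name]] for g in gates)


def greedy_high_vt(gates, max_delay):
    """Start from all low Vt, then move gates to high Vt while timing holds."""
    assignment = {g.name: "low" for g in gates}
    if critical_delay(gates, assignment) > max_delay:
        raise ValueError("delay constraint not met even with all low Vt gates")
    # Try the gates with the largest leakage saving first.
    order = sorted(gates, key=lambda g: g.leak["low"] - g.leak["high"], reverse=True)
    for g in order:
        assignment[g.name] = "high"
        if critical_delay(gates, assignment) > max_delay:
            assignment[g.name] = "low"  # revert: this gate is timing critical
    return assignment


if __name__ == "__main__":
    circuit = [
        Gate("a", [], {"low": 1.0, "high": 1.5}, {"low": 6.0, "high": 0.2}),
        Gate("b", [], {"low": 0.4, "high": 0.6}, {"low": 6.0, "high": 0.2}),
        Gate("c", ["a", "b"], {"low": 1.0, "high": 1.5}, {"low": 8.0, "high": 0.3}),
    ]
    result = greedy_high_vt(circuit, max_delay=2.0)
    print(result, total_leakage(circuit, result))
```

In this toy circuit only gate b, which lies off the critical path a to c, can be moved to high Vt without violating the delay constraint. A realistic tool would need the more detailed power and delay models discussed above, and a greedy pass of this kind gives no optimality guarantee.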

7 Acknowledgment

The authors gratefully acknowledge Dr. Tom Dillinger from Rockwell for his help in conducting the experiments. This work was partly carried out at the Center for Low Power Electronics, which is supported by the National Science Foundation, the Department of Commerce of the State of Arizona, and various companies in the microelectronics industry, including Analog Devices, Analogy, Burr Brown, Hughes Aircraft, Intel, Microchip, Motorola, National Semiconductor, Rockwell, Sicom, SMI, Texas Instruments, and Western Design.

References

[1] A. Chandrakasan, M. Potkonjak, J. Rabaey, and R. Brodersen. Low-power digital CMOS design. IEEE Journal of Solid State Circuits, pages 472–484, Apr. 1992.

[2] B. Davari, R. Dennard, and G. Shahidi. CMOS scaling for high performance and low power. Proceedings of IEEE, 83(4):408–425, Apr. 1995.

[3] R. X. Gu and M. I. Elmasry. Power dissipation analysis and optimization of deep submicron CMOS digital circuits. IEEE Journal of Solid State Circuits, 31(5):707–713, May 1996.

[4] J. Kao, A. Chandrakasan, and D. Antoniadis. Transistor sizing issues and tool for multi-threshold CMOS technology. In Proceedings of Design Automation Conference, Las Vegas, NV, June 1997.

[5] Q. Wang and S. Vrudhula. Efficient procedures for minimizing the stand-by power in dual Vt CMOS circuits. In Proceedings of PATMOS'98, Denmark, Oct. 1998.

[6] L. Wei and K. Roy. Multiple Vth CMOS for leakage control in deep submicron IC's. In Proceedings of PATMOS'98, Denmark, Oct. 1998.