Domino Logic Circuits

High speed Domino Logic Circuits-Sahil Bansal (2010CS10244)


Abstract—Static CMOS is the most widely used logic style today. The complementary metal-oxide-semiconductor (CMOS) circuit style belongs to a broad class of logic circuits called static circuits, in which at every point in time (except during switching transients) each gate output is connected to either VDD or VSS via a low-resistance path. This is in contrast to the dynamic circuit class, which relies on temporary storage of signal values on the capacitance of high-impedance circuit nodes. Domino logic circuits are among the most preferred circuits in today's high-performance processors because they offer a speed and area advantage over static logic circuits.

Index Terms—CMOS, Domino logic, Dynamic logic circuits, noise immunity

I. INTRODUCTION

THE static CMOS style is really an extension of the static CMOS inverter to multiple inputs. A static CMOS gate is a combination of two networks, called the pull-up network (PUN) and the pull-down network (PDN). The function of the PUN is to provide a connection between the output and VDD whenever the output of the logic gate is meant to be 1 (based on the inputs). Similarly, the function of the PDN is to connect the output to VSS when the output of the logic gate is meant to be 0. The PUN and PDN are constructed in a mutually exclusive fashion such that one and only one of the networks is conducting in steady state. In review, the primary advantages of the CMOS structure are robustness (i.e., low sensitivity to noise), good performance, and low power consumption (with no static power consumption). Moreover, it is easy to translate logic into MOSFETs.

However, the CMOS structure has a major drawback: its propagation delay grows as the gate becomes more complex, that is, as fan-in and fan-out increase. Figure 1.1 shows the two-input NAND gate and its equivalent RC switch-level model. If both inputs are driven low, the two PMOS devices are on. The delay in this case is 0.69*(Rp/2)*CL, since the two resistors are in parallel. This is not the worst-case low-to-high transition, which occurs when only one device turns on and is given by 0.69*Rp*CL. For the pull-down path, the output is discharged only if both A and B are switched high, and the delay is given by 0.69*(2RN)*CL to a first order.

Moreover, the number of transistors required to implement an N fan-in gate is 2N, which can result in significant implementation area. The large number of transistors (2N) also increases the overall capacitance of the gate. For an N-input NAND gate, the output capacitance increases linearly with the fan-in, since the number of PMOS devices connected to the output node increases linearly with the fan-in. The fan-out has a large impact on the delay of complementary CMOS logic as well. Each input to a CMOS gate connects to both an NMOS and a PMOS device, and presents a load to the driving gate equal to the sum of the gate capacitances. The above observations are summarized by the following formula, which approximates the influence of fan-in and fan-out on the propagation delay of the complementary CMOS gate:

$t_p = a_1 \, FI + a_2 \, FI^2 + a_3 \, FO$

where FI and FO are the fan-in and fan-out of the gate, respectively, and $a_1$, $a_2$ and $a_3$ are weighting factors that are a function of the technology. Figure 1.2 shows the propagation delay for both transitions as a function of fan-in assuming a fixed fan-out (NMOS: 0.5 µm and PMOS: 1.5 µm). As predicted above, tpLH increases linearly due to the linearly increasing value of the output capacitance. The simultaneous increase in the pull-down resistance and the load capacitance results in an approximately quadratic relationship for tpHL. Gates with a fan-in greater than or equal to 4 become excessively slow and must be avoided.
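The first-order delay expressions of this section can be checked numerically. The following is a minimal Python sketch, not taken from the source: the function names, the resistance and capacitance values, and the weighting factors a1, a2, a3 are all hypothetical and chosen only to show the shape of the relationships.

```python
K = 0.69  # ln(2): time for an RC node to reach 50% of its swing, in units of RC


def nand2_delays(r_p, r_n, c_load):
    """First-order delays of a two-input static CMOS NAND gate (seconds)."""
    return {
        "tpLH_best": K * (r_p / 2) * c_load,   # both PMOS on, resistors in parallel
        "tpLH_worst": K * r_p * c_load,        # only one PMOS on
        "tpHL": K * (2 * r_n) * c_load,        # two NMOS in series
    }


def fan_delay(fan_in, fan_out, a1, a2, a3):
    """Evaluate tp = a1*FI + a2*FI^2 + a3*FO."""
    return a1 * fan_in + a2 * fan_in ** 2 + a3 * fan_out


if __name__ == "__main__":
    # Hypothetical device values, chosen only for illustration.
    for name, value in nand2_delays(r_p=10e3, r_n=5e3, c_load=10e-15).items():
        print(f"{name}: {value * 1e12:.1f} ps")
    # Hypothetical weighting factors; real ones depend on the technology.
    for fi in (2, 4, 8):
        print(fi, fan_delay(fi, fan_out=1, a1=1.0, a2=0.5, a3=2.0))
```

Because of the quadratic term, doubling the fan-in more than doubles the modeled delay, which is the reason high fan-in static gates are avoided.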

High Speed Domino Logic Circuits (Nov. 2011) Sahil Bansal


II. DYNAMIC LOGIC CIRCUIT

It was noted earlier that static CMOS logic with a fan-in of N requires 2N devices. An alternative logic style called dynamic logic obtains the correct result using only N+2 devices for a fan-in of N. With the addition of a clock input, it uses a sequence of pre-charge and conditional evaluation phases to realize complex logic functions. The basic construction of an N-type dynamic logic gate is shown in Figure 2.1. The PDN is constructed exactly as in complementary CMOS. The operation of this circuit can be divided into two major phases, pre-charge and evaluation, with the mode of operation determined by the clock signal. When CLK = 0, the output node is pre-charged to VDD through the PMOS device; when CLK = 1, the evaluation NMOS turns on and the output is conditionally discharged through the PDN.

2.1 N-type Network
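The two-phase operation can be captured in a purely behavioural model. The sketch below is an illustration only: the class name DynamicGate and the convention of passing the pull-down network as a Python function are assumptions, and no electrical effects (leakage, charge sharing) are modeled.

```python
class DynamicGate:
    """Behavioural (not electrical) model of an N-type dynamic gate."""

    def __init__(self, pdn):
        self.pdn = pdn   # function of the inputs: True if the PDN conducts
        self.out = 1     # value stored on the capacitance of the output node

    def step(self, clk, *inputs):
        if clk == 0:
            self.out = 1             # pre-charge: PMOS on, evaluation NMOS off
        elif self.pdn(*inputs):
            self.out = 0             # evaluation: PDN conducts, node discharges
        # Otherwise the node floats and keeps its previous value.
        return self.out


if __name__ == "__main__":
    nand = DynamicGate(lambda a, b: bool(a and b))
    print(nand.step(0, 1, 1))  # pre-charge phase
    print(nand.step(1, 1, 1))  # evaluation with both inputs high
    print(nand.step(1, 0, 0))  # node stays discharged until the next pre-charge
```

The last call shows a key property of the style: once the output has been discharged during evaluation, it cannot return to 1 until the next pre-charge phase.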

A number of important properties can be derived for the dynamic logic gate:

• The number of transistors (for complex gates) is substantially lower than in the static case: N + 2 versus 2N.

• It is non-ratioed. The sizing of the PMOS pre-charge device is not important for realizing proper functionality of the gate. The pre-charge device can be made large to improve the low-to-high transition time (of course, at a cost to the high-to-low transition time). There is, however, a trade-off with power dissipation, since a larger pre-charge device directly increases clock power dissipation.

• It only consumes dynamic power. Ideally, no static current path ever exists between VDD and GND. The overall power dissipation, however, can be significantly higher than that of a static logic gate.

• The logic gates have faster switching speeds. There are two main reasons for this. The first (obvious) reason is the reduced load capacitance, attributed to the lower number of transistors per gate and the single-transistor load per fan-in. Second, the dynamic gate does not have short-circuit current, so all the current provided by the pull-down devices goes into discharging the load capacitance.
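The device-count claim in the first property above is easy to tabulate. This is a minimal sketch with hypothetical function names; it only evaluates the 2N and N + 2 expressions given in the text.

```python
def static_cmos_devices(fan_in):
    """Static CMOS: one NMOS and one PMOS per input."""
    return 2 * fan_in


def dynamic_devices(fan_in):
    """Dynamic logic: N pull-down devices plus pre-charge and evaluation devices."""
    return fan_in + 2


if __name__ == "__main__":
    for n in (2, 3, 4, 8):
        print(n, static_cmos_devices(n), dynamic_devices(n))
```

For a fan-in of 2 the two counts are equal, so the saving appears only for larger gates, which matches the "for complex gates" qualifier in the list.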

However, dynamic logic circuits do have certain limitations, such as charge leakage, charge sharing, backgate (and in general capacitive) coupling, and clock feedthrough. Figure 2.2 illustrates the effect of leakage on the waveforms of a dynamic circuit. Moreover, dynamic logic gates cannot be cascaded directly. To illustrate this, consider two simple N-type dynamic inverters cascaded together.

2.2 Waveform

2.3 Cascading of Dynamic N-type block

During the pre-charge phase (i.e., CLK = 0), the outputs of both inverters are pre-charged to VDD. Assume that the primary input In makes a 0 to 1 transition. On the rising edge of the clock, output Out1 starts to discharge. The second output should remain in the pre-charged state of VDD, since its input Out1 transitions to 0 during evaluation. However, because Out1 takes a finite propagation delay to discharge to GND, the second output also starts to discharge. As long as Out1 exceeds the switching threshold of the second gate, which approximately equals VTn, a conducting path exists between Out2 and GND. Out2 therefore discharges as well, and the lost charge cannot be restored until the next pre-charge phase, resulting in incorrect evaluation.
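The failure can be reproduced with a very crude numerical model. The sketch below is an assumption-heavy illustration, not a circuit simulation: each pull-down is treated as a resistor with time constant tau, the second pull-down is assumed to conduct in proportion to how far Out1 exceeds VTn, and the values of VDD, VTn and tau are hypothetical.

```python
def simulate_cascade(vdd=2.5, vtn=0.5, tau=1.0, dt=0.001, t_end=5.0):
    """Forward-Euler model of two cascaded N-type dynamic inverters.

    Both outputs start pre-charged to vdd and the primary input is held at 1
    for the whole evaluation phase. Time is in units of tau.
    """
    out1 = out2 = vdd
    for _ in range(round(t_end / dt)):
        # Assumed conduction of the second pull-down: zero once Out1 <= VTn.
        overdrive = max(0.0, (out1 - vtn) / (vdd - vtn))
        d_out1 = -out1 / tau                 # first stage discharges freely
        d_out2 = -out2 / tau * overdrive     # second stage leaks while Out1 is high
        out1 += d_out1 * dt
        out2 += d_out2 * dt
    return out1, out2


if __name__ == "__main__":
    out1, out2 = simulate_cascade()
    print(f"Out1 = {out1:.3f} V, Out2 = {out2:.3f} V")
```

In this model Out1 ends near GND as intended, but Out2 settles below VDD even though it should have stayed fully charged. How much charge is lost depends entirely on the assumed parameters.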

III. DOMINO LOGIC CIRCUIT

A domino logic module consists of an N-type dynamic logic circuit followed by a static inverter, as illustrated in Figure 3.1. During pre-charge, the output of the N-type dynamic gate is charged up to VDD and the output of the inverter is set to 0. During evaluation, the dynamic gate conditionally discharges and the output of the inverter makes a conditional transition from 0 to 1. The input to a domino gate always comes from the output of another domino gate.



3.1 Domino Logic

This ensures that all inputs to the domino gate are set to 0 at the end of the pre-charge period. Hence, the only possible transition for an input during the evaluation period is the 0 to 1 transition, which is the condition required for dynamic gates to be cascaded correctly. The introduction of the static inverter has the additional advantage that the fan-out of the gate is driven by a static inverter with a low-impedance output, which increases noise immunity. The buffer furthermore reduces the capacitance of the dynamic output node by separating internal and load capacitances.

During pre-charge, all inputs are set to 0. During evaluation, the output of the first domino block either stays at 0 or makes a 0 to 1 transition, affecting the second domino block. This effect might ripple through the whole chain, one stage after the other, as with a line of falling dominoes, hence the name.

Figure 3.2 illustrates the switching behavior of a static and a domino buffer as the data input to the cell rises. The domino cell switches at a lower input voltage, which leads to a speedup: the driving cell reaches the NMOS threshold voltage sooner than it reaches the higher switching threshold of a static gate. The speed advantage of a domino cell over an equivalent static design is a factor of 1.5 to 2.5.

3.2 The output voltage of a static and a domino buffer as the input switches from 0 to 1
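The ripple behaviour described in this section can be illustrated with a behavioural model of a chain of domino buffers. This is a sketch under simplifying assumptions: every stage is a single-input buffer, one stage reacts per step, and the function name evaluate_chain is hypothetical.

```python
def evaluate_chain(first_input, n_stages):
    """Return the inverter outputs of a domino buffer chain after each step."""
    # Pre-charge: every dynamic node is at 1, so every inverter output is 0.
    dyn = [1] * n_stages
    out = [0] * n_stages
    history = [tuple(out)]
    # Evaluation: each stage sees an input that can only rise from 0 to 1.
    for i in range(n_stages):
        stage_in = first_input if i == 0 else out[i - 1]
        if stage_in:
            dyn[i] = 0           # pull-down conducts, dynamic node discharges
        out[i] = 1 - dyn[i]      # static inverter
        history.append(tuple(out))
    return history


if __name__ == "__main__":
    for state in evaluate_chain(1, 3):
        print(state)
```

With the first input at 1, the outputs rise one after another along the chain; with the first input at 0, every output stays at its pre-charged value of 0.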

IV. OPTIMIZING DOMINO CIRCUITS

• Since domino gate outputs are low during the pre-charge phase, gates whose inputs come only from domino outputs do not need the “evaluate” NFET, since all the NFETs in the pull-down network are off during pre-charge anyway.

• With the inclusion of the evaluation devices in domino circuits, all gates pre-charge in parallel and the pre-charge operation takes only two gate delays: the output of the dynamic gate charges to VDD and the inverter output is then driven low. The critical path during evaluation goes through the pull-down path of the dynamic gate and the PMOS pull-up path of the static inverter. Therefore, to speed up the circuit, the beta ratio of the static inverter should be made high so that its switching threshold is close to VDD. This can be accomplished by using a small (minimum-sized) NMOS and a large PMOS device. The minimum-sized NMOS does not affect performance, since pre-charge happens in parallel. The only disadvantage of using a large beta ratio is a reduction in noise margin.

• One optimization that reduces area is Multiple Output Domino Logic. The basic concept is illustrated in Figure 4.1. The idea is to exploit the partial trees in the pull-down network and the fact that certain outputs are subsets of other outputs.

4.1 Multiple Circuit Domino
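The effect of the inverter beta ratio on its switching threshold can be estimated with the textbook long-channel (square-law) model, which the source does not give and which is used here only as an assumption. In the sketch below, kp_over_kn stands for the ratio of the PMOS to NMOS transconductance parameters (which scales with the width ratio), and the voltage values are hypothetical.

```python
import math


def switching_threshold(vdd, vtn, vtp_abs, kp_over_kn):
    """Long-channel estimate of the inverter switching threshold VM.

    Obtained by equating the saturation currents of the NMOS and PMOS:
    VM = (VTn + r * (VDD - |VTp|)) / (1 + r), with r = sqrt(kp / kn).
    """
    r = math.sqrt(kp_over_kn)
    return (vtn + r * (vdd - vtp_abs)) / (1 + r)


if __name__ == "__main__":
    for ratio in (1, 4, 16):
        vm = switching_threshold(vdd=2.5, vtn=0.5, vtp_abs=0.5, kp_over_kn=ratio)
        print(f"kp/kn = {ratio:2d}: VM = {vm:.2f} V")
```

Under this model a stronger PMOS moves the threshold upward, toward VDD - |VTp| in the limit, so the inverter output starts rising earlier as the dynamic node falls. The same shift leaves less margin for noise on the dynamic node, which is the disadvantage noted in the list above.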

V. LIMITATIONS AND MODIFICATIONS

A. Leakage Currents: Direct tunneling of electrons and holes through the gate insulator causes gate oxide leakage current. The tunneling probability of carriers increases dramatically as the gate oxide thickness (tox) is scaled down in each new technology generation. In general, it is hard to avoid charge sharing and leakage problems in a dynamic circuit design. Therefore, a weak P device (staticizing gate) is added to compensate for the charge lost through leakage. Although this affects performance, the gate is still faster than static CMOS.

5.1 Staticized gates

B. Charge Sharing: Charge sharing is a phenomenon in which the charge on a pre-charged dynamic node is redistributed over discharged intermediate nodes of the pull-down network, leaving the dynamic node at a lower voltage than required. In the given figure, when CLK goes high, the voltage on the dynamic node goes to:

$\frac{3C}{3C + 6C} \, V_{DD} = \frac{1}{3} \, V_{DD} \approx 0.33 \, V_{DD}$

which is low enough to switch the output inverter. This situation can be resolved by adding pre-charge devices to the intermediate nodes, or by increasing the size of the output buffer, which increases the capacitance of the dynamic node (the faster output buffer may compensate for the larger internal capacitance).

5.2 Charge Sharing
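The charge-sharing voltage follows from conservation of charge between the dynamic node and the internal nodes. The following Python sketch reproduces the 3C and 6C example from the text; the function name and the v_int0 parameter (the initial voltage of the internal nodes) are assumptions added for illustration.

```python
def shared_voltage(c_dyn, c_int, vdd, v_int0=0.0):
    """Final voltage after the dynamic node shares charge with internal nodes.

    c_dyn  : capacitance of the pre-charged dynamic node
    c_int  : total capacitance of the internal pull-down nodes
    v_int0 : initial voltage of the internal nodes (0 if left discharged)
    """
    total_charge = c_dyn * vdd + c_int * v_int0
    return total_charge / (c_dyn + c_int)


if __name__ == "__main__":
    # Capacitances in units of C, voltages in units of VDD.
    print(shared_voltage(3, 6, 1.0))              # discharged internal nodes
    print(shared_voltage(3, 6, 1.0, v_int0=1.0))  # internal nodes pre-charged too
```

The second call corresponds to the first remedy in the text: when the intermediate nodes are pre-charged as well, no charge is redistributed and the dynamic node stays at VDD.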

C. Inverting Logic: A major limitation of domino logic is that only non-inverting logic can be implemented. This is due to the inclusion of the static inverter at the output of each dynamic gate.

5.3 Dual-rail Domino

This issue can be resolved by implementing a dual-rail domino circuit, which generates both polarities of each output.

D. Capacitive Coupling

5.4 Capacitive Coupling

When multiple-input gates are used as the domino buffer, changes on the “other” input during the evaluate phase can cause the dynamic node voltage to sag due to capacitive coupling, leading to an unintended transition.

E. Technology Mapping for Domino Logic: One tricky aspect of designing domino logic circuits is that the design cannot be automated as efficiently as for conventional logic circuits. This is because extra timing checks are needed while designing the circuit. Also, since the number of possible domino cells is extremely large, the layout of a cell is produced on the fly instead of being taken from a parameterized library (a collection of gates in a simulation software).

VI. CONCLUSION AND FURTHER SCOPE OF STUDY

Domino logic circuit techniques have been extensively applied in recent high-performance microprocessors due to the superior speed and area characteristics of domino circuits compared to static CMOS circuits. However, domino logic circuits have an inherent drawback of increased noise sensitivity. There are also the problems of limited design automation and increased power dissipation. Aggressive circuit design techniques that focus only on enhancing circuit speed without considering power are no longer an acceptable approach in most high-complexity digital systems. Thus, the focus should be on further modifications such as low-swing (reduced dynamic voltage swing) domino logic circuits.

