
Chapter 3 Design of Clock dividers - Shodhgangashodhganga.inflibnet.ac.in/bitstream/10603/117846/11/11... · 2018-07-04 · ing the power consumption of the clock divider circuit


Chapter 3

Design of Clock dividers

3.1 INTRODUCTION

The clock divider circuit has found immense application in multiple clock domain (MCD) systems like SoC (System on Chip) and GALS (Globally Asynchronous, Locally Synchronous). Clock generation and clock distribution for these MCD systems are the costliest in terms of power consumption (R. Chen, Vijaykrishnan, & Irwin, 1999). The clock generation system generates different frequencies for the clock domains from the basic crystal oscillator (tens of MHz) using PLLs (as frequency multipliers) followed by clock dividers. Hence minimizing the power consumption of the clock divider circuit is a crucial step in the design of a low power clock generator circuit for MCD systems. The individual frequency required by the IP core can be derived from the PLL generating the LCM frequency (Chapter 2) by clock dividers.

The clock divider circuit is designed using D flip-flops. To design a low power clock divider circuit, we need a D flip-flop which consumes less power and can operate at very high speed (GHz). We studied and simulated some D flip-flops which are already proposed in literature (Gao, Qiao, Wei, & Yang, 2006; Sharma, Sharma, Singh, & Sharma, 2011; Yu, Do, Lim, Yeo, & Ma, 2006; Zhang & Sun, 2007). Simulations were done in CADENCE VIRTUOSO using 90 nm technology files. In this chapter, we present our findings on the aforementioned flip-flops and our improvements. We also present the post-layout simulation results of clock dividers using these flip-flops.

3.2 MODIFIED ETSPC FLIP FLOP (METSPC-FF)

3.2.1 EXTENDED TRUE SINGLE-PHASE CLOCK (ETSPC) FLIP FLOP OF XIAOPENG YU ET AL., 2006

A dynamic ETSPC (Extended True Single-Phase Clock) flip flop is presented in (Yu et al., 2006) (Figure 3.1), which doesn’t have the stacked MOS structure that slows the switching speed.

Figure 3.1: ETSPC-FF (transistors P1 to P3 and N1 to N3, internal nodes S1 and S2, input D, clock CLK, outputs Q and Q̄)

The ETSPC shown in Fig. 3.1 is a negative edge triggered flip flop. When the clock is high, N1 and N2 will be ON and P3 will be OFF. The nodes S1 and S2 are precharged to low through N1 and N2 irrespective of the state of D. The evaluation phase starts at the negative edge of the clock.

1. Case 1: If D is low, P1 will turn ON to make node S1 high, which in turn will turn OFF P2 to make node S2 stay low. Thus node S2 will turn OFF N3 to make node S3 (Q̄) high and hence Q will become low.

2. Case 2: If D is high, P1 will turn OFF to make node S1 stay low, which in turn will turn ON P2 to make node S2 high. Thus node S2 will turn ON N3 to make node S3 (Q̄) low and hence Q will become high.

3.2.2 PROPOSED MODIFIED ETSPC FLIP FLOP

The basic ETSPC of (Yu et al., 2006) has a delay of 36.34 pS and consumes 141.01 µW of power, resulting in a PDP of 5.12 fJ at 6 GHz (nominal process corner). We propose a Modified ETSPC FF (Fig. 3.2) which has a better PDP, as will be discussed below. The proposed METSPC-FF is a positive edge triggered FF. Its operation consists of a precharge phase and an evaluation phase.

Precharge Phase

When clk is low, transistors P1 and P2 will be turned ON and N3 will be turned OFF.

1. Case 1p: When D is low, N1 will be turned OFF, thus node A will be high through P1 and it will turn ON N2. The circuit then behaves as shown in Fig. 3.3. This would pull B low, which we don’t want. To avoid this, the transistor P2 is sized so that B remains high in spite of N2 being ON.

$$\tau_{charging} = R_p \cdot C_{out} \qquad (3.1)$$

$$\tau_{discharging} = R_n \cdot C_{out} \qquad (3.2)$$

where $R_p$ and $R_n$ are the on-resistances of the PMOS and NMOS. To maintain charge in node B during evaluation period,

$$\tau_{charging} < \tau_{discharging} \qquad (3.3)$$

This implies $R_p < R_n$. We know that for CMOS, the on-resistance of a PMOS is greater than the on-resistance of an NMOS of the same size. Hence, we can increase the (W/L) of the PMOS to decrease its on-resistance. Since the width of P2 is larger than that of N2 (the resistance of P2 is lower than that of N2), node B will stay high through P2 irrespective of N2 being ON or OFF. A similar argument holds true for the last branch P3, N3. Referring to Fig. 3.2 again, the middle branch (P2, N2) has to drive transistor P3 alone, but the last branch (P3, N3) has to drive the 2 MOS transistors of the inverter. Hence the size of P3 is chosen to be almost twice that of P2.

Figure 3.2: Modified ETSPC flip flop (W/L: P1 = 1.2, P2 = 5, P3 = 9; N1 = N2 = N3 = 1.2; internal nodes A and B)

2. Case 2p: When D is high, N1 will be turned ON, thus node A will discharge through N1 and it will turn OFF N2; therefore node B will stay high through P2.
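The sizing argument for P2 and N2 can be illustrated with a first-order resistive divider. This is only a sketch under stated assumptions: each ON transistor is treated as a linear resistor whose value scales as 1/(mobility × W/L), the electron-to-hole mobility ratio is taken as 2.5, and the W/L values 5 (P2) and 1.2 (N2) are those annotated in Fig. 3.2. The function names are hypothetical, and real transistors are nonlinear, so the result indicates a trend rather than a simulated node voltage.

```python
# First-order estimate of node B when P2 and N2 are both ON (Case 1p).
# Assumption: on-resistance scales as 1 / (mobility * W/L); values are relative.

MOBILITY_RATIO = 2.5  # electron mobility / hole mobility (assumed value)


def on_resistance(w_over_l, is_pmos):
    """Relative on-resistance (arbitrary units) of one transistor."""
    mobility = 1.0 if is_pmos else MOBILITY_RATIO
    return 1.0 / (mobility * w_over_l)


def node_b_fraction(wl_p2, wl_n2):
    """Fraction of VDD at node B, treating P2 and N2 as a resistive divider."""
    r_p = on_resistance(wl_p2, is_pmos=True)
    r_n = on_resistance(wl_n2, is_pmos=False)
    return r_n / (r_p + r_n)


sized = node_b_fraction(5.0, 1.2)   # P2 widened as in Fig. 3.2
equal = node_b_fraction(1.2, 1.2)   # hypothetical equal sizing, for contrast
print(f"sized P2: {sized:.3f} of VDD, equal W/L: {equal:.3f} of VDD")
```

Under these assumptions the widened P2 keeps the estimate for node B above half of VDD, whereas equal W/L values would let N2 pull it below half.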

Figure 3.3: METSPC-FF during precharge phase (P2 and N2 driving node B, loaded by Cout)

Table 3.1: State of nodes at Precharge phase

Qt-1   Case   Clk   Dt   A   B   Q̄t   Qt
0      1p     0     0    1   1   1    0
1      2p     0     1    0   1   0    1
1      1p     0     0    1   1   0    1
0      2p     0     1    0   1   1    0

Table 3.2: State of nodes at Evaluate phase

Case   Clk           Dt   A   B   Q̄t   Qt
1e     rising edge   0    1   0   1    0
2e     rising edge   1    0   1   0    1

Since both P3 and N3 will stay turned OFF in the precharge phase, Q̄t and Qt will hold the previous states Q̄t-1 and Qt-1 respectively, even though the present state of D (Dt) changes, as shown in Table 3.1.

Evaluation Phase

When clk goes high, transistors P1 and P2 will be turned OFF, N3 will be turned ON and the nodes A and B will hold the precharge state as shown in Table 3.2.

1. Case 1e: If Dt is low at the rising edge of the clock, A will stay high but B will discharge to GND through N2, since P2 is OFF in the evaluation phase. Thus node B will turn ON P3. The node Q̄t will stay high as the width of P3 is larger than that of N3 (the resistance of P3 will be lower than that of N3). Therefore Qt will become low.

2. Case 2e: If Dt is high at the rising edge of the clock, A will stay low and B will also stay high since N2 is OFF. Thus node B will turn OFF P3. The node Q̄t will be discharged to GND through N3. Therefore Qt will become high.
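The precharge and evaluation cases can be summarised as a small behavioural model. The sketch below is a logic-level abstraction under stated assumptions (ideal 0/1 node values, no timing, no charge leakage), written only to reproduce the rows of Tables 3.1 and 3.2; the function names are hypothetical and not part of the original design flow.

```python
# Logic-level sketch of the METSPC-FF node states (assumed ideal 0/1 levels).

def precharge(d, q_prev):
    """clk low: P1 and P2 are ON, N3 is OFF, so the output holds its old value."""
    a = 0 if d else 1          # N1 discharges A only when D is high
    b = 1                      # P2 is sized to keep B high even if N2 is ON
    return {"A": a, "B": b, "Qb": 1 - q_prev, "Q": q_prev}


def evaluate(d):
    """Rising clock edge: P1 and P2 turn OFF, N3 turns ON."""
    a = 0 if d else 1          # A keeps its precharge value
    b = 0 if a else 1          # N2 (gate A) discharges B when A is high
    qb = 1 if b == 0 else 0    # B low turns ON P3, which wins over N3
    return {"A": a, "B": b, "Qb": qb, "Q": 1 - qb}


for d in (0, 1):
    print("D =", d, "precharge:", precharge(d, q_prev=0), "evaluate:", evaluate(d))
```

In this abstraction Q simply follows D at the rising edge and is held while clk is low, which is the positive edge triggered behaviour described above.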

The simulation results showed that the METSPC flip flop has a propagation delay of 20.03 pS and consumes 88.74 µW of power, resulting in a PDP of 1.77 fJ at 6 GHz (nominal process corner), which is better than that of the ETSPC flip flop. Hence it is more suitable for high frequency (GHz) operation.
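The PDP figures quoted for the two flip-flops follow directly from delay × power. The short sketch below, with a hypothetical helper name, repeats that arithmetic using only the delay and power values reported in this section.

```python
def pdp_fj(delay_ps, power_uw):
    """Power-delay product in femtojoules (1 pS * 1 uW = 1e-18 J = 1e-3 fJ)."""
    return delay_ps * power_uw * 1e-3


etspc_pdp = pdp_fj(36.34, 141.01)    # ETSPC-FF at 6 GHz, nominal corner
metspc_pdp = pdp_fj(20.03, 88.74)    # METSPC-FF at 6 GHz, nominal corner
print(f"ETSPC  PDP: {etspc_pdp:.3f} fJ")
print(f"METSPC PDP: {metspc_pdp:.3f} fJ")
```

The products agree with the quoted 5.12 fJ and 1.77 fJ to within rounding of the last digit.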

3.3 PERFORMANCE ANALYSIS OF METSPC-FF

Fig. 3.4 shows the variation of propagation delay and power dissipation with frequency, which are as expected. To make sure that our FF design is insensitive to process variations, we did extensive simulation of our design across all process corners for various frequencies.

Figure 3.4: Propagation delay and Power dissipation of METSPC-FF for various frequencies (axes: Propagation Delay (pS) and Power Dissipation (uW) versus Frequency (GHz), 1.5 to 6 GHz)

Three process corners exist: typical, fast and slow. Fast and slow corners exhibit carrier mobilities that are higher and lower than normal, respectively. For example, a corner designated as FS denotes fast nmos and slow pmos. There are therefore five possible corners: typical-typical (TT, labelled NN for nominal in our plots), fast-fast (FF), slow-slow (SS), fast-slow (FS), and slow-fast (SF). The first three corners (TT, FF, SS) are called even corners, because both types of devices are affected evenly, and they generally do not adversely affect the logical correctness of the circuit. The resulting devices can function at slower or faster clock frequencies. The last two corners (FS, SF) are called "skewed" corners, and are cause for concern. This is because one type of MOS will switch much faster than the other, and this form of imbalanced switching can cause one edge of the output to have much less slew than the other edge. The mobility of electrons is 2.5 to 3 times the mobility of holes. Hence nmos is 2.5 to 3 times faster than pmos.

We have plotted the average propagation delay of our FF across all corners in Fig. 3.5 (NN: nominal; FF: fast NMOS, fast PMOS; FS: fast NMOS, slow PMOS; SF: slow NMOS, fast PMOS; SS: slow NMOS, slow PMOS).

1. FF: Both devices are fast, hence the least propagation delay.

2. FS: Since a typical nmos is inherently faster than a pmos in terms of switching speed, making it faster makes the overall device faster. Hence the propagation delay is lower than in the nominal case.

3. NN: Typical propagation delay.

4. SF: Since a typical nmos is inherently faster than a pmos in terms of switching speed, making it slower makes the overall device slower. Hence the propagation delay is higher than in the nominal case.

5. SS: Both devices are slow, hence the highest propagation delay.

Figure 3.5: Average propagation delay of METSPC-FF across all corners (axes: Average Propagation Delay (pS) versus Process Corners FF, FS, NN, SF, SS)

Figure 3.6: Average power dissipation of METSPC-FF across all corners (axes: Average Power Dissipation (uW) versus Process Corners FF, FS, NN, SF, SS)

The propagation delay of our FF averaged over all corners turns out to be 23.48 pS. Similarly, we have plotted the average power dissipation of our FF across all corners in Fig. 3.6. The power dissipation of our FF averaged over all corners turns out to be 76.44 µW. The PDP averaged over all corners is 1.794 fJ.

3.4 SELF-BLOCKING FLIP FLOP

A single phase clocked flip-flop, SBFF, is proposed in Li, Jia, Liang, and Wang (2012), where the authors claim that their FF has a better power-delay product than even the most recent Sense Amplifier FF. To verify the claims of SBFF, we simulated it in TSMC 90 nm and optimized the W/L ratios for proper functionality.

The SBFF consists of a dynamic XOR gate in the first stage and a differential storage latch in the second stage. The slave latch is controlled by the X and clk signals from the XOR gate. When clk is low, the node X is precharged and N7 is turned OFF, thus the slave latch is opaque to changes in D. At the positive edge of the clk, signal X is evaluated to D ⊕ Q.

1. Case1: When the present state D and the previous state Q are the same (Dt ⊕ Qt-1 = 0), node X discharges through N1, which in turn will hold N6 OFF. This prevents data from entering the storage latch, since the stored data does not need to change.

Figure 3.7: Self Blocking Flip Flop (transistors P1 to P5 and N1 to N11, inputs D, DB and CLK, internal node X, outputs Q and QB)

2. Case2: When the present state D and the previous state Q are different (Dt ⊕ Qt-1 = 1), node X will hold the VDD state. Thus both N6 and N7 will be turned ON.

When the present state is high (Dt = 1), N8 will be turned ON, which will pull Q to high (Qt = 1) through P5. Since Qt is high, N10 is turned ON, making Q̄t zero. If the present state is low (Dt = 0), N9 will be turned ON (since D̄t = 1), which will pull Q̄t to high (Qt = 0) through P4. Since Q̄t is high, N11 is turned ON, making Qt zero. Thus the SB-FF will hold the values Qt and Q̄t through the differential latch set-up in the evaluation stage. Once Q and D become the same, signal X will discharge through N1 to GND. Hence changes in D after the positive edge of the clock will not affect Q.

3.5 PERFORMANCE ANALYSIS OF SB-FF

After extensive simulation of SB-FF with the optimized W/L values shown in Fig. 3.7, it was found that this SB-FF has a lower PDP than METSPC at frequencies below 1.5 GHz. But SB-FF fails to function as a flip flop above 1.5 GHz due to setup time constraints. Hence, we concluded that SB-FF is a better option for sub 1.5 GHz operation.

Fig. 3.8 shows the variation of propagation delay and power dissipation of SBFF with frequency, which are as expected. Since the authors in Li et al. (2012) didn’t check their FF for process insensitivity, we verified the functionality of the SBFF across all process corners for frequencies below 1.5 GHz. We have plotted the average propagation delay of SB-FF across all corners in Fig. 3.9. The propagation delay of SBFF averaged over all corners turns out to be 89.80 pS.

Figure 3.8: Propagation delay and Power dissipation of SB-FF for various frequencies (axes: Propagation Delay (pS) and Power Dissipation (uW) versus Frequency (MHz), 100 to 1500 MHz)

Figure 3.9: Average Propagation delay of SB-FF across all process corners (axes: Average Propagation Delay (pS) versus Process Corners FF, FS, NN, SF, SS)

Similarly, we have plotted the average power dissipation of our SBFF across all corners in Fig. 3.10. The power dissipation of SBFF averaged over all corners turns out to be 76.44 µW. The PDP averaged over all corners is 0.577 fJ.

Figure 3.10: Average Power Dissipation of SB-FF across all process corners (axes: Average Power Dissipation (uW) versus Process Corners FF, FS, NN, SF, SS)

3.6 COMPARISON OF METSPC-FF AND SB-FF

Compared to the average PDP of METSPC-FF, which is 1.794 fJ, the average PDP of SBFF (0.577 fJ) is 68% lower, but SBFF can function only at frequencies below 1.5 GHz. The PDP of METSPC-FF is higher because its power consumption is high, due to the short circuit current flowing in the direct path between VDD and GND during the precharge phase.
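The 68% figure can be checked from the two corner-averaged PDP values reported earlier (1.794 fJ for METSPC-FF and 0.577 fJ for SBFF). The helper name below is hypothetical; the sketch only repeats the arithmetic.

```python
def percent_reduction(reference, value):
    """Relative reduction of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


reduction = percent_reduction(1.794, 0.577)  # METSPC-FF vs SBFF average PDP in fJ
print(f"SBFF PDP is {reduction:.1f} % lower than METSPC-FF PDP")
```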

3.7 POST-LAYOUT SIMULATION OF FLIP FLOPS

Parasitics play a crucial role in the performance of a circuit. They need to be considered especially in high frequency designs like ours. Parasitics have to be considered all the more in flip-flops because of the potential set-up and hold time violations that can be caused by them. Since the clock divider circuit takes a GHz-range frequency as input to perform division, we have done the post layout simulation of the clock divider circuits to verify their suitability for high frequency operation. Post-layout simulations also help us get an accurate estimate of power consumption. Fig. 3.11 and Fig. 3.12 show the layouts of METSPC-FF and SB-FF respectively. We have verified the proper functionality of the flip flops after RC extraction. Post layout simulation results show that METSPC-FF functions as a FF (without setup time and hold time violations) for input frequencies up to 5.6 GHz and SB-FF up to 1.35 GHz.

Figure 3.11: Layout of METSPC-FF (pins: Clk, D, Q, Qb)

3.8 CLOCK DIVIDER CIRCUITS

We use the flip-flops discussed in the previous sections to construct the ‘divide by 2’, ‘divide by 3’, ‘divide by 5’ and ‘divide by 7’ circuits shown in Figure 3.13. We can design a ‘divide by n’ circuit for any n that factors into 2, 3, 5 and 7 by cascading these basic ‘divide by’ circuits.

For ‘divide by 3’, ‘divide by 5’ and ‘divide by 7’, the duty cycle of the divided clock will not be 50 %. In case the IP core requires a clock with 50 % duty cycle, duty cycle corrector circuits can be used. Figure 3.14 shows the post layout simulation waveform of our clock divider generating 450 MHz from 1.8 GHz.

Figure 3.12: Layout of SB-FF (pins: Clk, D, Db, X, Q, Qb)

Figure 3.13: Clock Divider circuits: (a) Divide by 2 circuit, (b) Divide by 3 circuit, (c) Divide by 5 circuit, (d) Divide by 7 circuit
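The 1.8 GHz to 450 MHz case corresponds to a division by 4, which two cascaded ‘divide by 2’ stages provide. The following behavioural sketch models each stage as an ideal toggle flip-flop that flips on every rising input edge; this is an assumption made for illustration (sampled 0/1 waveforms, no delay), and the function names are hypothetical rather than taken from the simulated circuits.

```python
def divide_by_2(samples):
    """Ideal toggle flip-flop: the output flips on each rising edge of the input."""
    out, state, prev = [], 0, 0
    for level in samples:
        if prev == 0 and level == 1:   # rising edge detected
            state ^= 1
        out.append(state)
        prev = level
    return out


def rising_edges(samples):
    """Count 0 -> 1 transitions, assuming the signal was low before the first sample."""
    prev, count = 0, 0
    for level in samples:
        if prev == 0 and level == 1:
            count += 1
        prev = level
    return count


clk = [1, 0] * 16                          # 16 input cycles, two samples per cycle
div4 = divide_by_2(divide_by_2(clk))       # two cascaded divide-by-2 stages
f_in_hz = 1.8e9
f_out_hz = f_in_hz * rising_edges(div4) / rising_edges(clk)
print(rising_edges(clk), rising_edges(div4), f_out_hz)
```

Because each stage toggles on one input edge only, its output has a 50 % duty cycle, unlike the odd dividers discussed above.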

Figure 3.14: Post layout simulation waveform of Clock divider to generate 450 MHz from 1.8 GHz

Chapter 4

Design of Fractional-N clock divider

4.1 INTRODUCTION

The clock generation system generates the frequencies for the different clock domains from the basic crystal oscillator (tens of MHz) using PLLs (as frequency multipliers) followed by clock dividers (Chapters 2 and 3). In MCD systems, we generate many clock signals of various frequencies from a high frequency clock by frequency division. Frequency division by an integer can be achieved using flip flops and basic gates (Fig. 4.1).

Figure 4.1: Clock generation for MCD systems (a crystal oscillator of, say, 24 MHz feeds a PLL acting as a frequency multiplier to 2 GHz; D-FF based clock dividers /2, /5 and /2 then provide 1 GHz, 400 MHz and 200 MHz)

There may be certain blocks in MCD systems which need a clock that cannot be derived by simple integer division. Hence we need fractional-N frequency dividers (FFD) to provide division ratios like 4.25, 8.75 etc. The authors in Boon, Do, Yeo, and Ma (2005) present a FFD which can be used in the feedback loop of the phase locked loop (PLL). In this work, we improve the fractional frequency divider (FFD) of Boon et al. (2005) in the following manner.

1. We improve the division resolution of the FFD to 1/8. The improved FFD and its operation are presented in Sections 4.3 and 4.4.

2. The power consumption of the FFD is reduced by the judicious use of the two low power, high speed flip flops presented in Chapter 3. We present post layout simulation results of the improved FFD in Section 4.5.

We conclude this chapter by analyzing our FFD in terms of division ratio error, frequency of operation and power consumption.

4.2 BASIC CONCEPT OF FRACTIONAL FREQUENCY DIVIDER

Fractional division is achieved by phase shifting the input clock signal of frequency ‘f’ and manipulating the integer divided signals of frequency ‘f/2’, ‘f/4’ and ‘f/8’ to achieve the desired division ratio (Boon et al., 2005; Chang and Cheng, 2011). Fig. 4.2 shows the signals of successive div-by-2 circuits (four div-by-2 circuits are connected in series to generate f/2, f/4, f/8 and f/16 from a clock of input frequency ‘f’) to achieve a division ratio of 15.875 on a 1 GHz clock.

The FFD will take 16 clock periods to generate a fractionally divided clock (of ratio 15.875). As shown in Fig. 4.2, the first clock cycle is phase shifted by 45°, i.e. for a 1 GHz input clock, the first ON period of the signal is reduced by 125 pS (375 pS instead of 500 pS). As a result, the output of the first div-by-2 circuit (f/2) will consist of one div-by-1.875 operation (since the input clock is phase shifted by 45°) and seven div-by-2 operations. The output of the first div-by-2 circuit forms the input for the successive div-by-2 circuits, thus dividing the input signal by a ratio of 15.875 as illustrated in Fig. 4.2. The same sequence of phase shifting and division will repeat every 16 clock cycles.
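The divide-by-15.875 example can be retraced numerically. The sketch below rests on a simplifying assumption: each div-by-2 stage is ideal and merges two consecutive input periods into one output period, with periods expressed in units of the input clock period T. The function names are hypothetical.

```python
def phase_shift_time_ps(phase_deg, f_hz):
    """Time shift in picoseconds corresponding to a phase shift at frequency f_hz."""
    return phase_deg / 360.0 / f_hz * 1e12


def halve(periods):
    """Ideal div-by-2 stage: each output period spans two consecutive input periods."""
    return [periods[i] + periods[i + 1] for i in range(0, len(periods), 2)]


# 16 input cycles in units of T; the first one is shortened by 45 degrees (T/8).
input_periods = [1.0 - 0.125] + [1.0] * 15
f2 = halve(input_periods)    # output of the first div-by-2 stage
f4 = halve(f2)
f8 = halve(f4)
f16 = halve(f8)              # FFD output period, in units of T

print("shift for 45 degrees at 1 GHz (pS):", phase_shift_time_ps(45, 1e9))
print("f/2 periods:", f2)
print("f/4 periods:", f4)
print("f/8 periods:", f8)
print("FFD period:", f16)
```

The resulting period lists correspond to the values annotated on the f/2, f/4, f/8 and FFD waveforms of Fig. 4.2.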

Figure 4.2: Illustration of division for divide-by-15.875 (waveforms: Clock; Phase Shifted Clock, 45° phase shifted (125 pS) for a 1 GHz clock; f/2 with periods 1.875, 2, 2, 2, 2, 2, 2, 2; f/4 with periods 3.875, 4, 4, 4; f/8 with periods 7.875, 8; FFD (f/16) with period 15.875)

4.3 FRACTIONAL-N FREQUENCY DIVIDER WITH RESOLUTION OF (1/8)

4.3.1 FRACTIONAL-N FREQUENCY DIVIDER CIRCUIT

The frequency divider presented in this chapter has a division ratio range 3.75 times larger than that of the conventional fractional-N divider (Boon et al., 2005), because its division modulus ranges from (N - 1.875) to (N + 1.875) while a conventional fractional divider only has a division modulus of N to (N + 1). Four divide-by-2 flip-flops (FFs) can be used to make it operational with N = 16, i.e., from 16 - 1.875 = 14.125 to 16 + 1.875 = 17.875, depending on backward or forward propagation. Fig. 4.3 shows the proposed fractional divider architecture, where Fin is the appropriately phase shifted input signal.

Figure 4.3: Fractional-N frequency divider with a divide by resolution of (1/8) (blocks: Phase Shift, fed by the 0°, 90°, 180° and 270° clocks from the quadrature VCO (PLL) and producing the 0° to 315° phases in 45° steps; Phase Select, producing the phase shifted Fin; Div-by-16 using 4 Div-by-2 FFs, with outputs /2, /4, /8 and /16 (FFD Clock); Modulus Control, with inputs Mode1 to Mode4 and output ‘Next’; Phase Control, with input Backward/Forward and outputs Control1 to Control3)

4.3.2 PHASE SHIFT CIRCUIT

The clock signal from the PLL needs to be phase shifted in steps of 45° to achieve improved resolution in fractional division. So we need an 8-phase shift circuit which is accurate and frequency independent. In Zid, Scandurra, Tourki, and Pistritto (2011), the authors propose such a phase shift circuit by using a positive edge triggered and a negative edge triggered flip-flop in parallel. The basic principle of this method is that a 180° phase shift (half time period) in a signal of frequency ‘2f’ translates to a 90° phase shift (quarter time period) in the signal of frequency ‘f’ (the 180° phase shift was achieved in Zid et al. (2011) using negative edge triggering).

Extending this idea, a 45° phase shift in a signal of frequency ‘f’ can be achieved by producing a 90° phase shift in ‘2f’, which in turn can be achieved by producing a 180° phase shift in ‘4f’. To generate 45° phase shifted signals of frequency ‘f’ from a signal of frequency ‘4f’, we need a very high frequency PLL (to phase shift a 2 GHz clock in steps of 45°, we need a PLL generating 8 GHz, which is inefficient in terms of power). Instead, we can use a PLL with quadrature outputs which gives 90° phase shifted signals of ‘2f’ (Fig. 4.4). Quadrature phases of a signal can be obtained by using special PLLs with a quadrature VCO. Such PLLs with quadrature VCO are available in recent literature (C.-T. Lu, Hsieh, and Lu (2010), yeop Lee et al. (2013), Titus and Kenney (2012)). They are also available in the industry as IP cores from vendors like Terminus Circuits and Analogies (IP datasheets (2013)).
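The phase bookkeeping behind this scheme can be sketched as follows. The assumption, stated plainly, is an ideal div-by-2 stage: a phase offset p on its ‘2f’ input appears as p/2 on its ‘f’ output, and the complementary output adds 180°. The function name is hypothetical.

```python
def divided_phases(phase_2f_deg):
    """Phases (in degrees at frequency f) of the two outputs of an ideal div-by-2
    stage clocked by a 2f signal with the given phase offset."""
    p = phase_2f_deg / 2.0
    return (p % 360.0, (p + 180.0) % 360.0)


quadrature_inputs = (0, 90, 180, 270)            # quadrature VCO outputs at 2f
for q in quadrature_inputs:
    print(f"2f phase {q:3d} deg -> f phases {divided_phases(q)}")

all_phases = sorted(ph for q in quadrature_inputs for ph in divided_phases(q))
print(all_phases)
```

Four quadrature inputs at ‘2f’ are therefore enough to obtain all eight 45° steps at ‘f’ without a PLL running at ‘4f’.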

Figure 4.4: Phase shift block: (a) Phase shift circuit, (b) Phase shifted waveform (the 0°, 90°, 180° and 270° clocks from the quadrature VCO (PLL) at 2f drive four Div-by-2 stages, whose outputs at f are the 0°, 45°, 90°, 135°, 180°, 225°, 270° and 315° phases)

4.3.3 MODULUS CONTROL CIRCUIT

The 4-bit control word, Mode, determines the division modulus by generating 0, 1, 2, 3, ..., 15 pulses depending on the settings of the control bits Mode1, Mode2, Mode3 and Mode4. The modulus control circuitry generates a ‘Next’ signal as shown in Fig. 4.5. The number of pulses in the ‘Next’ signal is

$$\sum_{i=1}^{4} \left\{ \mathrm{MODE}_i \times \frac{16}{\text{Max. divider ratio applied to the NAND}_i} \right\} \qquad (4.1)$$

For example, when Mode is 1100, where Mode4 and Mode3 are high while Mode1 and Mode2 are low, 12 pulses will be generated at the output ‘Next’. The number of pulses depends on the value of Mode: the larger the value of Mode, the more pulses are generated. The input Fin (which is inverted before being fed to the modulus control circuit) is the phase shifted input clock signal, and /2, /4, /8 and /16 correspond to the outputs of the four divide-by-2 flip-flops connected in series.
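Equation (4.1) can be exercised with a few lines of code. The maximum divider ratios used below (16, 8, 4 and 2 for NAND1 to NAND4) are an assumption inferred from the Mode = 1100 example and from the pulse counts of 0 to 15; they are not quoted from the circuit schematic, and the names are hypothetical.

```python
# Assumed maximum divider ratio applied to each NAND gate (inferred, not quoted).
MAX_DIVIDER_RATIO = {1: 16, 2: 8, 3: 4, 4: 2}


def next_pulses(mode4, mode3, mode2, mode1):
    """Number of 'Next' pulses per divide-by-16 cycle according to Eq. (4.1)."""
    mode = {1: mode1, 2: mode2, 3: mode3, 4: mode4}
    return sum(mode[i] * 16 // MAX_DIVIDER_RATIO[i] for i in mode)


print(next_pulses(1, 1, 0, 0))   # Mode = 1100
print(next_pulses(1, 1, 1, 1))   # Mode = 1111
print(next_pulses(0, 1, 1, 1))   # Mode = 0111
```

With these ratios each Mode bit contributes its binary weight, so the pulse count equals the value of the 4-bit Mode word.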

Figure 4.5: Modulus control circuit (gates NAND1 to NAND4 combine Fin, the /2, /4, /8 and /16 outputs and the control bits Mode1 to Mode4 to generate ‘Next’)

4.3.4 PHASE CONTROL CIRCUIT

The phase control circuitry is used to convert the signal ‘Next’ generated by the modulus control circuitry to the 3-bit control signal which feeds the phase select circuitry. The 3-bit control signal is generated with the help of three divide-by-2 flip-flops as shown in Fig. 4.6. Control1, Control2 and Control3 are the ‘Next’ signal divided by 2, 4 and 8, respectively. Their values may be inverted depending on forward or backward propagation.

Figure 4.6: Phase control circuit (three Div-by-2 flip-flops and three 2:1 MUXes, selected by Backward/Forward, generate Control1, Control2 and Control3 from ‘Next’)

4.3.5 PHASE SELECT CIRCUIT

The purpose of the phase select circuitry is to switch Fin from One to Two, Two to Three, Three to Four, Four to Five, Five to Six, Six to Seven, Seven to Eight and then back from Eight to One, as the Control1, Control2 and Control3 signals change. This is done by seven 2:1 multiplexers as shown in Fig. 4.7. The phase shift to be selected can be determined from Table 4.1. This FFD is designed for both forward and backward propagation. With each switching of the input control word, with Control3 as MSB and Control1 as LSB (from 000 -> 001 -> 010 ... 111 -> 000), an extra 0.125 T is subtracted from the output period of Fin in backward propagation (added in forward propagation). Table 4.1 summarizes the forward and backward propagation in terms of the control signal. For forward division ratios, the control signals are inverted before feeding them to the phase select circuit.

Figure 4.7: Phase select circuit (MUX1 to MUX7, selected by Control1, Control2 and Control3, connect one of the phase shifted signals One (0°), Two (45°), Three (90°), Four (135°), Five (180°), Six (225°), Seven (270°) or Eight (315°) to Fin)

Table 4.1: Forward and backward sequences of phase select circuit

Control3   Control2   Control1   Fin (forward)   Fin (backward)
1          1          1          Eight           Seven
1          1          0          One             Six
1          0          1          Two             Five
1          0          0          Three           Four
0          1          1          Four            Three
0          1          0          Five            Two
0          0          1          Six             One
0          0          0          Seven           Eight

4.4 CIRCUIT OPERATION

The modes decide the division modulus, and thus the control signal will change in such a way that the phase select block will connect Fin to the signal that is 45° phase shifted with respect to the present signal, e.g., from 0° to 45° or 45° to 90° and so on.

4.4.1 BACKWARD PROPAGATION

For instance, for a division ratio of 14.125, Mode is 1111. Hence, fifteen pulses will be generated at ‘Next’ (Table 4.2). If Fin is initially connected to One, then after the control signal changes, a connection will be made to Eight, then to Seven, Six, Five and so on. This happens 15 times in one cycle of the divide-by-14.125 operation. In this way, all the division moduli from 16 down to 14.125 (for N = 16) can be achieved (backward propagation).

4.4.2 FORWARD PROPAGATION

Essentially, for forward propagation, all the control signals are inverted. In order to achieve a divide-by-16.875 operation, Mode must be set to 0111. Thus, seven pulses will be generated at ‘Next’. When Fin is initially connected to One, then after the control signal changes, a connection will be made to Two, Three and so on. As there are seven pulses, the control signal will change seven times in one divide-by-16.875 operation. The forward propagation and backward propagation of the division ratio are implemented by adjusting the sequence of the phase shifted signals in the phase select circuit. Hence, the range of the division ratio of the frequency divider is doubled: it extends from (N - 1.875 = 14.125) to (N + 1.875 = 17.875) for N = 16.
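The order in which Fin steps through the eight phase inputs in the two modes can be sketched as a simple walk around a ring. This follows only the textual description of backward and forward propagation given here (one step per ‘Next’ pulse, starting from One); it does not model the multiplexer tree or the control word encoding, and the names are hypothetical.

```python
PHASE_INPUTS = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]


def phase_sequence(pulses, backward, start="One"):
    """Phase inputs connected to Fin after each 'Next' pulse."""
    index = PHASE_INPUTS.index(start)
    step = -1 if backward else 1       # backward: One -> Eight -> Seven ...
    visited = []
    for _ in range(pulses):
        index = (index + step) % len(PHASE_INPUTS)
        visited.append(PHASE_INPUTS[index])
    return visited


print(phase_sequence(15, backward=True))    # divide-by-14.125, Mode = 1111
print(phase_sequence(7, backward=False))    # divide-by-16.875, Mode = 0111
```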

Table 4.2: Division ratio for mode sequences

Mode4   Mode3   Mode2   Mode1   No. of pulses in ‘Next’ signal   Steps of division ratio
0       0       0       0       0                                0
0       0       0       1       1                                0.125
0       0       1       0       2                                0.25
0       0       1       1       3                                0.375
0       1       0       0       4                                0.5
0       1       0       1       5                                0.625
0       1       1       0       6                                0.75
0       1       1       1       7                                0.875
1       0       0       0       8                                1
1       0       0       1       9                                1.125
1       0       1       0       10                               1.25
1       0       1       1       11                               1.375
1       1       0       0       12                               1.5
1       1       0       1       13                               1.625
1       1       1       0       14                               1.75
1       1       1       1       15                               1.875

Table 4.2 shows the complete list of division ratio combinations. Note that the number of times the Fin phase is switched is decided by the control word. If the phase select block is considered as a black box, then the number of switchings of the Fin phase is equal to the number of pulses at ‘Next’. For example, if the number of pulses of the ‘Next’ signal is 13, then the step of the division ratio is the product of the number of phase shifts (13) and the step size (0.125), which is 1.625. Therefore, the division ratio will be 14.375 (16 - 1.625) or 17.625 (16 + 1.625), depending upon whether the circuit is operating in backward or forward propagation.
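The arithmetic behind Table 4.2 can be written out as a short Python sketch. The names `division_ratio`, `STEP_SIZE` and `N` are illustrative assumptions; the Mode word is taken as a four-character bit string ordered Mode4 Mode3 Mode2 Mode1, as in the table.

```python
STEP_SIZE = 0.125   # resolution of the FFD (1/8)
N = 16              # integer division ratio of the divide-by-16 block


def division_ratio(mode, forward):
    """Division ratio for a 4-bit Mode word such as '1101'.

    The number of pulses on 'Next' equals the binary value of the Mode
    word; each pulse changes the ratio by one step of 0.125.
    """
    pulses = int(mode, 2)
    step = pulses * STEP_SIZE
    return N + step if forward else N - step


backward_ratio = division_ratio("1101", forward=False)   # 13 pulses
forward_ratio = division_ratio("1101", forward=True)
```

For Mode = 1101 (13 pulses) this gives 14.375 in backward propagation and 17.625 in forward propagation, the two values quoted in the example.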

4.5 POST LAYOUT SIMULATION

The phase shift, phase control and divide-by-16 blocks of the FFD need high-speed FFs to perform fractional division on a GHz clock. They also need to consume low power to enable a power-efficient FFD. We use the modified extended true single-phase clock flip-flop (METSPC-FF) and the self-blocking FF (SBFF) of Chapter 3 to design our fractional frequency divider circuit. According to our pre-layout simulation studies, the METSPC-FF can be used as a divider for input frequencies up to 6 GHz and the SBFF up to 1.5 GHz. In our FFD design, we use the METSPC-FF for the phase shift block since its input clock will be in the GHz range (twice the frequency of the signal to be fractionally divided). The phase control and divide-by-16 blocks use the two flip-flops in a hybrid fashion, i.e., METSPC-FF followed by SBFFs. We use the METSPC-FF in the first stage since it has a lower propagation delay than the SBFF at higher frequencies. The SBFF is used next because it has a lower power-delay product (PDP) than the METSPC-FF for frequencies below 1.5 GHz. Hence these two flip-flops are used together to minimize the overall power consumption of the FFD. The divide-by-16 block of the FFD is shown in Fig. 4.8.

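[Figure 4.8: block diagram of four cascaded D flip-flops driven by Fin, with outputs /2, /4, /8 and /16; the blocks are labelled METSPC-FF and SBFF-FF, with the SBFF-FF blocks at the /8 and /16 outputs.]

Figure 4.8: Hybrid clock divider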
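The logical behaviour of a four-stage divide-by-16 chain can be sketched as follows. This is a purely behavioral Python model under stated assumptions: each stage is treated as an ideal rising-edge D flip-flop with its inverted output fed back to D, delays are ignored, and no electrical difference between METSPC-FF and SBFF stages is represented. The function names are hypothetical.

```python
def ripple_divide(clock, stages=4):
    """Behavioral model of cascaded divide-by-2 stages.

    `clock` is a list of 0/1 samples. Each stage toggles on every rising
    edge of its clock input, and its Q output clocks the next stage.
    Returns the sampled output of the last stage.
    """
    q = [0] * stages                          # flip-flop states
    prev = [clock[0]] + [0] * (stages - 1)    # last clock input seen per stage
    out = []
    for sample in clock:
        stage_in = sample
        for i in range(stages):
            rising = prev[i] == 0 and stage_in == 1
            prev[i] = stage_in
            if rising:
                q[i] ^= 1                     # D = not Q, so the stage toggles
            stage_in = q[i]                   # Q drives the next stage
        out.append(q[-1])
    return out


def count_rising(signal):
    """Count 0 -> 1 transitions in a sampled signal."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a == 0 and b == 1)


fin = [0, 1] * 64              # 64 input clock cycles
fout = ripple_divide(fin)      # four stages: /2, /4, /8, /16
```

Counting rising edges shows 64 input cycles producing 4 output cycles, i.e. a division by 16.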

The accuracy of an FFD of this resolution cannot be ascertained without post-layout simulation after parasitic extraction. Hence we performed post-layout simulation of the METSPC-FF and the SBFF and verified the proper functionality of the flip-flops after RC extraction. Post-layout simulation results show that the METSPC-FF functions as a FF (without setup time and hold time violations) for input frequencies up to 5.6 GHz and the SBFF up to 1.35 GHz. The layouts are shown in Section 3.7.


4.5.1 PHASE SHIFT ERROR

Since we are dealing with GHz frequencies, no phase shift circuit can be free of error (a 45° phase shift of a 1.25 GHz clock amounts to shifting it by only 100 ps). The post-layout simulation of the phase shift circuit using METSPC-FFs shows phase errors in the range of 0–4.3 ps for different phase angles. We studied the error % in the phase-shifted signal for different frequencies. The phase error % decreases with frequency. To account for phase error in the quadrature signals coming from the VCO, we introduced a 2.5° error at the input of the phase shift circuit. The error % in the phase-shifted signals for a 2 GHz clock for both cases (without any error and with a 2.5° error in the quadrature VCO output) is plotted in Fig. 4.9.
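The 100 ps figure quoted above follows directly from the clock period. A minimal Python sketch of the conversion from phase angle to time shift is given here; the function name `phase_shift_delay_ps` is an assumption for illustration.

```python
def phase_shift_delay_ps(freq_ghz, phase_deg):
    """Time shift in picoseconds for a given phase shift of a clock."""
    period_ps = 1000.0 / freq_ghz          # 1 GHz corresponds to 1000 ps
    return period_ps * phase_deg / 360.0


delay_45 = phase_shift_delay_ps(1.25, 45.0)    # case quoted in the text
delay_2ghz = phase_shift_delay_ps(2.0, 45.0)   # 45 degrees of a 2 GHz clock
```

For a 1.25 GHz clock the period is 800 ps, so a 45° shift is one eighth of that, or 100 ps; the same 45° at 2 GHz is 62.5 ps, which gives a sense of the timing margins involved.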

[Figure 4.9: plot of percentage error in phase shift (y-axis, 0 to 1.6) versus phase shift in degrees (x-axis, 0 to 315 in steps of 45), with two series: 0 deg error at quadrature VCO output and 2.5 deg error at quadrature VCO output.]

Figure 4.9: Error % in phase shifted signal for various phase shift degrees

The average error % in the phase-shifted signals is 0.53% without any input error and 1.12% with a 2.5° phase error at the input of the phase shift circuit. This error in the phase-shifted signal propagates to the phase select circuit and then to the divide-by-16 circuit. It consequently translates into an error in the FFD ratio, which is discussed in Sec. 4.5.2. The phase shift error does not propagate further, since the output of the divide-by-16 block is used to generate the NEXT signal using the modulus control. The NEXT signal depends only on the number of pulses and not on the width of the pulses. Hence the error is not fed back to the phase control and does not accumulate.

If we want an FFD with a resolution finer than 1/8 (say 1/16), we need more hardware in all the blocks of the FFD (e.g., eight more FFs in the phase shift circuit, eight more MUXes in the phase select circuit, one more FF in the divide-by-16 block). Moreover, we need to phase shift the input clock every 22.5° to achieve fractional division with a resolution of 1/16. The inevitable error in the 22.5° phase-shifted signal (which is larger than the error in the 45° phase-shifted signal) will result in an increased error in the division ratio of the overall FFD circuit. Hence the increased hardware and the difficulty in obtaining accurate phase-shifted signals limit the resolution of the FFD to 1/8. Figures 4.10 and 4.11 show post-layout simulations of a 1 GHz clock divided by 14.875 and a 2 GHz clock divided by 15.625, respectively.

[Figure 4.10: waveforms of the 1 GHz input clock and the fractional frequency divided clock for a ratio of 14.875.]

Figure 4.10: 1 GHz clock divided by 14.875

[Figure 4.11: waveforms of the 2 GHz input clock and the fractional frequency divided clock for a ratio of 15.625.]

Figure 4.11: 2 GHz clock divided by 15.625

4.5.2 EFFECT OF PHASE ERROR ON THE DIVISION RATIO

As already stated, there is an average error of 0.53% in the phase-shifted signals even without any phase error at the input of the phase shift circuit. This small error propagates through the phase select and divide-by-16 blocks and appears as an error in the fractionally divided clock signal. We have studied this error in the division ratio for various frequencies. Figure 4.12 shows the error % versus frequency for a division ratio of 15.625. For frequencies > 1.25 GHz, the error % in the division ratio increases with the input frequency, and beyond 2 GHz the error increases so much that the division ratio goes to the next step. For example, for a division ratio of 15.625 on a 2.25 GHz clock, a 0.908% error in the division ratio leads to the next division ratio of 15.75. Hence the maximum operating frequency of our FFD is 2 GHz with this improved resolution.
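The 2.25 GHz example can be checked numerically. The sketch below applies a percentage error to the target ratio and then snaps the result to the nearest multiple of the 0.125 step. Snapping to the nearest step is an assumption introduced here only to illustrate how an error can move the ratio onto the neighbouring step; it is not a description of how the error was evaluated in the simulations, and the name `realized_ratio` is hypothetical.

```python
def realized_ratio(target, error_percent, step=0.125):
    """Nearest step-aligned ratio after applying a percentage error."""
    actual = target * (1.0 + error_percent / 100.0)
    return round(actual / step) * step


at_2_25_ghz = realized_ratio(15.625, 0.908)   # error quoted for 2.25 GHz
at_2_ghz = realized_ratio(15.625, 0.11)       # average error quoted for the FFD
```

A 0.908% error turns 15.625 into roughly 15.767, which lies closest to 15.75, whereas a 0.11% error leaves the ratio at 15.625.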

[Figure 4.12: plot of % error in division ratio (y-axis, 0.00 to 1.00) versus frequency in GHz (x-axis, 0.5 to 2.25 in steps of 0.25), with two series: 0 deg error at quadrature VCO output and 2.5 deg error at quadrature VCO output.]

Figure 4.12: Error % in division ratio of FFD for various frequencies

Finally, we plot the power dissipation of the FFD for various frequencies in Fig. 4.13(a). To verify the suitability of our FFD for low-voltage operation, we simulated it for a 2 GHz clock with the supply voltage decreasing from 1.1 V in steps of 0.05 V. We were able to achieve fractional division down to a VDD of 0.75 V. The power consumption reduces drastically to 246.56 µW at a supply voltage of 0.75 V, as shown in Fig. 4.13(b).
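As a rough sanity check on the low-voltage result, the reported power can be compared with first-order quadratic scaling of dynamic power with supply voltage at a fixed frequency. This scaling model is an assumption introduced here, not an analysis from the simulations; the reference point of 754 µW at 1.1 V and 2 GHz is the value reported for the FFD in this chapter, and the function name is hypothetical.

```python
def scaled_dynamic_power(p_ref_uw, v_ref, v_new):
    """First-order estimate: dynamic power scales with VDD squared."""
    return p_ref_uw * (v_new / v_ref) ** 2


P_REF_UW = 754.0       # reported power at 1.1 V, 2 GHz
P_REPORTED_UW = 246.56 # reported power at 0.75 V, 2 GHz

estimate_uw = scaled_dynamic_power(P_REF_UW, v_ref=1.1, v_new=0.75)
saving = 1.0 - P_REPORTED_UW / P_REF_UW
```

The quadratic estimate is about 350.5 µW, while the reported 246.56 µW corresponds to a saving of about 67% relative to 1.1 V. The simulated value lying below the simple estimate is plausible, since components other than pure switching power also fall with supply voltage.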


[Figure 4.13(a): plot of power dissipation in µW (y-axis, 600 to 780) versus frequency in GHz (x-axis, 0.5 to 2). Panel title: (a) Across various frequencies (VDD = 1.1 V).]

[Figure 4.13(b): plot of power dissipation in µW (y-axis, 0 to 800) versus supply voltage in V (x-axis, 0.75 to 1.1). Panel title: (b) Across various supply voltages (clock at 2 GHz).]

Figure 4.13: Power consumption of FFD

4.6 CONCLUSION

In this chapter, we present an improved FFD which has a resolution of 1/8 and consumes less power. Post-layout simulation results after parasitic RC extraction in the 90-nm technology node show that our FFD is able to fractionally divide signals of up to 2 GHz with an average error of 0.11% in the division ratio, even with a 2.5° phase error at the input. Our FFD consumes 754 µW when fractionally dividing a 2 GHz signal with a resolution of 1/8. Since simulation results for an FFD of this resolution and division ratio in 90-nm technology are not reported in the literature, we are not able to make a direct comparison with other works (the technology, resolution and division ratio differ between works). However, in Table 4.3 we summarize our results along with two other similar works to ascertain the merit of our FFD.

Table 4.3: Comparison of our FFD with similar works

W1 represents the work of (Chang & Cheng, 2011) and W2 represents the work of (Boon et al., 2005).

Ref.  Tech (µm)  VDD (V)  Max. freq (GHz)  Div. ratio     Step   Power
W1    0.35       1.5      2.70             240.5-248      0.5    5.13 mW to divide 2.7 GHz clock
W2    0.25       2.0      1.20             6.25-9.75      0.25   3 mW to divide 1.2 GHz clock
Ours  0.09       1.1      2.00             14.125-17.875  0.125  0.754 mW to divide 2 GHz clock
