
A 20/30 GBIT/S CMOS BACKPLANE DRIVER WITH

DIGITAL PRE-EMPHASIS

by

Paul Westergaard

A project report submitted in conformity with the requirements for the degree of Master of Engineering

Graduate Department of Electrical and Computer Engineering
University of Toronto

Copyright by Paul Westergaard 2005




Paul Westergaard

Master of Engineering, 2005

Graduate Department of Electrical and Computer Engineering

University of Toronto

Abstract

A high-speed input comparator and output driver with fully adjustable pre-emphasis for applications in serial inter-chip communications over backplanes at 20 Gb/s is presented. The driver achieves data rates of up to 30 Gb/s when the pre-emphasis is disabled. The circuit was implemented in 130-nm CMOS and consumes 150 mW from a 1.5-V supply in 20 Gb/s operation. It has over 30 dB dynamic range with a sensitivity of 20 mVpp and a differential output swing of 700 mVpp at 20 Gb/s. The output driver features a novel digital pre-emphasis circuit with independent pulse height and pulse width control without the requirement of an external clock input. Other independent features are 30%-70% eye-crossing control and adjustable output swing between 170 mVpp and 350 mVpp per side. The results of this project were published and presented at the IEEE Custom Integrated Circuits Conference in 2004 [1].


Acknowledgements

The author would graciously like to thank his supervisor and mentor Profes-

sor Sorin Voinigescu for his inspiration, technical and personal insight and

unwavering support throughout this project. The author would also like to

specially thank Timothy O. Dickson, a fellow graduate student, for his on-

going technical input, altruism and camaraderie throughout this thesis.

Experimental results would not have been possible without Timothy’s contri-

butions. Furthermore, the author acknowledges that this thesis would not

have been possible without the valuable technical and financial contribution

of the Ottawa, Ontario office of ST Microelectronics. Special contributing

members of the ST Microelectronics technical staff who are owed special grat-

itude include Rudy Beerkens, Boris Prokes, Imran Khalid and Steve McDow-

ell.


Contents

List of Tables
List of Figures

1 Introduction
    1.1 Motivation
    1.2 Objectives and scope
    1.3 Organization

2 Background
    2.1 Review of pre-emphasis
    2.2 Inductive peaking
    2.3 Hazardous relative placement of load inductors
    2.4 Second order effects in deep submicron MOSFETs

3 Circuit Design
    3.1 Driver overview
    3.2 Biasing a MOS CML gate for optimal speed
    3.3 Biasing a differential CMOS stage for low-noise
    3.4 Input matching network and low-noise comparator
    3.5 Eye-crossing control
        3.5.1 Transistor sizing and biasing in the eye-crossing control circuit
    3.6 Output driver and parallel current summation
    3.7 Digital pre-emphasis

4 Simulation Results
    4.1 S-Parameter simulations
    4.2 Time-domain simulations

5 Physical Implementation
    5.1 Inductor design and model extraction
        5.1.1 Model extraction using ASITIC
        5.1.2 Inductor Realization and isolation
    5.2 Layout and fabrication

6 Experimental Results
    6.1 Test environment
    6.2 S-Parameters and noise figure
    6.3 Time-domain measurements
    6.4 Performance summary

7 Conclusions
    7.1 Summary
    7.2 Future work

References

Appendix A: Spice file for inductor parameter extraction


List of Tables

Table 1: Circuit features categorized by design requirement

Table 2: Performance metrics for shunt peaking [5]

Table 3: Physical dimensions of each inductor

Table 4: Simulated inductor parasitic values

Table 5: Simulated and measured circuit parameters


List of Figures

Fig. 1: Transmitter pre-emphasis and backplane trace transfer characteristic
Fig. 2: A 4-tap FIR filter for transmitter pre-emphasis
Fig. 3: Time-domain pre-emphasis overshoot and undershoot
Fig. 4: Simple common source amplifier with resistive loading
Fig. 5: Common source amplifier with shunt peaking
Fig. 6: Potentially hazardous placement of load inductor
Fig. 7: Proper placement of inductive load
Fig. 8: Transconductance of a 130-nm nMOSFET vs. gate voltage [8]
Fig. 9: MOS differential pair
Fig. 10: Driver block diagram
Fig. 11: Constant peak fT current density over four technology nodes
Fig. 12: fT and NFMIN vs. current density for a 130-nm n-MOSFET with 2-µm unit finger width biased at VDS = 1 V
Fig. 13: Input bias and matching network and comparator
Fig. 14: Eye-crossing control circuit and intermediate signals
Fig. 15: Summation of output currents across output resistor
Fig. 16: Output driver transistor-level schematic
Fig. 17: Digital pre-emphasis block diagram
Fig. 18: Pre-emphasis waveforms and transfer function
Fig. 19: NMOS digital differentiator schematic
Fig. 20: Simulated S22 and S11 of complete driver circuit to 60 GHz
Fig. 21: Simulated single-ended S21 of entire driver circuit
Fig. 22: Simulated single-ended S21 of driver with output amplitude reduced using amplitude control
Fig. 23: Simulated S21 with output peaking enabled
Fig. 24: 20 Gb/s eye diagrams, 2^7-1 PRBS: single-ended input 20 mVpp; differential output 99 mVpp per side
Fig. 25: 25 Gb/s eye diagrams, 2^7-1 PRBS: single-ended input 60 mVpp; differential output 180 mVpp per side
Fig. 26: 30 Gb/s eye diagrams, 2^7-1 PRBS: single-ended input 140 mVpp; differential output 300 mVpp per side
Fig. 27: Simulated eye-crossing control at 20 Gb/s: (a) 50%, (b) 66%, and (c) 33%
Fig. 28: Output amplitude control at 20 Gb/s
Fig. 29: 20 Gb/s output eye diagram with 316 mVpp swing per side and +/- 16% pre-emphasis
Fig. 30: 20 Gb/s output eye diagram with 300 mVpp swing per side and +33%/-25% pre-emphasis
Fig. 31: Inductor single-frequency Π-model
Fig. 32: Inductor lumped element broadband model
Fig. 33: On-chip 900 pH inductor with 44 µm diameter
Fig. 34: Full chip photograph
Fig. 35: Magnified photograph of main path layout
Fig. 36: Magnified photograph of the parallel path layout
Fig. 37: S-Parameter test setup
Fig. 38: Eye-diagram test setup
Fig. 39: Measured single-ended S21
Fig. 40: Measured single-ended S22 and S11
Fig. 41: Measured and simulated driver noise figure
Fig. 43: 25 Gb/s output eye diagram with 50% eye crossing
Fig. 44: 30 Gb/s output eye diagram with 260 mVpp per side for a single-ended 200 mVpp, 2^31-1 input PRBS
Fig. 45: Eye-crossing control at 20 Gb/s: (a) 50%, (b) 70%, and (c) 30%
Fig. 46: Output amplitude control at 20 Gb/s; output of (a) 190 mVpp and (b) 350 mVpp per side
Fig. 47: 20 Gb/s output eye diagram with 300 mVpp swing per side and pre-emphasis


1 Introduction

1.1 Motivation

Serial inter-chip communication is gaining widespread acceptance over parallel architectures because congested printed circuit board (PCB) routing and pad-limited silicon dice are not cost efficient in commodity designs. To minimize the overall circuit area required for a serial transmitter/receiver pair, equalization can be performed at the transmitter instead of at the receiver, in which case it is known as pre-equalization. At the transmitter, pre-equalizers alter the waveform to account for the low-pass response of the interconnect. Historically, pre-emphasis has been achieved using either clocked flip-flops and step-delayed current summation or analog differentiators. The latter only permit amplitude control of the pulse and provide no control of the pulse width. The former implementation, while having the necessary control mechanisms and efficacy, places severe strain on device technology as the required flip-flop typically operates at twice the frequency of the driver itself.

Even though 40 Gb/s CMOS amplifiers [2], demultiplexers and multiplexers

[3] have been recently reported, demonstrating the high-speed potential of

standard CMOS technology, they suffer from limited dynamic range due to

poor sensitivity and modest output swings of about 100 mVpp per side.

This work presents the first published CMOS driver with duty-cycle, amplitude, and pre-emphasis control that operates at data rates exceeding 20 Gb/s. The driver achieves over 30 dB of dynamic range. It includes a novel passive-element-free differentiator that enables control of both amplitude and width of the pre-emphasis pulse.

1.2 Objectives and scope

The prime objective of this thesis was a fabricated high-speed, pre-emphasis

enabled, output driver for applications in serial inter-chip communications

over backplanes. The scope of the thesis was the theoretical derivation, com-

puter-aided design, simulation, fabrication and experimental characterization

of the circuit. The required features of the design in order to achieve its tar-

geted application were as follows:

Table 1: Circuit features categorized by design requirement

Design Requirement             Corresponding Circuit Feature

High Bandwidth                 • Inductive peaking at each signal path stage
                               • High gain (> 20 dB) at 20 Gb/s operation
                               • Positive gain at 30 Gb/s operation

Low Power                      • 1.5 V power supply
                               • 150 mW dissipation at 20 Gb/s operation

High Signal Integrity          • Input and output matching to 50 Ω up to 50 GHz
and Sensitivity                • Differential signalling
                               • Symmetrical layout
                               • 20 mVpp input sensitivity

Signal Shaping and Control     • 30% to 70% pulse-width control
                               • 200-700 mVpp differential output swing control
                               • Pre-emphasis spike width and height control

Novel Implementation           • Full CMOS implementation
                               • Clock-free pre-emphasis circuit with non-traditional circuit design

1.3 Organization

The thesis is organized as follows. Chapter 2 discusses the background of pre-

emphasis, inductive peaking, and hazardous relative placement of load induc-

tors. Chapter 3 details the concept, design and biasing of the individual circuit

elements in the driver. Insights into transistor sizing for optimal speed based

on a current-density centric approach as well as minimum noise are provided.

In Chapter 4, the pre-layout simulation results of the driver are presented.

Chapter 5 summarizes the physical implementation of the entire circuit.

Chapter 6 offers an overview of the experimental results. Conclusions are

given in Chapter 7.

2 Background

2.1 Review of pre-emphasis

In high-speed circuit applications in which high-frequency signals are sent over backplane channels, there are two types of equalization: transmitter pre-emphasis and receiver equalization [4]. Both are intended either to emphasize the high-frequency components or to de-emphasize the low-frequency components of the transmitted signal, in order to compensate for the low-pass transfer characteristics of the channel. The transfer function of both types of equalizer is ideally high-pass, though in practice it is band-pass. The reasons for the latter are threefold: (i) practical semiconductor devices cannot achieve infinite bandwidth; (ii) amplification of high-frequency noise must be avoided; and (iii) regulated electromagnetic interference (EMI) specifications must be met.

Pre-emphasis is achieved at the transmitter side by increasing the high-fre-

quency components. Fig. 1 shows the mechanism in which ideal transmitter

pre-emphasis compensates for the low-pass transfer characteristics of the

backplane trace.

Fig. 1: Transmitter pre-emphasis and backplane trace transfer characteristic

A common practical pre-emphasis circuit implementation is a Finite

Impulse Response (FIR) filter. Fig. 2 shows the block diagram of a 4-tap FIR

filter with a single Data input, delay elements D and tap coefficients C1, C2,

C3, C4. The tap coefficients adjust the gain at each multiplier independently to

produce the output voltage across the load resistors of value R. The output is a

frequency shaped version of the Data input in the form of amplified high-fre-

quency components.

[Fig. 1 plot labels: backplane trace transfer characteristics; transmitter pre-emphasis response; horizontal axis: frequency (Hz)]

Fig. 2: A 4-tap FIR filter for transmitter pre-emphasis

In the time domain, the FIR filter performs a differentiation function. As

shown with the dotted lines in Fig. 3, the waveform of transmit pre-emphasis

appears as overshoot and undershoot in the time-domain.
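The overshoot and undershoot behaviour can be illustrated with a small numerical sketch. It is not the circuit of Fig. 2: it assumes one-bit delays between taps, NRZ data mapped to ±1 levels, and hypothetical tap coefficients (C1 = 1.0, C2 = -0.25, C3 = C4 = 0) chosen only for illustration.

```python
def fir_pre_emphasis(bits, taps=(1.0, -0.25, 0.0, 0.0)):
    """Apply a bit-spaced FIR filter to NRZ data mapped to +/-1 levels.

    The tap values are hypothetical; they are not taken from the thesis.
    """
    levels = [1.0 if b else -1.0 for b in bits]
    out = []
    for n in range(len(levels)):
        acc = 0.0
        for k, c in enumerate(taps):
            # Before the first bit, assume the line idled at the first level.
            acc += c * levels[max(n - k, 0)]
        out.append(acc)
    return out


# The first bit after each transition is emphasized (magnitude 1.25),
# while repeated bits settle to a smaller level (magnitude 0.75).
print(fir_pre_emphasis([0, 0, 0, 1, 1, 1, 0, 0]))
```

With a negative second tap, the filter subtracts a scaled copy of the previous bit, so only transitions are boosted. This is the discrete-time counterpart of the differentiation described above.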

[Fig. 2 diagram labels: Data input; delay elements D; tap coefficients C1, C2, C3, C4; load resistors R]

Fig. 3: Time-domain pre-emphasis overshoot and undershoot

Other circuits that perform similar differentiation functions are passive RC

differentiators and inductively-loaded differential amplifiers. Unfortunately,

there exist drawbacks in each of these three differentiator implementations.

For FIR filters, a clock of at least twice the frequency of the data is required to trigger the delay elements D in Fig. 2. The width of the pre-emphasis spike is inversely proportional to the frequency of this clock. Hence, for 20 Gb/s (10 GHz) signals, a minimum 20 GHz clock signal is needed, requiring a very well designed clock recovery circuit and 20 GHz flip-flops. The passive RC differentiator is a more viable solution in that it does not require the input clock or flip-flops. However, the width of the pre-emphasis spike in this case is not controllable, which results in a non-ideal pre-equalization that cannot fully compensate for the effects of the channel. Finally, inductively loaded differential amplifiers offer control over neither the width nor the height of the pre-emphasis pulse and, more importantly, can easily produce output ringing due to resonant effects. The pre-emphasis employed in the presented driver differs from all three methods outlined above in that: (i) no passive L or C elements are used; (ii) no clock is required; and (iii) the pulse width and height can be independently controlled.

[Fig. 3 plot labels: vertical axis: voltage; horizontal axis: time; annotation: pre-emphasis overshoot and undershoot]

2.2 Inductive peaking

The theory of inductive peaking, or broad-banding, is well documented [5][6]. A brief review and its implications for the driver design are presented here. Inductive series and shunt peaking are techniques that can be used to extend the 3-dB bandwidth of an amplifier without expending extra power. The following explanation focuses on shunt inductive peaking as it is applied in the driver design.

Fig. 4: Simple common source amplifier with resistive loading

Fig. 5: Common source amplifier with shunt peaking

Fig. 4 illustrates a common source amplifier with an ideal resistor and capacitive load. For simplicity, we assume that the small signal frequency response of the amplifier is determined by a single dominant pole, which is determined solely by the output load resistance RL and by the load capacitance C:

\[
\frac{V_{out}}{V_{in}}(\omega) = \frac{g_m R_L}{1 + j\omega C R_L} \tag{1}
\]

The introduction of an inductance L in series with the load resistance, as shown in Fig. 5, alters the frequency response of the amplifier. This technique, known as shunt peaking, increases the bandwidth of the amplifier by transforming the frequency response from that of a single pole to one with two poles and a zero:

\[
\frac{V_{out}}{V_{in}}(\omega) = \frac{g_m (R_L + j\omega L)}{1 + j\omega R_L C - \omega^2 L C} \tag{2}
\]

The poles may or may not be complex. The zero is determined solely by the L/RL time constant and is primarily responsible for the bandwidth improvement. In addition, the frequency response of this amplifier is characterized by the ratio of the L/RL and RLC time constants. This ratio is denoted by $m = L/(R_L^2 C)$. Isolating the inductance, the ratio is rewritten as $L = m R_L^2 C$.

It can be shown [5] that the degree of bandwidth extension depends on the value of m. Over the range considered here, the 3-dB bandwidth of the shunt-peaked amplifier increases as m increases. Table 2 lists the normalized 3-dB bandwidth for several values of m. The maximum bandwidth occurs for m = 0.71 and yields an 85% improvement in bandwidth. However, it is accompanied by a significant amount of gain peaking, which is undesirable for broadband amplifiers used in fibre optic or backplane applications. A maximally flat response is observed for m = 0.41 while still improving bandwidth by 72%. Finally, although a value of m = 0.32 does not result in the same bandwidth improvement as the other two non-zero values of m shown in the table, it exhibits the most linear phase response up to the 3-dB bandwidth [5]. This value of m, called the optimum group delay value, is desirable for optimizing pulse fidelity in broadband systems that transmit digital signals.

The optimum group delay value, which still results in a respectable 60 per-

cent increase in bandwidth, is best suited for the design of the broadband dig-

ital signal driver.
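The bandwidth-extension factors quoted from [5] can be cross-checked from equation (2). The sketch below is an illustration rather than the derivation in [5]: it normalizes frequency as x = ωRLC, sets the squared magnitude of (2) relative to its DC value equal to 1/2, and solves the resulting quadratic in x². The function name is hypothetical.

```python
import math


def bandwidth_extension(m: float) -> float:
    """Normalized 3-dB bandwidth (omega_3dB * RL * C) of a shunt-peaked stage.

    From |H|^2 = (1 + m^2 x^2) / ((1 - m x^2)^2 + x^2) = 1/2, with y = x^2:
        m^2 y^2 + (1 - 2m - 2m^2) y - 1 = 0
    """
    if m == 0.0:
        return 1.0  # single-pole case, no inductor
    b = 1.0 - 2.0 * m - 2.0 * m * m
    y = (-b + math.sqrt(b * b + 4.0 * m * m)) / (2.0 * m * m)
    return math.sqrt(y)


for m in (0.0, 0.32, 0.41, 0.71):
    print(f"m = {m:.2f}: normalized bandwidth = {bandwidth_extension(m):.2f}")
```

The computed factors land close to the values cited from [5]; small differences at m = 0.32 are within the rounding of the quoted 60% figure.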

Table 2: Performance metrics for shunt peaking [5]

Factor (m)    Normalized ω3dB    Response
0             1.00               No shunt peaking
0.32          1.60               Optimal group delay
0.41          1.72               Maximally flat
0.71          1.85               Maximum bandwidth

An implicit benefit of using inductive peaking is the enhanced freedom in the power-bandwidth trade-off. This improvement can be demonstrated by first examining the key equations for an amplifier without inductive loading (such as the one introduced in Fig. 4). The bandwidth of the amplifier is dominated by the output pole:

\[
BW = \frac{1}{2\pi R_L C} \tag{3}
\]

The value of the tail current Itail is determined by the desired output swing and the load resistance, especially in the design of switching inverters:

\[
I_{tail} = \frac{\Delta V_{swing}}{R_L} \tag{4}
\]

where ∆Vswing is the voltage swing on the output node of the inverter and RL is the load resistance of the inverter.

The advantage of shunt inductive peaking is made more obvious when the bandwidth and power of an amplifier are examined. The bandwidth of an amplifier is given by equation 3, while the power consumption is directly proportional to Itail for a given power supply. The goal is to increase the bandwidth and minimize the power consumption (and hence Itail) of an amplifier. This places contradictory requirements on RL: equation 3 requires a small value for RL and equation 4 requires a large one.

Inductive peaking allows the circuit designer to increase the value of RL to reduce overall power consumption while simultaneously increasing bandwidth with the introduction of a load peaking inductor. Since Table 2 shows that shunt peaking can increase the bandwidth of an amplifier by 60% while still maintaining a linear phase response, a possible trade-off is to increase the resistance RL by 30%, decrease the tail current Itail by a similar amount (1/1.3), and introduce a load inductor. With m = 0.32 from Table 2, the value for the inductor is:

\[
L = \frac{C R_L^2}{3.1} \tag{5}
\]

From equation 3, the increased resistance RL will decrease the bandwidth by a factor of 1/1.3, but the inductive peaking will increase the newly reduced bandwidth by 60%. Overall, the inductively peaked circuit will have both higher bandwidth and lower power than the original resistively-loaded circuit. Explicitly, the inductively peaked circuit will have bandwidth:

\[
BW_{peaked} = BW_{resistive} \times \frac{1.6}{1.3} = 1.23\,BW_{resistive} \tag{6}
\]

and the tail current of the inductively peaked circuit will be:

\[
I_{tail(peaked)} = \frac{I_{tail(resistive)}}{1.3} = 0.77\,I_{tail(resistive)} \tag{7}
\]

This results in a 23% gain in bandwidth with a simultaneous 23% decrease in power consumption.

There are a few compromises involved in this optimization of the power-bandwidth product via the introduction of an on-chip inductor. First, the added die area expense at each inverter stage can be relatively large, with the inductor usually occupying more area than the resistively-loaded inverter stage alone. Secondly, deterministic jitter can occur, with deleterious results, if an inductor is realized with a larger than simulated value. In this case, jitter is a result of undesired and, more importantly, uncontrollable peaking and signal distortion. In a circuit with multiple sequential gain stages, each with over-sized peaking inductors, the resultant signal distortion becomes catastrophic. Electro-magnetic field solvers that simulate the inductances of on-chip planar and stacked spiral inductors must be verified experimentally with fabricated test-structures before the final circuit is fabricated.

This pre-verification methodology was employed in the inductor design.
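The sensitivity to an over-sized inductor can be illustrated numerically from equation (2). The sketch below is a simplified illustration: it assumes a single ideal stage, uses the normalized frequency x = ωRLC, and scans a uniform grid for the largest gain above the DC level. The 50% inductance error used in the example is a hypothetical value, not a measured one.

```python
import math


def normalized_gain(m: float, x: float) -> float:
    """|H| of the shunt-peaked stage relative to DC, with x = omega * RL * C."""
    return math.sqrt((1.0 + (m * x) ** 2) / ((1.0 - m * x * x) ** 2 + x * x))


def peak_gain_db(m: float, x_max: float = 5.0, steps: int = 5000) -> float:
    """Largest gain above the DC level (in dB) found on a uniform grid."""
    peak = max(normalized_gain(m, x_max * i / steps) for i in range(steps + 1))
    return 20.0 * math.log10(peak)


designed_m = 0.32                 # optimum group delay value
realized_m = designed_m * 1.5     # hypothetical inductor 50% larger than simulated
print(f"designed m = {designed_m:.2f}: peaking = {peak_gain_db(designed_m):.2f} dB")
print(f"realized m = {realized_m:.2f}: peaking = {peak_gain_db(realized_m):.2f} dB")
```

Because m scales linearly with L for fixed RL and C, an inductor that is larger than intended moves the stage from a non-peaking response toward the peaked region, and the effect compounds when several such stages are cascaded.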

2.3 Hazardous relative placement of load inductors

The relative placement of the load inductors to the load resistors in the design

of each amplifying stage is of utmost importance. Fig. 6 shows a potentially

hazardous placement of a load inductor in an inductively loaded amplifier

stage. The root of the deleterious effect lies in transmission line theory. As the

driver operates in the high-frequency signalling spectrum, transmission line

theory is applicable.

Transmission lines spatially transform impedance [7]. The impedance of the

voltage supply in the AC case is zero, a short circuit. At high-frequencies, the

inductor length is comparable to that of a quarter-wavelength of the signal on

the inductor wire segment. Hence, the impedance looking into the inductor

may be spatially transformed from the short-circuit of the power supply into

an open circuit. The transformed open circuit will result in instability and/or

oscillatory behavior in the amplifier.


Fig. 6: Potentially hazardous placement of load inductor

Fig. 7: Proper placement of inductive load

It is important to note that, for the proper placement of the inductor in Fig. 7,

spatial impedance transformations still occur. However, the impedance look-

ing into the inductor from transistor M2 is always a finite, non-zero value

because the transformation acts on the finite and non-zero resistive value RL.
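The two cases can be compared with the standard input-impedance expression for a lossless transmission line. The sketch below is only a rough illustration: it treats the inductor as a uniform lossless line, and the characteristic impedance (75 Ω) and load resistance (50 Ω) are hypothetical values chosen for the example.

```python
import math


def input_impedance(z_load: complex, z0: float, electrical_length: float) -> complex:
    """Input impedance of a lossless line; electrical_length is beta*l in radians."""
    t = math.tan(electrical_length)
    return z0 * (z_load + 1j * z0 * t) / (z0 + 1j * z_load * t)


Z0 = 75.0        # hypothetical characteristic impedance of the inductor trace
RL = 50.0        # hypothetical load resistance
quarter_wave = math.pi / 2.0

# Line terminated directly in the AC-grounded supply (short circuit):
# the short is transformed into a very large impedance.
print(abs(input_impedance(0.0, Z0, quarter_wave)))

# Line terminated in RL: the input impedance stays finite (Z0**2 / RL).
print(abs(input_impedance(RL, Z0, quarter_wave)))
```

At a quarter wavelength the shorted line presents an essentially open circuit, while the resistively terminated line presents Z0²/RL, which is finite and non-zero for any finite, non-zero RL.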

[Figure annotations: Fig. 6: Zin may approach infinity; Fig. 7: Zin always finite]

2.4 Second order effects in deep submicron MOSFETs

Of special relevance in the transistor sizing and biasing in the presented design is a relatively unfamiliar submicron phenomenon. Electron mobility degradation due to high vertical electric fields can drastically decrease the performance of deep-submicron circuits. Fortunately, the application of proper biasing measures can mitigate these negative effects. However, very few publications on this topic exist and classical biasing techniques are the norm.

Fig. 8: Transconductance of a 130nm nMOSFET vs. gate voltage [8]

Fig. 8 [8] portrays the deleterious effects on transconductance due to electron

mobility degradation in high-vertical fields. Shown is the transconductance

normalized by width (gm/W) of a 130-nm nMOS transistor as a function of

gate-source voltage. Its shape is similar to that of the fT dependence on VGS

and typical for all deep submicron technologies. The curve exhibits two dis-

tinct regions, the square-law region and the high-vertical field region as

shown. There also exists an intermediate area between the two regions in

which hybrid behaviour is observed.

At low effective gate voltages (VGS < 0.5V in Fig. 8), the device follows the classical square law model and its transconductance varies linearly with VGS. The equation derived for an n-MOS transistor in saturation (square law region) is:

\[
g_m = \mu_n C_{OX} \frac{W}{L} (V_{GS} - V_T) \tag{8}
\]

where COX is the oxide capacitance, W is the gate width, L is the gate length, and VT is the threshold voltage of the transistor.

At large gate-source voltages, the high electric field developed between the gate and channel of the transistor confines charge carriers to a narrower region below the oxide-silicon interface, leading to more carrier scattering and hence lower mobility. Further, small-geometry devices experience significantly more mobility degradation [9]. An empirical equation modelling this effect is [9]:

\[
\mu_n = \frac{\mu_o}{1 + \theta (V_{GS} - V_T)} \tag{9}
\]

where µo denotes the “low-field” mobility and θ is a fitting parameter that increases with decreasing oxide thickness and hence smaller geometries.

Substituting the expression for electron mobility µn (9) into (8) gives

\[
g_m = \mu_o C_{OX} \frac{W}{L} \, \frac{V_{GS} - V_T}{1 + \theta (V_{GS} - V_T)}
\]

so that when the second term in the denominator of (9) becomes dominant, the transconductance tends to the constant $\mu_o C_{OX} W / (\theta L)$. This effect is observed clearly in the high-vertical field region of Fig. 8, where VGS > 0.7V.

It is of special interest to note how the constant transconductance due to high-vertical fields affects the differential voltage required to completely switch a MOS differential pair. It will be shown that a larger switching voltage, with little or no improvement in transition time, is required when the MOS pair is biased in the high-vertical field region [8].

We first derive the differential voltage required to completely switch a MOS

differential pair when biased in the high-vertical field region (VGS > 0.7V in

Fig. 8).

Fig. 9: MOS differential pair

Fig. 9 shows a differential pair of MOS transistors fed by a constant tail current Itail. Imagine tail current Itail is fully routed through transistor Q1. Assuming high-vertical field operation, Itail is:

\[
I_{tail} = I_{DS1} = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{1 + \theta (V_{GS1} - V_T)} (V_{GS1} - V_T)^2 \tag{10a}
\]

where transistor Q1 is assumed to be in saturation and equation (9) has been substituted for µn. In high-vertical field operation, when the term θ(VGS1 - VT) becomes dominant relative to unity, equation (10a) becomes:

\[
I_{tail} = I_{DS1} \approx \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} (V_{GS1} - V_T) \tag{10b}
\]

Equation (10b) shows that IDS1 now exhibits a linear relationship with VGS1. At the instant that Itail is fully routed through Q1, it is evident that VGS2 = VT, the threshold voltage of the transistors, such that IDS2 = 0. Hence, the differential voltage across the gates of the transistors is

\[
\Delta V = V_{GS1} - V_{GS2} = V_{GS1} - V_T \tag{11}
\]

Isolating VGS1 in (11) and substituting into (10b) reveals that

\[
I_{tail} \approx \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} (\Delta V + V_T - V_T) = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} (\Delta V) \tag{12}
\]

for Itail at the instant when all the tail current is shifted completely through Q1. An alternate equation for Itail is derived in the steady state when the tail current is split evenly between the transistors such that:

\[
I_{DS1} = I_{DS2} = \frac{I_{tail}}{2} = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} (V_{GS(1,2)} - V_T) = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} (V_{EFF}) \tag{13}
\]

as VGS(1,2) is equal for both Q1 and Q2 in equilibrium. Equating Itail from (12) and (13), we solve for ∆V, the differential voltage required to completely switch the MOS differential pair:

\[
\frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} (\Delta V) = C_{ox} \frac{W_1}{L_1} \frac{\mu_o}{\theta} (V_{EFF}) \tag{14a}
\]

\[
\Delta V \ge 2 V_{EFF} \tag{14b}
\]

where the inequality is explicitly shown in (14b) to indicate a minimum differential voltage requirement.

To find the minimum differential voltage to switch the tail current completely through one of the MOS transistors in the square-law region (VGS < 0.5V in Fig. 8), a parallel mathematical process to equations (10) through (14) is completed. This is shown explicitly in [10]. The resultant required voltage swing for full switching becomes:

\[
\Delta V \ge \sqrt{2}\, V_{EFF} \tag{15}
\]

Hence, in the high-vertical field region, both the scalar multiplier and VEFF itself are larger, requiring a greater differential voltage, and hence a longer transition time, to switch the differential pair [8].
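As a rough numerical illustration, the sketch below compares the minimum switching voltage in the two regimes, using the high-vertical field result ∆V ≥ 2·VEFF of (14b) and the standard square-law result ∆V ≥ √2·VEFF. The function names and the example VEFF values are assumptions chosen only for illustration; they are not taken from this design.

```python
import math

def delta_v_high_field(v_eff):
    """Minimum differential voltage for full switching, high-vertical field: 2*VEFF."""
    return 2.0 * v_eff

def delta_v_square_law(v_eff):
    """Minimum differential voltage for full switching, square-law region: sqrt(2)*VEFF."""
    return math.sqrt(2.0) * v_eff

# Hypothetical effective gate voltages in volts (illustrative only).
v_eff_high_field = 0.40
v_eff_square_law = 0.20

print(f"high-vertical field: dV >= {delta_v_high_field(v_eff_high_field):.3f} V")
print(f"square-law region:   dV >= {delta_v_square_law(v_eff_square_law):.3f} V")
```

With these assumed values, the high-field device needs roughly three times the input swing of the square-law device, which is the qualitative point made above.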

The outcome of this analysis emphasizes that gate-source voltages must be

limited to mitigate the effects of high-vertical fields on electron mobility. This

is accomplished through proper transistor sizing and tail current selection.


3 Circuit Design

3.1 Driver overview

Fig. 10 shows the block diagram of the backplane driver highlighting the four

sections of the circuit. The pre-emphasis path is placed in parallel with the main signal path, and the currents from both output stages are summed across the 50 Ω load resistors to develop the differential output voltage. The parallel

path consists of delay buffers and the digital differentiator circuit. The output

swing is adjusted from the tail current of the output driver while the height of

pre-emphasis is controlled by the relative tail current of the digital differenti-

ator.

Fig. 10: Driver block diagram


3.2 Biasing a MOS CML gate for optimal speed

MOS CML logic has only recently been applied to high-speed ICs [6]. Conse-

quently, no systematic design guidelines existed until recently [8]. As such,

MOS CML gates have been biased for optimal speed rather haphazardly using

a voltage-centric approach. We will show through simulation that, for MOS

CML, a current-density centric instead of a voltage-centric design approach

leads to more accurate and reliable circuit design.

Fig. 11: Constant peak-fT current density over four technology nodes [8]

In VGS-focussed MOS CML design, the effective gate voltage (VEFF = VGS - VT) at which the MOSFET reaches its peak fT scales with technology, decreasing with every technology node. This makes it very difficult to predict an optimal bias, both across multiple technologies and within a given technology node. However, as the simulated data collected over four technology nodes (devices labelled as # of fingers x gate length x finger width) shown in Fig. 11 [8] show, the peak-fT current density (JpfTMOS) remains approximately constant (between 0.25 mA/µm and 0.35 mA/µm depending on VDS) as technology scales. This trend is likely to occur also for future MOS

generations as a result of the constant field scaling that has been applied from

the 0.5-µm technology node [11] downward. Consequently, a current-density

centric design approach, similar to that which is commonly employed in bipo-

lar designs [12], is more appropriate for reproducible, accurate, high-speed

design of MOS CML circuits. In a current-density centric design scenario, the

gate width of the MOSFET is sized such that the device reaches its peak fT

when all of the tail current flows through the device:

$$W_G = \frac{I_T}{J_{pfTMOS}} \tag{16}$$

In 130-nm technology, this corresponds to a VEFF of around 300 mV. Biasing beyond the peak-fT current density will degrade circuit performance.

For large-signal high-speed circuit biasing in a MOS differential pair, it is recommended that each of the differential pair transistors be biased at half-peak-fT current density in the balanced-current steady state. This avoids current densities beyond peak-fT when, during full switching, all the tail current is routed through one transistor of the differential pair and its current density is momentarily doubled from half-peak-fT to peak-fT. Referring back to the discussion of Section 2.4, biasing the circuit at half-peak-fT has a second positive effect on circuit switching speed. Biasing at half-peak-fT current density instead of full-peak-fT current density permits the differential pair transistors to operate more in the square-law region instead of the slower-switching high-vertical field region.

Based on this observation, each circuit block in the presented driver design

consists of a MOS-CML inverter whose ratio of tail current to differential pair

transistor width is set to correspond to the peak fT bias of the n-channel MOS-

FET of 0.25 to 0.3 mA/µm. This bias scheme is implemented to obtain the

maximum switching speed. Inductive peaking is employed in every stage to

further improve the circuit bandwidth.
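The sizing rule of (16) can be sketched numerically as follows. This is a minimal illustration: the function names and the 6 mA tail current are assumptions, while the 0.3 mA/µm density lies within the peak-fT range quoted above.

```python
def gate_width_um(i_tail_ma, j_peak_ma_per_um):
    """Eq. (16): gate width (um) at which the full tail current gives peak-fT density."""
    return i_tail_ma / j_peak_ma_per_um

def balanced_density(i_tail_ma, width_um):
    """Current density (mA/um) per transistor when the tail current splits evenly."""
    return (i_tail_ma / 2.0) / width_um

i_tail = 6.0   # mA, hypothetical tail current
j_peak = 0.30  # mA/um, assumed peak-fT current density

width = gate_width_um(i_tail, j_peak)
print(f"gate width per transistor: {width:.1f} um")
print(f"balanced-state density:    {balanced_density(i_tail, width):.3f} mA/um")
print(f"fully switched density:    {i_tail / width:.3f} mA/um")
```

Sizing for peak-fT density at full switching automatically places each transistor at half-peak-fT density in the balanced state, which is the bias point recommended above.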

3.3 Biasing a differential CMOS inverter for low-noise

Fig. 12 shows fT and, on the opposite axis, NFMIN versus current density for a 130-nm n-MOSFET with a 2 µm unit finger width. NFMIN reaches its minimum at a bias current density of about half the current density of the maximum fT. Earlier it was shown that biasing each transistor in a differential pair at half-peak-fT current density results in optimal switching speed. Coincidentally, this same current-density bias point also results in minimum NFMIN.


Fig. 12: fT and NFMIN vs. current density for a 130-nm n-MOSFET with 2 µm unit finger width biased at VDS = 1V

3.4 Input matching network and low-noise comparator

Fig. 13 illustrates the input matching network and input low-noise compara-

tor. The input differential pair has higher gain and larger tail current than

the other stages in order to reduce the noise by making the optimum noise

impedance of the input stage closer to 50Ω per side. A compromise was

reached between achieving the best possible noise match, which calls for

larger transistor sizes and bias current, and the broadband input impedance

match. On-chip matching resistors, realized as a resistive divider with series

inductors, provide appropriate gate bias for the input transistors and broad-

band input impedance matching.

[Fig. 12 axes: current density (mA per µm width, logarithmic from 10^-2 to 10^0); fT (GHz, 0 to 100); NFMIN @ 10 GHz (dB, 0.0 to 2.0)]

Fig. 13: Input bias and matching network and comparator

3.5 Eye-Crossing Control

A key objective of the thesis was the design of a driver with controllable pulse-

width. The application of controllable pulse width is the compensation of DC

offsets that may cause signal distortion. This compensation is used to alter the

duty-cycle (of an input signal) such that a 50% duty-cycle periodic input may

be changed to a 30% or 70% duty-cycle periodic output signal. Conversely, a

DC offset which has imposed an output duty cycle of 30% could be negated to

re-instate the duty-cycle back to 50% as desired.

The circuit of Fig. 14 accomplishes pulse-width control using a technique

found in [13]. The circuit consists of two series inductively-peaked differential

stages with a DC offset control pair connected at the output of the first stage.

By applying a DC voltage Voffset, an offset voltage is developed at the output


node of the first differential pair, shifting the zero-crossing between the two

outputs. Due to the finite rise and fall time of the waveform, and the trunca-

tion by the limiting action of the last inverter, Vout exhibits a change in duty-

cycle as illustrated by the overlaid waveforms of Fig. 14.
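The dependence of duty cycle on the applied offset can be estimated with a simple first-order model. The sketch below assumes a 50% duty-cycle waveform with linear (trapezoidal) edges at the limiter input, so that an offset shifts each zero crossing in time in proportion to the edge rise time. The function name and all numerical values are illustrative assumptions, not simulated or measured values from this design.

```python
def duty_cycle(v_offset, v_pp, t_rise, period, nominal=0.5):
    """First-order duty cycle after limiting a trapezoidal wave with a DC offset.

    v_offset: differential offset added before the limiter (V)
    v_pp:     peak-to-peak differential swing before the limiter (V)
    t_rise:   0-100% rise/fall time of the linear edges (s)
    period:   period of the periodic input (s)
    """
    if abs(v_offset) >= v_pp / 2.0:
        raise ValueError("offset exceeds half the swing; the signal never crosses zero")
    # Each edge has slope v_pp / t_rise, so its zero crossing moves by this time.
    edge_shift = v_offset * t_rise / v_pp
    # Rising edge crosses earlier and falling edge later (or vice versa).
    return nominal + 2.0 * edge_shift / period

# Hypothetical example: 100 ps period, 40 ps edges, 400 mV swing.
for v_off in (-0.1, 0.0, 0.1):
    d = duty_cycle(v_off, v_pp=0.4, t_rise=40e-12, period=100e-12)
    print(f"Voffset = {v_off:+.1f} V -> duty cycle = {100 * d:.0f}%")
```

The model makes explicit why a finite rise and fall time is required: with ideal (zero rise time) edges the offset would have no effect on the duty cycle.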

3.5.1 Transistor sizing and biasing in the eye-crossing control circuit

On the left side of Fig. 14, the input pair simply drives a series RL load, no dif-

ferently than any of the other inductively peaked circuits of this backplane

driver. The middle pair of transistors (M3 and M4) is driven by the DC control voltage Voffset, which must be held constant; otherwise a mixing effect would occur, because the currents drawn by M3 and M4 act on the same signal path as M1 and M2. Hence it must be ensured that the control voltage Voffset comes from a low-noise source.

The third stage acts as a limiting amplifier as previously described. The tail current was chosen to be 12 mA, and the differential pair transistors were accordingly sized at 32 µm each so that the current density in each is 0.19 mA/µm during steady state. This biasing is in line with the half-peak-fT recommendation derived in Section 3.2.
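The quoted density follows directly from the chosen values, with the tail current split evenly between the two transistors in the balanced state:

```latex
J_{balanced} = \frac{I_{tail}/2}{W}
             = \frac{12\,\mathrm{mA}/2}{32\,\mu\mathrm{m}}
             = 0.1875\,\mathrm{mA}/\mu\mathrm{m}
             \approx 0.19\,\mathrm{mA}/\mu\mathrm{m}
```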


Fig. 14: Eye-crossing control circuit and intermediate signal

3.6 Output driver and parallel current summation

The output driver and digital pre-emphasis circuit, in parallel, create the out-

put voltage across the output load resistor. It is shown in Fig. 15 how the out-

put currents of the output driver and the digital differentiator are summed.

The ratio of the output current from the output driver (I1 in Fig. 15) to the output current contribution from the digital differentiator (I2 in Fig. 15) determines the percentage of pre-emphasis in the overall output voltage, Vout. The higher this ratio, the lower the percentage of pre-emphasis in Vout.
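A minimal sketch of this relationship is given below. It assumes that the main-path swing and the pre-emphasis spike are each simply the corresponding switched current multiplied by an effective load resistance; the function name, the 25 Ω effective load and the current values are hypothetical and serve only to illustrate the trend.

```python
def output_levels(i1_ma, i2_ma, r_load_ohm):
    """Return (main swing in mV, pre-emphasis spike in mV, pre-emphasis in percent).

    i1_ma:      switched current of the output driver (mA)
    i2_ma:      switched current of the digital differentiator (mA)
    r_load_ohm: effective load resistance seen by the summed current (ohm)
    """
    main_swing_mv = i1_ma * r_load_ohm   # mA * ohm = mV
    spike_mv = i2_ma * r_load_ohm
    pre_emphasis_pct = 100.0 * spike_mv / main_swing_mv
    return main_swing_mv, spike_mv, pre_emphasis_pct

# Hypothetical operating points: a higher I1/I2 ratio gives less pre-emphasis.
for i1, i2 in ((12.0, 2.0), (12.0, 4.0)):
    swing, spike, pct = output_levels(i1, i2, r_load_ohm=25.0)
    print(f"I1/I2 = {i1 / i2:.1f}: swing {swing:.0f} mV, spike {spike:.0f} mV, {pct:.1f}%")
```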


Fig. 15: Summation of output currents across output resistor

The output driver is shown in Fig. 16 and consists of a simple differential pair

biased with a current mirror. The value of bias current in the output driver

determines the magnitude of output signal swing from the main path.

[Fig. 15 labels: Delay Buffers; Digital Differentiator (I2); Output Driver (I1); I1 + I2; Rout; Vout]

Fig. 16: Output driver transistor-level schematic

3.7 Digital pre-emphasis

The digital pre-emphasis circuit, whose block diagram is shown in Fig. 17, fea-

tures three delay cells, implemented as inverters, followed by a digital differ-

entiator circuit. The third delay cell is loaded additionally with MOS

varactors connected in parallel with resistive loads in order to control the

delay and, therefore, the pre-emphasis spike width.

Fig. 18 illustrates the waveforms in the digital differentiator. The circuit has two differential inputs, (IN, $\overline{IN}$) and (INdly, $\overline{INdly}$), and a differential output (VoutP, VoutN), whose currents are summed with the currents from the main path across the on-chip 50 Ω load resistors.

[Fig. 16 labels: Vdd = 1.5V; Rload on each side (shared with digital differentiator); outputs VoutN, VoutP; inputs Vin+, Vin-; differential pair M1, M2; Ibias = 2mA; Iout = 20mA]

Fig. 17: Digital pre-emphasis block diagram

Fig. 18: Pre-emphasis waveforms and transfer function

The input signals are assumed to be periodic for this example. Input signals IN and $\overline{IN}$ are delayed through the delay circuit to produce INdly and $\overline{INdly}$, respectively. As shown by the dotted arrows in Fig. 18, the delay between IN and INdly translates into the pre-emphasis pulse width of VoutP. The same relationship holds for $\overline{IN}$, $\overline{INdly}$, and VoutN, respectively. The four signals IN, $\overline{IN}$, INdly and $\overline{INdly}$ are fed into the digital differentiator circuit, which has two outputs, VoutP and VoutN.

The differentiator functions as a logical XOR gate with one notable exception. Whereas an XOR gate in the classical sense operates within a binary logic system, this logic circuit produces a ternary (three-level) output. Specifically, when IN·$\overline{INdly}$ is true, the output VoutP rises; when $\overline{IN}$·INdly is true, the output VoutP falls; and when neither case is true, the output of the circuit stays in steady-state. The output VoutN falls and rises in a horizontally-mirrored fashion.

The transformation of binary logic at the input of the differentiator to ternary logic at its output is accomplished using current switching. The circuit of Fig. 19 is biased by two constant current sources of equal value, Iswch. The voltages at VoutP and VoutN are set by the voltage drops that the switched currents develop across each 50 Ω load resistor. In the steady state, both VoutP and VoutN are pulled down by an equal current of value Iswch. When VoutP rises (and VoutN drops), the current path pulling down VoutP is cut off, and VoutN is pulled down by a current equal to 2Iswch. Conversely, when VoutP drops (and VoutN rises), VoutP is pulled down by a current of 2Iswch, and VoutN has its current path to ground cut off.
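The current-switching behaviour described above can be captured in a small behavioural model. This is only a sketch: the function names are hypothetical, the polarity (VoutP rising when IN is high and INdly is still low) is an assumption, the delay is idealized as exactly one sample, and the differentiator is modelled in isolation from the main path that shares its load resistors.

```python
def differentiator_currents(in_bit, in_dly_bit, i_swch):
    """Return (current pulling VoutP down, current pulling VoutN down) in amperes."""
    if in_bit and not in_dly_bit:      # IN high, INdly low: VoutP rises
        return 0.0, 2.0 * i_swch
    if in_dly_bit and not in_bit:      # IN low, INdly high: VoutP falls
        return 2.0 * i_swch, 0.0
    return i_swch, i_swch              # steady state: equal pull-down on both sides

def differential_output(in_bit, in_dly_bit, i_swch, r_load=50.0):
    """VoutP - VoutN in volts; the supply voltage cancels in the difference."""
    i_p, i_n = differentiator_currents(in_bit, in_dly_bit, i_swch)
    return (i_n - i_p) * r_load

data = [0, 1, 1, 0, 0, 1, 0]     # arbitrary example bit pattern
delayed = [0] + data[:-1]        # idealized delay of one sample
i_swch = 5e-3                    # 5 mA, within the adjustable range of Iswch

levels = [differential_output(a, b, i_swch) for a, b in zip(data, delayed)]
print(levels)
```

The printed list takes only three distinct values (positive, zero and negative), which is the three-level output behaviour of the differentiator.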


Fig. 19: NMOS digital differentiator schematic

In Fig. 19, the pre-emphasis height is controlled by the two constant current

sources Iswch, whose value is adjustable between 0 and 10 mA. For matching,

transistors M3 and M6 compensate the VDS drop across transistors M1, M2

and M4, M5, respectively.


4 Simulation Results

4.1 S-Parameter simulations

The small signal S-parameters were simulated in a single-ended input and

output configuration. The input and output return loss are better than -5 dB

up to 60 GHz, as shown in Fig. 20, with S11 achieving -10dB up to 50 GHz.

The single-ended simulated small signal gain is 14.5dB, confirmed by the eye

diagram measurements of Fig. 21. Further, Fig. 20 shows that the 3dB-fre-

quency of the driver is 8.5 GHz and the driver has gain (> 0dB) up to 24.1

GHz.
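For reference, the decibel figures quoted here can be related to voltage ratios with the usual 20·log10 convention. The helper names in this short sketch are hypothetical.

```python
import math

def db_to_voltage_ratio(gain_db):
    """Convert a voltage gain in dB to a linear voltage ratio."""
    return 10.0 ** (gain_db / 20.0)

def voltage_ratio_to_db(ratio):
    """Convert a linear voltage ratio to dB."""
    return 20.0 * math.log10(ratio)

gain = db_to_voltage_ratio(14.5)   # simulated single-ended small-signal gain
print(f"14.5 dB corresponds to a voltage gain of about {gain:.2f}")
print(f"a 20 mVpp input would give about {20.0 * gain:.0f} mVpp in the small-signal limit")
```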

Fig. 20: Simulated S22 and S11 of complete driver circuit to 60 GHz


Fig. 21: Simulated single-ended S21 of entire driver circuit

With the driver’s output amplitude control set to a low level, the resulting

simulated S21 is as shown in Fig. 22. The output amplitude is controlled by an

off-chip current source and can be manipulated to increase or decrease the low

frequency gain peak. The decreased gain gives an extended 3dB-bandwidth

and 0dB crossing of 9.3 GHz and 37.6 GHz, respectively.

Fig. 22: Simulated single-ended S21 of driver with output amplitude reduced using amplitude control

S21 simulations with output peaking enabled are shown in Fig. 23. The peaking is evident in the simulation, increasing the maximum small-signal gain from 14.1 dB to 16.5 dB and the 3-dB frequency to 13.3 GHz. This illustrates the frequency-shaping effect of the pre-emphasis output circuit.

Fig. 23: Simulated S21 with output peaking enabled

4.2 Time domain simulations

Time domain eye-diagram simulations were performed on the full driver circuit with a 2^7-1 PRBS (Pseudo Random Binary Stream) generator. In Fig. 24, a 20 Gb/s input signal was applied single-ended and the unused input was terminated with a 50 Ω resistance connected to VDD.
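A 2^7-1 PRBS can be generated with a 7-bit linear feedback shift register. The sketch below uses the commonly used generator polynomial x^7 + x^6 + 1; this polynomial and the function name are assumptions, since the generator used in the simulations is not specified here.

```python
def prbs7(seed=0x7F, n_bits=127):
    """Generate n_bits of a PRBS7 sequence (x^7 + x^6 + 1) from a non-zero 7-bit seed."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the register locks up")
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of shift-register stages 7 and 6.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return bits

sequence = prbs7(n_bits=254)
print("repeats after 127 bits:", sequence[:127] == sequence[127:])
print("ones in one period:", sum(sequence[:127]))
```

A maximal-length sequence of this kind contains 64 ones and 63 zeros per 127-bit period, which makes it a convenient approximation of random data for eye-diagram simulation.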

Fig. 24: 20 Gb/s eye diagrams, 2^7-1 PRBS: single-ended input 20 mVpp; differential output 99 mVpp per side


Fig. 25 and Fig. 26 show simulated eye-diagrams at 25 Gbps and 30 Gbps,

respectively. The circuit exhibits reduced sensitivity at 25 Gbps and 30 Gbps,

requiring, respectively, 60 mV and 140 mV single-ended inputs for similar eye

openings.



Fig. 27 demonstrates the simulated eye-crossing control performance at 20

Gb/s taken at the output of the driver. The eye-crossing control for (a) 50%, (b)

66%, and (c) 33% was performed by varying the control voltage supplied by off-chip voltage sources (Voffset of Section 3.5.1).


Fig. 27: Simulated eye-crossing control at 20 Gb/s (a) 50%, (b) 66%, and (c) 33%

Fig. 28 shows output amplitude control at 20 Gb/s. The input signal in both

diagrams is 20 mVpp, applied to one side only, and the output is varied

between 21 mVpp and 220 mVpp.


Fig. 28: Output amplitude control at 20 Gb/s; input of 20 mVpp applied to a single end; output of (a) 21 mVpp and (b) 220 mVpp per side

Simulated waveforms with varying levels of pre-emphasis at 20Gb/s are

shown in Fig. 29 and Fig. 30. By altering the current bias of the parallel digi-

tal pre-emphasis path, the pre-emphasis of the overall driver can be controlled

independently of the main path of the driver. Fig. 29 shows a symmetric +/-

16% overshoot/undershoot ratio, relative to the voltage swing of the main-

path signal. Fig. 30 shows an asymmetric +33%/-25% overshoot/undershoot

ratio. The asymmetry in Fig. 30 is attributed to a systemic limitation of the pre-emphasis circuit: the maximum output voltage cannot exceed VDD, while the minimum value must be higher than 3*VDS(sat) due to the triple-stacked NMOS configuration of Fig. 19. Hence, the ratio of overshoot/undershoot relative to the isolated main-path signal swing is directly related to the pre-emphasis bias current. A lower digital pre-emphasis current results in a symmetric overshoot/undershoot ratio as shown in Fig. 29, but the relative percentage of overshoot/undershoot reduces to less than 25%.

Fig. 29: 20 Gb/s output eye diagram with 316 mVpp swing per side and +/-16% pre-emphasis

Fig. 30: 20 Gb/s output eye diagram with 300 mVpp swing per side and +33%/-25% pre-emphasis


5 Physical Implementation

5.1 Inductor design and model extraction

The computer-aided design (CAD) tool ASITIC (http://rfic.eecs.berkeley.edu/~niknejad/asitic.html) was used in the simulation and sizing of the inductors in the driver. ASITIC is a three-dimensional field solver which aids the RF circuit designer in the optimization and modelling of spiral inductors, transformers, capacitors, and substrate coupling. Test structures previously fabricated and characterized at the University of Toronto have confirmed the accuracy of the ASITIC solver to within 90-95% of absolute inductance values, in both planar and stacked spiral inductor situations.

All inductors of the driver were simulated as two-port circuits using both the

π−model of Fig. 31 and the lumped element broadband model shown in Fig.

32. In the substrate, the eddy-current-induced loss and the substrate capacitance are represented by Rsub and Csub, respectively. The oxide capacitance is represented by Cox. The series inductance and resistance of the inductor proper are represented by Ls and Rs, respectively. Finally, the capacitance between the two symmetrical interwoven arms of the inductor is represented by Ciw.

None of the passive lumped elements in the model of Fig. 32 could be omitted

to reduce simulation time because each inductor was not attached directly to a

ground node in the design. This design requirement was discussed previously

in section 2.3. Further, as each inductor was employed in a broadband load

configuration, design considerations gave priority to self-resonant frequency

over quality factor (Q) in the optimization of each inductor.


5.1.1 Model extraction using ASITIC

This section outlines the procedure used to extract both the π-model of the inductor, which is valid only at a single frequency, and the broadband model of the inductor, which is valid at frequencies below self-resonance.

The π-model of the inductor is shown in Fig. 31 and is valid only at a single-

frequency. The ASITIC three-dimensional field-solver is invoked using the

command pix to resolve the circuit parameters Ls, Rs, Cox1, Cox2, Rsub1, and

Rsub2 from an inductor layout drawn in the ASITIC graphical tool. These val-

ues are assumed correct at all frequencies below self-resonance for the induc-

tor in question. The broadband model introduces three fitting capacitances

(Ciw, Csub1, Csub2) to account for the broadband frequency response of the

inductor’s parasitics.

The broadband model capacitors are found using a Y-parameter simulation

from 5 GHz up to the self-resonant frequency of the inductor, in 1 GHz steps.

The output of the Y-parameter data is then ported to a SPICE optimization

deck. The optimization deck is included in Appendix A for reference. The

SPICE optimization deck will best fit the three broadband model capacitances

to match the Y-parameter data simulated in ASITIC. As three unknowns are

being optimized for simultaneously, reasonable and educated estimates for

each of the broadband capacitances (Ciw, Csub1, Csub2) are substituted initially.

The SPICE deck in Appendix A will also print L12, Leff and Q vs. frequency.


Fig. 31: Inductor single-frequency Π -model

Fig. 32: Inductor lumped element broadband model


Derived ASITIC parameters

Table 2 and Table 3 show the physical dimensions and the simulated values of the equivalent circuit parameters for each of the three inductors employed in this design. All inductors have dual-layer stacked-spiral configurations with varying numbers of turns; the maximum outer diameter (per side), 51.2 µm, is used for the 900 pH inductor.

Table 3: Physical dimensions of each inductor

| Parameter | LS = 400 pH | LS = 700 pH | LS = 900 pH |
|---|---|---|---|
| Diameter (µm) | 42.2 | 46.2 | 51.2 |
| Number of turns | 2 | 3 | 3 |
| Metal width (µm) | 1.96 | 1.96 | 1.96 |
| Spacing between windings (µm) | 1.96 | 1.96 | 1.96 |
| Metal layers | METAL6, METAL5 | METAL6, METAL5 | METAL6, METAL5 |

5.1.2 Inductor realization and isolation

The 900pH inductor is shown in Fig. 33 and is comprised of two metal layers -

the top level of the six metal process and metal five. The left-side terminal of

the inductor is formed in metal six and connects directly to the drain of the

amplifying transistor of that half stage. The right-side port of the inductor is

routed in metal five and is connected to a poly-silicon resistor at each stage.

Each inductor is isolated from each adjacent inductor to minimize crosstalk.

The isolation is accomplished by surrounding each inductor with n-wells in

turn surrounded by p-tap guard rings connected to ground to impose reverse-

biasing on the junctions. Additionally, referring to Fig. 33, the p-taps between

adjacent inductors are electrically attached to a stack of metal layers, from

metal 1 through metal 6, which form a Faraday cage and improve isolation.

Table 4: Simulated inductor parasitic values

| Parameter | LS = 400 pH | LS = 700 pH | LS = 900 pH |
|---|---|---|---|
| RS (Ω) | 5.14 | 8.01 | 9.13 |
| Cox1 (fF) | 8.63 | 11 | 13 |
| Cox2 (fF) | 8.2 | 10.8 | 12.7 |
| Rsub1 (Ω) | 1820 | 1830 | 1710 |
| Rsub2 (Ω) | 2280 | 2130 | 1980 |
| Csub1 (fF) | 5.72 | 5.76 | 6.16 |
| Csub2 (fF) | 4.70 | 4.95 | 5.32 |
| Ciw (fF) | 4.69 | 6.26 | 7.24 |
| fSelf-Res (GHz) | 85.6 | 56.7 | 45.93 |
| Q-factor (5 GHz) | 2.4 | 2.7 | 3.1 |
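As a sanity check on the tabulated values, the quality factor at 5 GHz can be estimated from the series elements alone as Q ≈ 2πf·Ls/Rs. This simple estimate ignores the oxide, substrate and inter-winding parasitics, and it is not necessarily how the tabulated values were extracted; the function name and the assignment of the three value columns to the 400 pH, 700 pH and 900 pH inductors are assumptions.

```python
import math

def q_factor(l_s_henry, r_s_ohm, freq_hz):
    """Series-element estimate of inductor quality factor: Q = 2*pi*f*Ls / Rs."""
    return 2.0 * math.pi * freq_hz * l_s_henry / r_s_ohm

# (Ls in henries, Rs in ohms) taken from the inductor parameter tables.
inductors = {
    "400 pH": (400e-12, 5.14),
    "700 pH": (700e-12, 8.01),
    "900 pH": (900e-12, 9.13),
}

for name, (l_s, r_s) in inductors.items():
    print(f"{name}: Q(5 GHz) = {q_factor(l_s, r_s, 5e9):.1f}")
```

The estimates round to the tabulated Q-factors, which suggests that at 5 GHz, well below self-resonance, the series resistance dominates the inductor loss.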


Fig. 33: On-chip 900 pH inductor with 51.2µm diameter

5.2 Layout and fabrication

The circuit was fabricated in ST Microelectronics’ 130-nm standard CMOS

process with typical n-MOSFET fT and fMAX of 90 GHz and 100 GHz, respec-

tively. The chip microphotograph is reproduced in Fig. 34. The design is pad-

limited and the total die area was 1.0mm x 0.8mm.


Fig. 34: Full chip photograph

Magnified versions of the two parallel paths of the driver are shown in Fig.

35 and Fig. 36, respectively. Using Fig. 10 as a reference that describes the

schematic block diagram of the entire driver, Fig. 35 emphasizes the layout of

the main signal path of the driver while Fig. 36 details the layout of the paral-

lel pre-emphasis signal path.


Fig. 35: Magnified photograph of main path layout

Fig. 36: Magnified photograph of the parallel path layout


The entire layout, with the exception of the digital pre-emphasis XOR gate, is

completely symmetric about the horizontal axis. Power and ground connec-

tions are provided along the central axis in metal 1 and metal 2, respectively.

Transistors and poly-silicon resistors are located adjacent to the division, fol-

lowed by signal path routing, with the peaking inductors located on the out-

side.

The fully-symmetrical layout results in several advantages: (i) the layout of

the entire circuit is simplified by employing half-cell layout and replication

techniques; (ii) the positive and negative signal paths are matched in length

and are physically and electrically isolated from one another; and (iii), the

coupling coefficients of same-stage load inductors are diminished.


6 Experimental Results

6.1 Test environment

The post-fabricated circuit was tested on wafer with probes microscopically

placed on the circuit pads. The frequency-domain measurements were made

with a 50-GHz 8510C Hewlett-Packard Vector Network Analyzer (VNA), as

shown in Fig. 37. The time-domain 2^31-1 PRBS stimulus was created with a combination of an Anritsu 69397B Signal Generator, an MP17584 Pulse Pattern Generator, and a MP1801A 43.5 Gbps MUX as shown in Fig. 38.

In Fig. 38, the signal generator generates a CLK signal which is driven to the

MUX, and also divided by four and sent to the pulse pattern generator. The

pulse pattern generator produces four 1/4 DATA signals, each with a bitrate one-quarter that of the final PRBS stream. The four 1/4 DATA signals are time-multiplexed, with the original CLK controlling the switching of the MUX on positive CLK edges. Hence, the output of the MUX is a 2^31-1 PRBS bit sequence at the bitrate required for the test, not exceeding 43.5 Gbps.
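The time-multiplexing step can be sketched as a simple interleave of four quarter-rate streams. The function names and the lane ordering are assumptions made for illustration; the actual ordering used by the test equipment is not described here.

```python
def mux_4to1(lanes):
    """Time-interleave four equal-length quarter-rate bit streams into one full-rate stream."""
    if len(lanes) != 4 or len({len(lane) for lane in lanes}) != 1:
        raise ValueError("expected four lanes of equal length")
    full_rate = []
    for bits in zip(*lanes):       # one bit from each lane per quarter-rate period
        full_rate.extend(bits)
    return full_rate

def demux_1to4(stream):
    """Split a full-rate stream into four quarter-rate lanes (inverse of mux_4to1)."""
    return [stream[i::4] for i in range(4)]

lanes = [[1, 0], [0, 0], [1, 1], [0, 1]]   # arbitrary example data
stream = mux_4to1(lanes)
print(stream)
print(demux_1to4(stream) == lanes)

full_rate_gbps = 20.0
print(f"each lane runs at {full_rate_gbps / 4:.1f} Gb/s for a {full_rate_gbps:.0f} Gb/s output")
```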


Fig. 37: S-parameter test setup

[Fig. 37 labels: Hewlett-Packard 8510C; D.U.T.; DC Bias / Control Probes; DA in; DA out; 50 ohm terminations]

Fig. 38: Eye-diagram test setup

[Fig. 38 labels: 69397B Synthesized Sweep/Signal Generator; MP17584 Pulse Pattern Generator; MP1801A 43.5 Gb/s MUX; 86100B DCA; D.U.T.; DC Bias / Control Probes; 50 ohm to Vdd; CLK; 1/4 CLK; 1/4 DATA-RATE PRBS; (DATA-Rate) PRBS; PRBS reference CLK; 20 - 43.5 Gb/s]

6.2 S-Parameters and noise figure

The small signal s-parameters and noise figure were measured in a single-

ended configuration. The small signal gain, S21, is shown in Fig. 39 and

agrees well with the simulated results of Fig. 21. Measured single-ended S21

bandwidth was 8.5 GHz, matching the simulated 3-dB frequency. The measured input and output return loss are better than -12 dB up to 50 GHz, as

shown in Fig. 40, outperforming the simulated value of -5dB up to 60 GHz in

Fig. 20. Measured and simulated noise figure values are plotted in Fig. 41. As

expected, the simulations show an inverse relationship between tail current

and overall driver noise figure. Furthermore, the measured noise figure of the

overall driver was 2 dB higher than simulated for a 6mA tail current in the

input comparator. This is primarily due to losses associated with the probe-

pad contact resistance and the series substrate resistance below the pad

which are not accounted for in simulation. Another reason for the higher than

simulated noise figure is the limitation of the BSIM3 model for MOSFETs

that does not capture the gate noise current of the MOSFET.

Fig. 39: Measured single-ended S21


Fig. 40: Measured single-ended S22 and S11

Fig. 41: Measured and simulated driver noise figure


6.3 Time-domain measurements

Time domain measurements were carried out at data rates between 20 Gb/s and 43 Gb/s using a 2^31-1 PRBS pattern. A sensitivity of 20 mVpp, as illustrated in Fig. 42, was measured at 20 Gb/s when the input signal was applied single-ended and the unused input was terminated with a 50 Ω resistance. The sensitivity degraded to 60 mVpp and 150 mVpp at 25 Gb/s and 30 Gb/s, respectively.


Typical 25 Gb/s and 30 Gb/s output eye diagrams are illustrated in Fig. 43

and Fig. 44, respectively. It is important to note that the driver exhibits gain at 30 Gb/s, with a 200 mVpp input signal resulting in a 260 mVpp output signal. To the author's knowledge, this is the first reported positive gain at 30 Gb/s in an all-CMOS driver to date (2005).
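Expressed in decibels, the large-signal gain implied by these single-ended amplitudes is:

```latex
G = 20\log_{10}\!\left(\frac{260\,\mathrm{mV_{pp}}}{200\,\mathrm{mV_{pp}}}\right)
  = 20\log_{10}(1.3)
  \approx 2.3\,\mathrm{dB}
```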


Fig. 43: 25 Gb/s output eye diagram with 50% eye crossing

Fig. 44: 30 Gb/s output eye-diagram with 260 mVpp per side for a single-ended 200 mVpp, 2^31-1 input PRBS.


Fig. 45 demonstrates the eye-crossing control performance at 20 Gb/s taken at

the output of the driver. The eye-crossing control for (a) 50%, (b) 70%, and (c)

30% was performed by varying the control voltage that was connected to off-

chip voltage sources.

Fig. 45: Eye-crossing control at 20 Gb/s (a) 50%, (b) 70%, and (c) 30%




Fig. 46: Output amplitude control at 20 Gb/s; output of (a) 190 mVpp and (b) 350 mVpp per side

Measured waveforms with pre-emphasis at 20Gb/s are shown in Fig. 47. The

eye-diagram exhibits higher positive overshoot than undershoot with spike

height control between 0% and 25% of the eye height. The overshoot/undershoot imbalance shown in Fig. 47 was caused by an excessively high ratio of the pre-emphasis path tail current to the main path tail current. The ratio of

overshoot/undershoot relative to the output signal swing alone is directly

related to the pre-emphasis path tail current value. A lower digital pre-

emphasis tail current could have resulted in a symmetrical overshoot/under-

shoot ratio, but the relative percentage of overshoot/undershoot would have

reduced to less than 15%. Unfortunately, experimental evidence of this was

not captured due to time restrictions at the testing facilities at Quake Tech-

nologies and ST Microelectronics, both located in Ottawa, Canada.

Fig. 47: 20 Gb/s output eye diagram with 300 mVpp swing per side and pre-emphasis

6.4 Performance summary

Table 4 summarizes the overall circuit characteristics. Of special note are the

high input sensitivity and high output swing, the extensive -12dB input/out-

put matching up to 50 GHz and the multiplicity of control mechanisms with


respect to output amplitude, pre-emphasis and eye-crossing. Further, there is

excellent agreement between the simulated and measured values of the driver

in both the time domain and the frequency domain.

Table 5: Simulated and measured circuit parameters

| Parameter | Simulated Value | Measured Value |
|---|---|---|
| Supply | 1.5 V | 1.5 V |
| Power | 150 mW | 150 mW |
| Output swing @ 20 Gb/s with 20 mVpp input | 21-350 mVpp per side | 190-350 mVpp per side |
| Pre-emphasis control @ 20 Gb/s | +33%/-25% | +25%/-15% |
| Crossing control @ 20 Gb/s | 33% to 66% | 30% to 70% |
| Eye sensitivity @ 20 Gb/s | 20 (10) mVpp | 20 (10) mVpp |
| Noise Figure | 14.9 dB @ 5 GHz, 15 dB @ 15 GHz | 17 dB @ 5 GHz, 17 dB @ 15 GHz |
| S11/S22 up to 50 GHz | < -5 dB | < -12 dB |

7 Conclusions

7.1 Summary

A 20 Gb/s backplane driver with more than 30 dB dynamic range was imple-

mented in 130-nm CMOS technology. The circuit consumes 150 mW from a 1.5

V supply and features independent control of output swing, duty cycle and

pre-emphasis. The circuit is operational without pre-emphasis at data rates

up to 30 Gb/s with 300 mVpp swing per side. The pre-emphasis pulse is both

amplitude and width controllable via the introduction of a novel digital circuit

implementation which does not require a separate clock signal. The results of

this project were published and presented at the IEEE Custom Integrated

Circuits Conference in 2004 [1].

7.2 Future work

Future work associated with this design would result in the further system

verification, expansion and industrialization of the circuit. As time-domain

testing was completed in industrial settings, tester time availability was low,

and in particular, pre-emphasis experimentation was affected. Further testing on the pre-emphasis circuit would involve varying the varactor capacitance to increase and decrease the pre-emphasis pulse width, and varying the relative current bias of the digital differentiator to empirically match positive and negative pre-emphasis pulse heights. In terms of system expansion and

industrialization, a receiver circuit placed across a backplane with appropri-

ate mechanisms to control the pre-emphasis would be required to close the

feedback loop. Further industrialization of the main path would require


instantiation of on-chip and off-chip reference voltages and currents for each

of its independently controlled stages. Finally, verification of the circuit in a

datapath operation to measure its true performance in a digital data-specific

application is required. This would involve an addition of a Media Access Con-

trol (MAC) circuit and layer at the input of the driver as well as its MAC coun-

terpart on the receiving end.


References

[1] P. Westergaard, S.P. Voinigescu, T.O. Dickson, “A 1.5-V, 20/30-Gb/s CMOS
Backplane Driver with Digital Pre-emphasis,” Proc. IEEE Custom Integrated
Circuits Conference, pp. 23-26, Orlando, FL, Oct. 2004

[2] S. Galal, B. Razavi, “40Gb/s Amplifier and ESD protection Circuit in
0.18-um CMOS Technology,” IEEE ISSCC Digest, pp. 480-481, 2004

[3] D. Kehrer, H.D. Wohlmuth, “40 Gb/s 2:1 Multiplexer and 1:2 Demultiplexer
in 120 nm CMOS,” IEEE ISSCC Digest, pp. 345-346, 2003

[4] J. Liu, X. Lin, “Equalization in high-speed communication systems,” IEEE
Circuits and Systems Magazine, Volume 4, Issue 2, pp. 4-17, 2004

[5] S.S. Mohan, M.D.M. Hershenson, S.P. Boyd, T.H. Lee, “Bandwidth extension
in CMOS with optimized on-chip inductors,” IEEE Journal of Solid-State
Circuits, Volume 35, Issue 3, pp. 346-355, March 2000

[6] M. Green, “Current-controlled CMOS circuits with Inductive broadbanding,”
U.S. Patent 6,525,571, Filed Sept. 26, 2001

[7] S. Ramo, J.R. Whinnery, T. Van Duzer, Fields and Waves in Communication
Electronics, 3rd Ed. New York: John Wiley & Sons, 1994

[8] T.O. Dickson, R. Beerkens, S.P. Voinigescu, “A 2.5-V, 45-Gb/s Decision
Circuit Using SiGe BiCMOS Logic,” IEEE Journal of Solid-State Circuits,
Volume 40, Issue 4, pp. 994-1003, April 2005

[9] B. Razavi, Design of Analog CMOS Integrated Circuits, 1st Ed. New York:
McGraw-Hill, 2001

[10] A. Sedra, K. Smith, Microelectronic Circuits, 4th Ed. New York: Oxford
University Press, 1998

[11] S.P. Voinigescu, T.O. Dickson, R. Beerkens, I. Khalid, P. Westergaard, “A
Comparison of Si CMOS, SiGe BiCMOS, and InP HBTs Technologies for High-Speed
and Millimeter-wave ICs,” Si Monolithic Integrated Circuits in RF Systems,
pp. 111-114, Atlanta, GA, Sept. 2004

[12] R. Ranfft, H.M. Rein, “High-speed bipolar logic circuits with low power
consumption for LSI - a comparison,” IEEE Journal of Solid-State Circuits,
Volume 17, Issue 4, pp. 703-712, Aug. 1982

[13] D.S. McPherson, F. Pera, M. Tazlauanu, S.P. Voinigescu, “A 3V fully
differential distributed limiting driver for 40-Gb/s optical transmission
systems,” IEEE Journal of Solid-State Circuits, Volume 38, Issue 9,
pp. 1485-1496, Sept. 2003

Appendix A: Spice file for inductor parameter extraction

.option acct nomod post=2 probe

.net v(p2) vin rout=50 rin=50

vin p1 0 AC 1

L p1 3 Ls
R 3 p2 Rs
Cs1 p1 1 Cp1
Cs2 p2 2 Cp2
Rs1 1 0 Rsub1
Rs2 2 0 Rsub2
Csub1 1 0 Csub1
Csub2 2 0 Csub2
Cbr p1 p2 Cbr

.param
+ Ls = 0.407n
+ Rs = 6.9
+ Rsub1 = 5240
+ Rsub2 = 691
+ Cp1 = 4.67f
+ Cp2 = 5.31f
+ Csub1 = OPT1(0.1p, 0.0001p, 10p)
+ Csub2 = OPT1(0.1p, 0.0001p, 10p)
*+ Csub1 = 5.72f
*+ Csub2 = 4.70f
+ Cbr = OPT1(30f, 0.0001p, 10p)

.AC data=measured optimize=opt1
+ results=comp1,comp2,comp3,comp4,comp5,comp6,comp7,comp8
+ model=converge
.model converge opt relin=1e-4 relout=1e-4 close=10 itropt=30
.measure ac comp1 err1 par(y11r) y11(r)
.measure ac comp2 err1 par(y11i) y11(i)
.measure ac comp3 err1 par(y12r) y12(r)
.measure ac comp4 err1 par(y12i) y12(i)
.measure ac comp5 err1 par(y21r) y21(r)
.measure ac comp6 err1 par(y21i) y21(i)
.measure ac comp7 err1 par(y22r) y22(r)
.measure ac comp8 err1 par(y22i) y22(i)

.ac data=measured
*.ac lin 75 5e9 79e9

.plot ac y21(m) y11(m) y21(db)

.print par(y11r) y11(r) par(y11i) y11(i)

.print y11(r) y11(i) y11(m) y11(p)
.print y22(r) y22(i) y22(m) y22(p)
.print y12(r) y12(i) y12(m) y12(p)
.print y21(r) y21(i) y21(m) y21(p)

.print ac L12=par('y12(i)/((6.28*FREQ)*(y12(m)*y12(m)))')
.print ac Leff=par('-y11(i)/((6.28*FREQ)*(y11(m)*y11(m)))')
.print ac Q=par('-y11(i)/y11(r)')
.print ac cox=par('(y11(i) + y12(i))/(6.28*FREQ)')

*.print par(s11r) s11(r) par(s11i) s11(i)
*.print par(s12r) s12(r) par(s12i) s12(i)
*.print par(s21r) s21(r) par(s21i) s21(i)
*.print par(s22r) s22(r) par(s22i) s22(i)
*.print z11(r) z11(i) z11(m) z11(p)
*.print z22(r) z22(i) z22(m) z22(p)
*.print z12(r) z12(i) z12(m) z12(p)
*.print z21(r) z21(i) z21(m) z21(p)

.data measured
FREQ Y11r Y11i Y12r Y12i Y21r Y21i Y22r Y22i
***PLACE ASITIC SIMULATION DATA HERE***

*.param freq=100MEG,s11m = 0 , s11p = 0, s12m = 0, s12p = 0, s21m =0,
*+s21p =0, s22m =0 , s22p = 0
.end
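The Leff and Q expressions printed by the Spice file above can be sanity-checked outside the simulator. The following Python sketch is illustrative only and is not part of the original appendix: it assumes an ideal series R-L branch (no substrate or bridging capacitance), uses the Ls and Rs starting values from the .param block, and picks an arbitrary 10 GHz test frequency. The function names are hypothetical. Note that the Spice file approximates 2π as 6.28, whereas the sketch uses the exact value.

```python
import math

def series_rl_admittance(freq_hz, l_henry, r_ohm):
    """Admittance of an ideal series R-L branch (parasitic capacitances ignored)."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / complex(r_ohm, omega * l_henry)

def effective_inductance(y11, freq_hz):
    """Leff = -Im(y11) / (2*pi*f*|y11|^2), mirroring the '.print ac Leff' line."""
    omega = 2.0 * math.pi * freq_hz
    return -y11.imag / (omega * abs(y11) ** 2)

def quality_factor(y11):
    """Q = -Im(y11) / Re(y11), mirroring the '.print ac Q' line."""
    return -y11.imag / y11.real

# Assumed test point: Ls and Rs from the .param block, evaluated at 10 GHz.
freq = 10e9
y11 = series_rl_admittance(freq, 0.407e-9, 6.9)
leff = effective_inductance(y11, freq)
q = quality_factor(y11)
print(f"Leff = {leff * 1e12:.1f} pH, Q = {q:.2f}")
```

For a pure series R-L branch the Leff expression returns the series inductance exactly and Q reduces to ωL/R, which is a quick way to confirm that the sign conventions in the .print lines are self-consistent.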



Acknowledgements

The author would graciously like to thank his supervisor and mentor Profes-

sor Sorin Voinigescu for his inspiration, technical and personal insight and

unwavering support throughout this project. The author would also like to

specially thank Timothy O. Dickson, a fellow graduate student, for his on-

going technical input, altruism and camaraderie throughout this thesis.

Experimental results would not have been possible without Timothy’s contri-

butions. Furthermore, the author acknowledges that this thesis would not

have been possible without the valuable technical and financial contribution

of the Ottawa, Ontario office of ST Microelectronics. Special contributing

members of the ST Microelectronics technical staff who are owed special grat-

itude include Rudy Beerkens, Boris Prokes, Imran Khalid and Steve McDow-

ell.

Contents

List of Tables
List of Figures
1 Introduction
1.1 Motivation
1.2 Objectives and scope
1.3 Organization
2 Background
2.1 Review of pre-emphasis
2.2 Inductive peaking
2.3 Hazardous relative placement of load inductors
2.4 Second order effects in deep submicron MOSFETs
3 Circuit Design
3.1 Driver overview
3.2 Biasing a MOS CML gate for optimal speed
3.3 Biasing a differential CMOS stage for low-noise
3.4 Input matching network and low-noise comparator
3.5 Eye-crossing control
  3.5.1 Transistor sizing and biasing in the eye-crossing control circuit
3.6 Output driver and parallel current summation
3.7 Digital pre-emphasis
4 Simulation Results
4.1 S-Parameter simulations
4.2 Time-domain simulations
5 Physical Implementation
5.1 Inductor design and model extraction
  5.1.1 Model extraction using ASITIC
  5.1.2 Inductor realization and isolation
5.2 Layout and fabrication
6 Experimental Results
6.1 Test environment
6.2 S-Parameters and noise figure
6.3 Time-domain measurements
6.4 Performance summary
7 Conclusions
7.1 Summary
7.2 Future work
References
Appendix A: Spice file for inductor parameter extraction

List of Tables

Table 1: Circuit features categorized by design requirement
Table 2: Performance metrics for shunt peaking [5]
Table 3: Physical dimensions of each inductor
Table 4: Simulated inductor parasitic values
Table 5: Simulated and measured circuit parameters

List of Figures

Fig. 1: Transmitter pre-emphasis and backplane trace transfer characteristic
Fig. 2: A 4-tap FIR filter for transmitter pre-emphasis
Fig. 3: Time-domain pre-emphasis overshoot and undershoot
Fig. 4: Simple common source amplifier with resistive loading
Fig. 5: Common source amplifier with shunt peaking
Fig. 6: Potentially hazardous placement of load inductor
Fig. 7: Proper placement of inductive load
Fig. 8: Transconductance of a 130-nm nMOSFET vs. gate voltage [8]
Fig. 9: MOS differential pair
Fig. 10: Driver block diagram
Fig. 11: Constant peak fT current density over four technology nodes
Fig. 12: fT and NFMIN vs. current density for a 130-nm n-MOSFET with 2-µm unit finger width biased at VDS = 1 V
Fig. 13: Input bias and matching network and comparator
Fig. 14: Eye-crossing control circuit and intermediate signals
Fig. 15: Summation of output currents across output resistor
Fig. 16: Output driver transistor-level schematic
Fig. 17: Digital pre-emphasis block diagram
Fig. 18: Pre-emphasis waveforms and transfer function
Fig. 19: NMOS digital differentiator schematic
Fig. 20: Simulated S22 and S11 of complete driver circuit to 60 GHz
Fig. 21: Simulated single-ended S21 of entire driver circuit
Fig. 22: Simulated single-ended S21 of driver with output amplitude reduced using amplitude control
Fig. 23: Simulated S21 with output peaking enabled
Fig. 25: 25 Gb/s eye-diagrams, 2^7-1 PRBS: single-ended input 60 mVpp; differential output 180 mVpp per side
Fig. 26: 30 Gb/s eye-diagrams, 2^7-1 PRBS: single-ended input 140 mVpp; differential output 300 mVpp per side
Fig. 27: Simulated eye-crossing control at 20 Gb/s (a) 50%, (b) 66%, and (c) 33%
Fig. 28: Output amplitude control at 20 Gb/s
Fig. 29: 20 Gb/s output eye diagram with 316 mVpp swing per side and +/- 16% pre-emphasis
Fig. 30: 20 Gb/s output eye diagram with 300 mVpp swing per side and +33%/-25% pre-emphasis
Fig. 31: Inductor single-frequency Π-model
Fig. 32: Inductor lumped element broadband model
Fig. 33: On-chip 900 pH inductor with 44 µm diameter
Fig. 34: Full chip photograph
Fig. 35: Magnified photograph of main path layout
Fig. 36: Magnified photograph of the parallel path layout
Fig. 37: S-Parameter test setup
Fig. 38: Eye-diagram test setup
Fig. 39: Measured single-ended S21
Fig. 40: Measured single-ended S22 and S11
Fig. 41: Measured and simulated driver noise figure
Fig. 43: 25 Gb/s output eye diagram with 50% eye crossing
Fig. 44: 30 Gb/s output eye-diagram with 260 mVpp per side for a single-ended 200 mVpp, 2^31-1 input PRBS
Fig. 45: Eye-crossing control at 20 Gb/s (a) 50%, (b) 70%, and (c) 30%
Fig. 46: Output amplitude control at 20 Gb/s; output of (a) 190 mVpp and (b) 350 mVpp per side
Fig. 47: 20 Gb/s output eye diagram with 300 mVpp swing per side and pre-emphasis

1 Introduction

1.1 Motivation

Serial inter-chip communication is gaining widespread acceptance over paral-

lel architectures because congested printed circuit board (PCB) routing and

pad-limited silicon dice are not cost efficient in commodity designs. To mini-

mize the overall circuit area required for a serial transmitter/receiver pair,

equalization can be performed at the transmitter instead of at the receiver, in

which case it is known as pre-equalization. At the transmitter, pre-equalizers
alter the waveform to account for the low-pass response of the interconnect.
Historically, pre-emphasis has been achieved using either clocked flip-flops
with step-delayed current summation or analog differentiators. The latter
permit only amplitude control of the pulse, precluding control of the pulse
width. The former implementation, while having the necessary control
mechanisms and efficacy, places severe strain on device technology, as the
required flip-flop typically operates at twice the frequency of the driver
itself.

Even though 40 Gb/s CMOS amplifiers [2], demultiplexers and multiplexers

[3] have been recently reported, demonstrating the high-speed potential of

standard CMOS technology, they suffer from limited dynamic range due to

poor sensitivity and modest output swings of about 100 mVpp per side.

This thesis presents the first published CMOS driver with duty-cycle,
amplitude, and pre-emphasis control that operates at data rates exceeding
20 Gb/s. The driver achieves over 30 dB of dynamic range. It includes a novel
passive-element-free differentiator that enables control of both the amplitude
and the width of the pre-emphasis pulse.

1.2 Objectives and scope

The prime objective of this thesis was a fabricated high-speed, pre-emphasis

enabled, output driver for applications in serial inter-chip communications

over backplanes. The scope of the thesis was the theoretical derivation, com-

puter-aided design, simulation, fabrication and experimental characterization

of the circuit. The required features of the design in order to achieve its tar-

geted application were as follows:

Table 1: Circuit features categorized by design requirement

Design Requirement: Corresponding Circuit Feature

High Bandwidth:
• Inductive peaking at each signal path stage
• High gain (> 20 dB) at 20 Gb/s operation
• Positive gain at 30 Gb/s operation

Low Power:
• 1.5 V power supply
• 150 mW dissipation at 20 Gb/s operation

High Signal Integrity and Sensitivity:
• Input and output matching to 50 Ω up to 50 GHz
• Differential signalling
• Symmetrical layout
• 20 mVpp input sensitivity

Signal Shaping and Control:
• 30% to 70% pulse-width control
• 200-700 mVpp differential output swing control
• Pre-emphasis spike width and height control

Novel Implementation:
• Full CMOS implementation
• Clock-free pre-emphasis circuit with non-traditional circuit design

1.3 Organization

The thesis is organized as follows. Chapter 2 discusses the background of pre-

emphasis, inductive peaking, and hazardous relative placement of load induc-

tors. Chapter 3 details the concept, design and biasing of the individual circuit

elements in the driver. Insights into transistor sizing for optimal speed based

on a current-density centric approach as well as minimum noise are provided.

In Chapter 4, the pre-layout simulation results of the driver are presented.

Chapter 5 summarizes the physical implementation of the entire circuit.

Chapter 6 offers an overview of the experimental results. Conclusions are

given in Chapter 7.


2 Background

2.1 Review of pre-emphasis

In high-speed circuit applications in which high-frequency signals are sent

over backplane channels, there are two types of equalization: transmitter pre-

emphasis and receiver equalization [4]. Both are intended to either emphasize

the high-frequency components or de-emphasize the low-frequency components
of the transmitted signal, in order to compensate for the low-pass transfer
characteristics of the channel. The transfer function of both types of
equalizer is ideally high-pass, though in practice it is band-pass, for three
reasons: (i) practical semiconductor devices cannot achieve infinite
bandwidth; (ii) amplification of high-frequency noise must be avoided; and
(iii) regulated electromagnetic interference (EMI) specifications must be met.

Pre-emphasis is achieved at the transmitter side by increasing the high-fre-

quency components. Fig. 1 shows the mechanism in which ideal transmitter

pre-emphasis compensates for the low-pass transfer characteristics of the

backplane trace.

Fig. 1: Transmitter pre-emphasis and backplane trace transfer characteristic

A common practical pre-emphasis circuit implementation is a Finite

Impulse Response (FIR) filter. Fig. 2 shows the block diagram of a 4-tap FIR

filter with a single Data input, delay elements D and tap coefficients C1, C2,

C3, C4. The tap coefficients adjust the gain at each multiplier independently to

produce the output voltage across the load resistors of value R. The output is a

frequency shaped version of the Data input in the form of amplified high-fre-

quency components.

[Fig. 1 plot labels: backplane trace transfer characteristics; transmitter pre-emphasis response; horizontal axis: frequency (Hz)]

Fig. 2: A 4-tap FIR filter for transmitter pre-emphasis

In the time domain, the FIR filter performs a differentiation function. As

shown with the dotted lines in Fig. 3, the waveform of transmit pre-emphasis

appears as overshoot and undershoot in the time-domain.
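The overshoot and undershoot produced by an FIR pre-emphasis filter can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the circuit of Fig. 2: it uses a symbol-spaced two-tap filter with hypothetical coefficients (1.0 and -0.25), bipolar NRZ symbols of ±1, and it holds the first symbol as the filter history before the start of the sequence. The function name is illustrative.

```python
def fir_pre_emphasis(symbols, taps):
    """Apply an FIR filter: out[n] = sum_k taps[k] * symbols[n - k].

    Samples before the start of the sequence are assumed equal to the
    first symbol, so the output starts in a settled state.
    """
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, coeff in enumerate(taps):
            idx = max(n - k, 0)  # hold the first symbol as history
            acc += coeff * symbols[idx]
        out.append(acc)
    return out

# Hypothetical data pattern and tap weights, chosen only for illustration.
bits = [0, 0, 1, 1, 1, 0, 0]
symbols = [1.0 if b else -1.0 for b in bits]
shaped = fir_pre_emphasis(symbols, taps=[1.0, -0.25])
print(shaped)
```

With these assumed taps, the symbol immediately after each transition is larger in magnitude than the settled level, which is the time-domain overshoot and undershoot described above. Adding further taps changes how long the emphasis lasts.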

[Fig. 2 labels: Data input; delay elements D; tap coefficients C1, C2, C3, C4; load resistors R]

Fig. 3: Time-domain pre-emphasis overshoot and undershoot

Other circuits that perform similar differentiation functions are passive RC

differentiators and inductively-loaded differential amplifiers. Unfortunately,

there exist drawbacks in each of these three differentiator implementations.

For FIR filters, a clock of at least twice the frequency of the data is required

to trigger the delay elements D in Fig. 2. The width of the pre-emphasis spike

is inversely proportional to the frequency of this clock. Hence, for 20 Gb/s (10

GHz) signals, a minimum 20 GHz clock signal is needed, requiring a very well

designed clock recovery circuit and 20 GHz flip-flops. The RC passive-element
differentiator is a more viable solution in that it does not require the input
clock or flip-flops; however, the width of the pre-emphasis spike in this case
is not controllable. This results in non-ideal pre-equalization that cannot
fully compensate for the effects of the channel. Finally, inductively loaded
differential amplifiers offer control over neither the width nor the height of
the pre-emphasis pulse and, more importantly, can easily produce output
ringing due to resonant effects. The pre-emphasis employed in the presented
driver differs from all three methods outlined above in that: (i) no passive L
or C elements are used; (ii) no clock is required; and (iii) the pulse width
and height can be independently controlled.

[Fig. 3 labels: vertical axis: voltage; horizontal axis: time; annotation: pre-emphasis overshoot and undershoot]

2.2 Inductive peaking

The theory of inductive peaking or broad-banding is well-documented [5][6]. A

brief review and implications for the driver design will be presented here.

Inductive series and shunt peaking are techniques that can be used to extend
the 3-dB bandwidth of an amplifier without expending additional power. The
following explanation focuses on shunt inductive peaking, as it is applied in
the driver design.

Fig. 4: Simple common source amplifier with resistive loading

Fig. 5: Common source amplifier with shunt peaking


Fig. 4 illustrates a common source amplifier with an ideal resistor and
capacitive load. For simplicity, we assume that the small-signal frequency
response of the amplifier is determined by a single dominant pole, which is
set solely by the output load resistance RL and the load capacitance C:

\[ \frac{V_{out}}{V_{in}}(\omega) = \frac{g_m R_L}{1 + j\omega C R_L} \qquad (1) \]

The introduction of an inductance L in series with the load resistance, as
shown in Fig. 5, alters the frequency response of the amplifier. This
technique, known as shunt peaking, increases the bandwidth of the amplifier by
transforming the frequency response from that of a single pole to one with two
poles and a zero:

\[ \frac{V_{out}}{V_{in}}(\omega) = \frac{g_m (R_L + j\omega L)}{1 + j\omega R_L C - \omega^2 L C} \qquad (2) \]

The poles may or may not be complex. The zero is determined solely by the
L/RL time constant and is primarily responsible for the bandwidth improvement.
In addition, the frequency response of this amplifier is characterized by the
ratio of the L/RL and RLC time constants. This ratio is denoted by
m = L/(RL²C). Solving for the inductance gives L = mRL²C.

It can be shown [5] that bandwidth extension is possible to varying degrees
by adjusting m. As expected, the 3-dB bandwidth of the shunt-peaked amplifier
increases as m increases over the tabulated range. Table 2 shows the
normalized 3-dB bandwidth extension factor as a function of m. The maximum
bandwidth occurs for m = 0.71

and yields an 85% improvement in bandwidth. However it is accompanied by

a significant amount of gain peaking which is undesirable for broadband

amplifiers used in fibre optic or backplane applications. A maximally flat

response is observed for m = 0.41 while still improving bandwidth by 72%.

Finally, although a value of m = 0.32 does not result in the same bandwidth

improvement as the other two non-zero values of m shown in the table, it

exhibits the most linear phase response up to the 3-dB bandwidth [5]. This

value of m, called the optimum group delay value, is desirable for optimizing

pulse fidelity in broadband systems that transmit digital signals.

The optimum group delay value, which still results in a respectable 60 per-

cent increase in bandwidth, is best suited for the design of the broadband dig-

ital signal driver.
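The bandwidth extension factors quoted above can be reproduced approximately from the two-pole, one-zero response of equation (2). The Python sketch below is an illustrative check, not part of the original analysis. It normalizes frequency as x = ωRLC, substitutes L = mRL²C, and solves |H|² = 1/2 in closed form; the function name is hypothetical.

```python
import math

def normalized_bandwidth(m):
    """Return the 3-dB frequency of equation (2), normalized to 1/(R_L*C).

    With x = w*R_L*C and L = m*R_L^2*C, the response normalized to its
    dc gain satisfies |H|^2 = (1 + m^2 x^2) / ((1 - m x^2)^2 + x^2).
    Setting |H|^2 = 1/2 gives a quadratic in y = x^2:
        m^2 y^2 + (1 - 2m - 2m^2) y - 1 = 0
    """
    if m == 0:
        return 1.0  # single-pole case: w_3dB = 1/(R_L*C)
    a = m * m
    b = 1.0 - 2.0 * m - 2.0 * m * m
    y = (-b + math.sqrt(b * b + 4.0 * a)) / (2.0 * a)  # positive root only
    return math.sqrt(y)

for m in (0.0, 0.32, 0.41, 0.71):
    print(f"m = {m:.2f}: normalized 3-dB bandwidth = {normalized_bandwidth(m):.2f}")
```

The computed factors should land close to the values reported in [5] for the maximally flat and maximum-bandwidth cases. The value obtained for m = 0.32 comes out slightly below the rounded 1.60 figure, so the sketch should be read as a consistency check rather than a replacement for the published table.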

Table 2: Performance metrics for shunt peaking [5]

Factor (m)   Normalized ω3dB   Response
0            1.00              No shunt peaking
0.32         1.60              Optimal group delay
0.41         1.72              Maximally flat
0.71         1.85              Maximum bandwidth

An implicit benefit of using inductive peaking is the enhanced freedom in the
power-bandwidth trade-off. This improvement can be demonstrated by first
examining the key equations for a non-inductively loaded amplifier (such as
the one introduced in Fig. 4). The bandwidth of the amplifier is dominated by
the output pole:

\[ BW = \frac{1}{2\pi R_L C} \qquad (3) \]

The value of the tail current Itail is also determined by the desired output
swing and the load resistance, especially in the design of switching
inverters:

\[ I_{tail} = \frac{\Delta V_{swing}}{R_L} \qquad (4) \]

where ∆Vswing is the voltage swing on the output node of the inverter and RL
is the load resistance of the inverter.

The advantage of shunt inductive peaking becomes more obvious when the
bandwidth and power of an amplifier are examined together. The bandwidth of an
amplifier is given by equation 3, while the power consumption is directly
proportional to Itail for a given power supply. The goal is to increase the
bandwidth and minimize the power consumption (and hence Itail) of an
amplifier. This places contradictory requirements on RL: equation 3 requires a
small value for RL, whereas equation 4 requires a large one.

Inductive peaking allows the circuit designer to increase the value of RL to
reduce overall power consumption while simultaneously increasing bandwidth
with the introduction of a load peaking inductor. Since Table 2 shows that
shunt peaking can increase the bandwidth of an amplifier by 60% while still
maintaining a linear phase response, a possible trade-off is to increase the
resistance RL by 30%, decrease the tail current Itail by a similar amount
(a factor of 1/1.3), and introduce a load inductor. With m = 0.32 from
Table 2, the value for the inductor is:

\[ L = \frac{C R_L^2}{3.1} \qquad (5) \]

From equation 3, the increased resistance RL will decrease the bandwidth by
a factor of 1/1.3, but the inductive peaking will increase the newly reduced
bandwidth by 60%. Overall, the inductively peaked circuit will have both
higher bandwidth and lower power than the original resistively-loaded circuit.
Explicitly, the inductively peaked circuit will have bandwidth:

\[ BW_{peaked} = BW_{resistive} \times \frac{1.6}{1.3} = 1.23\,BW_{resistive} \qquad (6) \]

and the tail current of the inductively peaked circuit will be:

\[ I_{tail(peaked)} = \frac{I_{tail(resistive)}}{1.3} = 0.77\,I_{tail(resistive)} \qquad (7) \]

This results in a 23% gain in bandwidth with a simultaneous 23% decrease in
power consumption.

There are a few compromises involved in this optimization of the
power-bandwidth product via the introduction of an on-chip inductor. First, the
added die area expense at each inverter stage can be relatively large, with the

inductor usually occupying more area than the resistively-loaded inverter

stage alone. Secondly, deterministic jitter can occur leading to deleterious

results if an inductor is realized with a larger than simulated value. In this

case, jitter is a result of undesired, and more importantly, uncontrollable

peaking and signal distortion. In a circuit with multiple sequential gain

stages, each with over-sized peaking inductors, the resultant signal distortion

becomes catastrophic. Electro-magnetic field solvers that simulate the induc-

tances of on-chip planar and stacked spiral inductors must be verified experi-

mentally with fabricated test-structures before the final circuit is fabricated.

This pre-verification methodology was employed in the inductor design.
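The arithmetic behind the power-bandwidth trade-off of equations (3) to (7) can be checked with a short Python sketch. It is illustrative only: the function names are hypothetical, and the load resistance and capacitance used for the inductor example (100 Ω and 50 fF) are assumed values that do not come from the thesis.

```python
def peaked_tradeoff(bandwidth_gain, resistance_scale):
    """Return (bandwidth ratio, tail-current ratio) relative to the resistive case.

    bandwidth_gain: extension factor from shunt peaking (1.6 for m = 0.32).
    resistance_scale: factor by which R_L is increased (1.3 in the text).
    """
    bw_ratio = bandwidth_gain / resistance_scale  # equation (6)
    current_ratio = 1.0 / resistance_scale        # equation (7)
    return bw_ratio, current_ratio

def peaking_inductance(m, r_load, c_load):
    """L = m * R_L^2 * C, the defining relation for the peaking factor m."""
    return m * r_load ** 2 * c_load

bw_ratio, current_ratio = peaked_tradeoff(1.6, 1.3)
print(f"bandwidth ratio = {bw_ratio:.2f}, tail-current ratio = {current_ratio:.2f}")

# Assumed example load: R_L = 100 ohm, C = 50 fF (not values from the thesis).
inductance = peaking_inductance(0.32, 100.0, 50e-15)
print(f"peaking inductance = {inductance * 1e12:.0f} pH")
```

The same two functions can be reused to explore other operating points, for example trading the full 60% bandwidth extension for a larger reduction in tail current.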

2.3 Hazardous relative placement of load inductors

The relative placement of the load inductors to the load resistors in the design

of each amplifying stage is of utmost importance. Fig. 6 shows a potentially

hazardous placement of a load inductor in an inductively loaded amplifier

stage. The root of the deleterious effect lies in transmission line theory. As the

driver operates in the high-frequency signalling spectrum, transmission line

theory is applicable.

Transmission lines spatially transform impedance [7]. The impedance of the

voltage supply in the AC case is zero, a short circuit. At high-frequencies, the

inductor length is comparable to that of a quarter-wavelength of the signal on

the inductor wire segment. Hence, the impedance looking into the inductor

may be spatially transformed from the short-circuit of the power supply into

an open circuit. The transformed open circuit will result in instability and/or

oscillatory behavior in the amplifier.


Fig. 6: Potentially hazardous placement of load inductor

Fig. 7: Proper placement of inductive load

It is important to note that, for the proper placement of the inductor in Fig. 7,

spatial impedance transformations still occur. However, the impedance look-

ing into the inductor from transistor M2 is always a finite, non-zero value

because the transformation acts on the finite and non-zero resistive value RL.
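The impedance transformation argument can be made concrete with the standard lossless transmission-line input impedance formula [7]. The Python sketch below is a simplified illustration and not a model of the actual inductor: it treats the inductor winding as a uniform lossless line with a hypothetical characteristic impedance of 70 Ω, and compares a short-circuit termination (inductor tied directly to the supply) with a 50 Ω resistive termination (load resistor between the inductor and the supply).

```python
import math

def input_impedance(z_load, z0, electrical_length_rad):
    """Input impedance of a lossless line of characteristic impedance z0.

    Zin = Z0 * (ZL + j*Z0*tan(bl)) / (Z0 + j*ZL*tan(bl))
    """
    t = math.tan(electrical_length_rad)
    return z0 * (z_load + 1j * z0 * t) / (z0 + 1j * z_load * t)

Z0 = 70.0                   # assumed characteristic impedance of the winding
QUARTER_WAVE = math.pi / 2  # electrical length of a quarter wavelength

# Inductor terminated directly in the supply (ac short circuit).
z_short = input_impedance(0.0, Z0, QUARTER_WAVE)
# Inductor terminated in the load resistor R_L = 50 ohm.
z_resistive = input_impedance(50.0, Z0, QUARTER_WAVE)

print(f"|Zin| with short-circuit termination: {abs(z_short):.3g} ohm")
print(f"|Zin| with 50-ohm termination: {abs(z_resistive):.3g} ohm")
```

At a quarter wavelength the shorted line presents an extremely large impedance, whereas the resistively terminated line presents a finite value of Z0²/RL, consistent with the qualitative argument above.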

2.4 Second order effects in deep submicron MOSFETs

Of special relevance to transistor sizing and biasing in the presented design
is a relatively unfamiliar submicron phenomenon. Electron mobility degradation
due to high vertical electric fields can drastically decrease the performance
of deep-submicron circuits. Fortunately, proper biasing can mitigate these
negative effects. However, few publications on this topic exist, and classical
biasing techniques remain the norm.

Fig. 8: Transconductance of a 130-nm nMOSFET vs. gate voltage [8]

Fig. 8 [8] portrays the deleterious effects on transconductance due to electron

mobility degradation in high-vertical fields. Shown is the transconductance

normalized by width (gm/W) of a 130-nm nMOS transistor as a function of

gate-source voltage. Its shape is similar to that of the fT dependence on VGS

and typical for all deep submicron technologies. The curve exhibits two dis-

tinct regions, the square-law region and the high-vertical field region as

shown. There also exists an intermediate area between the two regions in

which hybrid behaviour is observed.

At low effective gate voltages (VGS < 0.5 V in Fig. 8), the device follows the
classical square-law model and its transconductance varies linearly with VGS.
The equation derived for an n-MOS transistor in saturation (square-law region)
is:

\[ g_m = \mu_n C_{OX} \frac{W}{L} (V_{GS} - V_T) \qquad (8) \]

where COX is the oxide capacitance, W is the gate width, L is the gate length,
and VT is the threshold voltage of the transistor.

At large gate-source voltages, the high electric field developed between the
gate and channel of the transistor confines charge carriers to a narrower
region below the oxide-silicon interface, leading to more carrier scattering
and hence lower mobility. Further, small-geometry devices experience
significantly more mobility degradation [9]. An empirical equation modelling
this effect is [9]:

\[ \mu_n = \frac{\mu_o}{1 + \theta (V_{GS} - V_T)} \qquad (9) \]

where µο denotes the “low-field” mobility and θ is a fitting parameter that
increases with decreasing oxide thickness and hence smaller geometries.

Substituting the expression for the electron mobility µn from (9) into (8)
reveals that when the second term in the denominator of (9) becomes dominant,
the transconductance becomes constant. This effect is observed clearly in the
high-vertical field region of Fig. 8, where VGS > 0.7 V.

It is of special interest to note how the constant transconductance due to
high-vertical fields affects the differential voltage required to completely

switch a MOS differential pair. It will be shown that a larger switching volt-

age, with little or no improvement in transition time, is required when the

MOS pair is biased in the high-vertical field region [8].
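The flattening of transconductance at high gate voltages can be illustrated numerically. The following is a minimal sketch, not the author's device model: it evaluates (8) with the mobility of (9) substituted, and the values of `k` (standing for µo·COX·W/L), `theta` and `vt` are illustrative assumptions rather than extracted 130-nm parameters.

```python
def gm(vgs, vt=0.35, k=1.0e-3, theta=5.0):
    """Transconductance from (8) with the mobility model (9) substituted.

    k stands for mu_o * C_OX * W / L in A/V^2 (assumed value);
    theta (1/V) and vt (V) are likewise illustrative assumptions.
    """
    veff = max(vgs - vt, 0.0)               # effective gate voltage
    return k * veff / (1.0 + theta * veff)  # (8) with mu_n taken from (9)


def gm_limit(k=1.0e-3, theta=5.0):
    """Constant value approached once theta * veff dominates unity."""
    return k / theta


if __name__ == "__main__":
    for vgs in (0.4, 0.5, 0.7, 1.0, 1.5):
        print(f"VGS = {vgs:.1f} V   gm = {gm(vgs) * 1e6:7.1f} uS")
    print(f"limit         gm = {gm_limit() * 1e6:7.1f} uS")
```

For small effective gate voltages the result grows almost linearly with VGS, while for large ones it approaches the constant `k / theta`, which mirrors the two regions of Fig. 8.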

We first derive the differential voltage required to completely switch a MOS

differential pair when biased in the high-vertical field region (VGS > 0.7V in

Fig. 8).

Fig. 9: MOS differential pair

Fig. 9 shows a differential pair of MOS transistors fed by a constant tail cur-

rent Itail. Imagine tail current Itail is fully routed through transistor Q1.

Assuming high-vertical field operation, Itail is:

$$ I_{tail} = I_{DS1} = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{1 + \theta \left( V_{GS1} - V_T \right)} \left( V_{GS1} - V_T \right)^2 \qquad (10a) $$

where transistor Q1 is assumed to be in saturation and equation (9) has been substituted for µn. In high-vertical field operation, when the term θ(VGS1 - VT) becomes dominant relative to unity, equation (10a) becomes:

$$ I_{tail} = I_{DS1} \approx \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( V_{GS1} - V_T \right) \qquad (10b) $$

Equation (10b) shows that IDS1 now exhibits a linear relationship with VGS1. At the instant that Itail is fully routed through Q1, it is evident that VGS2 = VT, the threshold voltage of the transistors, such that IDS2 = 0. Hence, the differential voltage across the gates of the transistors is

$$ \Delta V = V_{GS1} - V_{GS2} = V_{GS1} - V_T \qquad (11) $$

Isolating VGS1 in (11) and substituting into (10b) reveals that

$$ I_{tail} \approx \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( \Delta V + V_T - V_T \right) = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( \Delta V \right) \qquad (12) $$

for Itail at the instant when all the tail current is shifted completely through

Q1. An alternate equation for Itail is derived in the steady state when the tail current is split evenly between the transistors such that:

$$ I_{DS1} = I_{DS2} = \frac{I_{tail}}{2} = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( V_{GS(1,2)} - V_T \right) = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( V_{EFF} \right) \qquad (13) $$

as VGS(1,2) is equal for both Q1 and Q2 in equilibrium. Equating Itail from (12) and (13), we solve for ∆V, the differential voltage required to completely switch the MOS differential pair:

$$ \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( \Delta V \right) = C_{ox} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( V_{EFF} \right) \qquad (14a) $$

$$ \Delta V \geq 2 V_{EFF} \qquad (14b) $$

where the inequality is explicitly shown in (14b) to indicate a minimum differential voltage requirement.

To find the minimum differential voltage to switch the tail current completely through one of the MOS transistors in the square-law region (VGS < 0.5V in Fig. 8), a derivation parallel to equations (10) through (14) is completed. This is shown explicitly in [10]. The resultant required voltage swing for full switching becomes:

$$ \Delta V \geq \sqrt{2} \, V_{EFF} \qquad (15) $$

Hence, in the high-vertical field region, both the scalar multiplier and VEFF

itself are larger, requiring a greater differential voltage and hence transition

time, to switch the differential pair [8].
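The two limiting cases of this analysis can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it uses the drain current of (10a) for both the balanced and the fully switched condition and solves for the required ∆V without approximating the denominator. The values of `k` (standing for µo·Cox·W/L) and `theta` are hypothetical.

```python
import math


def drain_current(veff, k=1.0e-3, theta=0.0):
    """Saturation current with mobility degradation, as in (10a).

    k stands for mu_o * C_ox * W / L (A/V^2); theta is the fitting
    parameter of (9). Both values are illustrative assumptions.
    """
    return 0.5 * k * veff ** 2 / (1.0 + theta * veff)


def switching_ratio(veff_eq, theta, k=1.0e-3):
    """Return delta_V / V_EFF needed to steer all of I_tail into Q1."""
    i_tail = 2.0 * drain_current(veff_eq, k, theta)  # each side carries I_tail/2
    a = 2.0 * i_tail / k
    # Full switching: drain_current(delta_v) == i_tail, which rearranges to
    # delta_v**2 - a*theta*delta_v - a = 0; take the positive root.
    delta_v = 0.5 * (a * theta + math.sqrt((a * theta) ** 2 + 4.0 * a))
    return delta_v / veff_eq


if __name__ == "__main__":
    for theta in (0.0, 1.0, 5.0, 50.0, 1000.0):
        print(f"theta = {theta:7.1f} 1/V   dV/VEFF = {switching_ratio(0.3, theta):.3f}")
```

With `theta = 0` the ratio equals the square-law value of √2, and as `theta` grows it approaches the factor of 2 in (14b); intermediate values of `theta` fall between the two.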

The outcome of this analysis emphasizes that gate-source voltages must be

limited to mitigate the effects of high-vertical fields on electron mobility. This

is accomplished through proper transistor sizing and tail current selection.


3 Circuit Design

3.1 Driver overview

Fig. 10 shows the block diagram of the backplane driver, highlighting the four sections of the circuit. The pre-emphasis path is placed in parallel with the main signal path, and the currents from both output stages are summed across the 50 Ω load resistors to develop the differential output voltage. The parallel path consists of delay buffers and the digital differentiator circuit. The output swing is adjusted through the tail current of the output driver, while the height of the pre-emphasis is controlled by the relative tail current of the digital differentiator.

Fig. 10: Driver block diagram


3.2 Biasing a MOS CML gate for optimal speed

MOS CML logic has only recently been applied to high-speed ICs [6]. Conse-

quently, no systematic design guidelines existed until recently [8]. As such,

MOS CML gates have been biased for optimal speed rather haphazardly using

a voltage-centric approach. We will show through simulation that, for MOS

CML, a current-density centric instead of a voltage-centric design approach

leads to more accurate and reliable circuit design.

Fig. 11: Constant peak ft current density over four technology nodes [8]

In VGS-focussed MOS CML design, the effective gate voltage (VEFF = VGS - VT) at which the peak fT of the MOSFET occurs scales with technology, decreasing with every technology node. This makes it very difficult to predict an optimal bias, both across multiple technologies and within a given technology node. However, as the simulated data collected over four technology nodes (# of fingers x gate length x finger width) in Fig. 11 [8] show, the peak-fT current density (JpfTMOS) remains approximately constant (between 0.25 mA/µm and 0.35 mA/µm depending on VDS) as technology scales. This trend is likely to continue for future MOS generations as a result of the constant field scaling that has been applied from the 0.5-µm technology node [11] downward. Consequently, a current-density centric design approach, similar to that which is commonly employed in bipolar designs [12], is more appropriate for reproducible, accurate, high-speed

design of MOS CML circuits. In a current-density centric design scenario, the gate width of the MOSFET is sized such that the device reaches its peak fT when all of the tail current flows through the device:

$$ W_G = \frac{I_T}{J_{pfTMOS}} \qquad (16) $$

In 130-nm technology, this corresponds to a VEFF of around 300 mV. Biasing beyond the peak-fT current density will degrade circuit performance.

For large-signal high-speed circuit biasing in a MOS differential pair, it is recommended that each of the differential pair transistors be biased at half-peak-fT current density in the balanced steady state. This avoids current densities beyond peak fT when, during full switching, all the tail current is routed through one transistor of the differential pair and the current density is momentarily doubled from half-peak fT to peak fT. Referring back to the discussion of Section 2.4, biasing the circuit at half-peak fT has a second positive effect on circuit switching speed. Biasing at half-peak-fT current density instead of full-peak-fT current density permits the differential pair transistors to operate more in the square-law region instead of the slower-switching high-vertical field region.

Based on this observation, each circuit block in the presented driver design

consists of a MOS-CML inverter whose ratio of tail current to differential pair

transistor width is set to correspond to the peak fT bias of the n-channel MOS-

FET of 0.25 to 0.3 mA/µm. This bias scheme is implemented to obtain the

maximum switching speed. Inductive peaking is employed in every stage to

further improve the circuit bandwidth.
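The sizing rule described in this section reduces to a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the 12 mA tail current and 0.3 mA/µm peak-fT current density in the usage example are chosen for demonstration within the range quoted above.

```python
def gate_width_um(tail_current_ma, j_peak_ma_per_um=0.3):
    """Gate width W_G = I_T / J_pfT, so that the device sits at the
    peak-fT current density when it carries the whole tail current."""
    return tail_current_ma / j_peak_ma_per_um


def balanced_density(tail_current_ma, width_um):
    """Current density per transistor (mA/um) when I_T splits evenly."""
    return 0.5 * tail_current_ma / width_um


if __name__ == "__main__":
    width = gate_width_um(12.0)                # 12 mA tail, 0.3 mA/um assumed
    print(f"W_G = {width:.1f} um")
    print(f"balanced density = {balanced_density(12.0, width):.3f} mA/um")
```

Under these assumed numbers each transistor is 40 µm wide and carries 0.15 mA/µm in the balanced state, which is half of the assumed peak-fT current density.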

3.3 Biasing a differential CMOS inverter for low-noise

Fig. 12 shows fT and, on the opposite axis, NFMIN versus current density for a 130-nm n-MOSFET with a 2 µm unit finger width. NFMIN has a minimum at a bias current density of about half the current density of the maximum fT. Earlier it was shown that biasing each transistor in a differential pair at half-peak fT would result in optimal switching speed. It is now shown that, coincidentally, this current-density bias point also results in minimum NFMIN.

Fig. 12: fT and NFMIN vs. current density for a 130-nm n-MOSFET with 2 µm unit finger width biased at VDS = 1V

3.4 Input matching network and low-noise comparator

Fig. 13 illustrates the input matching network and input low-noise compara-

tor. The input differential pair has higher gain and larger tail current than

the other stages in order to reduce the noise by making the optimum noise

impedance of the input stage closer to 50Ω per side. A compromise was

reached between achieving the best possible noise match, which calls for

larger transistor sizes and bias current, and the broadband input impedance

match. On-chip matching resistors, realized as a resistive divider with series

inductors, provide appropriate gate bias for the input transistors and broad-

band input impedance matching.

[Fig. 12 axes: current density (mA per µm width), logarithmic from 10^-2 to 10^0; fT (GHz), 0 to 100; NFMIN @ 10 GHz (dB), 0.0 to 2.0]

Fig. 13: Input bias and matching network and comparator

3.5 Eye-Crossing Control

A key objective of the thesis was the design of a driver with controllable pulse-

width. The application of controllable pulse width is the compensation of DC

offsets that may cause signal distortion. This compensation is used to alter the

duty-cycle (of an input signal) such that a 50% duty-cycle periodic input may

be changed to a 30% or 70% duty-cycle periodic output signal. Conversely, a

DC offset which has imposed an output duty cycle of 30% could be negated to

re-instate the duty-cycle back to 50% as desired.

The circuit of Fig. 14 accomplishes pulse-width control using a technique

found in [13]. The circuit consists of two series inductively-peaked differential

stages with a DC offset control pair connected at the output of the first stage.

By applying a DC voltage Voffset, an offset voltage is developed at the output


node of the first differential pair, shifting the zero-crossing between the two

outputs. Due to the finite rise and fall time of the waveform, and the trunca-

tion by the limiting action of the last inverter, Vout exhibits a change in duty-

cycle as illustrated by the overlaid waveforms of Fig. 14.
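The dependence of duty cycle on the offset can be estimated with a simple idealized model. The sketch below is an assumption-laden illustration, not the author's analysis: it assumes a periodic 1010 pattern, perfectly linear edges, an ideal limiter, and an offset smaller than half the swing. The parameter names and example values are hypothetical.

```python
def duty_cycle(v_offset, swing, t_edge, period):
    """Duty cycle of the limited output for a 1010... input.

    Assumes linear edges lasting t_edge (0 to 100 %) and a waveform that
    swings symmetrically by `swing` around the unshifted crossing point.
    """
    shift = v_offset / swing * t_edge      # crossing-time shift on each edge
    duty = 0.5 - 2.0 * shift / period      # one edge is delayed, the other advanced
    return min(max(duty, 0.0), 1.0)


if __name__ == "__main__":
    # 20 Gb/s 1010 pattern: 100 ps period; 25 ps edges and 0.6 V swing assumed.
    for v_off in (-0.24, 0.0, 0.24):
        print(f"Voffset = {v_off:+.2f} V   duty = {duty_cycle(v_off, 0.6, 25.0, 100.0):.2f}")
```

In this model the achievable range grows with the edge time relative to the bit period, which is consistent with the observation that the effect relies on the finite rise and fall times. The sign convention of the offset is arbitrary here.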

3.5.1 Transistor sizing and biasing in the eye-crossing control circuit

On the left side of Fig. 14, the input pair simply drives a series RL load, no differently than any of the other inductively peaked circuits of this backplane driver. The middle pair of transistors (M3 and M4) must have its DC bias voltage Voffset held constant; otherwise a mixing effect would occur, because the currents drawn by M3 and M4 act on the same signal path as M1 and M2. Hence it must be ensured that the control voltage Voffset comes from a low-noise source.

The third stage acts as a limiting amplifier as previously described. The tail current was chosen to be 12 mA and the differential pair transistors were sized at 32 µm each, so that a current density of 0.19 mA/µm flows in each during steady state. This biasing is in line with the recommendations for half-peak-fT biasing derived in Section 3.2.

Fig. 14: Eye-crossing control circuit and intermediate signal

3.6 Output driver and parallel current summation

The output driver and digital pre-emphasis circuit, in parallel, create the output voltage across the output load resistor. Fig. 15 shows how the output currents of the output driver and the digital differentiator are summed. The ratio of the output current from the output driver (I1 in Fig. 15) to the output current contribution from the digital differentiator (I2 in Fig. 15) determines the percentage of pre-emphasis in the overall output voltage, Vout. The higher this ratio, the lower the percentage of pre-emphasis in Vout.

Fig. 15: Summation of output currents across output resistor

The output driver is shown in Fig. 16 and consists of a simple differential pair

biased with a current mirror. The value of bias current in the output driver

determines the magnitude of output signal swing from the main path.

[Fig. 15 labels: delay buffers and digital differentiator (output current I2); output driver (output current I1); summed current I1 + I2 through Rout, producing Vout]

Fig. 16: Output driver transistor-level schematic

3.7 Digital pre-emphasis

The digital pre-emphasis circuit, whose block diagram is shown in Fig. 17, fea-

tures three delay cells, implemented as inverters, followed by a digital differ-

entiator circuit. The third delay cell is loaded additionally with MOS

varactors connected in parallel with resistive loads in order to control the

delay and, therefore, the pre-emphasis spike width.

Fig. 18 illustrates the waveforms in the digital differentiator. The circuit has two differential inputs (IN and its complement, INdly and its complement) and a differential output (VoutP, VoutN), whose currents are summed with those from the main path across the on-chip 50 Ω load resistors.

[Fig. 16 labels: inputs Vin+ and Vin-; outputs VoutP and VoutN; differential pair M1, M2; Ibias = 2 mA; Iout = 20 mA; Vdd = 1.5 V; load resistors Rload shared with the digital differentiator]

Fig. 17: Digital pre-emphasis block diagram

Fig. 18: Pre-emphasis waveforms and transfer function

The input signals are assumed to be periodic for this example. Input signal IN and its complement are delayed through the delay circuit to produce INdly and its complement, respectively. As shown by the dotted arrows in Fig. 18, the delay between IN and INdly translates into the pre-emphasis pulse width of VoutP. The same relationship holds for the complementary signals and VoutN. The four signals (IN, INdly and their complements) are fed into the digital differentiator circuit, which has two outputs, VoutP and VoutN.

The differentiator functions as a logical XOR gate with one notable exception. Whereas an XOR gate in the classical sense operates within a binary logic system, this logic circuit produces a ternary (three-level) output. Specifically, when IN AND NOT(INdly) is true, the output VoutP rises; when NOT(IN) AND INdly is true, the output VoutP falls; and when neither case is true, the output of the circuit stays in steady state. The output VoutN falls and rises in a mirrored fashion.

The transformation from binary logic at the input of the differentiator to ternary logic at its output is accomplished using current switching. The circuit of Fig. 19 is biased by two constant current sources of equal value, Iswch. The voltages at VoutP and VoutN are set by the voltage drops that the switched currents develop across each 50 Ω load resistor. In the steady state, both VoutP and VoutN are pulled down by an equal current of value Iswch. When VoutP rises (and VoutN drops), the current path pulling down VoutP is cut off, and VoutN is pulled down by a current equal to 2Iswch. Conversely, when VoutP drops (and VoutN rises), VoutP is pulled down by current 2Iswch, and VoutN has its current path to ground cut off.
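The current-switching behaviour described above can be captured in a small behavioural model. This is a sketch under stated assumptions rather than a description of the transistor-level circuit: the polarity (VoutP rises when IN is high and INdly is still low) is assumed, only the on-chip 50 Ω resistors load the outputs, the delay is an integer number of samples, and all function names are hypothetical.

```python
def differentiator_currents(in_bit, in_dly_bit, i_swch):
    """Pull-down currents (I_P, I_N) drawn from VoutP and VoutN."""
    if in_bit and not in_dly_bit:      # assumed polarity: VoutP rises
        return 0.0, 2.0 * i_swch
    if in_dly_bit and not in_bit:      # assumed polarity: VoutP falls
        return 2.0 * i_swch, 0.0
    return i_swch, i_swch              # steady state: equal pull-down


def differential_output(bits, delay, i_swch=5e-3, r_load=50.0):
    """VoutP - VoutN per sample for a bit list delayed by `delay` samples."""
    out = []
    for n, bit in enumerate(bits):
        dly = bits[n - delay] if n >= delay else bits[0]
        i_p, i_n = differentiator_currents(bit, dly, i_swch)
        out.append((i_n - i_p) * r_load)   # each side sits at VDD - I * R
    return out


if __name__ == "__main__":
    print(differential_output([0, 0, 1, 1, 1, 1, 0, 0, 0], delay=2))
```

The output is zero while the input is static, positive for `delay` samples after a rising edge and negative for `delay` samples after a falling edge, so the delay sets the pulse width and `i_swch` sets the pulse height.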


Fig. 19: NMOS digital differentiator schematic

In Fig. 19, the pre-emphasis height is controlled by the two constant current

sources Iswch, whose value is adjustable between 0 and 10 mA. For matching,

transistors M3 and M6 compensate the VDS drop across transistors M1, M2

and M4, M5, respectively.


4 Simulation Results

4.1 S-Parameter simulations

The small signal S-parameters were simulated in a single-ended input and

output configuration. The input and output return loss are better than -5 dB

up to 60 GHz, as shown in Fig. 20, with S11 achieving -10dB up to 50 GHz.

The single-ended simulated small signal gain is 14.5dB, confirmed by the eye

diagram measurements of Fig. 21. Further, Fig. 20 shows that the 3dB-fre-

quency of the driver is 8.5 GHz and the driver has gain (> 0dB) up to 24.1

GHz.

Fig. 20 Simulated S22 and S11 of complete driver circuit to 60 GHz


Fig. 21 Simulated single-ended S21 of entire driver circuit

With the driver’s output amplitude control set to a low level, the resulting

simulated S21 is as shown in Fig. 22. The output amplitude is controlled by an

off-chip current source and can be manipulated to increase or decrease the low

frequency gain peak. The decreased gain gives an extended 3dB-bandwidth

and 0dB crossing of 9.3 GHz and 37.6 GHz, respectively.

Fig. 22 Simulated single-ended S21 of driver with output amplitudereduced using amplitude control

S21 simulations with output peaking enabled are shown in Fig. 23. The peaking is evident in the simulation, increasing the maximum small-signal gain from 14.1 dB up to 16.5 dB and the 3-dB frequency to 13.3 GHz. The frequency shaping effect of the pre-emphasis output circuit is exemplified here.

Fig. 23: Simulated S21 with output peaking enabled

4.2 Time domain simulations

Time domain eye-diagram simulations were performed on the full driver circuit with a 2^7-1 PRBS (pseudo-random binary sequence) generator. In Fig. 24, a 20 Gb/s input signal was applied single-ended and the unused input was terminated with a 50 Ω resistance connected to VDD.
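A 2^7-1 PRBS stimulus of the kind used in these simulations can be produced with a seven-stage linear feedback shift register. The sketch below is illustrative: the report does not state which generator polynomial was used, so the common choice x^7 + x^6 + 1 is an assumption, as are the function name and seed.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a 2^7 - 1 PRBS using x^7 + x^6 + 1 (assumed)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # taps at stages 7 and 6
        state = ((state << 1) | new_bit) & 0x7F       # shift and feed back
        bits.append(new_bit)
    return bits


if __name__ == "__main__":
    sequence = prbs7(127)
    print("".join(str(b) for b in sequence))
```

The sequence repeats every 127 bits. At 20 Gb/s each bit occupies 50 ps, so one full pattern spans 6.35 ns of simulated time.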



Fig. 25 and Fig. 26 show simulated eye-diagrams at 25 Gbps and 30 Gbps,

respectively. The circuit exhibits reduced sensitivity at 25 Gbps and 30 Gbps,

requiring, respectively, 60 mV and 140 mV single-ended inputs for similar eye

openings.



Fig. 27 demonstrates the simulated eye-crossing control performance at 20

Gb/s taken at the output of the driver. The eye-crossing control for (a) 50%, (b)

66%, and (c) 33% was performed by varying the control voltage, which was connected to off-chip voltage sources (Voffset of Section 3.5.1).

Fig. 27: Simulated eye-crossing control at 20 Gb/s (a) 50%, (b) 66%, and (c)33%





Fig. 28: Output amplitude control at 20 Gb/s; input of 20 mVpp applied to a single end; output of (a) 21 mVpp and (b) 220 mVpp per side

Simulated waveforms with varying levels of pre-emphasis at 20Gb/s are

shown in Fig. 29 and Fig. 30. By altering the current bias of the parallel digi-

tal pre-emphasis path, the pre-emphasis of the overall driver can be controlled

independently of the main path of the driver. Fig. 29 shows a symmetric +/-

16% overshoot/undershoot ratio, relative to the voltage swing of the main-

path signal. Fig. 30 shows an asymmetric +33%/-25% overshoot/undershoot

ratio. The asymmetry in Fig. 30 is attributed to the systemic limitation of the

pre-emphasis circuit, in that the maximum output voltage cannot exceed VDD

while the minimum value must be higher than 3*VDS(sat) due to the triple-

stacked NMOS configuration of Fig. 17. Hence, the ratio of overshoot/under-

shoot relative to the isolated main-path signal swing is directly related to the


pre-emphasis bias current. A lower digital pre-emphasis current results in a

symmetric overshoot/undershoot ratio as shown in Fig. 29, but the relative

percentage of overshoot/undershoot reduces to less than 25%.

Fig. 29: 20Gb/s output eye diagram with 316 mVpp swing per side and +/-16% pre-emphasis

Fig. 30: 20 Gb/s output eye diagram with 300 mVpp swing per side and +33%/-25% pre-emphasis


5 Physical Implementation

5.1 Inductor design and model extraction

The computer-aided design (CAD) tool ASITIC (http://rfic.eecs.berkeley.edu/~niknejad/asitic.html) was used in the simulation and sizing of the inductors in the driver. ASITIC is a three-dimensional field solver which aids the RF circuit designer in the optimization and modelling of spiral inductors, transformers, capacitors, and substrate coupling. Test structures previously fabricated and characterized at the University of Toronto have confirmed that ASITIC predicts absolute inductance values with 90-95% accuracy, for both planar and stacked spiral inductors.

All inductors of the driver were simulated as two-port circuits using both the π-model of Fig. 31 and the lumped-element broadband model shown in Fig. 32. In the substrate, the eddy-current induced loss and substrate capacitance are represented by Rsub and Csub, respectively. The oxide capacitance is represented by Cox. The series inductance and resistance of the inductor proper are represented by Ls and Rs, respectively. Finally, the capacitance between the two symmetrical interwoven arms of the inductor is represented by Ciw.

None of the passive lumped elements in the model of Fig. 32 could be omitted to reduce simulation time because none of the inductors was attached directly to a ground node in the design. This design requirement was discussed previously in Section 2.3. Further, as each inductor was employed in a broadband load configuration, design considerations gave priority to self-resonant frequency over quality factor (Q) in the optimization of each inductor.

5.1.1 Model extraction using ASITIC

This section outlines the procedure used to extract both the π-model of the inductor, which is valid only at a single low-frequency point, and the broadband model of the inductor, which is valid at frequencies below self-resonance.

The π-model of the inductor is shown in Fig. 31 and is valid only at a single-

frequency. The ASITIC three-dimensional field-solver is invoked using the

command pix to resolve the circuit parameters Ls, Rs, Cox1, Cox2, Rsub1, and

Rsub2 from an inductor layout drawn in the ASITIC graphical tool. These val-

ues are assumed correct at all frequencies below self-resonance for the induc-

tor in question. The broadband model introduces three fitting capacitances

(Ciw, Csub1, Csub2) to account for the broadband frequency response of the

inductor’s parasitics.

The broadband model capacitors are found using a Y-parameter simulation

from 5 GHz up to the self-resonant frequency of the inductor, in 1 GHz steps.

The output of the Y-parameter data is then ported to a SPICE optimization

deck. The optimization deck is included in Appendix A for reference. The

SPICE optimization deck will best fit the three broadband model capacitances

to match the Y-parameter data simulated in ASITIC. As three unknowns are

being optimized for simultaneously, reasonable and educated estimates for

each of the broadband capacitances (Ciw, Csub1, Csub2) are substituted initially.

The SPICE deck in Appendix A will also print L12, Leff and Q vs. frequency.
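The quantities printed by the deck can be related to the Y-parameters with a few lines of code. The sketch below is an assumption, since Appendix A is not reproduced here: it uses commonly used definitions (L12 from -1/Y12, Leff and Q from 1/Y11) and a simplified π-network in which the shunt branch at port 1 is a single capacitor, with Rsub, Csub and Ciw omitted. The element values are taken from the 900 pH inductor data reported in this chapter.

```python
import math


def pi_model_y(freq_hz, ls, rs, c1):
    """Y11 and Y12 of a simplified pi-model: series Rs + Ls, shunt C1 at port 1."""
    w = 2.0 * math.pi * freq_hz
    y_series = 1.0 / complex(rs, w * ls)
    y11 = y_series + complex(0.0, w * c1)
    y12 = -y_series
    return y11, y12


def extract(freq_hz, y11, y12):
    """Return (L12, Leff, Q) using common definitions (assumed here)."""
    w = 2.0 * math.pi * freq_hz
    z12 = -1.0 / y12          # series-branch impedance
    z11 = 1.0 / y11           # impedance seen at port 1 with port 2 shorted
    return z12.imag / w, z11.imag / w, z11.imag / z11.real


if __name__ == "__main__":
    f = 5e9
    y11, y12 = pi_model_y(f, ls=900e-12, rs=9.13, c1=13e-15)
    l12, leff, q = extract(f, y11, y12)
    print(f"L12 = {l12 * 1e12:.1f} pH, Leff = {leff * 1e12:.1f} pH, Q = {q:.2f}")
```

With these values the simplified network gives a Q close to 3 at 5 GHz, in the same range as the simulated value reported for the 900 pH inductor. The small difference is expected because the substrate elements are ignored.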


Fig. 31: Inductor single-frequency Π -model

Fig. 32: Inductor lumped element broadband model


Derived ASITIC parameters

Table 2 and Table 3 show the physical dimensions and the simulated values of the equivalent circuit parameters for each of the three inductors employed in this design. All inductors have dual-layer stacked-spiral configurations with varying numbers of turns. The maximum outer diameter (per side), 51.2 µm, is used for the 900 pH inductor.

Table 3: Physical Dimensions of each inductor

Parameter                       Inductor 1       Inductor 2       Inductor 3 (900 pH)
Diameter (µm)                   42.2             46.2             51.2
Number of turns                 2                3                3
Metal width (µm)                1.96             1.96             1.96
Spacing between windings (µm)   1.96             1.96             1.96
Metal layers                    METAL6, METAL5   METAL6, METAL5   METAL6, METAL5

5.1.2 Inductor realization and isolation

The 900pH inductor is shown in Fig. 33 and is comprised of two metal layers -

the top level of the six metal process and metal five. The left-side terminal of

the inductor is formed in metal six and connects directly to the drain of the

amplifying transistor of that half stage. The right-side port of the inductor is

routed in metal five and is connected to a poly-silicon resistor at each stage.

Each inductor is isolated from each adjacent inductor to minimize crosstalk.

The isolation is accomplished by surrounding each inductor with n-wells in

turn surrounded by p-tap guard rings connected to ground to impose reverse-

biasing on the junctions. Additionally, referring to Fig. 33, the p-taps between

adjacent inductors are electrically attached to a stack of metal layers, from

metal 1 through metal 6, which form a Faraday cage and improve isolation.

Table 4: Simulated inductor parasitic values

Parameter           Inductor 1   Inductor 2   Inductor 3
RS (Ω)              5.14         8.01         9.13
Cox1 (fF)           8.63         11           13
Cox2 (fF)           8.2          10.8         12.7
Rsub1 (Ω)           1820         1830         1710
Rsub2 (Ω)           2280         2130         1980
Csub1 (fF)          5.72         5.76         6.16
Csub2 (fF)          4.70         4.95         5.32
Ciw (fF)            4.69         6.26         7.24
fSelf-Res (GHz)     85.6         56.7         45.93
Q-factor (5 GHz)    2.4          2.7          3.1

Fig. 33: On-chip 900 pH inductor with 51.2µm diameter

5.2 Layout and fabrication

The circuit was fabricated in ST Microelectronics’ 130-nm standard CMOS

process with typical n-MOSFET fT and fMAX of 90 GHz and 100 GHz, respec-

tively. The chip microphotograph is reproduced in Fig. 34. The design is pad-

limited and the total die area was 1.0mm x 0.8mm.


Fig. 34: Full chip photograph

Magnified versions of the two parallel paths of the driver are shown in Fig.

35 and Fig. 36, respectively. Using Fig. 10 as a reference that describes the

schematic block diagram of the entire driver, Fig. 35 emphasizes the layout of

the main signal path of the driver while Fig. 36 details the layout of the paral-

lel pre-emphasis signal path.


Fig. 35: Magnified photograph of main path layout

Fig. 36: Magnified photograph of the parallel path layout


The entire layout, with the exception of the digital pre-emphasis XOR gate, is

completely symmetric about the horizontal axis. Power and ground connec-

tions are provided along the central axis in metal 1 and metal 2, respectively.

Transistors and poly-silicon resistors are located adjacent to the division, fol-

lowed by signal path routing, with the peaking inductors located on the out-

side.

The fully-symmetrical layout results in several advantages: (i) the layout of

the entire circuit is simplified by employing half-cell layout and replication

techniques; (ii) the positive and negative signal paths are matched in length

and are physically and electrically isolated from one another; and (iii), the

coupling coefficients of same-stage load inductors are diminished.


6 Experimental Results

6.1 Test environment

The fabricated circuit was tested on wafer, with probes placed on the circuit pads under a microscope. The frequency-domain measurements were made with a 50-GHz Hewlett-Packard 8510C Vector Network Analyzer (VNA), as shown in Fig. 37. The time-domain 2^31-1 PRBS stimulus was created with a combination of an Anritsu 69397B Signal Generator, an MP17584 Pulse Pattern Generator, and an MP1801A 43.5 Gbps MUX, as shown in Fig. 38.

In Fig. 38, the signal generator generates a CLK signal which is driven to the MUX, and also divided by four and sent to the pulse pattern generator. The pulse pattern generator produces four 1/4 DATA signals, each with a bitrate of one-quarter the final PRBS stream bitrate. The four 1/4 DATA signals are time-multiplexed, with the original CLK controlling the switching of the MUX on positive CLK edges. Hence, the output of the MUX is a 2^31-1 PRBS bit sequence at the bitrate required for the test, not exceeding 43.5 Gbps.
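The time-multiplexing step can be illustrated with a few lines of code. This is a behavioural sketch only: the function name is hypothetical, the lane ordering is an assumption, and clock-edge timing inside the MUX is not modelled.

```python
def mux_4_to_1(lanes):
    """Time-interleave four quarter-rate bit lists into one full-rate list."""
    if len(lanes) != 4 or len({len(lane) for lane in lanes}) != 1:
        raise ValueError("expected four lanes of equal length")
    # One bit is taken from each lane in turn during every quarter-rate period.
    return [bit for group in zip(*lanes) for bit in group]


if __name__ == "__main__":
    lanes = [[1, 0], [0, 0], [1, 1], [0, 1]]
    print(mux_4_to_1(lanes))
    print(f"lane rate for a 20 Gb/s output: {20.0 / 4:.1f} Gb/s")
```

For a 20 Gb/s test, each of the four lanes therefore runs at 5 Gb/s, and the pattern generator only has to operate at one quarter of the final bitrate.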


Fig. 37: S-parameter test setup

[Fig. 37 labels: D.U.T.; DA in; DA out; 50 ohm terminations; Hewlett-Packard 8510C]

Fig. 38: Eye-diagram test setup

[Fig. 38 labels: 69397B Synthesized Sweep/Signal Generator (CLK and 1/4 CLK); MP17584 Pulse Pattern Generator (1/4 data-rate PRBS); MP1801A 43.5 Gb/s MUX (data-rate PRBS); D.U.T. with DC bias/control probes, Vdd and 50 ohm terminations; 86100B DCA with PRBS reference CLK, 20 - 43.5 Gb/s]

6.2 S-Parameters and noise figure

The small signal s-parameters and noise figure were measured in a single-

ended configuration. The small signal gain, S21, is shown in Fig. 39 and

agrees well with the simulated results of Fig. 21. Measured single-ended S21

bandwidth was 8.5 GHz, similar to the simulated small-signal gain. The mea-

sured input and output return loss are better than -12 dB up to 50 GHz, as

shown in Fig. 40, outperforming the simulated value of -5dB up to 60 GHz in

Fig. 20. Measured and simulated noise figure values are plotted in Fig. 41. As

expected, the simulations show an inverse relationship between tail current

and overall driver noise figure. Furthermore, the measured noise figure of the

overall driver was 2 dB higher than simulated for a 6mA tail current in the

input comparator. This is primarily due to losses associated with the probe-

pad contact resistance and the series substrate resistance below the pad

which are not accounted for in simulation. Another reason for the higher than

simulated noise figure is the limitation of the BSIM3 model for MOSFETs

that does not capture the gate noise current of the MOSFET.

Fig. 39: Measured single-ended S21


Fig. 40: Measured single-ended S22 and S11

Fig. 41: Measured and simulated driver noise figure


6.3 Time-domain measurements

Time domain measurements were carried out at data rates between 20 Gb/s and 43 Gb/s using a 2^31-1 PRBS pattern. A sensitivity of 20 mVpp, as illustrated in Fig. 42, was measured at 20 Gb/s when the input signal was applied single-ended and the unused input was terminated with a 50 Ω resistance. The sensitivity degraded to 60 mVpp and 150 mVpp at 25 Gb/s and 30 Gb/s, respectively.


Typical 25 Gb/s and 30 Gb/s output eye diagrams are illustrated in Fig. 43 and Fig. 44, respectively. It is important to note that the driver exhibits gain at 30 Gb/s, with a 200 mVpp input signal resulting in a 260 mVpp output signal. To the author's knowledge, this is the first reported positive gain at 30 Gb/s in an all-CMOS driver to date (2005).
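The gain implied by these amplitudes can be expressed in decibels. The short calculation below is a worked illustration using the 200 mVpp input and 260 mVpp output quoted above; the function name is hypothetical.

```python
import math


def voltage_gain_db(v_out_pp, v_in_pp):
    """Large-signal voltage gain in dB from peak-to-peak amplitudes."""
    return 20.0 * math.log10(v_out_pp / v_in_pp)


if __name__ == "__main__":
    print(f"gain at 30 Gb/s = {voltage_gain_db(260.0, 200.0):.2f} dB")
```

The ratio of 1.3 corresponds to roughly 2.3 dB of large-signal gain per side at 30 Gb/s.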


Fig. 43: 25Gb/s output eye diagram with 50% eye crossing

Fig. 44: 30 Gb/s output eye-diagram with 260 mVpp per side for a single-ended 200 mVpp, 2^31-1 input PRBS.


Fig. 45 demonstrates the eye-crossing control performance at 20 Gb/s taken at

the output of the driver. The eye-crossing control for (a) 50%, (b) 70%, and (c)

30% was performed by varying the control voltage that was connected to off-

chip voltage sources.

Fig. 45: Eye-crossing control at 20 Gb/s (a) 50%, (b) 70%, and (c) 30%





Fig. 46: Output amplitude control at 20 Gb/s; output of (a) 190 mVpp and (b) 350 mVpp per side

Measured waveforms with pre-emphasis at 20 Gb/s are shown in Fig. 47. The eye diagram exhibits higher positive overshoot than undershoot, with spike height control between 0% and 25% of the eye height. The overshoot/undershoot imbalance shown in Fig. 47 was caused by an excessively high ratio of the pre-emphasis path tail current to the main path tail current. The ratio of overshoot/undershoot relative to the output signal swing alone is directly related to the pre-emphasis path tail current value. A lower digital pre-emphasis tail current could have resulted in a symmetrical overshoot/undershoot ratio, but the relative percentage of overshoot/undershoot would have been reduced to less than 15%. Unfortunately, experimental evidence of this was not captured due to time restrictions at the testing facilities at Quake Technologies and ST Microelectronics, both located in Ottawa, Canada.

Fig. 47: 20Gb/s output eye diagram with 300 mVpp swing per side and pre-emphasis

6.3 Performance summary

Table 4 summarizes the overall circuit characteristics. Of special note are the high input sensitivity and high output swing, the input/output matching of better than -12 dB up to 50 GHz, and the multiple control mechanisms for output amplitude, pre-emphasis and eye-crossing. Further, there is excellent agreement between the simulated and measured values of the driver in both the time domain and the frequency domain.

Table 5: Simulated and measured circuit parameters

Parameter                                     Simulated Value          Measured Value
Supply                                        1.5 V                    1.5 V
Power                                         150 mW                   150 mW
Output swing @ 20 Gb/s with 20 mVpp input     21-350 mVpp per side     190-350 mVpp per side
Pre-emphasis control @ 20 Gb/s                +33% / -25%              +25% / -15%
Crossing control @ 20 Gb/s                    33% to 66%               30% to 70%
Eye sensitivity @ 20 Gb/s                     20 (10) mVpp             20 (10) mVpp
Noise figure                                  14.9 dB @ 5 GHz          17 dB @ 5 GHz
                                              15 dB @ 15 GHz           17 dB @ 15 GHz
S11/S22 up to 50 GHz                          < -5 dB                  < -12 dB


7 Conclusions

7.1 Summary

A 20 Gb/s backplane driver with more than 30 dB dynamic range was imple-

mented in 130-nm CMOS technology. The circuit consumes 150 mW from a 1.5

V supply and features independent control of output swing, duty cycle and

pre-emphasis. The circuit is operational without pre-emphasis at data rates

up to 30 Gb/s with 300 mVpp swing per side. The pre-emphasis pulse is both

amplitude and width controllable via the introduction of a novel digital circuit

implementation which does not require a separate clock signal. The results of

this project were published and presented at the IEEE Custom Integrated

Circuits Conference in 2004 [1].

7.2 Future work

Future work on this design would address further system verification, expansion and industrialization of the circuit. Because time-domain testing was carried out in industrial settings, tester time was limited, and pre-emphasis experimentation in particular was affected. Further testing of the pre-emphasis circuit would involve tuning the varactor to increase and decrease the pre-emphasis pulse width, and varying the relative current bias of the digital differentiator to empirically match the positive and negative pre-emphasis pulse heights. In terms of system expansion and industrialization, a receiver circuit placed across a backplane, with appropriate mechanisms to control the pre-emphasis, would be required to close the feedback loop. Further industrialization of the main path would require instantiation of on-chip and off-chip reference voltages and currents for each of its independently controlled stages. Finally, the circuit must be verified in datapath operation to measure its true performance in a digital, data-specific application. This would involve adding a Media Access Control (MAC) circuit and layer at the input of the driver, as well as its MAC counterpart on the receiving end.


References

[1] P. Westergaard, S.P. Voinigescu, T.O. Dickson, “A 1.5-V, 20/30-Gb/s CMOS Backplane Driver with Digital Pre-emphasis,” Proc. IEEE Custom Integrated Circuits Conference, pp. 23-26, Orlando, FL, Oct. 2004.

[2] S. Galal, B. Razavi, “40Gb/s Amplifier and ESD protection Circuit in 0.18-um CMOS Technology,” IEEE ISSCC Digest, pp. 480-481, 2004.

[3] D. Kehrer, H.D. Wohlmuth, “40 Gb/s 2:1 Multiplexer and 1:2 Demultiplexer in 120 nm CMOS,” IEEE ISSCC Digest, pp. 345-346, 2003.

[4] J. Liu, X. Lin, “Equalization in high-speed communication systems,” IEEE Circuits and Systems Magazine, Volume 4, Issue 2, pp. 4-17, 2004.

[5] S.S. Mohan, M.D.M. Hershenson, S.P. Boyd, T.H. Lee, “Bandwidth extension in CMOS with optimized on-chip inductors,” IEEE Journal of Solid-State Circuits, Volume 35, Issue 3, pp. 346-355, March 2000.

[6] M. Green, “Current-controlled CMOS circuits with inductive broadbanding,” U.S. Patent 6,525,571, filed Sept. 26, 2001.

[7] S. Ramo, J.R. Whinnery, T. Van Duzer, Fields and Waves in Communication Electronics, 3rd Ed. New York: John Wiley & Sons, 1994.

[8] T.O. Dickson, R. Beerkens, S.P. Voinigescu, “A 2.5-V, 45-Gb/s Decision Circuit Using SiGe BiCMOS Logic,” IEEE Journal of Solid-State Circuits, Volume 40, Issue 4, pp. 994-1003, April 2005.

[9] B. Razavi, Design of Analog CMOS Integrated Circuits, 1st Ed. New York: McGraw-Hill, 2001.

[10] A. Sedra, K. Smith, Microelectronic Circuits, 4th Ed. New York: Oxford Press, 1998.

[11] S.P. Voinigescu, T.O. Dickson, R. Beerkens, I. Khalid, P. Westergaard, “A Comparison of Si CMOS, SiGe BiCMOS, and InP HBTs Technologies for High-Speed and Millimeter-wave ICs,” Si Monolithic Integrated Circuits in RF Systems, pp. 111-114, Atlanta, GA, Sept. 2004.

[12] R. Ranfft, H.M. Rein, “High-speed bipolar logic circuits with low power consumption for LSI - a comparison,” IEEE Journal of Solid-State Circuits, Volume 17, Issue 4, pp. 703-712, Aug. 1982.

[13] D.S. McPherson, F. Pera, M. Tazlauanu, S.P. Voinigescu, “A 3V fully differential distributed limiting driver for 40-Gb/s optical transmission systems,” IEEE Journal of Solid-State Circuits, Volume 38, Issue 9, pp. 1485-1496, Sept. 2003.


Appendix A: Spice file for inductor parameter extraction

.option acct nomod post=2 probe

.net v(p2) vin rout=50 rin=50

vin p1 0 AC 1

L p1 3 Ls
R 3 p2 Rs
Cs1 p1 1 Cp1
Cs2 p2 2 Cp2
Rs1 1 0 Rsub1
Rs2 2 0 Rsub2
Csub1 1 0 Csub1
Csub2 2 0 Csub2
Cbr p1 p2 Cbr

.param
+ Ls = 0.407n
+ Rs = 6.9
+ Rsub1 = 5240
+ Rsub2 = 691
+ Cp1 = 4.67f
+ Cp2 = 5.31f
+ Csub1 = OPT1(0.1p, 0.0001p, 10p)
+ Csub2 = OPT1(0.1p, 0.0001p, 10p)
*+ Csub1 = 5.72f
*+ Csub2 = 4.70f
+ Cbr = OPT1(30f, 0.0001p, 10p)

.AC data=measured optimize=opt1
+ results=comp1,comp2,comp3,comp4,comp5,comp6,comp7,comp8
+ model=converge
.model converge opt relin=1e-4 relout=1e-4 close=10 itropt=30
.measure ac comp1 err1 par(y11r) y11(r)
.measure ac comp2 err1 par(y11i) y11(i)
.measure ac comp3 err1 par(y12r) y12(r)
.measure ac comp4 err1 par(y12i) y12(i)
.measure ac comp5 err1 par(y21r) y21(r)
.measure ac comp6 err1 par(y21i) y21(i)
.measure ac comp7 err1 par(y22r) y22(r)
.measure ac comp8 err1 par(y22i) y22(i)

.ac data=measured

*.ac lin 75 5e9 79e9

.plot ac y21(m) y11(m) y21(db)

.print y11(r) y11(i) y11(m) y11(p)
.print y22(r) y22(i) y22(m) y22(p)
.print y12(r) y12(i) y12(m) y12(p)
.print y21(r) y21(i) y21(m) y21(p)

.print ac L12=par('y12(i)/((6.28*FREQ)*(y12(m)*y12(m)))')
.print ac Leff=par('-y11(i)/((6.28*FREQ)*(y11(m)*y11(m)))')
.print ac Q=par('-y11(i)/y11(r)')
.print ac cox=par('(y11(i) + y12(i))/(6.28*FREQ)')

*.print par(s11r) s11(r) par(s11i) s11(i)
*.print par(s12r) s12(r) par(s12i) s12(i)
*.print par(s21r) s21(r) par(s21i) s21(i)
*.print par(s22r) s22(r) par(s22i) s22(i)
*.print z11(r) z11(i) z11(m) z11(p)
*.print z22(r) z22(i) z22(m) z22(p)
*.print z12(r) z12(i) z12(m) z12(p)
*.print z21(r) z21(i) z21(m) z21(p)

.data measured
FREQ Y11r Y11i Y12r Y12i Y21r Y21i Y22r Y22i
***PLACE ASITIC SIMULATION DATA HERE***

*.param freq=100MEG,s11m = 0 , s11p = 0, s12m = 0, s12p = 0, s21m =0,
*+s21p =0, s22m =0 , s22p = 0
.end
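The Leff and Q expressions in the .print statements above can be checked outside the simulator. The following Python sketch applies the same formulas to a one-port admittance, using 2*pi where the netlist uses the approximation 6.28. The helper name inductor_figures and the 10 GHz test frequency are assumptions made for illustration, and the test admittance is an idealized series R-L branch built from the Ls and Rs starting values, ignoring every shunt and bridging capacitance in the model.

```python
import math


def inductor_figures(freq_hz: float, y11: complex) -> tuple[float, float]:
    """Return (Leff, Q) from y11, mirroring the .print expressions:
    Leff = -Im(y11) / (omega * |y11|^2) and Q = -Im(y11) / Re(y11)."""
    omega = 2.0 * math.pi * freq_hz
    leff = -y11.imag / (omega * abs(y11) ** 2)
    q = -y11.imag / y11.real
    return leff, q


# Idealized check (assumption): series R-L only, no parasitic capacitances.
ls = 0.407e-9   # H, Ls starting value from the netlist
rs = 6.9        # ohm, Rs starting value from the netlist
f_test = 10e9   # Hz, hypothetical test frequency

y11 = 1.0 / complex(rs, 2.0 * math.pi * f_test * ls)
leff, q = inductor_figures(f_test, y11)
print(f"Leff = {leff * 1e9:.3f} nH, Q = {q:.2f}")  # prints "Leff = 0.407 nH, Q = 3.71"
```

For a pure series R-L branch the Leff expression returns Ls exactly and Q reduces to omega*Ls/Rs, so any deviation seen in the simulator output is attributable to the capacitive and substrate elements of the full model.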
