A 20/30 GBIT/S CMOS BACKPLANE DRIVER WITH
DIGITAL PRE-EMPHASIS
by
Paul Westergaard
A project report submitted in conformity with the requirements for the degree of Master of Engineering
Graduate Department of Electrical and Computer Engineering, University of Toronto
Copyright by Paul Westergaard 2005
A 20/30 GBIT/S CMOS BACKPLANE DRIVER WITH
DIGITAL PRE-EMPHASIS
Paul Westergaard
Master of Engineering, 2005
Graduate Department of Electrical and Computer Engineering
University of Toronto
Abstract
A high-speed input comparator and output driver with fully adjustable pre-
emphasis for applications in serial inter-chip communications over backplanes
at 20 Gb/s is presented. The driver achieves data rates of up to 30 Gb/s when
the pre-emphasis is disabled. The circuit was implemented in 130-nm CMOS
and consumes 150 mW from a 1.5-V supply at 20 Gb/s. It has over
30 dB dynamic range with a sensitivity of 20 mVpp and a differential output
swing of 700 mVpp at 20 Gb/s. The output driver features a novel digital pre-
emphasis circuit with independent pulse height and pulse width control with-
out the requirement of an external clock input. Other independent features
are 30%-70% eye-crossing control and adjustable output swing between 170
mVpp and 350 mVpp per side. The results of this project were published and
presented at the IEEE Custom Integrated Circuits Conference in 2004 [1].
Acknowledgements
The author would like to gratefully thank his supervisor and mentor, Professor
Sorin Voinigescu, for his inspiration, technical and personal insight, and
unwavering support throughout this project. The author would also like to
specially thank Timothy O. Dickson, a fellow graduate student, for his ongoing
technical input, altruism and camaraderie throughout this thesis.
Experimental results would not have been possible without Timothy’s
contributions. Furthermore, the author acknowledges that this thesis would not
have been possible without the valuable technical and financial contribution
of the Ottawa, Ontario office of ST Microelectronics. Members of the
ST Microelectronics technical staff who are owed special gratitude include
Rudy Beerkens, Boris Prokes, Imran Khalid and Steve McDowell.
Contents
List of Tables
List of Figures
1 Introduction
1.1 Motivation
1.2 Objectives and scope
1.3 Organization
2 Background
2.1 Review of pre-emphasis
2.2 Inductive peaking
2.3 Hazardous relative placement of load inductors
2.4 Second order effects in deep submicron MOSFETs
3 Circuit Design
3.1 Driver overview
3.2 Biasing a MOS CML gate for optimal speed
3.3 Biasing a differential CMOS stage for low-noise
3.4 Input matching network and low-noise comparator
3.5 Eye-crossing control
3.5.1 Transistor sizing and biasing in the eye-crossing control circuit
3.6 Output driver and parallel current summation
3.7 Digital pre-emphasis
4 Simulation Results
4.1 S-Parameter simulations
4.2 Time-domain simulations
5 Physical Implementation
5.1 Inductor design and model extraction
5.1.1 Model extraction using ASITIC
5.1.2 Inductor realization and isolation
5.2 Layout and fabrication
6 Experimental Results
6.1 Test environment
6.2 S-Parameters and noise figure
6.3 Time-domain measurements
6.4 Performance summary
7 Conclusions
7.1 Summary
7.2 Future work
References
Appendix A: Spice file for inductor parameter extraction
List of Tables
Table 1: Circuit features categorized by design requirement
Table 2: Performance metrics for shunt peaking [5]
Table 3: Physical dimensions of each inductor
Table 4: Simulated inductor parasitic values
Table 5: Simulated and measured circuit parameters
List of Figures
Fig. 1: Transmitter pre-emphasis and backplane trace transfer characteristic
Fig. 2: A 4-tap FIR filter for transmitter pre-emphasis
Fig. 3: Time-domain pre-emphasis overshoot and undershoot
Fig. 4: Simple common source amplifier with resistive loading
Fig. 5: Common source amplifier with shunt peaking
Fig. 6: Potentially hazardous placement of load inductor
Fig. 7: Proper placement of inductive load
Fig. 8: Transconductance of a 130-nm nMOSFET vs. gate voltage [8]
Fig. 9: MOS differential pair
Fig. 10: Driver block diagram
Fig. 11: Constant peak fT current density over four technology nodes
Fig. 12: fT and NFMIN vs. current density for a 130-nm n-MOSFET with 2-µm unit finger width biased at VDS = 1 V
Fig. 13: Input bias and matching network and comparator
Fig. 14: Eye-crossing control circuit and intermediate signals
Fig. 15: Summation of output currents across output resistor
Fig. 16: Output driver transistor-level schematic
Fig. 17: Digital pre-emphasis block diagram
Fig. 18: Pre-emphasis waveforms and transfer function
Fig. 19: NMOS digital differentiator schematic
Fig. 20: Simulated S22 and S11 of complete driver circuit to 60 GHz
Fig. 21: Simulated single-ended S21 of entire driver circuit
Fig. 22: Simulated single-ended S21 of driver with output amplitude reduced using amplitude control
Fig. 23: Simulated S21 with output peaking enabled
Fig. 24: 20 Gb/s eye-diagrams, 2^7-1 PRBS: single-ended input 20 mVpp; differential output 99 mVpp per side
Fig. 25: 25 Gb/s eye-diagrams, 2^7-1 PRBS: single-ended input 60 mVpp; differential output 180 mVpp per side
Fig. 26: 30 Gb/s eye-diagrams, 2^7-1 PRBS: single-ended input 140 mVpp; differential output 300 mVpp per side
Fig. 27: Simulated eye-crossing control at 20 Gb/s (a) 50%, (b) 66%, and (c) 33%
Fig. 28: Output amplitude control at 20 Gb/s
Fig. 29: 20 Gb/s output eye diagram with 316 mVpp swing per side and +/- 16% pre-emphasis
Fig. 30: 20 Gb/s output eye diagram with 300 mVpp swing per side and +33%/-25% pre-emphasis
Fig. 31: Inductor single-frequency Π-model
Fig. 32: Inductor lumped element broadband model
Fig. 33: On-chip 900 pH inductor with 44-µm diameter
Fig. 34: Full chip photograph
Fig. 35: Magnified photograph of main path layout
Fig. 36: Magnified photograph of the parallel path layout
Fig. 37: S-Parameter test setup
Fig. 38: Eye-diagram test setup
Fig. 39: Measured single-ended S21
Fig. 40: Measured single-ended S22 and S11
Fig. 41: Measured and simulated driver noise figure
Fig. 42: 20 Gb/s eye-diagrams, 2^31-1 PRBS: single-ended input 20 mVpp; differential output 84 mVpp per side
Fig. 43: 25 Gb/s output eye diagram with 50% eye crossing
Fig. 44: 30 Gb/s output eye-diagram with 260 mVpp per side for a single-ended 200 mVpp, 2^31-1 input PRBS
Fig. 45: Eye-crossing control at 20 Gb/s (a) 50%, (b) 70%, and (c) 30%
Fig. 46: Output amplitude control at 20 Gb/s; output of (a) 190 mVpp and (b) 350 mVpp per side
Fig. 47: 20 Gb/s output eye diagram with 300 mVpp swing per side and pre-emphasis
1 Introduction
1.1 Motivation
Serial inter-chip communication is gaining widespread acceptance over paral-
lel architectures because congested printed circuit board (PCB) routing and
pad-limited silicon dice are not cost efficient in commodity designs. To mini-
mize the overall circuit area required for a serial transmitter/receiver pair,
equalization can be performed at the transmitter instead of at the receiver, in
which case it is known as pre-equalization. At the transmitter, pre-equalizers
alter the waveform to account for the low-pass response of the interconnect.
Historically, pre-emphasis has been achieved using either clocked flip-flops
and step-delayed current summation or analog differentiators. The latter
permit only amplitude control of the pulse, precluding control of the pulse
width. The former implementation, while having the necessary control
mechanisms and efficacy, places severe strain on device technology as the required
flip-flop typically operates at twice the frequency of the driver itself.
Even though 40 Gb/s CMOS amplifiers [2], demultiplexers and multiplexers
[3] have been recently reported, demonstrating the high-speed potential of
standard CMOS technology, they suffer from limited dynamic range due to
poor sensitivity and modest output swings of about 100 mVpp per side.
This thesis presents the first published CMOS driver with duty-cycle, amplitude,
and pre-emphasis control that operates at data rates exceeding 20 Gb/s.
The driver achieves over 30 dB of dynamic range. It includes a novel passive
element-free differentiator that enables control of both amplitude and width
of the pre-emphasis pulse.
1.2 Objectives and scope
The prime objective of this thesis was a fabricated high-speed, pre-emphasis
enabled, output driver for applications in serial inter-chip communications
over backplanes. The scope of the thesis was the theoretical derivation, com-
puter-aided design, simulation, fabrication and experimental characterization
of the circuit. The required features of the design in order to achieve its tar-
geted application were as follows:
Table 1: Circuit features categorized by design requirement

Design Requirement                   Corresponding Circuit Feature
High Bandwidth                       • Inductive peaking at each signal path stage
                                     • High gain (> 20 dB) at 20 Gb/s operation
                                     • Positive gain at 30 Gb/s operation
Low Power                            • 1.5 V power supply
                                     • 150 mW dissipation at 20 Gb/s operation
High Signal Integrity and            • Input and output matching 50 Ω up to 50 GHz
Sensitivity                          • Differential signalling
                                     • Symmetrical layout
                                     • 20 mVpp input sensitivity
Signal Shaping and Control           • 30% to 70% pulse-width control
                                     • 200-700 mVpp differential output swing control
                                     • Pre-emphasis spike width and height control
Novel Implementation                 • Full CMOS implementation
                                     • Clock-free pre-emphasis circuit with non-traditional circuit design
1.3 Organization
The thesis is organized as follows. Chapter 2 discusses the background of pre-
emphasis, inductive peaking, and hazardous relative placement of load induc-
tors. Chapter 3 details the concept, design and biasing of the individual circuit
elements in the driver. Insights into transistor sizing for optimal speed based
on a current-density centric approach as well as minimum noise are provided.
In Chapter 4, the pre-layout simulation results of the driver are presented.
Chapter 5 summarizes the physical implementation of the entire circuit.
Chapter 6 offers an overview of the experimental results. Conclusions are
given in Chapter 7.
2 Background
2.1 Review of pre-emphasis
In high-speed circuit applications in which high-frequency signals are sent
over backplane channels, there are two types of equalization: transmitter pre-
emphasis and receiver equalization [4]. Both are intended to either emphasize
the high-frequency components or de-emphasize the low-frequency compo-
nents of the transmitted signal, in order to compensate for the low-pass trans-
fer characteristics of the channel. The transfer function of both types of
equalizer is high-pass, though in practice, it is band-pass. The reasons for the
latter are threefold: (i) semiconductor devices in practice cannot achieve infi-
nite bandwidth; (ii) to avoid high-frequency noise amplification; (iii) to meet
regulated electromagnetic interference (EMI) specifications.
Pre-emphasis is achieved at the transmitter side by increasing the high-fre-
quency components. Fig. 1 shows the mechanism in which ideal transmitter
pre-emphasis compensates for the low-pass transfer characteristics of the
backplane trace.
Fig. 1: Transmitter pre-emphasis and backplane trace transfer characteristic
A common practical pre-emphasis circuit implementation is a Finite
Impulse Response (FIR) filter. Fig. 2 shows the block diagram of a 4-tap FIR
filter with a single Data input, delay elements D and tap coefficients C1, C2,
C3, C4. The tap coefficients adjust the gain at each multiplier independently to
produce the output voltage across the load resistors of value R. The output is a
frequency shaped version of the Data input in the form of amplified high-fre-
quency components.
[Fig. 1 plot labels: backplane trace transfer characteristics and transmitter pre-emphasis response, plotted against frequency (Hz)]
Fig. 2: A 4-tap FIR filter for transmitter pre-emphasis
In the time domain, the FIR filter performs a differentiation function. As
shown with the dotted lines in Fig. 3, the waveform of transmit pre-emphasis
appears as overshoot and undershoot in the time-domain.
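As a rough illustration of this behaviour, the following Python sketch applies a short FIR filter to a bit pattern and prints the resulting output levels. The tap values (1.0 and -0.25), the bit pattern and the function name `fir_preemphasis` are assumptions chosen for illustration only; they are not taken from the driver presented in this thesis, and only two taps are used instead of the four shown in Fig. 2.

```python
def fir_preemphasis(bits, taps):
    """Apply an FIR filter with one-bit delay elements to an NRZ bit sequence.

    bits: list of 0/1 data bits
    taps: tap coefficients C1, C2, ... (hypothetical values)
    Returns one output level per bit period.
    """
    # Map bits to bipolar NRZ levels: 0 -> -1, 1 -> +1.
    levels = [1.0 if b else -1.0 for b in bits]
    out = []
    for n in range(len(levels)):
        acc = 0.0
        for k, c in enumerate(taps):
            # Samples before the start of the sequence are assumed equal to
            # the first level, so the output starts in a settled state.
            x = levels[n - k] if n - k >= 0 else levels[0]
            acc += c * x
        out.append(acc)
    return out


if __name__ == "__main__":
    bits = [0, 0, 1, 1, 1, 0, 0]
    taps = [1.0, -0.25]  # assumed main tap and delayed tap
    print(fir_preemphasis(bits, taps))
```

With these assumed taps the output is [-0.75, -0.75, 1.25, 0.75, 0.75, -1.25, -0.75]: the first bit after each transition is driven to a larger magnitude (1.25) than the settled level (0.75), which is the overshoot and undershoot described above.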
[Fig. 2 diagram labels: Data input, delay elements D, tap coefficients C1 to C4, load resistors R]
Fig. 3: Time-domain pre-emphasis overshoot and undershoot
Other circuits that perform similar differentiation functions are passive RC
differentiators and inductively-loaded differential amplifiers. Unfortunately,
there exist drawbacks in each of these three differentiator implementations.
For FIR filters, a clock of at least twice the frequency of the data is required
to trigger the delay elements D in Fig. 2. The width of the pre-emphasis spike
is inversely proportional to the frequency of this clock. Hence, for 20 Gb/s (10
GHz) signals, a minimum 20 GHz clock signal is needed, requiring a very well
designed clock recovery circuit and 20 GHz flip-flops. The RC passive-element
differentiator is a more viable solution in that it does not require the input
clock or flip-flops; however, the width of the pre-emphasis spike in this case is
not controllable. This results in a non-ideal pre-equalization that cannot fully
compensate for the effects of the channel. Finally, inductively loaded
differential amplifiers offer control over neither the width nor the height of
the pre-emphasis pulse and, more importantly, can easily produce output ringing
due to resonant effects. The pre-emphasis employed in the presented driver
differs from all three methods outlined above in that: (i) no passive L or C
elements are used; (ii) no clock is required; and (iii) the pulse width and
height can be independently controlled.
[Fig. 3 plot labels: voltage vs. time, showing pre-emphasis overshoot and undershoot]
2.2 Inductive peaking
The theory of inductive peaking or broad-banding is well-documented [5][6]. A
brief review and implications for the driver design will be presented here.
Inductive series and shunt peaking are techniques that can be used to extend
the 3-dB bandwidth of an amplifier without expending additional power. The
following explanation will focus on shunt inductive peaking as it is applied in the
driver design.
Fig. 4: Simple common source amplifier with resistive loading
Fig. 5: Common source amplifier with shunt peaking
Fig. 4 illustrates a common source amplifier with an ideal resistor and
capacitive load. For simplicity, we assume that the small signal frequency response
of the amplifier is determined by a single dominant pole, which is determined
solely by the output load resistance RL and by the load capacitance C:

$$ \frac{V_{out}}{V_{in}}(\omega) = \frac{g_m R_L}{1 + j\omega C R_L} \tag{1} $$

The introduction of an inductance L in series with the load resistance, as
shown in Fig. 5, alters the frequency response of the amplifier. This technique,
known as shunt peaking, increases the bandwidth of the amplifier by
transforming the frequency response from that of a single pole to one with two poles
and a zero:

$$ \frac{V_{out}}{V_{in}}(\omega) = \frac{g_m (R_L + j\omega L)}{1 + j\omega R_L C - \omega^2 L C} \tag{2} $$

The poles may or may not be complex. The zero is determined solely by the
L/RL time constant and is primarily responsible for the bandwidth improvement.
In addition, the frequency response of this amplifier is characterized by
the ratio of the L/RL and RLC time constants. This ratio is denoted by
$m = L/(R_L^2 C)$. Solving for the inductance gives $L = m R_L^2 C$.
It can be shown [5] that bandwidth extension is possible to varying degrees
by adjusting m. As expected, the 3-dB bandwidth of the shunt-peaked amplifier
increases as m increases. Table 2 shows the normalized 3-dB extension
factor relative to the value of m. The maximum bandwidth occurs for m = 0.71
and yields an 85% improvement in bandwidth. However it is accompanied by
a significant amount of gain peaking which is undesirable for broadband
amplifiers used in fibre optic or backplane applications. A maximally flat
response is observed for m = 0.41 while still improving bandwidth by 72%.
Finally, although a value of m = 0.32 does not result in the same bandwidth
improvement as the other two non-zero values of m shown in the table, it
exhibits the most linear phase response up to the 3-dB bandwidth [5]. This
value of m, called the optimum group delay value, is desirable for optimizing
pulse fidelity in broadband systems that transmit digital signals.
The optimum group delay value, which still results in a respectable 60 per-
cent increase in bandwidth, is best suited for the design of the broadband dig-
ital signal driver.
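The bandwidth extension factors quoted above can be checked numerically from equation (2). The sketch below is an illustration under stated assumptions, not part of the original design flow: it normalizes frequency as x = ωRLC, substitutes L = mRL²C, and solves for the frequency at which the squared magnitude falls to one half of its low-frequency value. The function name `shunt_peaking_bandwidth` is hypothetical.

```python
import math


def shunt_peaking_bandwidth(m):
    """Return the 3-dB bandwidth of equation (2), normalized to 1/(R_L*C).

    With x = w*R_L*C and L = m*R_L^2*C, equation (2) gives
        |H|^2 / (g_m*R_L)^2 = (1 + (m*x)^2) / ((1 - m*x^2)^2 + x^2).
    Setting this equal to 1/2 yields a quadratic in y = x^2:
        m^2*y^2 + (1 - 2*m - 2*m^2)*y - 1 = 0
    """
    if m == 0:
        return 1.0  # single-pole case of equation (1)
    a = m * m
    b = 1.0 - 2.0 * m - 2.0 * m * m
    # The constant term is -1, so the discriminant is b^2 + 4a and exactly
    # one root is positive.
    y = (-b + math.sqrt(b * b + 4.0 * a)) / (2.0 * a)
    return math.sqrt(y)


if __name__ == "__main__":
    for m in (0.0, 0.32, 0.41, 0.71):
        print(f"m = {m:.2f}: normalized 3-dB bandwidth = "
              f"{shunt_peaking_bandwidth(m):.2f}")
```

Under these assumptions the computed factors come out close to the quoted 1.72 and 1.85 for m = 0.41 and m = 0.71, and slightly below the rounded value of 1.60 for m = 0.32.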
Table 2: Performance metrics for shunt peaking [5]

Factor (m)    Normalized ω3dB    Response
0             1.00               No shunt peaking
0.32          1.60               Optimal group delay
0.41          1.72               Maximally flat
0.71          1.85               Maximum bandwidth

An implicit benefit of using inductive peaking is the enhanced freedom in the
power-bandwidth trade-off. This improvement can be demonstrated by first
examining the key equations for a non-inductively loaded amplifier (such as the one
introduced in Fig. 4). The bandwidth of the amplifier is set by the output
pole:

$$ BW = \frac{1}{2\pi R_L C} \tag{3} $$

The value of the tail current Itail is also determined by the desired
output swing and the load resistance, especially in the design of switching
inverters:

$$ I_{tail} = \frac{\Delta V_{swing}}{R_L} \tag{4} $$

where ∆Vswing is the voltage swing on the output node of the inverter and RL
is the load resistance of the inverter.
The advantage of shunt inductive peaking is made more obvious when the
bandwidth and power of an amplifier are examined. The bandwidth of an
amplifier is given by equation 3, while the power consumption is directly
proportional to Itail for a given power supply. The goal is to increase the
bandwidth and minimize the power consumption (and hence Itail) of an amplifier.
This places conflicting requirements on RL, whereby equation 3 requires a
small value for RL and equation 4 requires a large one.
Inductive peaking allows the circuit designer to increase the value of RL to
reduce overall power consumption while simultaneously increasing bandwidth
with the introduction of a load peaking inductor. Since Table 2 shows
that shunt peaking can increase the bandwidth of an amplifier
by 60% while still maintaining a linear phase response, a possible trade-off is
to increase the resistance RL by 30% and decrease the tail current Itail by a
similar amount (1/1.3) and introduce a load inductor. With m = 0.32 from
Table 2, the value for the inductor is:

$$ L = \frac{C R_L^2}{3.1} \tag{5} $$

From equation 3, the increased resistance RL will decrease the bandwidth by
a factor of 1/1.3, but the inductive peaking will increase the newly reduced
bandwidth by 60%. Overall, the inductively peaked circuit will have both
higher bandwidth and lower power than the original resistively-loaded circuit.
Explicitly, the inductively peaked circuit will have bandwidth:

$$ BW_{peaked} = BW_{resistive} \times \frac{1.6}{1.3} = 1.23\, BW_{resistive} \tag{6} $$

and the tail current of the inductively peaked circuit will be:

$$ I_{tail(peaked)} = \frac{I_{tail(resistive)}}{1.3} = 0.77\, I_{tail(resistive)} \tag{7} $$

This results in a 23% gain in bandwidth with a simultaneous 23% decrease in
power consumption.
There are a few compromises involved in this optimization of the power-
bandwidth product via the introduction of an on-chip inductor. First, the
added die area expense at each inverter stage can be relatively large, with the
inductor usually occupying more area than the resistively-loaded inverter
stage alone. Secondly, deterministic jitter can occur leading to deleterious
results if an inductor is realized with a larger than simulated value. In this
case, jitter is a result of undesired, and more importantly, uncontrollable
peaking and signal distortion. In a circuit with multiple sequential gain
stages, each with over-sized peaking inductors, the resultant signal distortion
becomes catastrophic. Electro-magnetic field solvers that simulate the induc-
tances of on-chip planar and stacked spiral inductors must be verified experi-
mentally with fabricated test-structures before the final circuit is fabricated.
This pre-verification methodology was employed in the inductor design.
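The power-bandwidth trade-off of equations (3) to (7) can be reproduced with a few lines of arithmetic. The following sketch is illustrative only: the function names are hypothetical, and the 50 Ω load and 100 fF load capacitance in the usage example are assumed values, not figures from this design.

```python
def peaked_tradeoff(resistance_scale, bandwidth_extension):
    """Return (bandwidth ratio, tail-current ratio) relative to the
    resistively loaded stage.

    resistance_scale: factor by which R_L is increased (1.3 in the text)
    bandwidth_extension: shunt-peaking extension factor (1.6 for m = 0.32)
    """
    # Equation (3): bandwidth scales as 1/R_L, then is extended by peaking.
    bw_ratio = bandwidth_extension / resistance_scale
    # Equation (4): for a fixed output swing, I_tail scales as 1/R_L.
    current_ratio = 1.0 / resistance_scale
    return bw_ratio, current_ratio


def peaking_inductance(m, r_load, c_load):
    """Inductance from the definition m = L / (R_L^2 * C)."""
    return m * r_load * r_load * c_load


if __name__ == "__main__":
    bw_ratio, current_ratio = peaked_tradeoff(1.3, 1.6)
    print(f"bandwidth ratio: {bw_ratio:.2f}")          # about 1.23
    print(f"tail current ratio: {current_ratio:.2f}")  # about 0.77
    # Assumed example values: R_L = 50 ohm, C = 100 fF.
    inductance = peaking_inductance(0.32, 50.0, 100e-15)
    print(f"L = {inductance * 1e12:.0f} pH")
```

For the assumed 50 Ω and 100 fF, the optimum group delay value m = 0.32 corresponds to an inductance of about 80 pH.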
2.3 Hazardous relative placement of load inductors
The relative placement of the load inductors to the load resistors in the design
of each amplifying stage is of utmost importance. Fig. 6 shows a potentially
hazardous placement of a load inductor in an inductively loaded amplifier
stage. The root of the deleterious effect lies in transmission line theory. As the
driver operates in the high-frequency signalling spectrum, transmission line
theory is applicable.
Transmission lines spatially transform impedance [7]. The impedance of the
voltage supply in the AC case is zero, a short circuit. At high-frequencies, the
inductor length is comparable to that of a quarter-wavelength of the signal on
the inductor wire segment. Hence, the impedance looking into the inductor
may be spatially transformed from the short-circuit of the power supply into
an open circuit. The transformed open circuit will result in instability and/or
oscillatory behavior in the amplifier.
Fig. 6: Potentially hazardous placement of load inductor (annotation: Zin may approach infinity)
Fig. 7: Proper placement of inductive load (annotation: Zin always finite)
It is important to note that, for the proper placement of the inductor in Fig. 7,
spatial impedance transformations still occur. However, the impedance look-
ing into the inductor from transistor M2 is always a finite, non-zero value
because the transformation acts on the finite and non-zero resistive value RL.
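The impedance transformation can be illustrated with the standard input-impedance expression for a lossless transmission line, Zin = Z0 (ZL + jZ0 tan βl) / (Z0 + jZL tan βl). The sketch below is a simplification under stated assumptions: it treats the inductor wire as an ideal lossless line, and the characteristic impedance of 100 Ω and load of 50 Ω are assumed values chosen for illustration. The function name `line_input_impedance` is hypothetical.

```python
import math


def line_input_impedance(z_load, z0, length_in_wavelengths):
    """Input impedance of a lossless line terminated in z_load.

    z0: characteristic impedance of the line (assumed value)
    length_in_wavelengths: electrical length l / lambda
    """
    t = math.tan(2.0 * math.pi * length_in_wavelengths)
    return z0 * (z_load + 1j * z0 * t) / (z0 + 1j * z_load * t)


if __name__ == "__main__":
    z0 = 100.0     # assumed characteristic impedance, ohms
    r_load = 50.0  # assumed load resistance R_L, ohms

    # Line terminated directly in the AC-grounded supply (short circuit):
    # at a quarter wavelength the input impedance becomes extremely large.
    z_short = line_input_impedance(0.0, z0, 0.25)
    print(f"|Zin| with shorted line: {abs(z_short):.3e} ohm")

    # Line terminated in R_L: the input impedance stays finite.
    z_res = line_input_impedance(r_load, z0, 0.25)
    print(f"|Zin| with R_L termination: {abs(z_res):.1f} ohm")
```

In this idealized model the quarter-wave line terminated in RL presents Z0²/RL, which is finite and non-zero, whereas the shorted line presents an impedance that grows without bound as the length approaches a quarter wavelength.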
2.4 Second order effects in deep submicron MOSFETs
Of special relevance in the transistor sizing and biasing in the presented
design is a relatively unfamiliar submicron phenomenon. Electron mobility
degradation due to high vertical electric fields can drastically decrease the
performance of deep-submicron circuits. Fortunately, proper biasing can
mitigate these negative effects. However, few publications on this topic
exist and classical biasing techniques remain the norm.
Fig. 8: Transconductance of a 130-nm nMOSFET vs. gate voltage [8] (regions marked: square-law region and high-vertical field region)
Fig. 8 [8] portrays the deleterious effects on transconductance due to electron
mobility degradation in high-vertical fields. Shown is the transconductance
normalized by width (gm/W) of a 130-nm nMOS transistor as a function of
gate-source voltage. Its shape is similar to that of the fT dependence on VGS
and typical for all deep submicron technologies. The curve exhibits two dis-
tinct regions, the square-law region and the high-vertical field region as
shown. There also exists an intermediate area between the two regions in
which hybrid behaviour is observed.
At low effective gate voltages (VGS < 0.5 V in Fig. 8), the device follows the
classical square law model and its transconductance varies linearly with VGS.
The equation derived for an n-MOS transistor in saturation (square law
region) is:

$$ g_m = \mu_n C_{OX} \frac{W}{L} (V_{GS} - V_T) \tag{8} $$

where COX is the oxide capacitance, W is the gate width, L is the gate length,
and VT is the threshold voltage of the transistor.
At large gate-source voltages, the high electric field developed between the
gate and channel of the transistor confines charge carriers to a narrower
region below the oxide-silicon interface, leading to more carrier scattering and
hence lower mobility. Further, small-geometry devices experience significantly
more mobility degradation [9]. An empirical equation modelling this effect is
[9]:

$$ \mu_n = \frac{\mu_0}{1 + \theta (V_{GS} - V_T)} \tag{9} $$

where µ0 denotes the “low-field” mobility and θ is a fitting parameter that
increases with decreasing oxide thickness and hence smaller geometries.
Substituting the solution for electron mobility µn (9) into (8) reveals that
when the second term in the denominator of (9) becomes dominant,
transconductance becomes a constant. This effect is observed clearly for the
high-vertical field region of Fig. 8, where VGS > 0.7 V.
It is of special interest to note how the constant transconductance due to
high-vertical fields affects the differential voltage required to completely
switch a MOS differential pair. It will be shown that a larger switching
voltage, with little or no improvement in transition time, is required when the
MOS pair is biased in the high-vertical field region [8].
We first derive the differential voltage required to completely switch a MOS
differential pair when biased in the high-vertical field region (VGS > 0.7 V in
Fig. 8).
Fig. 9: MOS differential pair
Fig. 9 shows a differential pair of MOS transistors fed by a constant tail
current Itail. Imagine tail current Itail is fully routed through transistor Q1.
Assuming high-vertical field operation, Itail is:

$$ I_{tail} = I_{DS1} = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_0}{1 + \theta (V_{GS1} - V_T)} (V_{GS1} - V_T)^2 \tag{10a} $$

where transistor Q1 is assumed to be in saturation and equation (9) has been
substituted for µn. In high-vertical field operation, when the term θ(VGS1 - VT)
becomes dominant relative to unity, equation (10a) becomes:

$$ I_{tail} = I_{DS1} \approx \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_0}{\theta} (V_{GS1} - V_T) \tag{10b} $$

Equation (10b) shows that IDS1 now exhibits a linear relationship with VGS1.
At the instant that Itail is fully routed through Q1, it is evident that VGS2 = VT,
the threshold voltage of the transistors, such that IDS2 = 0. Hence, the
differential voltage across the gates of the transistors is

$$ \Delta V = V_{GS1} - V_{GS2} = V_{GS1} - V_T \tag{11} $$

Isolating VGS1 in (11) and substituting into (10b) reveals that

$$ I_{tail} \approx \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_0}{\theta} (\Delta V + V_T - V_T) = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_0}{\theta} (\Delta V) \tag{12} $$

for Itail at the instant when all the tail current is shifted completely through
Q1. An alternate equation for Itail is derived in the steady state when the tail
current is split evenly between the transistors such that:

$$ I_{DS1} = I_{DS2} = \frac{I_{tail}}{2} = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_0}{\theta} (V_{GS(1,2)} - V_T) = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_0}{\theta} (V_{EFF}) \tag{13} $$

as VGS(1,2) is equal for both Q1 and Q2 in equilibrium. Equating Itail from (12)
and (13), we solve for ∆V, the differential voltage required to completely switch
the MOS differential pair:

$$ \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_0}{\theta} (\Delta V) = C_{ox} \frac{W_1}{L_1} \frac{\mu_0}{\theta} (V_{EFF}) \tag{14a} $$

$$ \Delta V \geq 2 V_{EFF} \tag{14b} $$

where the inequality is explicitly shown in (14b) to indicate a minimum
differential voltage requirement.
To find the minimum differential voltage to switch the tail current
completely through one of the MOS transistors in the square-law region (VGS <
0.5 V in Fig. 8), a parallel mathematical process to equations (10) through (14)
is completed. This is shown explicitly in [10]. The resultant required voltage
swing for full switching becomes:

$$ \Delta V \geq \sqrt{2}\, V_{EFF} \tag{15} $$

Hence, in the high-vertical field region, both the scalar multiplier and VEFF
itself are larger, requiring a greater differential voltage, and hence a longer
transition time, to switch the differential pair [8].
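As a numerical cross-check of the two limits, the mobility-degradation expression of (10a) can be solved for the switching voltage without the high-field approximation. The following is a minimal Python sketch and not part of the original analysis: the function name and the sample values of θ and VEFF are illustrative assumptions. It uses I_tail/2 = K·VEFF²/(1 + θ·VEFF) in the balanced state and I_tail = K·∆V²/(1 + θ·∆V) at full switching, where K lumps together Cox, µo and W/L.

```python
import math


def full_switch_ratio(v_eff, theta):
    """Return Delta_V / V_EFF needed to steer all of I_tail into one device.

    Balanced state:   I_tail / 2 = K * v_eff**2 / (1 + theta * v_eff)
    Fully switched:   I_tail     = K * dv**2    / (1 + theta * dv)
    Eliminating K and I_tail gives dv**2 - c*theta*dv - c = 0 with
    c = 2 * v_eff**2 / (1 + theta * v_eff); the positive root is returned.
    """
    c = 2.0 * v_eff ** 2 / (1.0 + theta * v_eff)
    dv = 0.5 * (c * theta + math.sqrt((c * theta) ** 2 + 4.0 * c))
    return dv / v_eff


if __name__ == "__main__":
    # theta values (in 1/V) are hypothetical and chosen only to span both regimes.
    for theta in (0.0, 1.0, 10.0, 1000.0):
        print(f"theta = {theta:7.1f} 1/V -> Delta_V / V_EFF = "
              f"{full_switch_ratio(0.3, theta):.3f}")
```

With θ = 0 the ratio equals the square-law value of √2, and it approaches 2 from below as θ·VEFF grows, which matches the trend described above.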
The outcome of this analysis emphasizes that gate-source voltages must be
limited to mitigate the effects of high-vertical fields on electron mobility. This
is accomplished through proper transistor sizing and tail current selection.
3 Circuit Design
3.1 Driver overview
Fig. 10 shows the block diagram of the backplane driver highlighting the four
sections of the circuit. The pre-emphasis path is placed in parallel with the
main signal path, and the currents from both output stages are summed across
the 50 Ω load resistors to develop the differential output voltage. The parallel
path consists of delay buffers and the digital differentiator circuit. The output
swing is adjusted through the tail current of the output driver, while the height
of the pre-emphasis is controlled by the relative tail current of the digital
differentiator.
Fig. 10: Driver block diagram
3.2 Biasing a MOS CML gate for optimal speed
MOS CML logic has only recently been applied to high-speed ICs [6]. Conse-
quently, no systematic design guidelines existed until recently [8]. As such,
MOS CML gates have been biased for optimal speed rather haphazardly using
a voltage-centric approach. We will show through simulation that, for MOS
CML, a current-density centric instead of a voltage-centric design approach
leads to more accurate and reliable circuit design.
Fig. 11: Constant peak-fT current density over four technology nodes [8]
In VGS-focussed MOS CML design, the effective gate voltage (VEFF = VGS -
VT) at which the peak fT of the MOSFET occurs scales with technology,
decreasing with every technology node. This makes it very difficult to predict
an optimal bias, both across multiple technologies and within a given
technology node. However, as the simulated data collected over four technology
nodes (# of fingers x gate length x finger width) and shown in Fig. 11 [8]
indicate, the peak-fT current density (JpfTMOS) remains
approximately constant (between 0.25 mA/µm and 0.35 mA/µm depending on
VDS) as technology scales. This trend is likely to occur also for future MOS
generations as a result of the constant field scaling that has been applied from
the 0.5-µm technology node [11] downward. Consequently, a current-density
centric design approach, similar to that which is commonly employed in bipolar
designs [12], is more appropriate for reproducible, accurate, high-speed
design of MOS CML circuits. In a current-density centric design scenario, the
gate width of the MOSFET is sized such that the device reaches its peak fT
when all of the tail current flows through the device:
$$ W_G = \frac{I_T}{J_{pfTMOS}} \tag{16} $$
In 130-nm technology, this corresponds to a VEFF of around 300 mV. Biasing
beyond the peak-fT current density will degrade circuit performance.
For large-signal high-speed circuit biasing in a MOS differential pair, it is
recommended that each of the differential pair transistors be biased at
half-peak-fT current density in the balanced steady state. This avoids current
densities beyond peak-fT when, during full switching, all the tail current is
routed through one transistor of the differential pair and the current density
momentarily doubles from half-peak-fT to peak-fT. Referring back to
the discussion of Section 2.4, biasing the circuit at half-peak-fT has a second
positive effect on circuit switching speed. Biasing at half-peak-fT current
density instead of full-peak-fT current density permits the differential pair
transistors to operate more in the square-law region instead of the
slower-switching high-vertical field region.
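The sizing rule of equation (16) reduces to a single division. The short Python sketch below illustrates it; the function names are hypothetical, and the 12 mA and 32 µm figures are borrowed from the eye-crossing stage of Section 3.5.1 purely as sample inputs.

```python
def gate_width_um(i_tail_ma, j_pft_ma_per_um):
    """Equation (16): gate width at which a device carrying the whole tail
    current sits at the peak-fT current density."""
    return i_tail_ma / j_pft_ma_per_um


def balanced_density_ma_per_um(i_tail_ma, width_um):
    """Current density per device when the tail current splits evenly."""
    return 0.5 * i_tail_ma / width_um


if __name__ == "__main__":
    # Width for a 12 mA tail current at an assumed J_pfT of 0.3 mA/um.
    print(f"W_G = {gate_width_um(12.0, 0.3):.1f} um")
    # Steady-state density for a 12 mA tail current and 32 um devices.
    print(f"J   = {balanced_density_ma_per_um(12.0, 32.0):.4f} mA/um")
```

Because the balanced density is half the fully switched density, a device sized with (16) sits at half-peak-fT in the steady state.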
Based on this observation, each circuit block in the presented driver design
consists of a MOS-CML inverter whose ratio of tail current to differential pair
transistor width is set to correspond to the peak fT bias of the n-channel MOS-
FET of 0.25 to 0.3 mA/µm. This bias scheme is implemented to obtain the
maximum switching speed. Inductive peaking is employed in every stage to
further improve the circuit bandwidth.
3.3 Biasing a differential CMOS inverter for low-noise
Fig. 12 plots fT and NFMIN versus current density for a 130-nm n-MOSFET
with a 2 µm unit finger width. NFMIN reaches its minimum at a current density
of about half the peak-fT current density. Section 3.2 showed that biasing each
transistor in a differential pair at half-peak-fT current density results in
optimal switching speed. Coincidentally, this bias point also results in
minimum NFMIN.
Fig. 12: fT and NFMIN vs. current density for a 130-nm n-MOSFET with 2 µm unit finger width biased at VDS = 1V
3.4 Input matching network and low-noise comparator
Fig. 13 illustrates the input matching network and input low-noise compara-
tor. The input differential pair has higher gain and larger tail current than
the other stages in order to reduce the noise by making the optimum noise
impedance of the input stage closer to 50Ω per side. A compromise was
reached between achieving the best possible noise match, which calls for
larger transistor sizes and bias current, and the broadband input impedance
match. On-chip matching resistors, realized as a resistive divider with series
inductors, provide appropriate gate bias for the input transistors and broad-
band input impedance matching.
[Fig. 12 axes: current density (mA per µm width), 10^-2 to 10^0; fT (GHz), 0 to 100; NFMIN @ 10 GHz (dB), 0.0 to 2.0]
Fig. 13: Input bias and matching network and comparator
3.5 Eye-Crossing Control
A key objective of the thesis was the design of a driver with controllable pulse-
width. The application of controllable pulse width is the compensation of DC
offsets that may cause signal distortion. This compensation is used to alter the
duty-cycle (of an input signal) such that a 50% duty-cycle periodic input may
be changed to a 30% or 70% duty-cycle periodic output signal. Conversely, a
DC offset which has imposed an output duty cycle of 30% could be negated to
re-instate the duty-cycle back to 50% as desired.
The circuit of Fig. 14 accomplishes pulse-width control using a technique
found in [13]. The circuit consists of two series inductively-peaked differential
stages with a DC offset control pair connected at the output of the first stage.
By applying a DC voltage Voffset, an offset voltage is developed at the output
node of the first differential pair, shifting the zero-crossing between the two
outputs. Due to the finite rise and fall time of the waveform, and the trunca-
tion by the limiting action of the last inverter, Vout exhibits a change in duty-
cycle as illustrated by the overlaid waveforms of Fig. 14.
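The dependence of duty cycle on the applied offset can be illustrated with an idealized model. The Python sketch below is an assumption-laden illustration and not the circuit of Fig. 14: it assumes a periodic waveform with a nominal 50% duty cycle, perfectly linear edges and an ideal limiter. The edge time, swing and offset values are hypothetical.

```python
def duty_cycle(v_offset, v_swing_pp, t_edge, period):
    """Duty cycle after limiting, for a waveform with linear edges.

    v_offset   : DC offset added before the limiter (V)
    v_swing_pp : peak-to-peak swing before limiting (V)
    t_edge     : 0-100% edge duration (s)
    period     : period of the repeating pattern (s)

    An offset moves each zero crossing by t_edge * v_offset / v_swing_pp:
    the rising edge crosses earlier and the falling edge later, so the
    high time grows by twice that amount.
    """
    if abs(v_offset) >= 0.5 * v_swing_pp:
        raise ValueError("offset exceeds half the swing: no zero crossing")
    shift = t_edge * v_offset / v_swing_pp
    return 0.5 + 2.0 * shift / period


if __name__ == "__main__":
    # Hypothetical numbers: 1010 pattern at 20 Gb/s (100 ps period),
    # 25 ps edges, 400 mV swing.
    for v_off in (-0.16, 0.0, 0.16):
        d = duty_cycle(v_off, 0.4, 25e-12, 100e-12)
        print(f"Voffset = {v_off:+.2f} V -> duty cycle = {100 * d:.0f}%")
```

In this model the achievable range scales with the ratio of edge time to period, which is why finite rise and fall times are needed for the technique to work.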
3.5.1 Transistor sizing and biasing in the eye-crossing control circuit
On the left side of Fig. 14, the input pair simply drives a series RL load, no
differently than any of the other inductively peaked circuits of this backplane
driver. The DC bias voltage Voffset of the middle pair of transistors (M3 and
M4) must be held constant; otherwise a mixing effect would occur, because the
current drawn by M3 and M4 acts on the same signal path as M1 and M2. Hence
it must be ensured that the control voltage Voffset comes from a low-noise
source.
The third stage acts as a limiting amplifier as previously described. The tail
current was chosen to be 12 mA and the differential pair transistors were
chosen to be 32 µm wide each, so that a current density of 0.19 mA/µm flows in
each during steady state. This biasing is in line with the recommendations for
half-peak-fT biasing derived in Section 3.2.
Fig. 14: Eye-crossing control circuit and intermediate signal
3.6 Output driver and parallel current summation
The output driver and digital pre-emphasis circuit, in parallel, create the
output voltage across the output load resistor. Fig. 15 shows how the output
currents of the output driver and the digital differentiator are summed.
The ratio of the output current from the output driver (I1 in Fig. 15) to the
output current contribution from the digital differentiator (I2 in Fig. 15)
determines the percentage of pre-emphasis in the overall output voltage, Vout.
The higher this ratio, the lower the percentage of pre-emphasis in Vout.
Fig. 15: Summation of output currents across output resistor
The output driver is shown in Fig. 16 and consists of a simple differential pair
biased with a current mirror. The value of bias current in the output driver
determines the magnitude of output signal swing from the main path.
Fig. 16: Output driver transistor-level schematic
3.7 Digital pre-emphasis
The digital pre-emphasis circuit, whose block diagram is shown in Fig. 17, fea-
tures three delay cells, implemented as inverters, followed by a digital differ-
entiator circuit. The third delay cell is loaded additionally with MOS
varactors connected in parallel with resistive loads in order to control the
delay and, therefore, the pre-emphasis spike width.
Fig. 18 illustrates the waveforms in the digital differentiator. The circuit has
two differential inputs, (IN, $\overline{IN}$) and (INdly, $\overline{INdly}$),
and a differential output (VoutP, VoutN) whose currents are summed with the
currents from the main path across the on-chip 50 Ω load resistors.
[Fig. 16 labels: Vin+, Vin-, VoutP, VoutN, M1, M2; Ibias = 2 mA; Iout = 20 mA; Vdd = 1.5 V; Rload (shared with digital differentiator) on each side]
Fig. 17: Digital pre-emphasis block diagram
Fig. 18: Pre-emphasis waveforms and transfer function
The input signals are assumed to be periodic for this example. Input signals
IN and $\overline{IN}$ are delayed through the delay circuit to produce INdly
and $\overline{INdly}$, respectively. As shown by the dotted arrows in Fig. 18,
the delay between IN and INdly translates into the pre-emphasis pulse width of
VoutP. The same relationship holds for $\overline{IN}$, $\overline{INdly}$, and
VoutN, respectively. The four signals IN, $\overline{IN}$, INdly and
$\overline{INdly}$ are fed into the digital differentiator circuit, which has
two outputs, VoutP and VoutN.
The differentiator functions as a logical XOR gate with one notable
exception. Whereas an XOR gate in the classical sense operates within a binary
logic system, this logic circuit operates on a ternary (three-level) logic system.
Specifically, when IN·$\overline{INdly}$ is true, the output VoutP rises; when
$\overline{IN}$·INdly is true, the output VoutP falls; and when neither case is
true, the output of the circuit stays in steady-state. The output VoutN falls
and rises in a horizontally-mirrored fashion.
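The three-level behaviour can be sketched on a sampled waveform. The Python example below is an illustration under stated assumptions rather than a model of the actual circuit: it assumes the rise condition is IN AND NOT INdly, the fall condition is NOT IN AND INdly, and that the delay is a whole number of samples. The function name and the sample sequence are hypothetical.

```python
def ternary_differentiate(samples, delay):
    """Map a sampled binary waveform to three-level pulses for VoutP.

    Returns +1 where IN is high and the delayed copy is still low
    (VoutP rises), -1 where IN is low and the delayed copy is still
    high (VoutP falls), and 0 otherwise (steady state).
    """
    levels = []
    for k, current in enumerate(samples):
        # Before the first delayed sample exists, hold the initial value.
        delayed = samples[k - delay] if k >= delay else samples[0]
        if current and not delayed:
            levels.append(+1)
        elif delayed and not current:
            levels.append(-1)
        else:
            levels.append(0)
    return levels


if __name__ == "__main__":
    waveform = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
    print(ternary_differentiate(waveform, delay=2))
```

Each transition produces a pulse whose width equals the delay, which is the relationship between delay and spike width described above. Negating every level gives the corresponding sequence for VoutN.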
The transformation from binary logic at the input of the differentiator to
ternary logic at its output is accomplished using current switching. The circuit
of Fig. 19 is biased by two constant current sources of equal value, Iswch. The
voltages at VoutP and VoutN are set by the voltage drops that the switched
currents develop across each 50 Ω load resistor. In the steady state, both VoutP
and VoutN are pulled down by an equal current of value Iswch. When VoutP rises
(and VoutN drops), the current drain path pulling down VoutP is cut off, and
VoutN is pulled down by a current equal to 2Iswch. Conversely, when VoutP
drops (and VoutN rises), VoutP is pulled down by current 2Iswch, and VoutN has
its current path to ground cut off.
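The current steering described above can be tabulated directly. The Python sketch below is a simplified illustration: it considers only the differentiator's own contribution, ignores the main-path current and any off-chip termination in parallel with the on-chip 50 Ω loads, and uses a hypothetical Iswch of 5 mA. The function names are assumptions.

```python
def differentiator_currents(state, i_swch_ma):
    """Currents (mA) pulled from the VoutP and VoutN loads.

    state = +1: VoutP rises, 0: steady state, -1: VoutP falls.
    The two sources always sink 2 * Iswch in total; only the split changes.
    """
    if state > 0:
        return 0.0, 2.0 * i_swch_ma
    if state < 0:
        return 2.0 * i_swch_ma, 0.0
    return i_swch_ma, i_swch_ma


def differentiator_voltages(state, i_swch_ma, r_load_ohm=50.0, vdd=1.5):
    """Node voltages (V) if only the differentiator loaded the resistors."""
    i_p, i_n = differentiator_currents(state, i_swch_ma)
    return vdd - r_load_ohm * i_p * 1e-3, vdd - r_load_ohm * i_n * 1e-3


if __name__ == "__main__":
    for state in (+1, 0, -1):
        v_p, v_n = differentiator_voltages(state, 5.0)
        print(f"state {state:+d}: VoutP = {v_p:.2f} V, VoutN = {v_n:.2f} V")
```

Because the total current is constant, the common-mode level of the two outputs does not move when a pulse is generated; only the differential component changes.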
Fig. 19: NMOS digital differentiator schematic
In Fig. 19, the pre-emphasis height is controlled by the two constant current
sources Iswch, whose value is adjustable between 0 and 10 mA. For matching,
transistors M3 and M6 compensate the VDS drop across transistors M1, M2
and M4, M5, respectively.
4 Simulation Results
4.1 S-Parameter simulations
The small signal S-parameters were simulated in a single-ended input and
output configuration. The input and output return loss are better than -5 dB
up to 60 GHz, as shown in Fig. 20, with S11 achieving -10dB up to 50 GHz.
The single-ended simulated small signal gain is 14.5dB, confirmed by the eye
diagram measurements of Fig. 21. Further, Fig. 20 shows that the 3dB-fre-
quency of the driver is 8.5 GHz and the driver has gain (> 0dB) up to 24.1
GHz.
Fig. 20: Simulated S22 and S11 of complete driver circuit to 60 GHz
Fig. 21: Simulated single-ended S21 of entire driver circuit
With the driver’s output amplitude control set to a low level, the resulting
simulated S21 is as shown in Fig. 22. The output amplitude is controlled by an
off-chip current source and can be manipulated to increase or decrease the low
frequency gain peak. The decreased gain gives an extended 3dB-bandwidth
and 0dB crossing of 9.3 GHz and 37.6 GHz, respectively.
Fig. 22: Simulated single-ended S21 of driver with output amplitude reduced using amplitude control
S21 simulations with output peaking enabled are shown in Fig. 23. The peaking
is evident in the simulation, increasing the maximum small signal gain
from 14.1dB up to 16.5dB and the 3dB frequency to 13.3 GHz. The frequency
shaping effect of the pre-emphasis output circuit is exemplified here.
Fig. 23: Simulated S21 with output peaking enabled
4.2 Time domain simulations
Time domain eye-diagram simulations were performed on the full driver circuit
with a 2^7-1 PRBS (pseudo-random binary sequence) generator. In Fig. 24,
a 20 Gb/s input signal was applied single-ended and the unused input was
terminated with a 50 Ω resistance connected to VDD.
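For readers who want to reproduce a comparable stimulus, a 2^7-1 PRBS can be generated with a seven-bit linear feedback shift register. The Python sketch below is a hedged illustration: the generator polynomial used in the simulations is not stated, so the commonly used x^7 + x^6 + 1 is assumed here, and the function name and seed are hypothetical.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a 2^7-1 PRBS using x^7 + x^6 + 1 (assumed polynomial)."""
    if seed & 0x7F == 0:
        raise ValueError("seed must be non-zero in its low seven bits")
    state = seed & 0x7F
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of register stages 7 and 6.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(new_bit)
        # Shift left by one and insert the feedback bit at the bottom.
        state = ((state << 1) | new_bit) & 0x7F
    return bits


if __name__ == "__main__":
    sequence = prbs7(127)
    print("".join(str(b) for b in sequence))
    print("ones in one period:", sum(sequence))
```

The sequence repeats every 127 bits and contains runs of at most seven identical bits, which makes it a fairly light stress pattern compared with the 2^31-1 pattern used in the measurements.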
Fig. 24: 20Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 20 mVpp; differential output 99mVpp per side
Fig. 25 and Fig. 26 show simulated eye-diagrams at 25 Gbps and 30 Gbps,
respectively. The circuit exhibits reduced sensitivity at 25 Gbps and 30 Gbps,
requiring single-ended inputs of 60 mVpp and 140 mVpp, respectively, for
similar eye openings.
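The input and output amplitudes quoted in the captions of Fig. 24 to Fig. 26 can be converted to an approximate large-signal gain. The Python sketch below does only this arithmetic; treating the per-side output swing against the single-ended input swing as a gain figure is a simplifying assumption, and the function name is hypothetical.

```python
import math


def gain_db(v_out_mvpp, v_in_mvpp):
    """Voltage gain in dB from peak-to-peak output and input amplitudes."""
    return 20.0 * math.log10(v_out_mvpp / v_in_mvpp)


if __name__ == "__main__":
    # (data rate in Gb/s, input mVpp, output mVpp per side), from the captions.
    simulated = [(20, 20.0, 99.0), (25, 60.0, 180.0), (30, 140.0, 300.0)]
    for rate, v_in, v_out in simulated:
        print(f"{rate} Gb/s: {gain_db(v_out, v_in):.1f} dB")
```

The gain falls from about 13.9 dB at 20 Gb/s to about 9.5 dB at 25 Gb/s and 6.6 dB at 30 Gb/s, which is the reduced sensitivity noted above expressed in dB.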
Fig. 25: 25Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 60 mVpp; differential output 180mVpp per side
Fig. 26: 30Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 140 mVpp; differential output 300mVpp per side
Fig. 27 demonstrates the simulated eye-crossing control performance at 20
Gb/s taken at the output of the driver. The eye-crossing control for (a) 50%, (b)
66%, and (c) 33% was performed by varying the control voltage that was
connected to off-chip voltage sources (Voffset of Section 3.5.1).
Fig. 27: Simulated eye-crossing control at 20 Gb/s (a) 50%, (b) 66%, and (c) 33%
Fig. 28 shows output amplitude control at 20 Gb/s. The input signal in both
diagrams is 20 mVpp, applied to one side only, and the output is varied
between 21 mVpp and 220 mVpp.
Fig. 28: Output amplitude control at 20 Gb/s; input of 20 mVpp applied to a single end; output of (a) 21 mVpp and (b) 220 mVpp per side
Simulated waveforms with varying levels of pre-emphasis at 20Gb/s are
shown in Fig. 29 and Fig. 30. By altering the current bias of the parallel digi-
tal pre-emphasis path, the pre-emphasis of the overall driver can be controlled
independently of the main path of the driver. Fig. 29 shows a symmetric +/-
16% overshoot/undershoot ratio, relative to the voltage swing of the main-
path signal. Fig. 30 shows an asymmetric +33%/-25% overshoot/undershoot
ratio. The asymmetry in Fig. 30 is attributed to a systemic limitation of the
pre-emphasis circuit: the maximum output voltage cannot exceed VDD,
while the minimum value must be higher than 3*VDS(sat) due to the
triple-stacked NMOS configuration of Fig. 19. Hence, the ratio of
overshoot/undershoot relative to the isolated main-path signal swing is directly
related to the pre-emphasis bias current. A lower digital pre-emphasis current
results in a symmetric overshoot/undershoot ratio as shown in Fig. 29, but the
relative percentage of overshoot/undershoot reduces to less than 25%.
Fig. 29: 20Gb/s output eye diagram with 316 mVpp swing per side and +/-16% pre-emphasis
Fig. 30: 20Gb/s output eye diagram with 300 mVpp swing per side and +33%/-25% pre-emphasis
5 Physical Implementation
5.1 Inductor design and model extraction
The computer-aided design (CAD) tool ASITIC
(http://rfic.eecs.berkeley.edu/~niknejad/asitic.html) was used in the
simulation and sizing of the inductors in the driver. ASITIC is a
three-dimensional field solver which aids the RF circuit designer in the
optimization and modelling of spiral inductors, transformers, capacitors, and
substrate coupling. Test structures previously fabricated and characterized at
the University of Toronto have confirmed that the ASITIC solver predicts
absolute inductance values with 90-95% accuracy, for both planar and stacked
spiral inductors.
All inductors of the driver were simulated as two-port circuits using both the
π-model of Fig. 31 and the lumped element broadband model shown in Fig.
32. In the substrate, the eddy-current induced loss and substrate capacitance
are represented by Rsub and Csub, respectively. The oxide capacitance is
represented by Cox. The series inductance and resistance of the inductor proper
are represented by Ls and Rs, respectively. Finally, the capacitance between the
two symmetrical interwoven arms of the inductor is represented by Ciw.
None of the passive lumped elements in the model of Fig. 32 could be omitted
to reduce simulation time because each inductor was not attached directly to a
ground node in the design. This design requirement was discussed previously
in section 2.3. Further, as each inductor was employed in a broadband load
configuration, design considerations gave priority to self-resonant frequency
over quality factor (Q) in the optimization of each inductor.
5.1.1 Model extraction using ASITIC
This section outlines the procedure used to extract both the π-model of the
inductor, which is valid only at a single frequency, and the broadband
model of the inductor, which is valid at frequencies below the self-resonant
frequency.
The π-model of the inductor is shown in Fig. 31 and is valid only at a single-
frequency. The ASITIC three-dimensional field-solver is invoked using the
command pix to resolve the circuit parameters Ls, Rs, Cox1, Cox2, Rsub1, and
Rsub2 from an inductor layout drawn in the ASITIC graphical tool. These val-
ues are assumed correct at all frequencies below self-resonance for the induc-
tor in question. The broadband model introduces three fitting capacitances
(Ciw, Csub1, Csub2) to account for the broadband frequency response of the
inductor’s parasitics.
The broadband model capacitors are found using a Y-parameter simulation
from 5 GHz up to the self-resonant frequency of the inductor, in 1 GHz steps.
The output of the Y-parameter data is then ported to a SPICE optimization
deck. The optimization deck is included in Appendix A for reference. The
SPICE optimization deck will best fit the three broadband model capacitances
to match the Y-parameter data simulated in ASITIC. As three unknowns are
being optimized for simultaneously, reasonable and educated estimates for
each of the broadband capacitances (Ciw, Csub1, Csub2) are substituted initially.
The SPICE deck in Appendix A will also print L12, Leff and Q vs. frequency.
Fig. 31: Inductor single-frequency π-model
Fig. 32: Inductor lumped element broadband model
Derived ASITIC parameters
Table 3 and Table 4 show the physical dimensions and the simulated values of
the equivalent circuit parameters for each of the three inductors employed in
this design. All inductors have dual-layer stacked-spiral configurations with
varying numbers of turns. The largest, the 900pH inductor, has an outer
diameter (per side) of 51.2µm.
Table 3: Physical dimensions of each inductor
Parameter                       LS = 400pH       LS = 700pH       LS = 900pH
Diameter (µm)                   42.2             46.2             51.2
Number of turns                 2                3                3
Metal width (µm)                1.96             1.96             1.96
Spacing between windings (µm)   1.96             1.96             1.96
Metal layers                    METAL6, METAL5   METAL6, METAL5   METAL6, METAL5
5.1.2 Inductor realization and isolation
The 900pH inductor is shown in Fig. 33 and consists of two metal layers:
metal six, the top level of the six-metal process, and metal five. The left-side
terminal of the inductor is formed in metal six and connects directly to the
drain of the amplifying transistor of that half stage. The right-side port of the
inductor is routed in metal five and is connected to a poly-silicon resistor at
each stage.
Each inductor is isolated from each adjacent inductor to minimize crosstalk.
The isolation is accomplished by surrounding each inductor with n-wells in
turn surrounded by p-tap guard rings connected to ground to impose reverse-
biasing on the junctions. Additionally, referring to Fig. 33, the p-taps between
adjacent inductors are electrically attached to a stack of metal layers, from
metal 1 through metal 6, which form a Faraday cage and improve isolation.
Table 4: Simulated inductor parasitic values
Parameter           LS = 400pH   LS = 700pH   LS = 900pH
RS (Ω)              5.14         8.01         9.13
Cox1 (fF)           8.63         11           13
Cox2 (fF)           8.2          10.8         12.7
Rsub1 (Ω)           1820         1830         1710
Rsub2 (Ω)           2280         2130         1980
Csub1 (fF)          5.72         5.76         6.16
Csub2 (fF)          4.70         4.95         5.32
Ciw (fF)            4.69         6.26         7.24
fSelf-Res (GHz)     85.6         56.7         45.93
Q-factor (5 GHz)    2.4          2.7          3.1
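The tabulated Q-factors can be sanity-checked against the series elements alone. The Python sketch below uses the first-order estimate Q ≈ 2πf·Ls/Rs, which deliberately ignores the oxide, substrate and interwinding parasitics; it is an approximation offered for illustration and not the extraction procedure used with ASITIC. The function name is hypothetical.

```python
import math


def series_q(l_henry, r_ohm, f_hz):
    """First-order quality factor of a series L-R branch at frequency f_hz."""
    return 2.0 * math.pi * f_hz * l_henry / r_ohm


if __name__ == "__main__":
    # (nominal Ls in H, Rs in ohm) taken from the table above.
    inductors = [(400e-12, 5.14), (700e-12, 8.01), (900e-12, 9.13)]
    for l_s, r_s in inductors:
        q = series_q(l_s, r_s, 5e9)
        print(f"Ls = {l_s * 1e12:.0f} pH: Q(5 GHz) = {q:.2f}")
```

At 5 GHz, well below self-resonance, this estimate agrees with the tabulated Q values to one decimal place, suggesting that the series resistance dominates the loss at that frequency.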
Fig. 33: On-chip 900 pH inductor with 51.2µm diameter
5.2 Layout and fabrication
The circuit was fabricated in ST Microelectronics’ 130-nm standard CMOS
process with typical n-MOSFET fT and fMAX of 90 GHz and 100 GHz, respec-
tively. The chip microphotograph is reproduced in Fig. 34. The design is pad-
limited and the total die area was 1.0mm x 0.8mm.
Fig. 34: Full chip photograph
Magnified versions of the two parallel paths of the driver are shown in Fig.
35 and Fig. 36, respectively. Using Fig. 10 as a reference that describes the
schematic block diagram of the entire driver, Fig. 35 emphasizes the layout of
the main signal path of the driver while Fig. 36 details the layout of the paral-
lel pre-emphasis signal path.
Fig. 35: Magnified photograph of main path layout
Fig. 36: Magnified photograph of the parallel path layout
The entire layout, with the exception of the digital pre-emphasis XOR gate, is
completely symmetric about the horizontal axis. Power and ground connec-
tions are provided along the central axis in metal 1 and metal 2, respectively.
Transistors and poly-silicon resistors are located adjacent to the division, fol-
lowed by signal path routing, with the peaking inductors located on the out-
side.
The fully-symmetrical layout results in several advantages: (i) the layout of
the entire circuit is simplified by employing half-cell layout and replication
techniques; (ii) the positive and negative signal paths are matched in length
and are physically and electrically isolated from one another; and (iii) the
coupling coefficients of same-stage load inductors are diminished.
6 Experimental Results
6.1 Test environment
The fabricated circuit was tested on wafer, with probes placed on the circuit
pads under a microscope. The frequency-domain measurements were made
with a 50-GHz Hewlett-Packard 8510C Vector Network Analyzer (VNA), as
shown in Fig. 37. The time-domain 2^31-1 PRBS stimulus was created with a
combination of an Anritsu 69397B Signal Generator, an MP17584 Pulse Pattern
Generator, and an MP1801A 43.5 Gbps MUX, as shown in Fig. 38.
In Fig. 38, the signal generator produces a CLK signal which is driven to the
MUX and is also divided by four and sent to the pulse pattern generator. The
pulse pattern generator produces four 1/4 DATA signals, each with a bitrate
one-quarter of the final PRBS stream bitrate. The four 1/4 DATA signals are
time-multiplexed, with the original CLK controlling the switching of the MUX
on positive CLK edges. Hence, the output of the MUX is a 2^31-1 PRBS bit
sequence at the required bitrate for the test, not exceeding 43.5 Gbps.
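The 4:1 time multiplexing can be pictured as a round-robin interleave of four quarter-rate lanes. The Python sketch below is only a behavioural illustration: the lane ordering and the function names are assumptions, and it says nothing about how the pattern generator aligns its four outputs so that the interleaved stream is itself a PRBS.

```python
def mux4(lanes):
    """Interleave four equal-length quarter-rate bit streams into one stream.

    One bit is taken from each lane in turn (lane 0 first), so the output
    carries four bits for every quarter-rate bit period.
    """
    if len(lanes) != 4 or len({len(lane) for lane in lanes}) != 1:
        raise ValueError("expected four lanes of equal length")
    return [bit for group in zip(*lanes) for bit in group]


def lane_rate_gbps(full_rate_gbps):
    """Bit rate of each quarter-rate lane for a given output rate."""
    return full_rate_gbps / 4.0


if __name__ == "__main__":
    lanes = [[1, 0], [1, 1], [0, 0], [0, 1]]
    print(mux4(lanes))
    print(f"lane rate for a 20 Gb/s test: {lane_rate_gbps(20.0)} Gb/s")
```

For a 20 Gb/s test each lane therefore runs at 5 Gb/s, which is why the pattern generator is clocked from the divided-by-four CLK.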
Fig. 37: S-parameter test setup
Fig. 38: Eye-diagram test setup
[Fig. 38 block labels: 69397B Synthesized Sweep/Signal Generator; MP17584 Pulse Pattern Generator; MP1801A 43.5 Gb/s MUX; D.U.T.; DC Bias / Control Probes; 50 ohm; Vdd; 86100B DCA. Signal labels: CLK; 1/4 CLK; 1/4 DATA-RATE PRBS; PRBS (DATA-Rate); PRBS reference CLK; 20 - 43.5 Gb/s]
6.2 S-Parameters and noise figure
The small signal s-parameters and noise figure were measured in a single-
ended configuration. The small signal gain, S21, is shown in Fig. 39 and
agrees well with the simulated results of Fig. 21. Measured single-ended S21
bandwidth was 8.5 GHz, similar to the simulated small-signal gain. The mea-
sured input and output return loss are better than -12 dB up to 50 GHz, as
shown in Fig. 40, outperforming the simulated value of -5dB up to 60 GHz in
Fig. 20. Measured and simulated noise figure values are plotted in Fig. 41. As
expected, the simulations show an inverse relationship between tail current
and overall driver noise figure. Furthermore, the measured noise figure of the
overall driver was 2 dB higher than simulated for a 6mA tail current in the
input comparator. This is primarily due to losses associated with the probe-
pad contact resistance and the series substrate resistance below the pad
which are not accounted for in simulation. Another reason for the higher than
simulated noise figure is the limitation of the BSIM3 model for MOSFETs
that does not capture the gate noise current of the MOSFET.
Fig. 39: Measured single-ended S21
Fig. 40: Measured single-ended S22 and S11
Fig. 41: Measured and simulated driver noise figure
6.3 Time-domain measurements
Time domain measurements were carried out at data rates between 20 Gb/s
and 43 Gb/s using a 2^31-1 PRBS pattern. A sensitivity of 20 mVpp, as
illustrated in Fig. 42, was measured at 20 Gb/s when the input signal was
applied single-ended and the unused input was terminated with a 50 Ω
resistance. The sensitivity degraded to 60 mVpp and 150 mVpp at 25 Gb/s and
30 Gb/s, respectively.
Fig. 42: 20Gb/s eye-diagrams 2^31-1 PRBS: single-ended input 20 mVpp; differential output 84mVpp per side
Typical 25 Gb/s and 30 Gb/s output eye diagrams are illustrated in Fig. 43
and Fig. 44, respectively. It is important to note that the driver exhibits gain
at 30 Gb/s, with a 200mVpp input signal resulting in a 260mVpp output signal.
This is the first known report of positive gain at 30 Gb/s in an all-CMOS
driver to date (2005).
Fig. 43: 25Gb/s output eye diagram with 50% eye crossing
Fig. 44: 30 Gb/s output eye-diagram with 260 mVpp per side for a single-ended 200 mVpp, 2^31-1 PRBS input
Fig. 45 demonstrates the eye-crossing control performance at 20 Gb/s taken at
the output of the driver. The eye-crossing control for (a) 50%, (b) 70%, and (c)
30% was performed by varying the control voltage that was connected to off-
chip voltage sources.
Fig. 45: Eye-crossing control at 20 Gb/s (a) 50%, (b) 70%, and (c) 30%
Fig. 46 shows output amplitude control at 20 Gb/s. The input signal in both
diagrams is 200 mVpp, applied to one side only, and the output is varied
between 190 mVpp and 350 mVpp.
Fig. 46: Output amplitude control at 20 Gb/s; Output of (a) 190 mVpp and (b) 350 mVpp per side
Measured waveforms with pre-emphasis at 20Gb/s are shown in Fig. 47. The
eye-diagram exhibits higher positive overshoot than undershoot, with spike
height control between 0% and 25% of the eye height. The
overshoot/undershoot imbalance shown in Fig. 47 was caused by an excessively
high ratio of the pre-emphasis path tail current to the main path tail current.
The ratio of overshoot/undershoot relative to the output signal swing alone is
directly related to the pre-emphasis path tail current value. A lower digital
pre-emphasis tail current could have resulted in a symmetrical
overshoot/undershoot ratio, but the relative percentage of overshoot/undershoot
would have been reduced to less than 15%. Unfortunately, experimental
evidence of this was not captured due to time restrictions at the testing
facilities at Quake Technologies and ST Microelectronics, both located in
Ottawa, Canada.
Fig. 47: 20Gb/s output eye diagram with 300 mVpp swing per side and pre-emphasis
6.4 Performance summary
Table 5 summarizes the overall circuit characteristics. Of special note are the
high input sensitivity and high output swing, the extensive -12dB input/output
matching up to 50 GHz and the multiplicity of control mechanisms with
respect to output amplitude, pre-emphasis and eye-crossing. Further, there is
excellent agreement between the simulated and measured values of the driver
in both the time domain and the frequency domain.
Table 5: Simulated and measured circuit parameters
Parameter                                   Simulated Value          Measured Value
Supply                                      1.5V                     1.5V
Power                                       150 mW                   150 mW
Output swing @ 20Gb/s with 20 mVpp input    21-350 mVpp per side     190-350 mVpp per side
Pre-emphasis control @ 20 Gb/s              +33%/-25%                +25%/-15%
Crossing control @ 20 Gb/s                  33% to 66%               30% to 70%
Eye sensitivity @ 20 Gb/s                   20 (10) mVpp             20 (10) mVpp
Noise Figure                                14.9 dB @ 5 GHz,         17 dB @ 5 GHz,
                                            15 dB @ 15 GHz           17 dB @ 15 GHz
S11/S22 up to 50 GHz                        < -5dB                   < -12dB
7 Conclusions
7.1 Summary
A 20 Gb/s backplane driver with more than 30 dB dynamic range was imple-
mented in 130-nm CMOS technology. The circuit consumes 150 mW from a 1.5
V supply and features independent control of output swing, duty cycle and
pre-emphasis. The circuit is operational without pre-emphasis at data rates
up to 30 Gb/s with 300 mVpp swing per side. The pre-emphasis pulse is both
amplitude and width controllable via the introduction of a novel digital circuit
implementation which does not require a separate clock signal. The results of
this project were published and presented at the IEEE Custom Integrated
Circuits Conference in 2004 [1].
7.2 Future work
Future work associated with this design would address further system
verification, expansion and industrialization of the circuit. As time-domain
testing was completed in industrial settings, tester time availability was low
and, in particular, pre-emphasis experimentation was affected. Further testing
of the pre-emphasis circuit would involve varying the varactor setting to
increase and decrease the pre-emphasis pulse width, and varying the relative
current bias of the digital differentiator to empirically match positive
and negative pre-emphasis pulse heights. In terms of system expansion and
industrialization, a receiver circuit placed across a backplane with appropriate
mechanisms to control the pre-emphasis would be required to close the
feedback loop. Further industrialization of the main path would require
instantiation of on-chip and off-chip reference voltages and currents for each
of its independently controlled stages. Finally, verification of the circuit in a
datapath operation is required to measure its true performance in a digital
data-specific application. This would involve the addition of a Media Access
Control (MAC) circuit and layer at the input of the driver as well as its MAC
counterpart on the receiving end.
References
[1] P. Westergaard, S.P. Voinigescu, T.O. Dickson, “A 1.5-V, 20/30-Gb/s CMOS
Backplane Driver with Digital Pre-emphasis,” Proc. IEEE Custom Integrated
Circuits Conference, pp. 23-26, Orlando, FL, Oct. 2004
[2] S. Galal, B. Razavi, “40Gb/s Amplifier and ESD protection Circuit in 0.18-
um CMOS Technology,” IEEE ISSCC Digest, pp.480-481, 2004
[3] D. Kehrer, H.D. Wohlmuth, “40 Gb/s 2:1 Multiplexer and 1:2 Demulti-
plexer in 120 nm CMOS,” IEEE ISSCC Digest, pp. 345-346, 2003
[4] J. Liu, X. Lin, “Equalization in high-speed communication systems,” Cir-
cuits and Systems Magazine, IEEE, Volume 4, Issue 2, pp. 4-17, 2004
[5] S.S. Mohan, M.D.M. Hershenson, S.P. Boyd, T.H. Lee, “Bandwidth exten-
sion in CMOS with optimized on-chip inductors,” IEEE Journal of Solid
State Circuits, Volume 35, Issue 3, March 2000, pp. 346 - 355
[6] M. Green, “Current-controlled CMOS circuits with Inductive broadband-
ing,” U.S. Patent 6,525,571, Filed Sept. 26, 2001
[7] S. Ramo, J.R. Whinnery, T. Van Duzer, Fields and Waves in Communica-
tions Electronics, 3rd. Ed. New York, John Wiley & Sons, 1994
[8] T. O. Dickson, R. Beerkens, S. P. Voinigescu, “A 2.5-V, 45-Gb/s Decision Cir-
cuit Using SiGe BiCMOS Logic," IEEE Journal of Solid-State Circuits, Vol-
ume 40, Issue 4, pp. 994-1003, April 2005
[9] B. Razavi, Design of Analog CMOS Integrated Circuits, 1st Ed. New York:
McGraw-Hill, 2001
[10] A. Sedra, K. Smith, Microelectronic Circuits, 4th Ed. New York: Oxford
Press, 1998
[11] S.P. Voinigescu, T.O. Dickson, R. Beerkens, I. Khalid, P. Westergaard, "A
Comparison of Si CMOS, SiGe BiCMOS, and InP HBTs Technologies for
High-Speed and Millimeter-wave ICs," Si Monolithic Integrated Circuits in
RF Systems, pp.111-114, Atlanta, GA, Sept. 2004
[12] R. Ranfft, H.M. Rein, “High-speed bipolar logic circuits with low power
consumption for LSI - a comparison.” IEEE Journal of Solid State Circuits,
Vol. 17, Issue 4, Aug. 1982, pp. 703 - 712
[13] D.S. McPherson, F. Pera, M. Tazlauanu, S.P. Voinigescu, “A 3V fully
differential distributed limiting driver for 40-Gb/s optical
transmission systems,” IEEE Journal of Solid-State Circuits, Volume 38,
Issue 9, Sept. 2003 pp. 1485 - 1496
Appendix A: Spice file for inductor parameter extraction
.option acct nomod post=2 probe
.net v(p2) vin rout=50 rin=50
vin p1 0 AC 1
L p1 3 Ls
R 3 p2 Rs
Cs1 p1 1 Cp1
Cs2 p2 2 Cp2
Rs1 1 0 Rsub1
Rs2 2 0 Rsub2
Csub1 1 0 Csub1
Csub2 2 0 Csub2
Cbr p1 p2 Cbr
.param
+ Ls = 0.407n
+ Rs = 6.9
+ Rsub1 = 5240
+ Rsub2 = 691
+ Cp1 = 4.67f
+ Cp2 = 5.31f
+ Csub1 = OPT1(0.1p, 0.0001p, 10p)
+ Csub2 = OPT1(0.1p, 0.0001p, 10p)
*+ Csub1 = 5.72f
*+ Csub2 = 4.70f
+ Cbr = OPT1(30f, 0.0001p, 10p)
.AC data=measured optimize=opt1
+ results=comp1,comp2,comp3,comp4,comp5,comp6,comp7,comp8
+ model=converge
.model converge opt relin=1e-4 relout=1e-4 close=10 itropt=30
.measure ac comp1 err1 par(y11r) y11(r)
.measure ac comp2 err1 par(y11i) y11(i)
.measure ac comp3 err1 par(y12r) y12(r)
.measure ac comp4 err1 par(y12i) y12(i)
.measure ac comp5 err1 par(y21r) y21(r)
.measure ac comp6 err1 par(y21i) y21(i)
.measure ac comp7 err1 par(y22r) y22(r)
.measure ac comp8 err1 par(y22i) y22(i)
.ac data=measured
*.ac lin 75 5e9 79e9
.plot ac y21(m) y11(m) y21(db)
.print par(y11r) y11(r) par(y11i) y11(i)
.print par(y12r) y12(r) par(y12i) y12(i)
.print par(y21r) y21(r) par(y21i) y21(i)
.print par(y22r) y22(r) par(y22i) y22(i)
.print y11(r) y11(i) y11(m) y11(p)
.print y22(r) y22(i) y22(m) y22(p)
.print y12(r) y12(i) y12(m) y12(p)
.print y21(r) y21(i) y21(m) y21(p)
.print ac L12=par('y12(i)/((6.28*FREQ)*(y12(m)*y12(m)))')
.print ac Leff=par('-y11(i)/((6.28*FREQ)*(y11(m)*y11(m)))')
.print ac Q=par('-y11(i)/y11(r)')
.print ac cox=par('(y11(i) + y12(i))/(6.28*FREQ)')
*.print par(s11r) s11(r) par(s11i) s11(i)
*.print par(s12r) s12(r) par(s12i) s12(i)
*.print par(s21r) s21(r) par(s21i) s21(i)
*.print par(s22r) s22(r) par(s22i) s22(i)
*.print z11(r) z11(i) z11(m) z11(p)
*.print z22(r) z22(i) z22(m) z22(p)
*.print z12(r) z12(i) z12(m) z12(p)
*.print z21(r) z21(i) z21(m) z21(p)
.data measured
FREQ Y11r Y11i Y12r Y12i Y21r Y21i Y22r Y22i
***PLACE ASITIC SIMULATION DATA HERE***
*.param freq=100MEG,s11m = 0 , s11p = 0, s12m = 0, s12p = 0, s21m =0,
*+s21p =0, s22m =0 , s22p = 0
.end
A 20/30 GBIT/S CMOS BACKPLANE DRIVER WITH
DIGITAL PRE-EMPHASIS
by
Paul Westergaard
A project report submitted in conformity with the requirements
for the degree of Masters of Engineering
Graduate Department of Electrical and Computer Engineering
University of Toronto
Copyright by Paul Westergaard 2005
A 20/30 GBIT/S CMOS BACKPLANE DRIVER WITH
DIGITAL PRE-EMPHASIS
Paul Westergaard
Master of Engineering, 2005
Graduate Department of Electrical and Computer Engineering
University of Toronto
Abstract
A high-speed input comparator and output driver with fully adjustable pre-
emphasis for applications in serial inter-chip communications over backplanes
at 20 Gb/s is presented. The driver achieves data rates of up to 30 Gb/s when
the pre-emphasis is disabled. The circuit was implemented in 130-nm CMOS
and consumes 150 mW from a 1.5-V supply in 20 Gbps operation. It has over
30 dB dynamic range with a sensitivity of 20 mVpp and a differential output
swing of 700 mVpp at 20 Gb/s. The output driver features a novel digital pre-
emphasis circuit with independent pulse height and pulse width control with-
out the requirement of an external clock input. Other independent features
are 30%-70% eye-crossing control and adjustable output swing between 170
mVpp and 350 mVpp per side. The results of this project were published and
presented at the IEEE Custom Integrated Circuits Conference in 2004 [1].
Acknowledgements
The author would graciously like to thank his supervisor and mentor Profes-
sor Sorin Voinigescu for his inspiration, technical and personal insight and
unwavering support throughout this project. The author would also like to
specially thank Timothy O. Dickson, a fellow graduate student, for his on-
going technical input, altruism and camaraderie throughout this thesis.
Experimental results would not have been possible without Timothy’s contri-
butions. Furthermore, the author acknowledges that this thesis would not
have been possible without the valuable technical and financial contribution
of the Ottawa, Ontario office of ST Microelectronics. Special contributing
members of the ST Microelectronics technical staff who are owed special grat-
itude include Rudy Beerkens, Boris Prokes, Imran Khalid and Steve McDow-
ell.
Contents
List of Tables
List of Figures
1 Introduction
1.1 Motivation
1.2 Objectives and scope
1.3 Organization
2 Background
2.1 Review of pre-emphasis
2.2 Inductive peaking
2.3 Hazardous relative placement of load inductors
2.4 Second order effects in deep submicron MOSFETs
3 Circuit Design
3.1 Driver overview
3.2 Biasing a MOS CML gate for optimal speed
3.3 Biasing a differential CMOS stage for low-noise
3.4 Input matching network and low-noise comparator
3.5 Eye-crossing control
3.5.1 Transistor sizing and biasing in the eye-crossing control circuit
3.6 Output driver and parallel current summation
3.7 Digital pre-emphasis
4 Simulation Results
4.1 S-Parameter simulations
4.2 Time-domain simulations
5 Physical Implementation
5.1 Inductor design and model extraction
5.1.1 Model extraction using ASITIC
5.1.2 Inductor realization and isolation
5.2 Layout and fabrication
6 Experimental Results
6.1 Test environment
6.2 S-Parameters and noise figure
6.3 Time-domain measurements
6.4 Performance summary
7 Conclusions
7.1 Summary
7.2 Future work
References
Appendix A: Spice file for inductor parameter extraction
List of Tables
Table 1: Circuit features categorized by design requirement
Table 2: Performance metrics for shunt peaking [5]
Table 3: Physical dimensions of each inductor
Table 4: Simulated inductor parasitic values
Table 5: Simulated and measured circuit parameters
List of Figures
Fig. 1: Transmitter pre-emphasis and backplane trace transfer characteristic
Fig. 2: A 4-tap FIR filter for transmitter pre-emphasis
Fig. 3: Time-domain pre-emphasis overshoot and undershoot
Fig. 4: Simple common source amplifier with resistive loading
Fig. 5: Common source amplifier with shunt peaking
Fig. 6: Potentially hazardous placement of load inductor
Fig. 7: Proper placement of inductive load
Fig. 8: Transconductance of a 130-nm nMOSFET vs. gate voltage [8]
Fig. 9: MOS differential pair
Fig. 10: Driver block diagram
Fig. 11: Constant peak fT current density over four technology nodes
Fig. 12: fT and NFMIN vs. current density for a 130-nm n-MOSFET with 2-µm unit finger width biased at VDS = 1V
Fig. 13: Input bias and matching network and comparator
Fig. 14: Eye-crossing control circuit and intermediate signals
Fig. 15: Summation of output currents across output resistor
Fig. 16: Output driver transistor-level schematic
Fig. 17: Digital pre-emphasis block diagram
Fig. 18: Pre-emphasis waveforms and transfer function
Fig. 19: NMOS digital differentiator schematic
Fig. 20: Simulated S22 and S11 of complete driver circuit to 60 GHz
Fig. 21: Simulated single-ended S21 of entire driver circuit
Fig. 22: Simulated single-ended S21 of driver with output amplitude reduced using amplitude control
Fig. 23: Simulated S21 with output peaking enabled
Fig. 24: 20 Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 20 mVpp; differential output 99 mVpp per side
Fig. 25: 25 Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 60 mVpp; differential output 180 mVpp per side
Fig. 26: 30 Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 140 mVpp; differential output 300 mVpp per side
Fig. 27: Simulated eye-crossing control at 20 Gb/s (a) 50%, (b) 66%, and (c) 33%
Fig. 28: Output amplitude control at 20 Gb/s
Fig. 29: 20 Gb/s output eye diagram with 316 mVpp swing per side and +/- 16% pre-emphasis
Fig. 30: 20 Gb/s output eye diagram with 300 mVpp swing per side and +33%/ -25% pre-emphasis
Fig. 31: Inductor single-frequency Π-model
Fig. 32: Inductor lumped element broadband model
Fig. 33: On-chip 900 pH inductor with 44 µm diameter
Fig. 34: Full chip photograph
Fig. 35: Magnified photograph of main path layout
Fig. 36: Magnified photograph of the parallel path layout
Fig. 37: S-Parameter test setup
Fig. 38: Eye-diagram test setup
Fig. 39: Measured single-ended S21
Fig. 40: Measured single-ended S22 and S11
Fig. 41: Measured and simulated driver noise figure
Fig. 42: 20 Gb/s eye-diagrams 2^31-1 PRBS: single-ended input 20 mVpp; differential output 84 mVpp per side
Fig. 43: 25 Gb/s output eye diagram with 50% eye crossing
Fig. 44: 30 Gb/s output eye-diagram with 260 mVpp per side for a single-ended 200 mVpp, 2^31-1 input PRBS
Fig. 45: Eye-crossing control at 20 Gb/s (a) 50%, (b) 70%, and (c) 30%
Fig. 46: Output amplitude control at 20 Gb/s; output of (a) 190 mVpp and (b) 350 mVpp per side
Fig. 47: 20 Gb/s output eye diagram with 300 mVpp swing per side and pre-emphasis
1 Introduction
1.1 Motivation
Serial inter-chip communication is gaining widespread acceptance over paral-
lel architectures because congested printed circuit board (PCB) routing and
pad-limited silicon dice are not cost efficient in commodity designs. To mini-
mize the overall circuit area required for a serial transmitter/receiver pair,
equalization can be performed at the transmitter instead of at the receiver, in
which case it is known as pre-equalization. At the transmitter, pre-equalizers
reshape the waveform to compensate for the low-pass response of the interconnect.
Historically, pre-emphasis has been achieved using either clocked flip-flops
and step-delayed current summation or analog differentiators. The latter
permit only amplitude control of the pulse and offer no control of the pulse
width. The former implementation, while providing the necessary control
mechanisms and efficacy, places severe strain on device technology, as the required
flip-flop typically operates at twice the frequency of the driver itself.
Even though 40 Gb/s CMOS amplifiers [2], demultiplexers and multiplexers
[3] have been recently reported, demonstrating the high-speed potential of
standard CMOS technology, they suffer from limited dynamic range due to
poor sensitivity and modest output swings of about 100 mVpp per side.
This paper presents the first published CMOS driver with duty-cycle, amplitude,
and pre-emphasis control that operates at data rates exceeding 20 Gb/s.
The driver achieves over 30 dB of dynamic range. It includes a novel
passive-element-free differentiator that enables control of both the amplitude
and the width of the pre-emphasis pulse.
1.2 Objectives and scope
The prime objective of this thesis was to fabricate a high-speed, pre-emphasis-enabled
output driver for applications in serial inter-chip communications
over backplanes. The scope of the thesis was the theoretical derivation,
computer-aided design, simulation, fabrication and experimental characterization
of the circuit. The features required for the design to serve its targeted
application were as follows:
Table 1: Circuit features categorized by design requirement
Design Requirement: Corresponding Circuit Feature
High Bandwidth:
• Inductive peaking at each signal path stage
• High gain (> 20 dB) at 20 Gb/s operation
• Positive gain at 30 Gb/s operation
Low Power:
• 1.5 V power supply
• 150 mW dissipation at 20 Gb/s operation
High Signal Integrity and Sensitivity:
• Input and output matching to 50 Ω up to 50 GHz
• Differential signalling
• Symmetrical layout
• 20 mVpp input sensitivity
Signal Shaping and Control:
• 30% to 70% pulse-width control
• 200-700 mVpp differential output swing control
• Pre-emphasis spike width and height control
Novel Implementation:
• Full CMOS implementation
• Clock-free pre-emphasis circuit with non-traditional circuit design
1.3 Organization
The thesis is organized as follows. Chapter 2 discusses the background of pre-
emphasis, inductive peaking, and hazardous relative placement of load induc-
tors. Chapter 3 details the concept, design and biasing of the individual circuit
elements in the driver. Insights into transistor sizing for optimal speed based
on a current-density centric approach as well as minimum noise are provided.
In Chapter 4, the pre-layout simulation results of the driver are presented.
Chapter 5 summarizes the physical implementation of the entire circuit.
Chapter 6 offers an overview of the experimental results. Conclusions are
given in Chapter 7.
2 Background
2.1 Review of pre-emphasis
In high-speed circuit applications in which high-frequency signals are sent
over backplane channels, there are two types of equalization: transmitter pre-
emphasis and receiver equalization [4]. Both are intended to either emphasize
the high-frequency components or de-emphasize the low-frequency compo-
nents of the transmitted signal, in order to compensate for the low-pass transfer
characteristics of the channel. The transfer function of both types of
equalizer is ideally high-pass, though in practice it is band-pass. The reasons for the
latter are threefold: (i) practical semiconductor devices cannot achieve infinite
bandwidth; (ii) high-frequency noise amplification must be avoided; and (iii)
regulated electromagnetic interference (EMI) specifications must be met.
Pre-emphasis is achieved at the transmitter side by increasing the high-fre-
quency components. Fig. 1 shows the mechanism in which ideal transmitter
pre-emphasis compensates for the low-pass transfer characteristics of the
backplane trace.
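The compensation mechanism can be illustrated numerically. The following is a minimal sketch rather than the response of the presented driver: it assumes a single-pole low-pass channel and a pre-emphasis response with one zero placed at the channel pole and one band-limiting pole. The corner frequencies (5 GHz and 40 GHz) and all function names are hypothetical values chosen only for illustration.

```python
import math

CHANNEL_POLE_HZ = 5e9        # assumed channel corner frequency
PRE_EMPHASIS_POLE_HZ = 40e9  # assumed upper limit of the pre-emphasis boost


def channel_gain(freq_hz, channel_pole_hz):
    """Magnitude of an assumed single-pole low-pass channel."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / channel_pole_hz) ** 2)


def pre_emphasis_gain(freq_hz, zero_hz, pole_hz):
    """Magnitude of an assumed one-zero, one-pole pre-emphasis response.

    The zero boosts the high-frequency components; the pole limits the
    boost, so the practical response is band-limited, not purely high-pass.
    """
    boost = math.sqrt(1.0 + (freq_hz / zero_hz) ** 2)
    limit = math.sqrt(1.0 + (freq_hz / pole_hz) ** 2)
    return boost / limit


def compensated_gain(freq_hz):
    """Channel gain multiplied by the pre-emphasis gain."""
    return channel_gain(freq_hz, CHANNEL_POLE_HZ) * pre_emphasis_gain(
        freq_hz, CHANNEL_POLE_HZ, PRE_EMPHASIS_POLE_HZ
    )


for f in (1e9, 5e9, 10e9, 20e9):
    print(f"{f / 1e9:5.1f} GHz  channel {channel_gain(f, CHANNEL_POLE_HZ):.3f}"
          f"  compensated {compensated_gain(f):.3f}")
```

Under these assumptions the zero cancels the channel pole, so the combined response rolls off only at the higher pre-emphasis pole.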
Fig. 1: Transmitter pre-emphasis and backplane trace transfer characteristic
A common practical pre-emphasis circuit implementation is a Finite
Impulse Response (FIR) filter. Fig. 2 shows the block diagram of a 4-tap FIR
filter with a single Data input, delay elements D and tap coefficients C1, C2,
C3, C4. The tap coefficients adjust the gain at each multiplier independently to
produce the output voltage across the load resistors of value R. The output is a
frequency shaped version of the Data input in the form of amplified high-fre-
quency components.
[Fig. 1 plot labels: backplane trace transfer characteristic; transmitter pre-emphasis response; horizontal axis: frequency (Hz)]
Fig. 2: A 4-tap FIR filter for transmitter pre-emphasis
In the time domain, the FIR filter performs a differentiation function. As
shown with the dotted lines in Fig. 3, the waveform of transmit pre-emphasis
appears as overshoot and undershoot in the time-domain.
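The overshoot and undershoot can be reproduced with a short numerical sketch. It is an illustration only: it uses two taps instead of the four in Fig. 2, one sample per bit, and tap values (1.0 and -0.25) that are assumptions, not values taken from any circuit in this thesis. The function name is hypothetical.

```python
def fir_pre_emphasis(bits, taps):
    """Apply an FIR filter, one sample per bit, to an NRZ bit stream.

    Bits map to +1/-1 symbols. Samples before the start of the stream are
    assumed to repeat the first symbol, so no start-up transient appears.
    """
    symbols = [1.0 if b else -1.0 for b in bits]
    output = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, coeff in enumerate(taps):
            acc += coeff * symbols[max(n - k, 0)]
        output.append(acc)
    return output


bits = [0, 0, 1, 1, 1, 0, 0]
levels = fir_pre_emphasis(bits, taps=(1.0, -0.25))
print(levels)
```

With these assumed taps the settled levels are +/-0.75, while the first bit after each transition reaches +/-1.25, which is the overshoot and undershoot sketched in Fig. 3.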
[Fig. 2 schematic labels: Data input; delay elements D; tap coefficients C1, C2, C3, C4; load resistors R]
Fig. 3: Time-domain pre-emphasis overshoot and undershoot
Other circuits that perform similar differentiation functions are passive RC
differentiators and inductively-loaded differential amplifiers. Unfortunately,
each of these three differentiator implementations has drawbacks.
For FIR filters, a clock of at least twice the frequency of the data is required
to trigger the delay elements D in Fig. 2. The width of the pre-emphasis spike
is inversely proportional to the frequency of this clock. Hence, for 20 Gb/s (10
GHz) signals, a minimum 20 GHz clock signal is needed, requiring a very well
designed clock recovery circuit and 20 GHz flip-flops. The RC passive-element
differentiator is a more viable solution in that it does not require the input
clock or flip-flops; however, the width of the pre-emphasis spike in this case is
not controllable. This results in a non-ideal pre-equalization that cannot fully
compensate for the effects of the channel. Finally, inductively loaded differential
amplifiers offer control over neither the width nor the height of the pre-emphasis
pulse and, more importantly, can easily produce output ringing due to
resonant effects. The pre-emphasis employed in the presented driver differs
from all three methods outlined above in that: (i) no passive L or C elements
are used; (ii) no clock is required; and (iii) the pulse width and height can be
independently controlled.
[Fig. 3 plot labels: voltage vs. time, with the pre-emphasis overshoot and undershoot marked]
2.2 Inductive peaking
The theory of inductive peaking or broad-banding is well-documented [5][6]. A
brief review and implications for the driver design will be presented here.
Inductive series and shunt peaking are techniques that can be used to extend
the 3-dB bandwidth of an amplifier without expending extra power. The following
explanation will focus on shunt inductive peaking as it is applied in the
driver design.
Fig. 4: Simple common source amplifier with resistive loading
Fig. 5: Common source amplifier with shunt peaking
Fig. 4 illustrates a common source amplifier with an ideal resistor and capacitive
load. For simplicity, we assume that the small signal frequency response
of the amplifier is determined by a single dominant pole, which is set
solely by the output load resistance RL and the load capacitance C:
\[ \frac{V_{out}}{V_{in}}(\omega) = \frac{g_m R_L}{1 + j\omega C R_L} \tag{1} \]
The introduction of an inductance L in series with the load resistance, as
shown in Fig. 5, alters the frequency response of the amplifier. This technique,
known as shunt peaking, increases the bandwidth of the amplifier by transforming
the frequency response from that of a single pole to one with two poles
and a zero:
\[ \frac{V_{out}}{V_{in}}(\omega) = \frac{g_m (R_L + j\omega L)}{1 + j\omega R_L C - \omega^2 L C} \tag{2} \]
The poles may or may not be complex. The zero is determined solely by the
L/RL time constant and is primarily responsible for the bandwidth improvement.
In addition, the frequency response of this amplifier is characterized by
the ratio of the $L/R_L$ and $R_L C$ time constants. This ratio is denoted by
$m = L/(R_L^2 C)$. Solving for the inductance gives $L = m R_L^2 C$.
It can be shown [5] that bandwidth extension is possible to varying degrees
by adjusting m. The 3-dB bandwidth of the shunt-peaked amplifier
increases as m increases up to m = 0.71. Table 2 shows the normalized 3-dB
bandwidth for several values of m. The maximum bandwidth occurs for m = 0.71
and yields an 85% improvement in bandwidth. However, it is accompanied by
a significant amount of gain peaking, which is undesirable for broadband
amplifiers used in fibre optic or backplane applications. A maximally flat
response is observed for m = 0.41 while still improving bandwidth by 72%.
Finally, although a value of m = 0.32 does not result in the same bandwidth
improvement as the other two non-zero values of m shown in the table, it
exhibits the most linear phase response up to the 3-dB bandwidth [5]. This
value of m, called the optimum group delay value, is desirable for optimizing
pulse fidelity in broadband systems that transmit digital signals.
The optimum group delay value, which still results in a respectable 60 per-
cent increase in bandwidth, is best suited for the design of the broadband dig-
ital signal driver.
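The bandwidth figures quoted for the different values of m can be checked directly from the shunt-peaked transfer function. The sketch below is an illustrative calculation, not part of the original design flow. It normalizes frequency as x = ωRLC, so that the squared gain relative to DC is (1 + m²x²) / ((1 - mx²)² + x²), and solves for the x at which this equals one half. The function name is hypothetical.

```python
import math


def normalized_3db_bandwidth(m):
    """3-dB bandwidth of a shunt-peaked stage, normalized to 1/(RL*C).

    Setting the squared gain equal to 1/2 with y = x**2 gives the quadratic
        m**2 * y**2 + (1 - 2*m - 2*m**2) * y - 1 = 0,
    which has exactly one positive root for m > 0.
    """
    if m == 0:
        return 1.0  # plain RC load: single pole at x = 1
    a = m * m
    b = 1.0 - 2.0 * m - 2.0 * m * m
    c = -1.0
    y = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return math.sqrt(y)


for m in (0.0, 0.32, 0.41, 0.71):
    print(f"m = {m:.2f}  normalized bandwidth = {normalized_3db_bandwidth(m):.2f}")
```

The computed values for m = 0.41 and m = 0.71 agree with the tabulated 1.72 and 1.85, and m = 0.32 gives roughly 1.57, close to the tabulated 1.60.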
Table 2: Performance metrics for shunt peaking [5]
Factor (m)    Normalized ω3dB    Response
0             1.00               No shunt peaking
0.32          1.60               Optimal group delay
0.41          1.72               Maximally flat
0.71          1.85               Maximum bandwidth

An implicit benefit of using inductive peaking is the enhanced freedom in the
power-bandwidth trade-off. This improvement can be demonstrated by first
examining the key equations for a non-inductively loaded amplifier (such as the one
introduced in Fig. 4). The bandwidth of the amplifier is dominated by the output
pole:
\[ BW = \frac{1}{2\pi R_L C} \tag{3} \]
The value of the tail current Itail is also determined by the desired
output swing and the load resistance, especially in the design of switching
inverters:
\[ I_{tail} = \frac{\Delta V_{swing}}{R_L} \tag{4} \]
where ∆Vswing is the voltage swing on the output node of the inverter and RL
is the load resistance of the inverter.
The advantage of shunt inductive peaking is made more obvious when the
bandwidth and power of an amplifier are examined. The bandwidth of an
amplifier is given by equation 3, while the power consumption is directly
proportional to Itail for a given power supply. The goal is to increase the
bandwidth and minimize the power consumption (and hence Itail) of an amplifier.
This places contradictory requirements on RL: equation 3 calls for a
small value of RL, whereas equation 4 calls for a large one.
Inductive peaking allows the circuit designer to increase the value of RL to
reduce overall power consumption while simultaneously increasing bandwidth
with the introduction of a load peaking inductor. Since Table 2 shows
that shunt peaking can increase the bandwidth of an amplifier
by 60% while still maintaining a linear phase response, a possible trade-off is
to increase the resistance RL by 30%, decrease the tail current Itail by a
similar amount (1/1.3) and introduce a load inductor. With m = 0.32 from
Table 2, the value for the inductor is:
\[ L = \frac{C R_L^2}{3.1} \tag{5} \]
From equation 3, the increased resistance RL will reduce the bandwidth to
1/1.3 of its original value, but the inductive peaking will increase the newly reduced
bandwidth by 60%. Overall, the inductively peaked circuit will have both
higher bandwidth and lower power than the original resistively-loaded circuit.
Explicitly, the inductively peaked circuit will have bandwidth:
\[ BW_{peaked} = BW_{resistive} \times \frac{1.6}{1.3} = 1.23\,BW_{resistive} \tag{6} \]
and the tail current of the inductively peaked circuit will be:
\[ I_{tail(peaked)} = \frac{I_{tail(resistive)}}{1.3} = 0.77\,I_{tail(resistive)} \tag{7} \]
This results in a 23% gain in bandwidth with a simultaneous 23% decrease in
power consumption.
There are a few compromises involved in this optimization of the power-
bandwidth product via the introduction of an on-chip inductor. First, the
added die area expense at each inverter stage can be relatively large, with the
inductor usually occupying more area than the resistively-loaded inverter
stage alone. Secondly, deterministic jitter can occur leading to deleterious
results if an inductor is realized with a larger than simulated value. In this
case, jitter is a result of undesired, and more importantly, uncontrollable
peaking and signal distortion. In a circuit with multiple sequential gain
stages, each with over-sized peaking inductors, the resultant signal distortion
becomes catastrophic. Electro-magnetic field solvers that simulate the induc-
tances of on-chip planar and stacked spiral inductors must be verified experi-
mentally with fabricated test-structures before the final circuit is fabricated.
This pre-verification methodology was employed in the inductor design.
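The power-bandwidth trade-off worked through in equations (5) to (7) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the example load values (RL = 100 Ω before scaling, C = 50 fF) are assumptions chosen to give a representative inductor value, not values from the presented driver.

```python
def peaked_tradeoff(resistance_scale, bandwidth_extension):
    """Return (bandwidth ratio, tail-current ratio) relative to the
    original resistively loaded stage, for a fixed output swing.

    resistance_scale    : factor by which RL is increased (1.3 in the text)
    bandwidth_extension : shunt-peaking bandwidth factor (1.6 for m = 0.32)
    """
    bandwidth_ratio = bandwidth_extension / resistance_scale
    current_ratio = 1.0 / resistance_scale
    return bandwidth_ratio, current_ratio


def peaking_inductance(load_capacitance, load_resistance, divisor=3.1):
    """Inductor value L = C * RL**2 / 3.1, as in equation (5)."""
    return load_capacitance * load_resistance ** 2 / divisor


bw_ratio, i_ratio = peaked_tradeoff(1.3, 1.6)
print(f"bandwidth x{bw_ratio:.2f}, tail current x{i_ratio:.2f}")

# Assumed example values: RL scaled from 100 ohm to 130 ohm, C = 50 fF.
inductance = peaking_inductance(50e-15, 130.0)
print(f"peaking inductor = {inductance * 1e12:.0f} pH")
```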
2.3 Hazardous relative placement of load inductors
The relative placement of the load inductors to the load resistors in the design
of each amplifying stage is of utmost importance. Fig. 6 shows a potentially
hazardous placement of a load inductor in an inductively loaded amplifier
stage. The root of the deleterious effect lies in transmission line theory. As the
driver operates in the high-frequency signalling spectrum, transmission line
theory is applicable.
Transmission lines spatially transform impedance [7]. The impedance of the
voltage supply in the AC case is zero, a short circuit. At high-frequencies, the
inductor length is comparable to that of a quarter-wavelength of the signal on
the inductor wire segment. Hence, the impedance looking into the inductor
may be spatially transformed from the short-circuit of the power supply into
an open circuit. The transformed open circuit will result in instability and/or
oscillatory behavior in the amplifier.
Fig. 6: Potentially hazardous placement of load inductor
Fig. 7: Proper placement of inductive load
It is important to note that, for the proper placement of the inductor in Fig. 7,
spatial impedance transformations still occur. However, the impedance look-
ing into the inductor from transistor M2 is always a finite, non-zero value
because the transformation acts on the finite and non-zero resistive value RL.
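The two placements can be compared using the standard input-impedance expression for a lossless transmission line [7], Zin = Z0 (ZL + jZ0 tan βl) / (Z0 + jZL tan βl). The sketch below is a simplified illustration that treats the inductor trace as a uniform lossless line. The characteristic impedance (100 Ω), the load resistance (50 Ω) and the function name are assumptions, not values from the driver.

```python
import math


def line_input_impedance(z_load, z0, electrical_length_rad):
    """Input impedance of a lossless line with characteristic impedance z0,
    terminated in z_load, for electrical length beta*l in radians."""
    t = math.tan(electrical_length_rad)
    return z0 * (z_load + 1j * z0 * t) / (z0 + 1j * z_load * t)


Z0 = 100.0                 # assumed characteristic impedance of the trace, ohms
RL = 50.0                  # assumed load resistance, ohms
QUARTER_WAVE = math.pi / 2 # electrical length of a quarter wavelength

# Inductor connected directly to the supply (AC short circuit).
z_short = line_input_impedance(0.0, Z0, QUARTER_WAVE)
# Inductor terminated in the load resistor RL.
z_resistor = line_input_impedance(RL, Z0, QUARTER_WAVE)

print(f"|Zin| with supply termination: {abs(z_short):.3g} ohm")
print(f"|Zin| with RL termination:     {abs(z_resistor):.1f} ohm")
```

At a quarter wavelength the shorted line presents a very large impedance, whereas the resistor-terminated line presents Z0²/RL, which stays finite.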
2.4 Second order effects in deep submicron MOSFETs
Of special relevance to the transistor sizing and biasing in the presented
design is a relatively unfamiliar submicron phenomenon. Electron mobility
degradation due to high vertical electric fields can drastically decrease the
performance of deep-submicron circuits. Fortunately, the application of proper
biasing measures can mitigate these negative effects. However, few
publications on this topic exist and classical biasing techniques are the norm.
Fig. 8: Transconductance of a 130nm nMOSFET vs. gate voltage [8]
Fig. 8 [8] portrays the deleterious effects on transconductance due to electron
mobility degradation in high-vertical fields. Shown is the transconductance
normalized by width (gm/W) of a 130-nm nMOS transistor as a function of
gate-source voltage. Its shape is similar to that of the fT dependence on VGS
and typical for all deep submicron technologies. The curve exhibits two dis-
tinct regions, the square-law region and the high-vertical field region as
shown. There also exists an intermediate area between the two regions in
which hybrid behaviour is observed.
At low effective gate voltages (VGS < 0.5V in Fig. 8), the device follows the
classical square law model and its transconductance varies linearly with VGS.
The equation derived for an n-MOS transistor in saturation (square law
region) is:
\[ g_m = \mu_n C_{OX} \frac{W}{L} (V_{GS} - V_T) \tag{8} \]
where COX is the oxide capacitance, W is the gate width, L is the gate length,
and VT is the threshold voltage of the transistor.
At large gate-source voltages, the high electric field developed between the
gate and channel of the transistor confines charge carriers to a narrower
region below the oxide-silicon interface, leading to more carrier scattering and
hence lower mobility. Further, small-geometry devices experience significantly
more mobility degradation [9]. An empirical equation modelling this effect is
[9]:
\[ \mu_n = \frac{\mu_o}{1 + \theta (V_{GS} - V_T)} \tag{9} \]
where µo denotes the “low-field” mobility and θ is a fitting parameter that
increases with decreasing oxide thickness and hence smaller geometries.
Substituting the expression for electron mobility µn (9) into (8) reveals that
when the second term in the denominator of (9) becomes dominant, the
transconductance becomes a constant, approaching µo COX W/(θL). This effect is
observed clearly in the high-vertical field region of Fig. 8, where VGS > 0.7V.
It is of special interest to note how the constant transconductance due to
high-vertical fields affects the differential voltage required to completely
switch a MOS differential pair. It will be shown that a larger switching volt-
age, with little or no improvement in transition time, is required when the
MOS pair is biased in the high-vertical field region [8].
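As a worked step for the substitution of (9) into (8) described above, the
following sketch writes out the limit in which θ(VGS - VT) dominates unity.
The label g_{m,sat} is introduced here only for illustration and does not
appear elsewhere in this report.

```latex
g_m = \mu_n C_{OX}\frac{W}{L}\left(V_{GS}-V_T\right)
    = \frac{\mu_o\, C_{OX}\,\frac{W}{L}\left(V_{GS}-V_T\right)}
           {1+\theta\left(V_{GS}-V_T\right)}
\;\xrightarrow{\;\theta\left(V_{GS}-V_T\right)\,\gg\,1\;}\;
g_{m,\mathrm{sat}} \approx \frac{\mu_o\, C_{OX}}{\theta}\,\frac{W}{L}
```

In this limit the transconductance no longer depends on VGS, which is the
flat portion of the curve in Fig. 8.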
We first derive the differential voltage required to completely switch a MOS
differential pair when biased in the high-vertical field region (VGS > 0.7V in
Fig. 8).
Fig. 9: MOS differential pair
Fig. 9 shows a differential pair of MOS transistors fed by a constant tail cur-
rent Itail. Imagine tail current Itail is fully routed through transistor Q1.
Assuming high-vertical field operation, Itail is:
$$I_{tail} = I_{DS1} = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{1 + \theta \left( V_{GS1} - V_T \right)} \left( V_{GS1} - V_T \right)^2 \qquad (10a)$$
where transistor Q1 is assumed to be in saturation and equation (9) has been
substituted for µn. In high-vertical field operation, when the term θ(VGS1 - VT)
becomes dominant relative to unity, equation (10a) becomes:
$$I_{tail} = I_{DS1} \approx \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( V_{GS1} - V_T \right) \qquad (10b)$$
Equation (10b) shows that IDS1 now exhibits a linear relationship with VGS1.
At the instant that Itail is fully routed through Q1, it is evident that VGS2 = VT,
the threshold voltage of the transistors, such that IDS2 = 0. Hence, the differ-
ential voltage across the gates of the transistors is
$$\Delta V = V_{GS1} - V_{GS2} = V_{GS1} - V_T \qquad (11)$$
Isolating VGS1 in (11) and substituting into (10b) reveals that
$$I_{tail} \approx \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( \Delta V + V_T - V_T \right) = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( \Delta V \right) \qquad (12)$$
for Itail at the instant when all the tail current is shifted completely through
Q1. An alternate equation for Itail is derived in the steady state when the tail
current is split evenly between the transistors such that:
$$I_{DS1} = I_{DS2} = \frac{I_{tail}}{2} = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( V_{GS(1,2)} - V_T \right) = \frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( V_{EFF} \right) \qquad (13)$$
as VGS(1,2) is equal for both Q1 and Q2 in equilibrium. Equating Itail from (12)
and (13), we solve for ∆V, the differential voltage required to completely switch
the MOS differential pair:
$$\frac{C_{ox}}{2} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( \Delta V \right) = C_{ox} \frac{W_1}{L_1} \frac{\mu_o}{\theta} \left( V_{EFF} \right) \qquad (14a)$$
$$\Delta V \geq 2 V_{EFF} \qquad (14b)$$
where the inequality is explicitly shown in (14b) to indicate a minimum differ-
ential voltage requirement.
To find the minimum differential voltage to switch the tail current com-
pletely through one of the MOS transistors in the square-law region (VGS <
0.5V in Fig. 8), a parallel mathematical process of equations (10) through (14)
is completed. This is shown explicitly in [10]. The resultant required voltage
swing for full switching becomes:
$$\Delta V \geq \sqrt{2}\, V_{EFF} \qquad (15)$$
Hence, in the high-vertical field region, both the scalar multiplier and VEFF
itself are larger, requiring a greater differential voltage, and hence a longer
transition time, to switch the differential pair [8].
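The comparison between the two bias regions can be sketched numerically.
The snippet below assumes the high-vertical-field bound of (14b), ∆V ≥ 2 VEFF,
and the classical square-law bound ∆V ≥ √2 VEFF; the function name and the
example overdrive voltages are hypothetical and chosen only for illustration.

```python
import math

# Scalar multipliers on VEFF for full switching of the differential pair.
# "high-field" follows (14b); "square-law" is the classical square-law result.
SWITCHING_FACTOR = {"square-law": math.sqrt(2.0), "high-field": 2.0}


def min_switching_voltage(v_eff, region):
    """Return the minimum differential input voltage (V) for full switching.

    v_eff  -- effective gate voltage VGS - VT of each transistor at balance (V)
    region -- "square-law" or "high-field"
    """
    return SWITCHING_FACTOR[region] * v_eff


if __name__ == "__main__":
    # Hypothetical overdrives: a lower VEFF in the square-law region and a
    # higher VEFF in the high-vertical-field region.
    for region, v_eff in (("square-law", 0.2), ("high-field", 0.4)):
        dv = min_switching_voltage(v_eff, region)
        print(f"{region}: VEFF = {v_eff:.2f} V -> minimum dV = {dv:.3f} V")
```

Both the multiplier and VEFF grow when moving into the high-vertical-field
region, so the required input swing grows faster than VEFF alone.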
The outcome of this analysis emphasizes that gate-source voltages must be
limited to mitigate the effects of high-vertical fields on electron mobility. This
is accomplished through proper transistor sizing and tail current selection.
3 Circuit Design
3.1 Driver overview
Fig. 10 shows the block diagram of the backplane driver highlighting the four
sections of the circuit. The pre-emphasis path is placed in parallel with the
main signal path, and the currents from both output stages are summed across
the 50 Ω load resistors to develop the differential output voltage. The parallel
path consists of delay buffers and the digital differentiator circuit. The output
swing is adjusted through the tail current of the output driver, while the
height of the pre-emphasis is controlled by the relative tail current of the
digital differentiator.
Fig. 10: Driver block diagram
3.2 Biasing a MOS CML gate for optimal speed
MOS CML logic has only recently been applied to high-speed ICs [6]. Conse-
quently, no systematic design guidelines existed until recently [8]. As such,
MOS CML gates have been biased for optimal speed rather haphazardly using
a voltage-centric approach. We will show through simulation that, for MOS
CML, a current-density centric instead of a voltage-centric design approach
leads to more accurate and reliable circuit design.
Fig. 11: Constant peak ft current density over four technology nodes [8]
In VGS-focussed MOS CML design, the effective gate voltage (VEFF = VGS -
VT) at which the peak fT of the MOSFET occurs scales with technology,
decreasing with every technology node. This makes it very difficult to predict
an optimal bias, both across multiple technologies and within a given
technology node. However, as the simulated data collected over four
technology nodes (# of fingers x gate length x finger width) in Fig. 11 [8]
show, the peak-fT current density (JpfTMOS) remains
approximately constant (between 0.25 mA/µm and 0.35 mA/µm depending on
VDS) as technology scales. This trend is likely to occur also for future MOS
generations as a result of the constant field scaling that has been applied from
the 0.5-µm technology node [11] downward. Consequently, a current-density
centric design approach, similar to that which is commonly employed in
bipolar designs [12], is more appropriate for reproducible, accurate, high-speed
design of MOS CML circuits. In a current-density centric design scenario, the
gate width of the MOSFET is sized such that the device reaches its peak fT
when all of the tail current flows through the device:
$$W_G = \frac{I_T}{J_{pfTMOS}} \qquad (16)$$
In 130-nm technology, this corresponds to a VEFF of around 300mV. Biasing
beyond the peak-fT current density will degrade circuit performance.
For large-signal high-speed circuit biasing in a MOS differential pair, it is
recommended that each of the differential pair transistors be biased at half-
peak fT current density in the balanced steady state. This avoids current
densities beyond peak-fT when, during full switching, all the tail current is
routed through one transistor of the differential pair and the current density
is momentarily doubled from half-peak fT to peak fT. Referring back to
the discussion of Section 2.4, biasing the circuit at half-peak fT has a second
positive effect on circuit switching speed. Biasing at half-peak fT current
density instead of full-peak fT current density permits the differential pair
transistors to operate more in the square-law region instead of the slower-
switching high-vertical field region.
Based on this observation, each circuit block in the presented driver design
consists of a MOS-CML inverter whose ratio of tail current to differential pair
transistor width is set to correspond to the peak fT bias of the n-channel MOS-
FET of 0.25 to 0.3 mA/µm. This bias scheme is implemented to obtain the
maximum switching speed. Inductive peaking is employed in every stage to
further improve the circuit bandwidth.
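The sizing rule of equation (16) can be illustrated with a short calculation.
This is a minimal sketch: the function names are hypothetical, the 0.25 mA/µm
value is the lower end of the peak-fT range quoted above, and the 12 mA /
32 µm pair is the eye-crossing stage described in Section 3.5.1.

```python
def gate_width_um(i_tail_ma, j_peak_ft_ma_per_um):
    """Equation (16): W_G = I_T / J_pfTMOS, returned in micrometres."""
    return i_tail_ma / j_peak_ft_ma_per_um


def current_density_ma_per_um(i_tail_ma, width_um, fraction_of_tail=0.5):
    """Current density in one transistor of a differential pair.

    fraction_of_tail is 0.5 in the balanced steady state and 1.0 when the
    whole tail current is switched through one device.
    """
    return fraction_of_tail * i_tail_ma / width_um


if __name__ == "__main__":
    # Width that puts a fully switched device at an assumed 0.25 mA/um.
    print("W_G =", gate_width_um(12.0, 0.25), "um")

    # Densities for a 12 mA tail current and 32 um devices.
    print("balanced:", current_density_ma_per_um(12.0, 32.0), "mA/um")
    print("fully switched:", current_density_ma_per_um(12.0, 32.0, 1.0), "mA/um")
```

Sizing for peak fT at full switching automatically places each device at half
that current density when the tail current is split evenly.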
3.3 Biasing a differential CMOS inverter for low-noise
Fig. 12 shows fT and, on the opposite axis, NFMIN versus current density for a
130-nm n-MOSFET with a 2 µm unit finger width. NFMIN has a minimum at a bias
current density of about half the current density of the maximum fT. Earlier
it was shown that biasing each transistor in a differential pair at
half-peak-fT would result in optimal switching speed. Coincidentally, this
same current-density bias point also results in minimum NFMIN.
Fig. 12: fT and NFMIN vs. current density for a 130-nm n-MOSFET with 2 µm unit finger width biased at VDS = 1V
3.4 Input matching network and low-noise comparator
Fig. 13 illustrates the input matching network and input low-noise compara-
tor. The input differential pair has higher gain and larger tail current than
the other stages in order to reduce the noise by making the optimum noise
impedance of the input stage closer to 50Ω per side. A compromise was
reached between achieving the best possible noise match, which calls for
larger transistor sizes and bias current, and the broadband input impedance
match. On-chip matching resistors, realized as a resistive divider with series
inductors, provide appropriate gate bias for the input transistors and broad-
band input impedance matching.
[Fig. 12 plot axes: current density (mA per µm width), 10^-2 to 10^0; fT (GHz), 0 to 100; NFMIN @ 10 GHz (dB), 0.0 to 2.0]
Fig. 13: Input bias and matching network and comparator
3.5 Eye-Crossing Control
A key objective of the thesis was the design of a driver with controllable pulse-
width. The application of controllable pulse width is the compensation of DC
offsets that may cause signal distortion. This compensation is used to alter the
duty-cycle (of an input signal) such that a 50% duty-cycle periodic input may
be changed to a 30% or 70% duty-cycle periodic output signal. Conversely, a
DC offset which has imposed an output duty cycle of 30% could be negated to
re-instate the duty-cycle back to 50% as desired.
The circuit of Fig. 14 accomplishes pulse-width control using a technique
found in [13]. The circuit consists of two series inductively-peaked differential
stages with a DC offset control pair connected at the output of the first stage.
By applying a DC voltage Voffset, an offset voltage is developed at the output
node of the first differential pair, shifting the zero-crossing between the two
outputs. Due to the finite rise and fall time of the waveform, and the trunca-
tion by the limiting action of the last inverter, Vout exhibits a change in duty-
cycle as illustrated by the overlaid waveforms of Fig. 14.
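The dependence of duty cycle on the offset can be sketched with an idealized
model. The snippet assumes a trapezoidal waveform that swings linearly between
-A and +A during equal rise and fall times and is then hard-limited at its
zero crossing; these assumptions, the function name and the example numbers
are illustrative only and are not taken from the circuit simulations.

```python
def duty_cycle(v_offset, amplitude, rise_time, period):
    """Duty cycle after adding a DC offset to an idealized trapezoidal wave.

    The un-offset waveform has a 50% duty cycle and ramps linearly from
    -amplitude to +amplitude in rise_time (and back in the same time).
    An offset moves each zero crossing by rise_time * v_offset / (2 * amplitude),
    so the high time changes by rise_time * v_offset / amplitude.
    """
    if abs(v_offset) >= amplitude:
        raise ValueError("offset must be smaller than the signal amplitude")
    if rise_time > period / 2.0:
        raise ValueError("rise time must not exceed half the period")
    return 0.5 + (rise_time / period) * (v_offset / amplitude)


if __name__ == "__main__":
    # Hypothetical values: 100 ps period (a 1010 pattern at 20 Gb/s),
    # 40 ps edges, offset equal to plus or minus half the amplitude.
    for v_off in (-0.5, 0.0, 0.5):
        print(v_off, duty_cycle(v_off, 1.0, 40e-12, 100e-12))
```

In this model the achievable range scales with the ratio of edge time to
period, which is why finite rise and fall times are essential to the technique.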
3.5.1 Transistor sizing and biasing in the eye-crossing control circuit
On the left side of Fig. 14, the input pair simply drives a series RL load, no
differently than any of the other inductively peaked circuits of this backplane
driver. The DC bias voltage Voffset of the middle pair of transistors (M3 and
M4) must be held constant; otherwise a mixing effect would occur, because the
currents drawn by M3 and M4 act on the same signal path as M1 and M2. Hence it
must be ensured that the control voltage Voffset comes from a low-noise source.
The third stage acts as a limiting amplifier as previously described. The tail
current was chosen to be 12 mA and the differential pair transistors were
sized at 32 µm each, so that a current density of 0.19 mA/µm flows in each
during steady state. This biasing is in line with the half-peak-fT
recommendation derived in Section 3.2.
Fig. 14: Eye-crossing control circuit and intermediate signal
3.6 Output driver and parallel current summation
The output driver and digital pre-emphasis circuit, in parallel, create the
output voltage across the output load resistor. Fig. 15 shows how the output
currents of the output driver and the digital differentiator are summed.
The ratio of the output current from the output driver (I1 in Fig. 15) to the
output current contribution from the digital differentiator (I2 in Fig. 15)
determines the percentage of pre-emphasis in the overall output voltage, Vout.
The higher this ratio, the lower the percentage of pre-emphasis in Vout.
Fig. 15: Summation of output currents across output resistor
The output driver is shown in Fig. 16 and consists of a simple differential pair
biased with a current mirror. The value of bias current in the output driver
determines the magnitude of output signal swing from the main path.
[Fig. 15 labels: delay buffers; digital differentiator (I2); output driver (I1); summed current I1 + I2 through Rout developing Vout]
Fig. 16: Output driver transistor-level schematic
3.7 Digital pre-emphasis
The digital pre-emphasis circuit, whose block diagram is shown in Fig. 17, fea-
tures three delay cells, implemented as inverters, followed by a digital differ-
entiator circuit. The third delay cell is loaded additionally with MOS
varactors connected in parallel with resistive loads in order to control the
delay and, therefore, the pre-emphasis spike width.
Fig. 18 illustrates the waveforms in the digital differentiator. The circuit has
two differential inputs, IN and INdly, each supplied together with its
complement, and a differential output (Voutp, Voutn) which is summed with the
currents from the main path across the on-chip 50 Ω load resistors.
[Fig. 16 labels: Vdd = 1.5V; Ibias = 2mA; Iout = 20mA; inputs Vin+, Vin-; outputs VoutP, VoutN; transistors M1, M2; Rload on each side shared with the digital differentiator]
Fig. 17: Digital pre-emphasis block diagram
Fig. 18: Pre-emphasis waveforms and transfer function
The input signals are assumed to be periodic for this example. Input signal
IN and its complement are delayed through the delay circuit to produce INdly
and its complement, respectively. As shown by the dotted arrows in Fig. 18,
the delay between IN and INdly translates into the pre-emphasis pulse width of
VoutP. The same relationship holds for the complementary signals and VoutN.
The four signals (IN, INdly and their complements) are fed into the digital
differentiator circuit, which has two outputs, VoutP and VoutN.
The differentiator functions as a logical XOR gate with one notable excep-
tion. Whereas an XOR gate in the classical sense operates within a binary
logic system, this logic circuit operates on a ternary (three-level) logic system.
Specifically, when IN AND NOT(INdly) is true, the output VoutP rises; when
NOT(IN) AND INdly is true, the output VoutP falls; and when neither case is
true, the output of the circuit stays in steady state. The output VoutN falls
and rises in a mirrored fashion.
The transformation of binary logic at the input of the differentiator to ternary
logic at its output is accomplished using current switching. The circuit of Fig.
19 is biased by two constant current sources of equal value, Iswch. The voltages
at the output of both VoutP and VoutN are current-controlled by voltage drops
across each 50 Ω load resistor. In the steady-state, both VoutP and VoutN are
pulled down by an equal current of value Iswch. When VoutP rises (and VoutN
drops), the current drain path pulling down VoutP is cut off, and VoutN is
pulled down by a current equal to 2Iswch. Conversely, when VoutP drops (and
VoutN rises), VoutP is pulled down by current 2Iswch, and VoutN has its cur-
rent path to ground cut off.
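The three-level current switching described above can be captured in a small
behavioural model. This is only a sketch: it assumes that VoutP rises when IN
is high and INdly is still low, it models each output as VDD minus the drop
across a single 50 Ω load, and it ignores the main-path current that shares
the same load resistors. The function names and the 2 mA example value are
hypothetical.

```python
def differentiator_currents(in_bit, in_dly_bit, i_swch):
    """Return the currents (A) pulled from (VoutP, VoutN).

    The total current is always 2 * i_swch; only its split changes.
    """
    if in_bit and not in_dly_bit:
        # VoutP rises: its pull-down path is cut off, VoutN takes 2*Iswch.
        return 0.0, 2.0 * i_swch
    if in_dly_bit and not in_bit:
        # VoutP falls: it takes 2*Iswch, VoutN is cut off.
        return 2.0 * i_swch, 0.0
    # IN and INdly agree: steady state, Iswch in each branch.
    return i_swch, i_swch


def output_voltages(in_bit, in_dly_bit, i_swch, vdd=1.5, r_load=50.0):
    """Return (VoutP, VoutN) in volts for the differentiator acting alone."""
    i_p, i_n = differentiator_currents(in_bit, in_dly_bit, i_swch)
    return vdd - i_p * r_load, vdd - i_n * r_load


if __name__ == "__main__":
    for in_bit, in_dly_bit in ((0, 0), (1, 0), (1, 1), (0, 1)):
        print(in_bit, in_dly_bit, output_voltages(in_bit, in_dly_bit, 2e-3))
```

Because the two branch currents always sum to 2 Iswch, the common-mode level
of the outputs is unchanged while the differential level takes three values.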
Fig. 19: NMOS digital differentiator schematic
In Fig. 19, the pre-emphasis height is controlled by the two constant current
sources Iswch, whose value is adjustable between 0 and 10 mA. For matching,
transistors M3 and M6 compensate the VDS drop across transistors M1, M2
and M4, M5, respectively.
4 Simulation Results
4.1 S-Parameter simulations
The small-signal S-parameters were simulated in a single-ended input and
output configuration. The input and output return loss are better than -5 dB
up to 60 GHz, as shown in Fig. 20, with S11 achieving -10dB up to 50 GHz.
The single-ended simulated small-signal gain is 14.5dB, confirmed by the eye
diagram measurements of Fig. 21. Further, Fig. 21 shows that the 3dB-fre-
quency of the driver is 8.5 GHz and the driver has gain (> 0dB) up to 24.1
GHz.
Fig. 20: Simulated S22 and S11 of complete driver circuit to 60 GHz
Fig. 21: Simulated single-ended S21 of entire driver circuit
With the driver’s output amplitude control set to a low level, the resulting
simulated S21 is as shown in Fig. 22. The output amplitude is controlled by an
off-chip current source and can be manipulated to increase or decrease the low
frequency gain peak. The decreased gain gives an extended 3dB-bandwidth
and 0dB crossing of 9.3 GHz and 37.6 GHz, respectively.
Fig. 22: Simulated single-ended S21 of driver with output amplitude reduced using amplitude control
S21 simulations with output peaking enabled are shown in Fig. 23. The peaking
is evident in the simulation, increasing the maximum small-signal gain
from 14.1dB to 16.5dB and the 3dB frequency to 13.3 GHz. The frequency-
shaping effect of the pre-emphasis output circuit is exemplified here.
Fig. 23: Simulated S21 with output peaking enabled
4.2 Time domain simulations
Time domain eye-diagram simulations were performed on the full driver cir-
cuit with a 2^7-1 PRBS (pseudo-random binary sequence) generator. In Fig. 24,
a 20 Gb/s input signal was applied single-ended and the unused input was ter-
minated with a 50 Ω resistance connected to VDD.
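For reference, a 2^7-1 PRBS of the kind used as stimulus can be generated with
a seven-stage linear feedback shift register. The sketch below assumes the
commonly used generator polynomial x^7 + x^6 + 1; the report does not state
which polynomial the simulator used, so this choice and the function name are
assumptions.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a 2^7-1 PRBS (assumed polynomial x^7 + x^6 + 1)."""
    if not 0 < seed < 128:
        raise ValueError("seed must be a non-zero 7-bit value")
    state = seed
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of shift-register stages 7 and 6.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        # Shift left by one stage and insert the feedback bit.
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


if __name__ == "__main__":
    sequence = prbs7(254)
    # The pattern repeats after 127 bits.
    print("repeats after 127 bits:", sequence[:127] == sequence[127:])
    print("ones in one period:", sum(sequence[:127]))
```

The same structure with 31 stages yields the 2^31-1 pattern, at the cost of a
much longer repetition period.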
Fig. 24: 20Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 20 mVpp; differential output 99mVpp per side
Fig. 25 and Fig. 26 show simulated eye-diagrams at 25 Gbps and 30 Gbps,
respectively. The circuit exhibits reduced sensitivity at 25 Gbps and 30 Gbps,
requiring, respectively, 60 mV and 140 mV single-ended inputs for similar eye
openings.
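The large-signal voltage gain implied by these eye diagrams can be computed
directly from the input and output amplitudes. The sketch below uses the
simulated single-ended input and per-side output amplitudes quoted for
Figs. 24 to 26; the function name is hypothetical.

```python
import math


def voltage_gain_db(v_in_mvpp, v_out_mvpp):
    """Voltage gain in dB from peak-to-peak input and output amplitudes."""
    return 20.0 * math.log10(v_out_mvpp / v_in_mvpp)


if __name__ == "__main__":
    # (data rate in Gb/s, single-ended input in mVpp, output per side in mVpp)
    simulated = [(20, 20.0, 99.0), (25, 60.0, 180.0), (30, 140.0, 300.0)]
    for rate, v_in, v_out in simulated:
        print(f"{rate} Gb/s: {voltage_gain_db(v_in, v_out):.1f} dB")
```

The gain drops as the data rate increases, which is consistent with the
reduced sensitivity reported at 25 Gbps and 30 Gbps.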
Fig. 25: 25Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 60 mVpp; differential output 180mVpp per side
Fig. 26: 30Gb/s eye-diagrams 2^7-1 PRBS: single-ended input 140 mVpp; differential output 300mVpp per side
Fig. 27 demonstrates the simulated eye-crossing control performance at 20
Gb/s taken at the output of the driver. The eye-crossing control for (a) 50%, (b)
66%, and (c) 33% was performed by varying the control voltage that was con-
nected to off-chip voltage sources (Voffset of Section 3.5.1).
Fig. 27: Simulated eye-crossing control at 20 Gb/s (a) 50%, (b) 66%, and (c) 33%
Fig. 28 shows output amplitude control at 20 Gb/s. The input signal in both
diagrams is 20 mVpp, applied to one side only, and the output is varied
between 21 mVpp and 220 mVpp.
Fig. 28: Output amplitude control at 20 Gb/s; input of 20 mVpp applied to a single end; output of (a) 21 mVpp and (b) 220 mVpp per side
Simulated waveforms with varying levels of pre-emphasis at 20Gb/s are
shown in Fig. 29 and Fig. 30. By altering the current bias of the parallel
digital pre-emphasis path, the pre-emphasis of the overall driver can be
controlled independently of the main path of the driver. Fig. 29 shows a
symmetric +/-16% overshoot/undershoot ratio, relative to the voltage swing of
the main-path signal. Fig. 30 shows an asymmetric +33%/-25%
overshoot/undershoot ratio. The asymmetry in Fig. 30 is attributed to a
systemic limitation of the pre-emphasis circuit: the maximum output voltage
cannot exceed VDD, while the minimum value must be higher than 3*VDS(sat) due
to the triple-stacked NMOS configuration of Fig. 19. Hence, the ratio of
overshoot/undershoot relative to the isolated main-path signal swing is
directly related to the pre-emphasis bias current. A lower digital
pre-emphasis current results in a symmetric overshoot/undershoot ratio as
shown in Fig. 29, but the relative percentage of overshoot/undershoot reduces
to less than 25%.
Fig. 29: 20Gb/s output eye diagram with 316 mVpp swing per side and +/-16% pre-emphasis
Fig. 30: 20Gb/s output eye diagram with 300 mVpp swing per side and +33%/-25% pre-emphasis
5 Physical Implementation
5.1 Inductor design and model extraction
The computer-aided design (CAD) tool ASITIC
(http://rfic.eecs.berkeley.edu/~niknejad/asitic.html) was used in the
simulation and sizing of the inductors in the driver. ASITIC is a
three-dimensional field solver which aids the RF circuit designer in the
optimization and modelling of spiral inductors, transformers, capacitors, and
substrate coupling. Test structures previously fabricated and characterized at
the University of Toronto have confirmed the accuracy of the ASITIC solver to
within 90-95% of absolute inductance values, for both planar and stacked
spiral inductors.
All inductors of the driver were simulated as two-port circuits using both the
π−model of Fig. 31 and the lumped element broadband model shown in Fig.
32. In the substrate, the eddy-current induced loss and substrate capacitance
are represented by Rsub and Csub, respectively. The oxide capacitance is
represented by Cox. The series inductance and resistance of the inductor proper
are represented by Ls and Rs, respectively. Finally, the capacitance between the
two symmetrical interwoven arms of the inductor is represented by Ciw.
None of the passive lumped elements in the model of Fig. 32 could be omitted
to reduce simulation time because each inductor was not attached directly to a
ground node in the design. This design requirement was discussed previously
in section 2.3. Further, as each inductor was employed in a broadband load
configuration, design considerations gave priority to self-resonant frequency
over quality factor (Q) in the optimization of each inductor.
5.1.1 Model extraction using ASITIC
This section outlines the procedure used to extract both the π-model of the
inductor, which is valid only at a single frequency, and the broadband
model of the inductor, which is valid at frequencies below the self-resonant
frequency.
The π-model of the inductor is shown in Fig. 31 and is valid only at a single-
frequency. The ASITIC three-dimensional field-solver is invoked using the
command pix to resolve the circuit parameters Ls, Rs, Cox1, Cox2, Rsub1, and
Rsub2 from an inductor layout drawn in the ASITIC graphical tool. These val-
ues are assumed correct at all frequencies below self-resonance for the induc-
tor in question. The broadband model introduces three fitting capacitances
(Ciw, Csub1, Csub2) to account for the broadband frequency response of the
inductor’s parasitics.
The broadband model capacitors are found using a Y-parameter simulation
from 5 GHz up to the self-resonant frequency of the inductor, in 1 GHz steps.
The output of the Y-parameter data is then ported to a SPICE optimization
deck. The optimization deck is included in Appendix A for reference. The
SPICE optimization deck will best fit the three broadband model capacitances
to match the Y-parameter data simulated in ASITIC. As three unknowns are
being optimized for simultaneously, reasonable and educated estimates for
each of the broadband capacitances (Ciw, Csub1, Csub2) are substituted initially.
The SPICE deck in Appendix A will also print L12, Leff and Q vs. frequency.
Fig. 31: Inductor single-frequency π-model
Fig. 32: Inductor lumped element broadband model
Derived ASITIC parameters
Table 2 and Table 3 show the physical dimensions and the simulated values of
the equivalent circuit parameters for each of the three inductors employed in
this design. All inductors have dual-layer stacked-spiral configurations with
varying numbers of turns. The maximum outer diameter (per side), 51.2 µm,
is used for the 900pH inductor.
Table 2: Physical dimensions of each inductor
Parameter                       LS = 400pH       LS = 700pH       LS = 900pH
Diameter (µm)                   42.2             46.2             51.2
Number of turns                 2                3                3
Metal width (µm)                1.96             1.96             1.96
Spacing between windings (µm)   1.96             1.96             1.96
Metal layers                    METAL6, METAL5   METAL6, METAL5   METAL6, METAL5
5.1.2 Inductor realization and isolation
The 900pH inductor is shown in Fig. 33 and comprises two metal layers: metal
six, the top level of the six-metal process, and metal five. The left-side
terminal of the inductor is formed in metal six and connects directly to the
drain of the amplifying transistor of that half stage. The right-side port of
the inductor is routed in metal five and is connected to a poly-silicon
resistor at each stage.
Each inductor is isolated from each adjacent inductor to minimize crosstalk.
The isolation is accomplished by surrounding each inductor with n-wells in
turn surrounded by p-tap guard rings connected to ground to impose reverse-
biasing on the junctions. Additionally, referring to Fig. 33, the p-taps between
adjacent inductors are electrically attached to a stack of metal layers, from
metal 1 through metal 6, which form a Faraday cage and improve isolation.
Table 3: Simulated inductor parasitic values
Parameter          LS = 400pH   LS = 700pH   LS = 900pH
RS (Ω)             5.14         8.01         9.13
Cox1 (fF)          8.63         11           13
Cox2 (fF)          8.2          10.8         12.7
Rsub1 (Ω)          1820         1830         1710
Rsub2 (Ω)          2280         2130         1980
Csub1 (fF)         5.72         5.76         6.16
Csub2 (fF)         4.70         4.95         5.32
Ciw (fF)           4.69         6.26         7.24
fSelf-Res (GHz)    85.6         56.7         45.93
Q-factor (5 GHz)   2.4          2.7          3.1
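As a quick plausibility check on the tabulated quality factors, the
low-frequency Q of each inductor can be estimated from its series branch alone
as Q = ωLs/Rs. This sketch neglects all shunt and interwinding elements (Cox,
Rsub, Csub, Ciw), which is an assumption that is reasonable only well below
self-resonance; the function name is hypothetical.

```python
import math


def series_q(l_henry, r_ohm, freq_hz):
    """Quality factor of a series R-L branch: Q = 2*pi*f*L / R."""
    return 2.0 * math.pi * freq_hz * l_henry / r_ohm


if __name__ == "__main__":
    # (nominal inductance in H, simulated series resistance in ohms)
    inductors = {
        "400pH": (400e-12, 5.14),
        "700pH": (700e-12, 8.01),
        "900pH": (900e-12, 9.13),
    }
    for name, (l_s, r_s) in inductors.items():
        print(f"{name}: Q at 5 GHz = {series_q(l_s, r_s, 5e9):.2f}")
```

To one decimal place, these series-branch estimates agree with the tabulated
5 GHz values, suggesting that the substrate parasitics contribute little to
the loss at that frequency.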
Fig. 33: On-chip 900 pH inductor with 51.2µm diameter
5.2 Layout and fabrication
The circuit was fabricated in ST Microelectronics’ 130-nm standard CMOS
process with typical n-MOSFET fT and fMAX of 90 GHz and 100 GHz,
respectively. The chip microphotograph is reproduced in Fig. 34. The design is
pad-limited and the total die area is 1.0 mm x 0.8 mm.
Fig. 34: Full chip photograph
Magnified versions of the two parallel paths of the driver are shown in Fig.
35 and Fig. 36, respectively. Using Fig. 10 as a reference that describes the
schematic block diagram of the entire driver, Fig. 35 emphasizes the layout of
the main signal path of the driver while Fig. 36 details the layout of the paral-
lel pre-emphasis signal path.
Fig. 35: Magnified photograph of main path layout
Fig. 36: Magnified photograph of the parallel path layout
The entire layout, with the exception of the digital pre-emphasis XOR gate, is
completely symmetric about the horizontal axis. Power and ground connec-
tions are provided along the central axis in metal 1 and metal 2, respectively.
Transistors and poly-silicon resistors are located adjacent to the division, fol-
lowed by signal path routing, with the peaking inductors located on the out-
side.
The fully-symmetrical layout results in several advantages: (i) the layout of
the entire circuit is simplified by employing half-cell layout and replication
techniques; (ii) the positive and negative signal paths are matched in length
and are physically and electrically isolated from one another; and (iii) the
coupling coefficients of same-stage load inductors are diminished.
6 Experimental Results
6.1 Test environment
The fabricated circuit was tested on wafer with probes placed on the circuit
pads under a microscope. The frequency-domain measurements were made
with a 50-GHz 8510C Hewlett-Packard Vector Network Analyzer (VNA), as
shown in Fig. 37. The time-domain 2^31-1 PRBS stimulus was created with a
combination of an Anritsu 69397B Signal Generator, an MP17584 Pulse Pattern
Generator, and an MP1801A 43.5 Gbps MUX as shown in Fig. 38.
In Fig. 38, the signal generator generates a CLK signal which is driven to the
MUX, and also divided by four and sent to the pulse pattern generator. The
pulse pattern generator produces four 1/4 DATA signals, each with a bitrate
one-quarter of the final PRBS stream bitrate. The four 1/4 DATA signals are
time-multiplexed, with the original CLK controlling the switching of the MUX
on positive CLK edges. Hence, the output of the MUX is a 2^31-1 PRBS bit
sequence of the required bitrate for the test, not exceeding 43.5 Gbps.
Fig. 37: S-parameter test setup
[Fig. 37 labels: Hewlett-Packard 8510C; D.U.T. with DC bias / control probes on both sides; DA in; DA out; 50 ohm terminations]
Fig. 38: Eye-diagram test setup
[Fig. 38 labels: 69397B Synthesized Sweep/Signal Generator (CLK); MP17584 Pulse Pattern Generator (1/4 CLK in, 1/4 DATA-RATE PRBS out); MP1801A 43.5 Gb/s MUX (PRBS at DATA-Rate out); D.U.T. with DC bias / control probes and 50 ohm terminations to Vdd; 86100B DCA, 20 - 43.5 Gb/s, with PRBS reference CLK]
6.2 S-Parameters and noise figure
The small-signal S-parameters and noise figure were measured in a single-
ended configuration. The small-signal gain, S21, is shown in Fig. 39 and
agrees well with the simulated results of Fig. 21. The measured single-ended
S21 bandwidth was 8.5 GHz, matching the simulated value. The measured
input and output return loss are better than -12 dB up to 50 GHz, as
shown in Fig. 40, outperforming the simulated value of -5dB up to 60 GHz in
Fig. 20. Measured and simulated noise figure values are plotted in Fig. 41. As
expected, the simulations show an inverse relationship between tail current
and overall driver noise figure. Furthermore, the measured noise figure of the
overall driver was 2 dB higher than simulated for a 6mA tail current in the
input comparator. This is primarily due to losses associated with the probe-
pad contact resistance and the series substrate resistance below the pad,
which are not accounted for in simulation. Another reason for the
higher-than-simulated noise figure is a limitation of the BSIM3 MOSFET model,
which does not capture the gate noise current of the MOSFET.
Fig. 39: Measured single-ended S21
Fig. 40: Measured single-ended S22 and S11
Fig. 41: Measured and simulated driver noise figure
6.3 Time-domain measurements
Time domain measurements were carried out at data rates between 20 Gb/s
and 43 Gb/s using a 2^31-1 PRBS pattern. A sensitivity of 20 mVpp, as
illustrated in Fig. 42, was measured at 20 Gb/s when the input signal was
applied single-ended and the unused input was terminated with a 50 Ω resis-
tance. The sensitivity degraded to 60 mVpp and 150 mVpp at 25 Gb/s and 30
Gb/s, respectively.
Fig. 42: 20Gb/s eye-diagrams 2^31-1 PRBS: single-ended input 20 mVpp; differential output 84mVpp per side
Typical 25 Gb/s and 30 Gb/s output eye diagrams are illustrated in Fig. 43
and Fig. 44, respectively. It is important to note that the driver exhibits gain
at 30 Gb/s, with a 200mVpp input signal resulting in a 260mVpp output signal.
This is the first known recording of positive gain at 30 Gb/s in an all-CMOS
driver to date (2005).
Fig. 43: 25Gb/s output eye diagram with 50% eye crossing
Fig. 44: 30 Gb/s output eye-diagram with 260 mVpp per side for a single-ended 200 mVpp, 2^31-1 PRBS input.
Fig. 45 demonstrates the eye-crossing control performance at 20 Gb/s taken at
the output of the driver. The eye-crossing control for (a) 50%, (b) 70%, and (c)
30% was performed by varying the control voltage that was connected to off-
chip voltage sources.
Fig. 45: Eye-crossing control at 20 Gb/s (a) 50%, (b) 70%, and (c) 30%
Fig. 46 shows output amplitude control at 20 Gb/s. The input signal in both
diagrams is 200 mVpp, applied to one side only, and the output is varied
between 190 mVpp and 350 mVpp.
Fig. 46: Output amplitude control at 20 Gb/s; output of (a) 190 mVpp and (b) 350 mVpp per side
Measured waveforms with pre-emphasis at 20Gb/s are shown in Fig. 47. The
eye-diagram exhibits higher positive overshoot than undershoot with spike
height control between 0% and 25% of the eye height. The overshoot/under-
shoot imbalance shown in Fig. 47 was caused by an excessive ratio between the
pre-emphasis path tail current and the main path tail current. The ratio of
overshoot/undershoot relative to the output signal swing alone is directly
related to the pre-emphasis path tail current value. A lower digital pre-
emphasis tail current could have resulted in a symmetrical overshoot/under-
shoot ratio, but the relative percentage of overshoot/undershoot would have
been reduced to less than 15%. Unfortunately, experimental evidence of this was
not captured due to time restrictions at the testing facilities at Quake Tech-
nologies and ST Microelectronics, both located in Ottawa, Canada.
Fig. 47: 20Gb/s output eye diagram with 300 mVpp swing per side and pre-emphasis
6.4 Performance summary
Table 4 summarizes the overall circuit characteristics. Of special note are the
high input sensitivity and high output swing, the input/output matching of
better than -12 dB up to 50 GHz, and the multiplicity of control mechanisms with
respect to output amplitude, pre-emphasis and eye-crossing. Further, there is
excellent agreement between the simulated and measured values of the driver
in both the time domain and the frequency domain.
Table 5: Simulated and measured circuit parameters
Parameter                                    Simulated Value                    Measured Value
Supply                                       1.5 V                              1.5 V
Power                                        150 mW                             150 mW
Output swing @ 20 Gb/s with 20 mVpp input    21-350 mVpp per side               190-350 mVpp per side
Pre-emphasis control @ 20 Gb/s               +33%/-25%                          +25%/-15%
Crossing control @ 20 Gb/s                   33% to 66%                         30% to 70%
Eye sensitivity @ 20 Gb/s                    20 (10) mVpp                       20 (10) mVpp
Noise Figure                                 14.9 dB @ 5 GHz, 15 dB @ 15 GHz    17 dB @ 5 GHz, 17 dB @ 15 GHz
S11/S22 up to 50 GHz                         < -5 dB                            < -12 dB
7 Conclusions
7.1 Summary
A 20 Gb/s backplane driver with more than 30 dB dynamic range was imple-
mented in 130-nm CMOS technology. The circuit consumes 150 mW from a 1.5
V supply and features independent control of output swing, duty cycle and
pre-emphasis. The circuit is operational without pre-emphasis at data rates
up to 30 Gb/s with 300 mVpp swing per side. The pre-emphasis pulse is both
amplitude and width controllable via the introduction of a novel digital circuit
implementation which does not require a separate clock signal. The results of
this project were published and presented at the IEEE Custom Integrated
Circuits Conference in 2004 [1].
7.2 Future work
Future work on this design should address further system verification,
expansion and industrialization of the circuit. Because time-domain
testing was carried out in industrial settings, tester time was limited
and, in particular, pre-emphasis experimentation was affected. Further
testing of the pre-emphasis circuit would involve tuning the varactor to
increase and decrease the width of the pre-emphasis pulse, and varying the
relative bias current of the digital differentiator to empirically match the
positive and negative pre-emphasis pulse heights. In terms of system expansion and
industrialization, a receiver circuit placed across a backplane with appropri-
ate mechanisms to control the pre-emphasis would be required to close the
feedback loop. Further industrialization of the main path would require
instantiation of on-chip and off-chip reference voltages and currents for each
of its independently controlled stages. Finally, verification of the circuit in a
datapath operation to measure its true performance in a digital data-specific
application is required. This would involve the addition of a Media Access
Control (MAC) circuit and layer at the input of the driver as well as its MAC
counterpart on the receiving end.
References
[1] P. Westergaard, S.P. Voinigescu, T.O. Dickson, “A 1.5-V, 20/30-Gb/s CMOS
Backplane Driver with Digital Pre-emphasis,” Proc. IEEE Custom Integrated
Circuits Conference, pp. 23-26, Orlando, FL, Oct. 2004
[2] S. Galal, B. Razavi, “40Gb/s Amplifier and ESD protection Circuit in 0.18-
um CMOS Technology,” IEEE ISSCC Digest, pp.480-481, 2004
[3] D. Kehrer, H.D. Wohlmuth, “40 Gb/s 2:1 Multiplexer and 1:2 Demulti-
plexer in 120 nm CMOS,” IEEE ISSCC Digest, pp. 345-346, 2003
[4] J. Liu, X. Lin, “Equalization in high-speed communication systems,” Cir-
cuits and Systems Magazine, IEEE, Volume 4, Issue 2, pp. 4-17, 2004
[5] S.S. Mohan, M.D.M. Hershenson, S.P. Boyd, T.H. Lee, “Bandwidth exten-
sion in CMOS with optimized on-chip inductors,” IEEE Journal of Solid
State Circuits, Volume 35, Issue 3, March 2000, pp. 346 - 355
[6] M. Green, “Current-controlled CMOS circuits with Inductive broadband-
ing,” U.S. Patent 6,525,571, Filed Sept. 26, 2001
[7] S. Ramo, J.R. Whinnery, T. Van Duzer, Fields and Waves in Communication
Electronics, 3rd Ed. New York: John Wiley & Sons, 1994
[8] T.O. Dickson, R. Beerkens, S.P. Voinigescu, “A 2.5-V, 45-Gb/s Decision
Circuit Using SiGe BiCMOS Logic,” IEEE Journal of Solid-State Circuits,
Volume 40, Issue 4, pp. 994-1003, April 2005
[9] B. Razavi, Design of Analog CMOS Integrated Circuits, 1st Ed. New York:
McGraw-Hill, 2001
[10] A. Sedra, K. Smith, Microelectronic Circuits, 4th Ed. New York: Oxford
Press, 1998
[11] S.P. Voinigescu, T.O. Dickson, R. Beerkens, I. Khalid, P. Westergaard, "A
Comparison of Si CMOS, SiGe BiCMOS, and InP HBTs Technologies for
High-Speed and Millimeter-wave ICs," Si Monolithic Integrated Circuits in
RF Systems, pp.111-114, Atlanta, GA, Sept. 2004
[12] R. Ranfft, H.M. Rein, “High-speed bipolar logic circuits with low power
consumption for LSI - a comparison,” IEEE Journal of Solid State Circuits,
Vol. 17, Issue 4, pp. 703-712, Aug. 1982
[13] D.S. McPherson, F. Pera, M. Tazlauanu, S.P. Voinigescu, “A 3V fully
differential distributed limiting driver for 40-Gb/s optical transmission
systems,” IEEE Journal of Solid-State Circuits, Volume 38, Issue 9,
pp. 1485-1496, Sept. 2003
Appendix A: Spice file for inductor parameter extraction
.option acct nomod post=2 probe
.net v(p2) vin rout=50 rin=50
vin p1 0 AC 1
L p1 3 Ls
R 3 p2 Rs
Cs1 p1 1 Cp1
Cs2 p2 2 Cp2
Rs1 1 0 Rsub1
Rs2 2 0 Rsub2
Csub1 1 0 Csub1
Csub2 2 0 Csub2
Cbr p1 p2 Cbr
.param
+ Ls = 0.407n
+ Rs = 6.9
+ Rsub1 = 5240
+ Rsub2 = 691
+ Cp1 = 4.67f
+ Cp2 = 5.31f
+ Csub1 = OPT1(0.1p, 0.0001p, 10p)
+ Csub2 = OPT1(0.1p, 0.0001p, 10p)
*+ Csub1 = 5.72f
*+ Csub2 = 4.70f
+ Cbr = OPT1(30f, 0.0001p, 10p)
.AC data=measured optimize=opt1
+ results=comp1,comp2,comp3,comp4,comp5,comp6,comp7,comp8
+ model=converge
.model converge opt relin=1e-4 relout=1e-4 close=10 itropt=30
.measure ac comp1 err1 par(y11r) y11(r)
.measure ac comp2 err1 par(y11i) y11(i)
.measure ac comp3 err1 par(y12r) y12(r)
.measure ac comp4 err1 par(y12i) y12(i)
.measure ac comp5 err1 par(y21r) y21(r)
.measure ac comp6 err1 par(y21i) y21(i)
.measure ac comp7 err1 par(y22r) y22(r)
.measure ac comp8 err1 par(y22i) y22(i)
.ac data=measured
*.ac lin 75 5e9 79e9
.plot ac y21(m) y11(m) y21(db)
.print par(y11r) y11(r) par(y11i) y11(i)
.print par(y12r) y12(r) par(y12i) y12(i)
.print par(y21r) y21(r) par(y21i) y21(i)
.print par(y22r) y22(r) par(y22i) y22(i)
.print y11(r) y11(i) y11(m) y11(p)
.print y22(r) y22(i) y22(m) y22(p)
.print y12(r) y12(i) y12(m) y12(p)
.print y21(r) y21(i) y21(m) y21(p)
.print ac L12=par('y12(i)/((6.28*FREQ)*(y12(m)*y12(m)))')
.print ac Leff=par('-y11(i)/((6.28*FREQ)*(y11(m)*y11(m)))')
.print ac Q=par('-y11(i)/y11(r)')
.print ac cox=par('(y11(i) + y12(i))/(6.28*FREQ)')
*.print par(s11r) s11(r) par(s11i) s11(i)
*.print par(s12r) s12(r) par(s12i) s12(i)
*.print par(s21r) s21(r) par(s21i) s21(i)
*.print par(s22r) s22(r) par(s22i) s22(i)
*.print z11(r) z11(i) z11(m) z11(p)
*.print z22(r) z22(i) z22(m) z22(p)
*.print z12(r) z12(i) z12(m) z12(p)
*.print z21(r) z21(i) z21(m) z21(p)
.data measured
FREQ Y11r Y11i Y12r Y12i Y21r Y21i Y22r Y22i
***PLACE ASITIC SIMULATION DATA HERE***
*.param freq=100MEG,s11m = 0 , s11p = 0, s12m = 0, s12p = 0, s21m =0,
*+s21p =0, s22m =0 , s22p = 0
.end
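
The Leff and Q expressions printed by the file above can be checked numerically outside the simulator. The following Python sketch is a simplified illustration, not part of the original extraction flow: it assumes that only the series Ls-Rs branch of the model contributes to Y11 (the shunt and bridging capacitances are ignored), and it uses 2*pi rather than the 6.28 approximation used in the Spice file. The function names are hypothetical.

```python
import math


def series_rl_y11(freq_hz: float, ls_h: float, rs_ohm: float) -> complex:
    """Y11 of a series R-L branch with the far port shorted (parasitics ignored)."""
    return 1.0 / complex(rs_ohm, 2.0 * math.pi * freq_hz * ls_h)


def effective_inductance(y11: complex, freq_hz: float) -> float:
    """Leff = -Im(Y11) / (2*pi*f * |Y11|^2), the same form as the Leff .print line."""
    return -y11.imag / (2.0 * math.pi * freq_hz * abs(y11) ** 2)


def quality_factor(y11: complex) -> float:
    """Q = -Im(Y11) / Re(Y11), the same form as the Q .print line."""
    return -y11.imag / y11.real


# Ls and Rs are the values from the .param block; 10 GHz is an arbitrary example frequency.
freq = 10e9
y11 = series_rl_y11(freq, ls_h=0.407e-9, rs_ohm=6.9)
l_eff = effective_inductance(y11, freq)
q = quality_factor(y11)
print(f"Leff = {l_eff * 1e9:.3f} nH, Q = {q:.2f}")
```

Under this simplification the Leff expression returns the series inductance exactly, since -Im(Y11)/|Y11|^2 equals Im(1/Y11), and Q reduces to the familiar ratio of inductive reactance to series resistance (about 3.7 at 10 GHz for these values). With the substrate and bridging capacitances included, both quantities become frequency dependent.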