
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 43, NO. 9, SEPTEMBER 2008 2157

Edge and Data Adaptive Equalization of Serial-Link Transceivers

Koon-Lun Jackie Wong, E-Hung Chen, and Chih-Kong Ken Yang, Senior Member, IEEE

Abstract—Limited channel bandwidth introduces inter-symbol interference (ISI) at both data and edge samples. In addition to the ISI at data samples, ISI at the edge samples (edge ISI) increases the bit error rate (BER) by degrading the eye diagram and increasing the jitter of the clock and data recovery (CDR). This work proposes a forward FIR equalizer and a decision-feedback equalizer (DFE) that compensate for ISI at both data and edge samples. To adapt both the data and edge equalizers, a modified LMS adaptation algorithm is introduced to achieve convergence. A transmitter and receiver are implemented in 0.13 µm and 0.18 µm technologies respectively. The edge ISI is improved by 20% and the jitter is improved by 10% in measurement. The link operates over a 120-in FR4 channel with 24 dB attenuation at the Nyquist frequency, and the BER is below $10^{-14}$ at 3.6 Gb/s.

Index Terms—Bit-error rate, CDR, equalizer, I/O link, ISI, LMS adaptation.

I. INTRODUCTION

With the increase in on-chip data processing, demand for I/O data rate has continued to increase. Bandwidth limitations due to the higher data rates can severely attenuate the signal even across short distances. The low-pass filtering of the channel introduces inter-symbol interference (ISI) and necessitates channel equalization. A large number of mixed-signal equalizers [1]–[4] have been proposed with a wide range of architectures, such as designs that are transmitter-based, receiver-based, continuous-time, discrete-time, FIR-based forward equalizers, decision-feedback equalizers (DFE), etc. Designs have been demonstrated to achieve accurate ISI compensation for multi-Gb/s data rates across channels with 30 dB of attenuation at the Nyquist frequency.

This paper focuses on discrete-time equalizers [4] with a forward FIR at the transmitter and a DFE at the receiver. In addition to correcting the ISI at the data samples, this paper describes the design, implementation, and adaptation of equalizers that also compensate the ISI at the edges of the eye opening where the timing information is sampled, the edge ISI. The paper shows the improvements in BER with our proposed techniques.

Edge ISI degrades BER in two ways. First, edge ISI worsens the curvature of the data eye center, so a non-ideal sampling clock leads to data samples with less voltage margin. The reduced margin is illustrated by the circled sampling points in Fig. 1(a). Second, as shown conceptually in Fig. 1(b), since the edge samples are used to provide timing information to the clock and data recovery (CDR) circuits, the edge ISI increases the sampling jitter through the CDR.

Manuscript received February 22, 2008; revised May 6, 2008. Current version published September 10, 2008.

K.-L. J. Wong is with Broadcom Corporation, Irvine, CA 92617 USA.

E.-H. Chen and C.-K. K. Yang are with the University of California at Los Angeles, Los Angeles, CA 90095 USA.

Digital Object Identifier 10.1109/JSSC.2008.2001876

Edge equalization has been introduced in prior publications to compensate for the edge ISI. One technique is edge-only equalization [5]. By equalizing edge samples only, a less aggressive equalizer can be used. However, the data samples experience more ISI and hence the BER is degraded. Another technique to clean up edge ISI is to change the edge dynamically in the time domain [6]. An additional discrete-time equalizer is needed for this technique to be applicable to high-loss channels, where the signal transitions may not pass through the data sampler's slicing threshold [7]. Another technique that has been proposed is to filter out the undesired transitions [3]. By ignoring some transitions, the corresponding ISI is eliminated. However, transition filtering reduces transition density, which lowers the CDR bandwidth.

This paper introduces a comprehensive forward FIR that compensates both data and edge ISI using overlapping pulses to reduce power dissipation. The architecture is discussed in Section II-B. We also propose an improved edge equalizer implemented in the DFE. By only equalizing the CDR input, the CDR jitter is reduced without trading off the opening at the data sample or decreasing the loop bandwidth. Section II-C describes the implementation. To determine the tap coefficients, an adaptation algorithm is discussed in Section III. Unlike conventional discrete-time equalizers, an edge equalizer introduces an additional degree of freedom to adjust the average phase of the data transitions, which conflicts with the CDR phase lock. The additional degree of freedom results in multiple lock points of the CDR and the coefficient adaptation. Section III describes the modified least-mean-square (LMS) adaptation algorithm that achieves convergence. Section IV presents measurement results that validate the performance of the forward FIR equalizer, the DFE, and the adaptation algorithm.

II. EQUALIZER ARCHITECTURE

The overall transceiver with edge equalization implemented in both the transmitter and receiver is shown in Fig. 2. The transmitter equalizer is a half-symbol-spaced FIR filter (XFIR). The receiver equalizer uses separate DFEs for the data and edge samples (XDFE). A low-power CDR [8], which generates quadrature phases with programmable phase offset, is used to obtain the optimum sampling phase. A PRBS generator and checker are built on-chip to facilitate testing. The coefficients of the forward equalizer FIR and DFE are obtained by a modified LMS adaptation loop. For experimentation, the adaptation is done by storing the transition and data samples in a data memory and processing the information off-chip. In actual applications, dedicated logic and a back channel, such as [9], are needed to adapt the transmitter.

0018-9200/$25.00 © 2008 IEEE

Fig. 1. Impact of timing ISI: (a) reducing voltage margin, and (b) recovered clock with higher jitter.

Fig. 2. Architecture of a transceiver with equalizers at both transmitter and receiver. The transmitter equalizer is a forward FIR (XFIR), and the receiver equalizer is a DFE (XDFE).

This section begins with a brief discussion of the tradeoffs in implementing the edge equalization. The implementation details and proposed architectures are described subsequently.

A. Edge-Equalizer Tradeoff Analysis

A pulse response is shown in Fig. 3 for a band-limited channel. The pulse responses after a symbol-spaced equalizer (FIR) and a half-symbol-spaced equalizer (XFIR) are also shown. The approach for the timing equalization is to constrain the edge sample to have an amplitude of A/2, half that of the main data sample. Fig. 4 illustrates the simulated eye diagrams for both cases. The tradeoff for complete edge equalization is the improvement of the eye width (20%) at the cost of the signal amplitude (8%).

There are two benefits of the improved eye width. First, by reducing the edge ISI, the XFIR is advantageous in timing-noise-dominated systems. For multi-Gb/s data transmission, clock jitter can be a larger fraction of the unit interval (UI). Our measurements in Fig. 20 of Section IV will demonstrate under which conditions an XFIR outperforms a symbol-spaced FIR.

Fig. 3. Pulse response of proposed XFIR.

Fig. 4. Simulated eye diagrams of (a) conventional FIR and (b) XFIR.

Fig. 5. (a) Transition density at the edge sample for an unequalized eye. (b) Example of transition filtering that eliminates the [1 −1 1] edge. (c) The PSD of the edge ISI at the output of the phase detector for 3 cases: data unequalized, edge equalized, and transition filtered using the filter in (b).

Second, since edge samples are input to the CDR, reducing the timing uncertainty at the input makes the output clock of the CDR cleaner. The CDR bandwidth filters the timing noise, so the impact of the edge ISI depends on the power-spectral density (PSD) of the edge ISI and the bandwidth of the CDR. Fig. 5(a) illustrates the transition density at the edge sample in the case without edge equalization. Edges are shifted earlier or later depending on the data pattern. The timing noise due to the edge ISI can be treated as a random variable, and the Fourier transform of its autocorrelation gives the PSD of the edge ISI. The PSD of the edge ISI is illustrated in Fig. 5(c). The relatively wide and flat spectrum of the PSD indicates that the noise due to edge ISI can be approximated by a white noise spectrum. Note that even though a data sequence can be run-length coded, the noise power from edge ISI can be at a lower frequency than the minimum frequency of the data pattern, since a pattern that leads to an early or late phase can repeat and produce edge-ISI PSD at low frequencies. The low-pass filter cutoff frequency of the CDR is often limited by the jitter tolerance specification, so the amount of filtering is limited. The PSD as a result of edge equalization is also shown in Fig. 5(c), showing the reduction in the magnitude of the noise. The middle curve in Fig. 5(c) shows the impact of transition filtering. An example of a digital transition filter that removes the [1 −1 1] sequence from impacting the CDR is shown in Fig. 5(b). The result shows that edge ISI can be reduced by several dB but the bandwidth of the CDR is halved, because some of the edges (i.e. the [1 −1 1] sequence, which accounts for 50% of the rising edges) are ignored by the CDR. Simulations show that an increase of jitter from 0.1 UIp-p to 0.15 UIp-p due to edge ISI can degrade the BER from to (assuming 18 dB SNR at the data samples).
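The statement that the PSD is the Fourier transform of the autocorrelation of the timing noise can be illustrated numerically. The following Python fragment is a minimal sketch under stated assumptions: the edge-shift model (each transition displaced by a weighted sum of the preceding bits), the weights, and all function names are hypothetical and are not taken from the measured channel or from Fig. 5.

```python
import math
import random


def edge_timing_errors(bits, weights):
    """Toy model (assumption): a transition is shifted by a weighted sum of
    the bits preceding it; bit slots without a transition contribute zero."""
    depth = len(weights)
    errors = []
    for n in range(depth + 1, len(bits)):
        if bits[n] == bits[n - 1]:
            errors.append(0.0)  # no transition, so no timing information
            continue
        history = sum(w * bits[n - 1 - k] for k, w in enumerate(weights, start=1))
        errors.append(-bits[n] * history)
    return errors


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate of the mean-removed sequence."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    n = len(centered)
    return [sum(centered[i] * centered[i + m] for i in range(n - m)) / n
            for m in range(max_lag)]


def psd_from_autocorrelation(r, num_freqs):
    """Fourier transform of the (even) autocorrelation at num_freqs points."""
    psd = []
    for k in range(num_freqs):
        f = k / num_freqs  # normalized frequency in cycles per bit
        psd.append(r[0] + 2.0 * sum(r[m] * math.cos(2.0 * math.pi * f * m)
                                    for m in range(1, len(r))))
    return psd


random.seed(1)
bits = [random.choice((-1, 1)) for _ in range(4000)]
errors = edge_timing_errors(bits, weights=[0.04, 0.02, 0.01])  # in UI, hypothetical
r = autocorrelation(errors, max_lag=32)
psd = psd_from_autocorrelation(r, num_freqs=64)
```

Averaging `psd` over the frequency grid returns `r[0]`, the total timing-noise power, which is a convenient sanity check. How flat the resulting spectrum is depends entirely on the assumed weights and data statistics.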

The second benefit of edge equalization can be obtained without any tradeoff in the eye height in the case of receiver equalization. The input to the CDR can be edge equalized while the signal to the data samplers is only equalized for the data samples. This approach, implemented in our receiver as the XDFE, is described in Section II-C.

B. Edge-Equalized FIR (XFIR) Implementation

The delay of an FIR can be implemented using an analog delay line or clocked samplers. An analog half-symbol-spaced FIR typically requires high power dissipation to finely control short analog delay [10]. In a sampled system, where the

the power dissipation can also be considerable because of the sharp slew rates and/or the large number of latching elements.

Our proposed XFIR is implemented at the transmitter using pre-emphasis. The proposed design is a full-rate architecture,

, and the latches, each of which is half of a flip-flop, are used as half-cycle delay elements, as shown in Fig. 6. The latch outputs have half-cycle overlaps with adjacent taps, as shown in Fig. 7. Tap coefficients are adjusted such that the sum of the overlapping tap settings is the desired FIR weight. Because of the overlapping pulses, any dependencies due to the previous tap are taken into account as a precursor ISI on the channel. Mathematically, given the pulse response of the channel, a unique set of coefficients can be found to exactly cancel the data ISI and edge ISI by solving a set of N equations, where N is the number of taps, and the matrix specifies the dependencies between tap coefficients. The delay chain is composed of CML latches, and a CML XOR is used to create programmable inversions. Compared to the implementation of , the proposed architecture dissipates substantially less power because of the lower clock frequency.

Fig. 6. Schematic of XFIR.

Fig. 7. Full-rate pulse response with half-cycle overlap.

The transmitter equalizer is a 10-tap FIR with 5 taps for data and 5 taps for edge. Seven-bit current DACs are used to program the tap weights of the first 4 taps and 5-bit DACs are used for the remaining taps. The designed output swing is 250 mV with double-sided termination.

The edge taps adjacent to the main data tap are equalized to A/2, resulting in a pulse target of [0 A/2 A A/2 0]. Note that one of the edge taps and one of the data taps address the samples prior to the main data sample (precursor taps). Because the implementation is at the transmitter, where the total output power is constrained by the voltage swing of the peak signal, the sum of the tap weights must not exceed the maximum swing. A well-known drawback is a reduction in the maximum eye height due to this constraint. Equivalently, for a given eye height, a higher voltage swing is needed, resulting in a power penalty.
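The set of N equations that fixes the tap coefficients for the pulse target [0 A/2 A A/2 0] can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used on the test chip: the half-UI-spaced pulse response values are hypothetical, only 5 taps are used instead of 10, each tap is modeled as an effective half-symbol-spaced weight (the overlap between adjacent latch outputs is not modeled), and the function names are illustrative.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: no unique tap solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        acc = aug[row][n] - sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = acc / aug[row][row]
    return solution


def convolve(a, b):
    """Full discrete convolution of two short sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def solve_xfir_taps(pulse, first_index, target):
    """Force equalized samples first_index .. first_index + N - 1 to target."""
    n = len(target)
    matrix = [[pulse[first_index + i - j] if 0 <= first_index + i - j < len(pulse) else 0.0
               for j in range(n)] for i in range(n)]
    return solve_linear_system(matrix, target)


# Hypothetical channel pulse response sampled every half UI (edge, data, edge, ...).
pulse = [0.0, 0.15, 1.0, 0.4, 0.16, 0.06, 0.02]
A = 0.6  # target main-sample amplitude, chosen arbitrarily
target = [0.0, A / 2, A, A / 2, 0.0]

taps = solve_xfir_taps(pulse, first_index=2, target=target)
equalized = convolve(pulse, taps)
peak_swing = sum(abs(c) for c in taps)  # compared against the available swing
```

Only the five constrained samples `equalized[2:7]` meet the target exactly; samples outside that window are residual ISI that a longer filter would have to address. The value of `peak_swing` is the quantity limited by the transmitter swing constraint discussed above.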

C. Edge-Equalized DFE (XDFE) Implementation

Several options are available for the DFE at the receiver. Conceptually, it is possible to implement a complete data+edge equalizer with a half-symbol-spaced filter. This type is labeled Type1 XDFE in Fig. 8(a). However, the eye-height reduction is similar to that of a transmitter XFIR. The alternative employed in our receiver is denoted Type2 XDFE in Fig. 8(b). This type consists of two separate filters for the data path (XdDFE) and the edge path (XeDFE). The XdDFE is the same as a conventional DFE, and the XeDFE equalizes only the edge samples, similar to an edge-only equalizer [5].

Fig. 9 shows the eye diagram of both the data and the edge path of the Type2 XDFE in comparison with the eye diagram from a Type1 XDFE. The Type2 XDFE does not degrade the voltage margin in the data path while cancelling the edge ISI at the edge path. The tradeoff between voltage margin and timing margin is decoupled. In comparison to a conventional DFE, a Type2 XDFE always results in a better BER because the voltage margin of the data sample is not changed while better information is provided to the CDR. Furthermore, a Type2 XDFE does not require half-symbol delay elements in the delay chain, so the hardware complexity is reduced. The Type2 XDFE only uses traditional full-rate flip-flops and a clock that is already generated for the CDR.

Since the XeDFE cleans up the CDR input, the net impact on the BER depends not only on the channel attenuation and the noise but also on the CDR bandwidth. It is interesting to note that the Type2 XDFE cannot tolerate extremely large random timing jitter from either the sampling clock or the data, because the technique does not reduce the curvature of the eye opening (marked as a dotted line in Fig. 9(b)) around the data sample. A flatter curvature improves the SNR of the data sample when subject to a non-ideal sampling clock. Therefore, depending on the specific channel or noise condition, it may be necessary to apply some partial edge equalization to the data path (using the XFIR) to reduce the timing ISI for an optimum tradeoff.

Fig. 8. Different architectures of XDFEs: (a) Type1, data + edge DFE; (b) Type2, separate data and edge DFEs.

Fig. 10 illustrates the circuit diagram of the Type2 XDFE. The design implements a 3-tap DFE and combines a look-ahead [11] and half-rate architecture [4]. The middle 2 paths (DataOut0, DataOut180) reflect the half-rate data outputs and the outside 2 paths (EdgeOut90, EdgeOut270) correspond to the half-rate edge outputs. The two adders and comparators in each path implement the look-ahead architecture. The decision on the left is used to select the look-ahead results of the right path and vice versa. The remaining taps, and , are fed back directly. The critical path starts from the output of , through the gm stage and adder of the edge path, and ends at the input of the slicer before . To relax the critical path to 1.5 bit times, is taken from the MUX output. A race condition does not occur because the slicer is a positive edge-triggered comparator.
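The look-ahead principle, in which two speculative decisions are computed for the first tap and a multiplexer selects between them once the previous bit is known, can be sketched behaviorally. The model below is an assumption-laden illustration: it is a sequential software model that ignores the half-rate interleaving and all circuit timing of Fig. 10, and the post-cursor values, tap weights, and function names are hypothetical.

```python
import random


def slicer(value):
    """Binary decision with a zero threshold."""
    return 1 if value >= 0.0 else -1


def dfe_direct(samples, taps):
    """Reference model: all three taps are fed back before slicing."""
    w1, w2, w3 = taps
    d1 = d2 = d3 = 1  # arbitrary start-up decision history
    decisions = []
    for x in samples:
        partial = x - w2 * d2 - w3 * d3
        d = slicer(partial - w1 * d1)
        decisions.append(d)
        d1, d2, d3 = d, d1, d2
    return decisions


def dfe_look_ahead(samples, taps):
    """First tap resolved by two speculative slicers and a selection MUX."""
    w1, w2, w3 = taps
    d1 = d2 = d3 = 1
    decisions = []
    for x in samples:
        partial = x - w2 * d2 - w3 * d3      # taps 2 and 3 fed back directly
        if_prev_high = slicer(partial - w1)  # speculates that d[n-1] = +1
        if_prev_low = slicer(partial + w1)   # speculates that d[n-1] = -1
        d = if_prev_high if d1 == 1 else if_prev_low  # MUX selects one result
        decisions.append(d)
        d1, d2, d3 = d, d1, d2
    return decisions


random.seed(3)
post_cursors = (0.4, 0.2, 0.1)  # hypothetical channel post-cursor ISI
bits = [random.choice((-1, 1)) for _ in range(2000)]
samples = []
for n, b in enumerate(bits):
    isi = sum(h * bits[n - k] for k, h in enumerate(post_cursors, start=1) if n - k >= 0)
    samples.append(b + isi + random.uniform(-0.1, 0.1))

direct = dfe_direct(samples, post_cursors)
look_ahead = dfe_look_ahead(samples, post_cursors)
```

Both models produce identical decisions; the benefit of the look-ahead form lies in timing, since the first-tap feedback no longer has to settle through the adder before the slicer fires.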

The feedback summation by the DFE must add the tap currents quickly, accurately, and linearly. The schematic of the adder is shown in Fig. 11. A large is used in the input differential pair (main tap) for linearity. Cascode current sources are used for accurate current control. PMOS devices are used as resistive loads. The first tap, , is assumed to be negative. The sign of the 2nd and 3rd taps, and , is set by switching the cascode bias through pass gates. Finally, an extra differential pair is added to compensate the combined offset of this adder and the subsequent comparator. To minimize the impact of parasitic capacitances, all differential pairs are matched and physically close together, and all programmable current sources are oriented for better matching.

In order to verify the functionality of the Type2 XDFE, an eye monitor is used to trace out the equalized eye diagram. The eye monitor is an auxiliary sampler in parallel with the receiving samplers whose phase and slicing voltage can be programmed and swept. The auxiliary sampler is an exact copy of the data sampler with a separate sampling clock that is generated from the CDR using an additional phase interpolator. The auxiliary sampler is also used to detect the signal amplitude during equalizer adaptation.

III. ADAPTATION ALGORITHM

The coefficients of the equalizers depend on the channel. It is inconvenient and sometimes impossible to characterize a channel before it is used. An adaptation algorithm is often used to determine the proper tap coefficients. Real-time adaptation of equalizer coefficients can further provide robustness to PVT and channel variations. While standard LMS adaptation techniques [12] are simple to implement, as the first part of this section illustrates, the standard algorithm does not guarantee convergence for an edge-equalized system because of the interaction of the CDR with the adaptation. A modified-LMS algorithm is introduced next to achieve near-optimal convergence. In this paper, only adaptation for the receiver with XDFE is demonstrated, but the techniques proposed here can be easily extended to the XFIR and most discrete-time equalizers with edge equalization. In the following sections, the coefficient labels are based on Fig. 8(b). Coefficients correspond to data taps and correspond to edge taps. To compare the performance of the proposed architecture to the conventional architectures, the XDFE can be configured to be equivalent to several conventional configurations, which have no edge equalization ability. Fig. 12 illustrates two conventional cases where (a) the input to the CDR is not equalized [4] (denoted NOEQ), and (b) the input is equalized with the subtraction from a conventional DFE prior to the CDR [13] (denoted DFE).

A. Impact of Edge Equalization on CDR

The nominal sampling phase position of a discrete-time equalizer impacts the eye opening at the sampling moment. Fig. 13 illustrates the impact of shifting the sampling phase position on the equalized eye height, where the tap coefficients are adapted at each phase position. This simulation assumes that there is no CDR to lock the phase position but the frequency of the sampling clock is exactly the same as that of the data signal. At each fixed sampling phase position, the equalizer is adapted to find the maximum eye height. The top row of eye diagrams in the figure shows the unequalized eye diagrams at various sampling positions. The second row shows the eye diagrams at the input to the data samplers after being adaptively equalized by the XdDFE. The third row shows the eye diagram at the input of the edge samplers after being adaptively equalized by the XeDFE. The arrow in the eye diagram of the second row marks the data sampling position.¹ There is an optimum sampling position that leads to the maximum data eye. The shallow maximum indicates that small deviations from the optimal sampling phase do not impair performance significantly, because the adaptive equalizer recalculates the tap coefficients. The simulation uses an 80-in FR4 channel with SMA connectors at both ends.

Fig. 9. Comparison of eye diagrams of Type1 and Type2 XDFE.

The locked CDR phase depends on the location of the median of the signal transitions, . Depending on the input to the CDR, the CDR locks to one of the sampling phases of Fig. 13. For instance, with the input to the CDR taken directly from the channel, the CDR sampling phase is the point corresponding to “NOEQ”. These lock points depend strongly on the channel characteristics. For the 80-in channel, the “NOEQ” point leads to a sampling point reasonably close to the optimum. However, the noisy phase input would result in larger jitter at the output of the CDR. If the CDR input is taken after the data DFE summation (the median transition time of the second row of eye diagrams), the CDR sampling phase is the point corresponding to “DFE”. Even though the phase input has less ISI, the “DFE” point is substantially shifted from the optimum. The shift is due to an interaction between the CDR and the adaptation. As the equalizer compensates the ISI, the median of the data transitions shifts, causing the CDR to change the sampling position.

¹The later sampling phases result in an eye that is not equalized to have no ISI at the data sampling point (arrow). The ISI is due to the precursor ISI, which is larger with the later phase position.

Adding a separate edge equalizer to the CDR input introduces an additional degree of freedom to adjust while maintaining an “ISI-free” CDR input. However, applying the standard blind-LMS adaptation algorithm to adapt the two equalizers may not converge because of the added degree of freedom. The third row of Fig. 13 illustrates how the CDR can potentially lock at any phase position depending on the XeDFE. At each phase position, without activating the CDR, the XeDFE can be adapted to minimize the edge ISI. With the proper coefficients for the XeDFE, we can initialize a CDR at any of the phase positions and it would remain locked at that phase position.

Fig. 14 illustrates the adaptation of the filter coefficients when applying a sign-sign LMS (SS-LMS) adaptation. The results are measured from a test chip. The divergence can be seen in the adapted coefficients, (amplitude), and (edge coefficient). The divergence depends on the initial conditions and the non-idealities in the receiving system. To address this non-convergence, the next section introduces a method to add a constraint to an LMS algorithm so that it converges to a near-optimal phase position.

Fig. 10. Block diagram of Type2 XDFE.

Fig. 11. Schematic of adder.

B. Adaptation Algorithm for Edge Equalization

To adapt , the amplitude is detected by a reference level, . The reference level also adapts for input-amplitude variations. The adaptation for the edge-tap coefficients is a second loop that uses the outputs from the edge and data samples in parallel with the adaptation of the data sampling.
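The data-path loop, in which a reference level tracks the signal amplitude while the data taps are adapted, can be illustrated with a conventional sign-sign LMS update. The sketch below is a generic textbook form rather than the loop implemented on the test chip; the channel values, step size, initial level, and function names are assumptions, and the error sign is taken only on +1 decisions to mimic a single reference sampler placed at the positive level.

```python
import random


def sign(value):
    return 1 if value >= 0.0 else -1


def adapt_dfe_ss_lms(received, num_taps, mu, level_init):
    """Sign-sign LMS for DFE data taps plus an amplitude reference level."""
    taps = [0.0] * num_taps
    level = level_init
    history = [0] * num_taps  # history[k - 1] holds the decision d[n - k]
    for x in received:
        y = x - sum(w * d for w, d in zip(taps, history))
        d = sign(y)
        if d == 1:
            # One reference sampler at +level: error sign known for +1 bits only.
            err = sign(y - level)
            level += mu * err
            taps = [w + mu * err * past for w, past in zip(taps, history)]
        history = [d] + history[:-1]
    return taps, level


random.seed(7)
channel = [1.0, 0.30, 0.15, 0.05]  # hypothetical main cursor and post-cursors
bits = [random.choice((-1, 1)) for _ in range(40000)]
received = []
for n in range(len(bits)):
    isi = sum(h * bits[n - k] for k, h in enumerate(channel) if n - k >= 0)
    received.append(isi + random.uniform(-0.02, 0.02))

taps, level = adapt_dfe_ss_lms(received, num_taps=3, mu=0.001, level_init=0.5)
```

In this toy setting the taps settle near the assumed post-cursors and the level settles near the main-cursor amplitude, with a residual dither on the order of the step size `mu`. The sketch covers the data loop only and does not model the CDR interaction that motivates the modified algorithm.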

Fig. 12. Possible alternate configurations of receiver architecture: (a) no equalization applied to the CDR input, and (b) data equalized signal as input to CDR.

Fig. 13. Eye height as a function of the sampling phase. The first row below shows the eye diagrams before the signal has been adaptively equalized. The second row is the adaptively equalized input to the slicer for the data path (XdDFE). The third row is the adaptively equalized input to the slicer for the edge path (XeDFE).

In order to converge, an additional constraint is imposed on a blind-LMS algorithm. There are a number of possible approaches to add a constraint. In our approach, we choose one of the data-equalized transition edge times (XdDFE) as a reference transition edge. An equalized transition from the data DFE summation is chosen rather than an unequalized transition because the equalized one guarantees an open eye. For instance, the or [1 1 1 1] data transition is chosen as the reference. This choice of the reference edge is discussed in the next paragraph. The CDR lock position is determined by the reference edge crossing the slicing voltage of the edge sampler. In our implementation, the edge sampler has a slicing voltage of 0. The adaptation algorithm adjusts the edge equalizer's coefficients, , to adapt all edges toward the reference edge. Fig. 15 illustrates the concept. The CDR first places the sampling phase by averaging the transition information, as shown in Fig. 15(a), where edge “1” is the reference edge and is assumed to cross at . Since the reference edge is fixed, only the remaining edges (“2–4”) converge toward the position of the sampling phase. Because of the CDR, the sampling phase moves toward the average of the edges, hence favoring the fixed position of edge “1”. Fig. 15(b) shows the new position of the sampling phase after one adaptation iteration (with the old position shown as a gray arrow). The process repeats [Fig. 15(c)] and eventually pushes edges “2–4” to coincide with edge “1”, as shown in Fig. 15(d).
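The convergence mechanism of Fig. 15 can be mimicked with a toy model in which the edge positions themselves are the adapted quantities. This is only a conceptual sketch: in the receiver the edges move indirectly through the edge-tap coefficients, whereas here each non-reference edge is stepped directly toward the sampling phase, and the CDR is idealized as settling at the average edge position. The initial edge positions, step size, and function name are hypothetical.

```python
def adapt_edges(edges, reference_index=0, step=0.002, iterations=500):
    """Toy model of Fig. 15: non-reference edges chase the CDR phase."""
    edges = list(edges)
    for _ in range(iterations):
        phase = sum(edges) / len(edges)  # idealized CDR: average of all edges
        for i in range(len(edges)):
            if i == reference_index:
                continue  # the reference edge is never moved by the adaptation
            if edges[i] < phase:
                edges[i] += step
            elif edges[i] > phase:
                edges[i] -= step
    return edges, sum(edges) / len(edges)


# Edge crossing times in UI relative to an arbitrary origin; edge "1" is first.
initial_edges = [0.03, -0.10, 0.05, 0.12]
final_edges, final_phase = adapt_edges(initial_edges)
spread = max(abs(e - final_edges[0]) for e in final_edges)
```

Because the reference edge never moves while the other edges keep chasing an average that includes it, the only stable outcome is for all edges, and therefore the sampling phase, to collapse onto the reference position, up to a dither of a few steps.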

The choice of the reference-edge position depends on the location of the optimum phase, which is not 90° after the data DFE is applied. Fig. 16 shows a simulation of 4 different edge transitions from different data patterns at the input of the data sampler (after the DFE subtraction). The x-axis is the sampling phase from −0.5 UI to 0.5 UI, where 0 UI refers to the optimum phase obtained from Fig. 13. The corresponding eye height if used as the CDR lock point is plotted in Fig. 13. The

is found to be the best reference edge for a number of channels, including FR4, RG55, IEEE 1394 cable, and a multi-drop channel for a memory bus, because is typically the closest to the optimum phase.

Other methods of introducing an additional constraint are possible. For instance, instead of choosing a reference edge, the slicing voltage can be offset to move the lock point to the optimal position for a given reference edge. Also, instead of a reference edge, a reference sequence of or [1 1] can also be used, and the target slicing voltage of the edge sample can be adjusted to the reference level of the target signal amplitude.

C. Implementation

The adaptation algorithm is applied to the half-rate look-ahead DFE receiver shown in Fig. 10. Because the coefficients for the edge equalizer are expected to be similar to those of the data equalizer, we set the XeDFE coefficients to where are the coefficients for the XdDFE.

Fig. 14. Illustration of the divergence of equalizer coefficients when applying the LMS algorithm to the edge equalizer.

Fig. 15. Illustration of adaptation on edge equalization.

Fig. 16. Simulated phase position of different edges. The “0” position indicates the best reference edge position.

Because of the half-rate and look-ahead architecture, the signal eye diagrams differ from Fig. 9. With the more relaxed timing, the feedback can occur during the non-sampled half-cycle of each sampler (i.e. during the Odd half-cycle of the Even sampler). Fig. 17(a) shows the input to the Even data samplers and Fig. 17(b) shows the input to the Even edge samplers. The eyes are not partial-response eyes [14] because they are traced based on the digital data after the look-ahead multiplexer. Half of the eye diagram does not contain useful information because of the injected decision feedback signal. The shaded region in the figure indicates the region that does not contain data or phase information.

The adaptation algorithm constrains the possible values of so that the reference edge does not cause any change to the coefficients. To determine the constraint, the voltage at the input of the edge sampler, , at the edge sample time can be expressed as


Fig. 17. Data and edge eye diagrams of the even path of half-rate architecture.

Fig. 18. Die photo of transmitter with XFIR in 0.13 µm CMOS (left), and receiver with XDFE in 0.18 µm CMOS (right).

The reference edge does not change with when the second bracket equals zero for the input sequence of or [1 1 1]. As a result, are constrained so that . Therefore, are adapted normally as in a simple LMS algorithm, but is always calculated from and .

Some additional circuits are needed for the adaptation. A sampler is needed to detect the reference level, A, for adapting the amplitude of the data samples. In our implementation, the eye monitor, which is adjusted to use the same clock edge as the data path and a slicing voltage of A, is reused for the reference level adaptation. For edge adaptation, we use the existing edge samplers for the Odd and Even data paths, which have zero slicing levels and sample the data 1/2 UI from the data samples. The adaptation computation is performed off-chip. A 256-bit SRAM is used for each path to store the sampled values. By processing the data and edge samples downloaded from the SRAM, early and late information can be determined for each edge, and the timing information is used for adaptation.
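One common way to derive early and late information from stored data and edge samples is the bang-bang (Alexander-style) comparison sketched below. This is an assumed illustration of the off-chip processing, not the authors' code: the sample ordering, the sign convention (an edge sample that still equals the earlier data bit is counted as an early clock), the three-bit pattern key, and the function names are all hypothetical.

```python
from collections import Counter


def early_late_votes(data, edges):
    """edges[n] is the zero-threshold sample taken between data[n] and data[n + 1].

    Returns -1 (clock early), +1 (clock late), or 0 (no transition) per bit pair.
    """
    votes = []
    for n in range(len(data) - 1):
        before, after, edge = data[n], data[n + 1], edges[n]
        if before == after:
            votes.append(0)   # no transition, so no timing information
        elif edge == before:
            votes.append(-1)  # edge sample taken before the crossing: early
        else:
            votes.append(+1)  # edge sample taken after the crossing: late
    return votes


def tally_by_pattern(data, votes):
    """Accumulate votes per edge type, keyed by the bits around the transition."""
    tally = Counter()
    for n in range(1, len(data) - 1):
        if votes[n] != 0:
            tally[(data[n - 1], data[n], data[n + 1])] += votes[n]
    return tally


data = [1, 1, -1, -1, 1, -1]   # stored data decisions (hypothetical)
edges = [1, 1, -1, 1, 1]       # stored edge decisions (hypothetical)
votes = early_late_votes(data, edges)
tally = tally_by_pattern(data, votes)
```

Grouping the votes by the surrounding data pattern gives a separate early/late balance for each edge type, which is the kind of per-edge timing information the adaptation needs in order to move individual edges.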

IV. MEASUREMENT RESULTS

The XFIR transmitter is fabricated in a 0.13-µm CMOS technology. The receiver with the XDFE and adaptation circuits is fabricated in a 0.18-µm CMOS technology. The die photos are shown in Fig. 18(a) and (b) respectively. The latch-based transmitter occupies m m and the XDFE core occupies m m. In the measurement setup, several channels are implemented on our test board: a 40-in, an 80-in, and a 120-in FR4 channel with 8.5 dB, 16.5 dB, and 24 dB of attenuation respectively at roughly 2 GHz.

A. XFIR Measurements

The transmitter is implemented using full-rate current-mode logic (CML). The test chip consumes a total of 137 mW and includes the output driver (13 mW), 10 pre-drivers (35 mW), a 10-tap delay chain with XOR for bit inversion (5.2 mW), clock generation (10.5 mW), a PRBS generator, and other test I/O circuits. Fig. 19 shows the eyes from an FIR and an XFIR through the 120-in channel. At 3.8 Gb/s, the XFIR is able to reduce the timing ISI from 47 ps to 37 ps. The penalty of the XFIR is a voltage reduction from 19 mV to 17 mV.
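The relative size of these measured changes follows from simple arithmetic, worked through below using only the figures quoted in this subsection. The variable names are illustrative, and the remainder of the power budget is merely the difference between the quoted total and the itemized blocks.

```python
bit_rate = 3.8e9                # b/s, data rate of the XFIR measurement
ui_ps = 1e12 / bit_rate         # unit interval in picoseconds

timing_isi_fir_ps, timing_isi_xfir_ps = 47.0, 37.0
voltage_fir_mv, voltage_xfir_mv = 19.0, 17.0

# Relative improvement in timing ISI and relative voltage penalty.
timing_isi_reduction = (timing_isi_fir_ps - timing_isi_xfir_ps) / timing_isi_fir_ps
voltage_penalty = (voltage_fir_mv - voltage_xfir_mv) / voltage_fir_mv

# Timing ISI expressed as a fraction of the unit interval.
timing_isi_fir_ui = timing_isi_fir_ps / ui_ps
timing_isi_xfir_ui = timing_isi_xfir_ps / ui_ps

# Itemized transmitter power; the rest is the PRBS generator and test I/O.
itemized_mw = {
    "output driver": 13.0,
    "pre-drivers": 35.0,
    "delay chain with XOR": 5.2,
    "clock generation": 10.5,
}
itemized_total_mw = sum(itemized_mw.values())
remainder_mw = 137.0 - itemized_total_mw
```

The timing-ISI reduction works out to roughly 21% and the voltage penalty to roughly 10.5%, with a unit interval of about 263 ps at 3.8 Gb/s. The itemized blocks account for 63.7 mW of the 137 mW total.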

The BER is measured with the transmitter configured as an XFIR and as an FIR to fairly compare the two architectural options. The receiver used for the BER measurement has a sensitivity of 2 mVrms, which corresponds to 6% of the input signal. Error-free operation was found in the nominal operation, where CDR and adaptation are running. Both timing and voltage noise are injected into the signal. Voltage noise is increased by reducing the launch amplitude of the transmitter. Timing noise is injected by using the on-chip PLL bypass mode and adding a high-frequency phase modulation on the source clock. The BER as a result of differing amounts of noise is shown in Fig. 20. The black line shows the boundary where the XFIR outperforms the FIR. As discussed previously, the XFIR has an advantage in timing-noise-dominated systems.

Fig. 19. Measured eye diagram of FIR (left) and XFIR (right).

Fig. 20. Measured BER of XFIR at 3.8 Gb/s.

B. XDFE Measurements

The XDFE receiver with the eye monitor consumes 14.87 mW at 3.6 Gb/s. The added circuits for edge equalization in the XDFE consume about 3 mW. The XDFE is able to receive a 3.6-Gb/s binary data stream launched with 500 mV differential swing through a 40-in and an 80-in FR4 channel without error. The eye diagrams as traced at the input to the data sampler by the eye monitor are shown in Fig. 21(a), (b), and (c). To highlight the impact of the XDFE, Fig. 21(d) plots the input to the edge sampler with to simulate a normal DFE. The result is 30% larger than the result when are properly set by the XDFE adaptation. The recovered clock jitter is also an indication of the XDFE performance. Measurement shows a reduction from 31.1 ps to 28.9 ps-p2p.

Fig. 21. Measured eye diagram of XDFE. (a) 40-in, no DFE; (b) 40-in, DFE; (c) 80-in, DFE; (d) 40-in, DFE (no XDFE); (e) 40-in, XDFE.

As discussed previously, the advantage of the XDFE is that it reduces the impact of edge ISI without requiring a lower CDR bandwidth. The technique can be applied in conjunction with better CDR filtering if the required loop bandwidth is set by a communication standard. Digital edge filtering [3] can additionally be employed, as shown previously in Fig. 5(b). In this case, the edge filtering can remove signal transitions that have poor slew rates and therefore worse jitter. Jitter measurement with digital edge filtering applied shows a jitter of 27 ps-p2p before XDFE and a jitter of 25.5 ps-p2p after XDFE.

C. Adaptation Algorithm Measurements

The adaptation algorithm is verified by using stored edge and data samples and processing the information off-chip. Fig. 22 shows the behavior of the coefficients as a function of time (number of iterations). The coefficients are shown to converge.

The edge equalizer pushes all edges to the reference edge at the end of the adaptation. The on-chip eye monitor is used to trace the position of four different edges. The results are shown in Fig. 23. In the plot, the CDR phase is fixed at the optimal phase position (middle of the plot). The edge from the data sequence (or [1 1 1 1]) is closest to the optimum phase and corresponds to edge “1” in Fig. 16. Fig. 23 also shows the resulting edge positions after edge equalization.

Fig. 22. Measured locking behavior of adaptation algorithm for data taps (left) and edge taps (right) across a 120-in FR4 channel.

Fig. 23. Measured transition edges monitored at optimum phase with data DFE tap coefficients on edge equalizer (a) before adaptation, (b) after adaptation.

Figs. 24 and 25 plot the measured eye height versus the CDR lock position for the 80-in and 120-in channels. The lock position is swept by applying a phase offset to the CDR output with a phase interpolator. The sampling phases of “NOEQ” and “DFE” are measured by using the proper configuration of our receiver. The point “XDFE” reflects the CDR sampling phase after applying the modified adaptation algorithm. As expected, the “XDFE” sampling point is close to the optimum. Interestingly, for the 80-in FR4 channel, the “NOEQ” sampling point has a voltage opening similar to that of the “XDFE” sampling point. However, taking the input directly from the channel has worse BER performance because much larger noise is input into the CDR. Measurements indicate that the “XDFE” can tolerate a 10% voltage reduction at the receiver as compared to the “NOEQ” design while still achieving a minimum . The signal amplitude at the receiver is 71 mV.
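The dependence of eye height on sampling phase can be reproduced qualitatively in a few lines. The sketch below is a simulation under stated assumptions, not the measured data of Figs. 24 and 25: the channel is a hypothetical single-pole low-pass with time constant `TAU`, signaling is ±1, and the eye height is taken as the worst-case opening, twice the cursor minus the sum of the ISI magnitudes.

```python
import math

UI = 1.0           # unit interval (normalized)
TAU = 0.5 * UI     # hypothetical channel time constant, not from the paper


def pulse_response(t: float) -> float:
    """Response of a single-pole low-pass channel to a 1-UI-wide unit pulse."""
    if t <= 0.0:
        return 0.0
    if t <= UI:
        return 1.0 - math.exp(-t / TAU)
    return (1.0 - math.exp(-UI / TAU)) * math.exp(-(t - UI) / TAU)


def eye_height(phase: float, span: int = 10) -> float:
    """Worst-case eye height when sampling at `phase` after the pulse start."""
    cursor = pulse_response(phase)
    isi = sum(abs(pulse_response(phase + k * UI))
              for k in range(-span, span + 1) if k != 0)
    return 2.0 * (cursor - isi)


def sweep(steps_per_ui: int = 32):
    """Eye height at evenly spaced sampling phases across two UIs."""
    phases = [i * UI / steps_per_ui for i in range(1, 2 * steps_per_ui)]
    return [(p, eye_height(p)) for p in phases]


best_phase, best_height = max(sweep(), key=lambda pair: pair[1])

if __name__ == "__main__":
    print(f"best sampling phase: {best_phase:.3f} UI, eye height: {best_height:.3f}")
```

For this idealized channel the opening peaks at the end of the transmitted pulse and falls off on either side, which is the kind of curve against which the “NOEQ”, “DFE”, and “XDFE” lock positions are compared.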

With the 120-in FR4 channel, the “NOEQ” configuration does not lock due to the ISI. Using the output of a conventional DFE also fails to converge during adaptation. Only a DFE with an edge equalizer using the modified adaptation algorithm converges to nearly the optimal sampling point, and the CDR remains in lock.

V. CONCLUSION

Conventional discrete-time equalization does not compensate for ISI at the signal transitions, hence degrading the BER. Edge equalization is proposed to reduce edge ISI. A lower edge ISI results in a wider eye opening and a cleaner CDR recovered clock. The proposed edge equalizers include an XFIR, which compensates both data and transition samples, and an XDFE, which equalizes the edge samples used by the CDR.


Fig. 24. Measured eye height (solid) versus sampling phase for 80-in FR4. Simulated result (dashed, above) does not include 28 mV of noise floor.

Fig. 25. Measured eye height versus sampling phase for 120-in FR4.

As shown in the measured results, the XFIR trades off timing margin with voltage margin. For systems with large timing noise, an XFIR shows better BER. The XDFE does not suffer the same tradeoff. The timing margin is improved by improving the edge ISI at the CDR input without any degradation to the voltage margin. Our results show 30% improvement in the edge ISI at the CDR input.

A modification to the LMS adaptation algorithm is presented to adapt the edge equalizer’s coefficients. The modification adds a constraint to the algorithm. The method implemented uses a fixed reference edge that is not adapted. The reference point allows the edge equalizer to converge. As shown in the analysis, proper choice of the constraint can lead to a near-optimal CDR locking phase.
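The idea of leaving one reference edge out of the adaptation can be sketched with a toy sign-sign LMS loop. This is an illustration under strong assumptions, not the implemented algorithm: the CDR loop is not modeled, the edge ISI is a hypothetical two-tap linear function of the preceding bits (`h`), and the constraint is represented simply by skipping the update whenever those bits match a chosen `reference` pattern.

```python
import random


def sign(x: float) -> int:
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def adapt_edge_taps(h, reference, mu=0.005, steps=4000, seed=1):
    """Sign-sign LMS on a toy edge-ISI model with one pattern left unadapted.

    h         -- hypothetical edge-ISI coefficients of the preceding bits
    reference -- bit pattern (tuple of +/-1) whose edge is never adapted
    Returns the adapted edge taps and the number of skipped reference edges.
    """
    rng = random.Random(seed)
    w = [0.0] * len(h)
    skipped = 0
    for _ in range(steps):
        d = tuple(rng.choice((-1, 1)) for _ in h)   # bits preceding the edge
        if d == reference:
            skipped += 1        # reference edge: observed but not adapted
            continue
        # Residual edge ISI after subtracting the edge-equalizer correction.
        e = sum((hk - wk) * dk for hk, wk, dk in zip(h, w, d))
        s = sign(e)
        # Sign-sign update: step each tap by mu along sign(error) * sign(data).
        w = [wk + mu * s * dk for wk, dk in zip(w, d)]
    return w, skipped


if __name__ == "__main__":
    taps, skipped = adapt_edge_taps(h=[0.20, -0.10], reference=(1, 1))
    print("adapted edge taps:", [round(t, 3) for t in taps])
    print("reference edges skipped:", skipped)
```

In this toy setting the remaining patterns still carry enough information for the taps to settle within a few step sizes of `h`. In the actual receiver the reference edge additionally anchors the CDR locking phase, which this sketch does not capture.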

REFERENCES

[1] S. Gondi et al., “A 10 Gb/s CMOS adaptive equalizer for backplane applications,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2005, pp. 328–329.

[2] S. Reynolds et al., “A 7-tap transverse analog-FIR filter in 0.13 µm CMOS for equalization of 10 Gb/s fiber-optic data systems,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2005, pp. 330–331.

[3] K. Yamaguchi et al., “12 Gb/s duobinary signaling with ×2 oversampled edge equalization,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2005, pp. 70–71.

[4] T. Beukema et al., “A 6.4-Gb/s CMOS SerDes core with feed-forward and decision-feedback equalization,” IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2633–2645, Dec. 2005.

[5] B. Brunn, “Edge equalization NRZ,” Jul. 2004 [Online]. Available: http://www.ieee802.org/3/ap/public/jul04/brunn_01_0704.pdf

[6] J. Buckwalter and A. Hajimiri, “A 10 Gb/s data-dependent jitter equalizer,” in Proc. IEEE Custom Integrated Circuits Conf. (CICC), 2004, pp. 39–42.

[7] J. Buckwalter, M. Meghelli, D. Friedman, and A. Hajimiri, “Phase and amplitude pre-emphasis techniques for low-power serial links,” IEEE J. Solid-State Circuits, pp. 1391–1399, Jun. 2006.

[8] K.-L. J. Wong, H. Hatamkhani, M. Mansuri, and C.-K. K. Yang, “A 27-mW 3.6-Gb/s I/O transceiver,” IEEE J. Solid-State Circuits, vol. 39, no. 4, pp. 602–612, Apr. 2004.

[9] A. Ho et al., “Common-mode backchannel signaling system for differential high-speed links,” in Symp. VLSI Circuits Dig. Tech. Papers, 2004, pp. 352–355.

[10] H. Johansson et al., “Reconstruction of nonuniform sampled bandlimited signals using digital fractional filters,” in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), 2001, pp. 593–596.

[11] R. S. Kajley, P. J. Hurst, and J. E. C. Brown, “A mixed-signal decision-feedback equalizer that uses a look-ahead architecture,” IEEE J. Solid-State Circuits, vol. SSC-32, no. 3, pp. 450–459, Mar. 1987.

[12] B. Widrow, J. M. McCool, M. G. Larimore, and C. R. Johnson, Jr., “Stationary and nonstationary learning characteristics of the LMS adaptive filter,” Proc. IEEE, vol. 64, no. 8, pp. 1151–1162, Aug. 1976.

[13] V. Stojanovic et al., “Autonomous dual-mode (PAM2/4) serial link transceiver with adaptive equalization and data recovery,” IEEE J. Solid-State Circuits, vol. 40, no. 4, pp. 1012–1026, Apr. 2005.

[14] B. Leibowitz et al., “A 7.5 Gb/s 10-tap DFE receiver with first tap partial response, spectrally gated adaptation, and 2nd-order data-filtered CDR,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2007, pp. 228–599.

Koon-Lun Jackie Wong was born in Hong Kong. He received the B.S., M.S., and Ph.D. degrees in electrical engineering from the University of California at Los Angeles in 1999, 2001, and 2007, respectively.

In summer 2002, he was with National Semiconductor Corp. working on clock and data recovery for OC-3 application. He also designed high-speed frequency dividers and low power channel equalizers at IBM in summer 2003 and summer 2004. He joined Broadcom Corp. as a central engineer in 2006, where he worked on power-efficient circuits and phase pre-distortion circuits for serial communications.

E-Hung Chen was born in Taipei, Taiwan, R.O.C. He received the B.S. and M.S. degrees in electrical engineering from National Taiwan University and the University of California at Los Angeles (UCLA), respectively. He is currently working toward the Ph.D. degree at UCLA.

He was an intern at Broadcom Corporation in summer 2005. In summer 2006 and 2007, he was with Rambus Inc. working on channel equalization techniques and receiver modeling.

Chih-Kong Ken Yang (S’94–M’98–SM’07) was born in Taipei, Taiwan, R.O.C. He received the B.S. and M.S. degrees in 1992 and the Ph.D. degree in 1998 from Stanford University, Stanford, CA, all in electrical engineering.

He has been with the University of California at Los Angeles as an Assistant Professor and Associate Professor in 1999 and 2004, respectively. His current research area is high-performance mixed-mode circuit design for VLSI systems such as clock generation, high-performance signaling, low-power digital design, and analog-to-digital conversion. He is the recipient of the IBM Faculty Development Fellowship from 2003 to 2005 and the 2003 Northrop Grumman Outstanding Teaching Award.
