11
1368 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 12, DECEMBER 2006 Sequential Element Design With Built-In Soft Error Resilience Ming Zhang, Member, IEEE, Subhasish Mitra, Senior Member, IEEE, T. M. Mak, Senior Member, IEEE, Norbert Seifert, Senior Member, IEEE, Nicholas J. Wang, Quan Shi, Kee Sup Kim, Member, IEEE, Naresh R. Shanbhag, Fellow, IEEE, and Sanjay J. Patel, Member, IEEE Abstract—This paper presents a built-in soft error resilience (BISER) technique for correcting radiation-induced soft errors in latches and flip-flops. The presented error-correcting latch and flip-flop designs are power efficient, introduce minimal speed penalty, and employ reuse of on-chip scan design-for-testability and design-for-debug resources to minimize area overheads. Cir- cuit simulations using a sub-90-nm technology show that the pre- sented designs achieve more than a 20-fold reduction in cell-level soft error rate (SER). Fault injection experiments conducted on a microprocessor model further demonstrate that chip-level SER improvement is tunable by selective placement of the presented error-correcting designs. When coupled with error correction code to protect in-pipeline memories, the BISER flip-flop design improves chip-level SER by 10 times over an unprotected pipeline with the flip-flops contributing an extra 7–10.5% in power. When only soft errors in flips-flops are considered, the BISER technique improves chip-level SER by 10 times with an increased power of 10.3%. The error correction mechanism is configurable (i.e., can be turned on or off) which enables the use of the presented techniques for designs that can target multiple applications with a wide range of reliability requirements. Index Terms—Circuit simulation, error correction, fault injec- tion, sequential element design, soft error rate (SER). I. INTRODUCTION S OFT errors are radiation-induced transient errors caused by neutrons from cosmic rays and alpha particles from pack- aging materials [1]. Soft error protection is very important for enterprise computing and communication applications since the system-level soft error rate (SER) has been rising with tech- nology scaling and increasing system complexity. Several de- signs today implement extensive error detection and correction (ECC) mainly for on-chip SRAMs and register-files. However, memory protection is not enough for designs manufactured in advanced technologies because soft errors in flip-flops, latches, and combinational logic, also referred to as logic soft errors, are significant contributors to the system-level SER [2], [3]. Logic soft errors pose a major challenge to robust enterprise com- puting and networking platform designs. For many designs tar- Manuscript received August 30, 2005. This work was supported in part by MARCO Gigascale Systems Research Center (GSRC). M. Zhang, T. M. Mak, N. Seifert, Q. Shi, and K. S. Kim are with Intel Cor- poration, Folsom, CA 95630 USA (e-mail: [email protected]). S. Mitra is with the Department of Electrical Engineering, Stanford Univer- sity, Stanford, CA 94305 USA. N. J. Wang, N. R. Shanbhag, and S. J. Patel are with the Coordinated Sci- ence Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA. Digital Object Identifier 10.1109/TVLSI.2006.887832 Fig. 1. Circuit schematic of a representative single-port D-latch. geting such platforms, not only the memory elements but also the latches and flip-flops, must be protected from soft errors in order to satisfy the system data integrity requirements. This work presents a new family of sequential element de- signs with built-in soft error resilience (BISER) that demon- strates four major contributions: 1) error correction technique that achieves a more than 20-fold reduction in the cell-level SER of latches and flip-flops; 2) reuse paradigm that helps lower the power and area penal- ties of the error correction technique; 3) set of power saving techniques including an economy op- eration mode; 4) set of fault injection results to illustrate the chip-level ef- fectiveness and power penalty of the BISER technique. The rest of this paper is organized as follows. In Section II, we describe the principle of the new error-correcting designs. Section III describes the reuse paradigm that further reduces the overhead of the presented technique. Section IV presents the key circuit design considerations and the cell-level characteri- zation results. Section V presents the chip-level SER, perfor- mance, power, and area impact of the BISER technique based on fault injection results. Other BISER design variations are briefly described in Section VI. Finally, related work and conclusions are described in Sections VII and VIII, respectively. II. ERROR CORRECTION IN SEQUENTIAL ELEMENTS A master–slave flip-flop, as implemented in several cycle- based microprocessor designs, is composed of a master latch followed by a slave latch. A representative single-port latch is shown in Fig. 1. The inverter IN1 is used to generate comple- mentary clock signals locally; the transmission gate TG1 allows 1063-8210/$20.00 © 2006 IEEE Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

4. Sequential Element Design With Built-In Soft Error Resilience

Embed Size (px)

Citation preview

Page 1: 4. Sequential Element Design With Built-In Soft Error Resilience

1368 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 12, DECEMBER 2006

Sequential Element Design WithBuilt-In Soft Error Resilience

Ming Zhang, Member, IEEE, Subhasish Mitra, Senior Member, IEEE, T. M. Mak, Senior Member, IEEE,Norbert Seifert, Senior Member, IEEE, Nicholas J. Wang, Quan Shi, Kee Sup Kim, Member, IEEE,

Naresh R. Shanbhag, Fellow, IEEE, and Sanjay J. Patel, Member, IEEE

Abstract—This paper presents a built-in soft error resilience(BISER) technique for correcting radiation-induced soft errorsin latches and flip-flops. The presented error-correcting latchand flip-flop designs are power efficient, introduce minimal speedpenalty, and employ reuse of on-chip scan design-for-testabilityand design-for-debug resources to minimize area overheads. Cir-cuit simulations using a sub-90-nm technology show that the pre-sented designs achieve more than a 20-fold reduction in cell-levelsoft error rate (SER). Fault injection experiments conducted ona microprocessor model further demonstrate that chip-level SERimprovement is tunable by selective placement of the presentederror-correcting designs. When coupled with error correctioncode to protect in-pipeline memories, the BISER flip-flop designimproves chip-level SER by 10 times over an unprotected pipelinewith the flip-flops contributing an extra 7–10.5% in power. Whenonly soft errors in flips-flops are considered, the BISER techniqueimproves chip-level SER by 10 times with an increased powerof 10.3%. The error correction mechanism is configurable (i.e.,can be turned on or off) which enables the use of the presentedtechniques for designs that can target multiple applications with awide range of reliability requirements.

Index Terms—Circuit simulation, error correction, fault injec-tion, sequential element design, soft error rate (SER).

I. INTRODUCTION

SOFT errors are radiation-induced transient errors caused byneutrons from cosmic rays and alpha particles from pack-

aging materials [1]. Soft error protection is very important forenterprise computing and communication applications since thesystem-level soft error rate (SER) has been rising with tech-nology scaling and increasing system complexity. Several de-signs today implement extensive error detection and correction(ECC) mainly for on-chip SRAMs and register-files. However,memory protection is not enough for designs manufactured inadvanced technologies because soft errors in flip-flops, latches,and combinational logic, also referred to as logic soft errors, aresignificant contributors to the system-level SER [2], [3]. Logicsoft errors pose a major challenge to robust enterprise com-puting and networking platform designs. For many designs tar-

Manuscript received August 30, 2005. This work was supported in part byMARCO Gigascale Systems Research Center (GSRC).

M. Zhang, T. M. Mak, N. Seifert, Q. Shi, and K. S. Kim are with Intel Cor-poration, Folsom, CA 95630 USA (e-mail: [email protected]).

S. Mitra is with the Department of Electrical Engineering, Stanford Univer-sity, Stanford, CA 94305 USA.

N. J. Wang, N. R. Shanbhag, and S. J. Patel are with the Coordinated Sci-ence Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801USA.

Digital Object Identifier 10.1109/TVLSI.2006.887832

Fig. 1. Circuit schematic of a representative single-port D-latch.

geting such platforms, not only the memory elements but alsothe latches and flip-flops, must be protected from soft errors inorder to satisfy the system data integrity requirements.

This work presents a new family of sequential element de-signs with built-in soft error resilience (BISER) that demon-strates four major contributions:

1) error correction technique that achieves a more than20-fold reduction in the cell-level SER of latches andflip-flops;

2) reuse paradigm that helps lower the power and area penal-ties of the error correction technique;

3) set of power saving techniques including an economy op-eration mode;

4) set of fault injection results to illustrate the chip-level ef-fectiveness and power penalty of the BISER technique.

The rest of this paper is organized as follows. In Section II,we describe the principle of the new error-correcting designs.Section III describes the reuse paradigm that further reduces theoverhead of the presented technique. Section IV presents thekey circuit design considerations and the cell-level characteri-zation results. Section V presents the chip-level SER, perfor-mance, power, and area impact of the BISER technique based onfault injection results. Other BISER design variations are brieflydescribed in Section VI. Finally, related work and conclusionsare described in Sections VII and VIII, respectively.

II. ERROR CORRECTION IN SEQUENTIAL ELEMENTS

A master–slave flip-flop, as implemented in several cycle-based microprocessor designs, is composed of a master latchfollowed by a slave latch. A representative single-port latch isshown in Fig. 1. The inverter IN1 is used to generate comple-mentary clock signals locally; the transmission gate TG1 allows

1063-8210/$20.00 © 2006 IEEE

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 2: 4. Sequential Element Design With Built-In Soft Error Resilience

ZHANG et al.: SEQUENTIAL ELEMENT DESIGN WITH BISER 1369

Fig. 2. Block diagram of an error-correcting flip-flop design.

TABLE ITRUTH TABLE OF THE C-ELEMENT

data at the D pin to flow through when the clock signal CLK ishigh; the inverter IN3 and tristate inverter TI1 form a regenera-tive loop to store the sampled data when CLK is low.

The key to BISER is a new flip-flop design, composed of twoflip-flops joined with a C-element as shown in Fig. 2. To illus-trate how error correction is achieved by the C-element, con-sider the scenario where a particle strikes one of the four latcheswhen CLK is low. Note that only one latch is affected by a par-ticle strike under the assumption of a single event upset (SEU).When CLK is high, latches LB and PH1 are transparent, and thesame data is stored in these two latches (assuming that no softerror has affected PH2 or LA). As shown in Table I, the C-ele-ment acts as an inverter when the outputs of LB and PH1 match,and the flip-flop output Q has the correct value. When CLKturns low, latches LB and PH1 hold the stored logic value insidetheir feedback loops. The contents of latches LB and PH1 arenow prone to soft errors, while LA and PH2 are not error-pronebecause they are transparent and driven by the preceding logicstages. Suppose that a particle strike flips the logic value storedin PH1. The two inputs of the C-element will be different butthe error will not propagate to the C-element output. Using thesame principle, error correction is also enabled during the sce-nario where a particle strike occurs when CLK is high.

The purpose of the keeper circuit in Fig. 2 is to fight theleakage current in the C-element when both the pull-up and thepull-down paths in the C-element are shut off, which occurs onlywhen the content of one bistable gets flipped by a particle strike.Depending on the process technology and the clock frequency,the keeper structure may not be required. A particle strike in the

keeper circuit will not cause an error at Q, because both O1 andO2 hold the correct logic values under the SEU assumption, andhence, the Q node is strongly driven by the C-element. Simula-tions have shown that the error-correcting design can achieve anSER reduction of more than 20-fold when compared to an un-protected flip-flop. More details are provided in Section IV.

III. REUSE PARADIGM

Scan DFT has become a de facto test standard in the in-dustry because it enables an automated solution to high qualityproduction testing at low cost. In addition, scan is extremelyvaluable for postsilicon debug activities [4] because it providesaccess to the internal nodes of an integrated circuit. Fig. 3(a)shows the block diagram of a scan flip-flop design of a micro-processor, comprising system and scan portions. Each portion isa master-slave flip-flop composed of two latches. Note that thetwo-port latches, such as the PH1 block in the system portionor the LA block in the scan portion, can sample from one of thetwo data lines depending on which clock signal is active.

The scan flip-flop in Fig. 3(a) has two operation modes: testand normal. The scan clocks for the test mode are illustrated inFig. 4. Clocks SCA and SCB are applied along the scan chain[Fig. 3(b)] alternately to shift a test pattern into latches LA andLB. Next, the UPDATE clock is applied to move the contentsof LB to PH1. Thus, a test pattern is written into the systemportion. Next, functional clock CLK is applied which capturesthe system’s response to the test pattern. Finally, the CAPTUREclock is applied to move the contents of PH1 to LA. The systemresponse can then be scanned out by alternately applying clocksSCA and SCB. During normal system operation, the scan por-tion is shut off by assigning zero values to the scan clocks (SCA,SCB, UPDATE, and CAPTURE), while the system portion isbeing clocked at full speed. The main reasons for using thisstyle of scan DFT instead of the classical scan flip-flop with amultiplexer are simplified postsilicon debug and at-speed func-tional testing. The scan portion of the flip-flop design in Fig. 3

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 3: 4. Sequential Element Design With Built-In Soft Error Resilience

1370 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 12, DECEMBER 2006

Fig. 3. (a) Block diagram of a scan flip-flop design. (b) Scan chain.

Fig. 4. Clock waveforms for a scan flip-flop in test mode.

is sometimes designed to operate at-speed when scan is used forat-speed functional test [5].

A key observation is that scan resources [e.g., latches LA andLB in Fig. 3(a)] are unused during normal operation, but stilloccupy chip area and consume leakage power. The concept ofreusing scan resources for error correction is illustrated in moredetail by the error-correcting scan flip-flop design (denoted ECdesign) in Fig. 5. The EC design is modified based on the scanflip-flop design in Fig. 3 by adding an OR gate to the clock path ofLB and an AND gate to the clock path of LA, as well as reroutingthe 2-D data port of the latch LA. The EC design has three dis-tinct operation modes: normal, test, and economy.

A. Normal and Test Modes

When the EC design operates in normal mode, the scan clocksSCA, SCB, UPDATE, and TEST are forced low, while the CAP-TURE signal is high. This equivalently converts the scan portioninto a master–slave flip-flop that operates in parallel with thesystem flip-flop. Error correction is then achieved in the sameway as described in Section II.

The test mode operation of the EC design is activated byforcing the TEST signal high. This ensures that the output ofscan portion O2 becomes a “don’t care” to the output of the ECdesign Q so that the shifting of a test sequence along the scanchain does not interfere with the operation of the EC design.The clock waveforms for the EC design in test mode remain thesame, as shown previously in Fig. 4. The EC design requiresan extra control signal TEST at the cell level compared to the

scan flip-flop. However, at the chip level, the TEST signal isdirectly generated from the available test circuitry. Moreover,the TEST signal remains static (either high or low) in any op-eration mode. Minimal buffering and routing is required sincethere are no strict timing constraints to meet. As a result, the ECdesign does not increase the number of timing-critical signalsin the system, and hence, does not require major architecturalchanges.

B. Economy Mode

The economy mode is motivated by the fact that in a modernchip design environment, a single design often targets severalapplication segments at the same time with apparently con-flicting requirements. For example, a mobile application (e.g.,a laptop) requires very low power consumption but may nothave a stringent requirement for low SER. On the other hand,an enterprise application (e.g., data centers) may be less powerconstrained, but may have to satisfy very stringent data integrityrequirements against soft errors. This provides a motivation tointroduce an economy mode into the EC design. The systemcan adaptively switch between the normal and economy modesdepending on the criticality of the application. If the applicationrequires high reliability, the system works under normal mode.Otherwise, the system switches into an economy mode, whichsignificantly reduces the power consumption by turning off partof the EC design circuitry.

One way to invoke an economy mode for the EC design isto disable their scan portions by assigning proper values to thescan clocks, as illustrated in Fig. 6. The signal CAPTURE isforced low so that the second clock port C2 of latch LA is alwayslow even though the clock signal CLK is still toggling duringeconomy mode. The signal SCB is forced high so that the clockport C1 of latch LB is always high. The combined assignmentof CAPTURE and SCB signal values ensures the scan portion,equivalent to a shadow master–slave flip-flop, has an opaquemaster latch and transparent slave latch. As a result, the scanportion does not consume any dynamic power caused by internaldata or clock activities. The signal TEST is also forced high so

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 4: 4. Sequential Element Design With Built-In Soft Error Resilience

ZHANG et al.: SEQUENTIAL ELEMENT DESIGN WITH BISER 1371

Fig. 5. Error-correcting scan flip-flop design.

Fig. 6. Error-correcting scan flip-flop design in economy mode.

that the C-element is disabled and the value of Q depends solelyon the operation of the system portion. This added economymode roughly halves the dynamic power consumption of theEC design when it runs noncritical applications.

IV. CIRCUIT DESIGN CONSIDERATIONS AND RESULTS

The scan portion in a scan flip-flop [Fig. 3(a)] typically op-erates at a lower speed than the system clock during test mode[4]. As a result of this relaxed timing requirement, slow transis-tors (those with smaller channel widths, longer channel lengths,or higher threshold voltages) are sometimes used in the scanportion to lower leakage power during normal mode without af-fecting its functionality in test mode. The scan portion in an EC

design, on the other hand, needs to meet the same timing con-straints (setup time, CLK-to-Q delay, etc.) as the system flip-flopin order to guarantee the correct operation. This means that tran-sistors at critical locations within the scan portion of the EC de-sign need to be replaced with the same type as those used in thesystem portion.

Due to the presence of both the nMOS and pMOS stacks inthe C-element, it is very difficult to match the delay of a C-el-ement with that of an inverter by only sizing up the transistors.Instead, low- transistors are used in the transistor stacks. Fur-thermore, the keeper circuit needs to be designed with greatcare. More specifically, the forward inverter is kept at minimalsize while the feedback inverter is intentionally weakened byusing a larger channel length value. This minimizes the extra

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 5: 4. Sequential Element Design With Built-In Soft Error Resilience

1372 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 12, DECEMBER 2006

Fig. 7. Simulation test bench for timing and power measurements.

delay caused by contention in the keeper loop when the C-ele-ment writes a new value into the keeper circuit. The impact ofthese circuit modifications are accurately modeled by our sim-ulation methodology, which is described as follows.

Comparisons are conducted between the proposed EC de-sign, a reference scan flip-flop (denoted MSFF), and a dual in-terlocked storage cell [6] flip-flop with scan circuitry (denotedDICE). The DICE design has been chosen for comparison be-cause it is one of the best known classical circuit hardening tech-niques. Since the circuit styles and transistor sizes may be sig-nificantly different across all designs being compared, we adopta unified approach previously suggested in [7] to analyze thetiming and power of the flip-flops under investigation. Fig. 7illustrates the test bench we used. We assume the external ca-pacitive loads and are both fan-out of 4. We also insertbuffers between CLKI (fed by an ideal voltage source) and theCLK pin of the flip-flop, as well as between DI and D. Thesebuffering inverters serve two purposes: 1) provide realistic Dand CLK signals to the flip-flop and 2) capture power dissipa-tion differences due to different CLK and D loadings in differentflip-flop designs. While optimizing the various flip-flop designs,the objective is to match the timing parameter, D-to-Q delay,with that of the reference design MSFF. We make the followingassumptions during the power measurement of all flip-flops:

1) data activity factor (average number of output transitionsper clock cycle) is 0.25;

2) low-to-high and high-to-low data transitions are equallylikely.

The cell layout areas are estimated by an internal tool at Intel,with a worst case error of 5% compared to real layouts. TheSERs are obtained from an internal simulator [8].

Table II shows the simulated cell-level power, area, and SERof all the designs. All the measurements are normalized withrespect to those of MSFF. The D-to-Q delay parameters are al-ready equalized for all designs and hence not shown. The ECdesign achieves a 20-fold SER improvement over the referencedesign (MSFF) with the power and area overhead of 1.43 and0.17, respectively. This also indicates a 7% power and 26% areasavings as compared to the DICE design. As mentioned earlier,these savings are a direct result of reusing existing on-chip re-sources. Another distinct advantage of the EC design over DICEis the economy mode, which can further lower the overheads

TABLE IICELL-LEVEL POWER, AREA, AND SER COMPARISONS

when noncritical applications are running. Note that an SER im-provement of more than 20-fold is theoretically possible for theEC design but is beyond the resolution of the simulator we used.

V. CHIP-LEVEL RESULTS

In this section, we explore the impact of BISER on a micro-processor by conducting a case study. We focus on the impactthat BISER has on SER and power consumption, identifying thetradeoff between these two metrics. The chip-level area penaltycaused by BISER is also estimated. To conduct this investiga-tion, we adopt the methodology used in [9], which is summa-rized below.

We use a highly detailed register transfer level model of adeeply pipelined, out-of-order microprocessor similar in com-plexity to the Alpha 21264. The fault model is a single bit flipof a state element, either a flip-flop or a RAM cell. In order tounderstand how various logic blocks in the pipeline contribute tothe failure rate of the microarchitecture, each flip-flop or RAMcell in the processor is categorized based on the general func-tion provided by that bit of state. The cumulative error coverageas a function of protected states can then be obtained by meansof fault injection experiments. Two fault injection campaignsare then performed to characterize the processor under two sce-narios: one where all memories are protected and soft errors af-fect flip-flops only and the other where soft errors affect bothflip-flops and in-pipeline memories.

A. Scenario I: Soft Errors Affect Flip-Flops Only

To emulate this scenario, faults are only injected into flip-flopstates within the processor pipeline because memories are as-sumed to be protected by ECC, and hence, not affected by softerrors. The error coverage as a function of cumulative flip-flopstate coverage is shown in Fig. 8. The chip-level power penaltyof selectively applying BISER techniques to flip-flops is alsoestimated based on the cell-level power, total chip power, andthe percentage of protected flip-flops. The power and areapenalties are listed together with chip-level SER improvementin Table III. The chip-level SER can be improved by ten timeswith an increased power of 10.3%.

B. Scenario II: Soft Errors Affect Both Flip-Flops andMemories

To emulate this scenario, faults are injected into both flip-flopand memory states within the processor pipeline. The error cov-erage plot is shown in Fig. 9. The raw SERs of flip-flops (de-noted ) are different from those of RAM cells (de-noted ). Since fault injection was conducted into both

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 6: 4. Sequential Element Design With Built-In Soft Error Resilience

ZHANG et al.: SEQUENTIAL ELEMENT DESIGN WITH BISER 1373

Fig. 8. Error coverage versus state coverage when all memories are protectedand soft errors affect flip-flops only.

TABLE IIICHIP-LEVEL POWER AND AREA PENALTIES OF BISER AS A FUNCTION

OF CHIP-LEVEL SER WHEN ALL MEMORIES ARE PROTECTED AND

SOFT ERRORS AFFECT FLIP-FLOPS ONLY

Fig. 9. Error coverage versus state coverage when errors affect both flip-flopsand in-pipeline memories.

flip-flop and RAM states, the error coverage curve depends onthe relative SER of flip-flops to RAMs. We consider various sce-narios based on the comparative SER study in [10]. The firstcurve (circle) corresponds to the case where the SER of theflip-flop is the same as that of the RAM cell; the second curve in-dicates the SER of flip-flop is one-tenth of that of the RAM cell;and so on. All three curves exhibit similar behavior. They start to

TABLE IVCHIP-LEVEL POWER, AREA, AND PERFORMANCE PENALTIES OF BISER

AS A FUNCTION OF CHIP-LEVEL SER WHEN SOFT ERRORS AFFECT

BOTH FLIP-FLOPS AND IN-PIPELINE MEMORIES

rise abruptly in the beginning because most of the error coverageis achieved by protecting the RAMs in certain logic blocks. Thecurve corresponding to higher flip-flop SER is lower becauseflip-flops contribute to more errors in that case. A prediction hasbeen made [10] that flip-flop SER will become larger relative toSRAM SER as process technology advances, meaning it will bemore important to protect flip-flops.

The chip-level power penalty results are shown in Table IV.A ten-fold improvement in chip-level SER can be achieved bypaying 7.0%–10.5% power penalty, depending on the flip-flopSER relative to SRAM SER. The higher the flip-flop SER, themore power penalty is needed to achieve the same chip-levelSER improvement.

VI. OTHER DESIGN VARIATIONS

The same principles of BISER have been employed in var-ious sequential element designs. Examples include error-cor-recting scanout and mux-scan flip-flops [3], and sequential ele-ment designs that correct combinational logic soft errors [11].We present two additional design variations in this section.

A. Low-Power EC Design

The EC design described in Section V inherits the main fea-tures of scan clocks in a scan flip-flop, i.e., all scan clocks (SCA,SCB, CAPTURE, and UPDATE) are globally routed and arenot timing critical. A low-power error-correcting design namedEC-LP is also possible as shown in Fig. 10. The CAPTUREsignal in this design is integrated into the clock generation cir-cuit, instead of being globally routed. This configuration re-duces clock loading inside the cell and eliminates one pin, whichresults in lower power consumption. The main difference be-tween EC-LP and EC designs stems from the clock waveformsin test mode, as illustrated in Fig. 11. Note that the EC-LP de-sign does not have an economy mode.

B. Error-Correcting Scan Pulse Latch

Latches with pulse clocking, also known as pulse latches,have been used in several microprocessors. For example, on the

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 7: 4. Sequential Element Design With Built-In Soft Error Resilience

1374 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 12, DECEMBER 2006

Fig. 10. Low-power error-correcting design.

Fig. 11. Clock waveforms for a low-power error-correcting design in testmode.

Itanium 2 processor, 95% of all static latches are pulse latches[12]. A pulse latch combines the behavioral merits of both anedge triggered flip-flop and a level sensitive latch. On the onehand, it provides a relatively wider transparency window thanthat of a flip-flop, which allows cycle stealing and skew toler-ance. On the other hand, it maintains edge triggering behaviorand avoids the relatively long hold times of a level-sensitivelatch. Moreover, a pulse latch has speed, power, and areaadvantages over its flip-flop counter-part: a pulse latch is fasterdue to the reduced number of logic gate levels between its inputand output, consumes less power due to roughly halved clockloading and reduced data switching activities, and occupiessmaller silicon area due to smaller transistor counts.

Scan pulse latches have been implemented in a micropro-cessor for production testing and postsilicon debug purposes [8],[12], [13]. Fig. 12 shows a scan pulse latch with system and scanportions. During test operation, the system and scan portionsform a master–slave flip-flop to shift the test sequence in andout. During normal operation, SHIFT is assigned a low value sothat the system portion is isolated from the scan-in data at SI.Note that data is written from both directions into the latch cellduring normal operation. This design eliminates the need for aninterrupted feedback inverter, and hence, saves clock power.

The design illustrated in Fig. 13 is an error-correcting scanpulse latch. During test mode operation, SCKA and SCKBsignals are pulsed high alternately to shift in the test sequencewhile SHIFT is high. Once the test sequence reaches the

Fig. 12. Scan pulse latch design.

desired location, the system’s response is captured by one ormore PCK pulses while SHIFT is low. To shift out the systemresponse, SCKA and SCKB are again alternately pulsed highwhile SHIFT is kept high and PCK is kept low. During normalmode operation, SCKA, SCKB, and SHIFT signals are keptlow while PCK is used to latch system data. An SEU in eitherthe PL1 or PL2 block will be corrected by the C-element asexplained earlier.

VII. RELATED WORK

Prior work for soft error protection in memory and sequen-tial elements can be broadly classified into the following cat-egories: process-level, device-level, circuit-level, and system-level techniques. We briefly review and compare some repre-sentative techniques in this section.

At the process level, silicon-on-sapphire (SOS) and sil-icon-on-insulator (SOI) technologies have been developed asa soft error mitigation technique for space and military appli-cations. These technologies are immune to radiation-induced

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 8: 4. Sequential Element Design With Built-In Soft Error Resilience

ZHANG et al.: SEQUENTIAL ELEMENT DESIGN WITH BISER 1375

Fig. 13. (a) Error-correcting scan pulse latch. (b) Circuit schematics of the PL1 or PL2 block in (a). (c) Circuit schematics of the C block in (a).

latch-up as the parasitic p-n-p-n structure does not exist any-more due to full electrical isolation of individual transistors.Moreover, the thin silicon film reduces the charge collectiondepth, which in turn, lowers the sensitivity to soft errors.Prior work has demonstrated the SER improvement of SRAMdevices fabricated on SOI over those on bulk: 2 improvementfor the 180-nm node and 5 for the 90-nm node [14]. However,the robustness of an SOI device with a floating body may becompromised by an ionizing particle that triggers the inherentand parasitic bipolar transistors, and hence, results in chargeamplification. Employing external body contacts could curethis problem but significantly increases the area penalty.

At the device level, rad-hard-by-design (RHBD) techniqueshave been used to lower a device’s sensitivity to radiation. Forexample, guardbanding around nMOS and/or pMOS transistorsgreatly reduces the susceptibility of CMOS circuits to radiation-induced latch-up [15]. However, applying such techniques to anexisting standard cell library could be time consuming since thelayouts of many devices need to be regenerated.

At the circuit level, there are two main approaches to reducethe effects of soft errors: 1) increasing the critical charge ofa circuit node and 2) adding transistors to enable redundantstorage of information. Capacitive hardening [16], resistivehardening [17], and the use of high-drive transistors have beenused to increase the critical charge. These techniques tend toincrease the power consumption and lower the speed of thecircuits. Redundant circuit techniques include the low powerWhitaker cell [18], Barry/Dooley design [19], [20], and DICE

design [6]. These redundant transistor-based designs usuallyrequire at least twice as many transistors as unprotected circuits,which typically indicates very high area and power penalties.The presented BISER technique overcomes this limitation byreusing exiting on-chip DFT resources.

At the system-level, hardware and time redundancy tech-niques have been proposed to combat soft errors. Classicalhardware redundancy techniques include chip-level duplication(as used in HP-Tandem machines [21]), block-level duplicationused in IBM Z-Series machines [21], triple modular redun-dancy [22], parity prediction [23]–[25], application-specificerror detection techniques [26]–[28], DIVA [29] and severalothers. A major benefit of these techniques is that they do notassume any particular error mechanism, and hence, work formost error sources. However, except for the application-specificerror detection techniques, the area and power overheads aregenerally very high. Major time redundancy techniques includeerror detection with shifted operands (RESO) [30], redundantexecution using spare elements (REESE) [31], multithreadingfor transient error detection [32]–[36], and software imple-mented hardware fault tolerance (SIHFT) [37], [38]. A keydrawback for these techniques is that the performance penaltycan be very significant: around 40% for multithreading, and40%–200% for SIHFT. Furthermore, the power overheads ofthese techniques are also significant due to redundant execution,although there is a lack of published power overheads. Thesetechniques are mainly applicable for specific designs such asmicroprocessors.

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 9: 4. Sequential Element Design With Built-In Soft Error Resilience

1376 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 12, DECEMBER 2006

VIII. CONCLUSIONS

The BISER paradigm presented in this paper is the key tothe development of power, performance, and area efficient softerror protection techniques. BISER has the following uniqueadvantages over other major soft error protection techniques:

1) minimal area and power overheads because resources al-ready present for test and debug are reused for soft errorresilience;

2) minimal routing overhead;3) no requirement for major architectural changes;4) applicability to any digital design (e.g., microprocessors,

network processors, ASICs);5) broad spectrum of design choices suitable for adaptive ap-

plications with a wide range of power and performancetradeoffs;

6) additional power saving techniques such as economy modeoperation.

ACKNOWLEDGMENT

The authors would like to thank K. Ganesh, J. Maiz, P.Shipley, A. Vo, S. Walstra, and V. Zia, and from Intel Corpo-ration for their discussions and assistance during the course ofthis research.

REFERENCES

[1] R. C. Baumann, “Soft errors in advanced semiconductor devices-PartI: The three radiation sources,” IEEE Trans. Device Mater. Reliab., vol.1, no. 1, pp. 17–22, Mar. 2001.

[2] M. Zhang and N. R. Shanbhag, “A soft error rate analysis (SERA)methodology,” in Proc. IEEE Int. Conf. Comput.-Aided Des., 2004, pp.111–118.

[3] S. Mitra, M. Zhang, T. M. Mak, N. Seifert, V. Zia, and K. S. Kim,“Logic soft errors: A major barrier to robust platform design,” in Proc.IEEE Int. Test Conf., 2005, pp. 687–696.

[4] R. Kuppuswamy, P. DesRosier, D. Feltham, R. Sheikh, and P.Thadikaran, “Full hold-scan systems in microprocessors: Cost/benefitanalysis,” Intel Technol. J., vol. 8, pp. 63–72, Feb. 2004.

[5] A. Carbine and D. Feltham, “Pentium pro processor design for test anddebug,” in Proc. IEEE Int. Test Conf., 1999, pp. 294–303.

[6] T. Calin, M. Nicolaidis, and R. Velazco, “Upset hardened memory de-sign for submicron CMOS technology,” IEEE Trans. Nucl. Sci., vol.43, no. 6, pp. 2874–2878, Dec. 1996.

[7] V. Stojanovic, V. G. Oklobdzija, and R. Bajwa, “A unified approach inthe analysis of latches and flip-flops for low power systems,” in Proc.Int. Symp. Low Power Electron. Des., 1998, pp. 227–232.

[8] N. Seifert, V. Ambrose, P. Shipley, M. Pant, and B. Gill, “Radiationinduced clock jitter and race,” in Proc. Int. Phys. Reliab. Symp., 2005,pp. 215–222.

[9] N. J. Wang, J. Quek, T. M. Rafacz, and S. J. Patel, “Characterizing theeffects of transient faults on a high-performance processor pipeline,” inProc. Int. Conf. Dependable Syst. Netw., 2004, pp. 61–70.

[10] R. C. Baumann, “The impact of technology scaling on soft error rateperformance and limits to the efficacy of error correction,” in Dig. IEEEInt. Electron Devices Meeting, 2002, pp. 329–332.

[11] S. Mitra, M. Zhang, S. Waqas, N. Seifert, B. Gill, and K. S. Kim, “Com-binational logic soft error correction,” in Proc. IEEE Int. Test Conf.,2006, Paper No. 29.2.

[12] S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J.Sullivan, and T. Grutkowski, “The implementation of the Itanium 2microprocessor,” IEEE J. Solid-State Circuits, vol. 37, no. 11, pp.1448–1460, Nov. 2002.

[13] D. D. Josephson, S. Poehlman, V. Govan, and C. Mumford, “Testmethodology for the McKinley processor,” in Proc. Int. Test Conf.,2001, pp. 578–585.

[14] P. Roche and G. Gasiot, “Impacts of front-end and middle-end processmodifications on terrestrial soft error rate,” IEEE Trans. Device Mater.Reliab., vol. 5, no. 3, pp. 382–396, Sep. 2005.

[15] J. V. Osborn, D. C. Mayer, R. C. Lacoe, S. C. Moss, and S. D. LaLu-mondiere, “Single event latchup characteristics of three commercialCMOS processes,” in Proc. 7th NASA Symp. VLSI Des., 1998, PaperNo. 4.3.1.

[16] STMicroelectronics Press Release, “New chip technology from STmi-croelectronics eliminates soft error threat to electronic systems,” 2003[Online]. Available: http://www.st.com/stonline/press/news/year2003/t1394h.htm

[17] L. R. Rockett Jr., “Simulated SEU hardened scaled CMOS SRAM celldesign using gated resistors,” IEEE Trans. Nucl. Sci., vol. 39, no. 5, pp.1532–1541, Oct. 1992.

[18] M. N. Liu and S. Whitaker, “Low power SEU immune CMOS memorycircuits,” IEEE Trans. Nucl. Sci., vol. 39, no. 6, pp. 1679–1684, Dec.1992.

[19] M. J. Barry, “Radiation resistant SRAM memory cell,” U.S. Patent5 157 625, Oct. 20, 1992.

[20] J. G. Dooley, “SER-immune latch for gate array, standard cell, andother ASIC applications,” U.S. Patent 5 311 070, May 10, 1994.

[21] W. Bartlett and L. Spainhower, “Commercial fault tolerance: A tale oftwo systems,” IEEE Trans. Dependable Secure Comput., vol. 1, no. 1,pp. 87–96, Jan. 2004.

[22] D. P. Siewiorek and R. S. Swarz, Reliable Computer Systems Designand Evaluation, 3rd ed. Natick, MA: A. K. Peters, 1998.

[23] K. Mohanram, E. S. Sogomonyan, M. Gossel, and N. A. Touba,“Synthesis of low-cost parity-based partially self-checking circuits,”in Proc. IEEE On-Line Testing Symp., 2003, pp. 35–40.

[24] N. A. Touba and E. J. McCluskey, “Logic synthesis of multilevelcircuits with concurrent error detection,” IEEE Trans. Comput.-AidedDes., vol. 16, no. 7, pp. 783–789, Jul. 1997.

[25] C. Zeng, N. R. Saxena, and E. J. McCluskey, “Finite state machine syn-thesis with concurrent error detection,” in Proc. Int. Test Conf., 1999,pp. 672–680.

[26] K. H. Huang and J. A. Abraham, “Algorithm based fault tolerancefor matrix operations,” IEEE Trans. Comput., vol. C-33, no. 6, pp.518–528, Jun. 1984.

[27] W. J. Huang, N. R. Saxena, and E. J. McCluskey, “A reliable LZ datacompressor on reconfigurable coprocessors,” in Proc. IEEE Field Pro-gram. Custom Comput. Mach., 2000, pp. 249–258.

[28] J. Y. Jou and J. A. Abraham, “Fault-tolerant FFT networks,” IEEETrans. Comput., vol. 37, no. 5, pp. 548–561, May 1988.

[29] T. M. Austin, “DIVA: A reliable substrate for deep submicron microar-chitecture design,” in Proc. Int. Symp. Microarch., 1999, pp. 196–207.

[30] J. H. Patel and L. Y. Fung, “Concurrent error detection in ALUs byrecomputing with shifted operands,” IEEE Trans. Comput., vol. C-31,no. 7, pp. 589–595, Jul. 1982.

[31] J. B. Nickel and A. K. Somani, “REESE: A method of soft error detec-tion in microprocessors,” in Proc. Int. Conf. Dependable Syst. Netw.,2001, pp. 401–410.

[32] E. Rotenberg, “AR-SMT: A microarchitectural approach to fault toler-ance in microprocessors,” in Proc. Int. Symp. Fault-Tolerant Comput.,1999, pp. 84–91.

[33] N. R. Saxena, S. Fernandez-Gomez, W. Huang, S. Mitra, S. Yu, and E.J. McCluskey, “Dependable computing and online testing in adaptiveand configurable systems,” IEEE Des. Test. Comput., vol. 17, no. 1, pp.29–41, Jan. 2000.

[34] J. Ray, J. C. Hoe, and B. Falsafi, “Dual use of superscalar datapath fortransient-fault detection and recovery,” in Proc. Int. Symp. Microarch.,2001, pp. 214–224.

[35] S. S. Mukherjee, M. Kontz, and S. K. Reinhardt, “Detailed designand evaluation of redundant multi-threading alternatives,” in Proc. Int.Symp. Comput. Arch., 2002, pp. 99–110.

[36] T. N. Vijaykumar, I. Pomeranz, and K. Cheng, “Transient-faultrecovery using simultaneous multithreading,” in Proc. Int. Symp.Comput. Arch., 2002, pp. 87–98.

[37] N. Oh, P. P. Shirvani, and E. McCluskey, “Error detection by dupli-cated instructions in super-scalar processors,” IEEE Trans. Reliab., vol.51, no. 1, pp. 63–75, Mar. 2002.

[38] N. Oh, S. Mitra, and E. J. McCluskey, “ED4I: Error detection by di-verse data and duplicated instructions,” IEEE Trans. Comput., vol. 51,no. 2, pp. 180–199, Feb. 2002.

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 10: 4. Sequential Element Design With Built-In Soft Error Resilience

ZHANG et al.: SEQUENTIAL ELEMENT DESIGN WITH BISER 1377

Ming Zhang (S’06–M’07) received the B.S. degreein physics from Peking University, Beijing, China, in1999 and the M.S. and Ph.D. degrees in electrical en-gineering from the University of Illinois at Urbana-Champaign (UIUC), Urbana, in 2001 and 2006, re-spectively.

He is currently a Staff Computer-Aided Design(CAD) Engineer with Intel Corporation, Folsom,CA. From 1999 to 2001, he developed micro-electromechanical systems for nanolithographyapplications at the Microelectronics Laboratory of

UIUC. From 2004 to 2005, he interned at Intel Corporation and developedsoft error resilient circuits and fault-tolerant architectures. His Ph.D. researchwork included a soft error rate analysis methodology and various classes ofsoft-error tolerant circuit design techniques. His current research interests in-clude error-resilient and low-power circuits, variation and degradation-tolerantcircuits and architectures, and circuit/architecture design for nanotechnology.He has published more than twenty technical papers and holds seven issued orpending U.S. patents. He serves on the program committees of several IEEEconferences and symposia.

Dr. Zhang was a recipient of the M. E. Van Valkenburg Research Awardfor demonstrated excellence in circuit and system research and the UniversityAward for excellence in teaching an undergraduate-level circuit class.

Subhasish Mitra (SM’06) is an Assistant Professorin the Departments of Electrical Engineering andComputer Science, Stanford University, Stanford,CA. His research interests include robust systemdesign, VLSI design and test, computer-aided design(CAD), fault-tolerant computing, and computerarchitecture. Prior to joining Stanford, he was a Prin-cipal Engineer at Intel Corporation, Hillsboro, OR,where he was responsible for developing enablingtechnologies for robust system design – Design forExcellence (Reliability, Testability, and Debug) –

in advanced technologies. He has published more than 90 technical papersin leading conferences and journals, and invented design and test techniquesthat have shown wide-spread proliferation in the industry. His X-Compacttechnique for test compression is being used by more than 40 Intel products,and is supported by major CAD tools.

Dr. Mitra was a recipient of the IEEE Circuits and Systems Society DonaldO. Pederson Award for the best paper published in the IEEE TRANSACTIONS

ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, a BestPaper Award at the Intel Design and Test Technology Conference for his workon built-in soft error resilience, a Best Paper Award nomination at the DesignAutomation Conference, a Divisional Recognition Award from the Intel Mo-bility Group for a breakthrough soft error protection technology, Terman Fel-lowship from the Stanford School of Engineering, the Sundaram Seshu ScholarLecturer at the Coordinated Science Laboratory of the University of Illinois atUrbana Champaign, and the Intel Achievement Award, Intel’s highest honor,for the development and deployment of a breakthrough test compression tech-nology that achieved an order of magnitude reduction in scan test cost. He hasheld consulting positions at several companies, and serves on the organizing andprogram committees of several IEEE and ACM sponsored conferences, sym-posia, and workshops.

T. M. Mak (SM’01) received the B.S.E.E. degreefrom Hong Kong Polytechnic University, HongKong, in 1979.

He is a Research Scientist with the DesignTechnology Solution Group, Intel Corporation,Santa Clara, CA, carrying out test research. Heis currently serving a second term assignment tomentor MARCO/Focus Center Research Program(FCRP) research. He has been with Intel for over22 years and has worked on a variety of areasincluding test development, product engineering,

design automation, and design for test. His current research interests rangefrom defect-based testing, fault effects as a result of nanometer technology,circuit level and physical design test issues, I/O interface and analog testing,and fault tolerant and online testing. He currently holds eight patents with

seven more pending. He had served on the program committees of variousconferences and workshops.

Dr. Mak was a recipient of the SRC Outstanding Industrial Mentor Award inboth 1997 and 2004 and the Best Paper Award at the International Test Confer-ence in 2004 and a Best Panel Award from VTS in 2004.

Norbert Seifert (SM’04) received the M.S. degreein physics from Vanderbilt University, Nashville, TN,in 1994, and the Diplom Ingenieur and Ph.D. degreesin physics from the Technical University of Vienna,Vienna, Austria, in 1990 and 1993, respectively.

He is currently a Staff Reliability and Design En-gineer with Intel Corporation, Hillsboro, OR, wherehe is responsible for developing a coherent chip-levelSER methodology. He is also studying the impact ofNBTI on circuit and system performance. He workedin TCAD and circuit design in the Alpha Develop-

ment Group (DEC/Compaq/HP) from 1997 to 2003. Prior to joining DEC, hestudied charge transfer processes in atomic collisions as a postdoctoral associateat North Carolina State University, Raleigh, and computational fluid dynamicsof high-power laser material processing as a postdoctoral associate at the Tech-nical University of Vienna. He has worked extensively on the interaction of radi-ation with matter, in general, and on the response of digital circuits in particular.He has published more than 30 technical papers on this topic and has presentedsoft error tutorials at several reliability conferences. He actively serves on theorganizing and program committees at several IEEE sponsored conferences. Heis a frequent reviewer for leading reliability journals and is a co-editor of theSeptember, 2005, IEEE TRANSACTIONS DEVICE AND MATERIALS RELIABILITY

Special Issue on Soft Errors and Data Integrity in Terrestrial Computer Systems.

Nicholas J. Wang received the M.S. and B.S.degrees in electrical and computer engineering fromthe University of Illinois Urbana-Champaign, Ur-bana Champaign, where he is currently pursuing thePh.D. degree in electrical and computer engineering.

His research interests include fault-tolerant com-puter architectures.

Quan Shi received the B.S. and M.S. degrees in radioelectronics from Beijing Normal University, Beijing,China, in 1986 and 1989, respectively, and the Ph.D.degree in electrical engineering from University ofNew Mexico, Albuqueque, in 2000.

He is currently a Design Engineer with Intel,Hillsboro, OR. Before joining Intel, he was withNASA Institute of Advanced MicroElectronics inAlbuquerque, NM. His research interests includecircuit hardening techniques, circuit reliability,circuit modeling, and asynchronous circuits.

Kee Sup Kim (M’92) received the Ph.D. degree fromthe University of Wisconsin-Madison, Madison.

Currently, he is Director of DFX at MobilityGroup at Intel Corporation, Hillsboro, OR, where heis in charge of developing solutions for design fortestability, manufacturability, debug, and reliabilityfor communications products. Previously, he workedon various aspects of testing and DFT for Intel CPUproducts. He served as an organizing committeemember for International Conference on ComputerDesign.

Dr. Kim was a co-recipient of the IEEE Donald O. Peterson Award for hiswork in test compression.

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.

Page 11: 4. Sequential Element Design With Built-In Soft Error Resilience

1378 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 12, DECEMBER 2006

Naresh R. Shanbhag (F’06) received the Ph.D. de-gree in electrical engineering from the University ofMinnesota, Minneapolis, in 1993.

Since August 1995, he has been with the Depart-ment of Electrical and Computer Engineering, andthe Coordinated Science Laboratory, University ofIllinois at Urbana-Champaign, Urbana, where heis presently a Professor. From 1993 to 1995, heworked at AT&T Bell Laboratories, Murray Hill,NJ, where he was the lead chip architect for AT&T’s51.84-Mb/s transceiver chips over twisted-pair

wiring for Asynchronous Transfer Mode (ATM)-LAN and very high-speed dig-ital subscriber line (VDSL) chip-sets. His research interests include the designof integrated circuits and systems for broadband communications includinglow-power/high-performance VLSI architectures for error-control coding,equalization, as well as digital integrated circuit design. He has published morethan 90 journal articles, book chapters, and conference publications in this areaand holds three U.S. patents. He is also a co-author of the research monographPipelined Adaptive Digital Filters (Kluwer, 1994). From 1997–1999, he wasa Distinguished Lecturer for the IEEE Circuits and Systems Society. From1997–1999 and from 1999–2002, he served as an Associate Editor for theIEEE TRANSACTION ON CIRCUITS AND SYSTEMS—PART II: EXPRESS BRIEFS

and the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI)SYSTEMS, respectively. He has served on the technical program committees ofvarious conferences. He is also a co-founder and the Chief Technology Officerof Intersymbol Communications, Inc., (a wholly owned subsidiary of KodeosCommunications, Inc., since March 2006) Champaign, IL, which was foundedin 2000, and where he provides strategic directions in the development ofmixed-signal receivers for next generation optical fiber links.

Dr. Shanbhag was a recipient of the IEEE TRANSACTIONS ON VERY LARGE

SCALE INTEGRATION (VLSI) SYSTEMS Best Paper Award in 2001, the IEEELeon K. Kirchmayer Best Paper Award in 1999, the Xerox Faculty Award in1999, the National Science Foundation CAREER Award in 1996, and the Dar-lington Best Paper Award from the IEEE Circuits and Systems Society in 1994.

Sanjay J. Patel (M’99) received the B.S., M.S., andPh.D. degrees in computer science and engineeringfrom the University of Michigan, Ann Arbor, in 1990,1992, and 1999, respectively.

Currently, he is an Associate Professor in the Elec-trical and Computer Engineering Department and aWillett Faculty Scholar, University of Illinois at Ur-bana-Champaign, Urbana. He is also serving as ChiefArchitect at AGEIA Technologies, St. Louis, MO.His research interests include processor microarchi-tecture, computer architecture, and high performance

and reliable computer systems. He has worked with architecture, hardware ver-ification, logic design, and performance modeling at Digital Equipment Cor-poration, Intel Corporation, and HAL Computer Systems, as well as providedconsultation for Transmeta, Jet Propulsion Laboratory, HAL, Intel, and AGEIATechnologies.

Dr. Patel is a member of the IEEE Computer Society.

Authorized licensed use limited to: Chang Gung University. Downloaded on August 18,2010 at 09:30:37 UTC from IEEE Xplore. Restrictions apply.