
Department of Electrical and Computer Engineering, University of Virginia

Xinfei Guo, Mircea R. Stan

Enabling Wearout-Immune BEOL and FEOL with Active Rejuvenation

{xg2dt, mircea}@virginia.edu

BEOL & FEOL Wearout

Wearout Issues

Previous Solutions

Tolerate - Design for the worst case (margins, upsizing)

Compensate - Dynamically adapt to wearout

Slow down the wearout - Reduce the stress

Passive Recovery - Remove the stress, shut off the core

Consider FEOL and BEOL separately!

This Work

Repair both kinds of wearout completely by introducing the notion of Accelerated & Active Recovery

Circadian Rhythms for FULL recovery

Demonstrate both solutions experimentally

Introduce Accelerated & Active Recovery as a new design knob for cross-layer resilience

[Figure: Wearout followed by Accelerated & Active Recovery. Panels (a) and (b). Labels: Interface Board, Chip, Data Sampling, Thermal Chamber, Circuit Under Test (CUT), 75 LUTs, 16-b Counter, frefclk, in, En, rst, Cout.]

[Plot: Frequency (MHz), axis from 24.5 to 26. Accelerated Stress for 48 hours, followed by Accelerated & Active Recovery for 12 hours. Annotation: 72.4%.]

An example where about 72.4% of wearout is recovered by accelerated self-healing techniques in only ¼ of stress time (measured).

[Figure: Accelerated self-healing space exploration (model prediction)]

Related Publications

• [Submitted] “Enabling Wearout-Immune BEOL and FEOL with Active Recovery”

• [INTEGRATION, the VLSI Journal] “Implications of Accelerated Self-Healing as a Key Design Knob for Cross-Layer Resilience”

• [ASPDAC '16] “Work hard, sleep well - Avoid irreversible IC wearout with proactive rejuvenation”

• [SELSE '15] “MCPENS: Multiple-Critical-Path Embeddable NBTI Sensors for Dynamic Wearout Management”

• [DAC '14] “Modeling and Experimental Demonstration of Accelerated Self-Healing Techniques”

Sleep-when-getting-tired for FULL recovery

Main Idea

Schedule accelerated and active recovery for both kinds of wearout before the irreversible wearout kicks in, and repeat this periodically
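
The main idea above can be illustrated with a toy simulation. This is only a sketch under assumed, made-up parameters: the linear stress and recovery rates, the `irreversible_onset` level, and the `trigger` threshold are hypothetical and are not taken from the poster's measurements. It shows why triggering recovery before the irreversible onset keeps the permanent component at zero, while never recovering lets it accumulate.

```python
import math


def simulate(hours, stress_rate=1.0, recover_rate=4.0,
             irreversible_onset=10.0, trigger=8.0):
    """Toy model of periodic rejuvenation (all units are arbitrary).

    While active, recoverable wearout grows by stress_rate per hour.
    Wearout above irreversible_onset becomes permanent.
    Once recoverable wearout reaches trigger, the block sleeps and
    recovers at recover_rate per hour until it is fully healed.
    """
    recoverable = 0.0
    permanent = 0.0
    active_hours = 0
    sleeping = False
    for _ in range(hours):
        if sleeping:
            recoverable = max(0.0, recoverable - recover_rate)
            if recoverable == 0.0:
                sleeping = False  # fully recovered, wake up
        else:
            active_hours += 1
            recoverable += stress_rate
            if recoverable > irreversible_onset:
                # Excess wearout can no longer be recovered.
                permanent += recoverable - irreversible_onset
                recoverable = irreversible_onset
            if recoverable >= trigger:
                sleeping = True  # schedule recovery before the onset
    return permanent, active_hours


if __name__ == "__main__":
    print("proactive recovery:", simulate(100))
    print("no recovery:       ", simulate(100, trigger=math.inf))
```

In this sketch the proactive schedule gives up some active hours in exchange for no permanent wearout; the actual trade-off depends on device behavior that the toy model does not capture.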

Cross-Layer Implementations

[Figure: Cross-layer Accelerated Self-Healing. Layers: Circuit (Wearout Sensors), Architecture (Dark Silicon, Program Counters), System (Virtual Sensors, Load Balancer). Other labels: Heating Elements (Accelerated Recovery, EN), Negative Voltage Generator (-V, Active Recovery, EN), Proactive Scheduler, Redundant Resources.]

[Figure: NoC system. Labels: Core, Sensor, router, Active Cores, Sleep Cores, Applications, Load Balancer, Core Allocation, Proactive Accelerated & Active Recovery Scheduler, Heat for accelerated recovery.]

A potential implementation of Cross-layer Accelerated Self-Healing in a NoC system
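
As a rough illustration of how a scheduler in such a NoC system might split cores into active and sleep sets, the sketch below ranks cores by their sensor readings and puts the most worn ones to sleep. The function name `plan_cores`, the dictionary of readings, and the "least worn cores stay active" policy are all assumptions made for illustration; the poster does not specify the allocation policy.

```python
def plan_cores(wearout_by_core, cores_needed):
    """Split cores into (active, sleep) lists.

    wearout_by_core: hypothetical mapping of core name -> sensor reading,
                     where a larger value means more wearout.
    cores_needed:    number of cores the current applications require.
    The least worn cores stay active; the rest are put to sleep so that
    accelerated & active recovery can be applied to them.
    """
    if cores_needed > len(wearout_by_core):
        raise ValueError("not enough cores for the requested load")
    # Sort by wearout, breaking ties by name for a deterministic result.
    ranked = sorted(wearout_by_core, key=lambda c: (wearout_by_core[c], c))
    return ranked[:cores_needed], ranked[cores_needed:]


if __name__ == "__main__":
    readings = {"c0": 0.9, "c1": 0.1, "c2": 0.5, "c3": 0.3}
    active, sleep = plan_cores(readings, cores_needed=2)
    print("active:", active)
    print("sleep (apply recovery):", sleep)
```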

2016 IEEE/ACM Workshop on Variability Modeling and Characterization (VMC), Austin, TX

BEOL: Electromigration (EM)

FEOL: Negative/Positive-bias temperature instability (N/PBTI)

Device: BTI → |Vth| increase; EM → R increase

Circuit: Delay increase (e.g. 10% in 3 years), timing violations

System: Faults, errors, reduced MTTF
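
To make the link between a |Vth| increase and a delay increase concrete, the sketch below uses the widely used alpha-power delay model, in which gate delay is proportional to Vdd / (Vdd - |Vth|)^alpha. The model choice and every numeric value (supply voltage, initial threshold, shift, alpha) are assumptions for illustration and are not taken from the poster.

```python
def relative_delay_increase(vdd, vth0, delta_vth, alpha=1.3):
    """Fractional gate-delay increase caused by a threshold shift.

    Assumes the alpha-power model: delay ~ vdd / (vdd - |vth|) ** alpha.
    vdd, vth0 and delta_vth are in volts; alpha is a fitting parameter.
    """
    overdrive_fresh = vdd - vth0
    overdrive_aged = vdd - (vth0 + delta_vth)
    if overdrive_aged <= 0:
        raise ValueError("aged threshold must stay below the supply voltage")
    return (overdrive_fresh / overdrive_aged) ** alpha - 1.0


if __name__ == "__main__":
    # Hypothetical operating point: 1.0 V supply, 0.30 V fresh |Vth|,
    # and a 50 mV BTI-induced shift.
    increase = relative_delay_increase(vdd=1.0, vth0=0.30, delta_vth=0.05)
    print(f"delay increase: {increase:.1%}")
```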

[Figure: BEOL Wearout (EM) Test Setup. Labels: Constant Current Supply, Device (Wire) under test, Thermal Chamber, Resistance Recording.]

[Figure: EM Test Chip Die Photo. Labels: Metal wire under test, Probe Pads.]

[Table: On-chip Wire Information]

[Figure: FEOL Wearout (BTI) Test Setup. Label: BTI Test Structure.]

BTI (FEOL Wearout) active and accelerated recovery

1. Passive Recovery: Vsg = 0, room temperature

2. Accelerated Recovery: Vsg = 0, high temperature

3. Active Recovery (activate the recovery): Vsg = negative, room temperature

4. Accelerated & Active Recovery: Vsg = negative, high temperature

EM (BEOL Wearout) active and accelerated recovery

1. Passive Recovery: I = 0, room temperature

2. Accelerated Recovery: I = 0, high temperature

3. Active Recovery (activate the recovery): I = negative, room temperature

4. Accelerated & Active Recovery: I = negative, high temperature
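
The four recovery conditions are the same for BTI and EM: a bias that is either removed or reversed (Vsg or I), and a temperature that is either room or high. A minimal sketch of that mapping follows; the function and parameter names are hypothetical.

```python
def recovery_mode(bias_negative, high_temperature):
    """Name the recovery condition for a given bias and temperature.

    bias_negative:    True if the bias is reversed (Vsg < 0 for BTI,
                      I < 0 for EM), False if it is simply removed (0).
    high_temperature: True for high temperature, False for room temperature.
    """
    modes = {
        (False, False): "Passive Recovery",
        (False, True): "Accelerated Recovery",
        (True, False): "Active Recovery",
        (True, True): "Accelerated & Active Recovery",
    }
    return modes[(bias_negative, high_temperature)]


if __name__ == "__main__":
    for bias in (False, True):
        for hot in (False, True):
            print(bias, hot, "->", recovery_mode(bias, hot))
```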

Experimental Results

FEOL Measurement Results

BEOL Measurement Results

[Plot: R (Ω), axis from 72.8 to 74.8, versus Time (min), axis from 0 to 1200. Stress (230 °C, +100 mA) followed by Recovery (230 °C, -100 mA). Annotation: There is still a permanent part.]

Summary

• BTI - 72.4% of the wearout is recovered within only 1/4 of the stress time through both high temperature and negative voltage

• EM - >75% of the wearout is recovered within 1/6 of the stress time with reverse current and high temperature

• In both cases, there is still an irreversible (permanent) component
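
One plausible way to compute a "percentage of wearout recovered" figure from three readings (fresh, after stress, after recovery) is sketched below. The formula is an assumption about how such a percentage could be defined, and the readings in the example are made-up placeholders, not measured data. Only the 48-hour stress and 12-hour recovery durations come from the poster.

```python
def recovered_fraction(fresh, stressed, recovered):
    """Fraction of the stress-induced shift that recovery undoes.

    Works for quantities that drop under stress (e.g. frequency) and for
    quantities that rise (e.g. wire resistance), since only ratios of
    differences are used.
    """
    if fresh == stressed:
        raise ValueError("no wearout to recover from")
    return (recovered - stressed) / (fresh - stressed)


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units (not from the poster).
    frac = recovered_fraction(fresh=100.0, stressed=90.0, recovered=97.24)
    print(f"recovered: {frac:.1%}")

    # Durations quoted on the poster for the BTI experiment.
    stress_hours, recovery_hours = 48, 12
    print("recovery time / stress time =", recovery_hours / stress_hours)
```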

Recovery under Different “circadian rhythms”

[Figure: FEOL recovery under different “circadian rhythms”. Frequency versus Time (Hours, Days, Years, End of life), with alternating Active and Sleep periods compared against Wearout with No recovery. Labels: Fresh, Turbo-boost, Average Frequency, Design Margin, Negative, 1X, 2X.]

Reduction of Design Margin >60X

BEOL recovery right after void growth phase starts

[Plot: R (Ω), axis from 76 to 80, versus Time (min), axis from 0 to 800. EM Stress (230 °C, +100 mA) followed by EM Accelerated & Active Recovery (230 °C, -100 mA). Annotations: Start Recovery; Fully Recovered!; EM induced by reverse current operation.]
