Detecting Errors Using Multi-Cycle Invariance Information

Nuno Alves, Jennifer Dworak, and R. Iris Bahar

Division of Engineering, Brown University

Providence, RI 02912

Kundan Nepal

Electrical Engineering Dept., Bucknell University

Lewisburg, PA 17837

Design, Automation, and Test in Europe, April 20-24, 2009

Motivation

Errors in ICs are increasing
– Particle strikes, temperature, power, noise, process variations, test escapes, etc.

Previously, we have proposed using logic implications for online error detection during a single clock cycle

What happens if we consider implications across time cycles?

Outline

Introduction & Background
Logic Implications for Error Detection
Multi-Cycle Implications
Experimental Results
Conclusions

Other Work

Triple Modular Redundancy
Logic Duplication
Re-Execution in Multiple Threads
Codes (Parity, Berger, Bose-Lin, etc.)
High Level Fault Assertions
Fault Masking
Checking the Outputs Against a Subset of the Truth Table

Our Approach

Find natural expected relationships and check for their violation.

Water should be blue….

Not brown…

In circuits, expected relationships at the gate level consist of logic implications.

Implications Naturally Occur in Circuits

[Figure: example gate-level circuit with nodes n1 through n8 and sample logic values]

n5 = 1 → n8 = 0

Implication Violations Can Be Used to Detect Errors

[Figure: the example circuit (nodes n1 through n8) with checker logic that raises ERROR when the implication n5 = 1 → n8 = 0 is violated]

Appropriate checker logic can detect multiple errors with a single implication.
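As a minimal sketch of what such checker logic computes, the implication n5 = 1 → n8 = 0 is violated exactly when both nodes are observed at 1. The function names and the software framing are illustrative assumptions and do not describe the authors' hardware.

```python
def implication_violated(p: int, p_value: int, q: int, q_value: int) -> bool:
    """Return True when P == p_value is observed but Q != q_value.

    Hypothetical helper: models a checker for 'P = p_value -> Q = q_value'.
    """
    return p == p_value and q != q_value


def check_n5_n8(n5: int, n8: int) -> bool:
    # For n5 = 1 -> n8 = 0 the check reduces to a single gate: n5 AND n8.
    return implication_violated(n5, 1, n8, 0)


if __name__ == "__main__":
    for n5 in (0, 1):
        for n8 in (0, 1):
            print(n5, n8, "ERROR" if check_n5_n8(n5, n8) else "ok")
```

Any fault that drives n5 to 1 while n8 is 1, or n8 to 1 while n5 is 1, trips the same check, which is why one implication can flag several different errors.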

[Figure: the same circuit annotated with multiple stuck-at-1 (sa1) fault sites]

[Figure: Total Number of Implications With Distance 2 or Greater. x-axis: Circuit; y-axis: Number of Implications (log scale, 1 to 100000)]

So… what’s the problem?

We have too many implications!

How do we efficiently find them and which ones should we use?

Implication Algorithm

Gate-level implications can be found automatically…
…without functional knowledge of the circuit.

Start → Identify Potential Implications w/ Simulation → Verify Implications → Eliminate Subsumed Implications → Determine Coverage of Remaining Implications → Select Best Subset for Target Error Detection and Overhead → End

What determines which faults an implication may cover?

Potential Spatial Fault Coverage

Each implication can only cover a limited area of the circuit…

Direct Path (P = 0 → Q = 0)
– Faults along the path may be detected

Reconvergent Fanout (P = 1 → Q = 1)
– Faults along reconverging paths may be detected

Divergent Fanout (Q = 0 → P = 0)
– Faults along paths to common ancestors may be detected

Implications cannot cover any sites downstream of both implication points!

Limitations of Single-Cycle Implications

Implications may not exist to cover faults far downstream, e.g. close to:
– Flip-flops
– Primary Outputs

It is possible for no useful implications to exist in a single cycle

Optimal timing of capture is difficult

Many of these issues are alleviated if we consider multi-cycle implications

Multi-Cycle Implications

[Figure: sequential circuit with signals A, B, X, Y, and F. Caption: Sequential Circuit Containing No Non-Trivial Implications in Combinational Logic]

[Figure: Time Frame Expansion of the circuit into Cycle t1 (A1, B1, X0, Y0, X1, Y1, F1) and Cycle t2 (A2, B2, X1, Y1, X2, Y2, F2)]

Logic Value in First Clock Cycle Implies a Value at a Different Site in the Second Clock Cycle

B1 = 0 → F2 = 0
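The effect can be reproduced on a toy example. The sketch below uses a hypothetical one-flip-flop circuit (F = X AND A, next X = B), which is an assumption made for illustration and not the circuit on the slide. It unrolls two time frames and checks exhaustively whether an implication holds within one cycle or across two.

```python
import itertools


def step(x: int, a: int, b: int) -> tuple[int, int]:
    """One clock cycle of a hypothetical circuit: F = X AND A, next X = B."""
    f = x & a
    next_x = b
    return f, next_x


def single_cycle_holds(b_value: int, f_value: int) -> bool:
    """Does 'B = b_value -> F = f_value' hold within one cycle, for any state?"""
    for x, a, b in itertools.product((0, 1), repeat=3):
        f, _ = step(x, a, b)
        if b == b_value and f != f_value:
            return False  # counterexample found
    return True


def cross_cycle_holds(b1_value: int, f2_value: int) -> bool:
    """Does 'B1 = b1_value -> F2 = f2_value' hold over two unrolled frames?"""
    for x0, a1, b1, a2, b2 in itertools.product((0, 1), repeat=5):
        _f1, x1 = step(x0, a1, b1)   # time frame 1
        f2, _x2 = step(x1, a2, b2)   # time frame 2 sees the state from frame 1
        if b1 == b1_value and f2 != f2_value:
            return False
    return True


if __name__ == "__main__":
    print("single cycle B=0 -> F=0:", single_cycle_holds(0, 0))
    print("cross cycle B1=0 -> F2=0:", cross_cycle_holds(0, 0))
```

In this toy circuit no value of B constrains F in the same cycle, yet B1 = 0 forces X1 = 0 and therefore F2 = 0, mirroring the form of the implication above.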

Multi-Cycle Checker Hardware

[Figure: the time-frame-expanded circuit (Cycle t1, Cycle t2) alongside the sequential circuit (signals A, B, X, Y, F) with checker logic producing a "violation" signal for B1 = 0 → F2 = 0]

Checker hardware requires state to be held between first and second cycle….
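A behavioral sketch of such a checker is shown here, assuming one extra storage bit that remembers whether B was 0 in the previous cycle. The class and attribute names are hypothetical, and the model is a software approximation, not the authors' checker design.

```python
class CrossCycleChecker:
    """Behavioral model of a checker for 'B = 0 in cycle t -> F = 0 in cycle t+1'."""

    def __init__(self) -> None:
        # Models the extra flip-flop: was the antecedent (B == 0) seen last cycle?
        self.b_was_zero = False

    def clock(self, b: int, f: int) -> bool:
        """Advance one cycle; return True if the implication is violated now."""
        violation = self.b_was_zero and f != 0
        self.b_was_zero = (b == 0)  # hold state for the next cycle
        return violation


if __name__ == "__main__":
    checker = CrossCycleChecker()
    trace = [(0, 0), (1, 1), (1, 1), (0, 1), (1, 0)]  # (B, F) per cycle
    for cycle, (b, f) in enumerate(trace, start=1):
        print(cycle, b, f, "violation" if checker.clock(b, f) else "ok")
```

The held bit is the additional cost relative to a single-cycle checker, which needs only combinational logic.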

Spatial Coverage of Multi-Cycle Implications

[Figure: implication sites P and Q spanning Cycle t and Cycle t + 1]

Advantages:
– Good spatial coverage can be achieved near flip-flops
– Logical distance may increase between implication sites
– Delays captured at flip-flops in cycle t can be detected without complex timing

Experimental Setup

ISCAS ’89 benchmark circuits
Zchaff SAT solver to validate implications
Three sets of implications per circuit
– First cycle: both implication sites in cycle 1; obtained with single cycle analysis & unrestricted inputs
– Second cycle: both implication sites in cycle 2; obtained with time frame expansion
– Cross cycle: one site per cycle; obtained with time frame expansion

So, how many implications exist?

[Figure: Number of Implications in Each Class. x-axis: Circuit; y-axis: Number of Implications (0 to 30000); series: 1st cycle, cross-cycle, 2nd cycle only]

What is the distance between implication sites?

[Figure: Average Implication Distance for Single and Between Cycle Implications. x-axis: Circuit; y-axis: Average Distance (0 to 14); series: average single cycle distance, average cross-cycle distance]

How do the different implication classes compare for error detection (if we use all possible implications)?

[Figure: Contribution of Different Implication Classes to Error Detection. x-axis: Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488); y-axis: Error Coverage (0 to 100); series: 1st cycle, 1st and 2nd cycle, cross cycle, all]

Developing a Compressed Implication Set

Start → Choose next fault in fault list → Find implication with best coverage of this fault → Add best implication to compressed list → Any more faults? (Yes: choose next fault; No: return implication list) → End
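The loop in this flowchart can be sketched as follows. The data layout (a dictionary mapping each implication to its per-fault coverage), the rule of skipping faults that no implication covers, and the de-duplication of already chosen implications are assumptions added to make the sketch runnable.

```python
def compress(fault_list: list[str],
             coverage: dict[str, dict[str, float]]) -> list[str]:
    """Per-fault selection: keep, for each fault, the implication covering it best.

    coverage[implication][fault] is the fraction of that fault's error-producing
    patterns the implication detects (hypothetical layout).
    """
    compressed: list[str] = []
    for fault in fault_list:
        # Find the implication with the best coverage of this fault.
        best = max(coverage, key=lambda imp: coverage[imp].get(fault, 0.0))
        if coverage[best].get(fault, 0.0) == 0.0:
            continue  # assumption: no implication covers this fault, skip it
        if best not in compressed:
            compressed.append(best)
    return compressed


if __name__ == "__main__":
    # Made-up coverage numbers, for illustration only.
    example = {
        "i1": {"f1": 0.9, "f2": 0.2},
        "i2": {"f1": 0.5, "f2": 0.8, "f3": 0.4},
        "i3": {"f3": 0.1},
    }
    print(compress(["f1", "f2", "f3", "f4"], example))
```

Because an implication that is best for one fault often covers others as well, the compressed list can be much shorter than the full implication list.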

[Figure: Number of Compressed Implications. x-axis: Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488); y-axis: Number (0 to 500); series: 1st cycle, cross-cycle, 2nd cycle only]

What if we further trade off error coverage for reduced area overhead?

[Figure: Average Error Coverage Achieved for Different Area Thresholds. x-axis: Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488); y-axis: Average Error Coverage (0 to 100); series: 10%, 20%, 30%, 40%, 50%, Compressed, All]

[Figure: Percentage of Cross-Cycle Implications Chosen for Different Area Overheads. x-axis: Circuit (s298, s420, s444, s510, s713, s953, s1196, s1488); y-axis: % of Chosen Implications that are Cross-Cycle (0 to 100); series: 10%, 50%]

Conclusions

Implications can be used to effectively detect many errors at runtime
– Without requiring functional knowledge of the circuit
– Allowing tradeoffs to be made between error coverage and overhead

Cross-cycle implications cover faults that cannot be covered by single cycle implications

Even though they have larger overhead, cross cycle implications are often an “optimal” choice

When optimizing for low area overhead, more than 85% of the implications may be cross cycle

For Inquiring Minds

Implication Algorithm: Identify Potential Implications w/ Simulation

Run Good Circuit Simulation with Random Vectors and Monitor Site Values…

[Figure: table with one row per site pair (A,B; A,C; A,D) and one column per value combination (00, 01, 10, 11)]

A=0 → C = 0
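A minimal sketch of this step, assuming a small hypothetical three-input netlist (n1 = a AND b, n2 = n1 OR c, n3 = NOT n2) in place of a benchmark circuit: simulate random vectors, record which value combinations each site pair exhibits, and propose P = p → Q = q whenever P = p was seen but never together with the opposite value of Q.

```python
import itertools
import random


def simulate(a: int, b: int, c: int) -> dict[str, int]:
    """Fault-free simulation of a hypothetical netlist (illustration only)."""
    n1 = a & b
    n2 = n1 | c
    n3 = 1 - n2  # NOT gate
    return {"a": a, "b": b, "c": c, "n1": n1, "n2": n2, "n3": n3}


def candidate_implications(num_vectors: int = 200, seed: int = 0) -> set[tuple]:
    """Return candidates (P, p, Q, q) meaning 'P = p -> Q = q'."""
    rng = random.Random(seed)
    seen = set()  # observed combinations (P, value of P, Q, value of Q)
    for _ in range(num_vectors):
        values = simulate(*(rng.randint(0, 1) for _ in range(3)))
        for p, q in itertools.permutations(values, 2):
            seen.add((p, values[p], q, values[q]))

    sites = list(simulate(0, 0, 0))
    candidates = set()
    for p, q in itertools.permutations(sites, 2):
        for vp, vq in itertools.product((0, 1), repeat=2):
            # P = vp occurred with Q = vq, and never with Q = 1 - vq.
            if (p, vp, q, vq) in seen and (p, vp, q, 1 - vq) not in seen:
                candidates.add((p, vp, q, vq))
    return candidates


if __name__ == "__main__":
    for cand in sorted(candidates := candidate_implications()):
        print("{} = {} -> {} = {}".format(*cand))
```

Random simulation can only rule implications out, so the surviving candidates still need to be proven before they are used.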

Implication Algorithm: Verify Implications

Using a SAT solver (such as Zchaff)
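An implication P = p → Q = q is valid exactly when the condition "P = p and Q ≠ q" is unsatisfiable over all inputs. The sketch below substitutes exhaustive enumeration for the SAT solver, which is only practical for tiny circuits, and it does not use Zchaff or any SAT library. The netlist is the same hypothetical three-input example, repeated so the block is self-contained.

```python
import itertools


def simulate(a: int, b: int, c: int) -> dict[str, int]:
    """Fault-free simulation of a hypothetical netlist (illustration only)."""
    n1 = a & b
    n2 = n1 | c
    n3 = 1 - n2  # NOT gate
    return {"a": a, "b": b, "c": c, "n1": n1, "n2": n2, "n3": n3}


def verify(implication: tuple, num_inputs: int = 3) -> bool:
    """Return True if no input assignment violates 'P = p -> Q = q'.

    Brute-force stand-in for asking a SAT solver whether
    (P == p) AND (Q != q) is satisfiable.
    """
    p, p_value, q, q_value = implication
    for inputs in itertools.product((0, 1), repeat=num_inputs):
        values = simulate(*inputs)
        if values[p] == p_value and values[q] != q_value:
            return False  # satisfying assignment = counterexample
    return True


if __name__ == "__main__":
    print(verify(("n1", 1, "n2", 1)))  # n1 = 1 forces the OR output to 1
    print(verify(("a", 1, "b", 1)))    # independent inputs, not an implication
```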

Implication Algorithm: Eliminate Subsumed Implications

[Figure: example circuit with nodes n1 through n13]

n10 = 0 → n13 = 0

n4 = 1 → n8 = 0

Implication Algorithm: Determine Coverage of Remaining Implications

Of all the patterns that will allow a fault to produce an error at an output, how many will each implication detect?
