BackSpace: Formal Analysis for Post-Silicon Debug


Flavio M. de Paula*, Marcel Gort*, Alan J. Hu*, Steve Wilton*, Jin Yang+

* University of British Columbia
+ Intel Corporation

Outline

Motivation
Current Practices
BackSpace – The Intuition
Proof-of-Concept Experimental Results
(Recent Experiments)
Conclusions and Future Work


Motivation

Chip is back from fab! Chips with manufacturing defects have been screened out.

A bring-up procedure follows:
Diagnostics run without problems; everything looks fine!
But the system becomes unresponsive while running the real application…
Every single chip fails in the same way (1M DPM: functional bugs).

What do we do now?

Current Practices

[Figure: state-space diagram showing the inputs, the non-buggy path, and the scanned-out buggy state]

Scan out the buggy state. But the cause is not obvious!

Guess when to stop, then single-step.

Problems: single-stepping interference; non-determinism; too early or too late to stop?


Current Practices

Leveraging additional debugging support:
Trace buffer of the internal state: provides only a narrow view of the design, e.g., program counter, address/data fetches.
Record all I/O and replay: solves the non-determinism problem, but requires highly specialized bring-up systems.

Just having additional hardware does NOT solve the problem.

A Better Solution: BackSpace

Goal:
Avoid guesswork
Avoid interfering with the system
Run at speed
Portable debug support
Compute an accurate trace to the bug

Requires:
Hardware: existing test infrastructure and scan chains; breakpoint circuit; good signature scheme
Software: efficient SAT solver; BackSpace Manager

A Better Solution: BackSpace

[Figure: animated state-space diagram showing the inputs, the non-buggy path, and the Formal Engine]

1. Run at speed until the chip hits the buggy state.
2. Scan out the buggy state and the history of signatures.
3. Off-chip formal analysis (Formal Engine).
4. Off-chip formal analysis: compute the pre-image.
5. Pick a candidate state and load the breakpoint circuit.
6. Run until the chip hits the breakpoint.
7. Pick another state and run until the chip hits the breakpoint. The computed trace now has length 2. Iterate.
8. The result is the BackSpace trace.
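The BackSpace steps can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the design is a made-up 3-bit shift register, a brute-force search stands in for the SAT-based pre-image computation, and a breakpoint counts as "hit" if the candidate state occurs anywhere in the replayed run. The names `step`, `pre_image`, `run_chip`, and `backspace` are hypothetical and do not come from the slides.

```python
def step(state, inp):
    """Toy next-state function: a 3-bit shift register (hypothetical design)."""
    return ((state << 1) | inp) & 0b111

def pre_image(state):
    """Brute-force stand-in for the SAT-based pre-image computation."""
    return sorted({p for p in range(8) for i in (0, 1) if step(p, i) == state})

def run_chip(inputs, start=0):
    """Replay the inputs and return every state the toy 'chip' visits."""
    states = [start]
    for inp in inputs:
        states.append(step(states[-1], inp))
    return states

def backspace(crash_state, hits_breakpoint, depth):
    """Walk backwards from the crash state, one confirmed predecessor per iteration."""
    trace = [crash_state]
    for _ in range(depth):
        for candidate in pre_image(trace[-1]):
            if hits_breakpoint(candidate):   # candidate was reached on chip
                trace.append(candidate)
                break
        else:
            break                            # no candidate was reached
    return trace[::-1]                       # oldest state first

visited = run_chip([1, 0, 1, 1, 0, 1])      # step 1: run until the "crash"
crash = visited[-1]                          # step 2: scan out the crash state
trace = backspace(crash, lambda s: s in visited, depth=3)
print(trace)
```

Each consecutive pair in `trace` is a legal transition of the toy design, and every state in it was actually reached by the chip. It is a valid path to the crash state, though not necessarily the final cycles of this particular run.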


Proof-of-Concept Experimental Results

[Diagram: BackSpace Manager, SAT Solver, Chip on Silicon]
[Diagram: BackSpace Manager, SAT Solver, Logic Simulator]

Setup:
OpenCores designs:
68HC05: 109 latches
oc8051: 702 latches
Run real applications

Proof-of-Concept Experimental Results

Can we find a signature that reduces the size of the pre-image?

Experiment:
Select 10 arbitrary ‘crash’ states on 68HC05
Try different signatures

[Figure: Signature Size vs. States in Pre-Image]
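A small sketch shows why a signature can shrink the pre-image. It assumes a hypothetical 8-bit design whose transition shifts out the top three state bits, and a hand-picked signature that records exactly those bits of the previous state. None of these names or sizes come from the 68HC05 experiment.

```python
STATE_BITS = 8
LOST_BITS = 3

def step(state, inp):
    """Toy design: shift left by 3, shifting in a 3-bit input (hypothetical)."""
    return ((state << LOST_BITS) | inp) & 0xFF

def signature(state, k):
    """Hand-picked k-bit signature: the top k state bits, which the shift destroys."""
    return state >> (STATE_BITS - k)

def pre_image(state, recorded_sig, k):
    """Predecessors of `state` whose k-bit signature matches the recorded one."""
    return [p for p in range(256)
            if any(step(p, i) == state for i in range(8))
            and signature(p, k) == recorded_sig]

prev = 0b10110101                 # the true predecessor, unknown to the debugger
crash = step(prev, 0b010)         # the scanned-out state
# Pre-image size for signature widths of 0, 1, 2, and 3 bits.
sizes = [len(pre_image(crash, signature(prev, k), k)) for k in range(4)]
print(sizes)
```

In this toy, each signature bit that captures information lost by the transition halves the number of candidate predecessors. With three bits the predecessor is unique.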

Proof-of-Concept Experimental Results

How far can we go back?

Experiment:
Select arbitrary ‘crash’ states: 10 each for 68HC05 and oc8051
Set a limit of 500 cycles of BackSpace
Set a limit of 300 states on the size of the pre-image
Compare the two best types of signature: hand-picked; universal hashing of the entire state
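The slides do not say which universal hash family was used. As one possible illustration, the sketch below uses a standard universal family: multiplication by a random binary matrix over GF(2), so each signature bit is the parity (XOR) of a random subset of state bits. The function name `make_signature` and the seed are assumptions; the 109-bit and 38-bit sizes are borrowed from the 68HC05 figures in this talk.

```python
import random

def make_signature(n_state_bits, n_sig_bits, seed=0):
    """Return a hash state -> signature defined by a random GF(2) matrix."""
    rng = random.Random(seed)
    # One random bit mask (matrix row) per signature bit.
    rows = [rng.getrandbits(n_state_bits) for _ in range(n_sig_bits)]

    def signature(state):
        sig = 0
        for row in rows:
            parity = bin(state & row).count("1") & 1   # XOR of the selected bits
            sig = (sig << 1) | parity
        return sig

    return signature

sig38 = make_signature(n_state_bits=109, n_sig_bits=38)
a, b = 0x1234_5678_9ABC, 0x0F0F_0F0F
print(hex(sig38(a)))
```

Because every output bit is an XOR tree over state bits, such a hash is linear: `sig38(a ^ b) == sig38(a) ^ sig38(b)`.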

[Figures: result plots]
68HC05 with 38-bit manual signature
68HC05 with 38-bit universal hashing
8051 with 281-bit manual signature
8051 with 281-bit universal hashing

Proof-of-Concept Experimental Results

Results:
Signature: universal hashing
Small size of pre-images
All 20 cases successfully BackSpaced to the limit

Proof-of-Concept Experimental Results

Breakpoint circuitry: 40-50% area overhead.
Signature computation: a naïve implementation of universal hashing results in 150% area overhead.

Recent Experiments

OpenRISC 1200:
32-bit RISC processor
Harvard micro-architecture
5-stage integer pipeline
Virtual memory support
Total of 3k+ latches

BackSpace implemented in HW/SW:
AMIRIX AP1000 FPGA board (provided by CMC)
Board mimics bring-up systems
Host PC: off-chip formal analysis

Recent Experiments

BackSpacing OpenRISC 1200:
Running a simple software application
BackSpaced for hundreds of cycles
Demonstrated robustness in the presence of nondeterminism

Conclusions & Future Work

Introduced BackSpace: a new paradigm for post-silicon debug

Demonstrated that it works

Main challenges:
Find hardware-friendly and SAT-friendly signatures
Minimize breakpoint circuitry overhead

Dfn. BackSpaceable Design

1) Augmented Machine
Given a state machine M = […], where S is the set of states, define the signature generator as […], where […] is its set of states. Construct an augmented machine M_A such that: […]

Dfn. BackSpaceable Design

2) BackSpaceable State
A state (s',t') of the augmented state machine M_A is backspaceable if its pre-image projected onto 2^S is unique.
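The definition can be checked by brute force on a toy machine. The sketch below assumes one specific construction that the slides do not spell out: the signature component of the next augmented state is a hash of the previous state bits, t' = sig(s). The 3-bit shift register and the one-bit `sig` are hypothetical.

```python
from itertools import product

STATES = range(8)
SIGS = (0, 1)
INPUTS = (0, 1)

def step(s, i):
    """Toy design M: a 3-bit shift register (hypothetical)."""
    return ((s << 1) | i) & 0b111

def sig(s):
    """Assumed signature: the state bit that the shift destroys."""
    return s >> 2

def aug_step(s, t, i):
    """Assumed augmented machine M_A: next state bits plus signature of s."""
    return step(s, i), sig(s)

def projected_pre_image(s_next, t_next):
    """State bits of every augmented predecessor of (s_next, t_next)."""
    return {s for s, t, i in product(STATES, SIGS, INPUTS)
            if aug_step(s, t, i) == (s_next, t_next)}

def is_backspaceable(s_next, t_next):
    return len(projected_pre_image(s_next, t_next)) == 1

print(projected_pre_image(5, 1), projected_pre_image(5, 0))
```

Without the signature, state 5 has two predecessors (2 and 6). With it, each augmented state (5,0) and (5,1) has exactly one.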

Dfn. BackSpaceable Design

3) BackSpaceable Machine
An augmented machine M_A is backspaceable iff all of its reachable states are backspaceable. A state machine M is backspaceable iff it can be augmented into a state machine M_A for which all reachable states are backspaceable.

Crash State History Algorithm

Given a state (s_0,t_0) of a backspaceable augmented state machine M_A, compute a finite sequence of states (s_0,t_0), (s_1,t_1), … as follows:
Since M_A is backspaceable, let s_{i+1} be the unique pre-image state (on the state bits) of (s_i,t_i).
Run M_A (possibly repeatedly) until it reaches a state (s_{i+1},x). Let t_{i+1} = x.
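A minimal sketch of this algorithm on a toy augmented machine, with random forward simulation standing in for the real chip. Everything concrete here is an assumption made for illustration: the 3-bit shift-register design, the one-bit signature t' = sig(s), the reset state `INITIAL`, the cycle budget per run, and the cap `max_len`.

```python
import random

INITIAL = (0, 0)   # assumed reset state (s, t) of the toy augmented machine

def step(s, i):
    """Toy design M: a 3-bit shift register (hypothetical)."""
    return ((s << 1) | i) & 0b111

def sig(s):
    """Assumed signature: the state bit that the shift destroys."""
    return s >> 2

def unique_pre_image(s, t):
    """State bits of the single predecessor of (s, t) in this toy M_A."""
    (p,) = {q for q in range(8) for i in (0, 1)
            if step(q, i) == s and sig(q) == t}
    return p

def run_until(target, rng, max_cycles=64):
    """One forward run from reset with random inputs.
    Returns the signature x once the state bits equal `target`, else None."""
    s, t = INITIAL
    for _ in range(max_cycles):
        if s == target:
            return t
        s, t = step(s, rng.randrange(2)), sig(s)
    return None

def crash_state_history(crash, rng, max_len=50):
    history = [crash]
    while history[-1] != INITIAL and len(history) <= max_len:
        s_prev = unique_pre_image(*history[-1])
        x = None
        while x is None:               # run M_A repeatedly until the target is hit
            x = run_until(s_prev, rng)
        history.append((s_prev, x))
    return history                     # newest state first

history = crash_state_history((5, 1), random.Random(0))
print(history)
```

The exact history depends on the random inputs, but each entry is a reachable predecessor of the entry before it, so reading the list backwards gives a valid execution ending in the crash state.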


Theorem (Correctness)

If started at a reachable state, the sequence of states computed by the preceding algorithm is the (reversed) suffix of a valid execution of M.


Theorem (Probabilistic Termination)

If the forward simulation is random, then with probability 1, the preceding algorithm will reach an initial state.
