Efficient On-Line Testing of FPGAs with Provable Diagnosabilities

Vinay Verma (Xilinx Inc.)
Shantanu Dutt (Univ. of Illinois at Chicago)
Vishal Suthar (Univ. of Illinois at Chicago)

Outline

Previous on-line testing methods

Roving Tester (ROTE) & Built-in Self Tester (BISTer) Concepts

Two new BISTer architectures
– 1-diagnosable BISTer-1
– 2-diagnosable BISTer-2

New fast functional testing and diagnosis: FAST-TAD

Simulation results (fault coverage and fault latency)

Conclusions

Previous On-Line Testing Methods

On-line testing: testing a (small) part of the FPGA while a circuit is executing on another part; this increases system availability.

Fault scanning technique of [Shnidman et al., IEEE Tr. VLSI’98], which is applicable to bus-based FPGAs.

STAR technique of [Abramovici et al., ITC’99], which uses a roving tester that tests part of the FPGA while the rest executes the application circuit.
– Their group has presented several Built-in Self-Testers (BISTers) with different diagnosabilities and complex adaptive diagnosis, e.g., [Abramovici et al., ITW’00]; these are discussed later.
– They have also presented on-line BIST for interconnects [Stroud et al., ITW’01].

Roving Tester (ROTE) with Built-in-Self-Testers (BISTers)

[Figure: FPGA array holding the application circuit, a spare column, and the ROTE, which contains a BISTer.]

• Two columns left spare for ROTE; one for fault reconfiguration
• ROTE roves across the FPGA
• ROTE concept is similar to STAR at a high level
• Differentiation: BIST designs, fault reconfiguration & incremental re-routing techniques

[Figure: BISTer tile with a TPG, two CUTs and an ORA; the ORA output is the syndrome.]

TPG - Test Pattern Generator
CUT - Cells Under Test
ORA - Output Response Analyser

Definitions

k-diagnosability: A testing technique is said to be k-diagnosable if, in the presence of any m ≤ k faulty components, it can correctly identify all m faulty components among the n ≥ k components that it tests.

Detailed syndrome: The detailed syndrome for a session is the 0/1 bit pattern observed at the ORA output (0 => match, 1 => mismatch) over all the test vectors of the TPG.

Gross syndrome: A gross syndrome of a session is the overall pass/fail (indicated as √/X) observation over all modes of operation for that session. In other words, the gross syndrome of a session is an X (fail) if the ORA output is 1 for any input test vector, and a √ (pass) otherwise.
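The relation between the two syndromes can be sketched in a few lines of Python. This is only an illustration of the definitions; the function name and the list-of-bits representation are assumptions, not part of the original slides.

```python
from typing import Sequence

PASS, FAIL = "√", "X"

def gross_syndrome(detailed: Sequence[int]) -> str:
    """Collapse a detailed syndrome (one ORA bit per test vector) to pass/fail."""
    if any(bit not in (0, 1) for bit in detailed):
        raise ValueError("detailed syndrome must contain only 0/1 bits")
    # A single mismatch (1) on any test vector makes the whole session fail.
    return FAIL if any(detailed) else PASS
```

For example, `gross_syndrome([0, 0, 1, 0])` gives X, while `gross_syndrome([0, 0, 0, 0])` gives √.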


BISTer-0 [M. Abramovici et al., ITC ’99]

[Figure: the 2 x 2 BISTer tile (PLBs A, B, C, D) in its four configurations (S1) to (S4).]

• Exhaustive testing of CUTs
• S1, S2, S3, S4 are four sessions of testing in a BISTer tile

PLB   S1    S2    S3    S4
A     TPG   CUT   ORA   CUT
B     CUT   TPG   CUT   ORA
C     ORA   CUT   TPG   CUT
D     CUT   ORA   CUT   TPG

Theorem: BISTer-0 is zero-diagnosable.

Proof: The same pair of PLBs is configured as CUTs in two different sessions:
– PLBs A and C in S2 and S4
– PLBs B and D in S1 and S3

When either PLB of such a pair fails, the gross syndrome is identical in these sessions. E.g., if A fails as a CUT only, then its gross syndrome is identical to the gross syndrome of C failing as a CUT only. Hence we cannot distinguish between faulty PLBs A and C.

Faulty PLB   S1     S2    S3     S4
A            √      X     √/X    X
C            √/X    X     √      X

BISTer-0 thus requires a complex adaptive diagnosis phase.
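The argument in the proof can be checked mechanically. The sketch below encodes the BISTer-0 session table and lists the PLB pairs that are CUTs in exactly the same sessions; the dictionary layout and function names are illustrative assumptions.

```python
# Role of each PLB in sessions S1..S4 of BISTer-0, taken from the session table.
BISTER0_ROLES = {
    "A": ("TPG", "CUT", "ORA", "CUT"),
    "B": ("CUT", "TPG", "CUT", "ORA"),
    "C": ("ORA", "CUT", "TPG", "CUT"),
    "D": ("CUT", "ORA", "CUT", "TPG"),
}

def cut_sessions(roles):
    """Map each PLB to the set of (1-based) sessions in which it is a CUT."""
    return {
        plb: frozenset(i + 1 for i, r in enumerate(seq) if r == "CUT")
        for plb, seq in roles.items()
    }

def indistinguishable_pairs(roles):
    """PLB pairs that are CUTs in exactly the same sessions."""
    sessions = cut_sessions(roles)
    names = sorted(sessions)
    return [
        (p, q)
        for i, p in enumerate(names)
        for q in names[i + 1:]
        if sessions[p] == sessions[q]
    ]
```

Calling `indistinguishable_pairs(BISTER0_ROLES)` returns the pairs (A, C) and (B, D), which are the pairs named in the proof.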

Our BISTer-1 Architecture

[Figure: 2 x 2 BISTer-1 tile with PLBs B, A in the top row and C, D in the bottom row.]

PLB   S1    S2    S3    S4
A     TPG   ORA   CUT   CUT
B     CUT   TPG   ORA   CUT
C     CUT   CUT   TPG   ORA
D     ORA   CUT   CUT   TPG

S1   S2   S3   S4   Inference
√    √    √    √    No faulty PLB
X    √    √    √    Fault not in PLB
√    X    √    √    Fault not in PLB
√    √    X    √    Fault not in PLB
√    √    √    X    Fault not in PLB
X    X    √    √    Faulty C (CUT)
√    X    X    √    Faulty D (CUT)
√    √    X    X    Faulty A (CUT)
X    √    √    X    Faulty B (CUT)
X    √    X    √    Fault not in PLB
√    X    √    X    Fault not in PLB
X    X    X    √    Faulty D
√    X    X    X    Faulty A
X    X    √    X    Faulty C
X    √    X    X    Faulty B
X    X    X    X    Fault not in PLB


Theorem: BISTer-1 is 1-diagnosable.

Proof: Each PLB is a CUT in two unique sessions and a TPG in another unique session. This serves to uniquely identify the faulty PLB, which will have the gross syndrome X X √ in these three sessions.
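The inference table can be reproduced from the role assignment alone. The following is a minimal sketch, assuming (as the proof does) that a session fails whenever the faulty PLB is a CUT, passes when it is the TPG, and may go either way when it is the ORA. The names `BISTER1_ROLES`, `consistent` and `diagnose` are hypothetical.

```python
from typing import Sequence

# Role of each PLB in sessions S1..S4 of BISTer-1.
BISTER1_ROLES = {
    "A": ("TPG", "ORA", "CUT", "CUT"),
    "B": ("CUT", "TPG", "ORA", "CUT"),
    "C": ("CUT", "CUT", "TPG", "ORA"),
    "D": ("ORA", "CUT", "CUT", "TPG"),
}

def consistent(roles: Sequence[str], failed: Sequence[bool]) -> bool:
    """True if a single faulty PLB with these roles could produce `failed`."""
    for role, fail in zip(roles, failed):
        if role == "CUT" and not fail:  # a faulty CUT must make the session fail
            return False
        if role == "TPG" and fail:      # assumed: the session passes when the faulty PLB is the TPG
            return False
        # ORA sessions may pass or fail, so they impose no constraint.
    return True

def diagnose(failed: Sequence[bool]) -> str:
    """Map a gross syndrome (True = X, False = √) for S1..S4 to an inference."""
    if len(failed) != 4:
        raise ValueError("expected one pass/fail value per session S1..S4")
    if not any(failed):
        return "No faulty PLB"
    matches = [p for p, roles in BISTER1_ROLES.items() if consistent(roles, failed)]
    return f"Faulty {matches[0]}" if len(matches) == 1 else "Fault not in PLB"
```

For instance, `diagnose((True, True, False, False))` yields "Faulty C" and `diagnose((True, True, True, True))` yields "Fault not in PLB", in line with the table.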

BISTer-2 Architecture

[Figure: BISTer-2 tile of six PLBs (A to F) in session S1: two TPGs, two CUTs, ORA1 (output Y1) and ORA2 (output Y2).]

Y1 – output of the ORA comparing CUTs
Y2 – output of the ORA comparing TPGs

Theorem: BISTer-2 is 1-diagnosable.

Proof: The gross syndrome corresponding to Y1 is unique for each faulty PLB. E.g., Y1 is a pass in session 2 only for faulty PLB A and for no other PLB.

Gross syndrome corresponding to Y1:

Faulty PLB   S1   S2   S3   S4   S5   S6
A            X    √    X    X    X    X
B            X    X    √    X    X    X
C            X    X    X    √    X    X
D            X    X    X    X    √    X
E            X    X    X    X    X    √
F            √    X    X    X    X    X

6 rotations => 6 sessions

Theorem: BISTer-2 is 2-diagnosable under the assumptions:
1. No fault masking for all detailed syndromes
2. Faulty PLBs either uniformly all fail or all pass as TPG/ORA

Proof:

• For the case where faulty PLBs also fail as TPG/ORA, the possible gross syndromes (GS) are Y1Y2 = X√ and XX.
• Class 1: faulty pairs corresponding to GS = X√.
• There are 3 Class 1 pairs in a session: (CUT, CUT) at distance 2, (CUT, OR1) at distance 1 and (OR1, CUT) at distance 1.
• Class 2 includes the remaining faulty pairs (GS = XX).
• For session S1, Class 1 includes BD (distance 2), BC (distance 1) and CD (distance 1).

[Figure: BISTer-2 Architecture (cont.): tile configurations in sessions S1, S2 and S6, with outputs Y1 and Y2.]

S1: GS = X√ => BC/CD/BD
S1: GS = XX => Class 2 pairs
S2: GS = X√ => CD (the only Class 1 pair from S1 that is also Class 1 in S2)
S2: GS = XX => BC/BD
S6: GS = X√ => BC (the only Class 1 pair from S1 that is also Class 1 in S6)
S6: GS = XX => BD

Over S1-S6, every faulty pair at distance 1 or 2 falls in Class 1 in some session and hence is diagnosed.

=> The GSs are distinct for all distance-1 and distance-2 faulty pairs.
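The claim about distance-1 and distance-2 pairs can be verified by enumeration. The sketch below is one possible model, not the authors' code: it assumes the rotating role sequence from the BISTer-2 role table, takes "distance" to be the distance around the six-PLB ring, and models the gross syndrome as Y1 failing whenever a faulty PLB is a TPG, CUT or OR1 and Y2 failing whenever a faulty PLB is a TPG or OR2.

```python
from itertools import combinations

BASE = ("TPG", "OR2", "TPG", "CUT", "OR1", "CUT")  # roles of PLB A in S1..S6

def role(plb_index: int, session: int) -> str:
    """Role of a PLB (0 = A) in a session (0 = S1)."""
    return BASE[(session - plb_index) % 6]

def gross_syndrome(faulty, session):
    """(Y1, Y2) for a collection of faulty PLB indices; True means fail (X)."""
    roles = {role(i, session) for i in faulty}
    y1 = bool(roles - {"OR2"})          # any faulty TPG, CUT or OR1 disturbs Y1
    y2 = bool(roles & {"TPG", "OR2"})   # any faulty TPG or OR2 disturbs Y2
    return (y1, y2)

def ring_distance(i: int, j: int) -> int:
    d = abs(i - j) % 6
    return min(d, 6 - d)

def pair_signature(pair):
    """Gross syndromes of a faulty pair over all six sessions."""
    return tuple(gross_syndrome(pair, s) for s in range(6))

pairs = list(combinations(range(6), 2))
near = [p for p in pairs if ring_distance(*p) < 3]    # distance 1 or 2
far = [p for p in pairs if ring_distance(*p) == 3]    # AD, BE, CF
class1_in_s1 = [p for p in pairs if gross_syndrome(p, 0) == (True, False)]
```

Under these assumptions the 12 near pairs have 12 distinct signatures, `class1_in_s1` is BC, BD and CD, and each of the three distance-3 pairs gives XX in every session.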

The detailed syndrome for a session is the 0/1 bit pattern observed at the ORA output (0 => match, 1 => mismatch) over all the test vectors of the TPG.

For faulty pairs at distance 3, i.e., pairs AD, BE and CF, the GS is Y1Y2 = XX in all sessions. Hence they never fall in Class 1 and are not distinguishable among themselves by gross syndromes.

To distinguish these distance-3 pairs, we compare their detailed syndromes (dS):
AD: dS1 = dS3 (TPG-CUT in both sessions), dS4 = dS6 (CUT-TPG in both)
BE: dS1 = dS5, dS2 = dS4
CF: dS2 = dS6, dS3 = dS5

These pairs are uniquely diagnosable except when dS1 = dS3 = dS5 and dS2 = dS4 = dS6, which is a very low probability event; e.g., it requires 4 very low probability events of the type dS(CUT, TPG) = dS(TPG, CUT).

Thus all faulty pairs are diagnosable with high probability.
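The session equalities listed for AD, BE and CF follow from the role rotation: two sessions give the same detailed syndrome when the faulty pair occupies the same (TPG, CUT) or (CUT, TPG) roles in both. A short sketch, assuming the role sequence of the BISTer-2 role table; the helper name `matching_sessions` is hypothetical.

```python
from collections import defaultdict

BASE = ("TPG", "OR2", "TPG", "CUT", "OR1", "CUT")  # roles of PLB A in S1..S6

def role(plb_index: int, session: int) -> str:
    """Role of a PLB (0 = A) in a session (0 = S1)."""
    return BASE[(session - plb_index) % 6]

def matching_sessions(i: int, j: int):
    """Group 1-based sessions in which PLBs i and j hold the same TPG/CUT role pair."""
    groups = defaultdict(list)
    for s in range(6):
        pair = (role(i, s), role(j, s))
        if set(pair) == {"TPG", "CUT"}:
            groups[pair].append(s + 1)
    return sorted(groups.values())
```

`matching_sessions(0, 3)` (pair AD) returns the groups [1, 3] and [4, 6], i.e. dS1 = dS3 and dS4 = dS6; BE and CF give [1, 5], [2, 4] and [2, 6], [3, 5] respectively.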

[Figure: BISTer-2 Architecture (cont.): tile configurations in sessions S1 and S3, marking the three distance-3 pairs.]

Fast-TAD: A Fast Functional Testing and Diagnosis

• In this methodology, a PLB is tested only for the specific functions (called operational functions) that it will assume as the ROTE moves across the FPGA.

• A PLB X is functionally faulty (f-faulty) if faults in X produce incorrect outputs when X implements any of its operational functions.

• Property: While roving the ROTE in an FPGA either without f-faults or with reconfigured f-faults, a PLB X needs to implement at most 2 functions: its original function (when the ROTE is in its initial position) and the function of the PLB two f-fault-free PLBs to its right.

[Figure: three positions of the ROTE as it roves across columns c1 to c7, which hold functions fx1 to fx4. The PLB in column c3 implements functions fx1 and fx3 as the ROTE moves across the FPGA; these are the operational functions of c3.]

Advantages:
• Faster T&D
• Much higher yield
• Much higher availability
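The "at most 2 functions" property can be illustrated with a toy one-row model of the roving process. This is a sketch under simplifying assumptions that are not from the slides: a single row, no f-faults, no spare column, and a ROTE that starts in the two leftmost columns and swaps places with the two columns to its right at each move.

```python
from typing import Dict, List, Optional

def rove(layout: List[Optional[str]], width: int = 2) -> Dict[int, List[str]]:
    """Return, per column index, the functions it hosts while the ROTE roves right.

    `layout` lists the function in each column; None marks a ROTE column.
    The ROTE is assumed to occupy the leftmost `width` columns initially.
    """
    if any(f is not None for f in layout[:width]):
        raise ValueError("the ROTE must start in the leftmost columns")
    cols = list(layout)
    hosted = {i: ([f] if f is not None else []) for i, f in enumerate(cols)}
    pos = 0
    while pos + 2 * width <= len(cols):
        # Relocate the functions right of the ROTE into the ROTE's columns,
        # which frees those columns for the next ROTE position.
        for k in range(width):
            f = cols[pos + width + k]
            cols[pos + k] = f
            cols[pos + width + k] = None
            if f is not None:
                hosted[pos + k].append(f)
        pos += width
    return hosted
```

With `rove([None, None, "fx1", "fx2", "fx3", "fx4"])`, the third column (c3 in the figure) hosts fx1 and then fx3, and no column hosts more than two functions.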

Diagnosis in Fast-TAD (overlaid on BISTer-1)

PLB   S1           S2           S3           S4
A     TPG          ORA          CUT d1,d2    CUT a1,a2
B     CUT b1,b2    TPG          ORA          CUT a1,a2
C     CUT b1,b2    CUT c1,c2    TPG          ORA
D     ORA          CUT c1,c2    CUT d1,d2    TPG

f-faulty PLB   S1     S2     S3     S4
A              √      X/√    X/√    X
B              X      √      X/√    X/√
C              X/√    X      √      X/√
D              X/√    X/√    X      √

• Each PLB is tested in its two operational functions.
• An f-faulty PLB Q configured as a TPG will have a GS of √, while Q configured as a CUT and performing its operational functions will have a GS of X. In all other cases the GS is either a √ or an X.
• In some cases, faults in A and C (or B and D) may not be distinguishable, since rows A (√ X/√ X/√ X) and C (X/√ X √ X/√) can yield the same GS; a 2nd test is then required.
• Requires 10·t1 time, versus 16·t1 if both CUTs in a session are configured in both their operational functions.
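The ambiguity between A and C (or B and D) can be seen by matching an observed gross syndrome against the f-faulty table, treating every X/√ entry as a wildcard. A minimal sketch; the pattern strings and the name `candidates` are illustrative assumptions.

```python
# Expected gross syndrome per f-faulty PLB over S1..S4:
# "X" = fail, "√" = pass, "?" = either (the X/√ entries of the table).
FAST_TAD_PATTERNS = {
    "A": "√??X",
    "B": "X√??",
    "C": "?X√?",
    "D": "??X√",
}

def candidates(observed: str) -> list:
    """PLBs whose f-faulty pattern is compatible with the observed syndrome."""
    if len(observed) != 4 or set(observed) - {"X", "√"}:
        raise ValueError("observed syndrome must be four symbols from X/√")
    return [
        plb
        for plb, pattern in FAST_TAD_PATTERNS.items()
        if all(p in ("?", o) for p, o in zip(pattern, observed))
    ]
```

`candidates("√X√X")` returns both A and C, the case where a 2nd test is needed, whereas `candidates("√XXX")` returns only A.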

2nd test (two sessions):

PLB   S1 (C/A)     S2 (B/D)
A     TPG          CUT b1,b2
B     CUT c1,c2    CUT b1,b2
C     CUT c1,c2    ORA
D     ORA          TPG

Faulty PLB   S1 (C/A)   S2 (B/D)
A            √          -
B            -          X
C            X          -
D            -          √

Theorem: Fast-TAD using BISTer-1 is 1-diagnosable

Both CUTs in a session configured in both their operational functions:

PLB   S1                 S2                 S3                 S4
A     TPG                ORA                CUT d1,d2,a1,a2    CUT a1,a2,b1,b2
B     CUT b1,b2,c1,c2    TPG                ORA                CUT a1,a2,b1,b2
C     CUT b1,b2,c1,c2    CUT c1,c2,d1,d2    TPG                ORA
D     ORA                CUT c1,c2,d1,d2    CUT d1,d2,a1,a2    TPG

Simulation Environment

• A 32 x 32 FPGA was simulated with 3-input 1-output PLBs.

• Fast-TAD with BISTer-1 and STAR BISTer (enhancement of BISTer-0 with 1-diagnosability) techniques were implemented on this FPGA.

• The adaptive diagnosis phase of the STAR BISTer is very complex; we have simulated only the fault detection and direct diagnosis phase of the STAR BISTer (BISTer-1 has no adaptive diagnosis phase)

• Two types of faults (with internal fault density up to 25%) were inserted:
1. Randomly distributed faults with external fault density up to 40%
2. Clustered faults with cluster density up to 3%

Probability of a fault around a “center” fault = k/d (k = constant, d = distance)

[Figure: a center faulty PLB surrounded by rings of PLBs at distances 1 and 2. Legend: center faulty PLB, correlated faulty PLB, non-faulty PLB.]
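One way such a clustered fault pattern could be generated is sketched below. The slides only give the probability p = k/d; the use of ring (Chebyshev) distance, the cut-off radius, and all function names are assumptions made for illustration.

```python
import random
from typing import Set, Tuple

def fault_probability(k: float, d: int) -> float:
    """Probability of a correlated fault at distance d >= 1 from a center fault."""
    return min(1.0, k / d)

def clustered_faults(
    center: Tuple[int, int],
    k: float,
    radius: int,
    rows: int,
    cols: int,
    rng: random.Random,
) -> Set[Tuple[int, int]]:
    """Sample the faulty PLBs of one cluster on a rows x cols array."""
    r0, c0 = center
    faults = {center}  # the center PLB is faulty by definition
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            d = max(abs(r - r0), abs(c - c0))  # ring index around the center (assumed metric)
            if d > 0 and rng.random() < fault_probability(k, d):
                faults.add((r, c))
    return faults
```

For example, `clustered_faults((16, 16), 0.5, 2, 32, 32, random.Random(0))` samples one cluster on a 32 x 32 array with k = 0.5; PLBs in the first ring are faulty with probability 0.5 and those in the second ring with probability 0.25.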

Simulation of 3 x 2 STAR BISTer [M. Abramovici et al., ITW ’00]

[Figure: two 3 x 2 STAR BISTer tiles, each with rows (T, C), (T, O), (T, C).]

T – TPG, O – ORA, C – CUT

• 1-diagnosable; it can diagnose 1 fault in a 3 x 2 BISTer area (1 / 6).
• Each BISTer consists of 3 TPGs, 2 CUTs and 1 ORA; 6 sessions are required.
• STAR moves by 2 columns.
• Very complex adaptive diagnosis phase.

Version of our 2 x 2 BISTer-1 w/ a 3-PLB TPG

[Figure: 2 x 3 BISTer-1 tiles in several positions; each tile holds the 2 x 2 block (B, A over C, D) plus a column of TPG PLBs (T) on its right or left.]

• # of TPG PLBs = ratio of inputs/outputs in a PLB => 3 TPGs for testing 3-input 1-output PLBs
• 2 x 3 BISTer-1: 3 TPGs, 2 CUTs & 1 ORA
• Basically two partially overlapped basic 2 x 2 BISTer-1s; 8 sessions are required.
• ROTE moves by 2 columns
• Result: Can diagnose up to 1 fault in every alternate column of a 2-row FPGA subarray; diagnosability is thus 1 / 4, approaching that of ideal BISTer-1s

Results: Fault Coverage v/s Fault Density

[Plot: Randomly distributed faults. Fault coverage (%) versus fault density (%), for fault densities of 1 to 40%; series: BISTer-1 and STAR BISTer.]

[Plot: Clustered faults with k = 0.5 in p = k/d. Fault coverage (%) versus fault density (%), at fault densities of 8.8%, 16.9% and 26.6%; series: BISTer-1 and STAR BISTer.]

The three values of fault density in the plot correspond to cluster densities of 1%, 2% and 3% respectively.

Results: Fault Latency v/s Fault Density

[Plot: Fault latency (x t_1) versus fault density (%), for fault densities of 1 to 40%; series: BISTer-1 and STAR BISTer.]

Conclusions

• Developed a 1-diagnosable (1 of 4) BISTer

• Developed (for the 1st time) a BISTer that is 2-diagnosable (2 of 6) with high probability

• Developed (for the 1st time) functional T&D: tests PLBs in only the 2 functions that they will perform; previous methods performed exhaustive testing

• Fast-TAD w/ BISTer-1 has the same diagnosability (1 of 4) for f-faults

• Our methods do not require adaptive diagnosis; previous techniques have complex adaptive diagnosis mechanisms

• Simulation results for Fast-TAD w/ BISTer-1: fault coverages of 96% & 92% at fault densities of 10% & 20%, respectively. The previous best STAR 3x2 BISTer (non-adaptive version): coverages of 74% & 46% at these densities

• Much lower fault latency of Fast-TAD w/ BISTer-1 compared to that of the STAR 3x2 BISTer

• Its high fault coverage at high fault densities and low fault latency should prove useful for testing and diagnosing emerging-technology FPGAs (<= 90 nm, nanotechnology) that are expected to have high fault densities

BISTer-2 architecture

[Figure: BISTer-2 tile of six PLBs (A to F) in session S1, with ORA outputs Y1 and Y2.]

PLB   S1    S2    S3    S4    S5    S6
A     TPG   OR2   TPG   CUT   OR1   CUT
B     CUT   TPG   OR2   TPG   CUT   OR1
C     OR1   CUT   TPG   OR2   TPG   CUT
D     CUT   OR1   CUT   TPG   OR2   TPG
E     TPG   CUT   OR1   CUT   TPG   OR2
F     OR2   TPG   CUT   OR1   CUT   TPG

OR1 => ORA 1 (Y1)
OR2 => ORA 2 (Y2)

