Architectures and Diagnosis Methods for Self Repairing Logic

Page 1: Architectures and Diagnosis Methods for Self Repairing Logic

Lehrstuhl Technische Informatik - Computer Engineering

Brandenburgische Technische Universität Cottbus

Architectures and Diagnosis Methods for Self Repairing Logic

H. T. Vierhaus

BTU Cottbus

Computer Engineering

Page 2: Architectures and Diagnosis Methods for Self Repairing Logic

Outline

1. Parameters for Self Repair Functions
2. Self Repair Based on FPGAs
3. PLAs and CPLDs
4. Duplication and Switched Logic Blocks
5. Fault Diagnosis and Fault Administration
6. Test and Fault Diagnosis
7. Some Parameters in Comparison
8. Summary and Conclusions

Page 3: Architectures and Diagnosis Methods for Self Repairing Logic

Basic Parameters for BISR (Built-In Self Repair)

Fault densities that can be managed

Overhead (chip area, time, dissipated power)

Types of faults that can / cannot be repaired

Compatibility with standard CMOS processes

Applicability to BISR in a production test environment or in the field of application

Page 4: Architectures and Diagnosis Methods for Self Repairing Logic

Repair Granularity and Fault Density

[Figure: expected fault density (1 out of ...) versus repair granularity in transistors, on an axis from 10^0 to 10^6 (transistor, gate, RT-macro / FPGA block, cores, CPU). Three regimes are marked: logic / gate-level replacement (hardly explored for logic), block replacement (ALU etc.) and macro replacement (CPU etc.).]

Page 5: Architectures and Diagnosis Methods for Self Repairing Logic

Repair Overhead versus Element Loss

[Figure: repair procedure overhead and functioning elements lost, plotted against the size of the replaced blocks (granularity, 1 to 10M). Marked regions: prohibitive overhead, prohibitive fault density, and new methods and architectures.]

Page 6: Architectures and Diagnosis Methods for Self Repairing Logic

Block Structure of FPGAs

[Figure: matrix of CLBs (four per row) connected by programmable interconnects. Marked: functionally used CLBs, the row containing a faulty CLB, and a backup row.]

Page 7: Architectures and Diagnosis Methods for Self Repairing Logic

FPGA Experiences

FPGA repair schemes that discard a whole row or column of CLBs are simple to implement but inefficient, as they lose many functional CLBs for a single fault.

FPGA schemes that reserve single CLBs in the matrix for backup and repair by single-CLB replacement are much more difficult to implement because of the irregular re-wiring they require.
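
The difference between the two schemes can be made concrete with a small count. The following Python sketch is an illustration only: the function name, the (row, col) coordinates and the two scheme labels are assumptions made here, not part of the slides. It counts how many fault-free CLBs are taken out of service when whole rows are discarded, compared with single-CLB replacement.

```python
def functional_clbs_lost(rows, cols, faulty, scheme):
    """Count fault-free CLBs taken out of service by a repair scheme.

    faulty is a set of (row, col) positions of defective CLBs.
    scheme is "row" (discard every row containing a fault) or
    "single" (replace only the defective CLBs themselves).
    """
    for r, c in faulty:
        if not (0 <= r < rows and 0 <= c < cols):
            raise ValueError(f"CLB position {(r, c)} is outside the matrix")
    if scheme == "row":
        discarded_rows = {r for r, _ in faulty}
        # every CLB in a discarded row is lost; subtract the defective ones
        return len(discarded_rows) * cols - len(faulty)
    if scheme == "single":
        return 0
    raise ValueError(f"unknown scheme: {scheme}")


lost_by_row = functional_clbs_lost(4, 4, {(1, 2)}, "row")
lost_by_single = functional_clbs_lost(4, 4, {(1, 2)}, "single")
print(lost_by_row, lost_by_single)
```

The count ignores the wiring effort, which is exactly where the single-CLB scheme pays for its better utilisation.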

Page 8: Architectures and Diagnosis Methods for Self Repairing Logic

FPGA with Irregular Repair Scheme

[Figure: matrix of CLBs (four per row) connected by programmable interconnects. Marked: functionally used CLBs, the row containing a faulty CLB, and a reserved backup block that is used for replacement.]

Page 9: Architectures and Diagnosis Methods for Self Repairing Logic

BISR by Standard FPGAs?

Configurable logic blocks (CLBs) are rather large (5000 to 10 000 transistors, estimated).

FPGAs are heterogeneous by nature:
- memory-like lookup tables
- logic elements (selectors, decoders, flip-flops, embedded arithmetic units)
- local and global programmable interconnects with additional elements for programmability
- embedded CPUs

For fault densities worse than about 1 in 10 000 (one fault per fewer than 10 000 transistors), repair must go into the CLBs or slices!
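
A rough calculation shows why CLB-level replacement stops working at such densities. The sketch below is an illustration under a simplifying assumption that is not in the slides: faults hit transistors independently and with equal probability. The function name is hypothetical.

```python
def block_fault_probability(transistors, fault_density):
    """Probability that a block contains at least one faulty transistor.

    Assumes independent faults with the same probability per transistor,
    which is a simplification made for this sketch.
    """
    return 1.0 - (1.0 - fault_density) ** transistors


density = 1.0 / 10_000                              # one faulty transistor in 10 000
p_clb = block_fault_probability(10_000, density)    # a large CLB
p_small = block_fault_probability(100, density)     # a 100-transistor element
print(f"CLB: {p_clb:.2f}, small element: {p_small:.4f}")
```

Under these assumptions well over half of all 10 000-transistor CLBs would contain a fault, while an element of 100 transistors is affected in only about one case in a hundred.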

Page 10: Architectures and Diagnosis Methods for Self Repairing Logic

Structure of a CLB Slice

[Figure: CLB slice with logic inputs, a program input, a logic field with a redundant row, multiplexers (MUX), flip-flops (FF) with FF inputs, SRAM cells, and logic outputs.]

Page 11: Architectures and Diagnosis Methods for Self Repairing Logic

Look-up-Table

[Figure: look-up table with input decoder, programming lines and output select. A faulty cell and its backup cell are marked.]

Page 12: Architectures and Diagnosis Methods for Self Repairing Logic

Self Repair within FPGA Basic Blocks

Heterogeneous repair strategies required (memory, logic)

Logic blocks may use methods known from memory BISR

Additional repair strategies are necessary for logic elements

The basic overhead of FPGAs versus standard logic (a factor of about 10) increases further.

Repair strategies for logic may use some features already used in FPGAs (e.g. switched interconnects).

Page 13: Architectures and Diagnosis Methods for Self Repairing Logic

Flip-Flop Backup Scheme

[Figure: lookup tables LUT 1 to LUT n with logic and storage inputs, followed by a selector block with flip-flops FF1 to FFk and one spare flip-flop (FF backup). Further labels: MUX, Out Sel, ID, S, out.]

Page 14: Architectures and Diagnosis Methods for Self Repairing Logic

PLA-like Structures

[Figure: PLA layout with inputs A, B, C, an AND-array, an OR-array, VDD and outputs Z1 to Z4. Layer legend: metal, poly-Si, n-diffusion, contact. Back-up elements are marked.]

Page 15: Architectures and Diagnosis Methods for Self Repairing Logic

PLA Repair Scheme

[Figure: PLA layout with inputs A, B, C, an AND-array, an OR-array, VDD, outputs Z1 to Z4 and back-up elements, extended by switching units. Layer legend: metal, poly-Si, n-diffusion, contact.]

Specific programming of cross points!

Page 16: Architectures and Diagnosis Methods for Self Repairing Logic

FPGA / CPLD Repair

Looks pretty easy at first glance because of the regular architecture

Requires lines / columns of switches for configuration at the inputs and between the AND / OR matrices

Requires additional programmability of cross-points by double-gate transistors, as in EEPROMs or Flash memory

Not fully compatible with standard CMOS

Limited number of (re-) configurations

Floating gate (FAMOS) transistors are fault-sensitive !

Page 17: Architectures and Diagnosis Methods for Self Repairing Logic

Double-Gate Transistors

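[Figure: cross-section of a double-gate transistor on a p-substrate, showing the isolated (floating) gate, the control gate, the tunnel oxide and the select gate.]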

Page 18: Architectures and Diagnosis Methods for Self Repairing Logic

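Cell Duplication

[Figure: two identical gates (Gate 1, Gate 2) with inputs in1 and in2. Each gate has its own supply (VDD1, VDD2), derived from VDD through a VDD switch with switch control, its own GND connection and its own output (Out 1, Out 2). An output control produces the common output Out.]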

Page 19: Architectures and Diagnosis Methods for Self Repairing Logic

Cell Duplication

Simple scheme involving a VDD off / on switch

Inherent duplication of effort

VDD separation of faulty cells

Extra effort is necessary for output isolation of faulty cells.

Input isolation (input gate shorts) is not easily possible.

Relatively large overhead for managing repair states and redundancy (re-)organisation.

Fully CMOS compatible

Page 20: Architectures and Diagnosis Methods for Self Repairing Logic

Block Organization in Random Logic

[Figure: block of four logic cells plus one backup cell, each logic cell with its own input and output. An input switch (In Sw) and an output switch (Out Sw) connect the backup cell.]

Page 21: Architectures and Diagnosis Methods for Self Repairing Logic

Logic Cluster Architecture

A number of equal-type logic gates make up a cluster.

The cluster contains one or more spare gates.

A spare gate may replace a normal device; the modification is done via sets of input / output selectors / de-selectors.

Problems:

An input gate short of a "normal" device is not fully isolated.

For n gates alternatively mapped to a single backup device, there are (n+1) control states.

Switching elements are complex and not fault tolerant by themselves.
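
The (n+1) control states can be enumerated directly: one state for "no repair" plus one state per gate that the spare may stand in for. The sketch below is only an illustration; the function names and the binary encoding of the state register are assumptions, not something the slides specify.

```python
def cluster_control_states(n_gates):
    """List the control states of a cluster with one spare gate.

    None means "no repair active"; an integer i means "gate i is
    replaced by the spare".
    """
    if n_gates < 1:
        raise ValueError("a cluster needs at least one gate")
    return [None] + list(range(n_gates))


def state_register_bits(n_gates):
    """Bits needed to encode the n + 1 control states in binary."""
    if n_gates < 1:
        raise ValueError("a cluster needs at least one gate")
    # n.bit_length() equals ceil(log2(n + 1)) for n >= 1
    return n_gates.bit_length()


states = cluster_control_states(4)
bits = state_register_bits(4)
print(len(states), bits)
```

Every cluster therefore carries its own small state register and decoding logic, which is part of the administrative overhead the slide refers to.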

Page 22: Architectures and Diagnosis Methods for Self Repairing Logic

Modified Cluster Architecture

[Figure: four logic cells and one backup logic cell placed between an input switch and an output switch, both driven by a select control.]

Can possibly isolate a specific gate, but still requires a lot of administrative overhead.

Page 23: Architectures and Diagnosis Methods for Self Repairing Logic

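Reconfiguration by Permutation Schemes

[Figure: eight blocks, six functional units (FU) and two backup units (FUB), placed between a switch stage at the inputs and a switch stage at the outputs. Each 2-way switch has two states (state 0, state 1). The inputs / outputs of a "faulty" unit are grounded.]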

Page 24: Architectures and Diagnosis Methods for Self Repairing Logic

Specific Features

Only 4 logic states for permutation in a cluster of 8 logic blocks, including 2 for backup.

All single failed blocks plus some double failures can be compensated.

Failed components are isolated and their inputs / outputs grounded.

Input gate shorts can be handled.

Internal blocks may have different complexities depending on the anticipated fault density.

Simple switching devices, fully CMOS compatible.

Fault-tolerant switching devices need extra effort!
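
The claim "all single failures plus some double failures" can be checked on a toy model. The slides do not say which blocks are switched out in which state, so the pairing used below (state s isolates blocks s and s + 4) is purely an assumption for illustration, as are all names in the code.

```python
from itertools import combinations

N_BLOCKS = 8   # 6 functional units plus 2 backup units
N_STATES = 4   # permutation states of the cluster


def isolated_blocks(state):
    """Blocks switched out (and grounded) in a given state.

    Hypothetical pairing: state s takes blocks s and s + 4 out of service.
    """
    return {state, state + N_STATES}


def find_repair_state(failed):
    """Return a state that isolates every failed block, or None."""
    for state in range(N_STATES):
        if set(failed) <= isolated_blocks(state):
            return state
    return None


single_ok = all(find_repair_state({b}) is not None for b in range(N_BLOCKS))
double_ok = [pair for pair in combinations(range(N_BLOCKS), 2)
             if find_repair_state(pair) is not None]
print(single_ok, len(double_ok))
```

With this particular pairing every single failure is covered, but only those double failures whose two blocks happen to be isolated by the same state. A different switch topology would cover a different subset.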

Page 25: Architectures and Diagnosis Methods for Self Repairing Logic

Fault Tolerant Switch

[Figure: fault-tolerant switch built from four elementary switches (s) arranged two by two between in and out.]

Switching elements can be made fault-tolerant by themselves, both for on- and off-type faults!

... but at the cost of extra delays!
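
One common way to obtain such a switch is a series-parallel quad: two parallel branches, each made of two switches in series. The slide only shows four switches between in and out, so this exact topology is an assumption. The sketch below checks exhaustively that the quad still follows its control signal under any single stuck-on or stuck-off fault.

```python
def quad_conducts(control, stuck=None):
    """Conduction of a series-parallel switch quad.

    control is the intended state of all four switches (True = closed).
    stuck maps a switch index (0..3) to a forced state, modelling a
    stuck-on (True) or stuck-off (False) fault.
    """
    stuck = stuck or {}
    s = [stuck.get(i, control) for i in range(4)]
    # branch A: switches 0 and 1 in series; branch B: switches 2 and 3
    return (s[0] and s[1]) or (s[2] and s[3])


def tolerates_all_single_faults():
    """True if every single stuck fault leaves the quad behaviour intact."""
    for index in range(4):
        for forced in (True, False):
            for control in (True, False):
                if quad_conducts(control, {index: forced}) != control:
                    return False
    return True


print(tolerates_all_single_faults())
```

The signal always passes through two switches in series, which is where the extra delay comes from. Two stuck-on faults in the same branch are not tolerated by this arrangement.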

Page 26: Architectures and Diagnosis Methods for Self Repairing Logic

Test, Diagnosis, Fault Administration

For self repair in the field of application, fault diagnosis must identify faulty elements that can be replaced.

The granularity of fault diagnosis therefore depends on the granularity of replacement (gates, RT-elements, CPUs).

Conventional fault diagnosis in scan-based test is limited to the respective position in a scan chain.

As scan chains are often allocated in a random manner without a strict reference to RT-level architectures, diagnosis methods used with production test are not a real solution.

A system that has redundant elements and self-repair functions must restore a "working" status after power-off periods by:
- storage and re-assembly of the previous status of repair, or by
- self test, fault diagnosis and re-configuration after start-up.

Page 27: Architectures and Diagnosis Methods for Self Repairing Logic

Test and Redundancy Administration

[Figure: system with BISR capability, consisting of a CPU, redundant elements, a repair status memory and a status control.]

... makes a significant overhead beyond redundancy provision!

Page 28: Architectures and Diagnosis Methods for Self Repairing Logic

Test and Diagnostic Resolution (1)

[Figure: network of gates G1 to G13 between scan-in and scan-out.]

Scan test can only identify the faulty scan-out location!

Page 29: Architectures and Diagnosis Methods for Self Repairing Logic

Test and Diagnostic Resolution (2)

[Figure: network of gates G1 to G13 between scan-in and scan-out, with one fault marked as non-resolvable.]

Scan test can only identify the faulty scan-out location!

Further resolution by multiple test patterns!

Page 30: Architectures and Diagnosis Methods for Self Repairing Logic

Production Test with Diagnosis

[Figure: flow with the elements Test, Fault Detection, Diagnosis (scan-path no., bit no.), Fault simulation, Layout and Chip Analysis, divided into an on-line and an off-line part.]

... is not available in the field!

Page 31: Architectures and Diagnosis Methods for Self Repairing Logic

Diagnosis by „Tentative“ Repair

[Figure: flow chart with the elements Start, Test Process (fed from a test pattern list), Fault Detection? (yes / no), Diagnosis (with a diagnosis / fault list), Single-Repair-Function (tentative repair, with repair annotation), New Test Vector, Test complete? (yes / no), Repair-Process (final repair), and test & status monitoring.]

Page 32: Architectures and Diagnosis Methods for Self Repairing Logic

Tentative Repair

Switch-off of faulty elements and power separation are often done by "fuses".

Once a fuse is blown, it cannot be restored!

Reconfiguration schemes based on "fuse" or "antifuse" switching elements cannot be used in conjunction with "tentative repair".
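
The reason tentative repair needs reversible switches becomes obvious when the procedure is written as a loop: every wrong guess has to be undone before the next one is tried. The sketch below is a minimal illustration, not the procedure from the slides; `tentative_repair`, the callback names and the `ToyCluster` stand-in for real hardware are all hypothetical.

```python
def tentative_repair(candidates, apply_repair, undo_repair, run_test):
    """Try candidate replacements one by one until the test passes.

    Returns the candidate whose replacement made the test pass (this
    becomes the final repair), or None if no candidate helped.
    """
    for unit in candidates:
        apply_repair(unit)        # tentative: must be reversible
        if run_test():
            return unit           # keep this configuration as final
        undo_repair(unit)         # wrong guess, restore and continue
    return None


class ToyCluster:
    """Stand-in for real hardware: exactly one unit is faulty."""

    def __init__(self, faulty_unit):
        self.faulty_unit = faulty_unit
        self.replaced = None

    def apply_repair(self, unit):
        self.replaced = unit

    def undo_repair(self, unit):
        self.replaced = None

    def run_test(self):
        return self.replaced == self.faulty_unit


cluster = ToyCluster(faulty_unit=2)
repaired = tentative_repair(range(4), cluster.apply_repair,
                            cluster.undo_repair, cluster.run_test)
print(repaired)
```

With fuse-based switching, `undo_repair` cannot be implemented, so the first wrong guess would permanently consume a spare.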

Page 33: Architectures and Diagnosis Methods for Self Repairing Logic

Enhanced Logic Cluster

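[Figure: cluster of six functional units (FU) and two backup units (FUB) between a switch stage at the inputs and a switch stage at the outputs, extended by scan elements on a path from Scan in to Scan out.]

Extra scan-outs at extra blocks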

Page 34: Architectures and Diagnosis Methods for Self Repairing Logic

Diagnostic Test

In a bundle of 8 blocks with 2 extra outputs:

By going through the 4 logic states of (re-)configuration, each block is connected once to the "spare" inputs and outputs.

If a test pattern is applied to 4 units of the same type by going through the 4 states, the faulty unit can be identified.

The "false" output detection can be used locally to set a status of re-configuration.

With multiple units of the same type tested in parallel, time and overhead are reasonable.

If tests are short and reliable, an initial test process can be performed after every power-down. Keeping configurations in a memory is then not necessary.
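
The state-stepping diagnosis can be sketched in a few lines. As before, the assignment of blocks to states (state s routes blocks s and s + 4 to the two spare outputs) is an assumption made for this example, and a single output value per block stands in for a full test response.

```python
N_STATES = 4   # (re-)configuration states of the cluster


def spare_connected_blocks(state):
    """Blocks routed to the two spare outputs in a given state.

    Hypothetical assignment: state s exposes blocks s and s + 4.
    """
    return (state, state + N_STATES)


def diagnose(block_outputs, reference_output):
    """Step through all states and collect blocks with a false output."""
    faulty = []
    for state in range(N_STATES):
        for block in spare_connected_blocks(state):
            # compare the exposed block's response with the reference
            if block_outputs[block] != reference_output:
                faulty.append(block)
    return sorted(faulty)


outputs = [1, 1, 1, 0, 1, 1, 1, 1]   # block 3 answers wrongly
faulty_blocks = diagnose(outputs, reference_output=1)
print(faulty_blocks)
```

Because each block is exposed in exactly one state, the state in which the mismatch appears already identifies the block, and that state can be latched locally as the repair configuration.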

Page 35: Architectures and Diagnosis Methods for Self Repairing Logic

Local Test and Reconfiguration

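[Figure: cluster of six functional units (FU) and two backup units (FUB) between input and output switch stages, with scan elements on a path from Scan in to Scan out. A comparison element (+) checks an output against a reference output (Ref. out); its fault-detect signal feeds a switch control FSM.]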

Page 36: Architectures and Diagnosis Methods for Self Repairing Logic

Integrated Test & Repair

[Figure: six blocks labelled "logic R", each with its own BIST & Repair unit. A global control issues BIST start and monitors "repair resources exhausted" conditions.]

Page 37: Architectures and Diagnosis Methods for Self Repairing Logic

Comparison

                              FPGAs     PLAs      cell duplication   cell cluster
Overhead for replacement      10-20%    20-30%    100%               10-20%
Overhead for organisation     high      medium    20-30%             30-50%
Repair by single method       no        no        yes                yes
Reconfiguration non-volatile  no        partial   no                 no
CMOS-logic compatible         yes       no        yes                yes
Gate-short repair             yes *     yes       difficult          yes
Diagnosis by trial / error    -         +         ++                 +
VDD separation                no        no        yes                no
In-field repair               +         +         ++                 ++

*

Page 38: Architectures and Diagnosis Methods for Self Repairing Logic

Summary

Several types of logic (FPGAs, CPLDs) require an inhomogeneous replacement process based on different types of redundant elements.

Repair schemes that need special devices (e.g. floating-gate transistors) are not attractive.

Schemes that provide a high level of fault isolation for short-type faults are most attractive.

Architectures that also provide excellent local (self-)test coupled to locally organized self repair are possible.