33
29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair Dynamic Voting Schemes to Enhance Evolutionary Repair in in Reconfigurable Logic Devices Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara C. Milliord, C. A. Sharma, and R. F. DeMara University of Central Florida University of Central Florida

29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Embed Size (px)

Citation preview

Page 1: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

29 September 2005 29 September 2005

Dynamic Voting Schemes to Enhance Dynamic Voting Schemes to Enhance Evolutionary Repair in Evolutionary Repair in

Reconfigurable Logic DevicesReconfigurable Logic Devices

C. Milliord, C. A. Sharma, and R. F. DeMara C. Milliord, C. A. Sharma, and R. F. DeMara University of Central FloridaUniversity of Central Florida

C. Milliord, C. A. Sharma, and R. F. DeMara C. Milliord, C. A. Sharma, and R. F. DeMara University of Central FloridaUniversity of Central Florida

Page 2: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Technical Objective:Autonomous FPGA Regeneration

Redundancy

increases with amount of spare capacity

restricted at design-time

based on time required to select spare resource

determined by adequacy of spares available (?)

yes

Regeneration

weakly-related to number

recovery capacity

variable at recovery-time

based on time required to find suitable recovery

affected by multiple characteristics (+ or -)

yes

Overhead from Unutilized Spares weight, size, power

Granularity of Fault Coverage resolution where fault handled

Fault-Resolution Latency availability via downtime required to handle fault

Quality of Repair likelihood and completeness

Autonomous Operation recover without outside intervention

Increased availability without pre-configured spares …

everyday example spare tire can of fix-a-flat

NASA Moon, Mars, and Beyond:

Realize 10’s years service life ???

Reconfiguration allows new fault-handling paradigm

Page 3: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Problem Statement

• FPGAs in Space Harsh conditions lead to faults in hardware

Radiation Extreme temperatures Mechanical stress Long Mission duration

• Experiment with several combinations of GAs and voting schemes Population of FPGA configurations that are physically distinct,

but functionally equivalent Voting involves 3 or more configurations, with a majority output

• Hypothesis The added space and computation associated with a voting

scheme is justified by a quicker and more complete repair

Page 4: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

EHW Environments

• Evolvable Hardware (EHW) Environments enable experimental methods to research soft computing intelligent search techniques

• EHW operates by repetitive reprogramming of real-world physical devices using an iterative refinement process:

Genetic

Algorithm

Hardware in the loop

orTwo

modes

of

Evolvabl

e

Hardwar

e

Extrinsic Evolution

Genetic

Algorithm

software modelDone? Build it

device “design-time”refinement

Simulation in the loop

Intrinsic Evolution

device “run-time”refinement

new approach to

Autonomous Repair

of failed devices

Stardust Satellite: • >100 FPGAs onboard• hostile environment: radiation, thermal stress• How to achieve reliability to avoid mission failure???

Application

Page 5: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Genetic Algorithms (GAs)

selection of

parents

population of candidate solutions

parents

offspring

crossover

mutation

evaluatefitness

ofindividuals

replacement

start

Fitnessfunction

Goal reached

• Initial population of configurations Functionally equivalent, Physically distinct

• Fitness level Based on number of correct outputs for all possible inputs

• Creating a new generation Mutation “100011101” -> “101011101” Crossover “101100” & “011110” -> “101110”

Page 6: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Previous Work

• [1] Re-routing scheme replaces faulty CLB Time-saving method with low overhead

• [2] TMR fault-detection On-line approach High overhead and power consumption

• [3] On-line technique using a BIST Limited power consumption Spare resources

• [4] GA repair of integer multiplier Voting system may not always outperform individual with the highest

fitness Initialized GA with copies of one hand-designed configuration

[1] Xu, J., Si, P., Huang, W., and Lombardi, F., “A novel fault tolerant approach for SRAM-based FPGAs”, Proceedings of the Pacific Rim Int’l Symposium, Dec. 1999, pp. 40-44.

[2] Li, Y., Li, D., and Wang, Z., “A new approach to detect-mitigate-correct radiation-induced faults for SRAM-based FPGAs in aerospace application”, Proceedings of the IEE National Aerospace and Electronics Conference, Oct. 2000, pp. 588-594.

[3] Abramovici, M., Emmert, J., and Stroud, C., “Roving STARs: an integrated approach to on-line testing, diagnosis, and fault tolerance for FPGAs in adaptive computing systems”, Proceedings of The Third NASA DoD Workshop, July 2001, pp. 73-92.

[4] Vigander, S., “Evolutionary fault repair of electronics in space applications”, Dissertation, University of Sussex, Brighton, UK, 2001.

Page 7: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Experimental Setups

• C++ program that simulates FPGA circuit design/repair Input files

GA parameters Logic function truth table

Input/Output pairs FPGA parameters Configuration properties of

perfect individuals Simulate repair in voting

experiments

Output files Configuration properties at

selected generations Data showing fitness level at

each generation Produce graphs

Avnet FPGA Development Board

PCI I nt er f ace

Virtex-IIPro FPGA

Off ChipRAM

Controlhosted on

PC

FP

GA

Ou

tp

ut

Bit file

Input Data

• Loosely Coupled (LC) Virtex System PC WorkStation running Xilinx EDK

and ISE with AVNET V2Pro PCI card

(SoC) version using PowerPC embedded in FPGA fabric now operational … results reported on previous environment

Page 8: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Experimental Inputs

• GA parameters Population size Offspring population size Mutation rate Tournament size (2) Maximum number of

generations• FPGA parameters

Number of inputs (6) Number of outputs (6) Number of CLBs Number of look-up tables

(LUTs) per CLB (SW only)

Number of LUT select lines (SW only)

I1

I2

I3

I4

I5

I6

O1

O2

O3

O4

O5

O6

0 0 0 0 0 0 0 0 0 0 0 0

0 1 0 0 0 0 0 0 0 0 0 0

0 1 0 0 0 1 0 0 0 0 1 0

0 1 0 0 1 0 0 0 0 1 0 0

0 1 0 0 1 1 0 0 0 1 1 0

0 1 0 1 0 0 0 0 1 0 0 0

0 1 0 1 0 1 0 0 1 0 1 0

0 1 0 1 1 0 0 0 1 1 0 0

0 1 0 1 1 1 0 0 1 1 1 0

0 1 1 0 0 0 0 0 0 0 0 0Ideal Fitness = 60

Page 9: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Experiment #1

• Circuit evolution – no repair• Maximize GA performance before voting

(tweak parameters) Used 200 for max number of generations Varied the mutation rate from .001 to .097 with a step

of .004 Population sizes of 15, 40, and 50 6, 9, 12, 16, and 36 for number of CLBs

• Evolve several perfect configurations repeated the most successful runs for 1000

generations

Page 10: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

FPGA Genetic Representations

• Chromosome Goals: Allow all possible LUT configurations Allow all possible CLB interconnections given constraints of routing support Disallow illegal FPGA configurations and non-coding introns (junk DNA) Facilitate crossover operator

• Bitstring representation is natural choice, though may not scale well (investigating generative reps)

• Representation shown here is sample specific to Xilinx Virtex FPGA

LUT 0 BITS

R-CLB = REMOTE CLB

R-LUTR-CLB

R-LUT = REMOTE LUT

R-LUTR-CLBLUT 0 INPUTS

R-LUTR-CLB R-LUTR-CLBLUT 3 INPUTS

LUT 3 BITS CLB 0 CLB 1

CLB 0

LUT0

LUT1

LUT2

LUT3

CLB 1 CLB n

LUT

0

LUT1

LUT2

LUT3

LUT0

LUT1

LUT2

LUT3

Page 11: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Generations = 200, pop size = 50,

CLBs = 9

MR F MR F MR F MR F MR F

.001 55 .021 54 .041 53 .061 55 .081 53

.005 57 .025 53 .045 52 .065 55 .085 52

.009 54 .029 54 .049 54 .069 54 .089 59

.013 54 .033 56 .053 52 .073 53 .093 55

.017 54 .037 53 .057 52 .077 54 .097 53

Experiment #1 Results

Page 12: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Perfect Individuals

• Parameters used in evolving perfect individuals (fitness of 60) Maximum Number of Generations: 1000 Mutation Rate: .002 Population Size: 50 Number of CLBs: 9

• These create a diverse initial population for TMR style voting in Experiment #2

Page 13: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

…Perfect Individuals

Config. Generations AND OR NOR XOR NAND

1 150 11 8 4 5 8

2 382 8 10 3 8 7

3 473 13 8 5 7 3

4 582 7 6 5 10 8

5 881 8 7 10 6 5

Avg. 493.6 9.4 7.8 5.4 7.2 6.2

Page 14: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Three-plex Experiments

• Six injected stuck-at faults on LUT inputs Resulting fitness of perfect individuals: 38, 40, 47

• Parameters Number of Generations: 400 Mutation Rate: .089 Population Size: 50 Number of CLBs: 9

Page 15: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Experiment #2

• Simulating repair• Implement voting schemes

Injected stuck-at faults Implemented 3-plex and 5-plex voting schemes Chose GA/FPGA parameters according to Experiment #1 For each voting run, graphed the fitness of best fit individual vs.

number of generations for voting elements and system Repeated 3-plex experiment with a single element (no voting) for

3X number of generations

GA #1

Configuration

FPGA Input Data

Voter

FPGA Output Data

Output OutputOutput

GA #2

ConfigurationGA #3

Configuration

Page 16: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Partial Repair: Max Fitness = 58 at generation 68

47

49

51

53

55

57

59

1 51 101 151 201 251 301 351

Generation

Fitn

ess o

f b

est in

div

idu

al

GA #1 GA #2 GA #3 Voting Result

Three-plex Voting Results

Page 17: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Three-plex Voting Results

Complete Repair achieved at generation 302

47

49

51

53

55

57

59

1 51 101 151 201 251 301 351

Generation

Fitn

ess o

f b

est in

div

idu

al

GA #1 GA #2 GA #3 Voting Result

Page 18: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Three-plex Voting Results

Complete Repair at generation 33

47

49

51

53

55

57

59

1 51 101 151 201 251 301 351

Generation

Fitn

ess o

f b

est in

div

idu

al

GA #1 GA #2 GA #3 Voting Result

Page 19: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Three-plex Voting Results

Perfect fitness is temporarily reached at generation 17

47

49

51

53

55

57

59

1 51 101 151 201 251 301 351

Generation

Fitn

ess o

f b

est in

div

idu

al

GA #1 GA #2 GA #3 Voting Result

Page 20: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Three-plex Voting Summary

Rank

Highest Voting Fitness

Reached

Earliest Generation of Highest

Fitness

GA #1 (voting

fitness/final fitness)

GA #2 (voting

fitness/final fitness)

GA #3 (voting

fitness/final fitness)

Final Vote Fitness

(Generation 400)

10 56 2 56/56 48/54 56/56 56

9 57 2 48/54 55/55 52/55 56

8 58 68 56/57 55/56 55/57 58

7 60 302 55/55 56/56 58/58 60

6 60 261 58/58 56/56 58/58 60

5 60 179 57/57 56/56 55/55 60

4 60 33 51/56 53/58 59/59 60

3 60 17 58/58 56/56 52/54 59

2 60 3 51/56 52/56 54/56 60

1 60 2 55/55 52/54 56/56 59

Page 21: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Compare: Single GA Run

• 1200 generations Total GA computation equivalent to a 3-plex run for

400 generations

• 3 runs Max fitness of 56 at 934 generations Max fitness of 56 at 852 generations Max fitness of 57 at 274 generations

• N-plex Voting advantageous Improved the likelihood of obtaining a complete

repair significantly with fewer total number of circuit evaluations

n x gv << go

for n-plex voting with gv voting generations vs. go

evolutionary generations without voting

Page 22: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Experiment #3: 5-plex

• Six injected stuck-at faults on LUT inputs Resulting fitness of perfect individuals: 38, 40, 47

• Parameters Number of Generations: 300 Mutation Rate: .089 Population Size: 50 Number of CLBs: 9

Page 23: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Five-plex Voting Results

Complete Repair at generation 48

47

49

51

53

55

57

59

1 51 101 151 201 251

Generation

Fitn

ess o

f b

est

ind

ivid

ua

l

GA #1 GA #2 GA #3 GA #4 GA #5 Voting Result

Page 24: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Five-plex Voting Results

Complete Repair fitness at generation 34

47

49

51

53

55

57

59

1 51 101 151 201 251

Generation

Fitn

ess o

f b

est

ind

ivid

ua

l

GA #1 GA #2 GA #3 GA #4 GA #5 Voting Result

Page 25: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Five-plex Voting Results

Perfect fitness at generation 2

47

49

51

53

55

57

59

1 51 101 151 201 251

Generation

Fitn

ess o

f b

est

ind

ivid

ua

l

GA #1 GA #2 GA #3 GA #4 GA #5 Voting Result

Page 26: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Five-plex Voting Summary

Rank

Highest Voting Fitness

Reached

Earliest Generation of Highest

Fitness

GA #1 (voting

fitness/final fitness)

GA #2 (voting

fitness/final fitness)

GA #3 (voting

fitness/final fitness)

GA #4 (voting

fitness/final fitness)

GA #5 (voting

fitness/final fitness)

Final Vote Fitness

(Generation 300)

10 59 68 55/58 54/55 55/55 54/55 53/55 57

9 60 154 54/56 56/56 52/52 56/58 55/55 60

8 60 108 56/56 54/54 55/55 55/55 53/53 60

7 60 55 53/55 56/56 57/57 56/56 56/56 60

6 60 48 55/55 56/56 53/56 55/57 59/59 60

5 60 34 56/56 55/58 51/55 57/57 55/55 60

4 60 27 56/56 53/55 52/57 55/55 52/56 58

3 60 4 56/56 53/55 51/56 56/56 52/56 60

2 60 3 49/54 51/55 55/55 53/57 52/56 55

1 60 2 50/54 56/56 54/54 50/55 54/56 60

Page 27: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

3-plex vs. 5-plex

• 3-plex scheme 7 out of 10 runs reached perfect fitness Average of 113.86 generations to do so 5 out of 10 runs exhibited perfect fitness upon

completion (400 generations)

• 5-plex scheme 9 out of 10 reached perfect fitness Average of 48.33 generations needed 7 out of 10 exhibited perfect fitness at completion

(300 generations)

Page 28: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Conclusion

• Autonomous FPGA Repair Strategy combining dynamic redundancy with online evolution

• TMR Style Voting beneficial in presence of partial refurbishment Complete repair can be quickly obtained with three/five

imperfectly repaired individuals• Improvement of fitness in an individual GA can

outperform voting fitness• Stabilization of a complete repair is more

important than how quickly it is achieved In all six runs where a perfect fitness was obtained after 50

generations, the fitness was maintained Only 5 of 10 runs which obtained a perfect fitness before 50

generations maintained that fitness for remainder of run

Page 29: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Development Board to Self-Contained FPGA

Year 1 Year 3Year 2

CRR on a Chip(Xilinx Virtex-II Pro)

Control viaon-chip

Power PC

Re-config

Config

Data

Configurationsin On ChipRAM Blocks

FunctionalCLBs

ICAP

Bit file

Data

Output

Request

Avnet FPGA Development Board

PCI Interface

Virtex-IIPro FPGA

Off ChipRAM

Controlhosted on

PCOutput

Bit file

Input Data

CRR on a Chip(Xilinx Virtex-II Pro)

Device Fault

Qualitative Analysis of CRR modelQualitative Analysis of CRR model• Number of iterations and completeness of regeneration repair • Percentage of time the device remains online despite physical resource

fault (availability)Hardware Resource ManagementHardware Resource Management

• Optimization of hardware profile for Xilinx Virtex II ProField Testing on SRAM-based FPGA in a Cubesat missionField Testing on SRAM-based FPGA in a Cubesat mission

Page 30: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

For further info … EH Websitehttp://cal.ucf.edu

Page 31: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Backup Slides

• On following pages …

Page 32: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Approach Online Recovery

Basis for Recovery

Test Vectors

Availability Externally-supplied Elements

Resource Recycling

Pre-determined

Limits

Power Consumption

TMR with Jiggling [Garvie,

Thompson]

Yes

Requires 2 datapaths

are operational

Pseudo-Exhaustive

100% for single fault,

0% thereafter 2 of 3 Majority Voter Yes Single

datapath

3n+v

[Vigander01] No Design complexity

Exhaustive Non-deterministic

GA Controller, function test vectors

Yes None 3n+v+r

[Lohn, Larchev, DeMara03]

No Design complexity

Pseudo-Exhaustive Functional

Test

Non-deterministic

GA Controller, function test vectors

Yes None 2n+r

[Lach98] No Available spares

Not Addressed

Either cmplete or

none

Device test vectors and controller

No Only one

faulty CLB per tile

2n+r

STARS

[Abramovici01] Yes Available

spares

Exhaustive Resource

Test

Only ~93% regardless of

fault occurrence

Test Reconfiguration Controller + device

test vectors Yes

Available spares within

routing chokepoints

s • (c+r)

[Keymeulen, Stoica,

Zebulum00] No

Depends on characteristics at design

time

Exhaustive during or

after evolution

Non-deterministic

None at runtime No Depends on redundancy

during design n • (1 + f(g))

Competitive Runtime

Reconfiguration (CRR)

[DeMara05]

Yes Recovery complexity

None Adaptable

Optional RAM … RAM coverage is

intrinsic

No test vectors

Yes None 2n+r

Fault Recovery Characteristics of Selected ApproachesFault Recovery Characteristics of Selected Approaches

Previous Work on Fault Recovery

Normalized Power Consumption (Energy per Operation):

n-plex solution using n redundant devices

Reconfiguration cost r

Gate-Level redundancy g

Updated with scan rate s

on c CLBs

Page 33: 29 September 2005 Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices C. Milliord, C. A. Sharma, and R. F. DeMara University

Previous Work - Tool LevelPrevious Work - Tool Level

ApproachFPGA

SupportedOn-chip System

Bit Stream Reuse

System Coupling Degree

Potential Limitations

Moraes,

Mesquita,

Palma, Moller

Virtex XCV300 devices

No N LooseLack of Area

Relocation Capability

Raghavan, Sutton

Xilinx Virtex

devicesNo N Loose

Cumbersome CAD flow

Blodget, McMillan

Virtex II devices

Partial Y Medium

Limited hardware speed and capacity. Lack of

information for bit stream

reuse