Expediting GA-Based Evolution Using Group Testing Techniques for Reconfigurable Hardware¹
Rashad S. Oreifej, Carthik A. Sharma, and Ronald F. DeMara
University of Central Florida
ReConFig’06, San Luis Potosi, Mexico
1. Research support in-part by NSF grant CRCD: 0203446
Evolvable Hardware
Evolutionary Design:
• Start with available CLBs and IOBs
• Implement a design using Genetic Operators etc. [Fogarty97]
• Limited or no ability to re-design to account for suspected faulty resources
Evolutionary Regeneration:
• Start with an existing pool of designs
• Some existing configurations may use faulty resources
• Eliminate use of suspected faulty resources
• Genetic Operators can be applied to refurbish designs [Vigander01]
Previous Work
• Pre-compiled Column-Based Dual FPGA architecture [Mitra04]
  - Autonomous detection, repair by shifting pre-compiled columns
  - Isolation using distributed CED-checkers and “blind” reconfiguration attempts
• Overview of Combinatorial Group Testing and Applications [Du00]
  - Provides taxonomy and general algorithms for applying CGT
  - Examples of CGT applications: DNA clone library filtering, vaccine screening, computer fault diagnosis, etc.
• CGT Enhanced Circuit Diagnosis [Kahng04]
  - Presents doubling, halving, etc. for circuit fault diagnosis using BIST, CGT
  - Requires ability to test resources individually
• Chinese Remainder Sieve technique [Eppstein05]
  - Efficient non-adaptive and two-stage CGT based on prime number driven test formation
  - Improved algorithms for practical problem sizes (n < 1080) with small number of defectives (d < 4)
Genetic Algorithms & Evolvable Hardware
GAs are strong candidates for implementing system refurbishment:
• They implement guided trial-and-error search using principles of Darwinian evolution
• Iterative selection enforces “survival of the fittest”
• Genetic operators (mutation, crossover, …) can be used to refurbish designs
Hypothesis: Information regarding resource performance can expedite GA-based refurbishment
[Figure: an Individual (Chromosome) made up of GENEs]
GAs frequently use strings of 1s and 0s to represent candidate solutions
• FPGA Configuration File is a String of 1s and 0s
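To make the bit-string representation concrete, here is a minimal sketch of a GA over strings of 1s and 0s with mutation, single-point crossover, and truncation selection. It is not the simulator described in these slides: the toy fitness function, `TARGET`, and all parameter values are assumptions chosen for illustration.

```python
import random

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def crossover(a, b, rng):
    """Single-point crossover: head of `a` joined to tail of `b`."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def evolve(fitness, length, pop_size=20, rate=0.02, max_gens=2000, seed=0):
    """Evolve bit strings until one reaches the maximum fitness `length`."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for gen in range(max_gens):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == length:
            return pop[0], gen
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    pop.sort(key=fitness, reverse=True)
    return pop[0], max_gens

# Hypothetical toy fitness: number of bits that match a fixed target pattern.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1]

def match_fitness(bits):
    return sum(int(x == y) for x, y in zip(bits, TARGET))

best, generations = evolve(match_fitness, len(TARGET))
print(best, generations)
```

In an evolvable-hardware setting the chromosome would encode the FPGA configuration and the fitness would come from evaluating the configured circuit; the toy target string only stands in for that evaluation.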
Conventional vs. CGT-Pruned GA
• Conventional GA:
  - Searches the whole space to evolve a working design or repair
  - Information about resource suitability may accelerate search
• CGT-Pruned GA:
  - Prefers resources of higher fitness to evolve a working design or repair
Q. How to obtain resource fitness information?
A. Using Group Testing Techniques.
• Combinatorial Group Testing identifies a decreasing group of “defectives” by iterative refinement
  - Tests on subsets of suspects
  - Is expected to take less time: “Faster Design and Faster Repair”
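The idea of shrinking a group of suspects by testing subsets can be sketched with a generic adaptive halving procedure. This is a textbook group-testing scheme, not necessarily the method used in this work; it assumes exactly one faulty resource, and `group_passes` and `FAULTY_LUT` are hypothetical stand-ins for a real test of a configuration.

```python
def locate_single_fault(resources, group_passes):
    """Adaptive halving: narrow the suspects to one resource with group tests.

    `group_passes(subset)` is a hypothetical oracle that returns True when a
    configuration using only the resources in `subset` behaves correctly.
    Assumes exactly one resource is faulty.
    """
    suspects = list(resources)
    tests = 0
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        tests += 1
        if group_passes(half):
            suspects = suspects[len(half):]  # fault lies in the untested half
        else:
            suspects = half                  # fault lies in the tested half
    return suspects[0], tests

FAULTY_LUT = 37  # hypothetical fault location among 60 LUTs

def oracle(subset):
    return FAULTY_LUT not in subset

found, tests_used = locate_single_fault(range(60), oracle)
print(found, tests_used)  # prints "37 6"
```

With 60 resources the suspect set shrinks 60, 30, 15, 8, 4, 2, 1, so six group tests suffice here, compared with up to 59 individual tests.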
CGT-Pruned GA Simulator
[Figure: simulator block diagram. Blocks and labels: Settings, Truth Table, Seed Config., Fitness Report, Best Config., CGT, GA, If Repair, Resource Info.]
Sample contents shown in the figure:
• Settings: No. of CLBs = ..., No. LUTs = ..., Pop. Size = …
• Truth Table: columns I1 I2 ... O1 O2 ..., one row per input combination
• Seed Config.: CLB #:0, LUT #:0, FunctionType: OR, InputLine#0:4, InputLine#1:3, ...
• Fitness Report: Gen. / Max / Ave, e.g. 2 / 154 / 142, 3 / 155 / 139, ...
• Best Config.: CLB #:0, LUT #:0, FunctionType: XOR, InputLine#0:0, InputLine#1:5, ...
Experimental Setup
Target Circuit: 3-bit x 2-bit Multiplier
No. of Experiments: 120 (60 per experiment type, Repair and Design)
FPGA Architecture: Feed-Forward design
No. of Resources: 60 LUTs (15 CLBs, 4 LUTs/CLB)
Fault Model: Logic Single Fault Model
Fault Type: Stuck at One
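As an illustration of the target circuit and fault type, the sketch below builds the truth table of a 3-bit x 2-bit multiplier and scores a circuit by counting correct output bits. The bitwise fitness measure and the placement of the stuck-at-one fault on an output line are assumptions made for this example; the slides do not specify how the simulator scores a configuration or injects the fault.

```python
def multiplier_truth_table():
    """All rows (a, b) -> a * b for a 3-bit a and a 2-bit b.

    The largest product is 7 * 3 = 21, so five output bits are enough.
    """
    return {(a, b): a * b for a in range(8) for b in range(4)}

def stuck_at_one(value, bit):
    """Force output bit `bit` to 1, modelling a stuck-at-one fault on that line."""
    return value | (1 << bit)

def bitwise_fitness(circuit, table, out_bits=5):
    """Count output bits that agree with the truth table over every input row."""
    score = 0
    for (a, b), expected in table.items():
        diff = circuit(a, b) ^ expected
        score += out_bits - bin(diff).count("1")
    return score

table = multiplier_truth_table()

def healthy(a, b):
    return a * b

def faulty(a, b):
    # Hypothetical fault: least significant output bit stuck at one.
    return stuck_at_one(a * b, 0)

print(bitwise_fitness(healthy, table))  # 160 = 32 rows x 5 output bits
print(bitwise_fitness(faulty, table))   # 136: wrong on the 24 rows with an even product
```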
CGT-Pruned Refurbishment
• Isolate and Avoid suspect resources from being used
• Hypothesis: CGT-Pruned GA Repair evolves a full fitness circuit faster than Conventional GA Repair
  - Results show performance improvement in CGT-Pruned Repair
Results: Conventional Vs. CGT-Pruned Repair
CGT-Pruned GA out-performs Conventional GA

Experiment Type                 Conventional Repair        CGT-pruned Repair
Circuit                         3-bit x 2-bit Multiplier   3-bit x 2-bit Multiplier
Number of Experiments           30                         30
Arithmetic Mean (Generations)   17150                      10700
Standard Deviation              15650                      12550
Standard Error of the Mean      2850                       2300
68% Confidence Interval         [14300 → 20000]            [8400 → 13000]
Achieving Refurbishment with Cell Swapping
• Isolate and Swap suspect resources
• Cell Swapping Operator
  - Copy suspect resource “Cell” configuration to another unused cell
  - GA searches for routing strategy to re-route interconnect to the previously-unused cell
• Refurbishment with Cell Swapping
  - Swap suspect cells one by one and evaluate fitness until full fitness is evolved
  - If swapping all suspect cells does not realize complete refurbishment, then employ other GA operators
CGT-Pruned GA Design
• Evolve the entire circuit design from scratch
• Avoid suspect resources and take advantage of resource redundancy within the FPGA
CGT-Pruning outperforms Conventional GA-based techniques
Results: Conventional Vs. CGT-Pruned Design
Design of a circuit in the presence of a single stuck-at fault

Experiment Type                 Conventional design        CGT-pruned design
Circuit                         3-bit x 2-bit Multiplier   3-bit x 2-bit Multiplier
Number of Experiments           30                         30
Arithmetic Mean (Generations)   64500                      53900
Standard Deviation              36000                      37300
Standard Error of the Mean      7200                       7450
68% Confidence Interval         [57300 → 71700]            [46450 → 61350]
Comparison of Performance – Number of Generations for Repair
More than 70% of the experiments benefited substantially from resource information generated using CGT
Results Summary
As opposed to Conventional GAs, CGT-Pruned GAs:
• Completely refurbish configurations in 38% fewer generations
• Design fully functional configurations in 16% fewer generations
• Faulty resources are eliminated from the pool of unused resources in the case of repair, as opposed to the pool of all resources in the case of design
Repair complexity vs. Design complexity
• Repair complexity << Design complexity
• Repairs were realized in one-fifth of the time required for Design
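The summary percentages can be recomputed from the mean generation counts reported in the two results tables. The short check below uses only those reported means; reading "one-fifth of the time" as the ratio of CGT-pruned repair generations to CGT-pruned design generations is an assumption.

```python
# Mean generations taken from the Repair and Design results tables.
repair_mean = {"conventional": 17150, "cgt_pruned": 10700}
design_mean = {"conventional": 64500, "cgt_pruned": 53900}

def percent_fewer(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline

repair_gain = percent_fewer(repair_mean["conventional"], repair_mean["cgt_pruned"])
design_gain = percent_fewer(design_mean["conventional"], design_mean["cgt_pruned"])
repair_to_design = repair_mean["cgt_pruned"] / design_mean["cgt_pruned"]

print(f"Repair: {repair_gain:.1f}% fewer generations")   # 37.6, reported as 38%
print(f"Design: {design_gain:.1f}% fewer generations")   # 16.4, reported as 16%
print(f"Repair/Design ratio: {repair_to_design:.2f}")    # 0.20, about one-fifth
```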
Motivation
• Mission-critical Embedded Systems require high reliability and availability
• Characteristics of Operating Environment may induce hardware failures:
  - Aging, Manufacturing Defects, etc.
• System Reliability:
  - Fault Avoidance. “Always Possible?” … No
  - Design Margin. “Always Adequate?” … No
  - Modular Redundancy. “Always Recoverable?” … No
  - Fault Refurbishment. “Highly Flexible?” … Yes … but technically challenging to achieve
Group Testing Techniques
• Competitive Group Testing
  - Algorithm based on group testing methods
  - Use competition between configurations
  - Temporal information stored in H matrix
  - Successive intersection
  - Monitor health history of resources, which presents resource fitness
  - Simulated using C programming language and GSL functions [Sharma-06]

H[i,j]:
0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 2 1 0 0 1 0 0 0
0 0 1 0 1 1 0 1 0 0
0 0 1 1 0 1 0 0 0 0
0 0 1 0 0 1 1 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

Relative fitness of resource ∝ 1/H[i,j]
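A small sketch of how a history matrix like the one on this slide could be turned into per-resource fitness values. The slide states only that relative fitness is inversely related to H[i,j]; using 1/(1 + H[i,j]) to avoid division by zero, and treating the largest entry as the most suspect resource, are assumptions for this example.

```python
# History matrix copied from the slide.
H = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 2, 1, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

def relative_fitness(h):
    """Map each history count to a fitness in (0, 1].

    Assumption: 1 / (1 + H[i][j]) replaces 1 / H[i][j] so that entries
    with H = 0 get fitness 1.0 instead of dividing by zero.
    """
    return [[1.0 / (1 + v) for v in row] for row in h]

def most_suspect(h):
    """Return the (i, j) index of the largest entry in the history matrix."""
    cells = ((i, j) for i in range(len(h)) for j in range(len(h[0])))
    return max(cells, key=lambda ij: h[ij[0]][ij[1]])

fitness = relative_fitness(H)
print(most_suspect(H))  # (2, 2), the only entry equal to 2
print(fitness[2][2])    # one third under the assumed mapping
```

A pruned GA could then prefer resources with higher values in `fitness` when choosing which cells to use.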
Three Fast Runs of the CGT-pruned GA Repair
The GA evolves to a relatively high fitness within the first few hundred generations, but takes significantly more generations to reach the maximum fitness
References
[1] Fogarty T. C., J. F. Miller, and P. Thomson, "Evolving Digital Logic Circuits on Xilinx 6000 Family FPGAs," in Proceedings of The 2nd Online Conference on Soft Computing, 23-27 June 1997.
[2] Sverre Vigander, “Evolutionary Fault Repair in Space Applications”, Master’s Thesis, Dept. of Computer & Information Science, Norwegian University of Science and Technology (NTNU), Trondheim, 2001.
[3] C. A. Sharma, R. F. DeMara, "A Combinatorial Group Testing Method for FPGA Fault Location", accepted to International Conference on Advances in Computer Science and Technology (ACST 2006), Puerto Vallarta, Mexico, January 23 - 25, 2006
[4] S. Mitra and E. J. McCluskey, “Which Concurrent Error Detection Scheme to Choose?,” in Proceedings of the International Test Conference 2000, p. 985, October 2000.
[5] D. Du and F. K. Hwang. Combinatorial Group Testing and its Applications, volume 12 of Series on Applied Mathematics. World Scientific, 2000.
[6] A. B. Kahng and S. Reda. “Combinatorial Group Testing Methods for the BIST Diagnosis Problem,” in Proceedings of the Asia and South Pacific Design Automation Conference, January 2004.
[7] Keymeulen, D.; Zebulum, R.S.; Jin, Y.; Stoica, A.. “Fault-Tolerant Evolvable Hardware Using Field-Programmable Transistor Arrays”, IEEE Transactions On Reliability, Vol. 49, No. 3, September 2000
[8] Lohn, J.; Larchev, G.; DeMara, R. “Evolutionary fault recovery in a Virtex FPGA using a representation that incorporates routing”, Parallel and Distributed Processing Symposium, 2003. Proceedings. International 22-26 April 2003
[9] Lach, J.; Mangione-Smith, W.H.; Potkonjak, M. “Low overhead fault-tolerant FPGA systems”, Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Volume 6, Issue 2, June 1998
[10] Miron Abramovici, John M. Emmert and Charles E. Stroud, “Roving STARs: An Integrated Approach to On-Line Testing, Diagnosis, and Fault Tolerance for FPGAs in Adaptive Computing Systems”, The Third NASA/DoD Workshop on Evolvable Hardware, Long Beach, California, 2001
[11] DeMara, R.F.; Kening Zhang. “Autonomous FPGA Fault Handling through Competitive Runtime Reconfiguration”, Evolvable Hardware, 2005. Proceedings. 2005 NASA/DoD Conference on 29-01 June 2005
[12] D. Eppstein, M. T. Goodrich, and D. S. Hirschberg. “Improved combinatorial group testing for real-world problem sizes”, In Workshop on Algorithms and Data Structures (WADS), Lecture Notes Comput. Sci. Springer, 2005.
[13] J. F. Miller, P. Thomson, and T. Fogarty. “Designing Electronic Circuits Using Evolutionary Algorithms. Arithmetic Circuits: A Case Study”, In D. Quagliarella, J. Periaux, C. Poloni, and G. Winter, editors, Genetic Algorithms and Evolution Strategy in Engineering and Computer Science, pages 105--131. Morgan Kaufmann, Chichester, England, 1998.
Fault Tolerant Design and Detection Characteristics
***Incorporates resource performance information
Previous Work
Our Goal: Autonomous FPGA Refurbishment
Increase availability without carrying pre-configured spares …
• Overhead from Unutilized Spares (weight, size, power)
  - Redundancy: increases with amount of spare capacity
  - Refurbishment: weakly-related to number recovery capacity
• Granularity of Fault Coverage (resolution where fault handled)
  - Redundancy: restricted at design-time
  - Refurbishment: variable at recovery-time
• Fault-Resolution Latency (availability via downtime required to handle fault)
  - Redundancy: based on time required to select spare resource
  - Refurbishment: based on time required to find suitable recovery
• Quality of Repair (likelihood and completeness)
  - Redundancy: determined by adequacy of spares available (?)
  - Refurbishment: affected by multiple characteristics (+ or -)
• Autonomous Operation (fix without outside intervention)
  - Redundancy: yes
  - Refurbishment: yes
• Everyday example
  - Redundancy: spare tires
  - Refurbishment: can of fix-a-flat
GA Success Stories
Commercial Applications:
• Nextel: frequency allocation for cellular phone networks, $15M predicted savings in NY market
• Pratt & Whitney: turbine engine design; engineer: 8 weeks; GA: 2 days w/3x improvement
• International Truck: production scheduling improved by 90% in 5 plants
• NASA: superior Jupiter trajectory optimization, antennas, FPGAs
• Koza: 25 instances showing human-competitive performance such as analog circuit design, amplifiers, filters
Adaptive GA Design
Circuit: 2 to 4 Decoder
CLBs: 2
LUTs/CLB: 4
Fault: Stuck at 1 and Stuck at 0
Traditional GA: 220 Generations *, std dev 240**
Adaptive GA: 152 Generations *, std dev 120**
* Arithmetic mean for twenty experiments
** Standard Deviation for twenty experiments
Analysis Metrics
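Mean: $\bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k$
Standard Deviation: $\sigma_x = \sqrt{\frac{\sum_{k=1}^{n}(x_k - \bar{x})^2}{n-1}}$
Standard Error of the Mean: $SEM_x = \frac{\sigma_x}{\sqrt{n}}$
Confidence Level: $CL(\bar{x} \pm SEM_x) = 68\%$, $CL(\bar{x} \pm 2\,SEM_x) = 95\%$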
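These metrics can be computed with Python's standard library. The sketch below is a minimal version assuming the sample standard deviation (n - 1 in the denominator); the `generations` list is hypothetical data, not results from the experiments.

```python
import math
import statistics

def summarize(samples):
    """Mean, sample standard deviation, SEM, and 68% / 95% intervals."""
    n = len(samples)
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)  # sample standard deviation, divides by n - 1
    sem = sd / math.sqrt(n)
    return {
        "mean": mean,
        "sd": sd,
        "sem": sem,
        "ci68": (mean - sem, mean + sem),          # mean +/- 1 SEM
        "ci95": (mean - 2 * sem, mean + 2 * sem),  # mean +/- 2 SEM
    }

# Hypothetical generation counts from five runs (illustration only).
generations = [120, 340, 95, 410, 220]
stats = summarize(generations)
print(stats)
```

The intervals follow the slide's convention of one SEM for the 68% level and two SEMs for the 95% level.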
CGT-Pruned GA Simulator
• C++ based console application
• Consists of:
  - Combinatorial Group Testing component: uses GNU Scientific Library (GSL)
  - Genetic Algorithm component: object oriented architecture that models FPGA resources
• Modes of Operation:
  - CGT-Pruned GA Repair: use CGT to isolate suspect resources; avoid use of suspect-faulty resources in the design refurbishment process
  - CGT-Pruned GA Repair with Cell Swapping: swap suspect-faulty resources with previously unused resources to evolve a recovery
  - CGT-Pruned GA Design: evolve a new working design while avoiding suspect resources