On the reliability of SRAM-based FPGAs
Luca Sterpone <[email protected]>
www.cad.polito.it
Outline
• Introduction
• Previous works
  • Scrubbing with partial reconfiguration
  • Triple Module Redundancy
• Nowadays Trends
• Proposed approaches and methodology
  • High Level Functional VHDL
  • RPAR algorithm
• Conclusions
Introduction
What is an SRAM-based FPGA?
• An SRAM-based FPGA is an array of island-style blocks. Each block consists of an array of logic elements and routing channels programmed by a static-RAM configuration memory.
[Figure: FPGA array with logic blocks, I/O blocks and routing resources, programmed by the configuration bitstream]
Introduction
The major vendors of SRAM-based FPGAs:
• Altera families
  • Cyclone and Acex: low cost
  • Stratix-II: high-density FPGA, 90 nm technology
• Xilinx families
  • Spartan: 90 nm technology, up to 5 million system gates, lower cost per gate and per pin
  • Virtex: high performance
Introduction
SRAM-based FPGAs are very convenient because of:
• High flexibility in meeting the requirements of different applications
• Low cost
• High performance
• Fast turnaround time
• Re-configurability
Introduction
• The performance and the capacity of the FPGAs suitable for space flight are increasing steadily
  • Increase from tens of thousands to millions of system gates
[Figures: Spartan 90 nm die; Virtex-4 die]
Introduction
• The application of FPGAs has moved from glue logic to complete subsystems that combine real-time functions on a single chip, including microprocessors and memories
• The potential for FPGA use in space is steadily increasing and opening up new application areas
• FPGAs are now commonly used in critical applications as well, and are replacing ASICs on a regular basis
[Figure: SRAM-based FPGA; re-configurable server]
Introduction
What happens in the space environment?
Introduction
• High-energy particles can hit the sensitive silicon area of an SRAM-based FPGA
• High sensitivity to Single Event Upsets (SEUs)
  • The configuration memory elements can change their content (bit-flip)
• SEUs may drastically alter the correct operation of the FPGA, causing unexpected outputs called Single Event Functional Interrupts (SEFIs)
Introduction
• iRoC Technologies conducted a series of tests to determine the failure rate of five different FPGA architectures:
  • SRAM-based Virtex-II and Spartan-3 from Xilinx
  • SRAM-based Cyclone FPGA from Altera
  • Antifuse-based Axcelerator FPGA and flash-based ProASIC Plus devices from Actel
• One FIT (failure in time) is defined as one failure in 10^9 hours
Introduction
The results were:
• Antifuse- and flash-based FPGAs suffered no loss of configuration under neutron bombardment
• The tested SRAM-based FPGAs showed a FIT rate ranging from 1,150 at sea level to 3,900 at 5,000 feet to 540,000 at 60,000 feet
Please note that:
• Integrated circuits typically have FIT rates lower than 100
• High-reliability applications require a FIT rate of 10 to 20
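To give a feeling for these numbers, the short sketch below converts the FIT rates quoted above into a mean time between failures. It assumes a constant failure rate, so that MTBF is simply the inverse of the rate; the function name `fit_to_mtbf_hours` is illustrative and not part of the original study.

```python
HOURS_PER_FIT_UNIT = 1e9      # one FIT = one failure per 10^9 device-hours
HOURS_PER_YEAR = 24 * 365

def fit_to_mtbf_hours(fit):
    """Mean time between failures in hours, assuming a constant failure rate."""
    return HOURS_PER_FIT_UNIT / fit

# FIT rates quoted on the slide for the tested SRAM-based FPGAs
quoted_rates = [
    ("sea level", 1150),
    ("5,000 feet", 3900),
    ("60,000 feet", 540000),
]

for label, fit in quoted_rates:
    years = fit_to_mtbf_hours(fit) / HOURS_PER_YEAR
    print(f"{label}: FIT {fit} -> about {years:.2f} years between failures per device")
```

The figures are per device; a system with many devices sees failures proportionally more often.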
Introduction
• Safety-critical applications such as space applications must consider the effect that energetic particles (radiation) can have on electronic components
• The use of SRAM-based FPGAs in safety-critical applications requires the development of techniques able to decrease the FIT rate
Previous works
SEU scrubbing
• The configuration bitstream is simply reloaded at a chosen interval
+ Scrubbing requires a low overhead in the system
- The configuration logic is in “write mode” for a greater percentage of time
- The interval between scrub cycles should be based on the expected static upset rate, so scrubbing could be very frequent
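The idea of blind scrubbing can be sketched in a few lines. This is a toy software model, not a description of any device's configuration interface: the configuration memory is a `bytearray`, an SEU is a single bit-flip, and `scrub` overwrites everything from a golden copy. All names are hypothetical.

```python
import random

def inject_seu(config, rng):
    """Flip one randomly chosen bit of the configuration memory in place."""
    bit = rng.randrange(len(config) * 8)
    config[bit // 8] ^= 1 << (bit % 8)
    return bit

def scrub(config, golden):
    """Blind scrubbing: reload the whole configuration from the golden copy."""
    config[:] = golden

golden = bytes(range(16))        # stand-in for the golden bitstream
config = bytearray(golden)       # stand-in for the on-chip configuration memory
rng = random.Random(0)

inject_seu(config, rng)
corrupted = bytes(config) != golden   # the upset is present until the next scrub
scrub(config, golden)
restored = bytes(config) == golden
print(corrupted, restored)
```

Between the upset and the next scrub cycle the design runs with a corrupted configuration, which is why the scrub interval has to follow the expected upset rate.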
Previous works
Partial Reconfiguration + SEU Scrubbing
• The configuration memory array is divided into separate segments
• Thanks to an error detection and correction (EDAC) architecture, only the segment affected by SEUs is reloaded
- The architecture overhead is very high
- The power consumption is excessive for space/mission-critical applications
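A minimal sketch of segment-wise scrubbing follows. It swaps in a CRC-32 per segment as the detection mechanism instead of a real EDAC code, and the segment size, data and function names are assumptions chosen only for illustration.

```python
import zlib

SEGMENT_SIZE = 4  # bytes per segment (hypothetical value)

def split(data, size=SEGMENT_SIZE):
    """Cut the configuration into fixed-size segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def partial_scrub(config, golden):
    """Reload only the segments whose CRC differs from the golden CRC.

    Returns the indices of the segments that were reloaded."""
    repaired = []
    for idx, golden_segment in enumerate(split(golden)):
        start = idx * SEGMENT_SIZE
        end = start + len(golden_segment)
        if zlib.crc32(bytes(config[start:end])) != zlib.crc32(golden_segment):
            config[start:end] = golden_segment   # partial reconfiguration
            repaired.append(idx)
    return repaired

golden = bytes(range(16))
config = bytearray(golden)
config[9] ^= 0x04                 # simulated SEU in byte 9, i.e. segment 2
repaired = partial_scrub(config, golden)
print(repaired)
```

Only the damaged segment is rewritten, at the price of storing and checking one code word per segment.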
Previous works - TMR technique
• The purpose is to remove all single points of failure from the design
How to protect the design against SEUs?
• A circuit can be hardened by designing three copies of the same circuit and building a majority voter on the outputs of the replicated circuits
• The implementation depends on the type of data structure to be mitigated:
  • Throughput logic
  • State-machine logic
  • I/O logic
  • Special features
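The majority voter at the heart of TMR can be illustrated with a bitwise two-out-of-three function. The sketch below is a software model only; `tmr`, `fault_masks` and `add_one_8bit` are hypothetical names used to show one faulty copy being outvoted.

```python
def majority(a, b, c):
    """Bitwise two-out-of-three vote over three integer words."""
    return (a & b) | (a & c) | (b & c)

def tmr(func, x, fault_masks=(0, 0, 0)):
    """Evaluate three copies of func and vote on their outputs.

    fault_masks XORs an error pattern into each copy's output to model an upset."""
    outputs = [func(x) ^ mask for mask in fault_masks]
    return majority(*outputs)

def add_one_8bit(x):
    """Example user logic: 8-bit increment."""
    return (x + 1) & 0xFF

print(tmr(add_one_8bit, 41))                      # fault-free
print(tmr(add_one_8bit, 41, (0x10, 0, 0)))        # one faulty copy is masked
```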
Previous works - TMR technique
• Although TMR-based approaches can tolerate one SEU, they cannot tolerate a second one before being refreshed
• The refresh cycle of the configuration memory and of the flip-flops can be compared with the scrubbing of a memory protected by an EDAC architecture
• The refresh period needs to be shorter than the expected bit error period
• A TMR-based design is not as efficient as presumed
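The relation between refresh period and upset rate can be made concrete under a simple assumption that is not in the slides: upsets arrive as a Poisson process with a constant rate. The probability of two or more upsets within one refresh period is then 1 - e^(-λT)(1 + λT). This is a pessimistic bound for TMR, since two upsets in the same module would still be masked.

```python
import math

def prob_two_or_more_upsets(rate_per_hour, refresh_period_hours):
    """P(N >= 2) for a Poisson process with the given rate over one refresh period."""
    mu = rate_per_hour * refresh_period_hours
    return 1.0 - math.exp(-mu) * (1.0 + mu)

# Hypothetical upset rate of one upset per 1,000 hours
rate = 1e-3
for period in (1, 10, 100, 1000):
    p = prob_two_or_more_upsets(rate, period)
    print(f"refresh every {period} h: P(two or more upsets) = {p:.2e}")
```

When the refresh period is much shorter than the mean time between upsets, the probability falls roughly with the square of the period.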
Previous works - TMR technique
There are two kinds of TMR methodologies:
• Functional Triple Modular Redundancy (FTMR) (2002), developed by Gaisler Research
  • A VHDL design methodology that provides TMR at different design levels: device, modular, gate
• Concurrent Error Detection: Duplication with Comparison for the user combinational logic (2003), presented by Lima et al.
  • A VHDL design methodology that provides an application-oriented architecture able to detect SEUs
Fernanda Lima, Luigi Carro, Ricardo Reis, “Designing fault tolerant system into SRAM based FPGAs”, DAC 2003
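Duplication with comparison can be sketched as two copies of the logic and a comparator. This is a simplified software illustration and does not reproduce the architecture of Lima et al.; the function names are hypothetical. Unlike TMR, the scheme only detects a mismatch and cannot tell which copy is wrong.

```python
def dwc(copy_a, copy_b, x):
    """Run two copies of the same logic and flag any disagreement.

    Returns (output_of_copy_a, error_detected)."""
    out_a = copy_a(x)
    out_b = copy_b(x)
    return out_a, out_a != out_b

def healthy(x):
    """Example combinational logic."""
    return x ^ 0x5A

def upset(x):
    """The same logic with one output bit corrupted, modelling an SEU."""
    return healthy(x) ^ 0x01

print(dwc(healthy, healthy, 3))   # no error flagged
print(dwc(healthy, upset, 3))     # mismatch flagged
```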
Previous works - TMR technique
Functional Triple Modular Redundancy
• Triple Module Redundancy flip-flops
• Triple Module Redundancy sequential logic
Previous works - TMR technique
Functional Triple Modular Redundancy
Gaisler Research Group, Report on FPGA for ESA activities, 2002
Previous works - TMR technique
Concurrent Error Detection: Duplication with Comparison for the user combinational logic
Previous works - TMR technique
Evaluation of the SEU sensitivity of the basic TMR architecture by simulation (BYU SEU simulator)
Nathan Rollins, Michael J. Wirthlin, Michael Caffrey and Paul Graham, “Evaluating TMR Techniques in the Presence of Single Event Upsets”, Department of Electrical and Computer Engineering, Brigham Young University.
Previous works - TMR technique
Evaluation of the SEU sensitivity of the basic TMR architecture by fault injection
P. Bernardi, M. Sonza Reorda, L. Sterpone, M. Violante, “On the evaluation of SEU sensitiveness in SRAM-based FPGAs”, IOLTS 2004, 12-14 July.
TMR design flow
1. User TMR design (VHDL – EDF)
2. Synthesize
   1. Synthesis
   2. RTL schematic
   3. Check syntax
3. Implement design
   1. Map
   2. Place & Route (PAR)
4. Generate programming file
   1. Native Circuit Description
   2. Configuration memory file
TMR design flow
• The place-and-route tools provided by the FPGA vendors are capable of optimising the number of modules used in the design by recombining the modules and compacting the design
TMR design flow
• It is important to analyse the results of the synthesis and the place-and-route at the netlist level to ensure that the intended SEU protection has been implemented
[Figure: Implement design]
The TMR fault scenario
• The investigations are made at the architectural level of the SRAM-based FPGAs manufactured by Xilinx
• The main macro-element is the TILE:
  • CLB
  • T-state buffer
  • Routing switchbox
The fault scenario: the investigation methodology
Xilinx – TMR design flow
• A possible Configurable Logic Block (CLB) in a Xilinx TMR design
[Figure: Implement design]
The fault scenario: Configurable Logic Block
[Figure: CLB slice with the configuration multiplexers SRMUX, CKINV, CEMUX, BXMUX, CY0F, BYMUX, CY0G, CYSELG, GYMUXG, FXMUX, CYINIT and CYSELF]
The fault scenario: Configurable Logic Block
Critical components for the TMR architecture within the CLB:
• Combinational TMR design
  • MUX fault: CKINV, CY0G, CY0F
• Sequential TMR design
  • MUX fault: CKINV, CY0G, CY0F, BYMUX, BXMUX, CEMUX, SRMUX, CYINIT, CYSELF, CYSELG
  • Initialization: SYNC_ATTR
The fault scenario: Configurable Logic Block (combinational design)
MUX fault: CKINV
• This MUX is not used before the configuration memory upset. A possible SEU can activate it!
• The upset then becomes a SEFI in the TMR circuitry, as this component controls both TMR LUTs
• Please note that the two TMR modules (TMR 1 bit j and TMR 2 bit j) carry signals that refer to the same bit (j) within the circuitry
The fault scenario: Configurable Logic Block (combinational design)
MUX fault: CY0G/CY0F
• The configuration memory upset provokes a misconfiguration of the CY0G MUX
• The upset alters the output YB of TMR 1 and the output COUT. COUT is used by another TMR module in a different CLB.
The fault scenario: Configurable Logic Block (sequential design)
MUX fault: BYMUX/BXMUX
The fault scenario: Configurable Logic Block (sequential design)
MUX fault: CYINIT
The fault scenario: Configurable Logic Block (sequential design)
MUX fault: CYSELF/CYSELG
The fault scenario: Configurable Logic Block (sequential design)
Initialization: SYNC_ATTR
The fault scenario: Routing Switchbox
• The routing switchboxes provide the interconnection between all the logic resources implemented on the SRAM-based FPGA
The fault scenario: Routing Switchbox
The fault scenario of the routing switchbox is based on basic events:
• Unrouted net
• Antenna net
• Bridge net
• Short net
• Open net
The fault scenario: Routing Switchbox
Critical cases for the TMR interconnection architecture:
• Combinational TMR design
  • Multiple basic events provoked by a common control bit
  • Non-TMR signals routed by the PAR algorithm
• Sequential TMR design
  • Multiple basic events provoked by a common control bit
  • Short event
  • Non-TMR signals routed by the PAR algorithm
The fault scenario: Routing Switchbox (combinational and sequential design)
(I) Multiple basic events provoked by a common control bit: OPEN-OPEN
• The upset in the configuration memory opens both connections OUT1 -> H6W0 and H6M4 -> V6S4
• Please note that the two faulty signals belong to different TMR modules only in sequential circuits
[Figure: dev15335.bit of Elliptic Filter]
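Why such a case defeats TMR can be shown with a small software model, offered as an illustration rather than a reproduction of the fault injection experiments. A single configuration upset that corrupts signals in two different modules leaves the voter with two wrong inputs out of three. The names and the error mask are hypothetical.

```python
def majority(a, b, c):
    """Bitwise two-out-of-three vote."""
    return (a & b) | (a & c) | (b & c)

def vote_with_faults(correct, faulty_modules, error_mask=0x01):
    """Vote over three module outputs, corrupting the modules listed in faulty_modules."""
    outputs = [
        correct ^ error_mask if index in faulty_modules else correct
        for index in range(3)
    ]
    return majority(*outputs)

correct = 0xA5
print(hex(vote_with_faults(correct, {0})))       # one module hit: masked
print(hex(vote_with_faults(correct, {0, 1})))    # two modules hit by one upset: wrong output
```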
The fault scenario: Routing Switchbox (combinational and sequential design)
(II) Multiple basic events provoked by a common control bit: OPEN-SHORT
[Figure: dev10984.bit of Elliptic Filter]
The fault scenario: Routing Switchbox (combinational and sequential design)
(III) Multiple basic events provoked by a common control bit: OPEN-BRIDGE
[Figure: dev3992.bit of Adder 16]
The fault scenario: Routing Switchbox (combinational and sequential design)
(IV) Multiple basic events provoked by a common control bit: BRIDGE-BRIDGE
[Figure: Dev16568.bit of Elliptic Filter]
The fault scenario: Routing Switchbox (combinational and sequential design)
Non-TMR signal routed by the PAR algorithm
• The upset in the configuration memory provokes a bit-flip within a MUX that controls a CONSTANT value used by different TMR modules
The fault scenario: Routing Switchbox (combinational and sequential design)
TMR fault scenario classification
P. Bernardi, M. Sonza Reorda, L. Sterpone, M. Violante, “Analysis of the robustness of the TMR architecture in SRAM-based FPGAs”, RADECS 2004, 22-24 Sept.
The fault scenario: Routing Switchbox (sequential design)
Short event
• The upset in the configuration memory provokes a conflict on the hex line between two different TMR modules. In this case the bad nodes are the hex-line nodes.
• The nodes related to the hex lines are very critical within the SRAM-based FPGA
Routing Switchbox – Hex lines
• The hex lines are part of the general-purpose interconnect provided by Xilinx devices. They route the signals of a TILE to other TILEs six blocks away in each of the four directions.
• Hex-line signals can be accessed either at the endpoints or at the midpoint (three blocks from the source)
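The geometry described above can be expressed as a small helper that lists where a hex line starting at a given TILE can be tapped. The coordinate convention (north is +y), the function name and the fact that device edges are ignored are assumptions made for this sketch.

```python
HEX_LENGTH = 6  # a hex line spans six blocks
DIRECTIONS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def hex_line_taps(x, y):
    """Midpoint and endpoint TILE coordinates of the hex lines leaving TILE (x, y)."""
    half = HEX_LENGTH // 2
    taps = {}
    for name, (dx, dy) in DIRECTIONS.items():
        taps[name] = {
            "midpoint": (x + dx * half, y + dy * half),              # three blocks away
            "endpoint": (x + dx * HEX_LENGTH, y + dy * HEX_LENGTH),  # six blocks away
        }
    return taps

for direction, points in hex_line_taps(10, 10).items():
    print(direction, points)
```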
Routing Switchbox – GRM
The connectivity of a General Routing Matrix (GRM) is formed by:
• 108 hex lines for each TILE
• 96 bidirectional interconnections to the TILEs in each of the four directions
Nowadays Trends
+ Antifuse-based FPGAs have so far dominated in space applications, but the SRAM-based families offer high gate counts
- The SRAM configuration memory has a level of SEU sensitivity that cannot be ignored
- The careful application of TMR and complementary techniques could have a gate-count overhead of 4.5x to 7.5x and a performance reduction of about 50% (S. Habinc, Microelectronics Final Presentation Days, ESA-ESTEC, Feb. 4-5, 2004). Also reported by the Gaisler Research group.
Nowadays Trends
• Flow: VHDL – EDF TMR device -> Synthesis -> MAP -> Place & Route -> Reliable?
• The designer has no means of controlling the result of the process
• The designer can define the project constraints in terms of:
  • area occupation for each hierarchy
  • timing delay
Nowadays Trends
• The idea is to develop a Reliable Place And Route process (RPAR algorithm) able to perform a dependable placement of both the interconnection and logic resources
Nowadays Trends – RPAR algorithm
RPAR(D)
{
  for each area constraint Ai
    for each logic node LN within Ai do
    {
      find the destination logic nodes list NODE_D_LIST in Ai
      for each destination node DN in NODE_D_LIST
      {
        (NT) = connect_node_2_node(LN, DN)
        if no connection is available then
          re_place(DN)
        else
          update_avoid_node_graph(NT, Ai)
      }
    }
}
(NT) = connect_node_2_node(LN, DN)
• It supports the routing by exploiting the Versatile Place and Route (VPR) algorithm
• It performs a shortest-path connection between logic nodes
• It is controlled by different parameters that permit good flexibility:
  • The maximum length of the interconnection
  • The maximum delay of the interconnection
Vaughn Betz and Jonathan Rose, “VPR: A New Packing, Placement and Routing Tool for FPGA research”, International Workshop on Field Programmable Logic and Applications, 1997
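As a rough illustration of a shortest-path connection with a length limit, the sketch below uses a plain breadth-first search over a routing graph. It is not the VPR router and not the authors' implementation: the `graph` dictionary, the `avoid` set and the hop-count interpretation of the maximum length are all assumptions, and the delay parameter is left out.

```python
from collections import deque

def connect_node_2_node(graph, src, dst, avoid=frozenset(), max_length=None):
    """Shortest path in hops from src to dst, skipping the nodes in avoid.

    Returns the path as a list of nodes, or None when no route satisfies
    the constraints."""
    queue = deque([[src]])
    visited = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        if max_length is not None and len(path) - 1 >= max_length:
            continue                      # extending this path would exceed the limit
        for nxt in graph.get(node, ()):
            if nxt not in visited and nxt not in avoid:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical routing graph: two alternative routes from a to d, then on to e
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}

print(connect_node_2_node(graph, "a", "e"))
print(connect_node_2_node(graph, "a", "e", avoid={"b"}))
print(connect_node_2_node(graph, "a", "e", max_length=2))
```

Returning `None` corresponds to the "no connection is available" branch of the pseudocode, where the destination node would be re-placed.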
Nowadays Trends – RPAR algorithm
1) (NT) = connect_node_2_node(LN, DN)
2) update_avoid_node_graph(NT, Ai)
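The two steps can be put together in a toy version of the RPAR loop. This is one possible reading of the pseudocode, not the authors' tool: it assumes that `update_avoid_node_graph` forbids the routing nodes used by one area constraint to all other areas, so that redundant modules never share a routing resource, and it records failed connections instead of performing `re_place`. All data structures are hypothetical.

```python
from collections import deque

def shortest_path(graph, src, dst, avoid):
    """Breadth-first shortest path that skips the nodes in avoid."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen and nxt not in avoid:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def rpar(graph, areas):
    """areas maps an area name to a list of (LN, DN) connections to route.

    Returns (routes, needs_replacement)."""
    avoid = {name: set() for name in areas}
    routes, needs_replacement = {}, []
    for name, connections in areas.items():
        for ln, dn in connections:
            path = shortest_path(graph, ln, dn, avoid[name])
            if path is None:
                needs_replacement.append((name, dn))   # stand-in for re_place(DN)
                continue
            routes[(name, ln, dn)] = path
            for other in areas:                        # assumed update_avoid_node_graph
                if other != name:
                    avoid[other].update(path)
    return routes, needs_replacement

# Two modules whose shortest routes would both cross switch node x
graph = {"s1": ["x"], "s2": ["x", "y"], "x": ["d1", "d2"], "y": ["d2"]}
areas = {"A1": [("s1", "d1")], "A2": [("s2", "d2")]}
routes, needs_replacement = rpar(graph, areas)
print(routes)
print(needs_replacement)
```

In this example the second module is pushed onto the alternative node y, so no single routing node carries signals of both modules.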
Nowadays Trends – RPAR algorithm
[Figure: example graph with nodes a, b, c, d, e, f, g, h and i]
Nowadays Trends – RPAR algorithm
• The RPAR algorithm is applied only to the range of interconnections and logic resources that were involved in faults during the TMR fault injection campaign
Nowadays Trends – RPAR algorithm
TMR Xilinx
Circuit | # SEUs that provoke a fault | Prog. bits
ADD8    | 882                         | 9785
ADD16   | 1350                        | 11963
MUL8    | 2300                        | 17448
FILTER  | 2764                        | 33888
RPAR
Circuit | # SEUs that provoke a fault | Prog. bits
ADD8    | 16                          | 7867
ADD16   | 28                          | 10091
MUL8    | 29                          | 18974
FILTER  | 2760                        | 33078
TMR Dedicated Floorplanning
Circuit | # SEUs that provoke a fault | Prog. bits
ADD8    | 25                          | 7855
ADD16   | 38                          | 10036
MUL8    | 38                          | 18927
FILTER  | 2800                        | 33057
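The relative improvement can be read off the tables with a few lines of arithmetic. The snippet below only reuses the fault counts reported above; the dictionary layout and function name are illustrative.

```python
# Number of SEUs that provoke a fault, copied from the three tables
faults = {
    "TMR Xilinx":    {"ADD8": 882, "ADD16": 1350, "MUL8": 2300, "FILTER": 2764},
    "RPAR":          {"ADD8": 16,  "ADD16": 28,   "MUL8": 29,   "FILTER": 2760},
    "Floorplanning": {"ADD8": 25,  "ADD16": 38,   "MUL8": 38,   "FILTER": 2800},
}

def reduction_percent(technique, circuit, baseline="TMR Xilinx"):
    """Percentage reduction in fault-provoking SEUs relative to the baseline flow."""
    base = faults[baseline][circuit]
    return 100.0 * (base - faults[technique][circuit]) / base

for technique in ("RPAR", "Floorplanning"):
    for circuit in faults[technique]:
        value = reduction_percent(technique, circuit)
        print(f"{technique:14s} {circuit:7s} {value:7.2f} %")
```

The arithmetic circuits improve by well over 90%, while the FILTER circuit barely changes and is slightly worse with dedicated floorplanning.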
Nowadays Trends – Placement
• Each redundant module can be partitioned into different parts
• The placement is then performed keeping the hierarchy of each partition
[Figure: redundant modules A, B and C, each split into partitions, shown before and after PAR]
Nowadays Trends
[Figure: two placement maps of the TMR modules 1, 2 and 3 on the device grid, with additional cells labelled p* and vc*. In the first map (rows 20-28, columns 28-42) the three modules are interleaved. In the second map (rows 11-28, columns 29-42) module 1 occupies rows 17-20, module 2 rows 21-24 and module 3 rows 25-28.]
Conclusions
• The results obtained encourage the application and the improvement of the RPAR algorithm
• The reliability enhancement is not finished!
[Figure: reliability diagram with the labels High level strategy; Placement strategy; RPAR algorithm; Hierarchical optimization; Dedicated floorplanning; Efficient PAR]
Future works
• Validation of the obtained results by radiation testing
• Evaluation of the impact of these strategies on:
  • Power consumption
  • Timing delay
  • Area overhead
  • Applicability to different FPGAs
References
• European Space Components Information Exchange System: https://escies.org/
• Gaisler Research Group: www.gaisler.com
• Xilinx: www.xilinx.com
• Xilinx application notes
  • Carl Carmichael, “Triple Module Redundancy Design Techniques for Virtex FPGAs”, XAPP197, November 1, 2001
• M. Violante, M. Ceschia, M. Sonza Reorda, A. Paccagnella, P. Bernardi, M. Rebaudengo, D. Bortolato, M. Bellato, P. Zambolin and A. Candelori, “Analyzing SEU Effects in SRAM-based FPGAs”, IOLTS 2003
• M. Ceschia, M. Violante, M. Sonza Reorda, A. Paccagnella, P. Bernardi, M. Rebaudengo, D. Bortolato, M. Bellato, P. Zambolin, A. Candelori, “Identification and classification of single-event upsets in the configuration memory of SRAM-based FPGAs”, IEEE Transactions on Nuclear Science, 2003
References
• Brigham Young University, Department of Electrical and Computer Engineering: www.ee.byu.edu
• M. Bellato, P. Bernardi, D. Bortolato, A. Candelori, M. Ceschia, A. Paccagnella, M. Rebaudengo, M. Sonza Reorda, M. Violante, P. Zambolin, “Evaluating the effects of SEUs affecting the configuration memory of an SRAM-based FPGA”, DATE 2004
• F. Lima, C. Carmichael, J. Fabula, R. Padovani, R. Reis, “A fault injection analysis of Virtex FPGA TMR design methodology”, RADECS 2001
• Fernanda Lima, Luigi Carro, Ricardo Reis, “Designing fault tolerant system into SRAM based FPGAs”, DAC 2003
Spare slides
Nowadays Trends – RPAR algorithm
/* pre-algorithm mapping operations */
for each defined area constraint
  set_area constraints Ai
for each logic_node LN of the design D do
  place LN for each area constraint Ai
RPAR(D)
{
  for each area constraint Ai
    for each logic node LN within Ai do
    {
      find the destination logic nodes list NODE_D_LIST in Ai
      for each destination node DN in NODE_D_LIST
      {
        (NT) = connect_node_2_node(LN, DN)
        if no connection is available then
        {
          re_place(DN)
        }
        else
        {
          update_avoid_node_graph(NT, Ai)
        }
      }
    }
}
Nowadays Trends – RPAR algorithm
TMR Xilinx
Circuit | # SEUs that provoke a fault | Prog. bits
ADD8    | 882                         | 9785
ADD16   | 1350                        | 11963
MUL8    | 2300                        | 17448
FILTER  | 2764                        | 33888
RPAR
Circuit | # SEUs that provoke a fault | Prog. bits
ADD8    | 16                          | 7867
ADD16   | 28                          | 10091
MUL8    | 29                          | 18974
FILTER  | 2760                        | 33078
TMR Dedicated Floorplanning
Circuit | # SEUs that provoke a fault | Prog. bits
ADD8    | 25                          | 7855
ADD16   | 38                          | 10036
MUL8    | 38                          | 18927
FILTER  | 2800                        | 33057
• The RPAR algorithm is applied only to the range of interconnections and logic resources that were involved in faults during the TMR Dedicated Floorplanning fault injection campaign