University of Michigan, Electrical Engineering and Computer Science
Low Cost Control Flow Protection Using Abstract Control Signatures
Daya S Khudia and Scott Mahlke
University of Michigan
Soft Errors
• Soft errors, also called single-event upsets (SEUs)
  – Occur because of
    • High-energy particle strikes or electrical noise
• Parameters affecting soft error rates
  – Shrinking dimensions, voltage scaling
    • 100 times increase from 180nm to 16nm (Borkar, Micro’05)
    • One failure per day per chip at 16nm (Feng et al., ASPLOS’10)
Image credit: Certichip
Soft Error Detection
Approaches, ordered by increasing overhead:
• ~10-30% overhead
  • Data flow: instruction duplication + hardware symptoms (Shoestring, profileBased). Combine duplication and symptoms; improved by using profiling.
  • Control flow: target solution. Our target is a low-overhead control flow protection solution with comparable coverage.
• ~30-70% overhead
  • Data flow: instruction duplication (SWIFT, EDDI). Redundant execution in a single-threaded context; the compiler interleaves original and redundant instructions.
  • Control flow: signature/assertion based (CFCSS, ACFC). Software-based control flow protection, usually by embedding signatures/assertions in basic blocks.
• ~100-200% overhead
  • Data flow and control flow: DMR, TMR. Traditional dual/triple-modular redundancy; mission-critical reliability.
Why Control Flow Errors?
[Figure: bar chart of correct vs. incorrect executions (% of runs, 0% to 100%) for data flow errors and control flow target errors]
• More than 70% of the transient faults lead to control flow errors (Vahdatpour et al.)
• Faults in hardware components manifest as control flow errors
  • Program counter
  • Address circuitry
• Errors in branch targets are 2.5x more likely to result in incorrect executions
Outline
• Background
• Software-based control flow checking
• Abstract Control Signatures (ACS)
• Experimental evaluation
• Conclusions
Control Flow Checking
• Two steps for control flow checking
  • Compute signature at runtime
  • Compare with an expected correct value
• In case of illegal control flow transfer, the signature check fails
[Figure: basic blocks BB1 and BB2, each instrumented with "update sig var" and "check sig var"]
Signature-Based Control Flow Checking
• Software-based control flow checking
  • Update signature in each basic block
  • Check signature in each basic block
• Can only handle errors in branch targets
  • Errors in branch directions (conditions) are not covered

BB1 (signature s1, d1 = - - -):
  G = G xor d1
  G == s1?
BB2 (signature s2, d2 = s1 xor s2):
  G = G xor d2
  G == s2?

On a legal transfer from BB1 to BB2:
  G = G xor d2
  G = s1 xor s1 xor s2
  G = s2
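The XOR update on this slide can be traced with a small sketch. The signature values below are arbitrary placeholders chosen for illustration, not values from the talk, and `enter_bb2` is a hypothetical helper name.

```python
# Hypothetical compile-time signatures; any distinct values would do.
S1, S2, S3 = 0b0001, 0b0110, 0b1011

# d2 is the xor of the predecessor's signature and BB2's own signature.
D2 = S1 ^ S2


def enter_bb2(g: int) -> bool:
    """Apply BB2's signature update and check; True means the check passes."""
    g ^= D2          # G = G xor d2
    return g == S2   # G == s2?


legal = enter_bb2(S1)    # control arrived from BB1, so G held s1
illegal = enter_bb2(S3)  # control arrived from a block whose signature is s3
print(legal, illegal)
```

With a legal transfer the two copies of s1 cancel and G becomes s2; any other incoming value leaves a residue that fails the comparison.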
Signature-Based Control Flow Checking
• For branch fan-in nodes
  • Extra updates are required
  • A dynamically adjusting signature is required
[Figure: control flow graph of BB1 (s1), BB2 (s2) and BB3 (s3), where BB3 is a fan-in node. Annotations recovered from the figure:]
  d1 = - - -
  d2 = s1 xor s2
  d3 = s- xor s3
  G = G xor d1;  G == s1?
  G = G xor d2;  G == s2?
  G = G xor d3;  G == s3?
  D1 = s1 xor s3;  D1 = 0
  G = G xor D1;  G = G xor D2
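The extra update at a fan-in node can be sketched as follows. This follows the usual CFCSS formulation, in which the static difference is computed against one predecessor and every predecessor sets a run-time adjusting value before branching; the variable names and signature values are assumptions and may not match the labels in the slide's figure.

```python
# Hypothetical block signatures, chosen only for illustration.
S1, S2, S3 = 0b0001, 0b0110, 0b1011

# Static difference for the fan-in node BB3, computed against predecessor BB1.
D3_STATIC = S1 ^ S3

# Run-time adjusting value each predecessor stores before branching to BB3.
ADJUST = {"BB1": 0, "BB2": S1 ^ S2}


def enter_bb3(g: int, d_adjust: int) -> bool:
    """Apply BB3's two updates and its check; True means the check passes."""
    g ^= D3_STATIC   # regular signature update
    g ^= d_adjust    # extra update with the adjusting signature
    return g == S3


from_bb1 = enter_bb3(S1, ADJUST["BB1"])
from_bb2 = enter_bb3(S2, ADJUST["BB2"])
stray = enter_bb3(0b1111, ADJUST["BB2"])  # G corrupted by an illegal transfer
print(from_bb1, from_bb2, stray)
```

Both legal predecessors reach s3, at the cost of one extra store in each predecessor and one extra xor in the fan-in block, which is the overhead the following slides aim to reduce.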
Abstract Control Signatures
• Sources of overhead
  • Signature updates
  • Signature checks
• Form regions
  • Abstract away the details of control flow inside a region
[Figure: signature-instrumented control flow graph of BB1 to BB5; each block carries a signature update (G = G xor d, with adjusting values D at fan-in nodes) and a signature check (G == s?)]
Abstract Control Signatures
• Optimize signature updates
  • Check simple run-time properties
• Optimize checks
  • Insert checks at region boundaries
[Figure: the control flow graph of BB1 to BB5 annotated with "Sig update" labels on the blocks and a "Sig check" label]
Insight 1: Optimized updates
• Signature checking
  • Make sure that control flow transfer took place from a legal predecessor
• Check counters (path length)
  • Makes sure that the proper number of BBs in the predecessor region were visited
[Figure: region of bb1 to bb6; bb1 sets C1 = 1, each of bb2 to bb6 executes C1 = C1 + 1, and the checks shown are C1 == 4? and C1 == 5?]
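A minimal sketch of the counter idea, assuming a region whose header sets the counter to 1 and whose other blocks each add 1. The paths below are hypothetical, since the edges of the slide's figure are not recoverable from the text, and `run_region` is an illustrative helper name.

```python
def run_region(path, expected_count):
    """Simulate the counter updates along `path` and run the exit check."""
    c1 = 0
    for position, _block in enumerate(path):
        # Header block: C1 = 1. Every other block: C1 = C1 + 1.
        c1 = 1 if position == 0 else c1 + 1
    return c1 == expected_count   # C1 == expected?


ok = run_region(["bb1", "bb2", "bb3", "bb5"], expected_count=4)
skipped = run_region(["bb1", "bb5"], expected_count=4)  # faulty jump skips bb2, bb3
print(ok, skipped)
```

The update is a single increment per block instead of an xor plus a comparison, and the comparison is deferred to the point where the region is left.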
Insight 2: Optimized checks
• Sufficient to have a single check for a group of basic blocks
• Requirement on regions
  • The header block of a region should dominate all the BBs in that region (single entry point)
  • Nested loops should not be contained in a region
[Figure: control flow graph partitioned into Interval 1 and Interval 2, with blocks bb1, bb2, bb3, bb4, bb_latch1 and bb_latch2]
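The single-entry requirement can be tested with a standard dominator computation. The sketch below uses the textbook iterative data-flow algorithm on a small hypothetical CFG; it is not the authors' region-formation pass, and the function names are illustrative.

```python
def dominators(cfg, entry):
    """Return, for every block, the set of blocks that dominate it."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def is_single_entry_region(cfg, entry, header, region):
    """True if `header` dominates every block in `region`."""
    dom = dominators(cfg, entry)
    return all(header in dom[bb] for bb in region)


# Hypothetical diamond-shaped CFG: bb1 branches to bb2 and bb3, which join at bb4.
cfg = {"bb1": ["bb2", "bb3"], "bb2": ["bb4"], "bb3": ["bb4"], "bb4": []}
whole = is_single_entry_region(cfg, "bb1", "bb1", {"bb1", "bb2", "bb3", "bb4"})
partial = is_single_entry_region(cfg, "bb1", "bb2", {"bb2", "bb4"})
print(whole, partial)
```

The second candidate fails because bb4 can also be reached through bb3, so a check placed for that region could be bypassed.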
Balancing Increments
• Naively inserting checks
  • Multiple counter value checks would be required at exits
• Insert extra increments along selected edges
• An algorithm determines (details are in the paper)
  • increment edges
  • increment amounts
[Figure: region of bb1 to bb5; bb1 sets C1 = 1 and the other updates are C1 = C1 + 1. The exit checks shown are "C1 == 3 or 4?", "C1 == 4 or 5?" and "C1 == 5?", with an "insert extra increment along these edges" annotation]
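One simple way to balance an acyclic region is to pad every edge up to the longest path. The sketch below is an illustrative stand-in under that assumption, not the algorithm from the paper, and the CFG is hypothetical.

```python
def balance_increments(cfg, header, topo_order):
    """Return the counter value at each block and the extra increment per edge.

    `topo_order` must list the region's blocks in topological order.
    """
    longest = {header: 1}   # the header sets C1 = 1
    for bb in topo_order:
        for succ in cfg[bb]:
            longest[succ] = max(longest.get(succ, 0), longest[bb] + 1)

    extra = {}
    for bb in topo_order:
        for succ in cfg[bb]:
            # The successor's own increment covers 1; pad the rest on the edge.
            gap = longest[succ] - (longest[bb] + 1)
            if gap > 0:
                extra[(bb, succ)] = gap
    return longest, extra


# Hypothetical region: the path through bb3 is one block shorter than via bb2, bb4.
cfg = {
    "bb1": ["bb2", "bb3"],
    "bb2": ["bb4"],
    "bb3": ["bb5"],
    "bb4": ["bb5"],
    "bb5": [],
}
longest, extra = balance_increments(cfg, "bb1", ["bb1", "bb2", "bb3", "bb4", "bb5"])
print(longest["bb5"], extra)
```

After padding, every path reaches the exit with the same counter value, so a single equality check replaces the "3 or 4" style of comparison.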
Optimization for Loops
• Move checks out of the loop
• Insert increments
  • Such that counter value is a power of two (facilitates remainder operation instead of costly division)
[Figure: a loop shown before and after the optimization. Before: the counter is reset with C1 = 0 and checked with "C1 == 3?", alongside the check "C1 / 3 == 0?". After: one update becomes C1 = C1 + 2 and the check placed outside the loop is "C1 % 4 == 0?"]
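A sketch of why padding the per-iteration count to a power of two helps, assuming the counter grows by 4 in every clean iteration and is checked once after the loop. The constant and function names are illustrative.

```python
PER_ITERATION = 4          # padded per-iteration increment (a power of two)
MASK = PER_ITERATION - 1   # remainder modulo 4 reduces to a bitwise AND with 3


def check_after_loop(c1: int) -> bool:
    """Equivalent to C1 % 4 == 0, computed without a division."""
    return (c1 & MASK) == 0


clean = 0
for _ in range(10):
    clean += PER_ITERATION   # every iteration visits the full path

faulty = clean - 1           # one iteration missed a single increment

print(check_after_loop(clean), check_after_loop(faulty))
```

A check of this form cannot see an error that changes the counter by a multiple of 4, so it trades some precision for removing the per-iteration comparison.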
Handling Call and Return Insts
• Every function in the program is assigned a unique path length
• Global signature variable is
  • Updated before and inversely updated after call
  • Inversely updated and updated inside callee
[Figure: the caller executes "update sig var with call specific length", "call foo;", "inverse update with call specific length" and "check sig var"; the callee foo executes "inverse update sig var" in Entry_BB and "update sig var" in Ret_BB before "return;"]
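The call protocol can be sketched with additive updates. The slide does not state which operation is used, so addition and subtraction are assumptions here, as are the path-length values and function names.

```python
# Hypothetical unique path lengths assigned to each function.
PATH_LENGTH = {"foo": 7, "bar": 13}


def g_inside_callee(g, intended, entered):
    """Value of G in the callee body after the caller update and entry inverse update."""
    g += PATH_LENGTH[intended]   # caller: update before the call
    g -= PATH_LENGTH[entered]    # callee entry: inverse update
    return g


def g_after_return(g, callee, return_site_expects):
    """Value of G at the return site after the whole call sequence."""
    g = g_inside_callee(g, callee, callee)
    g += PATH_LENGTH[callee]               # callee: update before return
    g -= PATH_LENGTH[return_site_expects]  # return site: inverse update
    return g


G0 = 100
good_call = g_inside_callee(G0, "foo", "foo")
wrong_entry = g_inside_callee(G0, "foo", "bar")   # call lands in the wrong function
good_return = g_after_return(G0, "foo", "foo")
wrong_return = g_after_return(G0, "foo", "bar")   # return lands after a call to bar
print(good_call, wrong_entry, good_return, wrong_return)
```

In the matching cases G returns to its pre-call value, while a mismatch leaves an offset equal to the difference of the two path lengths for a later check to catch.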
System Overview
• Compilation
  • Collect required program information
  • Analyze program structure
  • Insert signature updates and checks
• Runtime (operating system, physical hardware)
  • Trigger lightweight recovery based on
    • selective symptoms (hardware exceptions)
    • signature comparison failures
Evaluation Methodology
• Program analysis and signature updates/checks
  – Implemented as a compiler pass in the LLVM compiler
• SPECINT2K benchmarks
• Statistical fault injection (SFI) experiments
  – GEM5 simulator in ARM syscall emulation mode
    • Random (single) bit flip in control flow target
  – Simulated entire benchmarks after fault injection
  – Log files analyzed for results classification
Performance Overhead
[Figure: bar chart of performance overhead (runtime), 0% to 160%, for CFCSS, CFCSS_ivl, ACS and ACS + calls_rets on each SPECINT2K benchmark and the geometric mean (Gmean)]
The performance overhead is down from 75% to 11%
Fault Coverage
[Figure: stacked bar chart of fault injection outcomes (% of runs, 0% to 100%; Masked, CFDetects, HWDetects, Failures, SDCs) for CFCSS, CFCSS_ivl, ACS and ACS + calls_rets on 164.gzip, 175.vpr, 176.gcc, 181.mcf, 186.crafty, 197.parser, 253.perl, 254.gap, 255.vortex, 256.bzip2, 300.twolf and the average]
On average, fault coverage of ACS is comparable to CFCSS with almost 7x reduction in overhead
Fault Detection Latency
[Figure: bar chart of normalized detection latency (0.80 to 1.25; WithIn2K, Within10K, WithIn100K) for CFCSS and ACS on 164.gzip, 175.vpr, 176.gcc, 181.mcf, 186.crafty, 197.parser, 253.perl, 254.gap, 255.vortex, 256.bzip2, 300.twolf and the average]
Fault detection latency is affected by a maximum of 5%
Conclusions
• We propose Abstract Control Signatures (ACS)
  – Signature checking at coarse-grain
  – Simplified signature updates
• In comparison to a traditional signature-based scheme (CFCSS)
  – Reduces performance overhead from 75% down to 11%
  – Fault coverage is comparable
Thank You! Questions?
Fault Injection Outcome Classification
• Masked
  – No corruption in the program output
• CFDetects
  – Detected by control flow checking
• Covered by symptoms (HWDetects)
  – Produces a symptom such as a page fault within 2000 cycles of fault injection
• Failures
  – Fail status on program termination or infinite loop
• SDCs (Silent Data Corruptions)
  – Fault injections which result in user-visible corruptions
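The five outcome categories could be assigned from a run's log roughly as sketched below. The record fields are hypothetical, and the precedence between categories is an assumption, since the slide does not state how overlapping cases are resolved.

```python
SYMPTOM_WINDOW = 2000  # cycles after injection, as stated on the slide


def classify(run):
    """Map one fault-injection run (a dict with hypothetical fields) to a category."""
    if run["cf_check_failed"]:
        return "CFDetects"
    symptom = run["symptom_cycle"]
    if symptom is not None and symptom - run["inject_cycle"] <= SYMPTOM_WINDOW:
        return "HWDetects"
    if run["exit_failed"] or run["timed_out"]:
        return "Failures"
    if run["output_matches_golden"]:
        return "Masked"
    return "SDCs"


base = {
    "cf_check_failed": False,
    "symptom_cycle": None,
    "inject_cycle": 5000,
    "exit_failed": False,
    "timed_out": False,
    "output_matches_golden": True,
}
masked = classify(base)
detected = classify({**base, "cf_check_failed": True})
symptom = classify({**base, "symptom_cycle": 6200})
sdc = classify({**base, "output_matches_golden": False})
print(masked, detected, symptom, sdc)
```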