Upload
neylan
View
65
Download
1
Embed Size (px)
DESCRIPTION
Unleashing Mayhem on Binary Code. Sang Kil Cha Thanassis Avgerinos Alexandre Rebert David Brumley Carnegie Mellon University. Automatic Exploit Generation Challenge. Automatically F ind B ugs & G enerate Exploits. I = input(); i f (I < 42) vuln (); else safe();. AEG. Exploits. - PowerPoint PPT Presentation
Citation preview
Unleashing Mayhem on Binary Code
Sang Kil ChaThanassis Avgerinos
Alexandre RebertDavid Brumley
Carnegie Mellon University
2
Automatic Exploit Generation Challenge
Automatically Find Bugs & Generate Exploits
AEG
Program Exploits
I = input();if (I < 42) vuln();else safe();
3
Automatic Exploit Generation Challenge
Automatically Find Bugs & Generate Exploits
Explore Program
4
Ghostscript v8.62 Bugint outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) { switch ( arg[0] ) { case ‘-’: { switch ( arg[1] ) { case 0: … default: outprintf( “unknown switch %s\n”, arg[1] ); } } default: … } …
Reading user input from command line
Buffer overflow
CVE-2009-4270
5
Multiple Pathsint outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) { switch ( arg[0] ) { case ‘-’: { switch ( arg[1] ) { case 0: … default: outprintf( “unknown switch %s\n”, arg[1] ); } } default: … } …
ManyBranches!
6
Automatic Exploit Generation Challenge
Automatically Find Bugs & Generate Exploits
Transfer Control to Attacker Code
(exec “/bin/sh”)
7
Generating Exploitsint outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) { switch ( arg[0] ) { case ‘-’: { switch ( arg[1] ) { case 0: … default: outprintf( “unknown switch %s\n”, arg[1] ); } } default: … } …
outp
rint
f
…
fmt
ret addr
count
args
bufuser
inpu
t
mai
n
esp
8
Generating Exploitsint outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) { switch ( arg[0] ) { case ‘-’: { switch ( arg[1] ) { case 0: … default: outprintf( “unknown switch %s\n”, arg[1] ); } } default: … } …
Read Return Address from Stack Pointer (esp)
8
outp
rint
f
…
fmt
ret addr
count
args
bufuser
inpu
t
mai
n
esp
Control Hijack Possible
9Source
int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) {…Executables (Binary)
01010010101010100101010010101010100101010101010101000100001000101001001001001000000010100010010101010010101001001010101001010101001010000110010101010111011001010101010101010100101010111110100101010101010101001010101010101010101010
Unleashing MayhemAutomatically Find Bugs & Generate Exploits
for Executables
10
Demo
11
if x < 100
if x*x = 0xffffffff
x = input()
How Mayhem Works:Symbolic Execution
if x > 42
vuln()
x can be anything
x > 42
(x > 42) ∧ (x*x != 0xffffffff)
(x > 42) ∧ (x*x != 0xffffffff)
∧ (x >= 100)
f t
f t
f t
12
f t
f t
f t
x = input()
How Mayhem Works:Symbolic Execution
if x > 42
if x*x = 0xffffffff
vuln()
x can be anything
x > 42
(x > 42) ∧ (x*x == 0xffffffff)
if x < 100
13
f t
f t
f t
x = input()
if x > 42
if x*x = 0xffffffff
vuln()
Path Predicate = Π
x can be anything
x > 42
(x > 42) ∧ (x*x == 0xffffffff)Π =
if x < 100
14
f t
f t
f t
x = input()
How Mayhem Works:Symbolic Execution
if x > 42
if x*x = 0xffffffff
vuln()
x can be anything
x > 42
(x > 42) ∧ (x*x == 0xffffffff)
ViolatesSafety Policyif x < 100
15
int outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}
Safety Policy in Mayhem
outp
rint
f
…
fmt
ret addr
count
args
bufuser
inpu
t
mai
n
esp
Return to user-controlled address
EIP not affected by user input
Instruction Pointer (EIP) level:
16
Exploit Generation
Π∧
input[0-31] = attack code∧
input[1038-1042] = attack code address
Exploit is an input that satisfies the predicate:
Exploit PredicateCan transfer
control to attack code?
Can position attack code?
17
Challenges
Symbolic Execution Exploit Generation
Efficient Resource Management
Symbolic IndexChallenge
Hybrid ExecutionIndex-based Memory
Model
18
Challenge 1: Resource Management inSymbolic Execution
19
Current Resource Management in Symbolic Execution
Online Symbolic Execution
Offline Symbolic Execution
(a.k.a. Concolic)
20
Offline ExecutionOne pathat a time Method 1:
Re-run from scratch Inefficient⟹
Re-executedevery time
21
Online Execution
Method 2:Stop forking
Miss paths⟹
Method 3: Snapshot process
Huge disk ⟹image
Hit Resource Cap
Fork at branches
22
Mayhem: Hybrid Execution
Our Method:Don’t snapshot state; use path predicate to recreate state
9.4M 500K
Hit Resource Cap
Fork at branches
Ghostscript 8.62
“Checkpoint”
23
Hybrid Execution
Manage #executorsin memory within resource cap✓
Minimize duplicated work✓
Lightweight checkpoints✓
24
Challenge 2: Symbolic Indices
25
Symbolic Indices
x = user_input();y = mem[x];assert (y == 42);
x can be anything
Which memory cell contains 42?
232 cells to check
Memory0 232 -1
26
One Cause: Overwritten Pointers
42
mem[0x11223344]
mem[input]
…
arg
ret addr
ptr
buf
user
inpu
t
… assert(*ptr==42); return;
ptr address 11223344ptr = 0x11223344
27
Another Cause: Table LookupsTable lookups in standard APIs:• Parsing: sscanf, vfprintf, etc.• Character test: isspace, isalpha, etc.• Conversion: toupper, tolower, mbtowc, etc.• …
28
Method 1: Concretization
Over-constrained• Misses 40% of exploits in our experiments
Π∧ mem[x] = 42 ∧ ’Π
Π ∧ x = 17∧ mem[x] = 42 ∧ ’Π
✓ Solvable✗ Exploits
29
Method 2: Fully Symbolic
Π ∧ mem[x] = 42 ∧ ’Π
✗ Solvable✓ Exploits
Π ∧ mem[x] = 42 ∧ mem[0] = v0 ∧…∧ mem[232-1] = v232-1
∧ ’Π
30
Our ObservationPath predicate (Π)constrains rangeof symbolic memoryaccesses
y = mem[x]
f t
x <= 42
x can be anything
f
t
x >= 50
Use symbolic execution state to:Step 1: Bound memory addresses referencedStep 2: Make search tree for memory address values
42 < x < 50Π
31
Step 1 — Find Boundsmem[ x & 0xff ]
1. Value Set Analysis1 provides initial bounds• Over-approximation
2. Query solver to refine bounds
Lowerbound = 0, Upperbound = 0xff
[1] Balakrishnan et al., Analyzing memory accesses in x86 executables, ICCC 2004
32
Step 2 — Index Search Tree Construction
y = mem[x]if x = 1 then y = 10
Index
MemoryValue
1012
22
20
if x = 2 then y = 12
if x = 3 then y = 22
if x = 4 then y = 20
ite( x < 3, left, right )
ite( x < 2, left, right )
33
Fully Symbolic vs.Index-based Memory Modeling
Fully Symbolic Index-based Piecewise Opt.0
5000
10000Time Timeout atphttpd
v0.4b
34
Index Search Tree Optimization:Piecewise Linear Approximation
y = 2*x + 10
y = - 2*x + 28
Index
MemoryValue
35
Piecewise Linear Approximation
Fully Symbolic Index-based Piecewise Opt.0
5000
10000Time
2x faster
atphttpd v0.4b
36
Exploit Generation
37
a2ps
aeon
aspell
atphttpd
freeradius
ghostscript
glftpd
gnugol
htget
htpasswd
iwconfig
mbse-bbs
nCompress
orzHttpd
psUtils
rsync
sharutils
socat
squirrel mail
tipxd
xgalaga
xtokkaetama
coolplayer
destiny
dizzy
galan
gsplayer
muse
soritong
1 10 100 1000 10000 100000
Linux
(22)
Windows
(7)
38
a2ps
aeon
aspell
atphttpd
freeradius
ghostscript
glftpd
gnugol
htget
htpasswd
iwconfig
mbse-bbs
nCompress
orzHttpd
psUtils
rsync
sharutils
socat
squirrel mail
tipxd
xgalaga
xtokkaetama
coolplayer
destiny
dizzy
galan
gsplayer
muse
soritong
1 10 100 1000 10000 100000
2 Unknown Bugs:FreeRadius,GnuGol
39
Limitations• We do not claim to find all exploitable bugs
• Given an exploitable bug, we do not guarantee we will always find an exploit
• Lots of room for improving symbolic execution, generating other types of exploits (e.g., info leaks), etc.
• We do not consider defenses, which may defend against otherwise exploitable bugs– Q [Schwartz et al., USENIX 2011]But Every Report is Actionable
40
Related Work• APEG [Brumley et al., IEEE S&P 2008]
– Uses patch to locate bug, no shellcode executed
• Automatic Generation of Control Flow Hijacking Exploits for Software Vulnerabilities
[Heelan, MS Thesis, U. of Oxford 2009]– Creates control flow hijack from crashing input
• AEG [Avgerinos et al., NDSS 2011]
– Find and generate exploits from source code
• BitBlaze, KLEE, Sage, S2E, etc.– Symbolic execution frameworks
41
Conclusion• Mayhem automatically generated 29 exploits
against Windows and Linux programs
• Hybrid Execution– Efficient resource management for symbolic
execution
• Index-based Memory Modeling– Handle symbolic memory in real-world
applications
42
Thank You• Our shepherd: Cristian Cadar• Anonymous reviewers• Maverick Woo, Spencer Whitman
43
Q&ASang Kil Cha ([email protected])http://www.ece.cmu.edu/~sangkilc/