Binary code
856c: 55
856d: 89e5
856f: 83ec08
8572: e8ddffffff
857b: c9
857c: c3
857d: 55
857e: 89e5
8581: 83ec18
858b: e8bfffffff
8591: c9
8592: c3
Binary code (with assembly)
856c: 55          push %ebp
856d: 89e5        mov %esp, %ebp
856f: 83ec08      sub $8, %esp
8572: e8ddffffff  call 857d
857b: c9          leave
857c: c3          ret
857d: 55          push %ebp
857e: 89e5        mov %esp, %ebp
8581: 83ec18      sub $0x18, %esp
858b: e8bfffffff  call 866c
8591: c9          leave
8592: c3          ret
Binary code (with symbol info)
main:
856c: 55          push %ebp
856d: 89e5        mov %esp, %ebp
856f: 83ec08      sub $8, %esp
8572: e8ddffffff  call foo
857b: c9          leave
857c: c3          ret
foo:
857d: 55          push %ebp
857e: 89e5        mov %esp, %ebp
8581: 83ec18      sub $0x18, %esp
858b: e8bfffffff  call printf
8591: c9          leave
8592: c3          ret
A lot of code is stripped
•Commercial applications (usually)
•Proprietary libraries (often)
•Viruses
•OS libraries and utilities (depends on OS and OS version)
Finding functions
•Build a call graph and traverse it to find function start addresses
•Opportunistic parsing: use existing symbol names and addresses where available
•Works on a spectrum of binaries ranging from binaries with all symbols to fully stripped binaries
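The traversal described above can be sketched as a worklist over call targets. This is a minimal illustration, not the parser's actual implementation: the `CODE` table, `function_body`, and `find_functions` are hypothetical names, the instruction map is a toy model loosely based on the example slides, and a function body is approximated as "everything up to the first ret". Existing symbol names are used where available; other functions get a synthetic name in the `func857d` style.

```python
from collections import deque

# Toy instruction map: address -> (mnemonic, call target or None).
# Hypothetical data that loosely follows the example on the slides.
CODE = {
    0x856c: ("push", None),
    0x856d: ("mov", None),
    0x856f: ("sub", None),
    0x8572: ("call", 0x857d),
    0x857b: ("leave", None),
    0x857c: ("ret", None),
    0x857d: ("push", None),
    0x857e: ("mov", None),
    0x8581: ("sub", None),
    0x858b: ("call", 0x866c),  # target lies outside the toy code map
    0x8591: ("leave", None),
    0x8592: ("ret", None),
}


def function_body(code, start):
    """Yield instructions from `start` up to and including the first ret.

    Simplifying assumption: the body is contiguous and ends at its first ret.
    """
    for addr in sorted(a for a in code if a >= start):
        yield addr, code[addr]
        if code[addr][0] == "ret":
            return


def find_functions(code, known_entries, symbols=None):
    """Traverse the call graph from known entry points.

    Returns a dict mapping function start address -> name. Names come from
    `symbols` when present (opportunistic parsing), else are synthesized.
    """
    symbols = dict(symbols or {})
    names = {}
    work = deque(known_entries)
    while work:
        entry = work.popleft()
        if entry in names:
            continue  # already discovered
        names[entry] = symbols.get(entry, f"func{entry:x}")
        if entry not in code:
            continue  # nothing to parse here (e.g. code we cannot see)
        for _, (mnemonic, target) in function_body(code, entry):
            if mnemonic == "call" and target is not None:
                work.append(target)
    return names


functions = find_functions(CODE, [0x856c], symbols={0x856c: "main"})
for addr, name in sorted(functions.items()):
    print(f"{addr:x}: {name}")
```

With a fully stripped binary, `symbols` would be empty and the only seed would be the program entry point, so every discovered function receives a synthetic name.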
Call Graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub $8, %esp
8572: call 857d
857b: leave
857c: ret
Call Graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub $8, %esp
8572: call func857d
857b: leave
857c: ret
func857d:
857d: push %ebp
Call Graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub $8, %esp
8572: call func857d
857b: leave
857c: ret
func857d:
857d: push %ebp
857e: mov %esp, %ebp
8581: sub $0x18, %esp
858b: call 865e
8591: call 866d
8596: leave
8597: ret
Parsing Functions
•Disassemble function’s code by traversing intra-procedural control flow graph
•Highest address determines function size
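A small sketch of this step, under stated assumptions: the `INSNS` table is a hypothetical, already-decoded instruction map (address -> length, kind, branch target), and `parse_function` is an illustrative name. The traversal follows branch targets and fall-through edges inside one function, and the end of the highest instruction reached gives the function size.

```python
# Hypothetical decoded instructions: address -> (length, kind, branch target).
# kind is one of "seq" (plain instruction), "jcc" (conditional branch),
# "jmp" (unconditional branch), "call", or "ret".
INSNS = {
    0x1000: (1, "seq", None),
    0x1001: (2, "seq", None),
    0x1003: (2, "jcc", 0x100a),  # conditional branch forward
    0x1005: (3, "seq", None),
    0x1008: (2, "jmp", 0x100c),  # unconditional branch, no fall-through
    0x100a: (2, "seq", None),
    0x100c: (1, "seq", None),
    0x100d: (1, "ret", None),
    0x100e: (1, "seq", None),    # belongs to the next function
}


def parse_function(insns, entry):
    """Traverse the intra-procedural CFG starting at `entry`.

    Returns (set of visited instruction addresses, function size in bytes).
    The size is derived from the highest address reached.
    """
    seen = set()
    work = [entry]
    end = entry
    while work:
        addr = work.pop()
        if addr in seen or addr not in insns:
            continue
        seen.add(addr)
        length, kind, target = insns[addr]
        end = max(end, addr + length)
        if kind == "ret":
            continue  # exit point: no successors inside the function
        if kind in ("jmp", "jcc"):
            work.append(target)  # branch edge, assumed intra-procedural
        if kind != "jmp":
            work.append(addr + length)  # fall-through edge
    return seen, end - entry


visited, size = parse_function(INSNS, 0x1000)
print(f"size = {size} bytes, {len(visited)} instructions")
```

The instruction at 0x100e is never visited because no control-flow edge from the function reaches it.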
Error Detection And Recovery
•CFG exit points are sometimes hard to identify
•Assume branches that are not obvious exits are intra-procedural
•Errors result in overestimation of function size
•Overlapping functions indicate error
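The overlap check can be illustrated with a short sketch. The function name `find_overlaps` and the sizes below are hypothetical. The idea is that if a missed exit point makes a function's size too large, its address range runs into the next function's start.

```python
def find_overlaps(functions):
    """Report neighbouring functions whose address ranges overlap.

    `functions` maps name -> (start address, size in bytes). Only pairs that
    are adjacent in address order are reported, which is enough to flag that
    an overestimate occurred.
    """
    ordered = sorted(functions.items(), key=lambda item: item[1][0])
    overlaps = []
    for (name_a, (start_a, size_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_a + size_a > start_b:
            overlaps.append((name_a, name_b))
    return overlaps


# Hypothetical parse results: sizes are illustrative only.
good = {"main": (0x856c, 0x11), "func857d": (0x857d, 0x16)}
bad = {"main": (0x856c, 0x20), "func857d": (0x857d, 0x16)}  # main overestimated

print(find_overlaps(good))
print(find_overlaps(bad))
```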
Problems and Solutions
•Functions that are only called indirectly
  •Problem: static call graph traversal does not discover these functions
  •Solution: examine gaps in text space and use heuristics to find functions
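The slides do not say which heuristics are used, so the following sketch assumes one plausible choice: scanning each gap for the common x86 prologue bytes `55 89 e5` (push %ebp; mov %esp, %ebp) seen in the earlier example. The names `find_gaps` and `scan_gap_for_prologues` and the toy text section are hypothetical.

```python
# Assumed heuristic: the byte pattern of "push %ebp; mov %esp, %ebp".
PROLOGUE = bytes.fromhex("5589e5")


def find_gaps(text_start, text_end, functions):
    """Return (gap_start, gap_end) ranges not covered by any known function.

    `functions` is a list of (start address, size in bytes).
    """
    gaps = []
    cursor = text_start
    for start, size in sorted(functions):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, start + size)
    if cursor < text_end:
        gaps.append((cursor, text_end))
    return gaps


def scan_gap_for_prologues(text, text_start, gap):
    """Return addresses inside `gap` where the prologue pattern occurs."""
    gap_start, gap_end = gap
    limit = gap_end - text_start
    found = []
    offset = text.find(PROLOGUE, gap_start - text_start, limit)
    while offset != -1:
        found.append(text_start + offset)
        offset = text.find(PROLOGUE, offset + 1, limit)
    return found


# Toy text section: one known 8-byte function, two padding bytes, then a
# function that is never called directly.
TEXT_START = 0x1000
text = bytes.fromhex("5589e583ec08c9c3") + bytes.fromhex("9090") + bytes.fromhex("5589e5c9c3")
known = [(0x1000, 8)]

gaps = find_gaps(TEXT_START, TEXT_START + len(text), known)
candidates = [addr for gap in gaps for addr in scan_gap_for_prologues(text, TEXT_START, gap)]
print([hex(a) for a in candidates])
```

A pattern match is only a candidate: each hit would still need to be parsed as a function and checked, for example against the overlap rule, before being accepted.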
Problems and Solutions cont’d
•Indirect jumps
  •Problem: need to find targets to complete CFG
  •Solution: parse jump tables to find possible targets
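As a rough illustration of jump table parsing, the sketch below assumes the table is an array of 32-bit little-endian absolute addresses and that the table ends at the first entry that does not point into the text section. Both are assumptions for this example; the slides do not describe how the table is located or bounded, and `parse_jump_table` is a hypothetical name.

```python
import struct


def parse_jump_table(data, data_base, table_addr, text_start, text_end, max_entries=256):
    """Read candidate jump targets from a table of 32-bit absolute addresses.

    `data` holds the bytes of the section containing the table, loaded at
    `data_base`. Reading stops at the first entry outside [text_start, text_end).
    """
    targets = []
    offset = table_addr - data_base
    for _ in range(max_entries):
        if offset < 0 or offset + 4 > len(data):
            break  # ran off the end of the section
        (target,) = struct.unpack_from("<I", data, offset)
        if not (text_start <= target < text_end):
            break  # assumed end of table
        targets.append(target)
        offset += 4
    return targets


# Toy data section at 0x9000 holding three table entries and a terminator.
DATA_BASE = 0x9000
data = struct.pack("<4I", 0x8590, 0x85a0, 0x85b0, 0)

targets = parse_jump_table(data, DATA_BASE, 0x9000, text_start=0x8000, text_end=0x9000)
print([hex(t) for t in targets])
```

Each recovered target would then be added as a successor of the indirect jump in the CFG.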
Problems and Solutions cont’d
•Exception handling code
  •Problem: creates code blocks that appear unreachable
  •Solution: get block addresses from exception table
Test Programs
program          size (MB), unstripped   size (MB), stripped   number of functions
paradyn          5.44                    3.51                  13,676
condor_starter   22.60                   2.50                  8,168
gimp             2.61                    2.20                  4,329
eon              10.44                   0.51                  1,163
om3              0.43                    0.30                  732
alara            3.65                    0.26                  948
bubba            0.09                    0.02                  66
Evaluation
•Parse time (includes CFG creation)
  •~1.4x faster than previous parser (with CFG)
  •~1.7x slower than previous parser (without CFG)
•Stripped parse time
  •Varies: 1.2x - 1.9x slower than unstripped
•Symbol recreation
  •80% - 98% of original functions
Related Work
•Binary rewriters/instrumentation tools
  •eel, emil, etch, goblin, leel, plto
•Disassemblers (lots available)
  •IDA Pro, objdump, dumpbin, etc.
•Symbol table reconstructors
  •dress, objdump-output-beautifier
Status
•Implemented on x86
•Ready for measurement and instrumentation
•Good start for security, but needs work