Intermission
Binary parsing
The Deconstruction of Dyninst
[Figure: a binary whose code contains functions such as _lock_foo, main, foo]
dynamic instrumentation, debugger, static binary analysis tools, malware analysis, binary
editor/rewriter, …
Familiar territory
Benjamin Schwarz, Saumya Debray, and Gregory R. Andrews. Disassembly of executable code revisited. 2002.
Cristina Cifuentes and K. John Gough. Decompilation of binary programs. 1995.
Richard L. Sites, Anton Chernoff, Matthew B. Kirk, Maurice P. Marks, and Scott G. Robinson. Binary translation. 1993.
Henrik Theiling. Extracting safe and precise control flow from binaries. 2000.
Ramkumar Chinchani and Eric van den Berg. A fast static analysis approach to detect exploit code inside network flows. 2005.
J. Troger and C. Cifuentes. Analysis of virtual method invocation for binary translation. 2002.
Laune C. Harris and Barton P. Miller. Practical analysis of stripped binary code. 2005.
Christopher Kruegel, William Robertson, Fredrik Valeur, and Giovanni Vigna. Static disassembly of obfuscated binaries. 2004.
Nathan Rosenblum, Xiaojin Zhu, Barton P. Miller, and Karen Hunt. Learning to analyze binary computer code. 2008.
Amitabh Srivastava and Alan Eustace. ATOM: a system for building customized program analysis tools. 1994.
Barton Miller, Jeffrey Hollingsworth, and Mark Callaghan. Dynamic Program Instrumentation for Scalable Performance Tools. 1994.
We’ve been down this road…
the DYNINST binary parser
recursive traversal parsing: non-contiguous functions, code sharing, non-returning functions
"gap" parsing heuristics: preamble scanning, handles stripped binaries
probabilistic code models: learn to recognize function entry points, very accurate gap parsing
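As a rough illustration of recursive traversal parsing, the sketch below follows control flow from known entry points over a toy instruction map. The `Insn` and `Kind` types and the `traverse` function are hypothetical stand-ins, not Dyninst code; a real parser decodes instructions from bytes. The sketch also assumes every call returns, which is exactly the assumption that non-returning functions break.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

typedef std::uint64_t Address;  // assumption: unsigned integer address type

// Toy instruction kinds (hypothetical); a real parser decodes these from bytes.
enum class Kind { Plain, Jump, CondJump, Call, Ret };

struct Insn {
    Kind kind;
    unsigned size;    // length in bytes, so fall-through is addr + size
    Address target;   // used by Jump, CondJump and Call
};

// Recursive traversal: start from known entry points and decode only the
// addresses that some control flow path reaches. Returns the reached
// instruction addresses; call targets are added to `entries` as new functions.
inline std::set<Address> traverse(const std::map<Address, Insn> &code,
                                  std::set<Address> &entries) {
    std::set<Address> seen;
    std::vector<Address> work(entries.begin(), entries.end());
    while (!work.empty()) {
        Address a = work.back();
        work.pop_back();
        auto it = code.find(a);
        // Skip addresses outside the code and addresses already visited.
        if (it == code.end() || !seen.insert(a).second) continue;
        const Insn &i = it->second;
        Address next = a + i.size;
        switch (i.kind) {
        case Kind::Plain:    work.push_back(next); break;
        case Kind::Jump:     work.push_back(i.target); break;
        case Kind::CondJump: work.push_back(i.target); work.push_back(next); break;
        case Kind::Call:     // assumes the callee returns to `next`
            entries.insert(i.target);
            work.push_back(i.target);
            work.push_back(next);
            break;
        case Kind::Ret:      break;
        }
    }
    return seen;
}
```

Addresses that no path reaches stay unparsed; those leftover regions are the "gaps" that gap parsing heuristics and code models then examine.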
What makes a parsing component?
[Figure: raw binary bits feeding into the Parsing API]
Parsing API
1. Binary code source abstraction
2. simple, intuitive representation: functions, blocks, edges
3. platform independence supported by previous Dyninst components: InstructionAPI, SymtabAPI
Flexible code sources
a binary code object
Parser code source requirements:
code location
code data
access to code bytes: unsigned char * buf (41 56 49 89 fe 41 55 …)
function hints & names: main, foo, bar, baz
a few (optional) facts: pointer width, external linkage, PLT
Code source contract
bool isValidAddress
bool isExecutableAddress
void * getPtrToInstruction
void * getPtrToData
unsigned getAddressWidth
bool isCode
bool isData
Address codeOffset
Address codeLength
Nine mandatory methods
SymtabAPI implementation in 232 lines (including optional hints, function names)
Any binary code object that can be memory mapped can be parsed
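To make the contract concrete, here is a minimal sketch of a code source backed by a single in-memory buffer. The slide gives only return types and method names, so the parameter lists, the `Address` typedef, and the class names `CodeSourceSketch` and `BufferCodeSource` are assumptions for illustration rather than the actual ParseAPI declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::uint64_t Address;  // assumption: unsigned integer address type

// Hypothetical base class listing the nine methods named on the slide.
// Parameter lists are guesses; the slide shows only return types and names.
class CodeSourceSketch {
public:
    virtual ~CodeSourceSketch() {}
    virtual bool isValidAddress(Address a) const = 0;
    virtual bool isExecutableAddress(Address a) const = 0;
    virtual void *getPtrToInstruction(Address a) const = 0;
    virtual void *getPtrToData(Address a) const = 0;
    virtual unsigned getAddressWidth() const = 0;
    virtual bool isCode(Address a) const = 0;
    virtual bool isData(Address a) const = 0;
    virtual Address codeOffset() const = 0;
    virtual Address codeLength() const = 0;
};

// Hypothetical code source over one buffer that is treated as all code.
class BufferCodeSource : public CodeSourceSketch {
public:
    BufferCodeSource(const unsigned char *buf, std::size_t len, Address base)
        : bytes_(buf, buf + len), base_(base) {
        assert(buf != nullptr);
    }

    bool isValidAddress(Address a) const override {
        return a >= base_ && a - base_ < bytes_.size();
    }
    bool isExecutableAddress(Address a) const override { return isValidAddress(a); }
    void *getPtrToInstruction(Address a) const override {
        if (!isValidAddress(a)) return nullptr;
        return const_cast<unsigned char *>(bytes_.data() + (a - base_));
    }
    void *getPtrToData(Address) const override { return nullptr; }  // no data region here
    unsigned getAddressWidth() const override { return 8; }         // assumed: width in bytes, 64-bit target
    bool isCode(Address a) const override { return isValidAddress(a); }
    bool isData(Address) const override { return false; }
    Address codeOffset() const override { return base_; }
    Address codeLength() const override { return bytes_.size(); }

private:
    std::vector<unsigned char> bytes_;  // private copy of the code bytes
    Address base_;                      // address of the first byte
};
```

Anything that can answer these queries can act as a code source, which is why a memory-mapped object of any origin qualifies.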
Simple control flow interface
Functions: start addr., extents
Blocks: start addr., end addr., in edges, out edges
Edges: src, targ, type
Functions contain Blocks; Blocks are joined by Edges
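The three objects and their fields can be sketched as plain structs. This is an illustrative data model built from the field names on the slide, not the ParseAPI class definitions; the `EdgeType` values, the half-open block range, and the `link` helper are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

typedef std::uint64_t Address;  // assumption: unsigned integer address type

struct Block;

// Illustrative edge types; the slide only says that edges carry a "type".
enum class EdgeType { Call, CondTaken, CondNotTaken, Direct, Indirect, Fallthrough, Return };

struct Edge {
    Block *src;
    Block *targ;
    EdgeType type;
};

struct Block {
    Address start;
    Address end;              // assumed half-open: one past the last byte
    std::vector<Edge *> in;
    std::vector<Edge *> out;
};

struct Function {
    Address start;
    std::vector<Block *> blocks;

    // Extents: merged [start, end) ranges covered by the function's blocks.
    // A non-contiguous function yields more than one range.
    std::vector<std::pair<Address, Address>> extents() const {
        std::vector<std::pair<Address, Address>> ranges;
        for (const Block *b : blocks) {
            assert(b->start < b->end);
            ranges.emplace_back(b->start, b->end);
        }
        std::sort(ranges.begin(), ranges.end());
        std::vector<std::pair<Address, Address>> merged;
        for (const auto &r : ranges) {
            if (!merged.empty() && r.first <= merged.back().second)
                merged.back().second = std::max(merged.back().second, r.second);
            else
                merged.push_back(r);
        }
        return merged;
    }
};

// Join two blocks with an edge and register it on both endpoints.
inline void link(Edge &e, Block &from, Block &to, EdgeType t) {
    e.src = &from;
    e.targ = &to;
    e.type = t;
    from.out.push_back(&e);
    to.in.push_back(&e);
}
```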
Views of control flow
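/* work: worklist of Block*, seeded with the starting block; seen: set of visited blocks */
while (!work.empty()) {
    Block *b = work.pop();
    if (!seen.insert(b).second) continue;   /* skip blocks already visited, so loops terminate */
    /* do something with b */
    edgeiter eit = b->out().begin();
    while (eit != b->out().end()) {
        work.push((*eit++)->trg());          /* enqueue the target block of each out edge */
    }
}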
Walking a control flow graph (starting here)
What if we only want intraprocedural edges?
Edge predicates
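/* work: worklist of Block*, seeded with the starting block; seen: set of visited blocks */
while (!work.empty()) {
    Block *b = work.pop();
    if (!seen.insert(b).second) continue;   /* skip blocks already visited, so loops terminate */
    /* do something with b */
    IntraProc pred;                          /* accept only intraprocedural edges */
    edgeiter eit = b->out().begin(&pred);
    while (eit != b->out().end()) {
        work.push((*eit++)->trg());          /* enqueue the target block of each accepted edge */
    }
}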
Walking a control flow graph
Edge Predicates
Tell iterator whether Edge argument should be returned
Composable (and, or)
Examples:
Intraprocedural
Single function context
Direct branches only
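One way composable predicates could be structured is sketched below. The `Edge` fields, the `EdgePredicate` base class, the `DirectOnly`, `And` and `Or` classes, and the `filter` helper are all hypothetical; only the `IntraProc` name and the idea of and/or composition come from the slide.

```cpp
#include <cassert>
#include <vector>

// Minimal stand-in for a CFG edge; the field names are illustrative.
struct Edge {
    bool interproc;  // call or return edge that crosses a function boundary
    bool indirect;   // target is computed at run time
};

// Hypothetical predicate interface: true means the iterator returns the edge.
struct EdgePredicate {
    virtual ~EdgePredicate() {}
    virtual bool operator()(const Edge *e) const = 0;
};

struct IntraProc : EdgePredicate {
    bool operator()(const Edge *e) const override { return !e->interproc; }
};

struct DirectOnly : EdgePredicate {
    bool operator()(const Edge *e) const override { return !e->indirect; }
};

// Composition: both predicates must accept the edge.
struct And : EdgePredicate {
    const EdgePredicate &lhs;
    const EdgePredicate &rhs;
    And(const EdgePredicate &l, const EdgePredicate &r) : lhs(l), rhs(r) {}
    bool operator()(const Edge *e) const override { return lhs(e) && rhs(e); }
};

// Composition: either predicate may accept the edge.
struct Or : EdgePredicate {
    const EdgePredicate &lhs;
    const EdgePredicate &rhs;
    Or(const EdgePredicate &l, const EdgePredicate &r) : lhs(l), rhs(r) {}
    bool operator()(const Edge *e) const override { return lhs(e) || rhs(e); }
};

// Stand-in for a filtering iterator: collect the edges the predicate accepts.
inline std::vector<const Edge *> filter(const std::vector<Edge> &edges,
                                        const EdgePredicate &p) {
    std::vector<const Edge *> kept;
    for (const Edge &e : edges)
        if (p(&e)) kept.push_back(&e);
    return kept;
}
```

Because `And` and `Or` are themselves predicates, they nest: a tool can build "intraprocedural and direct" once and hand it to every traversal.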
Extensible CFG objects
ParseAPI Function: simple, only needs to represent the control flow graph
Dyninst image_func (derived from Function): complex, handles instrumentation, liveness, relocation, etc.
Special callback points during parsing:
parse parse parse
unresBranchNotify(insn)
[derived class does stuff]
parse parse parse
Factory interface for CFG objects:
parser -> custom factory: mkfunc()
custom factory -> parser: (Function*)image_func
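The factory and callback pattern might look roughly like this. The names `mkfunc`, `unresBranchNotify`, `Function` and `image_func` come from the slide; the signatures, the `Factory` and `CustomFactory` classes, and the toy `parseOne` driver are assumptions, and an address stands in for the instruction argument.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

typedef std::uint64_t Address;  // assumption: unsigned integer address type

// Simple CFG function: only what the parser itself needs.
class Function {
public:
    explicit Function(Address a) : addr_(a) {}
    virtual ~Function() {}
    Address addr() const { return addr_; }
    // Callback point: the parser reports a branch it could not resolve.
    // The default does nothing. An address stands in for the instruction.
    virtual void unresBranchNotify(Address /*insn*/) {}
private:
    Address addr_;
};

// Hypothetical factory the parser calls to create CFG objects.
class Factory {
public:
    virtual ~Factory() {}
    virtual Function *mkfunc(Address a) { return new Function(a); }
};

// Tool-side subclass carrying extra state beyond the control flow graph.
class image_func : public Function {
public:
    explicit image_func(Address a) : Function(a) {}
    void unresBranchNotify(Address insn) override { unresolved.push_back(insn); }
    std::vector<Address> unresolved;  // branches the tool may revisit later
};

// Custom factory: hands the parser image_func objects typed as Function*.
class CustomFactory : public Factory {
public:
    Function *mkfunc(Address a) override { return new image_func(a); }
};

// Toy driver standing in for the parser: creates one function through the
// factory and fires the callback for each unresolved branch it "finds".
inline std::unique_ptr<Function> parseOne(Factory &f, Address entry,
                                          const std::vector<Address> &unresolvedBranches) {
    std::unique_ptr<Function> fn(f.mkfunc(entry));
    assert(fn != nullptr);
    for (Address a : unresolvedBranches) fn->unresBranchNotify(a);
    return fn;
}
```

The parser only ever sees `Function*`, so a tool can attach arbitrary state and behavior without the parser knowing about it.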
What’s in the box?
* box to be released soon
Binary Parser: recursive descent parsing; speculative gap parsing; cross platform: x86, x86-64, PPC, IA64, SPARC
Control Flow Graph Representation: graph interface; extensible objects for easy tool integration; exports Dyninst InstructionAPI interface
SymtabAPI-based Code Source: cross-platform; supports ELF, PE, XCOFF formats
Status
conception, code refactoring, interface design, Dyninst re-integration (major test case)
other major test case: compiler provenance (come tomorrow!)