Intermission

Intermission. Binary parsing 2 The Deconstruction of Dyninst _lock_foo main foo dynamic instrumentation, debugger, static binary analysis tools, malware

Page 1

Intermission

Page 2

Binary parsing

[Figure: nodes labeled _lock_foo, main, foo]

dynamic instrumentation, debugger, static binary analysis tools, malware analysis, binary editor/rewriter, …

The Deconstruction of Dyninst

Page 3

Familiar territory

Benjamin Schwarz, Saumya Debray, and Gregory R. Andrews. Disassembly of executable code revisited. 2002.

Cristina Cifuentes and K. John Gough. Decompilation of binary programs. 1995.

Richard L. Sites, Anton Chernoff, Matthew B. Kirk, Maurice P. Marks, and Scott G. Robinson. Binary translation. 1993.

Henrik Theiling. Extracting safe and precise control flow from binaries. 2000.

Ramkumar Chinchani and Eric van den Berg. A fast static analysis approach to detect exploit code inside network flows. 2005.

J. Troger and C. Cifuentes. Analysis of virtual method invocation for binary translation. 2002.

Laune C. Harris and Barton P. Miller. Practical analysis of stripped binary code. 2005.

Christopher Kruegel, William Robertson, Fredrik Valeur, and Giovanni Vigna. Static disassembly of obfuscated binaries. 2004.

Nathan Rosenblum, Xiaojin Zhu, Barton P. Miller, and Karen Hunt. Learning to analyze binary computer code. 2008.

Amitabh Srivastava and Alan Eustace. ATOM: a system for building customized program analysis tools. 1994.

Barton Miller, Jeffrey Hollingsworth, and Mark Callaghan. Dynamic Program Instrumentation for Scalable Performance Tools. 1994.

Page 4

We’ve been down this road…

recursive traversal parsing: non-contiguous functions, code sharing, non-returning functions

“gap” parsing heuristics: preamble scanning, handles stripped binaries

probabilistic code models: learn to recognize function entry points, very accurate gap parsing

the DYNINST binary parser
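Recursive traversal parsing can be sketched on a toy instruction set. Everything below is hypothetical and for illustration only: the two-byte encoding, the `Op` opcodes, `ParseResult`, and `recursiveTraversal` are assumptions, not Dyninst code. The sketch starts at a known entry point and decodes only what control flow reaches, so unreached bytes are left as a gap for gap parsing to examine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Toy two-byte instruction: an opcode followed by an absolute target (or padding).
enum Op : std::uint8_t { NOP = 0, JMP = 1, JCC = 2, CALL = 3, RET = 4 };

struct ParseResult {
    std::set<std::size_t> insns;    // addresses decoded as instructions
    std::set<std::size_t> entries;  // function entry points discovered so far
};

// Recursive traversal: follow control flow from `entry`, never scanning linearly
// past a return or an unconditional jump.
ParseResult recursiveTraversal(const std::vector<std::uint8_t>& code, std::size_t entry) {
    ParseResult r;
    std::vector<std::size_t> work{entry};
    r.entries.insert(entry);
    while (!work.empty()) {
        std::size_t addr = work.back();
        work.pop_back();
        // Stop at the end of the buffer or when we run into already-parsed code.
        while (addr + 1 < code.size() && r.insns.insert(addr).second) {
            std::uint8_t op = code[addr];
            std::size_t target = code[addr + 1];
            if (op == RET) break;                      // no successors
            if (op == CALL) r.entries.insert(target);  // call target starts a function
            if (op == JMP || op == JCC || op == CALL) work.push_back(target);
            if (op == JMP) break;                      // no fall-through
            addr += 2;                                 // NOP, JCC, CALL (and unknown opcodes) fall through
        }
    }
    return r;
}
```

For the byte sequence `CALL 8, RET, NOP, RET, NOP, RET`, parsing from address 0 decodes addresses 0, 2, 8, and 10 and finds entries 0 and 8. Addresses 4 and 6 are never reached, which is the kind of gap the heuristics and code models above are meant to handle.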

Page 5

What makes a parsing component?

[Figure: a binary shown as rows of bits; Parsing API]

1. Binary code source abstraction

2. Simple, intuitive representation: functions, blocks, edges

3. Platform independence, supported by previous Dyninst components: InstructionAPI, SymtabAPI

Page 6

Flexible code sources

a binary code object

Parser code source requirements:

code location (code, data)

access to code bytes: unsigned char * buf (41 56 49 89 fe 41 55 …)

function hints & names: main, foo, bar, baz

a few (optional) facts: pointer width, external linkage, PLT

Page 7

Code source contract

Nine mandatory methods:

bool isValidAddress

bool isExecutableAddress

void * getPtrToInstruction

void * getPtrToData

unsigned getAddressWidth

bool isCode

bool isData

Address codeOffset

Address codeLength

SymtabAPI implementation in 232 lines (including optional hints, function names)

Any binary code object that can be memory mapped can be parsed
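As a rough illustration of the contract, here is a sketch of a code source backed by a plain byte buffer. The nine method names come from the slide; the parameter lists, `const` qualifiers, the `Address` typedef, and the `BufferCodeSource` class are assumptions made for this example and may not match the released interface.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Address = std::uint64_t;  // assumed representation of an address

// Hypothetical rendering of the nine-method contract; signatures are assumed.
class CodeSource {
public:
    virtual ~CodeSource() = default;
    virtual bool isValidAddress(Address a) const = 0;
    virtual bool isExecutableAddress(Address a) const = 0;
    virtual void* getPtrToInstruction(Address a) const = 0;
    virtual void* getPtrToData(Address a) const = 0;
    virtual unsigned getAddressWidth() const = 0;
    virtual bool isCode(Address a) const = 0;
    virtual bool isData(Address a) const = 0;
    virtual Address codeOffset() const = 0;
    virtual Address codeLength() const = 0;
};

// One in-memory buffer, loaded at `base`, treated entirely as code.
class BufferCodeSource : public CodeSource {
    std::vector<unsigned char> buf_;
    Address base_;
public:
    BufferCodeSource(std::vector<unsigned char> bytes, Address base)
        : buf_(std::move(bytes)), base_(base) {}

    bool isValidAddress(Address a) const override {
        return a >= base_ && a - base_ < buf_.size();
    }
    bool isExecutableAddress(Address a) const override { return isValidAddress(a); }
    void* getPtrToInstruction(Address a) const override {
        return isValidAddress(a) ? const_cast<unsigned char*>(buf_.data() + (a - base_))
                                 : nullptr;
    }
    void* getPtrToData(Address) const override { return nullptr; }  // no data region here
    unsigned getAddressWidth() const override { return 8; }         // bytes; assumes a 64-bit target
    bool isCode(Address a) const override { return isValidAddress(a); }
    bool isData(Address) const override { return false; }
    Address codeOffset() const override { return base_; }
    Address codeLength() const override { return buf_.size(); }
};
```

A parser written against `CodeSource` never needs to know whether the bytes came from an ELF section, a memory dump, or a test vector, which is the point of the memory-mapping claim above.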

Page 8

Simple control flow interface

Functions: start addr., extents

Blocks: start addr., end addr., in edges, out edges

Edges: src, targ, type

Functions contain Blocks; Blocks are joined by Edges
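The three objects and their relationships might be modeled roughly as follows. This is a sketch under stated assumptions: the field names follow the slide, but the `EdgeType` values, the `link` helper, and the way `extents()` merges contiguous blocks are illustrative choices, not the ParseAPI definitions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Address = std::uint64_t;
struct Block;

enum class EdgeType { Call, CondTaken, CondNotTaken, Direct, Indirect, Fallthrough, Return };

struct Edge {
    Block* src;
    Block* targ;
    EdgeType type;
};

struct Block {
    Address start;
    Address end;               // one past the last byte of the block
    std::vector<Edge*> in;
    std::vector<Edge*> out;
};

struct Function {
    Address start;               // entry address
    std::vector<Block*> blocks;  // blocks the function contains

    // Extents: address ranges covered by the function. Adjacent blocks are
    // merged, so a non-contiguous function yields more than one range.
    std::vector<std::pair<Address, Address>> extents() const {
        std::vector<Block*> sorted = blocks;
        std::sort(sorted.begin(), sorted.end(),
                  [](const Block* a, const Block* b) { return a->start < b->start; });
        std::vector<std::pair<Address, Address>> out;
        for (const Block* b : sorted) {
            if (!out.empty() && out.back().second == b->start)
                out.back().second = b->end;          // contiguous: grow the current extent
            else
                out.emplace_back(b->start, b->end);  // gap: start a new extent
        }
        return out;
    }
};

// Join two blocks with an edge and register it on both endpoints.
inline void link(Edge& e, Block& from, Block& to, EdgeType t) {
    e.src = &from;
    e.targ = &to;
    e.type = t;
    from.out.push_back(&e);
    to.in.push_back(&e);
}
```

With blocks at [0x10, 0x20), [0x20, 0x28), and [0x40, 0x48), `extents()` reports two ranges, [0x10, 0x28) and [0x40, 0x48).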

Page 9

Views of control flow

Walking a control flow graph

    std::set<Block*> seen;                 // visited blocks, so loops terminate
    while (!work.empty()) {
        Block *b = work.front();           // work is assumed to be a std::queue<Block*>
        work.pop();
        if (!seen.insert(b).second) continue;

        /* do something with b */

        edgeiter eit = b->out().begin();
        while (eit != b->out().end()) {
            work.push((*eit)->trg());      // push the edge's target block, not the edge
            ++eit;
        }
    }

[Figure: control flow graph annotated "starting here"]

What if we only want intraprocedural edges?

Page 10

Edge predicates

Walking a control flow graph

    std::set<Block*> seen;                 // visited blocks, so loops terminate
    while (!work.empty()) {
        Block *b = work.front();           // work is assumed to be a std::queue<Block*>
        work.pop();
        if (!seen.insert(b).second) continue;

        /* do something with b */

        IntraProc pred;                    // accept only intraprocedural edges
        edgeiter eit = b->out().begin(&pred);
        while (eit != b->out().end()) {
            work.push((*eit)->trg());      // push the edge's target block, not the edge
            ++eit;
        }
    }

Edge Predicates

Tell the iterator whether an Edge argument should be returned

Composable (and, or)

Examples: intraprocedural, single function context, direct branches only
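Composition of predicates can be sketched with `std::function`. This is a stand-alone illustration, not the ParseAPI predicate classes: `EdgePredicate`, `intraProc`, `directOnly`, `both`, `either`, and the two boolean flags on `Edge` are all names assumed for the example.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <set>
#include <vector>

struct Block;
struct Edge {
    Block* src;
    Block* targ;
    bool interproc;  // true for call/return edges
    bool direct;     // true when the target is known statically
};
struct Block {
    int id;
    std::vector<Edge*> out;
};

// A predicate tells the walk whether an edge should be followed.
using EdgePredicate = std::function<bool(const Edge&)>;

inline EdgePredicate intraProc()  { return [](const Edge& e) { return !e.interproc; }; }
inline EdgePredicate directOnly() { return [](const Edge& e) { return e.direct; }; }

// Composition: "and" and "or" of two predicates.
inline EdgePredicate both(EdgePredicate a, EdgePredicate b) {
    return [a, b](const Edge& e) { return a(e) && b(e); };
}
inline EdgePredicate either(EdgePredicate a, EdgePredicate b) {
    return [a, b](const Edge& e) { return a(e) || b(e); };
}

// Breadth-first walk that follows only edges accepted by `pred`;
// returns block ids in visit order.
std::vector<int> walk(Block* start, const EdgePredicate& pred) {
    std::vector<int> order;
    std::set<Block*> seen{start};
    std::queue<Block*> work;
    work.push(start);
    while (!work.empty()) {
        Block* b = work.front();
        work.pop();
        order.push_back(b->id);
        for (Edge* e : b->out)
            if (pred(*e) && seen.insert(e->targ).second)
                work.push(e->targ);
    }
    return order;
}
```

Passing `both(intraProc(), directOnly())` restricts the walk to direct branches within one function, while the traversal loop itself stays unchanged.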

Page 11

Extensible CFG objects

ParseAPI Function: simple, only needs to represent the control flow graph

Dyninst image_func: complex, handles instrumentation, liveness, relocation, etc.

[Figure: image_func derived from Function]

Special callback points during parsing:

parse parse parse

unresBranchNotify(insn)

[derived class does stuff]

parse parse parse

Factory interface for CFG objects:

parser

custom factory: mkfunc()

(Function*)image_func
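The factory and callback mechanism could look roughly like this. The names `mkfunc` and `unresBranchNotify` come from the slide; the class layout, the signatures, and the `ImageFunc`, `ImageFuncFactory`, and `parseOne` stand-ins are assumptions for illustration, not the Dyninst or ParseAPI definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

using Address = std::uint64_t;

// Base CFG object: holds only what a control flow graph needs.
class Function {
public:
    explicit Function(Address entry) : entry_(entry) {}
    virtual ~Function() = default;
    Address entry() const { return entry_; }
    // Callback point: the parser met a branch it could not resolve.
    virtual void unresBranchNotify(Address /*insn*/) {}
private:
    Address entry_;
};

// The parser asks a factory whenever it needs a new Function.
class Factory {
public:
    virtual ~Factory() = default;
    virtual std::unique_ptr<Function> mkfunc(Address entry) {
        return std::make_unique<Function>(entry);
    }
};

// Tool-side subclass with extra state (a stand-in for image_func).
class ImageFunc : public Function {
public:
    using Function::Function;
    void unresBranchNotify(Address insn) override { unresolved.push_back(insn); }
    std::vector<Address> unresolved;  // branches to revisit, e.g. at run time
};

class ImageFuncFactory : public Factory {
public:
    std::unique_ptr<Function> mkfunc(Address entry) override {
        return std::make_unique<ImageFunc>(entry);
    }
};

// Stand-in for the parser: creates a function through the factory and
// reports one unresolved branch, as a real parser might mid-traversal.
std::unique_ptr<Function> parseOne(Factory& f, Address entry, Address unresolvedBranch) {
    std::unique_ptr<Function> fn = f.mkfunc(entry);
    fn->unresBranchNotify(unresolvedBranch);
    return fn;
}
```

The parser only ever sees `Function*`, so a tool can substitute its richer objects without the parser changing; with the default `Factory`, the callback is a no-op.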

Page 12

What’s in the box?*

Binary Parser: recursive descent parsing; speculative gap parsing; cross platform: x86, x86-64, PPC, IA64, SPARC

Control Flow Graph Representation: graph interface; extensible objects for easy tool integration; exports Dyninst InstructionAPI interface

SymtabAPI-based Code Source: cross-platform; supports ELF, PE, XCOFF formats

* box to be released soon

Page 13

Status

conception

code refactoring

interface design

Dyninst re-integration (major test case)

other major test case: compiler provenance (come tomorrow!)