Taint scope


TAINTSCOPE: A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection

Tielei Wang (1), Tao Wei (1), Guofei Gu (2), Wei Zou (1)

(1) Peking University, China
(2) Texas A&M University, US

TERMS

Checksum: a value computed over data to check its integrity. Used in network protocols and file formats.

Fuzzing: generating malformed inputs and feeding them to the application.

Dynamic taint analysis: runs a program and observes which computations are affected by predefined taint sources (e.g. the input).

[Figure: a checksum function is computed over the data, and the result is stored in a checksum field alongside the data.]

THE PROBLEM

The input mutation space is enormous.

If the program employs a checksum mechanism, most malformed inputs are dropped at an early stage.

    void decode_image(FILE* fd) {
        ...
        int length = get_length(fd);
        int recomputed_chksum = checksum(fd, length);
        int chksum_in_file = get_checksum(fd);
        // this comparison checks the integrity of the input
        if (chksum_in_file != recomputed_chksum)
            error();
        int Width = get_width(fd);
        int Height = get_height(fd);
        // Width and Height come from the input, so this product can overflow
        int size = Width * Height * sizeof(int);
        int* p = malloc(size);
        ...
        for (int i = 0; i < Height; i++) {
            // read ith row to p
            read_row(p + Width * i, i, fd);
        }
    }
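The allocation size in decode_image is a product of two input-controlled values, which is why the bytes behind Width and Height matter to a fuzzer. The following is a minimal sketch, assuming 32-bit sizes and 4-byte pixels, of how such a product can wrap to a tiny value and of one way a decoder could reject it. The names `alloc_size_wrapped` and `alloc_size_checked` are hypothetical and do not come from the slides.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors Width*Height*sizeof(int) from the listing, using unsigned 32-bit
   arithmetic so that the wraparound is well defined in C. */
uint32_t alloc_size_wrapped(uint32_t width, uint32_t height) {
    return width * height * (uint32_t)sizeof(int32_t);
}

/* Returns 1 and stores the byte count in *out when it fits in 32 bits,
   or returns 0 when the true product would not fit. */
int alloc_size_checked(uint32_t width, uint32_t height, uint32_t *out) {
    uint64_t cells = (uint64_t)width * height; /* cannot overflow 64 bits */
    if (cells > UINT32_MAX / sizeof(int32_t))
        return 0;
    *out = (uint32_t)(cells * sizeof(int32_t));
    return 1;
}
```

With width 0x40000001 and height 4, the wrapped size is only 16 bytes, while the row-reading loop would still write four rows of 0x40000001 integers each.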

THE IDEA

Infer whether and where a program checks the integrity of its input.

Identify which input bytes can flow into sensitive points: byte-level taint analysis monitors how the application uses the input data.

Create malformed inputs that focus on the “hot bytes”.

Repair the checksum fields in those inputs to expose vulnerabilities.

Fully automatic.

Found 27 previously unknown vulnerabilities in applications such as Adobe Acrobat and Google Picasa.

HOW DOES IT WORK?

1. Dynamic taint tracing
2. Detecting checksum
3. Directed fuzzing
4. Repairing crashed samples

[Figure: TaintScope architecture. Components: Execution Monitor, Checksum Locator, Directed Fuzzer, Checksum Repairer. Artifacts passed between them: hot bytes info, instruction profile, modified program, crashed samples, reports.]

1. DYNAMIC TAINT TRACING

Runs the program with well-formed input.

The execution monitor records:

Which input bytes are related to arguments of API functions (e.g. malloc, strcpy): the “hot bytes” report.

Which input bytes each conditional jump instruction (e.g. JZ, JE, JB) depends on: the checksum report.

Considers only data flow (no control flow).

Instruments data movement (e.g. MOV, PUSH), arithmetic (e.g. SUB, ADD), and logic (e.g. AND, XOR) instructions.

Taints every value written by an instruction with the union of the taint labels of all values used by that instruction.

The eflags register is tracked as well.

Example:

    before:       eax {0x6, 0x7}, ebx {0x8, 0x9}
    instruction:  add eax, ebx
    after:        eax {0x6, 0x7, 0x8, 0x9}, eflags {0x6, 0x7, 0x8, 0x9}
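The union rule for `add eax, ebx` can be sketched in C as follows. This is only an illustration under simplifying assumptions: a taint set is a 64-bit mask with one bit per input offset (so only offsets 0 to 63 are representable), and the names `taint_set`, `shadow_state`, `label_range`, `taint_add`, and `taint_mov` are hypothetical, not TaintScope's own.

```c
#include <stdint.h>

/* One bit per input byte offset; offsets 0..63 only, to keep the sketch small. */
typedef uint64_t taint_set;

enum { REG_EAX, REG_EBX, REG_EFLAGS, REG_COUNT };

/* Shadow state: the set of input offsets each register currently depends on. */
typedef struct {
    taint_set reg[REG_COUNT];
} shadow_state;

/* Build the set {lo, ..., hi} (inclusive, 0 <= lo <= hi <= 63). */
taint_set label_range(int lo, int hi) {
    taint_set t = 0;
    for (int i = lo; i <= hi; i++)
        t |= (taint_set)1 << i;
    return t;
}

/* Shadow of `add dst, src`: the destination and eflags both receive the
   union of the labels of the two operands. */
void taint_add(shadow_state *s, int dst, int src) {
    taint_set u = s->reg[dst] | s->reg[src];
    s->reg[dst] = u;
    s->reg[REG_EFLAGS] = u;
}

/* Shadow of `mov dst, src`: the destination takes the source labels and
   eflags is left unchanged, because MOV does not write flags. */
void taint_mov(shadow_state *s, int dst, int src) {
    s->reg[dst] = s->reg[src];
}
```

Starting from eax {0x6, 0x7} and ebx {0x8, 0x9}, `taint_add(&s, REG_EAX, REG_EBX)` leaves both eax and eflags labeled {0x6, 0x7, 0x8, 0x9}, matching the slide's example.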

1. DYNAMIC TAINT TRACING - EXAMPLE

Input size is 1024 bytes.

    int Width = get_width(fd);
    int Height = get_height(fd);
    int size = Width * Height * sizeof(int);
    int* p = malloc(size);

“hot bytes” report (input bytes 0x8 to 0xf flow into the argument of malloc):

    …0x8048d5b: invoking malloc: [0x8,0xf]…

1. DYNAMIC TAINT TRACING - EXAMPLE

Input size is 1024 bytes.

    if (chksum_in_file != recomputed_chksum)
        error();

Checksum report (the JZ at 0x8048d4f depends on 1024 input bytes, offsets 0x0 to 0x3ff):

    …0x8048d4f: JZ: 1024: [0x0,0x3ff]…

2. DETECTING CHECKSUM

Checksum detector: identifies potential checksum check points.

Observation: the recomputed checksum value depends on many input bytes.

Instruments conditional jumps. Before each one executes, checks whether the number of taint marks associated with the eflags register exceeds a threshold.

Problem: values computed from decompressed bytes also depend on many input bytes, so this test alone can flag branches that are not checksum checks.
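The threshold test on eflags can be sketched as below. This is a minimal illustration, assuming inputs of at most 1024 bytes and a bitmap taint representation. The names `offset_set`, `taint_mark_range`, `taint_count`, and `is_checksum_candidate` are hypothetical, and the threshold of 16 used in the usage note is the value quoted on the evaluation slide.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_INPUT 1024            /* assumed maximum input size in bytes */
#define WORDS (MAX_INPUT / 64)

/* Bitmap of input offsets that a value (here: eflags) depends on. */
typedef struct {
    uint64_t w[WORDS];
} offset_set;

/* Mark input offsets lo..hi (inclusive) as influencing the value. */
void taint_mark_range(offset_set *t, size_t lo, size_t hi) {
    for (size_t i = lo; i <= hi && i < MAX_INPUT; i++)
        t->w[i / 64] |= (uint64_t)1 << (i % 64);
}

/* Number of distinct input bytes in the set. */
size_t taint_count(const offset_set *t) {
    size_t n = 0;
    for (size_t k = 0; k < WORDS; k++)
        for (uint64_t x = t->w[k]; x != 0; x &= x - 1) /* clear lowest set bit */
            n++;
    return n;
}

/* A conditional jump is a candidate checksum check when the flags it tests
   depend on more input bytes than the threshold. */
int is_checksum_candidate(const offset_set *eflags, size_t threshold) {
    return taint_count(eflags) > threshold;
}
```

A jump whose flags depend on offsets 0x0 to 0x3ff (1024 bytes) is a candidate at threshold 16, while one that depends only on offsets 0x8 to 0xf (8 bytes) is not.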


2. DETECTING CHECKSUM

Refinement:

Well-formed inputs can pass the checksum test, but most malformed inputs cannot.

Run well-formed inputs and identify the always-taken and always-not-taken conditional jump instructions.

Run malformed inputs and again identify the always-taken and always-not-taken conditional jump instructions.

Identify the conditional jump instructions that behave completely differently when processing well-formed and malformed inputs.
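The refinement step amounts to comparing per-branch behavior across the two groups of runs. The sketch below assumes the monitor has already counted, for each candidate jump, how often it was taken and not taken under well-formed and malformed inputs. The structure `branch_profile`, the enum `bypass_rule`, and the function `classify_branch` are hypothetical names used only for illustration.

```c
/* Per-branch counters collected over the two groups of runs. */
typedef struct {
    unsigned long addr;       /* address of the conditional jump */
    unsigned good_taken;      /* times taken on well-formed inputs */
    unsigned good_not_taken;  /* times not taken on well-formed inputs */
    unsigned bad_taken;       /* times taken on malformed inputs */
    unsigned bad_not_taken;   /* times not taken on malformed inputs */
} branch_profile;

typedef enum { NOT_A_CHECK, ALWAYS_TAKEN, ALWAYS_NOT_TAKEN } bypass_rule;

/* A branch counts as a checksum check only if it always goes one way on
   well-formed inputs and always the other way on malformed inputs. The
   returned rule is the well-formed behavior that a fuzzer would enforce. */
bypass_rule classify_branch(const branch_profile *b) {
    int good_always = b->good_taken > 0 && b->good_not_taken == 0;
    int good_never  = b->good_not_taken > 0 && b->good_taken == 0;
    int bad_always  = b->bad_taken > 0 && b->bad_not_taken == 0;
    int bad_never   = b->bad_not_taken > 0 && b->bad_taken == 0;

    if (good_always && bad_never)
        return ALWAYS_TAKEN;
    if (good_never && bad_always)
        return ALWAYS_NOT_TAKEN;
    return NOT_A_CHECK;
}
```

A branch that behaves the same in both groups, or that goes both ways within one group, is classified as `NOT_A_CHECK`, which is how high-taint branches unrelated to integrity checking would be filtered out.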

2. DETECTING CHECKSUM

Checksum detector: creates bypass rules (always-taken or always-not-taken).

    if (chksum_in_file != recomputed_chksum)
        error();

Checksum report:

    …0x8048d4f: JZ: 1024: [0x0,0x3ff]…

Bypass rule:

    0x8048d4f: JZ: always-taken

2. DETECTING CHECKSUM

Checksum detector: checksum field identification.

The input bytes that affect chksum_in_file are the checksum field.

    if (chksum_in_file != recomputed_chksum)
        error();

3. DIRECTED FUZZING

Generates malformed test cases and feeds them to the original or the instrumented program.

Following the bypass rules, alters the execution trace at each check point by setting the eflags register.

3. DIRECTED FUZZING

All malformed test cases are constructed from the “hot bytes” information, using attack heuristics:

Bytes that influence memory allocation are set to small, large, or negative values.

Bytes that flow into string functions are replaced by format specifiers such as %n and %p.

Output: test cases that cause the program to crash or to consume 100% CPU.
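The two attack heuristics can be sketched as in-place mutations of a hot-byte range. This is an illustration only: it assumes the hot bytes form a little-endian integer field, which the slides do not state, and the names `alloc_variant`, `mutate_alloc_field`, and `mutate_string_field` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { VAL_SMALL, VAL_LARGE, VAL_NEGATIVE } alloc_variant;

/* Overwrite the integer field buf[off .. off+len) with a boundary value.
   Assumes a little-endian field; the caller guarantees the range is valid. */
void mutate_alloc_field(uint8_t *buf, size_t off, size_t len, alloc_variant v) {
    if (len == 0)
        return;
    switch (v) {
    case VAL_SMALL:                 /* the value 1 */
        memset(buf + off, 0x00, len);
        buf[off] = 0x01;
        break;
    case VAL_LARGE:                 /* largest positive signed value */
        memset(buf + off, 0xff, len);
        buf[off + len - 1] = 0x7f;
        break;
    case VAL_NEGATIVE:              /* -1 in two's complement */
        memset(buf + off, 0xff, len);
        break;
    }
}

/* Fill buf[off .. off+len) with a repeated format specifier such as "%n". */
void mutate_string_field(uint8_t *buf, size_t off, size_t len, const char *spec) {
    size_t n = strlen(spec);
    if (n == 0)
        return;
    for (size_t i = 0; i < len; i++)
        buf[off + i] = (uint8_t)spec[i % n];
}
```

Only the bytes named in the hot bytes report are touched, which is what keeps the mutation space small compared with mutating the whole input.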


3. DIRECTED FUZZING

    if (chksum_in_file != recomputed_chksum)
        error();
    int Width = get_width(input_file);
    int Height = get_height(input_file);
    int size = Width * Height * sizeof(int);
    int* p = malloc(size);

“hot bytes” report:

    …0x8048d5b: invoking malloc: [0x8,0xf]…

Checksum report:

    …0x8048d4f: JZ: 1024: [0x0,0x3ff]…

Bypass info:

    0x8048d4f: JZ: always-taken

Before executing 0x8048d4f, the fuzzer sets the ZF flag in eflags to the opposite value.
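Flipping ZF before a JZ reverses the branch, because JZ is taken exactly when ZF is set. The sketch below models that on a plain 32-bit copy of eflags. ZF is bit 6 of EFLAGS on x86, while the helper names `jz_taken`, `flip_zf`, and `force_jz_taken` are hypothetical.

```c
#include <stdint.h>

#define EFLAGS_ZF (1u << 6)   /* zero flag: bit 6 of the x86 EFLAGS register */

/* JZ jumps exactly when ZF is set. */
int jz_taken(uint32_t eflags) {
    return (eflags & EFLAGS_ZF) != 0;
}

/* Set ZF to the opposite value, leaving every other flag untouched. */
uint32_t flip_zf(uint32_t eflags) {
    return eflags ^ EFLAGS_ZF;
}

/* Enforce an always-taken rule for a JZ regardless of the comparison result. */
uint32_t force_jz_taken(uint32_t eflags) {
    return eflags | EFLAGS_ZF;
}
```

For a malformed input the checksum comparison leaves ZF clear, so flipping it makes the JZ at the check point behave as it does for a well-formed input.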

4. REPAIRING CRASHED SAMPLES

Fixing is expensive, so checksum fields are fixed only in the test cases that caused a crash.

How?

Cr: raw data in the checksum field
D: input data protected by the checksum field
Checksum(): the complete checksum algorithm
T: transformation

We want to satisfy the constraint:

    Checksum(D) == T(Cr)

4. REPAIRING CRASHED SAMPLES

Uses symbolic execution to solve:

    Checksum(D) == T(Cr)

Checksum(D) is a runtime-determinable constant c, so the constraint becomes:

    c == T(Cr)

Only Cr is a symbolic value.

Common transformations (e.g. converting from hex/oct to decimal) can be solved by existing solvers (STP).
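To make the constraint c == T(Cr) concrete, the sketch below assumes one particular transformation: the checksum field holds eight ASCII hex digits that the program parses into a 32-bit value. For that T the equation can be solved by inverting T directly, which is a stand-in for the solver-based approach on the slide rather than a reproduction of it. The names `transform_hex8` and `repair_hex8` are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed transformation T: parse 8 ASCII hex digits into a 32-bit value. */
uint32_t transform_hex8(const uint8_t *field) {
    char tmp[9];
    memcpy(tmp, field, 8);
    tmp[8] = '\0';
    return (uint32_t)strtoul(tmp, NULL, 16);
}

/* Solve c == T(Cr) for Cr by writing c back as 8 lowercase hex digits.
   c is the recomputed checksum observed at run time. */
void repair_hex8(uint8_t *field, uint32_t c) {
    char tmp[9];
    snprintf(tmp, sizeof tmp, "%08lx", (unsigned long)c);
    memcpy(field, tmp, 8);
}
```

After `repair_hex8(field, c)`, `transform_hex8(field)` returns c again, so the original, unmodified program would accept the repaired sample at its checksum check.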

4. REPAIRING CRASHED SAMPLES

If the new test case causes the original program to crash, a potential vulnerability is detected!

EVALUATION

[Table: an incomplete list of applications]

[Table: “hot bytes” identification results, memory allocation]

[Table: checksum identification results, threshold = 16]

[Table: correct checksum fields]

27 previously unknown vulnerabilities:

MS Paint, Google Picasa, Adobe Acrobat, ImageMagick

IrfanView, GStreamer, Winamp, XEmacs

Amaya, Dillo, wxWidgets, PDFlib

[Table: vulnerabilities detected by TaintScope]

DISCUSSION

TaintScope cannot deal with secure integrity check schemes (e.g. cryptographic hash algorithms, digital signatures): it is impossible to generate valid test cases.

Limited effectiveness when all input data is encrypted (tracking decrypted data).

Identification of checksum check points can be affected by the quality of the inputs.

Does not track control-flow propagation.

Not all x86 instructions are instrumented by the execution monitor.

CONCLUSION

TaintScope can perform:

Directed fuzzing
  Identifies which bytes flow into system/library calls.
  Dramatically reduces the mutation space.

Checksum-aware fuzzing
  Disables checksum checks by control flow alteration.
  Generates correct checksum fields in invalid inputs.

QUESTIONS
