160
Introduction to Fuzzing Alma Oracevic alma [email protected]

Introduction to Fuzzing

  • Upload
    others

  • View
    13

  • Download
    0

Embed Size (px)

Citation preview

Introduction to Fuzzing

Alma Oracevic

[email protected]

About the technique

2

▪It is about Fuzzing

References:

1. book (Chapter

1, section 1.3):

Fuzzing for

Software Security

Testing and

Quality Assurance.

By Ari Takanen,

Jared DeMott,

Charlie Miller

2. Article "Fuzzing:

Hack, Art, and

Science”

By P. Godefroid

About the technique

▪It is about Fuzzing

References:

1. book (Chapter

1, section 1.3):

Fuzzing for

Software Security

Testing and

Quality Assurance.

By Ari Takanen,

Jared DeMott,

Charlie Miller

3

2. Article "Fuzzing:

Hack, Art, and

Science”

By P. Godefroid

About the technique

▪It is about Fuzzing

▪No, it is not about Fuzzy logic

References:

1. book (Chapter

1, section 1.3):

Fuzzing for

Software Security

Testing and

Quality Assurance.

By Ari Takanen,

Jared DeMott,

Charlie Miller

4

2. Article "Fuzzing:

Hack, Art, and

Science”

By P. Godefroid

About the technique

▪It is about Fuzzing

▪No, it is not about Fuzzy logic

▪Neither about fuzzy set membership

References:

1. book (Chapter

1, section 1.3):

Fuzzing for

Software Security

Testing and

Quality Assurance.

By Ari Takanen,

Jared DeMott,

Charlie Miller

5

2. Article "Fuzzing:

Hack, Art, and

Science”

By P. Godefroid

About the technique

▪It is about Fuzzing

▪No, it is not about Fuzzy logic

▪Neither about fuzzy set membership

Software Testing

6

References:

1. book (Chapter

1, section 1.3):

Fuzzing for

Software Security

Testing and

Quality Assurance.

By Ari Takanen,

Jared DeMott,

Charlie Miller

2. Article "Fuzzing:

Hack, Art, and

Science”

By P. Godefroid

About the technique

▪It is about Fuzzing

▪No, it is not about Fuzzy logic

▪Neither about fuzzy set membership

Security Software Testing

7

References:

1. book (Chapter

1, section 1.3):

Fuzzing for

Software Security

Testing and

Quality Assurance.

By Ari Takanen,

Jared DeMott,

Charlie Miller

2. Article "Fuzzing:

Hack, Art, and

Science”

By P. Godefroid

About the technique

▪It is about Fuzzing

▪No, it is not about Fuzzy logic

▪Neither about fuzzy set membership

Security Software Testing

Memory- corruption bugs

8

References:

1. book (Chapter

1, section 1.3):

Fuzzing for

Software Security

Testing and

Quality Assurance.

By Ari Takanen,

Jared DeMott,

Charlie Miller

2. Article "Fuzzing:

Hack, Art, and

Science”

By P. Godefroid

About the technique

▪It is about Fuzzing

▪No, it is not about Fuzzy logic

▪Neither about fuzzy set membership

Security Software Testing

Memory- corruption bugs

Exploitable!

9

References:

1. book (Chapter

1, section 1.3):

Fuzzing for

Software Security

Testing and

Quality Assurance.

By Ari Takanen,

Jared DeMott,

Charlie Miller

2. Article "Fuzzing:

Hack, Art, and

Science”

By P. Godefroid

Why do we care?

10

Why do we care?

**http://www.cvedetails.com

11

Organization

14

▪Memory corruption vulnerabilities

▪Fuzzing- finding vulnerabilities

▪Types of Fuzzing

▪Some existing solutions

Memory Corruption Vulnerabilities

15

▪WYSINWYX: What You See Is Not What You eXecute by G. Balakrishnan et. al.

– Higher level code -> low-level representation

– Seemingly separate variables -> contiguous memory addresses

▪Contiguous memory locations allow for boundary violations!

example

17

#include <stdio.h>

int get_cookie(){

return rand();}

int main(){

int cookie;

char name[40];

cookie = get_cookie();

gets(name);

if (cookie == 0x41424344)

printf("You win %s\n!", name);

else printf("better luck next time :(");

return 0;

}

example#include <stdio.h>

int get_cookie(){

return rand();}

int main(){

int cookie;

char name[40];

cookie = get_cookie();

gets(name);

if (cookie == 0x41424344)

printf("You win %s\n!", name);

else printf("better luck next time :(");

return 0;

}

name

18

example#include <stdio.h>

int get_cookie(){

return rand();}

int main(){

int cookie;

char name[40];

cookie = get_cookie();

gets(name);

if (cookie == 0x41424344)

printf("You win %s\n!", name);

else printf("better luck next time :(");

return 0;

}

name

cookie

19

example#include <stdio.h>

int get_cookie(){

return rand();}

int main(){

int cookie;

char name[40];

cookie = get_cookie();

gets(name);

if (cookie == 0x41424344)

printf("You win %s\n!", name);

else printf("better luck next time :(");

return 0;

}

name

cookie

Saved RET

20

example#include <stdio.h>

int get_cookie(){

return rand();}

int main(){

int cookie;

char name[40];

cookie = get_cookie();

gets(name);

if (cookie == 0x41424344)

printf("You win %s\n!", name);

else printf("better luck next time :(");

return 0;

}

name

cookie

Saved RET

21

name

cookie

Saved RET

example#include <stdio.h>

int get_cookie(){

return rand();}

int main(){

int cookie;

char name[40];

cookie = get_cookie();

gets(name);

if (cookie == 0x41424344)

printf("You win %s\n!", name);

else printf("better luck next time :(");

return 0;

}

22

example#include <stdio.h>

int get_cookie(){

return rand();}

int main(){

int cookie;

char name[40];

cookie = get_cookie();

gets(name);

if (cookie == 0x41424344)

printf("You win %s\n!", name);

else printf("better luck next time :(");

return 0;

}

name

cookie

Saved RET

23

example#include <stdio.h>

int get_cookie(){

return rand();}

int main(){

int cookie;

char name[40];

cookie = get_cookie();

gets(name);

if (cookie == 0x41424344)

printf("You win %s\n!", name);

else printf("better luck next time :(");

return 0;

}

name

cookie

Saved RET

Memory

corruption

24

Side effects

25

Side effects

26

▪Over/underflow

Side effects

27

▪Over/underflow

▪Sensitive data corruption

Side effects

28

▪Over/underflow

▪Sensitive data corruption

▪Control data corruption (control hijacking)

Side effects

29

▪Over/underflow

▪Sensitive data corruption

▪Control data corruption (control hijacking)

Side effects

30

▪Over/underflow

▪Sensitive data corruption

▪Control data corruption (control hijacking)

Otherwise crash!

Fuzzing

37

Fuzzing

▪It started on a dark and stormy night…. [Barton P. Miller, late1980s]

38

Fuzzing

39

Fuzzing

40

●Run program on many abnormal/malformed inputs, look for unintended

behavior, e.g. crash.

Fuzzing

●Run program on many abnormal/malformed inputs, look for unintended

behavior, e.g. crash.

● An observable (measurable) side effect is essential

41

Fuzzing

●Run program on many abnormal/malformed inputs, look for unintended

behavior, e.g. crash.

● An observable (measurable) side effect is essential

● Should be scalable

42

Fuzzing

●Run program on many abnormal/malformed inputs, look for unintended

behavior, e.g. crash.

● An observable (measurable) side effect is essential

● Should be scalable

●Underlying assumption: if the unintended behavior is dependent on input, an attacker can craft such an input to exploit the bug.

43

Types of Fuzzing

44

▪Input based: mutational and Generative (grammar based)

▪Application based: black-box and white-box

▪Input Strategy: memory-less and evolutionary

Input Generation

45

Input Generation

46

▪Mutation Based: mutate seed inputs to create new test inputs

- Simple strategy is to randomly choose an offset and change

the byte.

Input Generation

47

▪Mutation Based: mutate seed inputs to create new test inputs

-

-

-

Simple strategy is to randomly choose an offset and change

the byte.

Pros: easy to implement and low overhead

Cons: highly structured inputs will become invalid quickly →

low coverage.

Cont..● Generation (Grammar) Based: Learn/create the format/model of

the input and based on the learned model, generate new inputs.

-

-

-

e.g. well-known file formats (jpeg, xml, etc.)

Pros: Highly effective for complex structured input parsing

applications → high coverage

Cons: expensive as models are not easy to learn or obtain.

JPEG file

format

JPEG file

format

Application Monitoring

51

Application Monitoring

▪Blackbox: Only interface is known.

52

Application Monitoring

▪Blackbox: Only interface is known.

53

Application Monitoring

▪Blackbox: Only interface is known.

● Whitebox: Application can

be analysed/monitored.

Static & Dynamic analysis

54

Problem with Traditional Fuzzing

55

Problem with Traditional Fuzzing

Blackbox + mutation: Aiming with luck!

56

Problem with Traditional Fuzzing

Blackbox + mutation: Aiming with luck!

57

Problem with Traditional Fuzzing

Blackbox + mutation: Aiming with luck!

58

... //JPEG parsing

read(fd, buf, size);

if (buf[1] == 0xD8 && buf[0] == 0xFF)

// interesting code here

else

pr_exit(“Invalid file”);

Problem with Traditional Fuzzing

Blackbox + mutation: Aiming with luck!

... //JPEG parsing

read(fd, buf, size);

if (buf[1] == 0xD8 && buf[0] == 0xFF)

// interesting code here

else

pr_exit(“Invalid file”);

59

Problem with Traditional Fuzzing

60

Problem with Traditional Fuzzing

61

▪Apply more heuristics to:– Mutate better

– Learn good inputs

Problem with Traditional Fuzzing

▪Apply more heuristics to:– Mutate better

– Learn good inputs

▪Apply more analysis (static/dynamic) to understand the application behavior.

➢But remember the scalability factor!

62

Problem with Traditional Smart Fuzzing

63

Problem with Traditional Smart Fuzzing

smart fuzzing: Aiming with educated guess!

64

Problem with Traditional Smart Fuzzing

smart fuzzing: Aiming with educated guess!

65

Evolutionary Fuzzing

66

Evolutionary Fuzzing

67

▪Recall: memory-less and Evolutionary fuzzing

Evolutionary Fuzzing

68

▪Recall: memory-less and Evolutionary fuzzing

▪Rather than throwing inputs, evolve them.

Evolutionary Fuzzing

69

▪Recall: memory-less and Evolutionary fuzzing

▪Rather than throwing inputs, evolve them.

▪Underlying assumption:– Inputs are parsed enough before going further deep in execution

Evolutionary Fuzzing

70

▪Recall: memory-less and Evolutionary fuzzing

▪Rather than throwing inputs, evolve them.

▪Underlying assumption:– Inputs are parsed enough before going further deep in execution

Evolutionary Fuzzing

71

▪What should be the feedback to evolve?– Code-coverage based fuzzing

➢Most of the contemporary fuzzers are here (AFL, AFLFast, Driller, VUzzer, ProbeFuzzer, CollAFL, Angora, QSYM, Nautilus, …

➢Uses code-coverage as the proxy metric for the effectiveness of a fuzzer

– Directed fuzzing➢Not much explored (BuzzFuzz, AGLGo, … )

➢There should be a way to find the destination and a sense of direction.

Evolving A Fuzzer

72

Evolving A Fuzzer

73

▪Lets start with something we are more familiar with- AFL

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

74

Q

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

Q

Mutate at

offset X

Bitflip,

replace,

arithmetic

75

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

Q

Mutate at

offset X

Bitflip,

replace,

arithmetic

76

Execute and

monitor

edges (BB)

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

Q

Mutate at

offset X

Bitflip,

replace,

arithmetic

Execute and

monitor

edges (BB)

New

edge

?

77

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

Q

Mutate at

offset X

Bitflip,

replace,

arithmetic

Execute and

monitor

edges (BB)

New

edge

?

Yes, add input to Q

78

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

Q

Mutate at

offset X

Bitflip,

replace,

arithmetic

Execute and

monitor

edges (BB)

New

edge

?

Yes, add input to Q

No, (perhaps) try more mutation

79

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

Q

Mutate at

offset X

Bitflip,

replace,

arithmetic

Execute and

monitor

edges (BB)

New

edge

?

Yes, add input to Q

No, (perhaps) try more mutation

80

Codecoverage-

Maximize it!

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

Q

Mutate at

offset X

Bitflip,

replace,

arithmetic

Execute and

monitor

edges (BB)

New

edge

?

Yes, add input to Q

No, (perhaps) try more mutation

Codecoverage-

Maximize it!

You do

something to

maximize

CC

81

Evolving A Fuzzer

▪Lets start with something we are more familiar with- AFL

inputs

Q

Mutate at

offset X

Bitflip,

replace,

arithmetic

Execute and

monitor

edges (BB)

New

edge

?

Yes, add input to Q

No, (perhaps) try more mutation

Codecoverage-

Maximize it!

You do

something to

maximize

CC

Analysis-

You need

something to

mutate in a

meaningful way

82

Fuzzing- A balancing Act

Fuzzer

performance

scalability

Blind-

mutation

83

Fuzzing- A balancing Act

Fuzzer

performance

scalability

Monitoring

Blind-

mutation

84

Fuzzing- A balancing Act

Fuzzer

performance

scalability

Monitoring

Blind-

mutation

Analysis A

85

Fuzzing- A balancing Act

Fuzzer

performance

scalability

Monitoring

Blind-

mutation

Analysis A

Analysis B

86

Fuzzing- A balancing Act

Fuzzer

Analysis C

87

Fuzzing- A balancing Act

Happy Advanced Fuzzer

performance

scalability

Monitoring

Blind-

mutation

Analysis A

Analysis C

Analysis B

88

Problem Exemplified….

89

Problem Exemplified….

90

Problem Exemplified….

a==\xffd8

91

Problem Exemplified….

a==\xffd8

92

Problem Exemplified….

a==\xffd8

Where is ‘a’?

93

Problem Exemplified….

a==\xffd8

Where is ‘a’?

94

Problem Exemplified….

a==\xffd8

Where is ‘a’?

95

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

96

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

97

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

98

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

99

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

100

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

101

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

102

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

Easy paths (superficial

paths), error code

103

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

Easy paths (superficial

paths), error code

104

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

Easy paths (superficial

paths), error code

105

Problem Exemplified….

a==\xffd8

Where is ‘a’?

What values?

Hard-to-reach-paths

(deeper buried bugs)

Easy paths (superficial

paths), error code

106

Issues identified…

107

▪For smart code-coverage based fuzzer, it is important to have some knowledge about:

Issues identified…

108

▪For smart code-coverage based fuzzer, it is important to have some knowledge about:

– Where (which offsets in input) to apply mutation

Issues identified…

109

▪For smart code-coverage based fuzzer, it is important to have some knowledge about:

– Where (which offsets in input) to apply mutation

– What values to replace with.

Issues identified…

110

▪For smart code-coverage based fuzzer, it is important to have some knowledge about:

– Where (which offsets in input) to apply mutation

– What values to replace with.

– How to avoid traps (paths leading to error handling code)

Fuzzing+Symbex

111

Fuzzing+Symbex

112

▪Symbolic/concolic execution can answer such questions.

– Driller: Augmenting Fuzzing Through Selective Symbolic Execution, NDSS’16

Fuzzing+Symbex

113

▪Symbolic/concolic execution can answer such questions.– Driller: Augmenting Fuzzing Through Selective Symbolic

Execution, NDSS’16

▪But... Scalability?

Fuzzing+Symbex

▪Symbolic/concolic execution can answer such questions.– Driller: Augmenting Fuzzing Through Selective Symbolic

Execution, NDSS’16

▪But... Scalability?

114

Observations on Fuzzing+Symbex

115

▪Lava: Large-scale automated vulnerability addition,” in Proc. IEEE S&P ’16. IEEE Press, 2016.

Observations on Fuzzing+Symbex

116

▪Lava: Large-scale automated vulnerability addition,” in Proc. IEEE S&P ’16. IEEE Press, 2016.

– quickly and automatically injecting large numbers of realistic bugs into program source code.

Observations on Fuzzing+Symbex

117

▪Lava: Large-scale automated vulnerability addition,” in Proc. IEEE S&P ’16. IEEE Press, 2016.

– quickly and automatically injecting large numbers of realistic bugs into program source code.

– injected bug is designed to be triggered only if a particular set of multi-bytes in the input is set to a magic value

Observations on Fuzzing+Symbex

118

▪Lava: Large-scale automated vulnerability addition,” in Proc. IEEE S&P ’16. IEEE Press, 2016.

– quickly and automatically injecting large numbers of realistic bugs into program source code.

– injected bug is designed to be triggered only if a particular set of multi-bytes in the input is set to a magic value

– Results are not very encouraging!

Concrete results (From LAVA paper)

119

QSYM- Enhancing Fuzzing by Enhancing Symbex

120

QSYM- Enhancing Fuzzing by Enhancing Symbex

121

QSYM- Enhancing Fuzzing by Enhancing Symbex

▪Presented in Usenix Sec’18

122

QSYM- Enhancing Fuzzing by Enhancing Symbex

▪Presented in Usenix Sec’18

▪Focuses on scaling symbex– Native execution, contrary to IR based execution in existing symbex tools

– Instruction-level symbolic execution➢Only the relevant instructions are executed symbolically (taintflow analysis)

➢Solving only relevant constraints related to the target branch

– Optimistic Solving

– …

123

QSYM- Enhancing Fuzzing by Enhancing Symbex

▪Presented in Usenix Sec’18

▪Focuses on scaling symbex

– Native execution, contrary to IR based execution in existing symbex tools

– Instruction-level symbolic execution➢Only the relevant instructions are executed symbolically (taintflow analysis)

➢Solving only relevant constraints related to the target branch

– Optimistic Solving

– …

▪Maintaining scalability with good heuristics + program analysis toimprove coverage

124

VUzzer- going further with more analysis

125

VUzzer- going further with more analysis

126

▪Presented at NDSS’17

▪Uses taintflow analysis + several heuristics

▪Main idea:

VUzzer- going further with more analysis

▪Presented at NDSS’17

▪Uses taintflow analysis + several heuristics

▪Main idea:– Leverage application’s control- and data-flow features to infer input properties:

applications is designed to work with that input!➢ Dynamic taintfow analysis

– Prioritize and deprioritize paths: Certain paths are difficult to execute as they are guarded by constraints (nested conditions)!➢ Static analysis and error handling code

▪Combines static and dynamic analysis + heuristics to improve coverage

127

Evolving Taintflow based Solution

128

Evolving Taintflow based Solution

inputs

Q

▪Moving to Vuzzer…Bitflip,

replacement,ar

ithmeti

c

Mutate at

offset X

Execute and

monitor

edges (BB)

New

edg

e?

Yes, add input to Q

No, (perhaps) try more mutation

129

Evolving Taintflow based Solution

inputs

Q

▪Moving to Vuzzer…Bitflip,

replacement,ar

ithmeti

c

Mutate at

offset X

Execute and

monitor

edges (BB)

New

edg

e?

Yes, add input to Q

No, (perhaps) try more mutation

130

Also perform taintflow to

determine interesting offsets/

values (O/V)

Evolving Taintflow based Solution

▪Moving to Vuzzer…

inputs

Q

Bitflip,

replace

ment,ar

ithmeti

c

Mutate at

offset X

Execute and

monitor

edges (BB)

New

edg

e?

Yes, add input to Q

No, (perha ps) try more mutati on

Also perform taintflow to

determine interesting offsets/

values (O/V)

Is it error

handing BB? If

so, not

interesting.

131

Evolving Taintflow based Solution

inputs

Q

▪Moving to Vuzzer…Bitflip,

replacement,ar

ithmeti

c

Mutate at

offset X

Execute and

monitor

edges (BB)

New

edg

e?

Yes, add input to Q

No, (perha ps) try more mutati on

Also perform taintflow to

determine interesting offsets/

values (O/V)

Is it error

handing BB? If

so, not

interesting.

132

Mutate only

interesting offsets

and with interesting

values (magic-bytes)

Evolving Taintflow based Solution

inputs

Q

▪Moving to Vuzzer…Bitflip,

replacement,ar

ithmeti

c

Mutate at

offset X

Execute and

monitor

edges (BB)

New

edg

e?

Yes, add input to Q

No, (perha ps) try more mutati on

Also perform taintflow to

determine interesting offsets/

values (O/V)

Is it error

handing BB? If

so, not

interesting.

Mutate only

interesting offsets

and with interesting

values (magic-bytes)

Input preference

with path

prioritization- static

analysis

133

There are problems… still!

134

There are problems… still!

135

▪Unaware of whether offset is processed by application– Waste of mutation time

There are problems… still!

136

▪Unaware of whether offset is processed by application– Waste of mutation time

▪What and Where to mutate– Different bugs have different triggering conditions

➢Buffer overflow involves buffer (strings, arrays etc.)

➢ Integer overflow involves integer data type.

There are problems… still!

137

▪Unaware of whether offset is processed by application– Waste of mutation time

▪What and Where to mutate– Different bugs have different triggering conditions

➢Buffer overflow involves buffer (strings, arrays etc.)

➢ Integer overflow involves integer data type.

Traditional Byte by byte mutation may not be a very

effective strategy!

There are problems… still!

138

▪Unaware of whether offset is processed by application– Waste of mutation time

▪What and Where to mutate– Different bugs have different triggering conditions

➢Buffer overflow involves buffer (strings, arrays etc.)

➢ Integer overflow involves integer data type.

Traditional Byte by byte mutation may not be a very

effective strategy!

TIFF (presented at ACSAC 2018)

Type inference: Main Insight

139

Type inference: Main Insight

Input

140

Type inference: Main Insight

Input

141

Application

execution

Type inference: Main Insight

Input

142

Application

execution

memory

Type inference: Main Insight

Input

143

Application

execution

DSI

memory

Type inference: Main Insight

InputApplication

execution

DSI

memory

INT8

144

Type inference: Main Insight

InputApplication

execution

DSI

memory

INT8

String buffer

145

Type inference: Main Insight

InputApplication

execution

DSI

memory

INT8

String buffer

INT32

146

Type inference: Main Insight

InputApplication

execution

memory

INT8

String buffer

INT32

DSIDTA

147

Type inference: Main Insight

InputApplication

execution

memory

INT8

String buffer

INT32

DSIDTA

148

Type inference: Main Insight

InputApplication

execution

memory

INT8

String buffer

INT32

DSIDTA

149

Type inference: Main Insight

InputApplication

execution

memory

INT8

String buffer

INT32

DSIDTA

150

Type inference: Main Insight

InputApplication

execution

memory

INT8

String buffer

INT32

DSIDTA

151

Type inference: Main Insight

InputApplication

execution

memory

INT8

String buffer

INT32

DSIDTA

152

Type inference: Main Insight

InputApplication

execution

memory

INT8

String buffer

INT32

INT8

String buffer

INT32

Tagged input

Type inferenceDSI

DTA

153

Input format learning (Grammar based fuzzing)

154

Input format learning (Grammar based fuzzing)

155

Input format learning (Grammar based fuzzing)

▪TIFF brought forward the idea of bringing grammar and mutation based fuzzing closer!

156

Input format learning (Grammar based fuzzing)

▪TIFF brought forward the idea of bringing grammar and mutation based fuzzing closer!

▪Angora (S&P’18)- on source code (LLVM based)

157

Input format learning (Grammar based fuzzing)

▪TIFF brought forward the idea of bringing grammar and mutation based fuzzing closer!

▪Angora (S&P’18)- on source code (LLVM based)

▪ProFuzzer ( S&P’19)- learning input structure with execution patterns

158

Input format learning (Grammar based fuzzing)

▪TIFF brought forward the idea of bringing grammar and mutation based fuzzing closer!

▪Angora (S&P’18)- on source code (LLVM based)

▪ProFuzzer ( S&P’19)- learning input structure with execution patterns

▪GRIMOIRE: Synthesizing Structure while Fuzzing (Usenix Sec’19)-grammar inference.

159

Input format learning (Grammar based fuzzing)

▪TIFF brought forward the idea of bringing grammar and mutation based fuzzing closer!

▪Angora (S&P’18)- on source code (LLVM based)

▪ProFuzzer ( S&P’19)- learning input structure with execution patterns

▪GRIMOIRE: Synthesizing Structure while Fuzzing (Usenix Sec’19)-grammar inference.

▪ and many more.. See: https://github.com/fengjixuchui/FuzzingPaper

160

Input format learning (Grammar based fuzzing)

▪TIFF brought forward the idea of bringing grammar and mutation based fuzzing closer!

▪Angora (S&P’18)- on source code (LLVM based)

▪ProFuzzer ( S&P’19)- learning input structure with execution patterns

▪GRIMOIRE: Synthesizing Structure while Fuzzing (Usenix Sec’19)-grammar inference.

▪ and many more.. See: https://github.com/fengjixuchui/FuzzingPaper

▪Focus shifted to How to mutate sensibly?

161

Evaluating Fuzzers- A tough question!

162

▪What experimental setup is needed to produce trustworthy results? [Evaluating Fuzz Testing, Klees et. al. CCS’18]

▪There is a randomness in mutation operation- thus results may differ run to run. Multiple runs.

▪Dataset- quite arbitrary (LAVA-M, Google fuzz, a set of real-world applications, binutils,..)

– VUzzer, perhaps for the 1st time, used three different datasets in the evaluation (DARPA CGC, LAVA-M, real-world apps)

▪Seed selection- which inputs to start with?

Evaluating Fuzzers- A tough question!

163

▪How to measure efficiency?– Code-coverage, but what about directed fuzzers?

➢Also for binary only fuzzers, measuring code coverage is not that straight forward-static binary instrumentation

➢Also, for source code based fuzzers, what about library code?

– Uniqueness of crashes➢How to differentiate several crashes? Often coredump does not have enough

information!

➢Root-cause analysis (not much is there! Failure Sketching, G. Candea EPFL)

Good Engineering

164

▪(the scope of) Optimization is everywhere in a fuzzer.

▪Light-weight fuzzers (e.g. AFL)– Branch bitmap (64K to be fit into the cache)

– Fork()

– Input trimming

▪Every program analysis introduces a performance hit– F1 (World’s fastest grammar based fuzzer- it is F0 by Brandon Falk)

▪VUzzer uses memory file system (tmpfs).

▪Vectorized Emulation: Putting it all together (Brandon Falk)

Conclusions

165

Conclusions

166

▪Fuzzing – seems easy unless you try it!

Conclusions

167

▪Fuzzing – seems easy unless you try it!

▪Scalability and performance cannot be negotiated much!– A good engineering, hardware assisted monitoring

Conclusions

168

▪Fuzzing – seems easy unless you try it!

▪Scalability and performance cannot be negotiated much!– A good engineering, hardware assisted monitoring

▪A good place to try program analysis techniques– Possibility to compromise correctness to make them scalable

Conclusions

169

▪Fuzzing – seems easy unless you try it!

▪Scalability and performance cannot be negotiated much!– A good engineering, hardware assisted monitoring

▪A good place to try program analysis techniques– Possibility to compromise correctness to make them scalable

▪Software will remain integral part of the cyber world- make is secure!