
The Automated Exploitation Grand Challenge

A Five-Year Retrospective

Julien Vanegue

IEEE Security & Privacy Langsec Workshop

May 25th 2018

AEGC 2013/2018 vs DARPA Cyber Grand Challenge

- Was Automated Exploit Generation solved with the DARPA CGC? Not quite.

- DARPA Cyber Grand Challenge ranked solutions on three criteria:

1. Attack (how well you exploited)
2. Defense (how well you defended against exploits)
3. Performance (availability of your services)

- CGC post-mortem: “Cyber Grand Challenge: The Analysis”: http://youtube.com/watch?v=SYYZjTx92KU

- DARPA CGC scratched the surface; this presentation focuses on what is under the carpet.

- We focus on memory attacks and defenses; there are other classes we don't cover here.

Automated Exploit Generation Challenges

Original AEGC 2013 challenges: http://openwall.info/wiki/_media/people/jvanegue/files/aegc_vanegue.pdf

In a nutshell, attacks are decomposed into five classes:

CLASS 1: Exploit Specification (“sanitizer synthesis”)
CLASS 2: Input Generation (“white-box fuzz testing”)
CLASS 3: State Space Management (“combinatorial explosion”)
CLASS 4: Primitive Composition (“exploit chaining”)
CLASS 5: Information Disclosure (“environment determination”)

CLASS 1: Exploit Specification

For a given program p:
For all inputs i1, ..., in:
For all assertions a1, ..., am:

Safety condition: ∀a : ∀i : p(i) ⇒ a

Attack condition: ∃a : ∃i : p(i) ⇒ ¬a

where p(i) is the interpretation of program p on input i (for example, construction of an SMT formula)
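As a minimal sketch of the attack condition (the toy program, its bounds, and all names here are hypothetical), one can search a finite input space for a witness input that falsifies an assertion:

```c
#include <assert.h>

/* Hypothetical toy program p: computes an index from its input.
 * The assertion a states that the computed index stays inside a
 * 4-element table. All names and bounds are illustrative. */
#define TABLE_LEN 4

/* a(i): the access that p performs on input i stays in bounds. */
static int assertion_holds(int input) {
    int idx = input % 8;          /* p's computation over the input */
    return idx >= 0 && idx < TABLE_LEN;
}

/* Attack condition ∃i : p(i) ⇒ ¬a, decided here by enumerating a
 * finite input space; returns a witness input, or -1 when the
 * safety condition ∀i : p(i) ⇒ a holds. */
static int find_violating_input(void) {
    for (int i = 0; i < 8; i++)
        if (!assertion_holds(i))
            return i;
    return -1;
}
```

In practice the enumeration is replaced by an SMT query over the formula built from p, as suggested above.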

CLASS 1 approach: Sanitizer synthesis

Sanitizers are developer tools to catch bugs early at run time:

- Valgrind (ElectricFence before it): heap sanitizer (problem: too intrusive for exploit dev)

- Address Sanitizer: clang compiler support to solve the same problem as Valgrind in LLVM.

- Cachegrind: simulates how a program interacts with the cache hierarchy and branch predictor.

- Helgrind: detects data races, locking issues, and other thread API misuses.

Current research directions include coupling sanitizers with static analysis and/or symbolic execution.
See KLEE workshop talks: https://srg.doc.ic.ac.uk/klee18
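For intuition, a back-of-the-envelope model of what a sanitizer check does (the tracked_buf type and checked_write helper are invented here, standing in for the shadow-memory instrumentation tools like Address Sanitizer insert):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical miniature sanitizer: every tracked buffer carries its
 * base and size, and each write is checked against them, in the
 * spirit of the run-time checks Address Sanitizer inserts. */
typedef struct {
    uint8_t *base;
    size_t   size;
} tracked_buf;

static tracked_buf tracked_alloc(size_t size) {
    tracked_buf b = { malloc(size), size };
    return b;
}

/* Returns 1 (a sanitizer report) instead of silently corrupting
 * memory when the offset is out of bounds. */
static int checked_write(tracked_buf *b, size_t off, uint8_t v) {
    if (off >= b->size)
        return 1;                 /* out-of-bounds write detected */
    b->base[off] = v;
    return 0;
}
```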

CLASS 2: Input Generation

After defining what the attack conditions are, input generation provides initial conditions to exercise the sanitizing points:

- DART/SAGE: first white-box fuzzers (Godefroid, Molnar, Microsoft Research, 2006-)

- EXE/KLEE: open-source symbolic execution engines (Cadar, Dunbar, and Engler, 2008-)

- American Fuzzy Lop aka AFL (Zalewski, 2014-): (first?) open-source grey-box fuzzer

- More recently: VUzzer, AFLFast, AFLGo, etc. (2016-)

These tools provide input mutation strategies to cover more paths/locations in tested programs.
By now, input generation is a well-understood problem for restricted sequential programs.
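As a degenerate sketch of input generation against a toy target (the two-byte exhaustive search stands in for the mutation and coverage-feedback strategies real fuzzers use; all names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Toy target with a nested branch: returns 1 (a "crash") only on
 * the input "hi". A real fuzzer would use coverage feedback to
 * reach the nested condition; here the input space is tiny enough
 * to enumerate. */
static int target(const uint8_t in[2]) {
    if (in[0] == 'h')
        if (in[1] == 'i')
            return 1;             /* bug reached */
    return 0;
}

/* Enumerate all two-byte inputs; returns the crashing input packed
 * into an int, or -1 if none is found. */
static int fuzz(void) {
    uint8_t input[2];
    for (int b0 = 0; b0 < 256; b0++)
        for (int b1 = 0; b1 < 256; b1++) {
            input[0] = (uint8_t)b0;
            input[1] = (uint8_t)b1;
            if (target(input))
                return (b0 << 8) | b1;
        }
    return -1;
}
```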

CLASS 3: State-space management

A well-known problem in program analysis is combinatorial explosion. For several classes of programs, this leads to exponential blow-up of the state space:

- Multi-threaded programs: for i instructions and n threads, the scheduling graph contains n^i states.

- Heap-based programs: for i allocations and n possible allocation size bins, the heap configuration space contains n^i states.
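A small helper (illustrative) makes the n^i counts above concrete:

```c
#include <assert.h>
#include <stdint.h>

/* states(n, i) = n^i: the scheduling-graph (or heap-configuration)
 * state counts from the slide, for n threads (or size bins) and
 * i instructions (or allocations). */
static uint64_t states(uint64_t n, unsigned i) {
    uint64_t s = 1;
    while (i--)
        s *= n;
    return s;
}
```

Even 4 threads over 10 instructions already yield 4^10 = 1,048,576 scheduler states.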

Motivation: Data Only Attacks (DOA)

Data-only attacks form a vulnerability class that can bypass exploit protections such as:

- Non-execution mitigations (DEP, W^X): no code injection needed.

- Control-Flow Integrity (CFI): no code redirection needed.

Under certain conditions, a data-only attack can also defeat:

- Address Space Layout Randomization (if the attack does not rely on absolute addresses)

- Heap meta-data protections (if the attack does not rely on heap meta-data corruption)

Example of a DOA: Heartbleed (lines up chunks in memory to leak private material).

Decide safety using Adjacency predicate

∀x ¬∃y : TGT(y) ∧ ADJ(x, y) ∧ OOB(x)

- ADJ(x,y) = true iff x and y are adjacent (base(x) + size(x) = base(y) or base(y) + size(y) = base(x)).

- OOB(x) = true iff there exists an out-of-bounds condition on memory buffer x.

- TGT(y) = true iff memory cell y is an interesting target to overwrite.
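The predicates translate directly into code; here is a sketch over hypothetical chunk metadata (the oob/tgt flags are assumed to come from a prior analysis):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk metadata for deciding the predicates above. */
typedef struct {
    size_t base, size;
    int oob;   /* OOB(x): an out-of-bounds condition exists on x */
    int tgt;   /* TGT(x): x is an interesting target to overwrite */
} chunk;

/* ADJ(x,y): base(x) + size(x) = base(y) or base(y) + size(y) = base(x). */
static int adj(const chunk *x, const chunk *y) {
    return x->base + x->size == y->base || y->base + y->size == x->base;
}

/* Attack condition ∃x ∃y : TGT(y) ∧ ADJ(x,y) ∧ OOB(x), decided by
 * enumeration over a concrete heap layout of n chunks. */
static int layout_unsafe(const chunk *c, int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (i != j && c[j].tgt && adj(&c[i], &c[j]) && c[i].oob)
                return 1;
    return 0;
}
```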

Decide safety using Distance function

∀x ¬∃y : TGT(y) ∧ DOOB(x) > DIST(x, y)

- DIST(x,y) : N = |base(x) − base(y)|

- DOOB(x) : N is the maximum offset from x's base address that can be (over)written/read.

- TGT(y) = true iff chunk y is an interesting target to overwrite.
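The distance formulation can be sketched the same way (again over hypothetical chunk metadata, with DOOB supplied by a prior out-of-bounds analysis):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical metadata: doob is DOOB(x), the maximum offset from
 * x's base address that can be (over)written or read. */
typedef struct {
    size_t base;
    size_t doob;
    int    tgt;   /* TGT(y) */
} hchunk;

/* DIST(x,y) = |base(x) - base(y)|. */
static size_t dist(const hchunk *x, const hchunk *y) {
    return x->base > y->base ? x->base - y->base : y->base - x->base;
}

/* Attack condition ∃x ∃y : TGT(y) ∧ DOOB(x) > DIST(x,y). */
static int target_reachable(const hchunk *c, int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (i != j && c[j].tgt && c[i].doob > dist(&c[i], &c[j]))
                return 1;
    return 0;
}
```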

Automation challenges for Heap attacks

1. Do not confuse logical and spatial heap semantics (Shape Analysis vs. Layout Analysis):

- Heap Models for Exploit Systems (Vanegue, Langsec 2015)
- Automated Heap Layout Manipulation for Exploitation (Heelan et al., to appear in Usenix Security 2018)

2. Decision of the ADJ(x,y) predicate is too approximate in the abstract. It requires tracking heap bins finely.

3. ADJ(x,y) is not separable for each heap bin: two chunks belonging to different bins could still be adjacent.

4. Each heap allocator uses different rules for memory management.

5. Heap distance across executions grows monotonically with time (a problem for heap-heavy programs, such as browsers).

CLASS 4: Automate Exploit Chaining

- Five years ago, “multi-interaction exploits” were already a problem in the AEGC 2013.

- Exploit chaining is one of the main techniques used in real exploits today.

- Example of an exploit chain: Pinkie Pie, Pwnium 2012 (a chain of logic bugs and memory corruption to escape the Chrome sandbox). It used the pre-rendering feature to load the Native Client plug-in, from where it triggered a buffer overflow in the GPU process, leading to impersonation of a privileged renderer process via IPC squatting. From there, it used an insecure JavaScript-to-C++ interface to specify an extension path to be loaded (impersonating the browser), and finally loaded an NPAPI plug-in running outside the sandbox. See “A Tale of Two Pwnies (Part 1)” by Obes and Schuh (Google Chromium blog).

Multi-interaction exploits (aka Exploit Chaining)

- As a matter of fact, there has been little to no progress on automating chaining in the last 5 years.

- Weird machines characterize exploits as untrusted computations over a state machine.

- Problem: how to automate state creation on the weird machine?

- Formally: if a program is a function of type X ⇒ Y, where X is an initial state leading to a corrupted state Y, then:

∃Z : X ⇒ Z ∧ Z ⇒ Y

We dub this “the intermediate exploit state problem”.

The Intermediate Exploit State problem

- There are whole chains of intermediates:

∃Z1, Z2, ..., Zn : X ⇒ Z1 ∧ Z1 ⇒ Z2 ∧ ... ∧ Zn−1 ⇒ Zn

- For each step i, is there a unique candidate Zi? Not if the state depends on control predicates (if/else/for conditions).

- Even for a single path, there may be multiple Zi one could choose from. In particular, see “The Weird Machines in Proof Carrying Code” (Langsec 2013), which characterizes unaccounted intermediate steps in PCC.
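One way to make the intermediate-state search concrete: treat the weird machine as a tiny transition system and look for the chain X ⇒ Z1 ⇒ ... ⇒ Y by breadth-first search (the 5-state machine and its edges below are purely illustrative):

```c
#include <assert.h>

#define NSTATES 5

/* step[s][t] != 0 encodes a transition s ⇒ t of a toy weird machine.
 * chain_len returns the number of transitions in the shortest chain
 * X ⇒ Z1 ⇒ ... ⇒ Y found by BFS, or -1 when Y is unreachable. */
static int chain_len(int step[NSTATES][NSTATES], int x, int y) {
    int depth[NSTATES], queue[NSTATES], head = 0, tail = 0;
    for (int i = 0; i < NSTATES; i++)
        depth[i] = -1;
    depth[x] = 0;
    queue[tail++] = x;
    while (head < tail) {
        int s = queue[head++];
        if (s == y)
            return depth[s];          /* chain found */
        for (int t = 0; t < NSTATES; t++)
            if (step[s][t] && depth[t] < 0) {
                depth[t] = depth[s] + 1;
                queue[tail++] = t;
            }
    }
    return -1;                        /* no chain of intermediates */
}
```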

CLASS 5: Information disclosure (ex: side-channel attacks)

Information disclosure (or “info leak”) has been used for at least 15 years in exploits.

- Direct info leaks (reading uninitialized memory, OOB reads, etc.)

- Indirect info leaks (inferring information from timing or other observable quantities)

In the last year, new hardware-based info leaks were publicly released (Spectre, Meltdown, etc.):

- Variant 1: Speculative bounds check bypass (Jan 2018)

- Variant 2: Branch Target Buffer (Jan 2018)

- Variant 3: Rogue Data Cache Load (Jan 2018)

- Variant 4: Speculative Store Bypass (May 2018)

Ref: “Reading privileged memory with a side-channel” (Jann Horn, Google P0).
Attack: exploit the CPU's speculative-execution and caching features for timing attacks.
Outcome: the attacker can infer bit values across privilege levels.

Spectre Variant 1: a possible candidate for exploit automation

struct array { ulong len; uchar data[]; };

(...)
struct array *arr1 = init_trusted_array(0x10);
struct array *arr2 = init_trusted_array(0x400);
ulong untrusted_offset = read_untrusted_len();
if (untrusted_offset < arr1->len) {
    uchar value = arr1->data[untrusted_offset];
    uint idx = 0x200 + ((value & 1) * 0x100);
    if (idx < arr2->len)
        return (arr2->data[idx]);
}
(...)

Insights

- Possible strategy: assume CPU behavior, check programs for vulnerable code traits.

- Interestingly: try to detect effects (cached state), not the root cause (as usual).

- This is non-standard for static analysis (which usually goes after the root cause by checking invariants, etc.).

- Traditional black/grey/white-box fuzzers are blind to these properties.

- Checking such properties appears beyond compile-time analysis.

- Mitigations are already underway (ex: retpoline against Spectre Variant 2).

- Augmented static analysis or symbolic execution could be designed to keep track of cached states and speculative conditions (not trivial).
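To illustrate the last point, here is a toy model (all sizes, names, and the one-line “cache” are invented) that tracks which cache line a Variant 1-style gadget touches, architecturally versus under modeled misspeculation. The touched line index depends on a bit of the out-of-bounds byte, which is exactly the state such an augmented analysis would track:

```c
#include <assert.h>
#include <stdint.h>

#define ARR1_LEN 16   /* architectural bound on the first array */

/* Models one run of a Variant 1-style gadget. Returns the index of
 * the 64-byte cache line touched by the dependent load, or -1 when
 * nothing is loaded. With speculate != 0 the bounds check is
 * assumed mispredicted, so the load happens even for an
 * out-of-range offset, caching a line whose index depends on one
 * bit of the out-of-bounds byte. */
static int touched_line(const uint8_t *data, unsigned off, int speculate) {
    if (off < ARR1_LEN || speculate) {
        unsigned idx = 0x200 + ((data[off] & 1) * 0x100);
        return (int)(idx / 64);
    }
    return -1;   /* access architecturally refused, no cache effect */
}
```

The secret-dependent difference between the two possible line indices (8 vs. 12 here) is the observable effect the analysis would look for.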

Another new problem: Automating Rowhammer-style attacks

Rowhammer is a hardware attack that can flip bits in memory with a probabilistic chance of success.
None of the discussed techniques would work to detect this:

- One needs a probabilistic semantics to model such attacks.

- In spirit, this is similar to brute-forcing a password: it requires many tries, and success is probabilistic.

- Possible approach: quantify attack success using techniques typically used in cryptographic security proofs.

- Possible outcome: prove hardware is secure with very high probability.

- Prediction: the flaw will be fixed by design in next-generation hardware.

- Counter-prediction: probabilistic memory attacks are not going away; a framework is needed to study them.
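A sketch of the probabilistic bookkeeping being suggested (the probabilities below are illustrative): if one hammering attempt flips the target bit with probability p, then k independent tries succeed at least once with probability 1 − (1 − p)^k, the same bound used for brute-force attacks in cryptographic security proofs.

```c
#include <assert.h>

/* Probability that at least one of k independent attempts succeeds,
 * when each attempt succeeds with probability p: 1 - (1 - p)^k. */
static double success_prob(double p, int k) {
    double fail = 1.0;
    while (k-- > 0)
        fail *= (1.0 - p);
    return 1.0 - fail;
}

/* Smallest number of attempts needed to reach the given overall
 * success probability. */
static int tries_needed(double p, double goal) {
    int k = 0;
    while (success_prob(p, ++k) < goal)
        ;
    return k;
}
```

For example, with a 50% per-try flip probability, 7 tries suffice to exceed 99% overall success.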

Summing up

What is the next exciting automation research ahead?

                         Done?   Some techniques   Few/No techniques
Input Generation           X
Exploit Specification                   X
State Space Management                                      X
Primitive Composition                                       X
Information Disclosure                                      X

Conclusion

Automated Exploit Generation is not yet solved in 2018.

Beware of folks telling you otherwise. People will try.

Questions?

Mail: [email protected] Twitter: @jvanegue