RETURN ORIENTED PROGRAMME EVOLUTION · • What is return oriented programming? ... shellcode attacks won’t work. ... fairly useful, automating the ordinary, human task of assembling

R E T U R N

O R I E N T E D

P R O G R A M M E

E V O L U T I O N with

R O P E R

Olivia Lucca [email protected]://github.com/oblivia-simplexAtlSecCon, Halifax, April 28, 2017

[email protected]

https://github.com/oblivia-simplex

R E T U R N

O R I E N T E D

P R O G R A M M E

E V O L U T I O N with

R O P E R

Questions:• What is return orientedprogramming?

• What is genetic programming?

• How do we best cultivate theevolution of ROP payloads?

• What sort of things are theycapable of?

3. A Quick Introduction to Return Oriented Programming

• SITUATION: You have found anexploitable vulnerability in a targetprocess, and are able to corrupt theinstruction pointer.

• PROBLEM: The system or processenforces W⊕ X: you can’t write toexecutable memory, and you can’texecute writeable memory. Old-schoolshellcode attacks won’t work.

• SOLUTION: You can’t introduce anycode of your own, but you can reuselittle ‘gadgets’ of code that havealready been mapped to executablememory. The trick is rearrangingthese gadgets into something useful.

4. What is a ROP chain?

• A ‘gadget’ is any chunk ofmachine code that

1. is already mapped toexecutable memory

2. allows us to regain controlof the instruction pointerafter it executes

• The way a ROP gadget lets usregain control is that it endswith a particular form ofRETURN statement – those thatpop an address off the stackinto the instruction pointer.

• Ordinarily, the address poppedfrom the stack is a ‘bookmark’pointing to the site in thecode from which a function wascalled...

• ...but this is just aconvention. If an instructionpops an address from the stackinto the IP, it will do so nomatter what address we putthere.

• and we can take advantage ofthis to ‘chain’ arbitrarilymany gadgets together. Aseach reaches its RETURNinstruction, it sends theinstruction pointer to thenext gadget in the chain.

5. An Equally Quick Introduction to Genetic Programming

What is necessary in order fornatural selection to take place?

1 Reproduction with mutation

2 Variation in performance

3 Selection by performance

Anything that implements thesetraits can implement Darwinianevolution.

6. How ROPER works

ROPER evolves a population of ROPchains through a process ofnatural selection.

7. Evolutionary computation

The strategies ROPER adopts aredrawn from the field ofevolutionary computation, a broadclass of approaches to the problemof machine intelligence thatexploits the abstractness ofnatural selection by instantiatingit in code.

In particular, ROPER draws on thetradition of genetic programming,which treats a stochasticallygenerated set of programmes as thegenotypes of a population, andtheir performance when executed astheir phenotypes.

Selective pressures are brought tobear on the phenotypes, in orderto decide which genotypes areallowed to reproduce. Variationoperators are applied to thegenotypes, spawning newindividuals into the population.

Here, the genotypes are ROP-chains– stacks of pointers into gadgetsexisting in executable memory –and the phenotypes are thebehavioural profiles those chainsexhibit when hijacking theinstruction pointer of theexploited process.

8. Genetic Algorithm with Tournament Selection

9. Architecture of ROPER

10. Uneven Raw Materials

Register usage in tomato-RT-N18U-httpd, an ARM router HTTP daemon

Unlike classical linear genetic programming, where you have theclean slate of a customized instruction set and VM, here, we’redealing with the rough ground of already-compiled machine code (forthe ARM processor), and stuck with its idiosyncracies.

11. Pattern matching

The most basic type of problemthat ROPER can breed a populationof chains to solve is thatachieving a determinate registerstate in the CPU, specified by asimple pattern consisting ofintegers and wildcards.

This isn’t the most intriguingthing that ROPER can do, but it isfairly useful, automating theordinary, human task of assemblinga ROP chain that prepares the CPUfor a system call – to spawn aprocess, write to a file, open asocket, etc.

For example, suppose we wanted toprime the CPU for the call

execv("/bin/sh", ["/bin/sh"], 0);

We’d need a ROP chain that sets r0and r1 to point to some memorylocation that contains "/bin/sh",sets r2 to 0, and r7 to 11. Oncethat’s in place spawning a shellis as simple as jumping to anygiven address that contains an svcinstruction.

One of ROPER’s more peculiarsolutions to this problem – usinggadgets from a Tomato router’sHTTP daemon – is on the nextslide...

;; Gadget 0 ;; Extended Gadget 0 ;; Extended Gadget 1[000100fc] mov r0, r6 [00016890] str r0, [r4, #0x1c] [00012780] bne #0x18[00010100] ldrb r4, [r6], #1 [00016894] mov r0, r4 [00012784] add r5, r5, r7[00010104] cmp r4, #0 [00016898] pop {r4, lr} [00012788] rsb r4, r7, r4[00010108] bne #4294967224 [0001689c] b #4294966744 [0001278c] cmp r4, #0[0001010c] rsb r5, r5, r0 [00016674] push {r4, lr} [00012790] bgt #4294967240[00010110] cmp r5, #0x40 [00016678] mov r4, r0 [00012794] b #8[00010114] movgt r0, #0 [0001667c] ldr r0, [r0, #0x18] [0001279c] mov r0, r7[00010118] movle r0, #1 [00016680] ldr r3, [r4, #0x1c] [000127a0] pop {r3, r4, r5, r6, r7, pc}[0001011c] pop {r4, r5, r6, pc} [00016684] cmp r0, #0

[00016688] ldrne r1, [r0, #0x20] R0: 0002bc3eR0: 00000001 [0001668c] moveq r1, r0 R1: 00000000R1: 00000001 [00016690] cmp r3, #0 R2: 00000000R2: 00000001 [00016694] ldrne r2, [r3, #0x20] R7: 0000000bR7: 0002bc3e [00016698] moveq r2, r3

[0001669c] rsb r2, r2, r1 ;; Extended Gadget 2;; Gadget 1 [000166a0] cmn r2, #1 [000155ec] b #0x1c[00012780] bne #0x18 [000166a4] bge #0x48 [00015608] add sp, sp, #0x58[00012798] mvn r7, #0 [000166ec] cmp r2, #1 [0001560c] pop {r4, r5, r6, pc}[0001279c] mov r0, r7 [000166f0] ble #0x44[000127a0] pop {r3, r4, r5, r6, r7, pc} [00016734] mov r2, #0 R0: 0002bc3e

[00016738] cmp r0, r2 R1: 00000000R0: ffffffff [0001673c] str r2, [r4, #0x20] R2: 00000000R1: 00000001 [00016740] beq #0x10 R7: 0000000bR2: 00000001 [00016750] cmp r3, #0R7: ffffffff [00016754] beq #0x14 ;; Extended Gadget 3

[00016758] ldr r3, [r3, #0x20] [00016918] mov r1, r5 **;; Gadget 2 [0001675c] ldr r2, [r4, #0x20] [0001691c] mov r2, r6[00016884] beq #0x1c [00016760] cmp r3, r2 [00016920] bl #4294967176[00016888] ldr r0, [r4, #0x1c] [00016764] strgt r3, [r4, #0x20] [000168a8] push {r4, r5, r6, r7, r8, lr}[0001688c] bl #4294967280 [00016768] ldr r3, [r4, #0x20] [000168ac] subs r4, r0, #0[0001687c] push {r4, lr} [0001676c] mov r0, r4 [000168b0] mov r5, r1[00016880] subs r4, r0, #0 [00016770] add r3, r3, #1 [000168b4] mov r6, r2[00016884] beq #0x1c [00016774] str r3, [r4, #0x20] [000168b8] beq #0x7c[000168a0] mov r0, r1 [00016778] pop {r4, pc} [000168bc] mov r0, r1[000168a4] pop {r4, pc} [000168c0] mov r1, r4

[000168c4] blx r2R0: 00000001 R0: 0000000bR1: 00000001 R1: 00000000 R0: 0002bc3eR2: 00000001 R2: 00000000 R1: 0002bc3eR7: 0002bc3e R7: 0002bc3e R2: 00000000

R7: 0000000b

13. Extended Gadgets & Introns

This chain is interestingbecause its execution pathspends most of its time ingadgets that aren’t referencedin the chain itself (labelled‘extended gadgets’ on the lastslide). Gadget # 2 jumpsbackwards, and writes to its ownstack, overriding the pointersin its genome.

Chains like this emergefrequently, usually accompaniedby spikes in the population’scrash frequency – jumpingblindly to arbitrary addressesis hazardous.

What selection pressures couldbe responsible for thisphenomenon?

Conjecture:

• genes are selected not just forfitness, but for heritability

• our crossover operator has onlyweak/emergent respect for genelinkage, and none for homology

• so good genes are always at riskof being broken up instead ofpassed on

• ‘introns’ can pad importantgenes, and they decrease thechance that crossover willdestroy them – and so areselected for

• by branching away from the ROPstack at Gadget 2, our specimentransforms about 90% of itsgenome into introns

14. Fleurs du Malware

It seemed natural to see if ROPERcould also tackle traditionalmachine learning benchmarks, andgenerate ROP payloads that exhibitsubtle and adaptive behaviour.

To the best of my knowledge, thishas never been attempted before.

I decided to start with thewell-known Iris dataset, compiledby Ronald Fisher & Edgar Andersonin 1936.

Each ROP-chain in the populationwould be passed the petal andsepal measurements of eachspecimen in the Iris dataset.

The fitness of the chains was maderelative to the accuracy withwhich they could predict thespecies of iris from thosepredictions.

Given time, the population wouldbe able to recognize iris specieswith an accuracy of about 96 %, asan effect of evolution alone.

15. Low-Hanging Fruit & its Consequences for Diversity

• A challenge facing any machinelearning technique is to avoidgetting trapped in merely localoptima.

• This can happen, for example,if it hyperspecializes on aparticularly simple portion –the “low hanging fruit” – ofthe problem set, while failingto adapt to more difficultproblems.

• The phenomenon is analogous toa natural populationover-adapting to aparticularly hospitable niche.

• But in the wild, this isoffset by an increase incompetition and crowding,which increase the selectivepressure acting on formerlyhospitable niches.Low-hanging fruit doesn’t lastvery long.

16. Implementing Niching through Fitness Sharing

• In order to address thisissue, we first need to keeptrack of where, in the problemspace, the overfitting occurs.Where is the low-hangingfruit?

• To do this, we tag eachproblem in our space with a‘difficulty’ field, whichkeeps track of how ourspecimens perform on it, onaverage.

• Since the whole point oftracking difficulty is to haveit transform dynamically overthe course of the evolution,we’ll update these scoresevery so many iterations.

• On the next slide, we plot theprogress of the population’sbest and average fitnessscores on the left, and thedifficulty rations of ourproblems on the right –plotted by class mean andstandard deviation.

17. Tracking Niches without Crowding

18. Crowding Implemented as Fitness Sharing

• We haven’t yet changedanything in the way eachspecimen’s fitness isevaluated. The graph onlyshows us how the population isperforming, with respect toeach class of problems.

• But we can use thisinformation to tweak ourfitness function in waysrelevant to niching.

• All that we need to do is toscale the fitness pointsawarded for each problem withrespect to that problem’sdifficulty. The rewards forsolving ‘difficult’ problems(uncrowded niches) will begreater than those awarded forsolving ‘easy’ problems(crowded niches).

19. Niching with Crowding

20. Dynamic Braiding of Difficulty by Niche

A detailed view of the intricate braiding of niche availability thattakes place once we enable fitness sharing. The image is anenlargement of the right panel of the graph on the last slide,focussing on the region between iterations 3000 and 5000.

Because the environment perennially adjusts to the population’sstrengths and weaknesses, no specimen encounters the exact samefitness space as its distant ancestors, and cannot benefit fromoverfitting, or a diet of exclusively low-hanging fruit.

21. Snek!

The next step, which I’m currently working on, is to have ROPERevolve populations that can respond to dynamic environments. A goodsandbox for this sort of thing is to have ROPER’s populations playgames.

They’re currently learning how to play an implementation of Snakethat I hacked together (github.com/oblivia-simplex/snek).

github.com/oblivia-simplex/snek

22. Horizons and Applications

What potential uses are there for adaptive or intelligent ROP-chainpayloads?

• GOOD: IDS subversion and training through AI arms races – canROPER evolve payloads that evade the detection of AIs trained torecognize ROP execution? Can we use these to train better IDSAIs?

• EVIL: a component of complex, context-sensitive malware, usingfeature-recognizing ROP-chains to sense weaknesses oropportunities in a network

Documents

RETURN ORIENTED PROGRAMME EVOLUTION · • What is return oriented programming? ... shellcode attacks won’t work. ... fairly useful, automating the ordinary, human task of assembling