cs3102: Theory of Computation Class 21: Undecidability in Theory and Practice Spring 2010 University of Virginia David Evans Exam 2: Out at end of class today, due Tuesday at 2:01pm

cs3102: Theory of Computation

Class 21: Undecidability in Theory and Practice

Spring 2010
University of Virginia
David Evans

Exam 2: Out at end of class today, due Tuesday at 2:01pm

Menu

• Turing-equivalent Grammar
• Universal Programming Languages
• “Return-Oriented Programming”
• Problems Computers Can and Cannot Solve

What does undecidability mean for problems real people (not just CS theorists and Busy Beavers) care about?

Models of Computation (adapted from Class 1)

Machine | Replacement Grammar
Finite Automata (Classes 2-5) | Regular Grammar (A → aB)
Pushdown Automata (add a stack) (Classes 6-7) | Context-free Grammar (Classes 7-9) (A → BC)
Turing machine (add an infinite tape) (Classes 14-20) | ?

Unrestricted Grammar

aXbY → cZd

Right and left sides of a grammar rule can be any sequence of terminals and nonterminals.

How can we prove unrestricted grammars are equivalent to a Turing machine?

Simulation Proof

1. Show that we can simulate every Unrestricted Grammar with some TM. Fairly easy, but tedious (not shown): design a TM that does the grammar replacements by writing on the tape.

2. Show that we can simulate every TM with some Unrestricted Grammar.
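
The "grammar replacements" direction can be sketched in a few lines of Python, with a string standing in for the tape. This is only an illustration, not the construction from the lecture: the function name `derives`, the step budget, and the sample grammar for a^n b^n c^n are all assumptions introduced here. The search returns True when the target is derived, False when every reachable form has been explored, and None when the budget runs out without an answer.

```python
from collections import deque

def derives(rules, start, target, max_steps=10_000):
    """Breadth-first search over sentential forms of an unrestricted grammar.

    rules is a list of (lhs, rhs) string pairs; either side may mix
    terminals (lowercase) and nonterminals (uppercase).
    """
    seen = {start}
    queue = deque([start])
    for _ in range(max_steps):
        if not queue:
            return False              # every reachable form explored
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:          # apply the rule at every matching position
                successor = form[:pos] + rhs + form[pos + len(lhs):]
                if successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
                pos = form.find(lhs, pos + 1)
    return None                       # budget exhausted: no answer yet

# Hypothetical sample grammar generating a^n b^n c^n (n >= 1).
ABC_RULES = [
    ("S", "aSBC"), ("S", "aBC"),
    ("CB", "BC"),
    ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]
```

Calling `derives(ABC_RULES, "S", "aabbcc")` finds a derivation, while a string outside the language only exhausts the budget. The search can confirm membership but cannot in general confirm non-membership.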

Simulating TM with UG

(No replacement rules with R on left side)

Initial Configuration

[Tape diagram: the machine is in state q0, and the tape holds the input w0 w1 w2 . . .]

Models of Computation

Machine | Replacement Grammar
Finite Automata (Classes 2-5) | Regular Grammar (A → aB)
Pushdown Automata (add a stack) (Classes 6-7) | Context-free Grammar (Classes 7-9) (A → BC)
Turing machine (add an infinite tape) (Classes 14-20) | Unrestricted Grammar (Class 21)

Universal Programming Language

• Definition: a programming language that can describe every algorithm.

• Equivalently: a programming language that can simulate every Turing Machine.

• Equivalently: a programming language in which you can implement a Universal Turing Machine.

Which of these are Universal Programming Languages?

Python
Java
C++
C#
HTML
Scheme
Ruby
COBOL
Fortran
JavaScript
PostScript
BASIC
x86
TeX
PDF

Proofs

• BASIC, C, C++, C#, Fortran, Java, JavaScript, PDF, PostScript, Python, Ruby, Scheme, TeX, etc. are universal
  – Proof: implement a TM simulator in the PL
• HTML (before HTML5) is not universal:
  – Proof: show some algorithm that cannot be implemented in HTML
    • An infinite loop
  – HTML5 might be a universal programming language! (Proof is worth challenge bonus.)
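
As one illustration of the "implement a TM simulator in the PL" proof, here is a minimal Python sketch. The state names (`q0`, `qaccept`, `qreject`), the blank symbol `_`, the convention that a missing transition rejects, and the sample machine for { 0^n 1^n } are all assumptions made for this example, not definitions from the course. The step budget is there only so the example always returns; a faithful simulator would run as long as the machine does.

```python
def run_tm(delta, tape_input, start="q0", accept="qaccept",
           reject="qreject", blank="_", max_steps=10_000):
    """Simulate a one-tape Turing machine.

    delta maps (state, symbol) -> (new_state, symbol_to_write, "L" or "R").
    Returns True (accept), False (reject), or None if the budget runs out.
    """
    tape = dict(enumerate(tape_input))    # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        symbol = tape.get(head, blank)
        if (state, symbol) not in delta:
            return False                  # assumed: missing transition rejects
        state, write, move = delta[(state, symbol)]
        tape[head] = write
        head = head + 1 if move == "R" else max(head - 1, 0)
    return None                           # cannot tell: machine still running

# Hypothetical sample machine deciding { 0^n 1^n | n >= 0 }.
ZERO_N_ONE_N = {
    ("q0", "0"): ("q1", "X", "R"),        # mark a 0, go find a matching 1
    ("q0", "Y"): ("q3", "Y", "R"),        # no 0s left: check only Ys remain
    ("q0", "_"): ("qaccept", "_", "R"),   # empty input
    ("q1", "0"): ("q1", "0", "R"),
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "1"): ("q2", "Y", "L"),        # mark the matching 1
    ("q2", "0"): ("q2", "0", "L"),
    ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),        # back at the last marked 0
    ("q3", "Y"): ("q3", "Y", "R"),
    ("q3", "_"): ("qaccept", "_", "R"),
}
```

For example, `run_tm(ZERO_N_ONE_N, "0011")` accepts and `run_tm(ZERO_N_ONE_N, "001")` rejects.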

Why is it impossible for a programming language to be both universal and resource-constrained?

Resource-constrained means it is possible to determine an upper bound on the resources any program in the language can consume.

All universal programming languages are equivalent in power: they can all simulate a TM, which can carry out any mechanical algorithm.

Proliferation of Universal PLs

• “Aesthetics”
  – Some people like :=, others prefer =.
  – Some people think whitespace shouldn’t matter (e.g., Java), others think programs should be formatted like they mean (e.g., Python)
  – Some people like goto, others like throw.
• Expressiveness vs. Simplicity
  – Hard to write programs in SUBLEQ
• Expressiveness vs. “Truthiness”
  – How much you can say with a little code vs. how likely it is your code means what you think it does
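
To see why SUBLEQ is simple but hard to program in, here is a minimal interpreter sketch. SUBLEQ has a single instruction, "subtract and branch if less than or equal to zero": for operands a, b, c it computes mem[b] -= mem[a] and jumps to c if the result is <= 0. The halting convention (a negative program counter) and the memory layout of the sample program are assumptions chosen for this example.

```python
def run_subleq(program, max_steps=100_000):
    """Run a SUBLEQ program; code and data share one list of integers."""
    mem = list(program)
    pc = 0
    for _ in range(max_steps):
        if pc < 0 or pc + 2 >= len(mem):
            return mem                    # assumed convention: this means halt
        a, b, c = mem[pc:pc + 3]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
    raise RuntimeError("step budget exhausted")

# Adding two numbers takes three instructions and a scratch cell:
#   cells 9, 10, 11 hold X = 7, Y = 35, Z = 0 (hypothetical layout)
ADD_PROGRAM = [
    9, 11, 3,     # Z = Z - X   (Z is now -X)
    11, 10, 6,    # Y = Y - Z   (Y is now Y + X)
    11, 11, -1,   # Z = Z - Z   (result 0, so jump to -1: halt)
    7, 35, 0,
]
```

After `run_subleq(ADD_PROGRAM)`, cell 10 holds the sum. Even a plain addition needs a double negation through a scratch cell, which is the expressiveness cost the slide points at.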

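Programming Language Design Space

[Figure: languages placed in a two-axis design space. Axis labels: Expressiveness; “Truthiness” (more mistake prone to less mistake prone); low; high. Languages shown: Scheme, Python, Java, C, x86, Spec#, Ada. Annotation: strict typing, static.]

Hello-world examples shown on the figure:

Java: public class HelloWorld { public static void main(String[] args) { System.out.println ("Hello!"); } }

Python: print ("Hello!")

Scheme: (display "Hello!")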

Do most x86 programs contain Universal Turing Machines?

Hovav Shacham. The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86). CCS 2007.

[Paper link]

CS 3330 Condensed

[Diagram: the processor's execution loop. Labels: Instruction Pointer (ip), load, Memory, Fetch Instruction, Execute Instruction, Update ip, and an Execution Stack (not like PDA stack) holding Return Addresses.]

CS 2150 Really Condensed

x86 programs are just sequences of bytes (1 byte = 8 bits = 2 hex characters)

Instructions have variable length (1-15 bytes). First byte is opcode that identifies the type of instruction.

b8 ef be ad de    MOV $0xDEADBEEF, %eax
5-byte instruction that writes the constant 0xDEADBEEF into register %eax

c3    RET
1-byte instruction that returns (jumps to the return address that is stored in a location on the stack)

eb fe    JMP -2
2-byte instruction that moves the instruction pointer back 2 (to the start of this same instruction, so it loops forever)

Sequences of Instructions

f7 c7 07 00 00 00 0f 95 45 c3

(Example from Shacham’s paper)

Decoded starting at the first byte:
test $0x00000007, %edi
setnzb -61(%ebp)

Decoded starting one byte later:
movl $0x0f000000, (%edi)
xchg %ebp, %eax
inc %ebp
ret
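
The way the same bytes yield different instructions at different offsets can be checked with a toy decoder. The sketch below covers only the handful of opcodes appearing in this lecture (RET, short JMP, INC reg, XCHG reg with %eax, MOV imm32 to reg, TEST/MOV imm32 with a ModRM operand, SETNZ) and only simple addressing modes. It is an assumption-laden illustration, not a real disassembler, and the function names are invented here.

```python
import struct

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_rm(code, i):
    """Decode the ModRM byte at code[i]; return (operand text, bytes used).
    Covers register-direct, (reg), and disp8(reg) forms without a SIB byte."""
    mod, rm = code[i] >> 6, code[i] & 7
    if mod == 3:
        return "%" + REGS[rm], 1
    if mod == 0 and rm not in (4, 5):
        return "(%" + REGS[rm] + ")", 1
    if mod == 1 and rm != 4:
        disp = struct.unpack_from("<b", code, i + 1)[0]   # signed 8-bit offset
        return f"{disp}(%{REGS[rm]})", 2
    raise ValueError("addressing mode not covered by this sketch")

def decode_one(code, i):
    """Return (assembly text, length in bytes) for the instruction at code[i]."""
    op = code[i]
    if op == 0xC3:
        return "ret", 1
    if op == 0xEB:
        return f"jmp {struct.unpack_from('<b', code, i + 1)[0]:+d}", 2
    if 0x40 <= op <= 0x47:
        return f"inc %{REGS[op - 0x40]}", 1
    if 0x90 < op <= 0x97:
        return f"xchg %{REGS[op - 0x90]}, %eax", 1
    if 0xB8 <= op <= 0xBF:
        imm = struct.unpack_from("<I", code, i + 1)[0]
        return f"mov $0x{imm:08x}, %{REGS[op - 0xB8]}", 5
    if op in (0xF7, 0xC7) and (code[i + 1] >> 3) & 7 == 0:
        operand, n = decode_rm(code, i + 1)
        imm = struct.unpack_from("<I", code, i + 1 + n)[0]
        name = "test" if op == 0xF7 else "movl"
        return f"{name} $0x{imm:08x}, {operand}", 1 + n + 4
    if op == 0x0F and code[i + 1] == 0x95:
        operand, n = decode_rm(code, i + 2)
        return f"setnzb {operand}", 2 + n
    raise ValueError(f"opcode {op:#04x} not covered by this sketch")

def disassemble(code, start=0):
    """Decode instructions back to back, beginning at byte offset start."""
    out, i = [], start
    while i < len(code):
        text, length = decode_one(code, i)
        out.append(text)
        i += length
    return out

SHACHAM_BYTES = bytes.fromhex("f7c7070000000f9545c3")
```

`disassemble(SHACHAM_BYTES)` gives the two intended instructions, while `disassemble(SHACHAM_BYTES, 1)` gives a different sequence that ends in `ret`. That unintended sequence is the kind of "gadget" return-oriented programming relies on.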

“Return-Oriented Programming”

Execution Stack:

Return Address 1
Return Address 2
Return Address 3
Return Address 4
Return Address 5
Return Address 6

41 AC 16 FA A0 44 79 8C 2D 43 B3 47 4C 62 69 80 B8 C9 9F BB 34 99 E9 4E 75 D5 A5 64 5F 5A B6 4B 61 A8 B4 AE 27 85 05 0B CA 20 F5 F2 16 C2 B4 8D 54 13 58 93 94 DC 02 16 55 B1 AD 57 80 97 BB DA 39 B3 23 7D B3 BD D2 87 D6 D2 C5 67 8B BE 5F 09 BB B8 F7 EF 93 15 1E 8F 5E 4C C1 66 C1 1D 82 06 B7 C1 62 96 00 17 F9 CD 82 2F 93 C2 10 5D DD 21 4D 16 F4 8E 36 7B 7D 91 C7 D3 E1 49 DB A5 FE A4 61 5C 5D E4 8C 8D 6C 33 C3 46 7E 27 F7 88 25 37 F6 F9 B0 E8 B8 42 11 43 4F 6B 57 03 56 FB C8 07 4B 9A F7 FC 1D 8D 0D D5 98 38 0B D7 9B 0F 5B 8D A7 CE F5 66 50 5B 36 82 0F DA 39 16 35 55 6B C7 D0 48 52 89 F6 C7 2C 1F EF B7 56 2D F0 1B 39 2F 26 65 69 FB 42 6F DD F7 A1 5D 83 C5 07 43 C3 B9 E7 BA DF B7 DD 28 5C 62 6F 2F 17 9F D1 51 EC 82 0C 40 7B 51 91 F5 19 31 B7 E0 C7 0B 5A 03 CB 3A 55 82 60 39 85 92 BE 38 C2 DB EA A7 E6 8C 0B B8 0A 53 35 C2 BA 54 1D CF D8 76 B1 DD F2 4E DF 3F C6 FF A7 BF 4B 89 D4 11 2F 3F 4F F7 93 B5 CB 6A AA 01 E1 E6…

Injecting Malicious Code

C Program:

    #include <stdio.h>

    int main (void) {
        int x = 9;
        char s[4];

        gets(s);   /* deliberately unsafe: gets does no bounds check on s */
        printf ("s is: %s\n", s);
        printf ("x is: %d\n", x);
    }

Stack: s[0], s[1], s[2], s[3], x, return address

Input: a b c d e f g h ...

Buffer Overflows

    #include <stdio.h>

    int main (void) {
        int x = 9;
        char s[4];

        gets(s);   /* deliberately unsafe: gets does no bounds check on s */
        printf ("s is: %s\n", s);
        printf ("x is: %d\n", x);
    }

    > gcc -o bounds bounds.c
    > bounds
    abcdefghijkl                          (User input)
    s is: abcdefghijkl
    x is: 9
    > bounds
    abcdefghijklm
    s is: abcdefghijklm
    x is: 1828716553                      = 0x6d000009
    > bounds
    abcdefghijkln
    s is: abcdefghijkln
    x is: 1845493769                      = 0x6e000009
    > bounds
    aaa... [a few thousand characters]
    crashes shell

Note: your results may vary (depending on machine, compiler, what else is running, time of day, etc.). This is what makes C fun!

What does this kind of mistake look like in a popular server?
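
The values 0x6d000009 and 0x6e000009 can be reproduced with a small model of the stack frame. The layout below is inferred from the numbers in the session and is not stated in the slides: it assumes a big-endian machine with x stored 12 bytes past s[0], so the 13th input character lands on the most significant byte of x. The function name and frame size are hypothetical.

```python
import struct

def run_bounds(user_input, x_offset=12, frame_size=32):
    """Model gets(s) writing into a frame that also holds int x = 9.

    Assumed layout: s starts at offset 0, x is a big-endian 32-bit int
    at x_offset. Returns (s as printed by %s, x as printed by %d).
    """
    frame = bytearray(frame_size)
    struct.pack_into(">i", frame, x_offset, 9)    # int x = 9
    data = user_input + b"\x00"                   # gets appends a NUL byte
    if len(data) > frame_size:
        raise OverflowError("input runs past this toy frame")
    frame[0:len(data)] = data                     # no bounds check, as in gets
    s = bytes(frame).split(b"\x00", 1)[0].decode("ascii")
    x = struct.unpack_from(">i", frame, x_offset)[0]
    return s, x
```

With twelve characters only the terminating NUL reaches x, and it overwrites a byte that was already zero, so x stays 9. With a thirteenth character, `m` (0x6d) or `n` (0x6e) replaces that byte, giving 0x6d000009 = 1828716553 and 0x6e000009 = 1845493769.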

Code Red

Defenses

• Use a type-safe programming language (e.g., bounds checking)
• Write-xor-Execute pages
  – When the OS loads a page into memory, it is marked as either executable or writable: can’t be both
  – Hence: attacker can inject all the code it wants on the stack, but can’t jump to it and execute it

“Return-Oriented Programming”

Defeats the Write-xor-Execute (W⊕X) defense! An attacker can run any code they want by finding a Turing-complete set of “gadgets” in your program and jumping to them!
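
The idea of chaining gadgets can be illustrated with a toy model in which the attacker supplies only data (addresses and constants) and never injects code. Everything here is hypothetical: the gadget addresses, the three gadgets, and the register set are invented for the example, and each Python function stands in for a short instruction sequence ending in RET that already exists in the victim program.

```python
def pop_eax(regs, stack):
    regs["eax"] = stack.pop()             # models: pop %eax; ret

def pop_ebx(regs, stack):
    regs["ebx"] = stack.pop()             # models: pop %ebx; ret

def add_ebx_to_eax(regs, stack):
    regs["eax"] = (regs["eax"] + regs["ebx"]) & 0xFFFFFFFF   # add %ebx, %eax; ret

# Hypothetical addresses where these sequences happen to occur.
GADGETS = {
    0x08041000: pop_eax,
    0x08042000: pop_ebx,
    0x08043000: add_ebx_to_eax,
}

def run_chain(payload):
    """Execute a ROP payload: a list of words, first word on top of the stack."""
    stack = list(reversed(payload))       # list end = top of stack
    regs = {"eax": 0, "ebx": 0}
    while stack:
        address = stack.pop()             # what RET does: pop and jump
        GADGETS[address](regs, stack)     # gadget runs, then "returns" again
    return regs

# Payload written onto the stack by the overflow: compute 2 + 40.
PAYLOAD = [0x08041000, 2, 0x08042000, 40, 0x08043000]
```

`run_chain(PAYLOAD)` leaves 42 in `eax`. No page ever needs to be both writable and executable, because the stack holds only addresses and constants.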

Does it really work?

• Likelihood of finding enough gadgets in “random” bytes to make a Turing-complete set:
  – libc (C library included in nearly all Unix programs) contains more than enough (18MB; if each byte is c3 with probability about 1/256, expect ~74,000 RET (c3) instructions)
• Demonstration of attack on voting machine: http://www.youtube.com/watch?v=lsfG3KPrD1I

Vulnerability Detection

Input: an x86 program P
Output: True if there is some input w, such that running P on w allows the attacker to overwrite return addresses on stack; False otherwise.

Example: Morris Internet Worm (1988)

P = fingerd
– Program used to query user status (running on most Unix servers)

isVulnerable(P)?
Yes, for w = “nop^400 pushl $68732f pushl $6e69622f movl sp,r10 pushl $0 pushl $0 pushl r10 pushl $3 movl sp,ap chmk $3b”
– Worm infected several thousand computers (~10% of Internet in 1988)

Vulnerability Detection is Undecidable
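
The proof on these slides did not survive extraction. A standard way to argue undecidability is a reduction from the halting problem, sketched below under that assumption; it may differ from the argument given in class. The names `make_p`, `decide_halting`, and the callback `overwrite_return_address` are hypothetical, and Python functions stand in for machines and x86 programs.

```python
def make_p(machine, w, overwrite_return_address):
    """Build P_{M,w}: a program that ignores its own input, simulates M on w,
    and only afterwards performs the vulnerable action."""
    def p(_own_input):
        machine(w)                        # never returns if M loops on w
        overwrite_return_address()        # reached only if M halts on w
    return p

def decide_halting(machine, w, is_vulnerable, overwrite_return_address):
    """If is_vulnerable always terminated with the correct answer, this
    function would decide whether M halts on w, which is impossible."""
    return is_vulnerable(make_p(machine, w, overwrite_return_address))
```

P_{M,w} is vulnerable on some input exactly when M halts on w, so a correct, always-terminating `is_vulnerable` would give a decider for the halting problem. Since no such decider exists, neither does `is_vulnerable`.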

“Solving” Undecidable Problems

• Undecidable means there is no program that
  1. Always gives the correct answer, and
  2. Always terminates
• Must give up one of these:
  – Giving up #2 is not acceptable in most cases
  – Must give up #1: cannot be correct on all inputs
• Or change the problem
  – e.g., modify P to make it invulnerable, etc.

Actual Vulnerability Detectors

• Sometimes give the wrong answer:
  – “False positive”: say P is vulnerable when it isn’t
  – “False negative”: say P is safe when it isn’t
• Heuristics to find common errors
• Heuristics to rank-order possible problems

Can Microsoft squash 63,000 bugs in Windows 2000? … Overall, there are more than 65,000 "potential issues" that could emerge as problems, as discovered by Microsoft's Prefix tool. Microsoft is estimating that 28,000 of these are likely to be "real" problems.

Computability in Theory and Practice

(Intellectual Computability Discussion on TV)

http://video.google.com/videoplay?docid=1623254076490030585#

Ali G Problem

Input: a list of numbers (mostly 9s)
Output: the product of the numbers

Is L_ALIG decidable?

L_ALIG = { <k0, k1, …, kn, p> | each ki represents a number and p represents a number that is the product of all the ki }

Yes. It is easy to see a simple algorithm (e.g., elementary school multiplication) that decides it.

Can real computers solve it?
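
A decider for L_ALIG is short when integers are unbounded, as Python's are. The sketch below also includes a check against a 64-bit limit to hint at where a fixed-width machine stops agreeing with the theory. The function names and the choice of 64 bits are assumptions made for illustration.

```python
from math import prod

def in_l_alig(ks, p):
    """Decide membership: is p the product of all the numbers in ks?"""
    return prod(ks) == p

def fits_in_64_bits(ks):
    """Would the product fit in an unsigned 64-bit machine word?"""
    return prod(ks) < 2 ** 64
```

Twenty 9s still fit in 64 bits, but twenty-one do not, so a machine limited to fixed-width arithmetic gets wrong answers on inputs the ideal decider handles easily.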

Ali G was Right!

• Theory assumes ideal computers:
  – Unlimited, perfect memory
  – Unlimited (finite) time
• Real computers have:
  – Limited memory, time, power outages, flaky programming languages, etc.
  – There are many decidable problems we cannot solve with real computers: the actual inputs do matter (in practice, but not in theory!)

Charge

• Exam 2 out now
• Due at beginning of class, Tuesday

• It has some pretty tough questions (and no really easy questions): don’t get stressed out if you can’t answer everything