
Curing Cancer Computer Science - cs.virginia.eduevans/talks/css/scholars06.pdf · prize problems) • Solving the pegboard puzzle is equivalent to solving genome assembly • With



David Evans

http://www.cs.virginia.edu/evans

Curing Cancer with Your Cell Phone: Why all Sciences are Becoming Computing Sciences

2College Science Scholars

Computer Science = Doing Cool Stuff with Computers?

3College Science Scholars

Toaster Science = Doing Cool Stuff with Toasters?

4College Science Scholars

Computer Science

• Mathematics is about declarative (“what is”) knowledge; Computer Science is about imperative (“how to”) knowledge

• The Study of Information Processes

–How to describe them (Language)

–How to predict their properties (Logic)

–How to implement them quickly, cheaply, and reliably (Engineering)

5College Science Scholars

Most Science is About Information Processes

Which came first, the chicken or the egg?

How can a (relatively) simple, single cell turn into a chicken?

6College Science Scholars

Agenda

• Three Big Ideas:

–All Computers are Equally Powerful

–Programs are Data, Data are Programs

–Many Surprisingly Different Problems are Equally Difficult

• One Open Question

–Is a machine that can always guess correctly able to solve problems a normal machine can’t?


7College Science Scholars

“Computers” before WWII

8College Science Scholars

Mechanical Computing

9College Science Scholars

Modeling Pencil and Paper

“Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child’s arithmetic book.”

Alan Turing, On computable numbers, with an application to the Entscheidungsproblem, 1936

# C S S A 7 2 3

How long should the tape be?

... ...

10College Science Scholars

Modeling Brains

• Rules for steps

• Remember a little

“For the present I shall only say that the justification lies in the fact that the human memory is necessarily limited.”

Alan Turing

11College Science Scholars

Turing’s Model

[State diagram: states 1 (Start), 2, and 3, connected by transitions with these labels]

Input: #, Write: #, Move: ←

Input: 1, Write: 0, Move: ←

Input: 1, Write: 1, Move: →

Input: 0, Write: 0, Move: →

Input: 0, Write: #, Move: ←

Tape: # 1 0 1 1 0 1 1 ... 1 0 1 1 0 1 1 1 #

12College Science Scholars

Universal Machine

[Diagram: Description of a Turing Machine M + Input → Universal Machine → Result tape of running M on Input]

A Universal Turing Machine can simulate any Turing Machine running on any Input!
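The "machine description as input" idea can be sketched in a few lines of Python. This is only an illustration: the rule-table format, the state names, the "R"/"L" move encoding, and the FLIP_BITS machine are assumptions made for the example, not the machine drawn on the earlier slide.

```python
BLANK = "#"

def run(rules, tape, state="scan", halt="halt", max_steps=10_000):
    """Simulate the machine described by `rules` on `tape`.

    `rules` maps (state, symbol) -> (symbol_to_write, move, next_state).
    The machine description is ordinary data handed to this one simulator.
    """
    cells = dict(enumerate(tape))  # tape position -> symbol
    head, steps = 0, 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached; the machine may not halt")
        symbol = cells.get(head, BLANK)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    if not cells:
        return ""
    return "".join(cells.get(i, BLANK) for i in range(min(cells), max(cells) + 1))

# Hypothetical example machine: flip every bit, stop at the first '#'.
FLIP_BITS = {
    ("scan", "0"): ("1", "R", "scan"),
    ("scan", "1"): ("0", "R", "scan"),
    ("scan", "#"): ("#", "R", "halt"),
}

print(run(FLIP_BITS, "1011#"))
```

Swapping in a different rule table changes the behavior without touching `run`, which is the sense in which one machine can simulate any other.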


13College Science Scholars

Church-Turing Thesis

• All mechanical computers are equally powerful*

• There exists a Turing machine that can simulate any mechanical computer

• Any computer that is powerful enough to simulate a Turing machine can simulate any mechanical computer

*Except for practical limits like memory size, time, energy, etc.

14College Science Scholars

What This Means

• Your cell phone, watch, iPod, etc. has a processor powerful enough to simulate a Turing machine

• A Turing machine can simulate the world’s most powerful supercomputer

• Thus, your cell phone can simulate the world’s most powerful supercomputer (it’ll just take a lot longer and will run out of memory)

15College Science Scholars

Recap

• All Computers are Equally Powerful

• Programs are Data, Data are Programs

• Many Problems are Equally Difficult

–But no one knows how difficult!

16College Science Scholars

A “Hard” Problem?

17College Science Scholars

Generalized Pegboard Puzzle

• Input: a configuration of n pegs on a cracker barrel style pegboard (of any size)

• Output: if there is a sequence of jumps that leaves a single peg, output that sequence of jumps. Otherwise, output false.

Is this a “hard” problem?

18College Science Scholars

Solving Problems

• A solution to a problem instance: given a pegboard configuration, here’s the sequence of jumps

• A solution to a problem: a procedure that (1) always finds the correct answer, and (2) always finishes.


19College Science Scholars

“Brute Force” Solvers

• Enumerate all possible answers

–Every possible sequence of jumps

• Try them all until you find one that works

–Simulate the jumps

• This works for almost all problems!

• Problem: how long does it take?
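A brute-force solver along these lines might look like the sketch below. To stay short it assumes a hypothetical one-dimensional board (a row of holes, 1 = peg, 0 = empty) instead of the triangular Cracker Barrel board, and it returns None where the problem statement says to output false.

```python
def solve(board):
    """Try every sequence of jumps; return one that leaves a single peg.

    `board` is a tuple of 0/1 values for a one-dimensional row of holes
    (an assumption made for this sketch). Returns a list of (from, to)
    jumps, or None if no sequence works.
    """
    if sum(board) == 1:
        return []  # already solved: no more jumps needed
    n = len(board)
    for src in range(n):
        for step in (1, -1):  # jump to the right or to the left
            mid, dst = src + step, src + 2 * step
            if 0 <= dst < n and board[src] and board[mid] and not board[dst]:
                nxt = list(board)
                nxt[src], nxt[mid], nxt[dst] = 0, 0, 1  # simulate the jump
                rest = solve(tuple(nxt))
                if rest is not None:
                    return [(src, dst)] + rest
    return None  # every possible sequence was tried and failed

print(solve((1, 1, 0, 1)))
print(solve((1, 0, 1)))
```

The procedure always finishes, because each jump removes a peg, but the number of jump sequences it may have to explore grows very quickly with the number of pegs.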

20College Science Scholars

Problem Solving Time

[Chart: Time vs. Problem Input Size (1 to 10), with curves for ~ n, ~ n^3, and ~ 2^n]

21College Science Scholars

Increasing Problem Size

[Chart: Time vs. Problem Input Size (1 to 16), with curves for ~ 2^n and ~ n^3]

22

Tractable and Intractable Problems

[Chart: Time vs. Problem Input Size (1 to 20), with curves labeled “tractable” and “intractable”]

“I do nothing that a man of unlimited funds, superb physical endurance, and maximum scientific knowledge could not do.”

– Batman (may be able to solve intractable problems, but computer scientists can only solve tractable ones for large n)

23College Science Scholars

This makes a huge difference!

[Chart, log-log scale: Time (1 to 1E+30) vs. Problem Input Size (2 to 128), with curves for ~ 2^n and ~ n^3 and reference marks labeled “today”, “2032”, and “time since ‘Big Bang’”]
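The gap between the two curves can be reproduced with a few lines of Python. This sketch only counts abstract steps (n^3 versus 2^n) at the input sizes shown on the chart's axis; it makes no assumption about how long a step takes.

```python
def growth_table(sizes):
    """Return (n, n**3, 2**n) for each input size n."""
    return [(n, n ** 3, 2 ** n) for n in sizes]

if __name__ == "__main__":
    for n, cubic, exponential in growth_table([2, 4, 8, 16, 32, 64, 128]):
        print(f"n={n:>3}  n^3={cubic:>9,}  2^n={exponential:.2e}")
```

At n = 128 the cubic count is about two million steps, while the exponential count is a number with 39 digits.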

24College Science Scholars

Back to the Pegboard...

• A brute force solution is easy...but it is on the pink (exponential) line

• Is there a tractable solution?

[Chart: the same log-log chart as on the previous slide, Time (1 to 1E+30) vs. Problem Input Size (2 to 128)]


25College Science Scholars

Deciding a Problem Is Hard

• “I tried really hard and still couldn’t solve it.”

–Maybe the speaker isn’t smart enough

–Maybe a few days more effort will find it

• “Lots of really smart people tried really hard and no one could solve it.”

• “It seems sort of like this other problem that we think is hard...”

26College Science Scholars

Reduction

[Diagram: a Hard Problem Solver built around a Pegboard Solver: Input → Input Transformer → Pegboard Solver → jumps → Output Transformer → Output]
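The shape of a reduction is just function composition, which the following sketch makes explicit. The helper name `solver_by_reduction` is hypothetical, and the instance used to exercise it (finding a minimum with a sorting routine) is a deliberately trivial toy; the slides do not give the actual transformers between the hard problem and the pegboard puzzle.

```python
def solver_by_reduction(transform_input, inner_solver, transform_output):
    """Build a solver for one problem out of a solver for another.

    The instance is translated for the inner solver, the inner solver runs,
    and its answer is translated back.
    """
    def solve(instance):
        return transform_output(inner_solver(transform_input(instance)))
    return solve

# Toy instance (an assumption for illustration only):
# "find the minimum" solved by wrapping a "sort the list" solver.
find_min = solver_by_reduction(
    transform_input=list,                      # any iterable -> list
    inner_solver=sorted,                       # the solver we already have
    transform_output=lambda ordered: ordered[0],
)

print(find_min((3, 1, 2)))
```

The argument on the slide only works when both transformers are fast, since a slow transformer would hide the real cost of solving the outer problem.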

27College Science Scholars

Reading the Genome

Whitehead Institute, MIT

28College Science Scholars

Gene Reading Machines

• One read: about 700 base pairs

• But…don’t know where they are on the chromosome

Actual Genome: ACCAGAATACCCGTGATCCAGAATAA

Read 1: ACCAGAATACC

Read 2: TCCAGAATAA

Read 3: TACCCGTGATCCA

29College Science Scholars

Genome Assembly

Input: Genome fragments (but without knowing where they are from)

Output: The full genome

Read 1: ACCAGAATACC

Read 2: TCCAGAATAA

Read 3: TACCCGTGATCCA

30College Science Scholars

Genome Assembly

Input: Genome fragments (but without knowing where they are from)

Output: The smallest genome sequence such that all the fragments are substrings.

Read 1: ACCAGAATACC

Read 2: TCCAGAATAA

Read 3: TACCCGTGATCCA
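For the three reads on this slide, the smallest sequence containing all of them can be found by brute force. The sketch below tries every ordering of the reads and merges neighbors at their longest overlap; it assumes no read is contained inside another, and the function names are illustrative.

```python
from itertools import permutations

def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def merge(a, b):
    """Append `b` to `a`, sharing their longest overlap."""
    return a + b[overlap(a, b):]

def shortest_superstring(reads):
    """Brute force: try every ordering of the reads, keep the shortest merge."""
    if not reads:
        return ""
    best = None
    for order in permutations(reads):
        candidate = order[0]
        for read in order[1:]:
            candidate = merge(candidate, read)
        if best is None or len(candidate) < len(best):
            best = candidate
    return best

READS = ["ACCAGAATACC", "TCCAGAATAA", "TACCCGTGATCCA"]
print(shortest_superstring(READS))
```

With three reads there are only 6 orderings to try; the number of orderings is n!, so this approach is hopeless for millions of reads.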


31College Science Scholars

Genome Assembly Solver

[Diagram: a Genome Assembly Solver built around a Pegboard Solver: reads → Input Transformer → Pegboard Solver → jumps → Output Transformer → genome]

Input reads (~30M reads, ~900 bp): ACCAGAATACC, TCCAGAATAA, TACCCGTGATCCA

Output genome: ACCAGAATACCCGTGATCCAGAATAA

32College Science Scholars

What This Means

• We already know the shortest common superstring (genome assembly) problem is “hard”

• The pegboard problem must also be hard, since we could use a solver for it to solve the genome assembly problem

–Requires: we can build fast transformers that don’t increase the problem size exponentially

33College Science Scholars

Non-Deterministic Machines

[State diagram: states 1 (Start), 2, 3, and 4, connected by transitions with these labels]

Input: #, Write: #, Move: ←

Input: 1, Write: 0, Move: ←

Input: 1, Write: 1, Move: →

Input: 0, Write: 0, Move: →

Input: #, Write: 0, Move: →

Tape: # 1 0 1 1 0 1 1 ... 1 0 1 1 0 1 1 1 #

34College Science Scholars

Non-Deterministic Machine

• Every time there is a choice, it can guess the correct choice without looking ahead

• If we had such a machine, solving the Pegboard (or Genome Assembly, etc.) problem would be easy:

–It can guess the solution one step (alignment) at a time

35College Science Scholars

Big Open Question

Is a non-deterministic machine able to solve problems that are intractable on a deterministic machine?

[Chart, log-log scale: Time (1 to 1E+30) vs. Problem Input Size (2 to 128)]

Seems obvious that the magic guess-correctly ability should be useful...but no one knows for sure!

36College Science Scholars

Recap

• “P vs NP” problem (one of the millennium prize problems)

• Solving the pegboard puzzle is equivalent to solving genome assembly

• With a non-deterministic machine, we could solve both

• With a mechanical computer, we don’t know if a tractable solution exists (but can’t prove it doesn’t): We don’t know if checking a solution is really easier than finding it
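The "checking" half of that last point is easy to make concrete. In this sketch, a proposed assembly for the reads from the earlier slides is checked in one pass per read; the function name and the `max_length` parameter are assumptions for illustration. Finding a candidate that passes is the part nobody knows how to do tractably in general.

```python
def is_valid_assembly(candidate, reads, max_length):
    """Check a proposed genome: short enough and containing every read."""
    return len(candidate) <= max_length and all(read in candidate for read in reads)

READS = ["ACCAGAATACC", "TCCAGAATAA", "TACCCGTGATCCA"]
GUESS = "ACCAGAATACCCGTGATCCAGAATAA"

print(is_valid_assembly(GUESS, READS, max_length=26))
print(is_valid_assembly(GUESS, READS, max_length=25))
```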


37College Science Scholars

Summary

• Computer Science is the study of information processes: all about problem solving

• Many seemingly paradoxical results:

–All Computers are Equally Powerful!

–Many Surprisingly Different Problems are Equivalent!

• And seemingly obvious open problems:

–Is checking a solution really easier than finding it?

38College Science Scholars

Computer Science at UVa

• New Interdisciplinary Major in Computer Science for A&S students (approved last year)

• Take CS150 this Spring

–Every scientist needs to understand computing, not just as a tool but as a way of thinking

• Lots of opportunities to get involved in research groups

39College Science Scholars

My Research Group

• Computer Security: computing in the presence of adversaries

• Recent student projects:

–Proof that the Pegboard puzzle is hard (Mike Peck and Chris Frost)

–Disk-level virus detection (Adrienne Felt)

–Web Application Security (Sam Guarnieri)

–N-Variant Systems: run variants of a program simultaneously (Sean Talts)

40College Science Scholars

Questions

http://www.cs.virginia.edu/evans