Machine Science Distilling Free-Form Natural Laws from Experimental Data Hod Lipson, Cornell...

Preview:

Citation preview

Machine ScienceDistilling Free-Form Natural Laws from Experimental Data

Hod Lipson, Cornell University

Lipson & Pollack, Nature 406, 2000

Camera

CameraView

Adapting in simulation

Simulator

Evolve ControllerIn Simulation

Try it in reality!

Download

Crossing The

Reality Gap

Adapting in reality

Evolve ControllerIn Reality

Try it !

Too many Physical

Trials

Simulation & Reality

Evolve Controller

“Simulator”

Try it in reality!

Build

Collect Sensor Data

Evolve SimulatorEvolve Simulators Evolve Robots

ServoActuators

Tilt Sensors

Emergent Self-Model

With Josh Bongard and Victor Zykov, Science 2006

Damage Recovery

With Josh Bongard and Victor Zykov, Science 2006

RandomPredictedPhysical

System Identification

?

Perturbations

Photo: Floris van Breugel

Photo: Floris van Breugel

Structural Damage Diagnosis

With Wilkins Aquino

.com

With Jeff Clune, Jason Yosinski

Symbolic Regression

f(x)=exsin(|x|)

What function describes this data?

John Koza, 1992

Encoding Equations

*

sin–

x1 3

f(x)

+

x2 -7

Building Blocks: + - * / sin cos exp log … etc

sin(x2)

x1*sin(x2)

(x1 – 3)*sin(x2)

(x1 – 3)*sin(-7 + x2)

John Koza, 1992

Michael D. Schmidt, Hod Lipson (2006)

Models: Expression treesSubject to mutation and selection

Experiments: Data-pointsSubject to mutation and selection

{const,+,-,*,/,sin,cos,exp,log,abs}

+

sin×

–1.2

x 2x

+

sin×

–1.2

x 2x

Solution Accuracy

Entire DatasetCoevolved Dataset

15

17

19

21

23

25

27

29

31

33

35

0 5000 10000 15000

Generations

Sol

utio

n S

ize

(# o

f nod

es)

Coevolution

Exact

Solution Complexity

Entire Dataset

Coevolved Dataset

Semi-empirical mass formula

Modeling the binding energy of an atomic nucleus

ZAA

ZAa

A

ZZaAaAaE ACSVB ,

21 2

3132

odd ,

odd

even ,

0,

0

0

NZ

A

NZ

ZA

210 A

aP

R2 = 0.999915

Weizsäcker’s Formula:

A

ZN.

A

Z.A.A..E

..

B

2

260

2640 )(2917390

391243138314

R2 = 0.99944

Inferred Formula:

Systems of Differential Equations

• Regress on derivative

time x1 x2 …

0 3.4 -1.7 …0.1 3.2 -0.9 …0.2 3.1 -0.1 …0.3 2.7 1.2 …… … … …

State Variables

dx1/dt x2/dt …

-2.0 8.0 …-1.0 8.0 …-4.0 1.3 …-5.7 1.9 …… … …

Derivatives

0

1

2 6

1

2 3

3

1 3143

1 322 1 2 14

3

2*

32 1 3 2

42 3

*2.5 100

1 13.6769*

*200 6* * 12* *

1 13.6769*

6* * 16* *

16* * 100*

J

v

v v

v

v v

v

S AdS

dt A

S AdSS N S N

dt A

dSS N S A

dt

dSA S

dt

4

2 4

3 5

1

2 4

22 1 2 4

3 1 32 3 34

3 2

2*

55

*

6* * 100* *

*200 32* * 1.28*

1 13.6769

1.3*

v

v v

v v

v

J

N S

dNS N N S

dt

dA S AA S A

dt A

dSS

dt

Original Equations Inferred Equations

Inferring Biological Networks

With Michael Schmidt, John Wikswo (Vanderbilt), Jerry Jenkins (CFDRC)

0

1

2 6

1

2

1 3143

1 322 1 2 14

3

2*

32 1

*2.42114 99.2721

1 13.5956*

*199.935 5.99475* * 11.9895* *

1 13.6734*

5.99857* * 1

J

v

v v

v

v

S AdS

dt A

S AdSS N S N

dt A

dSS N

dt

3

43

2 4

3 2

42 3 2 4

22 1 2 4

3 1 3

3

43

2*

5.99606* *

15.997* * 100.0

0.01

15* *

5.99857* * 99.9963* *

*197.781

1 13.263

286

3

*

v

vv

v

extran

v

v

eous

S A

dSA S N S

dt

dNS N N S

dt

dA S

dt

S

A

A

3 5

1

2 3 3

2

55

31.9682* * 1.29659*

1.29626*

v v

J

A S A

dSS

dt

Charles Richter

Ingm

ar Z

ange

r, J

ohn

Am

end

Wet Data, Unknown System

With Michael Schmidt (Cornell) and Gurol Suel (UT Southwestern)

Wet Data, Unknown System

With Michael Schmidt (Cornell) and Gurol Suel (UT Southwestern)

Cell #1

Cell #2

Cell #3-60 …

Blue Dots = data points, Green Line = regressed fit

=

=

0

1

1

11

/ /

/ //

nK K

k Kn nK S

S kS Sp

K S

K KdKK

dt k K K S

SdSS

dt K SK k

Biologist’s Inferred Model: Gurol Suel, et. al., Science 2007

Symbolic Regression Inferred Time-Delay Model:

S

Kcba

dt

dSK

Scba

dt

dK

SSS

KKK

7423.163214.170.1582

18

51

t

tt

G

S

dt

dG

05.33019.0922.114

15

25

t

tt

S

G

dt

dS

Withheld Test Set #1 Fit

1355.10312.2192.3526

17

54

t

tt

G

S

dt

dG

9693.20178.0271.132

18

57

t

tt

S

G

dt

dS

Withheld Test Set #2 Fit

4406.67452.391.5057

21

46

t

tt

G

S

dt

dG

871.32965.0426.295

20

54

t

tt

S

G

dt

dS

Withheld Test Set #3 Fit

Looking For Invariants

Data Mining

42

42+x-x

42+1/(1000+x2)

?

From Data: x y …

0.1 2.30.2 4.50.3 9.70.4 5.10.5 3.30.6 1.0… … …

δxδy

δyδx

, , …

Calculate partial derivatives Numerically:

δfδx

δfδy

From Equation:

δx’δy’

δy’δx’

, , …

Calculate predicted partial derivatives Symbolically:

-5 0 5

-3

-2

-1

0

1

2

3

x

y

x

y

x3 + x – y2 – 1.5 = 0x2 + y2 – 16 = 0 x2 + y2 + z2 – 1 = 0

-2 -1 0 1 2-3

-2

-1

0

1

2

3

x

y

x

y

-1 -0.5 0 0.5 1-1

-0.5

0

0.5

1

x

z

x

z

Homework

Circle Elliptic Curve Sphere

Linear Oscillator

• Coefficients may have different scales and offsets each run

Pendulum

2

2.42847*cos( )

d

Ldt 2

3.52768* 9.43429*cos( ) d

Hdt

H

L

Double Linear Oscillator

2 22 2 2 11 2 1 214.691* 15.551* 21.676* 8.3808* 2.6046*

dx dxH x x x x

dt dt

would be plus for Lagrangian

-k1 ω1 – k2 ω2 + k3 ω1 cos(θ1 – θ2) + k4 ω2 cos(θ1 – θ2)

k1 ω1 ω2 – k2 cos(θ1 – θ2)

k1 ω12 + k2 ω2

2 – k3 ω1 ω2 cos(θ1 – θ2) – k4 cos(θ1) – k5 cos(θ2)

Parsimony [-nodes]

Pre

dic

tive

Ab

ilit

y [-

log

err

or]

A k1 θ2 – k2 ω12 – k3 ω2

2 + k4 ω1 ω2 cos(θ1 – k5 θ2) + k6 cos(θ2) + k7 cos(θ1) – k8 cos(k9 θ2) – k10

cos(k11 – k12 θ2)

-k1 ω12 – k1 ω2

2 + k1 ω1 ω2 cos(θ2) + k1 cos(θ2) + k1 cos(θ1)

ω2·cos(θ1 θ2) + ω1

0

-2

-0.4

-0.8

-1.2

-1.6

SimpleComplex

Less

P

red

ictiv

eM

ore

P

red

ictiv

e

1

2

3

4

K K t tK

t t

S S t tS

t t

SK

t K

KS

t S

b cda

d

b cda

d

x y zz=f(x,y) ?

dy/dt=f(x,y) ?

f(x,y,z)=0 ?

x y zz=f(x,y) ?

dy/dt=f(x,y) ?

f(x,y,z)=0 ?

I feel lucky

Von Thomas Hermanowski, Dr. Andreas Rick, Dr. Jochen Weber

Particle Accelerators

angle

Distortion Correction (GE X-Ray Detectors)5x Resolution Improvement

Jay Schuren

Concluding RemarksConcluding Remarks

Correlation is enough. Faced with massive data, [the Scientific Method] is becoming obsolete. We can stop looking for models.

“”Chris Anderson

Wired 16.07

The data deluge accelerates our ability to hypothesize, model, and test.

Theoretical physicists are not yet obsolete, but scientists have taken steps toward

replacing themselves

I am worried that we have enjoyed a brief window in human history where we could actually understand things, but that

period may be coming to an end.

-- Steve Strogatz

The end of insight

Recommended