Theory Theory| How We Come to Understand Our Worldcsc.ucdavis.edu/~chaos/papers/theorytheory.pdf ·...

Preview:

Citation preview

Theory Theory—

How We Come to Understand Our World

James P. Crutchfield

www.santafe.edu/∼chaos

3 May 2002

Abtract We are confronted, as never before, with an explosion in the size

of databases: from data-mining the web to dynamical brain imaging, the various

genome projects, and monitoring the millions of shipping containers that enter

our ports. These datasets are so vast that there is no conceivable way humans

can directly examine all of the data. What has been an intuitive and very human

discovery process—hypothesizing structures and building predictive theories about

the systems that produce them—must now be automated. But do we understand

the notions of structure and pattern well enough that we can teach machines to

discover them? How do we build theories that capture patterns in useful and

predictive ways? I will announce some very recent results on automated pattern

discovery and theory building for cellular automata. The lessons hint at what a

future ”artificial” science might look like.

Joint work with Carl McTague.

Data Explosion:

• Neurophysiology: multiple neuron recordings (> 100 neurons @ 1kHz)

• Web Data-Mining: 10-100 GB/day, multi-terabyte databases

• Astrophysical data: Hubble, EUV, ...

• Neuroimaging: MEG 100-200 Squids @ 1 kHz

• Geophysics: Earthquake monitoring with 1000s sensors, of different kinds

• BioInformatics: Genome projects, microarray sequencing, ...

• Searchable Video Databases

• Very large-scale simulations: weather, hydrodynamics, reaction kinetics, ...

• ...

1

Need machines to help.

Current approach is Pattern Recognition:

• Match data against existing palette of templates

• Fitting polynomials, assuming IID and distributions are Gaus-

sian, taking Fourier/Laplace Transforms, ...

Begs the Question:

Where did that palette come from in the first place?

2

How is this done now?

Baconian Scientific Algorithm:

Select Domain

?

Hypothesize

?

Test

?

Refine

¾

Hypothesis generation: Guessing within a discipline’s domain.

Must we always “guess” or intuit?

Problem: Hypothesis wrong? Little hint of where to go next.

The Inverse Problem: How to go from Data to Theory?

3

History of the Inverse Problem

Recently in nonlinear physics,

Attractor reconstruction:

• Packard et al (1980): “Geometry from a Time Series”

• Takens (1981), ...

“Theory From Time Series”:

• JPC/McNamara (1987): Equations of Motion from a Data Series

• Farmer/Sidorovich (1987): Nonlinear prediction

• Casdagli (1989), ...

Problem with this, too, alas.

4

The Underlying Problem: Intrinsic Representation

Reformulating the Question:

Does data contain information about an appropriate representa-

tion?

Solution:

Tackle head on: Define structure, pattern, regularity, ...

Computational Mechanics:

Accounts for computational structure.

Extends Statistical Mechanics: more than “statistics”.

5

Analogy:

Physics “accounts” for energy flow and transduction.

Computational Mechanics accounts for

1. Amount of historical information stored in process;

2. The architecture supporting that storage; and

3. How stored information is transformed into future states.

More directly:

1. Physics has measures of disorder: temperature, thermodynamic entropy.

2. Comp’l mechanics measures degrees of pattern: structural complexity.

6

Causal Architecture: A Review of Computational Mechanics

Processes:↔S= . . . S−1S0S1 . . . =

←S→S .

Causal States S: Sets of←S that are equally predictive about

→S

←s ∼

←s′such that P(

→S |←S=

←s ) = P(

→S |←S=

←s′) (1)

ε-Machine: M = S, T

• Theorem: M is the unique, optimal predictor of minimal size.

• Theorem: M is a minimal sufficient statistic for a process.

• The intrinsic representation of a process.

Stored information is Statistical Complexity: Cµ = H[P(S)].

JPC & K Young, Physical Review Letters 63 (1989) 105–108.

CR Shalizi & JPC, J. Statistical Physics 104 (2001) 817–879.

7

Cellular Automata: Artificial Physics’s

t

t+1

1 1 00

1 0 0 1

0

0

0

1

1

0

0

1

0

0

1

0

0

0

000

001

010

011

100

101

110

111

Look Up Tableφ

LUT OutputBit String

Glo

bal M

ap

Φ

Lattice of N Sites

Global StateSt

Global StateSt+1

Local State SiNeighborhood(radius = 1)

0

Time

49

49Site0

Elementary Cellular Automaton 18

ECA 18 Rule Table:

ηt 000 001 010 011 100 101 110 111st+1 = φ(ηt) 0 1 0 0 1 0 0 0

8

Cellular Automata Computational Mechanics

A* A*

Ω Φ(Ω)Φ

Space of configurations A∗: Represented by finite-state automata.

CA LUT: A finite-state transducer TΦ that reads in sets Ωt and outputs Ωt+1.

Language evolution:Implemented by Finite-Machine Evolution (FME) Operator Φ:(i) Φ(Ω), (ii) Drop Transducer Inputs, (iii) Strip Transient States, (iv) Convert NFA → DFA,and (v) Minimize.

Computational Complexity, starting with n-state language:

• Φ(Ω) and Drop Inputs: O(n).

• Identify Transient States: O(n2).

• NFA → DFA: O(2n).

• Minimize: O(n log2 n).

9

Wolfram (1984): Iterate all configurations Ω0 = A∗:

Ωt = ΦtΩ0 . (2)

Regular Language Complexity: |Ωt|

t ECA 18 ECA 22

1 5 152 47 2803 143 45064 ≥ 20,000 ≥ 20,000

10

A Structural View

Domain: Spacetime shift-invariant pattern

1. Λ = Φp(Λ), for some p > 0.

2. Λ = σm(Λ), for some m > 0.

Particle:

1. ΛiαΛj, α ∈ A∗.

2. α = Φp(α), for some p > 0.

Particle Interactions:

1. α+ β → γ.

2. α+ ν → ∅.

3. ....

11

Example: ECA 18

0

Time

99

99Site0

Elementary Cellular Automaton 18

Regularity: Regions of (0A)∗ ... A domain?

12

1

2

0 0 1

[0,0] 0|0

[0,1]

1|1

[1,0]

0|0 [1,1]

1|00|1

1|0

0|0

1|0

Start([1,1],2)

([1,0],1)

0|0

([1,1],1)

1|0

([0,0],2)

0|1

([1,0],2)

0|0

([0,0],1)

0|1

([0,1],1)

1|00|0

1|1

([0,1],2)

0|0 1|0

0|0 0|0

HypothesizedDomainECA 18 TransducerComposition

Iteration of ECA 18’s Domain

13

([1,1],2)

([1,0],1)

0

([1,1],1)

0

([0,0],2)

1

([1,0],2)

0

([0,0],1)

1

([0,1],1)

0 0

1

([0,1],2)

00

00

6

1

1

3

0

2

0 00

1

1

2

1 0 0

Strip TransientsDrop Inputs Minimize QED

14

Recognizing Structure: Domain Filtering

Factor out that data explained by structures one knows.

0

Time

99

99Site0

Elementary Cellular Automaton 18

0

Time

99

99Site0

ECA 18: Domain Filtered

15

ECA 18’s Particle CatalogDomain Process LanguageΛ0 (0Σ)∗

Particle Wedgesα Λ002n+10Λ0

Interactionsα+ α → ∅

The Enigma: ECA 22ECA 22 Lookup Table

ηt 000 001 010 011 100 101 110 111st+1 = φ(ηt) 0 1 1 0 1 0 0 0

0

Time

99

99Site0

History: highly disordered (high entropy) but ...

• Moore: ”... nonlocal structures ...”

• Grassberger (1986): ”... long-range, complex correlations ...”; Power law decays.

16

1

2

0

3

0

4

0

0 1

[0,0] 0|0

[0,1]

1|1

[1,0]

0|1 [1,1]

1|00|1

1|0

0|0

1|0

Start([1,1],4)

([1,0],1)

0|0

([1,1],1)

1|0

([0,0],2)

0|1

([1,0],2)

0|0

([1,1],3)

([1,0],4)

0|0

([0,0],1)

0|1

([0,1],1)

1|0

([1,1],2)

([1,0],3)

0|0

([0,0],4)

0|1

([0,0],3)

0|1

0|0

([0,1],4)

0|11|0

0|0

0|1

([0,1],3)

0|1

0|0 1|1

([0,1],2)

0|1

0|0

Start

Iterate of ECA 22’s Candidate Domain

CandidateDomainTransducerComposition

17

([1,1],4)

([1,0],1)

0

([1,1],1)

0

([0,0],2)

1

([1,0],2)

0

([1,1],3)

([1,0],4)

0

([0,0],1)

1

([0,1],1)

0

([1,1],2)

([1,0],3)

0

([0,0],4)

1

([0,0],3)

1

0

([0,1],4)

10

0

1

([0,1],3)

1

0 1

([0,1],2)

1

0

10

3

1

4

0

5

1

1

1

0

2

0

0

Drop InputsStrip Transients

& Minimize

ECA 22 Domain Proof?

But ≠ Candidate!18

10

3

1

4

0

5

1

1

1

0

2

0

0

[0,0] 0|0

[0,1]

1|1

[1,0]

0|1 [1,1]

1|00|1

1|0

0|0

1|0

Start

Second Iterate of ECA 22’s Candidate Domain

CandidateIterateTransducerComposition

([1,1],1)

([1,0],2)

0|0

([0,0],3)

0|1

([1,1],2)

([1,0],3)

0|0

([0,0],4)

0|1

([1,1],4)

([1,0],1)

0|0

([1,1],5)

1|0

([0,0],2)

0|1([1,1],10)

1|0

([1,0],5)

([0,1],10)

1|0

([1,1],3)

1|0

([1,0],10)

([0,1],3)

1|0

([1,0],4)

0|1

0|0

([0,1],1)

0|1

([0,1],5)

1|1

([0,0],1)

0|0

([0,1],2)

0|1

1|00|0

([0,1],4)

0|11|0

0|0

([0,0],5)

1|1

1|0

0|1

1|0

0|0

([0,0],10)

1|1

Start

19

([1,1],1)

([1,0],2)

0

([0,0],3)

1

([1,1],2)

([1,0],3)

0

([0,0],4)

1

([1,1],4)

([1,0],1)

0

([1,1],5)

0

([0,0],2)

1

([1,1],10)

0

([1,0],5)

([0,1],10)

0

([1,1],3)

0

([1,0],10)

([0,1],3)

0

([1,0],4)

1

0

([0,1],1)

1

([0,1],5)

1

([0,0],1)

0

([0,1],2)

1

00

([0,1],4)

10

0

([0,0],5)

1

0

1

0

0

([0,0],10)

1

22

15

0

8

0

6

1

19

0

0

3

1

0

5

0

4

0

0

1

4

0

3

0 1

2

0

0

Drop Inputs Strip Transients

ECA 22 Domain Proof Completed

Minimize QED!

20

1

2

0

3

0

4

0

0 1

Theorem: Λ ,Λ is a spacetime domain for ECA 22.

Proof: (i) Λ = Φ(Λ )

(ii) Λ = Φ(Λ )

in other words

(iii) Λ = Φ (Λ )

(iv) Λ = Φ (Λ )

2

10 0

0

Λ10Λ0

0

10

00

00

00

10

10 2

100

0

10

3

1

4

0

5

1

1

1

0

2

0

0

This is the first structure discovered for ECA 22.Captures much of ECA 22’s behavior.

21

The Enigma? ECA 22

0

Time

99286Site0

0

Time

99286Site0

22

ECA 22’s Particle CatalogDomains Process LanguageΛ0 Λ0

0 = (000Σ)∗

Λ01 = (1110+ 0000)

Λ1 0∗

Λ2 (01)∗

Λ3 (0011)∗

Particles Wedgesα Λ0

0A001AΛ00

Λ01A′1111B′Λ0

1

β Λ00A01AΛ

00

Λ01A′110B′Λ0

1Λ0

1A′001B′Λ01

Interactionsα+ α → ββ+ β → ∅

α+ β → Λα−βgasβ+ α → Λα−βgas

ECA 22: Pattern Discovery

Theorem: Λ00,Λ01 is the only (positive entropy) domain within

the complexity horizon (up to and including 6-state candidates).

Proof: Automated proof methods using the Haskell, a functional

programming language.

How was this done?

23

Candidates for Domains: Process Languages

Finite-state process language: represented by (recurrent portionof) ε-machine.

AKA finite-state machine with one strongly connected compo-nent; all states are start and final states.

n-DCL Library: The domain candidates with n-states.

How many are there?

States n-DCL Library

n Size

1 3

2 7

3 78

4 1,388

5 35,186

6 1,132,61324

Automated Search

1. For all Λ ∈ n-DCL, exclude by counterexample:

(a) Generate long s ∈ Λ.

(b) Iterate this single configuration Φt(s).

(c) Test: If Φt(s) /∈ Λ, then Λ is expanding.

2. For all nonexpansive n-DCL, exclude by entropy rate:

hµ(Φt(Λ)) < hµ(Λ)⇒ Candidate contracts. (3)

3. For all nonexpanding, noncontracting n-DCL, attempt to di-

rectly prove invariance theorem: Λ = ΦpΛ?

Λ that pass tests are domains.

25

ECA 22: Table of a Million Theorems

States n-DCL ECA 22 Nonexpanding n-DCLn Size Number Contracting Domains1 3 2 Σ∗ Λ1 = 0∗

2 7 2 (0Σ)∗ Λ2 = (01)∗

3 78 04 1,388 2 Λ0

0, Λ3 = (0011)∗

5 35,186 5 ((11)(11 + 01)∗(10) + 00)∗

((101)(00+ 1) + (1(00+ 1) + 0))∗

([(101)+(1(00 + 1) + 00)] + [1(00+ 1) + 0])∗

([1+(01)(00 + 1)] + [1+(00) + 0])∗

([(1+01)+(1+00+ 00)] + [1+00+ 0])∗

6 1,132,613 268 267 Λ01

QED

Several months ago: Estimated compute time ∼ 5 Beowulf years!

When finally proved, proof took ∼ 1 week: speed = 7000 tph!

26

1 1

2

0 0

3

1

1

4

0

0

5

1

1

6

0

0

[0,0] 0|0

[0,1]

1|1

[1,0]

0|1 [1,1]

1|00|1

1|0

0|0

1|0

Start

ECA 22’s Last Candidate

Transducer Candidate 27

Iterate of the Last Candidate161 states and 318 transitions

Theorem: Not a domain.Proof: Nonexpanding + h(Φ(Λ)) < h(Λ) ⇒ Candidate contracts.

91 1010

94

1

1001

104

0

96

1

93

0

92

0

1

97 1320

127

1

1311

147

0

1

126

0

98

0

1

95

01

128

1

159

0

1580

129 01

0

130

1

1

1

106

0

1050

78

1

2

0

6

1

4

1

5

0

3

76

0

46

1

75

1

137

0

45

0

44

1

1

0

0

82

1

0

811

7

1

107

0

0

134

1

8

1130

16

1

153

1

1120

15

0

40

1

9

121

0

61

1

31

1

120

0

60

1

86

0

10

0

1171

12

1

116

0

1163

0

331

1090

62

1

32

1

152

0

1

0

13

0

55

1

154

1

157

0

0

1

14

0

67

1

660

114

1

115

0

30

1

1

1190

29

1

0

181

39

0

17

1

0

1

0

19

0

87

1

841

0

20

0

36

1

351

135

0

21

1

85

0

155

1

148

0

220

1

271

0

23

1

37

0

1

138

0

24

1

0250

1

1

26

0

0

1601

0

1

28

1

79

0

0

1

01

1

122 0

0

1251

0

1

1510

34

1

139

0

0

83

1

1

0

143

0

11

0

38

0

1

0

68

1

0

1

1

431

145

0

47

1

0

48

0

108

1

0

501

49

1

0

1

0

1 144

0

0

1

1

990

0

1

51

1

0

1

142

0

52

0

90

1

89

1

102

0

53

1

0

54

0

1

1

0

56

0

1

57

0

124

1

1

123

0

58

1

0

59

0

70

1

65

1

69

0

1

0

1

0

64

0

1

1

0

01

71 1

0

1

0

72

1

0

730

1

1

770

74 0

1

1

133

0

1

0

0

136

1

01

80

1

0

0

1

1

0

0

1

0

1

0

1

161

0

1

1

0

0

1

0

1

149

1

0

1

0

1501

0

1

0

0

1

1

0

156

1

0

10

41

1

0

42

1

0

1

140

0

0

1

0 1

0

1461

88

1

0

1

0

0

10

1031

1

0

110

1

0

111

1

0

1

0

118

1

0

0

1

1

0

10

1

0

0

0

1

1411

0

1

0

1

0

0

1

1

0

28

The Near Term: All Elementary CAs

Push of a Button:

1. CA Pattern Discoverer will automatically build structural the-

ories for all 256 ECAs.

2. Result: a website of about 1000 pages, with several dozen

pages per ECA.

Beware of Prophets Expounding the CA Gospel!

29

The Future: Artificial Science

Implications:

1. Artificial Particle Physics.

2. Automated Pattern Discovery.

3. Theorists unemployed?

30

Recommended