Dynamic Indeterminism in Science

David R. Brillinger Statistics Department University of California, Berkeley

www.stat.berkeley.edu/~brill brill@stat.berkeley.edu

I. Neyman

II. Stochastics

III. Population dynamics

IV. Moving particles

V. Discussion

A succession of examples, some JN’s, some DRB + collaborators’

1. INTRODUCTION

I. NEYMAN 

1894 Born, Bendery, Moldavia

1916 Candidate in Mathematics, U. of Kharkov

1917-1921 Lecturer, Institute of Technology, Kharkov

1921-1923 Statistician, Agricultural Research Inst, Bydgoszcz, Poland

1923 Ph.D. (Mathematics), University of Warsaw

1923-1934 Lecturer, University of Warsaw; Head, Biometric Laboratory, Nencki Inst.

1934-1938 Lecturer, then Reader, University College, London

1938 Professor of Mathematics, UC Berkeley

1955 Statistics Department, UCB

1961 Professor Emeritus, UCB

1981 Died, Oakland, California

Polish ancestry and very Polish.

“His devotion to Poland and its culture and traditions was very marked, and when his influence on statistics and statisticians had become world wide it was fashionable ... to say that ‘we have all learned to speak statistics with a Polish accent’ …”

D.G. Kendall (1982)

Twinkle in the eye - coat

Own money for visitors and students

Drinks at Faculty Club “To the ladies present, and …”

Soccer “I was one of the forwards, not on the center, …, but on the left. … I could run fast.”

2. THE MAN.

“He seemed to know personally all the statisticians of the world.”

T. L. Page (1982)

Strong social conscience

“this is in connection with the current developments in the South, including the arrests of large numbers of youngsters, their suspension or dismissal from schools, the tricks used to prevent Negroes from voting, …”

Neyman and others (1963)

Many, many visitors to Berkeley

“… the delight I experience in trying to fathom the chance mechanisms of phenomena in the empirical world.”

Neyman (1970)

215 research papers

From 1948, 55 out of 140 with E. L. Scott

3. NEYMAN’S WORK.

K. Pearson (The Grammar of Science), R. A. Fisher (Statistical Methods for Research Workers)

“… there is not the slightest doubt that his (RAF’s) many remarkable achievements had a profound influence on my own thinking and work.”

Neyman (1967)

Applied at the start (agriculture) and at the end (Using Our Discipline to Enhance Human Welfare)

Special influences.

Agriculture, astronomy, cancer, entomology, oceanography, public health, weather modification, …

Theory.

CIs, testing, sampling, optimality, C(α), BAN, …

Applications.

Observed and expected

Formal tests with broad alternatives

Chi-squared

“appears reasonable”, “satisfactory fit”, …

“… the method of synthetic photographic plates”

Neyman, Scott, Shane (1952)

One simulates realizations of a fitted model

How were models validated?
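A minimal sketch of what "simulating realizations of a fitted model" can look like for a clustering model. It is not the procedure of Neyman, Scott and Shane: the unit-square plate, the Poisson counts, the Gaussian scatter about cluster centres and every parameter value are assumptions made only for illustration.

```python
import math
import random

def poisson(lam, rng):
    """Draw a Poisson(lam) count by Knuth's multiplication method (small lam only)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def synthetic_plate(parent_rate, mean_offspring, spread, seed=0):
    """Simulate a cluster pattern: random centres, each with scattered offspring.

    parent_rate:    expected number of cluster centres on the unit square
    mean_offspring: expected number of points per cluster
    spread:         standard deviation of the scatter about a centre
    All three are hypothetical tuning values, not fitted quantities.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(poisson(parent_rate, rng)):
        cx, cy = rng.random(), rng.random()            # cluster centre
        for _ in range(poisson(mean_offspring, rng)):  # offspring of this centre
            points.append((cx + rng.gauss(0.0, spread),
                           cy + rng.gauss(0.0, spread)))
    return points

plate = synthetic_plate(parent_rate=20, mean_offspring=5, spread=0.02)
```

Plotting `plate` next to the observed plate gives the kind of visual comparison the slide describes; a mismatch in texture is evidence against the fitted mechanism.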

[Figure: photographic plate and synthetic plate]

“When the calculated scheme of distribution was compared with the actual …, it became apparent that the simple mechanism could not produce a distribution resembling the one we see.”

Neyman and Scott (1956)

Discovered variability beyond elementary clustering

“The essence of dynamic indeterminism in science consists in an effort to invent a hypothetical chance mechanism, called a 'stochastic model', operating on various clearly defined hypothetical entities, such that the resulting frequencies of various possible outcomes correspond approximately to those actually observed.”

Neyman (1960)

“… stochastic is used as a synonym of indeterministic.”

Neyman and Scott (1959)

II. STOCHASTICS

Time series. Chapter in Neyman (1938)

Markov.

“Markov is when the probability of going - let's say - between today and tomorrow, whatever, depends only on where you are today. That's Markovian. If it depends on something that happened yesterday, or before yesterday, that is a generalization of Markovian.”

Neyman in Reid (1998)

States of health, Fix and Neyman (1951)
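The Markov property in the quotation can be sketched with a small chain over states of health. The three states and the transition probabilities below are hypothetical and are not taken from Fix and Neyman (1951).

```python
import random

# Hypothetical states and one-step transition probabilities (illustrative only).
STATES = ["healthy", "ill", "dead"]
P = {
    "healthy": {"healthy": 0.90, "ill": 0.08, "dead": 0.02},
    "ill":     {"healthy": 0.30, "ill": 0.60, "dead": 0.10},
    "dead":    {"healthy": 0.00, "ill": 0.00, "dead": 1.00},
}

def step(state, rng):
    """Draw tomorrow's state using only today's state (the Markov property)."""
    u = rng.random()
    cum = 0.0
    for nxt in STATES:
        cum += P[state][nxt]
        if u < cum:
            return nxt
    return STATES[-1]  # guard against floating-point rounding

def simulate(start, n_steps, seed=0):
    """Return a path of n_steps transitions beginning at `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(step(path[-1], rng))
    return path

path = simulate("healthy", 20)
```

A process whose next state also depended on yesterday's state could be made Markov again by taking the pair (yesterday, today) as the state.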

4. RANDOM PROCESSES.

Vector contains basic information concerning evolution

Can incorporate background knowledge

Can make situation Markov

Evolution/dynamic equation

Measurement equation

State space model.
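A minimal sketch of the two equations of a state space model, assuming a scalar state, linear dynamics and Gaussian noise. None of these choices or parameter values come from the talk; they only make the evolution and measurement steps concrete.

```python
import random

def simulate_state_space(n, a=0.8, sd_state=1.0, sd_obs=0.5, seed=0):
    """Simulate a scalar linear state space model (all parameter values hypothetical).

    Evolution equation:   x[t+1] = a * x[t] + state noise
    Measurement equation: y[t]   = x[t] + observation noise
    """
    rng = random.Random(seed)
    x, states, obs = 0.0, [], []
    for _ in range(n):
        states.append(x)
        obs.append(x + rng.gauss(0.0, sd_obs))   # measurement equation
        x = a * x + rng.gauss(0.0, sd_state)     # evolution equation
    return states, obs

states, obs = simulate_state_space(50)
```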

6. SARDINES. In the 1940s Neyman was called upon to study the declining sardine catches along the West Coast.

III. POPULATION DYNAMICS

Sardines (arbitrary units) landed on West Coast

Season     41-2     42-3     43-4     44-5     45-6
Age 1     926.0    718.0   1030.0    951.0    493.0
Age 2    6206.0   2512.0   1308.0   2481.0   1634.0
Age 3    3207.0   4496.0   2245.0   1457.0   1529.0
Age 4     868.0   1792.0   2688.0   1298.0    799.0
Age 5     361.0    478.0    929.0   1368.0    407.0
Age 6      95.1    169.4    327.0    498.5    299.2
Age 7      47.2     36.0     98.4    148.0    111.2
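As a purely descriptive sketch, one can follow a single cohort down the diagonal of the landings table and form the ratio of each season's landings to the previous season's. These ratios mix natural mortality with changes in fishing effort, so they are not survival estimates and are not Neyman's analysis; the function name is hypothetical.

```python
# Landings by age (rows, ages 1-7) and season (columns, 41-2 to 45-6), from the table above.
landings = [
    [926.0, 718.0, 1030.0, 951.0, 493.0],
    [6206.0, 2512.0, 1308.0, 2481.0, 1634.0],
    [3207.0, 4496.0, 2245.0, 1457.0, 1529.0],
    [868.0, 1792.0, 2688.0, 1298.0, 799.0],
    [361.0, 478.0, 929.0, 1368.0, 407.0],
    [95.1, 169.4, 327.0, 498.5, 299.2],
    [47.2, 36.0, 98.4, 148.0, 111.2],
]

def cohort_ratios(table, start_age_index):
    """Ratios table[a+1][t+1] / table[a][t] along the diagonal from the first season."""
    ratios = []
    a, t = start_age_index, 0
    while a + 1 < len(table) and t + 1 < len(table[0]):
        ratios.append(table[a + 1][t + 1] / table[a][t])
        a += 1
        t += 1
    return ratios

# Cohort that was age 2 in season 41-2 (row index 1).
ratios = cohort_ratios(landings, 1)
```

For this cohort the successive ratios decrease from season to season, which is at least compatible with a death rate that grows with age, although rising fishing effort would act in the same direction.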

$N_{a,t}$: fish aged $a$ available in year $t$

$N(t) = [N_{a,t}]$: state vector

$n_{a,t}$: expected number caught

$q_a$: natural mortality at age $a$

$Q_t$: fishing mortality in year $t$

Model: $N_{a+1,t+1} = N_{a,t}(1-q_a)(1-Q_t)$

$H_0$: $q_b = q_{b+1} = \dots = q_a$, $a > b$
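The survival model can be sketched as a one-year projection of the age vector. The numbers below are hypothetical and chosen only to show the arithmetic; recruitment of the youngest age class is not part of the stated model and is left out.

```python
def project(N, q, Q):
    """One year of N[a+1, t+1] = N[a, t] * (1 - q[a]) * (1 - Q).

    N: numbers available by age (index 0 = youngest age class)
    q: natural mortality by age
    Q: fishing mortality for the year
    Returns the survivors, who are one year older next season.
    """
    return [n * (1.0 - qa) * (1.0 - Q) for n, qa in zip(N, q)]

# Hypothetical inputs, for illustration only.
N_t = [1000.0, 800.0, 500.0]
q = [0.10, 0.20, 0.30]   # natural mortality rising with age
Q = 0.25                 # fishing mortality in year t
N_next = project(N_t, q, Q)
```

Under the null hypothesis of equal natural mortality across ages, every entry of `q` would be the same and all age classes would shrink by a common factor within a year.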

“Certain publications dealing with the survival rates of the sardines begin with the assumption that both the natural death rate and the fishing mortality are independent of the age of the sardines, …”

Neyman (1948)

“… steady increase in fishing effort … 1943-8”

“… the death rate has a component which increases with the increase in age of the sardines. It may be presumed that this component is due to natural causes.”

Neyman (1948)

“While in certain instances the differences between Tables IV and VII are considerable, it will be recognized that the general character of variation in the figures of both tables is essentially similar.”

(ibid)

How to study further? HA?

Neyman et al (1952), astronomy

EDA: plot |X-Y| versus (X+Y)/2

Tables of fitted and observed.

Guckenheimer, Guttorp, Oster & DRB in the late 70s studied A. J. Nicholson’s blowfly data.

7. Lucilia cuprina.

Population maintained with limited food for 2 years

Started with pulse

Counts of eggs, emerging, deaths every other day

Life stages

egg: 0.5-1.0 day

larva: 5-10 days

pupa: 6-8 days

adult: 1-35 days

Obtaining the data

State space setup.

$N_{a,t}$: number aged $a$ on occasion $t$

$E_t$: number emerging $= N_{0,t}$

$\mathbf{N}_t$: state vector $= [N_{a,t}]$

$N_t$: number of adults $= \mathbf{1}'\mathbf{N}_t$

$D_t$: number dying $= N_{t-1} + E_t - N_t$

$q_{a,t}$: Prob{individual aged $a$ dies aged $a$ | history}

$D_t$ | history fluctuates about $\sum_a q_{a,t} N_{a,t}$

Question: Dynamical system leading to chaos?

$q_{a,t} = 1 - (1-\alpha_a)(1-\beta N_t)(1-\gamma N_{t-1})$

$\alpha_a$: dies | age $a$

$\beta N_t$: dies | $N_t$ adults

$\gamma N_{t-1}$: dies | $N_{t-1}$, preceding time

NLS, weights $N_t^2$

Age and density dependent model,
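The death probability and the level about which the number of deaths fluctuates can be written out directly. This is only a sketch of the formula: the parameter values and age structure are hypothetical, not fitted values from the blowfly data.

```python
def death_prob(alpha_a, beta, gamma, N_t, N_prev):
    """q[a,t] = 1 - (1 - alpha_a) * (1 - beta*N_t) * (1 - gamma*N_prev)."""
    return 1.0 - (1.0 - alpha_a) * (1.0 - beta * N_t) * (1.0 - gamma * N_prev)

def expected_deaths(N_by_age, alpha, beta, gamma, N_prev):
    """Sum over ages of q[a,t] * N[a,t], the level about which D_t fluctuates."""
    N_t = sum(N_by_age)  # number of adults, 1'N_t
    return sum(death_prob(a, beta, gamma, N_t, N_prev) * n
               for a, n in zip(alpha, N_by_age))

# Hypothetical two-age-class example.
alpha = [0.05, 0.10]       # age-specific component
beta, gamma = 1e-4, 5e-5   # density components, current and preceding occasion
N_by_age = [600.0, 400.0]
mean_deaths = expected_deaths(N_by_age, alpha, beta, gamma, N_prev=900.0)
```

A survivor must escape three independent hazards (age, current density, preceding density), which is why the survival probabilities multiply.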

Death rate age/density dependent

Nonlinear dynamic system, chaos possible

“Nicholson was using the flies as a computer.”

P.A.P. Moran (late 70s)

Blowfly conclusions.

8. CLOUD SEEDING.

JN started work in early 50s California, Arizona, Switzerland

Emphasized importance of randomization

Hail suppression experiment Grossversuch III, Ticino

Suitable days (thunderstorm forecast)

Silver iodide seeding from ground generators

IV MOVING PARTICLES

Data: 3 hr rainfall at Zurich, 120 km

Particles born at Ticino at times $\sigma_j$

Point process $\{\sigma_j\}$ has rate $p_M(t)$; $t$: time of day

Travel times independent, density $f(\cdot)$

Particles arrive at Zurich at rate $p_N(t)$

$p_N(t) = \int p_M(t-u) f(u)\,du$
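The arrival rate is a convolution of the birth rate with the travel-time density, which can be checked numerically. The sketch below assumes a constant birth rate on an interval (as on the regression-function slide) and, purely for convenience, exponential travel times; the talk itself mentions gamma travel velocities. All numerical values are hypothetical.

```python
import math

def arrival_rate(t, p_M, f, u_max, n=20000):
    """Midpoint-rule approximation to the integral of p_M(t-u) f(u) du over (0, u_max)."""
    h = u_max / n
    return sum(p_M(t - (k + 0.5) * h) * f((k + 0.5) * h) for k in range(n)) * h

# Hypothetical ingredients, for illustration only.
A, B, C = 8.0, 12.0, 2.0   # constant birth rate C on (A, B)
MEAN_TRAVEL = 5.0          # exponential travel times with this mean (an assumption)

def p_M(t):
    """Birth rate of particles at time of day t."""
    return C if A < t < B else 0.0

def f(u):
    """Travel-time density."""
    return math.exp(-u / MEAN_TRAVEL) / MEAN_TRAVEL

def F(u):
    """Travel-time distribution function."""
    return 1.0 - math.exp(-u / MEAN_TRAVEL) if u > 0 else 0.0

t = 14.0
numeric = arrival_rate(t, p_M, f, u_max=60.0)
closed_form = C * (F(t - A) - F(t - B))  # valid for a constant rate on (A, B)
```

The agreement of `numeric` and `closed_form` reflects the fact that, for a constant birth rate, the convolution collapses to a difference of distribution-function values.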

DRB (1995)

X: cumulative process of rain

$p_X(t)$: rate of rainfall

$p_X(t) = \mu_R \int p_M(t-u) f(u)\,du$

$E\{X(t)\} = \int_0^t p_X(v)\,dv$

$\alpha$: rate of unrelated rainfall

$\mu_R$: mean rain per particle

$p_M(t) = C$, $A < t < B$

Regression function.

$\alpha + C_0 \mu_R \left[\int_a^b F(u)\,du - \int_c^d F(u)\,du\right]$

$a = t-2-A$, $b = t+1-A$, $c = t-2-B$, $d = t+1-B$

Travel velocities, gamma

OLS, weights 53 and 38

Running mean [X(t+1)-X(t-2)]/3
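One way to see where the regression function comes from is to take the expected value of the running mean under a constant birth rate. The derivation below is a sketch under two readings that the slides do not state explicitly: that F is the distribution function of the travel time with density f, and that the constant C_0 absorbs the factor 1/3 from the 3-hour window.

```latex
\begin{align*}
E\left\{\frac{X(t+1)-X(t-2)}{3}\right\}
  &= \frac{1}{3}\int_{t-2}^{t+1} p_X(v)\,dv, \\
p_X(v) &= \mu_R \int p_M(v-u)\,f(u)\,du
        = \mu_R\,C \int_{v-B}^{v-A} f(u)\,du
        = \mu_R\,C\,\bigl[F(v-A) - F(v-B)\bigr], \\
\frac{1}{3}\int_{t-2}^{t+1} p_X(v)\,dv
  &= \frac{C\,\mu_R}{3}
     \left[\int_{t-2-A}^{t+1-A} F(u)\,du - \int_{t-2-B}^{t+1-B} F(u)\,du\right].
\end{align*}
```

Adding the rate α of unrelated rainfall gives the displayed form with the limits a, b, c, d as defined on the slide and, on this reading, C_0 = C/3.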

5.50 ± 1.96(.76)

Seeding started at 7.5 hr

CI for T, arrival time of effect

13.0 ± 1.5 hr

Approximate 95% CI for travel time.
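The two intervals are linked by simple arithmetic, sketched below. It assumes that 5.50 is the estimated travel time in hours, that 0.76 is its standard error, and that the arrival time is the seeding start plus the travel time; these readings are inferred from the slides rather than stated on them.

```python
# Values as they appear on the slides; their interpretation is an assumption.
travel_hat = 5.50     # estimated travel time (hr)
se = 0.76             # standard error of the estimate (hr)
z = 1.96              # normal quantile for an approximate 95% interval
seeding_start = 7.5   # time of day seeding started (hr)

half_width = z * se
ci_travel = (travel_hat - half_width, travel_hat + half_width)

# Shift by the seeding start to get the arrival time of the effect.
arrival_hat = seeding_start + travel_hat
ci_arrival = (seeding_start + ci_travel[0], seeding_start + ci_travel[1])
```

The half-width 1.96 × 0.76 rounds to 1.5, which matches the quoted 13.0 ± 1.5 hr.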

DEs. Newtonian motion

Described by potential function, H

Planar case, location r = (x,y)’, time t

$dr(t) = v(t)\,dt$

$dv(t) = -\beta v(t)\,dt - \beta \nabla H(r(t),t)\,dt$

$v$: velocity

$\beta$: coefficient of friction

$dr = -\nabla H(r,t)\,dt = \mu(r,t)\,dt$, $\beta \gg 0$

Advantage of H - modelling
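A small sketch of modelling through a potential, assuming the high-friction drift is the negative gradient of H. The quadratic potential with one attracting point is hypothetical; the point is that a single scalar function determines the whole drift field.

```python
def H(x, y, x0=0.0, y0=0.0):
    """Hypothetical quadratic potential with a single attracting point (x0, y0)."""
    return 0.5 * ((x - x0) ** 2 + (y - y0) ** 2)

def drift(x, y, h=1e-5):
    """mu = -grad H, with the gradient taken by central differences."""
    dHdx = (H(x + h, y) - H(x - h, y)) / (2 * h)
    dHdy = (H(x, y + h) - H(x, y - h)) / (2 * h)
    return (-dHdx, -dHdy)

mu_at_point = drift(1.0, 2.0)  # points back toward the origin
```

Attraction to several points, or repulsion from a region, can be expressed by adding terms to `H` rather than by specifying the two drift components separately.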

12. Equations of motion.

$dr(t) = \mu(r(t),t)\,dt + \sigma(r(t),t)\,dB(t)$

μ: drift (2-)vector

σ: diffusion (2 by 2-)matrix

{B(t)}: bivariate Brownian

(Continuous Gaussian random walk)

SDE benefits,

conceptualization, extension

SDEs.

$(r(t_{i+1}) - r(t_i))/(t_{i+1} - t_i) = \mu(r(t_i),t_i) + \sigma Z_{i+1}/\sqrt{t_{i+1} - t_i}$

Euler scheme

Approximate likelihood

Solution/approximation.
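The Euler scheme gives both a way to simulate paths and an approximate Gaussian likelihood. The sketch below assumes equally spaced times, a time-homogeneous drift, a scalar σ (that is, σ times the identity matrix) and standard normal Z; the drift toward the origin is a hypothetical example.

```python
import math
import random

def euler_path(mu, sigma, r0, dt, n_steps, seed=0):
    """Simulate r(t_{i+1}) = r(t_i) + mu(r(t_i)) dt + sigma sqrt(dt) Z_{i+1}."""
    rng = random.Random(seed)
    path = [r0]
    for _ in range(n_steps):
        x, y = path[-1]
        mx, my = mu(x, y)
        path.append((x + mx * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0),
                     y + my * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)))
    return path

def approx_loglik(path, mu, sigma, dt):
    """Gaussian log-likelihood implied by the Euler scheme (scalar sigma)."""
    var = sigma ** 2 * dt  # variance of each coordinate of a step
    ll = 0.0
    for (x0, y0), (x1, y1) in zip(path[:-1], path[1:]):
        mx, my = mu(x0, y0)
        ex, ey = x1 - x0 - mx * dt, y1 - y0 - my * dt  # step minus its mean
        ll += -math.log(2 * math.pi * var) - (ex ** 2 + ey ** 2) / (2 * var)
    return ll

def toward_origin(x, y):
    """Hypothetical drift pulling the particle toward (0, 0)."""
    return (-x, -y)

path = euler_path(toward_origin, sigma=0.1, r0=(2.0, 2.0), dt=0.1, n_steps=200)
ll = approx_loglik(path, toward_origin, sigma=0.1, dt=0.1)
```

Maximizing `approx_loglik` over a parametric family of drifts is one route to estimates; the approximation is better the smaller the time step relative to how quickly the drift changes.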

Starkey Reserve, Oregon

Can elk, deer, cows, humans coexist?

NE pasture

14. ELK. DRB et al (2001-2004)

8 animals, control days, Δt = 2hr

Part A.

Model.

dr = μ(r)dt + σdB(t)

μ smooth - geography

Nonparametric fit

Estimate of μ(r): velocity field
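The slides say only that the fit was nonparametric. As one possible stand-in, and not necessarily the method used in the talk, a kernel-weighted average of observed step velocities gives a smooth estimate of the velocity field. The Gaussian kernel and the bandwidth are assumptions.

```python
import math

def kernel_drift(path, dt, r, bandwidth):
    """Nadaraya-Watson estimate of mu(r) from observed steps (r_{i+1} - r_i) / dt.

    path:      list of (x, y) locations at equally spaced times
    r:         location (x, y) at which to estimate the drift
    bandwidth: kernel scale, in the same units as the locations (a tuning choice)
    """
    num_x = num_y = den = 0.0
    for (x0, y0), (x1, y1) in zip(path[:-1], path[1:]):
        d2 = (x0 - r[0]) ** 2 + (y0 - r[1]) ** 2
        w = math.exp(-0.5 * d2 / bandwidth ** 2)  # weight by closeness of the step's start
        num_x += w * (x1 - x0) / dt
        num_y += w * (y1 - y0) / dt
        den += w
    return (num_x / den, num_y / den)

# Noise-free check: a particle moving with constant velocity (1, -2).
track = [(0.5 * i, -1.0 * i) for i in range(10)]
est = kernel_drift(track, dt=0.5, r=(1.0, -2.0), bandwidth=1.0)
```

Evaluating `kernel_drift` on a grid of locations and drawing arrows yields a velocity-field picture of the kind the slide describes.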

Rocky Mountain elk (Cervus elaphus)

Boundary (NZ fence)

$dr(t) = \mu(r(t),t)\,dt + \sigma(r(t),t)\,dB(t) + dA(r(t),t)$

A, support on boundary, keeps particle in

What is the behavior at the fence?

Synthetic path.
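A synthetic path that respects a fence can be sketched by following each Euler step with a push back into the region. Clamping to a rectangle is a crude stand-in for the boundary term A and is an assumption, as are the rectangular region and the zero drift in the example; how the animals actually behave at the fence is the open question on the slide.

```python
import math
import random

def clamp(v, lo, hi):
    """Project a coordinate back into [lo, hi]."""
    return max(lo, min(hi, v))

def bounded_path(mu, sigma, r0, dt, n_steps, box, seed=0):
    """Euler steps for dr = mu dt + sigma dB, each followed by projection into the box.

    box: (xmin, xmax, ymin, ymax), a hypothetical rectangular enclosure.
    """
    xmin, xmax, ymin, ymax = box
    rng = random.Random(seed)
    path = [r0]
    for _ in range(n_steps):
        x, y = path[-1]
        mx, my = mu(x, y)
        x_new = x + mx * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        y_new = y + my * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # The projection plays the role of dA: it acts only at the boundary.
        path.append((clamp(x_new, xmin, xmax), clamp(y_new, ymin, ymax)))
    return path

def no_drift(x, y):
    """Hypothetical drift-free motion."""
    return (0.0, 0.0)

fenced = bounded_path(no_drift, sigma=1.0, r0=(0.5, 0.5), dt=0.1,
                      n_steps=500, box=(0.0, 1.0, 0.0, 1.0))
```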

Experiment with explanatory

Same 8 animals

ATV days, Δt = 5min

Part B.

dr(t)= μ(r(t))dt + υ(|r(t)-x(t-τ)|)dt + σdB(t)

x(t): location of ATV at time t

τ: time lag

Model.
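The explanatory term depends on the distance between the animal and the lagged ATV position. The slides do not give its form, so the sketch below assumes one: a push directed away from the ATV whose size decays exponentially with distance. The function names, `strength` and `scale` are all hypothetical.

```python
import math

def atv_term(r, x_lagged, strength=1.0, scale=0.5):
    """Hypothetical repulsion of size strength * exp(-d / scale), pointing away from the ATV.

    r:        animal location (x, y) at time t
    x_lagged: ATV location at time t - tau
    """
    dx, dy = r[0] - x_lagged[0], r[1] - x_lagged[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        return (0.0, 0.0)  # direction undefined when the positions coincide
    mag = strength * math.exp(-d / scale)
    return (mag * dx / d, mag * dy / d)

def lagged_position(atv_track, i, lag_steps):
    """ATV location lag_steps samples before index i (held at the first fix early on)."""
    return atv_track[max(0, i - lag_steps)]

atv_track = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
push = atv_term((1.0, 0.0), lagged_position(atv_track, 0, 1))
```

Adding `atv_term` to the geographic drift in an Euler step gives a simulated response, and fitting over several values of the lag is one way to study the delay τ.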

Examples of dynamic indeterminism

JN’s EDA.

Residuals.

“... one can observe a substantial number of consecutive differences that are all negative while all the others are positive. ... the ‘goodness of fit’ is subject to a rather strong doubt, irrespective of the actual computed value of χ², even if it happens to be small.”

Neyman (1980)

(X-Y) vs. (X+Y)/2 plot
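Neyman's point about residuals can be illustrated by computing the chi-squared statistic alongside the longest run of same-sign differences. The observed and expected counts below are hypothetical; they are built so that the statistic is small while the signs are clearly patterned.

```python
def chi_squared(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def longest_sign_run(observed, expected):
    """Length of the longest run of consecutive differences O - E with the same sign."""
    best = run = 0
    prev = 0
    for o, e in zip(observed, expected):
        s = (o > e) - (o < e)  # sign of the difference: -1, 0 or +1
        if s != 0 and s == prev:
            run += 1
        else:
            run = 1 if s != 0 else 0
        best = max(best, run)
        prev = s
    return best

# Hypothetical counts: all early differences negative, all later ones positive.
observed = [8, 9, 9, 11, 12, 11]
expected = [10, 10, 10, 10, 10, 10]
stat = chi_squared(observed, expected)
run = longest_sign_run(observed, expected)
```

Here the statistic is small, yet the differences split into one negative block followed by one positive block, which is the kind of systematic pattern a plot of X-Y against (X+Y)/2 also brings out.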

V. DISCUSSION

JN: the gentleman of statistics

Role models – JN, JWT, …. I was lucky.

Lunch time conversations, Neyman Seminars, drinks at Faculty Club, hooplas, …

Ager, Guckenheimer, Guttorp, Kie, Oster, Preisler, Stewart, Wisdom

Cattaneo, Guha, Lasiecki

Lovett, Spector

NSF, FS/USDA

Acknowledgements.