36
March 29, 2007 KNAW Lecture 1 Statistics as a Distillation of Everyday Experience Gerald van Belle Department Biostatistics, Department of Occupational and Environmental Health Sciences University of Washington, Seattle, WA.

Statistics as a Distillation of Everyday Experience

  • Upload
    benoit

  • View
    28

  • Download
    0

Embed Size (px)

DESCRIPTION

Statistics as a Distillation of Everyday Experience. Gerald van Belle Department Biostatistics, Department of Occupational and Environmental Health Sciences University of Washington, Seattle, WA. Where Are We Going?. Statistics as distillation of everyday experience - PowerPoint PPT Presentation

Citation preview

Page 1: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 1

Statistics as a Distillation of Everyday Experience

Gerald van Belle

Department Biostatistics, Department of Occupational and Environmental Health Sciences

University of Washington,

Seattle, WA.

Page 2: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 2

Where Are We Going?

Statistics as distillation of everyday experience

1. Variation

2. Causation

Experience can benefit from everyday statistics

Page 3: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 3

Page 4: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 4

A. Variation in everyday experience

1. Describing and classifying variation

2. Selection in the face of variation

3. Controlling variation

4. Inducing variation

5. “Missing data”

Page 5: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 5

Statistician drowning in river of average depth 25 cm

Page 6: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 6

1. Describing and classifying variation

a. We tell stories of ab”normality”* Air travel horror stories, laptop disasters,…

b. We sort into genres: art, biology, literature* Concept of “population”* Characteristics of population and “sample”

c. Variation in time, space, social structures,…* Waves on beach (non-stationarity)* Hierarchy, “social class”

d. We make inferences based on limited data* And often get the wrong population* Basis for a great deal of humor* Switch in “expectation”

Page 7: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 7

2. Selection in the face of variation

a. Need to know selection mechanismSpend very little time on this,Assumption of “missing at random”

b. Error of thinking that current observation is representative

c. Unintended intentional selection

d. Two examples:

Page 8: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 8

Page 9: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 9

Page 10: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 10

2. Selection in the face of variation--2

a. Need to know selection mechanismRandom selection as gold standard

b. “Representativeness”* Kruskal and Mosteller papers* Slippery concept* Large sample vs small sample

Page 11: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 11

3. Controlling variation

1. Clearest examples in sports: Divisions, “junior”,…

2. Societal examplesMin, max speed limitsOccupational (noise limits, flying hours)Vergunningen, vergunningen,…

3. “Blocking” in statistics

Page 12: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 12

4. Inducing variation

1. Antitrust laws * Increase competition, i.e. variability

2. Draft system in sports* Teams more equal, P(win) near 1/2

3. Societal* Admission to medical school in Holland* Representativeness (again)* Key to clinical trials

Page 13: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 13

5. “Missing” data

1. Serious problem, obviously2. Anatomy of missingness

1. Normal (e.g. pediatrician chart)2. Transcription error3. Just not there (Murphy was here)4. Deliberately missing (e.g. extended testing on

subset of patients)

3. Impacts population of inference4. Example:

Page 14: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 14

Page 15: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 15

Vulnerability Analysis of Spitfires (sample: 15/400)

Page 16: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 16

Composite of hits

*

* ***

*

*

***

**

*

*

Abraham Wald

Advice:

Page 17: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 17

…as we know,

there are known knowns;

there are things we know we know.

We also know there are known unknowns;

that is to say,

we know there are some things

we do not know.

But there are also unknown unknowns—

the ones we don’t know we don’t know.Donald Rumsfeld

Another anatomy of missingness

(set to music, see NPR website)

Page 18: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 18

…as we know,

there are known knowns;

there are things we know we know.

We also know there are known unknowns;

that is to say,

we know there are some things

we do not know.

But there are also unknown unknowns—

the ones we don’t know we don’t know.Donald Rumsfeld

Non-missing

MCAR/MAR

Non-ignorable

Translation into modern statistics

Page 19: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 19

B: Causation in everyday experience

1. Aristotle’s four causes

2. Hardwired to look for causation

3. Hardwired to assume association is causation

4. Hardwired to assign blame (secondary causes)

(The Dutch: “oorzaak” is closer to Aristotle’s

Page 20: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 20

1. Aristotle’s four causes

1. Material cause (table made of wood)

2. Formal cause (four legs and flat top make this a table)

3. Efficient cause (carpenter makes a table)

4. Final cause (surface for eating or writing makes this a

table)(From S.M. Cohen, U Washington)

Page 21: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 21

Four questions

1. What is the question? was

2. Is it testable? Was

3. Where will you get the data? did

4. What will the data tell you? do

The great divide

Page 22: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 22

Not everything that can be counted

counts and not everything that counts

can be counted.Albert Einstein

1. What is the Question? Is it testable?

Page 23: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 23

Page 24: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 24

2. Frequent Consulting Scenario

Testable hypothesis

What was the question?

but

Page 25: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 25

Example from Science (February 23, 2007)

Title Redefining the age of ClovisFront page Flints (pictures)Page 1045 Summary paragraphPage 1067 News storyPage 1122—1126 Article (numbers)

Very different “flavor” for each section

Page 26: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 26

Question Testable version

Page 27: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 27

3. Hardwired to look for causation

1. Story

2. Instinctive looking for causes3. Challenging in courts

crime in search of criminal, stock market (“if only I had bought Microsoft in 1980”), and science (global warming)

4. Life forces us to do this ex post factoWhitehead quote

Page 28: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 28

4. Hardwired to assume association is causation

1. Story

2. Criteria for causation help

3. Causation in observational studies is a great challenge

4. R.A. Fisher introduced randomizationas way of establishing cause-effect

Page 29: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 29

5. Hardwired to look for secondary causes

1. Assists in soccer, hockey, basketball

2. Tendency to blame (1) immediate efficient cause to more distal(Adam→Eve→snake maneuver )(2) move from efficient cause to material or formal cause; that is, change the “universe of discourse”

3. Surrogate outcomes in science (great work by Ross Prentice)

Page 30: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 30

6. Challenges to Causation in Observational Studies

1. Selection bias “Where did you get the data?”

2. Confounding

“What do you think the data are telling you?”

3. Interplay of selection bias and confounding

Effect modification needs to be considered

Page 31: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 31

7. Observational vs experimental studies

Characteristic Observational ExperimentEthical issues Fewer MoreOrientation Retrospective

ProspectiveInferenceWeaker StrongerSelection bias Big problem Less Confounding Present AbsentRealism More LessCausal plausibility Weaker StrongerResearcher control Less MoreAnalysis More complicated Less

Page 32: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 32

Consequence

In view of (1) Variation (2) Undocumented selection (3) Hard-wired tendencies for causation(4) Observational dataWe have a lethal mixture for dealing with important scientific questions affecting daily life:

Page 33: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 33

APHA Journal, May, 2004

Page 34: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 34

So we always need to ask:

1. What is (was) the question?

2. Is (Was) it testable?

3. Where will (did) you get the data?

4. What do you think the data are telling you?

Page 35: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 35

1. Variation is fact of life

2. “Population” as model

3. Representativeness

4. Regression to the mean

5. What is the question?

6. Testable question?

7. Association or causation

8. Causation through randomization

Experience can benefit from everyday statistics

Variation

Causation

Page 36: Statistics as a Distillation of Everyday Experience

March 29, 2007 KNAW Lecture 36

Hartelijkbedankt

voor Uw

aandacht