Making Statistics Surprising

Roger Watt, Kelly Younger, Lizzie Collins, Rebecca Skinner, Francesca Worsnop


Science from the outside

[Diagram: Idea, Knowledge]

Science from the inside

[Diagram: Idea, Result, Evidence, Knowledge]

Science from the inside

[Diagram: the research process, with the labels Idea, Result, Hypothesis, Design, Evidence, Data Analysis, Inference, Knowledge, Persuading, Describing]

What matters here?


Decisions required

• Hypothesis
  – What variables?
  – What types of variable?
  – What relationships between variables?

• Design
  – What sampling method?
  – What deployment of sample (between/within)?
  – What sample size?

Lesson

• We must make decisions
  – these matter

• We may have preferences
  – these don't matter

The Student Journey

[Diagram: the research process, with the labels Idea, Result, Hypothesis, Design, Evidence, Data Analysis, Inference, Knowledge, Persuading, Describing]

What appears to matter here to a student?

[Diagram: Result, Data Analysis, Inference]

• What test?
  – t-test
  – chi-square
  – correlation
  – ANOVA
  – regression
  – ANCOVA
  – MANOVA

• How to test?
  – Formulae
  – Calculations: $\sum_i (x_i - \bar{x})^2$

• SPSS
  – What columns?

• Numbers...
  – Dozens of numbers
  – SSQ, F, t, p
  – How many sig figs?
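As a small worked instance of the calculation named on this slide, the sum of squared deviations can be sketched as below. The function name `sum_of_squares` and the scores are made up for illustration; they are not taken from the talk.

```python
def sum_of_squares(xs):
    """Return the sum of squared deviations from the mean: sum((x_i - mean)^2)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


# Hypothetical scores (mean 5), chosen only to keep the arithmetic easy to check.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(sum_of_squares(scores))  # 32.0
```

The deviations from the mean are -3, -1, -1, -1, 0, 0, 2 and 4, whose squares add up to 32.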

The Student Experience

• Stats is Hard
  – disconnected facts
  – tedious arithmetic

• Stats is Disempowering
  – easy to make simple mistakes
  – myriad of details obscure concepts

• Stats is not fun
  – no pleasant surprises

The Main Goal: Doing stats

• Understanding:
  – Preserve the whole picture

• Conceptual Insight:
  – Full grasp of issues that matter for the outcome

• Skills:
  – Confident in essentials

The Plan

• Materials
  – Whole picture always present
  – Concentrate on research decisions
  – Remove disconnected facts

• Learning
  – Repeated experience
  – Immediate feedback
  – Discovery

The Whole Picture

[Diagram: the research process, with the labels Idea, Result, Hypothesis, Design, Evidence, Data Analysis, Inference, Knowledge, Persuading, Describing]

Research Decisions

[Diagram: the research process, annotated at Hypothesis and Design with the questions: What variables? What types of variable? What relationships between variables?]

BrawStats

• Materials
  – Whole process always visible
  – Decisions require user input
    • everything else automatic

• Learning
  – Encourages experimenting & discovery
  – Every action produces a relevant graphical output
    • immediately

BrawStats

• Hypothesis
  – How many variables?
  – What variables?
  – What types of variable?
  – What relationship between variables?

[Screenshots: BrawStats panels built up in order: Variables, Logic, Prediction]

Hypothesis
  Dependent Variable: IQ (Interval), Mean = 100, Std = 15
  Independent Variable: gender (Categorical), female (50%), male (50%)

Predicted Means (IQ by gender)
  female: 107
  male: 93

[Plot: IQ (axis 50 to 150) against gender (female, male)]
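The hypothesis on this slide can be turned into a data-generating sketch. This is not BrawStats code. It assumes that "Std = 15" is the overall standard deviation of IQ, so the within-group standard deviation is what remains after removing the between-group variance; the names `draw_sample` and `WITHIN_SD` and the sample size of 21 per group are illustrative.

```python
import math
import random

POP_MEAN, POP_SD = 100.0, 15.0                  # from the slide
GROUP_MEANS = {"female": 107.0, "male": 93.0}   # predicted means from the slide

# Assumption: equal group sizes (50% each) and total variance = within + between.
between_var = sum((m - POP_MEAN) ** 2 for m in GROUP_MEANS.values()) / len(GROUP_MEANS)
WITHIN_SD = math.sqrt(POP_SD ** 2 - between_var)  # sqrt(225 - 49)


def draw_sample(n_per_group, rng):
    """Draw one simulated sample of IQ scores for each gender group."""
    return {
        group: [rng.gauss(group_mean, WITHIN_SD) for _ in range(n_per_group)]
        for group, group_mean in GROUP_MEANS.items()
    }


rng = random.Random(1)
sample = draw_sample(21, rng)
for group, values in sample.items():
    print(group, round(sum(values) / len(values), 1))
```

Each run with a different seed gives different sample means, which is the sampling error that later slides return to.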

BrawStats

• Design
  – How to sample?
  – Within/Between?
  – How many participants?

[Screenshots: the IQ by gender example (predicted means 107 and 93, Std = 15), with panels built up in order: Variables, Logic, Prediction, Design]

BrawStats

• Everything else
  – done for you

[Screenshots: the IQ by gender example, with panels built up in order: Variables, Logic, Prediction, Design, Evidence]

BrawStats

• Structure
  1. Whole process always visible
  2. Decisions require user input
  3. Everything else automatic

• Learning
  4. Every action produces a relevant graphical output immediately
  5. Encourages experimenting & discovery






The Next Goal: Expected Outcomes

• Understanding:
  – Relationship of outcome to chance (sampling error)

• Conceptual Insight:
  – Strengths and weaknesses of statistical testing (NHST)

• Skills:
  – Interpret statistical outcomes





Consequences of the p-value distribution

We are locked into the type of system given by this truth table:

              H0 Correct      H0 Incorrect
p <= 0.05     Type I error    (no error)
p > 0.05      (no error)      Type II error

[Plots: p(Type I error) and p(Type II error), each on a 0 to 1 axis, against criterion p (0.01 to 1.0); t-test independent samples (n=63100)]
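The trade-off in these plots can be approximated with a short simulation. The sketch below is not the authors' code: it assumes standard-normal scores, 30 participants per group, a true difference of 0.5 standard deviations when H0 is incorrect, and a normal approximation to the t distribution so that only the standard library is needed. All function names are illustrative.

```python
import math
import random


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (Welch statistic, normal approximation)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_values(true_diff, n_per_group=30, n_sims=1000, seed=0):
    """Run n_sims simulated studies and return their p-values."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(true_diff, 1.0) for _ in range(n_per_group)]
        p_values.append(two_sample_p(a, b))
    return p_values


p_h0_correct = simulate_p_values(0.0, seed=1)    # no true effect
p_h0_incorrect = simulate_p_values(0.5, seed=2)  # hypothetical true effect


def type1_rate(criterion):
    """Share of studies with p <= criterion when H0 is correct."""
    return sum(p <= criterion for p in p_h0_correct) / len(p_h0_correct)


def type2_rate(criterion):
    """Share of studies with p > criterion when H0 is incorrect."""
    return sum(p > criterion for p in p_h0_incorrect) / len(p_h0_incorrect)


for criterion in (0.01, 0.05, 0.1, 0.5, 1.0):
    print(criterion, type1_rate(criterion), type2_rate(criterion))
```

Raising the criterion can only increase the Type I rate and decrease the Type II rate, because the same simulated p-values are compared against a larger threshold.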

Lessons

• sampling error matters

• p-value
  – depends on sampling error
  – is poorly behaved

• p-values cannot be easily interpreted
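One way to see how much the p-value depends on sampling error is to repeat the same study many times and look at the spread of p-values. The sketch below makes the same hedged assumptions as a typical textbook simulation: standard-normal scores, 30 participants per group, a hypothetical true difference of 0.5 standard deviations, and a normal approximation to the t distribution. None of these values come from the talk.

```python
import math
import random


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (Welch statistic, normal approximation)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(abs(z) / math.sqrt(2))


def repeated_study_p_values(true_diff=0.5, n_per_group=30, n_sims=1000, seed=0):
    """Repeat an identical study n_sims times; return the sorted p-values."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(true_diff, 1.0) for _ in range(n_per_group)]
        p_values.append(two_sample_p(a, b))
    return sorted(p_values)


ps = repeated_study_p_values()
n = len(ps)
print("10th percentile of p:", ps[n // 10])
print("median p:", ps[n // 2])
print("90th percentile of p:", ps[9 * n // 10])
print("share with p <= 0.05:", sum(p <= 0.05 for p in ps) / n)
```

The effect is identical in every simulated study, yet the p-values range from far below 0.05 to far above it; only the sample differs.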

The Last Goal: Exploring stats

• Understanding:
  – Relationship of outcome to design decisions

• Conceptual Insight:
  – Strengths and weaknesses of designs

• Skills:
  – Make optimal decisions

[Diagram: the research process, annotated at Hypothesis and Design with the questions: What variables? What types of variable? What relationships between variables?]

The Basic Design Choices

• Variable type
• Between/Within
• No. of participants
• Sampling strategy




[Plots: p(Type I error) and p(Type II error), each on a 0 to 1 axis, against type of IV (i, o, c5, c4, c3, c2); Pearson correlation, DV IQ; two versions, (n=11260) and (n=18380)]
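A rough sketch of why the type of IV matters: cutting an interval IV into a few categories discards information and raises the Type II error rate. The simulation below is an illustration, not the procedure behind the plots. It assumes a true correlation of 0.3, 42 participants, equally likely ordered categories, a Pearson correlation on the category codes, and a Fisher z normal approximation for the p-value. It does not model the ordinal case.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_p(r, n):
    """Two-sided p-value for r using the Fisher z transform (normal approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def type2_rate(levels=None, rho=0.3, n=42, n_sims=1000, criterion=0.05, seed=0):
    """Share of simulated studies that miss a true effect.

    levels=None keeps the IV as an interval variable; an integer cuts it
    into that many equally likely ordered categories coded 0..levels-1.
    """
    rng = random.Random(seed)
    cuts = [] if levels is None else [
        NormalDist().inv_cdf(k / levels) for k in range(1, levels)
    ]
    misses = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rho * xi + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0) for xi in x]
        if levels is not None:
            x = [sum(xi > c for c in cuts) for xi in x]
        misses += correlation_p(pearson_r(x, y), n) > criterion
    return misses / n_sims


for levels in (None, 5, 4, 3, 2):
    print("interval" if levels is None else f"{levels} categories", type2_rate(levels))
```

Because every call uses the same seed, each version analyses the same underlying data, so differences between rows come only from the coarsening.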



[Plots: p(Type I error) and p(Type II error), each on a 0 to 1 axis, against repeated measures (i, r); t-test paired samples, IV gender; two versions, (n=10480) and (n=162040)]
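The between/within choice can be sketched by simulating both designs for the same true effect. The values below are assumptions for illustration only: an effect of 0.4 standard deviations, a correlation of 0.6 between a participant's two scores, 30 participants per condition, and a normal approximation for the p-values. Note that the between design here uses twice as many people as the within design.

```python
import math
import random


def z_to_p(z):
    """Two-sided p-value from a z statistic (normal approximation)."""
    return math.erfc(abs(z) / math.sqrt(2))


def mean_var(xs):
    """Return the mean and the sample variance of a list."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def type2_rates(effect=0.4, rho=0.6, n=30, n_sims=1000, criterion=0.05, seed=0):
    """Return (between-design, within-design) Type II error rates."""
    rng = random.Random(seed)
    sd_person, sd_noise = math.sqrt(rho), math.sqrt(1 - rho)
    miss_between = miss_within = 0
    for _ in range(n_sims):
        # Between: two separate groups of n people, one score each.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(effect, 1.0) for _ in range(n)]
        (ma, va), (mb, vb) = mean_var(a), mean_var(b)
        miss_between += z_to_p((mb - ma) / math.sqrt(va / n + vb / n)) > criterion
        # Within: n people measured twice; a shared person component
        # makes the two scores correlate at rho.
        diffs = []
        for _ in range(n):
            person = rng.gauss(0.0, sd_person)
            first = person + rng.gauss(0.0, sd_noise)
            second = effect + person + rng.gauss(0.0, sd_noise)
            diffs.append(second - first)
        md, vd = mean_var(diffs)
        miss_within += z_to_p(md / math.sqrt(vd / n)) > criterion
    return miss_between / n_sims, miss_within / n_sims


between, within = type2_rates()
print("between:", between, "within:", within)
```

The within design analyses difference scores, so stable differences between people cancel out and the same effect is easier to detect.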



[Plots: p(Type I error) and p(Type II error), each on a 0 to 1 axis, against no. of participants (20 to 100); t-test independent samples (n=2780), IV gender]
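The effect of the number of participants on the Type II error rate can be sketched in the same way. This is an illustration under stated assumptions, not the simulation behind the plots: the sample sizes are taken as participants per group, the true difference is a hypothetical 0.5 standard deviations, and p-values use a normal approximation.

```python
import math
import random


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (Welch statistic, normal approximation)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(abs(z) / math.sqrt(2))


def type2_rate(n_per_group, true_diff=0.5, n_sims=1000, criterion=0.05, seed=0):
    """Share of simulated studies with p > criterion although the effect is real."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(true_diff, 1.0) for _ in range(n_per_group)]
        misses += two_sample_p(a, b) > criterion
    return misses / n_sims


rates = {n: type2_rate(n) for n in (20, 40, 60, 80, 100)}
for n, rate in rates.items():
    print(n, rate)
```

With these settings the miss rate falls steeply as the groups grow, while the criterion, and so the Type I rate, stays fixed.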



[Plots: p(Type I error) and p(Type II error), each on a 0 to 1 axis, against independence (0.2 to 0.8); two versions]
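How the plots quantify independence is not recoverable from the slide, so the sketch below uses one hypothetical model of non-independence: within each group, observations arrive in pairs that share a common component with intra-pair correlation `icc`, and the test wrongly treats every observation as independent. H0 is true throughout, so any significant result is a Type I error. The p-values use a normal approximation.

```python
import math
import random


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (Welch statistic, normal approximation)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(abs(z) / math.sqrt(2))


def paired_up_group(n_pairs, icc, rng):
    """Return 2 * n_pairs scores in which each pair correlates at icc."""
    scores = []
    for _ in range(n_pairs):
        shared = rng.gauss(0.0, math.sqrt(icc))
        scores += [shared + rng.gauss(0.0, math.sqrt(1 - icc)) for _ in range(2)]
    return scores


def type1_rate(icc, n_pairs=15, n_sims=1000, criterion=0.05, seed=0):
    """Share of simulated null studies with p <= criterion."""
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(n_sims):
        a = paired_up_group(n_pairs, icc, rng)
        b = paired_up_group(n_pairs, icc, rng)
        false_alarms += two_sample_p(a, b) <= criterion
    return false_alarms / n_sims


for icc in (0.0, 0.2, 0.4, 0.6, 0.8):
    print(icc, type1_rate(icc))
```

Correlated observations carry less information than the test assumes, so the standard error is underestimated and false alarms become more frequent than the criterion suggests.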

The Basic Assumptions

• Normality:
  – skew
  – kurtosis

[Plots: p(Type I error) and p(Type II error), each on a 0 to 1 axis, against skew (-1 to 1) and against kurtosis (-1 to 1)]
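A quick check of how little skew changes the Type I error rate can be sketched as follows. The setup is an assumption made for illustration: both groups are drawn from the same distribution, so H0 is true, with 30 participants per group; the skewed case uses an exponential distribution (skewness 2, which is more extreme than the range on the plots). Kurtosis is not modelled, and p-values use a normal approximation.

```python
import math
import random


def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (Welch statistic, normal approximation)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = (ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return math.erfc(abs(z) / math.sqrt(2))


def type1_rate(draw, n_per_group=30, n_sims=1000, criterion=0.05, seed=0):
    """Share of null studies with p <= criterion; draw(rng) yields one score."""
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(n_sims):
        a = [draw(rng) for _ in range(n_per_group)]
        b = [draw(rng) for _ in range(n_per_group)]
        false_alarms += two_sample_p(a, b) <= criterion
    return false_alarms / n_sims


def normal_score(rng):
    """Symmetric scores: standard normal."""
    return rng.gauss(0.0, 1.0)


def skewed_score(rng):
    """Strongly right-skewed scores: exponential with rate 1."""
    return rng.expovariate(1.0)


rate_normal = type1_rate(normal_score)
rate_skewed = type1_rate(skewed_score)
print("normal:", rate_normal, "skewed:", rate_skewed)
```

Both rates stay close to the 0.05 criterion, because the test compares group means and those are approximately normal at this sample size even when the raw scores are not.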


Lessons

• early decisions matter:
  – interval > ordinal > categorical
  – no. of participants
  – sampling strategy
    • between/within
    • non-independence

• not much else matters
  – skew
  – kurtosis

The Student Experience

• Stats is Hard
  – disconnected facts
  – tedious arithmetic

• Stats is Disempowering
  – easy to make simple mistakes
  – myriad of details obscure concepts

• Stats is not fun
  – no pleasant surprises





The Plan

• Materials
  – Whole picture always present
  – Concentrate on research decisions
  – Remove disconnected facts

• Learning
  – Repeated experience
  – Immediate feedback
  – Discovery

BrawStats

• Materials
  – Whole process always visible
  – Decisions require user input
    • everything else automatic

• Learning
  – Encourages experimenting & discovery
  – Every action produces a relevant graphical output
    • immediately

Lessons

• It (almost) worked
  – not sure why
  – maybe because:
    • no numbers/arithmetic
    • single coherent process
    • it is (??) self-explaining & self-illustrating
    • foraging for undocumented features
