Assumptions

Assumptions. “Essentially, all models are wrong, but some are useful” George E.P. Box Your model has to be wrong… … but that’s o.k. if it’s illuminating!


“Essentially, all models are wrong, but some are useful”

George E.P. Box

Your model has to be wrong… but that’s o.k. if it’s illuminating!

Linear Model Assumptions

Absence of Collinearity

Normality of Errors

Homoskedasticity of Errors

No influential data points

Independence

Absence of Collinearity

Baayen (2008: 182)

Where does collinearity come from?

…most often, correlated predictor variables

Demo
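The point about correlated predictors can be sketched with simulated data. This is not the demo from the deck; the variables `x1`, `x2`, `x3` and all numbers are made up for illustration. The sketch fits the same effect with and without a near-duplicate predictor and computes a variance inflation factor by hand.

```r
# Hypothetical simulated data: x2 is almost a copy of x1
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # strongly correlated with x1
x3 <- rnorm(n)                  # unrelated predictor
y  <- 1 + 2 * x1 + rnorm(n)

cor_x1_x2 <- cor(x1, x2)

# Estimate the effect of x1 without and with its near-duplicate x2
mdl_alone     <- lm(y ~ x1 + x3)
mdl_collinear <- lm(y ~ x1 + x2 + x3)

se_alone     <- coef(summary(mdl_alone))["x1", "Std. Error"]
se_collinear <- coef(summary(mdl_collinear))["x1", "Std. Error"]

# Variance inflation factor for x1: 1 / (1 - R^2),
# where R^2 comes from regressing x1 on the other predictors
r2_x1  <- summary(lm(x1 ~ x2 + x3))$r.squared
vif_x1 <- 1 / (1 - r2_x1)

print(c(cor = cor_x1_x2, se_alone = se_alone,
        se_collinear = se_collinear, vif = vif_x1))
```

With predictors this strongly correlated, the standard error of `x1` should be far larger in the collinear model, so the estimate becomes unstable even though the data are the same.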

What to do?

No influential data points

Baayen (2008: 189-190)

Leverage


DFbeta

(…and much more)

Leave-one-out Influence Diagnostics
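A minimal sketch of leverage and DFbeta in base R, assuming simulated data with one planted extreme point (the data and the planted point are hypothetical). It uses `hatvalues()` and `dfbeta()` from the stats package and then repeats the leave-one-out step by hand to show what the diagnostic measures.

```r
# Hypothetical data: 30 ordinary points plus one extreme point
set.seed(1)
n <- 30
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
x <- c(x, 8)     # far out on x ...
y <- c(y, -6)    # ... and far off the trend

xmdl <- lm(y ~ x)

lev <- hatvalues(xmdl)   # leverage: diagonal of the hat matrix
dfb <- dfbeta(xmdl)      # change in each coefficient when one row is dropped

most_leverage  <- which.max(lev)
most_influence <- which.max(abs(dfb[, "x"]))

# Leave-one-out by hand: refit without row i and compare slopes
slope_loo <- sapply(seq_along(y), function(i) coef(lm(y[-i] ~ x[-i]))[2])
manual_change <- coef(xmdl)["x"] - slope_loo

print(c(leverage = unname(most_leverage), dfbeta = unname(most_influence)))
# influence.measures(xmdl) collects these and further diagnostics in one table
```

The hand-computed slope changes should match the `x` column of `dfbeta()` in size, which is the sense in which these are leave-one-out diagnostics.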


Winter & Matlock (2013)

Normality of Error

The error (not the data!) is assumed to be normally distributed

So, the residuals should be normally distributed

# fit the model, then plot a histogram of its residuals
xmdl = lm(y ~ x)
hist(residuals(xmdl))

# Q-Q plot of the residuals against a normal distribution
qqnorm(residuals(xmdl))
qqline(residuals(xmdl))
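To make the distinction between the error and the data concrete, here is a self-contained sketch with simulated data (the `group` predictor and all numbers are assumptions for illustration). The response is clearly bimodal, yet the residuals of the model come from a normal error term.

```r
# Hypothetical data: two groups with very different means
set.seed(123)
group <- rep(c("a", "b"), each = 100)
y <- ifelse(group == "a", 0, 10) + rnorm(200)   # y itself is bimodal

xmdl <- lm(y ~ group)

op <- par(mfrow = c(1, 3))
hist(y, main = "Data: not normal")
hist(residuals(xmdl), main = "Residuals")
qqnorm(residuals(xmdl)); qqline(residuals(xmdl))
par(op)

# Shapiro-Wilk test as a rough numeric companion to the plots
p_data  <- shapiro.test(y)$p.value
p_resid <- shapiro.test(residuals(xmdl))$p.value
print(c(data = p_data, residuals = p_resid))
```

Checking the raw response here would suggest a violation that does not exist; the assumption concerns what is left over after the model has been fitted.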

Homoskedasticity of Error

The error (not the data!) is assumed to have equal variance across the predicted values

So, the residuals should have equal variance across the predicted values
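A residual plot (residuals against fitted values) is the usual way to look at this. The sketch below assumes simulated data, one set with constant error variance and one where the error grows with `x`; the helper `spread_ratio()` is a hypothetical rough check, not a formal test.

```r
# Hypothetical data: constant vs. growing error variance
set.seed(7)
x <- runif(200, 1, 10)
y_homo   <- 1 + 2 * x + rnorm(200, sd = 1)        # equal spread everywhere
y_hetero <- 1 + 2 * x + rnorm(200, sd = 0.5 * x)  # spread grows with x

mdl_homo   <- lm(y_homo ~ x)
mdl_hetero <- lm(y_hetero ~ x)

op <- par(mfrow = c(1, 2))
plot(fitted(mdl_homo), residuals(mdl_homo), main = "Roughly equal spread")
abline(h = 0, lty = 2)
plot(fitted(mdl_hetero), residuals(mdl_hetero), main = "Fan shape")
abline(h = 0, lty = 2)
par(op)

# Rough check: residual sd in the upper vs. lower half of the fitted values
spread_ratio <- function(mdl) {
  f <- fitted(mdl)
  r <- residuals(mdl)
  sd(r[f > median(f)]) / sd(r[f <= median(f)])
}
print(c(homo = spread_ratio(mdl_homo), hetero = spread_ratio(mdl_hetero)))
```

A ratio near 1 goes with a band of even width in the plot; a ratio well above 1 goes with the fan shape that signals heteroskedasticity.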

WHAT TO DO IF NORMALITY/HOMOSKEDASTICITY IS VIOLATED?

Either: nothing + report the violation

Or: report the violation + transformations

Two types of transformations

Linear transformations: leave the shape of the distribution intact (centering, scaling)

Nonlinear transformations: do change the shape of the distribution
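The difference between the two types can be checked numerically. This sketch assumes a made-up right-skewed variable `rt` and a hand-written `skewness()` helper (base R has none); the log is used as one example of a nonlinear transformation.

```r
# Hypothetical right-skewed variable
set.seed(99)
rt <- rlnorm(1000, meanlog = 6, sdlog = 0.5)

# Simple moment-based skewness (helper defined here for illustration)
skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

rt_centered <- rt - mean(rt)          # linear: centering
rt_scaled   <- as.numeric(scale(rt))  # linear: centering + scaling (z-scores)
rt_log      <- log(rt)                # nonlinear: log

print(c(raw      = skewness(rt),
        centered = skewness(rt_centered),
        scaled   = skewness(rt_scaled),
        log      = skewness(rt_log)))
```

Centering and scaling should leave the skewness untouched, since they only shift and stretch the axis, while the log pulls in the long right tail.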


Before transformation


After transformation

Still bad… but better!
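A before/after comparison of this kind can be sketched as follows. The deck does not say which transformation was used, so the log of the response is an assumption here, as are the simulated data and the `skewness()` helper.

```r
# Hypothetical positive, right-skewed response
set.seed(2024)
x <- runif(300, 0, 4)
y <- exp(1 + 0.5 * x + rnorm(300, sd = 0.6))

mdl_raw <- lm(y ~ x)        # before transformation
mdl_log <- lm(log(y) ~ x)   # after (log is an assumed choice)

op <- par(mfrow = c(1, 2))
qqnorm(residuals(mdl_raw), main = "Before: y ~ x")
qqline(residuals(mdl_raw))
qqnorm(residuals(mdl_log), main = "After: log(y) ~ x")
qqline(residuals(mdl_log))
par(op)

# Moment-based skewness of the residuals (helper defined for illustration)
skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3
skew_before <- skewness(residuals(mdl_raw))
skew_after  <- skewness(residuals(mdl_log))
print(c(before = skew_before, after = skew_after))
```

Note that after the transformation the coefficients describe effects on the log scale, so the interpretation of the model changes along with the residuals.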

Assumptions

Normality of Errors: (histogram of residuals), Q-Q plot of residuals

Homoskedasticity of Errors: residual plot

What is independence?

[Figure: common experimental data. Subject → Item #1, Item …, Item …; each item → Rep 1, Rep 2, Rep 3]

Pseudoreplication = Disregarding Dependencies

Subject1 Item1
Subject1 Item2
Subject1 Item3
… …

Subject2 Item1
Subject2 Item2
Subject2 Item3
… …

Machlis et al. (1985)

“pooling fallacy”

Hurlbert (1984)

“pseudoreplication”

Hierarchical data is everywhere

• Typological data

(e.g., Bell 1978, Dryer 1989, Perkins 1989; Jaeger et al., 2011)

• Organizational data

• Classroom data

[Figure: languages shown: German, French, English, Spanish, Italian, Swedish, Norwegian, Finnish, Hungarian, Turkish, Romanian]

[Figure: classroom data grouped into Class 1 and Class 2]

Intraclass Correlation (ICC)
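One common way to estimate an intraclass correlation is from the mean squares of a one-way ANOVA. The sketch below assumes balanced simulated classroom data (20 classes of 15 students, all numbers made up) with equal between-class and within-class variance, so the true ICC is 0.5.

```r
# Hypothetical balanced classroom data
set.seed(11)
n_class <- 20
n_per   <- 15
class_idx    <- rep(seq_len(n_class), each = n_per)
class_effect <- rnorm(n_class, sd = 1)   # between-class sd = 1
score <- 50 + class_effect[class_idx] +
  rnorm(n_class * n_per, sd = 1)         # within-class sd = 1

# One-way ANOVA with class as the grouping factor
aov_tab    <- anova(lm(score ~ factor(class_idx)))
ms_between <- aov_tab[1, "Mean Sq"]
ms_within  <- aov_tab[2, "Mean Sq"]

# One-way ICC for balanced groups of size n_per
icc <- (ms_between - ms_within) / (ms_between + (n_per - 1) * ms_within)
print(icc)
```

An ICC near 0 would mean students in the same class are no more alike than students from different classes; the further it is from 0, the more the observations within a class depend on each other.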

[Figure: simulation for 16 subjects. Type I error rate: pseudoreplication vs. items analysis]
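The inflation of the Type I error rate under pseudoreplication can be sketched with a small simulation. This is not the simulation from the deck: the design (16 subjects, 20 items each, a between-subjects condition with no true effect) and the comparison against a by-subject analysis are assumptions chosen for illustration.

```r
# Hypothetical design: 16 subjects x 20 items, condition varies between subjects
set.seed(2013)
n_subj  <- 16
n_items <- 20
n_sims  <- 500

one_sim <- function() {
  subj         <- rep(seq_len(n_subj), each = n_items)
  cond_by_subj <- rep(c("A", "B"), each = n_subj / 2)
  cond         <- cond_by_subj[subj]
  subj_offset  <- rnorm(n_subj, sd = 1)   # each subject has their own baseline
  y <- subj_offset[subj] + rnorm(n_subj * n_items, sd = 1)  # no effect of cond

  # Pseudoreplication: treat all 320 rows as independent
  p_pooled <- t.test(y ~ cond)$p.value

  # Respecting the dependency: one mean per subject, then 8 vs. 8
  subj_means <- as.numeric(tapply(y, subj, mean))
  p_by_subj  <- t.test(subj_means ~ cond_by_subj)$p.value

  c(pooled = p_pooled, by_subject = p_by_subj)
}

p_values <- replicate(n_sims, one_sim())
type1 <- rowMeans(p_values < 0.05)   # share of false positives per analysis
print(type1)
```

Because there is no true effect, every significant result is a false positive. The pooled analysis should flag far more than the nominal 5%, while the by-subject analysis should stay close to it.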

Interpretational problem: What’s the population for inference?

Violating the independence assumption makes the p-value… meaningless


That’s it (for now)