
One-Way Analysis of Variance

Nathaniel E. Helwig

Assistant Professor of Psychology and Statistics
University of Minnesota (Twin Cities)

Updated 04-Jan-2017

Nathaniel E. Helwig (U of Minnesota) One-Way Analysis of Variance Updated 04-Jan-2017 : Slide 1


Copyright

Copyright © 2017 by Nathaniel E. Helwig


Outline of Notes

1) Categorical Predictors:
     Overview
     Dummy coding
     Effect coding

2) One-Way ANOVA:
     Model form & assumptions
     Estimation & basic inference
     Memory example (part 1)

3) Multiple Comparisons:
     Comparing means
     Bonferroni correction
     Tukey correction
     Scheffé correction
     Summary of corrections
     Memory example (part 2)


Categorical Predictors


Categorical Predictors Overview

Categorical Variables (revisited)

Suppose that X ∈ {x_1, . . . , x_g} is a categorical variable with g levels. Categorical variables are also called factors in the ANOVA context.
     Example: sex ∈ {female, male} has two levels
     Example: drug ∈ {A, B, C} has three levels

To code a categorical variable (with g levels) in a regression model, we need to include g − 1 different variables in the model.

If we know µ = (1/g) ∑_{j=1}^{g} µ_j, where µ_j is the mean for the j-th factor level,

then we know µ_g = g(µ − (1/g) ∑_{j=1}^{g−1} µ_j) by definition.

There are only g total free parameters, so we cannot estimate µ and {µ_j}_{j=1}^{g} simultaneously.


Categorical Predictors Dummy Coding

Dummy Coding: Definition

Dummy coding uses g − 1 binary variables to code a factor:

x_ij = 1 if the i-th observation is in the j-th level, and 0 otherwise,

for i ∈ {1, . . . , n_j} and j ∈ {1, . . . , g − 1}.

The regression model becomes

y_ij = b_0 + ∑_{j=1}^{g−1} b_j x_ij + e_ij

where b_0 = µ_g and b_j = µ_j − µ_g for j ∈ {1, . . . , g − 1}.
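As a concrete sketch of this scheme (in Python rather than the R used later in these notes; dummy_code is a hypothetical helper, not a standard function), each non-reference level gets its own 0/1 column and the reference level is coded as all zeros:

```python
def dummy_code(labels, reference):
    """Build g-1 binary (0/1) columns, one per non-reference level."""
    levels = [lv for lv in sorted(set(labels)) if lv != reference]
    return [[1 if lab == lv else 0 for lv in levels] for lab in labels]

cond = ["fast", "fast", "normal", "normal", "slow", "slow"]
X = dummy_code(cond, reference="fast")  # columns: normal, slow
print(X)  # reference-group rows are all zeros
```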


Categorical Predictors Dummy Coding

Dummy Coding: Considerations

Dummy coding is useful for. . .
     the one-way ANOVA model (unique parameter for each factor level)
     comparing treatment groups to a clearly defined "control" group

Dummy coding is less useful when. . .
     we have g > 2 levels and/or the model is more complicated
     we do NOT have a clearly defined "control" or "reference" group


Categorical Predictors Dummy Coding

Dummy Coding: R Syntax

The contrasts function controls the coding scheme for a factor.

Use the contr.treatment option for dummy coding.

> x = factor(rep(letters[1:3], each=5))
> x
 [1] a a a a a b b b b b c c c c c
Levels: a b c
> contrasts(x) <- contr.treatment(nlevels(x))
> contrasts(x)
  2 3
a 0 0
b 1 0
c 0 1


Categorical Predictors Effect Coding

Effect Coding: Definition

Effect coding also uses g − 1 variables to code a factor:

x_ij = 1 if the i-th observation is in the j-th level, −1 if the i-th observation is in the g-th level, and 0 otherwise,

for i ∈ {1, . . . , n_j} and j ∈ {1, . . . , g − 1}.

The regression model becomes

y_ij = b_0 + ∑_{j=1}^{g−1} b_j x_ij + e_ij

where b_0 = µ and b_j = µ_j − µ for j ∈ {1, . . . , g}; note that b_g = −∑_{j=1}^{g−1} b_j.
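A matching sketch for effect coding (again illustrative Python; effect_code is a hypothetical helper): the dropped g-th level receives −1 in every column, which is what forces the level effects to sum to zero:

```python
def effect_code(labels, dropped):
    """Build g-1 effect (+1/0/-1) columns; the dropped level gets -1 in every column."""
    levels = [lv for lv in sorted(set(labels)) if lv != dropped]
    rows = []
    for lab in labels:
        if lab == dropped:
            rows.append([-1] * len(levels))
        else:
            rows.append([1 if lab == lv else 0 for lv in levels])
    return rows

print(effect_code(["a", "b", "c"], dropped="c"))  # mirrors R's contr.sum pattern
```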


Categorical Predictors Effect Coding

Effect Coding: Considerations

Effect coding is useful for. . .
     a simple interpretation of b_0 as the overall mean
     comparing each group's effect to the overall mean

Effect coding is less useful when. . .
     we have g = 2 levels for a factor
     we DO have a clearly defined "control" or "reference" group


Categorical Predictors Effect Coding

Effect Coding: R Syntax

The contrasts function controls the coding scheme for a factor.

Use the contr.sum option for effect (deviation) coding.

> x = factor(rep(letters[1:3], each=5))
> x
 [1] a a a a a b b b b b c c c c c
Levels: a b c
> contrasts(x) <- contr.sum(nlevels(x))
> contrasts(x)
  [,1] [,2]
a    1    0
b    0    1
c   -1   -1


One-Way ANOVA Model


One-Way ANOVA Model Model Form and Assumptions

One-Way ANOVA Model (cell means form)

The One-Way Analysis of Variance (ANOVA) model has the form

y_ij = µ_j + e_ij

for i ∈ {1, . . . , n_j} and j ∈ {1, . . . , g} where
     y_ij ∈ ℝ is the real-valued response for the i-th subject in the j-th factor level
     µ_j ∈ ℝ is the real-valued population mean for the j-th factor level
     e_ij ~ iid N(0, σ²) is a Gaussian error term
     n_j is the number of subjects in the j-th factor level and n = ∑_{j=1}^{g} n_j
     g is the number of factor levels

This implies that y_ij ~ ind N(µ_j, σ²).


One-Way ANOVA Model Model Form and Assumptions

One-Way ANOVA Model (dummy coding)

Using dummy coding, the one-way ANOVA becomes

y_ij = b_0 + ∑_{j=1}^{g−1} b_j x_ij + e_ij

for i ∈ {1, . . . , n_j} and j ∈ {1, . . . , g} where
     x_ij = 1 if the i-th observation is in the j-th level, and 0 otherwise
     b_0 = µ_g is the reference group mean
     b_j = µ_j − µ_g for j ∈ {1, . . . , g − 1}


One-Way ANOVA Model Model Form and Assumptions

One-Way ANOVA Model (effect coding)

Using effect coding, the one-way ANOVA becomes

y_ij = b_0 + ∑_{j=1}^{g−1} b_j x_ij + e_ij

for i ∈ {1, . . . , n_j} and j ∈ {1, . . . , g} where
     x_ij = 1 if the i-th observation is in the j-th level, −1 if the i-th observation is in the g-th level, and 0 otherwise
     b_0 = µ is the overall mean
     b_j = µ_j − µ for j ∈ {1, . . . , g}

Note that b_g = −∑_{j=1}^{g−1} b_j by definition.


One-Way ANOVA Model Model Form and Assumptions

One-Way ANOVA Model (matrix form)

In matrix form, the one-way ANOVA model is

y = Xb + e

where y = (y_1, y_2, . . . , y_n)′ is the response vector, e = (e_1, e_2, . . . , e_n)′ is the error vector, b = (b_0, b_1, . . . , b_{g−1})′ is the coefficient vector, and X is the n × g design matrix whose i-th row is (1, x_i1, x_i2, . . . , x_i(g−1)).

The definition of x_ij and {b_j}_{j=1}^{g−1} depends on the coding scheme.

Here i ∈ {1, . . . , n}, and the second subscript on y and e is dropped.

This implies that y ~ N(Xb, σ²I_n).


One-Way ANOVA Model Model Form and Assumptions

One-Way ANOVA Model (assumptions)

The fundamental assumptions of the one-way ANOVA model are:
     1. x_ij and y_i are observed random variables (known constants)
     2. e_i ~ iid N(0, σ²) is an unobserved random variable
     3. b_0, b_1, . . . , b_{g−1} are unknown constants
     4. (y_i | x_i1, . . . , x_i(g−1)) ~ ind N(b_0 + ∑_{j=1}^{g−1} b_j x_ij, σ²)
        note: homogeneity of variance

The interpretation of b_j depends on the coding scheme:
     Dummy: b_j is the difference between the j-th mean and the reference mean
     Effect: b_j is the difference between the j-th mean and the overall mean


One-Way ANOVA Model Estimation and Basic Inference

Ordinary Least Squares (cell means form)

We want to find the factor level mean estimates (i.e., the µ_j terms) that minimize the ordinary least squares criterion

SSE = ∑_{j=1}^{g} ∑_{i=1}^{n_j} (y_ij − µ_j)²

The least-squares estimates are the factor level sample means

µ̂_j = ȳ_j

where ȳ_j = (1/n_j) ∑_{i=1}^{n_j} y_ij is the sample mean of Y for the j-th factor level.
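For instance, using the memory-example data that appears later in these notes, the cell-means estimates are just the per-group sample means (illustrative Python, not part of the original notes):

```python
# the memory-example data from later in the notes
groups = {
    "fast":   [23, 22, 18, 15, 29, 30, 23, 16, 19, 17],
    "normal": [27, 28, 33, 19, 25, 29, 36, 30, 26, 21],
    "slow":   [23, 24, 21, 25, 19, 24, 22, 17, 20, 23],
}

# OLS estimate of each factor-level mean is the sample mean of that level
mu_hat = {lvl: sum(vals) / len(vals) for lvl, vals in groups.items()}
print(mu_hat)  # {'fast': 21.2, 'normal': 27.4, 'slow': 21.8}
```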


One-Way ANOVA Model Estimation and Basic Inference

Ordinary Least Squares (simple proof)

Note that we want to minimize

(SSE)_j = ∑_{i=1}^{n_j} (y_ij − µ_j)² = ∑_{i=1}^{n_j} y²_ij − 2µ_j ∑_{i=1}^{n_j} y_ij + n_j µ²_j

separately for each j ∈ {1, . . . , g}.

Taking the derivative with respect to µ_j we have

d(SSE)_j / dµ_j = −2 ∑_{i=1}^{n_j} y_ij + 2 n_j µ_j

and setting to zero and solving for µ_j gives µ̂_j = (1/n_j) ∑_{i=1}^{n_j} y_ij = ȳ_j


One-Way ANOVA Model Estimation and Basic Inference

Ordinary Least Squares (general case)

In general, we can use the regression approach

SSE = ∑_{i=1}^{n} (y_i − b_0 − ∑_{j=1}^{g−1} b_j x_ij)² = ‖y − Xb‖²

where i ∈ {1, . . . , n} and n = ∑_{j=1}^{g} n_j; note that the second subscript on Y is now dropped because there is only one summation.

The OLS solution has the form

b̂ = (X′X)⁻¹X′y

which is the same as in the MLR model; note that ANOVA is MLR with categorical predictors!
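The normal-equations solution b̂ = (X′X)⁻¹X′y can be checked numerically. The sketch below (illustrative Python; the small linear-algebra helpers are hand-rolled for self-containment, not a library API) dummy-codes the memory data from later in the notes and recovers the coefficients reported in the lm() output there:

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

# memory-example data, dummy coded with "fast" as the reference level
y = ([23, 22, 18, 15, 29, 30, 23, 16, 19, 17]      # fast
     + [27, 28, 33, 19, 25, 29, 36, 30, 26, 21]    # normal
     + [23, 24, 21, 25, 19, 24, 22, 17, 20, 23])   # slow
X = [[1, 0, 0]] * 10 + [[1, 1, 0]] * 10 + [[1, 0, 1]] * 10

Xt = transpose(X)
XtX = matmul(Xt, X)
Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
b = solve(XtX, Xty)
print([round(v, 4) for v in b])  # [21.2, 6.2, 0.6]
```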


One-Way ANOVA Model Estimation and Basic Inference

Fitted Values and Residuals

SCALAR FORM:

Fitted values are given by

ŷ_i = b̂_0 + ∑_{j=1}^{g−1} b̂_j x_ij

and residuals are given by

ê_i = y_i − ŷ_i

MATRIX FORM:

Fitted values are given by

ŷ = Xb̂

and residuals are given by

ê = y − ŷ


One-Way ANOVA Model Estimation and Basic Inference

ANOVA Sums-of-Squares: Scalar Form

In the one-way ANOVA model, the relevant sums-of-squares are

Total:   SST = ∑_{i=1}^{n} (y_i − ȳ)² = ∑_{j=1}^{g} ∑_{i=1}^{n_j} (y_ij − ȳ)²
Between: SSB = ∑_{i=1}^{n} (ŷ_i − ȳ)² = ∑_{j=1}^{g} n_j (ȳ_j − ȳ)²
Within:  SSW = ∑_{i=1}^{n} (y_i − ŷ_i)² = ∑_{j=1}^{g} ∑_{i=1}^{n_j} (y_ij − ȳ_j)²

The corresponding degrees of freedom are
     SST: df_T = n − 1
     SSB: df_B = g − 1
     SSW: df_W = n − g
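These three sums-of-squares, and the partition SST = SSB + SSW, can be verified numerically on the memory-example data from later in these notes (illustrative Python):

```python
groups = [
    [23, 22, 18, 15, 29, 30, 23, 16, 19, 17],   # fast
    [27, 28, 33, 19, 25, 29, 36, 30, 26, 21],   # normal
    [23, 24, 21, 25, 19, 24, 22, 17, 20, 23],   # slow
]
y = [v for grp in groups for v in grp]
n = len(y)
grand = sum(y) / n  # grand mean

sst = sum((v - grand) ** 2 for v in y)
ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
ssw = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

print(round(sst, 4), round(ssb, 4), round(ssw, 4))  # partition: SST = SSB + SSW
```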


One-Way ANOVA Model Estimation and Basic Inference

ANOVA Sums-of-Squares: Matrix Form

In MLR models, the relevant sums-of-squares are

SST = ∑_{i=1}^{n} (y_i − ȳ)² = y′[I_n − (1/n)J]y

SSB = ∑_{i=1}^{n} (ŷ_i − ȳ)² = y′[H − (1/n)J]y

SSW = ∑_{i=1}^{n} (y_i − ŷ_i)² = y′[I_n − H]y

Note: H = X(X′X)⁻¹X′ and J is an n × n matrix of ones.

One-Way ANOVA Model Estimation and Basic Inference

Partitioning the Variance (same as MLR model)

We can partition the total variation in y_i as

SST = ∑_{i=1}^{n} (y_i − ȳ)²
    = ∑_{i=1}^{n} (y_i − ŷ_i + ŷ_i − ȳ)²
    = ∑_{i=1}^{n} (ŷ_i − ȳ)² + ∑_{i=1}^{n} (y_i − ŷ_i)² + 2 ∑_{i=1}^{n} (ŷ_i − ȳ)(y_i − ŷ_i)
    = SSB + SSW + 2 ∑_{i=1}^{n} (ŷ_i − ȳ) ê_i
    = SSB + SSW

See the MLR notes for the proof.


One-Way ANOVA Model Estimation and Basic Inference

Estimated Error Variance (Mean Squared Error)

An unbiased estimate of the error variance σ² is

σ̂² = SSW/(n − g)
   = ∑_{i=1}^{n} (y_i − ŷ_i)²/(n − g)
   = (1/(n − g)) ∑_{j=1}^{g} (n_j − 1) s²_j

where s²_j = (1/(n_j − 1)) ∑_{i=1}^{n_j} (y_ij − ȳ_j)² is the j-th factor level's sample variance.

The estimate σ̂² is the mean squared error (MSE) of the model, and is the pooled estimate of the sample variance.


One-Way ANOVA Model Estimation and Basic Inference

ANOVA Table and Overall F Test

We typically organize the SS information into an ANOVA table:

Source   SS                          df      MS    F    p-value
SSB      ∑_{i=1}^{n} (ŷ_i − ȳ)²     g − 1   MSB   F∗   p∗
SSW      ∑_{i=1}^{n} (y_i − ŷ_i)²    n − g   MSW
SST      ∑_{i=1}^{n} (y_i − ȳ)²     n − 1

where MSB = SSB/(g − 1), MSW = SSW/(n − g), F∗ = MSB/MSW ~ F_{g−1,n−g}, and p∗ = P(F_{g−1,n−g} > F∗).

The F∗-statistic and p∗-value test H0: b_1 = · · · = b_{g−1} = 0 versus H1: b_k ≠ 0 for some k ∈ {1, . . . , g − 1}.

This is equivalent to testing H0: µ_j = µ ∀j versus H1: not all µ_j are equal.
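Putting the pieces together for the memory data from the next section (illustrative Python; the p-value itself requires the F CDF, so only the statistic and degrees of freedom are computed here):

```python
groups = [
    [23, 22, 18, 15, 29, 30, 23, 16, 19, 17],   # fast
    [27, 28, 33, 19, 25, 29, 36, 30, 26, 21],   # normal
    [23, 24, 21, 25, 19, 24, 22, 17, 20, 23],   # slow
]
y = [v for grp in groups for v in grp]
n, g = len(y), len(groups)
grand = sum(y) / n

ssb = sum(len(grp) * (sum(grp) / len(grp) - grand) ** 2 for grp in groups)
ssw = sum((v - sum(grp) / len(grp)) ** 2 for grp in groups for v in grp)
msb, msw = ssb / (g - 1), ssw / (n - g)
f_stat = msb / msw  # compare to the F distribution with (g-1, n-g) df
print(round(f_stat, 4), (g - 1, n - g))
```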


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: Data Description

Visual and auditory cues example from Hays (1994) Statistics.

Does lack of visual/auditory synchrony affect memory?

A total of n = 30 college students participate in the memory experiment.
     Watch a video of a person reciting 50 words
     Try to remember the 50 words (record the number correct)

Randomly assign n_j = 10 subjects to one of g = 3 video conditions:
     fast: sound precedes lip movements in video
     normal: sound synced with lip movements in video
     slow: lip movements in video precede sound


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: Descriptive Statistics

Number of correctly remembered words (yij ):

Subject (i)          Fast (j = 1)   Normal (j = 2)   Slow (j = 3)
 1                        23             27               23
 2                        22             28               24
 3                        18             33               21
 4                        15             19               25
 5                        29             25               19
 6                        30             29               24
 7                        23             36               22
 8                        16             30               17
 9                        19             26               20
10                        17             21               23

∑_{i=1}^{10} y_ij        212            274              218
∑_{i=1}^{10} y²_ij      4738           7742             4810


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: OLS Estimation (by hand)

The least-squares estimates of the µ_j are the sample means:

µ̂_1 = ȳ_1 = (1/10) ∑_{i=1}^{10} y_i1 = 212/10 = 21.2

µ̂_2 = ȳ_2 = (1/10) ∑_{i=1}^{10} y_i2 = 274/10 = 27.4

µ̂_3 = ȳ_3 = (1/10) ∑_{i=1}^{10} y_i3 = 218/10 = 21.8


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: OLS Estimation (in R: by hand)

# define response and factor vectors
> sync = c(23,27,23,22,28,24,18,33,21,15,
           19,25,29,25,19,30,29,24,23,36,
           22,16,30,17,19,26,20,17,21,23)
> cond = factor(rep(c("fast","normal","slow"), 10))

# sum of sync for each level of cond
> tapply(sync, cond, sum)
  fast normal   slow
   212    274    218

# sum-of-squares of sync for each level of cond
> sumsq = function(x){ sum(x^2) }
> tapply(sync, cond, sumsq)
  fast normal   slow
  4738   7742   4810


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: OLS Estimation (in R: dummy pt. 1)

> smod = lm(sync ~ cond)
> summary(smod)$coef
            Estimate Std. Error    t value     Pr(>|t|)
(Intercept)     21.2   1.408440 15.0521126 1.183308e-14
condnormal       6.2   1.991835  3.1127073 4.350803e-03
condslow         0.6   1.991835  0.3012297 7.655470e-01

Note that. . .
     b̂_0 = ȳ_1 = 21.2 is the mean of the fast condition
     b̂_1 = ȳ_2 − ȳ_1 = 27.4 − 21.2 = 6.2 is the difference between the means of the normal and fast conditions
     b̂_2 = ȳ_3 − ȳ_1 = 21.8 − 21.2 = 0.6 is the difference between the means of the slow and fast conditions


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: OLS Estimation (in R: dummy pt. 2)

> contrasts(cond)
       normal slow
fast        0    0
normal      1    0
slow        0    1
> contrasts(cond) <- contr.treatment(3, base=2)
> contrasts(cond)
       1 3
fast   1 0
normal 0 0
slow   0 1
> smod = lm(sync ~ cond)
> summary(smod)$coef
            Estimate Std. Error   t value     Pr(>|t|)
(Intercept)     27.4   1.408440 19.454146 2.052094e-17
cond1           -6.2   1.991835 -3.112707 4.350803e-03
cond3           -5.6   1.991835 -2.811478 9.072381e-03

Note that. . .
     b̂_0 = ȳ_2 = 27.4 is the mean of the normal condition
     b̂_1 = ȳ_1 − ȳ_2 = 21.2 − 27.4 = −6.2 is the difference between the means of the fast and normal conditions
     b̂_2 = ȳ_3 − ȳ_2 = 21.8 − 27.4 = −5.6 is the difference between the means of the slow and normal conditions


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: OLS Estimation (in R: effect)

> contrasts(cond) <- contr.sum(3)
> contrasts(cond)
       [,1] [,2]
fast      1    0
normal    0    1
slow     -1   -1
> smod = lm(sync ~ cond)
> summary(smod)$coef
             Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 23.466667  0.8131633 28.858492 7.900265e-22
cond1       -2.266667  1.1499866 -1.971037 5.904989e-02
cond2        3.933333  1.1499866  3.420330 2.003601e-03

Note that. . .
     b̂_0 = ȳ = 23.467 is the grand mean (ȳ = (212 + 274 + 218)/30 = 704/30 = 23.467)
     b̂_1 = ȳ_1 − ȳ = 21.2 − 23.467 = −2.267 is the difference between the mean of the fast condition and the overall mean
     b̂_2 = ȳ_2 − ȳ = 27.4 − 23.467 = 3.933 is the difference between the mean of the normal condition and the overall mean
     Implicitly, b̂_3 = −(b̂_1 + b̂_2) = −(3.933 − 2.267) = ȳ_3 − ȳ = 21.8 − 23.467 = −1.667 is the difference between the mean of the slow condition and the overall mean


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: Sums-of-Squares (by hand)

Defining n = ∑_{j=1}^{g} n_j = 30, the relevant sums-of-squares are

SST = ∑_{j=1}^{g} ∑_{i=1}^{n_j} (y_ij − ȳ)² = ∑_{j=1}^{g} ∑_{i=1}^{n_j} y²_ij − (1/n)(∑_{j=1}^{g} ∑_{i=1}^{n_j} y_ij)²
    = (4738 + 7742 + 4810) − (704²/30) = 769.4667

SSW = ∑_{j=1}^{g} ∑_{i=1}^{n_j} (y_ij − ȳ_j)² = ∑_{j=1}^{g} ∑_{i=1}^{n_j} y²_ij − ∑_{j=1}^{g} (∑_{i=1}^{n_j} y_ij)²/n_j
    = (4738 + 7742 + 4810) − ([212² + 274² + 218²]/10) = 535.6

SSB = SST − SSW = 769.4667 − 535.6 = 233.8667


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: ANOVA Table (by hand)

Putting the sums-of-squares from the previous slide into an ANOVA table:

Source   SS         df    MS         F        p-value
SSB      233.8667    2    116.9333   5.8947   0.0075
SSW      535.6000   27     19.8370
SST      769.4667   29

Note that F∗ = 5.8947 ~ F_{2,27} and P(F_{2,27} > F∗) = 0.0075.

Assuming a typical α level (e.g., α = 0.01 or α = 0.05), we would reject the null hypothesis H0: µ_j = µ ∀j.

We conclude that there is some mean difference in the response variable (number of remembered words) between the different conditions.


One-Way ANOVA Model Memory Example (Part 1)

Memory Example: ANOVA Table (in R)

> sync = c(23,27,23,22,28,24,18,33,21,15,
           19,25,29,25,19,30,29,24,23,36,
           22,16,30,17,19,26,20,17,21,23)
> cond = factor(rep(c("fast","normal","slow"), 10))
> smod = lm(sync ~ cond)
> anova(smod)
Analysis of Variance Table

Response: sync
          Df Sum Sq Mean Sq F value   Pr(>F)
cond       2 233.87 116.933  5.8947 0.007513 **
Residuals 27 535.60  19.837
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Multiple Comparisons


Multiple Comparisons Comparing Means

Limitations of Overall F Test

If we reject H0: µ_j = µ ∀j, then we know that there are some mean differences, but we do not know where the mean differences exist.

This assumes that g > 2; note that if g = 2 we have an independent samples t test.

If g > 2, we need to perform follow-up tests (multiple comparisons) to determine where the mean differences are occurring in the data.


Multiple Comparisons Comparing Means

Linear Combinations of Factor Level Means

A linear combination L of the factor level means has the form

L = ∑_{j=1}^{g} c_j µ_j

where the c_j are the coefficients defining the particular linear combination.

In practice we never know the µ_j, so we define

L̂ = ∑_{j=1}^{g} c_j µ̂_j

where µ̂_j is our least-squares estimate of µ_j.


Multiple Comparisons Comparing Means

Testing Linear Combinations

Remember that µ̂_j = ȳ_j, which implies that

V(µ̂_j) = V((1/n_j) ∑_{i=1}^{n_j} y_ij) = (1/n²_j) V(∑_{i=1}^{n_j} y_ij) = (1/n²_j) ∑_{i=1}^{n_j} V(y_ij) = n_j σ²/n²_j = σ²/n_j

is the variance of µ̂_j, given that V(y_ij) = σ² from the assumptions.

This implies that

V(L̂) = V(∑_{j=1}^{g} c_j µ̂_j) = ∑_{j=1}^{g} c²_j V(µ̂_j) = σ² ∑_{j=1}^{g} c²_j/n_j

is the variance of any linear combination L̂.
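As a numeric check (illustrative Python), plugging the memory example's MSE into this formula for the pairwise contrast c = (1, −1, 0) reproduces the standard error reported for the dummy-coded coefficients in the lm() output:

```python
import math

mse = 535.6 / 27            # MSE (sigma-hat squared) from the memory example
n_j = [10, 10, 10]          # per-group sample sizes
c = [1, -1, 0]              # coefficients for the pairwise comparison mu_1 - mu_2

var_L = mse * sum(cj ** 2 / nj for cj, nj in zip(c, n_j))
se_L = math.sqrt(var_L)
print(round(se_L, 4))       # matches the 1.991835 std. errors in the lm() output
```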


Multiple Comparisons Comparing Means

Testing Linear Combinations (continued)

To test H0: L = L∗ versus H1: L ≠ L∗, use

t∗ = (L̂ − L∗)/√V̂(L̂) ~ t_{n−g}

where V̂(L̂) = σ̂² ∑_{j=1}^{g} c²_j/n_j uses the MSE to estimate σ².

If the population σ² is known, then use

Z∗ = (L̂ − L∗)/√V(L̂) ~ N(0, 1)


Multiple Comparisons Comparing Means

Contrasts and Pairwise Comparisons

A contrast is a linear combination of the factor level means such that the coefficients sum to zero, i.e., ∑_{j=1}^{g} c_j = 0.
     µ_1 − µ_2 is the mean difference of the first two levels
     (µ_1 + µ_2)/2 − µ_3 is the mean of the first two levels minus the third level

A pairwise comparison is a contrast involving two factor level means:
     µ_1 − µ_2 is a pairwise comparison of the first two levels
     µ_1 − µ_3 is a pairwise comparison of the first and third levels
     µ_2 − µ_3 is a pairwise comparison of the second and third levels


Multiple Comparisons Comparing Means

Multiple Comparison Problem

If we test multiple linear combinations of the factor level means, we need to worry about the Familywise Type I Error Rate (FWER).

The FWER is the probability of making at least one Type I Error among all tested linear combinations.

Single test Type I Error = P(Reject H0 | H0 true) = α

For q independent tests with level α: FWER = 1 − (1 − α)^q

In general, the FWER depends on the number of tests and on whether or not the tests are independent of one another.


Multiple Comparisons Bonferroni Correction

Bonferroni’s Correction: Definition

Suppose we want to test f linear combinations of factor level means.

According to Boole's inequality, for f tests each with level α∗,

FWER ≤ ∑_{k=1}^{f} P(Reject H0k | H0k true) = ∑_{k=1}^{f} α∗ = f α∗

regardless of whether or not the tests are independent of one another.

Bonferroni's correction sets α∗ = α/f to ensure that FWER ≤ α.


Multiple Comparisons Bonferroni Correction

Bonferroni’s Correction: Properties

Major strength: applicable to many situations (no assumptions)

Major weakness: overly conservative in some cases

Suppose we have f = 3 independent tests and want FWER ≤ 0.05
     FWER = 1 − (1 − α∗)³
     Bonferroni: α∗ = 0.05/3 = 0.0167
     FWER = 1 − (1 − 0.0167)³ = 0.04917

Suppose we have f = 10 independent tests and want FWER ≤ 0.05
     FWER = 1 − (1 − α∗)^10
     Bonferroni: α∗ = 0.05/10 = 0.005
     FWER = 1 − (1 − 0.005)^10 = 0.0489
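The two numeric examples above can be reproduced directly (illustrative Python; fwer_independent is a hypothetical helper implementing the independent-tests formula):

```python
def fwer_independent(alpha_star, f):
    """FWER for f independent tests, each run at level alpha_star."""
    return 1 - (1 - alpha_star) ** f

for f in (3, 10):
    a_star = 0.05 / f  # Bonferroni-adjusted per-test level
    print(f, round(fwer_independent(a_star, f), 5))  # stays just below 0.05
```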


Multiple Comparisons Tukey Correction

Detour: Studentized Range Distribution

Assume the following. . .
     z_k ~ iid N(µ, σ²) for k ∈ {1, . . . , r}
     σ̂² is an estimate of σ² based on ν degrees of freedom
     σ̂² is independent of the z_k for k ∈ {1, . . . , r}

The studentized range statistic is defined as

q_{r,ν} = range(z_k)/σ̂

where range(z_k) = max(z_k) − min(z_k), and (r, ν) are the numerator and denominator degrees of freedom.

It has a one-sided probability distribution (similar to F), so we only reject if the observed q > q^{(α)}_{r,ν}, where P(q_{r,ν} > q^{(α)}_{r,ν}) = α.


Multiple Comparisons Tukey Correction

Testing All Possible Pairwise Comparisons

Want to test all possible pairwise comparisons between means:
H0: µj − µk = 0 ∀ j, k versus H1: not all µj − µk = 0
There are g(g − 1)/2 unique pairwise comparisons.

Note that the variance of a pairwise comparison L = µj − µk is

V(L̂) = σ^2 (1/nj + 1/nk)

where nj and nk are the sample sizes of factor levels j and k.

If nj = n∗ ∀ j, this simplifies to V(L̂) = σ^2 (2/n∗) for any pairwise comparison.


Tukey’s Honest Significant Difference (HSD) Test

Proposed by John Tukey (1953) for balanced ANOVA, i.e., nj = n∗ ∀ j.

The test statistic is defined as

q∗ = √2 L̂ / √V(L̂) = (ȳj − ȳk) / √(σ̂^2/n∗)

where L̂ = ȳj − ȳk, V(L̂) = σ̂^2 (2/n∗), and σ̂^2 is the MSE of the model.

Considering all pairwise comparisons, q∗ ∼ q_{g,n−g}, where q_{g,n−g} is the studentized range distribution with (g, n − g) degrees of freedom.

Note that ȳj ∼ N(µj, σ^2/n∗) for all j ∈ {1, …, g}.
Under H0: µj − µk = 0 ∀ j, k, we have ȳj ∼ N(µ, σ^2/n∗) ∀ j.


Tukey’s HSD Test (continued)

To form a 100(1 − α)% CI around the mean difference ȳj − ȳk, use

(ȳj − ȳk) ± (q^{(α)}_{g,n−g}/√2) √V(L̂)

where q^{(α)}_{g,n−g} is the critical value from the studentized range distribution.

If you form all possible CIs around pairwise mean differences, you will control the FWER exactly at level α using Tukey's HSD test.

More conservative than forming all CIs using t_{n−g} critical values.
Example 95% CI: q^{(0.95)}_{3,27}/√2 = 2.48 and t^{(0.975)}_{27} = 2.05.


Tukey-Kramer Test

In unbalanced ANOVA, i.e., nj ≠ nk for at least one pair (j, k), use the HSD extension proposed by John Tukey (1953) and Clyde Kramer (1956).

Called the Tukey-Kramer test or Tukey-Kramer procedure.

To form a 100(1 − α)% CI around the mean difference ȳj − ȳk, use

(ȳj − ȳk) ± (q^{(α)}_{g,n−g}/√2) √V(L̂)

where V(L̂) = σ̂^2 (1/nj + 1/nk) is the estimated variance of L = µj − µk.

If you form all possible CIs around pairwise mean differences, you will control the FWER below (but not exactly at) level α using Tukey-Kramer.

See Hayter (1984) for a formal proof of the Tukey-Kramer procedure's conservativeness.
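A sketch of the Tukey-Kramer interval for one pair of unbalanced groups. The sample sizes below (8 and 10) are hypothetical, chosen only to illustrate the unequal-n variance formula; the means and MSE are borrowed from the memory example later in the notes:

```python
from math import sqrt
from scipy.stats import studentized_range

def tukey_kramer_ci(mean_j, mean_k, n_j, n_k, mse, g, n, alpha=0.05):
    """100(1-alpha)% Tukey-Kramer CI for mu_j - mu_k with unequal group sizes."""
    diff = mean_j - mean_k
    var_l = mse * (1 / n_j + 1 / n_k)            # estimated Var(L-hat)
    q = studentized_range.ppf(1 - alpha, g, n - g)
    half = (q / sqrt(2)) * sqrt(var_l)           # Tukey-Kramer half-width
    return diff - half, diff + half

# Hypothetical unbalanced layout: g = 3 groups, n = 27 total observations.
lo, hi = tukey_kramer_ci(21.2, 27.4, 8, 10, 19.837, g=3, n=27)
print(round(lo, 3), round(hi, 3))
```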


Multiple Comparisons Scheffé Correction

Testing All Possible Contrasts

Want to test all possible contrasts between the factor level means:
H0: Σ_{j=1}^g cj µj = 0 for all c ∈ C, where C = {c = (c1, …, cg) : Σ_{j=1}^g cj = 0} is the set of all contrasts
H1: Σ_{j=1}^g cj µj ≠ 0 for some c ∈ C

Note that µj = µ ∀ j ⟺ Σ_{j=1}^g cj µj = 0 for all c ∈ C.

Proof that µj = µ ∀ j =⇒ Σ_{j=1}^g cj µj = 0 for all c ∈ C:
If µj = µ ∀ j, then Σ_{j=1}^g cj µj = µ Σ_{j=1}^g cj = µ(0) = 0 for all c ∈ C.

Proof that Σ_{j=1}^g cj µj = 0 for all c ∈ C =⇒ µj = µ ∀ j:
If Σ_{j=1}^g cj µj = 0 for all c ∈ C, then µj − µk = 0 for all j, k, because each pairwise difference is itself a contrast in C (take cj = 1, ck = −1, and all other coefficients zero).


Contrasts and Overall ANOVA F Test

Remember that the overall F test (associated with the ANOVA table) is testing
H0: µj = µ ∀ j versus H1: not all µj are equal.

This is equivalent to testing H0: Σ_{j=1}^g cj µj = 0 for all c ∈ C versus H1: Σ_{j=1}^g cj µj ≠ 0 for some c ∈ C.

See the previous slide for proof of the equivalence.

Point: if we want to test all possible contrasts, we can use the F_{g−1,n−g} distribution to control the FWER at level α.


Scheffé’s Method

Want to test all possible contrasts between the factor level means:
H0: Σ_{j=1}^g cj µj = 0 for all c ∈ C, where C = {c = (c1, …, cg) : Σ_{j=1}^g cj = 0} is the set of all contrasts
H1: Σ_{j=1}^g cj µj ≠ 0 for some c ∈ C

To form a 100(1 − α)% CI for a contrast L = Σ_{j=1}^g cj µj, use

L̂ ± √((g − 1) F^{(α)}_{g−1,n−g}) √V(L̂)

where V(L̂) = σ̂^2 Σ_{j=1}^g cj^2/nj is the estimated variance of L̂ = Σ_{j=1}^g cj ȳj.

If you form CIs around all possible contrasts, you will control the FWER exactly at level α using Scheffé's method (see the Appendix for a proof).


Multiple Comparisons Summary of Corrections

Summary of Multiple Comparisons

If we want to form 100(1 − α)% CIs around all L = µj − µk, use

L̂ ± C √V(L̂)

where V(L̂) = σ̂^2 (1/nj + 1/nk) and C is some critical value.

Each procedure uses a different critical value:

No correction: C = t^{(α/2)}_{n−g}
Bonferroni: C = t^{(α∗/2)}_{n−g} with α∗ = α/[g(g − 1)/2]
Tukey: C = q^{(α)}_{g,n−g}/√2
Scheffé: C = √((g − 1) F^{(α)}_{g−1,n−g})

Scheffé's critical value is ALWAYS larger than Tukey's, because the set of all pairwise comparisons is a subset of the set of all contrasts.


Choosing Between Multiple Comparisons

You want CIs that are as narrow as possible and control FWER.

If you are interested in all pairwise comparisons, use Tukey-Kramer.

If you are interested in all possible contrasts, use Scheffé.

If you are interested in some subset of all pairwise comparisons (or of all contrasts), Bonferroni may be the most efficient approach.


Multiple Comparisons Memory Example (Part 2)

Memory Example: Pairwise Comparison Estimates

Suppose we want to test all g(g − 1)/2 = 3 unique pairwise comparisons between factor level means:
L1 = µ1 − µ2 = µfast − µnormal
L2 = µ2 − µ3 = µnormal − µslow
L3 = µ3 − µ1 = µslow − µfast

The estimated pairwise comparisons are given by
L̂1 = µ̂1 − µ̂2 = 21.2 − 27.4 = −6.2
L̂2 = µ̂2 − µ̂3 = 27.4 − 21.8 = 5.6
L̂3 = µ̂3 − µ̂1 = 21.8 − 21.2 = 0.6

and we know that V(L̂) = σ̂^2 (2/n∗) = (19.8370)(2/10) = 3.9674.
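The estimates above can be reproduced in a few lines; the group means and MSE come from the slides, while the code itself is only an illustrative sketch:

```python
# Group means (fast, normal, slow), MSE (sigma-hat^2), and common group
# size n* from the memory example.
means = {"fast": 21.2, "normal": 27.4, "slow": 21.8}
mse, n_star = 19.8370, 10

L1 = means["fast"] - means["normal"]   # -6.2
L2 = means["normal"] - means["slow"]   #  5.6
L3 = means["slow"] - means["fast"]     #  0.6
var_L = mse * 2 / n_star               # balanced-design Var(L-hat) = 3.9674
print(round(L1, 1), round(L2, 1), round(L3, 1), round(var_L, 4))
```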


Memory Example: Pairwise Comparison CI Values

If we want to form 95% CIs around all three L = µj − µk, use

L̂ ± C √V(L̂) = L̂ ± C √3.9674

where
C = t^{(.025)}_{27} = 2.0518 with no correction
C = t^{(.008)}_{27} = 2.5525 with the Bonferroni correction (α∗ = .05/3, so α∗/2 ≈ .008)
C = q^{(.05)}_{3,27}/√2 = 2.4794 with the Tukey correction
C = √(2 F^{(.05)}_{2,27}) = 2.5900 with the Scheffé correction

Note that Tukey is best here (i.e., produces the narrowest CIs), followed by Bonferroni, and then Scheffé.
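All four critical values can be checked with SciPy; a sketch (slide values shown in comments, SciPy usage added here for illustration):

```python
from math import sqrt
from scipy.stats import t, f, studentized_range

g, n = 3, 30                 # three speed conditions, 10 subjects each
df = n - g                   # error degrees of freedom = 27
n_tests = g * (g - 1) // 2   # 3 unique pairwise comparisons

C_none = t.ppf(1 - 0.05 / 2, df)                        # slide: 2.0518
C_bonf = t.ppf(1 - 0.05 / (2 * n_tests), df)            # slide: 2.5525
C_tukey = studentized_range.ppf(0.95, g, df) / sqrt(2)  # slide: 2.4794
C_schef = sqrt((g - 1) * f.ppf(0.95, g - 1, df))        # slide: 2.5900

# Uncorrected 95% CI for L1-hat = -6.2 with Var(L-hat) = 3.9674:
lo = -6.2 - C_none * sqrt(3.9674)   # slide: -10.2869
hi = -6.2 + C_none * sqrt(3.9674)   # slide:  -2.1131
print([round(c, 4) for c in (C_none, C_bonf, C_tukey, C_schef)])
```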


Memory Example: Pairwise CIs (no correction)

Using no correction the CI estimates are:

L̂1 ± t^{(.025)}_{27} √V(L̂1) = −6.2 ± 2.0518 √3.9674 = [−10.2869; −2.1131]
L̂2 ± t^{(.025)}_{27} √V(L̂2) = 5.6 ± 2.0518 √3.9674 = [1.5131; 9.6869]
L̂3 ± t^{(.025)}_{27} √V(L̂3) = 0.6 ± 2.0518 √3.9674 = [−3.4869; 4.6869]


Memory Example: Pairwise CIs (Bonferroni)

Using Bonferroni correction the CI estimates are:

L̂1 ± t^{(.008)}_{27} √V(L̂1) = −6.2 ± 2.5525 √3.9674 = [−11.2841; −1.1159]
L̂2 ± t^{(.008)}_{27} √V(L̂2) = 5.6 ± 2.5525 √3.9674 = [0.5159; 10.6841]
L̂3 ± t^{(.008)}_{27} √V(L̂3) = 0.6 ± 2.5525 √3.9674 = [−4.4841; 5.6841]


Memory Example: Pairwise CIs (Tukey)

Using Tukey correction the CI estimates are:

L̂1 ± (q^{(.05)}_{3,27}/√2) √V(L̂1) = −6.2 ± 2.4794 √3.9674 = [−11.1386; −1.2614]
L̂2 ± (q^{(.05)}_{3,27}/√2) √V(L̂2) = 5.6 ± 2.4794 √3.9674 = [0.6614; 10.5386]
L̂3 ± (q^{(.05)}_{3,27}/√2) √V(L̂3) = 0.6 ± 2.4794 √3.9674 = [−4.3386; 5.5386]


Memory Example: Pairwise CIs (Scheffé)

Using Scheffé correction the CI estimates are:

L̂1 ± √(2 F^{(.05)}_{2,27}) √V(L̂1) = −6.2 ± 2.59 √3.9674 = [−11.3589; −1.0411]
L̂2 ± √(2 F^{(.05)}_{2,27}) √V(L̂2) = 5.6 ± 2.59 √3.9674 = [0.4411; 10.7589]
L̂3 ± √(2 F^{(.05)}_{2,27}) √V(L̂3) = 0.6 ± 2.59 √3.9674 = [−4.5589; 5.7589]


Memory Example: Good use of Scheffé

Suppose we want to test four contrasts:
L1 = µ1 − µ2 = µfast − µnormal
L2 = µ2 − µ3 = µnormal − µslow
L3 = µ3 − µ1 = µslow − µfast
L4 = µ2 − (µ1 + µ3)/2 = µnormal − (µslow + µfast)/2

If we want to form 95% CIs around all four Lj, use

L̂j ± C √V(L̂j)

where
C = t^{(.006)}_{27} = 2.6763 using Bonferroni (α∗ = .05/4 = .0125, so α∗/2 ≈ .006)
C = √(2 F^{(.05)}_{2,27}) = 2.59 using Scheffé


Memory Example: Good use of Scheffé (continued)

Note that L̂4 = 27.4 − (21.8 + 21.2)/2 = 5.9 and

V(L̂4) = σ̂^2 Σ_{j=1}^3 cj^2/nj = (19.8370)(1/10 + (−1/2)^2/10 + (−1/2)^2/10) = 2.9756

Using the Scheffé correction, the CI estimates are:

L̂1 ± √(2 F^{(.05)}_{2,27}) √V(L̂1) = −6.2 ± 2.59 √3.9674 = [−11.3589; −1.0411]
L̂2 ± √(2 F^{(.05)}_{2,27}) √V(L̂2) = 5.6 ± 2.59 √3.9674 = [0.4411; 10.7589]
L̂3 ± √(2 F^{(.05)}_{2,27}) √V(L̂3) = 0.6 ± 2.59 √3.9674 = [−4.5589; 5.7589]
L̂4 ± √(2 F^{(.05)}_{2,27}) √V(L̂4) = 5.9 ± 2.59 √2.9756 = [1.4322; 10.3678]
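The L4 interval follows the same template as the pairwise intervals; a short check of the contrast estimate, its variance, and the Scheffé CI (an illustrative sketch using the slide's means and MSE):

```python
from math import sqrt
from scipy.stats import f

mse = 19.8370
means = [21.2, 27.4, 21.8]   # fast, normal, slow
c = [-0.5, 1.0, -0.5]        # contrast: normal vs. average of slow and fast
n = [10, 10, 10]

L4 = sum(cj * m for cj, m in zip(c, means))              # slide: 5.9
var_L4 = mse * sum(cj**2 / nj for cj, nj in zip(c, n))   # slide: 2.9756
C = sqrt(2 * f.ppf(0.95, 2, 27))                         # slide: 2.59

lo, hi = L4 - C * sqrt(var_L4), L4 + C * sqrt(var_L4)    # slide: [1.4322; 10.3678]
print(round(lo, 4), round(hi, 4))
```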


Appendix


Scheffé’s Method: Logic

Remember that the overall ANOVA F test has the form

F∗ = MSB/MSW = [Σ_{j=1}^g nj(ȳj − ȳ)^2 / (g − 1)] / σ̂^2 ∼ F_{g−1,n−g} under H0

which implies that

S^2 = SSB/MSW = [Σ_{j=1}^g nj(ȳj − ȳ)^2] / σ̂^2 ∼ (g − 1) F_{g−1,n−g} under H0

Defining the test of a single contrast as T_c = L̂/√V(L̂), note that

sup_{c∈C} T_c^2 = S^2

where sup denotes the supremum (i.e., the least upper bound).


Scheffé’s Method: Proof (part 1)

To prove the claim sup_{c∈C} T_c^2 = S^2, define the n × 1 vector a via

a′ = ((c1/n1) 1′_{n1}, (c2/n2) 1′_{n2}, …, (cg/ng) 1′_{ng})

where the cj are contrast coefficients and 1_{nj} is an nj × 1 vector of ones.

Define y′ = (y′1, …, y′g) where y′j = (y_{1j}, …, y_{nj,j}), and note that

a′y = Σ_{j=1}^g (cj/nj) 1′_{nj} yj = Σ_{j=1}^g cj ȳj = L̂
‖a‖^2 = a′a = Σ_{j=1}^g (cj/nj)^2 1′_{nj} 1_{nj} = Σ_{j=1}^g cj^2/nj

which implies that

T_c^2 = (a′y)^2 / (σ̂^2 ‖a‖^2)

for any contrast c ∈ C.


Scheffé’s Method: Proof (part 2)

Now note that a = Xb, where X is the n × g indicator matrix

X =
[ 1_{n1} 0_{n1} · · · 0_{n1} ]
[ 0_{n2} 1_{n2} · · · 0_{n2} ]
[   ⋮      ⋮    ⋱     ⋮    ]
[ 0_{ng} 0_{ng} · · · 1_{ng} ]

and b = (c1/n1, c2/n2, …, cg/ng)′, which implies that a = Ha, where H = X(X′X)^{−1}X′ is the hat matrix.

Also note that a′1n = Σ_{j=1}^g (cj/nj) 1′_{nj} 1_{nj} = Σ_{j=1}^g cj = 0, which implies that

a = (H − H0)a

where H0 = (1/n) 1n 1′n is the projection matrix for the constant space.


Scheffé’s Method: Proof (part 3)

Putting things together, we can write the single contrast test as

T_c^2 = (a′y)^2 / (σ̂^2 ‖a‖^2) = [a′(H − H0)y]^2 / (σ̂^2 ‖a‖^2)

for any contrast c ∈ C, given that a′ = a′(H − H0).

By the Cauchy-Schwarz inequality, we know that

(u′v)^2 ≤ ‖u‖^2 ‖v‖^2

with equality holding only when u = wv for some w ≠ 0.


Scheffé’s Method: Proof (part 4)

Letting u = a and v = (H − H0)y, we have

T_c^2 = [a′(H − H0)y]^2 / (σ̂^2 ‖a‖^2) ≤ ‖a‖^2 ‖(H − H0)y‖^2 / (σ̂^2 ‖a‖^2) = ‖(H − H0)y‖^2 / σ̂^2 = S^2

If we define a = (H − H0)y, then T_c^2 reaches its upper bound of S^2.

To control the FWER at level α, note that

P(T_c^2 ≤ S^2 ∀ c ∈ C) = P(sup_{c∈C} T_c^2 ≤ S^2)

and we know that S^2 ∼ (g − 1) F_{g−1,n−g} under H0.

Use S = √((g − 1) F^{(α)}_{g−1,n−g}) to form the 100(1 − α)% CI for T_c.