Lecture 14: ANOVA and the F-test

S. Massa, Department of Statistics, University of Oxford

3 February 2016

Example

Consider a study of 973 individuals that examines the relationship between duration of breastfeeding and adult intelligence. Each individual had to perform 3 tests, and breastfeeding duration was recorded in 5 classes.

                                  Duration of Breastfeeding (months)
Test                             ≤ 1     2-3     4-6     7-9     > 9
N                                272     305     269     104      23
Verbal IQ        Adj. Mean      99.7   102.3   102.7   105.7   103.0
                 SD             16.0    14.9    15.7    13.3    15.2
Performance IQ   Adj. Mean      99.1   100.6   101.3   105.1   104.4
                 SD             15.8    15.2    15.6    13.9    14.9
Full Scale IQ    Adj. Mean      99.4   101.7   102.3   106.0   104.0
                 SD             15.9    15.2    15.7    13.1    14.4

Example

• First of all, notice that we list adjusted means.
• This means that the actual data have been analysed to remove the effects of confounding factors (mother's smoking, parents' income, etc.), so that the effect of breastfeeding can be isolated.
• Test whether the duration of breastfeeding affects adult intelligence.

The General Setup

• Suppose we have independent samples from K different normal distributions, with means $\mu_1, \dots, \mu_K$ and common variance $\sigma^2$. Test $H_0 : \mu_1 = \dots = \mu_K$.
• We call the K groups levels.
• We have $n_i$ samples from the i-th level, $X_{i1}, X_{i2}, \dots, X_{i n_i}$, and $N = \sum_{i=1}^{K} n_i$ total observations.
• The overall sample mean is $\bar{X}$, while the sample mean of group i is $\bar{X}_i$:

$$\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}, \qquad \bar{X} = \frac{1}{N} \sum_{i,j} X_{ij}.$$

The Idea Behind ANOVA

• If the K means are all equal, then the observations should be about as far from their own level mean $\bar{X}_i$ as they are from the overall mean $\bar{X}$.
• If the means are different, then the observations should be closer to the mean of their own level than to the overall mean.

Between Groups and Error Sum of Squares

The Between Groups Sum of Squares (BSS) is the total squared deviation of the group means from the overall mean, weighted by group size:

$$\mathrm{BSS} = \sum_{i=1}^{K} n_i (\bar{X}_i - \bar{X})^2.$$

The Error Sum of Squares (ESS) is the total squared deviation of the samples from their group means:

$$\mathrm{ESS} = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2 = \sum_{i=1}^{K} (n_i - 1) s_i^2,$$

where $s_i$ is the SD of the observations in level i.
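The two sums of squares can be sketched directly from their definitions. The data below are hypothetical toy values (not from the lecture), and the function names are illustrative; the sketch also checks that the two expressions for ESS agree.

```python
from statistics import mean, variance

# Hypothetical toy data: three levels with unequal sizes (not from the lecture).
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 9.0],
    [1.0, 2.0, 3.0, 6.0],
]

def between_ss(groups):
    """BSS: sum over levels of n_i * (level mean - overall mean)^2."""
    overall = mean([x for g in groups for x in g])
    return sum(len(g) * (mean(g) - overall) ** 2 for g in groups)

def error_ss(groups):
    """ESS: squared deviations of observations from their own level mean."""
    total = 0.0
    for g in groups:
        level_mean = mean(g)
        total += sum((x - level_mean) ** 2 for x in g)
    return total

def error_ss_from_variances(groups):
    """Same quantity written as the sum of (n_i - 1) * s_i^2."""
    return sum((len(g) - 1) * variance(g) for g in groups)

bss = between_ss(groups)
ess = error_ss(groups)
ess_alt = error_ss_from_variances(groups)
print(bss, ess, ess_alt)
```

Working from the variances is convenient when only summary statistics (sizes, means, SDs) are available rather than the raw observations.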

Total Sum of Squares

The Total Sum of Squares (TSS) is the total squared deviation of the samples from the overall mean:

$$\mathrm{TSS} = \sum_{i,j} (X_{ij} - \bar{X})^2 = (N - 1) s^2,$$

where s is the sample SD of all observations together. We also have the mean squares

$$\mathrm{BMS} = \frac{\mathrm{BSS}}{K - 1}, \qquad \mathrm{EMS} = \frac{\mathrm{ESS}}{N - K}.$$

ANOVA

Two basic mathematical facts lie behind ANOVA.

• First,

$$\mathrm{TSS} = \mathrm{ESS} + \mathrm{BSS}.$$

  The variability among the data can be split into two pieces:
  1. the variability among the means of the groups;
  2. the variability within the groups.

  We evaluate how the total variability is split between the two types: too much between-group variability casts doubt on the validity of the null.

• Second, both EMS and BMS are estimates of $\sigma^2$ when the null holds (EMS estimates $\sigma^2$ whether or not the means are equal).
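The decomposition of the total sum of squares follows from adding and subtracting the level mean inside the square. A short derivation, using only the definitions of TSS, ESS and BSS from the lecture:

```latex
\begin{align*}
\mathrm{TSS}
  &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2
   = \sum_{i=1}^{K} \sum_{j=1}^{n_i}
     \bigl[(X_{ij} - \bar{X}_i) + (\bar{X}_i - \bar{X})\bigr]^2 \\
  &= \underbrace{\sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2}_{\mathrm{ESS}}
   + \underbrace{\sum_{i=1}^{K} n_i (\bar{X}_i - \bar{X})^2}_{\mathrm{BSS}}
   + 2 \sum_{i=1}^{K} (\bar{X}_i - \bar{X}) \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i) \\
  &= \mathrm{ESS} + \mathrm{BSS}.
\end{align*}
```

The cross term vanishes because, within each level, the deviations from the level mean sum to zero: $\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i) = 0$.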

The F-statistic

The F-statistic is

$$F = \frac{\mathrm{BMS}}{\mathrm{EMS}} = \frac{N - K}{K - 1} \cdot \frac{\mathrm{BSS}}{\mathrm{ESS}}.$$

The critical region is of the form $\{F \ge f\}$, where f depends on the significance level of the test.

Essentially, we reject the null hypothesis for large values of F (that is, when BMS is much bigger than EMS).

The F-distribution

Under the null hypothesis, the F-statistic has the F-distribution with $(K - 1, N - K)$ degrees of freedom.

This is a continuous distribution on the positive real numbers with two parameters.

Figure: The probability density function of the F-distribution for various degrees of freedom.
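One way to see the null distribution at work is a small simulation: draw all levels from the same normal distribution, compute F each time, and count how often it exceeds the 5% critical value. The sketch below assumes an arbitrary design of 3 levels with 10 observations each, and obtains the critical value from the closed-form tail of the F-distribution that holds only when the numerator has 2 degrees of freedom; both choices are assumptions made for illustration.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F = BMS / EMS for a list of samples, one list per level."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    overall = sum(sum(g) for g in groups) / n_total
    bss = 0.0
    ess = 0.0
    for g in groups:
        level_mean = sum(g) / len(g)
        bss += len(g) * (level_mean - overall) ** 2
        ess += sum((x - level_mean) ** 2 for x in g)
    return (bss / (k - 1)) / (ess / (n_total - k))

k, n_per_level = 3, 10          # assumed design: 3 levels of 10 observations
d2 = k * n_per_level - k        # error degrees of freedom, N - K = 27
alpha = 0.05
# 5% critical value of F(2, d2), from the closed-form tail
# P(F > f) = (1 + 2 f / d2) ** (-d2 / 2), valid for numerator d.f. = 2 only.
critical = (d2 / 2) * (alpha ** (-2 / d2) - 1)

rng = random.Random(0)          # fixed seed so the run is repeatable
n_sims = 10_000
exceed = 0
for _ in range(n_sims):
    # Null hypothesis: every level has the same normal distribution.
    sample = [[rng.gauss(0.0, 1.0) for _ in range(n_per_level)] for _ in range(k)]
    if f_statistic(sample) >= critical:
        exceed += 1

rejection_rate = exceed / n_sims
print(round(critical, 2), rejection_rate)
```

If the F-distribution claim holds, the printed rejection rate should be close to 0.05, up to simulation noise.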

The F-statistic

The important quantities are summarised in the following table:

                          SS     d.f.     MS                     F
Between Groups            BSS    K − 1    BMS = BSS/(K − 1)      BMS/EMS
Within Groups (Errors)    ESS    N − K    EMS = ESS/(N − K)
Total                     TSS    N − 1

ANOVA: the Breastfeeding Study

Recall the data from the breastfeeding study:

                                  Duration of Breastfeeding (months)
Test                             ≤ 1     2-3     4-6     7-9     > 9
N                                272     305     269     104      23
Verbal IQ        Adj. Mean      99.7   102.3   102.7   105.7   103.0
                 SD             16.0    14.9    15.7    13.3    15.2
Performance IQ   Adj. Mean      99.1   100.6   101.3   105.1   104.4
                 SD             15.8    15.2    15.6    13.9    14.9
Full Scale IQ    Adj. Mean      99.4   101.7   102.3   106.0   104.0
                 SD             15.9    15.2    15.7    13.1    14.4

Find TSS via TSS = ESS + BSS.

Example: the Breastfeeding Study

For the Full Scale IQ scores:

$$\mathrm{ESS} = \sum_{k=1}^{5} (n_k - 1) s_k^2 = 271 \cdot 15.9^2 + 304 \cdot 15.2^2 + 268 \cdot 15.7^2 + 103 \cdot 13.1^2 + 22 \cdot 14.4^2 \approx 227000.$$

Example: the Breastfeeding Study

For the Full Scale IQ scores, with overall mean $\bar{x} \approx 101.7$:

$$\mathrm{BSS} = \sum_{k=1}^{5} n_k (\bar{x}_k - \bar{x})^2 = 272 \cdot (99.4 - 101.7)^2 + 305 \cdot (101.7 - 101.7)^2 + 269 \cdot (102.3 - 101.7)^2 + 104 \cdot (106.0 - 101.7)^2 + 23 \cdot (104.0 - 101.7)^2 \approx 3579,$$

where the total is evaluated with the unrounded overall mean $\bar{x} = 101.74$.

Example: the Breastfeeding Study

We complete the table as follows.

Table: ANOVA table for breastfeeding data: Full Scale IQ, Adjusted.

                   SS                           d.f.    MS                        F
Between Samples    3579                         4       894.8 (= 3579/4)          3.81 (= 894.8/234.6)
Within Samples     227000                       968     234.6 (= 227000/968)
Total              230600 (≈ 3579 + 227000)     972

Since N = 973 and K = 5, under the null the F-statistic is distributed according to F(4, 968).

Example: the Breastfeeding Study

Having computed F = 3.81, we now look up the critical value in our table for the 0.05 level.

K − 1 = 4, so we pick the fourth column; N − K = 968 is much more than 60, so we take the bottom row.

The critical value turns out to be 2.37. Since 3.81 > 2.37, we reject the null hypothesis at the 0.05 level.
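The whole calculation for the Full Scale IQ row can be reproduced from the summary statistics alone. The sketch below is one way to do it; the variable names are illustrative, and the critical value 2.37 is taken from the lecture rather than computed.

```python
# Full Scale IQ summary statistics as listed in the lecture's table.
sizes = [272, 305, 269, 104, 23]
means = [99.4, 101.7, 102.3, 106.0, 104.0]
sds = [15.9, 15.2, 15.7, 13.1, 14.4]

n_total = sum(sizes)            # N = 973
k = len(sizes)                  # K = 5

# The overall mean is the size-weighted average of the level means.
overall = sum(n * m for n, m in zip(sizes, means)) / n_total

ess = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
bss = sum(n * (m - overall) ** 2 for n, m in zip(sizes, means))

bms = bss / (k - 1)
ems = ess / (n_total - k)
f_stat = bms / ems

critical_value = 2.37  # 5% point for (4, large) d.f., as quoted in the lecture
reject = f_stat >= critical_value
print(round(ess), round(bss), round(bms, 1), round(f_stat, 2), reject)
```

Using the unrounded overall mean (about 101.74) rather than 101.7 gives BSS of about 3579, which matches BMS ≈ 894.8 and F ≈ 3.81.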

The Kruskal-Wallis Test

• The F-test has one basic assumption: the samples are assumed to be normally distributed; that is, the F-test is parametric.
• The non-parametric version is known as the Kruskal-Wallis test.
• As with the rank sum test, the basic idea is to substitute the ranks for the actual observed values.

The Kruskal-Wallis test

• Suppose there are K levels, with $n_i$ observations in level i.
• Assign to each observation its rank relative to the whole sample.
• Sum the ranks in each group, giving rank sums $R_1, \dots, R_K$.

The Kruskal-Wallis test statistic is

$$H = \frac{12}{N(N + 1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N + 1). \tag{1}$$

Under the null hypothesis, H approximately follows the $\chi^2_{K-1}$ distribution.
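Formula (1) translates into a few lines of code once the rank sums are known. The following is a minimal sketch; the function name and the tiny set of ranks are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)."""
    n_total = sum(sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Hypothetical ranks for three levels of two observations each (no ties).
ranks_by_level = [[1, 2], [3, 4], [5, 6]]
rank_sums = [sum(r) for r in ranks_by_level]   # 3, 7, 11
sizes = [len(r) for r in ranks_by_level]       # 2, 2, 2
h = kruskal_wallis_h(rank_sums, sizes)
print(rank_sums, round(h, 4))
```

This sketch applies formula (1) as stated and makes no adjustment for tied observations.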

Exercise and Bone Density in Rats

• A study was performed to examine the effect of exercise on bone density in rats.
• 30 rats were divided into three groups of ten: 'high', 'low' and 'control'.
• Their bone density was measured at the end of the treatment period.
• Test

  H0: the different groups have the same mean density
  H1: the different groups have different mean densities

Exercise and Bone Density in Rats

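High       626  650  622  674  626  643  622  650  643  631
Low        594  599  635  605  632  588  596  631  607  638
Control    614  569  653  593  611  600  603  593  621  554

Compute

$$\bar{x}_{\text{high}} = 638.70, \qquad s^2_{\text{high}} = 275.34,$$
$$\bar{x}_{\text{low}} = 612.50, \qquad s^2_{\text{low}} = 373.61,$$
$$\bar{x}_{\text{cont}} = 601.10, \qquad s^2_{\text{cont}} = 748.77,$$
$$\bar{x} = 617.4.$$

Thus

$$\mathrm{ESS} = 9 s^2_{\text{high}} + 9 s^2_{\text{low}} + 9 s^2_{\text{cont}} = 12579.5,$$

$$\mathrm{BSS} = 10 (638.7 - 617.4)^2 + 10 (612.5 - 617.4)^2 + 10 (601.1 - 617.4)^2 = 7433.9.$$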

Exercise and Bone Density in Rats

ANOVA table:

                    SS       d.f.    MS      F
Between             7434     2       3717    7.98
Errors (Within)     12580    27      466
Total               20014    29

The number of degrees of freedom here is $(K - 1, N - K) = (2, 27)$.

There is no row for 27, so we just look at the row for 30 and find the critical value to be 3.32.

Since 7.98 > 3.32, we reject the null at the 5% level.
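The table can be checked from the raw data with the standard library alone. The sketch below also attaches a p-value using a closed form for the upper tail of the F-distribution that applies only when the numerator has 2 degrees of freedom, as it does here; that shortcut is an addition for illustration and not part of the lecture's table lookup.

```python
from statistics import mean, variance

# Bone density data from the lecture.
high = [626, 650, 622, 674, 626, 643, 622, 650, 643, 631]
low = [594, 599, 635, 605, 632, 588, 596, 631, 607, 638]
control = [614, 569, 653, 593, 611, 600, 603, 593, 621, 554]
groups = [high, low, control]

n_total = sum(len(g) for g in groups)   # N = 30
k = len(groups)                         # K = 3
overall = mean(high + low + control)

ess = sum((len(g) - 1) * variance(g) for g in groups)
bss = sum(len(g) * (mean(g) - overall) ** 2 for g in groups)
f_stat = (bss / (k - 1)) / (ess / (n_total - k))

# Upper tail of F(2, d2): P(F > f) = (1 + 2 f / d2) ** (-d2 / 2).
# This closed form holds only when the numerator d.f. equals 2.
d2 = n_total - k
p_value = (1 + 2 * f_stat / d2) ** (-d2 / 2)
print(round(ess, 1), round(bss, 1), round(f_stat, 2), round(p_value, 4))
```

The p-value comes out well below 0.05, in line with the rejection found from the table.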

Exercise and Bone Density in Rats I

Use the Kruskal-Wallis test. First we assign ranks to the data, breaking ties as usual (tied observations share the average of the ranks they occupy).

High       18.5  27.5  16.5  30    18.5  25.5  16.5  27.5  25.5  20.5
Low         6     8    23    11    22     3     7    20.5  12    24
Control    14     2    29     4.5  13     9    10     4.5  15     1

The rank sums are then

$$R_{\text{high}} = 226.5, \qquad R_{\text{low}} = 136.5, \qquad R_{\text{cont}} = 102.$$
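The ranking step, including the handling of ties, can be sketched as follows. The helper name `midranks` is illustrative, and it assumes that ties receive the average of the ranks they span, which is consistent with values such as 18.5 and 4.5 in the table.

```python
def midranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1   # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = average
        start = end + 1
    return ranks

# Bone density data from the lecture.
high = [626, 650, 622, 674, 626, 643, 622, 650, 643, 631]
low = [594, 599, 635, 605, 632, 588, 596, 631, 607, 638]
control = [614, 569, 653, 593, 611, 600, 603, 593, 621, 554]

ranks = midranks(high + low + control)
rank_sums = [sum(ranks[0:10]), sum(ranks[10:20]), sum(ranks[20:30])]
print(ranks[0:10])
print(rank_sums)
```

As a sanity check, the three rank sums must add up to N(N + 1)/2 = 465 for N = 30.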

Exercise and Bone Density in Rats II

Then compute

$$H = \frac{12}{N(N + 1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N + 1) = \frac{12}{30 \times 31} \left[ \frac{226.5^2}{10} + \frac{136.5^2}{10} + \frac{102^2}{10} \right] - 3 \times 31 = 10.66.$$

At the 5% level, the critical value for $\chi^2$ with K − 1 = 2 degrees of freedom is 5.99. We therefore still reject the null hypothesis.
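With 2 degrees of freedom the chi-squared upper tail has the simple form $P(\chi^2_2 > x) = e^{-x/2}$, so both the critical value and an approximate p-value can be checked without tables. The sketch below uses the rank sums from the lecture; the closed form is specific to 2 degrees of freedom and is an addition for illustration.

```python
import math

# Rank sums and level sizes from the lecture's rat example.
rank_sums = [226.5, 136.5, 102.0]
sizes = [10, 10, 10]
n_total = sum(sizes)

weighted = sum(r ** 2 / n for r, n in zip(rank_sums, sizes))
h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# For 2 degrees of freedom only: P(chi2 > x) = exp(-x / 2).
critical_5pct = -2 * math.log(0.05)
p_value = math.exp(-h / 2)
print(round(h, 2), round(critical_5pct, 2), round(p_value, 4))
```

The statistic is computed without a tie correction, so library routines that adjust for ties may report a slightly different value.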

Recap

• Given independent samples from K normally distributed populations $N(\mu_1, \sigma^2), \dots, N(\mu_K, \sigma^2)$, we want to test whether the level means $\mu_1, \dots, \mu_K$ could all be equal.
• We compute
  ESS: the squared deviations of observations from their own group mean; and
  BSS: the squared deviations of group means from the overall mean.
• Failure of the null should result in a higher BMS compared to EMS.
• The F-statistic is defined as

$$F = \frac{N - K}{K - 1} \cdot \frac{\mathrm{BSS}}{\mathrm{ESS}} = \frac{\mathrm{BMS}}{\mathrm{EMS}}.$$

• Under the null, F has the F-distribution with $(K - 1, N - K)$ degrees of freedom.

Recap

We summarise our calculation in an ANOVA table:

                               SS         d.f.         MS               F
Between Treatments             BSS (A)    K − 1 (B)    BMS (X = A/B)    X/Y
Errors (Within Treatments)     ESS (C)    N − K (D)    EMS (Y = C/D)
Total                          TSS        N − 1

Recap

• The F-test depends crucially on our data being normally distributed.
• If we have reason to believe this may not be satisfied, then we use the non-parametric Kruskal-Wallis test.
• Replace the data by their ranks relative to the whole sample.
• Let $R_i$ be the rank sum in the i-th level. Then

$$H = \frac{12}{N(N + 1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N + 1). \tag{2}$$

• Under the null, H approximately follows the $\chi^2_{K-1}$ distribution.