Lecture 14: ANOVA and the F-test
S. Massa, Department of Statistics, University of Oxford
3 February 2016
Example
Consider a study of 973 individuals that examines the relationship between duration of breastfeeding and adult intelligence. Each individual took 3 tests, and breastfeeding duration was recorded in 5 classes.
Test                          Duration of Breastfeeding (months)
                              ≤ 1      2-3      4-6      7-9      > 9
N                             272      305      269      104       23
Verbal IQ        Adj. Mean    99.7    102.3    102.7    105.7    103.0
                 SD           16.0     14.9     15.7     13.3     15.2
Performance IQ   Adj. Mean    99.1    100.6    101.3    105.1    104.4
                 SD           15.8     15.2     15.6     13.9     14.9
Full Scale IQ    Adj. Mean    99.4    101.7    102.3    106.0    104.0
                 SD           15.9     15.2     15.7     13.1     14.4
Example
- First of all, notice that we list adjusted means.
- This means that the actual data have been analysed to remove the effects of confounding factors (mother smoking, parents' income, etc.), so that the effect of breastfeeding can be isolated.
- Test whether the duration of breastfeeding affects adult intelligence.
The General Setup
- Suppose we have independent samples from $K$ different normal distributions, with means $\mu_1, \dots, \mu_K$ and common variance $\sigma^2$. Test $H_0 : \mu_1 = \dots = \mu_K$.
- We call the $K$ groups levels.
- We have $n_i$ samples from the $i$-th level, $X_{i1}, X_{i2}, \dots, X_{in_i}$, and $N = \sum_{i=1}^{K} n_i$ total observations.
- The overall sample mean is $\bar{X}$, while the sample mean of group $i$ is $\bar{X}_i$:
\[ \bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}, \qquad \bar{X} = \frac{1}{N} \sum_{i,j} X_{ij}. \]
The Idea Behind ANOVA
- If the $K$ means are all equal, then the observations should be about as far from their own level mean $\bar{X}_i$ as they are from the overall mean $\bar{X}$.
- If the means are different, then observations should be closer to the mean of their own level than to the overall mean.
Between Groups and Error Sum of Squares
The Between Groups Sum of Squares (BSS) is the total squared deviation of the group means from the overall mean, weighted by group size:
\[ \mathrm{BSS} = \sum_{i=1}^{K} n_i (\bar{X}_i - \bar{X})^2. \]
The Error Sum of Squares (ESS) is the total squared deviation of the samples from their group means:
\[ \mathrm{ESS} = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2 = \sum_{i=1}^{K} (n_i - 1) s_i^2, \]
where $s_i$ is the SD of the observations in level $i$.
Total Sum of Squares
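The Total Sum of Squares (TSS) is the total squared deviation of the samples from the overall mean:
\[ \mathrm{TSS} = \sum_{i,j} (X_{ij} - \bar{X})^2 = (N - 1) s^2, \]
where $s$ is the sample SD of all observations together. We also define the mean squares
\[ \mathrm{BMS} = \frac{\mathrm{BSS}}{K - 1}, \qquad \mathrm{EMS} = \frac{\mathrm{ESS}}{N - K}. \]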
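The definitions of BSS, ESS, TSS and the two mean squares can be sketched in a few lines of Python. This is only an illustration: the function name `sums_of_squares` and the toy data are hypothetical and do not come from the lecture.

```python
from statistics import fmean

def sums_of_squares(groups):
    """Return (BSS, ESS, TSS) for a list of groups of numeric observations."""
    all_obs = [x for g in groups for x in g]
    grand_mean = fmean(all_obs)
    group_means = [fmean(g) for g in groups]
    # Between groups: squared deviation of each group mean, weighted by group size.
    bss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Error: squared deviation of each observation from its own group mean.
    ess = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Total: squared deviation of each observation from the overall mean.
    tss = sum((x - grand_mean) ** 2 for x in all_obs)
    return bss, ess, tss

# Hypothetical toy data: K = 3 levels with 3 observations each.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
bss, ess, tss = sums_of_squares(toy_groups)
k = len(toy_groups)
n_total = sum(len(g) for g in toy_groups)
bms = bss / (k - 1)        # between-groups mean square
ems = ess / (n_total - k)  # error mean square
print(bss, ess, tss, bms, ems)
```

On the toy data the total sum of squares equals the sum of the other two, up to floating-point rounding.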
ANOVA
Two basic mathematical facts lie behind ANOVA.
- First,
\[ \mathrm{TSS} = \mathrm{ESS} + \mathrm{BSS}. \]
  The variability among the data can be split into two pieces:
  1. the variability among the means of the groups;
  2. the variability within the groups.
  We evaluate how the total variability is split between the two types: too much between-group variability casts doubt on the validity of the null.
- Second, under the null hypothesis both EMS and BMS are estimates of $\sigma^2$. EMS estimates $\sigma^2$ whether or not the null holds, while BMS tends to exceed $\sigma^2$ when the means differ.
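The decomposition of the total sum of squares is stated without proof. A short derivation, using only the definitions of $\bar{X}_i$ and $\bar{X}$ given earlier, might run as follows.

```latex
\begin{align*}
\mathrm{TSS}
  &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2
   = \sum_{i=1}^{K} \sum_{j=1}^{n_i}
     \bigl[ (X_{ij} - \bar{X}_i) + (\bar{X}_i - \bar{X}) \bigr]^2 \\
  &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2
   + 2 \sum_{i=1}^{K} (\bar{X}_i - \bar{X}) \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)
   + \sum_{i=1}^{K} n_i (\bar{X}_i - \bar{X})^2 \\
  &= \mathrm{ESS} + 0 + \mathrm{BSS}.
\end{align*}
```

The cross term vanishes because $\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i) = 0$ for every level $i$, by the definition of $\bar{X}_i$.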
The F-statistic
The F-statistic is
\[ F = \frac{\mathrm{BMS}}{\mathrm{EMS}} = \frac{N - K}{K - 1} \cdot \frac{\mathrm{BSS}}{\mathrm{ESS}}. \]
The critical region is of the form $\{F \ge f\}$, where $f$ depends on the significance level of the test.
Essentially, we reject the null hypothesis for large values of $F$ (that is, when BMS is much bigger than EMS).
The F-distribution
Under the null hypothesis the F-statistic has the F-distribution with $(K - 1, N - K)$ degrees of freedom.
This is a continuous distribution on the positive real numbers with two parameters.
Figure: The probability density function of the F-distribution for various degrees of freedom.
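One way to get a feel for this null distribution is a small simulation. The sketch below is not part of the lecture: it draws $K = 3$ groups of 10 standard normal observations with equal means, computes $F$ each time, and counts how often $F$ exceeds 3.35, which is the usual tabulated 5% critical value for $F(2, 27)$. The function names, the seed and the number of replications are arbitrary choices.

```python
import random
from statistics import fmean

def f_statistic(groups):
    """One-way ANOVA F-statistic BMS/EMS for a list of groups."""
    all_obs = [x for g in groups for x in g]
    grand_mean = fmean(all_obs)
    means = [fmean(g) for g in groups]
    k, n_total = len(groups), len(all_obs)
    bss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ess = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (bss / (k - 1)) / (ess / (n_total - k))

def simulate_null_exceedance(threshold, k=3, n_per_group=10, n_sims=4000, seed=0):
    """Fraction of simulated null data sets whose F is at least `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # All groups share mean 0 and SD 1, so the null hypothesis holds.
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
                  for _ in range(k)]
        if f_statistic(groups) >= threshold:
            hits += 1
    return hits / n_sims

frac = simulate_null_exceedance(3.35)
print(frac)  # should come out close to 0.05
```

The exact fraction depends on the seed, so only its rough size is meaningful.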
The F-statistic
The important quantities are summarised in the following table:
Source            SS     d.f.     MS                     F
Between Groups    BSS    K − 1    BMS = BSS/(K − 1)      BMS/EMS
Within Groups     ESS    N − K    EMS = ESS/(N − K)
Total             TSS    N − 1
ANOVA: the Breastfeeding Study
Recall the data from the breastfeeding study:
Test                          Duration of Breastfeeding (months)
                              ≤ 1      2-3      4-6      7-9      > 9
N                             272      305      269      104       23
Verbal IQ        Adj. Mean    99.7    102.3    102.7    105.7    103.0
                 SD           16.0     14.9     15.7     13.3     15.2
Performance IQ   Adj. Mean    99.1    100.6    101.3    105.1    104.4
                 SD           15.8     15.2     15.6     13.9     14.9
Full Scale IQ    Adj. Mean    99.4    101.7    102.3    106.0    104.0
                 SD           15.9     15.2     15.7     13.1     14.4
Find TSS via TSS = ESS + BSS.
Example: the Breastfeeding Study
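Using the Full Scale IQ row of the table:
\[ \mathrm{ESS} = \sum_{k=1}^{5} (n_k - 1) s_k^2 = 271 \cdot 15.9^2 + 304 \cdot 15.2^2 + 268 \cdot 15.7^2 + 103 \cdot 13.1^2 + 22 \cdot 14.4^2 \approx 227000. \]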
Example: the Breastfeeding Study
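Again using the Full Scale IQ row, with overall mean $\bar{x} \approx 101.7$:
\[ \mathrm{BSS} = \sum_{k=1}^{5} n_k (\bar{x}_k - \bar{x})^2 = 272 (99.4 - 101.7)^2 + 305 (101.7 - 101.7)^2 + 269 (102.3 - 101.7)^2 + 104 (106.0 - 101.7)^2 + 23 (104.0 - 101.7)^2 \approx 3579. \]
The value 3579 is obtained with the unrounded overall mean $\bar{x} \approx 101.74$; the rounded value 101.7 shown in the sum gives about 3580.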
Example: the Breastfeeding Study
We complete the ANOVA table as follows.
Table: ANOVA table for breastfeeding data: Full Scale IQ, Adjusted.
Source             SS        d.f.    MS                        F
Between Samples    3579      4       894.8 (= 3579/4)          3.81 (= 894.8/234.6)
Within Samples     227000    968     234.6 (= 227000/968)
Total              230600    972
                   (= 3579 + 227000, rounded)
Since $N = 973$ and $K = 5$, under the null hypothesis the F-statistic is distributed according to $F(4, 968)$.
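The whole table can be recomputed from the summary statistics alone. The sketch below assumes the Full Scale IQ row of the data table as input; the variable names are illustrative. The overall mean is recovered as the size-weighted average of the group means.

```python
# Summary statistics from the Full Scale IQ row of the breastfeeding table.
sizes = [272, 305, 269, 104, 23]
means = [99.4, 101.7, 102.3, 106.0, 104.0]
sds = [15.9, 15.2, 15.7, 13.1, 14.4]

n_total = sum(sizes)
k = len(sizes)

# Overall mean as the weighted average of the group means.
grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

# ESS from the group SDs, BSS from the group means.
ess = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
bss = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

bms = bss / (k - 1)
ems = ess / (n_total - k)
f_stat = bms / ems

print(f"N = {n_total}, d.f. = ({k - 1}, {n_total - k})")
print(f"BSS = {bss:.0f}, ESS = {ess:.0f}, TSS = {bss + ess:.0f}")
print(f"BMS = {bms:.1f}, EMS = {ems:.1f}, F = {f_stat:.2f}")
```

Because the inputs are rounded to one decimal place, the results agree with the table only to about three significant figures.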
Example: the Breastfeeding Study
Having computed $F = 3.81$, we now look up the critical value in our table for the 0.05 level.
$K - 1 = 4$, so we pick the fourth column, but $N - K = 968$ is much more than 60, so we take the bottom row.
The critical value turns out to be 2.37. Since $3.81 > 2.37$, we reject the null hypothesis at the 0.05 level.
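Instead of a table lookup, a p-value can be computed directly. The sketch below is an addition to the lecture and relies on a standard identity: when the first degrees-of-freedom parameter is 4, the upper tail of the F-distribution has an elementary closed form (it follows from the regularized incomplete beta function with integer first argument). The function name `f_sf_num_df4` is hypothetical, and the formula does not apply to other numerator degrees of freedom.

```python
def f_sf_num_df4(x, d2):
    """P(F > x) for an F-distribution with (4, d2) degrees of freedom.

    Closed form valid only when the numerator degrees of freedom equal 4.
    """
    z = 4.0 * x / (4.0 * x + d2)
    b = d2 / 2.0
    return (1.0 - z) ** b * (1.0 + b * z)

# Tail probability at the observed statistic F = 3.81 with (4, 968) d.f.
p_value = f_sf_num_df4(3.81, 968)

# Tail probability at the tabulated critical value 2.37 (bottom row of the table).
tail_at_critical = f_sf_num_df4(2.37, 968)

print(f"p-value = {p_value:.4f}")
print(f"P(F > 2.37) = {tail_at_critical:.3f}")
```

The p-value is well below 0.05, in line with the decision to reject, and the tail probability at 2.37 is close to 0.05, as expected for a critical value read from the bottom row of the table.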
The Kruskal-Wallis Test
- The F-test has one basic assumption: the samples are assumed to be normally distributed; that is, the F-test is parametric.
- The non-parametric version is known as the Kruskal-Wallis test.
- As with the rank sum test, the basic idea is to substitute the ranks for the actual observed values.
The Kruskal-Wallis test
- Suppose there are $K$ levels, with $n_i$ observations in level $i$.
- Assign to each observation its rank relative to the whole sample.
- Sum the ranks in each group, giving rank sums $R_1, \dots, R_K$.
The Kruskal-Wallis test statistic is
\[ H = \frac{12}{N(N+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N+1). \tag{1} \]
Under the null hypothesis, $H \sim \chi^2_{K-1}$ (approximately, for large samples).
Exercise and Bone Density in Rats
- A study was performed to examine the effect of exercise on bone density in rats.
- 30 rats were divided into three groups of ten: 'high', 'low' and 'control'.
- Their bone density was measured at the end of the treatment period.
- Test
  $H_0$: the different groups have the same mean density
  $H_1$: the different groups have different mean densities
Exercise and Bone Density in Rats
High       626  650  622  674  626  643  622  650  643  631
Low        594  599  635  605  632  588  596  631  607  638
Control    614  569  653  593  611  600  603  593  621  554
Compute
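$\bar{x}_{\mathrm{high}} = 638.70$, $s^2_{\mathrm{high}} = 275.34$
$\bar{x}_{\mathrm{low}} = 612.50$, $s^2_{\mathrm{low}} = 373.61$
$\bar{x}_{\mathrm{cont}} = 601.10$, $s^2_{\mathrm{cont}} = 748.77$
$\bar{x} = 617.4$
Thus
\[ \mathrm{ESS} = 9 s^2_{\mathrm{high}} + 9 s^2_{\mathrm{low}} + 9 s^2_{\mathrm{cont}} = 12579.5, \]
\[ \mathrm{BSS} = 10 (638.7 - 617.4)^2 + 10 (612.5 - 617.4)^2 + 10 (601.1 - 617.4)^2 = 7433.9. \]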
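These summary values can be checked from the raw measurements with Python's standard `statistics` module. The sketch below is only an illustration; `variance` is the sample variance with divisor $n - 1$, which matches $s^2$ as used here.

```python
from statistics import fmean, variance

high = [626, 650, 622, 674, 626, 643, 622, 650, 643, 631]
low = [594, 599, 635, 605, 632, 588, 596, 631, 607, 638]
control = [614, 569, 653, 593, 611, 600, 603, 593, 621, 554]
groups = [high, low, control]

# Overall mean of all 30 observations.
grand_mean = fmean([x for g in groups for x in g])

# ESS = sum of (n_i - 1) * s_i^2, using the sample variance of each group.
ess = sum((len(g) - 1) * variance(g) for g in groups)

# BSS = sum of n_i * (group mean - overall mean)^2.
bss = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)

for name, g in zip(["high", "low", "control"], groups):
    print(f"{name}: mean = {fmean(g):.2f}, s^2 = {variance(g):.2f}")
print(f"overall mean = {grand_mean:.1f}, ESS = {ess:.1f}, BSS = {bss:.1f}")
```

The code uses the unrounded overall mean, so BSS may differ from the hand calculation in the first decimal place.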
Exercise and Bone Density in Rats
ANOVA table:
Source                        SS       d.f.    MS      F
Between Treatments            7434     2       3717    7.98
Errors (Within Treatments)    12580    27      466
Total                         20014    29
The degrees of freedom here are $(K - 1, N - K) = (2, 27)$.
There is no row for 27 in our table, so we just look at the row for 30 and find the critical value to be 3.32.
Since $7.98 > 3.32$, we reject the null at the 5% level.
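The table lookup can be cross-checked without a table, because with 2 numerator degrees of freedom the F-distribution has a simple closed-form tail, $P(F > x) = (1 + 2x/d_2)^{-d_2/2}$. This identity is standard but is not part of the lecture, and the helper names below are hypothetical. The sketch takes ESS and BSS as computed above.

```python
def f_sf_num_df2(x, d2):
    """P(F > x) for an F-distribution with (2, d2) degrees of freedom."""
    return (1.0 + 2.0 * x / d2) ** (-d2 / 2.0)

def f_critical_num_df2(alpha, d2):
    """Value c with P(F > c) = alpha for (2, d2) degrees of freedom."""
    return (d2 / 2.0) * (alpha ** (-2.0 / d2) - 1.0)

bss, ess = 7433.9, 12579.5   # sums of squares from the rats example
k, n_total = 3, 30

f_stat = (bss / (k - 1)) / (ess / (n_total - k))
p_value = f_sf_num_df2(f_stat, n_total - k)

crit_27 = f_critical_num_df2(0.05, 27)  # exact degrees of freedom
crit_30 = f_critical_num_df2(0.05, 30)  # row used in the table lookup

print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")
print(f"5% critical values: F(2,27) = {crit_27:.2f}, F(2,30) = {crit_30:.2f}")
```

Using the row for 30 instead of 27 changes the critical value only slightly, so the conclusion is unaffected.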
Exercise and Bone Density in Rats I
We now apply the Kruskal-Wallis test. First we assign ranks to the data, giving tied observations the average of the ranks they share.
High       18.5  27.5  16.5  30    18.5  25.5  16.5  27.5  25.5  20.5
Low        6     8     23    11    22    3     7     20.5  12    24
Control    14    2     29    4.5   13    9     10    4.5   15    1
The rank sums are then
\[ R_{\mathrm{high}} = 226.5, \qquad R_{\mathrm{low}} = 136.5, \qquad R_{\mathrm{cont}} = 102. \]
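The ranking step can be sketched as follows. The helper `midranks` is a hypothetical name for a plain-Python routine that ranks the pooled sample and gives tied values the average of the ranks they occupy.

```python
def midranks(values):
    """Return the rank of each value, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of observations tied with order[pos].
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based ranks pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks

high = [626, 650, 622, 674, 626, 643, 622, 650, 643, 631]
low = [594, 599, 635, 605, 632, 588, 596, 631, 607, 638]
control = [614, 569, 653, 593, 611, 600, 603, 593, 621, 554]

# Rank all 30 observations together, then split the ranks back into groups.
ranks = midranks(high + low + control)
rank_sums = [sum(ranks[start:start + 10]) for start in (0, 10, 20)]
print(rank_sums)  # [226.5, 136.5, 102.0]
```

As a sanity check, the rank sums add up to $N(N+1)/2 = 465$.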
Exercise and Bone Density in Rats II
Then compute
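\[ H = \frac{12}{N(N+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N+1) = \frac{12}{30 \times 31} \left[ \frac{226.5^2}{10} + \frac{136.5^2}{10} + \frac{102^2}{10} \right] - 3 \times 31 = 10.66. \]
At the 5% level, the critical value for $\chi^2$ with $K - 1 = 2$ degrees of freedom is 5.99. Since $10.66 > 5.99$, we again reject the null hypothesis.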
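The final step can be reproduced from the rank sums. The sketch below implements the statistic exactly as written in equation (1), with no correction for ties, and uses the fact that a $\chi^2$ distribution with 2 degrees of freedom has upper tail $P(X > h) = e^{-h/2}$. That shortcut holds only for 2 degrees of freedom, and the function name is illustrative.

```python
import math

def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H from group rank sums and group sizes (no tie correction)."""
    n_total = sum(sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

h = kruskal_wallis_h([226.5, 136.5, 102.0], [10, 10, 10])

# Chi-square with 2 d.f.: closed-form tail probability and 5% critical value.
p_value = math.exp(-h / 2)
critical = -2 * math.log(0.05)

print(f"H = {h:.2f}, critical value = {critical:.2f}, p-value = {p_value:.4f}")
```

Library routines such as `scipy.stats.kruskal` apply a correction for ties, so they report a slightly larger statistic on these data.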
Recap
- Given independent samples from $K$ normally distributed populations $N(\mu_1, \sigma^2), \dots, N(\mu_K, \sigma^2)$, we want to test whether the level means $\mu_1, \dots, \mu_K$ could all be equal.
- We compute
  ESS: the sum of squared deviations of observations from their own group mean; and
  BSS: the sum of squared deviations of group means from the overall mean, weighted by group size.
- Failure of the null should result in a higher BMS compared to EMS.
- The F-statistic is defined as
\[ F = \frac{N - K}{K - 1} \cdot \frac{\mathrm{BSS}}{\mathrm{ESS}} = \frac{\mathrm{BMS}}{\mathrm{EMS}}. \]
- Under the null, $F$ has the F-distribution with $(K - 1, N - K)$ degrees of freedom.
Recap
We summarise our calculation in an ANOVA table:
Source                        SS         d.f.         MS               F
Between Treatments            BSS (A)    K − 1 (B)    BMS (X = A/B)    X/Y
Errors (Within Treatments)    ESS (C)    N − K (D)    EMS (Y = C/D)
Total                         TSS        N − 1
Recap
- The F-test depends crucially on our data being normally distributed.
- If we have reason to believe this may not be satisfied, then we use the non-parametric Kruskal-Wallis test.
- Replace the data by their ranks relative to the whole sample.
- Let $R_i$ be the rank sum in the $i$-th level. The test statistic is
\[ H = \frac{12}{N(N+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N+1). \tag{2} \]
- Under the null, $H \sim \chi^2_{K-1}$.