ONE WAY ANALYSIS OF VARIANCE (ANOVA)
o It is used to investigate the effect of one factor which occurs at k levels (k ≥ 3).
Example: Suppose that we wish to test the effect of temperature at several levels (20, 30, 35, 40 °C) on serum total proteins.
Biostatistics and Data Analysis, 3rd Lecture
RANDOM MODEL HYPOTHESIS
1) The k samples are independent random samples drawn from k specific populations with means $\mu_1, \mu_2, \dots, \mu_k$.
2) Each of the k populations is normally distributed.
3) Each of the k populations has the same variance.
$$H_0: \mu_1 = \mu_2 = \dots = \mu_k$$
$$H_A: \text{at least one pair of the } \mu_i \text{ are not equal.}$$
The summary statistics for each row of the data are shown in the table below
                  20 °C     25 °C     30 °C
Sample size (n)   7         9         8
Mean              2.2857    2.444     5.625
S.D.              0.487     0.882     1.922
Variance (S²)     0.237     0.778     3.694
Temperature (°C)   Serum Total Proteins (g/dL)
20                 2, 3, 2, 2, 3, 2, 2
25                 4, 3, 2, 3, 1, 2, 2, 3, 2
30                 5, 6, 7, 4, 2, 6, 7, 8
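The summary statistics can be reproduced from the raw readings. The following is a minimal sketch using Python's standard `statistics` module; the names `groups` and `summary` are illustrative and not part of the lecture.

```python
from statistics import mean, stdev, variance

# Serum total protein readings (g/dL) by temperature (deg C), copied from the data table.
groups = {
    20: [2, 3, 2, 2, 3, 2, 2],
    25: [4, 3, 2, 3, 1, 2, 2, 3, 2],
    30: [5, 6, 7, 4, 2, 6, 7, 8],
}

# statistics.stdev and statistics.variance use the sample (n - 1) denominator.
summary = {
    temp: {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),
        "variance": variance(values),
    }
    for temp, values in groups.items()
}

for temp, stats in summary.items():
    print(f"{temp} deg C: n={stats['n']}, mean={stats['mean']:.4f}, "
          f"sd={stats['sd']:.3f}, variance={stats['variance']:.3f}")
```

Small differences in the third decimal place relative to the tabulated S.D. and variance are to be expected, since the table values appear to have been rounded or truncated by hand.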
Variation (Sum of Squares) = SS
o The sum of the squares of the deviations between each value and the mean of the values
– SS between groups, SS(B)
– SS within groups, SS(W)
Variances (Mean of Squares) = MS
o The average squared deviation from the mean, found by dividing the variation by the degrees of freedom
– MS = SS / df
– MS between groups, MS(B)
– MS within groups, MS(W)
• Are all of the values identical?
– No, there is variation among the data, called the total variation, SS(T).
Temperature (°C)   Serum Total Proteins (g/dL)    Means
20                 2, 3, 2, 2, 3, 2, 2            2.2857
25                 4, 3, 2, 3, 1, 2, 2, 3, 2      2.444
30                 5, 6, 7, 4, 2, 6, 7, 8         5.625
• Are all of the sample means identical?
– No, there is variation called the between-group variation, SS(B), or the variation due to the factor.
• Are the values within each group identical?
– No, there is variation within each group, SS(W) (the error variation).
There are two sources of variation
– The variation between groups, SS(B), or the variation due to the factor
– The variation within groups, SS(W), or the error variation
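The statement that the total variation splits into these two sources can be checked numerically. The sketch below computes SS(T), SS(B) and SS(W) directly from the raw readings and compares SS(T) with SS(B) + SS(W); all variable names are illustrative.

```python
# Serum total protein readings (g/dL) by temperature (deg C), from the data table.
groups = {
    20: [2, 3, 2, 2, 3, 2, 2],
    25: [4, 3, 2, 3, 1, 2, 2, 3, 2],
    30: [5, 6, 7, 4, 2, 6, 7, 8],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# Total variation: every value against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Between-group variation: each group mean against the grand mean, weighted by n.
ss_between = sum(
    len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values()
)

# Within-group variation: every value against its own group mean.
ss_within = sum(
    (x - sum(v) / len(v)) ** 2 for v in groups.values() for x in v
)

print(f"SS(T) = {ss_total:.3f}")
print(f"SS(B) + SS(W) = {ss_between:.3f} + {ss_within:.3f} = {ss_between + ss_within:.3f}")
```

Because this works from unrounded means, the results can differ in the second decimal place from hand calculations based on rounded summary statistics.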
• Here is the basic one-way ANOVA table
Source SS df MS F P
Between (Factor)
Within (Error)
Total
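The entries of this table can be written in general form. In the sketch below, N (the total number of observations, $n_1 + n_2 + \dots + n_k$) is a symbol introduced here for convenience; it is not used elsewhere in the lecture.

```latex
\begin{array}{lllll}
\text{Source} & SS & df & MS & F \\
\text{Between (Factor)} & SS(B) & k - 1 & MS(B) = SS(B)/(k - 1) & MS(B)/MS(W) \\
\text{Within (Error)}   & SS(W) & N - k & MS(W) = SS(W)/(N - k) & \\
\text{Total}            & SS(T) & N - 1 & &
\end{array}
```

With k = 3 temperature groups and N = 7 + 9 + 8 = 24 observations, the degrees of freedom are 2, 21 and 23.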
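Grand Mean
– The grand mean is the average of all the values
– It is a weighted average of the individual sample means
$$\bar{X}_g = \frac{\sum_{i=1}^{k} n_i \bar{X}_i}{\sum_{i=1}^{k} n_i} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \dots + n_k \bar{X}_k}{n_1 + n_2 + \dots + n_k}$$
For the temperature data:
$$\bar{X}_g = \frac{7(2.2857) + 9(2.444) + 8(5.625)}{7 + 9 + 8} = \frac{82.996}{24} \approx 3.458$$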
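As a quick check of the weighted-average formula, the sketch below recomputes the grand mean from the group sizes and the rounded group means in the summary table. The variable names are illustrative.

```python
# Group sizes and (rounded) group means from the summary table.
sizes = [7, 9, 8]
means = [2.2857, 2.444, 5.625]

# Weighted average: each group mean is weighted by its sample size.
grand_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

print(round(grand_mean, 3))
```

The result should agree, to three decimals, with the value 3.458 used in the SS(B) calculation.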
Between Group Variation, SS(B)
$$SS(B) = 7(2.2857 - 3.458)^2 + 9(2.444 - 3.458)^2 + 8(5.625 - 3.458)^2$$
$$SS(B) = 9.62 + 9.254 + 37.567 = 56.441$$
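The between-group calculation generalizes to any number of groups. The following is a minimal sketch, assuming only the group sizes and group means are available; the function name `ss_between` is hypothetical.

```python
def ss_between(sizes, means):
    """Between-group sum of squares from group sizes and group means."""
    # Grand mean as the size-weighted average of the group means.
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)
    # Each group's squared deviation from the grand mean, weighted by its size.
    return sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))


# Rounded summary statistics from the table.
ss_b = ss_between([7, 9, 8], [2.2857, 2.444, 5.625])
print(f"SS(B) = {ss_b:.3f}")
```

Using unrounded means (16/7, 22/9, 45/8) instead of the tabulated ones shifts the result slightly, to about 56.43.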
Within Group Variation, SS(W)
$$SS(W) = df_1 S_1^2 + df_2 S_2^2 + \dots + df_k S_k^2, \quad \text{where } df_i = n_i - 1$$
$$SS(W) = 6(0.237) + 8(0.778) + 7(3.694)$$
$$SS(W) = 1.422 + 6.224 + 25.858 = 33.504$$
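The within-group sum of squares follows the same pattern: each sample variance is multiplied by its degrees of freedom (n − 1) and the products are summed. A minimal sketch, with the hypothetical function name `ss_within`:

```python
def ss_within(sizes, variances):
    """Within-group sum of squares from group sizes and sample variances."""
    # Each sample variance is weighted by its degrees of freedom, n - 1.
    return sum((n - 1) * s2 for n, s2 in zip(sizes, variances))


# Group sizes and sample variances from the summary table.
ss_w = ss_within([7, 9, 8], [0.237, 0.778, 3.694])
print(f"SS(W) = {ss_w:.3f}")
```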
Variances
• After filling in the sum of squares, we have …
Source    SS       df   MS   F   p
Between   56.441   2
Within    33.504   21
Total     89.945   23
– MS = SS / df
• MS(B) = 56.441 / 2 = 28.221
• MS(W) = 33.504 / 21 = 1.595
F test
• After filling in the mean squares, we have …
Source    SS       df   MS       F   p
Between   56.441   2    28.221
Within    33.504   21   1.595
Total     89.945   23
– An F test statistic is the ratio of two sample variances
– MS(B) and MS(W) are two sample variances, and their ratio gives F
– F = MS(B) / MS(W) = 28.221 / 1.595 = 17.69
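The steps from sums of squares to the F statistic can be collected in one small function. This is a sketch under the assumption that SS(B), SS(W), the number of groups and the total sample size are already known; the function and variable names are illustrative.

```python
def f_statistic(ss_between, ss_within, k, total_n):
    """Return (MS(B), MS(W), F) for a one-way ANOVA."""
    df_between = k - 1          # number of groups minus one
    df_within = total_n - k     # total observations minus number of groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values from the ANOVA table: 3 temperature groups, 24 observations in total.
ms_b, ms_w, f_cal = f_statistic(56.441, 33.504, k=3, total_n=24)
print(f"MS(B) = {ms_b:.3f}, MS(W) = {ms_w:.3f}, F = {f_cal:.2f}")
```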
After filling in the calculated F, we have …
Source    SS       df   MS       Fcal    P
Between   56.441   2    28.221   17.69
Within    33.504   21   1.595
Total     89.945   23
Tabulated F2,21(5%) = 3.47, F2,21(1%) = 5.78, F2,21(0.1%) ≈ 9.77.
The calculated F (17.69) at df 2,21 is greater than the tabulated F2,21(0.1%) ≈ 9.77, so we reject the null hypothesis (P < 0.001).
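The tabulated critical values can be cross-checked without a table. When the numerator degrees of freedom equal 2, as here, the upper-tail probability of the F distribution has the closed form P(F > f) = (1 + 2f/df)^(−df/2), where df is the within-group degrees of freedom. The sketch below relies on that special case only; it is not a general F-distribution routine, and the function names are hypothetical.

```python
def f_p_value_df1_2(f_value, df_within):
    """Upper-tail probability for F(2, df_within). Valid only for numerator df = 2."""
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)


def f_critical_df1_2(alpha, df_within):
    """Critical value of F(2, df_within) at level alpha (inverse of the formula above)."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


for alpha in (0.05, 0.01, 0.001):
    print(f"F(2,21) critical value at {alpha:.1%}: {f_critical_df1_2(alpha, 21):.2f}")

p_value = f_p_value_df1_2(17.69, 21)
print(f"P(F > 17.69) = {p_value:.6f}")
```

For designs with more than three groups the numerator degrees of freedom exceed 2, so a statistical table or a statistics library would be needed instead.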