Locating Variance: Post-Hoc Tests
Dr James Betts
Developing Study Skills and Research Methods (HL20107)
Lecture Outline:
• Influence of multiple comparisons on P
• Tukey’s HSD test
• Bonferroni Corrections
• Ryan-Holm-Bonferroni Adjustments
[Figure: Time to Fatigue (min), axis 0 to 120, for the Placebo, LGI, HGI and Glucose trials. *P < 0.05 vs. Placebo, HGI & Glucose. Thomas et al. (1991)]
[Figure: Placebo, Lucozade, Gatorade and Powerade trials]
Why not multiple t-tests?
i.e.
• Placebo vs Lucozade
• Placebo vs Gatorade
• Placebo vs Powerade
• Lucozade vs Gatorade
• Lucozade vs Powerade
• Gatorade vs Powerade
• We accept ‘significance’ and reject the null hypothesis at P ≤ 0.05 (i.e. a 5% chance that we are wrong to reject it, a type I error)
• Performing multiple tests therefore means that our overall chance of committing a type I error is >5%.
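To put a rough number on this, here is a minimal sketch that assumes the six comparisons are independent (they are not strictly independent here, because each trial appears in three comparisons, so treat the figure as an approximation). Under that assumption the chance of at least one type I error across m tests is 1 − (1 − α)^m. The function name is illustrative only.

```python
from math import comb


def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one type I error across m independent tests."""
    return 1 - (1 - alpha) ** m


groups = ["Placebo", "Lucozade", "Gatorade", "Powerade"]
m = comb(len(groups), 2)  # number of pairwise comparisons among 4 trials
fwer = familywise_error_rate(0.05, m)

print(f"{m} comparisons -> familywise error rate of about {fwer:.3f}")
```

With four trials the six t-tests push the overall type I error rate to roughly five times the nominal 5%.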
Post-hoc Tests
• A popular solution is the Tukey HSD (Honestly Significant Difference) test
• This uses the omnibus error term from the ANOVA to determine which means are significantly different
$$T = q \sqrt{\frac{\text{Error Variance}}{n}}$$
ANOVA: TimetoFatigue
Source          Sum of Squares   df   Mean Square   F       Sig.
Between Groups  3434.475         3    1144.825      6.364   .001
Within Groups   6476.500         36   179.903
Total           9910.975         39
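As a quick check on how the ANOVA table fits together, the sketch below recomputes the mean squares and the F ratio from the sums of squares and degrees of freedom shown in the table. The Within Groups mean square is the omnibus error term (the "Error Variance") that the Tukey formula uses. Variable names are illustrative.

```python
# Values copied from the ANOVA table for TimetoFatigue
ss_between, df_between = 3434.475, 3
ss_within, df_within = 6476.500, 36

ms_between = ss_between / df_between  # Between Groups mean square
ms_within = ss_within / df_within     # Within Groups mean square (omnibus error term)
f_ratio = ms_between / ms_within

print(f"MS between = {ms_between:.3f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.3f}")
```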
q table for Tukey’s HSD
Multiple Comparisons
Dependent Variable: TimetoFatigue
Tukey HSD
(I) Trial   (J) Trial   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
Placebo     Lucozade    -20.00000*              5.99838      .010   -36.1550             -3.8450
Placebo     Gatorade    3.30000                 5.99838      .946   -12.8550             19.4550
Placebo     Powerade    -11.40000               5.99838      .246   -27.5550             4.7550
Lucozade    Placebo     20.00000*               5.99838      .010   3.8450               36.1550
Lucozade    Gatorade    23.30000*               5.99838      .002   7.1450               39.4550
Lucozade    Powerade    8.60000                 5.99838      .487   -7.5550              24.7550
Gatorade    Placebo     -3.30000                5.99838      .946   -19.4550             12.8550
Gatorade    Lucozade    -23.30000*              5.99838      .002   -39.4550             -7.1450
Gatorade    Powerade    -14.70000               5.99838      .086   -30.8550             1.4550
Powerade    Placebo     11.40000                5.99838      .246   -4.7550              27.5550
Powerade    Lucozade    -8.60000                5.99838      .487   -24.7550             7.5550
Powerade    Gatorade    14.70000                5.99838      .086   -1.4550              30.8550
* The mean difference is significant at the .05 level.
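The Tukey output can be reproduced by hand from the ANOVA error term. The sketch below rests on two assumptions that are not stated on the slides: n = 10 per trial (40 observations split equally across 4 trials, as implied by the total df of 39), and a critical q of about 3.809 for 4 means and 36 error df at the .05 level (a value you would normally read from a q table). The mean differences are the absolute values from the table above.

```python
from math import sqrt

ms_error = 179.903  # Within Groups mean square from the ANOVA table
n = 10              # ASSUMPTION: equal group sizes, 40 observations / 4 trials
q_crit = 3.809      # ASSUMPTION: approximate q for 4 means, 36 df, alpha = .05

# Smallest mean difference that Tukey's test treats as significant
hsd = q_crit * sqrt(ms_error / n)

# Absolute mean differences (min) for the six unique pairs in the Tukey table
diffs = {
    ("Placebo", "Lucozade"): 20.0,
    ("Placebo", "Gatorade"): 3.3,
    ("Placebo", "Powerade"): 11.4,
    ("Lucozade", "Gatorade"): 23.3,
    ("Lucozade", "Powerade"): 8.6,
    ("Gatorade", "Powerade"): 14.7,
}

significant = [pair for pair, diff in diffs.items() if diff > hsd]

print(f"HSD = {hsd:.2f} min")
for a, b in significant:
    print(f"{a} vs {b}: significant")
```

Under these assumptions the HSD matches the half-width of every confidence interval in the table (for example 36.1550 − 20.0000), and only the two Lucozade contrasts flagged with an asterisk exceed it.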
Tukey Test Critique
• As you learnt last week, the omnibus error term is not reflective of all contrasts if sphericity is violated
• So Tukey tests commit many type I errors with even a slight degree of asphericity.
Solution for Aspherical Data
• There are alternatives to the Tukey HSD test which use specific error terms for each contrast
– Fisher’s LSD (Least Significant Difference)
– Sidak
– Bonferroni
– Many others…e.g. Newman-Keuls, Scheffé, Duncan, Dunnett, Gabriel, R-E-G-W, etc.
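The first three procedures in this list also differ in how strictly they adjust the significance threshold for each comparison. As a hedged sketch for m = 6 comparisons (the number of pairs in the sports drink example) and using the standard textbook formulas: Fisher's LSD leaves α unadjusted, Bonferroni uses α / m, and Sidak uses 1 − (1 − α)^(1/m). Variable names are illustrative.

```python
alpha = 0.05
m = 6  # ASSUMPTION: six pairwise comparisons, as in the four-trial example

lsd_threshold = alpha                             # Fisher's LSD: no adjustment
bonferroni_threshold = alpha / m                  # Bonferroni: divide alpha by m
sidak_threshold = 1 - (1 - alpha) ** (1 / m)      # Sidak: exact under independence

print(f"Fisher's LSD: P < {lsd_threshold:.4f}")
print(f"Bonferroni:   P < {bonferroni_threshold:.4f}")
print(f"Sidak:        P < {sidak_threshold:.4f}")
```

Sidak is always slightly less strict than Bonferroni, but the two thresholds are close for small α.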
Fisher’s LSD / Bonferroni
[Figure: Trials 1 to 4]
[Figure: Serum Insulin (pmol.l-1), axis 0 to 400, for CHO and CHO-PRO at Pre, 30min, 60min, 90min, 1h, 2h, 3h, 4h, 10min and Post, spanning Run 1, Recovery and Run 2, with a * marker]
Bonferroni Correction Critique
• Correction of LSD values successfully controls for type I errors following a 1-way ANOVA
• However, factorial designs often involve a larger number of contrasts, many of which may not be relevant.
[Figure: Recovery Supp. 1 and Recovery Supp. 2]
See also Perneger (1998) BMJ 316: 1236
Solution for Factorial Designs
• An adjustment to the standard Bonferroni correction can be applied for factorial designs
• This ‘Ryan-Holm-Bonferroni’ or ‘stepwise’ method involves returning to the P values of interest from our LSD test
• These P values are placed in ascending order and the smallest (most significant) is Bonferroni corrected (i.e. P × m, where m is the number of contrasts of interest)
• However, each subsequent P value is multiplied by m minus the number of contrasts already corrected.
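The stepwise procedure can be sketched as follows, alongside the standard Bonferroni correction for comparison. The P values are hypothetical and are not taken from the lecture data. Capping adjusted values at 1 and never letting a later adjusted P fall below an earlier one are the usual conventions for Holm-type adjustments; the slides do not state them, so treat both as assumptions.

```python
def bonferroni(p_values):
    """Standard Bonferroni: every P value is multiplied by m (capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Stepwise correction: the k-th smallest P is multiplied by m - (k - 1)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest P first
    adjusted = [0.0] * m
    running_max = 0.0
    for already_corrected, i in enumerate(order):
        corrected = min(1.0, p_values[i] * (m - already_corrected))
        running_max = max(running_max, corrected)  # keep adjusted P non-decreasing
        adjusted[i] = running_max
    return adjusted


# HYPOTHETICAL unadjusted P values for four contrasts of interest
lsd_p = [0.012, 0.004, 0.030, 0.200]

for raw, b, h in zip(lsd_p, bonferroni(lsd_p), holm(lsd_p)):
    print(f"raw P = {raw:.3f}   Bonferroni = {b:.3f}   stepwise = {h:.3f}")
```

In this made-up example the contrast with raw P = 0.012 stays below .05 under both methods, but its stepwise value (0.012 × 3) is smaller than its Bonferroni value (0.012 × 4), which is where the extra power comes from.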
Summary: Post-Hoc Tests
• A Tukey test may be appropriate when sphericity can be assumed
• Multiple t-tests with a Bonferroni correction are more appropriate for aspherical data
• Stepwise correction of standard Bonferroni procedures maintains power with factorial designs
• Best option is to keep your study simple:
– Pre-planned contrast at a specific time point
– Summary statistics (e.g. rate of change, area under curve)
– Just make an informed decision based on the data available.
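As an illustration of the summary-statistic option, the sketch below collapses a repeated-measures time course into a single area-under-curve value per participant using the trapezoid rule, so that one comparison replaces many time-point contrasts. The insulin values are hypothetical and the function name is illustrative.

```python
def area_under_curve(times, values):
    """Trapezoid-rule area under a time course sampled at the given times."""
    points = list(zip(times, values))
    return sum(
        (t1 - t0) * (v0 + v1) / 2
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    )


# HYPOTHETICAL serum insulin (pmol/l) for one participant during recovery
times_h = [0, 1, 2, 3, 4]
insulin = [50, 300, 250, 150, 100]

auc = area_under_curve(times_h, insulin)
print(f"AUC = {auc:.0f} pmol/l x h")
```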
Further reading from this lecture…
• Atkinson, G. (2001) Analysis of repeated measurements in physical therapy research. Physical Therapy in Sport 2: p. 194-208
• Atkinson, G. (2002) Analysis of repeated measurements in physical therapy research: multiple comparisons amongst level means and multi-factorial designs. Physical Therapy in Sport 3: p. 191-203
Compulsory reading for next week’s lecture…
• Batterham, A. M. & Atkinson, G. (2005) How Big Does My Sample Need to Be? A primer on the Murky World of Sample Size Estimation. Physical Therapy in Sport 6: p. 153-163