Statistics: the ten main mistakes
Didier Concordet, d.concordet@envt.fr
Ecole Nationale Vétérinaire de Toulouse
July 2005
Statistical mistakes are frequent
• Many surveys of statistical errors in the medical literature, with error rates ranging from 30% to 90% (Altman, 1991; Gore et al., 1976; Pocock et al., 1987; MacArthur, 1984)
• Reviews of the biomedical literature have consistently found that about half the articles use incorrect statistical methods (Glantz, 1980)
When do they occur?
• When designing the experiment
• When collecting data
• When analysing data
• When interpreting results
Design
• Lack of proper randomisation
    the inference space is not defined
    poor balance of the groups to be compared
    lack of a control group (maybe less frequent now)
    confounding factors exist
• Lack of power
    the sample size is not large enough to answer the question
    the statistical unit is not well defined
Inference space definition (M1)
An experiment in 2-year-old beagles showed that the temperature of dogs treated with the antipyretic drug A decreased by 2 °C.
Does this result still hold for
    all 2-year-old beagles
    3-year-old beagles
    beagles
    dogs
    man
Poor balance (M2)
Clinical trial: comparison of 2 antipyretics, rectal temperature after treatment
Treatment    Mean    N      SD
Reference    39      100    1
New TRT      37      100    1
Reference < New TRT (P<0.001)
Poor balance
Clinical trial: comparison of 2 antipyretics, rectal temperature after treatment
Clinical trial 1
Treatment    Mean    N      SD
Reference    40      90     1
New TRT      42      50     1
New TRT < Ref (P<0.001)
Clinical trial 2
Treatment    Mean    N      SD
Reference    30      10     1
New TRT      32      50     1
New TRT < Ref (P<0.001)
Conclusion: Reference > New TRT
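The two "Poor balance" slides fit together arithmetically. Pooling the two trials, weighted by group size, gives back the means 39 and 37 of the first slide, although the new treatment has the higher temperature in each trial. The Python sketch below checks this. Reading the first slide as the pooled data is an assumption, and the dictionary layout and the `pooled_mean` helper are illustrative names that do not come from the slides.

```python
# Per-trial summaries read from the slide: (mean temperature, group size).
trials = {
    "trial 1": {"Reference": (40.0, 90), "New TRT": (42.0, 50)},
    "trial 2": {"Reference": (30.0, 10), "New TRT": (32.0, 50)},
}


def pooled_mean(treatment):
    """Size-weighted mean over trials, i.e. what you see if the trial is ignored."""
    total = sum(mean * n for mean, n in (t[treatment] for t in trials.values()))
    size = sum(n for _, n in (t[treatment] for t in trials.values()))
    return total / size


# Within each trial the new treatment leaves a higher temperature.
for name, arms in trials.items():
    diff = arms["New TRT"][0] - arms["Reference"][0]
    print(f"{name}: New TRT - Reference = {diff:+.1f}")

# Pooled, the sign flips, because the groups are unevenly spread over the trials.
pooled_diff = pooled_mean("New TRT") - pooled_mean("Reference")
print(f"pooled: New TRT - Reference = {pooled_diff:+.1f}")
```

The reversal comes only from the imbalance: 90 of the 100 reference animals are in the trial with the higher temperatures, against 50 of the 100 animals on the new treatment.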
Power (M3)
A clinical study to compare the efficacy of two treatments (Ref. and Test)
Expected difference between the treatments = 4, SD 2, for the efficacy variable
A parallel two-group design is planned with 5 dogs in each group
What to think about this study?
35% power for a type I risk of 5%
Even if the expected difference exists, only 35% of samples of dogs (of size 5) actually exhibit it!
Power
Efficacy variable on two groups of dogs
        Ref     Test
Mean    15.4    20.0
SD      2.4     2.6
N       5       5
Student t-test: P = 0.18
Actually no conclusion
A real story
A study was performed to assess the effect of diet on several biochemical compounds (about 20).
To this end, a dog was fed a "normal" diet for 3 months and then the new diet for 3 months.
Every two days, a blood sample was taken and the biochemical compounds were assayed.
At the end of the experiment, 90 data points were available for each biochemical compound.
There was a significant difference between the effects of the two diets for 10 biochemical compounds (P<0.001).
This result was obtained with a sample size of 90
Statistical unit (M4)
The statistical unit (an individual) is a statistical object that cannot be divided.
We want to generalise results obtained on a finite collection of units (a sample) to a population of units.
Despite the apparent "wealth" of data, the sample size was equal to 1, not 90.
At the end of the experiment, the only dog in the experiment was well known, but what about the other dogs of the population?
Experiment
• Missing data not adequately reported
• Extreme values excluded
• Data ignored because they did not support the hypothesis?
Analysis
• Failure to check assumptions of the statistical methods (M5)
    homoscedasticity (for a t-test, a linear regression, …)
    using a linear regression or a correlation coefficient without first establishing linearity
• Ignoring informative "missing" data
    death and its consequences
    data below the limit of quantification (LOQ)
• Choosing the question to get an answer
• Multiple comparisons
Homoscedasticity (M5)
[Figure: clearance by treatment (1 and 2)]
t-test: P-value = 0.56
After log-transformation: P-value = 0.026
What the t-test can see
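A log-transformation can help when the spread grows with the level of the response, because it turns a multiplicative spread into an additive one. The Python sketch below illustrates this with hypothetical clearance values (not the data behind the slide). The `sd_ratio` helper is an illustrative name, and comparing sample SDs is only a rough first check, not a formal test of homoscedasticity.

```python
import math
import statistics

# Hypothetical clearances, NOT the slide's data: group 2 is group 1 times 10,
# so the spread is proportional to the level of the response.
group1 = [1.0, 2.0, 4.0, 8.0]
group2 = [10.0, 20.0, 40.0, 80.0]


def sd_ratio(a, b):
    """Larger sample SD divided by the smaller one (1.0 means equal spread)."""
    sa, sb = statistics.stdev(a), statistics.stdev(b)
    return max(sa, sb) / min(sa, sb)


raw_ratio = sd_ratio(group1, group2)
log_ratio = sd_ratio([math.log(x) for x in group1],
                     [math.log(x) for x in group2])

print(f"SD ratio on the raw scale: {raw_ratio:.2f}")
print(f"SD ratio on the log scale: {log_ratio:.2f}")
```

On the raw scale one group is ten times more variable than the other, which violates the equal-variance assumption of the classical t-test. On the log scale the two spreads coincide in this constructed example.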
Linearity/Correlation (M5)
Linear regression: correlation R = -0.002
Linear regression: correlation R = -0.93
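A correlation close to zero, such as R = -0.002, does not mean that the two variables are unrelated: it only says that no straight line describes the relationship. The Python sketch below uses hypothetical data in which y is completely determined by x, yet the Pearson correlation is zero. The `pearson_r` helper is written out by hand for illustration and is not taken from the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: a perfect but non-linear (quadratic) relationship.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"Pearson r for y = x^2 on a symmetric range: {r:.3f}")
```

Plotting the data before fitting a line or quoting R is what reveals this kind of curvature.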
Linearity/Correlation
Linear regression: correlation R = 0.84
A linear model with 3 groups: within-group correlation R = -0.92
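The overall correlation and the within-group correlation can have opposite signs when the groups sit at different levels. The Python sketch below builds three hypothetical groups (not the slide's data) in which y decreases with x inside every group while the group centres increase together. The `pearson_r` helper and the data layout are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Three hypothetical groups: inside each group y falls as x rises,
# but the groups themselves are shifted upwards along both axes.
groups = []
for k in range(3):
    xs = [10.0 * k + d for d in (0.0, 1.0, 2.0)]
    ys = [10.0 * k + d for d in (2.0, 1.0, 0.0)]
    groups.append((xs, ys))

within = [pearson_r(xs, ys) for xs, ys in groups]
all_x = [x for xs, _ in groups for x in xs]
all_y = [y for _, ys in groups for y in ys]
overall = pearson_r(all_x, all_y)

print("within-group r:", [round(r, 2) for r in within])
print("overall r:", round(overall, 2))
```

Ignoring the group structure would suggest a strong positive relationship, whereas a model with one intercept per group shows a negative one.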
Ignoring data (M6)
[Figure: two panels, temperature (°C) versus time (day), days 1 to 6]
Ignoring data
[Figure: two panels, temperature (°C) versus time (day), days 1 to 6]
Choosing the question to get an answer (M7)
Occurs frequently in the presentation of clinical trial results
The question becomes random: it changes with the sample of animals. The question is chosen with its answer in hand… Think of a coin-flip game where you win 1 € when heads or tails comes up, and you choose which of the two wins once you know the result of the flip!
Such an approach increases the number of false discoveries.
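The inflation of false discoveries can be illustrated by simulation. In the Python sketch below no treatment effect exists at all, so every P value is uniform on [0, 1], and the "question" (the endpoint reported) is picked after looking at the results. The function name, the number of endpoints, and the assumption of independent endpoints are all illustrative choices, not details from the slides.

```python
import random


def chance_of_spurious_finding(n_endpoints, alpha=0.05, n_trials=20000, seed=1):
    """Share of null experiments in which the best-looking endpoint has P < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Under the null hypothesis each P value is uniform on [0, 1]
        # (independent endpoints are assumed here for simplicity).
        p_values = [rng.random() for _ in range(n_endpoints)]
        # Choosing the question afterwards = reporting the smallest P value.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_trials


fixed_question = chance_of_spurious_finding(n_endpoints=1)
chosen_afterwards = chance_of_spurious_finding(n_endpoints=5)

print(f"question fixed in advance: {fixed_question:.3f}")
print(f"best of 5 chosen afterwards: {chosen_afterwards:.3f}")
```

With the question fixed in advance the false-positive rate stays near the nominal 5%. With five candidate questions and independent endpoints, theory gives 1 - 0.95^5, roughly 23%.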
Multiple comparisons (M8)
One wants to compare the ADG (average daily gain) obtained with 5 different diets in pigs
Diet    1      2      3      4      5
Mean    700    880    730    790    930
SD      48     50     55     44     60
Diets ordered by increasing mean: 1 3 4 2 5
Ten t-tests
A risk of 5% for each comparison: the global risk can be very large
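The size of the global risk can be worked out for the ten pairwise tests. The Python sketch below counts the pairs and computes the chance of at least one false positive. The independence formula is an idealisation (pairwise tests that share a group are not independent), the bound `m * alpha` holds whatever the dependence, and the per-test level `alpha / m` is a Bonferroni-style remedy that the slide does not mention.

```python
from itertools import combinations

diets = [1, 2, 3, 4, 5]
alpha = 0.05  # type I risk used for each individual comparison

# Every pair of diets gives one t-test.
pairs = list(combinations(diets, 2))
m = len(pairs)

# Chance of at least one false positive if the m tests were independent
# (an idealisation: tests sharing a diet are correlated).
global_risk_if_independent = 1 - (1 - alpha) ** m

# Upper bound valid under any dependence between the tests.
global_risk_upper_bound = min(1.0, m * alpha)

# Bonferroni-style per-test level keeping the global risk at most alpha
# (a common remedy, assumed here; not taken from the slide).
per_test_alpha = alpha / m

print(f"number of pairwise comparisons: {m}")
print(f"global risk if independent: {global_risk_if_independent:.3f}")
print(f"global risk upper bound: {global_risk_upper_bound:.2f}")
print(f"per-test level for a 5% global risk: {per_test_alpha:.4f}")
```

Under the independence idealisation, about 40% of such experiments would show at least one "significant" pair even if the five diets were identical.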
Interpretation/presentation
• Standard error and standard deviation
• P values: non-significant effects
• False causality
Standard error / standard deviation (M9)
The clearance of the drug was equal to 68 ± 5 mL/min
Two possible meanings, depending on the meaning of 5
If 5 is the standard error of the mean (SE), there is a 95% chance that the population mean clearance belongs to
[68 - 2 × 5 ; 68 + 2 × 5] = [58 ; 78]
If 5 is the standard deviation (SD), 95% of animals have their clearance within
[68 - 2 × 5 ; 68 + 2 × 5] = [58 ; 78]
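The two quantities are linked by SE = SD / sqrt(n), so on the same data they give very different intervals. The Python sketch below computes both from a hypothetical sample of nine clearances chosen to have a mean of 68 mL/min. The data are invented for illustration, and the "mean ± 2" intervals rely on the usual assumption of roughly normal data.

```python
import math
import statistics

# Hypothetical clearances (mL/min) for 9 animals; NOT data from the slides.
clearances = [60, 62, 65, 66, 68, 70, 71, 74, 76]

n = len(clearances)
mean = statistics.mean(clearances)
sd = statistics.stdev(clearances)   # spread between animals
se = sd / math.sqrt(n)              # uncertainty about the mean

# Interval expected to contain about 95% of the animals.
animals_interval = (mean - 2 * sd, mean + 2 * sd)
# Interval expected to contain the population mean with about 95% confidence.
mean_interval = (mean - 2 * se, mean + 2 * se)

print(f"mean = {mean}, SD = {sd:.2f}, SE = {se:.2f}")
print(f"animals: [{animals_interval[0]:.1f} ; {animals_interval[1]:.1f}]")
print(f"mean:    [{mean_interval[0]:.1f} ; {mean_interval[1]:.1f}]")
```

Reporting "mean ± value" without stating whether the value is the SD or the SE leaves the reader unable to tell which of the two intervals is meant.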
P values (M10)
The difference between the effects of drugs A and B is not significant (P = 0.56); therefore drug A can be substituted by drug B.
NO
The only conclusion that can be drawn from such a P value is that you didn't see any difference between the effects of drugs A and B. That does not mean that such a difference does not exist.
Absence of evidence is not evidence of absence
P values (M10)
Drug A has a higher efficacy than drug B (P = 0.001)
Drug C has a higher efficacy than drug B (P = 0.04)
Since 0.001 < 0.04, drug A has a higher efficacy than drug C.
NO
The only conclusion that can be drawn from such P values is that you are quite sure that A > B and less sure that C > B. This does not imply anything about the magnitude of the differences.
Significant does not mean important
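A P value mixes the size of the effect with the size of the sample, which is why it cannot rank effects. The Python sketch below compares two hypothetical studies with a two-sample z-test. The z-test with a known common SD is a simplification chosen so that only the standard library is needed, and the function name and all the numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist


def two_sided_p(diff, sd, n_per_group):
    """Two-sided P value of a two-sample z-test with a known, common SD."""
    se = sd * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    z = diff / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical studies: a small effect measured on many animals ...
p_small_effect = two_sided_p(diff=0.2, sd=1.0, n_per_group=1000)
# ... and an effect five times larger measured on few animals.
p_large_effect = two_sided_p(diff=1.0, sd=1.0, n_per_group=8)

print(f"small effect, 1000 per group: P = {p_small_effect:.2g}")
print(f"large effect, 8 per group:    P = {p_large_effect:.2g}")
```

The smaller effect gets the much smaller P value, simply because it was measured on far more animals.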
False causality: lying with statistics
There is a strong positive correlation between the number of firefighters present at a fire and the amount of fire damage.
Thus, the firefighters present at a fire create higher fire damage!
The correlation coefficient is nothing more than a measure of the strength of a linear relationship between 2 variables.
Correlation cannot establish causality.
A strong correlation between X and Y can occur when
    "X" causes "Y"
    "Y" causes "X"
    "Z" causes "X" and "Y" (Z = fire size in the previous example)
    by chance, with small sample sizes, even when X and Y are independent
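The case where "Z" causes both "X" and "Y" can be reproduced by simulation. In the Python sketch below fire size drives both the number of firefighters and the damage, and neither of the two acts on the other. The coefficients, noise levels and seed are arbitrary assumptions, and the partial correlation given Z is the standard first-order formula, used here as one possible way to account for the confounder.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)

# Hypothetical model: fire size (Z) drives both variables independently.
fire_size = [rng.uniform(1.0, 10.0) for _ in range(2000)]
firefighters = [2.0 * z + rng.gauss(0.0, 1.0) for z in fire_size]
damage = [5.0 * z + rng.gauss(0.0, 2.0) for z in fire_size]

r_xy = pearson_r(firefighters, damage)
r_xz = pearson_r(firefighters, fire_size)
r_yz = pearson_r(damage, fire_size)

# Correlation between X and Y once the linear effect of Z is removed.
partial_r = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print(f"correlation firefighters/damage: {r_xy:.2f}")
print(f"same, given fire size: {partial_r:.2f}")
```

The raw correlation is strong, but it largely disappears once fire size is taken into account, as expected when the two variables share a common cause.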
How to avoid these mistakes?
• Consult your preferred statistician for help in the design of complicated experiments
• Use basic descriptive statistics first (graphics, summary statistics, …)
• Use common sense
• Consider learning more statistics