Statistics : the ten main mistakes

Didier Concordet
d.concordet@envt.fr

Ecole Nationale Vétérinaire de Toulouse

July 2005

Statistical mistakes are frequent

• Many surveys of statistical errors in the medical literature report error rates ranging from 30% to 90% (Altman, 1991; Gore et al., 1976; Pocock et al., 1987; MacArthur, 1984)

• Reviews of the biomedical literature have consistently found that about half the articles use incorrect statistical methods (Glantz, 1980)

When do they occur ?

• When designing the experiment
• When collecting data
• When analysing data
• When interpreting results

Design

• Lack of a proper randomisation
  the inference space is not defined
  poor balance of the groups to be compared
  lack of control group (maybe less frequent now)
  there exist confounding factors

• Lack of power
  the sample size is not large enough to answer the question
  the statistical unit is not well defined

Inference space definition (M1)

An experiment in 2-year-old beagles showed that the temperature of dogs treated with the antipyretic drug A decreased by 2 °C.

Does this result still hold for
  all 2-year-old beagles
  3-year-old beagles
  beagles
  dogs
  man

Poor balance (M2)

Clinical trial: comparison of 2 antipyretics, rectal temperature after treatment

             Mean     N    SD
REFERENCE      39   100     1
New TRT        37   100     1

Reference < New TRT (P<0.001)

Poor balance

Clinical trial: comparison of 2 antipyretics, rectal temperature after treatment

Clinical trial 1

             Mean     N    SD
REFERENCE      40    90     1
New TRT        42    50     1

New TRT < Ref (P<0.001)

Clinical trial 2

             Mean     N    SD
REFERENCE      30    10     1
New TRT        32    50     1

New TRT < Ref (P<0.001)

Conclusion : Reference > New TRT
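The reversal between the pooled comparison and the per-trial comparisons can be checked with simple arithmetic. The sketch below uses only the means and group sizes shown on the slides and assumes that the pooled groups of 100 animals are the two trials put together; the names `trials` and `pooled_mean` are illustrative, and a lower rectal temperature is read as better efficacy.

```python
# Group summaries from the slides: (mean rectal temperature, number of animals).
trials = {
    "trial 1": {"reference": (40.0, 90), "new": (42.0, 50)},
    "trial 2": {"reference": (30.0, 10), "new": (32.0, 50)},
}


def pooled_mean(groups):
    """Mean of all animals when groups given as (mean, n) are merged."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


# Within each trial, the new treatment leaves animals 2 degrees warmer.
within = {
    name: arms["new"][0] - arms["reference"][0] for name, arms in trials.items()
}

# After pooling, the unbalanced group sizes reverse the sign of the difference.
pooled = {
    arm: pooled_mean([trials[name][arm] for name in trials])
    for arm in ("reference", "new")
}

for name, diff in within.items():
    print(f"{name}: new - reference = {diff:+.1f}")
print(f"pooled: new - reference = {pooled['new'] - pooled['reference']:+.1f}")
```

The reference group is dominated by trial 1 (90 of 100 animals, high temperatures) while the new-treatment group is split evenly, so the pooled comparison mostly measures the difference between trials, not between treatments.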

Power (M3)

A clinical study to compare efficacy of two treatments (Ref. and Test)

For the efficacy variable: expected difference between the treatments = 4, SD 2.

A parallel two-group design is planned with 5 dogs in each group.

What to think about this study ?

35 % power for a type I risk of 5 %.
Even if the expected difference exists, only 35 % of the samples of dogs (of size 5) actually exhibit it !

Power

Efficacy variable on two groups of dogs

         Ref    Test
Mean    15.4    20.0
SD       2.4     2.6

Student t-test : P = 0.18
Actually no conclusion

A real story

A study was performed to investigate the effect of diet on several biochemical compounds (about 20).

To this end, a dog was fed a "normal" diet for 3 months and then the new diet for 3 months.

Every two days, a blood sample was taken and the biochemical compounds were assayed.

At the end of the experiment, 90 data points were available for each biochemical compound.

There was a significant difference between the effects of the two diets for 10 biochemical compounds (P<0.001).

This result was obtained with a sample size of 90

Statistical unit (M4)

The statistical unit (an individual) is a statistical object that cannot be divided.

We want to generalise results obtained on a finite collection of units (a sample) to a population of units.

Despite the appearance of "wealth", the sample size was equal to 1, not 90.
At the end of the experiment, the only dog in the experiment was well known, but what about the other dogs of the population ?

Experiment

• Missing data not adequately reported

• Extreme values excluded

• Data ignored because they did not support the hypothesis ?

Analysis

• Failure to check assumptions of the statistical methods (M5)
  homoscedasticity (for a t-test, a linear regression, …)
  using a linear regression without first establishing linearity
  … correlation

• Ignoring informative "missing" data
  death and its consequences
  data below LOQ

• Choosing the question to get an answer

• Multiple comparisons

Homoscedasticity (M5)

t-test : P-value = 0.56

After log-transformation : P-value = 0.026

What the t-test can see

Linearity/Correlation (M5)

Linear regression : correlation R = -0.002

Linear regression : correlation R = -0.93

Linearity/Correlation

Linear regression : correlation R = 0.84

A linear model with 3 groups : within-group correlation R = -0.92
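Both situations can be reproduced with a few lines of Python. This is a minimal sketch on made-up data (it does not use the data behind the figures, so the R values differ from those quoted): a perfectly determined but curved relationship gives a correlation of zero, and three groups that each show a decreasing trend give a strongly positive correlation once pooled. The helper `pearson_r` is written out by hand so that the example needs nothing beyond the standard library.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Case 1: y is an exact function of x, but the relationship is not linear.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_curve = pearson_r(xs, ys)

# Case 2: three hypothetical groups, each with a slope of -1,
# but each group sits higher and further right than the previous one.
groups = [
    [(x + shift, -x + 3 * shift) for x in range(4)] for shift in (0, 4, 8)
]
r_within = [pearson_r(*zip(*group)) for group in groups]
pooled_points = [point for group in groups for point in group]
r_pooled = pearson_r(*zip(*pooled_points))

print(f"curved relationship: R = {r_curve:.3f}")
print("within groups:", [round(r, 3) for r in r_within])
print(f"all groups pooled:  R = {r_pooled:.3f}")
```

In both cases a plot of the raw data shows the problem at once, which is why linearity should be checked graphically before a regression or a correlation is reported.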

Ignoring data (M6)

[Figures: plots against Time (Day), days 1 to 6]

Choosing the question to get an answer (M7)

Occurs frequently in the presentation of clinical trial results.

The question becomes random : it changes with the sample of animals. The question is chosen with its answer in hand… Think about a coin-flip game where you win 1 € when either heads or tails occurs, and you choose the decision rule once you know the result of the flip !

Such an approach increases the number of false discoveries.

Multiple comparisons (M8)

One wants to compare the ADG (average daily gain) obtained with 5 different diets in pigs

Diet      1     2     3     4     5
Mean    700   880   730   790   930
SD       48    50    55    44    60

Diets ordered by increasing mean : 1 3 4 2 5

Ten t-tests

A risk of 5% for each comparison : the global risk can be very large
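How large the global risk can get is easy to quantify. The sketch below counts the pairwise comparisons among 5 diets and computes the chance of at least one false positive under the simplifying assumption that the ten tests are independent (pairwise tests sharing groups are not, so this is an approximation). It also shows the Bonferroni bound, which holds without that assumption, and the Bonferroni per-test level as one possible correction; the slides do not prescribe a specific method.

```python
from math import comb

alpha = 0.05   # type I risk used for each individual t-test
n_diets = 5

# Number of distinct pairs of diets, i.e. number of t-tests.
n_tests = comb(n_diets, 2)

# Chance of at least one false positive if the tests were independent.
global_risk_independent = 1 - (1 - alpha) ** n_tests

# Upper bound on the global risk that needs no independence assumption.
global_risk_bound = min(1.0, n_tests * alpha)

# Per-test level that keeps the global risk at or below alpha (Bonferroni).
bonferroni_alpha = alpha / n_tests

print(f"number of t-tests: {n_tests}")
print(f"global risk if independent: {global_risk_independent:.3f}")
print(f"global risk, upper bound:   {global_risk_bound:.3f}")
print(f"Bonferroni per-test level:  {bonferroni_alpha:.4f}")
```

With ten tests at 5 % each, the global risk is around 40 % when no diet has any effect at all.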

Interpretation/presentation

• Standard error and standard deviation

• P values : non significant effects

• False causality

Standard error / standard deviation (M9)

The clearance of the drug was equal to 68 ± 5 mL/min

Two possible meanings depending on the meaning of 5

If 5 is the standard error of the mean (se), there is about a 95 % chance that the population mean clearance belongs to

[68 - 2 × 5 ; 68 + 2 × 5]

If 5 is the standard deviation (SD), about 95 % of animals have their clearance within

[68 - 2 × 5 ; 68 + 2 × 5]
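The two readings are linked by se = SD / √n, so they describe very different data sets. The sketch below works both out for a hypothetical sample of n = 25 animals (the sample size is an assumption made for illustration; it is not given on the slide) and uses the same factor 2 as the intervals above.

```python
from math import sqrt

mean = 68.0    # reported clearance, mL/min
reported = 5.0  # the ambiguous "± 5"
n = 25          # hypothetical number of animals, for illustration only


def two_sided(center, half_width):
    """Interval center ± half_width as a (low, high) tuple."""
    return (center - half_width, center + half_width)


# Reading 1: 5 is the SD of individual clearances.
sd_1 = reported
se_1 = sd_1 / sqrt(n)
animals_1 = two_sided(mean, 2 * sd_1)    # where about 95 % of animals lie
pop_mean_1 = two_sided(mean, 2 * se_1)   # where the population mean lies

# Reading 2: 5 is the standard error of the mean.
se_2 = reported
sd_2 = se_2 * sqrt(n)
animals_2 = two_sided(mean, 2 * sd_2)
pop_mean_2 = two_sided(mean, 2 * se_2)

print("5 is the SD: animals", animals_1, "population mean", pop_mean_1)
print("5 is the se: animals", animals_2, "population mean", pop_mean_2)
```

With 25 animals, the same "68 ± 5" describes either a homogeneous population (animals between 58 and 78) or a very variable one (animals between 18 and 118), which is why the meaning of the ± must always be stated.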

P values (M10)

The difference between the effect of the drugs A and B is not significant (P = 0.56) therefore drug A can be substituted by drug B.

NO

The only conclusion that can be drawn from such a P value is that you didn't see any difference between the effect of the drugs A and B. That does not mean that such a difference does not exist.

Absence of evidence is not evidence of absence

P values (M10)

The drug A has a higher efficacy than the drug B (P = 0.001).
The drug C has a higher efficacy than the drug B (P = 0.04).
Since 0.001 < 0.04, the drug A has a higher efficacy than the drug C.

NO

The only conclusion that can be drawn from such P values is that you are sure that A > B and less sure that C > B.
This does not presume anything about the magnitude of the differences.

Significant does not mean important

False causality : lying with statistics

There is a strong positive correlation between the number of firefighters present at a fire and the amount of fire damage.
Thus, the firefighters present at a fire create higher fire damage !

The correlation coefficient is nothing more than a measure of the strength of a linear relationship between 2 variables.
Correlation cannot establish causality.
A strong correlation between X and Y can occur when
  "X" causes "Y"
  "Y" causes "X"
  "Z" causes "X" and "Y" (Z = fire size in the previous example)
  by chance, with small sample sizes, even when X and Y are independent
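The "Z causes X and Y" case can be simulated directly. In this sketch all numbers are invented: the fire size drives both the number of firefighters and the damage, and the firefighters have no effect on the damage at all. The raw correlation between firefighters and damage is nevertheless very strong, and it vanishes once the fire size is accounted for through a partial correlation. The helper `pearson_r` and the variable names are illustrative.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is reproducible

# Z: fire size. X and Y each depend on Z only, plus independent noise.
fire_size = [rng.uniform(0, 10) for _ in range(2000)]
firefighters = [2 * z + rng.gauss(0, 1) for z in fire_size]
damage = [10 * z + rng.gauss(0, 5) for z in fire_size]

r_xy = pearson_r(firefighters, damage)
r_xz = pearson_r(firefighters, fire_size)
r_yz = pearson_r(damage, fire_size)

# Partial correlation of X and Y given Z: what is left once Z is removed.
r_xy_given_z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"firefighters vs damage:            R = {r_xy:.2f}")
print(f"firefighters vs damage, given size: R = {r_xy_given_z:.2f}")
```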

How to avoid these mistakes ?

• Consult your preferred statistician for help in the design of complicated experiments
• Use basic descriptive statistics first (graphics, summary statistics, …)
• Use common sense
• Consider learning more statistics
