Biostatistics, statistical software V. Statistical errors, one- and two-sided tests. One-way and multifactor analysis of variance. Krisztina Boda PhD, Department of Medical Informatics, University of Szeged



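One- and two-tailed (sided) tests

Two-tailed test
H0: there is no change
Ha: there is a change (in either direction)

One-tailed test
H0: the change is negative or zero
Ha: the change is positive

p-values: p(one-tailed) = p(two-tailed)/2, provided that the test statistic has a symmetric distribution and the observed change is in the direction stated in Ha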
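
The relation between the two p-values can be checked numerically. The sketch below is only an illustration: it assumes a one-sample z-test with known standard deviation (not a test named on the slide), and the function name and the numbers are hypothetical.

```python
from statistics import NormalDist


def z_test_p_values(sample_mean, mu0, sigma, n):
    """Return (two-tailed p, one-tailed p for Ha: mean > mu0) of a z-test."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    upper_tail = 1.0 - NormalDist().cdf(z)            # P(Z >= z) under H0
    two_tailed = 2.0 * min(upper_tail, 1.0 - upper_tail)
    return two_tailed, upper_tail


# Hypothetical data: mean change 1.2, known SD 3, n = 25 (so z = 2).
p_two, p_one = z_test_p_values(1.2, 0.0, 3.0, 25)

# Same size of change, but in the direction opposite to Ha (z = -2).
p_two_neg, p_one_neg = z_test_p_values(-1.2, 0.0, 3.0, 25)

print(p_two, p_one)          # one-tailed p is half of the two-tailed p
print(p_two_neg, p_one_neg)  # one-tailed p is 1 - (two-tailed p)/2
```

When the change goes against the direction of Ha, halving the two-tailed p-value would be wrong: the one-tailed p-value is then close to 1.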


Significance

Significant difference: if we claim that there is a difference (effect), the probability of a mistake is small (at most α, the probability of a Type I error).

Not significant difference: we say that there is not enough information to show a difference. Possible reasons:
- Perhaps there is no difference
- There is a difference, but the sample size is small
- The dispersion is large
- The method was wrong

Even in the case of a statistically significant difference, one has to think about its biological meaning.


Statistical errors

Truth         Decision: do not reject H0            Decision: reject H0 (significance)
H0 is true    correct                               Type I error, its probability: α
Ha is true    Type II error, its probability: β     correct


Error probabilities

The probability of a Type I error is known (α).
The probability of a Type II error (β) is not known. It depends on:
- the significance level (α)
- the sample size
- the standard deviation(s)
- the true difference between the populations
- others (type of the test, assumptions, design, ...)

The power of a test: 1 - β, the ability to detect a real effect, that is, the probability of getting a significant p-value when the effect really exists.
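
How power depends on the listed quantities can be illustrated with a small calculation. This is a minimal sketch assuming a two-sided one-sample z-test with known standard deviation; the function name `z_test_power` and all input values are hypothetical and not part of the original lecture.

```python
from statistics import NormalDist


def z_test_power(true_diff, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-sided one-sample z-test with known sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)    # critical value
    shift = true_diff / (sigma / n ** 0.5)            # standardized true difference
    # Probability that the statistic falls in the upper or lower rejection region.
    return (1.0 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Hypothetical settings: true difference 0.5, SD 1.
for n in (10, 20, 40, 80):
    print(n, round(z_test_power(0.5, 1.0, n), 3))
```

With a true difference of zero the "power" equals α, and it grows with the sample size and the true difference and shrinks as the standard deviation grows.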


The power of a test in the case of fixed sample size and α, with two alternative hypotheses


ANOVA: Analysis of Variance

Comparison of the means of several (>2) normally distributed samples.

Types:
- One-way: control, treatment I, treatment II
- Two-way: treatment + sex

Any "way" (factor) can be:
- "independent" ("between-subjects"): sex, treatments
- "repeated measures" ("within-subjects"): data measured on the same patient


Why not t-test (pairwise)?

At the 0.05 level we can get a significant result purely by chance in every 20th case.

Group            R1        R2        R3        R4        R5        R6        R7        R8
1.00           -.84      1.73      2.36      -.30      -.31      -.31      -.56      1.58
1.00            .59       .44       .60      -.75      -.28     -1.51      -.81      -.12
1.00            .19      -.73     -1.04      1.27       .69      -.21      -.52     -1.34
1.00          -1.05       .88      1.27      1.05      -.87       .68      -.17      -.15
1.00            .12      -.75      -.05     -1.13      2.21       .74      -.90      -.45
1.00           1.10      -.20      -.78      1.02       .67       .18      -.52      -.34
1.00           -.19      -.57      -.41      2.25     -1.26      -.27       .44     -2.52
1.00            .45      1.20      2.77      -.17      -.68       .60       .54      -.37
1.00           -.58      -.01       .60      1.66      2.14      2.31      -.90     -1.75
1.00           -.39       .93      -.51       .31      -.60      -.21       .55       .57
1.00           -.23     -1.21     -1.08       .02       .31     -1.28      1.20      1.62
1.00            .87       .97     -1.04       .60      -.29       .86      1.09      -.68
2.00            .42     -1.18      -.64      -.08      1.10       .39      -.66      2.12
2.00           1.26     -2.13     -1.78      -.60     -1.25     -1.10       .19     -1.54
2.00           -.60      -.83      -.94      1.61       .95      1.37       .10      -.97
2.00          -1.75       .63       .16       .24      -.25      1.49       .42     -2.01
2.00            .07      -.33      -.56       .36       .12      -.48       .78     -1.29
2.00            .15       .85       .10     -2.07       .18      2.14      1.71       .62
2.00            .98     -1.20      -.46      -.92       .08     -1.37       .80      -.67
2.00           -.42      1.05      -.29       .73       .10      1.42       .79      1.67
2.00           2.00       .06      2.24      -.31      -.13      -.01       .04      -.45
2.00          -1.85     -1.83      3.35      1.83      -.12      -.30     -1.68       .57
2.00           1.06      -.55      -.36      -.80     -1.41     -1.49       .89       .82
2.00           -.57     -2.15      2.15      -.99     -1.63       .00      -.41      1.42
t-test p   0.882846  0.053926   0.96894  0.205339  0.418212  0.928912  0.391001  0.508963

sign 4 Type I error


The increase of type I error

It can be shown that when t-tests are used to test for differences between multiple groups, the chance of mistakenly declaring significance (Type I error) increases. For example, in the case of 5 groups (10 pairwise comparisons), if no differences exist between any of the groups, using pairwise two-sample t-tests we would have about a 30% chance of declaring at least one difference significant, instead of a 5% chance.

In general, the t-test can be used to test the hypothesis that two group means are not different. To test the hypothesis that three or more group means are not different, analysis of variance should be used.


Each statistical test produces a p-value.

If the significance level is set at 0.05 (the false positive rate) and we do multiple significance tests on the data from a single clinical trial, then the overall false positive rate for the trial increases with each additional significance test.


False positive rate for each test = 0.05

Probability of incorrectly rejecting ≥ 1 hypothesis out of N independent tests: $1 - (1 - 0.05)^N$, in general $1 - (1 - \alpha)^N$
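
The formula can be evaluated directly for a few values of N. This is a minimal sketch that assumes the N tests are independent; the function name `familywise_error` is hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 5, 10, 20, 50, 100):
    print(n_tests, round(familywise_error(0.05, n_tests), 3))
```

For dependent tests, such as pairwise comparisons that share groups, the true probability differs from this value, so the formula is best read as a guide to how fast the error grows.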


The increase of experimentwise Type I error

[Figure: familywise Type I error probability (vertical axis, 0 to 1) by number of comparisons (horizontal axis, 0 to 100)]


Compound hypotheses

The null hypotheses H01, H02, ..., H0n are tested at significance levels α1, α2, ..., αn.

How should the αi-s be chosen so that the level of the compound hypothesis (H01 and H02 and ... H0n) is no greater than α, where α ∈ (0,1)?


Bonferroni correction

The α is divided by the number of comparisons: each H0i is tested at level α/n. (H01 and H02 and ... H0n) is rejected if at least one pi < α/n.

In the case of many comparisons, this is too conservative (it may fail to show real differences).
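
The rule can be sketched in a few lines. The function name `bonferroni_reject` is hypothetical; the five raw p-values are the ones listed in the SAS Multtest output of this lecture.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Flag each hypothesis whose p-value is below alpha / (number of tests)."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Raw p-values taken from the SAS Multtest table of this lecture.
raw_p = [0.9999, 0.2318, 0.3771, 0.8231, 0.0141]

decisions = bonferroni_reject(raw_p)
print(decisions)        # per-test decisions at level 0.05 / 5 = 0.01
print(any(decisions))   # is the compound hypothesis rejected?
```

Test 5 (p = 0.0141) would be significant on its own at the 0.05 level, but not after the correction, which illustrates how conservative the procedure is.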


Holm-modification (SAS: step-down Bonferroni)

The pi-s are sorted: p1 ≤ p2 ≤ ... ≤ pn.

H0i is tested at level $\alpha/(n-i+1)$. If any of them is significant, then reject (H01 and H02 and ... H0n).

E.g. n = 5, α = 0.05:
p1 ≤ α/5 = 0.01; if p1 is not smaller, then finish
p2 ≤ α/4 = 0.0125; if p2 is not smaller, then finish
p3 ≤ α/3 = 0.0167; if p3 is not smaller, then finish
p4 ≤ α/2 = 0.025; ...
p5 ≤ α/1 = 0.05
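
The step-down procedure is often reported as adjusted p-values, which are then compared with α directly. The following is a sketch of that equivalent formulation, not code from the lecture; `holm_adjust` is a hypothetical name and the raw p-values are those of the SAS Multtest table shown in this lecture.

```python
def holm_adjust(p_values):
    """Holm (step-down Bonferroni) adjusted p-values, in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])   # smallest p first
    adjusted = [0.0] * n
    running_max = 0.0
    for rank, idx in enumerate(order):                    # rank 0 = smallest p
        candidate = min(1.0, (n - rank) * p_values[idx])
        running_max = max(running_max, candidate)         # keep monotone
        adjusted[idx] = running_max
    return adjusted


raw_p = [0.9999, 0.2318, 0.3771, 0.8231, 0.0141]
holm_p = holm_adjust(raw_p)
print([round(p, 4) for p in holm_p])
```

Multiplying the i-th smallest p-value by (n - i + 1) and comparing with α is the same as comparing the raw p-value with α/(n - i + 1); the running maximum implements the "then finish" rule.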


FDR (false discovery rate)

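The p-values are sorted: p1 ≤ p2 ≤ ... ≤ pn.

Begin with the greatest p-value; it remains the same (it is tested at level α).

The next ones are tested at level $\alpha \cdot i/n$ (equivalently, $p_i$ is multiplied by $n/i$).

E.g. n = 5, α = 0.05:
p5 ≤ α = 0.05
p4 ≤ α·4/5 = 0.04
p3 ≤ α·3/5 = 0.03
p2 ≤ α·2/5 = 0.02
p1 ≤ α·1/5 = 0.01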
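
A sketch of the corresponding adjusted p-values follows, assuming the standard Benjamini-Hochberg step-up rule (the slide's own formula did not survive extraction). `fdr_adjust` is a hypothetical name; the raw p-values are those of the SAS Multtest table shown in this lecture.

```python
def fdr_adjust(p_values):
    """Benjamini-Hochberg (FDR) adjusted p-values, in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):     # position 0 = largest p
        rank = n - position                    # rank n = largest p, rank 1 = smallest
        candidate = p_values[idx] * n / rank
        running_min = min(running_min, candidate)   # keep monotone
        adjusted[idx] = running_min
    return adjusted


raw_p = [0.9999, 0.2318, 0.3771, 0.8231, 0.0141]
fdr_p = fdr_adjust(raw_p)
print([round(p, 4) for p in fdr_p])
```

The largest p-value is left unchanged, and each smaller one is multiplied by n/i, capped by the adjusted value of the next larger p-value.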


Correction of individual p-values

The SAS System
The Multtest Procedure

p-Values

Test    Raw       Stepdown Bonferroni    Hochberg    False Discovery Rate
1       0.9999    1.0000                 0.9999      0.9999
2       0.2318    0.9272                 0.9272      0.5795
3       0.3771    1.0000                 0.9999      0.6285
4       0.8231    1.0000                 0.9999      0.9999
5       0.0141    0.0705                 0.0705      0.0705


One-Way ANOVA

Let us suppose that we have t independent samples (t "treatment" groups) drawn from normal populations with equal variances, N(µi, σ).

Assumptions:
- independent samples
- normality
- equal variances

Null hypothesis: the population means are equal, µ1 = µ2 = ... = µt


http://lib.stat.cmu.edu/DASL/Stories/CancerSurvival.html

Cameron, E. and Pauling, L. (1978) Supplemental ascorbate in the supportive treatment of cancer: re-evaluation of prolongation of survival times in terminal human cancer. Proceedings of the National Academy of Sciences USA, 75, 4538-4542.

[Figure: box plots of survival by GROUP (Stomach N = 13, Bronchus N = 17, Colon N = 17, Ovary N = 6, Breast N = 11). Left panel: original data (SURVIVAL). Right panel: square root transformed data (SQSURV).]


Method

If the null hypothesis is true, then the populations are the same: they are normal, and they have the same mean and the same variance. This common variance is estimated in two distinct ways:
- between-groups variance
- within-groups variance

If the null hypothesis is true, then these two distinct estimates of the variance should be equal, apart from sampling variation.

'New' (and equivalent) null hypothesis: σ²between = σ²within. Their equality can be tested by an F ratio test.

The p-value of this test:
- if p > 0.05, then we do not reject H0. The analysis is complete.
- if p < 0.05, then we reject H0 at the 0.05 level. There is at least one group mean that differs from one of the others.
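
The two variance estimates and their ratio can be computed by hand. This is a minimal sketch in plain Python: the function name `one_way_anova` and the three small groups are hypothetical, and the p-value is left out because it needs the F distribution, which the standard library does not provide.

```python
def one_way_anova(groups):
    """Between- and within-groups sums of squares, variances and the F ratio."""
    t = len(groups)                                   # number of groups
    n_total = sum(len(g) for g in groups)             # N
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    q_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    q_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    var_between = q_between / (t - 1)
    var_within = q_within / (n_total - t)
    return {
        "q_between": q_between,
        "q_within": q_within,
        "var_between": var_between,
        "var_within": var_within,
        "F": var_between / var_within,
    }


# Hypothetical data: three groups of three observations each.
result = one_way_anova([[4, 5, 6], [6, 7, 8], [9, 10, 11]])
print(result)
```

An F ratio near 1 is what equal population means would produce; a large F ratio suggests that the group means differ by more than the within-group scatter explains.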


[Figure: two scatter plots, panels a) and b)]

Random samples drawn from normal distributions with equal (a) and unequal (b) means and the same dispersion.


The ANOVA table

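Columns: source of variation, sum of squares, degrees of freedom, variance, F, p

Between groups:
sum of squares $Q_k = \sum_{i=1}^{t} n_i (\bar{x}_i - \bar{x})^2$, degrees of freedom $t-1$, variance $s_k^2 = \frac{Q_k}{t-1}$, $F = \frac{s_k^2}{s_b^2}$, p

Within groups:
sum of squares $Q_b = \sum_{i=1}^{t} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$, degrees of freedom $N-t$, variance $s_b^2 = \frac{Q_b}{N-t}$

Total:
sum of squares $Q = \sum_{i=1}^{t} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$, degrees of freedom $N-1$

ANOVA

SQSURV

                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups    3295.038          4     823.759        6.484    .000
Within Groups     7495.266          59    127.038
Total             10790.304         63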
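
The mean squares and the F ratio in the SQSURV output follow from the sums of squares and degrees of freedom by simple division. The sketch below recomputes them; the function name `f_ratio` is hypothetical and the input numbers are copied from the table.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Mean squares and F ratio from sums of squares and degrees of freedom."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values from the SQSURV ANOVA table.
ms_b, ms_w, f_value = f_ratio(3295.038, 4, 7495.266, 59)
print(round(ms_b, 3), round(ms_w, 3), round(f_value, 3))

# The total row is the sum of the two components.
print(round(3295.038 + 7495.266, 3), 4 + 59)
```

Small differences in the last printed digit against the table can occur because the table's sums of squares are themselves rounded.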


Pairwise comparisons

As the two-sample t-test is inappropriate for this, there are special tests for multiple comparisons that keep the probability of Type I error at α. The most often used multiple comparisons are the modified t-tests.

Modified t-tests:
- (LSD)
- Bonferroni: α/(number of comparisons)
- Scheffé
- Tukey
- Dunnett: a test comparing a given group (control) with the others

Multiple Comparisons

Dependent Variable: SQSURV
Dunnett t (2-sided) (a)

(I) GROUP    (J) GROUP    Mean Difference (I-J)    Std. Error    Sig.    95% CI Lower Bound    95% CI Upper Bound
Stomach      Breast       -18.8090*                4.61748       .001    -30.3632              -7.2547
Bronchus     Breast       -19.9927*                4.36140       .000    -30.9062              -9.0793
Colon        Breast       -13.5661*                4.36140       .010    -24.4796              -2.6526
Ovary        Breast       -7.6217                  5.72032       .474    -21.9355              6.6922

*. The mean difference is significant at the .05 level.
a. Dunnett t-tests treat one group as a control, and compare all other groups against it.


Example

http://lib.stat.cmu.edu/DASL/Stories/ReadingComprehension.html

Researchers at Purdue University conducted an experiment to compare three methods of teaching reading.

Students were randomly assigned to one of the three teaching methods, and their reading comprehension was tested before and after they received the instruction. Several different measures of reading comprehension, from both the pre- and posttests are included in the dataset.

Reference: Moore, David S., and George P. McCabe (1989). Introduction to the Practice of Statistics. Original source: study conducted by Jim Baumann and Leah Jones of the Purdue University Education Department.


ANOVA

POST2 Posttest score on second reading comprehension measure

                  Sum of Squares    df    Mean Square    F        Sig.
Between Groups    95.121            2     47.561         8.407    .001
Within Groups     356.409           63    5.657
Total             451.530           65


Multiple Comparisons

Dependent Variable: POST2 Posttest score on second reading comprehension measure

(I) and (J): groupcode, type of instruction that student received

Method        (I)        (J)        Mean Difference (I-J)    Std. Error    Sig.     95% CI Lower Bound    95% CI Upper Bound
LSD           1 Basal    2 DRTA     -.682                    .717          .345     -2.11                 .75
LSD           1 Basal    3 Strat    -2.818*                  .717          .000     -4.25                 -1.39
LSD           2 DRTA     1 Basal    .682                     .717          .345     -.75                  2.11
LSD           2 DRTA     3 Strat    -2.136*                  .717          .004     -3.57                 -.70
LSD           3 Strat    1 Basal    2.818*                   .717          .000     1.39                  4.25
LSD           3 Strat    2 DRTA     2.136*                   .717          .004     .70                   3.57
Bonferroni    1 Basal    2 DRTA     -.682                    .717          1.000    -2.45                 1.08
Bonferroni    1 Basal    3 Strat    -2.818*                  .717          .001     -4.58                 -1.05
Bonferroni    2 DRTA     1 Basal    .682                     .717          1.000    -1.08                 2.45
Bonferroni    2 DRTA     3 Strat    -2.136*                  .717          .012     -3.90                 -.37
Bonferroni    3 Strat    1 Basal    2.818*                   .717          .001     1.05                  4.58
Bonferroni    3 Strat    2 DRTA     2.136*                   .717          .012     .37                   3.90

*. The mean difference is significant at the .05 level.
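
The Bonferroni significance values in this output can be recovered from the LSD ones: with three groups there are three distinct pairwise comparisons, so each unadjusted p-value is multiplied by 3 and capped at 1. A small sketch of that check follows; the function name `bonferroni_adjust` is hypothetical, and only the two LSD p-values that are not printed as .000 are used, since .000 is a rounded value.

```python
def bonferroni_adjust(p_value, n_comparisons):
    """Bonferroni-adjusted p-value: multiply by the number of comparisons, cap at 1."""
    return min(1.0, n_comparisons * p_value)


n_comparisons = 3   # Basal-DRTA, Basal-Strat, DRTA-Strat

# LSD (unadjusted) p-values copied from the table.
basal_vs_drta = bonferroni_adjust(0.345, n_comparisons)
drta_vs_strat = bonferroni_adjust(0.004, n_comparisons)
print(round(basal_vs_drta, 3), round(drta_vs_strat, 3))
```

The results agree with the Bonferroni rows of the table (1.000 and .012).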


Nonparametric one-way ANOVA: Kruskal-Wallis test

As a result, it gives one p-value. If it is not significant, the null hypothesis is not rejected.

If the null hypothesis is rejected, further tests are required to make pairwise comparisons. These pairwise comparisons are generally not available in standard statistical packages. Pairwise comparisons can be performed by Mann-Whitney U tests, and the p-values can be corrected by the Bonferroni correction.

Test Statistics (a, b)

               SURVIVAL
Chi-Square     14.954
df             4
Asymp. Sig.    .005

a. Kruskal Wallis Test
b. Grouping Variable: GROUP
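
The test statistic reported as Chi-Square above is computed from the ranks of the pooled data. The following is a minimal sketch of that calculation without the correction for ties that statistical packages usually apply; the function names and the small data set are hypothetical, and the p-value is omitted because it requires the chi-square distribution with (number of groups - 1) degrees of freedom.

```python
from itertools import combinations


def average_ranks(values):
    """Ranks 1..N of the values; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total, position = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[position:position + len(g)])
        total += rank_sum ** 2 / len(g)
        position += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)


# Hypothetical data: three groups of three observations.
h_value = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_value)

# Follow-up: with 5 groups there are 10 pairwise Mann-Whitney tests,
# so the Bonferroni-corrected level is 0.05 / 10.
n_pairs = len(list(combinations(range(5), 2)))
print(n_pairs, 0.05 / n_pairs)
```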


Two-way ANOVA, example

Does systolic blood pressure depend on
- diabetes (yes or no)
- sex (male or female)?

Both are independent factors.


Two-way repeated measurements ANOVA

Does QT widening in the Langendorff-perfused rat heart represent the effect of repolarization delay or conduction slowing? J Cardiovasc Pharmacol. 42 (2003) 612-21


Effect of regional ischemia and K+ content of the perfusion solution on the QT90 interval (A) and heart rate (B) in drug-free isolated rat hearts (n = 12 hearts per group; mean ± SEM).

[Figure: panel A, QT90 (ms) against time (min); panel B, heart rate (beats/min) against time (min); series: 3 mM K+ and 5 mM K+]


Frequently, separate univariate analyses are used for every time point, taking no account of the fact that the data are related in time. One problem is that repeated testing is taking place, and therefore a significant effect is more likely to occur at some time point by chance. A second problem is the frequent occurrence of missing values in the data. A repeated measurement ANOVA model is more appropriate (Brown and Prescott).


Repeated measurement ANOVA model

We can examine:
- the treatment effect (K+)
- the time effect
- their interaction

[Figure: panel B, heart rate (beats/min) against time (min) for 3 mM K+ and 5 mM K+, with asterisk markers]

In high potassium concentration the heart rate is significantly higher, independently of the time at which it was measured.

Type 3 Tests of Fixed Effects

Effect         Num DF    Den DF    F Value    Pr > F
KALIUM         1         22        9.14       0.0063
time           9         198       21.70      <.0001
KALIUM*time    9         198       0.54       0.8465


Review questions and exercises

Problems to be solved by hand-calculations ..\Handouts\Problems hand V.doc

Solutions ..\Handouts\Problems hand V solutions.doc

Problems to be solved using computer ..\Handouts\Problems comp V.doc, ..\Handouts\Problems comp V solutions.doc


Useful WEB pages

http://www-stat.stanford.edu/~naras/jsm
http://www.ruf.rice.edu/~lane/rvls.html
http://my.execpc.com/~helberg/statistics.html
http://www.math.csusb.edu/faculty/stanton/m262/index.html