SJS SDI_3 1 Design of Statistical Investigations Stephen Senn 3. Design of Experiments 1 Some Basic Ideas


SJS SDI_3 1

Design of Statistical Investigations

Stephen Senn

3. Design of Experiments 1

Some Basic Ideas

Elements of an Experiment: The “Nouns”

• Experimental material
  – Basic units
  – Blocks
  – Replications

• Treatments
  – Orderings
  – Dimensions
  – Combinations

Elements of an Experiment: The “Verbs”

• Allocation
  – Which material gets which treatment
  – For example, using some form of randomisation

• Conduct
  – How will it all be carried out?

• Measuring
  – When to measure what

• Analysis

Exp_1 Rat TXB2

• Experimental material
  – 36 rats

• Treatments to be studied
  – 6 in a ‘one-way layout’
  – 4 new chemical entities
  – 1 vehicle
  – 1 marketed product

Caution!

• In practice such things are not given

• Material
  – Why rats and not mice, dogs, or guinea-pigs?
  – Why 36?

• Treatments
  – Why these 6?

• In practice the statistician can also be involved in such decisions

Exp_1 Rat TXB2: Allocation

• If the rats cannot be distinguished in any way we can determine, we might as well allocate at random

• Unconstrained randomisation is not a good idea, however: some treatments may be allocated to only a few rats

• So constrain the randomisation to give 6 rats per group
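The point about unconstrained randomisation can be sketched in R (a close relative of S-PLUS). This is an illustration only: the seed and the object names `labels`, `unconstrained`, `sizes` and `p.small` are assumptions and not part of the course code.

```r
# Illustrative sketch: seed and object names are assumptions
set.seed(1)
labels <- c("V", "M", "a", "b", "c", "d")

# Unconstrained: each of the 36 rats draws a treatment independently
unconstrained <- factor(sample(labels, 36, replace = TRUE), levels = labels)
sizes <- table(unconstrained)
sizes       # group sizes are generally unequal
sum(sizes)  # but always total 36

# Probability that one particular treatment receives 3 or fewer rats
# (binomial with 36 trials and success probability 1/6)
p.small <- pbinom(3, size = 36, prob = 1/6)
p.small
```

Rerunning with different seeds shows how much the group sizes can wander, which is the motivation for fixing them at 6.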

S-Plus Randomisation

#M2 Rat TXB2 Randomisation
#Vector of treatments
treat<-c(rep("V",6),rep("M",6),rep("a",6),
  rep("b",6),rep("c",6),rep("d",6))
#Random number for each rat
rnumb<-runif(36,0,1)
#Sort rats by random number
rat<-sort.list(rnumb)
#Join rats and treatments
temp.frame<-data.frame(rat,treat)
#Sort rows by rat
des.frame<-sort.col(temp.frame,c("rat","treat"),"rat")
#Print design
des.frame

We shall illustrate an alternative using the sample function later in the course
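One way a `sample`-based version might look in R is sketched below. It is not the course's own later example; the seed is an assumption added for reproducibility. Permuting the vector of treatment labels keeps 6 rats per group by construction.

```r
# Sketch only: not the course's later example; the seed is an assumption
set.seed(2)

# Vector of treatments, 6 copies of each label
treat <- rep(c("V", "M", "a", "b", "c", "d"), each = 6)

# Randomly permute the labels over rats 1 to 36
des.frame <- data.frame(rat = 1:36, treat = sample(treat))

# Every treatment still has exactly 6 rats
table(des.frame$treat)

# Print design
des.frame
```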

Result of Randomisation

    rat treat        rat treat        rat treat
 9    1     M   12    14     M   14    27     a
22    2     b   17    15     a    1    28     V
 4    3     V   20    16     b   29    29     c
33    4     d   24    17     b   36    30     d
13    5     a   34    18     d    6    31     V
11    6     M   26    19     c    5    32     V
10    7     M   23    20     b   35    33     d
31    8     d   30    21     c   15    34     a
 7    9     M   16    22     a    2    35     V
19   10     b   21    23     b   27    36     c
 3   11     V   32    24     d
25   12     c   28    25     c
18   13     a    8    26     M

Exp_1 Rat TXB2: Conduct

• We will not cover this in this course

• This does not mean that this is not important

• In the Exp_1 example precise instructions might be necessary for treating the rats.

Exp_1 Rat TXB2: Measurement

• Obviously we have to decide what it is important to measure

• Here it has been decided to measure TXB2 (thromboxane B2), a marker of Cox-1 activity

• Cox = cyclooxygenase

• Analgesics are designed to inhibit Cox-2, which is involved in the synthesis of inflammatory prostaglandins

Measurement (Cont)

• However, they also tend to inhibit Cox-1, which is involved in the synthesis of the prostaglandins that help maintain the gastric mucosa

• Cox-1 inhibition can lead to ulcers

• Ulcers are an unwanted side-effect of Non-Steroidal Anti-Inflammatory Drugs (NSAIDs)

The Moral

• Even ‘simple’ experiments may involve complex subject-matter knowledge

• It may be dangerous for the statistician to assume that all that is being produced is sets of numbers, details being irrelevant

• Team work may be necessary

Analysis

• One-way layout

• Six treatments

• Balanced design

• “No-brainer” is one-way ANOVA
  – We shall look at the maths of one-way ANOVA in more detail later
  – For the moment take this as understood

S-PLUS ANOVA Code

#Analysis of TXB2 data
#Set contrast options
options(contrasts=c(factor="contr.treatment",ordered="contr.poly"))
#Input data
treat<-factor(c(rep(1,6),rep(2,6),rep(3,6),rep(4,6),rep(5,6),rep(6,6)),
  labels=c("V","M","a","b","c","d"))
TXB2<-c(196.85,124.40,91.20,328.05,268.30,214.70,
  2.08,1.97,4.80,5.01,2.52,9.35,
  315.85,75.60,322.80,212.15,42.95,111.90,
  127.95,81.75,52.70,352.85,198.80,107.65,
  83.19,66.80,81.15,39.00,61.96,87.00,
  74.48,60.00,77.00,42.00,48.95,66.30)
fit1<-aov(TXB2~treat)
#ANOVA
summary(fit1)

S-PLUS Output

summary(fit1)
          Df Sum of Sq  Mean Sq  F Value       Pr(F)
    treat  5  184595.5 36919.11 6.313142 0.000409356
Residuals 30  175439.3  5847.98

So there is a highly significant difference between treatments, but this does not make this an adequate analysis

S-PLUS Diagnostic Code

#Diagnostic plot data
par (mfrow=c(2,2))
plot(treat~TXB2)
hist(resid(fit1),xlab="residual")
plot(fit1$fitted.values,resid(fit1),xlab="fitted",ylab="residual")
abline(h=0)
qqnorm(resid(fit1),xlab="theoretical",ylab="empirical")
qqline(resid(fit1))

[Figure: four diagnostic plots for the untransformed fit: treat against TXB2; histogram of residuals; residuals against fitted values; Normal QQ plot of residuals (empirical against theoretical).]

Model Failure

• Histogram of residuals has heavy tails

• QQ Plot shows clear departure from Normality

• Variance increases with mean
  – Suggests log-transformation
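A minimal R sketch of the suggested remedy follows. The name `LTXB2` follows the label on the slides' plots, while `fit2`, `group.means` and `group.sds` are illustrative names, and the natural log is assumed since the slides do not state the base.

```r
# Sketch: fit2, group.means, group.sds are illustrative names;
# natural log is an assumption
treat <- factor(rep(1:6, each = 6), labels = c("V", "M", "a", "b", "c", "d"))
TXB2 <- c(196.85, 124.40, 91.20, 328.05, 268.30, 214.70,
          2.08, 1.97, 4.80, 5.01, 2.52, 9.35,
          315.85, 75.60, 322.80, 212.15, 42.95, 111.90,
          127.95, 81.75, 52.70, 352.85, 198.80, 107.65,
          83.19, 66.80, 81.15, 39.00, 61.96, 87.00,
          74.48, 60.00, 77.00, 42.00, 48.95, 66.30)

# Group standard deviations alongside group means:
# the spread tends to be larger where the mean is larger
group.means <- tapply(TXB2, treat, mean)
group.sds <- tapply(TXB2, treat, sd)
cbind(mean = group.means, sd = group.sds)

# Refit the one-way ANOVA on the log scale
LTXB2 <- log(TXB2)
fit2 <- aov(LTXB2 ~ treat)
summary(fit2)
```

The diagnostic code shown earlier can then be rerun with `fit2` and `LTXB2` in place of `fit1` and `TXB2`.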

[Figure: the same four diagnostic plots for the log-transformed response LTXB2: treat against LTXB2; histogram of residuals; residuals against fitted values; Normal QQ plot of residuals (empirical against theoretical).]

Exp_2: A Simple Design Problem (The simplest)

• You have N experimental units in total

• They are completely exchangeable

• You have two treatments A and B
  – with no prior knowledge of their effects

• You wish to compare A and B
  – continuous outcome assumed Normal

• How many units for A and for B?

Solution is obvious

• Allocate half the units to one treatment and half to the other
  – Assuming that there is an even number of units

• However, we should go through the design cycle

• What sort of data will we collect?

• What will we do with them?

Basic Design Cycle

Objective

Tentative Design

Potential Data

Possible Analysis

Possible Conclusions

Relevant factors

The Anticipated Data

• Two mean outcomes

• Variances expected to be the same
  – An assumption, but
  – Reasonable under the null hypothesis
  – No other assumption is more reasonable given that we know nothing about the treatments

• We will calculate the contrast between these means

$$\hat{\tau} = \bar{Y}_1 - \bar{Y}_2$$

$$\operatorname{var}(\hat{\tau}) = \sigma^2 (1/n_1 + 1/n_2)$$

$$n_1 + n_2 = N$$

$$f = \sigma^2 (1/n_1 + 1/n_2) + \lambda (n_1 + n_2 - N)$$

$$df/d\lambda = n_1 + n_2 - N$$

$$df/dn_1 = \lambda - \sigma^2/n_1^2, \qquad df/dn_2 = \lambda - \sigma^2/n_2^2$$

Now set the derivatives equal to zero

$$n_1 + n_2 - N = 0 \qquad (1)$$

$$\lambda - \sigma^2/n_1^2 = 0 \qquad (2)$$

$$\lambda - \sigma^2/n_2^2 = 0 \qquad (3)$$

From (2) and (3) we have

$$\sigma^2/n_1^2 = \sigma^2/n_2^2$$

$$n_1 = n_2$$


So What!!??

• Solution is obvious

• Statistical theory does not seem to have helped us very much

• However, this was a trivial problem

• We now try a slightly more complicated experiment

• This leads to a non-trivial problem

Exp_3: A More Complicated Case

• Now suppose that we are comparing k experimental treatments to a single control.

• The treatments will not be compared to each other.

• How many units should we allocate to each treatment?
  – We assume that variances do not vary with treatment: homoscedasticity


Exp_3 Continued

• Arguments of symmetry suggest the active treatments be given to the same number of units, say n.

• Suppose that m units will be allocated the control.

• With N units in total we have N = m + kn

We consider the variance of a typical contrast

$$\sigma^2/m + \sigma^2/n$$

Incorporating the necessary constraint using a Lagrange multiplier we obtain the following objective function

$$f = \sigma^2/m + \sigma^2/n + \lambda (N - m - nk)$$

And proceed to minimise this by setting the partial derivatives with respect to m, n and λ equal to zero. (Note that we assume that k and N are fixed in the design specification.)

Set derivatives equal to zero.

Differentiation gives

$$df/d\lambda = N - m - nk$$

$$df/dm = -\sigma^2/m^2 - \lambda$$

$$df/dn = -\sigma^2/n^2 - \lambda k$$

Setting equal to zero we have

$$N = m + nk \qquad (4)$$

$$\lambda = -\sigma^2/m^2 \qquad (5)$$

$$\lambda = -\sigma^2/(k n^2) \qquad (6)$$

From (5) and (6) we have

$$k n^2 = m^2 \quad\Rightarrow\quad n = m/\sqrt{k}$$

Substituting in (4) we have

$$N = m + k m/\sqrt{k}$$

$$N = m (1 + \sqrt{k})$$

$$m = \frac{N}{1 + \sqrt{k}}$$
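The result m = N/(1 + √k) can be checked numerically. The R sketch below uses arbitrary illustrative values N = 60 and k = 4 and treats m as continuous; the names `contrast.var`, `m.formula` and `m.numeric` are assumptions introduced here, not part of the course code.

```r
# Numerical check of m = N/(1 + sqrt(k)); N and k are arbitrary choices
N <- 60
k <- 4

# Variance of a treatment-versus-control contrast in units of sigma^2,
# with m units on control and n = (N - m)/k on each experimental treatment
contrast.var <- function(m) 1/m + k/(N - m)

# Closed-form solution and a direct numerical minimisation
m.formula <- N/(1 + sqrt(k))
m.numeric <- optimize(contrast.var, interval = c(1, N - 1))$minimum
c(formula = m.formula, numeric = m.numeric)

# Variance under an equal split m = N/(k + 1), relative to the optimum
contrast.var(N/(k + 1)) / contrast.var(m.formula)
```

In practice m and n must be whole numbers, so the continuous optimum is only a guide.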

Check

• Exp_2 was a special case of Exp_3 with k = 1

• So our general solution must give the same answer as the special case when k = 1

• And when k = 1 the formula yields m = N/(1 + √1) = N/2, which is the solution we reached before

Allocation as a function of number of experimental treatments

[Figure: proportion of units on control (vertical axis, 0.0 to 0.6) against number of experimental treatments (horizontal axis); naive allocation plotted as *, optimal allocation as #.]
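The two curves in this plot can be tabulated with a short R sketch. It assumes that "naive" means equal allocation to all k + 1 arms, which the slides do not state explicitly, and that the plotted range is k = 1 to 9; "optimal" uses m = N/(1 + √k) from the derivation.

```r
# Proportion of units on control for k = 1, ..., 9 experimental treatments
# Assumption: "naive" is equal allocation across all k + 1 arms
k <- 1:9
naive <- 1/(k + 1)
optimal <- 1/(1 + sqrt(k))  # from m = N/(1 + sqrt(k))
round(data.frame(k, naive, optimal), 3)
```

The two allocations agree at k = 1 and diverge as k grows, with the optimal rule always placing the larger share on control.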

Exp_3 Concluded

• The “optimal solution” was not easy to guess

• It allocates more units to the control than to each experimental treatment

• Lesson: be careful!


Questions

• What are the practical problems in implementing the solution we found for Exp_3?

• Why might this not be a good solution after all?

• Are there any implications for the design of Exp_1?