
Has modelling killed randomisation inference frankfurt


Page 1: Has modelling killed randomisation inference frankfurt

(C) Stephen Senn 2005-2006 1

Has Modelling Killed Randomisation Inference?

Stephen Senn

General Thesis

• During the last third of the last century, our view of statistical analysis as it relates to design of statistical investigations underwent a change we hardly recognised
  – From the dogmatic: ‘design always determines analysis’
  – To the pragmatic: ‘analysis is often related to design’
• We were hardly aware of this change
  – This change is particularly well illustrated through the work of one statistician – John Nelder
• However, the revolution is not complete and we suffer (as a community) from some sort of randomisation/modelling schizophrenia
• This tension can be resolved by reversing the precedence of design and analysis.

Outline

• Randomisation based inference
• Model based inference
• Some examples (mainly medical)
  – Potential conflicts
  – Potential pitfalls
• Is a resolution possible?

What do I mean by Randomisation Inference?

• I do not mean inference based solely on permutation tests
  – These are often applied when randomisation has not been applied
  – Restriction to these tests when randomisation has been applied makes the field of randomisation inference too narrow and does not capture the essential distinction to modelling
• I mean analysis driven by design and randomisation
• The role of the experimenter is part of the set up
• Causal understanding is the driver
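For readers who want the narrow case pinned down, here is a minimal sketch of a permutation (re-randomisation) test for a completely randomised two-group experiment. The six outcome values and the function name are hypothetical and are not taken from the talk; the reference set is every way of choosing three units out of six, which is only the right set if that is how the allocation was actually made.

```python
from itertools import combinations

# Hypothetical outcomes for six units; units 0, 1 and 2 received treatment A.
outcomes = [12, 15, 14, 8, 9, 11]
observed_a = (0, 1, 2)

def randomisation_p_value(values, treated, n_treated):
    """One-sided p-value: share of possible allocations whose treated-group
    sum is at least as large as the one actually observed."""
    observed_sum = sum(values[i] for i in treated)
    allocations = list(combinations(range(len(values)), n_treated))
    extreme = sum(
        1 for alloc in allocations
        if sum(values[i] for i in alloc) >= observed_sum
    )
    return extreme / len(allocations), len(allocations)

p_value, n_allocations = randomisation_p_value(outcomes, observed_a, 3)
print(n_allocations, p_value)
```

The design enters only through the list of allocations that is enumerated. A blocked or minimised design would need a different reference set, which is the sense in which the analysis is driven by the design.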

What do I mean by modelling?

• This is much more difficult
• Basic idea is that deterministic and stochastic relationships exist between the variables in the data-set
• The statistician’s job is to find useful (economical) descriptions of these relationships
• Ability to predict is the driver

General Balance

• An idea of John Nelder’s
• Two papers in the Proceedings of the Royal Society, 1965 concerning “The analysis of randomized experiments with orthogonal block structure”
  – Block structure and the null analysis of variance
  – Treatment structure and the general analysis of variance

Basic Idea

• Splits an experiment into two radically different components
  – The block structure, which describes the way that the experimental units are organised
    • The way that variation amongst units can be described
  – The treatment structure, which reflects the way that treatments are combined for the scientific purpose of the experiment

Design Driven Modelling

• Together with a third piece of information, the design matrix, these determine the analysis of variance
  – Note that because both block and treatment structures can be hierarchical, such a design matrix is not on its own sufficient to derive an ANOVA
• But together with John’s block and treatment structure it is
  – For designs exhibiting general balance
• This approach is incorporated in GenStat®


GenStat

"General Analysis of Variance."
BLOCK Runs*Positions
TREATMENTS Material
COVARIATE "No Covariate"
ANOVA [PRINT=aovtable,information,means; FACT=1; FPROB=yes;\
  PSE=diff] Wear

***** Analysis of variance *****

Variate: Wear of material

Source of variation        d.f.       s.s.       m.s.     v.r.   F pr.

Runs stratum                  3     986.50     328.83     5.37

Positions stratum             3    1468.50     489.50     7.99

Runs.Positions stratum
Material                      3    4621.50    1540.50    25.15   <.001
Residual                      6     367.50      61.25

Total                        15    7444.00


S Plus®

*** Analysis of Variance Model ***

Short Output:
Call:
   aov(formula = Wear ~ Runs + Positions + Materials, data = Nelder, na.action = na.exclude)

Terms:
                  Runs Positions Materials Residuals
 Sum of Squares  986.5    1468.5    4621.5     367.5
Deg. of Freedom      3         3         3         6

Residual standard error: 7.826238
Estimated effects are balanced

          Df Sum of Sq  Mean Sq  F Value      Pr(F)
     Runs  3     986.5  328.833  5.36871 0.03901297
Positions  3    1468.5  489.500  7.99184 0.01616848
Materials  3    4621.5 1540.500 25.15102 0.00084982
Residuals  6     367.5   61.250
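The two packages report the same decomposition, and the arithmetic behind it can be re-derived from the sums of squares and degrees of freedom alone. The sketch below does only that; the raw wear data are not reproduced in the slides, so nothing here refits the model, and the variable names are illustrative.

```python
# Sums of squares and degrees of freedom copied from the output above.
table = {
    "Runs": (986.5, 3),
    "Positions": (1468.5, 3),
    "Materials": (4621.5, 3),
    "Residuals": (367.5, 6),
}

def mean_square(ss, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return ss / df

residual_ms = mean_square(*table["Residuals"])

# Variance ratio of each term against the single residual mean square.
variance_ratios = {
    term: mean_square(ss, df) / residual_ms
    for term, (ss, df) in table.items()
    if term != "Residuals"
}

total_ss = sum(ss for ss, _ in table.values())
total_df = sum(df for _, df in table.values())
residual_se = residual_ms ** 0.5

for term, vr in variance_ratios.items():
    print(f"{term:10s} v.r. = {vr:.2f}")
print(f"Total s.s. = {total_ss}, total d.f. = {total_df}")
```

The variance ratios for Runs and Positions are the ones to which the S-Plus output attaches p-values and the GenStat output does not.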

Simple Examples

• These are simple examples where the differences between the packages seem unimportant
  – The other packages carry out significance tests on blocks but GenStat® does not
• But for complex examples you will begin to see material differences unless you take a great deal of care with the other packages


Yates 1937 as quoted by Welham et al 2004


GenStat Code

TREATMENTS Variety*Nitrogen

BLOCK Blocks/Wplots/Subplots

ANOVA [PRINT=aov,means; FPROBABILITY=yes] Yield

***** Analysis of variance *****

Variate: Lnyld100 (Natural log of forage yield, log mult by 100)

Source of variation          d.f.(m.v.)        s.s.        m.s.     v.r.   F pr.

Rows stratum                    3            436.55      145.52     2.14

Cols stratum                    3            374.00      124.67     1.83

Rows.Cols stratum
Cutdate                         3         110851.83    36950.61   542.20   <.001
Residual                        5(1)         340.74       68.15     6.30

Rows.Cols.Subplots stratum
Nitrogen                        1            888.34      888.34    82.14   <.001
Nitrogen.Cutdate                3            196.48       65.49     6.06   0.013
Residual                       10(2)         108.15       10.81

Total                          28(3)      103854.90
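To see why the strata matter, the sketch below recomputes variance ratios from the figures in the output above in two ways: against the residual of each term's own stratum, as the design-led analysis does, and against a single pooled residual. The pooled version is a deliberately crude stand-in for a one-error-term analysis and is an assumption made for illustration, not something shown in the talk.

```python
# (sum of squares, degrees of freedom), copied from the output above.
whole_plot_residual = (340.74, 5)   # Rows.Cols stratum
sub_plot_residual = (108.15, 10)    # Rows.Cols.Subplots stratum
cutdate = (110851.83, 3)            # treatment applied to whole plots
nitrogen = (888.34, 1)              # treatment applied to subplots

def ms(term):
    """Mean square of a (sum of squares, degrees of freedom) pair."""
    ss, df = term
    return ss / df

def variance_ratio(term, error):
    return ms(term) / ms(error)

# Design-led: each term is compared with the residual of its own stratum.
f_cutdate_stratum = variance_ratio(cutdate, whole_plot_residual)
f_nitrogen_stratum = variance_ratio(nitrogen, sub_plot_residual)

# Illustrative alternative: pool both residuals into one error term.
pooled = (
    whole_plot_residual[0] + sub_plot_residual[0],
    whole_plot_residual[1] + sub_plot_residual[1],
)
f_cutdate_pooled = variance_ratio(cutdate, pooled)

print(f"Cutdate, own stratum: {f_cutdate_stratum:.1f}")
print(f"Cutdate, pooled error: {f_cutdate_pooled:.1f}")
print(f"Nitrogen, own stratum: {f_nitrogen_stratum:.2f}")
```

Because the whole-plot residual mean square is about six times the subplot one, pooling more than doubles the variance ratio for the whole-plot treatment.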

1972 Annus Mirabilis

• David Cox
  – Proportional Hazards (20,851 citations)
• Lindley and Smith
  – Bayesian linear models (682 citations)
• Nelder and Wedderburn
  – Generalised Linear Models (1,205 citations)
• Peto and Peto
  – Log-rank test (2,182 citations)

As of 15 March 2006

Generalised Linear Models

• Problem in describing this is the opposite of that we have with general balance
• We are too familiar with it
  – Used far more often than cited
    • Like Student’s t-test
    • McCullagh and Nelder 2nd edition now has 2,500 citations
• Changed the face of statistical modelling

Design-led and Model-led Inference

• Nelder’s work on general balance is in many ways the high water mark of design-led inference
  – Block structure, treatment structure and the way the latter is applied to the former determine the analysis
• Nelder’s work on GLMs is more or less the opposite
  – Tremendous power and flexibility frees us to try many approaches
• Is there a contradiction?

The Two Approaches

Randomisation
• Robust
• Seems to produce strong consensus for designed experiments and sampling plans
• Uses elegant symmetries to produce inference

Modelling
• Flexible
• Not limited to experiments (and random samples)
• Much more powerful in terms of scope
• Can more honestly reflect uncertainty

The Graphical Model Justification for Randomisation

[Figure: two graphical models, each on the nodes U, Y and T]

See Davison, 2003, p 418

The Problem with This

• Consider three possible ways in which the observed unbalanced design in the table below occurred
  – Deliberate design
  – Randomised until this pattern appeared, which was then chosen
  – Randomised and then found that by chance this pattern appeared

             Treatment
Grade         A     B
Moderate     44    56
Severe       56    44

The Modeller’s Criticism

• Why should the property of the average inference over all experiments you might have run be of any relevance when making a specific inference from the experiment you did run?
• If you are at 35,000ft, four engines are on fire and the captain has had a heart-attack can you say:
  – “Why worry, on average air travel is very safe?”

A Quote from Jack Good

The use of random sampling is a device for obtaining apparently precise objectivity but this precise objectivity is attainable, as always, only at the price of throwing away some information (by using a Statistician’s Stooge who knows the random numbers but does not disclose them)…

…But the use of sampling without randomization involves the pure Bayesian in such difficult judgments that, at least if he is at all Doogian, he might decide by Type II rationality, to use random sampling to save time.

Jack Good, 'Good Thinking', 1983

The randomiser’s criticism

• You pick a model
• Fit it
• Check it
  – Plots, deviance, AIC etc
• Change it
• Fit it again etc
• Make a final choice
• Express your uncertainty as if you always knew this was the only model possible
• Why don’t you write an astrology column for a popular newspaper?

Yates v Aitken
Biometrics 1981 & 1982

• Murray Aitken answers a query about the analysis of a repeated measures design

• Suggests a split-plot approach and provides ANOVA decomposition

• Yates writes in to argue that this is invalid as there has been no randomisation at the ‘split-plot’ level and in fact never could be for a factor like time

• This is a clash of modelling and randomisation philosophies

The original query (Biometrics 1981)

Aitken’s Anova (Biometrics 1981)

Yates’s Anova (Biometrics 1982)

Examples and Controversies

• Blocks and centres
• Minimisation
  – CAESAR study
• Adaptive randomisation
  – Smoking cessation study
• Meta-analysis
• Sampling theory

Randomisation in Multi-Centre Trials

Standard Structure

Trial
  Centre 1: Block 1, Block 2, Block 3
  Centre 2: Block 1, Block 2
  Centre 3: Block 1, Block 2, Block 3


A Quote from the Master

“Two things are necessary, however: (a) that a sharp distinction should be drawn between those components of error which are to be eliminated in the field, and those which are not to be eliminated; and that while the elimination of the one class shall be complete, no attempt shall be made to eliminate the other; (b) that the statistical process of the estimation of error shall be modified so as to take account of the field arrangement, and so that the components of error actually eliminated in the field shall equally be eliminated in the statistical laboratory.” (My emphasis.)

R.A. Fisher, CP 48 pp507-508

Problems

• Now, it is a fact that very few clinical trials are randomised by centre.
• They are more usually randomised by sub-centre block.
• However, the analysis almost never includes the block in the model.
• The common approach thus does not eliminate in the ‘laboratory’ that which was eliminated in the ‘field’.

Minimisation

“Dangers of minimization
On the point about logistics and practical complexity, when reviewing applications from sponsor companies, we have often observed that the use of minimization may result in more harm than good. In the context of confirmatory Phase III studies, we have rarely seen the need to use any allocation procedure more complex than simple stratified randomization with permuted blocks. However, in circumstances where sponsors have chosen to use minimization, we have seen situations where programming algorithms have been incorrect, the choice of factors to include in the minimization algorithm has been poorly thought out or where telephone systems or Web-based systems have proved unreliable. Such examples occur across large and small companies and so our general advice to all companies would be to avoid such a procedure since it seems to add no benefit but has potential difficulties.”

Simon Day, Jean-Marie Grouin, John A. Lewis, Applied Clinical Trials, Jan 2005

“If randomisation is the gold standard, minimisation may be the platinum standard.” Tom Treasure and Kenneth MacRae, BMJ, 1998

Minimisation

We can explain this with the help of an example of Pocock’s (Clinical Trials: A Practical Approach, Wiley, 1983, p85).

Suppose that we wish to balance patients in a clinical trial for advanced breast cancer with respect to 4 factors: performance status (ambulatory and non-ambulatory), age (< 50, ≥ 50), disease free interval (< 2 years, ≥ 2 years), dominant metastatic lesion (visceral, osseous, soft tissue).

Suppose that the next patient to be entered is ambulatory, age < 50, disease free interval ≥ 2 years and visceral metastasis and suppose that 80 patients have been entered already so that the position is as given in the table below.

Factor                  Level          No. on each treatment    Next
                                           A        B           patient

Performance status      Ambulatory        30       31             *
                        Non-amb.          10        9

Age                     < 50              18       17             *
                        ≥ 50              22       23

Disease-free interval   < 2 years         31       32
                        ≥ 2 years          9        8             *

Dominant met. lesion    Visceral          19       21             *
                        Osseous            8        7
                        Soft tissue       13       12

Using the levels of the factors for this particular patient we see that the sum for A is 30+18+9+19 = 76 while for B it is 31+17+8+21 = 77. Therefore we assign the patient to A.
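The allocation rule in this worked example can be sketched in a few lines. The counts are those of the table above; the dictionary layout and function names are illustrative choices, and the handling of ties is left open because the example does not say how they are broken.

```python
# counts[factor][level] = (number already on A, number already on B)
counts = {
    "performance status": {"ambulatory": (30, 31), "non-ambulatory": (10, 9)},
    "age": {"<50": (18, 17), ">=50": (22, 23)},
    "disease-free interval": {"<2 years": (31, 32), ">=2 years": (9, 8)},
    "dominant metastatic lesion": {
        "visceral": (19, 21),
        "osseous": (8, 7),
        "soft tissue": (13, 12),
    },
}

next_patient = {
    "performance status": "ambulatory",
    "age": "<50",
    "disease-free interval": ">=2 years",
    "dominant metastatic lesion": "visceral",
}

def minimisation_totals(counts, patient):
    """Sum, over the patient's own factor levels, the numbers already on A and on B."""
    total_a = sum(counts[factor][level][0] for factor, level in patient.items())
    total_b = sum(counts[factor][level][1] for factor, level in patient.items())
    return total_a, total_b

def assign(counts, patient):
    """Allocate to the arm with the smaller total; None signals a tie."""
    total_a, total_b = minimisation_totals(counts, patient)
    if total_a < total_b:
        return "A"
    if total_b < total_a:
        return "B"
    return None  # tie-breaking rule not specified in the example

totals = minimisation_totals(counts, next_patient)
arm = assign(counts, next_patient)
print(totals, arm)
```

Written this way the allocation is a deterministic function of the patients already entered, apart from ties.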

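Efficiency and the Linear Model

$$
\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \qquad
\mathbf{X} = \begin{pmatrix}
1 & X_{11} & \cdots & X_{k1} \\
1 & X_{12} & \cdots & X_{k2} \\
\vdots & \vdots & & \vdots \\
1 & X_{1n} & \cdots & X_{kn}
\end{pmatrix}, \qquad
\boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}
$$

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$$

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$$

$$E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta}, \qquad V(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}.$$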

Atkinson’s Approach

$$
\operatorname{var}(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}
= \sigma^2 \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1k} \\
a_{12} & a_{22} & \cdots & a_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1k} & a_{2k} & \cdots & a_{kk}
\end{pmatrix}, \qquad a_{22} \ge 2/n
$$

The value of $\sigma^2$ depends on the model.

The value of $a_{22}$ depends on the design and this only achieves its lower bound when covariates are balanced.

Choose allocations such that $a_{22}$ is minimised. (Or use a biased coin based on this principle.)
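A small numerical sketch may help with the role of a22. It assumes the simplest case, which goes beyond what the slide states: two equal groups of n patients, a 0/1 treatment indicator and a single covariate. Under those assumptions a22 equals (2/n)/(1 - r²), where r is the sample correlation between treatment indicator and covariate, so the efficiency relative to a perfectly balanced design is 1 - r². The allocations below are made up for illustration.

```python
def efficiency(treatment, covariate):
    """Efficiency of the treatment estimate relative to a balanced design,
    computed as 1 - r^2 for a model with intercept, treatment and one covariate."""
    n = len(treatment)
    t_bar = sum(treatment) / n
    x_bar = sum(covariate) / n
    s_tt = sum((t - t_bar) ** 2 for t in treatment)
    s_xx = sum((x - x_bar) ** 2 for x in covariate)
    s_tx = sum((t - t_bar) * (x - x_bar) for t, x in zip(treatment, covariate))
    return 1 - s_tx ** 2 / (s_tt * s_xx)

# Eight hypothetical patients, four per arm (0 = A, 1 = B).
treatment = [0, 0, 0, 0, 1, 1, 1, 1]

# Covariate split 2:2 within each arm (balanced) and 3:1 versus 1:3 (unbalanced).
balanced_covariate = [1, 1, 0, 0, 1, 1, 0, 0]
unbalanced_covariate = [1, 1, 1, 0, 1, 0, 0, 0]

eff_balanced = efficiency(treatment, balanced_covariate)
eff_unbalanced = efficiency(treatment, unbalanced_covariate)
print(eff_balanced, eff_unbalanced)
```

In this toy case the unbalanced allocation inflates the variance of the treatment estimate by a factor of 1/0.75, which is the loss that covariate-balancing allocation rules aim to avoid.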

Efficiency of Randomised Design Compared to a Balanced One

Balanced Numbers: One Covariate

[Figure: efficiency (0.6 to 1.0) against sample size (6 to 24); 10 x 1000 runs at each sample size; series: run, mean, formula]

Is Covariate Balance Valid?

No covariates: $2n$ patients

$$\frac{(2n)!}{n!\,n!} \text{ balanced allocations}, \qquad 2^n \text{ sequentially balanced allocations}$$

One dichotomous covariate: $2m$ males, $2f$ females

$$\frac{(2m)!\,(2f)!}{m!\,m!\,f!\,f!} \text{ balanced allocations}, \qquad 2^m\,2^f \text{ sequentially balanced allocations}$$

[Figure: number of possible allocations (log scale, 1 to 1 × 10^6) against patients per group (0 to 10); series: Randomised, Minimised]
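The gap between the two kinds of allocation grows quickly, as a short calculation shows. The sketch assumes the counts on this slide are (2n)!/(n! n!) for balanced allocations of 2n patients and 2^n for allocations balanced after every successive pair, with the corresponding products over males and females when one dichotomous covariate is balanced; the function names are illustrative.

```python
from math import comb

def balanced_allocations(n):
    """Ways to split 2n patients into two groups of n."""
    return comb(2 * n, n)

def sequentially_balanced_allocations(n):
    """Ways to allocate 2n patients when each successive pair gets one A and one B."""
    return 2 ** n

def balanced_allocations_with_covariate(m, f):
    """Balanced allocations within 2m males and, separately, within 2f females."""
    return comb(2 * m, m) * comb(2 * f, f)

def sequentially_balanced_with_covariate(m, f):
    """Pairwise-balanced allocations within each sex."""
    return 2 ** m * 2 ** f

for n in (2, 5, 10):
    print(n, balanced_allocations(n), sequentially_balanced_allocations(n))
```

With 10 patients per group there are 184,756 balanced allocations but only 1,024 that are balanced pair by pair.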

The Problem

• All such sequential balancing methods restrict the randomisation strongly to a degree beyond that necessary to balance by the factor by the end of the trial
• This may lead to invalid variance estimates
  – incompatible with Fisher philosophy of randomised experiments
    • see also Nelder general balance
• A common defence would be that it leads to conservative inference
  – and that this is good

Don’t Forget the Variance Estimate

Full "Correct" Model     Reduced, Randomised     Reduced, Minimised

Treatment                Treatment               Treatment
Covariate
Error                    Error                   Error
Total                    Total                   Total

From the CAESAR Study
The Emperor’s New Allocation?

“Patients were allocated in a ratio of 1:2:1 (placebo: lamivudine: lamivudine plus loviride) by centrally administered minimisation algorithm, based on disease stage, CD4 count and current treatment at screening…

...The primary outcome, time to progression to a new protocol-defined AIDS event or death, was compared for those taking lamivudine relative to those taking placebo by the log-rank test stratified by baseline CD4 count, AIDS diagnosis, and current treatment.”

Randomisation Analysis

• FDA demanded a randomisation based analysis of this trial
• Two were performed
  – Assuming patient entry was random
  – Conditioning on the order in which patients arrived in the trial
• The first agreed closely with the model based analysis but the second gave a much more significant result

Why?

• The trial was in HIV/AIDS
• This is a condition for which prognosis has been rapidly improving over time
• I speculate that later cohorts had better prognosis than earlier
• Minimisation balanced by prognosis
• A model with time of entry would also have produced a precise answer


Adaptive Randomisation of a Clinical Trial

Randomised controlled trial of home based motivational interviewing by midwives to help pregnant smokers quit or cut down.

Tappin DM, Lumsden MA, Gilmour WH, Crawford F, McIntyre D, Stone DH, Webber R, MacIndoe S, Mohammed E.

Paediatric Epidemiology and Community Health Unit, Department of Child Health, Division of Developmental Medicine, University of Glasgow, Yorkhill, Glasgow G3 8SJ.

OBJECTIVE: To determine whether motivational interviewing--a behavioural therapy for addictions--provided at home by specially trained midwives helps pregnant smokers to quit. DESIGN: Randomised controlled non-blinded trial analysed by intention to treat. … RESULTS: 17/351 (4.8%) women in the intervention group stopped smoking (according to self report and serum cotinine concentration < 13.7 ng/ml) compared with 19/411 (4.6%) in the control group…


“We planned to recruit 930 women (310 intervention, 620 controls) to give 90% power to detect, at the 5% significance level, an improvement in quit rate from 7.5% in the control group6 to 15% in the intervention group. After six months the 1:2 intervention:control ratio was modified to 1:1 as pilot recruitment rates were not achieved.10 From 1 March 2001 to 31 May 2003 we recruited 351 women in the intervention group and 411 in the control group (figs 1 and 2), providing 89% power for a quit rate in the control group of 7.5%.6”

“Random allocation used balanced stratification for three levels of smoking before pregnancy (< 10, 10-20, > 20 cigarettes a day (level of smoking)) and cutting down (smoking half or less of the amount before pregnancy at the time of booking (change already)).”

“Multiple logistic regression was used to estimate the odds ratio of quitting and of smoking more with adjustment for potential confounders and variables used in stratification.”

Possible (Probable) Situation

Recruitment    Intervention    Control    Total

Early                60            120       180
Late                291            291       582

Total               351            411       762

Consequences and Implications

• Treatment is confounded with recruitment
• This point overlooked by authors
• Analysis only unbiased if no trend effect
  – But remember the CAESAR study
• Analysis violates principle of concurrent control
• This is true of all adaptive designs as conventionally analysed
  – For example bandit and play the winner designs
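The confounding can be made concrete with the allocation counts from the 'Possible (Probable) Situation' table. The quit rates in the sketch are invented for illustration: they are identical in the two arms within each recruitment period, so there is no treatment effect, but they rise between the early and late periods. The crude comparison nevertheless shows a difference.

```python
# Allocation counts from the 'Possible (Probable) Situation' table.
allocation = {
    "early": {"intervention": 60, "control": 120},
    "late": {"intervention": 291, "control": 291},
}

# Hypothetical quit rates: same in both arms within a period (no treatment
# effect), but higher for later recruits (a trend effect).
quit_rate = {"early": 0.04, "late": 0.08}

def arm_size(arm):
    return sum(allocation[period][arm] for period in allocation)

def crude_rate(arm):
    """Expected overall quit rate in one arm, ignoring recruitment period."""
    expected_quitters = sum(
        allocation[period][arm] * quit_rate[period] for period in allocation
    )
    return expected_quitters / arm_size(arm)

crude_intervention = crude_rate("intervention")
crude_control = crude_rate("control")

print(f"intervention: {crude_intervention:.4f}")
print(f"control:      {crude_control:.4f}")
print(f"crude difference: {crude_intervention - crude_control:.4f}")
```

The difference arises only because the intervention arm contains a larger share of late recruits. Comparing arms within each period, or including period in the model, removes it.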

Random effects Meta-analysis

$$
X_{ic} \sim N\!\left(\mu_i, \frac{\sigma_i^2}{n_{ic}}\right), \qquad
X_{it} \sim N\!\left(\mu_i + \tau_i, \frac{\sigma_i^2}{n_{it}}\right), \qquad
\tau_i \sim N\!\left(\tau, \gamma^2\right), \qquad
\mu_i \sim\ ????
$$

NB $\sigma_i^2 = \sigma^2$ if we have a classical linear model.

Some statisticians have put a random effect on this term (the one marked ????) without comment. Yet to do so permits recovery of inter-trial information which, if we had unequal randomisation within trials, might be substantial. Hence concurrent control is abandoned.


Just to show you can find this debate in sampling as well as experiments

“To come back to the context of sampling theory, as I mentioned before, the paper I published in JRSS 1966 demonstrated that the likelihood principle implies that inference should be independent of the sampling design in general. This led to the development of model theory in survey sampling. The proponents of this theory, Royall (10) and subsequently others, rejected all use of randomization frequencies at the inference stage. For them, the inference must follow strictly from the superpopulation model. I have no sympathy for this view. I firmly believe that all nontrivial inference would require both model probabilities and randomization frequencies. Looking at the development of model theory today, it looks as though its proponents use all sorts of excuses for using the sampling design, and still somehow in a religious way maintain their model theory. But all this had done some good; it has helped people understand the role of randomization in survey sampling better. Randomization has survived this attack and has emerged with new strength and new meaning.”

A conversation with Godambe

(My emphasis)

My View

• Randomisation cannot be used as an excuse to ignore prognostic covariates
• You must condition on what is considered important and has been observed
• However randomisation allows you to use the distribution in probability of unobserved covariates
  – By ‘marginalisation’
• Thus the model is primary
  – Reflects what is known/believed/assumed

However

• The assumed model defines a class of (nearly) equally suitable designs
• To refuse to choose one of these at random is
  – to declare that you are not confident with your model
  – to be unhelpful to others who do not share your views

In other words

We move from

“so that the components of error actually eliminated in the field shall equally be eliminated in the statistical laboratory”

to

“so that only components of error to be eliminated in the statistical laboratory shall actually be eliminated in the field”

And that we see randomisation as extremely valuable but not essential

If I may be allowed an autoquote

The randomised experiment with distinction between block and treatment factors, Normal error terms, correct and inevitable partitionings of variances determined by design, and close parallels between randomisation and modelling approaches, seems to us less like a commanding fortress of excellence, set somewhat apart in the city of inference, but more like a single apartment (albeit, perhaps, the penthouse suite) in the tower-block of data-modelling we all now occupy.

(C) Stephen Senn 2004