43
Department of SOCIAL MEDICINE University of BRISTOL Risk of bias assessments: analysis and interpretation Jonathan Sterne Department of Social Medicine, University of Bristol, UK

2010 smg training_cardiff_day2_session4_sterne

Embed Size (px)

Citation preview

Page 1: 2010 smg training_cardiff_day2_session4_sterne

1Department ofSOCIAL MEDICINE

University ofBRISTOL

Risk of bias assessments:analysis and interpretation

Jonathan SterneDepartment of Social Medicine, University of Bristol, UK

Page 2: 2010 smg training_cardiff_day2_session4_sterne

2Department ofSOCIAL MEDICINE

University ofBRISTOL

Bias in the results of meta-analyses• Reporting biases

– Publication bias– Selective reporting of outcomes– Fertile ground for statistical tests– Now, finally, addressed in the obvious way, by

mandatory trial registration

Page 3: 2010 smg training_cardiff_day2_session4_sterne

3Department ofSOCIAL MEDICINE

University ofBRISTOL

Page 4: 2010 smg training_cardiff_day2_session4_sterne

4Department ofSOCIAL MEDICINE

University ofBRISTOL

Bias in the results of meta-analyses• Reporting biases

– Publication bias– Selective reporting of outcomes– Fertile ground for statistical tests– Now, finally, addressed in the obvious way, by

mandatory trial registration• Biases resulting from flaws in trial conduct

Page 5: 2010 smg training_cardiff_day2_session4_sterne

5Department ofSOCIAL MEDICINE

University ofBRISTOL

5

Randomised Controlled Trials (RCTs)

• Simple idea … Eligible participants

Randomise

Interventiongroup

Comparisongroup

Assessoutcomes

Assessoutcomes

Page 6: 2010 smg training_cardiff_day2_session4_sterne

6Department ofSOCIAL MEDICINE

University ofBRISTOL

6

• Deceptively simple idea Eligible participants

Randomise

Interventiongroup

Comparisongroup

Assessoutcomes

Assessoutcomes

Bias can beintroduced atall stages of the conductof RCTs

Randomised Controlled Trials (RCTs)

Page 7: 2010 smg training_cardiff_day2_session4_sterne

7Department ofSOCIAL MEDICINE

University ofBRISTOL

Flaws in the conduct of RCTs• Trials provide causal inferences about the effect of the

intervention if we randomise sufficient individuals and avoid selection and performance biases

• This can be undermined by:– Inadequate generation of randomisation sequence

Page 8: 2010 smg training_cardiff_day2_session4_sterne

8Department ofSOCIAL MEDICINE

University ofBRISTOL

Flaws in the conduct of RCTs• Trials provide causal inferences about the effect of the

intervention if we randomise sufficient individuals and avoid selection and performance biases

• This can be undermined by:– Inadequate generation of randomisation sequence– Inadequate concealment of allocation

Problems with randomisation may cause selection bias, if participants or healthcare providers can predict treatment allocation

Page 9: 2010 smg training_cardiff_day2_session4_sterne

9Department ofSOCIAL MEDICINE

University ofBRISTOL

Flaws in the conduct of RCTs• Trials provide causal inferences about the effect of the

intervention if we randomise sufficient individuals and avoid selection and performance biases

• This can be undermined by:– Inadequate generation of randomisation sequence– Inadequate concealment of allocation– Inadequate blinding

• Performance bias• Care of intervention and control groups

not comparable

• Detection bias• Measurement of outcomes not

comparable

Page 10: 2010 smg training_cardiff_day2_session4_sterne

10Department ofSOCIAL MEDICINE

University ofBRISTOL

Flaws in the conduct of RCTs• Trials provide causal inferences about the effect of the

intervention if we randomise sufficient individuals and avoid selection and performance biases

• This can be undermined by:– Inadequate generation of randomisation sequence– Inadequate concealment of allocation– Inadequate blinding– Excluding patients, or analysing them in the wrong group

Page 11: 2010 smg training_cardiff_day2_session4_sterne

11Department ofSOCIAL MEDICINE

University ofBRISTOL

Including biased trials will cause meta-analyses to be biased

Page 12: 2010 smg training_cardiff_day2_session4_sterne

12Department ofSOCIAL MEDICINE

University ofBRISTOL

0.2 0.4 0.6 21

Meta-analysisSingle large trial

Odds Ratio

Nitrates in myocardial infarction

Inpatient geriatric assessment

Magnesium in myocardial infarction

Aspirin for pre-eclampsia prevention

Intervention

Egger et al. BMJ 1997

Page 13: 2010 smg training_cardiff_day2_session4_sterne

13Department ofSOCIAL MEDICINE

University ofBRISTOL

Including biased trials will cause meta-analyses to be biased

• An obvious solution is to score the quality of trials included in the meta-analysis

• We could then downweight low quality trials, or exclude trials scoring below a chosen quality threshold

Page 14: 2010 smg training_cardiff_day2_session4_sterne

14Department ofSOCIAL MEDICINE

University ofBRISTOL

The death of quality scores• 25 known checklists• Between 3 and 34 components• Frequently no definitions of quality• Most components said to be based on “accepted criteria”

(Moher et al. Controlled Clinical Trials 1995; 16: 62-73)

Page 15: 2010 smg training_cardiff_day2_session4_sterne

15Department ofSOCIAL MEDICINE

University ofBRISTOL

“Quality scores are uselessand potentially misleading”

Greenland Am.J.Epidemiol. 1994;140:290-296

“perhaps the most insidious form of subjectivity masquerading as objectivity is ‘quality scoring’. This practice subjectively merges objective information with arbitrary judgements in a manner that can obscure important sources of heterogeneity among study results”

Page 16: 2010 smg training_cardiff_day2_session4_sterne

16Department ofSOCIAL MEDICINE

University ofBRISTOL

Relative risk with 95% CI

0·5 0·75 1 1·25

Qua

lity

Scal

e 1-

25

All trials1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

Re-analysis using 25 different scales to assess trial quality

Endpoint: Deep vein thrombosis

Jüni et al. JAMA 1999 282: 1054-1060

High quality

Low quality

Page 17: 2010 smg training_cardiff_day2_session4_sterne

17Department ofSOCIAL MEDICINE

University ofBRISTOL

Empirical evidence of bias

Evidence-based critical appraisal

Page 18: 2010 smg training_cardiff_day2_session4_sterne

18Department ofSOCIAL MEDICINE

University ofBRISTOL

Meta-epidemiology(Naylor, BMJ 1997; 315: 617-619)

• Identify a large number of meta-analyses• Record characteristics of individual studies (allocation

concealment, blinding, type of publication, language etc.) • Compare treatment effects within each meta-analysis (for

example not double blind vs. double blind)• Estimate ratio of odds ratios

Page 19: 2010 smg training_cardiff_day2_session4_sterne

19Department ofSOCIAL MEDICINE

University ofBRISTOL

Schulz KF, Chalmers I, Hayes RJ, Altman DG. (1995) Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 273: 408-412.

0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2

No

YesDouble-blind

Yes

NoExclusions

Inadequate/unclear

AdequateSequence generation

Inadequate/unclear

AdequateConcealment of allocation

Ratio of Odds Ratios

Treatment effectover-estimation under-estimation

Reference

Empirical evidence of bias33 meta-analyses, 250 RCTs

Page 20: 2010 smg training_cardiff_day2_session4_sterne

20Department ofSOCIAL MEDICINE

University ofBRISTOL

NOTE: Weights are from random effects analysis

Overall (I-squared = 88.6%, p = 0.000)

Kjaergard, 2001

Als-Neilsen, 2004

Moher, 1998

Study ID

Balk, 2002

Egger, 2003

Schulz, 1995

0.79 (0.66, 0.95)

0.60 (0.37, 0.97)

1.02 (0.93, 1.13)

0.63 (0.45, 0.88)

ratios (95% CI)

0.95 (0.83, 1.09)

0.79 (0.70, 0.89)

0.66 (0.59, 0.73)

0.79 (0.66, 0.95)

0.60 (0.37, 0.97)

1.02 (0.93, 1.13)

0.63 (0.45, 0.88)

ratios (95% CI)Ratio of odds

0.95 (0.83, 1.09)

0.79 (0.70, 0.89)

0.66 (0.59, 0.73)

.25 1 4.25 .5 1 2

Ratio of odds ratios

Allocation concealment:combined evidence

Page 21: 2010 smg training_cardiff_day2_session4_sterne

21Department ofSOCIAL MEDICINE

University ofBRISTOL

Analysis of meta-epidemiological studies

• Most studies used a logistic regression approach assuming fixed effects within and between meta-analyses– Assumes no between-trial heterogeneity, and that effects of bias are

the same in each meta-analysis

• Two-stage approach:– Estimate the effect trials characteristics separately in each meta-

analysis– Combine estimates across meta-analyses

(Sterne et al. Statistics in Medicine 2002; 21: 1513-1524)

Page 22: 2010 smg training_cardiff_day2_session4_sterne

22Department ofSOCIAL MEDICINE

University ofBRISTOL

Non blinded more beneficial

Non blinded less beneficial

0.93 (0.83, 1.04)

Ratio of odds ratios(95% CI)

Overall (76)

No. of trials*

314 vs. 432

Comparison (No. of meta-analyses)

* Non blinded vs. blinded

Variability inbias (P value)

0.11 (p<0.001)The image cannot be displa…

Ratio of odds ratios0.5 0.75 1 1.5 2

1.01 (0.92, 1.10)

0.75 (0.61, 0.93)Subjective outcomes (32)

Objective outcomes (44) 210 vs. 227

104 vs. 205

0.08 (p<0.001)

0.14 (p=0.001)Th

The image cannot be displayed. Your computer may not have enough me…

Combined analysis of three empirical studies: blinding

Page 23: 2010 smg training_cardiff_day2_session4_sterne

23Department ofSOCIAL MEDICINE

University ofBRISTOL

Inadequately concealedmore beneficial

Inadequately concealed less beneficial

Ratio of odds ratios0.5 0.75 1 1.5 2

Ratio of odds ratios(95% CI)

0.83 (0.74, 0.93)Overall (102)

No. of trials*

532 vs. 272

Comparison (No. of meta-analyses)

* Inadequately/unclearly concealed vs. adequately concealed

0.11 (p<0.001)

The image cannot be displa…

The image cannot be displayed. Your computer may not have enou

0.91 (0.80, 1.03)

0.69 (0.59, 0.82)

Objective outcomes (62)

Subjective outcomes (40)

310 vs. 174

222 vs. 98

0.11 (p<0.001)

0.07 (p=0.011)

Variability inbias (P value)

Combined analysis of three empirical studies: allocation concealment

Wood, L., Egger, M., Gluud, L.L., Schulz, K., Jüni, P., Altman, D.G., Gluud, C., Martin, R.M., Wood, A.J.G. and Sterne, J.A.C. (2008) Empirical evidence of bias in treatment effect estimates in controlled trials with

different interventions and outcomes: meta-epidemiological study. BMJ, 336: 601-605.

Page 24: 2010 smg training_cardiff_day2_session4_sterne

24Department ofSOCIAL MEDICINE

University ofBRISTOL

Inadequately concealedmore beneficial

Inadequately concealed less beneficial

Ratio of odds ratios0.5 0.75 1 1.5 2

Ratio of odds ratios(95% CI)

0.83 (0.74, 0.93)Overall (102)

No. of trials*

532 vs. 272

Comparison (No. of meta-analyses)

* Inadequately/unclearly concealed vs. adequately concealed

0.11 (p<0.001)

The image cannot be displa…

The image cannot be displayed. Your computer may not have enou

0.91 (0.80, 1.03)

0.69 (0.59, 0.82)

Objective outcomes (62)

Subjective outcomes (40)

310 vs. 174

222 vs. 98

0.11 (p<0.001)

0.07 (p=0.011)

Variability inbias (P value)

Combined analysis of three empirical studies: allocation concealment

Wood, L., Egger, M., Gluud, L.L., Schulz, K., Jüni, P., Altman, D.G., Gluud, C., Martin, R.M., Wood, A.J.G. and Sterne, J.A.C. (2008) Empirical evidence of bias in treatment effect estimates in controlled trials with

different interventions and outcomes: meta-epidemiological study. BMJ, 336: 601-605.

Emprical evidence on the effect of flaws in trial

conduct will always be limited by imperfect

reporting of trials

Page 25: 2010 smg training_cardiff_day2_session4_sterne

25Department ofSOCIAL MEDICINE

University ofBRISTOL

Bias assessment in Cochrane reviews• Project led by Julian Higgins and Doug Altman• “Risk of bias”, not “quality”• Cochrane reviewers are now asked to judge whether there

is a risk of bias in the results of the trial– “Yes”: high risk of bias– “No”: low risk of bias

1. Sequence generation (randomisation)2. Allocation concealment3. Blinding of participants, personnel and outcomes4. Incomplete outcome data (attrition and exclusions)5. Selective outcome reporting6. Other (including topic-specific, design-specific)

Page 26: 2010 smg training_cardiff_day2_session4_sterne

26Department ofSOCIAL MEDICINE

University ofBRISTOL

26

The ROB tool: how to assess itemsTwo components

1. Description of what happened• possibly including ‘done’, ‘probably done’, ‘probably not done’ or

‘not done’ for some items

2. Review authors’ judgement • whether bias unlikely to be introduced through this item (Yes, No,

Unclear)

Yes = Low risk of biasNo = High risk of biasUnclear = unable to make a clear judgement

‘Blinding’ and ‘Incomplete outcome data’ may need separate assessments for different outcomes

Page 27: 2010 smg training_cardiff_day2_session4_sterne

27Department ofSOCIAL MEDICINE

University ofBRISTOL

27

Risk of bias tableStudy: Fisman 1981

Authors’ judgement Description

Sequence adequately generated?

UNCLEAR “Patients were randomly allocated”.

Allocation concealed? UNCLEAR No information.

Blinding?All included outcomes

Low risk

“double blind design”. “Millet… resembles lecithin in appearance… When ground, each substance could be distinguished from the other by hue and taste but staff were not informed as too which was which.”

Incomplete outcome data addressed?All included outcomes

High risk

Data unavailable for meta-analysis.Randomised: lecithin = Not stated, placebo = Not stated, Total = 33.Missing: lecithin = 7 (non-cooperation or diarrhoea = 2; moved to nursing home = 4, death = 2), placebo = 5 (non-cooperation or diarrhoea = 3, death = 2), total missing = 36%.

Page 28: 2010 smg training_cardiff_day2_session4_sterne

28Department ofSOCIAL MEDICINE

University ofBRISTOL

28

Study: Fisman 1981

Authors’ judgement Description

Free of selective reporting? High risk

No quantitative results reported due to lack of effect.It is apparently clear which outcomes were measured.

Free of other bias? Low risk No problems apparent.

Risk of bias table

Page 29: 2010 smg training_cardiff_day2_session4_sterne

29Department ofSOCIAL MEDICINE

University ofBRISTOL

Adequate sequence generationAllocation concealment

Blinding (Patient-reported outcomes)Blinding (Mortality)

Incomplete outcome data addressed (Short-term outcomes (2-6 wks))Incomplete outcome data addressed (Longer-term outcomes (> 6ks))

Free of selective reportingFree of other bias

Yes (Low risk of bias) Unclear No (High risk of bias)

Summary of risk of bias for included trials

Incorporating bias assessments into reviews

So now I’ve dealt with bias in my review? No!

Page 30: 2010 smg training_cardiff_day2_session4_sterne

30Department ofSOCIAL MEDICINE

University ofBRISTOL

30

Summarising risk of bias

• Reviewers will need to do this:– for an outcome within a study (across bias domains)– for an outcome across studies (for a meta-analysis)

• Outcome-level summaries should inform the choice of meta-analytic strategy for that outcome

• Meta-analysis-level summaries should inform the interpretation (summary of findings)

Page 31: 2010 smg training_cardiff_day2_session4_sterne

31Department ofSOCIAL MEDICINE

University ofBRISTOL

Incorporating outcome-level bias assessments into meta-analyses

1. Present all studies and provide a narrative discussion of risk of bias

– such an approach is discouraged because• Descriptions of bias are likely to be lost in discussion and conclusions• Results from studies at high risk of bias should be downweighted

2. Primary analysis restricted to studies at low (or low and unclear) risk of bias

– Often only a small proportion of trials– Reviewers reluctant to discard information,

3. Present multiple analyses with equal prominence– Confusing for readers and decision-makers

Page 32: 2010 smg training_cardiff_day2_session4_sterne

32Department ofSOCIAL MEDICINE

University ofBRISTOL

Evaluation of use of ROB tool

Page 33: 2010 smg training_cardiff_day2_session4_sterne

33Department ofSOCIAL MEDICINE

University ofBRISTOL

Testing for bias within a meta-analysis is unlikely to help

Page 34: 2010 smg training_cardiff_day2_session4_sterne

34Department ofSOCIAL MEDICINE

University ofBRISTOLOdds ratio

.01 .1 1 10 100

No. of eventsTreatment Control

Australia 1976 11/33 17/31Europe 1974 6/110 8/113Europe 1984 8/39 15/40Germany 1989 2/16 2/16Germany 1994 1/18 1/18Hong Kong 1974 1/20 1/20Japan 1977 4/47 0/41Switzerland 1975 1/18 3/22Taiwan 1997 2/21 2/19USA 1979a 6/16 11/15USA 1979b 4/7 5/8USA 1987 27/75 36/76USA 1994a 2/21 0/20USA 1994d 6/25 1/14USA 1996a 46/136 58/89USA 1997 88/205157/218

Subtotal 215/807317/760

Canada 1977 6/22 9/28Romania 1976 1/20 0/20USA 1988 15/126 18/142USA 1994b 12/37 21/34USA 1996b 3/10 1/11

Subtotal 37/215 49/235

Overall 252/1022366/995

Example: Clozapine versus neuroleptic medication for schizophrenia

Concealment inadequate (H)

Concealment adequate (L)

Page 35: 2010 smg training_cardiff_day2_session4_sterne

35Department ofSOCIAL MEDICINE

University ofBRISTOL

Concealment adequate (L)

Concealment inadequate (H)

All

0.25 0.50 1.00 2.00Treatment odds ratio (log scale)

Example: Clozapine versus neuroleptic medication for schizophrenia

ROR: 0.66 (0.31,1.41)

Page 36: 2010 smg training_cardiff_day2_session4_sterne

36Department ofSOCIAL MEDICINE

University ofBRISTOLRatio (inadeqate v adequate concealment of allocation)

.01 .05 .1 .5 1 2 5 10

Combined

The effects of components of trial quality are usually imprecisely estimated in a single meta-analysis

Little hope of adjusting for the effects of trial quality using only the information available in the meta-analysis

Data from Schulz et al. (JAMA 1995)

Page 37: 2010 smg training_cardiff_day2_session4_sterne

37Department ofSOCIAL MEDICINE

University ofBRISTOL

Effects of flaws in the conduct of trials

• Change in average intervention effect (bias)– the focus of most previous research

• Between-meta-analysis variability in average effect of bias

• Increases in between-trial heterogeneity

• If we knew that lack of blinding always exaggerated intervention effects by 20% there would be no problem

• Bias matters because its effects are uncertain

Page 38: 2010 smg training_cardiff_day2_session4_sterne

38Department ofSOCIAL MEDICINE

University ofBRISTOL

Consequences of flaws in trial conduct

Page 39: 2010 smg training_cardiff_day2_session4_sterne

39Department ofSOCIAL MEDICINE

University ofBRISTOL

Consequences of flaws in trial conduct

How might we use evidence about the effects of flaws in trial conduct, from meta-epidemiological studies, to combine data from studies at high and low risk of bias in meta-analyses?

Page 40: 2010 smg training_cardiff_day2_session4_sterne

40Department ofSOCIAL MEDICINE

University ofBRISTOL

Consequences for a single study at high risk of bias

• Correct the estimated effect of intervention, for the average bias associated with the flaw(s) in the trial

• Increase the trial variance by adding:– the average increase in between-trial heterogeneity– the between-meta-analysis variance in average bias

• So, trials at high risk of bias should be downweighted in meta-analyses

Page 41: 2010 smg training_cardiff_day2_session4_sterne

41Department ofSOCIAL MEDICINE

University ofBRISTOL

Concealment adequate (L)

Concealment inadequate (H)

All

Bias-adjusted0.25 0.50 1.00 2.00

Treatment odds ratio (log scale)

Example: Clozapine versus neuroleptic medication for schizophrenia

Relative weight(type H studies)=81%

Relative weight(type H studies)=37%w=0.41

ROR: 0.66 (0.31,1.41)

Page 42: 2010 smg training_cardiff_day2_session4_sterne

42Department ofSOCIAL MEDICINE

University ofBRISTOL

Consequences for meta-analyses• Limits to the informational value of studies at high

risk of bias:1. Even a very large study at high risk of bias has

minimum variance corresponding to the sum of variances of increase in heterogeneity and variance in bias

2. Even a meta-analysis of large studies at high risk of bias has minimum variance corresponding to the between-meta-analysis variance in average bias

Given current knowledge, the best approach for Cochrane review authors is to restrict meta-analyses to studies at low (or low and unclear) risk of bias

Page 43: 2010 smg training_cardiff_day2_session4_sterne

43Department ofSOCIAL MEDICINE

University ofBRISTOL

Conclusions• Flaws in the conduct of randomized controlled trials are

important because they increase uncertainty

• If we want to include potentially biased evidence in a systematic review, then we should downweight and correct for bias, based on empirical evidence on its effects

• The best currently available approach for Cochrane reviewers is to present a primary meta-analysis restricted to studies at low (or low and unclear) risk of bias– Bias assessments based on the Risk of Bias Tool seem widely

accepted

– Improvements to the Handbook, RevMan and training materials may be needed to improve incorporation of bias assessments in reviews and SoF