Comment

Douglas G. Altman*

Cancer Research UK/NHS Centre for Statistics in Medicine, Oxford, United Kingdom

The extent to which nonrandomised studies may yield reliable evidence remains a matter of some debate, and it is likely that the value of such studies will vary according to the clinical context and the interventions being evaluated (Deeks et al., 2003). While it may sometimes be the case that nonrandomised studies have better external validity than RCTs, those who favour such studies for this reason overlook, or underplay, the importance of the potential for bias in nonrandomised studies. Of particular importance are factors which influence the selection of treatment, notably those incurring the risk of ‘confounding by indication’ (Moses, 1995).

Randomised controlled trials (RCTs) are undoubtedly the best way to derive reliable evidence relating to the comparative effectiveness of two (or more) health care interventions. Randomisation is intended to yield groups of participants who differ only by chance in their characteristics, in particular (but not only) with respect to features associated with differential prognosis.

This line of reasoning is underpinned by the expectation that RCTs will give an unbiased answer to the question being addressed. In reality, there are many reasons why RCTs may fall short of that ideal. Key among these are biases due to lack of allocation concealment; lack of blinding, perhaps especially of assessment of outcomes; loss to follow up; and poor compliance with therapy. It should be remembered, though, that most of these issues apply also to nonrandomised studies, in addition to the key problem of lack of comparability (selection bias) already mentioned.

Those of us who have highlighted methodological failings in some RCTs run the risk of weakening the credibility of RCTs. This would clearly be most unfortunate, yet it seems wrong not to investigate and report weaknesses in RCTs as conducted and reported, in the hope that these studies can lead to even better studies in the future. In his very interesting paper, Vance Berger examines the extent to which allocation concealment may not protect against biased allocation, in the case where the trial is not blinded and then when it is (Berger, 2005).

The key distinction of RCTs is random allocation. As has become apparent in recent years, what matters is both the generation of a random sequence and the mechanism used to implement it, together with the concealment of this allocation process (Berger and Bears, 2003). The lack of concealed allocation has been recognised as one of the most important potential sources of bias (Jüni et al., 2001). The empirical evidence of the importance of allocation concealment may well have led researchers to tighten their procedures. This paper shows that the assumption that bias was thus impossible may not have been justified. (As an aside, the term allocation concealment may be only about 10 years old, but the principle was adopted by Bradford Hill 60 years ago for the MRC streptomycin trial.)

The most common options for treatment allocation in RCTs are simple randomisation and various forms of restricted randomisation: block randomisation, stratified block randomisation, and minimisation. All these methods should provide unbiased allocation, but only simple randomisation does not attempt to restrict the randomisation in some way to ensure good balance of numbers and/or characteristics of participants in the different treatment groups.

Berger considers the extent to which restricted randomisation could compromise the unpredictability that is expected to be provided by randomisation with allocation concealment. As he shows, in certain common situations in which block randomisation is used, it is possible for an appreciable percentage of assignments to be guessed correctly. In essence, if running totals of participants per arm are available, then more often than not the next assignment will be to the arm with the lower total.

* Corresponding author: e-mail: [email protected], Phone: +44 1865 226799, Fax: +44 1865 226962

Biometrical Journal 47 (2005) 2, 128–130 DOI: 10.1002/bimj.200510108

© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim


Although this suggestion is not completely new, I suspect that many trialists and statisticians will be unaware of the issue.
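To give a feel for the size of the effect, the guessing strategy just described can be evaluated exactly for a single permuted block. The following is a minimal sketch, not Berger's own calculation: it assumes two arms allocated 1:1, a fixed block size known to the guesser, full knowledge of past assignments, and a coin-flip guess when the running totals are tied. The function name `expected_correct_guess_rate` is hypothetical.

```python
from itertools import combinations


def expected_correct_guess_rate(block_size: int) -> float:
    """Exact expected share of correct guesses within one balanced two-arm block.

    Assumed setting (illustrative only): arms A and B, 1:1 allocation,
    the guesser always picks the arm with the lower running total and
    flips a fair coin when the totals are tied.
    """
    if block_size <= 0 or block_size % 2:
        raise ValueError("block_size must be a positive even number")
    half = block_size // 2
    total_correct = 0.0
    n_blocks = 0
    # Every equally likely block is defined by the positions given to arm A.
    for a_positions in combinations(range(block_size), half):
        a_set = set(a_positions)
        count_a = count_b = 0
        for position in range(block_size):
            actual = "A" if position in a_set else "B"
            if count_a == count_b:
                # Tie: a coin-flip guess is right half the time on average.
                total_correct += 0.5
            else:
                guess = "A" if count_a < count_b else "B"
                total_correct += 1.0 if guess == actual else 0.0
            if actual == "A":
                count_a += 1
            else:
                count_b += 1
        n_blocks += 1
    return total_correct / (n_blocks * block_size)


if __name__ == "__main__":
    for size in (2, 4, 6, 8):
        print(size, round(expected_correct_guess_rate(size), 4))
```

Under these assumptions the rate is 0.75 for blocks of two and 17/24 (about 0.71) for blocks of four, compared with 0.5 under simple randomisation. Real trials differ in ways this sketch ignores, such as stratification, varying block sizes and incomplete knowledge of earlier assignments.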

It seems clear, though, that the same issues apply to minimisation, although it would require simulation to evaluate the extent of possible bias due to guessing. The number of variables being minimised is likely to have an impact.
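A simulation of the kind mentioned here might be sketched as follows. Everything in it is an assumption made for illustration rather than a description of any actual trial: two arms, deterministic minimisation on independent binary factors with equally likely levels, random choice only when the minimisation scores are tied, and a guesser who sees only the overall running totals per arm. The function name `simulate_minimisation_guessing` and its parameters are hypothetical.

```python
import random


def simulate_minimisation_guessing(n_participants=2000, n_factors=2, seed=0):
    """Estimate how often 'guess the arm with the lower overall total' is
    correct under deterministic minimisation (illustrative assumptions only)."""
    if n_factors < 1:
        raise ValueError("n_factors must be at least 1")
    rng = random.Random(seed)
    arms = ("A", "B")
    # counts[arm][factor][level]: participants already in `arm` with that level.
    counts = {arm: [[0, 0] for _ in range(n_factors)] for arm in arms}
    totals = {"A": 0, "B": 0}
    correct = 0.0
    for _ in range(n_participants):
        # Assumed: each binary factor level is equally likely and independent.
        levels = [rng.randint(0, 1) for _ in range(n_factors)]
        # The guess is made before allocation, from overall totals only.
        if totals["A"] == totals["B"]:
            guess = None  # tie: treated as a coin flip below
        else:
            guess = "A" if totals["A"] < totals["B"] else "B"
        # Minimisation score: prior participants in each arm who share
        # this participant's level, summed over the factors.
        score = {
            arm: sum(counts[arm][f][levels[f]] for f in range(n_factors))
            for arm in arms
        }
        if score["A"] == score["B"]:
            assigned = rng.choice(arms)
        else:
            assigned = "A" if score["A"] < score["B"] else "B"
        correct += 0.5 if guess is None else float(guess == assigned)
        for f in range(n_factors):
            counts[assigned][f][levels[f]] += 1
        totals[assigned] += 1
    return correct / n_participants


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, round(simulate_minimisation_guessing(n_factors=k, seed=1), 3))
```

Varying `n_factors` gives a rough indication of how the number of minimised variables affects predictability in this simplified setting. A guesser who also knew the new participant's characteristics and the marginal totals would do better than the one simulated here, and minimisation with a random element would be harder to predict.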

While I accept Berger’s calculations and share his concerns, it is worth noting that there is no empirical evidence that bias is indeed engendered by taking advantage of the strategy outlined. Berger and Weinstein (2004) discussed several trials with evidence of possible compromised allocation, but there was no clear evidence of bias due to guessing and indeed in some cases the problems seemed clearly due to other factors such as opening envelopes ahead of time. The absence of clear evidence does not mean that such practice does not happen, of course, nor that we should not guard against it.

I agree with Berger’s first suggestion, that “those who design trials should use allocation procedures that are resistant to selection bias”. One way to achieve this is to abandon all but simple randomisation. I don’t really recommend that strategy, but there is unlikely to be consensus on which forms of restricted randomisation might still have adequate protection. The perceived advantages of blocking and stratification are not perhaps as great as some may think but they do serve a useful purpose. I would be reluctant to abandon them totally. Clearly, though, there is a strong case for omitting centre from the list of stratifying variables, and for reducing predictability. The same applies to minimisation.

Schulz and Grimes (2002) recently suggested a new variant of restricted randomisation called ‘mixed randomisation’. They make two suggestions that should reduce the risk of selection bias even in unblinded trials in which the sequence of past assignments is accessible. First, they mix randomised permuted blocks with occasional sequences of simple randomisation (possibly of random length). Second, they suggest starting with a block of deliberately unbalanced assignments so that balance will be the exception rather than the rule later in the sequence. This second innovation might reduce the risk of successfully guessing the next assignment. While this is clearly a more complex system, it makes the upcoming assignments rather less easy to guess.
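The two ideas can be combined in a short sketch of a sequence generator. This is not the procedure as specified by Schulz and Grimes; the block sizes, the size and imbalance of the opening block, the probability of switching to simple randomisation and the maximum run length are all hypothetical parameters chosen for illustration, as is the function name `mixed_randomisation`.

```python
import random


def mixed_randomisation(n, block_sizes=(4, 6), initial_block_size=4,
                        initial_imbalance=2, p_simple=0.3,
                        max_simple_run=5, seed=None):
    """Generate n two-arm assignments mixing permuted blocks with runs of
    simple randomisation, after an unbalanced opening block (a sketch only)."""
    if (initial_block_size + initial_imbalance) % 2 or \
            not 0 < initial_imbalance <= initial_block_size:
        raise ValueError("initial block size and imbalance are incompatible")
    if any(size <= 0 or size % 2 for size in block_sizes):
        raise ValueError("block sizes must be positive even numbers")
    rng = random.Random(seed)

    # Opening block: deliberately unbalanced, with the favoured arm itself
    # chosen at random so that the direction of imbalance is not predictable.
    majority = rng.choice("AB")
    minority = "B" if majority == "A" else "A"
    n_major = (initial_block_size + initial_imbalance) // 2
    sequence = [majority] * n_major + [minority] * (initial_block_size - n_major)
    rng.shuffle(sequence)

    while len(sequence) < n:
        if rng.random() < p_simple:
            # A run of simple randomisation of random length.
            run_length = rng.randint(1, max_simple_run)
            sequence.extend(rng.choice("AB") for _ in range(run_length))
        else:
            # A balanced permuted block of randomly chosen size.
            size = rng.choice(block_sizes)
            block = ["A", "B"] * (size // 2)
            rng.shuffle(block)
            sequence.extend(block)
    # Truncating to n may cut the final block, so exact balance is not guaranteed.
    return sequence[:n]


if __name__ == "__main__":
    print("".join(mixed_randomisation(40, seed=2005)))
```

Because the opening block is unbalanced and simple randomisation is interleaved at unpredictable points, the running totals are a weaker guide to the next assignment than in plain permuted blocks, at the cost of somewhat less reliable balance between arms.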

I agree with Berger that there is an important distinction between the process and the reality, and this applies to several aspects of trial methodology (e.g. blinding). I am not really convinced, though, by the feasibility of his second suggestion: “Those who evaluate reports of randomized trials, such as regulatory agencies and medical journals, should require assessments of selection bias.” Likewise his third suggestion, that “Those who perform meta-analyses should consider selection bias when determining the weights to assign to each study included,” does not seem workable.

Berger’s final idea, that “researchers should develop methods to salvage useful and reliable between-group comparisons from trials that are found to have selection bias” is certainly reasonable in principle, but it relies on developing ways to be sure that such bias has occurred. I am not convinced that it is possible. After all, small P values for baseline comparisons will inevitably occur from time to time.
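The last point can be illustrated with a simple calculation. As a sketch under idealised assumptions (a properly randomised trial, and baseline comparisons that are independent of one another, which real baseline variables rarely are), the chance that at least one of k comparisons gives P below a threshold alpha is 1 - (1 - alpha)^k. The function name `prob_any_small_p` is hypothetical.

```python
def prob_any_small_p(n_comparisons: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent baseline comparisons
    gives P < alpha when allocation is truly random (idealised assumption)."""
    if n_comparisons < 0 or not 0.0 <= alpha <= 1.0:
        raise ValueError("need n_comparisons >= 0 and 0 <= alpha <= 1")
    return 1.0 - (1.0 - alpha) ** n_comparisons


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(k, round(prob_any_small_p(k), 3))
```

With ten baseline variables and alpha of 0.05 this gives roughly 0.40, so an isolated small P value at baseline is weak evidence of subverted allocation.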

It is certainly important that RCTs are made as bias-free as is reasonably possible. This paper by Berger is valuable in pointing out that widely used methods that are generally thought to fulfil that need are in fact open to some potential bias, and that more care is needed in selecting a suitable allocation procedure. We should consider how best to avoid the possibility of the type of selection bias described in this paper. However, we must ensure that in the process we do not undermine the considerable efforts of the vast majority of trialists who strive honestly to obtain reliable evidence about health care interventions.

References

Berger, V. (2005). Quantifying the magnitude of baseline covariate imbalances resulting from selection bias in randomized clinical trials. Biometrical Journal 47, 119–127.

Berger, V. and Bears, J. D. (2003). When can a clinical trial be called randomised? Vaccine 21, 468–472.


Berger, V. W. and Weinstein, S. (2004). Ensuring the comparability of comparison groups: is randomization enough? Controlled Clinical Trials 25, 515–524.

Deeks, J. J., Dinnes, J., D’Amico, R. A., Sowden, A. J., Sakarovitch, C., Song, F., Petticrew, M., and Altman, D. G. (2003). Evaluating non-randomised intervention studies. Health Technology Assessment 7(27), 1–173.

Jüni, P., Altman, D. G., and Egger, M. (2001). Assessing the quality of controlled clinical trials. British Medical Journal 323, 42–46.

Moses, L. E. (1995). Measuring effects without randomized trials? Options, problems, challenges. Medical Care 33(4 Suppl), AS8–AS14.

Schulz, K. F. and Grimes, D. A. (2002). Unequal group sizes in randomised trials: guarding against guessing. Lancet 359, 966–970.
