
PHARMACEUTICAL STATISTICS

Pharmaceut. Statist. 2005; 4: 141–145

Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/pst.170

Literature Review January–March 2005

Simon Day1,*,† and Meinhard Kieser2

1Medicines and Healthcare Products Regulatory Agency, Room 13-205, Market Towers, 1 Nine Elms Lane, London SW8 5NQ, UK
2Department of Biometry, Dr Willmar Schwabe Pharmaceuticals, Karlsruhe, Germany

INTRODUCTION

This review covers the following journals received during the period January 2005 to the middle of March 2005:

* Applied Statistics, volume 54, parts 1, 2.
* Biometrical Journal, volume 46, part 6 and volume 47, part 1.
* Biometrics, volume 60, part 4.
* Biometrika, volume 91, part 4.
* Biostatistics, volume 6, part 1.
* Clinical Trials, volume 1, part 6 and volume 2, part 1.
* Communications in Statistics – Simulation and Computation, volume 33, part 4.
* Communications in Statistics – Theory and Methods, volume 33, parts 11, 12.
* Computational Statistics & Data Analysis, volume 48, parts 1–4 and volume 49, part 1.
* Drug Information Journal, volume 39, part 1.
* Journal of Biopharmaceutical Statistics, volume 15, part 1.
* Journal of the American Statistical Association, volume 99, part 4 and volume 100, part 1.
* Journal of the Royal Statistical Society, Series A, volume 168, part 1.
* Statistics in Medicine, volume 24, parts 1–7.
* Statistical Methods in Medical Research, volume 14, part 1.

SELECTED HIGHLIGHTS FROM THE LITERATURE

The theme of Statistical Methods in Medical Research was:

* Part 1: Disease mapping (pages 1–112).

The following special issue of Statistics in Medicine has been published:

* Lipman HB, Cadwell B, Kerkering JC, Lin LS, Sieber WK (editors). 9th Biennial CDC/ATSDR Symposium on Statistical Methods: Study Design and Decision Making in Public Health. Statistics in Medicine 2005; 24(4):491–669.

One tutorial has appeared in Statistics in Medicine:

* Austin PC, Tu JV, Daly PA, Alter DA. The use of quantile regression in health care research: a case study examining gender differences in the timeliness of thrombolytic therapy. Statistics in Medicine 2005; 24:791–816.

The first issue of 2005 of the Biometrical Journal is dedicated to the topic of therapeutic equivalence. The eight papers included are from presentations given at a conference on this subject held in Düsseldorf, Germany. A number of methodological problems surrounding non-inferiority trials are addressed, such as the choice of the equivalence margin (based on a review of methods used in published non-inferiority trials), the simultaneous comparison of efficacy and safety, and the assessment of non-inferiority for time-to-event data and for binary outcomes. This selection of papers from recognized experts in the field is supplemented by comments from Peter Bauer and Stephen Senn, and by a tribute from Robert O'Neill to Joachim Röhmel (to whom the issue is dedicated) upon his retirement from the Bundesinstitut für Arzneimittel und Medizinprodukte.

* Biometrical Journal, Volume 47, part 1 (pages 5–107).

Phase I

Trials are usually assumed to enrol (and then randomize) a single, homogeneous group of patients. Yuan and Chappell discuss up-and-down designs, isotonic designs and the continual reassessment method in the context of groups of cancer patients with very different susceptibility to toxicity. Instead of running several trials in different subgroups, which may yield conflicting results, they enrol all patients and adjust the analysis to give a coherent result across all strata in the study.

*Correspondence to: Simon Day, Medicines and Healthcare Products Regulatory Agency, Room 13-205, Market Towers, 1 Nine Elms Lane, London SW8 5NQ, UK.
†E-mail: [email protected]


* Yuan Z, Chappell R. Isotonic designs for phase I cancer clinical trials with multiple risk groups. Clinical Trials 2004; 1:499–508.
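For readers unfamiliar with the isotonic idea, the following is a minimal sketch (our illustration, not Yuan and Chappell's multi-group design) of the pool-adjacent-violators algorithm that underlies isotonic estimation: observed toxicity rates are forced to be non-decreasing in dose. The rates shown are hypothetical.

```python
def pava(y):
    """Pool-adjacent-violators algorithm: the non-decreasing fit
    behind isotonic dose-toxicity estimation."""
    blocks = []  # each block: [pooled mean, number of pooled points]
    for v in y:
        blocks.append([float(v), 1])
        # pool backwards while the monotonicity constraint is violated
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    return [m for m, w in blocks for _ in range(w)]

# Hypothetical observed toxicity rates by increasing dose level:
print(pava([0.05, 0.20, 0.15, 0.40]))  # -> [0.05, 0.175, 0.175, 0.4]
```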

The paper by Rosenberger et al. describes a user-friendly, interactive, web-based software system for Bayesian phase I trials that sequentially assigns dose levels to patients. The background methodology (published in 2003 in Biometrics) is briefly sketched, but the focus is on the development requirements and how they were met. A demonstration version of the program, which was developed by an interdisciplinary team of statisticians, programmers, an information systems expert and an oncologist, is available on the web.

* Rosenberger WF, Canfield GC, Perevozskaya I, Haines LM, Hausner P. Development of interactive software for Bayesian optimal phase I clinical trial design. Drug Information Journal 2005; 39:89–98.

Summarizing concentration–time profiles of a drug is a common problem, but one usually approached by comparing various summary measures (area under the curve, maximum concentration, time to maximum concentration, and so on). A direct comparison of the shapes of the curves is more difficult, partly because the problem is harder to specify. The following paper gives a nice method and explains it clearly with a real example:

* Liao JJZ. Comparing the concentration curves directly in a pharmacokinetics, bioavailability/bioequivalence study. Statistics in Medicine 2005; 24:883–891.
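As a reminder of what those summary measures are, here is a minimal sketch (our illustration, not Liao's method, with hypothetical times and concentrations) computing AUC by the linear trapezoidal rule together with Cmax and Tmax:

```python
import numpy as np

def pk_summaries(t, c):
    """Summary measures for a concentration-time profile:
    AUC(0-tlast) by the linear trapezoidal rule, Cmax and Tmax."""
    t, c = np.asarray(t, float), np.asarray(c, float)
    i = int(np.argmax(c))
    return {"AUC": np.trapz(c, t), "Cmax": c[i], "Tmax": t[i]}

# Hypothetical profile: times in hours, concentrations in ng/mL
print(pk_summaries([0, 0.5, 1, 2, 4, 8], [0, 12, 20, 15, 7, 2]))
```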

Phase II

Logan describes a flexible phase II design that combines learning about – and dropping – less effective treatment arms with sample size re-estimation, to ensure that the treatment arms finally selected give enough useful information to pass on to phase III. It is as efficient as running several single-arm trials but generally requires a smaller total sample size.

* Logan BR. Optimal two-stage randomized phase II clinical trials. Clinical Trials 2005; 2:5–12.

Surrogate endpoints

Assessing the suitability of an endpoint as a surrogate is never easy. Korn et al. use a mixed-level approach to consider data on individual patients as well as data at the trial level. It is an interesting paper in its own right, but is made more so by an accompanying commentary and a rejoinder. All should be read together:

* Korn EL, Albert PS, McShane LM. Assessing surrogates as trial endpoints using mixed models. Statistics in Medicine 2005; 24:163–182.
* Freedman LS. Commentary on ‘Assessing surrogates as trial endpoints using mixed models’. Statistics in Medicine 2005; 24:183–185.
* Korn EL, Albert PS, McShane LM. Rejoinder to commentary by Dr Freedman of ‘Assessing surrogates as trial endpoints using mixed models’. Statistics in Medicine 2005; 24:187–190.

Multiplicity

Jan and Shieh propose two closed testing procedures to identify the minimum effective dose and compare them in a Monte Carlo simulation. It is a real practical benefit that their procedures make no ordering assumptions about the dose–response relationship.

* Jan S-L, Shieh G. Nonparametric multiple test procedures for dose finding. Communications in Statistics – Simulation and Computation 2004; 33:1021–1037.

The following paper by Romano and Wolf presents a further development in the field of stepdown methods in multiple testing, of which the well-known Bonferroni–Holm procedure is a special case. They construct general stepdown procedures that do not require an assumption needed for the resampling-based method proposed by Westfall and Young.

* Romano JP, Wolf M. Exact and approximate stepdown methods for multiple hypothesis testing. Journal of the American Statistical Association 2005; 100:94–108.
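As a point of reference for the special case they generalize, here is a minimal sketch of the Bonferroni–Holm stepdown procedure (the p-values are hypothetical):

```python
def holm(pvals, alpha=0.05):
    """Bonferroni-Holm stepdown: work through the p-values from
    smallest to largest, comparing the (k+1)-th smallest with
    alpha/(m - k); stop at the first non-rejection."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for k, i in enumerate(order):
        if pvals[i] > alpha / (m - k):
            break  # stepdown stops here; all larger p-values fail too
        reject[i] = True
    return reject

print(holm([0.011, 0.02, 0.004, 0.4]))  # -> [True, True, True, False]
```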

Sample size calculation and recalculation

More on issues following re-estimation of sample size part way through a study: this paper uses both the estimate of variance and the size of the effect at the unblinded interim analysis to determine the eventual study size. As the title suggests, the authors present details of estimating unbiased effect sizes and their confidence intervals.

* Cheng Y, Shen Y. Estimation of a parameter and its exact confidence interval following sequential sample size reestimation trials. Biometrics 2004; 60:910–918.

Liu et al. address the same topic but from the viewpoint of conditional estimation. They provide methods for the estimation of primary and secondary parameters that can be applied in group sequential and adaptive designs.

* Liu A, Troendle JF, Yu KF, Yuan VW. Conditional maximum likelihood estimation following a group sequential test. Biometrical Journal 2004; 46:760–768.

Wüst and Kieser present a method for sample size recalculation in the case of binary outcomes; it uses only information from blinded data and thus does not require the randomization code to be broken. The focus is on the following problem: since recruitment is usually not stopped when the patients whose data are used for the sample size re-estimation are enrolled, a number of patients will not yet have gone through the whole treatment phase at that time but will already have reached some intermediate point of the observation phase. The proposed procedure takes these ‘short-term’ data into account, which decreases the variability of the resulting sample size and power.

* Wüst K, Kieser M. Including long- and short-term data in blinded sample size recalculation for binary endpoints. Computational Statistics & Data Analysis 2005; 48:835–855.
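A minimal sketch of the blinded idea for binary endpoints (not Wüst and Kieser's short-term-data refinement): the pooled event rate is estimated from blinded interim data and, together with the treatment difference assumed at the planning stage, plugged into a standard sample size formula. All numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

z = NormalDist().inv_cdf

def blinded_n_per_group(pooled_rate, delta, alpha=0.05, power=0.8):
    """Recalculate the per-group sample size for a binary endpoint
    from the pooled (blinded) event rate; with 1:1 allocation the
    pooled rate is (p1 + p2)/2, so the arm rates can be backed out
    from it and the assumed difference delta = p1 - p2."""
    p1, p2 = pooled_rate + delta / 2, pooled_rate - delta / 2
    num = (z(1 - alpha / 2) + z(power)) ** 2 * (p1*(1 - p1) + p2*(1 - p2))
    return ceil(num / delta ** 2)

# Blinded pooled rate 0.35 at the interim, planned difference 0.15:
print(blinded_n_per_group(0.35, 0.15))
```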

Finally, a paper on non-adaptive sample size calculation. Ahn and Jung examine the effect of dropouts in repeated measurement studies analysed with the generalized estimating equation (GEE) method. A sample size formula proposed by the authors that incorporates the dropout pattern turns out to work well. On the other hand, the naive approach of calculating the sample size without taking dropouts into account and then correcting this number by dividing it by the proportion of completers yields empirical powers larger than the desired value.

* Ahn C, Jung S-H. Effect of dropout on sample size estimates for test on trends across repeated measurements. Journal of Biopharmaceutical Statistics 2005; 15:33–41.
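The naive correction the authors caution against is easy to state; a minimal sketch (with hypothetical design values, and a two-sample means formula standing in for the GEE setting):

```python
from math import ceil
from statistics import NormalDist

z = NormalDist().inv_cdf

def n_two_means(delta, sd, alpha=0.05, power=0.8):
    """Per-group sample size for comparing two means."""
    return (z(1 - alpha / 2) + z(power)) ** 2 * 2 * sd ** 2 / delta ** 2

# Naive rule: size the trial ignoring dropout, then divide by the
# expected proportion of completers; per Ahn and Jung, this tends
# to overshoot the desired power.
n = n_two_means(delta=5.0, sd=12.0)
print(ceil(n), ceil(n / 0.8))  # unadjusted vs naively inflated n
```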

Interim analyses

Two welcome papers, not so much to do with interim analyses as with data monitoring committees, appeared in Clinical Trials. Among the various bodies involved in clinical trials – the data monitoring committee, steering committee, sponsor, funder, principal investigator – who is responsible for what? That depends on what the agreement was. DeMets et al. suggest some (legally justified) wording to help everyone know who is responsible for what. It is just a pity that we need such legal recourse whilst trying to find treatments to help seriously ill patients.

* DeMets DL, Fleming TR, Rockhold F, Massie B, Merchant T, Meisel A, Mishkin B, Wittes J, Stump D, Califf R. Liability issues for data monitoring committee members. Clinical Trials 2004; 1:525–531.

Clemens et al. give some insight into the closed (but increasingly opening) world of data monitoring committees:

* Clemens F, Elbourne D, Darbyshire J, Pocock S, DAMOCLES Group. Data monitoring in randomized controlled trials: surveys of recent practice and policies. Clinical Trials 2005; 2:22–33.

The paper by Cook and Kosorok considers clinical trials with interim analyses where the primary variable is the time to the first of a number of possible clinical events. In such studies, a committee is often employed to classify reported events according to whether or not they meet the criteria defined for primary endpoint events. The authors propose a procedure that accounts for the uncertain information on events that could not be definitively classified at the time of an interim analysis.

* Cook TD, Kosorok MR. Analysis of time-to-event data with incomplete event adjudication. Journal of the American Statistical Association 2004; 99:1140–1152.

Study design

Measurements should be reliable – no one would dispute such a statement. So John Lachin has set out to justify what ‘reliability’ means and, for example, how it is different from validity. The relative reliability of different outcome measures should help to influence which ones are used in clinical trials:

* Lachin JM. The role of measurement reliability in clinical trials. Clinical Trials 2004; 1:553–566.

In ‘superiority’ trials it is well understood why the analysis of the intention-to-treat set must take precedence over that of the per-protocol set. In non-inferiority trials the situation is not so clear, which is why the two analyses should support each other (and the per-protocol analysis is often seen as primary). Brittain and Lin provide empirical evidence that, in studies of antibiotics, the two analysis sets tend to give the same result. In non-inferiority trials, the intention-to-treat set is often considered insensitive to treatment differences, particularly in the face of poor trial quality. Maybe that is not so in good quality trials and – if many trials are of good quality – then both analysis sets might be expected to agree. But perhaps we still need per-protocol analyses, just in case the trial is not of such high quality.

* Brittain E, Lin D. A comparison of intent-to-treat and per-protocol results in antibiotic non-inferiority trials. Statistics in Medicine 2005; 24:1–10.

It is good to see a randomized trial of how to carry out randomized trials… what induces patients to take part in, comply with and remain in study protocols? Avenell et al. compare an open, unblinded design with a randomized, double-blind design. Rather worryingly, the authors advocate open, unblinded studies – whilst acknowledging that this may threaten the validity of the results. Perhaps this is a nice result to know, but not one that can be used.

* Avenell A, Grant AM, McGee M, McPherson G, Campbell MK, McGee MA. The effects of an open design on trial participant recruitment, compliance and retention – a randomized controlled trial comparison with a blinded, placebo-controlled design. Clinical Trials 2004; 1:490–498.

(See also the papers noted in the ‘Miscellaneous’ section about the science of study management.)

Targeting treatments to the right patients is a good intention. Maitournam and Simon consider the efficiency of running a trial in a small – but targeted – population rather than taking the broader population of ‘all comers’. The relative efficiency depends on the size of effect in various subpopulations and the prevalence of those subsets. Of course, we also need to think about what target population we want to try to get a licence for!


* Maitournam A, Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 2005; 24:329–339.

Data analysis issues

The following method could be applicable across the range of pharmaceutical statistical work, from basic laboratory science through to phase III trials. The paper goes right back to laboratory work and considers the problem of limits of quantitation for laboratory assays: when concentrations become very low, estimating the concentration becomes very imprecise. The problem is also applicable to environmental epidemiology (and other areas); the author illustrates the ideas using data from a clinical inhalation study (as well as simulated data).

* Cox C. Limits of quantitation for laboratory assays. Applied Statistics 2005; 54:63–76.

Missing values, and more criticism of ‘last observation carried forward’: Mallinckrodt et al. show more results of the type that are becoming quite common when comparing repeated measures models with imputation of missing values by LOCF. Much of what they say is sound, but what we really need instead of these result-driven papers are some that carefully challenge and explain what assumptions are being made by different models and what sorts of sensitivity analyses might help (and how they can be done). Some of the assumptions stated about LOCF are not those that many statisticians would make, and this gives a potentially off-putting introduction to an otherwise very clear paper.

* Mallinckrodt CH, Kaiser CJ, Watkin JG, Molenberghs G, Carroll RJ. The effect of correlation structure on treatment contrasts estimated from incomplete clinical trial data with likelihood-based repeated measures compared with last observation carried forward ANOVA. Clinical Trials 2004; 1:477–489.
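For concreteness, a minimal illustration (hypothetical data) of the LOCF imputation being criticized: within each subject, a missing visit is replaced by the most recent observed value.

```python
import pandas as pd

# Hypothetical repeated-measures data; NaN marks visits missing
# after dropout.
df = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2],
    "visit":   [1, 2, 3, 1, 2, 3],
    "score":   [10.0, 12.0, None, 9.0, None, None],
})

# Last observation carried forward within each subject:
df["score_locf"] = df.sort_values("visit").groupby("subject")["score"].ffill()
print(df)
```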

The paper by Ibrahim et al. gives a thorough review of common approaches for dealing with missing covariate data in generalized linear models (GLMs), namely maximum likelihood, multiple imputation, fully Bayesian methods, and weighted estimating equations. It is extremely helpful for the reader that not only the relationships between these procedures are discussed but also their advantages and disadvantages and their computational implementation.

* Ibrahim JG, Chen M-H, Lipsitz SR, Herring AH. Missing-data methods for generalized linear models: a comparative review. Journal of the American Statistical Association 2005; 100:332–346.

Missing values are usually seen as an endpoint problem. White and Thompson consider the problem of missing values in the covariates and evaluate the appropriateness of various methods (imputation and others) to overcome it.

* White IR, Thompson SG. Adjusting for partially missing baseline measurements in randomized trials. Statistics in Medicine 2005; 24:993–1007.
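One simple approach of the kind evaluated (our sketch, hypothetical data): mean imputation of the missing baseline values together with a missingness indicator, after which the usual covariate-adjusted analysis proceeds.

```python
import pandas as pd

# Hypothetical trial data with a partially missing baseline covariate.
df = pd.DataFrame({
    "treat":    [0, 0, 1, 1, 0, 1],
    "baseline": [5.2, None, 4.8, None, 6.1, 5.0],
    "outcome":  [7.1, 6.5, 5.9, 6.2, 8.0, 5.5],
})

# Impute the covariate mean and flag which values were imputed; the
# outcome model can then adjust for baseline_imp (and the indicator).
df["base_missing"] = df["baseline"].isna().astype(int)
df["baseline_imp"] = df["baseline"].fillna(df["baseline"].mean())
print(df)
```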

We are all clear about the dangers of adjusting for covariates measured after (and influenced by) treatment assignment. That is what Vansteelandt and Goetghebeur set out to do, but using that inevitable post-randomization variable: drug exposure. The authors are fully aware of the value of intention-to-treat analyses and so carefully search for what can, and cannot, be concluded based on observed drug exposure.

* Vansteelandt S, Goetghebeur E. Sense and sensitivity when correcting for observed exposures in randomized clinical trials. Statistics in Medicine 2005; 24:191–210.

The following paper gives a review (including available software sources) of several methods for calculating confidence intervals for the risk ratio in non-inferiority trials. Furthermore, results of a thorough simulation study are presented that investigates the characteristics of the various procedures with respect to type I error rate, power and agreement of test decisions.

* Dann RS, Koch GG. Review and evaluation of methods for computing confidence intervals for the ratio of two proportions and considerations for non-inferiority clinical trials. Journal of Biopharmaceutical Statistics 2005; 15:85–107.
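One of the simpler intervals of the kind reviewed is the large-sample interval on the log risk ratio scale; a minimal sketch (hypothetical counts):

```python
from math import exp, sqrt
from statistics import NormalDist

def risk_ratio_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample (log-scale) confidence interval for the risk
    ratio p1/p2, where x events are observed in n subjects per arm."""
    rr = (x1 / n1) / (x2 / n2)
    se = sqrt(1/x1 - 1/n1 + 1/x2 - 1/n2)  # SE of log(rr)
    z = NormalDist().inv_cdf((1 + level) / 2)
    return rr * exp(-z * se), rr * exp(z * se)

# 80/100 cures on test vs 85/100 on reference: compare the lower
# limit with the non-inferiority margin for the risk ratio.
print(risk_ratio_ci(80, 100, 85, 100))
```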

Lawson examines ten methods for computing confidence intervals for the odds ratio in 2×2 tables when there are small sample sizes:

* Lawson R. Small sample confidence intervals for the odds ratio. Communications in Statistics – Simulation and Computation 2004; 33:1095–1113.
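One candidate method in this setting is the logit (Woolf) interval with a 0.5 continuity correction to keep it defined for sparse tables; a minimal sketch (hypothetical table):

```python
from math import exp, log, sqrt
from statistics import NormalDist

def or_ci_logit(a, b, c, d, level=0.95, corr=0.5):
    """Logit (Woolf) interval for the odds ratio of the 2x2 table
    [[a, b], [c, d]], with a 0.5 correction for small samples."""
    a, b, c, d = (x + corr for x in (a, b, c, d))
    lor = log(a * d / (b * c))
    se = sqrt(1/a + 1/b + 1/c + 1/d)  # SE of the log odds ratio
    z = NormalDist().inv_cdf((1 + level) / 2)
    return exp(lor - z * se), exp(lor + z * se)

print(or_ci_logit(4, 16, 1, 21))  # a small, sparse 2x2 table
```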

The paper by Nam et al. investigates three statistical methods for establishing non-inferiority based on exponentially distributed censored survival data. For those who are sceptical about the value of a high degree of specialization, the results are just what they were waiting for: the performance of the tests with respect to significance level and power turns out to be virtually identical. However, in contrast to the robust nonparametric log-rank test, the two parametric competitors are quite sensitive to departures from the assumption of an exponential model, leading to heavy inflation of the test size.

* Nam J-M, Kim J, Lee S. Equivalence of two treatments and sample size determination under exponential survival model with censoring. Computational Statistics & Data Analysis 2005; 49:217–226.

The Wilcoxon–Mann–Whitney test is widely applied in the analysis of clinical trials. The paper by Chen and Luo proposes some improvements to versions of this test that are implemented in software packages and used in practice.

* Chen X, Luo X. Some modifications on the application of the exact Wilcoxon–Mann–Whitney test. Communications in Statistics – Simulation and Computation 2004; 33:1007–1020.
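The test itself is a one-liner in most packages; a minimal example with hypothetical scores (scipy's standard implementation, not Chen and Luo's modifications). The choice between the exact distribution and the normal approximation, and the handling of ties, are exactly the implementation details at issue:

```python
from scipy.stats import mannwhitneyu

# Hypothetical outcome scores in two treatment groups.
active  = [3, 5, 6, 6, 8, 9]
control = [1, 2, 2, 4, 5, 7]
res = mannwhitneyu(active, control, alternative="two-sided")
print(res.statistic, res.pvalue)
```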

Pharmacovigilance

The Medical Dictionary for Regulatory Activities (MedDRA) has now been available for about 10 years. MedDRA is required for submissions of individual case safety reports in Japan and Europe, while the FDA encourages sponsors to use it. However, MedDRA is also widely used for adverse event analyses in clinical trials. The paper by Kübler et al. compares MedDRA to previous dictionaries, gives recommendations for adverse event analyses based on MedDRA, and discusses how to deal with practical problems such as the semi-annual updating of MedDRA, which may lead to different versions being in force during an ongoing trial.

* Kübler J, Vonk R, Beimel S, Gunselmann W, Homering M, Nehrdich D, Koster J, Theobald K, Voleske P. Adverse event analysis and MedDRA: business as usual or challenge? Drug Information Journal 2005; 39:63–72.

Miscellaneous

Studying fraud and misconduct is not easy (speculation is so much easier!). Reynolds summarizes findings of the US Office of Research Integrity over a 10-year period. Of course, we (and they) never know what they didn't get to see. The paper compares misconduct in clinical trials with that in epidemiological research and by seniority of staff, and describes some of the sanctions imposed. It concludes by accepting that carelessness and poor research practice – which are not considered ‘misconduct’ – may actually have a greater negative impact than genuine cases of fraud/misconduct (but that takes us back into the world of speculation again).

* Reynolds SM. ORI findings of scientific misconduct in clinical trials and publicly funded research, 1992–2002. Clinical Trials 2004; 1:509–516.

Finally, as was mentioned earlier, it is good to see science applied to how to do trials. The most recent issue of Clinical Trials carried four articles discussing the benefits(?) of web-based ‘clinical trialing’, introduced by an accompanying editorial:

* Reboussin D, Espeland MA. The science of web-based clinical trial management (editorial). Clinical Trials 2005; 2:1–2.
* Winget M, Kincaid H, Lin P, Li L, Kelly S, Thornquist M. A web-based system for managing and co-ordinating multiple multisite studies. Clinical Trials 2005; 2:42–49.
* Schmidt JR, Vignati AJ, Pogash RM, Simmons VA, Evans RL. Web-based distributed data management in the Childhood Asthma Research and Education (CARE) Network. Clinical Trials 2005; 2:50–60.
* Mitchell R, Shah M, Ahmad S, Smith Rogers A, Ellenberg JH. A unified web-based Query and Notification System (QNS) for subject management, adverse events, regulatory, and IRB components of clinical trials. Clinical Trials 2005; 2:61–71.
* Litchfield J, Freeman J, Schou H, Elsley M, Fuller R, Chubb B. Is the future of clinical trials internet-based? A randomized clinical trial. Clinical Trials 2005; 2:72–79.
