PHARMACEUTICAL STATISTICS
Pharmaceut. Statist. 2005; 4: 141–145
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/pst.170
Literature Review January–March 2005
Simon Day1,*,† and Meinhard Kieser2
1Medicines and Healthcare Products Regulatory Agency, Room 13-205, Market Towers, 1 Nine Elms Lane, London SW8 5NQ, UK
2Department of Biometry, Dr Willmar Schwabe Pharmaceuticals, Karlsruhe, Germany
INTRODUCTION
This review covers the following journals received during the
period January 2005 to the middle of March 2005:
* Applied Statistics, volume 54, parts 1, 2.
* Biometrical Journal, volume 46, part 6 and volume 47, part 1.
* Biometrics, volume 60, part 4.
* Biometrika, volume 91, part 4.
* Biostatistics, volume 6, part 1.
* Clinical Trials, volume 1, part 6 and volume 2, part 1.
* Communications in Statistics – Simulation and Computation, volume 33, part 4.
* Communications in Statistics – Theory and Practice, volume 33, parts 11, 12.
* Computational Statistics & Data Analysis, volume 48, parts 1–4 and volume 49, part 1.
* Drug Information Journal, volume 39, part 1.
* Journal of Biopharmaceutical Statistics, volume 15, part 1.
* Journal of the American Statistical Association, volume 99, part 4 and volume 100, part 1.
* Journal of the Royal Statistical Society, Series A, volume 168, part 1.
* Statistics in Medicine, volume 24, parts 1–7.
* Statistical Methods in Medical Research, volume 14, part 1.
SELECTED HIGHLIGHTS FROM THE LITERATURE
The theme of Statistical Methods in Medical Research was:
* Part 1: Disease mapping (pages 1–112).
The following special issue of Statistics in Medicine has been
published:
* Lipman HB, Cadwell B, Kerkering JC, Lin LS, Sieber WK (editors). 9th Biennial CDC/ATSDR Symposium on Statistical Methods: Study Design and Decision Making in Public Health. Statistics in Medicine 2005; 24(4):491–669.
One tutorial has appeared in Statistics in Medicine:
* Austin PC, Tu JV, Daly PA, Alter DA. The use of quantile
regression in health care research: a case study examining
gender differences in the timeliness of thrombolytic therapy.
Statistics in Medicine 2005; 24:791–816.
The first issue of 2005 of the Biometrical Journal is dedicated
to the topic of therapeutic equivalence. The eight papers
included are from presentations given at a conference on this
subject held in Düsseldorf, Germany. A number of methodological
problems surrounding non-inferiority trials are addressed,
such as the choice of the equivalence margin (based on
a review of methods used in published non-inferiority trials),
the simultaneous comparison of efficacy and safety, and the
assessment of non-inferiority for time-to-event data and for
binary outcomes. This selection of papers from recognized
experts in this field is supplemented by comments by Peter
Bauer and Stephen Senn and a tribute to Joachim Röhmel (to
whom this issue is dedicated) upon his retirement from the
Bundesinstitut für Arzneimittel und Medizinprodukte by
Robert O'Neill.
* Biometrical Journal, Volume 47, part 1 (pages 5–107).
Phase I
Trials are usually assumed to enrol (and then randomize) a
single, homogeneous group of patients. Yuan and Chappell
discuss up-and-down designs, the isotonic design and the continual
reassessment method in the context of groups of cancer
patients with very different toxicity susceptibility. Instead of
running several trials in different subgroups, which may yield
conflicting results, they enrol all patients and adjust the analysis
to give a coherent result across all strata in the study.
Copyright © 2005 John Wiley & Sons, Ltd.
†E-mail: [email protected]
*Correspondence to: Simon Day, Medicines and Healthcare Products Regulatory Agency, Room 13-205, Market Towers, 1 Nine Elms Lane, London SW8 5NQ, UK.
* Yuan Z, Chappell R. Isotonic designs for phase I cancer
clinical trials with multiple risk groups. Clinical Trials 2004;
1:499–508.
The paper by Rosenberger et al. describes a user-friendly
interactive web-based software system for Bayesian phase I
trials that sequentially assigns the dose levels to patients. The
background methodology (which was published in 2003 in
Biometrics) is briefly sketched, but the focus is on a description
of the development requirements and their solutions.
A demonstration version of the program, which
was developed by an interdisciplinary team of statisticians,
programmers, an information system expert and an oncologist,
is available from the web.
* Rosenberger WF, Canfield GC, Perevozskaya I, Haines
LM, Hausner P. Development of interactive software for
Bayesian optimal phase I clinical trial design. Drug
Information Journal 2005; 39:89–98.
Summarizing concentration–time profiles of a drug is a common
problem, but one usually approached by comparing various
summary measures (area under the curve, maximum concentration,
time to maximum concentration, and so on). A direct comparison
of the shapes of the curves is more difficult, partly because the
problem is more difficult to specify. The following paper gives a
nice method and explains it clearly with a real example:
* Liao JJZ. Comparing the concentration curves directly in a
pharmacokinetics, bioavailability/bioequivalence study.
Statistics in Medicine 2005; 24:883–891.
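The summary measures mentioned above are straightforward to compute from sampled data. As a minimal sketch (the sampling times, concentrations and units below are invented purely for illustration):

```python
# Hypothetical profile; AUC by the trapezoidal rule, plus Cmax and Tmax.
def summary_measures(times, concs):
    """Return AUC (trapezoidal rule), Cmax and Tmax for one profile."""
    auc = sum((t1 - t0) * (c0 + c1) / 2.0
              for t0, t1, c0, c1 in zip(times, times[1:], concs, concs[1:]))
    cmax = max(concs)
    tmax = times[concs.index(cmax)]
    return auc, cmax, tmax

times = [0, 1, 2, 4, 8]            # sampling times (hours)
concs = [0.0, 4.0, 6.0, 3.0, 1.0]  # concentrations (e.g. ng/mL)
auc, cmax, tmax = summary_measures(times, concs)
```

Liao's point is precisely that comparisons of such scalar summaries discard the shape of the curves; his paper compares the curves directly.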
Phase II
Logan describes a flexible phase II design which combines
learning about – and dropping less effective – treatment arms
with sample size re-estimation to ensure that the treatment arms
finally selected give enough useful information to pass on to
phase III. It is as efficient as running several single-arm trials,
but generally requires a smaller total sample size than running
those trials separately.
* Logan BR. Optimal two-stage randomized phase II clinical
trials. Clinical Trials 2005; 2:5–12.
Surrogate endpoints
Assessing the usability of an endpoint as a surrogate endpoint is
never easy. Korn et al. use a mixed level approach to consider
data on individual patients as well as data at the trial level. It is
an interesting paper in its own right, but is made more so by an
accompanying commentary and then rejoinder. All should be
read together:
* Korn EL, Albert PS, McShane LM. Assessing surrogates as
trial endpoints using mixed models. Statistics in Medicine
2005; 24:163–182.
* Freedman LS. Commentary on ‘Assessing surrogates as
trial endpoints using mixed models’. Statistics in Medicine
2005; 24:183–185.
* Korn EL, Albert PS, McShane LM. Rejoinder to commentary by Dr Freedman of 'Assessing surrogates as trial endpoints using mixed models'. Statistics in Medicine 2005; 24:187–190.
Multiplicity
Jan and Shieh propose two closed testing procedures to identify
the minimum effective dose and compare them in a Monte
Carlo simulation. It is a real practical benefit that their
procedures do not make any order assumptions on the dose–
response relationship.
* Jan S-L, Shieh G. Nonparametric multiple test procedures
for dose finding. Communications in Statistics – Simulation
and Computation 2004; 33:1021–1037.
The following paper by Romano and Wolf presents a further
development in the field of stepdown methods in multiple
testing, of which the well-known Bonferroni–Holm procedure is
a special case. They construct general stepdown procedures that
do not require an assumption needed for the resampling-based
method proposed by Westfall and Young.
* Romano JP, Wolf M. Exact and approximate stepdown
methods for multiple hypothesis testing. Journal of the
American Statistical Association 2005; 100:94–108.
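As a reminder, the Bonferroni–Holm special case mentioned above can be sketched in a few lines. This is Holm's classical procedure only, not Romano and Wolf's resampling-based generalization; the p-values are invented:

```python
def holm(p_values, alpha=0.05):
    """Indices of hypotheses rejected by the Bonferroni-Holm stepdown test."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # ascending p-values
    rejected = []
    for step, i in enumerate(order):
        # Compare the (step+1)-th smallest p-value with alpha / (m - step)
        if p_values[i] <= alpha / (m - step):
            rejected.append(i)
        else:
            break  # first non-rejection: retain all remaining hypotheses
    return rejected

rejected = holm([0.01, 0.04, 0.03, 0.005])
```

With these four p-values the two smallest clear their stepwise thresholds (0.05/4 and 0.05/3), but the third fails its threshold of 0.05/2, so testing stops there.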
Sample size calculation and recalculation
More on issues following re-estimation of sample size part way
through a study: this paper uses both the estimate of variance
and the size of the effect at the unblinded interim analysis to
determine eventual study size. As the title suggests, the authors
present details of estimating unbiased effect sizes and their
confidence intervals.
* Cheng Y, Shen Y. Estimation of a parameter and its exact confidence interval following sequential sample size reestimation trials. Biometrics 2004; 60:910–918.
Liu et al. address the same topic but from the viewpoint of
conditional estimation. They provide methods that can be
applied in group sequential and adaptive designs for the
estimation of primary and secondary parameters.
* Liu A, Troendle JF, Yu KF, Yuan VW. Conditional maximum likelihood estimation following a group sequential test. Biometrical Journal 2004; 46:760–768.
Wüst and Kieser present a method for sample size recalculation
in the case of binary outcomes; it uses only information
from blinded data and thus does not require the randomization
code to be broken. The focus is on the following problem: as
recruitment is usually not stopped when those patients are
enrolled whose data are used for sample size re-estimation, a
number of patients have not gone through the whole treatment
phase at that time but have already reached some intermediate
point of the observation phase. The proposed procedure takes
these 'short-term' data into account, which leads to decreased
variability of the resulting sample size and power.
* Wüst K, Kieser M. Including long- and short-term data in
blinded sample size recalculation for binary endpoints.
Computational Statistics & Data Analysis 2005; 48:835–855.
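The blinded-recalculation idea can be illustrated in a much-simplified form. The sketch below is generic, with invented interim numbers; it is not the authors' procedure, which additionally exploits short-term data:

```python
# Generic sketch (not the authors' method): estimate the overall event
# rate from pooled, still-blinded interim data, then recompute the
# per-group sample size under an assumed treatment difference.
import math

def blinded_recalc(events, n_interim, assumed_delta,
                   z_alpha=1.959964, z_beta=0.841621):
    """Per-group n for comparing two proportions, from blinded interim data."""
    p_bar = events / n_interim        # pooled (blinded) event rate
    p1 = p_bar - assumed_delta / 2    # implied group rates under the
    p2 = p_bar + assumed_delta / 2    # assumed difference
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * var / assumed_delta ** 2)

n_per_group = blinded_recalc(events=30, n_interim=100, assumed_delta=0.15)
```

Because only the pooled rate is used, no treatment codes need be unblinded, which is exactly the appeal of this class of methods.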
Finally, a paper on non-adaptive sample size calculation.
Ahn and Jung examine the effect of dropouts in repeated
measurement studies analysed with the generalized estimation
equation (GEE) method. A sample size formula proposed by
the authors that includes the dropout pattern turns out to work
well. On the other hand, the naive approach of calculating the
sample size without taking dropouts into account, and then
correcting this number by dividing it by the proportion of
completers, yields empirical powers larger than the desired value.
* Ahn C, Jung S-H. Effect of dropout on sample size
estimates for test on trends across repeated measurements.
Journal of Biopharmaceutical Statistics 2005; 15:33–41.
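The naive correction criticized above is simple arithmetic; with invented numbers:

```python
# Invented numbers illustrating the naive dropout correction: inflate a
# complete-data sample size by dividing by the expected completer rate.
import math

n_complete = 120      # sample size calculated assuming no dropout
completer_rate = 0.8  # anticipated proportion of patients who complete

n_naive = math.ceil(n_complete / completer_rate)
```

Ahn and Jung's simulations suggest this inflation is more than is needed under the dropout patterns they study, hence the empirical powers above the target.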
Interim analyses
Two welcome papers, not so much to do with interim analyses
as with data monitoring committees, appeared in Clinical Trials.
Among the various bodies involved in clinical trials – the data
monitoring committee, steering committee, sponsor, funder,
principal investigator – who is responsible for what? That
depends on what the agreement was. DeMets et al. suggest
some wording (legally justified) to help everyone know who is
responsible for what. It is just a pity that we need such legal
recourse whilst trying to find treatments to help seriously ill
patients.
* DeMets DL, Fleming TR, Rockhold F, Massie B, Merchant
T, Meisel A, Mishkin B, Wittes J, Stump D, Califf R.
Liability issues for data monitoring committee members.
Clinical Trials 2004; 1:525–531.
Clemens et al. give some insight into the closed (but increasingly
opening) world of data monitoring committees:
* Clemens F, Elbourne D, Darbyshire J, Pocock S, DAMO-
CLES Group. Data monitoring in randomized controlled
trials: surveys of recent practice and policies. Clinical Trials
2005; 2:22–33.
The paper by Cook and Kosorok considers clinical trials with
interim analyses where the primary variable is the time to the
first of a number of possible clinical events. In such studies, a
committee is often employed to classify whether or not reported
events meet the criteria defined for primary endpoint events.
The authors propose a procedure that accounts for uncertain
information on events that could not be classified definitively
at the time of an interim analysis.
* Cook TD, Kosorok MR. Analysis of time-to-event data
with incomplete event adjudication. Journal of the American
Statistical Association 2004; 99:1140–1152.
Study design
Measurements should be reliable – no one would dispute such a
statement. So John Lachin has set out to explain what
'reliability' means and, for example, how it is different from
validity. Relative reliability of different outcome measures
should help to influence which ones are used in clinical trials:
* Lachin JM. The role of measurement reliability in clinical
trials. Clinical Trials 2004; 1:553–566.
In 'superiority' trials it is well understood why analysis of
the intention-to-treat set must take precedence over
that of the per-protocol set. In non-inferiority trials the
situation is not so clear, and that is why both analyses should
be supportive (and the per-protocol analysis is often seen as
primary). Brittain and Lin provide empirical evidence that, in
studies of antibiotics, the two analysis sets tend to give the same
result. In non-inferiority trials, the intention-to-treat set is often
considered insensitive to treatment differences, particularly in
the face of poor trial quality. Maybe that is not so in good
quality trials and – if many trials are of good quality – then
both analysis sets might be expected to agree. But perhaps we
still need per protocol analyses, just in case the trial is not of
such high quality.
* Brittain E, Lin D. A comparison of intent-to-treat and per-
protocol results in antibiotic non-inferiority trials. Statistics
in Medicine 2005; 24:1–10.
It is good to see a randomized trial of how to carry out
randomized trials... what induces patients to take part in,
comply with and remain in study protocols? Avenell et al.
compare an open, unblinded design with a randomized, double-
blind design. Rather worryingly, the authors advocate open,
uncontrolled studies – whilst acknowledging that may threaten
the validity of the results. Perhaps this is a nice result to know,
but not one that can be used.
* Avenell A, Grant AM, McGee M, McPherson G, Campbell
MK, McGee MA. The effects of an open design on trial
participant recruitment, compliance and retention – a
randomized controlled trial comparison with a blinded,
placebo-controlled design. Clinical Trials 2004; 1:490–498.
(See also the papers noted in the ‘miscellaneous’ section about
the science of study management.)
Targeting treatments to the right patients is a good intention.
Maitournam and Simon consider the efficiency of running a
trial in a small – but targeted – population rather than taking
the broader population of ‘all comers’. The relative efficiency
depends on the size of effect in various subpopulations and the
prevalence of those subsets. Of course, we also need to think
about what target population we want to try to get a licence for!
* Maitournam A, Simon R. On the efficiency of targeted
clinical trials. Statistics in Medicine 2005; 24:329–339.
Data analysis issues
The following methods could be applicable across the range
of pharmaceutical statistical work, from basic laboratory science
through to phase III trials. The paper goes right back to laboratory
work and considers the problems of limits of quantitation
for laboratory assays. When concentrations become very
low, estimating the concentration becomes very imprecise.
This problem is also applicable to environmental epidemiology
(and other areas); the author illustrates the ideas using
data from a clinical inhalation study (as well as from simulated
data).
* Cox C. Limits of quantitation for laboratory assays. Applied
Statistics 2005; 54:63–76.
Missing values and more criticism of ‘last observation carried
forward’: Mallinckrodt et al. show more results of the type that
are becoming quite common when comparing repeated
measures models with those of imputing missing values by
LOCF. Much of what they say is sound, but what we really
need instead of these result-driven papers are some that
carefully challenge and explain what assumptions are being
made by different models and what sorts of sensitivity analyses
might help (and how they can be done). Some of the
assumptions being stated about LOCF are not those that many
statisticians would make, and this gives a potentially off-putting
introduction to an otherwise very clear paper.
* Mallinckrodt CH, Kaiser CJ, Watkin JG, Molenberghs G,
Carroll RJ. The effect of correlation structure on treatment
contrasts estimated from incomplete clinical trial data with
likelihood-based repeated measures compared with last
observation carried forward ANOVA. Clinical Trials 2004;
1:477–489.
The paper by Ibrahim et al. gives a thorough review of
common approaches for dealing with missing covariate data in
generalized linear models (GLMs), namely maximum likelihood,
multiple imputation, fully Bayesian methods, and weighted
estimating equations. It is extremely helpful for the reader
that not only the relationship between these procedures is
discussed, but also their advantages and disadvantages and
their computational implementation.
* Ibrahim JG, Chen M-H, Lipsitz SR, Herring AH. Missing-
data methods for generalized linear models: a comparative
review. Journal of the American Statistical Association 2005;
100:332–346.
Missing values are usually seen as an endpoint problem.
White and Thompson consider the problem of missing values in
the covariates and evaluate the appropriateness of various
methods (imputation and others) to overcome it.
* White IR, Thompson SG. Adjusting for partially missing
baseline measurements in randomized trials. Statistics in
Medicine 2005; 24: 993–1007.
We are all clear about the dangers of adjusting for covariates
measured after (and influenced by) treatment assignment. Yet
that is what Vansteelandt and Goetghebeur set out to do, using
that inevitable post-randomization variable: drug exposure.
The authors are fully aware of the value of intention-to-treat
analyses and so carefully search for what can, and cannot, be
concluded based on observed drug exposure.
* Vansteelandt S, Goetghebeur E. Sense and sensitivity when
correcting for observed exposures in randomized clinical
trials. Statistics in Medicine 2005; 24:191–210.
The following paper gives a review (including available
software sources) of several methods for calculating confidence
intervals for the risk ratio in non-inferiority trials. Furthermore,
results of a thorough simulation study are presented that
investigates the characteristics of the various procedures with
respect to type I error rate, power and agreement of test
decisions.
* Dann RS, Koch GG. Review and evaluation of methods for
computing confidence intervals for the ratio of two
proportions and considerations for non-inferiority clinical
trials. Journal of Biopharmaceutical Statistics 2005; 15:
85–107.
Lawson examines ten methods for computing confidence
intervals for the odds ratio in 2×2 tables when there are small
sample sizes:
* Lawson R. Small sample confidence intervals for the odds
ratio. Communications in Statistics – Simulation and
Computation 2004; 33:1095–1113.
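For background, one familiar large-sample interval is the Woolf logit interval. It is not necessarily among the ten methods Lawson compares, and its poor small-sample behaviour is exactly why alternatives are studied; the cell counts below are invented:

```python
# Woolf logit interval for the odds ratio of a 2x2 table with cells
# a, b (row 1) and c, d (row 2).
import math

def woolf_ci(a, b, c, d, z=1.959964):
    """Point estimate and 95% CI for the odds ratio (a*d)/(b*c)."""
    or_hat = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log odds ratio
    lo = math.exp(math.log(or_hat) - z * se)
    hi = math.exp(math.log(or_hat) + z * se)
    return or_hat, lo, hi

or_hat, lo, hi = woolf_ci(10, 5, 4, 8)
```

The interval is symmetric on the log-odds scale; exact, mid-p and adjusted alternatives exist precisely because such large-sample intervals can be unreliable when cell counts are small.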
The paper by Nam et al. investigates three statistical methods
for establishing non-inferiority based on exponentially distributed
censored survival data. For those who are sceptical about
the value of a high degree of specialization, the results are just
what they were waiting for: the performance of the tests with
respect to significance level and power turns out to be virtually
identical. However, in contrast to the robust nonparametric
log-rank test, the two parametric competitors are quite sensitive
to departures from the assumption of an exponential model
leading to heavy inflation of the test size.
* Nam J-M, Kim J, Lee S. Equivalence of two treatments and
sample size determination under exponential survival model
with censoring. Computational Statistics & Data Analysis
2005; 49:217–226.
The Wilcoxon–Mann–Whitney test is widely applied in the
analysis of clinical trials. The paper by Chen and Luo proposes
some improvements to versions of this test that are implemented
in software packages and used in practice.
* Chen X, Luo X. Some modifications on the application of
the exact Wilcoxon–Mann–Whitney test. Communications in
Statistics – Simulation and Computation 2004; 33:1007–1020.
Pharmacovigilance
The Medical Dictionary for Regulatory Activities (MedDRA)
has now been available for about 10 years. MedDRA is
required for submissions of individual case safety reports in
Japan and Europe, while the FDA encourages sponsors to use
it. MedDRA is also widely used for adverse event
analyses in clinical trials. The paper by Kübler et al. compares
MedDRA to previous dictionaries, gives recommendations for
adverse event analyses based on MedDRA and discusses how to
deal with practical problems such as the semi-annual updating of
MedDRA, which may lead to different versions existing
during an ongoing trial.
* Kübler J, Vonk R, Beimel S, Gunselmann W, Homering M,
Nehrdich D, Koster J, Theobald K, Voleske P. Adverse
event analysis and MedDRA: business as usual or
challenge? Drug Information Journal 2005; 39:63–72.
Miscellaneous
Studying fraud and misconduct is not easy (speculation is so
much easier!). Reynolds summarizes findings of the US Office
of Research Integrity over a 10-year period. Of course, we (and
they) never know what they didn’t get to see. The paper
compares misconduct in trials and in epidemiological research
and by seniority of staff, and describes some of the sanctions imposed.
It concludes by accepting that carelessness and poor research
practice – which are not considered as ‘misconduct’ – may
actually have a greater negative impact than genuine cases of
fraud/misconduct (but that takes us back into the world of
speculation again).
* Reynolds SM. ORI findings of scientific misconduct in
clinical trials and publicly funded research, 1992–2002.
Clinical Trials 2004; 1:509–516.
Finally, as was mentioned earlier, it is good to see science
applied to how to do trials. The most recent issue of Clinical
Trials carried four articles discussing the benefits(?) of web-
based ‘clinical trialing’, introduced by an accompanying
editorial:
* Reboussin D, Espeland MA. The science of web-based clinical trial management (editorial). Clinical Trials 2005; 2:1–2.
* Winget M, Kincaid H, Lin P, Li L, Kelly S, Thornquist M. A web-based system for managing and co-ordinating multiple multisite studies. Clinical Trials 2005; 2:42–49.
* Schmidt JR, Vignati AJ, Pogash RM, Simmons VA, Evans RL. Web-based distributed data management in the Childhood Asthma Research and Education (CARE) Network. Clinical Trials 2005; 2:50–60.
* Mitchell R, Shah M, Ahmad S, Smith Rogers A, Ellenberg JH. A unified web-based Query and Notification System (QNS) for subject management, adverse events, regulatory, and IRB components of clinical trials. Clinical Trials 2005; 2:61–71.
* Litchfield J, Freeman J, Schou H, Elsley M, Fuller R, Chubb B. Is the future of clinical trials internet-based? A randomized clinical trial. Clinical Trials 2005; 2:72–79.