PHARMACEUTICAL STATISTICS
Pharmaceut. Statist. 2005; 4: 221–224
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/pst.177
Literature Review March–June 2005
Simon Day1,*,† and Meinhard Kieser2
1Medicines and Healthcare products Regulatory Agency, Room 13-205, Market Towers, 1 Nine Elms Lane, London SW8 5NQ, UK
2Department of Biometry, Dr Willmar Schwabe Pharmaceuticals, Karlsruhe, Germany
INTRODUCTION
This review covers the following journals received during the
period from the middle of March 2005 to the middle of June 2005:
* Applied Statistics, volume 54, part 3.
* Biometrical Journal, volume 47, part 2.
* Biometrics, volume 61, part 1.
* Biometrika, volume 92, part 1.
* Biostatistics, volume 6, part 2.
* Clinical Trials, volume 2, parts 2, 2s.
* Communications in Statistics – Simulation and Computation, volume 34, parts 1, 2.
* Communications in Statistics – Theory and Methods, volume 34, parts 1–5.
* Computational Statistics & Data Analysis, volume 49, parts 2–4.
* Drug Information Journal, volume 39, part 2.
* Journal of Biopharmaceutical Statistics, volume 15, parts 2, 3.
* Journal of the American Statistical Association, volume 100, part 2.
* Journal of the Royal Statistical Society, Series A, volume 168, parts 2, 3.
* Statistics in Medicine, volume 24, parts 8–13.
* Statistical Methods in Medical Research, volume 14, part 2.
SELECTED HIGHLIGHTS FROM THE LITERATURE
A collection of nine papers was published in Clinical Trials:
the proceedings of a workshop organised by the UK Medical
Research Council on cluster randomized trials. An introductory
editorial sets the scene:
* Moulton LH. A practical look at cluster-randomized trials
(editorial). Clinical Trials 2005; 2:89–90.
Part 2, 2005, of the Journal of Biopharmaceutical Statistics is a
special issue devoted to non-clinical statistical applications in
the pharmaceutical industry. A wide range of topics is covered,
spanning the whole development process of a drug from the
discovery phase to manufacturing.
* Journal of Biopharmaceutical Statistics, 15:193–373.
Phase I
The paper by Zhou reviews Bayesian decision-making procedures
for ‘first-into-man’ phase I dose-escalation studies with a
binary response. Such studies are mostly performed in oncology
where doses identified in preclinical trials are administered to
patients for whom other treatments have failed. The objective is
to find a dose that is high enough to be effective but which
at the same time avoids a risk of toxicity. In a simulation
study, the performance of the Bayesian approach is investigated
under six scenarios (‘very safe’ to ‘very toxic’) to compare the
impact of different choices of the number of dose levels and
cohort sizes.
* Zhou Y. Choosing the number of doses and the cohort size
for phase 1 dose-escalation studies. Drug Information
Journal 2005; 39:125–137.
While many papers have been published that deal with
the design and analysis of phase I trials in cancer,
comparatively little attention has been given to phase I studies
in other indications. The exception proves the rule: Kang and
Ahn investigate the statistical properties of a design that is
frequently used to determine the maximum tolerated dose
(MTD) in patients with Alzheimer’s disease. Performing phase I
dose-finding studies directly in patients is particularly
advisable in this indication, as it has been shown in the
past that the MTD in Alzheimer’s disease patients may be
substantially higher than the MTD determined in normal
volunteers.
* Kang S-H, Ahn C. The expected toxicity rate at the
maximum tolerated dose in bridging studies in Alzheimer’s
disease. Drug Information Journal 2005; 39:149–157.
Copyright © 2005 John Wiley & Sons, Ltd.
†E-mail: [email protected]
*Correspondence to: Simon Day, Medicines and Healthcare products Regulatory Agency, Room 13-205, Market Towers, 1 Nine Elms Lane, London SW8 5NQ, UK.
Phase II
One problem is to find the best combination of doses of two
agents – Wang and Ivanova take a slightly simpler problem
where fixed doses of one agent are available and the objective is
to find the best dose of the second agent so as to obtain a
specified toxicity profile. The methods used are described as
more efficient than running several studies at each of the
specified doses of the first agent:
* Wang K, Ivanova A. Two-dimensional dose finding in
discrete dose space. Biometrics 2005; 61:217–222.
Multiplicity
Control of error rates in trials with multiple outcomes,
comparisons, analyses, etc. is an important problem. Whilst
the term ‘gatekeeping’ is used in various contexts, multiplicity is
the meaning for Chen et al. They develop methods with better
power properties than others by considering the interrelationships
between the various primary and secondary significance tests.
* Chen X, Luo X, Capizzi T. The application of enhanced
parallel gatekeeping strategies. Statistics in Medicine 2005;
24:1385–1397.
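The gatekeeping idea itself can be sketched in a few lines. The following is a minimal Bonferroni-based parallel gatekeeping sketch for two families, not the enhanced strategies of Chen et al.: each primary hypothesis receives an equal share of alpha, and the alpha of rejected primary hypotheses is passed on to the secondary family (the p-values and levels below are illustrative assumptions).

```python
def parallel_gatekeeping(primary_p, secondary_p, alpha=0.05):
    """Minimal Bonferroni-based parallel gatekeeping for two families.

    Each primary hypothesis is tested at alpha / n_primary; the alpha of
    rejected primary hypotheses is released to the secondary family,
    which is again split by Bonferroni. If no primary hypothesis is
    rejected, the gate stays closed and no secondary test can reject.
    """
    share = alpha / len(primary_p)
    primary_rej = [p <= share for p in primary_p]
    released = share * sum(primary_rej)       # alpha carried to stage 2
    sec_share = released / len(secondary_p) if secondary_p else 0.0
    secondary_rej = [p <= sec_share for p in secondary_p]
    return primary_rej, secondary_rej

# One of two primary tests succeeds, releasing alpha/2 to the secondary family
primary, secondary = parallel_gatekeeping([0.010, 0.040], [0.010, 0.200])
```

The point of the construction is visible in the last line: a secondary endpoint can only be declared significant to the extent that the primary family has already released alpha.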
Spurrier considers the setting where the aim is to demonstrate
that at least one of several experimental treatments is
better than the best of several controls. The approach
can be viewed as an extension of Dunnett’s many-to-one
comparison procedure. Normal and distribution-free testing
methods are derived that maintain the experimentwise error
rate in the strong sense. By inversion of the tests, simultaneous
confidence intervals are obtained.
* Spurrier JD. Multiple comparisons with the best control in a
one-way layout. Communications in Statistics – Theory and
Methods 2005; 34:651–660.
Sample size calculation and recalculation
Have we not adequately covered estimating the variance from
ongoing trials (with a view to reassessing sample size)? Xing and
Ganju add to the literature on this:
* Xing B, Ganju J. A method to estimate the variance of
an endpoint from an on-going blinded trial. Statistics in
Medicine 2005; 24:1807–1814.
The following paper derives a sample size formula for studies
where the accuracy of different diagnostic tests is compared
and where multiple samples come from the same patient. The
approach takes into account the various sources of correlation
that are present in such settings, and simulations show that the
method works well.
* Liu A, Schisterman EF, Mazumdar M, Hu J. Power and
sample size calculation of comparative diagnostic accuracy
studies with multiple correlated test results. Biometrical
Journal 2005; 47:140–150.
Willan and Pinto take a very pragmatic approach and
consider the costs of trials and the ability to make practical
decisions, as well as the standard Type I and II error rates:
* Willan AR, Pinto EM. The value of information and
optimal clinical trial design. Statistics in Medicine 2005;
24:1791–1806.
Interim analyses
We start with a pessimistic view – interim analyses for futility.
Timing of interim analyses is often an administrative issue, but
Gould looks at the science and gives helpful recommendations.
Most notably, and helpfully, a good rule of thumb is not to
carry out interim futility analyses when you have less than
about 40% of the data from the trial. Before then, it is hardly
worth it:
* Gould AL. Timing of futility analyses for ‘proof of concept’
trials. Statistics in Medicine 2005; 24:1815–1835.
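A common metric computed at such a futility look is the conditional power under the current trend, expressed through the interim z-statistic Z(t) at information fraction t. The sketch below is a standard textbook formulation, not taken from Gould's paper, and the one-sided final significance level of 0.025 is an assumption:

```python
import math

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def conditional_power(t, z_interim, z_crit=1.959964):
    """Conditional power under the current trend at information fraction t.

    CP(t) = 1 - Phi((z_crit - Z(t)/sqrt(t)) / sqrt(1 - t)), where z_crit
    is the final critical value (one-sided alpha = 0.025 assumed here).
    """
    drift = z_interim / math.sqrt(t)   # effect estimated from the data so far
    return 1.0 - norm_cdf((z_crit - drift) / math.sqrt(1.0 - t))

# With 40% of the information and no observed effect, the trial looks futile
cp_null_trend = conditional_power(0.4, 0.0)
```

Early looks (small t) make the drift estimate Z(t)/sqrt(t) very noisy, which is one intuition behind waiting for a substantial information fraction before acting on futility.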
In the pharmaceutical industry, everybody seems to be
talking about acceleration of drug development. An attractive
concept to achieve this goal is the combination of phase II
and phase III aspects in a study with a two-stage design.
Conceptually, these designs randomize patients to experi-
mental treatments and a control in the first stage, select
promising treatments at the interim analysis based on the
information available then, and randomize the patients to this
subset of treatments in the second stage. Data from both stages
are used for final decision-making. High hopes are attached to
this appealing approach of integrating the aspects of learning
and confirming in a single trial, as reflected by the fact that
three papers on this topic have recently been published. The
proposals differ in the extent of flexibility they
provide, for example with respect to the selection rule
and options for design adaptations. The first two papers deal
with the situation that a ‘provisional’ short-term (surrogate)
endpoint and a long-term clinical endpoint are available.
This enables use of information from interim analyses even
if the duration of patient enrolment is short relative to the
follow-up period.
* Liu Q, Pledger GW. Phase 2 and 3 combination designs to accelerate drug development. Journal of the American Statistical Association 2005; 100:493–502.
* Todd S, Stallard N. A new clinical trial design combining phases 2 and 3: sequential designs with treatment selection and a change of endpoint. Drug Information Journal 2005; 39:109–118.
* Bischoff W, Miller F. Adaptive two-stage test procedures to
find the best treatment in clinical trials. Biometrika 2005;
92:213–227.
‘Overrunning’ in interim analyses is a particular problem
in cases of fast recruitment and a comparatively long
treatment phase: additional data may accumulate for some
time after the formal stopping rule has been reached. The
following paper proposes a method that deals with this
situation and that is tailored to avoid the awkward case of
conflicting decisions based on the interim data and the final
analysis set:
* Wust K, Kieser M. Monitoring continuous long-term outcomes in adaptive designs. Communications in Statistics – Simulation and Computation 2005; 34:321–341.
Chen proposes a strategy for interim analysis that can be
applied in studies with binary endpoints where an exact test is
used for the analysis. Due to the discreteness of the response,
there is generally an ‘unused’ type I error rate. Hence, this ‘free’
alpha can be spent in an interim analysis without any penalty for the
final test. This idea is illustrated for a single-arm study with the
Pearson–Clopper test and a two-arm study with Fisher’s exact
test.
* Chen C. A note on penalty-free interim analyses of clinical
studies with binary response. Biometrical Journal 2005;
47:194–198.
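The 'unused' type I error rate from a discrete test is easy to see numerically. The sketch below is not the construction in Chen's paper; the single-arm sample size, null response rate and nominal level are illustrative assumptions. It computes the exact size of a one-sided binomial test and the alpha left over:

```python
from math import comb

def binom_tail(k, n, p):
    # P(X >= k) for X ~ Binomial(n, p)
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k, n + 1))

n, p0, alpha = 20, 0.30, 0.05                 # illustrative single-arm setting
# smallest rejection threshold whose exact size stays below the nominal level
k_crit = next(k for k in range(n + 1) if binom_tail(k, n, p0) <= alpha)
actual_size = binom_tail(k_crit, n, p0)
unused_alpha = alpha - actual_size            # 'free' alpha for an interim look
```

Because the rejection region can only grow in whole outcomes, the exact size here is about 0.048 rather than 0.05; the difference is the 'free' alpha that Chen proposes to spend at an interim analysis.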
Data Analysis Issues
There are many situations where we are interested in the
outcome for only a subset of patients in a trial. Although
mortality is usually the most important endpoint, sometimes we
wish to know answers to questions such as ‘of those who
survived, which treatment gives best quality of life?’ (or some
other relevant endpoint). Of course, we lose the benefit of
randomized comparisons as soon as we condition on the
survival endpoint; Hayden et al. describe approaches to this
problem:
* Hayden D, Pauler DK, Schoenfeld D. An estimator for
treatment comparisons among survivors in randomized
trials. Biometrics 2005; 61:305–310.
Selection bias in trials is a subject that has been much ignored,
mostly because people do not think it can happen with
appropriate blinding and randomization; but it probably can.
If you need convincing, see Berger’s recent book Selection Bias
and Covariate Imbalances in Randomized Clinical Trials (Wiley,
2005); the following papers (as well as material in the book)
then discuss its consequences and solutions. In the first paper,
Berger challenges the view that the impact of selection bias is
negligible in randomized and blinded trials by quantifying the
extent of baseline imbalance that can result from its presence
when using the randomized blocks procedure. The article is
accompanied by enlightening discussion contributions by
Douglas Altman, Joachim Roehmel and Stephen Senn. The
second paper focuses on procedures for the detection of
selection bias and for an appropriate adjustment of treatment
group comparisons.
* Berger VW. Quantifying the magnitude of baseline covariate imbalances resulting from selection bias in randomized clinical trials (with discussion). Biometrical Journal 2005; 47:119–139.
* Ivanova A, Barrier RC, Berger VW. Adjusting for observable selection bias in block randomized trials. Statistics in Medicine 2005; 24:1537–1546.
In the following paper, ‘administrative’ data are taken to be
epidemiological- or demographic-type data (data from hospital
discharge summaries, for example). It is often easy and
inexpensive to get lots of these data so it is obvious to ask
how they might be combined with ‘clinical’ data (here taken
to mean the data from individuals in a study). Austin et al.’s
proposal is to do so via propensity scores.
* Austin PC, Mamdani MM, Stukel TA, Anderson GM,
Tu JV. The use of the propensity score for estimating
treatment effects: administrative versus clinical data.
Statistics in Medicine 2005; 24:1563–1578.
The next paper gives a helpful summary and comparison of
methods for calculating Poisson confidence intervals that have
been proposed in the literature:
* Byrne J, Kabaila P. Comparison of Poisson confidence
intervals. Communications in Statistics – Theory and
Methods 2005; 34:545–556.
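One interval commonly included in such comparisons is the exact interval obtained by inverting the Poisson tail probabilities. The sketch below is a minimal stdlib implementation by bisection, not taken from the paper; the search bracket is an assumption that is generous for small counts:

```python
import math

def pois_cdf(x, mu):
    # P(X <= x) for X ~ Poisson(mu)
    return math.exp(-mu) * sum(mu**i / math.factorial(i) for i in range(x + 1))

def exact_poisson_ci(x, alpha=0.05):
    """Exact (Garwood-type) CI for a Poisson mean from a single count x.

    The lower limit solves P(X >= x | mu) = alpha/2 and the upper limit
    solves P(X <= x | mu) = alpha/2; both are found by bisection.
    """
    hi = 10.0 * x + 10.0                      # assumed generous search bracket

    def solve(below):                         # root where below(mu) flips False
        lo_, hi_ = 0.0, hi
        for _ in range(200):
            mid = 0.5 * (lo_ + hi_)
            if below(mid):
                lo_ = mid
            else:
                hi_ = mid
        return 0.5 * (lo_ + hi_)

    lower = 0.0 if x == 0 else solve(lambda mu: 1.0 - pois_cdf(x - 1, mu) < alpha / 2)
    upper = solve(lambda mu: pois_cdf(x, mu) > alpha / 2)
    return lower, upper

lo, up = exact_poisson_ci(10)  # roughly (4.80, 18.39)
```

The exact interval is conservative by construction; much of the literature the paper surveys concerns shorter approximate intervals that trade a little coverage for length.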
Meta-Analysis
Meta-analysts fall into two groups – those analysing individual
patients’ data and those combining summary statistics
across studies. Some meta-analysts, of course, sit happily in
either camp. Pharmaceutical companies often have the
advantage of holding all the original raw data from all
the studies, and for them the paper by Tudur Smith et al. may
be of some interest. The problem described by Williamson and
Gamble may be one less often associated with meta-analyses
carried out by companies because it describes the case of only
having selected reporting (of the most positive results?) in
published papers.
* Tudur Smith C, Williamson PR, Marson AG. Investigating
heterogeneity in an individual patient data meta-analysis of
time to event outcomes. Statistics in Medicine 2005;
24:1307–1319.
* Williamson PR, Gamble C. Identification and impact of
outcome selection bias in meta-analysis. Statistics in
Medicine 2005; 24:1547–1561.
Pharmacovigilance
The assessment of the potential of a drug to delay ventricular
repolarization has gained increased attention through the
recently drafted ICH E14 guideline on this subject. The analysis
of the QT interval (the time required for ventricular depolarization
and repolarization) has become a necessary step in
safety evaluation. The QT interval is inversely related to the
heart rate. It is current practice to use one of the available
methods to correct the QT interval for the effects of the heart
rate and to base the assessment on the corrected QTc interval.
The paper by Wei and Chen gives an overview of the commonly
applied correction methods and points out the limitations of
these procedures. As an alternative, a model-based correction
approach is proposed and its application is illustrated with an
example.
* Wei GCG, Chen JYH. Model-based correction to the QT
interval for heart rate for assessing mean QT interval change
due to drug effect. Drug Information Journal 2005; 39:
139–148.
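The fixed-exponent corrections that the paper contrasts with its model-based alternative all follow one generic form. The sketch below shows the two most widely used choices, Bazett and Fridericia; these are standard formulas rather than anything taken from the paper, and the example QT and heart-rate values are illustrative:

```python
def qtc(qt_ms, heart_rate_bpm, exponent):
    """Fixed-exponent QT correction: QTc = QT / RR**exponent (RR in seconds)."""
    rr_s = 60.0 / heart_rate_bpm              # RR interval from heart rate
    return qt_ms / rr_s ** exponent

qt_ms, hr = 400.0, 75.0                       # illustrative QT (ms) and heart rate (bpm)
qtc_bazett = qtc(qt_ms, hr, 1.0 / 2.0)        # Bazett: exponent 1/2
qtc_fridericia = qtc(qt_ms, hr, 1.0 / 3.0)    # Fridericia: exponent 1/3
# at 60 bpm (RR = 1 s) no correction is applied and QTc equals QT
```

The limitation the paper points out is visible here: a single fixed exponent is assumed to hold for all subjects and all heart rates, which is what motivates a model-based correction fitted to the data.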
Regulatory issues
We put this next collection (a paper, commentaries and a
rejoinder) under the heading of ‘regulatory’ issues, although
it is as much to do with what constitutes reliable and convincing
evidence as anything else. It relates to the ‘two trials’ paradigm
(‘paradigm’ being the term the authors use). Shun et al. are
approaching the subject from a formal statistical point of view,
rather than a more general, less formal, consideration of the
value of reproducibility of evidence. Considerations of one, or
two, populations are central. A good two-way discussion with
sensible ‘plus’ points and ‘minus’ points ensues: Koch is broadly
supportive, Huque less so.
* Shun Z, Chi E, Durrleman S, Fisher L. Statistical consideration of the strategy for demonstrating clinical evidence of effectiveness – one larger vs two smaller pivotal studies. Statistics in Medicine 2005; 24:1619–1637.
* Koch GG, Huque MF. Commentaries on ‘Statistical consideration of the strategy for demonstrating clinical evidence of effectiveness – one larger vs two smaller pivotal studies’. Statistics in Medicine 2005; 24:1639–1651.
* Shun Z, Chi E, Durrleman S, Fisher L. Rejoinder: Statistical consideration of the strategy for demonstrating clinical evidence of effectiveness – one larger vs two smaller pivotal studies. Statistics in Medicine 2005; 24:1652–1656.
Cost-effectiveness
All multinational studies of costs (with or without effectiveness)
suffer from the problem that costs and cost structures differ so
widely between countries. You can pool data across countries
and risk estimating a cost that is applicable nowhere, or stratify
and lose the benefits of sample size from a bigger study. Pinto
et al. compromise and use shrinkage estimates that have both
benefits of being country-specific and of smaller variance than
those of simple stratified analyses. Perhaps compromising a
little on both facets may produce the ‘best’ (in some very
informal sense) estimate. Some of the algebra is a little heavy
going but worth the trouble if you work in this area. Even if you
do not, there is a long list of interesting and useful references,
some of which may be worth following up.
* Pinto EM, Willan AR, O’Brien BJ. Cost–effectiveness
analysis for multinational clinical trials. Statistics in
Medicine 2005; 24:1965–1982.
Miscellaneous
Pharmacogenetics is (apparently) the brave new world for
clinical trialists. Some statisticians/trialists are already involved
– others are sitting on the sidelines. An interesting broad
discussion of some of the issues might help bring some of us up
to speed:
* Kelly PJ, Stallard N, Whittaker JC. Statistical design and
analysis of pharmacogenetic trials. Statistics in Medicine
2005; 24:1495–1508.
Some people advocate the use of Microsoft Excel for
statistical analyses, and courses on this subject are offered.
McCullough and Wilson investigated Excel 2003 with respect to
the accuracy of some statistical distributions, estimation
(summary statistics, ANOVA, linear and nonlinear regression)
and random number generation. They compared the results
with those obtained from previous versions. In all three areas
they found that ‘the performance of Excel ... is still inadequate’.
Potential users would do well to take a look at the paper before
using the statistical procedures in the software.
* McCullough BD, Wilson B. On the accuracy of statistical
procedures in Microsoft Excel 2003. Computational Statis-
tics & Data Analysis 2005; 49:1244–1252.
Finally, presidential addresses usually contain interesting
material that is not always as technically demanding on the
reader as many other papers. Where is our profession going?
Are we losing bits? Are we gaining bits? Geert Molenberghs, the
outgoing president of the International Biometric Society, gives
his views:
* Molenberghs G. Biometry, biometrics, bioinformatics, ..., bio-X. Biometrics 2005; 61:1–9.