Robert O’Neill, Ph.D., Director, Office of Biostatistics, CDER, FDA


An Update on Statistical Issues Associated with the International Harmonization of Technical Standards for Clinical Trials (ICH)
22nd Spring Symposium, New Jersey Chapter of ASA, Wed., June 6, 2001

Outline of talk
International harmonization of technical standards: efficacy, safety, quality
Statistics - where does it fit in
Resources - who are the people and what are the processes
A focus on a few ICH guidances of interest
A few issues of particular statistical concern
The future - where do we go from here

Harmonization of technical standards
ICH (Europe, Japan, United States)
  Began in 1989; ICH 1 in Brussels 1991
  ICH continues today
Outside of ICH
  APEC - Bridging study initiative, Taipei meeting
  Canada, observers, WHO

Statistical Resources in the ICH regions
United States
  CDER, CBER
Europe
  U.K., Germany, Sweden
  CPMP
Japan
  MHW; advisors, university
China, Taiwan, Canada, Korea

Web addresses for information and guidances
www.fda.gov/cder/guidance/index.htm
www.ifpma.org/ich1
www.emea.eu.int/

ICH Guidances with statistical content
E1: Extent of population exposure to assess clinical safety
E3: Structure and content of clinical study reports (CONSORT statement)
E4: Dose-response information to support drug registration
E5: Ethnic factors in the acceptability of foreign clinical data
E9: Statistical principles for clinical trials
E10: Choice of control group
E11: Clinical investigation of medicinal products in the pediatric population

ICH Guidances with statistical content

Safety
  Carcinogenicity
Quality
  Stability (expiration dating): Q1A, Q1E

New initiatives from the European Regulators (CPMP) - Points to Consider Documents
On Validity and Interpretation of Meta-Analyses, and One Pivotal Study (Jan. 2001)
On Missing Data (April 2001)
On Choice of delta
On switching between superiority and non-inferiority
On some multiplicity issues and related topics in clinical trials

Efficacy Working Party (EWP) Points to Consider
CPMP/EWP/1776/99 Points to Consider on Missing Data (released for consultation January 2001)
CPMP/EWP/2330/99 Points to Consider on Validity and Interpretation of Meta-Analyses, and One Pivotal Study (released for consultation October 2000)
CPMP/EWP/482/99 Points to Consider on Switching between Superiority and Non-inferiority (adopted July 2000)

ICH E9 Statistical Principles for Clinical Trials: Contents
Introduction (purpose, scope, direction)
Considerations for Overall Clinical Development
Study Design Considerations
Study Conduct
Data Analysis
Evaluation of safety and tolerability
Reporting
Glossary of terms

Study Design: A Major Focus of the Guideline
Prior planning
Protocol considerations

Prospective Planning
Design of the trial
Analysis of outcomes

Confirmatory Study vs. Exploratory Study
Confirmatory: a hypothesis stated in advance and evaluated
Exploratory: data-driven findings

Design Issues
Endpoints
Comparisons
Choice of study type
Choice of control group
Superiority
Non-inferiority
Equivalence
Sample size
Assumptions, sensitivity analysis

Choice of Study Type
Parallel group design
Cross-over design
Factorial design
Multicenter design

Analysis: Outcome Assessment
Multiple endpoints
Adjustments

Assessing Bias and Robustness of Study Results
Analysis sets

Analysis Sets
ITT principle
All randomized population
Full Analysis population
Per Protocol

Data Analysis Considerations
Prespecification of the Analysis
Analysis sets
  Full analysis set
  Per Protocol Set
  Roles of the Different Analysis Sets
Missing Values and Outliers

Statistical Analysis Plan (SAP)
A more technical and detailed elaboration of the principal features stated in the protocol.
Detailed procedures for executing the statistical analysis of the primary and secondary variables and other data.
Should be reviewed and possibly updated during blind review, and finalized before breaking the blind.
Results from analyses envisaged in the protocol (including amendments) regarded as confirmatory.
May be written as a separate document.

Analysis Sets
The ideal: the set of subjects whose data are to be included in the analysis:
  all subjects randomized into the trial
  satisfied entry criteria
  followed all trial procedures perfectly
  no loss to follow-up
  complete data records

Full Analysis Set
Used to describe the analysis set which is as complete as possible and as close as possible to the intention-to-treat principle.
May be reasonable to eliminate from the set of ALL randomized subjects those who fail to take at least one dose, or those without data post randomization.
Reasons for eliminating any randomized subject should be justified, and the analysis is not complete unless the potential biases arising from exclusions are addressed and reasonably dismissed.

Per Protocol Set
Sometimes described as: valid cases, efficacy sample, evaluable subjects
Defines a subset of the subjects in the full analysis set
May maximize the opportunity for a new treatment to show additional efficacy
May or may not be conservative
Bias arises from adherence to protocol related to treatment and/or outcome
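The two analysis-set definitions can be sketched in code. This is a minimal illustration only: the `Subject` fields, the `major_violation` flag, and the sample data are hypothetical, and a real trial would define exclusions in the protocol or SAP before unblinding.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    subject_id: str
    doses_taken: int              # doses of study drug actually taken
    has_post_baseline_data: bool  # any data recorded after randomization
    major_violation: bool         # hypothetical flag: major protocol violation


def full_analysis_set(randomized):
    """All randomized subjects, minus those with no dose or no post-randomization data."""
    return [s for s in randomized
            if s.doses_taken >= 1 and s.has_post_baseline_data]


def per_protocol_set(randomized):
    """Subset of the full analysis set that adhered to the protocol."""
    return [s for s in full_analysis_set(randomized) if not s.major_violation]


# Hypothetical randomized subjects (illustrative data only).
randomized = [
    Subject("001", 10, True, False),
    Subject("002", 0, False, False),   # never dosed
    Subject("003", 8, True, True),     # dosed, but major violation
    Subject("004", 3, False, False),   # no post-randomization data
]

fas_ids = [s.subject_id for s in full_analysis_set(randomized)]
pp_ids = [s.subject_id for s in per_protocol_set(randomized)]
print("Full analysis set:", fas_ids)
print("Per protocol set:", pp_ids)
```

Running the primary analysis on both sets and comparing the results is one simple way to check sensitivity to the choice of analysis set.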

Roles of the Different Analysis Sets
Advantageous to demonstrate a lack of sensitivity of the principal trial results to alternative choices of the set of subjects analyzed.
The full analysis set and per protocol set play different roles in superiority trials, and in equivalence or non-inferiority trials.
Full analysis set is primary analysis in superiority trials - avoids optimistic efficacy estimate from per protocol, which excludes non-compliers.
Full analysis set not always conservative in equivalence trials.

Impact on Drug Development
On sponsor design and analysis of clinical trials used as evidence to support claims
On regulatory advice and evaluation of sponsor protocols and completed clinical trials
On maximizing quality and utility of clinical studies in later phases of drug development
On multidisciplinary understanding of key concepts and issues
Enhanced attention to planning and protocol considerations

Will the Guideline Help to Avoid Problem Areas in the Future - Maybe!
Not a substitute for professional advice - will require professional understanding and implementation of the principles stated
Will not assure correct analysis and interpretation
Most of the guideline topics reflect areas where problems have been observed frequently in clinical trials in drug development

ICH: Chemistry
Q1E: Bracketing and Matrixing Designs for Stability Testing of Drug Substances and Drug Products
Considerable new work, including extensive simulations to evaluate size of studies and the ability to detect important changes to expiration date setting (incomplete blocks, alias, etc.).

ICH E10: Choice of Control Group and Related Design Issues in Clinical Trials
Section 1.5 is very statistically oriented, involving issues like:
  Assay sensitivity
  Historical evidence of sensitivity to drug effects
  Choice of a margin for a non-inferiority (don't show a difference) trial

Assay Sensitivity in Non-inferiority designs
Assay sensitivity is a property of a clinical trial defined as the ability to distinguish an effective treatment from a less effective or ineffective treatment.
Note that this property is more than just the statistical power of a study to demonstrate an effect - it also deals with the conduct and circumstances of a trial.

The presence of assay sensitivity in a non-inferiority trial may be deduced from two determinations
1) Historical evidence of sensitivity to drug effects, i.e., that similarly designed trials in the past regularly distinguished effective treatments from less effective or ineffective treatments, and
2) Appropriate trial conduct, i.e., that the conduct of the (current) trial did not undermine its ability to distinguish effective treatments from less effective or ineffective treatments. [Can be fully evaluated only after the active control non-inferiority trial is completed.]

Successful use of a non-inferiority trial thus involves four critical steps
1) Determining that historical evidence of sensitivity to drug effect exists. Without this determination, demonstration of efficacy from a showing of non-inferiority is not possible and should not be attempted.
2) Designing a trial. Important details of the trial design, e.g. study population, concomitant therapy, endpoints, run-in periods, should adhere closely to the design of the placebo-controlled trials for which historical sensitivity to drug effects has been determined.

Successful use of a non-inferiority trial thus involves four critical steps (cont.)
3) Setting a margin. An acceptable non-inferiority margin should be defined, taking into account the historical data and relevant clinical and statistical considerations.
4) Conducting the trial. The trial conduct should also adhere closely to that of the historical trials and should be of high quality.

Choosing the Non-inferiority margin
Prior to the trial, a non-inferiority margin, sometimes called a delta, is selected.
This margin is the degree of inferiority of the test treatment to the control that the trial will attempt to exclude statistically.
The margin chosen cannot be greater than the smallest effect size that the active drug would be reliably expected to have compared with placebo in the setting of the planned trial. [Based on both statistical reasoning and clinical judgement; should reflect uncertainties in evidence and be suitably conservative.]

Outline of the Issues
What is the non-inferiority design
What are the various objectives of the design
Complexities in choosing the margin of treatment effect - it depends upon the strength of evidence for the treatment effect of the active control
Literature on historical controls, and on the heterogeneity of treatment effects among studies
The statistical approaches to each objective, and their critical assumptions
Cautions and concluding remarks

Non-Inferiority Design
A study design used to show that a new treatment produces a therapeutic response that is no less than a pre-specified amount of a proven treatment (active control), from which it is then inferred that the new treatment is effective. The new treatment could be similar to or more effective than the existing proven treatment.
A non-inferiority margin is pre-selected as the allowable reduction in therapeutic response. The margin is chosen based on the historical evidence of the efficacy of the active control and other clinical and statistical considerations relevant to the new treatment and the current study.
ICH E10: This delta cannot be greater than the smallest effect size that the active drug would be reliably expected to have compared with placebo in the setting of a planned trial.
  - the concept of reliably and repeatedly being able to demonstrate a treatment effect of a specified size!

Non-Inferiority Design (cont.)
A test treatment is declared clinically non-inferior to the active control if:
  the trial has the necessary assay sensitivity for the trial to be valid for non-inferiority testing
  the one-sided 97.5% confidence interval is entirely to the right of -Δ
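The confidence-interval criterion can be sketched as follows. This is a minimal illustration under stated assumptions: the treatment difference is taken as test minus control with larger values favoring the test treatment, its estimate is treated as approximately normal, and the function name and example numbers are hypothetical. It covers only the interval criterion; assay sensitivity has to be judged separately.

```python
from statistics import NormalDist


def noninferiority_decision(diff, se, delta, alpha=0.025):
    """Classify a trial result from the lower one-sided confidence bound.

    diff  : estimated treatment difference (assumed: test minus control)
    se    : standard error of that estimate
    delta : pre-specified non-inferiority margin (a positive number)
    """
    z = NormalDist().inv_cdf(1 - alpha)  # about 1.96 for alpha = 0.025
    lower = diff - z * se                # one-sided 97.5% lower bound
    if lower > 0:
        verdict = "non-inferiority shown; superiority could be claimed"
    elif lower > -delta:
        verdict = "non-inferiority shown"
    else:
        verdict = "non-inferiority not shown"
    return lower, verdict


# Hypothetical results with margin delta = 0.05 and standard error 0.02.
for diff in (0.00, -0.03, 0.06):
    lower, verdict = noninferiority_decision(diff, se=0.02, delta=0.05)
    print(f"diff={diff:+.2f}  lower bound={lower:+.4f}  {verdict}")
```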

Inference for Non-Inferiority
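[Figure: delta limits and 95% confidence intervals for the treatment difference, shown relative to 0 and -Δ, with regions labeled "Control Better" and "Test Agent Better". Example intervals are labeled: non-inferiority shown; non-inferiority shown; non-inferiority not shown; non-inferiority shown / superiority could be claimed.]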

What are the various objectives of the non-inferiority design
To prove efficacy of test treatment by indirect inference from the active control treatment
To establish a similarity of effect to a known very effective therapy - e.g. anti-infectives
To infer that the test treatment would have been superior to an imputed placebo, i.e. had a placebo group been included for comparison in the current trial - a new and controversial area - choice of margin is the key

What is the Evidence supporting the treatment effect of the active control, and how convincing is it?
Large treatment effects vs. small or modest effects
  Large treatment effects - anti-infectives
  Modest treatment effects - difficulties in reliably demonstrating the effect - sensitivity to drug effects
Amount of prior study data available to estimate an effect
  One single study
  Several studies, of different sizes and quality
  No estimate or study directly on the comparator - standard of care

How is the margin chosen based upon prior study data
For a large treatment effect, it is easier - a clinical decision of how similar a response rate is needed to justify efficacy of a test treatment, e.g. anti-infectives.
For modest and variable effects, it is more difficult, and some approaches suggest margin selection based upon several objectives.

Complexities in choosing the margin (how much of the control treatment effect to give up)
Margins can be chosen depending upon which of these questions is addressed:
  how much of the treatment effect of the comparator can be preserved in order to indirectly conclude the test treatment is effective - a clinical decision for very large effects; a statistical problem for small and modest effects
  how much of a treatment effect would one require for the test treatment to be superior to placebo, had a placebo been used in the current active control study - a lesser standard than the above
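One way to make the "how much of the effect to preserve" question concrete is sketched below. It is an illustration only, not a method stated on the slides: it assumes the historical control-versus-placebo effect is summarized by an estimate and standard error, takes the lower two-sided 95% confidence bound as a conservative effect size, and sets the margin to the share of that bound one is willing to give up. The function names, the `preserve` parameter, and the numbers are hypothetical.

```python
from statistics import NormalDist


def conservative_control_effect(effect, se, conf=0.95):
    """Lower two-sided confidence bound of the historical control-vs-placebo effect."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return effect - z * se


def margin_preserving(effect, se, preserve=0.5, conf=0.95):
    """Margin that gives up at most (1 - preserve) of the conservative effect.

    Raises ValueError when the historical data do not exclude a zero effect,
    in which case no margin can be justified this way.
    """
    m1 = conservative_control_effect(effect, se, conf)
    if m1 <= 0:
        raise ValueError("historical data do not reliably exclude a zero effect")
    return (1 - preserve) * m1


# Hypothetical historical summary: effect 0.10 with standard error 0.02.
m1 = conservative_control_effect(0.10, 0.02)
margin = margin_preserving(0.10, 0.02, preserve=0.5)
print(f"conservative effect = {m1:.4f}, margin preserving 50% = {margin:.4f}")
```

A margin built this way never exceeds the conservative effect estimate, which is in the spirit of the E10 constraint quoted earlier, but it still rests on the assumption that the historical effect carries over to the current trial.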

How convincing is the prior evidence of a treatment effect?
Do clinical trials of the comparator treatment consistently and reliably demonstrate a treatment effect - when they do not, what is the reason?
  Study is too small to detect the effect - underpowered for a modest effect size
  The treatment effect is variable, and the estimate of the magnitude will vary from study to study, sometimes with NO effect in a given study - a BIG problem for active controlled studies (sensitivity to drug effect)
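The "underpowered for a modest effect size" explanation can be checked with a rough power calculation. The sketch below uses the usual normal approximation for comparing two proportions with equal arm sizes; the response rates and sample sizes are hypothetical, and the small contribution of the opposite tail is ignored.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_active, p_placebo, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions."""
    nd = NormalDist()
    se = sqrt(p_active * (1 - p_active) / n_per_arm
              + p_placebo * (1 - p_placebo) / n_per_arm)
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(abs(p_active - p_placebo) / se - z_crit)


# Hypothetical modest effect: 30% vs. 25% response.
power_small = power_two_proportions(0.30, 0.25, n_per_arm=200)
power_large = power_two_proportions(0.30, 0.25, n_per_arm=2000)
print(f"n=200 per arm:  power = {power_small:.2f}")
print(f"n=2000 per arm: power = {power_large:.2f}")
```

With a true effect of this size, most trials of 200 per arm would fail to show a difference even though the treatment works, which is one reason a run of negative historical trials is hard to interpret.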

How do you know which treatment effect size is appropriate for the current active control?

How much protection should be built into the choice of the margin to account for unknown bias and uncertainty in study differences?

Inherently, the answer relies upon historical controls and their applicability to the current study
Choice of the margins should take into account all sources of variability as well as the potential biases associated with non-comparability of the current study with the historical comparisons.
A need to balance the building in of bias in the comparison and quantifying the amount of treatment effect preserved, as a function of the relative amount of data from the historical studies and the current study.

Use of historical controls in current RCTs
Pocock, S. The combination of randomized and historical controls in clinical trials. J. Chronic Diseases 1976, 29, pp. 175-188
Lists 6 conditions to be met for valid use of historical controls with controls in current trial
Only if all these conditions are met can one safely use the historical controls as part of a randomized trial. Otherwise, the risk of a substantial bias occurring in treatment comparisons cannot be ignored.

Importance of the assumption of constancy of the active control treatment effect derived from historical studies
It is relevant to the design and sample size of the current study, to the choice of the margin, to the amount of bias built into the comparisons, to the amount of effect size one can preserve (both of these are likely confounded), and to the statistical uncertainty of the conclusion.
Before one can decide on how much of the effect to preserve, one should estimate an effect size for which there is evidence of a consistent demonstration that the effect size exists.

Explaining Heterogeneity among independent studies: Lessons from meta-analyses
Variation in baseline risk as an explanation of heterogeneity in meta-analysis. S.D. Walter, Stat. in Medicine, 16, 2883-2900 (1997)
An empirical study of the effect of the control rate as a predictor of treatment efficacy in meta-analysis of clinical trials. Schmid, Lau, McIntosh and Cappelleri, Stat. in Medicine, 17, 1923-1942 (1998)

Explaining Heterogeneity among independent studies: Lessons from meta-analyses (cont.)
Explaining heterogeneity in meta-analysis: a comparison of methods. Thompson and Sharp, Stat. in Medicine, 18, 2693-2708 (1999)
Assessing the potential for bias in meta-analysis due to selective reporting of subgroup analyses within studies. Hahn, Williamson, Hutton, Garner and Flynn, Stat. in Medicine, 19, 3325-3336 (2000)

Explaining Heterogeneity among independent studies: Lessons from meta-analyses (cont.)
Large trials vs. meta-analysis of smaller trials - How do their results compare? Cappelleri, Ioannidis, Schmid, de Ferranti, Aubert, Chalmers, Lau. JAMA, 16, 1332-1338 (1996)
Discordance between meta-analysis and large-scale randomized controlled trials: examples from the management of acute myocardial infarction. Borzak and Ridker, Ann. Internal Med., 123, 873-877 (1995)
Discrepancies between meta-analysis and subsequent large randomized controlled trials. LeLorier, Gregoire, Benhaddad, Lapierre, Derderian. NEJM, 337, 536-42 (1997)

Use of meta-analysis - necessary but not sufficient
Distinguish underpowered studies from well powered studies for a common effect size - if possible
How many trials are consistent with no effect, rather than an effect of some size
Determine between-trial variability as an additional factor to consider in choosing a conservative margin
How do you know if the current study comes from the same trial population, and where does it rest in the trial distribution - critical to assumptions for control group rate and constancy of treatment effect
Resorting to meta-analysis of all studies, when few individual studies reject the null, tells you something!
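Between-trial variability can be estimated with a random-effects meta-analysis. The sketch below uses the DerSimonian-Laird moment estimator, which is one common choice and is not named on the slides; the study effects and variances are hypothetical.

```python
from math import sqrt


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau^2.

    effects   : per-study effect estimates (e.g. log odds ratios)
    variances : per-study within-study variances
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-trial variance
    w_re = [1.0 / (v + tau2) for v in variances]      # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = sqrt(1.0 / sum(w_re))
    return pooled, se, tau2


# Two hypothetical studies that disagree: one shows no effect, one a large effect.
pooled, se, tau2 = dersimonian_laird([0.0, 0.5], [0.01, 0.01])
print(f"pooled = {pooled:.3f}, se = {se:.3f}, tau^2 = {tau2:.3f}")
```

A large tau^2 widens the uncertainty around the pooled effect, which argues for a more conservative margin than the pooled estimate alone would suggest.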

Three approaches to the problem
Indirect confidence interval comparisons (ICIC) (CBER/FDA type method, etc.) - thrombolytic agents in the treatment of acute MI
Virtual method (Hasselblad & Kong, Fisher, etc.) - Clopidogrel, aspirin, placebo
Bayesian approach (Gould, Simon, etc.) - treatment of unstable angina and non-Q wave MI
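The general idea of comparing a test treatment with an imputed placebo can be sketched as below. This is a simplified illustration, not a reproduction of any of the methods named above: it assumes effects are on an additive scale (for example, log odds ratios), that the current and historical estimates are independent, and that the historical control effect still holds in the current trial (constancy). The function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def imputed_placebo_comparison(test_vs_control, se_tc,
                               control_vs_placebo, se_cp, conf=0.95):
    """Indirect estimate of test vs. placebo with a two-sided confidence interval.

    Positive values are assumed to favor the first-named treatment.
    """
    est = test_vs_control + control_vs_placebo   # effects add on this scale
    se = sqrt(se_tc ** 2 + se_cp ** 2)           # variances add if independent
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return est, est - z * se, est + z * se


# Hypothetical inputs: test slightly worse than control in the current trial,
# control clearly better than placebo in historical trials.
est, lower, upper = imputed_placebo_comparison(-0.05, 0.10, 0.40, 0.10)
print(f"test vs. imputed placebo: {est:.3f} ({lower:.3f}, {upper:.3f})")
```

If the lower limit is above zero, the test treatment would be inferred to beat the imputed placebo, but only to the extent that the constancy assumption is believed.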

When may it not be possible to estimate a margin or to use the non-inferiority design to infer efficacy?
There is a known creep in the standard of care over time and/or the active control treatment, which renders any past estimates of active control treatment effects not comparable or valid for the current comparison, under conditions of medical practice in the new current study
  e.g. use of surfactants in neonatal treatment

ICH E5

Ethnic Factors in the Acceptability of Foreign Clinical Data

Key Features of E5
Operational definition of ethnic factors
Clinical Data Package Fulfilling Regulatory Requirements in New Region
Extrapolation of Foreign Clinical Data to New Region (role of ethnic factors)
Bridging Studies
Global Development Strategies

Ethnic Factor Definition
Intrinsic factors: characteristics associated with the drug recipient (ADME studies)
  race, age, gender, organ dysfunction, genetic polymorphism
Extrinsic factors: characteristics associated with the environment and culture in which one lives (clinical outcomes)
  clinical trial conduct, diet, tobacco and alcohol use, compliance with prescribed medications

Assessing a medicine's sensitivity to ethnic factors (part of the screening process)
Properties of a compound making it more likely to be sensitive:
  Metabolism by enzymes known to show genetic polymorphism
  High likelihood of use in a setting of multiple co-medications

Assessment of the Clinical Data Package (CDP) for acceptability
Question 1: Meets regulatory requirements - yes/no
Question 2: Extrapolation of foreign data appropriate - yes/no
Question 3: Further clinical study(ies) needed for acceptability by the new region - yes/no
Question 4: Acceptability in the new region - yes/no

Meets regulatory requirements
Issues of evidence
  Confirmatory evidence; two or more studies showing treatment effects
  Interpreting results of foreign clinical trials which provide that evidence (may be one study, or all studies, or part of a study)
Which study designs provide evidence
  Active control / non-inferiority designs
  Placebo or active control / show a difference designs

The sources of data for an application (implementation)
All clinical studies for efficacy performed in foreign region
One study in the United States, one or more foreign clinical studies
Multi-center / multi-region clinical trials form the basis for efficacy

Considerations for evaluating clinical efficacy between regions
Study design differences
Magnitude of treatment effect sizes
Effect size variability; subgroup differences
Impact of intrinsic factors - determined when?
Impact of extrinsic factors
  trial conduct and monitoring
  usage of concomitant medications
  protocol adherence

Bridging Studies
When
Why
What type
E5 is purposely vague on how to do this or what their design should be

Study design and study objectives (need examples and experience)
What type of bridging study would be helpful for extrapolation:
  PK/PD
  Another clinical trial of the primary clinical endpoint
  Equivalence/non-inferiority: treatment effect acceptably close - margin or delta
  Dose response study
  Superiority design - estimate treatment effect size for comparison

E5 allows for a new study in the new region - why is that needed?
When all the clinical data is derived from a foreign region and extrapolation is an issue
When the experience with clinical trials in that region is minimal
When there is concern with ability to confirm a finding from a study(ies)
A confirmatory clinical trial is the bridging study

Developmental Strategies for Global Development
Early vs. later strategies
Designing population PK/PD into clinical studies
Planning to explain effect size differences among regions
Design of bridging studies early in development

Study Design
Better planning in Phase I, II, III and more efficient study designs to address several subgroup questions simultaneously
Design Phase III with some knowledge of PK/PD differences in Phase I/II
Address multiple questions simultaneously for efficiency (age, gender, ethnic)

Study Design
Assess the influence of ethnic factors in each study phase (I, II, III), so as to identify it earlier and account for it by design
Ethnic factors as another subgroup
  Age, gender, renal status, etc.
Ethnic factors integrated with
  Dose response
  Geriatrics
  Population exposure for safety

Remarks
Little experience at this time with bridging studies
Little experience with Japanese trials in NDA applications, or trials from Asia
More experience with foreign trials from Europe - possible heterogeneity of treatment effects being evaluated; concern for experience in new regions like Eastern Europe

The future
Appears to be increasingly dependent on statistical input, methods, study design, interpretation, etc.
Statistical resources (people) are needed in the regulatory agencies in all countries/regions serious about inference - not always present or maintained - cannot develop guidance documents and consensus positions without this, nor rely on guidances alone
Global drug development is beginning to recognize the need for early planning for multi-regional inference - the questions and study designs are just unfolding