

PHARMACEUTICAL STATISTICS
Pharmaceut. Statist. 9: 288–297 (2010)
Published online 20 October 2009 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/pst.391

Missing data: Discussion points from the PSI missing data expert group

Tomasz Burzykowski 1, James Carpenter 2, Corneel Coens 3, Daniel Evans 4, Lesley France 5, Mike Kenward 2, Peter Lane 6, James Matcham 7, David Morgan 8, Alan Phillips 9,*,†, James Roger 2,10, Brian Sullivan 11, Ian White 12, and Ly-Mee Yu 13 of the PSI Missing Data Expert Group

1 MSOURCE Medical Development, Warszawa, Poland
2 Medical Statistics Unit, London School of Hygiene & Tropical Medicine, London, UK
3 EORTC, European Organisation for Research and Treatment of Cancer, AISBL-IVZW, Belgium, UK
4 Pfizer Sandwich Laboratories, Kent, UK
5 AstraZeneca, Parklands, Alderley Park, Macclesfield, Cheshire, UK
6 GlaxoSmithKline, Marlow, UK
7 Amgen Ltd, Cambridge, UK
8 Ipsen, Clinical Development Data Sciences, Berkshire, UK
9 ICON Clinical Research, Buckinghamshire, UK
10 GlaxoSmithKline Research and Development Ltd., Middlesex, UK
11 Statistical Solutions Ltd., Cork, Ireland
12 MRC Biostatistics Unit, Institute of Public Health, Cambridge, UK
13 Centre for Statistics in Medicine, University of Oxford, Oxford, UK

*Correspondence to: Alan Phillips, ICON Clinical Research, 2 Globeside, Globeside Business Park, Marlow, Buckinghamshire, SL7 1HZ, UK.
†E-mail: [email protected]

Copyright © 2010 John Wiley & Sons, Ltd.

The Points to Consider Document on Missing Data was adopted by the Committee of Health and Medicinal Products (CHMP) in December 2001. In September 2007 the CHMP issued a recommendation to review the document, with particular emphasis on summarizing and critically appraising the pattern of drop-outs, explaining the role and limitations of the ‘last observation carried forward’ method and describing the CHMP’s cautionary stance on the use of mixed models.

In preparation for the release of the updated guidance document, statisticians in the Pharmaceutical Industry held a one-day expert group meeting in September 2008. Topics that were debated included minimizing the extent of missing data and understanding the missing data mechanism, defining the principles for handling missing data and understanding the assumptions underlying different analysis methods.

A clear message from the meeting was that at present, biostatisticians tend only to react to missing data. Limited pro-active planning is undertaken when designing clinical trials. Missing data mechanisms for a trial need to be considered during the planning phase and the impact on the objectives assessed. Another area for improvement is in the understanding of the pattern of missing data observed during a trial and thus the missing data mechanism via the plotting of data; for example, use of Kaplan–Meier curves looking at time to withdrawal. Copyright © 2009 John Wiley & Sons, Ltd.

Keywords: missing data; LOCF; MMRM; multiple imputation

1. BACKGROUND

The Points to Consider Document on Missing Data was adopted by the Committee of Health and Medicinal Products (CHMP) in December 2001 [1]. In September 2007 the CHMP issued a recommendation to review the document [2], with particular emphasis on the following:

1. Summarizing and critically appraising the pattern of drop-outs.

2. Use of sensitivity analysis or the justification for its absence.

3. Explaining the role and limitations of the ‘last observation carried forward’ (LOCF) method.

4. Describing the CHMP’s cautionary stance on the use of mixed models.

In preparation for the release of the updated guidance document, PSI (Statisticians in the Pharmaceutical Industry), a professional association of statisticians in the Pharmaceutical Industry, held a one-day expert group meeting in September 2008. A list of the meeting attendees and affiliations is given in Appendix. Topics that were debated included the following:

1. Minimizing the extent of missing data and understanding the missing data mechanism.

2. Defining the principles for handling missing data.

3. Understanding the assumptions underlying different analysis methods.

The statistical techniques developed for handling missing data usually assume that the missing data mechanism can be one of the following:

1. Missing completely at random (MCAR).

2. Missing at random (MAR).

3. Missing not at random (MNAR).

Definitions for each of these terms are provided in Table I.
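The practical difference between the three mechanisms can be illustrated with a small simulation. The sketch below is not taken from the expert group's discussion; it assumes a hypothetical two-visit trial in which the first response (y1) is always observed and the second (y2, true mean 0) may be missing, and the dropout probabilities are arbitrary choices made for illustration only.

```python
import math
import random
import statistics


def simulate_mechanisms(n=20000, seed=1):
    """Mean of the observed second-visit responses under MCAR, MAR and MNAR.

    Hypothetical set-up: y1 is always observed, y2 has true mean 0 and
    correlation 0.6 with y1. Only the dropout rule differs by mechanism.
    """
    rng = random.Random(seed)
    observed = {"MCAR": [], "MAR": [], "MNAR": []}
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = 0.6 * y1 + 0.8 * rng.gauss(0.0, 1.0)

        # MCAR: dropout probability is constant, unrelated to any response.
        p_mcar = 0.3
        # MAR: dropout depends only on the observed earlier response y1.
        p_mar = 1.0 / (1.0 + math.exp(-(y1 - 0.85)))
        # MNAR: dropout depends on the unobserved response y2 itself.
        p_mnar = 1.0 / (1.0 + math.exp(-(y2 - 0.85)))

        for name, p in (("MCAR", p_mcar), ("MAR", p_mar), ("MNAR", p_mnar)):
            if rng.random() >= p:  # the subject remains in the study
                observed[name].append(y2)
    return {name: statistics.fmean(vals) for name, vals in observed.items()}


if __name__ == "__main__":
    for mechanism, mean in simulate_mechanisms().items():
        print(mechanism, round(mean, 3))
```

In this set-up the simple mean of the observed values stays close to the true value of 0 under MCAR but is pulled downwards under both MAR and MNAR. Under MAR the bias could be corrected by an analysis that conditions on y1; under MNAR it could not, because the cause of dropout is never observed.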

The remainder of this paper summarizes the questions raised, resulting discussions and consensuses reached. It should be noted that some of the issues were not discussed at the meeting. However, the meeting acted as the catalyst. Consequently, the issues debated after the September 2008 meeting, and the subsequent outcome have also been included in this paper. After a brief review of the issues associated with each topic, the major questions raised are listed, immediately followed by a summary of the discussion and agreements. The context of the discussion is largely that of longitudinal clinical trials with dropouts or withdrawals. However, many of the points raised are applicable to other situations.

2. MINIMIZING MISSING DATA AND UNDERSTANDING THE MISSING DATA MECHANISM

The CHMP Points to Consider document on Missing Data states in Section 3: ‘In the design and conduct of a clinical trial all efforts should be directed towards minimizing the amount of missing data likely to occur’. The expert group discussed what proactive steps could be undertaken by trialists to minimize the amount of missing data in clinical trials, and how best to understand the pattern of missing data observed during a trial and thus the missing data mechanism.

In the context of this paper, the term “pattern” is used to cover a multitude of different issues relating to missing data including, for example,

1. Timing of discontinuations.

2. Differential timing of discontinuation by treatment group.

3. Reasons for discontinuation.

4. Differential reasons for discontinuation by treatment group, by time, or by treatment and time.

5. How baseline and/or post-baseline characteristics of those who discontinue differ from those who complete a trial.

Trialists often investigate the observed patterns of missing data to provide information relating to the missing data mechanism.

Q1. What practical steps can be taken to avoid the presence of missing data in (a) short-term and (b) long-term clinical trials?

Following the ICH guideline on Statistical Principles for Clinical Trials, ICH E9 [3], the expert group acknowledged that missing values represent a potential source of bias and that every effort should be undertaken to plan the study so that the amount of missing data is minimized. However, there was consensus that there will almost always be some missing data. It was agreed that the principles for minimizing the amount of missing data do not depend on the length of the trial. Suggestions to minimize the amount of missing data included the following:

1. Design the study and write the protocol so that key data are clearly identified.

2. The protocol should proactively plan for missing data; for example, unambiguously state the objectives of the study, the patient population of interest and how missing data may impact any inferences to be made. To illustrate the issues a nephrology trial was considered where haemoglobin data are collected weekly for 24 weeks. In such trials it is expected that about 30% of patients will withdraw from follow up. Reasons for withdrawal include death, kidney transplantation, adverse events, loss to follow up, etc. In most cases simply extending the trial or increasing sample size will not adequately address missing data. Protocols and statistical analysis plans rarely discuss the expected patterns of missing data, or consider the impact of the potential patterns on the overall scientific validity of the trial. Statisticians should proactively plan for various missing data mechanisms when determining the sample size, using existing knowledge of the disease and compound under investigation, and the likely impact on the overall inferences to be drawn.

3. Consider a two-step withdrawal process for patients: withdrawal of consent for treatment and withdrawal of consent from observation. Once a patient has withdrawn consent for treatment, only assessments needed to address key efficacy and safety questions of interest should be undertaken. In addition, the protocol should clearly state how the collection of the follow-up data would help address the key scientific questions of interest. It was acknowledged that in some disease areas (e.g. pain control, diabetes) it might be a challenge to explain the value of continuing to observe patients whilst the patients are not being treated with the trial study medication. In other disease areas (e.g. Oncology) such practices are already standard. Broadly speaking, at present it seems that the collection of follow-up data is often undertaken for drugs that modify disease progression, but not for symptomatic treatments. It was also recognized that switching of treatments can be an issue with continuing to monitor patients after withdrawal of treatment. Also switching treatments can result in confounding of treatment effects, which may be difficult to interpret or of limited value for short acting treatments or subjective responses. Nonetheless if data are collected after withdrawal with the aim of improving compliance, it was suggested that the amount of such data could be reduced; for example, only collect data relating to the primary endpoint and adverse events.

4. In any clinical trial there will always be ‘necessary’ and ‘unnecessary’ discontinuations. For ethical reasons, a trial must always be designed to permit ‘necessary’ discontinuations such as allowing a patient to discontinue due to lack of efficacy or an adverse event. These outcomes in themselves are often useful when assessing a treatment’s effectiveness and safety. However, ‘unnecessary’ discontinuations such as lost to follow up, which do not clearly map to adverse events or lack of efficacy can be reduced by tighter clinical protocols. That is, tighter control of the patient population through stricter inclusion or exclusion criteria; for example, patients should be selected who are more likely to complete the study. The disadvantage of this approach is that it can reduce the generalizability of the trial findings. It is worth noting, though, that the occurrence of missing data anyway influences the generalizability of the results obtained from the observed data. In summary, although a tall order, one important way to minimize missing data is to select a patient population that minimizes discontinuations for ‘unnecessary’ reasons; that is, those that are not causally related to the pharmacodynamic effects of the drug, while still enrolling a representative population.

Another suggestion that was made after the meeting was to reduce the amount of data being collected in individual trials and simplify case report forms (CRFs). If only key relevant data are collected, then the chance of data being captured reliably will increase, hence reducing the amount of missing data.

Table I. MCAR, MAR and MNAR definitions.

Missing completely at random (MCAR): The missing value mechanism is unrelated to the observed or unobserved responses, or to other measurements such as baseline values and treatment group. In particular, the probability that an observation is missed does not depend on how big or small it would have been if observed or on the size of the previous or subsequent observations on the same or any subject. Under MCAR any method of analysis that would have been valid for the complete data, such as ANCOVA, remains valid for the observed data.

Missing at random (MAR): The missing value mechanism may be dependent on observed measurements, including responses, but given these measurements, there is no remaining dependence on unobserved responses. The concept of Missing at Random (MAR) is most simply explained in the context of patient dropout in a longitudinal study. Suppose that two patients share the same treatment and covariates, and exactly the same response measurements up to the point at which one drops out and the other remains. Then the missing data from the subject who drops out are MAR if they have the same statistical behaviour as the observations from the subject who remains. Under MAR a valid analysis can be constructed that does not require knowledge of the specific form of the missing value mechanism.

Missing not at random (MNAR): Even after accounting for observed measurements, there remains dependence between the missing value mechanism and the unobserved responses. Under MNAR a valid analysis does require knowledge of the specific form of the missing value mechanism, but in practice we will almost never know this mechanism.

Q2. What methods do you think should be routinely employed to understand the nature of missing data?

To understand the nature of missing data it is important that the relevant information is collected. In a large number of clinical trials sponsored by the PSI, standard withdrawal or discontinuation CRFs are employed. These have prescribed standard lists for reasons for withdrawal such as adverse event, lack of efficacy, lost to follow up, etc. The group felt that often statisticians do not give enough thought to the customization of these CRFs for the disease under consideration or the study objectives; for example, how often are disease- or study-specific reasons included? To illustrate the point consider ‘Lost to Follow Up’ in oncology trials. What does this actually mean? Should study-specific reasons be provided to better understand what happens to these patients? The understanding of patient withdrawal patterns and associated missing data mechanism starts with the collection of relevant information.

The expert group also felt that during the planning phase of a clinical trial it is important to identify potential predictors of missing data, both to facilitate the collection of relevant data and for potential inclusion in the analysis. For example, consider an asthma clinical trial. In such trials FEV1 is often used as the primary endpoint. It is widely recognized that ‘asthma exacerbations’ may also be an important endpoint. In fact, when such events occur a patient may visit their health care professional, who in turn may advise the patient to withdraw from the trial. Consequently, when designing asthma trials it may be important to collect data on ‘asthma exacerbations’. It is, however, important to make the distinction between potential predictors of missing data and scenarios in which drop-out rates simply differ between the treatment groups. The latter case is an example of MAR. Given that treatment is always included in the model, the statistical analysis will therefore not be biased.

The importance of eliminating practices that artificially increase the number of discontinuations when designing clinical trials was also discussed. For example, should patients be discontinued for protocol violations or for lack of compliance? The answer to the question really depends on the precise question of interest. However, an alternative to discontinuing such patients is to permit them to stay in the study but flag the data as non-compliant. This is akin to having follow-up data that can be used in some analyses to test some hypotheses and excluded from other analyses. A specific example of artificial increase in missing data would be the use of electronic diaries for daily pain. If patients do not enter their data by the end of the day should the device not permit them to enter the data? Alternatively should entry be allowed but the data flagged as out of the desired time window?

Another area where the expert group felt improvements could be made was for trialists to start thinking earlier in the process about the mechanisms that cause missing data. As outlined in ICH E9, drug development spans many years and comprises an ordered program of clinical trials each with their own specific objectives. Little effort is made to understand missing data in the earlier phases of drug development. Sponsors tend to start considering the impact of missing data during late Phase II and Phase III, the pivotal clinical trials, when such issues can affect the approval of the final package by the regulatory authorities. Missing data mechanisms need to be considered when making go/no-go decisions at the end of Phase I and early Phase II. In addition the impact of missing data on later phase study design should be considered. It is important to note that the use of Phase I data can be problematic. Phase I studies often involve healthy volunteers, which may not provide useful information about what to expect in patients. However, in some therapeutic areas, such as oncology, when Phase I studies are focused on the population of interest, an insight into the missing data mechanism may be gained.

Q3. What are the relative merits of the following exploratory analyses?

• Plotting raw data and inspection of the data?

• Analysis by pattern of missing data (drop-out cohort)?

• Logistic regression of drop-out on earlier data?

There was consensus that graphical display is one of the most important tools available to statisticians when trying to understand the causes of missing data. Although analytical methods exist for exploring missing data, a large amount of information can be ascertained by simply plotting the data: for example,

1. Kaplan–Meier plots to look at time to withdrawal both overall and for specific reasons.

2. Plots of treatment means against time for cohorts of subjects with similar follow-up times.

3. Plots of treatment means against time for those who drop out at each visit together with the corresponding means against time for those who continue.

The latter two suggestions should be plotted on the same time scale for ease of comparison. The key to success is thinking through the question of interest and intelligently plotting the data.
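As an illustration of the first suggestion, the following is a minimal sketch of the Kaplan–Meier calculation that underlies a time-to-withdrawal plot. The function name and the toy data are assumptions made for this example; patients who complete the trial are treated as censored at their last visit. In practice the same calculation would be repeated for each treatment group and, where of interest, for each withdrawal reason.

```python
def kaplan_meier(times, withdrew):
    """Kaplan-Meier estimate of the probability of remaining in the trial.

    times    -- week of withdrawal, or week of last visit for completers
    withdrew -- True if the subject withdrew at that time, False if censored
    Returns a list of (time, probability of still being in the trial),
    with one entry per time at which at least one withdrawal occurred.
    """
    data = sorted(zip(times, withdrew))
    at_risk = len(data)
    remaining = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all subjects who share the same time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events > 0:
            remaining *= 1.0 - events / at_risk
            curve.append((t, remaining))
        # Withdrawn and censored subjects both leave the risk set.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical data: eight patients followed for up to 24 weeks.
    weeks = [4, 8, 8, 12, 24, 24, 24, 24]
    withdrew = [True, True, False, True, False, False, False, False]
    for week, prob in kaplan_meier(weeks, withdrew):
        print(week, round(prob, 3))
```

Plotting the returned step function for each arm on a common time axis shows at a glance whether withdrawals occur earlier or more often in one treatment group.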

It is often useful to complement such graphs with logistic regression to explore predictors of dropouts. This is especially useful in identifying key predictors from a set of candidate predictors. Such regressions can rule out MCAR in favour of a MAR mechanism. However, no definitive statements about the exact missing data mechanism can be made. Even if, from the observed data, this mechanism appears MCAR, the data may yet be MNAR. In other words the direct cause of the dropouts may always be unobserved. This is discussed in more detail in Carpenter and Kenward [4].
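A minimal sketch of such a regression is given below. It fits a logistic model of dropout on a single earlier response by Newton–Raphson, using only the Python standard library. The simulated data, the function names and the single-predictor form are assumptions made for illustration; a real analysis would normally use a statistical package and several candidate predictors.

```python
import math
import random


def fit_logistic(x, d, iterations=25):
    """Fit logit P(d = 1) = a + b * x by Newton-Raphson; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, di in zip(x, d):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += di - p           # score for the intercept
            g1 += (di - p) * xi    # score for the slope
            h00 += w               # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system (information) * step = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def simulate_dropout(n=5000, seed=2):
    """Hypothetical data: dropout is more likely after a high earlier response."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    d = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(-1.0 + xi))) else 0
         for xi in x]
    return x, d


if __name__ == "__main__":
    earlier_response, dropped_out = simulate_dropout()
    intercept, slope = fit_logistic(earlier_response, dropped_out)
    print(round(intercept, 2), round(slope, 2))
```

A slope clearly different from zero is evidence against MCAR. A slope near zero does not establish MCAR, however, because dropout may still depend on responses that were never observed.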

Q4. How would the approach differ if the missing values were in safety data as opposed to efficacy data?

The expert group agreed that the principles for minimizing and understanding missing data should not change for safety data, but the challenges may be very different. For example, in Phase III there are often a small number of specific adverse events of interest that are compound-specific. During the design phase careful consideration needs to be given as to how information will be collected about such events, and the impact of missing data on the inferences to be drawn. It was agreed that there is a need for more than simple summary tables of adverse event incidence rates in clinical study reports. Increased use of graphical displays and more in-depth analyses are required. Any interpretation should be linked to the risk management plan [5].

Although the expert group agreed that the principles for minimizing and understanding missing data should not change for efficacy and safety data, it is interesting to note that principles for handling the missing data in the analysis from a regulatory perspective do seem to differ. Regulators often seek conservative efficacy analyses but seldom see the value in conservative methods that underestimate the treatment effects for safety data.

3. DEFINING THE PRINCIPLES FOR HANDLING MISSING DATA

As discussed in the CHMP Points to Consider document on Missing Data, if missing values are handled by simply excluding any patients with missing outcomes from the analysis a large number of issues can arise, which may affect the interpretation of the trial results. The following section summarizes the discussions at the expert group meeting relating to the principles that should be applied when handling missing data.

Q5. Regulators have stated on numerous occasions that missing data from patients who drop out are different from other types of missing data. What are the principles for handling different types of missing data?

The expert group agreed that the key issue when handling any missing data is understanding the mechanism causing the missing data. It is essential that the proposed method of analysis, and associated handling of missing data, regardless of whether the patient discontinued or not, must be directly linked to and properly reflect the original objectives of the study, including any assumptions made when designing the trial. Specifically for patients who withdraw, the group felt that the critical question is what information needs to be collected for patients who discontinue, as such patients will occur in every trial. How missing data is handled is an integral part of the description of the primary comparison. The cost of running additional trials to investigate the effect of missing data far outweighs the cost of collecting the appropriate information in the first instance.

Q6. What are the principles for sensitivity analysis in the light of missing data?

The expert group agreed that two important principles exist when considering sensitivity analyses: transparency and relevance of the assumptions. It is important to clearly describe the original assumptions when designing the study so that all stakeholders can assess their relevance. The assumptions underlying any sensitivity analyses should be divergent from the original assumptions. It was agreed that, in contrast, a series of ‘wrong’ analyses does not properly constitute a sensitivity analysis.

Q7. Regulators seem to be favouring a requirement for sponsor companies to monitor patients after withdrawal. How should post-withdrawal data be handled in the statistical analysis?

Some literature has been published relating to the use of data collected after withdrawal [4,6]. Although too complex a topic to focus upon in detail here, since the issue of collecting data after withdrawal currently seems to be a critical one from the regulatory perspective, it was briefly discussed by the expert group. The group agreed that the issue reinforces the need to clearly define the objectives of the study. In defining the objectives clearly and precisely it will become apparent whether collecting data from patients who withdraw is necessary to address the question of concern. If such data are not collected then it may be necessary to describe how any resultant selection biases will be addressed. It was noted that the mechanism for withdrawal might differ between on-treatment and off-treatment periods. This in turn may lead to further technical challenges when incorporating data from patients after withdrawal into the analysis.

4. UNDERSTANDING THE UNDERLYING ASSUMPTIONS OF THE DIFFERENT ANALYSIS METHODS

In recent years a large amount of literature has been published on the merits of the different approaches for handling missing data [7–9]. This final session of the meeting focused on clarifying the assumptions behind the different methods and how they might relate to the objectives of the trial, specifically for a longitudinal clinical trial with dropouts or withdrawals.

Q8. What are the underlying assumptions of the (a) LOCF, (b) mixed model for repeated measures (MMRM) and (c) multiple imputation (MI) methods for handling missing data in a longitudinal clinical trial with dropouts or withdrawals?

LOCF is a single-imputation method. It makes an implicit assumption that the patients would sustain the same response seen at an early study visit for the entire duration of the trial. The assumption is untestable and potentially unrealistic. Even the strong MCAR assumption does not suffice to guarantee that an LOCF analysis is valid; in fact Kenward and Molenberghs [10] have shown that the assumptions under which LOCF is valid do not fit naturally into the MCAR, MAR and MNAR framework. Further, the uncertainty of imputation is not taken into account, and so, as discussed by Mallinckrodt et al. [7], the method can result in systematic underestimation of the standard errors.
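The mechanics of LOCF can be made concrete with a few lines of code. The sketch below is illustrative only: the function name is an assumption, and missing visits are represented by None.

```python
def locf(visits):
    """Carry the last observed value forward over missing visits (None).

    Visits before the first observed value stay missing, because there is
    nothing to carry forward.
    """
    filled = []
    last = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


if __name__ == "__main__":
    # Hypothetical patient who withdrew after the second of four visits.
    print(locf([52.0, 48.0, None, None]))  # [52.0, 48.0, 48.0, 48.0]
```

The two carried-forward values are then analysed as if they had been observed. This is the sense in which the method assumes a sustained response and takes no account of the uncertainty of imputation.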

MMRM and MI analyses make the assumption that data are MAR. In a MMRM analysis information from the observed data is used via the within-patient correlation structure to provide information about the unobserved data, but the missing data are not explicitly imputed. A MMRM analysis uses all the available data to provide information about the unobserved data [7]. It estimates the treatment effects assuming the withdrawn patients have the same statistical behaviour as those who continued. That is, MMRM assumes that the data observed until the point of discontinuation is a valid predictor of the unobserved data. In MI, the imputation step is separate from the modelling step, and so there is an additional flexibility to explore different assumptions about the nature of the missing data. If this flexibility is not used then it may in some circumstances essentially give the same results as MMRM, and so offers no advantages over that method. Further details on each of the above methods are provided in Mallinckrodt et al. [7].
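Because the imputation and modelling steps are separate in MI, the modelling step is run once on each imputed data set and the results are then combined. The paper does not spell out the combination step; the sketch below uses Rubin's rules, the conventional choice, and the function name and inputs are assumptions made for illustration.

```python
import statistics


def pool_rubin(estimates, variances):
    """Combine results from m imputed data sets using Rubin's rules.

    estimates -- treatment effect estimate from each imputed data set
    variances -- squared standard error of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Hypothetical results from five imputed data sets.
    effect, variance = pool_rubin([1.9, 2.3, 2.0, 2.4, 2.1],
                                  [0.25, 0.27, 0.24, 0.26, 0.25])
    print(round(effect, 3), round(variance ** 0.5, 3))
```

The between-imputation term is what carries the uncertainty about the missing values into the final standard error, which is the component a single-imputation method such as LOCF leaves out.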

It was acknowledged that if the underlying mechanisms that cause missing data are non-informative the resulting impact on the statistical analysis is far easier to handle, compared with informative missingness. The data being analysed, however, cannot provide evidence to distinguish between these two situations.

Q9. When might the assumptions for each of the methods be considered valid?

Table II outlines when it might be appropriateor inappropriate to use LOCF, MMRM or MItechniques in a longitudinal clinical trial with

Table II. Summary of When LOCF, MMRM and MI should be considered for a longitudinal clinical trial with acontinuous response variable.

Statisticalmethod

Situations where technique can beconsidered

Situations where technique should not beconsidered

LOCF Stable disease following first post-treatmentobservation

Diseases with marked improvement or deteriorationover time (e.g. Alzheimers)

Short-term trials Relapsing or remitting diseases (e.g. generalizedanxiety disorder)

Disease involving transient treatment effectsWhen MMRM is used since it nullifies the repeated-

measures aspects of the technique

MMRM Trials where the objective is to makeinferences about treatment effects if pa-tients stayed on treatment, but where nopost-withdrawal data has been collected

When withdrawn patients do not mimic patientswho continue in the study given same back-ground history

Trials where the objective is to make inferencesabout treatment effects if patients stayed ontreatment but where off-treatment data is in-cluded in the analysis

An unstructured covariance matrix shouldalways be employed. Time should alwaysbe fitted as a class variable. The baselineresponse should nearly always be crossedwith time�

If multivariate normal assumption does not hold forproviding information about the missing data

MI Increased flexibility required. In MI theimputation part is separated from model-ling part. Extra variables and complexitycan be incorporated such as treatmentwithdrawals, outcomes etc. In particularpost-randomization variables predictive ofdropouts can be incorporated

When Monte Carlo simulation not appropriateIf multivariate normal assumption does not hold forproviding information about the missing data

Different imputation schemes are needed fordifferent treatment groups

�If the baseline response is not crossed with time then an increase (rather than a decrease) in the variability of the

treatment comparison data can be observed. The reason for this is that the correlation between the baseline score and the

outcome variable nearly always decreases with time; that is, the serial correlation decays. If baseline is fitted as a main

effect then the estimated regression coefficient is averaged across all visits, and is larger than the correct baseline regression

coefficient for the final time. This means that the analysis over-corrects for the endpoint of interest, which is typically a

comparison at the final visit. So even with no missing data one can get an over-corrected estimate of treatment difference.


dropouts or withdrawals. In such studies it is important to recognize that MMRM and MI in their most basic form, both assume the multivariate normal distribution when providing information about the missing data. Invalid inferences can be drawn when the assumption is not met. There are, however, generalizations and modifications of these approaches, which while based on the same basic principles, are valid under other distributional assumptions.
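The table footnote on crossing baseline with time can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not the expert group's analysis: it uses complete simulated data, ordinary least squares rather than a full MMRM, no treatment term, and hypothetical per-visit baseline slopes (0.8, 0.6, 0.4, 0.2) chosen so that the relationship between baseline and outcome decays over time. All function and variable names are invented for the example.

```python
import random
from statistics import mean


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=2000, true_slopes=(0.8, 0.6, 0.4, 0.2), seed=1):
    """Complete data in which the baseline slope decays over visits.

    The slopes are hypothetical values picked for illustration only.
    """
    rng = random.Random(seed)
    baseline = [rng.gauss(0.0, 1.0) for _ in range(n)]
    visits = [[b * s + rng.gauss(0.0, 1.0) for b in baseline]
              for s in true_slopes]
    return baseline, visits


baseline, visits = simulate()

# Baseline crossed with time: one regression coefficient per visit.
per_visit = [ols_slope(baseline, y) for y in visits]

# Baseline as a main effect, with a separate intercept per visit: a single
# coefficient shared by all visits, obtained by centring within each visit.
mb = mean(baseline)
xs, ys = [], []
for y in visits:
    my = mean(y)
    xs.extend(b - mb for b in baseline)
    ys.extend(v - my for v in y)
pooled = sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)

print("per-visit slopes:", [round(s, 3) for s in per_visit])
print("pooled slope:    ", round(pooled, 3))
```

Under these assumptions the pooled coefficient equals the average of the per-visit coefficients, so it exceeds the final-visit coefficient whenever the slopes decay. That is the over-correction at the final visit that the footnote describes.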

One of the main issues when determining how to handle missing data is that the true missing data mechanism will always be unknown and not testable from the data. It is only possible to suggest that the data are not consistent with the MCAR assumption. No amount of clever modelling can overcome this issue. If the mechanism for missingness is informative then it may not be possible to fully evaluate the impact of the treatment of missing data in the analysis, and this must be carefully considered in the interpretation of the data. Consequently, the key issues are what questions the analysis of the trial is answering, and under what assumptions the proposed analysis answers them. Doubts about aspects of the assumptions can be addressed through appropriate sensitivity analyses. That is, sensitivity analyses can be used to explore the influence of missing data when doubts exist regarding the missing data mechanism. The use of sensitivity analyses needs to be approached carefully. Sensitivity analyses focused on specific assumptions are useful when determining the robustness of the inferences from a clinical trial.
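One way to focus a sensitivity analysis on a single assumption is a delta adjustment, in which the unobserved outcomes in the treatment arm are assumed to be worse than their observed counterparts by an amount delta. The text does not name this technique; it is used here only as an illustration. The minimal sketch below works with point estimates only, assumes that with delta equal to zero the missing outcomes in each arm have the same mean as the observed outcomes in that arm, and uses hypothetical summary numbers. The `Arm` class and all function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    observed_mean: float  # mean outcome among patients with data
    n_observed: int       # number of patients with the outcome observed
    n_missing: int        # number of patients with the outcome missing


def arm_mean(arm: Arm, delta: float = 0.0) -> float:
    """Arm mean if missing outcomes average observed_mean + delta."""
    n = arm.n_observed + arm.n_missing
    total = (arm.n_observed * arm.observed_mean
             + arm.n_missing * (arm.observed_mean + delta))
    return total / n


def treatment_difference(trt: Arm, ctl: Arm, delta: float) -> float:
    """Treatment minus control; only the treatment arm is shifted by delta."""
    return arm_mean(trt, delta) - arm_mean(ctl, 0.0)


def tipping_point(trt: Arm, ctl: Arm) -> float:
    """Delta at which the estimated difference reaches zero.

    Requires at least one missing outcome in the treatment arm.
    """
    base = treatment_difference(trt, ctl, 0.0)
    frac_missing = trt.n_missing / (trt.n_observed + trt.n_missing)
    return -base / frac_missing


# Hypothetical summary data, for illustration only.
trt = Arm(observed_mean=5.0, n_observed=80, n_missing=20)
ctl = Arm(observed_mean=3.0, n_observed=90, n_missing=10)

for delta in (0.0, -2.0, -4.0, -6.0):
    print(delta, round(treatment_difference(trt, ctl, delta), 2))
print("tipping point:", round(tipping_point(trt, ctl), 2))
```

The sketch ignores the uncertainty in the imputed values, so it shows only how far the point estimate moves as the assumption is relaxed. A full analysis would also need to carry that uncertainty through, for example via multiple imputation.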

5. CONCLUSION

The Points to Consider Document on Missing Data was adopted by the CHMP in December 2001. Since the issuance of the guidance document there has been increased debate within the statistical community about the merits of the different approaches used to handle missing data such as LOCF, MMRM and MI. Subsequently, in September 2007 the CHMP issued a recommendation to review the document, with particular emphasis on summarizing and critically appraising the pattern of drop-outs, explaining the role and limitations of LOCF and describing the CHMP’s cautionary stance on the use of mixed models. It was clear from the one-day PSI-sponsored expert group meeting that the 2001 guideline places a great deal of emphasis on the merits of the different statistical methods available for handling missing data, and not enough on the principles that should be considered when designing trials.

The expert group also concluded that currently biostatisticians tend to react to missing data. Comprehensive, proactive planning is rarely undertaken when designing trials. It is imperative that the precise objectives of the trial are documented and the potential impact of missing data thoroughly considered during the planning phase. Missing data mechanisms for a trial need to be considered. Sensitivity analyses investigating the robustness of the inferences to the different assumptions made should also be considered. Another identified area for improvement is in the understanding of the pattern of missing data observed during a trial, and subsequently the missing data mechanism, via the plotting of data; for example, use of Kaplan–Meier curves of time to withdrawal. Finally, it was concluded that the handling of missing data is a difficult area. If the mechanism for the missing data is non-informative then the issue can be addressed by using relatively straightforward statistical techniques. However, if the mechanism for the missing data is informative then the issues are complex, and appropriate sensitivity analysis is called for.
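As an illustration of the Kaplan–Meier suggestion above, the following minimal sketch estimates the probability of remaining in the trial over time, treating withdrawal as the event and patients who complete the trial or are still on study as censored. The function name, its arguments and the example data are hypothetical and are not taken from any trial.

```python
def kaplan_meier(times, withdrew):
    """Kaplan-Meier estimate of the probability of remaining in the trial.

    times    -- follow-up time for each patient
    withdrew -- True if the patient withdrew at that time, False if the
                patient was censored (for example, completed the trial)
    Returns a list of (time, probability still in trial), one entry per
    distinct time at which at least one withdrawal occurred.
    """
    data = sorted(zip(times, withdrew))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        events = 0
        # Count withdrawals among all patients sharing this follow-up time.
        while j < len(data) and data[j][0] == t:
            events += data[j][1]
            j += 1
        if events:
            surv *= 1 - events / at_risk
            curve.append((t, surv))
        # Everyone observed at time t leaves the risk set afterwards.
        at_risk -= j - i
        i = j
    return curve


# Hypothetical follow-up times in weeks, for illustration only.
times = [2, 4, 4, 6, 8, 12, 12, 12]
withdrew = [True, True, False, True, False, False, False, False]

curve = kaplan_meier(times, withdrew)
for t, s in curve:
    print(f"week {t}: estimated proportion still in trial = {s:.3f}")
```

In practice the estimate would be computed separately for each treatment group and the curves plotted together, so that differences in the timing of withdrawals between groups are visible.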

ACKNOWLEDGEMENTS

The authors would like to thank the reviewers and editor for their insightful comments, which led to an improved version of the manuscript.

REFERENCES

1. CPMP Points to Consider Document. Missing Data (2001). Available at: http://www.emea.europa.eu/pdfs/human/ewp/177699EN.pdf.

2. Recommendation for the Revision of the Points to Consider on Missing Data (2007). Available at: http://www.emea.europa.eu/pdfs/human/ewp/43998007en.pdf.


3. ICH E9 Expert Working Group. Statistical principles for clinical trials: ICH harmonized tripartite guideline. Statistics in Medicine 1999; 18:1905–1942.

4. Carpenter JR, Kenward MG. Missing data in randomised controlled trials: a practical guide. National Institute for Health Research: Birmingham, 2008. Publication RM03/JH17/MK. Available at: http://www.pcpoh.bham.ac.uk/publichealth/methodology/projects/RM03_JH17_MK.shtml.

5. Guideline on Risk Management Systems for Medicinal Products for Human Use. Available at: http://www.emea.europa.eu/pdfs/human/euleg/9626805en.pdf.

6. Mallinckrodt CH, Kenward MG. Conceptual considerations regarding choice of endpoints, hypotheses, and analyses in longitudinal clinical trials. Drug Information Journal 2009; 43:449–458.

7. Mallinckrodt CH, Lane PW, Schnell D, Peng Y, Mancuso J. Recommendations for the primary analysis of continuous endpoints in longitudinal clinical trials. Drug Information Journal 2008; 42:303–319.

8. Mallinckrodt CH, Watkin JG, Molenberghs G, Carroll RJ. Choice of the primary analysis in longitudinal clinical trials. Pharmaceutical Statistics 2004; 3:161–169.

9. Lane P. Handling drop-out in longitudinal clinical trials: a comparison of the LOCF and MMRM approaches. Pharmaceutical Statistics 2008; 7:93–106.

10. Kenward MG, Molenberghs G. Last observation carried forward: a crystal ball? Journal of Biopharmaceutical Statistics 2009; 19:872–888.

APPENDIX A: PSI DISCUSSION GROUP MEMBERSHIP AND AFFILIATION

Tomasz Burzykowski, MSOURCE Medical Development
James Carpenter, London School of Hygiene and Tropical Medicine
Corneel Coens, EORTC
Daniel Evans, Pfizer
Lesley France, AstraZeneca
Mike Kenward, London School of Hygiene and Tropical Medicine
Peter Lane, GlaxoSmithKline
James Matcham, Amgen
David Morgan, Ipsen
Alan Phillips, ICON Clinical Research
James Roger, GlaxoSmithKline
Brian Sullivan, Statistical Solutions
Ian White, MRC Biostatistics Unit
Ly-Mee Yu, Centre for Statistics in Medicine, University of Oxford
