
BIAS REDUCTION IN THE PRESENCE OF INFORMATIVE CENSORING:

APPLICATION OF THE COX MODEL TO

MULTIDRUG-RESISTANT TUBERCULOSIS COHORT ANALYSES

MEREDITH BLAIR BROOKS

A Dissertation Submitted to the Faculty of

The School of Health Professions

in Partial Fulfillment of the Requirements

for the Degree of Doctor of Philosophy in Population Health

in the Department of Health Sciences

Northeastern University

Boston, Massachusetts

December 5, 2017


DEDICATION

This dissertation is lovingly dedicated to my husband, Craig. Simply put, you are my rock. Your

steadfast support inspires me every day to work harder than the last.


ACKNOWLEDGEMENTS

Foremost, I would like to express my deep gratitude to Dr. Justin Manjourides for generously sharing his time, for providing a wealth of opportunities for learning and development, and for his ability to articulate complex methods in a simple manner. I could not have asked for a better mentor.

Dr. Carole Mitnick serves as a true role model. She leads by example through her immense

knowledge, ability to always be present, commitment to create learning opportunities, and extreme

patience and generosity. While it seems impossible to find words that adequately characterize Dr.

Mitnick, it is simple to state that as a mentor, she is one of a kind.

I would like to thank Dr. John Griffith for his insightful comments, invaluable feedback, and

thought-provoking questions. He has helped me to critically think about future steps and how my

work can be applied to a broader context.

I am grateful to my fellow doctoral students in the Population Health program at Northeastern

University who have played many roles throughout the last few years, including study partner,

sounding board, and friend.

Colleagues in the Department of Global Health and Social Medicine at Harvard Medical School have played an essential role in supporting my pursuit of higher education. They have helped me identify appropriate data sets and learn the necessary methods, and they have been incredibly flexible, patient, and understanding of my competing time commitments.

To my family and friends, thank you for your continued love, support, and understanding, especially when I missed weekend get-togethers or birthday celebrations, or was preoccupied with my schoolwork even when I was present. My appreciation for you is immeasurable, and I plan on making it up ten-fold.


ABSTRACT

Background

Cox proportional hazards models are used to analyze multidrug-resistant tuberculosis (MDR-TB) cohorts. Because resources to follow MDR-TB patients past their initial treatment outcome are limited, longer-term survival data are often unavailable. This can lead to informative censoring, which violates Cox model assumptions. We investigate whether the presence of informative censoring biases treatment effects, estimate the magnitude and direction of this bias, and propose alternative, simple-to-implement censoring techniques to reduce the impact of this bias.

Methods

We use Cox proportional hazards regression and varying censoring techniques to evaluate the association between receipt of an aggressive treatment regimen and time to death. We apply alternative censoring techniques informed by the literature and derived from predictive modeling. We explore the impact of these techniques on treatment effect estimates obtained through simulation and through Cox proportional hazards modeling of observed data from two cohorts of MDR-TB patients: Socios En Salud, Lima, Peru [1999-2002] and Partners In Health Russia, Tomsk Oblast, Russian Federation [2000-2004].

Results

We observe that the conventional censoring approach violates the non-informative censoring

assumption of the Cox model and produces biased treatment effect estimates. The conventional

method consistently underestimates the treatment effect. Use of alternative, better informed

censoring techniques reduces bias and produces stronger, more accurate treatment effect estimates.


Conclusions

Informative censoring is present in these MDR-TB cohorts due to multiple treatment outcome

definitions and lack of survival data past the initial treatment outcome. Use of alternative censoring

techniques mitigates the effects of violating the non-informative censoring assumption when using

Cox proportional hazards models to analyze MDR-TB cohorts. Unless methods are integrated into

analyses to reduce these biases, inaccurate treatment effect estimates may be produced and used to

inform treatment guidelines.


TABLE OF CONTENTS

List of Tables ................................................................................................................................................ 5

List of Figures ............................................................................................................................................... 7

List of Supplemental Materials ..................................................................................................................... 8

List of Abbreviations .................................................................................................................................... 9

Chapter 1: Introduction ............................................................................................................................... 10

Background .................................................................................................................................................. 11

Tuberculosis ............................................................................................................................................. 11

Multidrug-resistant Tuberculosis ............................................................................................................. 12

Treatment Outcome Definitions ............................................................................................................... 13

Survival Analysis ..................................................................................................................................... 14

Cox Proportional Hazards Models ........................................................................................................... 15

Application of Cox Proportional Hazards Models to MDR-TB Cohort Data .......................................... 21

Problem ........................................................................................................................................................ 24

Significance.................................................................................................................................................. 25

Dissertation Overview ................................................................................................................................. 27

References .................................................................................................................................................... 29

Chapter 1 Tables .......................................................................................................................................... 33

Chapter 1 Figures ......................................................................................................................................... 36

Chapter 2: Adjustments for Informative Censoring in Cox Proportional Hazards Models: Application to a Multidrug-resistant Tuberculosis Cohort .... 37

Abstract ........................................................................................................................................................ 38

Introduction .................................................................................................................................................. 40


Methods ....................................................................................................................................................... 41

Study Population ...................................................................................................................................... 41

Exposure Variable Definitions ................................................................................................................. 42

Outcome Definition ................................................................................................................................. 42

Statistical Analysis ................................................................................................................................... 43

Ethics Statement ....................................................................................................................................... 45

Results .......................................................................................................................................................... 45

Discussion .................................................................................................................................................... 46

Conclusion ................................................................................................................................................... 48

Funding ........................................................................................................................................................ 49

References .................................................................................................................................................... 50

Chapter 2 Tables .......................................................................................................................................... 52

Chapter 2 Figures ......................................................................................................................................... 57

Chapter 3: Use of Predicted Vital Status to Improve the Analysis of Multidrug-resistant Tuberculosis Cohorts .... 58

Abstract ........................................................................................................................................................ 59

Introduction .................................................................................................................................................. 61

Methods ....................................................................................................................................................... 62

Study Cohort ............................................................................................................................................ 62


Outcome Definition ................................................................................................................................. 64

Statistical Analysis ................................................................................................................................... 64



Results .......................................................................................................................................................... 68

Discussion .................................................................................................................................................... 70

Conclusions .................................................................................................................................................. 72

Funding ........................................................................................................................................................ 72

References .................................................................................................................................................... 73

Chapter 3 Tables .......................................................................................................................................... 76

Chapter 3 Figures ......................................................................................................................................... 81

Chapter 4: Bias Estimates From Informative Censoring in Multidrug-resistant Tuberculosis Cohort Analyses: A Simulation Study .... 82

Abstract ........................................................................................................................................................ 83

Introduction .................................................................................................................................................. 85

Methods ....................................................................................................................................................... 86

Study Population ...................................................................................................................................... 86


Treatment Effects ..................................................................................................................................... 89

Censoring Techniques .............................................................................................................................. 90

Statistical Methods ................................................................................................................................... 91


Results .......................................................................................................................................................... 92

Discussion .................................................................................................................................................... 95

Conclusions .................................................................................................................................................. 97


Funding ........................................................................................................................................................ 97

References .................................................................................................................................................... 99

Chapter 4 Tables ........................................................................................................................................ 101

Chapter 4 Figures ....................................................................................................................................... 107

Chapter 5: Conclusions ............................................................................................................................. 115

Summary .................................................................................................................................................... 116

Problem .................................................................................................................................................. 116

Research Findings .................................................................................................................................. 118

Limitations ............................................................................................................................................. 120

Recommendations .................................................................................................................................. 121

Research Contributions .............................................................................................................................. 122

Future Work ............................................................................................................................................... 123

Conclusion ................................................................................................................................................. 125

References .................................................................................................................................................. 127

Supplemental Materials ............................................................................................................................ 129


LIST OF TABLES

Table 1.1 MDR-TB treatment outcome definitions 33

Table 1.2 Effect of censoring on survival probability estimates 34

Table 1.3 Long term survival after initial treatment outcome 35

Table 2.1 Explanatory variable definitions for the Lima, Peru cohort 52

Table 2.2 Characteristics of MDR-TB cohort from Lima, Peru 53

Table 2.3 Breakdown of time to treatment outcomes 54

Table 2.4 Change in effect estimates between the low-, high-, and equal-risk assumptions 55

Table 2.5 Change in effect estimates between the mixed- and equal-risk assumptions 56

Table 3.1 Explanatory variable definitions for the Tomsk, Russia cohort 76

Table 3.2 Characteristics of MDR-TB cohort from Tomsk, Russia 77

Table 3.3 Model performance characteristics using 10-fold cross validation 78

Table 3.4 Distribution of end-of-study outcomes by initial treatment outcomes 79

Table 3.5 Change in treatment effect estimates using varying approaches to handle censored observations 80

Table 4.1 Explanatory variable definitions for the simulated data 101

Table 4.2 Full results of model performance across treatment effect estimates for univariate analysis 102


multivariable analysis (aggressive treatment regimen results)

103


multivariable analysis (adolescence results)

104



univariate analysis with non-informative censoring

105

Table 4.6 Results of model performance when applied to the real Lima, Peru cohort data 106


LIST OF FIGURES

Figure 1.1 Kaplan-Meier product limit survival estimates 36

Figure 2.1 Kaplan-Meier curve comparison across four censoring assumptions 57

Figure 3.1 Receiver Operating Characteristics curve for final prediction model selection 81

Figure 4.1 Relative bias of the estimated effect of the aggressive treatment regimen in univariate analysis by censoring technique 107

Figure 4.2 Mean squared error of the estimated effect of the aggressive treatment regimen in univariate analysis by censoring technique 108

Figure 4.3 Power of the estimated effect of the aggressive treatment regimen in univariate analysis by censoring technique 109

Figure 4.4 95% confidence interval coverage rates of the estimated effect of the aggressive treatment regimen in univariate analysis by censoring technique 110

Figure 4.5 Relative bias of the estimated effect of the aggressive treatment regimen in multivariable analysis by censoring technique 111

Figure 4.6 Mean squared error of the estimated effect of the aggressive treatment regimen in multivariable analysis by censoring technique 112

Figure 4.7 Power of the estimated effect of the aggressive treatment regimen in multivariable analysis by censoring technique 113

Figure 4.8 95% confidence interval coverage rates of the estimated effect of the aggressive treatment regimen in multivariable analysis by censoring technique 114


LIST OF SUPPLEMENTAL MATERIALS

R code for rejection sampling algorithm 130

R code for model development and adjustment of censoring assumptions 132


LIST OF ABBREVIATIONS

AD Adolescent

AR Aggressive treatment regimen

BIC Bayesian information criterion

BMI Body mass index

CI Confidence interval

EPTB Extra-pulmonary tuberculosis

HIV Human immunodeficiency virus

HR Hazard ratio

IQR Interquartile range

MDR-TB Multidrug-resistant tuberculosis

MSE Mean squared error

PH Proportional hazards

ROC Receiver operating characteristics

SD Standard deviation

TB Tuberculosis

WHO World Health Organization

XDR-TB Extensively drug-resistant tuberculosis


CHAPTER 1: INTRODUCTION


Background

Tuberculosis

Tuberculosis (TB) is an infectious disease caused by the bacterium Mycobacterium tuberculosis.1 Although TB can affect any part of the body, the most common form is pulmonary TB, which affects the lungs.1 TB spreads when a person sick with TB expels the bacteria into the air, such as through coughing, sneezing, or other similar actions.1,2 A person becomes infected when they inhale the TB bacteria.1 People who are infected with TB but have not developed TB disease do not exhibit symptoms and cannot infect others.1 The magnitude of TB infection is enormous; it is estimated that one-quarter of the world’s population is infected.1,3 People with TB infection have a 5 to 15 percent lifetime risk of developing TB disease, with certain subgroups, such as those with compromised immune systems, being at higher risk than others.1

People who develop TB disease often experience symptoms such as a cough lasting more than two weeks, fever, night sweats, and weight loss.1 Because these symptoms are not specific, TB patients often delay seeking medical care or are misdiagnosed by clinicians who assume a more general illness, like the flu. When patients do seek care, limitations of current diagnostic tests, including limited sensitivity,4 long delays in producing results,5 and cost and lack of universal availability,6 can lead to delayed or missed diagnosis. This further promotes transmission of disease.

Although TB is treatable and curable with prompt diagnosis and appropriate medicines, if left undiagnosed and untreated during the infectious period, TB patients are capable of infecting up to 15 contacts annually.1 Additionally, without proper treatment, up to two-thirds of people with active TB disease will die.7 Current treatment guidelines for new cases of TB, as recommended by the World Health Organization (WHO), consist of a six-month regimen of four first-line antimicrobial drugs.8 Community- or home-based directly observed treatment is recommended8 to improve adherence to the six-month treatment regimen, because poor adherence may leave patients infectious and/or lead them to acquire drug resistance.

The global burden of TB is high; the WHO estimates that in 2016 there were 10.4 (8.8 - 12.2) million incident cases of TB and 1.3 (1.2 - 1.4) million associated deaths.9 Of the 10.4 million cases, only 61 (52-72) percent initiated TB treatment, further compromising control of TB spread.9 For TB patients who do initiate treatment, the global success rate is 83 percent.9

Multidrug-resistant Tuberculosis

TB control is complicated by the emergence of multidrug-resistant (MDR) TB. MDR-TB is TB caused by strains of Mycobacterium tuberculosis that are resistant to at least rifampicin and isoniazid, the two most powerful first-line anti-TB drugs.10 MDR-TB is a global threat because it spreads easily, is more difficult to diagnose, and requires a more complex treatment regimen that lasts three to four times longer than a treatment regimen for drug-susceptible TB.11 The WHO estimates that 600,000 (540,000 - 660,000) new cases were eligible for MDR-TB treatment in 2016 and that 240,000 (140,000 - 340,000) related deaths occurred.9 Globally, only 22 percent of cases eligible for MDR-TB treatment actually initiated treatment and, of those, only 54 percent were treated successfully.9 The low rates of treatment initiation and treatment success can be attributed to the difficulties of current MDR-TB treatment regimens, including long treatment duration, expensive drugs, difficulty of implementation, and severe toxicities. Additionally, it has been demonstrated that among MDR-TB patients who initiate treatment but in whom treatment fails, up to 80 percent will die within three years.12 MDR-TB is not only a deadly disease but also a highly infectious one; every untreated MDR-TB patient will infect about six new people annually.1 More effective drugs and treatment regimens may be on the horizon with the recent introduction of a standardized, shortened treatment regimen lasting nine to 12 months for eligible subgroups,11 drugs emerging from the developmental pipeline, and clinical trials producing initial data about the efficacy of novel drug combinations.

Treatment Outcome Definitions

Six mutually exclusive treatment outcomes are defined for MDR-TB, based on treatment completion and bacteriologic results. The six outcomes are cure, treatment completion, death, treatment default, treatment failure, and transfer out. Table 1.1 details MDR-TB treatment outcomes as defined by Laserson et al.13 Composite treatment outcomes are also used, classified as either successful (cure, treatment completion) or unsuccessful (death, treatment default, treatment failure, transfer out). We use the outcome definitions from Laserson et al. (2005)13, although the WHO updated these definitions in 201414, because the data used in this research were previously classified and published according to the earlier definitions.

Laserson et al. also make recommendations for how to conduct MDR-TB cohort analyses.13 Recommendations include the following: 1) develop cohorts based on the date of MDR-TB treatment initiation; 2) perform analyses on all patients who receive MDR-TB treatment, regardless of treatment duration; 3) assign all patients the first outcome that they experience; 4) perform analyses 36 months after the last patient enrollment date in the cohort; and 5) follow patients for two years after the initial outcome assignment to allow detection of relapse. While patients are usually followed by local programs from the time of treatment initiation until the first treatment outcome, information about longer-term survival is rarely collected. This is due to the scarcity of resources in areas that experience the majority of the MDR-TB burden and the intensity of monitoring required for TB patients. When the available data lack information on survival after the initial treatment outcome, it is important to use the most efficient analysis methods to reduce potential bias in effect estimates. Techniques for analyzing MDR-TB cohort data may vary with the quality of a particular TB program’s patient monitoring, including documentation of the treatment regimen, risk factors, and comorbidities, and the frequency of follow-up until an outcome is observed.

As resistant TB strains continue to spread and new resistance patterns emerge, the need for new

drugs is dire, as is identifying drug combinations that may be most beneficial for certain sub-groups

of MDR-TB patients. Continued analysis of MDR-TB treatment outcomes in programmatic settings

is essential to identify effective treatments, track resistance patterns, guide treatment

recommendations, and control the spread of MDR-TB.

Survival Analysis

Survival analysis is a time-to-event methodology in which the outcome variable is the time to occurrence of an event.15 Survival time is defined as the length of time from the designated origin until the time at which the event of interest occurs. Survival analysis is unique in that it uses information from censored observations, which occur when individuals do not experience the event of interest during the study period, leaving their actual survival time unknown.15 Herein, the term censoring refers to right censoring, which occurs if the event happens after the observed survival time.16 Right censoring may occur for several reasons: the event does not occur before the study period ends, an individual is lost to follow-up at some point during the study period, or an individual is withdrawn from the study or can no longer be followed for some other reason. Ultimately, for observations that are right censored, the observed survival time is shorter than the actual survival time.16
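As a minimal sketch of how right-censored observations are typically recorded, each individual can be represented by an observed time and an event indicator. The function name `observe` and the example values below are hypothetical and are not taken from the cohorts analyzed in this dissertation.

```python
def observe(true_time, censor_time):
    """Return (observed_time, event) for one individual under right censoring.

    event = 1 if the event is observed, 0 if the observation is censored.
    """
    if true_time <= censor_time:
        return true_time, 1
    return censor_time, 0


# Hypothetical true survival times and censoring times (arbitrary units).
true_times = [3.0, 8.0, 5.5]
censor_times = [6.0, 6.0, 4.0]

observed = [observe(t, c) for t, c in zip(true_times, censor_times)]
for (obs_time, event), true_time in zip(observed, true_times):
    status = "event" if event else "censored"
    print(f"observed={obs_time}, true={true_time}, {status}")
```

In practice only the observed time and the indicator are available to the analyst; the true survival time of a censored individual is unknown and is never shorter than the observed time.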

There are several types of survival analysis approaches, including those that are non-parametric, semi-parametric, and parametric. Although commonly used, the non-parametric Kaplan-Meier analysis17 is not suitable for determining the relationship between specific variables and survival times. When multivariable approaches are necessary, the semi-parametric Cox proportional hazards model has become one of the most widely accepted methods to model survival data18 and is the focus of this work.

Cox Proportional Hazards Models

Cox proportional hazards models estimate the hazard function, $h_i(t \mid x)$, which is the instantaneous rate of failure at time $t$, given survival up to time $t$, and has the form:

$$h_i(t \mid x) = h_0(t) \times \exp(\boldsymbol{\beta}' \mathbf{X}_i),$$

where $h_0(t)$ is the baseline hazard function common to everyone, $\boldsymbol{\beta}$ is the vector of regression coefficients, and $\mathbf{X}_i$ is the vector of covariates for observation $i$.19 This model is considered semi-parametric because it makes no assumption about the baseline hazard function (which serves as the non-parametric part of the model) but does assume a parametric form for the effect of the predictors on the hazard (the parametric part of the model).18 Cox proportional hazards models produce an estimated hazard ratio for each measured covariate, and these hazard ratios are independent of time.19 Unlike linear or logistic regression models, Cox proportional hazards models allow each member of a cohort to contribute individual survival time, $t_i$, by including observations that are censored prior to an observed event.19 Because this model does not treat censored observations as missing, several assumptions must be met for inferences to be valid. The first assumption is that event times (also referred to as failure times) are independent of one another. The event of interest discussed moving forward is death. The second assumption is that the hazard of failure is proportional across levels of a given covariate, resulting in a constant hazard ratio over time. The third assumption is non-informative censoring, in which censor times are independent of failure times.19 If any assumption is not met, results produced from the Cox proportional hazards model may be invalid. The assumption of non-informative censoring is the focus of the work herein.
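The form of the model above implies that the ratio of two individuals' hazards does not depend on the baseline hazard, and therefore not on time. The short sketch below illustrates this numerically; the coefficient values, covariate layout, and baseline hazard values are hypothetical and chosen only for illustration.

```python
import math


def hazard(baseline_hazard, beta, x):
    """Evaluate h0(t) * exp(beta'x) for one individual at one time point."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard * math.exp(linear_predictor)


# Hypothetical coefficients for (treatment indicator, second covariate).
beta = [-0.5, 0.3]
treated = [1, 0]
untreated = [0, 0]

# Arbitrary baseline hazard values at three time points (assumed).
baseline_by_time = {1: 0.02, 2: 0.05, 3: 0.01}

# The hazard ratio comparing treated with untreated at each time point.
ratios = [
    hazard(h0, beta, treated) / hazard(h0, beta, untreated)
    for h0 in baseline_by_time.values()
]

# The baseline hazard cancels, leaving exp(beta) for the treatment covariate.
hazard_ratio = math.exp(beta[0])
print(ratios, hazard_ratio)
```

Whatever values are chosen for the baseline hazard, the ratio at every time point equals the exponentiated coefficient, which is the sense in which the hazard ratio is constant over time under the proportional hazards assumption.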

Under the non-informative censoring assumption, since the actual time of death is not observed, censored observations are considered to be at the same risk of failure as individuals still at risk after the observed censor time.20 If this assumption is valid, then knowledge of the true failure time for the censored observations is not necessary to produce unbiased estimates.

When an individual is censored their contribution to the model (1

𝑁), where N is the total number of

individuals in the sample, has an impact on estimated survival probabilities by being equally

distributed across all remaining, at-risk individuals in the cohort after the time of censoring.20 This

is referred to as the ‘equal-risk assumption’ throughout. We demonstrate this through the Kaplan-

Meier estimate of the survivor function, �̂�(𝑡), which is an estimate of the probability that a patient

will survive (�̂�) beyond a specified time (𝑡). This is given by:

�̂�(𝑡) = ∏ (𝑛𝑗−𝑑𝑗

𝑛𝑗)𝑘

𝑗=1 ,

where $k$ is the total number of distinct uncensored failure times observed in the sample, $j$ indexes the individual failure times, $n_j$ is the number of individuals alive just before time $t_j$, and $d_j$ is the number of individuals who die at time $t_j$.19 Empirically, as observations are censored, larger reductions in $\hat{S}(t)$ are observed at each subsequent event time, increasingly so as each censored observation’s contribution is distributed among fewer remaining observations. The impact of censored observations is demonstrated in Table 1.2.
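To make the product-limit calculation concrete, the following is a minimal Python sketch of the Kaplan-Meier estimator as defined above. It is an illustration only, not the software used for the analyses in this dissertation. The function name `kaplan_meier` and the ten-person example cohort (deaths at t = 1, 2, 3, 5, 7, 10 and censoring at t = 4, 6, 8, 9) are assumptions chosen for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survivor function.

    times  : observed follow-up time for each individual
    events : 1 if the individual died at that time, 0 if censored
    Returns a list of (t_j, n_j, d_j, S_hat) tuples, one per distinct failure time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)          # n_j: individuals alive just before t_j
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t (deaths and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional survival (n_j - d_j) / n_j.
            survival *= (n_at_risk - deaths) / n_at_risk
            steps.append((t, n_at_risk, deaths, survival))
        # Censored individuals leave the risk set without changing S_hat.
        n_at_risk -= removed
    return steps


# Illustrative cohort (an assumption for this sketch): N = 10, four censored.
example_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
example_events = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1]

previous = 1.0
for t, n, d, s in kaplan_meier(example_times, example_events):
    print(f"t={t:2d}  n_j={n:2d}  d_j={d}  S_hat={s:.4f}  drop={previous - s:.4f}")
    previous = s
```

Censored individuals never multiply into the product directly; they affect the estimate only by shrinking the risk set $n_j$ at later failure times, which is why later drops in the curve become larger.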

In Table 1.2, both panels show a population of N=10 at 10 time points. Panel A (no censoring)

displays one death at each time point and the subsequent decrease in the number of individuals alive

just before that time point (𝑛𝑗 − 𝑑𝑗). The difference between survivor functions at two time points

represents the individual contribution of each death to the overall model, as is evidenced in the

Kaplan-Meier curve (Figure 1.1, Panel A) by the individual drops in the curve. Here, the

contribution of each death is 0.10. Panel B is inclusive of censored observations. The individual

contribution that each death adds to the overall model is equivalent to that seen in Panel A (0.10)

until there is a censored observation at t=4. When deaths occur after censored observations, we observe larger reductions in $\hat{S}(t)$, increasingly so as more observations are censored: 0.117 at t=5, after one censored observation; 0.145 at t=7, after two censored observations; and 0.438 at t=10, after four censored observations. These drops in the survivor function are illustrated in Figure 1.1, Panel B.
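The drops quoted for Panel B can be reproduced by hand under the equal-risk assumption. The worked arithmetic below is a sketch; the symbol $m$, denoting the probability mass carried by each individual still at risk, is notation introduced here for illustration and does not appear elsewhere in this work. Each time one individual is censored and $r$ individuals remain, the censored individual's mass is shared equally, so each remaining individual's mass is multiplied by $(1 + 1/r)$.

\begin{align*}
\text{before } t=4: \quad & m = \tfrac{1}{10} = 0.1000 \\
\text{censoring at } t=4 \ (r=6): \quad & m = \tfrac{1}{10}\left(1+\tfrac{1}{6}\right) = \tfrac{7}{60} \approx 0.1167 \\
\text{censoring at } t=6 \ (r=4): \quad & m = \tfrac{7}{60}\left(1+\tfrac{1}{4}\right) = \tfrac{7}{48} \approx 0.1458 \\
\text{censoring at } t=8 \ (r=2): \quad & m = \tfrac{7}{48}\left(1+\tfrac{1}{2}\right) = \tfrac{7}{32} \approx 0.2188 \\
\text{censoring at } t=9 \ (r=1): \quad & m = \tfrac{7}{32}\left(1+\tfrac{1}{1}\right) = \tfrac{7}{16} = 0.4375
\end{align*}

The death at t=5 therefore removes about 0.117 from the survivor function, the death at t=7 removes about 0.146 (reported as 0.145 in Table 1.2, a difference that appears to reflect rounding of intermediate values), and the death at t=10 removes the remaining 0.4375.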

We also note how censored observations influence descriptive survival characteristics of the cohort.

The estimated median survival time (time at which half the cohort remains alive) for the cohort

without censoring is t=5 with a corresponding survival probability of 0.50, while in the cohort with

censoring, the estimated median survival time is t=7 with a corresponding survival probability of

0.44.
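The median survival times quoted above can be read off the survivor function programmatically. The short sketch below assumes the common convention that the median is the earliest time at which the estimated survival probability is at or below 0.50. The function name `median_survival` is hypothetical, and the curves are the rounded $\hat{S}(t)$ values from Table 1.2 at each death time.

```python
def median_survival(curve):
    """Return the first time at which estimated survival is <= 0.50.

    curve : list of (time, survival_probability) pairs in increasing time order
    Returns None if the curve never reaches 0.50 (median not estimable).
    """
    for t, s in curve:
        if s <= 0.50:
            return t, s
    return None


# Survivor function values taken from Table 1.2 (rounded as reported there).
panel_a = [(1, 0.90), (2, 0.80), (3, 0.70), (4, 0.60), (5, 0.50),
           (6, 0.40), (7, 0.30), (8, 0.20), (9, 0.10), (10, 0.00)]
panel_b = [(1, 0.90), (2, 0.80), (3, 0.70), (5, 0.58), (7, 0.44), (10, 0.00)]

print("No censoring:  ", median_survival(panel_a))
print("With censoring:", median_survival(panel_b))
```

Because the curve in Panel B only changes at death times, it steps from 0.58 directly to 0.44 and never equals 0.50 exactly, which is why the reported median corresponds to a survival probability below one half.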

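The survival function is tied closely to the hazard function through:

\[ h(t) = \frac{f(t)}{S(t)}, \]

where h(t) is the hazard function defined earlier and f(t) is the probability density function of the failure time, that is, the derivative of the cumulative failure probability F(t) = 1 - S(t).19 The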

non-informative censoring assumption may not be upheld if censored observations have higher or

lower risk of failure than the individuals remaining in the cohort.21 If this occurs, distributing the censored observations’ contribution equally across the remaining individuals will distort the estimated survival probabilities.19 Thus, because censored observations influence the survivor function, it is crucial to check for the presence of informative censoring.
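For reference, the link between the hazard and survivor functions can be written out using standard identities from survival analysis. Here $f(t)$ denotes the probability density of the failure time and $F(t) = 1 - S(t)$ the cumulative probability of failure by time $t$; these are textbook relationships rather than results specific to this work.

\begin{align*}
f(t) &= \frac{d}{dt}F(t) = -\frac{d}{dt}S(t), \\
h(t) &= \frac{f(t)}{S(t)} = -\frac{d}{dt}\log S(t), \\
S(t) &= \exp\left(-\int_0^t h(u)\,du\right).
\end{align*}

Because $S(t)$ is completely determined by $h(t)$, any distortion that informative censoring introduces into the estimated survivor function carries over to hazard-based quantities such as the hazard ratios estimated by the Cox model.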

No definitive test exists to detect informative censoring. However, there are several approaches

described for identifying trends that may indicate whether this assumption is being violated. The

first includes plotting observed survival times against explanatory variables, distinguishing

censored and uncensored observations from one another. Patterns of censored observations may

imply that informative censoring is present. The second approach involves using a logistic

regression model to examine the relationship between explanatory variables and the probability of

being censored. Large changes in the deviance when particular explanatory variables are included

in the model may indicate the presence of informative censoring. The third approach includes

examining the sensitivity of the assumptions about what happens to censored observations after the

time of being censored through two analyses. The first assumes that all individuals who are

censored are at high risk of failure and fail immediately after the censor time. The second assumes

that all individuals who are censored are at low risk of failure and survive at least as long as the

longest survival time in the cohort. If results from these two sensitivity analyses differ from the results of the original analysis, this may imply that results are sensitive to the presence of informative censoring.19 The third approach is used in this research.
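The two sensitivity analyses in the third approach amount to recoding the censored observations before refitting the model. The sketch below shows one possible coding, under stated assumptions: "fail immediately after the censor time" is approximated by an event at the censor time itself, and "survive at least as long as the longest survival time" is coded as remaining censored at the largest observed time. The function names `worst_case` and `best_case` and the example data are hypothetical.

```python
def worst_case(times, events):
    """High-risk scenario: every censored individual is treated as failing
    at the censor time (an approximation to 'immediately after')."""
    return list(times), [1 for _ in events]


def best_case(times, events):
    """Low-risk scenario: every censored individual is assumed to survive
    to the longest observed time in the cohort and remains censored there."""
    t_max = max(times)
    new_times = [t if e == 1 else t_max for t, e in zip(times, events)]
    return new_times, list(events)


# Illustrative data (an assumption for this sketch): 1 = death, 0 = censored.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1]

print("worst case:", worst_case(times, events))
print("best case: ", best_case(times, events))
```

Each recoded data set would then be analyzed with the same model as the original data, and the three sets of estimates compared. Large differences between them would suggest that conclusions depend on what is assumed about censored individuals.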

While literature acknowledges that failure to account for informative censoring leads to biased

estimates, the majority is written in the context of randomized trials. It is common for patients to


drop out of randomized trials, with reasons for dropping out being directly related to risk of failure.

For example, patients who drop out may be less compliant or tagged as more severe cases, leaving

them at high risk of failure after dropping out. Conversely, patients at low risk of failure may

drop out because they are healthier and do not need continued care or monitoring. Despite an

abundance of literature, there are no universally accepted methods applied to data to account for the

presence of informative censoring due to nuances in data across studies and content areas. The most

common methods discussed are complete case analysis22,23, multiple imputation24-28, inverse

probability censor weighting29-31, competing risks32, sensitivity analyses24, and redefining endpoints

or describing reasons for censoring.

The most extreme method, which potentially introduces the most bias, is complete case analysis, in

which only uncensored observations are included in analysis.22,23 However, if censored observations are not missing completely at random, this approach will produce biased estimates.

Additionally, this results in a loss in efficiency due to a substantial reduction in sample size when

removing all censored observations.23

Multiple imputation is a common statistical method that has been proposed across many fields for

handling missing data25,26 and, specifically, informative censoring.27 Multiple imputation makes

assumptions regarding the missing data that cannot be verified from the observed data, including

that data are missing at random.33 This method can be used to impute failure times for censored

observations from the entire sample of observed failure times remaining after the censor time.27 An

R package, called InformativeCensoring, has even been developed to simplify two different

methods using multiple imputation for informative censoring.27,28


Inverse probability censor weighting is also proposed as a solution for handling censoring problems

in survival data. The underlying idea is to weight non-censored observations to account for the

probability of censored observations remaining in the study. These weights are estimated as

functions of observed outcomes prior to censoring and of patient characteristics thought to predict

censoring.29,30 Depending on the reasons for censoring, the treatment group or covariate patterns

may not suffice in explaining risks of being censored.31,34
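The weighting idea can be illustrated with a deliberately simplified sketch. The version below is an assumption made for illustration and is not the covariate-adjusted weighting described in the cited literature: it estimates the probability of remaining uncensored, $G(t)$, with a Kaplan-Meier estimator in which censoring is treated as the event, and gives each observed death the weight $1/G(t^-)$. No patient characteristics are used, and the function names are hypothetical.

```python
def censoring_survival_before(times, events):
    """Kaplan-Meier estimate of G(t-), the probability of still being
    uncensored just before each observed time (censoring is the 'event')."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    g = 1.0
    g_before = {}
    i = 0
    while i < len(data):
        t = data[i][0]
        g_before[t] = g
        censored = 0
        removed = 0
        while i < len(data) and data[i][0] == t:
            censored += 1 - data[i][1]
            removed += 1
            i += 1
        deaths = removed - censored
        # Convention assumed here: deaths at t occur before censorings at t.
        at_risk_for_censoring = n_at_risk - deaths
        if censored > 0:
            g *= (at_risk_for_censoring - censored) / at_risk_for_censoring
        n_at_risk -= removed
    return g_before


def ipcw_weights(times, events):
    """Weight 1/G(t-) for each observed death; censored observations get 0."""
    g_before = censoring_survival_before(times, events)
    return [1.0 / g_before[t] if e == 1 else 0.0 for t, e in zip(times, events)]


# Illustrative data (an assumption for this sketch): 1 = death, 0 = censored.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1]
weights = ipcw_weights(times, events)
for t, e, w in zip(times, events, weights):
    print(f"t={t:2d}  event={e}  weight={w:.4f}")
print("sum of weights:", round(sum(weights), 4))
```

In this covariate-free form the weights sum to the original sample size, so the uncensored individuals stand in for the whole cohort. The approach only corrects bias if the weights capture why individuals are censored, which is the limitation noted above.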

The use of competing risks regression is suggested in certain scenarios when informative censoring

is present. However, understanding the distinction between a competing risk and a censored

observation is essential. A competing risk is an event that a patient experiences other than the event

of interest which modifies the probability -- or completely precludes the occurrence -- of the event

of interest. Censoring, on the other hand, refers to an inability to observe the time at which an event

occurs.32

In the event that the above referenced methods are not appropriate to use, the literature provides

numerous other recommendations for handling informative censoring. One recommendation

includes developing alternate endpoint definitions, such as for individuals who are assumed at high

risk of failure immediately after censoring.21,35 Similarly, if the alternate endpoint changes the risk

of failure, it can be considered a competing risk.21 If the presence of informative censoring is

unavoidable or risk after censoring cannot be detected, presentation of sensitivity analyses

encompassing different scenarios of assumptions, such as best- and worst-case scenarios, followed

by a discussion of consistencies and discrepancies may suffice to quantify the effect that

informative censoring has on the analysis.21,24 Reporting reasons for missing survival data past the

censor time by exposure group, if possible, may elucidate whether individuals are at a similar risk

of failure as those remaining in the cohort.24 Instead of waiting until the analysis phase, better data


collection and retention methods can be integrated at the design stage to reduce censoring prior to

the end of the study.

Application of Cox Proportional Hazards Models to MDR-TB Cohort Data

The analysis of MDR-TB cohorts has not advanced in decades. There are three main ways in which

MDR-TB cohorts are analyzed: simply described by frequencies, logistic regression, or Cox

proportional hazards models. When Cox models are applied to MDR-TB cohorts, patients enter the

cohort at the time of treatment initiation and are followed until the outcome of interest (this event

will be death moving forward) or one of the other five non-death treatment outcome definitions is

met.13 With patients only followed until the initial treatment outcome and not for longer survival,

observations are censored at the time of any non-death treatment outcome, irrespective of whether a

successful or unsuccessful non-death outcome is experienced.

As discussed earlier, under the non-informative censoring assumption of the Cox model, all

censored observations are assumed to be at equal risk of failure as those remaining and at risk in the

cohort after the censor time. However, literature suggests that observations censored due to

experiencing a successful non-death treatment outcome have a different risk of death compared to

those experiencing an unsuccessful non-death treatment outcome.12,36-41 Knowing why observations

are censored can greatly influence estimated survival probabilities. Table 1.3 reports on 1,174

individuals, of whom 1,051 (89.5 percent) experienced a successful initial treatment outcome and

123 (10.5 percent) experienced an unsuccessful non-death treatment outcome. Of those who

experienced a successful outcome, 95.5 percent remain alive at the end of the defined follow-up

period, 4.2 percent died, and 0.3 percent are lost to follow-up. Of those experiencing an

unsuccessful non-death initial treatment outcome, 41.5 percent remain alive at the end of the


follow-up period, 51.2 percent died, and 7.3 percent are lost to follow-up. This literature suggests that people who experience different initial treatment outcomes do not have an equal risk of death after the

time at which the outcome is observed12,36-41, potentially violating the non-informative censoring

assumption.
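The pooled percentages quoted above follow directly from the study-level counts in Table 1.3. The short sketch below re-derives them; the counts are transcribed from Table 1.3, while the variable names and the grouping of cure and treatment completion as "successful" and of default and treatment failure as "unsuccessful" are written out explicitly here for illustration.

```python
# (study, initial outcome, alive, died, lost to follow-up), from Table 1.3.
successful = [
    ("Shin 2006", "Cure", 83, 3, 0),
    ("Gelmanova 2015", "Cure/Treatment Completion", 345, 19, 0),
    ("Migliori 2002", "Cure/Treatment Completion", 17, 1, 3),
    ("Kwak 2016", "Cure", 144, 6, 0),
    ("Kwak 2016", "Treatment Completion", 26, 2, 0),
    ("Becerra 2010", "Cure", 389, 13, 0),
]
unsuccessful = [
    ("Shin 2006", "Default", 2, 5, 2),
    ("Shin 2006", "Failed Treatment", 0, 1, 0),
    ("Gelmanova 2012", "Failed Treatment", 4, 25, 7),
    ("Kwak 2016", "Failed Treatment", 14, 6, 0),
    ("Kwak 2016", "Default", 9, 1, 0),
    ("Franke 2008", "Default", 22, 25, 0),
]


def pool(rows):
    """Sum counts across studies and express each as a percentage of the total."""
    alive = sum(r[2] for r in rows)
    died = sum(r[3] for r in rows)
    lost = sum(r[4] for r in rows)
    total = alive + died + lost
    percents = tuple(round(100 * x / total, 1) for x in (alive, died, lost))
    return total, (alive, died, lost), percents


for label, rows in (("Successful", successful), ("Unsuccessful", unsuccessful)):
    total, counts, percents = pool(rows)
    print(f"{label}: n={total}, alive/died/lost={counts}, percent={percents}")
```

The gap between the two pooled death percentages (4.2 versus 51.2 percent) is the empirical basis for questioning the equal-risk assumption in these cohorts.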

Although informative censoring has been studied comprehensively, methods for reporting on

censored observations, identifying the presence of informative censoring, and handling it in

analyses have not made their way into the field of MDR-TB cohort analyses. While many MDR-TB

studies use Cox proportional hazards models and define the censoring indicator, the majority do not

report checking for informative censoring. In the rare few that do, methods used to check for

informative censoring are not clear or thorough. Methods reported include checking only one

explanatory variable for an association with the probability of defaulting on treatment (one of five

non-death treatment outcomes that are censored)42 and adjusting for variables that had previously

been associated with mortality.43 Another just noted that “non-informative censoring was

performed” without describing what methods or tests were used.44

Finding methods to identify and handle the presence of informative censoring in MDR-TB cohort

analyses may reduce bias in treatment effect estimates. Standard methods used in trials to handle

informative censoring may not be appropriate in the context of MDR-TB cohorts. In settings other

than a controlled trial, data capture and patient monitoring are much less intensive and there may be

many more reasons for stopping treatment or not being followed until the outcome of interest is

observed. Additionally, MDR-TB treatment outcome definitions are unique in that there are six

mutually exclusive outcomes and recommendations suggest including patients in cohorts from the

time of treatment initiation until the first outcome definition is met.13 This translates into all

individuals who do not die being censored at the time they experience a non-death treatment


outcome. This complicates several of the methods that are used in other areas to handle informative

censoring. Standard multiple imputation may not be applicable because data are not missing at

random and the distribution from which the failure times are imputed would not introduce the

appropriate random error to get unbiased estimates of the treatment effect. Imputing failure times

from observed outcomes for patients who were the sickest in the cohort (those who died) will

likely overestimate death in the cohort, biasing the treatment effect towards the null hypothesis.

Similarly, inverse probability censor weighting may not be appropriate for use in MDR-TB cohorts

because if higher weights are provided to subjects who are not censored, all patients who die will be

weighted the most, despite not being representative of the entire MDR-TB cohort. It is also not

appropriate to consider non-death treatment outcomes as competing risks because they do not

modify the probability of dying; death is simply not observed.

The overarching goal of this research is to identify techniques to handle informative censoring that

do not profoundly disrupt the current methods most frequently used to analyze MDR-TB cohorts.

For health systems that want to analyze their own programmatic data, methods need to be simple,

easy to interpret, and comparable across populations. Additionally, we aim to use these methods to

demonstrate the magnitude and direction of bias that informative censoring may introduce in

standard MDR-TB cohort analyses.


Problem

Application of the Cox proportional hazards model to cohorts lacking long-term survival data may

introduce bias. One of the underlying assumptions of the Cox model is non-informative censoring.

When Cox models are applied to MDR-TB cohorts, the non-informative censoring assumption of

the model may be violated due to multiple treatment outcome definitions and limited resources to

follow patients past the initial treatment outcome. Patients who experience any non-death treatment outcome are therefore censored at the time of the outcome and assumed to be at the same risk of failure as the individuals still at risk in the cohort. However, literature suggests

that they are not all at equal risk of failure. In fact, MDR-TB patients experiencing successful

treatment outcomes are at very low risk of death (4.2 percent) compared to patients experiencing

unsuccessful treatment outcomes (51.2 percent).12,36-41 Utilizing the Cox model when the non-

informative censoring assumption is violated may result in biased estimates for the effects of TB

therapies. Underestimating the effect of potentially effective treatments due to limited long-term survival data can hinder the adoption of optimally effective drugs and regimens into practice, while

overestimating their effectiveness can lead to non-effective treatments being utilized, potentially

increasing TB-related morbidity, mortality, or development of resistance. Additionally, bias of

treatment effect estimates can lead to an inability to accurately identify sub-groups that are at the

highest risk of death, or those that would benefit most from specific treatments. Methods to identify

and minimize these biases should be used when analyzing MDR-TB cohorts.


Significance

With limited MDR-TB cohort data available to analyze and an increasing need to identify effective

MDR-TB drugs and treatment regimens, it is crucial to understand how treatment effect estimates

may be biased if the non-informative censoring assumption of the Cox proportional hazards model

is violated. This research aims to improve the performance of Cox proportional hazards models

when they are applied to studies where individuals cannot be followed beyond the first observed

treatment outcome, such as in MDR-TB cohorts. Many MDR-TB cohort study analyses are

completed by researchers who have adopted methods from previous analyses without closely

assessing the appropriateness of the methods being used or how the model’s underlying

assumptions, such as non-informative censoring, may bias results. By investigating different

assumptions about how censored observations are handled, the estimation of effect sizes can be

made more accurate. Correcting for current biases may lead to stronger estimated effect sizes that

can contribute to existing evidence for (or against) the use of certain treatments in specific

populations. These increased effect sizes can lead to increased power, which will then require

smaller sample sizes, making MDR-TB cohort studies more efficient. Identifying and implementing

approaches to detect these biases, as well as to characterize their magnitude and direction, may lead

to more appropriate and less-biased estimates of treatment effects informing programmatic and

clinical guidelines for MDR-TB treatment. This work aims to specifically provide MDR-TB

researchers with focused strategies to appropriately apply the Cox proportional hazards model to

cohort data with limited interruption to current analysis techniques. To the best of our knowledge,

nobody has examined these issues specifically in MDR-TB cohort analyses, nor widely

disseminated ways to address these issues for similar cohorts. This research will contribute to the

literature, fill an important gap that is currently overlooked, and, hopefully, lead to better reporting


of methods to handle informative censoring in MDR-TB studies and future research to continue

refining methods.


Dissertation Overview

We strive to develop adaptations to current methods to reduce the impact that violation of model

assumptions has on biasing MDR-TB treatment effect estimates. This will be accomplished through

the following aims:

Aim 1: Adjustments for informative censoring in Cox proportional hazards models: application to a

multidrug-resistant tuberculosis cohort

We assess whether use of the conventional non-informative censoring assumption of the Cox

proportional hazards model produces biased treatment effect estimates when used to analyze studies

with informative censoring present, using an MDR-TB cohort study as an exemplar. We propose a

simple-to-implement alternative censoring technique informed by the literature to reduce the impact

of informative censoring and more accurately estimate survival in the presence of such censoring.

Aim 2: Use of predicted vital status to improve the analysis of multidrug-resistant tuberculosis

cohorts

We derive and validate a tool to predict vital status at the end of a study period and assess

whether estimated MDR-TB treatment effects are less biased when predicted vital status is

incorporated in Cox proportional hazards models. Using initial treatment outcomes to inform

estimates of the vital status at the end of the study period can provide useful information when

modelling long-term survival. We anticipate that integration of the predicted end-of-study vital

status will produce stronger and less biased treatment effect estimates.


Aim 3: Estimates of bias from informative censoring in multidrug-resistant tuberculosis cohort

analyses: a simulation study

We evaluate the performance of different censoring techniques on simulated data sets to select a

method that produces less biased effect estimates, more accurately estimating true treatment effects.

We anticipate that the standard censoring technique will bias treatment effect estimates toward the

null hypothesis and that models differentiating between risk of death for individuals experiencing

successful versus unsuccessful non-death treatment outcomes will produce the least biased

treatment effect estimates.

Two data sets are used to complete these aims: one cohort consisting of MDR-TB patients treated in

Lima, Peru (1999-2002) and a second cohort consisting of MDR-TB patients treated in Tomsk,

Russia (2000-2004). Secondary analysis of both data sets was reviewed and declared exempt by the

Institutional Review Board at Northeastern University.

Aims 1, 2, and 3 are presented in Chapters 2, 3, and 4, respectively. Chapter 5 discusses overall conclusions from the research and future work.


References

1. World Health Organization. Tuberculosis fact sheet. Geneva, Switzerland: WHO. 2017.

2. Centers for Disease Control and Prevention. Basic tuberculosis facts, updated 2016. Atlanta,

Georgia, USA: CDC. 2016.

3. Houben RMGJ, Dodd PJ. The global burden of latent tuberculosis infection: a re-estimation

using mathematical modelling. PLoS Medicine. 2016; 13(10): e1002152.

4. Steingart K, Ng V, Henry M, et al. Sputum processing methods to improve the sensitivity of

tuberculosis: a systematic review. Lancet Infectious Diseases. 2006; 6: 664-74.

5. World Health Organization on behalf of the Special Programme for Research and Training in

Tropical Diseases. Diagnostics for tuberculosis: global demand and market potential. Geneva,

Switzerland: WHO. 2006.

6. Centers for Disease Control and Prevention, National Center for HIV/AIDS, Viral Hepatitis,

STD, and TB Prevention, Division of Tuberculosis Elimination. Updated guidelines for the use

of nucleic acid amplification tests in the diagnosis of tuberculosis. Morbidity and Mortality

Weekly Report. 2009; 28(1).

7. Styblo K. Epidemiology of tuberculosis; selected paper number 24. The Hague: Royal

Netherlands Tuberculosis Association. 1991.

8. World Health Organization. Guidelines for treatment of drug-susceptible tuberculosis and

patient care, 2017 update. Geneva, Switzerland: WHO. 2017.

9. World Health Organization. Global Tuberculosis Report 2017. Geneva, Switzerland: WHO.

2017.

10. Centers for Disease Control and Prevention. Multidrug-resistant tuberculosis fact sheet. Atlanta,

Georgia, USA: CDC. 2014.

11. World Health Organization. WHO treatment guidelines for drug-resistant tuberculosis, 2016

update. Geneva, Switzerland: WHO. 2016.

12. Gelmanova IY, Zemlyanaya NA, Andreev E, Yanova G, Keshavjee S. Case fatality among

patients who failed multidrug-resistant tuberculosis treatment in Tomsk, Russia. International

Union Against Tuberculosis and Lung Diseases, poster presentation; Malaysia 2012.

13. Laserson KF, Thorpe LE, Leimane V, et al. Speaking the same language: treatment outcome

definitions for multidrug-resistant tuberculosis. International Journal of Tuberculosis and Lung

Disease. 2005; 9(6): 640-5.

14. World Health Organization. Definitions and reporting framework for tuberculosis – 2013

revision. Geneva, Switzerland: WHO. 2013.


15. Kleinbaum DG, Klein M. Survival Analysis: A Self-Learning Text, third edition. Springer, New

York, 2012.

16. Lagakos SW. General right censoring and its impact on the analysis of survival data.

Biometrics. 1979; 35(1): 139-56.

17. Kaplan E, Meier P. Nonparametric estimation from incomplete observations. Journal of the

American Statistical Association. 1958; 53(282): 457-81.

18. Cox D. Regression Models and Life-Tables. Journal of the Royal Statistical Society Series B

(Methodological). 1972; 34(2).

19. Collett D. Modelling Survival Data in Medical Research. Second Edition. 2003.

20. Efron B. The Efficiency of Cox's Likelihood Function for Censored Data. Journal of the


21. Campigotto F, Weller E. Impact of informative censoring on the Kaplan-Meier estimate of

progression-free survival in phase II clinical trials. Journal of Clinical Oncology. 2014; 32(27):

3068-74.

22. Little RJA, Rubin DB. Statistical analysis with missing data. New York: Wiley. 1987.

23. Leung KM, Elashoff RM, Afifi AA. Censoring issues in survival analysis. Annual Reviews in

Public Health. 1997; 18: 83-104.

24. Shih W. Problems in dealing with missing data and informative censoring in clinical

trials. Current Controlled Trials in Cardiovascular Medicine. 2002; 3(1): 4.

25. Schafer JL, Graham JW. Missing data: our view of the state of the art. Psychological Methods.

2002; 7(2): 147–77.

26. Mackinnon A. The use and reporting of multiple imputation in medical research - a review.

Journal of Internal Medicine. 2010; 268(6): 586–93.

27. Jackson D, White IR, Seaman S, Evans H, Baisley K, Carpenter J. Relaxing the independent

censoring assumption in the Cox proportional hazards model using multiple imputation.

Statistics in Medicine. 2014; 33(27): 4681-94.

28. Hsu CH, Taylor JMG. Nonparametric comparison of two survival functions with dependent

censoring via nonparametric multiple imputation. Statistics in Medicine. 2009; 28(3): 462-75.

29. Robins JM, Finkelstein DM. Correcting for non-compliance and dependent censoring in an

AIDS clinical trial with inverse probability of censoring weighted (IPCW) log-rank tests.

Biometrics. 2000; 56: 779-88.


30. Scharfstein DO, Robins JM. Estimation of the failure time distribution in the presence of

informative censoring. Biometrika. 2002; 89: 617-34.

31. Howe CJ, Cole SR, Chmiel JS, Munoz A. Limitation of inverse probability-of-censoring

weights in estimating survival in the presence of strong selection bias. American Journal of

Epidemiology. 2011; 173(5): 569-77.

32. Donoghoe MW, Gebski V. The importance of censoring in competing risk analysis of the

subdistribution hazard. BMC Medical Research Methodology. 2017; 17: 52.

33. National Research Council. Panel on handling missing data in clinical trials, committee on

national statistics, division of behavioral and social sciences and education: The prevention and

treatment of missing data in clinical trials. Washington, D.C: National Academies Press; 2010.

34. Yoshida M, Matsuyama Y, Ohashi Y. Estimation of treatment effect adjusting for dependent

censoring using the IPCW method: an application to a large primary prevention study for

coronary events (MEGA study). Clinical Trials. 2007; 4: 318-28.

35. Gorouhi F, Khatami A, Davari P. A neglected issue in interpretation of results of randomized

controlled trials: Informative censoring. Dermatology Online Journal. 2009; 15(1): 13.

36. Shin SS, Furin JJ, Alcantara F, Bayona J, Sanchez E, Mitnick CD. Long-term follow-up for

multidrug-resistant tuberculosis. Emerging Infectious Diseases. 2006; 12(4): 687-8.

37. Gelmanova IY, Khan FA, Becerra MC, et al. Low rates of recurrence after successful treatment

of multidrug-resistant tuberculosis in Tomsk, Russia. International Journal of Tuberculosis and

Lung Disease. 2015; 19(4): 399-405.

38. Migliori GB, Espinal M, Danilova ID, Punga VV, Grzemska M, Raviglione MC. Frequency of

recurrence among MDR-TB cases 'successfully' treated with standardised short-course

chemotherapy. International Journal of Tuberculosis and Lung Disease. 2002; 6(10): 858-64.

39. Kwak N, Yoo CG, Kim YW, Han SK, Yim JJ. Long-term survival of patients with multidrug-

resistant tuberculosis according to treatment outcomes. American Journal of Infection Control.

2016; 44(7): 843-5.

40. Franke MF, Appleton SC, Bayona J, et al. Risk factors and mortality associated with default

from multidrug-resistant tuberculosis treatment. Clinical Infectious Diseases. 2008; 46(12):

1844-51.

41. Becerra MC, Appleton SC, Franke MF, et al. Recurrence after treatment for pulmonary

multidrug-resistant tuberculosis. Clinical Infectious Diseases. 2010; 51(6): 709-11.

42. Mitnick CD, Franke MF, Rich ML, et al. Aggressive regimens for multidrug-resistant

tuberculosis decrease all-cause mortality. PLoS One. 2013; 8(3): e58664.


43. Tierney DB, Franke MF, Becerra MC, et al. Time to culture conversion and regimen

composition in multidrug-resistant tuberculosis treatment. PLoS One. 2014; 9(9): e108035.

44. Pepper DJ, Marais S, Wilkinson RJ, et al. Clinical deterioration during antituberculosis

treatment in Africa: incidence, causes and risk factors. BMC Infectious Diseases. 2010; 10: 83.


Chapter 1 Tables

Table 1.1. MDR-TB treatment outcome definitions13

Treatment Outcome      Defined as an MDR-TB patient who has:

Cure                   Completed treatment according to country protocol AND
                       a) Consistently culture-negative (≥5 results) for final 12 months of
                          treatment, OR
                       b) If only one positive culture is reported during that time, with no clinical
                          evidence of deterioration, and positive culture is followed by ≥3
                          consecutive negative cultures taken at least 30 days apart.

Treatment Completed    Completed treatment according to country protocol BUT does not meet the
                       definition for cure or treatment failure due to lack of bacteriologic results
                       (i.e., <5 cultures performed in final 12 months of therapy).

Death                  Died for any reason during the course of MDR-TB treatment.

Treatment Default      MDR-TB treatment interrupted for ≥2 consecutive months for any reason.

Treatment Failure      a) ≥2 of the 5 cultures recorded in the final 12 months of treatment are
                          positive, OR
                       b) Any 1 of the final 3 cultures is positive, OR
                       c) If a clinical decision has been made to terminate treatment early due to
                          poor response or adverse events.

Transfer Out           Transferred to another reporting and recording unit and for whom the
                       treatment outcome is unknown.


Table 1.2. Effect of censoring on survival probability estimates

Panel A. Product Limit Survival Estimates; No Censoring (N=10)

tj    dj    cj    nj    (nj - dj)/nj       Ŝ(t)    Model Contribution (1/N)
1     1     0     10    (10-1)/10=0.90     0.90    0.100
2     1     0     9     (9-1)/9=0.89       0.80    0.100
3     1     0     8     (8-1)/8=0.88       0.70    0.100
4     1     0     7     (7-1)/7=0.86       0.60    0.100
5     1     0     6     (6-1)/6=0.83       0.50    0.100
6     1     0     5     (5-1)/5=0.80       0.40    0.100
7     1     0     4     (4-1)/4=0.75       0.30    0.100
8     1     0     3     (3-1)/3=0.67       0.20    0.100
9     1     0     2     (2-1)/2=0.50       0.10    0.100
10    1     0     1     (1-1)/1=0.00       0.00    0.100

Panel B. Product Limit Survival Estimates; Censoring (N=10)

tj    dj    cj    nj    (nj - dj)/nj       Ŝ(t)    Model Contribution (1/N)
1     1     0     10    (10-1)/10=0.90     0.90    0.100
2     1     0     9     (9-1)/9=0.89       0.80    0.100
3     1     0     8     (8-1)/8=0.88       0.70    0.100
4     0     1     7     (7-0)/7=1.00       0.70    0.000
5     1     0     6     (6-1)/6=0.83       0.58    0.117
6     0     1     5     (5-0)/5=1.00       0.58    0.000
7     1     0     4     (4-1)/4=0.75       0.44    0.145
8     0     1     3     (3-0)/3=1.00       0.44    0.000
9     0     1     2     (2-0)/2=1.00       0.44    0.000
10    1     0     1     (1-1)/1=0.00       0.00    0.438

N is the total number of individuals in the sample.

Ŝ(t) is an estimate of the probability that a patient will survive beyond a specified time.

j represents each individual failure time.

tj is the time at which a failure takes place.

dj is the number of individuals who die at time tj.

cj is the number of individuals who are censored at time tj.

nj is the number of individuals alive just before time tj.


Table 1.3. Long-term survival after initial treatment outcome

                                                                                        Long-term Outcome (N, %)
Cohort                        Follow-up (months)          Initial Treatment Outcome (N)      Alive          Died          Lost to Follow-up

Shin, Peru (2006)36           Median (range) 46 (3-84)    Cure (86)                          83, 96.5%      3, 3.5%       0, 0.0%
                                                          Default (9)                        2, 22.2%       5, 55.6%      2, 22.2%
                                                          Failed Treatment (1)               0, 0.0%        1, 100.0%     0, 0.0%

Gelmanova, Russia (2015)37    12                          Cure/Treatment Completion (364)    345, 94.8%     19, 5.2%      0, 0.0%

Gelmanova, Russia (2012)12    60                          Failed Treatment (36)              4, 11.1%       25, 69.4%     7, 19.4%

Migliori, Russia (2002)38     Median (range) 6.5 (1-37)   Cure/Treatment Completion (21)     17, 81.0%      1, 4.8%       3, 14.3%

Kwak, Korea (2016)39          Not explicitly reported     Cure (150)                         144, 96.0%     6, 4.0%       0, 0.0%
                                                          Treatment Completion (28)          26, 92.9%      2, 7.1%       0, 0.0%
                                                          Failed Treatment (20)              14, 70.0%      6, 30.0%      0, 0.0%
                                                          Default (10)                       9, 90.0%       1, 10.0%      0, 0.0%

Franke, Peru (2008)40         Median (range) 35 (25-43)   Default (47)                       22, 46.8%      25, 53.2%     0, 0.0%

Becerra, Peru (2010)41        Not explicitly reported     Cure (402)                         389, 96.8%     13, 3.2%      0, 0.0%

Total Successful Outcomes (n=1051)                                                           1004, 95.5%    44, 4.2%      3, 0.3%

Total Unsuccessful Outcomes (n=123)                                                          51, 41.5%      63, 51.2%     9, 7.3%


Chapter 1 Figures

Figure 1.1. Kaplan-Meier product limit survival estimates


CHAPTER 2: ADJUSTMENTS FOR INFORMATIVE CENSORING IN COX PROPORTIONAL HAZARDS MODELS: APPLICATION TO A MULTIDRUG-RESISTANT TUBERCULOSIS COHORT

Authors: Meredith B Brooks1,2, Carole D Mitnick2,3, Justin Manjourides1

Institutional Affiliations:

1. Department of Health Sciences, Northeastern University, Boston, MA

2. Department of Global Health and Social Medicine, Harvard Medical School, Boston, MA

3. Partners In Health, Boston, MA

Abstract

Background

Cox proportional hazards models are typically used to analyze multidrug-resistant tuberculosis

cohorts. In these cohorts, it is common practice that patients are only followed through the time of

their initial treatment outcome and not for longer survival. When the Cox model is applied to the

corresponding data, the lack of long-term follow-up data may result in violation of the non-

informative censoring assumption and subsequently bias treatment effect estimates. We

demonstrate the impact violating this assumption has on treatment effect estimates and propose a

simple-to-implement approach to reduce potential bias.

Methods

We analyze a cohort of multidrug-resistant tuberculosis patients from Lima, Peru (1999-2002) to assess how estimates of associations between an aggressive treatment regimen and death vary across different assumptions regarding censored observations. We propose a new censoring technique, informed by current literature, that assigns a differential risk of death depending on the initial treatment outcome, and we evaluate whether it better satisfies the non-informative censoring assumption of the model and reduces bias.

Results

The new censoring technique shows less evidence of informative censoring, and treatment effect

estimates are 8.7 to 16.7 percent stronger than those produced from the conventional model.

Conclusion

Assigning differential risk of death to censored multidrug-resistant tuberculosis patients based on

their initial treatment outcome more accurately reflects each individual’s long-term survival and

produces stronger effect estimates, suggesting that conventional models may underestimate

treatment effects. While our proposed method may more accurately estimate treatment effects,

further refinement of censoring approaches should be explored to reduce biases related to the

presence of informative censoring in Cox proportional hazards models, as applied to multidrug-

resistant tuberculosis cohorts.

Introduction

Cox proportional hazards (PH) regression is widely used to model time-to-event data with

incomplete follow-up, such as that observed in studies involving multidrug-resistant tuberculosis

(MDR-TB) cohorts.1 Observations are censored if the event of interest does not occur during follow

up.1 Several assumptions must be met for inferences from Cox PH models to be valid:

1) Event times are independent;

2) Hazards are proportional across levels of each covariate; and

3) Censoring is independent of the event (non-informative).2

If these assumptions are violated, results from PH models may be invalid.

Heuristically, non-informative censoring implies that failure is equally likely to occur among censored individuals and among individuals still at risk at the censoring time3 (we refer to this conventional assumption as the ‘equal-risk assumption’). Empirically, as more subjects are censored, larger reductions in the estimated survival probability are observed at each subsequent event time, because each censored observation’s survival probability is redistributed among the observations remaining at risk.1,2 However, if the censoring is informative, incomplete information on failure times may bias estimates.
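The redistribution of survival probability can be illustrated with a small product-limit calculation. The sketch below is illustrative only: the ten subjects, their follow-up times, and the function name `kaplan_meier` are hypothetical and are not drawn from the study data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate.

    times  : follow-up time for each subject
    events : 1 if the subject died at that time, 0 if censored
    Returns a list of (t_j, S(t_j), drop) for each distinct death time t_j,
    where drop is the size of the downward step in the curve at t_j.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    steps = []
    for t in sorted(deaths):
        # n_j: subjects still under observation just before t_j
        at_risk = sum(1 for u in times if u >= t)
        updated = survival * (at_risk - deaths[t]) / at_risk
        steps.append((t, updated, survival - updated))
        survival = updated
    return steps


# Hypothetical cohort: one subject leaves observation at each of t = 1..10.
times = list(range(1, 11))

# Scenario A: every subject dies.
no_censoring = kaplan_meier(times, [1] * 10)

# Scenario B: the subjects at t = 8 and t = 9 are censored instead of dying.
with_censoring = kaplan_meier(times, [1, 1, 1, 1, 1, 1, 1, 0, 0, 1])

drop_a = no_censoring[-1][2]    # final step without censoring
drop_b = with_censoring[-1][2]  # final step after two subjects were censored
```

In scenario B the final death produces a step about three times as large as in scenario A, because the two censored subjects are implicitly assumed to share the fate of those still at risk.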

To demonstrate this bias, we consider the use of Cox PH regression models to analyze MDR-TB cohort studies targeted at assessing death, where multiple outcomes combined with routine practices around patient follow-up may lead to violation of the non-informative censoring assumption. In such analyses, an individual experiencing any one of five defined non-death treatment outcomes is censored using the equal-risk assumption due to the absence of longer survival data. However, literature suggests that individuals who experience successful treatment outcomes have drastically improved long-term survival compared to those experiencing non-death unsuccessful outcomes. Across seven studies following MDR-TB patients past the initial treatment outcome, 95.5 percent of patients who experienced a successful initial treatment outcome remained alive at the end of the longer cohort period, compared to only 41.5 percent of patients who experienced an unsuccessful, non-death initial treatment outcome.4-10 This differential risk of death violates the non-informative censoring assumption: individuals experiencing successful treatment outcomes do not have the same probability of dying as those still being treated, making the equal-risk assumption inappropriate.
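The pooled percentages quoted here can be reproduced from the per-cohort counts listed in Table 1.3. The following is a minimal sketch of that arithmetic; the tuple layout and the function name `pooled_percentages` are illustrative choices, and the pooling is a simple unweighted sum of counts across studies.

```python
# Each row: (N with this initial outcome, alive, died, lost to follow-up),
# copied from the rows of Table 1.3.
successful_rows = [
    (86, 83, 3, 0),     # Shin, cure
    (364, 345, 19, 0),  # Gelmanova 2015, cure/treatment completion
    (21, 17, 1, 3),     # Migliori, cure/treatment completion
    (150, 144, 6, 0),   # Kwak, cure
    (28, 26, 2, 0),     # Kwak, treatment completion
    (402, 389, 13, 0),  # Becerra, cure
]
unsuccessful_rows = [
    (9, 2, 5, 2),       # Shin, default
    (1, 0, 1, 0),       # Shin, failed treatment
    (36, 4, 25, 7),     # Gelmanova 2012, failed treatment
    (20, 14, 6, 0),     # Kwak, failed treatment
    (10, 9, 1, 0),      # Kwak, default
    (47, 22, 25, 0),    # Franke, default
]


def pooled_percentages(rows):
    """Sum counts across cohorts and express each long-term outcome as a percent."""
    total = sum(r[0] for r in rows)
    alive = sum(r[1] for r in rows)
    died = sum(r[2] for r in rows)
    lost = sum(r[3] for r in rows)
    percents = {
        "alive": round(100 * alive / total, 1),
        "died": round(100 * died / total, 1),
        "lost": round(100 * lost / total, 1),
    }
    return total, percents


n_success, pct_success = pooled_percentages(successful_rows)
n_unsuccess, pct_unsuccess = pooled_percentages(unsuccessful_rows)
```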

Due to its influence on model estimates, it is crucial to identify informative censoring and, if

present, use appropriate methods to reduce potential bias. Here, we assess whether this equal-risk

assumption produces biased treatment effect estimates in studies with multiple non-death treatment

outcomes, using an MDR-TB cohort study as an exemplar. We propose a simple-to-implement

alternative censoring procedure that reduces the impact of informative censoring and more

accurately estimates survival in the presence of such censoring.

Methods

Study Population

This analysis was performed on a retrospective cohort of patients who received their first treatment

for MDR-TB in Lima, Peru between 1999 and 2002. Treatment was tailored and regimens were

constructed based on each patient’s drug-susceptibility test results and prior treatment exposure.

This cohort has been reported on previously.9-14 Each patient was followed from treatment

initiation until first treatment outcome or the designated study period (the longest treatment

duration) was completed.

Exposure Variable Definitions

The primary exposure variable was the proportion of treatment time that an individual was on an

aggressive treatment regimen. An aggressive treatment regimen was defined as a regimen

containing at least five likely effective drugs based on previous treatment history and current drug

resistance pattern during the intensive phase of treatment, and at least four likely effective drugs

during the continuation phase.15 The aggressive treatment regimen has previously been shown to

reduce mortality and recurrence in this cohort.11,13 Other covariates explored include the presence

of at least one comorbidity, number of previous treatment regimens received, sex, poor nutritional

status, tachycardia, extra-pulmonary TB (EPTB), human immunodeficiency virus (HIV)

coinfection, and the number of resistant agents. Age is explored closely because adolescents in this

cohort are a healthier subgroup than their adult counterparts,16 so a larger share of adolescent

observations is censored, which is suggestive of informative censoring.2 Detailed

covariate definitions can be found in Table 2.1.
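As an illustration of the primary exposure, the sketch below computes the proportion of treatment time spent on an aggressive regimen from a list of regimen intervals. The interval representation (days, phase, number of likely effective drugs), the function names, and the example patient are hypothetical; only the thresholds of five drugs in the intensive phase and four in the continuation phase come from the definition above.

```python
def is_aggressive(phase, n_likely_effective):
    """Apply the regimen definition: >= 5 drugs (intensive) or >= 4 (continuation)."""
    if phase == "intensive":
        return n_likely_effective >= 5
    if phase == "continuation":
        return n_likely_effective >= 4
    raise ValueError(f"unknown treatment phase: {phase!r}")


def proportion_aggressive(intervals):
    """intervals: list of (days, phase, n_likely_effective) covering all treatment time."""
    total_days = sum(days for days, _, _ in intervals)
    if total_days == 0:
        raise ValueError("no treatment time recorded")
    aggressive_days = sum(
        days for days, phase, n in intervals if is_aggressive(phase, n)
    )
    return aggressive_days / total_days


# Hypothetical patient: 180 intensive-phase days on five likely effective drugs,
# then 120 continuation-phase days on three drugs and 300 days on four drugs.
example = [
    (180, "intensive", 5),
    (120, "continuation", 3),
    (300, "continuation", 4),
]
exposure = proportion_aggressive(example)  # 480 of 600 days
```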

Outcome Definition

MDR-TB treatment outcome definitions used are: cure, treatment completion, treatment failure,

treatment default, transfer out, and death.17 All analyses model time to death, and non-death

outcomes are aggregated as successful (cure, treatment completion) or unsuccessful (treatment

failure, treatment default, transfer out). Observations are censored when the first non-death outcome

definition is met.
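The outcome grouping described above amounts to a small lookup. The sketch below shows one way to encode it; the lowercase outcome labels and the function names are illustrative and do not reflect the variable coding used in the original analysis.

```python
SUCCESSFUL = {"cure", "treatment completion"}
UNSUCCESSFUL = {"treatment failure", "treatment default", "transfer out"}


def classify(outcome):
    """Map one of the six initial treatment outcomes to its analysis group."""
    if outcome == "death":
        return "death"
    if outcome in SUCCESSFUL:
        return "successful"
    if outcome in UNSUCCESSFUL:
        return "unsuccessful"
    raise ValueError(f"unrecognized outcome: {outcome!r}")


def event_indicator(outcome):
    """1 if the modeled event (death) was observed, 0 if the observation is censored."""
    return 1 if classify(outcome) == "death" else 0


groups = {o: classify(o) for o in sorted(SUCCESSFUL | UNSUCCESSFUL | {"death"})}
```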

Statistical Analysis

We describe the cohort, including demographics, comorbidities, treatment characteristics, treatment

outcomes, and time to outcomes. Characteristics are quantified by frequency and percent for

categorical variables and mean and standard deviation for continuous variables. Time to events are

presented as median and interquartile range. We describe the full cohort characteristics as well as

the breakdown of characteristics by adult and adolescent subgroups. We assess whether these

characteristics differ significantly between age groups using chi-square tests, Fisher’s exact tests,

or t-tests.
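As one worked instance of these comparisons, the sketch below computes a Pearson chi-square test for the 2x2 table of successful outcome by age group, using the counts reported in Table 2.2 (68 of 90 adolescents and 371 of 577 adults). It assumes no continuity correction, which the text does not specify, and the function name `chi_square_2x2` is hypothetical. With one degree of freedom the upper-tail probability can be written with the complementary error function, so no statistics library is needed.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the table [[a, b], [c, d]].

    Returns (statistic, p_value). For 1 degree of freedom,
    P(X > x) = erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: adolescents, adults. Columns: successful, unsuccessful outcome.
statistic, p_value = chi_square_2x2(68, 90 - 68, 371, 577 - 371)
```

Under these assumptions the p-value should land close to the 0.036 reported for successful outcomes in Table 2.2.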

Our primary analysis assesses the association between the aggressive treatment regimen and time to

death. We compare hazard ratio (HR) estimates produced by four different models applied to the

same dataset with the following specifications:

Model 1: Full cohort, controlling for age group;

Model 2: Full cohort, controlling for age group and eight covariates previously

demonstrated to be associated with death11;

Model 3: Full cohort, stratified by age group;

Model 4: Subgroup analysis of adolescents only.

Models 3 and 4 explore how a subgroup experiencing more cures, more quickly16 may impact the

estimated treatment effects through more pronounced informative censoring. The conventional

equal-risk assumption is used in all models, producing reference HR estimates to be compared to

HRs estimated under alternative censoring strategies.

First, we assess whether the non-informative censoring assumption is violated. To do this, we

present two additional analyses in which a high risk and a low risk of death for censored observations are

assumed, respectively.2 Under the ‘high-risk’ assumption, censored observations (any individual experiencing a

non-death treatment outcome) are assumed to die immediately after the censor time. Under the

‘low-risk’ assumption, all censored observations are assumed to survive through the end of the

study. Large differences in HR estimates between the high-, low-, and equal-risk assumption

models suggest biases resulting from violations of the non-informative censoring assumption.

Then, to improve estimation, we propose a ‘mixed-risk’ censoring assumption based on survival trends reported in the current literature.4-10 Simply, individuals experiencing successful treatment outcomes are assumed to survive through the end of the study period, while individuals experiencing unsuccessful non-death outcomes are considered to be at the same risk of death as those individuals remaining in the cohort and are censored using the conventional procedure. If this mixed-risk assumption more accurately describes the long-term survival of patients whose observations are censored due to experiencing a non-death treatment outcome, as the literature suggests, then effect estimates that are consistently further from the null HR of 1.0 indicate that the equal-risk assumption may underestimate true treatment effects. Kaplan-Meier curves corresponding to Model 1 are produced under all four censoring assumptions as another mechanism to observe how varying the censoring assumption can impact the estimated survival probabilities of the cohort.
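The four censoring assumptions differ only in how the follow-up time and event indicator are recoded before the model is fit. The sketch below is one possible encoding, not the SAS code used in the analysis: the function name `recode`, the outcome labels, and the choice to represent "dies immediately after the censor time" as an event at the censor time itself are assumptions. The study end of 2227 days is the longest follow-up time reported for this cohort.

```python
STUDY_END = 2227  # days; longest follow-up time reported for the cohort
SUCCESSFUL = {"cure", "treatment completion"}
UNSUCCESSFUL = {"treatment failure", "treatment default", "transfer out"}


def recode(time, outcome, assumption, study_end=STUDY_END):
    """Return (time, event) for one patient under a censoring assumption.

    event = 1 means death is counted at `time`; event = 0 means censored at `time`.
    """
    if outcome == "death":
        return time, 1                      # observed deaths are never altered
    if outcome not in SUCCESSFUL | UNSUCCESSFUL:
        raise ValueError(f"unrecognized outcome: {outcome!r}")

    if assumption == "equal":
        return time, 0                      # conventional censoring at the outcome
    if assumption == "high":
        return time, 1                      # assumed to die right after censoring
    if assumption == "low":
        return study_end, 0                 # assumed to survive to study end
    if assumption == "mixed":
        if outcome in SUCCESSFUL:
            return study_end, 0             # successes survive to study end
        return time, 0                      # unsuccessful: conventional censoring
    raise ValueError(f"unknown assumption: {assumption!r}")


# Hypothetical patient cured on day 760 and one who defaulted on day 400.
cured = {a: recode(760, "cure", a) for a in ("equal", "high", "low", "mixed")}
defaulted = {a: recode(400, "treatment default", a) for a in ("equal", "high", "low", "mixed")}
```

Because only the time and event columns change, the same model specification can be refit on each recoded dataset and the resulting HRs compared directly.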

SAS Version 9.3 (SAS Institute, Cary, NC) is used for analyses; R Version 3.4.1 is used to create

figures.

Ethics Statement

The parent study was approved by the Institutional Review Board at Harvard Medical School and

by the Ministry of Health of Peru. Secondary analysis was reviewed and declared exempt by the


Results

A total of 667 subjects are included in the analysis, 90 adolescents and 577 adults; 39.1 percent are female, and 30.0 percent have poor nutritional status. The aggressive treatment regimen is used for an average of 68.8 percent (standard deviation [SD]: 37.8 percent) of treatment time for adolescents and 55.0 percent (SD: 41.6 percent) for adults. The longest follow-up survival time reported is 2227.0 days, defining the length of the study period. The median follow-up time is 744.0 days, indicating that most subjects were censored or died well before study completion. Two-thirds of the cohort had successful treatment outcomes, while 20.7 percent died. Compared to adults, adolescents experienced a higher percentage of successful treatment outcomes (75.6 versus 64.3 percent), a shorter median time to successful outcomes (758.5 versus 778.0 days), fewer deaths (11.1 versus 22.2 percent), and a shorter median time to death (313.0 versus 356.0 days). Table 2.2 describes overall cohort characteristics and the breakdown by age group. Table 2.3 describes the time to treatment outcomes and the breakdown by age group.

Across all models fit under the equal-risk assumption, the aggressive treatment regimen and adolescence are associated with a reduced hazard of death (Table 2.4). The high-risk assumption increases HRs by 191.3 to 550.0 percent for the aggressive treatment regimen and by 50.7 to 81.3 percent for adolescence. Differences in the aggressive treatment regimen effect estimates between the equal- and low-risk assumptions are more modest, ranging from 0 to 13.0 percent, due to the relatively low occurrence of death in this population (Table 2.4).

Compared to the equal-risk method, the mixed-risk assumption yields stronger estimates for the

aggressive treatment regimen that are consistently further from the null hypothesis of no association

(see Table 2.5) by 8.7 to 16.7 percent, depending on the model used. Additionally, we observe

narrower confidence intervals across all models using the mixed-risk assumption. Similarly,

estimates for adolescence are 7.8 to 13.3 percent stronger using the mixed-risk assumption as

compared to those using the equal-risk method.
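The percentages in Tables 2.4 and 2.5 appear to be simple relative differences against the equal-risk HR. The formulas below are inferred from the tabulated values rather than stated in the text, and the function names are hypothetical; the inputs are the rounded HRs from the tables.

```python
def percent_change(reference_hr, alternative_hr):
    """Relative change of an HR versus the equal-risk reference (Table 2.4 style)."""
    return 100 * (alternative_hr - reference_hr) / reference_hr


def increase_in_effect_size(reference_hr, alternative_hr):
    """How much further below the reference the alternative HR lies (Table 2.5 style)."""
    return 100 * (reference_hr - alternative_hr) / reference_hr


# Aggressive regimen, Model 1: equal risk 0.22, high risk 0.70, mixed risk 0.20.
high_vs_equal_m1 = round(percent_change(0.22, 0.70), 2)
mixed_vs_equal_m1 = round(increase_in_effect_size(0.22, 0.20), 2)

# Aggressive regimen, Model 4 (adolescents only): equal 0.12, high 0.78, mixed 0.10.
high_vs_equal_m4 = round(percent_change(0.12, 0.78), 2)
mixed_vs_equal_m4 = round(increase_in_effect_size(0.12, 0.10), 2)
```

Because the published HRs are rounded to two decimals, values recomputed this way may differ slightly from percentages derived from unrounded estimates.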

As a result of this censoring, the Kaplan-Meier curve under the equal-risk assumption estimates that

only 60.0 percent of adolescents survived until day 1,100 despite the data indicating that 75.6

percent of adolescents had successful treatment outcomes. Further, for the death that occurred at

approximately day 1,100, a large drop in the survival curve is observed (due to distributing the

weight of all previously censored observations to the few remaining events) resulting in the survival

curves intersecting, suggesting a violation of the proportional hazards assumption. See Figure 2.1.

Additionally, although only 22.2 percent of adults died during the study, under the equal-risk

assumption, the estimated median survival time is 1,600 days (Figure 2.1, Panel A), an artifact of

more than 50 percent of adults being censored.

Discussion

While the equal-risk assumption produces estimates similar to those of the low-risk method, the large differences observed between the high- and equal-risk estimates suggest that informative censoring is present in studies conducted in this manner. The vastly different effect estimates produced by models utilizing the high-risk assumption compared to the equal-risk assumption, together with the larger proportion of censored observations at some levels of explanatory variables, such as adolescence, suggest that treating censoring as non-informative may bias results. Although the difference in hazard ratio estimates between the equal- and low-risk assumptions is modest, this is expected because the equal-risk assumption distributes the contributions of censored observations across all remaining at-risk observations following the censor time. With a low risk of death overall in the population, most individuals will follow a trajectory similar to that of observations under the low-risk assumption.

The mixed-risk approach may more appropriately represent the survival experience in settings

similar to this MDR-TB cohort. If the mixed-risk assumption more accurately reflects long-term

patient survival probabilities, then the equal-risk assumption may consistently underestimate the

true treatment effect. While assuming that individuals experiencing successful treatment outcomes

will survive until the end of study may not be true for all cases, it is closer to the truth, per current

literature4-10, than assuming they are equally likely to die as those still at-risk. Because this

assumption is more likely to hold, the mixed-risk assumption may be a better option to incorporate

into analyses than the conventional equal-risk assumption in studies that report multiple non-death

treatment outcomes and for which individuals are lacking longer survival data.

Censoring adolescent observations at the time of non-death treatment outcomes and using the

equal-risk assumption produced results (e.g. median survival time) inconsistent with the observed

data and led to crossing of survival curves, potentially violating the assumption of proportional

hazards. Survival of adolescents, based on the data, is more accurately depicted in the low- and

mixed-risk assumption Kaplan-Meier curves (Figure 2.1, Panels C and D). When the equal-risk

assumption is used, the curve yields an entirely different, and inaccurate, picture of survival in the

population.

Conclusion

Observations censored due to non-death outcomes are not equally likely to die at subsequently

observed death times; patients experiencing successful non-death treatment outcomes are at a lower

risk of death for the remainder of the study period than those experiencing unsuccessful non-death

treatment outcomes.4-10 Treating all censored observations as having equal risk of death after

experiencing an initial outcome may lead to underestimation of true treatment effects.

While none of the assumptions regarding the censoring mechanisms can be validated in this cohort

due to lack of long-term survival data, the mixed-risk assumption best reflects findings from other

studies.4-10 Operationalizing this method in analysis may result in less biased estimates of treatment

effects. The methods proposed in this work are simple to implement, as they only require recoding

of the censoring indicator.

In the absence of a validated censoring assumption, true effect estimates are unknown. It may,

therefore, be more informative to present estimates produced from a mixed-risk assumption in

addition to those produced from the equal-risk assumption to highlight the potential bias in effect

estimates. Consequences of this bias may include underestimating the benefit of an aggressive

treatment regimen, suggesting that associations identified in previous analyses when the equal-risk

assumption was employed may be conservative estimates.11,13

Logistic regression is commonly used in MDR-TB analyses and removes the burden of accounting

for survival time, subsequently removing concerns of bias being introduced due to the presence of

informative censoring. However, using logistic regression instead of Cox PH models introduces

other types of biases related to only being able to include participants with full outcome data

available. In the context of MDR-TB cohort data, for which there may not be information about

long-term patient follow-up, this may require adjustments to the outcome definition. Instead, we

suggest continued use of Cox PH models in MDR-TB cohorts, with continued exploration of more

complex censoring mechanisms to better reflect what is observed in existing literature and to

account for differential risk of death based on the non-death outcome observed.

Funding

This work was supported by: a pilot study grant (U19 OH008861) from the Harvard T.H. Chan

School of Public Health Center for Work, Health and Wellbeing, a National Institute for

Occupational Safety and Health Center of Excellence to Promote a Healthier Workforce to JM; the

Northeastern University, Department of Health Sciences, Population Health Doctoral Program to

MBB; a career development award from the National Institute of Allergy and Infectious Diseases (5

K01 A1065836) to CDM; Bill and Melinda Gates Foundation; Thomas J. White; Partners in Health;

the Peruvian Ministry of Health; the David Rockefeller Center for Latin American Studies at

Harvard University; the Francis Family Foundation; the Pittsfield Anti-tuberculosis Association; the

Eli Lilly Foundation; and the Hatch Family Foundation.

References

1. Cox D. Regression models and life-tables. Journal of the Royal Statistical Society, Series B


2. Collett D. Modelling survival data in medical research. Second edition. 2003.

3. Efron B. The efficiency of Cox’s likelihood function for censored data. Journal of the American

Statistical Association. 1977; 72(359): 557-65.





Lung Disease. 2015; 19(4): 399-405.



Union Against Tuberculosis and Lung Diseases, poster presentation; Malaysia. 2012.


recurrence among MDR-TB cases ‘successfully’ treated with standardised short-course




2016; 44(7): 843-5.



1844-51.





12. Mitnick CD, Shin SS, Seung KJ, et al. Comprehensive treatment of extensively drug-resistant

tuberculosis. New England Journal of Medicine. 2008; 359: 563–74.

13. Franke MF, Appleton SC, Mitnick CD, et al. Aggressive regimens for multidrug-resistant

tuberculosis reduce recurrence. Clinical Infectious Diseases. 2013; 56(6): 770-6.


composition in multidrug-resistant tuberculosis treatment. PloS One. 2014; 9(9): e108035.

15. Mukherjee JS, Rich ML, Socci AR, et al. Programmes and principles in treatment of multidrug-

resistant tuberculosis. Lancet. 2004; 363: 474–81.

16. Tierney DB, Milstein MB, Manjourides J, Furin JJ, Mitnick CD. Treatment outcomes for

adolescents with multidrug-resistant tuberculosis in Lima, Peru. Global Pediatric Health. 2016;

3:2333794X16674382.



Disease. 2005; 9(6): 640-5.

18. Centers for Disease Control and Prevention. Clinical growth charts. [cited 2016]. Available

from: http://www.cdc.gov/growthcharts/clinical_charts.htm.

19. World Health Organization. The second decade: improving adolescent health and development.

Geneva, Switzerland: WHO. 2001.

Chapter 2 Tables

Table 2.1. Explanatory variable definitions for the Lima, Peru cohort

Covariate | Definition
Aggressive treatment regimen | A treatment regimen containing at least five likely effective drugs based on previous treatment history and baseline drug resistance pattern during the intensive phase of treatment, and at least four likely effective drugs during the continuation phase. Calculated as proportion of total treatment time on an aggressive treatment regimen.
Comorbidities | Presence of at least one of the following: cardiovascular disease, diabetes mellitus, hepatitis or cirrhosis, epilepsy/seizures, renal insufficiency, psychiatric disorder, history of smoking or substance use/abuse. Presence of at least one comorbidity versus none (reference).
Previous treatment regimens | Combination of number of previous treatment regimens received and receipt of a standardized treatment regimen. Receipt of less than or equal to two previous treatment regimens and/or did not receive the standardized treatment regimen for MDR-TB versus receipt of more than two previous treatment regimens or of the standardized treatment regimen for MDR-TB (reference).
Sex | Female sex. Female versus male (reference).
Poor nutritional status | Low body mass index (BMI) per the Centers for Disease Control and Prevention definitions18 or clinical assessment of malnutrition. Low BMI and/or malnutrition versus normal BMI and not malnourished (reference).
Tachycardia | Heart rate greater than 100 beats per minute. Tachycardia versus no tachycardia (reference).
Extra-pulmonary TB (EPTB) | Clinical assessment of EPTB at baseline. EPTB versus no EPTB (reference).
Human immunodeficiency virus (HIV) | Diagnosis or documentation of HIV at baseline. HIV versus no HIV (reference).
Number of resistant agents | Resistance to the following 12 drugs or drug classes was tested: capreomycin, cycloserine, ethambutol, ethionamide, isoniazid, kanamycin or amikacin, para-aminosalicylic acid, pyrazinamide, rifampicin, streptomycin, 1st generation fluoroquinolones (ciprofloxacin, ofloxacin), and later-generation fluoroquinolones (gatifloxacin, levofloxacin, moxifloxacin).11,13 Number of resistant results for the 12 drugs listed above.
Age | Adolescents are 10-19 years old while adults are greater than or equal to 20 years old.19 Adolescent versus adult (reference).

Table 2.2. Characteristics of MDR-TB cohort from Lima, Peru

Covariate | Full Cohort (N=667) | Adolescent (N=90, 13.5%) | Adult (N=577, 86.5%) | Difference between adolescents & adults (p-value)
Demographics
Female | 261 (39.1) | 37 (41.1) | 224 (38.8) | 0.679
Severity Indicators
Poor nutritional status | 179 (30.0) | 23 (29.5) | 156 (30.1) | 0.918
Tachycardia | 194 (29.9) | 30 (33.3) | 164 (29.3) | 0.436
Extrapulmonary TB | 58 (8.7) | 3 (3.3) | 55 (9.6) | 0.051
Number of resistant agents, mean (SD) | 5.43 (1.70) | 4.99 (1.58) | 5.50 (1.71) | 0.005#
Comorbidities
Human Immunodeficiency Virus | 10 (1.5) | 1 (1.1) | 9 (1.6) | 1.00+
At least 1 comorbidity | 233 (36.6) | 18 (21.7) | 215 (38.8) | 0.058
Treatment Characteristics
Received <2 previous regimens and did not receive standardized regimen for MDR-TB | 171 (25.8) | 46 (51.1) | 125 (21.8) | <0.001
Proportion of treatment time on an aggressive regimen, mean (SD) | 0.57 (0.41) | 0.69 (0.38) | 0.55 (0.42) | <0.001#
Treatment Outcomes
Successful Outcomes (cured/completed treatment) | 439 (65.8) | 68 (75.6) | 371 (64.3) | 0.036
Unsuccessful Outcomes | 228 (34.2) | 22 (24.4) | 206 (35.7) | 0.036
Treatment Failed | 18 (2.7) | 3 (3.3) | 15 (2.6) | 0.723+
Died | 138 (20.7) | 10 (11.1) | 128 (22.2) | 0.016
Default | 67 (10.0) | 8 (8.9) | 59 (10.2) | 0.695
Transferred Out | 5 (0.8) | 1 (1.1) | 4 (0.7) | 0.517+

All values are n (%) unless otherwise specified.

All p-values are calculated using chi-square tests except: # indicates use of t-test, + indicates use of Fisher’s exact test.

SD: Standard deviation

Table 2.3. Breakdown of time to treatment outcomes

Median time to outcome, days (IQR) | Full Cohort (N=667) | Adolescent (N=90) | Adult (N=577)
Any successful outcome | 773.0 (734.0–869.0) | 758.5 (734.0–847.5) | 778.0 (735.0–876.0)
Any unsuccessful outcome | 405.5 (134.5–683.0) | 436.0 (164.0–701.0) | 398.5 (133.0–682.0)
Death | 356.0 (96.0–605.0) | 313.0 (125.0–488.0) | 356.0 (92.0–637.5)

IQR: Interquartile range

Table 2.4. Change in effect estimates between the low-, high-, and equal-risk assumptions

Censoring mechanism assumptions

Model # | Covariate | Equal risk (Reference Group) HR (95% CI) | High risk HR (95% CI) | % Change | Low risk HR (95% CI) | % Change
1 | AR | 0.22** (0.14, 0.33) | 0.70** (0.57, 0.85) | 218.18% | 0.23** (0.15, 0.35) | 4.55%
1 | AD | 0.64 (0.34, 1.23) | 1.16 (0.93, 1.45) | 81.25% | 0.58 (0.30, 1.11) | -9.38%
2 | AR | 0.23** (0.14, 0.38) | 0.67** (0.54, 0.84) | 191.30% | 0.26** (0.16, 0.42) | 13.04%
2 | AD | 0.75 (0.36, 1.56) | 1.13 (0.87, 1.46) | 50.67% | 0.70 (0.34, 1.47) | -6.67%
3 | AR | 0.22** (0.14, 0.34) | 0.70** (0.57, 0.85) | 218.18% | 0.23** (0.15, 0.35) | 4.55%
4 | AR | 0.12* (0.02, 0.55) | 0.78 (0.44, 1.40) | 550.00% | 0.12* (0.03, 0.54) | 0%

Model 1: Full cohort, controlling for age group

Model 2: Full cohort, controlling for age group and eight covariates

Model 3: Full cohort, stratified by age group

Model 4: Subgroup analysis of adolescents only

AR: Aggressive Treatment Regimen

AD: Adolescent

CI: Confidence Interval

HR: Hazard Ratio

*p-value: <0.05; **p-value: <0.0001

Table 2.5. Change in effect estimates between the mixed- and equal-risk assumptions

Model # | Covariate | Equal risk (Reference Group) HR (95% CI) | Mixed risk HR (95% CI) | Increase in Effect Size
1 | AR | 0.22** (0.14, 0.33) | 0.20** (0.13, 0.31) | 9.09%
1 | AD | 0.64 (0.34, 1.23) | 0.59 (0.31, 1.12) | 7.81%
2 | AR | 0.23** (0.14, 0.38) | 0.21** (0.12, 0.34) | 8.70%
2 | AD | 0.75 (0.36, 1.56) | 0.65 (0.31, 1.35) | 13.33%
3 | AR | 0.22** (0.14, 0.34) | 0.20** (0.13, 0.31) | 9.09%
4 | AR | 0.12* (0.02, 0.55) | 0.10* (0.02, 0.48) | 16.67%

Model 1: Full cohort, controlling for age group

Model 2: Full cohort, controlling for age group and eight covariates

Model 3: Full cohort, stratified by age group

Model 4: Subgroup analysis of adolescents only

AR: Aggressive Treatment Regimen

AD: Adolescent

CI: Confidence Interval

HR: Hazard Ratio

*p-value: <0.05; **p-value: <0.0001

Chapter 2 Figures

Figure 2.1. Kaplan-Meier curve comparison across four censoring assumptions

Legend: Kaplan-Meier curves for Model 1 across all four censoring assumptions, by age groups

(adolescents in red, adults in blue).

CHAPTER 3: USE OF PREDICTED VITAL STATUS TO IMPROVE THE ANALYSIS OF MULTIDRUG-RESISTANT TUBERCULOSIS COHORTS

Authors: Meredith B Brooks1,2, Salmaan Keshavjee2,3,4, Irina Gelmanova4,5, Nataliya A

Zemlyanaya5, Carole D Mitnick2,4, Justin Manjourides1

Institutional Affiliations:

1. Department of Health Sciences, Northeastern University, Boston, MA

2. Department of Global Health and Social Medicine, Harvard Medical School, Boston, MA




3. Division of Global Health Equity, Brigham and Women’s Hospital, Boston, MA

4. Partners In Health, Boston, MA

5. Partners In Health, Tomsk Oblast, Russian Federation

Abstract

Background

Multidrug-resistant tuberculosis cohorts often lack long-term end-of-study survival data, and are

summarized instead by initial treatment outcomes. This leads to censoring subjects at the time of

the first non-death treatment outcome, violating the non-informative censoring assumption of the

Cox proportional hazards model and producing biased effect estimates. To address this problem, we

develop a tool to predict vital status at study conclusion and assess its ability to reduce bias in

treatment effect estimates.

Methods

We derive and apply a logistic regression model to predict vital status at the end of the cohort

period and modify the unobserved survival outcomes to better match the predicted survival

experience of study subjects. We compare hazard ratio estimates for effect of an aggressive

treatment regimen from Cox proportional hazards models using time to initial treatment outcome,

predicted end-of-study vital status, and true survival time.

Results

Models fit from initial treatment outcomes underestimate treatment effects by up to 22.1 percent,

while using predicted survival outcomes reduced the bias by 5.4 percent. Models utilizing predicted

long-term survival outcomes produced effect estimates consistently stronger and closer to the true

treatment effect than those produced by models using the initial treatment outcome.

Conclusion

Using initial treatment outcomes to estimate treatment effects violates the non-informative

censoring assumption of the Cox proportional hazards model resulting in underestimation of the

benefit of an aggressive treatment regimen. Predicting end-of-study vital status may reduce this bias

in analyses of multidrug-resistant tuberculosis treatment cohorts, yielding more accurate, and likely

larger, treatment effect estimates. Further, these larger effect sizes can have downstream impacts on

study design by increasing power and reducing sample size needs.

Introduction

Multidrug-resistant tuberculosis (MDR-TB) is caused by strains of the bacterium Mycobacterium tuberculosis that are resistant to at least two powerful, first-line anti-TB drugs, rifampicin and isoniazid. Globally in 2016, an estimated 600,000 people were eligible for MDR-TB treatment, of whom approximately 20 percent received care. The latest available treatment outcome data showed that treatment success (cure or treatment completion) occurred in only 54 percent of individuals, while 16 percent died, treatment failed in 8 percent, 15 percent were lost to follow-up, and 7 percent had no outcome information available.1 Lack of safe and effective MDR-TB treatment is a major driving force behind MDR-TB as a global health problem.1 The advent of two novel MDR-TB drugs2,3 and shortened regimens4 offers opportunities for improved treatment access and outcomes. These developments further intensify the need for accurate estimation of treatment effectiveness. A common approach to assessing the effects of MDR-TB treatment on the risk of death is the Cox proportional hazards (PH) model5, in part because it accommodates variable treatment duration and allows each individual to contribute either a censoring time or a failure time.6

TB programs are seldom able to follow MDR-TB patients beyond the time at which one of six

initial treatment outcome definitions7 is met, despite the overall study period being defined as the

longest interval from treatment start until an initial treatment outcome in the cohort. Truncating

patient survival times due to lack of follow-up data may bias treatment effect estimates when using

PH regression due to violation of the non-informative censoring assumption of the model.8 This

occurs when observations are censored from the data and assumed to be at equal risk of

experiencing the event of interest (often death) as all at-risk individuals remaining in the cohort.9

However, literature suggests that individuals who experience successful treatment outcomes have a


lower risk of death by the end of a defined cohort period (4.2 percent) compared to those who

experience unsuccessful non-death treatment outcomes (51.2 percent).10-16

The most accurate analysis would incorporate the true end-of-cohort vital status. When this is

unavailable due to follow-up ceasing after an initial non-death treatment outcome occurs, it may be

possible to improve treatment effect estimates by predicting the vital status of an individual at the

end of the defined study period. This would allow differential risk of death for individuals censored

from the data, making the non-informative censoring assumption more likely to hold. Additionally,

leveraging the initial treatment outcome to inform long-term survival may produce more accurate

treatment effect estimates compared to censoring all observations from the data regardless of the

reason follow-up was terminated.

Here, we seek to derive and validate a tool to predict vital status at the end of a study period and to

assess whether estimated treatment effects are less biased when predicted vital status is incorporated

in Cox PH models. Using initial treatment outcomes to inform estimates of the vital status at the

end of the study cohort period can provide useful information when modelling long-term survival.

Integration of the predicted end-of-study vital status is anticipated to produce stronger and less

biased effect estimates.

Methods

Study Cohort

The study population is a cohort of consecutive patients with suspected or confirmed MDR-TB,

who initiated treatment in Tomsk Oblast, Russian Federation between September 2000 and

November 2004. Informed consent was obtained from patients prior to treatment start. More details

about the enrollment and data collection methods for this cohort have been previously


described.11,12,17-19 Patient data with up to six years of follow-up time from the start of treatment are

available. Patients are classified as having MDR-TB if they had a culture positive for

Mycobacterium tuberculosis and drug susceptibility test results showing resistance to at least

isoniazid and rifampin on a specimen collected any time between two months prior to and one

month after initiation of MDR-TB treatment.18 For this study, cohort participants are included if
they have baseline MDR-TB, if data are available regarding treatment initiation and the initial
treatment outcome, and if vital status at the end of the study cohort period is discernible. The study

period is defined as the longest duration from treatment initiation until an initial treatment outcome.

End of study vital status is defined as whether a patient remained alive or had died prior to the end

of the defined study period.


The primary exposure of interest is receipt of an aggressive treatment regimen, which has

previously been shown to improve outcomes.18-21 The aggressive treatment regimen is defined as a

regimen containing at least five likely effective drugs during the intensive phase of treatment,

followed by at least four likely effective drugs during the continuation phase of treatment.22 A

binary variable was used to classify each patient as ever or never having been exposed to an

aggressive treatment regimen.

Other characteristics explored are those previously identified as being risk factors for death11,17,23,24,

including age, sex, alcohol abuse or dependence, presence of a comorbidity, prior treatment history,

low body mass index (BMI), severe baseline clinical status18, extra-pulmonary TB (EPTB), and

extensively drug-resistant TB (XDR-TB).25,26 Detailed covariate definitions can be found in Table

3.1.


Outcome Definition

Standard MDR-TB treatment outcome definitions are used.7 A successful treatment outcome

encompasses treatment completion and cure. Unsuccessful treatment outcomes include treatment

failure, all-cause mortality, default during treatment, or transfer out. Patients are followed from

treatment initiation until the time when their first treatment outcome is observed. The primary

outcome is the time from treatment initiation until death.

Statistical Analysis

To characterize the population, we describe demographic information, comorbidities, treatment

characteristics, and treatment outcomes. Characteristics are quantified by the frequency and percent

for categorical variables and means and standard deviations (SD), unless noted otherwise, for

continuous variables. Selection bias is evaluated by assessing whether included and excluded
participants differ statistically from one another, using chi-square tests, Fisher's exact tests,
or t-tests.

Our primary analysis involves a two-step procedure. First, a logistic regression model is fit to

predict the probability of survival at the end of the study period. Second, a Cox proportional

hazards model is fit, incorporating recoded failure and censoring outcomes based on the vital status

predicted in the logistic regression model.

Step 1. Logistic regression model for long-term vital status:

A logistic regression model is used to predict the probability of survival at the end of the study

period for each individual, 𝑖, who experienced a non-death initial treatment outcome. Vital status is


modeled as a random variable, taking the value 1 with probability equal to the parameter 𝑝𝑖, which

is a function of the initial treatment outcome (𝑂𝑖 ) and patient characteristics (𝑿𝑖). The parameter 𝑝𝑖

is estimated for each individual in the cohort.
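In symbols, the Step 1 model described above can be sketched as a standard logistic regression. The coefficient names (\beta_0, \beta_O, \boldsymbol{\beta}_X) are notation introduced here for illustration and are not taken from the original text; because the initial treatment outcome is categorical, \beta_O O_i should be read as shorthand for one coefficient per outcome indicator.

```latex
Y_i \sim \operatorname{Bernoulli}(p_i), \qquad
\log\!\left(\frac{p_i}{1 - p_i}\right)
  = \beta_0 + \beta_O\, O_i + \boldsymbol{\beta}_X^{\top} \boldsymbol{X}_i
```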

Potential predictors eligible for the model include all combinations of the initial treatment outcomes

and patient characteristics that may be associated with survival. For model derivation and internal

validation, we use 10-fold cross-validation. Data are randomly divided into ten sets, the model is

built on nine of these sets and then the performance of the model is measured on the remaining set.

This is repeated until all of the ten data sets are used to test model performance. The model with the

best performance is selected as the final model.
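The fold-splitting logic described above can be sketched as follows. This is a minimal illustration in Python rather than the SAS procedure used for the analysis; the function names `k_fold_indices` and `cross_validate`, the seed, and the `fit` and `score` callables are hypothetical placeholders for whatever model-fitting and performance routines are used.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Randomly partition the indices 0..n-1 into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(n, fit, score, k=10, seed=0):
    """Fit on k-1 folds and score on the held-out fold, once per fold."""
    folds = k_fold_indices(n, k, seed)
    results = []
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        model = fit(train_idx)                    # hypothetical fitting routine
        results.append(score(model, test_idx))    # hypothetical scoring routine
    return results


# Example: 417 individuals with a non-death initial outcome, split into ten folds.
folds = k_fold_indices(417, k=10, seed=0)
fold_sizes = sorted(len(f) for f in folds)
```

Each individual appears in exactly one held-out fold, so every observation is used once to test performance and nine times to build the model.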

The primary means of comparing predictive models is the Bayesian Information Criterion (BIC)27,

for which lower values indicate better fit. We also use the c-statistic to assess model discrimination,

the ability of the model to differentiate between individuals who died at the end of the study and

those who did not. The larger the c-statistic, the better the model discriminates.28 To assess model

calibration, which describes the agreement between the predicted and observed risks, we compute

the Hosmer-Lemeshow statistic.29 We define good calibration as a Hosmer-Lemeshow p-value
greater than the type I error rate of 0.05, indicating no evidence that the observed and
predicted risks significantly differ.
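Two of the comparison measures above are simple enough to compute by hand. The sketch below assumes the common definitions: the c-statistic as the proportion of concordant (event, non-event) pairs with ties counted as one half, and BIC as -2 log-likelihood plus the number of parameters times the log of the sample size. Statistical packages may report BIC under a slightly different convention, and the toy data are invented for illustration only.

```python
import math


def c_statistic(y_true, p_hat):
    """Share of (event, non-event) pairs ranked correctly; ties count one half."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one observation in each class")
    concordant = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                concordant += 1.0
            elif pp == pn:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Invented toy data: three survivors (1) and two deaths (0).
y = [1, 1, 1, 0, 0]
p = [0.9, 0.8, 0.4, 0.5, 0.1]
auc = c_statistic(y, p)
```

In the toy data, five of the six survivor-death pairs are ranked correctly, so the c-statistic is 5/6.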

A receiver operating characteristic (ROC) curve is used to select the probability threshold that
maximizes Youden's index; the sensitivity, specificity, positive predictive value, and negative
predictive value of the model are then evaluated at that threshold. Youden's index (sensitivity +
specificity - 1) is the vertical distance from the diagonal chance line to a point on the ROC curve,
so maximizing it minimizes the combined false-negative and false-positive rates.30 Discriminatory property definitions are as


follows: sensitivity is defined as the probability of the model predicting survival given the

individual truly survived; specificity is defined as the probability of the model predicting death

given the individual truly died; positive predictive value is defined as the probability of actually

surviving given the model predicts survival; and negative predictive value is defined as the

probability of actually dying given the model predicts death. The probability threshold identified is

used to assign each individual a vital status of alive ($\hat{Y}_i = 1$) or dead ($\hat{Y}_i = 0$) at the end of the study
period (i.e., if the probability threshold is set at 0.85, then if $\hat{p}_i > 0.85$, $\hat{Y}_i = 1$; if $\hat{p}_i < 0.85$, $\hat{Y}_i = 0$).
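Threshold selection by Youden's index can be sketched as a search over the observed predicted probabilities. This is an illustrative Python sketch, not the procedure run for the analysis; the function name `youden_threshold` and the toy data are hypothetical, and classifying an individual as alive when the predicted probability is greater than or equal to the threshold is an assumption, since the text leaves the equality case unspecified.

```python
def youden_threshold(y_true, p_hat):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    Assumption: an individual is predicted alive (1) when p_hat >= threshold.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(p_hat)):
        tp = sum(1 for y, p in zip(y_true, p_hat) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, p_hat) if y == 0 and p < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:                 # keep the first threshold reaching the maximum
            best_t, best_j = t, j
    return best_t, best_j


# Invented toy data: four survivors (1) and two deaths (0).
y = [1, 1, 1, 1, 0, 0]
p = [0.99, 0.97, 0.95, 0.60, 0.70, 0.20]
threshold, j = youden_threshold(y, p)
```

In the toy data the index peaks at a threshold of 0.95, where three of four survivors and both deaths are classified correctly.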

Step 2. Cox proportional hazards model:

To evaluate the bias introduced when survival information after the initial treatment outcome is

lacking, we run two Cox proportional hazards models. Each model uses three different approaches

for a total of six scenarios. Models 1 and 2 both assess the association between receipt of an

aggressive treatment regimen and death. Model 1 assesses the univariate association, while Model 2

assesses the association controlling for covariates previously found to be associated with time to

death.

The three approaches we use on each model are as follows:

Approach 1: The first approach follows the conventional censoring assumption in which the

event time for each individual is either the observed time to death, or the individual is

censored at the time of the first observed non-death outcome.

Approach 2: The second approach uses the predicted vital status at the end of the study
period ($\hat{Y}_i$). All individuals assigned $\hat{Y}_i = 1$ are assumed to survive at least until the end of
the study period and contribute full survival time during that period. All individuals


assigned $\hat{Y}_i = 0$ are assumed to be at equal risk of death as those at-risk individuals

remaining in the cohort. These observations are censored at the time of an observed non-

death treatment outcome.

Approach 3: The third approach, the gold standard, utilizes the true vital status at the end of

the study (𝑌𝑖). Individual event times are either the time of death or time to the end of the

study period at which point all remaining, alive individuals are censored. Approaches 1 and

2 are compared to this approach.
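The three approaches differ only in how each patient's follow-up time and event indicator are coded before the Cox model is fit. The recoding can be sketched as below; this is an illustrative Python sketch, not the SAS code used for the analysis. The function name `recode` and its argument names are hypothetical, and the study period length of 1293 days is the value reported for this cohort in the Results.

```python
STUDY_END = 1293  # days; longest interval to an initial outcome in this cohort


def recode(approach, t_outcome, died_initially, y_hat=None, t_death_true=None):
    """Return (time, event) for one patient; event=1 is death, 0 is censored.

    t_outcome      -- days from treatment start to the initial treatment outcome
    died_initially -- True if the initial treatment outcome was death
    y_hat          -- predicted end-of-study vital status (1 alive, 0 dead)
    t_death_true   -- true day of death within the study period, else None
    """
    if died_initially:
        return (t_outcome, 1)             # observed deaths are events in every approach
    if approach == 1:
        return (t_outcome, 0)             # conventional: censor at the initial outcome
    if approach == 2:
        if y_hat == 1:
            return (STUDY_END, 0)         # predicted alive: contributes full follow-up
        return (t_outcome, 0)             # predicted dead: censor at the initial outcome
    if approach == 3:
        if t_death_true is not None and t_death_true <= STUDY_END:
            return (t_death_true, 1)      # death seen in extended follow-up
        return (STUDY_END, 0)             # alive at the end of the study period
    raise ValueError("approach must be 1, 2, or 3")
```

For example, a patient cured at day 600 is censored at day 600 under Approach 1, but at day 1293 under Approach 2 if predicted alive, and under Approach 3 if truly alive at the end of the study period.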

Estimated hazard ratios (HR) and 95 percent confidence intervals (CI) for the aggressive treatment

regimen variable are presented for each model and approach. For each model, the relative change
in the HR is calculated by comparing the HRs from Approaches 1 and 2 to the HR from
Approach 3. Relative to Approach 3, HRs closer to the null value of 1.0 underestimate the
treatment effect, while HRs further from 1.0 overestimate the treatment effect. The magnitude and

direction of the bias from Approaches 1 and 2 are assessed. Relative changes are compared to

identify which approach produces the least biased effect estimates.
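The relative change is not written out as a formula in the text. The expression below is an assumed form that is consistent with the values reported in Table 3.5; the subscript a, indexing the approach being compared, is notation introduced here.

```latex
\text{Relative change}_a
  = \frac{\widehat{HR}_a - \widehat{HR}_3}{\widehat{HR}_3} \times 100\%,
  \qquad a \in \{1, 2\}
```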

SAS V9.4 (SAS Institute, Cary, NC) is used for all analyses.

Ethics Statement

The parent study was approved by the Institutional Review Boards at Harvard School of Public

Health and the Siberian State Medical University (Tomsk, Russia). Secondary analysis was

reviewed and declared exempt by the Institutional Review Board at Northeastern University.


Results

A total of 638 individuals with suspected or confirmed MDR-TB were consecutively enrolled

during the study period. Of these, 614 individuals have confirmed MDR-TB by culture and drug

susceptibility testing. The longest interval from treatment start until the initial treatment outcome is

1293 days, defining the duration of the study period. Among the 614 individuals, vital status at the

end of the study period cannot be ascertained for 167 (27.2 percent); these observations are

excluded, leaving 447 eligible participants included in this analysis.

The mean age of the cohort is 35.9 (SD: 11.4) years, 81.2 percent are male, and 53.0 percent have a

history of incarceration. Almost everyone (99.3 percent) has previously been treated for

tuberculosis; many have had prior injectable (33.3 percent) and/or fluoroquinolone (15.8 percent)

exposure. The mean number of previous tuberculosis treatments for the cohort is 2.1 (SD: 1.2), with

one-third having greater than two previous treatments. Over half (62.8 percent) present with

bilateral and cavitary disease on the baseline chest radiograph or with severe baseline clinical status

(62.0 percent), and 4.9 percent present with baseline XDR-TB. Of the 447 included in the analysis,

82.6 percent receive an aggressive regimen at some point during MDR-TB treatment. Two-thirds of
participants experience a successful initial treatment outcome, while 6.7 percent die, 8.7 percent
have treatment fail, and 17.4 percent default on treatment. The 167 excluded participants differ
statistically significantly from those included in the following ways: fewer are female, fewer are
married, more are unemployed, more are currently or previously incarcerated, fewer have severe
baseline clinical status, fewer have EPTB, and more experience an initial treatment outcome of default. Full

baseline characteristics for included and excluded participants are in Table 3.2.

Predicting long term survival:


Through 10-fold cross validation, we identify our final predictive model, which includes covariates

for a successful initial treatment outcome, treatment failure, and age (centered):

$$\log\left(\frac{p}{1-p}\right) = 2.57 + 2.48 \times \text{Successful} - 0.75 \times \text{Failure} - 0.04 \times \text{Age}$$

This final model is selected due to a combination of having the lowest BIC value (163.64), the

highest c-statistic (0.95), and a high Hosmer-Lemeshow statistic p-value (0.99). Table 3.3 shows

top performing model characteristics using 10-fold cross validation, including the selected model.
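To illustrate how the fitted equation above maps patient characteristics to a predicted probability of survival, the sketch below applies the inverse logit to the reported coefficients. It is an illustrative Python sketch; the function name `predicted_survival` is hypothetical, and age is assumed to be centered so that a value of 0 corresponds to the centering age. Because the published coefficients are rounded, the results may not reproduce the classifications in Table 3.4 exactly.

```python
import math

# Rounded coefficients from the fitted model reported above.
INTERCEPT, B_SUCCESS, B_FAILURE, B_AGE = 2.57, 2.48, -0.75, -0.04


def predicted_survival(successful, failure, age_centered):
    """Predicted probability of being alive at the end of the study period."""
    logit = (INTERCEPT + B_SUCCESS * successful
             + B_FAILURE * failure + B_AGE * age_centered)
    return 1.0 / (1.0 + math.exp(-logit))


# Three patients at the centering age, one per initial-outcome group.
p_success = predicted_survival(1, 0, 0.0)   # cure or treatment completion
p_failure = predicted_survival(0, 1, 0.0)   # treatment failure
p_default = predicted_survival(0, 0, 0.0)   # neither indicator set (e.g., default)
```

At the centering age these evaluate to roughly 0.99, 0.86, and 0.93, respectively, which shows why a high probability threshold separates successful from unsuccessful initial outcomes.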

Using an ROC curve (see Figure 3.1), we identify the best cutoff at 0.99, resulting in a sensitivity of

0.81 (95% CI: 0.77, 0.85), specificity of 1.00 (95% CI: 0.93, 1.00), positive predictive value of

1.00, and a negative predictive value of 0.43 (95% CI: 0.38, 0.49). The overall prediction accuracy

of the model is 83.7 percent, with 297 true positives, 52 true negatives, 0 false positives, and 68 false

negatives.
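The reported discrimination measures follow directly from the four counts above, with "positive" meaning predicted or truly alive. The short Python check below is illustrative only and uses no information beyond those counts.

```python
# Counts reported in the text; "positive" means alive at the end of the study period.
tp, tn, fp, fn = 297, 52, 0, 68

sensitivity = tp / (tp + fn)                  # predicted alive among truly alive
specificity = tn / (tn + fp)                  # predicted dead among truly dead
ppv = tp / (tp + fp)                          # truly alive among predicted alive
npv = tn / (tn + fn)                          # truly dead among predicted dead
accuracy = (tp + tn) / (tp + tn + fp + fn)    # all correct classifications
youden_j = sensitivity + specificity - 1
```

The low negative predictive value reflects the 68 individuals who are predicted dead but truly remain alive.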

Using the predicted probabilities, 99.3 percent of subjects experiencing an initial successful non-

death treatment outcome are estimated to remain alive at the end of the study period, which is close

to the true outcome in which 99.7 percent remain alive (see Table 3.4). No patients who
experience an initial unsuccessful non-death treatment outcome are predicted to remain alive, when
in reality 56.8 percent do. Two-thirds of people defaulting on treatment and one-third of
people whose treatment failed truly remain alive at the end of the period.

In univariate analyses using Approach 1, receipt of an aggressive treatment regimen is protective

against death (HR: 0.32; 95% CI: 0.15, 0.69). Compared to using Approach 3 (HR: 0.26; 95% CI:

0.17, 0.41), this results in a 22.1 percent relative change. The model using Approach 2 leads to a


HR: 0.31 (95% CI: 0.14, 0.66), which results in a 16.7 percent relative change to the model using

Approach 3. Approach 2 reduces the bias observed under Approach 1 by 5.4 percentage points.

In multivariable analysis using Approach 1, receipt of an aggressive treatment regimen is still

protective against death (HR: 0.24; 95% CI: 0.10, 0.54), resulting in a 6.3 percent change from the

same model utilizing Approach 3 (HR: 0.22; 95% CI: 0.14, 0.36). The model using Approach 2

yields a HR: 0.23 (95% CI: 0.10, 0.52), resulting in a 3.2 percent relative change from Approach 3.

Approach 2 reduces the bias observed under Approach 1 by 3.1 percentage points. See Table 3.5 for

more details.
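The relative changes reported above can be reproduced from the hazard ratios in Table 3.5. The Python sketch below assumes the relative change is the difference between an approach's HR and the Approach 3 HR, divided by the Approach 3 HR; the function name `relative_change` is hypothetical.

```python
def relative_change(hr, hr_reference):
    """Percent difference of a hazard ratio from the reference hazard ratio."""
    return (hr - hr_reference) / hr_reference * 100.0


# Hazard ratios from Table 3.5, ordered as (Approach 1, Approach 2, Approach 3).
table = {"Model 1": (0.321, 0.307, 0.263), "Model 2": (0.235, 0.228, 0.221)}

changes = {
    name: (round(relative_change(a1, a3), 1), round(relative_change(a2, a3), 1))
    for name, (a1, a2, a3) in table.items()
}
```

Under this assumed formula the sketch returns 22.1 and 16.7 percent for Model 1 and 6.3 and 3.2 percent for Model 2, matching the table.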

Discussion

Compared to the conventional method of censoring all non-death initial treatment outcomes,

incorporating predicted end-of-study vital status for MDR-TB patients into Cox PH models can

reduce bias in treatment effect estimates. Conventional censoring methods that use time to the
initial treatment outcome improperly censor survival times, leading to underestimation of the
treatment effect by up to 22.1 percent. Models that use the predicted end-of-study vital status to
inform the censoring assumption lead to stronger effect estimates. This change is consistent across

univariate and multivariable analyses.

Application of individual survival probabilities allows for distinction between successful and
unsuccessful non-death treatment outcomes, which the literature suggests carry different risks of
death by the end of a study period.10-16 This differs from the conventional approach, which
effectively treats all censored observations as being at the same risk of death as the observations
remaining in the cohort.


Our predictive model has good fit statistics, discrimination, and calibration. However, there are

some limitations to this model. We observe a large false-negative misclassification rate. When the

false-negatives produced from the predictive model are applied to the Cox proportional hazards

model, we observe an underestimation of the true treatment effect because, instead of observations

being accurately classified as ‘alive’ at the end of the cohort and contributing full survival time,

they are classified as ‘dead’ and censored at the time of the initial treatment outcome. Reduction of

the false-negative rate would produce stronger, more accurate treatment effect estimates.

In addition to model limitations, our study as a whole has several limitations that must be

considered. As the goal of this study is to compare estimates among a naïve model, a predictive

model, and a fully informed model (which requires end-of-study outcome knowledge), we exclude

167 patients with MDR-TB. They differ from those included, with statistically significantly
higher proportions who are men, unmarried, unemployed, or currently or previously incarcerated,
or who have an initial treatment outcome of default. Significantly fewer have EPTB or a severe baseline

clinical status. If the differences between those included and excluded lead to more deaths after the

initial treatment outcome, bias may be introduced away from the null hypothesis, indicating a

possible overestimation of the treatment effect.

Additionally, we only assess two options for using the predicted probabilities to inform the way in

which observations are censored: censor at the time of the initial treatment outcome or censor at the

end of the study period. Developing additional ways in which the observations are censored, such

as at different time points after experiencing the initial treatment outcome, may be more realistic

and produce more accurate treatment effect estimates. The predictive model is not validated in an

independent cohort; however, it performs well when evaluated through 10-fold cross validation,

which attempts to assess how the results will generalize to an independent data set.


Conclusions

We find that using only the initial treatment outcome to analyze the treatment effect violates the

non-informative censoring assumption of the Cox PH model and underestimates the benefit of

receiving an aggressive treatment regimen. Incorporating predicted end-of-cohort vital status of an

individual may reduce biases in the analyses of MDR-TB treatment cohorts, allowing observation

of larger and more accurate treatment effect sizes and, in turn, increasing study power.

We provide a simple-to-implement method to analyze data which can potentially overcome the

current limitation of MDR-TB cohorts lacking survival data past the initial treatment outcome. This

method can allow researchers to estimate a range of potential effect estimates instead of one biased

estimate. While the predictive model produces valid predictions for subjects from the underlying

population, external validation is necessary before the predictive model can be recommended for use

in other MDR-TB cohorts. Improved accuracy of effect estimates is essential to guide MDR-TB

treatment recommendations.

Funding





MBB; Bill and Melinda Gates Foundation; Eli Lilly Foundation; Partners In Health; the John D.

and Catherine T. MacArthur Foundation, and the Hatch Family Foundation.


References

1. World Health Organization. Global tuberculosis report 2017. Geneva, Switzerland: WHO.

2017.

2. Sirturo (bedaquiline) product insert. Silver Spring, MD: Food and Drug Administration

(http://www.accessdata.fda.gov/drugsatfda_docs/label/2012/204384s000lbl.pdf).

3. Otsuka Novel Products GmbH. Labelling and package leaflet: Deltyba (delamanid).

4. Van Deun A, Maug AK, Salim MA, et al. Short, highly effective, and inexpensive standardized

treatment of multidrug-resistant tuberculosis. American Journal of Respiratory and Critical Care

Medicine. 2010; 182(5): 684–92.

5. Hosmer DW, Lemeshow S. Applied Survival Analysis: Regression Modeling of Time to Event

Data. Wiley, Inc., New York. 1999.

6. Cox D. Regression models and life-tables. Journal of the Royal Statistical Society, Series B




Disease. 2005; 9(6): 640-5.

8. Brooks MB, Mitnick CD, Manjourides J. Adjusting for informative censoring in Cox

proportional hazards models: application to a multidrug-resistant tuberculosis cohort.

Dissertation; currently unpublished. 2017.

9. Efron B. The efficiency of Cox’s likelihood function for censored data. Journal of the American

Statistical Association. 1977; 72(359): 557-65.



11. Gelmanova IY, Ahmad Khan F, Becerra MC, et al. Low rates of recurrence after successful

treatment of multidrug-resistant tuberculosis in Tomsk, Russia. International Journal of

Tuberculosis and Lung Disease. 2015; 19(4): 399-405.











2016; 44(7): 843-5.



1844-51.



17. Keshavjee S, Gelmanova IY, Farmer PE, et al. Treatment of extensively drug-resistant

tuberculosis in Tomsk, Russia: a retrospective cohort study. Lancet. 2008; 372(9647):1403–9.

18. Velasquez GE, Becerra MC, Gelmanova IY, et al. Improving outcomes for multidrug-resistant

tuberculosis: aggressive regimens prevent treatment failure and death. Clinical Infectious

Diseases. 2014; 59(1): 9-15.

19. Khan FA, Gelmanova IY, Franke MF, et al. Aggressive regimens reduce risk of recurrence after

successful treatment of MDR-TB. Clinical Infectious Diseases. 2016; 63(2): 214-20.







23. Mitnick CD, Bayona JJ, Palacios EE, et al. Community-based therapy for multidrug-resistant

tuberculosis in Lima, Peru. New England Journal of Medicine. 2003; 348: 119–28.

24. Kurbatova EV, Taylor A, Gammino VM, et al. Predictors of poor outcomes among patients

treated for multidrug-resistant tuberculosis at DOTS-plus projects. Tuberculosis. 2012; 92(5):

397–403.

25. Centers for Disease Control and Prevention. Emergence of Mycobacterium tuberculosis with

extensive resistance to second-line drugs—worldwide, 2000-2004. Morbidity and Mortality

Weekly Report. 2006; 55: 301-5.

26. World Health Organization. Report of the meeting of the WHO Global Task Force on XDR-TB.


27. Harrell FE Jr. Regression modeling strategies. Springer; New York. 2001.


28. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating

characteristic (ROC) curve. Radiology. 1982; 143: 29–36.

29. Hosmer DW, Lemeshow S. A goodness-of-fit test for the multiple logistic regression model.

Communications in Statistics. 1980; A10: 1043–69.

30. Youden WJ. Index for rating diagnostic tests. Cancer. 1950; 3: 32-5.


Chapter 3 Tables

Table 3.1. Explanatory variable definitions for the Tomsk, Russia cohort



regimen

A treatment regimen containing at least five likely effective drugs

based on previous treatment history and baseline drug resistance

pattern during the intensive phase of treatment, and at least four likely

effective drugs during the continuation phase.

Receipt of an aggressive treatment regimen at any point during the

course of treatment versus none (reference).

Age

Age in years.

Age as a continuous variable.

Sex

Female sex.


Alcohol abuse/dependence

Alcohol abuse or dependence was determined at baseline or at the time

of the doctor prescribing treatment.

Alcohol abuse/dependence versus none (reference).

Comorbidities

Presence of at least one of the following: diabetes mellitus, chronic

renal insufficiency, seizure disorder, baseline hepatitis or

transaminitis, or psychiatric disease.



Number of previous treatment regimens.

Receipt of more than two or less than/equal to two previous treatment

regimens (reference).

Low body mass index (BMI)

Less than 20 kg/m2 for men and less than 18.5 kg/m2 for women.

Low BMI versus not (reference).

Severe baseline clinical status

Respiratory insufficiency, hemoptysis, or sputum acid-fast bacilli

smear (+++) at baseline.18

Severe baseline clinical status versus not (reference).




Extensively drug-resistant

(XDR) TB

Resistance to the following: isoniazid, rifampin, any fluoroquinolone,

and at least one of three second-line injectable drugs.25,26

XDR-TB versus none (reference).


Table 3.2. Characteristics of MDR-TB cohort from Tomsk, Russia

| Covariate | Included participants (N=447) | Excluded participants (N=167) | Difference between groups (p-value) |
|---|---|---|---|
| Demographics | | | |
| Age, years (Mean, SD) | 35.9 (11.4) | 35.9 (11.1) | 0.9741# |
| Female sex | 84 (18.8) | 19 (11.4) | 0.0287 |
| Married (n=434) | 200 (46.1) | 45 (29.0) | 0.0002 |
| Unemployed (n=445) | 352 (79.1) | 150 (90.9) | 0.0007 |
| Current or previous incarceration | 237 (53.0) | 112 (67.1) | 0.0018 |
| Alcohol abuse/dependence | 194 (43.4) | 69 (41.3) | 0.6425 |
| Illicit drug use | 79 (17.7) | 35 (21.0) | 0.3516 |
| Severity Indicators | | | |
| Bilateral and cavitary disease on baseline chest radiograph (n=443) | 278 (62.8) | 96 (58.9) | 0.3862 |
| Severe pulmonary disease on baseline chest radiograph | 195 (43.6) | 71 (42.8) | 0.8498 |
| Low BMI (n=446) | 190 (42.6) | 73 (43.7) | 0.8045 |
| Severe baseline clinical status | 277 (62.0) | 79 (47.3) | 0.0011 |
| EPTB (n=381) | 39 (10.2) | 7 (4.4) | 0.0269 |
| Previous TB-related surgery (n=445) | 50 (11.2) | 12 (7.2) | 0.1445 |
| XDR-TB | 22 (4.9) | 10 (6.0) | 0.5968 |
| Comorbidities | | | |
| HIV (n=446) | 3 (0.7) | 2 (1.2) | 0.6147+ |
| Diabetes mellitus (n=446) | 18 (4.0) | 7 (4.2) | 0.9308 |
| At least 1 comorbidity | 322 (72.0) | 110 (65.9) | 0.1365 |
| Treatment Characteristics | | | |
| Months on an aggressive treatment regimen (Mean, SD) | 11.6 (7.9) | 10.7 (8.3) | 0.2102# |
| Ever on an aggressive treatment regimen | 369 (82.6) | 133 (79.6) | 0.4061 |
| Previously treated for TB | 444 (99.3) | 167 (100) | 0.5664+ |
| History of prior injectable exposure (n=436) | 145 (33.3) | 46 (28.1) | 0.2223 |
| History of prior fluoroquinolone exposure (n=436) | 69 (15.8) | 21 (12.8) | 0.3557 |
| History of prior default | 16 (3.6) | 7 (4.2) | 0.7222 |
| Number of previous TB treatments (Mean, SD) | 2.1 (1.2) | 2.2 (1.3) | 0.2911# |
| >2 previous TB treatments (n=436) | 141 (32.3) | 64 (40.0) | 0.0810 |
| Treatment Outcomes | | | |
| Successful Outcomes: | 299 (66.9) | 107 (64.1) | 0.5114 |
| Cure | 280 (62.6) | 103 (61.7) | 0.8265 |
| Treatment Completion | 19 (4.3) | 4 (2.4) | 0.2813 |
| Unsuccessful Outcomes: | 148 (33.1) | 60 (36.0) | 0.5114 |
| Treatment Failed | 39 (8.7) | 15 (9.0) | 0.9202 |
| Died | 30 (6.7) | 0 (0.0) | N/A* |
| Default | 78 (17.4) | 45 (27.0) | 0.0089 |
| Transferred Out | 1 (0.2) | 0 (0.0) | N/A** |

All values are n (%) unless otherwise specified. All p-values are calculated using chi-square tests except: #
indicates use of t-test, + indicates use of Fisher's exact test.

* By definition, no patients who experienced a death were excluded because their status was known.

** Too few outcomes to compare.

SD: Standard deviation; BMI: Body mass index; EPTB: Extra-pulmonary TB; XDR-TB: Extensively
drug-resistant TB; HIV: Human Immunodeficiency Virus


Table 3.3. Model performance characteristics using 10-fold cross validation

| Model covariates | Global fit: BIC | Discrimination: C-statistic | Calibration: Hosmer-Lemeshow test statistic (p-value) |
|---|---|---|---|
| 1. Successful | 169.43 | 0.90 | 0 (N/A) |
| 2. Successful + Failure | 163.98 | 0.93 | 0 (1.00) |
| 3.* Successful + Failure + Age | 163.64 | 0.95 | 1.56 (0.99) |
| 4. Successful + Failure + Sex + Age | 166.22 | 0.95 | 1.69 (0.98) |

* indicates final selected model.

BIC: Bayesian information criterion

Successful was defined as the initial treatment outcome being cure or treatment completion versus

not (reference).

Failure was defined as the initial treatment outcome being treatment failure versus not (reference).

Age (in years) was included as a continuous variable.

Sex was defined as female versus male (reference).


Table 3.4. Distribution of end-of-study outcomes by initial treatment outcomes

| Initial treatment outcome (n=447) | n (%) | Predicted end-of-study outcome: Alive, n (%) [total 297 (71.2%)*] | Predicted end-of-study outcome: Dead, n (%) [total 120 (28.8%)*] | Actual end-of-study outcome: Alive, n (%) [total 365 (87.5%)*] | Actual end-of-study outcome: Dead, n (%) [total 52 (12.5%)*] |
|---|---|---|---|---|---|
| Successful | 299 (66.9) | 297 (99.3) | 2 (0.7) | 298 (99.7) | 1 (0.3) |
| Cure | 280 (62.6) | 278 (99.3) | 2 (0.7) | 279 (99.6) | 1 (0.4) |
| Treatment Completion | 19 (4.3) | 19 (100.0) | 0 (0.0) | 19 (100.0) | 0 (0.0) |
| Unsuccessful | 148 (33.1) | 0 (0.0) | 118 (100.0) | 67 (56.8) | 51 (43.2) |
| Death | 30 (6.7) | N/A | N/A | N/A | N/A |
| Treatment Failure | 39 (8.7) | 0 (0.0) | 39 (100.0) | 13 (33.3) | 26 (66.7) |
| Default/Transfer Out | 79 (17.7) | 0 (0.0) | 79 (100.0) | 54 (68.4) | 25 (31.6) |

*Out of 417 who experienced an initial non-death treatment outcome.

Note: Percentages for the ‘Predicted end-of-study outcome’ and ‘Actual end-of-study outcome’

columns are calculated based on the n in the initial treatment outcome column in the same row.


Table 3.5. Change in treatment effect estimates using varying

approaches to handle censored observations

| Model # | Covariate | Approach 1: Using initial treatment outcomes, HR (95% CI) | Approach 2: Incorporating predicted end-of-study outcomes, HR (95% CI) | Approach 3: Using actual end-of-study outcomes, HR (95% CI) | Relative change (Approach 1 & 3) | Relative change (Approach 2 & 3) |
|---|---|---|---|---|---|---|
| 1 | AR | 0.321 (0.150, 0.688)* | 0.307 (0.143, 0.658)* | 0.263 (0.169, 0.408)** | 22.1% | 16.7% |
| 2 | AR | 0.235 (0.103, 0.536)* | 0.228 (0.100, 0.519)* | 0.221 (0.137, 0.357)** | 6.3% | 3.2% |

Model 1: Univariate association of receipt of an aggressive treatment regimen and time to death.

Model 2: Multivariable: same as univariate, plus controlling for the following: age, sex, alcohol

abuse/dependence, baseline comorbidities, severe baseline clinical status, XDR-TB [used covariates found

significant in previous studies for which no data were missing as to not introduce additional bias through

missing data problems].

AR: Aggressive Regimen


HR: Hazard Ratio

*p-value: <0.05; **p-value: <0.0001


Chapter 3 Figures

Figure 3.1. Receiver Operating Characteristics curve for final prediction model selection


CHAPTER 4: BIAS ESTIMATES FROM INFORMATIVE CENSORING IN

MULTIDRUG-RESISTANT TUBERCULOSIS COHORT ANALYSES: A

SIMULATION STUDY

Authors: Meredith B Brooks1,2, Justin Manjourides1





Abstract

Background

When multidrug-resistant tuberculosis cohorts are limited by lack of survival data past the initial

treatment outcome, the underlying Cox proportional hazards model assumption of non-informative

censoring may be violated. The presence of informative censoring may result in biased treatment

effect estimates. Alternate censoring mechanisms show potential to reduce these biases. However,

without longer survival data available, validating these methods is difficult. Here, we use simulated

data to compare the performance of the Cox model using several censoring techniques.

Methods

We simulate data to mirror a cohort of multidrug-resistant tuberculosis patients from Lima, Peru.

Informative censoring is introduced using a rejection sampling algorithm. Cox proportional hazards

models are used to estimate associations between an aggressive treatment regimen and death across

three assumptions regarding censored observations: the conventional non-informative censoring

assumption, an extension of short-term survival informed by the literature, and incorporation of a

predicted long-term vital status. Models are compared across various scenarios to demonstrate

which censoring technique produces the least biased estimates.

Results

The protective effect of the aggressive treatment regimen is consistently underestimated by the

conventional model, up to 7.6 percent. Models using alternative censoring techniques produce

treatment effect estimates consistently stronger and more accurate than the conventional method,

underestimating the treatment effect by less than 2.4 percent across all scenarios.

Conclusion

Use of alternative censoring techniques that account for differential risks of survival beyond the

initial treatment outcome may more accurately reflect long-term survival. This leads to reduction in

bias of treatment effect estimates in multidrug-resistant tuberculosis cohort analyses, yielding more

accurate and larger treatment effect estimates.

Introduction

Cox proportional hazards (PH) models are commonly used to analyze survival data.1 Although the

goal is to follow individuals until the observation of some specified event, this is not always

possible. When individuals do not experience the event of interest during the study period, they are

censored from the data, leaving their actual event-time unknown. A key assumption of the Cox PH

model is independence between event times and censor times, referred to as non-informative

censoring.2 An implication of non-informative censoring is that individuals with an unobserved

event time are assumed to have the same risk of failure as those individuals still in the cohort after

that censor time. Violation of this assumption may lead to invalid treatment effect estimates.

The effects of informative censoring can be illustrated through application of the Cox model to

multidrug-resistant tuberculosis (MDR-TB) treatment cohorts that are often characterized by the

absence of follow-up data after an initial treatment outcome is reached. When Cox PH models are

used to analyze MDR-TB cohorts, patients are followed until one of six mutually exclusive

treatment outcome definitions3 is met, not for longer survival times. Thus, when fitting Cox models,

patients who meet one of the five non-death outcome definitions are not followed until the end of

the study period, but are censored at the time of that non-death treatment outcome. Literature

suggests that the risk of death over the study period after an initial treatment outcome is observed is

substantially higher for individuals experiencing unsuccessful (51.2 percent) versus successful

outcomes (4.2 percent).4-10 This discrepancy, present in MDR-TB cohort studies that lack longer

survival data, violates the non-informative censoring assumption of the Cox model, potentially

producing invalid effect estimates.

It has previously been demonstrated that violation of the non-informative censoring assumption can

result in biased treatment effect estimates.11 However, without data regarding the survival status of

individuals following the initial treatment outcome, it is impossible to understand the true

magnitude and direction of the bias or to confirm that models using alternate censoring techniques

produce more accurate estimates.

Using simulated data, in place of real data, to evaluate model performance has several benefits.

With a simulation study, we specify all parameters used to generate the data. When creating a

model to estimate treatment effects, the true treatment effect is a pre-specified parameter and can be

directly compared to estimates, demonstrating the magnitude and direction of bias. With real data,

the true treatment effect is unknown, and the bias cannot be estimated. Thus, simulated data is used

to evaluate the performance of several censoring techniques and to identify the method that

produces the least biased effect estimates. We anticipate that the conventional model will lead to the

most biased treatment effect estimates and that censoring techniques differentiating between the initial

non-death treatment outcomes experienced will produce less biased, more precise treatment effect

estimates.

Methods

Study Population

This simulation is informed by a cohort of patients who received their first treatment for MDR-TB

in Lima, Peru (1999-2002) that has previously been reported on.9,10,12-16 No new data were collected

for this analysis.


The main exposure of interest is the proportion of time an individual is on an aggressive treatment

regimen, defined as a regimen containing at least five likely effective drugs based on previous

treatment history and baseline drug resistance pattern during the intensive phase of treatment, and at

least four likely effective drugs during the continuation phase.17 Other important covariates include

sex, poor nutritional status, tachycardia, extra-pulmonary TB (EPTB), human immunodeficiency

virus (HIV) coinfection, number of previous treatment regimens, comorbid conditions, and number

of resistant agents. Detailed variable definitions are in Table 4.1. Age is generated randomly from a

normal distribution between the ages of 10 and 82, with a mean of 31.5 and standard deviation (SD)

of 12.0 years. Adolescence is then defined as age younger than 20 years18 and

is included as a binary variable. Female sex, having a poor nutritional status, tachycardia, EPTB,

HIV, having had two or more previous treatment regimens, and having a comorbid condition are all

randomly generated from binomial distributions with probabilities equal to 0.39, 0.30, 0.30, 0.09,

0.02, 0.26, and 0.37, respectively. Number of resistant agents is generated from a normal

distribution between 2 and 11, with a mean of 5.43 (SD: 1.7), and rounded to the nearest whole

number. In a previous analysis of the Lima cohort, we observed that the proportion of time on an

aggressive treatment regimen significantly differed by age group, with a mean of 68.8 percent (SD:

37.8 percent) of treatment time for adolescents and 55.0 percent (SD: 41.6 percent) for adults.11 To

reflect this, the proportion of time on an aggressive treatment regimen is calculated in a two-step

procedure. First, we produce risk scores for being on an aggressive treatment regimen as a function

of the other covariates using linear regression as follows:

Aggressive Treatment Regimen = 0.830 - 0.025*comorbidity + 0.140*previous regimen -

0.004*sex -0.040*poor nutritional status + 0.006*tachycardia - 0.129*EPTB + 0.004*HIV -

0.050*resistant agents

Next, across the estimated risk scores, the lowest 30 percent of adults and 19 percent of adolescents

are assigned to have no time on an aggressive regimen, while the upper 33 percent of adults and 46

percent of adolescents are assigned to be on an aggressive regimen for their entire study time. The

middle 37 percent of adults and 35 percent of adolescents have their proportion of time on an

aggressive treatment regimen generated randomly from a normal distribution between 0 and 1 with

mean of 0.60 (SD: 0.18) for adults and mean of 0.65 (SD: 0.15) for adolescents.
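The covariate generation and the two-step exposure assignment described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code used for the dissertation. It assumes that "a normal distribution between a and b" means a normal draw that is redrawn until it falls inside the bounds, and that the percentage cut-offs are applied to subjects ranked by risk score within each age group. The function and variable names (`truncated_normal`, `simulate_subject`, `assign_exposure`, `aggressive_prop`) are hypothetical.

```python
import random

def truncated_normal(rng, mean, sd, low, high):
    """Draw from a normal distribution, redrawing until the value lies in [low, high]."""
    while True:
        x = rng.gauss(mean, sd)
        if low <= x <= high:
            return x

# Bernoulli probabilities quoted in the text.
BINARY_PROBS = {
    "sex": 0.39,               # female
    "poor_nutrition": 0.30,
    "tachycardia": 0.30,
    "eptb": 0.09,
    "hiv": 0.02,
    "previous_regimen": 0.26,  # two or more previous regimens
    "comorbidity": 0.37,
}

def simulate_subject(rng):
    subj = {name: int(rng.random() < p) for name, p in BINARY_PROBS.items()}
    subj["age"] = truncated_normal(rng, 31.5, 12.0, 10, 82)
    subj["adolescent"] = int(subj["age"] < 20)
    subj["resistant_agents"] = round(truncated_normal(rng, 5.43, 1.7, 2, 11))
    return subj

def risk_score(s):
    """Linear score for being on an aggressive regimen (coefficients from the text)."""
    return (0.830 - 0.025 * s["comorbidity"] + 0.140 * s["previous_regimen"]
            - 0.004 * s["sex"] - 0.040 * s["poor_nutrition"]
            + 0.006 * s["tachycardia"] - 0.129 * s["eptb"]
            + 0.004 * s["hiv"] - 0.050 * s["resistant_agents"])

# Per age group: (share with no time, share with full time, middle mean, middle SD).
EXPOSURE_RULES = {0: (0.30, 0.33, 0.60, 0.18),   # adults
                  1: (0.19, 0.46, 0.65, 0.15)}   # adolescents

def assign_exposure(subjects, rng):
    """Set the proportion of time on an aggressive regimen from ranked risk scores."""
    for group, (p_none, p_full, mean, sd) in EXPOSURE_RULES.items():
        members = sorted((s for s in subjects if s["adolescent"] == group),
                         key=risk_score)
        n = len(members)
        n_none, n_full = round(p_none * n), round(p_full * n)
        for rank, s in enumerate(members):
            if rank < n_none:
                s["aggressive_prop"] = 0.0      # lowest scores: no time on regimen
            elif rank >= n - n_full:
                s["aggressive_prop"] = 1.0      # highest scores: entire study time
            else:
                s["aggressive_prop"] = truncated_normal(rng, mean, sd, 0.0, 1.0)

rng = random.Random(2017)
cohort = [simulate_subject(rng) for _ in range(1000)]
assign_exposure(cohort, rng)
```

Ranking within age group is what reproduces the different exposure distributions for adolescents and adults; the risk score only orders subjects and is never used as a probability.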

Outcomes are generated from a multinomial distribution, with probabilities of having a successful

outcome (cure or treatment completion), death, treatment failure, default/transfer out (combined due

to low numbers of patients who transferred out) equal to 0.66, 0.21, 0.03, and 0.10, respectively.11

Time to successful treatment outcomes, deaths, and treatment failure, in days, are generated from

Weibull distributions to reasonably match the breakdown of event times in the Lima, Peru data

using the following shape and scale parameters: 3.6 and 908; 0.9 and 419; 1.8 and 1162,

respectively. Times to default/transfer out are generated from a uniform distribution between 2 and

1800 days.
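As a rough illustration of the outcome and event-time generation just described, the sketch below draws one outcome per subject from the stated multinomial probabilities and then an event time from the matching distribution. It is written in Python rather than the R used for the analysis; the names `simulate_outcome` and `pairs` are hypothetical. Note that Python's `random.weibullvariate` takes the scale first and the shape second, the reverse of the order in which the parameters are quoted in the text.

```python
import random

OUTCOMES = ["success", "death", "failure", "default_transfer"]
OUTCOME_PROBS = [0.66, 0.21, 0.03, 0.10]

# (shape, scale) of the Weibull distribution for time to each outcome, in days.
WEIBULL_PARAMS = {
    "success": (3.6, 908),
    "death": (0.9, 419),
    "failure": (1.8, 1162),
}

def simulate_outcome(rng):
    """Return (event_time_in_days, outcome) for one simulated subject."""
    outcome = rng.choices(OUTCOMES, weights=OUTCOME_PROBS, k=1)[0]
    if outcome == "default_transfer":
        time = rng.uniform(2, 1800)
    else:
        shape, scale = WEIBULL_PARAMS[outcome]
        # random.weibullvariate expects (scale, shape), in that order.
        time = rng.weibullvariate(scale, shape)
    return time, outcome

rng = random.Random(42)
pairs = sorted(simulate_outcome(rng) for _ in range(1000))  # ordered by event time
```

At this stage the (time, outcome) pairs are not yet linked to any covariates; that link is created by the rejection sampling step.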

To introduce informative censoring into the data set, a rejection sampling algorithm, modified from

that presented by Griffin et al19, is used to assign outcomes and event times to covariates,

conditional on age and the proportion of time on an aggressive treatment regimen. The following

methods are used:

1) For each subject i, i=1,…,N, create a vector Xi consisting of that subject’s assigned p covariates.

2) Define ti as the event time for each individual. For i = 1,…,N, if ti corresponds to a successful

treatment outcome, si = 1; if it corresponds to treatment failure, fi = 1; if it corresponds to a

default/transfer out, di = 1. For i = 1,…,N, define δi as the event indicator, with δi = 1 for death

and δi = 0 otherwise. Then, sort the N survival status pairs (ti, δi) such that $t_i < t_{i+1}$.

3) Starting from the earliest observed time, randomly assign each consecutive survival status pair

(ti, δi) to a covariate vector Xi using the following rules:

a. If δi = 1 (i.e., ti is an event [death] time), use a rejection sampler to assign the

covariate vector. First, define Ri as the risk set for ti, such that Ri contains all

individual covariate vectors that have not yet been assigned. Next, randomly select a

covariate vector, Xj, from Ri and calculate $\exp(\beta' X_j)/c_{t_i}$, where $c_{t_i} = \max_k \exp(\beta' X_k)$

over all covariate vectors Xk in Ri. Draw U from a uniform distribution between 0 and 1.

If $U < \exp(\beta' X_j)/c_{t_i}$, then assign Xj to the event time, ti, and the associated outcome;

otherwise repeat this step.

b. If δi = 0 and si = 1, use a rejection sampler to assign the covariate vector. Follow all

previous steps, except if $U > \exp(\beta' X_j)/c_{t_i}$, then assign Xj to the event time, ti, and the

associated outcome; otherwise repeat this step. If $\exp(\beta' X_j)/c_{t_i} = 1$, then assign Xi to

the event time and the associated outcome by simple random sampling from Ri with

equal probability 1/size(Ri), where size(Ri) is equal to the number of individuals still

at risk at time ti.

c. If δi = 0 and fi = 1 or di = 1, assign Xi by simple random sampling from Ri with equal

probability 1/size(Ri).
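Steps 1 to 3 above can be sketched in code as follows. This is an illustrative Python reading of the algorithm, not the authors' R implementation. Two points are assumptions: the fallback in step 3b is read literally, so that a drawn candidate whose ratio equals 1 triggers a simple random draw from the risk set, and the toy inputs at the bottom (covariates, outcome labels, event times, and the choice of β) are hypothetical values chosen only to exercise the function. The name `assign_covariates` is also hypothetical.

```python
import math
import random

def assign_covariates(pairs, covariates, beta, rng):
    """Assign each (time, outcome) pair to one covariate vector.

    pairs: list of (time, outcome), outcome in
           {"death", "success", "failure", "default_transfer"}.
    covariates: list of tuples, one per subject, aligned with beta.
    Returns a list of (time, outcome, covariate_vector), ordered by time.
    """
    pool = list(covariates)  # risk set R_i: vectors not yet assigned
    weights = [math.exp(sum(b * v for b, v in zip(beta, x))) for x in pool]
    assigned = []
    for time, outcome in sorted(pairs):
        n = len(pool)
        c = max(weights)  # largest exp(beta'X) among unassigned vectors
        if outcome == "death":
            # Step 3a: accept candidate j when U < exp(beta'X_j) / c.
            while True:
                j = rng.randrange(n)
                if rng.random() < weights[j] / c:
                    break
        elif outcome == "success":
            # Step 3b: accept candidate j when U > exp(beta'X_j) / c.
            while True:
                j = rng.randrange(n)
                ratio = weights[j] / c
                if ratio == 1.0:
                    # U can never exceed 1, so fall back to simple random
                    # sampling from the risk set (one reading of step 3b).
                    j = rng.randrange(n)
                    break
                if rng.random() > ratio:
                    break
        else:
            # Step 3c: failure or default/transfer out, simple random sampling.
            j = rng.randrange(n)
        assigned.append((time, outcome, pool.pop(j)))
        weights.pop(j)
    return assigned

# Toy demonstration with hypothetical inputs (not the Lima cohort values).
rng = random.Random(1)
n_subjects = 400
beta = (math.log(0.2), math.log(0.65))  # treatment effect, adolescence effect
covs = [(rng.random(), int(rng.random() < 0.17)) for _ in range(n_subjects)]
labels = rng.choices(["success", "death", "failure", "default_transfer"],
                     weights=[0.66, 0.21, 0.03, 0.10], k=n_subjects)
pairs = [(rng.uniform(1, 1800), label) for label in labels]
result = assign_covariates(pairs, covs, beta, rng)

def mean_exposure(outcome):
    """Average treatment exposure among subjects assigned a given outcome."""
    xs = [x[0] for _, o, x in result if o == outcome]
    return sum(xs) / len(xs)
```

With a protective β1, deaths are preferentially matched to low-exposure covariate vectors and successful outcomes to high-exposure ones, which is the dependence between censoring and risk that makes the censoring informative.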

Treatment Effects

For all simulated datasets, we specify the true effects of an aggressive treatment regimen, β1, on

time to death corresponding to a series of potential hazard ratios (HR) ranging from 0.2 to 1.4 in 0.1

increments.11,12,14 Additionally, in multivariable models we control for the true effect of

adolescence, β2, specified and held constant at a corresponding hazard ratio of 0.65 to allow for a

reasonable effect of adolescence on the hazard of death.11,16 The duration of the cohort period is

defined as the maximum time from treatment start until an initial treatment outcome is observed

over the entire cohort.

Censoring Techniques

Across all simulations we explore the use of three censoring techniques: 1) equal-risk; 2) mixed-

risk; 3) predicted risk.

The conventional, equal-risk approach assumes non-informative censoring and subjects are

immediately censored after they experience a non-death treatment outcome, as longer survival is

unknown. Under this approach, it is assumed that all individuals who experienced a non-death

treatment outcome are at equal risk of failure as those still at-risk in the cohort, and the censoring is

non-informative.

The mixed-risk censoring technique accounts for differential risk of survival for individuals who

experience a non-death treatment outcome. Informed by the literature4-10, we make the assumptions

that individuals who experience a successful non-death treatment outcome will survive at least until

the end of the study period, while individuals who experience an unsuccessful non-death treatment

outcome are at equal risk of failure as those still at-risk after the censor time.11

The predicted-risk censoring technique incorporates predicted end-of-study vital status for each

individual as a function of their initial treatment outcome. The prediction model is developed and

validated on a cohort of MDR-TB patients from Tomsk, Russia.20 The model produces estimated

survival probabilities for each individual. Then, using a probability threshold selected to maximize

discriminatory properties of the model, individuals are predicted to either remain alive at the end of

the study period or to have died prior to the end of the period. Individuals predicted to survive

contribute full survival time from treatment initiation until the end of the study period. Those

predicted to have died are censored at the time of the initial treatment outcome and assumed at

equal risk of failure as those still at-risk after the censor time.
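The three censoring techniques differ only in how follow-up time is recoded for subjects with a non-death outcome. The sketch below shows one way that recoding might look, in Python. The function name `recode`, the outcome labels, and the `predicted_alive` flag are hypothetical; in particular, `predicted_alive` stands in for the output of the prediction model, which is not reproduced here. `study_end` is assumed to be the maximum observed time to an initial treatment outcome, following the definition of the cohort period given earlier.

```python
SUCCESS, DEATH = "success", "death"

def recode(time, outcome, study_end, technique, predicted_alive=None):
    """Return (follow_up_time, event_indicator) under one censoring technique."""
    if outcome == DEATH:
        return time, 1  # deaths are observed events under every technique
    if technique == "equal":
        return time, 0  # conventional: censor at the non-death outcome
    if technique == "mixed":
        # Successful outcomes are assumed to survive to the end of the study
        # period; unsuccessful non-death outcomes are censored as usual.
        return (study_end, 0) if outcome == SUCCESS else (time, 0)
    if technique == "predicted":
        # Predicted survivors contribute time to study end; those predicted
        # to have died are censored at the initial treatment outcome.
        return (study_end, 0) if predicted_alive else (time, 0)
    raise ValueError(f"unknown technique: {technique}")

# Example: a subject cured at day 700 in a cohort whose period ends at day 1800.
example = {t: recode(700, SUCCESS, 1800, t, predicted_alive=True)
           for t in ("equal", "mixed", "predicted")}
```

The recoded (time, event) pairs would then be passed unchanged to a standard Cox model fit, so the model itself is identical across techniques.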

Statistical Methods

We assess both a univariate and multivariable Cox PH model, each using the three censoring

techniques, resulting in six modeling scenarios. Further, we evaluate estimates produced from each

model across 13 values of the true treatment effect, β1, resulting in 78 scenarios. In all scenarios,

model performance is evaluated based on 1,000 simulated datasets, each consisting of N=1000

subjects.
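The scenario count follows from a simple cross-product, shown here as a short Python sketch. The variable names are hypothetical; the hazard ratio grid, the three techniques, the two model types, and the constant adolescence effect are taken from the text.

```python
import math
from itertools import product

hazard_ratios = [round(0.2 + 0.1 * k, 1) for k in range(13)]  # 0.2, 0.3, ..., 1.4
techniques = ["equal-risk", "mixed-risk", "predicted-risk"]
models = ["univariate", "multivariable"]

# One scenario per model type, censoring technique, and true treatment effect.
scenarios = [
    {"model": m, "technique": t, "hr": hr, "beta1": math.log(hr)}
    for m, t, hr in product(models, techniques, hazard_ratios)
]

BETA2 = math.log(0.65)  # adolescence effect, held constant in multivariable models
N_DATASETS, N_SUBJECTS = 1000, 1000
```

Two model types times three techniques times 13 hazard ratios gives the 78 scenarios, each evaluated on 1,000 datasets of 1,000 subjects.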

Evaluating Model Performance

Bias of each censoring technique, calculated as the mean difference $(\hat{\beta}_1 - \beta_1)$ across all

simulations, is assessed for each model. When β1 < 0 (i.e., HR = exp(β1) < 1.0), a positive bias indicates

underestimation of the treatment effect, while negative bias indicates overestimation. When β1 > 0, a

negative bias indicates underestimation of the treatment effect, while positive bias indicates

overestimation. Relative bias, $(\hat{\beta}_1 - \beta_1) / \beta_1$, demonstrates the relationship between the true

treatment effect, β1, and the bias. Precision is assessed by calculating the mean squared

error (MSE), $(\hat{\beta}_1 - \beta_1)^2 + \mathrm{Var}(\hat{\beta}_1)$, for each model. The coverage probability, defined as the

percentage of 95% confidence intervals (CI) for $\hat{\beta}_1$ containing β1, and power, calculated as the

percentage of 95% confidence intervals for $\hat{\beta}_1$ that exclude the null effect, are presented for each

model.
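These performance measures can be computed from a set of simulated estimates roughly as follows. The sketch is in Python and follows the definitions as worded in this section, on the coefficient scale. It assumes Wald-type 95% confidence intervals (estimate ± 1.96 standard errors), which the text does not specify, and returns no relative bias when β1 = 0 because the ratio is undefined there. The function name `performance` and the demonstration values are hypothetical.

```python
from statistics import mean, variance

def performance(estimates, std_errors, beta_true, z=1.96):
    """Summarise simulated estimates of beta1 against the true value."""
    bias = mean(estimates) - beta_true
    # Wald-type intervals: an assumption, not stated in the text.
    intervals = [(b - z * se, b + z * se) for b, se in zip(estimates, std_errors)]
    covered = sum(lo <= beta_true <= hi for lo, hi in intervals)
    rejected = sum(not (lo <= 0.0 <= hi) for lo, hi in intervals)
    return {
        "bias": bias,
        "relative_bias_pct": 100 * bias / beta_true if beta_true != 0 else None,
        "mse": bias ** 2 + variance(estimates),  # squared bias plus variance
        "coverage_pct": 100 * covered / len(intervals),
        "power_pct": 100 * rejected / len(intervals),
    }

# Hypothetical estimates from four simulated datasets with true beta1 = -0.40.
demo = performance([-0.50, -0.30, -0.40, -0.45], [0.10, 0.10, 0.10, 0.10],
                   beta_true=-0.40)
```

Under these definitions, power evaluated at the null is the share of intervals that exclude zero, so it should sit near 5 percent when coverage is near 95 percent.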

Additional Analyses

Simulations are also run on a simulated data set with non-informative censoring. To ensure non-

informative censoring, steps 3b and 3c of the rejection sampling algorithm were combined. All

subjects who experienced a non-death treatment outcome, δi = 0, are assigned a covariate vector, Xi,

by simple random sampling from the risk set, Ri, with equal probability 1/size(Ri).

The univariate and multivariable models, each using the three censoring assumptions, are then run

on the observed Lima cohort data. Although the true effects are unknown, we aim to determine if

similar patterns in model performance are evident in the real data. Hazard ratios for the aggressive

treatment regimen and adolescence are estimated. The relative change between HRs produced under

the different censoring techniques are presented, with the model utilizing the conventional equal-

risk technique serving as the reference.

R version 3.4.1 is used for all data simulation and analyses. R code is included in the Supplemental

Materials.

Ethics Statement

The parent study was approved by the Institutional Review Board at Harvard Medical School and

by the Ministry of Health of Peru. Secondary analysis was reviewed and declared exempt by the


Results

In univariate analyses, for HRs less than 1.0, we observe that all bias is positive, indicating that all

models underestimate the treatment effect when informative censoring is present. Consistently,

models using the predicted-risk technique are least biased, followed by the mixed-risk models, and

finally the equal-risk models, with relative bias increasing from 0.7 to 7.6 percent in parallel with

stronger treatment effects. Power is consistent at the Type-1 error rate of α=0.05 when there is no

effect. When HRs are greater than 1.0, all three models overestimate the treatment effect. The

mixed-risk model is most biased and the equal-risk model performs better than the predicted-risk

models in most scenarios. Overall, the MSEs are consistent between all censoring techniques and

coverage rates hover around 95 percent for all scenarios. Power increases as β1 moves further from

the null for all models, but is slightly lower for equal-risk models in most scenarios. Table 4.2

shows full model performance results for univariate analysis. Figures 4.1, 4.2, 4.3, 4.4 show the

relative bias, MSE, power, and 95% confidence interval coverage rates, respectively, for the three

different censoring techniques in univariate analysis. Figures 4.5, 4.6, 4.7, 4.8 show the relative

bias, MSE, power, and 95% confidence interval coverage rates, respectively, for the three different

censoring techniques in multivariable analysis.

In multivariable analyses, for HRs less than 1.0, we observe that all bias is positive, indicating

underestimation of the treatment effect in all models, and that the equal-risk technique is the most

biased. While both the mixed- and predicted-risk techniques perform better, the mixed-risk

technique is consistently the least biased. When the HRs are greater than 1.0, all models

overestimate the treatment effect, with the equal-risk models being least biased and the mixed- and

predicted-risk techniques performing similarly. Overall, the MSEs are consistent between all

censoring techniques and coverage rates hover around 95 percent for all scenarios. Power increases

as β1 moves further from the null for all models. Full results from the multivariable model for the

aggressive treatment regimen effect are presented in Table 4.3.

In multivariable analyses the treatment effect, β1, is varied while the effect of adolescence, β2,

remains constant at log(0.65). When the HRs are less than 1.0 for the treatment effect, the models

using the equal-risk censoring technique are the most biased, consistently underestimating the effect

of adolescence. The mixed-risk technique also underestimates the effect in most scenarios, whereas

the predicted-risk technique consistently overestimates the effect. When HRs are greater than 1.0

for the treatment effect, equal- and mixed-risk models overestimate, and the predicted-risk

models underestimate, the effect of adolescence. Mixed-risk models are the least biased across all

scenarios. For all values of β1 the MSE and confidence interval coverage rates are similar, but

power is consistently highest for the predicted-risk and lowest for equal-risk models. Full results

from multivariable analysis for adolescence are shown in Table 4.4.

When the non-informative censoring assumption is upheld, models produce slightly biased

treatment effect estimates, with no discernable pattern regarding a best performing model or a

direction of bias. Table 4.5 shows full model performance results for univariate analysis when the

non-informative censoring assumption is upheld.

When applied to the observed Lima cohort, in univariate analysis, we observe stronger effect

estimates produced for the aggressive treatment regimen in the models using the mixed-risk (8.1

percent) and predicted-risk (7.1 percent) techniques compared to the equal-risk technique. In

multivariable analysis, findings are similar, with the mixed-risk technique producing stronger effect

estimates by 7.4 percent for the aggressive treatment regimen and 8.6 percent for adolescence, and

the predicted-risk models producing stronger effects by 6.0 percent for the treatment regimen and

11.2 percent for adolescence, as compared to the equal-risk model. See Table 4.6 for full results.

Discussion

Comparing the performance of Cox PH models across three different censoring techniques on

simulated data sets, we identify that the conventional censoring method results in biased treatment

effect estimates in the presence of informative censoring. We propose two alternative censoring

techniques for application to MDR-TB cohorts that account for the differential risk of death among

censored observations. These alternate techniques provide a relatively simple and direct method to

adjust for the presence of informative censoring in similar MDR-TB treatment cohorts. The mixed-

and predicted-risk techniques produce more accurate treatment effect estimates than the

conventional model across most scenarios, with the predicted-risk model outperforming the other

two.

In scenarios in which the aggressive treatment regimen is protective against death, use of the

conventional method consistently underestimates the true treatment effect. This underestimation

occurs because the conventional equal-risk assumption does not reflect the true survival experience

of patients. Literature suggests that most individuals who experience a successful non-death initial

treatment outcome will remain alive at study end4-10, whereas the conventional method assumes that

they are at equal risk of death as remaining at-risk individuals in the cohort. Thus, we observe that

application of models utilizing the two alternative censoring techniques, each differentiating

between risk of death for successful versus unsuccessful non-death treatment outcomes,

consistently reduces bias and produces more accurate treatment effect estimates.

In scenarios in which the aggressive treatment regimen is adversely associated with death, use of

the mixed- and predicted-risk censoring techniques no longer consistently produces the least biased

estimates. The extension of survival time for individuals experiencing a successful non-death

treatment outcome in both the mixed- and predicted-risk techniques inherently imparts a more

protective effect, with healthier individuals remaining in the study longer. If there is, in fact, a true

adverse effect of the treatment regimen, these models may not produce accurate effect estimates due

to the underlying assumption of increased survival and fewer event occurrences. In these instances,

the conventional model may be more appropriate to use, or another alternative model that makes

different assumptions about the longer survival of censored observations.

When the non-informative censoring assumption was upheld, none of the censoring

techniques performed better than the others, indicating that the censor times truly were independent

of the event times.2 In these scenarios, use of the conventional model seems appropriate.

Application of these methods to the real cohort shows similar trends to those observed in the

simulation. While we cannot confirm the true treatment effect because true end-of-study survival data

are missing, we can draw parallels between the real and simulated data sets. This may imply that within

the real cohort, the conventional method is underestimating the benefits of an aggressive treatment

regimen.

This simulation study is limited by the parameters used to develop characteristics, treatment

outcomes, and survival times. Results of this simulation may not be generalizable to cohorts with

different distributions of patient characteristics, proportions of successful treatment outcomes or

deaths, or survival times. Additionally, while both alternative censoring techniques are found to be

more accurate than the conventional method, they are also making assumptions that may not be

valid across different populations. Although the predicted-risk technique performs well, it is based

on a predictive model validated in a cohort of MDR-TB patients from Tomsk, Russia20 and has not

been validated on the Lima cohort due to the lack of long-term survival data available.

Despite the limitations, the need to check for the presence of informative censoring remains; this

study highlights the potential bias that violating this assumption can introduce and proposes

alternative censoring techniques that can reduce this bias.

Conclusions

In sum, the non-informative censoring assumption of the Cox PH model may be violated in MDR-

TB treatment cohorts when there is a lack of survival data past the initial treatment outcome. This

violation may result in biased treatment effect estimates. We find that in the presence of informative

censoring, adjusting the censoring technique used in the models, as informed by the literature and

through predictive modeling, can reduce bias and estimate treatment effects more accurately. This

work suggests that previous estimates of the aggressive treatment regimen’s protective effect in the

Lima cohort12,14 may have been conservative, underestimating the true benefit of the regimen.

Additionally, this work provides two alternative censoring techniques that estimate more accurate

treatment effects and can be implemented at the analysis stage through simple recoding of survival

times. Biased treatment effect estimates can have crucial implications; underestimating the

effectiveness of treatments can hinder the adoption of optimal drugs and regimens into practice.

Funding





MBB; a career development award from the National Institute of Allergy and Infectious Diseases (5

K01 A1065836) to CDM; Bill and Melinda Gates Foundation; Thomas J. White; Partners in Health;

the Peruvian Ministry of Health; the David Rockefeller Center for Latin American Studies at

Harvard University; the Francis Family Foundation; the Pittsfield Anti-tuberculosis Association; the

Eli Lilly Foundation; and the Hatch Family Foundation.

References

1. Cox D. Regression Models and Life-Tables. Journal of the Royal Statistical Society Series B


2. Collett D. Modelling Survival Data in Medical Research. Second Edition.2003.



Disease. 2005; 9(6): 640-5.





Lung Disease. 2015; 19(4): 399-405.









2016; 44(7): 843-5.



1844-51.
















3:2333794X16674382.



18. World Health Organization. The second decade: improving adolescent health and development.


19. Griffin BA, Anderson GL, Shih RA, Whitsel EA. Use of alternative time scales in Cox

proportional hazard models: implications for time-varying environmental exposures. Statistics

in Medicine. 2012; 31: 3320-7.

20. Brooks MB, Keshavjee S, Gelmanova I, Zemlyanaya NA, Mitnick CD, Manjourides J. Use of

predicted vital status to improve the analysis of multidrug-resistant tuberculosis cohorts.


21. Centers for Disease Control and Prevention. Clinical growth charts. [cited 2016]. Available

from: http://www.cdc.gov/growthcharts/clinical_charts.htm.

Chapter 4 Tables

Table 4.1. Explanatory variable definitions for the simulated data


Age

Adolescents are 10-19 years old while adults are greater than or equal to 20

years old.18

Adolescent versus adult (reference).

Sex

Female sex.


Poor nutritional status

Low body mass index (BMI) per the Centers for Disease Control and

Prevention definitions21 or clinical assessment of malnutrition.

Low BMI and/or malnutrition versus normal BMI and not malnourished

(reference).

Tachycardia

Heart rate greater than 100 beats per minute.

Tachycardia versus no tachycardia (reference).




Human immunodeficiency

virus (HIV)

Diagnosis or documentation of HIV at baseline.

HIV versus no HIV (reference).


Combination of number of previous treatment regimens received and

receipt of a standardized treatment regimen.

Receipt of less than or equal to two previous treatment regimens and/or

did not receive the standardized treatment regimen for MDR-TB versus

receipt of more than two previous treatment regimens and received the

standardized treatment regimen for MDR-TB (reference).

Comorbidities

Presence of at least one of the following: cardiovascular disease, diabetes

mellitus, hepatitis or cirrhosis, epilepsy/seizures, renal insufficiency,

psychiatric disorder, history of smoking or substance use/abuse.



Resistance to the following 12 drugs or drug classes was tested:

capreomycin, cycloserine, ethambutol, ethionamide, isoniazid, kanamycin

or amikacin, para-aminosalicylic acid, pyrazinamide, rifampicin,

streptomycin, 1st generation fluoroquinolones (ciprofloxacin, ofloxacin),

and later-generation fluoroquinolones (gatifloxacin, levofloxacin,

moxifloxacin)12,14

Number of resistant results for the 12 drugs listed above.


Aggressive treatment regimen

A treatment regimen containing at least five likely effective drugs based

on previous treatment history and baseline drug resistance pattern during

the intensive phase of treatment, and at least four likely effective drugs

during the continuation phase.17

Calculated as proportion of total treatment time on an aggressive

treatment regimen.

Table 4.2. Full results of model performance across treatment effect estimates

for univariate analysis

| Hazard ratio associated with β1 | Censoring assumption | Relative bias (%) | Bias | MSE | 95% CI coverage (%) | Power (%) |
|---|---|---|---|---|---|---|
| 0.20 | Equal-risk | 7.6 | 0.015 | 1.032 | 93.2 | 100.0 |
| 0.20 | Mixed-risk | 1.8 | 0.004 | 1.031 | 95.6 | 100.0 |
| 0.20 | Predicted-risk | 1.7 | 0.003 | 1.031 | 95.4 | 100.0 |
| 0.30 | Equal-risk | 5.2 | 0.015 | 1.030 | 94.3 | 100.0 |
| 0.30 | Mixed-risk | 0.8 | 0.002 | 1.030 | 95.6 | 100.0 |
| 0.30 | Predicted-risk | 0.6 | 0.002 | 1.030 | 95.5 | 100.0 |
| 0.40 | Equal-risk | 4.3 | 0.017 | 1.033 | 96.0 | 100.0 |
| 0.40 | Mixed-risk | 1.5 | 0.006 | 1.033 | 95.8 | 100.0 |
| 0.40 | Predicted-risk | 1.2 | 0.005 | 1.033 | 95.8 | 100.0 |
| 0.50 | Equal-risk | 4.7 | 0.023 | 1.036 | 95.0 | 97.6 |
| 0.50 | Mixed-risk | 2.4 | 0.012 | 1.036 | 94.3 | 98.3 |
| 0.50 | Predicted-risk | 2.1 | 0.011 | 1.036 | 94.1 | 98.5 |
| 0.60 | Equal-risk | 3.7 | 0.022 | 1.035 | 95.6 | 83.2 |
| 0.60 | Mixed-risk | 2.0 | 0.012 | 1.034 | 95.4 | 85.6 |
| 0.60 | Predicted-risk | 1.6 | 0.010 | 1.034 | 95.4 | 85.9 |
| 0.70 | Equal-risk | 2.9 | 0.020 | 1.041 | 95.5 | 52.7 |
| 0.70 | Mixed-risk | 1.4 | 0.010 | 1.040 | 95.8 | 58.7 |
| 0.70 | Predicted-risk | 1.1 | 0.008 | 1.040 | 95.5 | 59.0 |
| 0.80 | Equal-risk | 3.0 | 0.024 | 1.050 | 94.3 | 24.2 |
| 0.80 | Mixed-risk | 2.2 | 0.017 | 1.049 | 94.5 | 26.4 |
| 0.80 | Predicted-risk | 1.8 | 0.014 | 1.049 | 94.8 | 26.7 |
| 0.90 | Equal-risk | 1.0 | 0.009 | 1.053 | 94.8 | 9.6 |
| 0.90 | Mixed-risk | 0.9 | 0.008 | 1.054 | 93.8 | 10.7 |
| 0.90 | Predicted-risk | 0.5 | 0.004 | 1.054 | 93.9 | 11.5 |
| 1.00 (Null) | Equal-risk | 0.7 | 0.007 | 1.057 | 95.6 | 4.4 |
| 1.00 (Null) | Mixed-risk | 0.6 | 0.006 | 1.056 | 95.4 | 4.6 |
| 1.00 (Null) | Predicted-risk | 0.2 | 0.002 | 1.056 | 95.2 | 4.8 |
| 1.10 | Equal-risk | 1.7 | 0.019 | 1.066 | 94.8 | 8.5 |
| 1.10 | Mixed-risk | 1.9 | 0.021 | 1.066 | 95.0 | 8.5 |
| 1.10 | Predicted-risk | 1.6 | 0.017 | 1.066 | 95.0 | 8.7 |
| 1.20 | Equal-risk | 1.4 | 0.017 | 1.077 | 94.9 | 17.6 |
| 1.20 | Mixed-risk | 2.1 | 0.026 | 1.075 | 94.4 | 18.9 |
| 1.20 | Predicted-risk | 1.7 | 0.020 | 1.074 | 94.7 | 18.2 |
| 1.30 | Equal-risk | 0.9 | 0.011 | 1.086 | 94.9 | 30.4 |
| 1.30 | Mixed-risk | 1.8 | 0.023 | 1.088 | 94.8 | 33.6 |
| 1.30 | Predicted-risk | 1.3 | 0.017 | 1.087 | 94.5 | 32.8 |
| 1.40 | Equal-risk | 0.6 | 0.008 | 1.089 | 95.9 | 44.0 |
| 1.40 | Mixed-risk | 1.6 | 0.022 | 1.089 | 96.2 | 47.4 |
| 1.40 | Predicted-risk | 1.2 | 0.016 | 1.089 | 96.3 | 46.2 |

Univariate model: association of proportion of treatment time on an aggressive treatment

regimen and time to death

MSE: Mean squared error

CI: Confidence interval

Table 4.3. Full results of model performance across treatment effect estimates for

multivariable analysis (aggressive treatment regimen results)

| Hazard ratio associated with β1 | Censoring assumption | Relative bias (%) | Bias | MSE | 95% CI coverage (%) | Power (%) |
|---|---|---|---|---|---|---|
| 0.20 | Equal risk | 7.4 | 0.015 | 2.025 | 94.4 | 100.0 |
| 0.20 | Mixed risk | 1.7 | 0.003 | 2.024 | 95.4 | 100.0 |
| 0.20 | Predicted risk | 2.0 | 0.004 | 2.024 | 95.8 | 100.0 |
| 0.30 | Equal risk | 5.8 | 0.017 | 1.818 | 94.3 | 100.0 |
| 0.30 | Mixed risk | 1.7 | 0.005 | 1.816 | 94.7 | 100.0 |
| 0.30 | Predicted risk | 1.8 | 0.005 | 1.816 | 94.5 | 100.0 |
| 0.40 | Equal risk | 4.5 | 0.018 | 1.646 | 96.0 | 100.0 |
| 0.40 | Mixed risk | 1.2 | 0.005 | 1.644 | 95.8 | 100.0 |
| 0.40 | Predicted risk | 1.3 | 0.005 | 1.644 | 96.3 | 100.0 |
| 0.50 | Equal risk | 3.9 | 0.019 | 1.495 | 96.3 | 98.0 |
| 0.50 | Mixed risk | 1.6 | 0.008 | 1.493 | 96.2 | 98.5 |
| 0.50 | Predicted risk | 1.7 | 0.008 | 1.494 | 96.2 | 98.6 |
| 0.60 | Equal risk | 4.0 | 0.024 | 1.370 | 94.1 | 82.1 |
| 0.60 | Mixed risk | 2.2 | 0.013 | 1.368 | 94.4 | 85.4 |
| 0.60 | Predicted risk | 2.2 | 0.013 | 1.368 | 94.3 | 85.9 |
| 0.70 | Equal risk | 2.7 | 0.019 | 1.263 | 95.1 | 54.6 |
| 0.70 | Mixed risk | 1.7 | 0.012 | 1.262 | 94.4 | 56.5 |
| 0.70 | Predicted risk | 1.8 | 0.012 | 1.262 | 94.3 | 56.7 |
| 0.80 | Equal risk | 3.0 | 0.024 | 1.178 | 94.4 | 23.2 |
| 0.80 | Mixed risk | 2.1 | 0.017 | 1.177 | 93.9 | 24.3 |
| 0.80 | Predicted risk | 2.1 | 0.017 | 1.177 | 94.1 | 24.3 |
| 0.90 | Equal risk | 1.3 | 0.011 | 1.109 | 94.9 | 10.8 |
| 0.90 | Mixed risk | 1.0 | 0.009 | 1.108 | 94.1 | 10.8 |
| 0.90 | Predicted risk | 1.0 | 0.009 | 1.108 | 94.3 | 10.5 |
| 1.00 (Null) | Equal risk | 2.2 | 0.022 | 1.064 | 94.7 | 5.3 |
| 1.00 (Null) | Mixed risk | 2.0 | 0.020 | 1.063 | 95.1 | 4.9 |
| 1.00 (Null) | Predicted risk | 2.1 | 0.021 | 1.063 | 94.9 | 5.1 |
| 1.10 | Equal risk | 0.9 | 0.010 | 1.035 | 94.9 | 9.2 |
| 1.10 | Mixed risk | 1.2 | 0.013 | 1.035 | 94.1 | 9.0 |
| 1.10 | Predicted risk | 1.2 | 0.013 | 1.035 | 94.3 | 8.9 |
| 1.20 | Equal risk | 0.7 | 0.008 | 1.029 | 94.6 | 17.0 |
| 1.20 | Mixed risk | 1.2 | 0.015 | 1.029 | 95.3 | 17.3 |
| 1.20 | Predicted risk | 1.2 | 0.015 | 1.029 | 95.5 | 17.0 |
| 1.30 | Equal risk | 1.5 | 0.019 | 1.041 | 94.7 | 32.5 |
| 1.30 | Mixed risk | 2.3 | 0.030 | 1.041 | 94.8 | 33.0 |
| 1.30 | Predicted risk | 2.2 | 0.029 | 1.041 | 94.6 | 32.4 |
| 1.40 | Equal risk | 0.2 | 0.003 | 1.076 | 94.9 | 45.3 |
| 1.40 | Mixed risk | 1.5 | 0.021 | 1.076 | 94.4 | 48.4 |
| 1.40 | Predicted risk | 1.4 | 0.020 | 1.076 | 94.5 | 47.0 |

Multivariable model: association of proportion of treatment time on an aggressive treatment

regimen and time to death, controlling for adolescence. β1 varied, β2 constant at log(0.65).



Table 4.4. Full results of model performance across treatment effect estimates for multivariable analysis (adolescence results)

Hazard ratio Censoring       Relative  Bias     MSE     95% CI        Power
(β1)         assumption      bias (%)                   coverage (%)  (%)
0.20         Equal risk      3.0       0.019    1.476   96.0          35.2
0.20         Mixed risk      1.4       0.009    1.474   96.2          37.8
0.20         Predicted risk  -2.8      -0.018   1.474   95.6          45.5
0.30         Equal risk      4.1       0.027    1.466   94.2          33.6
0.30         Mixed risk      2.8       0.018    1.461   94.0          37.3
0.30         Predicted risk  -1.4      -0.009   1.461   95.4          46.3
0.40         Equal risk      1.7       0.011    1.482   94.3          39.3
0.40         Mixed risk      0.5       0.003    1.479   94.7          42.2
0.40         Predicted risk  -3.6      -0.023   1.479   95.4          49.1
0.50         Equal risk      1.3       0.009    1.484   95.6          41.0
0.50         Mixed risk      -0.3      -0.002   1.482   94.9          44.4
0.50         Predicted risk  -4.3      -0.028   1.482   95.4          54.5
0.60         Equal risk      2.0       0.013    1.509   96.0          40.3
0.60         Mixed risk      0.6       0.004    1.508   95.8          43.4
0.60         Predicted risk  -3.4      -0.022   1.508   95.8          50.6
0.70         Equal risk      2.7       0.018    1.442   94.6          39.2
0.70         Mixed risk      1.5       0.010    1.440   94.4          42.2
0.70         Predicted risk  -2.6      -0.017   1.440   94.3          50.6
0.80         Equal risk      2.0       0.013    1.434   94.8          43.4
0.80         Mixed risk      0.5       0.003    1.432   94.7          45.7
0.80         Predicted risk  -3.6      -0.023   1.433   94.5          54.0
0.90         Equal risk      1.3       0.009    1.436   94.9          43.9
0.90         Mixed risk      0.2       0.001    1.434   95.0          45.9
0.90         Predicted risk  -3.9      -0.025   1.434   94.2          54.2
1.00 (Null)  Equal risk      1.4       0.009    1.430   94.6          42.6
1.00 (Null)  Mixed risk      -0.1      0.000    1.426   95.0          46.0
1.00 (Null)  Predicted risk  -4.1      -0.027   1.426   95.5          54.2
1.10         Equal risk      2.1       0.014    1.438   94.7          41.8
1.10         Mixed risk      1.0       0.006    1.436   94.1          44.7
1.10         Predicted risk  -3.1      -0.020   1.436   94.3          53.1
1.20         Equal risk      2.3       0.015    1.445   94.9          41.5
1.20         Mixed risk      0.7       0.004    1.443   95.3          45.1
1.20         Predicted risk  -3.3      -0.022   1.443   95.9          53.2
1.30         Equal risk      1.7       0.011    1.435   96.4          43.8
1.30         Mixed risk      0.6       0.004    1.434   95.6          46.9
1.30         Predicted risk  -3.4      -0.022   1.434   96.6          54.7
1.40         Equal risk      2.8       0.018    1.421   95.9          42.9
1.40         Mixed risk      1.3       0.008    1.420   95.5          47.0
1.40         Predicted risk  -2.7      -0.018   1.420   95.3          52.9

Multivariable model: association of proportion of treatment time on an aggressive treatment regimen and time to death, controlling for adolescence. β1 varied, β2 constant at log(0.65).




Table 4.5. Full results of model performance across treatment effect estimates for univariate analysis with non-informative censoring

Hazard ratio Censoring       Relative  Bias     MSE     95% CI        Power
(β1)         assumption      bias (%)                   coverage (%)  (%)
0.20         Equal risk      0.0       0.000    0.066   94.9          100.0
0.20         Mixed risk      1.0       0.005    0.066   95.0          100.0
0.20         Predicted risk  1.0       0.005    0.066   95.1          100.0
0.30         Equal risk      -1.3      -0.004   0.058   95.5          100.0
0.30         Mixed risk      -0.3      -0.001   0.058   95.4          100.0
0.30         Predicted risk  -0.3      -0.001   0.058   95.1          100.0
0.40         Equal risk      -1.0      -0.004   0.055   94.3          99.8
0.40         Mixed risk      -0.3      -0.001   0.055   94.4          99.8
0.40         Predicted risk  -0.3      -0.001   0.055   94.5          99.8
0.50         Equal risk      -1.8      -0.009   0.054   96.1          99.3
0.50         Mixed risk      -1.4      -0.007   0.054   96.1          99.2
0.50         Predicted risk  -1.4      -0.007   0.054   96.1          99.2
0.60         Equal risk      0.7       0.004    0.056   95.5          88.1
0.60         Mixed risk      0.8       0.005    0.056   95.4          88.0
0.60         Predicted risk  0.8       0.005    0.056   95.5          87.9
0.70         Equal risk      -0.1      -0.001   0.056   94.1          59.4
0.70         Mixed risk      0.0       -0.000   0.055   94.1          59.6
0.70         Predicted risk  0.0       -0.000   0.055   94.2          60.1
0.80         Equal risk      0.9       0.007    0.052   95.3          26.1
0.80         Mixed risk      1.0       0.008    0.052   95.5          26.4
0.80         Predicted risk  1.0       0.008    0.052   95.3          26.5
0.90         Equal risk      0.2       0.002    0.055   94.3          11.7
0.90         Mixed risk      0.3       0.003    0.055   94.6          11.7
0.90         Predicted risk  0.3       0.003    0.055   94.5          11.7
1.00 (Null)  Equal risk      -0.2      -0.002   0.055   95.3          4.7
1.00 (Null)  Mixed risk      -0.2      -0.002   0.055   94.9          5.1
1.00 (Null)  Predicted risk  -0.2      -0.002   0.055   94.9          5.1
1.10         Equal risk      -0.7      -0.008   0.055   93.3          9.9
1.10         Mixed risk      -0.6      -0.007   0.056   93.5          10.4
1.10         Predicted risk  -0.6      -0.007   0.056   93.6          10.2
1.20         Equal risk      0.2       0.002    0.055   96.8          19.0
1.20         Mixed risk      0.2       0.002    0.055   96.4          18.1
1.20         Predicted risk  0.2       0.002    0.055   96.2          18.8
1.30         Equal risk      -0.3      -0.004   0.053   96.2          34.5
1.30         Mixed risk      -0.4      -0.005   0.053   96.1          34.1
1.30         Predicted risk  -0.4      -0.005   0.053   96.0          33.5
1.40         Equal risk      -0.2      -0.003   0.054   94.7          52.8
1.40         Mixed risk      -0.3      -0.004   0.053   94.9          52.4
1.40         Predicted risk  -0.3      -0.004   0.053   95.0          52.2

Univariate model: association of proportion of treatment time on an aggressive treatment regimen and time to death




Table 4.6. Results of model performance when applied to the real Lima, Peru cohort data

Model  Covariate  Equal-risk (reference)  Mixed-risk         Increase in effect  Predicted-risk     Increase in effect
#                 HR (95% CI)             HR (95% CI)        size (ER – MR)      HR (95% CI)        size (ER – PR)
1      AR         0.21 (0.14, 0.32)       0.19 (0.13, 0.30)  8.1%                0.20 (0.13, 0.30)  7.1%
2      AR         0.22 (0.14, 0.33)       0.20 (0.13, 0.31)  7.4%                0.20 (0.13, 0.31)  6.0%
2      AD         0.64 (0.34, 1.23)       0.59 (0.31, 1.12)  8.6%                0.57 (0.30, 1.09)  11.2%

Model 1: Full cohort, univariate analysis
Model 2: Full cohort, controlling for age group
AR: Aggressive treatment regimen
AD: Adolescent
HR: Hazard Ratio
ER: Equal-risk assumption
MR: Mixed-risk assumption
PR: Predicted-risk assumption


Chapter 4 Figures

Figure 4.1. Relative bias of the estimated effect of the aggressive treatment regimen in univariate

analysis by censoring technique


Figure 4.2. Mean squared error of the estimated effect of the aggressive treatment regimen in

univariate analysis by censoring technique


Figure 4.3. Power of the estimated effect of the aggressive treatment regimen in univariate analysis by

censoring technique

Legend: Power is calculated as the percentage of times the 95% confidence interval for the estimated β1 does not contain

a null effect (hazard ratio of 1.00)


Figure 4.4. 95% confidence interval coverage rates of the estimated effect of the aggressive treatment

regimen in univariate analysis by censoring technique

Legend: 95% confidence interval coverage rates are calculated as the percentage of times the 95%

confidence interval for the estimated β1 contained the true β1.


Figure 4.5. Relative bias of the estimated effect of the aggressive treatment regimen in multivariable

analysis by censoring technique


Figure 4.6. Mean squared error of the estimated effect of the aggressive treatment regimen in

multivariable analysis by censoring technique

Figure 4.7. Power of the estimated effect of the aggressive treatment regimen in multivariable analysis

by censoring technique

Legend: Power is calculated as the percentage of times the 95% confidence interval for the estimated β1 does not contain

a null effect (hazard ratio of 1.00)


Figure 4.8. 95% confidence interval coverage rates of the estimated effect of the aggressive treatment

regimen in multivariable analysis by censoring technique

Legend: 95% confidence interval coverage rates are calculated as the percentage of times the 95%

confidence interval for the estimated β1 contained the true β1.
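The performance measures shown in Figures 4.1 through 4.8 can be sketched in R as follows. This is a minimal illustration rather than the code used for the simulation study: the function name `summarize.performance` and its arguments are hypothetical, and the sketch assumes that bias and mean squared error are measured on the hazard ratio scale, while coverage and power use Wald-type 95% confidence intervals for β1 on the log hazard ratio scale.

```r
# Hypothetical helper (not from the dissertation): summarize simulation results
# for one scenario. beta.hat and se hold the estimated log hazard ratios and
# their standard errors (one per simulated data set); beta.true is the
# pre-specified log hazard ratio.
summarize.performance <- function(beta.hat, se, beta.true) {
  z <- qnorm(0.975)
  lower <- beta.hat - z * se
  upper <- beta.hat + z * se
  hr.true <- exp(beta.true)
  # Assumption: bias and MSE are computed on the hazard ratio scale
  bias <- mean(exp(beta.hat)) - hr.true
  c(
    relative.bias.pct = 100 * bias / hr.true,
    bias = bias,
    mse = mean((exp(beta.hat) - hr.true)^2),
    # Coverage: share of intervals that contain the true beta1
    coverage.pct = 100 * mean(lower <= beta.true & upper >= beta.true),
    # Power: share of intervals that exclude the null (beta1 = 0, HR = 1.00)
    power.pct = 100 * mean(lower > 0 | upper < 0)
  )
}

# Made-up estimates standing in for 1,000 simulated Cox model fits
set.seed(2017)
beta.true <- log(0.5)
beta.hat <- rnorm(1000, mean = beta.true, sd = 0.15)
se <- rep(0.15, 1000)
performance <- summarize.performance(beta.hat, se, beta.true)
print(round(performance, 3))
```

Under these assumptions, relative bias remains defined at the null scenario because the true hazard ratio of 1.00 is the denominator.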


CHAPTER 5: CONCLUSIONS


Summary

Problem

With 10.4 million new cases in 2016 and 1.3 million related deaths, tuberculosis (TB) is one of the

top ten leading causes of death globally and the leading cause of death from an infectious disease.1

Despite TB being treatable and curable, only 61 percent of new cases are initiated on TB treatment.1

Of those who start treatment, 83 percent experience successful treatment outcomes.1 TB control is

further complicated by the spread of drug resistance. There were 600,000 new cases eligible for

multidrug-resistant (MDR) TB treatment in 2016 and 240,000 related deaths.1 Only 22 percent of

these incident cases initiate treatment, in which the treatment success rate is low at 54 percent.1

Although trials are underway to identify better treatment regimens, the true test of effectiveness will

come after the trial period when the controlled setting is removed. Analysis of cohorts using

treatments under real life conditions will be crucial to identifying effective regimens.

Cox proportional hazards (PH) models are commonly used to analyze MDR-TB treatment cohorts.

Because patients are often only followed until the first of six mutually exclusive treatment outcome

definitions is met2, they each contribute either time to the event of interest (death) or time to one of

the other five non-death treatment outcomes. Observations are censored at the time of the initial

non-death treatment outcome because survival past that time is unknown. Under the Cox model

assumption of non-informative censoring, event times are expected to be independent of the censor

times.3 If this assumption holds, censored observations are at the same risk of failure as

individuals who remain at risk in the cohort after the censor time.4 However, when the non-

informative censoring assumption does not hold, such as when censored observations have higher or

lower risk of failure than individuals remaining in the cohort5, estimated survival probabilities may be

biased.3 MDR-TB literature suggests that individuals who experience non-death treatment

outcomes do not face the same risk of death after their initial treatment outcome as individuals

remaining in the cohort at that time. In fact, people who experience a successful non-death

treatment outcome (cure or treatment completion) are at much lower risk of death than people who

experience an unsuccessful non-death treatment outcome (treatment failure, treatment default,

transfer out): 4.2 versus 51.2 percent, respectively.6-12 This differential risk of death potentially

violates the non-informative censoring assumption of the Cox model. Due to the way censored

observations impact

survival probabilities, the presence of informative censoring may bias treatment effect estimates.

In cohorts with high proportions of individuals experiencing an initial successful treatment outcome

(66 percent in Lima, Peru cohort11-18; 67 percent in Tomsk, Russia cohort7,8,19-22) and knowledge

that those outcomes lead to low risk of death at the end of a study period (4.2 percent from a

literature review6-12; 0.3 percent in Tomsk, Russia cohort22), the benefit of treatment may be

underestimated when the non-informative censoring assumption is made. More accurate estimation

of treatment effects is essential to identify effective treatment regimens and to inform

treatment guidelines.

Although methods have been explored to reduce bias introduced by informative censoring, they

may not be the most appropriate methods to implement for MDR-TB cohort analyses for several

reasons. First, these methods have mostly been developed and implemented in the context of

controlled trials, which have significantly better follow-up than standard observational cohorts and

often have better-defined classifications of the reasons patients cannot be followed. Second, MDR-TB

treatment outcome definitions are unique in that there are six mutually exclusive outcomes and

recommendations suggest including patients in cohorts from the time of treatment initiation only

until the first outcome definition is met.2 This translates into all individuals who do not die being

censored at the time they experience a non-death treatment outcome. This complicates several of


the methods that are used to handle informative censoring that rely on the distribution of survival

times for non-censored individuals because the only non-censored individuals in MDR-TB

treatment cohorts are those who die. Using those observations to make inferences about censored

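individuals’ survival times would overestimate death in the cohort. Third, the methods currently used

to analyze MDR-TB cohorts are limited: most publications simply describe the frequency of

outcomes, while others use logistic regression or Cox proportional hazards models.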

In order for mechanisms to handle informative censoring to be adopted into current practice, they

need to align with current methods most frequently used to analyze MDR-TB cohorts without

disruption. These methods need to be simple, easy to interpret, and comparable across populations.

This work aims specifically to provide MDR-TB researchers with simple-to-implement alternative

approaches to appropriately apply the Cox proportional hazards model to cohort data with limited

survival information, while not further complicating current analysis techniques.

Research Findings

We sought to develop adaptations to current methods used to analyze MDR-TB cohorts to reduce

biases introduced due to the presence of informative censoring. To this end, the following research

was conducted.

First, we identify the presence of informative censoring in an MDR-TB cohort (Lima, Peru, 1999-

2002)11-18 and propose a simple alternative censoring technique informed by the literature. We

make a mixed-risk assumption to account for the differential risk of death for patients experiencing

successful versus unsuccessful non-death treatment outcomes. This assumption more accurately

reflects the literature on survival risk after the initial treatment outcome.6-12 When comparing

treatment effect estimates produced by the conventional method and the mixed-risk method, we

observe that the mixed-risk method produces stronger effect estimates. Assuming that the mixed-

risk assumption more accurately reflects the true survival of patients after the initial treatment

outcome, the conventional method consistently underestimates the treatment effect by 8.7 to 16.7

percent.18

Second, we further refine the mixed-risk assumption by using actual data to make a distinction

between risk of death for individuals experiencing different initial treatment outcomes (Tomsk,

Russia, 2000-2004).7,8,19-22 To do this, we develop and validate a model predicting end of study vital

status conditional on the initial treatment outcome to allow for differential risk of death across the

numerous non-death treatment outcomes. We then incorporate the predicted vital status into the

censoring technique and compare treatment effect estimates across models using the conventional

method, the new predicted-risk method, and a model incorporating the actual end of study vital

status. We observe that the conventional method underestimates the treatment effect by up to 22.1

percent, whereas use of the predicted-risk method reduces bias by 5.4 percent. The predicted-risk

method is consistently less biased than the conventional method and produces stronger effect

estimates.22
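As a hedged illustration of how a predicted vital status can be folded into the censoring technique, the sketch below applies the logistic model coefficients and the probability threshold listed in the supplemental R code to a small made-up data set. The data frame `toy` and all of its values are hypothetical; only the coefficients, the threshold, and the recoding rule are taken from the supplemental code.

```r
# Hypothetical five-person cohort; ages are centered as in the supplemental code
toy <- data.frame(
  Surv.Times = c(6, 14, 20, 24, 30),
  Died       = c(1, 0, 0, 0, 0),
  Success    = c(0, 1, 1, 0, 0),
  Failure    = c(0, 0, 0, 1, 0),
  Age.Center = c(-3, 0, 20, 5, -10)
)

# Predicted probability of being alive at the end of the study period
log.odds <- 2.567 + 2.478 * toy$Success - 0.753 * toy$Failure -
  0.042 * toy$Age.Center
toy$Prob.Survival <- plogis(log.odds)

# Individuals predicted to survive are carried forward to the end of follow-up;
# everyone else keeps the time of the initial treatment outcome
toy$PredictRisk.Time <- ifelse(toy$Prob.Survival >= 0.9902125709,
                               max(toy$Surv.Times), toy$Surv.Times)

print(toy[, c("Surv.Times", "Success", "Prob.Survival", "PredictRisk.Time")])
```

In this toy example, the patient with a successful outcome whose age is well above the cohort mean falls below the threshold and keeps the original censoring time, which shows how the predicted-risk method can differ from the mixed-risk method.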

Third, because MDR-TB cohort data often lacks end of study vital status, a true treatment effect

cannot be identified and it is difficult to accurately assess the bias that informative censoring may

introduce. With simulated data, the true treatment effect can be a pre-specified parameter of the

model and can be directly compared to produced estimates to understand the magnitude and

direction of bias. We perform a simulation study to compare bias across models using the

conventional, mixed-risk, and predicted-risk censoring techniques to identify which performs most

accurately. We observe that all methods underestimate the treatment effect when the effect is truly

protective, with the conventional method consistently performing the poorest—resulting in up to

7.6 percent bias. The mixed- and predicted-risk methods are less biased across all scenarios, with a


maximum of 2.4 percent bias. The models using the predicted-risk method are consistently the most

accurate.23

Our findings across all three studies demonstrate that the conventional method, which uses only time

to the initial treatment outcome in the analysis of MDR-TB treatment cohorts, results in biased

estimates and, more specifically, underestimates the benefit of the aggressive treatment regimen.

Incorporating a differential risk of survival for those with non-death initial treatment outcomes into

the censoring technique reduces bias in effect estimates, both in real data and in simulation.18,22,23

Findings are consistent across numerous scenarios, including univariate and multivariable analyses,

as well as across two heterogeneous cohorts.

Limitations

There are underlying assumptions used in all three studies. While none of the assumptions are fully

accurate, the mixed- and predicted-risk assumptions seem closer to the truth than the conventional

method, as evidenced by producing more accurate treatment effect estimates in simulation. In each

study, we perform analyses across multiple scenarios and various model specifications and observe

consistent results. However, these methods may perform differently in cohorts with different

population make-ups or in which the proportions of patients who experience death, successful, or

unsuccessful treatment outcomes differ widely from those studied here.

In previous analyses of the Lima, Peru and Tomsk, Russia cohorts, the aggressive treatment

regimen was protective against death and recurrence.13,15,20,21 All three of our studies evaluate the

association between an aggressive treatment regimen and hazard of death and find consistent results.

While the reduction in bias observed when using the alternative censoring techniques may hold true


for other treatment regimens that have protective effects, it may not for regimens that prove to be

harmful. Because the mixed-risk technique assumes an underlying higher proportion of successful

treatment outcomes, it may not be an accurate method if there are actually more negative

outcomes/death in the population. The predicted-risk technique may perform better, but requires

validation in a cohort with a different underlying treatment outcome distribution.

Recommendations

To achieve the most accurate treatment effect estimates, appropriate measures should be employed

to ensure that all model assumptions are upheld and accurately reflect true patient survival. First, we

recommend checking for the presence of informative censoring. This can be done through assessing

definitions of outcomes and censored observations or through one of the three methods suggested

by Collett (2003): 1) plotting observed survival times against explanatory variables while

distinguishing censored and uncensored observations from one another; 2) using logistic regression

to examine the relationship between explanatory variables and the probability of being censored; 3)

examining the sensitivity of the results when varying assumptions about what happens to censored

observations after the time of being censored.3 If informative censoring is present, it is important to

understand how results may be biased and to apply new censoring techniques, such as those

proposed in this research, to reduce biases. In the event that the direction of bias that informative

censoring may introduce is unclear, it may be most beneficial to report results from the standard

model as well as from a sensitivity analysis in which an alternative censoring technique is used to

present a range of potential treatment effects. This, accompanied by a description of the

consistencies and discrepancies between the censoring techniques and the results they produce, may

provide readers with a clearer picture than presenting a single, biased result. Finally, we recommend

standardizing reporting. Researchers should report methods for how they check for informative

censoring, results of these tests, and how informative censoring is handled in the analysis. While

this standardization of reporting should be executed by researchers, journals may have higher

authority to require this of researchers prior to publication.
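The second of the checks attributed to Collett above, a logistic regression of the censoring indicator on the explanatory variables, might be sketched as follows. The data are simulated and the variable `Censored` and all parameter values are hypothetical; censoring is made to depend on treatment by construction so that the check has something to detect. An association found this way flags the need for closer inspection and does not by itself prove that censoring is informative.

```r
set.seed(1)
n <- 500

# Hypothetical cohort: treatment indicator and adolescent indicator
cohort <- data.frame(
  AggReg  = rbinom(n, 1, 0.5),
  Age.Ado = rbinom(n, 1, 0.2)
)

# Simulated censoring indicator whose probability depends on treatment
cohort$Censored <- rbinom(n, 1, plogis(-0.5 + 1.0 * cohort$AggReg))

# Logistic regression of being censored on the explanatory variables
censor.fit <- glm(Censored ~ AggReg + Age.Ado, family = binomial, data = cohort)

# Odds ratios for being censored; values far from 1 suggest that censoring
# is related to the covariate
censor.or <- exp(coef(censor.fit))
print(round(censor.or, 2))
print(round(summary(censor.fit)$coefficients, 3))
```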

Research Contributions

This work contributes to the field in many ways. First, this research suggests that existing literature

on the benefit of an aggressive treatment regimen may be conservative13,15,20,21 and the benefit may

be larger than previously reported. Second, the identification of more accurate treatment effects,

and specifically stronger effects, increases the ability to make treatment recommendations and to

identify subgroups at higher risk of poor outcomes. The identification of stronger effects translates

to increased power, allowing smaller sample sizes in cohorts and ultimately generating more

efficient analyses. Third, this research fills a gap in the literature about the presence of informative

censoring in MDR-TB cohorts and the downstream effects it may have on treatment effect

estimates. We also elucidate how these effects may expand beyond the context of MDR-TB cohorts

to studies in which there are multiple outcome definitions and limited survival time available. Few

papers are currently dedicated to discussing the application of methods to analyze MDR-TB

treatment cohorts. Current methods used for these analyses are not advanced, with most

publications about MDR-TB treatment outcomes simply describing the frequency of outcomes,

while others use logistic regression or Cox proportional hazards models. Our research keeps these

restrictions in mind, ensuring that all proposed methods for handling informative censoring are

small adaptations of standard methods so as not to disrupt current practice. Proposed

methods simply require recoding of event times based on the initial treatment outcome and can be

implemented immediately without needing to learn new, more advanced methodology.
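To make the recoding concrete, the following sketch uses a made-up eight-person cohort (all values hypothetical) and counts how many individuals remain in the risk set at each death time under the conventional coding and under the mixed-risk recoding, which follows the rule in the supplemental R code of moving successful outcomes to the maximum observed follow-up time.

```r
# Hypothetical cohort: time to first treatment outcome, death indicator,
# and indicator of a successful non-death outcome
cohort <- data.frame(
  Surv.Times = c(5, 8, 12, 15, 20, 24, 30, 36),
  Died       = c(1, 0, 1, 0, 0, 1, 0, 0),
  Success    = c(0, 1, 0, 1, 0, 0, 1, 0)
)

# Mixed-risk recoding: successful outcomes stay at risk until the end of follow-up
cohort$MixedRisk.Time <- ifelse(cohort$Success == 1,
                                max(cohort$Surv.Times), cohort$Surv.Times)

# Number of individuals still at risk at each observed death time
death.times <- sort(cohort$Surv.Times[cohort$Died == 1])
n.at.risk <- function(times) sapply(death.times, function(t) sum(times >= t))

risk.sets <- data.frame(
  death.time   = death.times,
  conventional = n.at.risk(cohort$Surv.Times),
  mixed.risk   = n.at.risk(cohort$MixedRisk.Time)
)
print(risk.sets)
```

Only the time variable changes; the death indicator and the model call are untouched, which is why the approach fits into an existing Cox analysis without further modification.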


Future Work

It is essential to ensure the dissemination of results to MDR-TB researchers about how Cox

proportional hazards models may bias effect estimates in the presence of limited survival data to

maximize interpretability of research studies. Reporting of all potential biases, including those

introduced from the presence of informative censoring, is essential when presenting the results of

MDR-TB treatment effectiveness studies. This allows readers to understand the potential magnitude

and direction of any bias and implications for result interpretation. Researchers should be informed

about the importance of: 1) defining the outcome event, censoring indicator, and the duration of the

study; 2) conducting and reporting outcomes of formal tests to check for the presence of informative

censoring; 3) noting limitations if informative censoring is present; and 4) reporting what methods

are used in the analysis to reduce bias. The most efficient method to spread the message of this

research is to target journals so that it becomes a standard in the literature and can be requested of

researchers prior to publication.

Within this research we offer two simple alternative censoring techniques for use by researchers.

Because lack of a statistical coding skillset may limit some researchers who wish to analyze MDR-

TB cohorts using Cox proportional hazards models, we propose the development of a simple user-

interface, such as a ShinyApp (RStudio), that can guide researchers to format their data and insert

their own population parameters and model specifications to run the conventional Cox model and

models using the two alternative censoring techniques to identify the least biased method to use.

ShinyApp, or a similar program, would allow for simple data upload or entry from the user end,

with coding pre-programmed behind the scenes. This would broaden the user base who could

benefit from this research, as well as aid more researchers with limited data analysis experience to

use more robust methods for analyzing their data.


Our findings can be strengthened by several additional avenues of future research. To better

understand the effectiveness of MDR-TB treatments in cohorts in which survival past the initial

treatment outcome is unknown, further refinement of the estimation of long term survival may be

necessary. While our research currently uses existing data to predict vital status at the end of a

defined follow-up period to inform the censoring indicator, further refinement would be beneficial.

This refinement may include time varying estimates to approximate survival probabilities over

time, allowing for estimation of when a person may die during the follow up to more accurately

account for survival times.

Although Cox proportional hazards models are the most advanced models currently used to analyze

MDR-TB cohorts, further adoption and integration of parametric models may be useful. A simple

extension of the Cox proportional hazards model that would not disrupt interpretation of results

may be parametric proportional hazards models. With limited survival data available and many

censored observations, it is difficult to approximate the underlying distribution of survival times

necessary for use in parametric proportional hazards models. However, if the distribution of

survival times loosely follows a Weibull distribution, for example, hazard ratio estimates may be

more accurate than those produced from the standard Cox model in which no assumption about the

survival time distribution is made. Our two alternative censoring techniques would also be

applicable to these models, if necessary. Parametric proportional hazards models may offer a

method for producing more accurate treatment effect estimates in MDR-TB cohorts, while not

straying too far from current practice. Further exploration of these methods is necessary.
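A minimal sketch of such a comparison is given below, assuming the `survival` package is available. The data are simulated from a Weibull proportional hazards model with hypothetical parameter values and simple administrative (non-informative) censoring, so the sketch shows only the mechanics of obtaining a hazard ratio from a Weibull fit, not the behavior under informative censoring. Because `survreg` fits the Weibull model in accelerated failure time form, the log hazard ratio is recovered as the negative coefficient divided by the estimated scale.

```r
library(survival)

set.seed(42)
n <- 2000
AggReg <- rbinom(n, 1, 0.5)

# Hypothetical truth: hazard ratio of 0.5 and Weibull shape of 1.5
beta.true <- log(0.5)
shape <- 1.5
event.time <- (-log(runif(n)) / (0.05 * exp(beta.true * AggReg)))^(1 / shape)

# Administrative censoring at a fixed end of follow-up
censor.time <- 10
Time <- pmin(event.time, censor.time)
Died <- as.numeric(event.time <= censor.time)

# Parametric (Weibull) fit; convert the AFT coefficient to a hazard ratio
weib.fit <- survreg(Surv(Time, Died) ~ AggReg, dist = "weibull")
hr.weibull <- exp(-coef(weib.fit)["AggReg"] / weib.fit$scale)

# Standard Cox fit for comparison
cox.fit <- coxph(Surv(Time, Died) ~ AggReg)
hr.cox <- exp(coef(cox.fit)["AggReg"])

print(round(c(weibull = unname(hr.weibull), cox = unname(hr.cox)), 3))
```

The recoded time variables from the alternative censoring techniques could be substituted for `Time` in either model call.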

Additionally, important groundwork should be done to synthesize existing literature about

informative censoring into one consolidated resource for researchers. This resource can be a

reference about methods to identify and handle informative censoring across different types of


research studies and content areas. The TB research landscape would also benefit greatly from a

large scale review of the literature to synthesize how MDR-TB cohort analyses are conducted. This

would help researchers identify the types of methods being used and in what scenarios, ways to

check for violation of model assumptions, and how to define outcomes and censored observations,

if applicable. Recommendations can then be made to provide researchers with guidelines for how to

approach common scenarios observed during MDR-TB cohort analyses.

Long-term follow-up of MDR-TB patients is not often available due to financial and human

resource constraints of local TB programs. Although collecting this data may be expensive, the bias

introduced into analyses when this data is missing may persuade researchers or programs to ensure

that at least one longer term follow-up visit or phone call to all patients is completed to assess vital

status. There is a significant lack of literature about outcomes of MDR-TB patients after their initial

treatment outcome. Collecting this data could better inform censoring techniques like the mixed-

risk method used in Chapter 2. Similarly, a worthwhile modelling exercise might include varying

the proportion of long-term data available to identify a threshold at which enough data is available

to inform a predictive model of long-term vital status for the rest of the cohort. By identifying a

smaller proportion of the cohort that would be followed for longer, resources may be saved while

still providing optimal data for a more accurate analysis.

Conclusion

In conclusion, we have identified that long treatment duration, multiple treatment outcome

definitions, and lack of longer survival data past the initial treatment outcome can lead to

informative censoring when applying Cox proportional hazards models to MDR-TB cohort data.

Findings suggest that censoring all non-death treatment outcomes in MDR-TB analyses may violate


the non-informative censoring assumption of the Cox model. These findings demonstrate that failing

to account for violations of the non-informative censoring assumption may bias effect estimates and potentially

underestimate protective treatment effects. Accounting for differential risk of survival for censored

observations meeting a successful versus unsuccessful non-death treatment outcome reduces the

impact of informative censoring, producing more accurate effect estimates.

These findings are important for population health, as they may significantly affect results

of MDR-TB treatment analyses and, in turn, how treatment recommendations are made. With such

limited MDR-TB treatment data available due to low resources prohibiting long-term patient

monitoring in areas that experience the majority of the MDR-TB burden, it is essential to be as

efficient as possible when analyzing available cohort data. Accurate assessment of MDR-TB

treatment effectiveness is essential as resistant strains of tuberculosis continue to spread and new

drugs move through the developmental pipeline into programmatic clinical care.


References

1. World Health Organization. Global tuberculosis report 2017. Geneva, Switzerland: WHO.

2017.



Disease. 2005; 9(6): 640-5.

3. Collett D. Modelling Survival Data in Medical Research. Second Edition. 2003.

4. Efron B. The Efficiency of Cox's Likelihood Function for Censored Data. Journal of the


5. Campigotto F, Weller E. Impact of informative censoring on the Kaplan-Meier estimate of

progression-free survival in phase II clinical trials. Journal of Clinical Oncology. 2014; 32(27):

3068-74.





Lung Disease. 2015; 19(4): 399-405.









2016; 44(7): 843-5.



1844-51.














3:2333794X16674382.




19. Keshavjee S, Gelmanova IY, Farmer PE, et al. Treatment of extensively drug-resistant

tuberculosis in Tomsk, Russia: a retrospective cohort study. Lancet. 2008; 372(9647):1403–9.

20. Velasquez GE, Becerra MC, Gelmanova IY, et al. Improving outcomes for multidrug-resistant

tuberculosis: aggressive regimens prevent treatment failure and death. Clinical Infectious

Diseases. 2014; 59(1): 9-15.

21. Khan FA, Gelmanova IY, Franke MF, et al. Aggressive regimens reduce risk of recurrence after

successful treatment of MDR-TB. Clinical Infectious Diseases. 2016; 63(2): 214-20.

22. Brooks MB, Keshavjee S, Gelmanova I, Zemlyanaya NA, Mitnick CD, Manjourides J. Use of

predicted vital status to improve the analysis of multidrug-resistant tuberculosis cohorts.


23. Brooks MB, Manjourides J. Estimates of bias from informative censoring in multidrug-resistant

tuberculosis cohort analyses: a simulation study. Dissertation; currently unpublished. 2017.


SUPPLEMENTAL MATERIALS


R code for rejection sampling algorithm

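#Data: Covariate matrix
#Time.Use: Outcome and time matrix
#AR.beta: aggressive treatment regimen effect
#Age.Ado.beta: adolescent effect
#Note: statements marked "(restored)" were lost in extraction and are
#reconstructed by analogy with the death branch; treat them as assumptions

#Covariates are referenced through Data$ so the code does not rely on attach()
Data$hazard <- exp((AR.beta*Data$AggReg) + (Age.Ado.beta*Data$Age.Ado))
SAMPLE <- NULL
for (j in 1:N){
  #Max of hazard becomes new column
  Data$hazard.max <- max(Data$hazard)
  #Ratio of hazard to max(hazard) becomes new column
  Data$Ratio <- (Data$hazard / Data$hazard.max)
  #Column 2: corresponding status is death
  if (Time.Use[1,2] == 1) {
    repeat {
      #Select a random number, i, from 1:N
      Sample.num <- sample(1:(nrow(Data)), 1, replace=FALSE)
      #Create a temporary dataset that is just Person i's row
      Temp <- Data[Sample.num,]
      #Draw U from a uniform distribution between 0 and 1
      U <- (runif(1, min = 0, max = 1))
      #Compare the ratio to U. If U < ratio, assign Time and Death.
      #If U > ratio, repeat.
      if (U < Temp$Ratio) break
    }
    #Add the survival time and censoring indicator to i's row
    Temp <- cbind(Temp, Time.Use[1, , drop = FALSE])
    #Store this person in the SAMPLE dataset
    SAMPLE <- rbind(SAMPLE, Temp)
    #Remove person i from the full dataset so next risk is N-1
    Data <- Data[-Sample.num,]
    #Remove the row of times and censors already assigned
    #(drop = FALSE keeps the two-column structure when one row remains)
    Time.Use <- Time.Use[-1, , drop = FALSE]
  } else {
    #Column 1 = cure or no cure; corresponding status = cure
    if (Time.Use[1,1] == 1) {
      repeat {
        #Selects a random number, i, from 1:N (restored)
        Sample.num <- sample(1:(nrow(Data)), 1, replace=FALSE)
        #Creates a temporary dataset that is just Person i's row (restored)
        Temp <- Data[Sample.num,]
        #If ratios do not all equal 1, then compare to U
        #If remaining ratios = 1, assign to that time
        if (!all(Data$Ratio == 1)) {
          #Draw U from a uniform distribution between 0 and 1
          U <- (runif(1, min = 0, max = 1))
          #Compare the ratio to U. If U > ratio, assign Time and Cure.
          #If U < ratio, repeat.
          if (U > Temp$Ratio) break
        } else {
          #Fix: as extracted, this second break was unconditional and ended
          #the loop after a single draw regardless of U
          break
        }
      }
      #Add time and indicator, store person i, remove from Data (restored)
      Temp <- cbind(Temp, Time.Use[1, , drop = FALSE])
      SAMPLE <- rbind(SAMPLE, Temp)
      Data <- Data[-Sample.num,]
      #Remove the first row of times and censors already assigned (restored)
      Time.Use <- Time.Use[-1, , drop = FALSE]
    } else {
      #Select a random number, i, from 1:N (restored)
      Sample.num <- sample(1:(nrow(Data)), 1, replace=FALSE)
      #Create a Temp dataset that is just Person i's row (restored)
      Temp <- Data[Sample.num,]
      #Add time and indicator, store person i, remove from Data (restored)
      Temp <- cbind(Temp, Time.Use[1, , drop = FALSE])
      SAMPLE <- rbind(SAMPLE, Temp)
      Data <- Data[-Sample.num,]
      #Remove the row of times and censors already assigned (restored)
      Time.Use <- Time.Use[-1, , drop = FALSE]
    }
  }
}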


R code for model development and adjustment of censoring assumptions

#Surv() and coxph() come from the survival package
library(survival)

#EQUAL RISK ASSUMPTION
#Create survival object
Mod1.EQ.surv <- Surv(SAMPLE$Surv.Times, SAMPLE$Died)
#MODEL 1: univariate analysis
Model1.EQ <- coxph(Mod1.EQ.surv ~ AggReg, data = SAMPLE)
#MODEL 2: Controlling for adolescence
Model2.EQ <- coxph(Mod1.EQ.surv ~ AggReg + Age.Ado, data = SAMPLE)

#MIXED RISK ASSUMPTION
#Edit survival times: successful outcomes are moved to the maximum follow-up time
SAMPLE$MixedRisk.Time <- ifelse(SAMPLE$Success == 1, max(SAMPLE$Surv.Times),
                                SAMPLE$Surv.Times)
#Create survival object
Mod2.MR.surv <- Surv(SAMPLE$MixedRisk.Time, SAMPLE$Died)
#MODEL 1: univariate analysis
Model1.MR <- coxph(Mod2.MR.surv ~ AggReg, data = SAMPLE)
#MODEL 2: Controlling for adolescence
Model2.MR <- coxph(Mod2.MR.surv ~ AggReg + Age.Ado, data = SAMPLE)

#APPLY PREDICTED VITAL STATUS
#Predictive logistic regression model from Tomsk data:
#Log (p/(1-p)) = 2.567 + 2.478*Success - 0.753*Txfail - 0.042*Age
#(signs in this comment follow the code below)
#Note: age is centered
SAMPLE$log.odds <- (2.567 + (2.478*SAMPLE$Success) + (-0.753*SAMPLE$Failure)
                    + (-0.042*SAMPLE$Age.Center))
SAMPLE$odds <- exp(SAMPLE$log.odds)
SAMPLE$Prob.Survival <- (SAMPLE$odds / (1+SAMPLE$odds))
#Edit survival time based on threshold of 0.9902125709
SAMPLE$PredictRisk.Time <- ifelse(SAMPLE$Prob.Survival >= 0.9902125709,
                                  max(SAMPLE$Surv.Times), SAMPLE$Surv.Times)
#Create survival object
Mod3.PP.surv <- Surv(SAMPLE$PredictRisk.Time, SAMPLE$Died)
#MODEL 1: univariate analysis
Model1.PP <- coxph(Mod3.PP.surv ~ AggReg, data = SAMPLE)
#MODEL 2: Controlling for adolescence
Model2.PP <- coxph(Mod3.PP.surv ~ AggReg + Age.Ado, data = SAMPLE)
