
Lifetime Data Anal

DOI 10.1007/s10985-010-9181-x

Bayesian analysis for monotone hazard ratio

Yongdai Kim · Jin Kyung Park · Gwangsu Kim

Received: 9 April 2009 / Accepted: 2 July 2010
© Springer Science+Business Media, LLC 2010

Abstract We propose a Bayesian approach for estimating the hazard functions under the constraint of a monotone hazard ratio. We construct a model for the monotone hazard ratio utilizing Cox's proportional hazards model with a monotone time-dependent coefficient. To reduce computational complexity, we use a signed gamma process prior for the time-dependent coefficient and the Bayesian bootstrap prior for the baseline hazard function. We develop an efficient MCMC algorithm and illustrate the proposed method on simulated and real data sets.

Keywords Bayesian bootstrap · Censoring · Monotone hazard ratio · Order restriction · Proportional hazards model

1 Introduction

Estimation and inference of two survival functions $S_1$ and $S_2$ under certain order restrictions have received much attention in survival analysis. The most popular order restriction is the stochastic ordering, which assumes that $S_1(t) \le S_2(t)$ for all $t \in [0, \infty)$. The nonparametric estimators of the survival functions under the stochastic ordering were found by Brunk et al. (1966) for complete observations and Dykstra (1982) for right-censored data, and asymptotic properties were studied by Praestgaard and Huang (1996). Bayesian approaches for stochastic ordering have been proposed by Arjas and Gasbarra (1996) and Gelfand and Kottas (2001). Also, uniform stochastic ordering, which assumes that $S_1(t)/S_2(t)$ is nonincreasing/nondecreasing in $t$, has been considered by Dykstra et al. (1991) and Mukerjee (1996).

Y. Kim (✉) · G. Kim
Seoul National University, Seoul, Korea
e-mail: [email protected]

J. K. Park
International Vaccine Institute, Seoul, Korea

Statistical inference under order restriction on hazard functions has also been considered in the context of assessing the validity of the proportional hazards assumption. Gill and Schumacher (1987) and Deshpande and Sengupta (1995) proposed test statistics for assessing the hypothesis of proportional hazards against the monotone hazard ratio alternative, and Sengupta et al. (1998) developed a testing procedure for the increasing cumulative hazard ratio alternative. However, these methods do not provide an estimate of the hazard ratio under the order restriction.

In this paper, we propose a Bayesian approach for estimating the hazard functions under the monotone hazard ratio constraint. We construct a model for the monotone hazard ratio using Cox's proportional hazards model with a time-dependent coefficient that is monotone. An advantage of this model is that we can simultaneously estimate the monotone hazard ratio and assess the validity of the proportional hazards assumption against the monotone hazard ratio alternative.

We utilize a signed gamma process prior for the monotone hazard ratio. For the prior of the baseline hazard function, we could use gamma process (Kalbfleisch 1978; Kim and Lee 2003a) and beta process (Laud et al. 1998; Kim and Lee 2003a) priors. Such priors, however, require extensive computation for obtaining the posterior because there are two nonparametric priors: one for the monotone hazard ratio and the other for the baseline hazard function. To reduce the computational burden, we utilize the Bayesian bootstrap (BB) prior proposed by Kim and Lee (2003b). The BB prior makes the problem conceptually parametric and yields a much simpler MCMC algorithm, while still retaining the flexibility of nonparametric priors. Also, Kim and Lee (2003b) showed that the posterior obtained with the BB prior closely approximates the full Bayesian posterior with gamma or beta process priors.

The paper is organized as follows. In Sect. 2, the model and prior are presented. In

Sect. 3, we first review the BB approach for the proportional hazards model and then

develop an efficient MCMC algorithm for calculating the BB posterior numerically. In

Sect. 4, we illustrate the proposed method on various data sets. In Sect. 5, we present

concluding remarks.

2 Model and prior

Let $(x_{si}, \delta_{si})$, $s = 1, 2$, $i = 1, \ldots, n_s$, be observations of pairs of right-censored time and censoring indicator. That is, $x_{si} = \min\{t_{si}, c_{si}\}$ and $\delta_{si} = I(t_{si} \le c_{si})$, where $t_{si}$ and $c_{si}$ are survival and censoring times, respectively.

To model the monotone hazard ratio assumption, we propose the following proportional hazards model with a time-dependent coefficient: the hazard functions for the groups $s = 1, 2$ are given as

$$\lambda_1(t) = \lambda(t)$$

and

$$\lambda_2(t) = \exp(\gamma_0 + \gamma_1 H(t))\,\lambda(t),$$

where $\gamma_0 \in (-\infty, \infty)$, $\gamma_1 \in \{-1, 0, 1\}$ and $H(\cdot)$ is a nondecreasing nonnegative function with $H(0) = 0$. Note that the hazard ratio is monotonically increasing, constant, or monotonically decreasing when $\gamma_1$ is 1, 0 or $-1$, respectively. Hence, we can assess the validity of the proportional hazards assumption using the posterior probability of $\gamma_1$ being 0. Also, we can estimate the monotone hazard ratio by estimating $\gamma_0$, $\gamma_1$, and $H$, as the hazard ratio is given as

$$\lambda_2(t)/\lambda_1(t) = \exp(\gamma_0 + \gamma_1 H(t)).$$

Note that the hazard ratio is modeled nonparametrically, as $H$ is completely unspecified.
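The role of $\gamma_1$ can be illustrated with a small numerical sketch. It assumes, purely for illustration, that $H$ is a step function with a few hypothetical jump locations and sizes; the function name `hazard_ratio` and all numbers are not from the paper.

```python
import math
from bisect import bisect_right

def hazard_ratio(t, gamma0, gamma1, jump_times, jumps):
    """Evaluate exp(gamma0 + gamma1 * H(t)) for a step function H."""
    if gamma1 not in (-1, 0, 1):
        raise ValueError("gamma1 must be -1, 0 or 1")
    # H(t) is the sum of all jumps located at or before t.
    n_jumps = bisect_right(jump_times, t)
    h_t = sum(jumps[:n_jumps])
    return math.exp(gamma0 + gamma1 * h_t)

jump_times = [1.0, 2.0, 4.0]   # hypothetical jump locations of H
jumps = [0.2, 0.5, 0.1]        # hypothetical nonnegative jump sizes
grid = [0.0, 1.5, 3.0, 5.0]
ratios_dec = [hazard_ratio(t, 0.3, -1, jump_times, jumps) for t in grid]
ratios_const = [hazard_ratio(t, 0.3, 0, jump_times, jumps) for t in grid]
```

With $\gamma_1 = -1$ the evaluated ratios can only stay level or fall along the grid, and with $\gamma_1 = 0$ they all equal $\exp(\gamma_0)$.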

Remark Another advantage of the proposed model is that we could easily incorporate other covariates $z$, if they exist in the model, by setting

$$\lambda_1(t \mid z) = \exp(\beta^{\top} z)\,\lambda(t)$$

and

$$\lambda_2(t \mid z) = \exp(\beta^{\top} z + \gamma_0 + \gamma_1 H(t))\,\lambda(t).$$

This is useful if we want to know whether the risk of one group decreases faster than that of the other group after adjusting for other risk factors such as age and gender.

For the prior, we use standard parametric priors for $\gamma_0$ and $\gamma_1$ and a nonparametric prior for $H$. A priori, we let $\gamma_0 \sim N(0, \sigma_0^2)$ and $\Pr(\gamma_1 = k) = 1/3$ for $k = -1, 0, 1$. For $H$, a priori, we let $H$ be a gamma process with mean $H_0$ and precision parameter $c > 0$. That is, $H$ is a nondecreasing stochastic process on $[0, \infty)$ with independent increments such that $H(0) = 0$ and $H(t) - H(s)$, $s \le t$, follows a gamma distribution with mean $H_0(t) - H_0(s)$ and variance $(H_0(t) - H_0(s))/c$. See Lo (1982) and Kalbfleisch (1978) for details of gamma processes. To reduce computational complexity, we use the BB prior for $\Lambda$, which is explained in detail in Sect. 3.
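A gamma process with these moments can be simulated on a finite grid by drawing independent gamma increments with shape $c(H_0(t) - H_0(s))$ and rate $c$. The sketch below is only an illustration: the grid, the seed and the helper name `sample_gamma_process` are assumptions, and $H_0(t) = \log(1 + t)$ with $c = 1$ is the choice used later in Sect. 4.

```python
import math
import random

def sample_gamma_process(grid, h0, c, rng):
    """Draw H on an increasing grid, with H(0) = 0 and independent gamma increments."""
    path = []
    level, prev = 0.0, 0.0
    for t in grid:
        shape = c * (h0(t) - h0(prev))   # increment has mean shape / c
        if shape > 0:
            # random.gammavariate takes (shape, scale); scale = 1 / rate = 1 / c.
            level += rng.gammavariate(shape, 1.0 / c)
        path.append(level)
        prev = t
    return path

rng = random.Random(0)
grid = [0.5 * k for k in range(1, 41)]   # hypothetical time grid on (0, 20]
h0 = lambda t: math.log(1.0 + t)
paths = [sample_gamma_process(grid, h0, 1.0, rng) for _ in range(2000)]
mean_end = sum(p[-1] for p in paths) / len(paths)  # should be near H0(20)
```

Every simulated path is nondecreasing, and the average of $H(20)$ over the paths should be close to $H_0(20) = \log 21$.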

3 Posterior: Bayesian bootstrap approach

In this section, we develop an efficient MCMC algorithm to calculate the BB posterior

distribution. We first review the BB approach proposed by Kim and Lee (2003b) and

present the corresponding MCMC algorithm.

3.1 Bayesian bootstrap for the proportional hazards model: review

The main idea of the BB approach for the proportional hazards model is to approximate the full Bayesian posterior by the BB posterior that is proportional to the product of the empirical likelihood and prior. Let $(x_1, \delta_1, z_1(\cdot)), \ldots, (x_n, \delta_n, z_n(\cdot))$ be observations where $x_i$ are right-censored times (i.e., minimum of survival and censoring times), $\delta_i$ are censoring indicators, and $z_i(\cdot)$ are (time-dependent) covariates. Under the proportional hazards model given as

$$\lambda(t \mid z) = \exp(\beta^{\top} z(t))\,\lambda(t),$$

where $\lambda(t \mid z)$ is the hazard function of the survival time with covariate $z$, the likelihood function of $\theta = (\beta, \Lambda(\cdot))$ is

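$$L(\theta) = \prod_{i=1}^{n} \left[\exp(\beta^{\top} z_i(x_i))\,\lambda(x_i)\right]^{\delta_i} \exp\left(-\int_0^{x_i} \exp(\beta^{\top} z_i(s))\,\lambda(s)\,ds\right)$$

$$= \prod_{i=1}^{n} \left[\exp(\beta^{\top} z_i(x_i))\,d\Lambda(x_i)\right]^{\delta_i} \exp\left(-\int_0^{x_i} \exp(\beta^{\top} z_i(s))\,d\Lambda(s)\right), \tag{1}$$

where $\Lambda(t) = \int_0^t \lambda(s)\,ds$ is the cumulative hazard function. Let $q$ be the number of distinct, uncensored observations, and let $0 < t_1 < \cdots < t_q$ be the corresponding ordered, uncensored observations. Then, the empirical likelihood is obtained by assuming that $\Lambda$ is a step function having jumps only at $t_1, \ldots, t_q$ and replacing $d\Lambda(t)$ by $\Delta\Lambda(t) = \Lambda(t) - \Lambda(t-)$ in (1), which results in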

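$$L^{E}(\theta) = \prod_{i=1}^{n} \left[\exp(\beta^{\top} z_i(x_i))\,\Delta\Lambda(x_i)\right]^{\delta_i} \exp\left(-\sum_{k:\, t_k \le x_i} \exp(\beta^{\top} z_i(t_k))\,\Delta\Lambda(t_k)\right). \tag{2}$$

For details of the empirical likelihood (2), see Andersen et al. (1993). Finally, the BB posterior of $\theta$ is defined to be proportional to the product of the empirical likelihood and prior.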

Remark There is an alternative empirical likelihood called the binomial form empirical likelihood. See Kim and Lee (2003b) for details. An advantage of the binomial form is that the resulting BB posterior can be obtained as a limit of full Bayesian posteriors. However, the computation is more difficult, and the BB posterior may not be proper. Therefore, we do not consider the binomial form empirical likelihood in this paper.

An advantage of the BB approach is that the dimension of the parameter $\theta$ is finite because we discretize $\Lambda$ to a step function with finitely many jumps. That is, the parameters in the empirical likelihood are $\beta$ and $\{\Delta\Lambda(t_k), k = 1, \ldots, q\}$, and hence, the posterior distribution can be obtained easily using Bayes' theorem.

A technical difficulty in the BB approach is the choice of the prior for $\{\Delta\Lambda(t_k), k = 1, \ldots, q\}$. For this, Kim and Lee (2003b) proposed the following improper prior (BB prior):

$$\pi(\Lambda) \propto \prod_{k=1}^{q} \frac{1}{\Delta\Lambda(t_k)}, \tag{3}$$

and showed that the resulting posterior is always proper, approximates the full Bayesian posterior well, and has desirable large sample properties. It is interesting to note that the marginal BB posterior of $\beta$ with the prior (3) turns out to be proportional to Cox's partial likelihood times prior.
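The link to the partial likelihood can be checked by integrating each jump out of the empirical likelihood (2) under the prior (3). The short derivation below is a sketch; the shorthand $D_k$ (the set of subjects failing at $t_k$), $d_k = |D_k|$ and $S_k(\beta)$ is introduced here only for this purpose and is not notation from the paper.

```latex
% Regroup (2) by jump time, with S_k(\beta) = \sum_{i:\, x_i \ge t_k} \exp(\beta^{\top} z_i(t_k)):
L^{E}(\theta) = \prod_{k=1}^{q}
  \Big[ \prod_{i \in D_k} \exp(\beta^{\top} z_i(t_k)) \Big]
  \Delta\Lambda(t_k)^{d_k} \exp\big( -\Delta\Lambda(t_k)\, S_k(\beta) \big).

% Multiply by the prior (3) and integrate each jump over (0, \infty):
\int_0^{\infty} u^{d_k - 1} \exp\big( -u\, S_k(\beta) \big)\, du
  = \frac{\Gamma(d_k)}{S_k(\beta)^{d_k}} .

% Hence the marginal BB posterior of \beta satisfies
\pi_{BB}(\beta \mid \text{Data}) \propto \pi(\beta)
  \prod_{k=1}^{q} \frac{\prod_{i \in D_k} \exp(\beta^{\top} z_i(t_k))}{S_k(\beta)^{d_k}} .
```

When there are no ties ($d_k = 1$), the product is the usual partial likelihood; with ties it takes the form commonly known as the Breslow approximation.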

Remark The BB approach does not require prior information on $\Lambda$, which may be a disadvantage when we have prior information. However, we could incorporate prior information into the BB posterior by choosing the prior of $\Lambda$ accordingly. Suppose that, a priori, $\Lambda$ follows a gamma process with mean $\Lambda_0$ and precision parameter $c > 0$. Given that we could think of $\Delta\Lambda(t_k)$ as an approximation of $\Lambda(t_k) - \Lambda(t_{k-1})$, we could incorporate the prior information into the BB posterior by choosing the BB prior as

$$\pi(\Lambda) \propto \prod_{k=1}^{q} \left(\Delta\Lambda(t_k)\right)^{c(\Lambda_0(t_k) - \Lambda_0(t_{k-1})) - 1} \exp\left(-c\,\Delta\Lambda(t_k)\right). \tag{4}$$

Remark A similar approach to the BB is to assume a piecewise constant hazard function. That is, $\lambda(t)$ is given by

$$\lambda(t) = \sum_{k=1}^{m} \lambda_k I(s_{k-1} < t \le s_k)$$

for some sequence $0 = s_0 < s_1 < s_2 < \cdots < s_m$. See, for example, Arjas and Gasbarra (1996) and Ibrahim et al. (2001). Nonetheless, we use the BB approach because it has a sounder theoretical basis (at least asymptotically) and provides a simpler MCMC algorithm. In contrast, it is not easy to choose the break points $s_1, \ldots, s_m$ in the piecewise constant hazard model, and the computation of the posterior would be more difficult.

3.2 Bayesian bootstrap posterior

The parameter in the model is $\theta = (\gamma_0, \gamma_1, H, \Lambda)$. The likelihood of the proposed model is

$$L(\theta) = \prod_{s=1}^{2} \prod_{i=1}^{n_s} \left[\exp(\gamma_0 + \gamma_1 H(x_{si}))^{I(s=2)}\, d\Lambda(x_{si})\right]^{\delta_{si}} \exp\left(-\int_0^{x_{si}} \exp(\gamma_0 + \gamma_1 H(u))^{I(s=2)}\, d\Lambda(u)\right).$$

The full Bayesian computation is extremely hard, as the likelihood involves terms like

$$\int_0^{t} \exp(\gamma_1 H(s))\, d\Lambda(s),$$

which require the knowledge of sample paths of both $H(t)$ and $\Lambda(t)$. To resolve this problem, we employ the BB approach as follows:

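Let $0 < t_1 < t_2 < \cdots < t_q$ be the ordered distinct uncensored survival times among the pooled sample, and let $R(t) = \{(s, i) : x_{si} \ge t\}$ and $D(t) = \{(s, i) : x_{si} = t, \delta_{si} = 1\}$. Let $\Delta\Lambda(t_k) = \Lambda(t_k) - \Lambda(t_k-) = \lambda_k$, and we assume that $\Lambda(t) = \sum_{t_k \le t} \lambda_k$. Then, the empirical likelihood of the proposed model becomes

$$L^{E}(\theta) = \prod_{k=1}^{q} \lambda_k^{d(t_k)} \exp\Big(\sum_{(2,i) \in D(t_k)} (\gamma_0 + \gamma_1 H(t_k))\Big) \exp\Big(-\lambda_k \sum_{(s,i) \in R(t_k)} \exp(\gamma_0 + \gamma_1 H(t_k))^{I(s=2)}\Big),$$

where $d(t)$ is the cardinality of $D(t)$. For the prior of the $\lambda_k$'s, we use the BB prior

$$\pi(\lambda) = \prod_{k=1}^{q} \frac{1}{\lambda_k},$$

as in (3), where $\lambda = (\lambda_1, \ldots, \lambda_q)$. Then, the BB posterior of $\theta$ is given by

$$\pi_{BB}(\theta \mid \text{Data}) \propto L^{E}(\theta)\,\pi(\theta),$$

where $\pi(\theta) = \pi(\gamma_0)\pi(\gamma_1)\pi(H)\pi(\lambda)$.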
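The empirical likelihood of the proposed model is a finite product, so its logarithm can be evaluated directly. The following sketch assumes a hypothetical data layout, a list of `(group, time, delta)` triples, and parameterizes $H$ by its increments $W_k = H(t_k) - H(t_{k-1})$; the function name is illustrative and the loops are written for clarity rather than speed.

```python
import math

def log_empirical_likelihood(data, gamma0, gamma1, W, lam):
    """data: list of (group, time, delta); W, lam: one entry per distinct event time."""
    event_times = sorted({x for (_, x, d) in data if d == 1})
    assert len(W) == len(lam) == len(event_times)
    total, H = 0.0, 0.0
    for t_k, w_k, lam_k in zip(event_times, W, lam):
        H += w_k                                   # H(t_k) = W_1 + ... + W_k
        eta = gamma0 + gamma1 * H                  # log hazard ratio at t_k
        d_all = sum(1 for (s, x, d) in data if x == t_k and d == 1)
        d_2 = sum(1 for (s, x, d) in data if x == t_k and d == 1 and s == 2)
        r_1 = sum(1 for (s, x, d) in data if x >= t_k and s == 1)
        r_2 = sum(1 for (s, x, d) in data if x >= t_k and s == 2)
        total += d_all * math.log(lam_k) + d_2 * eta
        total -= lam_k * (r_1 + r_2 * math.exp(eta))
    return total
```

For example, with three observations `[(1, 1.0, 1), (2, 2.0, 1), (2, 3.0, 0)]`, $\gamma_0 = \gamma_1 = 0$ and both jumps equal to 0.5, the value is $2\log 0.5 - 2.5$.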

3.3 MCMC algorithm

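We use a Gibbs sampler algorithm in which the parameters $\gamma_0$, $\gamma_1$, $\lambda$ and $H$ are generated sequentially from the conditional BB posteriors. We can easily generate $\gamma_0$ and $\gamma_1$ using the Metropolis-Hastings (MH) algorithm with the following conditional BB posterior distributions:

$$\pi(\gamma_0 \mid \gamma_1, \lambda, H, \text{Data}) \propto \exp\Big(\gamma_0 \sum_{k=1}^{q} \sum_{(2,i) \in D(t_k)} 1\Big) \exp\Big(-\exp(\gamma_0) \sum_{k=1}^{q} \lambda_k \exp(\gamma_1 H(t_k)) \sum_{(2,i) \in R(t_k)} 1\Big)\,\pi(\gamma_0), \tag{5}$$

$$\pi(\gamma_1 \mid \gamma_0, \lambda, H, \text{Data}) \propto \exp\Big(\gamma_1 \sum_{k=1}^{q} H(t_k) \sum_{(2,i) \in D(t_k)} 1\Big) \exp\Big(-\exp(\gamma_0) \sum_{k=1}^{q} \lambda_k \exp(\gamma_1 H(t_k)) \sum_{(2,i) \in R(t_k)} 1\Big)\,\pi(\gamma_1). \tag{6}$$

Also, the conditional BB posterior distribution of $\lambda_k$ given $\gamma_0$, $\gamma_1$, $H$ and data is a gamma distribution with mean $\alpha_k/\beta_k$ and variance $\alpha_k/\beta_k^2$, where $\alpha_k = d(t_k)$ and

$$\beta_k = \sum_{(s,i) \in R(t_k)} \exp(\gamma_0 + \gamma_1 H(t_k))^{I(s=2)}. \tag{7}$$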
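Because each jump of the baseline cumulative hazard has a gamma conditional with shape $d(t_k)$ and rate given by (7), this Gibbs step needs only risk-set sums. The sketch below assumes the same hypothetical `(group, time, delta)` data layout as a plain Python list; the function name is illustrative.

```python
import math
import random

def sample_baseline_jumps(data, event_times, H_at_events, gamma0, gamma1, rng):
    """Draw each lambda_k from Gamma(shape=d(t_k), rate=beta_k)."""
    draws = []
    for t_k, H_k in zip(event_times, H_at_events):
        d_k = sum(1 for (s, x, d) in data if x == t_k and d == 1)
        ratio = math.exp(gamma0 + gamma1 * H_k)
        # beta_k sums 1 for group 1 and the hazard ratio for group 2 over the risk set.
        beta_k = sum(ratio if s == 2 else 1.0 for (s, x, d) in data if x >= t_k)
        draws.append(rng.gammavariate(d_k, 1.0 / beta_k))  # scale = 1 / rate
    return draws
```

Note that `random.gammavariate` is parameterized by shape and scale, so the rate from (7) enters as its reciprocal.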

The difficult part is to generate $H$ from the conditional BB posterior. To generate $H$, we use the Gibbs sampler algorithm with the acceptance-rejection (AR) sampling technique (Ripley 2006). Note that the empirical likelihood depends on $H$ only through $H(t_1), \ldots, H(t_q)$, and so it suffices to generate $W = (W_1, \ldots, W_q)$ from the conditional posterior, where $W_k = H(t_k) - H(t_{k-1})$ and $H(t_0) = 0$. In applying the Gibbs sampler algorithm to generate $W$, we need to generate $W_k$ from its conditional distribution given $\gamma_0$, $\gamma_1$, $\lambda$, $W^{(-k)}$ and data, where $W^{(-k)} = (W_1, \ldots, W_{k-1}, W_{k+1}, \ldots, W_q)$.

Identifiability issues arise. First, $\gamma_0$ and $W_1$ are not identifiable in the empirical likelihood, whereas $\gamma_0 + W_1$ is identifiable. Other unidentifiable quantities in the empirical likelihood are $W_k$ for $k > p$ where

$$p = \min\{\max\{x_{1i} : \delta_{1i} = 1\}, \max\{x_{2i} : \delta_{2i} = 1\}\}$$

(with a slight abuse of notation, $p$ is also used below for the index of this time point among $t_1, \ldots, t_q$). Note that $W_k$ for $k > p$ are not used in the empirical likelihood when $p = \max\{x_{2i} : \delta_{2i} = 1\}$, whereas they affect the empirical likelihood through $\gamma_0 + \gamma_1 W_k + \log \lambda_k$ when $p = \max\{x_{1i} : \delta_{1i} = 1\}$, in which case $W_k$ and $\lambda_k$ are not identifiable by the empirical likelihood. To avoid these identifiability issues, we let $W_1 = 0$ and $W_k = 0$ for $k > p$, which is equivalent to using $H_0^{*}$ instead of $H_0$ in the prior parameter of the gamma process, where $H_0^{*}(t) = 0$ for $t < t_1$, $H_0^{*}(t) = H_0(t) - H_0(t_1)$ for $t_1 \le t \le t_p$ and $H_0^{*}(t) = H_0(t_p)$ for $t > t_p$.

We now explain how to generate $W_l$ from its conditional posterior distribution. Let $H_k^{(-l)} = H(t_k) - W_l$. Then, the conditional posterior distribution of $W_l$ for $2 \le l \le p$ given others $= (\gamma_0, \gamma_1, \lambda, W^{(-l)}, \text{Data})$ is given as

$$\pi(W_l \mid \text{others}) \propto \exp\Big(W_l \gamma_1 \sum_{k=l}^{q} \sum_{(2,i) \in D(t_k)} 1\Big) \exp\Big(-\exp(\gamma_1 W_l) \sum_{k=l}^{q} \lambda_k \exp\big(\gamma_0 + \gamma_1 H_k^{(-l)}\big) \sum_{(2,i) \in R(t_k)} 1\Big) \times W_l^{v_l - 1} \exp(-c W_l)\, I(W_l \ge 0),$$

where $v_l = c\,(H_0^{*}(t_l) - H_0^{*}(t_{l-1}))$. Let

$$\phi_l = \sum_{k=l}^{q} \sum_{(2,i) \in D(t_k)} 1$$

and

$$\psi_l = \sum_{k=l}^{q} \lambda_k \exp\big(\gamma_0 + \gamma_1 H_k^{(-l)}\big) \sum_{(2,i) \in R(t_k)} 1.$$

Then, the conditional posterior distribution of $W_l$ is simplified as

$$\pi(W_l \mid \text{others}) \propto h_l(\exp(\gamma_1 W_l))\, W_l^{v_l - 1} \exp(-c W_l)\, I(W_l \ge 0), \tag{8}$$

where

$$h_l(y) = y^{\phi_l} \exp(-\psi_l y). \tag{9}$$

Note that the maximum of $h_l(\exp(\gamma_1 W_l))$, say $h_l^{*}$, on $W_l \in (0, \infty)$ can be easily calculated, and we can easily generate a random number from the gamma distribution. Hence, we can use the AR sampling technique for generating $W_l$ from (8) as follows:

1. Generate $W^{*} \sim \text{Gamma}(v_l, c)$, where $\text{Gamma}(a, b)$ is the gamma distribution with mean $a/b$ and variance $a/b^2$.

2. Generate $U \sim \text{Uniform}(0, 1)$.

3. Let $W_l = W^{*}$ if $h_l(\exp(\gamma_1 W^{*}))/h_l^{*} \ge U$. Otherwise, go to 1.
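The three AR steps above can be sketched as follows, working on the log scale for numerical stability. The arguments `phi` and `psi` stand for the exponent and the rate appearing in (9), `v` and `c` are the shape and rate of the gamma proposal, and `psi > 0` is assumed. All function names are illustrative, and the closed-form bound is derived here from (9) rather than quoted from the paper.

```python
import math
import random

def log_h_at(w, gamma1, phi, psi):
    """log h_l(exp(gamma1 * w)) = phi * gamma1 * w - psi * exp(gamma1 * w)."""
    return phi * gamma1 * w - psi * math.exp(gamma1 * w)

def log_h_max(gamma1, phi, psi):
    """Supremum of log h_l(exp(gamma1 * W)) over W >= 0."""
    if gamma1 == 0:
        return -psi                          # y = exp(0) = 1 for every W
    if phi == 0:
        # h is decreasing in y: best at y = 1 (gamma1 = 1), limit y -> 0 (gamma1 = -1).
        return -psi if gamma1 == 1 else 0.0
    y_star = phi / psi                       # unconstrained maximizer of h_l(y)
    # y = exp(gamma1 * W) ranges over [1, inf) if gamma1 = 1 and (0, 1] if gamma1 = -1.
    y_best = max(y_star, 1.0) if gamma1 == 1 else min(y_star, 1.0)
    return phi * math.log(y_best) - psi * y_best

def sample_increment(gamma1, phi, psi, v, c, rng):
    """Acceptance-rejection draw of W_l from (8) with a Gamma(v, c) proposal."""
    bound = log_h_max(gamma1, phi, psi)
    while True:
        w = rng.gammavariate(v, 1.0 / c)     # step 1: proposal (shape v, rate c)
        u = rng.random()                     # step 2: uniform on [0, 1)
        # step 3: accept when h_l(exp(gamma1 * w)) / h_l^* >= u.
        if u == 0.0 or math.log(u) <= log_h_at(w, gamma1, phi, psi) - bound:
            return w
```

When $\gamma_1 = 0$ the ratio in step 3 equals 1, so every proposal is accepted and $W_l$ is simply drawn from its gamma prior.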

The MCMC algorithm for the BB posterior can be summarized as follows:

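Sampling $\gamma_0$ given $\gamma_1$, $\lambda$, $H$ and data: We use the random-walk MH algorithm. Let $\gamma_0^{*}$ be a candidate value generated from a random-walk kernel $q(\gamma_0, \gamma_0^{*})$. Then, the acceptance probability is the minimum of 1 and

$$\frac{\pi(\gamma_0^{*} \mid \gamma_1, \lambda, H, \text{Data})\, q(\gamma_0^{*}, \gamma_0)}{\pi(\gamma_0 \mid \gamma_1, \lambda, H, \text{Data})\, q(\gamma_0, \gamma_0^{*})},$$

where $\pi(\gamma_0 \mid \gamma_1, \lambda, H, \text{Data})$ is in (5).

Sampling $\gamma_1$ given $\gamma_0$, $\lambda$, $H$ and data: We generate $\gamma_1 = h$ for $h \in \{-1, 0, 1\}$ with probability $p_h$, where

$$p_h = \frac{\pi(h \mid \gamma_0, \lambda, H, \text{Data})}{\sum_{l \in \{-1, 0, 1\}} \pi(l \mid \gamma_0, \lambda, H, \text{Data})}$$

and $\pi(h \mid \gamma_0, \lambda, H, \text{Data})$ is in (6).

Sampling $\lambda$ given $\gamma_0$, $\gamma_1$, $H$ and data: For $k = 1, \ldots, q$, generate $\lambda_k$ from $\text{Gamma}(\alpha_k, \beta_k)$, where $\alpha_k = d(t_k)$ and $\beta_k$ is in (7).

Sampling $H$ given $\gamma_0$, $\gamma_1$, $\lambda$ and data: Let $W_k = H(t_k) - H(t_{k-1})$. Let $W_1 = 0$ and $W_k = 0$ for $k > p$. For $l = 2, \ldots, p$:

1. Generate $W^{*} \sim \text{Gamma}(v_l, c)$.

2. Generate $U \sim \text{Uniform}(0, 1)$.

3. Let $W_l = W^{*}$ if $h_l(\exp(\gamma_1 W^{*}))/h_l^{*} \ge U$, where $h_l$ is in (9) and $h_l^{*}$ is its maximum. Otherwise, go to 1.

Let $H(t) = \sum_{k:\, t_k \le t} W_k$.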
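As one concrete instance of the steps summarized above, the update of $\gamma_1$ only requires evaluating (6) at the three support points and normalizing. The sketch below assumes precomputed per-event-time summaries: `d2[k]` and `r2[k]` are the numbers of group-2 failures at $t_k$ and of group-2 subjects at risk at $t_k$. These argument names are illustrative.

```python
import math
import random

def log_cond_gamma1(g1, gamma0, lam, H, d2, r2):
    """log of (6) up to a constant, with the uniform prior on {-1, 0, 1}."""
    linear = g1 * sum(h * d for h, d in zip(H, d2))
    penalty = math.exp(gamma0) * sum(
        l * math.exp(g1 * h) * r for l, h, r in zip(lam, H, r2))
    return linear - penalty

def sample_gamma1(gamma0, lam, H, d2, r2, rng):
    """Draw gamma1 from its discrete conditional; also return the probabilities."""
    support = (-1, 0, 1)
    logs = [log_cond_gamma1(g, gamma0, lam, H, d2, r2) for g in support]
    top = max(logs)                                   # stabilise the exponentials
    weights = [math.exp(v - top) for v in logs]
    probs = [w / sum(weights) for w in weights]
    return rng.choices(support, weights=probs, k=1)[0], probs
```

Subtracting the largest log value before exponentiating avoids underflow when the three conditional values differ by many orders of magnitude.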

4 Numerical experiments

In this section, we illustrate the proposed model on various data sets. For prior parameters, we let $\sigma_0^2 = 10$, $H_0(t) = \log(1 + t)$ and $c = 1$.

4.1 Simulation 1

We let $n_1 = n_2 = 50$ and generated survival times of the first group from the exponential distribution with mean 20, and those of the second group from the exponential distribution with mean 30. Censoring times are generated from the exponential distribution such that the censoring probability is 0.3. Note that the model used for the simulation satisfies the proportional hazards assumption. We obtained the posterior distributions of $\theta$ using the proposed MCMC algorithm. We iterated the MCMC algorithm 100,000 times after a burn-in period of 10,000 iterations. Then, we collected 2,000 samples at every 50th iteration after the burn-in for further analysis. We used a relatively extreme thinning (every 50th iteration) to make the samples almost independent, making further analysis easier.
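A data set of this kind can be generated along the following lines. This is only a sketch: it assumes that each group has its own exponential censoring rate chosen so that $P(C < T) = 0.3$, which is one way to obtain the stated censoring probability and may differ from the authors' exact set-up; the seed and function name are arbitrary.

```python
import math
import random

def simulate_group(n, mean_survival, censor_prob, rng):
    """Exponential survival and exponential censoring with P(censored) = censor_prob."""
    rate_t = 1.0 / mean_survival
    # For independent exponentials, P(C < T) = rate_c / (rate_c + rate_t).
    rate_c = rate_t * censor_prob / (1.0 - censor_prob)
    sample = []
    for _ in range(n):
        t = rng.expovariate(rate_t)
        c = rng.expovariate(rate_c)
        sample.append((min(t, c), 1 if t <= c else 0))
    return sample

rng = random.Random(2010)
group1 = simulate_group(50, 20.0, 0.3, rng)
group2 = simulate_group(50, 30.0, 0.3, rng)
true_gamma0 = math.log((1.0 / 30.0) / (1.0 / 20.0))    # log hazard ratio
kept = list(range(10_000 + 50, 110_000 + 1, 50))       # thinned iteration indices
```

Under this set-up the true log hazard ratio is $\log(2/3) \approx -0.4055$, and keeping every 50th of the 100,000 post-burn-in iterations leaves exactly 2,000 samples.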

Figure 1 gives the traceplots and histograms of $\gamma_0$, $H(t)$ and $\Lambda(t)$ at $t = 20$ (the mean survival time of the first group) generated from the MCMC algorithm. The proposed MCMC algorithm converges well, and the posterior densities are well behaved (at least, they are unimodal). Figure 2a shows how the empirical probability of $\gamma_1 = 0$, calculated from the samples generated by the MCMC algorithm, converges. The two dashed lines in the figure represent the 95% confidence interval obtained from the samples, assuming that they are independent. With the exception of the early stage of the iteration, the empirical probabilities lie inside the confidence limits, which implies that the MCMC algorithm converges well to its stationary distribution for $\gamma_1$, too. Figure 2b displays the posterior probabilities of $\gamma_1$, which support the proportional hazards model because the probability is largest at $\gamma_1 = 0$.

Figure 3 shows the acceptance probability of $W_k$ for $k = 2, \ldots, p$ in the AR sampling step inside the MCMC algorithm. The smallest acceptance probability is around 30%, which implies that the AR sampling step does not significantly increase the overall computing time of the MCMC algorithm.
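A convergence check like the one described for Fig. 2a can be sketched as a running proportion together with fixed limits computed from all retained draws. The normal-approximation binomial interval used here is an assumption; the text only states that the limits treat the samples as independent.

```python
import math

def running_probability(draws, value):
    """Running share of draws equal to `value`, one entry per retained sample."""
    hits, out = 0, []
    for n, g in enumerate(draws, start=1):
        hits += (g == value)
        out.append(hits / n)
    return out

def binomial_limits(draws, value, z=1.96):
    """Normal-approximation 95% limits from all draws, treated as independent."""
    n = len(draws)
    p = sum(1 for g in draws if g == value) / n
    half = z * math.sqrt(p * (1.0 - p) / n)
    return p - half, p + half
```

Plotting the output of `running_probability` against the sample index, with the two limits as horizontal lines, gives a display of the same kind as the one described above.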

Fig. 1 Panel a shows the traceplots of $\gamma_0$, $H(20)$ and $\Lambda(20)$ against the number of iterations, and panel b shows the corresponding histograms

Table 1 compares the Bayes estimator and 90% (equal-tail) posterior probability interval of $\gamma_0$ with those obtained from the proportional hazards model (i.e., $\gamma_1 = 0$) and the corresponding frequentist counterpart. The posterior interval based on the proposed model is much wider than the other two intervals. This is because there is additional uncertainty in estimating $H$ for the proposed model. However, all intervals contain the true value $-0.4055$.

We conducted additional simulations to investigate the effect of the censoring probability and sample sizes on the posterior distribution. Table 2 presents the posterior distributions of $\gamma_1$ for various values of the censoring probability and sample sizes. The results are stable and consistently support the proportional hazards model.


Fig. 2 Panel a shows the traceplots of the empirical posterior probability of $\gamma_1 = 0$ (solid) with the 95% confidence limits (dashed), and panel b presents the posterior probabilities of $\gamma_1$

Fig. 3 Acceptance probabilities of $W_l$ for $l = 2, \ldots, p$ in the AR algorithm

Table 1 Bayes estimator and 90% posterior probability interval of $\gamma_0$ for the proposed model (MHR, monotone hazard ratio), together with those obtained from the proportional hazards (PH) model and the corresponding frequentist results

Method                   Point estimate   90% Interval
BB with the MHR model    -0.6438          (-1.1583, -0.0898)
BB with the PH model     -0.6638          (-1.0805, -0.2382)
MLE with the PH model    -0.6600          (-1.0716, -0.2484)

Table 2 The posterior probabilities of $\gamma_1 = -1$, 0 and 1 for various values of the censoring probability and sample sizes in Simulation 1

(n1, n2)     30% censoring               50% censoring
(50,50)      (0.0610, 0.8955, 0.0435)    (0.0570, 0.8120, 0.1310)
(100,50)     (0.2215, 0.7595, 0.0190)    (0.0555, 0.7980, 0.1465)
(50,100)     (0.0615, 0.6580, 0.3255)    (0.0575, 0.7460, 0.1965)
(100,100)    (0.0010, 0.7590, 0.2400)    (0.0315, 0.8855, 0.0830)


Fig. 4 Panel a shows the posterior probability of $\gamma_1$, and panels b and c present the Bayes estimators of the log hazard ratio and $\Lambda$ with the pointwise 90% probability bands (PB) and true functions, respectively

Table 3 The posterior probabilities of $\gamma_1 = -1$, 0 and 1 for various values of the censoring probability and sample sizes in Simulation 2

(n1, n2) 30% censoring 50% censoring

(50,50) (0.9550, 0.0445, 0.0005) (0.9505, 0.0465, 0.0030)

(100,50) (1.0000, 0.0000, 0.0000) (0.9975, 0.0025, 0.0000)

(50,100) (0.9915, 0.0085, 0.0000) (0.8630, 0.1345, 0.0025)

(100,100) (1.0000, 0.0000, 0.0000) (0.9995, 0.0005, 0.0000)

4.2 Simulation 2

We let $\lambda_2(t) = \eta \kappa t^{\kappa - 1} \lambda_1(t)$. The hazard ratio is increasing monotonically when $\kappa > 1$ and decreasing when $\kappa < 1$. We set $\kappa = 0.5$ and $\lambda_1(t) = 1/20$ to have a monotonically decreasing hazard ratio, and $\eta = 20/\sqrt{10}$ to make the mean survival time of the second group equal to 20. The other settings, such as sample sizes, censoring probability and the number of iterations of the MCMC algorithm, are the same as those for Simulation 1.
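The stated mean of 20 can be verified numerically. The sketch below assumes that the second-group hazard is a scale constant times a shape constant (0.5) times $t$ raised to the shape minus one, times the first-group hazard 1/20, which makes the second-group survival time Weibull; the variable names `eta` and `kappa` for the scale and shape constants are chosen here for illustration.

```python
import math
import random

kappa = 0.5                        # shape constant (hazard ratio decreasing)
eta = 20.0 / math.sqrt(10.0)       # scale constant
lambda1 = 1.0 / 20.0               # constant hazard of the first group

def cumulative_hazard2(t):
    """Integral of eta * kappa * s**(kappa - 1) * lambda1 over (0, t]."""
    return eta * lambda1 * t ** kappa

# Weibull mean: Gamma(1 + 1/kappa) / (eta * lambda1) ** (1 / kappa).
mean2 = math.gamma(1.0 + 1.0 / kappa) / (eta * lambda1) ** (1.0 / kappa)

def draw_survival_time(rng):
    """Inverse-transform draw: solve cumulative_hazard2(T) = E with E ~ Exp(1)."""
    e = rng.expovariate(1.0)
    return (e / (eta * lambda1)) ** (1.0 / kappa)
```

Here `mean2` evaluates to 20 up to rounding, since $\Gamma(3) = 2$ and the squared scale term equals $1/10$.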

The posterior probability of $\gamma_1$ is given in Fig. 4a, which strongly supports the true model, a monotonically decreasing hazard ratio. Figure 4b and c present the Bayes estimator and corresponding pointwise 90% posterior probability bands of the log hazard ratio ($\gamma_0 + \gamma_1 H(t)$) and cumulative baseline hazard function $\Lambda(t)$ with the true ones, respectively. Note that the true functions lie inside the probability bands, implying that the proposed method estimates the monotone hazard ratio and cumulative baseline hazard function well.

As is done for Simulation 1, Table 3 presents the posterior probabilities of $\gamma_1$ for various values of the censoring probability and sample sizes. All of the results strongly indicate that the hazard ratio is decreasing.

4.3 Prior sensitivity

Priors need to be specified for three parameters $\gamma_0$, $\gamma_1$ and $H$. Since $\gamma_1$ has a value among $\{-1, 0, 1\}$, the uniform prior is a natural one. For $\gamma_0$, unless the prior variance

Fig. 5 Panels a and b show the traceplots of the empirical posterior probability of $\gamma_1 = 0$ and $\gamma_1 = -1$ (solid) with the 95% confidence limits (dashed) for the Leukemia and Ovarian data sets, respectively

As is done in Simulation 1, Fig. 5 presents the traceplots of the empirical posterior probabilities of $\gamma_1$, and Figs. 6 and 7 present the traceplots and corresponding histograms of $\gamma_0$, $H(10)$ and $\Lambda(10)$ for the Leukemia and Ovarian data sets. There appears to be no problem with the convergence of the MCMC algorithms.

Figure 8 presents the posterior probabilities of $\gamma_1$ for the two data sets, and Table 5 gives the p-values of the three frequentist test statistics for the proportional hazards model against the monotone hazard ratio alternative, as well as the DIC (deviance information criterion, Spiegelhalter et al. 2002) values and the effective numbers of parameters ($p_D$) of the proposed model with $\gamma_1 = -1$, 0 and 1, respectively. GS1 and GS2 in Table 5 denote the test statistics proposed by Gill and Schumacher (1987), with the Gehan versus log-rank weights and Prentice versus log-rank weights, respectively, and DS is the test statistic proposed by Deshpande and Sengupta (1995). The DIC is calculated based on the marginal likelihood obtained by integrating out the baseline hazard function with respect to the prior. Because we used the BB prior, the resulting marginal likelihood becomes the partial likelihood. Note that the DIC is an extension of the AIC (Akaike information criterion), and the AIC works well with the partial likelihood (Hjort and Claeskens 2006). Hence, it is reasonable to calculate the DIC with the marginal likelihood. The five methods (the posterior probability, three p-values, and DIC) indicate that the proportional hazards assumption is valid for the Leukemia data set, but not for the Ovarian data set.
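For readers who want to reproduce this kind of comparison, the DIC and $p_D$ of Spiegelhalter et al. (2002) can be computed from MCMC output as sketched below. The sketch assumes that the deviance, minus twice the log marginal (partial) likelihood, has already been evaluated at each retained draw and at the posterior mean of the parameters; how those values are obtained is left open, and the function name is illustrative.

```python
def dic_from_deviances(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, pD) using pD = mean deviance - deviance at the posterior mean."""
    mean_dev = sum(deviance_draws) / len(deviance_draws)
    p_d = mean_dev - deviance_at_posterior_mean
    return mean_dev + p_d, p_d
```

For instance, deviance draws of 10, 12 and 14 with a deviance of 11 at the posterior mean give $p_D = 1$ and a DIC of 13.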

Remark When we are interested in the validity of the monotone hazard ratio assumption, the frequentist tests are not valid because the rejection of the frequentist tests does not necessarily mean that the monotone hazard ratio is valid. In contrast, the Bayesian results, namely the posterior probability of $\gamma_1$ and the DIC values, directly confirm whether the assumption of the monotone hazard ratio is valid.

Remark Along with the DIC values for $\gamma_1 = -1$, 0 and 1, we calculated the DIC value of the model where $\gamma_1$ is random. The DIC value with random $\gamma_1$ would be expected to be smaller than that with $\gamma_1 = 0$ when the proportional hazards assumption is valid. The DIC and $p_D$ values with random $\gamma_1$ for the Leukemia and Ovarian data sets are 175.23, 1.20, and 128.41, 2.16, respectively, which do not confirm our conjecture. We find, however, that the DIC values are unstable, particularly when the proportional hazards assumption is valid. Note that the difference of the DIC values between $\gamma_1 = 0$ and 1 for the Leukemia data set is very small, whereas the posterior probabilities are much different. We think that the DIC may not be appropriate for our model because our model is semiparametric (i.e., the hazard ratio is completely unspecified), and the DIC is developed mainly for parametric models where the maximum likelihood estimator is asymptotically Gaussian. We leave this problem as future work.

Fig. 6 The Leukemia data set results: panel a presents the traceplots of $\gamma_0$, $H(10)$, and $\Lambda(10)$, and panel b shows the corresponding histograms

Fig. 7 The Ovarian data set results: panel a presents the traceplots of $\gamma_0$, $H(10)$, and $\Lambda(10)$, and panel b shows the corresponding histograms

For the Ovarian data set, in which the proportional hazards assumption is rejected against the monotone hazard ratio, we draw the Bayes estimator of the hazard ratio with the pointwise 90% probability bands in Fig. 9a. The figure suggests that the hazard ratio of the second group (stage II) over the first group (stage IIA) decreases steadily. We draw the Bayes estimators of the two cumulative hazard functions $\Lambda_1$ and $\Lambda_2$ with their pointwise 90% probability bands and the empirical cumulative hazard (ECH) functions in Fig. 9b and c, respectively. The Bayes estimators and ECH functions are close and are located inside the probability bands.

Fig. 8 Panels a and b present the posterior probabilities of $\gamma_1$ for the Leukemia and Ovarian data sets, respectively

Table 5 P-values of the three frequentist test statistics for the proportional hazards against monotone hazard ratio, and the DIC and $p_D$ values for the proposed model with $\gamma_1 = -1, 0, 1$, respectively

            p-values                      DIC, pD
            GS1      GS2      DS         gamma1 = -1     gamma1 = 0      gamma1 = 1
Leukemia    0.6897   0.6807   0.1660     176.73, 1.39    174.64, 0.92    174.72, 1.28
Ovarian     0.0571   0.0507   0.0298     127.52, 2.03    130.73, 1.05    133.46, 1.67

Fig. 9 Part a draws the Bayes estimator of $H(t)$ with its pointwise 90% probability band, part b for $\Lambda_1$ and part c for $\Lambda_2$. Parts b and c also show the empirical cumulative hazard (ECH) functions

5 Concluding remarks

We proposed a Bayesian approach for estimating the two hazard functions under the

monotone hazard ratio constraint and developed an efficient MCMC algorithm. We

demonstrated with simulated and real data sets that the MCMC algorithm, based on

the BB approach, converges well and provides reliable results.

In this paper, we modeled the monotone hazard ratio nonparametrically. An alternative model is a piecewise constant monotone hazard ratio, which provides information about when the hazard ratio changes. The proposed BB approach can be easily modified to this model to save significant computational costs.

The proposed model can be extended to a case where there are more than two hazard functions. Suppose there are three hazard functions $\lambda_1$, $\lambda_2$ and $\lambda_3$ with $\lambda_2/\lambda_1$ and $\lambda_3/\lambda_2$ increasing monotonically. We can model $\lambda_2$ and $\lambda_3$ by

$$\lambda_2(t) = \exp\big(\gamma_0^{(2)} + H^{(2)}(t)\big)\,\lambda_1(t)$$

and

$$\lambda_3(t) = \exp\big(\gamma_0^{(3)} + H^{(2)}(t) + H^{(3)}(t)\big)\,\lambda_1(t),$$

where $H^{(2)}$ and $H^{(3)}$ are two independent gamma processes a priori. The proposed MCMC algorithm can be easily modified for this model as well.
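That this construction enforces both orderings can be seen by writing out the two ratios, a short check that uses only the two model equations and the fact that gamma processes have nondecreasing sample paths:

```latex
\frac{\lambda_2(t)}{\lambda_1(t)} = \exp\big(\gamma_0^{(2)} + H^{(2)}(t)\big),
\qquad
\frac{\lambda_3(t)}{\lambda_2(t)} = \exp\big(\gamma_0^{(3)} - \gamma_0^{(2)} + H^{(3)}(t)\big).
```

Both exponents are nondecreasing in $t$ because $H^{(2)}$ and $H^{(3)}$ are nondecreasing, so both ratios increase monotonically (in the weak sense).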

Studying asymptotic properties of the posterior distribution is worth pursuing. Without $H$, Kim and Lee (2003b) and Kim (2006) proved that the convergence rate of the BB and full Bayesian posteriors is $1/\sqrt{n}$. We think, however, that the convergence rate of the posterior of $H$ to the true hazard ratio would be slower than $1/\sqrt{n}$, as the optimal convergence rate for the hazard function is typically slower than $1/\sqrt{n}$. This conjecture would partly explain the wider probability interval of $\gamma_0$ for the proposed model compared to the results for the proportional hazards model in Table 1 and the wider probability band for $\Lambda_2$ in Fig. 9c compared to that of $\Lambda_1$ in Fig. 9b.

Acknowledgment This work was supported by the Korea Science and Engineering Foundation (KOSEF)

grant funded by the Korea government (MEST) (R01-2007-000-20045-0(2008)).

References

Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical methods based on counting processes. Springer, New York

Arjas E, Gasbarra D (1996) Bayesian inference of survival probabilities under stochastic ordering constraints. J Am Stat Assoc 91:1101–1109

Brunk HD, Franck WE, Hanson DL, Hogg RV (1966) Maximum likelihood estimation of the distribution of two stochastically ordered random variables. J Am Stat Assoc 61:1067–1080

Deshpande JV, Sengupta D (1995) Testing for the hypothesis of proportional hazards in two population. Biometrika 82:251–261

Dykstra RL (1982) Maximum likelihood estimation of the survival functions of stochastically ordered random variables. J Am Stat Assoc 77:621–628

Dykstra RL, Kochar S, Robertson T (1991) Statistical inference for uniform stochastic ordering in several population. Ann Stat 19:870–888

Gelfand AE, Kottas A (2001) Nonparametric Bayesian modeling for stochastic order. Ann Inst Stat Math 53:865–876

Gill R, Schumacher M (1987) A simple test of the proportional hazards assumption. Biometrika 74:289–300

Hjort NL, Claeskens G (2006) Focussed information criteria and model averaging for Cox's hazard regression model. J Am Stat Assoc 101:1449–1464

Ibrahim JG, Chen MH, Sinha D (2001) Bayesian survival analysis. Springer-Verlag, New York

Kalbfleisch JD (1978) Nonparametric Bayesian analysis of survival time data. J R Stat Soc Ser B 40:214–221

Kim Y, Lee J (2003a) Bayesian analysis of proportional hazard models. Ann Stat 31:493–511

Kim Y, Lee J (2003b) Bayesian bootstrap for proportional hazards models. Ann Stat 31:1905–1922

Kim Y (2006) The Bernstein-von Mises theorem for the proportional hazard model. Ann Stat 34:1678–1700

Laud PW, Damien P, Smith AFM (1998) Bayesian nonparametric and covariate analysis of failure time data. In: Practical nonparametric and semiparametric Bayesian statistics. Springer, New York, pp 213–225

Lo AY (1982) Bayesian nonparametric statistical inference for Poisson point processes. Z Wahrsch Verw Gebiete 59:55–66

Mukerjee H (1996) Estimation of survival functions under uniform stochastic ordering. J Am Stat Assoc 91:1684–1689

Praestgaard JT, Huang J (1996) Asymptotic theory for nonparametric estimation of survival curves under order restriction. Ann Stat 24:1679–1716

Ripley BD (2006) Stochastic simulation. Wiley, New York

Sengupta D, Bhattacharjee A, Rajeev V (1998) Testing for the proportionality of hazards in two samples against the increasing cumulative hazard ratio alternative. Scand J Stat 25:637–647

Spiegelhalter DJ, Best N, Carlin B, Linde A (2002) Bayesian measures of model complexity and fit (with discussion). J R Stat Soc Ser B 64:583–639