Introduction to Frailty Models

STA635 Project by Benjamin Hall

Cox proportional hazards model

Model: $h_i(t) = h_0(t)\, e^{\beta_1 X_1 + \dots + \beta_k X_k}$ is the hazard function of the ith individual

Assumption: The hazard function for each individual is proportional to the baseline hazard, $h_0(t)$. This assumption implies that, apart from the shared baseline, the hazard function is fully determined by the covariate vector.
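A short worked equation makes the assumption concrete. Taking two individuals i and j under the Cox model, with covariate values written $X_{i1},\dots,X_{ik}$ and $X_{j1},\dots,X_{jk}$ (the double index is notation introduced here for illustration), the baseline hazard cancels and the hazard ratio does not depend on t:

```latex
\[
\frac{h_i(t)}{h_j(t)}
  = \frac{h_0(t)\, e^{\beta_1 X_{i1} + \dots + \beta_k X_{ik}}}
         {h_0(t)\, e^{\beta_1 X_{j1} + \dots + \beta_k X_{jk}}}
  = e^{\beta_1 (X_{i1} - X_{j1}) + \dots + \beta_k (X_{ik} - X_{jk})}
\]
```

If a relevant covariate is left out of the vector, this cancellation no longer has to hold for the covariates that remain.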

Problem: There may be unobserved covariates that cause this assumption to be violated.

Simulation Example: Unobserved covariate

Consider a situation with the following population:

Group   Proportion of Population   Hazard Rate with placebo   Hazard Rate with Drug A
1       40%                        1                          .5
2       40%                        2                          1
3       20%                        10                         8

Drug A lowers the hazard rate in every group, so it is effective for the entire population.

But what happens in the Cox model if the group is unobservable?

Simulation Example, continued

Let's simulate this example and apply the Cox model:

Have R simulate 100 people according to the previous table's probabilities and randomly assign them to treatment or placebo.

For each person, have R simulate the number of incidents within a period of length 1. At the end of the period, right-censoring occurs.
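The per-person step can be sketched compactly as follows. This is an illustrative alternative, not the code used for the slides: `simulate_person` and its arguments are hypothetical names, and the sketch assumes exponential gap times between incidents, with the final gap truncated at the end of the window and marked as censored.

```r
# Hypothetical helper: gap times for one person with constant hazard `rate`,
# observed over a window of length `window`.
simulate_person <- function(rate, window = 1) {
  gaps <- numeric(0)
  # Keep drawing exponential gap times until the window is used up
  while (sum(gaps) < window) {
    gaps <- c(gaps, rexp(1, rate))
  }
  n <- length(gaps)
  # The last gap runs past the window: truncate it and mark it as censored
  gaps[n] <- window - sum(gaps[-n])
  data.frame(time = gaps, status = c(rep(1, n - 1), 0))
}

set.seed(42)
simulate_person(rate = 2)
```

Each returned row is one gap time; `status` is 1 for an observed incident and 0 for the censored final interval, so the times for one person add up to the window length.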

Simulation Example, continued

Here is some of the data generated by R (see final slide for code):

       id group treat time status
[1,]    1     3     0  .38      1
[2,]    1     3     0  .07      1
[3,]    1     3     0  .23      1
[4,]    1     3     0  .15      1
[5,]    1     3     0  .11      1
[6,]    1     3     0  .89      0
[7,]    2     3     1  .22      1
[8,]    2     3     1  .04      1
...   ...   ...   ...  ...    ...

Simulation Example, continued

Now we run coxph on our data:

> myfit1
Call:
coxph(formula = Surv(time, status) ~ treat)

        coef exp(coef) se(coef)     z    p
treat -0.128      0.88     0.11 -1.17 0.24

Likelihood ratio test=1.37  on 1 df, p=0.242  n= 445

Notice that the LRT has a p-value of .242, which is not significant. But we know that the treatment is effective for everyone. What is happening?

Simulation Example, continued

The problem is that we have heterogeneity in the data due to the unobservable groups. Since we cannot include group in our model, the assumption of proportional hazards is violated.
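One way to see the violation is to compute the treatment hazard ratio that remains once group is averaged out. The sketch below makes a simplifying assumption: each person has a single exponential event time, whereas the simulation records repeated gap times. The name `marginal_hazard` is illustrative.

```r
# Population from the table: proportions and hazard rates per group
p       <- c(0.4, 0.4, 0.2)
placebo <- c(1, 2, 10)
drug    <- c(0.5, 1, 8)

# Marginal hazard at time t when group is unobserved: the density of the
# mixture of exponentials divided by its survival function
marginal_hazard <- function(t, rate, p) {
  sapply(t, function(u) {
    sum(p * rate * exp(-rate * u)) / sum(p * exp(-rate * u))
  })
}

t  <- c(0, 0.25, 0.5, 1)
hr <- marginal_hazard(t, drug, p) / marginal_hazard(t, placebo, p)
round(hr, 3)
```

Within each group the ratio of Drug A to placebo is constant (0.5, 0.5 and 0.8), but the marginal ratio changes with t because the high-hazard group has its events first, so the mix of groups still at risk shifts over time.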

What can we do to solve this problem? Use a frailty model.

Frailty Model

Frailty models can help explain the unaccounted-for heterogeneity.

Frailty Model: $h_i(t) = z\, h_0(t)\, e^{\beta_1 X_1 + \dots + \beta_k X_k}$ is the hazard function of the ith individual

The distribution of z is specified to be, say, Gamma. (Note: z must be non-negative since the hazard is non-negative.)

In this situation the shared frailty model is appropriate; that is, multiple observations of the same individual always have the same value of z.
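A minimal sketch of what the Gamma assumption implies. It assumes a frailty with mean 1 and variance `theta`, a constant baseline hazard `h0`, and no covariates; these values are chosen here for illustration and are not taken from the slides. Under those assumptions the population survival function is $(1 + \theta h_0 t)^{-1/\theta}$, which the simulation can be compared against.

```r
set.seed(1)
theta <- 0.5    # assumed frailty variance
h0    <- 1      # assumed constant baseline hazard
n     <- 1e5

# Gamma frailty with mean 1 and variance theta
z <- rgamma(n, shape = 1 / theta, rate = 1 / theta)

# Given z, the event time is exponential with hazard z * h0
time <- rexp(n, rate = z * h0)

# Compare simulated and theoretical population survival at t0
t0          <- 1
empirical   <- mean(time > t0)
theoretical <- (1 + theta * h0 * t0)^(-1 / theta)
c(empirical = empirical, theoretical = theoretical)
```

The individual hazards are constant, yet the population hazard $h_0 / (1 + \theta h_0 t)$ decreases over time, which is the kind of heterogeneity effect the frailty term is meant to absorb.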

Frailty Model in R

Let's apply the frailty model to our simulated data:

> myfit2
Call:
coxph(formula = Surv(time, status) ~ treat + frailty(id))

            coef   se(coef) se2   Chisq DF   p
treat       -0.147 0.160    0.111  0.85  1.0 3.6e-01
frailty(id)                       93.89 43.5 1.4e-05

Iterations: 5 outer, 17 Newton-Raphson
     Variance of random effect= 0.294   I-likelihood = -1887.9
Degrees of freedom for terms=  0.5 43.5
Likelihood ratio test=117  on 44.0 df, p=1.37e-08  n= 445

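Notice that the LRT now has a highly significant p-value (1.37e-08). This test covers the whole model, including the frailty term (p = 1.4e-05); the treatment coefficient on its own still has p = 0.36.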

Frailty Model in R

Now let's try applying the frailty model to a real data set, the kidney data set.

Here are the results for the regular Cox model:

> kfit1
Call:
coxph(formula = Surv(time, status) ~ age + sex, data = kidney)

        coef exp(coef) se(coef)      z      p
age  0.00203     1.002  0.00925  0.220 0.8300
sex -0.82931     0.436  0.29895 -2.774 0.0055

Likelihood ratio test=7.12  on 2 df, p=0.0285  n= 76

Here the LRT is significant with a p-value of .0285 even without considering frailty.

Frailty Model in R

However, a frailty model seems applicable in this situation since there are multiple observations per person (two infection recurrence times per patient). The fit below includes a frailty term:

> kfit2
Call:
coxph(formula = Surv(time, status) ~ age + sex + frailty(id), data = kidney)

             coef     se(coef) se2    Chisq DF p
age           0.00525 0.0119   0.0088  0.2   1 0.66000
sex          -1.58749 0.4606   0.3520 11.9   1 0.00057
frailty(id)                           23.1  13 0.04000

Iterations: 7 outer, 49 Newton-Raphson
     Variance of random effect= 0.412   I-likelihood = -181.6
Degrees of freedom for terms=  0.5  0.6 13.0
Likelihood ratio test=46.8  on 14.1 df, p=2.31e-05  n= 76

Now the LRT is even more significant (p = 2.31e-05 versus 0.0285), and the frailty term itself is significant (p = 0.04).
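The two kidney fits can be reproduced with a few lines, sketched here on the assumption that the `kidney` data set shipped with the survival package is the one used for the slides. The variable name `obs_per_id` is introduced only for this example; the first step checks how many observations each patient contributes before a shared frailty is added.

```r
library(survival)

# Count observations per patient id; a shared frailty only matters
# when ids repeat
obs_per_id <- table(kidney$id)
table(obs_per_id)

# Cox model without and with a frailty term on patient id
# (frailty() uses a gamma distribution by default)
kfit1 <- coxph(Surv(time, status) ~ age + sex, data = kidney)
kfit2 <- coxph(Surv(time, status) ~ age + sex + frailty(id), data = kidney)

kfit1
kfit2
```

Passing `data = kidney` keeps the formula readable and makes the printed call match the model terms.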

Resources

Therneau and Grambsch, Modeling Survival Data, Chapter 9

Wienke, Andreas, “Frailty Models”, http://www.demogr.mpg.de/papers/working/wp-2003-032.pdf

Govindarajulu, “Frailty Models and Other Survival Models”, www.ms.uky.edu/~statinfo/nonparconf/govindarajulu.ppt

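R Code

library(survival)

# GEN_TIME: one exponential gap time (rounded to 2 decimals) for a person
# in the given group, with treat = 1 for Drug A and 0 for placebo
gen_time <- function(group, treat) {
  rate <- switch(group, 1 - 0.5 * treat, 2 - treat, 10 - 2 * treat)
  round(rexp(1, rate), 2)
}

# PERSON DATA: returns a 1 x 25 row holding group, treat, number of
# intervals, and then the gap times for one person
person_data <- function() {
  treat <- rbinom(1, 1, .5)
  x <- runif(1)
  t1 <- matrix(NA, nrow = 1, ncol = 25)
  # Assign group with probabilities 40% / 40% / 20%
  if (x < .4) {
    group <- 1
  } else if (x < .8) {
    group <- 2
  } else {
    group <- 3
  }
  # Draw gap times until the period of length 1 is used up
  elapse <- 0
  count <- 1
  while (elapse < 1) {
    t1[count + 3] <- gen_time(group, treat)
    elapse <- elapse + t1[count + 3]
    count <- count + 1
  }
  count <- count - 1
  t1[1] <- group
  t1[2] <- treat
  t1[3] <- count
  # Zero out the unused slots
  if (count + 4 <= 25) { t1[(count + 4):25] <- 0 }
  # Replace the last gap, which ran past the period, with a censored time.
  # NOTE: for count > 1 this is 1 minus the previous gap, which is what
  # produced the data shown earlier (row 6: .89 = 1 - .11). The time
  # actually left in the period would be 1 - sum(t1[4:(count + 2)]).
  if (count == 1) { t1[count + 3] <- 1 }
  if (count > 1) { t1[count + 3] <- 1 - t1[count + 2] }
  return(t1)
}

# Simulate 100 people, one row per person
m1 <- matrix(NA, nrow = 100, ncol = 25)
for (i in 1:100) { m1[i, ] <- person_data() }

# Reshape to one row per interval
samp_size <- sum(m1[, 3])
samp <- matrix(NA, nrow = samp_size, ncol = 5)
colnames(samp) <- c("id", "group", "treat", "time", "status")

count2 <- 1
for (i in 1:100) {
  for (j in seq_len(m1[i, 3])) {
    samp[count2, 1] <- i
    samp[count2, 2] <- m1[i, 1]
    samp[count2, 3] <- m1[i, 2]
    samp[count2, 4] <- m1[i, j + 3]
    samp[count2, 5] <- 1
    # The last interval for each person is right-censored
    if (j == m1[i, 3]) { samp[count2, 5] <- 0 }
    count2 <- count2 + 1
  }
}
samp <- as.data.frame(samp)

# Cox model without and with a shared frailty term on id
myfit1 <- coxph(Surv(time, status) ~ treat, data = samp)
myfit2 <- coxph(Surv(time, status) ~ treat + frailty(id), data = samp)
myfit1
myfit2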
