J. Math. Biol. (1986) 24:119-140

Journal of Mathematical Biology
© Springer-Verlag 1986

Dependent competing risks: a stochastic process model

Anatoli I. Yashin 1, Kenneth G. Manton 2, and Eric Stallard 2

1 International Institute for Applied Systems Analysis, A-2361 Laxenburg, Austria
2 Center for Demographic Studies, Duke University, 2117 Campus Drive, Durham, NC 27706, USA

Abstract. Analyses of human mortality data classified according to cause of death frequently are based on competing risk theory. In particular, the times to death for different causes often are assumed to be independent. In this paper, a competing risk model with a weaker assumption of conditional independence of the times to death, given an assumed stochastic covariate process, is developed and applied to cause specific mortality data from the Framingham Heart Study. The results generated under this conditional independence model are compared with analogous results under the standard marginal independence model. Under the assumption that this conditional independence model is valid, the comparison suggests that the standard model overestimates by 4% the effect on life expectancy at age 30 due to the hypothetical elimination of cancer and by 7% the effect for cardiovascular/cerebrovascular disease. By age 80 the overestimates were 11% for cancer and 16% for heart disease. These results suggest the importance of avoiding the marginal independence assumption when appropriate data are available, especially when focusing on mortality at advanced ages.

Key words: Chronic disease, Cohort study, Diffusion, Framingham heart study, Human mortality, Maximum likelihood, Mortality selection, Survival with covariates

1. Introduction

It is necessary to resolve the competing risk problem in order to determine how the probability of death for a given cause will change when the probability of death for another cause changes. Traditionally, this problem has been solved by assuming that the hazard rates for each cause are independent (e.g. Chiang 1968). Relatively little effort has been expended on dependent competing risks, partly because of the perceived difficulty of the problem. For example, Tsiatis (1975) published a proof showing the nonidentifiability aspect of the competing risk problem when only the cause specific times at death are known. Peterson (1976) shows that the upper and lower limits to the probabilities of death can be determined. Cohen and Liu (1984) show that the differences between these two limits are typically large.

What Tsiatis (1975) did not indicate is that, given relatively small amounts of additional information, the competing risk model is identifiable. For example, Arnold and Brockett (1983) show how identifiable dependent competing risk models can be produced by making general assumptions about the underlying form of the hazard rates for the various conditions. Tolley and Manton (1983) show how assumptions on the variation of hazard rates over short time intervals can produce an identified model and tests of dependence without specifying the general form of the hazard rate (Tolley and Manton 1984).

In this paper we present an alternative way of estimating the effects of competing risks using a general model of human aging and mortality (Woodbury and Manton 1977). This model, which is applicable to longitudinal studies of populations with multiple measurements of covariates over time, uses information on the joint dependence of failure times for multiple diseases on a multivariate risk factor distribution (Manton et al. 1986).

2. Competing risk: definitions

Suppose that a population is subjected to $K$ causes of death $c_1, c_2, \ldots, c_K$ and each individual is characterized by a corresponding vector random variable $\tau = (T_1, T_2, \ldots, T_K)$, representing the times at which he may die respectively of $c_1, c_2, \ldots, c_K$. The competing risk problem may be fully characterized in terms of a joint distribution function of failure times, $F(t_1, t_2, \ldots, t_K) = P(T_1 \le t_1, T_2 \le t_2, \ldots, T_K \le t_K)$, or in terms of a joint survival function (David and Moeschberger 1978):

$$S(t_1, t_2, \ldots, t_K) = 1 - F(t_1, t_2, \ldots, t_K).$$

It is assumed that the effect of eliminating any cause of death $c_k$ can be represented by setting the corresponding argument $t_k$ of $S(t_1, t_2, \ldots, t_K)$ to zero. It then follows that the $K$ components of $\tau$ are "net" or "potential" lives with marginal survival functions $S_k(t) = P(T_k > t)$, $k = 1, 2, \ldots, K$, where $S_1(t_1) = S(t_1, 0, \ldots, 0)$ and, in general, $S_k(t_k) = S(0, \ldots, t_k, \ldots, 0)$. In most empirical cases, however, the joint distribution function or, equivalently, the joint survival function, is unknown. One usually only observes the death time $T = \min(T_1, T_2, \ldots, T_K)$ and cause of death $c_k$, $k = 1, 2, \ldots, K$, for any particular individual. Since the vector $(T_1, T_2, \ldots, T_K)$ is not observable, the function $S(t_1, t_2, \ldots, t_K)$ cannot be estimated empirically (Tsiatis 1975). Thus different assumptions about the joint distribution function of $(T_1, T_2, \ldots, T_K)$ produce different competing risk models.

To proceed further analytically it is necessary to define several types of hazard functions. Let $S(t)$ denote the aggregate survival function for an individual; $S(t) = P(T > t) = S(t, t, \ldots, t)$, where $T$ denotes the time of death without regard to cause. The aggregate mortality rate $\mu(t)$ can be characterized by the formula

$$\mu(t) = -\frac{dS(t)}{dt} \Big/ S(t).$$

Two other types of intensity functions can be related to the death times $\{T_k\}$:

$$\lambda_k(t) = -\frac{dP(T_k > t)}{dt} \Big/ P(T_k > t) = -\frac{\partial S_k(t)}{\partial t} \Big/ S_k(t)$$

and

$$\mu_k(t) = -\frac{\partial S(t_1, t_2, \ldots, t_K)}{\partial t_k}\bigg|_{t_1 = t_2 = \cdots = t_K = t} \Big/ S(t).$$

The first, $\lambda_k(t)$, corresponds to the "net" hazard rate of death from the cause $c_k$ in the absence of the other causes of death. The second, $\mu_k(t)$, describes the "crude" hazard rate from cause $c_k$ in the presence of other causes of death.

It can be easily seen that the aggregate hazard rate is the sum of the crude hazard rates:

$$\mu(t) = \sum_{k=1}^{K} \mu_k(t),$$

so $S(t)$ may be represented as a product,

$$S(t) = \prod_{k=1}^{K} G_k(t), \qquad G_k(t) = \exp\left(-\int_0^t \mu_k(s)\,ds\right).$$

The basic assumption of independence of causes in the competing risk model is that the net lives $\{T_k\}$ are statistically independent. This implies that

$$S(t) = \prod_{k=1}^{K} S_k(t),$$

from which we obtain $S_k(t) = G_k(t)$ and $\lambda_k(t) = \mu_k(t)$. Thus the net and crude hazard rates are equal under independence. Biologically, this means that the rate of death from one cause $c_k$ is the same regardless of the presence or absence of any subset of the other $K - 1$ causes, i.e. that the rate of progression of the underlying pathological process is not accelerated or decelerated by the operation of concurrent pathological processes, either through the direct interaction of disease mechanisms (e.g. the effect of diabetes on the rate of atherosclerotic degeneration) or by the impact of the disease on the host organism's physiology (e.g. the reduced capacity of the organism to tolerate tumor load with generalized atherosclerotic deterioration of the circulatory system). Such dependence could either be positive (i.e. the disease processes reinforce one another) or negative (e.g. slowing metabolism and poor circulatory efficiency at very advanced age may retard the growth of solid tumors).
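The equality of net and crude hazards under independence, and its failure under dependence, can be illustrated numerically. The sketch below is not taken from the paper: it assumes two causes, reads the joint survival function as $P(T_1 > t_1, T_2 > t_2)$, and uses a Gumbel-type bivariate exponential form chosen only for illustration. The function names and the dependence parameter `c` are hypothetical.

```python
import math

def joint_survival(t1, t2, c, a=1.0, b=1.0):
    """Illustrative joint survival P(T1 > t1, T2 > t2).

    Gumbel-type bivariate exponential (an assumption for this sketch):
    c = 0 gives independent net lives, c > 0 makes them dependent.
    """
    return math.exp(-a * t1 - b * t2 - c * t1 * t2)

def crude_hazard_1(t, c, h=1e-6):
    """mu_1(t): minus the partial derivative in t1 at t1 = t2 = t, over S(t, t)."""
    d_s = (joint_survival(t + h, t, c) - joint_survival(t - h, t, c)) / (2 * h)
    return -d_s / joint_survival(t, t, c)

def net_hazard_1(t, c, h=1e-6):
    """lambda_1(t): hazard of the marginal S_1(t) = S(t, 0), cause 2 eliminated."""
    d_s = (joint_survival(t + h, 0.0, c) - joint_survival(t - h, 0.0, c)) / (2 * h)
    return -d_s / joint_survival(t, 0.0, c)

if __name__ == "__main__":
    for c in (0.0, 0.5):
        print(c, net_hazard_1(2.0, c), crude_hazard_1(2.0, c))
```

With `c = 0` the two hazards coincide; with `c = 0.5` the crude hazard for cause 1 grows with time while the net hazard stays constant, which is the kind of discrepancy the independence assumption rules out.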

In accordance with the definition of $\mu_k(t)$, the unnormalized probability density of death times from cause $c_k$ in the presence of all risks is

$$\frac{d}{dt} P(T \le t, c_k) = \mu_k(t) S(t).$$

The occurrence of the cause of death $c_k$ means that $T = T_k$, so one can write

$$P(T \le t, c_k) = P(T \le t, T = T_k).$$

If the problem is to analyze the data from a population of $I$ individuals then, under the assumption of independence of individuals, the likelihood function can be written in the form

$$L(\{t_{k_i}, c_{k_i}\}) = \prod_{i=1}^{I} \mu_{k_i}(t_{k_i}) \exp\left(-\int_0^{t_{k_i}} \sum_{l=1}^{K} \mu_l(s)\,ds\right),$$

where $c_{k_i}$ denotes the observed cause of death for the $i$th individual, and $t_{k_i} = t_i$ and $\mu_{k_i}(\cdot)$ are the corresponding death times and hazard rates, respectively. This formula can be used for estimation if the $\{\mu_k(t)\}$ are parameterized. Note that this formula does not involve the net hazard rates $\{\lambda_k(t)\}$ and hence does not require the assumption of independence of causes.
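As a minimal sketch of how this likelihood might be used, assume the simplest possible parameterization: crude hazards that are constant in time. This choice is an illustration only, not the paper's model. The log-likelihood then reduces to a sum of log rates minus total exposure times the aggregate rate, and the maximum likelihood estimate for each cause is its death count divided by total exposure. The function names and the toy data are hypothetical.

```python
import math

def log_likelihood(deaths, rates):
    """Log of the crude-hazard likelihood for constant rates mu_k.

    deaths: list of (death_time, cause_index) pairs, one per individual.
    rates:  list of constant crude hazards, one per cause.
    """
    total_rate = sum(rates)
    return sum(math.log(rates[k]) - total_rate * t for t, k in deaths)

def mle_constant_rates(deaths, n_causes):
    """Closed-form MLE under constant hazards: deaths from cause k / total exposure."""
    exposure = sum(t for t, _ in deaths)
    counts = [0] * n_causes
    for _, k in deaths:
        counts[k] += 1
    return [count / exposure for count in counts]

if __name__ == "__main__":
    toy_deaths = [(2.0, 0), (3.0, 1), (5.0, 0), (10.0, 0)]  # made-up data
    estimates = mle_constant_rates(toy_deaths, n_causes=2)
    print(estimates, log_likelihood(toy_deaths, estimates))
```

No net hazard appears anywhere in this computation, which mirrors the remark that the crude-hazard likelihood does not depend on the independence assumption.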

3. Conditionally independent risks

Given the above definitions, our approach to describing dependence between risks will be to define an underlying multivariate stochastic process upon which the different cause specific hazards can be jointly dependent. In this development we will utilize the biologically motivated model of human aging and mortality due to Woodbury and Manton (1977). This model describes physiological aging and survival in human populations as a function of four types of dynamic forces: (1) drift: deterministic change in physiological characteristics as a function of time (and possibly age); (2) regression: deterministic change in physiological characteristics as a function of current physiological status; (3) diffusion: stochastic changes in physiological status due to unmeasured variables; and (4) mortality selection: a systematic function of current physiological status. An analysis of the interaction of the dynamics of physiological change (i.e. drift, regression, and diffusion) and mortality selection (Woodbury and Manton 1977) showed that certain biologically necessary conditions of system behavior (e.g. maintaining physiological parameters in a viable range, homeostatic control of the change of physiological parameters) could be maintained with a stochastic process of a particular form, i.e. a multivariate Gaussian stochastic process with linear physiological dynamics and quadratic forces of selection. We show in the following how that process may be generalized to treat multiple causes of death to formulate a model of dependent competing risks, in which the standard assumption of the marginal independence of causes of death is replaced by the weaker assumption of conditional independence.

Let us assume that death times $T_k$, $k = 1, 2, \ldots, K$, are conditionally independent given the trajectory of the process $z_t$ observed on the interval $[0, t]$. Denote this sampling trajectory by $Z_0^t$. Following the development in Woodbury and Manton (1977), suppose that the process $z_t$ can be represented by the following stochastic differential equation

$$dz_t = (a_0(t) + a_1(t) z_t)\,dt + b(t)\,dW_t.$$

Thus, changes in $z_t$ are assumed to be a function of two sets of components, one deterministic and one random. The matrix $a_0(t)$ describes the "drift" or "the deterministic rate of change" of $z_t$ at the origin of the space. Alternatively, using $\bar z_t$ to denote the mean value of $z_t$ at time $t$ in a population, the matrix $a_0(t) + a_1(t)\bar z_t$ describes the drift of $z_t$ at the point $\bar z_t$ (i.e. at the mean of the physiological variables). Under this model of physiological change, the rate of change can be shown to be zero at the point $h_t = -a_1^{-1}(t) \cdot a_0(t)$. Thus the matrix $a_1(t)$ describes the "regression" or the homeostatic control mechanism of the physiological aging process under which extreme values tend to return to more moderate ranges (i.e. assuming that $h_t$ and $z_t$ are close). Taken together, the drift and regression components of the model describe the deterministic evolution with age of the physiology of the organism in terms of multiple physiological variables (e.g. blood pressure, vital capacity).

In addition to the deterministic component, the matrix $b(t)$, which is a bounded matrix of scale factors, describes the stochastic component of the process as random changes generated by a Wiener process $W_t$ (i.e. a Brownian motion process which is independent of the initial value $z_0$).
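To make the roles of drift, regression, and diffusion concrete, the following sketch simulates a scalar version of the process with an Euler-Maruyama scheme. It assumes constant coefficients and a single physiological variable, both simplifications made only for illustration; the function names are hypothetical. With the diffusion switched off, the path settles at the homeostatic point $-a_0/a_1$.

```python
import math
import random

def simulate_path(a0, a1, b, z0, dt, n_steps, rng):
    """Euler-Maruyama path for dz = (a0 + a1*z) dt + b dW with constant coefficients."""
    z = z0
    path = [z]
    for _ in range(n_steps):
        d_w = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over one step
        z = z + (a0 + a1 * z) * dt + b * d_w
        path.append(z)
    return path

def homeostatic_point(a0, a1):
    """Point where the deterministic rate of change a0 + a1*z vanishes."""
    return -a0 / a1

if __name__ == "__main__":
    rng = random.Random(0)
    noisy = simulate_path(a0=1.0, a1=-0.5, b=0.3, z0=0.0, dt=0.01, n_steps=5000, rng=rng)
    print(homeostatic_point(1.0, -0.5), noisy[-1])
```

A negative `a1` plays the part of the homeostatic regression: excursions away from the homeostatic point are pulled back, while `b` controls how strongly the Wiener increments push the state around.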

Let us assume that the conditional net survival function for the death time $T_k$, given the sampling trajectory of $z$ from $0$ to $t$, can be represented as

$$P(T_k > t \mid Z_0^t) = \exp\left(-\int_0^t \lambda_k(s, z_s)\,ds\right),$$

where $\lambda_k(t, z_t)$ denotes the conditional net hazard rate at time $t$ for cause $c_k$, given the value $z_t$ of the process at time $t$; this function is assumed to be integrable. By parameterizing this hazard in terms of $z_t$ rather than $Z_0^t$ we are assuming that only the current values of the process $z_t$ are relevant to survival and that the past history (i.e. the prior trajectory of $z_t$) of the process is irrelevant. Naturally, $z_t$ could be redefined to include cumulative measures representing the history of changes in $z_t$ to represent a more general process in which the future course of the process is affected by its past history (e.g. one could represent the history of blood pressure changes in the model to describe the cumulative damage of elevated blood pressure over time).

The assumption of conditional independence of the $K$ net lives $\{T_k\}$, given the trajectory $Z_0^t$, means that $\lambda_k(t, z_t) = \mu_k(t, z_t)$, where $\mu_k(t, z_t)$ denotes the conditional crude hazard rate at time $t$ for cause $c_k$, given the value $z_t$ of the process at time $t$. Hence, the conditional aggregate survival probability can be written as a product,

$$P(T > t \mid Z_0^t) = \prod_{k=1}^{K} P(T_k > t \mid Z_0^t) = \exp\left(-\int_0^t \sum_{k=1}^{K} \lambda_k(s, z_s)\,ds\right).$$

The unconditional (or marginal) aggregate survival function can be represented in the form (see Lemma 1, Appendix),

$$P(T > t) = \exp\left(-\int_0^t \sum_{k=1}^{K} \bar\lambda_k(s)\,ds\right),$$

where $\bar\lambda_k(t) \equiv \bar\mu_k(t)$ is the unconditional crude hazard rate defined by $\bar\lambda_k(t) = E(\lambda_k(t, z_t) \mid T > t)$.

In evaluating this expectation we will assume that $\lambda_k(t, z_t)$ is a quadratic function of $z_t$, e.g. $\lambda_k(t, z_t) = \lambda_k(t) \cdot (z_t - h_t)^2$, where $h_t$ represents a homeostatic point. The quadratic form was selected to be consistent with the model for total mortality provided by Woodbury and Manton (1977) and with epidemiological (i.e. population based) observations that both high and low extreme values of physiological variables are frequently associated with increased mortality rates. In many cases these relations are actually J-shaped so that we may need to transform the risk factor measurements to be consistent with the quadratic hazard assumption.

Given an appropriate scaling transformation of the physiological measures it can be shown (see Appendix) that the quadratic form of the hazard function preserves the normality of the distribution of $z_t$ under the assumption given for $dz_t$. This means that the quadratic form of the hazard function is reasonable when dealing with normally distributed data, or with data which can be so transformed. Furthermore, the description offered by the model of human physiological aging as a complex multivariate system with long term stability and strong homeostatic control mechanisms (e.g. control of blood pressure and serum cholesterol within viable ranges), and the control of the variance of the distribution of physiological parameters (at early ages by homeostatic control, and at advanced ages by heightened mortality selection of persons who lose homeostatic control), seems biologically reasonable.

With the structure assumed for the physiological process and the hazard function we can assess how the $K$ cause specific hazard functions will jointly depend on the physiological process. It is natural to try to interpret $\{\lambda_k(t)\}$ as unconditional net cause-specific mortality rates. However, these rates are related to each other through the process $z_t$. If we assume the simple scalar quadratic form, $\lambda_k(t, z_t) = \lambda_k(t) z_t^2$, the formula for $\bar\lambda_k(t)$ is (see Appendix):

$$\bar\lambda_k(t) = \lambda_k(t)\left(m^2(t) + \gamma(t)\right),$$

where $m(t)$ and $\gamma(t)$ are the mean and variance of the conditional distribution of $z_t$ given $(T > t)$ and satisfy the following differential equations:

$$\frac{dm(t)}{dt} = a_0(t) + a_1(t) m(t) - 2\gamma(t) \sum_{k=1}^{K} \lambda_k(t) m(t)$$

$$\frac{d\gamma(t)}{dt} = 2 a_1(t) \gamma(t) + b^2(t) - 2\gamma^2(t) \sum_{k=1}^{K} \lambda_k(t).$$

These equations show that changes in any $\lambda_k(t)$ will influence $m(t)$ and $\gamma(t)$, and through them $\bar\lambda_l(t)$, for $l \ne k$. Thus the marginal net and crude hazard rates are not equal and the standard assumption of marginal independence will not hold.
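The coupling between causes can be seen by integrating the scalar moment equations twice, once with all causes present and once with one cause removed. The sketch below uses a plain Euler scheme and assumes time-constant coefficients and constant $\lambda_k$; the parameter values are made up for illustration and are not estimates from the paper.

```python
def crude_hazards(lams, a0=0.5, a1=-0.5, b=0.5, m0=1.0, g0=0.25,
                  t_end=10.0, dt=0.001):
    """Integrate the scalar equations for m(t) and gamma(t) among survivors.

    lams: constant coefficients lambda_k of the hazards lambda_k * z**2.
    Returns the unconditional crude hazards lambda_k * (m**2 + gamma) at t_end.
    """
    total = sum(lams)
    m, g = m0, g0
    for _ in range(int(round(t_end / dt))):
        dm = a0 + a1 * m - 2.0 * g * total * m
        dg = 2.0 * a1 * g + b * b - 2.0 * g * g * total
        m, g = m + dm * dt, g + dg * dt
    return [lam * (m * m + g) for lam in lams]

if __name__ == "__main__":
    baseline = crude_hazards([0.10, 0.05])
    cause_1_eliminated = crude_hazards([0.00, 0.05])
    print(baseline[1], cause_1_eliminated[1])
```

With these illustrative values, setting the first coefficient to zero raises the crude hazard of the second cause, because less mortality selection leaves a larger mean and variance among survivors. Under marginal independence the second hazard would be unchanged.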

The unnormalized probability density of the death time $T_k$ in the presence of the other causes of death is

$$\frac{d}{dt} P(T \le t, T = T_k) = \bar\lambda_k(t) P(T > t).$$

This formula for the density function may be used in the estimation procedure with a maximum likelihood approach. If the observations indicate $t_{k_i}$ is the time of death from cause $c_{k_i}$ for the $i$th individual and $\bar\lambda_k(t)$ is the function of $\lambda_k(t)$, $m(t)$, and $\gamma(t)$ given above, then the likelihood function can be written as

$$L(\{t_{k_i}, c_{k_i}\}) = \prod_{i=1}^{I} \bar\lambda_{k_i}(t_{k_i}) P(T > t_{k_i}) = \prod_{i=1}^{I} \bar\lambda_{k_i}(t_{k_i}) \exp\left(-\int_0^{t_{k_i}} \sum_{l=1}^{K} \bar\lambda_l(s)\,ds\right).$$

Parameterization of $\{\lambda_k(t)\}$, $m(t)$, and $\gamma(t)$ allows use of the maximum likelihood approach for parameter estimation.

4. A dependent competing risk model of longitudinal data

Assume that, in addition to data on the causes and times of death, additional measurements of the individuals' physiological status are provided. These measurements give the exact state of the process $z(t) \equiv z_t$ at each discrete time $t_j$, $j = 1, 2, \ldots$. In order to specify the model of mortality in terms of the observations on $z(t)$, we introduce the notation $\hat z(t) = (z(t_1), z(t_2), \ldots, z(t_{j(t)}))$, where $t_{j(t)} = \max\{t_j : t_j \le t\}$ and $\{t_j\}$ is an ordered set of times. The formula for the conditional aggregate survival function $P(T > t \mid \hat z(t))$, given $\hat z(t)$, is

$$P(T > t \mid \hat z(t)) = E\{P(T > t \mid Z_0^t) \mid \hat z(t)\} = \exp\left(-\int_0^t \bar\mu(s, \hat z(s))\,ds\right),$$

where $\bar\mu(t, \hat z(t))$ is the conditional aggregate hazard rate defined by

$$\bar\mu(t, \hat z(t)) = \sum_{k=1}^{K} \bar\lambda_k(t, \hat z(t)),$$

and $\bar\lambda_k(t, \hat z(t)) \equiv \bar\mu_k(t, \hat z(t))$ is the conditional crude hazard rate defined by

$$\bar\lambda_k(t, \hat z(t)) = E\{\lambda_k(t, z_t) \mid T > t, \hat z(t)\}.$$

This shifts our focus from $P(T > t)$ to $P(T > t \mid \hat z(t))$ in order to take advantage of the fact that $\{\hat z_i(t)\}$ are now assumed to be observed. If each $\lambda_k(t, z_t)$, $k = 1, \ldots, K$, is specified as a quadratic function of $z_t$, i.e. $\lambda_k(t, z_t) = z_t' Q_k(t) z_t$, where $Q_k(t)$ is a positive definite symmetric matrix with bounded elements, we get, for $\bar\lambda_k(t, \hat z(t))$, the following expression (see Appendix):

$$\bar\lambda_k(t, \hat z(t)) = m'(t) Q_k(t) m(t) + \mathrm{Tr}(Q_k(t) \gamma(t)),$$

where $m(t)$ and $\gamma(t)$ are the solutions of the following differential equations specified on the intervals $(t_j, t_{j+1})$, $j = 1, 2, \ldots$,

$$\frac{dm(t)}{dt} = (a_0(t) + a_1(t) m(t)) - 2\gamma(t) \sum_{k=1}^{K} Q_k(t) m(t)$$

$$\frac{d\gamma(t)}{dt} = a_1(t)\gamma(t) + \gamma(t) a_1'(t) + b(t) b'(t) - 2\gamma(t) \sum_{k=1}^{K} Q_k(t) \gamma(t).$$

At each time $t_j$, $j = 1, 2, \ldots$, the equations start from the new initial conditions

$$m(t_j) = z(t_j)$$

$$\gamma(t_j) = 0.$$
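The restart of the moment equations at each observation time can be sketched in the scalar case, where $m'Q_k m + \mathrm{Tr}(Q_k \gamma)$ reduces to $q_k (m^2 + \gamma)$. The code below assumes a single covariate, constant coefficients, and made-up observation values; it is an illustration of the reset logic rather than the estimation procedure used in the paper.

```python
def conditional_crude_hazard(obs_times, obs_values, q, t,
                             a0=0.5, a1=-0.5, b=0.5, dt=0.001):
    """Conditional crude hazards at time t given exact observations up to t.

    Scalar sketch: lambda_k(t, z) = q[k] * z**2. The moments restart at the
    last observation t_j <= t with m(t_j) = z(t_j) and gamma(t_j) = 0, then
    follow the moment equations (Euler steps) up to t.
    """
    j = max(i for i, tj in enumerate(obs_times) if tj <= t)
    m, g = obs_values[j], 0.0
    total = sum(q)
    for _ in range(int(round((t - obs_times[j]) / dt))):
        dm = a0 + a1 * m - 2.0 * g * total * m
        dg = 2.0 * a1 * g + b * b - 2.0 * g * g * total
        m, g = m + dm * dt, g + dg * dt
    return [qk * (m * m + g) for qk in q]

if __name__ == "__main__":
    times, values = [0.0, 2.0], [1.0, 1.5]
    print(conditional_crude_hazard(times, values, [0.10, 0.05], t=2.0))
    print(conditional_crude_hazard(times, values, [0.10, 0.05], t=3.0))
```

At an observation time the hazard depends only on the measured value, since the variance is reset to zero; between observations the variance grows again through the diffusion term and contributes to the hazard.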

Under this model, the likelihood function is

$$L(\{t_{k_i}, c_{k_i}, \hat z_i(t_{k_i})\}) = \prod_{i=1}^{I} \bar\lambda_{k_i}(t_{k_i}, \hat z_i(t_{k_i})) \exp\left(-\int_0^{t_{k_i}} \bar\mu(s, \hat z_i(s))\,ds\right) \times \prod_{j=2}^{J(t_{k_i})} N[z_i(t_j);\, m_i(t_j^-),\, \gamma_i(t_j^-)],$$

where $J(t_{k_i}) = \{j : t_{j(t_{k_i})} = \max\{t_j : t_j \le t_{k_i}\}\}$ is the sequence number of the last observation on the $i$th individual prior to death at time $t_{k_i}$ due to cause $c_{k_i}$; where $t_j^-$ denotes the point just before the observation time $t_j$; and where $N[\cdot]$ denotes the multivariate normal distribution with indicated mean vector and covariance matrix. This differs from the previous likelihood functions in that the observations $\{\hat z_i(t)\}$ are now taken explicitly into account.

5. A model of longitudinal data with unobserved covariates

Suppose that the duration of life for any individual in the cohort is a function of the two-component process $z'(t) = (y'(t), x'(t))$. The quadratic form for conditional cause specific mortality may be rewritten as

$$\lambda_k(t, x(t), y(t)) = (y'(t), x'(t)) \begin{bmatrix} Q_{k11}(t) & Q_{k12}(t) \\ Q_{k21}(t) & Q_{k22}(t) \end{bmatrix} \begin{bmatrix} y(t) \\ x(t) \end{bmatrix} + \mu_{k0}(t),$$

where $\mu_{k0}(t)$ is a nonnegative bounded function of time and where $Q_{k11}(t)$ and $Q_{k22}(t)$ are positive definite symmetric matrices and $Q_{k12}(t)$ is the transpose of $Q_{k21}(t)$. Furthermore the equations for $z(t)$ may be rewritten as

$$d\begin{bmatrix} y(t) \\ x(t) \end{bmatrix} = \left( \begin{bmatrix} a_{01}(t) \\ a_{02}(t) \end{bmatrix} + \begin{bmatrix} a_{111}(t) & a_{112}(t) \\ a_{121}(t) & a_{122}(t) \end{bmatrix} \begin{bmatrix} y(t) \\ x(t) \end{bmatrix} \right) dt + \begin{bmatrix} b_{11}(t) & b_{12}(t) \\ b_{21}(t) & b_{22}(t) \end{bmatrix} d\begin{bmatrix} W_1(t) \\ W_2(t) \end{bmatrix},$$

where $W_1(t)$ and $W_2(t)$ are vector-valued Wiener processes, independent of each other and of the initial values $x(0)$ and $y(0)$; the partitions of $a_0(t)$, $a_1(t)$, and $b(t)$ are assumed to have the appropriate dimensions. Thus, the processes $x(t)$ and $y(t)$ are the solutions of these linear stochastic differential equations.

Let us assume that the elements of $x(t)$ are measured at a set of fixed times $t_1, t_2, \ldots$; $y(t)$ represents the variables that are not measured. Suppose that both processes influence the mortality rate as specified above and that the $K$ net lives $\{T_k\}$ are conditionally independent given the trajectory of the two-component process $(y'(t), x'(t))$ on the interval $[0, t]$. The goal is to estimate the elements of $Q_k(t)$ from the data on $x_i(t)$ at times $t_1, t_2, \ldots, t_j \le t_{k_i}$, where $t_{k_i}$ is the time of death due to cause $c_{k_i}$ for individual $i$. Note that the times and causes of deaths are also observable. As before we use the notation $\hat x(t)$ for the vector $(x(t_1), x(t_2), \ldots, x(t_{j(t)}))$, where $t_{j(t)} = \max\{t_j : t_j \le t\}$ and $\{t_j\}$ is an ordered set of times.

The formula for the conditional aggregate survival function $P(T > t \mid \hat x(t))$, given $\hat x(t)$, is

$$P(T > t \mid \hat x(t)) = E\{P(T > t \mid Z_0^t) \mid \hat x(t)\} = \exp\left(-\int_0^t \bar\mu(s, \hat x(s))\,ds\right),$$

where $\bar\mu(t, \hat x(t))$ is the conditional aggregate hazard rate defined by

$$\bar\mu(t, \hat x(t)) = \sum_{k=1}^{K} \bar\lambda_k(t, \hat x(t)),$$

and $\bar\lambda_k(t, \hat x(t)) \equiv \bar\mu_k(t, \hat x(t))$ is the conditional crude hazard rate defined by

$$\bar\lambda_k(t, \hat x(t)) = E\{\lambda_k(t, z_t) \mid T > t, \hat x(t)\}.$$

This result is stated formally in the following theorem.

Theorem. Suppose that a process is defined by both measured and unmeasured variables with the structure presented above. Then the conditional aggregate force of mortality $\bar\mu(t, \hat x(t))$ can be represented as follows,

$$\bar\mu(t, \hat x(t)) = m'(t) \left(\sum_{k=1}^{K} Q_k(t)\right) m(t) + \mathrm{Tr}\left(\sum_{k=1}^{K} Q_k(t) \gamma(t)\right) + \sum_{k=1}^{K} \mu_{k0}(t),$$

where

$$m(t) = \begin{bmatrix} m_1(t) \\ m_2(t) \end{bmatrix}, \qquad \gamma(t) = \begin{bmatrix} \gamma_{11}(t) & \gamma_{12}(t) \\ \gamma_{21}(t) & \gamma_{22}(t) \end{bmatrix}$$

on the intervals $t_j \le t < t_{j+1}$ satisfy the following equations

$$\frac{dm(t)}{dt} = a_0(t) + a_1(t) m(t) - 2\gamma(t) \sum_{k=1}^{K} Q_k(t) m(t)$$

$$\frac{d\gamma(t)}{dt} = a_1(t)\gamma(t) + \gamma(t) a_1'(t) + b(t) b'(t) - 2\gamma(t) \sum_{k=1}^{K} Q_k(t) \gamma(t).$$

At each time $t_j$, $j = 1, 2, \ldots$, the initial values for the equations are

$$m_1(t_j) = m_1(t_j^-) + \gamma_{12}(t_j^-)\, \gamma_{22}^{-1}(t_j^-)\, (x(t_j) - m_2(t_j^-))$$

$$m_2(t_j) = x(t_j)$$

$$\gamma_{11}(t_j) = \gamma_{11}(t_j^-) - \gamma_{12}(t_j^-)\, \gamma_{22}^{-1}(t_j^-)\, \gamma_{21}(t_j^-)$$

$$\gamma_{22}(t_j) = 0$$

$$\gamma_{12}(t_j) = \gamma_{21}(t_j) = 0,$$

where $t_j^-$ represents the left-hand value of the process, i.e. at the point just before the observation.
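The initial values at an observation time have the form of a Gaussian conditioning step: the unmeasured component is shifted toward what the new measurement implies, and its variance shrinks. A minimal sketch follows, assuming one unmeasured and one measured variable so that every block is a scalar, and reading the gain as $\gamma_{12}\gamma_{22}^{-1}$ (the standard conditioning formula). The function name and the numbers are hypothetical.

```python
def update_at_observation(m1, m2, g11, g12, g22, x):
    """Scalar-block version of the initial values at an observation time t_j.

    m1, m2, g11, g12, g22 are left-hand limits (just before t_j); x is the
    newly observed value of the measured component. g21 equals g12 because
    the covariance matrix is symmetric.
    """
    gain = g12 / g22
    return {
        "m1": m1 + gain * (x - m2),   # unmeasured mean moves with the surprise in x
        "m2": x,                      # measured component is known exactly
        "g11": g11 - gain * g12,      # remaining uncertainty about y
        "g12": 0.0,
        "g22": 0.0,
    }

if __name__ == "__main__":
    print(update_at_observation(m1=1.0, m2=2.0, g11=4.0, g12=1.0, g22=2.0, x=3.0))
```

When the covariance `g12` is zero, the measurement carries no information about the unmeasured component and its mean and variance pass through unchanged.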

The proof of the theorem is given in the Appendix. Under this model the likelihood function is

$$L(\{t_{k_i}, c_{k_i}, \hat x_i(t_{k_i})\}) = \prod_{i=1}^{I} \bar\lambda_{k_i}(t_{k_i}, \hat x_i(t_{k_i})) \exp\left(-\int_0^{t_{k_i}} \bar\mu(s, \hat x_i(s))\,ds\right).$$

Comparing this with the previous likelihood function, one sees that the unobserved component process $y(t)$ affects both the mortality terms and the observed risk factor distribution.


6. An illustration

We illustrate the above model with data from the Framingham Heart Study. These data contain risk factor measurements and cause of death indicators for 2336 male residents of the town of Framingham, Massachusetts for 10 biennial examination cycles. Deaths are classified into three groups: (1) cancer (105 deaths); (2) coronary heart disease, cerebrovascular accidents, and other cardiovascular disease (276 deaths); (3) all other causes of death (92 deaths). The nine risk factors were: age (years), pulse pressure (PP; mmHg), diastolic blood pressure (DBP; mmHg), Quetelet index of body mass (QI; hg/m²), serum cholesterol (CHOL; mg%), blood sugar (BLDSUG; mg%), hemoglobin (HEMO; dg%), vital capacity index (VITC; cl/m²), and cigarette consumption (CIG; #/day).

Since actuarial survival models typically have age as the only covariate, our assessment of the competing risk effect used a simulation strategy in which age constitutes the observed $x(t)$ vector and the other eight risk factors constitute the unobserved $y(t)$ vector. Since these models also implicitly equate age and time (i.e. $\bar\lambda_k(t)$ and $\bar\lambda_k(t, \hat x(t))$ are taken to be equivalent), this allowed us to examine the competing risk effect using the standard marginal independence assumption (e.g. Keyfitz 1977). In addition, rather than estimating quadratic risk parameters for the age variable, these parameters were set to zero and the entire quadratic risk function was multiplied by an exponential function of age. This was done to force the risk function for each covariate profile to be a Gompertz function of age, a well known form for aging and mortality processes (Economos 1982).

With both components of the $z(t)$ vector observed for each member of the Framingham study cohort, estimates of the parameters of $\lambda_k(t, x(t), y(t))$ and $d/dt\,[y'(t), x'(t)]$ can be obtained using the approximation to the maximum likelihood estimation procedure for $\{t_{k_i}, c_{k_i}, \hat z_i(t_{k_i})\}$ described in Manton et al. (1986). However, since these estimates are for a discrete time approximation to the continuous time process, use of these estimates in the averaging formula of the Theorem requires use of difference equations in place of the differential equations of the Theorem. Following the argument presented in Woodbury and Manton (1983), it may be verified that an appropriate set of difference equations is:

$$m(t+1) = a_0(t) + (I + a_1(t))\, m^*(t)$$

$$\gamma(t+1) = O(t) + (I + a_1(t))\, \gamma^*(t)\, (I + a_1(t))',$$

where $O(t)$ denotes the innovation variance in the interval $(t, t+1)$ due to diffusion, and where "*" denotes the adjusted form of $\gamma(t)$ and $m(t)$ taking account of the effects of mortality:

$$\gamma^*(t) = \left(I + 2\gamma(t) \sum_{k=1}^{K} Q_k(t)\right)^{-1} \gamma(t)$$

$$m^*(t) = m(t) - 2\gamma^*(t) \sum_{k=1}^{K} Q_k(t)\, m(t)$$

and

$$P(T > t+1 \mid \hat x(t+1)) = P(T > t \mid \hat x(t)) \cdot \exp\left(-\int_t^{t+1} \sum_{k=1}^{K} \bar\lambda_k(s, \hat x(s))\,ds\right),$$

where

$$\int_t^{t+1} \bar\lambda_k(s, \hat x(s))\,ds = \left\{ 2\lambda_k\left[t, \frac{m(t) + m^*(t)}{2}\right] - \tfrac{1}{2}\left[\lambda_k(t, m(t)) + \lambda_k(t, m^*(t))\right] + \tfrac{1}{2} \ln\left| I + 2\gamma(t) \sum_{l=1}^{K} Q_l(t) \right| \right\}.$$

Note that because $\hat x(t)$ is age, $P(T > t \mid \hat x(t))$ is the survival function for a simulated cohort life table beginning at age $\hat x(0)$.
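The mortality adjustment and the one-interval integrated hazard can be checked numerically in the simplest setting. The sketch below assumes a single cause, a scalar state with hazard $q z^2$, and a state held fixed across the unit interval. Under those assumptions, and reading the final term of the interval formula as half a log-determinant, the integrated hazard should equal $-\ln E[\exp(-q z^2)]$ for $z \sim N(m, \gamma)$. The function names are hypothetical and the quadrature is a plain trapezoidal rule.

```python
import math

def interval_hazard_closed_form(m, g, q):
    """Integrated hazard over one interval from the adjusted moments (scalar, K = 1)."""
    g_star = g / (1.0 + 2.0 * g * q)        # variance after mortality selection
    m_star = m - 2.0 * g_star * q * m       # mean after mortality selection
    lam = lambda z: q * z * z
    quad = 2.0 * lam((m + m_star) / 2.0) - 0.5 * (lam(m) + lam(m_star))
    return quad + 0.5 * math.log(1.0 + 2.0 * g * q)

def interval_hazard_quadrature(m, g, q, n=20001, width=10.0):
    """-ln E[exp(-q z^2)] for z ~ N(m, g), by the trapezoidal rule."""
    sd = math.sqrt(g)
    lo, hi = m - width * sd, m + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        z = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        density = math.exp(-(z - m) ** 2 / (2.0 * g)) / math.sqrt(2.0 * math.pi * g)
        total += weight * math.exp(-q * z * z) * density
    return -math.log(total * h)

if __name__ == "__main__":
    print(interval_hazard_closed_form(1.0, 0.25, 0.15))
    print(interval_hazard_quadrature(1.0, 0.25, 0.15))
```

The two values agree to quadrature accuracy, which supports the scalar, single-cause reading of the formula; it says nothing about how the log-determinant term should be apportioned among several causes.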

In Table 1 we present the estimated survival function and the means and standard deviations of the risk factors for 70 years of the projected life span of a hypothetical cohort assumed to be initially observed at age 30 years. These were generated using the difference equations above. The parameters of the process were estimated from the 2336 members of the Framingham study cohort (ages 29-62 years at initial examination) followed for 20 years. Note that while age-effect parameters were estimated from these data, no parameters specific to period (i.e. calendar date) or birth date were estimated; hence, the time dimension in Table 1 represents the implied increase of age within the hypothetical cohort. The life expectancy at age 30 is estimated to be 44.5 years.

The trajectories of the means and standard deviations of the physiological factors represent the dynamic equilibrium of drift, regression (the systematic increase or decrease of physiological variables representing the homeostatic forces and aging changes), diffusion, and mortality selection (a variance controlling force causing individuals at extreme physiological values to die more rapidly due to the quadratic form of the hazard). The trajectories of each risk factor are consistent with observations from a variety of epidemiological studies. For example, among survivors of a cohort, serum cholesterol levels tend to rise in middle age and decrease at later ages (Abraham 1978). Pulse pressure, probably due to increased peripheral circulatory resistance at advanced ages, tends to rise with age (Roberts 1977). Cigarette consumption tends to decrease among the elderly as heavy smokers die of lung cancer, chronic obstructive lung disease and heart disease in late middle age (Klebba 1982). Blood sugar tends to rise and vital capacity to drop as a function of normal aging changes (Shock et al. 1984).

In Table 2 we present the same type of information under the assumption that cancer is eliminated as a cause of death. This was done by setting $\lambda_k(t, x(t), y(t)) = 0$ for $k = 1$, and recomputing the difference equations above for the same 70-year interval. One can see that the life expectancy at age 30 is estimated to increase 2.24 years to 46.76 years. Also shown in this table are the

Table 1. Estimated cohort survival function ($l_t$), life expectancy function ($e_t$), and risk factor means and standard deviations: males, aged 30 years, Framingham, Mass. heart study, baseline projection model

 t      l_t    e_t     AGE      PP     DBP      QI    CHOL  BLDSUG    HEMO    VITC     CIG
 0   100000  44.52   30.00   45.83   79.57  261.88  215.22   79.35  142.11  139.29   13.24
                      0.00   13.70   12.53   34.44   41.41   29.63   10.25   31.11   11.53
10    98366  35.17   40.00   41.22   83.18  273.30  241.46   78.48  147.73  138.61   14.46
                      0.00   13.70   12.53   34.40   41.39   29.62   10.24   31.11   11.52
20    94588  26.35   50.00   47.78   83.42  277.01  241.08   83.92  149.60  129.95   12.64
                      0.00   13.69   12.52   34.32   41.37   29.62   10.24   31.10   11.52
30    86306  18.34   60.00   55.30   83.37  274.25  233.10   91.04  150.43  116.30    9.16
                      0.00   13.68   12.50   34.14   41.33   29.60   10.24   31.09   11.50
40    69071  11.53   70.00   62.80   82.91  266.97  223.21   98.37  150.76   99.54    4.72
                      0.00   13.65   12.46   33.73   41.21   29.56   10.22   31.05   11.47
50    38708   6.39   80.00   69.92   82.01  258.00  213.52  105.47  150.88   81.30    0.00
                      0.00   13.59   12.38   32.85   40.97   29.48   10.19   30.97    0.00
60     8061   3.13   90.00   76.37   80.55  250.80  205.65  111.63  151.64   62.66    0.00
                      0.00   13.46   12.19   31.21   40.47   29.27   10.05   30.73    0.00
70      121   1.70  100.00   81.36   78.78  250.02  199.78  116.92  152.84   45.96    0.00
                      0.00   13.21   11.85   28.81   39.57   28.91    9.93   30.42    0.00

Median age: 76.69
Note: 1st row, AGE to CIG gives means; 2nd row, standard deviations; beyond t = 50, CIG = 0 by assumption

Page 13: Dependent competing risks: a stochastic process model

Tab

le 2

. E

stim

ated

co

ho

rt s

urvi

val

fun

ctio

n (

It),

lif

e ex

pec

tan

cy f

un

ctio

n (

~,)

, an

d r

isk

fact

or

mea

ns

and

sta

nd

ard

dev

iati

ons:

mal

es,

aged

30

year

s, F

ram

ing

ham

, M

ass.

hea

rt s

tud

y

--

can

cer

elim

inat

ed a

s a

cau

se o

f d

eath

un

der

co

nd

itio

nal

ly i

nd

epen

den

t co

mp

etin

g r

isk

mo

del

8

t l,

~,

AG

E

PP

D

BP

Q

I C

HO

L

BL

DS

UG

H

EM

O

VIT

C

CIG

0 10

0 00

0 46

.76

30.0

0 45

.83

79.5

7 26

1.88

21

5.22

79

.35

142.

11

139.

29

13.2

4 10

0 00

0 46

.85

0.00

13

.70

12.5

3 34

.44

41.4

1 29

.63

10.2

5 31

.11

11.5

3

10

98 7

69

37.2

7 40

.00

41.2

2 83

.18

273.

27

241.

45

78.4

6 14

7.73

13

8.59

14

.47

98 7

69

37.3

6 0.

00

13.7

0 12

.53

34.4

1 41

.40

29.6

2 10

.24

31.1

1 11

.52

20

95 7

95

28.2

6 50

.00

47.7

8 83

.41

276.

96

241.

06

83.8

9 14

9.60

12

9.91

12

.66

95 7

96

28.3

4 0.

00

13.6

9 12

.52

34.3

6 41

.39

29.6

2 10

.24

31.1

0 11

.52

30

89 0

76

19.9

6 60

.00

55.2

9 83

.36

274.

14

233.

07

90.9

8 15

0.43

11

6.20

9.

21

89 0

87

20.0

5 0.

00

13.6

8 12

.50

34.2

3 41

.37

29.6

0 10

.24

31.0

9 11

.51

40

74 4

89

12.7

8 70

.00

62.8

0 82

.89

266.

66

223.

08

98.2

4 15

0.77

99

.33

4.84

74

550

12

.87

0.00

13

.66

12.4

7 33

.94

41.3

1 29

.57

10.2

3 31

.06

11.4

8

50

46 4

55

7.22

80

.00

69.9

4 81

.96

257.

09

213.

06

105.

17

150.

91

80.8

9 0.

00

46 7

30

7.31

0.

00

13.6

0 12

.38

33.3

0 41

.18

29.5

0 10

.21

30.9

9 0.

00

60

12 6

59

3.57

90

.00

76.4

3 80

.45

248.

76

204.

36

111.

01

151.

66

62.0

0 0,

00

13 0

75

3.66

0.

00

13.4

9 12

.20

31.9

6 40

.90

29.3

1 10

.08

30.7

8 0.

00

70

366

2.01

10

0.00

81

.57

78.6

4 24

6.65

19

7.50

11

5.64

15

3.05

44

.77

0.00

43

0 2.

09

0.00

13

.29

11.8

8 29

.83

40.3

6 28

.99

9,98

30

.53

0.00

Med

ian

age

: 78

.95

Not

e:

1st

row

--

AG

E t

o C

IG g

ives

mea

ns;

2n

d r

ow

--

stan

dar

d d

evia

tion

s; b

eyo

nd

t=

50,

CIG

=0

b

y a

ssu

mp

tio

n;

2nd

row

--

I, an

d

resu

lts

un

der

mar

gin

ally

in

dep

end

ent

com

pet

ing

ris

k m

od

el

et g

ives

co

rres

po

nd

ing

Page 14: Dependent competing risks: a stochastic process model

Fo

to

Tab

le 3

. E

stim

ated

co

ho

rt s

urvi

val

fun

ctio

n (

lt) ,

lif

e ex

pec

tan

cy f

un

ctio

n (

et),

an

d r

isk

fact

or

mea

ns

and

sta

nd

ard

dev

iati

ons:

mal

es,

aged

30

year

s, F

ram

ing

ham

, M

ass.

hea

rt s

tud

y -

- ca

rdio

vas

cula

r/ce

reb

rov

ascu

lar

dise

ase

elim

inat

ed a

s a

cau

se o

f d

eath

un

der

co

nd

itio

nal

ly i

nd

epen

den

t co

mp

etin

g r

isk

mo

del

t I t

et

AG

E

PP

D

BP

Q

I C

HO

L

BL

DS

UG

H

EM

O

VIT

C

CIG

0 10

0 00

0 54

.71

30.0

0 45

.83

79.5

7 26

1.88

21

5.22

79

.35

142.

11

139.

29

13.2

4 10

0 00

0 55

.41

0.00

13

.70

12.5

3 34

.44

41.4

1 29

.63

10.2

5 31

.11

11.5

3

10

99 2

86

45.0

6 40

.00

41.2

3 83

.19

273.

32

241.

51

78.5

0 14

7.74

13

8.58

14

.48

99 2

86

45.7

6 0.

00

13.7

0 12

.53

34.4

2 41

.40

29.6

3 10

.24

31.1

1 11

.52

20

97 8

01

35.6

6 50

.00

47.8

0 83

.43

277.

07

241.

21

83.9

9 14

9.61

12

9.87

12

.69

97 8

02

36.3

7 0.

00

13.7

0 12

.53

34.3

7 41

.38

29.6

2 10

.24

31.1

1 11

.52

30

94 5

08

26.7

1 60

.00

55.3

6 83

.41

274.

30

233.

37

91.2

2 15

0.45

11

6.08

9.

30

94 5

19

27.4

4 0.

00

13.6

9 12

.53

34.2

5 41

.35

29.6

2 10

.24

31.1

0 11

.51

40

87 0

07

18.5

2 70

.00

63.0

2 83

.03

266.

63

223.

65

98.8

1 15

0.78

98

.93

5.06

87

082

19

.30

0~00

13

.69

12.5

2 33

.98

41.2

7 29

.61

10.2

3 31

.08

11.4

9

50

70 2

75

11.6

0 80

.00

70.6

0 82

.34

255.

99

214.

11

106.

50

150.

86

79.6

8 0.

29

70 8

20

12.4

5 0.

00

13.6

6 12

.51

33.3

9 41

.10

29.5

8 10

.20

31.0

4 11

.45

60

39 4

21

6.47

90

.00

78.1

3 81

.36

�9 24

4.93

20

6.45

11

4.18

15

1.29

59

.04

0.00

41

691

7.

40

0.00

13

.60

12.4

8 32

.23

40.7

0 29

.51

10.0

5 30

.87

0.00

70

7 75

5 4.

31

100.

00

85.2

3 80

.12

237.

24

202.

52

122.

25

152.

26

37.9

5 0.

00

10 7

11

5.34

0.

00

13.4

8 12

.42

30.2

8 39

.96

29.3

9 9.

93

30.6

7 0.

00

Med

ian

age

: 87

.02

Not

e:

1st

row

--

AG

E t

o C

IG g

ives

mea

ns;

2nd

ro

w -

- st

and

ard

dev

iati

on

s; b

eyo

nd

t =

50,

CIG

= 0

by

ass

um

pti

on

; 2

nd

ro

w -

- I t

and

re

sult

s u

nd

er m

arg

inal

ly i

nd

epen

den

t co

mp

etin

g r

isk

mo

del

~t

giv

es c

orr

esp

on

din

g

Page 15: Dependent competing risks: a stochastic process model

Tab

le 4

. D

eco

mp

osi

tio

n o

f tw

o t

yp

es o

f ca

use

eli

min

atio

n e

ffec

ts b

y a

ge a

nd

cau

se:

mal

es,

aged

30

year

s, F

ram

ing

ham

, M

ass.

hea

rt

stu

dy

--

can

cer

and

car

dio

vas

cula

r/ce

reb

rov

ascu

lar

dise

ase

(CV

D)

8

Age

int

erva

l A

ll c

ause

s

Co

nd

itio

nal

in

dep

end

ence

M

arg

inal

in

dep

end

ence

Can

cer

CV

D

Res

idu

al

Can

cer

CV

D

Res

idu

al

30-3

9 0.

158

0.15

8 0.

000

0.00

0 40

-49

0.25

0 0.

250

0.00

0 0.

000

50-5

9 0.

378

0.38

0 -0

.00

1

-0.0

01

60

-69

0.51

4 0.

522

-0.0

05

-0

.00

3

70-7

9 0.

547

0.57

1 -0

.01

5

-0.0

09

80

-89

0.33

2 0.

360

-0.0

19

-0

.01

0

90-9

9 0.

062

0.07

2 -0

.00

7

-0.0

03

10

0+

0.00

1 0.

001

0.00

0 0.

000

Tot

al

2.24

2 2.

315

-0.0

47

-0

.02

6

Can

cer

elim

inat

ed

0.15

8 0.

250

0.38

0 0.

521

0.57

2 0.

365

0.07

6 0.

002

2.32

4

Car

dio

vas

cula

r/ce

reb

rov

ascu

lar

30-3

9 0.

357

0.00

0 0.

357

0.00

0 40

-49

0.70

4 0.

000

0.70

5 0.

000

50-5

9 1.

182

-0.0

01

1.

184

-0.0

01

60

-69

1.79

9 -0

.00

5

1.80

9 -0

.00

4

70-7

9 2.

402

-0.0

20

2.

444

-0.0

22

80

-89

2.34

5 -

0.04

6 2.

454

-0.0

62

90

-99

1.19

7 -0

.04

7

1.30

7 -0

.06

3

100+

0.

199

-0.0

11

0.

224

-0.0

14

Tot

al

10.1

86

-0.1

31

10

.483

-0

.16

7

o o

o o

o o

o o

o o

o o

o o

o o

o o

dise

ase

(CV

D)

elim

inat

ed

0 0.

357

0 0

0.70

4 0

0 1.

183

0 0

1.80

8 0

0 2.

450

0 0

2.51

3 0

0 1.

487

0 0

0.38

4 0

0 10

.885

0

Fo

Page 16: Dependent competing risks: a stochastic process model

134 A.I. Yashin et al.

survival function and life expectancies under an independent competing risk model in which Ak(t, ~ ( t ) ) = 0 , k = 1. For the independence model, the means and standard deviations remain at the values seen in Table 1 since no interaction is assumed between causes of death. It can be seen that this adjustment suggests that the independence model overestimates the effects of cause elimination by 0.09 years - - about 4%. This assumes that the conditional independence model is correct, e.g. that all covariates generating the dependence are measured. Furthermore, we see that the relative magnitude of the dependence effect increases with age. By age 80 (the limit to the observed data) the dependency effect is 11% (i.e. 0.92/0.83).

The results for the simulated elimination of cardiovascular/cerebrovascular diseases (k = 2) are in Table 3. Here life expectancy at age 30 is estimated to increase 10.19 years, to 54.71 years; the independent competing risk model overestimates this effect by about 0.70 years, or 7%. As for cancer, the divergence of the survival functions is largest at the oldest ages. At age 80 the independent competing risk effect is 6.06 years. The conditional independence effect is 5.21 years. Thus, at age 80 there is a 16% difference due to the joint dependency of the forces of mortality on the measured covariates.
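The percentages quoted for cancer and for cardiovascular/cerebrovascular disease (CVD) follow directly from the tabulated life expectancies. The sketch below reproduces that arithmetic from the $e_t$ values in Tables 1 to 3 at age 30 (t = 0) and age 80 (t = 50); the dictionary layout and the helper name `relative_overestimate` are illustrative choices, not part of the original analysis.

```python
# Life expectancies e_t (years) read from Tables 1-3 at age 30 (t = 0) and age 80 (t = 50).
baseline = {30: 44.52, 80: 6.39}
eliminated = {
    "cancer": {"conditional": {30: 46.76, 80: 7.22}, "marginal": {30: 46.85, 80: 7.31}},
    "CVD": {"conditional": {30: 54.71, 80: 11.60}, "marginal": {30: 55.41, 80: 12.45}},
}


def relative_overestimate(cause, age):
    """Return (conditional gain, marginal gain, relative overestimate of the marginal model)."""
    cond = eliminated[cause]["conditional"][age] - baseline[age]
    marg = eliminated[cause]["marginal"][age] - baseline[age]
    return cond, marg, (marg - cond) / cond


for cause in ("cancer", "CVD"):
    for age in (30, 80):
        cond, marg, rel = relative_overestimate(cause, age)
        print(f"{cause:6s} age {age}: conditional {cond:5.2f}, marginal {marg:5.2f}, "
              f"overestimate {100 * rel:4.1f}%")
```

Rounded to whole percentages, the four ratios match the 4%, 11%, 7% and 16% figures given in the text.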

Greville (1948) showed that the effect on life expectancy of the elimination of a given cause can be expressed as a function of the age specific changes in the overall mortality hazard rate. Pollard (1982) extended this idea to the comparison of any two life tables by age and cause specific differences in the mortality hazard rates. In our notation,

$$\Delta e_{0k} = \int_0^{\infty} \sum_{j=1} \Delta\bar{\lambda}_j(t, \hat{x}(t))\, l_{t \cdot k}\, e_t \, dt,$$

where $l_{t \cdot k}$ is the survival function when cause $k$ is eliminated and $e_t$ is the baseline life expectancy (Table 1). Under the independence model, $\Delta\bar{\lambda}_j(t, \hat{x}(t)) = 0$, except for $j = k$, where $\Delta\bar{\lambda}_k(t, \hat{x}(t)) = \bar{\lambda}_k(t, \hat{x}(t))$. In contrast, under conditional independence, all causes contribute to $\Delta e_{0k}$.
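The decomposition can be checked numerically: the integral of the hazard change, weighted by the post-elimination survival function and the baseline life expectancy, should reproduce the direct difference in life expectancy at the starting age. The sketch below does this for the independence case, where only cause $k$ contributes. The two Gompertz-type hazards `h_other` and `h_k`, the 120-year grid, and the trapezoidal quadrature are all hypothetical choices for illustration; they are not the Framingham estimates.

```python
import numpy as np


def survival(hazard, dt):
    """Survival function l(t), with l(0) = 1, from a hazard sampled on a uniform grid."""
    cum = np.concatenate(([0.0], np.cumsum((hazard[1:] + hazard[:-1]) / 2.0) * dt))
    return np.exp(-cum)


def remaining_life(l, dt):
    """e(t) = (integral of l from t to the end of the grid) / l(t), trapezoidal rule."""
    seg = (l[1:] + l[:-1]) / 2.0 * dt
    tail = np.concatenate((np.cumsum(seg[::-1])[::-1], [0.0]))
    return tail / l


dt = 0.01
t = np.arange(0.0, 120.0 + dt, dt)      # years since the starting age
h_other = 2e-4 * np.exp(0.09 * t)       # hypothetical hazard of all other causes
h_k = 5e-5 * np.exp(0.08 * t)           # hypothetical hazard of the eliminated cause k
mu_base = h_other + h_k                 # baseline total hazard
mu_elim = h_other                       # total hazard with cause k eliminated

l_base, l_elim = survival(mu_base, dt), survival(mu_elim, dt)
e_base, e_elim = remaining_life(l_base, dt), remaining_life(l_elim, dt)

# Decomposition: integral of (hazard change) * l_{t.k} * e_t over age.
integrand = (mu_base - mu_elim) * l_elim * e_base
decomposed = np.sum((integrand[1:] + integrand[:-1]) / 2.0) * dt
direct = e_elim[0] - e_base[0]
print(decomposed, direct)
```

Under conditional independence the same integral would also pick up nonzero hazard changes for the non-eliminated causes, which is what the negative entries in Table 4 represent.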

In Table 4, we present the age and cause specific decomposition of the two types of cause elimination effects for cancer and for cardiovascular/cerebrovascular deaths. In this case the independent competing risk model does a good job of predicting the cancer component of $\Delta e_{01}$ (i.e. 2.324 vs. 2.315 years) for the conditional elimination of cancer. The differences in the two elimination models result from losses of 0.047 and 0.026 years from cardiovascular/cerebrovascular and residual causes, losses ignored in the independent competing risk model. The second panel of Table 4 contains results for the elimination of cardiovascular/cerebrovascular mortality. We see that the independent competing risk model overestimates the cardiovascular/cerebrovascular component of the conditionally independent cause elimination model by 3.5% (i.e. 10.885 vs. 10.483 years), which accounts for about half of the total 7% discrepancy.

7. Discussion and summary

We presented a model which addresses the dependent competing risk question in a tractable form, viz., that conditionally on an appropriate set of risk factor covariates the competing risks are independent. This translates the nonidentifiability aspect of the competing risk problem into the problem of specifying the effects of the risk covariates on mortality when the risk covariates are unobserved. We presented the equations for such a process under a very general model in which (1) each cause specific risk function is a quadratic function of risk covariates, and (2) the unobserved risk covariates are governed by a conditional Gaussian process. Parameter estimates were obtained from data on males in the Framingham Heart Study. These parameters permitted evaluation of the effects of incorrectly assuming marginal independence of the competing risks when in fact they were only conditionally independent. It was found that assuming marginal independence at age 30 led to a 4% overestimate of the effect on life expectancy of the elimination of cancer as a cause of death and a 7% overestimate for the elimination of cardiovascular/cerebrovascular disease. These discrepancies are small compared to the theoretical range of the effect suggested by Peterson (1976) and Cohen and Liu (1984). Thus, the assumption of risk independence in cause elimination calculations seems reasonable when one only has demographic data. It is clear, however, that the dependency effect increases with age as a result of the rapidly increasing age component of mortality represented by the Gompertz function. Thus, adjustment for cause dependency is more important for mortality at advanced ages where the interdependence of chronic morbid states and the effects of biological aging processes, herein represented by the Gompertz function, are significant. It is also more important when estimating the resulting population life expectancy after elimination than when producing an estimate of the effect of a given disease.

One point briefly addressed in this paper is the mechanism by which a cause of death is eliminated. That is, under both the marginal independence and conditional independence assumptions, the computations involve setting some hazard function ($\bar{\lambda}_k(t)$, $\bar{\lambda}_k(t, \hat{x}(t))$, or $\lambda_k(t, x(t), y(t))$) to zero. In the actuarial life table model, one has no choice since only a function of $\bar{\lambda}_k(t)$ is observable. However, in the model presented for the analysis of longitudinal data on measured risk factor covariates, a variety of interventions in the components of the physiological process could be postulated to assess cross-temporal and cohort changes in cause specific mortality patterns.

Acknowledgments. Dr. Manton's and Mr. Stallard's efforts in this research were supported by NIA Grant No. AG01159 and NSF Grant SES8219315.

Appendix

A1. Auxiliary results

Notation. Let $H = (H_t)_{t \ge 0}$ be a nondecreasing right-continuous family of $\sigma$-algebras in $\Omega$ and let $H_0$ be completed by sets of $P$-zero measure from $H = H_\infty$.

Denote by $Z_t$ or $Z(t)$, $t \ge 0$, the continuous time $H$-adapted process defined on $(\Omega, H, P)$. Denote by $H^z$ the family of $\sigma$-algebras in $\Omega$ generated by the values $z_u \equiv z(u)$ of the random process $Z(u)$:

$$H^z = (H^z_t)_{t \ge 0}, \quad H^z_t = \bigcap_{u > t} \sigma\{Z(v), v \le u\}.$$


Let us assume that $T$ is not a stopping time with respect to $H^z$ and that the $H^z_t$-conditional distribution function of the death time $T$ may be represented by the formula

$$P(T \le t \mid H^z_t) = 1 - \exp\left(-\int_0^t \lambda(u, Z_u)\,du\right), \tag{A1}$$

where $\lambda(t, Z_t)$ is an $H^z$-adapted and integrable random process. Denote by $R(t)$ or $R_t$ the life cycle process related to the stopping time $T$:

$$R_t = I(T \le t), \quad t \ge 0,$$

and introduce two families of $\sigma$-algebras $H^r$ and $H^{rz}$ where $H^r = (H^r_t)_{t \ge 0}$,

$$H^r_t = \sigma\{R(s), s \le t\}; \quad H^{rz} = (H^{rz}_t)_{t \ge 0}, \quad H^{rz}_t = H^r_t \vee H^z_t.$$

Using the terminology of the martingale theory (Liptser and Shiryayev (1977), Jacod (1979)) and the recent compensator representation results (Yashin (1984)) one can check that the process

$$A(t) = \int_0^{t \wedge T} \lambda(u, Z_u)\,du$$

is an $H^{rz}$-predictable compensator of the life cycle process $R_t$. This means that the process

$$M_t = I(T \le t) - A(t), \quad t \ge 0$$

is an $H^{rz}$-adapted martingale. If the termination time $T$ is viewed as the time of death, the process $\lambda(u, Z_u)$, $0 \le u \le t$, may be regarded as the age-specific mortality rate for an individual with history $Z_0^t = \{Z(u), 0 \le u \le t\}$.

Let $H^r = (H^r_t)_{t \ge 0}$. Denote by $\bar{A}(t)$ the $H^r$-predictable compensator of the life cycle process $R_t$. According to the definition of the compensator and the compensator representation results (Liptser and Shiryayev (1977), Jacod (1979)) one can write

$$\bar{A}(t) = \int_0^{t \wedge T} \frac{dP(T \le u)}{P(T \ge u)} = \int_0^{t \wedge T} \bar{\lambda}(u)\,du.$$

Note that $\bar{\lambda}(t)$ uniquely characterizes the unconditional distribution function $P(T \le t) = 1 - e^{-\int_0^t \bar{\lambda}(u)\,du}$. The natural question arises: How is $\bar{\lambda}(u)$ related to $\lambda(u, Z_u)$? The formula for $\bar{\lambda}(u)$ is the result of the following lemma:

Lemma 1. Let $Z(t)$ and $T$ be related as described by the formula (A1). Then,

$$\bar{\lambda}(t) = E[\lambda(t, Z_t) \mid T \ge t].$$

Proof. Consider the process

$$\bar{M}_t = E(M_t \mid H^r_t), \quad t \ge 0,$$

where $M_t$ is the $H^{rz}$-adapted martingale introduced above. The process $\bar{M}_t$ can be represented in the form:

$$\bar{M}_t = I(T \le t) - \int_0^{t \wedge T} E[\lambda(u, Z_u) \mid H^r_u]\,du - N_t,$$

where

$$N_t = E\left(\int_0^{t \wedge T} \lambda(u, Z_u)\,du \,\Big|\, H^r_t\right) - \int_0^{t \wedge T} E[\lambda(u, Z_u) \mid H^r_u]\,du.$$

The process $N_t$ is an $H^r$-predictable martingale. To prove this, it is sufficient to check the martingale property

$$E(N_t \mid H^r_v) = N_v,$$

which follows directly from the equality

$$E\left(\int_{v \wedge T}^{t \wedge T} \lambda(u, Z_u)\,du \,\Big|\, H^r_v\right) = E\left(\int_{v \wedge T}^{t \wedge T} E[\lambda(u, Z_u) \mid H^r_u]\,du \,\Big|\, H^r_v\right),$$

which is true for any $v \le t$. Since $\bar{M}_t$ is an $H^r$-adapted martingale, the process

$$I(T \le t) - \int_0^{t \wedge T} E[\lambda(u, Z_u) \mid H^r_u]\,du$$

is also an $H^r$-adapted martingale. Note further that the $\sigma$-algebra $H^r_u$ has the atom $\{T > u\}$ (Dellacherie (1972)) and consequently,

$$\int_0^{t \wedge T} E[\lambda(u, Z_u) \mid H^r_u]\,du = \int_0^{t \wedge T} E[\lambda(u, Z_u) \mid T > u]\,du.$$

The nondecreasing process on the right-hand side of this equality is $H^r$-adapted and continuous and, consequently, it is $H^r$-predictable. The uniqueness of the $H^r$-predictable compensator implies the formula

$$\bar{A}(t) = \int_0^{t \wedge T} E[\lambda(u, Z_u) \mid T > u]\,du$$

and consequently

$$\bar{\lambda}(t) = E[\lambda(t, Z_t) \mid T > t].$$
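The lemma can be illustrated in a deliberately simple special case. The sketch below assumes a covariate $Z$ that is constant in time and takes two values, with individual hazard $\lambda(t, Z) = Z\,h(t)$; the values, probabilities and Gompertz-type $h(t)$ are hypothetical. It compares the hazard obtained by averaging $\lambda$ over survivors with the hazard obtained by differentiating the marginal survival function.

```python
import math

# Hypothetical two-point covariate Z, constant over time (a special case of the lemma).
z_values = [0.5, 2.0]
z_probs = [0.7, 0.3]


def baseline_hazard(t):
    """h(t): assumed Gompertz-type form."""
    return 1e-3 * math.exp(0.08 * t)


def cumulative_baseline(t):
    """H(t) = integral of h(u) from 0 to t, in closed form."""
    return 1e-3 / 0.08 * (math.exp(0.08 * t) - 1.0)


def marginal_survival(t):
    """P(T > t) = sum over z of P(Z = z) * exp(-z * H(t))."""
    return sum(p * math.exp(-z * cumulative_baseline(t)) for z, p in zip(z_values, z_probs))


def hazard_from_lemma(t):
    """E[lambda(t, Z) | T > t], using P(Z = z | T > t) proportional to P(Z = z) exp(-z H(t))."""
    weights = [p * math.exp(-z * cumulative_baseline(t)) for z, p in zip(z_values, z_probs)]
    total = sum(weights)
    return sum(w / total * z * baseline_hazard(t) for w, z in zip(weights, z_values))


def hazard_from_survival(t, eps=1e-5):
    """-d/dt log P(T > t), approximated by a central difference."""
    return -(math.log(marginal_survival(t + eps)) - math.log(marginal_survival(t - eps))) / (2 * eps)


for age in (10.0, 40.0, 70.0):
    print(age, hazard_from_lemma(age), hazard_from_survival(age))
```

At later ages the averaged hazard falls below the unselected value $E[Z]\,h(t)$, because individuals with large $Z$ have been removed by mortality selection.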

Remark 1. If $\lambda(u, Z_u) = Z_u' Q(u) Z_u$ where $Q(u)$ is a positive definite symmetric matrix with bounded elements, $\bar{\lambda}(t)$ can be written

$$\bar{\lambda}(t) = m'(t) Q(t) m(t) + \mathrm{Tr}(Q(t)\gamma(t)),$$

where $m(t) = E[Z(t) \mid T > t]$ and $\gamma(t) = E[(Z(t) - m(t))(Z(t) - m(t))' \mid T > t]$.
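The identity behind Remark 1, that the expectation of a quadratic form equals the quadratic form of the mean plus the trace of $Q$ times the covariance, holds exactly for the empirical moments of any sample. The sketch below checks this with NumPy; the sample, the matrix `Q`, and the seed are arbitrary illustrative choices, and the sample stands in for the survivors at a fixed age.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample of covariate vectors Z among survivors at a fixed age t.
mixing = np.array([[1.0, 0.2, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 1.0]])
Z = rng.normal(size=(1000, 3)) @ mixing + np.array([1.0, -0.5, 2.0])

# Hypothetical symmetric positive definite hazard matrix Q(t).
Q = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 0.5]])

m = Z.mean(axis=0)                              # m(t)
gamma = np.cov(Z, rowvar=False, bias=True)      # gamma(t), normalized by N to match E[.]

lhs = np.mean(np.einsum("ni,ij,nj->n", Z, Q, Z))    # average of Z' Q Z
rhs = m @ Q @ m + np.trace(Q @ gamma)               # m' Q m + Tr(Q gamma)
print(lhs, rhs)
```

The `bias=True` normalization matters here: with the default unbiased covariance the two sides would differ by a factor of order 1/N in the trace term.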

Remark 2. Suppose that there is another right-continuous nondecreasing family of $\sigma$-algebras $H^x = (H^x_t)_{t \ge 0}$ such that for any $t > 0$, $H^x_t \subset H^z_t$. Then a similar consideration leads to the following formula for the $H^{rx}$-predictable compensator $\bar{A}(t, x)$ of the life cycle process $R(t)$:

$$\bar{A}(t, x) = \int_0^{t \wedge T} E(\lambda(u, Z_u) \mid H^x_u, T > u)\,du.$$

Consequently the $H^x_t$-conditional distribution function of the stopping time $T$ can be represented in the form

$$P(T \le t \mid H^x_t) = 1 - \exp\left(-\int_0^t E(\lambda(u, Z_u) \mid H^x_u, T > u)\,du\right).$$

A2. Proof of theorem (p. 127)

Let $t_1, t_2, \ldots$ be the times of measurements of the component $x(t)$. Introduce the conditional characteristic function $f_t(\alpha)$ defined on the intervals $[t_j, t_{j+1})$ as follows:

$$f_t(\alpha) = E(e^{i\alpha' Y(t)} \mid \hat{x}(t), T > t), \quad t_j \le t < t_{j+1},$$

where $\hat{x}(t) = \{x(t_1), x(t_2), \ldots, x(t_j)\}$ and $t_j = \max\{t_j : t_j \le t\}$ and $\{t_j\}$ is an ordered set of times. According to Bayes' rule, this function can be represented as follows: $f_t(\alpha) = E'(e^{i\alpha' Y(t)} \varphi(t))$, where

$$\varphi(t) = \exp\left(-\int_0^t \left(Y'(u)Q(u)Y(u) - \overline{Y'(u)Q(u)Y(u)}\right) du\right)$$

and $E'$ denotes the mathematical expectation with respect to the marginal probability measure corresponding to the trajectories of the Wiener process $W_u$, $0 \le u \le t$, and

$$\overline{Y'(u)Q(u)Y(u)} = E(Y'(u)Q(u)Y(u) \mid \hat{x}(u), T > u).$$

Using Ito's differential rule, the product $e^{i\alpha' Y(t)}\varphi(t)$ can be written:

$$\begin{aligned}
e^{i\alpha' Y(t)}\varphi(t) = {} & e^{i\alpha' Y(0)} + i\alpha' \int_0^t e^{i\alpha' Y(u)}\varphi(u)\,(a_0(u) + a_1(u)Y(u))\,du \\
& + i\alpha' \int_0^t e^{i\alpha' Y(u)}\varphi(u)\,b(u)\,dW_u - \frac{1}{2}\int_0^t e^{i\alpha' Y(u)}\varphi(u)\,\alpha' b(u)b'(u)\alpha\,du \\
& - \int_0^t e^{i\alpha' Y(u)}\varphi(u)\left[Y'(u)Q(u)Y(u) - \overline{Y'(u)Q(u)Y(u)}\right] du.
\end{aligned}$$


Taking the mathematical expectation E' of both sides of this equality produces

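$$\begin{aligned}
f_t(\alpha) = {} & f_0(\alpha) + i\alpha' \int_0^t a_0(u) f_u(\alpha)\,du + i\alpha' \int_0^t a_1(u)\, E'[e^{i\alpha' Y(u)}\varphi(u)Y(u)]\,du \\
& - \frac{1}{2}\int_0^t \alpha' b(u)b'(u)\alpha\, f_u(\alpha)\,du - \int_0^t E'[e^{i\alpha' Y(u)}\varphi(u)\,Y'(u)Q(u)Y(u)]\,du \\
& + \int_0^t f_u(\alpha)\,\overline{Y'(u)Q(u)Y(u)}\,du.
\end{aligned} \tag{A2}$$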

The Gaussian property of $Y(0)$, $X(0)$ yields $f_0(\alpha)$ in the form:

$$f_0(\alpha) = e^{i\alpha' m(0) - \frac{1}{2}\alpha'\gamma(0)\alpha}.$$

The induction method can be used to prove the conditional Gaussian property of $Y(t)$, $X(t)$. For this purpose suppose that the distribution of $Y(t_j)$, $X(t_j)$ is Gaussian given $\hat{x}(t_j)$ and $(T > t_j)$. This assumption allows us to write the formula for $f_{t_j}(\alpha)$ in the form:

$$f_{t_j}(\alpha) = e^{i\alpha' m(t_j) - \frac{1}{2}\alpha'\gamma(t_j)\alpha}.$$

This particular form and the equation for $f_t(\alpha)$ which is true for $t \in [t_j, t_{j+1})$ suggest that one should search for an $f_t(\alpha)$ in the same form on this time interval:

$$f_t(\alpha) = e^{i\alpha' m(t) - \frac{1}{2}\alpha'\gamma(t)\alpha} \tag{A3}$$

where $m(t)$ and $\gamma(t)$ satisfy some ordinary differential equations

$$\frac{dm(t)}{dt} = g(t), \quad m(0) \tag{A4}$$

$$\frac{d\gamma(t)}{dt} = G(t), \quad \gamma(0). \tag{A5}$$

(We assume that the equations for $m(t)$ and $\gamma(t)$ have unique solutions.) The vector function $g(t)$ and matrix $G(t)$ can be found from Eq. (A2) for $f_t(\alpha)$. In order to do this note that the following equalities hold:

$$f'_{\alpha} = E'(i\, e^{i\alpha' Y(t)}\varphi(t)\,Y(t))$$

$$f''_{\alpha\alpha} = -E'(e^{i\alpha' Y(t)}\varphi(t)\,Y(t)Y(t)')$$

where $f'_{\alpha}$ and $f''_{\alpha\alpha}$ denote the vector of the first order derivatives and the matrix of the second order derivatives, respectively, of the function $f_t(\alpha)$ with respect to $\alpha$.

Applying these formulas to the equation for $f_t(\alpha)$ we obtain (omitting the dependence of $f_t(\alpha)$ on $\alpha$ for simplicity):

$$\begin{aligned}
f_t = {} & f_0 + i\alpha' \int_0^t a_0(u) f_u\,du + \alpha' \int_0^t a_1(u) f'_{\alpha}\,du - \frac{1}{2}\int_0^t f_u\,\alpha' b(u)b'(u)\alpha\,du \\
& + \int_0^t \mathrm{Tr}(Q(u) f''_{\alpha\alpha})\,du + \int_0^t [m'(u)Q(u)m(u) + \mathrm{Tr}(Q(u)\gamma(u))]\, f_u\,du.
\end{aligned}$$

Derivatives $f'_{\alpha}$ and $f''_{\alpha\alpha}$ may be calculated from Eq. (A3):

$$f'_{\alpha} = f_t\,(i m(t) - \gamma(t)\alpha)$$

$$f''_{\alpha\alpha} = f_t\,(i m(t) - \gamma(t)\alpha)(i m(t) - \gamma(t)\alpha)' - f_t\,\gamma(t).$$

Substituting these derivatives into the equation for $f_t(\alpha)$, differentiating with respect to $t$, and using Eqs. (A4) and (A5) for $m(t)$ and $\gamma(t)$, we obtain:

$$\begin{aligned}
f_t\left[i\alpha' g(t) - \frac{1}{2}\alpha' G(t)\alpha\right] = {} & i\alpha' a_0(t) f_t + \alpha' a_1(t)\,(i m(t) - \gamma(t)\alpha)\, f_t - \frac{1}{2} f_t\,\alpha' b(t)b'(t)\alpha \\
& + f_t\,\mathrm{Tr}\{Q(t)[(i m(t) - \gamma(t)\alpha)(i m(t) - \gamma(t)\alpha)' - \gamma(t)]\} \\
& + f_t\,[m'(t)Q(t)m(t) + \mathrm{Tr}(Q(t)\gamma(t))].
\end{aligned}$$


Taking the real and imaginary parts of this equality yields

$$g(t) = a_0(t) + a_1(t)m(t) - 2\gamma(t)Q(t)m(t) \tag{A6}$$

$$G(t) = a_1(t)\gamma(t) + \gamma(t)a_1'(t) + b(t)b'(t) - 2\gamma(t)Q(t)\gamma(t) \tag{A7}$$

which lead to the equations for $m(t)$ and $\gamma(t)$ described in the theorem. Notice that the form of the $f_t(\alpha)$ noted above corresponds to the Gaussian law for the conditional distribution of $Y(t)$ given the event $(T > t)$.

Now we must show that Eq. (A5) with $G(t)$ given by (A7) has a unique solution. This can be done by implementing the approach developed in Liptser and Shiryayev (1977, chap. 12). Hence, we have demonstrated that, on the interval between $t_j$ and $t_{j+1}$, the mortality rate $\bar{\lambda}(t, \hat{x}(t))$ can be represented in terms of functions $m(t)$ and $\gamma(t)$ which correspond to the mean and variance of the Gaussian distribution of $Y(t)$ and $X(t)$ given $\hat{x}(t)$ and $(T > t)$.
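Between observation times, Eqs. (A4) to (A7) form an ordinary differential system for the conditional mean and covariance, which can be integrated numerically. The sketch below is a minimal illustration assuming constant coefficients and a classical fourth-order Runge-Kutta scheme; the function names, step size, and scalar test values are hypothetical, and the paper's own projections use difference equations rather than this scheme. With drift and diffusion switched off, the closed-form solution $\gamma(t) = \gamma(0)/(1 + 2Q\gamma(0)t)$ shows the pure effect of mortality selection.

```python
import numpy as np


def moment_derivatives(m, gamma, a0, a1, b, Q):
    """Right-hand sides g and G of Eqs. (A6) and (A7) for constant coefficients."""
    g = a0 + a1 @ m - 2.0 * gamma @ Q @ m
    G = a1 @ gamma + gamma @ a1.T + b @ b.T - 2.0 * gamma @ Q @ gamma
    return g, G


def integrate_moments(m0, gamma0, a0, a1, b, Q, t_end, dt=1e-3):
    """Integrate dm/dt = g, dgamma/dt = G from 0 to t_end with Runge-Kutta steps."""
    m = np.array(m0, dtype=float)
    gamma = np.array(gamma0, dtype=float)

    def f(mm, gg):
        return moment_derivatives(mm, gg, a0, a1, b, Q)

    for _ in range(int(round(t_end / dt))):
        k1m, k1g = f(m, gamma)
        k2m, k2g = f(m + 0.5 * dt * k1m, gamma + 0.5 * dt * k1g)
        k3m, k3g = f(m + 0.5 * dt * k2m, gamma + 0.5 * dt * k2g)
        k4m, k4g = f(m + dt * k3m, gamma + dt * k3g)
        m = m + dt * (k1m + 2.0 * k2m + 2.0 * k3m + k4m) / 6.0
        gamma = gamma + dt * (k1g + 2.0 * k2g + 2.0 * k3g + k4g) / 6.0
    return m, gamma


def mortality_rate(m, gamma, Q):
    """Quadratic-hazard mortality rate m' Q m + Tr(Q gamma)."""
    return float(m @ Q @ m + np.trace(Q @ gamma))


# Hypothetical scalar example with no drift or diffusion: only mortality selection acts.
Q = np.array([[0.01]])
zero_vec, zero_mat = np.zeros(1), np.zeros((1, 1))
m_end, gamma_end = integrate_moments([2.0], [[4.0]], zero_vec, zero_mat, zero_mat, Q, t_end=10.0)
print(m_end, gamma_end, mortality_rate(m_end, gamma_end, Q))
```

In this stripped-down case both the mean and the variance shrink over time, so the population mortality rate among survivors declines even though each individual's hazard function is unchanged.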

To complete the proof, we need to demonstrate that the distribution of $Y(t)$ and $X(t)$ remains Gaussian at the observation times $t_j$, $j = 1, 2, \ldots$. This is accomplished by recalling that the vector random variable $\begin{pmatrix} Y(t_j) \\ X(t_j) \end{pmatrix}$ has the Gaussian distribution given $\hat{x}(t_j^-)$ and $(T > t_j)$ with mean

$$m(t_j^-) = \begin{pmatrix} m_1(t_j^-) \\ m_2(t_j^-) \end{pmatrix}$$

and covariance matrix

$$\gamma(t_j^-) = \begin{pmatrix} \gamma_{11}(t_j^-) & \gamma_{12}(t_j^-) \\ \gamma_{21}(t_j^-) & \gamma_{22}(t_j^-) \end{pmatrix}.$$

It can be easily shown that the distribution of $Y(t_j)$, given $\hat{x}(t_j)$ and $(T > t_j)$, also is Gaussian. The normal correlation theorem (Theorem 13.1 in Liptser and Shiryayev (1977)), stated in terms of $Y(t_j)$ and $X(t_j)$, with $X(t_j)$ as an observable vector gives the following formulas for the mean $m_1(t_j)$ and covariance matrix $\gamma_{11}(t_j)$ of $Y(t_j)$:

$$m_1(t_j) = m_1(t_j^-) + \gamma_{12}(t_j^-)\,\gamma_{22}^{+}(t_j^-)\,(x(t_j) - m_2(t_j^-))$$

$$\gamma_{11}(t_j) = \gamma_{11}(t_j^-) - \gamma_{12}(t_j^-)\,\gamma_{22}^{+}(t_j^-)\,\gamma_{21}(t_j^-).$$

Taking into account that $\gamma_{22}(t_j) = 0$ and $\gamma_{12}(t_j) = \gamma_{21}(t_j) = 0$ (because $x(t_j)$ is observed) and $m_2(t_j) = x(t_j)$ (for the same reason), one completes the proof of the theorem.
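The update at an observation time is ordinary Gaussian conditioning and is short to write down. The sketch below assumes the pre-observation mean and covariance are already partitioned into unobserved ($Y$) and observed ($X$) blocks; the function name `update_at_observation` and the numerical values are hypothetical. A pseudo-inverse is used for the observed block, which coincides with the ordinary inverse whenever that block is nonsingular.

```python
import numpy as np


def update_at_observation(m1, m2, g11, g12, g22, x_obs):
    """Condition the unobserved block Y on an exact observation X = x_obs.

    m1, m2        : pre-observation means of Y and X
    g11, g12, g22 : pre-observation covariance blocks (g21 is the transpose of g12)
    Returns the post-observation mean and covariance of Y.
    """
    gain = g12 @ np.linalg.pinv(g22)
    m1_new = m1 + gain @ (x_obs - m2)
    g11_new = g11 - gain @ g12.T
    return m1_new, g11_new


# Hypothetical one-dimensional Y and X just before an examination.
m1, m2 = np.array([1.0]), np.array([2.0])
g11, g12, g22 = np.array([[2.0]]), np.array([[1.0]]), np.array([[4.0]])
m1_post, g11_post = update_at_observation(m1, m2, g11, g12, g22, x_obs=np.array([4.0]))
print(m1_post, g11_post)
```

After the update the observed block carries no remaining uncertainty, so in a full implementation its variance and its covariance with $Y$ would be reset to zero and its mean to the measured value, as stated in the closing step of the proof.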

References

1. Abraham, S.: Total serum cholesterol levels of adults 18-74 years. Vital and Health Statistics Series 11. Data from the National Health Survey, No. 205, DHEW Pub. No. (PHS) 78-1652, NCHS, Hyattsville, MD, 1978

2. Arnold, B. C., Brockett, P. L.: Identifiability for dependent multiple decrement/competing risk models. Scand. Actuarial J. 117-127 (1983)

3. Baum, H. M., Manton, K. G.: Cerebrovascular disease mortality: the relationship of underlying and associated causes of death. In review at American J. Public Health (1985)

4. Chiang, C. L.: Introduction to stochastic processes in biostatistics. New York: Wiley 1968

5. Cohen, J., Liu, L.: Competing risks without independence. Manuscript (1984)

6. Dellacherie, C.: Capacities and stochastic processes. Berlin Heidelberg New York: Springer 1972

7. Economos, A. C.: Rate of aging, rate of dying and the mechanism of mortality. Arch. Gerontol. Geriatr. 1, 3-27 (1982)

8. Greville, T. N.: Mortality tables analyzed by cause of death. Record American Inst. Actuaries 37, 283-294 (1948)

9. Jacod, J.: Calcul stochastique et problèmes de martingales. In: Lect. Notes Math. 714. Berlin Heidelberg New York: Springer 1979

10. Keyfitz, N.: What difference would it make if cancer were eradicated? An examination of the Taeuber paradox. Demography 14, 411-418 (1977)

11. Klebba, A. J.: Mortality from diseases associated with smoking. Vital and Health Statistics Series 20. Data from the National Vital Statistics System, No. 17, DHHS Pub. No. (PHS) 82-1854, NCHS, Hyattsville, MD, 1982

12. Liptser, R. S., Shiryayev, A. N.: Statistics of random processes. Berlin Heidelberg New York: Springer 1977

13. Manton, K. G., Stallard, E., Woodbury, M. A.: Chronic disease evolution and human aging: a general model for assessing the impact of chronic disease in human populations. Internat. J. Math. Modeling (special issue on math. modeling of diseases, M. Witten, ed.), forthcoming (1986)

14. Peterson, A. V.: Bounds for a joint distribution function with fixed sub-distribution functions: application to competing risks. Proc. Natl. Acad. Sci. USA 73, 11-13 (1976)

15. Pollard, J. H.: The expectation of life and its relationship to mortality. J. Inst. Actuaries 109, 225-240 (1982)

16. Roberts, J.: Blood pressure levels in persons 6-74 years, United States, 1971-1974. Vital and Health Statistics Series 11. Data from the National Health Survey, No. 203, DHEW Pub. No. (HRA) 78-1648, NCHS, Hyattsville, MD, 1977

17. Shock, N. W., Greulich, R. C., Andres, R., Arenberg, D., Costa, P. T., Lakatta, E. G., Tobin, J. D.: Normal human aging: the Baltimore longitudinal study of aging. DHHS, Pub. No. (NIH) 84-2450. Washington, D.C.; USGPO 1984

18. Tsiatis, A.: A nonidentifiability aspect of the problem of competing risks. Proc. Natl. Acad. Sci. USA 72, 20-22 (1975)

19. Tolley, H. D., Manton, K. G.: Comparing mortality risks with chronic conditions. Proc. American Stat. Assoc., Toronto Meeting (1984)

20. Tolley, H. D., Manton, K. G.: Multiple cause models of disease dependency. Scand. Actuarial J. 211-226 (1983)

21. Woodbury, M. A., Manton, K. G.: A random walk model of human mortality and aging. Theor. Popul. Biol. 11, 37-48 (1977)

22. Woodbury, M. A., Manton, K. G.: A theoretical model of the physiological dynamics of circulatory disease in human populations. Human Biol. 55, 417-441 (1983)

23. Yashin, A. I.: Hazard rates and probability distributions: Representation of random intensities. WP-84-21, International Institute for Applied Systems Analysis, Laxenburg, Austria (1984)

Received October 9, 1985/Revised February 2, 1986