58
New methods based on the adaptive ridge procedure to take into account age, period and cohort effects Olivier Bouaziz 1 and Vivien Goepp 1 with Gr´ egory Nuel 3 and Jean-Christophe Thalabard 1 1 MAP5 (CNRS 8145), Universit´ e Paris Descartes, Sorbonne Paris Cit´ e 3 LPSM (CNRS 8001), Sorbonne Universit´ e, Paris eminaire du C´ epiDc Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 1 / 43

New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

New methods based on the adaptive ridge procedure totake into account age, period and cohort effects

Olivier Bouaziz1 and Vivien Goepp1

with Gregory Nuel3 and Jean-Christophe Thalabard1

1MAP5 (CNRS 8145), Universite Paris Descartes, Sorbonne Paris Cite3LPSM (CNRS 8001), Sorbonne Universite, Paris

Seminaire du CepiDc

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 1 / 43

Page 2: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

1 Background in time to event analysis

2 The adaptive ridge procedure for piecewise constant hazards

3 Bidimensional estimation of the hazard rate

4 Extension of the age-period-cohort model

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 2 / 43

Page 3: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Outline

1 Background in time to event analysis

2 The adaptive ridge procedure for piecewise constant hazards

3 Bidimensional estimation of the hazard rate

4 Extension of the age-period-cohort model

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 3 / 43

Page 4: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Background in time to event analysis

I We study a positive continuous time to event variable T .

I T represents the time difference between event of interest and patiententry.

Study start Patient entry Ev. of interest

T

I Examples : time to relapse of Leukemia patients, time to onset ofcancer, time to death . . .

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 4 / 43

Page 5: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Background in time to event analysis : right censoring

Study start End of study

T1

T obs2

T2

T obs3

T3

T4

T obs4

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 5 / 43

Page 6: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The observations, the hazard rate and the likelihood

I Observations :{T obsi = Ti ∧ Ci

∆i = 1Ti≤Ci

I Independent censoring : T ⊥⊥ C

I The hazard rate is defined as :

λ(t) := lim4t→0

P[t ≤ T < t +4t|T ≥ t]

4t

I The likelihood of the observed data is equal to :

n∏i=1

f (T obsi )∆iS(T obs

i )1−∆i =n∏

i=1

λ(T obsi )∆i exp

(−∫ T obs

i

0λ(t)dt

),

where f is the density of T and S(t) = P[T > t].

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 6 / 43

Page 7: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Outline

1 Background in time to event analysis

2 The adaptive ridge procedure for piecewise constant hazards

3 Bidimensional estimation of the hazard rate

4 Extension of the age-period-cohort model

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 7 / 43

Page 8: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The piecewise constant hazard model

I The model :

λ(t) =K∑

k=1

λk1ck−1<t≤ck

I Goal : estimate the λks.

The log-likelihood is equal to :

`n(λ) =K∑

k=1

{Ok log (λk)− λk Rk

},

where

I Ok =∑

i ∆i1ck−1<T obsi ≤ck

: number of observed events in interval

(ck−1, ck ]

I Rk =∑

i (Tobsi ∧ ck − ck−1) : total time at risk in interval (ck−1, ck ]

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 8 / 43

Page 9: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The piecewise constant hazard model

I Ok : number of observed events in interval (ck−1, ck ]

I Rk : total time at risk in interval (ck−1, ck ]

The maximum likelihood estimator is explicit :

λmlek =

Ok

Rk

I We want to choose the number and location of the cuts from the data

I We start from a large grid of cuts (K = 100, 1 000, . . .)

I We use a penalization technique to constrain adjacent cut values tobe equal.

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 9 / 43

Page 10: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Penalizing the maximum likelihood estimator

Set log λk = ak . Estimation of a is achieved through penalized

log-likelihood :

`penn (a) = `n(a)︸ ︷︷ ︸

log-likelihood

− pen

2

{K−1∑k=1

wk (ak+1 − ak)2

}︸ ︷︷ ︸

regularization term

,

I w represents a weight

I pen is a penalty term

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 10 / 43

Page 11: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Penalizing the maximum likelihood estimator

Set log λk = ak . Estimation of a is achieved through penalized

log-likelihood :

`penn (a) = `n(a)︸ ︷︷ ︸

log-likelihood

− pen

2

{K−1∑k=1

wk (ak+1 − ak)2

}︸ ︷︷ ︸

regularization term

,

I w represents a weight

I pen is a penalty term

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 10 / 43

Page 12: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Two types of regularization

1. L2 regularization (Ridge) with w = 1

2. L0 regularization with the adaptive ridge procedure.Iterative updates of the weights :

wk =((

ak+1 − ak)2

+ ε2)−1

,

with ε� 1.

F. Frommlet and G. Nuel, An Adaptive Ridge Procedure for L0Regularization. PlosOne (2016).

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 11 / 43

Page 13: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

L0 norm approximation

When ε� 1 :

wk (ak+1 − ak)2 ' ‖ak+1 − ak‖20 =

{0 if ak+1 = ak

1 if ak+1 6= ak

●ε

0.00

0.25

0.50

0.75

1.00

−5.0e−06 −2.5e−06 0.0e+00 2.5e−06 5.0e−06

x

Légende●

||x||02

x2

x2 + ε2

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 12 / 43

Page 14: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Maximization of the penalized log-likelihood

I The penalized estimator is no longer explicit

I Maximization is performed from the Newton-Raphson algorithm. Fora given sequence of weights w , the mth Newton Raphson iterationstep is obtained from the equation

a(m) = a

(m−1) + I (a(m−1),w)−1U(a(m−1),w),

where I is the opposite of the Hessian matrix, U is the score vector.

I The Hessian matrix is tri-diagonal

I =⇒ computation time for the inversion of the Hessian is O(K )

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 13 / 43

Page 15: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The Adaptive Ridge procedure for a given penalty

procedure Adaptive-Ridge(O,R, pen)(a,w , sel) ← (0,1,0)while not converge do

anew ← Newton-Raphson(O,R, pen, a,w)

wnewk ←

((anewk+1 − anew

k

)2+ ε2

)−1

selnewk ← wnew

k

(anewk+1 − anew

k

)2

(a,w ,sel) ← (anew,wnew,selnew)end while

Compute (Osel,Rsel)exp(amle) ← O

sel/Rsel

return amle

end procedure

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 14 / 43

Page 16: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The Adaptive Ridge procedure for a given penalty

procedure Adaptive-Ridge(O,R, pen)(a,w , sel) ← (0,1,0)while not converge do

anew ← Newton-Raphson(O,R, pen, a,w)

wnewk ←

((anewk+1 − anew

k

)2+ ε2

)−1

selnewk ← wnew

k

(anewk+1 − anew

k

)2

(a,w ,sel) ← (anew,wnew,selnew)end whileCompute (Osel,Rsel)exp(amle) ← O

sel/Rsel

return amle

end procedure

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 14 / 43

Page 17: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Comparison of the two regularization methods

pen = 0 =⇒ a = amle

pen =∞ =⇒ a = constant

1e−01 1e+01 1e+03 1e+05

−8

−7

−6

−5

−4

−3

−2

Penalty

Pa

ram

ete

r va

lue

(o

n t

he

lo

g−

sca

le)

L2 regularization

1e−01 1e+01 1e+03 1e+05

−8

−7

−6

−5

−4

−3

−2

Penalty

Pa

ram

ete

r va

lue

(o

n t

he

lo

g−

sca

le)

L0 regularization

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 15 / 43

Page 18: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Comparison of the two regularization methods

pen = 0 =⇒ a = amle

pen =∞ =⇒ a = constant

1e−01 1e+01 1e+03 1e+05

−8

−7

−6

−5

−4

−3

−2

Penalty

Pa

ram

ete

r va

lue

(o

n t

he

lo

g−

sca

le)

L2 regularization

1e−01 1e+01 1e+03 1e+05

−8

−7

−6

−5

−4

−3

−2

Penalty

Pa

ram

ete

r va

lue

(o

n t

he

lo

g−

sca

le)

L0 regularization

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 15 / 43

Page 19: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator (n = 400)

0 20 40 60 80

0.00

0.02

0.04

0.06

0.08

0.10

Time

Haz

ard

I In red the true hazard function

I In black the hazard estimator for pen = 0.1

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 16 / 43

Page 20: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator (n = 400)

0 20 40 60 80

0.00

0.02

0.04

0.06

0.08

0.10

Time

Haz

ard

I In red the true hazard function

I In black the hazard estimator for pen = 0.27

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 16 / 43

Page 21: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator (n = 400)

0 20 40 60 80

0.00

0.02

0.04

0.06

0.08

0.10

Time

Haz

ard

I In red the true hazard function

I In black the hazard estimator for pen = 0.55

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 16 / 43

Page 22: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator (n = 400)

0 20 40 60 80

0.00

0.02

0.04

0.06

0.08

0.10

Time

Haz

ard

I In red the true hazard function

I In black the hazard estimator for pen = 0.77

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 16 / 43

Page 23: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator (n = 400)

0 20 40 60 80

0.00

0.02

0.04

0.06

0.08

0.10

Time

Haz

ard

I In red the true hazard function

I In black the hazard estimator for pen = 1.54

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 16 / 43

Page 24: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator (n = 400)

0 20 40 60 80

0.00

0.02

0.04

0.06

0.08

0.10

Time

Haz

ard

I In red the true hazard function

I In black the hazard estimator for pen = 6.16

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 16 / 43

Page 25: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator (n = 400)

0 20 40 60 80

0.00

0.02

0.04

0.06

0.08

0.10

Time

Haz

ard

I In red the true hazard function

I In black the hazard estimator for pen = 52.70

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 16 / 43

Page 26: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator

Three different methods to perform model selection :

1. BIC(m) = −2`n(amlem ) + m log n

2. AIC(m) = −2`n(amlem ) + 2m

3. K-fold Cross Validation (CV),

with m the dimension of the model :

m =K−1∑k=0

1{amlek+1,m − amle

k,m 6= 0}.

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 17 / 43

Page 27: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for the Adaptive Ridge estimator using theBIC (n = 400)

0.1 0.5 1.0 5.0 10.0 50.0 100.0

−6

−5

−4

−3

−2

Penalty

Pa

ram

ete

r va

lue

(o

n th

e lo

g−

sca

le)

Regularization path

0 20 40 60 80

0.0

00

.01

0.0

20

.03

0.0

40

.05

Time

Ha

za

rd

Hazard estimator (in black)

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 18 / 43

Page 28: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Outline

1 Background in time to event analysis

2 The adaptive ridge procedure for piecewise constant hazards

3 Bidimensional estimation of the hazard rate

4 Extension of the age-period-cohort model

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 19 / 43

Page 29: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The Lexis diagram

period

age

1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 20000

10

20

30

40

50

60

70

80

90

100

×

×

cohort

period

age

Age-Period Lexis diagram

cohort

age

1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 20000

10

20

30

40

50

60

70

80

90

100

×

×

cohort

period

age

Age-Cohort Lexis diagram

Key relation : cohort+age=period

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 20 / 43

Page 30: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The Lexis diagram

period

age

1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 20000

10

20

30

40

50

60

70

80

90

100

×

×

cohort

period

age

Age-Period Lexis diagram

cohort

age

1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 20000

10

20

30

40

50

60

70

80

90

100

×

×

cohort

period

age

Age-Cohort Lexis diagram

Key relation : cohort+age=period

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 20 / 43

Page 31: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The SEER data

I Huge american registry dataset of breast cancerhttps ://seer.cancer.gov

I Primary, unilateral, malignant and invasive cancers

I 1.2 million of patients, 60% of censoring

I The cancer diagnosis range from 1973 to 2014

I The time from cancer diagnosis to death or censoring ranges from 0to 41 years.

I The variable of interest is the time from cancer diagnosis until death.

Aim : estimate the hazard of death as a function of both date of cancerdiagnosis and time since diagnosis.

I We use the adaptive ridge procedure

I Penalization over the two directions.

V. Goepp, J-C. Thalabard, G. Nuel and O. Bouaziz. RegularizedBidimensional Estimation of the Hazard Rate. Submitted

.Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 21 / 43

Page 32: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The SEER data

I Huge american registry dataset of breast cancerhttps ://seer.cancer.gov

I Primary, unilateral, malignant and invasive cancers

I 1.2 million of patients, 60% of censoring

I The cancer diagnosis range from 1973 to 2014

I The time from cancer diagnosis to death or censoring ranges from 0to 41 years.

I The variable of interest is the time from cancer diagnosis until death.

Aim : estimate the hazard of death as a function of both date of cancerdiagnosis and time since diagnosis.

I We use the adaptive ridge procedure

I Penalization over the two directions.

V. Goepp, J-C. Thalabard, G. Nuel and O. Bouaziz. RegularizedBidimensional Estimation of the Hazard Rate. Submitted.

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 21 / 43

Page 33: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The SEER data

I λj ,k : true hazard in rectangle (j , k)

I Oj ,k : number of observed events in rectangle (j , k)

I Rj ,k : total time at risk in rectangle (j , k)

The log-likelihood is equal to :

`n(λ) =J∑

j=1

K∑k=1

{Oj ,k log (λj ,k)− λj ,kRj ,k}

Set log λj ,k = ηj ,k . Estimation of η through penalized log-likelihood :

`penn (η) = `n(η)︸ ︷︷ ︸

log-likelihood

− pen

2

∑j ,k

{vj ,k (ηj+1,k − ηj ,k)2 + wj ,k (ηj ,k+1 − ηj ,k)2

}︸ ︷︷ ︸

regularization term

.

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 22 / 43

Page 34: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The SEER data

I λj ,k : true hazard in rectangle (j , k)

I Oj ,k : number of observed events in rectangle (j , k)

I Rj ,k : total time at risk in rectangle (j , k)

The log-likelihood is equal to :

`n(λ) =J∑

j=1

K∑k=1

{Oj ,k log (λj ,k)− λj ,kRj ,k}

Set log λj ,k = ηj ,k . Estimation of η through penalized log-likelihood :

`penn (η) = `n(η)︸ ︷︷ ︸

log-likelihood

− pen

2

∑j ,k

{vj ,k (ηj+1,k − ηj ,k)2 + wj ,k (ηj ,k+1 − ηj ,k)2

}︸ ︷︷ ︸

regularization term

.

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 22 / 43

Page 35: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

The SEER datastage1 stage2 stage3

ag

e <

45

45

< a

ge

< 5

5a

ge

> 5

5

1990 1995 2000 2005 1990 1995 2000 2005 1990 1995 2000 2005

0

10

20

0

10

20

0

10

20

Date of diagnostic

Su

rviv

al tim

e a

fte

r d

iag

no

stic (

ye

ars

)

0.05

0.10

0.15

0.20

Mortality

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 23 / 43

Page 36: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Outline

1 Background in time to event analysis

2 The adaptive ridge procedure for piecewise constant hazards

3 Bidimensional estimation of the hazard rate

4 Extension of the age-period-cohort model

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 24 / 43

Page 37: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

RecapAge-period-cohort analysis

period

age

1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 20000

10

20

30

40

50

60

70

80

90

100

×

×

×

×

cohort

period

age

Lexis diagram : Age-Period

cohort

age

1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 20000

10

20

30

40

50

60

70

80

90

100

×

×

×

cohort

period

age

Lexis Diagram : Age-Cohort

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 25 / 43

Page 38: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Age-period-cohort analysis

We want to infer the effect of age, period, and cohort.

I age effect : menopause

I cohort effect : carcinogenic baby food

I period effect : nuclear accident

We define one parameter vector for each effect : α, β et γ

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 26 / 43

Page 39: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Existing models

1. In the age-period-cohort model, we assume

log λj ,k = αj + βk + γj+k−1.

I Non-identifiable : we can eitherI infer ∆2α, ∆2β et ∆2γ.I add a constraint to the model.

2. In the age-cohort model, we assume

log λj ,k = αj + βk .

I J + K − 1 parameters for JK variables → regularizing

I Additive effect of the variables : strong a priori

B. Carstensen, Age–period–cohort models for the Lexis diagram, Statistics inmedicine, 2007.

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 27 / 43

Page 40: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Our approach : model the effects and their interactions

I The age-cohort and age-period-cohort models do not inferinteractions between effects.

I We introduce an age-cohort-interaction model :

log (λj ,k) = µ+ αj + βk + δj ,k ,

where δj ,k is the interaction (with δ1,k = δj ,1 = 0).

I We regularize over the differences of δj ,k .

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 28 / 43

Page 41: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Estimation in the ACI model

I The model parameter is θ = (µ,α,β, δ).

I Estimation using the adaptive ridge :

`penn (θ) = `n(θ)−pen

2

∑j ,k

{vj ,k (δj+1,k − δj ,k)2+wj ,k (δj ,k+1 − δj ,k)2

}.

I With adaptive ridge procedure over the interaction term δ.

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 29 / 43

Page 42: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Adaptive ridge procedureprocedure Adaptive-Ridge(O,R, pen)

θ ← 0v ← 1w ← 1while not converge do

θnew ← Newton-Raphson(O,R, pen, v ,w)

vnewj,k ←

((δnewj+1,k − δnew

j,k

)2

+ ε2

)−1

wnewj,k ←

((δnewj,k − δnew

j,k−1

)2

+ ε2

)−1

θ ← θnew

v ← vnew

w ← wnew

end whileCompute (Osel,Rsel) from (θnew, vnew,wnew)θmle ← O

sel/Rsel

return θmle

end procedure

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 30 / 43

Page 43: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Adaptive ridge procedureprocedure Adaptive-Ridge(O,R, pen)

θ ← 0v ← 1w ← 1

while not converge doθnew ← Newton-Raphson(O,R, pen, v ,w)

vnewj,k ←

((δnewj+1,k − δnew

j,k

)2

+ ε2

)−1

wnewj,k ←

((δnewj,k − δnew

j,k−1

)2

+ ε2

)−1

θ ← θnew

v ← vnew

w ← wnew

end whileCompute (Osel,Rsel) from (θnew, vnew,wnew)θmle ← O

sel/Rsel

return θmle

end procedure

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 30 / 43

Page 44: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Adaptive ridge procedureprocedure Adaptive-Ridge(O,R, pen)

θ ← 0v ← 1w ← 1while not converge do

θnew ← Newton-Raphson(O,R, pen, v ,w)

vnewj,k ←

((δnewj+1,k − δnew

j,k

)2

+ ε2

)−1

wnewj,k ←

((δnewj,k − δnew

j,k−1

)2

+ ε2

)−1

θ ← θnew

v ← vnew

w ← wnew

end while

Compute (Osel,Rsel) from (θnew, vnew,wnew)θmle ← O

sel/Rsel

return θmle

end procedure

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 30 / 43

Page 45: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Adaptive ridge procedureprocedure Adaptive-Ridge(O,R, pen)

θ ← 0v ← 1w ← 1while not converge do

θnew ← Newton-Raphson(O,R, pen, v ,w)

vnewj,k ←

((δnewj+1,k − δnew

j,k

)2

+ ε2

)−1

wnewj,k ←

((δnewj,k − δnew

j,k−1

)2

+ ε2

)−1

θ ← θnew

v ← vnew

w ← wnew

end whileCompute (Osel,Rsel) from (θnew, vnew,wnew)θmle ← O

sel/Rsel

return θmle

end procedure

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 30 / 43

Page 46: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Adaptive ridge procedure

cohort

age

j = 1

j = 2

j = 3

k = 1 k = 2 k = 3

cohort

age

j = 1

j = 2

j = 3

k = 1 k = 2 k = 3

cohort

age

j = 1

j = 2

j = 3

k = 1 k = 2 k = 3

2

1

1

2

1

1

2

2

3

(a) Representation of

vj ,k (δj+1,k − δj ,k)2

et wj ,k (δj ,k+1 − δj ,k)2(b) Corresponding graph

(c) Segmentation intoconnected components

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 31 / 43

Page 47: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Illustration : simulated dataSimulation setting

I J = 20 age intervals and K = 20 cohort intervals

I Sample the cohort uniformly

I Sample the age using the hazard rate (λj ,k)

I Uniform censoring over the age [75, 100]

I Infer (µ,α,β, δ) in the ACI model.

I We represent medians over 100 repetitions

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 32 / 43

Page 48: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Illustration : simulated dataSimulation design 1

cohort

1920

1940

1960

1980

age

20

40

60

80

hazard

0.02

0.04

0.06

0.08

True hazard

Cohort

1920

1940

1960

1980

Age

20

40

60

80

Hazard

0.05

0.10

Hazard MLE

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 33 / 43

Page 49: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Results with the ACI modelSimulation design 1

cohort

1920

1940

1960

1980

age

20

40

60

80

hazard

0.02

0.04

0.06

0.08

Estimated hazard λj,k

Cohort

1920

1940

1960

1980

Age

20

40

60

80

Hazard

1.0

1.2

1.4

1.6

1.8

Estimated interaction δj,k

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 34 / 43

Page 50: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Results with the ACI modelSimulation design 1

Blue : true valuesRed : median estimate over 100 repetitions

0.0

0.5

1.0

1.5

25 50 75 100

Age

Effe

ct

Estimated age effect αj

−0.25

0.00

0.25

0.50

1925 1950 1975 2000

Cohort

Effe

ct

Estimated cohort effect βk

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 35 / 43

Page 51: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Illustration : simulated dataSimulation design 2

Cohort

1920

1940

1960

1980

Age

20

40

60

80

Hazard

0.01

0.02

0.03

0.04

0.05

True hazard

Cohort

1920

1940

1960

1980

Age

20

40

60

80

Hazard

0.00

0.01

0.02

0.03

0.04

0.05

Hazard MLE estimate

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 36 / 43

Page 52: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Results with the ACI modelSimulation design 2

Cohort

1920

1940

1960

1980

Age

20

40

60

80

Hazard

0.02

0.03

0.04

0.05

Estimated hazard λj,k

Cohort

1920

1940

1960

1980

Age

20

40

60

80

Hazard

1.0

1.1

1.2

1.3

1.4

Estimated interaction δj,k

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 37 / 43

Page 53: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Illustration : simulated dataSimulation design 3

Cohort

1920

1940

1960

1980

Age

20

40

60

80

Hazard

0.01

0.02

0.03

0.04

0.05

True hazard

Cohort

1920

1940

1960

1980

Age

20

40

60

80

Hazard

0.01

0.02

0.03

0.04

0.05

0.06

Age-cohort model

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 38 / 43

Page 54: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Results with the ACI modelSimulation design 3

cohort

1920

1940

1960

1980

age

20

40

60

80

hazard

0.02

0.03

0.04

0.05

Estimated hazard λj,k

cohort

1920

1940

1960

1980age

20

40

60

80

hazard

1.0

1.5

2.0

Estimated interaction δj,k

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 39 / 43

Page 55: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Conclusion and perspectives

Conclusion :

I Extends the age-cohort model

I More general than the age-period-cohort model

Perspectives :

I Bootstrapping to reduce sensitivity to outliers

I Application : Incidence of breast cancer in Norway (NOWAC cohort)

More info :

I R package : github.com/goepp/hazreg

I Website : www.math-info.univ-paris5.fr/~obouaziz andgoepp.github.io

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 40 / 43

Page 56: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Merci de votre attention

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 41 / 43

Page 57: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Bonus : Model selection for Adaptive RidgeBayesian criteria

I Problem : choose between M models M1, . . . ,MM of dimensionsq1, . . . , qM .

I Solution : maximize P(Mm|R,O) ∝ P(R,O|Mm)π(Mm).

I By approximation :

−2 log (P(Mm|R,O)) = 2`n(ηm) + qm log n−2 log π(Mm) +OP(1)

I We must choose the prior a priori π (Mm)

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 42 / 43

Page 58: New methods based on the adaptive ridge procedure to take ... · New methods based on the adaptive ridge procedure to take into account age, period and cohort e ects Olivier Bouaziz1

Model selection for Adaptive Ridge

BIC : π (Mm) = 1All the Mm are equiprobable

EBIC0 : P(Mm ∈M[qm]

)= 1

All the M[qm] are equiprobable

× ×××

×

××××

×××

×

Model sets M[1] M[2] M[qm] M[JK−1]M[JK ]

M[qm] is the set of models with qm parameters

J. Chen and Z. Chen, Extended Bayesian information criteria for model selection withlarge model spaces, Biometrika, 2008.

Olivier Bouaziz and Vivien Goepp The AR procedure for APC models June 18, 2019 43 / 43