Model-based vs. non-parametric estimators of net survival

Paul W Dickman (1), Paul C Lambert (1,2)

(1) Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

(2) Department of Health Sciences, University of Leicester, UK

EPAAC WP9 Satellite Meeting: State of Art of Methods for the Analysis of Population-Based Cancer Data

23 January 2014

Which approach should I use to estimate net survival?

A common question in our teaching.

Unless there is reason to prefer a cause-specific approach, we recommend one of the following:

– Ederer II
– Pohar Perme
– Model-based

The choice of method depends on the research question and practical considerations.

We were recently critical [1] of a paper [2] that advocated the Pohar Perme approach; our criticism was of that particular paper and not of the Pohar Perme approach per se, of which we are great admirers.

We are not suggesting Ederer II is superior

We are not suggesting that the Ederer II approach is superior to the Pohar Perme approach.

However, we argue that it is not as inferior as some others (e.g., Roche et al [2]) would have us believe.

We do not agree with Roche et al [2] that “In estimating net survival, cancer registries should abandon all classical methods and adopt the new Pohar-Perme estimator” because “great errors may occur ...”.

Internally standardised, or age-specific, Ederer II estimates of 5-year and 10-year net survival are biased, although the bias is generally so small that it makes no practical difference.

In short, don’t panic if you have used, or are using, Ederer II.


All methods require assumptions

1. Conditional independence between cancer and non-cancer mortality. That is, there are no factors associated with both cancer and non-cancer mortality other than those factors that have been controlled for in the estimation (e.g., via stratification, regression modelling or appropriate weighting).

2. The estimates of expected mortality represent the mortality that would have been experienced by the cancer patients if they were not diagnosed with cancer.

3. Administrative censoring is non-informative, or an appropriate adjustment is applied.


If interest is in a summary of net survival for all patients with a particular cancer

The Pohar Perme estimator was designed specifically for this type of application.

A model-based approach can also give an estimate with minimal bias, but why bother?


Simulation Study

Expected survival from the UK general population.

Assume a linear association (HR = 1.03) between age and net mortality, with the effect constant throughout follow-up. In another scenario (not shown here), the effect of age was restricted to the early follow-up years.

We considered only Danieli’s mechanism 1 and not informative administrative censoring (Danieli’s mechanism 2). This is because the second mechanism will affect all estimators and we are primarily interested in differences between the estimators due to mechanism 1.

500 data sets simulated with 15,000 patients in each.


Simulation Study - True values of relative survival

Values are percentages.

Age         1 year   5 years   10 years   15 years
35           92.4     83.8      77.9       73.7
45           89.9     78.8      71.4       66.2
55           86.6     72.5      63.5       57.3
65           82.4     64.8      54.1       47.2
75           77.0     55.7      43.7       36.3
85           70.2     45.3      32.7       25.4
95           62.0     34.4      22.1       15.7

Internal     81.2     63.3      52.8       46.1

The bias in Ederer II will be proportional to the size of the association between relative survival and age. This scenario represents a relatively large association between relative survival and age (compared to what one typically observes for cancer).
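The age pattern in the table of true values can be checked against a proportional effect of age on excess mortality. Under proportional excess hazards, relative survival at a given age equals the reference-age value raised to the power of the hazard ratio between the two ages. The sketch below is an illustration only: it assumes a log hazard ratio of 0.03 per year of age (one reading of the rounded HR = 1.03; the exact simulation parameters are not stated on the slides) and uses age 35 as the reference. All function and variable names are hypothetical.

```python
import math

# True relative survival (%) from the table: age -> (1, 5, 10, 15 years).
TRUE_RS = {
    35: (92.4, 83.8, 77.9, 73.7),
    45: (89.9, 78.8, 71.4, 66.2),
    55: (86.6, 72.5, 63.5, 57.3),
    65: (82.4, 64.8, 54.1, 47.2),
    75: (77.0, 55.7, 43.7, 36.3),
    85: (70.2, 45.3, 32.7, 25.4),
    95: (62.0, 34.4, 22.1, 15.7),
}

LOG_HR_PER_YEAR = 0.03  # ASSUMPTION: HR = exp(0.03), about 1.03 per year of age
REF_AGE = 35            # reference age used for the comparison


def predicted_rs(age, ref_rs_percent):
    """Relative survival (%) at `age` implied by the reference-age value."""
    hazard_ratio = math.exp(LOG_HR_PER_YEAR * (age - REF_AGE))
    return 100 * (ref_rs_percent / 100) ** hazard_ratio


def max_abs_difference():
    """Largest gap (percentage points) between the table and the prediction."""
    reference_row = TRUE_RS[REF_AGE]
    return max(
        abs(predicted_rs(age, ref_value) - table_value)
        for age, row in TRUE_RS.items()
        for ref_value, table_value in zip(reference_row, row)
    )


if __name__ == "__main__":
    for age, row in TRUE_RS.items():
        predicted = [round(predicted_rs(age, r), 1) for r in TRUE_RS[REF_AGE]]
        print(age, row, predicted)
    print("max difference:", round(max_abs_difference(), 2))
```

Under this assumption the predicted values stay within a few tenths of a percentage point of the tabulated ones, which is what rounding of the table entries would produce.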


Results - 5-year survival for all ages

[Figure: age-standardised relative survival at 5 years, y-axis from 61 to 65, comparing Pohar Perme, Ederer II (all ages), Ederer II (standardised), model-based (grouped) and model-based (continuous).]

Pohar Perme: Bias = -0.06, MSE = 0.271, Coverage = 95.6

Ederer II (all ages): Bias = 0.76, MSE = 0.793, Coverage = 65.6

[Figure: the corresponding plot of age-standardised relative survival at 10 years, y-axis from 50 to 56.]

[Figure: the corresponding plot of age-standardised relative survival at 15 years, y-axis from 40 to 60.]

Comments

The Pohar Perme approach provides an internally age-standardised estimate of marginal net survival, and it does it optimally.

Other than Ederer II applied to all ages, the other approaches also provide internally age-standardised estimates of net survival.


Comparing survival between populations

Using the Pohar Perme estimator we can obtain unbiased estimates of net survival for patients diagnosed with breast cancer in Norway and in the UK.

What if we want to compare them?

Age-standardisation is a common approach and can be implemented non-parametrically or using a model.

What if we want to understand reasons for the differences?
– Are the differences consistent for all ages?
– Are the differences consistent across follow-up?

The following graphs are based on a model; one could do something similar using a non-parametric approach, but modelling has advantages.
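Direct age-standardisation, mentioned above, is a weighted average of age-specific net survival using one fixed set of weights for every population being compared. The sketch below is only an illustration: the age groups, weights and survival values are made up, and the function name is hypothetical.

```python
def age_standardise(net_survival_by_age, weights):
    """Weighted average of age-specific net survival (direct standardisation)."""
    if len(net_survival_by_age) != len(weights):
        raise ValueError("need one weight per age group")
    total_weight = sum(weights)
    return sum(w * s for w, s in zip(weights, net_survival_by_age)) / total_weight


if __name__ == "__main__":
    # HYPOTHETICAL weights for three age groups and two populations.
    standard_weights = [0.2, 0.3, 0.5]
    population_a = [0.90, 0.80, 0.60]
    population_b = [0.88, 0.75, 0.50]
    print(age_standardise(population_a, standard_weights))
    print(age_standardise(population_b, standard_weights))
```

Because both populations share the same weights, a difference between the two standardised values cannot be due to differing age distributions. A single summary does not, however, show whether the difference is the same at every age.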


Relative Survival for England and Norway [3]

[Figure: relative survival (y-axis, 0.4 to 1.0) against years from diagnosis (x-axis, 0 to 8), with one panel for each of ages 35, 45, 55, 65, 75 and 85.]

Excess Mortality Rate Ratios (England/Norway)

[Figure: excess mortality rate ratio (y-axis, 1 to 3) against years from diagnosis (x-axis, 0 to 8), with one panel for each of ages 35, 45, 55, 65, 75 and 85.]

The model

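H(t) = H*(t) + Λ(t)

where H(t) is the all-cause cumulative hazard, H*(t) the expected cumulative hazard and Λ(t) the cumulative excess hazard.

Model on the ln[Λ(t)] scale, which includes terms for

– Baseline hazard (time): splines (6 parameters)
– Country: 1 dummy covariate
– Age: splines (4 parameters)
– Age×Country: 4 parameters
– Country×Time: splines (3 parameters)
– Age×Time: 4×3 = 12 parameters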

Results extremely robust to number and locations of the knots.
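Because the cumulative hazards add, the corresponding survival functions multiply: all-cause survival equals expected survival times relative survival, with relative survival given by exp(−Λ(t)). The sketch below illustrates that identity and tallies the parameters listed for the model. The function names and numerical inputs are hypothetical, and the tally counts only the terms listed on the slide (any intercept is not included).

```python
import math


def relative_survival(cum_excess_hazard):
    """Relative survival R(t) = exp(-Lambda(t))."""
    return math.exp(-cum_excess_hazard)


def all_cause_survival(cum_expected_hazard, cum_excess_hazard):
    """All-cause survival S(t) = exp(-(H*(t) + Lambda(t)))."""
    return math.exp(-(cum_expected_hazard + cum_excess_hazard))


# Number of parameters per model term, as listed on the slide.
MODEL_TERMS = {
    "baseline (time) splines": 6,
    "country": 1,
    "age splines": 4,
    "age x country": 4,
    "country x time": 3,
    "age x time": 4 * 3,
}
N_PARAMETERS = sum(MODEL_TERMS.values())

if __name__ == "__main__":
    # ILLUSTRATIVE cumulative hazards only, not estimates from the talk.
    expected, excess = 0.10, 0.25
    print(all_cause_survival(expected, excess))
    print(math.exp(-expected) * relative_survival(excess))  # same value
    print(N_PARAMETERS)
```

The interaction terms account for most of the parameters, which is one reason the choice of interactions deserves careful thought.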


Localised colon carcinoma in Finland 1985–1994

In a clinical setting, interest is often in prediction for patients with specific characteristics (e.g., of a particular age).

Using the Pohar Perme approach, the estimated 5-year net survival for 60-year-old males is 0.94.

Some researchers would argue that this estimate is preferred over all other approaches for estimating net survival since the other approaches are known to be biased.

Let’s look (on the next slide) at the Pohar Perme and model-based estimates for a range of ages.

As an aside, the Ederer II estimates are identical to the Pohar Perme estimates, but comparing non-parametric estimators is not the focus of this talk.


Non-parametric and model-based estimates of 5-year net survival

Age    PP      Model
60     0.94    0.85
61     0.61    0.84
62     0.77    0.84
63     0.70    0.83
64     0.73    0.82
65     0.78    0.82
66     0.87    0.81
67     0.83    0.80
68     0.86    0.80
69     0.77    0.79


Modelling assumptions give lower SEs

Age    PP (SE)        Model (SE)
60     0.94 (0.07)    0.85 (0.02)
61     0.61 (0.09)    0.84 (0.02)
62     0.77 (0.09)    0.84 (0.02)
63     0.70 (0.09)    0.83 (0.02)
64     0.73 (0.08)    0.82 (0.02)
65     0.78 (0.09)    0.82 (0.02)
66     0.87 (0.07)    0.81 (0.02)
67     0.83 (0.07)    0.80 (0.02)
68     0.86 (0.09)    0.80 (0.02)
69     0.77 (0.09)    0.79 (0.02)


Non-parametric and model-based estimates

I prefer the model-based estimate. The survival of 60-year-olds should not be markedly different from that of 61-year-olds.

Proponents of a non-parametric approach argue that it is free of assumptions and therefore preferable.

I believe the assumptions made in the model-based approach are appropriate, and incorporating them into the analysis leads to more appropriate estimates of survival for each age.

By introducing an assumption, we are reducing variance by ‘borrowing strength’ from the surrounding ages.

Could apply the non-parametric approach to a broader age group (e.g., patients aged 60–69), but doesn’t this imply adding an assumption on the interpretation (even if we don’t make an assumption in the estimation)?
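The idea of ‘borrowing strength’ can be illustrated with the tabulated Pohar Perme estimates alone. The sketch below fits an inverse-variance weighted straight line in age to the ten age-specific estimates. This is a stand-in chosen for simplicity, not the model behind the Model column. It also assumes that the age-specific estimates are independent and that their standard errors are known exactly. All names are hypothetical.

```python
import math

AGES = list(range(60, 70))
PP = [0.94, 0.61, 0.77, 0.70, 0.73, 0.78, 0.87, 0.83, 0.86, 0.77]
SE = [0.07, 0.09, 0.09, 0.09, 0.08, 0.09, 0.07, 0.07, 0.09, 0.09]


def weighted_linear_fit(x, y, se):
    """Inverse-variance weighted least squares of y on x.

    Returns fitted values and their standard errors, ASSUMING independent
    estimates whose standard errors are known.
    """
    w = [1 / s ** 2 for s in se]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    fitted = [ybar + slope * (xi - xbar) for xi in x]
    fitted_se = [math.sqrt(1 / sw + (xi - xbar) ** 2 / sxx) for xi in x]
    return fitted, fitted_se


if __name__ == "__main__":
    fitted, fitted_se = weighted_linear_fit(AGES, PP, SE)
    for age, pp, se, f, fse in zip(AGES, PP, SE, fitted, fitted_se):
        print(age, pp, se, round(f, 3), round(fse, 3))
```

Each fitted value draws on all ten estimates, so its standard error is smaller than that of any single age-specific estimate. The price is the assumption that survival changes linearly with age over this range.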


There is no single correct model

Age    PP      Model1    Model2    Model3
60     0.94    0.85      0.94      0.82
61     0.61    0.84      0.61      0.82
62     0.77    0.84      0.77      0.82
63     0.70    0.83      0.70      0.82
64     0.73    0.82      0.73      0.82
65     0.78    0.82      0.78      0.82
66     0.87    0.81      0.87      0.82
67     0.83    0.80      0.83      0.82
68     0.86    0.80      0.86      0.82
69     0.77    0.79      0.77      0.82

Parallels between models 2 and 3 and analogous non-parametric approaches; same assumptions and identical estimates.

How we specify and interpret covariate effects is often more important, and is relevant independent of approach.


Modelling is a powerful tool and requires skill

The utility of a model-based approach depends on fitting an appropriate model:

– Choice of covariates
– Functional form for metric covariates
– Interactions and how to parameterise them

Decisions on which interactions to include and how to parameterise them (spline-spline interactions require some thought) should often be based on subject matter considerations rather than statistical significance.


A comment from Riccardo Capocaccia

A well-constructed model does not assure to provide fitted survival estimates close to the empirical ones for all combinations of covariate values. Differences between fitted and empirical survival can be due to random variability as well as to ‘true’ effects. For instance, a model applied to European data might not show a sudden increase of survival for colon cancer in a single country given by the introduction of mass screening.

I agree. But I see this as fundamental to the role of statistical modelling and the skills required to perform it.


Goal in modelling

Examining the empirical estimates for a large number of covariate patterns does not provide a good basis for scientific inference.

Our goal in modelling is to fit a model that is sufficiently simple that it provides a basis for scientific inference, while at the same time being sufficiently complex that it does not obscure important effects or otherwise produce misleading results.


References

[1] Dickman PW, Lambert PC, Coviello E, Rutherford MJ. Estimating net survival in population-based cancer studies. Int J Cancer 2013;133:519–21.

[2] Roche L, Danieli C, Belot A, Grosclaude P, Bouvier AM, Velten M, et al. Cancer net survival on registry data: Use of the new unbiased Pohar-Perme estimator and magnitude of the bias with the classical methods. Int J Cancer 2012;132:2359–69.

[3] Lambert PC, Holmberg L, Sandin F, Bray F, Linklater KM, Purushotham A, et al. Quantifying differences in breast cancer survival between England and Norway. Cancer Epidemiology 2011;35:526–33.

