Model-based vs. non-parametric estimators of
net survival
Paul W Dickman¹, Paul C Lambert¹,²
¹ Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
² Department of Health Sciences, University of Leicester, UK
EPAAC WP9 Satellite Meeting: State of Art of Methods for the Analysis of Population-Based Cancer Data
23 January 2014
Which approach should I use to estimate net
survival?
A common question in our teaching.
Unless there is reason to prefer a cause-specific approach, we recommend one of the following:
- Ederer II
- Pohar Perme
- Model-based
The choice of method depends on the research question and practical considerations.
We were recently critical [1] of a paper [2] that advocated the Pohar Perme approach; our criticism was of that particular paper and not of the Pohar Perme approach per se, of which we are great admirers.
Paul Dickman Model-based vs. non-parametric 23 January 2014 2
We are not suggesting Ederer II is superior
We are not suggesting that the Ederer II approach is superior to the Pohar Perme approach.
However, we argue that it is not as inferior as some others (e.g., Roche et al [2]) would have us believe.
We do not agree with Roche et al [2] that “In estimating net survival, cancer registries should abandon all classical methods and adopt the new Pohar-Perme estimator” because “great errors may occur ...”.
Internally standardised, or age-specific, Ederer II estimates of 5-year and 10-year net survival are biased, although the bias is generally so small that it makes no practical difference.
In short, don’t panic if you have used, or are using, Ederer II.
All methods require assumptions
1. Conditional independence between cancer and non-cancer mortality. That is, there are no factors associated with both cancer and non-cancer mortality other than those factors that have been controlled for in the estimation (e.g., via stratification, regression modelling or appropriate weighting).
2. The estimates of expected mortality represent the mortality that would have been experienced by the cancer patients if they were not diagnosed with cancer.
3. Administrative censoring is non-informative, or an appropriate adjustment is applied.
If interest is in a summary of net survival for all
patients with a particular cancer
The Pohar Perme estimator was designed specifically for this type of application.
A model-based approach can also give an estimate with minimal bias, but why bother?
Simulation Study
Expected survival from the UK general population.
Assume a linear association (HR=1.03) between age and net mortality, with the effect constant throughout follow-up. In another scenario (not shown here), the effect of age was restricted to the early follow-up years.
We considered only Danieli’s mechanism 1 and not informative administrative censoring (Danieli’s mechanism 2). This is because the second mechanism will affect all estimators, and we are primarily interested in differences between the estimators due to mechanism 1.
500 data sets simulated with 15,000 patients in each.
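The results slides summarise each estimator by bias, MSE and coverage across the simulated data sets. A minimal sketch of how such summaries are conventionally computed is given here; the function name, the Wald-type interval on the estimate scale and the 1.96 multiplier are assumptions for illustration, and the talk does not state how its intervals were constructed.

```python
def simulation_performance(estimates, std_errors, truth, z=1.96):
    """Summarise one estimator across simulated data sets.

    estimates  : one point estimate per simulated data set
    std_errors : the matching standard errors
    truth      : the true value being estimated
    z          : normal quantile for the interval (assumed 95% Wald interval)
    """
    n = len(estimates)
    errors = [est - truth for est in estimates]

    # Bias: average difference between estimate and truth.
    bias = sum(errors) / n

    # MSE: average squared difference (variance plus squared bias).
    mse = sum(e * e for e in errors) / n

    # Coverage: percentage of intervals est +/- z*se that contain the truth.
    covered = sum(
        1 for est, se in zip(estimates, std_errors) if abs(est - truth) <= z * se
    )
    coverage = 100.0 * covered / n

    return {"bias": bias, "mse": mse, "coverage": coverage}
```

With 500 simulated data sets, `estimates` and `std_errors` would each hold 500 values for a given estimator and follow-up time.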
Simulation Study - True values of relative survival
Age        1 year   5 years   10 years   15 years
35         92.4     83.8      77.9       73.7
45         89.9     78.8      71.4       66.2
55         86.6     72.5      63.5       57.3
65         82.4     64.8      54.1       47.2
75         77.0     55.7      43.7       36.3
85         70.2     45.3      32.7       25.4
95         62.0     34.4      22.1       15.7
Internal   81.2     63.3      52.8       46.1
The bias in Ederer II will be proportional to the size of the association between relative survival and age. This scenario represents a relatively large association between relative survival and age (compared to what one typically observes for cancer).
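A constant hazard ratio per year of age on the excess mortality scale implies that relative survival at one age is a power of relative survival at a reference age. The sketch below illustrates that relationship; the function name is hypothetical, and reading HR=1.03 as "per year of age" is an assumption. Because the tabulated values and the stated HR are rounded, the table is reproduced only approximately.

```python
def net_survival_at_age(ref_survival, age, ref_age, hr_per_year=1.03):
    """Relative survival at `age` under proportional excess hazards.

    If the excess hazard at `age` is hr_per_year ** (age - ref_age) times
    the excess hazard at `ref_age` throughout follow-up, then
        R(t | age) = R(t | ref_age) ** (hr_per_year ** (age - ref_age)).
    """
    hazard_ratio = hr_per_year ** (age - ref_age)
    return ref_survival ** hazard_ratio


# 5-year relative survival at age 45, starting from the tabulated 83.8% at age 35.
# This gives roughly 0.789, close to the tabulated 78.8%.
print(round(net_survival_at_age(0.838, 45, 35), 3))
```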
Results - 5-year survival for all ages
[Figure: age-standardised relative survival at 5 years, by estimator; vertical axis 61 to 65]
Estimator                  Bias    MSE     Coverage (%)
Pohar Perme                -0.06   0.271   95.6
Ederer 2 (all age)          0.76   0.793   65.6
Ederer 2 (standardized)     0.11   0.248   94.6
Model based (grouped)       0.57   0.536   79.2
Model based (continuous)    0.03   0.219   94.4
Results - 10-year survival for all ages
[Figure: age-standardised relative survival at 10 years, by estimator; vertical axis 50 to 56]
Estimator                  Bias    MSE     Coverage (%)
Pohar Perme                -0.07   0.869   95.0
Ederer 2 (all age)          1.57   2.772   20.2
Ederer 2 (standardized)     0.22   0.452   94.2
Model based (grouped)       0.88   1.122   71.0
Model based (continuous)    0.09   0.440   96.4
Results - 15-year survival for all ages
[Figure: age-standardised relative survival at 15 years, by estimator; vertical axis 40 to 60]
Estimator                  Bias    MSE     Coverage (%)
Pohar Perme                -0.27   4.392   93.6
Ederer 2 (all age)          2.28   5.599    5.8
Ederer 2 (standardized)     0.21   0.896   93.6
Model based (grouped)       1.04   1.685   73.6
Model based (continuous)    0.07   0.807   95.2
Comments
The Pohar Perme approach provides an internally age-standardised estimate of marginal net survival, and it does it optimally.
Other than Ederer II applied to all ages, the other approaches also provide internally age-standardised estimates of net survival.
Comparing survival between populations
Using the Pohar Perme estimator we can obtain unbiased estimates of net survival for patients diagnosed with breast cancer in Norway and in the UK.
What if we want to compare them?
Age-standardisation is a common approach and can be implemented non-parametrically or using a model.
What if we want to understand reasons for the differences?
- Are the differences consistent for all ages?
- Are the differences consistent across follow-up?
The following graphs are based on a model; similar graphs could be produced using a non-parametric approach, but modelling has advantages.
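As a reminder of what direct age-standardisation does, the sketch below computes a weighted average of age-specific net survival estimates. The function name and the example weights are hypothetical; in a comparison between two populations the same set of weights would be applied to both, so that differences in age distribution do not drive the comparison.

```python
def age_standardise(survival_by_group, weights):
    """Directly age-standardised survival: a weighted average of the
    age-specific estimates, with weights normalised to sum to one."""
    if len(survival_by_group) != len(weights):
        raise ValueError("need one weight per age group")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(s * w for s, w in zip(survival_by_group, weights)) / total


# Hypothetical example: two age groups, the older group carrying three
# times the weight of the younger one.
standardised = age_standardise([0.8, 0.6], [1, 3])
print(round(standardised, 2))
```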
Relative Survival for England and Norway [3]
[Figure: relative survival (vertical axis, 0.4 to 1.0) against years from diagnosis (0 to 8); one panel for each of ages 35, 45, 55, 65, 75 and 85]
Excess Mortality Rate Ratios (England/Norway)
[Figure: excess mortality rate ratio (vertical axis, 1 to 3) against years from diagnosis (0 to 8); one panel for each of ages 35, 45, 55, 65, 75 and 85]
The model
H(t) = H*(t) + Λ(t)
(all-cause cumulative hazard = expected cumulative hazard + cumulative excess hazard)
Model on the ln[Λ(t)] scale, which includes terms for:
- Baseline hazard (time): splines (6 parameters)
- Country: 1 dummy covariate
- Age: splines (4 parameters)
- Age×Country: 4 parameters
- Country×Time: splines (3 parameters)
- Age×Time: 4×3 = 12 parameters
Results are extremely robust to the number and locations of the knots.
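Because the model is fitted on the ln[Λ(t)] scale, a fitted linear predictor converts to relative survival by exponentiating twice, and the additive decomposition of the cumulative hazard becomes a product on the survival scale. A minimal sketch follows; the function and argument names are hypothetical, and the spline terms themselves are not reproduced here.

```python
import math


def relative_survival(log_cum_excess_hazard):
    """R(t) = exp(-Lambda(t)), where the model supplies ln[Lambda(t)]."""
    return math.exp(-math.exp(log_cum_excess_hazard))


def all_cause_survival(expected_survival, log_cum_excess_hazard):
    """S(t) = exp(-H(t)) = exp(-H*(t)) * exp(-Lambda(t)) = S*(t) * R(t)."""
    return expected_survival * relative_survival(log_cum_excess_hazard)


# Illustrative values only: a cumulative excess hazard of 0.5 and an
# expected (general population) survival of 0.9.
eta = math.log(0.5)
print(relative_survival(eta), all_cause_survival(0.9, eta))
```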
Localised colon carcinoma in Finland 1985–1994
In a clinical setting, interest is often in prediction for patients with specific characteristics (e.g., of a particular age).
Using the Pohar Perme approach, the estimated 5-year net survival for 60-year-old males is 0.94.
Some researchers would argue that this estimate is preferable to those from all other approaches to estimating net survival, since the other approaches are known to be biased.
Let’s look (on the next slide) at the Pohar Perme and model-based estimates for a range of ages.
As an aside, the Ederer II estimates are identical to the Pohar Perme estimates, but comparing non-parametric estimators is not the focus of this talk.
Non-parametric and model-based estimates
of 5-year net survival
Age   PP     Model
60    0.94   0.85
61    0.61   0.84
62    0.77   0.84
63    0.70   0.83
64    0.73   0.82
65    0.78   0.82
66    0.87   0.81
67    0.83   0.80
68    0.86   0.80
69    0.77   0.79
Modelling assumptions give lower SEs
Age   PP (SE)       Model (SE)
60    0.94 (0.07)   0.85 (0.02)
61    0.61 (0.09)   0.84 (0.02)
62    0.77 (0.09)   0.84 (0.02)
63    0.70 (0.09)   0.83 (0.02)
64    0.73 (0.08)   0.82 (0.02)
65    0.78 (0.09)   0.82 (0.02)
66    0.87 (0.07)   0.81 (0.02)
67    0.83 (0.07)   0.80 (0.02)
68    0.86 (0.09)   0.80 (0.02)
69    0.77 (0.09)   0.79 (0.02)
Non-parametric and model-based estimates
I prefer the model-based estimate. The survival of 60-year-olds should not be markedly different from that of 61-year-olds.
Proponents of a non-parametric approach argue that it is free of assumptions and therefore preferable.
I believe the assumptions made in the model-based approach are appropriate, and incorporating them into the analysis leads to more appropriate estimates of survival for each age.
By introducing an assumption, we are reducing variance by ‘borrowing strength’ from the surrounding ages.
We could apply the non-parametric approach to a broader age group (e.g., patients aged 60–69), but doesn’t this imply adding an assumption on the interpretation (even if we don’t make an assumption in the estimation)?
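To make ‘borrowing strength’ concrete, the sketch below smooths the ten single-year Pohar Perme estimates with an inverse-variance weighted straight line in age. This is an illustration only and is not the model used in the talk, so it will not reproduce the model-based column; it assumes that the straight-line form holds and that the reported standard errors can be treated as known. The point is simply that each fitted value draws on all ten ages, so its standard error is smaller than that of the single-age estimate.

```python
from math import sqrt

ages = list(range(60, 70))
pp = [0.94, 0.61, 0.77, 0.70, 0.73, 0.78, 0.87, 0.83, 0.86, 0.77]
pp_se = [0.07, 0.09, 0.09, 0.09, 0.08, 0.09, 0.07, 0.07, 0.09, 0.09]


def weighted_linear_fit(x, y, se):
    """Inverse-variance weighted least squares for y = a + b*x.

    Returns the fitted values and their standard errors, treating the
    supplied standard errors as known (an assumption of this sketch).
    """
    w = [1.0 / s ** 2 for s in se]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    fitted = [ybar + slope * (xi - xbar) for xi in x]
    # Variance of a fitted value: 1/sum(w) + (x - xbar)^2 / sxx.
    fitted_se = [sqrt(1.0 / sw + (xi - xbar) ** 2 / sxx) for xi in x]
    return fitted, fitted_se


fitted, fitted_se = weighted_linear_fit(ages, pp, pp_se)
for age, est, se, fit, fse in zip(ages, pp, pp_se, fitted, fitted_se):
    print(f"{age}  PP {est:.2f} ({se:.2f})  smoothed {fit:.2f} ({fse:.2f})")
```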
There is no single correct model
Age   PP     Model1   Model2   Model3
60    0.94   0.85     0.94     0.82
61    0.61   0.84     0.61     0.82
62    0.77   0.84     0.77     0.82
63    0.70   0.83     0.70     0.82
64    0.73   0.82     0.73     0.82
65    0.78   0.82     0.78     0.82
66    0.87   0.81     0.87     0.82
67    0.83   0.80     0.83     0.82
68    0.86   0.80     0.86     0.82
69    0.77   0.79     0.77     0.82
There are parallels between Models 2 and 3 and the analogous non-parametric approaches: same assumptions and identical estimates.
How we specify and interpret covariate effects is often more important, and is relevant independent of approach.
Modelling is a powerful tool and requires skill
The utility of a model-based approach depends on fitting an appropriate model:
- Choice of covariates
- Functional form for metric covariates
- Interactions and how to parameterise them
Decisions on which interactions to include and how to parameterise them (spline-spline interactions require some thought) should often be based on subject matter considerations rather than statistical significance.
A comment from Riccardo Capocaccia
A well-constructed model does not assure to provide fitted survival estimates close to the empirical ones for all combinations of covariate values. Differences between fitted and empirical survival can be due to random variability as well as to ‘true’ effects. For instance, a model applied to European data might not show a sudden increase of survival for colon cancer in a single country given by the introduction of mass screening.
I agree. But I see this as fundamental to the role of statistical modelling and the skills required to perform it.
Goal in modelling
Examining the empirical estimates for a large number of covariate patterns does not provide a good basis for scientific inference.
Our goal in modelling is to fit a model that is sufficiently simple that it provides a basis for scientific inference, while at the same time being sufficiently complex that it does not obscure important effects or otherwise produce misleading results.
References
[1] Dickman PW, Lambert PC, Coviello E, Rutherford MJ. Estimating net survival in population-based cancer studies. Int J Cancer 2013;133:519–21.
[2] Roche L, Danieli C, Belot A, Grosclaude P, Bouvier AM, Velten M, et al. Cancer net survival on registry data: Use of the new unbiased Pohar-Perme estimator and magnitude of the bias with the classical methods. Int J Cancer 2012;132:2359–69.
[3] Lambert PC, Holmberg L, Sandin F, Bray F, Linklater KM, Purushotham A, et al. Quantifying differences in breast cancer survival between England and Norway. Cancer Epidemiology 2011;35:526–33.