7/28/2019 73.Full
ORIGINAL ARTICLE
Differences in endpoints between the Swedish W-E (two county) trial of mammographic screening and the Swedish overview: methodological consequences
L Holmberg, S W Duffy, A M F Yen, L Tabár, B Vitak, L Nyström and J Frisell
J Med Screen 2009;16:73-80. DOI: 10.1258/jms.2009.008103
See end of article for authors' affiliations.
Correspondence to: Lars Holmberg, Research Oncology, 3rd floor Bermondsey Wing, Guy's Hospital, Division of Cancer Studies, Guy's Campus, King's College London, SE1 9RT London, UK; [email protected]
Accepted for publication 11 March 2009.
Objectives To characterize and quantify the differences in the number of cases and breast cancer deaths in the Swedish W-E Trial compared with the Swedish Overview Committee (OVC) summaries and to study methodological issues related to trials in secondary prevention.
Setting The study population of the W-E Trial of mammography screening was included in the first (W and E county) and the second (E-county) OVC summary of all Swedish randomized mammography screening trials. The OVC and the W-E Trial used different criteria for case definition and causes of death determination.
Method A Review Committee compared the original data files from W and E county and the first and second OVC. The reason for a discrepancy was determined individually for all non-concordant cases or breast cancer deaths.
Results Of the 2615 cases included by the W-E Trial or the OVC, there were 478 (18%) disagreements. Of the disagreements 82% were due to inclusion/exclusion criteria, and 18% to disagreement with respect to cause of death or vital status at ascertainment. For E-County, the OVC inclusion rules and register-based determination of cause of death (second OVC) rather than individual case review (W-E Trial and 1st OVC) resulted in a reduction of the estimate of the effect of screening, but for W-County the difference between the original trial and the OVC was modest.
Conclusions The conclusion that invitation to mammography screening reduces breast cancer mortality remains robust. Disagreements were mainly due to study design issues, while disagreements about cause of death were a minority. When secondary research does not adhere to the protocols of the primary research projects, the consequences of such design differences should be investigated and reported. Register linkage of trials can add follow-up information. The precision of trials with modest size is enhanced by individual monitoring of case status and outcome status such as determination of cause of death.
INTRODUCTION
Mammographic screening reduces mortality from
breast cancer in both randomized trials1,2 and in
routine service screening.3,4 The Swedish W-E
Trial was the first randomized trial to demonstrate a
reduction in breast cancer mortality from screening with mammography alone,5 showing a 31% reduction in breast
cancer mortality with invitation to screening. This reduction
has remained consistent over long-term follow-up.6
In 1987 the Swedish Cancer Society set up an Overview
Committee (OVC) to review all the randomized mammo-
graphy trials in Sweden, the W-E Trial being one of them.
The OVC performed two overviews (hereafter called the 1st
and the 2nd OVC) by collecting data from all four Swedish
mammography trials in a uniform way. However, between
the 1st and 2nd OVC there was a difference in the methods
of determining cause of death, using an endpoint committee
in the 1st OVC and registries only in the 2nd.
Concern was expressed about differences between results
reported for the W-E trial by the original trialists and those
reported by the Swedish Overview, particularly with
respect to numbers of breast cancer deaths.7 It has been
pointed out that such differences are an inevitable conse-
quence of the different case definition and determination
of cause of death, and the different eligibility criteria of the Swedish Overview1,8-10 as compared with the W-E Trial.
These differences, however, raise both particular and general
methodological issues related to follow-up of large trials or
sets of trials in secondary prevention. These questions
include:
(1) What is the magnitude of these differences at an individual
rather than aggregate level, and what proportions
of the differences are due to inclusion/exclusion criteria
of cases in the Swedish Overview and to cause
of death determination?
www.jmedscreen.com Journal of Medical Screening 2009 Volume 16 Number 2
(2) What are the reasons for individual differences
between the original study and the overview with
respect to breast cancer case definition/inclusion and cause of death?
(3) What are the implications of these kinds of differences
for endpoint definition in future studies in primary or
secondary prevention?
In this paper, we report on a complete audit of breast cancer cases and deaths in the Swedish W-E Trial, as defined by
the original trial investigators and by the Swedish Overview.
We report the numbers of disagreements at individual level
and the reasons for these. We discuss their implications for
interpretation of the overview and the original trial (up to
the end of 1993 for W-county and to the end of 1996 for
E-county) and for design and follow-up of future secondary
prevention trials.
BACKGROUND
The Swedish W-E Trial was initiated in 1977 in Kopparberg
county (now Dalarna, referred to as W-county hereafter) and in 1978 in Östergötland county (E-county). Small geographical
clusters were randomized to invitation to screening
(Active Study Population, ASP) or no invitation (Passive
Study Population, PSP) within 7 strata in W-county and
within 12 strata in E-county. The strata were chosen so
that clusters within strata were socioeconomically homo-
geneous. In W-county, randomization was approximately
in the ratio 2:1 for ASP:PSP. In E-county, roughly equal
numbers were randomized to the two groups. Entry of
strata to the trial was staggered to allow the mammography
facilities to cope with the workload. Year of birth cohorts
were included to give an approximate age range of 40-74.
For example, for a stratum whose randomization date was 1977, years of birth 1903-1937 were included. In total,
77,080 women were randomized to the ASP and 55,985 to
the PSP. Details of the age and county breakdown of the
study population are given elsewhere.11,12
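The year-of-birth rule described above can be contrasted with an exact-date rule of the kind the overview applied. The following is a minimal sketch, not the software used by the trialists or the OVC: the function names and the example dates are hypothetical, and the 40-74 limits are taken from the approximate age range stated in the text.

```python
from datetime import date

AGE_MIN, AGE_MAX = 40, 74  # approximate age range stated in the text


def eligible_by_birth_year(birth: date, randomized: date) -> bool:
    """Trial-style rule: age is randomization year minus year of birth."""
    age = randomized.year - birth.year
    return AGE_MIN <= age <= AGE_MAX


def eligible_by_exact_age(birth: date, randomized: date) -> bool:
    """Overview-style rule: completed years of age on the randomization date."""
    had_birthday = (randomized.month, randomized.day) >= (birth.month, birth.day)
    age = randomized.year - birth.year - (0 if had_birthday else 1)
    return AGE_MIN <= age <= AGE_MAX


# Hypothetical woman born late in 1937 in a stratum randomized in mid-1977.
birth, randomized = date(1937, 11, 2), date(1977, 6, 1)
by_year = eligible_by_birth_year(birth, randomized)  # 1977 - 1937 = 40
by_exact = eligible_by_exact_age(birth, randomized)  # still 39 on that date
```

Under these assumptions the same woman is inside the study base by one rule and outside it by the other, which is the mechanism behind disagreements at both ends of the age range.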
The screening regime was single-view mammography, on
average every 24 months in women aged 40-49 and every
33 months in women aged 50-74. At the end of 1984 a significant
31% reduction in breast cancer mortality was
observed in the ASP.5 The PSP was then invited to screening.
The trial was closed immediately on completion of the first
round of screening in the PSP, and all cases in both arms
diagnosed up to and including the end of the first screen
of the PSP were followed up for death from breast cancer.
In W-county, according to the local trial records, there were 694 breast cancer cases in the ASP and 359 in the
PSP. In E-county, there were 732 breast cancer cases diag-
nosed in the ASP and 683 in the PSP.6 These cases included
both in situ and invasive cancers diagnosed during the trial
period.
The OVC defined breast cancer cases as women reported
with an invasive breast cancer only (excluding women
with cancer in situ) to the Swedish National Cancer
Registry (NCR) during each trial's recruitment period using
the reporting date in the NCR as the date of diagnosis;
women with an invasive breast cancer reported to the
NCR before the trial start were excluded from the study
base, although women diagnosed before 1958, when the
NCR was established, could not be excluded. The OVC also
accepted a woman as a breast cancer case in the study
when there was only a breast cancer death registered in
the Swedish Causes of Death Register (CDR), even if they
were not registered at the NCR. Thus, the diagnosis could
have occurred before trial start, during the trial period or
after the trial ended. Deaths from breast cancer were
retrieved from the CDR to include all deaths in women with breast cancer as the underlying cause, according to
the death certificate. Inclusion criteria in all analyses in the
overview were based on the exact age at randomization, as
opposed to the Two-County trial, where inclusion was
determined on the basis of year of birth. The OVC retrieved
the original randomization file, based on the population regi-
ster, from the IT co-ordinator responsible for data manage-
ment in each of the counties. The files were linked by
each woman's unique National Registration Number to the
corresponding Regional Tumour Registry which provides
the data for the NCR to obtain verification and date of diag-
nosis and to the CDR to obtain date and cause of death.
Importantly, the 1st OVC included specialists who, independently
of the W-E Trial, determined the cause of death of
the breast cancer patients based on case records. The publications
of the 1st OVC gave a relative risk estimate similar to
that of the W-E Trial's local committee.2 In the 2nd OVC the
decision was made to use the Swedish National Cancer and
Death Registries (NCR and CDR) to determine cause of
death instead of using the specialist committee, because
the combined relative risk using the register data was
similar to that of the 1st OVC.13
The 1st OVC conducted a computerized follow-up of both
the W-County and E-County data to 31 December 1993, and
the 2nd OVC continued data collection for the E-County
only until 31 December 1996.2 The computerized follow-up
ended 31 December 1993 for the first evaluation round (which was the last time the W-data were included) and
31 December 1996 for the second evaluation round
(which was the last time the E-data were included).
The four particularly important differences between the
W-E study design and the 1st & 2nd OVCs' criteria were:
(1) The original trial defined inclusion and exclusion of
women to the trial by year of birth and residence in
the relevant geographical areas at the time of ran-
domization. The 1st & 2nd OVC defined the population
by year, month and day of birth.
(2) The end-point committees of the W-E trial and the 1st
OVC determined individual patient outcome by
reviewing all clinical records as identified in the original trial data and in NCR and CDR data. The 2nd
OVC used NCR and CDR data only.
(3) The OVC included women as cases if the CDR reported
a breast cancer death even if there was no report of a
breast cancer diagnosis in the NCR. The W-E Trial
included only those women who had a microscopically
confirmed breast cancer diagnosed during the trial
period.
(4) The W-E Trial included all breast cancer cases (in situ
and invasive), whereas the 1st and 2nd OVC both considered
only invasive breast cancer cases reported to
the NCR, excluding all women reported to the
NCR as having carcinoma in situ, and could not
include cases that, through clerical error, had not been
reported from the clinics to the NCR.
METHODS
In 2006 the Swedish Cancer Society set up a Joint Review
Committee (JRC) including members of the 1st & 2nd OVCs and the project leaders of the W-E trial to investigate
the sources of disagreement between the results published
by the trialists and the OVC (the 1st OVC for W county
and the 2nd OVC for E county data). The lists of women
with breast cancer according to the trialists and the OVC
were compared. Where necessary, clinical records were
retrieved. After investigating each case in the two lists inde-
pendently by the trialists and the OVC, a classification
scheme of the differences was developed by the JRC
(Table 1). The records of breast cancer cases and deaths
according to the local endpoint committee were compared
with those of the OVC using the Swedish National
Registration Numbers of the subjects for linkage. The
deaths through 1993 were compared for W-county and
through 1996 for E-county, as these dates were respecti-
vely the most recent Swedish Overview analyses to
include each county.8,14 The JRC reviewed each disagree-
ment between the two datasets with respect to either case
definition or cause of death. The JRC determined the
reasons for each individual disagreement. As a final result,
the trialists also accepted some women as additional
cancer cases in their trials depending on new information
about migrated women and clerical errors. The addition of
them to the original datasets is called the JRC conclusive
dataset.
Paired significance tests between OVC and W-E endpoints
were carried out using McNemar methods.15 Associations of
the likelihood of disagreement with age, county and trial arm were assessed using the chi-squared test. Relative risks
and 95% confidence intervals on these were calculated
using Poisson regression.
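As an illustration of the paired comparison, a McNemar test uses only the discordant pairs. The sketch below is an assumption about the form of the test, since the text does not say which variant (with or without continuity correction, or exact) was used; the function name is hypothetical. The counts are the W-county totals in Table 2 for deaths included in both datasets: 20 deaths classed as breast cancer by the trial but as other causes by the 1st OVC, and 4 the other way round.

```python
from math import comb


def mcnemar(b: int, c: int):
    """Return (chi-squared statistic, exact two-sided p) for discordant counts b and c."""
    n = b + c
    chi2 = (b - c) ** 2 / n  # uncorrected statistic, 1 degree of freedom
    k = min(b, c)
    # Exact test: under the null, each discordant pair falls either way with p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_exact = min(1.0, 2 * tail)
    return chi2, p_exact


# b: trial says breast cancer death, overview says other cause; c: the reverse.
chi2, p_exact = mcnemar(20, 4)
```

Concordant pairs do not enter the statistic, which is why the test is suited to asking whether one source systematically classifies fewer deaths as breast cancer than the other.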
RESULTS
According to the W-E trial records the total numbers of
women for W-county were 38,589 in the ASP and 18,582
in the PSP; for E-county the numbers were 38,481 and
37,403. The corresponding figures in the OVC records were
38,562, 18,478, 38,405 and 37,145. These differences of
the order of less than 1% were not influential in the esti-
mation of the primary results.
W-county
Table 2 shows all cases included in either the local trial
records for W-county or the 1st OVC records or both, with
Table 1 Classification of potential differences between the W-E trial and the overview

Category A. Type of disagreement: Difference in the definition of age (accounts for differences at both ends of the age spectrum)
  W-E trial: Age calculation was based on the year of randomization and the year of birth of the trial attendee.
  Overview: Age calculation was based on exact date of birth and randomization (day/month/year).

Category B. Type of disagreement: Definition of date of diagnosis (accounts for differences at trial start and at trial end)
  W-E trial: Date of operation. Women who had a screening or clinical diagnosis at the end of the trial, before closing date, but operated after the closing date, have been included.
  Overview: Date of notification to the NCR, which, according to registry principles, is the first notification to the NCR of a cancer (often the date of a positive cytology).

Category C. Type of disagreement: Difference in the principles of use of causes of death registry
  W-E trial: Included only cases diagnosed within the trial period.
  Overview: Included cases diagnosed within the trial period and breast cancer deaths registered in the CDR even if they were not registered at the NCR (thus the diagnosis could have occurred before trial start, during the trial period or after the trial ended).

Category D. Type of disagreement: Cases not retrievable from the NCR at the time of the overview
  W-E trial: Included all cases including cancer in situ during the trial period when there was clinical information available on a breast cancer, even if a case was not registered at the NCR due to administrative errors, and excluded all cases with a history of breast cancer before the study started.
  Overview: Included only invasive cases retrievable from the NCR or identified at the CDR at the time when the overview was conducted, but could not exclude those cases that were diagnosed before the start of the NCR in 1958.

Category G. Type of disagreement: Differences in the determination of cause of death
  W-E trial: Cause of death was determined by the local trial committee based on data available in patients' medical records.
  Overview: 1st OVC used an independent end-point committee; the 2nd OVC used cause of death data registered in the CDR.

Category I. Type of disagreement: Erroneous inclusion in the W-E database
  Explanation: Clerical error or incorrect national registration number.

Category K. Type of disagreement: Miscellaneous clerical errors and other reasons
  Explanation: Includes misspecification of eligibility or cause of death due to clerical error, erroneous registration in NCR or CDR or administrative loss of information. Also includes individual migration, where a subject received a breast cancer diagnosis outside the study areas, and was therefore in the overview but missing from the W-E database.

NCR, National cancer register; CDR, National cause of death register
the endpoint in each data set cross-tabulated. Of the 1053
cases included in the local trial records, the OVC included
925 cases (88%). Conversely, of the 972 cases included in
the OVC records, the local trial included 925 breast cancer
cases (95%). Of the 443 deaths to 1993 included in both
datasets, there were 24 (5%) disagreements regarding deter-
mination of cause of death (type G disagreement). Of the
total 199 disagreements, whether with respect to case
inclusion or to cause of death, 175 (88%) pertained to
case inclusion rather than cause of death. For both the
ASP and the PSP, the overview was less likely to classify a
death as from breast cancer. The magnitude of this tendency
did not differ significantly between ASP and PSP.
Table 3 shows the reasons for disagreement between the
two breast cancer-case datasets, in the ASP and PSP separ-
ately. The largest group of disagreements in the ASP was
type D, mainly due to women with screen-detected in situ
lesions included in the W-E dataset but not included by
the overview. In the PSP, most of the disagreements were
of type B, relating to date of diagnosis. These disagreements
mostly resulted from women in PSP diagnosed at the first
screen but through delays in reporting, not entered into
the NCR until after closure of the trial. These women were
considered by the OVC only to have been diagnosed at the
reporting date to the register and were thus excluded in
the OVC (see Table 1, category B). Disagreements with
respect to death from breast cancer were mainly due to category G (47%; disagreement about cause of death) and C
(29%; use of cause of death register without reference to
date of diagnosis) in the ASP, and to G (42%; disagreement
about cause of death) and B (32%; definition of date of diag-
nosis) in the PSP.
Table 4 shows the breast cancer deaths and corresponding
relative risks (RR) from the W-arm of the W-E trial, the
OVC, and those derived after review of all information by
the JRC and the resulting conclusive dataset (i.e. the original
trial data plus correction for the clerical errors and cases lost
to the trialists due to migration). The OVC result is more
conservative than that of the original trial and the result
based on the JRC conclusive dataset. All analyses show a
significant mortality reduction in the ASP.
E-county
Table 5 shows cross-tabulation of the local trial endpoint
records for E-county with the 2nd OVC records, for all
women with breast cancer in either or both datasets. Of
the 1415 women with breast cancer included in the local
trial records, the 2nd OVC included 1298 (92%). Of the
1398 cases included in the 2nd OVC records, the local trial
included 1298 (93%). Of the 655 deaths to 1996 included
in both datasets, there were 53 (8%) disagreements. Of
the total 279 disagreements, 217 (78%) pertained to case
inclusion, 53 (19%) to cause of death and 9 (3%) to vital
status at 31 December 1996. For both the ASP and the
PSP, the 2nd OVC was less likely to classify a death as
from breast cancer. This tendency was significantly stronger
in the PSP (59% vs. 52%; P = 0.03).
The reasons for disagreements are shown in Table 6. As with
W-county, the largest group, 40% of the disagreements in the
ASP are of type D, absence of trial cases from the NCR. For
the PSP, however, similar proportions of disagreements were
Table 3 Categorized disagreements between W-county trial records and 1st OVC records

Number (%) of disagreements in cases
Disagreement category    ASP          PSP         Total
A                        14 (12)      3 (4)       17 (8)
B                        5 (4)        52 (65)     57 (29)
C                        13 (11)      2 (2)       15 (7)
D                        51 (43)      12 (15)     63 (32)
G                        16 (13)      8 (10)      24 (12)
I                        0 (0)        0 (0)       0 (0)
K                        20 (17)      3 (4)       23 (12)
Total                    119 (100)    80 (100)    199 (100)

Number (%) of disagreements for breast cancer death
Disagreement category    ASP          PSP         Total
A                        2 (6)        0 (0)       2 (4)
B                        0 (0)        6 (32)      6 (11)
C                        10 (29)      2 (11)      12 (23)
D                        2 (6)        1 (5)       3 (6)
G                        16 (47)      8 (42)      24 (45)
I                        0 (0)        0 (0)       0 (0)
K                        4 (12)       2 (11)      6 (11)
Total                    34 (100)     19 (100)    53 (100)

PSP, passive study population, not invited; ASP, active study population, invited
Table 4 Trial mortality result for W-county from original local trial endpoint, 1st OVC endpoint and the JRC conclusive dataset

                                             ASP       PSP       RR (95% CI)
W original breast cancer deaths              135       110       0.59 (0.45-0.76)
OVC breast cancer deaths                     141       99        0.69 (0.53-0.90)
JRC conclusion breast cancer deaths for W    136       111       0.59 (0.45-0.76)
Number of subjects                           38,589    18,582

PSP, passive study population, not invited; ASP, active study population, invited
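The point estimates in Table 4 can be approximated from the death counts and the numbers of subjects. The sketch below is a simplification, not the authors' Poisson regression: it assumes the number of randomized women can stand in for person-time and uses a Wald interval on the log scale, and the function name is hypothetical. The resulting interval should be read as close to, but not necessarily identical with, the published one.

```python
from math import exp, sqrt


def relative_risk(deaths_asp, n_asp, deaths_psp, n_psp, z=1.96):
    """Rate ratio ASP:PSP with a Wald interval computed on the log scale."""
    rr = (deaths_asp / n_asp) / (deaths_psp / n_psp)
    # For Poisson counts, var(log RR) is approximately 1/a + 1/b.
    se = sqrt(1 / deaths_asp + 1 / deaths_psp)
    return rr, rr * exp(-z * se), rr * exp(z * se)


# Counts from Table 4 (W-county).
rr_trial = relative_risk(135, 38589, 110, 18582)  # original trial endpoint
rr_ovc = relative_risk(141, 38589, 99, 18582)     # 1st OVC endpoint
```

With only a binary covariate for trial arm, a Poisson regression returns the same point estimate as this ratio of rates; any stratification or clustering adjustment in the original analysis is not reproduced here.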
Table 2 W-county outcomes tabulated against overview outcomes (agreements, bold in the original, lie on the diagonal)

                                  W-county outcome
Study group   1st OVC outcome     Incl, BCD   Incl, DOC   Incl, alive   Not incl   Total
PSP           Incl, BCD           95          0           0             4          99
              Incl, DOC           8           50          0             0          58
              Incl, Alive         0           0           138           0          138
              Not incl            7           7           54            0          68
              Total               110         57          192           4          363
ASP           Incl, BCD           121         4           0             16         141
              Incl, DOC           12          153         0             15         180
              Incl, Alive         0           0           344           12         356
              Not incl            2           10          48            0          60
              Total               135         167         392           43         737
Total         Incl, BCD           216         4           0             20         240
              Incl, DOC           20          203         0             15         238
              Incl, Alive         0           0           482           12         494
              Not incl            9           17          102           0          128
              Total               245         224         584           47         1100

Incl, included; BCD, breast cancer death; DOC, death from other causes; PSP, passive study population, not invited; ASP, active study population, invited
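The summary counts quoted in the text for W-county follow directly from the "Total" block of Table 2. The sketch below shows the arithmetic; the function and key names are hypothetical, and the only data used are the cell counts printed in the table.

```python
# Rows: 1st OVC outcome; columns: W-county trial outcome.
# Order in both: included with breast cancer death, included with death from
# other causes, included and alive, not included.
TABLE2_TOTAL = [
    [216,   4,   0, 20],
    [ 20, 203,   0, 15],
    [  0,   0, 482, 12],
    [  9,  17, 102,  0],
]


def summarize(table):
    """Split a square agreement table into agreements and types of disagreement."""
    last = len(table) - 1
    total = sum(sum(row) for row in table)
    agreements = sum(table[i][i] for i in range(len(table)))
    # Inclusion disagreements: cases counted by only one of the two sources.
    inclusion = sum(table[last]) + sum(row[last] for row in table) - table[last][last]
    in_both = sum(table[i][j] for i in range(last) for j in range(last))
    disagreements = total - agreements
    return {
        "total": total,
        "in_both": in_both,
        "disagreements": disagreements,
        "inclusion": inclusion,
        "outcome": disagreements - inclusion,
    }


summary = summarize(TABLE2_TOTAL)
```

The 925 cases common to both sources, the 199 disagreements, and their split into 175 inclusion disagreements and 24 cause-of-death disagreements are all recovered from the sixteen cells.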
observed in categories D (19%), absence of the case from the
NCR, G (22%), disagreement about cause of death, and K
(26%), miscellaneous clerical errors and other reasons. With
respect to breast cancer death, disagreements were dominated
by category G (disagreement about cause of death) and C (use of
the cause of death register without reference to date of diagnosis) in both the ASP and PSP.
Table 7 shows the E-county trial result with respect to
breast cancer mortality using the original trial endpoint,
the 2nd OVC endpoint and the conclusive endpoint after
review of all sources by the JRC (i.e. the original trial data
plus correction for the clerical errors and cases lost to the
trialists due to migration). The trial endpoint and the JRC
conclusive dataset both show a significant 20-23%
reduction in mortality, whereas the 2nd OVC result shows
a non-significant 10% reduction.
Associations with disagreement
We also investigated whether study group (ASP/PSP), county or age were significantly related to the likelihood of
disagreement about breast cancer death. In the 685 cases
classified as breast cancer death by either the W and E
local committees or the OVC or both, there was no significant
association of study group with disagreement (P = 0.2).
There was a higher proportion of disagreement in E-county
than in W-county, but this did not attain statistical significance
(P = 0.09). There was, however, a significant effect of patient
age at the time of randomization on the probability of
disagreement (P < 0.001). In both counties, the disagreement
increased with age (Figure 1).
DISCUSSION
In this study, the Swedish Cancer Society's Joint Review
Committee (JRC) investigated disagreements between the
breast cancer incidence and death data as recorded in the
original Swedish Two-County Trial, based on individual patient records and determination of cause of death by an
expert committee, and that in the 2nd OVC based on the
National Cancer Registry and Cause of Death Register. For
the purposes of this study, we had full access to original
W-E trial data, original data collected for the OVC, individual
medical records, and register data from the regional tumour
registries for the respective counties. The registration of new
diagnosis of breast cancer is mandatory by law in Sweden
Table 7 Trial mortality result for E-county from original local trial endpoint, 2nd OVC endpoint and the JRC conclusive dataset

                                             ASP       PSP       RR (95% CI)
Original E breast cancer deaths              163       200       0.80 (0.64-0.98)
OVC breast cancer deaths                     175       189       0.90 (0.72-1.12)
JRC conclusion breast cancer deaths for E    162       206       0.77 (0.62-0.95)
Number of subjects                           38,309    37,403

PSP, passive study population, not invited; ASP, active study population, invited
Figure 1 Percentage disagreement between W-E and 2nd OVC by age, in 604 cases classed as breast cancer deaths by one or both sources
Table 5 E-county outcomes tabulated against 2nd OVC outcomes (agreements, bold in the original, lie on the diagonal)

                                  E-county outcome
Study group   2nd OVC outcome     Incl, BCD   Incl, DOC   Incl, alive   Not incl   Total
PSP           Incl, BCD           164         4           2             19         189
              Incl, DOC           27          128         3             20         178
              Incl, Alive         0           1           296           10         307
              Not incl            9           8           41            0          58
              Total               200         141         342           49         732
ASP           Incl, BCD           147         8           0             20         175
              Incl, DOC           14          163         2             21         200
              Incl, Alive         0           1           338           10         349
              Not incl            2           13          44            0          59
              Total               163         185         384           51         783
Total         Incl, BCD           311         12          2             39         364
              Incl, DOC           41          291         5             41         378
              Incl, Alive         0           2           634           20         656
              Not incl            11          21          85            0          117
              Total               363         326         726           100        1515

Incl, included; BCD, breast cancer death; DOC, death of other causes; PSP, passive study population, not invited; ASP, active study population, invited
Table 6 Categorized disagreements between E-county trial records and 2nd OVC records

Number (%) of disagreements in cases
Disagreement category    ASP          PSP          Total
A                        21 (16)      17 (12)      38 (14)
B                        0 (0)        18 (12)      18 (6)
C                        15 (11)      10 (7)       25 (9)
D                        54 (40)      28 (19)      83 (30)
G                        22 (16)      31 (22)      52 (19)
I                        3 (2)        3 (2)        6 (2)
K                        20 (15)      37 (26)      57 (20)
Total                    135 (100)    144 (100)    279 (100)

Number (%) of disagreements for breast cancer death
Disagreement category    ASP          PSP          Total
A                        5 (12)       3 (5)        8 (7)
B                        0 (0)        4 (7)        4 (4)
C                        15 (34)      10 (16)      25 (24)
D                        1 (2)        4 (7)        5 (5)
G                        22 (50)      30 (49)      52 (49)
I                        1 (2)        0 (0)        1 (1)
K                        0 (0)        10 (16)      10 (10)
Total                    44 (100)     61 (100)     105 (100)

PSP, passive study population, not invited; ASP, active study population, invited
and the completeness of registration of breast cancer is over
98%.16 Thus, we were able to determine the reason for dis-
crepancy in every individual case and no discrepancies were
left unexplained.
Our main empirical finding is that, of the 2615 cases included
by the W-E Trial or the OVC, there were 478 (18%) disagreements
about inclusion/exclusion of women into the trial or determination of the
cause of death. The vast majority of these pertained to a disagreement in inclusion/exclusion and not to disagreement in determination of cause of death. The disagreements
were in the great majority of cases due to OVC-study design
decisions pertaining to issues such as definition of age and
last date of inclusion into the study, and use of a register
rather than clinical records for case definition and cause
of death determination. Disagreement about whether a
death included in both the W-E Trial and the OVC
was from breast cancer or not was relatively rare. We
also found that the likelihood of disagreement about the
cause of death was not significantly affected by county or
trial arm. Such disagreement was, however, significantly
more likely in older patients. These findings have
implications both for the interpretation of screening effects and for methodological issues in overviewing original
research.
The combined results of the two counties showed a signifi-
cant breast cancer mortality reduction associated with the
offer of screening by any of the three endpoint criteria.
Using the JRC conclusions, the combined RR was 0.69
(95% CI 0.58-0.83). Thus, the overall interpretation was
not sensitive to these differences in design. In W-county,
the result was significant by any of the three criteria,
whereas in E-county, the result was significant using the
original trial endpoint and the JRC conclusive review endpoint,
but not statistically significant using the 2nd OVC endpoint.
The JRC conclusive result included some women with
breast cancer previously missed by the trialists due to
migration, but picked up by the NCR or CDR.
The remit of the JRC was not to determine whether one
or the other of the endpoints was correct. However, it is
clear from the E-county results that a combination of differing
cause of death determinations and inclusion/exclusion
rules made a crucial difference to the primary result.
It is highly relevant for the field of secondary prevention
to understand how such modest disagreements cause
such a difference to the outcome in a trial with a total of
133,065 subjects. The answer is that the disagreements
only needed to impact on the small minorities of subjects
classified as dying from breast cancer within the trial arm
subgroups of one geographical stratum (E-county) within the larger trial. In the ASP of E-county, disagreements
with respect to cause of death and eligibility for inclusion
caused a loss of 16 and a gain of 28 breast cancer deaths,
a net increase of 12 breast cancer deaths. In the PSP,
there was a loss of 36 breast cancer deaths and a gain of
25, a net loss of 11 deaths (Table 5). Thus the 2nd OVC
classification of eligibility for inclusion and cause of death
gave a 7% higher death rate in the ASP and a 6% lower
death rate in the PSP, sufficient to convert a statistically
significant 20% reduction in mortality to a statistically
non-significant 10% reduction. It should be noted that if
the inclusion criteria had been identical and the only
difference had been the disagreements over cause of
death, the result in E-county would still have been ren-
dered non-significant.
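The losses and gains quoted above can be read off Table 5. The sketch below recomputes them from the ASP and PSP blocks of that table; the function name is hypothetical and no data beyond the printed cell counts are used.

```python
# Table 5 (E-county). Rows: 2nd OVC outcome; columns: trial outcome.
# Order in both: breast cancer death, death from other causes, alive, not included.
ASP = [[147, 8, 0, 20], [14, 163, 2, 21], [0, 1, 338, 10], [2, 13, 44, 0]]
PSP = [[164, 4, 2, 19], [27, 128, 3, 20], [0, 1, 296, 10], [9, 8, 41, 0]]


def bcd_shift(table):
    """Breast cancer deaths lost and gained when moving from trial to OVC endpoint."""
    agreed = table[0][0]
    trial_bcd = sum(row[0] for row in table)  # column total: trial endpoint
    ovc_bcd = sum(table[0])                   # row total: OVC endpoint
    lost = trial_bcd - agreed                 # trial BCD not counted as BCD by the OVC
    gained = ovc_bcd - agreed                 # OVC BCD not counted as BCD by the trial
    pct_change = 100 * (ovc_bcd - trial_bcd) / trial_bcd
    return lost, gained, gained - lost, pct_change


asp_lost, asp_gained, asp_net, asp_pct = bcd_shift(ASP)
psp_lost, psp_gained, psp_net, psp_pct = bcd_shift(PSP)
```

The opposite signs of the two net changes are what matter: a modest increase in one arm combined with a modest decrease in the other moves the ratio of the two death counts more than either change would alone.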
The effect of misclassification of exposure factors has
been extensively studied in epidemiology,17-19 and when
it is non-differential with respect to disease outcome, it
tends to dilute estimated effects. Although less fully
researched, the misclassification of outcome has also been
shown to cause underestimation of exposure/outcome associations.20 Disagreement rates between OVC and W-E classifications were 18% in both counties. Discrepancies
of this magnitude are suggestive of misclassification prob-
abilities of 10%, and would be likely to lead to dilution
of observed associations by approximately 33%.21 The
differences between W-E and OVC are smaller than this
for W-county and rather larger for E-county. That they
are proportionally larger for E-county is likely to be due
to the fact that disagreement rates were differential
between trial arms. The implications of this are that in
general, the poorer the classification, the greater the poten-
tial for missing a true effect, that the presence of differen-
tial misclassification may increase the potential bias, and
that the more thorough the classification effort, the more sensitive the comparison is likely to be.
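The direction of the bias from non-differential outcome misclassification can be illustrated with a simple calculation. This is a sketch under stated assumptions, not the method of the cited references, and it does not attempt to reproduce the 33% figure: the risks, sensitivity and specificity below are hypothetical values chosen only for illustration.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Risk of being classified as a breast cancer death under imperfect classification."""
    true_positives = sensitivity * true_risk
    false_positives = (1 - specificity) * (1 - true_risk)
    return true_positives + false_positives


def observed_rr(risk_invited, risk_control, sensitivity, specificity):
    """Relative risk after applying the same error rates to both arms."""
    return (observed_risk(risk_invited, sensitivity, specificity)
            / observed_risk(risk_control, sensitivity, specificity))


# Hypothetical risks of breast cancer death among cases in each arm.
true_rr = 0.20 / 0.30
diluted_rr = observed_rr(0.20, 0.30, sensitivity=0.90, specificity=0.90)
```

With the same error rates in both arms, the observed relative risk lies between the true value and 1, so a real benefit appears smaller; if the error rates differ between arms, the bias is no longer guaranteed to be toward the null.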
All these circumstances underline the importance of using
an expert panel for determining cause of death when the
individual study units contain few events. Others have
regarded the determinations of such an expert committee
as the gold standard,22 even when they have concluded
that national death register information is adequate in com-
parison.23 The OVC obtained results closer to those of the
original trial when the 1st OVC used an expert endpoint
committee.2
The finding that the disagreement of cause of death
increased with age is also of general interest. It accords
with the findings of the 1st OVC where four clinicians not involved in the trials independently determined cause of
death and the discordance at the initial review was 5%,
5%, 13% and 19% in women 40-49, 50-59, 60-69 and
70-74 years respectively, at randomization.13 This probably
reflects increasing difficulty in determining cause of death
with age, for several reasons: a mixed clinical picture due
to increasing co-morbidity, death occurring more often at
home or in a nursing home without a clinical examination
shortly before death, a very low probability of an autopsy,
and increased uncertainty about the origin of any metastases
if another malignancy has also been diagnosed
during follow-up. With long-term follow-up, the information
that the woman is a trial participant, and that determination
of cause of death may be important, may also be lost.
The results of the JRC review show that the disagreements
were due to design differences between the clinical
intervention trial approach employed in the original W-E
Trial and the register-study design used by the OVC. This
leads to a more general observation: design decisions in
either an original study or a subsequent overview that
may at first glance seem trivial, e.g. defining a date for
end of trial, can influence basic and important study features
such as the number of included subjects. Thus, design
differences between original studies and overviews have to
be taken into account when the overview does not adhere
to the original designs, and it should be investigated if
the interpretation is sensitive to such design conflicts.
An example here is the inclusion of women with in situ
tumours as cases in the original study contrasted with the
decision to only include those registered with an invasive
cancer in the 1st and the 2nd OVC. This decision made
an especially large difference for the ASP. Thus, seemingly
general deviations from the original study design may not
be neutral to the evaluation of the randomized trial.
In this case this decision above all contributed to the different number of cases reported in the original trial as
compared with the 1st and 2nd OVC, but little to the
evaluation of breast cancer mortality.
Is there a role for registry data in evaluation of primary or
secondary interventions? It would definitely seem so where
the research involves millions of person-years and large
numbers of cause-specific deaths, so that misclassifications
are likely to be heavily outnumbered by reliable obser-
vations3,4 such as in large prevention and secondary preven-
tion studies. For individual trials with smaller sizes, however,
it is more reliable to individually determine case status and
the cause of deaths by an expert committee.
CONCLUSION
The following points are suggested by the above results:
(1) The conclusion that invitation to mammography
screening was associated with a significant breast
cancer mortality reduction remains robust after a full
examination of disagreements between the original
Two-County Trial endpoints and those of the Swedish
overview. Disagreements about actual cause of death
were a minority of the overall disagreements and
were common only for older cases; the majority of disagreements
related to inclusion or exclusion.
(2) The use of the overview inclusion criteria and the
national registry data for determination of breast
cancer deaths led to a substantial change in the result
for one of the two counties illustrating that non-
differential misclassification of the main endpoint
tends to drive results towards the null.
(3) Thus, for trials with modest size it would appear to be
more prudent to rely on trial logistics with close indi-
vidual monitoring of case status, presence of covariates
and outcome status such as determination of cause of
death based on all available clinical information.
(4) When secondary research does not adhere to the protocols
of the primary research projects included, the consequences of such design differences should be
investigated and reported. Seemingly trivial design
decisions may have significant impact on the result
and are not always neutral to the randomized design.
Authors' affiliations
L Holmberg, Professor of Cancer Epidemiology, King's College London, Medical School, Division of Cancer Studies, London, UK
S W Duffy, Professor of Breast Cancer Screening, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Preventive Medicine, London, UK
A M F Yen, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Preventive Medicine, London, UK
L Tabar, Professor of Radiology, University of Uppsala, School of Medicine, Department of Mammography, Falun Central Hospital, Falun, Sweden
B Vitak, Consultant Radiologist, Division of Radiological Sciences, Department of Medical and Health Sciences, Linkoping University, Linkoping, Sweden
L Nystrom, Associate Professor of Epidemiology, Department of Public Health and Clinical Medicine, Umea University, Umea, Sweden
J Frisell, Professor of Surgery, Department of Molecular Medicine and Surgery, Unit of Breast Surgery, Karolinska Institute, Solna, Sweden
ACKNOWLEDGEMENTS
The study was supported by grants from the Swedish Cancer
Society and the American Cancer Society. We thank Sherry
Yueh-Hsia Chiu from the Institute of Preventive Medicine,
Division of Biostatistics, College of Public Health at the
National Taiwan University for excellent help and Robert
Smith from the American Cancer Society for valuable dis-
cussions and advice.
Conflict of interest and contributions: The authors are
associated with the W-E Trial and the Overview as described
in the contributions below and otherwise have no conflict of interest in
relation to this work.
Lars Holmberg, Stephen Duffy and Jan Frisell oversaw the
comparison and coordinated the analyses. Lars Holmberg
and Stephen Duffy drafted the report. Jan Frisell and Lars
Holmberg were the principal investigators for the grants
that supported the study. Laszlo Tabar and Bedrich Vitak
were the principal investigators for the W and E trial parts,
respectively, and provided all data for the W-E trial. Jan
Frisell and Lennarth Nystrom were the principal and the
coordinating investigators for the Overview committee,
respectively, and Lennarth Nystrom provided the Overview data.
Stephen Duffy and Amy Yen performed the statistical analyses.
All authors had full access to the data, contributed to
the comparison process and the interpretation of the analyses,
and revised the manuscript for intellectual content. Lars
Holmberg is the guarantor for the study.
REFERENCES
1 Smith RA, Duffy SW, Gabe R, Tabar L, Yen AMF, Chen HHT. The randomized trials of breast cancer screening: what have we learned? Radiol Clin Nth Amer 2004;42:793–806
2 Nystrom L, Rutqvist LE, Wall S, et al. Breast cancer screening with mammography: overview of the Swedish randomised trials. Lancet 1993;341:973–8
3 Swedish Organised Service Screening Evaluation Group. Reduction in breast cancer mortality from organised service screening with mammography: 1. Further confirmation with extended data. Cancer Epidemiol Biomarkers Prev 2006;15:45–51
4 Swedish Organised Service Screening Evaluation Group. Reduction in breast cancer mortality from organised service screening with mammography: 2. Validation with alternative analytic methods. Cancer Epidemiol Biomarkers Prev 2006;15:52–56
5 Tabar L, Fagerberg CJ, Gad A, et al. Reduction in mortality from breast cancer after mass screening with mammography. Randomised trial from the Breast Cancer Screening Working Group of the Swedish National Board of Health and Welfare. Lancet 1985;325:829–32
6 Tabar L, Vitak B, Chen HH, Duffy SW, Smith RA. The Swedish Two-County Trial twenty years later: updated mortality results and new insights from long term follow-up. Radiol Clin Nth Amer 2000;38:625–51
7 Gøtzsche PC, Olsen O. Is screening for breast cancer with mammography justifiable? Lancet 2000;355:129–33
8 Nystrom L, Andersson I, Bjurstam N, Frisell J, Nordenskjöld B, Rutqvist LE. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet 2002;359:909–19
9 Freedman DA, Petitti DB, Robins JM. On the efficacy of screening for breast cancer. Int J Epidemiol 2004;33:43–55
10 Duffy SW. Interpretation of the breast screening trials: a commentary on the recent paper by Gøtzsche and Olsen. The Breast 2001;10:209–12
11 Tabar L, Fagerberg G, Duffy SW, Day NE, Gad A, Grontoft O. Update of the Swedish two-county program of mammographic screening for breast cancer. Radiol Clin Nth Amer 1992;30:187–210
12 Duffy SW, Tabar L, Vitak B, et al. The Swedish Two-County Trial of mammographic screening: cluster randomisation and endpoint evaluation. Ann Oncol 2003;39:1746–54
13 Nystrom L, Larsson L-G, Rutqvist LE, et al. Determination of cause of death among breast cancer cases in the Swedish mammography screening trials: a comparison between official statistics and validation by an endpoint committee. Acta Oncol 1995;34:145–52
14 Larsson LG, Andersson I, Bjurstam N, et al. Updated overview of the Swedish randomised trials on breast cancer screening with mammography: age group 40–49 at randomisation. J Natl Cancer Inst Monogr 1997;22:57–61
15 McNemar Q. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 1947;12:153–7
16 Barlow L, Westergren K, Holmberg L, Talback M. The completeness of the Swedish Cancer Register - a sample survey for year 1998. Acta Oncol 2009;48:27–33
17 Freedman LS, Midthune D, Carroll RJ, Kipnis V. A comparison of regression calibration, moment reconstruction and imputation for adjusting for covariate measurement error in regression. Stat Med 2008;27:5195–216
18 Wong MY, Day NE, Luan JA, Wareham NJ. Estimation of magnitude in gene-environment interactions in the presence of measurement error. Stat Med 2004;23:987–98
19 Bashir SA, Duffy SW. The correction of risk estimates for measurement error. Ann Epidemiol 1997;7:154–64
20 Duffy SW, Warwick J, Williams AR, et al. A simple model for potential use with a misclassified binary outcome in epidemiology. J Epidemiol Comm Hlth 2004;58:712–7
21 Duffy SW, Maximovitch DM, Day NE. External validation, repeat determination, precision of risk estimation in misclassified exposure data in epidemiology. J Epidemiol Comm Hlth 1992;46:620–24
22 Miller AB. Design of cancer screening trials/randomized trials for evaluation of cancer screening. World J Surg 2006;30:1152–62
23 Makinen T, Karhunen P, Aro J, Lahtela J, Maattanen L, Auvinen A. Assessment of causes of death in a prostate cancer screening trial. Int J Cancer 2008;122:413–17