73.Full

7/28/2019 73.Full

1/8


ORIGINAL ARTICLE

Differences in endpoints between the Swedish W-E (two county) trial of mammographic screening and the Swedish overview: methodological consequences

L Holmberg, S W Duffy, A M F Yen, L Tabár, B Vitak, L Nyström and J Frisell

J Med Screen 2009;16:73–80
DOI: 10.1258/jms.2009.008103

See end of article for authors' affiliations.

Correspondence to: Lars Holmberg, Research Oncology, 3rd floor Bermondsey Wing, Guy's Hospital, Division of Cancer Studies, Guy's Campus, King's College London, SE1 9RT London, UK; [email protected]

Accepted for publication 11 March 2009

Objectives To characterize and quantify the differences in the number of cases and breast cancer deaths in the Swedish W-E Trial compared with the Swedish Overview Committee (OVC) summaries and to study methodological issues related to trials in secondary prevention.

Setting The study population of the W-E Trial of mammography screening was included in the first (W and E county) and the second (E-county) OVC summary of all Swedish randomized mammography screening trials. The OVC and the W-E Trial used different criteria for case definition and causes of death determination.

Method A Review Committee compared the original data files from W and E county and the first and second OVC. The reason for a discrepancy was determined individually for all non-concordant cases or breast cancer deaths.

Results Of the 2615 cases included by the W-E Trial or the OVC, there were 478 (18%) disagreements. Of the disagreements 82% were due to inclusion/exclusion criteria, and 18% to disagreement with respect to cause of death or vital status at ascertainment. For E-county, the OVC inclusion rules and register-based determination of cause of death (second OVC) rather than individual case review (W-E Trial and 1st OVC) resulted in a reduction of the estimate of the effect of screening, but for W-county the difference between the original trial and the OVC was modest.

Conclusions The conclusion that invitation to mammography screening reduces breast cancer mortality remains robust. Disagreements were mainly due to study design issues, while disagreements about cause of death were a minority. When secondary research does not adhere to the protocols of the primary research projects, the consequences of such design differences should be investigated and reported. Register linkage of trials can add follow-up information. The precision of trials with modest size is enhanced by individual monitoring of case status and outcome status such as determination of cause of death.

INTRODUCTION

Mammographic screening reduces mortality from breast cancer in both randomized trials [1,2] and in routine service screening [3,4]. The Swedish W-E Trial was the first randomized trial to demonstrate a reduction in breast cancer mortality from screening with mammography alone [5], showing a 31% reduction in breast cancer mortality with invitation to screening. This reduction has remained consistent over long-term follow-up [6].

In 1987 the Swedish Cancer Society set up an Overview Committee (OVC) to review all the randomized mammography trials in Sweden, the W-E Trial being one of them. The OVC performed two overviews (hereafter called the 1st and the 2nd OVC) by collecting data from all four Swedish mammography trials in a uniform way. However, the 1st and 2nd OVC differed in the method of determining cause of death: the 1st OVC used an endpoint committee, whereas the 2nd used registries only.

Concern was expressed about differences between results reported for the W-E trial by the original trialists and those reported by the Swedish Overview, particularly with respect to numbers of breast cancer deaths [7]. It has been pointed out that such differences are an inevitable consequence of the different case definition and determination of cause of death, and the different eligibility criteria of the Swedish Overview [1,8–10] as compared with the W-E Trial. These differences, however, raise both particular and general methodological issues related to follow-up of large trials or sets of trials in secondary prevention. These questions include:

(1) What is the magnitude of these differences at an individual rather than aggregate level, and what proportions of the differences are due to inclusion/exclusion criteria of cases in the Swedish Overview and to cause of death determination?


www.jmedscreen.com Journal of Medical Screening 2009 Volume 16 Number 2


(2) What are the reasons for individual differences between the original study and the overview with respect to breast cancer case definition/inclusion and cause of death?

(3) What are the implications of these kinds of differences for endpoint definition in future studies in primary or secondary prevention?

In this paper, we report on a complete audit of breast cancer cases and deaths in the Swedish W-E Trial, as defined by the original trial investigators and by the Swedish Overview. We report the numbers of disagreements at individual level and the reasons for these. We discuss their implications for interpretation of the overview and the original trial (up to the end of 1993 for W-county and to the end of 1996 for E-county) and for design and follow-up of future secondary prevention trials.

BACKGROUND

The Swedish W-E Trial was initiated in 1977 in Kopparberg county (now Dalarna, referred to as W-county hereafter) and in 1978 in Östergötland county (E-county). Small geographical clusters were randomized to invitation to screening (Active Study Population, ASP) or no invitation (Passive Study Population, PSP) within 7 strata in W-county and within 12 strata in E-county. The strata were chosen so that clusters within strata were socioeconomically homogeneous. In W-county, randomization was approximately in the ratio 2:1 for ASP:PSP. In E-county, roughly equal numbers were randomized to the two groups. Entry of strata to the trial was staggered to allow the mammography facilities to cope with the workload. Year of birth cohorts were included to give an approximate age range of 40–74. For example, for a stratum whose randomization date was 1977, years of birth 1903–1937 were included. In total, 77,080 women were randomized to the ASP and 55,985 to the PSP. Details of the age and county breakdown of the study population are given elsewhere [11,12].

The screening regime was single-view mammography, on average every 24 months in women aged 40–49 and every 33 months in women aged 50–74. At the end of 1984 a significant 31% reduction in breast cancer mortality was observed in the ASP [5]. The PSP was then invited to screening. The trial was closed immediately on completion of the first round of screening in the PSP, and all cases in both arms diagnosed up to and including the end of the first screen of the PSP were followed up for death from breast cancer. In W-county, according to the local trial's records, there were 694 breast cancer cases in the ASP and 359 in the PSP. In E-county, there were 732 breast cancer cases diagnosed in the ASP and 683 in the PSP [6]. These cases included both in situ and invasive cancers diagnosed during the trial period.

The OVC defined breast cancer cases as women reported to the Swedish National Cancer Registry (NCR) with an invasive breast cancer only (excluding women with cancer in situ) during each trial's recruitment period, using the reporting date in the NCR as the date of diagnosis. Women with an invasive breast cancer reported to the NCR before the trial start were excluded from the study base, although women diagnosed before 1958, when the NCR was established, could not be excluded. The OVC also accepted a woman as a breast cancer case in the study when there was only a breast cancer death registered in the Swedish Causes of Death Register (CDR), even if she was not registered at the NCR. Thus, the diagnosis could have occurred before trial start, during the trial period or after the trial ended. Deaths from breast cancer were retrieved from the CDR to include all deaths in women with breast cancer as the underlying cause, according to the death certificate. Inclusion criteria in all analyses in the overview were based on the exact age at randomization, as opposed to the Two-County trial, where inclusion was determined on the basis of year of birth. The OVC retrieved the original randomization file, based on the population register, from the IT co-ordinator responsible for data management in each of the counties. The files were linked by each woman's unique National Registration Number to the corresponding Regional Tumour Registry, which provides the data for the NCR, to obtain verification and date of diagnosis, and to the CDR to obtain date and cause of death.

Importantly, the 1st OVC included specialists who, independently of the W-E Trial, determined the cause of death of the breast cancer patients based on case records. The publications of the 1st OVC gave a relative risk estimate similar to that of the W-E Trial's local committee [2]. In the 2nd OVC the decision was made to use the Swedish National Cancer and Death Registries (NCR and CDR) to determine cause of death instead of using the specialist committee, because the combined relative risk using the register data was similar to that of the 1st OVC [13].

The 1st OVC conducted a computerized follow-up of both the W-county and E-county data to 31 December 1993, which was the last time the W-county data were included. The 2nd OVC continued data collection for E-county only, until 31 December 1996, which was the last time the E-county data were included [2].

The four particularly important differences between the W-E study design and the criteria of the 1st and 2nd OVC were:

(1) The original trial defined inclusion and exclusion of women to the trial by year of birth and residence in the relevant geographical areas at the time of randomization. The 1st and 2nd OVC defined the population by year, month and day of birth.

(2) The end-point committees of the W-E trial and the 1st OVC determined individual patient outcome by reviewing all clinical records as identified in the original trial data and in NCR and CDR data. The 2nd OVC used NCR and CDR data only.

(3) The OVC included women as cases if the CDR reported a breast cancer death even if there was no report of a breast cancer diagnosis in the NCR. The W-E Trial included only those women who had a microscopically confirmed breast cancer diagnosed during the trial period.

(4) The W-E Trial included all breast cancer cases (in situ and invasive), whereas the 1st and 2nd OVC both considered only invasive breast cancer cases reported to the NCR, excluding all women reported to the NCR as having carcinoma in situ. The OVC also could not include cases that, through clerical error, had not been reported from the clinics to the NCR.
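The first of these differences, the age rule, can be illustrated with a minimal sketch. It assumes that both rules target the 40–74 age range quoted in the Background; the exact age limits applied by the OVC are not stated here, and the woman in the example is hypothetical. The function names are illustrative only.

```python
from datetime import date

# Approximate trial age range quoted in the text; applying the same limits
# to the exact-age rule is an assumption made for illustration.
MIN_AGE, MAX_AGE = 40, 74


def eligible_by_birth_year(birth_year: int, randomization_year: int) -> bool:
    """W-E style rule: compare calendar years only."""
    return MIN_AGE <= randomization_year - birth_year <= MAX_AGE


def exact_age(birth: date, on: date) -> int:
    """Completed years of age on a given date."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


def eligible_by_exact_age(birth: date, randomization: date) -> bool:
    """OVC style rule: compare exact age at the randomization date."""
    return MIN_AGE <= exact_age(birth, randomization) <= MAX_AGE


# Hypothetical woman born late in 1937 in a stratum randomized in mid-1977.
birth = date(1937, 12, 15)
randomization = date(1977, 6, 1)
by_year = eligible_by_birth_year(birth.year, randomization.year)
by_exact = eligible_by_exact_age(birth, randomization)
```

Under these assumptions the same woman is inside the study base by year of birth but outside it by exact age, which is the kind of boundary case that Table 1 groups under category A.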

METHODS

In 2006 the Swedish Cancer Society set up a Joint Review Committee (JRC), including members of the 1st and 2nd OVCs and the project leaders of the W-E trial, to investigate the sources of disagreement between the results published by the trialists and the OVC (the 1st OVC for W-county and the 2nd OVC for E-county data). The lists of women with breast cancer according to the trialists and the OVC were compared. Where necessary, clinical records were retrieved. After the trialists and the OVC had independently investigated each case in the two lists, a classification scheme of the differences was developed by the JRC (Table 1). The records of breast cancer cases and deaths according to the local endpoint committee were compared with those of the OVC using the Swedish National Registration Numbers of the subjects for linkage. The deaths through 1993 were compared for W-county and through 1996 for E-county, as these dates were respectively the most recent Swedish Overview analyses to include each county [8,14]. The JRC reviewed each disagreement between the two datasets with respect to either case definition or cause of death. The JRC determined the reasons for each individual disagreement. As a final result, the trialists also accepted some women as additional cancer cases in their trials, on the basis of new information about migrated women and clerical errors. The original datasets with these women added are called the JRC conclusive dataset.

Paired significance tests between OVC and W-E endpoints were carried out using McNemar methods [15]. Associations of the likelihood of disagreement with age, county and trial arm were assessed using the chi-squared test. Relative risks and 95% confidence intervals on these were calculated using Poisson regression.
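As an illustration of the paired comparison, the following minimal sketch computes an exact two-sided McNemar p-value from the two discordant counts. It is not the authors' code, and the exact binomial form is an assumption, since the text only says "McNemar methods". The example counts are the discordant cause-of-death classifications for W-county deaths included in both datasets, read from the Total block of Table 2.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis each discordant pair falls in either
    off-diagonal cell with probability 0.5, so the smaller count follows
    a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Table 2, Total block, deaths included in both datasets:
#   OVC: breast cancer death, trial: death from other causes -> 4
#   trial: breast cancer death, OVC: death from other causes -> 20
p_value = mcnemar_exact(4, 20)
```

The resulting p-value is an illustrative calculation and is not a figure reported in the paper.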

RESULTS

According to the W-E trial records the total numbers of women for W-county were 38,589 in the ASP and 18,582 in the PSP; for E-county the numbers were 38,481 and 37,403. The corresponding figures in the OVC records were 38,562, 18,478, 38,405 and 37,145. These differences, of the order of less than 1%, were not influential in the estimation of the primary results.

W-county

Table 2 shows all cases included in either the local trial

records for W-county or the 1st OVC records or both, with

Table 1 Classification of potential differences between the W-E trial and the overview

Category A. Difference in the definition of age (accounts for differences at both ends of the age spectrum)
  W-E trial: Age calculation was based on the year of randomization and the year of birth of the trial attendee.
  Overview: Age calculation was based on exact date of birth and randomization (day/month/year).

Category B. Definition of date of diagnosis (accounts for differences at trial start and at trial end)
  W-E trial: Date of operation. Women who had a screening or clinical diagnosis at the end of the trial, before closing date, but were operated on after the closing date, have been included.
  Overview: Date of notification to the NCR, which, according to registry principles, is the first notification to the NCR of a cancer (often the date of a positive cytology).

Category C. Difference in the principles of use of causes of death registry
  W-E trial: Included only cases diagnosed within the trial period.
  Overview: Included cases diagnosed within the trial period and breast cancer deaths registered in the CDR even if they were not registered at the NCR (thus the diagnosis could have occurred before trial start, during the trial period or after the trial ended).

Category D. Cases not retrievable from the NCR at the time of the overview
  W-E trial: Included all cases, including cancer in situ, during the trial period when there was clinical information available on a breast cancer, even if a case was not registered at the NCR due to administrative errors, and excluded all cases with a history of breast cancer before the study started.
  Overview: Included only invasive cases retrievable from the NCR or identified at the CDR at the time when the overview was conducted, but could not exclude those cases that were diagnosed before the start of the NCR in 1958.

Category G. Differences in the determination of cause of death
  W-E trial: Cause of death was determined by the local trial committee based on data available in patients' medical records.
  Overview: 1st OVC used an independent end-point committee; the 2nd OVC used cause of death data registered in the CDR.

Category I. Erroneous inclusion in the W-E database
  Clerical error or incorrect national registration number.

Category K. Miscellaneous clerical errors and other reasons
  Includes misspecification of eligibility or cause of death due to clerical error, erroneous registration in NCR or CDR or administrative loss of information. Also includes individual migration, where a subject received a breast cancer diagnosis outside the study areas, and was therefore in the overview but missing from the W-E database.

NCR, National cancer register; CDR, National cause of death register


the endpoint in each data set cross-tabulated. Of the 1053 cases included in the local trial records, the OVC included 925 cases (88%). Conversely, of the 972 cases included in the OVC records, the local trial included 925 breast cancer cases (95%). Of the 443 deaths to 1993 included in both datasets, there were 24 (5%) disagreements regarding determination of cause of death (type G disagreement). Of the total 199 disagreements, whether with respect to case inclusion or to cause of death, 175 (88%) pertained to case inclusion rather than cause of death. For both the ASP and the PSP, the overview was less likely to classify a death as from breast cancer. The magnitude of this tendency did not differ significantly between ASP and PSP.

Table 3 shows the reasons for disagreement between the two breast cancer case datasets, in the ASP and PSP separately. The largest group of disagreements in the ASP was type D, mainly due to women with screen-detected in situ lesions included in the W-E dataset but not included by the overview. In the PSP, most of the disagreements were of type B, relating to date of diagnosis. These disagreements mostly resulted from women in the PSP who were diagnosed at the first screen but, through delays in reporting, were not entered into the NCR until after closure of the trial. These women were considered by the OVC to have been diagnosed only at the reporting date to the register and were thus excluded in the OVC (see Table 1, category B). Disagreements with respect to death from breast cancer were mainly due to category G (47%; disagreement about cause of death) and C (29%; use of cause of death register without reference to date of diagnosis) in the ASP, and to G (42%; disagreement about cause of death) and B (32%; definition of date of diagnosis) in the PSP.

Table 4 shows the breast cancer deaths and corresponding relative risks (RR) from the W-arm of the W-E trial, the OVC, and those derived after review of all information by the JRC and the resulting conclusive dataset (i.e. the original trial data plus correction for the clerical errors and cases lost to the trialists due to migration). The OVC result is more conservative than that of the original trial and the result based on the JRC conclusive dataset. All analyses show a significant mortality reduction in the ASP.
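The point estimates in Table 4 can be approximated from the death counts and arm sizes alone. The sketch below computes a crude relative risk with a Wald interval on the log scale. This is a simplification and an assumption: the paper's intervals come from Poisson regression, so the limits produced here need not match the published ones exactly. The function name `crude_rr` is illustrative.

```python
from math import exp, sqrt


def crude_rr(deaths_asp, n_asp, deaths_psp, n_psp, z=1.96):
    """Crude relative risk (ASP vs PSP) with a Wald interval on the log scale."""
    rr = (deaths_asp / n_asp) / (deaths_psp / n_psp)
    # Standard error of log(RR) for two independent cumulative risks.
    se = sqrt(1 / deaths_asp - 1 / n_asp + 1 / deaths_psp - 1 / n_psp)
    return rr, rr * exp(-z * se), rr * exp(z * se)


# W-county arm sizes and breast cancer deaths as given in Table 4.
N_ASP, N_PSP = 38_589, 18_582
rows = {
    "W original": (135, 110),
    "OVC": (141, 99),
    "JRC conclusive": (136, 111),
}
results = {name: crude_rr(a, N_ASP, p, N_PSP) for name, (a, p) in rows.items()}

for name, (rr, lo, hi) in results.items():
    print(f"{name}: RR {rr:.2f} ({lo:.2f} to {hi:.2f})")
```

The same function can be applied to the E-county counts in Table 7 by substituting the corresponding deaths and numbers of subjects.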

E-county

Table 5 shows cross-tabulation of the local trial endpoint records for E-county with the 2nd OVC records, for all women with breast cancer in either or both datasets. Of the 1415 women with breast cancer included in the local trial records, the 2nd OVC included 1298 (92%). Of the 1398 cases included in the 2nd OVC records, the local trial included 1298 (93%). Of the 655 deaths to 1996 included in both datasets, there were 53 (8%) disagreements. Of the total 279 disagreements, 217 (78%) pertained to case inclusion, 53 (19%) to cause of death and 9 (3%) to vital status at 31 December 1996. For both the ASP and the PSP, the 2nd OVC was less likely to classify a death as from breast cancer. This tendency was significantly stronger in the PSP (59% vs. 52%; P = 0.03).

The reasons for disagreements are shown in Table 6. As with W-county, the largest group of disagreements in the ASP (40%) was of type D, absence of trial cases from the NCR. For the PSP, however, similar proportions of disagreements were

Table 3 Categorized disagreements between W-county trial records and 1st OVC records

Number (%) of disagreements in cases

Category   ASP         PSP         Total
A          14 (12)     3 (4)       17 (8)
B          5 (4)       52 (65)     57 (29)
C          13 (11)     2 (2)       15 (7)
D          51 (43)     12 (15)     63 (32)
G          16 (13)     8 (10)      24 (12)
I          0 (0)       0 (0)       0 (0)
K          20 (17)     3 (4)       23 (12)
Total      119 (100)   80 (100)    199 (100)

Number (%) of disagreements for breast cancer death

Category   ASP         PSP         Total
A          2 (6)       0 (0)       2 (4)
B          0 (0)       6 (32)      6 (11)
C          10 (29)     2 (11)      12 (23)
D          2 (6)       1 (5)       3 (6)
G          16 (47)     8 (42)      24 (45)
I          0 (0)       0 (0)       0 (0)
K          4 (12)      2 (11)      6 (11)
Total      34 (100)    19 (100)    53 (100)

PSP, passive study population, not invited; ASP, active study population, invited

Table 4 Trial mortality result for W-county from original local trial endpoint, 1st OVC endpoint and the JRC conclusive dataset

                                             ASP      PSP      RR (95% CI)
W original breast cancer deaths              135      110      0.59 (0.45–0.76)
OVC breast cancer deaths                     141      99       0.69 (0.53–0.90)
JRC conclusion breast cancer deaths for W    136      111      0.59 (0.45–0.76)
Number of subjects                           38,589   18,582


Table 2 W-county outcomes tabulated against overview outcomes (agreements marked *)

Rows give the 1st OVC outcome; columns give the W-county outcome.

Study group   1st OVC outcome   Incl, BCD   Incl, DOC   Incl, alive   Not incl   Total
PSP           Incl, BCD         95*         0           0             4          99
              Incl, DOC         8           50*         0             0          58
              Incl, alive       0           0           138*          0          138
              Not incl          7           7           54            0          68
              Total             110         57          192           4          363

ASP           Incl, BCD         121*        4           0             16         141
              Incl, DOC         12          153*        0             15         180
              Incl, alive       0           0           344*          12         356
              Not incl          2           10          48            0          60
              Total             135         167         392           43         737

Total         Incl, BCD         216*        4           0             20         240
              Incl, DOC         20          203*        0             15         238
              Incl, alive       0           0           482*          12         494
              Not incl          9           17          102           0          128
              Total             245         224         584           47         1100

Incl, included; BCD, breast cancer death; DOC, death from other causes; PSP, passive study population, not invited; ASP, active study population, invited


observed in categories D (19%; absence of the case from the NCR), G (22%; disagreement about cause of death) and K (26%; miscellaneous clerical errors and other reasons). With respect to breast cancer death, disagreements were dominated by categories G (disagreement about cause of death) and C (use of the cause of death register without reference to date of diagnosis) in both the ASP and PSP.

Table 7 shows the E-county trial result with respect to breast cancer mortality using the original trial endpoint, the 2nd OVC endpoint and the conclusive endpoint after review of all sources by the JRC (i.e. the original trial data plus correction for the clerical errors and cases lost to the trialists due to migration). The trial endpoint and the JRC conclusive dataset both show a significant 20–23% reduction in mortality, whereas the 2nd OVC result shows a non-significant 10% reduction.

Associations with disagreement

We also investigated whether study group (ASP/PSP), county or age were significantly related to the likelihood of disagreement about breast cancer death. In the 685 cases classified as breast cancer death by either the W and E local committees or the OVC or both, there was no significant association of study group with disagreement (P = 0.2). There was a higher proportion of disagreement in E-county than in W-county, but this did not attain statistical significance (P = 0.09). There was, however, a significant effect of patient age at the time of randomization on the probability of disagreement (P < 0.001). In both counties, the disagreement increased with age (Figure 1).

DISCUSSION

In this study, the Swedish Cancer Society's Joint Review Committee (JRC) investigated disagreements between the breast cancer incidence and death data as recorded in the original Swedish Two-County Trial, based on individual patient records and determination of cause of death by an expert committee, and that in the 2nd OVC, based on the National Cancer Registry and Cause of Death Register. For the purposes of this study, we had full access to original W-E trial data, original data collected for the OVC, individual medical records, and register data from the regional tumour registries for the respective counties. The registration of new diagnosis of breast cancer is mandatory by law in Sweden

Table 7 Trial mortality result for E-county from original local trial endpoint, 2nd OVC endpoint and the JRC conclusive dataset

                                             ASP      PSP      RR (95% CI)
Original E breast cancer deaths              163      200      0.80 (0.64–0.98)
OVC breast cancer deaths                     175      189      0.90 (0.72–1.12)
JRC conclusion breast cancer deaths for E    162      206      0.77 (0.62–0.95)
Number of subjects                           38,309   37,403


Figure 1 Percentage disagreement between W-E and 2nd OVC by age, in 604 cases classed as breast cancer deaths by one or both sources

Table 5 E-county outcomes tabulated against 2nd OVC outcomes (agreements marked *)

Rows give the 2nd OVC outcome; columns give the E-county outcome.

Study group   2nd OVC outcome   Incl, BCD   Incl, DOC   Incl, alive   Not incl   Total
PSP           Incl, BCD         164*        4           2             19         189
              Incl, DOC         27          128*        3             20         178
              Incl, alive       0           1           296*          10         307
              Not incl          9           8           41            0          58
              Total             200         141         342           49         732

ASP           Incl, BCD         147*        8           0             20         175
              Incl, DOC         14          163*        2             21         200
              Incl, alive       0           1           338*          10         349
              Not incl          2           13          44            0          59
              Total             163         185         384           51         783

Total         Incl, BCD         311*        12          2             39         364
              Incl, DOC         41          291*        5             41         378
              Incl, alive       0           2           634*          20         656
              Not incl          11          21          85            0          117
              Total             363         326         726           100        1515

Incl, included; BCD, breast cancer death; DOC, death of other causes; PSP, passive study population, not invited; ASP, active study population, invited

Table 6 Categorized disagreements between E-county trial records and 2nd OVC records

Number (%) of disagreements in cases

Category   ASP         PSP         Total
A          21 (16)     17 (12)     38 (14)
B          0 (0)       18 (12)     18 (6)
C          15 (11)     10 (7)      25 (9)
D          54 (40)     28 (19)     83 (30)
G          22 (16)     31 (22)     52 (19)
I          3 (2)       3 (2)       6 (2)
K          20 (15)     37 (26)     57 (20)
Total      135 (100)   144 (100)   279 (100)

Number (%) of disagreements for breast cancer death

Category   ASP         PSP         Total
A          5 (12)      3 (5)       8 (7)
B          0 (0)       4 (7)       4 (4)
C          15 (34)     10 (16)     25 (24)
D          1 (2)       4 (7)       5 (5)
G          22 (50)     30 (49)     52 (49)
I          1 (2)       0 (0)       1 (1)
K          0 (0)       10 (16)     10 (10)
Total      44 (100)    61 (100)    105 (100)





and the completeness of registration of breast cancer is over 98% [16]. Thus, we were able to determine the reason for discrepancy in every individual case and no discrepancies were left unexplained.

Our main empirical finding is that, of the 2615 cases included by the W-E Trial or the OVC, there were 478 (18%) disagreements about inclusion/exclusion of women into the trial or determination of the cause of death. The vast majority of these pertained to a disagreement in inclusion/exclusion and not to disagreement in determination of cause of death. The disagreements were in the great majority of cases due to OVC study design decisions pertaining to issues such as definition of age and last date of inclusion into the study, and use of a register rather than clinical records for case definition and cause of death determination. Disagreement about whether a death included in both the W-E Trial and the OVC was from breast cancer or not was relatively rare. We also found that the likelihood of disagreement about the cause of death was not significantly affected by county or trial arm. Such disagreement was, however, significantly more likely in older patients. These findings have implications both for the interpretation of screening effects and for methodological issues in overviewing original research.

The combined results of the two counties showed a significant breast cancer mortality reduction associated with the offer of screening by any of the three endpoint criteria. Using the JRC conclusions, the combined RR was 0.69 (95% CI 0.58–0.83). Thus, the overall interpretation was not sensitive to these differences in design. In W-county, the result was significant by any of the three criteria, whereas in E-county, the result was significant using the original trial endpoint and the JRC conclusive review endpoint, but not statistically significant using the 2nd OVC endpoint. The JRC conclusive result included some women with breast cancer previously missed by the trialists due to migration, but picked up by the NCR or CDR.

The remit of the JRC was not to determine whether one or the other of the endpoints was correct. However, it is clear from the E-county results that a combination of differing cause of death determinations and inclusion/exclusion rules made a crucial difference to the primary result. It is highly relevant for the field of secondary prevention to understand how such modest disagreements can cause such a difference to the outcome in a trial with a total of 133,065 subjects. The answer is that the disagreements only needed to affect the small minorities of subjects classified as dying from breast cancer within the trial arm subgroups of one geographical stratum (E-county) within the larger trial. In the ASP of E-county, disagreements with respect to cause of death and eligibility for inclusion caused a loss of 16 and a gain of 28 breast cancer deaths, a net increase of 12 breast cancer deaths. In the PSP, there was a loss of 36 breast cancer deaths and a gain of 25, a net loss of 11 deaths (Table 5). Thus the 2nd OVC classification of eligibility for inclusion and cause of death gave a 7% higher death rate in the ASP and a 6% lower death rate in the PSP, sufficient to convert a statistically significant 20% reduction in mortality to a statistically non-significant 10% reduction. It should be noted that if the inclusion criteria had been identical and the only difference had been the disagreements over cause of death, the result in E-county would still have been rendered non-significant.
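The losses, gains and net shifts quoted above follow directly from the cross-tabulation in Table 5. The short sketch below recomputes them; the counts are transcribed from Table 5, while the variable and function names are illustrative.

```python
# Rows: 2nd OVC outcome; columns: W-E (E-county) outcome.
# Category order in both directions: BCD, DOC, alive, not included.
TABLE5 = {
    "ASP": [[147, 8, 0, 20], [14, 163, 2, 21], [0, 1, 338, 10], [2, 13, 44, 0]],
    "PSP": [[164, 4, 2, 19], [27, 128, 3, 20], [0, 1, 296, 10], [9, 8, 41, 0]],
}
BCD = 0  # index of the "included, breast cancer death" category


def bcd_shift(table):
    """Summarize how breast cancer deaths move between the two endpoints."""
    trial_bcd = sum(row[BCD] for row in table)  # column total: trial says BCD
    ovc_bcd = sum(table[BCD])                   # row total: OVC says BCD
    agreed = table[BCD][BCD]
    lost = trial_bcd - agreed    # trial BCD not counted as BCD by the OVC
    gained = ovc_bcd - agreed    # OVC BCD not counted as BCD by the trial
    return {
        "trial": trial_bcd,
        "ovc": ovc_bcd,
        "lost": lost,
        "gained": gained,
        "net": gained - lost,
        "ratio": ovc_bcd / trial_bcd,
    }


shifts = {arm: bcd_shift(table) for arm, table in TABLE5.items()}
```

The `ratio` entries (OVC deaths divided by trial deaths) correspond to the roughly 7% higher count in the ASP and the roughly 6% lower count in the PSP described in the text, given that the arm sizes are essentially unchanged between the two sources.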

The effect of misclassification of exposure factors has been extensively studied in epidemiology [17–19], and when it is non-differential with respect to disease outcome, it tends to dilute estimated effects. Although less fully researched, the misclassification of outcome has also been shown to cause underestimation of exposure/outcome associations [20]. Disagreement rates between OVC and W-E classifications were 18% in both counties. Discrepancies of this magnitude are suggestive of misclassification probabilities of 10%, and would be likely to lead to dilution of observed associations by approximately 33% [21]. The differences between W-E and OVC are smaller than this for W-county and rather larger for E-county. That they are proportionally larger for E-county is likely to be due to the fact that disagreement rates were differential between trial arms. The implications of this are that in general, the poorer the classification, the greater the potential for missing a true effect, that the presence of differential misclassification may increase the potential bias, and that the more thorough the classification effort, the more sensitive the comparison is likely to be.

All these circumstances underline the importance of using an expert panel for determining cause of death when the individual study units contain few events. Others have regarded the determinations of such an expert committee as the gold standard [22], even when they have concluded that national death register information is adequate in comparison [23]. The OVC obtained results closer to those of the original trial when the 1st OVC used an expert endpoint committee [2].

The finding that the disagreement of cause of death

increased with age is also of general interest. It accords

with the findings of the 1st OVC where four clinicians notinvolved in the trials independently determined cause of

death and the discordance at the initial review was 5%,

5%, 13% and 19% in women 4049, 5059, 6069 and

70 74 years respectively, at randomization.13 This probably

reflects an increasing difficulty to determine cause of death

with age for several reasons: a mixed clinical picture due

to increasing co-morbidity, death occurring more often at

home or in a nursing home without a clinical examination

closely before death, very low probability of an autopsy,

and increased uncertainty about origin of eventual metas-

tases if also another malignancy has been diagnosed

during follow-up. With long-term follow-up, the information that the woman is a trial participant, and that determination of her cause of death may therefore be important, may also be lost.

The results of the JRC review show that the disagreements were due to design differences between the clinical

intervention trial approach employed in the original W-E

Trial and the register-study design used by the OVC. This

leads to a more general observation: design decisions in

either an original study or a subsequent overview that

may at first glance seem trivial (e.g. defining a date for end of trial) can influence basic and important study features such as the number of included subjects. Thus, design

differences between original studies and overviews have to

be taken into account when the overview does not adhere

to the original designs, and it should be investigated if

the interpretation is sensitive to such design conflicts.

An example here is the inclusion of women with in situ

tumours as cases in the original study contrasted with the

decision to only include those registered with an invasive

cancer in the 1st and the 2nd OVC. This decision made

an especially large difference for the ASP. Thus, seemingly general deviations from the original study design may not be neutral to the evaluation of the randomized trial.

In this case, this decision contributed above all to the different number of cases reported in the original trial as compared with the 1st and 2nd OVC, but little to the

evaluation of breast cancer mortality.

Is there a role for registry data in evaluation of primary or

secondary interventions? It would definitely seem so where

the research involves millions of person-years and large

numbers of cause-specific deaths, so that misclassifications are likely to be heavily outnumbered by reliable observations3,4 such as in large prevention and secondary prevention studies. For individual trials of smaller size, however, it is more reliable to have an expert committee determine case status and cause of death individually.

CONCLUSION

The following points are suggested by the above results:

(1) The conclusion that invitation to mammography

screening was associated with a significant breast

cancer mortality reduction remains robust after a full

examination of disagreements between the original

Two-County Trial endpoints and those of the Swedish

overview. Disagreements about actual cause of death

were a minority of the overall disagreements and

were common only for older cases; the majority of disagreements related to inclusion or exclusion.

(2) The use of the overview inclusion criteria and the

national registry data for determination of breast

cancer deaths led to a substantial change in the result

for one of the two counties, illustrating that non-differential misclassification of the main endpoint tends to drive results towards the null.

(3) Thus, for trials of modest size it would appear to be more prudent to rely on trial logistics with close individual monitoring of case status, presence of covariates and outcome status, such as determination of cause of death based on all available clinical information.

(4) When secondary research does not adhere to the protocols of the primary research projects included, the consequences of such design differences should be

investigated and reported. Seemingly trivial design

decisions may have significant impact on the result

and are not always neutral to the randomized design.

Authors' affiliations

L Holmberg, Professor of Cancer Epidemiology, King's College London, Medical School, Division of Cancer Studies, London, UK

S W Duffy, Professor of Breast Cancer Screening, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Preventive Medicine, London, UK

A M F Yen, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Preventive Medicine, London, UK

L Tabár, Professor of Radiology, University of Uppsala, School of Medicine, Department of Mammography, Falun Central Hospital, Falun, Sweden

B Vitak, Consultant Radiologist, Division of Radiological Sciences, Department of Medical and Health Sciences, Linköping University, Linköping, Sweden

L Nyström, Associate Professor of Epidemiology, Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden

J Frisell, Professor of Surgery, Department of Molecular Medicine and Surgery, Unit of Breast Surgery, Karolinska Institute, Solna, Sweden

ACKNOWLEDGEMENTS

The study was supported by grants from the Swedish Cancer

Society and the American Cancer Society. We thank Sherry

Yueh-Hsia Chiu from the Institute of Preventive Medicine,

Division of Biostatistics, College of Public Health at the

National Taiwan University for excellent help and Robert

Smith from the American Cancer Society for valuable dis-

cussions and advice.

Conflict of interest and contributions: The authors are associated with the W-E Trial and the Overview as described in the contributions below and otherwise have no conflict of interest in relation to this work.

Lars Holmberg, Stephen Duffy and Jan Frisell oversaw the

comparison and coordinated the analyses. Lars Holmberg

and Stephen Duffy drafted the report. Jan Frisell and Lars

Holmberg were the principal investigators for the grants

that supported the study. László Tabár and Bedrich Vitak were the principal investigators for the W and E trial parts, respectively, and provided all data for the W-E Trial. Jan Frisell and Lennarth Nyström were the principal and the coordinating investigators for the Overview committee, respectively, and Lennarth Nyström provided the Overview data. Stephen Duffy and Amy Yen performed the statistical analyses. All authors had full access to the data, contributed to the comparison process and the interpretation of the analyses, and revised the manuscript for intellectual content. Lars

Holmberg is the guarantor for the study.

REFERENCES

1 Smith RA, Duffy SW, Gabe R, Tabár L, Yen AMF, Chen HHT. The randomized trials of breast cancer screening: what have we learned? Radiol Clin Nth Amer 2004;42:793–806

2 Nyström L, Rutqvist LE, Wall S, et al. Breast cancer screening with mammography: overview of the Swedish randomised trials. Lancet 1993;341:973–8

3 Swedish Organised Service Screening Evaluation Group. Reduction in breast cancer mortality from organised service screening with mammography: 1. Further confirmation with extended data. Cancer Epidemiol Biomarkers Prev 2006;15:45–51

4 Swedish Organised Service Screening Evaluation Group. Reduction in breast cancer mortality from organised service screening with mammography: 2. Validation with alternative analytic methods. Cancer Epidemiol Biomarkers Prev 2006;15:52–56

5 Tabár L, Fagerberg CJ, Gad A, et al. Reduction in mortality from breast cancer after mass screening with mammography. Randomised trial from the Breast Cancer Screening Working Group of the Swedish National Board of Health and Welfare. Lancet 1985;325:829–32

6 Tabár L, Vitak B, Chen HH, Duffy SW, Smith RA. The Swedish Two-County Trial twenty years later: updated mortality results and new insights from long term follow-up. Radiol Clin Nth Amer 2000;38:625–51

7 Gøtzsche PC, Olsen O. Is screening for breast cancer with mammography justifiable? Lancet 2000;355:129–33



8 Nyström L, Andersson I, Bjurstam N, Frisell J, Nordenskjöld B, Rutqvist LE. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet 2002;359:909–19

9 Freedman DA, Petitti DB, Robins JM. On the efficacy of screening for breast cancer. Int J Epidemiol 2004;33:43–55

10 Duffy SW. Interpretation of the breast screening trials: a commentary on the recent paper by Gøtzsche and Olsen. The Breast 2001;10:209–12

11 Tabár L, Fagerberg G, Duffy SW, Day NE, Gad A, Gröntoft O. Update of the Swedish two-county program of mammographic screening for breast cancer. Radiol Clin Nth Amer 1992;30:187–210

12 Duffy SW, Tabár L, Vitak B, et al. The Swedish Two-County Trial of mammographic screening: cluster randomisation and endpoint evaluation. Ann Oncol 2003;39:1746–54

13 Nyström L, Larsson L-G, Rutqvist LE, et al. Determination of cause of death among breast cancer cases in the Swedish mammography screening trials: a comparison between official statistics and validation by an endpoint committee. Acta Oncol 1995;34:145–52

14 Larsson LG, Andersson I, Bjurstam N, et al. Updated overview of the Swedish randomised trials on breast cancer screening with mammography: age group 40–49 at randomisation. J Natl Cancer Inst Monogr 1997;22:57–61

15 McNemar Q. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 1947;12:153–7

16 Barlow L, Westergren K, Holmberg L, Talbäck M. The completeness of the Swedish Cancer Register - a sample survey for year 1998. Acta Oncol 2009;48:27–33

17 Freedman LS, Midthune D, Carroll RJ, Kipnis V. A comparison of regression calibration, moment reconstruction and imputation for adjusting for covariate measurement error in regression. Stat Med 2008;27:5195–216

18 Wong MY, Day NE, Luan JA, Wareham NJ. Estimation of magnitude in gene-environment interactions in the presence of measurement error. Stat Med 2004;23:987–98

19 Bashir SA, Duffy SW. The correction of risk estimates for measurement error. Ann Epidemiol 1997;7:154–64

20 Duffy SW, Warwick J, Williams AR, et al. A simple model for potential use with a misclassified binary outcome in epidemiology. J Epidemiol Comm Hlth 2004;58:712–7

21 Duffy SW, Maximovitch DM, Day NE. External validation, repeat determination, precision of risk estimation in misclassified exposure data in epidemiology. J Epidemiol Comm Hlth 1992;46:620–24

22 Miller AB. Design of cancer screening trials/randomized trials for evaluation of cancer screening. World J Surg 2006;30:1152–62

23 Mäkinen T, Karhunen P, Aro J, Lahtela J, Määttänen L, Auvinen A. Assessment of causes of death in a prostate cancer screening trial. Int J Cancer 2008;122:413–17
