
    ORIGINAL ARTICLE

    Differences in endpoints between the Swedish W-E (two county) trial of mammographic screening and the Swedish overview: methodological consequences

    L Holmberg, S W Duffy, A M F Yen, L Tabár, B Vitak, L Nyström and J Frisell

    J Med Screen 2009;16:73–80. DOI: 10.1258/jms.2009.008103

    See end of article for authors' affiliations.

    Correspondence to: Lars Holmberg, Research Oncology, 3rd floor Bermondsey Wing, Guy's Hospital, Division of Cancer Studies, Guy's Campus, King's College London, SE1 9RT London, UK; [email protected]

    Accepted for publication 11 March 2009.

    Objectives: To characterize and quantify the differences in the number of cases and breast cancer deaths in the Swedish W-E Trial compared with the Swedish Overview Committee (OVC) summaries and to study methodological issues related to trials in secondary prevention.

    Setting: The study population of the W-E Trial of mammography screening was included in the first (W and E county) and the second (E-county) OVC summary of all Swedish randomized mammography screening trials. The OVC and the W-E Trial used different criteria for case definition and causes of death determination.

    Method: A Review Committee compared the original data files from W and E county and the first and second OVC. The reason for a discrepancy was determined individually for all non-concordant cases or breast cancer deaths.

    Results: Of the 2615 cases included by the W-E Trial or the OVC, there were 478 (18%) disagreements. Of the disagreements, 82% were due to inclusion/exclusion criteria, and 18% to disagreement with respect to cause of death or vital status at ascertainment. For E-County, the OVC inclusion rules and register-based determination of cause of death (second OVC) rather than individual case review (W-E Trial and 1st OVC) resulted in a reduction of the estimate of the effect of screening, but for W-County the difference between the original trial and the OVC was modest.

    Conclusions: The conclusion that invitation to mammography screening reduces breast cancer mortality remains robust. Disagreements were mainly due to study design issues, while disagreements about cause of death were a minority. When secondary research does not adhere to the protocols of the primary research projects, the consequences of such design differences should be investigated and reported. Register linkage of trials can add follow-up information. The precision of trials with modest size is enhanced by individual monitoring of case status and outcome status such as determination of cause of death.

    INTRODUCTION

    Mammographic screening reduces mortality from breast cancer in both randomized trials1,2 and in routine service screening.3,4 The Swedish W-E Trial was the first randomized trial to demonstrate a reduction in breast cancer mortality from screening with mammography alone,5 showing a 31% reduction in breast cancer mortality with invitation to screening. This reduction has remained consistent over long-term follow-up.6

    In 1987 the Swedish Cancer Society set up an Overview Committee (OVC) to review all the randomized mammography trials in Sweden, the W-E Trial being one of them. The OVC performed two overviews (hereafter called the 1st and the 2nd OVC) by collecting data from all four Swedish mammography trials in a uniform way. However, the 1st and 2nd OVC differed in their methods of determining cause of death: the 1st OVC used an endpoint committee, whereas the 2nd used registries only.

    Concern was expressed about differences between results reported for the W-E trial by the original trialists and those reported by the Swedish Overview, particularly with respect to numbers of breast cancer deaths.7 It has been pointed out that such differences are an inevitable consequence of the different case definition and determination of cause of death, and the different eligibility criteria of the Swedish Overview1,8–10 as compared with the W-E Trial. These differences, however, raise both particular and general methodological issues related to follow-up of large trials or sets of trials in secondary prevention. These questions include:

    (1) What is the magnitude of these differences at an individual rather than aggregate level, and what proportions of the differences are due to inclusion/exclusion criteria of cases in the Swedish Overview and to cause of death determination?


    www.jmedscreen.com Journal of Medical Screening 2009 Volume 16 Number 2


    (2) What are the reasons for individual differences between the original study and the overview with respect to breast cancer case definition/inclusion and cause of death?

    (3) What are the implications of these kinds of differences for endpoint definition in future studies in primary or secondary prevention?

    In this paper, we report on a complete audit of breast cancer cases and deaths in the Swedish W-E Trial, as defined by the original trial investigators and by the Swedish Overview. We report the numbers of disagreements at individual level and the reasons for these. We discuss their implications for interpretation of the overview and the original trial (up to the end of 1993 for W-county and to the end of 1996 for E-county) and for design and follow-up of future secondary prevention trials.

    BACKGROUND

    The Swedish W-E Trial was initiated in 1977 in Kopparberg county (now Dalarna, referred to as W-county hereafter) and in 1978 in Östergötland county (E-county). Small geographical clusters were randomized to invitation to screening (Active Study Population, ASP) or no invitation (Passive Study Population, PSP) within 7 strata in W-county and within 12 strata in E-county. The strata were chosen so that clusters within strata were socioeconomically homogeneous. In W-county, randomization was approximately in the ratio 2:1 for ASP:PSP. In E-county, roughly equal numbers were randomized to the two groups. Entry of strata to the trial was staggered to allow the mammography facilities to cope with the workload. Year of birth cohorts were included to give an approximate age range of 40–74. For example, for a stratum whose randomization date was 1977, years of birth 1903–1937 were included. In total, 77,080 women were randomized to the ASP and 55,985 to the PSP. Details of the age and county breakdown of the study population are given elsewhere.11,12

    The screening regime was single-view mammography, on average every 24 months in women aged 40–49 and every 33 months in women aged 50–74. At the end of 1984 a significant 31% reduction in breast cancer mortality was observed in the ASP.5 The PSP was then invited to screening. The trial was closed immediately on completion of the first round of screening in the PSP, and all cases in both arms diagnosed up to and including the end of the first screen of the PSP were followed up for death from breast cancer. In W-county, according to the local trial's records, there were 694 breast cancer cases in the ASP and 359 in the PSP. In E-county, there were 732 breast cancer cases diagnosed in the ASP and 683 in the PSP.6 These cases included both in situ and invasive cancers diagnosed during the trial period.

    The OVC defined breast cancer cases as women reported with an invasive breast cancer only (excluding women with cancer in situ) to the Swedish National Cancer Registry (NCR) during each trial's recruitment period, using the reporting date in the NCR as the date of diagnosis. Women with an invasive breast cancer reported to the NCR before the trial start were excluded from the study base, although women diagnosed before 1958, when the NCR was established, could not be excluded. The OVC also accepted a woman as a breast cancer case in the study when there was only a breast cancer death registered in the Swedish Causes of Death Register (CDR), even if she was not registered at the NCR. Thus, the diagnosis could have occurred before trial start, during the trial period or after the trial ended. Deaths from breast cancer were retrieved from the CDR to include all deaths in women with breast cancer as the underlying cause, according to the death certificate. Inclusion criteria in all analyses in the overview were based on the exact age at randomization, as opposed to the Two-County trial, where inclusion was determined on the basis of year of birth. The OVC retrieved the original randomization file, based on the population register, from the IT co-ordinator responsible for data management in each of the counties. The files were linked by each woman's unique National Registration Number to the corresponding Regional Tumour Registry, which provides the data for the NCR, to obtain verification and date of diagnosis, and to the CDR to obtain date and cause of death.

    Importantly, the 1st OVC included specialists who, independently of the W-E Trial, determined the cause of death of the breast cancer patients based on case records. The publications of the 1st OVC gave a relative risk estimate similar to that of the W-E Trial's local committee.2 In the 2nd OVC the decision was made to use the Swedish National Cancer and Death Registries (NCR and CDR) to determine cause of death instead of using the specialist committee, because the combined relative risk using the register data was similar to that of the 1st OVC.13

    The 1st OVC conducted a computerized follow-up of both the W-County and E-County data to 31 December 1993, and the 2nd OVC continued data collection for the E-County only until 31 December 1996.2 The computerized follow-up thus ended on 31 December 1993 for the first evaluation round (which was the last time the W-data were included) and on 31 December 1996 for the second evaluation round (which was the last time the E-data were included).

    The four particularly important differences between the W-E study design and the 1st & 2nd OVC's criteria were:

    (1) The original trial defined inclusion and exclusion of women to the trial by year of birth and residence in the relevant geographical areas at the time of randomization. The 1st & 2nd OVC defined the population by year, month and day of birth.

    (2) The end-point committees of the W-E trial and the 1st OVC determined individual patient outcome by reviewing all clinical records as identified in the original trial data and in NCR and CDR data. The 2nd OVC used NCR and CDR data only.

    (3) The OVC included women as cases if the CDR reported a breast cancer death even if there was no report of a breast cancer diagnosis in the NCR. The W-E Trial included only those women who had a microscopically confirmed breast cancer diagnosed during the trial period.

    (4) The W-E Trial included all breast cancer cases (in situ and invasive), whereas the 1st and 2nd OVC both considered only invasive breast cancer cases reported to the NCR, excluding all women reported to the NCR as having carcinoma in situ. The OVC also could not include cases that, through clerical error, had not been reported from the clinics to the NCR.
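Difference (1) can be made concrete with a minimal sketch. It assumes that both schemes target completed ages 40 to 74; the source states this range only approximately and only for the W-E Trial, so `MIN_AGE`, `MAX_AGE`, the function names and the two example women are hypothetical illustrations rather than trial data.

```python
from datetime import date

# Assumption: both schemes target ages 40-74. The source gives this range
# only as an approximate target of the W-E Trial's birth cohorts.
MIN_AGE, MAX_AGE = 40, 74


def age_by_birth_year(birth: date, randomized: date) -> int:
    """W-E Trial style: year of randomization minus year of birth."""
    return randomized.year - birth.year


def exact_age(birth: date, randomized: date) -> int:
    """OVC style: completed years on the exact date of randomization."""
    had_birthday = (randomized.month, randomized.day) >= (birth.month, birth.day)
    return randomized.year - birth.year - (0 if had_birthday else 1)


def eligible(age: int) -> bool:
    return MIN_AGE <= age <= MAX_AGE


# Hypothetical women in a stratum randomized on 1 March 1977 (illustrative dates).
randomized = date(1977, 3, 1)
young = date(1937, 12, 15)  # birth cohort 1937, still 39 on the randomization date
old = date(1902, 12, 15)    # birth cohort 1902, still 74 on the randomization date

for label, birth in [("young", young), ("old", old)]:
    print(label,
          "W-E:", eligible(age_by_birth_year(birth, randomized)),
          "OVC:", eligible(exact_age(birth, randomized)))
```

Under these assumptions the two rules disagree at both ends of the age range, which is the pattern that Table 1 labels category A.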

    METHODS

    In 2006 the Swedish Cancer Society set up a Joint Review Committee (JRC), including members of the 1st & 2nd OVCs and the project leaders of the W-E trial, to investigate the sources of disagreement between the results published by the trialists and the OVC (the 1st OVC for W county and the 2nd OVC for E county data). The lists of women with breast cancer according to the trialists and the OVC were compared. Where necessary, clinical records were retrieved. After the trialists and the OVC had independently investigated each case in the two lists, a classification scheme of the differences was developed by the JRC (Table 1). The records of breast cancer cases and deaths according to the local endpoint committee were compared with those of the OVC using the Swedish National Registration Numbers of the subjects for linkage. The deaths through 1993 were compared for W-county and through 1996 for E-county, as these were the end dates of the most recent Swedish Overview analyses to include each county.8,14 The JRC reviewed each disagreement between the two datasets with respect to either case definition or cause of death, and determined the reasons for each individual disagreement. As a final result, the trialists also accepted some women as additional cancer cases in their trials on the basis of new information about migrated women and clerical errors. The original datasets with these additions are called the JRC conclusive dataset.

    Paired significance tests between OVC and W-E endpoints were carried out using McNemar methods.15 Associations of the likelihood of disagreement with age, county and trial arm were assessed using the chi-squared test. Relative risks and their 95% confidence intervals were calculated using Poisson regression.
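As an illustration of the paired comparison, the sketch below applies an uncorrected McNemar test to the discordant death classifications reported for W-county in Table 2 (both arms combined). The source does not say which McNemar variant was used, so the choice of the uncorrected chi-squared form is an assumption, as is the helper name `mcnemar`.

```python
import math


def mcnemar(b: int, c: int) -> tuple[float, float]:
    """Uncorrected McNemar chi-squared statistic (1 df) and its p-value.

    b and c are the two discordant counts of a paired 2x2 table.
    """
    if b + c == 0:
        return 0.0, 1.0
    stat = (b - c) ** 2 / (b + c)
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# W-county, both arms combined (Table 2): among deaths included by both
# sources, 20 were breast cancer deaths for the trial only and 4 for the
# OVC only.
stat, p_value = mcnemar(20, 4)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
```

Only the discordant pairs enter the statistic; the 216 + 203 deaths on which both sources agree carry no information about the direction of disagreement.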

    Table 1 Classification of potential differences between the W-E trial and the overview

    Category A. Difference in the definition of age (accounts for differences at both ends of the age spectrum)
      W-E trial: Age calculation was based on the year of randomization and the year of birth of the trial attendee.
      Overview: Age calculation was based on exact date of birth and randomization (day/month/year).

    Category B. Definition of date of diagnosis (accounts for differences at trial start and at trial end)
      W-E trial: Date of operation. Women who had a screening or clinical diagnosis at the end of the trial, before closing date, but operated after the closing date, have been included.
      Overview: Date of notification to the NCR, which, according to registry principles, is the first notification to the NCR of a cancer (often the date of a positive cytology).

    Category C. Difference in the principles of use of causes of death registry
      W-E trial: Included only cases diagnosed within the trial period.
      Overview: Included cases diagnosed within the trial period and breast cancer deaths registered in the CDR even if they were not registered at the NCR (thus the diagnosis could have occurred before trial start, during the trial period or after the trial ended).

    Category D. Cases not retrievable from the NCR at the time of the overview
      W-E trial: Included all cases, including cancer in situ, during the trial period when there was clinical information available on a breast cancer, even if a case was not registered at the NCR due to administrative errors, and excluded all cases with a history of breast cancer before the study started.
      Overview: Included only invasive cases retrievable from the NCR or identified at the CDR at the time when the overview was conducted, but could not exclude those cases that were diagnosed before the start of the NCR in 1958.

    Category G. Differences in the determination of cause of death
      W-E trial: Cause of death was determined by the local trial committee based on data available in patients' medical records.
      Overview: 1st OVC used an independent end-point committee; the 2nd OVC used cause of death data registered in the CDR.

    Category I. Erroneous inclusion in the W-E database
      Explanation: Clerical error or incorrect national registration number.

    Category K. Miscellaneous clerical errors and other reasons
      Explanation: Includes misspecification of eligibility or cause of death due to clerical error, erroneous registration in NCR or CDR or administrative loss of information. Also includes individual migration, where a subject received a breast cancer diagnosis outside the study areas, and was therefore in the overview but missing from the W-E database.

    NCR, National cancer register; CDR, National cause of death register

    RESULTS

    According to the W-E trial records the total numbers of women for W-county were 38,589 in the ASP and 18,582 in the PSP; for E-county the numbers were 38,481 and 37,403. The corresponding figures in the OVC records were 38,562, 18,478, 38,405 and 37,145. These differences, of the order of less than 1%, were not influential in the estimation of the primary results.

    W-county

    Table 2 shows all cases included in either the local trial records for W-county or the 1st OVC records or both, with the endpoint in each data set cross-tabulated. Of the 1053 cases included in the local trial records, the OVC included 925 cases (88%). Conversely, of the 972 cases included in the OVC records, the local trial included 925 breast cancer cases (95%). Of the 443 deaths to 1993 included in both datasets, there were 24 (5%) disagreements regarding determination of cause of death (type G disagreement). Of the total 199 disagreements, whether with respect to case inclusion or to cause of death, 175 (88%) pertained to case inclusion rather than cause of death. For both the ASP and the PSP, the overview was less likely to classify a death as from breast cancer. The magnitude of this tendency did not differ significantly between ASP and PSP.

    Table 3 shows the reasons for disagreement between the two breast cancer-case datasets, in the ASP and PSP separately. The largest group of disagreements in the ASP was type D, mainly due to women with screen-detected in situ lesions included in the W-E dataset but not included by the overview. In the PSP, most of the disagreements were of type B, relating to date of diagnosis. These disagreements mostly resulted from women in the PSP who were diagnosed at the first screen but, through delays in reporting, were not entered into the NCR until after closure of the trial. These women were considered by the OVC to have been diagnosed only at the reporting date to the register and were thus excluded in the OVC (see Table 1, category B). Disagreements with respect to death from breast cancer were mainly due to category G (47%; disagreement about cause of death) and C (29%; use of cause of death register without reference to date of diagnosis) in the ASP, and to G (42%; disagreement about cause of death) and B (32%; definition of date of diagnosis) in the PSP.
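The counts quoted above for W-county can be re-derived from the "Total" block of Table 2. The following sketch does so; the variable names are illustrative, and the matrix is transcribed from the table with rows as 1st OVC outcomes and columns as W-county trial outcomes.

```python
# Total block of Table 2. Rows: 1st OVC outcome; columns: W-county trial outcome.
# Order in both directions: breast cancer death, death from other causes,
# alive, not included.
table2_total = [
    [216,   4,   0, 20],
    [ 20, 203,   0, 15],
    [  0,   0, 482, 12],
    [  9,  17, 102,  0],
]

INCLUDED = range(3)  # first three categories mean "included as a case"
NOT_INCL = 3

trial_cases = sum(row[j] for row in table2_total for j in INCLUDED)
ovc_cases = sum(table2_total[i][j] for i in INCLUDED for j in range(4))
in_both = sum(table2_total[i][j] for i in INCLUDED for j in INCLUDED)

# Deaths included by both sources: the 2x2 corner of the two death categories.
deaths_in_both = sum(table2_total[i][j] for i in range(2) for j in range(2))
cause_disagreements = table2_total[0][1] + table2_total[1][0]

# Inclusion disagreements: a case for one source but not for the other.
# The bottom-right cell is zero by construction, so adding the last row and
# the last column does not double count anything.
inclusion_disagreements = (sum(table2_total[NOT_INCL])
                           + sum(row[NOT_INCL] for row in table2_total))

print(trial_cases, ovc_cases, in_both)
print(deaths_in_both, cause_disagreements)
print(inclusion_disagreements, inclusion_disagreements + cause_disagreements)
```

The same bookkeeping applies to Table 5 for E-county, where a third kind of disagreement (vital status) also has to be counted.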

    Table 4 shows the breast cancer deaths and corresponding relative risks (RR) from the W-arm of the W-E trial, the OVC, and those derived after review of all information by the JRC and the resulting conclusive dataset (i.e. the original trial data plus correction for the clerical errors and cases lost to the trialists due to migration). The OVC result is more conservative than that of the original trial and the result based on the JRC conclusive dataset. All analyses show a significant mortality reduction in the ASP.
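The relative risks in Table 4 can be approximated from the printed counts. The sketch below is a crude calculation, not the authors' Poisson regression: it assumes the death counts are Poisson and uses a Wald interval on the log scale, so the interval limits may differ slightly from the published ones. The function name `rate_ratio` is hypothetical.

```python
import math


def rate_ratio(deaths_asp: int, n_asp: int, deaths_psp: int, n_psp: int,
               z: float = 1.96) -> tuple[float, float, float]:
    """Crude RR (ASP vs PSP) with a Wald interval on the log scale.

    Treats the death counts as Poisson, so SE(log RR) = sqrt(1/a + 1/b).
    """
    rr = (deaths_asp / n_asp) / (deaths_psp / n_psp)
    se = math.sqrt(1 / deaths_asp + 1 / deaths_psp)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)


# Counts from Table 4 (W-county); denominators are the numbers of subjects.
for label, a, b in [("W original", 135, 110), ("OVC", 141, 99), ("JRC", 136, 111)]:
    rr, lo, hi = rate_ratio(a, 38589, b, 18582)
    print(f"{label}: RR = {rr:.2f} ({lo:.2f} to {hi:.2f})")
```

Because the denominators are large and the events few, the width of the interval is driven almost entirely by the two death counts, which is why a shift of a few deaths between arms matters.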

    Table 2 W-county outcomes tabulated against overview outcomes (agreements, set in bold in the original, are the cells in which both sources give the same outcome)

                                  W-county outcome
    Study group  1st OVC outcome  Incl, BCD  Incl, DOC  Incl, alive  Not incl  Total
    PSP          Incl, BCD               95          0            0         4     99
                 Incl, DOC                8         50            0         0     58
                 Incl, alive              0          0          138         0    138
                 Not incl                 7          7           54         0     68
                 Total                  110         57          192         4    363
    ASP          Incl, BCD              121          4            0        16    141
                 Incl, DOC               12        153            0        15    180
                 Incl, alive              0          0          344        12    356
                 Not incl                 2         10           48         0     60
                 Total                  135        167          392        43    737
    Total        Incl, BCD              216          4            0        20    240
                 Incl, DOC               20        203            0        15    238
                 Incl, alive              0          0          482        12    494
                 Not incl                 9         17          102         0    128
                 Total                  245        224          584        47   1100

    Incl, included; BCD, breast cancer death; DOC, death from other causes; PSP, passive study population, not invited; ASP, active study population, invited

    Table 3 Categorized disagreements between W-county trial records and 1st OVC records

    Number (%) of disagreements in cases
    Category  ASP        PSP        Total
    A         14 (12)    3 (4)      17 (8)
    B         5 (4)      52 (65)    57 (29)
    C         13 (11)    2 (2)      15 (7)
    D         51 (43)    12 (15)    63 (32)
    G         16 (13)    8 (10)     24 (12)
    I         0 (0)      0 (0)      0 (0)
    K         20 (17)    3 (4)      23 (12)
    Total     119 (100)  80 (100)   199 (100)

    Number (%) of disagreements for breast cancer death
    Category  ASP        PSP        Total
    A         2 (6)      0 (0)      2 (4)
    B         0 (0)      6 (32)     6 (11)
    C         10 (29)    2 (11)     12 (23)
    D         2 (6)      1 (5)      3 (6)
    G         16 (47)    8 (42)     24 (45)
    I         0 (0)      0 (0)      0 (0)
    K         4 (12)     2 (11)     6 (11)
    Total     34 (100)   19 (100)   53 (100)

    PSP, passive study population, not invited; ASP, active study population, invited

    Table 4 Trial mortality result for W-county from original local trial endpoint, 1st OVC endpoint and the JRC conclusive dataset

                                                ASP      PSP      RR (95% CI)
    W original breast cancer deaths             135      110      0.59 (0.45–0.76)
    OVC breast cancer deaths                    141      99       0.69 (0.53–0.90)
    JRC conclusion breast cancer deaths for W   136      111      0.59 (0.45–0.76)
    Number of subjects                          38,589   18,582

    PSP, passive study population, not invited; ASP, active study population, invited

    E-county

    Table 5 shows cross-tabulation of the local trial endpoint records for E-county with the 2nd OVC records, for all women with breast cancer in either or both datasets. Of the 1415 women with breast cancer included in the local trial records, the 2nd OVC included 1298 (92%). Of the 1398 cases included in the 2nd OVC records, the local trial included 1298 (93%). Of the 655 deaths to 1996 included in both datasets, there were 53 (8%) disagreements. Of the total 279 disagreements, 217 (78%) pertained to case inclusion, 53 (19%) to cause of death and 9 (3%) to vital status at 31 December 1996. For both the ASP and the PSP, the 2nd OVC was less likely to classify a death as from breast cancer. This tendency was significantly stronger in the PSP (59% vs. 52%; P = 0.03).

    The reasons for disagreements are shown in Table 6. As with W-county, the largest group, 40% of the disagreements in the ASP, is of type D, absence of trial cases from the NCR. For the PSP, however, similar proportions of disagreements were observed in categories D (19%), absence of the case from the NCR, G (22%), disagreement about cause of death, and K (26%), miscellaneous clerical errors and other reasons. With respect to breast cancer death, disagreements were dominated by category G (disagreement about cause of death) and C (use of cause of death register without reference to date of diagnosis) in both the ASP and PSP.

    Table 7 shows the E-county trial result with respect to breast cancer mortality using the original trial endpoint, the 2nd OVC endpoint and the conclusive endpoint after review of all sources by the JRC (i.e. the original trial data plus correction for the clerical errors and cases lost to the trialists due to migration). The trial endpoint and the JRC conclusive dataset both show a significant 20–23% reduction in mortality, whereas the 2nd OVC result shows a non-significant 10% reduction.

    Associations with disagreement

    We also investigated whether study group (ASP/PSP), county or age were significantly related to the likelihood of disagreement about breast cancer death. In the 685 cases classified as breast cancer death by either the W and E local committees or the OVC or both, there was no significant association of study group with disagreement (P = 0.2). There was a higher proportion of disagreement in E-county than in W-county, but this did not attain statistical significance (P = 0.09). There was, however, a significant effect of patient age at the time of randomization on the probability of disagreement (P < 0.001). In both counties, the disagreement increased with age (Figure 1).

    Table 5 E-county outcomes tabulated against 2nd OVC outcomes (agreements, set in bold in the original, are the cells in which both sources give the same outcome)

                                  E-county outcome
    Study group  2nd OVC outcome  Incl, BCD  Incl, DOC  Incl, alive  Not incl  Total
    PSP          Incl, BCD              164          4            2        19    189
                 Incl, DOC               27        128            3        20    178
                 Incl, alive              0          1          296        10    307
                 Not incl                 9          8           41         0     58
                 Total                  200        141          342        49    732
    ASP          Incl, BCD              147          8            0        20    175
                 Incl, DOC               14        163            2        21    200
                 Incl, alive              0          1          338        10    349
                 Not incl                 2         13           44         0     59
                 Total                  163        185          384        51    783
    Total        Incl, BCD              311         12            2        39    364
                 Incl, DOC               41        291            5        41    378
                 Incl, alive              0          2          634        20    656
                 Not incl                11         21           85         0    117
                 Total                  363        326          726       100   1515

    Incl, included; BCD, breast cancer death; DOC, death from other causes; PSP, passive study population, not invited; ASP, active study population, invited

    Table 6 Categorized disagreements between E-county trial records and 2nd OVC records

    Number (%) of disagreements in cases
    Category  ASP        PSP        Total
    A         21 (16)    17 (12)    38 (14)
    B         0 (0)      18 (12)    18 (6)
    C         15 (11)    10 (7)     25 (9)
    D         54 (40)    28 (19)    83 (30)
    G         22 (16)    31 (22)    52 (19)
    I         3 (2)      3 (2)      6 (2)
    K         20 (15)    37 (26)    57 (20)
    Total     135 (100)  144 (100)  279 (100)

    Number (%) of disagreements for breast cancer death
    Category  ASP        PSP        Total
    A         5 (12)     3 (5)      8 (7)
    B         0 (0)      4 (7)      4 (4)
    C         15 (34)    10 (16)    25 (24)
    D         1 (2)      4 (7)      5 (5)
    G         22 (50)    30 (49)    52 (49)
    I         1 (2)      0 (0)      1 (1)
    K         0 (0)      10 (16)    10 (10)
    Total     44 (100)   61 (100)   105 (100)

    PSP, passive study population, not invited; ASP, active study population, invited

    Table 7 Trial mortality result for E-county from original local trial endpoint, 2nd OVC endpoint and the JRC conclusive dataset

                                                ASP      PSP      RR (95% CI)
    Original E breast cancer deaths             163      200      0.80 (0.64–0.98)
    OVC breast cancer deaths                    175      189      0.90 (0.72–1.12)
    JRC conclusion breast cancer deaths for E   162      206      0.77 (0.62–0.95)
    Number of subjects                          38,309   37,403

    PSP, passive study population, not invited; ASP, active study population, invited

    Figure 1 Percentage disagreement between W-E and 2nd OVC by age, in 604 cases classed as breast cancer deaths by one or both sources

    DISCUSSION

    In this study, the Swedish Cancer Society's Joint Review Committee (JRC) investigated disagreements between the breast cancer incidence and death data as recorded in the original Swedish Two-County Trial, based on individual patient records and determination of cause of death by an expert committee, and those in the 2nd OVC, based on the National Cancer Registry and Cause of Death Register. For the purposes of this study, we had full access to original W-E trial data, original data collected for the OVC, individual medical records, and register data from the regional tumour registries for the respective counties. The registration of new diagnoses of breast cancer is mandatory by law in Sweden and the completeness of registration of breast cancer is over 98%.16 Thus, we were able to determine the reason for discrepancy in every individual case and no discrepancies were left unexplained.

    Our main empirical finding is that, of the 2615 cases included by the W-E Trial or the OVC, there were 478 (18%) disagreements about inclusion/exclusion of women into the trial or determination of the cause of death. The vast majority of these pertained to a disagreement in inclusion/exclusion and not to disagreement in determination of cause of death. The disagreements were in the great majority of cases due to OVC study design decisions pertaining to issues such as definition of age and last date of inclusion into the study, and use of a register rather than clinical records for case definition and cause of death determination. Disagreement about whether a death included in both the W-E Trial and the OVC was from breast cancer or not was relatively rare. We also found that the likelihood of disagreement about the cause of death was not significantly affected by county or trial arm. Such disagreement was, however, significantly more likely in older patients. These findings have implications both for the interpretation of screening effects and for methodological issues in overviewing original research.

    The combined results of the two counties showed a significant breast cancer mortality reduction associated with the offer of screening by any of the three endpoint criteria. Using the JRC conclusions, the combined RR was 0.69 (95% CI 0.58–0.83). Thus, the overall interpretation was not sensitive to these differences in design. In W-county, the result was significant by any of the three criteria, whereas in E-county, the result was significant using the original trial endpoint and the JRC conclusive review endpoint, but not statistically significant using the 2nd OVC endpoint. The JRC conclusive result included some women with breast cancer previously missed by the trialists due to migration, but picked up by the NCR or CDR.
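A combined estimate of this kind can be approximated by pooling the two counties as strata. The sketch below uses a Mantel-Haenszel risk ratio, which is a swapped-in technique and not the authors' Poisson regression; it uses the JRC death counts and the numbers of subjects as printed in Tables 4 and 7, and the function name is hypothetical.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio over (a, n1, b, n0) strata.

    a, b: breast cancer deaths in ASP and PSP; n1, n0: women in ASP and PSP.
    """
    num = sum(a * n0 / (n1 + n0) for a, n1, b, n0 in strata)
    den = sum(b * n1 / (n1 + n0) for a, n1, b, n0 in strata)
    return num / den


# JRC conclusive counts and denominators as printed in Tables 4 and 7.
jrc = [
    (136, 38589, 111, 18582),  # W-county
    (162, 38309, 206, 37403),  # E-county
]
print(f"pooled RR = {mantel_haenszel_rr(jrc):.2f}")
```

Stratifying by county matters here because the ASP:PSP allocation ratio differed between counties (about 2:1 in W-county, about 1:1 in E-county), so simply adding the counts across counties would mix two different designs.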

    The remit of the JRC was not to determine whether one or the other of the endpoints was correct. However, it is clear from the E-county results that a combination of differing cause of death determinations and inclusion/exclusion rules made a crucial difference to the primary result. It is highly relevant for the field of secondary prevention to understand how such modest disagreements can cause such a difference to the outcome in a trial with a total of 133,065 subjects. The answer is that the disagreements only needed to affect the small minorities of subjects classified as dying from breast cancer within the trial arm subgroups of one geographical stratum (E-county) within the larger trial. In the ASP of E-county, disagreements with respect to cause of death and eligibility for inclusion caused a loss of 16 and a gain of 28 breast cancer deaths, a net increase of 12 breast cancer deaths. In the PSP, there was a loss of 36 breast cancer deaths and a gain of 25, a net loss of 11 deaths (Table 5). Thus the 2nd OVC classification of eligibility for inclusion and cause of death gave a 7% higher death rate in the ASP and a 6% lower death rate in the PSP, sufficient to convert a statistically significant 20% reduction in mortality to a statistically non-significant 10% reduction. It should be noted that if the inclusion criteria had been identical and the only difference had been the disagreements over cause of death, the result in E-county would still have been rendered non-significant.
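The gains and losses quoted for E-county follow directly from the off-diagonal cells of Table 5. A minimal sketch of that arithmetic, with illustrative names and the ASP and PSP blocks transcribed from the table:

```python
# Table 5 blocks. Rows: 2nd OVC outcome; columns: E-county trial outcome.
# Order: breast cancer death (BCD), death from other causes, alive, not included.
table5 = {
    "ASP": [[147, 8, 0, 20], [14, 163, 2, 21], [0, 1, 338, 10], [2, 13, 44, 0]],
    "PSP": [[164, 4, 2, 19], [27, 128, 3, 20], [0, 1, 296, 10], [9, 8, 41, 0]],
}


def bcd_shift(block):
    """Breast cancer deaths gained and lost when moving from trial to OVC."""
    gained = sum(block[0][1:])               # OVC says BCD, trial does not
    lost = sum(row[0] for row in block[1:])  # trial says BCD, OVC does not
    trial_bcd = block[0][0] + lost
    ovc_bcd = block[0][0] + gained
    return gained, lost, trial_bcd, ovc_bcd


for arm, block in table5.items():
    gained, lost, trial_bcd, ovc_bcd = bcd_shift(block)
    change = 100 * (ovc_bcd - trial_bcd) / trial_bcd
    print(f"{arm}: +{gained} -{lost}, {trial_bcd} -> {ovc_bcd} ({change:+.1f}%)")
```

The two arms move in opposite directions, so the shifts compound in the relative risk rather than cancelling.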

    The effect of misclassification of exposure factors has been extensively studied in epidemiology,17–19 and when it is non-differential with respect to disease outcome, it tends to dilute estimated effects. Although less fully researched, the misclassification of outcome has also been shown to cause underestimation of exposure/outcome associations.20 Disagreement rates between OVC and W-E classifications were 18% in both counties. Discrepancies of this magnitude are suggestive of misclassification probabilities of 10%, and would be likely to lead to dilution of observed associations by approximately 33%.21 The differences between W-E and OVC are smaller than this for W-county and rather larger for E-county. That they are proportionally larger for E-county is likely to be due to the fact that disagreement rates were differential between trial arms. The implications of this are that in general, the poorer the classification, the greater the potential for missing a true effect, that the presence of differential misclassification may increase the potential bias, and that the more thorough the classification effort, the more sensitive the comparison is likely to be.
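The dilution described above can be illustrated with a minimal sketch of non-differential misclassification of a binary outcome. This is a generic textbook model, not the model of reference 21, and it does not reproduce the figure of approximately 33% quoted in the text. The risks, sensitivity and specificity below are hypothetical values chosen only to show the direction of the effect.

```python
def observed_risk(true_risk: float, sensitivity: float, specificity: float) -> float:
    """Proportion classified as events when the outcome is misclassified.

    True events are detected with probability `sensitivity`;
    true non-events are wrongly counted with probability 1 - specificity.
    """
    return sensitivity * true_risk + (1 - specificity) * (1 - true_risk)

# Hypothetical true risks in two trial arms (illustrative only)
risk_control = 0.30
risk_invited = 0.24

# Hypothetical 10% misclassification in each direction, identical in both arms
sensitivity = 0.90
specificity = 0.90

true_difference = risk_control - risk_invited
observed_difference = (
    observed_risk(risk_control, sensitivity, specificity)
    - observed_risk(risk_invited, sensitivity, specificity)
)

# Under this model the risk difference shrinks by (sensitivity + specificity - 1)
attenuation = observed_difference / true_difference

print(f"True risk difference:     {true_difference:.3f}")
print(f"Observed risk difference: {observed_difference:.3f}")
print(f"Attenuation factor:       {attenuation:.2f}")
```

With equal error rates in both arms, the observed difference is always pulled towards zero. If the error rates differ between arms, as the text suggests for E-county, this simple factor no longer applies and the bias can be larger.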

    All these circumstances underline the importance of using an expert panel for determining cause of death when the individual study units contain few events. Others have regarded the determinations of such an expert committee as the gold standard,22 even when they have concluded that national death register information is adequate in comparison.23 The OVC obtained results closer to those of the original trial when the 1st OVC used an expert endpoint committee.2

    The finding that disagreement about cause of death increased with age is also of general interest. It accords with the findings of the 1st OVC, where four clinicians not involved in the trials independently determined cause of death and the discordance at the initial review was 5%, 5%, 13% and 19% in women aged 40–49, 50–59, 60–69 and 70–74 years at randomization, respectively.13 This probably reflects an increasing difficulty in determining cause of death with age, for several reasons: a mixed clinical picture due to increasing co-morbidity, death occurring more often at home or in a nursing home without a clinical examination shortly before death, a very low probability of an autopsy, and increased uncertainty about the origin of any metastases if another malignancy has also been diagnosed during follow-up. With long-term follow-up, the information that the woman is a trial participant, and that determination of cause of death may therefore be important, may also be lost.

    The results of the JRC review show that the disagreements were due to design differences between the clinical intervention trial approach employed in the original W-E Trial and the register-study design used by the OVC. This leads to a more general observation: design decisions in either an original study or a subsequent overview that may at first glance seem trivial, e.g. defining a date for end of trial, can influence basic and important study features such as the number of included subjects. Thus, design differences between original studies and overviews have to be taken into account when the overview does not adhere to the original designs, and it should be investigated whether the interpretation is sensitive to such design conflicts. An example here is the inclusion of women with in situ tumours as cases in the original study, contrasted with the decision to include only those registered with an invasive cancer in the 1st and the 2nd OVC. This decision made an especially large difference for the ASP. Thus, seemingly general deviations from the original study design may not be neutral to the evaluation of the randomized trial. In this case the decision contributed above all to the different number of cases reported in the original trial as compared with the 1st and 2nd OVC, but little to the evaluation of breast cancer mortality.

    Is there a role for registry data in evaluation of primary or secondary interventions? It would definitely seem so where the research involves millions of person-years and large numbers of cause-specific deaths, so that misclassifications are likely to be heavily outnumbered by reliable observations,3,4 as in large prevention and secondary prevention studies. For individual trials of smaller size, however, it is more reliable to have an expert committee determine case status and cause of death individually.

    CONCLUSION

    The following points are suggested by the above results:

    (1) The conclusion that invitation to mammography screening was associated with a significant breast cancer mortality reduction remains robust after a full examination of disagreements between the original Two-County Trial endpoints and those of the Swedish overview. Disagreements about actual cause of death were a minority of the overall disagreements and were common only for older cases; the majority of disagreements related to inclusion or exclusion.

    (2) The use of the overview inclusion criteria and the national registry data for determination of breast cancer deaths led to a substantial change in the result for one of the two counties, illustrating that non-differential misclassification of the main endpoint tends to drive results towards the null.

    (3) Thus, for trials with modest size it would appear to be more prudent to rely on trial logistics with close individual monitoring of case status, presence of covariates and outcome status such as determination of cause of death based on all available clinical information.

    (4) When secondary research does not adhere to the protocols of the primary research projects included, the consequences of such design differences should be investigated and reported. Seemingly trivial design decisions may have significant impact on the result and are not always neutral to the randomized design.

    Authors' affiliations

    L Holmberg, Professor of Cancer Epidemiology, King's College London, Medical School, Division of Cancer Studies, London, UK

    S W Duffy, Professor of Breast Cancer Screening, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Preventive Medicine, London, UK

    A M F Yen, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Preventive Medicine, London, UK

    L Tabár, Professor of Radiology, University of Uppsala, School of Medicine, Department of Mammography, Falun Central Hospital, Falun, Sweden

    B Vitak, Consultant Radiologist, Division of Radiological Sciences, Department of Medical and Health Sciences, Linköping University, Linköping, Sweden

    L Nyström, Associate Professor of Epidemiology, Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden

    J Frisell, Professor of Surgery, Department of Molecular Medicine and Surgery, Unit of Breast Surgery, Karolinska Institute, Solna, Sweden

    ACKNOWLEDGEMENTS

    The study was supported by grants from the Swedish Cancer Society and the American Cancer Society. We thank Sherry Yueh-Hsia Chiu from the Institute of Preventive Medicine, Division of Biostatistics, College of Public Health at the National Taiwan University for excellent help and Robert Smith from the American Cancer Society for valuable discussions and advice.

    Conflict of interest and contributions: The authors are associated with the W-E Trial and the Overview as described in the contributions and otherwise have no conflict of interest in relation to this work.

    Lars Holmberg, Stephen Duffy and Jan Frisell oversaw the comparison and coordinated the analyses. Lars Holmberg and Stephen Duffy drafted the report. Jan Frisell and Lars Holmberg were the principal investigators for the grants that supported the study. László Tabár and Bedrich Vitak were the principal investigators for the W and E trial parts, respectively, and provided all data for the W-E Trial. Jan Frisell and Lennarth Nyström were the principal and the coordinating investigators for the Overview committee, respectively, and Lennarth Nyström provided the Overview data. Stephen Duffy and Amy Yen performed the statistical analyses. All authors had full access to the data, contributed to the comparison process and the interpretation of the analyses, and revised the manuscript for intellectual content. Lars Holmberg is the guarantor for the study.

    REFERENCES

    1 Smith RA, Duffy SW, Gabe R, Tabár L, Yen AMF, Chen HHT. The randomized trials of breast cancer screening: what have we learned? Radiol Clin Nth Amer 2004;42:793–806

    2 Nyström L, Rutqvist LE, Wall S, et al. Breast cancer screening with mammography: overview of the Swedish randomised trials. Lancet 1993;341:973–8

    3 Swedish Organised Service screening Evaluation Group. Reduction in breast cancer mortality from organised service screening with mammography: 1. Further confirmation with extended data. Cancer Epidemiol Biomarkers Prev 2006;15:45–51

    4 Swedish Organised Service screening Evaluation Group. Reduction in breast cancer mortality from organised service screening with mammography: 2. Validation with alternative analytic methods. Cancer Epidemiol Biomarkers Prev 2006;15:52–56

    5 Tabár L, Fagerberg CJ, Gad A, et al. Reduction in mortality from breast cancer after mass screening with mammography. Randomised trial from the Breast Cancer screening Working Group of the Swedish National Board of Health and Welfare. Lancet 1985;325:829–32

    6 Tabár L, Vitak B, Chen HH, Duffy SW, Smith RA. The Swedish Two-County Trial twenty years later: updated mortality results and new insights from long term follow-up. Radiol Clin Nth Amer 2000;38:625–51

    7 Gøtzsche PC, Olsen O. Is screening for breast cancer with mammography justifiable? Lancet 2000;355:129–33

    8 Nyström L, Andersson I, Bjurstam N, Frisell J, Nordenskjöld B, Rutqvist LE. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet 2002;359:909–19

    9 Freedman DA, Petitti DB, Robins JM. On the efficacy of screening for breast cancer. Int J Epidemiol 2004;33:43–55

    10 Duffy SW. Interpretation of the breast screening trials: a commentary on the recent paper by Gøtzsche and Olsen. The Breast 2001;10:209–12

    11 Tabár L, Fagerberg G, Duffy SW, Day NE, Gad A, Gröntoft O. Update of the Swedish two-county program of mammographic screening for breast cancer. Radiol Clin Nth Amer 1992;30:187–210

    12 Duffy SW, Tabár L, Vitak B, et al. The Swedish Two-County Trial of mammographic screening: cluster randomisation and endpoint evaluation. Ann Oncol 2003;39:1746–54

    13 Nyström L, Larsson L-G, Rutqvist LE, et al. Determination of cause of death among breast cancer cases in the Swedish mammography screening trials: a comparison between official statistics and validation by an endpoint committee. Acta Oncol 1995;34:145–52

    14 Larsson LG, Andersson I, Bjurstam N, et al. Updated overview of the Swedish randomised trials on breast cancer screening with mammography: age group 40–49 at randomisation. J Natl Cancer Inst Monogr 1997;22:57–61

    15 McNemar Q. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 1947;12:153–7

    16 Barlow L, Westergren K, Holmberg L, Talbäck M. The completeness of the Swedish Cancer Register - a sample survey for year 1998. Acta Oncol 2009;48:27–33

    17 Freedman LS, Midthune D, Carroll RJ, Kipnis V. A comparison of regression calibration, moment reconstruction and imputation for adjusting for covariate measurement error in regression. Stat Med 2008;27:5195–216

    18 Wong MY, Day NE, Luan JA, Wareham NJ. Estimation of magnitude in gene-environment interactions in the presence of measurement error. Stat Med 2004;23:987–98

    19 Bashir SA, Duffy SW. The correction of risk estimates for measurement error. Ann Epidemiol 1997;7:154–64

    20 Duffy SW, Warwick J, Williams AR, et al. A simple model for potential use with a misclassified binary outcome in epidemiology. J Epidemiol Comm Hlth 2004;58:712–7

    21 Duffy SW, Maximovitch DM, Day NE. External validation, repeat determination, precision of risk estimation in misclassified exposure data in epidemiology. J Epidemiol Comm Hlth 1992;46:620–24

    22 Miller AB. Design of cancer screening trials/randomized trials for evaluation of cancer screening. World J Surg 2006;30:1152–62

    23 Mäkinen T, Karhunen P, Aro J, Lahtela J, Määttänen L, Auvinen A. Assessment of causes of death in a prostate cancer screening trial. Int J Cancer 2008;122:413–17
