Effect Modification & Confounding Kostas Danis EPIET Introductory course, Menorca 2012

Analytical epidemiology

Study design: cohorts & case control & cross-sectional studies

Choice of a reference group

Biases

Impact

Causal inference

Stratification - Effect modification - Confounding

Matching

Multivariable analysis

Cohort studies marching towards outcomes

Cohort study

              Cases   Non-cases   Total   Risk %
Exposed         50        50       100     50 %
Not exposed     10        90       100     10 %

Risk ratio = 50 % / 10 % = 5
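The risk ratio arithmetic of the cohort table can be sketched in a few lines of Python. The function name `risk_ratio` and its argument names are illustrative assumptions, not part of the course material.

```python
def risk_ratio(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = cases_exposed / total_exposed
    risk_unexposed = cases_unexposed / total_unexposed
    return risk_exposed / risk_unexposed


# Counts from the cohort table: 50/100 exposed cases, 10/100 unexposed cases
rr = risk_ratio(50, 100, 10, 100)
print(f"Risk ratio = {rr:.1f}")
```

The same function applies to a prevalence ratio when the counts come from a cross-sectional study.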

[Diagram: cases (exposed and unexposed) arise from the source population; controls are a sample of that population]

Controls: a sample of the denominator, representative with regard to exposure

Controls are non-cases

Low attack rate: non-cases are likely to represent exposure in the source population

High attack rate: non-cases are unlikely to represent exposure in the source population

Case control study

                    Cases   Controls
Exposed               a        b
Not exposed           c        d
Total                a+c      b+d
Odds of exposure     a/c      b/d

Odds ratio: OR = (a/c) / (b/d) = ad / bc

Who are the right controls?

Controls may not be easy to find

Cross-sectional study: Sampling

Target population → Sampling population → Sample

Cross-sectional study

              Cases   Non-cases   Total   Prevalence %
Exposed        500       500      1,000      50 %
Not exposed    100       900      1,000      10 %

Prevalence ratio (PR) = 50 % / 10 % = 5

Should I believe my measurement?

Exposure → Outcome: RR = 4

Chance? Bias? Confounding?

True association: causal or non-causal

[Diagram: Exposure → Outcome, with a third variable linked to both]

Two main complications

(1) Effect modifier: useful information

(2) Confounding factor: bias

To analyse effect modification

To eliminate confounding

Solution = stratification (stratified analysis)

Create strata according to the categories, or ranges of values, taken by the third variable

Effect modification

Variation in the magnitude of measure of effect across levels of a third variable.

Effect modifier

Happens when RR or OR is different between strata (subgroups of population)

Effect modifier

To identify a subgroup with a lower or higher risk ratio

To target public health action

To study interaction between risk factors

Effect modification

[Diagram: Factor A (asbestos) → Disease (lung cancer), with Factor B (smoking) as third variable]

Effect modifier = Interaction

Asbestos (As) and lung cancer (Ca)

Case-control study, unstratified data

Asbestos   Cases (Ca)   Controls   OR
Yes            693         320     4.8
No             307         680     Ref.
Total         1000        1000
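As a minimal sketch, the crude odds ratio for the unstratified asbestos data can be recomputed from the formula OR = ad / bc. The function name `odds_ratio` is an illustrative assumption; a and b are exposed cases and controls, c and d unexposed cases and controls.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    return (a * d) / (b * c)


# Unstratified asbestos / lung cancer data
crude_or = odds_ratio(a=693, b=320, c=307, d=680)
print(f"Crude OR = {crude_or:.1f}")
```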

[Diagram: Asbestos → Lung cancer, with Smoking as third variable]

Asbestos (As), smoking and lung cancer (Ca)

Asbestos   Smoking   Cases   Controls   OR
Yes        Yes        517      160      8.9
Yes        No         176      160      3.0
No         Yes        183      340      1.5
No         No         124      340      Ref.

1.5 × 3.0 < 8.9

1.5 × 3.0 × interaction = 8.9
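The comparison of the joint effect with the product of the single effects can be sketched as follows. This is an illustration on the multiplicative scale only; the helper names and the dictionary layout are assumptions made for the example.

```python
def odds_ratio(cases, controls, ref_cases, ref_controls):
    """OR of one exposure category against the common reference category."""
    return (cases * ref_controls) / (controls * ref_cases)


# Reference category: no asbestos, no smoking
REF_CASES, REF_CONTROLS = 124, 340

or_both = odds_ratio(517, 160, REF_CASES, REF_CONTROLS)      # asbestos and smoking
or_asbestos = odds_ratio(176, 160, REF_CASES, REF_CONTROLS)  # asbestos only
or_smoking = odds_ratio(183, 340, REF_CASES, REF_CONTROLS)   # smoking only

# Joint OR expected if the two effects simply multiply
expected_joint = or_asbestos * or_smoking
# Factor by which the observed joint OR departs from that expectation
interaction = or_both / expected_joint

print(f"OR asbestos only = {or_asbestos:.1f}, OR smoking only = {or_smoking:.1f}")
print(f"Expected joint OR = {expected_joint:.1f}, observed joint OR = {or_both:.1f}")
print(f"Interaction factor = {interaction:.1f}")
```

An interaction factor well above 1 means the combined exposure carries more risk than the product of the separate effects would predict.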

Physical activity and MI

[Diagram: Physical activity → Infarction, with Gender as third variable]

Vaccine efficacy

$$VE = \frac{AR_U - AR_V}{AR_U}$$

$$VE = 1 - RR$$

(ARU = attack rate in the unvaccinated, ARV = attack rate in the vaccinated)

Vaccine efficacy

Status   Pop.      Cases   Cases per 1000   RR
V        301,545    150        0.49         0.28
NV       298,655    515        1.72         Ref.
Total    600,200    665        1.11

VE = 1 - RR = 1 - 0.28 = 72 %
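A short sketch of the vaccine efficacy calculation, using the population and case counts from the table. The function name is an illustrative assumption. Because the slide works from rounded rates, the unrounded result may differ from the slide's figure by about one percentage point.

```python
def vaccine_efficacy(cases_vacc, pop_vacc, cases_unvacc, pop_unvacc):
    """VE = (ARU - ARV) / ARU, which equals 1 - RR."""
    ar_v = cases_vacc / pop_vacc      # attack rate in the vaccinated
    ar_u = cases_unvacc / pop_unvacc  # attack rate in the unvaccinated
    return (ar_u - ar_v) / ar_u


ve = vaccine_efficacy(150, 301_545, 515, 298_655)
rr = (150 / 301_545) / (515 / 298_655)

print(f"RR = {rr:.2f}")
print(f"VE = {ve:.1%}")
```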

[Diagram: Vaccine → Disease, with Age as third variable]

Vaccine efficacy by age group

Effect modification

Different effects (RR) in different strata (age groups)

VE is modified by age

Test for homogeneity among strata (Woolf test)

Any statistical test to help us?

• Tests of homogeneity: Breslow-Day test, Woolf test

• Test for trends: Chi square

How to conduct a stratified analysis?

Crude analysis

Stratified analysis

1. Do stratum-specific estimates look different?

2. Do the 95% CIs of the OR/RR NOT overlap?

3. Is the test of homogeneity significant?

YES: EFFECT MODIFICATION (report estimates by stratum)

NO: check for confounding (compare crude RR/OR with MH RR/OR)

Stratified analysis: Effect Modification

ORs / RRs different across strata

– 95% C.I. of ORs / RRs do not overlap: effect modification

– 95% C.I. of ORs / RRs do overlap: use Woolf's test

   • Woolf's test significant: effect modification

   • Woolf's test not significant: effect modification unlikely; discuss lack of power of Woolf's test

Death from diarrhea according to breast feeding, Brazil, 1980s (crude analysis)

                    Cases   Controls   OR (95% CI)
No breast feeding    120      136      3.6 (2.4-5.5)
Breast feeding        50      204      Ref

[Diagram: No breast feeding → Diarrhoea, with Age as third variable]

Death from diarrhea according to breast feeding, Brazil, 1980s (stratified analysis)

Infants < 1 month of age

                    Cases   Controls   OR (95% CI)
No breast feeding     10       3       32 (6-203)
Breast feeding         7      68       Ref

Infants ≥ 1 month of age

                    Cases   Controls   OR (95% CI)
No breast feeding    110      133      2.6 (1.7-4.1)
Breast feeding        43      136      Ref

Woolf test (test of homogeneity): p = 0.03
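A minimal sketch of a Woolf-type test of homogeneity for odds ratios, applied to the two age strata. It uses the textbook inverse-variance formulation on the log OR scale; statistical packages implement several variants and corrections, so the p-value from this sketch need not equal the one reported on the slide. The function names are assumptions, and the p-value helper covers only the two-stratum case (1 degree of freedom).

```python
import math


def woolf_homogeneity(strata):
    """Woolf chi-square for homogeneity of ORs across strata.

    strata: list of (a, b, c, d) with
        a exposed cases, b exposed controls,
        c unexposed cases, d unexposed controls.
    Returns (chi-square statistic, degrees of freedom).
    """
    log_ors, weights = [], []
    for a, b, c, d in strata:
        log_ors.append(math.log((a * d) / (b * c)))
        # Weight = inverse of the variance of the log OR
        weights.append(1.0 / (1 / a + 1 / b + 1 / c + 1 / d))
    pooled = sum(w * x for w, x in zip(weights, log_ors)) / sum(weights)
    chi2 = sum(w * (x - pooled) ** 2 for w, x in zip(weights, log_ors))
    return chi2, len(strata) - 1


def p_value_one_df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# (no breast feeding cases, controls, breast feeding cases, controls)
age_strata = [
    (10, 3, 7, 68),       # infants < 1 month
    (110, 133, 43, 136),  # infants >= 1 month
]
chi2, df = woolf_homogeneity(age_strata)
p = p_value_one_df(chi2)
print(f"Woolf chi2 = {chi2:.2f} on {df} df, p = {p:.4f}")
```

A small p-value supports reporting the stratum-specific ORs rather than a single pooled estimate.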

Risk of gastroenteritis by exposure, Outbreak X, Place, time X (crude analysis)

            Exposed: Yes      Exposed: No
Exposure    n     AR (%)*     n     AR (%)*    RR† (95% CI‡)
Pasta       94      77         7     4.2       18.0 (8.8-38)
Tuna        49      68        49    24          2.9 (2.1-3.8)

* AR = Attack Rate
† RR = Risk Ratio
‡ 95% CI = 95% confidence interval of the RR

[Diagram: Tuna → Gastroenteritis, with Pasta as third variable]

Risk of gastroenteritis by exposure, Outbreak X, Place, time X (stratified analysis)

Pasta Yes

           Cases   Total   AR (%)   RR (95% CI)
Tuna         43      52      83     1.1 (0.9-1.3)
No tuna      46      60      77     Ref

Pasta No

           Cases   Total   AR (%)   RR (95% CI)
Tuna          4      17      24     11 (2.6-46)
No tuna       3     144       2     Ref

Woolf test (test of homogeneity): p = 0.0007

Tuna, pasta and gastroenteritis

Tuna   Pasta   Cases   AR (%)   RR
Yes    Yes       43      83     42
Yes    No         4      23     12
No     Yes       46      76     38
No     No         3       2     Ref.

38 × 12 > 42

38 × 12 × interaction = 42

Risk of HIV by injecting drug use (idu), surveillance data, Spain, 1988-2004

          Cases   Total    AR (%)   RR (95% CI)
Idu        268     2,732     9.8    3.9 (3.3-4.4)
No idu     484    18,822     2.5    Ref

[Diagram: Idu → HIV, with Gender as third variable]

Risk of HIV by injecting drug use (idu), Spain, 1988-2004 (stratified analysis)

Males

          Cases   Total    AR (%)   RR (95% CI)
Idu         86      693     12      20 (14-28)
No idu      52    8,306      0.6    Ref

Females

          Cases   Total    AR (%)   RR (95% CI)
Idu        182    2,039      8.9    2.3 (1.9-2.6)
No idu     432   10,576      4.1    Ref

Woolf test (test of homogeneity): p = 0.00000

Idu, gender and hiv

Idu   Male   Cases   AR (%)   RR
Yes   Yes      86     12.4    3.0
Yes   No      182      8.9    2.2
No    Yes      52      0.6    0.14
No    No      432      4.1    Ref.

0.14 × 2.2 < 3.0

0.14 × 2.2 × interaction = 3.0

Confounding

Confounding

Distortion of measure of effect because of a third factor

Should be prevented

Needs to be controlled for

Confounding

[Diagram: Skate-boarding → Chlamydia, with Age as third variable]

Age not evenly distributed between the 2 exposure groups

– Skate-boarders: 90% young

– Non skate-boarders: 20% young

[Diagram: Exposure (coffee) → Outcome (lung cancer), with third variable (smoking)]

[Diagram: Grey hair → Stroke, with Age as third variable]

[Figure: Cases of Down syndrome by birth order. x-axis: birth order (1 to 5); y-axis: cases per 100 000 live births]

[Figure: Cases of Down syndrome by age group. x-axis: age groups (< 20, 20-24, 25-29, 30-34, 35-39, 40+); y-axis: cases per 100 000 live births]

[Diagram: Birth order → Down syndrome, with Age of mother as third variable]

[Figure: Cases of Down syndrome by birth order and mother's age. x-axis: birth order (1 to 5); y-axis: cases per 100 000]

Confounding

Exposure Outcome

Third variable

To be a confounding factor, 2 conditions must be met:

Be associated with exposure - without being the consequence of exposure

Be associated with outcome - independently of exposure

[Diagram: Exposure (hypercholesterolaemia) → Third factor (atheroma) → Outcome (myocardial infarction)]

Any factor which is a necessary step in the causal chain is not a confounder

[Diagram: Salt → Hypertension → Myocardial infarction]

The nuisance introduced by confounding factors

• May simulate an association

• May hide an association that does exist

• May alter the strength of the association

– Increased

– Decreased

Confounding factor

[Diagram: Ethnicity → Pneumonia, with Crowding as third variable: apparent association]

[Diagram: Crowding → Pneumonia, with Malnutrition as third variable: altered strength of association]

How to prevent/control confounding?

Prevention

– Randomization (experiment)

– Restriction to one stratum

– Matching

Control

– Stratified analysis

– Multivariable analysis

Are Mercedes more dangerous than Porsches?

Type       Total   Accidents   AR %   RR
Porsche    1 000      300       30    1.5
Mercedes   1 000      200       20    Ref.
Total      2 000      500       25

95% CI = 1.3 - 1.8
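The crude risk ratio and its 95% confidence interval can be approximated with the usual log-scale (Wald-type) method, sketched below. The choice of this method is an assumption; the slide does not state how its interval was obtained, and other methods give slightly different limits.

```python
import math


def risk_ratio_ci(cases_exp, total_exp, cases_unexp, total_unexp, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale."""
    rr = (cases_exp / total_exp) / (cases_unexp / total_unexp)
    # Standard error of ln(RR)
    se = math.sqrt(1 / cases_exp - 1 / total_exp
                   + 1 / cases_unexp - 1 / total_unexp)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Porsche (exposed) versus Mercedes (reference)
rr, lower, upper = risk_ratio_ci(300, 1000, 200, 1000)
print(f"RR = {rr:.1f} (95% CI {lower:.1f} - {upper:.1f})")
```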

[Diagram: Car type → Accidents, with confounding factor: Age of driver]

Crude RR = 1.5

Adjusted RR = 1.1 (0.94 - 1.27)

Incidence of malaria according to the presence of a radio set, Kahinbhi Pradesh

Crude data   Malaria   Total   AR %   RR
Radio set       80       520    15    0.7
No radio       220      1080    20    Ref

RR: 0.7; 95% CI: 0.6 - 0.9; p < 0.02

[Diagram: Radio → Malaria, with confounding factor: Mosquito net]

Crude RR = 0.7

Adjusted RR = 1.01

To identify confounding

Compare the crude measure of effect (RR or OR)

to

the adjusted (weighted) measure of effect (Mantel-Haenszel RR or OR)

A difference of more than 10 - 20 % between the two suggests confounding
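The crude-versus-adjusted comparison can be sketched as a relative change, here applied to the crude and adjusted RRs quoted in the car and radio examples. The function names and the default threshold of 10 % are assumptions for illustration; the 10 - 20 % range is a rule of thumb, not a statistical test.

```python
def relative_change(crude, adjusted):
    """Relative change from the crude to the adjusted estimate, in percent."""
    return (adjusted - crude) / crude * 100.0


def suggests_confounding(crude, adjusted, threshold_pct=10.0):
    """True when the absolute relative change exceeds the chosen threshold."""
    return abs(relative_change(crude, adjusted)) > threshold_pct


radio_change = relative_change(crude=0.7, adjusted=1.01)  # radio set and malaria
car_change = relative_change(crude=1.5, adjusted=1.1)     # car type and accidents

print(f"Radio / malaria: {radio_change:+.1f} %")
print(f"Car type / accidents: {car_change:+.1f} %")
```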

Any statistical test to help us?

When is ORMH different from crude OR ?

Mantel-Haenszel summary measure

Adjusted or weighted RR or OR

Advantages of MH

• Zeroes allowed

$$OR_{MH} = \frac{\sum_i (a_i d_i) / n_i}{\sum_i (b_i c_i) / n_i}$$

Mantel-Haenszel summary measure

• Mantel-Haenszel (adjusted or weighted) OR

$$OR_{MH} = \frac{\sum_i (a_i d_i) / n_i}{\sum_i (b_i c_i) / n_i}$$

Stratum 1 (total n1)

        Cases   Controls
Exp+     a1       b1
Exp-     c1       d1

Stratum 2 (total n2)

        Cases   Controls
Exp+     a2       b2
Exp-     c2       d2

$$OR_{MH} = \frac{(a_1 d_1) / n_1 + (a_2 d_2) / n_2}{(b_1 c_1) / n_1 + (b_2 c_2) / n_2}$$
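A minimal sketch of the Mantel-Haenszel OR, applied to the egg storage data from the back-up slides (Salmonella enteritidis, stratified by season). The function name and the tuple layout are illustrative assumptions.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary OR.

    strata: list of (a, b, c, d) with
        a exposed cases, b exposed controls,
        c unexposed cases, d unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Storage >= 2 weeks (exposed) versus < 2 weeks, by season
season_strata = [
    (12, 2, 52, 64),  # summer
    (7, 3, 32, 36),   # other seasons
]
or_mh = mantel_haenszel_or(season_strata)
crude_or = (19 * 100) / (5 * 84)  # all seasons combined

print(f"Crude OR = {crude_or:.1f}, MH OR = {or_mh:.1f}")
```

Here the adjusted and crude estimates are almost identical, so season does not appear to confound the association, whatever it does to the stratum-specific ORs.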

How to conduct a stratified analysis?

Crude analysis

Stratified analysis

1. Do stratum-specific estimates look different?

2. Do the 95% CIs of the OR/RR NOT overlap?

3. Is the test of homogeneity significant?

YES: EFFECT MODIFICATION (report estimates by stratum)

NO: check for confounding (compare crude RR/OR with MH RR/OR)

Risk of gastroenteritis by exposure, Outbreak X, Place, time X (crude analysis)

. cstable case pesto pasta

             Exposed                 Unexposed
Exposure   Total  Cases   AR%     Total  Cases   AR%    Risk Ratio             P
pasta       121     94   77.69     165      7    4.24   18.31 [8.81-38.04]   0.000
pesto        79     45   56.96     212     58   27.36    2.08 [1.56-2.79]    0.000

. csinter case pesto, by(pasta)

pasta = Exposed
pesto       Total  Cases  Risk %
Exposed       56     43   76.79     Risk difference  -0.02 [-0.17-0.13]
UnExposed     65     51   78.46     Risk Ratio        0.98 [0.81-1.19]
                                    Attrib.risk.exp   0.02 [-0.19-0.19]
                                    Attrib.risk.pop   0.01 [.-.]

pasta = Unexposed
pesto       Total  Cases  Risk %
Exposed       20      1    5.00     Risk difference   0.01 [-0.09-0.11]
UnExposed    145      6    4.14     Risk Ratio        1.21 [0.15-9.53]
                                    Attrib.risk.exp   0.17 [-5.52-0.90]
                                    Attrib.risk.pop   0.02 [.-.]

Test of Homogeneity (M-H) : pvalue : 0.8366301

Crude RR for pesto : 2.08 [1.56-2.79]
MH RR for pesto adjusted for pasta : 0.99 [0.81-1.20]
Adjusted/crude relative change : -52.67 %
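The Mantel-Haenszel RR reported for pesto adjusted for pasta can be approximately reproduced from the stratum counts in the output. This is a sketch using the standard MH risk ratio formula, not the code of the Stata commands themselves; the function name and tuple layout are assumptions.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio.

    strata: list of (cases_exp, total_exp, cases_unexp, total_unexp).
    """
    numerator = 0.0
    denominator = 0.0
    for cases_exp, total_exp, cases_unexp, total_unexp in strata:
        n = total_exp + total_unexp
        numerator += cases_exp * total_unexp / n
        denominator += cases_unexp * total_exp / n
    return numerator / denominator


# Pesto exposure within each pasta stratum
pasta_strata = [
    (43, 56, 51, 65),  # pasta = Exposed
    (1, 20, 6, 145),   # pasta = Unexposed
]
rr_mh = mantel_haenszel_rr(pasta_strata)
crude_rr = 2.08  # crude RR for pesto as reported in the output
change = (rr_mh - crude_rr) / crude_rr * 100

print(f"MH RR = {rr_mh:.2f}")
print(f"Adjusted/crude relative change = {change:.1f} %")
```

The relative change computed this way starts from the rounded crude RR, so it will differ slightly from the figure printed in the output.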


Stratified Analysis

> 10-20%

Examples of stratified analysis

Effect modifier

– Belongs to nature

– Different effects in different strata

– Simple

– Useful

– Increases knowledge of biological mechanism

– Allows targeting of PH action

Confounding factor

– Belongs to study

– Weighted RR different from crude RR

– Distortion of effect

– Creates confusion in data

– Prevent (protocol)

– Control (analysis)

Analyzing a third factor

Examine crude OR / RR

Examine ORs / RRs in each stratum

– Different ORs / RRs across strata: effect modification. Stop the analysis. DO NOT adjust! Report MULTIPLE ORs / RRs, one for each stratum

– Identical ORs / RRs across strata:

   • Strata ORs / RRs different from crude (crude value does not fall between strata): confounding factor. Adjust using the M-H technique to eliminate the confounding. Report ONE adjusted OR / RR

   • Strata ORs / RRs similar to crude (crude value falls between strata): third factor does not play a role. Report ONE crude OR / RR

How to conduct a stratified analysis

Perform crude analysis: measure the strength of association

List potential effect modifiers and confounders

Stratify data according to potential modifiers or confounders

Check for effect modification

If effect modification present, show the data by stratum

If no effect modification present, check for confounding

– If confounding, show adjusted data

– If no confounding, show crude data

How to define the strata?

• Strata defined according to third variable:

– 'Usual' confounders (e.g. age, sex, socio-economic status)

– Any other suspected confounder, effect modifier or additional risk factor

– Stratum of public health interest

• For two risk factors:

– stratify on one to study the effect of the second on outcome

• Two or more exposure categories:

– each is a stratum

• Residual confounding?

Logical order of data analysis

How to deal with multiple risk factors:

Crude analysis

Multivariable analysis

1. stratified analysis

2. modelling

linear regression

logistic regression

Multivariate analysis

• Mathematical model

• Simultaneous adjustment of all confounding and risk factors

• Can address effect modification

A train can mask a second train

A variable can mask another variable

Back-up slides


Risk factors for Salmonella enteritidis infections, France, 1995

Delarocque-Astagneau et al Epidemiol. Infect 1998:121:561-7

Cases of Salmonella enteritidis gastroenteritis according to egg storage and season

Season          Duration of storage   Cases   Controls   OR (95% CI)
Summer          >= 2 weeks              12        2      7.4 (1.5-69.9)
                < 2 weeks               52       64
Other seasons   >= 2 weeks               7        3      2.6 (0.5-16.8)
                < 2 weeks               32       36
All seasons     >= 2 weeks              19        5      4.5 (1.5-16.1)
                < 2 weeks               84      100

[Diagram: Duration of storage → Salmonellosis, with Season as third variable]

Cases of Salmonella enteritidis gastroenteritis according to egg storage and season

Summer (A)   "Long" storage (B)   Cases   Controls   OR
Yes          Yes                    12        2      ORAB = 6.8
Yes          No                     52       64      ORA = 0.9
No           Yes                     7        3      ORB = 2.6
No           No                     32       36      Ref

Advantages & Disadvantages of Stratified Analysis

• Advantages

– straightforward to implement and comprehend

– easy way to evaluate interaction

• Disadvantages

– only one exposure-disease association at a time

– requires continuous variables to be grouped

   • loss of information; possible "residual confounding"

– deteriorates with multiple confounders

   • e.g. suppose 4 confounders with 3 levels each: 3 x 3 x 3 x 3 = 81 strata needed

   • unless the sample is huge, many cells contain "0" and strata have undefined effect measures
