Matching (in case-control studies)
James Stuart, Fernando Simón
EPIET
Dublin, 2006
Remember confounding…
A confounding factor is a variable independently associated with
• the exposure of interest
• the outcome
that distorts the measured association between them
Control of confounders
In the study design
• Restriction
• Matching
In the analysis
• Stratification
• Multivariate analysis
Matching
Selection of controls to match specific characteristics of cases
a) Frequency matching
Select controls to get the same distribution of a variable as the cases (e.g. age group)
b) Individual matching
Select a specific control for each case on the matching variable (e.g. date of birth)
… but matching introduces bias
because controls are no longer representative of the source population
To remove this selection bias
• Stratify the analysis by the matching criteria
  matched design → matched analysis
• Cannot study the effect of the matching variables on the outcome
a) Frequency matching
Useful if the distribution of cases across a confounding variable differs markedly from the distribution of that variable in the source population
a) Frequency matching
Age (years)   Cases
0-14             50
15-29            30
30-44            15
45+               5
TOTAL           100
a) Frequency matching
Age (years)   Cases   Controls (unmatched)
0-14             50        20
15-29            30        20
30-44            15        20
45+               5        40
TOTAL           100       100
a) Frequency matching
Age (years)   Cases   Controls (unmatched)   Controls (matched)
0-14             50        10                     50
15-29            30        25                     30
30-44            15        25                     15
45+               5        40                      5
TOTAL           100       100                    100
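The matched column in the table above can be derived mechanically from the case counts. The following is a minimal sketch, assuming one control per case; the function name `frequency_match_quotas` and the `controls_per_case` parameter are illustrative and do not come from the slides.

```python
def frequency_match_quotas(case_counts, controls_per_case=1):
    """Number of controls to recruit in each stratum so that the
    control distribution mirrors the case distribution."""
    return {
        stratum: n_cases * controls_per_case
        for stratum, n_cases in case_counts.items()
    }


# Case counts by age group, as given in the table above.
case_counts = {"0-14": 50, "15-29": 30, "30-44": 15, "45+": 5}

quotas = frequency_match_quotas(case_counts)
for stratum, n_controls in quotas.items():
    print(f"{stratum:>6}: recruit {n_controls} controls")
```

With `controls_per_case=2` the quotas double in every age group, so the age distribution of the controls still matches that of the cases.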
a) Frequency matching: analysis
• Mantel-Haenszel Odds Ratio (weighted)
• Conditional logistic regression for multiple variables
$OR_{MH} = \dfrac{\sum_i [a_i d_i / n_i]}{\sum_i [b_i c_i / n_i]}$
a) Frequency matching: analysis
• keep stratification by age group
0-14 years
Exposed   Cases    Controls   Total
Yes       45 (a)   30 (b)      75
No         5 (c)   20 (d)      25
Total     50       50         100 (n_i)

$a_i d_i / n_i = 900 / 100 = 9$
$b_i c_i / n_i = 150 / 100 = 1.5$
a) Frequency matching: analysis
15-29 years
Exposed   Cases    Controls   Total
Yes       15 (a)    4 (b)      19
No        15 (c)   26 (d)      41
Total     30       30          60 (n_i)

Same process for each age group
$a_i d_i / n_i = 390 / 60 = 6.5$
$b_i c_i / n_i = 60 / 60 = 1.0$

$OR_{MH} = \dfrac{9 + 6.5 + \text{etc.}}{1.5 + 1.0 + \text{etc.}}$
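The stratum-by-stratum arithmetic can be sketched in a few lines of Python. This example uses only the two age groups whose tables are shown (0-14 and 15-29 years); the remaining strata are left as "etc." on the slides, so the value printed here is an illustration of the calculation and not the study's overall result. The function name `mantel_haenszel_or` is an assumption made for this sketch.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio.

    Each stratum is a tuple (a, b, c, d) with
    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d          # stratum total (n_i)
        numerator += a * d / n     # a_i * d_i / n_i
        denominator += b * c / n   # b_i * c_i / n_i
    return numerator / denominator


# Only the two strata shown on the slides.
strata = [
    (45, 30, 5, 20),   # 0-14 years:  900/100 = 9.0 and 150/100 = 1.5
    (15, 4, 15, 26),   # 15-29 years: 390/60  = 6.5 and 60/60   = 1.0
]

or_mh = mantel_haenszel_or(strata)
print(round(or_mh, 2))  # (9 + 6.5) / (1.5 + 1.0) = 6.2
```

The function raises `ZeroDivisionError` if no stratum contributes to the denominator, that is, when every stratum has no exposed controls or no unexposed cases.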
b) individual matching
Each pair can be considered as one stratum
4 possible outcomes per pair

Exposure     +   -      +   -      +   -      +   -
Case         1   0      1   0      0   1      0   1
Control      1   0      0   1      0   1      1   0

ad = zero unless case exposed, control not exposed
bc = zero unless control exposed, case not exposed
b) individual matching
The only pairs that contribute to the OR are the discordant pairs
$OR_{MH} = \dfrac{\sum_i [a_i d_i / n_i]}{\sum_i [b_i c_i / n_i]} = \dfrac{\text{number of discordant pairs where the case is exposed}}{\text{number of discordant pairs where the control is exposed}}$
b) individual matching
If we change the way case and control data are presented, to show them in pairs:

                        Controls
                   Exposed       Unexposed
Cases  Exposed     e             f (ad = 1)
       Unexposed   g (bc = 1)    h

$OR_{MH} = \dfrac{\text{number of discordant pairs where the case is exposed}}{\text{number of discordant pairs where the control is exposed}} = f / g$
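To see why treating each pair as its own stratum reduces the Mantel-Haenszel formula to f/g, the sketch below computes the OR both ways and compares them. The pair data are invented purely for illustration (they are not from any study in these slides), and the function names are assumptions of this sketch.

```python
def mh_or_from_pairs(pairs):
    """Mantel-Haenszel OR with every matched pair as one stratum (n_i = 2)."""
    numerator = 0.0
    denominator = 0.0
    for case_exposed, control_exposed in pairs:
        a = 1 if case_exposed else 0      # exposed case
        b = 1 if control_exposed else 0   # exposed control
        c = 1 - a                         # unexposed case
        d = 1 - b                         # unexposed control
        numerator += a * d / 2            # non-zero only if case +, control -
        denominator += b * c / 2          # non-zero only if control +, case -
    return numerator / denominator


def discordant_pair_or(pairs):
    """OR = f / g, counting discordant pairs only."""
    f = sum(1 for case, control in pairs if case and not control)
    g = sum(1 for case, control in pairs if control and not case)
    return f / g


# Hypothetical pairs as (case_exposed, control_exposed); invented data.
pairs = (
    [(True, True)] * 2      # e: both exposed
    + [(True, False)] * 6   # f: case exposed only
    + [(False, True)] * 2   # g: control exposed only
    + [(False, False)] * 5  # h: neither exposed
)

print(mh_or_from_pairs(pairs), discordant_pair_or(pairs))  # 3.0 3.0
```

The concordant pairs (e and h) add zero to both sums, which is why they drop out of the estimate.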
b) individual matching: for n controls
Each set is analysed in pairs; the case is used in as many pairs as there are controls

Case   Control1   Control2   Control3   Control4   C+/Ctr-   C-/Ctr+
 +        -          +          -          -          3         0
 +        +          -          +          +          1         0
 -        -          -          -          -          0         0
 +        -          -          -          +          3         0
 -        -          +          -          -          0         1
 +        -          +          +          +          1         0
 +        +          +          +          +          0         0
Total                                                 8         1

$OR = \dfrac{\text{pairs with case exposed, control not exposed}}{\text{pairs with case not exposed, control exposed}} = \dfrac{8}{1} = 8$
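The pair counting for sets with several controls per case can be reproduced with a short script. This is a sketch of the counting rule described on the slide, using the seven matched sets from the table; the data layout (a case status plus a list of control statuses) and the function name are choices made for this example.

```python
def count_discordant_pairs(matched_sets):
    """Pair the case with each of its controls and count discordant pairs.

    Returns (case exposed / control not, case not / control exposed).
    """
    case_only = 0
    control_only = 0
    for case, controls in matched_sets:
        for control in controls:
            if case == "+" and control == "-":
                case_only += 1
            elif case == "-" and control == "+":
                control_only += 1
    return case_only, control_only


# The seven sets from the table: (case, [control1..control4]).
matched_sets = [
    ("+", ["-", "+", "-", "-"]),
    ("+", ["+", "-", "+", "+"]),
    ("-", ["-", "-", "-", "-"]),
    ("+", ["-", "-", "-", "+"]),
    ("-", ["-", "+", "-", "-"]),
    ("+", ["-", "+", "+", "+"]),
    ("+", ["+", "+", "+", "+"]),
]

case_only, control_only = count_discordant_pairs(matched_sets)
print(case_only, control_only, case_only / control_only)  # 8 1 8.0
```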
Matched study: example
• 20 cases of cryptosporidiosis
• Hypothesis: associated with attendance at local swimming pool
• 2 matched studies conducted
  (i) controls from the same general practice with the nearest date of birth
  (ii) case-nominated (friend) controls
Analysis: GP and age matched controls
Swimming pool exposure

                Controls
                +      -
Cases    +      1     15
         -      1      3
OR = f/g = 15/1 = 15.0
Analysis: friend controls
Swimming pool exposure

                Controls
                +      -
Cases    +     13      3
         -      1      3
OR = 3/1 = 3.0
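Both analyses can be checked side by side. The sketch below takes the pair counts from the two tables (e, f, g, h in the notation used earlier), confirms that each table accounts for all 20 cases, and computes the matched OR as f/g. The dictionary layout and study labels are choices made for this example.

```python
# Pair counts per study:
# e = both exposed, f = case exposed only,
# g = control exposed only, h = neither exposed.
studies = {
    "GP and age matched controls": {"e": 1, "f": 15, "g": 1, "h": 3},
    "Friend controls": {"e": 13, "f": 3, "g": 1, "h": 3},
}

results = {}
for name, t in studies.items():
    n_pairs = t["e"] + t["f"] + t["g"] + t["h"]  # one pair per case
    matched_or = t["f"] / t["g"]                 # discordant pairs only
    results[name] = (n_pairs, matched_or)
    print(f"{name}: {n_pairs} pairs, matched OR = {matched_or:.1f}")
```

In the friend-control study 13 of the 20 pairs are concordant for exposure and so carry no information, compared with 1 of 20 in the GP-matched study.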
Why do matched studies?
• Random sample may not be possible
• Quick and easy way to get controls
• Improves efficiency of study (smaller sample size)
• Can control for confounding due to factors that are difficult to measure or even for unknown confounders.
Disadvantages of matching
• Cannot examine risks associated with matching variable
• If no matching control can be found for a case (more likely when there are many matching variables), that case's data are lost, and vice versa
• Overmatching on exposure of interest will bias OR towards 1
• May be residual confounding in frequency matching
Over-matching
• matching on a variable closely linked to exposure to the risk factor of interest
• under-estimates the true association
• may fail to find a true association
Key points
• Matching is a way to control for confounding factors in the study design
• Matched design → matched analysis
• Matching for variables that are not confounders complicates design
• Frequency matching simpler than individual
• Multivariable analysis reduces need to match