PH 241: Chapter 16 Nicholas P. Jewell University of California Berkeley April 17-26, 2006


1

PH 241: Chapter 16

Nicholas P. Jewell, University of California, Berkeley

April 17-26, 2006

2

Frequency Matching

Situation: Stratification factors have few distinct levels

Goal: To maintain balance on the marginals of planned strata so that precision is not lost when stratification is used to reduce confounding

Implementation: Perform mini studies at each stratification level

3

Pancreatic Cancer and Coffee Drinking, Stratified by Sex

                                       Pancreatic Cancer
Sex      Coffee drinking (cups/day)    Cases   Controls   Estimated OR
Females  ≥ 1                           140     280        2.545
         0                              11      56
Males    ≥ 1                           207     275        2.676
         0                               9      32

Original balance: 1.75 controls for every case

Balance among females: 2.2:1

Balance among males: 1.4:1

Frequency matching: select 265 female and 378 male controls so that the balance is approximately 1.75:1 in both strata
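The control counts quoted for frequency matching can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the dictionary names and the helper `matched_controls` are hypothetical, and the counts are the case and control totals per sex from the coffee table (151 and 216 cases, 336 and 307 controls).

```python
# Case and control totals per stratum, summed over coffee-drinking levels.
cases = {"Females": 140 + 11, "Males": 207 + 9}
controls = {"Females": 280 + 56, "Males": 275 + 32}

# Overall control-to-case ratio that each stratum should reproduce.
target_ratio = sum(controls.values()) / sum(cases.values())

def matched_controls(n_cases, ratio):
    """Controls to sample so that a stratum has about `ratio` controls per case."""
    return round(n_cases * ratio)

# Balance actually observed in each stratum of the unmatched sample.
original_balance = {sex: controls[sex] / cases[sex] for sex in cases}

# Controls to select per stratum under frequency matching.
plan = {sex: matched_controls(n, target_ratio) for sex, n in cases.items()}

print(round(target_ratio, 2))
print({sex: round(b, 1) for sex, b in original_balance.items()})
print(plan)
```

The total number of controls is unchanged (265 + 378 = 643); only their allocation across strata differs.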

4

Frequency Matching: Analysis

• For example, in case-control studies, we can no longer estimate P(E | D) etc., only P(E | D, C), where C is the matching factor

• We are therefore committed to stratification on C

• Can no longer evaluate the association between C and D in case-control studies

• Can still use logistic regression so long as C is always appropriately entered into the model

5

Pair Matching

Matching factors have a very high number of discrete levels

Pair Matching:

• One case, one control at any given common level of matching factors (Case-Control)

• One exposed, one unexposed at any given common level of matching factors (Cohort)

6

Types of Matched Pair Case-Control Data

(X marks the cell into which the pair falls)

(1)               Control
                  E     not E
    Case  E       X
          not E

(2)               Control
                  E     not E
    Case  E             X
          not E

(3)               Control
                  E     not E
    Case  E
          not E   X

(4)               Control
                  E     not E
    Case  E
          not E         X

7

Summarization of Matched Pair Case-Control Data

                  Control
                  E     not E
    Case  E       A     B
          not E   C     D

    Total pairs: N

Classification of pairs, not individuals

8

Matched Pair Case-Control Data on Spontaneous Abortions and CHD

                  Control
                  SA    No SA
    Case  SA       7    18
          No SA    5    20

    Total pairs: 50

Matched on age and location of residence

9

Exposure Patterns in the Four Types of Matched Pair Case-Control Data

(1)          D    not D   Total
    E        1    1       2
    not E    0    0       0
    Total    1    1

(2)          D    not D   Total
    E        1    0       1
    not E    0    1       1
    Total    1    1

(3)          D    not D   Total
    E        0    1       1
    not E    1    0       1
    Total    1    1

(4)          D    not D   Total
    E        0    0       0
    not E    1    1       2
    Total    1    1

10

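Odds Ratio with Matched Pair Case-Control Data

For a pair at the i-th level C_i of the matching factors, write

\[
r_i = P(E \mid D, C_i), \qquad s_i = P(E \mid \text{not } D, C_i).
\]

Then

\[
P_i = \Pr(\text{pair has exposed case} \mid \text{discordant})
    = \frac{r_i(1-s_i)}{r_i(1-s_i) + s_i(1-r_i)}
    = \frac{\dfrac{r_i(1-s_i)}{s_i(1-r_i)}}{\dfrac{r_i(1-s_i)}{s_i(1-r_i)} + 1}
    = \frac{OR_i}{OR_i + 1}.
\]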

11

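Odds Ratio with Matched Pair Case-Control Data

If no interaction:

\[
P = \Pr(\text{pair has exposed case} \mid \text{discordant}) = \frac{OR}{1 + OR}
\]

For example, OR = 1 ⇔ P = 0.5

Estimation:

\[
\hat{P} = \frac{B}{B + C}, \qquad \widehat{OR} = \frac{B}{C}
\]

Known as conditional maximum likelihood

Testing:

\[
H_0: OR = 1 \iff P = 0.5; \qquad
\text{statistic: } z = \frac{B - C}{\sqrt{B + C}} \sim N(0, 1)
\quad \text{or} \quad
\chi^2_{McN} = \frac{(B - C)^2}{B + C} \sim \chi^2_1
\]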

12

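Matched Pair Case-Control Data on Spontaneous Abortions and CHD

                  Control
                  SA    No SA
    Case  SA       7    18
          No SA    5    20

    Total pairs: 50

\[
\hat{P} = \frac{18}{23} = 0.783, \qquad \widehat{OR} = \frac{18}{5} = 3.6
\]

Confidence intervals:

    P:  (0.563, 0.925)
    OR: (1.29, 12.40)

\[
\chi^2_{McN} = \frac{(18 - 5)^2}{18 + 5} = 7.35, \qquad p = 0.007
\]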
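The figures for the spontaneous abortion data can be checked from the two discordant counts alone (B = 18, C = 5). The sketch below uses only the Python standard library. The slide does not say how the confidence interval for P was obtained; exact (Clopper-Pearson) binomial limits are used here as an assumption, since they appear to reproduce the quoted interval, and the OR limits are taken as P/(1 - P) at each end. The function names are hypothetical.

```python
import math

def binom_tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

def solve_increasing(f, target, iters=60):
    """Bisection on [0, 1] for f(p) = target, assuming f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def matched_pair_summary(B, C, alpha=0.05):
    """Estimates and tests from the discordant pair counts B and C (both > 0)."""
    n = B + C
    chi2 = (B - C) ** 2 / n
    # Assumed method: exact binomial limits for P, transformed to the OR scale.
    p_lo = solve_increasing(lambda p: binom_tail_ge(B, n, p), alpha / 2)
    p_hi = solve_increasing(lambda p: binom_tail_ge(B + 1, n, p), 1 - alpha / 2)
    return {
        "P_hat": B / n,
        "OR_hat": B / C,
        "z": (B - C) / math.sqrt(n),
        "mcnemar_chi2": chi2,
        # Upper tail of a chi-square variable with 1 degree of freedom.
        "p_value": math.erfc(math.sqrt(chi2 / 2)),
        "P_ci": (p_lo, p_hi),
        "OR_ci": (p_lo / (1 - p_lo), p_hi / (1 - p_hi)),
    }

result = matched_pair_summary(18, 5)
for name, value in result.items():
    print(name, value)
```

Note that the concordant pairs (7 and 20) play no role in any of these quantities.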

13

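Cochran-Mantel-Haenszel Procedures for Pair-Matched Data

Pair Type   # Pairs of Type   a_i   A_i         V_i       a_i d_i/n_i   b_i c_i/n_i
(1)         A                 1     1           0         0             0
(2)         B                 1     ½           ¼         ½             0
(3)         C                 0     ½           ¼         0             ½
(4)         D                 0     0           0         0             0
Totals                        A+B   A+(B+C)/2   (B+C)/4   B/2           C/2

Cochran-Mantel-Haenszel test statistic:

\[
\chi^2_{CMH} = \frac{\left[(A + B) - \left(A + \frac{B + C}{2}\right)\right]^2}{(B + C)/4} = \frac{(B - C)^2}{B + C}
\]

Mantel-Haenszel estimator:

\[
\widehat{OR}_{MH} = \frac{B/2}{C/2} = \frac{B}{C}
\]

Small-sample OR estimator:

\[
\widehat{OR}_{SS} = \frac{B}{C + 1}
\]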
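To see that the Cochran-Mantel-Haenszel procedures collapse to McNemar's statistic and B/C, each matched pair can be treated as its own 2x2 stratum and the usual CMH sums accumulated. This is a minimal sketch with hypothetical function names; it uses the standard hypergeometric mean and variance for each stratum and the spontaneous abortion counts (A = 7, B = 18, C = 5, D = 20).

```python
def pair_table(case_exposed, control_exposed):
    """2x2 table for one pair: a = exposed case, b = exposed control,
    c = unexposed case, d = unexposed control."""
    a = 1 if case_exposed else 0
    b = 1 if control_exposed else 0
    return a, b, 1 - a, 1 - b

def cmh_for_pairs(counts):
    """counts maps (case_exposed, control_exposed) -> number of pairs of that type."""
    sum_a = sum_A = sum_V = sum_ad = sum_bc = 0.0
    for (case_exposed, control_exposed), m in counts.items():
        a, b, c, d = pair_table(case_exposed, control_exposed)
        n = a + b + c + d  # always 2 for a matched pair
        A = (a + b) * (a + c) / n                                  # E(a) under H0
        V = (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))  # Var(a) under H0
        sum_a += m * a
        sum_A += m * A
        sum_V += m * V
        sum_ad += m * a * d / n
        sum_bc += m * b * c / n
    chi2 = (sum_a - sum_A) ** 2 / sum_V
    or_mh = sum_ad / sum_bc
    return chi2, or_mh

# Pair types (1)-(4): both exposed, case only, control only, neither.
counts = {(True, True): 7, (True, False): 18, (False, True): 5, (False, False): 20}
chi2, or_mh = cmh_for_pairs(counts)
print(chi2, or_mh)
```

Concordant pairs contribute zero variance and zero to both estimator sums, which is why only B and C survive.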

14

1:M Matching

Can use conditional maximum likelihood or Cochran-Mantel-Haenszel procedures (no longer exactly the same)

15

Further Assessment of Confounding and Interaction

Further confounders (non-matching factors)

• Stratify further on new confounders

• Quick loss of precision

Interaction

• Straightforward if interested in interaction of E with a matching factor

• If the additional covariate is a non-matching factor, further stratification limits power to estimate interactive effects

16

Logistic Regression Model for Matched Data

\[
\log\frac{p_x}{1 - p_x}
  = \log(\text{odds for } D \mid X = x,\ i\text{th level of matching factors})
  = a + a_1 I_1 + \cdots + a_{N-1} I_{N-1} + bx
  = a_i^{*} + bx
\]

where I_j indexes the levels of the matching factors, j = 1, ..., N-1

• Too many unknown parameters for regular maximum likelihood

• Use conditional maximum likelihood (conditioning on the exposure pattern in the matched pair)

Use conditional likelihood (which only depends on b) just as a conventional likelihood (ML estimates, SEs, Wald test, LR tests)
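As a worked illustration of why the conditional likelihood depends only on b, consider a single matched pair in a case-control study, with x_1 the covariate value of the case and x_0 that of the control (this subscript notation is introduced here for the sketch and is not from the slides). Conditioning on the unordered pair of covariate values, the pair-specific intercept cancels:

\[
\Pr(\text{case has } x_1 \mid \text{pair has covariates } \{x_0, x_1\})
  = \frac{e^{b x_1}}{e^{b x_0} + e^{b x_1}} .
\]

For a single binary exposure, concordant pairs contribute the constant 1/2, a discordant pair with an exposed case contributes e^b/(1 + e^b), and a discordant pair with an exposed control contributes 1/(1 + e^b). With B and C discordant pairs of the two kinds,

\[
L_c(b) \propto \left(\frac{e^{b}}{1 + e^{b}}\right)^{B} \left(\frac{1}{1 + e^{b}}\right)^{C},
\qquad
\hat{b} = \log\frac{B}{C},
\qquad
\widehat{SE}(\hat{b}) = \sqrt{\frac{1}{B} + \frac{1}{C}} ,
\]

which recovers the matched pair odds ratio estimate B/C.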

17

Coding for Matched Study of Pregnancy History and CHD

SA = number of spontaneous abortions

X = 1 if any history of spontaneous abortions (SA ≥ 1)
    0 if no history of spontaneous abortions (SA = 0)

X^ord = 0 if SA = 0
        1 if SA = 1
        2 if SA = 2
        3 if SA ≥ 3

X_1 = 1 if SA = 1, 0 otherwise
X_2 = 1 if SA = 2, 0 otherwise
X_3 = 1 if SA ≥ 3, 0 otherwise

Y = 1 or 0, with the average age in the pair dichotomized at 65

Z = 1 if smoker
    0 if nonsmoker

18

Matched Study of Pregnancy History and CHD: Fitted Logistic Regression Models

(#)  Model                                          Param.  Estimate  SD     OR     p-value  Max. log lik.
(1)  log(p/(1-p)) = a_i + bx                        b        1.281    0.506  3.600  0.011    -30.76
(2)  log(p/(1-p)) = a_i + bx + c(x × y)             b        1.609    0.775  5.000  0.038    -30.57
                                                    c       -0.629    1.029  0.533  0.541
(3)  log(p/(1-p)) = a_i + bx + cz                   b        1.338    0.521  3.813  0.010    -30.60
                                                    c        0.279    0.501  1.322  0.577
(4)  log(p/(1-p)) = a_i + bx + cz + d(x × z)        b        1.039    0.627  2.825  0.097    -30.27
                                                    c       -0.002    0.609  0.998  0.998
                                                    d        0.819    1.027  2.267  0.426
(5)  log(p/(1-p)) = a_i + b1 x1 + b2 x2 + b3 x3     b1       1.252    0.654  3.496  0.056    -29.96
                                                    b2       0.648    0.734  1.912  0.377
                                                    b3       2.173    1.099  8.786  0.048
(6)  log(p/(1-p)) = a_i + b x^ord                   b        0.589    0.251  1.802  0.019    -31.14
(7)  log(p/(1-p)) = a_i + b(SA)                     b        0.473    0.215  1.605  0.028    -30.89
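The maximized log likelihoods reported for the CHD models can be used for likelihood ratio tests between nested models. The sketch below is illustrative: the helper names are hypothetical, the log likelihood values are copied from the fitted-model table for models (1) to (5), and the nesting assumed is (1) within (2), (1) within (3), (3) within (4), and (1) within (5), the last because x = x1 + x2 + x3 so that model (1) is model (5) with b1 = b2 = b3. The tail probability function handles only 1 or 2 degrees of freedom, where closed forms exist.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch handles only df = 1 or 2")

# Maximized conditional log likelihoods as reported for models (1)-(5).
max_loglik = {1: -30.76, 2: -30.57, 3: -30.60, 4: -30.27, 5: -29.96}

def lr_test(small, large, df):
    """Likelihood ratio statistic and p-value for nested models small within large."""
    stat = 2 * (max_loglik[large] - max_loglik[small])
    return stat, chi2_sf(stat, df)

for small, large, df in [(1, 2, 1), (1, 3, 1), (3, 4, 1), (1, 5, 2)]:
    stat, p = lr_test(small, large, df)
    print(f"({small}) vs ({large}): LR = {stat:.2f} on {df} df, p = {p:.3f}")
```

For the single-parameter comparisons the likelihood ratio p-values come out close to the Wald p-values listed for the added coefficient, as expected when both tests address the same hypothesis.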

19

Matched Study of Birth Order and RDS: Fitted Logistic Regression Models

(#)  Model                                   Param.  Estimate  SD     OR     p-value  Max. log lik.
(1)  log(p/(1-p)) = a_i + bx                 b       1.355     0.397  3.875  0.001    -19.79
(2)  log(p/(1-p)) = a_i + bx + c(x × y)      b       0.336     0.586  1.400  0.57     -17.57
                                             c       1.743     0.847  5.714  0.04

Matched cohort study of RDS in twins

221 twin pairs: matched on everything common to twins!

Birth order is a known risk factor; can it be explained by other factors?

                        First Born
                        RDS   No RDS
    Second Born  RDS     24     31
                 No RDS   8    158

    Total pairs: 221

X is birth order (1 is second born); Y is delivery mode (1 is vaginal)

OR = 31/8 = 3.9

McNemar's test statistic = 13.6, P = 0.0002

Caution: natural matched pairs
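The model (1) row for the twin data can be reproduced directly from the two discordant counts, since with a single binary covariate the conditional likelihood has a closed-form maximum. The sketch below assumes the discordant-pairs form of the conditional log likelihood, B·b - (B + C)·log(1 + e^b), with B = 31 pairs in which only the second born has RDS and C = 8 pairs in which only the first born has RDS; the function name is hypothetical.

```python
import math

B, C = 31, 8  # discordant twin pairs: RDS in second born only, in first born only

def cond_loglik(b, B, C):
    """Conditional log likelihood for one binary covariate (discordant pairs only)."""
    return B * b - (B + C) * math.log1p(math.exp(b))

b_hat = math.log(B / C)              # conditional ML estimate of b
se = math.sqrt(1 / B + 1 / C)        # its estimated standard error
wald_p = math.erfc(abs(b_hat / se) / math.sqrt(2))   # two-sided Wald p-value
mcnemar = (B - C) ** 2 / (B + C)
mcnemar_p = math.erfc(math.sqrt(mcnemar / 2))        # chi-square tail, 1 df

print(round(b_hat, 3), round(se, 3), round(math.exp(b_hat), 3))
print(round(cond_loglik(b_hat, B, C), 2))
print(round(mcnemar, 1), round(mcnemar_p, 4), round(wald_p, 3))
```

Model (2), which adds the interaction with delivery mode, needs the pair-level covariate values and so cannot be refit from this summary table alone.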
