PH 241: Chapter 16, Nicholas P. Jewell, University of California Berkeley, April 17-26, 2006




Frequency Matching

Situation: Stratification factors have few distinct levels

Goal: To maintain balance on the marginals of planned strata so that precision is not lost when stratification is used to reduce confounding

Implementation: Perform mini studies at each stratification level


Pancreatic Cancer and Coffee Drinking, Stratified by Sex

                                   Pancreatic Cancer
Sex       Coffee drinking          Cases    Controls    Estimated OR    Balance (controls:cases)
          (cups/day)
Females   ≥ 1                        140       280         2.545           2.2:1
          0                           11        56
Males     ≥ 1                        207       275         2.676           1.4:1
          0                            9        32

Original balance: 1.75 controls for every case

Frequency matching: select 265 female and 378 male controls so that balance is approximately 1.75:1 in both strata
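
The control counts quoted for frequency matching can be reproduced with a short calculation. The Python sketch below assumes the stratum totals implied by the table (151 female and 216 male cases, 336 female and 307 male controls). The helper name `frequency_matched_controls` is hypothetical; it simply spreads the same total number of controls across strata in proportion to the cases. It illustrates the arithmetic only, not a sampling procedure.

```python
# Stratum totals read off the coffee / pancreatic cancer table.
cases = {"Females": 140 + 11, "Males": 207 + 9}
controls = {"Females": 280 + 56, "Males": 275 + 32}


def frequency_matched_controls(cases, controls):
    """Return the overall control:case ratio and the number of controls
    each stratum would need to share that ratio (hypothetical helper)."""
    overall_ratio = sum(controls.values()) / sum(cases.values())
    targets = {s: round(n * overall_ratio) for s, n in cases.items()}
    return overall_ratio, targets


overall_ratio, targets = frequency_matched_controls(cases, controls)
observed_balance = {s: controls[s] / cases[s] for s in cases}

print(f"overall balance: {overall_ratio:.2f} controls per case")
for sex in cases:
    print(f"{sex}: observed {observed_balance[sex]:.1f}:1, "
          f"frequency-matched controls = {targets[sex]}")
```

The unrounded overall ratio (643/367) is used rather than 1.75, so that the two targets still add up to the original 643 controls.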


Frequency Matching: Analysis

For example, in case-control studies, we can no longer estimate P(E | D) etc., only P(E | D, C), where C is the matching factor

We are therefore committed to stratification on C

Can no longer evaluate the association between C and D in case-control studies

Can still use logistic regression so long as C is always appropriately entered into the model


Pair Matching

Matching factors have a very high number of discrete levels

Pair Matching
  One case, one control at any given common level of matching factors (Case-Control)
  One exposed, one unexposed at any given common level of matching factors (Cohort)


Types of Matched Pair Case-Control Data

(1)                Control
                   E      not E
    Case  E        X
          not E

(2)                Control
                   E      not E
    Case  E               X
          not E

(3)                Control
                   E      not E
    Case  E
          not E    X

(4)                Control
                   E      not E
    Case  E
          not E           X


Summarization of Matched Pair Case-Control Data

                   Control
                   E      not E
    Case  E        A      B
          not E    C      D
                                   N

Classification of pairs, not individuals


Matched Pair Case-Control Data on Spontaneous Abortions and CHD

                   Control
                   SA     No SA
    Case  SA        7      18
          No SA     5      20
                                   50

Matched on age and location of residence


Exposure Patterns in the Four Types of Matched Pair Case-Control Data

(1)          D    not D
    E        1      1      2
    not E    0      0      0
             1      1

(2)          D    not D
    E        1      0      1
    not E    0      1      1
             1      1

(3)          D    not D
    E        0      1      1
    not E    1      0      1
             1      1

(4)          D    not D
    E        0      0      0
    not E    1      1      2
             1      1


Odds Ratio with Matched Pair Case-Control Data

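With C_i the i-th level of the matching factors, let

\[ r_i = P(E \mid D, C_i), \qquad s_i = P(E \mid \bar{D}, C_i) \]

Pr(pair has exposed case | discordant):

\[ P_i = \frac{r_i(1-s_i)}{r_i(1-s_i) + s_i(1-r_i)} = \frac{\dfrac{r_i(1-s_i)}{s_i(1-r_i)}}{1 + \dfrac{r_i(1-s_i)}{s_i(1-r_i)}} = \frac{OR_i}{1 + OR_i} \]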


Odds Ratio with Matched Pair Case-Control Data

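If no interaction: Pr(pair has exposed case | discordant)

\[ P = \frac{OR}{1 + OR} \]

For example, OR = 1 ⇔ P = 0.5

Estimation (known as conditional maximum likelihood):

\[ \hat{P} = \frac{B}{B + C}, \qquad \widehat{OR} = \frac{B}{C} \]

Testing:

\[ H_0: OR = 1 \iff P = 0.5 \]

\[ \text{statistic: } z = \frac{B - C}{\sqrt{B + C}} \sim N(0, 1) \quad \text{or} \quad \chi^2_{McN} = \frac{(B - C)^2}{B + C} \sim \chi^2_{(1)} \]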
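
As a rough illustration of the estimation and testing formulas, the Python sketch below computes P-hat = B/(B+C), OR-hat = B/C, and McNemar's statistic from the discordant pair counts alone. The function name `mcnemar_summary` is hypothetical, and the counts B = 18, C = 5 are taken from the spontaneous abortion and CHD example. The p-value uses the identity that, for one degree of freedom, the chi-square upper tail equals the two-sided standard normal tail.

```python
import math


def mcnemar_summary(b, c):
    """Summaries based on discordant pairs only.

    b: pairs with exposed case and unexposed control
    c: pairs with unexposed case and exposed control
    """
    p_hat = b / (b + c)              # Pr(exposed case | discordant pair)
    or_hat = b / c                   # conditional ML estimate of the OR
    z = (b - c) / math.sqrt(b + c)   # approximately N(0, 1) under H0
    chi2 = z * z                     # McNemar statistic, 1 df
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return {"p_hat": p_hat, "or_hat": or_hat, "z": z,
            "chi2": chi2, "p_value": p_value}


result = mcnemar_summary(b=18, c=5)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Concordant pairs (A and D) never enter the calculation, which is why the function takes only the two discordant counts.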


Matched Pair Case-Control Data on Spontaneous Abortions and CHD

                   Control
                   SA     No SA
    Case  SA        7      18
          No SA     5      20
                                   50

\[ \hat{P} = \frac{18}{23} = 0.783, \qquad \widehat{OR} = \frac{18}{5} = 3.6 \]

Confidence intervals:

P: (0.563, 0.925)

OR: (1.29, 12.40)

\[ \chi^2_{McN} = \frac{(18 - 5)^2}{18 + 5} = 7.35, \qquad p = 0.007 \]
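
The slide does not state how the confidence intervals were computed. One plausible reading, offered here as an assumption, is an exact binomial (Clopper-Pearson) interval for P based on 18 exposed-case pairs out of 23 discordant pairs, with the OR limits obtained through the transform P/(1 - P). The Python sketch below tries that approach using only the standard library; `binom_cdf`, `bisect_root`, and `exact_ci` are hypothetical helper names.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))


def bisect_root(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for a binomial proportion (needs 0 < x < n)."""
    # Lower limit solves P(X >= x | p) = alpha / 2.
    lower = bisect_root(lambda p: 1 - binom_cdf(x - 1, n, p) - alpha / 2, 0.0, 1.0)
    # Upper limit solves P(X <= x | p) = alpha / 2.
    upper = bisect_root(lambda p: binom_cdf(x, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper


p_lo, p_hi = exact_ci(18, 23)
or_lo, or_hi = p_lo / (1 - p_lo), p_hi / (1 - p_hi)
print(f"P:  ({p_lo:.3f}, {p_hi:.3f})")
print(f"OR: ({or_lo:.2f}, {or_hi:.2f})")
```

Because OR = P/(1 - P) is monotone in P, transforming the endpoints of the interval for P gives an interval for OR with the same coverage.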


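Cochran-Mantel-Haenszel Procedures for Pair-Matched Data

Pair Type   # Pairs of Type   a_i    A_i          V_i        a_i d_i/n_i   b_i c_i/n_i
(1)         A                 1      1            0          0             0
(2)         B                 1      ½            ¼          ½             0
(3)         C                 0      ½            ¼          0             ½
(4)         D                 0      0            0          0             0
Totals                        A+B    A+(B+C)/2    (B+C)/4    B/2           C/2

Cochran-Mantel-Haenszel test statistic:

\[ \chi^2_{CMH} = \frac{\left[(A + B) - \left(A + \frac{B + C}{2}\right)\right]^2}{(B + C)/4} = \frac{(B - C)^2}{B + C} \]

Mantel-Haenszel estimator:

\[ \widehat{OR}_{MH} = \frac{B/2}{C/2} = \frac{B}{C} \]

Small-sample OR estimator:

\[ \widehat{OR}_{SS} = \frac{B}{C + 1} \]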
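
To see why the Cochran-Mantel-Haenszel quantities collapse to functions of B and C, one can treat every matched pair as its own 2x2 stratum with n = 2 and accumulate the usual CMH sums. The Python sketch below does this with exact fractions for the spontaneous abortion and CHD counts (A = 7, B = 18, C = 5, D = 20). The cell labelling (a = exposed case, b = exposed control, c = unexposed case, d = unexposed control) and the function name `cmh_for_pairs` are assumptions made for the illustration.

```python
from fractions import Fraction

# One 2x2 table (a, b, c, d) per pair type; each table has n = 2.
PAIR_TABLES = {
    1: (1, 1, 0, 0),  # case and control both exposed
    2: (1, 0, 0, 1),  # only the case exposed
    3: (0, 1, 1, 0),  # only the control exposed
    4: (0, 0, 1, 1),  # neither exposed
}


def cmh_for_pairs(counts):
    """counts maps pair type (1-4) to the number of pairs of that type."""
    sum_a = sum_expected = sum_var = sum_ad = sum_bc = Fraction(0)
    for pair_type, k in counts.items():
        a, b, c, d = PAIR_TABLES[pair_type]
        n = a + b + c + d
        sum_a += k * a
        # Expected value and variance of a under the null (hypergeometric).
        sum_expected += k * Fraction((a + b) * (a + c), n)
        sum_var += k * Fraction((a + b) * (c + d) * (a + c) * (b + d),
                                n * n * (n - 1))
        sum_ad += k * Fraction(a * d, n)
        sum_bc += k * Fraction(b * c, n)
    chi2 = (sum_a - sum_expected) ** 2 / sum_var
    or_mh = sum_ad / sum_bc
    return chi2, or_mh


chi2, or_mh = cmh_for_pairs({1: 7, 2: 18, 3: 5, 4: 20})
print(f"CMH chi-square = {float(chi2):.2f}, MH odds ratio = {float(or_mh):.1f}")
```

Only pair types (2) and (3) have nonzero variance, so the concordant pairs drop out and the statistic equals McNemar's (B - C)^2/(B + C).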


1:M Matching

Can use conditional maximum likelihood or Cochran-Mantel-Haenszel procedures (no longer exactly the same)


Further Assessment of Confounding and Interaction

Further confounders (non-matching factors)
  Stratify further on new confounders
  Quick loss of precision

Interaction
  Straightforward if interested in interaction of E with a matching factor
  If the additional covariate is a non-matching factor, further stratification limits power to estimate interactive effects


Logistic Regression Model for Matched Data

\[ \log\left(\frac{p_x}{1 - p_x}\right) = \log(\text{odds for } D \mid X = x,\ i\text{th level of matching factors}) \]

\[ \log\left(\frac{p_x}{1 - p_x}\right) = a + a_1 I_1 + \cdots + a_{N-1} I_{N-1} + bx = a_i^{*} + bx \]

where I_j indexes the levels of the matching factors, j = 1, ..., N-1

• Too many unknown parameters for regular maximum likelihood

• Use conditional maximum likelihood (conditioning on the exposure pattern in the matched pair)

Use the conditional likelihood (which only depends on b) just as a conventional likelihood (ML estimates, SEs, Wald test, LR tests)
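
For 1:1 matching with a single binary exposure x, the conditional likelihood has a simple form: each discordant pair contributes e^b/(1 + e^b) if the case is the exposed member and 1/(1 + e^b) otherwise, while concordant pairs contribute nothing. The Python sketch below maximizes the resulting log likelihood, l(b) = B*b - (B + C)*log(1 + e^b), by Newton-Raphson for the spontaneous abortion and CHD counts (B = 18, C = 5). The function name `fit_conditional_logistic_pairs` is hypothetical, and the sketch is not meant to stand in for a general conditional logistic regression routine.

```python
import math


def fit_conditional_logistic_pairs(b_pairs, c_pairs, tol=1e-10, max_iter=50):
    """Newton-Raphson for l(b) = B*b - (B + C)*log(1 + exp(b)).

    b_pairs: discordant pairs in which the case is exposed (B)
    c_pairs: discordant pairs in which the control is exposed (C)
    """
    n_disc = b_pairs + c_pairs
    b = 0.0
    for _ in range(max_iter):
        p = 1 / (1 + math.exp(-b))        # e^b / (1 + e^b)
        score = b_pairs - n_disc * p      # first derivative of l(b)
        info = n_disc * p * (1 - p)       # minus the second derivative
        step = score / info
        b += step
        if abs(step) < tol:
            break
    p = 1 / (1 + math.exp(-b))
    se = 1 / math.sqrt(n_disc * p * (1 - p))   # from the observed information
    wald_p = math.erfc(abs(b / se) / math.sqrt(2))
    return b, se, wald_p


b_hat, se, wald_p = fit_conditional_logistic_pairs(18, 5)
print(f"b = {b_hat:.3f}, SE = {se:.3f}, OR = {math.exp(b_hat):.3f}, "
      f"Wald p = {wald_p:.3f}")
```

The maximizer is b = log(B/C), so exp(b) recovers the matched-pair odds ratio B/C, and the standard error equals sqrt(1/B + 1/C).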


Coding for Matched Study of Pregnancy History and CHD

SA = Number of spontaneous abortions

X = 1 if any history of spontaneous abortions (SA ≥ 1)
    0 if no history of spontaneous abortions (SA = 0)

X_ord = 0 if SA = 0
        1 if SA = 1
        2 if SA = 2
        3 if SA ≥ 3

X_1 = 1 if SA = 1, 0 otherwise
X_2 = 1 if SA = 2, 0 otherwise
X_3 = 1 if SA ≥ 3, 0 otherwise

Y = binary indicator (1 or 0) splitting pairs at an average age in the pair of 65

Z = 1 if smoker
    0 if nonsmoker


Matched Study of Pregnancy History and CHD: Fitted Logistic Regression Models

(#)  Model                                           Param.  Estimate   SD      OR      p-value   Max. log lik.
(1)  log(p/(1-p)) = a_i + bx                         b         1.281    0.506   3.600   0.011     -30.76
(2)  log(p/(1-p)) = a_i + bx + c(x × y)              b         1.609    0.775   5.000   0.038     -30.57
                                                     c        -0.629    1.029   0.533   0.541
(3)  log(p/(1-p)) = a_i + bx + cz                    b         1.338    0.521   3.813   0.010     -30.60
                                                     c         0.279    0.501   1.322   0.577
(4)  log(p/(1-p)) = a_i + bx + cz + d(x × z)         b         1.039    0.627   2.825   0.097     -30.27
                                                     c        -0.002    0.609   0.998   0.998
                                                     d         0.819    1.027   2.267   0.426
(5)  log(p/(1-p)) = a_i + b1 x1 + b2 x2 + b3 x3      b1        1.252    0.654   3.496   0.056     -29.96
                                                     b2        0.648    0.734   1.912   0.377
                                                     b3        2.173    1.099   8.786   0.048
(6)  log(p/(1-p)) = a_i + b(x_ord)                   b         0.589    0.251   1.802   0.019     -31.14
(7)  log(p/(1-p)) = a_i + b(SA)                      b         0.473    0.215   1.605   0.028     -30.89
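
The maximized log likelihoods in the table can be compared with likelihood ratio tests, treating the conditional likelihood like a conventional one. The Python sketch below is a minimal illustration under two assumptions: that the listed models were fitted to the same set of pairs, and that the pairs of models compared are nested in the way described in the comments. The helper names `chi2_sf` and `lr_test` are hypothetical, and the tail probabilities use closed forms that hold only for 1 and 2 degrees of freedom.

```python
import math

# Maximized log likelihoods copied from the fitted-model table.
LOGLIK = {1: -30.76, 2: -30.57, 3: -30.60, 4: -30.27,
          5: -29.96, 6: -31.14, 7: -30.89}


def chi2_sf(x, df):
    """Upper tail of a chi-square distribution (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def lr_test(small, big, df):
    """Likelihood ratio statistic and p-value for nested models."""
    stat = 2 * (LOGLIK[big] - LOGLIK[small])
    return stat, chi2_sf(stat, df)


comparisons = [
    (1, 2, 1),  # add the x-by-y interaction
    (1, 3, 1),  # add smoking
    (3, 4, 1),  # add the x-by-z interaction
    (1, 5, 2),  # separate effects for each SA category
    (6, 5, 2),  # departure from a linear trend in x_ord
]
results = {(s, b): lr_test(s, b, df) for s, b, df in comparisons}
for (s, b), (stat, p) in results.items():
    print(f"model ({s}) vs ({b}): LR = {stat:.2f}, p = {p:.3f}")
```

For the single-parameter comparisons, the likelihood ratio p-values land close to the Wald p-values listed for the added term, as expected when both tests address the same hypothesis.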


Matched Study of Birth Order and RDS: Fitted Logistic Regression Models

Matched cohort study of RDS in twins
221 twin pairs: matched on everything common to twins!
Birth order is a known risk factor: can it be explained by other factors?

                          First Born
                          RDS    No RDS
Second Born   RDS          24      31
              No RDS        8     158
                                          221

X is birth order (1 is second born); Y is delivery mode (1 is vaginal)

OR = 31/8 = 3.9

McNemar's test statistic = 13.6, P = 0.0002

(#)  Model                                   Param.  Estimate   SD      OR      p-value   Max. log lik.
(1)  log(p/(1-p)) = a_i + bx                 b         1.355    0.397   3.875   0.001     -19.79
(2)  log(p/(1-p)) = a_i + bx + c(x × y)      b         0.336    0.586   1.400   0.57      -17.57
                                             c         1.743    0.847   5.714   0.04

Caution: natural matched pairs
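
Model (2) allows the birth-order odds ratio to differ by delivery mode: the odds ratio for second versus first born is exp(b) when y = 0 and exp(b + c) when y = 1 (vaginal delivery). The Python sketch below works through that arithmetic from the tabulated estimates and adds a likelihood ratio test of the interaction from the two maximized log likelihoods. The variable names are illustrative, and the results inherit the rounding of the tabulated values.

```python
import math

# Estimates and log likelihoods copied from the RDS fitted-model table.
b, c = 0.336, 1.743
loglik_model1, loglik_model2 = -19.79, -17.57

# Birth-order odds ratio within each delivery mode under model (2).
or_y0 = math.exp(b)        # y = 0
or_y1 = math.exp(b + c)    # y = 1 (vaginal delivery)

# Likelihood ratio test of c = 0 (1 df); for 1 df the chi-square
# upper tail is erfc(sqrt(x / 2)).
lr_stat = 2 * (loglik_model2 - loglik_model1)
lr_p = math.erfc(math.sqrt(lr_stat / 2))

print(f"OR for second born, y = 0: {or_y0:.2f}")
print(f"OR for second born, y = 1: {or_y1:.2f}")
print(f"LR test of interaction: {lr_stat:.2f}, p = {lr_p:.3f}")
```

The product of the two tabulated odds ratios, 1.400 x 5.714, gives the same y = 1 odds ratio of about 8.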