Research Question
• Are nursing homes dangerous for seniors? Does admission to a nursing home increase the risk of death in adults over 65 years of age when controlling for age, gender, race, and number of emergency room visits?

Propensity Score Matching
or
Do nursing homes kill you?

ANNMARIA DE MARS, PH.D. & CHELSEA HEAVEN
THE JULIA GROUP

WHY YOU NEED IT

TWO NON-EQUIVALENT GROUPS
• Patients in specialized units
• People who attend a fundraising event

Any time you can ask the question:

Is there a difference on OUTCOME between levels of “treatment” A, controlling for X, Y and Z?

Examples

OUTCOME     “TREATMENT” LEVELS                   COVARIATES
DROP OUT    PUBLIC, PRIVATE                      INCOME, PARENT EDUCATION, GR. 8 ACHIEVEMENT
BMI         DAILY SOFT DRINKS, NO SOFT DRINKS    GENDER, AGE, RACE, EXERCISE FREQ.
DEATH       LIVES AT HOME, NURSING HOME          AGE, GENDER, TOTAL ER VISITS

1. Make sure there are pre-existing differences

(Thank you, Captain Obvious)

2a. Decide on covariates

• Are the differences pre-existing or could they possibly be due to the different “treatment” levels?

• Race and gender are good choices for covariates. If more students at private vs public schools are black or female, the schooling probably didn’t cause that

• Differences in grade 10 math scores may be a result of the type of school

2b. Decide on covariates

Don’t use your outcome variable as one of your covariates

3. Run logistic regression to generate propensity scores

PROC LOGISTIC DATA = datasetname ;
  CLASS categorical variables ;
  MODEL dependent = list-of-covariates ;
  OUTPUT OUT = newdataset PREDICTED = propensity-score ;

4. Select matching method

1. Quintiles
2. Nearest neighbors
3. Calipers

ALL OF THE ABOVE CAN BE DONE EITHER WITH OR WITHOUT REPLACEMENT

5. Run matching program & test its effectiveness

6. Run your analysis using the matched data set
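A minimal sketch of what this last step might look like, using the data set and variable names from the nursing home example later in this deck (study.psm_sample, dthflag, athome, quintile). The exact MODEL statement is an assumption, inferred from the effects listed in the "after" odds ratio table; it is not shown in the original slides.

```sas
/* Sketch only: outcome model on the matched sample.          */
/* Names come from the example in this deck; the model        */
/* specification itself is an assumption.                     */
PROC LOGISTIC DATA = study.psm_sample ;
  CLASS athome ;                      /* treatment indicator   */
  MODEL dthflag = athome quintile ;   /* death on treatment,   */
                                      /* adjusting for quintile */
RUN ;
```

Without the DESCENDING option, PROC LOGISTIC models the probability of the lower ordered response value, here dthflag = 0.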

An actual example

Do nursing homes kill you?

Our data

Kaiser Permanente Study of the Oldest Old, 1971-1979 and 1980-1988: [California]

DEPENDENT VARIABLE:
Dthflag = 1 if died during study period
          0 if alive at end of study period

TREATMENT VARIABLE:
athome = 1 if lived at home continuously
         0 if admitted to nursing home any time during study period

Before matching

Frequency (Column %)

                 AT HOME
DIED      NO              YES             TOTAL
=====     ============    ============    ============
NO          184 (14.6)    2,486 (52.6)    2,670 (44.6)
YES       1,077 (85.4)    2,239 (47.4)    3,316 (55.4)
TOTAL     1,261           4,725           5,986
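As a worked check on the table above, the unadjusted odds ratio of death for people admitted to a nursing home versus people who stayed at home can be computed directly from the four cell counts. This is only arithmetic on the numbers already shown, not a new result:

```latex
\[
\mathrm{OR}_{\text{death}}
  = \frac{1077 / 184}{2239 / 2486}
  = \frac{1077 \times 2486}{184 \times 2239}
  = \frac{2\,677\,422}{411\,976}
  \approx 6.5
\]
```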

Covariates *

• AGE
• RACE
• GENDER
• TOTAL Emergency Room VISITS **

* Three out of four were DEFINITELY pre-existing differences
** Proxy for health

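PROC LOGISTIC: Create propensity scores

PROC LOGISTIC DATA = saslib.old ;
  CLASS athome race sex ;
  MODEL athome = race sex age_comp vissum1 ;
  OUTPUT OUT = study.allpropen PREDICTED = prob ;

NOTE: No DESCENDING option, so PROC LOGISTIC models the lower ordered value of the response. Here prob is the predicted probability that athome = 0 (admitted to a nursing home).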

ODDS Ratios

Yes, pre-existing differences

TYPE 3 ANALYSIS OF EFFECTS

Effect      DF    Wald Chi-Square    Pr > ChiSq
RACE         4            18.7017        0.0009
SEX          1            12.5424        0.0004
age_comp     1           412.8103        <.0001
VISSUM1      1           212.9695        <.0001

QUINTILE MATCHING
EXAMPLE ONE

Part on creating quintiles blatantly copied (almost) from
http://www.pauldickman.com/teaching/sas/quintiles.php

Calculate Quintile Cutpoints

PROC UNIVARIATE DATA = study.allpropen ;
  VAR prob ;
  OUTPUT OUT = quintile PCTLPTS = 20 40 60 80 PCTLPRE = pct ;

Remember the dataset we created with the predicted probabilities saved in it?

PROC UNIVARIATE DATA = study.allpropen ;
  VAR prob ;
  *** predicted probability as variable ;
  OUTPUT OUT = quintile PCTLPTS = 20 40 60 80 PCTLPRE = pct ;
  *** output to a dataset named quintile ;
  *** create four variables at these percentiles ;
  *** with the prefix pct ;

/* write the quintiles to macro variables */

data _null_ ;
  set quintile ;
  call symput('q1', pct20) ;
  call symput('q2', pct40) ;
  call symput('q3', pct60) ;
  call symput('q4', pct80) ;
run ;

Just because I am too lazy to write down the percentiles

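Create quintiles

data STUDY.AllPropen ;
  set STUDY.AllPropen ;
  if prob = . then quintile = . ;
  else if prob le &q1 then quintile = 1 ;
  else if prob le &q2 then quintile = 2 ;
  else if prob le &q3 then quintile = 3 ;
  else if prob le &q4 then quintile = 4 ;
  else quintile = 5 ;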

Quintiles

Quintile    Frequency    Percent    Cumulative Frequency    Cumulative Percent
1                1075      19.76                    1075                 19.76
2                1101      20.24                    2176                 40.00
3                1088      20.00                    3264                 60.00
4                1088      20.00                    4352                 80.00
5                1088      20.00                    5440                100.00

The matching part

Try to control your excitement

Create case & control data sets

DATA small large ;
  SET study.allpropen ;
  IF athome = 0 THEN OUTPUT small ;
  ELSE IF athome = 1 THEN OUTPUT large ;

Create data set of sampling percentages

PROC FREQ DATA = small ;
  TABLES quintile / OUT = samp_pct ;

Quintiles in smaller data set

Quintile    Frequency    Percent    Cumulative Frequency    Cumulative Percent
1                  50       4.06                      50                  4.06
2                 115       9.33                     165                 13.39
3                 208      16.88                     373                 30.28
4                 338      27.44                     711                 57.71
5                 521      42.29                    1232                100.00

Create sampling data set

DATA samp_pct ;
  SET samp_pct ;
  _NSIZE_ = 1 ;                  /* just here to make it easy to modify */
  _NSIZE_ = _NSIZE_ * COUNT ;
  DROP PERCENT ;

PROC SURVEYSELECT

• The SAMPSIZE= input data set can provide stratum sample sizes in the _NSIZE_ variable.
• STRATA groups should appear in the same order in the secondary data set as in the DATA= data set.

SELECT RANDOM SAMPLE

PROC SORT DATA = large ;
  BY quintile ;

PROC SURVEYSELECT DATA = large SAMPSIZE = samp_pct OUT = largesamp ;
  STRATA quintile ;
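When no METHOD= option is given, PROC SURVEYSELECT uses simple random sampling without replacement. The sketch below spells that out and adds a SEED= so the same matched sample can be drawn again. The seed value is an arbitrary, hypothetical choice and is not part of the original program.

```sas
/* Sketch only: same selection as above, with the sampling    */
/* method made explicit and a fixed seed for reproducibility. */
/* The seed value 20240101 is an arbitrary assumption.        */
PROC SURVEYSELECT DATA = large
                  METHOD = SRS            /* without replacement */
                  SAMPSIZE = samp_pct     /* _NSIZE_ per stratum */
                  SEED = 20240101
                  OUT = largesamp ;
  STRATA quintile ;
RUN ;
```

Swapping in METHOD = URS would sample with replacement instead, which corresponds to the "with or without replacement" choice listed under step 4.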

Concatenate data sets

DATA study.psm_sample ;
  SET largesamp small ;

Did it work?

             Before                           After
Variable     AT Home    NOT Home    Prob      AT Home     NOT Home    Prob
Age          75.0       79.3        .0001     79.2        79.3        .60
ER visits    4.5        2.4         .0001     4.5 ****    3.8 ****    .0001
Female       49%        54%         .01       52%         54%         .36
Race                                .0001                             .97

** P < .01    **** P < .0001
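The slides do not show which procedures produced these balance checks. One plausible sketch is given below, assuming t-tests for the continuous covariates and chi-square tests for the categorical ones, and assuming that age_comp and vissum1 are the age and ER visit variables from the propensity model.

```sas
/* Sketch only: compare covariates across treatment groups in */
/* the matched sample. The choice of tests is an assumption.  */
PROC TTEST DATA = study.psm_sample ;
  CLASS athome ;
  VAR age_comp vissum1 ;      /* age and total ER visits       */
RUN ;

PROC FREQ DATA = study.psm_sample ;
  TABLES (sex race) * athome / CHISQ ;
RUN ;
```

Running the same two steps on study.allpropen would give the corresponding "Before" comparison.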

Before odds ratio 6.5 : 1

Effect           Point Estimate    95% Wald Confidence Limits
athome 0 vs 1             0.154    0.130    0.182

Effect           Point Estimate    95% Wald Confidence Limits
quintile                  0.661    0.610    0.716
athome 0 vs 1             0.273    0.223    0.334

AFTER ODDS RATIO = 3.7: 1
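The point estimates for athome 0 vs 1 are below 1, while the headline ratios are above 1. The two are reciprocals of each other. The reading below assumes the outcome model was also run without DESCENDING, so that the estimates describe the odds of surviving (dthflag = 0); that assumption agrees with the before-matching table, where (184/1077)/(2486/2239) is about 0.154.

```latex
\[
\frac{1}{0.154} \approx 6.5
\qquad\text{and}\qquad
\frac{1}{0.273} \approx 3.7
\]
```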
