X11 X12 X13
X21 X22 X23
X31 X32 X33
Research Question
• Are nursing homes dangerous for seniors? Does admittance to a nursing home increase risk of death in adults over 65 years of age when controlling for age, gender, race, and number of emergency room visits?
Propensity Score Matching
or
Do nursing homes kill you?
ANNMARIA DE MARS, PH.D. & CHELSEA HEAVEN
THE JULIA GROUP
WHY YOU NEED IT
TWO NON-EQUIVALENT GROUPS
• Patients in specialized units
• People who attend a fundraising event
Any time you can ask the question ….
Is there a difference on OUTCOME between levels of “treatment” A, controlling for X, Y and Z?
Examples
OUTCOME    “TREATMENT” LEVELS                  COVARIATES
DROP OUT   PUBLIC, PRIVATE                     INCOME, PARENT EDUCATION, GR. 8 ACHIEVEMENT
BMI        DAILY SOFT DRINKS, NO SOFT DRINKS   GENDER, AGE, RACE, EXERCISE FREQ.
DEATH      LIVES AT HOME, NURSING HOME         AGE, GENDER, TOTAL ER VISITS
1. Make sure there are pre-existing differences
(Thank you, Captain Obvious)
2a. Decide on covariates
• Are the differences pre-existing or could they possibly be due to the different “treatment” levels?
• Race and gender are good choices for covariates. If more students at private vs public schools are black or female, the schooling probably didn’t cause that
• Differences in grade 10 math scores may be a result of the type of school
2b. Decide on covariates
Don’t use your outcome variable as one of your covariates
3. Run logistic regression to generate propensity scores
PROC LOGISTIC DATA = datasetname ;
CLASS categorical-variables ;
MODEL dependent = list-of-covariates ;
OUTPUT OUT = newdataset PREDICTED = propensity-score ;
RUN ;
4. Select matching method
1. Quintiles
2. Nearest neighbors
3. Calipers
ALL OF THE ABOVE CAN BE DONE EITHER WITH OR WITHOUT REPLACEMENT
5. Run matching program & test its effectiveness
6. Run your analysis using the matched data set
An actual example
Do nursing homes kill you?
Our data
Kaiser Permanente Study of the Oldest Old, 1971-1979 and 1980-1988: [California]
DEPENDENT VARIABLE:
Dthflag = 1 if died during study period
          0 if alive at end of study period
Our data
TREATMENT VARIABLE
athome = 1 if lived at home continuously
         0 if admitted to nursing home any time during study period
Before matching
Frequency (Column %)
DIED     AT HOME: NO     AT HOME: YES    TOTAL
NO       184 (14.6)      2,486 (52.6)    2,670 (44.6)
YES      1,077 (85.4)    2,239 (47.4)    3,316 (55.4)
TOTAL    1,261           4,725           5,986
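As a quick check on the table above, the unadjusted odds ratio can be worked out by hand from the four cell counts. The sketch below is only arithmetic on those published counts; the variable names are made up for illustration.

```sas
/* Sketch: unadjusted odds ratio from the "Before matching" cell counts. */
/* Variable names here are hypothetical; the counts come from the table. */
data _null_ ;
  odds_nursing = 1077 / 184 ;    /* died / survived, admitted to nursing home */
  odds_home    = 2239 / 2486 ;   /* died / survived, lived at home            */
  odds_ratio   = odds_nursing / odds_home ;   /* about 6.5 */
  put "Unadjusted odds ratio: " odds_ratio 6.2 ;
run ;
```

The result agrees with the "before" odds ratio of 6.5 : 1 reported at the end of the deck.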
Covariates *
• AGE
• RACE
• GENDER
• TOTAL Emergency Room VISITS **
* Three out of four were DEFINITELY pre-existing differences
** Proxy for health
PROC LOGISTIC
PROC LOGISTIC DATA = saslib.old ;
CLASS athome race sex ;
MODEL athome = race sex age_comp vissum1 ;
OUTPUT OUT = study.allpropen PREDICTED = prob ;
RUN ;
Create propensity scores
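NOTE: No DESCENDING option, so PROC LOGISTIC models the probability of the lower response level (athome = 0, admitted to a nursing home)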
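Before matching, it can help to look at how the saved scores are distributed in each group. This is a sketch that assumes the output data set and variable names from the PROC LOGISTIC step (study.allpropen, prob, athome); it is not part of the original program.

```sas
/* Sketch: summarize the predicted probabilities separately for each group. */
/* Assumes study.allpropen with prob and athome, as created above.          */
proc means data = study.allpropen n nmiss mean min max ;
  class athome ;   /* 0 = admitted to nursing home, 1 = lived at home */
  var prob ;       /* propensity score saved by PROC LOGISTIC         */
run ;
```

If the two groups' score ranges barely overlap, there will be few usable matches in some strata.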
ODDS Ratios
Yes, pre-existing differences
TYPE 3 ANALYSIS OF EFFECTS
Effect      DF    Wald Chi-Square    Pr > ChiSq
RACE        4     18.7017            0.0009
SEX         1     12.5424            0.0004
age_comp    1     412.8103           <.0001
VISSUM1     1     212.9695           <.0001
QUINTILE MATCHING
EXAMPLE ONE
Part on creating quintiles blatantly copied (almost) from
http://www.pauldickman.com/teaching/sas/quintiles.php
Calculate Quintile Cutpoints
PROC UNIVARIATE DATA = study.allpropen ;
VAR prob ;
OUTPUT OUT = quintile PCTLPTS = 20 40 60 80 PCTLPRE = pct ;
RUN ;
Remember the dataset we created with the predicted probabilities saved in it?
PROC UNIVARIATE DATA = study.allpropen ;
VAR prob ;   *** predicted probability as variable ;
OUTPUT OUT = quintile PCTLPTS = 20 40 60 80 PCTLPRE = pct ;
*** output to a dataset named quintile, ;
*** create four variables at these percentiles ;
*** with the prefix pct ;
RUN ;
/* write the quintiles to macro variables */
data _null_ ;
set quintile ;
call symput('q1', pct20) ;
call symput('q2', pct40) ;
call symput('q3', pct60) ;
call symput('q4', pct80) ;
run ;
Just because I am too lazy to write down the percentiles
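A possible variation, not what the original program does: CALL SYMPUTX strips leading and trailing blanks from the stored value and avoids the numeric-to-character conversion note that CALL SYMPUT writes to the log. The sketch assumes the quintile data set created by PROC UNIVARIATE above.

```sas
/* Sketch: same macro variables, written with CALL SYMPUTX instead of CALL SYMPUT. */
data _null_ ;
  set quintile ;                  /* one row holding pct20, pct40, pct60, pct80 */
  call symputx('q1', pct20) ;
  call symputx('q2', pct40) ;
  call symputx('q3', pct60) ;
  call symputx('q4', pct80) ;
run ;

/* Write the cutpoints to the log to confirm they were stored. */
%put q1=&q1 q2=&q2 q3=&q3 q4=&q4 ;
```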
Create quintiles
data STUDY.AllPropen ;
set STUDY.AllPropen ;
if prob = . then quintile = . ;
else if prob le &q1 then quintile = 1 ;
else if prob le &q2 then quintile = 2 ;
else if prob le &q3 then quintile = 3 ;
else if prob le &q4 then quintile = 4 ;
else quintile = 5 ;
run ;
Quintiles
Quintile    Frequency    Percent    Cumulative Frequency    Cumulative Percent
1           1075         19.76      1075                    19.76
2           1101         20.24      2176                    40.00
3           1088         20.00      3264                    60.00
4           1088         20.00      4352                    80.00
5           1088         20.00      5440                    100.00
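A table like the one above could be produced with a one-way PROC FREQ on the new variable. This is a sketch assuming the data set and variable names from the previous step. By default PROC FREQ leaves missing values out of the table, which is presumably why the total here (5,440) is smaller than the full sample (5,986).

```sas
/* Sketch: check that the quintile variable splits the scored records into fifths. */
proc freq data = study.allpropen ;
  tables quintile ;   /* add / missing to count records with no propensity score */
run ;
```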
The matching part
Try to control your excitement
Create case & control data sets
DATA small large ;
SET study.allpropen ;
IF athome = 0 THEN OUTPUT small ;
ELSE IF athome = 1 THEN OUTPUT large ;
RUN ;
Create data set of sampling percentages
PROC FREQ DATA = small ;
TABLES quintile / OUT = samp_pct ;
RUN ;
Quintiles in smaller data set
Quintile    Frequency    Percent    Cumulative Frequency    Cumulative Percent
1           50           4.06       50                      4.06
2           115          9.33       165                     13.39
3           208          16.88      373                     30.28
4           338          27.44      711                     57.71
5           521          42.29      1232                    100.00
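Create sampling data set
DATA samp_pct ;
SET samp_pct ;
_NSIZE_ = 1 ;
_NSIZE_ = _NSIZE_ * COUNT ;
DROP PERCENT ;
RUN ;
The _NSIZE_ = 1 line is just here to make it easy to modify: change the 1 to sample a different multiple of each quintile's count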
PROC SURVEYSELECT
• The SAMPSIZE= input data set can provide stratum sample sizes in the _NSIZE_ variable
• STRATA groups should appear in the same order in the secondary data set as in the DATA= data set
SELECT RANDOM SAMPLE
PROC SORT DATA = large ;
BY quintile ;
RUN ;
PROC SURVEYSELECT DATA = large SAMPSIZE = samp_pct OUT = largesamp ;
STRATA quintile ;
RUN ;
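The sample drawn above changes from run to run unless a seed is fixed. The sketch below is a variation on the same step with the sampling method and a seed spelled out. The seed value is arbitrary and chosen only for illustration; METHOD=SRS (simple random sampling without replacement) is already the default, and METHOD=URS would sample with replacement instead.

```sas
/* Sketch: the same stratified draw, made reproducible with an explicit seed. */
proc surveyselect data = large
                  method = srs           /* without replacement (the default) */
                  sampsize = samp_pct    /* _NSIZE_ gives the size per quintile */
                  seed = 20121           /* arbitrary illustrative seed */
                  out = largesamp ;
  strata quintile ;                      /* large must be sorted by quintile */
run ;
```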
Concatenate data sets
DATA study.psm_sample ;
SET largesamp small ;
RUN ;
Did it work?
                      Before                             After
Variable     At Home   Not Home   Prob       At Home     Not Home    Prob
Age          75.0      79.3       .0001      79.2        79.3        .60
ER visits    4.5       2.4        .0001      4.5 ****    3.8 ****    .0001
Female       49%       54%        .01        52%         54%         .36
Race                              .0001                              .97
** P < .01   **** P < .0001
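Comparisons like the before and after columns above could be run with the same two procedures on each data set. This is a sketch, not the authors' program: the macro name balance is hypothetical, and the data set and variable names are assumed from the earlier steps.

```sas
/* Sketch: compare the groups on each covariate, before and after matching. */
/* The macro name is hypothetical; variables are those used in the deck.    */
%macro balance(ds) ;
  proc ttest data = &ds ;
    class athome ;
    var age_comp vissum1 ;                 /* continuous covariates  */
  run ;
  proc freq data = &ds ;
    tables (race sex) * athome / chisq ;   /* categorical covariates */
  run ;
%mend balance ;

%balance(study.allpropen)    /* before matching */
%balance(study.psm_sample)   /* after matching  */
```

A matched sample is doing its job when the group differences that were significant before matching shrink or disappear afterwards.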
Before odds ratio 6.5 : 1
Effect           Point Estimate    95% Wald Confidence Limits
athome 0 vs 1    0.154             0.130    0.182

Effect           Point Estimate    95% Wald Confidence Limits
quintile         0.661             0.610    0.716
athome 0 vs 1    0.273             0.223    0.334
AFTER ODDS RATIO = 3.7 : 1
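The "after" estimates above list both quintile and athome, which suggests a model along the following lines. This is a guess at the final step, not the authors' code; it assumes the matched data set study.psm_sample and the outcome variable dthflag described earlier.

```sas
/* Sketch: outcome model on the matched sample, adjusting for quintile. */
proc logistic data = study.psm_sample ;
  class athome ;                      /* reported as "athome 0 vs 1" */
  model dthflag = athome quintile ;   /* quintile entered as a single numeric term */
run ;
```

Without the DESCENDING option, the model estimates the odds of the lower outcome level (dthflag = 0, alive at the end of the study), so the point estimate falls below 1. Inverting it gives the odds ratio for death: 1 / 0.273 is about 3.7.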