Propensity Score Methods to Adjust for Bias in Observational Data Group... · 2018-04-06 ·...

Inst i tute for Cl in ical Evaluat ive SciencesInst i tute for Cl in ical Evaluat ive Sciences

Propensity Score Methods to Adjust for Bias in Observational Data

SAS HEALTH USERS GROUP

APRIL 6, 2018

Overview

1.What is observational data?

2.What is the propensity score?

3.Statistical adjustment using the propensity score

a)Matching on the propensity score

b)Inverse probability of treatment weighting

Follow-up

☺☺

☺ ☺ ☺☺ ☺ ☺☺

☺☺☺ ☺☺☺ ☺☺☺☺ ☺☺☺☺ ☺☺

Treatment

☺☺☺ ☺☺☺ ☺☺☺☺ ☺☺☺☺ ☺☺

Control

☺☺☺ ☺☺☺ ☺☺☺☺ ☺☺☺☺

Inclusion/exclusion

criteria

Population Study

population

Baseline Outcome

Randomized Controlled Trials

(the “gold standard”)

Characteristics of RCTs

• Randomization ensures subjects in both

treatment groups are equally matched on all

factors

• Allow causal inference

• High cost

• Often short duration and/or underpowered.

• Problems with generalizability:

• Treatment is “ideal” (high compliance, careful follow-up means that any problems may be caught early).

• Many people who are given the treatments in “real life” are excluded from the trials

• Some situations cannot be randomized.

What is Observational Data?

The choice of treatment is not under the control of the researcher - the researcher can only ‘observe’ what treatment was given.

Examples:• Data obtained using chart review• Electronic medical records• Survey data or health study data• Administrative data.

The Study

Two medications used to treat chronic obstructive lung disease (COPD)

• Long-acting anticholinergic (LAAC)• Long-acting beta-agonist (LABA)

Compare overall mortality and risk of hospital admission related to COPD

Gershon et al. Annals of Internal Medicine, 2011

Inst i tute for Cl in ical Evaluat ive Sciences

Ontario Drug

Benefit Plan

(which drug

is the person

taking)

Hospital Discharge

Database

(diagnoses)

Physician Billing

Database

(diagnoses)

Registered

Persons

Database

(age, sex, SES)

Hospital Discharge

Database

(for outcomes).

Analysis of Our StudyExposure variable is choice of drug (LAAC vs.

Outcome is time to hospitalization or death

survival analysis will be used

Bias is a concern

Bias in confounders we can measure

And bias in confounders we can’t measure, e.g., smoking,

fitness

Covariate LAAC LABA

Specialist care in last year (%)

44.7 50.5

Prior lung functiontesting (%)

69.7 74.3

Statistical Adjustment for Observational

• Propensity score methods

• Instrumental variable analysis

• And others

Propensity Score

Rosenbaum and Rubin (1983) realized the bias

from covariates can be eliminated by controlling for

a scalar-valued function (a “balancing score”)

calculated from the baseline covariates, i.e., the

propensity score

The propensity score is a way of summarizing the

information in all the prognostic variables

What is the Propensity Score?

PS = probability that a person received one treatment

(rather than the other), given that person’s observed

covariates

Calculated using logistic regression to estimate the

propensity for a person to be prescribed a LAAC

(rather than a LABA)

proc logistic descending;

model LAAC = age sex diabetes hypertension

rural_res incquint ...;

output out = score predicted = ps;

proc PSMATCH

Calculating the Propensity Score

proc logistic descending;

model LAAC = age sex diabetes

hypertension rural_res incquint ....;

Patients predicted, based on their characteristics, to

be likely to be prescribed a LAAC will have a high

propensity score

Patients predicted to be unlikely to be prescribed a

LAAC (likely to be prescribed a LABA instead) will

have a low propensity score

Variable Selection

1. All measured baseline covariates

2. Baseline covariates associated with treatment

choice

3. Baseline covariates associated with the

outcome

4. Baseline covariates associated with both

treatment assignment and outcome

Propensity Score Methods

1. Covariate adjustment using the Propensity Score

2. Stratification on the PS

3. Matching on the PS.

4. Inverse probability weighting

Matching

1. Create a matched sample based on logit(PS)

2. Assess balance between treated and untreated

subjects in the matched sample.

– The test of a good propensity score model is

how well it balances the measured variables

between treated and untreated subjects.

3. For unbalanced variables, add interactions or

higher order terms to the propensity score

logistic regression, recalculate the propensity

score and repeat the process.

Before Matching After Matching

BaselineCovariate

LAACN=28,563

LABAN=17,840

Standarddifference

LAACN=15,532

LABAN=15,532

Standard difference

Lung function testing (%)

69.7 74.3 10.2 72.4 73.0 1.3

Specialist care previous year (%)

44.7 50.5 11.6 49.0 49.1 0.2

Also using inhaledcorticosteroid

48.3 52.1 7.7 51.1 51.3 0.3

Co-diagnosis of CHF

40.2 38.2 4.1 39.0 39.2 0.4

Hospitalizedfor COPD in previous 6 months

8.0 7.3 2.5 7.8 7.8 0.1

Analysis of Matched Data

Analysis of Matched Data Must Incorporate the Matching

Means Paired t-test

Proportions McNemar’s test

Survival models Stratify on matched pairs

Logistic regression GEE estimation to account for matched pairs

Matched Analyses ...

• Compares patients who are all potential candidates for both treatments.

• Matching pairs patients who are similar with respect to their propensity score matches on many confounders simultaneously

• Unmatched individuals are discarded

• The resulting matched sample may not be representative of all patients receiving treatment

Interpretation of a Matched Analysis

Estimates the Average Treatment Effect for the

Treated (ATT) – the average treatment effect for

those who ultimately received the treatment

Inverse Probability of Treatment

Weighting Using the Propensity Score

The Weights

𝑊 =𝑍

𝑃𝑆+

1 − 𝑍

1 − 𝑃𝑆

where Z = 1 for the treatment group and 0 for the

control group

The Weights

Recall that our PS is the probability of receiving a

LAAC (rather than a LABA)

𝑊 =𝐿𝐴𝐴𝐶

𝑃𝑆+

1 − 𝐿𝐴𝐴𝐶

1 − 𝑃𝑆

where LAAC is a 0/1 variable.

The Weights

𝑊 =𝐿𝐴𝐴𝐶

𝑃𝑆+

1 − 𝐿𝐴𝐴𝐶

1 − 𝑃𝑆

For those who received LAAC (LAAC = 1), weight =

1 / (probability of receiving LAAC):

𝑊 =1

𝑃𝑆For those who received LABA (LAAC = 0), weight =

1 / (probability of receiving LABA):

𝑊 =1

1 − 𝑃𝑆

The Weights

Similar to survey weights

Respondents from oversampled groups are

assigned low weights

– Selection probability = 1% weight = 1 / 0.01

Respondents from undersampled groups are

assigned high weights

– Selection probability = 0.2% weight = 1 /

0.002 = 500

Data Set to Estimate the Outcome of

Treatment

𝑊 =𝑍

𝑃𝑆+

1 − 𝑍

1 − 𝑃𝑆

ID Ztreatment = 1

control = 0

PS Weight1/PS

Outcome under

treatment

1 treatment 0.33 1 / 0.33 = 3 Y1

2 control 0.33 0 ?

3 control 0.33 0 ?

4 treatment 0.67 1 / 0.67 = 1.5 Y4

5 treatment 0.67 1 / 0.67 = 1.5 Y5

6 control 0.67 0 ?

Data Set to Estimate the Outcome of

Treatment Estimated average outcome of treatment = 1

𝑁σ𝑖=1𝑁 𝑤𝑖 × 𝑌𝑖 where 𝑤𝑖 =

𝑃𝑆for treated people

and 0 for controls.ID Z

treatment = 1control = 0

PS Weight1/PS

Outcome under treatment

1 treatment 0.33 1 / 0.33 = 3 Y1

2 control 0.33 0 ?

3 control 0.33 0 ?

4 treatment 0.67 1 / 0.67 = 1.5 Y4

5 treatment 0.67 1 / 0.67 = 1.5 Y5

6 control 0.67 0 ?

Data Set to Estimate the Outcome for

Controls

𝑊 =𝑍

𝑃𝑆+

1 − 𝑍

1 − 𝑃𝑆

ID Ztreatment = 1;

control = 0

PS Weight Outcome under control

1 treatment 0.33 0 ?

2 control 0.33 =1 / (1 – 0.33) = 1.5 Y2

3 control 0.33 =1 / (1 – 0.33) = 1.5 Y3

6 control 0.67 1 / (1 – 0.67) = 3 Y6

Data Set to Estimate the Outcome for

ControlsEstimated average effect for controls =1

𝑁σ𝑖=1𝑁 𝑤𝑖 × 𝑌𝑖 where 𝑤𝑖 =

1 − 𝑃𝑆for people in the

control group and 0 for people in the treated group

ID Ztreatment = 1;

control = 0

PS Weight Outcome

1 Treatment 0.33 0 ?

2 control 0.33 1 / (1 – 0.33) = 1.5 Y2

3 control 0.33 1 / (1 – 0.33) = 1.5 Y3

6 control 0.67 1 / (1 – 0.67) = 3 Y6

Estimating the Treatment Difference

Estimated difference (treatment A – treatment B) =

𝑁σ𝑖=1𝑁 𝑍𝑖×𝑌𝑖

𝑃𝑆-1

𝑁σ𝑖=1𝑁 1 − 𝑍𝑖 ×𝑌𝑖

1 −𝑃𝑆

Estimate of the variance

– Robust sandwich type variance estimators

– Bootstrapping

May trim very large weights (propensity score < 1st

percentile or > 99th percentile)

Interpretation of an Inversely Weighted

Analysis

Estimates the Average Treatment Effect (ATE): an

estimate of the treatment effect, if it were applied to

the entire population

It’s Magic

Well, Not Quite

The analyses make no claims to balance

unmeasured covariates

The analyses remove hidden biases only to the

extent that the unmeasured variables are correlated

with the available covariates

Sensitivity analyses can help quantify the possible

effects of unmeasured confounders.

Drawbacks to the Propensity Score

Available data is probably missing key covariates

(e.g., living arrangements, smoking history)

Definition of the baseline time may be difficult (it

should be the time at which the decision about

treatment was made).

Does not eliminate the need to think about patient

identification and selection

Advantages of the Propensity Score

Reduced dimensionality of covariates (important for

rare outcomes)

Can demonstrate that the two groups are similar on

all measured covariates

Like an RCT, does not predict the outcome for a

person with a given set of characteristics

Like an RCT, does not tell you the role of the

covariates in predicting the outcome

Like an RCT, can build planned sub-analyses into

the design

Advantages of Observational Studies

Useful when it is not feasible to use an RCT

– Unethical to withhold treatment

– Exposure believed to be harmful

– Patients will not agree to be randomized

– RCT too expensive

Generalizable (all patients, all providers)

Allows studies of rare events, and studies with long

follow-up times

Disadvantages of Observational Studies

Researcher has no control over assignment of

subjects to treatments

Researcher often has no control over what

covariates are available, their definitions, or the

quality of their measurement

ReferencesAustin PC. An introduction to propensity score methods for reducing the

effects of confounding in observational studies. Multivariate Behavioural

Research. 2011; 46: 399 - 424.

Austin PC, Stuart EA. Moving towards best practice when using inverse

probability of treatment weighting (IPTW) using the propensity score to

estimate causal treatment effects in observational studies. Statistics in

Medicine. 2015; 34: 3661 – 3679.

Anything else written by Peter Austin

Introducing the PSMATCH procedure for propensity score analysis:

https://www.youtube.com/watch?v=JM2uu39zEAs (a very good

introduction to both propensity scores and matching as well as the

PSMATCH procedure)

Inst i tute for Cl in ical Evaluat ive Sciences 41

Thank You

Propensity Score Methods to Adjust for Bias in Observational Data Group... · 2018-04-06 ·...

Documents

Propensity Score Models

Covariate balancing propensity score · propensity score methods, including matching and weighting. Although matching exactly on the propensity score is typically impossible, methods

Comparison of Propensity Score Methods and Covariate ...Stuart J. Pocock, PHDa ABSTRACT Propensity scores (PS) are an increasingly popular method to adjust for confounding in observational

Stratiﬂcation and Weighting Via the Propensity Score in …davidian/statinmed.pdf · 2004-10-19 · Stratiﬂcation and Weighting Via the Propensity Score in Estimation of Causal

Propensity Scores

be experiencing social isolation and loneliness (KID) to ... · increased propensity for residents to be experiencing social isolation and loneliness. 3.1.1 Weighting Consideration

Propensity Score Matching

APPENDIX: PS MATCHING IN R - Information …bkl29/pres/SPER_methods_R_matching.pdfAPPENDIX: PS MATCHING IN R (with attached dataset and code) ... misestimation of the propensity score

Propensity Score Weighting with Mismeasured Covariates: An

Todd Wagner, PhD October 2013 Propensity Scores. Outline 1. Background on assessing causation 2. Define propensity score (PS) 3. Calculate the PS 4. Use

Propensity Score Weighting with Multilevel Datafl35/papers/ps_cluster_SIM12.pdfPropensity Score Weighting with Multilevel Data Fan Li 1, Alan M. Zaslavsky 2, Mary Beth Landrum 1 Department

Step-by-Step Guidelines for Propensity Score Weighting ... · 2 Four key steps 1) Choose the primary treatment effect of interest (ATE or ATT) 2) Estimate propensity score (ps) weights

Propensity Score Matching. Outline Describe the problem Introduce propensity score matching as one solution Present empirical tests of propensity score

Propensity Models Versus Weighting Cell Approaches to ... · Logistic RP 0.008619 0.010184 GEM Case 2 0.008862 0.010337 0.001455 UWE 1.5953 1.5971 1.5944 1.5956 Mean weight 263.8699

Propensity Score Weighting with Multilevel Datafl35/talk/psclustertalk.pdf · 2012. 11. 5. · Propensity score methods for multilevel data I Propensity score methods for multilevel

Methods to control for confounding - PROTECT Home · Applying the propensity score 35 • Options: PS control, stratification & matching • After estimating PS for each person, use

Balancing Covariates via Propensity Score Weightingfl35/papers/psweight_final.pdf · 2016-11-09 · Balancing Covariates via Propensity Score Weighting Fan Li Kari Lock Morgan Alan

Stratiﬂcation and Weighting Via the Propensity Score in ...davidian/statinmed.pdf · individual observations [8, 9]. In this paper, we review approaches using stratiﬂcation and

Implementing Propensity Score Methods: A Review of the ...megan-schuler-1fuq.squarespace.com/s/Schuler-JSM-PS-software_fina… · Implementing Propensity Score Methods: A Review of

Quantile Treatment Effects in Difference in Differences ... · methods; and, at the least, doing so would be computationally challenging. Moreover, a similar propensity score re-weighting