
Division of Pharmacoepidemiology and Pharmacoeconomics

Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School

Alternative approaches for confounding adjustment in observational studies using weighting based on the propensity score

Rishi J Desai, MS, PhD

Assistant Professor of Medicine


Brigham & Women’s Hospital/Harvard Medical School

@Rishidesai11



Outline

• Confounding and propensity score (PS) basics

• PS weighting- some general principles

• PS weighting- selecting among alternatives

• Case-example

• Summary



CONFOUNDING AND PROPENSITY SCORE BASICS



Sources of errors in observational studies

• Random error

• Systematic error

o Selection bias: related to the manner in which patients are recruited into the study or are retained during the course of the study

o Information bias: related to the manner in which information about important study variables (exposure or outcomes) is collected

o Confounding bias: the relationship between the exposure and outcome of interest is due, completely or in part, to another variable (the “confounder”)



Confounding

• Confounders- variables that simultaneously influence treatment selection and risk for the outcome of interest

[Diagram: potential confounders (e.g., age) point to both the exposure (treatment A vs B) and the outcome of interest (e.g., mortality); the diagram is annotated “Conditioning on the propensity score”]



Taxonomy of Confounding Control

[Figure annotation: amenable to propensity techniques]

Schneeweiss. Pharmacoepidemiol Drug Saf 2006; 15; 291-303



Propensity scores (PS)

• What

o It is the conditional probability of receiving a particular treatment given a vector of observed covariates

P(Treatment) ~ age + gender + DM + HTN + ...

o Predicted probability- quantity between 0 and 1

o Depends on availability of measured patient characteristics

Rosenbaum & Rubin. Biometrika 1983; 70 (1): 41-55




• Why

o PS offers a one-dimensional summary of multidimensional covariates, such that when the propensity score is balanced across the treatment and comparison groups, the distributions of all measured covariates are balanced in expectation across the two groups

o Efficient confounding control





• How

o Build a model (logistic regression, most commonly) with treatment as the dependent variable and measured patient characteristics as independent variables, with the goal of identifying patients’ likelihood of receiving a treatment

o Through modeling of available data on patient characteristics and actually received treatments, researchers try to mimic the prescriber’s decision process for selecting a treatment for a particular patient
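The modeling step above can be sketched in a few lines. This is only an illustration under stated assumptions: the toy cohort, the two covariates, and the function names `fit_logistic` and `propensity_scores` are hypothetical, and the model is fitted by plain gradient ascent rather than the routines a statistical package would use.

```python
import math

def fit_logistic(X, t, lr=0.1, n_iter=5000):
    """Fit P(T=1 | X) by gradient ascent on the logistic log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)  # intercept followed by one coefficient per covariate
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            resid = ti - 1.0 / (1.0 + math.exp(-z))  # observed minus predicted
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    """Predicted probability of treatment for each patient."""
    scores = []
    for xi in X:
        z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores

# Hypothetical toy cohort: (age in decades, centered; diabetes indicator)
X = [(-2.0, 0), (-1.5, 0), (-1.0, 1), (-0.5, 0), (0.0, 0),
     (0.0, 1), (0.5, 0), (1.0, 1), (1.5, 0), (2.0, 1)]
treated = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]

beta = fit_logistic(X, treated)
ps = propensity_scores(X, beta)
```

Each entry of `ps` is a predicted probability between 0 and 1; in a real study the covariate list would be far longer (the case example later in this deck uses 72 covariates).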




PS WEIGHTING- SOME GENERAL PRINCIPLES




Matching

• Find one (or more) reference patients for a treated patient based on the proximity in the PS

Regression adjustment

• Treatment effect is derived after adjusting for the PS in the outcome model as a continuous variable

Weighting

• Use PS to calculate weights, which create a weighted pseudopopulation in which treatment groups are balanced with respect to distribution of all confounding variables

Stratification (subclassification)

• Classify patients into groups based on PS distribution of the cohort (e.g., 5 groups based on PS quintiles)

• Treatment effect is computed within each subclass and an overall effect is derived based on a weighted average of effects in the subclasses



Properties of alternative PS adjustment approaches

[Table comparing four approaches (matching, stratification (traditional), regression, weighting) on the properties below; cell entries are not preserved here]

• Bias control*

• Maximize precision (keeps all observations in the analysis?)

• Transparent evaluation and reporting of balance

• Robust against phenomenon of increasing covariate imbalance after conditioning (“PSM paradox”)#

• Flexibility in targeting specific populations of interest

* True under non-exceptional circumstances when methods are tuned to the problem at hand and applied carefully

Stürmer et al. Am J Epidemiol 2005;161:891-8

Elze et al. J Am Coll Cardiol. 2017 Jan 24;69(3):345-357

Vansteelandt & Daniel. Stat. Med. 2014, 33 4053–4072

# King and Nielsen. Political Analysis. 2019;27(4), 435-454

Ripollone et al. Am J Epidemiol. 2018;187(9):1951–1961



Target of inference (Estimand)

The patient population to which the estimated treatment effect applies

Key question to consider when deciding the target of inference for a specific study: would it be feasible to treat all eligible patients included in the study with the treatment of interest?

• Yes: the target of inference might be defined as the average treatment effect (ATE)

o For instance, a study comparing the effectiveness of a newly approved treatment with an existing treatment for a certain condition when both treatments are indicated as exchangeable options, e.g., dabigatran vs warfarin for atrial fibrillation

• No: only patients with certain characteristics who actually received the treatment would be ideal candidates for treatment; then the target of inference might be defined as the average treatment effect among the treated population (ATT)

o For instance, a study evaluating the safety of antipsychotic drugs for pregnant women with schizophrenia/bipolar disorder. Not all pregnant patients might be considered for treatment due to unclear teratogenicity profile; only women with greater severity of these conditions would be treated

• In some cases, the interest is in targeting ATE only among a subset of patients with certain characteristics leading to clinical equipoise, e.g., average treatment effect in the overlap population (ATO)

• When there is no heterogeneity in treatment effect, ATE and ATT coincide
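The three estimands can be written compactly in potential-outcome notation. The notation is an assumption introduced here for illustration and is not used on the slides: $Y^{1}$ and $Y^{0}$ denote a patient's outcomes under the treatment of interest and under the reference treatment, $T$ the treatment actually received, and $e(X) = P(T = 1 \mid X)$ the propensity score.

```latex
\begin{align*}
\text{ATE} &= E\left[Y^{1} - Y^{0}\right] \\
\text{ATT} &= E\left[Y^{1} - Y^{0} \mid T = 1\right] \\
\text{ATO} &= \frac{E\left[e(X)\{1 - e(X)\}\,(Y^{1} - Y^{0})\right]}{E\left[e(X)\{1 - e(X)\}\right]}
\end{align*}
```

If the individual effect $Y^{1} - Y^{0}$ equals the same constant for every patient, each expression reduces to that constant, which is why the estimands coincide when there is no heterogeneity in treatment effect.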



PS WEIGHTING- SELECTING AMONG ALTERNATIVES





• Misspecification takes multiple forms: a missing interaction between covariates, a missing important covariate, or inappropriate functional forms (for continuous covariates)

• A simple diagnostic step of checking covariate balance in weighted samples can alert researchers to potential model misspecification

• In theory, alternative PS models (machine learning based, e.g., boosting, random forests, neural networks; or models targeting balance, e.g., covariate balancing PS) could be useful in protecting against misspecification because they are adept at handling interactions and non-linearity in continuous variables by default; in practice, evidence generally suggests that a logistic regression model is adequate in most scenarios*

• The impact of model misspecification could vary across PS weighting approaches (more on this later)

* Wyss et al. Am J Epidemiol. 2014;180(6):645–655

Setoguchi et al. Pharmacoepidem Drug Saf 2008; 17: 546–555
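The balance check mentioned above is often summarized with a weighted standardized mean difference. The sketch below is one common formulation, not necessarily the one used in the cited papers; the function names and the toy data are hypothetical, and the pooled standard deviation is taken from the weighted samples (some authors use the unweighted one instead).

```python
import math

def weighted_mean(x, w):
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

def weighted_var(x, w):
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x)) / sum(w)

def standardized_difference(x, t, w):
    """Weighted standardized mean difference of covariate x between groups."""
    x1 = [xi for xi, ti in zip(x, t) if ti == 1]
    w1 = [wi for wi, ti in zip(w, t) if ti == 1]
    x0 = [xi for xi, ti in zip(x, t) if ti == 0]
    w0 = [wi for wi, ti in zip(w, t) if ti == 0]
    pooled_sd = math.sqrt((weighted_var(x1, w1) + weighted_var(x0, w0)) / 2)
    return (weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd

# Hypothetical toy data: a binary covariate that drives treatment choice
x = [1] * 8 + [0] * 8
t = [1] * 6 + [0] * 2 + [1] * 2 + [0] * 6
ps = [0.75] * 8 + [0.25] * 8  # assumed known here for simplicity

unweighted = [1.0] * len(x)
inverse_probability = [1 / p if ti == 1 else 1 / (1 - p) for p, ti in zip(ps, t)]

smd_crude = standardized_difference(x, t, unweighted)
smd_weighted = standardized_difference(x, t, inverse_probability)
```

In this toy cohort the crude difference is large and the weighted difference is essentially zero. A residual imbalance after weighting (an absolute value above 0.1 is a commonly cited rule of thumb) would be a prompt to revisit the PS model.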



“Sufficient overlap”

• In most situations, investigators need to make somewhat subjective decisions

• A simple recommendation is to implement non-overlap trimming and determine sufficiency of overlap based on the % of the sample excluded

[Figure: three panels of PS distributions (x-axis: PS, 0 to 1; y-axis: %). A: Extensive treatment selection; B: Moderate treatment selection; C: Little preference]
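A minimal sketch of non-overlap trimming, assuming the usual definition in which patients are kept only if their PS lies inside the range covered by both treatment groups. The function name `trim_non_overlap` and the toy values are hypothetical.

```python
def trim_non_overlap(ps, t):
    """Flag patients inside the PS range shared by both treatment groups."""
    ps_treated = [p for p, ti in zip(ps, t) if ti == 1]
    ps_reference = [p for p, ti in zip(ps, t) if ti == 0]
    lower = max(min(ps_treated), min(ps_reference))
    upper = min(max(ps_treated), max(ps_reference))
    keep = [lower <= p <= upper for p in ps]
    pct_excluded = 100.0 * keep.count(False) / len(ps)
    return keep, pct_excluded

# Hypothetical toy values: five reference patients followed by five treated
ps = [0.05, 0.20, 0.35, 0.50, 0.65, 0.30, 0.45, 0.60, 0.80, 0.95]
t = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

keep, pct_excluded = trim_non_overlap(ps, t)
```

Here the shared range is 0.30 to 0.65, so four of the ten patients fall outside it. How large an excluded share still counts as sufficient overlap remains the somewhat subjective decision noted above.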





Inverse probability treatment weights (IPTW)

Key features

• Clear target of inference (mimics target of inference in randomized trials)

• Easily extends to >2 treatment groups

• Easily extends to time-varying settings (marginal structural models)

Cautions

• The score is directly used to create weights, which commonly leads to unstable weights and requirement for stabilization or truncation

• Generally has less robust performance in presence of PS model misspecification*

Weight calculation

• Treated patients: $\frac{1}{PS}$

• Reference patients: $\frac{1}{1 - PS}$

* Waernbaum Stat Med 2012;31:1572-81

Lee et al. PLoS One 2011;6:e18174
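The weight calculation, together with the stabilization and truncation mentioned under the cautions, might look like the sketch below. The function names are hypothetical; stabilization by the marginal probability of the received treatment and nearest-rank percentile truncation are common choices assumed here, not details given on the slide.

```python
import math

def iptw(ps, t):
    """1/PS for treated patients, 1/(1 - PS) for reference patients."""
    return [1.0 / p if ti == 1 else 1.0 / (1.0 - p) for p, ti in zip(ps, t)]

def stabilized_iptw(ps, t):
    """Multiply each weight by the marginal probability of the received treatment."""
    p_treat = sum(t) / len(t)
    return [p_treat / p if ti == 1 else (1.0 - p_treat) / (1.0 - p)
            for p, ti in zip(ps, t)]

def truncate(weights, pct=99):
    """Cap weights at the given percentile (nearest-rank definition)."""
    ordered = sorted(weights)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    cap = ordered[rank - 1]
    return [min(w, cap) for w in weights]

# Hypothetical toy values: a treated patient with PS 0.02 gets a weight of 50
ps = [0.02, 0.25, 0.50, 0.80]
t = [1, 0, 1, 0]

w = iptw(ps, t)
sw = stabilized_iptw(ps, t)
w_trunc = truncate(w, pct=75)  # 75th percentile only because the toy sample is tiny
```

The first patient illustrates the instability: a treated patient who was very unlikely to be treated dominates the pseudopopulation until the weight is stabilized or capped.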



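Fine stratification weights (ATE)

[Figure: original sample, with exposed and reference patients divided into 50 strata (Stratum 1 to Stratum 50) based on the PS distribution in the exposed]

Weight calculation

• Exposed patients in stratum $i$: $\frac{N_{\text{total in PS stratum } i} / N_{\text{total}}}{N_{\text{exposed in PS stratum } i} / N_{\text{total exposed}}}$

• Reference patients in stratum $i$: $\frac{N_{\text{total in PS stratum } i} / N_{\text{total}}}{N_{\text{reference in PS stratum } i} / N_{\text{total reference}}}$

Desai et al. Epidemiology 2017;28:249-57.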
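A sketch of how the strata and the ATE weights could be computed. The function names, the quantile rule used for the cut points, and the toy cohort with two strata instead of 50 are assumptions made for illustration; this is not a port of the authors' SAS macros.

```python
from bisect import bisect_right
from collections import Counter

def assign_strata(ps, t, n_strata):
    """Assign strata using quantile cut points of the PS among the exposed."""
    exposed = sorted(p for p, ti in zip(ps, t) if ti == 1)
    cuts = [exposed[(k * len(exposed)) // n_strata] for k in range(1, n_strata)]
    return [bisect_right(cuts, p) for p in ps]

def fine_strat_ate_weights(t, strata):
    """Stratum share of the whole cohort divided by its share of the patient's own group."""
    n_total, n_exposed = len(t), sum(t)
    n_reference = n_total - n_exposed
    total = Counter(strata)
    exposed = Counter(s for s, ti in zip(strata, t) if ti == 1)
    reference = Counter(s for s, ti in zip(strata, t) if ti == 0)
    weights = []
    for s, ti in zip(strata, t):
        share_total = total[s] / n_total
        if ti == 1:
            weights.append(share_total / (exposed[s] / n_exposed))
        else:
            weights.append(share_total / (reference[s] / n_reference))
    return weights

# Hypothetical toy cohort: four exposed patients followed by six reference patients
ps = [0.2, 0.4, 0.6, 0.8, 0.1, 0.15, 0.3, 0.35, 0.5, 0.7]
t = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

strata = assign_strata(ps, t, n_strata=2)
w = fine_strat_ate_weights(t, strata)
```

The weights sum to the original group sizes (4 exposed, 6 reference). Strata that contain patients from only one group have no comparators and would need separate handling, which this sketch does not attempt.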



[Figure: weighted sample, with exposed and reference patients shown across Stratum 1 to Stratum 50 (strata based on the PS distribution in the exposed)]









Key features

• “Semiparametric implementation PS weighting”- PS is only used to stratify and not directly to weight, which makes this approach theoretically robust against PS model misspecification

• Extreme weights are less common

Cautions

• Does not readily extend to >2 treatment groups; the number of strata increases exponentially, leading to high variability

• Sparse strata could lead to unstable weights and requirement for truncation








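Fine stratification weights (ATT)

[Figure: original sample, with exposed and reference patients divided into 50 strata (Stratum 1 to Stratum 50) based on the PS distribution in the exposed]

Weight calculation

• Exposed patients: $1$

• Reference patients in stratum $i$: $\frac{N_{\text{exposed in PS stratum } i} / N_{\text{total exposed}}}{N_{\text{reference in PS stratum } i} / N_{\text{total reference}}}$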
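A sketch of the ATT version, assuming stratum labels have already been assigned from the PS distribution in the exposed. The function name and the toy labels are hypothetical.

```python
from collections import Counter

def fine_strat_att_weights(t, strata):
    """Exposed patients keep weight 1; reference patients are reweighted so
    that their distribution across strata matches that of the exposed."""
    n_exposed = sum(t)
    n_reference = len(t) - n_exposed
    exposed = Counter(s for s, ti in zip(strata, t) if ti == 1)
    reference = Counter(s for s, ti in zip(strata, t) if ti == 0)
    weights = []
    for s, ti in zip(strata, t):
        if ti == 1:
            weights.append(1.0)
        else:
            weights.append((exposed[s] / n_exposed) / (reference[s] / n_reference))
    return weights

# Hypothetical toy cohort: four exposed patients followed by six reference patients
t = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
strata = [0, 0, 1, 1, 0, 0, 0, 0, 0, 1]

w = fine_strat_att_weights(t, strata)
```

In the toy cohort the single reference patient in the upper stratum receives a weight of 3, which shows how a sparse stratum can produce a large weight.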





[Figure: weighted sample, with exposed and reference patients shown across Stratum 1 to Stratum 50 (strata based on the PS distribution in the exposed)]








Key features

• “Semiparametric implementation PS weighting”- PS is only used to stratify and not directly to weight, which makes this approach theoretically robust against PS model misspecification

• Extreme weights are less common

Cautions

• Does not readily extend to >2 treatment groups; the number of strata increases exponentially, leading to high variability

• Sparse strata could lead to unstable weights and requirement for truncation







Standardized mortality ratio weighting (SMRW)

Key features

• Clear target of inference

• Extends to >2 treatment groups

Cautions

• The score is directly used to create weights, which commonly leads to unstable weights and requirement for stabilization or truncation

• Performance likely compromised in presence of PS model misspecification

Weight calculation

• Treated patients: $1$

• Reference patients: $\frac{PS}{1 - PS}$
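As a minimal sketch with a hypothetical function name and toy values, the SMR weights can be computed directly from the PS:

```python
def smr_weights(ps, t):
    """1 for treated patients, PS / (1 - PS) for reference patients."""
    return [1.0 if ti == 1 else p / (1.0 - p) for p, ti in zip(ps, t)]

# Hypothetical toy values: one treated patient, two reference patients
ps = [0.2, 0.5, 0.8]
t = [1, 0, 0]

w = smr_weights(ps, t)
```

The reference weight is the odds of treatment, so a reference patient with a PS of 0.8 counts four times as much as one with a PS of 0.5, and the weight grows without bound as the PS approaches 1.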



Matching weights

Key features

• Weights bounded between 0 and 1- extreme weights impossible

• Extends to >2 treatment groups

Cautions

• Target of inference is variable- close to the ATE in the whole population when groups are equally sized and PS distributions have good overlap; close to the ATT in the smaller group when groups are unequally sized but PS distributions have good overlap.

• In circumstances of limited overlap in PS distribution, may lead to treatment effect estimation in a subpopulation that is not reflective of patients receiving the treatment of interest in routine care or the whole study population.

Weight calculation

• Treated patients: $\frac{\min(PS, 1 - PS)}{PS}$

• Reference patients: $\frac{\min(PS, 1 - PS)}{1 - PS}$

Li & Greene. Int J Biostat 2013;9:215-34

Yoshida et al. Epidemiology 2017;28:387-95.
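A short sketch of the matching weight calculation for two treatment groups, with a hypothetical function name and toy values:

```python
def matching_weights(ps, t):
    """min(PS, 1 - PS) divided by the probability of the treatment actually received."""
    weights = []
    for p, ti in zip(ps, t):
        numerator = min(p, 1.0 - p)
        weights.append(numerator / p if ti == 1 else numerator / (1.0 - p))
    return weights

# Hypothetical toy values: two treated patients followed by two reference patients
ps = [0.2, 0.8, 0.2, 0.8]
t = [1, 1, 0, 0]

w = matching_weights(ps, t)
```

Because the numerator can never exceed the denominator, every weight lies between 0 and 1, which is the bounded-weight property listed under the key features.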



Overlap weights

Key features

• Weights bounded between 0 and 1- extreme weights impossible

• Extends to >2 treatment groups

Cautions

• Target of inference is variable- may lead to treatment effect estimation in a subpopulation that is not reflective of patients receiving the treatment of interest in routine care or the whole study population

• For 2 treatment groups, OWs calculated based on a logistic model yield exact covariate balance, which makes it difficult to use balance as a PS model diagnostic

Weight calculation

• Treated patients: $1 - PS$

• Reference patients: $PS$

Li et al. Am J Epidemiol 2019;188:250-7.
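A short sketch of the overlap weight calculation for two treatment groups, with a hypothetical function name and toy values:

```python
def overlap_weights(ps, t):
    """Each patient is weighted by the probability of receiving the other treatment."""
    return [1.0 - p if ti == 1 else p for p, ti in zip(ps, t)]

# Hypothetical toy values: two treated patients followed by two reference patients
ps = [0.1, 0.5, 0.9, 0.5]
t = [1, 1, 0, 0]

w = overlap_weights(ps, t)
```

A treated patient with a PS close to 1, or a reference patient with a PS close to 0, receives a weight close to 0, so patients whose treatment was nearly certain contribute little to the weighted sample.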



CASE EXAMPLE




Study design

[Figure: study design diagram. Time markers: October 2010 (dabigatran market entry) and October 2013. Index date: warfarin/dabigatran prescription. Requirements shown: continuous enrollment for 6 months, no warfarin or dabigatran use, Afib diagnosis. Outcome: follow-up for stroke/systemic embolism (as treated)]

PS calculated with 72 covariates using a logistic regression model

Desai et al. Am J Epidemiol. 2018 Nov 1;187(11):2439-2448.



PS distributional overlap

• Trimming non-overlapping regions of the PS distribution resulted in the exclusion of only 10 patients, which confirmed sufficient overlap in this cohort

• Bimodal distribution of warfarin treated patients

• Warfarin patients in the first peak down-weighted substantially under all weighting approaches except for the weights targeting the ATE



Weight distribution


Truncated at 99th percentile



Table of select population characteristics

[Table annotations:]

• Confounder distribution in warfarin group = distribution in dabigatran group

• Confounder distribution in warfarin group = dabigatran group = distribution in subset of the whole cohort

• Confounder distribution in warfarin group = dabigatran group = distribution in the whole cohort



Balance plot



Treatment effect estimates with respect to stroke





Summary

• Weighting based on the propensity score represents a valuable tool for confounding adjustment in observational studies of treatment use and is increasingly being used in epidemiological investigations

• Ideally, selection of the appropriate weighting approach should be driven by the target of inference specific to each study question

• When applied carefully, all alternative approaches of confounding adjustment are likely to work well under most circumstances



Resources

• SAS macros for PS fine stratification available on Harvard Dataverse along with simulated toy examples

https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/U8JLCW

• For R users- WeightIt by Noah Greifer supports most of these weighting methods

https://cran.r-project.org/web/packages/WeightIt/WeightIt.pdf



Thank you

@Rishidesai11