
Informative Subspace Learning for Counterfactual Inference


Yale Chang, Jennifer G. Dy
Department of Electrical and Computer Engineering

Northeastern University

February 9, 2017

Motivation: Why Causal Inference?

Treatment → Outcome?

Ø Healthcare: New Medication → Blood Pressure?
Ø Economics: Job Training → Employee’s Income?
Ø Advertising: Advertising Campaign → Company’s Revenue?

Question of Interest: what is the causal effect of the treatment on the outcome?

Challenges

Figures: Shalit & Sontag www.cs.nyu.edu/~shalit/tutorial.html

Potential Outcome Framework
Ø Only one outcome can be observed

Randomized Controlled Trial vs. Observational Data
Ø Confounding factors

Contributions of This Work

Ø Propose a novel approach for causal inference on observational data.

Ø Speed up the proposed approach via randomized approximation (reducing complexity from quadratic to linear in the sample size) and prove an upper bound on the approximation error.

Ø Empirical results on simulated and real-world data demonstrate that our proposed approach outperforms competing methods.

Potential Outcome Framework

[Figure: blood pressure vs. age, showing each sample’s control outcome, treatment outcome, observed factual outcome, and unobserved counterfactual outcome]

ITE: Individual Treatment Effect
ATE: Average Treatment Effect
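Stated formally (a brief restatement in standard potential-outcome notation; the symbols Y_1(x), Y_0(x) for the treatment and control outcomes are not spelled out on the slides):

    % Hypothetical notation: Y_1(x_i), Y_0(x_i) denote sample i's potential
    % outcomes under treatment and under control.
    \mathrm{ITE}(x_i) = Y_1(x_i) - Y_0(x_i)                    % effect of the treatment on sample i
    \mathrm{ATE} = \mathbb{E}_{x}\big[ Y_1(x) - Y_0(x) \big]   % ITE averaged over the population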

Nearest Neighbor Matching

Ø Set each sample’s counterfactual outcome equal to the factual outcome of its nearest neighbor in the opposite group (see the sketch after the figure below)

Ø Distances can be measured with the Euclidean metric

[Figure: nearest-neighbor matching between the treatment and control groups, shown in the blood pressure vs. age plot]
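A minimal sketch of this matching step (plain NumPy; the function and array names are illustrative, not from the paper’s code):

    import numpy as np

    def nn_match_counterfactuals(X_t, y_t, X_c, y_c):
        """1-nearest-neighbor counterfactual imputation.

        X_t, y_t: covariates and factual outcomes of the treated samples.
        X_c, y_c: covariates and factual outcomes of the control samples.
        Returns imputed counterfactual outcomes for treated and control samples.
        """
        # Pairwise Euclidean distances between treated and control samples.
        dists = np.linalg.norm(X_t[:, None, :] - X_c[None, :, :], axis=-1)
        # Counterfactual of a treated sample = factual outcome of its nearest control.
        y_t_cf = y_c[dists.argmin(axis=1)]
        # Counterfactual of a control sample = factual outcome of its nearest treated sample.
        y_c_cf = y_t[dists.argmin(axis=0)]
        return y_t_cf, y_c_cf

Estimated ITEs then follow as y_t − y_t_cf for treated samples and y_c_cf − y_c for controls; the ATE is their average.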

Nearest Neighbor Matching

However!

Ø Not all features affect the outcome. In this example only age affects the outcome, so Euclidean matching over all features (including the irrelevant weight) can select poor neighbors.

Ø Need to learn informative subspaces (predictive of the outcomes) for both the treatment and the control group before matching.

[Figure: blood pressure vs. age and weight; only age is relevant to the outcome]

Informative Subspace Learning

Key Property: samples $x_i$ with similar outcomes $y_i$ should be close in the learned subspace.

$$K_Y = \begin{bmatrix} \mathrm{sim}(y_1, y_1) & \cdots & \mathrm{sim}(y_1, y_n) \\ \vdots & \ddots & \vdots \\ \mathrm{sim}(y_n, y_1) & \cdots & \mathrm{sim}(y_n, y_n) \end{bmatrix}$$

Learn a projection matrix $W \in \mathbb{R}^{d \times q}$ that maps $x_i \in \mathbb{R}^d$ to its low-dimensional embedding $z_i = W^{\top} x_i \in \mathbb{R}^q$ while preserving the similarity structure in $Y$.

$$K_Z = \begin{bmatrix} \mathrm{sim}(z_1, z_1) & \cdots & \mathrm{sim}(z_1, z_n) \\ \vdots & \ddots & \vdots \\ \mathrm{sim}(z_n, z_1) & \cdots & \mathrm{sim}(z_n, z_n) \end{bmatrix}$$

Maximize the Hilbert-Schmidt Independence Criterion (HSIC) between $Z$ and $Y$:

$$\mathrm{HSIC}(Z, Y) = \frac{1}{n(n-1)} \operatorname{Tr}(K_Z K_Y) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1}^{n} K_Z(i, j)\, K_Y(i, j)$$

Error Bound on HSIC Approximation

Challenge: storing and computing the kernel matrices is quadratic in the sample size.

Solution: approximate the kernel matrices with random Fourier features.

$K_Z \approx F F^{\top}$, with $F \in \mathbb{R}^{n \times m}$ (instead of $K_Z \in \mathbb{R}^{n \times n}$)

$K_Y \approx G G^{\top}$, with $G \in \mathbb{R}^{n \times l}$ (instead of $K_Y \in \mathbb{R}^{n \times n}$)

$m, l$ are the numbers of random Fourier features, $m, l \ll n$.
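A sketch of one standard way to build such features for an RBF kernel (Rahimi-Recht random Fourier features; the helper name and parameters are illustrative):

    import numpy as np

    def random_fourier_features(A, num_features, gamma=1.0, rng=None):
        """Features F such that F @ F.T approximates the RBF kernel
        exp(-gamma * ||a_i - a_j||^2) (Rahimi & Recht, 2007)."""
        rng = np.random.default_rng(rng)
        d = A.shape[1]
        # Frequencies drawn from the spectral density of the RBF kernel.
        Omega = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, num_features))
        b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
        return np.sqrt(2.0 / num_features) * np.cos(A @ Omega + b)

    # With F = random_fourier_features(Z, m) and G = random_fourier_features(Y, l),
    # Tr(K_Z K_Y) is approximated by Tr((F F^T)(G G^T)) = ||F^T G||_F^2,
    # which costs O(n m l) instead of O(n^2).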

Approximation Error Bound

The expected error $\mathbb{E}\,|\mathrm{error}|$ of the random-feature estimate of HSIC is bounded above by a quantity that involves a $\log(nml)$ factor and shrinks as the numbers of random features $m$ and $l$ grow.

Learning Objective

$$\max_{W} \;\; \mathrm{HSIC}(Z, Y) - \lambda \|W\|_F^2$$

Ø Solved with L-BFGS (see the sketch after this list)

Ø Time complexity: 𝒪(𝑛(𝑚𝑑 +𝑚𝑙 + 𝑑𝑞))

Ø Storage cost: 𝒪(𝑛(𝑑 +𝑚 + 𝑙))
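A rough end-to-end sketch of this optimization, assuming scipy’s L-BFGS-B with numerical gradients and reusing the illustrative random_fourier_features helper above (none of these names come from the paper’s code):

    import numpy as np
    from scipy.optimize import minimize

    def fit_projection(X, Y, q, m=100, l=100, lam=0.1, seed=0):
        """Learn W by maximizing the random-feature HSIC objective minus lam * ||W||_F^2."""
        n, d = X.shape
        Y = np.asarray(Y).reshape(n, -1)
        G = random_fourier_features(Y, l, rng=seed)        # outcome features, computed once

        def neg_objective(w_flat):
            W = w_flat.reshape(d, q)
            # Re-seeding keeps the random frequencies fixed across L-BFGS iterations.
            F = random_fourier_features(X @ W, m, rng=seed + 1)
            hsic = np.linalg.norm(F.T @ G, "fro") ** 2 / (n * (n - 1))
            return -(hsic - lam * np.sum(W ** 2))

        w0 = np.random.default_rng(seed).normal(size=d * q)
        res = minimize(neg_objective, w0, method="L-BFGS-B")  # gradients estimated numerically
        return res.x.reshape(d, q)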

Infant Health and Development Program (IHDP) Data

[Results figure: proposed method compared with MDM, PSM, RLP, LASSO, BART, and Causal Forest]

News Data

[Results figure: proposed method compared with MDM, PSM, RLP, LASSO, BART, and Causal Forest]

Summary

Ø Significantly improve nearest-neighbor matching for counterfactual inference through informative subspace learning.

Ø Speed up the HSIC computation via random Fourier features and prove an upper bound on the approximation error.

Ø Empirically show state-of-the-art performance on real datasets.
