1
Comparing four different local causal inference models for a variety of Wikipedia pages. The “Main Page”, Mutual Exclusion, and Watchmen (film) pages are unrelated to the Earthquake event that occurred off the coast of Australia. Note all time-series x are scaled between 0 and 1 via xmin(x)/(max(x)min(x)). In this experiment, we use 1 hour of data to estimate β and the next hour to estimate β′, and repeat this for all 48 hours by sliding the model to obtain a time-series of β’s (and β′). Overview Evaluation & Results In this work, we formalize the problem of causal inference over graph-based relational time-series data where each node in the graph has one or more time- series associated to it. We propose causal inference models for this problem that leverage both the graph topology and time-series to accurately estimate local causal effects of nodes. Furthermore, the relational time-series causal inference models are able to estimate local effects for individual nodes by exploiting local node-centric temporal dependencies and topological/structural dependencies. We show that simpler causal models that do not consider the graph topology are recovered as special cases of the proposed relational time-series causal inference model. We describe the conditions under which the resulting estimate can be used to estimate a causal effect, and describe how the Durbin-Wu-Hausman test of specification can be used to test for the consistency of the proposed estimator from data. Empirically, we demonstrate the effectiveness of the causal inference models on both synthetic data with known ground-truth and a large-scale observational relational time-series data set collected from Wikipedia. 1 Adobe Research, San Jose, CA 2 Intel Labs, Santa Clara, CA Causal Inference in Relational Time Series Let G=(V,E) denote a graph and X be a n-by-t max matrix of node time-series. Further, assume that there is a known intervention that occurs at some time and that the causal effect of the intervention. Is a smoothly varying process centered at some node i. The problem is to infer the individual and peer effects of node i. AUC results comparing HONE to recent embedding methods across a wide variety of networks from different application domains. Main Findings & Contributions 1. We formalized the problem of causal inference over graph-based relational time-series data where each node in the graph has one or more time-series associated to it. 2. We proposed causal inference models for this problem that leverage both the graph topology and time-series to accurately estimate local causal effects of nodes. 3. We showed that simpler causal models that do not consider the graph topology are recovered as special cases of the proposed relational time-series causal inference model. Experimental setup: 10-fold cross-validation, repeated for 10 random trials, D=128, D l = 16, edge embedding derived via ; Predict link existence via LR; # steps K selected via grid search Inferring Individual Level Causal Models Ryan A. Rossi 1 , Somdeb Sarkhel 1 , Nesreen K. Ahmed 2 from Graph-based Relational Time Series Example with known causal effect between the Earthquake and Richter Magnitude pages. Earthquake Richter Mag. Earthquake Preparedness 35 31 56 1447 132 172 764 3406 time t+1 t Example with known causal effect between the Earthquake and Richter Magnitude pages. Individual (IID) Model: Individual (IID) Peer Model: Global Model: Global Peer Model: γ provides the ability to control the extent to which information from other nodes in the network contribute to the estimation of β. At one extreme, as γ approaches zero, we recover an i.i.d. estimate. At the other, as γ approaches 1, the estimate will pool all instances and a global model is recovered. Peer Model Formulation: Global Models: Basic Model Formulation: Mean squared error between actual β and estimated under different network models for inferring a individual effect (top) or peer effect (bottom) given heterogeneously affected neighbors. Causal effects as γ varies.

Inferring Individual Level Causal Models from Graph-based ...ryanrossi.com/poster/AAAI20-StarAI-poster.pdfExample with known causal effect between the Earthquake and Richter Magnitude

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Inferring Individual Level Causal Models from Graph-based ...ryanrossi.com/poster/AAAI20-StarAI-poster.pdfExample with known causal effect between the Earthquake and Richter Magnitude

Comparing four different local causal inference models for a variety of Wikipedia pages. The “MainPage”, Mutual Exclusion, and Watchmen (film) pages are unrelated to the Earthquake event thatoccurred off the coast of Australia. Note all time-series x are scaled between 0 and 1 viax−min(x)/(max(x)−min(x)). In this experiment, we use 1 hour of data to estimate β and the next hour toestimate β′, and repeat this for all 48 hours by sliding the model to obtain a time-series of β’s (and β′).

Overview

Evaluation & Results

In this work, we formalize the problem of causal inference over graph-basedrelational time-series data where each node in the graph has one or more time-series associated to it. We propose causal inference models for this problem thatleverage both the graph topology and time-series to accurately estimate localcausal effects of nodes. Furthermore, the relational time-series causal inferencemodels are able to estimate local effects for individual nodes by exploiting localnode-centric temporal dependencies and topological/structural dependencies.We show that simpler causal models that do not consider the graph topology arerecovered as special cases of the proposed relational time-series causal inferencemodel. We describe the conditions under which the resulting estimate can beused to estimate a causal effect, and describe how the Durbin-Wu-Hausman testof specification can be used to test for the consistency of the proposed estimatorfrom data. Empirically, we demonstrate the effectiveness of the causal inferencemodels on both synthetic data with known ground-truth and a large-scaleobservational relational time-series data set collected from Wikipedia.

1Adobe Research, San Jose, CA2Intel Labs, Santa Clara, CA

Causal Inference in Relational Time SeriesLet G=(V,E) denote a graph and X be a n-by-tmax matrix of node time-series.Further, assume that there is a known intervention that occurs at some time and that the causal effect of the intervention. Is a smoothly varying process centered at some node i. The problem is to infer the individual and peer effects of node i.

AUC results comparing HONE to recent embedding methods across a wide variety of networks from different application domains.

Main Findings & Contributions1. We formalized the problem of causal inference over graph-based relational time-series data

where each node in the graph has one or more time-series associated to it.

2. We proposed causal inference models for this problem that leverage both the graph topology and time-series to accurately estimate local causal effects of nodes.

3. We showed that simpler causal models that do not consider the graph topology are recovered as special cases of the proposed relational time-series causal inference model.

Experimental setup: 10-fold cross-validation, repeated for 10 random trials, D=128, Dl = 16, edge embedding derived via ; Predict link existence via LR; # steps K selected via grid search

Inferring Individual Level Causal Models

Ryan A. Rossi1, Somdeb Sarkhel1, Nesreen K. Ahmed2

from Graph-based Relational Time Series

Example with known causal effect between the Earthquake and Richter Magnitude pages.

Earthquake

Richter Mag.

EarthquakePreparedness

35 31 56 1447

132 172 764 3406

time

t+1

t

Example with known causal effect between the Earthquake and Richter Magnitude pages.

Individual (IID) Model:

Individual (IID) Peer Model:

Global Model:

Global Peer Model:

γ provides the ability to control the extent to which information from other nodes in the network contribute to the estimation of β. • At one extreme, as γ approaches zero, we recover an i.i.d. estimate.• At the other, as γ approaches 1, the estimate will pool all instances

and a global model is recovered.

Peer Model Formulation:

Global Models:

Basic Model Formulation:

Mean squared error between actual β and estimated under different network models for inferring a individual effect (top) or peer effect (bottom) given heterogeneously affected neighbors.

Causal effects as γ varies.