Causal inference in graphical models with latent variables. From theory to practice.

Sam Maes, Stijn Meganck, Philippe Leray ([email protected])
Laboratoire d’Informatique, Traitement de l’Information, Systèmes

Causality and Probability in the Sciences, Canterbury, 15th June 2006



Outline

1 Introduction

2 Causal Inference

3 Learning

4 Parametrisation


Introduction

Subtasks of causal modeling with latent variables:

Structure learning from data:
  observational
  experimental

Learning parameters

Probabilistic inference

Causal inference


Problem

No integrated approach exists for all of these tasks in the presence of latent variables.

Causal inference: semi-Markovian causal models

Structure learning from observational data: ancestral graph models

Our solution:

use ancestral graph models for learning

transform the result into a semi-Markovian causal model


Assumptions

data generated by an underlying faithful causal DAG

some variables are latent to the user

there is sufficient data and the learning algorithms make no errors

no latent common cause of two variables with an immediate causal connection

no selection bias


2 Causal Inference

Probabilistic vs Causal Inference

[Figure: the underlying DAG, the model after observation, and the model after manipulation. After observation: instantiate the observed variables and propagate. After manipulation: replace the old causes of the manipulated variable, instantiate it, and propagate.]


With latent variables

Causal inference becomes more complicated:

replace old causes

instantiate

propagate

P(Y = y | do(X = x)): manipulate variable X and study the effect on Y.
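To make the contrast concrete, here is a minimal sketch (my own illustration, not from the talk) that computes both queries by enumeration on a hypothetical binary model U → X, U → Y, X → Y, with U acting as a common cause; all probability values are invented.

```python
# Toy confounded model: U -> X, U -> Y, X -> Y, all variables binary.
# Conditioning ("after observation") and intervening ("after manipulation")
# give different answers because U is a common cause of X and Y.

p_u = {0: 0.5, 1: 0.5}                      # P(U)
p_x1_given_u = {0: 0.2, 1: 0.8}             # P(X=1 | U)
p_y1_given_xu = {(0, 0): 0.1, (0, 1): 0.5,  # P(Y=1 | X, U)
                 (1, 0): 0.4, (1, 1): 0.9}

def p_y1_given_x(x):
    """P(Y=1 | X=x): instantiate the observed X and propagate (Bayes' rule)."""
    num = den = 0.0
    for u in (0, 1):
        px = p_x1_given_u[u] if x == 1 else 1.0 - p_x1_given_u[u]
        num += p_u[u] * px * p_y1_given_xu[(x, u)]
        den += p_u[u] * px
    return num / den

def p_y1_given_do_x(x):
    """P(Y=1 | do(X=x)): replace the old causes of X, set X=x, then propagate."""
    return sum(p_u[u] * p_y1_given_xu[(x, u)] for u in (0, 1))

print(p_y1_given_x(1), p_y1_given_do_x(1))  # 0.8 vs 0.65: the two queries differ
```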


Representation for causal inference

semi-Markovian causal models (SMCM)

directed edge represents an autonomous causal relation

bi-directed edge represents a latent common cause

importance: every model with arbitrary latent variables can be transformed into an SMCM

a joint probability distribution, e.g. P(V1, ..., V6)
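As a concrete data structure, an SMCM can be encoded as a set of observable nodes plus two edge sets. The sketch below is my own encoding, not the talk's; the two example edges (V1 → V2 and V2 ↔ V6) echo edges mentioned later in the talk, and the rest of the example graph is omitted.

```python
# An SMCM as plain data: observable nodes, directed edges (autonomous causal
# relations) and bi-directed edges (latent common causes).
from dataclasses import dataclass, field

@dataclass
class SMCM:
    nodes: set = field(default_factory=set)
    directed: set = field(default_factory=set)    # (cause, effect) pairs
    bidirected: set = field(default_factory=set)  # frozenset({a, b}) pairs

# Hypothetical six-variable example in the spirit of P(V1, ..., V6);
# only two illustrative edges are shown here.
model = SMCM(
    nodes={f"V{i}" for i in range(1, 7)},
    directed={("V1", "V2")},                      # e.g. V1 causes V2
    bidirected={frozenset({"V2", "V6"})},         # e.g. latent common cause of V2, V6
)
```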


Inference in SMCMs

causal inference algorithm exists (Tian & Pearl), but:

no efficient parametrisation

no probabilistic inference algorithm

no structural learning algorithm


3 Learning

Representation for learning

The class of DAGs is not closed under marginalisation.

I.e., a DAG over only the observable variables cannot in general represent exactly all the independences that hold among them.


Maximal ancestral graphs (MAG)

Maximal ancestral graphs without conditioning

directed edges have an ancestral meaning ≠ causal: there is a causal path in the underlying DAG

bi-directed edges: latent common causes

at most 1 edge between 2 variables: ancestral relations absorb latent common causes

maximal: every absent edge represents an independence


Markov equivalence class

complete partial ancestral graph (CPAG):

Fast Causal Inference (FCI)

Rules to orient edges

3 possible edge marks: o (circle), − (tail), > (arrowhead)


3 possible explanations for each edge:

causal relation: V1, V2

latent variable: V2, V6

"inducing path" between V1 and V6

V1 cannot be separated from V6 using observed variables; due to the maximality of the model, FCI finds an edge


CPAG → SMCM

Perform experiments to distinguish between the cases:

Type 1: resolve o→

Type 2: resolve o−o

Remove i-false edges


CPAG → SMCM (Type 1)

Type 1: resolve A o→ B

exp(A) does not affect B: A ↔ B

exp(A) affects B:

  no potentially directed path from A to B of length ≥ 2 exists: A → B

  a potentially directed path from A to B of length ≥ 2 exists: block each such path by conditioning on a set D
    exp(A)|D affects B: A → B
    exp(A)|D does not affect B: A ↔ B
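The same decision rule can be written as a short code sketch. The helpers affects and pot_dir_paths are hypothetical stand-ins for the experimental machinery and the graph search (neither is specified in the talk), and choosing D as the inner nodes of the potentially directed paths is a naive simplification.

```python
# Sketch of the Type 1 rule for a CPAG edge A o-> B (assumptions noted above).

def resolve_type1(A, B, graph, affects, pot_dir_paths):
    """Return '->' (A causes B) or '<->' (latent common cause) for A o-> B.

    affects(cause, target, conditioning): True if an experiment on `cause`
        changes the distribution of `target` given `conditioning` (hypothetical).
    pot_dir_paths(graph, A, B): the potentially directed paths from A to B of
        length >= 2, each as a node sequence [A, ..., B] (hypothetical).
    """
    if not affects(A, B, conditioning=set()):
        return "<->"                              # exp(A) does not affect B
    paths = pot_dir_paths(graph, A, B)
    if not paths:
        return "->"                               # the effect must be direct
    # Block each potentially directed path by conditioning on its inner nodes.
    D = {v for path in paths for v in path[1:-1]}
    if affects(A, B, conditioning=D):
        return "->"                               # exp(A)|D still affects B
    return "<->"                                  # influence ran only via blocked paths
```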


Type 1: examples


CPAG → SMCM (Type 2)

Type 2: resolve o−o

Easily transformed into a Type 1 case


Remove i-false edges

recognize the edges A ↔ B and A → B that are possibly created due to an inducing path

block each inducing path between A, B with experiments E

block each other path between A, B by conditioning on a set D

exp(E)|D: if A and B remain dependent, the confounding edge is real; if A ⊥⊥ B, remove the i-false edge
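As with the Type 1 rule, this test can be sketched in code; the helpers below for choosing E and D and for the dependence test are hypothetical placeholders rather than anything defined in the talk.

```python
# Sketch of the i-false-edge test for a suspect edge between A and B.

def edge_is_real(A, B, inducing_experiments, other_blocking_set, dependent):
    """True if the edge between A and B survives the test, False if it is i-false.

    inducing_experiments(A, B): a set E of variables to manipulate so that every
        inducing path between A and B is blocked (hypothetical).
    other_blocking_set(A, B): a set D of observed variables blocking every other
        path between A and B (hypothetical).
    dependent(A, B, experiments, conditioning): True if A and B are still
        dependent under exp(E) given D (hypothetical).
    """
    E = inducing_experiments(A, B)
    D = other_blocking_set(A, B)
    return dependent(A, B, experiments=E, conditioning=D)
```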


4 Parametrisation

Parametrisation for discrete variables

Represent the SMCM with a DAG

this DAG is an I-map of the SMCM

i.e. all the independences entailed by the DAG also hold in the SMCM

use this DAG for learning parameters and for probabilistic inference

for causal inference: use the SMCM for the structure and the DAG for the parameters


Some definitions

C-component

In an SMCM, the observable variables that are connected by a bi-directed path belong to the same c-component (from "confounded component").
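A small sketch of how c-components could be computed: they are the connected components of the graph restricted to its bi-directed edges. The function and the example edges are my own; the edges only loosely echo the talk's six-variable example.

```python
# c-components = connected components over bi-directed edges only.

def c_components(nodes, bidirected):
    """nodes: iterable of variables; bidirected: iterable of (a, b) pairs."""
    neighbours = {v: set() for v in nodes}
    for a, b in bidirected:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                         # depth-first search
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(neighbours[v] - component)
        seen |= component
        components.append(component)
    return components

# Illustrative run: V2, V5, V6 end up in one c-component, the rest are singletons.
print(c_components(["V1", "V2", "V3", "V4", "V5", "V6"],
                   [("V2", "V6"), ("V5", "V6")]))
```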


Parametrisation: procedure

Given an SMCM and a topological order, the PR-representation is obtained as follows:

The nodes are V, the observable variables of the SMCM.

The directed edges of the SMCM remain.

The bi-directed edges in the SMCM are replaced:

Add a directed edge from node Vi to node Vj iff:

Vi is a topological predecessor of Vj in the same c-component, or a parent of such a topological predecessor, except if there was already an edge between nodes Vi and Vj.
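A self-contained sketch of this construction, following the rule above literally; the function, its argument names, and the example graph at the end are my own illustration, not the talk's.

```python
# Build the edge set of the PR DAG from an SMCM and a topological order.

def pr_representation(nodes, directed, bidirected, order):
    """nodes: observables; directed: (parent, child) pairs; bidirected: (a, b)
    pairs (latent common causes); order: the nodes in topological order."""
    # c-component of each node: connected components of the bi-directed part.
    comp = {v: {v} for v in nodes}
    for a, b in bidirected:
        merged = comp[a] | comp[b]
        for v in merged:
            comp[v] = merged

    parents = {v: {p for p, c in directed if c == v} for v in nodes}
    pos = {v: i for i, v in enumerate(order)}

    pr_edges = set(directed)                       # directed SMCM edges remain
    for vj in nodes:
        # topological predecessors of vj inside its own c-component ...
        preds = {vi for vi in comp[vj] if pos[vi] < pos[vj]}
        candidates = set(preds)
        for p in preds:                            # ... and their SMCM parents
            candidates |= parents[p]
        for vi in candidates - {vj}:
            if (vi, vj) not in pr_edges and (vj, vi) not in pr_edges:
                pr_edges.add((vi, vj))             # skip if an edge already exists
    return pr_edges

# Hypothetical example whose only non-trivial c-component is {V2, V5, V6},
# with topological order V1 < V2 < V5 < V6 (illustrative, not the talk's figure).
edges = pr_representation(
    nodes={"V1", "V2", "V5", "V6"},
    directed={("V1", "V2")},
    bidirected={("V2", "V5"), ("V5", "V6")},
    order=["V1", "V2", "V5", "V6"],
)
print(sorted(edges))
```

Because edges are only ever added from topological predecessors, the resulting graph stays acyclic and can be parametrised and used like an ordinary DAG.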


Parametrisation: example

Add an edge Vi → Vj iff:

Vi is a topological predecessor of Vj in the same c-component

or a parent of such a topological predecessor in the same c-component

with topological order V2 < V5 < V6:


Conclusion

From theory to practice.

Somewhat closer, still a lot of assumptions to relax:

take into account:

  possible errors in the learning algorithms
  cost of experiments and impossible experiments
  rules for marking edges after each experiment

allow:

  selection bias
  confounding between variables with an immediate causal connection

resolve practical issues such as finding potential inducing paths and potentially directed paths
