Joint causal inferenceon observational and experimental datasets
Sara Magliacane, Tom Claassen, Joris M. Mooij
What If?, 10th December, 2016
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 1 / 21
Causal inference: learning causal relations from data
Definition
X causes Y (X 99KY ) = intervening upon (changing) X changes Y
We can represent causal relations with a causal DAG (hidden vars):
X Y
Z
E.g. X = Smoking, Y = Cancer, Z = Genetics
Causal inference = structure learning of the causal DAG
Traditionally, causal relations are inferred from interventions.
Sometimes, interventions are unethical, unfeasible or too expensive
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 3 / 21
Causal inference: learning causal relations from data
Definition
X causes Y (X 99KY ) = intervening upon (changing) X changes Y
We can represent causal relations with a causal DAG (hidden vars):
X Y
Z
E.g. X = Smoking, Y = Cancer, Z = Genetics
Causal inference = structure learning of the causal DAG
Traditionally, causal relations are inferred from interventions.
Sometimes, interventions are unethical, unfeasible or too expensive
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 3 / 21
Causal inference: learning causal relations from data
Definition
X causes Y (X 99KY ) = intervening upon (changing) X changes Y
We can represent causal relations with a causal DAG (hidden vars):
X Y
Z
E.g. X = Smoking, Y = Cancer, Z = Genetics
Causal inference = structure learning of the causal DAG
Traditionally, causal relations are inferred from interventions.
Sometimes, interventions are unethical, unfeasible or too expensive
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 3 / 21
Causal inference: learning causal relations from data
Definition
X causes Y (X 99KY ) = intervening upon (changing) X changes Y
We can represent causal relations with a causal DAG (hidden vars):
X Y
Z
E.g. X = Smoking, Y = Cancer, Z = Genetics
Causal inference = structure learning of the causal DAG
Traditionally, causal relations are inferred from interventions.
Sometimes, interventions are unethical, unfeasible or too expensive
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 3 / 21
Causal inference from observational and experimental data
Holy Grail of Causal Inference
Learn as much causal structure as possible from observations,integrating background knowledge and experimental data.
Current causal inference methods:
Score-based: evaluate models using a penalized likelihood score
Constraint-based: use statistical independences to expressconstraints over possible causal models
Advantage of constraint-based methods:
can handle latent confounders naturally
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 4 / 21
Causal inference from observational and experimental data
Holy Grail of Causal Inference
Learn as much causal structure as possible from observations,integrating background knowledge and experimental data.
Current causal inference methods:
Score-based: evaluate models using a penalized likelihood score
Constraint-based: use statistical independences to expressconstraints over possible causal models
Advantage of constraint-based methods:
can handle latent confounders naturally
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 4 / 21
Causal inference from observational and experimental data
Holy Grail of Causal Inference
Learn as much causal structure as possible from observations,integrating background knowledge and experimental data.
Current causal inference methods:
Score-based: evaluate models using a penalized likelihood score
Constraint-based: use statistical independences to expressconstraints over possible causal models
Advantage of constraint-based methods:
can handle latent confounders naturally
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 4 / 21
Joint inference on observational and experimental data
Advantage of score-based methods:
can formulate joint inference on observational and experimental dataand learn the targets of interventions, e.g. [Eaton and Murphy, 2007].
raf
mek12
pip2
erk
akt
p38jnk
pkc
plcy
pip3
pka
(a) (b)
raf
plcy
pip2
pip3
erk
p38
pkc
akt
pka
mek12
PresentMissing jnk
B2cAMP
f
erk
akt
jnk
Psitect AKT inh U0126PMA
p38
G06967mek12
raf
pkc
pip3
plcy
pip2
pka
PresentMissingInt. edge
(c) (d)
Figure 6: Models of the biological data. (a) A partial model of the T-cell pathway, as currently accepted by biologists. The small roundcircles with numbers represent various interventions (green = activators, red = inhibitors). From [SPP+05]. Reprinted with permissionfrom AAAS. (b) Edges with marginal probability above 0.5 as estimated by [SPP+05]. (c) Edges with marginal probability above 0.5as estimated by us, assuming known perfect interventions. Dashed edges are ones that are missing from the union of (a) and (b). Theseare either false positives, or edges that Sachs et al missed. (d) Edges with marginal probability above 0.5 as estimated by us, assuminguncertain, imperfect interventions, and a fan-in bound of k = 2. The intervention nodes are in red, and edges from the interventionnodes are light gray. Dashed edges are ones that are missing from the union of (a) and (b). This figure is best viewed in colour.
Example from [Eaton and Murphy, 2007]
Goal: Can we perform joint inference using constraint-based methods?
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 5 / 21
Joint inference on observational and experimental data
Advantage of score-based methods:
can formulate joint inference on observational and experimental dataand learn the targets of interventions, e.g. [Eaton and Murphy, 2007].
raf
mek12
pip2
erk
akt
p38jnk
pkc
plcy
pip3
pka
(a) (b)
raf
plcy
pip2
pip3
erk
p38
pkc
akt
pka
mek12
PresentMissing jnk
B2cAMP
f
erk
akt
jnk
Psitect AKT inh U0126PMA
p38
G06967mek12
raf
pkc
pip3
plcy
pip2
pka
PresentMissingInt. edge
(c) (d)
Figure 6: Models of the biological data. (a) A partial model of the T-cell pathway, as currently accepted by biologists. The small roundcircles with numbers represent various interventions (green = activators, red = inhibitors). From [SPP+05]. Reprinted with permissionfrom AAAS. (b) Edges with marginal probability above 0.5 as estimated by [SPP+05]. (c) Edges with marginal probability above 0.5as estimated by us, assuming known perfect interventions. Dashed edges are ones that are missing from the union of (a) and (b). Theseare either false positives, or edges that Sachs et al missed. (d) Edges with marginal probability above 0.5 as estimated by us, assuminguncertain, imperfect interventions, and a fan-in bound of k = 2. The intervention nodes are in red, and edges from the interventionnodes are light gray. Dashed edges are ones that are missing from the union of (a) and (b). This figure is best viewed in colour.
Example from [Eaton and Murphy, 2007]
Goal: Can we perform joint inference using constraint-based methods?
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 5 / 21
Joint Causal Inference: Assumptions
Idea: Model jointly several observational or experimental datasets{Dr}r∈{1...n} with zero or more possibly unknown intervention targets.
We assume a unique underlying causal DAG across datasets definedover system variables {Xj}j∈X (some of which possibly hidden).
Example
X1 X2
X3
Dataset D1
unknown ints
X1 X2
X3
Dataset D2
unknown ints
X1 X2
X3
Dataset D3
do X1
Note: cannot handle certain intervention types, e.g. perfect interventions.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 7 / 21
Joint Causal Inference: Assumptions
Idea: Model jointly several observational or experimental datasets{Dr}r∈{1...n} with zero or more possibly unknown intervention targets.
We assume a unique underlying causal DAG across datasets definedover system variables {Xj}j∈X (some of which possibly hidden).
Example
X1 X2
X3
Dataset D1
unknown ints
X1 X2
X3
Dataset D2
unknown ints
X1 X2
X3
Dataset D3
do X1
Note: cannot handle certain intervention types, e.g. perfect interventions.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 7 / 21
Joint Causal Inference: Assumptions
Idea: Model jointly several observational or experimental datasets{Dr}r∈{1...n} with zero or more possibly unknown intervention targets.
We assume a unique underlying causal DAG across datasets definedover system variables {Xj}j∈X (some of which possibly hidden).
Example
X1 X2
X3
Dataset D1
unknown ints
X1 X2
X3
Dataset D2
unknown ints
X1 X2
X3
Dataset D3
do X1
Note: cannot handle certain intervention types, e.g. perfect interventions.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 7 / 21
Joint Causal Inference: SCM
We introduce two types of dummy variables in the data:
a regime variable R, indicating which dataset Dr a data point isfrom
intervention variables {Ii}i∈I , functions of R
We assume that we can represent the whole system as an acyclic SCM:
R = ER ,
Ii = gi (R), i ∈ I,Xj = fj(Xpa(Xj )∩X , Ipa(Xj )∩I ,Ej), j ∈ X ,
P((Ek)k∈X∪{R}
)=
∏k∈X∪{R}
P(Ek).
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 8 / 21
Joint Causal Inference: SCM
We introduce two types of dummy variables in the data:
a regime variable R, indicating which dataset Dr a data point isfrom
intervention variables {Ii}i∈I , functions of R
We assume that we can represent the whole system as an acyclic SCM:
R = ER ,
Ii = gi (R), i ∈ I,Xj = fj(Xpa(Xj )∩X , Ipa(Xj )∩I ,Ej), j ∈ X ,
P((Ek)k∈X∪{R}
)=
∏k∈X∪{R}
P(Ek).
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 8 / 21
Joint Causal Inference: single joint causal DAG
We represent the SCM with a causal DAG C representing all datasetsjointly:
R I1 I2 X1 X2 X4
1 20 0 0.1 0.2 0.51 20 0 0.13 0.21 0.491 20 0 . . . . . . . . .
2 20 1 . . . . . . . . .
3 30 0 . . . . . . . . .
4 30 1 . . . . . . . . .
4 datasets with 2 interventions
R
I1 I2
X1
X2
X3
X4
Causal DAG C
We assume Causal Markov and Minimality hold in C.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 9 / 21
Joint Causal Inference: Faithfulness violations
Causal Faithfulness assumption:
X ⊥⊥ Y |W =⇒ X ⊥d Y |W [C]
R I1 X1 I1 = g1(R) =⇒ X1 ⊥⊥ I1 |R ... but X1 6⊥d I1 |R
Solution: D-separation [Spirtes et al., 2000]: X1 ⊥D I1 |R
Problem: ⊥D provably complete (for now) for functionally determinedrelations
Solution: restrict deterministic relations to only ∀i ∈ I : Ii = gi (R)
D-Faithfulness: X ⊥⊥ Y |W =⇒ X ⊥D Y |W [C].
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 10 / 21
Joint Causal Inference: Faithfulness violations
Causal Faithfulness assumption:
X ⊥⊥ Y |W =⇒ X ⊥d Y |W [C]
R I1 X1 I1 = g1(R) =⇒ X1 ⊥⊥ I1 |R ... but X1 6⊥d I1 |R
Solution: D-separation [Spirtes et al., 2000]: X1 ⊥D I1 |R
Problem: ⊥D provably complete (for now) for functionally determinedrelations
Solution: restrict deterministic relations to only ∀i ∈ I : Ii = gi (R)
D-Faithfulness: X ⊥⊥ Y |W =⇒ X ⊥D Y |W [C].
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 10 / 21
Joint Causal Inference: Faithfulness violations
Causal Faithfulness assumption:
X ⊥⊥ Y |W =⇒ X ⊥d Y |W [C]
R I1 X1 I1 = g1(R) =⇒ X1 ⊥⊥ I1 |R ... but X1 6⊥d I1 |R
Solution: D-separation [Spirtes et al., 2000]: X1 ⊥D I1 |R
Problem: ⊥D provably complete (for now) for functionally determinedrelations
Solution: restrict deterministic relations to only ∀i ∈ I : Ii = gi (R)
D-Faithfulness: X ⊥⊥ Y |W =⇒ X ⊥D Y |W [C].
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 10 / 21
Joint Causal Inference: Faithfulness violations
Causal Faithfulness assumption:
X ⊥⊥ Y |W =⇒ X ⊥d Y |W [C]
R I1 X1 I1 = g1(R) =⇒ X1 ⊥⊥ I1 |R ... but X1 6⊥d I1 |R
Solution: D-separation [Spirtes et al., 2000]: X1 ⊥D I1 |R
Problem: ⊥D provably complete (for now) for functionally determinedrelations
Solution: restrict deterministic relations to only ∀i ∈ I : Ii = gi (R)
D-Faithfulness: X ⊥⊥ Y |W =⇒ X ⊥D Y |W [C].
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 10 / 21
Joint Causal Inference
Joint Causal Inference (JCI) = Given all the assumptions, reconstructthe causal DAG C from independence test results.
X1 6⊥⊥ I1
X1 ⊥⊥ I1 |R
X4 ⊥⊥ X2 | I1. . .
=⇒
R
I1 I2
X1
X2
X3
X4
Problem: Current constraint-based methods cannot work with JCI,because of faithfulness violations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 11 / 21
Joint Causal Inference
Joint Causal Inference (JCI) = Given all the assumptions, reconstructthe causal DAG C from independence test results.
X1 6⊥⊥ I1
X1 ⊥⊥ I1 |R
X4 ⊥⊥ X2 | I1. . .
=⇒
R
I1 I2
X1
X2
X3
X4
Problem: Current constraint-based methods cannot work with JCI,because of faithfulness violations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 11 / 21
Joint Causal Inference
Joint Causal Inference (JCI) = Given all the assumptions, reconstructthe causal DAG C from independence test results.
X1 6⊥⊥ I1
X1 ⊥⊥ I1 |R
X4 ⊥⊥ X2 | I1. . .
=⇒
R
I1 I2
X1
X2
X3
X4
Problem: Current constraint-based methods cannot work with JCI,because of faithfulness violations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 11 / 21
Joint Causal Inference
Joint Causal Inference (JCI) = Given all the assumptions, reconstructthe causal DAG C from independence test results.
X1 6⊥⊥ I1
X1 ⊥⊥ I1 |R
X4 ⊥⊥ X2 | I1. . .
=⇒
R
I1 I2
X1
X2
X3
X4
Problem: Current constraint-based methods cannot work with JCI,because of faithfulness violations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 11 / 21
Part III
Extending constraint-based methods for JCI
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 12 / 21
A simple strategy for dealing with faithfulness violations
Idea: Faithfulness violations → Partial inputs
A simple strategy for dealing with functionally determined relations:
1 Rephrase constraint-based method in terms of d-separations
2 For each independence test result derive sound d-separations:
X 6⊥⊥ Y |W =⇒ X 6⊥d Y |W
X 6∈ Det(W ) and Y 6∈ Det(W ) and X ⊥⊥ Y |W =⇒X ⊥d Y |Det(W )
where Det(W ) = variables determined by (a subset of) W .
Conjecture: sound also for a larger class of deterministic relations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 13 / 21
A simple strategy for dealing with faithfulness violations
Idea: Faithfulness violations → Partial inputs
A simple strategy for dealing with functionally determined relations:
1 Rephrase constraint-based method in terms of d-separations
2 For each independence test result derive sound d-separations:
X 6⊥⊥ Y |W =⇒ X 6⊥d Y |W
X 6∈ Det(W ) and Y 6∈ Det(W ) and X ⊥⊥ Y |W =⇒X ⊥d Y |Det(W )
where Det(W ) = variables determined by (a subset of) W .
Conjecture: sound also for a larger class of deterministic relations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 13 / 21
A simple strategy for dealing with faithfulness violations
Idea: Faithfulness violations → Partial inputs
A simple strategy for dealing with functionally determined relations:
1 Rephrase constraint-based method in terms of d-separations
2 For each independence test result derive sound d-separations:
X 6⊥⊥ Y |W =⇒ X 6⊥d Y |W
X 6∈ Det(W ) and Y 6∈ Det(W ) and X ⊥⊥ Y |W =⇒X ⊥d Y |Det(W )
where Det(W ) = variables determined by (a subset of) W .
Conjecture: sound also for a larger class of deterministic relations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 13 / 21
A simple strategy for dealing with faithfulness violations
Idea: Faithfulness violations → Partial inputs
A simple strategy for dealing with functionally determined relations:
1 Rephrase constraint-based method in terms of d-separations
2 For each independence test result derive sound d-separations:
X 6⊥⊥ Y |W =⇒ X 6⊥d Y |W
X 6∈ Det(W ) and Y 6∈ Det(W ) and X ⊥⊥ Y |W =⇒X ⊥d Y |Det(W )
where Det(W ) = variables determined by (a subset of) W .
Conjecture: sound also for a larger class of deterministic relations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 13 / 21
A simple strategy for dealing with faithfulness violations
Idea: Faithfulness violations → Partial inputs
A simple strategy for dealing with functionally determined relations:
1 Rephrase constraint-based method in terms of d-separations
2 For each independence test result derive sound d-separations:
X 6⊥⊥ Y |W =⇒ X 6⊥d Y |W
X 6∈ Det(W ) and Y 6∈ Det(W ) and X ⊥⊥ Y |W =⇒X ⊥d Y |Det(W )
where Det(W ) = variables determined by (a subset of) W .
Conjecture: sound also for a larger class of deterministic relations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 13 / 21
A simple strategy for dealing with faithfulness violations
Idea: Faithfulness violations → Partial inputs
A simple strategy for dealing with functionally determined relations:
1 Rephrase constraint-based method in terms of d-separations
2 For each independence test result derive sound d-separations:
X 6⊥⊥ Y |W =⇒ X 6⊥d Y |W
X 6∈ Det(W ) and Y 6∈ Det(W ) and X ⊥⊥ Y |W =⇒X ⊥d Y |Det(W )
where Det(W ) = variables determined by (a subset of) W .
Conjecture: sound also for a larger class of deterministic relations.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 13 / 21
Ancestral Causal Inference with Determinism (ACID)
Idea: Faithfulness violations → Partial inputs
Traditional constraint-based methods (e.g., PC, FCI, ...):
cannot handle partial inputs
cannot exploit the rich background knowledge in JCI
Solution: Logic-based methods, e.g., ACI [Magliacane et al., 2016]
=⇒ we implement the strategy in an extension of ACI called:
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 14 / 21
Ancestral Causal Inference with Determinism (ACID)
Idea: Faithfulness violations → Partial inputs
Traditional constraint-based methods (e.g., PC, FCI, ...):
cannot handle partial inputs
cannot exploit the rich background knowledge in JCI
Solution: Logic-based methods, e.g., ACI [Magliacane et al., 2016]
=⇒ we implement the strategy in an extension of ACI called:
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 14 / 21
Ancestral Causal Inference with Determinism (ACID)
Idea: Faithfulness violations → Partial inputs
Traditional constraint-based methods (e.g., PC, FCI, ...):
cannot handle partial inputs
cannot exploit the rich background knowledge in JCI
Solution: Logic-based methods, e.g., ACI [Magliacane et al., 2016]
=⇒ we implement the strategy in an extension of ACI called:
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 14 / 21
Ancestral Causal Inference with Determinism (ACID)
Idea: Faithfulness violations → Partial inputs
Traditional constraint-based methods (e.g., PC, FCI, ...):
cannot handle partial inputs
cannot exploit the rich background knowledge in JCI
Solution: Logic-based methods, e.g., ACI [Magliacane et al., 2016]
=⇒ we implement the strategy in an extension of ACI called:
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 14 / 21
Rephrasing ACI rules in terms of d-separations
ACI rules:
Example
For X , Y , W disjoint (sets of) variables:
(X ⊥⊥Y |W ) ∧ (X 699KW ) =⇒ X 699KY
ACID rules:
Example
For X , Y , W disjoint (sets of) variables:
(X ⊥d Y |W ) ∧ (X 699KW ) =⇒ X 699KY
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 15 / 21
Rephrasing ACI rules in terms of d-separations
ACI rules:
Example
For X , Y , W disjoint (sets of) variables:
(X ⊥⊥Y |W ) ∧ (X 699KW ) =⇒ X 699KY
ACID rules:
Example
For X , Y , W disjoint (sets of) variables:
(X ⊥d Y |W ) ∧ (X 699KW ) =⇒ X 699KY
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 15 / 21
ACID-JCI
ACID-JCI = ACID rules + sound d-separations from strategy
+ JCI background knowledge
For example:
∀i ∈ I,∀j ∈ X : (Xj 699KR) ∧ (Xj 699K Ii )
“Standard variables cannot cause dummy variables”
Example
I1
X1
R
I1
X1
R
I1 X1
R
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 16 / 21
ACID-JCI
ACID-JCI = ACID rules + sound d-separations from strategy
+ JCI background knowledge
For example:
∀i ∈ I, ∀j ∈ X : (Xj 699KR) ∧ (Xj 699K Ii )
“Standard variables cannot cause dummy variables”
Example
I1
X1
R
I1
X1
R
I1 X1
R
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 16 / 21
ACID-JCI
ACID-JCI = ACID rules + sound d-separations from strategy
+ JCI background knowledge
For example:
∀i ∈ I, ∀j ∈ X : (Xj 699KR) ∧ (Xj 699K Ii )
“Standard variables cannot cause dummy variables”
Example
I1
X1
R
I1
X1
R
I1 X1
R
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 16 / 21
Simulated data accuracy: example Precision Recall curve
Ancestral (“causes”) relations Non-ancestral (“not causes”)
Precision Recall curves of 2000 randomly generated causal graphs for4 system variables and 3 interventions
ACID-JCI substantially improves accuracy w.r.t. merging learntstructures (merged ACI)
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 18 / 21
Conclusion
Joint Causal Inference, a powerful formulation of causal discoveryover multiple datasets for constraint-based methods
A simple strategy for dealing with faithfulness violations due tofunctionally determined relations
An implementation, ACID-JCI, that substantially improves theaccuracy w.r.t. state-of-the-art
Future work:
Improve scalability (now max 7 variables in C).
Working paper: https://arxiv.org/abs/1611.10351
Collaborators:
Tom Claassen Joris M. Mooij
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 19 / 21
Conclusion
Joint Causal Inference, a powerful formulation of causal discoveryover multiple datasets for constraint-based methods
A simple strategy for dealing with faithfulness violations due tofunctionally determined relations
An implementation, ACID-JCI, that substantially improves theaccuracy w.r.t. state-of-the-art
Future work:
Improve scalability (now max 7 variables in C).
Working paper: https://arxiv.org/abs/1611.10351
Collaborators:
Tom Claassen Joris M. Mooij
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 19 / 21
Conclusion
Joint Causal Inference, a powerful formulation of causal discoveryover multiple datasets for constraint-based methods
A simple strategy for dealing with faithfulness violations due tofunctionally determined relations
An implementation, ACID-JCI, that substantially improves theaccuracy w.r.t. state-of-the-art
Future work:
Improve scalability (now max 7 variables in C).
Working paper: https://arxiv.org/abs/1611.10351
Collaborators:
Tom Claassen Joris M. Mooij
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 19 / 21
Conclusion
Joint Causal Inference, a powerful formulation of causal discoveryover multiple datasets for constraint-based methods
A simple strategy for dealing with faithfulness violations due tofunctionally determined relations
An implementation, ACID-JCI, that substantially improves theaccuracy w.r.t. state-of-the-art
Future work:
Improve scalability (now max 7 variables in C).
Working paper: https://arxiv.org/abs/1611.10351
Collaborators:
Tom Claassen Joris M. Mooij
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 19 / 21
Conclusion
Joint Causal Inference, a powerful formulation of causal discoveryover multiple datasets for constraint-based methods
A simple strategy for dealing with faithfulness violations due tofunctionally determined relations
An implementation, ACID-JCI, that substantially improves theaccuracy w.r.t. state-of-the-art
Future work:
Improve scalability (now max 7 variables in C).
Working paper: https://arxiv.org/abs/1611.10351
Collaborators:
Tom Claassen Joris M. Mooij
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 19 / 21
Conclusion
Joint Causal Inference, a powerful formulation of causal discoveryover multiple datasets for constraint-based methods
A simple strategy for dealing with faithfulness violations due tofunctionally determined relations
An implementation, ACID-JCI, that substantially improves theaccuracy w.r.t. state-of-the-art
Future work:
Improve scalability (now max 7 variables in C).
Working paper: https://arxiv.org/abs/1611.10351
Collaborators:
Tom Claassen Joris M. Mooij
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 19 / 21
References I
Claassen, T. and Heskes, T. (2011).
A logical characterization of constraint-based causal discovery.In UAI 2011, pages 135–144.
Eaton, D. and Murphy, K. (2007).
Exact Bayesian structure learning from uncertain interventions.In Proceedings of the 10th International Conference on Artificial Intelligence and Statistics (AISTATS), pages 107–114.
Magliacane, S., Claassen, T., and Mooij, J. M. (2016).
Ancestral causal inference.In NIPS.
Mooij, J. M. and Heskes, T. (2013).
Cyclic causal discovery from continuous equilibrium data.In Nicholson, A. and Smyth, P., editors, Proceedings of the 29th Annual Conference on Uncertainty in ArtificialIntelligence (UAI-13), pages 431–439. AUAI Press.
Spirtes, P., Glymour, C., and Scheines, R. (2000).
Causation, Prediction, and Search.MIT press, 2nd edition.
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 20 / 21
Ancestral Causal Inference (ACI)
Weighted list of inputs: I = {(ij ,wj)}:
E.g. I = { (Y ⊥⊥Z |X , 0.2), (Y 6⊥⊥X , 0.1)} }
Any consistent weighting scheme, e.g. frequentist, Bayesian
For any possible ancestral structure C , we define the loss function:
Loss(C , I ) :=∑
(ij ,wj )∈I : ij is not satisfied in C
wj
Here: “ij is not satisfied in C” = defined by ancestral reasoning rules
For each possible causal relation X 99KY provide score:
Conf (X 99KY ) = minC∈CLoss(C , I + (X 699KY ,∞))
−minC∈CLoss(C , I + (X 99KY , ∞))
Sara Magliacane (VU, UvA) Joint causal inference 10-12-2016 21 / 21