1
Causal Inference and
Ambiguous Manipulations
Richard Scheines
Grant Reaber, Peter Spirtes
Carnegie Mellon University
2
1. Motivation
Wanted: Answers to Causal Questions:
• Does attending Day Care cause Aggression?
• Does watching TV cause obesity?
• How can we answer these questions empirically?
• When and how can we estimate the size of the effect?
• Can we know our estimates are reliable?
3
Causation & Intervention
P(Lung Cancer | Tar-stained teeth = no)
P(Lung Cancer | Tar-stained teeth set= no)
Conditioning is not the same as intervening
Show Teeth Slides
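The contrast can be made concrete with a small simulation. This is a minimal sketch with invented probabilities (not from the talk): Smoking causes both tar-stained teeth and lung cancer, and teeth have no causal effect on cancer, so conditioning on unstained teeth lowers the observed cancer rate while setting teeth to unstained leaves it unchanged.

```python
import random

random.seed(0)

# Invented probabilities: Smoking causes tar-stained Teeth and Lung Cancer;
# Teeth have no causal effect on Lung Cancer.
def sample(set_teeth=None):
    s = random.random() < 0.3                            # Smoking
    if set_teeth is None:
        t = random.random() < (0.8 if s else 0.05)       # Teeth, observed
    else:
        t = set_teeth                                    # Teeth, set by intervention
    lc = random.random() < (0.2 if s else 0.01)          # Lung Cancer
    return s, t, lc

n = 200_000
obs = [sample() for _ in range(n)]
unstained = [lc for _, t, lc in obs if not t]
p_cond = sum(unstained) / len(unstained)                 # conditioning
p_set = sum(lc for _, _, lc in (sample(set_teeth=False) for _ in range(n))) / n

print(f"P(LC | Teeth = no)    ~ {p_cond:.3f}")   # near the non-smoker rate (~0.026)
print(f"P(LC | Teeth set= no) ~ {p_set:.3f}")    # the unmanipulated rate (~0.067)
```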
4
[Figure: causal graph Gender → CEO Earnings, shown before and after an intervention variable I on Gender]
5
Causal Inference: Experiments
Gold Standard: Randomized Clinical Trials
- Intervene: Randomly assign treatment
- Observe Response

Estimate P(Response | Treatment assigned)
6
Causal Inference: Observational Studies
Collect a sample on:
- Potential Causes (X)
- Response (Y)
- Covariates (potential confounders Z)

Estimate P(Y | X, Z)
• Highly unreliable
• We can estimate sampling variability, but we don't know how to estimate specification uncertainty from data
Individual   Day Care   Aggressiveness
John         A lot      A lot
Mary         None       A little
7
2. Progress 1985 – Present
1. Representing causal structure, and connecting it to probability
2. Modeling Interventions
3. Indistinguishability and Discovery Algorithms
8
Representing Causal Structures
Causal Graph G = {V,E}
Each edge X → Y represents a direct causal claim:
X is a direct cause of Y relative to V

Example: Exposure → Infection → Symptoms
9
Direct Causation
X is a direct cause of Y relative to S, iff
∃ z, x1 ≠ x2 such that P(Y | X set= x1, Z set= z) ≠ P(Y | X set= x2, Z set= z),
where Z = S - {X,Y}

[Graph: X → Y]
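The definition can be run mechanically once the post-manipulation probabilities are known. A minimal sketch, where the probability table p_y is entirely hypothetical:

```python
from itertools import combinations

# Hypothetical post-manipulation probabilities P(Y = 1 | X set= x, Z set= z)
# for a binary X and a single background variable Z in S - {X, Y}.
p_y = {(0, 0): 0.10, (1, 0): 0.10,    # at Z = 0, changing X leaves Y alone
       (0, 1): 0.10, (1, 1): 0.60}    # at Z = 1, changing X changes Y

def direct_cause(p_y, xs=(0, 1), zs=(0, 1)):
    """X is a direct cause of Y relative to S iff some z and x1 != x2 give
    different P(Y | X set= x, Z set= z)."""
    return any(p_y[(x1, z)] != p_y[(x2, z)]
               for z in zs
               for x1, x2 in combinations(xs, 2))

print(direct_cause(p_y))   # True: the difference shows up at Z = 1
```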
10
Causal Bayes Networks
P(S = 0) = .7    P(S = 1) = .3
P(YF = 0 | S = 0) = .99    P(YF = 1 | S = 0) = .01
P(YF = 0 | S = 1) = .20    P(YF = 1 | S = 1) = .80
P(LC = 0 | S = 0) = .95    P(LC = 1 | S = 0) = .05
P(LC = 0 | S = 1) = .80    P(LC = 1 | S = 1) = .20

[Graph: Smoking [0,1] → Yellow Fingers [0,1]; Smoking [0,1] → Lung Cancer [0,1]]

P(S, YF, LC) = P(S) P(YF | S) P(LC | S)

The Joint Distribution Factors According to the Causal Graph, i.e., for all X in V:
P(V) = ∏ P(X | Immediate Causes of(X))
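The factorization can be checked directly with the CPTs on this slide. A sketch of the arithmetic; the dictionary keying by (value, s) is my representation choice, the numbers are the slide's:

```python
from itertools import product

# CPTs from the slide: Smoking (S), Yellow Fingers (YF), Lung Cancer (LC),
# each keyed by (value, s).
P_S  = {0: 0.7, 1: 0.3}
P_YF = {(0, 0): 0.99, (1, 0): 0.01, (0, 1): 0.20, (1, 1): 0.80}  # P(YF | S)
P_LC = {(0, 0): 0.95, (1, 0): 0.05, (0, 1): 0.80, (1, 1): 0.20}  # P(LC | S)

# The joint factors according to the causal graph:
# P(S, YF, LC) = P(S) P(YF | S) P(LC | S)
joint = {(s, yf, lc): P_S[s] * P_YF[(yf, s)] * P_LC[(lc, s)]
         for s, yf, lc in product([0, 1], repeat=3)}

assert abs(sum(joint.values()) - 1.0) < 1e-9                    # a proper distribution
print(sum(p for (s, yf, lc), p in joint.items() if lc == 1))    # P(LC = 1) = 0.095
```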
11
Modeling Ideal Interventions
Interventions on the Effect

[Figure: Room Temperature → Wearing Sweater; pre-experimental system vs. post-experimental system after intervening on Wearing Sweater (the effect)]
12
Modeling Ideal Interventions
Interventions on the Cause

[Figure: Room Temperature → Wearing Sweater; pre-experimental system vs. post-experimental system after intervening on Room Temperature (the cause)]
13
Interventions & Causal Graphs
• Model an ideal intervention by adding an “intervention” variable outside the original system
• Erase all arrows pointing into the variable intervened upon
Pre-intervention graph: Exp → Inf → Rash
Intervene to change Inf.
Post-intervention graph: I → Inf → Rash, with the arrow Exp → Inf erased.
14
Calculating the Effect of Interventions
Pre-manipulation joint distribution:
P(Exp, Inf, Rash) = P(Exp) P(Inf | Exp) P(Rash | Inf)
[Graph: Exp → Inf → Rash]

Intervention on Inf:

Post-manipulation joint distribution:
P(Exp, Inf, Rash) = P(Exp) P(Inf | I) P(Rash | Inf)
[Graph: I → Inf → Rash, with Exp disconnected from Inf]
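The same computation, done numerically: the Inf factor is swapped from P(Inf | Exp) to the intervention distribution P(Inf | I), here a point mass at Inf = 1, while the other factors stay untouched. All CPT numbers below are invented for illustration:

```python
from itertools import product

# Hypothetical CPTs for Exp -> Inf -> Rash, keyed by (value, parent value).
P_Exp = {0: 0.6, 1: 0.4}
P_Inf_given_Exp  = {(0, 0): 0.90, (1, 0): 0.10, (0, 1): 0.30, (1, 1): 0.70}
P_Rash_given_Inf = {(0, 0): 0.95, (1, 0): 0.05, (0, 1): 0.20, (1, 1): 0.80}

def joint(inf_factor):
    """Factor the joint; inf_factor(inf, exp) supplies the Inf term."""
    return {(e, i, r): P_Exp[e] * inf_factor(i, e) * P_Rash_given_Inf[(r, i)]
            for e, i, r in product([0, 1], repeat=3)}

# Pre-manipulation: P(Exp, Inf, Rash) = P(Exp) P(Inf | Exp) P(Rash | Inf)
pre = joint(lambda i, e: P_Inf_given_Exp[(i, e)])
# Post-manipulation (Inf set= 1): replace the Exp -> Inf factor with
# P(Inf | I), a point mass at Inf = 1; the other factors are unchanged.
post = joint(lambda i, e: 1.0 if i == 1 else 0.0)

print(sum(p for (e, i, r), p in pre.items()  if r == 1))   # P(Rash = 1) = 0.305
print(sum(p for (e, i, r), p in post.items() if r == 1))   # after Inf set= 1: 0.80
```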
15
Causal Discovery from Observational Studies
[Figure: a causal graph over X1, X2, X3 entails, via the Causal Markov Axiom (d-separation), a set of independence relations, e.g., X1 _||_ X3 | X2; a Discovery Algorithm takes those independence relations back to an equivalence class of causal graphs]
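As a sketch of the Markov-axiom step, the following checks d-separation for the three graphs that form the equivalence class entailing X1 _||_ X3 | X2, plus the collider that falls outside it. The moralization test used here is a standard implementation choice of mine, not something from the talk:

```python
from itertools import combinations

def d_separated(edges, x, y, z):
    """Moralization test: X _||_ Y | Z iff x and y are disconnected in the
    moralized ancestral graph of {x, y} union z, after deleting z."""
    anc = {x, y} | set(z)
    while True:                                   # close under ancestors
        more = {a for a, b in edges if b in anc} - anc
        if not more:
            break
        anc |= more
    sub = [(a, b) for a, b in edges if a in anc and b in anc]
    und = {frozenset(e) for e in sub}             # drop edge directions
    for n in anc:                                 # marry parents of each node
        parents = [a for a, b in sub if b == n]
        for p, q in combinations(parents, 2):
            und.add(frozenset((p, q)))
    reach, frontier = {x}, [x]                    # search, avoiding z
    while frontier:
        cur = frontier.pop()
        for e in und:
            if cur in e:
                for other in e - {cur}:
                    if other not in set(z) and other not in reach:
                        reach.add(other)
                        frontier.append(other)
    return y not in reach

graphs = {
    "X1 -> X2 -> X3": [("X1", "X2"), ("X2", "X3")],
    "X1 <- X2 <- X3": [("X3", "X2"), ("X2", "X1")],
    "X1 <- X2 -> X3": [("X2", "X1"), ("X2", "X3")],
    "X1 -> X2 <- X3": [("X1", "X2"), ("X3", "X2")],   # collider: not equivalent
}
for name, e in graphs.items():
    print(name,
          "| X1 _||_ X3 | X2:", d_separated(e, "X1", "X3", ["X2"]),
          "| X1 _||_ X3:", d_separated(e, "X1", "X3", []))
```

The first three graphs entail exactly X1 _||_ X3 | X2 and so cannot be told apart by independence facts; the collider entails X1 _||_ X3 instead, which is what lets a discovery algorithm orient it.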
16
Equivalence Class with Latents: PAGs (Partial Ancestral Graphs)

[Figure: a PAG over X1, X2, X3 and the set of causal graphs it represents, including graphs with latent variables T1, T2, etc.]

Assumptions:
• Acyclic graphs
• Latent variables
• Sample Selection Bias

Equivalence: Independence over measured variables
17
Knowing when we know enough to calculate the effect of Interventions
The Prediction Algorithm (SGS, 2000)
Causal Inference from Observational Studies
18
Causal Discovery from Observational Studies
Observed independencies:
X1 _||_ X4
X1 _||_ X3 | X2
X4 _||_ X3 | X2

Discovery Algorithm → equivalence class (PAG) over X1, X2, X3, X4 → Prediction Algorithm

Predictions?
P(X3 | X2 set)   yes
P(X2 | X1 set)   don't know
P(X1 | X2 set)   yes
…
19
3. The Ambiguity of Manipulation
Assumptions:
• Causal graph known (Cholesterol is a cause of Heart Condition)
• No Unmeasured Common Causes

[Graph: Total Blood Cholesterol → Heart Disease]

Therefore the manipulated and unmanipulated distributions are the same:
P(H | TC = x) = P(H | TC set= x)
20
The Problem with Predicting the Effects of Acting
Problem: the cause is a composite of causes that don't act uniformly,
e.g., Total Blood Cholesterol (TC) = HDL + LDL

[Graph: TC = HDL + LDL → Heart Disease, with LDL acting positively (+) and HDL negatively (-)]

• The observed distribution over TC is determined by the unobserved joint distribution over HDL and LDL
• Ideally intervening on TC does not determine a joint distribution for HDL and LDL
21
The Problem with Predicting the Effects of Setting TC
[Graph: TC = HDL + LDL → Heart Disease, with LDL acting positively (+) and HDL negatively (-)]

• P(H | TC set1= x) puts NO constraints on P(H | TC set2= x)
• P(H | TC = x) puts NO constraints on P(H | TC set= x)
• Nothing in the data tips us off about our ignorance, i.e., we don't know that we don't know
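A toy numerical illustration of the point; the risk function is invented, and only the signs (LDL +, HDL -) come from the slide. Two interventions that set TC to the same total through different HDL/LDL mixes yield different heart-disease probabilities, so "P(H | TC set= x)" is not well defined:

```python
# Invented risk model; only the signs (LDL +, HDL -) match the slide.
def p_hd(hdl: float, ldl: float) -> float:
    """Hypothetical P(Heart Disease | HDL set= hdl, LDL set= ldl)."""
    return max(0.0, min(1.0, 0.05 + 0.004 * ldl - 0.003 * hdl))

# Two manipulations that both set TC = HDL + LDL to 200:
print(p_hd(hdl=80, ldl=120))   # 0.29  (mostly HDL)
print(p_hd(hdl=20, ldl=180))   # 0.71  (mostly LDL)
```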
22
Examples Abound
• Total TV = Violent Junk + PBS, Discovery Channel → Social Adjustment, with the two components acting in opposite directions (+ and -)
• Total Day Care = Overcrowded, Poor Quality + High Quality → Aggressiveness, again with opposite signs (+ and -)
23
Possible Ways Out
• Causal Graph is Not Known:
Cholesterol does not really cause Heart Condition
• Confounders (unmeasured common causes) are present:
LDL and HDL are confounders
24
Cholesterol is not really a cause of Heart Condition
Relative to a set of variables S (and a background),
X is a cause of Y iff ∃ x1 ≠ x2 such that P(Y | X set= x1) ≠ P(Y | X set= x2)
• Total Cholesterol is a cause of Heart Disease
25
Cholesterol is not really a cause of Heart Condition
Is Total Cholesterol a direct cause of Heart Condition relative to {TC, LDL, HDL, HD}?
• TC is logically related to LDL, HDL, so manipulating it once LDL and HDL are set is impossible.
26
LDL, HDL are confounders
[Graph: HDL and LDL as common causes of both TC and Heart Disease, with a questioned edge TC → Heart Disease]

• No way to manipulate TC without affecting HDL, LDL
• HDL, LDL are logically related to TC
27
Logico-Causal Systems
S: Atomic Variables
• independently manipulable
• effects of all manipulations are unambiguous
S’: Defined Variables
• defined logically from variables in S
For example:
S: LDL, HDL, HD, Disease1, Disease2
S’: TC
28
Logico-Causal Systems: Adding Edges
S: LDL, HDL, HD, D1, D2    S': TC

[Figure: the system over S (D1 → LDL, D2 → HDL, LDL and HDL → HD) beside the system over S ∪ S', where TC is added with a questioned edge TC → HD]

Add TC → HD iff manipulations of TC are unambiguous wrt HD
29
Logico-Causal Systems: Unambiguous Manipulations
TC → HD iff all manipulations of TC are unambiguous wrt HD

For each variable X' in S', let Parents(X') be the set of variables in S that logically determine X', i.e., X' = f(Parents(X')), e.g., TC = LDL + HDL

Inv(x') = the set of all values p of Parents(X') s.t. f(p) = x'

A manipulation of a variable X' in S' to a value x' wrt another variable Y is unambiguous iff
∀ p1 ≠ p2 in Inv(x'): P(Y | p1) = P(Y | p2)
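The definition transcribes directly into a check one could run whenever f and P(Y | Parents(X')) are known. A sketch under stated assumptions: the discretised levels and both conditional-probability functions below are hypothetical:

```python
from itertools import product

LEVELS = [1, 2, 3]          # hypothetical discretised LDL/HDL levels

def f(ldl, hdl):            # the defining function: TC = LDL + HDL
    return ldl + hdl

def inv(tc):                # Inv(x'): parent settings p with f(p) = x'
    return [(l, h) for l, h in product(LEVELS, LEVELS) if f(l, h) == tc]

def unambiguous(tc, p_y, tol=1e-9):
    """TC set= tc is unambiguous wrt Y iff P(Y | p) agrees for all p in Inv(tc)."""
    ps = [p_y(l, h) for l, h in inv(tc)]
    return all(abs(p - ps[0]) < tol for p in ps)

# Y depends on LDL and HDL separately: setting TC = 4 is ambiguous.
print(unambiguous(4, lambda l, h: 0.05 + 0.10 * l - 0.02 * h))   # False
# Y depends on the parents only through their sum: unambiguous.
print(unambiguous(4, lambda l, h: 0.05 + 0.05 * (l + h)))        # True
```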
30
Logico-Causal Systems: Removing Edges
S: LDL, HDL, HD, D1, D2    S': TC

[Figure: the system over S beside the system over S ∪ S', with questioned edges LDL → HD and HDL → HD]

Remove LDL → HD iff LDL _||_ HD | TC
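The edge-removal condition can be checked the same way. A minimal sketch, with a hypothetical model in which HD depends on LDL and HDL only through their sum, so the independence holds and the edge comes out:

```python
from itertools import product

LEVELS = [1, 2, 3]          # hypothetical discretised LDL/HDL levels

# Hypothetical model: HD depends on LDL and HDL only through TC = LDL + HDL.
def p_hd_given(ldl, hdl):
    return 0.05 + 0.05 * (ldl + hdl)

# LDL _||_ HD | TC holds iff, within each TC level, P(HD | TC, LDL) does not
# vary with LDL; each printed set should therefore be a singleton.
for tc in range(2 * min(LEVELS), 2 * max(LEVELS) + 1):
    vals = {p_hd_given(l, h) for l, h in product(LEVELS, LEVELS) if l + h == tc}
    print(f"TC = {tc}: distinct P(HD | TC, LDL) values: {vals}")
```

If p_hd_given depended on LDL beyond the sum, some set would contain more than one value and the edge LDL → HD would have to stay.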
31
Logico-Causal Systems: Faithfulness
[Graph: D1 → LDL, D2 → HDL, TC = LDL + HDL, TC → HD]

Faithfulness: independences are entailed by structure, not by special parameter values. Crucial to inference.

Effect of TC on HD unambiguous.

Unfaithfulness: LDL _||_ HDL | TC,
because LDL and TC determine HDL, and similarly, HDL and TC determine LDL.
32
Effect on Prediction Algorithm
Observed system: TC, HD, D1, D2
[Graph: the system over S ∪ S', with LDL and HDL unobserved and questioned edges into HD]

Manipulate   Effect on   Assuming manipulation unambiguous   Manipulation maybe ambiguous
Disease 1    Disease 2   None                                None
Disease 1    HD          Can't tell                          Can't tell
Disease 1    TC          Can't tell                          Can't tell
Disease 2    Disease 1   None                                None
Disease 2    HD          Can't tell                          Can't tell
Disease 2    TC          Can't tell                          Can't tell
TC           Disease 1   None                                Can't tell
TC           Disease 2   None                                Can't tell
TC           HD          Can't tell                          Can't tell
HD           Disease 1   None                                Can't tell
HD           Disease 2   None                                Can't tell
HD           TC          Can't tell                          Can't tell

Still sound – but less informative
33
Effect on Prediction Algorithm
Observed System: TC, HD, D1, D2, X

[Graph: the system over S ∪ S' extended with an additional observed variable X]

Not completely sound.

No general characterization of when the Prediction Algorithm, suitably modified, is still informative and sound. Conjectures, but no proof yet.

Example: if the observed system has no deterministic relations, all orientations due to marginal independence relations are still valid.
34
Effect on Causal Inference of Ambiguous Manipulations

Experiments, e.g., RCTs: manipulating treatment is
• unambiguous: sound
• ambiguous: unsound

Observational Studies, e.g., Prediction Algorithm: manipulation is
• unambiguous: potentially sound
• ambiguous: potentially sound
35
References
• Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition. MIT Press.
• Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
• Spirtes, P., Scheines, R., Glymour, C., Richardson, T., and Meek, C. (2004). "Causal Inference," in Handbook of Quantitative Methodology in the Social Sciences, ed. David Kaplan, Sage Publications, 447–478.
• Spirtes, P., and Scheines, R. (2004). "Causal Inference of Ambiguous Manipulations," in Proceedings of the Philosophy of Science Association Meetings, 2002.
• Reaber, G. (2005). The Theory of Ambiguous Manipulations. Master's Thesis, Department of Philosophy, Carnegie Mellon University.