Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Causal Inference: predic1on, explana1on, and interven1on
Lecture 5: Causality and Graphical Models
Samantha Kleinberg [email protected]
Next few weeks
• October 13 (Tuesday class!): Time series • October 19: Lecture + midterm review/Q&A • October 26: Midterm exam • November 2: Project proposal due
• November 30 and December 7: final presenta1ons
Final Project
• Project types (not exhaus1ve!) – Analyze data, adapt causal inference method to par1cular domain
– Compare causal inference methods – Theore1cal work on causality (methods or meaning)
• Proposal – What’s the goal? How will you accomplish it? How will you know the project was successful? What resources are needed (and do you have them)? 1 page
Today
• What makes a graphical model causal? • How to use graphical models to answer causal ques1ons – Predic1ng effects of ac1ons – Counterfactual queries – Explana1on
Applica1on: infant mortality
Mani, S., & Cooper, G. F. (2004). Causal Discovery Using a Bayesian Local Causal Discovery Algorithm. Proceedings of MedInfo, 731-‐735
Applica1on: epidemiology
Twardy, C. R., Nicholson, A. E., Korb, K. B., & McNeil, J. (2006). Epidemiological data mining of cardiovascular bayesian networks. Electronic Journal of Health Informa8cs, 1(1), e3
Applica1on: Organizing evidence
Lagnado, D. (2011). Thinking about Evidence. In Proceedings of the Bri8sh Academy (Vol. 171, pp. 183-‐223)
Applica1on: Causes and effects of poverty
Bessler, D. A. (2003). On world poverty: Its causes and effects. Food and Agricultural Organiza8on (FAO) of the United Na8ons, Research Bulle8n, Rome
Applica1ons of causal BNs
• Biomedical informa1cs – diagnosis, prognosis • Psychology – representa1on of causal learning • Economics/finance
Markov condi1on
(from last week) Node independent of non-‐descendants given its parents
X
Y Z
Y ⊥ Z | X
d-‐separa1on
X Y Z
DAG G Set of independencies
Y ⊥ Z | X
d-‐separa1on
d-‐separa1on
Equivalent statements, for sets of nodes X, Y, Z in graph G: – X and Y are d-‐separated by Z (Z can be node or set of nodes) in G
– X and Y are condi1onally independent given Z – Z blocks all paths between X and Y
d-‐separa1on and Markov blanket
Markov blanket: set of nodes that separate a node from all others d-‐separa3on: Method for determining whether a pair of nodes (or sets of nodes) are independent condi1oned on another set
Defini1on: d-‐separa1on • Node v is a collider if two arrowheads meet at v
• X and Y are d-‐connected by Z in graph G iff – Exists an undirected path between a vertex in X and vertex in Y s.t. for every collider C on the path, C or descendant of C is in Z and no non-‐collider on path is in Z
• X and Y are d-‐separated by Z in G iff they are not d-‐connected by Z in G
Z Y X Z Y X
✓ ✗
Example 1
• X → Y → Z
• X ß Y → Z
In both cases, X,Z d-‐separated by Y: no colliders on path from X to Z, and X and Z not d-‐connected by Y
Example 2
• X → Y ← Z
• X,Z d-‐separated by a set of nodes only if Y NOT in that set. X,Z d-‐connected by Y
Example 3
• Are Y,Z d-‐separated by W?
• No, d-‐connected by X and W is descendent
X
Y Z
W
But…
Indep -‐> many networks Network -‐> 1 set indep
X Y Z Y ⊥ Z | X
X Y Z
X
Y Z
But…
Can rule out some
X Y Z Y ⊥ Z | X
X Y Z
X
Y Z X
Y Z
X
…And
HDL
Heart disease
Genes
Heart disease HDL
…Also Coin 1 Coin 2 # obs.
H H 5
T T 3
H T 1
T H 1
P(C1 = H ^C2 = H )> P(C1 = H )P(C2 = H )5 /10 > 6 /10*6 /10
C1 /⊥C2
Causal interpreta1on
• Causal Markov condi1on • Faithfulness • Causal sufficiency
+ a few others, e.g. variables “correctly” specified
Causal graph
• Arrows denote direct causes – Edge from X to Y means X causes Y
• DAG
ads buy weather
Causal Markov condi1on (CMC)
Node in the graph is independent of all of its non-‐descendants (direct and indirect effects) given its direct causes
CMC and screening off
Recall Common Cause Principle (CCP) If P(X^Y)>P(X)P(Y) then either X causes Y (or vice versa) or they have a common cause
Now: if P(X^Y)>P(X)P(Y) and they have a common cause C, it means X ind Y | C Note that CCP seeks single common cause. CMC allows for sets of nodes.
Problems: feedback
Supply
Demand
Problems: Indeterminism
P(picture|switch) < P(picture|switch,sound)
TV switch
Picture Sound
TV switch
Picture Sound
Closed Circuit
Spirtes, P., Glymour, C., & Scheines, R. (2000). Causa8on, predic8on, and search. MIT Press
Problems: hidden common causes
CFS
Tired Achy
Flu
Completeness of graph
• Complete: all common causes included, all causal rela1ons among variables included
• Incomplete: not all intermediate factors necessarily included
Faithfulness
Exactly the dependencies in the underlying structure hold in the data – i.e. Independence rela1ons not from chance but from structure
– No canceling out
Example
Smoking
Exercise
Health
-‐ +
+
Another example
Gene A
Gene B Phenotype
-‐ +
+
A final example (determinis1c chain)
X Y Z
X ⊥ Z |Y
Selec1on bias
fever Abdominal pain F+ A
Go to hospital Stay home
Cooper, G. F. (1999). An overview of the representa1on and discovery of causal rela1onships using bayesian networks. In C. Glymour & G. F. Cooper (Eds.), Computa8on, causa8on, and discovery. AAAI Press and MIT Press
Recap of problems for faithfulness
• Only true in large sample limit • Simpson’s paradox • Selec1on bias • Sta1s1cal tests
Quick recap
• CMC: popula1on produced by structure has these independencies
• Faithfulness: popula1on has only these independencies Why do we need both?
X Y Z
Causal sufficiency
• All common causes of pairs of variables measured
• Not sufficient if Y not measured
X
Y
Z
Completeness vs. sufficiency
Completeness: common causes are included in causal graph Sufficiency: all common causes have been measured
Example
Intelligence
X Y Z
Scheines, R. (1997). An introduc1on to causal inference. Causality in Crisis, 185-‐99
What if no causal sufficiency?
• Need to include all graphs with unmeasured common causes.
• Ex: measured A, B, C. Found A ╨ C. With CMC, faithfulness but no causal sufficiency, the following graphs are all possible (X is unmeasured):
A
B
C A
B
C
X
A
B
C
X . . .
In absence of sufficiency…
• Can s1ll learn something – Some rela1onships may appear in all graphs – Can find set of all graphs represen1ng independence rela1ons, with nodes for possible hidden variables
• Timing informa1on helps
Things to beware of with inference
• Sample size • Missing data (not just variables) • Mul1ple tes1ng (and FDR) • What structures DAG can/cannot represent (e.g. 1me series and feedback)
• Variable representa1on
The good news
• Can add 1me • Can experiment • Methods for tes1ng assump1ons
Recap of causal inference with BN
What makes a Bayesian network causal? The assump1ons: CMC, sufficiency, faithfulness
Assump1ons+DataàIndependencies à Causal BN(s) à effects of interven1ons
Learning BN
Same methods we discussed last week – Search and score – Constraint based (e.g. PC algorithm)
Pearl on causality
• Ac1ons – What happens if we do X?
• Counterfactuals – What if things happened differently?
• Explana1ons – Why did X happen?
Manipulability
BPA
Obesity
Ideal manipula1ons
• Defini1on: change in value of a variable that does not introduce any other changes (except those produced by the change in variable)
Weather
Running speed Air condi1oner
Speeches
Popularity Dona1ons
Tes1ng popularity, how do we manipulate it’s value?
Seeing versus doing
Disease
Treatment
Doctor
Outcome
Hospital
Predic1ng the effects of ac1ons
What’s P(C) if I turn the sprinkler on? Is this the same as P(C|S=T)?
Cloudy
Sprinkler Rain
Wet grass
Interven1on and joint probability
!= just incorpora1ng evidence – Last week: set value of observed values – This week: set value by forcing variable to take value independent of its parents’ values
Cloudy
Sprinkler Rain
Wet grass
If turn on sprinkler, the fact that it’s on no longer gives info about C
Interven1on and joint probability
Cloudy
Sprinkler Rain
Wet grass
P(C,S,W,R) = P(c)P(s | c)P(r | c)P(w | s, r)C,S,W ,R∑
P(C,W,R | do(s)) = P(c)P(s)P(r | c)P(w | s, r)C,W ,R∑
do() operator
Model can help us determine the effect of interven1ons P(X=x|Y=y) != P(X=x|set Y=y)
Example
P(S|do(M))
Disease
Medica1on
Side effect
Example Disease
Medica1on
Side effect
X P(s | do(m)) = P(s | m)) = P(s,d |d∑ m)
= P(s,d, md∑ ) / P(m) = P(s | d, m
d∑ )P(d | m)P(m) / P(m)
= P(s | d, md∑ )P(d)
do-‐calculus rules
1. Inser1on/dele1on of observa1ons
P(y | do(x), z,w) = P(y | do(x),w) if (Y⊥ Z|X,W)GX
Disease
Medica1on
Side effect
GX Means edges into X deleted
do-‐calculus rule 1 examples
1. Inser1on/dele1on of observa1ons
P(y | do(x), z,w) = P(y | do(x),w) if (Y⊥ Z|X,W)GX
D
M
S
G GS
D
M
S
X
W
Y
Z
G
X
W
Y
Z
GX
do-‐calculus rules
2. Ac1on/Observa1on exchange If remove edges from Z, and independent, can replace doing with observing
P(y | do(x),do(z),w) = P(y | do(x), z,w) if (Y⊥ Z | X,W)GXZ
X Y Z
W
X Y Z
W
G G-‐XZ
do-‐calculus rules
3. Inser1on/dele1on of ac1ons Z(W) is set of Z nodes that are not ancestors of W-‐nodes in
P(y | do(x),do(z),w) = P(y | do(x),w) if (Y ⊥ Z | X,W )GX,Z (W )
X Y Q
W
Z X Y Q
W
Z
GX
G G –X,Z(W)
Example
From pearl
P(y | do(z)) = P(y | z)
P(y | z) = P(y | x, z)P(x | z)x∑
With rule 3 (insert/delete ac1on):
P(x | z) = P(x) since (Z ⊥ X)GZ
With rule 2 (ac1on/observa1on exchange):
P(y | x, z) = P(y | x, z) since (Z ⊥Y | X)GZ
X Y Z
U
G
Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge University Press
X Y Z
U
GZ
X Y Z
U
GZ
Summary of do-‐calculus
1. Inser1on/dele1on of observa1ons 2. Ac1on/Observa1on exchange 3. Inser1on/dele1on of ac1ons
In general, may have unobserved/hidden variables
Some caveats
• Time • Modularity • Possibility of intervening • Efficacy
Counterfactuals reminder
• If I had not gone running, I would not have go}en a sunburn
• If the pa1ent had taken the drug, she would have recovered
• Had I bought shares of Apple stock in 2004, I would have made a large profit
Pearl on Counterfactuals
Like do(), except backward looking and changing value of variable Three steps
1. Abduc1on: use evidence to interpret past
2. Ac1on: change to hypothe1cal values
3. Predic1on: see consequences of ac1ons
Disease (D)
Medica1on (M)
Side effect (S)
Example from Pearl If D, then D would s1ll be true if A were false DàD¬A
Court (U)
Captain (C)
A B
Death (D)
Abduc1on
Captain (C)
A B
Death (D)
Ac1on
Court (U)
Captain (C)
A B
Death (D)
Predic1on
Court (U)
Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge University Press
Actual causality
Pearl: token cause = actual cause – What caused a person’s lung cancer? – Who is responsible for an accident
Graphical models and explana1on
• Graphical model that represents rela1onships between variables
• Observa1ons give truth value of variables in par1cular scenario
• Use model to evaluate counterfactual queries
Pearl’s approach to actual cause
• Based on but a}empts to solve problems with counterfactuals – Bob and Susie and the broken bo}le
• Key idea: sustenance (mix of necessity and sufficiency) – If Bob’s rock missed, would Susie’s sustain the glass breaking?
Defini1ons
• Depends (necessity) – y depends on x, if x is necessary to maintain value of y
• Produces (sufficiency) – x can produce y if it can bring about effect when neither are present
• Sustains – x sustains y if there is at least one condi1on where Y will differ from y in absence of x AND Y=y is maintained in presence of x under any set of condi1ons
Causal Beams
• Causal beam: New model, where we remove all parents except those that minimally sustain their children. Set other parents to some w’.
• x is actual cause of y if x is necessary for y in that causal beam for some w’
Example of causal beam
• Did the traveler die of thirst or poisoning?
• Death = C v D
X (enemy2 shoots canteen)
D (dehydra1on)
p(enemy1 Poisons water)
C (cyanide Intake)
Y (death)
Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge University Press
Construc1ng the causal beam
• True: X,D,Y,P • False: C
• Note: C=¬X^P
X=1
D =1
P=1
C =0
Y =1
X= shoots canteen, D=dehydra1on, Y=Death, C=cyanide intake, P=poisons water
Sustaining
Inac1ve
Construc1ng the causal beam
• True: X,D,Y,P • False: C
• Now: C=¬X
X=1
D =1
P=1
C =0
Y =1
X= shoots canteen, D=dehydra1on, Y=Death, C=cyanide intake, P=poisons water
Sustaining
Construc1ng the causal beam
• True: X,D,Y,P • False: C
• Next: Y=DvC
X=1
D =1
P=1
C =0
Y =1
X= shoots canteen, D=dehydra1on, Y=Death, C=cyanide intake, P=poisons water
Sustaining
Sustaining Inac1ve
Construc1ng the causal beam
• True: X,D,Y,P • False: C
• Finally: Y=D, Y=X
X=1
D =1
P=1
C =0
Y =1
X= shoots canteen, D=dehydra1on, Y=Death, C=cyanide intake, P=poisons water
Sustaining
Sustaining
What if we are uncertain?
• U represents 1me 1ll first drink
• U=1 if canteen emp1ed before drink
• U=0 otherwise
X=1
D
P=1
C
Y
u
Case 1: u=1
• Canteen emp1ed before traveler drinks
• Same beam as before
X=1
D =1 C =0
Y =1
P=1 U=1
Case 2: u=0
• Drinks before canteen is emp1ed
• What if U is uncertain? • Use P(u) to calculate • P(x caused y)= – Sum of P(u) over u where x caused y in u
X=1
D=0
P=1
C=1
Y=1
U=0
Problem: Over-‐determina1on
• Which rifleman caused the death?
• Counterfactual: – If not A, then B would have caused D
Court (U)
Captain (C)
A B
Death (D)
Problem: Over-‐determina1on
• Which rifleman caused the death?
• Counterfactual: – If not A, then B would have caused D
• Structural: – A sustains D against B
Court (U)
Captain (C)
A
Death (D)
B=False
Challenges for the actual cause
• Type !=Token • Subjec1vity • Timing
Challenge: subjec1vity
• Variables chosen and what values they are set to affect outcome – Smoking vs. dura1on of smoking
• Note: both queries and their answers may then differ
Intoxica1on
Car Accident
Weather Emo1onal state
Challenge: 1me
• Bob starts smoking (S) Wednesday. He’s diagnosed with lung cancer (LC) on Friday. Did his S cause his LC?
• Bob later dies. Was LC the cause?
Smoking
Lung cancer
Death
Inference recap BN DBN Granger Temporal logic
Results Graph
Time No
Data C/D/M
Cycles No
Latent vars. Yes
Predic1on Yes
Token cause
Counterfactual -‐based
BN DBN Granger Temporal logic
Results Graph Graph Rela1onships Rela1onships
Time No Set of lags Single lag* Window
Data C/D/M C/D/M* C D/M
Cycles No Yes Yes Yes
Latent vars
Yes Yes No No
Predic1on
Yes Yes No No
Token cause
Counterfactual-‐based
No No Probabilis1c
Further reading • Graphical models and causality – Spirtes, P., Glymour, C., & Scheines, R. (2000). Causa8on, predic8on, and search. MIT Press
– Pearl, J. (2000/2009). Causality: Models, reasoning, and inference. Cambridge University Press.
• Actual cause – Pearl’s book – Halpern, J. Y., & Pearl, J. (2005). Causes and explana1ons: A structural-‐model approach. Part I: Causes. The Bri8sh Journal for the Philosophy of Science, 56(4), 843-‐887.
So�ware
• TETRAD V h}p://www.phil.cmu.edu/tetrad/ • h}p://www.dagi}y.net
For next week
• How can we find how long it takes for smoking to cause lung cancer?
• When to buy/sell a stock a�er you hear some news?
• Read CPT 2.4.2