Structure Learning Using Causation Rules Raanan Yehezkel PAML Lab. Journal Club March 13, 2003

Page 1: Structure Learning Using Causation Rules

Structure Learning Using Causation Rules

Raanan Yehezkel

PAML Lab. Journal Club

March 13, 2003

Page 2: Structure Learning Using Causation Rules

Main References

• Pearl, J., Verma, T., A Theory of Inferred Causation, Proceedings of the Second International Conference of Representation and Reasoning, San Francisco. 1991.

• Spirtes, P., Glymour, C., Scheines, R., Causation Prediction and Search, second edition, 2000, MIT Press.

Page 3: Structure Learning Using Causation Rules

Taken from Judea Pearl's website

Simpson’s “Paradox”

The sure thing principle (Savage, 1954)

Let a, b be two alternative acts of any sort, and let G be any event.

If you would definitely prefer b to a, either knowing that the event G obtained, or knowing that the event G did not obtain, then you definitely prefer b to a.

Page 4: Structure Learning Using Causation Rules

Taken from Judea Pearl's website

Simpson’s “Paradox”

Local success rate:

        G = male patients      G’ = female patients
Old     5% (50/1000)           50% (5000/10000)
New     10% (1000/10000)       95% (95/100)

New treatment is preferred for the male group (G).

New treatment is preferred for the female group (G’).

=> New treatment is preferred.

Global success rate (all patients):

Old     46% (5050/11000)
New     11% (1095/10100)
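The reversal can be checked directly from the counts on this slide. The following is a minimal sketch; the dictionary layout and function names are illustrative choices, not part of the original presentation.

```python
from fractions import Fraction

# (successes, patients) per treatment and group, read off the slide's counts.
counts = {
    "old": {"male": (50, 1000), "female": (5000, 10000)},
    "new": {"male": (1000, 10000), "female": (95, 100)},
}

def local_rate(treatment, group):
    """Success rate within one group (G or G')."""
    successes, patients = counts[treatment][group]
    return Fraction(successes, patients)

def global_rate(treatment):
    """Success rate pooled over all patients who received the treatment."""
    successes = sum(s for s, _ in counts[treatment].values())
    patients = sum(n for _, n in counts[treatment].values())
    return Fraction(successes, patients)

for treatment in ("old", "new"):
    local = {g: float(local_rate(treatment, g)) for g in counts[treatment]}
    print(treatment, local, round(float(global_rate(treatment)), 2))
```

The new treatment wins inside each group but loses in the pooled data, because most patients on the new treatment belong to the group with the lower success rate.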

Page 5: Structure Learning Using Causation Rules

Simpson’s “Paradox”

• Intuitive way of thinking: G → S ← T (G and T are independent)

P(S,G,T) = P(G) · P(T) · P(S|G,T)

P(S=1 | T=new) = Σ_G P(G) · P(S=1 | G,T=new) = 0.51

P(S=1 | T=old) = Σ_G P(G) · P(S=1 | G,T=old) = 0.27

Page 6: Structure Learning Using Causation Rules

Simpson’s “Paradox”

• The faithful DAG: G → T, G → S, T → S

P(S,G,T) = P(G) · P(T|G) · P(S|G,T)

P(S=1 | T=new) = Σ_G P(G | T=new) · P(S=1 | G,T=new) = 0.11

P(S=1 | T=old) = Σ_G P(G | T=old) · P(S=1 | G,T=old) = 0.46
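The numbers on the two Simpson's paradox slides can be reproduced from the patient counts by summing over G under each factorization. This is a sketch under the assumption that the counts in the earlier table are the full data set; all function names are illustrative.

```python
# (successes, patients) for each (group, treatment), from the earlier table.
counts = {
    ("male", "old"): (50, 1000),
    ("female", "old"): (5000, 10000),
    ("male", "new"): (1000, 10000),
    ("female", "new"): (95, 100),
}
groups = ("male", "female")
total = sum(n for _, n in counts.values())

def p_success(group, treatment):
    """P(S=1 | G, T) estimated from the counts."""
    successes, patients = counts[(group, treatment)]
    return successes / patients

def p_group(group):
    """P(G): share of the group among all patients."""
    return sum(n for (g, _), (_, n) in counts.items() if g == group) / total

def p_group_given_treatment(group, treatment):
    """P(G | T): share of the group among patients on this treatment."""
    on_treatment = sum(n for (_, t), (_, n) in counts.items() if t == treatment)
    return counts[(group, treatment)][1] / on_treatment

def success_if_independent(treatment):
    """Sum over G of P(G) P(S=1|G,T): treats G and T as independent."""
    return sum(p_group(g) * p_success(g, treatment) for g in groups)

def success_faithful(treatment):
    """Sum over G of P(G|T) P(S=1|G,T): keeps the G -> T dependence."""
    return sum(p_group_given_treatment(g, treatment) * p_success(g, treatment)
               for g in groups)

for t in ("new", "old"):
    print(t, round(success_if_independent(t), 2), round(success_faithful(t), 2))
```

Only the weighting of the two groups differs between the two functions, which is enough to flip the ranking of the treatments.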

Page 7: Structure Learning Using Causation Rules

Assumptions:

• Directed Acyclic Graph, Bayesian Networks.

• All variables are observable.

• No errors in Conditional Independence test results.

Page 8: Structure Learning Using Causation Rules

Identifying cause and effect relations

• Statistical data.

• Statistical data and temporal information.

Page 9: Structure Learning Using Causation Rules

Identifying cause and effect relations

• Potential Cause

• Genuine Cause

• Spurious Association

Page 10: Structure Learning Using Causation Rules

Intransitive Triplet

• I(C1,C2)

• ~I(C1,E)

• ~I(C2,E)

[Figure: three graphs over C1, C2 and E; two of them include hidden variables H1 and H2.]

Page 11: Structure Learning Using Causation Rules

Potential Cause

X has a potential causal influence on Y if:

• X and Y are dependent in every context.

• There exist a variable Z and a context S_context such that:

  ~I(Z,Y|S_context)

  I(X,Z|S_context)

[Figure: graph over X, Y and Z.]

Page 12: Structure Learning Using Causation Rules

Genuine Cause

X has a genuine causal influence on Y if:

• Z is a potential cause of X.

• ~I(Z,Y|S_context)

• I(Z,Y|X,S_context)

[Figure: two panels over Z, X and Y, "Given context S" and "Given X and context S"; the link between Z and X is labeled "Potential".]

Page 13: Structure Learning Using Causation Rules

Spurious Association

X and Y are spuriously associated if:

1. ~I(X,Y|S_context)

2. ~I(Z1,X|S_context)

3. ~I(Z2,Y|S_context)

4. I(Z1,Y|S_context)

5. I(Z2,X|S_context)

[Figure: graph over Z1, X and Y (from conditions 1, 2, 4) and graph over Z2, X and Y (from conditions 1, 3, 5).]
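The five conditions above can be checked mechanically once a conditional independence test is available. The sketch below assumes a hypothetical oracle `indep(a, b, context)` that returns True when I(a,b|context) holds; the oracle, the toy independence list and all names are illustrative and not taken from the slides.

```python
from typing import Callable, FrozenSet

# Hypothetical oracle signature: indep(a, b, context) is True iff I(a, b | context).
Oracle = Callable[[str, str, FrozenSet[str]], bool]

def spuriously_associated(indep: Oracle, x: str, y: str, z1: str, z2: str,
                          context: FrozenSet[str] = frozenset()) -> bool:
    """Check conditions 1-5 of the spurious association rule."""
    return (
        not indep(x, y, context)        # 1. X and Y are dependent
        and not indep(z1, x, context)   # 2. Z1 and X are dependent
        and not indep(z2, y, context)   # 3. Z2 and Y are dependent
        and indep(z1, y, context)       # 4. Z1 and Y are independent
        and indep(z2, x, context)       # 5. Z2 and X are independent
    )

# Toy oracle (an assumption for illustration): marginal independencies that
# would hold for Z1 -> X <- H -> Y <- Z2 with H unobserved.
independent_pairs = {frozenset(p) for p in [("Z1", "Y"), ("Z2", "X"), ("Z1", "Z2")]}

def toy_indep(a: str, b: str, context: FrozenSet[str]) -> bool:
    if context:
        raise ValueError("the toy oracle only knows the empty context")
    return frozenset((a, b)) in independent_pairs

print(spuriously_associated(toy_indep, "X", "Y", "Z1", "Z2"))
```

In practice the oracle would be replaced by a statistical CI test, and the rule would be tried over candidate choices of Z1, Z2 and context.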

Page 14: Structure Learning Using Causation Rules

Genuine Cause with temporal information

X has a genuine causal influence on Y if:

• Z and S_context precede X.

• ~I(Z,Y|S_context)

• I(Z,Y|X,S_context)

[Figure: two panels over Z, X and Y, "Given context S" and "Given X and context S".]

Page 15: Structure Learning Using Causation Rules

Spurious Association with temporal information

X and Y are spuriously associated if:

1. ~I(X,Y|S_context)

2. X precedes Y.

3. I(Z,Y|S_context)

4. ~I(Z,X|S_context)

[Figure: two graphs over Z, X and Y, labeled "From conditions 1,2" and "From conditions 1,3,4".]

Page 16: Structure Learning Using Causation Rules

Algorithms

• Inductive Causation (IC).

• PC.

• Other.

Page 17: Structure Learning Using Causation Rules

Inductive Causation (IC)

Pearl and Verma, 1991

• For each pair (X,Y) find a set of nodes S_XY such that I(X,Y|S_XY). If no such set exists, place an undirected link between X and Y.

• For each pair of non-adjacent nodes (X,Y) with a common neighbor C, if C is not in S_XY then add arrowheads pointing at C: X → C ← Y.

Page 18: Structure Learning Using Causation Rules

Inductive Causation (IC)

Pearl and Verma, 1991

• Recursively:

1. If X - Y and there is a strictly directed path from X to Y, then add an arrowhead at Y.

2. If X and Y are not adjacent but X → C and there is Y - C, then direct the link C → Y.

• Mark uni-directed links X → Y if there is some link with an arrowhead at X.
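The recursive orientation step and the marking step can be sketched as a fixed-point loop over a partially directed graph. The representation below (a set of undirected links plus a set of `(tail, head)` arrows, with no bidirected links) and the function names are assumptions made for illustration.

```python
def has_directed_path(directed, src, dst):
    """True if a strictly directed path src -> ... -> dst exists."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for tail, head in directed:
            if tail == node and head not in seen:
                if head == dst:
                    return True
                seen.add(head)
                stack.append(head)
    return False

def propagate(undirected, directed):
    """Apply rules 1 and 2 until no further link can be oriented."""
    undirected = {frozenset(e) for e in undirected}
    directed = set(directed)

    def adjacent(a, b):
        return (frozenset((a, b)) in undirected
                or (a, b) in directed or (b, a) in directed)

    changed = True
    while changed:
        changed = False
        for edge in list(undirected):
            x, y = tuple(edge)
            for a, b in ((x, y), (y, x)):
                # Rule 1: a - b and a directed path from a to b.
                rule1 = has_directed_path(directed, a, b)
                # Rule 2: some p -> a with p and b non-adjacent, and a - b.
                rule2 = any(head == a and tail != b and not adjacent(tail, b)
                            for tail, head in directed)
                if rule1 or rule2:
                    undirected.remove(edge)
                    directed.add((a, b))
                    changed = True
                    break
    return undirected, directed

def marked_links(directed):
    """Links X -> Y for which some link has an arrowhead at X."""
    return {(x, y) for x, y in directed
            if (y, x) not in directed and any(head == x for _, head in directed)}

links, arrows = propagate({("C", "Y"), ("Y", "W")}, {("X", "C")})
print(sorted(arrows), sorted(marked_links(arrows)))
```

Starting from X → C with undirected links C - Y and Y - W, rule 2 fires twice and orients the chain away from X.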

Page 19: Structure Learning Using Causation Rules

Example (IC)

[Figure: true graph over X1, X2, X3, X4, X5.]

Page 20: Structure Learning Using Causation Rules

Example (IC)

[Figure: graph over X1, X2, X3, X4, X5.]

For each pair (X,Y) find a set of nodes S_XY such that I(X,Y|S_XY). If no such set exists, place an undirected link between X and Y.

Page 21: Structure Learning Using Causation Rules

Example (IC)

[Figure: graph over X1, X2, X3, X4, X5.]

For each pair of non-adjacent nodes (X,Y) with a common neighbor C, if C is not in S_XY then add arrowheads pointing at C: X → C ← Y.

Page 22: Structure Learning Using Causation Rules

Example (IC)

[Figure: graph over X1, X2, X3, X4, X5.]

Recursively:

1. If X - Y and there is a strictly directed path from X to Y, then add an arrowhead at Y.

2. If X and Y are not adjacent but X → C and there is Y - C, then direct the link C → Y.

Page 23: Structure Learning Using Causation Rules

Example (IC)

[Figure: graph over X1, X2, X3, X4, X5.]

Mark uni-directed links X → Y if there is some link with an arrowhead at X.

Page 24: Structure Learning Using Causation Rules

PC

Spirtes and Glymour, 1993

1. Form a complete undirected graph C on vertex set V.

Page 25: Structure Learning Using Causation Rules

PC

Spirtes and Glymour, 1993

2. n = 0;

3. Repeat

     Repeat

       • Select an ordered pair of adjacent vertices X and Y such that |Adj(C,X)\{Y}| ≥ n, and a subset S such that S ⊆ Adj(C,X)\{Y}, |S| = n.

       • If I(X,Y|S) = true, then delete edge(X,Y).

     Until all possible sets were tested.

     n = n + 1.

   Until: ∀ X,Y: |Adj(C,X)\{Y}| < n.
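Steps 1 to 3 can be sketched as follows. The CI oracle here only knows the independencies listed in the PC example slides (an assumption made to keep the sketch self-contained; a real run would use a statistical test), and recording the separating set S_XY at deletion time is added because the later orientation step refers to it. Function and variable names are illustrative.

```python
from itertools import combinations

def pc_skeleton(vertices, indep):
    """Adjacency search of PC: returns adjacency sets and separating sets."""
    adj = {v: set(vertices) - {v} for v in vertices}   # step 1: complete graph
    sepset = {}
    n = 0                                              # step 2
    # Step 3: continue while some ordered pair still has >= n other neighbors.
    while any(len(adj[x] - {y}) >= n for x in vertices for y in adj[x]):
        for x in vertices:
            for y in list(adj[x]):
                # Try every subset S of Adj(C,X)\{Y} with |S| = n.
                for s in combinations(sorted(adj[x] - {y}), n):
                    if indep(x, y, frozenset(s)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = frozenset(s)
                        break
        n += 1
    return adj, sepset

# Independencies listed in the PC example (n = 1 and n = 2).
listed = {
    (frozenset(("X1", "X3")), frozenset({"X2"})),
    (frozenset(("X1", "X4")), frozenset({"X2"})),
    (frozenset(("X1", "X5")), frozenset({"X2"})),
    (frozenset(("X3", "X4")), frozenset({"X2"})),
    (frozenset(("X2", "X5")), frozenset({"X3", "X4"})),
}

def slide_oracle(x, y, s):
    """True only for the conditional independencies listed above."""
    return (frozenset((x, y)), s) in listed

adj, sepset = pc_skeleton(["X1", "X2", "X3", "X4", "X5"], slide_oracle)
edges = {frozenset((x, y)) for x in adj for y in adj[x]}
print(sorted(tuple(sorted(e)) for e in edges))
```

Because subsets are drawn only from the current neighbors of X, the number of tests stays small when the true graph is sparse.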

Page 26: Structure Learning Using Causation Rules

PC

Spirtes and Glymour, 1993

4. For each triple of vertices X, Y, Z such that edge(X,Z) and edge(Y,Z), and X and Y are not adjacent, orient X → Z ← Y if and only if Z ∉ S_XY.
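Step 4 can be sketched as a loop over unshielded triples. The skeleton and separating sets below are what the independencies in the PC example slides imply (an inference from those slides, not something they state as code), and `orient_colliders` is an illustrative name.

```python
from itertools import combinations

def orient_colliders(adj, sepset):
    """Return arrows (tail, head) for every X -> Z <- Y found by step 4."""
    arrows = set()
    for z in adj:
        for x, y in combinations(sorted(adj[z]), 2):
            if y in adj[x]:
                continue                      # X and Y adjacent: no test
            if z not in sepset[frozenset((x, y))]:
                arrows.add((x, z))
                arrows.add((y, z))
    return arrows

# Skeleton left after the edge deletions in the PC example.
adj = {
    "X1": {"X2"},
    "X2": {"X1", "X3", "X4"},
    "X3": {"X2", "X5"},
    "X4": {"X2", "X5"},
    "X5": {"X3", "X4"},
}
# Separating sets recorded when the corresponding edges were deleted.
sepset = {
    frozenset(("X1", "X3")): frozenset({"X2"}),
    frozenset(("X1", "X4")): frozenset({"X2"}),
    frozenset(("X1", "X5")): frozenset({"X2"}),
    frozenset(("X3", "X4")): frozenset({"X2"}),
    frozenset(("X2", "X5")): frozenset({"X3", "X4"}),
}

print(sorted(orient_colliders(adj, sepset)))
```

Only the triple X3, X5, X4 is oriented: X5 is absent from S_3,4 = {X2}, whereas X2 belongs to the separating set of every non-adjacent pair it connects.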

Page 27: Structure Learning Using Causation Rules

Use Inductive Causation (IC)

Pearl and Verma, 1991

Recursively:

1. If X - Y and there is a strictly directed path from X to Y, then add an arrowhead at Y.

2. If X and Y are not adjacent but X → C and there is Y - C, then direct the link C → Y.

Mark uni-directed links X → Y if there is some link with an arrowhead at X.

Page 28: Structure Learning Using Causation Rules

Spirtes, Glymour and Scheines. 2000.

Example (PC)

True graph

[Figure: true graph over X1, X2, X3, X4, X5.]

Page 29: Structure Learning Using Causation Rules

Example (PC)

Form a complete undirected graph C on vertex set V.

[Figure: complete undirected graph over X1, X2, X3, X4, X5.]

Page 30: Structure Learning Using Causation Rules

Example (PC)

n = 0; |S_XY| = n

Independencies: none

[Figure: graph over X1, X2, X3, X4, X5 after this step.]

Page 31: Structure Learning Using Causation Rules

Example (PC)

n = 1; |S_XY| = n

Independencies:

I(X1,X3|X2), I(X1,X4|X2), I(X1,X5|X2), I(X3,X4|X2)

[Figure: graph over X1, X2, X3, X4, X5 after this step.]

Page 32: Structure Learning Using Causation Rules

Example (PC)

n = 2; |S_XY| = n

Independencies:

I(X2,X5|X3,X4)

[Figure: graph over X1, X2, X3, X4, X5 after this step.]

Page 33: Structure Learning Using Causation Rules

Example (PC)

For each triple of vertices X, Y, Z such that edge(X,Z) and edge(Y,Z), and X and Y are not adjacent, orient X → Z ← Y if and only if Z ∉ S_XY.

D-separation sets: S_3,4 = {X2}, S_1,3 = {X2}

[Figure: graph over X1, X2, X3, X4, X5 after this step.]

Page 34: Structure Learning Using Causation Rules

Possible PC improvements (2)

• PC*: tests conditional independence between X and Y given a subset S, where

S ⊆ [Adj(X) ∪ Adj(Y)] ∩ path(X,Y)

and path(X,Y) is the set of nodes on undirected paths between X and Y.

• CI test prioritization: for a given variable X, first test those variables Y that are least dependent on X, conditional on those subsets of variables that are most dependent on X.

Page 35: Structure Learning Using Causation Rules

Markov Equivalence

• (Verma and Pearl, 1990). Two causal models are equivalent if and only if their DAGs have the same links and the same set of uncoupled head-to-head nodes (colliders).

X → Z ← Y:

P = P(X)·P(Y)·P(Z|X,Y)

X ← Z → Y and X ← Z ← Y:

P = P(Z)·P(X|Z)·P(Y|Z) = P(Y)·P(X|Z)·P(Z|Y)
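The equivalence criterion on this slide translates into a short check: compare the undirected links and the uncoupled colliders of two DAGs. A minimal sketch, assuming each DAG is given as a set of `(parent, child)` pairs; the function names are illustrative.

```python
from itertools import combinations

def skeleton(dag):
    """Undirected links of the DAG."""
    return {frozenset(edge) for edge in dag}

def colliders(dag):
    """Uncoupled head-to-head nodes: X -> Z <- Y with X and Y non-adjacent."""
    links = skeleton(dag)
    parents = {}
    for parent, child in dag:
        parents.setdefault(child, set()).add(parent)
    return {
        (frozenset((x, y)), z)
        for z, ps in parents.items()
        for x, y in combinations(sorted(ps), 2)
        if frozenset((x, y)) not in links
    }

def markov_equivalent(dag_a, dag_b):
    """Same links and same uncoupled colliders."""
    return skeleton(dag_a) == skeleton(dag_b) and colliders(dag_a) == colliders(dag_b)

collider = {("X", "Z"), ("Y", "Z")}   # X -> Z <- Y
fork = {("Z", "X"), ("Z", "Y")}       # X <- Z -> Y
chain = {("Y", "Z"), ("Z", "X")}      # X <- Z <- Y

print(markov_equivalent(fork, chain), markov_equivalent(collider, fork))
```

The fork and the chain share a skeleton and have no colliders, so they are equivalent; the collider graph has the same skeleton but differs in its head-to-head node.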

Page 36: Structure Learning Using Causation Rules

Summary

• Algorithms such as PC and IC produce partially directed graphs, each of which represents a family of Markov equivalent graphs.

• The remaining undirected arcs can be oriented arbitrarily (under DAG restrictions) in order to construct a classifier.

• The main flaw of the IC and PC algorithms is that they might be unstable in a noisy environment. An error in one CI test for an arc might lead to errors in other arcs, and one erroneous orientation might lead to other erroneous orientations.

Page 37: Structure Learning Using Causation Rules