
Causality Workbench clopinet.com/causality

Challenges in Causality

Isabelle Guyon, Clopinet

Constantin Aliferis and Alexander Statnikov, Vanderbilt Univ.

André Elisseeff and Jean-Philippe Pellet, IBM Zürich

Gregory F. Cooper, University of Pittsburgh

Peter Spirtes, Carnegie Mellon


Causal discovery

What affects…

…your health?

…climate changes?

…the economy?

Which actions will have beneficial effects?


What is causality?

• Many definitions:
  – Science
  – Philosophy
  – Law
  – Psychology
  – History
  – Religion
  – Engineering

• “Cause is the effect concealed, effect is the cause revealed” (Hindu philosophy)


An engineering view…


The system

Systemic causality

External agent


Feature Selection

X

Y

Predict Y from features X1, X2, …

Select most predictive features.


X

Y

Causation

Predict the consequences of actions:

Under “manipulations” by an external agent, some features are no longer predictive.
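The contrast between prediction and manipulation can be illustrated with a small simulation. This is a sketch under assumed conditions that are not taken from the slides: a hypothetical three-variable chain cause → Y → effect with Gaussian noise, in which the external agent overwrites the effect with independent noise. All function and variable names are illustrative.

```python
import random

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n, manipulated, seed=0):
    """Sample the assumed chain cause -> Y -> effect.

    If manipulated is True, an external agent sets the effect to
    independent noise, which cuts its link to Y.
    Returns (corr(cause, Y), corr(effect, Y)).
    """
    rng = random.Random(seed)
    cause, target, effect = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)                 # a cause of Y
        y = c + rng.gauss(0, 1)             # the target
        if manipulated:
            e = rng.gauss(0, 1)             # value imposed from outside
        else:
            e = y + rng.gauss(0, 1)         # natural consequence of Y
        cause.append(c)
        target.append(y)
        effect.append(e)
    return correlation(cause, target), correlation(effect, target)

natural = simulate(5000, manipulated=False)
after = simulate(5000, manipulated=True)
print("natural:     cause %.2f, effect %.2f" % natural)
print("manipulated: cause %.2f, effect %.2f" % after)
```

In the natural data both features correlate with Y, so a feature selector would keep both. Once the effect is manipulated, its correlation with Y drops to roughly zero, while the cause remains predictive.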


What is out there?


Available data

• A lot of “observational” data.

Correlation ≠ Causality!

• Experiments are often needed, but:
  – Costly
  – Unethical
  – Infeasible


Causal discovery from “observational data”

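Example algorithm: PC (Peter Spirtes and Clark Glymour, 1999)

Let $A, B, C \in X$ and $V \subset X$. Initialize with a fully connected un-oriented graph.

1. Conditional independence. Cut the connection $A - B$ if $\exists V$ s.t. $(A \perp B \mid V)$.

2. Colliders. In triplets $A - C - B$ with $A$ and $B$ not connected, if there is no subset $V$ containing $C$ s.t. $A \perp B \mid V$, orient edges as $A \rightarrow C \leftarrow B$.

3. Constraint-propagation. Orient edges until no change:

(i) If $A \rightarrow B \rightarrow \dots \rightarrow C$ and $A - C$, then $A \rightarrow C$.

(ii) If $A \rightarrow B - C$ (with $A$ and $C$ not connected), then $B \rightarrow C$.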
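Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: `indep` is a hypothetical oracle standing in for a statistical conditional-independence test, the collider step uses the separating set recorded in step 1 (a common shortcut for the "no subset V containing C" condition), and the constraint-propagation step is omitted.

```python
from itertools import combinations

def pc_skeleton_and_colliders(variables, indep):
    """Sketch of PC steps 1 and 2.

    indep(a, b, cond) -> bool is an assumed conditional-independence
    oracle; cond is a frozenset of conditioning variables.
    Returns (adjacency dict, set of directed edges (tail, head)).
    """
    # Start from a fully connected un-oriented graph.
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}

    # Step 1: cut a - b if some conditioning set makes them independent.
    size = 0
    while any(len(adj[a]) - 1 >= size for a in variables):
        for a in variables:
            for b in sorted(adj[a]):
                others = sorted(adj[a] - {b})
                for cond in combinations(others, size):
                    if indep(a, b, frozenset(cond)):
                        adj[a].discard(b)
                        adj[b].discard(a)
                        sepset[frozenset((a, b))] = frozenset(cond)
                        break
        size += 1

    # Step 2: in a - c - b with a, b not connected, orient a -> c <- b
    # when c is not in the set that separated a and b.
    directed = set()
    for c in variables:
        for a, b in combinations(sorted(adj[c]), 2):
            if b not in adj[a] and c not in sepset[frozenset((a, b))]:
                directed.add((a, c))
                directed.add((b, c))
    return adj, directed

# Toy oracle for the true graph A -> C <- B: A and B are independent
# unless we condition on their common effect C.
def oracle(a, b, cond):
    return {a, b} == {"A", "B"} and "C" not in cond

adj, directed = pc_skeleton_and_colliders(["A", "B", "C"], oracle)
print(sorted(directed))
```

With real data, the oracle would be replaced by a statistical test, which is where the finite-sample and faithfulness issues arise.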


Difficulties

• Violated assumptions:
  – Causal sufficiency
  – Markov equivalence
  – Faithfulness
  – Linearity
  – Gaussianity

• Overfitting (statistical complexity):
  – Finite sample size

• Algorithm efficiency (computational complexity):
  – Thousands of variables
  – Tens of thousands of examples


Causality workbench


Our approach

What is the causal question?

Why should we care?

What is hard about it?

Is this solvable?

Is this a good benchmark?


First datasets

Toy datasets

Challenge datasets


On-line feed-back


Our challenges

Find…

• Problems

• Data

• Metrics

• Challenge protocols

• Implementation


Upcoming datasets

[Figure: chart of upcoming datasets; one axis runs from 0 to 16000 and the other from 0 to 100 (axis names not recoverable). Labels: DALTON, ECONO, TIED; Ecology, Healthcare, mass spec, Marketing, Conceptual, Psychology, Epidemiology, Internet, Climatology, Neuroscience, Security, Sociology.]


Want to contribute data?

• Real data:
  – Non-confidential
  – Large number of samples
  – Large number of variables
  – Observational and experimental

• Semi-artificial data:
  – Re-simulated
  – Real data + artificial variables


Performance assessment


Metrics

• Fulfillment of an objective:
  – Future (prediction)
  – Past (counterfactual)

• Causal relationships:
  – Existence
  – Strength
  – Degree


Examples of objectives

• Medicine and epidemiology
  – Maximize life expectancy
  – Maximize drug efficacy
  – Minimize contagion

• Economy and marketing
  – Maximize Gross National Product (GNP)
  – Maximize sales
  – Minimize churn rate


Causality assessment with manipulations

LUCAS0: natural

[Figure: causal graph with nodes Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue]


Causality assessment with manipulations

LUCAS1: manipulated

[Figure: the LUCAS causal graph, same twelve nodes as LUCAS0, under manipulation]


Causality assessment with manipulations

LUCAS2: manipulated

[Figure: the LUCAS causal graph, same twelve nodes as LUCAS0, under manipulation]


Goal driven causality

[Figure: causal graph with nodes numbered 0 to 11; listed variables: 4, 11, 2, 3, 1]

• We define: V = variables of interest (e.g. MB (Markov blanket), direct causes, ...).

• Participants return: S = selected subset (ordered or not).

• We assess causal relevance: R = f(V, S).

Causality assessment without manipulation?

Using artificial “probes”

LUCAP0: natural

[Figure: the LUCAS causal graph (Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue) extended with probe variables P1, P2, P3, PT]


Using artificial “probes”

LUCAP1&2: manipulated

[Figure: the LUCAS causal graph extended with probe variables P1, P2, P3, PT, under manipulation]


Scoring using “probes”

• What we can compute (Fscore):

– Negative class = probes (here, all “non-causes”, all manipulated).

– Positive class = other variables (may include causes and non-causes).

• What we want (Rscore):

– Positive class = causes.

– Negative class = non-causes.

• What we get (asymptotically):

Fscore = (NTruePos/NReal) Rscore + 0.5 (NTrueNeg/NReal)
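As a worked instance of this relation, the sketch below plugs in made-up counts. It assumes that NReal is the number of real (non-probe) variables, NTruePos the number of those that are true causes, and NTrueNeg the number that are non-causes, so that NTruePos + NTrueNeg = NReal. This reading of the symbols and the function name are assumptions, and the numbers are illustrative only.

```python
def expected_fscore(rscore, n_true_pos, n_true_neg):
    """Asymptotic Fscore according to the relation on the slide.

    Assumes n_true_pos + n_true_neg is the number of real variables.
    """
    n_real = n_true_pos + n_true_neg
    return (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)

# Hypothetical case: 20 causes and 80 non-causes among 100 real variables.
for rscore in (0.5, 0.75, 1.0):
    print(rscore, expected_fscore(rscore, n_true_pos=20, n_true_neg=80))
```

Under these assumptions the Fscore is a linear function of the Rscore, with a slope equal to the fraction of real variables that are causes, and a chance-level Rscore of 0.5 maps to an Fscore of 0.5.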


Conclusion

• Try our first challenge, learn, and win!!!!
  – WCCI08 Workshop, Hong Kong, June 2008.
    • Travel grants for top-ranking students.
  – Proceedings of JMLR. Top-ranking entrants will be invited to write a paper.
    • Best paper award: free WCCI registration.
  – Prizes: P(i)=$100. P = n*sum P(i).

• Your problem solved by dozens of research groups:
  – Help us organize the next challenge!