Causal Models, Learning Algorithms and their Application to Performance Modeling

Jan Lemeire
Parallel Systems lab
November 15th 2006



Overview

  I. Causal Models
  II. Learning Algorithms
  III. Performance Modeling
  IV. Extensions


I. Multivariate Analysis

  Variables

    Probabilistic model of joint distribution?
    Relational information?
    A priori unknown relations

  Experimental data


A. Representation of distributions

Factorization

Reduction of factorization complexity

Bayesian Network

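Chain rule for the ordering A, B, C, D:

$P(A, B, C, D) = P(A)\,P(B \mid A)\,P(C \mid A, B)\,P(D \mid A, B, C)$

A factor simplifies when a conditional independence holds:

$P(C \mid A, B) = P(C \mid B) \iff A \perp C \mid B$

[Figure: Bayesian networks over A, B, C, D obtained from two variable orderings. Ordering 1 (A, B, C, D) is annotated with $C \perp D \mid B$ and $A \perp C \mid B$; Ordering 2 (A, D, B, C) is annotated with $C \perp D \mid B$, $A \perp C \mid B$ and $A \perp D$.]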
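The reduction in factorization complexity can be made concrete by counting free parameters. The following is a minimal sketch, assuming all four variables are binary and assuming the parent sets shown in the comments (they are derived from the independencies on this slide, not read from the original figure). `free_parameters` is a hypothetical helper name.

```python
def free_parameters(parents, cardinality=2):
    """Count the free parameters of a factorization prod_i P(X_i | parents(X_i)).

    Each factor needs (cardinality - 1) numbers per parent configuration.
    """
    return sum((cardinality - 1) * cardinality ** len(ps)
               for ps in parents.values())


# Full chain rule: P(A).P(B|A).P(C|A,B).P(D|A,B,C)
full = {"A": [], "B": ["A"], "C": ["A", "B"], "D": ["A", "B", "C"]}

# Ordering A, B, C, D, assuming P(C|A,B) = P(C|B) and P(D|A,B,C) = P(D|A,B)
ordering_1 = {"A": [], "B": ["A"], "C": ["B"], "D": ["A", "B"]}

# Ordering A, D, B, C, assuming additionally that A and D are independent
ordering_2 = {"A": [], "D": [], "B": ["A", "D"], "C": ["B"]}

counts = {
    "full": free_parameters(full),
    "ordering 1": free_parameters(ordering_1),
    "ordering 2": free_parameters(ordering_2),
}
print(counts)
```

Under these assumptions the second ordering needs the fewest parameters, which is the sense in which the choice of ordering matters.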


B. Representation of Independencies

Conditional independence
  Qualitative property: P(rain | quality of speech) = P(rain)?
  $P(A \mid B, C) = P(A \mid B) \iff A \perp C \mid B$

Markov condition in graph
  A variable becomes independent of all its non-descendants by conditioning on its direct parents.
  – graphical d-separation criterion

[Figure: graph over A, B, C, D]
  B d-separates A from C: $A \perp C \mid B$
  A is d-separated from D: $A \perp D$
  A is not d-separated from D, given B: $A \not\perp D \mid B$
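The d-separation criterion can be checked mechanically. Below is a minimal sketch using the moralized ancestral graph method, a standard equivalent formulation. The graph A -> B <- D, B -> C is an assumption: it is one graph consistent with the three statements on this slide, since the original figure is not available. The function names are hypothetical.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG `parents`."""
    given = set(given)
    keep = ancestors(parents, {x, y} | given)
    # Moralize the ancestral graph: link each child to its parents,
    # marry the parents of a common child, and drop edge directions.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # x and y are d-separated iff every path between them passes through `given`.
    seen, stack = set(), [x]
    while stack:
        n = stack.pop()
        if n == y:
            return False
        if n in seen or n in given:
            continue
        seen.add(n)
        stack.extend(adj[n] - given)
    return True


# Assumed graph: A -> B <- D, B -> C (child -> list of parents)
graph = {"B": ["A", "D"], "C": ["B"]}
print(d_separated(graph, "A", "C", {"B"}))
print(d_separated(graph, "A", "D"))
print(d_separated(graph, "A", "D", {"B"}))
```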


Faithfulness

Faithfulness:
  Joint Distribution ⇔ Directed Acyclic Graph
  Conditional independencies ⇔ d-separation

Theorem:

If a faithful graph exists, it is the minimal factorization.

Independence-map:

All independencies in the Bayesian network appear in the distribution


C. Representation of Causal Mechanisms

Definition through interventions

Model of the underlying physical mechanisms

causal model + Conditional Probability Distributions + Causal Markov Condition = Bayesian network

[Figure: two models over A and B under the intervention do(A=a); in one the result is P(B|A=a), in the other P(B).]


Reductionism

Causal modeling = reductionism

Canonical representation: unique, minimal, independent

Building block = P(Xi|parents(Xi))
Whole theory is based on this modularity

Intervention = change of block

[Figure: causal network over X1, X2, X3, X4, X5, shown before and after the intervention do(X3=a), which sets X3 = a.]
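Treating an intervention as a change of one building block can be sketched on a two-variable model. The code below is an illustration under stated assumptions: binary variables, made-up probabilities, and the reading that do(A=a) yields P(B|A=a) when A causes B and P(B) when B causes A. The helper names `joint`, `do` and `prob` are hypothetical.

```python
from itertools import product


def joint(blocks, order):
    """Enumerate the joint distribution of binary variables.

    blocks[v] = (parents, cpt), where cpt maps a tuple of parent values
    to P(v = 1 | parents).
    """
    dist = {}
    for values in product((0, 1), repeat=len(order)):
        world = dict(zip(order, values))
        p = 1.0
        for v in order:
            parents, cpt = blocks[v]
            p1 = cpt[tuple(world[q] for q in parents)]
            p *= p1 if world[v] == 1 else 1.0 - p1
        dist[values] = p
    return dist


def do(blocks, var, value):
    """Intervention: replace the block of `var` by the constant `value`."""
    new_blocks = dict(blocks)
    new_blocks[var] = ((), {(): float(value)})
    return new_blocks


def prob(blocks, order, var, value=1):
    """Marginal probability P(var = value)."""
    idx = order.index(var)
    return sum(p for values, p in joint(blocks, order).items()
               if values[idx] == value)


order = ["A", "B"]
# Model 1 (assumed numbers): A causes B
a_causes_b = {"A": ((), {(): 0.3}),
              "B": (("A",), {(0,): 0.2, (1,): 0.9})}
# Model 2 (assumed numbers): B causes A
b_causes_a = {"B": ((), {(): 0.41}),
              "A": (("B",), {(0,): 0.1, (1,): 0.6})}

print(prob(do(a_causes_b, "A", 1), order, "B"))  # equals P(B=1 | A=1) of model 1
print(prob(do(b_causes_a, "A", 1), order, "B"))  # equals P(B=1) of model 2
```

All other blocks are left untouched by `do`, which is the modularity the slide refers to.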


Ultimate motivation for causality

Model = canonical representation, able to explain all qualitative properties (independencies), close to reality

If causal mechanisms are unrelated => model is faithful


II. Learning Algorithms

Two types:
  Constraint-based: based on the independencies
  Scoring-based: searches the set of all models and gives each a score of how well it represents the distribution


Step 1: Adjacency search

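Property: adjacent nodes do not become independent, whatever is conditioned on

Algorithm:
  start with fully connected graph
  check for marginal independencies
  check for conditional independencies

[Figure: adjacency search over A, B, C, D, starting from the fully connected graph; edges are removed for $A \perp D$, $A \perp C \mid B$ and $C \perp D \mid B$.]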
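The adjacency search can be sketched as follows. This is a simplified version in the style of the PC algorithm, not necessarily the exact procedure behind the slide. It assumes a perfect independence oracle; in practice the oracle is a statistical test. The `KNOWN` set lists the three independencies from this slide plus two that follow from the assumed graph A -> B <- D, B -> C.

```python
from itertools import combinations

# Independence oracle: (pair, conditioning set). Assumed for illustration.
KNOWN = {
    (frozenset("AD"), frozenset()),
    (frozenset("AC"), frozenset("B")),
    (frozenset("AC"), frozenset("BD")),
    (frozenset("CD"), frozenset("B")),
    (frozenset("CD"), frozenset("AB")),
}


def independent(x, y, cond):
    return (frozenset((x, y)), frozenset(cond)) in KNOWN


def adjacency_search(nodes, independent):
    """Remove edges between nodes that are independent given some subset."""
    adj = {n: set(nodes) - {n} for n in nodes}  # fully connected graph
    sepset = {}
    size = 0  # size 0 checks marginal independencies
    while any(len(adj[x]) - 1 >= size for x in nodes):
        for x in nodes:
            for y in sorted(adj[x]):
                # Condition on subsets of the other current neighbours of x.
                for cond in combinations(sorted(adj[x] - {y}), size):
                    if independent(x, y, set(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    return adj, sepset


adj, sepset = adjacency_search(["A", "B", "C", "D"], independent)
edges = {frozenset((x, y)) for x in adj for y in adj[x]}
print(sorted("".join(sorted(e)) for e in edges))
```

The separating sets are recorded because the orientation step needs them.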


Step 2: Orientation

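Property: V-structure can be recognized

Algorithm:
  look for v-structures
  derived rules

[Figure: orientation of the graph over A, B, C, D using $A \perp D$ and $A \not\perp D \mid B$, followed by three-node configurations over A, B, C with the (in)dependencies between A and C.]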
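The orientation step can be sketched as below, assuming the skeleton A-B, B-C, B-D and the separating sets that an adjacency search would return for this example. Only one derived rule is implemented (orient z -> w when x -> z, z - w and x, w are not adjacent, so that no new v-structure is created); the full rule set is not given in the source. `orient` is a hypothetical name.

```python
from itertools import combinations


def orient(adj, sepset):
    """Return the set of directed edges (tail, head) that can be inferred."""
    directed = set()
    # V-structures: x - z - y with x, y non-adjacent and z not in sepset(x, y).
    for z in adj:
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset[frozenset((x, y))]:
                directed |= {(x, z), (y, z)}
    # Derived rule: x -> z - w, x and w non-adjacent  =>  z -> w.
    changed = True
    while changed:
        changed = False
        for x, z in list(directed):
            for w in adj[z]:
                if (w != x and w not in adj[x]
                        and (z, w) not in directed
                        and (w, z) not in directed):
                    directed.add((z, w))
                    changed = True
    return directed


# Assumed input: skeleton and separating sets of the example
adj = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B"}}
sepset = {
    frozenset("AD"): set(),
    frozenset("AC"): {"B"},
    frozenset("CD"): {"B"},
}
directed = orient(adj, sepset)
print(sorted(directed))
```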


Assumptions

General statistical assumptions:
  No selection bias
  Random sample
  Sufficient data for correctness of statistical tests

Underlying network is faithful

Causal sufficiency
  No unknown common causes

[Figure: graph over A, B, C]


Criticism

Definition of causality?
  About predicting the effect of changes to the system

Faithfulness assumption
  E.g.: accidental cancellation

Causal Markov Condition
  “All relations are causal”

Learning algorithms are not robust
  Statistical tests make mistakes

[Figure: graphs over X, Y, U, V]
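Accidental cancellation can be illustrated with a linear model in which two causal paths from X to Y cancel exactly. This is a sketch under assumptions: the graph X -> Y, X -> U -> Y and the integer coefficients are invented for the illustration and are not taken from the slide's figure. `total_effect` is a hypothetical helper.

```python
def total_effect(weights, src, dst):
    """Sum over all directed paths src -> dst of the product of edge weights.

    In a linear model with independent noise terms and a root variable src,
    Cov(src, dst) equals this total effect times Var(src).
    """
    if src == dst:
        return 1
    return sum(w * total_effect(weights, child, dst)
               for (parent, child), w in weights.items() if parent == src)


# Direct path X -> Y and indirect path X -> U -> Y cancel: -6 + 2 * 3 = 0
cancelling = {("X", "Y"): -6, ("X", "U"): 2, ("U", "Y"): 3}
# A slightly different direct coefficient breaks the cancellation
generic = {("X", "Y"): -5, ("X", "U"): 2, ("U", "Y"): 3}

print(total_effect(cancelling, "X", "Y"))
print(total_effect(generic, "X", "Y"))
```

In the cancelling model X and Y are uncorrelated although X is a direct cause of Y, so a constraint-based algorithm would wrongly remove the edge.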


Part III: Performance Analysis

  High-Performance computing
    1 processor
    parallel system

Performance Questions:
  Performance prediction
  System-dependency?
  Parameter-dependency?
  Reasons of bad performance?
  Effect of Optimizations?


Causal modeling (cf. COMO lab, VUB)
  Representation form
  Close to reality
  Learning algorithms
  TETRAD tool (open-source, Java)

PhD??


Performance Models

Aim of performance analysis
  Support software developer
  High-performance applications

Expected properties
  offer insight into causes of performance degradation
  prediction
  estimate effect of optimizations
  reusable submodels
  separate application and system-dependency
  reason under uncertainty

=> causal models


Integrated in statistical analysis

Statistical characteristics
  Regression analysis
  Probability table compression
  Outlier detection

Iterative process
  1. Perform additional experiments
  2. Extract additional characteristics
  3. Indicate exceptions
  4. Analyze the divergences of the data points with the current hypotheses

[Figure: workflow diagram with the elements Application, Experiments, Profiling, Database, Model Construction, Causal Model, CPT compression, Curve Fitting, Analytical Model, User Inspection, Divergences and Exceptions, and the labels 1 to 4.]


A. Model construction

Model of computation time of LU decomposition algorithm
  elementsize (redundant variable) is sufficient for influence datatype -> cache misses
  regression analysis on submodels X=f(parents)
  analysis of parameters

[Figure: causal model with the variables n, datatype, elementsize, #op, #instrop, Cop, L1Mop, L2Mop, fclock and Tcomp.]


B. Detection of unexpected dependencies

Point-to-point communication performance

background communication


C. Finding explanations for outliers

Exceptional data in communication performance measurements

Probability table compression

  X     P(X=1)   Y
  X0    0        Y0
  X1    0        Y0
  X2    1        Y1
  X3    1        Y1

=> derived variable
Interesting features


IV. Complexity of Performance Data

Mixture of discrete and continuous variables -> Mutual Information & Kernel Density Estimation

Non-linear relations -> Mutual Information & Kernel Density Estimation

Deterministic relations -> Augmented models & Complexity criterion

Context variables -> Work in progress

Context-specific independencies -> Work in progress


A. Information-theoretic Dependency

  Entropy of random variable X
  Discretized entropy for continuous variable

  Mutual Information
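The formulas on this slide did not survive extraction. As a reminder, the standard definitions are H(X) = -sum_x p(x) log p(x) and I(X;Y) = H(X) + H(Y) - H(X,Y). The sketch below computes both for a discrete joint table; the function names and the two example tables are assumptions made for the illustration.

```python
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a distribution {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def marginal(joint, axis):
    """Marginal distribution of one coordinate of a joint table."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint table {(x, y): probability}."""
    return (entropy(marginal(joint, 0)) + entropy(marginal(joint, 1))
            - entropy(joint))


# Two independent fair bits: no shared information
independent_bits = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
# Y is a copy of X: one full bit of shared information
copied_bit = {(0, 0): 0.5, (1, 1): 0.5}

print(mutual_information(independent_bits))
print(mutual_information(copied_bit))
```

Mutual information is zero exactly when the two variables are independent, whatever the form of the relation, which is why it serves as a dependency measure here.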


B. Kernel Density Estimation

  See applets

Trade-off maximal entropy <> typicalness

Conclusions
  Limited number of data points needed
  Discretization of continuous data justified
  Form-free dependency measure
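For readers without access to the applets, a kernel density estimate can be sketched in a few lines. This version assumes a Gaussian kernel and a hand-picked bandwidth; the source does not say which kernel or bandwidth rule was used, and the sample values are invented.

```python
from math import exp, pi, sqrt


def gaussian_kde(samples, bandwidth):
    """Return a density function: the average of Gaussian bumps on the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * sqrt(2.0 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return density


# Invented sample with two clusters
samples = [1.0, 1.2, 2.9, 3.1, 3.3]
density = gaussian_kde(samples, bandwidth=0.4)

for x in (1.1, 2.0, 3.1):
    print(x, density(x))
```

A smaller bandwidth follows the individual data points more closely, while a larger one gives a smoother, flatter estimate.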


C. Deterministic relations

  Y=f(X)

Y becomes independent from Z conditioned on X
  ~ violation of the intersection condition (Pearl ’88)
  Not faithfully describable

Solution: augmented causal model
  - add regularity to model
  - adapt inference algorithms

[Figure: candidate graphs over X, Y, Z]
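The effect of a deterministic relation on the independencies can be verified by enumeration. The sketch below assumes binary variables, a bijective relation Y = 1 - X and invented probabilities for Z given X. It checks conditional independence directly from the definition P(a,b,c).P(c) = P(a,c).P(b,c); `cond_independent` is a hypothetical helper.

```python
from collections import defaultdict


def cond_independent(joint, i, j, k=None, tol=1e-12):
    """Check whether coordinate i is independent of coordinate j given
    coordinate k (or marginally if k is None) in a joint probability table."""
    p_k = defaultdict(float)
    p_ik = defaultdict(float)
    p_jk = defaultdict(float)
    p_ijk = defaultdict(float)
    for world, p in joint.items():
        a, b = world[i], world[j]
        c = None if k is None else world[k]
        p_k[c] += p
        p_ik[a, c] += p
        p_jk[b, c] += p
        p_ijk[a, b, c] += p
    return all(
        abs(p_ijk.get((a, b, c), 0.0) * p_k[c] - p_ik[a, c] * p_jk[b, c2]) <= tol
        for (a, c) in p_ik for (b, c2) in p_jk if c == c2
    )


# Assumed model: X fair bit, Y = 1 - X (deterministic), Z depends on X
p_z_given_x = {0: 0.2, 1: 0.8}
joint = {
    (x, 1 - x, z): 0.5 * (p_z_given_x[x] if z == 1 else 1 - p_z_given_x[x])
    for x in (0, 1) for z in (0, 1)
}
X, Y, Z = 0, 1, 2

print(cond_independent(joint, Y, Z, X))  # Y independent of Z given X
print(cond_independent(joint, X, Z, Y))  # X independent of Z given Y
print(cond_independent(joint, X, Z))     # X independent of Z marginally?
```

Both conditional independencies hold while X and Z remain dependent, which is the situation the intersection condition rules out.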


The Complexity Criterion

Select simplest relation

X & Y contain equivalent information about Z

Complexity(Y-Z) < Complexity(X-Z)

[Figure: two candidate graphs over X, Y, Z]


Augmented causal model

Consistent models under Complexity Increase assumption
  Compl(X-Z) ≥ Compl(X-Y)
  Compl(X-Z) ≥ Compl(Y-Z)

Restrict conditional independencies
Generalize d-separation

Reestablish faithfulness

[Figure: augmented graphs over X, Y, Z with the labels S and eq]


Theory works!

[Figure: panels with the labels A, B, Deterministic and Probabilistic]


Conclusions

  Benefit of the integration of statistical techniques

  Causal modeling is a challenge
    – wants to know the inner from the outer

  More information
    – http://parallel.vub.ac.be
    – http://parallel.vub.ac.be/~jan