Bayes Nets

Bayes Nets - University of Minnesotavision.psych.umn.edu/users/schrater/schrater_lab/courses/...• Graphical models are a marriage between probability theory and graph theory. They


Bayes Nets


• Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering -- uncertainty and complexity -- and in particular they are playing an increasingly important role in the design and analysis of machine learning algorithms. Fundamental to the idea of a graphical model is the notion of modularity -- a complex system is built by combining simpler parts. Probability theory provides the glue whereby the parts are combined, ensuring that the system as a whole is consistent, and providing ways to interface models to data. The graph theoretic side of graphical models provides both an intuitively appealing interface by which humans can model highly-interacting sets of variables as well as a data structure that lends itself naturally to the design of efficient general-purpose algorithms.

• Many of the classical multivariate probabilistic systems studied in fields such as statistics, systems engineering, information theory, pattern recognition and statistical mechanics are special cases of the general graphical model formalism -- examples include mixture models, factor analysis, hidden Markov models, Kalman filters and Ising models. The graphical model framework provides a way to view all of these systems as instances of a common underlying formalism. This view has many advantages -- in particular, specialized techniques that have been developed in one field can be transferred between research communities and exploited more widely. Moreover, the graphical model formalism provides a natural framework for the design of new systems.

-- Michael Jordan


Graphical Models

• Representation
  – Efficiently represent a joint probability distribution
• Inference
  – Infer hidden states of the system, given data
• Learning
  – Estimate the parameters and structure of the model from data
• Applications


Example 1

• Pearl's (1988, p. 49) example: Sherlock Holmes is at work.
• His neighbor Mr. Watson, a practical joker, has called to say that his alarm at home has sounded.
• Should Sherlock rush home?
• If the alarm really has sounded, it may be because of a burglary or because of an earthquake. If he hears a radio report of an earthquake, his degree of confidence that there was a burglary will diminish.


What is a Bayes (belief) net?

Compact representation of joint probability distributions via conditional independence

Qualitative part: Directed acyclic graph (DAG)
• Nodes - random vars.
• Edges - direct influence

Quantitative part: Set of conditional probability distributions

Together: Define a unique distribution in a factored form

P(B, E, A, C, R) = P(B) P(E) P(A | B, E) P(R | E) P(C | A)

[Figure from N. Friedman: DAG with edges Burglary → Alarm, Earthquake → Alarm, Earthquake → Radio, Alarm → Call, shown with the CPT P(A | E, B) for the family of Alarm. The CPT has one row per setting of (E, B), with row entries 0.9/0.1, 0.2/0.8, 0.01/0.99, 0.9/0.1.]
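The factored form P(B, E, A, C, R) = P(B) P(E) P(A | B, E) P(R | E) P(C | A) can be sketched in a few lines of Python. This is only an illustration: every CPT number below is a made-up assumption (the slides do not give the priors or the Radio and Call tables), and the names `pr` and `joint` are hypothetical helpers.

```python
from itertools import product

# Hypothetical CPTs (illustrative numbers only, not taken from the slides).
# Each entry is P(variable = true | parents).
P_B, P_E = 0.1, 0.2
P_A = {(True, True): 0.9, (True, False): 0.8,      # keyed by (b, e)
       (False, True): 0.3, (False, False): 0.05}
P_R = {True: 0.7, False: 0.01}                     # keyed by e
P_C = {True: 0.6, False: 0.1}                      # keyed by a


def pr(p_true, value):
    """P(X = value) for a binary X with P(X = true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, c, r):
    """P(B, E, A, C, R) as a product of the five local factors."""
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_R[e], r) * pr(P_C[a], c))


# The 32 joint entries are never stored; they are computed on demand
# from the local tables, and they still sum to one.
total = sum(joint(*world) for world in product([True, False], repeat=5))
print(total)
```

Because each factor is itself a normalized conditional distribution, the product is automatically a valid joint distribution.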


What is a Bayes net?

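A node is conditionally independent of its non-descendants given its parents, e.g.

C ⊥ R, B, E | A

Hence the joint factorizes into one local conditional distribution per node, and the parameter count drops from 2^5 − 1 = 31 to 1 + 1 + 2 + 4 + 2 = 10.

[Figure: the Earthquake, Radio, Burglary, Alarm, Call network]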
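The parameter count 1 + 1 + 2 + 4 + 2 = 10 can be checked mechanically. The sketch below assumes all five variables are binary and uses a hypothetical `parents` dictionary to encode the network structure.

```python
# Parents of each binary node in the burglary network.
parents = {
    "Burglary": [],
    "Earthquake": [],
    "Radio": ["Earthquake"],
    "Alarm": ["Burglary", "Earthquake"],
    "Call": ["Alarm"],
}

# A binary node needs one free parameter per joint setting of its parents.
per_node = {node: 2 ** len(pa) for node, pa in parents.items()}
factored = sum(per_node.values())      # 1 + 1 + 2 + 4 + 2
full_joint = 2 ** len(parents) - 1     # 2^5 - 1 free entries in the full table
print(per_node, factored, full_joint)
```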


Why are Bayes nets useful?

- Graph structure supports
  - Modular representation of knowledge
  - Local, distributed algorithms for inference and learning
  - Intuitive (possibly causal) interpretation

- Factored representation may have exponentially fewer parameters than full joint P(X1,…,Xn) =>
  - lower sample complexity (less data for learning)
  - lower time complexity (less time for inference)


What can Bayes nets be used for?

• Posterior probabilities
  – Probability of any event given any evidence
• Most likely explanation
  – Scenario that explains evidence
• Rational decision making
  – Maximize expected utility
  – Value of Information
• Effect of intervention
  – Causal analysis

[Figure from N. Friedman: explaining away effect in the Earthquake, Radio, Burglary, Alarm, Call network]
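The explaining away effect can be reproduced by brute-force enumeration over the burglary network. The following is a minimal sketch: the CPT values are made-up assumptions chosen for illustration, and `posterior_burglary` is a hypothetical helper, not an API from any library.

```python
from itertools import product

# Hypothetical CPTs: each entry is P(variable = true | parents).
P_B, P_E = 0.1, 0.2
P_A = {(True, True): 0.9, (True, False): 0.8,      # keyed by (b, e)
       (False, True): 0.3, (False, False): 0.05}
P_R = {True: 0.7, False: 0.01}                     # keyed by e
P_C = {True: 0.6, False: 0.1}                      # keyed by a


def pr(p_true, value):
    """P(X = value) for a binary X with P(X = true) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, c, r):
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_R[e], r) * pr(P_C[a], c))


def posterior_burglary(**evidence):
    """P(B = true | evidence), summing the joint over unobserved variables."""
    num = den = 0.0
    for values in product([True, False], repeat=5):
        world = dict(zip("beacr", values))
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(**world)
            den += p
            num += p if world["b"] else 0.0
    return num / den


p_call = posterior_burglary(c=True)                 # the call only
p_call_radio = posterior_burglary(c=True, r=True)   # call plus earthquake report
print(round(p_call, 3), round(p_call_radio, 3))
```

With these assumed numbers the posterior probability of a burglary rises from the prior 0.1 to about 0.27 after the call, then falls to 0.20 once the radio report makes the earthquake a likely cause of the alarm.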


Example 2


Wet example cont’d


A real Bayes net: Alarm

Domain: Monitoring Intensive-Care Patients
• 37 variables
• 509 parameters … instead of 2^37

[Figure from N. Friedman: the ALARM network, a DAG over patient-monitoring variables such as BP, CO, CVP, PCWP, SAO2, EXPCO2, HYPOVOLEMIA, and LVFAILURE]


More real-world BN applications

• “Microsoft’s competitive advantage lies in its expertise in Bayesian networks” -- Bill Gates, quoted in LA Times, 1996
• MS Answer Wizards, (printer) troubleshooters
• Medical diagnosis
• Genetic pedigree analysis
• Speech recognition (HMMs)
• Gene sequence/expression analysis
• Turbocodes (channel coding)


Dealing with time

• In many systems, data arrives sequentially
• Dynamic Bayes nets (DBNs) can be used to model such time-series (sequence) data
• Special cases of DBNs include
  – State-space models
  – Hidden Markov models (HMMs)


State-space model (SSM)/Linear Dynamical System (LDS)

[Figure: hidden chain X1 → X2 → X3 (the “true” state), with each Xt emitting a noisy observation Yt]


Example: LDS for 2D tracking

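[Figure: the state-space graph (X1, X2, X3 with observations Y1, Y2, Y3) and an expanded view over the state components X1, X2 and observation components y1, y2]

Sparse linear Gaussian systems ⇒ sparse graphs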


Hidden Markov model (HMM)

[Figure: hidden chain X1 → X2 → X3 (phones/words) governed by a transition matrix, with each Xt emitting Yt (acoustic signal) through Gaussian observations]

Sparse transition matrix ⇒ sparse graph


Probabilistic graphical models

Probabilistic models
  Graphical models
    Directed (Bayesian belief nets)
      • Alarm network
      • State-space models
      • HMMs
      • Naïve Bayes classifier
      • PCA/ICA
    Undirected (Markov nets)
      • Markov Random Field
      • Boltzmann machine
      • Ising model
      • Max-ent model
      • Log-linear models


Many Pattern Recognition Methods are instances of graphical models


Inference

• Posterior probabilities
  – Probability of any event given any evidence
• Most likely explanation
  – Scenario that explains evidence
• Rational decision making
  – Maximize expected utility
  – Value of Information
• Effect of intervention
  – Causal analysis

[Figure from N. Friedman: explaining away effect in the Earthquake, Radio, Burglary, Alarm, Call network]


Wet inference

• Q: Is grass wet due to sprinkler or rain?


Efficient Evaluation


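Kalman filtering as a Belief Network

[Figure: chain X1 → X2 → X3 with observations Y1, Y2, Y3]

Estimate P(X_t | y_{1:t}) from P(X_{t-1} | y_{1:t-1}) and y_t

• Predict: P(X_t | y_{1:t-1}) = ∫ P(X_t | X_{t-1}) P(X_{t-1} | y_{1:t-1}) dX_{t-1}
• Update: P(X_t | y_{1:t}) ∝ P(y_t | X_t) P(X_t | y_{1:t-1})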
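For a linear Gaussian model the predict and update steps have closed forms. The sketch below assumes a scalar state with dynamics x_t = a·x_{t-1} + noise of variance q and observation y_t = c·x_t + noise of variance r; the function name `kalman_step`, the parameter values, and the observations are all hypothetical.

```python
def kalman_step(mean, var, y, a=1.0, q=0.1, c=1.0, r=0.5):
    """One predict/update step of a scalar Kalman filter.

    (mean, var) describe the Gaussian P(X_{t-1} | y_{1:t-1});
    the return value describes the Gaussian P(X_t | y_{1:t}).
    """
    # Predict: push the previous posterior through the dynamics.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: combine the prediction with the likelihood of y_t.
    gain = pred_var * c / (c * c * pred_var + r)
    new_mean = pred_mean + gain * (y - c * pred_mean)
    new_var = (1.0 - gain * c) * pred_var
    return new_mean, new_var


mean, var = 0.0, 1.0              # assumed prior on the initial state
for y in [1.2, 0.9, 1.1]:         # made-up observations
    mean, var = kalman_step(mean, var, y)
print(mean, var)
```

Each call consumes only the previous posterior and the new observation, which is exactly the recursion stated above.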


Belief Propagation

[Figure from P. Green: two passes over a tree, Collect (messages flow toward the root) and Distribute (messages flow away from the root)]

Generalization of forwards-backwards algo. / RTS smoother from chains to trees: linear time, two-pass algorithm

aka Pearl’s algorithm, sum-product algorithm
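The collect and distribute passes can be sketched for a small tree with pairwise potentials. This is an illustrative sum-product implementation under stated assumptions: the model is given as undirected potentials, all variables share the same number of states, and the function name `tree_marginals` and its argument layout are hypothetical.

```python
def tree_marginals(n_states, edges, unary, pairwise, root=0):
    """Exact node marginals on a tree via a collect pass and a distribute pass.

    edges                  : list of (u, v) pairs forming a tree
    unary[u]               : dict entry, list of local potentials for node u
    pairwise[(u, v)][i][j] : potential for X_u = i, X_v = j
    """
    nbrs = {u: [] for u in unary}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)

    def psi(u, i, v, j):
        # Look up the edge potential regardless of the stored orientation.
        return pairwise[(u, v)][i][j] if (u, v) in pairwise else pairwise[(v, u)][j][i]

    # Visit nodes from the root outward, recording each node's parent.
    parent, order, stack = {root: None}, [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in nbrs[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)

    msg = {}  # msg[(u, v)][j]: message from u to v, as a function of X_v = j

    def send(u, v):
        incoming = [msg[(w, u)] for w in nbrs[u] if w != v]
        out = []
        for j in range(n_states):
            total = 0.0
            for i in range(n_states):
                term = unary[u][i] * psi(u, i, v, j)
                for m in incoming:
                    term *= m[i]
                total += term
            out.append(total)
        msg[(u, v)] = out

    for u in reversed(order):      # Collect: leaves send toward the root
        if parent[u] is not None:
            send(u, parent[u])
    for u in order:                # Distribute: root sends back toward the leaves
        if parent[u] is not None:
            send(parent[u], u)

    marginals = {}
    for u in unary:
        belief = list(unary[u])
        for w in nbrs[u]:
            belief = [belief[i] * msg[(w, u)][i] for i in range(n_states)]
        z = sum(belief)
        marginals[u] = [b / z for b in belief]
    return marginals
```

Every edge carries exactly two messages, one per pass, which is where the linear running time comes from.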


Message passing view of forwards algorithm

[Figure: chain X_{t-1} → X_t → X_{t+1} with observations Y_{t-1}, Y_t, Y_{t+1}; message labels a_{t|t-1}, b_t, b_{t+1}]


Forwards-backwards algorithm

[Figure: chain X_{t-1} → X_t → X_{t+1} with observations Y_{t-1}, Y_t, Y_{t+1}; message labels a_{t|t-1} and b_t]

Discrete analog of RTS smoother
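A minimal sketch of the forwards-backwards algorithm for a discrete HMM follows. It assumes the observations have already been converted into local evidence vectors lik[t][i] = P(y_t | X_t = i); the function name and all numbers in the usage lines are hypothetical.

```python
def forwards_backwards(init, trans, lik):
    """Smoothed marginals P(X_t | y_{1:T}) for a discrete HMM.

    init[i]     = P(X_1 = i)
    trans[i][j] = P(X_{t+1} = j | X_t = i)
    lik[t][i]   = P(y_t | X_t = i)   (local evidence)
    """
    T, K = len(lik), len(init)

    def normalize(v):
        s = sum(v)
        return [x / s for x in v]

    # Forwards pass: alpha[t][i] = P(X_t = i | y_{1:t})
    alpha = [normalize([init[i] * lik[0][i] for i in range(K)])]
    for t in range(1, T):
        pred = [sum(alpha[-1][i] * trans[i][j] for i in range(K)) for j in range(K)]
        alpha.append(normalize([pred[j] * lik[t][j] for j in range(K)]))

    # Backwards pass: beta[t][i] is proportional to P(y_{t+1:T} | X_t = i)
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = normalize([
            sum(trans[i][j] * lik[t + 1][j] * beta[t + 1][j] for j in range(K))
            for i in range(K)
        ])

    # Combine the two messages at every time step.
    return [normalize([alpha[t][i] * beta[t][i] for i in range(K)]) for t in range(T)]


# Made-up two-state example with three observations.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
lik = [[0.9, 0.2], [0.1, 0.6], [0.5, 0.5]]
smoothed = forwards_backwards(init, trans, lik)
```

Normalizing the messages at every step keeps the numbers in range on long sequences without changing the final marginals.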


BP: parallel, distributed version

[Figure: the same four-node graph (X1, X2, X3, X4) shown at Stage 1 and Stage 2 of parallel message passing]


Inference in general graphs

• BP is only guaranteed to be correct for trees
• A general graph should be converted to a junction tree, by clustering nodes
• Computational complexity is exponential in size of the resulting clusters (NP-hard)


Approximate inference

• Why?
  – To avoid exponential complexity of exact inference in discrete loopy graphs
  – Because messages cannot be computed in closed form (even for trees) in the non-linear/non-Gaussian case
• How?
  – Deterministic approximations: loopy BP, mean field, structured variational, etc.
  – Stochastic approximations: MCMC (Gibbs sampling), likelihood weighting, particle filtering, etc.

- Algorithms make different speed/accuracy tradeoffs
- Should provide the user with a choice of algorithms
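As one example of a stochastic approximation, likelihood weighting can be sketched on the Burglary, Earthquake, Alarm fragment of the earlier network. The CPT values are made-up assumptions, and the helper names `likelihood_weighting` and `exact_posterior` are hypothetical; the exact answer is computed alongside for comparison.

```python
import random

# Hypothetical CPTs: P(B = true), P(E = true), and P(A = true | b, e).
P_B, P_E = 0.1, 0.2
P_A = {(True, True): 0.9, (True, False): 0.8,
       (False, True): 0.3, (False, False): 0.05}


def likelihood_weighting(n_samples, seed=0):
    """Estimate P(B = true | A = true).

    Non-evidence nodes are sampled from their priors, and each sample is
    weighted by the likelihood of the evidence A = true given that sample.
    """
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_samples):
        b = rng.random() < P_B
        e = rng.random() < P_E
        w = P_A[(b, e)]
        den += w
        if b:
            num += w
    return num / den


def exact_posterior():
    """The same quantity by exact enumeration over B and E."""
    def pr(p, value):
        return p if value else 1.0 - p
    num = den = 0.0
    for b in (True, False):
        for e in (True, False):
            p = pr(P_B, b) * pr(P_E, e) * P_A[(b, e)]
            den += p
            num += p if b else 0.0
    return num / den


estimate = likelihood_weighting(100_000)
exact = exact_posterior()
print(estimate, exact)
```

Raising `n_samples` trades time for accuracy, which is the kind of speed/accuracy choice the slide refers to.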


Learning

• Parameter estimation
• Model selection (structure learning)


Parameter learning

[Figure from M. Jordan: iid data used to estimate the Conditional Probability Tables (CPTs)]

iid data:

X6 X5 X4 X3 X2 X1
0  0  0  0  1  0
1  1  0  1  1  1
1  ?  1  1  ?  1

If some values are missing (latent variables), we must use an iterative method (e.g. gradient descent or EM) to compute the (locally) maximum likelihood estimates
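When the data are complete, maximum likelihood estimation of a CPT reduces to counting. The sketch below covers only that complete-data case (the missing-value case needs EM or a gradient method and is not shown); the data set and the function name `fit_cpt` are hypothetical.

```python
from collections import Counter


def fit_cpt(records, child, parents):
    """Maximum likelihood estimate of P(child = 1 | parents) from complete data.

    Each estimate is count(child = 1, parents = pa) / count(parents = pa).
    """
    joint_counts, parent_counts = Counter(), Counter()
    for rec in records:
        key = tuple(rec[p] for p in parents)
        parent_counts[key] += 1
        joint_counts[key] += rec[child]
    return {key: joint_counts[key] / n for key, n in parent_counts.items()}


# Made-up complete iid data over binary variables X1, X2, X3.
data = [
    {"X1": 0, "X2": 0, "X3": 0},
    {"X1": 0, "X2": 1, "X3": 1},
    {"X1": 1, "X2": 1, "X3": 1},
    {"X1": 1, "X2": 1, "X3": 0},
    {"X1": 1, "X2": 0, "X3": 0},
    {"X1": 0, "X2": 1, "X3": 1},
]
cpt = fit_cpt(data, child="X3", parents=["X2"])
print(cpt)
```

Each CPT is estimated from its own family's counts, so the learning problem decomposes node by node.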


Structure learning (which nodes are connected)

[Figure from N. Friedman: gene expression data used to learn a genetic pathway]


Structure learning

• Learning the optimal structure is NP-hard (except for trees)
• Hence use heuristic search through space of DAGs or PDAGs or node orderings
• Search algorithms: hill climbing, simulated annealing, GAs
• Scoring function is often marginal likelihood, or an approximation like BIC/MDL or AIC
  – BIC/MDL and AIC include an explicit structural complexity penalty
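A BIC score with its complexity penalty can be sketched for binary variables as follows. The score used here is the maximized log-likelihood minus (d / 2)·log N, where d is the number of free parameters and N the number of records; the data set, the two candidate structures, and the function name `bic_score` are assumptions made for illustration.

```python
import math
from collections import Counter


def bic_score(records, structure):
    """BIC = max log-likelihood - (d / 2) * log(N), for binary variables.

    structure maps each variable to the list of its parents.
    """
    n = len(records)
    loglik, n_params = 0.0, 0
    for child, parents in structure.items():
        counts = Counter((tuple(r[p] for p in parents), r[child]) for r in records)
        totals = Counter(tuple(r[p] for p in parents) for r in records)
        for (pa, _), c in counts.items():
            loglik += c * math.log(c / totals[pa])
        n_params += 2 ** len(parents)   # one free parameter per parent setting
    return loglik - 0.5 * n_params * math.log(n)


# Made-up data in which Y usually copies X: (x, y, number of repeats).
data = [{"X": x, "Y": y}
        for x, y, k in [(0, 0, 18), (0, 1, 2), (1, 0, 3), (1, 1, 17)]
        for _ in range(k)]

with_edge = bic_score(data, {"X": [], "Y": ["X"]})
no_edge = bic_score(data, {"X": [], "Y": []})
print(with_edge, no_edge)
```

A search procedure such as hill climbing would evaluate a score like this for each candidate edge change and keep the change that improves it most. Here the extra parameter of the X → Y structure is worth its penalty because Y depends strongly on X in the data.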