
Page 1: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Bayesian Networks

Chapter 2 (Duda et al.) – Section 2.11

CS479/679 Pattern Recognition
Dr. George Bebis

Page 2: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Statistical Dependence Between Variables

– Representing a high-dimensional density $p(x_1, x_2, \ldots, x_n)$ is very challenging since we need to estimate many parameters (e.g., $k^n$ for $n$ discrete variables with $k$ values each).

• Many times, the only knowledge we have about a distribution is which variables are (or are not) dependent.

• Such dependencies can be represented efficiently using Bayesian Networks (or Belief Networks).


Page 3: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Example of Dependencies

• Represent the state of an automobile:
– Engine temperature
– Brake fluid pressure
– Tire air pressure
– Wire voltages

• Causally related variables:
– Engine temperature
– Coolant temperature

• NOT causally related variables:
– Engine oil pressure
– Tire air pressure

Page 4: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Bayesian Net Applications

• Microsoft: Answer Wizard, Print Troubleshooter

• US Army: SAIP (Battalion Detection from SAR, IR etc.)

• NASA: Vista (DSS for Space Shuttle)

• GE: Gems (real-time monitoring of utility generators)

Page 5: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Definitions and Notation

• A Bayesian net is usually a Directed Acyclic Graph (DAG).
• Each node represents a variable.
• Each variable assumes certain states (i.e., values).
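To make these definitions concrete, here is a minimal Python sketch of how a discrete-network node could be represented; the class and field names are illustrative choices, not from the slides:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A variable in a discrete Bayesian network (a DAG node)."""
    name: str
    states: list                                  # the values this variable can assume
    parents: list = field(default_factory=list)   # direct causal influences
    # CPT: maps a tuple of parent states -> {state: probability};
    # for a root node the key is the empty tuple (a prior).
    cpt: dict = field(default_factory=dict)

    def p(self, state, parent_states=()):
        """P(name = state | parents = parent_states)."""
        return self.cpt[tuple(parent_states)][state]

# Example: a root node with a prior, and a child with a conditional table.
a = Node("A", states=["a1", "a2"], cpt={(): {"a1": 0.3, "a2": 0.7}})
x = Node("X", states=["x1", "x2"], parents=[a],
         cpt={("a1",): {"x1": 0.6, "x2": 0.4},
              ("a2",): {"x1": 0.1, "x2": 0.9}})
print(x.p("x1", ("a2",)))  # -> 0.1
```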

Page 6: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Relationships Between Nodes

• A link joining two nodes is directional and represents a causal influence (e.g., A influences X or X depends on A)

• Influences could be direct or indirect (e.g., A influences X directly and A influences C indirectly through X).

Page 7: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Prior / Conditional Probabilities

• Each variable is associated with prior or conditional probabilities (discrete or continuous).

Page 8: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Markov Property

“Each node is conditionally independent of its ancestors given its parents”

Example: $p(x_1 \mid x_2, \ldots, x_n) = p(x_1 \mid \pi_1)$, where $\pi_1$ denotes the parents of $x_1$.

Page 9: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Computing Joint Probabilities Using the Markov Property

• Using the chain rule, the joint probability of a set of variables x1, x2, …, xn is given as:

$$p(x_1, x_2, \ldots, x_n) = p(x_1 \mid x_2, \ldots, x_n)\, p(x_2 \mid x_3, \ldots, x_n) \cdots p(x_{n-1} \mid x_n)\, p(x_n)$$

• Using the Markov property (i.e., node $x_i$ is conditionally independent of its ancestors given its parents $\pi_i$), we have:

$$p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid \pi_i)$$

much simpler!

Page 10: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Example

• We can compute the probability of any configuration of states in the joint density, e.g.:

P(a3, b1, x2, c3, d2) = P(a3) P(b1) P(x2 / a3,b1) P(c3 / x2) P(d2 / x2) = 0.25 × 0.6 × 0.4 × 0.5 × 0.4 = 0.012
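This product can be checked mechanically. Below is a minimal Python sketch encoding just the five conditional-probability entries used above; the remaining entries of each table are not given in the slides, so they are omitted:

```python
# Joint probability via the Bayesian-network factorization
# p(x1,...,xn) = prod_i p(xi | parents(xi)).
# Only the CPT entries needed for this configuration are encoded.

cpt = {
    ("a3", ()):           0.25,  # P(a3)
    ("b1", ()):           0.60,  # P(b1)
    ("x2", ("a3", "b1")): 0.40,  # P(x2 | a3, b1)
    ("c3", ("x2",)):      0.50,  # P(c3 | x2)
    ("d2", ("x2",)):      0.40,  # P(d2 | x2)
}

def joint(assignment):
    """Multiply p(value | parent_values) over all nodes in the configuration."""
    p = 1.0
    for value, parent_values in assignment:
        p *= cpt[(value, parent_values)]
    return p

config = [("a3", ()), ("b1", ()), ("x2", ("a3", "b1")),
          ("c3", ("x2",)), ("d2", ("x2",))]
print(round(joint(config), 6))  # -> 0.012
```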

Page 11: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Fundamental Problems in Bayesian Nets

• Evaluation (inference): Given the values of the observed variables (evidence), estimate the values of the non-observed variables.

• Learning: Given training data and prior information (e.g., expert knowledge, causal relationships), estimate the network structure, or the parameters (probabilities), or both.

Page 12: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Inference Example: Medical Diagnosis

Uppermost nodes: biological agents (bacteria, virus)

Intermediate nodes: diseases

Lowermost nodes: symptoms

• Goal: given some evidence (biological agents, symptoms), find most likely disease.

(In the network diagram, arrows run from causes, at the top, to effects, at the bottom.)

Page 13: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Evaluation (Inference) Problem

• In general, if X denotes the query variables and e denotes the evidence, then

where α=1/P(e) is a constant of proportionality.

$$P(\mathbf{X} \mid \mathbf{e}) = \frac{P(\mathbf{X}, \mathbf{e})}{P(\mathbf{e})} = \alpha\, P(\mathbf{X}, \mathbf{e})$$
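Concretely, $P(\mathbf{X}, \mathbf{e})$ is obtained by summing the factorized joint over all unobserved (hidden) variables, and α is recovered by normalizing. A generic sketch of this inference-by-enumeration idea; the function and parameter names are illustrative, and `joint` stands for any function that returns the factorized joint probability of a complete assignment:

```python
from itertools import product

def posterior(query_var, evidence, variables, states, joint):
    """P(query_var | evidence) by enumeration: sum the joint over every
    combination of hidden-variable states, then normalize by
    alpha = 1 / P(evidence)."""
    hidden = [v for v in variables if v != query_var and v not in evidence]
    scores = {}
    for q in states[query_var]:
        total = 0.0
        for combo in product(*(states[h] for h in hidden)):
            assignment = dict(evidence, **{query_var: q}, **dict(zip(hidden, combo)))
            total += joint(assignment)
        scores[q] = total                      # this is P(query_var = q, e)
    alpha = 1.0 / sum(scores.values())         # alpha = 1 / P(e)
    return {q: alpha * s for q, s in scores.items()}
```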

Page 14: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Example

• Classify a fish given that the fish is light (c1) and was caught in the south Atlantic (b2); there is no evidence about the time of year the fish was caught or about its thickness.

Page 15: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Example (cont’d)

$$P(\mathbf{X} \mid \mathbf{e}) = \frac{P(\mathbf{X}, \mathbf{e})}{P(\mathbf{e})} = \alpha\, P(\mathbf{X}, \mathbf{e})$$

Page 16: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Example (cont’d)

Page 17: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Example (cont’d)

• From the previous step, P(x1 / c1,b2) = α 0.18; similarly, P(x2 / c1,b2) = α 0.066.

• Normalize the probabilities (not strictly necessary for classification):

P(x1 / c1,b2) + P(x2 / c1,b2) = 1, so α = 1/(0.18 + 0.066) = 1/0.246

P(x1 / c1,b2) = 0.73
P(x2 / c1,b2) = 0.27

→ classify the fish as salmon (x1)
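The normalization step is easy to verify numerically (x1 = salmon, per the decision above):

```python
# Normalize the unnormalized posteriors alpha*0.18 and alpha*0.066.
scores = {"x1 (salmon)": 0.18, "x2": 0.066}
alpha = 1.0 / sum(scores.values())            # alpha = 1/0.246
print({k: round(alpha * v, 2) for k, v in scores.items()})
# -> {'x1 (salmon)': 0.73, 'x2': 0.27}
```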

Page 18: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Evaluation (Inference) Problem (cont’d)

• Exact inference is an NP-hard problem because the number of terms in the summations (or integrals) for discrete (or continuous) variables grows exponentially with the number of variables.

• For some restricted classes of networks (e.g., singly connected networks where there is no more than one path between any two nodes) exact inference can be efficiently solved in time linear in the number of nodes.

Page 19: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Evaluation (Inference) Problem (cont’d)

• For singly connected Bayesian networks, the evidence splits into a part $\mathbf{e}^C$ coming from the children of $\mathbf{X}$ and a part $\mathbf{e}^P$ coming from its parents:

$$P(\mathbf{X} \mid \mathbf{e}) = P(\mathbf{X} \mid \mathbf{e}^C, \mathbf{e}^P) = \alpha\, P(\mathbf{e}^C \mid \mathbf{X})\, P(\mathbf{X} \mid \mathbf{e}^P)$$

• Approximate inference methods are typically used in most cases:

– Sampling (Monte Carlo) methods
– Variational methods
– Loopy belief propagation

Page 20: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Another Example

• You have a new burglar alarm installed at home.

• It is fairly reliable at detecting burglary, but also sometimes responds to minor earthquakes.

• You have two neighbors, Ali and Veli, who promised to call you at work when they hear the alarm.

Page 21: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Another Example (cont’d)

• Ali always calls when he hears the alarm, but sometimes confuses telephone ringing with the alarm and calls too.

• Veli likes loud music and sometimes misses the alarm.

• Design a Bayesian network to estimate the probability of a burglary given some evidence.

Page 22: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Another Example (cont’d)

• What are the system variables?
– Alarm
– Causes: Burglary, Earthquake
– Effects: Ali calls, Veli calls

Page 23: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Another Example (cont’d)

• What are the conditional dependencies among them?
– Burglary (B) and earthquake (E) directly affect the probability of the alarm (A) going off.
– Whether or not Ali calls (AC) or Veli calls (VC) depends on the alarm.

Page 24: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Another Example (cont’d)

Page 25: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Another Example (cont’d)

• What is the probability that the alarm has sounded but neither a burglary nor an earthquake has occurred, and both Ali and Veli call?

Page 26: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Another Example (cont’d)

• What is the probability that there is a burglary given that Ali calls?

• What about if both Veli and Ali call?
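The network structure above comes from the slides, but the conditional probability tables appear only in the Page 24 figure; the numbers below are commonly used illustrative values for this textbook example and should be read as assumptions. A self-contained sketch that answers both this slide's questions and the Page 25 question by enumeration:

```python
from itertools import product

# Illustrative CPTs (assumed values; the slides' actual tables are in a figure).
P_B = {True: 0.001, False: 0.999}                    # burglary prior
P_E = {True: 0.002, False: 0.998}                    # earthquake prior
P_A = {(True, True): 0.95, (True, False): 0.94,      # P(alarm | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_AC = {True: 0.90, False: 0.05}                     # P(Ali calls | alarm)
P_VC = {True: 0.70, False: 0.01}                     # P(Veli calls | alarm)

def joint(b, e, a, ac, vc):
    """Full joint via the factorization P(B)P(E)P(A|B,E)P(AC|A)P(VC|A)."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pac = P_AC[a] if ac else 1 - P_AC[a]
    pvc = P_VC[a] if vc else 1 - P_VC[a]
    return P_B[b] * P_E[e] * pa * pac * pvc

# Page 25: alarm sounded, no burglary, no earthquake, both call.
print(joint(False, False, True, True, True))   # ~0.00063 with these numbers

def p_burglary(**evidence):
    """P(B = true | evidence) by summing out the unobserved variables."""
    scores = {}
    for b in (True, False):
        total = 0.0
        for e, a, ac, vc in product((True, False), repeat=4):
            sample = {"b": b, "e": e, "a": a, "ac": ac, "vc": vc}
            if all(sample[k] == v for k, v in evidence.items()):
                total += joint(b, e, a, ac, vc)
        scores[b] = total
    return scores[True] / (scores[True] + scores[False])

# Page 26: burglary given Ali calls; then given both call.
print(p_burglary(ac=True))           # ~0.016 with these numbers
print(p_burglary(ac=True, vc=True))  # ~0.284 with these numbers
```

Note how weak the single call is as evidence: with these numbers, one caller raises the burglary probability only to about 1.6%, while both calling raises it to roughly 28%.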

Page 27: Bayesian Belief Networks Chapter 2 (Duda et al.) – Section 2.11

Naïve Bayesian Network

• Assuming that the features x1, …, xd are conditionally independent given the class, the conditional class density simplifies as follows:

$$p(x_1, x_2, \ldots, x_d \mid \omega_j) = \prod_{i=1}^{d} p(x_i \mid \omega_j)$$

• Sometimes works well in practice despite the strong assumption behind it.

Naïve Bayesian Network: the class node is the sole parent of every feature node (diagram on the original slide).
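A minimal sketch of this factorization used as a classifier; the priors and per-feature tables below are invented purely for illustration:

```python
# Naive Bayes: score each class by prior * product of per-feature likelihoods,
# using the factorization p(x | w_j) = prod_i p(x_i | w_j).
# All numbers below are toy values for illustration only.

priors = {"w1": 0.6, "w2": 0.4}
# One table per feature: p(feature_i = value | class).
likelihoods = {
    "w1": [{"low": 0.7, "high": 0.3}, {"red": 0.2, "blue": 0.8}],
    "w2": [{"low": 0.4, "high": 0.6}, {"red": 0.5, "blue": 0.5}],
}

def classify(x):
    """Return the class maximizing p(w) * prod_i p(x_i | w), plus all scores."""
    scores = {}
    for w, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= likelihoods[w][i][value]
        scores[w] = score
    return max(scores, key=scores.get), scores

print(classify(["low", "blue"]))  # w1: 0.6*0.7*0.8 = 0.336 vs w2: 0.4*0.4*0.5 = 0.08
```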