Bayesian Networks – Principles and Application to Modelling water, governance and human development indicators in Developing Countries
Jorge López Puga ([email protected])
Área de Metodología de las Ciencias del Comportamiento
Universidad de Almería
www.ual.es/personal/jpuga
February 2012



Water4Dev – Feb/2012

The Content of the Sections

1. What is probability?
2. The Bayes Theorem
►Deduction of the theorem
►The Balls problem
3. Introduction to Bayesian Networks
►Historical background
►Qualitative and quantitative dimensions
►Advantages and disadvantages of Bayes nets
►Software


What is Probability?

Etymology
►Measure of authority of a witness in a legal case (Europe)

Interpretations of Probability
►Objective probability
• Aprioristic or classical
• Frequentist or empirical
►Subjective probability
• Belief


Objective Probability

Classical (Laplace, 1812-1814)
►A priori
►Aprioristic
►Equiprobability
►Full knowledge about the sample space

$$p(A) = \frac{N_A}{N}$$

Frequentist
►Random experiment
►Well defined sample space
►Posterior probability
►Randomness

$$p(A) = \frac{fr_A}{N}$$


Subjective Probability

It is simply an individual degree of belief which is updated based on experience

Probability Axioms
►p(SE) = 1
►p(…) ≥ 0
►If two events are mutually exclusive (A ∩ B = Ø), then p(A ∪ B) = p(A) + p(B)
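The three axioms can be checked mechanically for any finite distribution. The following is a minimal sketch, assuming a distribution is stored as a dictionary from mutually exclusive outcomes to probabilities; the function names `satisfies_axioms` and `p_event` and the fair-die example are illustrative and not part of the slides.

```python
import math

def satisfies_axioms(dist, tol=1e-9):
    """Check non-negativity and that the whole sample space has probability 1.

    `dist` maps mutually exclusive, exhaustive outcomes to probabilities.
    """
    non_negative = all(p >= 0 for p in dist.values())
    sums_to_one = math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

def p_event(dist, event):
    """Additivity: an event's probability is the sum over its exclusive outcomes."""
    return sum(dist[outcome] for outcome in event)

# Illustrative example: a fair six-sided die.
die = {face: 1 / 6 for face in range(1, 7)}
print(satisfies_axioms(die))
print(p_event(die, {1, 2}))
```

Because single outcomes are mutually exclusive, `p_event(die, {1, 2})` equals `p_event(die, {1}) + p_event(die, {2})`, which is the third axiom in action.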


Card Game

Let me show you the idea of probability with a card game
Classical vs. Frequentist vs. Subjective


What is the probability of getting an ace?

As you probably know…

A French deck has four suits (Spades, Hearts, Diamonds, Clubs), each with 13 ranks: Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K


Given that there are 52 cards and 4 aces in a French deck…
►We could say… (Aprioristic)

$$p(\text{Ace}) = \frac{4}{52} = 0.077$$

►If we repeated the experiment a finite number of times (Frequentist)
►If I subjectively assess that probability (Bayesian)
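The aprioristic and frequentist readings of p(Ace) can be compared with a short simulation. This is only a sketch: it assumes draws with replacement from a full deck, and the names `classical_p_ace` and `frequentist_p_ace`, the seed and the number of draws are illustrative choices, not part of the slides.

```python
import random
from fractions import Fraction

RANKS = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Spades", "Hearts", "Diamonds", "Clubs"]
DECK = [(rank, suit) for suit in SUITS for rank in RANKS]  # 52 cards

def classical_p_ace():
    """Aprioristic estimate: favourable cases over equally likely cases."""
    aces = sum(1 for rank, _ in DECK if rank == "Ace")
    return Fraction(aces, len(DECK))

def frequentist_p_ace(n_draws, seed=0):
    """Frequentist estimate: relative frequency of aces in repeated draws.

    Each draw is taken from the full deck (sampling with replacement).
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_draws) if rng.choice(DECK)[0] == "Ace")
    return hits / n_draws

print(float(classical_p_ace()))
print(frequentist_p_ace(100_000))
```

The relative frequency fluctuates around 4/52 and gets closer to it as the number of draws grows; the subjective reading has no such recipe, since it depends on what the person assessing the probability knows.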


Why is a Bayesian interpretation of probability useful? – Let’s play
►We could say…

$$p(\text{Ace}) = \frac{4}{52} = 0.077$$

$$p(\text{Ace}) = \frac{3}{51} = 0.059$$

$$p(\text{Ace}) = \frac{2}{50} = 0.04$$

$$p(\text{Ace}) = \frac{1}{49} = 0.02$$

Probability estimates depend on our state of knowledge (Dixon, 1964)
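The four values above follow from removing one ace (and one card) from the deck each time. A minimal sketch, assuming that every card revealed so far was an ace and was set aside; the function name `successive_ace_probabilities` is illustrative.

```python
from fractions import Fraction

def successive_ace_probabilities(n_aces=4, n_cards=52):
    """p(next card is an ace) after 0, 1, 2, ... aces have been seen and set aside."""
    return [Fraction(n_aces - k, n_cards - k) for k in range(n_aces)]

for seen, p in enumerate(successive_ace_probabilities()):
    print(f"aces already seen: {seen}  p(Ace) = {float(p):.3f}")
```

The deck itself is unchanged by what we know; only the information available to the observer changes, and with it the probability assigned to the next card.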


The Bayes Theorem

Getting Evidence and Updating Probabilities


Joint and Conditional Probability

Joint probability (Distributions – of variables)
►It represents the likelihood of two events occurring at the same time
►It is the same as the intersection of events
►Notation
• p(A ∩ B), p(A,B), p(AB)

Estimation
►Independent events
►Dependent events


Independent events
►p(AB) = p(A) × p(B) or p(BA) = p(B) × p(A)
Example: what is the probability of obtaining two tails (T) after tossing two coins?

p(TT) = p(T) × p(T) = 0.5 × 0.5 = 0.25

Dependent events
►Conditional probability and the symbol “|”
►p(AB) = p(A|B) × p(B) or p(BA) = p(B|A) × p(A)
Example: what is the probability of suffering from bronchitis (B) and being a smoker (S) at the same time?
• p(B) = 0.25
• p(S|B) = 0.6

p(SB) = p(S|B) × p(B) = 0.6 × 0.25 = 0.15
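Both product rules are one-liners in code. The sketch below simply restates the two examples; the function names `joint_independent` and `joint_dependent` are illustrative.

```python
def joint_independent(p_a, p_b):
    """p(AB) for independent events: p(A) * p(B)."""
    return p_a * p_b

def joint_dependent(p_a_given_b, p_b):
    """p(AB) for dependent events: p(A|B) * p(B)."""
    return p_a_given_b * p_b

p_two_tails = joint_independent(0.5, 0.5)           # two coins, both tails
p_smoker_and_bronchitis = joint_dependent(0.6, 0.25)  # p(S|B) * p(B)
print(p_two_tails, round(p_smoker_and_bronchitis, 3))
```

The independent case is the special case of the dependent one in which p(A|B) = p(A).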


The Bayes Theorem

It is a generalization of the conditional probability applied to the joint probability
It is:

$$p(A \mid B) = \frac{p(B \mid A)\, p(A)}{p(B)}$$

You can deduce it because:
p(AB) = p(A|B) × p(B) and p(BA) = p(B|A) × p(A)
p(A|B) × p(B) = p(B|A) × p(A)
p(A|B) = p(B|A) × p(A) / p(B)


Example: what is the probability of a person suffering from bronchitis (B) given that s/he smokes (S)?
• p(B) = 0.25
• p(S|B) = 0.6
• p(S) = 0.40

$$p(B \mid S) = \frac{p(S \mid B)\, p(B)}{p(S)}$$

$$p(B \mid S) = \frac{0.6 \times 0.25}{0.40} = 0.375$$
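The bronchitis example can be reproduced with a direct transcription of the theorem. This is a minimal sketch; the function name `bayes` and its argument names are illustrative.

```python
def bayes(p_evidence_given_h, p_h, p_evidence):
    """p(H|E) = p(E|H) * p(H) / p(E)."""
    if p_evidence <= 0:
        raise ValueError("p(evidence) must be positive")
    return p_evidence_given_h * p_h / p_evidence

# p(B|S) from p(S|B) = 0.6, p(B) = 0.25 and p(S) = 0.40
p_bronchitis_given_smoker = bayes(0.6, 0.25, 0.40)
print(round(p_bronchitis_given_smoker, 3))
```

Knowing that the person smokes raises the probability of bronchitis from the prior 0.25 to the posterior 0.375.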


The Total Probability Theorem

If we use a system based on a mutually exclusive set of events {A1, A2, A3, …, An} whose probabilities sum to unity, then the probability of an arbitrary event (B) equals:

$$p(B) = \sum_{i} p(B \mid A_i)\, p(A_i)$$

which means:

$$p(B) = p(B \mid A_1)\, p(A_1) + \dots + p(B \mid A_n)\, p(A_n)$$


If {A1, A2, A3, …, An} is a mutually exclusive set of events whose probabilities sum to unity, then the Bayes Theorem becomes:

$$p(A_k \mid B) = \frac{p(B \mid A_k)\, p(A_k)}{\sum_{i} p(B \mid A_i)\, p(A_i)}$$

Let’s use a typical example to see how it works


The Balls problem

Situation: we have three boxes (B1, B2, B3) with the following content of balls:

Box     White   Yellow   Red
Box 1   30%     60%      10%
Box 2   40%     30%      30%
Box 3   10%     70%      20%

Experiment: extracting a ball, looking at its colour and determining which box it was extracted from


Let’s consider that the probability of selecting each box is the same: p(Bi) = 1/3
Imagine someone gives you a white ball: what is the probability that the ball was extracted from box 2?

p(B2|W) = ????


By definition we know that:
p(W|B1) = 0.3   p(W|B2) = 0.4   p(W|B3) = 0.1

$$p(B_2 \mid W) = \frac{p(W \mid B_2)\, p(B_2)}{p(W)}$$

But we do not know p(W)


►But we can use the total probability theorem to discover the value of p(W):

$$p(W) = p(W \mid B_1)\, p(B_1) + p(W \mid B_2)\, p(B_2) + p(W \mid B_3)\, p(B_3)$$

$$p(W) = 0.3 \cdot \tfrac{1}{3} + 0.4 \cdot \tfrac{1}{3} + 0.1 \cdot \tfrac{1}{3} = \frac{30 + 40 + 10}{300} = 0.2\overline{6}$$

$$p(B_2 \mid W) = \frac{0.4 \cdot \tfrac{1}{3}}{0.2\overline{6}} = 0.5$$
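The white-ball calculation combines the total probability theorem (for the denominator) with the Bayes theorem. Below is a minimal sketch using exact fractions; the function name `posterior` and the dictionary layout are illustrative choices, while the numbers come from the Balls problem.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Return ({h: p(h | evidence)}, p(evidence)).

    prior:      {h: p(h)} over mutually exclusive hypotheses summing to 1
    likelihood: {h: p(evidence | h)}
    """
    joint = {h: likelihood[h] * prior[h] for h in prior}
    p_evidence = sum(joint.values())  # total probability theorem
    return {h: joint[h] / p_evidence for h in joint}, p_evidence

prior = {box: Fraction(1, 3) for box in ("B1", "B2", "B3")}
p_white_given_box = {"B1": Fraction(3, 10), "B2": Fraction(4, 10), "B3": Fraction(1, 10)}

post, p_white = posterior(prior, p_white_given_box)
print(p_white, post["B2"])
```

Here `p_white` is 4/15 (0.2666…) and `post["B2"]` is exactly 1/2, so the same call also gives the posteriors for boxes 1 and 3.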


►The following table shows changes in beliefs

Box     p(W|B_i)   Prior p(B_i)   p(W|B_i) × p(B_i)   Posterior p(B_i|W)
1       0.3        0.333          0.100               0.375
2       0.4        0.333          0.133               0.500
3       0.1        0.333          0.033               0.125
Total   0.8        1              0.267               1

►Imagine we were given a red ball, what would be the updated probability for each box?

Box     p(R|B_i)   Prior p(B_i)   p(R|B_i) × p(B_i)   Posterior p(B_i|R)
1       0.1        0.375          0.038               0.176
2       0.3        0.500          0.150               0.706
3       0.2        0.125          0.025               0.118
Total   0.6        1              0.212               1


►Finally, what would be the probability for each box if we were told that a yellow ball was extracted?

Box     p(Y|B_i)   Prior p(B_i)   p(Y|B_i) × p(B_i)   Posterior p(B_i|Y)
1       0.6        0.176          0.106               0.265
2       0.3        0.706          0.212               0.529
3       0.7        0.118          0.082               0.206
Total   1.6        1              0.400               1

But, is there another way to solve this problem?
►Yes, there is
►Using a Bayesian Network
►Let’s use the Balls network
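The sequence of updates for a white, then a red, then a yellow ball can be reproduced by feeding each posterior back in as the next prior. This is a sketch under the same assumption the tables make, namely that the box proportions do not change between extractions; the names `P_COLOUR_GIVEN_BOX`, `update` and `update_sequence` are illustrative.

```python
# p(colour | box), taken from the Balls problem
P_COLOUR_GIVEN_BOX = {
    "white":  {1: 0.3, 2: 0.4, 3: 0.1},
    "red":    {1: 0.1, 2: 0.3, 3: 0.2},
    "yellow": {1: 0.6, 2: 0.3, 3: 0.7},
}

def update(prior, colour):
    """One Bayes update: the posterior over boxes after observing `colour`."""
    joint = {box: P_COLOUR_GIVEN_BOX[colour][box] * p for box, p in prior.items()}
    total = sum(joint.values())  # p(colour), by the total probability theorem
    return {box: j / total for box, j in joint.items()}

def update_sequence(prior, colours):
    """Return the list of beliefs: the initial prior, then one posterior per colour."""
    beliefs = [prior]
    for colour in colours:
        beliefs.append(update(beliefs[-1], colour))
    return beliefs

beliefs = update_sequence({1: 1 / 3, 2: 1 / 3, 3: 1 / 3}, ["white", "red", "yellow"])
for step in beliefs:
    print({box: round(p, 3) for box, p in step.items()})
```

Each printed row after the first corresponds to the posterior column of one table, so box 2 remains the most probable source throughout.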


Bayesian Networks

A brief Introduction


Brief Historical Background

Late 1970s – early 1980s
Artificial intelligence
Machine learning and reasoning
►Expert system = Knowledge Base + Inference Engine

Diagnostic decision tree, classification tree, flowchart or algorithm

Figure: diagnostic decision tree (adapted from Cowell et al., 1999). Decision nodes: “Heart rate?” (branches <70/min, 70-200/min, >200/min), “Femoral pulses < other pulses?”, “Superior axis or additional cyanosis?” and “Weak left arm pulse?” (Yes/No branches). Leaves shown include “Complete heart block” (Correct = 1/1) and “Tachyarrhythmia” (Correct = 3/3).


Rule-based expert systems or production systems
►If…then
• IF headache & temperature THEN influenza
• IF influenza THEN sneezing
• IF influenza THEN weakness
►Certainty factor
• IF headache & fever THEN influenza (certainty 0.7)
• IF influenza THEN sneezing (certainty 0.9)
• IF influenza THEN weakness (certainty 0.6)

(Example adapted from Cowell et al., 1999)


What is a Bayesian Network?

There are several names for it, among others: Bayes net, belief network, causal network, influence diagram, probabilistic expert system
“a set of related uncertainties” (Edwards, 1998)
For Xiang (2002): […] it is a triad (V, G, P) where:
►V is a set of variables
►G is a directed acyclic graph (DAG)
►P is a set of probability distributions

To make things practical we could say:
►Qualitative dimension
►Quantitative dimension


Qualitative Structure

Graph: a set of vertices (V) and a set of links (L)
Directed Acyclic Graph (DAG)
The meaning of a connection: A → B
The Principle of Conditional Independence
Three types of basic connections | Evidence propagation

Serial connection (causal-chain model): A → B → C


Divergent (diverging) connection, common-cause model: A ← B → C

Convergent (converging) connection, common-effect model: A → B ← C
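The three basic connections can be written down as small directed graphs and checked for acyclicity. The sketch below assumes Python 3.9 or later (for the standard-library `graphlib` module) and stores each graph as a mapping from a node to the set of its parents; the constant names and the helper `is_dag` are illustrative.

```python
from graphlib import CycleError, TopologicalSorter

# Each graph maps a node to the set of its parents (nodes with an arrow into it).
SERIAL = {"A": set(), "B": {"A"}, "C": {"B"}}            # A -> B -> C
DIVERGING = {"B": set(), "A": {"B"}, "C": {"B"}}         # A <- B -> C
CONVERGING = {"A": set(), "C": set(), "B": {"A", "C"}}   # A -> B <- C

def is_dag(parents):
    """True if the directed graph described by `parents` contains no cycle."""
    try:
        TopologicalSorter(parents).prepare()  # raises CycleError on a cycle
    except CycleError:
        return False
    return True

for name, graph in [("serial", SERIAL), ("diverging", DIVERGING), ("converging", CONVERGING)]:
    print(name, is_dag(graph))
```

A graph such as `{"A": {"B"}, "B": {"A"}}` fails the check, which is why it could not serve as the qualitative structure of a Bayesian network.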


A Classical Example

Mr. Holmes is working in his office when he receives a phone call from his neighbour Dr. Watson, who tells him that Holmes’ burglar alarm has gone off. Convinced that a burglar has broken into his house, Holmes rushes to his car and heads for home. On his way, he listens to the radio, and in the news it is reported that there has been a small earthquake in the area. Knowing that earthquakes have a tendency to turn burglar alarms on, he returns to his work.
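Holmes’ change of mind can be illustrated with a small network evaluated by brute-force enumeration. This is only a sketch: the structure (Burglary → Alarm ← Earthquake, Alarm → Watson calls, Earthquake → Radio report) is one common way to draw this story, and every probability below is a hypothetical value chosen for illustration, not a number from the slides.

```python
from itertools import product

# HYPOTHETICAL probabilities, chosen only to illustrate the reasoning pattern.
P_B = 0.01                                   # p(burglary)
P_E = 0.01                                   # p(earthquake)
P_A = {(True, True): 0.98, (True, False): 0.95,
       (False, True): 0.30, (False, False): 0.001}  # p(alarm | burglary, earthquake)
P_W = {True: 0.90, False: 0.01}              # p(Watson calls | alarm)
P_R = {True: 0.95, False: 0.001}             # p(radio report | earthquake)

NAMES = ("b", "e", "a", "w", "r")

def bern(p_true, value):
    """Probability that a binary variable with p(True) = p_true takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, w, r):
    """Joint probability as the product of each node's conditional probability."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_W[a], w) * bern(P_R[e], r))

def query(target, evidence):
    """p(target = True | evidence), summing the joint over all consistent worlds."""
    num = den = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(**world)
        den += p
        if world[target]:
            num += p
    return num / den

p_after_call = query("b", {"w": True})
p_after_radio = query("b", {"w": True, "r": True})
print(round(p_after_call, 3), round(p_after_radio, 3))
```

With these assumed numbers the belief in a burglary rises well above its prior after Watson’s call and falls again once the earthquake is reported: the earthquake “explains away” the alarm, which is the behaviour of a converging connection.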


Quantitative Structure

Probability as a belief (Cox, 1946; Dixon, 1970)
Bayes Theorem
Each variable (node) in the model is a conditional probability function of other variables
Conditional Probability Tables (CPT)


Pros and cons of Bayes nets

Qualitative - Quantitative
Missing data
Non-parametric models
Interaction – non-linearity
Inference – scenarios
Local computations
Easy interpretation

Hybrid nets
Time series
Software


Software

Netica Application (Norsys Software Corp.) www.norsys.com
Hugin (Hugin Expert A/S) www.hugin.com
Ergo (Noetic Systems Inc.) www.noeticsystems.com
Elvira (Academic development) http://www.ia.uned.es/~elvira
Tetrad (CMU, NASA, ONR) http://www.phil.cmu.edu/projects/tetrad/

R
MATLAB


Thank you very much for your attention!