
Page 1

CS 401R: Intro. to Probabilistic Graphical Models

Lecture #6: Useful Distributions; Reasoning with Joint Distributions

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.

Some slides (#14 onward) adapted from slides originally created by Andrew W. Moore of CMU. See the message on slide #14.

Page 2

Announcements

Assignment 0.1: due today
Reading Report #2: due Wednesday
Assignment 0.2 (mathematical exercises): early deadline Friday; due next Monday

Page 3

Objectives

Understand 4 important discrete distributions
Describe uncertain worlds with joint probability distributions
Reel with terror at the intractability of reasoning with joint distributions
Prepare to build models of natural phenomena as Bayes Nets

Page 4

Parametric Distributions

Page 5

e.g., Normal Distribution

Page 6

Page 7

Bernoulli Distribution

Two possible outcomes.

$$P(x) = B(x; p) = \begin{cases} p^{x}(1-p)^{1-x} & \text{when } x \in \{0,1\} \\ 0 & \text{otherwise} \end{cases}$$

Equivalently,

$$P(x) = \begin{cases} x\,p + (1-x)(1-p) & \text{when } x \in \{0,1\} \\ 0 & \text{otherwise} \end{cases} = \begin{cases} p & \text{when } x = 1 \\ 1-p & \text{when } x = 0 \\ 0 & \text{otherwise} \end{cases}$$

“What’s the probability of a single binary event x, if a ‘positive’ event has probability p?”
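A minimal Python sketch of this pmf (not from the slides; the function name is our own):

```python
def bernoulli_pmf(x, p):
    """Bernoulli pmf: p**x * (1-p)**(1-x) for x in {0, 1}, else 0."""
    if x not in (0, 1):
        return 0.0
    return p**x * (1 - p)**(1 - x)

print(bernoulli_pmf(1, 0.3))  # 0.3
print(bernoulli_pmf(0, 0.3))  # 0.7
```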

Page 8

Page 9

Categorical Distribution

Extension for m possible outcomes.

“What’s the probability of a single event x (containing a 1 in only one position), if outcomes 1, 2, …, and m are specified by p = [p1, p2, …, pm]?”

Note: the p_i must sum to 1.

Page 10

Categorical Distribution

Extension for m possible outcomes.

“What’s the probability of a single event x (containing a 1 in only one position), if outcomes 1, 2, …, and m are specified by p = [p1, p2, …, pm]?”

Note: the p_i must sum to 1.

$$P(x) = \mathrm{Cat}(x; p) = \begin{cases} \prod_{i=1}^{m} p_i^{x_i} & \text{when each } x_i \in \{0,1\} \text{ and exactly one } x_i = 1 \\ 0 & \text{otherwise} \end{cases}$$

Equivalently, writing x as an integer index rather than a one-hot vector:

$$P(x) = C(x; p) = \begin{cases} p_x & \text{when } x \in \{1, \ldots, m\} \\ 0 & \text{otherwise} \end{cases}$$

Great for language models, where each value corresponds to a word or an n-gram of words (e.g., value ‘1’ corresponds to ‘the’).
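A minimal Python sketch of the second (integer-index) form (not from the slides; names are our own):

```python
def categorical_pmf(x, p):
    """Categorical pmf: p[x-1] for x in {1, ..., m}, else 0 (p must sum to 1)."""
    if not isinstance(x, int) or not 1 <= x <= len(p):
        return 0.0
    return p[x - 1]

print(categorical_pmf(1, [0.5, 0.3, 0.2]))  # 0.5, e.g. P(word = 'the')
```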

Page 11

Page 12

Binomial Distribution

Two possible outcomes; N trials.

“What’s the probability in N independent Bernoulli events that x of them will come up ‘positive’, if a ‘positive’ event has probability p?”

$$P(x) = \mathrm{Bin}(x; N, p) = \frac{N!}{x!\,(N-x)!}\, p^{x} (1-p)^{N-x} = \binom{N}{x} p^{x} (1-p)^{N-x}$$
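A minimal Python sketch of this pmf (not from the slides; the function name is our own):

```python
from math import comb

def binomial_pmf(x, N, p):
    """Binomial pmf: C(N, x) * p**x * (1-p)**(N-x) for x in {0, ..., N}, else 0."""
    if not 0 <= x <= N:
        return 0.0
    return comb(N, x) * p**x * (1 - p)**(N - x)

print(binomial_pmf(3, 10, 0.5))  # ~0.117
```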

Page 13

Page 14

Multinomial Distribution

Extension for m possible outcomes; N trials.

“What’s the probability in N independent categorical events that value 1 will occur x1 times … and that value m will occur xm times, if the probabilities of each possible value are specified by p = [p1, p2, …, pm]?”

Note: the p_i must sum to 1.

$$P(x) = \mathrm{Mul}(x; N, p) = \begin{cases} \dfrac{N!}{x_1!\, x_2! \cdots x_m!}\, p_1^{x_1} p_2^{x_2} \cdots p_m^{x_m} & \text{when } x_1 + x_2 + \cdots + x_m = N \\ 0 & \text{otherwise} \end{cases} = \begin{cases} \dfrac{N!}{\prod_i x_i!} \prod_i p_i^{x_i} & \text{when } \sum_i x_i = N \\ 0 & \text{otherwise} \end{cases}$$
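A minimal Python sketch of this pmf (not from the slides; the function name is our own):

```python
from math import factorial, prod

def multinomial_pmf(x, N, p):
    """Multinomial pmf: N!/(x_1!...x_m!) * prod_i p_i**x_i when sum(x) == N, else 0."""
    if sum(x) != N:
        return 0.0
    coef = factorial(N) // prod(factorial(xi) for xi in x)
    return coef * prod(pi**xi for pi, xi in zip(p, x))

print(multinomial_pmf([2, 1, 1], 4, [0.5, 0.25, 0.25]))  # 0.1875
```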

Page 15

Acknowledgments

Note to other teachers and users of the following slides: Andrew Moore would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew’s tutorials: http://www.cs.cmu.edu/~awm/tutorials. Comments and corrections gratefully received.

Page 16

Why Bayes Nets Matter

Andrew Moore (Google, formerly CMU):

One of the most important technologies in the Machine Learning / AI field to have emerged in the last 20 years

A clean, clear, manageable language to express what you’re certain and uncertain about

Many practical applications in medicine, factories, helpdesks, robotics, and NLP!

Inference: P(diagnosis | these symptoms)
Anomaly Detection: anomalousness of this observation
Active Data Collection: next diagnostic test | current observations

Page 17

The Joint Distribution

Recipe for making a joint distribution of M variables:

Example: Boolean variables X, Y, Z

Page 18

The Joint Distribution

Recipe for making a joint distribution of M variables:

1. Make a truth table listing all combinations of values of your variables (if there are M Boolean variables, then the table will have 2^M rows).

X  Y  Z
0  0  0
0  0  1
0  1  0
0  1  1
1  0  0
1  0  1
1  1  0
1  1  1

Example: Boolean variables X, Y, Z

Page 19

The Joint Distribution

Recipe for making a joint distribution of M variables:

1. Make a truth table listing all combinations of values of your variables (if there are M Boolean variables, then the table will have 2^M rows).
2. For each combination of values, indicate the probability.

X  Y  Z  Prob
0  0  0  0.30
0  0  1  0.05
0  1  0  0.10
0  1  1  0.05
1  0  0  0.05
1  0  1  0.10
1  1  0  0.25
1  1  1  0.10

Example: Boolean variables X, Y, Z

Page 20

The Joint Distribution

Example: Boolean variables X, Y, Z

Recipe for making a joint distribution of M variables:

1. Make a truth table listing all combinations of values of your variables (if there are M Boolean variables, then the table will have 2^M rows).
2. For each combination of values, indicate the probability.
3. Per the axioms of probability, those numbers must sum to 1.

Note: You could be economical and specify only 2^M - 1 of the probabilities, since they must sum to 1.

X  Y  Z  Prob
0  0  0  0.30
0  0  1  0.05
0  1  0  0.10
0  1  1  0.05
1  0  0  0.05
1  0  1  0.10
1  1  0  0.25
1  1  1  0.10

[Figure: area diagram dividing the probability mass among the regions X=1, Y=1, and Z=1 and their overlaps, labeled with the same eight probabilities as the table.]
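A minimal Python sketch of this recipe, using the example table above (not from the slides; names are our own):

```python
from itertools import product

# Step 1: all 2**M rows for M = 3 Boolean variables; steps 2-3: attach the
# probabilities from the example table, which must sum to 1.
probs = [0.30, 0.05, 0.10, 0.05, 0.05, 0.10, 0.25, 0.10]
joint = {xyz: p for xyz, p in zip(product([0, 1], repeat=3), probs)}

assert abs(sum(joint.values()) - 1.0) < 1e-9  # axiom: probabilities sum to 1
print(joint[(1, 1, 0)])  # 0.25
```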

Page 21

Using the Joint Distribution

Page 22

Using the Joint Distribution

Once you have the joint dist., you can ask for the probability of any logical expression involving any of the “attributes”.

$$P(e) = \sum_{\text{rows } r:\ r \text{ matches } e} P(r) = \sum_{\text{all values of vars not in } e} P(g, h, w)$$

What is this summation called?
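A minimal Python sketch of this summation over the example X, Y, Z table from earlier (not from the slides; names are our own):

```python
from itertools import product

probs = [0.30, 0.05, 0.10, 0.05, 0.05, 0.10, 0.25, 0.10]
joint = {xyz: p for xyz, p in zip(product([0, 1], repeat=3), probs)}

def prob(joint, matches):
    """P(e): sum of P(row) over the rows that match the logical expression e."""
    return sum(p for row, p in joint.items() if matches(row))

# e.g., P(X = 1 or Z = 1)
print(prob(joint, lambda r: r[0] == 1 or r[2] == 1))  # ~0.60
```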

Page 23

P(Poor, Male) =

Using the Joint Distribution

$$P(e) = \sum_{\text{rows } r:\ r \text{ matches } e} P(r) = \sum_{\text{all values of vars not in } e} P(g, h, w)$$

Page 24

P(Poor) =

Using the Joint Distribution

Try this.

$$P(e) = \sum_{\text{rows } r:\ r \text{ matches } e} P(r) = \sum_{\text{all values of vars not in } e} P(g, h, w)$$

Page 25

P(Poor) = 0.7604

Using the Joint Distribution

$$P(e) = \sum_{\text{rows } r:\ r \text{ matches } e} P(r) = \sum_{\text{all values of vars not in } e} P(g, h, w)$$

Page 26

Inference with the Joint Dist.

$$P(e_1 \mid e_2) = \frac{P(e_1, e_2)}{P(e_2)} = \frac{\sum_{\text{rows matching } e_1 \text{ and } e_2} P(r)}{\sum_{\text{rows matching } e_2} P(r)}$$

Page 27

P(Male | Poor) =

Inference with the Joint Dist.

$$P(e_1 \mid e_2) = \frac{P(e_1, e_2)}{P(e_2)} = \frac{\sum_{\text{rows matching } e_1 \text{ and } e_2} P(r)}{\sum_{\text{rows matching } e_2} P(r)}$$

Page 28

P(Male | Poor) = 0.4654 / 0.7604 = 0.612

Inference with the Joint Dist.

$$P(e_1 \mid e_2) = \frac{P(e_1, e_2)}{P(e_2)} = \frac{\sum_{\text{rows matching } e_1 \text{ and } e_2} P(r)}{\sum_{\text{rows matching } e_2} P(r)}$$
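A minimal Python sketch of this conditional computation, reusing the example X, Y, Z table and the row-summing helper from the earlier sketch (not from the slides; names and queries are our own):

```python
from itertools import product

probs = [0.30, 0.05, 0.10, 0.05, 0.05, 0.10, 0.25, 0.10]
joint = {xyz: p for xyz, p in zip(product([0, 1], repeat=3), probs)}

def prob(joint, matches):
    return sum(p for row, p in joint.items() if matches(row))

def cond_prob(joint, e1, e2):
    """P(e1 | e2) = P(e1 and e2) / P(e2), both computed by summing matching rows."""
    return prob(joint, lambda r: e1(r) and e2(r)) / prob(joint, lambda r: e2(r))

# e.g., P(X = 1 | Z = 1) = 0.20 / 0.30
print(cond_prob(joint, lambda r: r[0] == 1, lambda r: r[2] == 1))  # ~0.667
```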

Page 29

News

Good: Once you have a joint distribution, you can ask important questions about uncertain events.

Bad: Impossible to create for more than about ten attributes, because so many numbers are needed to build the distribution: with M Boolean attributes the table has 2^M rows, so M = 30 already requires over a billion entries.

Page 30

Next

Address our efficiency problem by making independence assumptions!

Use the Bayes Net methodology to build joint distributions in manageable chunks