Logistics
Homework 2 will be out tomorrow.
It is due on Wed, Sep 16.
There is no class on Monday, Sep 7 (Labor Day).
Those who have not returned Homework 1 yet: for details of how to submit the homework, please check:
http://www.cs.cmu.edu/~ggordon/10601/hws.html
Outline
We will review:
Some probability and statistics
Some graphical models
We will not go over Homework 1, since the grace period has not ended yet. Solutions will be up on the web page next week.
We’ll play a game: Catch the goof! I’ll be the sloppy TA… will make ‘intentional’ mistakes
You’ll catch those mistakes and correct me!
Slides with mistakes are marked with
Correct slides are marked with
Given two discrete random variables X and Y
X takes values in $\{x_1, \dots, x_m\}$
Y takes values in $\{y_1, \dots, y_n\}$
Law of total probability
$P(Y = y_j) = \sum_j P(X = x_i, Y = y_j)$
$= \sum_j P(X = x_i \mid Y = y_j)\, P(Y = y_j)$
Correction: the left-hand side should be $P(X = x_i)$.
Given two discrete random variables X and Y
X takes values in $\{x_1, \dots, x_m\}$
Y takes values in $\{y_1, \dots, y_n\}$
Law of total probability
$P(X = x_i) = \sum_j P(X = x_i, Y = y_j)$
$= \sum_j P(X = x_i \mid Y = y_j)\, P(Y = y_j)$
Given two discrete random variables X and Y
Law of total probability
$P(X = x_i) = \sum_j P(X = x_i, Y = y_j)$
$= \sum_j P(X = x_i \mid Y = y_j)\, P(Y = y_j)$
Names on the slide:
Joint probability
Marginal probability
Conditional probability of X conditioned on Y
Formulas are fine. Anything wrong with the names?
Given two discrete random variables X and Y
Law of total probability
$P(X = x_i) = \sum_j P(X = x_i, Y = y_j)$
$= \sum_j P(X = x_i \mid Y = y_j)\, P(Y = y_j)$
$P(X = x_i)$: Marginal probability
$P(X = x_i, Y = y_j)$: Joint probability of X,Y
$P(X = x_i \mid Y = y_j)$: Conditional probability of X conditioned on Y
$P(Y = y_j)$: Marginal probability
In a strange world
Two discrete random variables X and Y take binary values
Joint probabilities
P(X=0, Y=1) = 0.2
P(X=0, Y=0) = 0.2
P(X=1, Y=0) = 0.5
P(X=1, Y=1) = 0.5
The joint probabilities should sum to 1, but these sum to 1.4.
The world seems fine
Two discrete random variables X and Y take binary values
Joint probabilities
P(X=0, Y=1) = 0.2
P(X=0, Y=0) = 0.2
P(X=1, Y=0) = 0.3
P(X=1, Y=1) = 0.3
What about the marginals?
Joint probabilities
P(X=0, Y=1) = 0.2
P(X=0, Y=0) = 0.2
P(X=1, Y=0) = 0.3
P(X=1, Y=1) = 0.3
Marginal probabilities
P(X=0) = 0.2, P(X=1) = 0.8
P(Y=0) = 0.5, P(Y=1) = 0.5
This is a strange world
Let’s have a simple problem
Joint probabilities
P(X=0, Y=1) = 0.2
P(X=0, Y=0) = 0.2
P(X=1, Y=0) = 0.3
P(X=1, Y=1) = 0.3
Marginal probabilities
P(X=0) = 0.4, P(X=1) = 0.6
P(Y=0) = 0.5, P(Y=1) = 0.5
P(X=0) = P(X=0, Y=0) + P(X=0, Y=1) = 0.4
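The marginals above can be checked mechanically. The following is a minimal Python sketch, assuming the joint table is stored as a dictionary keyed by (x, y); the helper name `marginal` is hypothetical and not part of the slides.

```python
# Joint distribution from the slide, keyed by (x, y).
joint = {(0, 0): 0.2, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.3}


def marginal(joint, axis):
    """Sum the joint over the other variable (law of total probability).

    axis=0 gives the marginal of X, axis=1 the marginal of Y.
    """
    result = {}
    for outcome, p in joint.items():
        value = outcome[axis]
        result[value] = result.get(value, 0.0) + p
    return result


total = sum(joint.values())   # a valid joint distribution sums to 1
p_x = marginal(joint, axis=0)
p_y = marginal(joint, axis=1)
print(total, p_x, p_y)
```

Running the same check on the "strange world" table (0.2, 0.2, 0.5, 0.5) would show a total of 1.4, which is how the first goof can be caught.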
Number of independent parameters
Assume X and Y take Boolean values {0,1}.
How many independent parameters do you need to fully specify:
the marginal probability of X?
the joint probability P(X,Y)?
the conditional probability P(X|Y)?
Number of parameters
Assume X and Y take Boolean values {0,1}.
How many independent parameters do you need to fully specify the marginal probability of X?
P(X=0): 1 parameter only, since P(X=1) = 1 - P(X=0)
How many independent parameters do you need to fully specify the joint probability P(X,Y)?
P(X=0, Y=0), P(X=0, Y=1), P(X=1, Y=0): 3 parameters
How many independent parameters do you need to fully specify the conditional probability P(X|Y)?
P(X=0|Y=0), P(X=0|Y=1): 2 parameters
Number of parameters
What about P(X | Y,Z)? How many independent parameters do you need to fully specify the probabilities?
Assume X takes m values, Y takes n values, and Z takes q values.
Number of independent parameters: $(m-1) \cdot n \cdot q$, since each of the $n \cdot q$ configurations of (Y, Z) needs $m-1$ free probabilities for X.
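The counting rule can be written as a small Python sketch. The function names `num_conditional_params` and `num_joint_params` are hypothetical helpers introduced here for illustration; they only encode the counting argument from the slides.

```python
def num_conditional_params(child_values, parent_values=()):
    """Free parameters in a table P(child | parents).

    Each parent configuration needs (child_values - 1) numbers,
    because the probabilities of the child must sum to 1.
    """
    configs = 1
    for k in parent_values:
        configs *= k
    return (child_values - 1) * configs


def num_joint_params(cardinalities):
    """Free parameters in a full joint table: all cells minus one."""
    cells = 1
    for k in cardinalities:
        cells *= k
    return cells - 1


print(num_conditional_params(2))          # marginal of a Boolean X
print(num_joint_params([2, 2]))           # joint of Boolean X, Y
print(num_conditional_params(2, [2]))     # P(X | Y), both Boolean
print(num_conditional_params(3, [4, 5]))  # P(X | Y, Z) with m=3, n=4, q=5
```

With Boolean variables this reproduces the counts 1, 3 and 2 from the earlier slide.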
Graphical models
A graphical model is a way of representing probabilistic relationships between random variables.
Variables are represented by nodes.
Edges indicate probabilistic relationships.
Example: You miss the bus → Arrive class late
$P(X_1, X_2, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
Serial connection
X → Y → Z
Is X conditionally independent of Z given Y?
$X \perp Z \mid Y$
Yes, they are conditionally independent.
How can we show it?
X → Y → Z
$P(X, Y, Z) = P(X)\, P(Y \mid X)\, P(Z \mid Y)$
$P(Z \mid X, Y) = \frac{P(X, Y, Z)}{P(X, Y)} = \frac{P(X)\, P(Y \mid X)\, P(Z \mid Y)}{P(X)\, P(Y \mid X)} = P(Z \mid Y)$
Is X conditionally independent of Z given Y?
$X \perp Z \mid Y$
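The derivation can also be checked numerically. The sketch below builds the joint of a chain X → Y → Z from made-up conditional probability tables (the numbers are illustrative assumptions, not from the slides) and confirms that P(Z | X, Y) matches P(Z | Y) for every assignment.

```python
from itertools import product

# Hypothetical CPTs for a binary chain X -> Y -> Z (illustrative numbers only).
p_x = {0: 0.3, 1: 0.7}
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}    # p_y_given_x[x][y]
p_z_given_y = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # p_z_given_y[y][z]

# Joint from the factorization P(X) P(Y|X) P(Z|Y).
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}


def p_z_given_xy(z, x, y):
    """P(Z=z | X=x, Y=y) computed from the joint table."""
    p_xy = sum(joint[(x, y, zz)] for zz in (0, 1))
    return joint[(x, y, z)] / p_xy


# Largest difference between P(Z | X, Y) and P(Z | Y) over all assignments.
max_gap = max(
    abs(p_z_given_xy(z, x, y) - p_z_given_y[y][z])
    for x, y, z in product((0, 1), repeat=3)
)
print(max_gap)  # zero up to floating-point rounding
```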
Common cause
X ← Z → Y
Example: Age (Z) is the common cause of Shoe Size and Gray Hair.
X and Y are not marginally independent.
X and Y are conditionally independent given Z.
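One way to see the conditional independence, assuming the usual factorization for the common-cause graph X ← Z → Y (this derivation is not on the slide and is given only as a sketch):

```latex
\begin{align*}
P(X, Y, Z) &= P(Z)\, P(X \mid Z)\, P(Y \mid Z) \\
P(X, Y \mid Z) &= \frac{P(X, Y, Z)}{P(Z)} = P(X \mid Z)\, P(Y \mid Z)
\end{align*}
```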
Explaining away
X → Y ← Z
Example: Flu (X) → Sneeze (Y) ← Allergy (Z)
X and Z are marginally independent.
X and Z are conditionally dependent given Y.
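A small numeric sketch of explaining away for Flu → Sneeze ← Allergy. All probabilities below are invented for illustration, and `prob` is a hypothetical brute-force enumeration helper, not something from the slides.

```python
from itertools import product

# Hypothetical priors and CPT (illustrative numbers only).
p_flu = {0: 0.9, 1: 0.1}
p_allergy = {0: 0.8, 1: 0.2}
# P(sneeze = 1 | flu, allergy), keyed by (flu, allergy).
p_sneeze1 = {(0, 0): 0.05, (0, 1): 0.8, (1, 0): 0.9, (1, 1): 0.95}


def joint(f, a, s):
    """P(flu=f, allergy=a, sneeze=s) for the graph flu -> sneeze <- allergy."""
    ps = p_sneeze1[(f, a)]
    return p_flu[f] * p_allergy[a] * (ps if s == 1 else 1 - ps)


def prob(query, evidence):
    """P(query | evidence) by enumerating all worlds; keys are 'f', 'a', 's'."""
    num = den = 0.0
    for f, a, s in product((0, 1), repeat=3):
        world = {"f": f, "a": a, "s": s}
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(f, a, s)
            den += p
            if all(world[k] == v for k, v in query.items()):
                num += p
    return num / den


p_flu_prior = prob({"f": 1}, {})
p_flu_given_allergy = prob({"f": 1}, {"a": 1})                   # equals the prior
p_flu_given_sneeze = prob({"f": 1}, {"s": 1})                    # sneezing raises it
p_flu_given_sneeze_allergy = prob({"f": 1}, {"s": 1, "a": 1})    # allergy explains it away
print(p_flu_prior, p_flu_given_allergy,
      p_flu_given_sneeze, p_flu_given_sneeze_allergy)
```

Without observing Sneeze, learning about Allergy does not change the belief in Flu; once Sneeze is observed, learning that Allergy is present lowers it.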
D-separation
X and Z are conditionally independent given Y if Y d-separates X and Z.
[Figure: three-node graphs over X, Y, Z]
Path between X and Z is blocked by Y
Neither Y nor its descendants should be observed
Naïve Bayes Model
J → D, J → C, J → R
J: The person is a junior
D: The person knows calculus
C: The person lives on campus
R: Saw “Return of the King” more than once
What parameters are stored?
P(J) =
P(R|J=1) =    P(R|J=0) =
P(D|J=1) =    P(D|J=0) =
P(C|J=1) =    P(C|J=0) =
Survey questions (columns):
J: Are you a junior?
D: Do you know calculus?
C: Do you live on campus?
R: Have you seen 'Return of the King' more than once?

Student      J  D  C  R
Student 1    1  0  1  1
Student 2    1  1  1  0
Student 3    1  0  1  1
Student 4    1  0  1  1
Student 5    1  1  1  0
Student 6    1  0  1  1
Student 7    1  1  1  1
Student 8    1  1  1  1
Student 9    0  1  0  1
Student 10   1  1  1  1
Student 11   1  0  1  0
Student 12   1  0  1  1
Student 13   0  1  1  1
Student 14   1  1  1  1
Student 15   1  1  1  1
Student 16   1  1  1  1
Student 17   0  0  0  1
Student 18   1  0  1  0
Student 19   0  1  1  1
Student 20   0  0  1  1
We have the structure. How do we get the CPTs?
Estimate them from observed data.
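One simple way to do this is to use relative frequencies (maximum-likelihood counts with no smoothing); the slides do not say which estimator is intended, so treat that choice as an assumption. The sketch below hard-codes the 20 survey rows from the table in (J, D, C, R) order; `estimate_cpts` is a hypothetical helper name.

```python
from fractions import Fraction

# Survey rows for students 1..20, in column order (J, D, C, R).
rows = [
    (1, 0, 1, 1), (1, 1, 1, 0), (1, 0, 1, 1), (1, 0, 1, 1), (1, 1, 1, 0),
    (1, 0, 1, 1), (1, 1, 1, 1), (1, 1, 1, 1), (0, 1, 0, 1), (1, 1, 1, 1),
    (1, 0, 1, 0), (1, 0, 1, 1), (0, 1, 1, 1), (1, 1, 1, 1), (1, 1, 1, 1),
    (1, 1, 1, 1), (0, 0, 0, 1), (1, 0, 1, 0), (0, 1, 1, 1), (0, 0, 1, 1),
]

FEATURES = {"D": 1, "C": 2, "R": 3}  # column index of each child variable


def estimate_cpts(rows):
    """Return P(J=1) and a dict mapping (name, j) to P(name=1 | J=j).

    Uses plain relative frequencies (an assumed estimator, no smoothing).
    """
    p_j1 = Fraction(sum(r[0] for r in rows), len(rows))
    cpts = {}
    for name, col in FEATURES.items():
        for j in (0, 1):
            group = [r for r in rows if r[0] == j]
            cpts[(name, j)] = Fraction(sum(r[col] for r in group), len(group))
    return p_j1, cpts


p_j1, cpts = estimate_cpts(rows)
print("P(J=1) =", p_j1)
for (name, j), p in sorted(cpts.items()):
    print(f"P({name}=1 | J={j}) = {p}")
```

With a sample this small, some estimates come out as exactly 1 (for example P(C=1 | J=1)), because every junior in the survey lives on campus.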
Naïve Bayes Model
P(J) =
P(R|J) =    P(R|~J) =
P(D|J) =    P(D|~J) =
P(C|J) =    P(C|~J) =
Suppose a new person comes and says:
I don’t know calculus
I live on campus
I have seen ‘The Return of the King’ five times
What is the probability that he is a junior?
Naïve Bayes Model
Suppose a person says:
I don’t know calculus: D=0
I live on campus: C=1
I have seen ‘The Return of the King’ five times: R=1
What is the probability that he is a junior?
P(J=1 | D=0, C=1, R=1)
What is the probability that he is a junior?
$P(J=1 \mid D=0, C=1, R=1) = \frac{P(J=1, D=0, C=1, R=1)}{P(D=0, C=1, R=1)}$
$= \frac{P(J=1)\, P(C=1 \mid J=1)\, P(R=1 \mid J=1)\, P(D=0 \mid J=1)}{P(D=0, C=1, R=1)}$
To calculate the denominator, marginalize over J.
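The calculation can be sketched in Python as follows. The CPT values are relative frequencies counted from the 20-student survey table, which is an assumed estimator (the slides leave the CPT entries blank), and the function names are hypothetical.

```python
from fractions import Fraction

# Assumed estimates: relative frequencies from the 20-student survey table.
p_j = {1: Fraction(15, 20), 0: Fraction(5, 20)}
# p_feature[name][j] = P(name = 1 | J = j)
p_feature = {
    "D": {1: Fraction(8, 15), 0: Fraction(3, 5)},
    "C": {1: Fraction(15, 15), 0: Fraction(3, 5)},
    "R": {1: Fraction(11, 15), 0: Fraction(5, 5)},
}


def likelihood(evidence, j):
    """P(evidence | J=j) under the naive Bayes factorization."""
    result = Fraction(1)
    for name, value in evidence.items():
        p1 = p_feature[name][j]
        result *= p1 if value == 1 else 1 - p1
    return result


def posterior_junior(evidence):
    """P(J=1 | evidence); the denominator marginalizes over J."""
    joint = {j: p_j[j] * likelihood(evidence, j) for j in (0, 1)}
    return joint[1] / (joint[0] + joint[1])


post = posterior_junior({"D": 0, "C": 1, "R": 1})
print(post, float(post))  # 77/95, about 0.81
```

Under these estimates the numerator is 77/300 and the J=0 term is 18/300, so the posterior is 77/95.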