Bayesian Learning, cont’d
Administrivia
•Homework 1 returned today (details in a second)
•Reading 2 assigned today
•S. Thrun, Learning occupancy grids with forward sensor models. Autonomous Robots, 2002.
•Due: Oct 26
•Much crunchier than the first! Don’t slack.
•Work with your group to sort out the math.
•Questions to mailing list and me.
•Midterm exam: Oct 21
Homework 1 results
•Mean=30.3; std=6.9
IID Samples
•In supervised learning, we usually assume that data points are sampled independently and from the same distribution
•IID assumption: data are independent and identically distributed
•⇒ joint PDF can be written as product of individual (marginal) PDFs:
$f(x_1, \dots, x_N) = \prod_{i=1}^{N} f(x_i)$
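The factorization can be illustrated with a short Python sketch. It assumes a univariate Gaussian as the per-point PDF purely for illustration; the sample values and the helper names `gaussian_pdf`, `joint_pdf`, and `joint_log_pdf` are made up for this example.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density at x (assumed marginal, for illustration)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def joint_pdf(xs, mu, sigma):
    """Joint density of IID samples: the product of the marginal densities."""
    return math.prod(gaussian_pdf(x, mu, sigma) for x in xs)

def joint_log_pdf(xs, mu, sigma):
    """Same quantity in log space: the product becomes a sum of logs."""
    return sum(math.log(gaussian_pdf(x, mu, sigma)) for x in xs)

samples = [0.3, -1.2, 0.8, 2.1]  # made-up data
p = joint_pdf(samples, mu=0.0, sigma=1.0)
log_p = joint_log_pdf(samples, mu=0.0, sigma=1.0)
```

Working in log space is usually the safer choice: a product of many densities underflows quickly, while the sum of their logs does not.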
The max likelihood recipe
•Start with IID data
•Assume model for individual data point, f(X;Θ)
•Construct joint likelihood function (PDF):
$L(\Theta) = \prod_{i=1}^{N} f(x_i; \Theta)$
•Find the params Θ that maximize L
•(If you’re lucky): Differentiate L w.r.t. Θ, set the derivative to 0, and solve
•Repeat for each class
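When the derivative cannot be solved by hand, the same recipe can be followed numerically. The sketch below is one way to do that, not the method used in the course: it assumes an exponential model f(x; λ) = λ·exp(−λx) for each data point and searches a grid of candidate λ values, once per class. The data, the grid, and all function names are illustrative assumptions.

```python
import math
from collections import defaultdict

def log_likelihood(xs, lam):
    """ln L(lam) = sum_i ln f(x_i; lam), with f(x; lam) = lam * exp(-lam * x)."""
    return sum(math.log(lam) - lam * x for x in xs)

def fit_ml(xs, candidates):
    """Return the candidate parameter with the highest log-likelihood."""
    return max(candidates, key=lambda lam: log_likelihood(xs, lam))

def fit_per_class(xs, ys, candidates):
    """Repeat the fit separately on the data points of each class label."""
    by_class = defaultdict(list)
    for x, y in zip(xs, ys):
        by_class[y].append(x)
    return {label: fit_ml(points, candidates) for label, points in by_class.items()}

xs = [0.5, 1.5, 1.0, 0.1, 0.3, 0.2]        # made-up data
ys = ["a", "a", "a", "b", "b", "b"]        # made-up class labels
grid = [0.01 * k for k in range(1, 1001)]  # candidate lam values in (0, 10]
params = fit_per_class(xs, ys, grid)
```

For this particular model the closed-form answer is λ = 1 / (sample mean), so the grid result for each class should land within one grid step of that value.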
Exercise
•Find the maximum likelihood estimator of μ for the univariate Gaussian:
$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
•Find the maximum likelihood estimator of β for the degenerate gamma distribution:
•Hint: consider the log of the likelihood functions in both cases
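Solutions
•PDF for one data point:
$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
•Joint likelihood of N data points:
$L(\mu, \sigma) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)$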
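Solutions
•Log-likelihood:
$\ln L(\mu, \sigma) = -\frac{N}{2}\ln(2\pi) - N\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{N}(x_i-\mu)^2$
•Differentiate w.r.t. μ:
$\frac{\partial \ln L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{N}(x_i-\mu) = 0 \;\Rightarrow\; \hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i$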
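The Gaussian result, that the derivative of the log-likelihood w.r.t. μ is (1/σ²)·Σ(x_i − μ) and vanishes at the sample mean, can be sanity-checked numerically. This is a sketch on made-up data with an arbitrarily fixed σ; the function names are illustrative.

```python
import math

def log_likelihood(xs, mu, sigma):
    """Gaussian log-likelihood of IID samples xs."""
    n = len(xs)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - n * math.log(sigma)
            - sum((x - mu) ** 2 for x in xs) / (2.0 * sigma ** 2))

def d_log_likelihood_d_mu(xs, mu, sigma):
    """Analytic derivative w.r.t. mu: (1 / sigma^2) * sum_i (x_i - mu)."""
    return sum(x - mu for x in xs) / sigma ** 2

xs = [2.1, 1.7, 2.6, 1.9, 2.2]  # made-up data
sigma = 0.5                     # held fixed; only mu is estimated here
mu_hat = sum(xs) / len(xs)      # candidate ML estimate: the sample mean

# Compare the analytic derivative with a central finite difference at mu0.
h, mu0 = 1e-5, 1.0
numeric = (log_likelihood(xs, mu0 + h, sigma)
           - log_likelihood(xs, mu0 - h, sigma)) / (2.0 * h)
analytic = d_log_likelihood_d_mu(xs, mu0, sigma)

# The derivative should be (numerically) zero at the sample mean.
score_at_mu_hat = d_log_likelihood_d_mu(xs, mu_hat, sigma)
```

Moving μ away from `mu_hat` in either direction lowers `log_likelihood`, which is what a maximum should look like.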
Solutions
•What about for the gamma PDF?
Putting the parts together
•[X,Y]: complete training data
•Assumed distribution family (hyp. space) w/ parameters Θ
•Parameters for class a
•Specific PDF for class a
Gaussian Distributions
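5 minutes of math...
•Recall your friend the Gaussian PDF:
$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
•I asserted that the d-dimensional form is:
$f(\mathbf{x}; \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)$
•Let’s look at the parts...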
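To make the parts concrete, here is a minimal sketch of the d-dimensional density for the special case d = 2, written in plain Python so that the determinant |Σ|, the inverse Σ⁻¹, and the quadratic form (x − μ)ᵀ Σ⁻¹ (x − μ) are all visible. The function names and test values are assumptions made for this example.

```python
import math

def gaussian_pdf_1d(x, mu, sigma):
    """Univariate Gaussian density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def gaussian_pdf_2d(x, mu, cov):
    """d = 2 case of the d-dimensional Gaussian density.

    x, mu: length-2 sequences; cov: 2x2 nested list, assumed symmetric
    and positive definite (so its determinant is > 0).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c                                  # |Sigma|
    inv = [[d / det, -b / det], [-c / det, a / det]]      # Sigma^{-1}
    diff = [x[0] - mu[0], x[1] - mu[1]]                  # x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    quad = sum(diff[i] * inv[i][j] * diff[j]
               for i in range(2) for j in range(2))
    norm = (2.0 * math.pi) ** (2 / 2) * math.sqrt(det)   # (2 pi)^{d/2} |Sigma|^{1/2}
    return math.exp(-0.5 * quad) / norm

# With a diagonal covariance the 2-D density factors into two 1-D densities.
p2 = gaussian_pdf_2d([1.0, -0.5], [0.0, 0.0], [[4.0, 0.0], [0.0, 9.0]])
p1 = gaussian_pdf_1d(1.0, 0.0, 2.0) * gaussian_pdf_1d(-0.5, 0.0, 3.0)
```

The agreement between `p2` and `p1` holds only because the off-diagonal entries are zero; with nonzero off-diagonals the density no longer factors by dimension.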
5 minutes of math...
•Ok, but what do the parts mean?
•Mean vector, μ: mean of data along each dimension
5 minutes of math...
•Covariance matrix
•Like variance, but describes spread of data
5 minutes of math...
•Note: covariances on the diagonal of Σ are the same as the standard variances on that dimension of data
•But what about skewed data?
5 minutes of math...
•Off-diagonal covariances (σ_ij, i ≠ j) describe the pairwise variance
•How much x_i changes as x_j changes (on avg)
5 minutes of math...
•Calculating Σ from data:
•In practice: you want to measure the covariance between every pair of random variables (dimensions):
•Or, in linear algebra:
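Both routes can be sketched in plain Python. The code below assumes the 1/N (maximum likelihood) normalization, i.e. σ_ij = (1/N)·Σ_k (x_ki − μ_i)(x_kj − μ_j); dividing by N − 1 instead gives the unbiased estimate. The data matrix and function names are made up for illustration.

```python
def mean_vector(X):
    """Mean of the data along each dimension (column) of X."""
    n, d = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(d)]

def covariance_pairwise(X):
    """sigma_ij = (1/N) * sum_k (x_ki - mu_i) * (x_kj - mu_j), pair by pair."""
    n, d = len(X), len(X[0])
    mu = mean_vector(X)
    return [[sum((row[i] - mu[i]) * (row[j] - mu[j]) for row in X) / n
             for j in range(d)]
            for i in range(d)]

def covariance_matrix_form(X):
    """Sigma = (1/N) * Xc^T Xc, where Xc is the mean-centred data matrix."""
    n, d = len(X), len(X[0])
    mu = mean_vector(X)
    Xc = [[row[j] - mu[j] for j in range(d)] for row in X]
    Xc_T = list(zip(*Xc))  # transpose: d rows, each of length N
    return [[sum(a * b for a, b in zip(Xc_T[i], Xc_T[j])) / n
             for j in range(d)]
            for i in range(d)]

# Made-up data: N = 4 points in d = 2 dimensions, second column roughly 2x the first.
X = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
S1 = covariance_pairwise(X)
S2 = covariance_matrix_form(X)
```

The diagonal entries `S1[0][0]` and `S1[1][1]` are the per-dimension variances, and the positive off-diagonal entry reflects that the two columns increase together.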