Lecture 18 Expectation Maximization

Machine Learning

Last Time

• Expectation Maximization
• Gaussian Mixture Models

Term Project

• Projects may use existing machine learning software
– weka, libsvm, liblinear, mallet, crf++, etc.

• But must experiment with
– Type of data
– Feature representations
– A variety of training styles (amount of data, classifiers)
– Evaluation

Gaussian Mixture Model

• Mixture Models
– How can we combine many probability density functions to fit a more complicated distribution? (See the density below.)
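
For reference (the slide's own equation is not in the extracted text), a mixture model combines K weighted component densities; for a Gaussian mixture:

    p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
    \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1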

Gaussian Mixture Model

• Fitting Multimodal Data

• Clustering

Gaussian Mixture Model

• Expectation Maximization (a code sketch follows this list)
• E-step
– Assign points
• M-step
– Re-estimate model parameters
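
A minimal NumPy sketch of these two steps, assuming full covariances; the function and variable names are illustrative, not code from the course:

    import numpy as np
    from scipy.stats import multivariate_normal

    def em_gmm(X, K, n_iters=100, seed=0):
        """Minimal EM for a Gaussian mixture model (illustrative sketch)."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        # Initial guess: random means, shared covariance, uniform weights
        mu = X[rng.choice(n, K, replace=False)]
        sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])
        pi = np.full(K, 1.0 / K)
        for _ in range(n_iters):
            # E-step: responsibilities r[i, k] = p(z_i = k | x_i, theta)
            r = np.stack([pi[k] * multivariate_normal.pdf(X, mu[k], sigma[k])
                          for k in range(K)], axis=1)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: re-estimate weights, means, covariances from responsibilities
            Nk = r.sum(axis=0)
            pi = Nk / n
            mu = (r.T @ X) / Nk[:, None]
            for k in range(K):
                diff = X - mu[k]
                sigma[k] = (r[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(d)
        return pi, mu, sigma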

Today

• EM Proof
– Jensen’s Inequality

• Clustering sequential data
– EM over HMMs

Gaussian Mixture Models

How can we be sure GMM/EM works?

• We’ve already seen that there are multiple clustering solutions for the same data.
– The likelihood is a non-convex optimization problem.

• Can we prove that we’re approaching some maximum, even if many exist?

Bound maximization

• Since we can’t optimize the GMM parameters directly, maybe we can find the maximum of a lower bound.

• Technically: optimize a concave lower bound of the initial non-convex objective.

EM as a bound maximization problem

• Need to define a function Q(x,Θ) such that:
– Q(x,Θ) ≤ l(x,Θ) for all x, Θ
– Q(x,Θ) = l(x,Θ) at a single point
– Q(x,Θ) is concave
(One standard construction follows this list.)
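
One standard choice of Q with these properties (my notation; the slide's own equations were not extracted) averages the complete-data log likelihood over the posterior at the current estimate Θ^t:

    Q(x, \Theta) = \sum_{z} p(z \mid x, \Theta^t) \,
                   \log \frac{p(x, z \mid \Theta)}{p(z \mid x, \Theta^t)}

It touches l(x,Θ) at Θ = Θ^t, never exceeds it (shown on the following slides), and for GMMs can be maximized in closed form.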

EM as bound maximization

• Claim: for the GMM likelihood,
– the Q used for the GMM MLE estimate is a concave lower bound.

EM Correctness Proof

• Prove that l(x,Θ) ≥ Q(x,Θ). The derivation (written out below) proceeds by:
– starting from the likelihood function,
– introducing a hidden variable z (the mixture assignment in a GMM),
– multiplying and dividing by the posterior at a fixed value of Θ^t,
– applying Jensen’s Inequality (coming soon…).
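
Written out (a reconstruction of the derivation that the annotations above refer to):

    l(x, \Theta) = \log p(x \mid \Theta)                                       % likelihood function
                 = \log \sum_{z} p(x, z \mid \Theta)                           % introduce hidden variable z
                 = \log \sum_{z} p(z \mid x, \Theta^t)
                        \frac{p(x, z \mid \Theta)}{p(z \mid x, \Theta^t)}      % fixed value of \Theta^t
                 \ge \sum_{z} p(z \mid x, \Theta^t)
                        \log \frac{p(x, z \mid \Theta)}{p(z \mid x, \Theta^t)} % Jensen's inequality
                 = Q(x, \Theta)

Equality holds at Θ = Θ^t, where the ratio inside the log is constant in z (equal to p(x | Θ^t)).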

EM Correctness Proof

GMM Maximum Likelihood Estimation
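
The equations for this slide are not in the extracted text; for reference, maximizing Q for a GMM yields the standard closed-form updates (responsibilities from the E-step, then weights, means, and covariances):

    \gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
                       {\sum_{j} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)},
    \qquad N_k = \sum_{i=1}^{N} \gamma_{ik}

    \pi_k = \frac{N_k}{N}, \qquad
    \mu_k = \frac{1}{N_k} \sum_{i} \gamma_{ik} \, x_i, \qquad
    \Sigma_k = \frac{1}{N_k} \sum_{i} \gamma_{ik} (x_i - \mu_k)(x_i - \mu_k)^\top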

The missing link: Jensen’s Inequality

• If f is concave (or convex down):

• Incredibly important tool for dealing with mixture models.

• If f(x) = log(x):
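
The two equations referenced above are standard and reconstructed here. For a concave f and weights \lambda_i \ge 0 with \sum_i \lambda_i = 1:

    f\!\left( \sum_{i} \lambda_i x_i \right) \ge \sum_{i} \lambda_i \, f(x_i)

and with f(x) = log(x):

    \log \sum_{i} \lambda_i x_i \ge \sum_{i} \lambda_i \log x_i

which is exactly the step used in the correctness proof, with the posterior p(z | x, Θ^t) playing the role of the weights λ.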

Generalizing EM from GMM

• Notice that the EM optimization proof never used the exact form of the GMM.

• Only the introduction of a hidden variable, z.
• Thus, we can generalize EM to broader types of latent variable models.

General form of EM

• Given a joint distribution p(X, Z | θ) over observed variables X and latent variables Z.

• Want to maximize the likelihood p(X | θ).

1. Initialize parameters
2. E-step: Evaluate the posterior over the latent variables (written out below)
3. M-step: Re-estimate parameters (based on the expectation of the complete-data log likelihood)
4. Check for convergence of the parameters or the likelihood
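
In symbols (a standard rendering; the slide's own equations were images):

    \text{E-step:} \quad \text{evaluate } p(Z \mid X, \theta^{\text{old}})

    \text{M-step:} \quad \theta^{\text{new}} = \arg\max_{\theta} Q(\theta, \theta^{\text{old}}),
    \qquad Q(\theta, \theta^{\text{old}}) = \sum_{Z} p(Z \mid X, \theta^{\text{old}}) \, \ln p(X, Z \mid \theta)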

Applying EM to Graphical Models

• Now we have a general recipe for learning parameters in models with latent variables (a generic driver loop is sketched below):
– Take a guess
– Expectation: evaluate the likelihood
– Maximization: re-estimate parameters
– Check for convergence
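
A generic driver loop matching these four steps might look like the sketch below; e_step, m_step, and log_likelihood are hypothetical model-specific callables, not functions from any particular library:

    import numpy as np

    def run_em(X, init_params, e_step, m_step, log_likelihood,
               max_iters=100, tol=1e-6):
        """Generic EM driver: guess, E-step, M-step, convergence check."""
        params = init_params                     # 1. take a guess
        prev_ll = -np.inf
        for _ in range(max_iters):
            expectations = e_step(X, params)     # 2. expectation: evaluate posteriors/likelihood
            params = m_step(X, expectations)     # 3. maximization: re-estimate parameters
            ll = log_likelihood(X, params)       # 4. check for convergence
            if abs(ll - prev_ll) < tol:
                break
            prev_ll = ll
        return params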

Clustering over sequential data

• HMMs

• What if you believe the data is sequential, but you can’t observe the state?
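
Applied to an HMM, this instance of EM is usually called Baum-Welch: the E-step runs the forward-backward algorithm to obtain posteriors over the hidden state sequence, and the M-step re-estimates the transition and emission parameters. The latent-variable model being fit factorizes as:

    p(x_{1:T}, z_{1:T}) = p(z_1) \, p(x_1 \mid z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1}) \, p(x_t \mid z_t)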

Training latent variables in Graphical Models

• Now consider a general Graphical Model with latent variables.

EM on Latent Variable Models

• Guess
– Easy: just assign random values to the parameters

• E-step: Evaluate the likelihood
– We can use the JTA (junction tree algorithm) to evaluate the likelihood
– And marginalize to obtain the expected counts needed for the parameter updates

• M-step: Re-estimate parameters
– Based on the form of the model, generate new parameter estimates from these expectations (CPTs or parameters of continuous distributions); see the expected-count form below

• Depending on the topology, this can be slow
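
For discrete CPTs, the M-step reduces to normalized expected counts, with each posterior marginal supplied by the JTA run in the E-step (standard result; the notation is mine, and e_n denotes the n-th observed case):

    \hat{\theta}_{x \mid \mathrm{pa}(x)} =
    \frac{\sum_{n} p(x, \mathrm{pa}(x) \mid e_n, \theta^{\text{old}})}
         {\sum_{n} p(\mathrm{pa}(x) \mid e_n, \theta^{\text{old}})}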

Break
