37
Most slides from http://w ww.autonlab.org/tutorials / Expectation Maximizati on (EM) Northwestern University EECS 395/495 Special Topics in Machine Lea rning

Most slides from Expectation Maximization (EM) Northwestern University EECS 395/495 Special Topics in Machine Learning

  • View
    219

  • Download
    1

Embed Size (px)

Citation preview

Most slides from http://www.autonlab.org/tutorials/

Expectation Maximization (EM)

Northwestern University

EECS 395/495

Special Topics in Machine Learning

Most slides from http://www.autonlab.org/tutorials/

Outline

• Objective

• Simple example

• Complex example

Most slides from http://www.autonlab.org/tutorials/

Objective

• Learning with missing/unobservable data

J

BE

A

E B A J

1 1 1 1

1 0 1 1

0 0 0 0

Maximum likelihood

Most slides from http://www.autonlab.org/tutorials/

Objective

• Learning with missing/unobservable data

J

BE

A

E B A J

1 1 ? 1

1 0 ? 1

0 0 ? 0

Optimize what?

Most slides from http://www.autonlab.org/tutorials/

Outline

• Objective

• Simple example

• Complex example

Most slides from http://www.autonlab.org/tutorials/

Simple example

Most slides from http://www.autonlab.org/tutorials/

Maximize likelihood

Most slides from http://www.autonlab.org/tutorials/

Same Problem with Hidden Information

Score

GradeHidden

Observable

Most slides from http://www.autonlab.org/tutorials/

Same Problem with Hidden Information

S

G

Most slides from http://www.autonlab.org/tutorials/

Same Problem with Hidden Information

S

G

Most slides from http://www.autonlab.org/tutorials/

EM for our example

Most slides from http://www.autonlab.org/tutorials/

EM Convergence

Most slides from http://www.autonlab.org/tutorials/

Generalization

• X: observable data (score = {h, c, d})

• z: missing data(grade = {a, b, c, d})

• : model parameters to estimate ( )

• E: given , compute the expectation of z

• M: use z obtained in E step, maximize the likelihood with respect to

Most slides from http://www.autonlab.org/tutorials/

Outline

• Objective

• Simple example

• Complex example

Most slides from http://www.autonlab.org/tutorials/

Gaussian Mixtures

Most slides from http://www.autonlab.org/tutorials/

Gaussian Mixtures

• Know– Data– -– -

• Don’t know– Data label

• Objective– -

Most slides from http://www.autonlab.org/tutorials/

The GMM assumption

Most slides from http://www.autonlab.org/tutorials/

The GMM assumption

Most slides from http://www.autonlab.org/tutorials/

The GMM assumption

Most slides from http://www.autonlab.org/tutorials/

The GMM assumption

Most slides from http://www.autonlab.org/tutorials/

The data generated

Coordinates

Label

Most slides from http://www.autonlab.org/tutorials/

Computing the likelihood

Most slides from http://www.autonlab.org/tutorials/

EM for GMMs

Most slides from http://www.autonlab.org/tutorials/

EM for GMMs

Most slides from http://www.autonlab.org/tutorials/

EM for GMMs

Most slides from http://www.autonlab.org/tutorials/

Most slides from http://www.autonlab.org/tutorials/

Most slides from http://www.autonlab.org/tutorials/

Most slides from http://www.autonlab.org/tutorials/

Most slides from http://www.autonlab.org/tutorials/

Most slides from http://www.autonlab.org/tutorials/

Most slides from http://www.autonlab.org/tutorials/

Most slides from http://www.autonlab.org/tutorials/

Most slides from http://www.autonlab.org/tutorials/

Generalization

• X: observable data

• z: unobservable data

• : model parameters to estimate

• E: given , compute the “expectation” of z

• M: use z obtained in E step, maximize the likelihood with respect to

Most slides from http://www.autonlab.org/tutorials/

• Exponential family – Yes: normal, exponential, beta, Bernoulli,

binomial, multinomial, Poisson…– No: Cauchy and uniform

• EM using sufficient statistics– S1: computing the expectation of the statistics– S2: set the maximum likelihood

For distributions in exponential family

Most slides from http://www.autonlab.org/tutorials/

What EM really is

• Maximize expected log likelihood

• E-step: Determine the expectation

• M-step: Maximize the expectation above with respect to

•X: observable data

•z: missing data

Most slides from http://www.autonlab.org/tutorials/

Final comments

• Deal with missing data/latent variables

• Maximize expected log likelihood

• Local minima