Simplifying Gaussian Mixture Models Via Entropic Quantization (EUSIPCO 2009)


Slides for the paper presented at EUSIPCO 2009: Simplifying Gaussian Mixture Models Via Entropic Quantization http://www.eurasip.org/Proceedings/Eusipco/Eusipco2009/contents/papers/1569187249.pdf


Simplifying Gaussian Mixture Models via Entropic Quantization

Frank Nielsen¹ ², Vincent Garcia¹, and Richard Nock³

¹ École Polytechnique (Paris, France)
² Sony Computer Science Laboratories (Tokyo, Japan)
³ Université des Antilles et de la Guyane (Guadeloupe, France)

28th August 2009


Introduction

Plan

1. Introduction
   - Mixture models
   - Problem
   - Mixture model simplification
2. Mixture model simplification
   - KLD and Bregman divergence
   - Sided BKMC
   - Symmetric BKMC
   - jMEF
3. Experiments
   - Quality measure and initialization
   - Sided BKMC
   - BKMC vs UTAC
4. Conclusion


Introduction: Mixture models

Mixture models

Mixture models are a powerful framework for estimating probability density functions (PDFs)

Mixture model $f$:

$$f(x) = \sum_{i=1}^{n} \alpha_i f_i(x)$$

where $\alpha_i \geq 0$ denotes a weight with $\sum_{i=1}^{n} \alpha_i = 1$

If f is a Gaussian mixture model (GMM),

$$f_i(x) = \frac{1}{(2\pi)^{d/2} |\Sigma_i|^{1/2}} \exp\left( -\frac{(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)}{2} \right)$$

with mean $\mu_i$ and covariance matrix $\Sigma_i$
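As a concrete illustration (not on the original slide), here is a minimal Java sketch of GMM density evaluation; the class and method names are hypothetical, and the univariate case d = 1 keeps each covariance a scalar variance:

```java
// Minimal sketch: evaluating a univariate GMM density f(x) = sum_i alpha_i f_i(x).
public final class GmmDensity {
    // Univariate Gaussian density N(x; mu, var).
    static double gaussian(double x, double mu, double var) {
        double diff = x - mu;
        return Math.exp(-diff * diff / (2.0 * var)) / Math.sqrt(2.0 * Math.PI * var);
    }

    // Weighted sum of component densities; the weights must sum to 1.
    static double mixture(double x, double[] alpha, double[] mu, double[] var) {
        double f = 0.0;
        for (int i = 0; i < alpha.length; i++) {
            f += alpha[i] * gaussian(x, mu[i], var[i]);
        }
        return f;
    }

    public static void main(String[] args) {
        double[] alpha = {0.3, 0.7};
        double[] mu    = {0.0, 1.0};
        double[] var   = {0.25, 0.5};
        System.out.println(mixture(0.5, alpha, mu, var));
    }
}
```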


Introduction: Problem

Problem

[Figure: density estimation on 1-D data using a kernel-based Parzen estimator]

Mixture models usually contain many components

Estimating statistical measures on such mixtures is computationally expensive

Need to reduce the number of components, either by:
- Re-learning a simpler mixture model from the dataset, or
- Simplifying the mixture model f directly


Introduction: Mixture model simplification

Mixture model simplification

Given a mixture model f of n components

$$f(x) = \sum_{i=1}^{n} \alpha_i f_i(x)$$

compute a mixture model $g$ of $m$ components

$$g(x) = \sum_{j=1}^{m} \alpha'_j g_j(x)$$

such that $g$ is the best approximation of $f$


Mixture model simplification

Plan

1. Introduction
   - Mixture models
   - Problem
   - Mixture model simplification
2. Mixture model simplification
   - KLD and Bregman divergence
   - Sided BKMC
   - Symmetric BKMC
   - jMEF
3. Experiments
   - Quality measure and initialization
   - Sided BKMC
   - BKMC vs UTAC
4. Conclusion


Mixture model simplification: KLD and Bregman divergence

Relative entropy and Bregman divergence

The fundamental measure between statistical distributions is the relative entropy, also called the Kullback-Leibler divergence (KLD)

Given two distributions $f_i$ and $f_j$, the KLD is given by

$$\mathrm{KLD}(f_i \| f_j) = \int f_i(x) \log \frac{f_i(x)}{f_j(x)} \, dx$$

In the case of normal distributions:

$$\mathrm{KLD}(f_i \| f_j) = \frac{1}{2} \log\left( \frac{\det \Sigma_j}{\det \Sigma_i} \right) + \frac{1}{2} \mathrm{tr}\left( \Sigma_j^{-1} \Sigma_i \right) + \frac{1}{2} (\mu_j - \mu_i)^T \Sigma_j^{-1} (\mu_j - \mu_i) - \frac{d}{2}$$
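For d = 1 this closed form reduces to scalars (the determinant and trace disappear); a hedged Java sketch with hypothetical names, not taken from the slides:

```java
// Closed-form KLD between univariate Gaussians N(mu_i, var_i) and N(mu_j, var_j).
public final class GaussianKld {
    static double kld(double muI, double varI, double muJ, double varJ) {
        double logTerm   = 0.5 * Math.log(varJ / varI); // (1/2) log(det Sigma_j / det Sigma_i)
        double traceTerm = 0.5 * (varI / varJ);         // (1/2) tr(Sigma_j^{-1} Sigma_i)
        double diff      = muJ - muI;
        double mahaTerm  = 0.5 * diff * diff / varJ;    // (1/2) (mu_j - mu_i)^2 / var_j
        return logTerm + traceTerm + mahaTerm - 0.5;    // minus d/2 with d = 1
    }

    public static void main(String[] args) {
        System.out.println(kld(0.0, 1.0, 0.0, 1.0)); // identical distributions -> 0.0
    }
}
```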


Mixture model simplification: KLD and Bregman divergence

Relative entropy and Bregman divergence

Normal distributions belong to the class of exponential families

Canonical form of exponential families:

$$f(x) = \exp\left\{ \langle \Theta, t(x) \rangle - F(\Theta) + C(x) \right\}$$

The KLD is obtained by computing the Bregman divergence defined for the log normalizer $F$:

$$\mathrm{KLD}(f_i \| f_j) = D_F(\Theta_j \| \Theta_i)$$

where

$$D_F(\Theta_j \| \Theta_i) = F(\Theta_j) - F(\Theta_i) - \langle \Theta_j - \Theta_i, \nabla F(\Theta_i) \rangle$$
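The Bregman divergence itself is easy to state in code. A hedged scalar sketch (hypothetical class, scalar natural parameter; for d-dimensional Gaussians Θ is a vector/matrix pair, which jMEF itself handles):

```java
import java.util.function.DoubleUnaryOperator;

// D_F(p || q) = F(p) - F(q) - <p - q, grad F(q)>, here for a scalar parameter.
public final class BregmanDivergence {
    static double bregman(DoubleUnaryOperator F, DoubleUnaryOperator gradF, double p, double q) {
        return F.applyAsDouble(p) - F.applyAsDouble(q)
             - (p - q) * gradF.applyAsDouble(q);
    }

    public static void main(String[] args) {
        // Sanity check: F(x) = x^2 yields the squared Euclidean distance (p - q)^2.
        System.out.println(bregman(x -> x * x, x -> 2.0 * x, 3.0, 1.0)); // prints 4.0
    }
}
```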


Mixture model simplification: KLD and Bregman divergence

Relative entropy and Bregman divergence

For multivariate normal distributions:

Sufficient statistics:
$$t(x) = \left( x, \; -\tfrac{1}{2} x x^T \right)$$

Natural parameters:
$$\Theta = (\theta, \Theta) = \left( \Sigma^{-1} \mu, \; \tfrac{1}{2} \Sigma^{-1} \right)$$

Log normalizer:
$$F(\Theta) = \tfrac{1}{4} \mathrm{tr}(\Theta^{-1} \theta \theta^T) - \tfrac{1}{2} \log \det \Theta + \tfrac{d}{2} \log \pi$$

Gradient:
$$\nabla F(\Theta) = \left( \tfrac{1}{2} \Theta^{-1} \theta, \; -\tfrac{1}{2} \Theta^{-1} - \tfrac{1}{4} (\Theta^{-1} \theta)(\Theta^{-1} \theta)^T \right)$$
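For d = 1 these expressions reduce to scalars; a hedged sketch (hypothetical names, not from the slides) converting source parameters (μ, σ²) to natural parameters and evaluating F:

```java
// Univariate Gaussian as an exponential family: natural parameters and log normalizer.
public final class NaturalParams {
    // theta = Sigma^{-1} mu and Theta = (1/2) Sigma^{-1}, with Sigma = var for d = 1.
    static double[] toNatural(double mu, double var) {
        return new double[] { mu / var, 0.5 / var };
    }

    // F(Theta) = (1/4) theta^2 / Theta - (1/2) log Theta + (1/2) log pi for d = 1.
    static double logNormalizer(double theta, double Theta) {
        return 0.25 * theta * theta / Theta - 0.5 * Math.log(Theta) + 0.5 * Math.log(Math.PI);
    }

    public static void main(String[] args) {
        double[] nat = toNatural(0.0, 1.0);
        // Standard Gaussian: F = (1/2) log(2 pi).
        System.out.println(logNormalizer(nat[0], nat[1]));
    }
}
```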


Mixture model simplification: Sided BKMC

Bregman k-means clustering

K-means clustering:
- Set of points
- Initialize k centroids = k classes
- Repeat until convergence:
  - Repartition step (distance)
  - Computation of centroids (centers of mass)

Bregman k-means clustering:
- Set of distributions
- Initialize k centroids (α′_i, g_i) = GMM with k components
- Repeat until convergence:
  - Repartition step (sided Bregman divergence)
  - Computation of centroids (sided centroids)


Mixture model simplification: Sided BKMC

Sided centroids

[Figure: right-sided and left-sided centroids of 5 multivariate Gaussians; interactive applet at http://www.sonycsl.co.jp/person/nielsen/BNCj/]


Mixture model simplification: Sided BKMC

Right-sided BKMC algorithm

1: Initialize the GMM g
2: repeat
3: Compute the clusters C: the Gaussian f_i belongs to cluster C_j if and only if
$$D_F(\Theta_i \| \Theta'_j) < D_F(\Theta_i \| \Theta'_l), \quad \forall l \in [1, m] \setminus \{j\}$$
4: Compute the centroids: the weight and the natural parameters of the j-th centroid (i.e. Gaussian g_j) are given by
$$\alpha'_j = \sum_i \alpha_i, \qquad \theta'_j = \frac{\sum_i \alpha_i \theta_i}{\sum_i \alpha_i}, \qquad \Theta'_j = \frac{\sum_i \alpha_i \Theta_i}{\sum_i \alpha_i}$$
where the sum $\sum_i$ runs over the indices $i \in [1, n]$ such that $f_i \in C_j$
5: until the clusters do not change between two iterations
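Step 4 is a weighted arithmetic mean in natural coordinates. A hedged univariate Java sketch (hypothetical names, not from the slides), where a cluster is given as the list of source-component indices:

```java
import java.util.List;

// Right-sided Bregman centroid of one cluster, following step 4 above.
public final class RightCentroid {
    // Returns { alpha'_j, theta'_j, Theta'_j } for the members f_i in C_j.
    static double[] centroid(List<Integer> members, double[] alpha, double[] theta, double[] Theta) {
        double w = 0.0, t = 0.0, T = 0.0;
        for (int i : members) {
            w += alpha[i];
            t += alpha[i] * theta[i];
            T += alpha[i] * Theta[i];
        }
        return new double[] { w, t / w, T / w };
    }

    public static void main(String[] args) {
        double[] alpha = {0.5, 0.5};
        double[] theta = {0.0, 2.0};
        double[] Theta = {0.5, 0.5};
        System.out.println(java.util.Arrays.toString(centroid(List.of(0, 1), alpha, theta, Theta)));
    }
}
```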


Mixture model simplification: Sided BKMC

Left-sided BKMC algorithm

1: Initialize the GMM g
2: repeat
3: Compute the clusters C: the Gaussian f_i belongs to cluster C_j if and only if
$$D_F(\Theta'_j \| \Theta_i) < D_F(\Theta'_l \| \Theta_i), \quad \forall l \in [1, m] \setminus \{j\}$$
4: Compute the centroids: the weight and the natural parameters of the j-th centroid (i.e. Gaussian g_j) are given by
$$\alpha'_j = \sum_i \alpha_i, \qquad \Theta'_j = \nabla F^{-1}\left( \sum_i \frac{\alpha_i}{\alpha'_j} \nabla F(\Theta_i) \right)$$
where
$$\nabla F^{-1}(\Theta) = \left( -\left( \Theta + \theta \theta^T \right)^{-1} \theta, \; -\frac{1}{2} \left( \Theta + \theta \theta^T \right)^{-1} \right)$$
5: until the clusters do not change between two iterations
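For d = 1 the gradient and its inverse reduce to scalar pairs; a hedged sketch (hypothetical names, not from the slides) of the left-sided centroid update, which averages in the gradient (expectation) coordinates and maps back:

```java
import java.util.List;

// Left-sided Bregman centroid for univariate Gaussians: average grad F(Theta_i)
// with weights alpha_i / alpha'_j, then map back through grad F^{-1}.
public final class LeftCentroid {
    // grad F for d = 1: (theta / (2 Theta), -1/(2 Theta) - theta^2 / (4 Theta^2)).
    static double[] gradF(double theta, double Theta) {
        double eta = 0.5 * theta / Theta;
        return new double[] { eta, -0.5 / Theta - eta * eta };
    }

    // grad F^{-1} for d = 1: (-(H + eta^2)^{-1} eta, -(1/2)(H + eta^2)^{-1}).
    static double[] gradFInv(double eta, double H) {
        double s = H + eta * eta;
        return new double[] { -eta / s, -0.5 / s };
    }

    static double[] centroid(List<Integer> members, double[] alpha, double[] theta, double[] Theta) {
        double w = 0.0, eta = 0.0, H = 0.0;
        for (int i : members) w += alpha[i];
        for (int i : members) {
            double[] g = gradF(theta[i], Theta[i]);
            eta += alpha[i] / w * g[0];
            H   += alpha[i] / w * g[1];
        }
        double[] nat = gradFInv(eta, H);
        return new double[] { w, nat[0], nat[1] };  // { alpha'_j, theta'_j, Theta'_j }
    }

    public static void main(String[] args) {
        // Round-trip sanity check: a single member returns its own parameters.
        System.out.println(java.util.Arrays.toString(
            centroid(List.of(0), new double[]{1.0}, new double[]{2.0}, new double[]{0.5})));
    }
}
```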


Mixture model simplification: Symmetric BKMC

Symmetric BKMC algorithm

A symmetric similarity measure can be required (e.g. in content-based image retrieval, CBIR)

Repartition step: symmetric Bregman divergence

$$SD_F(\Theta_p, \Theta_q) = \frac{D_F(\Theta_q \| \Theta_p) + D_F(\Theta_p \| \Theta_q)}{2}$$

Computation of the symmetric centroid:
- Compute the right and left centroids $c_r$ and $c_l$
- The symmetric centroid $c_s$ lies on the geodesic linking $c_r$ and $c_l$:
$$c_\lambda = \nabla F^{-1}\left( \lambda \nabla F(c_r) + (1 - \lambda) \nabla F(c_l) \right)$$
- The symmetric centroid $c_s = c_\lambda$ verifies
$$SD_F(c_\lambda, c_r) = SD_F(c_\lambda, c_l)$$
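Since c_0 = c_l and c_1 = c_r, the difference SD_F(c_λ, c_r) − SD_F(c_λ, c_l) changes sign on [0, 1], so λ can be found by bisection. A hedged sketch (hypothetical interface, not from the slides):

```java
import java.util.function.DoubleUnaryOperator;

// Bisection on lambda in [0, 1] for the symmetric Bregman centroid: find the
// c_lambda on the geodesic with SD_F(c_lambda, c_r) = SD_F(c_lambda, c_l).
public final class SymmetricCentroid {
    // diff(lambda) = SD_F(c_lambda, c_r) - SD_F(c_lambda, c_l); it is positive at
    // lambda = 0 (c_0 = c_l) and negative at lambda = 1 (c_1 = c_r).
    static double solveLambda(DoubleUnaryOperator diff, double tol) {
        double lo = 0.0, hi = 1.0;
        while (hi - lo > tol) {
            double mid = 0.5 * (lo + hi);
            if (diff.applyAsDouble(mid) > 0.0) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        // Toy stand-in for diff: any function crossing zero once in (0, 1).
        System.out.println(solveLambda(lambda -> 0.5 - lambda, 1e-9)); // ~0.5
    }
}
```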


Mixture model simplification: jMEF

jMEF

jMEF: a Java library for Mixtures of Exponential Families (MEF)

- Create and manage MEFs
- Simplify MEFs using BKMC

Available online at www.lix.polytechnique.fr/~nielsen/MEF


Experiments

Plan

1. Introduction
   - Mixture models
   - Problem
   - Mixture model simplification
2. Mixture model simplification
   - KLD and Bregman divergence
   - Sided BKMC
   - Symmetric BKMC
   - jMEF
3. Experiments
   - Quality measure and initialization
   - Sided BKMC
   - BKMC vs UTAC
4. Conclusion


Experiments: Quality measure and initialization

Quality measure and initialization

Simplification quality measure:
- KLD(f‖g) (right-sided)
- No closed-form expression for the KLD between mixtures
- Draw 10,000 points to estimate this KLD (Monte Carlo)

Initial GMM f:
- Learnt from an image
- K-means on RGB pixels ⇒ 32 classes
- EM algorithm ⇒ components f_i
- Weights α_i: proportion of pixels in each class
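A hedged sketch of that Monte Carlo estimator (hypothetical Mixture interface, not from the slides): draw samples x ~ f and average log(f(x)/g(x)):

```java
import java.util.Random;

// Monte Carlo estimate of KLD(f || g): average log(f(x)/g(x)) over samples x ~ f.
public final class MonteCarloKld {
    // Hypothetical minimal mixture interface: evaluate the density, draw a sample.
    interface Mixture {
        double density(double x);
        double sample(Random rng);
    }

    static double estimate(Mixture f, Mixture g, int n, Random rng) {
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            double x = f.sample(rng);
            sum += Math.log(f.density(x) / g.density(x));
        }
        return sum / n;  // the slides use n = 10,000
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // Standard Gaussian as a one-component "mixture" for a quick sanity check.
        Mixture std = new Mixture() {
            public double density(double x) { return Math.exp(-x * x / 2) / Math.sqrt(2 * Math.PI); }
            public double sample(Random r) { return r.nextGaussian(); }
        };
        System.out.println(estimate(std, std, 10_000, rng)); // identical mixtures -> 0.0
    }
}
```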


Experiments: Sided BKMC

Sided BKMC

Evolution of KLD(f ‖g) as a function of m

The simplification quality increases with m

Left-sided BKMC provides the best results

Right-sided BKMC provides the worst results


Experiments: BKMC vs UTAC

BKMC vs UTAC

The UTAC algorithm is based on sigma points and the EM algorithm

BKMC provides better results than UTAC

BKMC is faster than UTAC: 20 ms vs 100 ms


Experiments: BKMC vs UTAC

Clustering-based image segmentation

[Figure: two test images f segmented with UTAC and BKMC]

First image: UTAC KLD = 0.23, BKMC KLD = 0.11
Second image: UTAC KLD = 0.16, BKMC KLD = 0.13


Experiments: BKMC vs UTAC

Clustering-based image segmentation

[Figure: two more test images f segmented with UTAC and BKMC]

First image: UTAC KLD = 0.69, BKMC KLD = 0.53
Second image: UTAC KLD = 0.36, BKMC KLD = 0.18


Conclusion

Plan

1. Introduction
   - Mixture models
   - Problem
   - Mixture model simplification
2. Mixture model simplification
   - KLD and Bregman divergence
   - Sided BKMC
   - Symmetric BKMC
   - jMEF
3. Experiments
   - Quality measure and initialization
   - Sided BKMC
   - BKMC vs UTAC
4. Conclusion


Conclusion

Conclusion

GMM simplification algorithm based on k-means and the Bregman divergence

BKMC is faster and provides better results than the UTAC algorithm

BKMC extends to mixtures of exponential families

jMEF available online at www.lix.polytechnique.fr/~nielsen/MEF

Included features:
- Create/manage mixtures of exponential families
- BKMC algorithm
- Hierarchical GMMs (ACCV 2009)

