Unsupervised Learning — Week #7
Machine Learning I 80-629A · Apprentissage Automatique I 80-629

Source: lcharlin/courses/80-629/slides_unsupervised.pdf · Laurent Charlin — 80-629


Page 1:

Unsupervised Learning — Week #7

Machine Learning I 80-629A

Apprentissage Automatique I 80-629

Page 2:

Today

• Unsupervised learning

• Definition and properties

• A few clustering models

• Other instantiations of unsupervised learning

Page 3:

Experience (E)

• What data does f experience?

• (Focus on algorithms that experience whole datasets)

• Unsupervised. Examples alone

• Supervised. Examples come with labels

{x_i}_{i=1}^{N}

{(x_i, y_i)}_{i=1}^{N}

Recall

Page 4:

1. Unsupervised

• Experience examples alone

• Learn “useful properties of the structure of the data”

• E.g., clustering, density modeling (p(x)), PCA, factor analysis (FA).

{x_i}_{i=1}^{N}

Recall

Page 5:

Different tasks

• Finding patterns

• Clustering

• Dimensionality reduction

• Density modelling

• …

Page 6:

Clustering

• Task: Assign each point to one of K clusters

• Cluster: a set of similar points (a group)

• Alternatively: Divide the space into K regions. Assign points to a cluster based on the region they lie in

• Similar to classification, except that the cluster labels are not given in advance.

f : x → {1, . . . , K}

Page 7:

[Figure: the same 2-D data (axes x_1, x_2) shown with class labels (supervised) and without labels (unsupervised)]

Page 8:

Clustering

• Desideratum: group similar points together

• Similarity is often defined as being close according to some similarity or distance function

• E.g., in Euclidean space

distance(x_i, x_j) = ||x_i − x_j||_2^2

[Figure: two points x_i and x_j plotted in 2-D space (axes x_1, x_2)]
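As a quick illustration, here is a minimal NumPy version of this squared Euclidean distance (the function name is ours, not from the slides):

```python
import numpy as np

def sq_euclidean(x_i, x_j):
    """Squared Euclidean distance ||x_i - x_j||_2^2 between two points."""
    d = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(d @ d)

# Example: two points in 2-D
print(sq_euclidean([0.0, 0.0], [3.0, 4.0]))  # 25.0
```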


Page 10:

K-means clustering

• A particular clustering model (and accompanying algorithm)

• For a particular number of clusters (K), find cluster centers that minimize the within cluster distance

• Algorithm to minimize the objective:

• Initialize the cluster centers

• Until convergence:

• Update r

• Update cluster centers

Objective: J = Σ_{i=1}^{N} Σ_{k=1}^{K} r_{ik} ||x_i − μ_k||^2

[Figure: data points x_i in 2-D with two cluster centers μ_1 and μ_2]

r is an N × K binary indicator matrix (r_{ik} = 1 if x_i is assigned to cluster k); μ_k denotes the center of cluster k
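The alternating updates above can be written compactly; a minimal NumPy sketch follows (our own illustrative implementation, not the course's code; the random initialization and the convergence test are simple choices):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Alternate between updating assignments r and cluster centers mu."""
    rng = np.random.default_rng(seed)
    # Initialize the cluster centers with K randomly chosen data points
    mu = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Update r: assign each point to its closest center (squared Euclidean distance)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # N x K
        r = d2.argmin(axis=1)
        # Update centers: mean of the points assigned to each cluster
        new_mu = np.array([X[r == k].mean(axis=0) if np.any(r == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):   # converged: centers stopped moving
            break
        mu = new_mu
    return r, mu
```

Each iteration can only decrease the objective J, which is why the alternation converges (possibly to a local minimum that depends on the initialization).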



Page 15:

[Figure 9.1 from PRML: nine panels showing K-means on 2-D data, from the initial cluster centers through alternating assignment (Step 1) and center-update (Step 2) iterations to the final solution]

• Initialize the cluster centers

• Until convergence:

1. Update r

2. Update cluster centers


Page 17:

K-means on images

[Figure: the original image next to K-means reconstructions with K = 2, K = 3, and K = 10]

x_i = (r_i, g_i, b_i)

Each pixel is a datum

• We cluster pixels by color

• I.e., pixels of similar color will belong to the same cluster

• We re-draw the image by replacing the color of each pixel with the color of its cluster center (see the sketch below)
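A short sketch of this color quantization using scikit-learn's KMeans (the image array `img`, shaped H × W × 3 with values in [0, 1], is assumed to be loaded already; the function name is ours):

```python
import numpy as np
from sklearn.cluster import KMeans

def quantize_colors(img, K):
    """Cluster pixels by RGB color and redraw each pixel with its cluster center's color."""
    H, W, C = img.shape
    pixels = img.reshape(-1, C)                    # each pixel x_i = (r_i, g_i, b_i) is a datum
    km = KMeans(n_clusters=K, n_init=10).fit(pixels)
    quantized = km.cluster_centers_[km.labels_]    # replace each pixel by its center's color
    return quantized.reshape(H, W, C)
```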

Page 18:

A few failure cases

• Only works with squared Euclidean distance

• Not robust to outliers

• What about non-continuous data?

• Tends to result in relatively uniform cluster sizes

• Hard assignments

[http://scikit-learn.org/stable/auto_examples/cluster/plot_cluster_comparison.html]

K-medoids objective: J = Σ_{i=1}^{N} Σ_{k=1}^{K} r_{ik} d(x_i, μ_k)
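A rough sketch of the K-medoids idea under an arbitrary distance d (Manhattan distance here, purely as an example); this is our own illustration of the objective above, not the course's algorithm:

```python
import numpy as np

def manhattan(a, b):
    return float(np.abs(a - b).sum())

def kmedoids(X, K, n_iters=50, seed=0):
    """Like K-means, but each center is an actual data point (a medoid) and any
    distance d can be plugged into the objective sum_i sum_k r_ik d(x_i, mu_k)."""
    rng = np.random.default_rng(seed)
    medoids = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        D = np.array([[manhattan(x, m) for m in medoids] for x in X])  # N x K
        r = D.argmin(axis=1)                        # assign each point to its nearest medoid
        for k in range(K):
            members = X[r == k]
            if len(members) == 0:
                continue
            # New medoid: the member minimizing the total distance to the other members
            costs = [sum(manhattan(c, x) for x in members) for c in members]
            medoids[k] = members[int(np.argmin(costs))]
    return r, medoids
```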

Page 19:

[Figure: the scikit-learn cluster-comparison plot; http://scikit-learn.org/stable/auto_examples/cluster/plot_cluster_comparison.html]

Page 20:

Unsupervised learning

• In supervised learning you have:

• Clear metric (e.g., mean squared error, accuracy, etc.)

• Procedures for comparing different models

• Unsupervised learning

• Unclear what the right metric is

• How to compare different models?

• The data dimensions (input features) can have a big impact on the solution

• You should carefully select the input

Page 27:

A probabilistic approach to k-means clustering

[Figure: a data point x_i in 2-D space (axes x_1, x_2)]

Objective: J = Σ_{i=1}^{N} Σ_{k=1}^{K} r_{ik} ||x_i − μ_k||^2
(x_i: data, r_{ik}: indicator, μ_k: center)

r is an N × K binary indicator matrix; its i'th row is r_{i:} = [0 1 … 0] (1 × K)

P(r_{i:}) = Categorical(π) = Π_{k=1}^{K} π_k^{r_{ik}}

P(x_i | r_{ik} = 1) = N(x_i | μ_k, Σ_k)

Page 31:

A probabilistic approach to k-means clustering

P(x_i | r_{ik} = 1) = N(x_i | μ_k, Σ_k)

P(r_{i:}) = Categorical(π), with 0 ≤ π_k ≤ 1 and Σ_k π_k = 1

Marginalizing over r_i:
P(x_i | θ) = Σ_{r_i} P(r_i | θ) P(x_i | r_i, θ) = Σ_k π_k N(x_i | μ_k, Σ_k), where θ := {π_k, μ_k, Σ_k} ∀k

1) Soft clustering: each cluster k explains part of each datapoint; the posterior over r, P(r_{ik} = 1 | x), makes this explicit

2) Mixture of Gaussians (GMMs): a convex combination of Gaussian distributions

Page 32:

A probabilistic approach to k-means clustering

• Estimation of the parameters: Maximum likelihood estimate

• Could do it jointly (“à la” neural networks)

• Often in two separate steps (similar to the non-probabilistic version)

• This leads to the Expectation-Maximization (EM) algorithm (sketched below)
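A compact NumPy/SciPy sketch of the two alternating steps for this Gaussian mixture (our own illustrative implementation; function and variable names are assumptions, and no convergence check is included):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iters=100, seed=0):
    """EM for a mixture of Gaussians: the E-step computes the posterior over r
    (responsibilities), the M-step re-estimates pi_k, mu_k, Sigma_k from them."""
    N, M = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)                        # mixing weights, sum to 1
    mu = X[rng.choice(N, size=K, replace=False)]    # initial centers
    Sigma = np.array([np.eye(M) for _ in range(K)])
    for _ in range(n_iters):
        # E-step: resp[i, k] = P(r_ik = 1 | x_i), proportional to pi_k N(x_i | mu_k, Sigma_k)
        resp = np.column_stack([pi[k] * multivariate_normal.pdf(X, mu[k], Sigma[k])
                                for k in range(K)])
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates
        Nk = resp.sum(axis=0)
        pi = Nk / N
        mu = (resp.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = (resp[:, k, None] * diff).T @ diff / Nk[k]
    return pi, mu, Sigma, resp
```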


Page 34:

[Figure 9.8 from PRML: EM for a mixture of Gaussians on 2-D data, from the initial cluster centers through iterations L = 1, 2, 5, and 20 to the final solution, shown alongside the final K-means solution for comparison]


Page 36:

Comparing K-means to GMMs

• GMMs learn a covariance matrix

• Per cluster variance

• Covariance terms

• GMMs have many more parameters

• Covariance matrix (M × M) per cluster

[Figure: K-means and GMM clusterings compared side by side on the scikit-learn comparison datasets; the results are similar on most rows, with the GMM better on two of them]
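As a rough back-of-the-envelope illustration of that parameter-count gap (our own count, assuming full covariance matrices, K clusters, and M dimensions):

```python
def n_params(K, M, model):
    """Rough parameter counts for K clusters in M dimensions."""
    if model == "kmeans":
        return K * M                          # K centers only
    if model == "gmm":
        means = K * M
        covs = K * M * (M + 1) // 2           # one symmetric M x M covariance per cluster
        weights = K - 1                       # mixing weights (they sum to 1)
        return means + covs + weights

print(n_params(3, 10, "kmeans"))  # 30
print(n_params(3, 10, "gmm"))     # 30 + 165 + 2 = 197
```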

Page 45:

Beyond Clustering

• Clustering is the classical task

• What’s the usefulness of unsupervised learning?

• Data exploration

• Dimensionality reduction

• Visualization

• Pre-processing

• Learn useful representations

• Pre-training / semi-supervised learning

Page 48:

Autoencoders

• A type of neural network for dimensionality reduction*

• Non-linear PCA

• Intuition: let’s learn to copy the data

• We force a “bottleneck”

x' = f(x)

h = f(x),  x' = f'(h),  with dim(h) < dim(x)
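A minimal PyTorch sketch of such a bottleneck autoencoder (our illustration; the layer sizes, names, and the dummy batch are arbitrary assumptions, not from the slides):

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Encoder f maps x to a low-dimensional h; decoder f' maps h back to x'."""
    def __init__(self, dim_x=784, dim_h=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim_x, 128), nn.ReLU(),
                                     nn.Linear(128, dim_h))   # bottleneck: dim(h) < dim(x)
        self.decoder = nn.Sequential(nn.Linear(dim_h, 128), nn.ReLU(),
                                     nn.Linear(128, dim_x))

    def forward(self, x):
        h = self.encoder(x)
        return self.decoder(h)

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                        # dummy batch standing in for real data
loss = nn.functional.mse_loss(model(x), x)     # "copy the data": reconstruct x from x
loss.backward()
opt.step()
```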

Page 49:

Autoencoders

• A neural network architecture for unsupervised learning

Page 50:

Unsupervised learning as supervised learning

• Examples:

• Auto-encoders “predict” their inputs

• Language models “predict” the next word

• Create a target from the x’s (or a subset); see the sketch below

• Find a task to ensure that you learn something useful

• Self-supervised learning
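A toy sketch of turning unlabeled text into supervised (input, target) pairs in the next-word spirit (the whitespace tokenization and the names are our own illustrative choices):

```python
def next_word_pairs(text):
    """Turn raw, unlabeled text into (context, target) training pairs:
    the 'label' for each prefix is simply the next word in the data itself."""
    words = text.split()
    return [(words[:i], words[i]) for i in range(1, len(words))]

pairs = next_word_pairs("unsupervised learning builds useful representations")
# [(['unsupervised'], 'learning'), (['unsupervised', 'learning'], 'builds'), ...]
```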

Page 51:

Some thoughts on probabilistic approaches

• Many “equivalences” from known models/algorithms to probabilistic formulations

• K-means -> Mixture of Gaussians

• Linear Regression -> MLE in Gaussian model

• Logistic Regression -> MLE in Bernoulli model

• Allows us to think of models in a common framework

• Build a joint distribution from a marginal and a conditional

• Learning procedures

• Well motivated

• Can be shared across problems

P(x, z) = P(x | z) P(z)

Page 52:

Unsupervised learning takeaways

• Most data (in the world) is unlabeled

• Useful tasks: clustering, density estimation, dimensionality reduction

• K-means and Gaussian mixtures (GMMs)

• Performance is harder to judge (measure)

• Note that all our examples were in 2D

• Can be used in downstream tasks
