Unsupervised Learning — Week #7
Machine Learning I 80-629A
Apprentissage Automatique I 80-629
Laurent Charlin — 80-629
Today
• Unsupervised learning
• Definition and properties
• A few clustering models
• Other instantiations of unsupervised learning
Experience (E)
• What data does f experience?
• (Focus on algorithms that experience whole datasets)
• Unsupervised. Examples alone: {x_i}_{i=1}^{N}
• Supervised. Examples come with labels: {(x_i, y_i)}_{i=1}^{N}
Recall
1. Unsupervised
• Experience examples alone: {x_i}_{i=1}^{N}
• Learn “useful properties of the structure of the data”
• E.g., clustering, density modelling (p(x)), PCA, FA.
Recall
Different tasks
• Finding patterns
• Clustering
• Dimensionality reduction
• Density modelling
• …
Clustering
• Task: Assign each point to one of K clusters
• Cluster: a set of similar points (a group)
• Alternatively: Divide the space into K regions. Assign points to a cluster based on the region they lie in
• Similar to classification, but without labels

f : X → {1, . . . , K}
[Figure: 2-D data (axes x_1, x_2) in the Supervised vs. Unsupervised settings]
Clustering
• Desideratum: group similar points together
• Similarity is often defined as being close according to some similarity or distance function
• E.g., in Euclidean space:

distance(x_i, x_j) = ‖x_i − x_j‖₂²
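In code, that distance is a one-liner (a NumPy sketch; the function name and example points are mine):

```python
import numpy as np

def sq_euclidean(x_i, x_j):
    """Squared Euclidean distance ||x_i - x_j||_2^2 between two points."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(diff @ diff)

# Illustrative 2-D points
print(sq_euclidean([0.0, 0.0], [3.0, 4.0]))  # → 25.0
```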
K-means clustering
• A particular clustering model (and accompanying algorithm)
• For a particular number of clusters (K), find cluster centers that minimize the within-cluster distance

Objective: J = Σ_{i=1}^{N} Σ_{k=1}^{K} r_{ik} ‖x_i − μ_k‖²

where r is an N×K matrix of one-hot rows (r_{ik} = 1 if x_i is assigned to cluster k) and μ_k is the center of cluster k.

• Algorithm to minimize the objective:
• Initialize the cluster centers μ_k
• Until convergence:
1. Update r
2. Update the cluster centers μ_k
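The alternating updates can be sketched in NumPy (a minimal illustration, not an optimized implementation; all function and variable names are mine):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Minimize J = sum_i sum_k r_ik ||x_i - mu_k||^2 by alternating updates."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]  # initialize centers
    for _ in range(n_iters):
        # Update r: assign each point to its nearest center
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # N x K
        r = d2.argmin(axis=1)
        # Update centers: move each center to the mean of its points
        new_mu = np.array([X[r == k].mean(axis=0) if np.any(r == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):
            break
        mu = new_mu
    return r, mu
```

Each step can only decrease the objective J, so the algorithm converges, though possibly to a local minimum (hence the common practice of multiple random restarts).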
[Figure 9.1 from PRML: K-means on 2-D data, from the initial cluster centers through alternating assignment (Step 1) and center-update (Step 2) iterations to the final solution]
K-means on images
[K = 2, K = 3, K = 10, Original image]
• Each pixel is a datum: x_i = (r_i, g_i, b_i)
• We cluster pixels by color
• I.e., pixels of similar color will belong to the same cluster
• We re-draw the image by replacing the color of each pixel by the color of its cluster
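The re-drawing step can be sketched by running the same alternating updates on the pixel vectors (a toy NumPy illustration; the synthetic two-color "image" stands in for a real photo):

```python
import numpy as np

def quantize_colors(img, K, n_iters=20, seed=0):
    """Re-draw an image with only K colors: run k-means on the pixels
    x_i = (r_i, g_i, b_i), then replace each pixel by its cluster's color."""
    H, W, _ = img.shape
    X = img.reshape(-1, 3).astype(float)   # each pixel is a datum
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        r = d2.argmin(axis=1)
        mu = np.array([X[r == k].mean(axis=0) if np.any(r == k) else mu[k]
                       for k in range(K)])
    return mu[r].reshape(H, W, 3)

# A toy 4x4 "image": left half red, right half blue
img = np.zeros((4, 4, 3))
img[:, :2] = [1.0, 0.0, 0.0]
img[:, 2:] = [0.0, 0.0, 1.0]
out = quantize_colors(img, K=2)
```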
A few failure cases
• Only works with squared Euclidean distance
• Not robust to outliers
• What about non-continuous data?
• Tends to result in relatively uniform cluster sizes
• Hard assignments

K-medoids objective: J = Σ_{i=1}^{N} Σ_{k=1}^{K} r_{ik} d(x_i, μ_k)

[http://scikit-learn.org/stable/auto_examples/cluster/plot_cluster_comparison.html]
Unsupervised learning
• In supervised learning you have:
• A clear metric (e.g., mean squared error, accuracy, etc.)
• Procedures for comparing different models
• In unsupervised learning:
• It is unclear what the right metric is
• How do you compare different models?
• The data dimensions can have a big impact on the solution
• You should carefully select the input
A probabilistic approach to k-means clustering

Objective: J = Σ_{i=1}^{N} Σ_{k=1}^{K} r_{ik} ‖x_i − μ_k‖²  (data x_i, indicator r_{ik}, center μ_k)

r is an N×K matrix; its i’th row r_{i:} = [0 1 0 . . . 0] is a one-hot indicator of x_i’s cluster.

P(r_{i:}) = Categorical(π) = Π_{k=1}^{K} π_k^{r_{ik}}

P(x_i | r_{ik} = 1) = N(x_i | μ_k, Σ_k)
A probabilistic approach to k-means clustering

P(x_i | r_{ik} = 1) = N(x_i | μ_k, Σ_k)
P(r_{i:}) = Categorical(π), with 0 ≤ π_k ≤ 1 and Σ_k π_k = 1

Marginalizing over the cluster assignment:

p(x_i | θ) = Σ_{r_i} p(r_i | θ) p(x_i | r_i, θ) = Σ_k π_k N(x_i | μ_k, Σ_k),  θ := {π_k, μ_k, Σ_k} ∀k

1) Soft clustering: each cluster k explains π_k of each datapoint. The posterior over r, P(r_{ik} = 1 | x), makes it explicit.
2) Mixture of Gaussians (GMMs): a convex combination of Gaussian distributions.
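Both points can be made concrete in NumPy (a sketch; the function names are mine):

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Multivariate normal density N(x | mu, Sigma)."""
    d = len(mu)
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / norm)

def mixture_density(x, pi, mus, Sigmas):
    """p(x | theta) = sum_k pi_k N(x | mu_k, Sigma_k)."""
    return sum(p * gaussian_pdf(x, m, S) for p, m, S in zip(pi, mus, Sigmas))

def responsibilities(x, pi, mus, Sigmas):
    """Posterior P(r_k = 1 | x): how much each cluster k explains x."""
    w = np.array([p * gaussian_pdf(x, m, S) for p, m, S in zip(pi, mus, Sigmas)])
    return w / w.sum()

# Two equally weighted 2-D Gaussians (illustrative parameters)
pi = [0.5, 0.5]
mus = [np.zeros(2), np.full(2, 5.0)]
Sigmas = [np.eye(2), np.eye(2)]
gamma = responsibilities(np.zeros(2), pi, mus, Sigmas)
```

A point at the first component's mean gets nearly all of its responsibility from that component, which is the soft analogue of a hard k-means assignment.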
A probabilistic approach to k-means clustering
• Estimation of the parameters: maximum likelihood estimation
• Could do it jointly (“à la” neural networks)
• Often in two separate steps (similar to the non-probabilistic version)
• This leads to the Expectation-Maximization (EM) algorithm
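The two-step estimation can be sketched as a bare-bones EM loop in NumPy (illustrative only: no log-domain computations, and the small ridge added to each Σ_k is my own numerical-stability addition):

```python
import numpy as np

def em_gmm(X, K, n_iters=50, seed=0):
    """Fit a mixture of Gaussians with EM: alternate soft assignments
    (E-step) and maximum-likelihood parameter updates (M-step)."""
    N, D = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)                       # mixing weights
    mu = X[rng.choice(N, size=K, replace=False)]   # initial centers
    Sigma = np.array([np.eye(D) for _ in range(K)])
    for _ in range(n_iters):
        # E-step: responsibilities gamma_ik = P(r_ik = 1 | x_i, theta)
        gamma = np.empty((N, K))
        for k in range(K):
            diff = X - mu[k]
            maha = np.sum(diff @ np.linalg.inv(Sigma[k]) * diff, axis=1)
            norm = np.sqrt((2 * np.pi) ** D * np.linalg.det(Sigma[k]))
            gamma[:, k] = pi[k] * np.exp(-0.5 * maha) / norm
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: re-estimate pi, mu, Sigma from the soft assignments
        Nk = gamma.sum(axis=0)
        pi = Nk / N
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = ((gamma[:, k, None] * diff).T @ diff / Nk[k]
                        + 1e-6 * np.eye(D))
    return pi, mu, Sigma, gamma
```

The E-step is the soft analogue of k-means' assignment update, and the M-step of its center update.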
[Figure 9.8 from PRML: EM for a mixture of Gaussians on 2-D data, from the initial cluster centers through iterations L = 1, 2, 5, 20 to the final solution]
[Figure 9.8 from PRML, repeated, with the k-means final solution on the same data for comparison]
Comparing K-means to GMMs
• GMMs learn a covariance matrix
• Per-cluster variance
• Covariance terms
• GMMs have many more parameters
• Covariance matrix (M×M)

[Side-by-side clusterings: K-means and GMMs give similar results on some datasets; GMMs do better on others]
Beyond Clustering
• Clustering is the classical task
• What’s the usefulness of unsupervised learning?
• Data exploration
• Dimensionality reduction
• Visualization
• Pre-processing
• Learn useful representations
• Pre-training / semi-supervised learning
Autoencoders
• A type of neural network for dimensionality reduction*
• Non-linear PCA
• Intuition: let’s learn to copy the data: x′ = f(x)
• We force a “bottleneck”: encode z = f(x), decode x′ = f′(z), with dim(z) < dim(x)
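The bottleneck idea can be sketched with a purely linear autoencoder trained by gradient descent (a NumPy toy under made-up hyperparameters; practical autoencoders use nonlinear layers and a deep-learning framework):

```python
import numpy as np

def train_linear_autoencoder(X, h_dim, lr=0.05, n_iters=3000, seed=0):
    """Learn to copy the data through a bottleneck: encode z = x W,
    decode x' = z V, with dim(z) < dim(x); minimize ||X W V - X||^2."""
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    W = rng.normal(0.0, 0.1, size=(D, h_dim))  # encoder weights
    V = rng.normal(0.0, 0.1, size=(h_dim, D))  # decoder weights
    for _ in range(n_iters):
        Z = X @ W                      # bottleneck code, dim(z) = h_dim
        err = Z @ V - X                # reconstruction error x' - x
        gV = Z.T @ err / len(X)        # gradient w.r.t. decoder
        gW = X.T @ (err @ V.T) / len(X)  # gradient w.r.t. encoder
        V -= lr * gV
        W -= lr * gW
    return W, V

# Data lying on a 1-D line in 2-D: a 1-unit bottleneck suffices to copy it
t = np.linspace(-1.0, 1.0, 50)
X = np.column_stack([t, 2.0 * t])
W, V = train_linear_autoencoder(X, h_dim=1)
mse = float(((X @ W @ V - X) ** 2).mean())
```

With a linear encoder/decoder and squared error, the learned subspace matches the one PCA finds, which is why autoencoders are described as non-linear PCA once nonlinearities are added.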
Autoencoders
• A neural network architecture for unsupervised learning
Unsupervised learning as supervised learning
• Examples:
• Auto-encoders “predict” their inputs
• Language models “predict” the next word
• Create a target from the x’s (or a subset)
• Find a task to ensure that you learn something useful
• Self-supervised learning
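The "create a target from the x's" recipe is mechanical; for the language-model example it can be sketched in a few lines (a toy illustration, names mine):

```python
def next_word_pairs(tokens, context_size=2):
    """Turn an unlabeled token sequence into supervised (input, target)
    pairs: predict each word from the words before it."""
    pairs = []
    for i in range(context_size, len(tokens)):
        pairs.append((tuple(tokens[i - context_size:i]), tokens[i]))
    return pairs

sentence = "the cat sat on the mat".split()
print(next_word_pairs(sentence))
# [(('the', 'cat'), 'sat'), (('cat', 'sat'), 'on'),
#  (('sat', 'on'), 'the'), (('on', 'the'), 'mat')]
```

No human labels were needed: the targets come from the data itself, which is the essence of self-supervised learning.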
Some thoughts on probabilistic approaches
• Many “equivalences” from known models/algorithms to probabilistic formulations
• K-means -> Mixture of Gaussians
• Linear Regression -> MLE in Gaussian model
• Logistic Regression -> MLE in Bernoulli model
• Allows us to think of models in a common framework
• Build a joint distribution from a marginal and a conditional
• Learning procedures
• Well motivated
• Can be shared across problems
P(x, z) = P(x | z) P(z)
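The linear-regression equivalence is easy to check numerically: under y_i ~ N(x_i·w, σ²) the log-likelihood is a constant minus half the squared error, so its maximizer is the least-squares solution (a NumPy sketch on made-up noise-free data):

```python
import numpy as np

def gaussian_loglik(w, X, y, sigma=1.0):
    """log p(y | X, w) when y_i ~ N(x_i . w, sigma^2)."""
    resid = y - X @ w
    n = len(y)
    return (-0.5 * n * np.log(2 * np.pi * sigma ** 2)
            - 0.5 * float(resid @ resid) / sigma ** 2)

# Toy 1-D data generated as y = 2x (noise-free, for clarity)
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = 2.0 * X[:, 0]

w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares fit
```

Any other w gives a strictly lower likelihood, since it can only increase the residual term.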
Unsupervised learning takeaways
• Most data (in the world) is unlabeled
• Useful tasks: clustering, density estimation, dimensionality reduction
• K-means and Gaussian mixtures (GMMs)
• Performance is harder to judge (measure)
• Note that all our examples were in 2D
• Can be used in downstream tasks