7/31/2019 Chap10.Part1
Unsupervised Learning and Clustering
Why consider unlabeled samples?

1. Collecting and labeling a large set of samples is costly: getting recorded speech is free, but labeling it is time consuming.
2. A classifier could be designed on a small set of labeled samples and tuned on a large unlabeled set.
3. Train on a large unlabeled set and use supervision on the groupings found.
4. Characteristics of patterns may change with time.
5. Unsupervised methods can be used to find useful features.
6. Exploratory data analysis may discover the presence of significant subclasses affecting the design.
Gradient Ascent for Mixtures
Mixture density:
$$p(x \mid \theta) = \sum_{j=1}^{c} p(x \mid \omega_j, \theta_j)\, P(\omega_j)$$

Likelihood of the observed samples:
$$p(D \mid \theta) = \prod_{k=1}^{n} p(x_k \mid \theta)$$

Log-likelihood:
$$l = \sum_{k=1}^{n} \ln p(x_k \mid \theta)$$

Gradient w.r.t. $\theta_i$:
$$\nabla_{\theta_i} l = \sum_{k=1}^{n} \frac{1}{p(x_k \mid \theta)} \nabla_{\theta_i} \left[ \sum_{j=1}^{c} p(x_k \mid \omega_j, \theta_j)\, P(\omega_j) \right] = \sum_{k=1}^{n} P(\omega_i \mid x_k, \theta)\, \nabla_{\theta_i} \ln p(x_k \mid \omega_i, \theta_i)$$

The MLE $\hat\theta$ must satisfy:
$$\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat\theta)\, \nabla_{\theta_i} \ln p(x_k \mid \omega_i, \hat\theta_i) = 0$$
Gaussian Mixture
With unknown mean vectors $\hat\mu = (\hat\mu_1, \ldots, \hat\mu_c)^t$, this yields
$$\hat\mu_i = \frac{\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat\mu)\, x_k}{\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat\mu)}$$

leading to an iterative scheme for improving the estimates:
$$\hat\mu_i(j+1) = \frac{\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat\mu(j))\, x_k}{\sum_{k=1}^{n} P(\omega_i \mid x_k, \hat\mu(j))}$$
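The iterative mean-update scheme can be sketched directly in NumPy. This is a minimal illustration, assuming unit-covariance Gaussian components and known priors (both assumptions are mine, made so that only the means are updated, as on the slide):

```python
import numpy as np

def update_means(X, mu, priors, n_iter=50):
    """Iterate the mixture mean-update equation, assuming unit-covariance
    Gaussian components and known priors P(omega_i)."""
    for _ in range(n_iter):
        # Squared distances from each sample to each mean: shape (n, c).
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        # Proportional to p(x_k | omega_i, mu_i) P(omega_i); the Gaussian
        # normalizing constant cancels when we normalize below.
        lik = np.exp(-0.5 * d2) * priors
        post = lik / lik.sum(axis=1, keepdims=True)   # P(omega_i | x_k, mu)
        # mu_i(j+1) = sum_k P(omega_i|x_k) x_k / sum_k P(omega_i|x_k)
        mu = (post.T @ X) / post.sum(axis=0)[:, None]
    return mu

# Two 1-D clumps near 0 and 5; means pull toward the clump centers.
X = np.array([[-0.1], [0.0], [0.1], [4.9], [5.0], [5.1]])
mu = update_means(X, np.array([[1.0], [4.0]]), np.array([0.5, 0.5]))
```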
k-means clustering
The Gaussian case with all parameters unknown leads to the formulation:

begin initialize n, c, μ1, μ2, ..., μc
    do classify the n samples according to the nearest μi
       recompute the μi
    until no change in the μi
end
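The procedure above translates almost line for line into code. A minimal sketch (the random initialization and the empty-cluster fallback are my additions, not part of the slide):

```python
import numpy as np

def k_means(X, c, mu0=None, max_iter=100):
    """Plain k-means: classify each sample by its nearest mean,
    recompute the means, repeat until no change."""
    if mu0 is None:
        # Initialize with c distinct samples chosen at random.
        mu = X[np.random.default_rng(0).choice(len(X), size=c, replace=False)]
    else:
        mu = np.asarray(mu0, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # Classify the n samples according to the nearest mu_i.
        labels = np.argmin(((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1), axis=1)
        # Recompute mu_i; keep the old mean if a cluster lost all samples.
        new_mu = np.array([X[labels == i].mean(axis=0) if np.any(labels == i) else mu[i]
                           for i in range(c)])
        if np.allclose(new_mu, mu):   # until no change in mu_i
            break
        mu = new_mu
    return mu, labels
```

For example, two well-separated clumps are recovered as two clusters when the initial means start near them.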
k-means clustering with one feature

One-dimensional example: six of the starting points lead to local maxima, whereas the two for which μ1(0) = μ2(0) lead to a saddle point.
k-means clustering with two features

Two-dimensional example: there are three means and three steps in the iteration. Voronoi tessellations based on the means are shown.
Data sets having identical statistics up to second order, i.e., the same μ and Σ
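Second-order statistics do not pin down a data set. A tiny hand-picked illustration (the specific values are mine): two 1-D samples with the same mean and variance but visibly different structure.

```python
import numpy as np

# Two 1-D "data sets" with identical first- and second-order statistics
# (mean 0, variance 1) but different shapes: two tight clumps versus a
# heavy center with two outlying points.
A = np.array([-1.0, -1.0, 1.0, 1.0])
B = np.array([-np.sqrt(2), 0.0, 0.0, np.sqrt(2)])

print(A.mean(), B.mean())   # both 0.0
print(A.var(), B.var())     # both 1.0
```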
Similarity Measures
Two issues:
1. How to measure similarity between samples?
2. How to evaluate a partitioning?

If distance is a good measure of dissimilarity, the distance between samples in the same cluster must be smaller than the distance between samples in different clusters.

Two samples belong to the same cluster if the distance between them is less than a threshold d0.

The distance threshold affects the number and size of clusters.
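The threshold rule above can be sketched as grouping samples into connected components, linking any pair closer than d0 (the union-find bookkeeping is my choice of implementation, not from the slides):

```python
import numpy as np

def threshold_clusters(X, d0):
    """Link two samples whenever their distance is below d0, and take
    connected components as clusters. d0 controls both the number and
    the size of the resulting clusters."""
    n = len(X)
    parent = list(range(n))

    def find(i):
        # Find the component representative, with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(X[i] - X[j]) < d0:
                parent[find(i)] = find(j)   # merge the two components
    return [find(i) for i in range(n)]
```

With a small d0 the four points below form two clusters; with a large d0 they collapse into one, illustrating the threshold's effect.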
Binary Feature Similarity Measures

$$s(x, x') = \frac{x^t x'}{\|x\|\,\|x'\|}$$

Numerator: the number of attributes possessed by both $x$ and $x'$. Denominator: $(x^t x \; x'^t x')^{1/2}$ is the geometric mean of the number of attributes possessed by $x$ and by $x'$.

$$s(x, x') = \frac{x^t x'}{d}$$

Fraction of the $d$ attributes shared.

Tanimoto coefficient: the ratio of the number of shared attributes to the number possessed by $x$ or $x'$:

$$s(x, x') = \frac{x^t x'}{x^t x + x'^t x' - x^t x'}$$
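The three measures are one line each for 0/1 attribute vectors. A small sketch (function name and packaging are mine):

```python
import numpy as np

def binary_similarities(x, xp):
    """The three binary-feature similarity measures for 0/1 vectors x, x'."""
    shared = float(x @ xp)          # attributes possessed by both x and x'
    d = len(x)
    cosine = shared / np.sqrt((x @ x) * (xp @ xp))   # x^t x' / (||x|| ||x'||)
    fraction = shared / d                            # x^t x' / d
    tanimoto = shared / (x @ x + xp @ xp - shared)   # shared / possessed by either
    return cosine, fraction, tanimoto
```

For x = (1,1,0,0) and x' = (1,0,1,0) there is one shared attribute, giving 1/2, 1/4, and 1/3 respectively.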
Hierarchical Clustering
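Only the heading survived extraction here, so as a hedged sketch of the standard agglomerative procedure (start from singleton clusters and repeatedly merge the two nearest, here measured by single linkage; the implementation details are mine):

```python
import numpy as np

def agglomerate(X, c_target):
    """Agglomerative hierarchical clustering sketch: begin with n singleton
    clusters and merge the two nearest clusters (single linkage: minimum
    pairwise distance) until c_target clusters remain."""
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > c_target:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(X[i] - X[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)   # merge the two nearest clusters
    return clusters
```

Cutting the merge sequence at different cluster counts gives the levels of the hierarchy.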
Nearest Neighbor Algorithm
How to determine nearest clusters
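The usual cluster-distance choices can be written down directly; a sketch of the four common ones (minimum, maximum, average pairwise distance, and distance between means):

```python
import numpy as np

# D_i, D_j are arrays of samples, one row per sample.
def d_min(Di, Dj):   # nearest neighbor / single linkage
    return min(np.linalg.norm(x - y) for x in Di for y in Dj)

def d_max(Di, Dj):   # farthest neighbor / complete linkage
    return max(np.linalg.norm(x - y) for x in Di for y in Dj)

def d_avg(Di, Dj):   # average of all pairwise distances
    return np.mean([np.linalg.norm(x - y) for x in Di for y in Dj])

def d_mean(Di, Dj):  # distance between the cluster means
    return np.linalg.norm(Di.mean(axis=0) - Dj.mean(axis=0))
```

Using d_min in the merge step gives the nearest-neighbor algorithm above; swapping in d_max, d_avg, or d_mean changes which clusters are considered nearest.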