Spectral Clustering
Jianping Fan
Dept of Computer Science
UNC, Charlotte
Lecture Outline
Motivation
Graph overview and construction
Spectral Clustering
Cool implementations
Semantic interpretations of clusters
Spectral Clustering Example – 2 Spirals
[Figure: left, the two-spirals dataset; right, the same data in the embedded space.]
The dataset exhibits complex cluster shapes. K-means performs very poorly in this space due to its bias toward dense, spherical clusters.
In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate.
Spectral Clustering Example
[Figure: original points vs. K-means (2 clusters).]
Why does K-means fail for these two examples?
Geometry vs. Manifold
Lecture Outline
Motivation
Graph overview and construction
Spectral Clustering
Cool implementations
Graph-based Representation of Data Similarity
Graph-based Representation of Data Relationships
Manifold
Data Graph Construction
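As a minimal sketch of one standard construction, a k-nearest-neighbour graph with Gaussian (heat-kernel) edge weights; the function name and the defaults k=10, sigma=1.0 are illustrative assumptions, not values fixed by the slides:

```python
import numpy as np

def knn_similarity_graph(X, k=10, sigma=1.0):
    """Build a symmetric k-nearest-neighbor similarity graph from data rows X."""
    # Pairwise squared Euclidean distances between all points.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        # The k nearest neighbors of point i, excluding the point itself.
        neighbors = np.argsort(sq_dists[i])[1:k + 1]
        # Gaussian similarity exp(-||x_i - x_j||^2 / (2 sigma^2)) on retained edges.
        W[i, neighbors] = np.exp(-sq_dists[i, neighbors] / (2.0 * sigma ** 2))
    # Symmetrize: keep an edge if either endpoint selected it.
    return np.maximum(W, W.T)
```

Keeping only nearest-neighbour edges preserves the local manifold structure while discarding long-range similarities, which is what lets the later graph cut follow the data's shape.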
Graph Cut
Lecture Outline
Motivation
Graph overview and construction
Spectral Clustering
Cool implementations
Graph Cut
Eigenvectors & Eigenvalues
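As a quick refresher, the defining relation (standard linear algebra, not specific to these slides):

```latex
% A nonzero vector v is an eigenvector of the square matrix A, with
% eigenvalue \lambda, when
A v = \lambda v, \qquad v \neq 0 .
% For a symmetric matrix, such as the graph Laplacian D - W used below,
% all eigenvalues are real and eigenvectors belonging to distinct
% eigenvalues are orthogonal.
```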
Normalized Cut
A graph G(V, E) can be partitioned into two disjoint sets A and B.
The cut is defined as:
$$\mathrm{cut}(A, B) = \sum_{u \in A,\; v \in B} w(u, v)$$
The optimal partition of G is achieved by minimizing the cut: $\min \mathrm{cut}(A, B)$.
Normalized Cut
$$\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$$
where $\mathrm{assoc}(A, V) = \sum_{u \in A,\; t \in V} w(u, t)$ is the association between a partition set and the whole graph.
Normalized Cut
In matrix form, with a suitably constrained indicator vector y, the normalized cut becomes
$$\mathrm{Ncut} = \frac{y^{T}(D - W)\, y}{y^{T} D\, y}$$
which can be minimized by solving the generalized eigenvalue equation:
$$(D - W)\, y = \lambda D\, y$$
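A minimal sketch of that step, assuming a precomputed symmetric weight matrix W with positive row sums; scipy.linalg.eigh handles the generalized symmetric eigenproblem, and thresholding the second eigenvector at zero is one simple way to read off a partition:

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(W):
    """Two-way normalized cut via the generalized eigenproblem (D - W) y = lambda * D y."""
    D = np.diag(W.sum(axis=1))
    L = D - W                      # unnormalized graph Laplacian
    # eigh(L, D) solves L y = lambda D y; eigenvalues come back in ascending order.
    eigvals, eigvecs = eigh(L, D)
    y = eigvecs[:, 1]              # second-smallest eigenvector: the relaxed cut indicator
    return y >= 0                  # boolean mask giving the two sides of the partition
```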
K-way Min-Max Cut
Intra-cluster similarity
Inter-cluster similarity
Decision function for spectral clustering
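A hedged LaTeX reconstruction of the standard k-way min-max cut objective; the similarity notation $s(\cdot,\cdot)$ is my assumption, not copied from these slides:

```latex
% s(A_p, A_q) = \sum_{i \in A_p} \sum_{j \in A_q} W_{ij} measures the total
% similarity between two clusters; s(A_k, A_k) is the intra-cluster
% similarity and s(A_k, \bar{A}_k) the inter-cluster similarity.
\min_{A_1, \ldots, A_K} \; \sum_{k=1}^{K}
    \frac{s(A_k, \bar{A}_k)}{s(A_k, A_k)},
\qquad \bar{A}_k = V \setminus A_k .
% Minimizing each ratio makes inter-cluster similarity small while
% keeping intra-cluster similarity large.
```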
Mathematical Description of Spectral Clustering
Refined decision function for spectral clustering
We can further define scaled cluster-indicator vectors; the refined decision function can then be solved as an eigenvalue problem on the normalized weight matrix $D^{-1/2} W D^{-1/2}$, the matrix used in the algorithm below.
Spectral Clustering Algorithm (Ng, Jordan, and Weiss)
Motivation: given a set of points
$$S = \{s_1, \ldots, s_n\} \subset \mathbb{R}^{l}$$
we would like to cluster them into k subsets.
Algorithm
Step 1: Form the affinity matrix $W \in \mathbb{R}^{n \times n}$, defined by
$$W_{ij} = e^{-\|s_i - s_j\|^2 / 2\sigma^2} \;\; (i \neq j), \qquad W_{ii} = 0$$
The scaling parameter $\sigma$ is chosen by the user.
Define D, a diagonal matrix whose (i, i) element is the sum of W's row i.
Algorithm
Step 2: Form the matrix
$$L = D^{-1/2}\, W\, D^{-1/2}$$
Find $x_1, x_2, \ldots, x_k$, the k largest eigenvectors of L. These form the columns of the new matrix X.
Note: we have reduced the dimension from $n \times n$ to $n \times k$.
Algorithm
Step 3: Form the matrix Y by renormalizing each of X's rows to have unit length:
$$Y_{ij} = X_{ij} \Big/ \Big(\textstyle\sum_j X_{ij}^2\Big)^{1/2}, \qquad Y \in \mathbb{R}^{n \times k}$$
Treat each row of Y as a point in $\mathbb{R}^{k}$ and cluster them into k clusters via K-means.
Algorithm
Step 4 (final cluster assignment): assign point $s_i$ to cluster j if and only if row i of Y was assigned to cluster j.
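Putting the four steps together: a minimal end-to-end sketch, assuming scikit-learn's KMeans for the final step (the function name and defaults are illustrative, not from the slides):

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def njw_spectral_clustering(S, k, sigma):
    """Ng-Jordan-Weiss spectral clustering of the n points stored as rows of S."""
    # Step 1: affinity matrix with Gaussian weights and zero diagonal.
    sq_dists = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Step 2: L = D^{-1/2} W D^{-1/2}, then its k largest eigenvectors.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = eigh(L)                 # eigenvalues come back in ascending order
    X = eigvecs[:, -k:]                  # columns: the k largest eigenvectors
    # Step 3: renormalize each row of X to unit length.
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Step 4: K-means on the rows of Y; point s_i gets the label of row i.
    return KMeans(n_clusters=k, n_init=10).fit_predict(Y)
```

On data like the two spirals from the motivating example, the rows of Y separate cleanly, which is exactly the regime where K-means succeeds.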
Why?
If we eventually use K-means, why not just apply K-means to the original data?
This method allows us to cluster non-convex regions
Some Examples
User’s Prerogative
Affinity matrix construction
Choice of the scaling factor $\sigma^2$: realistically, search over $\sigma^2$ and pick the value that gives the tightest clusters
Choice of k, the number of clusters
Choice of clustering method
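A hedged sketch of that search, reusing the njw_spectral_clustering function sketched above; measuring tightness as within-cluster scatter in the original space is one simple choice among several:

```python
import numpy as np

def pick_sigma(S, k, candidates):
    """Search candidate sigmas; keep the one whose clustering is tightest."""
    best_sigma, best_score = None, np.inf
    for sigma in candidates:
        labels = njw_spectral_clustering(S, k, sigma)   # sketch given earlier
        # Tightness: total squared distance of points to their cluster centroid.
        score = sum(
            np.sum((S[labels == c] - S[labels == c].mean(axis=0)) ** 2)
            for c in range(k)
            if np.any(labels == c)
        )
        if score < best_score:
            best_sigma, best_score = sigma, score
    return best_sigma

# Example: sigma = pick_sigma(S, k=2, candidates=np.logspace(-2, 1, 20))
```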
[Figure: the 20 largest eigenvalues of the Cisi/Medline data plotted against K; λ1 and λ2 stand well above the rest.]
How to select k? Eigengap: the difference between two consecutive eigenvalues. The most stable clustering is generally given by the value of k that maximizes the expression
$$\Delta_k = |\lambda_k - \lambda_{k-1}|$$
Here the maximum is $\Delta_2 = |\lambda_2 - \lambda_1|$, so choose k = 2.
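A minimal sketch of the rule, assuming the eigenvalues are supplied in any order (they are sorted descending internally, as in the plot):

```python
import numpy as np

def choose_k_by_eigengap(eigenvalues, k_max=20):
    """Choose k maximizing the eigengap Delta_k = |lambda_k - lambda_{k-1}|."""
    lam = np.sort(np.asarray(eigenvalues))[::-1][:k_max]  # lam[0] = lambda_1 (largest)
    # gaps[i] = |lam[i+1] - lam[i]| = Delta_{i+2} in the slide's 1-based notation.
    gaps = np.abs(np.diff(lam))
    return int(np.argmax(gaps)) + 2  # +2 maps the array index back to k
```

With eigenvalues like those plotted above, where λ1 stands far above λ2, the largest gap is Δ2 and the function returns k = 2.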
Recap – The bottom line
Summary
Spectral clustering can help us with hard clustering problems.
The technique is simple to understand, and the solution comes from solving a simple linear algebra problem which is not hard to implement.
Great care should be taken in choosing the "starting conditions": the scaling parameter, the number of clusters k, and the clustering method used in the embedded space.