Analysis of High-Dimensional Data
Leif Kobbelt

Lec 6 Data Analysis

Page 1: Lec 6 Data Analysis

Analysis of High-Dimensional Data

Leif Kobbelt

Page 2: Lec 6 Data Analysis


Motivation

• Given: n samples in d-dimensional space

$X = (x_1, \dots, x_n) \in \mathbb{R}^{d \times n}$

Page 3: Lec 6 Data Analysis


Motivation

• Given: n samples in d-dimensional space

$X = (x_1, \dots, x_n) \in \mathbb{R}^{d \times n}$

• Decrease d → dimensionality reduction: PCA, MDS

Page 4: Lec 6 Data Analysis


Principal Component Analysis

• Idea: Compute an orthogonal linear transformation that transforms the data into a new coordinate system s.t.

  greatest variance on the first coordinate axis

  second greatest variance on the second axis

  etc.

• Optimal transform for a given data set in the least squares sense

• Dimensionality reduction: project data into the lower-dimensional space spanned by the first principal components

Page 5: Lec 6 Data Analysis


Principal Component Analysis

Given: n samples scattered in d-dimensional space,

written as a matrix

$X = (x_1, x_2, \dots, x_n) \in \mathbb{R}^{d \times n}$

compute the centered covariance matrix (interpretation as a map from $\mathbb{R}^d$ to $\mathbb{R}^d$):

$C = (X - \bar{X})(X - \bar{X})^T \in \mathbb{R}^{d \times d}$

where $\bar{X}$ contains the sample mean in every column.

Page 6: Lec 6 Data Analysis


Principal Component Analysis

computation of C with the “centering matrix” J:

$J = I - \frac{1}{n} \mathbb{1}\mathbb{1}^T$

$C = XJ\,(XJ)^T = X J J^T X^T$

principal component(s): eigenvector(s) $v_i$ to the largest eigenvalue(s) $\lambda_i$ of C (low rank approximation)
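Below is a minimal numpy sketch of this construction (the function name pca and all variable names are ours; data is stored column-wise as a d × n array, matching the slides):

```python
import numpy as np

def pca(X, q):
    """PCA of a d x n data matrix X: returns the first q principal
    components and the projected coordinates X' = V_q^T X J."""
    d, n = X.shape
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix J = I - (1/n) 1 1^T
    Xc = X @ J                            # centered data X J
    C = Xc @ Xc.T                         # covariance matrix C = XJ (XJ)^T, d x d
    lam, V = np.linalg.eigh(C)            # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:q]     # largest q eigenvalues first
    Vq = V[:, order]                      # first q principal components
    return Vq, Vq.T @ Xc                  # components and q x n projection
```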

Page 7: Lec 6 Data Analysis


Principal Component Analysis

$C = V D V^T = (v_1 \cdots v_d)\,\mathrm{diag}(\lambda_1, \dots, \lambda_d)\,(v_1 \cdots v_d)^T$

rank-q approximation:

$C^* := (v_1 \cdots v_q)\,\mathrm{diag}(\lambda_1, \dots, \lambda_q)\,(v_1 \cdots v_q)^T$

projection onto the first q principal components:

$X' = (v_1 \cdots v_q)^T X J \in \mathbb{R}^{q \times n}$

Page 8: Lec 6 Data Analysis


Relation to SVD

• singular value decomposition

$XJ = U \Sigma V^T$

$C = XJ\,(XJ)^T = U \Sigma V^T V \Sigma^T U^T = U \Sigma \Sigma^T U^T$

i.e., the eigenvectors of C are the left singular vectors of XJ, and $\lambda_i = \sigma_i^2$.
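The same decomposition as a short numpy sketch, assuming a d × n array X as above (this avoids forming C explicitly, which is numerically preferable):

```python
import numpy as np

# sketch: PCA via the SVD of the centered data, XJ = U Sigma V^T
Xc = X - X.mean(axis=1, keepdims=True)             # equals X J
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
components, eigenvalues = U, s**2                  # lambda_i = sigma_i^2
```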

Page 9: Lec 6 Data Analysis


… for very large dimension d

$C = (XJ)(XJ)^T \in \mathbb{R}^{d \times d}$ is large, but $\tilde{C} = (XJ)^T (XJ) \in \mathbb{R}^{n \times n}$ is small if $n \ll d$.

If $\tilde{C} w = \lambda w$, then $v = XJ\,w$ is an eigenvector of C:

$C\,v = (XJ)(XJ)^T (XJ)\,w = (XJ)\,\tilde{C}\,w = \lambda\,XJ\,w = \lambda v$
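A sketch of this trick for d ≫ n (names are ours; the nonzero eigenvalues of C̃ and C coincide):

```python
import numpy as np

def pca_large_d(X, q):
    """PCA for d >> n: eigendecompose the small n x n matrix
    C~ = (XJ)^T (XJ) and lift its eigenvectors back to R^d via v = XJ w."""
    Xc = X - X.mean(axis=1, keepdims=True)   # X J
    Ct = Xc.T @ Xc                           # C~, n x n instead of d x d
    lam, W = np.linalg.eigh(Ct)
    order = np.argsort(lam)[::-1][:q]
    V = Xc @ W[:, order]                     # v = XJ w are eigenvectors of C
    V /= np.linalg.norm(V, axis=0)           # normalize to unit length
    return V, lam[order]
```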

Page 10: Lec 6 Data Analysis


Example

10 points in $\mathbb{R}^2$ [scatter plot]

Page 11: Lec 6 Data Analysis


Example

10 points in $\mathbb{R}^2$:

$C = \begin{pmatrix} 0.617 & 0.615 \\ 0.615 & 0.717 \end{pmatrix}$

$e_1 = \begin{pmatrix} 0.68 \\ 0.74 \end{pmatrix}, \qquad e_2 = \begin{pmatrix} -0.74 \\ 0.68 \end{pmatrix}$

Page 12: Lec 6 Data Analysis


Multi-Dimensional Scaling

Given: For n unknown samples in high-dimensional space

$X = (x_1, \dots, x_n) \in \mathbb{R}^{d \times n}$

we are given a matrix of pairwise (squared) distances:

$D_{i,j} = \lVert x_i - x_j \rVert^2, \quad D \in \mathbb{R}^{n \times n}$

Page 13: Lec 6 Data Analysis


Multi-Dimensional Scaling

samples in some abstract space:

$X = (x_1, \dots, x_n), \quad x_i \in \mathcal{A}$

matrix of pairwise abstract distances:

$D_{i,j}, \quad D \in \mathbb{R}^{n \times n}$

Page 14: Lec 6 Data Analysis


Multi-Dimensional Scaling

Goal: find an embedding $\hat{X}$ of X in a low-dimensional space such that the pairwise (variations of) distances are preserved:

$\Phi(D, \hat{D}) = \lVert J (D - \hat{D}) J^T \rVert_F^2$

Other measures $\Phi(D, \hat{D})$ are possible, but they cannot be solved as easily.

Page 15: Lec 6 Data Analysis


Multi-Dimensional Scaling

closed form solution:

the first q eigenvectors $v_1, \dots, v_q$ of the matrix

$-\frac{1}{2}\, J D J^T \in \mathbb{R}^{n \times n}$

define the coordinates of a q-dimensional embedding:

$X' = \begin{pmatrix} \sqrt{\lambda_1}\, v_1^T \\ \vdots \\ \sqrt{\lambda_q}\, v_q^T \end{pmatrix} \in \mathbb{R}^{q \times n}$
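A sketch of classical MDS following the closed-form solution (it assumes D holds squared Euclidean-like distances; with strongly non-Euclidean D the top eigenvalues can turn negative, which this sketch ignores):

```python
import numpy as np

def classical_mds(D, q):
    """Embed n points in R^q from an n x n matrix of squared pairwise distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D @ J.T                  # double-centered matrix -1/2 J D J^T
    lam, V = np.linalg.eigh(B)
    order = np.argsort(lam)[::-1][:q]       # q largest eigenvalues
    return (np.sqrt(lam[order]) * V[:, order]).T   # q x n coordinates
```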

Page 16: Lec 6 Data Analysis


Multi-Dimensional Scaling

Page 17: Lec 6 Data Analysis


Motivation

• Given: n samples in d-dimensional space

$X = (x_1, \dots, x_n) \in \mathbb{R}^{d \times n}$

• Decrease n → clustering: k-means, EM, mean shift, spectral clustering, hierarchical clustering

Page 18: Lec 6 Data Analysis


Cluster Analysis

• Task: Given a set of observations / data samples, group them into clusters so that observations in the same cluster are similar.

Page 19: Lec 6 Data Analysis


Cluster Analysis


Page 20: Lec 6 Data Analysis


k-means Clustering

• Idea: partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean.

• Given: data samples $x_1, \dots, x_n \in \mathbb{R}^d$

• Goal: partition the n samples into k sets (k ≤ n) $S_1, S_2, \dots, S_k$ such that

$\sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - \mu_i \rVert^2$

is minimized, where $\mu_i$ is the mean of the points in $S_i$.

Page 21: Lec 6 Data Analysis


k-means Clustering

• Two-step algorithm:

  Assignment step: assign each sample to the cluster with the closest mean (Voronoi diagram)

  $S_i^{(t)} = \left\{ x_j : \lVert x_j - m_i^{(t)} \rVert^2 \le \lVert x_j - m_{i^*}^{(t)} \rVert^2 \;\; \forall\, i^* = 1, \dots, k \right\}$

  Update step: calculate the new means to be the centroids of the observations in the cluster

  $m_i^{(t+1)} = \frac{1}{|S_i^{(t)}|} \sum_{x_j \in S_i^{(t)}} x_j$

  Iterate until convergence (assignments no longer change)
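A minimal sketch of this two-step (Lloyd) iteration; here samples are rows of X (transposed relative to the slides' convention), and empty clusters are not handled:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """k-means on an n x d array X: alternate assignment and update steps."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), k, replace=False)]       # init: k random samples
    for _ in range(iters):
        # assignment step: each sample joins the cluster with the closest mean
        d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # update step: means become the centroids of their clusters
        new_means = np.array([X[labels == i].mean(axis=0) for i in range(k)])
        if np.allclose(new_means, means):                 # assignments stable
            break
        means = new_means
    return labels, means
```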

Page 22–35: Lec 6 Data Analysis

k-means Clustering

[Figure sequence: the assignment and update steps alternate over several iterations until the clustering converges]

Page 36: Lec 6 Data Analysis


k-means Clustering - Comments

• Advantages:

  Efficient

  Always converges to a solution

• Drawbacks:

  Not necessarily a globally optimal solution

  #clusters k is an input parameter

  Sensitive to initial clusters

  Cluster model: data is split halfway between cluster means

Page 37: Lec 6 Data Analysis


Clustering Results


Page 38: Lec 6 Data Analysis


EM Algorithm

• Expectation Maximization (EM)

• Probabilistic assignments to clusters instead of deterministic assignments

• Multivariate Gaussian distributions instead of means

Page 39: Lec 6 Data Analysis


EM Algorithm

• Given: data samples $X = (x_1, \dots, x_n), \; x_i \in \mathbb{R}^d$

• Assumption: the data was generated by k Gaussians

• Goal: fit a Gaussian mixture model (GMM) to the data X

  Find means $\mu_j$

  covariances $\Sigma_j$ of the Gaussians

  probabilities (weights) $\pi_j$ that the samples come from the Gaussian j

  ($j = 1, \dots, k$)

Page 40: Lec 6 Data Analysis


EM Algorithm – Example (1D)

• Three samples drawn from each mixture component

• Means: $\mu_1 = -2, \; \mu_2 = 2$

Page 41–43: Lec 6 Data Analysis

EM Algorithm – Example (2D)

[Figures]

Page 44: Lec 6 Data Analysis


EM Algorithm

1. Initialization: Choose initial estimates

$\mu_j^{(0)}, \; \Sigma_j^{(0)}, \; \pi_j^{(0)}, \quad j = 1, \dots, k$

and compute the initial log-likelihood

$L^{(0)} = \frac{1}{n} \sum_{i=1}^{n} \log \sum_{j=1}^{k} \pi_j^{(0)}\, \mathcal{N}(x_i \mid \mu_j^{(0)}, \Sigma_j^{(0)})$

2. E-step: Compute

$\gamma_{ij}^{(m)} = \frac{\pi_j^{(m)}\, \mathcal{N}(x_i \mid \mu_j^{(m)}, \Sigma_j^{(m)})}{\sum_{l=1}^{k} \pi_l^{(m)}\, \mathcal{N}(x_i \mid \mu_l^{(m)}, \Sigma_l^{(m)})}, \quad i = 1, \dots, n, \;\; j = 1, \dots, k$

and

$n_j^{(m)} = \sum_{i=1}^{n} \gamma_{ij}^{(m)}, \quad j = 1, \dots, k$

Page 45: Lec 6 Data Analysis


EM Algorithm

3. M-step: Compute new estimates (j = 1, …, k)

$\mu_j^{(m+1)} = \frac{1}{n_j^{(m)}} \sum_{i=1}^{n} \gamma_{ij}^{(m)}\, x_i$

$\Sigma_j^{(m+1)} = \frac{1}{n_j^{(m)}} \sum_{i=1}^{n} \gamma_{ij}^{(m)}\, (x_i - \mu_j^{(m+1)})(x_i - \mu_j^{(m+1)})^T$

$\pi_j^{(m+1)} = \frac{n_j^{(m)}}{n}$

4. Convergence check: Compute the new log-likelihood

$L^{(m+1)} = \frac{1}{n} \sum_{i=1}^{n} \log \sum_{j=1}^{k} \pi_j^{(m+1)}\, \mathcal{N}(x_i \mid \mu_j^{(m+1)}, \Sigma_j^{(m+1)})$

and stop if the increase over $L^{(m)}$ falls below a threshold; otherwise continue with the E-step.
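A sketch of steps 1–4 (initialization here uses random samples rather than the k-means centroids of the following slides; scipy's multivariate_normal is assumed available for the Gaussian densities, and a small diagonal term keeps the covariances well-conditioned):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, k, iters=100, tol=1e-6, seed=0):
    """Fit a k-component GMM to an n x d array X with EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]              # initial means
    Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(d)] * k)
    pi = np.full(k, 1.0 / k)
    L_old = -np.inf
    for _ in range(iters):
        # E-step: responsibilities gamma_ij and effective counts n_j
        p = np.stack([pi[j] * multivariate_normal.pdf(X, mu[j], Sigma[j])
                      for j in range(k)], axis=1)        # n x k
        gamma = p / p.sum(axis=1, keepdims=True)
        nj = gamma.sum(axis=0)
        # M-step: re-estimate means, covariances, weights
        mu = (gamma.T @ X) / nj[:, None]
        for j in range(k):
            Xc = X - mu[j]
            Sigma[j] = (gamma[:, j, None] * Xc).T @ Xc / nj[j] + 1e-6 * np.eye(d)
        pi = nj / n
        # convergence check on the mean log-likelihood
        L = np.log(p.sum(axis=1)).mean()
        if abs(L - L_old) < tol:
            break
        L_old = L
    return mu, Sigma, pi
```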

Page 46: Lec 6 Data Analysis


Example (2D)

Ground truth:

  Means:

  Covariance matrices:

  Weights:

• Input to the EM algorithm: 1000 samples

Page 47: Lec 6 Data Analysis


Initial Estimate

Initial density estimation: (centroids of the k-means result)

Page 48–50: Lec 6 Data Analysis

1st, 2nd and 3rd Iteration

Estimates after three iterations:

[Figures]

Page 51: Lec 6 Data Analysis


Mean Shift Clustering

• Non-parametric clustering technique

• No prior knowledge of #clusters

• No constraints on shape of clusters


Page 52: Lec 6 Data Analysis


Mean Shift Clustering - Idea

• Interpret points in feature space as samples from an empirical probability density function

• Dense regions in feature space correspond to local maxima of the underlying distribution

• For each sample: run a gradient ascent procedure on the locally estimated density until convergence

• Stationary points = maxima of the distribution

• Samples associated with the same stationary point are considered to be in the same cluster

Page 53: Lec 6 Data Analysis


Mean Shift Clustering

• Given: data samples $x_1, \dots, x_n \in \mathbb{R}^d$

• Multivariate kernel density estimate with radially symmetric kernel K(x) and window radius h:

$\hat{f}(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$

• The radially symmetric kernel is defined as

$K(x) = c_{k,d}\, k(\lVert x \rVert^2)$

where $c_{k,d}$ is a normalization constant

• Modes of the density function are located at zeros of the gradient function: $\nabla \hat{f}(x) = 0$

Page 54: Lec 6 Data Analysis


Mean Shift Clustering

Gradient of the density estimator:

$\nabla \hat{f}(x) = \frac{2 c_{k,d}}{n h^{d+2}} \sum_{i=1}^{n} (x_i - x)\, g\!\left(\left\lVert \frac{x - x_i}{h} \right\rVert^2\right) = \frac{2 c_{k,d}}{n h^{d+2}} \left[ \sum_{i=1}^{n} g\!\left(\left\lVert \frac{x - x_i}{h} \right\rVert^2\right) \right] \left[ \frac{\sum_{i=1}^{n} x_i\, g\!\left(\left\lVert \frac{x - x_i}{h} \right\rVert^2\right)}{\sum_{i=1}^{n} g\!\left(\left\lVert \frac{x - x_i}{h} \right\rVert^2\right)} - x \right]$

where $g(x) = -k'(x)$ denotes the (negative) derivative of the kernel profile $k(x)$.

Page 55: Lec 6 Data Analysis


Mean Shift Clustering

Gradient of the density estimator:

$\nabla \hat{f}(x) = \underbrace{\frac{2 c_{k,d}}{n h^{d+2}} \left[ \sum_{i=1}^{n} g\!\left(\left\lVert \frac{x - x_i}{h} \right\rVert^2\right) \right]}_{\text{proportional to density estimate at } x} \; \underbrace{\left[ \frac{\sum_{i=1}^{n} x_i\, g\!\left(\left\lVert \frac{x - x_i}{h} \right\rVert^2\right)}{\sum_{i=1}^{n} g\!\left(\left\lVert \frac{x - x_i}{h} \right\rVert^2\right)} - x \right]}_{\text{mean shift vector } m_h(x)}$

The mean shift vector $m_h(x)$ points toward the direction of maximum increase in the density.

Page 56: Lec 6 Data Analysis


Mean Shift Clustering

Mean shift procedure for sample $x_i$:

1. Compute the mean shift vector $m(x_i^t)$

2. Translate the density estimation window: $x_i^{t+1} = x_i^t + m(x_i^t)$

Iterate 1. and 2. until convergence, i.e., $\nabla \hat{f}(x_i) = 0$
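A sketch of the procedure with a Gaussian kernel profile g(r) = exp(−r/2); the bandwidth h and the tolerance are free parameters, and clusters are then formed by grouping modes that end up close to each other:

```python
import numpy as np

def mean_shift(X, h, iters=200, tol=1e-5):
    """Shift every sample of the n x d array X to a mode of the KDE."""
    modes = X.copy()
    for _ in range(iters):
        shifted = np.empty_like(modes)
        for i, x in enumerate(modes):
            r2 = ((X - x) ** 2).sum(axis=1) / h**2       # ||(x - x_i)/h||^2
            g = np.exp(-0.5 * r2)                        # kernel profile weights
            shifted[i] = (g[:, None] * X).sum(axis=0) / g.sum()  # x + m_h(x)
        if np.abs(shifted - modes).max() < tol:          # mean shift vectors ~ 0
            break
        modes = shifted
    return modes
```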

Page 57–62: Lec 6 Data Analysis

Mean Shift Clustering

[Figure sequence: the density estimation window for one sample at positions $x_i^{(0)}, x_i^{(1)}, x_i^{(2)}, x_i^{(3)}, \dots, x_i^{(n)}$, moving uphill until convergence]

Page 63: Lec 6 Data Analysis


Mean Shift - Comments

• Advantages:

  No prior knowledge of #clusters

  No constraints on shape of clusters

• Drawbacks:

  Computationally expensive: the algorithm runs for every sample, and identifying a sample's neighborhood requires multi-dimensional range search

  How to choose the bandwidth parameter h?

Page 64: Lec 6 Data Analysis


Summary

• Given: n samples in d-dimensional space

$X = (x_1, \dots, x_n) \in \mathbb{R}^{d \times n}$

• Decrease d → dimensionality reduction: PCA, MDS

• Decrease n → clustering: k-means, EM, mean shift, spectral clustering, hierarchical clustering

Page 65: Lec 6 Data Analysis


Spectral Clustering

• Model similarity between data points as graph

• Clustering: Find connected components in graph

Page 66: Lec 6 Data Analysis


Spectral Clustering

• Model similarity between data points as graph

• (weighted) Adjacency Matrix W: $w_{ij} \ge 0$ encodes the similarity between vertices i and j

• Degree Matrix D: diagonal matrix with $d_{ii} = \sum_{j} w_{ij}$

Page 67: Lec 6 Data Analysis


Spectral Clustering

• Graphs:

  Similarity graph: fully connected, models local neighborhood relations

  Gaussian kernel similarity function: $w_{ij} = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$

  K-nearest neighbour graph

  𝜀-neighbourhood graph

Page 68: Lec 6 Data Analysis


Spectral Clustering

• Model similarity between data points as graph

• (weighted) Adjacency Matrix W:

• Degree Matrix D:

• Graph Laplacian L = D – W:

Page 69: Lec 6 Data Analysis


Spectral Clustering

• Properties of the Graph Laplacian L:

  For every vector $f \in \mathbb{R}^n$: $f^T L f = \frac{1}{2} \sum_{i,j=1}^{n} w_{ij} (f_i - f_j)^2$

  L is symmetric and positive semi-definite

  The smallest eigenvalue of L is 0

  The corresponding eigenvector is the constant one vector $\mathbb{1}$

  L has n non-negative, real-valued eigenvalues $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$

Page 70: Lec 6 Data Analysis


Spectral Clustering

• The multiplicity k of the eigenvalue 0 of L equals the number of connected components in the graph

  Consider k = 1. Assume f is an eigenvector with eigenvalue 0:

  $0 = f^T L f = \frac{1}{2} \sum_{i,j=1}^{n} w_{ij} (f_i - f_j)^2$

  The sum only vanishes if all terms vanish

  If two vertices are connected (their edge weight > 0), then $f_i = f_j$

  f needs to be constant for all vertices which can be connected by a path

  All vertices of a connected component in an undirected graph can be connected by a path: f needs to be constant on the whole connected component

Page 71: Lec 6 Data Analysis


Spectral Clustering

• The Laplacian of a graph with 1 connected component has one constant vector with eigenvalue 0

• For k > 1: w.l.o.g. assume that the vertices are ordered according to connected components; then L has block diagonal form with blocks $L_1, \dots, L_k$

• Each $L_i$ is a graph Laplacian of a fully connected graph: each $L_i$ has one eigenvalue 0 with the constant one vector on the i-th connected component

• The spectrum of L is given by the union of the spectra of the $L_i$

Page 72: Lec 6 Data Analysis


Spectral Clustering

• Graph:

• Graph Laplacian

• Eigenvectors for eigenvalues

Page 73: Lec 6 Data Analysis


Spectral Clustering

• Graph:

• Project vertices into the subspace spanned by the k eigenvectors

• Projected vertices:

• K-means clustering recovers the connected components (see the sketch below)

  Embedding is the same regardless of data ordering
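A sketch of this pipeline with the unnormalized Laplacian L = D − W; it reuses the kmeans sketch from the k-means slides above, and W is assumed to be a symmetric similarity matrix:

```python
import numpy as np

def spectral_clustering(W, k):
    """Embed vertices with the first k eigenvectors of L = D - W, then cluster."""
    D = np.diag(W.sum(axis=1))       # degree matrix
    L = D - W                        # graph Laplacian
    lam, V = np.linalg.eigh(L)       # eigenvalues in ascending order
    U = V[:, :k]                     # first k eigenvectors, one row per vertex
    labels, _ = kmeans(U, k)         # k-means on the embedded vertices
    return labels
```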

Page 74: Lec 6 Data Analysis


Spectral Clustering

• Similarity Graph:

• W =

Page 75: Lec 6 Data Analysis


Spectral Clustering

• Similarity Graph:

• L =

• Eigenvalues: 0, 0.4, 2, 2

• Eigenvectors:

Page 76: Lec 6 Data Analysis


Spectral Clustering

• Similarity Graph:

• For a fully connected graph we want to find the Min-Cut:

  Partition the graph into 2 sets of vertices A and B such that the weight of the edges connecting them is minimal:

  $\mathrm{cut}(A, B) = \sum_{i \in A,\, j \in B} w_{ij}$

  Vertices in each set should be similar to vertices in the same set, but dissimilar to vertices from the other set

  Partitions are often not balanced: isolated vertices

Page 77: Lec 6 Data Analysis


Spectral Clustering

• Similarity Graph:

• For a fully connected graph we want to find the Normalized Cut:

  Partition the graph into 2 sets of vertices such that the weight of the edges connecting them is minimal

  Partitions should have similar size

Page 78: Lec 6 Data Analysis


Spectral Clustering

• Min-Cut: minimize $\mathrm{cut}(A, B) = \sum_{i \in A,\, j \in B} w_{ij}$

• Normalized Cut: minimize $\mathrm{Ncut}(A, B) = \mathrm{cut}(A, B) \left( \frac{1}{\mathrm{vol}(A)} + \frac{1}{\mathrm{vol}(B)} \right)$ with $\mathrm{vol}(A) = \sum_{i \in A} d_{ii}$

  the balancing factor is minimal if $\mathrm{vol}(A) = \mathrm{vol}(B)$

Page 79: Lec 6 Data Analysis


Spectral Clustering

• Reformulate with the Graph Laplacian

• Construct f:

  $f_i = \begin{cases} \sqrt{\mathrm{vol}(B)/\mathrm{vol}(A)} & i \in A \\ -\sqrt{\mathrm{vol}(A)/\mathrm{vol}(B)} & i \in B \end{cases}$

  then $f^T L f = \mathrm{vol}(V) \cdot \mathrm{Ncut}(A, B)$, $(Df)^T \mathbb{1} = 0$, and $f^T D f = \mathrm{vol}(V)$

Page 80: Lec 6 Data Analysis


Spectral Clustering

• Reformulate Ncut:

• Minimize $f^T L f$ subject to $Df \perp \mathbb{1}$, $f^T D f = \mathrm{vol}(V)$

  Partition (cluster) assignment by thresholding f at 0

  NP-hard to compute since f is discrete

  Relax the problem by allowing f to take arbitrary real values

  Solution: second eigenvector of $L' = D^{-1} L$ (normalized Graph Laplacian)

• For k > 2 we can similarly construct indicator vectors like f and relax the problem for minimization:

  Project the vertices into the subspace spanned by the first k eigenvectors of L'

  Clustering the embedded vertices yields the solution

• Spectral clustering (with the normalized Graph Laplacian) approximates Ncut

Page 81: Lec 6 Data Analysis


Spectral Clustering

[Comparison figures: Mean Shift, Spectral Clustering, K-Means]

Page 82: Lec 6 Data Analysis


Spectral Clustering

• Summary:

  Useful for non-convex clustering problems

  Computation intensive because of the eigenvalue computation (for large matrices)

  Choice of k necessary: a heuristic can be used that tries to find jumps in the eigenvalues (eigengap)

  Similarity has to be defined for graph construction: size of the Gaussian kernel? size of the neighbourhood?

Page 83: Lec 6 Data Analysis


Hierarchical Clustering

• Bottom up:

  Each data point is its own cluster

  Greedily merge clusters according to some criterion

Page 84: Lec 6 Data Analysis


Hierarchical Clustering

• Requirements:

  Metric: distance d(x, y) between data points

  Linkage: distance between data point sets A, B:

  Maximum linkage: $\max_{x \in A,\, y \in B} d(x, y)$

  Average linkage: $\frac{1}{|A|\,|B|} \sum_{x \in A} \sum_{y \in B} d(x, y)$

  Ward linkage: the increase in total within-cluster variance caused by merging A and B

Page 85: Lec 6 Data Analysis


Hierarchical Clustering

• Algorithm (see the sketch below):

  Start out with a cluster for each data point

  Merge the two clusters that result in the least increase in the linkage criterion

  Repeat until k clusters remain

• Maximum linkage: minimizes the maximal distance of data points in each cluster

• Average linkage: minimizes the average distance of data points in each cluster

• Ward linkage: minimizes intra-cluster variance
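A naive sketch of the bottom-up algorithm with average linkage (cubic time, fine for small n; the function name is ours):

```python
import numpy as np

def agglomerative(X, k):
    """Greedy average-linkage clustering of an n x d array X into k clusters."""
    clusters = [[i] for i in range(len(X))]
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # pairwise distances
    while len(clusters) > k:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average linkage between clusters a and b
                link = D[np.ix_(clusters[a], clusters[b])].mean()
                if link < best:
                    best, pair = link, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)                   # merge the closest pair
    return clusters
```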

Page 86: Lec 6 Data Analysis


Hierarchical Clustering

• We can add connectivity constraints that enforce which clusters can be merged

Page 87: Lec 6 Data Analysis


Hierarchical Clustering

• Summary:

  Flexible: any pairwise distance can be used

  Choice of k, distance, and linkage necessary

  Instead of specifying k we can use a heuristic which stops cluster merging if the linkage increases too much

  Given connectivity constraints, hierarchical clustering scales well for large numbers of data points

  How do we choose connectivity constraints? K-nearest neighbour graph, 𝜀-neighbourhood graph