Introduction to Data Miningukang/courses/19S-DM/L20-graphs2.pdf · Spectral Graph Partitioning ... ith coordinate of A x : Sum of the x-values of neighbors of i Make this a new value

1U Kang

Introduction to Data Mining

Lecture #20: Analysis of Large Graphs-2

U KangSeoul National University

2U Kang

In This Lecture

Learn the min-cut problem in graphs, and its solutions

Learn an example of spectral graph theory (how linear algebra and graph problems interact)

Learn how to find small communities in graphs

3U Kang

Outline

Spectral Clustering

Small Communities

4U Kang

Graph Partitioning

Undirected graph 𝑮(𝑽, 𝑬):

Bi-partitioning task: Divide vertices into two disjoint groups 𝑨,𝑩

Questions: How can we define a “good” partition of 𝑮? How can we efficiently identify such a partition?

1

32

5

46

A B

1

3

2

5

46

5U Kang

Graph Partitioning

What makes a good partition?

Maximize the number of within-group connections

Minimize the number of between-group connections

1

3

2

5

46

A B

6U Kang

A B

Graph Cuts

Express partitioning objectives as a function of the “edge cut” of the partition

Cut: Set of edges with only one vertex in a group:

cut(A,B) = 21

3

2

5

46

7U Kang

Graph Cut Criterion

Criterion: Minimum-cut Minimize weight of connections between groups

Degenerate case:

Problem: Only considers external cluster connections Does not consider internal cluster connectivity

arg minA,B cut(A,B)

“Optimal cut”

Minimum cut

8U Kang

Graph Cut Criteria

Criterion: Normalized-cut [Shi-Malik, ’97]

Connectivity between groups relative to the density of each group

𝒗𝒐𝒍(𝑨): total weight of the edges with at least one endpoint in 𝑨: 𝒗𝒐𝒍 𝑨 = σ𝒊∈𝑨𝒌𝒊

Why use this criterion?

Produces more balanced partitions

How do we efficiently find a good partition?

Problem: Computing optimal normalized cut is NP-hard

[Shi-Malik]

9U Kang

Spectral Graph Partitioning

A: adjacency matrix of undirected G Aij =1 if (𝒊, 𝒋) is an edge, else 0

x: a vector in n with components (𝒙𝟏, … , 𝒙𝒏) Think of it as a label/value of each node of 𝑮

What is the meaning of A x?

Entry yi is a sum of labels xj of neighbors of i

10U Kang

What is the meaning of Ax?

ith coordinate of A x :

Sum of the x-values of neighbors of i

Make this a new value at node i

Spectral Graph Theory:

Analyze the “spectrum” of matrix representing 𝑮

Spectrum: Eigenvectors 𝒙𝒊 of a graph, ordered by the magnitude (strength) of their corresponding eigenvalues 𝝀𝒊:

𝑨 ⋅ 𝒙 = 𝝀 ⋅ 𝒙

11U Kang

Example: d-regular graph

Suppose all nodes in 𝑮 have degree 𝒅and 𝑮 is connected

What are some eigenvalues/vectors of 𝑮?

𝑨 𝒙 = 𝝀 ⋅ 𝒙 What is ? What x?

Let’s try: 𝒙 = (𝟏, 𝟏,… , 𝟏)

Then: 𝑨 ⋅ 𝒙 = 𝒅, 𝒅,… , 𝒅 = 𝝀 ⋅ 𝒙. So: 𝝀 = 𝒅

We found eigenpair of 𝑮: 𝒙 = (𝟏, 𝟏,… , 𝟏), 𝝀 = 𝒅

Remember the meaning of 𝒚 = 𝑨 𝒙:

definition of

d-regular graph

12U Kang

Matrix Representations

Adjacency matrix (A): n n matrix

A=[aij], aij=1 if edge between node i and j

1

3

2

5

46

1 2 3 4 5 6

1 0 1 1 0 1 0

2 1 0 1 0 0 0

3 1 1 0 1 0 0

4 0 0 1 0 1 1

5 1 0 0 1 0 1

6 0 0 0 1 1 0

13U Kang


Degree matrix (D): n n diagonal matrix

D=[dii], dii = degree of node i

1

3

2

5

46

1 2 3 4 5 6

1 3 0 0 0 0 0

2 0 2 0 0 0 0

3 0 0 3 0 0 0

4 0 0 0 3 0 0

5 0 0 0 0 3 0

6 0 0 0 0 0 2

14U Kang


Laplacian matrix (L): n n symmetric matrix

What is trivial eigenpair? 𝒙 = (𝟏,… , 𝟏) then 𝑳 ⋅ 𝒙 = 𝟎 and so 𝝀 = 𝝀𝟏 = 𝟎

Important properties of L: Eigenvalues are non-negative real numbers

Eigenvectors are real and orthogonal

𝑳 = 𝑫 − 𝑨

1

3

2

5

46

1 2 3 4 5 6

1 3 -1 -1 0 -1 0

2 -1 2 -1 0 0 0

3 -1 -1 3 -1 0 0

4 0 0 -1 3 -1 -1

5 -1 0 0 -1 3 -1

6 0 0 0 -1 -1 2

15U Kang

Our Goal

Recall: our goal is to find the minimum cut (or normalized cut)

1

3

2

5

46

A B

16U Kang

New Formulation of Min Cut [Fiedler’73]

Express partition (A,B) as a vector

𝒚𝒊 = ቊ+𝟏−𝟏

𝒊𝒇 𝒊 ∈ 𝑨𝒊𝒇 𝒊 ∈ 𝑩

Problem 1: Find a non-trivial vector y that minimizes f(y):

Problem 1 is equivalent to finding the min-cut (A,B) Why?

17U Kang

Connection of min-cut problem and Laplacian L

Why?

yTL y = σ𝑖,𝑗=1𝑛 𝐿𝑖𝑗 𝑦𝑖𝑦𝑗 = σ𝑖,𝑗=1

𝑛 𝐷𝑖𝑗 − 𝐴𝑖𝑗 𝑦𝑖𝑦𝑗

= σ𝑖𝐷𝑖𝑖𝑦𝑖2 −σ 𝑖,𝑗 ∈𝐸 2𝑦𝑖𝑦𝑗

= σ 𝑖,𝑗 ∈𝐸(𝑦𝑖2 + 𝑦𝑗

2 − 2𝑦𝑖𝑦𝑗) = σ 𝒊,𝒋 ∈𝑬 𝒚𝒊 − 𝒚𝒋𝟐


18U Kang

Until now: finding the min-cut is equiv. to solving the following problem

But, the problem is NP-hard

So, we relax the constraint to make it tractable 𝒚𝒊 can be any number

Surprisingly, the solution of the problem is tightly connected with the eigenvector of L Detail: next slide


19U Kang

Rayleigh Theorem

𝝀𝟐 = 𝐦𝐢𝐧𝐲 𝒇 𝒚 : The minimum value of 𝒇(𝒚) is

given by the 2nd smallest eigenvalue λ2 of the Laplacian matrix L

𝐱 = 𝐚𝐫𝐠𝐦𝐢𝐧𝐲 𝒇 𝒚 : The optimal solution for y is

given by the corresponding eigenvector 𝒙, referred as the Fiedler vector

20U Kang

Rayleigh Theorem

In fact, the minimum solution is given by y = 1 vector (the smallest eigenvector w/ eigenvalue 0); however, this does not say anything about the partition

Thus, we find the next best solution which is the Fiedler vector

Let the Fiedler vector = (𝛼1, … 𝛼𝑛). Then,σ𝛼𝑖

2 = 1 (unit length), σ𝛼𝑖 = 0 (orthogonal to the first eigenvector).

x

21U Kang

So far…

How to define a “good” partition of a graph?

Minimize a given graph cut criterion

How to efficiently identify such a partition?

Approximate using information provided by the eigenvalues and eigenvectors of Laplacian

Spectral Clustering

22U Kang

Spectral Clustering Algorithms

Three basic stages:

1) Pre-processing

Construct a matrix representation of the graph

2) Decomposition

Compute eigenvalues and eigenvectors of Laplacian

Map each point to a lower-dimensional representation based on one or more eigenvectors

3) Grouping

Assign points to two or more clusters, based on the new representation

23U Kang

Spectral Partitioning Algorithm

1) Pre-processing: Build Laplacian

matrix L of the graph

2) Decomposition: Find eigenvalues

and eigenvectors xof the matrix L

Map vertices to corresponding components of x

0.0-0.4-0.40.4-0.60.4

0.50.4-0.2-0.5-0.30.4

-0.50.40.60.1-0.30.4

0.5-0.40.60.10.30.4

0.00.4-0.40.40.60.4

-0.5-0.4-0.2-0.50.30.4

5.0

4.0

3.0

3.0

1.0

0.0

= X =

-

0.66

-

0.35

-

0.34

0.33

0.62

0.31

1 2 3 4 5 6

1 3 -1 -1 0 -1 0

2 -1 2 -1 0 0 0

3 -1 -1 3 -1 0 0

4 0 0 -1 3 -1 -1

5 -1 0 0 -1 3 -1

6 0 0 0 -1 -1 2

24U Kang

Spectral Partitioning

3) Grouping: Sort components of reduced 1-dimensional vector

Identify clusters by splitting the sorted vector in two

How to choose a splitting point? Naïve approaches:

Split at 0 or median value

More expensive approaches: Attempt to minimize normalized cut in 1-dimension

(sweep over ordering of nodes induced by the eigenvector)

-0.66

-0.35

-0.34

0.33

0.62

0.31 Split at 0:

Cluster A: Positive points

Cluster B: Negative points

0.33

0.62

0.31

-0.66

-0.35

-0.34

A B

25U Kang

Example: Spectral Partitioning

Rank in x2

Valu

e o

f x

2

26U Kang

Example: Spectral Partitioning

Rank in x2

Valu

e o

f x

2

Components of x2

27U Kang

Example: Spectral partitioningComponents of x1

Components of x3

28U Kang

k-Way Spectral Clustering

How do we partition a graph into k clusters?

Two basic approaches:

Recursive bi-partitioning [Hagen et al., ’92]

Recursively apply bi-partitioning algorithm in a hierarchical divisive manner

Disadvantages: Inefficient, unstable

Cluster multiple eigenvectors [Shi-Malik, ’00]

Build a reduced space from multiple eigenvectors

Commonly used in recent papers

A preferable approach…

29U Kang

Outline

Spectral Clustering

Small Communities

30U Kang

Small Communities in Web

What is the signature of a community / discussion in a Web graph?

[Kumar et al. ‘99]

Dense 2-layer graph

Intuition: Many people all talking about the same things

… …

Use this to define “topics”:

What the same people on

the left talk about on the right

Remember HITS!

31U Kang

Searching for Small Communities

A more well-defined problem:Enumerate complete bipartite subgraphs Ks,t Where Ks,t : s nodes on the “left” where each links to

the same t other nodes on the “right”

K3,4

|X| = s = 3

|Y| = t = 4X Y

Fully connected

32U Kang

Frequent Itemset Enumeration

Market basket analysis. Setting:

Market: Universe U of n items

Baskets: m subsets of U: S1, S2, …, Sm U

(Si is a set of items one person bought)

Support: minimum number of occurrence f

Goal:

Find all subsets T appearing in at least f sets Si

(items in T were bought together at least f times)

What’s the connection between the itemsets and complete bipartite graphs?

[Agrawal-Srikant ‘99]

33U Kang

From Itemsets to Bipartite Ks,t

Frequent itemsets = complete bipartite graphs!

How?

View each node i as a set Si of nodes i points to

Ks,t = a set Y of size tthat occurs in s sets Si

Looking for Ks,t findingfrequent sets of size t

with support s


ib

c

d

a

Si={a,b,c,d}

j

i

k

b

c

d

a

X Y

s … minimum support (|X|=s)

t … itemset size (|Y|=t)

34U Kang

From Itemsets to Bipartite Ks,t


ib

c

d

a

Si={a,b,c,d}

x

y

z

b

c

a

X Y

Find frequent itemsets:

s … minimum support

t … itemset size

xb

c

a

We found Ks,t! Ks,t = a set Y of size tthat occurs in s sets Si

View each node i as a set Si of nodes i points to

Say we find a frequent itemset Y={a,b,c} of supp sSo, there are s nodes that link to all of {a,b,c}:

za

b

c

yb

c

a

35U Kang

Example (1)

Minimum support s=2

{b,d}: support 3

{e,f}: support 2

And we just found 2 bipartite subgraphs:

c

a b

d

f

Itemsets:

a = {b,c,d}

b = {d}

c = {b,d,e,f}

d = {e,f}

e = {b,d}

f = {}

e

c

a b

d

e

c

d

fe

36U Kang

Example (2)

Example of a community from a web graph

Nodes on the right Nodes on the left

[Kumar, Raghavan, Rajagopalan, Tomkins: Trawling the Web for emerging cyber-communities 1999]

37U Kang

What You Need to Know

Min-cut problem

NP-hard graph theoretic problem

Can be formulated with an equivalent matrix problem which is also NP-hard

Relaxing constraints on the matrix problem leads to spectral clustering algorithm

Finding small communities in graphs

Can be solved by frequent itemset mining

38U Kang

Questions?

Documents

Introduction to Data Miningukang/courses/19S-DM/L20-graphs2.pdf · Spectral Graph Partitioning ... ith coordinate of A x : Sum of the x-values of neighbors of i Make this a new value