Node and Graph Similarity: Theory and Applications
Danai Koutra (CMU), Tina Eliassi-Rad (Rutgers), Christos Faloutsos (CMU)
Dept. of Computer Science, Rutgers
ICDM 2014, Monday, December 15th, 2014, Shenzhen, China
Copyright for the tutorial materials is held by the authors. The authors grant IEEE ICDM permission to distribute the materials through its website.


Part 2a
Graph Similarity: known node correspondence

What to remember
• Numerous applications:
  – Network monitoring, anomaly detection, network intrusion, behavioral studies
• Although it seems like an easy problem, it is not!
  – Some measures are counter-intuitive.
  – DeltaCon [Koutra+, SDM’13] (based on node proximity) satisfies several intuitive properties.
• There are multiple measures, but which one to use?
  – Depends on the application!
  – Good news according to the guide of [Soundarajan+, SDM’14]!
Roadmap
• Known node correspondence
  – Simple features
  – Complex features
  – Visualization
  – Summary
• Unknown node correspondence

Problem Definition: Graph Similarity
• Given: (a) two graphs, GA and GB, with the same nodes and different edge sets; (b) the node correspondence
• Find: a similarity score s ∈ [0,1]
  – s = 0: GA <> GB (the graphs are completely different)
  – s = 1: GA == GB (the graphs are identical)

Applications
1. Classification (different brain wiring?)
2. Discontinuity detection (sequence of graphs: Day 1, Day 2, Day 3, Day 4, Day 5)
3. Behavioral patterns (FB message graph vs. wall-to-wall network)
4. Intrusion detection

Roadmap (next: Known node correspondence, Simple features)

Is there any obvious solution?

One Solution
Edge Overlap (EO): number of common edges between GA and GB (normalized or not)

… but “barbell”…
EO(B10, mB10) == EO(B10, mmB10)
[Figure: barbell graph GA compared with two modified versions, GB and GB’]
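The barbell observation can be reproduced with a minimal sketch. It assumes undirected, unweighted graphs given as edge lists and a Jaccard-style normalization (common edges over the union of edges); the slides only say "normalized or not", so the normalization, the helper names `edge_set`, `edge_overlap`, and `barbell`, and the 10-node barbell built here are illustrative choices, not the tutorial's exact setup.

```python
from itertools import combinations

def edge_set(edges):
    """Store undirected edges as frozensets so that (u, v) equals (v, u)."""
    return {frozenset(e) for e in edges}

def edge_overlap(edges_a, edges_b, normalized=True):
    """Number of common edges, optionally divided by the size of the edge union."""
    a, b = edge_set(edges_a), edge_set(edges_b)
    common = len(a & b)
    if not normalized:
        return common
    union = len(a | b)
    return common / union if union else 1.0

def barbell(k):
    """Two k-cliques (nodes 0..k-1 and k..2k-1) joined by the bridge (k-1, k)."""
    left = list(combinations(range(k), 2))
    right = list(combinations(range(k, 2 * k), 2))
    return left + right + [(k - 1, k)]

g = barbell(5)                                   # 10 nodes, 21 edges
no_bridge = [e for e in g if e != (4, 5)]        # delete the bridge
no_clique_edge = [e for e in g if e != (0, 1)]   # delete an edge inside a clique

score_bridge = edge_overlap(g, no_bridge)
score_clique = edge_overlap(g, no_clique_edge)
print(score_bridge, score_clique)
```

Both scores are 20/21, although deleting the bridge disconnects the graph and deleting the clique edge barely changes connectivity. Edge overlap only counts edges; it does not see which edge was changed.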

Other solutions?

Similar if …
1. “… they share many vertices and/or edges”: Vertex/Edge Overlap, O(|V|+|V’|+|E|+|E’|)
2. “… the rankings of their vertices are similar”: Vertex Ranking, O(|V|+|V’|); VR = rank correlation of node PageRank
3. “… their edge weights are similar”: Weighted distance, O(|E|+|E’|)
[Papadimitriou, Dasdan, Garcia-Molina ’10; Bunke ’06, Shoubridge+ ’02, Dickinson+ ’04]
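As an illustration of the first measure, here is a minimal sketch of vertex/edge overlap. It assumes the usual form from Papadimitriou et al., 2(|V∩V’| + |E∩E’|) / (|V|+|V’|+|E|+|E’|), with directed edges stored as `(u, v)` tuples; the function name and the toy graphs are hypothetical.

```python
def vertex_edge_overlap(nodes_a, edges_a, nodes_b, edges_b):
    """Shared vertices and edges, normalized by the total size of both graphs."""
    va, vb = set(nodes_a), set(nodes_b)
    ea, eb = set(edges_a), set(edges_b)   # directed edges as (u, v) tuples
    total = len(va) + len(vb) + len(ea) + len(eb)
    if total == 0:
        return 1.0                        # two empty graphs are treated as identical
    return 2.0 * (len(va & vb) + len(ea & eb)) / total

nodes_a, edges_a = [1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)]
nodes_b, edges_b = [1, 2, 3, 5], [(1, 2), (2, 3), (3, 5)]

veo = vertex_edge_overlap(nodes_a, edges_a, nodes_b, edges_b)
print(veo)   # 2 * (3 + 2) / 14
```

Each set operation is linear in the sizes of the vertex and edge sets, which matches the O(|V|+|V’|+|E|+|E’|) cost quoted on the slide.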

Similar if …
4. “… they have similar subgraphs”: Maximum Common Subgraph (NP-complete); Vertex MCS Distance, Edge MCS Distance
5. “… we need few node/edge additions/deletions to transform GA to GB”: (weighted) Graph Edit Distance
[Bunke ’06, Shoubridge+ ’02, Dickinson+ ’04] [Bunke+ ’98, ’06, Riesen ’09, Gao ’10, Fankhauser ’11; Kapsabelis+ ’07]
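When the node correspondence is known, the edit distance no longer needs a search over node mappings. The sketch below counts the vertices and edges that appear in exactly one of the two graphs. It assumes unit cost for every insertion or deletion and undirected edges; the function name and toy graphs are hypothetical, and the weighted variants in the cited work use other cost functions.

```python
def graph_edit_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Unit-cost edit distance for graphs whose nodes are already matched by id."""
    va, vb = set(nodes_a), set(nodes_b)
    ea = {frozenset(e) for e in edges_a}   # undirected edges
    eb = {frozenset(e) for e in edges_b}
    # Symmetric difference = items to delete from A plus items to add from B.
    return len(va ^ vb) + len(ea ^ eb)

nodes_a, edges_a = [1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)]
nodes_b, edges_b = [1, 2, 3, 5], [(1, 2), (2, 3), (3, 5)]

ged = graph_edit_distance(nodes_a, edges_a, nodes_b, edges_b)
print(ged)   # node 4 and edge 3-4 removed, node 5 and edge 3-5 added
```

This is a distance, not a similarity: identical graphs give 0, and the value grows with every differing vertex or edge.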

Similar if …
6. “… they have similar fingerprints”: Signature similarity
b-bit fingerprint of GA: 1 0 1 0 1
b-bit fingerprint of GB: 0 0 1 0 1
Hamming Distance: 1
[Papadimitriou, Dasdan, Garcia-Molina ’10]
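The comparison step on this slide can be sketched as follows. The code assumes the b-bit fingerprints have already been computed (the hashing of graph features into a fingerprint is not shown) and that the similarity is taken as 1 minus the Hamming distance divided by b; the function names are hypothetical.

```python
def hamming_distance(fp_a, fp_b):
    """Number of bit positions in which two equal-length fingerprints differ."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same number of bits")
    return sum(x != y for x, y in zip(fp_a, fp_b))

def signature_similarity(fp_a, fp_b):
    """Fraction of bit positions on which the two fingerprints agree."""
    return 1.0 - hamming_distance(fp_a, fp_b) / len(fp_a)

fp_a = [1, 0, 1, 0, 1]   # the two fingerprints shown on the slide
fp_b = [0, 0, 1, 0, 1]

distance = hamming_distance(fp_a, fp_b)
similarity = signature_similarity(fp_a, fp_b)
print(distance, similarity)
```

For the two fingerprints on the slide, the distance is 1, so 4 of the 5 bits agree.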

Event Detection
[Figure: MCS distance (|G| = |V|) per day]
[Bunke+ ’06]

Application: Web graph anomaly detection
[Papadimitriou, Dasdan, Garcia-Molina ’10]

Roadmap (next: Known node correspondence, Complex features)

Graph Kernels: Idea
1) Compute graph substructures in polynomial time
2) Compare them to find sim(GA, GB)
Source: http://mloss.org/software/view/139/ [Vishwanathan]

Fast Subtree Kernel
[Shervashidze+ ’09 NIPS, JMLR’11] O(m h) per graph pair
Based on the Weisfeiler-Lehman algorithm (test for isomorphism). Steps on labeled graphs:
• Sorted list of neighbors
• Label compression (hash function on sorted strings)
• Relabeling
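The relabeling loop can be sketched in a few lines. This is a simplified illustration of a Weisfeiler-Lehman subtree kernel, not the authors' optimized O(m h) implementation: it sorts neighbor labels with Python's built-in sort, uses a dictionary as the compression function, and adds up products of label counts over iterations 0..h. The function name, the adjacency-dict input format, and the toy graphs are assumptions.

```python
from collections import Counter

def wl_subtree_kernel(adj_a, labels_a, adj_b, labels_b, h=2):
    """Sum over h+1 rounds of the dot product of the two label histograms."""
    graphs = [(adj_a, dict(labels_a)), (adj_b, dict(labels_b))]
    kernel = 0
    for iteration in range(h + 1):
        count_a = Counter(graphs[0][1].values())
        count_b = Counter(graphs[1][1].values())
        kernel += sum(count_a[label] * count_b[label] for label in count_a)
        if iteration == h:
            break
        table = {}          # compression table shared by both graphs
        relabeled = []
        for adj, labels in graphs:
            new_labels = {}
            for node, neighbors in adj.items():
                # Signature: own label plus the sorted list of neighbor labels.
                signature = (labels[node],
                             tuple(sorted(labels[n] for n in neighbors)))
                # Compress each distinct signature to a short new label.
                new_labels[node] = table.setdefault(
                    signature, (iteration, len(table)))
            relabeled.append((adj, new_labels))
        graphs = relabeled
    return kernel

path = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
uniform = {0: "x", 1: "x", 2: "x"}

k_path_triangle = wl_subtree_kernel(path, uniform, triangle, uniform, h=1)
k_path_path = wl_subtree_kernel(path, uniform, path, uniform, h=1)
print(k_path_triangle, k_path_path)
```

With identical initial labels, the first round cannot tell the path from the triangle. After one relabeling, only the middle node of the path shares a compressed label with the triangle's nodes, so the path-triangle value (12) is lower than the path-path value (14).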

Graph kernels: Applications
• Aligning chemical compounds
• Function prediction
[Ralaivola+ ’05, Borgwardt+ ’05]
Source: http://www.ra.cs.uni-tuebingen.de/forschung/molsim/welcome_e.html

Other Graph Kernels
• RWR [Kashima+ ’03, Gaertner+ ’03, Vishwanathan ’10]
• Shortest path kernels [Borgwardt & Kriegel ’05]
• Cyclic path kernels [Horvath+ ’04]
• Depth-first search kernels [Swamidass+ ’05]
• Subtree kernels [Shervashidze+ ’09 NIPS, JMLR’11, Ralaivola+ ’05]
• Graphlet / Subgraph kernels [Shervashidze+ ’09, Thoma+ ’10]
• All-paths kernels [Airola+ ’08]
• …

… Many similarity functions can be defined…
What properties should a good similarity function have?

Axioms
A1. Identity property: sim(GA, GA) = 1
A2. Symmetric property: sim(GA, GB) = sim(GB, GA)
A3. Zero property: sim(GA, GB) = 0 (in the cited paper, GA is a complete graph and GB an empty graph, and the value is approached as the number of nodes grows)
[Koutra, Faloutsos, Vogelstein ’13]

Desired Properties
• Intuitiveness
  – P1. Edge Importance: creation of disconnected components matters more than small connectivity changes.
  – P2. Weight Awareness: the bigger the edge weight, the more the edge change matters. [Figure: removed edges with weights w=5 and w=1]
  – P3. Edge-“Submodularity” (“Diminishing Returns”): the sparser the graphs, the more important a “fixed” change is. [Figure: graph pairs GA, GB with n=5 nodes]
  – P4. Focus Awareness: targeted changes are more important than random changes of the same extent. [Figure: GA with a targeted change (GB’) and a random change (GB)]
• Scalability
[Koutra, Faloutsos, Vogelstein ’13]

How do state-of-the-art methods fare?

Metric                              P1 (importance)  P2 (weight)  P3 (returns)  P4 (focus)
Vertex/Edge Overlap                 ✗                ✗            ✗             ?
Graph Edit Distance (XOR)           ✗                ✗            ✗             ?
Signature Similarity                ✗                ✔            ✗             ?
λ-distance (adjacency matrix)       ✗                ✔            ✗             ?
λ-distance (graph Laplacian)        ✗                ✔            ✗             ?
λ-distance (normalized Laplacian)   ✗                ✔            ✗             ?

[Koutra, Vogelstein, Faloutsos ’13]

Is there a method that satisfies the properties?
Yes! DeltaCon

DELTACON (details)
① Find the pairwise node influence, SA & SB.
② Find the similarity between SA & SB.
[Koutra, Faloutsos, Vogelstein ’13]

How? Using FaBP. (Intuition)
• Sound theoretical background (MLE on marginals)
• Attenuating neighboring influence for small ε: 1-hop, 2-hops, …
Note: ε > ε² > ..., 0 < ε < 1
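For reference, the attenuation can be written out as a worked formula. The expression below is recalled from the DeltaCon paper cited on these slides, not from the slide text itself, so treat its exact form as an assumption to check against the paper. Here A is the adjacency matrix, D the diagonal degree matrix, and I the identity.

```latex
\[
S \;=\; \bigl[\, I + \epsilon^{2} D - \epsilon A \,\bigr]^{-1}
  \;\approx\; \bigl[\, I - \epsilon A \,\bigr]^{-1}
  \;=\; I + \epsilon A + \epsilon^{2} A^{2} + \dots
\]
```

The term in ε A accounts for 1-hop neighbors and the term in ε² A² for 2-hop paths. Since 0 < ε < 1, each extra hop is weighted by a further factor of ε, which is the attenuation noted above.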

Our solution: DELTACON (details)
① Find the pairwise node influence, SA & SB.
② Find the similarity between SA & SB.
Example: sim(SA, SB) = 0.3
[Koutra, Faloutsos, Vogelstein ’13]
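The two steps can be sketched in plain Python for small graphs. Several choices come from the DeltaCon paper rather than from these slides and should be read as assumptions: the affinity matrix S = [I + ε²D − εA]⁻¹, ε = 1/(1 + maximum degree) taken over both graphs, and the similarity 1/(1 + d), where d is the root Euclidean distance between SA and SB. The helper names and the barbell example are hypothetical, and the dense Gauss-Jordan inverse is for illustration only (it is the O(n²)-or-worse route, not the fast version).

```python
import math
from itertools import combinations

def adjacency(n, edges):
    """Dense symmetric 0/1 adjacency matrix for nodes 0..n-1 (no self-loops)."""
    a = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    return a

def invert(m):
    """Gauss-Jordan inverse with partial pivoting; adequate for small examples."""
    n = len(m)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def affinity(a, eps):
    """Step 1: pairwise node influence S = [I + eps^2 D - eps A]^-1."""
    n = len(a)
    deg = [sum(row) for row in a]
    m = [[(1.0 + eps * eps * deg[i] if i == j else 0.0) - eps * a[i][j]
          for j in range(n)] for i in range(n)]
    return invert(m)

def deltacon0(n, edges_a, edges_b):
    """Step 2: similarity 1 / (1 + root Euclidean distance between SA and SB)."""
    a, b = adjacency(n, edges_a), adjacency(n, edges_b)
    eps = 1.0 / (1.0 + max(sum(row) for row in a + b))
    sa, sb = affinity(a, eps), affinity(b, eps)
    dist = math.sqrt(sum(
        (math.sqrt(max(x, 0.0)) - math.sqrt(max(y, 0.0))) ** 2  # clamp round-off
        for row_a, row_b in zip(sa, sb) for x, y in zip(row_a, row_b)))
    return 1.0 / (1.0 + dist)

# Barbell: two 5-cliques joined by the bridge (4, 5).
g = (list(combinations(range(5), 2)) + list(combinations(range(5, 10), 2))
     + [(4, 5)])
sim_bridge = deltacon0(10, g, [e for e in g if e != (4, 5)])
sim_clique = deltacon0(10, g, [e for e in g if e != (0, 1)])
print(sim_bridge, sim_clique)
```

Unlike edge overlap, this score separates the two barbell changes: removing the bridge wipes out all affinity between the two cliques, so `sim_bridge` comes out lower than `sim_clique`. The example does not reproduce the 0.3 shown on the slide, which refers to a different pair of graphs.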

… but O(n²) … faster?
Details are in the paper.
Code: http://www.cs.cmu.edu/~dkoutra/CODE/deltacon.zip
[Koutra, Faloutsos, Vogelstein ’13]

Comparison of methods revisited

Metric                      P1 (edge)  P2 (weight)  P3 (returns)  P4 (focus)
Vertex/Edge Overlap         ✗          ✗            ✗             ?
Graph Edit Distance (XOR)   ✗          ✗            ✗             ?
Signature Similarity        ✗          ✔            ✗             ?
DELTACON₀                   ✔          ✔            ✔             ✔
DELTACON                    ✔          ✔            ✔             ✔

[Koutra, Faloutsos, Vogelstein ’13]

Temporal Anomaly Detection
• Nodes: employees
• Edges: email exchange
One graph per day (Day 1, Day 2, Day 3, Day 4, Day 5); sim1, sim2, sim3, sim4 are the similarities between consecutive days.
[Koutra, Faloutsos, Vogelstein ’13]
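The consecutive-day scheme can be sketched as a small pipeline. The similarity function is pluggable; the Jaccard edge overlap used here is only a stand-in for DeltaCon, and the fixed threshold, the function names, and the toy daily graphs are all hypothetical.

```python
def jaccard_edge_similarity(edges_a, edges_b):
    """Stand-in similarity: shared undirected edges over the union of edges."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def consecutive_similarities(daily_edges, sim=jaccard_edge_similarity):
    """sim1, sim2, ...: similarity between each day and the following day."""
    return [sim(g1, g2) for g1, g2 in zip(daily_edges, daily_edges[1:])]

def flag_changes(sims, threshold):
    """Indices i where the transition from day i+1 to day i+2 looks anomalous."""
    return [i for i, s in enumerate(sims) if s < threshold]

stable = [(0, 1), (1, 2), (2, 3)]
changed = [(0, 1), (0, 2), (0, 3)]
days = [stable, stable, stable, changed, changed]   # Day 1 .. Day 5

sims = consecutive_similarities(days)               # sim1 .. sim4
flagged = flag_changes(sims, threshold=0.5)
print(sims, flagged)
```

Only the third score drops, so the transition from Day 3 to Day 4 is flagged. In practice the threshold would be derived from the score series itself and not fixed by hand.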

Temporal Anomaly Detection
[Figure: similarity between consecutive days; annotated event: “Feb 4: Lay resigns”]
[Koutra, Faloutsos, Vogelstein ’13]

Brain Connectivity Graph Clustering
• 114 brain graphs
  – Nodes: 70 cortical regions
  – Edges: connections
• Attributes: gender, IQ, age…
• Result: t-test p-value = 0.0057
[Koutra, Faloutsos, Vogelstein ’13]

Roadmap (next: Known node correspondence, Visualization)

Tested Visual Encodings
[Alper+ ’13, CHI]
Augmenting the graphs / adjacency matrices to show the differences.
User study result: for bigger and sparser graphs, matrices are better.
[Figure labels: 40-80 nodes, low density]
More on visualization
• For large graphs: HoneyComb [van Ham+ ’09]
• Reference graph [Andrews ’09]
• Interactive comparison [Hascoet+ ’12]
• General principles [Gleicher+ ’11]
• …
Roadmap (next: Known node correspondence, Summary)

A Guide to Selecting a Measure
[Soundarajan, Gallagher, Eliassi-Rad. SDM’14]
[Figure labels: H1, H2, H15, H20, Hk]

A Guide to Selecting a Measure
• Q1. Are the graph similarity methods correlated?
  – Much higher than expected!
• Q2. Are there groups of methods that behave comparably?
  – Some complex methods are very similar to simpler methods.
• Q3. How can we get a single consensus method?
  – NetSimile, RWR often close to consensus.
RWR ≈ BP ≈ SSL [Koutra+ PKDD’11]
[Soundarajan, Gallagher, Eliassi-Rad. SDM’14]

Summary
• Numerous applications:
  – Network monitoring, anomaly detection, network intrusion, behavioral studies
• Although it seems like an easy problem, it is not!
  – Some measures are counter-intuitive.
  – DeltaCon [Koutra+, SDM’13] (based on node proximity) satisfies several intuitive properties.
• There are multiple measures, but which one to use?
  – Depends on the application!
  – Good news according to the guide of [Soundarajan+, SDM’14]!
References (in reverse chronological order)
• S. Soundarajan, B. Gallagher, and T. Eliassi-Rad. 2014. A Guide to Selecting a Network Similarity Method. SDM 2014.
• D. Koutra, J.T. Vogelstein, and C. Faloutsos. 2013. DELTACON: A Principled Massive-Graph Similarity Function. SDM 2013: 162-170.
• Stefan Fankhauser, Kaspar Riesen, and Horst Bunke. 2011. Speeding up graph edit distance computation through fast bipartite matching. In GbRPR'11.
• Xinbo Gao, Bing Xiao, Dacheng Tao, and Xuelong Li. 2010. A survey of graph edit distance. Pattern Anal. Appl. 13, 1 (January 2010), 113-129.
• Panagiotis Papadimitriou, Ali Dasdan, and Hector Garcia-Molina. 2010. Web Graph Similarity for Anomaly Detection. Journal of Internet Services and Applications, Volume 1 (1), pp. 19-30.
• Kaspar Riesen and Horst Bunke. 2009. Approximate graph edit distance computation by means of bipartite graph matching.
• Kelly Marie Kapsabelis, Peter John Dickinson, and Kutluyil Dogancay. 2007. Investigation of graph edit distance cost functions for detection of network anomalies. ANZIAM J. 48 (CTAC2006), pp. 436–449.
• H. Bunke, P. J. Dickinson, M. Kraetzl, and W. D. Wallis. 2006. A Graph-Theoretic Approach to Enterprise Network Dynamics (PCS). Birkhauser.
• P. Shoubridge, M. Kraetzl, W. D. Wallis, and H. Bunke. 2002. Detection of Abnormal Change in a Time Series of Graphs. Journal of Interconnection Networks (JOIN) 3(1-2):85-101.
• Horst Bunke and Kim Shearer. 1998. A graph distance metric based on the maximal common subgraph. Pattern Recogn. Lett. 19, 3-4 (March 1998), 255-259.
• A. Kelmans. 1976. Comparison of graphs by their number of spanning trees. Discrete Mathematics 16, 3, 241–261.

References: Kernels (for more references, check the graph kernel slides)
• U. Kang, H. Tong, and J. Sun. 2012. Fast random walk graph kernel. In SDM.
• Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. 2011. Weisfeiler-Lehman Graph Kernels. J. Mach. Learn. Res. 12, 2539-2561.
• N. Shervashidze and K. M. Borgwardt. 2009. Fast subtree kernels on graphs. In Advances in Neural Information Processing Systems, pages 1660–1668.
• A. Airola, S. Pyysalo, J. Björne, T. Pahikkala, F. Ginter, and T. Salakoski. 2008. All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning. BMC Bioinformatics 9 (Suppl 11), S2.

References: Visualization
• Basak Alper, Benjamin Bach, Nathalie Henry Riche, Tobias Isenberg, and Jean-Daniel Fekete. 2013. Weighted graph comparison techniques for brain connectivity analysis. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13).
• Mountaz Hascoët and Pierre Dragicevic. 2012. Interactive graph matching and visual comparison of graphs and clustered graphs. In Proceedings of the International Working Conference on Advanced Visual Interfaces (AVI '12).
• Michael Gleicher, Danielle Albers, Rick Walker, Ilir Jusufi, Charles D. Hansen, and Jonathan C. Roberts. 2011. Visual comparison for information visualization.
• K. Andrews, M. Wohlfahrt, and G. Wurzinger. 2009. Visual graph comparison. In Information Visualisation, 2009 13th International Conference, 62–67.
• Frank van Ham, Hans-Jörg Schulz, and Joan M. Dimicco. 2009. Honeycomb: Visual Analysis of Large Scale Social Networks. In Proceedings of the 12th IFIP TC 13 International Conference on Human-Computer Interaction: Part II (INTERACT '09).