Graph Sampling and Sparsification
Lecture 19

CSCI 4974/6971

7 Nov 2016

Today’s Biz

1. Reminders

2. Review

3. Graph Sampling/Sparsification

Reminders

• Assignment 4: due November 10th
– Setting up and running on CCI clusters
• Assignment 5: due date TBD (before Thanksgiving break, probably the 22nd)
• Assignment 6: due date TBD (early December)
• Tentative: no class November 14 and/or 17
• Final Project Presentation: December 8th
• Project Report: December 11th
• Office hours: Tuesday & Wednesday 14:00-16:00, Lally 317
– Or email me for other availability

Quick Review

Graph Compression:

Sampling and Summarization for Social Networks
Shou‐De Lin, Mi‐Yen Yeh, and Cheng‐Te Li, National Taiwan University

Sampling and Summarization for Social Networks
PAKDD 2013 Tutorial

Shou‐De Lin*, Mi‐Yen Yeh#, and Cheng‐Te Li*

* Computer Science and Information Engineering, National Taiwan University

# Institute of Information Science, Academia Sinica

Tutorial slides can be downloaded here: http://mslab.csie.ntu.edu.tw/tut‐pakdd13/

About This Tutorial

• This is a two‐hour tutorial for PAKDD 2013 on social network sampling and summarization
– We do not expect to cover everything relevant to this topic
– We will highlight the trends, categorize the different types of strategies, and describe some of our ongoing work
• Agenda
– Introduction + Sampling + Q/A (45+10 min)
– Summarization + Conclusion + Q/A (45+10 min)

13/05/02 Lin et al., Sampling and Summarization for Social Networks, PAKDD 2013 tutorial  2

Big Social Network

Billions of different types of nodes and links

[Figure: visualization by Paul Butler]

What can be mined from this picture?

Motivation

>1 billion

>500 million

>200 million

• Sometimes the full networks are not completely observed in advance

• Even if they are, loading everything into memory for further analysis might not be feasible

• Even if it is feasible, generating some simple statistics (e.g. average path length, diameter) can take a long time, not to mention more complicated ones (e.g. counting the occurrences of a certain pattern)

An Example on Facebook

• 1+ billion users
• Avg: 130 friends per node

It costs >1 TB of memory just to store the raw graph data (without attributes, labels, or content).

This can cause problems for information extraction, processing, and analysis.

Two possible solutions: Sampling and Summarization
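The ">1 TB" figure can be checked with a back-of-the-envelope calculation. The sketch below assumes each adjacency entry is stored as one 64-bit neighbor ID; that storage layout is an assumption made here for illustration, not something stated on the slide.

```python
# Rough estimate of raw adjacency storage for the Facebook example.
NUM_USERS = 1_000_000_000   # "1+ billion users" (from the slide)
AVG_FRIENDS = 130           # average friends per node (from the slide)
BYTES_PER_ID = 8            # assumption: one 64-bit ID per adjacency entry


def raw_graph_bytes(num_users, avg_friends, bytes_per_id):
    """Bytes needed to store one neighbor ID per (user, friend) entry."""
    return num_users * avg_friends * bytes_per_id


total = raw_graph_bytes(NUM_USERS, AVG_FRIENDS, BYTES_PER_ID)
print(f"{total / 1e12:.2f} TB")
```

Under these assumptions the estimate lands slightly above 1 TB, before any attributes, labels, or content are counted.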

Sampling Versus Summarization

• Sampling
– Assumes that information about nodes/links becomes known only after they are sampled
– Requires a sampling strategy to explore/expand the network gradually
– Goal: gradually identify a small set of representative nodes and links of a social network, usually given little prior information about the network
• Summarization
– The entire social network is known in advance
– Goal: condense the social network as much as possible without losing too much information

Homogeneous vs. Heterogeneous Social Networks

• Homogeneous Single‐Relational Network
– Single object type & link type
• Heterogeneous Multi‐Relational Network
– Multiple object types & link types
• Example
– Homogeneous: link types: Friend
– Heterogeneous: link types: Friend, Family, Love

Sampling for Social Networks

Sampling Social Networks

• Assume that the detailed information of a node can only be seen after it is sampled
– The entire social network is not known in advance
• Goal
– Sample (i.e. gradually observe nodes and links) a sub‐network that represents the whole network
• To preserve certain properties of the original network

Evaluating the Sampling Quality

• How to measure the quality of the sampling algorithm?

• A sampling algorithm is effective if
– The sampled social network preserves certain network properties
– Performing an end task (e.g. centrality analysis, link prediction, etc.) on the sampled network produces results similar to those obtained on the fully observed network
– The sampled sub‐network is small

Properties Preserved (1/3)

• Homogeneous Static Social Network
– In/Out Degree Distribution
– Path Length Distribution
– Clustering Coefficient Distribution
– Eigenvalues
– Weakly/Strongly Connected Component Size Distribution
– Community Structure
– Etc.

Properties Preserved (2/3)

• Homogeneous Dynamic Social Networks (graphs are time‐evolving)
– Densification Power Law
• Number of edges vs. number of nodes over time
– Shrinking diameter
• The diameter is observed to shrink and then stabilize over time
– Average clustering coefficient over time
– Largest singular value of the graph adjacency matrix over time
– Etc.

Properties Preserved (3/3)

• Heterogeneous Social Network
– Node type distribution
– Intra‐link and inter‐link type distribution
– Higher‐order type connections

Evaluation Metrics

• Whether certain properties are preserved
– For single‐value properties (e.g. clustering coefficient, average path length), one can measure whether the value is preserved
– For distributional properties (e.g. degree distribution, component size distribution), one can compute the distance between the two distributions (e.g. KL divergence)
• Whether a certain end task can be performed similarly
– Perform the task using the sampled network, and check whether the results are similar to those obtained when the full network is used
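As an illustration of the distributional check, the following sketch compares the degree distribution of a full graph with that of a sample using KL divergence. The toy graphs, the dictionary-of-sets representation, and the epsilon smoothing are assumptions made for this example; the slides do not prescribe a particular implementation.

```python
import math
from collections import Counter


def degree_distribution(adj):
    """Map each degree value to the fraction of nodes with that degree."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in counts.items()}


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over the union of supports.

    The eps smoothing (an assumption of this sketch) avoids division by
    zero when q has no mass on a degree that occurs in p.
    """
    support = set(p) | set(q)
    p_s = {k: p.get(k, 0.0) + eps for k in support}
    q_s = {k: q.get(k, 0.0) + eps for k in support}
    zp, zq = sum(p_s.values()), sum(q_s.values())
    return sum(
        (p_s[k] / zp) * math.log((p_s[k] / zp) / (q_s[k] / zq))
        for k in support
    )


# Hypothetical toy graphs: a 4-node star and a 3-node path sampled from it.
full = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
sample = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}

p = degree_distribution(full)
q = degree_distribution(sample)
print(kl_divergence(p, q))
```

A value of zero means the two distributions coincide; larger values indicate that the sample distorts the degree distribution more.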

Sampling for Homogeneous Social Networks

Three Main Strategies

• Node Selection
• Edge Selection
• Sampling by Exploration
– Random Walk
– Graph Search
– Chain‐Referral Sampling

Seeds (i.e., ego)

Node Selection

• Random Node Sampling
– Uniformly select a set of nodes
• Degree‐based Sampling [Adamic’01]
– The probability of a node being selected is proportional to its degree (assumed known)
• PageRank‐based Sampling [Leskovec’06]
– The probability of a node being selected is proportional to its PageRank value (assumed known)
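A minimal sketch of the first two node-selection strategies follows. The graph representation (a dictionary mapping each node to its neighbor set), the function names, and the choice to draw without replacement are assumptions for illustration; PageRank-based sampling would follow the same pattern with PageRank scores as weights.

```python
import random


def random_node_sampling(adj, k, rng):
    """Uniformly select k distinct nodes."""
    return rng.sample(sorted(adj), k)


def degree_based_sampling(adj, k, rng):
    """Select k distinct nodes; each draw is proportional to node degree.

    Drawing without replacement is an assumption of this sketch.
    """
    remaining = {v: len(nbrs) for v, nbrs in adj.items() if nbrs}
    chosen = []
    while remaining and len(chosen) < k:
        nodes = sorted(remaining)
        weights = [remaining[u] for u in nodes]
        v = rng.choices(nodes, weights=weights, k=1)[0]
        chosen.append(v)
        del remaining[v]
    return chosen


# Hypothetical toy graph.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
rng = random.Random(0)
uniform = random_node_sampling(adj, 2, rng)
by_degree = degree_based_sampling(adj, 2, rng)
print(uniform, by_degree)
```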

Edge Selection

• Random Edge (RE) Sampling
– Uniformly select edges at random, and then include the associated nodes
• Random Node‐Edge (RNE) Sampling
– Uniformly select a node, then uniformly select an edge incident to it
• Hybrid Sampling [Leskovec’06]
– With probability p perform RE sampling, with probability 1‐p perform RNE sampling
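The three edge-selection rules can be sketched as below. The undirected toy graph, the function names, and the fixed number of sampling steps are assumptions made for this example.

```python
import random


def random_edge(edges, rng):
    """RE: pick one edge uniformly at random."""
    return rng.choice(edges)


def random_node_edge(adj, rng):
    """RNE: pick a node uniformly, then one of its incident edges uniformly."""
    v = rng.choice(sorted(u for u in adj if adj[u]))
    w = rng.choice(sorted(adj[v]))
    return (v, w)


def hybrid_sampling(adj, edges, num_steps, p, rng):
    """With probability p take an RE step, otherwise an RNE step."""
    sampled_edges = set()
    for _ in range(num_steps):
        if rng.random() < p:
            u, v = random_edge(edges, rng)
        else:
            u, v = random_node_edge(adj, rng)
        sampled_edges.add(frozenset((u, v)))  # undirected edge
    nodes = set().union(*sampled_edges) if sampled_edges else set()
    return nodes, sampled_edges


# Hypothetical toy graph.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
nodes, sampled = hybrid_sampling(adj, edges, num_steps=3, p=0.5, rng=random.Random(0))
print(nodes, sampled)
```

Setting p to 1 or 0 reduces the hybrid to pure RE or pure RNE sampling respectively.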

Edge Selection (cont.)

• Induced Edge Sampling [Ahmed’12]
– Step 1: Uniformly select edges (and consequently nodes) for several rounds
– Step 2: Add edges that exist between sampled nodes
• Frontier Sampling [Ribeiro’10]
– Step 0: Randomly select a set of nodes L as seeds
– Step 1: Select a seed u from L using degree‐based sampling
– Step 2: Select an edge of u, (u, v), uniformly
– Step 3: Replace u by v in L and add (u, v) to the sequence of sampled edges
– Repeat Steps 1 to 3
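The two procedures above can be sketched as follows. This is an illustrative reading of the steps, not the cited authors' code: the "several rounds" of induced edge sampling are collapsed into a single batch, the graph is assumed undirected with no isolated nodes, and all names are hypothetical.

```python
import random


def induced_edge_sampling(edges, num_edges, rng):
    """Step 1: sample edges uniformly; Step 2: add all edges among sampled nodes."""
    picked = rng.sample(edges, num_edges)
    nodes = {x for e in picked for x in e}
    induced = [(u, v) for (u, v) in edges if u in nodes and v in nodes]
    return nodes, induced


def frontier_sampling(adj, num_seeds, num_steps, rng):
    """Frontier sampling over a seed list L (assumes every node has a neighbor)."""
    seeds = rng.sample(sorted(adj), num_seeds)                # Step 0
    sampled_edges = []
    for _ in range(num_steps):
        weights = [len(adj[u]) for u in seeds]
        i = rng.choices(range(len(seeds)), weights=weights, k=1)[0]  # Step 1
        u = seeds[i]
        v = rng.choice(sorted(adj[u]))                        # Step 2
        seeds[i] = v                                          # Step 3
        sampled_edges.append((u, v))
    return sampled_edges


# Hypothetical toy graph.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
rng = random.Random(0)
nodes, induced = induced_edge_sampling(edges, 2, rng)
walk_edges = frontier_sampling(adj, num_seeds=2, num_steps=10, rng=rng)
print(nodes, induced, walk_edges)
```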

Sampling by Exploration

• Random Walk [Gjoka’10]
– The next‐hop node is chosen uniformly among the neighbors of the current node
• Random Walk with Restart [Leskovec’06]
– Uniformly select a random node and perform a random walk with restarts
• Random Jump [Ribeiro’10]
– Same as random walk, but with probability p we jump to any node in the network
• Forest Fire [Leskovec’06]
– Choose a node u uniformly
– Generate a random number z and select z out‐links of u that are not yet visited
– Apply this step recursively for all newly added nodes
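The random-walk variants differ only in what happens at each step, which the sketch below makes explicit. The parameter names `restart_p` and `jump_p` and the toy graph are assumptions for illustration, and the graph is assumed to have no isolated nodes.

```python
import random


def random_walk_sample(adj, start, num_steps, rng, restart_p=0.0, jump_p=0.0):
    """Random walk with optional restart (back to start) and jump (to any node).

    restart_p = jump_p = 0 gives the plain random walk.
    """
    nodes = sorted(adj)
    current = start
    visited = [current]
    for _ in range(num_steps):
        r = rng.random()
        if r < restart_p:
            current = start                             # restart at the seed
        elif r < restart_p + jump_p:
            current = rng.choice(nodes)                 # jump anywhere
        else:
            current = rng.choice(sorted(adj[current]))  # uniform neighbor
        visited.append(current)
    return visited


# Hypothetical toy graph.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
plain = random_walk_sample(adj, "a", 20, random.Random(0))
with_restart = random_walk_sample(adj, "a", 20, random.Random(0), restart_p=0.15)
print(plain)
print(with_restart)
```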

Sampling by Exploration (cont.)

• Ego‐Centric Exploration (ECE) Sampling
– Similar to random walk, but each neighbor has probability p of being selected
– Multiple ECE (starting with multiple seeds)
• Depth‐First / Breadth‐First Search [Krishnamurthy’05]
– Keep visiting neighbors of the most recently visited (depth‐first) / earliest visited (breadth‐first) nodes
• Sample Edge Count [Maiya’11]
– Move to the neighbor with the highest degree, and keep going
• Expansion Sampling [Maiya’11]
– Construct a sample with the maximal expansion. Select the neighbor $v$ based on

$$v = \arg\max_{v \in N(S)} \left| N(\{v\}) - \big(N(S) \cup S\big) \right|$$

where $S$ is the set of sampled nodes and $N(S)$ is the 1st neighbor set of $S$

Example: Expansion Sampling

[Figure: example graph on nodes A, B, C, D, E, F, G, H]

|N({A})| = 4

|N({E}) – (N({A}) ∪ {A})| = |{F, G, H}| = 3

|N({D}) – (N({A}) ∪ {A})| = |{F}| = 1
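The expansion rule can be traced in code. The edge set below is a hypothetical graph chosen only to be consistent with the counts in the example (A has four neighbors, E contributes F, G, H, and D contributes only F); the slide's actual figure may contain other edges.

```python
def neighborhood(adj, nodes):
    """N(S): nodes adjacent to some node of S but not in S themselves."""
    return set().union(*(adj[v] for v in nodes)) - set(nodes)


def expansion_sampling(adj, seed, k):
    """Greedily add the frontier node that contributes the most new neighbors."""
    sample = [seed]
    while len(sample) < k:
        frontier = neighborhood(adj, sample)
        if not frontier:
            break
        covered = frontier | set(sample)   # N(S) united with S
        best = max(sorted(frontier), key=lambda v: len(adj[v] - covered))
        sample.append(best)
    return sample


# Hypothetical edges consistent with the counts in the example.
edge_list = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
             ("E", "F"), ("E", "G"), ("E", "H"), ("D", "F")]
adj = {}
for u, v in edge_list:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(expansion_sampling(adj, "A", 2))
```

Starting from A, node E is chosen next because it adds three unseen neighbors, whereas D adds one and B and C add none.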

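Drawback of Random Walk: Degree Bias!

[Figure: q_k, the sampled node degree distribution, vs. p_k, the real node degree distribution]

• Real average node degree ~ 94, sampled average node degree ~ 338
• Solution: modify the transition probability:

$$P_{v,w} = \begin{cases} \dfrac{1}{k_v} \cdot \min\left(1, \dfrac{k_v}{k_w}\right) & \text{if } w \text{ is a neighbor of } v \\ 1 - \sum_{y \neq v} P_{v,y} & \text{if } w = v \\ 0 & \text{otherwise} \end{cases}$$

where $k_v$ denotes the degree of node $v$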
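A corrected walk of this kind (a Metropolis-Hastings random walk) can be sketched as follows: propose a uniform neighbor w of the current node v and accept the move with probability min(1, k_v / k_w), staying at v otherwise. The function names and the star-shaped toy graph are assumptions for illustration; the graph is assumed undirected without self-loops or isolated nodes.

```python
import random


def mh_transition_probs(adj, v):
    """Transition probabilities out of v under the degree-corrected walk."""
    k_v = len(adj[v])
    probs = {w: (1.0 / k_v) * min(1.0, k_v / len(adj[w])) for w in adj[v]}
    probs[v] = 1.0 - sum(probs.values())   # remaining mass: stay at v
    return probs


def mhrw(adj, start, num_steps, rng):
    """Simulate the walk: propose a uniform neighbor, accept or stay put."""
    current, visited = start, [start]
    for _ in range(num_steps):
        w = rng.choice(sorted(adj[current]))
        if rng.random() < min(1.0, len(adj[current]) / len(adj[w])):
            current = w
        visited.append(current)
    return visited


# Hypothetical star graph: hub "h" with three leaves.
adj = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h"}}
print(mh_transition_probs(adj, "h"))
print(mh_transition_probs(adj, "a"))
print(mhrw(adj, "h", 10, random.Random(0)))
```

On the star graph, a leaf moves to the hub with probability 1/3 and stays put otherwise, which is what removes the over-representation of the high-degree hub.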

Metropolis Graph Sampling [Hubler’08]

• Step 1: Initially pick one subgraph sample S with n’ nodes randomly
• Step 2: Iterate the following steps until convergence
2.1: Remove one node from S
2.2: Randomly add a new node to S → S’
2.3: Compute the likelihood ratio

$$\alpha = \frac{\rho^*(S')}{\rho^*(S)}$$

if $\alpha \geq 1$: $S := S'$
if $\alpha < 1$: $S := S'$ with probability $\alpha$; $S := S$ with probability $1 - \alpha$

– $\rho^*(S)$ measures the similarity of a certain property between the sample S and the original network G
• It can be derived approximately using Simulated Annealing
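The accept/reject loop can be sketched as below. This is a simplified illustration rather than the method of [Hubler’08]: the quality function `mean_degree_similarity` is a hypothetical stand-in for the similarity measure on the slide, a fixed iteration budget replaces the convergence test, and no annealing schedule is used.

```python
import random


def mean_degree_similarity(adj, sample):
    """Hypothetical quality: near 1 when the sampled nodes' mean degree
    (measured in the full graph) matches the graph-wide mean degree."""
    full_mean = sum(len(n) for n in adj.values()) / len(adj)
    sample_mean = sum(len(adj[v]) for v in sample) / len(sample)
    return 1.0 / (1.0 + abs(full_mean - sample_mean))


def metropolis_graph_sampling(adj, n_sample, quality, num_iters, rng):
    """Swap one node per iteration; accept by the ratio of quality scores.

    Requires 0 < n_sample < number of nodes and a strictly positive quality.
    """
    nodes = sorted(adj)
    sample = set(rng.sample(nodes, n_sample))               # Step 1
    for _ in range(num_iters):                              # Step 2
        removed = rng.choice(sorted(sample))                # 2.1
        added = rng.choice(sorted(set(nodes) - sample))     # 2.2
        candidate = (sample - {removed}) | {added}
        alpha = quality(adj, candidate) / quality(adj, sample)  # 2.3
        if alpha >= 1 or rng.random() < alpha:
            sample = candidate
    return sample


# Hypothetical toy graph.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
result = metropolis_graph_sampling(adj, 2, mean_degree_similarity, 50, random.Random(0))
print(sorted(result))
```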

Sampling for Heterogeneous Social Networks

Sampling on Heterogeneous Social Networks

• Heterogeneous Social Networks (HSN)
– A graph G=<V, E> has n nodes (v1, v2, …, vn), m directed edges (e1, …, em) and k different types
– Each node/edge belongs to a type
• Given a finite set L = {L1, ..., Lk} denoting k types
• Sampling methods for HSN
– Multi‐graph sampling [Gjoka’10]
– Type‐distribution preserving sampling [Li’11]
– Relational‐profile preserving sampling [Yang’13]

Multigraph Sampling

• Perform random walk sampling on the union multigraph of the individual graphs, so that the walk does not get stuck when a single graph is disconnected.

Node Type Distribution Preserving Sampling

• Given a graph G and a sampled subgraph GS
• The node type distribution of GS is expected to be the same as that of G, i.e., d(Dist(GS), Dist(G)) = 0
– d() denotes the difference between two distributions

[Figure: original network and sampled network with matching node‐type ratios, (9:6) = (3:2)]

Connection‐type Preserving Sampling

• Heterogeneous Connection
– For an edge E[vi, vj]
– Intra‐connection edge: Type(vi) = Type(vj)
– Inter‐connection edge: Type(vi) != Type(vj)
• Intra‐Relationship preserving
– The ratio of intra‐connection edges should be preserved, that is: d(IR(GS), IR(G)) = 0
– If the intra‐relationship is preserved, the inter‐relationship is also preserved
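Both preservation criteria can be checked with a few lines of code. The sketch below computes the node type distribution and the intra-connection ratio IR; using the L1 distance for d() is an assumption, since the slides leave the distance unspecified, and the toy data only mirrors the 9:6 versus 3:2 ratio.

```python
from collections import Counter


def node_type_distribution(node_types):
    """Fraction of nodes belonging to each type."""
    counts = Counter(node_types.values())
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def intra_ratio(edges, node_types):
    """IR: fraction of edges whose two endpoints share the same type."""
    intra = sum(1 for u, v in edges if node_types[u] == node_types[v])
    return intra / len(edges)


def l1_distance(p, q):
    """One possible choice for d(): L1 distance between two distributions."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Hypothetical data: 9 nodes of type X and 6 of type Y in the original
# network, 3 of type X and 2 of type Y in the sample.
original_types = {f"x{i}": "X" for i in range(9)}
original_types.update({f"y{i}": "Y" for i in range(6)})
sample_types = {"x0": "X", "x1": "X", "x2": "X", "y0": "Y", "y1": "Y"}

d_types = l1_distance(node_type_distribution(sample_types),
                      node_type_distribution(original_types))
sample_edges = [("x0", "x1"), ("x1", "y0"), ("y0", "y1"), ("x2", "y1")]
print(d_types, intra_ratio(sample_edges, sample_types))
```

Because the inter-connection ratio is one minus the intra-connection ratio, preserving one preserves the other.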

Respondent‐driven Sampling

• First proposed in social science [Heck’99] to solve the hidden population problem in surveying.
• Two main phases:
– Snowball sampling
– Finding the steady state of the recruitment matrix

[Figure: snowball sampling of respondents from graph G, each with a limited number of coupons c; the 3×3 recruitment (transition) matrix with entries S11 to S33 converges after N‐step transitions to the steady‐state vector (P1, P2, P3)]

Comparing Different Sampling Algorithms

[Figure: two panels, "Similarity of node type‐distribution" and "Similarity of intra‐link distribution"]

• Respondent‐driven sampling performs well when few nodes have been sampled, but its quality saturates at a mediocre level afterwards

• Random node sampling performs poorly in the beginning, but reaches the best results once a sufficient number of nodes have been sampled

Relational Profile Preserving Sampling

• Node‐type/intra‐type preservation considers the semantics of nodes, but not the structure of networks
• Propose the Relational Profile to consider semantics and structure together
– Captures the dependency between each Node Type (NT) and Edge Type (ET) of a directed heterogeneous network
– Consists of 4 relational matrices
• Conditional probabilities P(Tj|Ti) (e.g. P(ET=cites|NT=paper))
• Node to node, node to edge, edge to node, edge to edge

[Figure: the relational profile as a 2×2 block of transition matrices indexed by NT and ET, next to an example network with the labels paper, author, cites, authored, and journal_of]

Example of Relational Profile (RP)

P A C J c p a

P 0.44 0.22 0.22 0.11 0.44 0.33 0.22

A 1 1

C 1 1

J 1 1

c 1 0.22 0.44 0.33

p 0.5 0.33 0.17 0.66 0.33

a 0.5 0.5 0.6 0.4

P A C J c p a

P 0.182 0.364 0.091 0.273 0.182 0.364 0.364

A 1 1

C 1 1

J 1 1

c 1 0.5 0.5

p 0.5 0.125 0.375 0.17 0.5 0.33

a 0.5 0.5 0.22 0.33 0.44

Challenge: How to approximate RP when the true RP is unknown

• We propose Exploration by Expectation Sampling
• Aim: preserve the unknown relational profile while adding new sample nodes
1. Randomly choose a starting node and the corresponding edges
2. Based on the current RP, select the next node from all 1‐degree neighbors
3. Add the new node and all its edges
4. Update the RP of the sub‐sampled graph
5. Repeat steps 2, 3 & 4 until the RP converges
• Which node should be selected?
– Select the node whose inclusion can potentially lead to the largest change to the existing RP
• Use the partially observed RP to generate the ‘expected amount of change’ of each node as its score
• Weighted sampling based on the score

Relational Profile Sampling (RPS) 

Idea: sample to increase diversity

Goal: maximize the expected change of the property (Relational Profile distribution)

D(v, Gs) = estimated change of RP given sampling v on the current graph Gs
= E[ΔP(Gs, Gs+v) | Gs], where ΔP = RMSE_RP

which can be calculated as

$$D(v, G_s) = \sum_{t} P(\mathrm{type}(v) = t \mid G_s) \cdot \Delta P(G_s, G_s + v \mid \mathrm{type}(v) = t)$$

Exploiting the existing RP, P(type(v)=t | Gs) can be obtained using the observed types of v’s neighbors

P(type | type) can be obtained from the existing RP

[Figure: candidate node v attached to the sampled graph Gs, with edges labeled RP(type | type)]

Evaluation

• Datasets: 3 real‐life large‐scale social networks
• Baselines:
– Random Walk Sampling (RW)
– Degree‐based sampling (HDS)
• Evaluation I (Property Preservation): see how well the sampled network approximates two properties of the full network
• Evaluation II (Prediction): train a prediction model on the sampled network to infer out‐of‐sample network status:
– Node Type Prediction: predict the type of unseen nodes in the network using a sub‐sampled network
– Missing Relations Prediction: recover/predict the missing links
– Features:
• fdeg = (in/out deg; avg in/out deg of neighbors)
• ftopo = (Common Neighbors; Jaccard’s Coefficient; etc.)
• fnt = P(type(v) | Gs)
• fRPnode
• fRPpath

Experiments (Property Preservation)

• RP (RMSE): type dependency preservation
• Weighted PageRank (Kendall‐Tau): preserving relative node weights propagated throughout the entire network

[Figure: plots for the Hep, Aca, and Movie datasets comparing RW, HDS, and RPS; x‐axis: # nodes sampled (in 10s)]

Experiments (Prediction)

• We show the Academic Network for brevity.

[Figure: two panels, "Node Type Prediction" and "Missing Relation Prediction", plotting accuracy against the number of sampled nodes for highDeg, RandWalk, and RPS]

Task‐driven Network Sampling

• Sampling Community Structure [Maiya’10][Satuluri’11]

• Sampling Network Backbone for Influence Maximization [Mathioudakis’11]

• Sampling High Centrality Individuals [Maiya’10]

• Sampling Personalized PageRank Values [Vattani’11]

• Sampling Network for Link/Label Prediction [Ahmed’12]

Short Summary

Sampling strategies for homogeneous and heterogeneous social networks (SN):

• Node and Edge Selection: [Leskovec’06] [Adamic’01] [Ahmed’12] [Ribeiro’10] [Kurant’12]
• Sampling by Exploration
– Homogeneous SN: [Krishnamurthy’05] [Leskovec’06] [Hubler’08] [Gjoka’10] [Ribeiro’10] [Maiya’11] [Kurant’11]
– Heterogeneous SN: [Gjoka’11] [Li’11] [Kurant’12] [Yang’13]
• Task‐driven Sampling: [Maiya’10] [Satuluri’11] [Mathioudakis’11] [Vattani’11] [Ahmed’12]

• Why sample a social network?
– The full network (e.g. Facebook) cannot be fully observed
– Crawling can be costly in terms of resources and time (therefore a smart sampling strategy is needed)

Detecting Community Structures in Social Networks by Graph Sparsification

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder, Heritage Institute of Technology, Kolkata, India

Building Community Preserving Sparsified Network
Fast Detection of Communities from the Sparsified Network

Detecting Community Structures in Social Networks by Graph Sparsification

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder

Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India

September 5, 2016

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder Detecting Community Structures in Social Networks by Graph Sparsification

Figure: The tendency of people to live in racially homogeneous neighborhoods [1]. In yellow and orange blocks, the percentage of Afro-Americans is ≤ 25; in brown and black blocks, it is ≥ 75.

Definition of a Community

For a given graph G(V, E), find a cover C = {C1, C2, ..., Ck} such that ⋃i Ci = V.

For disjoint communities, ∀ i ≠ j we have Ci ⋂ Cj = ∅

For overlapping communities, ∃ i ≠ j where Ci ⋂ Cj ≠ ∅

Figure: Zachary’s Karate Club Network

C = {C1, C2, C3}, where C1 = yellow nodes, C2 = green nodes, and C3 = blue nodes, is a disjoint cover

However, C = {C1, C2}, where C1 = yellow & green nodes and C2 = blue & green nodes, is an overlapping cover

For our problem, we concentrate on disjoint community detection
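The cover conditions can be verified mechanically. The sketch below uses a small hypothetical node set with made-up color groups rather than the actual Karate Club data; the function names are also assumptions.

```python
from itertools import combinations


def is_cover(communities, nodes):
    """True if the union of all communities equals the node set V."""
    return set().union(*communities) == set(nodes)


def is_disjoint(communities):
    """True if every pair of distinct communities has an empty intersection."""
    return all(a.isdisjoint(b) for a, b in combinations(communities, 2))


# Hypothetical node groups standing in for the colored nodes.
yellow, green, blue = {1, 2, 3}, {4, 5}, {6, 7}
V = yellow | green | blue

disjoint_cover = [yellow, green, blue]
overlapping_cover = [yellow | green, blue | green]

print(is_cover(disjoint_cover, V), is_disjoint(disjoint_cover))
print(is_cover(overlapping_cover, V), is_disjoint(overlapping_cover))
```

Both lists cover V, but only the first is disjoint; the second shares the green nodes between its two communities.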

Definition of a Community

For a given graph G(V, E), find a cover C = {C1 ,C2 , ...,Ck} such that⋃iCi = V.

For disjoint communities, ∀i, j we have Ci⋂Cj = ∅

For overlapping communities, ∃i, j where Ci⋂Cj 6= ∅

Figure: Zachary’s Karate Club Network

C = {C1, C2, C3}, C1 = yellow nodes,C2 = green, C3 = blue is a disjoint cover

However, C = {C1, C2}, C1 = yellow &green nodes and C2 = blue & green nodesis an overlapping cover

For our problem, we concentrate on disjoint community detection

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder Detecting Community Structures in Social Networks by Graph Sparsification

Page 58: Graph Sampling and Sparsification - Lecture 19slotag/classes/FA16/slides/lec19...7 Nov 2016 1/10 Today’s Biz 1. Reminders 2.Review 3.Graph Sampling/Sparsi cation 2/10 Reminders I

Building Community Preserving Sparsified NetworkFast Detection of Communities from the Sparsified Network

Definition of a Community

For a given graph G(V, E), find a cover C = {C1 ,C2 , ...,Ck} such that⋃iCi = V.

For disjoint communities, ∀i, j we have Ci⋂Cj = ∅

For overlapping communities, ∃i, j where Ci⋂Cj 6= ∅

Figure: Zachary’s Karate Club Network

C = {C1, C2, C3}, C1 = yellow nodes,C2 = green, C3 = blue is a disjoint cover

However, C = {C1, C2}, C1 = yellow &green nodes and C2 = blue & green nodesis an overlapping cover

For our problem, we concentrate on disjoint community detection

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder Detecting Community Structures in Social Networks by Graph Sparsification

Page 59: Graph Sampling and Sparsification - Lecture 19slotag/classes/FA16/slides/lec19...7 Nov 2016 1/10 Today’s Biz 1. Reminders 2.Review 3.Graph Sampling/Sparsi cation 2/10 Reminders I

Building Community Preserving Sparsified NetworkFast Detection of Communities from the Sparsified Network

Definition of a Community

For a given graph G(V, E), find a cover C = {C1 ,C2 , ...,Ck} such that⋃iCi = V.

For disjoint communities, ∀i, j we have Ci⋂Cj = ∅

For overlapping communities, ∃i, j where Ci⋂Cj 6= ∅

Figure: Zachary’s Karate Club Network

C = {C1, C2, C3}, C1 = yellow nodes,C2 = green, C3 = blue is a disjoint cover

However, C = {C1, C2}, C1 = yellow &green nodes and C2 = blue & green nodesis an overlapping cover

For our problem, we concentrate on disjoint community detection

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder Detecting Community Structures in Social Networks by Graph Sparsification

Page 60: Graph Sampling and Sparsification - Lecture 19slotag/classes/FA16/slides/lec19...7 Nov 2016 1/10 Today’s Biz 1. Reminders 2.Review 3.Graph Sampling/Sparsi cation 2/10 Reminders I

Building Community Preserving Sparsified NetworkFast Detection of Communities from the Sparsified Network

Definition of a Community

For a given graph G(V, E), find a cover C = {C1 ,C2 , ...,Ck} such that⋃iCi = V.

For disjoint communities, ∀i, j we have Ci⋂Cj = ∅

For overlapping communities, ∃i, j where Ci⋂Cj 6= ∅

Figure: Zachary’s Karate Club Network

C = {C1, C2, C3}, C1 = yellow nodes,C2 = green, C3 = blue is a disjoint cover

However, C = {C1, C2}, C1 = yellow &green nodes and C2 = blue & green nodesis an overlapping cover

For our problem, we concentrate on disjoint community detectionPartha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder Detecting Community Structures in Social Networks by Graph Sparsification

Page 61: Graph Sampling and Sparsification - Lecture 19slotag/classes/FA16/slides/lec19...7 Nov 2016 1/10 Today’s Biz 1. Reminders 2.Review 3.Graph Sampling/Sparsi cation 2/10 Reminders I

Building Community Preserving Sparsified NetworkFast Detection of Communities from the Sparsified Network

A Little Background: Edge Betweenness Centrality

$$c_B(e) = \sum_{\substack{s,t \in V \\ s \neq t}} \frac{\sigma(s, t \mid e)}{\sigma(s, t)}$$

Top 6 edges

Edge        cB(e)      Type
(10, 13)    0.3        inter
(3, 5)      0.23333    inter
(7, 15)     0.2079     inter
(1, 8)      0.1873     inter
(13, 15)    0.1746     intra
(5, 7)      0.1476     intra

Bottom 6 edges

Edge        cB(e)      Type
(8, 11)     0.022      intra
(1, 2)      0.0269     intra
(9, 11)     0.031      intra
(8, 9)      0.0412     intra
(12, 15)    0.052      intra
(3, 4)      0.060      intra
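
To make the centrality formula concrete, here is a minimal sketch for undirected, unweighted graphs. It assumes the usual reading of the symbols: σ(s, t) is the number of shortest paths between s and t, and σ(s, t | e) is the number of those paths that pass through edge e. The accumulation scheme is Brandes-style. The function name `edge_betweenness` and the toy graph are hypothetical, and the values are unnormalized, so they are not meant to reproduce the numbers in the tables above.

```python
from collections import deque, defaultdict

def edge_betweenness(adj):
    """Unnormalized edge betweenness; adj maps node -> set of neighbors.

    Each unordered pair {s, t} contributes once.
    """
    cb = defaultdict(float)
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path shares from the farthest nodes toward s.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                cb[frozenset((v, w))] += share
                delta[v] += share
    # Every unordered pair was counted from both endpoints, so halve.
    return {tuple(sorted(e)): c / 2.0 for e, c in cb.items()}

# Hypothetical toy graph: two triangles joined by the bridge (2, 3).
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
scores = edge_betweenness(toy)
print(max(scores, key=scores.get))  # the inter-community bridge scores highest
```

In the toy graph the only inter-community edge carries every shortest path between the two triangles, which mirrors the pattern in the tables: inter edges rank at the top, intra edges at the bottom.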

The Girvan-Newman Algorithm

Proposed by Michelle Girvan and Mark Newman[2] in 2002

The Key Ideas

Based on reachability of nodes - shortest paths

Edges are selected on the basis of the edge betweenness centrality

The algorithm

1. Compute centrality for all edges

2. Remove edge with largest centrality; ties can be broken randomly

3. Recalculate the centralities on the running graph

4. Iterate from step 2, stop when you get clusters of desirable quality
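
The four steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the reference implementation: the stopping rule "clusters of desirable quality" is replaced by a hypothetical target number of components `k`, ties are broken deterministically by `max` rather than randomly, and all function names are made up for this example.

```python
from collections import deque, defaultdict

def edge_betweenness(adj):
    """Unnormalized edge betweenness (Brandes-style), unweighted graph."""
    cb = defaultdict(float)
    for s in adj:
        dist, order = {s: 0}, []
        sigma, preds = defaultdict(float), defaultdict(list)
        sigma[s] = 1.0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                cb[tuple(sorted((v, w)))] += share
                delta[v] += share
    return cb

def components(adj):
    """Connected components of the current graph, as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman(adj, k):
    """Remove highest-betweenness edges until the graph has k components."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    while len(components(adj)) < k and any(adj.values()):
        cb = edge_betweenness(adj)       # steps 1 and 3: (re)compute centrality
        u, v = max(cb, key=cb.get)       # step 2: edge with largest centrality
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical toy graph: two triangles joined by the bridge (2, 3).
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(girvan_newman(toy, 2))
```

Recomputing the centralities after every removal (step 3) is what makes the method expensive on large graphs, since each recomputation costs a full pass over all source nodes.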

Figure: panels (a) Best edge: (10, 13); (b) Best edge: (3, 5); (c) Best edge: (7, 15); (d) Best edge: (1, 8); (e) Best edge: (2, 11); (f) Final graph

Louvain Method: A Greedy Approach

Proposed by Blondel et al. [3] in 2008

Takes a greedy approach to maximizing modularity

Very fast in practice; it's the current state-of-the-art in disjoint community detection

Performs hierarchical partitioning, stopping when there cannot be any further improvement in modularity

Contracts the graph in each iteration, thereby speeding up the process
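
The slides do not define modularity, so the sketch below assumes the standard Newman-Girvan form for an unweighted, undirected graph: Q is the sum over communities of L_c / m - (d_c / 2m)^2, where m is the total number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The function name `modularity` and the toy graph are hypothetical. It only evaluates the quantity that the greedy procedure tries to increase; it is not an implementation of the Louvain method itself.

```python
def modularity(adj, communities):
    """Newman-Girvan modularity; adj maps node -> set of neighbors."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2.0  # number of edges
    q = 0.0
    for comm in communities:
        # Each internal edge is seen from both endpoints, hence the halving.
        internal = sum(1 for v in comm for w in adj[v] if w in comm) / 2.0
        degree = sum(len(adj[v]) for v in comm)
        q += internal / m - (degree / (2.0 * m)) ** 2
    return q

# Hypothetical toy graph: two triangles joined by the bridge (2, 3).
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(modularity(toy, [{0, 1, 2}, {3, 4, 5}]))   # two natural communities
print(modularity(toy, [{0, 1, 2, 3, 4, 5}]))     # everything in one community
```

Splitting the toy graph into its two triangles gives a positive score, while putting every node into a single community gives zero, which is why a greedy maximizer prefers the split.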

Louvain Method in Action

Outline for Part I

1. Building Community Preserving Sparsified Network
   Assigning Meaningful Weights to Edges
   Sparsification using t-spanner

2. Fast Detection of Communities from the Sparsified Network
   Methodology and Visualizations
   Experimental Results

Our Method

Input: An unweighted network G(V, E)

Output: A disjoint cover C

1. Use the Jaccard coefficient to turn G into a weighted network G(V, E, W)

2. Construct a t-spanner $G_S$ of G(V, E, W). Take the complement of $G_S$, call it $G_{comm}$

3. Use LINCOM to break $G_{comm}$ into small but pure fragments

4. Use the second phase of the Louvain Method to piece all the small bits and pieces together to get C

Jaccard Intro

Definition

$$w_J(e(v_i, v_j)) = \frac{|\Gamma(v_i) \cap \Gamma(v_j)|}{|\Gamma(v_i) \cup \Gamma(v_j)|}$$

where $\Gamma(v_i)$ is the neighborhood of the node $v_i$

$\therefore w_J \in [0, 1]$

Jaccard works well in domains where local influence is important[4][5][6]

The computation takes O(m) time
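
A minimal sketch of the weighting step follows. One assumption is made loudly here: Γ(v) is taken to be the open neighborhood (v itself is excluded), since the slides do not say whether the node is included. The function name `jaccard_weights` and the toy graph are hypothetical.

```python
def jaccard_weights(adj):
    """Jaccard weight for every edge; adj maps node -> set of neighbors.

    Assumes open neighborhoods: a node is not its own neighbor.
    """
    weights = {}
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                inter = adj[u] & adj[v]
                union = adj[u] | adj[v]
                weights[(u, v)] = len(inter) / len(union)
    return weights

# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
for edge, w in sorted(jaccard_weights(toy).items()):
    print(edge, round(w, 3))
```

Under the open-neighborhood assumption, an edge whose endpoints share no common neighbor, such as the pendant edge (2, 3), gets weight 0, while edges inside the triangle get positive weight.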

Jaccard Example

Table: Jaccard weight statistics for top 10% edges in terms of wJ .

                         Intra-cluster    Top 10% edges in terms of wJ
Network     |E|          edge count       Total edges    Intra-edge    Fraction
Karate      78           21               7              7             1
Dolphin     159          39               15             15            1
Football    613          179              61             61            1
Les-Mis     254          56               25             25            1
Enron       180,811      48,498           18,383         18,220        0.99113
Epinions    405,739      146,417          40,573         36,589        0.90180
Amazon      925,872      54,403           92,587         92,584        0.99996
DBLP        1,049,866    164,268          104,986        104,986       1
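
A statistic of the kind reported in the last three columns could be computed along the following lines. This is a sketch under assumptions: the helper name `top_fraction_intra`, the ground-truth `label` mapping, and the toy weights are all hypothetical, and the top 10% is taken as the floor of 10% of the edge count (which agrees with, for example, the Karate row: 78 edges give 7).

```python
def top_fraction_intra(weights, label, fraction=0.10):
    """Return (total, intra, intra / total) for the top-weighted edges.

    weights maps (u, v) -> weight; label maps node -> community id.
    Returns None when the top slice is empty.
    """
    ranked = sorted(weights, key=weights.get, reverse=True)
    top = ranked[: int(len(ranked) * fraction)]
    if not top:
        return None
    intra = sum(1 for u, v in top if label[u] == label[v])
    return len(top), intra, intra / len(top)

# Hypothetical weights on 10 edges and a hypothetical two-community labeling.
weights = {(0, 1): 0.9, (1, 2): 0.8, (3, 4): 0.7, (4, 5): 0.6, (0, 2): 0.5,
           (3, 5): 0.4, (2, 3): 0.1, (1, 3): 0.05, (0, 3): 0.02, (2, 4): 0.01}
label = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(top_fraction_intra(weights, label))
print(top_fraction_intra(weights, label, fraction=0.2))
```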

Spanner

An $(\alpha, \beta)$-spanner of a graph $G = (V, E, W)$ is a subgraph $G_S = (V, E_S, W_S)$ such that

$$\delta_S(u, v) \leq \alpha \cdot \delta(u, v) + \beta \quad \forall\, u, v \in V$$

A t-spanner is a special case of an $(\alpha, \beta)$-spanner where $\alpha = t$ and $\beta = 0$

Authors                        Size                       Running Time
Althöfer et al. [1993] [7]     $O(n^{1+1/k})$             $O(m(n^{1+1/k} + n \log n))$
Althöfer et al. [1993] [7]     $\frac{1}{2} n^{1+1/k}$    $O(m n^{1+1/k})$
Roditty et al. [2004] [8]      $\frac{1}{2} n^{1+1/k}$    $O(k n^{2+1/k})$
Roditty et al. [2005] [9]      $O(k n^{1+1/k})$           $O(km)$ (det.)
Baswana and Sen [2007] [10]    $O(k n^{1+1/k})$           $O(km)$ (rand.)
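
The spanner condition can be verified directly by comparing shortest-path distances in G and in the candidate subgraph. The sketch below is a brute-force check for illustration only (one Dijkstra run per node in each graph); the names `dijkstra` and `is_spanner` and the triangle example are hypothetical.

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path distances from source; adj maps node -> {neighbor: weight}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for w, weight in adj[v].items():
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist

def is_spanner(adj, sub, alpha, beta=0.0):
    """Check delta_S(u, v) <= alpha * delta(u, v) + beta for all reachable pairs."""
    for u in adj:
        d_full = dijkstra(adj, u)
        d_sub = dijkstra(sub, u)
        for v, d in d_full.items():
            if d_sub.get(v, float("inf")) > alpha * d + beta + 1e-12:
                return False
    return True

# Hypothetical example: a unit-weight triangle, and the same graph minus edge (0, 2).
full = {0: {1: 1.0, 2: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {0: 1.0, 1: 1.0}}
sub = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}
print(is_spanner(full, sub, alpha=2))    # detour 0-1-2 has length 2 <= 2 * 1
print(is_spanner(full, sub, alpha=1.5))  # 2 > 1.5 * 1, so the bound fails
```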

Figure: Original network. n = 11, m = 18, $\delta(1, 5) = 5$

Figure: A 3-spanner of the network. n = 11, m = 11, $\delta_S(1, 5) = 12$

Since $\delta_S(1, 5) = 12 < t \cdot \delta(1, 5) = 3 \cdot 5 = 15$, the edge (1, 5) is discarded. The other edges are discarded in a similar fashion.
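The discard rule can be sketched with the classic greedy spanner construction: process edges in order of increasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within t times its weight. This is an illustrative sketch under that assumption and not necessarily the procedure the authors used; the names `spanner_distance` and `greedy_t_spanner` and the toy edge list are hypothetical, chosen so that one edge has weight 5 and an alternative path of length 12, as in the example.

```python
import heapq

def spanner_distance(adj, source, target, limit):
    """Dijkstra from source to target that ignores paths longer than limit."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd <= limit and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def greedy_t_spanner(vertices, edges, t):
    """edges is a list of (u, v, weight); returns the edges kept in the spanner."""
    adj = {v: {} for v in vertices}
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Keep the edge only if the current spanner distance exceeds t * w.
        if spanner_distance(adj, u, v, t * w) > t * w:
            adj[u][v] = w
            adj[v][u] = w
            kept.append((u, v, w))
    return kept

# Hypothetical toy graph: edge (1, 5) has weight 5, the detour 1-2-3-5 has length 12.
vertices = [1, 2, 3, 5]
edges = [(1, 2, 4), (2, 3, 4), (3, 5, 4), (1, 5, 5)]
print(greedy_t_spanner(vertices, edges, t=3))  # (1, 5, 5) is dropped since 12 <= 15
print(greedy_t_spanner(vertices, edges, t=2))  # (1, 5, 5) is kept since 12 > 10
```

Bounding the search by `limit` avoids exploring parts of the graph that cannot affect the decision for the current edge.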


Figure: Dolphin network. n = 62, m = 159


Figure: 3-spanner. n = 62, m = 150


Figure: 5-spanner. n = 62, m = 148


Figure: 7-spanner. n = 62, m = 144


Figure: 9-spanner. n = 62, m = 138


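Name | n | Spanner | #intra-community | #inter-community
Karate | 34 | Original | 59 | 19
Karate | 34 | 3 | 57 | 19
Karate | 34 | 5 | 53 | 19
Karate | 34 | 7 | 51 | 18
Karate | 34 | 9 | 48 | 19
Dolphin | 59 | Original | 120 | 39
Dolphin | 59 | 3 | 117 | 38
Dolphin | 59 | 5 | 102 | 38
Dolphin | 59 | 7 | 100 | 38
Dolphin | 59 | 9 | 90 | 38
Football | 115 | Original | 447 | 163
Football | 115 | 3 | 385 | 166
Football | 115 | 5 | 376 | 166
Football | 115 | 7 | 293 | 166
Football | 115 | 9 | 286 | 165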
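Counts like those in the table can be obtained by classifying each edge according to whether its endpoints share a community label. The snippet below is a minimal sketch of that bookkeeping; the function name `count_edge_types`, the label dictionary, and the toy edge list are assumptions for illustration and are not the networks or community assignments from the table.

```python
def count_edge_types(edges, community):
    """Return (#intra-community, #inter-community) edges.

    edges is an iterable of (u, v) pairs; community maps vertex -> label.
    """
    intra = inter = 0
    for u, v in edges:
        if community[u] == community[v]:
            intra += 1
        else:
            inter += 1
    return intra, inter

# Hypothetical toy input: a triangle (community A) joined to an edge (community B).
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
toy_community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B"}
print(count_edge_types(toy_edges, toy_community))
```

Applying the same function to the original edge list and to each spanner's edge list gives one row of the table per spanner.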


Figure: Original US Football network

Figure: Sparsified network $G_{comm}$

Figure: Final network with communities marked as separate components


Today: In-class work

- Implement node and edge sampling methods

- Compare their efficacy on various networks
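As a starting point for the exercise, uniform random node sampling and uniform random edge sampling might look like the sketch below. This is not the blank code from the course website; the function names, the `fraction` and `seed` parameters, and the toy graph are assumptions. Node sampling keeps the subgraph induced by the chosen vertices, while edge sampling keeps the chosen edges and their endpoints.

```python
import random

def random_node_sample(edges, fraction, seed=None):
    """Pick a fraction of the vertices uniformly; return them and the induced edges."""
    rng = random.Random(seed)
    nodes = sorted({x for e in edges for x in e})
    k = max(1, round(fraction * len(nodes)))
    chosen = set(rng.sample(nodes, k))
    induced = [(u, v) for u, v in edges if u in chosen and v in chosen]
    return chosen, induced

def random_edge_sample(edges, fraction, seed=None):
    """Pick a fraction of the edges uniformly; return their endpoints and the edges."""
    rng = random.Random(seed)
    k = max(1, round(fraction * len(edges)))
    kept = rng.sample(list(edges), k)
    nodes = {x for e in kept for x in e}
    return nodes, kept

# Hypothetical toy input: the complete graph on 4 vertices.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(random_node_sample(k4, 0.5, seed=1))
print(random_edge_sample(k4, 0.5, seed=1))
```

One simple way to compare the two methods is to compute the same statistic (for example, the degree distribution or the number of connected components) on each sample and on the original graph.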


Graph Sparsification and Sampling

Blank code and data available on website (Lecture 19):
www.cs.rpi.edu/~slotag/classes/FA16/index.html
