
Graph Sampling and Sparsification, Lecture 19

CSCI 4974/6971

7 Nov 2016


Today’s Biz

1. Reminders

2. Review

3. Graph Sampling/Sparsification


Reminders

• Assignment 4: due November 10th
  – Setting up and running on CCI clusters
• Assignment 5: due date TBD (before Thanksgiving break, probably the 22nd)
• Assignment 6: due date TBD (early December)
• Tentative: no class November 14 and/or 17
• Final Project Presentation: December 8th
• Project Report: December 11th
• Office hours: Tuesday & Wednesday 14:00-16:00, Lally 317
  – Or email me for other availability


Quick Review

Graph Compression:


Sampling and Summarization for Social Networks
PAKDD 2013 Tutorial

Shou-De Lin*, Mi-Yen Yeh#, and Cheng-Te Li*

* Computer Science and Information Engineering, National Taiwan University
# Institute of Information Science, Academia Sinica

Tutorial slides can be downloaded here: http://mslab.csie.ntu.edu.tw/tut‐pakdd13/

About This Tutorial

• This is a two-hour tutorial for PAKDD 2013 on social network sampling and summarization
  – We do not expect to cover everything relevant to this topic
  – We will highlight the trends, categorize the different types of strategies, and describe some of our ongoing work
• Agenda
  – Introduction + Sampling + Q/A (45+10 min)
  – Summarization + Conclusion + Q/A (45+10 min)

13/05/02 Lin et al., Sampling and Summarization for Social Networks, PAKDD 2013 tutorial 2

Motivation

Big Social Network: billions of different types of nodes and links

[Figure: visualization of a large social network, by Paul Butler. What can be mined from this picture?]

[Figure: example networks, labeled >1 billion, >500 million, and >200 million]

• Sometimes the full network is not completely observed in advance

• Even if it is, loading everything into memory for further analysis might not be feasible

• Even if that is feasible, generating simple statistics (e.g. average path length, diameter) can take a long time, not to mention more complicated ones (e.g. counting the occurrences of a certain pattern)

An Example on Facebook

• 1+ billion users
• Average: 130 friends per node
• It costs >1 TB of memory simply to store the raw graph data (without attributes, labels, or content)
• This can cause problems for information extraction, processing, and analysis

Two possible solutions: Sampling and Summarization
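A rough sanity check of the ">1 TB" figure for the Facebook example. The sketch assumes that each friendship is stored in both endpoints' adjacency lists and that each neighbor ID takes 8 bytes; neither assumption is stated on the slide, and `adjacency_bytes` is a name made up for this illustration.

```python
# Back-of-the-envelope memory estimate for a Facebook-scale friendship graph.
NUM_USERS = 1_000_000_000   # from the slide: 1+ billion users
AVG_FRIENDS = 130           # from the slide: average number of friends per node
BYTES_PER_ID = 8            # assumption: one 64-bit integer per stored neighbor ID


def adjacency_bytes(num_nodes, avg_degree, bytes_per_id):
    """Bytes needed to store every node's neighbor list (each edge appears twice)."""
    return num_nodes * avg_degree * bytes_per_id


total = adjacency_bytes(NUM_USERS, AVG_FRIENDS, BYTES_PER_ID)
print(f"{total / 1e12:.2f} TB")  # prints "1.04 TB"
```

Node attributes, labels, and content would come on top of this.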

Sampling Versus Summarization

• Sampling
  – Assume that information about nodes/links becomes known only after they are sampled
  – Requires a sampling strategy to explore/expand the network gradually
  – Goal: gradually identify a small set of representative nodes and links of a social network, usually given little prior information about the network
• Summarization
  – The entire social network is known in advance
  – Goal: condense the social network as much as possible without losing too much information


Homogeneous vs. Heterogeneous Social Networks

• Homogeneous single-relational network
  – Single object type and link type
• Heterogeneous multi-relational network
  – Multiple object types and link types
• Example
  – Homogeneous: link types: Friend
  – Heterogeneous: link types: Friend, Family, Love

Sampling for Social Networks

Sampling Social Networks

• Assume that the detailed information of a node can be seen only after it is sampled
  – The entire social network is not known in advance
• Goal
  – Sample (i.e. gradually observe nodes and links) a sub-network that represents the whole network
  – The sample should preserve certain properties of the original network


Evaluating the Sampling Quality

• How to measure the quality of the sampling algorithm?

• A sampling algorithm is effective if
  – The sampled social network preserves certain network properties
  – Using the sampled network to perform an ultimate task (e.g. centrality analysis, link prediction, etc.), one can produce results similar to those obtained on the fully observed network
  – The sampled sub-network is small


Properties Preserved (1/3)

• Homogeneous Static Social Network
  – In/Out Degree Distribution
  – Path Length Distribution
  – Clustering Coefficient Distribution
  – Eigenvalues
  – Weakly/Strongly Connected Component Size Distribution
  – Community Structure
  – Etc.


Properties Preserved (2/3)

• Homogeneous Dynamic Social Networks (graphs are time-evolving)
  – Densification Power Law
    • Number of edges vs. number of nodes over time
  – Shrinking diameter
    • The diameter is observed to shrink and stabilize over time
  – Average clustering coefficient over time
  – Largest singular value of the graph adjacency matrix over time
  – Etc.

Properties Preserved (3/3)

• Heterogeneous Social Network
  – Node type distribution
  – Intra-link and inter-link type distribution
  – Higher-order type connections


Evaluation Metrics

• Whether certain properties are preserved
  – For single-value properties (e.g. clustering coefficient, average path length), one can measure whether the value is preserved
  – For distributional properties (e.g. degree distribution, component size distribution), one can compute the distance between the two distributions (e.g. KL divergence)
• Whether a certain end task can be performed similarly
  – Perform the task using the sampled network, and check whether the results are similar to those obtained with the full network
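As an illustration of the distributional check, the sketch below compares the degree distribution of a sampled subgraph with that of the full graph using KL divergence. The adjacency-dict representation, the function names, and the small `eps` smoothing (added so that degrees missing from one distribution do not produce log(0)) are assumptions made for this example.

```python
import math
from collections import Counter


def degree_distribution(adj):
    """Return {degree: fraction of nodes with that degree} for an adjacency dict."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in counts.items()}


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over the union of supports, with eps smoothing (an assumption)."""
    support = set(p) | set(q)
    ps = {k: p.get(k, 0.0) + eps for k in support}
    qs = {k: q.get(k, 0.0) + eps for k in support}
    zp, zq = sum(ps.values()), sum(qs.values())
    return sum((ps[k] / zp) * math.log((ps[k] / zp) / (qs[k] / zq)) for k in support)


# Toy undirected graph and the subgraph induced by a sample of its nodes.
full = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
sample_nodes = {1, 2, 3}
sample = {v: full[v] & sample_nodes for v in sample_nodes}

p = degree_distribution(full)
q = degree_distribution(sample)
print(kl_divergence(q, p))  # 0 would mean the two distributions match
```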


Sampling for Homogeneous Social Networks

Three Main Strategies

• Node Selection
• Edge Selection
• Sampling by Exploration
  – Random Walk
  – Graph Search
  – Chain-Referral Sampling

[Figure label: seeds (i.e., ego)]


Node Selection

• Random Node Sampling
  – Uniformly select a set of nodes
• Degree-based Sampling [Adamic’01]
  – The probability of a node being selected is proportional to its degree (assumed known)
• PageRank-based Sampling [Leskovec’06]
  – The probability of a node being selected is proportional to its PageRank value (assumed known)
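A minimal sketch of uniform and degree-based node selection on an adjacency dict. Drawing without replacement and returning the induced subgraph are choices made for this example, not something the slide specifies; the function names are hypothetical. PageRank-based sampling would follow the same pattern with PageRank values in place of degrees as weights.

```python
import random


def random_node_sample(adj, k, rng):
    """Uniformly select k distinct nodes."""
    return set(rng.sample(sorted(adj), k))


def degree_based_sample(adj, k, rng):
    """Select k distinct nodes; each draw is proportional to degree among the rest."""
    remaining = sorted(adj)
    chosen = set()
    while len(chosen) < k and remaining:
        weights = [len(adj[v]) for v in remaining]
        if sum(weights) == 0:  # only isolated nodes are left
            break
        v = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.add(v)
        remaining.remove(v)
    return chosen


def induced_subgraph(adj, nodes):
    """Keep only the edges whose endpoints were both selected."""
    return {v: adj[v] & nodes for v in nodes}


adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4}}
rng = random.Random(0)
uniform_nodes = random_node_sample(adj, 3, rng)
hub_biased_nodes = degree_based_sample(adj, 3, rng)
print(induced_subgraph(adj, uniform_nodes))
print(induced_subgraph(adj, hub_biased_nodes))
```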


Edge Selection

• Random Edge (RE) Sampling
  – Uniformly select edges at random, and then include the associated nodes
• Random Node-Edge (RNE) Sampling
  – Uniformly select a node, then uniformly select an edge incident to it
• Hybrid Sampling [Leskovec’06]
  – With probability p perform RE sampling, with probability 1-p perform RNE sampling
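The three edge-selection rules can be sketched as follows. The code assumes an undirected graph stored as an adjacency dict with comparable node labels, skips isolated nodes in the RNE step, and stops once a requested number of distinct edges has been collected; these details and all names are assumptions of the sketch.

```python
import random


def random_edge(edges, rng):
    """RE: pick one edge uniformly at random."""
    return rng.choice(edges)


def random_node_edge(adj, rng):
    """RNE: pick a node uniformly, then one of its incident edges uniformly."""
    candidates = [v for v in sorted(adj) if adj[v]]  # skip isolated nodes
    u = rng.choice(candidates)
    v = rng.choice(sorted(adj[u]))
    return (u, v)


def hybrid_sample(adj, num_edges, p, rng):
    """With probability p take an RE step, otherwise an RNE step."""
    edges = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
    sampled = set()
    while len(sampled) < min(num_edges, len(edges)):
        e = random_edge(edges, rng) if rng.random() < p else random_node_edge(adj, rng)
        sampled.add(tuple(sorted(e)))
    nodes = {x for e in sampled for x in e}
    return nodes, sampled


adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4}}
nodes, edges = hybrid_sample(adj, num_edges=3, p=0.8, rng=random.Random(1))
print(nodes, edges)
```

Setting p=1 gives pure RE sampling and p=0 gives pure RNE sampling.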


Edge Selection (cont.)

• Induced Edge Sampling [Ahmed’12]
  – Step 1: Uniformly select edges (and consequently nodes) for several rounds
  – Step 2: Add the edges that exist between sampled nodes
• Frontier Sampling [Ribeiro’10]
  – Step 0: Randomly select a set of nodes L as seeds
  – Step 1: Select a seed u from L using degree-based sampling
  – Step 2: Select an edge of u, (u, v), uniformly
  – Step 3: Replace u by v in L and add (u, v) to the sequence of sampled edges
  – Repeat Steps 1 to 3
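A compact sketch of both procedures on an undirected adjacency dict. The number of rounds, the number of seeds, and the fixed step budget are parameters invented for the example; seeds are drawn only from nodes that have at least one edge.

```python
import random


def induced_edge_sample(adj, num_picks, rng):
    """Step 1: pick edges uniformly; Step 2: add every edge among the sampled nodes."""
    edges = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
    nodes = set()
    for _ in range(num_picks):
        u, v = rng.choice(edges)
        nodes.update((u, v))
    induced = {(u, v) for (u, v) in edges if u in nodes and v in nodes}
    return nodes, induced


def frontier_sample(adj, num_seeds, num_steps, rng):
    """Frontier sampling: degree-biased choice among several coupled walkers."""
    candidates = [v for v in sorted(adj) if adj[v]]
    frontier = rng.sample(candidates, num_seeds)              # Step 0
    sampled_edges = []
    for _ in range(num_steps):
        weights = [len(adj[u]) for u in frontier]
        i = rng.choices(range(len(frontier)), weights=weights, k=1)[0]  # Step 1
        u = frontier[i]
        v = rng.choice(sorted(adj[u]))                        # Step 2
        frontier[i] = v                                       # Step 3
        sampled_edges.append((u, v))
    return sampled_edges


adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4}}
rng = random.Random(7)
ies_nodes, ies_edges = induced_edge_sample(adj, num_picks=2, rng=rng)
fs_edges = frontier_sample(adj, num_seeds=2, num_steps=6, rng=rng)
print(ies_nodes, ies_edges)
print(fs_edges)
```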

Sampling by Exploration

• Random Walk [Gjoka’10]
  – The next-hop node is chosen uniformly among the neighbors of the current node
• Random Walk with Restart [Leskovec’06]
  – Uniformly select a random node and perform a random walk with restarts
• Random Jump [Ribeiro’10]
  – Same as random walk, but with probability p we jump to any node in the network
• Forest Fire [Leskovec’06]
  – Choose a node u uniformly
  – Generate a random number z and select z out-links of u that are not yet visited
  – Apply this step recursively to all newly added nodes
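A sketch of two of the exploration strategies: a random walk with an optional random jump, and forest fire. Several details are assumptions of this example rather than part of the slide: the walk jumps when it reaches a node without neighbors, forest fire draws z from a geometric distribution controlled by `p_forward`, and both stop once a target number of nodes has been collected.

```python
import random
from collections import deque


def random_walk_sample(adj, start, num_nodes, rng, jump_prob=0.0, max_steps=100_000):
    """Random walk; with probability jump_prob jump to a uniformly chosen node."""
    nodes = sorted(adj)
    visited = {start}
    current = start
    for _ in range(max_steps):
        if len(visited) >= num_nodes:
            break
        if rng.random() < jump_prob or not adj[current]:
            current = rng.choice(nodes)                 # random jump
        else:
            current = rng.choice(sorted(adj[current]))  # uniform next hop
        visited.add(current)
    return visited


def forest_fire_sample(adj, start, num_nodes, rng, p_forward=0.7):
    """Burn z random unvisited neighbors of each newly added node, recursively."""
    visited = {start}
    queue = deque([start])
    while queue and len(visited) < num_nodes:
        u = queue.popleft()
        z = 0
        while rng.random() < p_forward:  # geometric number of links to burn
            z += 1
        unvisited = [w for w in sorted(adj[u]) if w not in visited]
        for w in rng.sample(unvisited, min(z, len(unvisited))):
            if len(visited) >= num_nodes:
                break
            visited.add(w)
            queue.append(w)
    return visited


adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4, 6}, 6: {5}}
rng = random.Random(3)
rw = random_walk_sample(adj, start=1, num_nodes=4, rng=rng)
rj = random_walk_sample(adj, start=1, num_nodes=4, rng=rng, jump_prob=0.15)
ff = forest_fire_sample(adj, start=1, num_nodes=4, rng=rng)
print(rw, rj, ff)
```

The fire can die out before reaching the target size, so `forest_fire_sample` may return fewer nodes than requested.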

Sampling by Exploration (cont.)


• Ego-Centric Exploration (ECE) Sampling
  – Similar to random walk, but each neighbor is selected with probability p
  – Multiple ECE (starting with multiple seeds)
• Depth-First / Breadth-First Search [Krishnamurthy’05]
  – Keep visiting neighbors of the most recently / earliest visited nodes
• Sample Edge Count [Maiya’11]
  – Move to the neighbor with the highest degree, and keep going
• Expansion Sampling [Maiya’11]
  – Construct a sample with the maximal expansion. Select the next node v by
    $v = \arg\max_{v \in N(S)} |N(\{v\}) - (N(S) \cup S)|$
    where S is the set of sampled nodes and N(S) is the 1st neighbor set of S

Example: Expansion Sampling

[Figure: example graph on nodes A, B, C, D, E, F, G, H]

|N({A})| = 4

|N({E}) − (N({A}) ∪ {A})| = |{F, G, H}| = 3

|N({D}) − (N({A}) ∪ {A})| = |{F}| = 1
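The selection rule can be sketched as a greedy loop that, at each step, adds the frontier node contributing the most nodes not yet in S or N(S). The example graph below is a hypothetical reconstruction chosen only to reproduce the three counts of the example (the original figure is not available), and ties are broken by node label, which is an arbitrary choice.

```python
def neighborhood(adj, nodes):
    """N(S): nodes adjacent to S that are not themselves in S."""
    return {w for v in nodes for w in adj[v]} - set(nodes)


def expansion_sample(adj, seed, num_nodes):
    """Greedy expansion sampling: maximize |N({v}) - (N(S) ∪ S)| at each step."""
    sampled = {seed}
    while len(sampled) < num_nodes:
        frontier = neighborhood(adj, sampled)
        if not frontier:
            break
        covered = frontier | sampled
        # max() returns the first maximal element, so ties go to the smallest label
        best = max(sorted(frontier), key=lambda v: len(adj[v] - covered))
        sampled.add(best)
    return sampled


# Hypothetical graph consistent with the counts in the example.
adj = {
    "A": {"B", "C", "D", "E"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "F"},
    "E": {"A", "F", "G", "H"},
    "F": {"D", "E"},
    "G": {"E"},
    "H": {"E"},
}
covered = neighborhood(adj, {"A"}) | {"A"}
print(len(neighborhood(adj, {"A"})))   # 4
print(len(adj["E"] - covered))         # 3
print(len(adj["D"] - covered))         # 1
print(expansion_sample(adj, "A", 2))   # E is added after A
```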

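Drawback of Random Walk: Degree Bias!

[Figure: q_k, the sampled node degree distribution, compared with p_k, the real node degree distribution]

• Real average node degree ~ 94, sampled average node degree ~ 338
• Solution: modify the transition probability:

$$
P(v, w) =
\begin{cases}
\dfrac{1}{k_v} \min\left(1, \dfrac{k_v}{k_w}\right) & \text{if } w \text{ is a neighbor of } v \\[2ex]
1 - \sum_{y \neq v} P(v, y) & \text{if } w = v \\[1ex]
0 & \text{otherwise}
\end{cases}
$$

where k_v denotes the degree of node v.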
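A sketch of a degree-corrected walk of this kind, in the Metropolis-Hastings style: a uniformly chosen neighbor w of the current node v is proposed and accepted with probability min(1, k_v / k_w), where k denotes node degree; otherwise the walk stays at v. The code assumes an undirected graph without self-loops or isolated nodes, and the function names are illustrative.

```python
import random


def mh_transition(adj, v, w):
    """Transition probability P(v, w) of the degree-corrected walk."""
    if w in adj[v]:
        return (1 / len(adj[v])) * min(1.0, len(adj[v]) / len(adj[w]))
    if w == v:
        return 1.0 - sum(mh_transition(adj, v, y) for y in adj[v])
    return 0.0


def mh_random_walk(adj, start, num_steps, rng):
    """Simulate the walk: propose a uniform neighbor, accept with min(1, k_v / k_w)."""
    walk = [start]
    v = start
    for _ in range(num_steps):
        w = rng.choice(sorted(adj[v]))
        if rng.random() < min(1.0, len(adj[v]) / len(adj[w])):
            v = w           # accept the move
        walk.append(v)      # on rejection the walk stays at v
    return walk


adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4}}
print(mh_transition(adj, 5, 4), mh_transition(adj, 4, 5))  # equal: P is symmetric
print(mh_random_walk(adj, start=1, num_steps=10, rng=random.Random(0)))
```

Because P(v, w) = P(w, v) = 1 / max(k_v, k_w) for neighbors, the stationary distribution is uniform over nodes, which is what removes the bias toward high-degree nodes.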

Metropolis Graph Sampling [Hubler’08]

• Step 1: Initially pick one subgraph sample S with n' nodes randomly
• Step 2: Iterate the following steps until convergence
  – 2.1: Remove one node from S
  – 2.2: Randomly add a new node to S, giving S'
  – 2.3: Compute the likelihood ratio $\alpha = \rho^{*}(S') / \rho^{*}(S)$
    • If $\alpha \ge 1$: $S := S'$
    • If $\alpha < 1$: $S := S'$ with probability $\alpha$, and $S := S$ with probability $1 - \alpha$
• $\rho^{*}(S)$ measures the similarity of a certain property between the sample S and the original network G
  – Can be derived approximately using Simulated Annealing
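A sketch of the swap-and-accept loop. The similarity score used here is a stand-in invented for the example: it rewards samples whose mean node degree (measured in the full graph) is close to the full graph's mean degree, with a hypothetical `sharpness` parameter. The loop runs for a fixed number of iterations instead of testing convergence, and no simulated annealing schedule is included.

```python
import math
import random


def property_gap(adj, sample):
    """Stand-in dissimilarity: |mean degree of sample nodes - mean degree of G|."""
    full_mean = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
    sample_mean = sum(len(adj[v]) for v in sample) / len(sample)
    return abs(sample_mean - full_mean)


def metropolis_sample(adj, size, num_iters, rng, sharpness=5.0):
    """Metropolis node swaps; the score is taken as exp(-sharpness * gap)."""
    nodes = sorted(adj)
    current = set(rng.sample(nodes, size))            # Step 1
    for _ in range(num_iters):                        # Step 2, fixed budget
        outside = [v for v in nodes if v not in current]
        if not outside:
            break
        removed = rng.choice(sorted(current))         # 2.1
        added = rng.choice(outside)                   # 2.2
        proposal = (current - {removed}) | {added}
        # 2.3: log of the likelihood ratio score(S') / score(S)
        log_ratio = -sharpness * (property_gap(adj, proposal) - property_gap(adj, current))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
    return current


adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4, 6}, 6: {5}}
result = metropolis_sample(adj, size=3, num_iters=200, rng=random.Random(11))
print(result, property_gap(adj, result))
```

Working with the log of the ratio avoids dividing by a score that has underflowed to zero.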

Sampling for Heterogeneous Social Networks

Sampling on Heterogeneous Social Networks

• Heterogeneous Social Networks (HSN)
  – A graph G = <V, E> has n nodes (v1, v2, …, vn), m directed edges (e1, …, em), and k different types
  – Each node/edge belongs to a type
    • Given a finite set L = {L1, ..., Lk} denoting the k types
• Sampling methods for HSN
  – Multi-graph sampling [Gjoka’10]
  – Type-distribution preserving sampling [Li’11]
  – Relational-profile preserving sampling [Yang’13]

Multigraph Sampling

• Run random-walk sampling on the union multigraph of several relation graphs, so that the walk does not get stuck when an individual graph is disconnected.

Node Type Distribution Preserving Sampling

• Given a graph G and a sampled subgraph GS

• The node type distribution of GS is expected to be the same as that of G, i.e., d(Dist(GS), Dist(G)) = 0
  – d() denotes the difference between two distributions

[Figure: an original network and a sampled network whose node-type ratios match, (9:6) = (3:2)]


Connection-type Preserving Sampling

• Heterogeneous Connection
  – For an edge E[vi, vj]
  – Intra-connection edge: Type(vi) = Type(vj)
  – Inter-connection edge: Type(vi) != Type(vj)
• Intra-Relationship preserving
  – The ratio of intra-connections should be preserved, that is: d(IR(GS), IR(G)) = 0
  – If the intra-relationship is preserved, the inter-relationship is also preserved
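Both preservation criteria can be checked with a few lines of code. The sketch below computes the node-type distribution and the intra-connection ratio, and uses an L1 distance for d(); the choice of L1 is an assumption, since the slides leave d() unspecified. The toy data mirrors the (9:6) = (3:2) ratio.

```python
from collections import Counter


def node_type_distribution(node_type, nodes):
    """Fraction of the given nodes that fall into each type."""
    counts = Counter(node_type[v] for v in nodes)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def intra_ratio(edges, node_type):
    """IR: fraction of edges whose two endpoints share a type."""
    intra = sum(1 for u, v in edges if node_type[u] == node_type[v])
    return intra / len(edges)


def l1_distance(p, q):
    """d(): L1 distance between two distributions (an assumed choice)."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# 15 nodes with type ratio 9:6, and a 5-node sample with ratio 3:2.
node_type = {i: ("a" if i < 9 else "b") for i in range(15)}
sample_nodes = [0, 1, 2, 9, 10]
d = l1_distance(
    node_type_distribution(node_type, node_type),
    node_type_distribution(node_type, sample_nodes),
)
print(d)  # the node-type distribution is preserved when this is 0

edges = [(0, 1), (1, 2), (2, 9), (9, 10)]
print(intra_ratio(edges, node_type))  # 3 of the 4 edges are intra-type
```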


Respondent-driven Sampling

• First proposed in social science [Heck’99] to address hidden populations in surveys
• Two main phases:
  – Snowball sampling
  – Finding the steady state of the recruitment matrix

[Figure: snowball sampling on G, in which each respondent recruits with a limited number of coupons c; the recruitment (transition) matrix with entries S11 … S33 is taken through N-step transitions to a steady-state vector (P1, P2, P3)]
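The second phase, finding the steady-state vector of a recruitment (transition) matrix, can be sketched with power iteration. The 3×3 row-stochastic matrix below is made-up illustration data, not taken from the tutorial, and the tolerance and iteration cap are arbitrary.

```python
def steady_state(matrix, tol=1e-12, max_iters=10_000):
    """Power iteration: apply pi <- pi * S until the vector stops changing."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(max_iters):
        nxt = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical recruitment matrix: row i gives the probabilities that a
# respondent of type i recruits a respondent of each type.
S = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]
pi = steady_state(S)
print([round(x, 4) for x in pi])
```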

Comparing Different Sampling Algorithms

[Figures: similarity of node type distribution; similarity of intra-link distribution]

• Respondent-driven sampling does well when few nodes have been sampled, but its quality saturates at a mediocre level afterwards

• Random node sampling performs poorly in the beginning, but reaches the best results once a sufficient number of nodes has been sampled

Relational Profile Preserving Sampling

• Node-type/intra-type preservation considers the semantics of nodes, but not the structure of the network
• The Relational Profile is proposed to consider semantics and structure together
  – Captures the dependency between each Node Type (NT) and Edge Type (ET) of a directed heterogeneous network
  – Consists of 4 relational matrices
    • Conditional probabilities P(Tj|Ti) (e.g. P(LT=cites|NT=paper))
    • Node to node, node to edge, edge to node, edge to edge

[Figure: the relational profile as a 2×2 block of transition matrices over NT and ET, illustrated with node types paper and author and link types cites, authored, and journal_of]
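To make the conditional probabilities concrete, the sketch below estimates one of the four matrices, node type to node type, from a list of directed edges. The tiny bibliographic network and the function name are hypothetical; the node-to-edge, edge-to-node, and edge-to-edge matrices would be estimated analogously from typed edges.

```python
from collections import Counter, defaultdict


def node_to_node_profile(edges, node_type):
    """Estimate P(type of target | type of source) from directed edges."""
    counts = defaultdict(Counter)
    for u, v in edges:
        counts[node_type[u]][node_type[v]] += 1
    return {
        ti: {tj: c / sum(row.values()) for tj, c in row.items()}
        for ti, row in counts.items()
    }


# Hypothetical network: papers cite papers and appear in a journal,
# authors point to the papers they wrote.
node_type = {"p1": "paper", "p2": "paper", "p3": "paper", "a1": "author", "j1": "journal"}
edges = [("p1", "p2"), ("p1", "p3"), ("p2", "j1"), ("a1", "p1")]
profile = node_to_node_profile(edges, node_type)
print(profile["paper"])   # paper -> paper 2/3, paper -> journal 1/3
print(profile["author"])  # author -> paper 1.0
```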


Example of Relational Profile (RP)

P A C J c p a

P 0.44 0.22 0.22 0.11 0.44 0.33 0.22

A 1 1

C 1 1

J 1 1

c 1 0.22 0.44 0.33

p 0.5 0.33 0.17 0.66 0.33

a 0.5 0.5 0.6 0.4

P A C J c p a

P 0.182 0.364 0.091 0.273 0.182 0.364 0.364

A 1 1

C 1 1

J 1 1

c 1 0.5 0.5

p 0.5 0.125 0.375 0.17 0.5 0.33

a 0.5 0.5 0.22 0.33 0.44


Challenge: How to approximate the RP when the true RP is unknown

• We propose Exploration by Expectation Sampling
• Aim: preserve the unknown relational profile while adding new sample nodes
  1. Randomly choose a starting node and the corresponding edges
  2. Based on the current RP, select the next node from all 1-degree neighbors
  3. Add the new node and all its edges
  4. Update the RP of the sub-sampled graph
  5. Repeat steps 2 to 4 until the RP converges
• Which node should be selected?
  – Select the node whose inclusion can potentially lead to the largest change to the existing RP
    • Use the partially observed RP to generate the "expected amount of change" of each node as its score
    • Weighted sampling based on the score


Relational Profile Sampling (RPS)

• Idea: sample to increase the diversity
• Goal: maximize the expected change of the property (the Relational Profile distribution)
• D(v, Gs) = estimated change of the RP given sampling v on the current graph Gs
  = E[ΔP(Gs, Gs+v) | Gs], where ΔP = RMSE of the RP
• Exploiting the existing RP, P(type(v)=t|Gs) can be obtained using the observed types of v’s neighbors
• P(type|type) can be obtained from the existing RP

[Figure: a candidate node v attached to the current sample Gs, with each connecting edge labeled RP(type|type)]

Evaluation

• Datasets: 3 real-life large-scale social networks
• Baselines:
  – Random Walk Sampling (RW)
  – Degree-based Sampling (HDS)
• Evaluation I (Property Preservation): see how well the sampled network approximates two properties of the full network
• Evaluation II (Prediction): train a prediction model on the sampled network to infer the status of the out-of-sample network:
  – Node Type Prediction: predict the type of unseen nodes in the network using a sub-sampled network
  – Missing Relations Prediction: recover/predict the missing links
  – Features:
    • fdeg = (in/out deg; avg in/out deg of neighbors)
    • ftopo = (Common Neighbors; Jaccard’s Coefficient; etc.)
    • fnt = P(type(v)|Gs)
    • fRPnode
    • fRPpath

Experiments (Property Preservation)

• RP (RMSE): type dependency preservation
• Weighted PageRank (Kendall-Tau): preserving relative node weights propagated throughout the entire network

[Figures: both measures plotted against # nodes sampled (in 10s) for RW, HDS, and RPS on the Hep, Aca, and Movie datasets]

Experiments (Prediction)

• We show the Academic Network for brevity.

[Figures: accuracy versus number of sampled nodes for highDeg, RandWalk, and RPS; panels: Node Type Prediction, Missing Relation Prediction]

Task-driven Network Sampling

• Sampling Community Structure [Maiya’10] [Satuluri’11]
• Sampling Network Backbone for Influence Maximization [Mathioudakis’11]
• Sampling High Centrality Individuals [Maiya’10]
• Sampling Personalized PageRank Values [Vattani’11]
• Sampling Network for Link/Label Prediction [Ahmed’12]


Short Summary


• Node and Edge Selection: [Leskovec’06] [Adamic’01] [Ahmed’12] [Ribeiro’10] [Kurant’12]
• Sampling by Exploration
  – Homogeneous SN: [Krishnamurthy’05] [Leskovec’06] [Hubler’08] [Gjoka’10] [Ribeiro’10] [Maiya’11] [Kurant’11]
  – Heterogeneous SN: [Gjoka’11] [Li’11] [Kurant’12] [Yang’13]
• Task-driven Sampling: [Maiya’10] [Satuluri’11] [Mathioudakis’11] [Vattani’11] [Ahmed’12]

• Why sample a social network?
  – The full network (e.g. Facebook) cannot be fully observed
  – Crawling can be costly in terms of resources and time (therefore a smart sampling strategy is needed)

Detecting Community Structures in Social Networks by Graph Sparsification

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder

Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India

September 5, 2016


Figure: The tendency of people to live in racially homogeneous neighborhoods [1]. In yellow and orange blocks the percentage of African-American residents is ≤ 25; in brown and black blocks it is ≥ 75.



Definition of a Community

For a given graph G(V, E), find a cover C = {C1, C2, ..., Ck} such that ⋃i Ci = V.

For disjoint communities, ∀ i ≠ j we have Ci ⋂ Cj = ∅

For overlapping communities, ∃ i ≠ j where Ci ⋂ Cj ≠ ∅

Figure: Zachary’s Karate Club Network

C = {C1, C2, C3}, with C1 = yellow nodes, C2 = green, C3 = blue, is a disjoint cover

However, C = {C1, C2}, with C1 = yellow & green nodes and C2 = blue & green nodes, is an overlapping cover

For our problem, we concentrate on disjoint community detection


A Little Background: Edge Betweenness Centrality

$$c_B(e) = \sum_{\substack{s, t \in V \\ s \neq t}} \frac{\sigma(s, t \mid e)}{\sigma(s, t)}$$

where σ(s, t) is the number of shortest paths between s and t, and σ(s, t | e) is the number of those paths that pass through edge e.

Top 6 edges

Edge      cB(e)    Type
(10, 13)  0.3      inter
(3, 5)    0.23333  inter
(7, 15)   0.2079   inter
(1, 8)    0.1873   inter
(13, 15)  0.1746   intra
(5, 7)    0.1476   intra

Bottom 6 edges

Edge      cB(e)    Type
(8, 11)   0.022    intra
(1, 2)    0.0269   intra
(9, 11)   0.031    intra
(8, 9)    0.0412   intra
(12, 15)  0.052    intra
(3, 4)    0.060    intra



The Girvan-Newman Algorithm

Proposed by Michelle Girvan and Mark Newman[2] in 2002

The Key Ideas

Based on reachability of nodes - shortest paths

Edges are selected on the basis of the edge betweenness centrality

The algorithm

1 Compute centrality for all edges

2 Remove edge with largest centrality; ties can be broken randomly

3 Recalculate the centralities on the running graph

4 Iterate from step 2, stop when you get clusters of desirable quality
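The four steps can be sketched in plain Python. Edge betweenness is computed with a Brandes-style accumulation and left unnormalized (each unordered node pair counted once), so the values are on a different scale from any normalized figures. Stopping at a requested number of components stands in for "clusters of desirable quality", and ties are broken by dictionary order rather than randomly; both are assumptions of this sketch.

```python
from collections import deque


def edge_betweenness(adj):
    """Unnormalized edge betweenness of an undirected, unweighted graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:                     # BFS from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate dependencies back toward s
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    return {e: val / 2.0 for e, val in score.items()}  # each pair was seen twice


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman(adj, num_communities):
    """Remove the highest-betweenness edge until enough components exist."""
    g = {v: set(nbrs) for v, nbrs in adj.items()}
    while len(components(g)) < num_communities:
        scores = edge_betweenness(g)     # steps 1 and 3: (re)compute centralities
        if not scores:
            break
        u, v = max(scores, key=scores.get)  # step 2: largest centrality
        g[u].discard(v)
        g[v].discard(u)
    return components(g)


# Two triangles joined by the bridge (3, 4).
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
eb = edge_betweenness(adj)
print(eb[frozenset((3, 4))])   # 9.0: all 3 x 3 cross pairs use the bridge
print(girvan_newman(adj, 2))
```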





Figure: Girvan-Newman in action. (a) Best edge: (10, 13) (b) Best edge: (3, 5) (c) Best edge: (7, 15) (d) Best edge: (1, 8) (e) Best edge: (2, 11) (f) Final graph



Louvain Method: A Greedy Approach

Proposed by Blondel et al[3] in 2008

Takes a greedy modularity-maximization approach

Very fast in practice; it is the current state of the art in disjoint community detection

Performs hierarchical partitioning, stopping when there cannot be any further improvement in modularity

Contracts the graph in each iteration, thereby speeding up the process
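The quantity being improved, modularity, can be computed directly for a given partition. The sketch below uses the standard definition for an unweighted, undirected graph, Q = Σ_c [ L_c / m − (d_c / 2m)² ], where L_c is the number of edges inside community c, d_c is the sum of degrees in c, and m is the total number of edges. It only evaluates a partition and does not implement the Louvain moves or the graph contraction.

```python
def modularity(adj, communities):
    """Modularity of a partition of an unweighted, undirected graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    q = 0.0
    for comm in communities:
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        degree_sum = sum(len(adj[u]) for u in comm)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Two triangles joined by a bridge.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(modularity(adj, [{1, 2, 3}, {4, 5, 6}]))   # 5/14, about 0.357
print(modularity(adj, [{1, 2, 3, 4, 5, 6}]))     # 0.0 for a single community
```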



































Louvain Method in Action



Outline for Part I

1 Building Community Preserving Sparsified Network
  – Assigning Meaningful Weights to Edges
  – Sparsification using t-spanner

2 Fast Detection of Communities from the Sparsified Network
  – Methodology and Visualizations
  – Experimental Results




Our Method

Input: An unweighted network G(V, E)

Output: A disjoint cover C

1 Use the Jaccard coefficient to turn G into a weighted network G(V, E, W)

2 Construct a t-spanner GS of G(V, E, W). Take the complement of GS and call it Gcomm

3 Use LINCOM to break Gcomm into small but pure fragments

4 Use the second phase of the Louvain Method to piece all the small bits and pieces together to get C

Jaccard Intro

Definition

$$w_J(e(v_i, v_j)) = \frac{|\Gamma(v_i) \cap \Gamma(v_j)|}{|\Gamma(v_i) \cup \Gamma(v_j)|}$$

where Γ(v_i) is the neighborhood of the node v_i; ∴ w_J ∈ [0, 1]

Jaccard works well in domains where local influence is important[4][5][6]

The computation takes O(m) time
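A sketch of the Jaccard edge weighting on an adjacency dict. It assumes Γ(v) is the open neighborhood (v itself is not included), which the slide does not state explicitly, and comparable node labels; the function name is illustrative.

```python
def jaccard_weights(adj):
    """w_J(u, v) = |Γ(u) ∩ Γ(v)| / |Γ(u) ∪ Γ(v)| for every edge (u, v)."""
    weights = {}
    for u in adj:
        for v in adj[u]:
            edge = tuple(sorted((u, v)))
            if edge in weights:
                continue
            union = adj[u] | adj[v]   # never empty: u and v are each other's neighbors
            weights[edge] = len(adj[u] & adj[v]) / len(union)
    return weights


# Two triangles joined by the bridge (3, 4).
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
w = jaccard_weights(adj)
print(w[(1, 2)])  # 1/3: the endpoints share neighbor 3
print(w[(3, 4)])  # 0.0: the bridge endpoints share no neighbors
```

In this toy graph the inter-community bridge receives the lowest weight, while edges inside a triangle receive higher weights.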








Jaccard Example











Table: Jaccard weight statistics for the top 10% of edges in terms of wJ. The last three columns refer to the top 10% of edges in terms of wJ.

Network    |E|         Intra-cluster edge count   Total edges   Intra-edge   Fraction
Karate     78          21                         7             7            1
Dolphin    159         39                         15            15           1
Football   613         179                        61            61           1
Les-Mis    254         56                         25            25           1
Enron      180,811     48,498                     18,383        18,220       0.99113
Epinions   405,739     146,417                    40,573        36,589       0.90180
Amazon     925,872     54,403                     92,587        92,584       0.99996
DBLP       1,049,866   164,268                    104,986       104,986      1




Spanner

An (α, β)-spanner of a graph G = (V, E, W) is a subgraph GS = (V, ES, WS) such that

$$\delta_S(u, v) \le \alpha \cdot \delta(u, v) + \beta \quad \forall\, u, v \in V$$

A t-spanner is a special case of an (α, β)-spanner where α = t and β = 0

Authors                        Size                 Running Time
Althofer et al. [1993] [7]     O(n^{1+1/k})         O(m(n^{1+1/k} + n log n))
Althofer et al. [1993] [7]     (1/2) n^{1+1/k}      O(m n^{1+1/k})
Roditty et al. [2004] [8]      (1/2) n^{1+1/k}      O(k n^{2+1/k})
Roditty et al. [2005] [9]      O(k n^{1+1/k})       O(km) (det.)
Baswana and Sen [2007] [10]    O(k n^{1+1/k})       O(km) (rand.)










Figure: Original network, n = 11, m = 18, δ(1, 5) = 5

Figure: A 3-spanner of the network, n = 11, m = 11, δs(1, 5) = 12

Since δs(1, 5) < t · δ(1, 5), the edge (1, 5) is discarded. The other edges are discarded in a similar fashion.
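The discard rule can be sketched with the classic greedy spanner construction: scan edges by increasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within t times the edge weight. This is a generic illustration and not necessarily the construction used by the authors; node labels are assumed to be comparable and weights positive.

```python
import heapq


def shortest_distance(adj_w, source, target, cutoff):
    """Dijkstra distance from source to target; nodes beyond cutoff are not expanded."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")) or d > cutoff:
            continue
        for v, w in adj_w[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def greedy_spanner(nodes, weighted_edges, t):
    """Keep edge (u, v, w) only if the current spanner distance exceeds t * w."""
    spanner = {v: {} for v in nodes}
    kept = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        if shortest_distance(spanner, u, v, t * w) > t * w:
            spanner[u][v] = w
            spanner[v][u] = w
            kept.append((u, v, w))
    return kept


# Unit-weight triangle: with t = 2 the third edge is redundant (detour length 2 <= 2 * 1).
triangle = [(1, 2, 1), (2, 3, 1), (1, 3, 1)]
print(greedy_spanner([1, 2, 3], triangle, t=2))    # two edges are kept
print(greedy_spanner([1, 2, 3], triangle, t=1.5))  # all three edges are kept
```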




Figure: Dolphin network. n = 62, m = 159




Figure: 3-spanner. n = 62, m = 150



















Name       n     Spanner    #intra-community   #inter-community
Karate     34    Original   59                 19
                 3          57                 19
                 5          53                 19
                 7          51                 18
                 9          48                 19
Dolphin    59    Original   120                39
                 3          117                38
                 5          102                38
                 7          100                38
                 9          90                 38
Football   115   Original   447                163
                 3          385                166
                 5          376                166
                 7          293                166
                 9          286                165




Figure: Original US Football network

Figure: Sparsified network Gcomm

Figure: Final network with communities marked as separate components


Today: In class work

• Implement node and edge sampling methods
• Compare their efficacy on various networks

Graph Sparsification and Sampling

Blank code and data available on website (Lecture 19)
www.cs.rpi.edu/~slotag/classes/FA16/index.html