
Page 1: Analysis and Modeling of Social Networks

Analysis and Modeling of Social Networks

Foudalis Ilias

Page 2: Analysis and Modeling of Social Networks

Introduction

Online social networks have become a ubiquitous part of everyday life

Opportunity to study social interactions in a large-scale worldwide environment

Why model such networks?
  Understand their evolution and formation
  Improve current systems and build better applications
  Advance the state of the art in closely related fields (such as diffusion of information)

Page 3: Analysis and Modeling of Social Networks

Social and Information Networks

Social Networks
  Mainly undirected graphs
  Connect people
  Nodes with more similar degrees (limited capacity of social ties)

Information Networks
  Tend to be directed graphs
  Connect web pages or other units of information
  Few nodes with an extremely large number of incoming links

Page 4: Analysis and Modeling of Social Networks

Statistical characteristics of social networks

Exhibit small diameter and small average path length
  Also known as the “small world phenomenon”
Clustering coefficients tend to be larger than in comparable random graphs
Degree distributions tend to exhibit fat tails
High degree nodes tend to be connected with other high degree nodes
Neighbors of a high degree node are less likely to be connected with each other

Page 5: Analysis and Modeling of Social Networks

Related work

Internet
  Watts and Strogatz (1998), simple model that exhibits small world characteristics
  Barabási and Albert (1999), preferential attachment models, power law distributions
  Kumar et al. (2000), link copying model, power law distributions
  Klemm and Eguíluz (2002), preferential attachment with fertile nodes, small world properties
Social Networks
  Jackson and Rogers (2006), random meetings and local search
  Kumar et al. (2006), preferential attachment, different types of nodes

Page 6: Analysis and Modeling of Social Networks

Our algorithm, General Description

People by default are part of certain groups
  A person will have a high chance to connect to people in the same group
People also make connections to people they meet at random
  To capture this effect we introduce random walks
  In a random walk a person will have a higher chance to connect with social or famous persons
  As time passes “older” persons will do fewer random walks

Page 7: Analysis and Modeling of Social Networks

Our algorithm, Group Formation

First Pass Clique Formation

[Figure: first-pass clique formation on six nodes labeled 1 to 6, shown step by step]

Page 8: Analysis and Modeling of Social Networks

Our algorithm, Group Formation

Second Pass Clique Formation

[Figure: second-pass clique formation on six nodes, shown in the orders 1 2 3 4 5 6, 3 4 1 2 6 5 and 6 3 2 1 5 4]

Page 9: Analysis and Modeling of Social Networks

Our Algorithm, Group Formation

Clique generation (Imaginary graph)
  For FIRST_PASS times
    While the total number of nodes in cliques is less than N
      Get m nodes and put them in a clique
      m will be chosen according to a power law distribution with exponent γ
  Let M be the number of cliques generated from the first pass
  For M times
    Get m nodes and put them in a clique
    m will be chosen according to a power law distribution with exponent γ
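The two passes could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the slide does not say how the m nodes are picked, so the first pass here walks through a shuffled node order and the second pass samples nodes uniformly at random. The bounds `m_min` and `m_max` and the function names are hypothetical.

```python
import random


def sample_clique_size(gamma, m_min, m_max, rng):
    """Draw a clique size m with P(m) proportional to m**(-gamma)."""
    sizes = list(range(m_min, m_max + 1))
    weights = [size ** (-gamma) for size in sizes]
    return rng.choices(sizes, weights=weights, k=1)[0]


def generate_cliques(n, first_pass, gamma, m_min=2, m_max=20, seed=0):
    """Return (cliques, M), where M is the number of first-pass cliques."""
    rng = random.Random(seed)
    cliques = []
    # First pass: FIRST_PASS rounds, each covering all n nodes with cliques.
    for _round in range(first_pass):
        order = list(range(n))
        rng.shuffle(order)  # assumption: nodes are grouped in random order
        placed = 0
        while placed < n:
            m = min(sample_clique_size(gamma, m_min, m_max, rng), n - placed)
            cliques.append(frozenset(order[placed:placed + m]))
            placed += m
    m_first = len(cliques)
    # Second pass: M more cliques, each over m nodes sampled uniformly
    # (assumption: the slide only says "get m nodes").
    for _extra in range(m_first):
        m = min(sample_clique_size(gamma, m_min, m_max, rng), n)
        cliques.append(frozenset(rng.sample(range(n), m)))
    return cliques, m_first
```

With `first_pass=2`, every node ends up in exactly two first-pass cliques, and the second pass adds overlapping groups on top of that.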

Page 10: Analysis and Modeling of Social Networks

Our Algorithm, Graph Generation

Connection to groups
  At each time step t a node will enter the graph
  The node will try to connect to all nodes with id < t with probability:

  $$1 - \frac{|\mathit{MinCommonClique}|}{|\mathit{MaxCliqueInGraph}|}$$
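A possible reading of this step in Python, assuming the probability is 1 − |MinCommonClique| / |MaxCliqueInGraph|, where MinCommonClique is taken to be the smallest clique that contains both nodes and MaxCliqueInGraph the largest clique overall. Both interpretations are inferred from the names only, and treating pairs with no shared clique as probability 0 is a further assumption. Function names are hypothetical.

```python
import random


def connection_probability(cliques, u, v, max_clique_size):
    """1 - |smallest clique shared by u and v| / |largest clique|."""
    common_sizes = [len(c) for c in cliques if u in c and v in c]
    if not common_sizes:
        return 0.0  # assumption: no shared group means no group-based link
    return 1.0 - min(common_sizes) / max_clique_size


def connect_to_groups(n, cliques, seed=0):
    """Let nodes 0..n-1 enter one per time step and link to older nodes."""
    rng = random.Random(seed)
    max_clique_size = max(len(c) for c in cliques)
    edges = set()
    for t in range(n):          # node t enters at time step t
        for older in range(t):  # every node with id < t
            p = connection_probability(cliques, t, older, max_clique_size)
            if rng.random() < p:
                edges.add((older, t))
    return edges
```

Under this reading, two nodes that share a small group are very likely to connect, while nodes whose only common group is the largest one never connect through this step.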

Page 11: Analysis and Modeling of Social Networks

Our Algorithm, Graph Generation

Random walks
  All nodes with id ≤ t will try RW_TIMES times to start a random walk, each with probability 1/(t − id + 1)
  During the random walk node i will try to connect with node j with probability social_i · quality_j
  At each step the probability to stop will be (1 − 1/DEPTH)
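The random-walk step for a single node might look like the sketch below. It assumes that `social` and `quality` are per-node values in [0, 1], that the walk moves to a uniformly chosen neighbor, and that node i attempts a link to each node the walk visits; none of these details are spelled out on the slide. The stopping probability is used exactly as stated there, and `max_steps` is a hypothetical safety cap for the case DEPTH = 1.

```python
import random


def random_walks_for_node(adj, social, quality, i, t, rw_times, depth, rng,
                          max_steps=1000):
    """Run node i's random walks at time t; return the nodes it linked to."""
    start_prob = 1.0 / (t - i + 1)  # older nodes (small id) walk less often
    stop_prob = 1.0 - 1.0 / depth   # stopping probability as on the slide
    new_links = set()
    for _walk in range(rw_times):
        if rng.random() >= start_prob:
            continue  # this attempt does not start a walk
        current = i
        for _step in range(max_steps):  # hypothetical cap, not in the slides
            neighbors = sorted(adj[current])
            if not neighbors:
                break
            current = rng.choice(neighbors)  # assumption: uniform neighbor
            is_new = current != i and current not in adj[i]
            if is_new and rng.random() < social[i] * quality[current]:
                new_links.add(current)
            if rng.random() < stop_prob:
                break
    # Add the new undirected links once all walks have finished.
    for j in new_links:
        adj[i].add(j)
        adj[j].add(i)
    return new_links
```

Here `adj` maps each node to the set of its current neighbors and is updated in place.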

Page 12: Analysis and Modeling of Social Networks

Metrics 1/3

Degree distribution
  Description of the relative frequencies of nodes that have different degrees
Diameter and average path length
  Diameter is the largest distance between any pair of nodes in the network
  Distance is defined as the length of the shortest path between two nodes
  Average path length is the average over all the shortest paths
Betweenness Centrality
  Gives information on how important a node is in terms of connecting other nodes
  Computed as:

  $$Ce_i = \sum_{(k,j):\, k \neq j,\ i \notin \{k,j\}} \frac{P_i(k,j)/P(k,j)}{(N-1)(N-2)/2}$$

  Where P_i(k,j) denotes the number of shortest paths between k and j that i lies on, P(k,j) the total number of shortest paths between k and j, and N the number of nodes
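These metrics can be computed with breadth-first search alone. The sketch below assumes a connected, undirected, unweighted graph with at least three nodes, stored as a dict that maps each node to its set of neighbors. It reads the betweenness sum as running over unordered pairs, which matches the normalization (N − 1)(N − 2)/2. Function names are hypothetical, and the all-pairs approach is only practical for small graphs.

```python
from collections import Counter, deque
from itertools import combinations


def degree_distribution(adj):
    """Fraction of nodes having each degree."""
    counts = Counter(len(neigh) for neigh in adj.values())
    return {k: c / len(adj) for k, c in counts.items()}


def bfs_counts(adj, source):
    """Distances from source and the number of shortest paths to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def path_metrics(adj):
    """Return (diameter, average path length, betweenness per node)."""
    nodes = list(adj)
    info = {s: bfs_counts(adj, s) for s in nodes}
    dists = [info[k][0][j] for k, j in combinations(nodes, 2)]
    diameter = max(dists)
    average_path_length = sum(dists) / len(dists)
    norm = (len(nodes) - 1) * (len(nodes) - 2) / 2
    betweenness = {}
    for i in nodes:
        dist_i, sigma_i = info[i]
        total = 0.0
        for k, j in combinations(nodes, 2):
            if i in (k, j):
                continue
            dist_k, sigma_k = info[k]
            # i lies on a shortest k-j path iff the distances add up.
            if dist_k[i] + dist_i[j] == dist_k[j]:
                total += sigma_k[i] * sigma_i[j] / sigma_k[j]
        betweenness[i] = total / norm
    return diameter, average_path_length, betweenness
```

On a star graph, for example, the center lies on every shortest path between two leaves, so its betweenness is 1 and that of each leaf is 0.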

Page 13: Analysis and Modeling of Social Networks

Metrics 2/3

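Clustering
  Indicates whether two neighbors of the same node are also connected with each other
  Clustering coefficient for each node i is:

  $$Cl_i = \frac{\#\{(j,k) : j \in N_i,\ k \in N_i,\ (j,k) \in g\}}{d_i(d_i-1)/2}$$

  Where N_i is the set of neighbors of node i, d_i is its degree and g is the set of edges
Assortativity coefficient
  In real networks the degrees at the endpoints of any edge tend not to be independent
  This feature can be captured by computing the assortativity coefficient:

  $$r = \frac{\sum_{(i,j) \in g} (d_i - m)(d_j - m)}{\sum_{i \in N} (d_i - m)^2}$$

  Where m is the average degree of the graph, the numerator sums over all edges and the denominator over all nodes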
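Both quantities follow directly from an adjacency structure. The sketch below assumes an undirected graph without self-loops, stored as a dict of neighbor sets, and counts each edge once in the assortativity numerator. Returning 0 for nodes with fewer than two neighbors, and for graphs where all degrees are equal, is a convention chosen here rather than something the slides specify.

```python
from itertools import combinations


def clustering_coefficient(adj, i):
    """Fraction of pairs of i's neighbors that are themselves connected."""
    d = len(adj[i])
    if d < 2:
        return 0.0  # assumption: undefined case reported as 0
    linked = sum(1 for j, k in combinations(adj[i], 2) if k in adj[j])
    return linked / (d * (d - 1) / 2)


def assortativity(adj):
    """Sum over edges of (d_i - m)(d_j - m), divided by sum of (d_i - m)^2."""
    degree = {v: len(neigh) for v, neigh in adj.items()}
    m = sum(degree.values()) / len(degree)  # average degree
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    numerator = sum((degree[u] - m) * (degree[v] - m)
                    for u, v in map(tuple, edges))
    denominator = sum((d - m) ** 2 for d in degree.values())
    if denominator == 0:
        return 0.0  # assumption: all degrees equal, coefficient undefined
    return numerator / denominator
```

For a triangle with one extra node attached to a corner, the only edge contributing to the numerator joins the degree-3 node to the degree-1 node, which gives r = −0.5.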

Page 14: Analysis and Modeling of Social Networks

Metrics 3/3

Neighbor degree distribution
  Average degree of the nearest neighbors of a vertex with degree k:

  $$k_{nn}(k) = \sum_{k'=1}^{k_{max}} k' P(k'|k)$$

  Where P(k'|k) is the conditional probability that a node with degree k will be connected to a node with degree k'
  Positive assortativity is translated as an increasing knn(k) function
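Empirically, knn(k) can be estimated by averaging the neighbor degree over all edge endpoints that belong to nodes of degree k, which amounts to using the observed frequencies as P(k'|k). A minimal sketch, assuming an undirected graph stored as a dict of neighbor sets (the function name is hypothetical):

```python
from collections import defaultdict


def average_neighbor_degree(adj):
    """Map each degree k to the mean degree of neighbors of degree-k nodes."""
    degree = {v: len(neigh) for v, neigh in adj.items()}
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, neigh in adj.items():
        for u in neigh:
            totals[degree[v]] += degree[u]  # adds k' for one (k, k') link
            counts[degree[v]] += 1
    return {k: totals[k] / counts[k] for k in sorted(counts)}
```

Plotting the returned values against k shows whether the function is increasing, the signature of positive assortativity.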

Page 15: Analysis and Modeling of Social Networks

Data Description

Facebook data from 4 large U.S. universities
Number of nodes is small compared to the real Facebook graph
Nodes represent a closed society
  Much better way to analyze a social network
  Large sample presents disadvantages
    Difficult to analyze
    How good is the sampling?

Page 16: Analysis and Modeling of Social Networks

Results and Comparisons 1/5

Average degree does not depend on the size of network
All networks present positive assortativity
  High degree nodes tend to connect with other high degree nodes
High clustering coefficients
Small diameter and average path length

Page 20: Analysis and Modeling of Social Networks

Results and Comparisons 2/5

Increasing knn(k) functions

As expected due to positive assortativity

Nodes with high degree tend to be connected to each other

Page 21: Analysis and Modeling of Social Networks

Results and Comparisons 3/5

Small betweenness values

Almost independent of node degree

No central authorities

Information flows are distributed

Page 22: Analysis and Modeling of Social Networks

Results and Comparisons 4/5

No clear power law phenomena

On the log scale we see fat tails as expected

Page 23: Analysis and Modeling of Social Networks

Results and Comparisons 5/5

Overall clustering is a simple summary characteristic

Clear clustering pattern emerges

High degree nodes have small clustering coefficients

Neighbors of high degree nodes less likely to be connected to each other

Page 24: Analysis and Modeling of Social Networks

Current Work

Analysis of information networks
  Very large datasets from LiveJournal, YouTube, Flickr
As expected, different structure
  Clear power law distributions
Introduction of a new metric:
  How close is PageRank to in-degree?

Page 25: Analysis and Modeling of Social Networks

Future Work

Make our model mathematically tractable
Graph evolution over time
  Densification laws
  Shrinking diameters
Community detection and formation
New focus on coevolutionary models

Page 26: Analysis and Modeling of Social Networks

Thank you!

aiw.cs.aueb.gr/projects.html