A measure of betweenness centrality based on random walks Author: M. E. J. Newman Presented by: Amruta Hingane Department of Computer Science Kent State University

A measure of betweenness centrality based on random walks Author: M. E. J. Newman

Presented by:

Amruta Hingane

Department of Computer Science

Kent State University

Overview

• Introduction

• Centrality Measures

• Types Of Betweenness

• Random-walk Betweenness

– A current flow analogy

– Random walks

• Comparison Of Different Betweenness Measures

• Correlation With Other Measures

• Example Applications

Centrality Measures

Degree:

• Simplest centrality measure

• Number of edges incident on a vertex in a network

• The number of ties an actor has in social network parlance

• A measure in some sense of the popularity of an actor.

[Figure: example graph with vertices a, b, and c]

Degree of b = ?

Centrality Measures Continued…

Closeness:

• Centrality measure which is the mean geodesic (shortest-path) distance between a vertex and all other vertices reachable from it.

• Measure of how long it will take information to spread from a given vertex to others in the network

Betweenness:

• Measure of the extent to which a vertex lies on the paths between others
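The closeness definition above can be sketched with a breadth-first search. This is a minimal illustration, not code from the presentation: the function name `closeness` and the three-vertex path graph are assumptions chosen for the example, and the graph is assumed to be unweighted and undirected.

```python
from collections import deque


def closeness(adj, v):
    """Mean geodesic distance from v to every other vertex reachable from it."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:          # first visit gives the shortest distance
                dist[w] = dist[u] + 1
                queue.append(w)
    others = [d for u, d in dist.items() if u != v]
    return sum(others) / len(others) if others else 0.0


# Hypothetical example graph (not the one on the slide): the path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
closeness_b = closeness(path, "b")   # b is one step from both a and c
closeness_a = closeness(path, "a")   # a is one step from b, two from c
```

With this definition a smaller value means a more central vertex, so b is more central than a in the path graph.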

Betweenness

• The betweenness of a vertex i is defined to be the fraction of shortest paths between pairs of vertices in a network that pass through i.

$$b_i = \frac{\sum_{s<t} g_i^{(st)} / n_{st}}{\tfrac{1}{2}\, n(n-1)}$$

$n$ = total number of vertices in the network

$g_i^{(st)}$ = number of geodesic paths from vertex s to vertex t that pass through i

$n_{st}$ = total number of geodesic paths from s to t
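The formula for $b_i$ can be evaluated directly on a small graph. The sketch below is a brute-force illustration and not the fast O(mn) algorithm mentioned on the next slide. It assumes an unweighted, undirected graph given as an adjacency dictionary, and it counts the end points s and t as lying on their own paths (an assumption that matches the $I_s = I_t = 1$ convention used later for current flow). The function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, s):
    """Return (dist, sigma): distance from s and number of geodesics from s."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u is a predecessor of w on a geodesic
                sigma[w] += sigma[u]
    return dist, sigma


def shortest_path_betweenness(adj):
    nodes = list(adj)
    n = len(nodes)
    info = {v: bfs_counts(adj, v) for v in nodes}
    b = {v: 0.0 for v in nodes}
    for s, t in combinations(nodes, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:              # s and t are in different components
            continue
        dist_t, sigma_t = info[t]
        for i in nodes:
            if i not in dist_s or i not in dist_t:
                continue
            # i lies on a geodesic from s to t exactly when the distances add up
            if dist_s[i] + dist_t[i] == dist_s[t]:
                b[i] += sigma_s[i] * sigma_t[i] / sigma_s[t]
    norm = 0.5 * n * (n - 1)
    return {v: b[v] / norm for v in nodes}


# Hypothetical example: the path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
result = shortest_path_betweenness(path)
```

For the path graph, the middle vertex b lies on every geodesic and scores 1, while each end vertex appears only in the two pairs it belongs to and scores 2/3.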

Shortest-path Betweenness

• A measure of the extent to which an actor has control over information flowing between others.

• In a network in which flow is entirely or at least mostly along geodesic paths, the betweenness of a vertex measures how much flow will pass through that particular vertex.

• Betweenness can be calculated for all vertices in time O(mn), where m is the number of edges and n is the number of vertices

Example of Shortest-path Betweenness

Shortest path betweenness

Vertices A and B will have high (shortest-path) betweenness in this configuration, while vertex C will not

Drawbacks of Shortest-path Betweenness

• Does information flow only along geodesic paths?

• Does a news item, rumor, fad, or message know the ideal route?

• More likely, a message travelling from one place to another wanders around more or less at random, encountering whom it will.

• Certainly it is possible for information to flow between two individuals via a third mutual acquaintance, even when the two individuals in question are themselves well acquainted

• A realistic betweenness measure should include non-geodesic paths in addition to geodesic ones

Flow betweenness

• Flow betweenness of a vertex i is defined as the amount of flow through vertex i when the maximum flow is transmitted from s to t, averaged over all s and t.

• Flow betweenness can be thought of as measuring the betweenness of vertices in a network in which a maximal amount of information is continuously pumped between all sources and targets.

• Maximum flow from a given s to all reachable targets t can be calculated in worst-case time $O(m^2)$, and hence the flow betweenness for all vertices can be calculated in time $O(m^2 n)$

Example of Flow Betweenness

While calculating flow betweenness, vertices A and B will get high scores while vertex C will not

Drawbacks of Flow Betweenness

• Does information “know” the ideal route (or one of the ideal routes) from each source to each target, in order to realize the maximum flow?

• Although flow betweenness does take account of paths other than the shortest path, this still seems unrealistic for many practical situations.

• Flow betweenness suffers from some of the same drawbacks as shortest-path betweenness: in practice, flow often does not take any sort of ideal path from source to target, be it the shortest path, the maximum-flow path, or another kind of ideal path.

Random Walk Betweenness

• Random-walk betweenness of a vertex i is equal to the number of times that a random walk starting at s and ending at t passes through i along the way, averaged over all s and t

• Random-walk betweenness can be calculated for all vertices in a network in worst-case time $O((m + n)\,n^2)$ using matrix methods

• This measure is appropriate to a network in which information wanders about essentially at random until it finds its target

Basic Matrix Notations

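[Figure: two example graphs on vertices 1, 2, 3, 4]

First adjacency matrix (symmetric, so the graph is undirected):

$$A = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}$$

Second adjacency matrix (not symmetric, so the graph is directed):

$$A = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}$$

Diagonal degree matrix:

$$K = \begin{pmatrix} \sum_{j=1}^{n} a_{1,j} & 0 & \cdots & 0 \\ 0 & \sum_{j=1}^{n} a_{2,j} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sum_{j=1}^{n} a_{n,j} \end{pmatrix} = \begin{pmatrix} k_1 & 0 & \cdots & 0 \\ 0 & k_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & k_n \end{pmatrix}$$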

Adjacency Matrices
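As a small illustration of this notation, the sketch below builds the symmetric 4-vertex adjacency matrix from this slide (rows 0111, 1000, 1001, 1010) in NumPy and derives the degrees and the diagonal degree matrix from it. The variable names are assumptions made for the example.

```python
import numpy as np

# Undirected 4-vertex example: vertex 1 is joined to 2, 3 and 4; 3 is joined to 4.
A = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
])

k = A.sum(axis=1)        # degree of each vertex: row sums of A
K = np.diag(k)           # diagonal degree matrix
is_undirected = bool((A == A.T).all())
```

Here `k` holds one degree per vertex, and `is_undirected` simply checks that the matrix is symmetric.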

Basic Matrix Notations Continued…

• Rank of a matrix A = maximal number of linearly independent rows (equivalently, columns)

• A square n × n matrix A is invertible if and only if rank A = n

• Product of the eigenvalues of a matrix = determinant of the matrix

• Ax = λx, where A is the matrix, λ is an eigenvalue, and x is a corresponding non-zero eigenvector

• Singular matrix: the determinant is zero, so the matrix is non-invertible
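These facts matter later because the matrix D − A used for current flow is singular. The sketch below checks this numerically for a 4-vertex undirected example graph (an assumed example, with D the diagonal matrix of degrees). It relies only on standard NumPy routines.

```python
import numpy as np

A = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
], dtype=float)
D = np.diag(A.sum(axis=1))
L = D - A                               # every row of D - A sums to zero

rank = np.linalg.matrix_rank(L)         # one less than n for a connected graph
eigenvalues = np.linalg.eigvalsh(L)     # L is symmetric, so eigvalsh applies
det = np.linalg.det(L)                  # equals the product of the eigenvalues
```

Because each row of D − A sums to zero, the all-ones vector is an eigenvector with eigenvalue 0, so the determinant vanishes and the matrix cannot be inverted directly.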

Current Flow Analogy

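Current flow betweenness of a vertex i is the amount of current that flows through i, averaged over all source and target points

Kirchhoff's current law: the total current flowing into any vertex equals the total current flowing out of it, e.g. $i = i_1 + i_2$ when a current i splits into two branches

Ohm's law for the edge between x and y: $i_{xy} = [V(x) - V(y)] / R_{xy}$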

Injected current = 1 unit

Extracted current = 1 unit

Resistance = 1 unit

Current flow betweenness

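By Kirchhoff's law of current conservation, the voltages satisfy:

$$\sum_j A_{ij}\,(V_i - V_j) = \delta_{is} - \delta_{it}$$

$A_{ij}$ is an element of the adjacency matrix:

$$A_{ij} = \begin{cases} 1 & \text{if there is an edge between } i \text{ and } j \\ 0 & \text{otherwise} \end{cases}$$

$\delta_{ij}$ is the Kronecker delta:

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$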

Current flow betweenness Continued…

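Since $\sum_j A_{ij} = k_i$, the degree of vertex i, the equations can be written in matrix form as

$$(D - A) \cdot V = s$$

D = diagonal matrix with elements $D_{ii} = k_i$

s is the source vector with elements

$$s_i = \begin{cases} +1 & \text{for } i = s \\ -1 & \text{for } i = t \\ 0 & \text{otherwise} \end{cases}$$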
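The step from the per-vertex Kirchhoff equation to the matrix form can be written out explicitly. The lines below are a worked expansion using only the symbols already defined on these slides.

```latex
\begin{aligned}
\sum_j A_{ij}\,(V_i - V_j)
  &= V_i \sum_j A_{ij} - \sum_j A_{ij} V_j \\
  &= k_i V_i - \sum_j A_{ij} V_j \\
  &= \sum_j \left( D_{ij} - A_{ij} \right) V_j
   = \left[ (D - A) \cdot V \right]_i ,
\qquad
\delta_{is} - \delta_{it} = s_i .
\end{aligned}
```

Equating the two sides for every vertex i gives one row of the matrix equation each.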

Current flow betweenness Continued…

To obtain V we cannot simply invert the matrix $D - A$, because it is singular.

Voltage is only defined relative to a reference, so we measure it with respect to some vertex v and set $V_v = 0$. This lets us remove the v-th row and the v-th column of $D - A$.

The remaining matrix $D_v - A_v$ is a square $(n-1) \times (n-1)$ matrix, and

$$V = (D_v - A_v)^{-1} \cdot s$$

Adding back a row and a column for the removed vertex, with all entries equal to zero, gives the matrix T.

Voltage at i:

$$V_i^{(st)} = T_{is} - T_{it}$$
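The construction of T can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the graph is connected and undirected, the helper name `voltage_matrix` is hypothetical, and the 4-vertex matrix is an example. The last lines check that the resulting voltages satisfy the original equation (D − A) · V = s.

```python
import numpy as np


def voltage_matrix(A, v=-1):
    """Invert D - A with row and column v removed, then pad with zeros to get T."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    v = v % n
    L = np.diag(A.sum(axis=1)) - A
    keep = [i for i in range(n) if i != v]
    T = np.zeros((n, n))
    T[np.ix_(keep, keep)] = np.linalg.inv(L[np.ix_(keep, keep)])
    return T


A = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
])
T = voltage_matrix(A)                 # reference vertex: the last one
s, t = 1, 2                           # zero-based indices of source and target
V = T[:, s] - T[:, t]                 # V_i = T_is - T_it
residual = (np.diag(A.sum(axis=1)) - A) @ V   # should equal the source vector
```

The residual has +1 at the source, −1 at the target, and 0 elsewhere, and the reference vertex sits at zero voltage.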

Current flow betweenness Continued…

Current through i = half of the sum of the absolute values of the currents flowing along the edges incident on that vertex:

$$I_i^{(st)} = \tfrac{1}{2} \sum_j A_{ij}\, \bigl| V_i^{(st)} - V_j^{(st)} \bigr| = \tfrac{1}{2} \sum_j A_{ij}\, \bigl| T_{is} - T_{it} - T_{js} + T_{jt} \bigr| \quad \text{for } i \neq s, t$$

$$I_s^{(st)} = 1, \qquad I_t^{(st)} = 1$$

Betweenness (average of the current flow over all source-target pairs):

$$b_i = \frac{\sum_{s<t} I_i^{(st)}}{\tfrac{1}{2}\, n(n-1)}$$

* For a graph with more than one component, the betweenness is calculated separately for each component

Time of Calculation

Inversion of the matrix takes $O(n^3)$

The betweenness equation takes $O(mn)$ for each vertex, or $O(mn^2)$ for all of them

Total running time to calculate current flow betweenness for all vertices = $O((m+n)\,n^2)$

$O(n^3)$ for a sparse graph

Random Walks

[Figure: a message m wandering at random through the network from source s to target t]

s = source

t = target

m = message

Random-walk Betweenness

• Definition: For a given source s and target t, count the number of times a message passes through vertex i on its journey, averaged over a large number of trials of the random walk. This value, averaged over all possible source/target pairs s, t, is the random-walk betweenness of i.

• The betweenness of vertex i counts the net number of times a walk passes through i, so passages in opposite directions cancel.

Calculating Random-walk Betweenness

• Consider an absorbing random walk, a walk that starts at vertex s and makes random moves around the network until it finds itself at vertex t and then stops.

• If at some point in this walk we find ourselves at vertex j, then the probability that we will find ourselves at i on the next step is given by the matrix element $M_{ij} = A_{ij} / k_j$, for $j \neq t$.

[Figure: a walk from s to t stepping from vertex j to vertex i]

Calculating Random-walk Betweenness Continued…

$A_{ij}$ = element of the adjacency matrix

$k_j = \sum_i A_{ij}$ = degree of vertex j

In matrix notation,

$$M = A \cdot D^{-1}$$

D = diagonal matrix with elements $D_{ii} = k_i$

Because the walk is absorbed at t, $M_{it} = 0$ for all i. Removing row and column t gives

$$M_t = A_t \cdot D_t^{-1}$$
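The transition matrix can be built directly from the adjacency matrix. The sketch below uses an assumed 4-vertex undirected example and a hypothetical choice of target, t = 3 (zero-based). It shows that each column of M sums to 1, and that after removing row and column t some columns sum to less than 1, which reflects the probability of being absorbed at t.

```python
import numpy as np

A = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
], dtype=float)

k = A.sum(axis=0)            # k_j = sum_i A_ij, the degree of vertex j
M = A / k                    # broadcasting divides column j by k_j: M_ij = A_ij / k_j

t = 3                        # hypothetical target vertex (zero-based index)
keep = [i for i in range(A.shape[0]) if i != t]
M_t = M[np.ix_(keep, keep)]  # transition matrix with row and column t removed

column_sums = M.sum(axis=0)
reduced_column_sums = M_t.sum(axis=0)
```

Columns of `M_t` belonging to neighbours of t lose the probability mass that would have stepped onto t.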

Calculating Random-walk Betweenness Continued…

For a walk starting at s,

probability of being at vertex j after r steps: $[M_t^r]_{js}$

probability that the next step is then taken to an adjacent vertex i: $k_j^{-1}\,[M_t^r]_{js}$

Summing over all values of r from 0 to ∞ gives the total number of times we go from j to i, averaged over all possible walks:

$$k_j^{-1}\,\bigl[(I - M_t)^{-1}\bigr]_{js}$$

In matrix notation we can write this as an element of the vector

$$V = D_t^{-1} \cdot (I - M_t)^{-1} \cdot s = (D_t - A_t)^{-1} \cdot s$$
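Both steps of this derivation can be checked numerically: the sum of powers of $M_t$ converges to $(I - M_t)^{-1}$, and the random-walk vector equals the circuit expression $(D_t - A_t)^{-1} \cdot s$. The sketch below is an illustration on an assumed 4-vertex connected graph with hypothetical source and target indices. The series is truncated at 200 terms, which is ample for this small example.

```python
import numpy as np

A = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
], dtype=float)
s, t = 1, 3                                   # hypothetical source and target
keep = [i for i in range(A.shape[0]) if i != t]

A_t = A[np.ix_(keep, keep)]                   # adjacency with row/column t removed
D_t = np.diag(A.sum(axis=0)[keep])            # degrees still come from the full graph
M_t = A_t @ np.linalg.inv(D_t)
I = np.eye(len(keep))

# Truncated geometric series sum_r M_t^r versus the closed form (I - M_t)^{-1}.
partial = sum(np.linalg.matrix_power(M_t, r) for r in range(200))
closed = np.linalg.inv(I - M_t)

src = np.zeros(len(keep))
src[keep.index(s)] = 1.0                      # unit source at s
V_walk = np.linalg.inv(D_t) @ closed @ src    # D_t^{-1} (I - M_t)^{-1} s
V_circuit = np.linalg.inv(D_t - A_t) @ src    # (D_t - A_t)^{-1} s
```

The agreement of `V_walk` and `V_circuit` is the numerical counterpart of the identity $D_t^{-1} (I - A_t D_t^{-1})^{-1} = (D_t - A_t)^{-1}$.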

Calculating Random-walk Betweenness Continued..

$|V_i - V_j|$ = net flow of the random walk along the edge from j to i

Betweenness (final net flow of random walks through vertex i, averaged over all source-target pairs):

$$b_i = \frac{\sum_{s<t} I_i^{(st)}}{\tfrac{1}{2}\, n(n-1)}$$

Summary

• Construct the matrix D−A, where D is the diagonal matrix of vertex degrees and A is the adjacency matrix.

• Remove any single row, and the corresponding column. For example, one could remove the last row and column.

• Invert the resulting matrix and then add back in a new row and column consisting of all zeros in the position from which the row and column were previously removed (e.g., the last row and column). Call the resulting matrix T, with elements Tij .

• Calculate the betweenness $b_i$ from its defining equation, using the values of $I_i^{(st)}$ obtained from the elements of T.
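The steps above can be put together in a short NumPy sketch. It is an illustration under stated assumptions and not the author's implementation: the graph is assumed connected, undirected, and unweighted, the last row and column are the ones removed, and the function name `random_walk_betweenness` is chosen for this example.

```python
import numpy as np


def random_walk_betweenness(A):
    """Random-walk (current-flow) betweenness for a connected undirected graph."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]

    # Steps 1-3: build D - A, drop the last row and column, invert, pad with zeros.
    L = np.diag(A.sum(axis=1)) - A
    T = np.zeros((n, n))
    T[:-1, :-1] = np.linalg.inv(L[:-1, :-1])

    # Step 4: accumulate I_i^(st) over all pairs s < t.
    b = np.zeros(n)
    for s in range(n):
        for t in range(s + 1, n):
            V = T[:, s] - T[:, t]                       # V_i = T_is - T_it
            edge_flow = A * np.abs(V[:, None] - V[None, :])
            current = 0.5 * edge_flow.sum(axis=1)       # valid for i != s, t
            current[s] = 1.0                            # one unit enters at s
            current[t] = 1.0                            # one unit leaves at t
            b += current
    return b / (0.5 * n * (n - 1))


# Hypothetical check on the path graph 0 - 1 - 2.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
b_path = random_walk_betweenness(path)
```

On a tree such as this path there is only one route between any two vertices, so the result coincides with shortest-path betweenness when end points are counted: 2/3 for the two end vertices and 1 for the middle one. The double loop over pairs is written for clarity, not speed.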

Comparison

• In each network, we intuitively expect vertex C to have betweenness lower than that of vertices A and B, but higher than that of vertices X and Y.

Comparison Continued…

• Shortest path betweenness fails to give a higher score to vertex C in the first network than to any of the other vertices within the two communities, while flow betweenness has the same problem with vertex C in the second network.

• Random-walk measure orders the vertices correctly in each case.

Correlation With Other Measures

• Shortest-path betweenness is known to be strongly correlated with vertex degree in most networks

• If the two are strongly correlated, then why calculate betweenness, when degree is almost the same and much easier to calculate?

• There are usually a small number of vertices in a network for which betweenness and degree are very different, so betweenness is needed to identify these vertices.

Example on Correlation With Other Measures

• Vertices with higher degree or higher shortest-path betweenness tend also to have higher random-walk betweenness.

• Focusing on this correlation misses the real point of interest: a few vertices have random-walk betweenness values quite different from their scores on the other two measures.

Example continued…

[Figure: The largest component of a network of sexual contacts between high-risk actors in the city of Colorado Springs]

• The size of the vertices increases linearly with their random-walk betweenness.

• The highlighted vertices are those for which the random-walk betweenness is substantially greater than shortest-path betweenness (a factor of two or more).

Example Applications

Example 1

[Figure: The network of intermarriage relations between the 15th-century Florentine families]

Ranking on random-walk betweenness

• The Medici come out well ahead of the competition, and they easily best their arch-rivals, the Strozzi.

• It is suggested that it was in part the Medici’s skillful manipulation of this marriage network that led to their eventual dominance of the Florentine political landscape.

Example 2

• Vertex size increases with the score on the random-walk betweenness measure

• Vertices marked A: brokers who establish connections between different groups

• Vertices marked B: vertices lying on paths where there are two (or more) paths to an outlying group of vertices

[Figure: The largest component of the co-authorship network of scientists working on networks]

Thank You