

© 2011 Columbia University

E6885 Network Science Lecture 2:

Network Representations and Characteristics

E 6885 Topics in Signal Processing -- Network Science

Ching-Yung Lin, Dept. of Electrical Engineering, Columbia University

September 19th, 2011


Course Structure

Class   Date       Topics Covered
1       09/12/11   Overview – Social, Information, and Cognitive Network Analysis
2       09/19/11   Network Representations and Characteristics
3       09/26/11   Network Partitioning and Clustering
4       10/03/11   Network Visualization
5       10/10/11   Network Sampling and Estimation
6       10/17/11   Network Models
7       10/24/11   Network Topology Inference
8       10/31/11   Dynamic Networks
9       11/14/11   Social Influence and Info Diffusion in Networks
10      11/21/11   Information and Knowledge Networks
11      11/28/11   Privacy, Security, and Economy Issues in Networks
12      12/05/11   Behavior Understanding and Cognitive Networks
13      12/12/11   Large-Scale Network Processing System
14      12/19/11   Final Project Presentation


Grader / TA info

• Grader / TA: Jian Wang

• Contact: [email protected]

• Office Hour: Wednesday 1pm – 3pm (Mudd 1311 – mini conf. room in the EE office)


Graphs and Matrix Algebra

• The fundamental connectivity of a graph G may be captured in an $N_v \times N_v$ binary symmetric matrix A with entries:

$$A_{ij} = \begin{cases} 1, & \text{if } \{i,j\} \in E \\ 0, & \text{otherwise} \end{cases}$$

A is called the Adjacency Matrix of G.


Some properties of adjacency matrix

• The row sum is equal to the degree of the vertex:

$$d_i = A_{i+} = \sum_j A_{ij}$$

• Symmetry:

$$A_{i+} = A_{+i}$$

• The number of walks of length r from i to j is given by the (i,j) entry of the r-th power of A:

$$(A^r)_{ij}$$
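The properties above can be checked on a small graph. The following is a minimal sketch in plain Python; the four-vertex example graph and the helper name `matmul` are assumptions made for illustration, not part of the lecture.

```python
# Hypothetical undirected graph: triangle 0-1-2 plus a pendant edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1  # undirected graph, so A is symmetric


def matmul(X, Y):
    """Plain triple-loop product of two square matrices (lists of lists)."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


degrees = [sum(row) for row in A]  # row sums A_{i+}
A2 = matmul(A, A)                  # (A^2)_{ij}: number of walks of length 2
A3 = matmul(A2, A)                 # (A^3)_{ij}: number of walks of length 3

print(degrees)   # degree of each vertex
print(A3[0][0])  # closed walks of length 3 that start and end at vertex 0
```

Note that the diagonal of `A2` reproduces the degrees, since every walk of length 2 from a vertex back to itself goes out along one edge and returns along the same edge.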


For directed graphs

• {i,j} represents a directed edge from i to j.

• In and out degrees:

$$A_{i+} = d_i^{out}, \qquad A_{+j} = d_j^{in}$$


Incidence Matrix

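• Incidence Matrix: an $N_v \times N_e$ binary matrix B with entries

$$B_{ij} = \begin{cases} 1, & \text{if vertex } i \text{ is incident to edge } j \\ 0, & \text{otherwise} \end{cases}$$

• Then, for the signed version $\tilde{B}$ of B, in which one of the two nonzero entries in each column is (arbitrarily) set to $-1$,

$$\tilde{B}\tilde{B}^T = D - A$$

(for the unsigned binary B itself, $BB^T = D + A$).

• Where

$$D = \mathrm{diag}\left[(d_i)_{i \in V}\right]$$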
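As a sanity check of the relation between the incidence matrix, D and A, here is a minimal sketch in plain Python. The example graph, the arbitrary edge orientation, and the helper name `gram` are assumptions for illustration only.

```python
# Hypothetical undirected graph: triangle 0-1-2 plus a pendant edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n_v, n_e = 4, len(edges)

A = [[0] * n_v for _ in range(n_v)]
B = [[0] * n_e for _ in range(n_v)]         # unsigned (binary) incidence matrix
B_signed = [[0] * n_e for _ in range(n_v)]  # signed incidence matrix
for e, (i, j) in enumerate(edges):
    A[i][j] = A[j][i] = 1
    B[i][e] = B[j][e] = 1
    B_signed[i][e], B_signed[j][e] = 1, -1  # orientation chosen arbitrarily


def gram(M):
    """Return M M^T for a matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in M] for r1 in M]


deg = [sum(row) for row in A]
D_minus_A = [[(deg[i] if i == j else 0) - A[i][j] for j in range(n_v)]
             for i in range(n_v)]
D_plus_A = [[(deg[i] if i == j else 0) + A[i][j] for j in range(n_v)]
            for i in range(n_v)]

print(gram(B_signed) == D_minus_A)  # signed incidence matrix gives D - A
print(gram(B) == D_plus_A)          # unsigned incidence matrix gives D + A
```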


Laplacian of Graph

• Laplacian of Graph:

$$L = D - A$$

• For a real-valued vector $x \in \mathbb{R}^{N_v}$:

$$x^T L x = \sum_{\{i,j\} \in E} (x_i - x_j)^2$$

The closer this value is to zero, the more similar are the elements of x at adjacent vertices.

• A measure of ‘smoothness’
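The smoothness interpretation can be illustrated numerically. This is a minimal sketch in plain Python; the example graph and the two test vectors `smooth` and `rough` are assumptions chosen for illustration.

```python
# Hypothetical undirected graph: triangle 0-1-2 plus a pendant edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
deg = [sum(row) for row in A]
L = [[(deg[i] if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]


def quadratic_form(L, x):
    """Compute x^T L x directly from the matrix entries."""
    size = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(size) for j in range(size))


def edge_sum(edges, x):
    """Compute the sum over edges {i,j} of (x_i - x_j)^2."""
    return sum((x[i] - x[j]) ** 2 for i, j in edges)


smooth = [1.0, 1.0, 1.0, 1.0]   # identical values at all vertices
rough = [1.0, -1.0, 1.0, -1.0]  # values that differ across most edges

print(quadratic_form(L, smooth), edge_sum(edges, smooth))
print(quadratic_form(L, rough), edge_sum(edges, rough))
```

Both ways of computing the value agree, and the constant vector gives zero because no edge sees any difference between its endpoints.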


Laplacian of Graph

• L is a positive semi-definite matrix, with all non-negative eigenvalues.

• The smallest eigenvalue: $\lambda_1 = 0$

• The second smallest eigenvalue: $\lambda_2$

The larger $\lambda_2$, the more connected G is, and the more difficult it is to separate G into disconnected subgraphs.
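To see how the second smallest eigenvalue tracks connectivity, the sketch below compares four small graphs on four vertices. It assumes NumPy is available; the function names `laplacian` and `lambda2` and the example graphs are hypothetical choices for illustration.

```python
import numpy as np


def laplacian(n, edges):
    """Build L = D - A for an undirected graph on vertices 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A


def lambda2(n, edges):
    """Second smallest Laplacian eigenvalue (eigvalsh returns ascending order)."""
    return np.linalg.eigvalsh(laplacian(n, edges))[1]


split = [(0, 1), (2, 3)]                                       # two components
path = [(0, 1), (1, 2), (2, 3)]                                # a chain
cycle = path + [(3, 0)]                                        # a ring
complete = [(i, j) for i in range(4) for j in range(i + 1, 4)]  # K4

for name, edges in [("split", split), ("path", path),
                    ("cycle", cycle), ("complete", complete)]:
    print(name, round(float(lambda2(4, edges)), 4))
```

The disconnected graph has a second eigenvalue of zero, and the value grows as edges are added and the graph becomes harder to cut apart.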


Data Structure

• Adjacency Matrix: $O(N_v^2)$ storage

• Adjacency List: $O(N_v + N_e)$ storage
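The two data structures can be contrasted with a short sketch. The example graph and the helper names `has_edge_matrix`, `has_edge_list` and `degree` are assumptions for illustration.

```python
# Hypothetical undirected graph: triangle 0-1-2 plus a pendant edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency matrix: n * n cells regardless of how many edges exist.
matrix = [[0] * n for _ in range(n)]
for i, j in edges:
    matrix[i][j] = matrix[j][i] = 1

# Adjacency list: one list per vertex, one entry per edge endpoint.
adj = {v: [] for v in range(n)}
for i, j in edges:
    adj[i].append(j)
    adj[j].append(i)


def has_edge_matrix(i, j):
    """Constant-time edge lookup."""
    return matrix[i][j] == 1


def has_edge_list(i, j):
    """Edge lookup that scans the neighbor list of i (time grows with its degree)."""
    return j in adj[i]


def degree(i):
    """Degree read directly from the length of the neighbor list."""
    return len(adj[i])


matrix_cells = n * n
list_entries = n + sum(len(nbrs) for nbrs in adj.values())
print(matrix_cells, list_entries)
```

The matrix favors fast edge queries, while the list favors compact storage and fast neighbor enumeration on sparse graphs.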


Algorithms

• Some questions:

– Are vertices i and j linked by an edge?

– What is the degree of vertex i?

– What is the shortest path(s) between vertices i and j?

– How many connected components does the graph have?

– (For a directed graph) does it have cycles or is it acyclic?

– What is the maximal clique in a graph?


Algorithmic Complexity

• ‘Tractable’: polynomial time, $O(n^p)$ with $p > 0$

• ‘Intractable’: super-polynomial time, e.g., $O(a^n)$ with $a > 1$

The design of efficient algorithms is usually nontrivial.

Usually improved by indexing, storage, removing redundant computations, etc.


Measurement of Complexity (I)

• The basic goal of data mining is prediction.

• Complexity can be defined as the amount of information required for optimal prediction (Grassberger, Int. J. of Theoretical Physics, 1986).

• f is any predictor that translates the past of the time sequence $x^-$ (or, on other occasions, the training set) into an effective state, $s = f(x^-)$, and then makes its prediction on the basis of s.

$$C = \min_{f \in M} H[f(X^-)]$$

where $H[\cdot]$ is the entropy and M is the set of maximally predictive models.


Measurement of Complexity (II)

• Grassberger-Crutchfield-Young Statistical Complexity (J. of Statistical Physics, 2001)

• An effective procedure for finding the minimal maximally predictive model and its states.

• Definition: causal states of a process:

– Two histories $x_1^-$ and $x_2^-$ are equivalent if $\Pr(X^+ \mid x_1^-) = \Pr(X^+ \mid x_2^-)$.

– Write the set of all histories equivalent to $x^-$ as $[x^-]$.

– A function $\varepsilon$ maps each history into its equivalence class: $\varepsilon(x^-) = [x^-]$.

– Crutchfield and Young proposed to forget the particular history and retain only its equivalence class. They call the equivalence classes the causal states of a process. These are the optimal states: $\Pr[X^+ \mid \varepsilon(x^-)] = \Pr[X^+ \mid x^-]$.

– The statistical complexity of a process is thus the information content of its causal states.

– It is equal to the length of the shortest description of the past which is relevant to the actual dynamics of the system, e.g., IID sequences: 0; periodic sequences with period p: log p.


Example – Finding all vertices that are reachable from a vertex

• Breadth-First Search (BFS)

• Depth-First Search (DFS)
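Both searches can be sketched in a few lines over an adjacency list. The example graph and the function names `bfs_reachable` and `dfs_reachable` are assumptions for illustration; the DFS here is the iterative, stack-based variant.

```python
from collections import deque


def bfs_reachable(adj, start):
    """Return vertices reachable from start, in breadth-first visiting order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for w in adj.get(u, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order


def dfs_reachable(adj, start):
    """Return vertices reachable from start, in depth-first visiting order."""
    seen = set()
    stack = [start]
    order = []
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        order.append(u)
        # Push neighbors in reverse so the first listed neighbor is explored first.
        for w in reversed(adj.get(u, [])):
            if w not in seen:
                stack.append(w)
    return order


# Hypothetical graph with two components: {0,1,2,3,4} and {5,6}.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3], 5: [6], 6: [5]}
print(bfs_reachable(adj, 0))
print(dfs_reachable(adj, 0))
```

Both traversals find the same set of reachable vertices and only differ in visiting order; each runs in time proportional to the number of vertices plus edges reached.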


Characteristics and Structure Properties of Network

• Some questions to ask:

– Triplets of vertices (triads) in social dynamics

– Paths and flows in a graph

– Importance of an individual element => how ‘central’ the corresponding vertex is in the network

– Finding communities

• Characteristics of individual vertices

• Characteristics of network cohesion


Centrality

• “There is certainly no unanimity on exactly what centrality is or its conceptual foundations, and there is little agreement on the procedure of its measurement.” – Freeman 1979.

• Degree (centrality)

• Closeness (centrality)

• Betweenness (centrality)

• Eigenvector (centrality)


Degree Distribution Example: Power-Law Network

• A.-L. Barabási and E. Bonabeau, “Scale-Free Networks”, Scientific American 288: p. 50-59, 2003.

Power law with exponential cutoff (Newman, Strogatz and Watts, 2001):

$$p_k = C \cdot k^{-\tau} e^{-k/\kappa}$$

Poisson form:

$$p_k = \frac{m^k}{k!} \cdot e^{-m}$$


Another example of complex network: Small-World Network

• Six Degrees of Separation:

– By adding long-range links, a regular graph can be transformed into a small-world network, in which the average number of degrees of separation between two nodes becomes small.

From Watts and Strogatz, 1998:

C: clustering coefficient, L: path length,

(C(0), L(0)): (C, L) as in a regular graph;

(C(p), L(p)): (C, L) in a small-world graph with randomness p.


Indication of ‘Small’

• A graph is called ‘small’ when the average distance between distinct vertices is ‘small’:

$$\bar{l} = \frac{1}{N_v (N_v + 1)/2} \sum_{u \neq v \in V} dist(u, v)$$

For instance, a protein interaction network would be considered to have the small-world property, as there is an average distance of 3.68 among the 5,128 vertices in its giant component.


Some examples of Degree Distribution

• (a) scientist collaboration: biologists (circles), physicists (squares); (b) collaboration of movie actors; (d) network of directors of Fortune 1000 companies


Degree Distribution

Kolaczyk, “Statistical Analysis of Network Data: Methods and Models”, Springer 2009.



Degree Distribution



Degree Correlations




Conceptual Descriptions of Three Centrality Measurements



Closeness

• Closeness: a vertex is ‘close’ to the other vertices

$$c_{Cl}(v) = \frac{1}{\sum_{u \in V} dist(v, u)}$$

where dist(v,u) is the geodesic distance between vertices v and u.
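The closeness formula can be evaluated with one breadth-first search per vertex. This is a minimal sketch that assumes an unweighted, connected graph; the star-shaped example and the function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic (hop-count) distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj, v):
    """1 / (sum of distances from v); assumes the graph is connected."""
    total = sum(bfs_distances(adj, v).values())
    return 1.0 / total


# Hypothetical star graph: center 0 connected to leaves 1, 2 and 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(closeness(star, 0))  # the center is one step from every leaf
print(closeness(star, 1))  # a leaf is two steps from the other leaves
```

On a disconnected graph some distances are undefined, so this sketch would need a convention for unreachable vertices before it could be applied.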


Betweenness

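• Betweenness measures are aimed at summarizing the extent to which a vertex is located ‘between’ other pairs of vertices.

• Freeman’s definition:

$$c_B(v) = \sum_{s \neq t \neq v \in V} \frac{\sigma(s, t \mid v)}{\sigma(s, t)}$$

where $\sigma(s, t \mid v)$ is the number of shortest paths between s and t that pass through v, and $\sigma(s, t)$ is the total number of shortest paths between s and t.

• Calculation of all betweenness centralities requires

– calculating the lengths of shortest paths among all pairs of vertices

– computing the summation in the above definition for each vertex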
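The two calculation steps can be sketched directly. The code below is a straightforward, unoptimized illustration for unweighted, undirected graphs that counts each unordered pair {s, t} once; the function names and example graphs are assumptions, and faster algorithms exist.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adj, source):
    """BFS returning distances and numbers of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]  # every shortest path to u extends to w
    return dist, sigma


def betweenness(adj):
    """Sum over unordered pairs {s,t} of sigma(s,t|v) / sigma(s,t)."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are not connected
        for v in adj:
            if v == s or v == t:
                continue
            dist_v, sigma_v = info[v]
            # v lies on a shortest s-t path iff the distances add up exactly.
            if v in dist_s and t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score


# Hypothetical examples: a star and a 4-cycle.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(betweenness(star))
print(betweenness(cycle))
```

In the star every pair of leaves is joined only through the center, while in the 4-cycle each opposite pair has two shortest paths, so each intermediate vertex receives half a unit.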


Eigenvector Centrality

• Tries to capture the ‘status’, ‘prestige’, or ‘rank’ of a vertex.

• The more central the neighbors of a vertex are, the more central the vertex itself is.

$$c_{Ei}(v) = \alpha \sum_{\{u,v\} \in E} c_{Ei}(u)$$

The vector $\mathbf{c}_{Ei} = (c_{Ei}(1), \ldots, c_{Ei}(N_v))^T$ is the solution of the eigenvalue problem:

$$A \cdot \mathbf{c}_{Ei} = \alpha^{-1} \mathbf{c}_{Ei}$$
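One common way to approximate the dominant eigenvector is power iteration. The sketch below is an assumption-laden illustration, not the lecture's method: it iterates with A + I rather than A, which keeps the same eigenvectors but avoids oscillation on bipartite graphs, and it assumes a connected undirected graph. The function name and example are hypothetical.

```python
def eigenvector_centrality(A, iterations=200):
    """Approximate the dominant eigenvector of A by power iteration on A + I."""
    n = len(A)
    c = [1.0] * n
    for _ in range(iterations):
        # Multiply by (A + I): the c[i] term is the identity part.
        new = [c[i] + sum(A[i][j] * c[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in new) ** 0.5
        c = [x / norm for x in new]  # rescale to unit Euclidean length
    return c


# Hypothetical star graph: center 0 connected to leaves 1, 2 and 3.
A = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]

c = eigenvector_centrality(A)
# Estimate alpha^{-1} (the eigenvalue) from the first coordinate of A c.
eigenvalue = sum(A[0][j] * c[j] for j in range(4)) / c[0]
print([round(x, 4) for x in c], round(eigenvalue, 4))
```

The center of the star obtains the largest score because all of its neighbors feed into it, while each leaf only inherits centrality from the center.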


PageRank Algorithm (Simplified)


PageRank Steps

• Example: simplified initial state:

R(A) = R(B) = R(C) = R(D) = 0.25

• Iterative procedure:

R(A) = R(B) / 2 + R(C) / 1 + R(D) / 3

(Figure: link graph over the four pages A, B, C, D.)

$$R(u) = d \sum_{v \in B_u} \frac{R(v)}{N_v} + e$$

where

– $F_u$: the set of pages u points to

– $B_u$: the set of pages that point to u

– $N_u = |F_u|$: the number of links from u

– d: normalization / damping factor

– $e = \frac{1 - d}{N}$, with N the total number of pages

In general, d = 0.85.
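The iterative procedure can be sketched as follows. The link structure below is an assumption: the out-links of B, C and D are chosen to match the simplified update R(A) = R(B)/2 + R(C)/1 + R(D)/3, and the link from A to D is added hypothetically so that every page has at least one out-link. The function name `pagerank` is also illustrative.

```python
def pagerank(out_links, d=0.85, iterations=100):
    """Iterate R(u) = d * sum_{v in B_u} R(v) / N_v + (1 - d) / N."""
    pages = sorted(out_links)
    n = len(pages)
    e = (1.0 - d) / n
    # B_u: the pages that link to u.
    back_links = {u: [v for v in pages if u in out_links[v]] for u in pages}
    rank = {u: 1.0 / n for u in pages}  # uniform initial state
    for _ in range(iterations):
        rank = {u: d * sum(rank[v] / len(out_links[v]) for v in back_links[u]) + e
                for u in pages}
    return rank


# Hypothetical link graph; the A -> D link is an assumption (see above).
out_links = {"A": ["D"], "B": ["A", "C"], "C": ["A"], "D": ["A", "B", "C"]}

first_step = pagerank(out_links, d=1.0, iterations=1)  # simplified update, no damping
converged = pagerank(out_links)                        # damped, d = 0.85
print(round(first_step["A"], 4))
print({page: round(value, 4) for page, value in converged.items()})
```

With d = 1 and a single iteration, the value for A reproduces the simplified update from the uniform initial state. Pages without out-links would need separate handling, which this sketch does not attempt.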


Solution of PageRank

• The PageRank values are the entries of the dominant eigenvector of the modified adjacency matrix:

$$\mathbf{R} = \begin{bmatrix} R(p_1) \\ R(p_2) \\ \vdots \\ R(p_N) \end{bmatrix}$$

where R is the solution of the equation

$$\mathbf{R} = \begin{bmatrix} (1-d)/N \\ (1-d)/N \\ \vdots \\ (1-d)/N \end{bmatrix} + d \begin{bmatrix} l(p_1,p_1) & l(p_1,p_2) & \cdots & l(p_1,p_N) \\ l(p_2,p_1) & \ddots & & \vdots \\ \vdots & & l(p_i,p_j) & \\ l(p_N,p_1) & \cdots & & l(p_N,p_N) \end{bmatrix} \mathbf{R}$$

where the adjacency function $l(p_i, p_j)$ is 0 if page $p_j$ does not link to $p_i$, and is normalized such that for each j,

$$\sum_{i=1}^{N} l(p_i, p_j) = 1$$


Network Cohesion

• Questions to answer:

– Do friends of a given actor in a social network (SN) tend to be friends of one another?

– What collections of proteins in a cell appear to work closely together?

– Does the structure of the pages in the WWW tend to separate with respect to distinct types of content?

– What portion of a measured Internet topology would seem to constitute the backbone?

• Definitions differ in

– Scale

– Local to global

– Explicit (e.g., cliques) vs. implicit (e.g., clusters)


Local Density

• A coherent subset of nodes should be locally dense.

• Cliques:

(Figure: 3-cliques)

A sufficient condition for a clique of size n to exist in G is:

$$N_e > \frac{N_v^2}{2} \cdot \frac{n - 2}{n - 1}$$


Computational Complexity of Cliques

• Detecting all triangles (i.e., all cliques of size n=3) in G can be done exhaustively in $O(N_v^3)$.

• Brandes and Erlebach, 2005: the time can be improved to $O(N_v^{2.376})$ in general, and $O(N_e^{1.41})$ in sparse graphs, through fast algorithms for multiplication of square matrices.
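The link between triangles and matrix multiplication can be illustrated with a small sketch: the trace of A^3 counts each triangle six times (three starting vertices, two directions). The code below uses a plain cubic-time product, so it only shows the identity and does not achieve the faster bounds quoted above; the example graph and function names are assumptions.

```python
from itertools import combinations


def triangles_exhaustive(A):
    """List all vertex triples that are mutually adjacent, by brute force."""
    n = len(A)
    return [(a, b, c) for a, b, c in combinations(range(n), 3)
            if A[a][b] and A[b][c] and A[a][c]]


def triangle_count_matrix(A):
    """Count triangles as trace(A^3) / 6 using a plain matrix product."""
    n = len(A)
    A2 = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    trace_A3 = sum(A2[i][j] * A[j][i] for i in range(n) for j in range(n))
    return trace_A3 // 6


# Hypothetical graph with two triangles sharing the edge 1-2.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
n = 4
A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

print(triangles_exhaustive(A))
print(triangle_count_matrix(A))
```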


Weakened Versions of Cliques -- Plexes

• A subgraph H consisting of m vertices is called an n-plex, for m > n, if no vertex has degree less than m – n.

• Equivalently, no vertex is missing more than n – 1 of its possible m – 1 edges.

(Figure: two examples, each labeled 1-plex)


Another Weakened Version of Cliques -- Cores

• A k-core of a graph G is a subgraph H in which all vertices have degree at least k.

(Figure: 3-core)

• Batagelj et al., 1999: a maximal k-core subgraph may be computed in as little as $O(N_v + N_e)$ time.

The algorithm computes the shell index of every vertex in the graph.

Shell index of v = the largest value, say c, such that v belongs to the c-core of G but not its (c+1)-core.

For a given vertex, those neighbors with lesser degree lead to a decrease in the potential shell index of that vertex.
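Shell indices can be computed by repeatedly peeling off a vertex of minimum remaining degree. The sketch below is a simple illustration of that idea and is not the linear-time algorithm cited above, since it searches for the minimum at every step; the function name and example graph are assumptions.

```python
def shell_indices(adj):
    """Shell index of each vertex by repeatedly removing a minimum-degree vertex."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # linear scan; simple but not O(Nv + Ne)
        k = max(k, degree[v])               # the core level never decreases
        shell[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1              # removing v lowers its neighbors' degrees
    return shell


# Hypothetical graph: triangle 0-1-2 with a tail 2-3-4.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
print(shell_indices(adj))
```

The tail vertices fall out of the 2-core once the pendant vertex 4 is removed, which is the effect described above: low-degree neighbors pull down the potential shell index of a vertex.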


Density measurement

• The density of a subgraph $H = (V_H, E_H)$ is:

$$den(H) = \frac{|E_H|}{|V_H| (|V_H| - 1)/2}$$

Range of density:

$$0 \le den(H) \le 1$$

and

$$\bar{d}(H) = (|V_H| - 1) \, den(H)$$

where $\bar{d}(H)$ is the average degree of H.


Use of the density measure

• Density of a graph: let H = G.

• ‘Clustering’ of edges local to v: let $H = H_v$, which is the set of neighbors of a vertex v and the edges between them.

• Clustering coefficient of a graph: the average of $den(H_v)$ over all vertices.


An insight of clustering coefficient

• A triangle is a complete subgraph of order three.

• A connected triple is a subgraph of three vertices connected by two edges (regardless of how the other two nodes connect).

• The local clustering coefficient can be expressed as:

$$den(H_v) = cl(v) = \frac{\tau_\Delta(v)}{\tau_3(v)}$$

where $\tau_\Delta(v)$ is the number of triangles that include v, and $\tau_3(v)$ is the number of connected triples for which the 2 edges are both incident to v.

• The clustering coefficient of G is then:

$$cl(G) = \frac{1}{|V'|} \sum_{v \in V'} cl(v)$$

where $V' \subseteq V$ is the set of vertices v with $d_v \ge 2$.
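The local and graph-level clustering coefficients can be computed directly from neighbor sets. This is a minimal sketch; the example graph and function names are assumptions, and vertices with degree below 2 are skipped, matching the restriction to V'.

```python
from itertools import combinations


def local_clustering(adj, v):
    """cl(v) = triangles through v / connected triples centered at v."""
    nbrs = adj[v]
    triples = len(nbrs) * (len(nbrs) - 1) // 2
    if triples == 0:
        return None  # degree < 2: v is not in V'
    triangles = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return triangles / triples


def clustering_coefficient(adj):
    """Average of cl(v) over the vertices with degree at least 2."""
    values = [c for c in (local_clustering(adj, v) for v in adj) if c is not None]
    return sum(values) / len(values)


# Hypothetical graph: triangle 0-1-2 plus a pendant edge 2-3 (neighbor sets).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(local_clustering(adj, 0))
print(local_clustering(adj, 2))
print(clustering_coefficient(adj))
```

Vertex 2 has three connected triples but only one of them closes into a triangle, so its local coefficient is one third, which pulls the graph average below 1.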


An example



Transitivity of a graph

• A variation of the clustering coefficient that takes a weighted average:

$$cl_T(G) = \frac{\sum_{v \in V'} \tau_3(v) \, cl(v)}{\sum_{v \in V'} \tau_3(v)} = \frac{3 \tau_\Delta(G)}{\tau_3(G)}$$

where

$$\tau_\Delta(G) = \frac{1}{3} \sum_{v \in V} \tau_\Delta(v)$$

is the number of triangles in the graph, and

$$\tau_3(G) = \sum_{v \in V} \tau_3(v)$$

is the number of connected triples.

• Transitivity: the friend of your friend is also a friend of yours.

Clustering coefficients have become a standard quantity for network structure analysis, but it is important to report which clustering coefficient is used.
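Because the two coefficients weight vertices differently, they can disagree on the same graph. The sketch below computes the transitivity as three times the number of triangles over the number of connected triples; the function name and example graph are assumptions, and the graph is assumed to contain at least one connected triple.

```python
from itertools import combinations


def transitivity(adj):
    """cl_T(G) = 3 * (number of triangles) / (number of connected triples)."""
    # Connected triples centered at v: pairs of neighbors of v.
    triples = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())
    # Summing triangles through each vertex counts every triangle three times.
    triangle_corners = sum(1 for v in adj
                           for a, b in combinations(adj[v], 2) if b in adj[a])
    return triangle_corners / triples


# Hypothetical graph: triangle 0-1-2 plus a pendant edge 2-3 (neighbor sets).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(transitivity(adj))
```

For this graph there is one triangle and five connected triples, giving 3/5, whereas the unweighted average of the local coefficients over vertices of degree at least 2 is 7/9. This is why it matters which coefficient is reported.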


Example: Brain of an epilepsy patient


Network Representation of Cortical-level Coupling


Visual Summaries of Degree and Closeness Centrality

(Figure panels: preictal, ictal, diff)


Visual Summaries of Betweenness Centrality and Clustering Coefficient

(Figure panels: preictal, ictal, diff)


Connectivity of Graph

• A measure related to the flow of information in the graph

• Connected: every vertex is reachable from every other.

• A connected component of a graph is a maximally connected subgraph.

• A graph usually has one connected component dominating the others in magnitude: the giant component.


Vertex / Edge Connectivity

• If an arbitrary subset of k vertices or edges is removed from a graph, is the remaining subgraph connected?

• A graph G is called k-vertex-connected if (1) $N_v > k$, and (2) the removal of any subset of vertices X in V of cardinality |X| smaller than k leaves a subgraph G – X that is connected.

The vertex connectivity of G is the largest integer k such that G is k-vertex-connected.

• A similar measure is defined for edge connectivity.


Vertex / Edge Cut

• If the removal of a particular set of vertices in G disconnects the graph, that set is called a vertex cut.

• For a given pair of vertices (u,v), a u-v-cut is a partition of V into two disjoint non-empty subsets, S and S’, where u is in S and v is in S’.

Minimum u-v-cut: a u-v-cut for which the sum of the weights on edges connecting vertices in S to vertices in S’ is a minimum.


Minimum cut and flow

• Finding a minimum u-v-cut is equivalent to maximizing a measure of flow on the edges of a derived directed graph.

• Ford and Fulkerson, 1962: Max-Flow Min-Cut theorem.


Example: AIDS blog network



Bowtie structure of a directed network graph


Classify the nodes



Graph Partitioning

• Many uses of graph partitioning:

– E.g., community structure in social networks

• A cohesive subset of vertices generally is taken to refer to a subset of vertices that

– (1) are well connected among themselves, and

– (2) are relatively well separated from the remaining vertices

• Graph partitioning algorithms typically seek a partition of the vertex set of a graph in such a manner that the sets $E(C_k, C_{k'})$ of edges connecting vertices in $C_k$ to vertices in $C_{k'}$ are relatively small in size compared to the sets $E(C_k) = E(C_k, C_k)$ of edges connecting vertices within $C_k$.


Example: Karate Club Network



Hierarchical Clustering

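• Agglomerative

• Divisive

• In agglomerative algorithms, given two sets of vertices C1 and C2, two standard approaches to comparing this pair of sets are single linkage, which uses the closest pair of vertices (one from each set), and complete linkage, which uses the farthest pair. In terms of a similarity, these are the maximum and the minimum over all pairs, respectively; for a dissimilarity such as $x_{ij}$ below, they are the minimum and the maximum.

$$x_{ij} = \frac{|N_{v_i} \, \Delta \, N_{v_j}|}{d_{(N_v)} + d_{(N_v - 1)}}$$

The “normalized” number of neighbors of $v_i$ and $v_j$ that are not shared. Here $N_v$ in the numerator denotes the set of neighbors of a vertex v, $\Delta$ is the symmetric difference, and $d_{(N_v)}$ and $d_{(N_v - 1)}$ are the largest and second-largest degrees in the graph.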
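The measure above can be computed from neighbor sets with a set symmetric difference. The sketch below rests on an interpretation that should be treated as an assumption: the normalizer is taken to be the sum of the two largest degrees in the graph. The function name and example graph are hypothetical.

```python
def neighborhood_dissimilarity(adj, i, j):
    """Size of the symmetric difference of the neighbor sets of i and j,
    divided by the sum of the two largest degrees (assumed normalizer)."""
    degrees = sorted(len(nbrs) for nbrs in adj.values())
    normalizer = degrees[-1] + degrees[-2]
    not_shared = adj[i] ^ adj[j]  # neighbors of exactly one of the two vertices
    return len(not_shared) / normalizer


# Hypothetical graph: triangle 0-1-2 plus a pendant edge 2-3 (neighbor sets).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(neighborhood_dissimilarity(adj, 0, 3))
print(neighborhood_dissimilarity(adj, 0, 1))
print(neighborhood_dissimilarity(adj, 2, 3))
```

Small values indicate vertices with nearly identical neighborhoods, so an agglomerative procedure using this quantity would merge low-valued pairs first.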


Hierarchical Clustering Example



Questions?
