Class 4: Random Graphs Network Science: Random Graphs 2012 Prof. Albert-László Barabási Dr. Baruch Barzel, Dr. Mauro Martino


Class 4: Random Graphs

Network Science: Random Graphs 2012

Prof. Albert-László Barabási, Dr. Baruch Barzel, Dr. Mauro Martino


See: https://en.wikipedia.org/wiki/Boeing_787_Dreamliner_battery_problems

BOEING BATTERY FAILURE


HUMAN DISEASE NETWORK


http://www.slate.com/id/2245232

FIGHTING TERRORISM AND MILITARY

Network Science: Introduction 2012


http://www.ns-cta.org/ns-cta-blog/

FIGHTING TERRORISM AND MILITARY


Started in 2009; includes four interconnected centers:
1. Social Cognitive Networks, anchored at RPI
2. Information Networks, anchored at UIUC
3. Communication Networks, anchored at Penn State
4. Integrative Research Center, anchored at BBN

Ten-year funding of over $120 million, awarded through a nation-wide competition.


[Figure panels: Real, Projected]

EPIDEMIC FORECAST Predicting the H1N1 pandemic


In September 2010 the National Institutes of Health awarded $40 million to researchers at Harvard, Washington University in St. Louis, the University of Minnesota and UCLA, to develop the technologies that could systematically map out brain circuits.

The Human Connectome Project (HCP) has the ambitious goal of constructing a map of the complete structural and functional neural connections in vivo, within and across individuals.

http://www.humanconnectomeproject.org/overview/

BRAIN RESEARCH


National Research Council:


GENERAL AUDIENCE


BOOKS

Handbook of Graphs and Networks: From the Genome to the Internet (Wiley-VCH, 2003).

S. N. Dorogovtsev and J. F. F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW (Oxford University Press, 2003).

S. Goldsmith, W. D. Eggers, Governing by Network: The New Shape of the Public Sector (Brookings Institution Press, 2004).

P. Csermely, Weak Links: The Universal Key to the Stability of Networks and Complex Systems (The Frontiers Collection) (Springer, 2006), first edn.

M. Newman, A.-L. Barabási, D. J. Watts, The Structure and Dynamics of Networks (Princeton Studies in Complexity) (Princeton University Press, 2006), first edn.

F. Chung, L. Lu, Complex Graphs and Networks (CBMS Regional Conference Series in Mathematics) (American Mathematical Society, 2006).


BOOKS

R. Pastor-Satorras, A. Vespignani, Evolution and Structure of the Internet: A Statistical Physics Approach (Cambridge University Press, 2007), first edn.

F. Képès, Biological Networks (Complex Systems and Interdisciplinary Science) (World Scientific Publishing Company, 2007), first edn.

B. H. Junker, F. Schreiber, Analysis of Biological Networks (Wiley Series in Bioinformatics) (Wiley-Interscience, 2008).

T. G. Lewis, Network Science: Theory and Applications (Wiley, 2009).

E. Ben-Naim, H. Frauenfelder, Z. Toroczkai, Complex Networks (Lecture Notes in Physics) (Springer, 2010), first edn.

M. O. Jackson, Social and Economic Networks (Princeton University Press, 2010).


• 1998: The Watts-Strogatz paper is the most cited Nature publication from 1998; highlighted by ISI as one of the ten most cited papers in physics in the decade after its publication.

• 1999: The Barabási-Albert paper is the most cited Science paper from 1999; highlighted by ISI as one of the ten most cited papers in physics in the decade after its publication.

• 2001: The Pastor-Satorras and Vespignani paper is one of the two most cited papers among those published in 2001 by Physical Review Letters.

• 2002: Girvan-Newman is the most cited paper in 2002 Proceedings of the National Academy of Sciences.

 

Original papers:


If you want to understand the spread of diseases, can you do it without networks?

If you want to understand the structure and searchability of the WWW, it is hopeless without invoking the Web’s topology.

If you want to understand human diseases, it is hopeless without considering the wiring diagram of the cell.

MOST IMPORTANT Networks Really Matter


NGRAMS Networks Awareness


Degree distribution pk

THREE CENTRAL QUANTITIES IN NETWORK SCIENCE

Average path length <d>

Clustering coefficient C

Network Science: Graph Theory 2012


Degree distribution P(k): probability that a randomly chosen vertex has degree k

Nk = # nodes with degree k

P(k) = Nk / N

[Plot: P(k) versus k, for k = 1 to 4]

DEGREE DISTRIBUTION
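As a minimal sketch of the definition P(k) = Nk / N, the snippet below counts degrees in a small undirected graph. The function name `degree_distribution` and the four-node example graph are illustrative assumptions, not part of the slides.

```python
from collections import Counter

def degree_distribution(edges, nodes):
    """Return {k: P(k)} with P(k) = N_k / N for an undirected simple graph."""
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())  # N_k: number of nodes with degree k
    n = len(nodes)
    return {k: counts[k] / n for k in sorted(counts)}

# Hypothetical example: a triangle 0-1-2 plus a pendant node 3 attached to node 2.
example_nodes = [0, 1, 2, 3]
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
pk = degree_distribution(example_edges, example_nodes)
print(pk)
```

The returned probabilities sum to 1, which is the discrete normalization condition.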


Discrete representation: $p_k$ is the probability that a node has degree $k$.

Continuum description: $p(k)$ is the pdf of the degrees, where

$\int_{k_1}^{k_2} p(k)\,dk$

represents the probability that a node’s degree is between $k_1$ and $k_2$.

Normalization condition:

$\sum_{k=0}^{\infty} p_k = 1$ or $\int_{K_{\min}}^{\infty} p(k)\,dk = 1$,

where $K_{\min}$ is the minimal degree in the network.

 

DEGREE DISTRIBUTION


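Clustering coefficient:

What fraction of your neighbors are connected to each other?

Node i has degree ki, so the maximum number of edges among its neighbors is ki(ki - 1)/2. With ei the number of edges that actually exist among those neighbors:

$C_i = \frac{2 e_i}{k_i (k_i - 1)}$

Ci in [0,1]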

CLUSTERING COEFFICIENT
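The local clustering coefficient can be sketched as the number of links among a node's neighbors divided by the maximum ki(ki - 1)/2. The adjacency-dict representation, the function name `local_clustering`, and the convention of returning 0 for nodes with fewer than two neighbors are assumptions made for this illustration.

```python
from itertools import combinations

def local_clustering(adj, i):
    """C_i = 2 e_i / (k_i (k_i - 1)); taken as 0 here when k_i < 2."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # e_i: number of neighbor pairs that are themselves linked
    e = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * e / (k * (k - 1))

# Hypothetical example: a triangle 0-1-2 plus a pendant node 3 attached to node 2.
adj_example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
clustering_example = {i: local_clustering(adj_example, i) for i in adj_example}
print(clustering_example)
```

Node 2 has three neighbors with only one link among them, so its coefficient is 1/3, while nodes 0 and 1 sit in a closed triangle and get 1.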


Degree distribution: P(k)

Path length: <d>

Clustering coefficient:

THREE CENTRAL QUANTITIES IN NETWORK SCIENCE


A. Degree distribution: pk

B. Path length: <d>

C. Clustering coefficient:

THREE CENTRAL QUANTITIES IN NETWORK SCIENCE


ANOTHER EXAMPLE - THREE CENTRAL QUANTITIES IN NETWORK SCIENCE

Both networks have two parts, each a mirror image of the other, so we consider just the nodes of the left square in each network, identified by position as UL, LL, UR, LR.

Degrees: blue UL, LL, UR have 2 and LR has 3, so <k> = 2.25; red UL, LL, UR have 3 and LR has 4, so <k> = 3.25.

Distributions are shown to the right of the graphs.

Blue paths (number of nodes at distance 1, 2, 3, ...): UL = (2,1,1,2,1), LL and UR = (2,2,2,1), LR = (3,3,1); dmax = 5, <d> = 2.28.

Red paths: UL, LL, UR = (3,1,3), LR = (4,3); dmax = 3, <d> ≈ 1.86.

Clustering coefficients: blue all 0; red UL, LL, UR = 1 and LR = 1/2.
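The figures quoted for this example can be checked numerically. The sketch below assumes, from the degrees and path counts listed, that the blue network is two 4-node squares joined by one link between their LR nodes, and that the red network is the same with both diagonals added to each square. The labels `A`/`B` for the two halves and all function names are illustrative.

```python
from collections import deque
from itertools import combinations

SQUARE = [("UL", "UR"), ("UR", "LR"), ("LR", "LL"), ("LL", "UL")]
DIAGONALS = [("UL", "LR"), ("LL", "UR")]

def two_part_network(part_edges):
    """Two mirror-image copies (A and B) of one part, bridged at their LR nodes."""
    edges = [((side, a), (side, b)) for side in "AB" for a, b in part_edges]
    edges.append((("A", "LR"), ("B", "LR")))  # assumed position of the bridge
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def distances_from(adj, source):
    """Breadth-first search: hop distance from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_degree(adj):
    return sum(len(nb) for nb in adj.values()) / len(adj)

def path_stats(adj):
    """Return (<d>, dmax) over all ordered pairs of distinct nodes."""
    n = len(adj)
    total, d_max = 0, 0
    for s in adj:
        dist = distances_from(adj, s)
        total += sum(dist.values())
        d_max = max(d_max, max(dist.values()))
    return total / (n * (n - 1)), d_max

def clustering(adj, i):
    k = len(adj[i])
    if k < 2:
        return 0.0
    e = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return 2 * e / (k * (k - 1))

blue = two_part_network(SQUARE)
red = two_part_network(SQUARE + DIAGONALS)
for name, net in (("blue", blue), ("red", red)):
    mean_d, d_max = path_stats(net)
    print(name, mean_degree(net), round(mean_d, 3), d_max,
          clustering(net, ("A", "UL")), clustering(net, ("A", "LR")))
```

Under these assumptions the average path lengths come out as 16/7 (about 2.29) for blue and 13/7 (about 1.86) for red.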

[Figure: blue network and red network, each with its degree distribution P(k) versus k]


k = 4 for red and black nodes, 3 for green nodes and 2 for yellow nodes.

By connecting the nodes at the end, we get rid of the "nodes at the ends".

C = 1/2 for each node if N > 6; see the black node, whose (red) neighbors are all connected.

The above assumes the end nodes are connected, so the largest distance is N/2 nodes.

The average path length varies as <d> ~ N.

Constant degree, constant clustering coefficient.

ONE DIMENSIONAL LATTICE: nodes on a line
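A short sketch of the closed one-dimensional lattice, assuming each node links to its nearest and next-nearest neighbors on both sides (degree 4). It checks that the clustering coefficient is 1/2 for every node and that the average path length grows roughly in proportion to N. Function names are illustrative.

```python
from collections import deque
from itertools import combinations

def ring_lattice(n):
    """Each node linked to its nearest and next-nearest neighbors on a ring (k = 4)."""
    return {i: {(i + s) % n for s in (-2, -1, 1, 2)} for i in range(n)}

def lattice_clustering(adj, i):
    k = len(adj[i])
    e = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return 2 * e / (k * (k - 1))

def average_path_length(adj):
    """Mean hop distance over all ordered pairs, via breadth-first search."""
    n = len(adj)
    total = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))

small, large = ring_lattice(20), ring_lattice(40)
print(lattice_clustering(small, 0), lattice_clustering(large, 0))
print(average_path_length(small), average_path_length(large))
```

Doubling N from 20 to 40 nearly doubles the average path length, while degree and clustering stay fixed.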


RANDOM NETWORK MODEL


Erdős-Rényi model (1960)

Connect with probability p

p=1/6 N=10

<k> ~ 1.5

Pál Erdős (1913-1996)

Alfréd Rényi (1921-1970)

RANDOM NETWORK MODEL


Here <k> = (6+2*2+4)/10 = 1.4


RANDOM NETWORK MODEL

Definition:

A random graph is a graph of N labeled nodes where each pair of nodes is connected with a preset probability p.

We will call it G(N, p).
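One realization of G(N, p) can be sketched by visiting every pair of nodes once and linking it with probability p. The function name `random_graph` and the `seed` parameter are assumptions added for reproducibility; they are not part of the slides.

```python
import random
from itertools import combinations

def random_graph(n, p, seed=None):
    """Return the edge list of one G(N, p) realization on nodes 0..n-1."""
    rng = random.Random(seed)
    # Each of the n(n-1)/2 node pairs is linked independently with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = random_graph(10, 1 / 6, seed=1)
mean_degree_sample = 2 * len(edges) / 10
print(len(edges), mean_degree_sample)
```

Different seeds give different realizations with the same N and p, so the average degree of any single realization only fluctuates around p(N - 1).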


RANDOM NETWORK MODEL

p=1/6 N=12


RANDOM NETWORK MODEL

p=0.03 N=100


RANDOM NETWORK MODEL

N and p do not uniquely define the network: we can have many different realizations of it. How many?

N=10 p=1/6

The probability to form a particular graph G(N,p) with L links is

$P(G(N,p)) = p^{L} (1-p)^{\frac{N(N-1)}{2} - L}$

That is, each graph G(N,p) appears with probability P(G(N,p)).


RANDOM NETWORK MODEL

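P(L): the probability to have exactly L links in a network of N nodes and probability p:

$P(L) = \binom{\frac{N(N-1)}{2}}{L}\, p^{L} (1-p)^{\frac{N(N-1)}{2} - L}$

Here $\frac{N(N-1)}{2}$ is the maximum number of links in a network of N nodes, and the binomial coefficient is the number of different ways we can choose L links among all potential links.

Binomial distribution...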
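To make the binomial form concrete, the sketch below evaluates P(L) = C(M, L) p^L (1-p)^(M-L) with M = N(N-1)/2 potential links, then recovers the mean and variance of L by direct summation. The function name `link_count_pmf` is illustrative, and N = 10, p = 1/6 are the values used on the earlier slides.

```python
from math import comb

def link_count_pmf(n, p):
    """P(L) for L = 0..M, where M = n(n-1)/2 is the maximum number of links."""
    m = n * (n - 1) // 2
    return [comb(m, L) * p**L * (1 - p) ** (m - L) for L in range(m + 1)]

pmf = link_count_pmf(10, 1 / 6)
mean_L = sum(L * q for L, q in enumerate(pmf))
var_L = sum((L - mean_L) ** 2 * q for L, q in enumerate(pmf))
print(mean_L, var_L)
```

The sums agree with the closed forms for a binomial distribution: mean pM = 7.5 and variance p(1-p)M = 6.25 for these parameters.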


MATH TUTORIAL the mean of a binomial distribution

There is a faster way using generating functions, see: http://planetmath.org/encyclopedia/BernoulliDistribution2.html


MATH TUTORIAL the variance of a binomial distribution

http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html


MATH TUTORIAL Binomial Distribution: The bottom line

http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html


RANDOM NETWORK MODEL

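P(L): the probability to have a network of exactly L links

$P(L) = \binom{\frac{N(N-1)}{2}}{L}\, p^{L} (1-p)^{\frac{N(N-1)}{2} - L}$

• The average number of links <L> in a random graph: $\langle L \rangle = p\,\frac{N(N-1)}{2}$

• The standard deviation: $\sigma_L = \sqrt{p(1-p)\,\frac{N(N-1)}{2}}$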


DEGREE DISTRIBUTION OF A RANDOM GRAPH

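As the network size increases, the distribution becomes increasingly narrow: we are increasingly confident that the degree of a node is in the vicinity of <k>.

$P(k) = \binom{N-1}{k}\, p^{k} (1-p)^{N-1-k}$

The three factors are: the number of ways to select k nodes from N-1, the probability of having k edges, and the probability of missing the remaining N-1-k edges.

The mean is $\langle k \rangle = p(N-1)$ and the relative width is $\sigma_k / \langle k \rangle = \sqrt{\frac{1-p}{p}\,\frac{1}{N-1}}$, which shrinks as N grows.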
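The narrowing of the degree distribution with network size can be illustrated by evaluating the binomial P(k) = C(N-1, k) p^k (1-p)^(N-1-k) for two sizes at fixed p and comparing the relative width, standard deviation over mean. The function names and the choice p = 0.1 with N = 100 and N = 400 are assumptions for this sketch; note that at fixed p the mean degree p(N-1) also grows with N.

```python
from math import comb, sqrt

def degree_pmf(n, p):
    """Binomial degree distribution of G(N, p): P(k) for k = 0..n-1."""
    return [comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k) for k in range(n)]

def mean_and_relative_width(n, p):
    pmf = degree_pmf(n, p)
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, sqrt(var) / mean

mean_100, width_100 = mean_and_relative_width(100, 0.1)
mean_400, width_400 = mean_and_relative_width(400, 0.1)
print(mean_100, width_100)
print(mean_400, width_400)
```

The relative width drops from about 0.30 to about 0.15 when N goes from 100 to 400, matching sqrt((1-p) / (p(N-1))).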


DEGREE DISTRIBUTION OF A RANDOM GRAPH

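For large N and small k, we can use the following approximations:

$\binom{N-1}{k} \approx \frac{(N-1)^{k}}{k!}$ and $(1-p)^{N-1-k} \approx e^{-\langle k \rangle}$ for $k \ll N$,

which, with $p = \langle k \rangle / (N-1)$, give the Poisson form $P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^{k}}{k!}$.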


DEGREE DISTRIBUTION OF A RANDOM GRAPH

[Plot: P(k) versus k]


DEGREE DISTRIBUTION OF A RANDOM NETWORK

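Exact result (binomial distribution): $P(k) = \binom{N-1}{k}\, p^{k} (1-p)^{N-1-k}$

Large N limit (Poisson distribution): $P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^{k}}{k!}$

[Plot: Probability Distribution Function (PDF)]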
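How quickly the exact binomial result approaches the Poisson limit can be checked numerically. The sketch below holds the average degree fixed at 10 (an illustrative choice), sets p = <k>/(N-1), and reports the largest pointwise gap between the two distributions for k up to 30. All function names are assumptions.

```python
from math import comb, exp, factorial

def binomial_pk(k, n, p):
    """Exact degree distribution of G(N, p)."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)

def poisson_pk(k, mean_k):
    """Large-N limit with the same average degree."""
    return exp(-mean_k) * mean_k**k / factorial(k)

def max_gap(n, mean_k, k_max=30):
    p = mean_k / (n - 1)
    return max(abs(binomial_pk(k, n, p) - poisson_pk(k, mean_k))
               for k in range(k_max + 1))

gaps = {n: max_gap(n, 10) for n in (100, 1000, 10000)}
print(gaps)
```

The gap shrinks as N grows at fixed average degree, which is why the Poisson form is a convenient stand-in for large sparse networks.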


What does it mean? Continuum formalism:

If we consider a network with average degree <k> then the probability to have a node whose degree exceeds a degree k0 is:

For example, with <k> = 10:
• the probability to find a node whose degree is more than twice the average degree is 0.00158826;
• the probability to find a node whose degree is at least ten times the average degree is 1.79967152 × 10^-13;
• the probability to find a node whose degree is at most a tenth of the average degree is 0.00049.

See http://www.stud.feec.vutbr.cz/~xvapen02/vypocty/po.php

• The probability of seeing a node with very high or very low degree is exponentially small.
• Most nodes have comparable degrees.
• The larger the size of a random network, the more similar are the node degrees.

What does it mean? Discrete formalism:

NODES HAVE COMPARABLE DEGREES IN RANDOM NETWORKS
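Two of the probabilities quoted for <k> = 10 can be reproduced by summing Poisson terms, assuming the degree distribution is Poisson with mean 10. In this sketch the first quoted value corresponds to P(k > 20) and the last to P(k ≤ 1); the ten-times figure is not recomputed here. The helper name `poisson_terms` and the cutoff of 200 terms are illustrative.

```python
from math import exp

def poisson_terms(mean_k, k_max):
    """p_0..p_kmax via the recurrence p_k = p_(k-1) * <k> / k."""
    probs = [exp(-mean_k)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * mean_k / k)
    return probs

mean_k = 10
probs = poisson_terms(mean_k, 200)
p_more_than_twice = sum(probs[21:])  # P(k > 20)
p_at_most_tenth = sum(probs[:2])     # P(k <= 1)
print(p_more_than_twice, p_at_most_tenth)
```

Summing the tail terms directly, rather than computing one minus the cumulative sum, avoids losing precision when the tail is very small.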


NO OUTLIERS IN A RANDOM SOCIETY

According to sociological research, for a typical individual k ~ 1,000.

The probability to find an individual with degree k > 2,000 is 10^-27.

Given that N ~ 10^9, the chance of finding an individual with 2,000 acquaintances is so tiny that such nodes are virtually nonexistent in a random society.

A random society would consist mainly of average individuals, with everyone having roughly the same number of friends.

It would lack outliers, individuals that are either highly popular or reclusive.
