Biological networks and statistical physics
Said Business School, University of Oxford, UK
Diego Garlaschelli
Dipartimento di Fisica, Università di Siena, ITALY
BioPhys09, Arcidosso, ITALY
Biological networks: from cells to ecosystems
Metabolic networks
Vertices = cellular substrates (products or educts)
Links = biochemical reactions (enzyme-mediated)
[Figure: part of E. coli's metabolic network; reaction schematic with labels educt, enzyme, complex, product]
Protein-protein interaction networks
Vertices = proteins
Links = interactions within the cell
Neural networks
Vertices = neurons
Links = synapses
← single neuron
↑web of synaptic connections
Vascular networks
Vertices = tissues
Links = blood vessels
Ecological networks (food webs)
Vertices = coexisting species
Links = predator-prey interactions
Real networks versus regular graphs
Two problems:
1) characterization of network structure (and complexity)
2) network modelling
Protein-protein interaction network
(Saccharomyces cerevisiae)
Regular graphs
Average vertex-vertex distance: $D = \langle d_{ij} \rangle$, where $d_{ij}$ = minimum distance between $i$ and $j$
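Graph Theory
“Graph” ≡ G(V,E)
V: N vertices
E: L links
Adjacency matrix $a_{ij}$:
Directed graph: $a_{ij} = 1$ if there is a link from $i$ to $j$ ($i \to j$), $a_{ij} = 0$ otherwise
Undirected graph: $a_{ij} = a_{ji} = 1$ if $i$ and $j$ are linked, $a_{ij} = a_{ji} = 0$ otherwise
An undirected link between $i$ and $j$ corresponds to a pair of directed links $i \to j$ and $j \to i$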
Clustering coefficient: $C = \langle c_i \rangle$, where
$c_i = \dfrac{\text{pairs of vertices connected to } i \text{ and to each other}}{\text{pairs of vertices connected to } i}$
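Degree (number of links) of vertex i
Directed graph: $k_i^{in} = \sum_{j=1}^{N} a_{ji}$, $k_i^{out} = \sum_{j=1}^{N} a_{ij}$, $k_i = k_i^{in} + k_i^{out}$
Statistical distributions: $P(k^{in})$, $P(k^{out})$, $P(k)$
Undirected graph: $k_i = \sum_{j=1}^{N} a_{ij}$, $\langle k \rangle = 2L/N$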
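The quantities defined so far (degree, clustering coefficient, average distance) can be computed directly from the adjacency matrix. The following is a minimal sketch, not part of the original slides; the function names and the convention $c_i = 0$ for vertices with fewer than two neighbours are assumptions made for illustration.

```python
from collections import deque

def degrees(a):
    """Return (k_in, k_out), where a[i][j] = 1 means a link from i to j."""
    n = len(a)
    k_out = [sum(a[i][j] for j in range(n)) for i in range(n)]
    k_in = [sum(a[j][i] for j in range(n)) for i in range(n)]
    return k_in, k_out

def clustering(a):
    """Average clustering coefficient C = <c_i> of an undirected graph."""
    n = len(a)
    total = 0.0
    for i in range(n):
        nb = [j for j in range(n) if a[i][j]]
        pairs = len(nb) * (len(nb) - 1) // 2
        if pairs == 0:
            continue  # assumed convention: c_i = 0 when k_i < 2
        closed = sum(a[u][v] for x, u in enumerate(nb) for v in nb[x + 1:])
        total += closed / pairs
    return total / n

def mean_distance(a):
    """Average shortest-path length D over connected ordered pairs (BFS)."""
    n = len(a)
    total, count = 0, 0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if a[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count if count else float("inf")

# Undirected example: triangle 0-1-2 plus a pendant vertex 3 attached to 2
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
k_in, k_out = degrees(A)
print(k_out, clustering(A), mean_distance(A))
```

For a symmetric matrix the in- and out-degrees coincide, and their average equals 2L/N as stated above.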
Short mean distance D:
“it’s a small world, after all!”
Efficient information transport
(and fast disease spreading too!)
Small-world character of (most) real networks:
Large clustering coefficient C:
“my friends are friends of each other”
High robustness
under vertex removal
Degree distribution in (most) real networks:
Power-law distribution
$P(k) \propto k^{-\gamma}$, $2 < \gamma < 3$
No characteristic scale (scale-free)!
(a) Archaeoglobus fulgidus (archea);
(b) E. coli (bacterium);
(c) Caenorhabditis elegans (eukaryote);
(d) 43 different organisms together.
Finite-scale versus scale-free networks
Finite-scale networks: P(k) decays exponentially
No vertex has a degree much larger than the average value
Scale-free networks: P(k) decays as a power law
Few vertices have a degree much larger than the average value
(few highly connected vertices, many poorly connected vertices)
[Figure: a finite-scale (random) network and a scale-free network; in both cases N=130 and L=215 (same average degree). Highlighted: the 5 vertices with largest degree (red); the vertices connected to the red ones (random 27%, scale-free 60%); other vertices]
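RANDOM GRAPH model (Erdős, Rényi 1959)
● Start with a set of N isolated vertices;
● For each pair of vertices draw a link with uniform probability p.
[Figure: realizations for p=0, p=0.1, p=0.5, p=1]
Number of links: $L = pN(N-1)/2$
Average degree: $\langle k \rangle = 2L/N = p(N-1) \approx pN$
Degree distribution (Poisson): $P(k) = e^{-pN} (pN)^k / k!$
Clustering coefficient: $C = p \approx \langle k \rangle / N$
Average vertex-vertex distance: $\langle k \rangle^D \approx N \Rightarrow D \approx \log N / \log \langle k \rangle$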
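The two construction steps of the random graph model translate almost literally into code. This is a minimal sketch (function names, seed and parameter values are illustrative assumptions) that also compares the measured average degree with the expected value p(N-1).

```python
import random

def random_graph(n, p, rng):
    """Link each of the n(n-1)/2 vertex pairs independently with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def mean_degree(n, edges):
    """Average degree <k> = 2L/N."""
    return 2 * len(edges) / n

rng = random.Random(0)   # fixed seed, chosen only for reproducibility
n, p = 400, 0.05
edges = random_graph(n, p, rng)
k_avg = mean_degree(n, edges)
print(k_avg, p * (n - 1))  # measured versus expected average degree
```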
Connected components in random graphs
The interesting feature of the random graph model is the presence of a critical probability $p_c$ marking the appearance of a giant cluster:
when $p < p_c$ the network is made of many small clusters and the cluster size distribution P(s) decays exponentially;
when $p > p_c$ there are few very small clusters and one giant one;
at $p = p_c$ the cluster size distribution has a power-law form: $P(s) \propto s^{-\alpha}$.
Percolation threshold: $p_c \approx 1/N$
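The appearance of the giant cluster around $p_c \approx 1/N$ can be observed numerically. The sketch below is an illustration under assumed parameter values (N = 500, p = 0.3/N and p = 3/N); it uses a union-find structure to track connected components while links are drawn.

```python
import random

def largest_component_fraction(n, p, rng):
    """Fraction of vertices in the largest cluster of a random graph G(n, p)."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                parent[find(i)] = find(j)  # merge the two clusters

    sizes = {}
    for v in range(n):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

rng = random.Random(1)
n = 500
below = largest_component_fraction(n, 0.3 / n, rng)  # p < p_c
above = largest_component_fraction(n, 3.0 / n, rng)  # p > p_c
print(below, above)
```

Below the threshold the largest cluster contains only a small fraction of the vertices, while above it a single cluster spans most of the network.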
SMALL-WORLD model (Watts, Strogatz Nature 1998)
● Start with a regular d-dimensional lattice, connected up to q nearest neighbours;
● With probability p, an end of each link is rewired to a new randomly chosen vertex.
p = 0: regular; 0 < p < 1: small-world; p = 1: random
[Figure: degree distribution P(k) versus k]
[Figure: average distance D(p)/D(0) and clustering coefficient C(p)/C(0) versus p, with the small-world regime highlighted]
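The rewiring procedure of the small-world model can be sketched as follows for the one-dimensional case (a ring). The choice d = 1, the exclusion of self-loops and duplicate links, and the function names are assumptions made for this illustration.

```python
import random

def small_world(n, q, p, rng):
    """Ring of n vertices, each linked to its q nearest neighbours (q even);
    the far end of each lattice link is then rewired with probability p."""
    nbrs = [set() for _ in range(n)]
    for i in range(n):
        for s in range(1, q // 2 + 1):
            j = (i + s) % n
            nbrs[i].add(j)
            nbrs[j].add(i)
    for i in range(n):
        for s in range(1, q // 2 + 1):
            j = (i + s) % n
            if j in nbrs[i] and rng.random() < p:
                # Assumption: no self-loops and no duplicate links
                candidates = [v for v in range(n) if v != i and v not in nbrs[i]]
                if candidates:
                    new = rng.choice(candidates)
                    nbrs[i].discard(j)
                    nbrs[j].discard(i)
                    nbrs[i].add(new)
                    nbrs[new].add(i)
    return nbrs

def clustering(nbrs):
    """Average clustering coefficient, with c_i = 0 when k_i < 2."""
    total = 0.0
    for nb in nbrs:
        nb = sorted(nb)
        pairs = len(nb) * (len(nb) - 1) // 2
        if pairs:
            closed = sum(1 for x, u in enumerate(nb) for v in nb[x + 1:] if v in nbrs[u])
            total += closed / pairs
    return total / len(nbrs)

lattice = small_world(200, 4, 0.0, random.Random(0))   # p = 0: regular
rewired = small_world(200, 4, 1.0, random.Random(0))   # p = 1: random
print(clustering(lattice), clustering(rewired))
```

Rewiring conserves the number of links, so the average degree stays equal to q for every p, while the clustering coefficient drops as p grows.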
SCALE-FREE model (Barabási, Albert Science 1999)
● Start with $m_0$ vertices and no link;
● at each timestep add a new vertex with m links, connected to preexisting vertices chosen randomly with probability proportional to their degree k (preferential attachment).
After a certain number of iterations, the degree distribution approaches a power-law distribution:
$P(k) \propto k^{-\gamma}$, $\gamma = 3$
Growth and preferential attachment are both necessary!
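Growth with preferential attachment can be sketched in a few lines. One detail is not specified on the slide: with no initial links, all degrees are zero, so the first targets cannot be chosen proportionally to degree. The sketch assumes they are chosen uniformly among the $m_0$ seed vertices; this choice and the function names are illustrative.

```python
import random

def scale_free(n, m, m0, rng):
    """Grow a network to n vertices; each new vertex attaches m links to distinct
    existing vertices chosen with probability proportional to their degree."""
    assert 1 <= m <= m0 < n
    edges = []
    stubs = []  # every vertex appears once per link end: sampling is degree-proportional
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:
            # Assumption: before any link exists, pick uniformly among the seeds
            t = rng.choice(stubs) if stubs else rng.randrange(m0)
            targets.add(t)
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

def degree_sequence(n, edges):
    k = [0] * n
    for u, v in edges:
        k[u] += 1
        k[v] += 1
    return k

n, m, m0 = 2000, 2, 3
edges = scale_free(n, m, m0, random.Random(0))
k = degree_sequence(n, edges)
print(max(k), sum(k) / n)  # largest degree versus average degree
```

The largest degree ends up far above the average, which is the qualitative signature of a scale-free network described above.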
FITNESS model (Caldarelli et al. Phys. Rev. Lett. 2002)
● Each vertex i is assigned a fitness value $x_i$ drawn from a given distribution $\rho(x)$;
● A link is drawn between each pair of vertices i and j with probability $f(x_i, x_j)$ depending on $x_i$ and $x_j$.
Power-law degree distributions are obtained by choosing
$\rho(x) \propto x^{-\alpha}$, $f(x_i, x_j) \propto x_i x_j$
or
$\rho(x) = e^{-x}$, $f(x_i, x_j) = \theta(x_i + x_j - z)$
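As an illustration of the second choice (exponentially distributed fitness with a threshold rule), the sketch below links two vertices whenever the sum of their fitness values exceeds z. The values of N and z and the function names are assumptions chosen for the example.

```python
import random

def fitness_graph(n, z, rng):
    """Fitness model with rho(x) = exp(-x) and f(x_i, x_j) = theta(x_i + x_j - z)."""
    x = [rng.expovariate(1.0) for _ in range(n)]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if x[i] + x[j] > z]
    return x, edges

def degree_sequence(n, edges):
    k = [0] * n
    for u, v in edges:
        k[u] += 1
        k[v] += 1
    return k

n, z = 300, 6.0
x, edges = fitness_graph(n, z, random.Random(0))
k = degree_sequence(n, edges)
# With a deterministic threshold rule, the degree never decreases with the fitness
by_fitness = sorted(range(n), key=lambda v: x[v])
print([k[v] for v in by_fitness[-5:]])  # degrees of the five fittest vertices
```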
Exponential random graphs
Reciprocity of directed networks
Do reciprocated links (pairs of mutual links between two vertices) occur more or less often than expected by chance in a directed network?
[Figure: example directed graph with six vertices (1-6) and its adjacency matrix (N×N)]
Important aspect of many networks:
- Mutuality of relationships (friendship, acquaintance, etc.) in social networks
- Reversibility of biochemical reactions in cellular networks
- Symbiosis in food webs
- Synonymy in word association networks
- Economic/financial interdependence in trade/shareholding networks
- …
Link reciprocity: the problem
Standard definition of reciprocity
Reciprocity = fraction of reciprocated links in the network: $r = L^{\leftrightarrow} / L$
$L^{\leftrightarrow}$ = number of reciprocated links
$L$ = total number of directed links
Examples shown: Email and WWW; WTW
A new definition of reciprocity
Conceptual problems with the standard definition:
- $r$ is not an absolute quantity: it has to be compared to the value expected in a random network
- as a consequence, networks with different density cannot be compared
- self-loops should be excluded from the computation
New definition of reciprocity: the correlation coefficient $\rho$ between reciprocal links ($a_{ij}$ and $a_{ji}$), avoiding the aforementioned problems.
$\rho > 0$: reciprocal
$\rho = 0$: areciprocal
$\rho < 0$: antireciprocal
D. Garlaschelli, M.I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004)
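Both definitions of reciprocity can be evaluated from the adjacency matrix. The sketch below assumes the standard reciprocity is r = (reciprocated links)/(total links) and uses the identity that the Pearson correlation coefficient between $a_{ij}$ and $a_{ji}$ (over $i \neq j$) equals $(r - \bar{a})/(1 - \bar{a})$, where $\bar{a} = L/(N(N-1))$ is the link density. The symbols $\bar{a}$ and the function name are introduced here for illustration.

```python
def reciprocity(a):
    """Return (r, rho) for a directed graph with adjacency matrix a.

    Assumes at least one link and a graph that is not complete.
    """
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]  # self-loops excluded
    links = sum(a[i][j] for i, j in pairs)
    mutual = sum(a[i][j] * a[j][i] for i, j in pairs)  # reciprocated directed links
    r = mutual / links
    density = links / (n * (n - 1))
    rho = (r - density) / (1 - density)
    return r, rho

cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]        # 0 -> 1 -> 2 -> 0, no mutual links
mutual_pair = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # only the mutual pair 0 <-> 1
mixed = [[0, 1, 1], [1, 0, 0], [0, 1, 0]]        # one mutual pair, two single links

for name, graph in [("cycle", cycle), ("mutual pair", mutual_pair), ("mixed", mixed)]:
    print(name, reciprocity(graph))
```

The directed cycle is antireciprocal and the isolated mutual pair is reciprocal. The mixed example has r = 0.5 but a negative correlation coefficient, because its density is higher than its reciprocity.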
Results: reciprocity classifies real networks
[Figure: reciprocity of real networks, grouped by class: WTW, WWW, Neural, Metabolic, Food Webs, Words, Financial]
D. Garlaschelli, M.I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004)
Size dependence of the reciprocity
[Figure panels: World Trade Web, Food Webs, Metabolic networks]
A general model of reciprocity
We introduce a multi-species formalism where reciprocated and non-reciprocated links are regarded as two different ‘chemical species’, each governed by the corresponding chemical potential.
Each species consists of ‘particles’ (links of that type) distributed among ‘states’ (pairs of vertices).
The model is specified by a decomposition of the adjacency matrix and by a graph Hamiltonian, from which the following are derived:
- Grand Partition Function
- Grand Potential
- Conditional connection probability
- Occupation probabilities
Garlaschelli and Loffredo, Physical Review E 73, 015101(R) (2006)
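The decomposition of the adjacency matrix into the two 'species' can be made concrete as follows. The specific form used here, a non-reciprocated part $a_{ij}(1 - a_{ji})$ and a reciprocated part $a_{ij} a_{ji}$, is an assumption for illustration: the formula itself is not in the surviving text.

```python
def decompose(a):
    """Split a directed adjacency matrix (zero diagonal) into a non-reciprocated
    part a_ij * (1 - a_ji) and a reciprocated part a_ij * a_ji."""
    n = len(a)
    non_recip = [[a[i][j] * (1 - a[j][i]) for j in range(n)] for i in range(n)]
    recip = [[a[i][j] * a[j][i] for j in range(n)] for i in range(n)]
    return non_recip, recip

# Links: 0 -> 1, 0 -> 2, 1 -> 0, 2 -> 1 (so 0 <-> 1 is the only mutual pair)
A = [[0, 1, 1],
     [1, 0, 0],
     [0, 1, 0]]
non_recip, recip = decompose(A)
print(non_recip)
print(recip)
```

The two parts never overlap and add up to the original matrix; the reciprocated part is symmetric by construction.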
Models of weighted networks
Structural correlations in complex networks
In order to detect patterns in networks, one needs (one or more) null model(s) as a reference.
A null model is obtained by fixing some topological constraint(s), and generating a maximally random network consistent with them.
Examples of null models for unweighted networks:
- the random graph (Erdős-Rényi) model (number of links fixed),
- the configuration model (degree sequence fixed),
- etc.
Problem of structural correlations: when a low-level constraint is fixed, patterns may be generated at a higher level, even if they do not signal ‘true’ high-level correlations.
The (solved) problem for unweighted networks
Problem: specifying the degree sequence alone generates anticorrelations between $k^{nn}_i$ and $k_i$ (disassortativity) and between $c_i$ and $k_i$ (hierarchy) (Maslov et al.).
Solution: in unweighted networks, structural correlations can be fully characterized analytically in terms of exponential random graphs (Park & Newman).
[Figure: correct prediction (Park & Newman)]
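The structural anticorrelation between $k^{nn}_i$ and $k_i$ can be reproduced with a short calculation. The sketch assumes the known exponential-random-graph result for a fixed expected degree sequence, $p_{ij} = x_i x_j / (1 + x_i x_j)$, where the $x_i$ are hidden variables; the values of x, the function names, and the approximation of the expected $k^{nn}_i$ by a ratio of expectations are all assumptions of this example.

```python
def connection_probabilities(x):
    """p_ij = x_i x_j / (1 + x_i x_j) for i != j, and 0 on the diagonal."""
    n = len(x)
    return [[0.0 if i == j else x[i] * x[j] / (1 + x[i] * x[j]) for j in range(n)]
            for i in range(n)]

def expected_degrees(p):
    return [sum(row) for row in p]

def expected_knn(p):
    """Approximate expected average nearest-neighbour degree of each vertex."""
    k = expected_degrees(p)
    n = len(p)
    return [sum(p[i][j] * k[j] for j in range(n)) / k[i] for i in range(n)]

x = [0.1 * 1.5 ** i for i in range(12)]  # heterogeneous hidden variables (illustrative)
p = connection_probabilities(x)
k = expected_degrees(p)
knn = expected_knn(p)
for ki, knni in zip(k, knn):
    print(round(ki, 2), round(knni, 2))
```

Although no degree correlation is imposed, vertices with large expected degree have neighbours with lower average degree, because $p_{ij}$ saturates for large $x_i x_j$.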
Some null models for weighted networks
Model 1: Global weight reshuffling (fixed topology)
Model 2: Global weight & tie reshuffling (fixed degrees)
Model 3: Local weighted rewiring (fixed strengths)
Model 4: Local weighted rewiring (fixed strengths and degrees)
Is it possible to characterize these models analytically?
Exponential formulation of the four null models
Model 1: Global weight reshuffling (fixed topology)
Model 2: Global weight & tie reshuffling (fixed degrees)
Model 3: Local weighted rewiring (fixed strengths)
Model 4: Local weighted rewiring (fixed strengths and degrees)
Note: H1, H2, H3 and H4 (the Hamiltonians of Models 1-4) are particular cases of a single general Hamiltonian.
Analytic solution of the general null model: it yields the probability of a link of weight w between i and j.
Models 1 and 2 (global weight reshuffling): Fermionic correlations
This means that weighted measures (except the disparity)
display a satisfactory behaviour under these null models
(but they inherit purely topological correlations!)
The expectations
are confirmed, however
implies
Model 3 (fixed strength): Bosonic correlations
Now all weighted measures are uninformative!
Model 4 (fixed strength+degree): mixed Bose-Fermi statistics
We still have as in model 3:
All weighted measures are uninformative in this case too!
Particular case: the Weighted Random Graph (WRG) model
See a Mathematica demonstration of the model (by T. Squartini) at:
http://demonstrations.wolfram.com/WeightedRandomGraph/
The Weighted Random Graph (WRG) model
Largest connected component in the WRG after weak (+) and strong (-) edge removal
Clustering coefficient in the WRG after weak (+) and strong (-) edge removal
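A Weighted Random Graph can be sampled with a few lines of code. The sketch assumes that each pair of vertices independently receives an integer weight w with the geometric probability $p^w (1 - p)$, so that a link (w > 0) exists with probability p and the expected weight is p/(1 - p). This form, the function name and the parameter values are assumptions of the example, since the formula is not in the surviving text.

```python
import random

def weighted_random_graph(n, p, rng):
    """Assign each vertex pair an integer weight w with probability p**w * (1 - p).

    Returns a dict {(i, j): w} containing only the pairs with w > 0.
    """
    weights = {}
    for i in range(n):
        for j in range(i + 1, n):
            w = 0
            while rng.random() < p:  # add one unit of weight with probability p
                w += 1
            if w:
                weights[(i, j)] = w
    return weights

n, p = 200, 0.3
weights = weighted_random_graph(n, p, random.Random(0))
pairs = n * (n - 1) // 2
print(len(weights) / pairs)           # fraction of linked pairs, close to p
print(sum(weights.values()) / pairs)  # mean weight per pair, close to p / (1 - p)
```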
Food webs
Networks of predation relationships among N biological species
$i \to j$: i is eaten by j
Peculiar (problematic?) aspects of food webs
Only property similar to other networks: small distance D
The connectance $c = L/N^2$ (fraction of directed links out of the total possible ones) varies across different webs
Not small-world! $C/C_{random} = 1$ [Figure: $C/C_{random}$ versus N]
Not scale-free! [Figure: cumulative degree distribution $P_>(k')$ versus $k' = k/\langle k \rangle$]
Dunne, Williams, Martinez Proc. Natl. Acad. Sci. USA 2002
A modest proposal: food webs as transportation networks
Resource transfer along each food chain:
Flux of matter and energy from prey to predators, in more and more complex forms: directionality
Species ultimately feed on the abiotic resources (light, water, chemicals): connectedness
Almost 10% of the resources are transferred from the prey to the predator: energy dispersion
Minimum-energy subgraphs: minimum spanning trees
Minimum spanning trees can be obtained as zero-temperature ensembles, where $l_i$ is the trophic level (shortest distance to abiotic resources) of species i
Spanning trees and allometric scaling
Structure minimizing each species’ distance from the “environment vertex”
Trophic level ℓ of a species i: minimum distance from the environment to i.
Spanning tree: all links from a species at level ℓ to species at levels ℓ’≤ℓ are removed.
Allometric relations: $C_i(A_i) \to C(A)$
Power-law scaling: $C(A) \propto A^{\eta}$
[Figure: example spanning tree with the trophic level ℓ and the values $A_i$, $C_i$ of each vertex; plot of C(A) versus A]
Allometric scaling in river networks
Banavar, Maritan, Rinaldo Nature 1999
$A_i$ = drainage area of site i
$C_i$ = water in the basin of i
$C(A) \propto A^{\eta}$, $\eta = 3/2$
Allometric scaling in vascular systems
West, Brown, Enquist Science 1999; Banavar, Maritan, Rinaldo Nature 1999
$A_0$ = metabolic rate (B)
$C_0$ = nutrient volume (M)
$C(A) \propto A^{\eta}$, $\eta = 4/3$
Kleiber’s law of metabolism: $B(M) \propto M^{3/4}$
General case (dimension d): $\eta = (d+1)/d$ (maximum efficiency)
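The exponents quoted for rivers and vascular systems, and Kleiber's law, all follow from the general formula by substituting the dimension d. As a short worked step (identifying A with the metabolic rate B and C with the nutrient volume M, as in the vascular case above):

```latex
\begin{align}
C(A) &\propto A^{\eta}, \qquad \eta = \frac{d+1}{d}, \\
d = 2 &: \quad \eta = \tfrac{3}{2} \quad \text{(river networks)}, \\
d = 3 &: \quad \eta = \tfrac{4}{3} \quad \text{(vascular systems)}, \\
M \propto B^{4/3} \quad &\Longrightarrow \quad B(M) \propto M^{3/4}.
\end{align}
```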
Allometric scaling in food webs
Garlaschelli, Caldarelli, Pietronero Nature 423, 165-168 (2003)
$C(A) \propto A^{\eta}$, $\eta = 1.13$-$1.16$
The resource transfer is universal and efficient (common organising principle?)
Transport efficiency in food webs
Star: $C(A) \propto A$ (efficient)
Chain: $C(A) \propto A^2$ (inefficient)
Competition: $C(A) \propto A^{\eta}$, $1 < \eta < 2$
The constraint limiting the efficiency is not the geometry, but the competition!
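The two limiting cases (star and chain) can be checked on small trees. The sketch assumes the usual definitions for spanning trees: $A_i$ is the number of vertices in the branch rooted at i (including i), and $C_i$ is the sum of $A_j$ over that branch. These definitions and the function name are assumptions of the example, since they are not spelled out in the surviving text.

```python
def allometry(parent):
    """Compute A_i and C_i on a rooted tree.

    parent[i] is the vertex that feeds i in the spanning tree;
    the root (environment vertex) has parent None.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    for v, par in enumerate(parent):
        if par is not None:
            children[par].append(v)
    root = parent.index(None)
    order = [root]
    for v in order:  # the list grows while iterating: breadth-first order
        order.extend(children[v])
    A = [0] * n
    C = [0] * n
    for v in reversed(order):  # leaves first, root last
        A[v] = 1 + sum(A[c] for c in children[v])
        C[v] = A[v] + sum(C[c] for c in children[v])
    return A, C

n_species = 10
star = [None] + [0] * n_species             # every species feeds on the environment
chain = [None] + list(range(n_species))     # each species feeds on the previous one
A_star, C_star = allometry(star)
A_chain, C_chain = allometry(chain)
print(A_star[0], C_star[0])    # C grows linearly with A
print(A_chain[0], C_chain[0])  # C grows quadratically with A
```

For the same A at the root, the chain requires a much larger C than the star, which is the sense in which the star is efficient and the chain is not.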
Summary: food web structure decomposition
Spanning trees and loops: complementary properties and roles
Tree-forming links:
1) Determine the degree of transportation EFFICIENCY
2) Measured by the allometric exponent η
3) η is universal! (Common evolutionary principle?)
Loop-forming links:
1) Determine the STABILITY under species removal
2) Measured by the directed connectance c
3) c varies! (Web-specific organization?)
[Figure: food web with source and species vertices]
Out-of-equilibrium statistical mechanics of networks
Restoring the feedback
Dynamical process
Topological evolution
We focus on the case when topology and dynamics evolve over comparable timescales.
As a result, the process is self-organized and a non-equilibrium stationary state is reached, independently of (otherwise arbitrary) initial conditions.
We choose the simplest possible dynamical rule: Bak-Sneppen model
and the simplest possible network formation mechanism: Fitness model
Coupling the Bak-Sneppen and the fitness model
Bak-Sneppen model on fixed graphs
(Bak, Sneppen PRL 1993 - Flyvbjerg, Sneppen, Bak PRL 1993 - Kulkarni, Almaas, Stroud cond-mat/9905066 - Moreno, Vazquez EPL 2002 - Lee, Kim PRE 2005 - Masuda, Goh, Kahng PRE 2005)
1) Specify graph, and keep it fixed;
2) assign each vertex i a fitness $x_i$ drawn uniformly in (0,1);
3) draw anew the fitnesses of the least fit vertex and its neighbours;
4) evolve fitnesses iterating 3).
Fitness network model with quenched fitnesses
(Caldarelli et al. PRL 2002 - Boguñá, Pastor-Satorras PRE 2003)
1) Specify fitness distribution $\rho(x)$;
2) assign each vertex i a fitness $x_i$ drawn from $\rho(x)$, and keep it fixed;
3) draw network by joining i and j with probability $f(x_i, x_j)$;
4) repeat realizations and perform ensemble average.
Coupled (self-organized) model:
1) Assign each vertex i a fitness $x_i$ drawn from an arbitrary distribution;
2) draw network by joining i and j with probability $f(x_i, x_j)$;
3) draw anew the fitnesses of the least fit vertex and its neighbours, uniformly in (0,1);
4) draw anew the links of the least fit vertex and its neighbours with probability $f(x_i, x_j)$;
5) repeat from 3).
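The five steps of the coupled model can be sketched as a short simulation. The connection probability used here, $f(x, y) = z x y / (1 + z x y)$ with a hypothetical parameter z, is only one possible choice made for illustration, as are the uniform initial fitnesses and the function names.

```python
import random

Z = 0.1  # hypothetical parameter of the assumed connection probability

def f(x, y):
    """Assumed connection probability f(x, y) = Z x y / (1 + Z x y)."""
    return Z * x * y / (1 + Z * x * y)

def step(x, nbrs, rng):
    """One update: refresh fitnesses and links of the least fit vertex and its neighbours."""
    n = len(x)
    least = min(range(n), key=x.__getitem__)
    affected = {least} | nbrs[least]
    for v in affected:
        x[v] = rng.random()  # new fitness, uniform in (0,1)
    redrawn = set()
    for v in affected:
        for u in range(n):
            pair = (min(u, v), max(u, v))
            if u == v or pair in redrawn:
                continue  # redraw each pair only once
            redrawn.add(pair)
            nbrs[v].discard(u)
            nbrs[u].discard(v)
            if rng.random() < f(x[v], x[u]):
                nbrs[v].add(u)
                nbrs[u].add(v)
    return least

def simulate(n, steps, rng):
    x = [rng.random() for _ in range(n)]  # step 1, here with uniform fitnesses
    nbrs = [set() for _ in range(n)]
    for i in range(n):                    # step 2: draw the initial network
        for j in range(i + 1, n):
            if rng.random() < f(x[i], x[j]):
                nbrs[i].add(j)
                nbrs[j].add(i)
    for _ in range(steps):                # steps 3-5
        step(x, nbrs, rng)
    return x, nbrs

x, nbrs = simulate(60, 300, random.Random(0))
print(sum(x) / len(x), sum(len(nb) for nb in nbrs) / len(nbrs))
```

Because links are redrawn with the updated fitness values, topology and dynamics evolve together, unlike in the two uncoupled models listed above.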
Typical iteration of the model:
Analytical solution for arbitrary f(x,y)
Stationary fitness distribution: uniform below the critical threshold, as in standard BS; above the threshold it depends on x (not uniform), which is a novel result
Critical threshold: obtained from the normalization condition
Distribution of minimum fitness: uniform
D. Garlaschelli, A. Capocci, G. Caldarelli, Nature Physics 3, 813-817 (2007)
Analytical solution for arbitrary f(x,y)
Degree versus fitness:
Similarly, all other topological properties are derived as in the static fitness model
Stationary degree distribution:
Particular choices of f(x,y)
Null case: random graph (“grandcanonically” equivalent to random-neighbor BS model)
Stationary fitness distribution: step-like, as in the random-neighbor BS model
Critical threshold: dynamical regimes (subcritical, sparse, dense) rooted in an underlying percolation transition
Particular choices of f(x,y)
Simplest nontrivial (and unbiased) case: configuration model
Stationary fitness distribution: Zipf (but normalizable!)
Critical threshold: regimes subcritical, sparse, dense
Conjecture (verified later): underlying percolation transition
See Garlaschelli and Loffredo, Phys. Rev. E 78, 015101(R) (2008).
Stationary fitness distribution
In the self-organized model, it is no longer step-like (as in the BS model on fitness-independent networks) but power-law.
Theoretical results against simulations
[Figure: power-law fitness distribution (above the threshold)]
Check the percolation transition conjecture
[Figure: power-law cluster size distribution at the transition]
Degree versus fitness
The “saturation” reflects repulsion between large degrees: it implies disassortativity and hierarchy (not shown)
Cumulative degree distribution
[Figure: scale-free degree distribution (above the threshold)]
Average fitness versus threshold
References
Reciprocity
D. Garlaschelli, M. I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004)
D. Garlaschelli, M. I. Loffredo, Phys. Rev. E 73, 015101(R) (2006)
Weighted networks
D. Garlaschelli, New Journal of Physics 11, 073005 (2009)
D. Garlaschelli, M. I. Loffredo, Phys. Rev. Lett. 102, 038701 (2009)
Food web scaling
D. Garlaschelli, G. Caldarelli, L. Pietronero, Nature 423, 165-168 (2003)
D. Garlaschelli, Eur. Phys. J. B 38(2), 277 (2004)
Out-of-equilibrium model
D. Garlaschelli, A. Capocci, G. Caldarelli, Nature Physics 3, 813-817 (2007)
G. Caldarelli, A. Capocci, D. Garlaschelli, Eur. Phys. J. B 64, 585-591 (2008)