View
341
Download
0
Embed Size (px)
Citation preview
1
Professor Ernesto EstradaDepartment of Mathematics & Statistics
University of Strathclyde
Glasgow, UK
www.estradalab.org
2
3
Measure what is measurable, and make measurable what is not so.
Galileo Galilei
Professor Ernesto EstradaDepartment of Mathematics & Statistics
University of Strathclyde
Glasgow, UK
www.estradalab.org
kkp ~ kkekp /~
!k
kekp
kk
2
2
2
2
1 k
kk
k
ekp
kkp ~ !k
kekp
kk
Poisson
Exponential
Gamma
Power-law
Log-normal
Stretched exponential
Stumpf & Ingram, Europhys. Lett. 71 (2005) 152.
kekp kk /~
2~
222//ln
k
ekp
mk
Stumpf & Ingram, Europhys. Lett. 71 (2005) 152.
The qualitative interpretation of the skew is
complicated. For example, a zero value
indicates that the tails on both sides of the
mean balance out, which is the case for a
symmetric distribution, but is also true for an
asymmetric distribution where the asymmetries
even out, such as one tail being long but thin,
and the other being short but fat. Further, in
multimodal distributions and discrete
distributions, skewness is also difficult to
interpret. Importantly, the skewness does not
determine the relationship of mean and
median.
Wikipedia
• Collatz-Sinogowitz Index (1957)
kGCS 1
• Bell Index (1992)
2
2 11
i
i
i
i kn
kn
GVar
• Albertson Index (1997)
Eji
ji kkGirr,
Unknown
upper
bounds
i
ik
p1
ji
ijkk
p11
2
1
Estrada: Phys. Rev. E 82, 066102 (2010)
ji
ijkk
p1
ˆ
ijijij pp ˆ
2
11
2112
ji
jiji
ij
kk
kkkk
Eij jiEij
ijkk
G
2
112
AKL
otherwise,0
,~for 1
,for
ji
jik
L
i
ij
2/12/1
2/12/1
KAKI
KLKL
otherwise,0
,~for 1
,for 1
jikk
ji
ji
ijL
.2
2
11
,
2/1
Gn
kkn
G
Eji
ji
T
L
2
1n
Gn
12
2~
nn
GnG
0~1 G
n 210
n
i
j
j
j
T
j in 1
1
1
1cos
j
jj
T nG 2cos11
L
j i
jj
T iG
2
11
L
jjjx cos1
jjjy sin1
n
j
jxnn
nG
1
2
12
~
j
01
j
jx
jy
y
x
8k 16k
pnG ,
017.0~ G 034.0~ G
1227.2
27.018.1
18.1
nn
n
k
kBA
8k 16k
132.0~ G 108.0~ G
S. cereviciaeD. melanogaster
C. elegansH. pylori
148.0~ G 294.0~ G
E. coli
241.0~ G
323.0~ G 383.0~ G Stretched exponential
Log-normal
kekp kk /~
2~
222//ln
k
ekp
mk
Professor Ernesto EstradaDepartment of Mathematics & Statistics
University of Strathclyde
Glasgow, UK
www.estradalab.org
1,305 mi.
Stanley Milgram
1933-1984
CLUSTERINGDISTANCE
5ij
d
i
iCi
node of relations e transitivpossible ofnumber total
node of relations e transitivofnumber
1
2
2/1
ii
i
ii
ii
kk
t
kk
tC
i
iCn
C1
d 3.65
0.79
2.99
0.00027
1847
0.74C
d 2.65
0.28
2.25
0.05
10.5
0.69C
33
d 18.7
0.08
12.4
0.0005
926
0.3C
34
D. J. Watts
S. H. Strogatz
35
There is a high probability that the shortest path connecting nodes p and q goes through the most connected nodes of the network.
Sending the ‘information’ to the most connected nodes (hubs) of the network will increase the chances of reaching the target.
The sender does not know the global structure of the network!
36
But also, there is a high probability that the most connected nodes of the network are involved in a high number of transitive relations.
Sending the ‘information’ to the most connected nodes (hubs) of the network will increase the chances of getting lost.
37
Definition: A route is a walk of length l, which is any sequence of (not necessarily different) nodes v1,…, vl such as for each i =1,…,l there is a link from vl to vl+1.
Definition: The communicability between two nodes in a network is defined as a function of the total number of routes connecting them, giving more importance to the shorter than to the longer ones.
38
Hypothesis: The information not only flows through the shortestpaths but by mean of any available route.
39
0
# of walks in steps from to pq l
l
G c l p q
The between the nodes p and q is then mathematically defined as
where cl should:
makes the series convergent
gives more weight to the shorter than to the longer walks
VpVp pfpfV
2
2
Eqp
qfpAf,
Definition: Let .
The adjacency operator is a bounded operator on
defined as .
Theorem: (Harary) The number of walks of length l
between the nodes p and q in a network is equal to .
pq
lA
V2
40
The communicability function can be written as:
qjpj
l
j
l
n
j
lpq
l
l
lpq cAcG ,,
0 10
41
Let be a nonincreasing ordering
of the eigenvalues of A, and let be the eigenvector
associated with .
n 21
j
j
42
jee
l
AG
n
j
qjpjpq
A
l
pq
l
pq
1
,,
0 !
n
j j
qjpj
pq
l
pq
ll
pq AIAG1
,,1
0 1
1
10
1 1
1 1
2
nn
pq n j j
j
G K p q e e p p
11
2
1 1 11 .
n n
pq n j j
j
nn
eG K e p p
n
ee
n ne ne
pq nG K
2cos1
j
j
n
2sin
1 1j
jpp
n n
2cos
1
1
2cos1
1
2 2sin sin
1 1 1 1
2sin sin .
1 1 1
jnn
pq n
j
jnn
j
jp jqG P e
n n n n
jp jqe
n n n
sin sin 1/ 2 cos cos
2cos 2cos
1 1
1
1 1cos cos
1 1 1 1
j jnn n
pq n
j
j p q j p qG P e e
n n n n
1,2, ,j n 1/ nj ,0
2cos 2cos
0 0
1 1cos cospq nG P p q e d p q e d
/ 1j n
x cos
0
1cos e d I x
2 2pq n p q p qG P I I
1 1 12 2 0n n n nG P I I
x
xx
xxx
x
x
xooo
oooo
ooo
0 5 10 15 20
0.4
0.3
0.2
0.1
0
-0.1
-0.2
-0.3
-0.4
-0.5
Subject
Second larg
est
princip
al valu
e
x
x
x
x
x
x
x
x
oooo o
oo
o
0 5 10 15 20
0.4
0.3
0.2
0.1
0
-0.1
-0.2
-0.3
-0.4
-0.5
x
oo
Subject
Second larg
est
princip
al valu
e
pqpq AWWG 2/12/1exp
~
: Stroke : Decreased communicability
: Stroke : Increased communicability
Images courtesy of Dr. Jonathan J. Crofts.
Professor Ernesto EstradaDepartment of Mathematics & Statistics
University of Strathclyde
Glasgow, UK
www.estradalab.org
Local metrics provide a measurement of a structural
property of a single node
Designed to characterise
Functional role – what part does this node play in
system dynamics?
Structural importance – how important is this node to
the structural characteristics of the system?
jee
l
AG
n
j
pjpp
A
l
pp
l
pp
1
2
,
0 !
1) 3.026
2) 1.641
3) 1.641
4) 3.127
5) 2.284
6) 2.284
7) 1.592
8) 1.592
1
2
3
4
5
6
7
8
1
2
3
4
56
7
8
9 8,6,5,3,11 S
65.451 SpGpp
9,7,4,22 S
70.452 SpGpp
Essential
54
55
Rank Protein Essential
?
1 YCR035C Y
2 YIL062C Y
... ... ...
Estrada: Proteomics 6 (2009) 35-40
0
10
20
30
40
50
60
70P
erc
en
tage o
f E
ssen
tial P
rote
ins
Iden
tifi
ed
57
Consider that two nodes p and q are trying to communicate with each other by sending information both ways through the network.
ppGQuantifies the amount of ‘information’ that is sent by p, wanders around the network, and returns to the origin, the node p.
pqGQuantifies the amount of ‘information’ that is sent by p, wanders around the network, and arrives to its destination, the node q.
We know that:
2def
pq pp qq pqG G G
The goal of communication is to maximise the amount of information that arrives to its destination by minimising that which is lost.
58
Theorem: The function is a Euclidean distance. pq
59
T
pq p q p qe Λφ φ φ φ
1
2
0 0
0 0
0 0 n
Λ
1
2
p
n
p
p
p
φ
Proof:
60
/2 /2
/2 /2 /2 /2
2
T
pq p q p q
T
p q p q
T
p q p q
p q
e e
e e e e
Λ Λ
Λ Λ Λ Λ
φ φ φ φ
φ φ φ φ
x x x x
x x
Proof (cont.):
61
pqpqd
p q
62
p q
1P
p q
2P
09.5
1,,
2/1
PjiEji
ij
80.4
2,,
2/1
PjiEji
ij
Remark. The shortest route based on the communicability
distance is the shortest path that avoids the nodes with the
highest ‘cliquishness’ in the graph.
81.56 10 83.30 10
Fraction of nodes
removed
The routes minimizing the communicability distance are the shortest ones
that avoid the hubs of the network. Then, they can be useful to avoid
possible collapse of the networks due to the removal of the hubs.
67
Essential
q
qp
n
,
1
q
qpd
n
,
1
68
69
large ppG
Wikipedia
70
qqpp
pqdef
pqGG
G
Theorem. The function is the cosine of the Euclidean
angle spanned by the position vectors of the nodes p and q.
pq
qqpp
pq
qp
qp
pqGG
G
xx
xx11 coscos
Definition:
71
a
bcR
2
2 2
4
1
11
AT ea 1 AT esb sesc AT
Theorem. The communicability distance induces an
embedding of a network into an (n-1)-dimensional
Euclidean sphere of radius:
Estrada et al.: Discrete Appl. Math. 176 (2014) 53-77.
Estrada & Hatano, SIAM Rev. (2016) in press.
ATT essC 211 Aediags
Remark. The communicability distance matrix C is circum-
Euclidean:
72
2/1
2/1
2/1
1
2/1
0
2/1
2
2/1
2/1
2/1
3
jjjA
73
0.08.06.0
0.0
7.0
0.0
0.1
0.1
xy
z
2
3
1
0.00.1
2
3
1
0.1
0.1
74
2
1
3
x
y
z
0.1
0.0
0.1
0.10.1
0.0
0.10.2
0.0
O
1x
2x
3x
75
px
qx
O
ppp Gx pq
pq
qqq Gx
Estrada & Hatano: Arxiv (2014) 1412.7388.
Estrada: Lin. Alg. Appl. 436 (2012) 4317-4328.
qppq xxG
Professor Ernesto EstradaDepartment of Mathematics & Statistics
University of Strathclyde
Glasgow, UK
www.estradalab.org
1V 2V
1 2V V V
0
0T
BA
B
1j n j
m
mb
fr1
Problem:
sinh 0Bipartivity Tr A
Bipartivity No odd cycles
exp sinh coshTr A Tr A Tr A
exp coshBipartivity Tr A Tr A
EE
EE
ATr
ATrb even
n
j
j
n
j
j
s
1
1
exp
cosh
exp
cosh
even evens s
EE EE ab G b G e
EE EE a b
Theorem.
G G e
e
b
a
1.000sb 0.829sb 0.769sb 0.721sb
0.692sb 0.645sb 0.645sb 0.597sb
1 1
1 1
cosh sinh
cosh sinh
n n
j j
j j
e n n
j j
j j
b
1
1
expexp
expexp
n
j
j
e n
j
j
Tr Ab
Tr A
1.000sb 0.658sb 0.538sb 0.462sb
0.383sb 0.289sb 0.289sb 0.194sb
Hervibores
Grass
Parasitoids
Hyper-hyper-parasitoids
Hyper-parasitoids
0.766sb
0.532eb
j = 1 j = 2 j = 3
j = 4 j = 5 j = 6
ˆ exp cosh sinhpq pq pqG A A A
ˆ sinh 0pq pqG A
p
q
p
q
ˆ cosh 0pq pqG A
0 and in different partitionsˆ 0 and in the same partitions
pq
p qG
p q
1
3
7
4
6
10
8
11
9
12
5
2
38.919.872.673.680.636.628.777.610.705.704.773.8
19.81.1083.609.711.766.673.862.598.729.728.702.9
72.683.603.817.487.543.504.731.632.456.665.528.7
73.609.717.405.862.550.505.742.562.630.456.629.7
80.611.787.562.517.893.310.768.596.562.632.498.7
36.666.643.550.593.337.777.639.568.542.531.662.5
28.773.804.705.710.777.638.936.680.673.672.619.8
77.662.531.642.568.539.536.637.793.350.543,566.6
10.798.732.462.696.568.580.693.317.862.587.511.7
05.729.756.630.462.642.573.650.562.505.817.409.7
04.728.765.556.632.431.672.643.587.517.403.883.6
73.802.928.729.798.75.6219.866.611.709.783.61.10
G
011111000000
101111000000
110111000000
111011000000
111101000000
111110000000
000000011111
000000101111
000000110111
000000111011
000000111101
000000111110
~G
10
12
11
8
7
9
2
3
4
1
5
6
1 1,2,3,4,5,6C
2 6,7,8,9,10,11,12C
1
3
7
4
6
10
8
11
9
12
5
2
2 3
10
4
1211
1 5 6
87 91
3
7
4
6
10
8
11
9
12
5
2
Professor Ernesto EstradaDepartment of Mathematics & Statistics
University of Strathclyde
Glasgow, UK
www.estradalab.org
VS VS 2/1
S
Let and
represents the number
of edges with exactly one
endpoint in S.
12
min , ,02n
S
S VG S V S
S
1 21 1 22
2G
N 321(Alon-Milman): Let
21
A Ramanujan graph
n = 80, = 1/4
The Petersen graph
n = 10, = 1
S S
SS
bottleneck
MySQL 30.7
93.6 56.5
St Marks St Martin
t
t
t
t
iiG
i
800,628,3321
!
1032
0
iiiiii
k
ii
k
ii
AAA
k
AG
i i
i
1/2 1/6
1/24i
1/40320
iNk
1
1
n
j
kkk jNiNis
i,1
k
jpj
n
p
n
j
ijk iN ,
1 1
,
n
p
k
jqj
n
q
n
j
pj
n
p
k pN1
,
1 1
,
1
n
p
k
jqj
n
q
n
j
pj
k
jpj
n
p
n
j
ij
k is
1
,
1 1
,
,
1 1
,
.
1 1
,1,1
1
,1,1
1 1
1,1,1
1
1,1,1
n
p
n
q
qp
n
p
pi
n
p
n
q
k
qp
n
p
k
pi
k is
2
1
,1
1 1
,1,1
n
j
p
n
p
n
q
qp in
p
p
i
k is ,1
1
,1
,1
k
Remark: The eigenvector centrality of a given node represents the probability of selecting at random a walk of infinite length that hasstarted at that node.
1 2
n
j
jijiiiG2
2
,1
2
,1 expexp
1
2
,1 exp iiiG
2ln
2
1ln 1
,1
iii G
i,1ln
iiGln
5.0
1
2
,1
,1
explnln
ii
i
iG
iiii GVi 1
2
,1,1 exp,0ln
iiii GVi 1
2
,1,1 exp,0ln
iiii GVi 1
2
,1,1 exp,0ln
andVpp ,0ln ,1Vqq ,0ln ,1
Chordless cycle
of length 15
MySQL
3105.1
St Marks St Martin
31090.2
67.1