61
Lecture 28 Network resilience Slides are modified from Lada Adamic

Lecture 28 Network resilience Slides are modified from Lada Adamic

Embed Size (px)

Citation preview

Page 1: Lecture 28 Network resilience Slides are modified from Lada Adamic

Lecture 28

Network resilience

Slides are modified from Lada Adamic

Page 2: Lecture 28 Network resilience Slides are modified from Lada Adamic

Outline

network resilience

effects of node and edge removal

example: power grid

example: biological networks

Page 3: Lecture 28 Network resilience Slides are modified from Lada Adamic

Network resilience

Q: If a given fraction of nodes or edges are removed… how large are the connected components? what is the average distance between nodes in the components

Related to percolation We say the network percolates when a giant component forms.

Source: http://mathworld.wolfram.com/BondPercolation.html

Page 4: Lecture 28 Network resilience Slides are modified from Lada Adamic

Bond percolation in Networks

Edge removal bond percolation: each edge is removed with probability (1-p)

corresponds to random failure of links targeted attack: causing the most damage to the network with

the removal of the fewest edges strategies: remove edges that are most likely to break apart the

network or lengthen the average shortest path e.g. usually edges with high betweenness

Page 5: Lecture 28 Network resilience Slides are modified from Lada Adamic

Percolation threshold in Erdos-Renyi Graphs

average degree

size

of g

iant

com

pone

nt

av deg = 0.99 av deg = 1.18 av deg = 3.96

Percolation threshold: the point at which the giant component emerges

As the average degree increases to z = 1, a giant component suddenly appears

Edge removal is the opposite process As the average degree drops below 1 the network becomes disconnected

Page 6: Lecture 28 Network resilience Slides are modified from Lada Adamic

In this network each node has average degree 4.64, if you removed 25% of the edges,

by how much would you reduce the giant component?

Quiz Q:

Page 7: Lecture 28 Network resilience Slides are modified from Lada Adamic

Edge percolation

50 nodes, 116 edges, average degree 4.64

after 25 % edge removal - > 76 edges, average degree 3.04

still well above percolation threshold

How many edges would you have to remove to break up an Erdos Renyi random graph? e.g. each node has an average degree of 4.64

Page 8: Lecture 28 Network resilience Slides are modified from Lada Adamic

Site percolation on lattices

Fill each square with probability p

low p: small isolated islands

p critical: giant component forms, occupying finite fraction of infinite lattice. Size of other components is power law distributed

p above critical: giant component rapidly spreads to span the lattice Size of other components is O(1)

Interactive demonstration:http://www.ladamic.com/netlearn/NetLogo501/LatticePercolation.html

Page 9: Lecture 28 Network resilience Slides are modified from Lada Adamic

Percolation on Complex Networks

Percolation can be extended to networks of arbitrary topology

We say the network percolates when a giant component forms

Page 10: Lecture 28 Network resilience Slides are modified from Lada Adamic

Scale-free networks are resilient with respect to random attack

gnutella network 20% of nodes removed

574 nodes in giant component 427 nodes in giant component

Page 11: Lecture 28 Network resilience Slides are modified from Lada Adamic

Targeted attacks are affective against scale-free networks

gnutella network, 22 most connected nodes removed (2.8% of the nodes)

301 nodes in giant component574 nodes in giant component

Page 12: Lecture 28 Network resilience Slides are modified from Lada Adamic

Quiz Q:

Why is removing high-degree nodes more effective?

it removes more nodes it removes more edges it targets the periphery of the network

Page 13: Lecture 28 Network resilience Slides are modified from Lada Adamic

random failures vs. attacks

Source: Error and attack tolerance of complex networks. Réka Albert, Hawoong Jeong and Albert-László Barabási.

Page 14: Lecture 28 Network resilience Slides are modified from Lada Adamic

Percolation Threshold scale-free networks

Source: Cohen et al., Resilience of the Internet to Random Breakdowns

What proportion of the nodes must be removed in order for the size (S) of the giant component to drop to ~0?

For scale free graphs there is always a giant component the network always percolates

Page 15: Lecture 28 Network resilience Slides are modified from Lada Adamic

Network resilience to targeted attacks

Scale-free graphs are resilient to random attacks, but sensitive to targeted attacks.

For random networks there is smaller difference between the two

random failure targeted attack

Source: Error and attack tolerance of complex networks. Réka Albert, Hawoong Jeong and Albert-László Barabási

Page 16: Lecture 28 Network resilience Slides are modified from Lada Adamic

Real networks

Source: Error and attack tolerance of complex networks. Réka Albert, Hawoong Jeong and Albert-László Barabási

Page 17: Lecture 28 Network resilience Slides are modified from Lada Adamic

the first few % of nodes removed

Source: Error and attack tolerance of complex networks. Réka Albert, Hawoong Jeong and Albert-László Barabási

Page 18: Lecture 28 Network resilience Slides are modified from Lada Adamic

Skewness of power-law networks and effects and targeted attack

Source: D. S. Callaway, M. E. J. Newman, S. H. Strogatz, and D. J. Watts, Network robustness and fragility: Percolation on random graphs, Phys. Rev. Lett., 85 (2000), pp. 5468–5471.

% of nodes removed, from highest to lowest degree

Page 19: Lecture 28 Network resilience Slides are modified from Lada Adamic

Assortativity

Social networks are assortative: the gregarious people associate with other

gregarious people the loners associate with other loners

The Internet is disassortative:

Assortative:hubs connect to hubs Random

Disassortative:hubs are in the periphery

Page 20: Lecture 28 Network resilience Slides are modified from Lada Adamic

Correlation profile of a network

Detects preferences in linking of nodes to each other based on their connectivity

Measure N(k0,k1) – the number of edges between nodes with connectivities k0 and k1

Compare it to Nr(k0,k1) – the same property in a properly randomized network

Very noise-tolerant with respect to both false positives and negatives

Page 21: Lecture 28 Network resilience Slides are modified from Lada Adamic

Degree correlation profiles: 2DInternet

source: Sergei Maslov

Page 22: Lecture 28 Network resilience Slides are modified from Lada Adamic

Average degree of neighbors Pastor-Satorras and Vespignani: 2D plot

aver

age

degr

eeof

the

nod

e’s

neig

hbor

s

degree of node

probability of aquiring edges is dependent on ‘fitness’ + degreeBianconi & Barabasi

Page 23: Lecture 28 Network resilience Slides are modified from Lada Adamic

Single number

cor(deg(i),deg(j)) over all edges {ij}

internet = -0.189

The Pearson correlation coefficient of nodes on eachside on an edge

Page 24: Lecture 28 Network resilience Slides are modified from Lada Adamic

assortative mixing more generally

Assortativity is not limited to degree-degree correlations other attributes social networks: race, income, gender, age food webs: herbivores, carnivores internet: high level connectivity providers, ISPs,

consumers

Tendency of like individuals to associate = ‘homophily’

Page 25: Lecture 28 Network resilience Slides are modified from Lada Adamic

degree assortativity and resilience

will a network with positive or negative degree assortativity be more resilient to attack?

assortative disassortative

Page 26: Lecture 28 Network resilience Slides are modified from Lada Adamic

Is it really that simple?

Internet?

terrorist/criminal networks?

Page 27: Lecture 28 Network resilience Slides are modified from Lada Adamic

Power grid

Electric power flows simultaneously through multiple paths in the network.

For visualization of the power grid, check out NPR’s interactive visualization: http://www.npr.org/templates/story/story.php?storyId=110997398

Page 28: Lecture 28 Network resilience Slides are modified from Lada Adamic

Power grid

Electric power does not travel just by the shortest route from source to sink, but also by parallel flow paths through other parts of the system.

Where the network jogs around large geographical obstacles, such as the Rocky Mountains in the West or the Great Lakes in the East, loop flows around the obstacle are set up that can drive as much as 1 GW of power in a circle, taking up transmission line capacity without delivering power to consumers.

Source: Eric J. Lerner, http://www.aip.org/tip/INPHFA/vol-9/iss-5/p8.html

Page 29: Lecture 28 Network resilience Slides are modified from Lada Adamic

Cascading failures

Each node has a load and a capacity that says how much load it can tolerate.

When a node is removed from the network its load is redistributed to the remaining nodes.

If the load of a node exceeds its capacity, then the node fails

Page 30: Lecture 28 Network resilience Slides are modified from Lada Adamic

Case study: North American power grid

Nodes: generators, transmission substations, distribution substations

Edges: high-voltage transmission lines

14,099 substations: NG 1,633 generators,

ND 2,179 distribution substations

NT the rest transmission substations

19,657 edges

Modeling cascading failures in the North American power grid

R. Kinney, P. Crucitti, R. Albert, and V. Latora, Eur. Phys. B, 2005

Page 31: Lecture 28 Network resilience Slides are modified from Lada Adamic

Degree distribution is exponential

Source: Albert et al., ‘Structural vulnerability of the North American power grid

Page 32: Lecture 28 Network resilience Slides are modified from Lada Adamic

Efficiency of a path

efficiency e [0,1], 0 if no electricity flows between two endpoints, 1 if the transmission lines are working perfectly

harmonic composition for a path

1

1

edges edgepath ee

path A, 2 edges, each with e=0.5, epath = 1/4 path B, 3 edges, each with e=0.5 epath = 1/6 path C, 2 edges, one with e=0 the other with e=1, epath = 0

simplifying assumption: electricity flows along most efficient path

Page 33: Lecture 28 Network resilience Slides are modified from Lada Adamic

Efficiency of the network

Efficiency of the network: average over the most efficient paths from each

generator to each distribution station

ij is the efficiency of the most efficient path between i and j

Page 34: Lecture 28 Network resilience Slides are modified from Lada Adamic

capacity and node failure

Assume capacity of each node is proportional to initial load

L represents the weighted betweenness of a node

Each neighbor of a node is impacted as follows

load exceeds capacity

Load is distributed to other nodes/edges The greater a (reserve capacity), the less susceptible the network to

cascading failures due to node failure

Page 35: Lecture 28 Network resilience Slides are modified from Lada Adamic

power grid structural resilience

efficiency is impacted the most if the node removed is the one with the highest load

highest load generator/transmission station removed

Source: Modeling cascading failures in the North American power grid; R. Kinney, P. Crucitti, R. Albert, and V. Latora

Page 36: Lecture 28 Network resilience Slides are modified from Lada Adamic

resilience: power grids and cascading failures

Vast system of electricity generation, transmission & distribution is essentiallya single network

Power flows throughall paths from source to sink(flow calculations areimportant for other networks,even social ones)

All AC lines within an interconnect must be in sync

If frequency varies too much (as line approaches capacity), a circuit breaker takes the generator out of the system

Larger flows are sent to neighboring parts of the grid – triggering a cascading failure

Source: .wikipedia.org/wiki/File:UnitedStatesPowerGrid.jpg

Page 37: Lecture 28 Network resilience Slides are modified from Lada Adamic

Cascading failures 1:58 p.m. The Eastlake, Ohio, First Energy

generating plant shuts down (maintenance problems).

3:06 p.m. A First Energy 345-kV transmission line fails south of Cleveland, Ohio.

3:17 p.m. Voltage dips temporarily on the Ohio portion of the grid. Controllers take no action, but power shifted by the first failure onto another power line causes it to sag into a tree at 3:32 p.m., bringing it offline as well. While Mid West ISO and First Energy controllers try to understand the failures, they fail to inform system controllers in nearby states.

3:41 and 3:46 p.m. Two breakers connecting First Energy’s grid with American Electric Power are tripped.

4:05 p.m. A sustained power surge on some Ohio lines signals more trouble building.

4:09:02 p.m. Voltage sags deeply as Ohio draws 2 GW of power from Michigan.

4:10:34 p.m. Many transmission lines trip out, first in Michigan and then in Ohio, blocking the eastward flow of power. Generators go down, creating a huge power deficit. In seconds, power surges out of the East, tripping East coast generators to protect them.

Source: Eric J. Lerner, “What's wrong with the electric grid?” http://www.aip.org/tip/INPHFA/vol-9/iss-5/p8.html

Page 38: Lecture 28 Network resilience Slides are modified from Lada Adamic

(dis) information cascades

Rumor spreading Urban legends Word of mouth

(movies, products)

Web is self-correcting: Satellite image hoax

is first passed around, then exposed, hoax fact is blogged about, then written up on urbanlegends.about.com

Source: undetermined

Page 39: Lecture 28 Network resilience Slides are modified from Lada Adamic

Actual satellite images of the effect of the blackout

20 hoursprior toblackout

7 hours after blackout

Source: NOAA, U.S. Government

Page 40: Lecture 28 Network resilience Slides are modified from Lada Adamic

Biological networks

In biological systems nodes and edges can represent different things nodes

protein, gene, chemical (metabolic networks)

edges mass transfer, regulation

Can construct bipartite or tripartite networks: e.g. genes and proteins

Page 41: Lecture 28 Network resilience Slides are modified from Lada Adamic

genome

proteome

metabolism

bio-chemical reactions

protein-protein interaction networks

gene regulatory networks:

protein-gene interactions

types of biological networks

Page 42: Lecture 28 Network resilience Slides are modified from Lada Adamic

protein-protein interaction networks

Properties giant component exists longer path length than

randomized higher incidence of short

loops than randomized

Source: Jeong et al, ‘Lethality and centrality in protein networks’

Page 43: Lecture 28 Network resilience Slides are modified from Lada Adamic

protein interaction networks

Properties power law distribution with an exponential cutoff higher degree proteins are more likely to be essential

Source: Jeong et al, ‘Lethality and centrality in protein networks’

Page 44: Lecture 28 Network resilience Slides are modified from Lada Adamic

resilience of protein interaction networks

if removed:

lethal

non-lethal

slow growth

unknown

Source: Jeong et al, ‘Lethality and centrality in protein networks’

Page 45: Lecture 28 Network resilience Slides are modified from Lada Adamic

Implications

Robustness resilient to random breakdowns mutations in hubs can be deadly

gene duplication hypothesis new gene still has same output protein, but no selection

pressure because the original gene is still present Some interactions can be added or dropped

leads to scale free topology

Page 46: Lecture 28 Network resilience Slides are modified from Lada Adamic

gene duplication

When a gene is duplicated every gene that had a connection to

it, now has connection to 2 genes preferential attachment at work…

Source: Barabasi & Oltvai, Nature Reviews 2003

Page 47: Lecture 28 Network resilience Slides are modified from Lada Adamic

Disease Network

source: Goh et al. The human disease network

Page 48: Lecture 28 Network resilience Slides are modified from Lada Adamic

Q: do you expect disease genes to be the essential genes?

source: Goh et al. The human disease network

- genetic origins of most diseases are

shared with other diseases

- most disorders relate to a few disease genes

Page 49: Lecture 28 Network resilience Slides are modified from Lada Adamic

Q: where do you expect disease genes to be positioned in the gene network

source: Goh et al. The human disease network

Page 50: Lecture 28 Network resilience Slides are modified from Lada Adamic

Metabolic networks are scale-free

In the bi-partite graph: the probability that a

given substrate participates in k reactions is k

indegree: = 2.2

outdegree: = 2.2

(a) A. fulgidus (Archae) (b) E. coli (Bacterium) (c) C. elegans (Eukaryote), (d) averaged over 43 organisms

Source: Jeong et al., Nature 407, 651-654 (5 October 2000) | doi:10.1038/35036627

Page 51: Lecture 28 Network resilience Slides are modified from Lada Adamic

Is there more to biological networks than degree distributions?

No modularity

Modularity

Hierarchical modularity

Source: E. Ravasz et al., Hierarchical Organization of Modularity in Metabolic Networks

Page 52: Lecture 28 Network resilience Slides are modified from Lada Adamic

clustering coefficients in different topologies

Source: Barabasi & Oltvai, Nature Reviews 2003

Page 53: Lecture 28 Network resilience Slides are modified from Lada Adamic

How do we know that metabolic networks are modular?

clustering decreases with degree as C(k)~ k-1

randomized networks (which preserve the power law degree distribution) have a clustering coefficient independent of degree

Source: E. Ravasz et al., Hierarchical Organization of Modularity in Metabolic Networks

Page 54: Lecture 28 Network resilience Slides are modified from Lada Adamic

How do we know that metabolic networks are modular?

clustering coefficient is the same across metabolic networks in different species with the same substrate

corresponding randomized scale free network:C(N) ~ N-0.75 (simulation, no analytical result)

bacteria

archaea (extreme-environment single cell organisms)

eukaryotes (plants, animals, fungi, protists)

scale free network of the same size

Source: E. Ravasz et al., Hierarchical organization in complex networks

Page 55: Lecture 28 Network resilience Slides are modified from Lada Adamic

Constructing a hierarchically modular network

RSMOB model Start from a fully

connected cluster of nodes

Create 4 identical replicas of the cluster, linking the outside nodes of the replicas to the center node of the original (N = 25 nodes)

This process can repeated indefinitely

(initial number of nodes can be different than 5)

Source: Ravasz and Barabasi, PRE 67 026112, 2003, doi: 10.1103/PhysRevE.67.026112

Page 56: Lecture 28 Network resilience Slides are modified from Lada Adamic

Properties of the hierarchically modular model

RSMOB model Power law exponent = 2.26

in agreement with real world metabolic networks

C ≈ 0.6, independent of network size also comparable with observed real-world values

C(k) ≈ k-1, as in real world network

How to test for hierarchically arranged modules in real world networks perform hierarchical clustering on the topological overlap map can be done with Pajek

Page 57: Lecture 28 Network resilience Slides are modified from Lada Adamic

Discovering hierarchical structure using topological overlap

A: Network consisting of nested modules B: Topological overlap matrix

hierarchical

clustering

Source: E. Ravasz et al., Science 297, 1551 -1555 (2002)

Page 58: Lecture 28 Network resilience Slides are modified from Lada Adamic

Modularity and the role of hubs

Party hub: interacts simultaneously within the same module

Date hub: sequential interactions connect different modules

connect biological processes

Source: Han et al, Nature 443, 88 (2004)

Page 59: Lecture 28 Network resilience Slides are modified from Lada Adamic

Q: which type of hub is more likely to be essential?

Page 60: Lecture 28 Network resilience Slides are modified from Lada Adamic

metabolic network of e. coli

Source: Guimera & Amaral, Functional cartography of complex metabolic networks

Page 61: Lecture 28 Network resilience Slides are modified from Lada Adamic

summing it up

resilience depends on topology

also depends on what happens when a node fails e.g. in power grid load is redistributed in protein interaction networks other proteins may

start being produced or cease to do so

in biological networks, more central nodes cannot be done without