Complex networks and decentralized search algorithms By Jon Kleinberg Bo Young Kim Applied Algorithm Lab


Page 1: Complex networks and decentralized search algorithms

By Jon Kleinberg

Bo Young Kim, Applied Algorithm Lab

Page 2: Complex Network: What is a “Complex Network”?

Research on large-scale network structure

- Importance: the limits of reductionism
- Spans mathematics, computer science, social science and biological science
  - Computer science: the Internet, the WWW
  - Social science: social networks
  - Biological science: interactions in the pathways of a cell’s metabolism; neurology (e.g. neural burst modeling)

Page 3: Complex Network: History 1

- Euler (1736): graph theory
- New problem: How is a network created? What rules govern its topology and structure?

Page 4: Complex Network: Goal

- Observe a property of real-world networks
- Model it (with a network produced by a random mechanism)
- Reproduce other properties (which may also be observed in real-world networks)
- We can explain and predict!

Page 5: Complex Network: History 2

- Erdős, Rényi (1959): random graph theory
- Different systems have different rules: intentionally ignored
- Connect each pair of nodes at random
- Giant component (phase transition): G(n,p) where p = c/n (c: const.)
  1) c < 1: G(n,p) a.a.s. consists of small components, all of which have O(log n) vertices
  2) c > 1: a.a.s. there is a unique large component, which consists of Θ(n) vertices
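The phase transition at c = 1 can be observed numerically. The following is a minimal sketch, not part of the original slides: the function name `largest_component_size`, the union-find bookkeeping and the values of n and c are illustrative assumptions.

```python
import random
from collections import Counter


def largest_component_size(n, c, seed=0):
    """Size of the largest connected component in one sample of G(n, p), p = c/n."""
    rng = random.Random(seed)
    p = c / n
    parent = list(range(n))  # union-find forest over the n vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Include each of the n*(n-1)/2 possible edges independently with probability p.
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv

    sizes = Counter(find(x) for x in range(n))
    return max(sizes.values())


if __name__ == "__main__":
    n = 2000  # illustrative size
    for c in (0.5, 2.0):
        print(c, largest_component_size(n, c))
```

For c below 1 the largest component should stay small compared with n, while for c above 1 it should cover a constant fraction of the vertices. A single sample at moderate n only illustrates the asymptotic statement.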

Page 6: The small-world phenomenon: six degrees of separation

“I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet. The President of the United States, a gondolier in Venice, just fill in the names.”

- From <Six Degrees of Separation> by Guare (1990), or <Chains> by Karinthy (1929)
- Yuna Kim, and you: just fill in the names.

Page 7: The small-world phenomenon: Milgram’s experiment

- Stanley Milgram’s experiment (1967)
- Goal: find the “distance” between two people in America
- Target person: a stockbroker in Boston
- Starters chosen more or less at random (Wichita, Kansas; Omaha, Nebraska)
- Each starter receives personal information about the target
- Each holder forwards the letter to a person they know on a first-name basis
- The median length among the completed paths was 5.5 (42/160)
- We are living in a small world!

Page 8: The small-world phenomenon: other examples

- Examples: social networks, the Web, biology, …
- Barabási (1998): the Web has 19 degrees of separation
  d = 0.35 + 2 log N (d: average distance, N: number of web pages)
- A general phenomenon observed in many networks
- Caution: it doesn’t mean we can find someone or something easily. (We don’t know the shortest path.)
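As a quick check of where the figure 19 comes from, the formula can be evaluated directly. This sketch rests on assumptions that are not on the slide: the logarithm is taken base 10, the Web is taken to have N = 8×10^8 pages, and the default slope 2.06 is the coefficient of the published fit, which the slide's 2 appears to round.

```python
import math


def average_web_distance(num_pages, slope=2.06, intercept=0.35):
    """Evaluate d = intercept + slope * log10(num_pages)."""
    return intercept + slope * math.log10(num_pages)


if __name__ == "__main__":
    n_pages = 8e8  # assumed number of web pages; not stated on the slide
    print(round(average_web_distance(n_pages), 2))             # slope 2.06
    print(round(average_web_distance(n_pages, slope=2.0), 2))  # slope as on the slide
```

Both slopes give an average distance between 18 and 19 clicks for this N. The logarithmic dependence on N is the point of the slide: multiplying the number of pages by ten adds only about two clicks.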

Page 9: The small-world phenomenon: Result

1. Such short chains are ubiquitous.
2. Individuals operating with purely local information are very adept at finding these chains. (using “analysis”)

Length of the shortest path ≤ 6

Page 10: Basic models of small-world networks in classic graph theory

Thm (Bollobás, de la Vega, 1982) Fix a constant k ≥ 3. If we choose u.a.r. from the set of all n-node graphs in which each node has degree exactly k, then with high probability every pair of nodes will be joined by a path of length O(log n).

Page 11: Basic models of small-world networks in classic graph theory

Thm (Bollobás, Chung, 1988) Consider a graph G formed by adding a random matching to an n-node cycle (assume n is even, pair up the nodes on the cycle u.a.r. and add edges between each of these node pairs). With high probability, every pair of nodes will be joined by a path of length O(log n).
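The cycle-plus-matching construction is easy to build and measure. The sketch below is an illustration under assumptions (the function names and the brute-force BFS diameter are not from the slides). It pairs up the nodes by shuffling them and matching consecutive entries, which yields a uniformly random perfect matching.

```python
import random
from collections import deque


def cycle_with_random_matching(n, seed=0):
    """Adjacency lists of an n-cycle plus a uniformly random perfect matching (n even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    rng = random.Random(seed)
    adj = [[(v - 1) % n, (v + 1) % n] for v in range(n)]
    nodes = list(range(n))
    rng.shuffle(nodes)
    # Consecutive entries of a random permutation form a uniform random matching.
    for i in range(0, n, 2):
        a, b = nodes[i], nodes[i + 1]
        adj[a].append(b)
        adj[b].append(a)
    return adj


def eccentricity(adj, source):
    """Largest BFS distance from source to any node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def diameter(adj):
    """Maximum eccentricity over all nodes (one BFS per node)."""
    return max(eccentricity(adj, s) for s in range(len(adj)))
```

The n-cycle alone has diameter n/2, so comparing `diameter(cycle_with_random_matching(n))` with n/2 for a few even values of n shows how much one random edge per node shortens the paths.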

Page 12: Basic models of small-world networks: Watts-Strogatz model

- Granovetter (1973): existence of clusters
- Real networks are highly clustered (e.g. Erdős numbers); the Erdős-Rényi model has no clustering, so it needs to be modified!
- Watts-Strogatz model (1998, Nature)
  - n×n grid-based model
  - For each node v, one extra directed edge to some other node w chosen u.a.r. (w: long-range contact, as opposed to the local contacts)
  - Superposition of structured and random links
- Trade-off: clustering ⇔ no clustering; large world ⇔ small world

Page 13: Basic models of small-world networks: Watts-Strogatz model

Figure: Two-dimensional grid with a single random shortcut superimposed.

Figure: Two-dimensional grid with many random shortcuts superimposed (as in the Watts-Strogatz model).

Page 14: Decentralized search in small-world networks

Decentralized search algorithm: an algorithm that finds efficient paths to a destination using purely local information, i.e. an algorithm that searches for a short path under the following rule: at each step, the holder of the message must pass it across one of its connections. (In the grid model, the current holder doesn’t know the long-range connections of nodes that have not touched the message.)

Thm (Kleinberg, 2000) The delivery time of any decentralized algorithm in the grid-based model is Ω(n^{2/3}).

Page 15: Decentralized search in small-world networks

- Extended model (Kleinberg, 2000): the Watts-Strogatz model admits no decentralized algorithm that finds short paths.
- A parameter α ≥ 0 controls how strongly the long-range links are correlated with the geometry of the underlying grid.
- Grid distance ρ(v,w): for each v, choose the long-range contact w with probability proportional to ρ(v,w)^{-α}.
  - α = 0: the Watts-Strogatz model (w chosen u.a.r.)
  - α small: long-range links are too random
  - α large: long-range links are not random enough
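One way to see how α trades off the distance scales is to add up the link probability over all nodes whose distance from v lies between d and 2d. The sketch below assumes that about 4j nodes lie at grid distance exactly j from v, which holds on the two-dimensional grid away from the boundary.

```latex
\begin{align*}
\Pr[v \to w] &= \frac{\rho(v,w)^{-\alpha}}{\sum_{u \neq v} \rho(v,u)^{-\alpha}}, \\
\Pr\bigl[d \le \rho(v,w) < 2d\bigr] &\propto \sum_{j=d}^{2d-1} 4j \cdot j^{-\alpha}
  = 4 \sum_{j=d}^{2d-1} j^{1-\alpha}, \\
\alpha = 2:\quad 4 \sum_{j=d}^{2d-1} \frac{1}{j} &\approx 4 \ln 2
  \quad \text{(independent of } d\text{)}.
\end{align*}
```

For α < 2 the sum grows roughly like d^{2-α}, so most long-range links span the largest scales. For α > 2 it shrinks with d, so most links stay close to v. Only at α = 2 is the long-range link about equally likely to land in every distance scale.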

Page 16: Decentralized search in small-world networks

Thm (Kleinberg, 2000)
1. 0 ≤ α < 2: the delivery time of any decentralized algorithm in the grid-based model is Ω(n^{(2-α)/3}).
2. α = 2: there is a decentralized algorithm with delivery time O(log^2 n).
3. α > 2: the delivery time of any decentralized algorithm in the grid-based model is Ω(n^{(α-2)/(α-1)}).
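A simple decentralized algorithm for this model is greedy routing: the current holder forwards the message to whichever of its contacts is closest to the target in grid distance. The following is a minimal simulation sketch under assumptions. The function names are illustrative. Each node's long-range contact is sampled only when the message reaches it, which is equivalent to fixing all contacts in advance because greedy routing never visits a node twice.

```python
import random


def grid_distance(a, b):
    """Lattice (Manhattan) distance rho between grid points a and b."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sample_long_range_contact(v, n, alpha, rng):
    """Pick w != v on the n x n grid with probability proportional to rho(v, w)**(-alpha)."""
    others = [(x, y) for x in range(n) for y in range(n) if (x, y) != v]
    weights = [grid_distance(v, w) ** (-alpha) for w in others]
    return rng.choices(others, weights=weights, k=1)[0]


def local_contacts(v, n):
    """The up-to-four grid neighbours of v."""
    x, y = v
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(a, b) for a, b in candidates if 0 <= a < n and 0 <= b < n]


def greedy_delivery_time(s, t, n, alpha, rng):
    """Number of steps greedy routing takes to carry a message from s to t."""
    current, steps = s, 0
    while current != t:
        contacts = local_contacts(current, n)
        contacts.append(sample_long_range_contact(current, n, alpha, rng))
        # Purely local decision: move to the known contact closest to the target.
        current = min(contacts, key=lambda w: grid_distance(w, t))
        steps += 1
    return steps


def mean_delivery_time(n, alpha, trials, seed=0):
    """Average greedy delivery time over random source/target pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        s = (rng.randrange(n), rng.randrange(n))
        t = (rng.randrange(n), rng.randrange(n))
        total += greedy_delivery_time(s, t, n, alpha, rng)
    return total / trials
```

Some local contact always reduces the distance to the target by one, so the loop terminates after at most ρ(s,t) steps. Comparing `mean_delivery_time` for α = 0, 2 and 4 at increasing n gives a feel for the theorem, though the asymptotic separation may not be visible on small grids.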

Page 17: Decentralized search in small-world networks

Figure: A node with several random shortcuts spanning different distance scales.

Page 18: Decentralized search in other models: 1. Hierarchical models

- The network is embedded in a hierarchy; nodes reside at the leaves of a complete b-ary tree.
- Natural variation: Milgram’s experiment, Web pages

Figure: example hierarchies. Arts > Music > Opera > Verdi’s Aida; Science > Biology > Genetics > Yeast genome.

Page 19: Decentralized search in other models: 1. Hierarchical models

- Def (b-ary tree): a tree with no more than b children for each node.
- Def (depth of a node): the distance from the node to the root of the tree.
- Def (complete b-ary tree): a b-ary tree in which all leaf nodes are at the same depth and all internal nodes have b children.

Page 20: Decentralized search in other models: 1. Hierarchical models

- Natural assumption: the density of links is lower for node pairs that are more widely separated in the underlying hierarchy.
- Hierarchical model with exponent β:
  - Complete b-ary tree with n leaves (h = log_b n)
  - Tree distance h(v,w) = the height of the lowest common ancestor of v and w
  - Define a random graph G on the set V of leaves
  - k edges out of each v
  - w is chosen as the endpoint of the i-th edge independently with probability proportional to b^{-βh(v,w)} (β ≥ 0)
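The link distribution of the hierarchical model can be sketched in a few lines. The code below makes assumptions that are not on the slide: leaves are numbered 0..n-1 from left to right, a node never links to itself, and the function names are illustrative.

```python
import random


def lca_height(v, w, b):
    """Height h(v, w) of the lowest common ancestor of leaves v and w.

    With leaves numbered left to right, the ancestor of leaf v at height j
    is identified by v // b**j, so we climb until the two ancestors agree.
    """
    height = 0
    while v != w:
        v //= b
        w //= b
        height += 1
    return height


def sample_out_edges(v, n, b, beta, k, rng):
    """Draw k endpoints for v, each independently with probability
    proportional to b**(-beta * h(v, w)) over all leaves w != v."""
    others = [w for w in range(n) if w != v]  # assumption: no self-links
    weights = [b ** (-beta * lca_height(v, w, b)) for w in others]
    return rng.choices(others, weights=weights, k=k)


if __name__ == "__main__":
    rng = random.Random(0)
    # 16 leaves of a binary tree, exponent beta = 1, out-degree 5 (illustrative values)
    print(sample_out_edges(0, 16, 2, 1.0, 5, rng))
```

Larger β concentrates the edges on leaves that share a low common ancestor with v, while β = 0 picks endpoints uniformly at random.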

Page 21: Decentralized search in other models: 1. Hierarchical models

- Starting node s, target node t
- The algorithm must construct a path from s to t
- It knows only the edges out of the nodes that it explicitly visits
- Caution: G may not contain a path from s to t.
- Def (delivery time f(n)): a decentralized algorithm has delivery time f(n) ↔ on a randomly generated n-node network, with s and t chosen u.a.r., the algorithm produces a path of length O(f(n)) with probability at least 1 - ε(n), where ε(n) → 0 as n → ∞.

Page 22: Decentralized search in other models: 1. Hierarchical models

Thm (Kleinberg, 2001)
(a) In the hierarchical model with exponent β = 1 and out-degree k = c log^2 n, for a sufficiently large const. c, ∃ a decentralized algorithm with polylogarithmic delivery time.
(b) ∀ β ≠ 1 and every polylogarithmic function k(n), there is no decentralized algorithm (in the hierarchical model with exponent β and out-degree k(n)) that achieves polylogarithmic delivery time.
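A short calculation suggests why β = 1 is the critical exponent. It uses one counting fact about the complete b-ary tree: for a fixed leaf v, exactly (b-1)b^{j-1} leaves w have h(v,w) = j, for j = 1, …, log_b n. The normalizing constant Z below is notation introduced here for illustration.

```latex
\begin{align*}
Z &= \sum_{w \neq v} b^{-\beta h(v,w)}
   = \sum_{j=1}^{\log_b n} (b-1)\, b^{j-1}\, b^{-\beta j}
   = \frac{b-1}{b} \sum_{j=1}^{\log_b n} b^{(1-\beta) j}, \\
\beta = 1:\quad Z &= \frac{b-1}{b} \log_b n,
\qquad
\Pr\bigl[h(v,w) = j\bigr] = \frac{(b-1)\, b^{j-1}\, b^{-j}}{Z} = \frac{1}{\log_b n}.
\end{align*}
```

At β = 1 each edge is therefore equally likely to land at every level of the hierarchy. For β < 1 the terms with large j dominate and most edges cross the top of the tree, while for β > 1 most edges stay among the closest leaves.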

Page 23: Decentralized search in other models: 1. Hierarchical models

Watts, Dodds and Newman (2002) independently proposed a similar model.

Page 24: Design principles and network data: 1. Peer-to-peer systems and focused web crawling

- Napster and music file sharing (1999): centralized index ↔ decentralized algorithm
- Focused web crawler ↔ standard web search engine

Page 25: Design principles and network data: 2. Social network data

- (Adamic, Adar, 2005) E-mail network observation: g(v,w)^{-3/4} compared with g(v,w)^{-1}.
- (Liben-Nowell, 2005) LiveJournal observation: rank-based friendship

Thm (Liben-Nowell, 2005) For an arbitrary population density on a grid, the expected delivery time of the decentralized greedy algorithm in the rank-based friendship model is O(log^3 n).
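The slide does not define rank-based friendship. In the usual formulation, v chooses w as a friend with probability proportional to 1/rank_v(w), where rank_v(w) counts how many people live closer to v than w does. The sketch below is a simplified illustration under that assumption: it uses 1-based sort positions as ranks, breaks ties by sort order, and measures closeness by grid distance. The function name and the `positions` mapping are hypothetical.

```python
import random


def rank_based_contact(v, positions, rng):
    """Pick a contact for v with probability proportional to 1 / rank_v(w).

    positions maps each person to an (x, y) grid point. rank_v(w) is the
    1-based position of w when everyone else is sorted by grid distance
    from v (ties broken by sort order, a simplification).
    """
    vx, vy = positions[v]
    others = sorted(
        (w for w in positions if w != v),
        key=lambda w: abs(positions[w][0] - vx) + abs(positions[w][1] - vy),
    )
    weights = [1.0 / rank for rank in range(1, len(others) + 1)]
    return rng.choices(others, weights=weights, k=1)[0]


if __name__ == "__main__":
    people = {"a": (0, 0), "b": (0, 1), "c": (5, 5), "d": (9, 9)}  # toy data
    rng = random.Random(0)
    print(rank_based_contact("a", people, rng))
```

Because the weights depend on rank rather than on raw distance, the rule adapts to uneven population density: the nearest neighbor gets the same weight whether it is one block or one hundred kilometers away.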

Page 26: Recall: Flavor of research in this area

- An experiment in the social sciences highlights a fundamental and non-obvious property of networks (efficient searchability in this case)
- Random graph modeling and analysis; measurement on large-scale data
- Further results and questions in algorithms, graph theory and discrete probability

Page 27: References

- J. Kleinberg. Navigation in a Small World. Nature 406 (2000), 845.
- J. Kleinberg. The Small-World Phenomenon and Decentralized Search. A short essay as part of Math Awareness Month 2004, appearing in SIAM News 37(3), April 2004.
- J. Kleinberg. Complex Networks and Decentralized Search Algorithms. Proceedings of the International Congress of Mathematicians (ICM), 2006.
- A.-L. Barabási. Linked: How Everything Is Connected to Everything Else and What It Means. 2002.
- N. Alon, J. H. Spencer. The Probabilistic Method, 2nd Edition. 2000.

Page 28: Thanks