Searching via Your Neighbor’s Neighbor: The Power of Lookahead in P2P Networks


Moni Naor, Udi Wieder

The Weizmann Institute of Science

Gurmeet Manku

Stanford

The Small World Phenomena

a very brief history

Folklore: people are connected via short chains, i.e. the graph of social networks has small diameter. According to Barabasi, the belief may have originated from a story by Frigyes Karinthy, 1929.

Quantitative approach: initiated by Milgram in the 1960’s (“the six degrees of separation”).

Mathematical modeling: model a social network by some distribution on graphs.

A precursor of P2P: the need to locate a resource in a ‘natural’ network based on partial information.

P2P = Peer-to-Peer = a highly dynamic network

Routing in a Small World

Common question: do short paths exist?

Kleinberg’s algorithmic question: assuming short paths exist, how do people find them?

Modeling Small Worlds

Kleinberg’s model [2000]:
People are points on a two-dimensional grid.
Grid edges (short range).
One long-range contact per node, chosen with the harmonic distribution: the probability of (u,v) is proportional to $1/d(u,v)^2$.

Naturally generalizes to k long-range links (Symphony [MBR03], [ADS02]).

Naturally generalizes to any dimension.

Captures the intuitive notion that people know people who are close to them.
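The harmonic choice of a long-range contact can be sketched as follows. This is a minimal illustration on a small grid, assuming L1 distance; the function names and the `side` parameter are hypothetical and not taken from the talk.

```python
import random

def l1(u, v):
    """L1 (Manhattan) distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def long_range_weights(u, side):
    """Unnormalized weights 1/d(u,v)^2 for every node v != u on a side x side grid."""
    return {
        (x, y): 1.0 / l1(u, (x, y)) ** 2
        for x in range(side)
        for y in range(side)
        if (x, y) != u
    }

def sample_long_range_contact(u, side, rng=random):
    """Draw one long-range contact for u with probability proportional to 1/d(u,v)^2."""
    weights = long_range_weights(u, side)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[v] for v in nodes], k=1)[0]
```

Calling `sample_long_range_contact` k times per node would give the k-link generalization mentioned above.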

Modeling Small Worlds

Small World Percolation:
People are points on a two-dimensional grid.
Grid edges (short range).
Each edge appears independently with probability equal to the inverse of its distance squared.
The degree of each node is $\Theta(\log n)$.
Originates from the long-range percolation model.

Shares structural properties with some popular randomized P2P networks: R-Chord, R-Hypercube, Skip Lists…
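A small sketch of the percolation model, assuming L1 distance on a side x side grid. The names `expected_degree` and `sample_percolation_edges` are illustrative only. Edges at distance 1 get probability 1, so the grid edges are always present.

```python
import random

def l1(u, v):
    """L1 (Manhattan) distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def expected_degree(u, side):
    """Expected number of edges at u: the sum of 1/d(u,v)^2 over all other nodes v."""
    return sum(
        1.0 / l1(u, (x, y)) ** 2
        for x in range(side)
        for y in range(side)
        if (x, y) != u
    )

def sample_percolation_edges(side, rng):
    """Include each pair of nodes independently with probability 1/d^2."""
    nodes = [(x, y) for x in range(side) for y in range(side)]
    edges = set()
    for a in range(len(nodes)):
        for b in range(a + 1, len(nodes)):
            d = l1(nodes[a], nodes[b])
            if rng.random() < 1.0 / d ** 2:
                edges.add((nodes[a], nodes[b]))
    return edges
```

Evaluating `expected_degree` for growing `side` gives a feel for the logarithmic growth of the degree.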

Routing in Small Worlds

Greedy algorithm: move to the node that minimizes the L1 distance to the target.
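A minimal sketch of Greedy routing, assuming the graph is given as a dictionary from a node (a coordinate tuple) to its list of neighbors. The representation and the name `greedy_route` are assumptions made for illustration.

```python
def l1(u, v):
    """L1 distance between two coordinate tuples."""
    return sum(abs(a - b) for a, b in zip(u, v))

def greedy_route(graph, source, target, dist=l1):
    """Repeatedly move to the neighbor closest to the target.

    Returns the path taken. Stops early if no neighbor is strictly closer
    than the current node (cannot happen when all grid edges are present).
    """
    path = [source]
    current = source
    while current != target:
        best = min(graph[current], key=lambda w: dist(w, target), default=None)
        if best is None or dist(best, target) >= dist(current, target):
            break
        current = best
        path.append(current)
    return path
```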

Scheme | Degree | Greedy path length
Kleinberg’s model, P2P [MBR03], [ADS02] | $k \le \log n$ | $\Theta(\log^2 n / k)$
Percolation Small World, R-Chord, R-Hypercube | $\Theta(\log n)$ | $O(\log n)$
Skip Lists, Skip Nets [AS02], [HDJ+03] | $\Theta(\log n)$ | $O(\log n)$

Properties of Greedy

Simple: easy to understand and to implement.
Local: if source and target are close, the path remains within a small area.
In some cases (Hypercube, Chord) it is the best we can do.
Not optimal with respect to the degree.


Can Greedy Routing be shortened?

Without compromising the good properties.

Neighbor of Neighbor (NoN) Routing

Each node has a list of its neighbor’s neighbors.
The message is routed greedily to the closest neighbor of neighbor (2 hops).

Let w1, w2, …, wk be the neighbors of the current node u.
For each wi, find zi, the neighbor of wi closest to the target t.
Let j be such that zj is the closest to the target t.
Route the message from u via wj to zj.

Effectively it is Greedy routing on the squared graph.

The first hop may not be a greedy choice.
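The NoN steps above can be sketched as follows. This is an illustration under stated assumptions: the graph is a dictionary from node to neighbor list, `dist` is supplied by the caller, and two guards are added that the slide does not spell out (deliver directly when the target is a neighbor, and stop when no two-hop move makes progress).

```python
def non_greedy_route(graph, source, target, dist):
    """Route via the neighbor-of-neighbor that is closest to the target."""
    path = [source]
    u = source
    while u != target:
        if target in graph[u]:      # assumption: deliver directly when possible
            path.append(target)
            break
        best = None                 # (distance of z to target, w, z)
        for w in graph[u]:
            for z in graph[w]:
                d = dist(z, target)
                if best is None or d < best[0]:
                    best = (d, w, z)
        if best is None or best[0] >= dist(u, target):
            break                   # assumption: give up when no two-hop progress
        _, w, z = best
        path.extend([w, z])         # two hops: u -> w -> z
        u = z
    return path
```

In a line graph where node 0 has a long edge to 4 and node 1 has a long edge to 9, a message from 0 to 10 goes through 1 rather than 4, which shows that the first hop need not be the greedy choice.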

Previous incarnations of the approach:
Coppersmith, Gamarnik and Sviridenko [2002]: proved an upper bound on the diameter of a small world graph. No routing algorithm.
Manku, Bawa and Raghavan [2003]: a heuristic routing algorithm in ‘Symphony’, a Small-World-like P2P network.

What can we show about NoN Greedy?

PSW, R-Chord, R-Hypercube are degree optimal w.h.p.
Skip Lists: degree optimal in expectation.
Kleinberg’s model and P2P variations: improved.
Lower bounds for algorithms based on neighbor lists only (Greedy is a special case).


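Scheme | Degree | Greedy path length | NoN Greedy path length
Kleinberg’s model, P2P [MBR03], [ADS02] | $k$ | $\Theta(\log^2 n / k)$ | $\Theta(\log^2 n / (k \log k))$
Percolation Small World, R-Chord, R-Hypercube | $\Theta(\log n)$ | $\Theta(\log n)$ | $\Theta(\log n / \log\log n)$
Skip Lists, Skip Nets [AS02], [HDJ+03] | $\Theta(\log n)$ | $\Theta(\log n)$ | $\Theta(\log n / \log\log n)$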

Degree Optimal P2P Routing

Different routing schemes

Viceroy [MNR02]: emulates the butterfly network.
Constant degree.
O(log n) hops for routing.

Constructions emulating De Bruijn graphs.
Can achieve any degree/number-of-hops tradeoff.
In particular, degree O(log n) and O(log n / log log n) hops.

Routing is not greedy. A recent construction [AM] fixes that.

Even if target and source are close in label space, the message might be routed away.
No (natural) prefix search.

Random keys are necessary.

Skip Graphs [AS02], [HDJ+03]

Each node (resource) has a name.
Nodes are arranged on a line sorted by name.

Each node chooses a random string of bits.
An edge is established if two nodes share a prefix which is not shared by the nodes between them.
Allows prefix search.

[Figure: a skip graph on nodes a, b, c, d, e, f, each labeled with its random bit string.]
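The edge rule for skip graphs can be sketched as below, assuming every node has a bit string of the same length and nodes are indexed by their position in name order. The function names and the (i, j, k) edge format are illustrative choices.

```python
import random

def random_bit_strings(n, length, rng):
    """One random bit string per node (all of the same length)."""
    return ["".join(rng.choice("01") for _ in range(length)) for _ in range(n)]

def skip_graph_edges(bits):
    """bits[i] is the bit string of the i-th node in name order.

    Returns a set of (i, j, k) with i < j: nodes i and j share their first k
    bits and no node between them shares those k bits.
    """
    edges = set()
    for k in range(len(bits[0]) + 1):
        last_with_prefix = {}   # prefix -> index of the latest node seen with it
        for j, b in enumerate(bits):
            prefix = b[:k]
            if prefix in last_with_prefix:
                edges.add((last_with_prefix[prefix], j, k))
            last_with_prefix[prefix] = j
    return edges
```

With k = 0 the shared prefix is empty, so the level-0 edges are simply the sorted line of nodes.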

Routing in Skip Graphs

Greedy Routing: use the longest edge possible.
Path length is $\Theta(\log n)$ w.h.p.

The NoN algorithm optimizes over two hops.

[Figure: the skip graph example again, with nodes labeled by their random bit strings.]

Skip Graphs – degree optimality

Theorem: Using the NoN algorithm, the expected path length of any lookup is $O(\log n / \log\log n)$.

[Figure: the line from d to 0, with the point $d/\log d$ marked.]

X: the number of two-hop paths between d and the interval $[d/\log d, 0]$.

D: the event that a message reached the node d.

Lemma: $\Pr[(X > 0) \mid D] \ge 1/2$.

Sufficiency of lemma:
Call a NoN 2-hop successful if it reduces the distance from d to $d/\log d$.
Each successful 2-hop divides the distance by $\log d$, so $O(\log n / \log\log n)$ successful 2-hops are needed to get to distance 1.
From the Lemma, this would take $O(\log n / \log\log n)$ 2-hops in expectation.
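To get a feel for the sufficiency argument, the following sketch counts how many successful 2-hops are needed when each one shrinks the distance from d to d / log d. The clamp to a divisor of at least 2 for very small d is an assumption added here so that the loop always makes progress; it is not part of the talk.

```python
import math

def count_successful_hops(d):
    """Number of steps d -> d / log2(d) needed to bring the distance to at most 1."""
    count = 0
    while d > 1:
        d = d / max(2.0, math.log2(d))  # assumption: divide by at least 2
        count += 1
    return count
```

For a distance of 2^16 this takes fewer steps than the 16 halvings that a distance-halving route would need.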

$A_{i,j}$: there exists an edge between i and j.

Lemma: $\frac{c_1}{|i-j|} \le \Pr[A_{i,j}] \le \frac{c_2}{|i-j|}$

Proof: For a prefix of length k, the probability of an edge is:

$2^{-k} \cdot (1 - 2^{-k})^{|i-j|-1}$

Let k be $\log(|i-j|)$.

Want to show $\Pr[(X > 0) \mid D] \ge 1/2$. Ignore the dependence on D.

[Figure: the line from d to 0, with the interval $[d/\log d, 0]$ marked.]

$E[X] \ge \frac{d}{\log d} \cdot \sum_{i=1}^{n} \frac{c_1}{i(n-i)} \ge 5$ (choice of constants)
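As a worked step for the lower bound in the edge-probability lemma: a sketch assuming $m = |i-j|$ is a power of two, so that $k = \log_2 m$ is an integer, and counting only the edge at that single level. The constant $1/e$ is what this simplified calculation yields, not necessarily the $c_1$ used in the talk.

```latex
\Pr[A_{i,j}] \;\ge\; 2^{-k}\,(1 - 2^{-k})^{m-1}
  \;=\; \frac{1}{m}\Bigl(1 - \frac{1}{m}\Bigr)^{m-1}
  \;\ge\; \frac{1}{e\,m},
\qquad m = |i-j|,\quad k = \log_2 m .
```

The last inequality uses $(1 - 1/m)^{m-1} \ge 1/e$ for every $m \ge 1$.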

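[Figure: the line from d to 0, with intermediate nodes i, j and endpoints x, y inside $[d/\log d, 0]$.]

$A_{i,j}$: there exists an edge between i and j. $A_{d,i,x}$: the two-hop path from d via i to x exists.

Careful calculation (deal with dependencies):

$\mathrm{cov}[A_{d,i,x}, A_{d,j,y}] \le \frac{1}{2} \Pr[A_{d,i,x}] \cdot \Pr[A_{d,j,y}]$

Which implies:

$\mathrm{var}[X] \le E[X] + \frac{1}{2} E^2[X]$

By Chebyshev’s inequality and $E[X] \ge 5$:

$\Pr[X = 0] \le \frac{\mathrm{var}[X]}{E^2[X]} \le \frac{E[X] + \frac{1}{2} E^2[X]}{E^2[X]} = \frac{1}{E[X]} + \frac{1}{2} \le 0.7$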

The Cost/Performance of NoN

Cost of Neighbor of Neighbor lists:
Memory: $O(\log^2 n)$, marginal.
Communication: is it tantamount to squaring the degree?

Neighbor lists should be maintained (open connection, pinging, etc.)

NoN lists should only be kept up-to-date.

Reduce communication by piggybacking updates on top of the maintenance protocol.

Lazy updates: updates occur only when communication load is low. Supported by simulations.
Networks of size $2^{17}$ show a 30-40% improvement.
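Piggybacking NoN updates on the maintenance protocol might look roughly like this. The `Node` class and the message layout are hypothetical, chosen only to illustrate the idea that a keep-alive ping can carry the sender’s neighbor list.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = set()  # names of direct neighbors (kept alive by pings)
        self.non = {}           # neighbor name -> last known copy of its neighbor set

    def make_ping(self):
        """Keep-alive message that also carries this node's current neighbor list."""
        return {"from": self.name, "neighbors": frozenset(self.neighbors)}

    def on_ping(self, ping):
        """Refresh the NoN entry for the sender using the piggybacked list."""
        if ping["from"] in self.neighbors:
            self.non[ping["from"]] = set(ping["neighbors"])
```

No extra messages are sent: the NoN list is only as fresh as the last ping, which matches the point that NoN lists need only be kept up-to-date, not actively maintained.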

Simulation Results

Small World - one dimension

Skip Graphs

Simulation Results

2-dimensional small world

1-dimensional Small World, each edge fails with probability 1/2

A Case for Randomized Topology

Average diameter of the hypercube is $\Omega(\log n)$.
Average diameter of a ‘perfect’ skip graph is $\Omega(\log n)$.
Average diameter of Chord is $\Omega(\log n)$.

Conclusion: the randomization of edges reduces the average path lengths.
Common design rule: reduce randomization in topology.

The long edges are just in the right density, so that NoN finds them without increasing the degree.

Other advantages: security, fault tolerance…

Do People Use the NoN Algorithm?

Experiment based on email [DRW03].

About 25% sent the mail because:

The recipient traveled to target’s geographical region.

The recipient’s family originates from target’s geographical region.

Lower Bounds – A Probing Model

Goal: find a path between two nodes in an unknown graph.
The algorithm may probe a node. If the probing reveals a neighborhood of radius k, then the algorithm is k-local.
A lower bound on the number of probes implies a lower bound on the sequential running time of routing.
The Greedy algorithm is 1-local. NoN is 2-local.

Theorem: Every 1-local algorithm requires $\Omega(\log n)$ probes w.h.p., both in small worlds and in skip graphs.

Conclusion: Some extra information is necessary.

Greedy algorithm dominates 1-local algorithms.

Let A be a 1-local algorithm. Denote by $f_d$ ($g_d$) the r.v. counting the number of probes it takes A (Greedy) to find a path between 0 and d.

Lemma: For all $k, d > 0$: $\Pr[g_d \le k] \ge \Pr[f_d \le k]$

• If a probe finds node i, reveal all edges (prefixes) in $[d, i]$. This only increases $\Pr[f_d \le k]$.

• The ‘best chance’ of getting close to 0 is by probing the node closest to 0.

[Figure: the line from d to 0; the segment between d and i is marked as revealed.]

Lower Bounds on Greedy

Partition the nodes into balls $B_0, B_1, \ldots, B_{\log d}$.

Define $X_i$: the indicator of the event “Greedy probed a node in $B_i$”.

The probe complexity is at least $\sum_{i=0}^{\log n} X_i$.

[Figure: the line from d to 0, partitioned into the balls $B_0, B_1, B_2, B_3$.]

Lemma: Both for skip graphs and small worlds, there exists a constant c such that:
$\Pr[X_i = 1 \mid X_0 = 1, X_1 = 1, \ldots, X_{i-1} = 1] \ge c$

$E[\sum X_i] \ge c \log n$

Azuma’s inequality:
$\Pr[\sum X_i \le \frac{1}{2} c \log n] \le n^{-\epsilon}$

Lower Bounds on Greedy

$X_i$ depends only on the last ball visited. When a ball is visited, skip to the last node.

Assume $X_0 = 1$, $X_1 = 0$.

The probability that the dangling edge would skip over $B_2$ is at most $1 - c$.

Lemma: Both for skip graphs and small worlds, there exists a constant c such that:
$\Pr[X_i = 1 \mid X_0 = 1, X_1 = 1, \ldots, X_{i-1} = 1] \ge c$

[Figure: the line from d to 0, partitioned into the balls $B_0, B_1, B_2, B_3$.]

Conclusions

NoN Greedy seems like an almost free tweak that is a good idea in many settings.
Do not be perfect (all the time): randomization helps.
What is more important:
Prefix search.
Easy and ‘natural’ degree optimality.
Better understanding of the ‘small world’ phenomena.
