Peer-to-Peer Networks
Christian Scheideler
Institut für Informatik
Technische Universität München
Motivation
• Every distributed system must be based on a network interconnecting its sites
• Network: of physical or logical nature
Overview
• Graph Theory
• Supervised and Peer-to-Peer Overlay Networks
• Continuous-Discrete Approach
• Maintaining a robust Cycle
• Skip Graphs
• Locality-aware Overlay Networks
• Networks for non-uniform Peers
Graph theory
Graph G=(V,E):
• V: set of nodes / vertices
• $E \subseteq \{(v,w) \mid v,w \in V\}$: set of edges / arcs
• Edge (v,w): v knows w, so v can send info to w
[Figure: example graph on nodes A, B, C, D with a valid path marked]
Graph theory
• $\delta(v,w)$: distance (length of shortest path) of w to v in G
• $D = \max_{v,w} \delta(v,w)$: diameter of G
[Figure: example graph on nodes A, B, C, D, annotated D=4]
Graph theory
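• $\Gamma(U)$: set of neighbors of node set U
• $\alpha(U) = |\Gamma(U)| / |U|$
• $\alpha(G) = \min_{U,\,|U| < |V|/2} \alpha(U)$: expansion of G
[Figure: example graph on nodes A, B, C, D with a node set U, $|\Gamma(U)| = 1$, $|U| = 2$]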
Graph theory
Network G=(V,E,c):
• V: set of nodes, E: set of edges
• $c: E \to \mathbb{R}^+$: edge capacities
[Figure: example network on nodes A, B, C, D with an edge labeled 2]
Graph Theory
Unless mentioned otherwise:
• All edges have capacity 1
• {v,w} represents {(v,w), (w,v)}
Line Network
• degree 2 (optimal), BUT
• diameter bad (n-1 for n nodes)
• expansion bad ( $\alpha(\text{line}) = 2/n$ )
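The two measures used on these slides can be checked by brute force on small graphs. The following is a minimal sketch, not part of the original slides: the function names `diameter`, `expansion` and `line` are made up for illustration, the neighbor set of U is taken to mean neighbors outside U, and the minimum runs over all non-empty U with |U| ≤ |V|/2 (an assumption; the definition slide writes a strict inequality). The expansion routine enumerates all subsets, so it is only usable for very small connected graphs.

```python
from collections import deque
from itertools import combinations

def diameter(adj):
    """Largest shortest-path distance, found by a BFS from every node."""
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

def expansion(adj):
    """min over non-empty U with |U| <= |V|/2 of |neighbors of U outside U| / |U|."""
    nodes = list(adj)
    best = float("inf")
    for size in range(1, len(nodes) // 2 + 1):
        for subset in combinations(nodes, size):
            inside = set(subset)
            outside = {w for u in inside for w in adj[u]} - inside
            best = min(best, len(outside) / size)
    return best

def line(n):
    """Line network on nodes 0..n-1 as an adjacency dict."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

print(diameter(line(8)), expansion(line(8)))
```

For the line on 8 nodes this reports diameter 7 = n-1 and expansion 0.25 = 2/n, in line with the values on the slide.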
How to get a low diameter?
Binary Tree
• $n = 2^{k+1}-1$ nodes, degree 3
• diameter is $2k \approx 2 \log_2 n$, BUT
• expansion is still bad ( $\alpha(\text{tree}) = 2/n$ )
[Figure: binary tree with levels 0 to k (depth k)]
2-dimensional Grid
• $n = k^2$ nodes, maximum degree 4
• diameter is $2(k-1) < 2\sqrt{n}$
• expansion is $\sim 2/\sqrt{n}$
• Not too bad, but can we get better values?
[Figure: k × k grid, rows and columns 1 to k (side length k)]
Hypercube
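• Nodes: $(x_1,\dots,x_d) \in \{0,1\}^d$
• Edges: $\forall i$: $(x_1,\dots,x_d) \to (x_1,\dots,1-x_i,\dots,x_d)$
[Figure: hypercubes for d=1, d=2, d=3]
Degree d, diameter d, expansion $1/\sqrt{d}$
Routing: $(x_1,x_2,\dots,x_d) \to (y_1,x_2,\dots,x_d) \to (y_1,y_2,x_3,\dots,x_d) \to \dots \to (y_1,y_2,\dots,y_d)$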
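The routing rule above corrects one coordinate at a time, from left to right. A minimal sketch under the assumption that nodes are given as bit tuples (the name `hypercube_route` is hypothetical); coordinates that already agree are skipped, so the path has at most d hops.

```python
def hypercube_route(x, y):
    """Return the node sequence from x to y, fixing coordinates left to right."""
    assert len(x) == len(y)
    path = [tuple(x)]
    current = list(x)
    for i, target_bit in enumerate(y):
        if current[i] != target_bit:
            current[i] = target_bit      # cross the edge of dimension i
            path.append(tuple(current))
    return path

print(hypercube_route((0, 0, 0), (1, 0, 1)))
```

The number of hops equals the number of positions in which x and y differ, which is why the diameter is d.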
Butterfly
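• Nodes: $(k,(x_d,\dots,x_1)) \in \{0,\dots,d\} \times \{0,1\}^d$
• Edges: $(k-1,(x_d,\dots,x_1)) \to (k,(x_d,\dots,x_k,\dots,x_1)),\ (k,(x_d,\dots,1-x_k,\dots,x_1))$
Degree 4, diameter 2d, expansion $\sim 1/d$
[Figure: butterflies for d=1 (levels 0, 1; columns 0, 1) and d=2 (levels 0, 1, 2; columns 00, 01, 10, 11)]
Routing: $(0,(x_1,x_2,\dots,x_d)) \to (1,(y_1,x_2,\dots,x_d)) \to (2,(y_1,y_2,x_3,\dots,x_d)) \to \dots \to (d,(y_1,y_2,\dots,y_d))$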
Cube-Connected-Cycles
• Nodes: $(k,(x_1,\dots,x_d)) \in \{0,\dots,d-1\} \times \{0,1\}^d$
• Edges: $(k,(x_1,\dots,x_d)) \to (k-1,(x_1,\dots,x_d)),\ (k+1,(x_1,\dots,x_d)),\ (k,(x_1,\dots,1-x_{k+1},\dots,x_d))$
De Bruijn Graph
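• Nodes: $(x_1,\dots,x_d) \in \{0,1\}^d$
• Edges: $(x_1,\dots,x_d) \to (0,x_1,\dots,x_{d-1}),\ (1,x_1,\dots,x_{d-1})$
[Figure: de Bruijn graphs for d=2 (nodes 00, 01, 10, 11) and d=3 (nodes 000 to 111)]
Routing: $(x_1,\dots,x_d) \to (y_d,x_1,\dots,x_{d-1}) \to (y_{d-1},y_d,x_1,\dots,x_{d-2}) \to \dots$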
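The de Bruijn routing sequence above shifts the bits of the destination into the label, last bit first. A minimal sketch, assuming labels are bit tuples (the name `de_bruijn_route` is hypothetical); it always takes exactly d hops, even when a shorter path exists.

```python
def de_bruijn_route(x, y):
    """Shift the bits of y into x (y_d first); after d hops the label equals y."""
    assert len(x) == len(y)
    current = tuple(x)
    path = [current]
    for bit in reversed(y):
        current = (bit,) + current[:-1]   # edge (x1..xd) -> (bit, x1..x_{d-1})
        path.append(current)
    return path

print(de_bruijn_route((0, 1, 1), (1, 0, 0)))
```

Each hop uses one of the two outgoing edges of the current node, so the diameter is at most d with constant degree.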
The Diameter
Theorem: Every graph of maximum degree d>2 and size n must have a diameter of at least (log n)/(log(d-1))-1.
Theorem: For every even d>2 there is a family of graphs of maximum degree d and size n with diameter $(\log n)/(\log d - 1)$.
[Figure: tree of all reachable nodes at dist. k]
The Expansion
Theorem: For every graph G the expansion $\alpha(G)$ is at most 1.
Theorem: There are families of constant degree graphs with constant expansion.
Example: Gabber-Galil Graph
• Node set: $(x,y) \in \{0,\dots,n-1\}^2$
• $(x,y) \to (x,x+y),\ (x,x+y+1),\ (x+y,y),\ (x+y+1,y)$ (mod n)
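The Gabber-Galil edge rule can be written down directly. A small sketch (the function name `gabber_galil_neighbors` is made up here); it only lists the four targets of a node and does not verify the expansion claim.

```python
def gabber_galil_neighbors(x, y, n):
    """The four targets of node (x, y), all coordinates taken mod n."""
    return [
        (x, (x + y) % n),
        (x, (x + y + 1) % n),
        ((x + y) % n, y),
        ((x + y + 1) % n, y),
    ]

print(gabber_galil_neighbors(2, 4, 5))
```

Every node has four outgoing edges regardless of n, which is the constant-degree part of the theorem.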
Overlay Network
Basic question: how to organize sites in a scalable and robust overlay network?
Scalability: works efficiently for large number of sites
Robustness: can handle faults and malicious behavior
Alternatives:
• Supervised overlay network: supervisor assists in maintaining network
• Peer-to-peer overlay network: peers maintain network themselves
Supervised Overlay Network
• Supervisor assigns peers to points in [0,1) so that peers evenly distributed
• Neighboring peers connect to form cycle
[Figure: cycle on [0,1) with peers at 0, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8]
Supervised Overlay Network
• Node v wants to join (n nodes in system): give it (n+1)th position
• Node w wants to leave: move last node v to w's position
[Figure: cycle on [0,1) with the last node v moving to the position of w]
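The join and leave rules can be sketched as follows. This is an illustration under assumptions, not the author's implementation: the class `Supervisor` and the function `position` are hypothetical names, and the position sequence 0, 1/2, 1/4, 3/4, 1/8, 3/8, ... is taken from the figure of the supervised cycle.

```python
from fractions import Fraction

def position(i):
    """Point in [0,1) of the i-th peer (i = 0, 1, 2, ...): 0, 1/2, 1/4, 3/4, 1/8, ..."""
    if i == 0:
        return Fraction(0)
    level = i.bit_length() - 1              # floor(log2 i)
    offset = i - (1 << level)               # index within this level
    return Fraction(2 * offset + 1, 1 << (level + 1))

class Supervisor:
    def __init__(self):
        self.peers = []                     # peers[i] occupies position(i)

    def join(self, peer):
        self.peers.append(peer)             # new peer gets the (n+1)-th position
        return position(len(self.peers) - 1)

    def leave(self, peer):
        i = self.peers.index(peer)
        last = self.peers.pop()             # take the last peer ...
        if i < len(self.peers):
            self.peers[i] = last            # ... and move it into the vacated position

    def cycle(self):
        """Peers in cycle order, i.e. sorted by their points in [0,1)."""
        return [p for _, p in sorted((position(i), p) for i, p in enumerate(self.peers))]

s = Supervisor()
for name in "abcde":
    s.join(name)
print(s.cycle())
s.leave("b")
print(s.cycle())
```

Because occupied positions are always the first n points of the sequence, the gaps between neighboring peers differ by at most a factor of 2, which is the even distribution the supervisor is meant to maintain.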
Supervised Overlay Network
• v: node at nth position
• supervisor: stores pred(v), v, succ(v), succ(succ(v))
• join and graceful leave operation:
[Figure: cycle on [0,1) with node v]
Pure Peer-to-Peer Network
We also focus on [0,1).
Every peer mapped to random point in [0,1).
Peers form cycle based on points.
• Chord: cryptographic hash function
• CAN: random number
[Figure: interval [0,1) with peer v]
Continuous-discrete Approach
• V: set of peers, U: virtual space
• Each $v \in V$ mapped to region $R(v) \subseteq U$
• Family F of functions $f: U \to U$
• $\{v,w\}$ edge $\Leftrightarrow [F(R(v)) \cap R(w)] \cup [F(R(w)) \cap R(v)] \neq \emptyset$
Continuous-discrete Approach
Basic questions:
• How to map peers to regions?
• What family F to choose?
Continuous-discrete Approach
• Take a classical family of networks (Hypercube, de Bruijn graph, …)
• Convert it into continuous form by interpreting node labels as points in U, edges as a family of functions F
• Mapping peers to regions will then convert continuous form back into discrete graph.
Hypercube
Classical hypercube:
• V: nodes with labels $(x_1,\dots,x_d) \in \{0,1\}^d$
• For all i: $(x_1,\dots,x_d) \to (x_1,\dots,1-x_i,\dots,x_d)$
Continuous version of hypercube:
• Interpret $(x_1,\dots,x_d)$ as $z = \sum_i x_i/2^i$
• $d \to \infty$: U=[0,1)
• F: $f_i^+(x) = x + 1/2^i$, $f_i^-(x) = x - 1/2^i$ $\forall i > 0$
De Bruijn Graph
Classical de Bruijn graph:
• V: nodes with labels $(x_1,\dots,x_d) \in \{0,1\}^d$
• E: $(x_1,\dots,x_d) \to (0,x_1,\dots,x_{d-1}),\ (1,x_1,\dots,x_{d-1})$
Continuous de Bruijn graph:
• Interpret $(x_1,\dots,x_d)$ as $z = \sum_i x_i/2^i$
• $d \to \infty$: U=[0,1)
• F: $f_0(x) = x/2$, $f_1(x) = (1+x)/2$
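To see how the continuous de Bruijn graph turns back into a discrete one, the sketch below applies the edge rule of the continuous-discrete approach to peers on [0,1). It is an illustration under assumptions: each peer owns the half-open region from its point to its successor, the smallest point is 0 (so no region wraps around), and the helper names `regions`, `overlaps`, `targets` and `edges` are hypothetical.

```python
from fractions import Fraction

# Continuous de Bruijn family F on U = [0,1); both maps are increasing.
F = (lambda x: x / 2, lambda x: (1 + x) / 2)

def regions(points):
    """R(v) = [v, succ(v)) for the sorted points; assumes the smallest point is 0."""
    pts = sorted(points)
    assert pts[0] == 0
    return [(p, pts[i + 1] if i + 1 < len(pts) else Fraction(1)) for i, p in enumerate(pts)]

def overlaps(a, b):
    """Do the half-open intervals [a0,a1) and [b0,b1) intersect?"""
    return max(a[0], b[0]) < min(a[1], b[1])

def targets(rs, i):
    """Indices j for which F(R(i)) intersects R(j)."""
    images = [(f(rs[i][0]), f(rs[i][1])) for f in F]
    return {j for j, r in enumerate(rs) if any(overlaps(img, r) for img in images)}

def edges(rs):
    """Undirected edges given by the continuous-discrete rule, self-loops dropped."""
    return {frozenset((i, j)) for i in range(len(rs)) for j in targets(rs, i) if i != j}

rs = regions([Fraction(k, 8) for k in range(8)])
print(sorted(targets(rs, 5)))
```

With 8 evenly spaced peers, peer 5 (label 101) is mapped to peers 2 (010) and 6 (110), which are exactly its two neighbors in the classical de Bruijn graph of dimension 3.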
Gabber-Galil Graph
Classical Gabber-Galil graph:
• Node set: $(x,y) \in \{0,\dots,n-1\}^2$
• $(x,y) \to (x,x+y),\ (x,x+y+1),\ (x+y,y),\ (x+y+1,y)$ (mod n)
Continuous Gabber-Galil graph:
• $n \to \infty$: $U = [0,1)^2$
• F: $f_1(x,y) = (x,x+y)$, $f_2(x,y) = (x+y,y)$
Supervised Overlay Network
• How to map peers to regions?
• Consider any space U=[0,1)d
• Hierarchical decomposition tree:
Supervised Overlay Network
Fact:
• Volumes of subcubes assigned to nodes differ by factor of at most 2.
• Subcubes pairwise disjoint.
• Union of subcubes gives U.
Combine this with family F of functions.
Join Operation
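[Figure: decomposition tree with leaves 000, 001, 10, 11; regions R(v) and R(w) linked by functions f, f']
$\{u,v\}$ edge $\Leftrightarrow [F(R(u)) \cap R(v)] \cup [F(R(v)) \cap R(u)] \neq \emptyset$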
Join Operation
[Figure: interval [0,1) with regions 000, 001, 01, 10, 11 and sub-regions 010, 011; nodes v and w]
w inherits connections from v
Supervised Overlay Network
For any supervised network based on continuous-discrete approach with [0,1)d:
• Sufficient if supervisor introduces new peer to cycle neighbors. From these, new peer can get all F-connections
• Join/leave can be performed with constant time and work for supervisor.
High robustness:
• Sufficient to secure base cycle!
Peer-to-Peer Overlay Network
We focus on U=[0,1).
Every peer mapped to random point in [0,1).
[Figure: cycle on [0,1) with peer v]
v owns region [v, succ(v))
Join Operation
• New peer chooses random position x.
• Route to peer v owning position.
• Inherit all relevant edges w.r.t. F from v
[Figure: interval [0,1) with position x in the region of peer v]
Peer-to-Peer Overlay Network
Scalability: with hypercube / de Bruijn
• network has logarithmic diameter
• peers have (poly-)logarithmic degree
• join/leave need (poly-)logarithmic time/work (w.h.p.)
Robustness:
• Make sure base ring is robust!
Maintaining a robust cycle
Solution: connect to $\Theta(\log n)$ nearest neighbors
[Figure: cycle on [0,1) with each node connected to its nearest neighbors]
Chernoff bounds: nodes still connected under constant fraction of random failures (with high probability)
Nodes randomly distributed on cycle: constant fraction of correlated failures reduces to random failure case
Maintaining a robust cycle
Problem: what if adversarial peers are part of the system?
[Figure: cycle with adversarial peers and honest peers]
System cannot distinguish between peers!
Supervised cycle
[Figure: supervised cycle on [0,1) with nodes v and w]
Nodes connect to $\Theta(\log n)$ nearest neighbors:
Hard for adversarial peers to isolate honest peers
Peer-to-peer cycle
Chord: uses cryptographic hash function to map peers to points in [0,1)
• randomly distributes honest peers
• does not randomly distribute adversarial peers
Peer-to-peer cycle
Group spreading:
• Map peers to random points in [0,1)
• Limit lifetime of peers
Too expensive!
Peer-to-peer cycle
How can the system enforce an even distribution of honest and adversarial peers in the [0,1) space?
Peer-to-peer cycle
• n honest peers, $\epsilon n$ adversarial peers
• partition [0,1) space into regions of size (c log n)/n for some constant c
For any region $I \subseteq [0,1)$ of size (c log n)/n:
• Balancing condition: $\Theta(\log n)$ peers in I (scalability)
• Majority condition: honest peers in majority (robustness)
How to satisfy conditions?
• Rule that works: k-cuckoo rule
[Figure: a joining peer evicts the peers of a k/n-region; n honest, $\epsilon n$ adversarial peers]
Condition: $\epsilon < 1-1/k$
Limitation of k-cuckoo rule
• Works only for arbitrary sequences of join and leave requests issued by adversarial peers.
• Does not work for arbitrary sequences of join and leave requests of all peers (adversarial and honest alike).
Example: adversary orders all peers in a region of size O(log n / n) to leave
Solution: also rearrangements for leave operations.
k-Flip&Cuckoo Rule
• Join: as before (k-cuckoo rule)
• Leave: choose random k/n-region among neighboring (c log n) k/n-regions, empty & flip it with random k/n-region
[Figure: join and flip operations on the cycle; n honest, $\epsilon n$ adversarial peers]
Random Number Generation
Critical component: robust distributed random number generator
Solution:
• very simple (no error-correcting codes)
• works for public channels
• even if constant fraction is adversarial
Trick: generate groups of random numbers
Maintaining a robust cycle
• So far, only proactive techniques (i.e., techniques that protect cycle)
• Proactive techniques expensive and have their limits (minority of adv. peers)
• Also reactive techniques needed (i.e., techniques that can recover cycle)
Recovering a sorted list
Naïve approach:
• Continuously collect info about neighbors of neighbors until all nodes known
• Transform neighborhood into sorted list
[Figure: initial graph]
Not scalable!
Not easy to check!
Recovering a sorted list
Better approach: linearization
Every node does the following locally:
[Figure: linearization step on an example node set, before and after]
Coordination problem
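The local rule itself is only shown in the lost figure, so the following sketch uses the linearization step as it is commonly described, which is an assumption here: a node keeps its closest right neighbor and hands every further right neighbor over to the next closer one, and does the same on the left. Nodes are processed one after another, which sidesteps the coordination problem. The name `linearize` and the example graph are made up for illustration.

```python
def linearize(adj):
    """adj: dict node -> set of neighbors (undirected, connected, comparable node IDs).

    Repeats the local step until no node has more than one neighbor per side."""
    adj = {u: set(vs) for u, vs in adj.items()}
    changed = True
    while changed:
        changed = False
        for u in sorted(adj):
            right = sorted(v for v in adj[u] if v > u)                 # closest first
            left = sorted((v for v in adj[u] if v < u), reverse=True)  # closest first
            for side in (right, left):
                for near, far in zip(side, side[1:]):
                    # replace edge {u, far} by the shorter edge {near, far}
                    adj[u].discard(far)
                    adj[far].discard(u)
                    adj[near].add(far)
                    adj[far].add(near)
                    changed = True
    return adj

start = {1: {8, 12}, 3: {5, 12}, 5: {3, 8}, 8: {1, 5}, 12: {1, 3}}
print(linearize(start))
```

Every replacement swaps an edge for a strictly shorter one and keeps the graph connected, so the loop terminates with the sorted list.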
Recovering a sorted list
Naïve solution of coordination problems:
• Suppose that time is synchronized
• In each round (2 time steps) each node v:
  – right linearization
  – left linearization
[Figure: node v before and after right and left linearization]
Recovering a sorted list
Correctness of right/left linearization:
• Consider arbitrary consecutive pair v,w
• Range of path from v to w reduces by 1 in each round
• Degree increases by +2 in each round
[Figure: pair v, w with the range of the path from v to w]
Recovering a sorted list
More realistic approach: take asynchronous behavior into account
• Peers operate in actions: <label>: <guard> → <commands>
• v.NB: neighbor list of v
• we assume: $w \in v.NB \Leftrightarrow v \in w.NB$
• edges {v,w} behave like shared 0/1 variables
• no edges {v,v}
Recovering a sorted list
u.L, u.R: left / right neighborhood of u
Actions for node u:
• grow right: $(v \in u.R) \wedge (w \in v.L) \wedge (w \notin u.NB) \to u.NB := u.NB \cup \{w\}$
• trim right: $(v,w \in u.R) \wedge (w \in v.L) \to u.NB := u.NB \setminus \{v\}$
• grow left and trim left similar
[Figure: nodes u, w, v before and after grow right and trim right]
Notes:
• safe if executed sequentially in each node
• trim: preferred op to keep degree low
• wait until $w \in u.NB$ and $u \in w.NB$
Recovering a sorted cycle
Establish wrap-around edge:
• v.wa: wrap-around edge of v
• we assume: $v.wa = w \Leftrightarrow w.wa = v$
• v sets v.wa to w: $v.NB := v.NB \cup \{v.wa\}$, $v.wa := w$
Problem: more cases for initial state!
Recovering a sorted cycle
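Additional actions for node u:
• wrap: $(u.L = \emptyset) \wedge (u.wa = \bot) \wedge (w \in u.R) \to u.wa := w$
• extend: $(u.L = \emptyset) \wedge (u.wa \neq \bot) \wedge (w \in u.wa.R) \to u.wa := w$
• unwrap: $(u.L \neq \emptyset) \wedge (u.wa \neq \bot) \wedge (u.wa > u) \to u.wa := \bot$
[Figure: node pairs u, w and u, v illustrating wrap, extend and unwrap]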
Skip Graphs
Better:
• Give nodes hierarchically specified names, e.g. europe.germany.bavaria.munich.tum
• Sort nodes according to names
[Figure: nodes placed in the name space]
Problem: high imbalance, so continuous-discrete approach does not work!
Skip Graphs
• Each node v has arbitrary unique name ID(v) and random bit string s(v)
• $\mathrm{prefix}_i(s(v))$: first i bits of s(v)
Skip graph rule:
For every node v and $i \in \mathbb{N}_0$:
• v connects to closest successor and predecessor w (w.r.t. ID(v)) with $\mathrm{prefix}_i(s(w)) = \mathrm{prefix}_i(s(v))$
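The skip graph rule can be sketched by grouping nodes level by level. This is a minimal illustration, assuming all bit strings have the same finite length and that the lists are not closed into cycles; the function name `skip_graph` and the example strings are hypothetical.

```python
def skip_graph(nodes):
    """nodes: dict ID -> bit string s(v), all of equal length.

    Returns dict ID -> set of neighbor IDs under the skip graph rule."""
    neighbors = {v: set() for v in nodes}
    length = max(len(s) for s in nodes.values())
    for i in range(length + 1):
        groups = {}
        for v in sorted(nodes):                       # sorted by ID
            groups.setdefault(nodes[v][:i], []).append(v)
        for members in groups.values():
            # within a group, each node links to its predecessor and successor
            for a, b in zip(members, members[1:]):
                neighbors[a].add(b)
                neighbors[b].add(a)
    return neighbors

example = {1: "00", 2: "10", 3: "01", 4: "11", 5: "00"}
print(skip_graph(example))
```

Level 0 is the sorted list of all nodes; each further level splits every list in two according to the next bit, so with random bit strings a node takes part in about log n lists.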
Skip Graphs
Hierarchical view:
[Figure: hierarchy of lists labeled by prefixes: 0, 1; 00, 01, 10, 11; 000, 001, …]
$\Theta(\log n)$ degree, $\Theta(\log n)$ diameter, $\Theta(1)$ expansion w.h.p.
The Hyperring
Is randomization in skip graphs necessary?
Hyperring: deterministic form of skip graph
Approach similar to skip graphs: organize nodes in cycle according to real names.
[Figure: nodes Apple, Banana, Cherry on the cycle in name order]
k-separated Hyperring
In every level, bridges are k nodes apart.
How large does k have to be to guarantee polylogarithmic expansion?
Theorem: = (1/n)(1/ k )
So k has to be non-constant ( ( log n ) ).
Do areas with old insertions/deletions have to be revisited?
2
k-separated Hyperring
Rule: Choose k=6(d+3), where d is the current degree of the node initiating the op.
Theorem:
• degree: O(log n)
• expansion: $\Omega(1/\log n)$
• congestion for permutations: O(log n) w.h.p.
• work for Join/Leave: $O(\log^3 n)$
Locality-aware Overlay Networks
Problem: in general, a distance metric cannot be embedded well into 1-dimensional space
So applicability of skip graphs limited
Use different construction based on Plaxton, Rajaraman and Richa
Locality-aware Overlay Networks
For a node v let
• s(v) be its random bit string and
• $B_i(v)$ be ball around v of minimum radius so that $B_i(v)$ contains $c\,2^i \log n$ peers
[Figure: nested balls $B_1(v)$, $B_2(v)$, $B_3(v)$ around v]
Locality-aware Overlay Networks
Assumption: growth-bounded metric
• N(v,r): set of nodes w with d(v,w) < r
• There is a constant $\delta > 0$ so that $|N(v,(1+\delta)r)| < 2|N(v,r)|$ for all v, r
[Figure: nested balls $B_1(v)$, $B_2(v)$, $B_3(v)$ around v]
Locality-aware Overlay Networks
Topology: for every node v and $i \in \mathbb{N}$:
• v connects to all nodes $w \in B_i(v)$ with $\mathrm{prefix}_{i-1}(s(v)) = \mathrm{prefix}_{i-1}(s(w))$
[Figure: nested balls $B_1(v)$, $B_2(v)$, $B_3(v)$; $c\,2^i \log n$ peers in $B_i(v)$]
Locality-aware Overlay Networks
Topology rule implies:
• degree of each node $\Theta(\log^2 n)$ w.h.p.
• v has nodes w in $B_i(v)$ with $\mathrm{prefix}_i(s(w)) = \mathrm{prefix}_{i-1}(s(v)) \circ x$ for all $x \in \{0,1\}$ w.h.p.
[Figure: nested balls $B_1(v)$, $B_2(v)$, $B_3(v)$; $c\,2^i \log n$ peers in $B_i(v)$]
Locality-aware Routing
Routing from v to w:
• $s(v) = (x_1,x_2,x_3,\dots)$, $s(w) = (y_1,y_2,y_3,\dots)$
• $v \to$ closest $u_1$ in $B_1(v)$ with $\mathrm{prefix}_1(s(u_1)) = y_1$
• $u_1 \to$ closest $u_2$ in $B_2(u_1)$ with $\mathrm{prefix}_2(s(u_2)) = y_1 y_2$
• …
• until we reach $u_{k-1}$ with w in $B_k(u_{k-1})$
Locality-aware Routing
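Let r(B) be radius of ball B, and let $\delta$ be the growth constant.
• $d(u_1,v) < r(B_1(v))/\gamma$ w.h.p. ( $\gamma = \Theta(\log_{1+\delta} c)$ )
• $r(B_2(u_1)) > (1+\delta-1/\gamma)\, r(B_1(v))$
• $d(u_2,u_1) < r(B_2(u_1))/\gamma$ w.h.p.
• $r(B_3(u_2)) > (1+\delta-1/\gamma)\, r(B_2(u_1))$
• …
After k hops ( $r = r(B_1(v))$ ):
• $d(u_k,w) < d(v,w) + \sum_{i=0}^{k-1} (1+\delta-1/\gamma)^i\, r/\gamma < d(v,w) + (\gamma\delta-1)^{-1}\, r\, (1+\delta-1/\gamma)^k$
• $r(B_{k+1}(u_k)) > (1+\delta-1/\gamma)^k\, r$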
Locality-aware Routing
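After k hops ( $r = r(B_1(v))$ ):
• $d(u_k,v) < \sum_{i=0}^{k-1} (1+\delta-1/\gamma)^i\, r/\gamma < (\gamma\delta-1)^{-1}\, r\, (1+\delta-1/\gamma)^k$
• $r(B_{k+1}(u_k)) > (1+\delta-1/\gamma)^k\, r$
Finally, $w \in B_{k+1}(u_k)$:
• $d(v,w) > r(B_k(u_{k-1})) - d(u_{k-1},v) > (1-1/(\gamma\delta-1))\,(1+\delta-1/\gamma)^{k-1}\, r$
• $d(u_k,v) < d^* = (\gamma\delta-1)^{-1}\, r\, (1+\delta-1/\gamma)^k$ and total path length $< 2d^* + d(v,w)$
[Figure: path from v via $u_k$ to w]
$d^* < (\epsilon/2)\, d(v,w)$ if $\gamma > 2(1+\delta)/(\delta\epsilon) + 2/\delta$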
Networks for non-uniform peers
Problem: peers have non-uniform bandwidth
Cont-disc and skip graphs do not work!
Networks for non-uniform peers
Ad-hoc solutions:
• cut large peers into many small peers
• multi-tier network
Better approach:
• organize peers in a heap
How to design scalable distributed heap?
Networks for non-uniform peers
PAGODA heap network
[Figure: stack of leveled de Bruijn graphs dB(1), dB(2), dB(3), dB(4), … with 3, 4, 5, … levels; nodes v and w]
dB(d): leveled de Bruijn graph of dimension d
Routing between v and w via nodes of two dB-levels up
Join
PAGODA heap network
[Figure: stack of leveled de Bruijn graphs dB(1), dB(2), dB(3), dB(4), … with 4, 5, …, $\sim \log_2 n$ levels]
dB(d): leveled de Bruijn graph of dimension d
Move upwards until all parents have larger bandwidth
Leave
PAGODA heap network
[Figure: stack of leveled de Bruijn graphs dB(1), dB(2), dB(3), dB(4), … with 4, 5, …, $\sim \log_2 n$ levels]
dB(d): leveled de Bruijn graph of dimension d
Set bandwidth to 0, send downwards until no further children, remove node
Networks for non-uniform peers
PAGODA heap network
[Figure: stack of leveled de Bruijn graphs dB(1), dB(2), dB(3), dB(4), … with up to $\sim \log_2 n$ levels]
dB(d): leveled de Bruijn graph of dimension d
Problem: updating PAGODA may need $O(\log^2 n)$ time
Networks for non-uniform peers
SHELL network: oblivious heap
Join operation: O(log n) time
Leave operation: O(1) time
Conclusions
Many interesting fronts to work on in context of scalable distributed systems:
• self-optimizing networks
• social networks
• proactive approaches
• reactive approaches (repairs under adversarial presence)
• new paradigms