
Page 1: Overlays and DHTs

Overlays and DHTs

Presented by

Dong Wang and Farhana Ashraf

Page 2: Overlays and DHTs

Schedule

Review of Overlays and DHTs
RON
Pastry
Kelips

Page 3: Overlays and DHTs

Review of Overlays and DHTs

Overlay
A network built on top of another network; nodes are connected by logical/virtual links. Overlays improve Internet routing and are easy to deploy. Examples: P2P systems (Gnutella, Chord, ...) and RON.

DHT
Lets you look up, insert, and delete objects by key in a distributed setting.

Performance concerns: load balancing, fault tolerance, efficiency of lookups and inserts, locality.

CAN, Chord, Pastry, and Tapestry are all DHTs.

Page 4: Overlays and DHTs

Resilient Overlay Network

David G. Andersen, et al.

MIT

SOSP 2001

Acknowledged: http://nms.csail.mit.edu/ron/ and previous CS525 courses

Page 5: Overlays and DHTs

RON-Resilient Overlay Network

Motivation
Goals
Design
Evaluation
Discussion

Page 6: Overlays and DHTs

RON-Motivation

The current Internet backbone is NOT able to:

• Detect failed paths and recover quickly (BGP takes several minutes to recover from faults)
• Detect flooding and congestion effectively
• Leverage redundant paths efficiently
• Express fine-grained policies/metrics

Page 7: Overlays and DHTs

RON-Basic Idea

[Figure: four hosts A, B, C, D with direct and indirect Internet paths between them]

The triangle inequality does not usually hold for Internet path metrics (e.g., latency): it is possible that AB + BC < AC. For instance (hypothetical numbers), the direct path A to C might take 200 ms while going through B takes only 50 ms + 60 ms = 110 ms. RON exploits the underlying path redundancy of the Internet to provide better paths and to route around failures. RON is an end-to-end solution: packets are simply encapsulated and sent over the normal Internet.

Page 8: Overlays and DHTs

RON-Main Goals

Fast failure detection and recovery: average detect-and-recover delay < 20 s

Tighter integration of routing/path selection with the application: applications can specify metrics that affect routing

Expressive policy routing: fine-grained, aimed at users and hosts

Page 9: Overlays and DHTs

RON-Design

Overlay: an old idea in networking. Easily deployed, letting the Internet focus on scalability; functionality is kept only between active peers.

Approach: aggressively probe all inter-RON-node paths, exchange routing information, and route along the best path (from the end-to-end view) consistent with the routing policy.

Page 10: Overlays and DHTs

RON-Design: Architecture

Probe between nodes to detect path quality
Store path qualities in a performance database
Run a link-state routing protocol among the RON nodes
Data are handled by an application-specific conduit and forwarded over UDP

Page 11: Overlays and DHTs

RON-Design: Routing and Lookup

Policy routing
Classify traffic by policy; generate one routing table per policy

Metric optimization
The application tags each packet with its preferred metric; generate one table per metric

Multi-level routing table with a 3-stage lookup: Policy -> Metric -> Next hop
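
A minimal sketch of how such a multi-level table and lookup could be organized; the policy name, metrics, destinations, and next hops below are hypothetical, not taken from the paper.

```python
# Sketch of a 3-stage routing lookup: policy -> metric -> next hop.
# All table contents here are made-up placeholders.

routing_tables = {
    "no-commercial-over-internet2": {                 # policy
        "latency":    {"dst-mit": "hop-cmu"},         # metric -> destination -> next hop
        "loss":       {"dst-mit": "hop-cornell"},
        "throughput": {"dst-mit": "hop-cmu"},
    },
}

def next_hop(policy: str, metric: str, dst: str) -> str:
    """Three-stage lookup: pick the per-policy table, then the
    per-metric table inside it, then the next hop for the destination."""
    return routing_tables[policy][metric][dst]

print(next_hop("no-commercial-over-internet2", "latency", "dst-mit"))  # -> hop-cmu
```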

Page 12: Overlays and DHTs

RON Design-Probing and Outage Detection

Probe every PROBE_INTERVAL (12 s)
With 3 packets, both participants obtain an RTT and reachability estimate without synchronized clocks
If a probe is lost, send the next one immediately, up to 3 more probes (PROBE_TIMEOUT = 3 s)
Declare an outage after 4 consecutive probe losses
Outage detection time is 19 s on average
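
A minimal sketch of this detection loop for one virtual link, assuming hypothetical send_probe() and on_outage() callbacks; the constants are the ones quoted above.

```python
import time

PROBE_INTERVAL = 12.0   # seconds between routine probes
PROBE_TIMEOUT = 3.0     # per-probe wait, assumed to happen inside send_probe()
OUTAGE_THRESHOLD = 4    # consecutive losses before declaring an outage

def monitor_path(send_probe, on_outage):
    """Probe one RON virtual link and report outages.

    send_probe() -> bool and on_outage() are hypothetical callbacks:
    the former returns True if the probe was answered within
    PROBE_TIMEOUT, the latter triggers rerouting via another RON node.
    """
    consecutive_losses = 0
    while True:
        if send_probe():
            consecutive_losses = 0
            time.sleep(PROBE_INTERVAL)      # routine probing
        else:
            consecutive_losses += 1         # loss suspected: probe again immediately
            if consecutive_losses >= OUTAGE_THRESHOLD:
                on_outage()
                consecutive_losses = 0
                time.sleep(PROBE_INTERVAL)
```

With these constants a failure surfaces after at most one routine interval plus a few timed-out probes, consistent with the ~19 s average detection time quoted above.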

Page 13: Overlays and DHTs

RON Design-Policy Routing

Allow users to define which types of traffic are allowed on particular network links

Place policy tags on packets and build per-policy routing tables

Two policy mechanisms: exclusive cliques and general policies

Page 14: Overlays and DHTs

RON-Evaluation

Two main datasets from the Internet deployment
RON1: N = 12 nodes, 132 distinct paths, traversing 36 ASes and 74 inter-AS paths
RON2: N = 16 nodes, 240 distinct paths, traversing 50 ASes and 118 inter-AS paths

Policy: prohibit sending traffic to or from commercial sites over Internet2

[Figure: RON deployment sites, including CA-T1, CCI, Aros, Utah, CMU, MIT, MA-Cable, Cisco, Cornell, NYU, OR-DSL, and international sites vu.nl and Lulea.se]

Page 15: Overlays and DHTs

RON Evaluation-Major Results

Increases the resilience of the overlay network: RON takes ~10 s to route around a failure, compared to BGP's several minutes
Many Internet outages are avoidable
Improves performance: loss rate, latency, TCP throughput
Single-hop indirect routing works well
Overhead is reasonable and acceptable

Page 16: Overlays and DHTs

RON vs. Internet: 30-minute loss rates

For RON1, RON was able to route around all outages; for RON2, about 60% of outages were overcome

Page 17: Overlays and DHTs

Performance-Loss rate

Page 18: Overlays and DHTs

Performance-Latency

Page 19: Overlays and DHTs

Performance-TCP Throughput

Performance improvement: RON employs application-specific metric optimization to select paths

Page 20: Overlays and DHTs

Scalability

Page 21: Overlays and DHTs

Conclusions

RON improves network reliability and performance

The overlay approach is attractive for resiliency: easy development, fewer nodes, a simple substrate

Single-hop indirection in RON works well

RON also introduces additional probing and routing-update traffic into the network

Page 22: Overlays and DHTs

RON-Discussion

Aggressiveness
RON never backs off as TCP does. Can it coexist with current Internet traffic? What happens if everyone starts to use RON? Is it possible to modify RON to achieve good behavior in a global sense?

Scalability
RON trades scalability for improved reliability. Possible directions: many RONs coexisting in the Internet, or a hierarchical structure of RON networks.

Page 23: Overlays and DHTs

Distributed Hash Table

Page 24: Overlays and DHTs

Problem

Route a message with key K to the node Z whose ID is closest to K

Not scalable if the routing table contains all nodes

Tradeoffs: memory per node, lookup latency, number of messages

[Figure: example ID space with nodes A = 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1, and a lookup Route(d46a1c) for key X = d46a1c]

Page 25: Overlays and DHTs

Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems

Antony Rowstron and Peter Druschel

Middleware 2001

Acknowledged: previous CS525 courses

Page 26: Overlays and DHTs

Motivation

Node IDs are assigned randomly; with high probability, nodes with adjacent IDs are diverse

Considers network locality: seeks to minimize the distance messages travel

Scalar proximity metric: number of IP routing hops, or RTT

Page 27: Overlays and DHTs

Pastry: Node Soft State

Leaf set: immediate neighbors in the ID space (used for routing); similar to Chord's successor and predecessor lists

Neighborhood set: nodes closest according to the locality metric (used to update the routing table)

Routing table: used for routing; similar to Chord's finger table entries

Storage requirement per node = O(log N)

Page 28: Overlays and DHTs

Pastry: Node Soft State

Leaf set: contains L nodes that are closest in the ID space

Neighborhood set: contains M nodes that are closest according to the proximity metric

Routing table: entries in row n share exactly the first n digits with the local node's ID; among candidates, nodes are chosen according to the proximity metric
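
A minimal data-structure sketch of this per-node state, with illustrative field names (not the paper's implementation); b = 4 and 128-bit IDs are assumed.

```python
from dataclasses import dataclass, field
from typing import List, Optional

B = 4           # Pastry's configuration parameter b: digits are base 2^b
ID_DIGITS = 32  # 128-bit IDs written as 32 hex digits when b = 4

@dataclass
class PastryNodeState:
    node_id: str                 # e.g. "d46a1c..." as base-16 digits
    leaf_set: List[str] = field(default_factory=list)          # L IDs closest numerically
    neighborhood_set: List[str] = field(default_factory=list)  # M IDs closest by proximity
    # routing_table[row][col]: row = length of the prefix shared with node_id,
    # col = value of the next digit; each entry is filled with a nearby node.
    routing_table: List[List[Optional[str]]] = field(
        default_factory=lambda: [[None] * (2 ** B) for _ in range(ID_DIGITS)]
    )
```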

Page 29: Overlays and DHTs

Pastry: Routing

Case I: key within the leaf set
Route to the node in the leaf set whose ID is closest to the key

Case II (prefix routing): key not within the leaf set
Route to a node in the routing table that shares one more digit with the key than the local node does

Case III: key not within the leaf set and Case II not possible
Route to a node that shares at least as many digits with the key as the local node, but is numerically closer to the key
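
A minimal sketch of this three-case decision, assuming the PastryNodeState from the previous sketch and hex-digit IDs; wraparound of the ID space and other edge cases are ignored.

```python
from typing import Optional

def shared_prefix_len(a: str, b: str) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def numeric_dist(a: str, b: str) -> int:
    return abs(int(a, 16) - int(b, 16))

def route(state: "PastryNodeState", key: str) -> Optional[str]:
    if key == state.node_id:
        return state.node_id
    # Case I: key falls within the span of the leaf set -> closest leaf (or us).
    leaves = state.leaf_set + [state.node_id]
    ids = [int(x, 16) for x in leaves]
    if state.leaf_set and min(ids) <= int(key, 16) <= max(ids):
        return min(leaves, key=lambda x: numeric_dist(x, key))
    # Case II: routing-table entry sharing one more digit with the key.
    p = shared_prefix_len(state.node_id, key)
    if p < len(key):
        entry = state.routing_table[p][int(key[p], 16)]
        if entry is not None:
            return entry
    # Case III: any known node with at least as long a shared prefix
    # that is numerically closer to the key than we are.
    known = state.leaf_set + state.neighborhood_set + [
        e for row in state.routing_table for e in row if e
    ]
    closer = [c for c in known
              if shared_prefix_len(c, key) >= p
              and numeric_dist(c, key) < numeric_dist(state.node_id, key)]
    return min(closer, key=lambda c: numeric_dist(c, key)) if closer else None
```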

Page 30: Overlays and DHTs

Routing Example

Each hop resolves one more base-2^b digit of the key, cutting the remaining ID space by a factor of 2^b

The expected number of hops is O(log_{2^b} N)

[Figure: routing example, lookup(d46a1c) starting at node 65a1fc, over an ID space containing d13da3, d4213f, d462ba, d467c4, d471f1]
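
A quick back-of-the-envelope check of the hop bound, assuming b = 4 and N = 10^6 (illustrative values, not from the slides):

```python
import math

b = 4            # digits are base 2^b = 16
N = 1_000_000    # assumed network size, for illustration only

expected_hops = math.log(N, 2 ** b)
print(round(expected_hops, 2))   # ~4.98, i.e. about 5 overlay hops
```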

Page 31: Overlays and DHTs

Self Organization: Node Join

New node X = d46a1c joins; A = 65a1fc is a nearby node that X already knows

X asks A to route a join message with key d46a1c (X's own ID)

[Figure: the join message Route(d46a1c) travels through the overlay (d13da3, d4213f, d462ba) toward d467c4, the node with the ID numerically closest to X]

Page 32: Overlays and DHTs

Pastry: Node State Initialization

Leaf set of X = leaf set of Z (the node whose ID is numerically closest to X)

Neighborhood set of X = neighborhood set of A (the nearby node X first contacted)

Routing table: row zero of X = row zero of A; row one of X = row one of B (the next node on the join route); and so on along the route

[Figure: join route Route(d46a1c) for new node X = d46a1c, starting at A = 65a1fc, passing B = d13da3, d4213f, d462ba, and ending at Z = d467c4; d471f1 is also shown]
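
A minimal sketch of this initialization, assuming the PastryNodeState from the earlier sketch; join_route is a hypothetical list of the nodes the join message traversed, starting at A and ending at Z.

```python
from typing import List

def initialize_state(x: "PastryNodeState",
                     join_route: List["PastryNodeState"]) -> None:
    a, z = join_route[0], join_route[-1]
    x.leaf_set = list(z.leaf_set)                  # Z is numerically closest to X
    x.neighborhood_set = list(a.neighborhood_set)  # A is physically closest to X
    # Row i of X's routing table is copied from the i-th node on the route,
    # since that node shares (at least) an i-digit prefix with X.
    for i, node in enumerate(join_route):
        if i < len(x.routing_table):
            x.routing_table[i] = list(node.routing_table[i])
```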

Page 33: Overlays and DHTs

Pastry: Node State Update

X informs any nodes that need to be aware of its arrival

X also improves the locality of its tables by requesting neighborhood sets from all the nodes it knows

In practice: an optimistic approach

Page 34: Overlays and DHTs

Pastry: Node Departure (Failure)

Leaf set repair (eager, runs all the time):
Leaf set members exchange keep-alive messages; on a failure, request the leaf set from the furthest live node in the set

Routing table repair (lazy, upon failure):
Get a replacement entry from peers in the same row; if none is found, ask peers in higher rows

Neighborhood set repair (eager)
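
A minimal sketch of the lazy routing-table repair, assuming the PastryNodeState from the earlier sketch and a hypothetical ask(peer, row, col) RPC that returns that peer's routing-table entry for the given slot (or None).

```python
from typing import Callable, Optional

def repair_entry(state: "PastryNodeState", row: int, col: int,
                 ask: Callable[[str, int, int], Optional[str]]) -> bool:
    """Replace the failed entry routing_table[row][col]: query peers in the
    same row first, then peers from higher (longer-prefix) rows."""
    for r in range(row, len(state.routing_table)):
        for peer in state.routing_table[r]:
            if peer is None:
                continue
            candidate = ask(peer, row, col)   # "what is your entry for this slot?"
            if candidate is not None and candidate != state.node_id:
                state.routing_table[row][col] = candidate
                return True
    return False
```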

Page 35: Overlays and DHTs

Routing Performance

Average number of hops = O(log N); Pastry additionally uses locality information to keep each hop short

Page 36: Overlays and DHTs

Kelips: Building an Efficient and Stable P2P DHT through Increased Memory and Background Overhead

Indranil Gupta, Ken Birman, Prakash Linga, Al Demers, and Robert van Renesse

IPTPS 2003

Acknowledged: previous CS525 courses

Page 37: Overlays and DHTs

Motivation

For n = 1,000,000 nodes and 20 bytes per entry, the storage requirement at a Pastry node is only about 120 bytes

Not making full use of the memory available at each node

How can we achieve O(1) lookup latency? Increase the memory usage per node

Page 38: Overlays and DHTs

Design

Consists of k virtual affinity groups; each node is a member of one affinity group

Soft state:

Affinity group view: a (partial) set of the other nodes lying in the same affinity group

Contacts: a (constant-sized) set of nodes lying in each of the foreign affinity groups

Filetuples: a (partial) set of (filename, host IP address) pairs, where the host (homenode) lies in the same affinity group

Each entry carries an RTT estimate and a heartbeat count
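
A minimal data-structure sketch of this per-node soft state, with illustrative field names (not the paper's implementation).

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PeerInfo:
    addr: str
    rtt_ms: float = 0.0       # round-trip-time estimate for this peer
    heartbeat: int = 0        # last heartbeat count seen for this peer

@dataclass
class KelipsNodeState:
    node_id: str
    group: int                                  # this node's affinity group, in [0, k)
    k: int                                      # number of affinity groups
    # (partial) view of other nodes in the same affinity group
    group_view: Dict[str, PeerInfo] = field(default_factory=dict)
    # constant-sized contact set for each foreign affinity group
    contacts: Dict[int, List[PeerInfo]] = field(default_factory=dict)
    # filetuples: filename -> homenode address (homenode is in this group)
    filetuples: Dict[str, str] = field(default_factory=dict)
```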

Page 39: Overlays and DHTs

Kelips: Node Soft State

[Figure: Kelips node soft state, showing affinity groups #0 … #k-1; an affinity group view with (id, heartbeat, RTT) entries such as (18, 1890, 23 ms) and (167, 2067, 67 ms); contact nodes per foreign group, e.g. group 0 -> [129, 30, …] and group 1 -> [15, 160, …]; and filetuples mapping filenames (e.g. p2p.txt) to their homenodes]

Page 40: Overlays and DHTs

Storage Requirement at a Kelips Node

S(k, n) = n/k + c * (k-1) + F/k entries
(n/k: affinity group view, c * (k-1): contacts, F/k: filetuples)

Minimized at k = √((n+F)/c); assuming F is proportional to n and c is fixed, the optimal k = O(√n)

For n = 1,000,000 and F = 10 million files, the total memory requirement is < 20 MB
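
A quick check of the formula and its optimum, assuming c = 2 contacts per foreign group (an illustrative value the slide does not fix):

```python
import math

n = 1_000_000      # nodes (from the slide)
F = 10_000_000     # files (from the slide)
c = 2              # assumed contacts per foreign affinity group

def storage_entries(k: int) -> float:
    """S(k, n) = n/k (group view) + c*(k-1) (contacts) + F/k (filetuples)."""
    return n / k + c * (k - 1) + F / k

k_opt = round(math.sqrt((n + F) / c))   # k that minimizes S(k, n)
print(k_opt)                            # ~2345 affinity groups
print(round(storage_entries(k_opt)))    # on the order of 10^4 entries per node
```

Even at a few tens of bytes per entry, that is comfortably below the 20 MB quoted above.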

Page 41: Overlays and DHTs

Algorithm: Lookup

Lookup (key D, issued at node A):
The affinity group G of D's homenode = hash(D)
A sends the message to the closest node X in its contact set for affinity group G
X finds D's homenode in its filetuple set and returns it

O(1) lookup
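
A minimal sketch of this one-hop lookup, assuming the KelipsNodeState sketched earlier and a hypothetical query(addr, filename) RPC that asks a remote node to resolve a filename from its filetuples.

```python
import hashlib
from typing import Callable, Optional

def affinity_group(name: str, k: int) -> int:
    """Map a key (filename or node ID) to one of k affinity groups."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % k

def lookup(state: "KelipsNodeState", filename: str,
           query: Callable[[str, str], Optional[str]]) -> Optional[str]:
    g = affinity_group(filename, state.k)
    if g == state.group:
        # The homenode is in our own group: check local filetuples directly.
        return state.filetuples.get(filename)
    # One hop: ask the closest contact in group g to resolve from its filetuples.
    contacts = sorted(state.contacts.get(g, []), key=lambda p: p.rtt_ms)
    return query(contacts[0].addr, filename) if contacts else None
```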

Page 42: Overlays and DHTs

Maintaining Soft State

Heartbeat mechanism: soft-state entries are refreshed periodically, both within and across affinity groups

Each node periodically selects a few nodes as gossip targets and sends them partial soft-state information

Uses a constant gossip message size

O(log n) dissemination time for gossip
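
A minimal sketch of one gossip round, assuming the KelipsNodeState from the earlier sketch and a hypothetical send(addr, payload) transport; the fanout and payload cap are illustrative constants.

```python
import random
from typing import Callable, Dict

GOSSIP_FANOUT = 3   # assumed number of targets per round
MAX_ITEMS = 10      # caps the payload so the gossip message size stays constant

def gossip_round(state: "KelipsNodeState",
                 send: Callable[[str, Dict], None]) -> None:
    items = list(state.group_view.items()) + list(state.filetuples.items())
    # Constant-sized payload: a random partial sample of our soft state.
    payload = dict(random.sample(items, min(MAX_ITEMS, len(items))))
    # Targets: peers in our own group plus contacts in foreign groups.
    targets = list(state.group_view.values())
    for contact_list in state.contacts.values():
        targets.extend(contact_list)
    for peer in random.sample(targets, min(GOSSIP_FANOUT, len(targets))):
        send(peer.addr, payload)
```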

Page 43: Overlays and DHTs

Load Balance

[Plots: N = 1500 nodes with 38 affinity groups, and N = 1000 nodes with 30 affinity groups]

Page 44: Overlays and DHTs

Discussion Points

What happens when the triangle inequality does not hold for a proximity metric?

What happens for high churn rate in Pastry and Kelips?

What was the intuition behind affinity groups?

Can we use Kelips in an Internet-scale network?

Page 45: Overlays and DHTs

Conclusion

A DHT tradeoff: storage requirement vs. lookup latency

Going one step further: one-hop lookups for P2P overlays

http://www.usenix.org/events/hotos03/tech/talks/gupta_talk.pdf

Chord: storage requirement O(log N), lookup latency O(log N)
Pastry: storage requirement O(log N), lookup latency O(log N)
Kelips: storage requirement O(√N), lookup latency O(1)