Distributed Hash Tables
CPE 401 / 601 Computer Network Systems
Modified from Ashwin Bharambe and Robert Morris
P2P Search
How do we find an item in a P2P network?
Unstructured
• Query flooding
• Not guaranteed to find the item
• Designed for the common case: popular items
Structured
• DHTs, a kind of indexing
• Guaranteed to find an item (if it exists)
• Designed for both popular and rare items
The lookup problem
[Figure: a publisher at node N4 calls put(key="title", value=file data...) somewhere in the Internet overlay (nodes N1-N6); a client elsewhere calls get(key="title") - which node should it ask?]
• The lookup problem is at the heart of all DHTs
Centralized lookup (Napster)
[Figure: the publisher at N4 registers SetLoc("title", N4) with a central DB; the client sends Lookup("title") to the DB and is directed to N4, which holds key="title", value=file data...]
• Simple, but O(N) state at the central server and a single point of failure
Flooded queries (Gnutella)
[Figure: the client floods Lookup("title") to its neighbors, which forward it on until it reaches the publisher at N4 holding key="title", value=file data...]
• Robust, but worst case O(N) messages per lookup
Routed queries (Freenet, Chord, etc.)
[Figure: the client's Lookup("title") is forwarded hop by hop along overlay routes until it reaches the publisher at N4 holding key="title", value=file data...]
Hash Table
Name-value pairs (or key-value pairs)
• E.g., "Mehmet Hadi Gunes" and [email protected]
• E.g., "http://cse.unr.edu/" and the Web page
• E.g., "HitSong.mp3" and "12.78.183.2"
Hash table: a data structure that associates keys with values
• lookup(key) returns the value stored under that key
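A hash table in code is just this key-to-value association; below is a minimal Python sketch using the slide's example pairs (the stored values are placeholders for illustration):

```python
# A plain, single-machine hash table: lookup(key) returns the value stored under key.
table = {}

table["HitSong.mp3"] = "12.78.183.2"             # filename -> address of a host serving it
table["http://cse.unr.edu/"] = "<the Web page>"  # URL -> page contents (placeholder value)

print(table["HitSong.mp3"])                      # lookup("HitSong.mp3") -> "12.78.183.2"
```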
Distributed Hash Table
Hash table spread over many nodes
• Distributed over a wide area
Main design goals
• Decentralization: no central coordinator
• Scalability: efficient even with a large number of nodes
• Fault tolerance: tolerate nodes joining and leaving
Distributed hash table (DHT)
[Figure: a distributed application calls put(key, data) and get(key) on the distributed hash table, which spreads storage across many nodes; underneath, a lookup service maps lookup(key) to the IP address of the responsible node.]
• The application may be distributed over many nodes
• The DHT distributes data storage over many nodes
Distributed Hash Table
Two key design decisions
• How do we map names onto nodes?
• How do we route a request to that node?
Hash Functions
Hashing
• Transform the key into a number
• Use the number to index an array
Example hash function
• Hash(x) = x mod 101, mapping keys to 0, 1, ..., 100
Challenges
• What if there are more than 101 nodes? Fewer?
• Which nodes correspond to each hash value?
• What if nodes come and go over time? (see the sketch below)
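A minimal sketch of the mod-101 hash above, illustrating the last challenge: if the number of nodes changes, a plain modulus remaps most keys (the key values here are made up for illustration):

```python
def bucket(key: int, num_buckets: int = 101) -> int:
    # Hash(x) = x mod 101 maps any integer key to one of the buckets 0..100.
    return key % num_buckets

keys = [15, 876, 4501, 60217]
print([bucket(k) for k in keys])        # positions with 101 buckets
print([bucket(k, 100) for k in keys])   # drop one bucket: most keys move to a new position
```

This is exactly the instability that consistent hashing (later slides) is designed to avoid.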
Definition of a DHT
Hash table supports two operations
• insert(key, value)
• value = lookup(key)
Distributed
• Map hash buckets to nodes
Requirements
• Uniform distribution of buckets
• Cost of insert and lookup should scale well
• Amount of local state (routing table size) should scale well
Plaxton Trees
Motivation
• Access nearby copies of replicated objects
• Time-space trade-off: space = routing table size, time = access hops
Plaxton Trees Algorithm
1. Assign labels to objects and nodes, using randomizing hash functions
• Each label consists of log_{2^b}(n) digits, each digit in base 2^b
• Example labels from the figure: object 9 A E 4, node 2 4 7 B
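A hedged sketch of step 1: deriving a fixed-length label of base-2^b digits from a randomizing hash. SHA-1, b = 4 (hex digits), and a 4-digit label length are illustrative assumptions, not the algorithm's prescribed parameters.

```python
import hashlib

def label(name: str, digits: int = 4, b: int = 4) -> list[int]:
    # Randomizing hash -> fixed-length label of base-2**b digits.
    # With b = 4 each digit is a hex digit, like the labels 2 4 7 B and 9 A E 4 above.
    h = int(hashlib.sha1(name.encode()).hexdigest(), 16)
    base = 2 ** b
    out = []
    for _ in range(digits):
        out.append(h % base)
        h //= base
    return list(reversed(out))

print(label("some-object"))   # a list of four digits, each in 0..15
```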
Plaxton Trees Algorithm
2. Each node knows about other nodes with varying prefix matches
[Figure: node 2 4 7 B keeps neighbors grouped by shared-prefix length: length 0 (no leading digit in common), length 1 (prefix 2), length 2 (prefix 2 4), and length 3 (prefix 2 4 7).]
Plaxton Trees Object Insertion and Lookup
Given an object, route successively towards nodes with greater prefix matches
[Figure: object 9 A E 4 is routed from node 2 4 7 B via nodes 9 F 1 0 and 9 A 7 6 to 9 A E 2, each hop matching a longer prefix of the object's label.]
Store the object at each of these locations
It takes O(log n) steps to insert or locate an object
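A small sketch of the routing step just described: forward toward a neighbor whose label shares a longer prefix with the object's label, stopping at the best match (the object's root). In the real scheme a node only advances one digit at a time via its routing table; this sketch simply picks the best-matching neighbor it knows, and the neighbor set mirrors the figure.

```python
def prefix_len(a: list[int], b: list[int]) -> int:
    """Number of leading digits two labels have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current: list[int], obj: list[int], neighbors: list[list[int]]):
    """Forward to a known node with a longer prefix match, or stop (None)
    if the current node is already the best match (the object's root)."""
    here = prefix_len(current, obj)
    better = [n for n in neighbors if prefix_len(n, obj) > here]
    return max(better, key=lambda n: prefix_len(n, obj)) if better else None

# Route object 9 A E 4 from node 2 4 7 B, given the nodes from the figure.
obj = [0x9, 0xA, 0xE, 0x4]
print(next_hop([0x2, 0x4, 0x7, 0xB], obj,
               [[0x9, 0xF, 0x1, 0x0], [0x9, 0xA, 0x7, 0x6], [0x9, 0xA, 0xE, 0x2]]))
# -> [9, 10, 14, 2], i.e. node 9 A E 2, which matches the object on 3 digits
```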
Plaxton Trees Why is it a tree?
[Figure: the object is replicated at the nodes along its insertion path (2 4 7 B, 9 F 1 0, 9 A 7 6, 9 A E 2); because lookups from different nodes converge on nodes with longer matching prefixes, the routes toward an object form a tree rooted at its best-matching node.]
Plaxton Trees Network Proximity
• Overlay tree hops could be totally unrelated to the underlying network hops
[Figure: overlay neighbors may be scattered across the USA, Europe, and East Asia.]
• Plaxton trees guarantee a constant-factor approximation of the distance to a nearby copy, but only when the topology is uniform in some sense
Chord
Map nodes and keys to identifiers
• Using randomizing hash functions
Arrange them on an identifier circle
[Figure: for a point x (e.g., 010110110) on the circle, succ(x) is the next identifier clockwise (010111110) and pred(x) is the previous one (010110000).]
Consistent Hashing
[Figure: a small identifier circle marked 0, 4, 8, 12, with a bucket at position 14.]
Construction
• Assign each of C hash buckets to random points on a mod 2^n circle, where n is the hash key size
• Map each object to a random position on the circle
• Hash of object = closest clockwise bucket
Desired features
• Balanced: no bucket is responsible for a large number of objects
• Smoothness: adding a bucket does not cause major movement among existing buckets
• Spread and load: a small set of buckets lies near each object
• Similar to the scheme later used in P2P Distributed Hash Tables (DHTs); in DHTs, each node has only a partial view of its neighbors
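A minimal consistent-hashing sketch following the construction above: buckets and objects hash onto the same circle, and an object belongs to the closest clockwise bucket. SHA-1 and the 32-bit ring size are illustrative choices, and the bucket and key names are made up.

```python
import hashlib
from bisect import bisect_right

RING = 2 ** 32          # small illustrative id space; the slides use spaces up to 2**128

def h(name: str) -> int:
    # Randomizing hash onto the circle (SHA-1 is an illustrative choice).
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING

class ConsistentHash:
    def __init__(self, buckets):
        # Assign each bucket to a (pseudo-)random point on the circle.
        self.points = sorted((h(b), b) for b in buckets)

    def bucket_for(self, key: str) -> str:
        # Hash of object = closest clockwise bucket (wrapping past the top of the circle).
        ids = [p for p, _ in self.points]
        i = bisect_right(ids, h(key)) % len(self.points)
        return self.points[i][1]

ring = ConsistentHash(["node-A", "node-B", "node-C"])
print(ring.bucket_for("HitSong.mp3"))
```

Adding or removing one bucket only moves the keys that fall between that bucket and its counter-clockwise neighbor, which is the smoothness property listed above.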
Consistent Hashing
Large, sparse identifier space (e.g., 128 bits)
• Hash a set of keys x uniformly to the large id space
• Hash nodes to the id space as well
• Hash(name) -> object_id; Hash(IP_address) -> node_id
• The id space is represented as a ring from 0 to 2^128 - 1
Where to Store (Key, Value) Pair?
Map keys in a load-balanced way
• Store the key at one or more nodes
• Choose nodes with identifiers "close" to the key, where distance is measured in the id space
Advantages
• Even distribution
• Few changes as nodes come and go
Hash a Key to Successor
• Successor: the node with the next-highest ID on the circular ID space
[Figure: on a ring with nodes N10, N32, N60, N80, N100, each key is stored at its successor: K5, K10 -> N10; K11, K30 -> N32; K33, K40, K52 -> N60; K65, K70 -> N80; K100 -> N100.]
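As a quick check of the figure, a tiny script mapping each key ID to the node with the next-highest ID; this just replays the figure's assignment and is not Chord's actual lookup protocol.

```python
from bisect import bisect_left

nodes = [10, 32, 60, 80, 100]             # N10, N32, N60, N80, N100 from the figure

def successor(key: int) -> int:
    # Successor: the node with the next-highest ID (>= key), wrapping around the circle.
    i = bisect_left(nodes, key)
    return nodes[i % len(nodes)]

for k in [5, 10, 11, 30, 33, 40, 52, 65, 70, 100]:
    print(f"K{k} -> N{successor(k)}")      # reproduces the assignment shown in the figure
```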
Joins and Leaves of Nodes
Maintain a circularly linked list around the ring
• Every node has a predecessor and a successor
Successor Lists Ensure Robust Lookup
[Figure: with r = 3, each node on the ring (N5, N10, N20, N32, N40, N60, N80, N99, N110) stores its next three successors, e.g., N5 -> (10, 20, 32), N32 -> (40, 60, 80), N110 -> (5, 10, 20).]
• Each node remembers r successors
• Lookup can skip over dead nodes to find blocks
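A sketch of how an r-entry successor list lets a lookup route around a failed node; the lists below copy the figure (r = 3), and the failure of N40 is an assumed scenario.

```python
# r = 3 successors per node, copied from the figure.
succ_list = {
    5: [10, 20, 32], 10: [20, 32, 40], 20: [32, 40, 60],
    32: [40, 60, 80], 40: [60, 80, 99], 60: [80, 99, 110],
    80: [99, 110, 5], 99: [110, 5, 10], 110: [5, 10, 20],
}
alive = set(succ_list) - {40}             # suppose N40 has just failed

def next_alive(node: int) -> int:
    # Skip dead entries; with r successors, up to r - 1 consecutive failures
    # can be routed around before the ring breaks.
    for s in succ_list[node]:
        if s in alive:
            return s
    raise RuntimeError("all known successors are dead")

print(next_alive(32))                     # 60: the lookup skips the dead N40
```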
Joins and Leaves of Nodes
When an existing node leaves
• The node copies its <key, value> pairs to its successor, which becomes responsible for them
• The node's predecessor is updated to point to the node's successor in the ring
When a node joins
• The node does a lookup on its own id and learns the node responsible for that id
• That node becomes the new node's successor
• The new node also learns its successor's predecessor, which becomes the new node's predecessor
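A hedged sketch of the join procedure just described, with the ring kept as in-memory Python objects rather than real network nodes; the key hand-off from the successor is included, and a naive ring walk stands in for Chord's real routing.

```python
def in_range(x: int, lo: int, hi: int) -> bool:
    # True if x lies in the half-open ring interval (lo, hi], with wraparound.
    if lo < hi:
        return lo < x <= hi
    return x > lo or x <= hi

class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.successor = self        # a single node forms a one-node ring
        self.predecessor = self
        self.store = {}              # the <key, value> pairs this node is responsible for

    def lookup_node(self, key_id: int) -> "Node":
        # Naive O(n) walk around the ring; finger tables (next slides) make this O(log n).
        node = self
        while not in_range(key_id, node.predecessor.id, node.id):
            node = node.successor
        return node

    def join(self, existing: "Node") -> None:
        succ = existing.lookup_node(self.id)        # node currently responsible for our id
        self.successor, self.predecessor = succ, succ.predecessor
        self.predecessor.successor = self           # splice ourselves into the ring
        succ.predecessor = self
        for k in [k for k in succ.store if in_range(k, self.predecessor.id, self.id)]:
            self.store[k] = succ.store.pop(k)       # take over the keys we now own

# Example: N10 forms a ring, then N32 and N60 join; keys end up at their successors.
n10 = Node(10); n10.store.update({5: "a", 30: "b", 52: "c"})
n32 = Node(32); n32.join(n10)
n60 = Node(60); n60.join(n10)
print(sorted(n10.store), sorted(n32.store), sorted(n60.store))   # [5] [30] [52]
```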
Nodes Coming and Going
Small changes when nodes come and go
• Only the keys mapped to the node that comes or goes are affected
How to Find the Nearest Node?
Need to find the node closest to a key
• To determine who should store the (key, value) pair
• To direct a future lookup(key) query to that node
Strawman solution: walk through the linked list
• Circular linked list of nodes in the ring
• O(n) lookup time with n nodes in the ring
Alternative solution: jump further around the ring
• A "finger" table of additional overlay links
Links in the Overlay Topology
Trade-off between the number of hops and the number of neighbors
• E.g., log(n) for both, where n is the number of nodes
• E.g., overlay links that jump 1/2, 1/4, 1/8, ... of the way around the ring
• Each hop then traverses at least half of the remaining distance
Chord “Finger Table” Accelerates Lookups
[Figure: node N80's fingers point 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring.]
Chord lookups take O(log N) hops
[Figure: a lookup for key K19 on the ring N5, N10, N20, N32, N40, N60, N80, N99, N110 follows finger links, halving the remaining distance at each hop, and ends at K19's successor, N20.]
Chord Self-organization
Node join
• Set up finger i: route to succ(n + 2^i)
• log(n) fingers -> O(log^2 n) join cost
Node leave
• Maintain a successor list for ring connectivity
• Update successor lists and finger pointers
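A sketch of finger construction (finger[i] = succ(n + 2^i)) and the "jump to the closest preceding finger" lookup, simulated over the node IDs from the earlier figures; the 7-bit ID space and the hop counting are illustrative assumptions.

```python
M = 7                                    # illustrative 7-bit id space: ids in [0, 2**M)
RING = 2 ** M
nodes = sorted([5, 10, 20, 32, 40, 60, 80, 99, 110])   # node ids from the earlier figure

def successor(x: int) -> int:
    """First node with id >= x, wrapping around the circle."""
    x %= RING
    for n in nodes:
        if n >= x:
            return n
    return nodes[0]

def between(x: int, a: int, b: int, incl_right: bool = False) -> bool:
    """True if x is in the ring interval (a, b), or (a, b] when incl_right."""
    if a < b:
        return a < x < b or (incl_right and x == b)
    return x > a or x < b or (incl_right and x == b)

# finger[i] of node n points at succ(n + 2**i): only log-many links per node.
fingers = {n: [successor(n + 2 ** i) for i in range(M)] for n in nodes}

def find_successor(n: int, key: int, hops: int = 0):
    succ_n = fingers[n][0]                          # finger[0] is the immediate successor
    if between(key, n, succ_n, incl_right=True):
        return succ_n, hops + 1
    for f in reversed(fingers[n]):                  # jump to the closest preceding finger
        if between(f, n, key):
            return find_successor(f, key, hops + 1)
    return find_successor(succ_n, key, hops + 1)

print(find_successor(80, 19))   # -> (20, 3): key K19 lives at its successor N20
```

Each jump at least halves the remaining clockwise distance to the key, which is where the O(log N) hop bound comes from.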
Chord lookup algorithm properties
• Interface: lookup(key) -> IP address
• Efficient: O(log N) messages per lookup, where N is the total number of servers
• Scalable: O(log N) state per node
• Robust: survives massive failures
• Simple to analyze
Comparison of Different DHTs
DHT               # Links per node         Routing hops
Pastry/Tapestry   O(2^b * log_{2^b} n)     O(log_{2^b} n)
Chord             log n                    O(log n)
CAN               d                        d * n^(1/d)
SkipNet           O(log n)                 O(log n)
Symphony          k                        O((1/k) * log^2 n)
Koorde            d                        log_d n
Viceroy           7                        O(log n)
What can DHTs do for us?
Distributed object lookup
• Based on object ID
Decentralized file systems
• CFS, PAST, Ivy
Application-layer multicast
• Scribe, Bayeux, SplitStream
Databases
• PIER
Where are we now?
Many DHTs now offer efficient and relatively robust routing
Unanswered questions
• Node heterogeneity
• Network-efficient overlays vs. structured overlays: a conflict of interest!
• What happens with high user churn rates?
• Security
Are DHTs a panacea?
• A useful primitive
• Tension between network-efficient construction and uniform key-value distribution
• Does every non-distributed application use only hash tables?
  - Many rich data structures cannot be built on top of hash tables alone
  - Exact-match lookups are not enough
• Does any P2P file-sharing system use a DHT?