Plethora: Infrastructure andSystem Design
Introduction• Peer-to-Peer (P2P) networks:
– Self-organizing distributed systems– Nodes receive and provide services cooperatively – No predetermined client/server roles
• Key features:– Scalable– Adaptive and reconfigurable– Leverage technology trends
(network/processor/memory)
• Key problems:– Locating and routing objects efficiently– Consistency management– Fault-Tolerance
Location and Routing - DHT• Apply structure to the network:
– Inputs hashed to a key– Each node responsible for a subset of keys
• Nodes maintain small routing tables
• Queries routed to neighboring nodes that ensure progress towards the ultimate destination.
Location and Routing - DHT0XX
X1XX
X2XX
X3XX
X
2321
2032
2001
0112
START
0112 routes a message
to key 2000.
First hop fixes first digit (2)
Second hop fixes second
digit (20)END
2001 closest live node to
2000.
Motivation
• Virtualization destroys locality.
• Query responses do not contain locality information.
• Recent studies show that queries for multiple keys in P2P networks follow a Zipf-like distribution.
• Exploit geographic locality.
• Build highly-distributed collaborative environments and applications:– information lifecycle– distributed file systems– software distribution– archival storage and disaster recover
IP Addresses as Virtual IDs
• Incorporate locality into overlay networks:– Explore addressing scheme of the underlying network.
• In most cases, nodes with IP addresses that are numerically close are also physically close.
• Organization of the Internet in ASs. By correcting a few bits in each hop, the last hops would be inside an AS.
• Issues:– IP space is not uniformly populated by peers.– Load imbalance at the peers.– The upper bound of O(log n) can no longer be guaranteed.
IP Addresses as Virtual IDs
IP Addresses as Virtual IDs
IP Addresses as Virtual IDs
• 2,420 nodes. 20 keys per node.
Plethora• Two-level overlay
– One global overlay– Several local overlays
• Global overlay is the main repository of data. – Global overlay helps nodes organize themselves into local
overlays.
• Local overlays exploit the organization of the Internet in ASs .
– Size of the local overlay is controlled by an overlay leader.– Uses efficient distributed algorithms for merging and splitting
local overlays.
Cache Organization
Simulation Setup• Internet topology generated using GT-ITM
topology generator.
• 10,000 overlay nodes selected randomly from the hosts.
• NLANR web proxy trace with 500,254 objects.
• Zipf distribution parameters: {0.75, 0.80, 0.85, 0.90, 0.95}
• Local cache size: 5MB (LRU replacement policy).
IP Addresses as Virtual IDs
Simulation Results
Zipf-parameter Cache Hit Ratio Gain
0.75 76% 31.0%
0.80 79% 33.5%
0.85 81% 36.0%
0.90 83% 38.7%
0.95 86% 41.3%
Summary
• IP addresses as virtual IDs:– Overlays with good locality properties.– Non-uniform realworld distribution:
• severe load imbalance• no bounded latency
• Plethora Routing Core:– Two-level overlay architecture.– Local overlays are created to cluster nodes that are close in the
underlying network.
• Significant performance gains– Low maintenance overhead
Latency Hiding
• For large-scale collaborative and distributed applications:– latency effects are still an issue.– need resiliency in the presence of network
failures.
• Record updates using a transactional versioning system:– Aggregate updates– Distributed conflict resolution
Versioning and Transaction Model
T1T1
T2T2 T3T3
TTkk
TTk+1k+1 TTk+2k+2
Global homeGlobal home Local HomeLocal Home
commitcommit
Development Issues
• Implementation of versioning trees– Efficient update and commit protocols – Dealing with failures (node, network)
• Object structure of the repository to exploit versioning semantics
• Guarantees on object access and consistency of updates