1
High Availability, Scalable Storage, Dynamic Peer Networks: Pick Two
Nov. 24, 2003
Byung-Gon Chun
2
Contents
• Introduction
• Basic Model
• Availability and Redundancy
• Discussion
• High Availability, Scalable Storage, Dynamic Peer Networks: Pick Three
3
Introduction
• Peer-to-peer lookup: robust and scalable with dynamic membership
– Does this carry over to robust and scalable storage with dynamic membership?
• Pick two
– Lookup is not the bottleneck
– (Upstream) bandwidth is the limitation
– Disk space grows faster than access bandwidth
4
Basic Model
• Assumptions
– Simple redundancy maintenance mechanism (enter and exit)
– Static data placement strategy (f: RB -> N)
– Identical per-node space and bandwidth contributions
– Constant rate of entering and exiting
– Independence of exit events
– Constant steady-state number of nodes and total data size
– Maintenance bandwidth
• Average case analysis
5
Basic Model
• N: number of hosts
• D: data
• S: data + redundancy (S = kD)
• α: entering rate
• λ: exiting rate (α = λ)
• T: lifetime (T = N/λ)
• B: bandwidth
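As a rough sketch of how these quantities combine, assume that every join and every exit each move one node's share of the stored data, S/N. The aggregate maintenance traffic is then 2 × (exit rate) × S/N, and the per-node bandwidth is 2S/(N·T). That accounting, the function name, and the example numbers are illustrative assumptions, not values taken from the slides.

```python
def per_node_maintenance_bps(data_bytes, k, n_hosts, lifetime_s):
    """Average maintenance bandwidth per node, in bits per second.

    Assumption (not from the slides): each join and each exit moves
    one node's share of the stored data, S/N bytes.
    """
    stored_bytes = k * data_bytes                 # S = kD
    exit_rate = n_hosts / lifetime_s              # exits per second, from T = N / rate
    share_bytes = stored_bytes / n_hosts          # S/N held by each node
    total_bytes_per_s = 2 * exit_rate * share_bytes  # joins + exits
    return 8 * total_bytes_per_s / n_hosts


if __name__ == "__main__":
    # Hypothetical inputs: 1 TB of unique data, k = 20, 10,000 hosts.
    for days in (1, 7, 30):
        bps = per_node_maintenance_bps(1e12, 20, 10_000, days * 86_400)
        print(f"lifetime {days:>2} d: {bps / 1e3:.1f} kbps per node")
```

Under this accounting, per-node bandwidth is inversely proportional to membership lifetime, which is why short-lived members are so costly.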
6
Understanding the Scaling
- Short membership lifetimes: an enormous number of nodes is needed to scale
- How fast can the storage of such systems grow? (k = 20)
7
Availability & Redundancy
• Membership timeout: distinguishes true departures from temporary downtime by delaying the response to failures
• Counting offline hosts as members
– Lifetime is longer
– Hosts serve only a fraction of the time (a: availability)
– More redundancy is needed
– Effective bandwidth is reduced
• Redundancy: replication vs. erasure coding
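To make the replication versus erasure coding trade-off concrete, here is a minimal sketch that assumes hosts are up independently with probability a. Replication needs the smallest k with 1 − (1 − a)^k at or above the target. Coding is modeled as an object split into m fragments and stored as n coded fragments, readable when at least m of them are on available hosts. The function names and the example parameters (a = 0.5, m = 8) are hypothetical and not taken from the slides.

```python
import math


def replicas_needed(a, target):
    """Smallest k such that at least one of k replicas is up with prob >= target."""
    k = 1
    while (1 - a) ** k > 1 - target:
        k += 1
    return k


def coded_availability(a, n, m):
    """Probability that at least m of n fragments sit on available hosts."""
    return sum(math.comb(n, i) * a**i * (1 - a) ** (n - i)
               for i in range(m, n + 1))


def fragments_needed(a, target, m):
    """Smallest n such that any-m-of-n coding reaches the target availability."""
    n = m
    while coded_availability(a, n, m) < target:
        n += 1
    return n


if __name__ == "__main__":
    a, target, m = 0.5, 0.999999, 8   # hypothetical host availability, six nines
    k = replicas_needed(a, target)
    n = fragments_needed(a, target, m)
    print(f"replication: {k} copies, redundancy factor {k}")
    print(f"coding: {n} fragments of size 1/{m}, redundancy factor {n / m:.2f}")
```

The redundancy factor for coding is n/m, since each fragment is 1/m of the object; comparing it with k shows how much storage, and hence maintenance bandwidth, coding saves under these assumptions.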
8
Model
9
Availability & Redundancy
• Gnutella network of 33,000 hosts, 1 TB of data, six nines of data availability
• 30-fold savings from membership timeout
• Additional 8-fold savings from erasure coding
– 75 Kbps maintenance bandwidth per node
– 500 MB of disk contributed per host
• 5,000 of the 33,000 hosts usually available
– Aggregate bandwidth 500 Mbps
– 5 dedicated, reliable PCs with 250 GB drives and 50 Mbps connections, up 99% of the time
10
Membership Timeout
11
Replication vs. Coding
12
Admission Control, Load-Shifting
• Do not admit highly volatile nodes; shift responsibility to non-volatile hosts
• The 5% most available hosts account for 40% of service years
– 30 Kbps per node per unique TB using coding
– 1000-fold savings using delayed response, coding, and admission control
• Still bounded by bandwidth
– 100 Kbps maintenance bandwidth, 3 GB disk space
– 10 universities with 1/3 OC3
• Two million cable modem users at 40% availability ~ 2000 universities with ½ OC3
13
Hardware Trends
• Participation must be more stable for hosts to contribute a meaningful fraction of their disks
14
Incentive Issues
• Stable membership is necessary.
• How to create incentives?
– Added value of service guarantees
– Allow client bandwidth usage only in proportion to contributed bandwidth
-- Prioritizing traffic
15
Discussion
• High availability, scale, dynamic membership: requires high service bandwidth
• Current DHT research trajectory?
• Static membership: small lookup-state optimizations do more harm than good
(another approach: one-hop lookup)
(another approach: distributed directory)
• Dynamic membership: why leverage many flaky nodes to serve data rather than a few reliable ones?
16
Discussion
• Why worry about lookup guarantees if storage guarantees are inappropriate?
• When anonymity or related security properties are a high priority, why not plan to include the defenses from the beginning?
17
Availability
[Bhagwan, Savage, and Voelker 2003]
18
Pick Three
• Distributed directory (DD)
– Uses a level of indirection
– Controls the data placement
– Exploits heterogeneity (availability, lifetime, and bandwidth)
Pick Three!!!
19
Discussion?