David Choffnes, Winter 2006
OceanStore
Maintenance-Free Global Data Storage, S. Rhea, C. Wells, P. Eaton, D. Geels, B. Zhao, H. Weatherspoon, J. Kubiatowicz, IEEE Internet Computing, 5(5):40-49, 2001.
CS 395/495 Autonomic Computing Systems, EECS, Northwestern University
Data makes the world go round
We’re addicted to persistent storage
– Wouldn’t it be great if it would follow us globally?
– And automatically make itself resilient to failures?
– But that would require 1,000s or millions of PCs!
OceanStore
– A global data store that manages itself
– Scales to billions of users and exabytes of data
– Features:
  • Durability
  • Resistance to attack/failures
  • Fault tolerant
  • Churn-resistant
  • Catchy name
A bit of this, a byte of that
Routing messages/data
– Self-maintaining (DHT)
Durability
– m-of-n erasure encoding
Security/Fault tolerance
– Byzantine updates, secure hashes, encryption
Availability
– Introspective replica management
Don’t call it a comeback
Erasure codes
– Split the object into m fragments and encode them into n fragments, n > m (storage grows by a factor of n/m)
– Encode such that any m of the n fragments can reconstruct the entire object
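To make the "any subset of fragments rebuilds the object" property concrete, here is a minimal sketch of an m-of-n erasure code. It is not the coding scheme OceanStore uses; it is a toy polynomial code over the prime field GF(257), chosen only because it fits on a slide. The names `encode`, `decode`, and `_interpolate` are hypothetical, and the sketch assumes n <= 257 and Python 3.8+ (for the modular inverse via `pow`).

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, with arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, m, n):
    """Return n fragments (lists of ints in 0..256).
    Fragments 0..m-1 carry the data bytes themselves; the rest are redundancy."""
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    fragments = [[] for _ in range(n)]
    for start in range(0, len(padded), m):
        # Treat each group of m bytes as polynomial values at x = 0..m-1.
        group = list(enumerate(padded[start:start + m]))
        for k in range(n):
            fragments[k].append(_interpolate(group, k))
    return fragments


def decode(available, m, length):
    """Rebuild the original bytes from any m fragments,
    passed as a dict {fragment index: list of symbols}."""
    chosen = list(available.items())[:m]
    if len(chosen) < m:
        raise ValueError("need at least m fragments")
    out = bytearray()
    for pos in range(len(chosen[0][1])):
        points = [(k, symbols[pos]) for k, symbols in chosen]
        # Re-evaluating at x = 0..m-1 recovers the original data bytes.
        out.extend(_interpolate(points, x) for x in range(m))
    return bytes(out[:length])
```

For example, with m = 4 and n = 8, `decode({k: frags[k] for k in (1, 4, 6, 7)}, 4, len(data))` should return the original `data` even though half of the fragments are missing. A real deployment would use a faster code and byte-sized symbols.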
An object by any other GUID would smell as sweet
Each object is assigned a GUID based on a 160-bit SHA-1 hash
OceanStore supports versioning, so each object has an active GUID that points to a list of GUIDs, one per version
– Each version GUID names the root of a B-tree of links to chunks
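A rough sketch of how hash-based naming could look, using Python's standard `hashlib`. The function names are hypothetical. Deriving the active GUID from the owner's public key plus a name is an assumption for illustration, and the flat list of block GUIDs is a stand-in for the B-tree on the slide.

```python
import hashlib


def block_guid(block):
    """GUID of a read-only chunk: the SHA-1 hash of its contents (160 bits)."""
    return hashlib.sha1(block).hexdigest()


def version_guid(child_guids):
    """GUID of one version: a hash over the GUIDs of its chunks.
    Simplification: a flat list replaces the B-tree of links."""
    return hashlib.sha1("".join(child_guids).encode("ascii")).hexdigest()


def active_guid(owner_public_key, name):
    """Stable name for the object across versions (assumed derivation)."""
    return hashlib.sha1(owner_public_key + name.encode("utf-8")).hexdigest()


# Two versions of one object that differ in a single chunk.
v1 = version_guid([block_guid(b"chunk A"), block_guid(b"chunk B")])
v2 = version_guid([block_guid(b"chunk A"), block_guid(b"chunk B, edited")])
versions = {active_guid(b"owner-key-bytes", "notes.txt"): [v1, v2]}
```

Because a GUID is a hash of the content it names, anyone holding the GUID can verify the data a server returns, and unchanged chunks keep the same GUID across versions.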
Magic Tapestry Ride
Tapestry = DHT
Allows nodes to join and leave network relatively seamlessly
Named objects found in a bounded number of hops (logarithmic in network size)
OceanStore uses multiple “root nodes” for each object
– More redundancy
– Lower latency
An inner ring of servers is made responsible for each object
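The hop bound comes from resolving the target ID roughly one digit at a time. The sketch below illustrates that idea only: it is a global-knowledge simulation, not Tapestry's actual algorithm (real nodes keep per-digit routing tables and handle missing entries with surrogate routing). The node IDs and the `route` helper are made up for illustration.

```python
def shared_prefix(a, b):
    """Number of leading digits two IDs have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def route(nodes, start, target):
    """Hop to nodes that match the target in ever longer prefixes.
    Stops at the node with the best match, which acts as the root."""
    path = [start]
    current = start
    while True:
        have = shared_prefix(current, target)
        better = [n for n in nodes if shared_prefix(n, target) > have]
        if not better:
            return path
        # Take the smallest improvement to mimic one digit per hop.
        current = min(better, key=lambda n: (shared_prefix(n, target), n))
        path.append(current)


nodes = ["4227", "42a2", "4a6d", "437a", "b4f8", "42a7"]
path = route(nodes, "b4f8", "42a7")
```

Each hop matches at least one more digit, so a route through IDs of d digits takes at most d hops; here `path` visits at most four nodes after the start.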
He says, she says
Possibility of faulty, malicious servers
– Byzantine protocol ensures correctness if fewer than 1/3 of servers are faulty/misbehaving
  • New technique vastly reduces the n^2 message complexity
  • Cached data is signed
  • Proactive threshold signatures allow the same public key to be kept despite inner ring membership changes
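The one-third bound can be turned into a quick back-of-the-envelope calculation. This is the standard Byzantine agreement arithmetic (n >= 3f + 1), not code from OceanStore; the helper names are hypothetical.

```python
def max_faulty(n):
    """Largest number of faulty servers f tolerated by n servers (n >= 3f + 1)."""
    if n < 1:
        raise ValueError("need at least one server")
    return (n - 1) // 3


def quorum(n):
    """Matching replies needed so that any two quorums share at least
    one correct server: ceil((n + f + 1) / 2), which is 2f + 1 when n = 3f + 1."""
    f = max_faulty(n)
    return (n + f) // 2 + 1


def all_to_all_messages(n):
    """Messages in one round where every server sends to every other server."""
    return n * (n - 1)
```

For example, an inner ring of 4 servers tolerates 1 faulty member and needs 3 matching replies, while a ring of 7 tolerates 2. The quadratic message count is why the inner ring is kept small.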
Know thyself
Introspection
– Servers measure independence of failure rates and change encoding rate appropriately
– Auto repair: root node can check redundancy, regenerate/redistribute blocks
– Durability: sweep/repair
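Why independence of failures matters for the encoding rate can be seen with a small calculation. The sketch assumes fragments fail independently with the same probability, which is exactly the assumption introspection would have to check. The function names and the numbers are illustrative and are not measurements from the paper.

```python
from math import comb


def survival_probability(m, n, p_fail):
    """Probability that at least m of n fragments survive,
    assuming independent failures with probability p_fail each."""
    return sum(
        comb(n, k) * (1 - p_fail) ** k * p_fail ** (n - k)
        for k in range(m, n + 1)
    )


def fragments_needed(m, p_fail, target, n_max=256):
    """Smallest n whose m-of-n survival probability meets the target,
    or None if no n up to n_max is enough."""
    for n in range(m, n_max + 1):
        if survival_probability(m, n, p_fail) >= target:
            return n
    return None
```

With a 10% per-fragment failure probability, `survival_probability(16, 32, 0.1)` is far closer to 1 than `survival_probability(1, 2, 0.1)`, although both cost twice the original storage. A sweep/repair pass could call something like `fragments_needed` with freshly measured failure rates to decide how many fragments to regenerate.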
Put it all together
Some Performance Numbers