viet-trung-tran
BlobSeer: Architecture
- Clients: perform fine-grained accesses to BLOBs
- Providers: store the pages of a BLOB
- Provider manager: monitors the providers; favours data load balancing
- Metadata providers: store information about page locations
- Version manager: ensures concurrency control
BlobSeer: What may be refined
- Hotspots / fault tolerance: a fixed, single version manager; a fixed provider manager; fixed metadata providers
- Load balancing: the version manager and the provider manager may become hotspots
BlobSeer: What I am thinking of
Background: a lightweight DHT (terminology may not be exact)
- Uses consistent hashing to distribute keys, giving load balancing, fault tolerance, and elasticity
- Lookup cost: O(1)
  - Based on a gossip overlay (borrowed from the NoSQL world), or on the Kelips P2P prototype (which I have only just learned about)
  - Given a key, a node knows the destination exactly in most cases
  - Overhead is acceptable; see the NoSQL world (Facebook Cassandra, Amazon Dynamo, Voldemort)
- I will try to solve the problems above by building BlobSeer on top of this DHT
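The consistent-hashing idea above can be sketched as follows. This is a minimal illustrative ring, not BlobSeer or Cassandra code; the class name, the MD5 hash, and the virtual-node count are all assumptions made for the sketch.

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    """Minimal consistent-hash ring: keys and nodes share one hash space,
    and a key belongs to the first node clockwise from its hash."""

    def __init__(self, nodes=(), vnodes=8):
        self.vnodes = vnodes   # virtual nodes per server smooth the load
        self.ring = []         # sorted list of (hash, node) points
        for n in nodes:
            self.add(n)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        # each server appears vnodes times on the ring
        for i in range(self.vnodes):
            self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    def lookup(self, key):
        # first ring point at or after the key's hash, wrapping around
        idx = bisect_right(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]
```

Adding or removing a server only moves the keys adjacent to its ring points, which is where the elasticity and load-balancing properties on this slide come from.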
Distributed version managers
- A two-level splitting of the namespace:
  - Splitting the BLOB_ID namespace, DHT-based: fortunately, BLOBs are independent of each other, so hash(BLOB_ID) => ID of the responsible version-manager server
  - Splitting the version-ID space per BLOB is easy; rely on DHT replication: hash(BLOB_ID) => {neighbouring version managers}
- Lookup cost = O(1), equal to BlobSeer's
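The hash(BLOB_ID) => {neighbouring version managers} mapping might look like this. It is a sketch: the sorted-list "ring", SHA-1, and the replica count are assumptions standing in for the real DHT.

```python
import hashlib

def version_managers_for(blob_id, managers, replicas=3):
    """hash(BLOB_ID) picks a primary version manager; its successors on a
    (fixed, sorted) toy ring stand in for the DHT's neighbouring replicas."""
    ring = sorted(managers)            # toy stand-in for the DHT ring order
    h = int(hashlib.sha1(blob_id.encode()).hexdigest(), 16)
    start = h % len(ring)              # primary: hash(BLOB_ID) -> server ID
    return [ring[(start + i) % len(ring)] for i in range(replicas)]
```

Because BLOBs are independent, each blob's version history can live entirely on its own small replica group, with one O(1) hash to find it.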
- Concurrent writes/appends must be serialized on the master: Blob.getlatest(), Blob.write(), Blob.append()
- Access to historical versions goes randomly to {master, slaves}: Blob.read(), Blob.getsize(); ask the master only when necessary
- For serialization, the master periodically PUTs versions, or the slaves PULL them; version info is quite small
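A toy model of this master/slave split is below. The class and method names are assumptions for illustration; the real BlobSeer API differs.

```python
class VersionMaster:
    """Serializes concurrent writes/appends by being the only version assigner."""
    def __init__(self):
        self.versions = []             # [(version, data_ref)] in write order

    def write(self, data_ref):
        v = len(self.versions) + 1     # serialization point: a single counter
        self.versions.append((v, data_ref))
        return v

    def get_latest(self):
        return self.versions[-1][0] if self.versions else 0


class VersionSlave:
    """Serves reads of historical versions from a periodically pulled copy."""
    def __init__(self, master):
        self.master = master
        self.versions = {}

    def pull(self):
        # version info is tiny, so copying the whole log is cheap
        self.versions = dict(self.master.versions)

    def read(self, v):
        return self.versions.get(v)
```

Only the master mutates the version log, so writes stay totally ordered; slaves can lag slightly, which is acceptable for reads of already-published history.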
Eliminating the provider manager
- The provider manager keeps cluster state to answer clients' requests; lookup costs O(1), but it becomes a hotspot as the number of clients and providers grows
- Providers can instead learn the system state themselves (but what about load and load balancing?); lookup still costs O(1)
  - Use the DHT overlay presented above to propagate the providers' states
  - Gossip-based (limited to cluster sizes around 1000, but that is still good), or a lightweight P2P overlay (e.g. Kelips)
- A client randomly asks any provider
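One push-style gossip round over provider states could be sketched as follows. The fanout, the (timestamp, load) state format, and the seeded randomness are assumptions for the sketch.

```python
import random

def gossip_round(views, fanout=2, rng=None):
    """Each provider pushes its view to `fanout` random peers; receivers merge,
    keeping the freshest (highest-timestamp) entry per provider."""
    rng = rng or random.Random(0)          # seeded for a reproducible sketch
    nodes = list(views)
    merged = {n: dict(views[n]) for n in nodes}
    for sender in nodes:
        peers = [p for p in nodes if p != sender]
        for peer in rng.sample(peers, min(fanout, len(peers))):
            for prov, (ts, load) in views[sender].items():
                if prov not in merged[peer] or merged[peer][prov][0] < ts:
                    merged[peer][prov] = (ts, load)
    return merged
```

With a constant fanout, state spreads in O(log N) rounds, which is why gossip scales to the ~1000-node clusters mentioned above without any central provider manager.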
However!
We will now want to use consistent hashing.
Architecture
- Version managers, metadata managers, providers, clients
- DHT with consistent hashing
- Distributed membership management: gossip-based
- ZooKeeper (similar to Google's Chubby): replication, fault tolerance, leader election
Access scenarios
- Reading: hash the BLOB ID to find its version manager, go down the metadata tree, then access the providers; each step is O(1), equal to the current BlobSeer design
- Writing: the same as in BlobSeer, but with no provider manager
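The three read steps can be traced in a sketch where plain dicts stand in for the DHT nodes, the metadata tree, and the providers; all names and data shapes here are assumptions.

```python
import hashlib

def read_path(blob_id, version, version_managers, metadata_tree, providers):
    """Proposed read path: hash(BLOB_ID) -> version manager -> metadata -> pages.
    Each step is one lookup, matching the O(1)-per-step claim."""
    # Step 1: hash the BLOB ID to find its version manager
    h = int(hashlib.sha1(blob_id.encode()).hexdigest(), 16)
    vm = version_managers[h % len(version_managers)]
    root = vm[(blob_id, version)]            # metadata root of this version
    # Step 2: go down the metadata tree to the page descriptors
    page_refs = metadata_tree[root]          # flattened to one level here
    # Step 3: fetch each page from the provider that stores it
    return b"".join(providers[prov][key] for prov, key in page_refs)
```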
Overview of the implementation
- Gossip-based DHT
- We need three hash namespaces: version managers, metadata providers, providers
- Elasticity: inherent if we use consistent hashing for the DHT
- Fault tolerance: DHT-based
- Load balancing: DHT-based
Advantages
- Keeps the current nice features of BlobSeer
- Monolithic design: every node provides all capabilities, acting as a client, a version manager, a metadata manager, and a provider; simpler and easier configuration and deployment (an autonomic feature?)
- Load balancing
- Fault tolerance
- Elasticity
- Compared to NoSQL key/value stores: efficient at mapping one key to a value of TB size (versioning, throughput)
Some more discussions
- If a client is outside the BlobSeer storage cloud, it randomly chooses one node to communicate with; that node acts as a proxy server (as in Cassandra)
- We may need only a small number of version managers and metadata managers, chosen by leader election (which can be based on Apache ZooKeeper); if we fix them, we reduce overhead at the DHT level
BlobSeer in the NoSQL paradigm
- Document stores
- Column stores
{pages} distribution
- BlobSeer's approach: distribute {pages} over different providers; {pages} are mapped directly to the providers' physical addresses
- DHT's approach: the DHT is used only to find who has the {pages}, not to route the {pages} themselves
  - We must find a good way: should the {pages} of a single write be distributed over different providers? [YES or NO] Hopefully yes, since page keys are picked by the client in BlobSeer
  - DHT load balancing, DHT fault tolerance, lookup cost: O(1)
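Because the client picks the page keys, it can choose them so that the pages of one write land on different providers. A sketch of one such scheme; the key format and the round-robin placement are assumptions, not BlobSeer's actual policy.

```python
import hashlib

def place_pages(blob_id, version, n_pages, providers):
    """Client-side placement: hash the write's identity once, then assign the
    write's pages round-robin so they spread over different providers."""
    h = int(hashlib.sha1(f"{blob_id}:{version}".encode()).hexdigest(), 16)
    placement = {}
    for i in range(n_pages):
        page_key = f"{blob_id}/{version}/{i}"   # keys picked by the client
        placement[page_key] = providers[(h + i) % len(providers)]
    return placement
```

Spreading a single write this way preserves BlobSeer's write parallelism while still letting the DHT answer "who has this page" in O(1).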
Eliminating the provider manager (recap)
- As above: providers learn the system state themselves by propagating their states over the DHT overlay, gossip-based (limited to cluster sizes around 1000, but still good) or a lightweight P2P overlay (e.g. Kelips); lookup costs O(1), avoiding the hotspot that a central provider manager becomes as clients and providers grow
- Open question: we need a good way to distribute the {pages} of each separate write operation over the DHT