Upload
daniel-gomez-ferro
View
1.366
Download
3
Embed Size (px)
DESCRIPTION
http://yahoo.github.com/omid Many data processing tasks can be thought as small mutations to a large database triggered by events. Contrary to batch processing, the incremental processing model achieves very low delay from the reception of an event to the application of the mutation. In the absence of transactions support, developers have to use ad-hoc mechanisms to ensure atomic execution of mutations despite failures and concurrent accesses to the database by other clients. Most NoSQL data stores, like HBase, BigTable and Cassandra, lack support of transactions, which makes them unsuitable for a whole range of applications. In this talk we present Omid, an open source tool for transactional support and incremental processing on top of HBase (https://github.com/yahoo/omid). Due to the centralized nature of its client-replicated status oracle, Omid (i) avoids distributed locks, (ii) scales up to 60,000 TPS and a thousand clients, (iii) requires no changes to HBase, and (vi) adds a negligible overhead to data servers.
Citation preview
Omid
Efficient Transaction Management and Incremental Processing for HBase
Copyright © 2013 Yahoo! All rights reserved. No reproduction or distribution allowed without express written permission.
Daniel Gómez Ferro 21/03/2013
About me
§ Research engineer at Yahoo!
§ Main Omid developer
§ Apache S4 committer
§ Contributor to: • ZooKeeper • HBase
Omid - Yahoo! Research 2
Outline
§ Motivation
§ Goal: transactions on HBase
§ Architecture
§ Examples
§ Future work
3 Omid - Yahoo! Research
Motivation
§ Traditional DBMS: • SQL • Transactions • Scalable
§ NoSQL stores: • SQL • Transactions • Scalable
§ Some use cases require • Scalability • Consistent updates
Omid - Yahoo! Research 4
Motivation
§ Requested feature • Support for transactions in HBase is a common request
§ Percolator • Google implemented transactions on BigTable
§ Incremental Processing • MapReduce latencies measured in hours • Reduce latency to seconds • Requires transactions
Omid - Yahoo! Research 5
Incremental Processing
State0 State1 Shared State
Offline MapReduce
Online Incremental Processing
Fault tolerance Scalability
Fault tolerance Scalability
Shared state
Using Percolator, Google dramatically reduced crawl-to-index time
Omid - Yahoo! Research 6
Transactions on HBase
Omid
§ ‘Hope’ in Persian
§ Optimistic Concurrency Control system • Lock free • Aborts transactions at commit time
§ Implements Snapshot Isolation
§ Low overhead
§ http://yahoo.github.com/omid Omid - Yahoo! Research 8
Snapshot Isolation
§ Transactions read from their own snapshot
§ Snapshot: • Immutable • Identified by creation time • Values committed before creation time
§ Transactions conflict if two conditions apply: A) Overlap in time B) Write to the same row
Omid - Yahoo! Research 9
Transaction conflicts
r1 r2 r3 r4 Time
§ Transaction B overlaps with A and C • B ∩ A = ∅ • B ∩ C = { r4 } • B and C conflict
§ Transaction D doesn’t conflict Omid - Yahoo! Research 10
Write set:
r3 r4
r2 r4
r1 r3
Id:
A
B
C
D
Time overlap:
Omid Architecture
Omid Clients Omid
Clients
Architecture
1. Centralized server 2. Transactional metadata replicated to clients 3. Store transaction ids in HBase timestamps
HBase Region Servers
Status Oracle (SO)
HBase Region Servers
HBase Region Servers
HBase Region Servers
(1) (2)
(3)
Omid - Yahoo! Research 12
Omid Clients
SO
Status Oracle
§ Limited memory • Evict old metadata • … maintaining consistency guarantees!
§ Keep LowWatermark • Updated on metadata eviction • Abort transactions older than LowWatermark • Necessary for Garbage Collection
Omid - Yahoo! Research 13
Transactional Metadata
Status Oracle
Row LastCommitTS r1 T20 … …
StartTS CommitTS T10 T20 … …
Aborted T2 T18 …
LowWatermark T4
Omid - Yahoo! Research 14
Metadata Replication
§ Transactional metadata is replicated to clients
§ Status Oracle not in critical path • Minimize overhead
§ Clients guarantee Snapshot Isolation • Ignore data not in their snapshot • Talk directly to HBase • Conflicts resolved by the Status Oracle
§ Expensive, but scalable up to 1000 clients
Omid - Yahoo! Research 15
Omid Clients Omid
Clients
Architecture recap
HBase Region Servers
HBase Region Servers
HBase Region Servers
HBase Region Servers
Omid - Yahoo! Research 16
Omid Clients
SO
Row LastCommitTS r1 T20 … …
StartTS CommitTS T10 T20 … …
Aborted T2 T18 …
LowWatermark T4
Status Oracle
Architecture + Metadata
Status Oracle
R LC r1 20
ST CT 10 20
Abort …
LowW …
ST CT 10 20
Abort …
LowW …
Client
Omid - Yahoo! Research 17
Conflict detection (not replicated)
HBase
row TS val r1 10 hi
HBase values tagged with Start TimeStamp
Comparison with Percolator
18 Omid - Yahoo! Research
Omid
Clients HBase
Status Oracle
Percolator
Clients BigTable
TS Server
Simple API
§ TransactionManager: Transaction begin(); void commit(Transaction t) throws RollbackException; void rollback(Transaction t);
§ TTable:
Result get(Transaction t, Get g); void put(Transaction t, Put p); ResultScanner getScanner(Transaction t, Scan s);
Omid - Yahoo! Research 19
Example Transaction
ST CT 10 20
Abort …
LowW …
Client
Transaction Example
Status Oracle
R LC r1 20
ST CT 10 20
Abort …
LowW …
HBase
row TS val r1 10 hi
>> |
Omid - Yahoo! Research 21
Status Oracle
ST CT 10 20
Abort …
LowW …
Client
Transaction Example
R LC r1 20
ST CT 10 20
Abort …
LowW …
>> t = TM.begin() Transaction { ST: 25 } >>
t ST: 25
|
Omid - Yahoo! Research 22
HBase
row TS val r1 10 hi
Status Oracle
ST CT 10 20
Abort …
LowW …
Client
Transaction Example
R LC r1 20
ST CT 10 20
Abort …
LowW …
>> t = TM.begin() Transaction { ST: 25 } >> TTable.get(t, ‘r1’) ‘hi’ >>
t ST: 25
|
Omid - Yahoo! Research 23
HBase
row TS val r1 10 hi
Status Oracle
ST CT 10 20
Abort …
LowW …
Client
Transaction Example
R LC r1 20
ST CT 10 20
Abort …
LowW …
>> t = TM.begin() Transaction { ST: 25 } >> TTable.get(t, ‘r1’) ‘hi’ >> TTable.put(t, ‘r1’, ‘bye’) >>
t ST: 25 R: {r1}
|
Omid - Yahoo! Research 24
HBase
row TS val r1 10 hi r1 25 bye
Status Oracle
ST CT 10 20
Abort …
LowW …
Client
Transaction Example
R LC r1 30
ST CT 25 30
Abort …
LowW 20
>> TTable.get(t, ‘r1’) ‘hi’ >> TTable.put(t, ‘r1’, ‘bye’) >> TM.commit(t) Committed{ CT: 30 } >>
t ST: 25 R: {r1}
|
Omid - Yahoo! Research 25
HBase
row TS val r1 10 hi r1 25 bye
Status Oracle
ST CT 10 20
Abort …
LowW …
Client
Transaction Example
Abort 15
>> t_old Transaction{ ST: 15 } >> TT.put(t_old, ‘r1’, ‘yo’) >> TM.commit(t_old) Aborted >>
t_old ST: 15 R: {r1}
Omid - Yahoo! Research 26
HBase
row TS val r1 15 yo r1 25 bye
R LC r1 30
ST CT 25 30
LowW …
|
Centralized Approach
27 Omid - Yahoo! Research
Omid Performance
§ Centralized server = bottleneck? • Focus on good performance • In our experiments, it never became the bottleneck
0
5
10
15
20
25
30
1K 10K 100K
Late
ncy
in m
s
Throughput in TPS
2 rows4 rows8 rows
16 rows32 rows
128 rows512 rows
Omid - Yahoo! Research 28
512 rows/t 1K TPS 10 ms
2 rows/t 100K TPS
5 ms
Fault Tolerance
§ Omid uses BookKeeper for fault tolerance • BookKeeper is a distributed Write Ahead Log
§ Recovery: reads log’s tail (bounded time)
0
3000
6000
9000
12000
15000
18000
5900 5950 6000 6050 6100
Throughput in TPS
Time in s
Downtime < 25 s
Omid - Yahoo! Research 29
Example Use Case News Recommendation System
News Recommendation System
§ Model-based recommendations • Similar users belong to a cluster • Each cluster has a trained model
§ New article • Evaluate against clusters • Recommend to users
§ User actions • Train model • Split cluster
Omid - Yahoo! Research 31
Transactions Advantages
§ All operations are consistent • Updates and queries
§ Needed for a recommendation system? • We avoid corner cases • Implementing existing algorithms is straightforward
§ Problems we solve • Two operations concurrently splitting a cluster • Query while cluster splits • …
32 Omid - Yahoo! Research
Example overview
33 Omid - Yahoo! Research
Index
Articles & User actions Queries
Index Data Store Wor
kers
Updates
Other use cases
§ Update indexes incrementally • Original Percolator use case
§ Generic graph processing
§ Adapt existing transactional solutions to HBase
34 Omid - Yahoo! Research
Future Work
Current Challenges
§ Load isn’t distributed automatically
§ Complex logic requires big transactions • More likely to conflict • More expensive to retry
§ API is low level for data processing
Omid - Yahoo! Research 36
Future work
§ Framework for Incremental Processing
• Simpler API
• Trigger-based, easy to decompose operations
• Auto scalable
§ Improve user friendliness • Debug tools, metrics, etc
§ Hot failover
Omid - Yahoo! Research 37
Questions?
All time contributors Ben Reed
Flavio Junqueira Francisco Pérez-Sorrosal
Ivan Kelly Matthieu Morel
Maysam Yabandeh
Partially supported by CumuloNimbo project (ICT-257993)
funded by the European Commission
Try it! yahoo.github.com/omid
Backup
39 Omid - Yahoo! Research
Commit Algorithm
§ Receive commit request for txn_i
§ Set CommitTimestamp(txn_i) <= new timestamp
§ For each modified row: • if LastCommit(row) > StartTimestamp(txn_i) => abort
§ For each modified row: • Set LastCommit(row) <= CommitTimestamp(txn_i)
Omid - Yahoo! Research 40