Introduction
Scalable Atomic Visibility with RAMP Transactions
Peter Bailis, Alan Fekete², Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica
UC Berkeley and ²University of Sydney
Iskandar Setiadi (13511073)
Advanced Distributed System
Institut Teknologi Bandung
April 21, 2015
April 14, 2023 1
Iskandar Setiadi
Outline
Introduction
Overview and Motivation
Semantic and System Model
RAMP Transaction Algorithms
Experimental Evaluation
Note: Due to time constraints, several additional details and further optimizations are left as an exercise for the reader.
Introduction
Transaction
A sequence of operations performed as a single logical unit of work.
Atomically Visible Transactional Access
Cases where either all or none of each transaction’s effects should be visible. If a transaction T1 writes x = 1 and y = 1, then another transaction T2 should not read x = 1 and y = null.
Current Problems
Scalability and Atomic Visibility
Many traditional transactional mechanisms use two-phase locking and variants of optimistic concurrency control to ensure the correctness of transactions.
In a distributed environment, these algorithms are slow and, under failure, unavailable.
Read Atomic (RA) Isolation
Read Atomic Multi-Partition (RAMP)
This algorithm enforces atomic visibility while offering excellent scalability, guaranteed commit despite partial failures (via synchronization independence), and minimized communication between servers (via partition independence).
RAMP transactions allow reads to “race” writes: a reader can autonomously detect the presence of non-atomic reads and, if necessary, repair them via a second round of communication with servers.
Overview
RAMP uses an atomic commitment protocol (ACP) with non-blocking concurrency control mechanisms: individual transactions can stall due to failures or communication delays without forcing other transactions to stall.
Motivation: Foreign Key Constraints
Facebook and LinkedIn Espresso allow a user to perform a “like” action on a certain message or post.
Violations of atomic visibility may surface as broken bi-directional relationships (such as the friend relationship in Facebook) and dangling references.
Motivation (Cont.)
Secondary Indexing
Searching data via secondary attributes (e.g. birth date) is challenging. Cassandra and Google Megastore offer local secondary indexes, which require contacting every partition for secondary attribute lookups.
Materialized View Maintenance
Example: a mailbox “unread message counter”
Semantic and System Model
Fractured Reads
A transaction T_j exhibits fractured reads if transaction T_i writes versions x_m and y_n (in any order, with x possibly but not necessarily equal to y), T_j reads version x_m and version y_k, and k < n.
Read Atomic (RA) isolation prevents fractured read anomalies and also prevents transactions from reading uncommitted, aborted, or intermediate data (snapshot view).
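The fractured-read definition can be checked mechanically. The following is a minimal sketch, assuming each version carries the timestamp of its writing transaction and the set of items that transaction wrote; the names `Version` and `has_fractured_read` are hypothetical and not taken from the paper.

```python
from typing import Dict, FrozenSet, NamedTuple


class Version(NamedTuple):
    ts: int                    # timestamp of the transaction that wrote this version
    siblings: FrozenSet[str]   # every item written by that same transaction


def has_fractured_read(read_set: Dict[str, Version]) -> bool:
    """Return True if the read set contains x_m and y_k with k < n,
    where the writer of x_m also wrote y_n."""
    for version in read_set.values():
        for sibling in version.siblings:
            other = read_set.get(sibling)
            # The reader saw one write of the transaction (version.ts)
            # but an older version of a sibling item.
            if other is not None and other.ts < version.ts:
                return True
    return False


# T1 (timestamp 1) writes x and y together.
t1 = Version(ts=1, siblings=frozenset({"x", "y"}))
initial_y = Version(ts=0, siblings=frozenset({"y"}))

fractured = has_fractured_read({"x": t1, "y": initial_y})
atomic = has_fractured_read({"x": t1, "y": t1})
```

Here `fractured` is true because the reader observed T1's write to x but a version of y older than T1's, while `atomic` is false.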
RA Implications & Limitations
RA does not prevent concurrent updates or provide serial access to data items.
Example: RA cannot be used to maintain bank account balances. RA is a better fit for the “friend” operation.
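The bank-balance limitation can be illustrated with a lost update. This is a minimal sketch, assuming a “last writer wins” policy over timestamped versions; the helper `last_writer_wins` and the amounts are made up for illustration.

```python
def last_writer_wins(versions):
    """Return the value of the highest-timestamped (ts, value) pair."""
    return max(versions, key=lambda version: version[0])[1]


# (timestamp, balance) versions of a single account.
balance_versions = [(0, 100)]

# Two concurrent transactions both read the same committed balance ...
read_by_t1 = last_writer_wins(balance_versions)
read_by_t2 = last_writer_wins(balance_versions)

# ... and each writes back its own increment. Nothing in RA stops this.
balance_versions.append((1, read_by_t1 + 10))
balance_versions.append((2, read_by_t2 + 20))

final_balance = last_writer_wins(balance_versions)  # 120, not 130
```

Both writes are atomically visible, yet the deposit of 10 is overwritten. A “friend” relationship does not have this problem because each write sets a value instead of modifying one that was read earlier.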
RAMP Transaction Algorithms
Given the specification for RA isolation and scalability, the following algorithms focus on providing read-only and write-only transactions with a “last writer wins” overwrite policy.
3 types:
1. RAMP-Fast (RAMP-F): metadata size is linear in transaction size (not data size)
2. RAMP-Hybrid (RAMP-H): constant-factor metadata
3. RAMP-Small (RAMP-S): constant-factor metadata
RAMP-Fast
One RTT for reads in the stable case; a second RTT is needed only when a read observes a partially committed write
Two RTTs for writes
RAMP-Fast (Cont.)
Write
In the PREPARE phase, each partition adds the write to its local database.
In the COMMIT phase, each partition updates an index containing the highest-timestamped committed version of each item.
Read
Fetch the last committed version of each item, then calculate whether the reader is “missing” any versions.
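The PREPARE, COMMIT, and read steps of RAMP-Fast can be sketched in a few lines. This is an illustrative in-memory model, not the paper's implementation: the `Partition` class, the one-partition-per-item mapping, and all method names are assumptions made for this sketch.

```python
from typing import Dict, FrozenSet, Iterable, NamedTuple, Optional, Tuple


class Version(NamedTuple):
    value: object
    ts: int                    # timestamp of the writing transaction
    siblings: FrozenSet[str]   # RAMP-F metadata: items written together


class Partition:
    """Hypothetical in-memory partition holding prepared versions."""

    def __init__(self) -> None:
        self.versions: Dict[Tuple[str, int], Version] = {}
        self.last_commit: Dict[str, int] = {}

    def prepare(self, item: str, version: Version) -> None:
        # PREPARE: store the version without making it visible.
        self.versions[(item, version.ts)] = version

    def commit(self, item: str, ts: int) -> None:
        # COMMIT: advance the index of the highest committed timestamp.
        self.last_commit[item] = max(self.last_commit.get(item, ts), ts)

    def get(self, item: str, ts: Optional[int] = None) -> Optional[Version]:
        # Without ts: last committed version. With ts: that exact version.
        if ts is None:
            ts = self.last_commit.get(item)
        if ts is None:
            return None
        return self.versions.get((item, ts))


def ramp_fast_read(partitions: Dict[str, Partition],
                   items: Iterable[str]) -> Dict[str, Optional[Version]]:
    # Round 1: fetch the last committed version of every item.
    result = {item: partitions[item].get(item) for item in items}
    # Use sibling metadata to find the newest timestamp each item must have.
    required: Dict[str, int] = {}
    for version in result.values():
        if version is None:
            continue
        for sibling in version.siblings:
            if sibling in result:
                required[sibling] = max(required.get(sibling, version.ts),
                                        version.ts)
    # Round 2 (only if needed): fetch the "missing" versions by timestamp.
    for item, ts in required.items():
        current = result[item]
        if current is None or current.ts < ts:
            result[item] = partitions[item].get(item, ts)
    return result


# T1 (timestamp 1) writes x and y; the COMMIT for y is still in flight.
px, py = Partition(), Partition()
t1_items = frozenset({"x", "y"})
px.prepare("x", Version("x1", 1, t1_items))
py.prepare("y", Version("y1", 1, t1_items))
px.commit("x", 1)
snapshot = ramp_fast_read({"x": px, "y": py}, ["x", "y"])
```

In this run the first round returns T1's x but no committed y, so the reader repairs the read by asking y's partition for timestamp 1 in a second round.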
RAMP-Small
RAMP-S uses constant-size metadata but always requires two RTTs for reads.
First round of reads: fetch the highest committed timestamp for each item from its respective partition.
Second round of reads: retrieve the highest-timestamped version of the item that also appears in the supplied set of timestamps.
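The two read rounds of RAMP-Small can be sketched as follows. This is a simplified in-memory model under stated assumptions: `SmallPartition`, its method names, and the one-partition-per-item mapping are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


class SmallPartition:
    """Hypothetical partition that stores only a timestamp per version."""

    def __init__(self) -> None:
        self.versions: Dict[Tuple[str, int], object] = {}
        self.last_commit: Dict[str, int] = {}

    def prepare(self, item: str, ts: int, value: object) -> None:
        self.versions[(item, ts)] = value

    def commit(self, item: str, ts: int) -> None:
        self.last_commit[item] = max(self.last_commit.get(item, ts), ts)

    def last_committed_ts(self, item: str) -> Optional[int]:
        return self.last_commit.get(item)

    def get_by_ts_set(self, item: str,
                      ts_set: Set[int]) -> Optional[Tuple[int, object]]:
        # Highest-timestamped stored version whose timestamp is in ts_set.
        matching = [ts for (stored_item, ts) in self.versions
                    if stored_item == item and ts in ts_set]
        if not matching:
            return None
        best = max(matching)
        return best, self.versions[(item, best)]


def ramp_small_read(partitions: Dict[str, SmallPartition],
                    items: Iterable[str]):
    items = list(items)
    # Round 1: collect the highest committed timestamp of every item.
    first_round = {item: partitions[item].last_committed_ts(item)
                   for item in items}
    ts_set = {ts for ts in first_round.values() if ts is not None}
    # Round 2: send the whole timestamp set to every partition.
    return {item: partitions[item].get_by_ts_set(item, ts_set)
            for item in items}


sx, sy = SmallPartition(), SmallPartition()
for part, item in ((sx, "x"), (sy, "y")):
    part.prepare(item, 0, item + "0")
    part.commit(item, 0)
# A transaction with timestamp 1 has committed on x's partition only.
sx.prepare("x", 1, "x1")
sy.prepare("y", 1, "y1")
sx.commit("x", 1)
small_snapshot = ramp_small_read({"x": sx, "y": sy}, ["x", "y"])
```

The first round sees timestamp 1 for x and 0 for y; because the set {0, 1} is sent to both partitions, y's partition returns its prepared version at timestamp 1 and the read stays atomic.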
RAMP-Hybrid
RAMP-H Write: store a Bloom filter as the metadata.
RAMP-H Read: Same as RAMP-F, except that this algorithm computes a list of potentially higher-timestamped writes for each item from the Bloom filter. Any potentially missing versions are fetched in a second round of reads.
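How a RAMP-H reader derives the second-round requests from Bloom filters can be sketched as below. The tiny `BloomFilter` class (SHA-256 based, 64 bits, 3 hashes) and the function `timestamps_to_fetch` are assumptions chosen for illustration; the paper does not prescribe these parameters.

```python
import hashlib
from typing import Dict, Iterator, Set, Tuple


class BloomFilter:
    """Minimal Bloom filter; size and hash count are illustrative."""

    def __init__(self, size_bits: int = 64, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key: str) -> Iterator[int]:
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # No false negatives; false positives are possible.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def timestamps_to_fetch(
        first_round: Dict[str, Tuple[int, BloomFilter]]) -> Dict[str, Set[int]]:
    """For each item, collect timestamps of newer versions of other items
    whose Bloom filter suggests the same transaction also wrote this item."""
    to_fetch: Dict[str, Set[int]] = {}
    for item, (ts, _) in first_round.items():
        for other_ts, other_filter in first_round.values():
            if other_ts > ts and other_filter.might_contain(item):
                to_fetch.setdefault(item, set()).add(other_ts)
    return to_fetch


# A transaction with timestamp 1 wrote x and y; the reader saw only its x.
t1_filter = BloomFilter()
t1_filter.add("x")
t1_filter.add("y")
old_filter = BloomFilter()
old_filter.add("y")

second_round = timestamps_to_fetch({"x": (1, t1_filter), "y": (0, old_filter)})
```

Here `second_round` asks y's partition for timestamp 1. If a filter lookup were a false positive, the request would name a timestamp that was never written for that item, and the reader would keep its first-round version.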
RAMP-Hybrid (Cont.)
Safety Properties
A Bloom filter may produce false positives. The appendix proves that a false positive does not compromise the integrity of the result set; with unique timestamps, any read triggered by a false positive returns null.
Experimental Evaluation
RAMP-F, RAMP-H, and often RAMP-S outperform existing solutions across a range of workload conditions while exhibiting overheads typically within 8% and no more than 48% of peak throughput.
Each algorithm is evaluated using the YCSB benchmark on several cr1.8xlarge instances on Amazon EC2 with a workload of 95% reads and 5% writes.
Notation
LWLR: Long write locks and long read locks, providing Repeatable Read isolation (PL-2.99)
LWSR: Long write locks with short read locks, providing Read Committed isolation (PL-2L, ≠ RA)
LWNR: Long write locks with no read locks, providing Read Uncommitted isolation (≠ RA)
NWNR: No locks, base performance for parallelized operations
E-PCI: Eiger system’s 2PC-PCI, where for each transaction a designated “coordinator” server enforces RA isolation
Experimental: CTP Overhead
Cooperative Termination Protocol (CTP)
Some transactions may stall and leave their operations blocked. CTP is used to “free” these stalled operations.
In a real environment, blocked operations are expected at a modest failure rate of 1 in 1000 writes, so the average-case overhead is small.
CTP Reference: P. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley, New York, 1987.