Critical Sections: Re-emerging Concerns for DBMS
Ryan Johnson Ippokratis Pandis
Anastasia Ailamaki
Carnegie Mellon University
École Polytechnique Fédérale de Lausanne
© 2008 Ryan Johnson
A Convergence of Trends
OLTP fits in memory [stonebraker07]
– 256GB RAM = 80M TPC-C customers
– Amazon.com had 30M customers in 2005
Multi-core computing
– Dozens of HW contexts per chip today
– Expect 2x more cores each generation
Scalability concerns replace I/O bottleneck
Potential Scalability Bottlenecks
Hardware
DBMS
Applications
✓
✓
??
DBMS Scalability Comparison
Current engines face internal scalability challenges
[Figure: insert-only microbenchmark on a Sun T2000 (32 HW threads). Time per transaction (usec, 0-1000; lower is better) vs. threads (0-32) for DBMS "X", Postgres, MySQL, BerkeleyDB, and Shore.]
Contributions
Evaluate common synchronization approaches, identify most useful ones
Highlight performance impact of tuning database engines for scalability
Outline
Introduction
Scalability and Critical Sections
Synchronization Techniques
Shore-MT
Conclusion
Sources of Serialization
Concurrency control
– Serialize conflicting logical transactions
– Enforce consistency and isolation
Latching
– Serialize physical page updates
– Protect page integrity
Critical sections (semaphores)
– Serialize internal data structure updates
– Protect data structure integrity
Critical sections largely overlooked
Critical Sections in DBMS
Fine-grained parallelism vital to scalability
Transactions require tight synchronization
– Many short critical sections
– Most uncontended, some heavily contended
TPC-C Payment
– Accesses 4-6 records
– Enters ~100 critical sections in Shore
Cannot ignore critical section performance
Example: Index Probe
[Figure: timeline of an index probe, showing the locks, latches, and critical sections entered over time.]
Many short critical sections per operation
Related Work
Concurrency control
– Locking [got92]
– BTrees [moh90]
– Often assumes single hardware context
Latching
– “Solved problem” [agr87]
– Table-based latching [got92]
Synchronization Techniques
Lock-based Synchronization
Blocking Mutex
✓ Simple to use ✗ Overhead, unscalable
Test and set Spinlock
✓ Efficient ✗ Unscalable
Queue-based spinlock
✓ Scalable ✗ Mem. management
Reader-writer lock
✓ Concurrent readers ✗ Overhead
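As an illustration of the test-and-set family listed above, here is a minimal sketch of a test-and-test-and-set (TATAS) spinlock using C++ `std::atomic`. The names `TatasLock` and `count_with_lock` are hypothetical and not taken from the talk or from Shore; the sketch only shows the general technique, not the implementation that was measured.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Test-and-test-and-set spinlock (illustrative sketch, not Shore's code).
class TatasLock {
public:
    void acquire() {
        for (;;) {
            // Atomic exchange returns the previous value: false means we got the lock.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // Lock is held: spin on a plain load so waiters do not keep
            // issuing atomic writes to the contended cache line.
            while (locked_.load(std::memory_order_relaxed)) {
                std::this_thread::yield();
            }
        }
    }
    void release() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical helper: `threads` workers each increment a shared counter
// `per_thread` times inside the critical section.
long count_with_lock(int threads, long per_thread) {
    TatasLock lock;
    long counter = 0;  // protected by `lock`
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            for (long k = 0; k < per_thread; ++k) {
                lock.acquire();
                ++counter;
                lock.release();
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```

All waiters spin on the same word, which is one plausible reading of the "Unscalable" mark: every release invalidates that cache line for every waiter.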
Lock-free Synchronization
Optimistic Concurrency Control (OCC)
✓ No read overhead ✗ Writes cause livelock
Atomic Updates
✓ Efficient ✗ Limited applicability
Lock-free Algorithms
✓ Scalable ✗ Special-purpose algs
Hardware Approaches (e.g. transactional memory)
✓ Efficient, scalable ✗ Not widely available
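The OCC entry above can be sketched as a version-validated read: readers take no lock, snapshot the data, and retry if a version counter changed underneath them. The class below is an assumption-laden illustration (the name `VersionedPair`, the two-field payload, and the single-writer restriction are all hypothetical), not the OCC variant evaluated in the talk.

```cpp
#include <atomic>
#include <utility>

// Optimistic (version-validated) reads over a two-field record.
// Assumption: at most one writer at a time; concurrent writers would need
// their own mutual exclusion.
class VersionedPair {
public:
    void write(long a, long b) {
        unsigned v = version_.load(std::memory_order_relaxed);
        version_.store(v + 1, std::memory_order_relaxed);  // odd: update in progress
        std::atomic_thread_fence(std::memory_order_release);
        a_.store(a, std::memory_order_relaxed);
        b_.store(b, std::memory_order_relaxed);
        version_.store(v + 2, std::memory_order_release);  // even: stable again
    }

    std::pair<long, long> read() const {
        for (;;) {
            unsigned before = version_.load(std::memory_order_acquire);
            long a = a_.load(std::memory_order_relaxed);
            long b = b_.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire);
            unsigned after = version_.load(std::memory_order_relaxed);
            // Accept the snapshot only if no write started or finished meanwhile.
            if (before == after && (before & 1u) == 0) return {a, b};
        }
    }

private:
    std::atomic<unsigned> version_{0};
    std::atomic<long> a_{0};
    std::atomic<long> b_{0};
};
```

The retry loop in `read()` is where the slide's downside shows up: a steady stream of writes can force readers to retry indefinitely.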
Experimental Setup
Hardware
– Sun T2000 “Niagara” server
– 8 cores, 4 threads each (32 total)
Microbenchmark:
while (!timeout_flag) {
    delay_ns(t_out);  // work outside the critical section
    acquire();
    delay_ns(t_in);   // work inside the critical section
    release();
}
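A runnable sketch of this microbenchmark loop might look as follows. It assumes `std::mutex` as the primitive under test and a busy-wait `delay_ns`; the function `run_benchmark` and its parameters are hypothetical, and the harness actually used for the measurements is not shown in the slides.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Busy-wait for roughly `ns` nanoseconds (stand-in for the slide's delay_ns).
void delay_ns(long ns) {
    auto end = std::chrono::steady_clock::now() + std::chrono::nanoseconds(ns);
    while (std::chrono::steady_clock::now() < end) {
    }
}

// Runs the loop on `threads` threads for about `run_ms` milliseconds and
// returns the total number of completed iterations.
long run_benchmark(int threads, long t_out, long t_in, int run_ms) {
    std::mutex primitive;  // assumption: the primitive being measured
    std::atomic<bool> timeout_flag{false};
    std::atomic<long> iterations{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            long local = 0;
            while (!timeout_flag.load(std::memory_order_relaxed)) {
                delay_ns(t_out);   // work outside the critical section
                primitive.lock();  // acquire()
                delay_ns(t_in);    // work inside the critical section
                primitive.unlock();  // release()
                ++local;
            }
            iterations.fetch_add(local);
        });
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(run_ms));
    timeout_flag.store(true);
    for (auto& t : pool) t.join();
    return iterations.load();
}
```

Dividing `threads * run_ms * 1e6` by the returned count gives an approximate cost per iteration in nanoseconds, which can be compared against `t_out + t_in`.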
Critical Section Overhead
[Figure: Scalability vs. Duration. Cost per iteration (nsec, 0-800) vs. critical section duration (nsec, 0-300) for ideal, tatas, mcs, ppmcs, and pthread; 16 threads, t_out = t_in. The range of durations typical of DBMS critical sections is marked.]
Critical sections are 60-90% overhead
Scalability Under Contention
[Figure: Scalability vs. Contention. Cost per iteration (nsec, 0-800) vs. threads (0-32) for ideal, tatas, mcs, ppmcs, and pthread; t_out = t_in = 0ns.]
TATAS or MCS best depending on contention
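For comparison with TATAS, the following is a minimal sketch of an MCS-style queue lock, in which each waiter spins on a flag in its own queue node instead of on a shared word. The names `McsNode`, `McsLock`, and `mcs_count` are hypothetical; this is a textbook-style illustration, not the "mcs" or "ppmcs" code measured in the talk.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Per-thread queue node; the caller owns its storage.
struct McsNode {
    std::atomic<McsNode*> next{nullptr};
    std::atomic<bool> waiting{false};
};

class McsLock {
public:
    void acquire(McsNode* me) {
        me->next.store(nullptr, std::memory_order_relaxed);
        me->waiting.store(true, std::memory_order_relaxed);
        // Append ourselves to the queue; the old tail is our predecessor.
        McsNode* pred = tail_.exchange(me, std::memory_order_acq_rel);
        if (pred != nullptr) {
            pred->next.store(me, std::memory_order_release);
            // Spin on our own node only.
            while (me->waiting.load(std::memory_order_acquire)) {
                std::this_thread::yield();
            }
        }
    }

    void release(McsNode* me) {
        McsNode* succ = me->next.load(std::memory_order_acquire);
        if (succ == nullptr) {
            // No visible successor: try to mark the queue empty.
            McsNode* expected = me;
            if (tail_.compare_exchange_strong(expected, nullptr,
                                              std::memory_order_acq_rel)) {
                return;
            }
            // A successor is mid-enqueue; wait until it links itself in.
            while ((succ = me->next.load(std::memory_order_acquire)) == nullptr) {
                std::this_thread::yield();
            }
        }
        succ->waiting.store(false, std::memory_order_release);  // hand off
    }

private:
    std::atomic<McsNode*> tail_{nullptr};
};

// Hypothetical helper: each worker keeps its queue node on its own stack.
long mcs_count(int threads, long per_thread) {
    McsLock lock;
    long counter = 0;  // protected by `lock`
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&] {
            McsNode node;
            for (long k = 0; k < per_thread; ++k) {
                lock.acquire(&node);
                ++counter;
                lock.release(&node);
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```

The need to supply and manage a node per acquisition is the kind of bookkeeping the earlier "Mem. management" mark refers to; the uncontended path is also longer than a single exchange, which is consistent with TATAS winning at low contention.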
Reader-writer Performance
[Figure: Cost per iteration (nsec, 0-800) vs. average reads per write (1-100, log scale) for ideal, tatas, mcs, tatas_rwlock, mcs_rwlock, and occ; 16 threads, t_out = t_in = 100ns. Annotation: DBMS spans gamut.]
Reader-writer locks too expensive to be useful
Selecting a Primitive
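[Figure: decision chart for choosing a primitive by critical section length (short or long), contention (contended or uncontended), and access pattern (read-mostly). Primitives shown: Mutex, OCC, MCS, TAS, and lock-free.]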
A handful of primitives covers most cases
Alleviating Contention
Modify algorithms
– Shrink/distribute/eliminate critical sections
– Fundamental scalability improvements
Tune existing critical sections
– Reduce overheads
– Straightforward and localized changes
Both approaches vital for scalability
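As a small illustration of "distributing" a critical section, the sketch below splits one lock-protected counter into several independently locked shards, so threads that map to different shards never contend. The names `ShardedCounter` and `sharded_count`, the shard count of 8, and the 64-byte alignment are all assumptions made for the example; it is not one of the Shore-MT changes.

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// One logical counter split into independently locked shards.
class ShardedCounter {
public:
    void add(std::size_t thread_id, long delta) {
        Shard& s = shards_[thread_id % kShards];
        std::lock_guard<std::mutex> guard(s.mutex);  // short, mostly uncontended
        s.value += delta;
    }

    // Sums the shards one at a time; this is not an atomic snapshot
    // if other threads are still adding.
    long total() {
        long sum = 0;
        for (Shard& s : shards_) {
            std::lock_guard<std::mutex> guard(s.mutex);
            sum += s.value;
        }
        return sum;
    }

private:
    static constexpr std::size_t kShards = 8;  // assumed shard count
    // Assumed 64-byte cache lines: keep shards from sharing a line.
    struct alignas(64) Shard {
        std::mutex mutex;
        long value = 0;
    };
    std::array<Shard, kShards> shards_;
};

// Hypothetical helper: each worker adds 1 to its own shard `per_thread` times.
long sharded_count(int threads, long per_thread) {
    ShardedCounter counter;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&counter, i, per_thread] {
            for (long k = 0; k < per_thread; ++k) {
                counter.add(static_cast<std::size_t>(i), 1);
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter.total();
}
```

The trade-off is that reads of the full value become more expensive, so this kind of change suits structures that are updated far more often than they are read as a whole.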
From Shore to Shore-MT
Tuning and algorithmic changes at each step
[Figure: Throughput (tps, log scale) vs. concurrent threads (0-32) for successive optimization steps: baseline, dist-bpool, tune-locks, malloc, log2, lockm, bpool2, and final, compared against ideal.]
Conclusions
Only a few types of primitives useful
Algorithms and tuning both essential to performance/scalability
Open issues
– Developing ever-finer grained algorithms
– Reduce synchronization overhead
– Improve usability of reader-writer locks
– Efficient lock-free algorithms
Plenty of room for improvements by future research
Bibliography
[agr87] R. Agrawal, M. Carey, and M. Livny. “Concurrency Control Performance Modeling: Alternatives and Implications.” ACM TODS 12(4):609-654, 1987.
[car94] M. Carey, et al. “Shoring Up Persistent Applications.” SIGMOD Record 23(2):383-394, 1994.
[got92] V. Gottemukkala and T. Lehman. “Locking and Latching in a Memory-Resident Database System.” In Proc. VLDB’92.
[moh90] C. Mohan. “Commit-LSN: A Novel and Simple Method for Reducing Locking and Latching in Transaction Processing Systems.” In Proc. VLDB’90.
Thank You!