On the Performance of Window-Based Contention Managers for Transactional Memory
Gokarna Sharma and Costas Busch
Louisiana State University
Agenda
• Introduction and Motivation
• Previous Studies and Limitations
• Execution Window Model
• Theoretical Results
• Experimental Results
• Conclusions and Future Directions
Retrospective
• 1993: A seminal paper by Maurice Herlihy and J. Eliot B. Moss: "Transactional Memory: Architectural Support for Lock-Free Data Structures"
• Today: Several STM/HTM implementation efforts by Intel, Sun, IBM; growing attention
• Why TM? Traditional approaches using locks and monitors have many drawbacks: error-prone, difficult to use, poor composability, …
Lock (only one thread can execute): lock data; modify/use data; unlock data
TM (many threads can execute): atomic { modify/use data }
Transactional Memory
• Transactions perform a sequence of read and write operations on shared resources and appear to execute atomically
• TM may allow transactions to run concurrently, but the results must be equivalent to some sequential execution
• ACI(D) properties ensure correctness
Example: initially, x == 1, y == 2
T1: atomic { x = 2; y = x + 1; }
T2: atomic { r1 = x; r2 = y; }
T1 then T2: r1 == 2, r2 == 3
T2 then T1: r1 == 1, r2 == 2
Incorrect interleaving (T2 reads x, then T1 runs, then T2 reads y): r1 == 1, r2 == 3, which matches neither serial order
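The two serial orders and the incorrect interleaving above can be reproduced with a small Python sketch (the dictionary-based state and helper names are illustrative, not part of any TM API):

```python
def t1(state):
    # atomic { x = 2; y = x + 1; }
    state["x"] = 2
    state["y"] = state["x"] + 1

def t2(state):
    # atomic { r1 = x; r2 = y; }
    return state["x"], state["y"]

# Serial order T1 then T2
s = {"x": 1, "y": 2}
t1(s)
print(t2(s))          # (2, 3)

# Serial order T2 then T1
s = {"x": 1, "y": 2}
r = t2(s)
t1(s)
print(r)              # (1, 2)

# Non-serializable interleaving: T2 reads x, T1 runs, T2 reads y
s = {"x": 1, "y": 2}
r1 = s["x"]           # reads old x == 1
t1(s)
r2 = s["y"]           # reads new y == 3
print((r1, r2))       # (1, 3): matches neither serial order
```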
Software TM Systems
• On a conflict, a contention manager (CM) decides whether to abort or delay a transaction
• Centralized or distributed: each thread may have its own CM
Example: initially, x == 1, y == 1
T1: atomic { … x = 2; }
T2: atomic { y = 2; … x = 3; }
T1 and T2 conflict on x. The CM can:
• Abort T1: undo its changes (set x == 1) and restart it, or
• Abort T2: undo its changes (set y == 1) and restart it, or make it wait and retry
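A minimal sketch of the abort-or-wait decision, using a Greedy-style oldest-transaction-wins timestamp rule (the function and field names here are hypothetical, not from any STM library):

```python
def resolve_conflict(attacker, victim):
    """Timestamp-priority rule: the older transaction wins.
    The attacker is the transaction that detected the conflict."""
    if attacker["start_time"] < victim["start_time"]:
        return "abort_victim"   # attacker is older: victim rolls back
    return "wait"               # attacker is younger: it waits and retries

t1 = {"id": "T1", "start_time": 10}
t2 = {"id": "T2", "start_time": 25}
print(resolve_conflict(t1, t2))  # abort_victim
print(resolve_conflict(t2, t1))  # wait
```

The priority rule is the entire difference between many CMs; Polka, Karma, and SizeMatters substitute other notions of priority (work done, object count) for the timestamp.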
Transaction Scheduling
The most common model:
• m concurrent transactions on m cores that share s objects
• Each transaction is a sequence of operations; an operation takes one time unit
• Transaction duration is fixed
Throughput guarantee: makespan, the time needed to commit all m transactions
Competitive ratio: makespan of my CM / makespan of the optimal CM
Problem complexity: NP-hard (related to vertex coloring)
Challenge: how can transactions be scheduled so that makespan is minimized?
[Figure: conflict graph on 8 transactions]
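Since the problem is related to vertex coloring, a greedy coloring of the conflict graph yields one feasible schedule: transactions with the same color are pairwise non-conflicting and can commit in the same round. The sketch below uses an illustrative graph, not one from the presentation:

```python
def greedy_color(conflicts, nodes):
    """Assign each transaction the smallest round number
    not already used by a conflicting neighbor."""
    color = {}
    for v in nodes:
        taken = {color[u] for u in conflicts.get(v, []) if u in color}
        c = 0
        while c in taken:
            c += 1
        color[v] = c
    return color

# Illustrative conflict graph on transactions 1..5
conflicts = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3], 5: []}
rounds = greedy_color(conflicts, [1, 2, 3, 4, 5])
print(rounds)  # {1: 0, 2: 1, 3: 2, 4: 0, 5: 0}
```

Three rounds suffice here: the number of colors used (and hence the makespan) depends on the visiting order, which is exactly why the optimal schedule is NP-hard to find.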
Literature
• Lots of proposals: Polka, Priority, Karma, SizeMatters, …
• Drawbacks:
  • Some need globally shared data (e.g., a global clock)
  • Workload dependent
  • Many have no provable theoretical properties (e.g., Polka, despite overall good empirical performance)
• Mostly empirical evaluation using different benchmarks:
  • The choice of a contention manager significantly affects performance
  • They do not perform well in the worst case (i.e., as contention, system size, and number of threads increase)
Literature on Theoretical Bounds
• Guerraoui et al. [PODC'05]: first contention manager, GREEDY, with an O(s²) competitive bound
• Attiya et al. [PODC'06]: bound of GREEDY improved to O(s)
• Schneider and Wattenhofer [ISAAC'09]: RandomizedRounds with an O(C · log m) bound (C is the maximum degree of a transaction in the conflict graph)
• Attiya et al. [OPODIS'09]: Bimodal scheduler with an O(s) bound for read-dominated workloads
• Sharma and Busch [OPODIS'10]: two algorithms with O(·) and O(·) bounds for balanced workloads
Objectives
Scalable transactional memory scheduling:
• Design contention managers that exhibit both good theoretical and empirical performance guarantees
• Design contention managers that scale well with system size and complexity
[Figure: m threads, each with a sequence of n transactions]
Execution Window Model
• A collection of n sets of m concurrent transactions that share s objects, assuming maximum degree C in the conflict graph and execution time duration τ
• Serialization upper bound: τ · min(Cn, mn)
• One-shot bound: O(sn) [Attiya et al., PODC'06]
• Using RandomizedRounds: O(τ · Cn log m)
Theoretical Results
• Offline algorithm (maximal independent sets):
  For scheduling-with-conflicts environments, e.g., traffic intersection control and the dining philosophers problem
  Makespan: O(τ · (C + n log(mn))), where C is the conflict measure
  Competitive ratio: O(s + log(mn)) whp
• Online algorithm (random priorities):
  For online scheduling environments
  Makespan: O(τ · (C log(mn) + n log²(mn)))
  Competitive ratio: O(s log(mn) + log²(mn)) whp
• Adaptive algorithm:
  Neither the conflict graph nor the maximum degree C is known
  Adaptively guesses C, starting from 1
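The adaptive guessing idea can be sketched as follows (a simplification, not the paper's actual algorithm; the success test is a stand-in for one windowed execution attempt):

```python
import random

def run_with_guess(guess_c, true_c, rng):
    """Hypothetical trial: succeeds only if the guess covers the
    true conflict degree, with some randomness in the outcome."""
    return guess_c >= true_c and rng.random() < 0.9

def adaptive_schedule(true_c, rng):
    """Start from a degree estimate of 1 and double it on each failure."""
    guess = 1
    attempts = 0
    while not run_with_guess(guess, true_c, rng):
        guess *= 2
        attempts += 1
    return guess, attempts

rng = random.Random(42)
guess, attempts = adaptive_schedule(true_c=13, rng=rng)
print(guess)  # 16: the first power of two >= 13 (with this seed)
```

Doubling means the final guess overshoots the unknown C by at most a factor of 2, and only O(log C) failed attempts are spent reaching it.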
Intuition (1)
• Introduce random delays at the beginning of the execution window
[Figure: each thread's window of n transactions shifted by a random interval n′]
• Random delays help conflicting transactions shift, avoiding many conflicts
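A sketch of the random-delay step (parameters illustrative): each thread independently picks a random start offset, so transactions that would conflict tend to start in different frames:

```python
import random

def assign_delays(num_threads, max_delay, rng):
    """Each thread independently picks a start offset in [0, max_delay),
    measured in frames, before beginning its execution window."""
    return [rng.randrange(max_delay) for _ in range(num_threads)]

rng = random.Random(1)
delays = assign_delays(num_threads=8, max_delay=4, rng=rng)
print(delays)
# Threads with different offsets begin their windows in different frames,
# so conflicting transactions are less likely to run concurrently.
```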
Intuition (2)
• Frame-based execution to handle conflicts
[Figure: m threads with random offsets q1, …, qm, each executing frames F11 … Fmn of a fixed frame size]
• Makespan: max{qi} + (number of frames) × (frame size)
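The makespan bound above can be evaluated directly from the random offsets q_i and the frame schedule (numbers illustrative):

```python
def window_makespan(random_delays, num_frames, frame_size):
    """Makespan bound for frame-based execution: the latest random
    start offset plus the time to run all frames back to back."""
    return max(random_delays) + num_frames * frame_size

# Illustrative: 4 threads with offsets q1..q4, n = 5 frames of size 8
print(window_makespan([3, 7, 2, 5], num_frames=5, frame_size=8))  # 47
```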
Experimental Results (1)
• Platform: Intel i7 (4-core processor) with 8 GB RAM and hyperthreading enabled
• Implemented the window algorithms in DSTM2, an eager-conflict-management STM implementation
• Benchmarks: List, RBTree, SkipList, and Vacation (the latter from the STAMP suite)
• Each experiment ran for 10 seconds; the plotted data are averages over 6 runs
• Contention managers used for comparison:
  • Polka: the best published CM, but with no provable theoretical properties
  • Greedy: the first CM with both theoretical and empirical properties
  • Priority: a simple priority-based CM
Experimental Results (2)
Performance throughput:
• Number of transactions committed per second
• Measures the useful work done by a CM in each time step
[Chart: List Benchmark: committed transactions/sec vs. number of threads (0 to 35); series: Polka, Greedy, Priority, Online, Adaptive]
[Chart: SkipList Benchmark: committed transactions/sec vs. number of threads (0 to 35); series: Polka, Greedy, Priority, Online, Adaptive]
Experimental Results (3)
[Chart: RBTree Benchmark: committed transactions/sec vs. number of threads; series: Polka, Greedy, Priority, Online, Adaptive]
[Chart: Vacation Benchmark: committed transactions/sec vs. number of threads; series: Polka, Greedy, Priority, Online, Adaptive]
Performance throughput:
• Conclusion #1: Window CMs always improve throughput over Greedy and Priority
• Conclusion #2: Throughput is comparable to Polka (and outperforms it on Vacation)
Experimental Results (4)
Aborts-per-commit ratio:
• Number of transactions aborted per committed transaction
• Measures the efficiency of a CM in utilizing computing resources
[Chart: List Benchmark: aborts per commit vs. number of threads; series: Polka, Greedy, Priority, Online, Adaptive]
[Chart: SkipList Benchmark: aborts per commit vs. number of threads; series: Polka, Greedy, Priority, Online, Adaptive]
Experimental Results (5)
Aborts-per-commit ratio:
[Chart: Vacation Benchmark: aborts per commit vs. number of threads; series: Polka, Greedy, Priority, Online, Adaptive]
[Chart: RBTree Benchmark: aborts per commit vs. number of threads; series: Polka, Greedy, Priority, Online, Adaptive]
• Conclusion #3: Window CMs always reduce the number of aborts compared to Greedy and Priority
• Conclusion #4: The number of aborts is comparable to Polka (and lower on Vacation)
Experimental Results (6)
Execution time overhead:
• Total time needed to commit all transactions
• Measures the scalability of a CM in different contention scenarios
[Chart: List Benchmark: total execution time (seconds) vs. amount of contention (Low, Medium, High); series: Polka, Greedy, Priority, Online, Adaptive]
[Chart: SkipList Benchmark: total execution time (seconds) vs. amount of contention (Low, Medium, High); series: Polka, Greedy, Priority, Online, Adaptive]
Experimental Results (7)
Execution time overhead:
[Chart: RBTree Benchmark: total execution time (seconds) vs. amount of contention (Low, Medium, High); series: Polka, Greedy, Priority, Online, Adaptive]
[Chart: Vacation Benchmark: total execution time (seconds) vs. amount of contention (Low, Medium, High); series: Polka, Greedy, Priority, Online, Adaptive]
• Conclusion #5: Window CMs generally reduce execution time compared to Greedy and Priority (except on SkipList)
• Conclusion #6: Window CMs perform well at high contention, where the randomization overhead pays off
Future Directions
• Encouraging theoretical and practical results
• Plan to explore (experimentally): wasted work, repeat conflicts, average response time, and average committed-transaction durations
• Plan to run experiments on more complex benchmarks, e.g., STAMP and STMBench7, and on other STM implementations
• Plan to explore (theoretically): other contention managers with both theoretical and empirical guarantees
Conclusions
• TM contention management is an important online scheduling problem
• Contention managers should scale with the size and complexity of the system
• Theoretical as well as practical performance guarantees are essential for design decisions
• Need to explore mechanisms that scale well on other multi-core architectures (ccNUMA and hierarchical multilevel cache architectures) and in large-scale distributed systems
Thank you for your attention!