12 Transactions

Transactional Memory

David Chisnall

March 10, 2011

What is a Transaction?

• Set of operations

• Operates atomically

• Either all succeed, or all fail

Author's Note

Comment

Transactions are a way of gouping a set of operations so that they appear - to anything outside the transaction - as an atomic operation.

History: Databases

ACID compliance (simplified):

Atomicity - groups of commands must execute as a singleoperation

Consistency - the database must never expose partially completedoperations

Isolation - concurrent operations should not have to worryabout each other

Durability - the database shouldn’t randomly corrupt data

Author's Note

Comment

Databases have supported transactions for a long time. This is one of the main reasons why they are used as the back end for large systems - they provide a programmer model that deals well with thousands of clients trying to modify the same data simultaneously.

Transactions in Databases

�BEGIN TRANSACTION;

SELECT {stuff};

UPDATE {more stuff};

UPDATE {even more stuff};

END TRANSACTION; � �• The statements in the transaction that modify the database

either all complete or all fail

• If they fail, you can detect this and retry

Author's Note

Comment

Transactions never partially finish, and no code outside of the database ever sees the data in a state where part of the transaction has completed. In the SQL-like pseudocode example, we have two updates, but no code will ever see the database after the first update but before the second. If the transaction performs an update concurrently with another process, and there is a conflict, then one transaction will fail - you won't get the database corrupted.

Transactions for Concurrency

• Two users issue transactions

• Both see the data in a consistent state

• If both modify the same data, (at least) one fails

• Neither sees the data in the middle of the other’s changes

Author's Note

Comment

Transactions are perfectly suited to concurrency. Assume that you can do whatever you want, and then check whether there were any conflicts, reverting to the old state and retrying if there were.

Transactional Memory

• Transactions are great for concurrency

• Why not make them available for all memory?

Author's Note

Comment

The idea of transactional memory is to group arbitrary memory operations (rather than database updates) into transactions.

Simplest Case: Load-Linked / Store-Conditional

• Found on ARM, PowerPC, etc. chips

• Load-linked begins transaction on a word

• Store-conditional commits it

• Limitation: Transaction may only modify one word in memory

Author's Note

Comment

We looked at this pair of operations before. They effectively define a transactional model for a single word in memory. You load the word, do some arbitrary computations on it, then complete the transaction by storing it. At this point, the transaction either completes (the store-conditional succeeds) or aborts (it fails).

General Case: Fully Transactional Memory

• All modifications to memory are private

• Commit instruction applies them all

• Commit instruction fails if any updated memory locations hasbeen modified

• Multiword Load-Linked / Store-Conditional

Author's Note

Comment

In the ideal case, we'd generalise this to support arbitrarily complex operations on memory. The load-linked and store-conditional instructions would support reserving large amounts of memory.

The Problem

• Must keep a copy of all memory updated by in-flighttransactions

• Potentially huge!

• Must lock all memory locations for update

Author's Note

Comment

Unfortunately, this is really hard. For transactional memory to work, you must have store all changes being made by an in-flight transaction (which can be a huge number of memory writes), and you must provide a way of writing them all atomically.

Hardware Transactional Memory

• Add instructions for begin / commit transaction

• Handle all updates via caches / buffers

• Not implemented anywhere (yet!)

Author's Note

Comment

In an ideal world, this would all be done in hardware. You'd extend the paging and caching systems so that each thread saw a private copy-on-write snapshot of memory and could commit changes to the shared store periodically. Even then, nesting transactions is hard.

Hardware-Assisted Transactional Memory: Rock

• Rock was an UltraSPARC design from Sun

• Project was cancelled near completion by Oracle

• Provided some HTM support

Author's Note

Comment

Sun's Rock chip (which almost made it to market) had basic support for transactional memory, although with some limitations.

Rock’s Interfaces

• New chkpt and commit instructions

• chkpt looks like conditional branch

• Branches if commit fails

• Instructions between chkpt and commit not committed ifconflict occurred

• Lots of limitations: No function calls, small number ofmemory addresses, no ‘difficult’ instructions in transactions...

• Can be used to assist implementing software transactionalmemory

Author's Note

Comment

The checkpoint and commit instructions worked quite a lot like load-linked and store-conditional, but for the entire program memory. There were quite a lot of limitations on what could be done between these two, however.

Software Transactional Memory

• Emulate hardware transactional memory

• Comes with performance overhead

• Can still be useful

Author's Note

Comment

If you use transactional memory at all at the moment, it's likely to be a software implementation. This is sometimes done purely in a library, but often with some compiler support. This has some overhead, but it often makes it easy to write scalable code. You may have a factor of 7 slowdown (typical for STM solutions), but if you can then scale to 10 times more cores it can be worthwhile.

STM Approaches: Indirection and Swizzling

• Create a copy of data to modify

• Modify copy

• Update pointer with atomic compare-and-swap

• Transaction fails if pointer has been modified since copy wasmade

• Uses a lot of memory (and time to make all of the copies)

• Simple to implement, works well for small data structures witha single entry point

Author's Note

Comment

The simplest way of implementing STM works well for small data structures. You access everything via one root pointer. First copy everything, then do your work on the copy, then use a single-word atomic operation to update the global version to your copy.

STM Approaches: Locking

• Lock(s) must be held for the duration of an update

• Global lock? Very little concurrency possible

• Locks on ranges of memory (e.g. one per page)

• Deadlock potential: must acquire locks in same order (e.g.lowest memory address first)

Author's Note

Comment

A trivial approach is to have a single mutex for all memory operations. Although this works, it completely defeats the point. A better approach is to have one lock for every memory page.

Why Transactional Memory is Damn ShinyTM

• With atomic set and get, you can’t make atomic increment

• With transactional set and get, you can

• Arbitrary transactions are composable

Author's Note

Comment

Transactions are useful as a programmer model because they are very easy to understand and very easy to verify correctness.

Composing Transactions

�void atomic_increment(void)

{

do

{

begin_transaction ();

int a = atomic_get ();

a++;

atomic_set(a);

} while(end_transaction ());

} � �• If set() and get() are thread-safe, so is increment()

• No extra thinking is required (thinking is hard!)

Proposed Imperative Language Support

�atomic

{

if (i == 0)

retry;

next = queue[i];

i++;

}; � �• atomic keyword defines a transaction

• retry restarts transaction, blocks until a memory address thathas been read already is modified by something else

• Transaction automatically restarted if it fails

Author's Note

Comment

Extension was proposed by Fraser and Harris. Not supported by any real languages yet, but probably will be in the next 10 years or so. Understand it now and you can claim 10 years of experience with BuzzwordLanguageâ—¢ when it's released in 2020...

Example: Transactional Linked List

�void insert(struct list_item *li);

{

atomic

{

li->prev = NULL;

li->next = head ->next;

head = li;

};

} � �

Author's Note

Comment

That's all that's needed to make an atomic linked list insert operation out of a non-atomic one. This doesn't even depend on all inserts being atomic - the transaction will fail if someone tries to unsafely modify the list from another thread, and will be automatically restarted.

STM and Haskell

• Fitting transactional memory into imperative programming ishard

• Haskell’s type system already collects side effects in monads

• Concurrency monad implements STM

• Monad handles merging results, retrying failed transactions

Author's Note

Comment

Haskell have very nice support for STM courtesy of Simon Peyton Jones. Adding STM support to functional languages is easier than to languages like C, because they already hide direct access to memory and require side effects to be handled in a special way.

Questions?

Documents

12 Transactions