Lowering the Overhead of Software Transactional Memory
Virendra J. Marathe, Michael F. Spear, Christopher Heriot, Athul Acharya, David Eisenstat, William N. Scherer III, Michael L. Scott
Featuring: RSTM – low overhead STM library for C++
Presenting: Yosef Etigin



What is this paper about?

• Design and implementation of RSTM.
• RSTM is meant to be a fast STM library for multi-threaded C++ programs.
• RSTM main features:
    • Cache-optimized metadata organization.
    • No memory allocations during runtime, except for cloning objects.
    • Uses a contention manager to tune performance.
    • Allows different strategies: eager/lazy acquire, visible/invisible readers.

Where does RSTM fit in?

• Requires atomic load/store and CAS in hardware.
• Provides a C++ “smart pointer” API that can be used to safely access shared data within transactions.

[Diagram: layered stack, top to bottom]
User application
    beginTx { openRO, openRW } endTx
RSTM Library
HW: atomic Load & Store, CAS
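The CAS primitive at the bottom of this stack can be sketched in standard C++ with std::atomic. The helper name CAS and its argument order are assumptions chosen to match the pseudo-code used later in this talk; this is not RSTM's actual code.

```cpp
#include <atomic>

// Hypothetical helper in the style of the CAS(addr, expected, desired)
// calls in the pseudo-code; built on the standard std::atomic API.
inline bool CAS(std::atomic<long>& word, long expected, long desired) {
    // compare_exchange_strong stores `desired` only if the word still
    // holds `expected`, and reports whether the swap happened.
    return word.compare_exchange_strong(expected, desired);
}
```

A failed CAS leaves the word unchanged, which is what lets a transaction detect that another thread got there first.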

Overview

• RSTM Theory
    • Transaction Semantics
    • Readers
    • Writers
• RSTM Design
    • Descriptor
    • Data Object
    • Shared Object Handle
• RSTM Implementation
    • Resolving the data object
    • Open for read-only
    • Acquire
    • Open for read-write
    • Commit
    • Abort
• Performance results
• Conclusion

Transaction Semantics

• Data is considered at object granularity.
• Objects are shadowed, rather than changed “in place”.
• Inside a transaction, objects may be opened for read-only or for read-write.
• Objects that are opened for read-write are cloned, and those opened for read-only are not.
• “Commit” tries to set the clone as the current object.
• “Abort” tries to set the original as the current object.
• Transactions may abort each other, but they consult the Contention Manager (CM) before doing so.
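Shadowing can be illustrated on a single object: the writer modifies a private clone and then tries to publish it with one CAS. This is a deliberately simplified sketch; Counter, Slot, and incrementShadowed are hypothetical names, and RSTM itself commits through the transaction status rather than swapping each pointer directly.

```cpp
#include <atomic>

struct Counter {
    int value;
};

// Hypothetical stand-in for a shared reference to the current version.
struct Slot {
    explicit Slot(Counter* initial) : current(initial) {}
    std::atomic<Counter*> current;
};

// Increments the counter on a clone and tries to install the clone.
// Returns true on "commit", false if another writer won the race.
bool incrementShadowed(Slot& slot) {
    Counter* original = slot.current.load();
    Counter* clone = new Counter(*original);  // shadow copy, not in-place
    clone->value += 1;
    if (slot.current.compare_exchange_strong(original, clone)) {
        // Committed: the clone is now the current object. Reclaiming the
        // original safely is omitted from this sketch.
        return true;
    }
    delete clone;  // "abort": the original stays current
    return false;
}
```

Until the CAS succeeds, concurrent readers only ever see the unmodified original.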

Readers

• A thread that opens an object for reading may become a “visible” or “invisible” reader.
    • “Visible” = visible to writers.
• A reader must have a consistent view of its opened objects.
    • “Consistent” = no writer has made a change that the reader sees in only some of its opened objects.
• Inconsistency might cause hardware exceptions and infinite loops, thus:
    • An invisible reader must, on every “open”, validate all previously opened objects (O(n²) total cost for n opened objects).
    • A visible reader must be explicitly aborted by a writer that acquires the object.
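The quadratic cost of invisible reads follows from re-checking the whole read set on each open: opening n objects performs 1 + 2 + ... + n checks. A minimal sketch of that incremental validation, assuming a hypothetical Handle with a single header word; the std::vector is used only for brevity, whereas RSTM avoids dynamic allocation here.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical handle: the header word changes whenever a writer acquires it.
struct Handle {
    std::atomic<long> header{0};
};

struct ReadEntry {
    const Handle* handle;
    long seen;  // header value observed when the object was opened
};

class InvisibleReader {
public:
    // Records the object, then re-validates every object opened so far.
    // Returns false if the transaction would have to abort.
    bool open(const Handle& h) {
        reads_.push_back({&h, h.header.load()});
        return validate();
    }
    std::size_t checksPerformed() const { return checks_; }

private:
    bool validate() {
        for (const ReadEntry& e : reads_) {
            ++checks_;
            if (e.handle->header.load() != e.seen) return false;
        }
        return true;
    }
    std::vector<ReadEntry> reads_;
    std::size_t checks_ = 0;
};
```

Opening three untouched objects costs 1 + 2 + 3 = 6 header checks, which is the growth pattern behind the O(n²) figure.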

Writers

• Opening an object for writing involves “acquiring” it.
    • Acquiring means getting exclusive access to the object.
• Writers conflict with other writers and with visible readers.
    • Visible readers can coexist with each other.
• Acquiring can be done in an eager or a lazy fashion:
    • Eager: acquire an object as soon as it is opened.
    • Lazy: acquire it just before committing the transaction.
• Eager acquire aborts doomed transactions immediately, but causes more conflicts.
• Lazy acquire lets readers run alongside a writer that is not committing yet.
    • It has the same consistency issue as invisible reads.

Contention Management

• The CM is:
    • A thread-local object
    • Notified of transaction events
• It decides what to do on a conflict:
    • Abort a transaction or spin-wait
    • Which transaction to abort, if any
• For instance: the “Polka” CM
    • Prefers writers over readers
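The shape of such a policy object can be sketched as follows. This is a hypothetical bounded-wait policy invented for illustration, not the Polka algorithm and not RSTM's CM interface; the names Decision, BoundedWaitManager, onBegin, and onConflict are all assumptions.

```cpp
// Possible answers to "what should I do about this conflict?"
enum class Decision { Wait, AbortOther };

// Hypothetical thread-local policy: spin-wait a bounded number of times
// per transaction, then abort the conflicting transaction.
class BoundedWaitManager {
public:
    explicit BoundedWaitManager(int maxWaits) : maxWaits_(maxWaits) {}

    // Transaction event: a new transaction starts on this thread.
    void onBegin() { waits_ = 0; }

    // Conflict event: decide between waiting and aborting the other side.
    Decision onConflict() {
        if (waits_ < maxWaits_) {
            ++waits_;
            return Decision::Wait;
        }
        return Decision::AbortOther;
    }

private:
    int maxWaits_;
    int waits_ = 0;
};
```

Keeping the policy behind a small interface like this is what allows different managers to be swapped in to tune performance.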


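RSTM Design

[Diagram: a Shared Object Handle with “header” and “visible readers” fields. The header refers to the new Data Object, whose “owner” field refers to the writer's Descriptor and whose “next” field refers to the old Data Object. The visible readers field refers to the readers' Descriptors. Three threads are shown: Thread 1, Thread 2, Thread 3.]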

Descriptor

• Each thread has a static descriptor that is used for all transactions of this thread.
    • Nested transactions are not supported.
• The descriptor has:
    • Status: ACTIVE / COMMITTED / ABORTED
    • Lists of opened objects:
        • Visible and invisible reads.
        • Eager and lazy writes.


Data Object

Shared objects hold, in addition to data fields, “owner” and “next” fields.

Owner is the descriptor of the current writer thread, if any.

Next is the original object, if this is a writer-made clone.

Shared Object Handle (1)

• Encapsulates a reference to a shared object.
• Global variables are handles rather than pointers.
• Direct pointers are obtained within a transaction, via “open”.
• Holds:
    • “header” word: identifies the current version of the object.
    • “visible readers” word: bitmap of the visible readers.

Shared Object Handle (2)

• The header is a single word that holds a pointer and a dirty bit.
    • Takes advantage of address alignment: an aligned pointer always has a zero low-order bit, so that bit is free to hold the flag.
• The pointer refers to some data object “pObj”.
• The dirty bit tells whether “pObj” is a clean object or a writer-made clone.
• Saves one dereference in the common case of non-conflicting access.
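Packing a pointer and a dirty bit into one word can be sketched like this. The helper names makeHeader, isDirty, and pointerOf are hypothetical; the sketch relies only on the object type being at least 2-byte aligned.

```cpp
#include <cstdint>

struct Object {
    int payload;
};

// The low bit is only free if objects are at least 2-byte aligned.
static_assert(alignof(Object) >= 2, "need a spare low-order bit");

// Hypothetical helpers for a header word: pointer in the high bits,
// dirty flag in bit 0.
inline std::uintptr_t makeHeader(Object* obj, bool dirty) {
    return reinterpret_cast<std::uintptr_t>(obj) | (dirty ? 1u : 0u);
}

inline bool isDirty(std::uintptr_t header) {
    return (header & 1u) != 0;
}

inline Object* pointerOf(std::uintptr_t header) {
    // Mask out the flag bit to recover the original address.
    return reinterpret_cast<Object*>(header & ~static_cast<std::uintptr_t>(1));
}
```

Because both pieces live in one word, a single CAS can change the current version and its clean/dirty state together.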

Shared Object Handle (3)

• “Visible readers” is a bitmap of the visible readers.
• Bit i of the mask is set if thread i is a visible reader of the object.
• Allows getting all readers or adding a reader in a single atomic operation.
• Limits the number of visible readers to the number of bits in the word.
    • All other readers must be invisible.
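A reader bitmap of this kind can be sketched with std::atomic. The 32-bit width and the names ReaderBitmap and kMaxVisibleReaders are assumptions for illustration; the sketch uses fetch_or and fetch_and where the pseudo-code in this talk uses CAS loops.

```cpp
#include <atomic>
#include <cstdint>

// Assumed width of the bitmap word; one bit per potential visible reader.
constexpr unsigned kMaxVisibleReaders = 32;

class ReaderBitmap {
public:
    // Sets bit `id`. Returns false if the thread has no bit in the word,
    // in which case it would have to read invisibly.
    bool add(unsigned id) {
        if (id >= kMaxVisibleReaders) return false;
        bits_.fetch_or(std::uint32_t(1) << id);
        return true;
    }

    // Clears bit `id` when the reader commits or aborts.
    void remove(unsigned id) {
        if (id < kMaxVisibleReaders) {
            bits_.fetch_and(~(std::uint32_t(1) << id));
        }
    }

    // One atomic load yields the full set of visible readers.
    std::uint32_t snapshot() const { return bits_.load(); }

private:
    std::atomic<std::uint32_t> bits_{0};
};
```

A writer that acquires the object can take one snapshot and walk its set bits to find every reader it has to abort.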


RSTM Implementation

• This section provides pseudo-code for the most important STM operations:
    • Open object for read-only
    • Open object for read-write
    • Commit
    • Abort
• The pseudo-code shows methods of the Descriptor class, which is the object that implements RSTM functionality.

Resolving the Data Object

// Returns the up-to-date data object associated with a handle.
// If the object has an active owner, consults the CM and returns NULL.
Object *Descriptor::resolve(Handle *shared)
{
    long snapshot = shared->header;
    Object *ptr = (Object *)(snapshot & ~1L);  // mask out the LSB
    if (snapshot & 1) {  // dirty
        switch (ptr->owner->m_status) {
        case ACTIVE:
            m_cm.handleConflict(this, ptr->owner);
            return NULL;
        case COMMITTED:
            return ptr;
        case ABORTED:
            return ptr->next;
        }
    } else {  // clean
        return ptr;
    }
}

Open for Read-Only

// Open an object for read-only access
Object *Descriptor::openRO(Handle *shared)
{
    long headerSnapshot = shared->header;
    // find the data object
    Object *ptr;
    do {
        ptr = resolve(shared);
    } while (!ptr);

    if (m_isVisible) {
        m_visibleReads.add(shared);
        // install this tx as a visible reader of the object
        while (!CAS(&shared->readers, shared->readers, shared->readers | (1 << m_id)))
            ;
        // make sure no writer acquired this object before it could see the CAS above
        if (headerSnapshot != shared->header)
            abort();
    } else {
        m_invisibleReads.add(shared);
    }
    validate();
    return ptr;
}

Open for Read-Write

// Open an object for read-write access
Object *Descriptor::openRW(Handle *shared)
{
    // find the data object
    Object *ptr;
    do {
        ptr = resolve(shared);
    } while (!ptr);
    // make a writable clone
    Object *clone = ptr->clone();
    clone->owner = this;
    clone->next = ptr;
    // eager acquires now; lazy acquires later
    if (m_isEager) {
        acquire(shared, clone);
        m_eagerWrites.add(shared, clone);
    } else {
        m_lazyWrites.add(shared, clone);
    }
    validate();
    return clone;
}

Acquire

// acquire the object
void Descriptor::acquire(Handle *shared, Object *clone)
{
    // replace the header with a dirty reference to the clone
    if (!CAS(&shared->header, shared->header, (long)clone | 1))
        abort();
    // abort all visible readers
    for (int i = 0; i < sizeof(shared->readers) * 8; ++i) {
        if (shared->readers & (1 << i))
            allDescriptors[i]->abort();
    }
    // record this object for cleanup
    m_acquiredObjects.add(<shared, clone>);
}

Commit

// commit a transaction
void Descriptor::onCommit()
{
    validate();
    // acquire the objects that were lazily opened for read-write
    acquireLazyWrites();
    // Linearization point: if this CAS succeeds, our clones (if any)
    // become the active objects
    CAS(&m_status, ACTIVE, COMMITTED);

    if (m_status == COMMITTED) {
        // replace each dirty reference to our clone
        // with a clean reference to our clone
        for (<shared, clone> in m_acquiredObjects) {
            CAS(&shared->header, clone | 1, clone);
        }
        // remove this thread from the readers bitmap of all
        // visibly opened objects
        for (Handle *shared in m_visibleReads) {
            while (!CAS(&shared->readers, shared->readers, shared->readers & ~(1 << m_id)))
                ;
        }
    } else {
        abort();
    }
}

Abort

// called when the “Aborted” exception is caught
void Descriptor::onAbort()
{
    // after this CAS, our clones (if any) are discarded
    CAS(&m_status, ACTIVE, ABORTED);
    // clean up the written objects: replace each dirty reference to our
    // clone with a clean reference to the original object
    for (<shared, clone> in m_acquiredObjects) {
        CAS(&shared->header, clone | 1, clone->next);
    }
    // remove this thread from the readers bitmap of all
    // visibly opened objects
    for (Handle *shared in m_visibleReads) {
        while (!CAS(&shared->readers, shared->readers, shared->readers & ~(1 << m_id)))
            ;
    }
}
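Both commit and abort hinge on one CAS of the status word away from ACTIVE, so exactly one of them can win. A minimal sketch of that race, assuming a hypothetical Descriptor reduced to its status field; tryCommit and tryAbort are illustrative names, not RSTM methods.

```cpp
#include <atomic>

enum Status { ACTIVE, COMMITTED, ABORTED };

// Hypothetical descriptor reduced to the status word.
struct Descriptor {
    std::atomic<Status> status{ACTIVE};

    // Succeeds only if nobody aborted this transaction first.
    bool tryCommit() {
        Status expected = ACTIVE;
        return status.compare_exchange_strong(expected, COMMITTED);
    }

    // May be attempted by the owner or by a conflicting transaction;
    // it fails once the transaction has already committed.
    bool tryAbort() {
        Status expected = ACTIVE;
        return status.compare_exchange_strong(expected, ABORTED);
    }
};
```

Since the resolve step interprets a dirty header through its owner's status, this single transition flips every acquired object to its new or old version at once.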


Performance Results (1)

• Compare ASTM and RSTM (previous work showed that ASTM outperforms DSTM and OSTM).
• Platform: 16-processor SunFire 6800 at 1.2GHz.
• Use several benchmarks with different configurations: visible/invisible readers, eager/lazy writers.
• Each benchmark was run for 10 seconds with 1 to 28 threads.
• Contention manager: “Polka”.
• Metric: the number of successful transactions.

Performance Results (2)

• RSTM with invisible readers completes about twice as many transactions as ASTM.

• Visible readers are expensive because each access reads the root node and causes cache invalidation.

• The only difference between C++ ASTM and RSTM is metadata organization.

Performance Results (3)

• In LinkedList, FGL (fine-grained locking) performs badly if #threads > #CPUs, due to preemption.

• In LinkedList, ASTM outperforms RSTM since each writer invalidates objects for many readers.

• HashTable allows great concurrency, so RSTM works well (~3 times faster than ASTM).

Performance Results (4)

• In RandomGraph and LFUCache, all STMs perform worse than CGL (coarse-grained locking), because these data structures do not allow much concurrency.

• Nevertheless, RSTM beats ASTM.

Conclusion

• RSTM has a novel metadata organization which reduces overhead, due to:
    • One level of indirection instead of the common two.
    • Using static instead of dynamic data structures.
• RSTM provides a variety of policies for conflict detection, so it can be customized for a given workload.
• Compared to ASTM, RSTM gives better performance due to better metadata organization.