Concurrent Tries with Efficient Non-blocking Snapshots Aleksandar Prokopec Phil Bagwell Martin Odersky École Polytechnique Fédérale de Lausanne Nathan Bronson Stanford

  • Concurrent Tries with Efficient Non-blocking Snapshots

    Aleksandar Prokopec

    Phil Bagwell

    Martin Odersky

    École Polytechnique Fédérale de Lausanne

    Nathan Bronson Stanford

  • Motivation

    val numbers = getNumbers()

    // compute square roots
    numbers foreach { entry =>
      val x = entry.root
      val n = entry.number
      entry.root = 0.5 * (x + n / x)
      if (abs(entry.root - x) < eps)
        numbers.remove(entry)
    }

  • Hash Array Mapped Tries (HAMT)

    [Figure sequence: keys are inserted one at a time, each placed according to the bits of its hash code]

    0 = 000000₂ is stored in the root

    16 = 010000₂ is stored in the root next to 0

    4 = 000100₂ collides with 0, so 0 and 4 move to a node one level down

    12 = 001100₂ joins 0 and 4 in that node

    33 and 48 are stored in the root; 37 then moves with 33 to a node one level down, and 3 moves with 0 another level down

    Final figure: a larger trie with the keys 0 1 3 4 8 9 12 16 20 25 33 37 48 57
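
The figures appear to use 6-bit hash codes that are consumed two bits per level, starting from the most significant bits; this is an inference from where the keys end up, not something the slides state. A minimal sketch under that assumption (the names `slot` and `path` are made up for illustration; production HAMTs typically use 32-bit hashes and 5 bits per level):

```scala
object HamtIndexSketch {
  // Assumption: 6-bit hash codes, 2 bits per level, most significant bits first.
  val HashBits = 6
  val BitsPerLevel = 2

  /** Slot (0 to 3) that `hash` selects at trie depth `level` (the root is level 0). */
  def slot(hash: Int, level: Int): Int =
    (hash >>> (HashBits - BitsPerLevel * (level + 1))) & ((1 << BitsPerLevel) - 1)

  /** The complete root-to-leaf sequence of slots for one hash code. */
  def path(hash: Int): List[Int] =
    (0 until HashBits / BitsPerLevel).map(slot(hash, _)).toList
}
```

With this reading, 0 and 4 select the same root slot, which would explain why they share a second-level node, while 16 selects a different root slot and stays in the root.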

  • Immutable HAMT

    used as immutable maps in functional languages

    updates rewrite the path from root to leaf

    [Figure: insert(11) on the trie 4 12 16 20 25 33 37 / 0 1 8 9 3 allocates the new nodes 4 12 and 8 9 11]

    efficient updates: O(log_k n)
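
Path copying can be sketched with a small persistent trie. The node types, the 4-way branching and the 6-bit key range below are assumptions chosen to keep the example short; they are not taken from the slides.

```scala
object PathCopySketch {
  sealed trait Node
  case object Empty extends Node
  final case class Leaf(key: Int) extends Node
  final case class Branch(children: Vector[Node]) extends Node

  val Bits = 2     // assumption: 4-way branching
  val Levels = 3   // assumption: keys are 6-bit values (0 to 63)

  private def slot(key: Int, level: Int): Int =
    (key >>> (Bits * (Levels - 1 - level))) & ((1 << Bits) - 1)

  private val emptyBranch = Branch(Vector.fill(1 << Bits)(Empty))

  /** Returns a new root. Only the nodes on the path to `key` are copied. */
  def insert(node: Node, key: Int, level: Int = 0): Node = {
    require(key >= 0 && key < (1 << (Bits * Levels)), "sketch supports 6-bit keys only")
    node match {
      case Empty => Leaf(key)
      case Leaf(k) if k == key => node
      case Leaf(k) =>
        // Push the existing key one level down, then insert the new key next to it.
        insert(insert(emptyBranch, k, level), key, level)
      case Branch(cs) =>
        val i = slot(key, level)
        // `updated` copies this one node; all other children are shared by reference.
        Branch(cs.updated(i, insert(cs(i), key, level + 1)))
    }
  }

  def contains(node: Node, key: Int, level: Int = 0): Boolean = node match {
    case Empty => false
    case Leaf(k) => k == key
    case Branch(cs) => contains(cs(slot(key, level)), key, level + 1)
  }
}
```

Because every update returns a new root and leaves the old one untouched, a reader holding the old root keeps a consistent view. Each update copies one node per level, which is where the logarithmic cost comes from.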

  • Node compression

    [Figure: the node holding 48 and 57 has occupancy bits 1 0 1 0, so it is stored as the compact array 48 57 plus a bitmap (shown as 10)]

    BITPOP(((1 << ((hc >> lev) & 1F)) - 1) & BMP)
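
The bitmap trick can be sketched as follows: a set bit marks an occupied logical slot, and the number of set bits below it gives the position in the compact array. This is a minimal illustration that assumes 32-way nodes (the `1F` mask) and stores plain integers; the names `CNode`, `hc` and `lev` are assumptions.

```scala
object BitmapNodeSketch {
  /** A compressed node: bit i of `bitmap` is set iff logical slot i is occupied. */
  final case class CNode(bitmap: Int, array: Vector[Int]) {
    // Single-bit mask for the slot selected by 5 bits of the hash code at level `lev`.
    private def flag(hc: Int, lev: Int): Int = 1 << ((hc >>> lev) & 0x1f)

    // Position in the compact array: number of occupied slots below the flag.
    private def pos(flag: Int): Int = Integer.bitCount((flag - 1) & bitmap)

    def get(hc: Int, lev: Int): Option[Int] = {
      val f = flag(hc, lev)
      if ((bitmap & f) == 0) None else Some(array(pos(f)))
    }

    /** Returns a copy with `value` stored in the slot selected by `hc` at `lev`. */
    def inserted(hc: Int, lev: Int, value: Int): CNode = {
      val f = flag(hc, lev)
      val p = pos(f)
      if ((bitmap & f) != 0) CNode(bitmap, array.updated(p, value))
      else CNode(bitmap | f, (array.take(p) :+ value) ++ array.drop(p))
    }
  }

  val empty: CNode = CNode(0, Vector.empty)
}
```

The array only ever holds as many entries as there are occupied slots, so sparse nodes do not pay for 32 references.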

  • Ctrie

    Can a mutable HAMT be modified to be thread-safe?

  • Ctrie insert

    [Figure sequence on the trie 4 9 12 16 20 25 33 37 / 0 1 3 / 48 57]

    Insert 17 = 010001₂: 1) allocate the new node 16 17, 2) CAS it into place

    Insert 18 = 010010₂: 1) allocate the new node 16 17 18, 2) CAS it into place

    Unless... T1 inserts 18 = 010010₂ while T2 inserts 28 = 011100₂

    T1-1) allocate 16 17 18

    T2-1) allocate 20 25 28

    T2-2) CAS

    T1-2) CAS

    Lost insert! (T1's CAS lands in the old copy of the node 20 25, which T2 has already replaced with 20 25 28)

  • Ctrie insert 2nd attempt

    Solution: I-nodes

    [Figure sequence: the nodes 4 9 12 / 0 1 3 / 16 17 / 20 25 are linked through I-nodes]

    T1 inserts 18 = 010010₂ while T2 inserts 28 = 011100₂

    T1-1) allocate 16 17 18

    T2-1) allocate 20 25 28

    T1-2) CAS

    T2-2) CAS

    Both inserts succeed: 4 9 12 / 0 1 3 / 16 17 18 / 20 25 28

    Idea: once added to the Ctrie, I-nodes remain present.

    Remove operation supported as well - details in the paper.
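
The I-node idea can be sketched as an insert-only trie in which every update is a CAS on the I-node's reference to an immutable node. This is a simplified illustration, not the authors' implementation: it stores `Int` keys directly, uses uncompressed 32-slot nodes, and leaves out remove and snapshots. All names are assumptions.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

object CtrieInsertSketch {
  sealed trait Branch
  final case class SNode(key: Int) extends Branch
  // An I-node stays in place once created; only its `main` reference changes.
  final class INode(init: CNode) extends Branch {
    val main = new AtomicReference[CNode](init)
  }
  final case class CNode(slots: Vector[Option[Branch]]) {
    def updated(i: Int, b: Branch): CNode = CNode(slots.updated(i, Some(b)))
  }
  val Empty: CNode = CNode(Vector.fill(32)(None))
  def idx(key: Int, lev: Int): Int = (key >>> lev) & 0x1f

  final class Ctrie {
    private val root = new INode(Empty)

    def insert(key: Int): Unit = ins(root, key, 0)

    @tailrec private def ins(in: INode, key: Int, lev: Int): Unit = {
      val cn = in.main.get()
      val i = idx(key, lev)
      cn.slots(i) match {
        case None =>        // 1) allocate an updated copy, 2) CAS; retry on failure
          if (!in.main.compareAndSet(cn, cn.updated(i, SNode(key)))) ins(in, key, lev)
        case Some(SNode(k)) if k == key => ()
        case Some(sn: SNode) =>   // collision: push both keys below a new I-node
          val child = new INode(pair(sn, SNode(key), lev + 5))
          if (!in.main.compareAndSet(cn, cn.updated(i, child))) ins(in, key, lev)
        case Some(child: INode) => ins(child, key, lev + 5)
      }
    }

    // Builds the node holding two distinct keys, nesting while their slots coincide.
    private def pair(a: SNode, b: SNode, lev: Int): CNode = {
      val (ia, ib) = (idx(a.key, lev), idx(b.key, lev))
      if (ia != ib) Empty.updated(ia, a).updated(ib, b)
      else Empty.updated(ia, new INode(pair(a, b, lev + 5)))
    }

    def contains(key: Int): Boolean = look(root, key, 0)

    @tailrec private def look(in: INode, key: Int, lev: Int): Boolean =
      in.main.get().slots(idx(key, lev)) match {
        case None => false
        case Some(SNode(k)) => k == key
        case Some(child: INode) => look(child, key, lev + 5)
      }
  }
}
```

Two threads that update different I-nodes never interfere, and two threads that update the same I-node race on a single CAS, so the loser retries instead of writing into a node that is no longer reachable.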

  • Ctrie size

    [Figure sequence: one thread counts the keys of the trie 4 9 12 / 0 1 3 / 16 17 18 / 20 25 28 node by node while other threads modify it]

    size = 0, 1, 2, 3, 5 (actual size = 12)

    another thread replaces 0 1 3 with 0 1 using a CAS (actual size = 11)

    size = 6

    another thread replaces 16 17 18 with 16 17 18 19 using a CAS (actual size = 12)

    size = 7, 8, 9, 10, 11, 12, 13 (actual size = 12)

    But the size was never 13!
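
The interleaving can be replayed deterministically. The sketch below models four subtrees as plain sets and performs the steps in a fixed order on one thread; it simplifies the traversal order of the figures and is only meant to show why the counted value never matches any real state.

```scala
import scala.collection.mutable

object SizeRaceSketch {
  /** Returns (counted size, list of sizes the structure actually had). */
  def replay(): (Int, List[Int]) = {
    // Assumption: four subtrees of the trie from the slides, modelled as sets.
    val subtrees = Vector(
      mutable.Set(0, 1, 3), mutable.Set(4, 9, 12),
      mutable.Set(16, 17, 18), mutable.Set(20, 25, 28))
    def actual: Int = subtrees.map(_.size).sum
    val seen = mutable.ListBuffer(actual)               // starts at 12

    // The counting traversal has visited the first two subtrees.
    var counted = subtrees(0).size + subtrees(1).size
    subtrees(0) -= 3                                     // concurrent remove of 3
    seen += actual
    subtrees(2) += 19                                    // concurrent insert of 19
    seen += actual
    // The traversal finishes with the remaining two subtrees.
    counted += subtrees(2).size + subtrees(3).size
    (counted, seen.toList)
  }
}
```

The traversal counts the removed key 3 (it was visited before the removal) and the inserted key 19 (it was visited after the insertion), so it reports a size that no state of the structure ever had.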

  • Global state information

    [Figure: trie 4 9 12 20 25 28 / 0 1 16 17 18 19]

    size find filter iterator

    snapshot

  • Snapshot using locks

    [Figure: trie 4 9 12 20 25 28 / 0 1 16 17 18 19, with a concurrent insert replacing 0 1 by 0 1 2 using a CAS]

    copy expensive

    not lock-free

    can insert or remove remain lock-free?

  • Snapshot using logs

    [Figure: trie 4 9 12 20 25 28 / 0 1 16 17 18 19; after an insert of 2 the I-node holds 0 1 2 and still links to the previous value 0 1]

    keep a linked list of previous values in each I-node

    when is it safe to delete old entries?

  • Snapshot using immutability

    [Figure sequence: trie 4 9 12 20 25 28 / 0 1 16 17 18 19; the root and every other I-node carry generation #1]

    snapshot!

    1) create new I-node at #2

    2) set snapshot (the snapshot refers to the old root at #1)

    3) CAS root to new I-node

  • Snapshot using immutability

    [Figure sequence: root at generation #2 and snapshot at generation #1, both sharing the trie 4 9 12 20 25 28 / 0 1 16 17 18 19]

    subsequent insert of 2

    root I-node: generation #2 - ok!

    next I-node: generation #1 not ok, too old!

    1) create updated node at #2

    2) CAS to the updated node

    next I-node: #1 too old!

    1) create updated node at #2

    2) CAS

    finally, create a new leaf 0 1 2 and CAS

    another insert of 3: 0 1 2 becomes 0 1 2 3
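
The generation check during an insert can be sketched on a single thread. The version below uses plain `var` fields instead of atomic operations, so it only illustrates the lazy copying; it says nothing about concurrent correctness, which the following slides address. All names are assumptions.

```scala
import scala.annotation.tailrec

object GenerationSketch {
  final class Gen
  sealed trait Branch
  final case class SNode(key: Int) extends Branch
  final class INode(val gen: Gen, var main: CNode) extends Branch
  final case class CNode(slots: Map[Int, Branch]) {
    /** Copy of this node whose child I-nodes are re-created at generation `g`. */
    def renewed(g: Gen): CNode = CNode(slots.map {
      case (i, in: INode) => i -> (new INode(g, in.main): Branch)
      case other => other
    })
  }
  def idx(key: Int, lev: Int): Int = (key >>> lev) & 0x1f

  final class Trie(var root: INode) {
    def this() = this(new INode(new Gen, CNode(Map.empty)))

    /** Constant time: both tries get a fresh root that shares the old nodes. */
    def snapshot(): Trie = {
      val r = root
      root = new INode(new Gen, r.main)
      new Trie(new INode(new Gen, r.main))
    }

    def insert(key: Int): Unit = ins(root, key, 0, root.gen)

    @tailrec private def ins(in: INode, key: Int, lev: Int, gen: Gen): Unit = {
      val cn = in.main
      val i = idx(key, lev)
      cn.slots.get(i) match {
        case None => in.main = CNode(cn.slots.updated(i, SNode(key)))
        case Some(SNode(k)) if k == key => ()
        case Some(sn: SNode) =>
          val child = new INode(gen, pair(sn, SNode(key), lev + 5, gen))
          in.main = CNode(cn.slots.updated(i, child))
        case Some(child: INode) =>
          if (child.gen eq gen) ins(child, key, lev + 5, gen)
          else { in.main = cn.renewed(gen); ins(in, key, lev, gen) } // too old: copy, retry
      }
    }

    private def pair(a: SNode, b: SNode, lev: Int, gen: Gen): CNode = {
      val (ia, ib) = (idx(a.key, lev), idx(b.key, lev))
      if (ia != ib) CNode(Map(ia -> a, ib -> b))
      else CNode(Map(ia -> new INode(gen, pair(a, b, lev + 5, gen))))
    }

    def contains(key: Int): Boolean = look(root, key, 0)

    @tailrec private def look(in: INode, key: Int, lev: Int): Boolean =
      in.main.slots.get(idx(key, lev)) match {
        case Some(SNode(k)) => k == key
        case Some(child: INode) => look(child, key, lev + 5)
        case None => false
      }
  }
}
```

Only the I-nodes on the path of a later update are copied to the new generation; everything else stays shared between the trie and its snapshot.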

  • Snapshot using immutability

    But... this won't really work... why?

    [Figure sequence: T2: remove 19 allocates 16 17 18 and CASes it into an I-node that still belongs to generation #1]

    How to fail this last CAS?

    DCAS - software based ...creates intermediate objects

  • GCAS - generation-compare-and-swap

    [Figure sequence: T2: remove 19 installs the new node 16 17 18 with a GCAS]

    1) set prev field

    2) CAS

    3) read root generation

    4) if root generation changed, CAS prev to FailedNode(prev) (FN)

    5) CAS to previous value

    4) if root generation unchanged, CAS prev to null

    Then, in all Ctrie operations:

    1) Replace all CAS with GCAS

    2) Replace all READ with GCAS_READ (which checks if prev field is null)
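
The GCAS steps can be sketched with `AtomicReference` fields. This is a reduced illustration under several assumptions: the node payload is just a set of keys, the root is read with a plain `get()` rather than the abortable read that the full algorithm needs, and read-only snapshots are left out. The class and method names are chosen for this example.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

object GcasSketch {
  final class Gen
  sealed abstract class MainNode {
    val prev = new AtomicReference[MainNode](null)
  }
  final class Data(val keys: Set[Int]) extends MainNode
  final class Failed(p: MainNode) extends MainNode { prev.set(p) }
  final class INode(val gen: Gen, init: MainNode) {
    val main = new AtomicReference[MainNode](init)
  }

  final class Trie {
    val root = new AtomicReference(new INode(new Gen, new Data(Set.empty)))

    /** True only if `n` was installed while the root generation still matched. */
    def gcas(in: INode, old: MainNode, n: MainNode): Boolean = {
      n.prev.set(old)                        // 1) set prev field
      if (in.main.compareAndSet(old, n)) {   // 2) CAS
        commit(in, n)                        // 3) to 5) decide the outcome
        n.prev.get() == null
      } else false
    }

    @tailrec def commit(in: INode, m: MainNode): MainNode = m.prev.get() match {
      case null => m                         // already committed
      case fn: Failed =>                     // 5) roll back to the previous value
        val old = fn.prev.get()
        if (in.main.compareAndSet(m, old)) old else commit(in, in.main.get())
      case p =>
        if (root.get().gen eq in.gen) {      // 4) generation unchanged: commit
          if (m.prev.compareAndSet(p, null)) m else commit(in, m)
        } else {                             // 4) generation changed: mark as failed
          m.prev.compareAndSet(p, new Failed(p))
          commit(in, in.main.get())
        }
    }

    /** Reads `in.main`, first helping to finish a pending GCAS if there is one. */
    def gcasRead(in: INode): MainNode = {
      val m = in.main.get()
      if (m.prev.get() == null) m else commit(in, m)
    }
  }
}
```

A write into an I-node of an old generation is therefore undone by whichever thread notices it first, so a snapshot never observes it as committed.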

  • Snapshot-based iterator

    def iterator =
      if (isSnapshot) new Iterator(root)
      else snapshot().iterator
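
Scala's standard library ships a Ctrie-based map, `scala.collection.concurrent.TrieMap`, whose iterators work on a snapshot. The sketch below uses it for the square-root loop from the motivation slide; the function name, the `done` map and the tolerance are assumptions made for this example.

```scala
import scala.collection.concurrent.TrieMap

object SnapshotIterationSketch {
  /** Newton iterations for square roots; entries are removed once they converge. */
  def squareRoots(inputs: Seq[Double], eps: Double = 1e-9): Map[Double, Double] = {
    require(inputs.forall(_ > 0), "sketch handles positive inputs only")
    val pending = TrieMap.empty[Double, Double]   // number -> current estimate
    val done = TrieMap.empty[Double, Double]
    inputs.foreach(n => pending.put(n, n))
    while (pending.nonEmpty) {
      // The iterator traverses a snapshot, so updating and removing entries
      // of `pending` inside the loop does not disturb the traversal.
      pending.iterator.foreach { case (n, x) =>
        val next = 0.5 * (x + n / x)
        if (math.abs(next - x) < eps) { pending.remove(n); done.put(n, next) }
        else pending.put(n, next)
      }
    }
    done.toMap
  }
}
```

For example, `SnapshotIterationSketch.squareRoots(Seq(4.0, 9.0, 2.0))` returns a map from each input to an approximation of its square root.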

  • Snapshot-based size

    def size = {
      var sz = 0
      val it = iterator
      while (it.hasNext) { it.next(); sz += 1 }
      sz
    }

    Above is O(n). But, by caching size in nodes - amortized O(log_k n)! (see source code)

  • Snapshot-based atomic clear

    def clear() = {
      val or = READ(root)
      val nr = new INode(new Gen)
      if (!CAS(root, or, nr)) clear()
    }

    (roughly)
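
A runnable version of the same retry loop, using `AtomicReference` for the root. The `INode` here is a hypothetical stand-in that only holds a set of keys, and the plain `compareAndSet` on the root replaces whatever root-update primitive the real implementation uses.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

object ClearSketch {
  final class Gen
  // Hypothetical stand-in for a real I-node: a generation plus the stored keys.
  final class INode(val gen: Gen, val keys: Set[Int])

  final class Trie(initial: Set[Int]) {
    private val root = new AtomicReference(new INode(new Gen, initial))

    def keys: Set[Int] = root.get().keys

    /** Swaps in a fresh, empty root; retries if another thread changed the root first. */
    @tailrec def clear(): Unit = {
      val or = root.get()
      val nr = new INode(new Gen, Set.empty)
      if (!root.compareAndSet(or, nr)) clear()
    }
  }
}
```

Readers that fetched the old root before the swap still see the complete old contents, so the clear appears to happen at a single point in time.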

  • Evaluation - quad core i7

  • Evaluation UltraSPARC T2

  • Evaluation 4x 8-core i7

  • Evaluation snapshot

  • Conclusion

    snapshots are linearizable and lock-free

    snapshots take constant time

    snapshots are horizontally scalable

    snapshots add only an insignificant overhead to the algorithm if they aren't used

    the approach may be applicable to tree-based lock-free data-structures in general (intuition)

  • Thank you!