View
214
Download
0
Category
Tags:
Preview:
Citation preview
Range Queries in Non-blocking
k-ary Search Trees
Trevor BrownHillel Avni
Problem Statement
● Want to store keys in a dynamic data structure supporting insertion, deletion, and:
● RangeQuery(a, b): returns all keys of the data structure in range [a, b]
Previous Solutions● Software transactional memory ● Locks● Persistence
a d
f
e
b
root pointer
e
b
d
c
Insert(c)
k-ary Search Tree (k-ST)
● Add or remove keys by replacing node(s)● Related to persistent data structures
[Brown, Helga]
The Range Query AlgorithmRangeQuery(a, b):
– Traverse the tree, skipping sub-trees which cannot contain a key in [a, b]
– During this traversal, save a pointer to each leaf that contains a key in [a, b]
– […]
– Problem: how to efficiently tell if a key was added or removed during this traversal?
Extending the Data Structure● Add a dirty bit to each leaf● Each leaf has its dirty bit set just before it is
replaced– Consequence: If a leaf's dirty bit is not set, then
it has not been replaced
The Range Query AlgorithmRangeQuery(a, b):
– Traverse the tree, skipping sub-trees which cannot contain a key in [a, b]
– During this traversal, save a pointer to each leaf that contains a key in [a, b]
– After this traversal, check the dirty bits of these leaves, one by one
– If no dirty bit is set, then return “the result”– Otherwise, retry
● Reading dirty bit is far faster than re-traversing
Example: RangeQuery(3, 14)
8, 13, 25
2, 3, 5 8, 9, 12
1 4 5, 7
14, 19, 21
1315, 16,
1823, 24
29, 35
Saved pointers3, 4 Insert(3)
RangeQuery sees 4 is dirty… Retry!
Retrying RangeQuery(3, 14)
8, 13, 25
2, 3, 5 8, 9, 12
1 4 5, 7
14, 19, 21
1315, 16,
1823, 24
29, 35
Saved pointers3, 4
Success! Return the result…
Ctrie● Taking a “snapshot:” atomically replace root
● Old tree no longer changes● Future searches and updates copy nodes from
the old tree
[Prokopec, Bronson, Bagwell, Odersky]
| | |
… … … …
root pointer
| | |00 01 10 1101 10 11
When our Algorithm is Good● When workloads contain range queries over
small ranges (i.e., where snapshots are bad)– Example: database applications such as airline
database of flights
When it might not be● Very large ranges increase the chance that a
range query will have to retry– Our experiments explore how much this matters– In extreme cases Ctrie or Snap might be better
Experiment: compare performance of
● k-ST: k=16, 32, 64● Snap● Ctrie● Java’s Concurrent Skip List (SL)
– NOT LINEARIZABLE!
Experiment● Throughput vs. number of concurrent threads● Each thread repeatedly chooses a random
operation (Search, Insert, Delete, RangeQuery) with arguments chosen uniformly randomly in [0, 10^6)
● Each experiment ran with a fixed amount of memory, for a fixed, sufficiently long amount of time
Hardware● Intel 4-chip, 40-core, 80-thread● Sun 2-chip, 16-core, 128-thread
Many queries with small ranges
Snap
Ctrie
16-ST32-ST
SL
64-ST
Number of threads
Throughput
(millions)
50% search, 5% insert, 5% delete, 40% range query size 100
Many queries with bigger ranges
Snap
Ctrie
16-ST
32-ST
SL
64-ST
Number of threads
Throughput(hundred
thousands)
50% search, 5% insert, 5% delete, 40% range query size 10,000
Few queries (with small ranges)
Snap
Ctrie
16-ST
32-ST
SL
64-ST
Number of threads
Throughput(ten
millions)
59% search, 20% insert, 20% delete, 1% range query size 100
Throughput versus P(range query)
Snap
Ctrie
KST16KST32
SL
KST64
Probability of range query
Throughput(ten
millions) 1:10000 operations is RQ
Throughput versus arity
5i-5d-40r-size10000
20i-20d-1r-size10000
5i-5d-40r-size100
20i-20d-1r-size100
Degree of tree
Throughput(ten
millions)
Conclusion● Provably correct algorithm● Searches can ignore concurrent updates
● Although dirty bits invalidate range queries,they do not invalidate searches
● Range queries are invisible● No CAS, don’t change data structure
● Avoids excessive duplication of nodes● Appears to be practical when workloads contain
queries over small ranges
Future work● Adding balance
– (a, b)-tree, chromatic tree, relaxed AVL tree
● Wait-freedom?
Recommended