The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases

Presentation by Aaron Kabcenell

The adaptive radix tree: ARTful indexing for main-memory databases. Viktor Leis, Alfons Kemper, Thomas Neumann. International Conference on Data Engineering (ICDE), 2013




https://en.wikipedia.org/wiki/The_Starry_Night#/media/File:Van_Gogh_-_Starry_Night_-_Google_Art_Project.jpg


What is the problem?


Main Memory Indexing

Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation. Harald Lang, Tobias Mühlbauer, Florian Funke, Peter A. Boncz, Thomas Neumann, Alfons Kemper. ACM SIGMOD International Conference on Management of Data, 2016

?


Why is it important?


OLTP Workloads limited by Index Performance

Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation. Harald Lang, Tobias Mühlbauer, Florian Funke, Peter A. Boncz, Thomas Neumann, Alfons Kemper. ACM SIGMOD International Conference on Management of Data, 2016


Why is it hard?


Hash Tables

Fast, O(1) access time

x Point queries only

x Overflow causes periodic latency

A study of index structures for main memory database management systems. T. J. Lehman and M. J. Carey. International Conference on Very Large Databases (VLDB), 1986


Trees

Keeps elements ordered

x Not ideal for modern hardware (cache misses, pipeline stalls)

Modern B-Tree Techniques, by Goetz Graefe, Foundations and Trends in Databases, 2011


Can we get fast, fully featured indexing?


Why do existing solutions not work?


T-Trees

A study of index structures for main memory database management systems. T. J. Lehman and M. J. Carey. International Conference on Very Large Databases (VLDB), 1986

Balance of space overhead and search speed

x Significant amounts of data stored per node, but only two pointers used

x Poor cache behavior


Cache Sensitive B+-Trees

J. Rao and K. A. Ross, “Making B+ trees cache conscious in main memory”, SIGMOD, 2000.

Stores only one child pointer per node

Higher fan-out, since more keys fit in each cache-line-sized node

x Many comparisons cause pipeline stalls


Fast Architecture Sensitive Trees

C. Kim et al., “FAST: fast architecture sensitive tree search on modern CPUs and GPUs”, SIGMOD, 2010.

Binary Tree

SIMD Blocking

Three Level Hierarchy

Reduce comparisons by matching structure to SIMD vector size

Reduce cache misses by matching structure to cache line size

Pointer-free: nodes are stored in arrays and located by offset calculations

x Expensive updates


Radix Tree

• Two factors determine performance:
  • k: key length in bits
  • s: span, the number of key bits stored in each node
• Tree has k/s levels
• Node has 2^s pointers

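[Figure: radix tree storing the keys AND, ANT, ANY, ARE, ART. Root A has children N and R; N has children D, T, Y; R has children E, T.]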
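A quick worked example of how k and s trade off, as a minimal sketch. The function name `radix_tree_shape` is made up for illustration, and the level count is rounded up so that key lengths that are not a multiple of the span still work (an assumption; the slide simply writes k/s).

```python
def radix_tree_shape(k, s):
    """Return (levels, pointers_per_node) for k-bit keys and an s-bit span."""
    if k <= 0 or s <= 0:
        raise ValueError("k and s must be positive")
    levels = (k + s - 1) // s   # k/s, rounded up
    pointers_per_node = 2 ** s  # one slot per possible s-bit value
    return levels, pointers_per_node


if __name__ == "__main__":
    for span in (1, 4, 8):
        levels, pointers = radix_tree_shape(32, span)
        print(f"span={span}: {levels} levels, {pointers} pointers per node")
```

For 32-bit keys, a span of 8 gives 4 levels with 256 pointers per node, while a span of 1 gives 32 levels with 2 pointers per node: a larger span means a shallower tree but wider, emptier nodes.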


Radix Tree

Complexity of operations depends on key length, not on the number of keys

Keys are ordered and stored implicitly

Tree shape is independent of insertion order; no rebalancing needed

x Mostly studied for character strings

x Poor space usage due to large number of null paths

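The properties above can be checked on a toy trie. This is only a sketch: it uses Python dicts as nodes with one character per level, whereas a real radix tree node is an array of 2^s pointers, so the null-path space problem does not show up here. The names `insert`, `contains`, `keys_in_order`, and `build` are hypothetical.

```python
END = object()  # marker entry: a stored key ends at this node


def insert(root, key):
    """Walk or create one node per character; cost depends only on len(key)."""
    node = root
    for ch in key:
        node = node.setdefault(ch, {})
    node[END] = True


def contains(root, key):
    node = root
    for ch in key:
        node = node.get(ch)
        if node is None:
            return False
    return END in node


def keys_in_order(node, prefix=""):
    """Yield stored keys in sorted order; keys are implied by the paths."""
    if END in node:
        yield prefix
    for ch in sorted(c for c in node if c is not END):
        yield from keys_in_order(node[ch], prefix + ch)


def build(keys):
    root = {}
    for key in keys:
        insert(root, key)
    return root


if __name__ == "__main__":
    words = ["ART", "AND", "ARE", "ANY", "ANT"]
    # Same structure regardless of insertion order, and no rebalancing step.
    print(build(words) == build(sorted(words)))
    print(list(keys_in_order(build(words))))
```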


What is the core intuition for the solution?


Adaptive Nodes to Reduce Space Consumption


Adaptive Node Types
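The slide itself only names the idea, so the following is a rough sketch of what "adaptive" can mean in code. It assumes the four node capacities described in the cited ART paper (4, 16, 48, and 256 children) and models only the capacity steps. The paper's per-type memory layouts are not reproduced, and the class name `AdaptiveNode` is hypothetical.

```python
import bisect

# Assumption: capacity steps taken from the cited paper, not from the slide.
CAPACITIES = (4, 16, 48, 256)


class AdaptiveNode:
    """Inner node for an 8-bit span that starts small and grows when full."""

    def __init__(self):
        self.capacity = CAPACITIES[0]
        self.keys = []      # sorted partial-key bytes present in this node
        self.children = []  # children[i] is the child for keys[i]

    def find_child(self, byte):
        i = bisect.bisect_left(self.keys, byte)
        if i < len(self.keys) and self.keys[i] == byte:
            return self.children[i]
        return None

    def add_child(self, byte, child):
        if not 0 <= byte <= 255:
            raise ValueError("partial key must fit in one byte")
        i = bisect.bisect_left(self.keys, byte)
        if i < len(self.keys) and self.keys[i] == byte:
            self.children[i] = child  # overwrite an existing entry
            return
        if len(self.keys) == self.capacity:
            self._grow()
        self.keys.insert(i, byte)
        self.children.insert(i, child)

    def _grow(self):
        # Move to the next larger node type instead of always reserving 256 slots.
        self.capacity = CAPACITIES[CAPACITIES.index(self.capacity) + 1]
```

A node with three children reserves space for 4 entries rather than 256, which is where the space saving over a plain radix tree with an 8-bit span comes from.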


Path Compression


Worst-Case Space Consumption


Binary-Comparable Keys

• Unsigned integers:
  • Binary representation already sorted
• Signed integers:
  • Flip sign bit and store as unsigned integers
• Floating point numbers:
  • Separate into positive, negative, normalized, denormalized, NaN, Inf, or 0
  • Reorder and store as unsigned integers
• Character strings:
  • Standard libraries available
• Null:
  • Add one byte to key length to encode Null value
• Compound keys:
  • Transform attributes individually and concatenate
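A minimal sketch of the integer, null, and compound cases from the list above, assuming 32-bit values. Bytes are written big-endian so that byte-wise comparison matches numeric order. The function names are hypothetical, and the null encoding (a leading 0x00 byte so that nulls sort first, 0x01 otherwise) is one possible choice rather than something the slide specifies.

```python
import struct


def encode_uint32(value):
    """Unsigned integers: the big-endian bytes already sort correctly."""
    return struct.pack(">I", value)


def encode_int32(value):
    """Signed integers: flip the sign bit, then store as unsigned."""
    if not -2**31 <= value < 2**31:
        raise ValueError("value does not fit in 32 bits")
    return struct.pack(">I", (value & 0xFFFFFFFF) ^ 0x80000000)


def encode_nullable_int32(value):
    """Null: one extra leading byte (assumed convention: 0x00 means null)."""
    if value is None:
        return b"\x00" + b"\x00" * 4  # padded so all keys have equal length
    return b"\x01" + encode_int32(value)


def encode_compound(*encoded_parts):
    """Compound keys: concatenate the individually transformed attributes."""
    return b"".join(encoded_parts)


if __name__ == "__main__":
    values = [7, -1, 0, -2**31, 2**31 - 1, -5]
    # Sorting by encoded bytes gives the same order as sorting by value.
    print(sorted(values, key=encode_int32) == sorted(values))
```

Without the sign-bit flip, negative numbers (whose two's-complement form starts with a 1 bit) would sort after all positive numbers.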


What is the setup of analysis/experiments? Is it sufficient?


Micro Benchmarks

• Use 32-bit integers as keys
  • Path compression disabled for short keys
• Two different key distributions
  • Dense: keys range from 1 to the tree size
  • Sparse: each bit is equally likely to be 0 or 1
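The two key distributions could be generated roughly as follows. This is a sketch, not the authors' benchmark code: the function names, the fixed seed, and the choice to deduplicate the sparse keys are all assumptions.

```python
import random


def dense_keys(n):
    """Dense: every integer from 1 to the tree size n."""
    return list(range(1, n + 1))


def sparse_keys(n, seed=0):
    """Sparse: n distinct 32-bit keys, each bit 0 or 1 with equal probability."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    keys = set()
    while len(keys) < n:
        keys.add(rng.getrandbits(32))
    return list(keys)


if __name__ == "__main__":
    print(dense_keys(5))
    print(len(sparse_keys(1000)))
```

Dense keys share long common prefixes, so upper-level nodes fill up completely. Sparse keys spread across the whole 32-bit range, so most nodes hold only a few children.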


Search Performance

[Figure panels: 65K keys, single thread; 16M keys, single thread; 256M keys, single thread; 16M keys, multi-thread]


Caching Effects


Updates


TPC-C Benchmark

• OLTP benchmark describing a merchandising company
  • Includes selects, inserts, deletes
  • Write-heavy
• Integrates ART into HyPer
  • Depends heavily on index performance


TPC-C Benchmark Using HyPer


Gaps and Next steps?


Gaps and Next Steps

• Own implementation of competing data structures

• Sparse vs dense key performance

• Ideal node number and size?

• Synchronizing concurrent updates