All About Bitmap Indexes. . . And Sorting Them

Daniel Lemire

http://www.daniel-lemire.com/

Joint work (presented at BDA’08 and DOLAP’08) with Owen Kaser (UNB) and Kamel Aouiche (post-doc).

February 12, 2009


Database Indexes

Databases use precomputed indexes (auxiliary data structures) to speed up processing.

An index costs memory and can slow down updates.

Improving indexes is practically important.







What makes indexes fast?

We are going to use these three ideas:

Expect specific queries? Avoid a full scan!

Data is not random? Compress it!

A specific computer architecture? Tailor your code for it!














Bitmap indexes

SELECT * FROM T WHERE x=a AND y=b;

Bitmap indexes have a long history. (1972 at IBM.)

Long history with DW & OLAP. (Sybase IQ since mid 1990s.)

Main competition: B-trees.

For the query above, compute

{r | r is the row id of a row where x = a} ∩ {r | r is the row id of a row where y = b}
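The intersection above can be sketched in Python, assuming each bitmap is stored as an integer whose bit r is set when row r matches. The toy columns and the helper names `build_bitmaps` and `row_ids` are made up for illustration, and row ids start at 0 here.

```python
from collections import defaultdict

def build_bitmaps(column):
    """Map each distinct value to an int whose bit r is set when column[r] == value."""
    bitmaps = defaultdict(int)
    for row_id, value in enumerate(column):
        bitmaps[value] |= 1 << row_id
    return bitmaps

def row_ids(bitmap):
    """List the positions of the set bits, i.e. the matching row ids."""
    return [r for r in range(bitmap.bit_length()) if (bitmap >> r) & 1]

# Hypothetical table T with two columns x and y.
x = ["a", "b", "a", "a", "c"]
y = ["b", "b", "c", "b", "b"]

index_x = build_bitmaps(x)
index_y = build_bitmaps(y)

# WHERE x = 'a' AND y = 'b': the set intersection is a single bitwise AND.
matches = row_ids(index_x["a"] & index_y["b"])
print(matches)
```

Only the two bitmaps named in the predicate are touched; the table itself is never scanned.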





Bitmaps and fast AND/OR operations

Computing the union of two sets of integers between 1 and 64 (e.g., row ids in a trivial table), such as {1, 5, 8} ∪ {1, 3, 5}?

Can be done in one operation by a CPU: BitwiseOR(10001001, 10101000) = 10101001, that is, {1, 3, 5, 8}.

Extend to sets from 1..N using ⌈N/64⌉ operations.

To compute [a0, . . . , aN−1] ∨ [b0, b1, . . . , bN−1]:
a0, . . . , a63 BitwiseOR b0, . . . , b63;
a64, . . . , a127 BitwiseOR b64, . . . , b127;
a128, . . . , a191 BitwiseOR b128, . . . , b191;
. . .
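A small Python sketch of the word-by-word union, assuming 64-bit words and row ids numbered from 1 as on the slide. The helper names are hypothetical; Python integers are masked to 64 bits to mimic machine words.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1

def to_words(row_ids, n):
    """Pack a set of row ids in 1..n into ceil(n/64) 64-bit words."""
    words = [0] * ((n + WORD_BITS - 1) // WORD_BITS)
    for r in row_ids:
        words[(r - 1) // WORD_BITS] |= 1 << ((r - 1) % WORD_BITS)
    return words

def union(a_words, b_words):
    """One bitwise OR per pair of words."""
    return [(a | b) & WORD_MASK for a, b in zip(a_words, b_words)]

def to_row_ids(words):
    """Unpack the words back into a sorted list of row ids."""
    return [w * WORD_BITS + b + 1
            for w, word in enumerate(words)
            for b in range(WORD_BITS) if (word >> b) & 1]

# The slide's example fits in a single word, so the union is one OR.
small = union(to_words({1, 5, 8}, 64), to_words({1, 3, 5}, 64))
print(to_row_ids(small))

# With N = 200 rows, ceil(200/64) = 4 words and 4 ORs are needed.
big = union(to_words({1, 70, 200}, 200), to_words({64, 70, 129}, 200))
print(len(big), to_row_ids(big))
```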







Common applications of the bitmaps

The Java language has had a bitmap class since the beginning: java.util.BitSet.

(Sun’s implementation is based on 64-bit words.)

Search engines use bitmaps to filter queries, e.g., Apache Lucene.






Bitmap compression

[Figure: a column x of n rows next to its L index bitmaps (x=1, x=2, x=3, ...), one bitmap per distinct value.]

A column with n rows and L distinct values ⇒ nL bits

E.g., n = 10^6, L = 10^4 → 10^10 bits = 10 Gbits

Uncompressed bitmaps are often impractical.

Moreover, bitmaps often contain long streams of zeroes. . .

Logical operations over these zeroes are a waste of CPU cycles.
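The arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is hypothetical, and it assumes one bitmap per distinct value, so that each row sets exactly one bit.

```python
def uncompressed_index_bits(n_rows, n_distinct):
    """Size in bits of an uncompressed bitmap index: one bit per (row, value) pair."""
    return n_rows * n_distinct

n, L = 10**6, 10**4
bits = uncompressed_index_bits(n, L)
gigabytes = bits / 8 / 10**9

# With one bitmap per value, every row sets exactly one bit, so only n bits are 1.
fraction_of_ones = n / bits

print(bits, gigabytes, fraction_of_ones)
```

Only one bit in L is set, which is why the bitmaps are dominated by long streams of zeroes.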







How to compress bitmaps?

Must handle long streams of zeroes efficiently ⇒ Run-length encoding (RLE)?

Bitmap: a run of 0s, a run of 1s, a run of 0s, a run of 1s, . . .

So just encode the run lengths, e.g.,

0001111100010111 → 3, 5, 3, 1, 1, 3
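The run-length example can be reproduced with a short Python sketch. The bitmap is assumed to be given as a string of '0' and '1' characters, and the function names are illustrative only. The first bit is kept alongside the run lengths so that the bitmap can be rebuilt.

```python
from itertools import groupby

def run_lengths(bits):
    """Return (first_bit, [run lengths]) for a string of '0'/'1' characters."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    return (bits[0] if bits else None), runs

def expand(first_bit, runs):
    """Rebuild the bit string: runs alternate, starting with first_bit."""
    out, bit = [], first_bit
    for length in runs:
        out.append(bit * length)
        bit = "1" if bit == "0" else "0"
    return "".join(out)

first, runs = run_lengths("0001111100010111")
print(first, runs)
```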







Compressing better with delta codes

RLE can make things worse.

E.g., use 8-bit counters, then 11 may become 000000101 (9 bits instead of 2).

How many bits to use for the counters?

Universal codes such as delta codes use no more than c log x bits to represent value x.

Recall Gamma codes: 0 is 0, 1 is 1, 01 is 2, 001 is 3, 0001 is 4, etc.

Delta codes build on Gamma codes.

Two steps: write x = 2^N + (x mod 2^N), then

write N − 1 as a gamma code;
write x mod 2^N as an N-bit number.

E.g., 17 = 2^4 + 1, so N = 4: gamma code 001 for N − 1 = 3, then 0001, giving 0010001.
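The example for 17 can be reproduced with the sketch below. It follows the simplified code table listed on this slide, which is not the same as the textbook Elias gamma and delta codes, and it writes the remainder on N bits because that is what the example 0010001 requires. The function names are hypothetical.

```python
def slide_gamma(v):
    """Code listed on the slide: 0 -> '0', and v >= 1 -> (v - 1) zeros then a '1'."""
    if v < 0:
        raise ValueError("v must be non-negative")
    return "0" if v == 0 else "0" * (v - 1) + "1"

def slide_delta(x):
    """Encode x >= 2 as gamma(N - 1) followed by x mod 2^N on N bits."""
    if x < 2:
        raise ValueError("this sketch only handles x >= 2")
    n = x.bit_length() - 1              # x = 2^n + (x mod 2^n)
    remainder = x - (1 << n)
    return slide_gamma(n - 1) + format(remainder, "0{}b".format(n))

print(slide_delta(17))
```

The code length grows with log x rather than being fixed in advance, which is the point of using a universal code for the counters.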





RLE with delta codes is pretty good

In some (weak) sense, RLE compression with delta codes is optimal!

Theorem

A bitmap index over an N-value column of length n, compressed with RLE and delta codes, uses O(n log N) bits.


Is the compression rate what matters?

There is endless debate about whether more compression is better:

Solid-State Drives (SSD) have 10× the bandwidth? All problems are CPU-bound!

Multi-core CPUs? All problems are I/O-bound!

Store your indexes in RAM? All problems are CPU-bound!

. . .

No definitive answer on whether more compression is better. It depends!










Byte/Word-aligned RLE

RLE variants can focus on runs that align with machine-word boundaries.

Trade compression for speed.

That is what Oracle is doing.

Variants: BBC (byte-aligned), WAH (word-aligned hybrid).

Our EWAH (known to Wu as WBC) extends Wu et al.’s word-aligned hybrid.

0101000000000000 000. . . 000 000. . . 000 0011111111111100 . . .
⇒ dirty word, run of 2 “clean 0” words, dirty word. . .
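The decomposition into dirty words and runs of clean words can be sketched as follows. This only shows the classification step on a bit string with 16-bit words, as in the example; it is not the actual WAH or EWAH word layout, and the function name is made up.

```python
def classify_words(bits, word_size=16):
    """Split a bit string into words and label each as dirty or clean."""
    segments = []
    for i in range(0, len(bits), word_size):
        word = bits[i:i + word_size]
        if set(word) == {"0"}:
            kind = "clean 0"
        elif set(word) == {"1"}:
            kind = "clean 1"
        else:
            kind = "dirty"
        # Merge consecutive clean words of the same kind into one run.
        if kind != "dirty" and segments and segments[-1][0] == kind:
            segments[-1] = (kind, segments[-1][1] + 1)
        else:
            segments.append((kind, 1))
    return segments

bitmap = "0101000000000000" + "0" * 16 + "0" * 16 + "0011111111111100"
print(classify_words(bitmap))
```

A run of clean words costs the same to store whatever its length, while each dirty word is stored verbatim.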


































Computational and storage bounds

n → number of rows, c → number of 1s per row;

Model storage cost as #(dirty words) + #(clean words, 0x00)

Storage is in O(nc);

Bounds do not depend on the number of bitmaps (assuming O(n) bitmaps).

Construction time is proportional to index size. (Data is written sequentially on disk.)

Implementation scales to millions of bitmaps.










































What about other compression types?

Why not compress using other techniques (Huffman, LZ77, arithmetic coding, . . . )?

With RLE-like compression, we can compute B1 ∨ B2 or B1 ∧ B2 in time O(|B1| + |B2|).

We don’t know how to do this using the other compression techniques!

Hence, with RLE, compression saves both storage and CPU cycles!
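To illustrate the linear-time claim, here is a minimal Python sketch of OR on two run-length-encoded bitmaps. It assumes a simple representation, a list of (bit, length) runs, which is not the word-aligned format of the talk. The function name `rle_or` is hypothetical.

```python
def rle_or(a, b):
    """OR two run-length-encoded bitmaps given as lists of (bit, length) runs.

    Both inputs must describe the same number of bits. Each loop iteration
    finishes at least one input run, so the time is O(len(a) + len(b)).
    """
    out = []
    i = j = 0
    left_a = a[0][1] if a else 0
    left_b = b[0][1] if b else 0
    while i < len(a) and j < len(b):
        step = min(left_a, left_b)
        bit = a[i][0] | b[j][0]
        if out and out[-1][0] == bit:
            out[-1] = (bit, out[-1][1] + step)   # extend the current output run
        else:
            out.append((bit, step))
        left_a -= step
        left_b -= step
        if left_a == 0:
            i += 1
            left_a = a[i][1] if i < len(a) else 0
        if left_b == 0:
            j += 1
            left_b = b[j][1] if j < len(b) else 0
    return out

# 0000 0000 1111 0000  OR  0011 0000 0000 0000  (16 bits each)
a = [(0, 8), (1, 4), (0, 4)]
b = [(0, 2), (1, 2), (0, 12)]
print(rle_or(a, b))
```

The bitmaps are never decompressed: the work depends on the number of runs, not on the number of rows.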




















What happens when you have many bitmaps?

Consider B1 ∨ B2 ∨ . . . ∨ BN.

First compute the first two: B1 ∨ B2 in time O(|B1| + |B2|).

|B3 ∨ B4| is in O(|B3| + |B4|).

Thus (B1 ∨ B2) ∨ (B3 ∨ B4) takes $O(2 \sum_i |B_i|)$. . .

Total is in $O(\sum_{i=1}^{N} |B_i| \log N)$ [Lemire et al., 2009].
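The pairing strategy can be sketched as a round-by-round reduction. In this sketch Python integers stand in for compressed bitmaps, which is an assumption made for brevity, and `or_all` is a hypothetical name. Each round touches every surviving bitmap once, and there are about log2 N rounds, which is where the log N factor comes from.

```python
def or_all(bitmaps):
    """OR a list of bitmaps pairwise, round by round (a balanced binary tree).

    Returns (result, number_of_rounds).
    """
    if not bitmaps:
        return 0, 0
    level = list(bitmaps)
    rounds = 0
    while len(level) > 1:
        # Combine neighbours: (B1 v B2), (B3 v B4), ...; an odd one is carried over.
        nxt = [level[i] | level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        rounds += 1
    return level[0], rounds

print(or_all([1 << i for i in range(8)]))
```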






Improving compression by sorting the table

RLE, BBC, WAH, EWAH are order-sensitive: they compress sorted tables better;

But finding the best row ordering is NP-hard [Lemire et al., 2009].

Lexicographic row sorting is

fast, even for very large tables;
easy: sort is a Unix staple.

Substantial index-size reductions (often by a factor of 2.5).
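A toy illustration of why sorting helps, assuming the number of runs is a reasonable proxy for the compressed size. The table and the function name `total_runs` are made up, and the index uses one bitmap per distinct value.

```python
from itertools import groupby

def total_runs(rows):
    """Count runs over all bitmaps of a 1-of-N index built on the given rows."""
    runs = 0
    for col in range(len(rows[0])):
        for value in sorted({row[col] for row in rows}):
            bitmap = [row[col] == value for row in rows]
            runs += sum(1 for _ in groupby(bitmap))
    return runs

# Hypothetical toy table (paint, maker).
table = [
    ("red", "ford"), ("blue", "honda"), ("green", "ford"),
    ("blue", "ford"), ("red", "honda"), ("blue", "honda"),
]
unsorted_runs = total_runs(table)
sorted_runs = total_runs(sorted(table))   # lexicographic row sort
print(unsorted_runs, sorted_runs)
```

In this toy case all of the gain comes from the first column, the primary sort key.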






























Improving compression via k-of-N encoding

value   1-of-N
cat     100000
dog     010000
dish    001000
fish    000100
cow     000010
cat     100000
pony    000001

With L bitmaps, you can represent L values by mapping each value to one bitmap;

Alternatively, you can represent $\binom{L}{2} = L(L-1)/2$ values by mapping each value to a pair of bitmaps;

More generally, you can represent $\binom{L}{k}$ values by mapping each value to a k-tuple of bitmaps;

At query time, you need to load k bitmaps in a look-up for one value;

You trade query-time performance for fewer bitmaps;

Often, fewer bitmaps translate into a smaller index, created faster.
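A minimal sketch of k-of-N encoding in Python. The assignment of bitmap subsets to values is arbitrary here (it follows the order produced by `itertools.combinations`), so the codes need not match any particular assignment, and all function names are illustrative. Bitmaps are Python integers with bit r set for row r.

```python
from itertools import combinations
from math import comb

def k_of_n_codes(values, n_bitmaps, k):
    """Assign each distinct value a distinct set of k bitmaps out of n_bitmaps."""
    distinct = sorted(set(values))
    if len(distinct) > comb(n_bitmaps, k):
        raise ValueError("not enough k-subsets for this many values")
    return dict(zip(distinct, combinations(range(n_bitmaps), k)))

def build_index(values, n_bitmaps, k):
    """Build the bitmaps: bit r of bitmaps[b] is set when row r uses bitmap b."""
    codes = k_of_n_codes(values, n_bitmaps, k)
    bitmaps = [0] * n_bitmaps
    for row, value in enumerate(values):
        for b in codes[value]:
            bitmaps[b] |= 1 << row
    return codes, bitmaps

def lookup(value, codes, bitmaps):
    """A look-up for one value loads and ANDs its k bitmaps."""
    result = -1                          # all ones
    for b in codes[value]:
        result &= bitmaps[b]
    return [r for r in range(result.bit_length()) if (result >> r) & 1]

column = ["cat", "dog", "dish", "fish", "cow", "cat", "pony"]
codes, bitmaps = build_index(column, n_bitmaps=4, k=2)
print(len(bitmaps), lookup("cat", codes, bitmaps))
```

Six distinct values fit in four bitmaps with k = 2, instead of six bitmaps with k = 1, at the price of one extra AND per look-up.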



value   2-of-N
cat     1100
dog     1010
dish    1001
fish    0110
cow     0101
cat     1100
pony    0011






Encode then sort? Or vice versa?

Two different conceptual approaches:

1 Encode attributes in table, obtaining an uncompressed index
Sort the index rows
Compress each column

2 Sort the table rows
Encode attributes in table, build compressed index on-the-fly.

Approach 1:

paint   maker
red     ford      ⇒   1 1 0 1 0 1   ⇒   0 1 1 1 0 1
blue    honda         1 0 1 0 1 1       1 0 1 0 1 1
green   ford          0 1 1 1 0 1       1 1 0 1 0 1
. . .   . . .         . . .             . . .

Approach 2:

paint   maker         paint   maker
red     ford      ⇒   blue    honda   ⇒   1 0 1 0 1 1
blue    honda         green   ford        1 1 0 1 0 1
green   ford          red     ford        0 1 1 1 0 1
. . .   . . .         . . .   . . .       . . .






Gray-code order

Lex. order    Gray-code
0 1 1         0 1 1
0 1 1         0 1 1
1 0 1         1 1 0
1 0 1         1 1 0
1 1 0         1 1 1
1 1 0         1 1 1
1 1 1         1 1 1
1 1 1         1 0 1
1 1 1         1 0 1

Gray-code (GC) order is an alternative to lexicographical order (defined only for bit arrays);

May improve compression more than lex. sort (k > 1);

[Pinar et al., 2005] process an uncompressed bitmap index.

Slow, if uncompressed index does not fit in RAM.

GC order is not supported by DBMSes or Unix utilities.
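The two orders can be compared with a short Python sketch, assuming the reflected binary Gray code. The sort key `gray_rank` (a hypothetical name) converts a row of bits to its position in the Gray-code sequence. Note that this sorts uncompressed index rows in memory, which is the expensive route.

```python
def gray_rank(bits):
    """Position of a bit tuple in the reflected binary Gray code sequence."""
    g = int("".join(map(str, bits)), 2)
    rank = g
    g >>= 1
    while g:
        rank ^= g
        g >>= 1
    return rank

rows = [(0, 1, 1), (0, 1, 1), (1, 0, 1), (1, 0, 1), (1, 1, 0),
        (1, 1, 0), (1, 1, 1), (1, 1, 1), (1, 1, 1)]

lex_order = sorted(rows)                   # plain lexicographic order
gray_order = sorted(rows, key=gray_rank)   # Gray-code order
for lex_row, gray_row in zip(lex_order, gray_order):
    print(lex_row, gray_row)
```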









Gray-code sorting, cheaply

Size improvement is small (usually < 4%), but it’s essentially free:

1 What Pinar et al. do: expensive GC sort after encoding,
e.g., [Tax, Cat, Girl, Cat] → sort([1100, 0110, 1001, 0110]);

2 Instead, sort the table lexicographically, comparing values alphabetically or by frequency (easy);
e.g., [Tax, Cat, Girl, Cat] → [Cat, Cat, Girl, Tax]

3 Map ordered values to k-tuples of bitmaps ordered as Gray codes: Cat: 0011, Dog: 0110, Girl: 0101, Tax: 1100;

Lex ascending sequence: Cat, Dog, Girl, Tax.
GC ascending sequence: 0011, 0110, 0101, 1100 for codes.

e.g., [Cat, Cat, Girl, Tax] → [0011, 0011, 0101, 1100]
(generates a GC-sorted result without expensive GC sorting).

4 Easily extended for > 1 columns.

In our tests, this is as good as a Gray-code bitmap index sort [Pinar et al., 2005], but technically much easier.
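Steps 2 and 3 can be sketched for a single column as follows, assuming the reflected binary Gray code and 2-of-4 codes as in the example. The function names and the explicit `domain` argument (the list of all possible values) are assumptions made for illustration.

```python
from itertools import combinations

def gray_rank(code):
    """Position of an integer bit pattern in the reflected binary Gray code."""
    rank = code
    code >>= 1
    while code:
        rank ^= code
        code >>= 1
    return rank

def gray_ordered_codes(n_bitmaps, k):
    """All n_bitmaps-bit patterns with exactly k ones, in Gray-code order."""
    codes = [sum(1 << b for b in subset)
             for subset in combinations(range(n_bitmaps), k)]
    return sorted(codes, key=gray_rank)

def encode_sorted(values, domain, n_bitmaps, k):
    """Sort values lexicographically, then map them to Gray-ordered k-of-N codes."""
    codes = gray_ordered_codes(n_bitmaps, k)
    if len(domain) > len(codes):
        raise ValueError("not enough codes for this domain")
    mapping = dict(zip(sorted(domain), codes))
    return [mapping[v] for v in sorted(values)], mapping

encoded, mapping = encode_sorted(
    ["Tax", "Cat", "Girl", "Cat"], ["Cat", "Dog", "Girl", "Tax"], n_bitmaps=4, k=2)
print([format(c, "04b") for c in encoded])
```

The only sort performed is the ordinary lexicographic sort of the values; the Gray-code order of the encoded rows follows from how the codes were assigned.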

















































What about “other” Gray-codes?

Define Gray-code to be a way to list all bit vectors while minimizing Hamming distances [Knuth, 2005, § 7.2.1.1].

There are other alternatives [Goddyn and Gvozdjak, 2003, Savage and Winkler, 1995].

Our tests suggest traditional Gray codes are best.












Test data sets

Previous studies used data sets where the uncompressed index would fit in RAM.

Do their results apply to more realistic data sets?

Our tests: mix of real and synthetic data,

up to 877 M rows, 22 GB, 4 M attribute values;
using 4–10 columns/dimensions.






When sorting, column order matters

The first column(s) gain more from the sort (column 1 is the primary sort key);

Its bitmaps (first 11 in the example) are compressed well, compared to a “random sort”. (Red above green.)

The least important column’s bitmaps (43–49) don’t gain much (red vs. green).

[Figure: Compression on TWEED-4d. x-axis: bitmap rank (ticks at 11, 18, 43, 49); y-axis: 1-C/N, from 0 to 1; series: Gray, Random-sort.]



Conceptually, we may wish to reorder columns, e.g., swap columns 1 & 3.

Column order is crucial (to successful sorting).

Finding the best ordering quickly remains open.

[Figure: Netflix: 24 column orderings. x-axis: column permutation (4321, 4312, ..., 1234); y-axis: index size, from 1e+08 to 5.5e+08; series: 1-of-N encoding, 4-of-N encoding.]



Progress toward choosing column order

Paper models “gain” of putting a given column first.

Idea: order columns greedily (by max gain).

Experimentally, this approach is not promising: the best orderings don’t seem to depend on gain.

Factors:

skews of columns
number of distinct values
k
density of column’s bitmaps









What usually works for dimension ordering?: k=1

For 1-of-N bitmaps, a density-based approach was okay:

Ordering rule, k = 1 : “sparse but not too sparse”

Order columns by decreasing

$\min\left(\frac{1}{n_i}, \frac{1 - 1/n_i}{4w - 1}\right)$, where

$n_i$ → the number of distinct values in column i,

w → the word size.

[Figure: plot of this quantity against the number of distinct values in the column (x-axis from 50 to 300).]

See 30–40% size reduction, merely knowing dimension sizes ($n_i$).
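The ordering rule for k = 1 is easy to apply in code. The sketch below assumes 32-bit words and a dictionary mapping hypothetical column names to their number of distinct values; the function names are made up.

```python
def density_key(n_i, w=32):
    """min(1/n_i, (1 - 1/n_i) / (4w - 1)) for a column with n_i distinct values."""
    return min(1.0 / n_i, (1.0 - 1.0 / n_i) / (4 * w - 1))

def order_columns(distinct_counts, w=32):
    """Return column names sorted by decreasing key ('sparse but not too sparse')."""
    return sorted(distinct_counts,
                  key=lambda name: density_key(distinct_counts[name], w),
                  reverse=True)

# Hypothetical columns and their numbers of distinct values.
columns = {"a": 10, "b": 128, "c": 1000}
print(order_columns(columns))
```

The two arguments of the min are equal when the number of distinct values is 4w, which is 128 for 32-bit words, so columns near that size come first.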












What usually works for dimension ordering?: k > 1

The density formula (with $n_i$ replaced by $\sqrt[k]{n_i}$) recommends poorly when k > 1. Our experiments on synthetic data give some guidance:

When k > 1, order columns by

1 descending skew

2 descending size

(And do the reverse when k = 1.)

Open issues, k > 1

1 How do we balance skew & size factors?

2 What other properties of the histograms are needed?










Bitmap-by-bitmap reordering

One might instead make the index, reorder its columns, then apply GC sort [Canahuate et al., 2006].

Our best implementation of this is ≈ 100 times slower and cannot handle larger data sets.

We tried several bitmap orders on DBGEN and Census. Out of 8 cases, only one gained, and only by 3%.

Canahuate et al. suggest ordering does not matter much, but we see factor-of-2 differences (??)

It seems sufficient (and much faster) to work with groups of bitmaps (reorder attributes, not bitmaps).






























Index size versus block-wise sorting

[Figure: Netflix. x-axis: number of blocks (0 to 700); y-axis: index size (MB), from 100 to 900; series: k=1, k=2, k=3, k=4.]

Instead of fully sorting the table, we sorted it block-wise;

Fewer blocks means a more complete sort;

Larger k means smaller index (in this case);

Index size diminishes drastically with sorting.
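One plausible reading of block-wise sorting, sketched in Python: cut the table into consecutive blocks of rows and sort each block independently. The function name and the way the block size is derived are assumptions, not a description of the experimental code.

```python
def blockwise_sort(rows, n_blocks):
    """Sort each of (at most) n_blocks consecutive blocks of rows independently."""
    if n_blocks < 1:
        raise ValueError("n_blocks must be at least 1")
    if not rows:
        return []
    size = -(-len(rows) // n_blocks)          # ceil(len(rows) / n_blocks)
    out = []
    for start in range(0, len(rows), size):
        out.extend(sorted(rows[start:start + size]))
    return out

# Hypothetical two-column rows.
rows = [(5, "e"), (3, "c"), (8, "h"), (1, "a"), (9, "i"), (2, "b"), (7, "g"), (4, "d")]
print(blockwise_sort(rows, 2))
```

With one block this is a full sort; with as many blocks as rows the table is left untouched.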









How do 64-bit words compare to 32-bit words?

We implemented EWAH using 16-bit, 32-bit and 64-bit words;

Only 32-bit and 64-bit are efficient;

64-bit indexes are nearly twice as large;

64-bit indexes are between 5% and 40% faster (despite higher I/O costs).




















Open Source Software?

Lemur Bitmap Index C++ Library: http://code.google.com/p/lemurbitmapindex/.

JavaEWAH: A compressed alternative to the Java BitSet class: http://code.google.com/p/javaewah/.



Future direction?

Need better mathematical modelling of bitmap compressed size in sorted tables;


Questions?

?


Canahuate, G., Ferhatosmanoglu, H., and Pinar, A. (2006). Improving bitmap index compression by data reorganization. http://hpcrd.lbl.gov/~apinar/papers/TKDE06.pdf (checked 2008-12-15).

Goddyn, L. and Gvozdjak, P. (2003). Binary gray codes with long bit runs. Electronic Journal of Combinatorics, 10(R27):1–10.

Knuth, D. E. (2005). The Art of Computer Programming, volume 4, chapter fascicle 2. Addison Wesley.

Lemire, D., Kaser, O., and Aouiche, K. (2009). Sorting improves word-aligned bitmap indexes. Available from http://arxiv.org/abs/0901.3751.

Pinar, A., Tao, T., and Ferhatosmanoglu, H. (2005). Compressing bitmap indices by data reorganization. In ICDE’05, pages 310–321.

Savage, C. and Winkler, P. (1995). Monotone gray codes and the middle levels problem. Journal of Combinatorial Theory, A, 70(2):230–248.

Wu, K., Otoo, E. J., and Shoshani, A. (2006). Optimizing bitmap indices with efficient compression. ACM Transactions on Database Systems, 31(1):1–38.

