C-Store: Tuple Reconstruction

Jianlin Feng, School of Software, Sun Yat-sen University, Mar 27, 2009

Motivation

In a Column-Oriented DBMS, columns are stored separately

Separate column values of the same logical tuple must be stitched together when the tuple is finally returned to a user.

How to Identify Column Values of the Same Logical Tuple?

Attach either physical or virtual tuple IDs, or positions, to column values.

In the Read Store of C-Store, a Storage Key is equal to a position in a column.

In the Write Store of C-Store, a Storage Key is physically stored as a tuple ID.

Tuple Reconstruction is easy if columns are sorted in the same order

Join on the positions instead of on the physical tuple ID.
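
A minimal Python sketch of why a shared sort order helps. The data and function names are illustrative assumptions, not C-Store code. With aligned columns, index i of every column belongs to the same logical tuple, so stitching is a positional zip; when values carry explicit tuple IDs and arrive in different orders, a lookup on the ID (a join) is needed.

```python
def stitch_by_position(col_a, col_b):
    """Columns share one sort order: index i in each column is the same tuple."""
    return list(zip(col_a, col_b))


def stitch_by_tuple_id(pairs_a, pairs_b):
    """Columns hold (tuple_id, value) pairs in arbitrary order: join on the ID."""
    b_by_id = dict(pairs_b)  # build a lookup table keyed on the tuple ID
    return [(a, b_by_id[tid]) for tid, a in pairs_a if tid in b_by_id]


# Illustrative data (hypothetical).
names = ["ann", "bob", "cat"]
ages = [31, 25, 47]
by_position = stitch_by_position(names, ages)

# Same data, but each value carries an explicit ID and the orders differ.
name_pairs = [(2, "cat"), (0, "ann"), (1, "bob")]
age_pairs = [(1, 25), (2, 47), (0, 31)]
by_id = stitch_by_tuple_id(name_pairs, age_pairs)
```

Both calls recover the same logical tuples; the positional version needs no lookup structure at all.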

Two Strategies of Tuple Reconstruction

Early Materialization (EM)

Whenever a column C1 is accessed, add C1 (concrete column values) to an intermediate tuple representation if C1 is needed by some later operator, or if C1 is one of the output columns.

Late Materialization (LM)

Construct tuples as late as possible.

Tuple Reconstruction: An Example (1)

Assume a relation R has 3 columns: R.a, R.b, R.c.

All 3 columns are sorted in the same order and are stored in separate files.

Suppose a query consists of 3 selection predicates σ1, σ2, σ3 over R.a, R.b, R.c respectively.

σ1 is the most selective predicate.

σ3 is the least selective predicate.

Tuple Reconstruction: An Example (2)

An early materialization strategy could process the query as follows:

Read in a block of R.a, a block of R.b, and a block of R.c from disk.

Stitch them together into block(s) of triples (R.a, R.b, R.c).

Apply σ1, σ2, σ3 in turn, allowing tuples that match the predicates to pass through.
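
The early materialization steps above can be sketched in Python as follows. This is an in-memory illustration only: the lists stand in for disk blocks, and the column values and predicates are made-up assumptions.

```python
def early_materialize(col_a, col_b, col_c, pred_a, pred_b, pred_c):
    """Stitch first, filter afterwards (early materialization)."""
    # Stitch the three position-aligned columns into (a, b, c) triples.
    triples = zip(col_a, col_b, col_c)
    # Apply the predicates in turn; only matching tuples pass through.
    return [
        (a, b, c)
        for a, b, c in triples
        if pred_a(a) and pred_b(b) and pred_c(c)
    ]


# Hypothetical column contents, all stored in the same order.
R_a = [5, 1, 7, 3, 9]
R_b = [10, 20, 30, 40, 50]
R_c = ["x", "y", "x", "x", "y"]

em_result = early_materialize(
    R_a, R_b, R_c,
    lambda a: a > 4,      # stands in for σ1
    lambda b: b < 50,     # stands in for σ2
    lambda c: c == "x",   # stands in for σ3
)
```

Note that every triple is constructed before any predicate runs, including triples that are later discarded.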

Tuple Reconstruction: An Example (3)

A late materialization strategy could process the query as follows:

First, scan R.a and output the positions in R.a that satisfy σ1.

Second, scan R.b and output the positions in R.b that satisfy σ2.

Third, scan R.c and output the positions in R.c that satisfy σ3.

Fourth, use a position-wise AND to find the intersection of the 3 position lists.

Finally, re-scan R.a, R.b, and R.c, extract the values of the records whose positions are in the intersection, and stitch these values together into output tuples.
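
The late materialization steps above can be sketched in Python as follows. It is an in-memory illustration under assumed data; the function names are hypothetical, and a set intersection stands in for the position-wise AND.

```python
def scan_positions(column, predicate):
    """Scan one column and return the positions whose value satisfies the predicate."""
    return [pos for pos, value in enumerate(column) if predicate(value)]


def late_materialize(columns, predicates):
    """Filter on positions first, build tuples last (needs at least one column)."""
    # Steps 1-3: one position list per column; no tuples are built yet.
    position_lists = [scan_positions(c, p) for c, p in zip(columns, predicates)]
    # Step 4: position-wise AND, expressed here as a set intersection.
    matching = sorted(set.intersection(*(set(pl) for pl in position_lists)))
    # Step 5: re-scan the columns, fetching only the surviving positions.
    return [tuple(col[pos] for col in columns) for pos in matching]


# Hypothetical column contents, all stored in the same order.
R_a = [5, 1, 7, 3, 9]
R_b = [10, 20, 30, 40, 50]
R_c = ["x", "y", "x", "x", "y"]

lm_result = late_materialize(
    [R_a, R_b, R_c],
    [lambda a: a > 4, lambda b: b < 50, lambda c: c == "x"],
)
```

Only the tuples at the surviving positions are ever constructed, at the price of touching each column a second time.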

Late Materialization: Potential Pros and Cons

+ Operating directly on position lists.

+ Constructing only relevant tuples.

- Re-scanning the base columns to form tuples.

Early Materialization Advantages

No need to re-scan a column.

If the re-scanning cost at tuple reconstruction time is high, early materialization has the advantage.

An Analytical Model for Comparing the Two Materialization Strategies

The model is composed of 3 types of operators:

Data Source (DS) operator

AND operator

Tuple Construction operator

These operators are enough for expressing simple queries using each materialization strategy.

Data Source (DS) operator: Case 1

Input

A column Ci of |Ci| blocks from disk.

A predicate with selectivity SF.

Output

A column of positions of the tuples that satisfy the predicate.

Used by late materialization.


Data Source (DS) operator: Case 2

Input

A column Ci of |Ci| blocks from disk.

A predicate with selectivity SF.

Output

A column of (position, value) pairs of the tuples that satisfy the predicate.

Used by early materialization.


Data Source (DS) operator: Case 3

Input

A column Ci of |Ci| blocks from disk or memory.

A list of positions, i.e., POSLIST.

Output

A column of the values corresponding to the positions in POSLIST.

Used by late materialization.


Data Source (DS) operator: Case 4

Input

A column Ci of |Ci| blocks from disk.

A predicate with selectivity SF.

A set of intermediate tuples of the form (pos, <a1, ..., an>).

Output

A set of intermediate tuples of the new form (pos, <a1, ..., an, an+1>), i.e., adding column Ci to the tuples.

Used by early materialization.
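
The four Data Source variants described above differ only in their inputs and outputs, which a short Python sketch can make concrete. The function names are hypothetical, lists stand in for column blocks, and the last variant assumes that tuples whose new value fails the predicate are dropped.

```python
def ds_positions(column, predicate):
    """Column + predicate -> positions of matching values (late materialization)."""
    return [pos for pos, v in enumerate(column) if predicate(v)]


def ds_pairs(column, predicate):
    """Column + predicate -> (position, value) pairs (early materialization)."""
    return [(pos, v) for pos, v in enumerate(column) if predicate(v)]


def ds_values_at(column, poslist):
    """Column + POSLIST -> values at those positions (late materialization)."""
    return [column[pos] for pos in poslist]


def ds_extend_tuples(column, predicate, tuples):
    """Column + predicate + (pos, values) tuples -> tuples widened by one value.

    Assumption of this sketch: a tuple is kept only if the column value at
    its position satisfies the predicate (early materialization).
    """
    return [
        (pos, values + (column[pos],))
        for pos, values in tuples
        if predicate(column[pos])
    ]


column = [5, 1, 7, 3]  # illustrative values


def keep(v):
    return v > 2


positions = ds_positions(column, keep)
pairs = ds_pairs(column, keep)
fetched = ds_values_at(column, [0, 3])
widened = ds_extend_tuples(column, keep, [(0, ("x",)), (1, ("y",)), (2, ("z",))])
```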

The AND Operator

Input: k position lists, inpos1,...,inposk.

Output: outpos, a new list of positions representing the intersection of those input lists.

Operating on positions is fast.
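
One way to picture the AND operator is a merge-style intersection of sorted position lists. The sketch below assumes each input list is sorted in ascending order with no duplicates, and that at least one list is given; a real system might use bitmaps or position ranges instead.

```python
from functools import reduce


def and_two(left, right):
    """Intersect two sorted position lists in one linear pass."""
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            out.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            i += 1
        else:
            j += 1
    return out


def and_positions(*inpos):
    """AND of k position lists: fold the pairwise intersection over the inputs."""
    return reduce(and_two, inpos)


outpos = and_positions([0, 2, 4, 7], [0, 1, 2, 3, 7], [2, 7, 9])
```

The pass only compares integers and never touches column values, which is why operating on positions is cheap.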

Tuple Construction Operators

The MERGE operator

Input: k sets of values VAL1,...,VALk.

Output: a set of k-ary tuples.

This operator is used to construct tuples at the top of a late materialization plan.

The SPC (Scan, Predicate, and Construct) operator

Input: k columns VAL1,...,VALk from disk; a set of predicates.

Output: a set of tuples that pass all predicates.

This operator can sit at the bottom of an early materialization plan.
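
A minimal Python sketch of the two tuple construction operators for k columns. The function names are hypothetical, the value lists are assumed to be position-aligned, and this SPC sketch assumes one predicate per column.

```python
def merge(*vals):
    """MERGE: combine k aligned value lists into k-ary tuples."""
    if len({len(v) for v in vals}) > 1:
        raise ValueError("all value lists must have the same length")
    return list(zip(*vals))


def spc(columns, predicates):
    """SPC: scan k columns, apply one predicate per column, construct tuples."""
    return [
        row
        for row in zip(*columns)
        if all(pred(value) for pred, value in zip(predicates, row))
    ]


merged = merge([5, 7], [10, 30])
scanned = spc(
    [[5, 1, 7], [10, 20, 30]],
    [lambda a: a > 4, lambda b: b < 30],
)
```

MERGE receives values that have already been filtered, so it does no predicate work; SPC does scanning, filtering, and construction in a single operator.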

Example Query Plans: EM

Example Query Plans: LM

Optimization in Late Materialization

Data Source Case 3: produce values from positions

Input

A column Ci of |Ci| blocks from disk or memory.

A list of positions, i.e., POSLIST.

Output

A column of the values corresponding to the positions in POSLIST.

Optimization

If the column is in memory, do not read it from disk, i.e., reduce the cost of re-scanning a column.

LM Optimization: Multi-Columns

A Multi-Column is a specialized data structure that allows blocks of column data to remain in memory after the first scan, so that those blocks can be easily scanned again later on.

It contains a memory-resident, horizontal partition of some subset of columns from a logical relation.

Components of a Multi-Column

A covering position range:

Indicates the virtual start position and end position of the horizontal partition.

An array of mini-columns:

A mini-column is the set of corresponding values for a specified position range of a column.

Each mini-column is kept compressed the same way as it was on disk.

A position descriptor:

Indicates which positions in the position range remain valid.

Construction of a Multi-Column

Initially a multi-column contains only one mini-column.

When a page of a column is read from disk, a mini-column is created with a position descriptor indicating that all positions are valid.

Each mini-column can be just a pointer to the page in the buffer.

A modified AND operator is used to merge two multi-columns into a wider multi-column.

The Use of a Multi-Column

If a DS Case 3 operator takes as input a multi-column rather than just a position list, then it has no need to re-scan the column (from disk).
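
The multi-column idea can be sketched in Python as follows. This is a simplified illustration under several assumptions: the position descriptor is a plain list of booleans, mini-columns are uncompressed lists, the modified AND only merges multi-columns that cover the same position range, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MultiColumn:
    start: int           # first position covered (inclusive)
    end: int             # last position covered (inclusive)
    mini_columns: dict   # column name -> values for positions start..end
    valid: list          # position descriptor: one bool per covered position


def from_page(name, start, values):
    """A freshly read page becomes a one-mini-column multi-column, all valid."""
    end = start + len(values) - 1
    return MultiColumn(start, end, {name: values}, [True] * len(values))


def filter_positions(mc, name, predicate):
    """Apply a predicate to one mini-column, clearing positions that fail."""
    valid = [ok and predicate(v) for ok, v in zip(mc.valid, mc.mini_columns[name])]
    return MultiColumn(mc.start, mc.end, mc.mini_columns, valid)


def and_multi(left, right):
    """Modified AND: intersect the descriptors and keep both mini-column sets."""
    if (left.start, left.end) != (right.start, right.end):
        raise ValueError("this sketch only merges identical position ranges")
    valid = [a and b for a, b in zip(left.valid, right.valid)]
    minis = {**left.mini_columns, **right.mini_columns}
    return MultiColumn(left.start, left.end, minis, valid)


def values_from_multi(mc, name):
    """DS Case 3 on a multi-column: values come from memory, not from disk."""
    return [v for ok, v in zip(mc.valid, mc.mini_columns[name]) if ok]


# Illustrative pages for two columns covering positions 0..3.
mc_a = filter_positions(from_page("a", 0, [5, 1, 7, 3]), "a", lambda v: v > 4)
mc_b = filter_positions(from_page("b", 0, [10, 20, 30, 40]), "b", lambda v: v < 30)
wide = and_multi(mc_a, mc_b)
a_values = values_from_multi(wide, "a")
b_values = values_from_multi(wide, "b")
```

Because the wider multi-column still holds the mini-columns, the final value extraction reads data that is already in memory.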

Predicted vs. Actual Behavior

Heuristic for Choosing Materialization Strategy

Use Late Materialization if a query contains aggregation, or if the selectivity of predicates in the query is small (few tuples pass the predicates).

Use Early Materialization otherwise, i.e., when neither condition for late materialization holds.
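
The heuristic can be written as a small decision function. The sketch below is an illustration only: the slides do not say how small a selectivity must be, so the cutoff is a caller-supplied, hypothetical parameter, and the value 0.1 used in the example calls is arbitrary.

```python
def choose_strategy(has_aggregation, selectivity, small_selectivity):
    """Return "LM" or "EM" following the heuristic.

    selectivity: fraction of tuples expected to pass the predicates (0.0 to 1.0).
    small_selectivity: hypothetical cutoff for "small"; the source gives no value.
    """
    if has_aggregation or selectivity <= small_selectivity:
        return "LM"
    return "EM"


# Example calls with an arbitrary cutoff of 0.1 (an assumption, not from the source).
aggregate_query = choose_strategy(True, 0.9, 0.1)
selective_query = choose_strategy(False, 0.05, 0.1)
broad_query = choose_strategy(False, 0.9, 0.1)
```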

References

Mike Stonebraker, Daniel Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O'Neil, Pat O'Neil, Alex Rasin, Nga Tran, and Stan Zdonik. C-Store: A Column-Oriented DBMS. VLDB, 2005.

Daniel J. Abadi, Daniel S. Myers, David J. DeWitt, and Samuel R. Madden. Materialization Strategies in a Column-Oriented DBMS. Proceedings of ICDE, April 2007, Istanbul, Turkey.
