Upload
olivia-underwood
View
227
Download
1
Embed Size (px)
Citation preview
M.Kersten 2008 1
MonetDB, a Column-Store in Midflight
Martin KerstenCWI
Amsterdam
M.Kersten 2008 3
Where is the field heading to
Mike Stonebraker, VLDB 2007:One size fits all: A concept whose time has come and gone
Martin Kersten, ICDE 2008:Mike is wrong…..Every size fits him always…Even a Euro solution.
M.Kersten 2008 4
If you want to play with a generic column-store, we recommend you download the academic version of Vertica, the commercialization of the C-Store project, or download MonetDB, an open-source column-store.
M.Kersten 2008 5
Paste
Present
PotencyCracking
Columns Chaos
M.Kersten 2008 6
Paste
Present
PotencyPAX stores
N-ary stores Column
stores
M.Kersten 2008 7
John 32 HoustonOK
Early 80s: tuple storage structures for PCs were simple
Mary 31 HoustonOK
Easy to access at the cost of wasted space
Try to keep things simple
M.Kersten 2008 8
Slotted pages Logical pages equated physical pages
32 John Houston
31 Mary Houston
Try to keep things simple
M.Kersten 2008 9
Slotted pages Logical pages equated multiple physical pages
32 John Houston
31 Mary Houston
Try to keep things simple
M.Kersten 2008 10
Not all attributes are equally important
Avoid things you don’t always need
M.Kersten 2008 11
A column orientation is as simple and acts like an array
Attributes of a tuple are correlated by offset
Avoid moving too much around
M.Kersten 2008 12
• MonetDB Binary Association TablesID Day Discount
10 4/4/98 0.19511 9/4/98 0.06512 1/2/98 0.17513 7/2/98 0
OID ID100 10101 11102 12103 13104 14
OID Day100 4/4/98101 9/4/98102 1/2/98103 7/2/98104 1/2/99
OID Discount100 0.195101 0.065102 0.175103 0104 0.065
Try to keep things simple
M.Kersten 2008 13
Physical data organization• Binary Association Tables
head tail
100 10101 11102 12103 13104 14
Bat Unitfixed size
Densesequence
Memory mappedfiles
Try to avoid doing things twice
M.Kersten 2008 14
• Binary Association Tables acceleratorsOID ID
100 10101 11102 12103 13104 14Hash-based
access
Try to avoid doing things twice
Column properties:key-nessnon-nulldenseordered
M.Kersten 2008 15
• Binary Association Tables storage controlOID ID
100 Amsterdam101 Seattle102 New York103 London104 Paris
ID
AmsterdamSeattleNew YorkLondon
OID ID
100
101
102
103
104
A BAT can be usedas an encoding table
A VID datatypecan be used torepresent denseenumerations
Type remappingsare used to squeezespace
100
Try to avoid doing things twice
M.Kersten 2008 16
• Column orientation benefits datawarehousing
• Brings a much tighter packaging and improves transport through the memory hierarchy
• Each column can be more easily optimized for storage using compression schemes
• Each column can be replicated for read-only access
Mantra: Try to keep things simple
M.Kersten 2008 17
Try to maximize performance
Paste
Present
PotencyMaterialize All Model
Vectorizedmodel
Volcanomodel
M.Kersten 2008 18
Volcano Refresher
Query
SELECT name, salary*.19 AS tax
FROMemployee
WHERE age > 25
Try to maximize performance
M.Kersten 2008 19
Volcano Refresher
Operators
Iterator interface-open()-next(): tuple-close()
Try to maximize performance
M.Kersten 2008 20
• The Volcano model is based on a simple pull-based iterator model for programming relational operators.
• The Volcano model minimizes the amount of intermediate store
• The Volcano model is CPU intensive and inefficient
Try to maximize performance
Volcano paradigm
M.Kersten 2008 21
MonetDB paradigm
• The MonetDB kernel is a programmable relational algebra machine
• Relational operators operate on ‘array’-like structures
• Based on experiences in database machines 25 years ago, RAP, CASSM,…ICL RAP… PRISMA… IDIOMS (J. Kerridge)…
Try to use simple a software pattern
M.Kersten 2008 22
SQL
MonetDB Server
MonetDB Kernel
XQuery
MAL
function user.s3_1():void; X1:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",0); X6:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",1); X9:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",2); X13:bat[:oid,:oid] := sql.bind_dbat("sys","photoobjall",1); X8 := algebra.kunion(X1,X6); X11 := algebra.kdifference(X8,X9); X12 := algebra.kunion(X11,X9); X14 := bat.reverse(X13); X15 := algebra.kdifference(X12,X14); X16 := calc.oid(0@0); X18 := algebra.markT(X15,X16); X19 := bat.reverse(X18); X20 := aggr.count(X19); sql.exportValue(1,"sys.","count_","int",32,0,6,X20,"");end s3_1;
select count(*) from photoobjall;
Try to use simple a software pattern
M.Kersten 2008 23
Operator implementation
• All algebraic operators materialize their result
• Local optimization decisions
• Heavy use of code expansion to reduce cost• 55 selection routines• 149 unary operations• 335 join/group operations• 134 multi-join operations• 72 aggregate operations
Try to use simple a software pattern
M.Kersten 2008 25
Micro-benchmark
MonetDB/SQL 0.34 N 44
MySQL 25.1 N 238
PostgreSQL 10.6 N 1230
Commercial 1 39.0 N 800
Commercial 2 17 N 150
In milliseconds/10KFixed cost in ms
• Keeping the query result in a new table is often too expensive
select * into tmp from tapestry where attr1>=0 and attr1 <=@range
create table tmp( attr0 int, attr1 int);insert into tmp select * from tapestry
where attr1>=0 and attr1 <=@range;
M.Kersten 2008 26
Multi-column tapestry
Experiments ran on Athlon 1.4, Linux
commercial
MonetDB/SQL
#joins
ms
M.Kersten 2008 27
• A column store should be designed from scratch to benefit from its characteristics
• Simulation of a column store on top of an n-ary system using the Volcano model does not work
M.Kersten 2008 28
Try to maximize performance
Paste
Present
Potency
Execution Paradigm
DatabaseStructures
Queryoptimizer
M.Kersten 2008 29
• Applications have different characteristics• Platforms have different characteristics• The actual state of computation is crucial
• A generic all-encompassing optimizer cost-
model does not work
Try to avoid the search space trap
M.Kersten 2008 30
SQL
MonetDB Server
MonetDB Kernel
XQuery
MAL
MAL
Operational optimizer:– Exploit everything you know at runtime– Re-organize if necessary
Try to disambiguate decisions
M.Kersten 2008 31
SQL
MonetDB Server
MonetDB Kernel
XQuery
MAL
MAL
Strategic optimizer:– Exploit the semantics of the language– Rely on heuristics
Operational optimizer:– Exploit everything you know at runtime– Re-organize if necessary
Try to disambiguate decisions
M.Kersten 2008 32
SQL
MonetDB Server
Tactical Optimizer
MonetDB Kernel
XQuery
MAL
MAL
y1:bat[:oid,:dbl]:= bpm.take("sys_photoobjall_ra");y2 := bpm.new(:oid,:oid);barrier rs:= bpm.newIterator(y1,A0,A1); t1:= algebra.uselect(rs,A0,A1); bpm.addSegment(y2,t1);redo rs:= bpm.hasMoreElements(y1,A0,A1);exit rs;
x1:bat[:oid,:dbl]:= sql.bind("sys","photoobjall","ra",0);x14:= algebra.uselect(x1,A0,A1);
Tactical MAL optimizer:– No changes in front-ends and no direct human guidance– Minimal changes in the engine
Try to disambiguate decisions
M.Kersten 2008 33
Code Inliner. Constant Expression Evaluator.
Accumulator Evaluations.Strength Reduction. Common Term Optimizer.
Join Path Optimizer. Ranges Propagation. Operator Cost Reduction. Foreign Key handling. Aggregate Groups.
Code Parallizer. Replication Manager. Result Recycler.
MAL Compiler. Dynamic Query Scheduler. Memo-based Execution. Vector Execution.
Alias Removal. Dead Code Removal. Garbage Collector.
Try to disambiguate decisions
M.Kersten 2008 34
Try to maximize performance
Paste
Present
Potency
Execution Paradigm
DatabaseStructures
Queryoptimizer
M.Kersten 2008 35
Execution paradigms
• The MonetDB kernel is set up to accommodate different execution engines
• The MonetDB assembler program is • Interpreted in the order presented• Interpreted in a dataflow driven manner• Compiled into a C program• Vectorised processing
• X100 project
No data from persistent store to the memory trash
M.Kersten 2008 36
MonetDB/x100
Combine Volcano model withvector processing.
All vectors together should fit the CPU cache
Vectors are compressed
Optimizer should tune this,given the query characteristics.
ColumnBM (buffer manager)
X100 query engine
CPUcache
networkedColumnBM-s
RAM
M.Kersten 2008 37
• Varying the vector size on TPC-H query 1
mysql, oracle,
db2
X100
MonetDB
low IPC, overhead
RAM bandwidth
bound
No data from persistent store to the memory trash
M.Kersten 2008 38
• Vectorized-Volcano processing can be used for both multi-core and distributed processing
• The architecture and the parameters are influenced heavily by• Hardware characteristics• Data distribution to compress columns
No data from persistent store to the memory trash
M.Kersten 2008 39
Does MonetDB stand a ‘real’ test?
Is the main memory orientation a bottleneck?
Is it functionally complete?
The proof of the pudding is in the eating
M.Kersten 2008 40
TPC-H
ATHLON X2 3800+ (2000mhz) 2 disks in raid 0, 2G main memory
TPC-H 60K rows line_item table
Comfortably fit in memoryPerformance in milliseconds
M.Kersten 2008 41
TPC-H
ATHLON X2 3800+ (2000mhz) 2 disks in raid 0, 2G main memory
Scale-factor 16M row line-item table
Out of the box performanceQueries produce empty
or erroneous results
M.Kersten 2008 42
TPC-H
ATHLON X2 3800+ (2000mhz) 2 disks in raid 0, 2G main memory
M.Kersten 2008 43
TPC-H
ATHLON X2 3800+ (2000mhz) 2 disks in raid 0, 2G main memory
M.Kersten 2008 44
• Code base for MonetDB/SQL is 1.2M lines of C• Nightly regression testing on 17 platforms
M.Kersten 2008 45
Try to maximize performance
Paste
Present
Potency
CrackingB-tree, HashIndices
MaterializedViews
M.Kersten 2008 46
• Indices in database systems focus on:
• All tuples are equally important for fast retrieval
• There are ample resources to maintain indices
• MonetDB cracks the database into pieces based on actual query load
Find a trusted fortune teller
M.Kersten 2008
Cracking algorithms
Physical reorganization happens per column based on selection predicates.
Split a piece of a column in two new pieces
A<10
A>=10
A<10
M.Kersten 2008
Cracking algorithms
Physical reorganization happens per column
Split a piece of a column in two new pieces
Split a piece of a column in three new pieces
A<10
A>=10
A<10
5<A<10
A>=10
5<A<10
A<5
M.Kersten 2008
Cracking example
3
8
6
2
12
13
4
17
15
select A>5 and A<10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12 >=10
>=10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
>=10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
>=10
<=5
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
>=10
<=5
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
>=10
<=5
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
>=10
<=5
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
>=10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
>=10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<=5
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<=5
<=5
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<=5
>5 and <10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6 2
15
13
4
17
12
<=5
>5 and <10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6 2
15
13
4
17
12
<=5
>5 and <10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<=5
>5 and <10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
>5 and <10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
13
4
17
12
<= 5
>= 10
> 5
15
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
M.Kersten 2008
racking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
>3 and <14
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
>3 and <14
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
>3 and <14
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
>3 and <14
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
> 3
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
> 3
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
> 3
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
1712
> 3
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
1712
> 3
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
> 3
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
> 3
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
> 3
>= 10
> 5
<=3
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
> 3
>= 14
> 5
<=3
>=10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
> 3
>= 14
> 5
<=3
>=10
M.Kersten 2008
Cracking example
3
8
6
2
15
13
4
17
12
select A>5 and A<10
3
8
6
2
15
13
4
17
12
<= 5
>= 10
> 5
Improve data access for
future queries
select A>3 and A<14
3
8
6
2
15
13
4
17
12
>3
>= 14
> 5
<=3
>=10
The more we crack the more
we learn
M.Kersten 2008
Design
The first time a range query is posed on an attribute A, a cracking DBMS makes a copy of column A, called the cracker column of A A cracker column is continuously physically reorganized based on queries that need to touch attribute such as the result is in a contiguous space
For each cracker column, there is a cracker index
Cracker Index
Cracker Column
M.Kersten 2008
A simple range queryTry to avoid useless investments
M.Kersten 2008
TPC-H query 6
Try to avoid useless investments
M.Kersten 2008 91
• Cracking is easy in a column store and is part of the critical execution path
• Cracking works under high volume updates
Try to avoid useless investments
M.Kersten 2008
Updates
Base columns are updated as normally
We need to update the cracker column and the cracker index
Efficiently
Maintain the self-organization properties
Two issues: When How
M.Kersten 2008
When to propagate updates in cracking
Follow the workload to maintain self-organization
Updates become part of query processing
When an update arrives, it is not applied
For each cracker column there is a pending insertions column and a pending deletions column
Pending updates are applied only when a query needs the specific values
M.Kersten 2008
Updates aware select
We extended the cracker select operator to apply the needed updates before cracking
The select operator:1. Search the pending insertions column2. Search the pending deletions column3. If Steps 1 or 2 find tuples run an update algorithm4. Search the cracker index5. Physically reorganize the cracker column6. Update the cracker index7. Return a slice of the cracker column
M.Kersten 2008
Merging
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >12
Start position: 1values: >1
Insert a new tuple with value 9
The new tuple belongs to the blue piece 9
M.Kersten 2008
Merging
7
2
10
29
25
31
57
42
53
Start position: 8values: >35
Start position: 5values: >12
Start position: 1values: >1
Insert a new tuple with value 9
The new tuple belongs to the blue piece
9
Pieces in the cracker column are ordered
Tuples inside a piece are not ordered
Shifting is not a viable solution
M.Kersten 2008
Merging by Hopping
7
2
10
29
25
31
42
53 Start position: 8values: >35
Start position: 4values: >12
Start position: 1values: >1
57
9Insert a new tuple with value 9
We need to make enough room to fit the new tuples
M.Kersten 2008
Merge Gradually
A query merges only the qualifying values, i.e., only the values that it needs for a correct and complete result
Average cost increases significantly
We avoid the large peaks but...
Merge CompletelyMerge Gradually
M.Kersten 2008
The Ripple
Touch only the pieces that are relevant for the current query
M.Kersten 2008
The Ripple
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1
Touch only the pieces that are relevant for the current query
M.Kersten 2008
The Ripple
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1
Select 7<= A< 15Touch only the pieces that are relevant for the current query
M.Kersten 2008
The Ripple
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1
Select 7<= A< 15
5
9
16
35
Pending insertions
Touch only the pieces that are relevant for the current query
M.Kersten 2008
The Ripple
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
9
16
35
Pending insertions
Touch only the pieces that are relevant for the current querySelect 7<= A< 15
M.Kersten 2008
The Ripple
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
9
16
35
Pending insertions
Touch only the pieces that are relevant for the current querySelect 7<= A< 15
M.Kersten 2008
The Ripple
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
9
16
35
Pending insertions
Touch only the pieces that are relevant for the current querySelect 7<= A< 15
M.Kersten 2008
The Ripple
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
9
16
35
Pending insertions
Touch only the pieces that are relevant for the current query
Avoid shifting down non interesting pieces
Select 7<= A< 15
M.Kersten 2008
The Ripple
7
2
10
29
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
9
16
35
Pending insertions
Touch only the pieces that are relevant for the current query
Avoid shifting down non interesting pieces
Select 7<= A< 15
M.Kersten 2008
The Ripple
7
2
1029
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
9
16
35
Pending insertions
Touch only the pieces that are relevant for the current query
Immediately make room for the new tuples
Avoid shifting down non interesting pieces
Select 7<= A< 15
M.Kersten 2008
The Ripple
7
2
1029
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
916
35
Pending insertions
Touch only the pieces that are relevant for the current query
Immediately make room for the new tuples
Avoid shifting down non interesting pieces
Select 7<= A< 15
M.Kersten 2008
The Ripple
7
2
1029
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
916
35
Pending insertions
Touch only the pieces that are relevant for the current query
Immediately make room for the new tuples
Avoid shifting down non interesting pieces
Select 7<= A< 15
M.Kersten 2008
The Ripple
7
2
10
25
31
57
42
53
Start position: 7values: >35
Start position: 4values: >22
Start position: 1values: >1 5
916
35
Pending insertions
29
Touch only the pieces that are relevant for the current query
Immediately make room for the new tuples
Avoid shifting down non interesting pieces
Select 7<= A< 15
M.Kersten 2008
The Ripple
7
2
10
25
31
57
42
53
Start position: 7values: >35
Start position: 5values: >22
Start position: 1values: >1 5
916
35
Pending insertions
29
Touch only the pieces that are relevant for the current query
Immediately make room for the new tuples
Avoid shifting down non interesting pieces
Select 7<= A< 15
M.Kersten 2008
The Ripple
7
2
10
25
31
57
42
53
Start position: 7values: >35
Start position: 5values: >22
Start position: 1values: >1 5
916
35
Pending insertions
29
Touch only the pieces that are relevant for the current query
Immediately make room for the new tuples
Avoid shifting down non interesting pieces
Select 7<= A< 15
M.Kersten 2008
The Ripple
Maintain high performance through the whole query sequence in a self-organizing way
M.Kersten 2008
The Ripple
Maintain high performance through the whole query sequence in a self-organizing way
Merge Gradually Merge Completely
Merge Ripple
M.Kersten 2008 116
• MonetDB Research & Development line• XQuery and Information Retrieval• Astronomy databases (SkyServer)• Event Stream Engine• Relational Cache Effectiveness• Multi-core Parallelism• DataStorage Rings• RDF Engine• ….
• Array Databases become en vogue
M.Kersten 2008 117
Summary
• MonetDB is a mature column store
• Cracking based on physical re-organization in the critical path can be made to work
• There is work for another decade
M.Kersten 2008 118
CstoreSQLserver
DB2
PostgreSQL
MySQL
Whoa MonetDB !Speed lines !
M.Kersten 2008 119
Martin Kersten Peter Boncz Niels Nes Stefan Manegold Fabian Groffen Sjoerd Mullender Steffen GoeldnerArjen de VriesMenzo WindhouwerTim RuhlRomulo Goncalves
Jan Rittinger Wouter Alink Jennie Zhang Stratos IdreosErietta LiarouLefteris SidirourgosFlorian Waas Albrecht Schmidt Jonas Karlsson Martin van Dinther Peter Bosch Carel van den Berg Wilco Quak
Acknowledgements
Alex van Ballegooij Johan List Georgina Ramirez Marcin Zukowski Roberto CornacchiaSandor HemanTorsten Grust Jens Teubner Maurice van Keulen Jan Flokstra Milena Ivanova
MonetDB/{SQL,XQuery} open-source platformMonetDB/SQL Version 5 experimentation platformMonetDB/X100 aimed at cpu & IO squeezingMonetDB/RAM aimed at IR and science DBMonetDB/Armada aimed at evolving databasesMonetDB/SkyServer portal for Astronomy