Upload
others
View
13
Download
0
Embed Size (px)
Citation preview
Adaptive Indexingin Modern Databases
Stratos Idreos, Stefan Manegold and Goetz Graefe
CWI, Amsterdam HP Labs, Palo Alto
* *
* +
+
Tutorial PlanPart IIntroduction and db cracking basics (Stratos)Adaptive merging (Goetz)Benchmarking and hybrids (Stefan)
Part IIStochastic cracking and sideways cracking (Stratos)Concurrency control (Goetz)Partial cracking and updates (Stefan)
Introduction and db cracking basics
CIDR2007, Database CrackingStratos Idreos, Martin Kersten and Stefan Manegold
Physical design problem
which indexes to build?on which data parts?and when to build them?
Database systems perform efficiently only after proper tuning...
DBA without adaptive indexing
Physical Design
Sample Workload
Timeline
Physical Design
Sample Workload
Analyze Performance
Timeline
Physical Design
Sample Workload
Analyze Performance
Prepare Estimated physical design
Timeline
Physical Design
Sample Workload
Analyze Performance
Prepare Estimated physical design
Timeline
Queries
Physical Design
Sample Workload
Analyze Performance
Prepare Estimated physical design
Timeline
Queries
Complex and time consuming process
Physical Design
Sample Workload
Analyze Performance
Prepare Estimated physical design
Timeline
Queries
Complex and time consuming process
? Dynamic Workloads
Very Large Databases?
Dynamic environments
idle time workload knowledge
Dynamic environments
idle time workload knowledge
some problem cases
Dynamic environments
idle time workload knowledge
• Not enough idle time to finish proper tuningsome problem cases
Dynamic environments
• By the time we finish tuning, the workload changes
idle time workload knowledge
• Not enough idle time to finish proper tuningsome problem cases
Dynamic environments
• By the time we finish tuning, the workload changes• No index support during tuning
idle time workload knowledge
• Not enough idle time to finish proper tuningsome problem cases
Dynamic environments
• By the time we finish tuning, the workload changes• No index support during tuning
• Not all data parts are equally useful
idle time workload knowledge
• Not enough idle time to finish proper tuningsome problem cases
Adaptive Indexing
Remove all tuning, physical design steps but still get similar performance as a fully tuned system
How? Design new auto-tuning kernels
(operators, plans, structures, etc.)
For dynamic environments:
DBA with adaptive indexing
Adaptive Indexingno monitoringno preparation
no human involvement
no external tools
no full indexes
Adaptive Indexingno monitoringno preparation
no human involvement
no external tools
Continuous on-the-fly physical reorganization
no full indexes
Adaptive Indexingno monitoringno preparation
no human involvement
no external tools
Continuous on-the-fly physical reorganization
no full indexes
partial, incremental, adaptive indexing
Cracking ExampleDatabase Cracking CIDR 2007
Each query is treated as an advice on how data should be stored
Cracking ExampleDatabase Cracking CIDR 2007
Each query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Cracking ExampleDatabase Cracking CIDR 2007
Each query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Resu
lt tu
ples
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42Gain knowledge
on how data is organized
Physically reorganize based on the selection predicate
Resu
lt tu
ples
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42Gain knowledge
on how data is organized
Dynamically/on-the-fly within the select-operator
Physically reorganize based on the selection predicate
Resu
lt tu
ples
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
Physically reorganize based on the selection predicate
Resu
lt tu
ples
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking ExampleEach query is treated as an advice on how data should be stored
from Rwhere R.A > 10
and R.A < 14
select *
Q2:select *from Rwhere R.A > 7
and R.A <= 16
Q1:136798131211141619 Piece 5: 16 < A
Piece 3: 10 < A < 14
Piece 1: A <= 7
Piece 2: 7 < A <= 10
Piece 4: 14 <= A <= 16
Cracker column of A Cracker column of A
10 < A < 14
14 <= A
A <= 10Piece 1:
Piece 3:
Piece 2:
(in!place)(copy)
Q1 Q2
Column A
13164921271193141186
49271386131211161914
42
The more we crack, the more
we learn
Physically reorganize based on the selection predicate
Resu
lt tu
ples
Dynamically/on-the-fly within the select-operator
Database Cracking CIDR 2007
Cracking a column
5 20 40 60 90 120
cracker array A
existing cracks
All values after this position are bigger than 5
new select: 15<A<105
5 20 40 60 90 12015 105
after cracking
Database Cracking CIDR 2007
Cracking a column
5 20 40 60 90 120
cracker array A
existing cracks
All values after this position are bigger than 5
new select: 15<A<105
5 20 40 60 90 12015 105
after cracking
Database Cracking CIDR 2007
Cracking a column
5 20 40 60 90 120
cracker array A
existing cracks
All values after this position are bigger than 5
new select: 15<A<105
5 20 40 60 90 12015 105
after cracking
Database Cracking CIDR 2007
Cracking a column
5 20 40 60 90 120
cracker array A
existing cracks
All values after this position are bigger than 5
new select: 15<A<105
5 20 40 60 90 12015 105
after cracking
Database Cracking CIDR 2007
Cracking a column
5 20 40 60 90 120
cracker array A
existing cracks
All values after this position are bigger than 5
new select: 15<A<105
5 20 40 60 90 12015 105
after cracking
Database Cracking CIDR 2007
Cracking a column
5 20 40 60 90 120
cracker array A
existing cracks
All values after this position are bigger than 5
new select: 15<A<105
5 20 40 60 90 12015 105
after cracking
Database Cracking CIDR 2007
Cracking a column
5 20 40 60 90 120
cracker array A
existing cracks
All values after this position are bigger than 5
new select: 15<A<105
5 20 40 60 90 12015 105
after cracking
Database Cracking CIDR 2007
Cracking a column
5 20 40 60 90 120
cracker array A
existing cracks
All values after this position are bigger than 5
new select: 15<A<105
5 20 40 60 90 12015 105
after cracking
Database Cracking CIDR 2007
EngineeringDatabase Cracking CIDR 2007
Challenge: Efficiently refine index
Goal: Lightweight adaptation
index is refined as part of query processing
the select operator answers the query and refines the index in one go
EngineeringDatabase Cracking CIDR 2007
Exploit column-store bulk processing
EngineeringDatabase Cracking CIDR 2007
Exploit column-store bulk processing
select(A,v1-v2)A
EngineeringDatabase Cracking CIDR 2007
Exploit column-store bulk processing
select(A,v1-v2)A
scan and materialize result
EngineeringDatabase Cracking CIDR 2007
Exploit column-store bulk processing
select(A,v1-v2)A
scan and materialize result
more operators
EngineeringDatabase Cracking CIDR 2007
Exploit column-store bulk processing
select(A,v1-v2)A
scan and materialize result
more operators
crackselect(A,v1-v2)
A
EngineeringDatabase Cracking CIDR 2007
Exploit column-store bulk processing
select(A,v1-v2)A
scan and materialize result
more operators
crackselect(A,v1-v2)
A no need to materialize
EngineeringDatabase Cracking CIDR 2007
Exploit column-store bulk processing
select(A,v1-v2)A
scan and materialize result
more operators
crackselect(A,v1-v2)
A no need to materialize
more operators
Cracking ExampleEach query is treated as an advice on how data should be stored
100K random selections random selectivityrandom value rangesin a 10 million integer column
Database Cracking CIDR 2007
set-upScan
Full Index
Crack
0.001
0.01
0.1
1
10
100
1000
1 10 100 1000 10000 100000
Resp
onse
tim
e (
secs
)
Query sequence (x1000)
Cracking ExampleEach query is treated as an advice on how data should be stored
100K random selections random selectivityrandom value rangesin a 10 million integer column
almost no initialization overhead
Database Cracking CIDR 2007
set-upScan
Full Index
Crack
0.001
0.01
0.1
1
10
100
1000
1 10 100 1000 10000 100000
Resp
onse
tim
e (
secs
)
Query sequence (x1000)
Cracking ExampleEach query is treated as an advice on how data should be stored
100K random selections random selectivityrandom value rangesin a 10 million integer column
almost no initialization overhead
continuous improvement
Database Cracking CIDR 2007
set-upScan
Full Index
Crack
0.001
0.01
0.1
1
10
100
1000
1 10 100 1000 10000 100000
Resp
onse
tim
e (
secs
)
Query sequence (x1000)
Cracking ExampleEach query is treated as an advice on how data should be stored
100K random selections random selectivityrandom value rangesin a 10 million integer column
almost no initialization overhead
continuous improvement
Database Cracking CIDR 2007
set-upScan
Full Index
Crack
0.001
0.01
0.1
1
10
100
1000
1 10 100 1000 10000 100000
Resp
onse
tim
e (
secs
)
Query sequence (x1000)
Cracking ExampleEach query is treated as an advice on how data should be stored
10K random selections selectivity 10%random value rangesin a 30 million integer column
Database Cracking CIDR 2007
set-up
0.004
200
0.001
0.01
0.1
1
10
100
1 10 100 1000 10000
Cum
ula
tive a
vera
ge r
esp
onse
tim
e (
secs
)
Query sequence
Scan
Full Index
Crack
Cracking ExampleEach query is treated as an advice on how data should be stored
10K random selections selectivity 10%random value rangesin a 30 million integer column
10K queries later, Full Index still has not
amortized the initialization costs
Database Cracking CIDR 2007
set-up
0.004
200
0.001
0.01
0.1
1
10
100
1 10 100 1000 10000
Cum
ula
tive a
vera
ge r
esp
onse
tim
e (
secs
)
Query sequence
Scan
Full Index
Crack
Skyserver experiments
By the time stochastic cracking answered 160.000 queriesfull indexing (sorting) is still half way creating the index
Stochastic Cracking, PVLDB 12
4T Skyserver instancereal workload queries on PhotoObjAllPhotoObjAll = 500 million tuples
Literature
LiteratureCIDR’05, first sketch
CIDR’07, initial architecture
SIGMOD’07, updates
SIGMOD’09, tuple reconstruction
PhDThesis’10
TPCTC’10, benchmarking
EDBT’10, adaptive merging
PVLDB’11, hybrids
PVLDB‘12a, robustness
PVLDB‘12b, concurrency control
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexingA B C D
Table 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexingA B C D
Table 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
A B C DTable 1
A B C DTable 2
Base Data
x
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
A B C DTable 1
A B C DTable 2
Base Data
xx
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
A B C DTable 1
A B C DTable 2
Base Data
x
xx
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Sort, cluster in caches
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Sort, cluster in caches
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Sort, cluster in caches
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Sort, cluster in caches
Crack joins
Tangram
A B C DTable 1
A B C DTable 2
As queries arrive...Partial materialization
Partial indexing
Continuous adaptation
Storage adaptation
No tuple reconstruction
Adaptive Alignment
A B C DTable 1
A B C DTable 2
Base Data
Sort, cluster in caches
Crack joins
Related Work Auto-tuning tools
On-line tuning
Offline analysis and physical design.
Online analysis. Combine scans with index build.
Surajit Chaudhuri, Vivek Narassaya, Nico Bruno, Guy Lohman, Daniel Zilio, Anastassia Ailamaki, Alkis Polyzotis, Sam Lightstone,
Kai-Uwe Sattler, Karl Schnaitter ... and many more
Related Work Auto-tuning tools
On-line tuning
Offline analysis and physical design.
Online analysis. Combine scans with index build.
Surajit Chaudhuri, Vivek Narassaya, Nico Bruno, Guy Lohman, Daniel Zilio, Anastassia Ailamaki, Alkis Polyzotis, Sam Lightstone,
Kai-Uwe Sattler, Karl Schnaitter ... and many more
Adaptive Indexing pushes partial, incremental, adaptive, online indexing inside
the kernel
Indexing Overviewworkload analysis
index buildingquery processing
offline indexing
Indexing Overviewworkload analysis
index buildingquery processing
offline indexing
online indexingworkload analysis
index buildingquery processing
Indexing Overviewworkload analysis
index buildingquery processing
offline indexing
online indexing
adaptive indexing
workload analysisindex building
query processing
adaptive indexing
Indexing Overviewworkload analysis
index buildingquery processing
offline indexing
online indexing
adaptive indexing
workload analysisindex building
query processing
adaptive indexing
wor
kloa
d kn
owle
dge
idle time
adaptive
online
offline
So what’s next...Is adaptive indexing the ultimate solution?Should we throw auto-tuning tools away? NO
So what’s next...Is adaptive indexing the ultimate solution?Should we throw auto-tuning tools away? NO
SIGMOD2012, PHDw, Holistic IndexingEleni Petraki and Stratos Idreos
So what’s next...Is adaptive indexing the ultimate solution?Should we throw auto-tuning tools away?
But AI may be the missing link
NO
adaptive indexing offline analysis online analysis
A system that continuously adapts and exploits both online and offline statistics as well as any a priori or online idle time
and its tuning/alert actions do not slow down queries
SIGMOD2012, PHDw, Holistic IndexingEleni Petraki and Stratos Idreos
Tutorial PlanPart IIntroduction and db cracking basics (Stratos)Adaptive merging (Goetz)Benchmarking and hybrids (Stefan)
Part IIStochastic cracking and sideways cracking (Stratos)Concurrency control (Goetz)Partial cracking and updates (Stefan)