auto-tuning database kernels 2 - Harvard...

Preview:

Citation preview

auto-tuning database kernels 2.0prof. Stratos Idreos

HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/

class 22

Stratos Idreos

a bit more about auto-tuning db kernels

and then a couple of open research topics

Stratos Idreos

storage indexing queryschema loadX X X

be able to query the data immediately & with good performance

raw data explore data and gain knowledge “immediately”

Stratos Idreos

database cracking

Stratos Idreos

database cracking

idle timeworkload knowledge

external tools

human control

Stratos Idreos

database crackingauto-tuning database kernels

incremental, adaptive, partial indexing

idle timeworkload knowledge

external tools

human control

Stratos Idreos

database crackingauto-tuning database kernels

incremental, adaptive, partial indexing

indexing

initialization querying

idle timeworkload knowledge

external tools

human control

Stratos Idreos

database crackingauto-tuning database kernels

incremental, adaptive, partial indexing

indexing

initialization querying

idle timeworkload knowledge

external tools

human control

Stratos Idreos

database crackingauto-tuning database kernels

incremental, adaptive, partial indexing

indexing

initialization querying

idle timeworkload knowledge

external tools

human control

every query is treated as an advice on how data should be stored

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

Database Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

Database Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

Database Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

Database Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

resu

lt

Database Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

resu

lt

gain knowledge on how data is organized

Database Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

dynamically/on-the-fly within the select-operator

resu

lt

gain knowledge on how data is organized

Database Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

4 2 1 3 6 7 9 8 13 12 11 14 16 19

piece1: A<=7

piece2: 7<A<=10Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

4 2 1 3 6 7 9 8 13 12 11 14 16 19

piece1: A<=7

piece2: 7<A<=10

piece3: 10<A<14Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

4 2 1 3 6 7 9 8 13 12 11 14 16 19

piece1: A<=7

piece2: 7<A<=10

piece3: 10<A<14

piece4: 14<=A<=16

piece5: A>16

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operatorDatabase Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

4 2 1 3 6 7 9 8 13 12 11 14 16 19

piece1: A<=7

piece2: 7<A<=10

piece3: 10<A<14

piece4: 14<=A<=16

piece5: A>16

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operator

resu

lt

Database Cracking CIDR 2007

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column AQ1: select R.A from R where R.A>10 and R.A<14

4 9 2 7 1 3 8 6 13 12 11 16 19 14

piece1: A<=10

piece2: 10<A<14

piece3: A>=14

4 2 1 3 6 7 9 8 13 12 11 14 16 19

piece1: A<=7

piece2: 7<A<=10

piece3: 10<A<14

piece4: 14<=A<=16

piece5: A>16

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operator

resu

lt

the more we crack, the more we learn

Database Cracking CIDR 2007

Stratos Idreos

100K random selections random selectivity random value ranges in a 10 million integer column

set-upScan

Full Index

Crack

0.001

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Resp

onse

tim

e (

secs

)

Query sequence (x1000)

continuous adaptation

Database Cracking CIDR 2007

Stratos Idreos

100K random selections random selectivity random value ranges in a 10 million integer column

almost no initialization overhead

set-upScan

Full Index

Crack

0.001

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Resp

onse

tim

e (

secs

)

Query sequence (x1000)

continuous adaptation

Database Cracking CIDR 2007

Stratos Idreos

100K random selections random selectivity random value ranges in a 10 million integer column

almost no initialization overhead

continuous improvement

set-upScan

Full Index

Crack

0.001

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Resp

onse

tim

e (

secs

)

Query sequence (x1000)

continuous adaptation

Database Cracking CIDR 2007

Stratos Idreos

100K random selections random selectivity random value ranges in a 10 million integer column

almost no initialization overhead

continuous improvement

set-upScan

Full Index

Crack

0.001

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Resp

onse

tim

e (

secs

)

Query sequence (x1000)

continuous adaptation

Database Cracking CIDR 2007

cracking databases

updates (SIGMOD07)

storage-restrictions (SIGMOD09)

benchmarking (TPCTC10)

concurrency control

(PVLDB12)

algorithms (PVLDB11)

basics (CIDR07)

multi-cores (SIGMOD15)

hadoop (Yale/Saarland)

b-trees (HP Labs)

>1 columns (SIGMOD09)

robustness (PVLDB12)

adaptive storage

(SIGMOD14)

time-series (SIGMOD14)

encryption (SIGMOD16)

Stratos Idreos

concurrency control

Concurrency Control, PVLDB 12

problem: read queries become write queries! (?)

goal: be able to crack for multiple queries in parallel

Stratos Idreos

traditional indexing adaptive indexing

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexingwrite queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

change index contents and structure

only index structure changes

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

change index contents and structure

only index structure changes

no need for traditional locks = too heavy

short term latches = fast and release quickly

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

change index contents and structure

only index structure changes

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

change index contents and structure

only index structure changes

v1 v2 v3

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

change index contents and structure

only index structure changes

v1 v2 v3

a b

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

change index contents and structure

only index structure changes

v1 v2 v3 v1 v2 v3

a b

a b

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

change index contents and structure

only index structure changes

v1 v2 v3 v1 v2 v3

a b

a b

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

change index contents and structure

only index structure changes

v1 v2 v3 v1 v2 v3

a b

a b

v1 v2 v3a

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

change index contents and structure

only index structure changes

v1 v2 v3 v1 v2 v3

a b

a b

v1 v2 v3a

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

impact stable storage stable storage optional

change index contents and structure

only index structure changes

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

impact stable storage stable storage optional

change index contents and structure

only index structure changes

v1 v2 v3 v1 v2 v3a b

write queries read queries

a b

Concurrency Control, PVLDB 12

Stratos Idreos

traditional indexing adaptive indexing

all or nothing incremental and optional

impact stable storage stable storage optional

change index contents and structure

only index structure changes

need to serialize can execute in any order

write queries read queries

Concurrency Control, PVLDB 12

Stratos Idreos

select min(C) from R where A<10 & B<20

diskA B C D

Concurrency Control, PVLDB 12

Stratos Idreos

select min(C) from R where A<10 & B<20

A<10disk memory

A B C D

Concurrency Control, PVLDB 12

Stratos Idreos

select min(C) from R where A<10 & B<20

A<10disk memory

A B C D IDs

Concurrency Control, PVLDB 12

Stratos Idreos

select min(C) from R where A<10 & B<20

A<10disk memory

A B C D IDs B

Concurrency Control, PVLDB 12

Stratos Idreos

select min(C) from R where A<10 & B<20

B<20A<10disk memory

A B C D IDs B

Concurrency Control, PVLDB 12

Stratos Idreos

select min(C) from R where A<10 & B<20

B<20A<10disk memory

A B C D IDs B CIDs

Concurrency Control, PVLDB 12

Stratos Idreos

select min(C) from R where A<10 & B<20

B<20 minCA<10disk memory

A B C D IDs B CIDs

Concurrency Control, PVLDB 12

Stratos Idreos

select min(C) from R where A<10 & B<20

B<20 minCA<10disk memory

A B C D IDs B CIDs

column lock and release as soon as an operator completes

Concurrency Control, PVLDB 12

Stratos Idreos

need to latch only to be cracked pieces (max 2 per select)

Concurrency Control, PVLDB 12

select [a,b]

Stratos Idreos

piece locking

avl-tree crack column

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack columncrack select

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack columncrack selectwlock

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack columncrack select

wlock

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack column

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack columnmax

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack columnmaxrlock

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack columnmax

rlock

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack columnmax

rlock

Concurrency Control, PVLDB 12

Stratos Idreos

piece locking

avl-tree crack columnmax

rlock

Concurrency Control, PVLDB 12

Stratos Idreos

avl-tree crack column1090

140

200

300

piece locking

Stratos Idreos

avl-tree crack column crack select1090

140

200

300

65

230

piece locking

Stratos Idreos

avl-tree crack column crack selectwlock

wlock

1090

140

200

300

65

230

piece locking

Stratos Idreos

avl-tree crack column crack selectwlock

wlock

r/wlock

1090

140

200

300

65

230

piece locking

Stratos Idreos

avl-tree crack column crack selectwlock

wlock

q1,q2,q3...qn1090

140

200

300

65

230

10

90

piece locking

Stratos Idreos

avl-tree crack column crack selectwlock

wlock

q1,q2,q3...qn1090

140

200

300

65

230

10

90

2030

7080

piece locking

Stratos Idreos

avl-tree crack column crack selectwlock

wlock

q1,q2,q3...qn1090

140

200

300

65

230

10

90

2030

7080

wlock

piece locking

Stratos Idreos

0

10

20

30

40

50

60

70

80

1 2 4 8 16 32

Tota

l Tim

e for

1024 Q

ueries

(secs

)

Number of Concurrent Clients

scansort

crack

300

0

50

100

150

200

250

350

1 2 4 8 16 32

Thro

ughput ove

r 1024 Q

ueries

(Queries

/ se

c)

Number of Concurrent Clients

scansort

crack

4.5

4.55

4.6

4.65

4.7

4.75

4.8

Tota

l Tim

e for

1024 Q

ueries

(secs

)

(Sequential Execution)

Concurrency Control

Enabled Disabled

Rows: 100M Query: select sum(A) from R where v1 < A1 < v2 Selectivity: 0.01%, Random, # of queries: 1024 Clients: 1-32, Machine: 4 cores

adaptive indexing maintains its performance advantage

Concurrency Control, PVLDB 12

Stratos Idreos

crack time and wait time

10-6

10-5

10-4

10-3

10-2

10-1

100

1 10 100 1000

Tim

e p

er

query

(se

cs)

Query sequence

index refinementwait time

Query: select sum(A) from R where v1 < A1 < v2 Selectivity: 0.01%, Random, # of queries: 1024 Clients: 8, Machine: 4 cores

adaptive indexing maintains

the adaptive behavior

Concurrency Control, PVLDB 12

Stratos Idreos

stochastic cracking robustness

Stochastic Cracking, PVLDB 12

Stratos Idreos

adaptive indexing

v4v5 v3 v2 v1

Stratos Idreos

Response time: X

adaptive indexing

v4v5 v3 v2 v1

Stratos Idreos

Response time: X

adaptive indexing

v4v5 v3v2 v1

Stratos Idreos

Response time: X

adaptive indexing

Response time: 1000X

v4v5 v3v2 v1

Stratos Idreos

v4v5 v3 v2 v1

Stochastic Cracking, PVLDB 12

Stratos Idreos

v4v5 v3 v2 v1 v1 v2 v3 v4 v5

Stochastic Cracking, PVLDB 12

Stratos Idreos

v4v5 v3 v2 v1

v1 v2 v3 v4 v5

Stochastic Cracking, PVLDB 12

Stratos Idreos

v4v5 v3v2 v1

v1 v2 v3 v4 v5

Stochastic Cracking, PVLDB 12

Stratos Idreos

v4v5 v3v2 v1

v1 v2 v3 v4 v5

v1 v2 v3 v4 v5

Stochastic Cracking, PVLDB 12

Stratos Idreos Stochastic Cracking, PVLDB 12

Stratos Idreos

select [15,55]

Stochastic Cracking, PVLDB 12

Stratos Idreos

select [15,55]

Stochastic Cracking, PVLDB 12

Stratos Idreos

10 20 30 40 50 60

select [15,55]

Stochastic Cracking, PVLDB 12

Stratos Idreos

10 20 30 40 50 60

select [15,55]

select [15,55]

Stochastic Cracking, PVLDB 12

Stratos Idreos

10 20 30 40 50 60

select [15,55]

select [15,55]

Stochastic Cracking, PVLDB 12

Stratos Idreos

10 20 30 40 50 60

select [15,55]

select [15,55]

Stochastic Cracking, PVLDB 12

Stratos Idreos

10 20 30 40 50 60

select [15,55]

select [15,55]

the amount of work for each query depends on the index state

the state of the index depends on past queries patterns

Stochastic Cracking, PVLDB 12

Stratos Idreos

good sequence

column with 100 unique integers [1,100]

Stochastic Cracking, PVLDB 12

Stratos Idreos

q1

good sequence

column with 100 unique integers [1,100]<50

Stochastic Cracking, PVLDB 12

Stratos Idreos

Nq1

good sequence

column with 100 unique integers [1,100]<50

Stochastic Cracking, PVLDB 12

Stratos Idreos

Nq1

good sequence

column with 100 unique integers [1,100]<50

Stochastic Cracking, PVLDB 12

Stratos Idreos

Nq1q2

good sequence

column with 100 unique integers [1,100]<50<25

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2

q1q2

good sequence

column with 100 unique integers [1,100]<50<25

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2

q1q2

good sequence

column with 100 unique integers [1,100]<50<25

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2

q1q2 q3

good sequence

column with 100 unique integers [1,100]<50<25 <75

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

good sequence

column with 100 unique integers [1,100]<50<25 <75

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

good sequence

column with 100 unique integers [1,100]<50<25 <75

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

good sequence

column with 100 unique integers [1,100]<50<25 <75

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1N

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1N

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1 q2N

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2 <3

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1 q2NN-1

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2 <3

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1 q2NN-1

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2 <3

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1 q2 q3NN-1

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2 <3 <4

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1 q2 q3NN-1N-2

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2 <3 <4

Stochastic Cracking, PVLDB 12

Stratos Idreos

NN/2N/2

q1q2 q3

q1 q2 q3NN-1N-2

good sequence

bad sequence

column with 100 unique integers [1,100]<50<25 <75

<2 <3 <4

blindly adapting to queries is not always a good idea

Stochastic Cracking, PVLDB 12

Stratos Idreos Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

progressive cracking

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

q1: <v1progressive cracking

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

q1: <v1 crack + filter <v1progressive cracking

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

random

swap

q1: <v1 crack + filter <v1progressive cracking

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

random

swap

q1: <v1 crack + filter <v1

scan + filter <v1

progressive cracking

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

randomq1: <v1progressive cracking

q2: <v2

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

randomq1: <v1progressive cracking

scan + filter <v2q2: <v2

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

randomq1: <v1progressive cracking

scan + filter <v2

crack + filter <v2

q2: <v2

swap

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

to be cracked

q random

randomq1: <v1progressive cracking

scan + filter <v2

crack + filter <v2

q2: <v2

swap

Stochastic Cracking, PVLDB 12

query driven

Stratos Idreos

cracking on Skyserver (4TB) (Sloan Digital Sky Survey, www.sdss.org)

cracking answers 160.000 queries while full indexing is still half way creating one index

Stochastic Cracking, PVLDB 12

Stratos Idreos

multi-core utilization

Multi-cores, SIGMOD 15

Stratos Idreos Multi-cores, SIGMOD 15

select [a,b]

core 1 core 2

Stratos Idreos

select [a,b]

Multi-cores, SIGMOD 15

Stratos Idreos

select [a,b]

Multi-cores, SIGMOD 15

Stratos Idreos

select [a,b]

<a >=a

Multi-cores, SIGMOD 15

Stratos Idreos

select [a,b]

<a >=a

core 1 core 2 core 3 … core N

Multi-cores, SIGMOD 15

Stratos Idreos Multi-cores, SIGMOD 15

Stratos Idreos

Multi-cores, SIGMOD 15

Stratos Idreos

core 1 core 2 core 3

core N…

Multi-cores, SIGMOD 15

Stratos Idreos

core 1 core 2 core 3

core N…

Multi-cores, SIGMOD 15

Stratos Idreos

<a

>=a

core 1 core 2 core 3

core N…

Multi-cores, SIGMOD 15

Stratos Idreos

<a

>=a

core 1 core 2 core 3

core N…

Multi-cores, SIGMOD 15

Stratos Idreos

<a

>=a

core 1 core 2 core 3

core N…

<a >=a

merge

Multi-cores, SIGMOD 15

Stratos Idreos

problem: cores may be under utilized

goal: either fully utilize a core or shut it down

Holistic, SIGMOD 15

Stratos Idreos

scheduler

monitoring CPU utilization

worker 1worker 2

worker 3worker 4

worker 5

worker 6worker 7

worker 8

thread pool

when there is an underutilized CPU, pin a thread to it and to do a cracking task

Stratos Idreos

index space

crack

partitions size - access frequency - hit ratio

random works best

Stratos Idreos

108 tuples - 10 attributes, random queries

1

10

100

1000

10000

1 10 100 1000

Cu

mu

lativ

e R

esp

on

se T

ime

(se

c)

Query sequence

no indexing

offline indexing

online indexing

adaptive indexing

holistic indexing

(a) Performance

0

50

100

150

200

AdaptiveIndexing

HolisticIndexing

To

tal R

esp

on

se T

ime

(se

c)

1

9

90

900

(b) Performance Breakdown

0

500

1000

1500

2000

2500

3000

3500

0 2 4 6 8 10

Cu

mu

lativ

e #

ind

ex

pa

rtiti

on

s

Query sequence (x100)

adaptive indexing

holistic indexing

(c) Index Partitions

0

5

10

15

20

25

30

2 4 6 8 10 12 14 0

8

16

24

32

To

tal R

esp

on

se T

ime

(se

c) o

f #

wo

rke

rs

#W

ork

ers

Holistic Worker Activation

time

#workers

(d) Idle CPU UtilizationFigure 6: Improving performance with holistic indexing.

center slice is contiguous, while the remaining n� 1 slices con-sist of two disjoint halves, each, that are arranged concentricallyaround the center slice (xi and yi indicate the first and the last el-ement of piece i respectively). n threads crack the n slices inde-pendently applying a vectorized, out-of-place cracking algorithm(Figure 5), which was proven in [44] to be the most CPU efficientsingle-threaded cracking implementation reported so far. Finally,the local data are merged into a big cracked piece (Figure 4(b)).We found that devoting all resources to perform adaptive indexingfor user queries in parallel does not lead to the absolute best per-formance. Specifically, we found that we can improve performanceeven more by assigning part of the resources to holistic indexing. Inthis way, some of the CPU resources are assigned to parallel crack-ing for user queries but the rest of the CPU resources are distributedacross several holistic workers for additional index refinements. Inthe experimental section we show why this approach is better thanassigning all available resources to user queries.

5. EXPERIMENTAL ANALYSISIn this section, we demonstrate that holistic indexing leads to a

self-organizing always-on DBMS with substantial benefits in termsof response time; with zero administration or set-up effort holisticindexing improves performance adaptively by exploiting all avail-able CPU resources to the maximum. We present a detailed exper-imental analysis using both standard benchmarks such as TPC-Hand real-life workloads such as SkyServer as well as synthetic mi-crobenchmarks for a fine-grained analysis.

We use a dual-socket machine equipped with two 2.00 GHz In-tel(R) Xeon(R) CPU E5-2650 processors and with 256 GB RAM.Each processor has 8 hyper-threading cores resulting in 32 hard-ware threads in total. The operating system is Fedora 20 (kernelversion 3.12.10). All experiments we report are based on an im-plementation of holistic indexing in MonetDB and assume a main-memory environment.

5.1 Improving over State-of-the-Art IndexingIn our first experiment we demonstrate that holistic indexing

has the potential to bring substantial performance improvementsover existing state-of-the-art indexing approaches. We test holisticindexing against parallel versions of adaptive indexing (databasecracking), offline indexing, online indexing and plain scans.

For plain scans (no indexing), we use a parallel select opera-tor implemented in MonetDB. For offline and online indexing wesort the columns using a highly parallel NUMA-aware sorting algo-rithm that was introduced in [9] (m-way, 16-byte keys) and is pub-licly available in [1]. Specifically, in offline indexing we pre-sortall the columns before query processing, while in online indexing

we assume that after processing a few queries we understand theworkload patterns and then we sort the relevant columns. MonetDBautomatically detects that a column is sorted and can use efficientbinary search actions during select operations. For adaptive index-ing we use the parallel vectorized database cracking algorithm thatwas introduced in [44] (see Section 4.2).

Here we use a synthetic benchmark. The query workload con-sists of 103 range select queries over a table of 10 attributes; eachquery touches a single attribute (we will see more complex querieslater on). Each attribute consists of 230 uniformly distributed inte-gers, while the value range requested by each query (and thus theselectivity) is random. All queries are of the following form.

select A from R where A < v

The tested scenario assumes a dynamic and ad-hoc environmentwith zero workload knowledge and zero idle time to pre-index thedata. Figure 6(a) shows the results. On the x-axis queries are rankedin execution order. The y-axis represents the cumulative responsetime as the query sequence evolves, i.e., each point (x,y) representsthe sum of the execution time y for the first x queries. In this way,the graph shows how the response times evolve as we process moreand more queries.

If there is no indexing support (plain scans), the entire column isscanned in parallel by 32 threads for every query. Because of thisstable access pattern, the cumulative response time of the query se-quence grows linearly as every query has similar cost. With offlineindexing, on the other hand, it takes 12 seconds to completely sorteach column, assuming a priori workload knowledge. This leads toa 120 seconds initialization overhead to sort all 10 columns. Sincethere is no idle time before the first query, the sorting cost is addedto the execution time of the first query in Figure 6(a). After thefirst query, all queries are answered with fast binary search actionswhich results in a rather flat cumulative curve. With online index-ing, the first 100 queries are answered without any index supportand thus the cost grows linearly. After the 100th query, assumingenough workload knowledge has been obtained via monitoring, weproceed to sort all 10 columns. The sorting cost is added to theexecution cost of the 101st query, since there is no idle time be-tween the 100th and the 101st query. As with offline indexing, allsubsequent queries are answered very fast with binary search overthe sorted column. Thus, both in offline and in online indexing,the whole query sequence is affected by the sorting costs. On theother hand, adaptive indexing continuously improves performancerequiring no workload knowledge and without penalizing individ-ual queries. This improvement is due to the fact that adaptive in-dexing builds only partial indices which it incrementally refines asqueries arrive. However, there is still room for big improvements.

Stratos Idreos

NORMALIZED DATA DENORMALIZED DATA

Stratos Idreos

NORMALIZED DATA DENORMALIZED DATA

good for updates, storage but we need joins

Stratos Idreos

NORMALIZED DATA DENORMALIZED DATA

good for updates, storage but we need joins

only fast scans but expensive to create,

storage & updates

Stratos Idreos

adaptive denormalization

normalized data

first award in ACM SIGMOD undergrad

research competition

Stratos Idreos

adaptive denormalization

normalized data possible denormalized space

first award in ACM SIGMOD undergrad

research competition

Stratos Idreos

adaptive denormalization

continuously physically reorganize data based on incoming query patterns (joins)

normalized data possible denormalized space

first award in ACM SIGMOD undergrad

research competition

Stratos Idreos

adaptive denormalization

continuously physically reorganize data based on incoming query patterns (joins)

normalized data

denormalized fragments queries only need to fast scan

possible denormalized space

first award in ACM SIGMOD undergrad

research competition

Stratos Idreos

adaptive denormalization

continuously physically reorganize data based on incoming query patterns (joins)

normalized data

denormalized fragments queries only need to fast scan

possible denormalized space

first award in ACM SIGMOD undergrad

research competition

Stratos Idreos

adaptive denormalization

continuously physically reorganize data based on incoming query patterns (joins)

normalized data

denormalized fragments queries only need to fast scan

possible denormalized space

first award in ACM SIGMOD undergrad

research competition

Stratos Idreos

adaptive denormalization

continuously physically reorganize data based on incoming query patterns (joins)

normalized data

denormalized fragments queries only need to fast scan

possible denormalized space

first award in ACM SIGMOD undergrad

research competition

Stratos Idreos

years

daily

dat

a

[IBMbigdata]years

daily

dat

a

[StratosGuess]

data

* ski

llsdata system design,

set-up, tune, use

Stratos Idreos

data systems that are easy to design(storage, data flow, algorithms, tuning, etc)

Stratos Idreos

SystemDesign

FullImplemention

PrototypeImplementation

workloads, h/wexpected properties

(performance, throughput,

energy, budget, …)

1

2

3

application specific workload, h/w & expected properties

Set-up &

Tune

4 Auto-tuningduring query processing

5

Design & DevelopmentRoles: Architects & Developers

Set-up and TuningRoles: Database Administrators

Stratos Idreos

say we need a system for workload X (data/access patterns): should we strip down a relational system?

should we build up a key-value store or main-memory system? should we build something from scratch?

Stratos Idreos

say we need a system for workload X (data/access patterns): should we strip down a relational system?

should we build up a key-value store or main-memory system? should we build something from scratch?

say the workload (read/write ratio) shifts (e.g., due to app features):should we use a different data layout for base data - diff updates?

should we use different indexing or no indexing?

Stratos Idreos

say we buy new hardware X (flash/memory): should we change the size of b-tree nodes?

should we change the merging strategy in our LSM-tree?

say we need a system for workload X (data/access patterns): should we strip down a relational system?

should we build up a key-value store or main-memory system? should we build something from scratch?

say the workload (read/write ratio) shifts (e.g., due to app features):should we use a different data layout for base data - diff updates?

should we use different indexing or no indexing?

Stratos Idreos

say we buy new hardware X (flash/memory): should we change the size of b-tree nodes?

should we change the merging strategy in our LSM-tree?

say we want to improve response time:would it be beneficial if we would buy faster flash disks?

would it be beneficial if we buy more memory? or just spend the same budget on improving software?

say we need a system for workload X (data/access patterns): should we strip down a relational system?

should we build up a key-value store or main-memory system? should we build something from scratch?

say the workload (read/write ratio) shifts (e.g., due to app features):should we use a different data layout for base data - diff updates?

should we use different indexing or no indexing?

Stratos Idreos

# of design options?

Stratos Idreos

# of design options?

> 1 QUINTILLION(and counting…)

Stratos Idreos

data systems design (and research) is kind of an art(both in the good & in the bad sense)

Stratos Idreos

data systems design (and research) is kind of an art

but can we keep up?

(both in the good & in the bad sense)

Stratos Idreos

disk memory flash …

1 easily utilize past concepts/designs

Stratos Idreos

do not miss out on cool ideas and concepts

# of

cita

tions

0

7

14

21

28

35

1996 1999 2002 2005 2008 2011 2014

P. O’Neil, E. Cheng, D. Gawlick, E, O'Neil The log-structured merge-tree (LSM-tree) Acta Informatica 33 (4): 351–385, 1996

2

Stratos Idreos

do not miss out on cool ideas and concepts

# of

cita

tions

0

7

14

21

28

35

1996 1999 2002 2005 2008 2011 2014

P. O’Neil, E. Cheng, D. Gawlick, E, O'Neil The log-structured merge-tree (LSM-tree) Acta Informatica 33 (4): 351–385, 1996

Google publishes BigTable

2

Stratos Idreos

data+queries+hardware

data system

self-designing data systems

can we design new systems in weeks instead of years?

Stratos Idreos

data+queries+hardware

data system

self-designing data systems

easy to design

adapt to environment

can we design new systems in weeks instead of years?

Stratos Idreos

INTERACTIVE DATA SYSTEM DESIGN/TUNING/TESTING

datasystem

designer

Stratos Idreos

How to model/abstract design decisions?

How much can happen automatically (auto-design)?

Stratos Idreos

How to model/abstract design decisions?

How much can happen automatically (auto-design)?

move from design based on intuition & experience only to a more formal and systematic way to design systems

Stratos Idreos

row-store

column-store

key-value store

hybrid store

ACM SIGMOD Blog, June’15

1. write/extend modules in a high level language

(optimizations)2. modules=

storage/execution/data flow

3. try out >>1 designs (sets of modules)

Stratos Idreos

data systems that are easy to use

dbTouch

show me something interesting

Queriosity

DATA

cidr2013/icde2014

bigdata2015

can we design systems where no SQL is needed?

cracking example

dbTouch CIDR 2013

Stratos Idreos

data systems todayallow us to answer queries fast

data systems tomorrow should allow us to find fast which queries to ask

db

db explore

Stratos Idreos

a semester of quizzes and brainstorming

option between systems project (key-value store) & research with DASlab

(only for CS165 students or otherwise advanced students)

the end

\*project deadline Dec 21 OH and labs continue until then*/

DASlab productions 2016

learn the concepts, not just the techniques learn to adapt

it all starts with how we store the data yes you can do research

Recommended