30
Query processing for parallel languages Brandon Myers, Mark Oskin, Bill Howe [email protected] DB Day 2015 1

Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Query processing for parallel languages

Brandon Myers, Mark Oskin, Bill Howe [email protected]

DB Day 2015

1

Page 2: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

2slide src: Jeff Gardner

Page 3: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

How to turn astrophysics simulation output into scientific knowledge

3slide src: Jeff Gardner

Page 4: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

How to turn astrophysics simulation output into scientific knowledge

4slide src: Jeff Gardner

Page 5: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

5

GASOLINE (a cosmology N-body code)required 10 person-years of development

but code is reused for many years by many people

Data analysis codeless reuse, ad hoc queries

slide src: Jeff Gardner

Page 6: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

6

globalList[int]histogram[N]block;forallpinParticles{atomic{ifnothistogram[h(p.z)]{histogram[h(p.z)]=newList[int];

}histogram[h(p.z)].find(p.z)+=1;

}}

valhistogram=selectcount(*),zfromParticlesgroupbyz

Page 7: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Our solution

7

memory

queries

HPC stack

Page 8: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

8

memory

queries

PGAS

Our solution

Page 9: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

9

Integrate queries into PGAS languages

Page 10: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

10

Integrate queries into PGAS languages

Page 11: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Partitioned global address space (PGAS) languages

Global data

DRAM DRAM DRAM DRAM

Core Core Core Core

Global Tasks

Page 12: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Example PGAS program

globalParticleP[N]block;globalinthistogram[N]block;forallpjinP{

atomichistogram[pj.z]+=1;

}

12

[pj.z]

histogram

pjParticle

P0 P1 P2

Page 13: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

13

Integrate queries into PGAS languages

Page 14: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

14

Integrate queries into PGAS languages

Page 15: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Radish: queries in PGAS

Relational Algebra++

SQL

Grappa Chapel X10

PGAS physical algebra

mapping?

15

PGAS languages

Page 16: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Approaches for query processing in PGAS

demand-driven data flow (iterators)

next!next!next!

16

HashTableJoinProbe[o_custkey=c_custkey]

HashTableAggregateScan

Project[cnt, c_name]

Scan(Order)Scan(Customer)

Select[o_totalprice > 20]

HashTableJoinBuild[c_custkey]

HashTableAggregateUpdate

[o_custkey]

Shuffle[o_custkey]Shuffle[c_custkey]

Project[c_custkey, c_name]

Scan(Customer)

HashTableJoinBuild[c_custkey]

Shuffle[c_custkey]

Project[c_custkey, c_name]

query compilation

codegen

optimize/ compile

binary

forallcinCustomerT1t1t1.id=t0.custkeyt1.c_name=t0.c_nameonpartition(join_table[h(t1.c_custkey)])atomic{join_table[h(t1.c_custkey)].append(t1)}

Page 17: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

17

forallcinCustomerT1t1t1.id=t0.custkeyt1.c_name=t0.c_name

Generation of tuple-centric parallel code

Customer

globalTupleCustomerblock;

P0 P1 P2

Scan(Customer)

HashTableJoinBuild[c_custkey]

Shuffle[c_custkey]

Project[c_custkey, c_name]

Page 18: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

18

Generation of tuple-centric parallel code

Customer

P0 P1 P2

Scan(Customer)

HashTableJoinBuild[c_custkey]

Shuffle[c_custkey]

Project[c_custkey, c_name]

hashpartitions

…localjoin

…localjoin

…localjoin

Page 19: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

19

onpartition(jointable[h(t1.c_custkey)])atomic{jointable[h(t1.c_custkey)].append(t1)}

Generation of tuple-centric parallel code

jointable

Customer

globalTupleCustomerblock;globalCelljointableblock;

P0 P1 P2

Scan(Customer)

HashTableJoinBuild[c_custkey]

Shuffle[c_custkey]

Project[c_custkey, c_name]

forallcinCustomerT1t1t1.id=t0.custkeyt1.c_name=t0.c_name

Page 20: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Generation algorithm

20

classOperator{//getthenexttuplenext(T&):bool}

Runtime interpretation Code generation

classOperator{//emitcodeforproducing//atupleproduce(state&):unit

//emitcodeforconsuming//atupleconsume(state&,inputTuple):string

}

Page 21: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Generation algorithm

21

classOperator{//emitcodeforproducing//atupleproduce(state&):unit

//emitcodeforconsuming//atupleconsume(state&,inputTuple):string

}

Code generation

HashTableJoinProbe[o_custkey=c_custkey]

HashTableAggregateScan

Project[cnt, c_name]

Scan(Order)Scan(Customer)

Select[o_totalprice > 20]

HashTableJoinBuild[c_custkey]

HashTableAggregateUpdate

[o_custkey]

Shuffle[o_custkey]Shuffle[c_custkey]

Project[c_custkey, c_name]

full dependence

pipeline

tuple dependence

P

C

Page 22: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

22

Integrate queries into PGAS languages

Page 23: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

23

Integrate queries into PGAS languages

Page 24: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

24

Experiment #1: conventional approach vs. Radish

0

5

10

15

20

25

30

1 2 3 4 6 8 9 10 11 14 15 16 17 19 20

Speedupoveriterators

Query

Radish Radish-sym

(TPC-H)

Page 25: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

0

5

10

15

20

25

30

1 2 3 4 6 8 9 10 11 14 15 16 17 19 20

Speedupoveriterators

Query

Radish Radish-sym25

Experiment #1: conventional approach vs. Radish

(TPC-H)

Page 26: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

26

Experiment #2: vs ImpalaExperiment #2: vs Impala

0

20

40

60

80

100

120

140

160

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 19 20

Runtime(s)

Query

Radish Impala (codegen) Impala (nocodegen)

(TPC-H)

Page 27: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

27

Experiment #2: vs Impala

0

20

40

60

80

100

120

140

160

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 19 20

Runtime(s)

Query

Radish Impala (codegen) Impala (nocodegen)

TPC-H100GB database

(TPC-H)

Page 28: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Compilation time is high

28

0

5

10

15

20

25

30

35

6 1 19 12 17 14 15 4 3 11 16 10 18 20 5 7 9 2 8

Time

Query

compile queryplan

Page 29: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

Conclusion

get the code!

RACO+RADISH: github.com/uwescience/raco GRAPPA: grappa.io

29

Page 30: Query processing for parallel languageshomepage.cs.uiowa.edu/~bdmyers/papers/myers_dbday15.pdfData analysis code less reuse, ad hoc queries slide src: Jeff Gardner. 6 ... Approaches

30

Attribution