DBRIDGE: A PROGRAM REWRITE TOOL FOR SET-ORIENTED QUERY EXECUTION Mahendra Chavan*, Ravindra Guravannavar, Prabhas Kumar Samanta, Karthik Ramachandra, S Sudarshan Indian Institute of Technology Bombay, Indian Institute of Technology Hyderabad *Current Affiliation: Sybase Inc.

Page 1: DBridge: A program rewrite tool for  set-oriented query execution

DBRIDGE: A PROGRAM REWRITE TOOL FOR SET-ORIENTED QUERY EXECUTION

Mahendra Chavan*, Ravindra Guravannavar, Prabhas Kumar Samanta, Karthik Ramachandra, S Sudarshan

Indian Institute of Technology Bombay, Indian Institute of Technology Hyderabad

*Current Affiliation: Sybase Inc.

Page 2: DBridge: A program rewrite tool for  set-oriented query execution

THE PROBLEM

Applications often invoke database queries and Web service requests
  • repeatedly (with different parameters)
  • synchronously (blocking on every request)

Naive iterative execution of such queries is inefficient:
  • No sharing of work (e.g. disk I/O)
  • Network round-trip delays

The problem is not within the database engine!
The problem is the way queries are invoked from the application!

Query optimization: time to think out of the box

Page 3: DBridge: A program rewrite tool for  set-oriented query execution

OUR WORK 1: BATCHING

Rewriting Procedures for Batched Bindings
Guravannavar et al., VLDB 2008

Repeated invocation of a query is automatically replaced by a single invocation of its batched form.

Enables use of efficient set-oriented query execution plans:
  • Sharing of work (e.g. disk I/O), etc.
  • Avoids network round-trip delays

Approach:
  • Transform imperative programs using equivalence rules
  • Rewrite queries using decorrelation, the APPLY operator, etc.

Page 4: DBridge: A program rewrite tool for  set-oriented query execution

OUR WORK 2: ASYNCHRONOUS QUERY SUBMISSION

Program Transformation for Asynchronous Query Submission
Chavan et al., ICDE 2011
Research track – 8; April 13th, 14:30-16:00

Repeated synchronous invocation of queries is automatically replaced by asynchronous submission.
  • Application can perform other work while the query executes
  • Sharing of work (e.g. disk I/O) on the database engine
  • Reduces impact of network round-trip delays
  • Extends and generalizes equivalence rules from our VLDB 2008 paper on batching

Page 5: DBridge: A program rewrite tool for  set-oriented query execution

DBRIDGE: BRIDGING THE DIVIDE

A tool that implements these ideas on Java programs that use JDBC:
  • Set-oriented query execution
  • Asynchronous query submission

Two components:
  • The DBridge API: handles query rewriting and plumbing
  • The DBridge Transformer: rewrites programs to optimize database access

Significant performance gains on real-world applications

Page 6: DBridge: A program rewrite tool for  set-oriented query execution

THE DBRIDGE API

Java API which extends the JDBC interface, and can wrap any JDBC driver

Can be used with:
  • Manual writing/rewriting
  • Automatic rewriting (by the DBridge Transformer)

Same API for both batching and asynchronous submission

Abstracts the details of:
  • Parameter batching and query rewrite
  • Thread scheduling and management

Page 7: DBridge: A program rewrite tool for  set-oriented query execution

THE DBRIDGE API

BEFORE

    stmt = con.prepareStatement(
        "SELECT count(partkey) "
        + "FROM part "
        + "WHERE p_category=?");
    while (!categoryList.isEmpty()) {
        category = categoryList.next();
        stmt.setInt(1, category);
        ResultSet rs = stmt.executeQuery();
        rs.next();
        int count = rs.getInt("count");
        sum += count;
        print(category + ": " + count);
    }

AFTER

    stmt = con.dbridgePrepareStatement(
        "SELECT count(partkey) "
        + "FROM part "
        + "WHERE p_category=?");
    LoopContextTable lct = new LoopContextTable();
    while (!categoryList.isEmpty()) {
        LoopContext ctx = lct.createContext();
        category = categoryList.next();
        stmt.setInt(1, category);
        ctx.setInt("category", category);
        stmt.addBatch(ctx);
    }
    stmt.executeBatch();
    for (LoopContext ctx : lct) {
        category = ctx.getInt("category");
        ResultSet rs = stmt.getResultSet(ctx);
        rs.next();
        int count = rs.getInt("count");
        sum += count;
        print(category + ": " + count);
    }

Page 8: DBridge: A program rewrite tool for  set-oriented query execution

DBRIDGE API – SET ORIENTED EXECUTION

    LoopContextTable lct = new LoopContextTable();
    while (!categoryList.isEmpty()) {
        LoopContext ctx = lct.createContext();
        category = categoryList.next();
        stmt.setInt(1, category);
        ctx.setInt("category", category);
        stmt.addBatch(ctx);
    }
    stmt.executeBatch();
    for (LoopContext ctx : lct) {
        category = ctx.getInt("category");
        ResultSet rs = stmt.getResultSet(ctx);
        rs.next();
        int count = rs.getInt("count");
        sum += count;
        print(category + ": " + count);
    }

[Diagram labels: DB; Parameter Batch (temp table); Set of ResultSets]

  • addBatch(ctx) – insert tuple into parameter batch
  • executeBatch() – execute set-oriented form of query
  • getResultSet(ctx) – retrieve results corresponding to the context
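The three calls can be pictured with a small in-memory sketch. This is not the DBridge implementation: `MiniBatchStatement`, the list standing in for the `part` table, and the integer context ids are all assumptions made for illustration, and the real API goes through JDBC and a temp table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a batched statement; not the DBridge API. */
class MiniBatchStatement {
    private final List<Integer> partCategories;               // stands in for part.p_category
    private final List<int[]> paramBatch = new ArrayList<>(); // rows of {contextId, category}
    private final Map<Integer, Integer> results = new HashMap<>();
    private int currentParam;
    int roundTrips = 0;                                        // simulated trips to the "database"

    MiniBatchStatement(List<Integer> partCategories) {
        this.partCategories = partCategories;
    }

    void setInt(int category) { currentParam = category; }

    /** Like addBatch(ctx): record the bound parameter under the context id. */
    void addBatch(int contextId) { paramBatch.add(new int[] {contextId, currentParam}); }

    /** Like executeBatch(): one pass over the data answers every queued parameter. */
    void executeBatch() {
        roundTrips++;
        Map<Integer, Integer> countByCategory = new HashMap<>();
        for (int c : partCategories) countByCategory.merge(c, 1, Integer::sum);
        for (int[] row : paramBatch) {
            results.put(row[0], countByCategory.getOrDefault(row[1], 0));
        }
        paramBatch.clear();
    }

    /** Like getResultSet(ctx): look up the result that belongs to this context. */
    int getResult(int contextId) { return results.get(contextId); }
}

class MiniBatchDemo {
    static int sumCounts(List<Integer> partCategories, List<Integer> wanted) {
        MiniBatchStatement stmt = new MiniBatchStatement(partCategories);
        for (int ctx = 0; ctx < wanted.size(); ctx++) {   // first loop: queue parameters
            stmt.setInt(wanted.get(ctx));
            stmt.addBatch(ctx);
        }
        stmt.executeBatch();                              // single set-oriented execution
        int sum = 0;
        for (int ctx = 0; ctx < wanted.size(); ctx++) {   // second loop: consume results in order
            sum += stmt.getResult(ctx);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumCounts(List.of(1, 1, 2, 3, 3, 3), List.of(1, 3, 9)));
    }
}
```

However many categories are queued, `roundTrips` stays at 1 in this sketch, which is the effect the set-oriented form is meant to achieve.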

Page 9: DBridge: A program rewrite tool for  set-oriented query execution

DBRIDGE API – ASYNCHRONOUS SUBMISSION

    LoopContextTable lct = new LoopContextTable();
    while (!categoryList.isEmpty()) {
        LoopContext ctx = lct.createContext();
        category = categoryList.next();
        stmt.setInt(1, category);
        ctx.setInt("category", category);
        stmt.addBatch(ctx);
    }
    stmt.executeBatch();
    for (LoopContext ctx : lct) {
        category = ctx.getInt("category");
        ResultSet rs = stmt.getResultSet(ctx);
        rs.next();
        int count = rs.getInt("count");
        sum += count;
        print(category + ": " + count);
    }

[Diagram labels: Submit Q; Thread; DB; Result array]

  • addBatch(ctx) – submits query and returns immediately
  • getResultSet(ctx) – blocking wait
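The submit-now, block-later behaviour can be imitated with standard `java.util.concurrent` classes. The sketch below is only an analogy under stated assumptions: `runQuery` is a hypothetical in-memory stand-in for the database call, and nothing here reflects how DBridge actually schedules its threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of asynchronous submission; the "query" is an in-memory count. */
class AsyncSubmitDemo {
    // Stand-in for running "SELECT count(...) WHERE p_category = ?" on a server.
    static int runQuery(List<Integer> partCategories, int category) {
        int count = 0;
        for (int c : partCategories) {
            if (c == category) count++;
        }
        return count;
    }

    static int sumCounts(List<Integer> partCategories, List<Integer> wanted, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submission loop: supplyAsync returns a handle immediately.
            List<CompletableFuture<Integer>> handles = new ArrayList<>();
            for (int category : wanted) {
                handles.add(CompletableFuture.supplyAsync(
                        () -> runQuery(partCategories, category), pool));
            }
            // Consumption loop: join() blocks until that particular result is ready,
            // and results are consumed in submission order.
            int sum = 0;
            for (CompletableFuture<Integer> handle : handles) {
                sum += handle.join();
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumCounts(List.of(1, 1, 2, 3, 3, 3), List.of(1, 3, 9), 4));
    }
}
```

The `threads` argument plays the role of the thread count varied in the experiments; here it only bounds how many simulated queries run at once.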

Page 10: DBridge: A program rewrite tool for  set-oriented query execution

DBRIDGE - TRANSFORMER

  • Java source-to-source transformation tool
  • Rewrites programs to use the DBridge API
  • Handles complex programs with:
      • Conditional branching (if-then-else) structures
      • Nested loops
  • Performs statement reordering while preserving program equivalence
  • Uses the Soot framework for static analysis and transformation (http://www.sable.mcgill.ca/soot/)

Page 11: DBridge: A program rewrite tool for  set-oriented query execution


DBRIDGE - TRANSFORMER

Page 12: DBridge: A program rewrite tool for  set-oriented query execution

BATCHING: PERFORMANCE IMPACT

  • Category hierarchy traversal (real-world example)
  • For a small number of iterations, no change observed
  • At a large number of iterations, a factor of 8 improvement

[Figure: X-axis: Category Level (Number of Subtree Nodes/Loop Iterations): Leaf (1), Middle (10), Top (78). Y-axis: Time (in sec), 0 to 40. Series: Original Program, Transformed Program.]

Page 13: DBridge: A program rewrite tool for  set-oriented query execution

ASYNCHRONOUS SUBMISSION: PERFORMANCE IMPACT

  • Auction system benchmark application
  • For a small number of iterations (4-40), the transformed program is slower
  • At 400-40000 iterations, a factor of 4-8 improvement
  • Similar for warm and cold cache

[Figure: X-axis: Number of Threads (1, 2, 5, 10, 20, 30, 40, 50). Y-axis: Time, 0 to 50. Series: Original Program, Transformed Program.]

Page 14: DBridge: A program rewrite tool for  set-oriented query execution

COMPARISON: BATCHING VS. ASYNCHRONOUS SUBMISSION

  • Auction system benchmark application
  • Asynchronous execution with 10 threads

[Figure: X-axis: Number of Iterations (400, 4000, 40000). Y-axis: Time (normalized), 0 to 1.2. Series: Original Program, Asynchronous Mode, Batching Mode.]

Page 15: DBridge: A program rewrite tool for  set-oriented query execution

CONCLUSIONS AND ONGOING WORK

Significant performance benefits are possible by using batching and/or asynchronous execution for:
  • Repeated database access from applications
  • Repeated access to Web services

DBridge: batching and asynchronous execution made easy
  • API + automated Java program transformation

Questions? Contact us at http://www.cse.iitb.ac.in/infolab/dbridge
Email: [email protected]

Page 16: DBridge: A program rewrite tool for  set-oriented query execution

TRANSFORMATION WALK-THROUGH

Input: A Java program which uses JDBC

    PreparedStatement stmt = con.prepareStatement(
        "SELECT COUNT(p_partkey) AS itemCount "
        + "FROM part WHERE p_category = ?");
    while (category != 0) {
        stmt.setInt(1, category);
        ResultSet rs = stmt.executeQuery();
        rs.next();
        int itemCount = rs.getInt("itemCount");
        sum = sum + itemCount;
        category = getParent(category);
    }

Page 17: DBridge: A program rewrite tool for  set-oriented query execution

TRANSFORMATION WALK-THROUGH

Step 1 of 5: Identify candidates for set-oriented query execution

    PreparedStatement stmt = con.prepareStatement(
        "SELECT COUNT(p_partkey) AS itemCount "
        + "FROM part WHERE p_category = ?");
    while (category != 0) {
        stmt.setInt(1, category);
        ResultSet rs = stmt.executeQuery();   // iterative execution of a parameterized query
        rs.next();
        int itemCount = rs.getInt("itemCount");
        sum = sum + itemCount;
        category = getParent(category);
    }

Intention: split the loop at the query execution statement

Page 18: DBridge: A program rewrite tool for  set-oriented query execution

TRANSFORMATION WALK-THROUGH

Step 2 of 5: Identify dependencies that prevent loop splitting

    PreparedStatement stmt = con.prepareStatement(
        "SELECT COUNT(p_partkey) AS itemCount "
        + "FROM part WHERE p_category = ?");
    while (category != 0) {
        stmt.setInt(1, category);
        ResultSet rs = stmt.executeQuery();   // iterative execution of a parameterized query
        rs.next();
        int itemCount = rs.getInt("itemCount");
        sum = sum + itemCount;
        category = getParent(category);
    }

A loop-carried flow dependency edge crosses the query execution statement: category is assigned after the query and read by stmt.setInt(1, category) in the next iteration.
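One way to picture this check is to reduce each statement to the variables it reads and writes. The sketch below is a deliberately simplified model, not the Soot-based analysis the tool uses: the `Stmt` class, the read/write sets, and the decision to ignore the loop condition are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SplitCheck {
    /** A statement reduced to the variables it reads and writes (a simplification). */
    static final class Stmt {
        final String text;
        final Set<String> reads;
        final Set<String> writes;
        Stmt(String text, Set<String> reads, Set<String> writes) {
            this.text = text;
            this.reads = reads;
            this.writes = writes;
        }
    }

    /**
     * Returns the variables whose value flows from a statement after the query
     * (index > queryIndex) to a statement at or before it in the next iteration.
     */
    static Set<String> crossingDependencies(List<Stmt> body, int queryIndex) {
        Set<String> writtenAfterQuery = new HashSet<>();
        for (int j = queryIndex + 1; j < body.size(); j++) {
            writtenAfterQuery.addAll(body.get(j).writes);
        }
        Set<String> crossing = new HashSet<>();
        Set<String> writtenSoFar = new HashSet<>();   // definitions earlier in the same iteration
        for (int i = 0; i <= queryIndex; i++) {
            for (String v : body.get(i).reads) {
                // A read with no earlier write in this iteration sees the previous iteration's value.
                if (!writtenSoFar.contains(v) && writtenAfterQuery.contains(v)) {
                    crossing.add(v);
                }
            }
            writtenSoFar.addAll(body.get(i).writes);
        }
        return crossing;
    }

    /** Loop body as written in the input program; the query is at index 1. */
    static List<Stmt> originalLoop() {
        return List.of(
            new Stmt("stmt.setInt(1, category)", Set.of("category"), Set.of("stmt")),
            new Stmt("rs = stmt.executeQuery()", Set.of("stmt"), Set.of("rs")),
            new Stmt("rs.next()", Set.of("rs"), Set.of("rs")),
            new Stmt("itemCount = rs.getInt(...)", Set.of("rs"), Set.of("itemCount")),
            new Stmt("sum = sum + itemCount", Set.of("sum", "itemCount"), Set.of("sum")),
            new Stmt("category = getParent(category)", Set.of("category"), Set.of("category")));
    }

    /** Loop body with the category update moved above the query; the query is at index 3. */
    static List<Stmt> reorderedLoop() {
        return List.of(
            new Stmt("temp = category", Set.of("category"), Set.of("temp")),
            new Stmt("category = getParent(category)", Set.of("category"), Set.of("category")),
            new Stmt("stmt.setInt(1, temp)", Set.of("temp"), Set.of("stmt")),
            new Stmt("rs = stmt.executeQuery()", Set.of("stmt"), Set.of("rs")),
            new Stmt("rs.next()", Set.of("rs"), Set.of("rs")),
            new Stmt("itemCount = rs.getInt(...)", Set.of("rs"), Set.of("itemCount")),
            new Stmt("sum = sum + itemCount", Set.of("sum", "itemCount"), Set.of("sum")));
    }

    public static void main(String[] args) {
        System.out.println(crossingDependencies(originalLoop(), 1));
        System.out.println(crossingDependencies(reorderedLoop(), 3));
    }
}
```

In this model the original body reports `category` as the only crossing variable, while a body in which the update of `category` precedes the query reports none. The dependency on `sum` stays entirely after the query, so it does not block the split.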

Page 19: DBridge: A program rewrite tool for  set-oriented query execution

TRANSFORMATION WALK-THROUGH

Step 3 of 5: Reorder statements to enable loop splitting

    PreparedStatement stmt = con.prepareStatement(
        "SELECT COUNT(p_partkey) AS itemCount "
        + "FROM part WHERE p_category = ?");
    while (category != 0) {
        int temp = category;
        category = getParent(category);   // moved above the query invocation
        stmt.setInt(1, temp);
        ResultSet rs = stmt.executeQuery();
        rs.next();
        int itemCount = rs.getInt("itemCount");
        sum = sum + itemCount;
    }

Loop can be safely split now
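That the reordering preserves the result can be checked on a toy hierarchy. In this sketch the `PARENT` and `ITEM_COUNT` maps are made-up data and `executeQuery` is a hypothetical in-memory lookup standing in for the JDBC call; only the statement order mirrors the walk-through.

```java
import java.util.Map;

class ReorderDemo {
    // Made-up data: category -> parent (0 ends the walk), and items per category.
    static final Map<Integer, Integer> PARENT = Map.of(7, 3, 3, 1, 1, 0);
    static final Map<Integer, Integer> ITEM_COUNT = Map.of(7, 4, 3, 10, 1, 25);

    static int getParent(int category) { return PARENT.getOrDefault(category, 0); }

    // Stand-in for the parameterized COUNT query.
    static int executeQuery(int category) { return ITEM_COUNT.getOrDefault(category, 0); }

    /** Statement order of the input program. */
    static int original(int category) {
        int sum = 0;
        while (category != 0) {
            int itemCount = executeQuery(category);
            sum = sum + itemCount;
            category = getParent(category);   // written after the query, read before it next time
        }
        return sum;
    }

    /** Statement order after reordering: the update of category precedes the query. */
    static int reordered(int category) {
        int sum = 0;
        while (category != 0) {
            int temp = category;              // snapshot of the value the query needs
            category = getParent(category);
            int itemCount = executeQuery(temp);
            sum = sum + itemCount;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(original(7) + " " + reordered(7));
    }
}
```

Starting from category 7, both versions visit 7, 3 and 1 and return the same sum. The temporary is what keeps the query parameter intact once `category` has already advanced.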

Page 20: DBridge: A program rewrite tool for  set-oriented query execution

TRANSFORMATION WALK-THROUGH

Step 4 of 5: Split the loop (Rule 2)

    // The context table preserves split local values and the order of processing results.
    LoopContextTable lct = new LoopContextTable();
    while (category != 0) {
        LoopContext ctx = lct.createContext();
        int temp = category;
        category = getParent(category);
        stmt.setInt(1, temp);
        stmt.addBatch(ctx);   // accumulates parameters (batching) or submits the query (asynchrony)
    }
    // The query execution statement is out of the loop and replaced with a call to its set-oriented form.
    stmt.executeBatch();
    // Process result sets in the same order as the original loop.
    for (LoopContext ctx : lct) {
        ResultSet rs = stmt.getResultSet(ctx);
        rs.next();
        int itemCount = rs.getInt("itemCount");
        sum = sum + itemCount;
    }
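The point about preserving local values and result order can be illustrated without a database. In the sketch below a plain list of saved values plays the role of the context table; the data maps and method names are hypothetical, and the lookup loop merely stands in for a set-oriented execution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SplitDemo {
    // Made-up data: category -> parent (0 ends the walk), and items per category.
    static final Map<Integer, Integer> PARENT = Map.of(7, 3, 3, 1, 1, 0);
    static final Map<Integer, Integer> ITEM_COUNT = Map.of(7, 4, 3, 10, 1, 25);

    /** Single loop: one "query" per iteration, results reported as they arrive. */
    static List<String> singleLoop(int category) {
        List<String> trace = new ArrayList<>();
        while (category != 0) {
            int temp = category;
            category = PARENT.getOrDefault(category, 0);
            int itemCount = ITEM_COUNT.getOrDefault(temp, 0);
            trace.add(temp + ": " + itemCount);
        }
        return trace;
    }

    /** Split loop: saved contexts keep each iteration's local value and the iteration order. */
    static List<String> splitLoop(int category) {
        List<Integer> contexts = new ArrayList<>();      // one saved `temp` per iteration
        while (category != 0) {
            int temp = category;
            category = PARENT.getOrDefault(category, 0);
            contexts.add(temp);                           // role of addBatch(ctx)
        }
        List<Integer> results = new ArrayList<>();        // role of executeBatch()
        for (int param : contexts) {
            results.add(ITEM_COUNT.getOrDefault(param, 0));
        }
        List<String> trace = new ArrayList<>();
        for (int i = 0; i < contexts.size(); i++) {       // same order as the single loop
            trace.add(contexts.get(i) + ": " + results.get(i));
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(singleLoop(7));
        System.out.println(splitLoop(7));
    }
}
```

Both methods produce the same trace, line for line. Without the saved `temp` values the second loop could not label its results, because `category` has already reached 0 by the time the first loop ends.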

Page 21: DBridge: A program rewrite tool for  set-oriented query execution

TRANSFORMATION WALK-THROUGH

Step 5 of 5: Query Rewrite

Original Query

    SELECT COUNT(p_partkey) AS itemCount
    FROM part
    WHERE p_category = ?

Temp table to store Parameter batch

    CREATE TABLE BATCHTABLE1(paramcolumn1 INTEGER, loopKey1 INTEGER)

Batch Inserts into Temp table

    INSERT INTO BATCHTABLE1 VALUES(..., …)

Set-oriented Query

    SELECT BATCHTABLE1.*, qry.*
    FROM BATCHTABLE1 OUTER APPLY (
        SELECT COUNT(p_partkey) AS itemCount
        FROM part
        WHERE p_category = paramcolumn1) qry
    ORDER BY loopKey1
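The textual side of this rewrite can be sketched as plain string construction. The helper below is hypothetical and far cruder than a real rewriter: it assumes exactly one `?` marker, does not parse SQL (a `?` inside a string literal would confuse it), and emits `OUTER APPLY`, which engines such as SQL Server accept but which is not portable to every database.

```java
class SetOrientedRewrite {
    /** DDL for the temp table holding the parameter batch (one parameter, one loop key). */
    static String createBatchTable(String table) {
        return "CREATE TABLE " + table + "(paramcolumn1 INTEGER, loopKey1 INTEGER)";
    }

    /** Parameterized insert, suitable for PreparedStatement.addBatch()/executeBatch(). */
    static String insertIntoBatchTable(String table) {
        return "INSERT INTO " + table + " VALUES(?, ?)";
    }

    /** Replaces the single '?' with the parameter column and wraps the query in OUTER APPLY. */
    static String setOrientedQuery(String originalQuery, String table) {
        int first = originalQuery.indexOf('?');
        if (first < 0 || first != originalQuery.lastIndexOf('?')) {
            throw new IllegalArgumentException("sketch handles exactly one parameter marker");
        }
        String correlated = originalQuery.replace("?", "paramcolumn1");
        return "SELECT " + table + ".*, qry.* FROM " + table
                + " OUTER APPLY (" + correlated + ") qry ORDER BY loopKey1";
    }

    public static void main(String[] args) {
        String original =
                "SELECT COUNT(p_partkey) AS itemCount FROM part WHERE p_category = ?";
        System.out.println(createBatchTable("BATCHTABLE1"));
        System.out.println(insertIntoBatchTable("BATCHTABLE1"));
        System.out.println(setOrientedQuery(original, "BATCHTABLE1"));
    }
}
```

Ordering by the loop key is what lets the application read results back in iteration order, which the split loop relies on.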

Page 22: DBridge: A program rewrite tool for  set-oriented query execution

Query optimization: time to think out of the box

Program Transformation for Asynchronous Query Submission
Chavan, Guravannavar, Ramachandra, Sudarshan
Research track – 8; April 13th, 14:30-16:00

  • Query optimization within the database: well researched
  • Problem: interface between the application and the database
  • Our approach: rewrite programs and queries together
      • Asynchronous query submission or batching
      • Automated rewriting based on static analysis
  • Up to 8x performance improvement

    while (…) {
        …..
        qt.bind(1, category);
        count = executeQuery(qt);
        sum += count;
    }

    while (...) {
        .......
        qt.bind(1, category);
        handle[n++] = submitQuery(qt);
    }
    for (int i = 0; i < n; i++) {
        count = fetchResult(handle[i]);
        sum += count;
    }

IIT Bombay and IIT Hyderabad