Sadayuki Furuhashi, Founder & Software Architect, Treasure Data, Inc.

Prestogres, ODBC & JDBC connectivity for Presto

Prestogres provides ODBC & JDBC connectivity to Presto, a distributed SQL query engine. Presto meetup @ Facebook (2014-05-14)

Page 1: Prestogres, ODBC & JDBC connectivity for Presto

Sadayuki Furuhashi

Founder & Software Architect

ODBC & JDBC connectivity for Presto

Treasure Data, Inc.

Page 2: Prestogres, ODBC & JDBC connectivity for Presto

A little about me...

> Sadayuki Furuhashi github/twitter: @frsyuki

> Treasure Data, Inc. Founder & Software Architect

> Open source projects
• MessagePack - efficient object serializer
• Fluentd - data collection tool
• ServerEngine - Ruby framework to build multiprocess servers
• LS4 - distributed object storage system (suspended)
• kumofs - distributed key-value data store (suspended)

Page 3: Prestogres, ODBC & JDBC connectivity for Presto

Background + Intro:

Page 4: Prestogres, ODBC & JDBC connectivity for Presto

Background

[Diagram labels: RDB, HTTP, etc.; “Plazma” columnar cloud storage; Pig; Tableau, Pentaho, Web apps]

This is us (Treasure Data)

Page 5: Prestogres, ODBC & JDBC connectivity for Presto

Data collection

> “Fluentd” streaming data collection tool

> Plugin architecture

> github.com/fluent/fluentd

Page 6: Prestogres, ODBC & JDBC connectivity for Presto

Hadoop as a service

> “BigData” processing
• Funnel analysis for web services
• Correlation analysis for ad-tech (DSP/SSP/DMP)
• Creating OLAP cube

> Multi-tenant scheduling
• utilize idling resources purchased by other users

Page 7: Prestogres, ODBC & JDBC connectivity for Presto

Presto as a service

> Interactive queries

> Multi-tenant scheduling (in progress)

Page 8: Prestogres, ODBC & JDBC connectivity for Presto

Here is the problem…

ODBC/JDBC

Missing!

Page 9: Prestogres, ODBC & JDBC connectivity for Presto

The problem to solve

• Providing open-source ODBC/JDBC connectivity for Presto quickly

[Diagram labels: Tableau, Pentaho, Web apps; ODBC/JDBC]

• ODBC/JDBC are VERY complicated APIs
> PostgreSQL ODBC driver: 60,000 lines
> PostgreSQL JDBC driver: 43,000 lines

Page 10: Prestogres, ODBC & JDBC connectivity for Presto

A solution

• Using PostgreSQL ODBC/JDBC drivers

• Creating PostgreSQL protocol gateway

Page 11: Prestogres, ODBC & JDBC connectivity for Presto

A solution

• Using PostgreSQL ODBC/JDBC drivers (feature-complete & matured for many years)

• Creating PostgreSQL protocol gateway (some middleware already implemented)

PostgreSQL protocol gateway for Presto

Page 12: Prestogres, ODBC & JDBC connectivity for Presto

Architecture

Page 13: Prestogres, ODBC & JDBC connectivity for Presto

Architecture

Tableau, Pentaho, Web apps…

PostgreSQL protocol

PostgreSQL ODBC/JDBC driver, Other PostgreSQL clients

Page 14: Prestogres, ODBC & JDBC connectivity for Presto

Internal Architecture

• Tableau… sends the query to pgpool-II (patched): select count(*) from x;
• patched pgpool-II wraps the SQL in a function call and passes it to PostgreSQL: run_presto_as_temp_table( …, ’select count(*) from x’);
• the function sends the original SQL to Presto: select count(*) from x;
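The wrapping step on this slide can be sketched in a few lines of Python. This is only an illustration, not the actual pgpool-II patch: the helper names `quote_sql_literal` and `wrap_for_presto` are hypothetical, the leading `SELECT` is an assumption about how the function is invoked, and the arguments the slide elides with "…" are passed through as an opaque, caller-supplied list because their meaning is not shown here.

```python
from typing import Sequence


def quote_sql_literal(text: str) -> str:
    """Quote text as a SQL string literal by doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def wrap_for_presto(original_sql: str, leading_args: Sequence[str]) -> str:
    """Wrap a client query in a call to run_presto_as_temp_table.

    leading_args stands in for the arguments the slide elides with "…".
    They are inserted verbatim as already-formed SQL expressions
    (hypothetical; the real argument list is not shown on the slide).
    """
    # Drop the trailing semicolon so it does not end up inside the literal.
    body = original_sql.strip().rstrip(";").strip()
    args = list(leading_args) + [quote_sql_literal(body)]
    # Assumption: the function is invoked through a plain SELECT.
    return "SELECT run_presto_as_temp_table(" + ", ".join(args) + ");"


if __name__ == "__main__":
    # 'placeholder' is a made-up stand-in for the elided arguments.
    print(wrap_for_presto("select count(*) from x;", ["'placeholder'"]))
```

Quoting the original query as a literal is what lets PostgreSQL accept a statement it could not plan itself; the function body then forwards the literal to Presto unchanged.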

Page 15: Prestogres, ODBC & JDBC connectivity for Presto

SELECT from system catalogs

• Tableau… sends a SELECT on the system catalogs to pgpool-II (patched) to get metadata of tables
• pgpool-II (patched) gets the table list
• it runs CREATE TABLE in PostgreSQL for each actual table
• it then runs the original query
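The "CREATE TABLE for each actual table" step can be illustrated roughly as follows. This is a sketch under assumptions: the input shape (a mapping from table name to column name/type pairs), the helper names, and the example types are all hypothetical, and how the table list is actually obtained and typed is not shown on the slide.

```python
from typing import Dict, List, Tuple


def quote_ident(name: str) -> str:
    """Quote a SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_statements(
    tables: Dict[str, List[Tuple[str, str]]]
) -> List[str]:
    """Build one CREATE TABLE statement per table in the given table list.

    tables maps a table name to its (column name, column type) pairs.
    The resulting empty tables exist only so that catalog queries from
    the client see the expected metadata.
    """
    statements = []
    for table, columns in tables.items():
        cols = ", ".join(f"{quote_ident(col)} {typ}" for col, typ in columns)
        statements.append(f"CREATE TABLE {quote_ident(table)} ({cols});")
    return statements


if __name__ == "__main__":
    # Hypothetical table list with one table "x".
    for stmt in create_table_statements({"x": [("id", "bigint"), ("name", "varchar")]}):
        print(stmt)
```

Once these statements have run, the client's original catalog query can be executed by PostgreSQL as-is, which is why the gateway does not need to emulate the system catalogs itself.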

Page 16: Prestogres, ODBC & JDBC connectivity for Presto

Demo

Page 17: Prestogres, ODBC & JDBC connectivity for Presto

Limitations

• Server-side prepare is not supported

• Cursor (DECLARE/FETCH) is not supported

• JDBC driver needs ?protocolVersion=2 option
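As a concrete illustration of the last limitation, the option goes into the query string of the PostgreSQL JDBC connection URL. The small helper below only builds such a URL; the function name, host, port, and database name are hypothetical placeholders, not values from the talk.

```python
from urllib.parse import urlencode


def prestogres_jdbc_url(host: str, port: int, database: str, **options: str) -> str:
    """Build a PostgreSQL JDBC URL that always carries protocolVersion=2."""
    params = {"protocolVersion": "2"}
    for key, value in options.items():
        # Never let a caller override the required protocol version.
        if key != "protocolVersion":
            params[key] = str(value)
    return f"jdbc:postgresql://{host}:{port}/{database}?{urlencode(params)}"


if __name__ == "__main__":
    # Host, port and database here are made-up example values.
    print(prestogres_jdbc_url("localhost", 5439, "default"))
```

The same URL can then be pasted into any tool that accepts a PostgreSQL JDBC connection string.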

Page 18: Prestogres, ODBC & JDBC connectivity for Presto

We’re hiring!

www.treasuredata.com/careers