32
Gofer A scalable stateless proxy for DBI [email protected] - September 2008

DBD::Gofer 200809

Embed Size (px)

DESCRIPTION

DBD::Gofer is the scalable stateless proxy driver for Perl DBI.These are the slides for my lightning talk on DBD::Gofer given at the Italian Perl Workshop in 2008 (with a few extra slides added).

Citation preview

Page 1: DBD::Gofer 200809

GoferA scalable stateless proxy for DBI

[email protected] - September 2008

Page 2: DBD::Gofer 200809

DBI Layer Cake

Application

DBI

DBD::Foo

Database API

Database

Page 3: DBD::Gofer 200809

New Layers

Application

DBI

DBD::Gofer

Database

Gofer Transport

DBI

DBD::Foo

Database APIGofer Transport

Page 4: DBD::Gofer 200809

Gofer gives you...

Application

DBI

DBD::Gofer

Gofer Transport

DBI

Gofer Transport

Client-sidecaching

Server-sidecaching

Fewround-trips

Pluggabletransports

ssh, http, etc...Retry on

error

Stateless protocol

Configurevia DSN

DBD::Foo

Database DBD only needed on server

Flexible and easy to extende.g. DBIx::Router

Page 5: DBD::Gofer 200809

DBD::Proxy vs DBD::GoferDBD::Proxy DBD::Gofer

Supports transactions ✓ ✗(not yet)

Supports very large results ✓ ✗(memory)

Large test suite ✗ ✓Minimal round-trips ✗ ✓

Automatic retry on error ✗ ✓Modular & Pluggable classes ✗ ✓

Tunable via Policies ✗ ✓Scalable ✗ ✓

Connection pooling ✗ ✓Client-side caching ✗ ✓Server-side caching ✗ ✓

Page 6: DBD::Gofer 200809

Gofer, logically

• Gofer is- A scalable stateless proxy architecture for DBI

- Transport independent

- Highly configurable on client and server side

- Efficient, in CPU time and minimal round-trips

- Well tested

- Scalable

- Cacheable

- Simple and reliable

Page 7: DBD::Gofer 200809

Gofer, structurally

• Gofer is- A simple stateless request/response protocol

- A DBI proxy driver: DBD::Gofer

- A request executor module

- A set of pluggable transport modules

- An extensible client configuration mechanism

• Development sponsored by Shopzilla.com

Page 8: DBD::Gofer 200809

Gofer Protocol

• DBI::Gofer::Request & DBI::Gofer::Response

• Simple blessed hashes- Request contains all required information to

connect and execute the requested methods.

- Response contains results from methods calls, including result sets (rows of data).

• Serialized and transported by transport modules like DBI::Gofer::Transport::http

Page 9: DBD::Gofer 200809

Using DBD::Gofer

• Via DSN- By adding a prefix $dsn = “dbi:Driver:dbname”;

$dsn = “dbi:Gofer:transport=foo;dsn=$dsn”;

• Via DBI_AUTOPROXY environment variable- Automatically applies to all DBI connect calls$ export DBI_AUTOPROXY=“dbi:Gofer:transport=foo”;

- No code changes required!

Page 10: DBD::Gofer 200809

Gofer Transports

• DBI::Gofer::Transport::null- The ‘null’ transport.

- Serializes request object,transports it nowhere,then deserializes and passes it to DBI::Gofer::Execute to execute

- Serializes response object,transports it nowhere,then deserializes and returns it to caller

- Very useful for testing.

DBI_AUTOPROXY=“dbi:Gofer:transport=null”

Page 11: DBD::Gofer 200809

Gofer Transports

• DBD::Gofer::Transport::stream (ssh)- Can ssh to remote system to self-start server

ssh -xq [email protected] \perl -MDBI::Gofer::Transport::stream \-e run_stdio_hex

- Automatically reconnects if required

- ssh gives you security and optional compression

DBI_AUTOPROXY=’dbi:Gofer:transport=stream;url=ssh:[email protected]

Page 12: DBD::Gofer 200809

Gofer Transports• DBD::Gofer::Transport::http- Sends requests as http POST requests

- Server typically Apache mod_perl running DBI::Gofer::Transport::http

- Very flexible server-side configuration options

- Can use https for security

- Can use web techniques for scaling and high-availability. Will support web caching.

DBI_AUTOPROXY=’dbi:Gofer:transport=http;url=http://example.com/gofer’

Page 13: DBD::Gofer 200809

Gofer Transports

• DBD::Gofer::Transport::gearman- Distributes requests to a pool of workers

- Gearman a lightweight distributed job queuehttp://www.danga.com/gearman

- Gearman is implemented by the same people who wrote memcached, perlbal, mogileFS, & DJabberd

DBI_AUTOPROXY=’dbi:Gofer:transport=gearman;url=http://example.com/gofer’

Page 14: DBD::Gofer 200809

Pooling via gearman vs http

• I haven’t compared them in use myself yet

+ Gearman may have lower latency

+ Gearman spreads load over multiple machines without need for load-balancer

+ Gearman coalescing may be beneficial

+ Gearman async client may work well with POE

- Gearman clients need to be told about servers

• More gearman info http://danga.com/words/2007_04_linuxfest_nw/linuxfest.pdf (p61+)

Page 15: DBD::Gofer 200809

DBD::Gofer• A proxy driver

• Aims to be as ‘transparent’ as possible

• Accumulates details of DBI method calls

• Delays forwarding request for as long as possible

• execute_array() is a single round-trip

• Policy mechanism allows fine-grained tuning to trade transparency for speed

Page 16: DBD::Gofer 200809

DBD::Gofer::Policy::*• Three policies supplied: pedantic, classic, and

rush. Classic is the default.

• Policies are implemented as classes

• Currently 22 individual items within a policy

• Policy items can be dynamic methods

• Policy is selected via DSN:DBI_AUTOPROXY=“dbi:Gofer:transport=null;policy=pedantic”

Page 17: DBD::Gofer 200809

$dbh = DBI->connect_cached

$dbh->ping

$dbh->quote

$sth = $dbh->prepare

$sth->execute

$sth->{NUM_OF_FIELDS}

$sth->fetchrow_array

$dbh->tables

Round-trips per Policypedantic classic rush

connect() ✓

✓ if notdefault

if notdefault

✓ ✓ ✓

✓ ✓ cached after first

Page 18: DBD::Gofer 200809

Error Handling

• DBD::Gofer can automatically retry on failureDBI_AUTOPROXY=“dbi:Gofer:transport=...;retry_limit=3”

• Default behaviour is to retry if $request->is_idemponent is true- looks at SQL, returns true for most SELECTs

• Default is retry_limit=0, so disabled

• You can define your own behaviour:DBI->connect(..., { go_retry_hook => sub { ... },});

Page 19: DBD::Gofer 200809

Client-side Caching

• DBD::Gofer can automatically cache resultsDBI_AUTOPROXY=“dbi:Gofer:transport=null;cache=CacheModuleName”

• Default behaviour is to cache if $request->is_idemponent is true

looks at SQL returns true for most SELECTs

• Use any cache module with standard interface set($k,$v) and get($k)

• You can define your own behaviour by subclassing

Page 20: DBD::Gofer 200809

Server-side Caching

• DBI::Gofer::Transport::* can automatically cache results

• Use any cache module with standard interface- set($k,$v) and get($k)

• You can define your own behaviour by subclassing

Page 21: DBD::Gofer 200809

Gofer Caveats

• Adds latency time cost per network round-trip- Can minimize round-trips by fine-tuning a policy

- and/or using client-side caching

• State-less-ness has implications...- No transactions, AutoCommit only, at the moment.

- Can’t alter $dbh attributes after connect

- Can’t use temp tables, locks, and other per-connection persistent state, except within a stored procedure

- Code using last_insert_id needs a (simple) change

- See the docs for a few other very obscure caveats

Page 22: DBD::Gofer 200809

An ExampleUsing Gofer for Connection Pooling,

Scaling, and High Availability

Page 23: DBD::Gofer 200809

The ProblemApache

WorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkers

Database Server

40 worker processes per server

=1 overloaded database

+150 servers

-

Page 24: DBD::Gofer 200809

A SolutionApache

WorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkers

Database Server

A ‘database proxy’ thatdoes connection pooling

Holds a fewconnections open

Uses them to service DBI requestsRepeat for all servers

-

Page 25: DBD::Gofer 200809

An ImplementationApache

WorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkers

!"#$%#&Apache Database

Server

Standard apache+mod_perl

DBI + DBD::*

DBD::Gofer with http transport

DBI::Gofer::Execute and DBI::Gofer::Transport::http

modules implement stateless proxy

-

Page 26: DBD::Gofer 200809

Workers

Workers

HT

TP

Lo

ad

Ba

lan

ce

r a

nd

Ca

ch

e

Ap

ach

e r

un

nin

g D

BI

Go

fer

Ap

ach

e r

un

nin

g D

BI

Go

fer

Database Server

Manyweb servers

Load Balance and Cache

GoferServers

New middle tier

Load balancing and fail-over

x150

servers

serverwith40

workers

= 6000 db connections

Far fewer db connections

-Caching

Caching

Page 27: DBD::Gofer 200809
Page 28: DBD::Gofer 200809

Gofer’s Future

• Caching for http transport

• Optional JSON serialization

• Caching in DBD::Gofer (now done)

• Make state-less-ness of transport optionalto support transactions

• Patches welcome!

Page 29: DBD::Gofer 200809

Future: http caching

• Potential big win

• DBD::Gofer needs to indicate cache-ability

- via appropriate http headers

• Server side needs to agree

- and respond with appropriate http headers

• Caching then happens just like for web pages- if there’s a web cache between client and server

• Patches welcome!

Page 30: DBD::Gofer 200809

Future: JSON

• Turns DBI into a web service!

- Service Oriented Architecture anyone?

• Accessible to anything that can talk JSON

• Clients could be JavaScript, Java, ...

- and various languages that begin with P ...

• Patches welcome!

Page 31: DBD::Gofer 200809

Future: Transactions

• State-less-ness could be made optional

- If transport layer being used agrees

- Easiest to implement for stream / ssh

• Would be enabled byAutoCommit => 0

$dbh->begin_work

• Patches welcome!

Page 32: DBD::Gofer 200809

Questions?