Click here to load reader

DBD::Gofer 200809

  • View
    2.825

  • Download
    3

Embed Size (px)

DESCRIPTION

DBD::Gofer is the scalable stateless proxy driver for Perl DBI.These are the slides for my lightning talk on DBD::Gofer given at the Italian Perl Workshop in 2008 (with a few extra slides added).

Text of DBD::Gofer 200809

2. DBI Layer Cake Application DBIDBD::Foo Database API Database 3. New LayersApplicationGofer Transport DBIDBIDBD::Gofer DBD::Foo Gofer Transport Database APIDatabase 4. Gofer gives you...Few Application round-trips Gofer TransportDBIDBI DBD::Gofer DBD::FooGofer TransportStatelessprotocol Server-sideDatabase DBDcaching Client-side only neededcaching on server PluggableRetry on Congure transportsFlexible and easy to extend errorvia DSN ssh, http, etc... e.g. DBIx::Router 5. DBD::Proxy vs DBD::Gofer DBD::Proxy DBD::Gofer Supports transactions (not yet) Supports very large results (memory) Large test suite Minimal round-trips Automatic retry on error Modular & Pluggable classes Tunable via Policies Scalable Connection pooling Client-side caching Server-side caching 6. Gofer, logically Gofer is - A scalable stateless proxy architecture for DBI - Transport independent - Highly congurable on client and server side - Efcient, in CPU time and minimal round-trips - Well tested - Scalable - Cacheable - Simple and reliable 7. Gofer, structurally Gofer is - A simple stateless request/response protocol - A DBI proxy driver: DBD::Gofer - A request executor module - A set of pluggable transport modules - An extensible client conguration mechanism Development sponsored by Shopzilla.com 8. Gofer Protocol DBI::Gofer::Request & DBI::Gofer::Response Simple blessed hashes - Request contains all required information to connect and execute the requested methods. - Response contains results from methods calls, including result sets (rows of data). Serialized and transported by transport modules like DBI::Gofer::Transport::http 9. Using DBD::Gofer Via DSN - By adding a prex $dsn = dbi:Driver:dbname; $dsn = dbi:Gofer:transport=foo;dsn=$dsn; Via DBI_AUTOPROXY environment variable - Automatically applies to all DBI connect calls $ export DBI_AUTOPROXY=dbi:Gofer:transport=foo; - No code changes required! 10. Gofer Transports DBI::Gofer::Transport::null - The null transport. - Serializes request object, transports it nowhere, then deserializes and passes it to DBI::Gofer::Execute to execute - Serializes response object, transports it nowhere, then deserializes and returns it to caller - Very useful for testing. DBI_AUTOPROXY=dbi:Gofer:transport=null 11. Gofer Transports DBD::Gofer::Transport::stream (ssh) - Can ssh to remote system to self-start serverssh -xq [email protected] perl -MDBI::Gofer::Transport::stream -e run_stdio_hex - Automatically reconnects if required - ssh gives you security and optional compression DBI_AUTOPROXY=dbi:Gofer:transport=stream ;url=ssh:[email protected] 12. Gofer Transports DBD::Gofer::Transport::http - Sends requests as http POST requests - Server typically Apache mod_perl running DBI::Gofer::Transport::http - Very exible server-side conguration options - Can use https for security - Can use web techniques for scaling and high- availability. Will support web caching. DBI_AUTOPROXY=dbi:Gofer:transport=http ;url=http://example.com/gofer 13. Gofer Transports DBD::Gofer::Transport::gearman - Distributes requests to a pool of workers - Gearman a lightweight distributed job queue http://www.danga.com/gearman - Gearman is implemented by the same people who wrote memcached, perlbal, mogileFS, & DJabberd DBI_AUTOPROXY=dbi:Gofer:transport=gearman ;url=http://example.com/gofer 14. Pooling via gearman vs http I havent compared them in use myself yet + Gearman may have lower latency + Gearman spreads load over multiple machines without need for load-balancer + Gearman coalescing may be benecial + Gearman async client may work well with POE - Gearman clients need to be told about servers More gearman info http://danga.com/words/ 2007_04_linuxfest_nw/linuxfest.pdf (p61+) 15. DBD::Gofer A proxy driver Aims to be as transparent as possible Accumulates details of DBI method calls Delays forwarding request for as long as possible execute_array() is a single round-trip Policy mechanism allows ne-grained tuning to trade transparency for speed 16. DBD::Gofer::Policy::* Three policies supplied: pedantic, classic, and rush. Classic is the default. Policies are implemented as classes Currently 22 individual items within a policy Policy items can be dynamic methods Policy is selected via DSN: DBI_AUTOPROXY=dbi:Gofer:transport=null ;policy=pedantic 17. Round-trips per Policypedantic classic rush $dbh = DBI->connect_cached connect() $dbh->ping if not if not $dbh->quote defaultdefault$sth = $dbh->prepare $sth->execute $sth->{NUM_OF_FIELDS} $sth->fetchrow_array cached $dbh->tables after rst 18. Error Handling DBD::Gofer can automatically retry on failure DBI_AUTOPROXY=dbi:Gofer:transport=...; retry_limit=3 Default behaviour is to retry if$request->is_idemponent is true - looks at SQL, returns true for most SELECTs Default is retry_limit=0, so disabled You can dene your own behaviour: DBI->connect(..., { go_retry_hook => sub { ... }, }); 19. Client-side Caching DBD::Gofer can automatically cache results DBI_AUTOPROXY=dbi:Gofer:transport=null ;cache=CacheModuleName Default behaviour is to cache if$request->is_idemponent is true looks at SQL returns true for most SELECTs Use any cache module with standard interface set($k,$v) and get($k) You can dene your own behaviour by subclassing 20. Server-side Caching DBI::Gofer::Transport::* can automatically cache results Use any cache module with standard interface - set($k,$v) and get($k) You can dene your own behaviour by subclassing 21. Gofer Caveats Adds latency time cost per network round-trip - Can minimize round-trips by ne-tuning a policy - and/or using client-side caching State-less-ness has implications... - No transactions, AutoCommit only, at the moment. - Cant alter $dbh attributes after connect - Cant use temp tables, locks, and other per-connection persistent state, except within a stored procedure - Code using last_insert_id needs a (simple) change - See the docs for a few other very obscure caveats 22. An Example Using Gofer for Connection Pooling,Scaling, and High Availability 23. The Problem Apache40 worker processes per serverWorkerWorkerWorkerWorkerWorker +versWorkerWorkerWorkerWorkerWorker 150 serWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorker DatabaseWorkerWorkerServerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorker =WorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkers1 overloaded database - 24. A Solution A database proxy that ApacheWorker does connection poolingWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorker DatabaseWorkerWorkerServerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorker Holds a fewWorkerWorkerWorkerWorkerconnections openWorkerWorkerWorkerWorkerWorkerWorkers Repeat for all servers -Uses them to ser vice DBI requests 25. An ImplementationStandard apache+mod_perl ApacheWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorker DBI + DBD::*WorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerApache WorkerDatabaseWorkerWorker ServerWorkerWorkerWorker !"#$%#&WorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerWorkerDBI::Gofer::Execute andWorkerWorkerWorkerWorkerDBI::Gofer::Transport::httpWorkers modules implement stateless proxy- DBD::Gofer with http transport 26. Load Balance and Cache Load balancingserver and fail-over Apache running DBI Gofer with40 workersFar fewer db HTTP Load Balancer and CacheconnectionsWorkers xMany = 6000 db 150 webGoferDatabaseconnections serversServers Server servers Apache running DBI GoferWorkersCaching-New middle tier Caching 27. Gofers Future Caching for http transport Optional JSON serialization Caching in DBD::Gofer (now done) Make state-less-ness of transport optional to support transactions Patches welcome! 28. Future: http caching Potential big win DBD::Gofer needs to indicate cache-ability- via appropriate http headers Server side needs to agree- and respond with appropriate http headers Caching then happens just like for web pages- if theres a web cache between client and server Patches welcome! 29. Future: JSON Turns DBI into a web service!- Service Oriented Architecture anyone? Accessible to anything that can talk JSON Clients could be JavaScript, Java, ...- and various languages that begin with P ... Patches welcome! 30. Future: Transactions State-less-ness could be made optional- If transport layer being used agrees- Easiest to implement for stream / ssh Would be enabled by AutoCommit => 0 $dbh->begin_work Patches welcome! 31. Questions?