22
SCALING COUCHDB WITH BIGCOUCH a. kocoloski O’Reilly Webcast October 2010

Scaling CouchDB with BigCouch

Embed Size (px)

Citation preview

Page 1: Scaling CouchDB with BigCouch

SCALING COUCHDB WITH BIGCOUCH

a. kocoloski

O’Reilly WebcastOctober 2010

Page 2: Scaling CouchDB with BigCouch

2

• Introductions and plugs

• Brief intro to CouchDB

• BigCouch Overview

• Building and Using a CouchDB Cluster

• BigCouch Internals

• Future Plans and How you can Help

OUTLINE

Page 3: Scaling CouchDB with BigCouch

3

INTRODUCTIONS

Cloudant CTOCouchDB Committer Since 2008

PhD physics MIT 2010

BigCouchPutting the “C” back in CouchDB

Open Core: 2 years Development

[email protected]

kocolosk in #cloudant or #couchdb@kocolosk

Hashtag of the Day: #bigcouch

Page 4: Scaling CouchDB with BigCouch

4

PLUGS

Hosting, Management, Support for CouchDB and BigCouch

http://cloudant.com

Page 5: Scaling CouchDB with BigCouch

5

COUCHDB IN A SLIDE

• Schema-free document database management systemDocuments are JSON objectsAble to store binary attachments

• RESTful APIhttp://wiki.apache.org/couchdb/reference

• Views: Custom, persistent representations of your dataIncremental MapReduce with results persisted to diskFast querying by primary key (views stored in a B-tree)

• Bi-Directional ReplicationMaster-slave and multi-master topologies supportedOptional ‘filters’ to replicate a subset of the dataEdge devices (mobile phones, sensors, etc.)

• Futon Web Interface

Page 6: Scaling CouchDB with BigCouch

6

WHY BIGCOUCH?

“CouchDB is built of the web... This is what software of the future looks

like” -J. Kaplan-Moss

CouchDB is Awesome

...But somewhat incompleteClusterOfUntrustedCommodityHardware

“CouchDB is not a distributed database” -J. Ellis

“Without the Clustering, it’s just OuchDB”

Page 7: Scaling CouchDB with BigCouch

7

WHAT WE TALK ABOUT WHEN WE TALK ABOUT SCALING

• Horizontal scaling: more servers creates more capacity

• Transparent to the application: adding more capacity should not a!ect the business logic of the application.

• No single point of failure.

http://adam.heroku.com/past/2009/7/6/sql_databases_dont_scale/

Pseudo Scalars

Page 8: Scaling CouchDB with BigCouch

8

BIGCOUCH = COUCH+SCALING• Horizontal Scalability

Easily add storage capacity by adding more serversComputing power (views, compaction, etc.) scales with more servers

• No SPOFAny node can handle any requestIndividual nodes can come and go

• Transparent to the ApplicationAll clustering operations take place “behind the curtain”‘looks’ like a single server instance of Couch, just with more awesomeasterisks and caveats discussed later

Page 9: Scaling CouchDB with BigCouch

9

GRAPHICAL REPRESENTATION

hash(blah) = E

Load Balancer

PUT http://kocolosk.cloudant.com/dbname/blah?w=2

N=3W=2R=2

Node 1

A B C DNode 2B

CD

E

Node 3

C

D

E

F

Node 4

D

E

F

G

Node 24

XY

ZA

• Clustering in a ring (a la Dynamo)

• Any node can handle a request

• O(1) lookup• Quorum system (N, R, W)• Views distributed like

documents• Distributed Erlang• Masterless

Page 10: Scaling CouchDB with BigCouch

10

• Shopping List3 networked macbook prosUsual CouchDB DependenciesBigCouch Code

• http://github.com/cloudant/bigcouch

BUILDING YOUR FIRST CLUSTER

Page 11: Scaling CouchDB with BigCouch

11

BUILDING YOUR FIRST CLUSTER

foo.example.com bar.example.com baz.example.com

Build and Start BigCouch

Pick one node and add the others to the local “nodes” DB

Make sure they all agree on the magic cookie (rel/etc/vm.args)

Page 12: Scaling CouchDB with BigCouch

12

QUORUM: IT’S YOUR FRIEND• BigCouch databases are governed by 4 parameters

Q: Number of shardsN: Number of redundant copies of each shardR: Read quorum constantW: Write quorum constant(NB: Also consider the number of nodes in a cluster)

For the next few examples, consider a 5

node cluster

1

2

34

5

Page 13: Scaling CouchDB with BigCouch

13

Q• Q: The number of shards over which a DB will be spread

consistent hashing space divided into Q piecesSpecified at DB creation timepossible for more than one shard to live on a nodeDocuments deterministically mapped to a shard

Q=1 Q=4

1

2

34

5

Page 14: Scaling CouchDB with BigCouch

14

N• N: The number of redundant copies of each document

Choose N>1 for fault-tolerant clusterSpecified at DB creationEach shard is copied N timesRecommend N>2

1

2

34

5

N=3

Page 15: Scaling CouchDB with BigCouch

15

W• W: The number of document copies that must be saved

before a document is “written”W must be less than or equal to NW=1, maximize throughputW=N, maximize consistencyAllow for “201” created responseCan be specified at write time

1

2

34

5

W=2‘201 Created’

Page 16: Scaling CouchDB with BigCouch

16

R• R: The number of identical document copies that must be

read before a read request is okR must be less than or equal to NR=1, minimize latencyR=N, maximize consistencyCan be specified at query timeInconsistencies are automatically repaired

1

2

34

5

R=2

Page 17: Scaling CouchDB with BigCouch

VIEWS• So far, so good, but what about secondary indexes?

Views are built locally on each node, for each DB shardMergesort at query time using exactly one copy of each shardRun a final rereduce on each row if a the view has a reduce

• _changes feed works similarly, but hasno global orderingSequence numbers convertedto strings to encode moreinformation

17

1

2

34

5

Page 18: Scaling CouchDB with BigCouch

18

HACKER PORTIONThe BigCouch Stack

CHTTPD

Fabric

Rexi Mem3

Embedded CouchDBMochiweb, Spidermonkey, etc.

Page 19: Scaling CouchDB with BigCouch

19

MEM3

CHTTPD

Fabric

Rexi Mem3

Embedded CouchDB

• Maintains the shard mapping for each clustered database in a node-local CouchDB database

• Changes in the node registration and shard mapping databases are automatically replicated to all cluster nodes

Page 20: Scaling CouchDB with BigCouch

20

REXI

CHTTPD

Fabric

Rexi Mem3

Embedded CouchDB

• BigCouch makes a large number of parallel RPCs

• Erlang RPC library not designed for heavy parallelismpromiscuous spawning of processesresponses directed back through single process on remote noderequests block until remote ‘rex’ process is monitored

• Rexi removes some of the safeguardsin exchange for lower latenciesno middlemen on the local node remote process responds directly to clientremote process monitoring occursout-of-band

Page 21: Scaling CouchDB with BigCouch

21

FABRIC / CHTTPD

CHTTPD

Fabric

Rexi Mem3

Embedded CouchDB

• FabricOTP library application (no processes) responsible for clustered versions of CouchDB core API callsQuorum logic, view merging, etc.Provides a clean Erlang interface to BigCouchNo HTTP awareness

• ChttpdCut-n-paste of couch_httpd, but usingfabric for all data access

Page 22: Scaling CouchDB with BigCouch

22

SUMMARY

• BigCouch: putting the ‘C’ back in CouchDB

• http://github.com/cloudant/bigcouch

• https://cloudant.com/future