Embracing Constraints With CouchDB

Presentation given at Dutch PHP Conference 2010. http://joind.in/1651

Page 1: Embracing Constraints With CouchDB

EMBRACING CONSTRAINTS WITH COUCHDB

Page 2: Embracing Constraints With CouchDB

David Zülke

Page 3: Embracing Constraints With CouchDB

David Zuelke

Page 4: Embracing Constraints With CouchDB
Page 5: Embracing Constraints With CouchDB

http://en.wikipedia.org/wiki/File:München_Panorama.JPG

Page 6: Embracing Constraints With CouchDB

Founder

Page 8: Embracing Constraints With CouchDB

Lead Developer

Page 11: Embracing Constraints With CouchDB

http://joind.in/1651

Page 12: Embracing Constraints With CouchDB

A DISCLAIMER FIRST
Before You All Figure This Out Yourselves...

Page 13: Embracing Constraints With CouchDB
Page 14: Embracing Constraints With CouchDB
Page 15: Embracing Constraints With CouchDB

NO NO NO NO

THIS IS FRAUD

Page 16: Embracing Constraints With CouchDB

This talk is not really about embracing constraints

Page 17: Embracing Constraints With CouchDB

I’ll tell you what it’s really about when we’re finished

Page 18: Embracing Constraints With CouchDB

I’ll also apologize to you for lying at that point

Page 19: Embracing Constraints With CouchDB

(it’s always easier to apologize than to ask for permission)

Page 20: Embracing Constraints With CouchDB

COUCHDB IN THREE SLIDES
Full Of DIS IS SRS BSNS Bullet Points

Page 21: Embracing Constraints With CouchDB

COUCHDB STORES DOCUMENTS

• CouchDB stores documents with arbitrary keys and values

• Each document is identified by an ID and has a revision

• Documents can have file attachments

• Stored as JSON, so it’s easy to interface with
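As a rough illustration of these bullets, a document might look like the object below. `_id` and `_rev` are the field names CouchDB uses for the ID and the revision; every other field, and all values, are made up for this sketch.

```javascript
// Hypothetical document; only _id and _rev are CouchDB-defined fields.
const talk = {
  _id: "talk-embracing-constraints",            // made-up ID
  _rev: "1-967a00dff5e02add41819138abb3284d",   // made-up revision string
  type: "talk",
  title: "Embracing Constraints With CouchDB",
  tags: ["couchdb", "nosql"]
};

// Documents are plain JSON, so they round-trip through any JSON library.
const wire = JSON.stringify(talk);
const parsed = JSON.parse(wire);
console.log(parsed.title, parsed.tags.length);
```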

Page 22: Embracing Constraints With CouchDB

COUCHDB SPEAKS HTTP

• CouchDB uses HTTP to communicate with clients & servers

• That means scalability

• That means a lot of kick ass stuff totally for free

• Caching

• Load Balancing

• Content Negotiation
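A minimal sketch of how document operations map onto HTTP verbs. It only builds request descriptions and sends nothing over the network. The base URL (CouchDB's default port 5984 on localhost) and the database name `talks` are assumptions, and the helper names are hypothetical.

```javascript
// Assumed for this sketch: a local server and a database called "talks".
const BASE = "http://localhost:5984";
const DB = "talks";

function docUrl(id) {
  return `${BASE}/${DB}/${encodeURIComponent(id)}`;
}

// Read a document: GET /db/id
function readRequest(id) {
  return { method: "GET", url: docUrl(id) };
}

// Create or update a document: PUT /db/id with the JSON body.
// When updating, the body carries the current _rev.
function writeRequest(doc) {
  return {
    method: "PUT",
    url: docUrl(doc._id),
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(doc)
  };
}

// Delete a document: DELETE /db/id?rev=...
function deleteRequest(id, rev) {
  return { method: "DELETE", url: `${docUrl(id)}?rev=${encodeURIComponent(rev)}` };
}

console.log(readRequest("dpc-2010"));
```

Because these are ordinary HTTP requests, any HTTP client, cache, or load balancer can sit between the application and the database.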

Page 23: Embracing Constraints With CouchDB

COUCHDB USES MVCC

• Multiversion Concurrency Control

• When updating, you must supply a revision number

• Your change will be rejected if the revision is not the latest

• All writes are serialized

• No need for locks, but puts some responsibility on developers
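The revision check can be sketched with a toy in-memory store. This is not CouchDB's implementation: `ToyStore` is a hypothetical class, and revisions are plain counters here, whereas CouchDB uses revision strings and answers a stale write with HTTP 409 Conflict.

```javascript
// Toy in-memory model of revision checking.
class ToyStore {
  constructor() {
    this.docs = new Map(); // id -> { rev: number, body: object }
  }

  // Returns the new revision, or throws if the caller's revision is stale.
  put(id, body, rev) {
    const current = this.docs.get(id);
    const currentRev = current ? current.rev : undefined;
    if (rev !== currentRev) {
      throw new Error("conflict: revision " + rev + " is not the latest");
    }
    const nextRev = (currentRev || 0) + 1;
    this.docs.set(id, { rev: nextRev, body });
    return nextRev;
  }
}

const store = new ToyStore();
const rev1 = store.put("talk", { title: "Draft" });        // create: no revision yet
const rev2 = store.put("talk", { title: "Final" }, rev1);  // update with the latest revision

let rejected = false;
try {
  store.put("talk", { title: "Stale edit" }, rev1);        // rev1 is outdated by now
} catch (e) {
  rejected = true;
}
console.log(rev1, rev2, rejected);
```

The rejected writer is expected to fetch the latest revision, reapply its change, and try again; that is the responsibility the last bullet refers to.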

Page 24: Embracing Constraints With CouchDB

SOME DETAILS
An In-Depth Look At What Makes CouchDB Different

Page 25: Embracing Constraints With CouchDB

CAP

consistency (crossed out)

availability

partition tolerance

Page 26: Embracing Constraints With CouchDB

“So, CouchDB does not have consistency of CAP?”

Page 27: Embracing Constraints With CouchDB

“Booh, that means my data will be inconsistent. Fail!”

Page 28: Embracing Constraints With CouchDB

psssshhh

Page 29: Embracing Constraints With CouchDB

YOUR MOM IS INCONSISTENT

Page 30: Embracing Constraints With CouchDB

CouchDB is eventually consistent

Page 31: Embracing Constraints With CouchDB

When replicating, conflicting revisions will be marked as such

Page 32: Embracing Constraints With CouchDB

These conflicts can then be resolved (users, daemons,...)

Page 33: Embracing Constraints With CouchDB

and everything will be fine \o/

Page 34: Embracing Constraints With CouchDB

which brings us to...

Page 35: Embracing Constraints With CouchDB

REPLICATION

• You can do Master-Master replication

• Conflicts are detected and marked automatically

• Conflicts are supposed to be resolved by applications

• Or by users, who usually know best what to do!
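A toy model of conflict detection and application-level resolution. It is a simplification under stated assumptions, not CouchDB's replication algorithm: each replica holds a single `{ rev, parent, body }` record, the winner rule is a plain string comparison, and the tag-merging rule in `resolve` is made up.

```javascript
// Edits that are not direct successors of one another are treated as conflicting.
function replicate(local, incoming) {
  if (incoming.rev === local.rev) {
    return { winner: local, conflicts: [] };      // already in sync
  }
  if (incoming.parent === local.rev) {
    return { winner: incoming, conflicts: [] };   // incoming is a direct successor
  }
  if (local.parent === incoming.rev) {
    return { winner: local, conflicts: [] };      // local is already newer
  }
  // Divergent edits: pick a deterministic winner, keep the other as a marked conflict.
  const [winner, loser] = local.rev > incoming.rev ? [local, incoming] : [incoming, local];
  return { winner, conflicts: [loser] };
}

// Application-level resolution: a hypothetical rule that merges the tag lists.
function resolve(result) {
  const tags = new Set(result.winner.body.tags);
  for (const c of result.conflicts) c.body.tags.forEach((t) => tags.add(t));
  return { ...result.winner.body, tags: [...tags].sort() };
}

const onLaptop = { rev: "2-aaa", parent: "1-xyz", body: { tags: ["couchdb"] } };
const onServer = { rev: "2-bbb", parent: "1-xyz", body: { tags: ["php"] } };

const merged = replicate(onLaptop, onServer);
console.log(merged.conflicts.length, resolve(merged).tags);
```

Both replicas arrive at the same winner no matter which side runs the comparison, and the losing revision is kept rather than discarded, so an application or a user can still merge it later.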

Page 36: Embracing Constraints With CouchDB

CouchDB is Ground Computing

Page 37: Embracing Constraints With CouchDB

Imagine a world where every computer runs CouchDB

Page 38: Embracing Constraints With CouchDB

Ubuntu One already does, to sync bookmarks etc!

Page 39: Embracing Constraints With CouchDB

MAP/REDUCE

Page 40: Embracing Constraints With CouchDB

BASIC PRINCIPLE: MAPPER

• The Mapper reads records and emits <key, value> pairs

• Example: Apache access.log

• Each line is a record

• Extract client IP address and number of bytes transferred

• Emit IP address as key, number of bytes as value

• For hourly rotating logs, the job can be split across 24 nodes*

* In practice, it’s a lot smarter than that
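The access.log example could be sketched as the mapper below. It assumes plain Common Log Format, where the client IP is the first field and the byte count is the last; `mapLine` and the sample lines are made up for illustration.

```javascript
// Mapper sketch: one log line in, at most one <ip, bytes> pair out.
function mapLine(line, emit) {
  const fields = line.trim().split(/\s+/);
  const ip = fields[0];
  // The byte count is the last field; "-" means no body was sent.
  const bytes = Number.parseInt(fields[fields.length - 1], 10);
  if (ip && !Number.isNaN(bytes)) {
    emit(ip, bytes);
  }
}

// Made-up sample records.
const lines = [
  '212.122.174.13 - - [11/Jun/2010:10:00:01 +0200] "GET /a HTTP/1.1" 200 18271',
  '74.119.8.111 - - [11/Jun/2010:10:00:02 +0200] "GET /b HTTP/1.1" 200 91272',
  '212.122.174.13 - - [11/Jun/2010:10:00:03 +0200] "GET /c HTTP/1.1" 304 -'
];

const pairs = [];
lines.forEach((line) => mapLine(line, (key, value) => pairs.push([key, value])));
console.log(pairs);
```

Each line is mapped independently of every other line, which is what allows the work to be split across nodes.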

Page 41: Embracing Constraints With CouchDB

BASIC PRINCIPLE: REDUCER

• A Reducer is given a key and all values for this specific key

• Even if there are many Mappers on many computers, the results are aggregated before they are handed to Reducers

• Example: Apache access.log

• The Reducer is called once for each client IP (that’s our key), with a list of values (transferred bytes)

• We simply sum up the bytes to get the total traffic per IP!

Page 42: Embracing Constraints With CouchDB

EXAMPLE OF MAPPED INPUT

IP Bytes

212.122.174.13 18271

212.122.174.13 191726

212.122.174.13 198

74.119.8.111 91272

74.119.8.111 8371

212.122.174.13 43

Page 43: Embracing Constraints With CouchDB

REDUCER WILL RECEIVE THIS

IP Bytes

212.122.174.13 18271

212.122.174.13 191726

212.122.174.13 198

212.122.174.13 43

74.119.8.111 91272

74.119.8.111 8371

Page 44: Embracing Constraints With CouchDB

AFTER REDUCTION

IP Bytes

212.122.174.13 210238

74.119.8.111 99643
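The three steps shown on these slides (mapped input, grouping by key, reduction) can be reproduced with a short sketch. The data is taken from the slides; `groupByKey` and `reduceBytes` are hypothetical helper names, not part of any framework.

```javascript
// Mapped <ip, bytes> pairs, in the order the mappers produced them.
const mapped = [
  ["212.122.174.13", 18271],
  ["212.122.174.13", 191726],
  ["212.122.174.13", 198],
  ["74.119.8.111", 91272],
  ["74.119.8.111", 8371],
  ["212.122.174.13", 43]
];

// Grouping step: collect all values that share a key.
function groupByKey(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reducer: called once per key with every value for that key.
function reduceBytes(key, values) {
  return values.reduce((total, n) => total + n, 0);
}

const totals = {};
for (const [ip, values] of groupByKey(mapped)) {
  totals[ip] = reduceBytes(ip, values);
}
console.log(totals);
```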

Page 45: Embracing Constraints With CouchDB

COUCHDB INCREMENTAL MAPREDUCE

Page 46: Embracing Constraints With CouchDB

THE KEY DIFFERENCE

• Maps and Reduces are incremental:

• If one document changes, only that one document needs:

• mapping

• reduction

• Then a few new reduce runs are performed to compute the final result
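A toy sketch of the incremental idea: the rows each document emitted are cached, so a change re-maps only the changed document. `ToyView` is a hypothetical class and not CouchDB's B-tree index. Note that only the map step is incremental here; `reduce()` simply re-sums the cached rows, whereas CouchDB also reuses stored partial reductions.

```javascript
class ToyView {
  constructor(mapFn) {
    this.mapFn = mapFn;
    this.rowsByDoc = new Map(); // doc id -> rows emitted for that doc
    this.mapCalls = 0;          // counts how often the map function ran
  }

  // Call this for a new or changed document; other documents are left untouched.
  update(doc) {
    const rows = [];
    this.mapCalls += 1;
    this.mapFn(doc, (key, value) => rows.push([key, value]));
    this.rowsByDoc.set(doc._id, rows); // replaces rows from the old revision
  }

  // Sums values per key across the cached rows.
  reduce() {
    const totals = {};
    for (const rows of this.rowsByDoc.values()) {
      for (const [key, value] of rows) totals[key] = (totals[key] || 0) + value;
    }
    return totals;
  }
}

const view = new ToyView((doc, emit) => (doc.tags || []).forEach((t) => emit(t, 1)));
view.update({ _id: "a", tags: ["couchdb", "php"] });
view.update({ _id: "b", tags: ["couchdb"] });
view.update({ _id: "a", tags: ["php"] }); // only "a" is mapped again
console.log(view.mapCalls, view.reduce());
```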

Page 47: Embracing Constraints With CouchDB

MAPPER: DOCS BY TAGS

function(doc) {
    if (doc.type == 'talk') {
        (doc.tags || []).forEach(function(tag) {
            emit(tag, doc);
        });
    }
}

Page 48: Embracing Constraints With CouchDB

MAPREDUCE: COUNT TAGS

function(doc) {
    if (doc.type == 'talk') {
        (doc.tags || []).forEach(function(tag) {
            emit(tag, 1);
        });
    }
}

function(key, values) {
    return sum(values);
}
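To try the tag-counting pair outside CouchDB, a small harness can stand in for the view server. `emit` and `sum` below are simplified stand-ins for the helpers CouchDB provides, the sample documents are made up, and the grouping loop only approximates a grouped view query (in CouchDB, `?group=true`); re-reduce is not modelled.

```javascript
// Simplified stand-ins for the view server's helpers.
const rows = [];
function emit(key, value) { rows.push({ key, value }); }
function sum(values) { return values.reduce((a, b) => a + b, 0); }

// Map and reduce functions as on the slide, given names so they can be called.
function map(doc) {
  if (doc.type == 'talk') {
    (doc.tags || []).forEach(function (tag) {
      emit(tag, 1);
    });
  }
}
function reduce(key, values) {
  return sum(values);
}

// Made-up sample documents.
const docs = [
  { _id: "1", type: "talk", tags: ["couchdb", "php"] },
  { _id: "2", type: "talk", tags: ["couchdb"] },
  { _id: "3", type: "speaker", tags: ["php"] }, // ignored: not a talk
  { _id: "4", type: "talk" }                    // no tags: emits nothing
];
docs.forEach((doc) => map(doc));

// Group rows by key, then reduce each group.
const grouped = {};
rows.forEach(({ key, value }) => {
  (grouped[key] = grouped[key] || []).push(value);
});
const counts = {};
Object.keys(grouped).forEach((key) => {
  counts[key] = reduce(key, grouped[key]);
});
console.log(counts);
```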

Page 49: Embracing Constraints With CouchDB

LUCENE INTEGRATION
Full Control Over What Is Indexed, And How

Page 50: Embracing Constraints With CouchDB

COUCHAPP
Python Tool For Development And Deployment

Page 51: Embracing Constraints With CouchDB

DEMO TIME
Let’s Relax On The Couch

Page 52: Embracing Constraints With CouchDB

The End

Page 53: Embracing Constraints With CouchDB

FURTHER READING

• http://books.couchdb.org/

• http://couchdb.apache.org/

• http://github.com/couchapp/couchapp

• http://github.com/rnewson/couchdb-lucene/

• http://janl.github.com/couchdbx/

• http://j.mp/oqbQs (E4X in CouchDB for XML parsing)

Page 54: Embracing Constraints With CouchDB

DID YOU SEE THE HEAD FAKE?
This Talk Was Not About Embracing Constraints

It Was About Embracing Awesomeness

Page 55: Embracing Constraints With CouchDB

Questions?

Page 56: Embracing Constraints With CouchDB

THANK YOU!
This was http://joind.in/1651 by @dzuelke