Transcript
Page 1: Non Relational Databases And World Domination

Non-Relational Databases and World Domination

Jason Davies

Thursday, 3 December 2009

Page 2: Non Relational Databases And World Domination

Overview

• Relational vs. Non-Relational

• Why Switch?

• Non-Relational Solutions

• Document Databases

• Key Value Stores

• CouchDB Features


Page 3: Non Relational Databases And World Domination

Relational Databases


Page 4: Non Relational Databases And World Domination

Relational Databases

• Relational algebra: union, intersection, difference, cartesian product

• Easy to perform dynamic queries

• Fixed Schemas

• Normalisation


Page 5: Non Relational Databases And World Domination

Non-Relational Databases

• Everything else!

• Myriad of features, including:

• Key-Value stores with external indexers

• Schemaless

• RESTful APIs


Page 6: Non Relational Databases And World Domination

CAP Theorem

• Three requirements for applications in a distributed environment:

• Consistency

• Availability

• Partition tolerance

• Pick two


Page 7: Non Relational Databases And World Domination

Why Switch?

• Data structure

• Scalability

• The New Cool


Page 8: Non Relational Databases And World Domination

Data Structure Symptoms


Page 9: Non Relational Databases And World Domination

Sparse Data

• Tables with many columns, only a few being used by any particular row


Page 10: Non Relational Databases And World Domination

Attribute Tables

• Each row is (fkey, att_name, att_value)
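A minimal JavaScript sketch of why attribute tables are a symptom: the rows have to be folded back together before they look like the entity they describe. The sample rows and the `rowsToDocuments` helper are hypothetical and only illustrate the shape of the problem.

```javascript
// Hypothetical attribute-table rows: one row per (fkey, att_name, att_value).
const rows = [
  { fkey: 1, att_name: "name", att_value: "Darth Vader" },
  { fkey: 1, att_name: "age", att_value: "63" },
  { fkey: 2, att_name: "name", att_value: "Yoda" },
];

// Fold the rows into one plain object per fkey.
function rowsToDocuments(rows) {
  const docs = new Map();
  for (const { fkey, att_name, att_value } of rows) {
    if (!docs.has(fkey)) docs.set(fkey, { _id: String(fkey) });
    docs.get(fkey)[att_name] = att_value;
  }
  return [...docs.values()];
}

const documents = rowsToDocuments(rows);
console.log(JSON.stringify(documents));
```

A document database stores each of the resulting objects directly, so this folding step disappears.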


Page 11: Non Relational Databases And World Domination

Data Dumps

• Given up on using columns for structured data

• Instead simply serialising it (JSON, YAML, XML, etc.) and dumping strings to the database


Page 12: Non Relational Databases And World Domination

Too Many Joins

• Schemas involving large numbers of many-to-many join tables or tree-like structures


Page 13: Non Relational Databases And World Domination

Frequent Schema Changes

• May be fine for small databases

• Can be tedious

• Rebuilding indexes is slow for millions of rows


Page 14: Non Relational Databases And World Domination

Scalability


Page 17: Non Relational Databases And World Domination

Write Capacity

• If read capacity is the problem, then set up master-slave replication


Page 18: Non Relational Databases And World Domination

Too Much Data

• Too much for one server to hold

• Hard to shard the data sensibly


Page 19: Non Relational Databases And World Domination

Non-Relational Solutions


Page 20: Non Relational Databases And World Domination

Diverse Ecosystem

• Column-oriented databases

• Document-oriented databases

• Key value stores

• Graph-oriented databases

• Distributed databases

• MapReduce


Page 21: Non Relational Databases And World Domination

BigTable

• “a sparse, distributed multi-dimensional sorted map”

• Designed to scale into the petabyte range

• HBase (Java, Hadoop)

• Hypertable

• Cassandra (Facebook, based on Amazon’s Dynamo)


Page 22: Non Relational Databases And World Domination

Document Databases

• Arbitrary number of “sparse” attributes per document

• Documents often map well to JSON e.g. in CouchDB

• Cons: usually can’t perform joins or transactions spanning multiple documents


Page 23: Non Relational Databases And World Domination

Graph Databases

• Good for highly interconnected data

• Focus on the relationships between items

• Optimised for querying transitive relationships i.e. variable length chains of joins

• Neo4J, AllegroGraph, Sesame
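A transitive relationship query ("everyone reachable through any number of hops") can be sketched as a breadth-first traversal. The `follows` data and the `reachableFrom` helper below are hypothetical. A graph database runs this kind of traversal natively, whereas SQL would need a join per hop.

```javascript
// Hypothetical "follows" edges stored as an adjacency list.
const follows = {
  alice: ["bob"],
  bob: ["carol"],
  carol: ["dave"],
  dave: [],
};

// Breadth-first traversal: everyone reachable through any number of hops.
function reachableFrom(graph, start) {
  const seen = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of graph[node] || []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  seen.delete(start); // the start node is not "reachable from" itself here
  return [...seen];
}

const result = reachableFrom(follows, "alice");
console.log(result);
```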


Page 24: Non Relational Databases And World Domination

Distributed K-V Stores

• Giant hash table/dictionary

• Mainly solve data scalability problems

• Transparently partition and replicate data

• Cons:

• eventual consistency or other distributed transaction protocols

• hard to do integrity constraints, hard to catch application bugs
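"Transparently partition and replicate" can be sketched with a hash that picks a primary node and copies each value to its successors. This is a toy model under stated assumptions: `TinyCluster`, `hashKey` and the node and replica counts are hypothetical, and real stores add membership changes, rebalancing and consistency protocols on top.

```javascript
// Simple string hash (djb2 variant); any stable hash would do for this sketch.
function hashKey(key) {
  let h = 5381;
  for (let i = 0; i < key.length; i++) {
    h = (h * 33 + key.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return h;
}

class TinyCluster {
  constructor(nodeCount, replicas) {
    this.nodes = Array.from({ length: nodeCount }, () => new Map());
    this.replicas = replicas;
  }
  // Indices of the nodes responsible for a key: a primary plus its successors.
  nodesFor(key) {
    const first = hashKey(key) % this.nodes.length;
    return Array.from({ length: this.replicas }, (_, i) => (first + i) % this.nodes.length);
  }
  put(key, value) {
    for (const n of this.nodesFor(key)) this.nodes[n].set(key, value);
  }
  get(key) {
    // Read from the first replica that has the key.
    for (const n of this.nodesFor(key)) {
      if (this.nodes[n].has(key)) return this.nodes[n].get(key);
    }
    return undefined;
  }
}

const cluster = new TinyCluster(4, 2);
cluster.put("user:42", { name: "Alice" });
console.log(cluster.nodesFor("user:42"), cluster.get("user:42"));
```

The caller only ever sees `put` and `get`. Which nodes hold the data stays hidden, which is also why integrity constraints across keys are hard to enforce.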


Page 25: Non Relational Databases And World Domination

Distributed K-V Stores

• Scalaris, Dynomite, Ringo: data consistency

• MemcacheDB, Tokyo Cabinet: low latency


Page 26: Non Relational Databases And World Domination

Apache CouchDB


Page 27: Non Relational Databases And World Domination

CouchDB and Ruby

require 'couchrest'

# with !, it creates the database if it doesn't already exist
@db = CouchRest.database!("http://127.0.0.1:5984/couchrest-test")
response = @db.save_doc({ :key => 'value', 'another key' => 'another value' })
doc = @db.get(response['id'])
puts doc.inspect


Page 28: Non Relational Databases And World Domination

CouchDB and Ruby

@db.bulk_save([
  {"wild" => "and random"},
  {"mild" => "yet local"},
  {"another" => ["set","of","keys"]}
])
# returns ids and revs of the current docs
puts @db.documents.inspect


Page 29: Non Relational Databases And World Domination

CouchDB and Ruby

@db.save_doc({
  "_id" => "_design/first",
  :views => {
    :test => {
      :map => "function(doc){for(var w in doc){ if(!w.match(/^_/))emit(w,doc[w])}}"
    }
  }
})
puts @db.view('first/test')['rows'].inspect


Page 30: Non Relational Databases And World Domination

CouchDB and Ruby

• Read more about CouchRest on GitHub

• Also check out newcomer RubyAqua


Page 31: Non Relational Databases And World Domination

Features

• Schema-Free (JSON)

• Document-Oriented, Not Relational

• Highly Concurrent

• RESTful HTTP API

• JavaScript-Powered Map/Reduce

• N-Master Replication

• Robust Storage


Page 33: Non Relational Databases And World Domination

Documents

http://www.flickr.com/photos/stilleben2001/223243329/


Page 34: Non Relational Databases And World Domination

Schema-Free (JSON)

{
  "_id": "BCCD12CBB",
  "_rev": "AB764C",
  "type": "person",
  "name": "Darth Vader",
  "age": 63,
  "headware": ["Helmet", "Sombrero"],
  "dark_side": true
}



Page 40: Non Relational Databases And World Domination

Document-Oriented, Not Relational

• Documents in the Real World™

• Bills, letters, tax forms…

• Same type != same structure

• Can be out of date (so what?)

• No references

Natural Data Behaviour


Page 45: Non Relational Databases And World Domination

Highly Concurrent

• Functional languages highly appropriate for parallelism

• Lightweight “processes” and message-passing; “shared-nothing”

• Easy to create fault-tolerant systems


Page 46: Non Relational Databases And World Domination

MVCC

• Multiversion Concurrency Control

• Reads: lock-free; never block

• Potential for massive horizontal scaling

• Writes: all-or-nothing

• Success

• Fail: conflict error, fetch and try again
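The "conflict error, fetch and try again" cycle can be sketched with an in-memory store that rejects writes carrying a stale revision. `VersionedStore` and `updateWithRetry` are hypothetical stand-ins, and integer revisions are a simplification of CouchDB's `_rev` strings.

```javascript
// In-memory stand-in for a database that enforces revision checks on write.
class VersionedStore {
  constructor() {
    this.docs = new Map();
  }
  get(id) {
    const doc = this.docs.get(id);
    return doc ? { ...doc } : undefined; // hand out a copy; reads never block
  }
  // All-or-nothing write: succeeds only if the caller saw the latest revision.
  put(doc) {
    const current = this.docs.get(doc._id);
    const currentRev = current ? current._rev : undefined;
    if (doc._rev !== currentRev) {
      throw new Error("conflict");
    }
    const nextRev = (currentRev || 0) + 1;
    this.docs.set(doc._id, { ...doc, _rev: nextRev });
    return nextRev;
  }
}

// "Fetch and try again": re-read the document and reapply the change on conflict.
function updateWithRetry(store, id, change, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = store.get(id);
    try {
      return store.put(change(doc));
    } catch (err) {
      if (err.message !== "conflict" || attempt === maxAttempts) throw err;
    }
  }
}

const store = new VersionedStore();
store.put({ _id: "counter", value: 0 }); // creates revision 1
const stale = store.get("counter"); // a slow client reads revision 1
updateWithRetry(store, "counter", (d) => ({ ...d, value: d.value + 1 })); // revision 2

let staleWriteFailed = false;
try {
  store.put({ ...stale, value: 100 }); // still carries revision 1
} catch (err) {
  staleWriteFailed = err.message === "conflict";
}
console.log(staleWriteFailed, store.get("counter"));
```

No lock is taken at any point. The writer that loses the race simply learns about it and retries against the newer revision.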


Page 48: Non Relational Databases And World Domination

RESTful HTTP API

CRUD

• Create: HTTP PUT /db/mydocid

• Read: HTTP GET /db/mydocid

• Update: HTTP PUT /db/mydocid

• Delete: HTTP DELETE /db/mydocid
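A sketch of the four CRUD requests as plain data, so the mapping can be inspected without a running server. The `couchRequest` helper is hypothetical. Passing the current revision on update and delete is not shown on the slide; it reflects how CouchDB's document API guards against lost updates.

```javascript
// Builds a description of the HTTP request for each CRUD operation.
// Nothing is sent; pass the result to the HTTP client of your choice.
function couchRequest(op, db, id, doc = {}, rev) {
  const path = `/${encodeURIComponent(db)}/${encodeURIComponent(id)}`;
  switch (op) {
    case "create":
      return { method: "PUT", path, body: JSON.stringify(doc) };
    case "read":
      return { method: "GET", path };
    case "update":
      // An update is a PUT whose body carries the revision being replaced.
      return { method: "PUT", path, body: JSON.stringify({ ...doc, _rev: rev }) };
    case "delete":
      // A delete names the revision to remove in the query string.
      return { method: "DELETE", path: `${path}?rev=${encodeURIComponent(rev)}` };
    default:
      throw new Error(`unknown operation: ${op}`);
  }
}

console.log(couchRequest("create", "db", "mydocid", { type: "person" }));
console.log(couchRequest("delete", "db", "mydocid", {}, "1-abc"));
```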


Page 49: Non Relational Databases And World Domination

RESTful HTTP API Example

couch = CouchRest.database!("http://127.0.0.1:5984/tweets")
tweets_url = "http://twitter.com/statuses/user_timeline.json"
# http stands for any HTTP client whose get returns the parsed JSON array
tweets = http.get(tweets_url)
couch.bulk_save(tweets)


Page 50: Non Relational Databases And World Domination

Cacheability

• Both documents and views return ETags

• Clients send If-None-Match

• CouchDB responds with 304 Not Modified and bypasses potentially expensive lookup

• Can use Varnish/Squid as caching proxy

• Proxy-friendly
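The ETag / If-None-Match exchange can be sketched with a toy request handler. `handleGet` is hypothetical, and deriving the ETag from the document's `_rev` is an assumption made for this sketch. The revision "CD123F" is made up.

```javascript
// Toy server: the ETag is derived from the document's revision.
function handleGet(doc, requestHeaders = {}) {
  const etag = `"${doc._rev}"`;
  if (requestHeaders["If-None-Match"] === etag) {
    // Client copy is current: skip the body entirely.
    return { status: 304, headers: { ETag: etag } };
  }
  return { status: 200, headers: { ETag: etag }, body: JSON.stringify(doc) };
}

const doc = { _id: "BCCD12CBB", _rev: "AB764C", name: "Darth Vader" };

const first = handleGet(doc); // no validator sent: full response
const second = handleGet(doc, { "If-None-Match": first.headers.ETag }); // unchanged
const third = handleGet({ ...doc, _rev: "CD123F" }, { "If-None-Match": first.headers.ETag }); // changed

console.log(first.status, second.status, third.status);
```

A caching proxy such as Varnish or Squid can make the same comparison on the client's behalf, which is what makes the API proxy-friendly.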


Page 52: Non Relational Databases And World Domination

JavaScript-Powered Map/Reduce

• Map functions extract data from your documents

• Reduce functions aggregate intermediate values

• The kicker: Incremental B-tree storage


Page 53: Non Relational Databases And World Domination

http://horicky.blogspot.com/2008/10/couchdb-implementation.html

Page 54: Non Relational Databases And World Domination

Map/Reduce Views

Docs

{"user": "Chris", "points": 3}
{"user": "Joe", "points": 10}
{"user": "Alice", "points": 5}
{"user": "Mary", "points": 9}
{"user": "Bob", "points": 7}

Map

function(doc) {
  if (doc.user && doc.points) {
    emit(doc.user, doc.points);
  }
}

{"key": "Alice", "value": 5}
{"key": "Bob", "value": 7}
{"key": "Chris", "value": 3}
{"key": "Joe", "value": 10}
{"key": "Mary", "value": 9}

Reduce

function(keys, values, rereduce) {
  return sum(values);
}

Alice ... Chris: 15

Everyone: 34
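The slide's numbers can be reproduced outside CouchDB with a small simulation. The `buildView` and `queryReduce` helpers are hypothetical, `emit` is passed in as a parameter instead of being a global, and `sum` is spelled out. Real views keep the sorted rows in an incrementally updated B-tree rather than rebuilding them per query.

```javascript
const docs = [
  { user: "Chris", points: 3 },
  { user: "Joe", points: 10 },
  { user: "Alice", points: 5 },
  { user: "Mary", points: 9 },
  { user: "Bob", points: 7 },
];

// Map function from the slide, with emit passed in instead of being a global.
function map(doc, emit) {
  if (doc.user && doc.points) {
    emit(doc.user, doc.points);
  }
}

// Reduce function from the slide, with the sum written out.
function reduce(keys, values, rereduce) {
  return values.reduce((total, v) => total + v, 0);
}

// Build the view: run map over every document, then sort rows by key.
function buildView(docs) {
  const rows = [];
  for (const doc of docs) {
    map(doc, (key, value) => rows.push({ key, value }));
  }
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Reduce over an inclusive key range, or over all rows when no range is given.
function queryReduce(rows, startkey, endkey) {
  const selected = rows.filter(
    (r) =>
      (startkey === undefined || r.key >= startkey) &&
      (endkey === undefined || r.key <= endkey)
  );
  return reduce(selected.map((r) => r.key), selected.map((r) => r.value), false);
}

const rows = buildView(docs);
console.log(queryReduce(rows, "Alice", "Chris")); // 15
console.log(queryReduce(rows)); // 34
```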


Page 57: Non Relational Databases And World Domination

Render Views as HTML

lists/index.js

/drl/_list/sofa/index/recent-posts?descending=true&limit=8


Page 58: Non Relational Databases And World Domination

Server-Side JavaScript

• _show for transforming documents

• _list for transforming views

• _update for transforming PUTs/POSTs

• Code-sharing between client and server

• Easy deployment
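A rough sketch of what a `_show` function looks like: it takes a document and the request and returns a response. The name `showPerson` and the HTML layout are hypothetical. In a design document the function would be anonymous and stored as a string. Here it is called directly so it runs anywhere.

```javascript
// A show function receives the document and the request, and returns a response.
function showPerson(doc, req) {
  const escape = (s) =>
    String(s).replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  return {
    headers: { "Content-Type": "text/html" },
    body: `<h1>${escape(doc.name)}</h1><p>Age: ${escape(doc.age)}</p>`,
  };
}

// Called directly here; inside CouchDB the server would invoke it for a _show request.
const response = showPerson({ name: "Darth Vader", age: 63 }, {});
console.log(response.body);
```

Because the function is plain JavaScript, the same rendering code can also run in the browser, which is the code-sharing point above.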


Page 60: Non Relational Databases And World Domination

Replication

• Incremental

• Near-real-time

• Clustered mirrors

• Scheduled

• Ad-hoc
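"Incremental" means a replication run only transfers what changed since the previous run. A simplified model, assuming each database keeps an ordered change log and the replicator remembers a checkpoint: the `Db` class and `replicate` function are hypothetical, and the sketch ignores revisions and conflicts.

```javascript
// Database that records every write in an ordered change log.
class Db {
  constructor() {
    this.docs = new Map();
    this.changes = []; // entries of { seq, id }
    this.seq = 0;
  }
  save(doc) {
    this.docs.set(doc._id, doc);
    this.seq += 1;
    this.changes.push({ seq: this.seq, id: doc._id });
  }
}

// Copy only the documents changed since the last checkpoint.
// Returns the new checkpoint and how many documents were transferred.
function replicate(source, target, since = 0) {
  const changedIds = new Set(
    source.changes.filter((c) => c.seq > since).map((c) => c.id)
  );
  for (const id of changedIds) target.save(source.docs.get(id));
  return { checkpoint: source.seq, transferred: changedIds.size };
}

const a = new Db();
const b = new Db();
a.save({ _id: "x", v: 1 });
a.save({ _id: "y", v: 1 });
const run1 = replicate(a, b); // transfers x and y
a.save({ _id: "x", v: 2 });
const run2 = replicate(a, b, run1.checkpoint); // transfers only x
console.log(run1, run2);
```

Running `replicate` on a timer gives scheduled replication, and running it on demand gives ad-hoc replication.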


Page 61: Non Relational Databases And World Domination

http://www.flickr.com/photos/mcpig/872293700/

“Ground Computing” (@jhuggins)

• Local to the user, more like desktop web than like Gears

• Local HTTP server

• Browser apps

• Same application on the client and server or the cloud


Page 62: Non Relational Databases And World Domination

http://www.flickr.com/photos/hercwad/2290378571/


Page 63: Non Relational Databases And World Domination

Latency Sucks

speed of light

drawback to cloud computing


Page 64: Non Relational Databases And World Domination


Stuart Langridge - Canonical


Page 75: Non Relational Databases And World Domination

Conflicts


Page 76: Non Relational Databases And World Domination

Conflict resolution by example

[Animated diagram spanning pages 76 to 86, showing two nodes labelled A and B]


Page 88: Non Relational Databases And World Domination

Robust Storage

Append-Only File Structure

Designed to Crash

Instant-On
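Why an append-only file can be "designed to crash" and still start instantly, as a toy model: updates are appended rather than overwritten, so an interrupted write can only damage the tail. `AppendOnlyLog` is hypothetical and the `complete` flag stands in for whatever marks a fully written record. CouchDB's actual file format is not described in these slides.

```javascript
// Append-only store: updates never overwrite, they append a new record.
class AppendOnlyLog {
  constructor() {
    this.records = []; // stands in for the bytes of the database file
  }
  write(id, value) {
    this.records.push({ id, value, complete: true });
  }
  // Simulate a crash part-way through a write: the last record is left incomplete.
  writeAndCrash(id, value) {
    this.records.push({ id, value, complete: false });
  }
  // Recovery: drop incomplete records from the tail; older data needs no repair.
  recover() {
    while (this.records.length && !this.records[this.records.length - 1].complete) {
      this.records.pop();
    }
  }
  // The latest complete record for an id wins.
  read(id) {
    for (let i = this.records.length - 1; i >= 0; i--) {
      if (this.records[i].id === id && this.records[i].complete) return this.records[i].value;
    }
    return undefined;
  }
}

const log = new AppendOnlyLog();
log.write("doc", "v1");
log.writeAndCrash("doc", "v2"); // power cut mid-write
log.recover();
console.log(log.read("doc"));
```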


Page 89: Non Relational Databases And World Domination

Robust

- When Britain is burning - Enda Farrell - BBC


Page 91: Non Relational Databases And World Domination

Thanks!

www.jasondavies.com

@jasondavies
