Non-Relational Databases and World Domination Jason Davies Thursday, 3 December 2009


NoSQL is apparently all the rage these days, but what does it really mean, and what technologies are out there? When should you use a non-relational database? How do you decide which one to use to achieve world domination? And how do you use CouchDB with Ruby on Rails?


Page 1: Non Relational Databases And World Domination

Non-Relational Databases

and World Domination

Jason Davies

Thursday, 3 December 2009

Page 2: Non Relational Databases And World Domination

Overview

• Relational vs. Non-Relational
• Why Switch?
• Non-Relational Solutions
• Document Databases
• Key Value Stores
• CouchDB Features

Page 3: Non Relational Databases And World Domination

Relational Databases


Page 4: Non Relational Databases And World Domination

Relational Databases

• Relational algebra: union, intersection, difference, Cartesian product
• Easy to perform dynamic queries
• Fixed Schemas
• Normalisation

Page 5: Non Relational Databases And World Domination

Non-Relational Databases

• Everything else!
• Myriad of features, including:
  • Key-Value stores with external indexers
  • Schemaless
  • RESTful APIs

Page 6: Non Relational Databases And World Domination

CAP Theorem

• Three requirements for applications in a distributed environment:
  • Consistency
  • Availability
  • Partition tolerance
• Pick two

Page 7: Non Relational Databases And World Domination

Why Switch?

• Data structure
• Scalability
• The New Cool

Page 8: Non Relational Databases And World Domination

Data Structure Symptoms


Page 9: Non Relational Databases And World Domination

Sparse Data

• Tables with many columns, only a few being used by any particular row

Page 10: Non Relational Databases And World Domination

Attribute Tables

• Each row is (fkey, att_name, att_value)
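To make the attribute-table symptom concrete, here is a minimal Ruby sketch. The rows and the `rows_to_documents` helper are hypothetical and not from the talk. It folds (fkey, att_name, att_value) rows back into one hash per entity, which is roughly the shape a document database would store directly.

```ruby
# Hypothetical attribute-table rows: (fkey, att_name, att_value).
ROWS = [
  [1, 'name', 'Darth Vader'],
  [1, 'dark_side', 'true'],
  [2, 'name', 'Yoda'],
  [2, 'age', '900']
].freeze

# Fold the rows into one hash ("document") per foreign key.
def rows_to_documents(rows)
  rows.each_with_object({}) do |(fkey, att_name, att_value), docs|
    (docs[fkey] ||= {})[att_name] = att_value
  end
end

puts rows_to_documents(ROWS).inspect
```

Note that every value arrives as a string: the attribute table has thrown away the column types, which is part of why this pattern is listed as a symptom.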

Page 11: Non Relational Databases And World Domination

Data Dumps

• Given up on using columns for structured data
• Instead simply serialising it (JSON, YAML, XML, etc.) and dumping strings to the database

Page 12: Non Relational Databases And World Domination

Too Many Joins

• Schemas involving large numbers of many-to-many join tables or tree-like structures

Page 13: Non Relational Databases And World Domination

Frequent Schema Changes

• May be fine for small databases
• Can be tedious
• Rebuilding indexes is slow for millions of rows

Page 14: Non Relational Databases And World Domination

Scalability



Page 17: Non Relational Databases And World Domination

Write Capacity

• If read capacity is the problem, then set up master-slave replication

Page 18: Non Relational Databases And World Domination

Too Much Data

• Too much for one server to hold
• Hard to shard the data sensibly

Page 19: Non Relational Databases And World Domination

Non-Relational Solutions


Page 20: Non Relational Databases And World Domination

Diverse Ecosystem

• Column-oriented databases
• Document-oriented databases
• Key value stores
• Graph-oriented databases
• Distributed databases
• MapReduce

Page 21: Non Relational Databases And World Domination

BigTable

• “a sparse, distributed multi-dimensional sorted map”
• Designed to scale into the petabyte range
• HBase (Java, Hadoop)
• Hypertable
• Cassandra (Facebook, based on Amazon’s Dynamo)

Page 22: Non Relational Databases And World Domination

Document Databases

• Arbitrary number of “sparse” attributes per document
• Documents often map well to JSON, e.g. in CouchDB
• Cons: usually can’t perform joins or transactions spanning multiple documents

Page 23: Non Relational Databases And World Domination

Graph Databases

• Good for highly interconnected data
• Focus on the relationships between items
• Optimised for querying transitive relationships, i.e. variable-length chains of joins
• Neo4J, AllegroGraph, Sesame
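A transitive query can be sketched in a few lines of plain Ruby. This is an illustration only, assuming a hypothetical "follows" graph held in memory; it does not use the API of any of the graph databases named above. The breadth-first search follows chains of any length, which a relational schema would express as a variable number of self-joins.

```ruby
require 'set'

# Hypothetical "follows" graph as an adjacency list.
FOLLOWS = {
  'alice' => ['bob'],
  'bob'   => ['carol'],
  'carol' => ['dave'],
  'dave'  => []
}.freeze

# Breadth-first search: everyone reachable from `start` through a chain
# of relationships of any length.
def reachable(graph, start)
  seen = Set.new
  queue = [start]
  until queue.empty?
    graph.fetch(queue.shift, []).each do |neighbour|
      next if seen.include?(neighbour) || neighbour == start
      seen << neighbour
      queue << neighbour
    end
  end
  seen.to_a
end

puts reachable(FOLLOWS, 'alice').inspect
```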

Page 24: Non Relational Databases And World Domination

Distributed K-V Stores

• Giant hash table/dictionary
• Mainly solve data scalability problems
• Transparently partition and replicate data
• Cons:
  • eventual consistency or other distributed transaction protocols
  • hard to do integrity constraints, hard to catch application bugs
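As a rough illustration of "transparently partition and replicate", the sketch below picks nodes for a key from a stable hash. The node names, the `nodes_for` helper and the simple modulo placement are all assumptions made for this example; real stores typically use more elaborate schemes such as consistent hashing.

```ruby
require 'digest'

NODES = %w[node-a node-b node-c].freeze # hypothetical cluster

# Pick the primary node from a stable hash of the key, then place
# replicas on the nodes that follow it in the list.
def nodes_for(key, nodes = NODES, replicas = 2)
  start = Digest::MD5.hexdigest(key).to_i(16) % nodes.size
  (0...replicas).map { |i| nodes[(start + i) % nodes.size] }
end

puts nodes_for('user:42').inspect
```

The caller only supplies the key; which machines hold the value is decided by the hash, which is what makes the partitioning transparent to the application.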

Page 25: Non Relational Databases And World Domination

Distributed K-V Stores

• Scalaris, Dynomite, Ringo: data consistency
• MemcacheDB, Tokyo Cabinet: low latency

Page 26: Non Relational Databases And World Domination

Apache CouchDB

Page 27: Non Relational Databases And World Domination

CouchDB and Ruby

    require 'couchrest'

    # with !, it creates the database if it doesn't already exist
    @db = CouchRest.database!("http://127.0.0.1:5984/couchrest-test")
    response = @db.save_doc({ :key => 'value', 'another key' => 'another value' })
    doc = @db.get(response['id'])
    puts doc.inspect

Page 28: Non Relational Databases And World Domination

CouchDB and Ruby

    @db.bulk_save([
      {"wild" => "and random"},
      {"mild" => "yet local"},
      {"another" => ["set","of","keys"]}
    ])
    # returns ids and revs of the current docs
    puts @db.documents.inspect

Page 29: Non Relational Databases And World Domination

CouchDB and Ruby

    @db.save_doc({
      "_id" => "_design/first",
      :views => {
        :test => {
          :map => "function(doc){for(var w in doc){ if(!w.match(/^_/))emit(w,doc[w])}}"
        }
      }
    })
    puts @db.view('first/test')['rows'].inspect

Page 30: Non Relational Databases And World Domination

CouchDB and Ruby

• Read more about CouchRest on GitHub
• Also check out newcomer RubyAqua

Page 31: Non Relational Databases And World Domination

Features

• Schema-Free (JSON)
• Document-Oriented, Not Relational
• Highly Concurrent
• RESTful HTTP API
• JavaScript-Powered Map/Reduce
• N-Master Replication
• Robust Storage


Page 33: Non Relational Databases And World Domination

Documents

http://www.flickr.com/photos/stilleben2001/223243329/


Page 34: Non Relational Databases And World Domination

Schema-Free (JSON)

    {
      "_id": "BCCD12CBB",
      "_rev": "AB764C",
      "type": "person",
      "name": "Darth Vader",
      "age": 63,
      "headware": ["Helmet", "Sombrero"],
      "dark_side": true
    }



Page 40: Non Relational Databases And World Domination

Document-Oriented, Not Relational

• Documents in the Real World™
• Bills, letters, tax forms…
• Same type != same structure
• Can be out of date (so what?)
• No references

Natural Data Behaviour



Page 45: Non Relational Databases And World Domination

Highly Concurrent

• Functional languages highly appropriate for parallelism
• Lightweight “processes” and message-passing; “shared-nothing”
• Easy to create fault-tolerant systems

Page 46: Non Relational Databases And World Domination

MVCC

• Multiversion Concurrency Control
• Reads: lock-free; never block
• Potential for massive horizontal scaling
• Writes: all-or-nothing
  • Success
  • Fail: conflict error, fetch and try again
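The write path above can be sketched with a toy in-memory store. This is not CouchDB's implementation: `ToyStore`, `ConflictError` and `update_with_retry` are hypothetical names, and revisions are plain integers here for brevity, whereas CouchDB uses opaque revision strings. The point is only the shape of the protocol: a write succeeds against the latest revision or fails with a conflict, and the client fetches and tries again.

```ruby
# Raised when a write carries a stale revision.
class ConflictError < StandardError; end

# A toy in-memory store: each document carries a revision counter,
# and a write succeeds only against the latest revision.
class ToyStore
  def initialize
    @docs = {}
  end

  # Reads take no locks and return a copy of the current version.
  def get(id)
    doc = @docs[id]
    doc && doc.dup
  end

  # All-or-nothing write: a stale revision raises ConflictError.
  def put(id, doc)
    current = @docs[id]
    current_rev = current ? current['_rev'] : nil
    raise ConflictError, id if doc['_rev'] != current_rev

    @docs[id] = doc.merge('_rev' => current_rev.to_i + 1)
    @docs[id]['_rev']
  end
end

# "Fetch and try again": re-read the latest version and reapply the change.
def update_with_retry(store, id, attempts = 3)
  attempts.times do
    doc = store.get(id) || {}
    yield doc
    begin
      return store.put(id, doc)
    rescue ConflictError
      next
    end
  end
  raise ConflictError, id
end
```

A caller would write something like `update_with_retry(store, 'a') { |doc| doc['n'] = 1 }`, passing the change as a block so it can be reapplied to a fresh copy after a conflict.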


Page 48: Non Relational Databases And World Domination

RESTful HTTP API

CRUD

• Create: HTTP PUT /db/mydocid
• Read: HTTP GET /db/mydocid
• Update: HTTP PUT /db/mydocid
• Delete: HTTP DELETE /db/mydocid
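The four operations map directly onto Ruby's standard `Net::HTTP` request classes. The sketch below only builds the requests and does not send them; the base URL, the document body and the revision string `1-abc` are placeholders, and `crud_requests` is a hypothetical helper. It also shows one detail the slide leaves out: updates and deletes have to name the revision they are replacing.

```ruby
require 'json'
require 'net/http'
require 'uri'

JSON_HEADERS = { 'Content-Type' => 'application/json' }.freeze

# Build (but do not send) the four CRUD requests for one document.
def crud_requests(base, doc_id, rev)
  uri = URI.join(base, doc_id)

  create = Net::HTTP::Put.new(uri, JSON_HEADERS)
  create.body = JSON.generate('type' => 'person')

  read = Net::HTTP::Get.new(uri)

  # Updates reuse PUT but must name the revision being replaced.
  update = Net::HTTP::Put.new(uri, JSON_HEADERS)
  update.body = JSON.generate('type' => 'person', '_rev' => rev)

  # Deletes pass the revision as a query parameter.
  delete_uri = uri.dup
  delete_uri.query = URI.encode_www_form(rev: rev)
  delete = Net::HTTP::Delete.new(delete_uri)

  { create: create, read: read, update: update, delete: delete }
end

requests = crud_requests('http://127.0.0.1:5984/db/', 'mydocid', '1-abc')
requests.each { |name, req| puts "#{name}: #{req.method} #{req.path}" }
```

To actually send one of these, pass it to `Net::HTTP.start(host, port) { |http| http.request(req) }` against a running server.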

Page 49: Non Relational Databases And World Domination

RESTful HTTP API Example

    require 'couchrest'
    require 'json'
    require 'net/http'

    couch = CouchRest.database!("http://127.0.0.1:5984/tweets")
    tweets_url = "http://twitter.com/statuses/user_timeline.json"
    # fetch the timeline and parse it into an array of hashes
    tweets = JSON.parse(Net::HTTP.get(URI(tweets_url)))
    couch.bulk_save(tweets)

Page 50: Non Relational Databases And World Domination

Cacheability

• Both documents and views return ETags
• Clients send If-None-Match
• CouchDB responds with 304 Not Modified and bypasses potentially expensive lookup
• Can use Varnish/Squid as caching proxy
• Proxy-friendly
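The conditional-request exchange can be sketched as a toy handler. This is not CouchDB's code: `respond` is a hypothetical function, and it assumes the ETag is simply the document revision in double quotes. When the client's If-None-Match value matches, the handler answers 304 without building a body at all.

```ruby
require 'json'

# A toy handler: derive an ETag from the document revision and answer
# 304 Not Modified, with no body, when the client's copy is current.
def respond(doc, if_none_match = nil)
  etag = %("#{doc['_rev']}")
  return [304, { 'ETag' => etag }, nil] if if_none_match == etag

  [200, { 'ETag' => etag }, JSON.generate(doc)]
end

doc = { '_id' => 'mydocid', '_rev' => '1-abc' }
status, headers, = respond(doc)
puts status
puts respond(doc, headers['ETag']).first
```

A caching proxy in front of the database plays the client role here: it stores the body alongside the ETag and revalidates with If-None-Match.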


Page 52: Non Relational Databases And World Domination

JavaScript-Powered Map/Reduce

• Map functions extract data from your documents

• Reduce functions aggregate intermediate values

• The kicker: Incremental B-tree storage


Page 53: Non Relational Databases And World Domination

http://horicky.blogspot.com/2008/10/couchdb-implementation.html

Page 54: Non Relational Databases And World Domination

Map/Reduce Views

Docs

    {"user": "Chris", "points": 3}
    {"user": "Joe", "points": 10}
    {"user": "Alice", "points": 5}
    {"user": "Mary", "points": 9}
    {"user": "Bob", "points": 7}

Map

    function(doc) {
      if (doc.user && doc.points) {
        emit(doc.user, doc.points);
      }
    }

    {"key": "Alice", "value": 5}
    {"key": "Bob", "value": 7}
    {"key": "Chris", "value": 3}
    {"key": "Joe", "value": 10}
    {"key": "Mary", "value": 9}

Reduce

    function(keys, values, rereduce) {
      return sum(values);
    }

    Alice ... Chris: 15
    Everyone: 34
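The same view can be traced in plain Ruby to check the numbers on the slide. This is a from-scratch simulation with hypothetical helper names (`map_view`, `reduce_view`); it recomputes everything on each call, whereas CouchDB keeps intermediate results in its B-tree so that views update incrementally.

```ruby
DOCS = [
  { 'user' => 'Chris', 'points' => 3 },
  { 'user' => 'Joe',   'points' => 10 },
  { 'user' => 'Alice', 'points' => 5 },
  { 'user' => 'Mary',  'points' => 9 },
  { 'user' => 'Bob',   'points' => 7 }
].freeze

# Map: emit one [key, value] row per qualifying document, sorted by key.
def map_view(docs)
  rows = docs.select { |d| d['user'] && d['points'] }
             .map { |d| [d['user'], d['points']] }
  rows.sort_by(&:first)
end

# Reduce: sum the values of the rows whose keys fall within the range.
def reduce_view(rows, startkey: nil, endkey: nil)
  in_range = rows.select do |key, _|
    (startkey.nil? || key >= startkey) && (endkey.nil? || key <= endkey)
  end
  in_range.sum { |_, value| value }
end

rows = map_view(DOCS)
puts reduce_view(rows, startkey: 'Alice', endkey: 'Chris') # 5 + 7 + 3 = 15
puts reduce_view(rows)                                      # all five rows = 34
```

Because the rows are sorted by key, a key range such as Alice to Chris is a contiguous slice of the view, which is what makes range reductions cheap in a B-tree.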


Page 57: Non Relational Databases And World Domination

Render Views as HTML

lists/index.js
/drl/_list/sofa/index/recent-posts?descending=true&limit=8

Page 58: Non Relational Databases And World Domination

Server-Side JavaScript

• _show for transforming documents
• _list for transforming views
• _update for transforming PUTs/POSTs
• Code-sharing between client and server
• Easy deployment


Page 60: Non Relational Databases And World Domination

Replication

• Incremental
• Near-real-time
• Clustered mirrors
• Scheduled
• Ad-hoc
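Replication is triggered over the same HTTP API, by POSTing a JSON body to the server's `_replicate` endpoint. The sketch below only builds that request and does not send it; the server URL and database names are placeholders, and `replication_request` is a hypothetical helper. A one-off request corresponds to the ad-hoc case, and setting `continuous` asks the server to keep replicating as changes arrive.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build (but do not send) a POST to the _replicate endpoint.
def replication_request(server, source, target, continuous: false)
  req = Net::HTTP::Post.new(URI.join(server, '_replicate'),
                            'Content-Type' => 'application/json')
  body = { 'source' => source, 'target' => target }
  body['continuous'] = true if continuous
  req.body = JSON.generate(body)
  req
end

req = replication_request('http://127.0.0.1:5984/', 'tweets',
                          'http://example.org:5984/tweets', continuous: true)
puts "#{req.method} #{req.path} #{req.body}"
```

Scheduled replication would then amount to sending the one-off form of this request from a cron job or similar.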

Page 61: Non Relational Databases And World Domination

http://www.flickr.com/photos/mcpig/872293700/

“Ground Computing” (@jhuggins)

- local to the user, more like desktop web than like Gears
- local http server
- browser apps
- same application on the client and server or the cloud

Page 62: Non Relational Databases And World Domination

http://www.flickr.com/photos/hercwad/2290378571/


Page 63: Non Relational Databases And World Domination

Latency Sucks

- speed of light
- drawback to cloud computing

Page 64: Non Relational Databases And World Domination

Stuart Langridge - Canonical


Page 75: Non Relational Databases And World Domination

Conflicts

Page 76: Non Relational Databases And World Domination

Conflict resolution by example

[Diagram sequence, pages 76 to 86: two databases, A and B, with document revisions drawn as ❦, ✿ and ♪]


Page 88: Non Relational Databases And World Domination

Robust Storage

Append-Only File Structure

Designed to Crash

Instant-On

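The idea behind an append-only file that is "designed to crash" can be illustrated with a toy log. This is not CouchDB's on-disk format; `append` and `recover` are hypothetical helpers and the log is a `StringIO` standing in for a file. Updates are only ever added at the end, so a crash can at worst leave a torn final record, which recovery simply ignores.

```ruby
require 'json'
require 'stringio'

# A toy append-only log: every update is appended as one JSON line,
# and nothing is ever overwritten in place.
def append(io, doc)
  io.seek(0, IO::SEEK_END)
  io.write(JSON.generate(doc) + "\n")
end

# Recovery is a scan: keep the last complete record per _id and
# skip a torn final line left behind by a crash mid-write.
def recover(io)
  io.rewind
  io.each_line.with_object({}) do |line, docs|
    next unless line.end_with?("\n")

    doc = JSON.parse(line)
    docs[doc['_id']] = doc
  end
end

log = StringIO.new
append(log, { '_id' => 'a', 'n' => 1 })
append(log, { '_id' => 'a', 'n' => 2 })
log.write('{"_id": "b", "n"') # simulated crash halfway through a write
puts recover(log).inspect
```

Earlier versions of a document stay in the log untouched, so there is no repair step after a crash: the previous complete state is still there.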

Page 89: Non Relational Databases And World Domination

Robust

- when Britain is burning - Enda Farrell - BBC


Page 91: Non Relational Databases And World Domination

Thanks!

www.jasondavies.com

@jasondavies
