Open source, high performance...

Preview:

Citation preview

1

July 2012

Open source, high performance database

2

• Quick introduction to mongoDB

• Data modeling in mongoDB, queries, geospatial, updates and map reduce.

• Using a location-based app as an example

• Example works in mongoDB JS shell

3

4

MongoDB is a scalable, high-performance, open source, document-oriented database.

• Fast Querying • In-place updates• Full Index Support• Replication /High Availability• Auto-Sharding• Aggregation; Map/Reduce• GridFS

5

Better data locality

Relational MongoDB

In-Memory Caching

Auto-Sharding

Write scalingRea

d sc

alin

g

We just can't get any faster than the way MongoDB handles our data.

Tony TamCTO, Wordnik

6

MongoDB is Implemented in C++• Windows, Linux, Mac OS-X, Solaris

Drivers are available in many languages10gen supported• C, C# (.Net), C++, Erlang, Haskell, Java,

JavaScript, Perl, PHP, Python, Ruby, Scala, nodejs

• Multiple community supported drivers

7

RDBMS MongoDBTable CollectionRow(s) JSON DocumentIndex IndexPartition ShardJoin Embedding/Linking

Schema (implied Schema)

8

{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "Asya",date : ISODate("2012-02-02T11:52:27.442Z"), text : "About MongoDB...",tags : [ "tech", "databases" ],comments : [{

author : "Fred",date : ISODate("2012-02-03T17:22:21.124Z"), text : "Best

Post Ever!"}],comment_count : 1

}

9

•JSON has powerful, limited set of datatypes– Mongo extends datatypes with Date, Int types, Id, …

•MongoDB stores data in BSON

•BSON is a binary representation of JSON– Optimized for performance and navigational abilities– Also compression

See: bsonspec.org

10

MySQLSTART TRANSACTION;INSERT INTO contacts VALUES 

(NULL, ‘joeblow’);INSERT INTO contact_emails VALUES 

( NULL, ”joe@blow.com”,LAST_INSERT_ID() ),

( NULL, “joseph@blow.com”, LAST_INSERT_ID() );

COMMIT;

MongoDBdb.contacts.save( { 

userName: “joeblow”,emailAddresses: [ 

“joe@blow.com”,“joseph@blow.com” ] } );

• Native drivers for dozens of languages• Data maps naturally to OO data

structures

11

•Want to build an app where users can check in to a location

•Leave notes or comments about that location

12

"As a user I want to be able to find other locations nearby"

• Need to store locations (Offices, Restaurants, etc)

– name, address, tags– coordinates– User generated content e.g. tips / notes

13

"As a user I want to be able to 'checkin' to a location"

Checkins– User should be able to 'check in' to a location– Want to be able to generate statistics:

• Recent checkins• Popular locations

14

users

user1, user2loc1, loc2, loc3

locations checkins

checkin1, checkin2

15

> location_1 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012

}

16

> location_1 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012

}

> db.locations.find({name: "Lotus Flower"})

17

> location_1 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012

}

> db.locations.ensureIndex({name: 1})> db.locations.find({name: "Lotus Flower"})

18

> location_2 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012,

tags: ["restaurant", "dumplings"]}

19

> location_2 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012,

tags: ["restaurant", "dumplings"]}

> db.locations.ensureIndex({tags: 1})

20

> location_2 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012,

tags: ["restaurant", "dumplings"]}

> db.locations.ensureIndex({tags: 1})> db.locations.find({tags: "dumplings"})

21

> location_3 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012, tags: ["restaurant", "dumplings"],

lat_long: [52.5184, 13.387]}

22

> location_3 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012, tags: ["restaurant", "dumplings"],

lat_long: [52.5184, 13.387]}

> db.locations.ensureIndex({lat_long: "2d"})

23

> location_3 = {name: "Lotus Flower",address: "123 University Ave",city: "Palo Alto",post_code: 94012, tags: ["restaurant", "dumplings"],

lat_long: [52.5184, 13.387]}

> db.locations.ensureIndex({lat_long: "2d"})> db.locations.find({lat_long: {$near:[52.53, 13.4]}})

24

// creating your indexes:> db.locations.ensureIndex({tags: 1})> db.locations.ensureIndex({name: 1})> db.locations.ensureIndex({lat_long: "2d"})

// finding places:> db.locations.find({lat_long: {$near:[52.53, 13.4]}})

// with regular expressions:> db.locations.find({name: /^Din/})

// by tag:> db.locations.find({tag: "dumplings"})

25

Atomic operators:

$set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit

26

// initial data load:> db.locations.insert(location_3)

// adding a tip with update:> db.locations.update(

{name: "Lotus Flower"},{$push: {

tips: {user: "Asya",date: "28/03/2012",tip: "The hairy crab dumplings are awesome!"}

}})

27

> db.locations.findOne(){ name: "Lotus Flower",

address: "123 University Ave",city: "Palo Alto",post_code: 94012, tags: ["restaurant", "dumplings"],lat_long: [52.5184, 13.387],tips:[{

user: "Asya",date: "28/03/2012",tip: "The hairy crab dumplings are awesome!"

}]

}

28

"As a user I want to be able to 'checkin' to a location"

Checkins

– User should be able to 'check in' to a location– Want to be able to generate statistics:

• Recent checkins• Popular locations

29

> user_1 = {_id: "asya@10gen.com",name: "Asya",twitter: "asya999",checkins: [{location: "Lotus Flower", ts: "28/03/2012"},{location: "Meridian Hotel", ts: "27/03/2012"}]

}

> db.users.ensureIndex({checkins.location: 1})> db.users.find({checkins.location: "Lotus Flower"})

30

// find all users who've checked in here:> db.users.find({"checkins.location":"Lotus Flower"})

31

// find all users who've checked in here:> db.users.find({"checkins.location":"Lotus Flower"})

// find the last 10 checkins here?> db.users.find({"checkins.location":"Lotus Flower"})

.sort({"checkins.ts": -1}).limit(10)

32

// find all users who've checked in here:> db.users.find({"checkins.location":"Lotus Flower"})

// find the last 10 checkins here: - Warning!> db.users.find({"checkins.location":"Lotus Flower"})

.sort({"checkins.ts": -1}).limit(10)

Hard to query for last 10

33

> user_2 = {_id: "asya@10gen.com",name: "Asya",twitter: "asya999",

}

> checkin_1 = {location: location_id,user: user_id,ts: "20/03/2010"

}

> db.checkins.ensureIndex({user: 1})> db.checkins.find({user: user_id})

34

// find all users who've checked in here:> location_id = db.checkins.find({"name":"Lotus Flower"})> u_ids = db.checkins.find({location: location_id},

{_id: -1, user: 1})> users = db.users.find({_id: {$in: u_ids}})

// find the last 10 checkins here:> db.checkins.find({location: location_id})

.sort({ts: -1}).limit(10)

// count how many checked in today:> db.checkins.find({location: location_id,

ts: {$gt: midnight}}).count()

35

// Find most popular locations> agg = db.checkins.aggregate(

{$match: {ts: {$gt: now_minus_3_hrs}}},{$group: {_id: "$location",numEntries: {$sum: 1}}}

)

> agg.result[{"_id": "Lotus Flower", "numEntries" : 17}]

36

// Find most popular locations> map_func = function() {

emit(this.location, 1);}

> reduce_func = function(key, values) {return Array.sum(values);

}

> db.checkins.mapReduce(map_func, reduce_func, {query: {ts: {$gt: now_minus_3_hrs}},out: "result"})

> db.result.findOne(){"_id": "Lotus Flower", "value" : 17}

37

DeploymentDeployment

38

P• Single server- need a strong backup plan

39

• Single server- need a strong backup plan

• Replica sets- High availability- Automatic failover

P

P S S

40

• Single server- need a strong backup plan

• Replica sets- High availability- Automatic failover

• Sharded- Horizontally scale- Auto balancing

P S S

P S S

P

P S S

41

User Data Management High Volume Data Feeds 

Content Management Operational Intelligence E‐Commerce

42

43

@mongodb

conferences, appearances, and meetupshttp://www.10gen.com/events

http://bit.ly/mongofb

Facebook | Twitter | LinkedInhttp://linkd.in/joinmongo

download at mongodb.org

support, training, and this talk brought to you by