MongoDB Introduction for Java, Python and PHP Developers

Copyright © Mammatus Technology Inc. | Licensed under Creative Commons.

Covers using MongoDB from a Java, PHP and Python developer’s perspective. Uses the official MongoDB drivers for Java, Python and PHP, as well as the MongoDB command-line tools, to teach core concepts.


Getting started with MongoDB. Covers why to use Mongo, its features and architecture (replica sets, MapReduce, aggregation framework), and code examples in JavaScript, Java, Python and PHP.


Outline
• Theory and Architecture of MongoDB (49%)
• Setup instructions (9%)
• Code examples in JavaScript, PHP, Python and Java (39%)
• 3% random

© 2012 10gen. MongoDB®, Mongo®, and the leaf logo are registered trademarks of 10gen, Inc.

10gen in no way endorses this slide deck or Mammatus Technology Inc.

MongoDB
• NoSQL landscape full of contenders tackling big data problems
• MongoDB very capable
• Document-oriented schema-less storage solution
• JSON-style documents to represent, query and modify data
• Supports many clients/languages: Python, PHP, Java, Ruby, C++, etc.

Resources
• 10gen site (http://www.mongodb.org)
  – Great documentation and presentations
• InfoQ articles and presentations
• Wikipedia
• http://mongly.com/

Feedback welcome
• Send any and all feedback to [email protected]
• Criticism welcome
  – Prefer constructive criticism, but will take any and all
  – Needed for continuous improvement of this slide deck and my knowledge

Creative Commons
• This slide deck and all material therein are covered under Creative Commons
• You can use all material in here as long as you don’t copy it word for word and then use it for commercial reasons
• Other material in here from other sources is covered under fair use
• http://creativecommons.org/

Why MongoDB?
• MongoDB has a great, active community
• Supports:
  – High availability
  – Journaling
  – Replication
  – Sharding
  – Indexing
  – Aggregation
  – Map/Reduce
• MongoDB commercial: http://www.10gen.com/what-is-mongodb


MongoDB is a top job trend

Leader in the NoSQL space?

MongoDB seems to be the clear mind-share leader

Cassandra a close second

Why do Developers pick Mongo?
• NoSQL, in general, can be more agile than full RDBMS/SQL
  – problems with schema migration
  – a lot of upfront design needed for RDBMS
  – (or a lot of schema migration later)
• MongoDB does not require a lot of ramp-up time
  – Easy to get started
  – Many DevOps things come for free
  – Easy on-ramp for NoSQL – Gateway drug?


Why PHP, Python and Java?

Might add Ruby later to the mix or just focus on these

Built for speed | Cache built in
• MongoDB was built with speed in mind
• Speed shaped architecture of MongoDB
• Uses binary protocol instead of HTTP/text (CouchDB)
• Pads disk space around document
  – faster updates
  – uses more disk
• Uses memory-mapped files as default storage engine, letting OS manage swapping
  – Linux/Windows/Solaris really good at virtual memory... MongoDB builds on top of this

Negatives of MongoDB
• Indexes are not as flexible as Oracle/MySQL/Postgres or other NoSQL solutions
  – Order of index matters, uses B-Trees, not as many options as more mature solutions
• Realtime queries might not be as fast as Oracle/MySQL and other NoSQL solutions
• Good enough if queries are simple
• Probably hits the sweet spot of the 80/20 rule
• Not as mature as RDBMS
• Does not have full text search engine

Very useful still
• Every version seems to add more features
• Added journaling so they can have single server durability
• Improved replicas with Replica Sets
• Replica Sets and Autosharding require very little admin once running
• What it does, it does well...
  – Can be combined with relational database
  – Can be combined with full text search (Solr)
  – Can be combined with Hadoop

Who uses MongoDB?

Big names, big data
• MTV
• Craigslist
• Disney
• Shutterfly
• foursquare
• bit.ly
• The New York Times
• Barclays
• The Guardian
• SAP
• Forbes
• National Archives UK
• Intuit
• github
• LexisNexis
• many more

MongoDB Concept
• Oracle: Schema, Tables, Rows, Columns
• MySQL: Database, Tables, Rows, Columns
• MongoDB: Database, Collections, Document, Fields
• MySQL/Oracle: Indexes
• MongoDB: Indexes
• MySQL/Oracle: Stored Procedures
• MongoDB: Stored JavaScript
• Oracle/MySQL: Database Schema
• MongoDB: Schema free!
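The row-to-document mapping above can be sketched with plain Python values. This is only an illustration using built-in types, not driver code; the `skills` field is made up for the example to show what "schema free" means in practice.

```python
# Illustrative only: plain Python values standing in for a SQL row
# and a MongoDB document. No database is involved.

# A relational row: values are positional and must match the table's columns.
columns = ("name", "gender", "phone", "age")
row = ("Rick Hightower", "m", "520-555-1212", 42)

# The same record as a document: field names travel with the data.
document = dict(zip(columns, row))

# Schema free: another document in the same collection may carry
# different fields. The "skills" field is hypothetical.
other_document = {"name": "Diana Hightower", "age": 30, "skills": ["python", "java"]}

# A collection is a group of documents (a list stands in for it here).
collection = [document, other_document]

for doc in collection:
    print(sorted(doc.keys()))
```

Both documents live side by side even though their field sets differ, which is the property a fixed table schema would not allow without a migration.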


MongoDB Architecture

Additional Mongo Features
• Geo Indexing: How close am I to X?
• File Storage
  – Stores large files and file meta-data
• Capped Collection (like Ring Buffer)
  – Older documents auto-deleted
• Aggregation
• Auto sharding
• Load sharing for reads
• High availability
• Speed or durability (journaling)
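The ring-buffer behavior of a capped collection can be sketched in a few lines of Python. This is a toy model, assuming a cap on document count only; the class and method names are hypothetical and are not the MongoDB API.

```python
from collections import deque


class CappedCollection:
    """Toy stand-in for a capped collection: keeps only the newest documents."""

    def __init__(self, max_docs):
        # A deque with maxlen silently drops the oldest item when full.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def find(self):
        # Documents come back in insertion order, oldest first.
        return list(self._docs)


log = CappedCollection(max_docs=3)
for i in range(5):
    log.insert({"event": i})

# Only the three most recent documents survive.
print(log.find())
```

The sketch caps by document count for simplicity; the shell's `db.createCollection(name, { size : ..., capped : ..., max : ... })` signature shown later in the deck suggests the real feature is sized in bytes, with an optional document limit.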

What do you need?
• Read scalability and high availability (HA)?
  – Use Replica Sets
• Write scalability?
  – Use Autosharding (also just called sharding)
• HA, Read Scalability, and Write Scalability?
  – Use Autosharding and Replica Sets
• You can start basic and add as your growth/needs change
  – Capacity planning, monitoring, determine needs

Durability
• Journaling added in 1.8, and is now default for 64-bit OS for MongoDB 2.0
• Prior to that, you used replication to make sure copy of operation was on replica
  – MongoDB did not have single server durability, now it does with addition of journaling
• General thought was/is durability is overvalued
• You can also force an fsync
• See links in notes

Replica Sets
• Drivers know the primary
• If primary down, Drivers know how to get new primary
• Data is replicated after writing
• Typical to have three in a replica set
• You can do more
• Load sharing for reads

[Diagram: Client Driver; Replica 0 (PRIMARY), Replica 1, Replica 2; arrows labeled Read/Write and Read]
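The routing described on this slide can be sketched as a toy client. This is a simplified model, not the real driver API: the class and method names are hypothetical, and the new primary is passed in by hand, whereas a real driver discovers it from the replica set.

```python
import random


class ReplicaSetClient:
    """Toy model of a replica-set-aware driver (not the real driver API)."""

    def __init__(self, members, primary):
        self.members = list(members)
        self.primary = primary

    def write_target(self):
        # Writes always go to the primary.
        return self.primary

    def read_target(self, allow_secondary=False):
        # Reads go to the primary unless the caller opts in to load sharing.
        if allow_secondary:
            return random.choice(self.members)
        return self.primary

    def handle_primary_down(self, new_primary):
        # Assumption: the caller supplies the newly nominated primary.
        self.members.remove(self.primary)
        self.primary = new_primary


client = ReplicaSetClient(["replica0", "replica1", "replica2"], primary="replica0")
print(client.write_target())
client.handle_primary_down(new_primary="replica1")
print(client.write_target())
```

Reads from a secondary may return slightly stale data because replication happens after the write, which is why the sketch makes secondary reads opt-in.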

Replica Sets Usage
• Business Continuity
• Data Redundancy
• High Availability
• Load sharing (reads)
• “Just works / NoOps (low ops)”

Replica Sets
• Non-blocking master/slave replication
• Auto failover
• Two or more nodes (usually three)
• No fixed primary, master is nominated
• Share nothing architecture
• Brains in the client libraries
• Client libraries (Drivers) are Replica Set aware
• Client can block until data is replicated on all servers (for important data)

Replica Sets
• You only write to the master, it then replicates to slaves
• Replication is by default async (non-blocking)
• Slave data and master data can be out of sync
  – There are workarounds
  – You can force master to sync to slaves before continuing (blocking, sync)
• Sync blocking is slower
• Async non-blocking is faster (eventual consistency)

Durability and Replica Sets
• Client libraries can do the following:
  – Wait until write has happened on all replicas
  – Wait until write is on two servers (primary and one other)
  – Wait until write has occurred on majority of replicas
  – Wait until write operation has been written to journal
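The first three waiting modes above come down to how many acknowledgements a write needs before the client continues. A minimal sketch of that arithmetic follows; the mode names `"all"`, `"two"` and `"majority"` and both function names are labels chosen for this example, not driver constants, and waiting on the journal is not modeled.

```python
def acks_required(mode, replica_count):
    """Number of servers that must confirm a write for the given mode."""
    if mode == "all":
        return replica_count
    if mode == "two":
        # Primary plus one other, capped at the size of the set.
        return min(2, replica_count)
    if mode == "majority":
        return replica_count // 2 + 1
    raise ValueError("unknown mode: %r" % mode)


def write_is_acknowledged(mode, replica_count, acks_received):
    """True once enough servers have confirmed the write."""
    return acks_received >= acks_required(mode, replica_count)


for mode in ("all", "two", "majority"):
    print(mode, acks_required(mode, replica_count=3))
```

With the typical three-member set, "two" and "majority" ask for the same number of acknowledgements; they diverge once the set grows to five or more members.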

Replica Sets
• Replica Sets and Autosharding go hand in hand for mass scale out
• Replica Sets are good for failover and speeding up reads, but...
• To speed up writes, you need autosharding

Sharding
• Sharding allows MongoDB to scale horizontally
• Sharding = partitioning
• Auto-shards
  – load balances
  – changes for data distribution
• Elastic adding of new nodes
• Supports automatic failover (along with replica sets)
• No single point of failure
• 90% of deployments don’t need sharding according to Roger Bodamer

Non-sharded client connection
• Client Driver talks directly to mongod process

Autosharded
• Three actors now: mongod, mongos, and Client Driver library
• mongod is the process
• mongos is a router, it routes writes to correct mongod instance
• Shares writing

Autoshard plus Replica Set
• Autosharding increases write throughput, helps with scale out
• Replica Sets are for high availability
• There is a whole lesson on sharding.

Sharding Topology
• Config Servers (mongod)
  – contain versioned shard topology
  – map which shard has key
  – used by mongos
  – like DNS server for shards
• mongos
  – Shard router: client drivers talk to mongos instead of mongod directly
  – mongos uses Config Servers to find shard where key lives
  – mongod instances are shards that can be replicated


Large deployment

MapReduce
• Used for batch processing
• Similar to Hadoop
• Massive aggregation possible through divide and conquer
• Used instead of GROUP BY in SQL
  – Also added simplified framework to MongoDB (aggregation framework)
• Map and Reduce functions are written in JavaScript
  – Executed on server, code next to data it is operating on
• Can copy results to results collections


MapReduce Theory

Image from http://code.google.com/p/mapreduce-framework/wiki/MapReduce

Incremental MapReduce
• Run MapReduce job over collections
• Run a second job but only over new documents in collection
• Use reduce output to merge new data into existing collection
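The three steps above can be sketched in Python as follows. This is an in-process illustration, not MongoDB code: a dict stands in for the results collection, and the helper names are hypothetical.

```python
from collections import Counter


def count_by_gender(docs):
    """The 'job': map each document to its gender and count per key."""
    return Counter(doc["gender"] for doc in docs)


def merge_counts(existing, new):
    """Reduce step applied across runs: add new counts to stored results."""
    merged = Counter(existing)
    merged.update(new)
    return dict(merged)


# First run over the documents that exist so far.
first_batch = [
    {"name": "Rick Hightower", "gender": "m"},
    {"name": "Diana Hightower", "gender": "f"},
]
results = dict(count_by_gender(first_batch))

# Second run covers only the new documents, then merges into the results.
new_docs = [{"name": "Lucas Hightower", "gender": "m"}]
results = merge_counts(results, count_by_gender(new_docs))
print(results)
```

The merge only works because the reduce operation (addition here) gives the same answer whether it sees all values at once or partial results combined later.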

Aggregation Framework
• Added in MongoDB 2.1
• Similar to SQL group by
• Before Aggregation framework, you had to use MapReduce for things like SQL group by
• Easier to use than MapReduce

Table from 10gen: http://www.mongodb.org/display/DOCS/SQL+to+Aggregation+Framework+Mapping+Chart
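As a rough illustration of the "SQL group by" comparison, the sketch below computes a count and average age per gender in plain Python. The `pipeline` value shows the general shape such a grouping takes in the aggregation framework (`$group`, `$sum`, `$avg`); it is included for comparison only and is not executed against a server. The function name and sample data are assumptions for the example.

```python
from collections import defaultdict

employees = [
    {"name": "Rick Hightower", "gender": "m", "age": 42},
    {"name": "Diana Hightower", "gender": "f", "age": 30},
    {"name": "Lucas Hightower", "gender": "m", "age": 8},
]

# Shape of an equivalent aggregation pipeline (for comparison, not run here).
pipeline = [
    {"$group": {"_id": "$gender", "count": {"$sum": 1}, "avg_age": {"$avg": "$age"}}}
]


def group_by_gender(docs):
    """Plain-Python equivalent of grouping by gender with COUNT and AVG."""
    ages = defaultdict(list)
    for doc in docs:
        ages[doc["gender"]].append(doc["age"])
    return {
        gender: {"count": len(values), "avg_age": sum(values) / len(values)}
        for gender, values in ages.items()
    }


summary = group_by_gender(employees)
print(summary)
```

Compared with the MapReduce version of the same grouping, the pipeline is a declarative description of the result rather than two hand-written functions, which is the ease-of-use point the slide makes.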



MongoDB versus SQL

Compare contrast


SQL to Mongo

From http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart


Mongo versus SQL

From http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart



Getting Started with Mongo

Installing and using Mongo

Install (1 of 2)
• Download: http://www.mongodb.org/downloads
• Extract
  – ~/mongodb-platform-version/
• $ sudo mkdir -p /etc/mongodb/data
• Create file
  – /etc/mongodb/mongodb.config
• $ cat mongodb.config
  – dbpath=/etc/mongodb/data

Install (2 of 2)
• Link:
  sudo ln -s ~/mongodb-platform-version/ /usr/local/mongodb
• Add to Path:
  export PATH=$PATH:/usr/local/mongodb/bin
• Run the server:
  mongod --config /etc/mongodb/mongodb.config

Run the client (type db.version())
$ mongo
MongoDB shell version: 2.0.4
connecting to: test
…
> db.version()
2.0.4
>

Client: mongo db.help()
> db.help()
DB methods:
db.addUser(username, password[, readOnly=false])
db.auth(username, password)
db.cloneDatabase(fromhost)
db.commandHelp(name) returns the help for the command
db.copyDatabase(fromdb, todb, fromhost)
db.createCollection(name, { size : ..., capped : ..., max : ... } )
db.currentOp() displays the current operation in the db
db.dropDatabase()
db.eval(func, args) run code server-side
db.getCollection(cname) same as db['cname'] or db.cname
db.getCollectionNames()

db.help()
db.getLastError() - just returns the err msg string
db.getLastErrorObj() - return full status object
db.getMongo() get the server connection object
db.getMongo().setSlaveOk() allow this connection to read from the nonmaster member of a replica pair
db.getName()
db.getPrevError()
db.getProfilingStatus() - returns if profiling is on and slow threshold
db.getReplicationInfo()
db.getSiblingDB(name) get the db at the same server as this one
db.isMaster() check replica primary status
db.killOp(opid) kills the current operation in the db
db.listCommands() lists all the db commands
db.logout()
db.printCollectionStats()
db.printReplicationInfo()
db.printSlaveReplicationInfo()
db.printShardingStatus()
db.removeUser(username)
db.repairDatabase()
db.resetError()
db.runCommand(cmdObj) run a database command. if cmdObj is a string, turns it into { cmdObj : 1 }
db.serverStatus()
db.setProfilingLevel(level,<slowms>) 0=off 1=slow 2=all
db.shutdownServer()
db.stats()
db.version() current version of the server
db.getMongo().setSlaveOk() allow queries on a replication slave server
db.fsyncLock() flush data to disk and lock server for backups
db.fsyncUnlock() unlocks server following a db.fsyncLock()

Create Employee Collection
> use tutorial;
switched to db tutorial
> db.getCollectionNames();
[ ]
> db.employees.insert({name:'Rick Hightower', gender:'m', phone:'520-555-1212', age:42});
Mon Apr 23 23:50:24 [FileAllocator] allocating new datafile /etc/mongodb/data/tutorial.ns, filling with zeroes…

Query DB (1 of 2)
> db.getCollectionNames();
[ "employees", "system.indexes" ]

> db.employees.find()
{ "_id" : ObjectId("4f964d3000b5874e7a163895"), "name" : "Rick Hightower", "gender" : "m", "phone" : "520-555-1212", "age" : 42 }

Query DB (2 of 2)
> db.employees.find({name:"Bob"})

> db.employees.find({name:"Rick Hightower"})
{ "_id" : ObjectId("4f964d3000b5874e7a163895"), "name" : "Rick Hightower", "gender" : "m", "phone" : "520-555-1212", "age" : 42 }

> db.employees.find({age:{$lt:100}})
{ "_id" : ObjectId("4f964d3000b5874e7a163895"), "name" : "Rick Hightower", "gender" : "m", "phone" : "520-555-1212", "age" : 42 }

> db.employees.find({age:{$lt:100}})[0].name
Rick Hightower

> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "tutorial.employees", "name" : "_id_" }

> db.employees.find({_id : ObjectId("4f964d3000b5874e7a163895")})
{ "_id" : ObjectId("4f964d3000b5874e7a163895"), "name" : "Rick Hightower", "gender" : "m", "phone" : "520-555-1212", "age" : 42 }
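To make the query-by-example style of these shell calls concrete, here is a small Python sketch of how a query document can be matched against stored documents. It is a teaching model only: it handles exact matches plus the `$lt` and `$gt` operators, and the `matches` and `find` helpers are hypothetical names, not the server's implementation.

```python
def matches(doc, query):
    """Return True if doc satisfies every condition in the query document."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$lt": 100}}.
            for op, operand in condition.items():
                if op == "$lt":
                    if value is None or not value < operand:
                        return False
                elif op == "$gt":
                    if value is None or not value > operand:
                        return False
                else:
                    raise ValueError("unsupported operator: %s" % op)
        elif value != condition:
            # Plain form, e.g. {"name": "Bob"}, is an equality test.
            return False
    return True


def find(collection, query=None):
    """Return all documents matching the query (all documents if no query)."""
    return [doc for doc in collection if matches(doc, query or {})]


employees = [
    {"name": "Rick Hightower", "gender": "m", "phone": "520-555-1212", "age": 42},
]

print(find(employees, {"name": "Bob"}))
print(find(employees, {"age": {"$lt": 100}})[0]["name"])
```

As in the shell session, a query for a name that is not stored returns nothing, while the `$lt` query matches the single stored document.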

Java: Setup
• Download latest mongo driver
  – https://github.com/mongodb/mongo-java-driver/downloads

$ mkdir -p tools/mongodb/lib
$ cp mongo-2.7.3.jar tools/mongodb/lib

• Create new Eclipse project in new Workspace


Java: Setup Eclipse (1 of 2)

• Right Click Project, Open Properties, Java Build Path->Libraries->Add Variable->Configure Variable

Java: Setup Eclipse (2 of 2)

• From Properties->Java Build Path->Libraries
• Click Add Variable, Select MONGO, Click Extend…, select jar file you just downloaded

Java: Using Java Basics (1 of 2)

Out:
{ "_id" : { "$oid" : "4f964d3000b5874e7a163895"} , "name" : "Rick Hightower" , "gender" : "m" , "phone" : "520-555-1212" , "age" : 42.0}
{ "_id" : { "$oid" : "4f984cce72320612f8f432bb"} , "name" : "Diana Hightower" , "gender" : "f" , "phone" : "520-555-1212" , "age" : 30}

Java: Using Java Basics (2 of 2)

Output:
Rick?
{ "_id" : { "$oid" : "4f964d3000b5874e7a163895"} , "name" : "Rick Hightower" , "gender" : "m" , "phone" : "520-555-1212" , "age" : 42.0}
Diana?
{ "_id" : { "$oid" : "4f984cae72329d0ecd8716c8"} , "name" : "Diana Hightower" , "gender" : "m" , "phone" : "520-555-1212" , "age" : 30}
Diana by object id?
{ "_id" : { "$oid" : "4f984cce72320612f8f432bb"} , "name" : "Diana Hightower" , "gender" : "f" , "phone" : "520-555-1212" , "age" : 30}

All of the above but in Python
• Install mongodb lib for Python

Mac OS X
$ sudo env ARCHFLAGS='-arch i386 -arch x86_64' python -m easy_install pymongo

Linux
$ easy_install pymongo
or
$ pip install pymongo

Basic Python Operations (1 of 2)

Output:
{u'gender': u'm', u'age': 42.0, u'_id': ObjectId('4f964d3000b5874e7a163895'), u'name': u'Rick Hightower', u'phone': u'520-555-1212'}
{u'gender': u'm', u'age': 30, u'_id': ObjectId('4f984cae72329d0ecd8716c8'), u'name': u'Diana Hightower', u'phone': u'520-555-1212'}
{u'gender': u'm', u'age': 8, u'_id': ObjectId('4f9e111980cbd54eea000000'), u'name': u'Lucas Hightower', u'phone': u'520-555-1212'}


Basic Python Operations (2 of 2)

All of the above but in PHP
• Install on PHP
  $ sudo pecl install mongo
• Add to php.ini:
  extension=mongo.so
• Restart Apache
  $ apachectl stop
  $ apachectl start

Mongo PHP Basics (1 of 2)

Output:
array ( '_id' => MongoId::__set_state(array( '$id' => '4f964d3000b5874e7a163895', )), 'name' => 'Rick Hightower', 'gender' => 'm', 'phone' => '520-555-1212', 'age' => 42, )

array ( '_id' => MongoId::__set_state(array( '$id' => '4f984cae72329d0ecd8716c8', )), 'name' => 'Diana Hightower', 'gender' => 'f', 'phone' => '520-555-1212', 'age' => 30, )

array ( '_id' => MongoId::__set_state(array( '$id' => '4f9e170580cbd54f27000000', )), 'gender' => 'm', 'age' => 8, 'name' => 'Lucas Hightower', 'phone' => '520-555-1212', )

Mongo PHP (2 of 2)

Output:
Rick?
array ( '_id' => MongoId..., 'name' => 'Rick Hightower', 'gender' => 'm', 'phone' => '520-555-1212', 'age' => 42, )
Diana?
array ( '_id' => MongoId::..., 'name' => 'Diana Hightower', 'gender' => 'f', 'phone' => '520-555-1212', 'age' => 30, )
Diana by id?
array ( '_id' => MongoId::..., 'name' => 'Diana Hightower', 'gender' => 'f', 'phone' => '520-555-1212', 'age' => 30, )

Shell commands in action
> show dbs
local (empty)
tutorial 0.203125GB

> show collections
employees
system.indexes

> show users

> show profile
db.system.profile is empty
Use db.setProfilingLevel(2) will enable profiling
..

> show logs
global

> show log global
Mon Apr 23 23:33:14 [initandlisten] MongoDB starting : pid=11773 port=27017 dbpath=/etc/mongodb/data 64-bit…
…
Mon Apr 23 23:33:14 [initandlisten] options: { config: "/etc/mongodb/mongodb.config", dbpath: "/etc/mongodb/data" }

Collection Methods (1 of 2)
> db.employees.help()
DBCollection help
db.employees.find().help() - show DBCursor help
db.employees.count()
db.employees.dataSize()
db.employees.distinct( key ) - eg. db.employees.distinct( 'x' )
db.employees.drop() drop the collection
db.employees.dropIndex(name)
db.employees.dropIndexes()
db.employees.ensureIndex(keypattern[,options]) - options is an object with these possible fields: name, unique, dropDups
db.employees.reIndex()
db.employees.find([query],[fields]) - query is an optional query filter. fields is optional set of fields to return. e.g. db.employees.find( {x:77} , {name:1, x:1} )
db.employees.find(...).count()
db.employees.find(...).limit(n)
db.employees.find(...).skip(n)
db.employees.find(...).sort(...)

Collection Methods (2 of 2)
db.employees.findOne([query])
db.employees.findAndModify( { update : ... , remove : bool [, query: {}, sort: {}, 'new': false] } )
db.employees.getDB() get DB object associated with collection
db.employees.getIndexes()
db.employees.group( { key : ..., initial: ..., reduce : ...[, cond: ...] } )
db.employees.mapReduce( mapFunction , reduceFunction , <optional params> )
db.employees.remove(query)
db.employees.renameCollection( newName , <dropTarget> ) renames the collection.
db.employees.runCommand( name , <options> ) runs a db command with the given name where the first param is the collection name
db.employees.save(obj)
db.employees.stats()
db.employees.storageSize() - includes free space allocated to this collection
db.employees.totalIndexSize() - size in bytes of all the indexes
db.employees.totalSize() - storage allocated for all data and indexes
db.employees.update(query, object[, upsert_bool, multi_bool])
db.employees.validate( <full> ) - SLOW
db.employees.getShardVersion() - only for use with sharding
db.employees.getShardDistribution() - prints statistics about data distribution in the cluster

Basic shell commands (1 of 2)
> help
db.help() help on db methods
db.mycoll.help() help on collection methods
rs.help() help on replica set methods
help admin administrative help
help connect connecting to a db help
help keys key shortcuts
help misc misc things to know
help mr mapreduce

Basic shell commands (2 of 2)
> help
…
show dbs show database names
show collections show collections in current database
show users show users in current database
show profile show most recent system.profile entries time>= 1ms
show logs show the accessible logger names
show log [name] prints out the last segment of log in memory,
use <db_name> set current database
DBQuery.shellBatchSize = x set default number of items to display on shell
exit quit the mongo shell

Tips for scaling mongo
• Roger Bodamer
• http://www.infoq.com/presentations/Scaling-with-MongoDB
• Good ideas on EC2 shortcomings
• RAID configuration (RAID 10 for speed and scaling)
• Config Servers know where keys are and hold the key-to-shard mapping; mongos refers to config servers

Tips for scaling mongo
• Don’t use sharding unless needed
• 90% of deployments don’t need sharding according to Roger Bodamer
  – Are you Twitter, LinkedIn, Facebook, Foursquare? No
  – You’re probably not going to need it
• Replica Sets are more needed
  – Why? HA, read scalability
• mongos can live on primary box
• Config Server can live on a primary box
• http://www.infoq.com/presentations/Scaling-with-MongoDB

Tips for scaling mongo
• Replicas should be on separate boxes

Backups
• You can use tools with MongoDB
• Or, if your hardware supports shutdowns
• Sync to disk, shut down cleanly, take a snapshot
• http://www.infoq.com/presentations/Scaling-with-MongoDB

Coming up
• Basic CRUD and queries: Slide deck showing a simple CRUD listing website in Python, Java and PHP using MySQL and MongoDB
  – Just showing general operations
• Queries: Slide deck improving demo app to do more complex queries
• Replica Set: Slide deck setting up simple Replica Set in Amazon EC2 with boto scripts
  – Durability modes with examples in Python, PHP and Java
• Sharding: Slide deck setting up Sharding in MongoDB
  – Python, PHP and Java
• Map Reduce: Slide deck using sample app to do map reduce in Java, PHP and Python

More to come
• This is an early version of this.... expect updates
• Looking for some feedback
• Want to grow it out, put it on github, etc.
• Do some compare and contrast between MongoDB, MySQL, Cassandra, etc.
• Things you won’t get from a vendor
  – complaining
  – criticism
  – shortcomings

About Author
• Rick Hightower, founder of ArcMind (RIP) and Mammatus, father of five
• Former CTO of LearningPatterns and Trivera Technologies (global training and consulting firms)
• Director of Development at 3 different places
• Author of five technical books
• Editor at InfoQ
• 20 years hacking code (C, Python, C++, Java, etc.)
• https://twitter.com/#!/RickHigh

About Mammatus, ArcMind
• ArcMind focused on Spring framework, JSF, Java EE, Spring MVC, Grails, Groovy, Development, Tiles, etc.
  – Started in 2003, ended in 2009
  – Lots of clients (Boeing, Qualcomm, Bank of America, etc.), lots of traveling
• Mammatus Tech (bumpy clouds) focuses on Cloud computing, NoSQL, BigData, Map Reduce, Java, PHP, Python, EC2, etc.
  – Started in 2009
  – http://cloud.mammatustech.com/
  – http://nosql.mammatustech.com/
  – http://www.mammatustech.com/