MongoDB tutorials

An Introduction to MongoDB

Anuj Jain Equal Experts India

MongoDB

NoSQL

Key-value

Graph database

Document-oriented

Column family

The Great Divide

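Not an RDBMS

• Mongo is not a relational database like MySQL or Oracle.

• No multi-document transactions (these were added later, in MongoDB 4.0).

• No referential integrity.

• No joins (the $lookup aggregation stage was added later, in MongoDB 3.2).

• No schema, so no columns or rows.

• NoSQL.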

What is MongoDB?

• Scalable, high-performance, open-source, document-oriented database.

• Built for Speed

• Rich Document based queries for Easy readability.

• Full Index Support for High Performance.

• Replication and Failover for High Availability.

• Auto Sharding for Easy Scalability.

• Map / Reduce for Aggregation

Quiz?

Which of the following statements are true about MongoDB?

1. MongoDB is document oriented.

2. MongoDB supports Joins.

3. MongoDB has dynamic schema.

4. MongoDB supports SQL.

What is MongoDB great for?

• Semi structured Content Management.

• Real time Analytics and High-Speed Logging.

• Caching and High Availability.

• Mobile and Social Infrastructure, Big Data etc.

Some considerations while designing schema in MongoDB

• Combine objects into one document if you will use them together; otherwise, separate them (but make sure joins will not be needed).

• Duplicate data (within limits), because disk space is cheap compared to compute time.

• Do joins on write, not on read.

• Optimize your schema for the most frequent use cases.

• Do complex aggregation in the schema.

Example – Blog Post

• Every post has a unique title, description and url.

• Every post can have one or more tags.

• Every post has the name of its publisher and the total number of likes.

• Every post has comments given by users, along with their name, message, date-time and likes.

• On each post there can be zero or more comments.

RDBMS

Mongo Schema

{
  "_id": ObjectId("55b1f50899708bec87f96edc"),
  "title": "MongoDB Tutorial for beginner",
  "description": "How to start using mongodb",
  "by": "Anuj Jain",
  "url": "http://mongodbtutorial.com/blog/mongodb",
  "tags": ["mongodb", "nosql"],
  "likes": 200,
  "comments": [
    {
      "user": "MongoUser",
      "message": "Very Nice Tutorial",
      "dateCreated": NumberLong(1437725960469),
      "like": true
    }
  ]
}
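The "joins on write" advice can be sketched in plain JavaScript, outside the database. This is only an illustration under stated assumptions: `addComment` and the `commentCount` field are hypothetical names, not part of the schema above. The idea is that a comment is embedded in the post when it is written, so that reading a post needs a single document and no join.

```javascript
// Sketch: embed the comment in the post at write time and keep a
// precomputed counter, so a read touches one document and needs no join.
// addComment and commentCount are hypothetical names for this example.
function addComment(post, comment) {
  return {
    ...post,
    comments: [...post.comments, comment],
    commentCount: post.commentCount + 1,
  };
}

const post = {
  title: "MongoDB Tutorial for beginner",
  likes: 200,
  comments: [],
  commentCount: 0,
};

const updated = addComment(post, {
  user: "MongoUser",
  message: "Very Nice Tutorial",
  like: true,
});

console.log(updated.commentCount, updated.comments[0].user);
```

In a real deployment the same effect would come from a single update on the post document rather than from copying objects in application code.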

Quiz?

How many different data types are there in JSON?

1. 4

2. 5

3. 6

4. 7

Answer

Ans: 6

1. String

2. Number

3. Boolean

4. null

5. Array

6. Object/document
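The six types can be checked with a small classifier over parsed JSON. This is a sketch, not part of any MongoDB API: `jsonType` is a hypothetical helper, and the sample document is made up for the example.

```javascript
// Classify a parsed JSON value into one of the six JSON types.
// jsonType is a hypothetical helper written for this example.
function jsonType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  switch (typeof value) {
    case "string":
      return "string";
    case "number":
      return "number";
    case "boolean":
      return "boolean";
    case "object":
      return "object";
    default:
      throw new TypeError("not a JSON value: " + typeof value);
  }
}

const parsed = JSON.parse(
  '{"title":"MongoDB","likes":200,"tags":["nosql"],"draft":false,"editor":null}'
);

// Map every field name to the JSON type of its value.
const types = Object.fromEntries(
  Object.entries(parsed).map(([key, value]) => [key, jsonType(value)])
);

console.log(jsonType(parsed), types);
```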

CRUD

Create

db.collection.insert( <document> )

db.collection.save( <document> )

db.collection.update( <query>, <update>, { upsert: true } )

Read

db.collection.find( <query>, <projection> )

db.collection.findOne( <query>, <projection> )

Update

db.collection.update( <query>, <update>, <options> )

Delete

db.collection.remove( <query>, <justOne> )

Some Other Operators

1. $and

2. $or

3. $in

4. $nin

5. $exists

6. $push

7. $pop

8. $addToSet
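To make the meaning of these operators concrete, here is a simplified in-memory sketch in plain JavaScript. It is not how MongoDB evaluates queries: `matches` and `applyUpdate` are hypothetical helpers, only scalar fields are matched, and only the operators listed above are handled. The first five are query operators; `$push`, `$pop` and `$addToSet` are update operators on array fields.

```javascript
// Simplified illustration of operator semantics for scalar fields.
// matches and applyUpdate are hypothetical helpers, not MongoDB APIs.
function matchesField(value, condition) {
  const isOperatorObject =
    condition !== null && typeof condition === "object" && !Array.isArray(condition);
  if (!isOperatorObject) return value === condition; // plain equality
  return Object.entries(condition).every(([op, arg]) => {
    switch (op) {
      case "$in":
        return arg.includes(value);
      case "$nin": // also true when the field is missing
        return !arg.includes(value);
      case "$exists":
        return (value !== undefined) === arg;
      default:
        throw new Error("operator not covered by this sketch: " + op);
    }
  });
}

function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === "$and") return condition.every((q) => matches(doc, q));
    if (key === "$or") return condition.some((q) => matches(doc, q));
    return matchesField(doc[key], condition);
  });
}

function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // work on a copy
  for (const [op, fields] of Object.entries(update)) {
    for (const [field, arg] of Object.entries(fields)) {
      const current = Array.isArray(result[field]) ? result[field] : [];
      if (op === "$push") {
        result[field] = [...current, arg];
      } else if (op === "$addToSet") {
        result[field] = current.includes(arg) ? current : [...current, arg];
      } else if (op === "$pop") {
        // 1 removes the last element, -1 removes the first one
        if (Array.isArray(result[field])) {
          result[field] = arg === -1 ? current.slice(1) : current.slice(0, -1);
        }
      } else {
        throw new Error("operator not covered by this sketch: " + op);
      }
    }
  }
  return result;
}

const posts = [
  { title: "A", by: "Anuj", likes: 200 },
  { title: "B", by: "Guest", likes: 5, url: "http://example.com/b" },
  { title: "C", likes: 50 },
];
const titles = (query) => posts.filter((p) => matches(p, query)).map((p) => p.title);

console.log(titles({ $or: [{ by: { $in: ["Anuj"] } }, { url: { $exists: true } }] }));
console.log(titles({ by: { $nin: ["Anuj"] } }));

const deduped = applyUpdate({ tags: ["mongodb"] }, { $addToSet: { tags: "mongodb" } });
const tagged = applyUpdate(deduped, { $push: { tags: "nosql" } });
console.log(tagged.tags);
```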

Indexes

1. Indexes are special data structures that store a small portion of the data set in an easy-to-traverse form.

2. An index stores the value of a specific field or set of fields.

3. Entries are ordered by the value of the field, as specified in the index.

4. Indexes can speed up read operations but slow down write operations.

5. MongoDB uses a B-tree data structure to store indexes.

6. Building an index blocks the mongod process unless you specify the background option.

7. If the indexed field is missing from a document, null is assumed.
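The read/write trade-off in points 3 and 4 can be sketched with a sorted array standing in for the index. This is an assumption made for brevity: MongoDB uses a B-tree, not a sorted array, and `FieldIndex` is a hypothetical class that only handles documents where the indexed field is present and numeric.

```javascript
// Sorted-array stand-in for a single-field index (MongoDB uses a B-tree).
// FieldIndex is a hypothetical class written for this example.
class FieldIndex {
  constructor(field) {
    this.field = field;
    this.entries = []; // { key, position }, kept ordered by key
  }

  // Index of the first entry whose key is >= the given key (binary search).
  lowerBound(key) {
    let lo = 0;
    let hi = this.entries.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.entries[mid].key < key) lo = mid + 1;
      else hi = mid;
    }
    return lo;
  }

  // Every write pays the cost of keeping the entries ordered.
  insert(doc, position) {
    const key = doc[this.field];
    this.entries.splice(this.lowerBound(key), 0, { key, position });
  }

  // A read jumps straight to the matching keys instead of scanning everything.
  find(key) {
    const positions = [];
    for (let i = this.lowerBound(key); i < this.entries.length; i++) {
      if (this.entries[i].key !== key) break;
      positions.push(this.entries[i].position);
    }
    return positions.sort((a, b) => a - b);
  }
}

const people = [
  { name: "Anuj", age: 28 },
  { name: "Ravi", age: 35 },
  { name: "Mira", age: 28 },
  { name: "Zed", age: 19 },
];

const ageIndex = new FieldIndex("age");
people.forEach((doc, position) => ageIndex.insert(doc, position));

console.log(ageIndex.entries.map((e) => e.key));
console.log(ageIndex.find(28).map((position) => people[position].name));
```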

When To Index?

1. Frequently Queried Fields

2. Low response time

3. Sorting

4. Avoid full collection scans.

Index Types

1. Default (_id)

2. Single Field

3. Compound Index

4. Multikey Index

5. Geospatial Index

6. Sparse Index

7. TTL Index

Quiz?

A MongoDB index can have keys of different types (ints, dates, strings, for example) in it?

1. True

2. False

Covered Indexes

Queries that can be resolved with only the index (no need to fetch the original document).

Example:

{
  "name": "Anuj",
  "age": 28,
  "gender": "Male",
  "skills": ["Java", "Mongo"]
}

db.people.ensureIndex({"name": 1, "age": 1})

db.people.find({"name": "Anuj"}, {"_id": 0, "age": 1})
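The rule behind the example can be written down as a small check. This is a simplified sketch: `isCoveredQuery` is a hypothetical helper, it ignores array (multikey) fields and embedded documents, and it treats a projection without any included field as returning the whole document. The key point is that `_id` is returned unless it is explicitly excluded, which is why the example projects `"_id": 0`.

```javascript
// Simplified check: a query is covered when every queried field and every
// returned field is part of the index. isCoveredQuery is hypothetical.
function isCoveredQuery(indexSpec, query, projection) {
  const indexed = new Set(Object.keys(indexSpec));

  const queryCovered = Object.keys(query).every((field) => indexed.has(field));

  // Fields explicitly included by the projection (other than _id).
  const included = Object.keys(projection).filter(
    (field) => field !== "_id" && Boolean(projection[field])
  );
  if (included.length === 0) return false; // treated as "whole document"

  // _id comes back unless the projection excludes it.
  const idExcluded = projection._id === 0 || projection._id === false;
  const returned = idExcluded ? included : [...included, "_id"];

  return queryCovered && returned.every((field) => indexed.has(field));
}

const index = { name: 1, age: 1 };

console.log(isCoveredQuery(index, { name: "Anuj" }, { _id: 0, age: 1 }));
console.log(isCoveredQuery(index, { name: "Anuj" }, { age: 1 }));
```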

TTL Indexes ("time to live")

1. mongod automatically removes documents from the collection after a specified number of seconds.

2. The indexed field should be either a BSON date or an array of BSON dates.

E.g. db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )

where createdAt is a date field.
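The expiry rule can be sketched as a predicate in plain JavaScript. `isExpired` is a hypothetical helper, not the server's code. It assumes the documented behaviour that a document without a date in the indexed field never expires, and that for an array of dates the earliest one decides. The real removal is done by a background task that runs periodically, so documents may outlive their expiry time slightly.

```javascript
// Sketch of the TTL rule: a document expires once its date value is
// older than expireAfterSeconds. isExpired is a hypothetical helper.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  const dates = (Array.isArray(value) ? value : [value]).filter(
    (d) => d instanceof Date
  );
  if (dates.length === 0) return false; // no date value: never expires

  // With an array of dates, the earliest one is used.
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return earliest + expireAfterSeconds * 1000 <= now.getTime();
}

const now = new Date("2015-07-24T10:00:00Z");
const oldEvent = { createdAt: new Date("2015-07-24T08:59:00Z") };
const freshEvent = { createdAt: new Date("2015-07-24T09:30:00Z") };

console.log(isExpired(oldEvent, "createdAt", 3600, now));
console.log(isExpired(freshEvent, "createdAt", 3600, now));
```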

Quiz?

Suppose we run:

db.foo.ensureIndex ({a:1, b:2, c:3})

db.foo.find({a : "sports", b:{$gt : 100}})

Then

1. Only the index needs to be touched to fully execute the query.

2. The index and some documents need to be touched.

Why Replication?

• To keep your data safe.

• High (24*7) availability of data.

• Disaster Recovery.

• Read scaling (extra copies to read from).

• No downtime for maintenance (like backups, index rebuilds, compaction).

Replica set features

• A cluster of N nodes.

• Any one node can be primary.

• All write operations go to the primary.

• Automatic fail-overs.

• Automatic Recovery.

• Consensus election of primary.
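The consensus election can be illustrated with a little arithmetic: a primary is elected only when a majority of the voting members can reach each other. The helper names below are hypothetical, and the sketch ignores priorities, arbiters and non-voting members.

```javascript
// Majority arithmetic behind primary elections (hypothetical helpers).
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many members can be lost while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(n, majority(n), faultTolerance(n));
}
```

A set of four members tolerates no more failures than a set of three, which is why odd member counts are the usual choice.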

Capped Collection

• Fixed-size circular queues that follow the insertion order.

• The fixed size is preallocated, and when it is exhausted, the oldest documents are automatically deleted to make room.

• We cannot delete documents from a capped collection.

• Before MongoDB 2.2, capped collections had no default indexes, not even on the _id field; since 2.2 they have an index on _id by default.
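The circular-queue behaviour can be sketched in a few lines of plain JavaScript. `CappedList` is a hypothetical class and it simplifies the real thing: it caps by document count only, whereas a capped collection is limited by size in bytes and optionally by a maximum number of documents.

```javascript
// Sketch of capped-collection behaviour, capped by document count only.
// CappedList is a hypothetical class written for this example.
class CappedList {
  constructor(maxDocs) {
    if (!Number.isInteger(maxDocs) || maxDocs <= 0) {
      throw new RangeError("maxDocs must be a positive integer");
    }
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  // Inserting into a full list silently drops the oldest document.
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) this.docs.shift();
  }

  // Documents come back in insertion order.
  find() {
    return [...this.docs];
  }

  // Newest first, comparable to sort({$natural: -1}).
  findReverse() {
    return [...this.docs].reverse();
  }
}

const log = new CappedList(3);
for (let i = 1; i <= 5; i++) log.insert({ event: i });

console.log(log.find().map((d) => d.event));
console.log(log.findReverse().map((d) => d.event));
```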

Capped Collection Commands:

1. db.createCollection("cappedcollection", {capped: true, size: 10000})

2. db.createCollection("cappedcollection", {capped: true, size: 10000, max: 1000})

3. db.cappedLogCollection.isCapped()

4. db.runCommand({"convertToCapped":"posts",size:10000})

5. db.cappedLogCollection.find().sort({$natural:-1})

Query Limitations:

Indexes cannot be used efficiently by queries which use:

1. Regular expressions or negation operators like $nin, $not, etc.

2. Arithmetic operators like $mod, etc.

3. $where clause

Hence, it is always advisable to check the index usage for your queries.

Index Limitation

Maximum Ranges:

• A collection cannot have more than 64 indexes.

• The length of the index name cannot be longer than 125 characters.

• A compound index can have a maximum of 31 fields indexed.

$explain

The $explain operator provides information and statistics on the query, for example:

1. Indexes used by the query.

2. Number of documents scanned in serving the query.

3. Whether the index alone is enough to serve the query, i.e. a covered index.

Usage :

db.users.find({gender:"M"}, {user_name:1,_id:0} ).explain()

$hint

The $hint operator forces the query optimizer to use the specified index to run a query.

db.users.find({gender:"M"},{user_name:1,_id:0}).hint({gender:1,user_name:1})

Backup & Restore

Backup Utilities

1. mongodump (used to dump a complete data directory or db)

2. mongoexport (used to dump a particular collection to an output file such as JSON or CSV)

Restore Utilities

1. mongorestore

2. mongoimport

ObjectId

An ObjectId is a 12-byte BSON type with the following structure:

The first 4 bytes represent the seconds since the Unix epoch.

The next 3 bytes are the machine identifier.

The next 2 bytes are the process id.

The last 3 bytes are a counter that starts from a random value.
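The layout above can be checked by splitting the 24-character hex form of an ObjectId by hand. This sketch uses plain JavaScript rather than a driver; `parseObjectId` is a hypothetical helper, and the sample value is the `_id` from the blog post document earlier in this tutorial.

```javascript
// Split the hex form of an ObjectId into the four parts described above.
// parseObjectId is a hypothetical helper, not a driver function.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("expected 24 hexadecimal characters");
  }
  const timestampSeconds = parseInt(hex.slice(0, 8), 16); // bytes 0-3
  return {
    timestampSeconds,
    createdAt: new Date(timestampSeconds * 1000),
    machineId: hex.slice(8, 14), // bytes 4-6
    processId: parseInt(hex.slice(14, 18), 16), // bytes 7-8
    counter: parseInt(hex.slice(18, 24), 16), // bytes 9-11
  };
}

const parts = parseObjectId("55b1f50899708bec87f96edc");
console.log(parts.createdAt.toISOString(), parts.machineId, parts.processId, parts.counter);
```

The decoded timestamp falls in the same second as the `dateCreated` value of the comment in that document, which shows that the creation time of a document can be recovered from its `_id` alone.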

Thank you

References: https://www.mongodb.org/