23
Aggregation Framework in MongoDB Overview - Part-1

Aggregation Framework in MongoDB Overview Part-1

Embed Size (px)

Citation preview

Page 1: Aggregation Framework in MongoDB Overview Part-1

Aggregation Framework in MongoDB

Overview - Part-1

Page 2: Aggregation Framework in MongoDB Overview Part-1

What is Aggregation

English Definition:- The act of gathering something together.

Database Definition:- Aggregation are operations that process on the data sets, group some data records, do some computation on that records and return computed results.

Page 3: Aggregation Framework in MongoDB Overview Part-1

Aggregration Approach in MongoDB

● Aggregation Pipeline

● Map-Reduce

● Single Purpose Aggregation Operations

● Hadoop Connector

Page 4: Aggregation Framework in MongoDB Overview Part-1

Aggregation-Pipeline

Similar to pipeline in UNIX.

In Unix -

Cat file.txt | grep abc | wc -l

In MongoDB -

Group Limit SortCollection Output

Page 5: Aggregation Framework in MongoDB Overview Part-1

MapReduce

● Two Phase process – Mapper & Reducer.● Use JavaScript function for map-reduce

job.● Less efficient as it run over only one thread.● Finalise stage to make final modification.

Page 6: Aggregation Framework in MongoDB Overview Part-1
Page 7: Aggregation Framework in MongoDB Overview Part-1

Aggregarion Vs. MapReduce

Easy to use, just need to use the buildin operators.

Complex and steep learning curve.

Supports non-sharded and sharded input collections.

Supports non-sharded and sharded input collections.

Returns results inline. Return result in inline, new collection, merge, replace, reduce.

Limited to the operators and expressions supported by the aggregation pipeline.

Custom map, reduce and finalize JavaScript functions offer flexibility to aggregation logic.

Page 8: Aggregation Framework in MongoDB Overview Part-1

Single Purpose Aggregation Operation

● Applicable for single purpose of aggregation, like count, distinct, group.

● Limited scope as compared to aggregation pipeline and map-reduce.

● group does not support data in sharded collections, and result of operation should not be more than 16 MB.

Page 9: Aggregation Framework in MongoDB Overview Part-1

Aggregation Pipeline Operators

● $group● $match● $project● $limit● $sort● $unwind● $skip

Page 10: Aggregation Framework in MongoDB Overview Part-1

SQL to Aggregation Mapping Chart

SQL Operators Aggregation Operators

WHERE $match

GROUP BY $group

SELECT $project

LIMIT $limit

ORDER BY $sort

SUM() $sum

COUNT() $sum

JOIN $unwind (Not exact operator as JOIN works, but unwinds the array embedded in the document)

Page 11: Aggregation Framework in MongoDB Overview Part-1

Examples

SQL Query Aggregation Query

SELECT COUNT(*) AS COUNT FROM EMPLOYEE

db.employee.aggregate([{$group: {_id:null, count: { $sum:1} } }])

Explaination : Group by : on nothing Sum : Just add 1 to count field for each record

SELECT SUM(SALARY) AS TOTALFROM EMPLOYEE

db.employee.aggregate([{$group: {_id:null, total: { $sum:”$salary”} } }])

Explaination : Group by : on nothing Sum: do the sum of value of salary field of each doc and put result in total.

Page 12: Aggregation Framework in MongoDB Overview Part-1

Example Cont.

SELECT DEPARTMENT_ID, SUM(SALARY) AS TOTAL FROM EMPLOYEE GROUP BY DEPARTMENT_ID ORDER BY TOTAL

db.employee.aggregate( [{ $group: { _id: “$department_id”, total: { $sum:”$salary” } }, { $sort: { total : 1 } } }] )

Explanation:Group by : department,Sum : salary,Order by : total of salary for each department

Page 13: Aggregation Framework in MongoDB Overview Part-1

Example Cont.

SELECT DEPT_ID, SUM(SALARY) AS TOTAL FROM EMPLOYEE WHERE AGE>25 GROUB BY DEPT_ID, HAVING TOTAL > 5000

db.employee.aggregate( [{ $match : { age : {$gt: 25 } } }{ $group: { _id: “$dept_id”, total: { $sum: ”$salary” } } },{ $match: { total: {$gt:5000 } } }] )

Explanation:Group by : department id,Sum : salary,Having: on total of each department salary

Page 14: Aggregation Framework in MongoDB Overview Part-1

Example Cont.

SELECT DEPT_ID, SUM(SALARY) AS TOTAL FROM EMPLOYEE WHERE AGE>25 GROUB BY DEPT_ID, HAVING TOTAL > 5000

db.employee.aggregate( [{ $match : { age : {$gt: 25 } } }{ $group: { _id: “$dept_id”, total: { $sum: ”$salary” } } },{ $match: { total: {$gt:5000 } } }] )

Explanation:Group by : department id,Sum : salary,Having: on total of each department salary

Page 15: Aggregation Framework in MongoDB Overview Part-1

$unwind Operator

Decompose the embedded array into flat document and relate each entry in the array with outer fields.

Example :-

{ _id: “blog”, tags: [ “social”, “economic” ] }

{ _id: “blog”, { _id: “blog”

tags: “social” } tags: “economic” }

Page 16: Aggregation Framework in MongoDB Overview Part-1

Mongo Aggregation Optimization

MongoDB re-arranged the pipeline operations to optimize the aggregation performance.

● Pipeline Sequence Optimization.● Projection Optimization.

Page 17: Aggregation Framework in MongoDB Overview Part-1

Mongo Aggregation Optimization

MongoDB re-arranged the pipeline operations to optimize the aggregation performance.

● Pipeline Sequence Optimization.● Projection Optimization.

Page 18: Aggregation Framework in MongoDB Overview Part-1

Pipeline Sequence Optimization

● $sort + $skip + $limit

{ $sort: { salary : -1 } }, { $sort: { salary: -1 } },

{ $skip: 10 }, { $limit: 20 },

{ $limit: 10 } { $skip: 10}

Page 19: Aggregation Framework in MongoDB Overview Part-1

Projection Optimization

● $project● Reduce the amount of data passing

through channels of operation and will help in performace improvement.

● Below example will only emit salary.

db.employee.aggregate(

[ {$match: {“name”: “xyz” } }

{ $project: { salary:1, _id:0} } ]

)

Page 20: Aggregation Framework in MongoDB Overview Part-1

Restriction

● Output BSON document cannot exceed the 16 MB of data. If exceed will throw error.

● If single aggregation operation consumes more than 10 percent of system RAM, the operation will produce error.

Page 21: Aggregation Framework in MongoDB Overview Part-1

Aggregation Examples

● Download data from the below link:http://media.mongdb.org/zips.json

● Practice sample problem from the below

links:http://docs.mongodb.org/manual/tutorial/aggregatio-zip-code-data-set/

Page 22: Aggregation Framework in MongoDB Overview Part-1

References

● http://docs.mongodb.org/

Good Examples:-

● http://rubayeet.wordpress.com/2013/12/29/web-analytics -using-mongodb-aggregation-framework/

● http://derickrethans.nl/aggregation-framework.html

● http://architects.dzone.com/articles/using-mongodb -aggregation