Webinar: Exploring the Aggregation Framework

Exploring the Aggregation Framework

Jason Mimick - Senior Consulting Engineer
jason.mimick@mongodb.com @jmimick

Original Slide Credits: Jay Runkel (jay.runkel@mongodb.com) et al.

Warning or Whew
This is a “101” beginner talk!
Assuming you know some basics about MongoDB
But basically nothing about the Aggregation Framework

Agenda

1. Analytics in MongoDB?
2. Aggregation Framework
3. Aggregation Framework in Action
   – US Census Data
   – Aggregation Framework Options
4. New 3.2 stuff
   – Friends of friends: $lookup for self-joins

Analytics in MongoDB?

Create, Read, Update, Delete

Analytics?

Group, Count, Derive Values, Filter, Average, Sort

For Example: US Census Data
• Census data from 1990, 2000, 2010
• Question: Which US Division has the fastest growing population density?
  – We only want to include data for states with more than 1M people
  – We only want to include divisions larger than 100K square miles

Division = a group of US States
Population density = # of people / area of division
Data is provided at the state level
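Population density is people per unit of area, and "fastest growing" means the largest increase in that ratio between two census years. A minimal plain-JavaScript sketch of the arithmetic, assuming the South Atlantic totals reported in the results later in this talk (area rounded to one decimal); the function names are made up for illustration:

```javascript
// Density = number of people / area (people per square mile).
function density(population, areaSqMiles) {
  return population / areaSqMiles;
}

// Growth in density between two census years over the same area.
function densityDelta(popStart, popEnd, areaSqMiles) {
  return density(popEnd, areaSqMiles) - density(popStart, areaSqMiles);
}

// South Atlantic division totals, taken from the final results in this talk.
const southAtlantic = { pop1990: 42293785, pop2010: 58277380, areaM: 290433.4 };

const delta = densityDelta(
  southAtlantic.pop1990,
  southAtlantic.pop2010,
  southAtlantic.areaM
);
console.log(delta.toFixed(2)); // roughly 55 more people per square mile
```

The aggregation pipeline built up in the rest of the talk computes the same ratio per division inside the database.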


US Regions and Divisions


How would we solve this in SQL?

SELECT GROUP BY HAVING

Of course, we don’t have SQL

we’re a noSQL database

The Aggregation Framework

Core Concept: Pipeline

ps -ef | grep mongod

What is the Aggregation Pipeline?

A Series of Document Transformations
– Executed in stages
– Original input is a collection
– Output as a cursor or a collection

Rich Library of Functions
– Filter, compute, group, and summarize data
– Output of one stage sent to input of next
– Operations executed in sequential order
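The "output of one stage sent to input of next" idea can be imitated outside the database. The following is only a plain-JavaScript sketch of the concept, not how MongoDB implements it; the helper names (match, sortBy, limit, runPipeline) are hypothetical and the documents are the toy values used in this talk's diagrams:

```javascript
// Each "stage" is a function from an array of documents to a new array.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));
const limit = (n) => (docs) => docs.slice(0, n);

// Run the stages in order, feeding each stage's output to the next one.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Toy documents (not real census figures).
const states = [
  { state: "New York", areaM: 218, region: "North East" },
  { state: "New Jersey", areaM: 90, region: "North East" },
  { state: "California", areaM: 300, region: "West" },
];

// Filter, then sort descending by area, then keep the first document.
const largestNorthEast = runPipeline(states, [
  match((d) => d.region === "North East"),
  sortBy("areaM", -1),
  limit(1),
]);
console.log(largestNorthEast);
```

Because every stage only sees the previous stage's output, reordering stages (for example, filtering before sorting) changes how much work later stages have to do.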

An Example Aggregation Pipeline

Syntax

> db.foo.aggregate( [ { stage1 }, { stage2 }, { stage3 }, … ] )     (mongo shell)

1. db - variable pointing to current database
2. foo - collection name
3. aggregate - method on collection
4. [ … ] - array of objects, each a pipeline operator
5. { stage1 }, { stage2 }, … - pipeline operators

Syntax - Driver - Java

db.hospital.aggregate( [
    { "$group" : { "_id" : "$PatientID", "count" : { "$sum" : 1 } } },
    { "$match" : { "count" : { "$gte" : 5 } } },
    { "$sort" : { "count" : -1 } }
] )

Some Popular Pipeline Operators

$match          Filter documents
$project        Reshape documents
$group          Summarize documents
$unwind         Expand arrays in documents
$sort           Order documents
$limit/$skip    Paginate documents
$redact         Restrict documents
$geoNear        Proximity sort documents
$let, $map      Define variables


80+ operators available as of MongoDB 3.2

Aggregation Framework in Action
(let’s play with the census data)

cData Collection
• Document For Each State
  – Name
  – Region
  – Division
• Census Data For 1990, 2000, 2010
  – Population
  – Housing Units
  – Occupied Housing Units
• Census Data is an array with three subdocuments

Count, Distinct

• Check out cData docs
• count()
• distinct()

When you start building your aggregations you need to ‘get to know’ your data!

Simple $group

Census data has a collection called regions

> db.regions.findOne()
{
    "_id" : ObjectId("54d0e1ac28099359f5660f9f"),
    "state" : "Connecticut",
    "region" : "Northeast",
    "regNum" : 1,
    "division" : "New England",
    "divNum" : 1
}

How can we find out how many states are in each region?

> db.regions.aggregate( [
    { "$group" : { "_id" : "$region", "count" : { "$sum" : 1 } } }
] )
{ "_id" : "West", "count" : 13 }
{ "_id" : "South", "count" : 17 }
{ "_id" : "Midwest", "count" : 12 }
{ "_id" : "Northeast", "count" : 9 }

// make more readable - store your pipeline ops in variables
> var group = { "$group" : { "_id" : "$region", "count" : { "$sum" : 1 } } };
> db.regions.aggregate( [ group ] )

$group
• Group documents by value
  – _id - field reference, object, constant
  – Other output fields are computed
    • $max, $min, $avg, $sum
    • $addToSet, $push
    • $first, $last
  – Processes all data in memory by default


Total US Area

Back to cData…

Can we use $group to find the total area of the US (according to these data)?

db.cData.aggregate( [
    { "$group" : { "_id" : null,
                   "totalArea" : { $sum : "$areaM" },
                   "avgArea" : { $avg : "$areaM" } } }
] )
{ "_id" : null, "totalArea" : 3802067.0700000003, "avgArea" : 73116.67442307693 }

Area By Region

db.cData.aggregate( [
    { "$group" : { "_id" : "$region",
                   "totalArea" : { $sum : "$areaM" },
                   "avgArea" : { $avg : "$areaM" },
                   "numStates" : { $sum : 1 },
                   "states" : { $push : "$name" } } }
] )
{ "_id" : null, "totalArea" : 5393.18, "avgArea" : 2696.59, "numStates" : 2, "states" : [ "District of Columbia", "Puerto Rico" ] }
{ "_id" : "Northeast", "totalArea" : 181319.86, "avgArea" : 20146.65111111111, "numStates" : 9, "states" : [ "New Jersey", "Vermont", "Maine", "New Hampshire", "Rhode Island", "Pennsylvania", "Connecticut", "Massachusetts", "New York" ] }
{ "_id" : "Midwest", "totalArea" : 821724.3700000001, "avgArea" : 68477.03083333334, "numStates" : 12, "states" : [ "Iowa", "Missouri", "Ohio", "Indiana", "North Dakota", "Wisconsin", "Illinois", "Minnesota", "Kansas", "South Dakota", "Michigan", "Nebraska" ] }
{ "_id" : "West", "totalArea" : 1873251.6300000001, "avgArea" : 144096.27923076923, "numStates" : 13, "states" : [ "Colorado", "Wyoming", "California", "Utah", "Nevada", "Alaska", "Hawaii", "Montana", "New Mexico", "Arizona", "Idaho", "Oregon", "Washington" ] }
{ "_id" : "South", "totalArea" : 920378.03, "avgArea" : 57523.626875, "numStates" : 16, "states" : [ "Alabama", "Georgia", "Maryland", "South Carolina", "Florida", "Mississippi", "Arkansas", "Louisiana", "North Carolina", "Texas", "West Virginia", "Oklahoma", "Virginia", "Delaware", "Kentucky", "Tennessee" ] }

Calculating Average State Area By Region

Input documents:
{ state: "New York", areaM: 218, region: "North East" }
{ state: "New Jersey", areaM: 90, region: "North East" }
{ state: "California", areaM: 300, region: "West" }

Stage:
{ $group: { _id: "$region", avgAreaM: { $avg: "$areaM" } } }

Output documents:
{ _id: "North East", avgAreaM: 154 }
{ _id: "West", avgAreaM: 300 }

Calculating Total Area and State Count

Input documents:
{ state: "New York", areaM: 218, region: "North East" }
{ state: "New Jersey", areaM: 90, region: "North East" }
{ state: "California", areaM: 300, region: "West" }

Stage:
{ $group: { _id: "$region", totArea: { $sum: "$areaM" }, sCount: { $sum: 1 } } }

Output documents:
{ _id: "North East", totArea: 308, sCount: 2 }
{ _id: "West", totArea: 300, sCount: 1 }
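What the two $group diagrams above compute can be imitated in plain JavaScript. This is a sketch of the idea only, assuming the same three toy documents; groupByRegion is a hypothetical helper, not a MongoDB API:

```javascript
// Imitates a $group keyed on "$region" with $sum, a counting $sum: 1, and $avg.
function groupByRegion(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.region)) {
      groups.set(doc.region, { _id: doc.region, totArea: 0, sCount: 0 });
    }
    const group = groups.get(doc.region);
    group.totArea += doc.areaM; // { $sum: "$areaM" }
    group.sCount += 1;          // { $sum: 1 }
  }
  // { $avg: "$areaM" } is the running sum divided by the running count.
  return [...groups.values()].map((g) => ({ ...g, avgAreaM: g.totArea / g.sCount }));
}

// Toy documents from the diagrams (not real census figures).
const toyStates = [
  { state: "New York", areaM: 218, region: "North East" },
  { state: "New Jersey", areaM: 90, region: "North East" },
  { state: "California", areaM: 300, region: "West" },
];

const byRegion = groupByRegion(toyStates);
console.log(byRegion);
```

Note that a document missing the areaM field would add undefined here and produce NaN, whereas the real $sum ignores non-numeric values; the sketch assumes every document carries the field.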

Total US Population By Year

db.cData.aggregate( [
    { $unwind : "$data" },
    { $group : { "_id" : "$data.year", "totalPop" : { $sum : "$data.totalPop" } } },
    { $sort : { "totalPop" : 1 } }
] )
{ "_id" : 1990, "totalPop" : 248709873 }
{ "_id" : 2000, "totalPop" : 281421906 }
{ "_id" : 2010, "totalPop" : 312471327 }

$unwind
• Flattens arrays
• Create documents from array elements
  • Array replaced by element value
  • Missing/empty fields → no output
  • Non-array fields → error (before 3.2; 3.2 treats them as a single-element array)
• Pipe to $group to aggregate

{ "a" : "foo", "b" : [1, 2, 3] }

becomes

{ "a" : "foo", "b" : 1 }
{ "a" : "foo", "b" : 2 }
{ "a" : "foo", "b" : 3 }

$unwind

Input documents:
{ state: "New York", census: [1990, 2000, 2010] }
{ state: "New Jersey", census: [1990, 2000] }
{ state: "California", census: [1980, 1990, 2000, 2010] }
{ state: "Delaware", census: [1990, 2000] }

Stage:
{ $unwind: "$census" }

Output documents (first five shown):
{ state: "New York", census: 1990 }
{ state: "New York", census: 2000 }
{ state: "New York", census: 2010 }
{ state: "New Jersey", census: 1990 }
{ state: "New Jersey", census: 2000 }
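The flattening shown in the diagram above can be sketched in plain JavaScript. This is an illustration only, not MongoDB's implementation; unwind is a hypothetical helper, and it assumes the 3.2 behaviour of treating a non-array value as a one-element array (earlier versions raised an error instead). The "Nowhere" document is invented to show the empty-array case:

```javascript
// One output document per array element; the array field is replaced by the element.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined || value === null) continue; // missing field: no output
    // Assumption: follow 3.2 and treat a scalar as a one-element array.
    const elements = Array.isArray(value) ? value : [value];
    for (const element of elements) {
      out.push({ ...doc, [field]: element });
    }
  }
  return out;
}

const censusDocs = [
  { state: "New York", census: [1990, 2000, 2010] },
  { state: "New Jersey", census: [1990, 2000] },
  { state: "Nowhere", census: [] }, // hypothetical: empty array yields no documents
];

const flattened = unwind(censusDocs, "census");
console.log(flattened);
```

Each flattened document keeps all other fields of its parent, which is what lets a following $group aggregate per array element.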

Southern State Population By Year

db.cData.aggregate( [
    { $match : { "region" : "South" } },
    { $unwind : "$data" },
    { $group : { "_id" : "$data.year", "totalPop" : { "$sum" : "$data.totalPop" } } }
] )
{ "_id" : 2010, "totalPop" : 113954021 }
{ "_id" : 2000, "totalPop" : 99664761 }
{ "_id" : 1990, "totalPop" : 84839030 }

$match
• Filter documents
  – Uses existing query syntax

$match

Input documents:
{ state: "New York", areaM: 218, region: "North East" }
{ state: "Oregon", areaM: 245, region: "West" }
{ state: "California", areaM: 300, region: "West" }

Stage:
{ $match: { "region" : "West" } }

Output documents:
{ state: "Oregon", areaM: 245, region: "West" }
{ state: "California", areaM: 300, region: "West" }

Population Delta By State from 1990 to 2010

db.cData.aggregate( [
    { $unwind : "$data" },
    { $sort : { "data.year" : 1 } },
    { $group : { "_id" : "$name",
                 "pop1990" : { "$first" : "$data.totalPop" },
                 "pop2010" : { "$last" : "$data.totalPop" } } },
    { $project : { "_id" : 0,
                   "name" : "$_id",
                   "delta" : { "$subtract" : [ "$pop2010", "$pop1990" ] },
                   "pop1990" : 1,
                   "pop2010" : 1 } }
] )

{ "pop1990" : 3725789, "pop2010" : 3725789, "name" : "Puerto Rico", "delta" : 0 }
{ "pop1990" : 4866692, "pop2010" : 6724540, "name" : "Washington", "delta" : 1857848 }
{ "pop1990" : 4877185, "pop2010" : 6346105, "name" : "Tennessee", "delta" : 1468920 }
{ "pop1990" : 1227928, "pop2010" : 1328361, "name" : "Maine", "delta" : 100433 }
{ "pop1990" : 1006749, "pop2010" : 1567582, "name" : "Idaho", "delta" : 560833 }
{ "pop1990" : 1108229, "pop2010" : 1360301, "name" : "Hawaii", "delta" : 252072 }
{ "pop1990" : 3665228, "pop2010" : 6392017, "name" : "Arizona", "delta" : 2726789 }
{ "pop1990" : 638800, "pop2010" : 672591, "name" : "North Dakota", "delta" : 33791 }
{ "pop1990" : 6187358, "pop2010" : 8001024, "name" : "Virginia", "delta" : 1813666 }
{ "pop1990" : 550043, "pop2010" : 710231, "name" : "Alaska", "delta" : 160188 }
{ "pop1990" : 1109252, "pop2010" : 1316470, "name" : "New Hampshire", "delta" : 207218 }
{ "pop1990" : 10847115, "pop2010" : 11536504, "name" : "Ohio", "delta" : 689389 }
{ "pop1990" : 6016425, "pop2010" : 6547629, "name" : "Massachusetts", "delta" : 531204 }
{ "pop1990" : 6628637, "pop2010" : 9535483, "name" : "North Carolina", "delta" : 2906846 }
{ "pop1990" : 3287116, "pop2010" : 3574097, "name" : "Connecticut", "delta" : 286981 }
{ "pop1990" : 17990455, "pop2010" : 19378102, "name" : "New York", "delta" : 1387647 }
{ "pop1990" : 29760021, "pop2010" : 37253956, "name" : "California", "delta" : 7493935 }
{ "pop1990" : 16986510, "pop2010" : 25145561, "name" : "Texas", "delta" : 8159051 }
{ "pop1990" : 11881643, "pop2010" : 12702379, "name" : "Pennsylvania", "delta" : 820736 }
{ "pop1990" : 2842321, "pop2010" : 3831074, "name" : "Oregon", "delta" : 988753 }

$sort, $limit, $skip
• Sort documents by one or more fields
  – Same order syntax as cursors
  – Waits for earlier pipeline operator to return
  – In-memory unless early and indexed
• Limit and skip follow cursor behavior

$first, $last

• Accumulator operations, like $push and $addToSet
• Must be used in $group
• $first and $last determined by document order
• Typically used with $sort to ensure ordering is known
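Why $first and $last depend on a preceding $sort can be seen in a plain-JavaScript imitation. This is a sketch under stated assumptions, not MongoDB's implementation: firstAndLastPop is a hypothetical helper, and the rows stand for already-unwound census entries using the Virginia and Maine figures from this talk's results:

```javascript
// Sort by year (the $sort stage), then keep the first and last value per name (the $group stage).
function firstAndLastPop(rows) {
  const sorted = [...rows].sort((a, b) => a.year - b.year);
  const groups = new Map();
  for (const row of sorted) {
    const group = groups.get(row.name);
    if (group === undefined) {
      // First row seen for this name: it is both $first and (so far) $last.
      groups.set(row.name, { _id: row.name, popFirst: row.totalPop, popLast: row.totalPop });
    } else {
      group.popLast = row.totalPop; // later rows keep overwriting $last
    }
  }
  return [...groups.values()];
}

// Deliberately out of year order, to show that the sort is what makes the result meaningful.
const unwoundRows = [
  { name: "Virginia", year: 2010, totalPop: 8001024 },
  { name: "Maine", year: 1990, totalPop: 1227928 },
  { name: "Virginia", year: 1990, totalPop: 6187358 },
  { name: "Maine", year: 2010, totalPop: 1328361 },
];

const firstLast = firstAndLastPop(unwoundRows);
console.log(firstLast);
```

Without the sort, Virginia's "first" value would be its 2010 population, simply because that row happened to arrive first.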

$project
• Reshape/Transform Documents
  – Include, exclude or rename fields
  – Inject computed fields
  – Create sub-document fields

Including and Excluding Fields

Input documents:
{ "_id" : "Virginia", "pop1990" : 453588, "pop2010" : 3725789 }
{ "_id" : "South Dakota", "pop1990" : 453588, "pop2010" : 3725789 }

Stage:
{ $project: { "_id" : 0, "pop1990" : 1, "pop2010" : 1 } }

Output documents:
{ "pop1990" : 453588, "pop2010" : 3725789 }
{ "pop1990" : 453588, "pop2010" : 3725789 }

Renaming and Computing Fields

Input documents:
{ "_id" : "Virginia", "pop1990" : 6187358, "pop2010" : 8001024 }
{ "_id" : "South Dakota", "pop1990" : 696004, "pop2010" : 814180 }

Stage:
{ $project: { "_id" : 0, "name" : "$_id", "delta" : { "$subtract" : [ "$pop2010", "$pop1990" ] } } }

Output documents:
{ "name" : "Virginia", "delta" : 1813666 }
{ "name" : "South Dakota", "delta" : 118176 }
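The rename-and-compute step above amounts to mapping each document to a new shape. A plain-JavaScript sketch of that reshaping, using the two documents from the slide; projectDelta is a hypothetical helper and not part of any MongoDB API:

```javascript
// Imitates a $project that renames _id to name and adds a computed delta.
function projectDelta(docs) {
  return docs.map((doc) => ({
    name: doc._id,                    // "name" : "$_id"
    delta: doc.pop2010 - doc.pop1990, // "$subtract" : ["$pop2010", "$pop1990"]
  }));
}

const groupedStates = [
  { _id: "Virginia", pop1990: 6187358, pop2010: 8001024 },
  { _id: "South Dakota", pop1990: 696004, pop2010: 814180 },
];

const projected = projectDelta(groupedStates);
console.log(projected);
```

Fields that are not listed in the new shape (here pop1990 and pop2010) simply do not appear in the output, which is how an inclusion-style projection behaves.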

Compare number of people living within 500 km of Memphis, TN in 1990, 2000, 2010

db.cData.aggregate( [
    { $geoNear : { "near" : { "type" : "Point", "coordinates" : [90, 35] },
                   "distanceField" : "dist.calculated",
                   "maxDistance" : 500000,
                   "includeLocs" : "dist.location",
                   "spherical" : true } },
    { $unwind : "$data" },
    { $group : { "_id" : "$data.year",
                 "totalPop" : { "$sum" : "$data.totalPop" },
                 "states" : { "$addToSet" : "$name" } } },
    { $sort : { "_id" : 1 } }
] )


{ "_id" : 1990, "totalPop" : 22644082, "states" : [ "Kentucky", "Missouri", "Alabama", "Tennessee", "Mississippi", "Arkansas" ] }

{ "_id" : 2000, "totalPop" : 25291421, "states" : [ "Kentucky", "Missouri", "Alabama", "Tennessee", "Mississippi", "Arkansas" ] }

{ "_id" : 2010, "totalPop" : 27337350, "states" : [ "Kentucky", "Missouri", "Alabama", "Tennessee", "Mississippi", "Arkansas" ] }

$geoNear

• Order/Filter Documents by Location
  – Requires a geospatial index
  – Output includes physical distance
  – Must be first aggregation stage

$geoNear

Input documents:
{ "_id" : "Tennessee", "pop1990" : 4877185, "pop2010" : 6346105, "center" : { "type" : "Point", "coordinates" : [86.6, 37.8] } }
{ "_id" : "Virginia", "pop1990" : 6187358, "pop2010" : 8001024, "center" : { "type" : "Point", "coordinates" : [78.6, 37.5] } }

Stage:
{ $geoNear : { "near" : { "type" : "Point", "coordinates" : [90, 35] }, maxDistance : 500000, spherical : true } }

Output documents:
{ "_id" : "Tennessee", "pop1990" : 4877185, "pop2010" : 6346105, "center" : { "type" : "Point", "coordinates" : [86.6, 37.8] } }

What if I want to save the results to a collection?

db.cData.aggregate( [
    { $geoNear : { "near" : { "type" : "Point", "coordinates" : [90, 35] },
                   "distanceField" : "dist.calculated",
                   "maxDistance" : 500000,
                   "includeLocs" : "dist.location",
                   "spherical" : true } },
    { $unwind : "$data" },
    { $group : { "_id" : "$data.year",
                 "totalPop" : { "$sum" : "$data.totalPop" },
                 "states" : { "$addToSet" : "$name" } } },
    { $sort : { "_id" : 1 } },
    { $out : "peopleNearMemphis" }
] )

$out

db.cData.aggregate( [ <pipeline stages>, { "$out" : "resultsCollection" } ] )

• Save aggregation results to a new collection
• NOTE: Overwrites any data existing in collection
• Transform documents - ETL

Back To The Original Question

• Which US Division has the fastest growing population density?
  – We only want to include data for states with more than 1M people
  – We only want to include divisions larger than 100K square miles

Division with Fastest Growing Pop Density

db.cData.aggregate( [
    { $match : { "data.totalPop" : { "$gt" : 1000000 } } },
    { $unwind : "$data" },
    { $sort : { "data.year" : 1 } },
    { $group : { "_id" : "$name",
                 "pop1990" : { "$first" : "$data.totalPop" },
                 "pop2010" : { "$last" : "$data.totalPop" },
                 "areaM" : { "$first" : "$areaM" },
                 "division" : { "$first" : "$division" } } },
    { $group : { "_id" : "$division",
                 "totalPop1990" : { "$sum" : "$pop1990" },
                 "totalPop2010" : { "$sum" : "$pop2010" },
                 "totalAreaM" : { "$sum" : "$areaM" } } },
    { $match : { "totalAreaM" : { "$gt" : 100000 } } },
    { $project : { "_id" : 0,
                   "division" : "$_id",
                   "density1990" : { "$divide" : [ "$totalPop1990", "$totalAreaM" ] },
                   "density2010" : { "$divide" : [ "$totalPop2010", "$totalAreaM" ] },
                   "denDelta" : { "$subtract" : [ { "$divide" : [ "$totalPop2010", "$totalAreaM" ] },
                                                  { "$divide" : [ "$totalPop1990", "$totalAreaM" ] } ] },
                   "totalAreaM" : 1,
                   "totalPop1990" : 1,
                   "totalPop2010" : 1 } },
    { $sort : { "denDelta" : -1 } }
] )


{ "totalPop1990" : 42293785, "totalPop2010" : 58277380, "totalAreaM" : 290433.39999999997, "division" : "South Atlantic", "density1990" : 145.62300685802668, "density2010" : 200.6566049221612, "denDelta" : 55.03359806413451 }

{ "totalPop1990" : 38577263, "totalPop2010" : 49169871, "totalAreaM" : 344302.94999999995, "division" : "Pacific", "density1990" : 112.0445322934352, "density2010" : 142.80990331334658, "denDelta" : 30.765371019911385 }

{ "totalPop1990" : 37602286, "totalPop2010" : 40872375, "totalAreaM" : 109331.91, "division" : "Mid-Atlantic", "density1990" : 343.9278249140621, "density2010" : 373.8375648975674, "denDelta" : 29.90973998350529 }

{ "totalPop1990" : 26702793, "totalPop2010" : 36346202, "totalAreaM" : 444052.01, "division" : "West South Central", "density1990" : 60.134381555890265, "density2010" : 81.85122729204626, "denDelta" : 21.716845736155996 }

{ "totalPop1990" : 15176284, "totalPop2010" : 18432505, "totalAreaM" : 183403.9, "division" : "East South Central", "density1990" : 82.74788049763391, "density2010" : 100.50225213313348, "denDelta" : 17.754371635499567 }

{ "totalPop1990" : 42008942, "totalPop2010" : 46421564, "totalAreaM" : 301368.57, "division" : "East North Central", "density1990" : 139.39390560867048, "density2010" : 154.03585052017866, "denDelta" : 14.641944911508176 }

{ "totalPop1990" : 12406123, "totalPop2010" : 20512410, "totalAreaM" : 618711.92, "division" : "Mountain", "density1990" : 20.051533838236054, "density2010" : 33.153410071685705, "denDelta" : 13.101876233449651 }

{ "totalPop1990" : 16324886, "totalPop2010" : 19018666, "totalAreaM" : 372541.8, "division" : "West North Central", "density1990" : 43.820280033005695, "density2010" : 51.05109279012449, "denDelta" : 7.230812757118798 }

Aggregate Options

Aggregate options

db.cData.aggregate( [ <pipeline stages> ],
                    { "explain" : false, "allowDiskUse" : true, "cursor" : { "batchSize" : 5 } } )

explain – similar to find().explain()
allowDiskUse – enable use of disk to store intermediate results
cursor – specify the size of the initial batch of results

New things in 3.2

$sample

{ $sample: { size: <positive integer> } }

● If WiredTiger (WT) - pseudo-random cursor to return docs
● If MMAPv1 - uses _id index to randomly select docs

Used by Compass, useful for unit tests, etc.

$lookup
• Performs a left outer join to another collection in the same database to filter in documents from the “joined” collection for processing.
• To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the “joined” collection.

{ $lookup: {
    from: <collection to join>,
    localField: <field from the input documents>,
    foreignField: <field from the documents of the "from" collection>,
    as: <output array field>
} }

The "from" collection CANNOT BE SHARDED

https://docs.mongodb.org/master/reference/operator/aggregation/lookup/

• Sample data:

> db.data.find()
{ "_id" : ObjectId("565e759ae6f9919371a53896"), "v" : 14, "k" : 0 }
{ "_id" : ObjectId("565e759ae6f9919371a53897"), "v" : 664, "k" : 1 }
{ "_id" : ObjectId("565e759ae6f9919371a53898"), "v" : 701, "k" : 1 }
{ "_id" : ObjectId("565e759ae6f9919371a53899"), "v" : 312, "k" : 1 }
{ "_id" : ObjectId("565e759ae6f9919371a5389a"), "v" : 10, "k" : 2 }
{ "_id" : ObjectId("565e759ae6f9919371a5389b"), "v" : 686, "k" : 0 }
{ "_id" : ObjectId("565e759ae6f9919371a5389c"), "v" : 669, "k" : 2 }
{ "_id" : ObjectId("565e759ae6f9919371a5389d"), "v" : 273, "k" : 2 }
{ "_id" : ObjectId("565e759ae6f9919371a5389e"), "v" : 473, "k" : 0 }
{ "_id" : ObjectId("565e759ae6f9919371a5389f"), "v" : 158, "k" : 2 }

> db.keys.find()
{ "_id" : 0, "name" : "East Meter" }
{ "_id" : 1, "name" : "Central Meter 12" }
{ "_id" : 2, "name" : "New HIFI Monitor" }

• Try to find the average “v” value, but look up the name of “k”

db.data.aggregate( [
    { "$lookup" : { "from" : "keys", "localField" : "k", "foreignField" : "_id", "as" : "name" } },
    { "$unwind" : "$name" },
    { "$project" : { "k" : "$k", "name" : "$name.name", "v" : "$v" } },
    { "$group" : { "_id" : "$name", "aveValue" : { "$avg" : "$v" } } },
    { "$project" : { "_id" : 0, "name" : "$_id", "aveValue" : "$aveValue" } }
] );
{ "aveValue" : 277.5, "name" : "New HIFI Monitor" }
{ "aveValue" : 559, "name" : "Central Meter 12" }
{ "aveValue" : 391, "name" : "East Meter" }
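The join-then-average example above can be reproduced in plain JavaScript to check the arithmetic. This is only a sketch of the semantics, not MongoDB's implementation: lookup and averageByName are hypothetical helpers, and the ObjectId values are omitted because they play no part in the join:

```javascript
// Sample documents from the slide (ObjectId _id values left out).
const data = [
  { v: 14, k: 0 }, { v: 664, k: 1 }, { v: 701, k: 1 }, { v: 312, k: 1 }, { v: 10, k: 2 },
  { v: 686, k: 0 }, { v: 669, k: 2 }, { v: 273, k: 2 }, { v: 473, k: 0 }, { v: 158, k: 2 },
];
const keys = [
  { _id: 0, name: "East Meter" },
  { _id: 1, name: "Central Meter 12" },
  { _id: 2, name: "New HIFI Monitor" },
];

// Left outer join: every input document gets an array of matching "from" documents
// (an empty array when nothing matches).
function lookup(docs, from, localField, foreignField, as) {
  return docs.map((doc) => ({
    ...doc,
    [as]: from.filter((other) => other[foreignField] === doc[localField]),
  }));
}

// Imitates the $unwind on the joined array followed by a $group with $avg.
function averageByName(joined) {
  const totals = new Map();
  for (const doc of joined) {
    for (const key of doc.name) {
      const total = totals.get(key.name) || { sum: 0, count: 0 };
      total.sum += doc.v;
      total.count += 1;
      totals.set(key.name, total);
    }
  }
  return [...totals].map(([name, t]) => ({ name, aveValue: t.sum / t.count }));
}

const averages = averageByName(lookup(data, keys, "k", "_id", "name"));
console.log(averages);
```

The linear scan inside lookup is the reason an index on the foreign field matters in the real stage: without one, every input document triggers a scan of the joined collection.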

friends of friends

Use $lookup to perform "self-joins" for graph problems.
Simple case: find the friends of someone's friends
Can extend this to find cliques, paths, etc.

Dataset:

{ "_id" : 1, "name" : "FLOYD", "friends" : [ "BILLIE", "MARGENE", "HERMINIA", "LACRESHA", "SHAUN", "INOCENCIA", "DEANA", "MARAGRET", "MICHELE", "KARLENE", "KASSANDRA", "JOAN", "HIRAM" ] }

{ "_id" : 2, "name" : "ELIDA", "friends" : [ "ALI", "KESHIA" ] }

...

don't forget your indexes…

Running FOF.friendsOfFriends(1)

2016-01-26T10:19:41.201-0500 I COMMAND [conn6] command friendship.friends command: aggregate { aggregate: "friends", pipeline: [ { $match: { _id: 1.0 } }, { $unwind: "$friends" }, { $lookup: { from: "friends", localField: "friends", foreignField: "name", as: "friendsOfFriends" } }, { $unwind: "$friendsOfFriends" }, { $unwind: "$friendsOfFriends.friends" }, { $group: { _id: "$friendsOfFriends.friends" } }, { $project: { friendOfFriend: "$_id", _id: 0.0 } } ], cursor: {} } cursorid:42505581740 keyUpdates:0 writeConflicts:0 numYields:0 reslen:3722 locks:{ Global: { acquireCount: { r: 1124 } }, Database: { acquireCount: { r: 562 } }, Collection: { acquireCount: { r: 562 } } } protocol:op_command 48ms

with indexes { "friends" : 1 } & { "name" : 1 }:

2016-01-26T10:17:45.167-0500 I COMMAND [conn6] command friendship.friends command: aggregate { aggregate: "friends", pipeline: [ { $match: { _id: 1.0 } }, { $unwind: "$friends" }, { $lookup: { from: "friends", localField: "friends", foreignField: "name", as: "friendsOfFriends" } }, { $unwind: "$friendsOfFriends" }, { $unwind: "$friendsOfFriends.friends" }, { $group: { _id: "$friendsOfFriends.friends" } }, { $project: { friendOfFriend: "$_id", _id: 0.0 } } ], cursor: {} } cursorid:39053867824 keyUpdates:0 writeConflicts:0 numYields:0 reslen:3722 locks:{ Global: { acquireCount: { r: 32 } }, Database: { acquireCount: { r: 16 } }, Collection: { acquireCount: { r: 16 } } } protocol:op_command 2ms
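The pipeline visible in the log lines above ($match, $unwind, self-$lookup on name, two more $unwind stages, then a de-duplicating $group) can be imitated in plain JavaScript. This is a sketch under assumptions: the three people are invented, names are assumed to be unique, and friendsOfFriends here is a stand-alone function rather than the FOF helper used in the talk. The Map plays a role similar to the { "name" : 1 } index:

```javascript
// Hypothetical dataset in the same shape as the talk's collection.
const people = [
  { _id: 1, name: "ANN", friends: ["BOB", "CAT"] },
  { _id: 2, name: "BOB", friends: ["ANN", "DAN"] },
  { _id: 3, name: "CAT", friends: ["DAN", "EVE"] },
];

function friendsOfFriends(docs, id) {
  // Lookup table by name, so each friend is found without scanning every document.
  const byName = new Map(docs.map((p) => [p.name, p]));
  const person = docs.find((p) => p._id === id); // the $match stage
  if (!person) return [];

  const result = new Set(); // the final $group removes duplicates
  for (const friendName of person.friends) { // $unwind "$friends"
    const friend = byName.get(friendName);   // the self-join on name
    if (!friend) continue; // no matching document: nothing survives the next $unwind
    for (const fof of friend.friends) {      // $unwind "$friendsOfFriends.friends"
      result.add(fof);
    }
  }
  return [...result];
}

console.log(friendsOfFriends(people, 1));
```

Like the logged pipeline, the sketch does not exclude the starting person or their direct friends from the result; adding that filter would be a separate step.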

lots of new mathematical operators

$stdDevSamp   Calculates the sample standard deviation.   { $stdDevSamp: <array> }
$stdDevPop    Calculates population standard deviation.   { $stdDevPop: <array> }
$sqrt         Calculates the square root.   { $sqrt: <number> }
$abs          Returns the absolute value of a number.   { $abs: <number> }
$log          Calculates the log of a number in the specified base.   { $log: [ <number>, <base> ] }
$log10        Calculates the log base 10 of a number.   { $log10: <number> }
$ln           Calculates the natural log of a number.   { $ln: <number> }
$pow          Raises a number to the specified exponent.   { $pow: [ <number>, <exponent> ] }
$exp          Raises e to the specified exponent.   { $exp: <number> }
$trunc        Truncates a number to its integer.   { $trunc: <number> }
$ceil         Returns the smallest integer greater than or equal to the specified number.   { $ceil: <number> }
$floor        Returns the largest integer less than or equal to the specified number.   { $floor: <number> }
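The difference between the two standard deviation operators listed above is only the divisor: the population form divides by n, the sample form by n - 1. A plain-JavaScript sketch of the math (not MongoDB's implementation); stdDev is a hypothetical helper and the input values are made up:

```javascript
// Standard deviation of a list of numbers.
// sample = false: population form (divide by n), as in $stdDevPop.
// sample = true:  sample form (divide by n - 1), as in $stdDevSamp.
function stdDev(values, sample) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const squaredDiffs = values.reduce((sum, v) => sum + (v - mean) ** 2, 0);
  return Math.sqrt(squaredDiffs / (sample ? n - 1 : n));
}

// Made-up readings: mean is 5 and the squared differences sum to 32.
const readings = [2, 4, 4, 4, 5, 5, 7, 9];

const popStdDev = stdDev(readings, false);  // sqrt(32 / 8)
const sampStdDev = stdDev(readings, true);  // sqrt(32 / 7)
console.log(popStdDev, sampStdDev);
```

The sample form is the usual choice when the documents being aggregated are a subset (for example, the output of $sample) rather than the whole population.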

new array operators

$slice          Returns a subset of an array.   { $slice: [ <array>, <n> ] } or { $slice: [ <array>, <position>, <n> ] }
$arrayElemAt    Returns the element at the specified array index.   { $arrayElemAt: [ <array>, <idx> ] }
$concatArrays   Concatenates arrays.   { $concatArrays: [ <array1>, <array2>, ... ] }
$isArray        Determines if the operand is an array.   { $isArray: [ <expression> ] }
$filter         Selects a subset of the array based on the condition.   { $filter: { input: <array>, as: <string>, cond: <expression> } }
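For readers who know JavaScript arrays, the operators above have close everyday analogues. The following sketch shows rough plain-JavaScript equivalents for non-negative arguments only; it is an analogy rather than a specification of the operators, and the extra year 2020 is invented for the concatenation example:

```javascript
// Census years from the $unwind diagram earlier in the talk.
const census = [1980, 1990, 2000, 2010];

// Roughly { $slice: [ <array>, 2 ] }: the first two elements.
const firstTwo = census.slice(0, 2);

// Roughly { $slice: [ <array>, 1, 2 ] }: two elements starting at position 1.
const middleTwo = census.slice(1, 1 + 2);

// Roughly { $arrayElemAt: [ <array>, 0 ] }: the element at index 0.
const earliest = census[0];

// Roughly { $concatArrays: [ <array1>, <array2> ] }.
const extended = census.concat([2020]);

// Roughly { $isArray: [ <expression> ] }.
const isList = Array.isArray(census);

// Roughly { $filter: { input: <array>, as: "year", cond: { $gte: [ "$$year", 2000 ] } } }.
const recent = census.filter((year) => year >= 2000);

console.log(firstTwo, middleTwo, earliest, extended, isList, recent);
```

Inside a pipeline these operators run per document, typically in a $project stage, so they avoid an $unwind followed by a $group when only a reshaped array is needed.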

Summary

Analytics in MongoDB?

Create, Read, Update, Delete

Analytics?

Group, Count, Derive Values, Filter, Average, Sort

YES!


Framework Use Cases

• Basic aggregation queries

• Ad-hoc reporting

• Real-time analytics

• Visualizing and reshaping data

Questions?

Thanks for attending & happy aggregating
Please complete survey

jason.mimick@mongodb.com @jmimick