Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Back to Basics 2016 : Webinar 4

Advanced Indexing – Text and Geospatial Indexes

Joe Drumgoole
Director of Developer Advocacy, EMEA



• Webinar 1 – Introduction to NoSQL
  – The different types of NoSQL databases
  – What kind of database is MongoDB? A document database.

• Webinar 2 – My First Application
  – Creating databases and collections
  – CRUD operations
  – Indexes and Explain

• Webinar 3 – Schema Design
  – Dynamic schema
  – Embedding approaches
  – Examples

• An efficient way to look up data by its value
• Avoids table scans

Traditional Databases Use Btrees

• … and so does MongoDB

Queries, Inserts, Deletes: O(log(n)) Time
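To make the O(log(n)) claim concrete, here is a minimal sketch in plain JavaScript. It is not how MongoDB stores its indexes: a sorted array of keys stands in for the B-tree, and the data set, scanLookup and indexLookup are hypothetical names invented for this illustration.

```javascript
// Hypothetical in-memory "collection": 1000 documents with unique ages.
const docs = [];
for (let i = 0; i < 1000; i++) {
  docs.push({ _id: i, age: (i * 37) % 1000 });
}

// Without an index: examine documents one by one (a table scan), O(n).
function scanLookup(collection, age) {
  let examined = 0;
  for (const doc of collection) {
    examined++;
    if (doc.age === age) return { doc, examined };
  }
  return { doc: null, examined };
}

// A sorted array of [key, position] entries stands in for the B-tree.
const index = docs
  .map((doc, pos) => [doc.age, pos])
  .sort((a, b) => a[0] - b[0]);

// With the index: binary search over the sorted keys, O(log(n)).
function indexLookup(collection, sortedIndex, age) {
  let lo = 0;
  let hi = sortedIndex.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    const [key, pos] = sortedIndex[mid];
    if (key === age) return { doc: collection[pos], examined };
    if (key < age) lo = mid + 1;
    else hi = mid - 1;
  }
  return { doc: null, examined };
}

// Compare how many entries each approach had to examine.
console.log(scanLookup(docs, 963).examined, indexLookup(docs, index, 963).examined);
```

With 1000 keys the binary search never examines more than 10 entries, while the scan may examine all 1000.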

Creating a Simple Index

db.coll.createIndex( { fieldName : <Direction> } )

• db : Database name
• coll : Collection name
• fieldName : Field name to be indexed
• <Direction> : Ascending : 1, Descending : -1

Two Other Kinds of Indexes

• Full Text Index
  – Allows searching inside the text of a field (compare Lucene, Solr and Elasticsearch)
• Geospatial Index
  – Allows searching by location (e.g. people near me)
• These indexes do not use Btrees

Full Text Indexes

• An “inverted index” on all the words inside a single field (only one text index per collection)

{ "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" }

>> db.posts.createIndex( { "comment" : "text" } )

MongoDB Enterprise > db.posts.find( { $text: { $search : "info" }} )
{ "_id" : ObjectId("…"), "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" }
MongoDB Enterprise >

MongoDB Enterprise > db.posts.getIndexes()
...
{
    "v" : 1,
    "key" : {
        "_fts" : "text",
        "_ftsx" : 1
    },
    "name" : "comment_text",
    "ns" : "test.posts",
    "weights" : {
        "comment" : 1
    },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3
}


Dropping Text Indexes

• We drop text indexes by name rather than by shape

db.posts.getIndexes()
{
    "v" : 1,
    "key" : {
        "_fts" : "text",
        "_ftsx" : 1
    },
    "name" : "comment_text_tags_text",
    "ns" : "test.posts",
    "weights" : {
        "comment" : 5,
        "tags" : 10
    },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3
}

MongoDB Enterprise > db.posts.dropIndex( "comment_text_tags_text" )
{ "nIndexesWas" : 2, "ok" : 1 }
MongoDB Enterprise >
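The name passed to dropIndex() follows MongoDB's default naming convention: each indexed field and its direction or type, joined with underscores. The sketch below reproduces that convention in plain JavaScript; defaultIndexName is a hypothetical helper written for this illustration, not a shell function.

```javascript
// Derive the default index name from a key specification by joining
// each field name and its direction/type with underscores.
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, kind]) => `${field}_${kind}`)
    .join("_");
}

console.log(defaultIndexName({ comment: "text" }));
console.log(defaultIndexName({ comment: "text", tags: "text" }));
console.log(defaultIndexName({ loc: "2dsphere" }));
```

This is why a text index over the comment and tags fields ends up with the name comment_text_tags_text unless you name it yourself.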

• You can give an index an explicit name to make this easier

MongoDB Enterprise > db.posts.createIndex( { "comment" : "text", "tags" : "text" }, { "name" : "text_index" } )
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}


On The Server

I INDEX [conn275] build index on: test.posts properties: { v: 1, key: { _fts: "text", _ftsx: 1 }, name: "comment_text", ns: "test.posts", weights: { comment: 1 }, default_language: "english", language_override: "language", textIndexVersion: 3 }
I INDEX [conn275] building index using bulk method
I INDEX [conn275] build index done. scanned 3 total records. 0 secs

More Detailed Example

>> db.posts.insert( { "comment" : "Red yellow orange green" } )
>> db.posts.insert( { "comment" : "Pink purple blue" } )
>> db.posts.insert( { "comment" : "Red Pink" } )

>> db.posts.find( { "$text" : { "$search" : "Red" }} )
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>> db.posts.find( { "$text" : { "$search" : "Red Green" }} )
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
>> db.posts.find( { "$text" : { "$search" : "red" }} ) // <- Case insensitive
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>>
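The behaviour above (case-insensitive matching, and "Red Green" matching any document that contains either word) can be sketched with a toy inverted index. This is a simplified illustration only: it assumes plain lowercase tokenisation with no stemming or stop words, which MongoDB's real text index does apply, and tokenize, buildInvertedIndex and search are hypothetical names.

```javascript
// The three comments from the slide, with simple numeric ids.
const posts = [
  { _id: 1, comment: "Red yellow orange green" },
  { _id: 2, comment: "Pink purple blue" },
  { _id: 3, comment: "Red Pink" },
];

// Split text into lowercase word tokens (no stemming, no stop words).
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Build the inverted index: term -> set of _id values containing it.
function buildInvertedIndex(documents, field) {
  const index = new Map();
  for (const doc of documents) {
    for (const term of tokenize(doc[field])) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(doc._id);
    }
  }
  return index;
}

// A document matches if it contains ANY of the search terms (logical OR).
function search(index, query) {
  const matches = new Set();
  for (const term of tokenize(query)) {
    for (const id of index.get(term) ?? []) matches.add(id);
  }
  return [...matches].sort((a, b) => a - b);
}

const index = buildInvertedIndex(posts, "comment");
console.log(search(index, "Red"));
console.log(search(index, "Red Green"));
console.log(search(index, "red"));
```

Each lookup goes from a word straight to the documents containing it, which is what "inverted" means here; no comment is re-read at query time.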

Using Weights

• We can assign different weights to different fields in the text index
• E.g. I want to favour tags over comments in searching
• So I increase the weight for the tags field

>> db.posts.createIndex( { comment: "text", tags : "text" }, { weights: { comment: 5, tags : 10 }} )

• Now searches will favour tags

• Weights affect the textScore metadata value:

>> db.posts.find( { "$text" : { "$search" : "Red" }}, { score: { $meta: "textScore" }} ).sort( { score: { $meta: "textScore" } } )
{ "_id" : …, "comment" : "hello", "tags" : "Red green orange", "score" : 6.666666666666666 }
{ "_id" : …, "comment" : "Red Pink", "score" : 3.75 }
{ "_id" : …, "comment" : "Red yellow orange green", "score" : 3.125 }
>>
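To see how the weights feed into the three scores above, here is a rough sketch. The formula weight * (0.5 + 0.5 * count / tokens) is an assumption inferred from the numbers on this slide: it reproduces them when the term occurs once in a field, and it is not presented as MongoDB's complete scoring algorithm. fieldScore and textScore are hypothetical helper names.

```javascript
// Simplified score for one field. ASSUMPTION: inferred from the slide's
// numbers; only checked for a term that occurs once in the field.
function fieldScore(weight, count, numTokens) {
  if (count === 0) return 0;
  return weight * (0.5 + (0.5 * count) / numTokens);
}

// Sum the simplified field scores over every weighted field.
function textScore(doc, term, weights) {
  const wanted = term.toLowerCase();
  let total = 0;
  for (const [field, weight] of Object.entries(weights)) {
    const tokens = (doc[field] ?? "").toLowerCase().split(/\s+/).filter(Boolean);
    const count = tokens.filter((t) => t === wanted).length;
    total += fieldScore(weight, count, tokens.length);
  }
  return total;
}

const weights = { comment: 5, tags: 10 };
const docs = [
  { comment: "hello", tags: "Red green orange" },
  { comment: "Red Pink" },
  { comment: "Red yellow orange green" },
];
for (const doc of docs) {
  console.log(textScore(doc, "Red", weights));
}
```

The match in the tags field ranks first because its weight (10) is double the comment weight (5), even though the term is one of three tokens there.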

Other Parameters

• Language : Pick the language you want to search in
  – e.g. $language : "spanish"

• Support case-sensitive searching
  – $caseSensitive : true (default false)

• Support diacritic-sensitive searching (e.g. café is distinguished from cafe)
  – $diacriticSensitive : true (default false)
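These parameters all go inside the $text operator next to $search. The sketch below builds such a query document in plain JavaScript so its shape is easy to see; buildTextQuery is a hypothetical helper, and the find() call in the final comment is the intended shell usage, not something this snippet runs.

```javascript
// Build a $text query document. Only the options that were supplied are
// included, so the server-side defaults apply to the rest.
function buildTextQuery(search, options = {}) {
  const text = { $search: search };
  for (const key of ["$language", "$caseSensitive", "$diacriticSensitive"]) {
    if (options[key] !== undefined) text[key] = options[key];
  }
  return { $text: text };
}

const query = buildTextQuery("café", {
  $language: "spanish",
  $diacriticSensitive: true,
});
console.log(JSON.stringify(query));

// Intended use in the mongo shell (assumes the posts collection above):
//   db.posts.find( query )
```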

Geospatial Indexes


• MongoDB supports 2D Sphere indexes
• Allows a user to represent location on the earth (which is a sphere)
• Coordinates are stored in GeoJSON format
• The Geospatial index supports a subset of the GeoJSON operations
• The index is based on a QuadTree representation
• Index is based on the WGS 84 standard




• Coordinates are represented as longitude, latitude
• Longitude
  – Measured from the Greenwich meridian in London (0 degrees); locations east are positive (up to 180 degrees)
  – Locations west are specified as negative values
• Latitude
  – Measured from the equator north and south (0 to 90 north, 0 to -90 south)
• Coordinates in MongoDB are stored in Longitude/Latitude order
• Coordinates in Google Maps are stored in Latitude/Longitude order
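Because the two orders are easy to mix up, a small conversion helper can make the swap explicit. This is a minimal sketch in plain JavaScript; latLngToGeoJsonPoint is a hypothetical name, and the sample values are the restaurant coordinates used later in this webinar.

```javascript
// Convert a latitude/longitude pair (the order Google Maps shows) into a
// GeoJSON Point, which stores [longitude, latitude].
function latLngToGeoJsonPoint(lat, lng) {
  if (lat < -90 || lat > 90) {
    throw new RangeError(`latitude out of range: ${lat}`);
  }
  if (lng < -180 || lng > 180) {
    throw new RangeError(`longitude out of range: ${lng}`);
  }
  return { type: "Point", coordinates: [lng, lat] };
}

const point = latLngToGeoJsonPoint(40.579505, -73.98241999999999);
console.log(JSON.stringify(point));
```

The range checks only catch some mistakes: for a New York location, both values are valid in either position, so a swapped pair passes validation and silently ends up in the wrong place.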

2DSphere Versions

• Three versions of 2dSphere index in MongoDB
• Version 1 : Up to MongoDB 2.4
• Version 2 : From MongoDB 2.6 onwards
• Version 3 : From MongoDB 3.2 onwards
• We will only be talking about Version 3 in this webinar

Creating a 2dSphere Index

db.collection.createIndex ( { <location field> : "2dsphere" } )

• The location field must contain coordinate pairs or GeoJSON data

>> db.test.createIndex( { loc : "2dsphere" } )
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}

>> db.test.getIndexes()
[
    ...
    {
        "v" : 1,
        "key" : {
            "loc" : "2dsphere"
        },
        "name" : "loc_2dsphere",
        "ns" : "geo.test",
        "2dsphereIndexVersion" : 3
    }
]


Use a Simple Dataset to investigate Geo Queries

• Let's search for restaurants in Manhattan
• Using two candidate collections
  – https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json
  – https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/restaurants.json

• Import them into MongoDB
  – mongoimport -c neighborhoods -d geo neighborhoods.json
  – mongoimport -c restaurants -d geo restaurants.json

Neighborhood Document

MongoDB Enterprise > db.neighborhoods.findOne()
{
    "_id" : ObjectId("55cb9c666c522cafdb053a1a"),
    "geometry" : {
        "coordinates" : [
            [
                [ -73.94193078816193, 40.70072523469547 ],
                ...
            ]
        ],
        "type" : "Polygon"
    },
    "name" : "Bedford"
}


Restaurant Document

MongoDB Enterprise > db.restaurants.findOne()
{
    "_id" : ObjectId("55cba2476c522cafdb053adf"),
    "location" : {
        "coordinates" : [
            -73.98241999999999,
            40.579505
        ],
        "type" : "Point"
    },
    "name" : "Riviera Caterer"
}
MongoDB Enterprise >

You can type these coordinates into Google Maps, but remember to reverse the coordinate order.

Add Indexes

MongoDB Enterprise > db.restaurants.createIndex({ location: "2dsphere" })
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
MongoDB Enterprise > db.neighborhoods.createIndex({ geometry: "2dsphere" })
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
MongoDB Enterprise >

Use $geoIntersects to find our Neighborhood

• Assume we are at -73.93414657, 40.82302903
• What neighborhood are we in? Use $geoIntersects

db.neighborhoods.findOne({ geometry: { $geoIntersects: { $geometry: { type: "Point", coordinates: [ -73.93414657, 40.82302903 ]}}}})

{
    "geometry" : {
        "coordinates" : [
            [
                ...
            ]
        ],
        "type" : "Polygon"
    },
    "name" : "Central Harlem North-Polo Grounds"
}


Find All Restaurants within 0.35 km

db.restaurants.find({ location: { $geoWithin: { $centerSphere: [ [ -73.93414657, 40.82302903 ], 0.35 / 6378.1 ] } } })

• 0.35 : Distance in km
• 6378.1 : Divide by the radius of the earth (in km) to convert to radians
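The radius conversion can be checked by hand with a great-circle distance calculation. The sketch below uses the haversine formula in plain JavaScript as a stand-in for what $centerSphere tests; it is an illustration, not MongoDB's implementation, and kmToRadians, haversineKm and withinCenterSphere are hypothetical names. The two restaurant coordinates are taken from the documents shown in this webinar.

```javascript
const EARTH_RADIUS_KM = 6378.1; // radius of the earth used on the slide

// $centerSphere expects its radius in radians: distance / earth radius.
function kmToRadians(km) {
  return km / EARTH_RADIUS_KM;
}

// Great-circle distance between two [longitude, latitude] points.
function haversineKm([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// True if the point lies within radiusRadians of the center.
function withinCenterSphere(point, center, radiusRadians) {
  return haversineKm(point, center) / EARTH_RADIUS_KM <= radiusRadians;
}

const center = [-73.93414657, 40.82302903];
const dominos = [-73.93795159999999, 40.823376];
const rivieraCaterer = [-73.98241999999999, 40.579505];

console.log(withinCenterSphere(dominos, center, kmToRadians(0.35)));
console.log(withinCenterSphere(rivieraCaterer, center, kmToRadians(0.35)));
```

Passing the distance in km without dividing by the earth's radius would make the search radius thousands of times too large, which is the most common mistake with this operator.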

Results – (Projected)

{ "name" : "Gotham Stadium Tennis Center Cafe" }
{ "name" : "Chuck E. Cheese'S" }
{ "name" : "Red Star Chinese Restaurant" }
{ "name" : "Tia Melli'S Latin Kitchen" }
{ "name" : "Domino'S Pizza" }

• Without projection

{ "_id" : ObjectId("55cba2476c522cafdb0550aa"), "location" : { "coordinates" : [ -73.93795159999999, 40.823376 ], "type" : "Point" }, "name" : "Domino'S Pizza" }

Summary of Operators

• $geoIntersects: Find areas or points that overlap or are adjacent

• $geoWithin: Find areas or points that lie within a specific area
• $geoNear: Returns locations in order from nearest to furthest
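As a rough picture of what a point-versus-polygon $geoIntersects test decides (as in the neighborhood lookup earlier), here is a ray-casting sketch in plain JavaScript. It treats coordinates as flat x/y values, which is an approximation: MongoDB's 2dsphere queries use spherical geometry. The square polygon is a hypothetical shape drawn around the sample location, not real neighborhood data, and pointInRing and pointInPolygon are hypothetical names.

```javascript
// Ray casting on one linear ring of [longitude, latitude] pairs:
// count how many edges a ray from the point towards +x crosses.
function pointInRing([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      yi > y !== yj > y && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// GeoJSON Polygon: the first ring is the outer boundary, the rest are holes.
function pointInPolygon(point, polygon) {
  const [outer, ...holes] = polygon.coordinates;
  if (!pointInRing(point, outer)) return false;
  return !holes.some((hole) => pointInRing(point, hole));
}

// Hypothetical square around the sample location (not real data).
const box = {
  type: "Polygon",
  coordinates: [[
    [-73.94, 40.82], [-73.93, 40.82], [-73.93, 40.83],
    [-73.94, 40.83], [-73.94, 40.82],
  ]],
};
const here = [-73.93414657, 40.82302903];

console.log(pointInPolygon(here, box));
console.log(pointInPolygon([-73.95, 40.825], box));
```

The geospatial index exists so that the server does not have to run a test like this against every polygon in the collection.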


• Text Indexes : Full text searching of all the text items in a collection

• Geospatial Indexes : Search by location, by intersection or by distance from a point

Q & A

