Efficient CRUD Queries in MongoDB - Percona · Solr and Elasticsearch Mesos (Non-tech) Distributed Systems and Infrastructure Architecture. ... summaries, etc No .explain() support

Efficient CRUD Queries in MongoDB

Tim VaillancourtSr. Technical Operations Architect

Percona

Agenda

●What is CRUD?●CRUD-related MongoDB Features●Troubleshooting

○ Explain○ Database Profiler○ Log File

●Read Preference●Read and Write Concerns●Deletion/Retention Strategies●Monitoring

About Me

●Started at Percona in January 2016●Experience

○ Web Publishing■ Big-scale LAMP-based web infrastructures

○ Ecommerce■ Large Inventory SaaS at Amazon / AbeBooks

○ Gaming■ DevOps / NoSQL DBA at EA SPORTS■ DBA at EA DICE

About Me

○ Technologies■ MongoDB, Cassandra and Couchbase■ MySQL■ Redis and Memcached■ RabbitMQ, Kafka and ActiveMQ■ Solr and Elasticsearch■ Mesos■ (Non-tech) Distributed Systems and Infrastructure Architecture

Terminology

● What is CRUD?○ Create

■ Operations that create entire documents, eg: .insert()■ Relatively light operation with low cache/disk IO impact

○ Read■ Operations that read documents, eg: .find(), .aggregate(), …■ Generally the main source of slowness

Terminology

● What is CRUD?○ Update

■ Operations that find and replace data, eg: .update()■ Expensive, operates similar to a .find() and updates data after

○ Delete■ Operations that remove documents, eg: .remove()

Terminology

● Document - a single MongoDB document (JSON/BSON)● Collection - a collection of documents, similar to “table”● Database - a grouping of MongoDB collections● Index - a BTree Index applied to a MongoDB collection● CRUD?

○ Create: operations that create documents, eg: .insert()○ Read: operations that read documents, eg: .find(), .aggregate(), …○ Update: operations that find and replace data, eg: .update()○ Delete: operations that deletes documents, eg: .delete()

Terminology

● What is efficient CRUD?○ Use the minimal server resources possible○ Scalable

● Metrics○ keysExamined - # of Index items examined○ docsExamined - # of Documents (cache or disk) examined○ Nreturned - # of Documents returned to client

● Lag○ The delay seen in reading data from a replication replica

Storage Engines

● MMAPv1: Default in 1.x and 2.x○ Good read performance, poor write performance○ No compression○ Collection-level Locking(!)

● WiredTiger: Default in 3.2+, available since 3.x○ Good read performance, good write performance○ Compression supported

● RocksDB: Available in Percona Server for MongoDB (or patch)○ Good 95%~ read performance, very good write performance○ Compression supported

Isolation and Atomicity

● Isolation○ Read Uncommitted

■ Any session can see a change, even before ack’d■ Essentially no isolation compared to RDBMs

○ Atomicity■ Single-document Update

● A reader will never see a partially-changed document■ Multi-document Update

● Multi-operation not atomic, no rollback

Replication

● Async changelog replication● Primary

○ Single Primary via election○ Serves Read and Write operations

● Secondary○ One or more, three required for reliable elections○ Serves Read queries only○ May take-over Primary in election

● Consistency○ Driver-level tunables for Read and Write consistency(!)

Insert Operations

● A Single-document Insert triggers:○ 1 x append to the journal○ 1 x document to be added to the data file(s)○ MMAPv1 Considerations

■ MMAP will scan for a free slot for the insert■ Shown as the “nmoved” metric

● Multi-document Insert Operations○ Can be used to batch inserts○ Improves insert performance in many cases○ Can be ordered <true/false> in options document

Insert Operations

● Write Concern accepted● RocksDB or WiredTiger recommended for high write volumes

Insert Operations

Example:db.users.insert([

{“username”: “tim”,“password”: 123456,

“createDate”: new Date(),},{...}

], {“writeConcern”: { w: ”majority” },ordered: true

})

Read Operations

● .find()○ Returns cursor of document(s) matching a set of conditions○ Option to specify which fields to return, default: full doc!

● .aggregate()○ Powerful framework for data aggregations, summaries, etc○ No .explain() support○ Scans entire collection unless using a $match as first stage

■ db.<coll>.find(<$match conditions>).explain() to explain!● .mapReduce()

○ Runs in JavaScript and offers little insight

Read Operations

● Read Concern○ Important for strict data integrity (often combined with WCs)

■ Replica Set failovers can cause some data “rollback”● “Rolledback” data is written to a json file in your dbPath

○ Tunable read consistency■ “local” = default, local node read only■ “majority” = read from a majority of members■ “linearizable” = read and ack’d from a majority of members

○ RC can be changed per query

Read Operations

● Example:

test1:PRIMARY> db.getMongo().setReadPref('primary')test1:PRIMARY> db.pages.find({title : "Geography of Guinea-Bissau"}, {_id: 1, title: 1}).pretty(){

"_id" : "0c25a313481757720f5e9e46b4cffddd08a13fef","title" : "Geography of Guinea-Bissau"

}

Update Operations

● Operates much like a .find() with special re-insertion● Cannot run .explain() on .update()

○ Convert the .update() match to a .find(<conds).explain()● Can change one or more documents

○ { multi: true } option required for multi-document updates○ Avoid Multi-document updates if possible

● MMAPv1 Considerations○ In many cases MMAP can do in-place updates (no replace)

● Provides many update operators to change the data○ Examples: $set, $inc, $mul, $min, $max

Update Operations

● Efficient Updates○ Use indexes on the match condition (very crucial)○ Avoid multi-document updates in your design

Update Operations

● Example:

test1:PRIMARY> db.pages.update({title : "Geography of Guinea-Bissau"},{$set: {title: "Geography of Guinea-Bisso"}

})WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

Delete Operations

● Similar in execution impact to an .update()● Accepts array of delete conditions (batching)● Using TTL Index to Delete

○ Deletes documents based on MongoDB ISODate() objects○ Batches deletes by minute (by default)○ Doubles as an index for the date field

● Batch delete by field○ Mark a field as {deleted:true}○ Run controlled batch job to .remove({deleted:true})○ Requires some scripting / cronjob

Delete Operations

● RocksDB Considerations○ Compaction may struggle with very high delete volume○ “Toombstones” are written in place of the document

■ Considered by read operations until compaction

Delete Operations

● Example:

test1:PRIMARY> db.pages.remove({title : "Geography of Guinea-Bisso"})WriteResult({ "nRemoved" : 1 })

Indexing

● MongoDB supports BTree, text and geo indexes● Default behaviour

● Collection lock until indexing completes● {background:true}

○ Runs indexing in the background avoiding pauses○ Hard to monitor and troubleshoot progress○ Unpredictable performance impact

● Avoid drivers that auto-create indexes. Use real performance data● Too many indexes hurts performance● Indexes have a forward or backward direction

Indexing

● Compound Indexes○ Several fields supported○ Fields can be in forward or backward direction

■ Consider any .sort() query options and match sort direction!○ Read Left -> Right

■ Index can be partially-read■ Left-most fields do not need to be duplicated, example:

You have an index with fields: {status:1, date:1} and a 2nd with: {status:1}. {status:1} is duplicated!

Indexing

● Get Indexes Example:

test1:PRIMARY> db.pages.getIndexes()[

{"v" : 1,"key" : {

"_id" : 1},"name" : "_id_","ns" : "wikipedia.pages"

}]

Operation Profiling

● Writes slow database operations to a new MongoDB collection for analysis○ Capped Collection: “system.profile” in each database, default 1mb○ The collection is capped, ie: profile data doesn’t last forever○ Enable operationProfiling in “slowOp” mode

■ Start with a very high threshold and decrease it in stepsUsually 50-100ms is a good thresholdEnable in mongod.confoperationProfiling:

slowOpThresholdMs: 100mode: slowOp

Operation Profiling● op/ns/query: type, namespace and

query of a profile● keysExamined: # of index keys

examined● docsExamined: # of docs examined

to achieve result● writeConflicts: # of WCE

encountered during update● numYields: # of times operation

yielded for others● locks: detailed lock statistics

.explain() and Profiler

● .explain() Example: Profiler:

Log File: Slow Queries, etc

● Interesting details are logged to the mongod/mongos log files○ Slow queries○ Storage engine details (sometimes)○ Index operations○ Chunk moves○ Connections

Query Efficiency

● Index Efficiency: keysExamined / nreturned● Document Efficiency: docsExamined / nreturned

● End goal: Examine only as many Documents as you return● Example: a query scanning 10 documents to return 1 has efficiency 0.1

Schema Design: Data Types

● Strings○ Only use strings if required○ Do not store numbers as strings!

■ Look for {field:“123456”} instead of {field:123456}■ “12345678” moved to a integer uses 25% less space■ Range queries on proper integers is more efficient


● Strings■ Example JavaScript to convert a field in an entire collection:

db.items.find().forEach(function(x) {newItemId = parseInt(x.itemId); db.containers.update(

{ _id: x._id },{ $set: {itemId: itemId } }

)});

○ Do not store dates as strings!■ The field "2017-08-17 10:00:04 CEST" stores in 52.5% less

space!


● Strings■ Do not store booleans as strings!

● “true” -> true = 47% less space wasted● DBRef

○ DBRefs provide pointers to another document○ DBRefs can be cross-collection

Schema Design: Data Locality

● MongoDB optimised for single-document operations● Single Document / Centralised

○ Greate cache/disk-footprint efficiency○ Centralised schemas may create a hotspot for write locking

● Multi Document / Decentralised○ MongoDB rarely stores data sequentially on disk○ Multi-document operations are less efficient○ Less potential for hotspots/write locking○ Increased overhead due to fan-out of updates

Schema Design: Data Locality

● Multi Document / Decentralised○ Suited for a background worker model

■ Example: Social Media status update, graph relationships, etc

Batching Operations

● Batching Inserts/Updates○ Requires less network commands○ Allows the server to do some internal batching○ Operations will be slower overall

■ Suited for queue worker scenarios batching many changes■ Traditional user-facing database traffic should aim to

operate on a single (or few) document(s)

Batching Operations

● Thread-per-connection model○ 1 x DB operation = 1 x CPU core only○ Executing Parallel Reads

■ Large batch queries benefit from several parallel sessions■ Break query range or conditions into several client->server

threads■ Not recommended for Primary nodes or Secondaries with

heavy reads

Read Preference / Scaling

● Controls the node type (not consistency) for a session:○ “primary” - Only read from Primary○ “primaryPreferred” - Read from Primary unless there is none○ “secondary” - Only read from Secondary○ “secondaryPreferred” - Read from Secondary unless there are

none○ “nearest” - Connect to the first node you can

● Using a secondary-focussed Read Preference provides read scalability○ Adding more secondaries provides more read capacity!○ 2/3rds of a replset is often sitting around idle!

Query Antipatterns

● No list of fields specified in .find()○ MongoDB returns entire documents unless fields are specified○ Only return the fields required for an application operation!○ Covered-index operations require only the index fields to be

specified● Using $where operators

○ This executes JavaScript with a global lock● Many $and or $or conditions

○ MongoDB doesn’t handle large lists of $and or $or efficiently

Monitoring MongoDB

● Methodology○ Monitor Frequently

■ Monitor every N seconds (not minutes!)■ Problems can begin/end in seconds

○ Iterative / Responsive■ “What graph would have told us the problem faster?”

○ Correlate Database and Operating System

Monitoring MongoDB

● Database○ Operation counters○ Cache Traffic and Capacity○ Checkpoint / Compaction Performance○ Concurrency Tickets (WiredTiger and RocksDB)○ Document and Index scanning○ Various engine-specific details

Monitoring MongoDB

● Operating System○ CPU○ Disk

■ Bandwidth / Util■ Average Wait Time

○ Memory○ Network

Monitoring MongoDB: Percona PMM

● Open-source database monitoring from Percona!

● Based on open-source technologies

● Simple deployment● Graphs in this demo are

from PMM● 800+ metrics per ping● Includes Linux OS

metrics

Monitoring MongoDB: Percona PMM QAN

● QAN (“Query Analyser”) for MongoDB!

○ Query data using the MongoDB profiler!

○ Still in Beta

Testing Performance and Capacity

General● Database

○ Restore Production backups○ Use mongoreplay or Parse/flashback tools to capture + replay real

database traffic○ Use PMM or other monitoring to analyse changes

● Full stack○ Try to emulate the real user traffic○ Add micro-pauses to simulate reality○ Cloud-based providers are great for running load generation

Themes

● Perform as few client->server operations as possible (batch)● For read or updates:

○ Consider if a query will benefit from an index○ Only request/set the data fields required

● Use the log file and profiler to find slow performers● Only add indexes if they’re required● Store data in the smallest data type possible● Use Secondaries to scale read traffic● Use Read/Write Concerns for strong integrity● Review performance data (profiler, monitoring, etc) on a schedule

Questions?

Thanks for joining!Be sure to checkout

the Percona Blog for more technical blogs and topics!