Mike Kania, Production Engineer @ Parse
Benchmarking, Load Testing, and Preventing
Terrible Disasters
What Parse Does
We have 500k+ apps running on Parse.
We provide services to:
• Store user data
• Run server-side JavaScript
• Send push notifications
• Handle crash reporting
• Generate analytics
Parse + MongoDB
• Use many of MongoDB's features
• Support almost every type of workload you can imagine
• Millions of collections and indexes
  • New ones being created every minute
• Run MongoDB exclusively on AWS
• We do crazy things with MongoDB
Why Should You Listen to Me?
• Parse has one of the most complex MongoDB infrastructures (in the world?)
• Started using MongoDB in 1.8
• Upgraded to 2.6 everywhere 6 months ago
• We have some battle wounds from upgrading MongoDB to pass on to you
Why Shouldn’t You Listen to Me?
MongoDB is a jack of all trades, and there are certain features that we haven't touched.
• Sharding — We built our own way to shard data
• Aggregation/Map Reduce — We don't touch this at all
History of MongoDB Upgrades at Parse
1.8 → 2.0 → 2.2 → 2.4 → 2.6 → 3.0 ("Do it live")
Cowboy Upgrade
1. Review "Upgrade Requirements" and known bugs in JIRA
2. Run integration/unit tests against the new version
3. Spin up a hidden secondary. Watch for problems
4. Unhide the SECONDARY. Watch for problems
5. Promote to PRIMARY
6. Declare success! Oh wait, I mean watch for problems.
What Went Wrong
• 60% perf reduction
• All geo indexes block the global lock until the first document is found
• Unindexable writes suddenly refused
• Changed the definition of scan limits
A New Approach
1.8 → 2.0 → 2.2 → 2.4 ("Do it live")
2.6 → 3.0 ("Do it with production workloads in a test environment")
Flashback
• Open-sourced benchmarking tool built specifically for MongoDB
• Captures production workloads
• Replays those workloads over and over again at configurable speeds
• Recently merged a pull request to support load testing with MongoDB sharding
Record
Get the config set up:
• oplog_server: A secondary that will be used to tail the oplog for write operations
• profiler_server: The primary in the target replica set to capture profiling data
• duration_sec: Defines how long you want to record
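The three fields above map naturally to a small settings dict. This is a hypothetical sketch of such a config (field names come from the slide; the structure and hostnames are illustrative, not Flashback's actual config file format — check the project README for that):

```python
# Hypothetical recording config illustrating the fields described above.
# Hostnames are placeholders; Flashback's real config format may differ.
record_config = {
    "oplog_server": {"host": "secondary-1.example.com", "port": 27017},   # tail oplog for writes
    "profiler_server": {"host": "primary-1.example.com", "port": 27017},  # profiling data for reads
    "duration_sec": 3600,  # record one hour of production traffic
}

def validate(config):
    """Fail fast on a missing field before kicking off a long recording run."""
    required = {"oplog_server", "profiler_server", "duration_sec"}
    missing = required - config.keys()
    if missing:
        raise ValueError("missing config fields: %s" % ", ".join(sorted(missing)))
    return True
```

Validating up front matters here because a recording run ties up a secondary for `duration_sec` seconds; you don't want it to die halfway through on a typo.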
Enable Profiling
• Keep in mind, it does an additional write for every operation.
• ./set_mongo_profiling.py -a enable -n $PRIMARY_HOSTNAME
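Under the hood, enabling full profiling boils down to sending MongoDB's `profile` command to the target database. A minimal sketch of building that command document (the `profile_command` helper is illustrative, not part of set_mongo_profiling.py):

```python
# Profiling levels: 0 = off, 1 = only ops slower than slowms, 2 = every op.
# Level 2 is what workload capture needs, but it adds a write to
# system.profile for every operation -- hence the warning above.
def profile_command(level, slow_ms=100):
    """Build the MongoDB `profile` command document."""
    if level not in (0, 1, 2):
        raise ValueError("profiling level must be 0, 1, or 2")
    return {"profile": level, "slowms": slow_ms}

# With pymongo (not imported here), this would be sent as e.g.:
#   client["mydb"].command(profile_command(2))
```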
Moar Better Recording
• What about just capturing it over the wire?
• Maybe use mongosniff?
• MongoDB has a built-in pcap library.
• Enter mongocaputils
  • Also open source
  • Still a little buggy
Running the Record
./record.py
Creating a Consistent Snapshot
You need a way to quickly capture a consistent snapshot of your dataset. We use EBS snapshots, by:
• Locking mongod
• Creating an EBS snapshot of all the RAIDed volumes on /var/lib/mongodb
• Unlocking mongod
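The lock/snapshot/unlock dance above can be wrapped in a small helper. A sketch assuming a pymongo-style admin database handle and a boto3-style EC2 resource are passed in (function and parameter names here are illustrative; note also that on MongoDB 2.6 the unlock mechanism differs from the modern `fsyncUnlock` command assumed below):

```python
def consistent_ebs_snapshot(admin_db, ec2, volume_ids, description="mongo backup"):
    """Flush + lock mongod, snapshot every RAID member, then unlock.

    admin_db: object with a .command() method (e.g. pymongo's client.admin)
    ec2: object with a .create_snapshot() method (e.g. a boto3 EC2 resource)
    """
    admin_db.command("fsync", lock=True)  # flush to disk and block writes
    try:
        # Snapshot every EBS volume in the RAID set while writes are blocked,
        # so the members stay consistent with each other.
        return [ec2.create_snapshot(VolumeId=v, Description=description)
                for v in volume_ids]
    finally:
        admin_db.command("fsyncUnlock")  # always unlock, even if a snapshot fails
```

The `try/finally` is the important part: a failed snapshot call must never leave the secondary fsync-locked.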
Quickly Replaying Workloads
• Pre-warming EBS snapshots after each run is slow and time-consuming
• Pulling down the blocks from S3 takes hours or days if you have terabytes of data
• We decided to use LVM on top of EBS
  • Does incur I/O overhead
  • Allows us to do LVM snapshots!
How We Used LVM
Define a restore point before benchmarking:
• lvcreate -l 10%VG -s -n restore_point /dev/mongovg/mongoraid
Merge the copy-on-write logical volume to roll back:
• Stop MongoDB
• Unmount the filesystem
• lvconvert --merge /dev/mongovg/restore_point
Creating the Test Environment
• Spin up a new EC2 instance and restore the EBS volumes from snapshot
  • New EBS volumes need to be pre-warmed; blocks are lazily loaded from S3
• Benchmark server which runs the Flashback replay and has the workload on disk
  • Nothing special needs to happen here
Benchmarking New Shiny Storage Engines
In MongoDB 3.0, each storage engine has a different on-disk format
So we also need to run an initial sync of each new storage engine against our restored MMAPv1 backup, and then run benchmarks on each format.
[Diagram: MMAPv1 (restored from snapshot), with an initial sync into RocksDB and an initial sync into WiredTiger]
Side Note: The Storage Efficiency of RocksDB/WiredTiger is Amazing*
*You should totally check out the “Storage Engine Wars” talk by Charity Majors and Igor Canadi
[Bar chart: on-disk size. MMAPv1: 3,245 GB; WiredTiger and RocksDB roughly a tenth of that, at 318 GB and 283 GB]
Running the Replay
• Two styles to replay: real and stress

flashback \
  -ops_filename=OUTPUT \
  -style=real \
  -url=$MONGO_HOST:27017 \
  -workers=50
[Diagram: Flashback replaying the recorded workload against MongoDB 2.6 MMAPv1, MongoDB 3.0 MMAPv1, and MongoDB 3.0 RocksDB]
Metrics Gathering
• Flashback reports percentile latencies broken down by operation type
• Useful from a high level
• Not so useful when diving into query regressions
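Percentile latencies of the kind Flashback reports can be computed per operation type in a few lines of stdlib Python. This is a simplified nearest-rank sketch, not Flashback's actual implementation:

```python
from collections import defaultdict

def percentile(sorted_values, p):
    """Nearest-rank percentile over an already-sorted list (no interpolation)."""
    if not sorted_values:
        raise ValueError("no samples")
    k = max(0, int(round(p / 100.0 * len(sorted_values))) - 1)
    return sorted_values[k]

def latency_report(samples, ps=(50, 99)):
    """samples: iterable of (op_type, latency_ms) -> {op: {percentile: latency}}.

    Groups latencies by operation type (query, insert, update, ...) and
    reports the requested percentiles for each group.
    """
    by_op = defaultdict(list)
    for op, ms in samples:
        by_op[op].append(ms)
    return {op: {p: percentile(sorted(vals), p) for p in ps}
            for op, vals in by_op.items()}
```

As the slide notes, this per-op view is good for spotting that *something* regressed, but chasing an individual slow query needs the log pipeline below.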
Logging Pipeline
• Mongo logs are hard to parse.
• Thankfully you don't need to worry about it
• Just use our open-source PEG parser, mongologtools
• Ship JSON via Scribe to an internal Facebook data-diving tool
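To see why a real parser helps, here is a toy regex that extracts op type, namespace, and duration from a 2.6-style slow-op log line. It is purely illustrative: mongologtools is a full PEG grammar precisely because a naive regex like this misses the many log shapes MongoDB emits:

```python
import re

# Matches lines shaped like:
#   2015-05-19T00:00:01.234+0000 [conn123] query mydb.users query: { a: 1 } ... 120ms
LOG_RE = re.compile(
    r"\[conn\d+\]\s+(?P<op>query|insert|update|remove|getmore|command)\s+"
    r"(?P<ns>\S+).*?\s(?P<ms>\d+)ms\s*$"
)

def parse_slow_op(line):
    """Return {'op', 'ns', 'ms'} for a slow-op log line, or None if no match."""
    m = LOG_RE.search(line)
    if not m:
        return None
    return {"op": m.group("op"), "ns": m.group("ns"), "ms": int(m.group("ms"))}
```

Once parsed into dicts like this, each record can be serialized as JSON and shipped downstream, which is the shape of the Scribe pipeline the slide describes.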
First Results

p50 Query Latency
Op     2.6 MMAPv1   3.0 MMAPv1    3.0 RocksDB
query  2.93ms       4.43ms        3.04ms

p99 Query Latency
Op     2.6 MMAPv1   3.0 MMAPv1    3.0 RocksDB
query  177.41ms     619471.47ms   1441442.26ms
First Regression
• Regression in $nearSphere queries just for 3.0
• SERVER-17469 — patched in 3.0.2
• After the fix, average latency for $nearSphere went from 2354 ms to 35 ms
More Ad-Hoc Analysis
[Scatter plots: duration (ms) vs. # documents scanned, for MMAPv1 and RocksDB]
P99 Latency
[Bar chart: p99 latency (0ms to 40ms) by operation (query, insert, remove, update, findandmodify, count) for 2.6 MMAPv1, 3.0 MMAPv1, and 3.0 RocksDB]
Some time later…
Benchmarks Won’t Find Everything
• [RocksDB] Prefix collision could happen between restarts
  https://github.com/mongodb-partners/mongo/commit/da8a90b3b71bf291684ffc5a6d2fd32118ce1a7b
• [MongoDB] Secondary reads block replication
  https://jira.mongodb.org/browse/SERVER-18190
Where are we now with testing 3.0?
• MongoDB 3.0 with RocksDB is serving some production traffic and it looks amazing.
[Graph: API request latency in milliseconds]
Linkage
• Flashback
• https://github.com/ParsePlatform/flashback
• Mongologtools
• https://github.com/tmc/mongologtools
• MongoDB 3.0 Benchmarking Results
• http://blog.parse.com/learn/engineering/mongodb-rocksdb-writing-so-fast-it-makes-your-head-spin/
• nearSphere regression
• https://jira.mongodb.org/browse/SERVER-17469
• WT/RocksDB secondary crash
• https://jira.mongodb.org/browse/SERVER-17882