Mike Kania, Production Engineer @ Parse
Benchmarking, Load Testing, and Preventing
Terrible Disasters
What Parse Does
We have 500k+ apps running on Parse.
We provide services to:
• Store user data
• Run server-side JavaScript
• Send push notifications
• Handle crash reporting
• Generate analytics
Parse + MongoDB
• Use many of MongoDB's features
• Support almost every type of workload you can imagine
• Millions of collections and indexes
  • New ones being created every minute
• Run MongoDB exclusively on AWS
• We do crazy things with MongoDB
Why Should You Listen to Me?
• Parse has one of the most complex MongoDB infrastructures (in the world?)
• Started using MongoDB in 1.8
• Upgraded to 2.6 everywhere 6 months ago
• We have some battle wounds from upgrading MongoDB to pass on to you
Why Shouldn’t You Listen to Me?
MongoDB is a jack of all trades, and there are certain features that we haven't touched.
• Sharding — We built our own way to shard data
• Aggregation/Map Reduce — We don't touch this at all
History of MongoDB Upgrades at Parse
1.8 → 2.0 → 2.2 → 2.4 → 2.6 → 3.0 ("Do it live")
Cowboy Upgrade
1. Review "Upgrade Requirements" and known bugs in JIRA
2. Run integration/unit tests against the new version
3. Spin up a hidden secondary. Watch for problems
4. Unhide the SECONDARY. Watch for problems
5. Promote to PRIMARY
6. Declare success! Oh wait, I mean watch for problems.
What Went Wrong
• 60% perf reduction
• All geo indexes block the global lock until the first document is found
• Unindexable writes suddenly refused
• Changed the definition of scan limits
A New Approach
1.8 → 2.0 → 2.2 → 2.4 ("Do it live")
2.6 → 3.0 ("Do it with production workloads in a test environment")
Flashback
• Open-sourced benchmarking tool built specifically for MongoDB
• Captures production workloads
• Replays those workloads over and over again at configurable speeds
• Recently merged a pull request to support load testing with MongoDB sharding
Record
Get the config set up:
• oplog_server: A secondary that will be used to tail the oplog for write operations
• profiler_server: The primary in the target replica set to capture profiling data
• duration_sec: Defines how long you want to record
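The three fields above map naturally to a small settings dict. This is a hypothetical sketch of such a config (field names come from the slide; the structure and hostnames are illustrative, not Flashback's actual config file format — check the project README for that):

```python
# Hypothetical recording config illustrating the fields described above.
# Hostnames are placeholders; Flashback's real config format may differ.
record_config = {
    "oplog_server": {"host": "secondary-1.example.com", "port": 27017},   # tail oplog for writes
    "profiler_server": {"host": "primary-1.example.com", "port": 27017},  # profiling data for reads
    "duration_sec": 3600,  # record one hour of production traffic
}

def validate(config):
    """Fail fast on a missing field before kicking off a long recording run."""
    required = {"oplog_server", "profiler_server", "duration_sec"}
    missing = required - config.keys()
    if missing:
        raise ValueError("missing config fields: %s" % ", ".join(sorted(missing)))
    return True
```

Validating up front matters here because a recording run ties up a secondary for `duration_sec` seconds; you don't want it to die halfway through on a typo.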
Enable Profiling
• Keep in mind, it does an additional write for every operation.
• ./set_mongo_profiling.py -a enable -n $PRIMARY_HOSTNAME
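Under the hood, enabling full profiling boils down to sending MongoDB's `profile` command to the target database. A minimal sketch of building that command document (the `profile_command` helper is illustrative, not part of set_mongo_profiling.py):

```python
# Profiling levels: 0 = off, 1 = only ops slower than slowms, 2 = every op.
# Level 2 is what workload capture needs, but it adds a write to
# system.profile for every operation -- hence the warning above.
def profile_command(level, slow_ms=100):
    """Build the MongoDB `profile` command document."""
    if level not in (0, 1, 2):
        raise ValueError("profiling level must be 0, 1, or 2")
    return {"profile": level, "slowms": slow_ms}

# With pymongo (not imported here), this would be sent as e.g.:
#   client["mydb"].command(profile_command(2))
```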
Moar Better Recording
• What about just capturing it over the wire?
• Maybe use mongosniff?
• MongoDB has a built-in pcap library.
• Enter mongocaputils
  • Also open source
  • Still a little buggy
Running the Record
./record.py
Creating a Consistent Snapshot
You need a way to quickly capture a consistent snapshot of your dataset. We use EBS snapshots, by:
• Locking mongod
• Creating an EBS snapshot of all the RAIDed volumes on /var/lib/mongodb
• Unlocking mongod
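The lock/snapshot/unlock dance above can be wrapped in a small helper. A sketch assuming a pymongo-style admin database handle and a boto3-style EC2 resource are passed in (function and parameter names here are illustrative; note also that on MongoDB 2.6 the unlock mechanism differs from the modern `fsyncUnlock` command assumed below):

```python
def consistent_ebs_snapshot(admin_db, ec2, volume_ids, description="mongo backup"):
    """Flush + lock mongod, snapshot every RAID member, then unlock.

    admin_db: object with a .command() method (e.g. pymongo's client.admin)
    ec2: object with a .create_snapshot() method (e.g. a boto3 EC2 resource)
    """
    admin_db.command("fsync", lock=True)  # flush to disk and block writes
    try:
        # Snapshot every EBS volume in the RAID set while writes are blocked,
        # so the members stay consistent with each other.
        return [ec2.create_snapshot(VolumeId=v, Description=description)
                for v in volume_ids]
    finally:
        admin_db.command("fsyncUnlock")  # always unlock, even if a snapshot fails
```

The `try/finally` is the important part: a failed snapshot call must never leave the secondary fsync-locked.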
Quickly Replaying Workloads
• Pre-warming EBS snapshots after each run is slow and time-consuming
• Pulling down the blocks from S3 takes hours or days if you have terabytes of data
• We decided to use LVM on top of EBS
  • Does incur I/O overhead
  • Allows us to do LVM snapshots!
How We Used LVM
Define a restore point before benchmarking:
• lvcreate -l 10%VG -s -n restore_point /dev/mongovg/mongoraid
Merge the copy-on-write logical volume to roll back:
• Stop MongoDB
• Unmount the filesystem
• lvconvert --merge /dev/mongovg/restore_point
Creating the Test Environment
• Spin up a new EC2 instance and restore the EBS volumes from snapshot
  • New EBS volumes need to be pre-warmed; blocks are lazily loaded from S3
• Benchmark server which runs the Flashback replay and has the workload on disk
  • Nothing special needs to happen here
Benchmarking New Shiny Storage Engines
In MongoDB 3.0, each storage engine has a different on-disk format
So we also need to run an initial sync of each new storage engine against our restored MMAPv1 backup, and then run benchmarks on each format.
[Diagram: MMAPv1 (restored from snapshot), with an initial sync into RocksDB and an initial sync into WiredTiger]
Side Note: The Storage Efficiency of RocksDB/WiredTiger is Amazing*
*You should totally check out the “Storage Engine Wars” talk by Charity Majors and Igor Canadi
[Bar chart: on-disk size. MMAPv1: 3,245 GB; WiredTiger and RocksDB roughly a tenth of that, at 318 GB and 283 GB]
Running the Replay
• Two styles to replay: real and stress

flashback \
  -ops_filename=OUTPUT \
  -style=real \
  -url=$MONGO_HOST:27017 \
  -workers=50
[Diagram: Flashback replaying the recorded workload against MongoDB 2.6 MMAPv1, MongoDB 3.0 MMAPv1, and MongoDB 3.0 RocksDB]
Metrics Gathering
• Flashback reports percentile latencies broken down by operation type
• Useful from a high level
• Not so useful when diving into query regressions
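Percentile latencies of the kind Flashback reports can be computed per operation type in a few lines of stdlib Python. This is a simplified nearest-rank sketch, not Flashback's actual implementation:

```python
from collections import defaultdict

def percentile(sorted_values, p):
    """Nearest-rank percentile over an already-sorted list (no interpolation)."""
    if not sorted_values:
        raise ValueError("no samples")
    k = max(0, int(round(p / 100.0 * len(sorted_values))) - 1)
    return sorted_values[k]

def latency_report(samples, ps=(50, 99)):
    """samples: iterable of (op_type, latency_ms) -> {op: {percentile: latency}}.

    Groups latencies by operation type (query, insert, update, ...) and
    reports the requested percentiles for each group.
    """
    by_op = defaultdict(list)
    for op, ms in samples:
        by_op[op].append(ms)
    return {op: {p: percentile(sorted(vals), p) for p in ps}
            for op, vals in by_op.items()}
```

As the slide notes, this per-op view is good for spotting that *something* regressed, but chasing an individual slow query needs the log pipeline below.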
Logging Pipeline
• Mongo logs are hard to parse.
• Thankfully you don't need to worry about it
• Just use our open-source PEG parser, mongologtools
• Ship JSON via Scribe to an internal Facebook data-diving tool
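To see why a real parser helps, here is a toy regex that extracts op type, namespace, and duration from a 2.6-style slow-op log line. It is purely illustrative: mongologtools is a full PEG grammar precisely because a naive regex like this misses the many log shapes MongoDB emits:

```python
import re

# Matches lines shaped like:
#   2015-05-19T00:00:01.234+0000 [conn123] query mydb.users query: { a: 1 } ... 120ms
LOG_RE = re.compile(
    r"\[conn\d+\]\s+(?P<op>query|insert|update|remove|getmore|command)\s+"
    r"(?P<ns>\S+).*?\s(?P<ms>\d+)ms\s*$"
)

def parse_slow_op(line):
    """Return {'op', 'ns', 'ms'} for a slow-op log line, or None if no match."""
    m = LOG_RE.search(line)
    if not m:
        return None
    return {"op": m.group("op"), "ns": m.group("ns"), "ms": int(m.group("ms"))}
```

Once parsed into dicts like this, each record can be serialized as JSON and shipped downstream, which is the shape of the Scribe pipeline the slide describes.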
First Results

p50 Query Latency
Op     2.6 MMAPv1   3.0 MMAPv1    3.0 RocksDB
query  2.93ms       4.43ms        3.04ms

p99 Query Latency
Op     2.6 MMAPv1   3.0 MMAPv1    3.0 RocksDB
query  177.41ms     619471.47ms   1441442.26ms
First Regression
• Regression in $nearSphere queries just for 3.0
• SERVER-17469 — patched in 3.0.2
• After the fix, average latency for $nearSphere went from 2354 ms to 35 ms
More Ad-Hoc Analysis
[Scatter plots: duration (ms) vs. # documents scanned, for MMAPv1 and RocksDB]
P99 Latency
[Bar chart: p99 latency (0ms to 40ms) by operation (query, insert, remove, update, findandmodify, count) for 2.6 MMAPv1, 3.0 MMAPv1, and 3.0 RocksDB]
Some time later…
Benchmarks Won’t Find Everything
• [RocksDB] Prefix collision could happen between restarts
  https://github.com/mongodb-partners/mongo/commit/da8a90b3b71bf291684ffc5a6d2fd32118ce1a7b
• [MongoDB] Secondary reads block replication
  https://jira.mongodb.org/browse/SERVER-18190
Where are we now with testing 3.0?
• MongoDB 3.0 with RocksDB is serving some production traffic and it looks amazing.
[Graph: API request latency in milliseconds]
Linkage
• Flashback
• https://github.com/ParsePlatform/flashback
• Mongologtools
• https://github.com/tmc/mongologtools
• MongoDB 3.0 Benchmarking Results
• http://blog.parse.com/learn/engineering/mongodb-rocksdb-writing-so-fast-it-makes-your-head-spin/
• nearSphere regression
• https://jira.mongodb.org/browse/SERVER-17469
• WT/RocksDB secondary crash
• https://jira.mongodb.org/browse/SERVER-17882