Presentation held at MongoUK, September 2012
SHORTCUTS AROUND THE MISTAKES I’VE MADE SCALING MONGODB
Theo, Chief Architect at Burt
Wednesday 21 September 2011
What we do
We want to revolutionize the digital advertising industry by showing that there is more to ad analytics than click-through rates.
Ads
Data
Assembling sessions
(diagram: an exposure, pings, and events arrive as fragments and are assembled into a session)
Crunching
(diagram: many sessions are crunched down into a single number)
Reports
What we do
Track ads, make pretty reports.
That doesn’t sound so hard
We don’t know when sessions end
There’s a lot of data
It’s all done in (close to) real time
Numbers
40 GB of data
50 million documents
per day
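Assuming gigabytes, those numbers imply a back-of-the-envelope average document size:

```python
# Rough sanity check of the figures above (assuming GB = 10**9 bytes).
data_per_day_bytes = 40 * 10**9
docs_per_day = 50 * 10**6

avg_doc_size = data_per_day_bytes / docs_per_day  # bytes per document
print(avg_doc_size)  # 800.0
```

Roughly 800 bytes per document on average, which is why document size comes up again later in the talk.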
How we use MongoDB
“Virtual memory” to offload data while we wait for sessions to finish
Short-term storage (<48 hours) for batch jobs
Metrics storage
Why we use MongoDB
Schemalessness makes things so much easier; the data we collect changes as we come up with new ideas
Sharding makes it possible to scale writes
Secondary indexes and a rich query language are great features (for the metrics store)
It’s just… nice
Btw.
We use JRuby, it’s awesome
A story in 7 iterations
1st iteration: secondary indexes and updates
One document per session, update as new data comes along
Outcome: 1000% write lock
#1 Everything is about working around the GLOBAL WRITE LOCK
MongoDB 2.0.0
db.coll.update({_id: "xyz"}, {$inc: {x: 1}}, true)
db.coll.update({_id: "abc"}, {$push: {x: "..."}}, true)
MongoDB 1.8.1
db.coll.update({_id: "xyz"}, {$inc: {x: 1}}, true)
db.coll.update({_id: "abc"}, {$push: {x: "..."}}, true)
2nd iteration: using scans for two-step assembling
Instead of updating, save each fragment, then scan over _id to assemble sessions
Outcome: not as much lock, but still not great performance. We also realised we couldn’t remove data fast enough
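The two-step idea can be sketched like this (a hypothetical simplification: the _id scheme and field names are made up, and a real implementation would walk a MongoDB cursor in _id order rather than a Python list):

```python
from itertools import groupby

# Each ping/event is inserted as its own document whose _id is prefixed with
# the session id. Because MongoDB keeps the _id index sorted, a later scan
# sees all fragments of a session next to each other.
fragments = [
    {"_id": "sessionA:1", "type": "ping"},
    {"_id": "sessionA:2", "type": "event"},
    {"_id": "sessionB:1", "type": "ping"},
]

def session_of(fragment):
    # the session id is everything before the first ":" in the _id
    return fragment["_id"].split(":", 1)[0]

# Scanning in _id order lets us assemble sessions with a single group-by pass.
sessions = {
    sid: list(parts)
    for sid, parts in groupby(sorted(fragments, key=lambda f: f["_id"]), key=session_of)
}
print(sorted(sessions))  # ['sessionA', 'sessionB']
```

Inserts are append-only, so there are no in-place updates fighting over the write lock; the cost moves to the batch job that scans and assembles.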
#2 Everything is about working around the GLOBAL WRITE LOCK
#3 Give a lot of thought to your PRIMARY KEY
3rd iteration: partitioning
We came up with the idea of partitioning the data by writing to a new collection every hour
Outcome: lots of complicated code, lots of bugs, but we didn’t have to care about removing data
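The hour-partitioning scheme boils down to a naming convention (a hypothetical helper; the complicated part the slide alludes to is routing reads and batch jobs across many such collections):

```python
from datetime import datetime

def collection_for(ts):
    # one collection per hour, e.g. "sessions_2011092113"
    # (the "sessions_" prefix and timestamp format are made up for illustration)
    return "sessions_" + ts.strftime("%Y%m%d%H")

print(collection_for(datetime(2011, 9, 21, 13, 5)))  # sessions_2011092113
```

The payoff is retention: dropping a whole stale collection is a cheap metadata operation, whereas removing millions of individual documents competes with live writes for the lock.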
#4 Make sure you can REMOVE OLD DATA
4th iteration: sharding
To get around the global write lock and get higher write performance we moved to a sharded cluster.
Outcome: higher write performance, lots of problems, lots of ops time spent debugging
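One reason the shard key deserves so much thought, sketched as a toy (the shard count and modulo placement are made up; real MongoDB routes documents to shards by chunk ranges of the shard key, not a hash modulo):

```python
import hashlib

NUM_SHARDS = 4

def shard_for(key):
    # Hash-based placement spreads writes evenly. A monotonically increasing
    # key (e.g. a timestamp-prefixed _id) would instead land every new write
    # in the same range, i.e. on a single "hot" shard.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

keys = [f"session{i}" for i in range(1000)]
used = {shard_for(k) for k in keys}
print(len(used))
```

With a well-spread key every shard takes a share of the insert load, which is the whole point of sharding for write throughput.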
#5 Everything is about working around the GLOBAL WRITE LOCK
#6 SHARDING IS NOT A SILVER BULLET, and it’s buggy; if you can, avoid it
#7 IT WILL FAIL, design for it
5th iteration: moving things to separate clusters
We saw very different loads on the shards and realised we had databases with very different usage patterns, some of which made autosharding not work. We moved these off the cluster.
Outcome: a more balanced and stable cluster
#8 Everything is about working around the GLOBAL WRITE LOCK
#9 ONE DATABASE with one usage pattern PER CLUSTER
#10 MONITOR EVERYTHING, look at your health graphs daily
6th iteration: monster machines
We got new problems removing data and needed some room to breathe and think.
Solution: upgraded the servers to High-Memory Quadruple Extra Large (with cheese).
#11 Don’t try to scale up, SCALE OUT
#12 When you’re out of ideas, CALL THE EXPERTS
7th iteration: partitioning (again) and pre-chunking
We rewrote the database layer to write to a new database each day, and we created all chunks in advance. We also decreased the size of our documents by a lot.
Outcome: no more problems removing data.
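Pre-chunking amounts to computing the chunk boundaries for the next day’s database before any writes arrive, so the balancer never has to split and migrate chunks under load. A sketch, assuming a numeric shard-key space (the key space and chunk count here are made up; the actual splitting is done server-side with MongoDB’s split machinery):

```python
def split_points(num_chunks, key_space=2**32):
    # Evenly spaced boundaries over a hypothetical numeric shard-key space.
    # Each adjacent pair of boundaries becomes one pre-created chunk.
    step = key_space // num_chunks
    return [i * step for i in range(1, num_chunks)]

print(split_points(4))  # [1073741824, 2147483648, 3221225472]
```

N chunks need N-1 interior split points; spreading them evenly assumes the shard key is uniformly distributed over the key space.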
#13 Smaller objects mean a smaller database, and a smaller database means LESS RAM NEEDED
#14 Give a lot of thought to your PRIMARY KEY
#15 Everything is about working around the GLOBAL WRITE LOCK
#16 Everything is about working around the GLOBAL WRITE LOCK
KTHXBAI
@iconara
architecturalatrocities.com
burtcorp.com
Since we got time…
Tips: Safe mode
Run every Nth insert in safe mode
This will give you warnings when bad things happen, like failovers
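The every-Nth-insert idea as a sketch (the class and names are illustrative, not a real driver API; in a real driver you would pass the chosen write concern to each insert call):

```python
# Unacknowledged writes keep throughput high; a periodic acknowledged ("safe")
# write surfaces errors such as failovers instead of losing them silently.
class SafeEveryNth:
    def __init__(self, every=100):
        self.every = every
        self.count = 0

    def next_write_concern(self):
        # every Nth write is acknowledged, the rest are fire-and-forget
        self.count += 1
        return "safe" if self.count % self.every == 0 else "fire-and-forget"

w = SafeEveryNth(every=3)
print([w.next_write_concern() for _ in range(6)])
# ['fire-and-forget', 'fire-and-forget', 'safe', 'fire-and-forget', 'fire-and-forget', 'safe']
```

The sampling rate trades detection latency against throughput: with N=100 you notice a broken replica set within a hundred writes rather than never.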
Tips: Avoid bulk inserts
Very dangerous if there’s a possibility of duplicate key errors
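Why that is dangerous, simulated with a fake unique-_id collection: a bulk insert of that era stopped at the first duplicate-key error, so every later document in the batch was silently lost.

```python
# A dict stands in for a collection with a unique index on _id.
def bulk_insert(collection, docs):
    for doc in docs:
        if doc["_id"] in collection:
            # the error aborts the whole batch; the remaining docs never land
            raise KeyError(f"duplicate key: {doc['_id']}")
        collection[doc["_id"]] = doc

store = {}
try:
    bulk_insert(store, [{"_id": 1}, {"_id": 1}, {"_id": 2}])
except KeyError:
    pass
print(sorted(store))  # [1]
```

Document 2 was never inserted even though it had no conflict, which is exactly the failure mode the tip warns about; inserting one document at a time keeps each error isolated.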
Tips: EC2
You have three copies of your data, do you really need EBS?
Instance store disks are included in the price and they have predictable performance.
m1.xlarge comes with 1.7 TB of storage.