MongoDB Operational Best Practices (mongosf2012)

Learn about MongoDB best practices through examples from the field.

Page 1: MongoDB Operational Best Practices (mongosf2012)

Operational Best Practices

Tales from the field

The Plan
● Review support cases
  ○ Taken from real issues
  ○ Names/IPs/dates changed to protect identities
● Analyze reported issues
● Distill best practices
● Summarize takeaways
● Repeat...

Scenario 1
● Fire, it is on fire!
● Users notice response times of 1-3 sec
● App logs show timeouts
● Server logs show socket exceptions

Scenario 1 - Diagnostics
● Logs
● Understanding the timeouts
  ○ Client read timeout set
  ○ Connection closed/discarded
  ○ Symptom, not cause
● Server connection exceptions
  ○ Match timing of client timeouts
  ○ Symptom, not cause
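Matching the timing of client timeouts against server socket exceptions can be scripted once both logs are reduced to timestamps. The following is a minimal sketch, not the tooling used in the case: the function name `correlate`, the 2-second window, and all timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def correlate(client_events, server_events, window_seconds=2):
    """Pair each client timeout with the server exceptions logged near it.

    The 2-second default window is an arbitrary assumption; tune it to the
    clock skew and timeout settings of the real systems.
    """
    window = timedelta(seconds=window_seconds)
    matches = []
    for client_ts in client_events:
        nearby = [s for s in server_events if abs(s - client_ts) <= window]
        matches.append((client_ts, nearby))
    return matches


# Hypothetical timestamps standing in for parsed log lines.
client_timeouts = [
    datetime(2012, 5, 4, 10, 0, 3),
    datetime(2012, 5, 4, 10, 5, 0),
]
server_exceptions = [
    datetime(2012, 5, 4, 10, 0, 4),
    datetime(2012, 5, 4, 10, 0, 30),
]

pairs = correlate(client_timeouts, server_exceptions)
for client_ts, nearby in pairs:
    print(client_ts.isoformat(), "->", len(nearby), "server exception(s) nearby")
```

If most client timeouts have a server exception next to them, both are likely symptoms of the same underlying problem, so the search for the cause moves elsewhere (here, the disk).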

Scenario 1 - Monitoring
Graphs speak a thousand words

Scenario 1 - Takeaways
● Monitor logs
  ○ Alert, escalate
  ○ Correlate
● Disk
  ○ Monitor
  ○ Moved to RAID (10)
● Instrument/monitor the app
● Know your application and its (write) characteristics

Scenario 2
● Alerts warn that server is running hot
● Random (small) slowdowns
● Increased traffic/queries

Scenario 2 - Symptoms
High CPU use
Similar query pattern

Scenario 2 - Diagnostics
● Turn on DB profiling
● Look at logs
Identify the query patterns that take the longest or run most frequently, and run explain on them
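Ranking query patterns by total time and by frequency might look like the sketch below. It works on plain dictionaries so that it runs without a server. The `pattern` key and the sample entries are hypothetical, while `millis` mirrors the field of the same name in profiler and explain output.

```python
from collections import defaultdict


def summarize(entries):
    """Group entries by query pattern and rank by total time spent."""
    stats = defaultdict(lambda: {"count": 0, "total_millis": 0})
    for entry in entries:
        bucket = stats[entry["pattern"]]
        bucket["count"] += 1
        bucket["total_millis"] += entry["millis"]
    # Slowest patterns overall come first.
    return sorted(
        stats.items(), key=lambda kv: kv[1]["total_millis"], reverse=True
    )


# Hypothetical entries; real ones would come from the profiler or the logs.
sample = [
    {"pattern": "find by user, sort by date", "millis": 99},
    {"pattern": "find by user, sort by date", "millis": 120},
    {"pattern": "find by id", "millis": 1},
    {"pattern": "find by id", "millis": 2},
    {"pattern": "find by id", "millis": 1},
]

ranked = summarize(sample)
for pattern, info in ranked:
    print(pattern, info["count"], "calls,", info["total_millis"], "ms total")
```

The pattern at the top of the ranking is the first candidate for explain.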

Scenario 2 - Explain
db.scenario2.find({...}).sort({...}).explain()
{
    "cursor" : "BtreeCursor ABC",
    "nscanned" : 160677,
    "nscannedObjects" : 12015,
    "n" : 55,
    "millis" : 99,
    "scanAndOrder" : true,
    "indexBounds" : {...}
}
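Two things stand out in this explain output: the number of index entries scanned per document returned, and the in-memory sort flagged by `scanAndOrder`. A minimal sketch of that reading, using the numbers from the slide, follows. The function name `assess_explain` and the threshold of 10 scanned entries per result are assumptions for illustration, not a MongoDB rule.

```python
def assess_explain(plan, max_scan_ratio=10):
    """Return human-readable findings for an explain document.

    max_scan_ratio is an arbitrary illustrative threshold.
    """
    findings = []
    returned = plan["n"]
    scanned = plan["nscanned"]
    ratio = scanned / returned if returned else float("inf")
    if ratio > max_scan_ratio:
        findings.append(
            f"scanned {scanned} index entries to return {returned} "
            f"documents (about {ratio:.0f} per result)"
        )
    if plan.get("scanAndOrder"):
        findings.append(
            "scanAndOrder is true: results were sorted in memory, "
            "not read in index order"
        )
    return findings


# Values copied from the explain output above; elided fields are left out.
plan = {
    "cursor": "BtreeCursor ABC",
    "nscanned": 160677,
    "nscannedObjects": 12015,
    "n": 55,
    "millis": 99,
    "scanAndOrder": True,
}

findings = assess_explain(plan)
for finding in findings:
    print(finding)
```

Roughly 2,900 index entries scanned per returned document, plus an in-memory sort, is consistent with the high CPU use seen in this scenario.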

Scenario 2 - Diagnostics
● Create a compound index
  ○ Used for criteria and sort
  ○ Reduced CPU dramatically
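A compound index that serves both the criteria and the sort is commonly built with the equality-matched fields first and the sort fields after them. The slides elide the real query as `{...}`, so the field names below (`user_id`, `created_at`) and the helper `compound_index_keys` are hypothetical. The sketch only builds the key list and does not contact a server.

```python
# Same integer values as pymongo.ASCENDING and pymongo.DESCENDING.
ASCENDING, DESCENDING = 1, -1


def compound_index_keys(criteria_fields, sort_spec):
    """Build index keys: equality criteria first, then the sort fields.

    Assumes the criteria are equality matches; range criteria may call
    for a different field order.
    """
    keys = [(field, ASCENDING) for field in criteria_fields]
    seen = set(criteria_fields)
    for field, direction in sort_spec:
        if field not in seen:
            keys.append((field, direction))
            seen.add(field)
    return keys


# Hypothetical query shape: match on user_id, newest documents first.
keys = compound_index_keys(["user_id"], [("created_at", DESCENDING)])
print(keys)
```

With PyMongo, a list in this form can be passed to `Collection.create_index`. Re-running explain afterwards should confirm that `scanAndOrder` is gone and that `nscanned` is close to `n`.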

Scenario 2 - Takeaways
● Performance test/analyze system behavior
● Load test before deployment
● Alert on abnormal states
● High CPU is a sign of poorly indexed queries
● Rolling upgrade for indexes

Scenario 3
● General slowdown on login
● High disk utilization

Scenario 3 - Diagnostics
iostat
Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz     await    svctm   %util
sdp        0.00    0.00  0.50  0.00   27.86    0.00     56.00    149.58  20320.00  2010.00  100.00
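The iostat line shows a device at 100% utilization with an average wait of about 20 seconds (`await` is reported in milliseconds) while serving only 0.5 reads per second. A check like this can be automated. The sketch below parses the header and row from the slide; the function names and the alert thresholds (90% utilization, 100 ms await) are assumptions, not recommended values.

```python
HEADER = ("Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s "
          "avgrq-sz avgqu-sz await svctm %util")
ROW = "sdp 0.00 0.00 0.50 0.00 27.86 0.00 56.00 149.58 20320.00 2010.00 100.00"


def parse_iostat(header, row):
    """Map iostat column names to float values for one device row."""
    names = header.split()[1:]   # drop the "Device:" label
    fields = row.split()
    device = fields[0]
    values = [float(v) for v in fields[1:]]
    return device, dict(zip(names, values))


def is_saturated(metrics, util_limit=90.0, await_limit_ms=100.0):
    """Flag a device that is both fully busy and slow to respond.

    Both limits are illustrative assumptions.
    """
    return (metrics["%util"] >= util_limit
            and metrics["await"] >= await_limit_ms)


device, metrics = parse_iostat(HEADER, ROW)
print(device, "saturated:", is_saturated(metrics))
```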

Scenario 3
$ blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw  8096   512  4048          0   1099494850560   /dev/sdp

Huge read-ahead of about 4 MB (RA is counted in 512-byte sectors)
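The read-ahead figure follows from the `RA` column, which `blockdev --report` gives in 512-byte sectors. A short sketch of the arithmetic, with a hypothetical helper name, is:

```python
SECTOR_BYTES = 512  # blockdev reports read-ahead in 512-byte sectors


def read_ahead_bytes(ra_sectors):
    """Convert a blockdev RA value (sectors) to bytes."""
    return ra_sectors * SECTOR_BYTES


ra = read_ahead_bytes(8096)  # RA value from the report above
print(ra, "bytes, about", round(ra / 2**20, 2), "MiB")
```

With a setting this large, a random read of one small document can pull roughly 4 MB from disk, which fits the saturated device seen in iostat. The value can be changed with `blockdev --setra`, which also takes sectors.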

Scenario 3 - Takeaways
● Pay attention to disk configurations
● Load testing would have found this early
● MongoDB depends on the OS a lot
● Connect the dots from disproportionate effects

Best Practices Learned
● System provisioning
  ○ Capacity
  ○ Performance
  ○ Scale
  ○ Configuration
● Logs
  ○ Review
  ○ Alert
  ○ Rotate and collect (per cluster)

Best Practices Learned
● Query/Index Analysis
  ○ Database Profiler
  ○ Run explain periodically (sampled)
  ○ Instrument code, generate metrics
● Plan/test rollouts
  ○ Rolling upgrade for Replica Set
  ○ Generate indexes on secondaries first
  ○ Name services, use redirection

Thanks, more refs
Please take a look at http://mongodb.org (docs)
● Ask on the mongodb-user group
● Use MMS or historic monitoring
  ○ Watch for trends
  ○ Create alerts
  ○ Forecast capacity for provisioning
● logrotate Unix command
● Monitor disk - munin or the like
● iostat, dstat, vmstat, free, netstat

Questions