Solution Architect, MongoDB c [email protected] @ctindel Chad Tindel #MongoDBWorld Hardware Provisioning

Hardware Provisioning


DESCRIPTION

Some of the most common questions we hear from users relate to capacity planning and hardware choices. How many replicas do I need? Should I consider sharding right away? How much RAM will I need for my working set? SSD or HDD? No one likes spending a lot of cash on hardware, and cloud bills can be just as painful. MongoDB is different from traditional RDBMSs in its resource management, so you need to be mindful when deciding on the cluster layout and hardware. In this talk we will review the factors that drive the capacity requirements: volume of queries, access patterns, indexing, and working set size, among others. Attendees will gain additional insight as we go through a few real-world scenarios, as experienced with MongoDB Inc customers, and come up with their ideal cluster layout and hardware.


Page 1: Hardware Provisioning

Solution Architect, MongoDB

[email protected]

@ctindel

Chad Tindel

#MongoDBWorld

Hardware Provisioning

Page 2: Hardware Provisioning

MongoDB is so easy for programmers….

Page 3: Hardware Provisioning

Even a baby can write an application!

Page 4: Hardware Provisioning

MongoDB is so easy to manage with MMS…

Page 5: Hardware Provisioning

Even a baby can manage a cluster!

Page 6: Hardware Provisioning

Hardware Selection for MongoDB is….

Page 7: Hardware Provisioning

Not so easy!

Page 8: Hardware Provisioning
Page 9: Hardware Provisioning


A Cautionary Tale

Page 10: Hardware Provisioning

The methodology (in theory)

Page 11: Hardware Provisioning

Requirements – Step One

• It is impossible to properly size a MongoDB cluster without first documenting your business requirements

• Availability: what is your uptime requirement?

• Throughput

• Responsiveness
  – What is acceptable latency?
  – Is higher latency during peak times acceptable?

Page 12: Hardware Provisioning

Requirements – Step Two

• Understand the resources available to you
  – Storage
  – Memory
  – Network
  – CPU

• Many customers are limited to the options available in AWS or presented by their own Enterprise Virtualization team

Page 13: Hardware Provisioning

Continuing Requirements – Step Three

• Once you deploy initially, it is common for requirements to change
  – More users added to the application
    • Causes more queries and a larger working set
  – New functionality changes query patterns
    • New indexes cause a larger working set
  – What started as a read-intensive application can add more and more write-heavy workloads
    • More write-locking increases reader queue depth

• You must monitor and collect metrics and update your hardware selection as necessary (Scale up / add RAM? Add more shards?)

Page 14: Hardware Provisioning

Run a Proof of Concept

• Forces you to:
  – Do schema / index design
  – Understand query patterns
  – Get a handle on Working Set size

• Start small on a single node
  – See how much performance you can get from one box

• Add replication, then add sharding
  – Understand how these affect performance in your use case

• POC can be done on a smaller scale to infer what will be needed for production

Page 15: Hardware Provisioning

POC – Requirements to Gather

• Data Sizes
  – Total Number of Documents
  – Average Document Size
  – Size of Data on Disk
  – Size of Indexes on Disk
  – Expected growth
  – What is your document model?

• Ingestion
  – Insertions / Updates / Deletes per second, peak & average
  – Bulk inserts / updates? If so, how large and how often?

Page 16: Hardware Provisioning

POC – Requirements to Gather

• Query Patterns and Performance Expectations
  – Read Response SLA
  – Write Response SLA
  – Range queries or single document queries?
  – Sort conditions
  – Is more recent data queried more frequently?

• Data Policies
  – How long will you keep the data for?
  – Replication Requirements
  – Backup Requirements / Time to Recovery

Page 17: Hardware Provisioning

POC – Requirements to Gather

• Multi-datacenter Requirements
  – Number and location of datacenters
  – Cross DC latency
  – Active / Active or Active / Passive?
  – Geographical / Data locality requirements?

• Security Requirements
  – Encryption over the wire (SSL)?
  – Encryption of data at rest?

Page 18: Hardware Provisioning

Resource Usage

• Storage
  – IOPS
  – Size
  – Data & Loading Patterns

• Memory
  – Working Set

• CPU
  – Speed
  – Cores

• Network
  – Latency
  – Throughput

Page 19: Hardware Provisioning

Storage Capability

7,200 rpm SATA ~ 75-100 IOPS

15,000 rpm SAS ~ 175-210 IOPS

Amazon SSD EBS ~ 4,000 PIOPS / Volume, ~ 48,000 PIOPS / Instance

Intel X25-E (SLC) ~ 5,000 IOPS

Fusion IO ~ 135,000 IOPS

Violin Memory 6000 ~ 1,000,000 IOPS
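
The per-device figures above can be turned into a rough device count for a target IOPS budget. The sketch below is only an illustration: it assumes IOPS add up linearly across devices (real arrays rarely scale that cleanly), it uses the low end of each range on the slide, and both the `devices_needed` helper and the 2,000 IOPS target are hypothetical.

```python
import math

# Approximate random IOPS per device, copied from the slide above.
# Where the slide gives a range, the low end is used to stay conservative.
DEVICE_IOPS = {
    "7200rpm_sata": 75,      # slide: ~75-100 IOPS
    "15000rpm_sas": 175,     # slide: ~175-210 IOPS
    "intel_x25e_slc": 5000,  # slide: ~5,000 IOPS
    "fusion_io": 135000,     # slide: ~135,000 IOPS
}


def devices_needed(target_iops: int, device: str) -> int:
    """Number of devices needed to reach target_iops, assuming linear scaling."""
    return math.ceil(target_iops / DEVICE_IOPS[device])


TARGET_IOPS = 2000  # hypothetical requirement, not from the talk

for name in DEVICE_IOPS:
    print(name, devices_needed(TARGET_IOPS, name))
```

Under these assumptions a 2,000 IOPS workload needs 27 SATA drives or 12 SAS drives, but only a single SSD.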

Page 20: Hardware Provisioning

Storage Measuring

Page 21: Hardware Provisioning

Storage Measuring

Page 22: Hardware Provisioning

Memory Measuring

• Added in 2.4
  – workingSet option on db.serverStatus()

> db.serverStatus( { workingSet: 1 } )
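
To make the output of that command usable for sizing, the page count it reports has to be converted to bytes. The sketch below assumes the `workingSet` sub-document exposes a `pagesInMemory` field (as in the 2.4 server status output) and that the page size is 4 KB; the `working_set_gib` helper and the sample values are hypothetical, so check both assumptions against your own server.

```python
PAGE_SIZE_BYTES = 4096  # assumed page size; verify on your platform


def working_set_gib(working_set_doc: dict, page_size: int = PAGE_SIZE_BYTES) -> float:
    """Convert the pagesInMemory counter into GiB of data touched."""
    return working_set_doc["pagesInMemory"] * page_size / 1024**3


# Hypothetical sample shaped like the workingSet section of serverStatus.
sample = {"pagesInMemory": 2_621_440, "overSeconds": 900}

print(working_set_gib(sample))  # 10.0
```

Here roughly 10 GiB of data was touched over the sampled 900 seconds. Comparing that figure with the RAM in the box gives a first estimate of whether the working set fits.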

Page 23: Hardware Provisioning

Network

• Latency
  – WriteConcern
  – ReadPreference

• Throughput
  – Update/Write Patterns
  – Reads/Queries

• Come to love netperf

Page 24: Hardware Provisioning

CPU Usage

• Non-indexed Queries

• Sorting

• Aggregation
  – Map/Reduce
  – Framework

Page 25: Hardware Provisioning

Case Studies (theory applied)

Page 26: Hardware Provisioning

Case Study #1: A Spanish Bank

• Problem statement: want to store 6 months’ worth of logs

• 18TB of total data (3 TB/month)

• Primarily analyzing the last month’s worth of logs, so Working Set Size is 1 month’s worth of data (3TB) plus indexes (1TB) = 4 TB Working Set

Page 27: Hardware Provisioning

Case Study #1: Hardware Selection

• QA Environment
  – Did not want to mirror a full production cluster. Just wanted to hold 2TB of data
  – 3 nodes / shard * 4 shards = 12 physical machines
  – 2 mongos
  – 3 config servers (virtual machines)

• Production Environment
  – 3 nodes / shard * 36 shards = 108 physical machines
  – 128 GB RAM per node * 36 shards = 4.6 TB RAM
  – 2 mongos
  – 3 config servers (virtual machines)
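
One way to read the production numbers is as a working-set-in-RAM calculation. The sketch below counts one node's RAM per shard (the replicas hold copies of the same data) and uses decimal units. The slides do not say why 36 shards were chosen rather than the minimum, so treating the difference as headroom is an assumption, and the `shards_for_working_set` helper is hypothetical.

```python
import math


def shards_for_working_set(working_set_gb: float, ram_per_node_gb: float) -> int:
    """Minimum shards so that one node per shard can hold the working set in RAM."""
    return math.ceil(working_set_gb / ram_per_node_gb)


WORKING_SET_GB = 3000 + 1000  # 1 month of logs + indexes, from the slides
RAM_PER_NODE_GB = 128
NODES_PER_SHARD = 3
CHOSEN_SHARDS = 36            # what the customer actually deployed

minimum_shards = shards_for_working_set(WORKING_SET_GB, RAM_PER_NODE_GB)
cluster_ram_gb = CHOSEN_SHARDS * RAM_PER_NODE_GB
machines = CHOSEN_SHARDS * NODES_PER_SHARD

print(minimum_shards, cluster_ram_gb, machines)  # 32 4608 108
```

The minimum under these assumptions is 32 shards. The 36 deployed give 4,608 GB of RAM, about 15% more than the 4 TB working set.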

Page 28: Hardware Provisioning

Case Study #2: A Large Online Retailer

• Problem statement: Moving their product catalog from SQL Server to MongoDB as part of a larger architectural overhaul to Open Source Software

• 2 main datacenters running active/active

• On Cyber Monday they peaked at 214 requests/sec, so let’s budget for 400 requests/sec to give some headroom

Page 29: Hardware Provisioning

Case Study #2: The POC

• A POC yielded the following numbers:
  – 4 million product SKUs, average JSON document size 30KB

• Need to service requests for:
  – A specific product (by _id)
  – Products in a specific category (e.g. “Desks” or “Hard Drives”)
    • Returns 72 documents (or 200 if it’s a Google bot crawling)
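
These POC numbers also bound the network throughput the cluster may have to deliver. The sketch below multiplies the budgeted 400 requests/sec by the payload per request. It assumes full 30 KB documents are returned with no projection and uses decimal units; the all-category-listing case is a deliberately pessimistic upper bound, since the talk gives no request mix. The `payload_mb_per_sec` helper is hypothetical.

```python
def payload_mb_per_sec(requests_per_sec: int, docs_per_request: int, avg_doc_kb: int) -> float:
    """Response payload in MB/s, ignoring protocol overhead."""
    return requests_per_sec * docs_per_request * avg_doc_kb / 1000


BUDGET_RPS = 400  # headroom figure from the problem statement
AVG_DOC_KB = 30   # average JSON document size from the POC

by_id = payload_mb_per_sec(BUDGET_RPS, 1, AVG_DOC_KB)         # single-product lookups only
by_category = payload_mb_per_sec(BUDGET_RPS, 72, AVG_DOC_KB)  # category listings only

print(by_id, by_category)  # 12.0 864.0
```

The real figure lies somewhere between 12 MB/s and 864 MB/s depending on the request mix, which is one reason to measure query patterns during the POC.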

Page 30: Hardware Provisioning

Case Study #2: The Math

• Want to partition (shard) by category, and have products that exist in multiple categories duplicated
  – The average product appears in 2 categories, so we actually need to store 8M SKU documents, not 4M

• 8M docs * 30KB/doc = 240GB of data

• 270 GB with indexes

• Working Set is 100% of all data + indexes as this is a core functionality that must be fast at all times
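
The arithmetic on this slide can be reproduced in a few lines. The sketch below uses decimal units (1 GB = 1,000,000 KB) and takes the 270 GB total from the slide; the 30 GB index figure is simply the difference between the two slide numbers, and the `catalog_size_gb` helper is hypothetical.

```python
def catalog_size_gb(skus: int, categories_per_sku: int, avg_doc_kb: int) -> tuple[int, float]:
    """Return (stored documents, data size in GB) when each SKU is duplicated per category."""
    docs = skus * categories_per_sku
    return docs, docs * avg_doc_kb / 1_000_000


docs, data_gb = catalog_size_gb(skus=4_000_000, categories_per_sku=2, avg_doc_kb=30)

TOTAL_WITH_INDEXES_GB = 270          # from the slide
index_gb = TOTAL_WITH_INDEXES_GB - data_gb

print(docs, data_gb, index_gb)  # 8000000 240.0 30.0
```

Because the working set is declared to be 100% of data plus indexes, the full 270 GB is the RAM target for the recommendation that follows.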

Page 31: Hardware Provisioning

Case Study #2: Our Recommendation

• MongoDB’s initial recommendation was to deploy a single Replica Set with enough RAM in each server to hold all the data (at least 384GB RAM/server)

• 4 node Replica Set (2 nodes in each DC, 1 arbiter in a 3rd DC)
  – Allows for a node in each DC to go down for maintenance or system crash while still servicing the application centers in that datacenter

• Deploy using secondary reads (NEAREST read preference)

• This avoids the complexity of sharding, setting up mongos, config servers, worrying about orphaned documents, etc.

Page 32: Hardware Provisioning

Datacenter 1: Node 1 (Primary), Node 2 (Secondary)

Datacenter 2: Node 3 (Secondary), Node 4 (Secondary)

Datacenter 3: Arbiter

Page 33: Hardware Provisioning

Case Study #2: Actual Provisioning

• Customer decided to deploy on their corporate VMWare Cloud

• IT would not give them nodes any bigger than 64 GB RAM

• Decided to deploy 3 shards (4 nodes each + arbiter) = 192 GB RAM cluster-wide into a staging environment, and add a fourth shard if staging proves it would be worthwhile
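
It helps to compare this deployment with the 270 GB working set computed earlier. The sketch below counts one node's RAM per shard, as the slide's 192 GB figure does, and ignores memory used by the operating system and by mongod itself, so real coverage will be somewhat lower. The `ram_coverage` helper is hypothetical.

```python
import math


def ram_coverage(working_set_gb: float, ram_per_node_gb: float, shards: int) -> float:
    """Fraction of the working set that fits in RAM, counting one node per shard."""
    return min(1.0, shards * ram_per_node_gb / working_set_gb)


WORKING_SET_GB = 270   # data + indexes, from "The Math" slide
RAM_PER_NODE_GB = 64   # largest node IT would provide

for shards in (3, 4):
    print(shards, "shards:", round(ram_coverage(WORKING_SET_GB, RAM_PER_NODE_GB, shards), 2))

shards_for_full_coverage = math.ceil(WORKING_SET_GB / RAM_PER_NODE_GB)
print("shards for full coverage:", shards_for_full_coverage)
```

Under these assumptions three shards hold about 71% of the working set in RAM, a fourth raises that to about 95%, and five would be needed to hold all of it. That gap is what the staging environment is meant to evaluate.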

Page 34: Hardware Provisioning

Key Takeaways

• Document your performance requirements up front

• Conduct a Proof of Concept

• Always test with a real workload

• Constantly monitor and adjust based on changing requirements

Page 35: Hardware Provisioning

Solution Architect, MongoDB

Chad Tindel

#MongoDBWorld

Thank You