Running MongoDB 3.0 on AWS


Running MongoDB on AWS

Mark Yalenti, Senior Solutions Architect, MongoDB Inc.

Agenda

• MongoDB Basics
• Deployment Configurations
• AWS EC2 Instances
• Configuring Instances
• Storage Considerations
• Backup Considerations

MONGODB BASICS

MongoDB Basics

• Open source
• Document database
• High performance
• Horizontally scalable
• Full featured
• Built to match agile development and deployment

MongoDB Features

• Flexible document data model
• Rich ad-hoc queries
• Real-time aggregation
• Geospatial support (Within, Intersects and Near operators)
• Text search
• Pluggable Storage Engine Architecture
• Built-in support for
  – Redundancy, failover, auto-partitioning

7x-10x Performance, 50%-80% Less Storage

How: WiredTiger Storage Engine

• Same data model, same query language, same ops
• Write performance gains driven by document-level concurrency control
• Storage savings driven by native compression
• Non-disruptive upgrade

[Chart: Performance, MongoDB 3.0 vs. MongoDB 2.6]

MMAPv1 Storage Engine

• History
  – MMAPv0 was initial storage engine of MongoDB
  – Delegates memory management to operating system
• New Capabilities
  – Collection-level concurrency control
  – Multiple performance enhancements
  – Windows performance now equivalent to Linux
• Advantages
  – Read-intensive applications
  – Cache survives MongoDB restart, upgrades
  – Drop-in upgrade

Accessing MongoDB

Shell: Command-line shell for interacting directly with database

    > db.collection.insert({product: "MongoDB", type: "Document Database"})
    > db.collection.findOne()
    {
        "_id" : ObjectId("5106c1c2fc629bfe52792e86"),
        "product" : "MongoDB",
        "type" : "Document Database"
    }

Drivers: Drivers for most popular programming languages and frameworks

• Java
• Python
• Perl
• Ruby
• Haskell
• JavaScript

DEPLOYMENT CONFIGURATION

Deploying MongoDB

• Single node
  – Development: prototyping, testing
• Replica Set
  – Production: high availability, disaster recovery
• Shard Cluster
  – Production: auto-partitioning, linear read/write scale

MongoDB: Single Node

[Diagram: App connected to a single MongoDB node]

MongoDB: Replica Sets

[Diagram: App connected to a MongoDB primary, with two MongoDB secondaries]

MongoDB: Shard Cluster

[Diagram: three App instances, each paired with a mongos; three config servers; three shards, each a replica set of one MongoDB primary and two MongoDB secondaries]
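To make the roles in the shard cluster concrete, here is a toy sketch of range-based routing: a table of chunks (the kind of metadata the config servers hold) and a lookup that a query router such as mongos would perform. The chunk boundaries, shard names, and the function `routeByShardKey` are hypothetical, and this is a simplified model, not MongoDB's actual implementation.

```javascript
// Toy model: each chunk owns shard-key values in [min, max) and lives on one shard.
// Boundaries and shard names are made up for illustration.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard2" },
];

// Return the name of the shard whose chunk covers the given shard-key value.
function routeByShardKey(key, chunkTable) {
  const chunk = chunkTable.find((c) => key >= c.min && key < c.max);
  if (!chunk) {
    throw new Error("no chunk covers key " + key);
  }
  return chunk.shard;
}
```

Because the application only talks to the router, chunks can move between shards without any change to application code.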

AMAZON WEB SERVICES

EC2 Instance Types

• General Purpose
• Compute-optimized
• GPU
• Memory-optimized
• Storage-optimized
• Micro

EC2 Instance Types

• General Purpose
• Compute-optimized
• GPU (compute resources not needed)
• Memory-optimized
• Storage-optimized
• Micro (bursty, no sustained CPU)

EC2 Instance Types

• General Purpose
  – M3, M4 (Instance Store vs EBS)
• Compute-optimized
  – C3, C4 (Instance Store vs EBS)
• Memory-optimized
  – R3
• Storage-optimized
  – I2, D2

Additional Considerations

• Memory Optimized Instances for larger working set
• More CPUs are suggested for WiredTiger based instances
• Placement groups can be used for high-bandwidth needs

Components and Sizing

• mongod: core database process
  – High performance
  – Memory, CPU, Storage, Network
• config: shard metadata
  – Smaller
  – m4.medium or better
• mongos: shard query router
  – Deploy on app server

Replica Sets: Availability Zones

[Diagram: App with one MongoDB primary and two MongoDB secondaries, spread across Zone 1, Zone 2 and Zone 3]
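A minimal sketch of a replica set configuration document with one member per Availability Zone. The hostnames, the set name `rs0`, the zone labels, and the helper `buildReplicaSetConfig` are hypothetical; the `_id`, `members`, `host`, and `tags` fields follow the replica set configuration format.

```javascript
// Hypothetical hosts, one per Availability Zone; replace with your own.
const members = [
  { zone: "zone-1", host: "mongo-a.example.internal:27017" },
  { zone: "zone-2", host: "mongo-b.example.internal:27017" },
  { zone: "zone-3", host: "mongo-c.example.internal:27017" },
];

// Build a replica set configuration document, tagging each member with its zone.
function buildReplicaSetConfig(name, nodes) {
  return {
    _id: name,
    members: nodes.map((node, index) => ({
      _id: index,
      host: node.host,
      tags: { zone: node.zone },
    })),
  };
}

const config = buildReplicaSetConfig("rs0", members);
// In the mongo shell, connected to one of the members, this document
// would be passed to rs.initiate(config).
```

Tagging members by zone is optional; it is shown here only because it keeps the placement visible in the configuration itself.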

Replica Sets: Regions

[Diagram: App with one MongoDB primary and two MongoDB secondaries, spread across Region 1 and Region 2]

Replica Sets: Regions and Zones

[Diagram: App with one MongoDB primary and two MongoDB secondaries, spread across zones in Region 1 and Region 2]

Shard Cluster: Regions

[Diagram: the shard cluster (App instances with mongos, three config servers, three shards of one MongoDB primary and two MongoDB secondaries each) spread across Region 1 and Region 2]

Shard Cluster: Regions

[Diagram: second view of the same shard cluster components spread across Region 1 and Region 2]

High Availability

• Use Replica Sets
  – Deploy in odd numbers
  – Maintain majority
• Withstand the loss of
  – Any single zone?
  – Any single region?
  – Deploy in 3 places
• Scale
  – Replica Sets for HA
  – Shards for scale
  – Combine for both

[Diagram: MongoDB primary (1), MongoDB secondary (2), MongoDB secondary (3)]
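The "odd numbers", "maintain majority" and "deploy in 3 places" points can be checked with a little arithmetic. This is a sketch under the assumption that every member votes; the function names and location labels are hypothetical.

```javascript
// Votes needed to elect a primary in a replica set of n voting members.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// How many members can be lost while a majority still remains.
function faultTolerance(n) {
  return n - majority(n);
}

// placement: array of location labels (zone or region), one per voting member.
// Returns true if losing any single location still leaves a majority.
function survivesAnySingleLoss(placement) {
  const needed = majority(placement.length);
  const counts = {};
  for (const location of placement) {
    counts[location] = (counts[location] || 0) + 1;
  }
  return Object.values(counts).every(
    (count) => placement.length - count >= needed
  );
}
```

A 4-member set tolerates one failure, the same as a 3-member set, which is why odd numbers are preferred. Three members in two places (for example `["r1", "r1", "r2"]`) cannot survive the loss of the place holding two of them, while one member in each of three places can survive the loss of any one.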

BEST PRACTICES

Sensible Instance Defaults

• Best practices are meant to be a sensible starting point
• Strive for smooth and consistent performance
• Tune -> Scale Vertically -> Scale Horizontally
• Amazon Linux optimized for EC2
• EBS provides persistent storage
• EBS-optimized allocates additional NIC for storage
• Provisioned IOPS provides consistent EBS performance
• Use separate PIOPS volumes for data, log, journal

Instance Configuration Best Practices

• Install via yum for flexibility and simplicity
  – See mongodb.org for details
• Update system settings (Don’t forget about NTP!)
• Use EXT4 or XFS (WiredTiger runs best on XFS)
• Set read ahead (default is too high)
• Update ulimits (default is too low)
• Update TCP KeepAlive

https://docs.mongodb.org/manual/administration/production-notes/
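The read ahead, ulimit, and TCP KeepAlive items above can be turned into a simple checklist. The sketch below compares observed values against target limits; the target numbers, the setting names, and the function `checkSettings` are illustrative assumptions, so confirm the actual recommendations in the production notes for your MongoDB version and storage engine.

```javascript
// Illustrative targets only (assumptions); confirm against the production notes.
const targets = {
  openFiles: { min: 64000 },          // ulimit -n
  processes: { min: 64000 },          // ulimit -u
  readAheadSectors: { max: 32 },      // blockdev --getra on the data volume
  tcpKeepaliveSeconds: { max: 120 },  // net.ipv4.tcp_keepalive_time
};

// Return a list of human-readable problems; an empty list means all checks passed.
function checkSettings(observed, limits) {
  const problems = [];
  for (const [name, limit] of Object.entries(limits)) {
    const value = observed[name];
    if (value === undefined) {
      problems.push(name + ": not measured");
      continue;
    }
    if (limit.min !== undefined && value < limit.min) {
      problems.push(name + ": " + value + " is below " + limit.min);
    }
    if (limit.max !== undefined && value > limit.max) {
      problems.push(name + ": " + value + " is above " + limit.max);
    }
  }
  return problems;
}

// Example observations (not measured on a real host): common Linux defaults
// for open files, read ahead, and keepalive, with the process limit already raised.
const exampleHost = {
  openFiles: 1024,
  processes: 64000,
  readAheadSectors: 256,
  tcpKeepaliveSeconds: 7200,
};
const problems = checkSettings(exampleHost, targets);
```

With these example values the check flags the open file limit as too low and both read ahead and keepalive as too high, matching the direction of the bullets above.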

Data Safety

• What’s your backup plan?
• Have you tested restoring?
• Is your data highly available?
• How do you recover from disaster?

Protecting Your Data

• Replica Sets
  – Proper deployments provide HA and DR
• Manual backup/restore
  – Scriptable, tunable
• Cloud Manager Backup
  – Continuous, secure backup

Manual Backup Considerations

• Consider Journaling (Write Ahead Log)
  – On by default
• Allows for DB durability in case of a fault
• With Journaling, a snapshot technology can be used with MMAPv1
• MMAPv1 does in-place updates; fsync is required if you don’t use journaling
• WiredTiger does not require fsync as it effectively does write ahead natively
• Journaling with WiredTiger is still a good idea
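For the MMAPv1-without-journaling case above, a snapshot is usually wrapped in an fsync lock so the files on disk are consistent. The sketch below assumes a `db` object exposing `fsyncLock()` and `fsyncUnlock()`, as the mongo shell's `db` does; `snapshotWithoutJournal` and the `takeSnapshot` callback are hypothetical names, and the snapshot mechanism itself (for example an EBS snapshot) is left to the caller.

```javascript
// db: an object exposing fsyncLock() and fsyncUnlock(), such as the mongo shell's `db`.
// takeSnapshot: hypothetical callback that triggers the volume snapshot.
function snapshotWithoutJournal(db, takeSnapshot) {
  db.fsyncLock(); // flush pending writes to disk and block new writes
  try {
    return takeSnapshot();
  } finally {
    db.fsyncUnlock(); // always release the lock, even if the snapshot fails
  }
}
```

The `try`/`finally` is the point of the sketch: a failed snapshot must not leave the database locked against writes.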

MongoDB Cloud Manager

• Single-click provisioning, scaling & upgrades, admin tasks, including instance deployment on EC2
• Monitoring, with charts, dashboards and alerts on 100+ metrics
• Backup and restore, with point-in-time recovery, support for shard clusters

The Best Way to Manage MongoDB In Your Data Center
Up to 95% Reduction in Operational Overhead

Resources

• MongoDB on AWS best practices:
  – http://docs.mongodb.org/ecosystem/platforms/amazon-ec2/
• MongoDB production notes:
  – http://docs.mongodb.org/manual/administration/production-notes/
• MongoDB docs:
  – http://docs.mongodb.org

QUESTIONS?
