Back to Basics Webinar, Part 6 (rev)

Page 3: Back tobasicswebinar part6-rev

Solution Architect, MongoDB

Sam Weaver

#MongoDBBasics

‘Build an Application’ Webinar Series

Deploying your application in production

Page 4: Back tobasicswebinar part6-rev

Agenda

• Replica Sets Lifecycle

• Developing with Replica Sets

• Scaling your database

Page 5: Back tobasicswebinar part6-rev

Q&A

• Virtual Genius Bar
  – Use chat to post questions
  – EMEA Solution Architecture / Support Team are on hand
  – Make use of them during the sessions!

Page 6: Back tobasicswebinar part6-rev

Recap

• Introduction to MongoDB

• Schema design

• Interacting with the database

• Indexing

• Analytics
  – Map Reduce
  – Aggregation Framework

Page 7: Back tobasicswebinar part6-rev

Deployment Considerations

Page 8: Back tobasicswebinar part6-rev

Working Set Exceeds Physical Memory

Page 9: Back tobasicswebinar part6-rev

Why Replication?

• How many have faced node failures?

• How many have been woken up to perform a failover?

• How many have experienced issues due to network latency?

• Different uses for data
  – Normal processing
  – Simple analytics

Page 10: Back tobasicswebinar part6-rev

Replica Set Lifecycle

Page 11: Back tobasicswebinar part6-rev

Replica Set – Creation

Page 12: Back tobasicswebinar part6-rev

Replica Set – Initialize

Page 13: Back tobasicswebinar part6-rev

Replica Set – Failure

Page 14: Back tobasicswebinar part6-rev

Replica Set – Failover

Page 15: Back tobasicswebinar part6-rev

Replica Set – Recovery

Page 16: Back tobasicswebinar part6-rev

Replica Set – Recovered

Page 17: Back tobasicswebinar part6-rev

Developing with Replica Sets

Page 18: Back tobasicswebinar part6-rev

Strong Consistency

Page 19: Back tobasicswebinar part6-rev

Delayed Consistency

Page 20: Back tobasicswebinar part6-rev

Write Concern

• Network acknowledgement

• Wait for error

• Wait for journal sync

• Wait for replication
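
The four levels above can be sketched as getLastError command documents, the legacy command style this deck uses in its tagging example. This is a minimal sketch, not a driver API: the helper name `getLastErrorFor`, the level names, and the values `w: 2` and `wtimeout: 5000` are assumptions chosen for illustration.

```javascript
// Hypothetical helper: maps each write concern level from the slide to the
// getLastError document a client could send after a write.
function getLastErrorFor(level) {
  switch (level) {
    case "unacknowledged":
      // Network acknowledgement only: the client does not ask the server
      // for the outcome, so there is no command to send.
      return null;
    case "acknowledged":
      // Wait for error: the primary confirms it applied the write.
      return { getLastError: 1, w: 1 };
    case "journaled":
      // Wait for journal sync: the write is in the primary's journal.
      return { getLastError: 1, w: 1, j: true };
    case "replicated":
      // Wait for replication: two members hold the write (illustrative
      // value), giving up after 5 seconds (illustrative value).
      return { getLastError: 1, w: 2, wtimeout: 5000 };
    default:
      throw new Error("unknown level: " + level);
  }
}
```

Each step down the list trades latency for durability: the later the acknowledgement point, the longer the client waits and the less likely the write is lost on failover.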

Page 21: Back tobasicswebinar part6-rev

Unacknowledged

Page 22: Back tobasicswebinar part6-rev

MongoDB Acknowledged (wait for error)

Page 23: Back tobasicswebinar part6-rev

Wait for Journal Sync

Page 24: Back tobasicswebinar part6-rev

Wait for Replication

Page 25: Back tobasicswebinar part6-rev

Tagging

• Control where data is written to, and read from

• Each member can have one or more tags
  – tags: {dc: "ny"}
  – tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}

• Replica set defines rules for write concerns

• Rules can change without changing app code

Page 26: Back tobasicswebinar part6-rev

Tagging Example

{
  _id : "mySet",
  members : [
    {_id : 0, host : "A", tags : {"dc": "ny"}},
    {_id : 1, host : "B", tags : {"dc": "ny"}},
    {_id : 2, host : "C", tags : {"dc": "sf"}},
    {_id : 3, host : "D", tags : {"dc": "sf"}},
    {_id : 4, host : "E", tags : {"dc": "cloud"}}
  ],
  settings : {
    getLastErrorModes : {
      allDCs : {"dc" : 3},
      someDCs : {"dc" : 2}
    }
  }
}

> db.blogs.insert({...})
> db.runCommand({getLastError : 1, w : "someDCs"})
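
The number in a mode such as someDCs : {"dc" : 2} counts distinct values of the tag, not members: the write must reach members covering two different data centers. The sketch below models that rule in plain JavaScript. The function `modeSatisfied` and the array `taggedMembers` are hypothetical names for illustration, not part of MongoDB.

```javascript
// Members and tags copied from the tagging example configuration.
const taggedMembers = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];

// Returns true when the members that acknowledged the write cover at least
// `needed` distinct values for every tag named in the mode.
function modeSatisfied(members, ackedHosts, mode) {
  return Object.entries(mode).every(([tag, needed]) => {
    const distinctValues = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && m.tags[tag] !== undefined)
        .map((m) => m.tags[tag])
    );
    return distinctValues.size >= needed;
  });
}

const someDCs = { dc: 2 };
const allDCs = { dc: 3 };

// A and B are both in "ny": only one distinct data center.
console.log(modeSatisfied(taggedMembers, ["A", "B"], someDCs)); // false
// A ("ny") and C ("sf") cover two data centers.
console.log(modeSatisfied(taggedMembers, ["A", "C"], someDCs)); // true
```

Because the rule lives in the replica set configuration, an operator can redefine someDCs without any change to application code that writes with w : "someDCs".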

Page 27: Back tobasicswebinar part6-rev

Wait for Replication (Tagging)

Page 28: Back tobasicswebinar part6-rev

Read Preference Modes

• 5 modes
  – primary (only), the default
  – primaryPreferred
  – secondary
  – secondaryPreferred
  – nearest

When more than one node is possible, closest node is used for reads (all modes but primary)
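
A rough model of how the five modes narrow the candidate nodes, and how the closest candidate is then chosen, is sketched below. It is a simplification under stated assumptions: `selectNode`, the node fields `role`, `up`, and `pingMs` are hypothetical, and real drivers consider a latency window rather than a single strict minimum.

```javascript
// Picks a node for a read under the given read preference mode.
// Returns null when no node is eligible.
function selectNode(nodes, mode) {
  const up = nodes.filter((n) => n.up);
  const primary = up.filter((n) => n.role === "primary");
  const secondaries = up.filter((n) => n.role === "secondary");

  let candidates;
  switch (mode) {
    case "primary":
      candidates = primary;
      break;
    case "primaryPreferred":
      candidates = primary.length > 0 ? primary : secondaries;
      break;
    case "secondary":
      candidates = secondaries;
      break;
    case "secondaryPreferred":
      candidates = secondaries.length > 0 ? secondaries : primary;
      break;
    case "nearest":
      candidates = up;
      break;
    default:
      throw new Error("unknown read preference mode: " + mode);
  }
  if (candidates.length === 0) return null;

  // When more than one node is possible, use the one with the lowest ping.
  return candidates.reduce((best, n) => (n.pingMs < best.pingMs ? n : best));
}

const readNodes = [
  { host: "A", role: "primary", up: true, pingMs: 5 },
  { host: "B", role: "secondary", up: true, pingMs: 20 },
  { host: "C", role: "secondary", up: true, pingMs: 2 },
];
console.log(selectNode(readNodes, "primary").host); // A
console.log(selectNode(readNodes, "secondary").host); // C
```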

Page 29: Back tobasicswebinar part6-rev

Tagged Read Preference

• Custom read preferences

• Control where you read from by (node) tags
  – E.g. { "disk": "ssd", "use": "reporting" }

• Use in conjunction with standard read preferences
  – Except primary
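
Tag matching for reads can be sketched as a filter over the members: tag sets are tried in order, a member matches when it carries every tag in the set, and an empty set matches any member. The helper `matchTagSets` and the sample members below are hypothetical, written only to illustrate the idea.

```javascript
// Returns the members matching the first tag set that matches anything.
function matchTagSets(members, tagSets) {
  for (const tagSet of tagSets) {
    const matches = members.filter((m) =>
      Object.entries(tagSet).every(([key, value]) => m.tags[key] === value)
    );
    if (matches.length > 0) return matches;
  }
  return [];
}

// Illustrative secondaries; host names and tags are assumptions.
const reportingMembers = [
  { host: "B", tags: { disk: "hdd", use: "production" } },
  { host: "C", tags: { disk: "ssd", use: "reporting" } },
];

// Route analytics reads to the node tagged for reporting.
const picked = matchTagSets(reportingMembers, [{ disk: "ssd", use: "reporting" }]);
console.log(picked.map((m) => m.host)); // [ 'C' ]
```

A trailing empty tag set, as in `[{ disk: "nvme" }, {}]`, acts as a fallback so that reads still succeed when no member carries the preferred tags.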

Page 30: Back tobasicswebinar part6-rev

Our application

• SAFE writes acceptable for our use case

• Potential to use secondary reads for comments, but probably not needed

• Use tagged reads for analytics

Page 31: Back tobasicswebinar part6-rev

Scaling

Page 32: Back tobasicswebinar part6-rev

Working Set Exceeds Physical Memory

Page 33: Back tobasicswebinar part6-rev

When to consider Sharding?

• When a specific resource becomes a bottleneck on a machine or replica set
  – RAM
  – Disk IO
  – Storage
  – Concurrency

Page 34: Back tobasicswebinar part6-rev

Vertical Scalability (Scale Up)

Page 35: Back tobasicswebinar part6-rev

Horizontal Scalability (Scale Out)

Page 36: Back tobasicswebinar part6-rev

Partitioning

• User defines shard key

• Shard key defines range of data

• Key space is like points on a line

• Range is a segment of that line
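
The "points on a line" picture can be made concrete with a small lookup: each chunk owns a half-open segment [min, max) of the key space, and a key belongs to exactly one segment. The chunk boundaries, shard names, and the function `findChunk` below are assumptions for illustration, using a numeric shard key.

```javascript
// Three segments that together cover the whole line.
const keyRanges = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: Infinity, shard: "shard2" },
];

// Finds the segment containing the key: lower bound inclusive, upper
// bound exclusive, so boundaries never belong to two chunks.
function findChunk(chunks, key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

console.log(findChunk(keyRanges, 99).shard); // shard0
console.log(findChunk(keyRanges, 100).shard); // shard1
```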

Page 37: Back tobasicswebinar part6-rev

Data Distribution

Initially 1 chunk

Default max chunk size: 64 MB

MongoDB automatically splits and migrates chunks when the max size is reached
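
The split half of this behaviour can be sketched as a toy model: a collection starts as one chunk spanning the whole key space, and a chunk that grows past the maximum size is split in two. Splitting at the median key is an assumption made for this sketch, as are all names below; migration of chunks between shards is not modelled.

```javascript
const MAX_CHUNK_BYTES = 64 * 1024 * 1024; // default max chunk size from the slide

// A new collection is a single chunk covering every possible key.
function newCollection() {
  return [{ min: -Infinity, max: Infinity, docs: [] }];
}

function chunkBytes(chunk) {
  return chunk.docs.reduce((sum, d) => sum + d.bytes, 0);
}

// Inserts a document, then splits its chunk if it exceeds the maximum.
function insertDoc(chunks, key, bytes) {
  const i = chunks.findIndex((c) => key >= c.min && key < c.max);
  const chunk = chunks[i];
  chunk.docs.push({ key, bytes });
  if (chunkBytes(chunk) <= MAX_CHUNK_BYTES) return;

  const keys = chunk.docs.map((d) => d.key).sort((a, b) => a - b);
  const splitAt = keys[Math.floor(keys.length / 2)];
  // If the median equals the smallest key, one side would be empty, so
  // the chunk cannot be split (the key is not granular enough).
  if (splitAt === keys[0]) return;

  const left = { min: chunk.min, max: splitAt, docs: chunk.docs.filter((d) => d.key < splitAt) };
  const right = { min: splitAt, max: chunk.max, docs: chunk.docs.filter((d) => d.key >= splitAt) };
  chunks.splice(i, 1, left, right);
}

const MB = 1024 * 1024;
const demoChunks = newCollection();
for (const key of [10, 20, 30, 40, 50]) insertDoc(demoChunks, key, 16 * MB);
console.log(demoChunks.length); // 2
```

The early return in the split step is the reason a shard key needs enough distinct values: a chunk full of identical keys can never be divided.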

Page 38: Back tobasicswebinar part6-rev

Architecture

Page 39: Back tobasicswebinar part6-rev

What is a Shard?

• Shard is a node of the cluster

• Shard can be a single mongod or a replica set

Page 40: Back tobasicswebinar part6-rev

Meta Data Storage

• Config Server
  – Stores cluster chunk ranges and locations
  – Can have only 1 or 3 (production must have 3)
  – Not a replica set

Page 41: Back tobasicswebinar part6-rev

Routing and Managing Data

• Mongos
  – Acts as a router / balancer
  – No local data (persists to config database)
  – Can have 1 or many

Page 42: Back tobasicswebinar part6-rev

Sharding infrastructure

Page 43: Back tobasicswebinar part6-rev

Cluster Request Routing

• Targeted Queries

• Scatter Gather Queries

• Scatter Gather Queries with Sort
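
The difference between a targeted and a scatter gather query comes down to whether the router can use the shard key. The sketch below models only the simplest case, an equality match on a numeric shard key; `routeQuery`, the chunk table, and the field names are hypothetical.

```javascript
// Chunk table as a router might hold it: key range to shard.
const routingTable = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: Infinity, shard: "shard2" },
];

// Decides which shards must receive the query.
function routeQuery(query, shardKey, chunks) {
  if (Object.prototype.hasOwnProperty.call(query, shardKey)) {
    // Shard key present: exactly one chunk, and so one shard, can match.
    const value = query[shardKey];
    const chunk = chunks.find((c) => value >= c.min && value < c.max);
    return { type: "targeted", shards: [chunk.shard] };
  }
  // No shard key: every shard has to be asked.
  const allShards = [...new Set(chunks.map((c) => c.shard))];
  return { type: "scatter-gather", shards: allShards };
}

console.log(routeQuery({ userId: 150 }, "userId", routingTable));
// { type: 'targeted', shards: [ 'shard1' ] }
console.log(routeQuery({ title: "hello" }, "userId", routingTable).shards.length); // 3
```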

Page 44: Back tobasicswebinar part6-rev

Cluster Request Routing: Targeted Query

Page 45: Back tobasicswebinar part6-rev

Routable request received

Page 46: Back tobasicswebinar part6-rev

Request routed to appropriate shard

Page 47: Back tobasicswebinar part6-rev

Shard returns results

Page 48: Back tobasicswebinar part6-rev

Mongos returns results to client

Page 49: Back tobasicswebinar part6-rev

Cluster Request Routing: Non-Targeted Query

Page 50: Back tobasicswebinar part6-rev

Non-Targeted Request Received

Page 51: Back tobasicswebinar part6-rev

Request sent to all shards

Page 52: Back tobasicswebinar part6-rev

Shards return results to mongos

Page 53: Back tobasicswebinar part6-rev

Mongos returns results to client

Page 54: Back tobasicswebinar part6-rev

Cluster Request Routing: Non-Targeted Query with Sort

Page 55: Back tobasicswebinar part6-rev

Non-Targeted request with sort received

Page 56: Back tobasicswebinar part6-rev

Request sent to all shards

Page 57: Back tobasicswebinar part6-rev

Query and sort performed locally

Page 58: Back tobasicswebinar part6-rev

Shards return results to mongos

Page 59: Back tobasicswebinar part6-rev

Mongos merges sorted results

Page 60: Back tobasicswebinar part6-rev

Mongos returns results to client
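
Because each shard sorts its own results, the router only has to merge already-sorted lists rather than sort everything again. A minimal sketch of that merge step follows; `mergeSorted` is a hypothetical name and the inputs are plain arrays standing in for per-shard result streams.

```javascript
// Merges several individually sorted arrays into one sorted array by
// repeatedly taking the smallest head element across all inputs.
function mergeSorted(shardResults, compare) {
  const positions = shardResults.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let s = 0; s < shardResults.length; s++) {
      if (positions[s] >= shardResults[s].length) continue; // shard exhausted
      if (
        best === -1 ||
        compare(shardResults[s][positions[s]], shardResults[best][positions[best]]) < 0
      ) {
        best = s;
      }
    }
    if (best === -1) return merged; // every shard exhausted
    merged.push(shardResults[best][positions[best]]);
    positions[best] += 1;
  }
}

const perShard = [[1, 4, 9], [2, 3], [5]];
console.log(mergeSorted(perShard, (a, b) => a - b)); // [ 1, 2, 3, 4, 5, 9 ]
```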

Page 61: Back tobasicswebinar part6-rev

Shard Key

Page 62: Back tobasicswebinar part6-rev

Shard Key

• Shard key is immutable

• Shard key values are immutable

• Shard key must be indexed

• Shard key limited to 512 bytes in size

• Shard key used to route queries
  – Choose a field commonly used in queries

• Only shard key can be unique across shards
  – `_id` field is only unique within individual shard

Page 63: Back tobasicswebinar part6-rev

A suitable shard key for our app…

• Occurs in most queries

• Routes to each shard

• Is granular enough to not exceed 64MB chunks

• Any candidates?
  – Author?
  – Date?
  – _id?
  – Title?
  – Author & Date?

Page 64: Back tobasicswebinar part6-rev

Summary

Page 65: Back tobasicswebinar part6-rev

Things to remember

• Size appropriately for your working set

• Shard when you need to, not before

• Pick a shard key wisely

Page 66: Back tobasicswebinar part6-rev

Next Session – 17th April

• Backup and Disaster Recovery
  – Backup and restore options

Page 67: Back tobasicswebinar part6-rev

Thank you
