Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

DESCRIPTION

Curious about the benefits of sharding your MongoDB deployments? Do you need help deciding when you should shard, or which collections to shard first? Or maybe you just need some guidance on finding the right shard key. This session will cover these topics and give you a primer on MongoDB sharding and why it makes the database so compelling for so many applications. This is an entry-level to medium-level talk with references and links to more advanced material on sharding MongoDB.

Page 1: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Sharding in 20 minutes: Why; Who; When; Where;

David Murphy, Mongo Master, Lead DBA, ObjectRocket

@dmurphy_data @objectrocket

Page 2: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Background

• 16 yrs in databases, development, & system engineering

• Lead DBA @ ObjectRocket

• Mongo Master with a focus on sharding, chunks, and scaling Mongo beyond normal means.

Page 3: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

What does a sharded cluster look like?

Page 4: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Why;Who;When;Where;

i) Why do we shard?

ii) Who should I shard?

iii) When do we shard?

iv) Where is my shard key?

Page 8: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Why do we shard?

• Scaling out write locks

• Small dataset to search per node

• Getting more connections to the data

• More, smaller nodes vs. scaling up to expensive nodes

Page 11: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Who should I shard?

• Biggest Collections by Size

• Busiest Collection by changes

• Groupings of data (examples):

    • State/Country

    • UserID

    • Company

    • Category
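
The first two criteria above (biggest by size, busiest by changes) can be sketched as a simple ranking. This is only an illustration: the `CollectionStats` type, its field names, and the sample numbers are hypothetical, and in practice the figures would come from the server's collection statistics and write counters.

```python
from typing import NamedTuple


class CollectionStats(NamedTuple):
    """Hypothetical per-collection numbers gathered from monitoring."""
    name: str
    size_bytes: int        # data size of the collection
    writes_per_sec: float  # inserts + updates + deletes, sampled


def shard_candidates(stats, top_n=2):
    """Return names of the largest and the busiest collections, no repeats."""
    by_size = sorted(stats, key=lambda s: s.size_bytes, reverse=True)[:top_n]
    by_writes = sorted(stats, key=lambda s: s.writes_per_sec, reverse=True)[:top_n]
    names = []
    for s in by_size + by_writes:
        if s.name not in names:
            names.append(s.name)
    return names


# Made-up sample data, for illustration only.
sample = [
    CollectionStats("events", 120 * 1024**3, 900.0),
    CollectionStats("users", 8 * 1024**3, 40.0),
    CollectionStats("sessions", 2 * 1024**3, 2500.0),
    CollectionStats("settings", 10 * 1024**2, 0.1),
]

candidates = shard_candidates(sample)
print(candidates)  # ['events', 'users', 'sessions']
```

Note that `sessions` is small on disk but still makes the list because of its write rate, which is the point of checking both criteria.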

Page 15: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

When do I shard?

ALWAYS as early as possible!

Reasons:

• Not all commands work

• Future Proof - No recoding

• Adding an index once you're live can take time you don’t have!
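
As a rough illustration of sharding early, the sketch below builds the two admin commands (`enableSharding` and `shardCollection`) that would be sent to a mongos while the collection is still empty. The database, collection, and field names are hypothetical, and the helper functions are not part of any driver. On an empty collection MongoDB can create the shard key index itself; on a populated one the index has to exist first, which is where the time goes.

```python
def enable_sharding_cmd(database):
    """Build the admin command that enables sharding for a database."""
    return {"enableSharding": database}


def shard_collection_cmd(database, collection, field, hashed=False):
    """Build the admin command that shards `database.collection` on `field`."""
    key_spec = "hashed" if hashed else 1  # 1 means a ranged (ascending) key
    return {
        "shardCollection": f"{database}.{collection}",
        "key": {field: key_spec},
    }


# Hypothetical names: database "appdb", collection "events", field "user_id".
commands = [
    enable_sharding_cmd("appdb"),
    shard_collection_cmd("appdb", "events", "user_id", hashed=True),
]

for cmd in commands:
    # With a driver such as PyMongo, each dict would be passed to
    # client.admin.command(cmd) on a connection to a mongos.
    print(cmd)
```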

Page 16: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Where (and what) is my shard key?

You have to pick your own :/

But there are some quick hints…

Page 18: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Where (and what) is my shard key?

Sharding Quick Hints:

• Hashed shard keys

    Great for even disk usage

    Uses scatter-gathers == more conns

    Dates, increasing IDs, and text are great here

• Non-hashed keys

    Use profiler_level:2 & review ALL operations

    Only things you won't change

    No dates

    No increasing ID numbers

    No text
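
The hints above can be seen in a small simulation. This is a sketch under stated assumptions, not MongoDB's implementation: MD5 plus a modulus stands in for the real hashed-key function and chunk ranges, and the shard count, split points, and ID ranges are made-up values.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size


def ranged_shard(key, split_points):
    """Return the index of the range chunk that holds `key`."""
    return bisect_right(split_points, key)


def hashed_shard(key, num_shards=NUM_SHARDS):
    """Map `key` to a shard via a hash (MD5 here, only for illustration)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Chunk boundaries chosen from data seen so far: IDs 0..999 split four ways.
split_points = [250, 500, 750]

# New inserts keep increasing, so every one lands above the last boundary.
new_ids = range(1000, 2000)
ranged_load = Counter(ranged_shard(i, split_points) for i in new_ids)
hashed_load = Counter(hashed_shard(i) for i in new_ids)
print("inserts, ranged key:", dict(ranged_load))  # all on the last shard
print("inserts, hashed key:", dict(sorted(hashed_load.items())))

# A query for a contiguous range of IDs: one shard with a ranged key,
# several shards (scatter-gather) with a hashed key.
query_ids = range(1000, 1100)
ranged_targets = {ranged_shard(i, split_points) for i in query_ids}
hashed_targets = {hashed_shard(i) for i in query_ids}
print("shards queried, ranged key:", sorted(ranged_targets))
print("shards queried, hashed key:", sorted(hashed_targets))
```

With the ranged key every new insert hits the same shard, which is why increasing IDs and dates are on the "no" list there. The hashed key spreads the writes but pays for it with range queries that fan out to more shards.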

Page 19: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Why Mongo sharding/balancing?

Modulus sharding with MySQL:

    Hard to rebalance online

    Requires application coding to support

Ring topologies like Cassandra:

    Can't change schema online

    Hard to rebalance online
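
To make the modulus point concrete, here is a minimal sketch assuming the application picks a shard as `key % N` over integer IDs (the function name and key range are illustrative). It counts how many keys change shard when the cluster grows from 4 to 5 shards.

```python
def shard_for(key, num_shards):
    """Modulus sharding: the application picks the shard as key % N."""
    return key % num_shards


keys = range(10_000)  # hypothetical integer row IDs

# Grow the cluster from 4 to 5 shards and count rows that must move.
moved = sum(1 for k in keys if shard_for(k, 4) != shard_for(k, 5))
print(f"{moved} of {len(keys)} keys change shard")  # 8000 of 10000 keys change shard
```

Adding one shard relocates 80% of the data in this setup, and the application has to know about the new modulus, which is the rebalancing and coding burden the slide refers to.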

Page 20: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Further Reading

Presentations:

Kenny Gorman Sharding - bit.ly/1oXYDfm

David Murphy - Adv Sharding for Operations - bit.ly/1oXYDfm

Other Sharding MongoDB Links - bit.ly/ZTtDI1

Picking a shard key (manual) - http://bit.ly/1ozuzMH

Choosing a shard key - http://slidesha.re/1nBnGtq

Page 21: Lightning Talk: What You Need to Know Before You Shard in 20 Minutes

Contact

@dmurphy_data @objectrocket

[email protected]

www.objectrocket.com

WE ARE HIRING! (DBA, DevOps, and more)

https://www.objectrocket.com/careers