MongoDB: Scaling write performance Junegunn Choi

MongoDB: Scaling write performance | Devon 2012


Page 1: MongoDB: Scaling write performance | Devon 2012

MongoDB: Scaling write performance

Junegunn Choi

Page 2: MongoDB: Scaling write performance | Devon 2012

First impression: Easy

• Easy installation

• Easy data model

• No prior schema design

• Native support for secondary indexes

Page 3: MongoDB: Scaling write performance | Devon 2012

Second thought: Not so easy

• No SQL

• Coping with massive data growth

• Setting up and operating sharded cluster

• Scaling write performance

Page 4: MongoDB: Scaling write performance | Devon 2012

Today we’ll talk about insert performance

Page 5: MongoDB: Scaling write performance | Devon 2012

Insert throughput on a replica set

Page 6: MongoDB: Scaling write performance | Devon 2012

* 1kB record. ObjectId as PK
* WriteConcern: Journal sync on Majority

Steady 5k inserts/sec

Page 7: MongoDB: Scaling write performance | Devon 2012

Insert throughput with a secondary index

Page 8: MongoDB: Scaling write performance | Devon 2012
Page 9: MongoDB: Scaling write performance | Devon 2012

Culprit: B+Tree index

• Good at sequential insert

• e.g. ObjectId, Sequence #, Timestamp

• Poor at random insert

• Indexes on randomly-distributed data

Page 10: MongoDB: Scaling write performance | Devon 2012

Sequential vs. Random insert

[Figure: a B+Tree with sequential keys 1 to 12 (small working set) next to a B+Tree with randomly ordered keys (large working set)]

Sequential insert ➔ Small working set ➔ Fits in RAM ➔ Sequential I/O (bandwidth-bound)

Random insert ➔ Large working set ➔ Cannot fit in RAM ➔ Random I/O (IOPS-bound)
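The difference in working set size can be illustrated with a toy model. The sketch below is not how MongoDB lays out its indexes: it assumes fixed-size leaf pages covering contiguous key ranges, and the page capacity, key space, and batch size are all hypothetical numbers chosen for illustration.

```python
import random

KEYS_PER_LEAF = 100  # hypothetical number of index entries per leaf page


def leaf_pages_touched(keys, keys_per_leaf=KEYS_PER_LEAF):
    """Return the number of distinct leaf pages a batch of inserts lands on.

    Simplified model: each leaf covers a fixed, contiguous key range, so the
    leaf for a key is key // keys_per_leaf. Real B+Trees split and rebalance.
    """
    return len({key // keys_per_leaf for key in keys})


key_space = 10_000_000  # hypothetical number of keys already indexed
batch = 10_000          # hypothetical number of new inserts

# Sequential insert: the next 10,000 keys after the current maximum.
sequential_keys = range(key_space, key_space + batch)

# Random insert: 10,000 keys drawn uniformly from the whole key space.
rng = random.Random(42)
random_keys = [rng.randrange(key_space) for _ in range(batch)]

sequential_pages = leaf_pages_touched(sequential_keys)
random_pages = leaf_pages_touched(random_keys)
print(sequential_pages, random_pages)
```

In this model the sequential batch touches 100 leaf pages, while the random batch touches thousands, which is the pattern that pushes the index out of RAM.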

Page 11: MongoDB: Scaling write performance | Devon 2012

So, what do we do now?

Page 12: MongoDB: Scaling write performance | Devon 2012

1. Partitioning

[Figure: a single B+Tree does not fit in memory; split into partitions for Aug 2012, Sep 2012, and Oct 2012, a B+Tree fits in memory]

Page 13: MongoDB: Scaling write performance | Devon 2012

1. Partitioning

• MongoDB doesn’t support partitioning

• Partitioning at application-level

• e.g. Daily log collection

• logs_20121012
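Application-level partitioning mostly comes down to deriving the collection name from the record's timestamp. A minimal sketch, assuming a `logs_` prefix as in the example above; the function name and the `fmt` parameter are hypothetical:

```python
from datetime import datetime


def partition_collection_name(ts, prefix="logs", fmt="%Y%m%d"):
    """Return the collection name for a timestamp, e.g. logs_20121012.

    fmt controls the partition granularity: "%Y%m%d" gives one collection
    per day, "%Y%m%d%H" would give one per hour.
    """
    return f"{prefix}_{ts.strftime(fmt)}"


name = partition_collection_name(datetime(2012, 10, 12, 9, 30))
print(name)  # logs_20121012
```

The application would then write to the collection with that name. Since MongoDB creates a collection on first insert, no setup step is needed when the day rolls over.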

Page 14: MongoDB: Scaling write performance | Devon 2012

Switch collection every hour

Page 15: MongoDB: Scaling write performance | Devon 2012

2. Better H/W

• More RAM

• More IOPS

• RAID striping

• SSD

• AWS Provisioned IOPS (1k ~ 10k)

Page 16: MongoDB: Scaling write performance | Devon 2012
Page 17: MongoDB: Scaling write performance | Devon 2012

3. More H/W: Sharding

• Automatic partitioning across nodes

[Figure: mongos router in front of SHARD1, SHARD2, and SHARD3]

Page 18: MongoDB: Scaling write performance | Devon 2012

3 shards (3x3)

Page 19: MongoDB: Scaling write performance | Devon 2012

3 shards (3x3) on RAID 1+0

Page 20: MongoDB: Scaling write performance | Devon 2012

There’s no free lunch

• Manual partitioning

• Incidental complexity

• Better H/W

• $

• Sharding

• $$

• Operational complexity

Page 21: MongoDB: Scaling write performance | Devon 2012

“Do you really need that index?”

Page 22: MongoDB: Scaling write performance | Devon 2012

Scaling insert performance with sharding

Page 23: MongoDB: Scaling write performance | Devon 2012

= Choosing the right shard key

Page 24: MongoDB: Scaling write performance | Devon 2012

Shard key example: year_of_birth

[Figure: mongos router routing the USERS collection across SHARD1, SHARD2, and SHARD3 as 64MB chunks keyed by year_of_birth ranges: ~ 1950, 1951 ~ 1970, 1971 ~ 1990, 1991 ~ 2005, 2006 ~ 2010, 2010 ~ ∞]

Page 25: MongoDB: Scaling write performance | Devon 2012

5k inserts/sec w/o sharding

Page 26: MongoDB: Scaling write performance | Devon 2012

Sequential key

• ObjectId as shard key

• Sequence #

• Timestamp

Page 27: MongoDB: Scaling write performance | Devon 2012

Worse throughput with 3x H/W.

Page 28: MongoDB: Scaling write performance | Devon 2012

Sequential key

• All inserts into one chunk

• Chunk migration overhead

[Figure: USERS chunks on SHARD-x with ranges 1000 ~ 2000, 5000 ~ 7500, and 9000 ~ ∞; incoming keys 9001, 9002, 9003, 9004, ...]
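Why every sequential insert ends up in the same chunk can be seen with a small range-routing sketch. This only imitates range-based routing; it is not the mongos implementation, and the chunk boundaries below are hypothetical values loosely based on the figure.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical chunk lower bounds; the last chunk is open-ended (9000 ~ infinity).
CHUNK_LOWER_BOUNDS = [0, 1000, 2000, 5000, 7500, 9000]


def chunk_for(key, lower_bounds=CHUNK_LOWER_BOUNDS):
    """Return the index of the chunk whose [lower, next_lower) range holds key."""
    return bisect_right(lower_bounds, key) - 1


# Monotonically increasing keys, as produced by a sequence or timestamp.
sequential_inserts = [9001, 9002, 9003, 9004]
hits = Counter(chunk_for(k) for k in sequential_inserts)
print(hits)  # every insert lands in chunk 5, the open-ended one
```

Because new keys are always larger than every existing boundary, the last chunk (and the one shard holding it) takes all of the write load until the chunk is split and migrated.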

Page 29: MongoDB: Scaling write performance | Devon 2012

Sequential key

Page 30: MongoDB: Scaling write performance | Devon 2012

Hash key

• e.g. SHA1(_id) = 9f2feb0f1ef425b292f2f94 ...

• Distributes evenly across all ranges
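A hash key of this kind can be computed in the application before the insert. A minimal sketch, assuming the id is hashed via its string form; the function name is hypothetical:

```python
import hashlib


def hash_key(doc_id):
    """Return the SHA-1 hex digest of a document id, for use as a shard key."""
    return hashlib.sha1(str(doc_id).encode("utf-8")).hexdigest()


# Consecutive ids map to unrelated positions in the key space.
keys = [hash_key(i) for i in range(1, 5)]
for doc_id, key in zip(range(1, 5), keys):
    print(doc_id, key)
```

The digest is deterministic, so the same id always routes to the same chunk, while neighbouring ids are scattered across the whole range.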

Page 31: MongoDB: Scaling write performance | Devon 2012
Page 32: MongoDB: Scaling write performance | Devon 2012

Hash key

• Performance drops as collection grows

• Why? Mandatory shard key index

• B+Tree problem again!

Page 33: MongoDB: Scaling write performance | Devon 2012

[Chart: Sequential key, Hash key]

Page 34: MongoDB: Scaling write performance | Devon 2012

Sequential + hash key

• Coarse-grained sequential prefix

• e.g. Year-month + hash value

• 201210_24c3a5b9

[Figure: B+Tree with key ranges 201210_*, 201209_*, 201208_*]
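A key shaped like 201210_24c3a5b9 could be built as follows. This is a sketch under assumptions: the slide does not say which hash is used or what is hashed, so SHA-1 of the id truncated to 8 hex characters is only one possible choice, and the function name is hypothetical.

```python
import hashlib
from datetime import datetime


def month_hash_key(doc_id, ts):
    """Return a 'YYYYMM_<8 hex chars>' key: coarse sequential prefix + hash."""
    prefix = ts.strftime("%Y%m")
    digest = hashlib.sha1(str(doc_id).encode("utf-8")).hexdigest()
    return f"{prefix}_{digest[:8]}"


key = month_hash_key("user-42", datetime(2012, 10, 12))
print(key)
```

All keys written in the same month share a prefix, so inserts are random only within the current month's slice of the index rather than across the whole tree.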

Page 35: MongoDB: Scaling write performance | Devon 2012

But what if...

[Figure: the same B+Tree (201210_*, 201209_*, 201208_*) with a large working set]

Page 36: MongoDB: Scaling write performance | Devon 2012

Sequential + hash key

• Can you predict data growth rate?

• Balancer not clever enough

• Only considers # of chunks

• Migration slow during heavy-writes

Page 37: MongoDB: Scaling write performance | Devon 2012

[Chart: Sequential key, Hash key, Sequential + hash key]

Page 38: MongoDB: Scaling write performance | Devon 2012

Low-cardinality hash key

• e.g. A~Z, 00~FF

• Alleviates B+Tree problem

• Sequential access on fixed # of parts

[Figure: Local B+Tree; shard key range: A ~ D; keys A, B, C]
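A low-cardinality key in the 00 ~ FF range can be taken from the first byte of a hash. A minimal sketch, assuming SHA-1 of the id as the hash (the slide does not name one); the function name is hypothetical:

```python
import hashlib


def low_cardinality_key(doc_id):
    """Return a two-character hex key (00 ~ FF) derived from the id's hash."""
    digest = hashlib.sha1(str(doc_id).encode("utf-8")).hexdigest()
    return digest[:2].upper()


# Many ids collapse onto at most 256 distinct key values.
distinct = {low_cardinality_key(i) for i in range(10_000)}
print(len(distinct))
```

With only a fixed number of key values, each shard's index sees inserts at a fixed number of positions instead of everywhere.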

Page 39: MongoDB: Scaling write performance | Devon 2012
Page 40: MongoDB: Scaling write performance | Devon 2012

Low-cardinality hash key

• Limits the # of possible chunks

• e.g. 00 ~ FF ➔ 256 chunks

• Chunk grows past 64MB

• Balancing becomes difficult
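The arithmetic behind the chunk-size problem is short. A chunk cannot be split below a single shard key value, so the number of distinct values caps the number of chunks. The 100 GB collection size below is a hypothetical figure for illustration only.

```python
def average_chunk_mb(total_data_gb, distinct_key_values):
    """Return the average chunk size in MB when every key value is one chunk."""
    return total_data_gb * 1024 / distinct_key_values


max_chunks = 16 ** 2  # two hex characters: 00 ~ FF
size_mb = average_chunk_mb(100, max_chunks)  # hypothetical 100 GB collection
print(max_chunks, size_mb)  # 256 400.0
```

At that size each chunk is already several times larger than the 64MB target, and the balancer has nothing smaller to move.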

Page 41: MongoDB: Scaling write performance | Devon 2012

[Chart: Sequential key, Hash key, Sequential + hash key, Low-cardinality hash key]

Page 42: MongoDB: Scaling write performance | Devon 2012

Low-cardinality hash prefix + sequential part

• e.g. Short hash prefix + timestamp

• FA1350005981

• Nice index access pattern

• Unlimited # of chunks

[Figure: Local B+Tree; shard key range: A000 ~ C999; keys A000 ... A123, B000 ... B123, C000 ... C123]
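A key like FA1350005981 reads as a two-character hash prefix followed by a Unix timestamp in seconds. That reading is an assumption, as is the choice of SHA-1 over the id for the prefix; the function name is hypothetical.

```python
import hashlib


def prefixed_sequential_key(doc_id, unix_seconds):
    """Return '<2 hex chars><unix timestamp>': hash prefix + sequential part."""
    digest = hashlib.sha1(str(doc_id).encode("utf-8")).hexdigest()
    prefix = digest[:2].upper()
    return f"{prefix}{unix_seconds}"


first = prefixed_sequential_key("user-42", 1350005981)
later = prefixed_sequential_key("user-42", 1350005982)
print(first, later)
```

Within one prefix the keys grow monotonically, so each prefix behaves like its own sequential insert stream, while the timestamp part keeps the number of distinct keys (and therefore possible chunk split points) unbounded.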

Page 43: MongoDB: Scaling write performance | Devon 2012

Finally, 2x throughput

Page 44: MongoDB: Scaling write performance | Devon 2012

Lessons learned

• Know the performance impact of secondary index

• Choose the right shard key

• Test with large data sets

• Linear scalability is hard

• If you really need it, consider HBase or Cassandra

• SSD