by Jared Rosoff. Dec 2012
Technical Director, 10gen
@forjared
Jared Rosoff
#MongoSV 2012
Schema Design-- Inboxes!
Agenda
• Problem overview
• Design Options
  – Fan out on Read
  – Fan out on Write
  – Fan out on Write with Bucketing
• Conclusions
Problem Overview
Let’s get Social
Sending Messages
?
Reading my Inbox
?
Design Options
3 Approaches (there are more)
• Fan out on Read
• Fan out on Write
• Fan out on Write with Bucketing
Fan out on read
• Generally, not the right approach
• 1 document per message sent
• Multiple recipients in an array key
• Reading an inbox is finding all messages with my own name in the recipient field
• Requires scatter-gather on sharded cluster
• Then a lot of random IO on a shard to find everything
// Shard on "from"
sh.shardCollection("myapp.messages", { "from": 1 })

// Make sure we have an index to handle inbox reads
db.messages.ensureIndex({ "to": 1, "sent": 1 })

msg = {
  from: "Joe",
  to: ["Bob", "Jane"],
  sent: new Date(),
  message: "Hi!"
}

// Send a message
db.messages.save(msg)

// Read my inbox
db.messages.find({ to: "Joe" }).sort({ sent: -1 })
Fan out on Read
Fan out on read – Send Message
[Diagram: a Send Message request against Shard 1, Shard 2 and Shard 3]
Fan out on read – Inbox Read
[Diagram: a Read Inbox request against Shard 1, Shard 2 and Shard 3]
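The scatter-gather cost of fan out on read can be illustrated without a cluster. The following is a minimal in-memory sketch, not the MongoDB API: the `shards` array, the `shardFor` routing rule, and the numeric `sent` values are all assumptions made for illustration.

```javascript
// In-memory stand-in for a 3-shard cluster (hypothetical, not MongoDB).
const SHARD_COUNT = 3;
const shards = Array.from({ length: SHARD_COUNT }, () => []);

// Hypothetical routing rule: sum of character codes modulo the shard count.
function shardFor(key) {
  let sum = 0;
  for (const ch of key) sum += ch.charCodeAt(0);
  return sum % SHARD_COUNT;
}

// Send: a single document, routed by sender ("from" is the shard key).
function sendMessage(msg) {
  shards[shardFor(msg.from)].push(msg);
}

// Read: the recipient is not the shard key, so every shard must be queried.
function readInbox(user) {
  let shardsQueried = 0;
  const found = [];
  for (const shard of shards) {
    shardsQueried += 1;
    for (const msg of shard) {
      if (msg.to.includes(user)) found.push(msg);
    }
  }
  found.sort((a, b) => b.sent - a.sent); // newest first
  return { messages: found, shardsQueried };
}

sendMessage({ from: "Joe", to: ["Bob", "Jane"], sent: 1, message: "Hi!" });
sendMessage({ from: "Jane", to: ["Bob"], sent: 2, message: "Lunch?" });

const bobInbox = readInbox("Bob");
console.log(bobInbox.shardsQueried, bobInbox.messages.map((m) => m.message));
```

Each message is stored once, but reading any inbox touches all three simulated shards.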
Fan out on write
• Tends to scale better than fan out on read
• 1 document per recipient
• Reading my inbox is just finding all of the messages with me as the recipient
• Can shard on recipient, so inbox reads hit one shard
• But still lots of random IO on the shard
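// Shard on "recipient" and "sent"
sh.shardCollection("myapp.messages", { "recipient": 1, "sent": 1 })

msg = {
  from: "Joe",
  to: ["Bob", "Jane"],
  sent: new Date(),
  message: "Hi!"
}

// Send a message: one document per recipient
msg.to.forEach(function (recipient) {
  // Insert a fresh document each time; re-saving msg itself would
  // reuse its _id and overwrite the previous recipient's copy
  db.messages.insert({
    from: msg.from,
    to: msg.to,
    recipient: recipient,
    sent: msg.sent,
    message: msg.message
  });
});

// Read my inbox
db.messages.find({ recipient: "Joe" }).sort({ sent: -1 })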
Fan out on Write
Fan out on write – Send Message
[Diagram: a Send Message request against Shard 1, Shard 2 and Shard 3]
Fan out on write – Read Inbox
[Diagram: a Read Inbox request against Shard 1, Shard 2 and Shard 3]
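For comparison, here is the same kind of in-memory sketch for fan out on write. It is an illustration under assumptions, not the MongoDB API: `shards` and `shardFor` are hypothetical stand-ins, and routing uses only the recipient rather than the full "recipient, sent" shard key.

```javascript
// In-memory stand-in for a 3-shard cluster (hypothetical, not MongoDB).
const SHARD_COUNT = 3;
const shards = Array.from({ length: SHARD_COUNT }, () => []);

// Hypothetical routing rule: sum of character codes modulo the shard count.
function shardFor(key) {
  let sum = 0;
  for (const ch of key) sum += ch.charCodeAt(0);
  return sum % SHARD_COUNT;
}

// Send: one copy per recipient, each routed by its recipient.
function sendMessage(msg) {
  for (const recipient of msg.to) {
    shards[shardFor(recipient)].push({ ...msg, recipient });
  }
  return msg.to.length; // number of documents written
}

// Read: the recipient is the routing key, so only one shard is consulted.
function readInbox(user) {
  return shards[shardFor(user)]
    .filter((m) => m.recipient === user)
    .sort((a, b) => b.sent - a.sent); // newest first
}

const written = sendMessage({ from: "Joe", to: ["Bob", "Jane"], sent: 1, message: "Hi!" });
sendMessage({ from: "Jane", to: ["Bob"], sent: 2, message: "Lunch?" });

const bobMessages = readInbox("Bob");
console.log(written, bobMessages.map((m) => m.message));
```

Sending now costs one write per recipient, and in exchange an inbox read stays on a single simulated shard.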
Fan out on write with bucketing
• Generally the best approach
• Each “inbox” document is an array of messages
• Append a message onto “inbox” of recipient
• Bucket inbox documents so there are not too many messages per document
• Can shard on recipient, so inbox reads hit one shard
• 1 or 2 documents to read the whole inbox
// Shard on "owner / sequence"
sh.shardCollection("myapp.inbox", { "owner": 1, "sequence": 1 })
sh.shardCollection("myapp.users", { "user_name": 1 })

msg = {
  from: "Joe",
  to: ["Bob", "Jane"],
  sent: new Date(),
  message: "Hi!"
}

// Send a message
msg.to.forEach(function (recipient) {
  // Bump the recipient's message counter and read back the new value
  var count = db.users.findAndModify({
    query: { user_name: recipient },
    update: { "$inc": { "msg_count": 1 } },
    upsert: true,
    new: true
  }).msg_count;

  // 50 messages per bucket; floor keeps the bucket number an integer
  var sequence = Math.floor(count / 50);

  // Append to the recipient's current bucket, creating it if needed
  db.inbox.update(
    { owner: recipient, sequence: sequence },
    { $push: { "messages": msg } },
    { upsert: true }
  );
});

// Read my inbox
db.inbox.find({ owner: "Joe" }).sort({ sequence: -1 }).limit(2)
Fan out on Write
Bucketed fan out on write – Send
[Diagram: a Send Message request against Shard 1, Shard 2 and Shard 3]
Bucketed fan out on write – Read
[Diagram: a Read Inbox request against Shard 1, Shard 2 and Shard 3]
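The bucket arithmetic is easier to follow when run in isolation. The sketch below is a plain JavaScript simulation, not the MongoDB API: the `counters` and `inbox` maps are hypothetical stand-ins for the users and inbox collections, and the bucket size of 50 is taken from the slide.

```javascript
const BUCKET_SIZE = 50;     // messages per inbox document, as on the slide
const counters = new Map(); // stand-in for each user's msg_count
const inbox = new Map();    // "owner/sequence" -> { owner, sequence, messages }

// Deliver one message to one recipient's current bucket.
function deliver(recipient, msg) {
  // Increment first and use the new value, like $inc with new: true.
  const count = (counters.get(recipient) || 0) + 1;
  counters.set(recipient, count);

  // Integer bucket number; without the floor every message gets its own bucket.
  const sequence = Math.floor(count / BUCKET_SIZE);

  const key = recipient + "/" + sequence;
  if (!inbox.has(key)) {
    inbox.set(key, { owner: recipient, sequence, messages: [] }); // like an upsert
  }
  inbox.get(key).messages.push(msg); // like $push
  return sequence;
}

// Read the newest buckets for one owner.
function readInbox(owner, limit = 2) {
  return [...inbox.values()]
    .filter((doc) => doc.owner === owner)
    .sort((a, b) => b.sequence - a.sequence)
    .slice(0, limit);
}

for (let i = 1; i <= 120; i++) {
  deliver("Bob", { from: "Joe", sent: i, message: "msg " + i });
}

const latest = readInbox("Bob");
console.log(latest.map((doc) => [doc.sequence, doc.messages.length]));
// prints [ [ 2, 21 ], [ 1, 50 ] ]
```

Because the counter is incremented before the division, bucket 0 holds 49 messages and every later full bucket holds 50. Reading the two newest buckets returns at most 100 messages in two documents.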
Discussion
Tradeoffs

                         | Fan out on Read                              | Fan out on Write                           | Bucketed Fan out on Write
Send Message Performance | Best: single shard, single write             | Good: shard per recipient, multiple writes | Worst: shard per recipient, appends (grows)
Read Inbox Performance   | Worst: broadcast to all shards, random reads | Good: single shard, random reads           | Best: single shard, single read
Data Size                | Best: message stored once                    | Worst: copy per recipient                  | Worst: copy per recipient
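To put rough numbers on the tradeoffs, the counts can be written as a small function. This is a back-of-the-envelope sketch, not a benchmark: the `estimate` function and its field names are hypothetical, and it assumes the shard keys used in the three designs above.

```javascript
// Rough per-message counts for each design (illustrative only).
function estimate(recipients, shardCount) {
  return {
    fanOutOnRead: {
      copiesStored: 1,          // message stored once
      writesOnSend: 1,          // single write on the sender's shard
      shardsOnRead: shardCount, // inbox read is broadcast to all shards
    },
    fanOutOnWrite: {
      copiesStored: recipients, // copy per recipient
      writesOnSend: recipients, // one insert per recipient
      shardsOnRead: 1,          // sharded on recipient
    },
    bucketedFanOutOnWrite: {
      copiesStored: recipients,     // copy per recipient, inside buckets
      writesOnSend: 2 * recipients, // counter increment plus append, per recipient
      shardsOnRead: 1,              // sharded on owner
    },
  };
}

console.log(estimate(2, 3));
```

For the two-recipient message used throughout, on a three-shard cluster, fan out on read writes once but reads from three shards, while both fan out on write designs store two copies and read from one shard.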
Things to consider
• Lots of recipients
  – Fan out on write might become prohibitive
  – Consider introducing a “Group”
• Very large message size
  – Multiple copies of messages can be a burden
  – Consider single copy of message with a “pointer” per inbox
• More writes than reads
  – Fan out on read might be okay
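The "pointer" idea for very large messages could look roughly like the following. It is a minimal in-memory sketch under assumptions: the `messages` and `inboxes` maps and the `messageId` field are hypothetical names, not part of the talk.

```javascript
const messages = new Map(); // messageId -> full message, stored once
const inboxes = new Map();  // owner -> array of { messageId, sent } pointers
let nextId = 1;

// Store the body once, then fan out only small pointers.
function sendMessage(msg) {
  const messageId = nextId++;
  messages.set(messageId, msg);
  for (const recipient of msg.to) {
    if (!inboxes.has(recipient)) inboxes.set(recipient, []);
    inboxes.get(recipient).push({ messageId, sent: msg.sent });
  }
  return messageId;
}

// Reading follows each pointer back to the single stored copy.
function readInbox(owner) {
  return (inboxes.get(owner) || [])
    .slice()
    .sort((a, b) => b.sent - a.sent) // newest first
    .map((pointer) => messages.get(pointer.messageId));
}

const bigBody = "x".repeat(100000); // stands in for a very large message
sendMessage({ from: "Joe", to: ["Bob", "Jane"], sent: 1, message: bigBody });

console.log(messages.size, readInbox("Bob")[0].message.length);
```

The large body is stored once regardless of the number of recipients, at the price of an extra lookup per message when an inbox is read.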
Comments – where do they live?
Conclusion
Summary
• Multiple ways to model status updates
• Bucketed fan out on write is typically the better approach
• Think about how your model distributes across shards
• Think about how much random IO needs to happen on a shard
Technical Director, 10gen
Jared Rosoff
#MongoSV
Thank You