mongo @ ex.fm Lucas Hrabovsky CTO #MongoPGH

mongodb + ex.fm

DESCRIPTION

_id, padding factor, and bucketing, oh my! Slides from my talk at MongoPGH (http://www.10gen.com/events/mongodb-pgh), May 15, 2012.

Page 1: mongodb + ex.fm

mongo @ ex.fm

Lucas Hrabovsky, CTO

#MongoPGH

Page 2: mongodb + ex.fm

ex.fm turns websites into CDs

Page 3: mongodb + ex.fm

browser extensions

Page 4: mongodb + ex.fm

_id and indexes

• Bad Ideas
  – ObjectId("4fb284…")
  – Big Compound Indexes
  – Long, Variable Width Strings Miss Indexes
• Good Ideas
  – Make _id mean something
  – Fixed Width Hashes
  – Use _id as a compound index
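A minimal sketch of the "fixed width hash" and "_id as a compound index" ideas, in plain Node.js with no database connection. The helper names `fixedWidthKey` and `compoundId`, the choice of MD5, and the sample URL are assumptions made for illustration; the slides do not say which hash ex.fm uses.

```javascript
const crypto = require("crypto");

// Hypothetical helper: hash a variable-length string (for example a site URL)
// into a fixed-width, 32-character hex key. MD5 is an assumption here.
function fixedWidthKey(value) {
  return crypto.createHash("md5").update(value, "utf8").digest("hex");
}

// Hypothetical helper: fold several fields into one _id string so the
// default _id index can stand in for a separate compound index.
function compoundId(parts) {
  return parts.join("-");
}

const siteKey = fixedWidthKey("http://example.com/a/very/long/path?with=query");
const id = compoundId([siteKey, 1337029114]);

console.log(siteKey.length); // 32, whatever the length of the input
console.log(id);
```

Because every key has the same width, index entries stay small and predictable regardless of how long the original string was.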

Page 5: mongodb + ex.fm

activity feeds: first attempt

db.user.feed.find({'username': 'lucas', 'verb': 'love'}).sort({'created': -1})

{"_id": "201109122304-lucas-dan-c7dede43…", "username": "lucas", "created": 201109122304, "actor": "dan", "verb": "love"}

Working just fine for 4MM documents, but getting slow…

Page 6: mongodb + ex.fm

new version of activity feeds

db.user.feed.find({'vid': /^lucas-/}).sort({'vid': -1})

{"_id": "201109122304-lucas-dan-c7dede43…", "uid": "lucas-201109122304", "vid": "lucas-love-201109122304", "actor": "dan"}

Fast for all 3 use cases!
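The key-prefix approach can be sketched without a database. The code below builds the `uid` and `vid` strings shown on the slide and uses an in-memory filter as a stand-in for `find({vid: pattern}).sort({vid: -1})`. The helper names and the sample entries are hypothetical; the point is only that an anchored, case-sensitive prefix pattern such as `/^lucas-love-/` selects a contiguous range of keys, which is the kind of query MongoDB can serve from an index on that field.

```javascript
// Hypothetical helper: escape regex metacharacters in a key part.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build an anchored prefix pattern such as /^lucas-love-/.
function prefixPattern(...parts) {
  return new RegExp("^" + parts.map(escapeRegex).join("-") + "-");
}

// Build the two key fields from the slide for one feed entry.
function feedKeys(username, verb, created) {
  return {
    uid: `${username}-${created}`,
    vid: `${username}-${verb}-${created}`,
  };
}

// In-memory stand-in for find({vid: pattern}).sort({vid: -1}).
function findByVid(docs, pattern) {
  return docs
    .filter((doc) => pattern.test(doc.vid))
    .sort((a, b) => (a.vid < b.vid ? 1 : a.vid > b.vid ? -1 : 0));
}

// Hypothetical sample data.
const docs = [
  { actor: "dan", ...feedKeys("lucas", "love", 201109122304) },
  { actor: "dan", ...feedKeys("lucas", "love", 201109122310) },
  { actor: "dan", ...feedKeys("lucas", "follow", 201109122305) },
  { actor: "lucas", ...feedKeys("dan", "love", 201109122306) },
];

const loves = findByVid(docs, prefixPattern("lucas", "love"));
console.log(loves.map((d) => d.vid));
```

The timestamps are all the same width, so sorting the strings in descending order also sorts the entries newest first.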

Page 7: mongodb + ex.fm

removing indexes pays off

Don’t need to buy more/bigger machines!

Page 8: mongodb + ex.fm

sites! sites! sites!

Page 9: mongodb + ex.fm

padding factor

• Variable document size
• Allocate for the latest and fattest
• Document moves
• Can be very inefficient
• More RAM!
• Pre-allocate to prevent moves
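One common way to pre-allocate, sketched under assumptions: insert the document with a throwaway filler field, then remove the filler so the reserved space is left for later growth. The slides do not say how ex.fm pre-allocates, so the `_pad` field name, the 1024-byte size, and the `withPadding` helper are hypothetical, and the trick only makes sense on a storage engine that updates documents in place.

```javascript
// Hypothetical helper: return a copy of `doc` carrying a throwaway filler
// field, so the first insert reserves roughly `bytes` of extra space.
function withPadding(doc, bytes) {
  return { ...doc, _pad: "x".repeat(bytes) };
}

// Update spec that drops the filler once the document has been stored.
const removePadding = { $unset: { _pad: "" } };

const original = { _id: "site-1", songs: [] };
const padded = withPadding(original, 1024);
console.log(padded._pad.length); // 1024

// With a live connection the two steps would look roughly like:
//   db.site.songs.insert(padded);
//   db.site.songs.update({ _id: padded._id }, removePadding);
```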

Page 10: mongodb + ex.fm

unbounded embedded lists

• Useful for followers, favorites
• Good for a few things, bad for lots
• Constantly bumping up padding factor
• Lots of document moves

Page 11: mongodb + ex.fm

a metaphor

• You run a coffee shop and can buy only one size of cup. Which size do you buy?

• On average, each customer has only one cup

• Heavy drinkers have hundreds of cups

credit: Macintex macintex.deviantart.com

Page 12: mongodb + ex.fm

bucketing!

• Split list across multiple documents
• Median number of items = bucket size
• Pre-allocate
• Easy seeking and traversal
• Much faster
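The bucketing idea can be sketched as a pure function that splits one long list into fixed-size bucket documents. The `toBuckets` and `bucketFor` names, the `items` field, and the `<parentId>-<bucket number>` _id layout are assumptions for illustration, not the schema from the talk.

```javascript
// Split one long list into fixed-size buckets, one document per bucket.
// `parentId` and the "<parentId>-<bucket number>" _id layout are assumptions.
function toBuckets(parentId, items, bucketSize) {
  const buckets = [];
  for (let start = 0; start < items.length; start += bucketSize) {
    const number = start / bucketSize + 1; // bucket numbers start at 1
    buckets.push({
      _id: `${parentId}-${number}`,
      items: items.slice(start, start + bucketSize),
    });
  }
  return buckets;
}

// Seeking: which bucket holds the item at a given zero-based position?
function bucketFor(position, bucketSize) {
  return Math.floor(position / bucketSize) + 1;
}

const buckets = toBuckets("site-2", [11, 12, 13, 14, 15, 16, 17], 3);
console.log(buckets.map((b) => b._id)); // three buckets: site-2-1, site-2-2, site-2-3
console.log(bucketFor(4, 3)); // 2
```

Since every bucket except the last is full, the bucket holding any position can be computed directly instead of scanning, which is what makes seeking and traversal cheap.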

Page 13: mongodb + ex.fm

hey charts!

[Chart: storage for site.meta 1, site.songs 1, site.songs 2, and site.meta 2, shaded as "Allocated and unused" or "Allocated and full of data"]

Page 14: mongodb + ex.fm

same charts when using bucketing

[Chart: storage for site.meta 1, site.songs 1 - 1 and 1 - 2, site.songs 2 - 1 through 2 - 6, and site.meta 2, shaded as "Allocated and unused" or "Allocated and full of data"]

Page 15: mongodb + ex.fm

doesn’t work for everything…

• Picking right bucket size
• Defragging
• Random insertion
  – Easy for things you don’t much care about the order of
  – More difficult if you’re going to insert and change the order later

Page 16: mongodb + ex.fm

micro documents

db.site.songs.find({_id: /^bfc25de08d964a8a41226c6016dd7753-/}).sort({_id:-1})

{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029114", "s" : 18436532 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029113", "s" : 18804590 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029112", "s" : 18804591 }
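A small sketch of how micro documents like these might be produced and selected. It assumes the _id is a 32-character hex key followed by a Unix timestamp and that `s` holds a song id; both readings are inferred from the sample documents rather than stated in the talk, and the `microDoc` and `sitePattern` helpers are hypothetical.

```javascript
// One tiny document per list entry. The _id layout mirrors the slide:
// "<32-character key>-<unix timestamp>"; `s` is assumed to be a song id.
function microDoc(siteKey, timestamp, songId) {
  return { _id: `${siteKey}-${timestamp}`, s: songId };
}

// Anchored prefix pattern selecting every entry that shares one key.
// Skipping regex escaping is safe only because the key is checked to be hex.
function sitePattern(siteKey) {
  if (!/^[0-9a-f]{32}$/.test(siteKey)) {
    throw new Error("expected a 32-character lowercase hex key");
  }
  return new RegExp(`^${siteKey}-`);
}

const siteKey = "bfc25de08d964a8a41226c6016dd7753";
const doc = microDoc(siteKey, 1337029114, 18436532);

console.log(doc._id);
console.log(sitePattern(siteKey).test(doc._id)); // true
```

Each entry is its own small, fixed-shape document, so adding to the list is an insert rather than an update that grows an existing document.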

Page 17: mongodb + ex.fm

paying it back

• Bent mongoengine to make this easy
• Follow github.com/exfm
• Also added tooling for
  – Trace all queries
  – Aggregate tracing by request middleware
  – Raise exceptions when queries miss an index
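The "raise exceptions when queries miss an index" check can be illustrated with a function that inspects an `explain()` result. This is a sketch of the idea in JavaScript, not the mongoengine tooling from the talk. It assumes two output shapes: the older one, where a full scan reports `cursor: "BasicCursor"`, and the newer one, where the winning plan tree contains a `COLLSCAN` stage. The function names are hypothetical.

```javascript
// Walk a query plan tree and report whether any stage is a collection scan.
function hasCollectionScan(stage) {
  if (!stage) return false;
  if (stage.stage === "COLLSCAN") return true;
  const children = [stage.inputStage, ...(stage.inputStages || [])];
  return children.some(hasCollectionScan);
}

// Throw if an explain() result shows that the query missed an index.
// Handles the legacy `cursor` field and the newer `queryPlanner` layout.
function assertUsesIndex(explain) {
  const legacyMiss = explain.cursor === "BasicCursor";
  const plan = explain.queryPlanner && explain.queryPlanner.winningPlan;
  if (legacyMiss || hasCollectionScan(plan)) {
    throw new Error("query missed an index");
  }
}

// Passes silently: the legacy output names a B-tree index cursor.
assertUsesIndex({ cursor: "BtreeCursor vid_1" });

try {
  assertUsesIndex({ cursor: "BasicCursor" });
} catch (err) {
  console.log(err.message); // query missed an index
}
```

Running a check like this in development or test environments turns an unindexed query into an immediate failure instead of a slow surprise in production.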

Page 18: mongodb + ex.fm

thanks!

github.com/[email protected]