Martin Tepper of Travel IQ presents at Mongo Berlin
MongoDB Conference Berlin 2011
MongoDB as a queryable cache
MongoDB as a queryable cache · Martin Tepper, monogreen.de · 2011-03-25
About me
• Martin Tepper
• Lead Developer at Travel IQ
• http://monogreen.de
Contents
• About Travel IQ
• The problem
• The solution
• The headaches
About Travel IQ
• Meta Search Engine for Flights and Hotels
• 9 Hotel Providers
• 21 Flight Providers
• ~ 6000 searches per day
• ~ 64k provider queries per day
About Travel IQ
• Real-Time Aggregation
• Ruby/Rails based
• API-Driven
Quick aside
• Ruby: object-oriented scripting language
• Rails: MVC Web application framework
• ActiveRecord: ORM framework
The Problem
Basic Architecture
Strongly Normalized
• Very organized
• Reuse of models
• Saves disk space
• But …
sql = <<-SQL
  SELECT MIN(outerei.id)
  FROM (
    SELECT
      OBJ1.starts_at AS OBJ1_starts_at,
      OBJ1.ends_at AS OBJ1_ends_at,
      OBJ1.origin_id AS OBJ1_origin_id,
      OBJ1.destination_id AS OBJ1_destination_id,
      MIN(P1.price) AS the_price
    FROM packages P1
    LEFT JOIN journeys OBJ1 ON (P1.outbound_journey_id = OBJ1.id)
    LEFT JOIN results R1 ON (R1.package_id = P1.id)
    LEFT JOIN packagings PA1a ON (PA1a.package_id = P1.id AND PA1a.position = 1)
    LEFT JOIN offers O1a ON (PA1a.offer_id = O1a.id)
    WHERE R1.search_id IN (#{search_id})
      AND R1.search_type = 'FlightSearch'
      AND O1a.expires_at > #{expiring_after}
    GROUP BY OBJ1.starts_at, OBJ1.ends_at, OBJ1.origin_id, OBJ1.destination_id
  ) AS innerei
  JOIN (
    SELECT
      P2.id,
      OBJ2.starts_at AS OBJ2_starts_at,
      OBJ2.ends_at AS OBJ2_ends_at,
      OBJ2.origin_id AS OBJ2_origin_id,
      OBJ2.destination_id AS OBJ2_destination_id,
      P2.price
    FROM packages P2
    LEFT JOIN results R2 ON (R2.package_id = P2.id)
    LEFT JOIN journeys OBJ2 ON (P2.outbound_journey_id = OBJ2.id)
    LEFT JOIN packagings PA2a ON (PA2a.package_id = P2.id AND PA2a.position = 1)
    LEFT JOIN offers O2a ON (PA2a.offer_id = O2a.id)
    WHERE R2.search_id IN (#{search_id})
      AND R2.search_type = 'FlightSearch'
      AND O2a.expires_at > #{expiring_after}
  ) AS outerei ON (
    OBJ1_starts_at = OBJ2_starts_at
    AND OBJ1_ends_at = OBJ2_ends_at
    AND OBJ1_origin_id = OBJ2_origin_id
    AND OBJ1_destination_id = OBJ2_destination_id
    AND outerei.price = the_price
  )
  GROUP BY OBJ1_starts_at, OBJ1_ends_at, OBJ1_destination_id, OBJ1_origin_id
SQL
The problem
• Strongly normalized database
• Complex query requirements
• Lots of joins
• ActiveRecord and rendering overhead
• Slow API calls
The Solution
Solution 1: Schema
• Redo the schema
• Migration hard
• Some relationships hard to denormalize
Solution 2: Memcached
• Memcached
• Very fast response times
• But no real queries
→ Horrible abstraction layer
[Chart: Memcached response times over time; x-axis: seconds after search start (1 to 45); y-axis: response time of API call in seconds (0 to 10)]
Solution 3: MongoDB
• Document-oriented – less render overhead
• Grouping of offers
• Proper queries and counts
• Still quite fast
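The bullets above can be illustrated with a rough sketch of what a denormalized result document and a query against it might look like. The document shape, the field names, and the sample values are all assumptions made for illustration; the talk does not show the real schema. The filter uses MongoDB's query syntax, but it is evaluated here by a small in-memory stand-in so the sketch runs without a server.

```ruby
# Hypothetical denormalized "package" documents: the journey and the
# grouped offers are embedded, so reading one needs no joins.
# All field names and values are made up for this sketch.
PACKAGES = [
  { "search_id" => 1, "price" => 199.0,
    "outbound_journey" => { "origin" => "TXL", "destination" => "LHR" },
    "offers" => [{ "provider" => "A", "price" => 199.0 }] },
  { "search_id" => 1, "price" => 349.0,
    "outbound_journey" => { "origin" => "TXL", "destination" => "LHR" },
    "offers" => [{ "provider" => "B", "price" => 349.0 }] },
  { "search_id" => 2, "price" => 99.0,
    "outbound_journey" => { "origin" => "TXL", "destination" => "CDG" },
    "offers" => [{ "provider" => "A", "price" => 99.0 }] }
].freeze

# Filter written in MongoDB's query syntax ($lte is a standard operator).
FILTER = { "search_id" => 1, "price" => { "$lte" => 300 } }.freeze

# In-memory stand-in for the server-side matcher. It supports only plain
# equality plus the $lte and $gt operators, which is enough for this sketch.
def matches?(doc, filter)
  filter.all? do |field, condition|
    value = doc[field]
    if condition.is_a?(Hash)
      condition.all? do |op, operand|
        case op
        when "$lte" then !value.nil? && value <= operand
        when "$gt"  then !value.nil? && value > operand
        else raise ArgumentError, "unsupported operator #{op}"
        end
      end
    else
      value == condition
    end
  end
end

# "Proper queries": return the matching documents.
def find_packages(docs, filter)
  docs.select { |doc| matches?(doc, filter) }
end

# "Proper counts": count matches without building the result list.
def count_packages(docs, filter)
  docs.count { |doc| matches?(doc, filter) }
end
```

With a key-value cache, the same question (all packages of a search under a given price) would have to be answered by fetching everything and filtering in the application.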
How we use MongoDB
• Replica set with 2 nodes and 2 arbiters
• Two servers with 16 cores / 64GB RAM
→ run MySQL and MongoDB
• ~ 600 writes/s and reads/s normal load
• ~ 6000 writes/s doable
[Chart: MongoDB response times over time; x-axis: seconds after search start (1 to 45); y-axis: response time of API call in seconds (0 to 10)]
The Headaches
Problems with MongoDB
• Segmentation Faults
• Only in production
→ Replica Set helped a lot
→ Fixed with nightly build
Problems with MongoDB
• Write performance during peak load
• Lots of small concurrent writes
→ Solved by bundling writes
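The talk does not say how the writes were bundled, so the following is only one plausible shape for it: a small buffer that collects documents and hands them over in batches. The `WriteBuffer` and `RecordingSink` classes and the `insert_many` interface on the sink are assumptions for this sketch; the sink stands in for a MongoDB collection so the example runs without a server.

```ruby
# Buffers documents and hands them to the sink in batches instead of
# issuing one write per document. `sink` is any object that responds to
# insert_many(array_of_documents); that interface is an assumption here.
class WriteBuffer
  def initialize(sink, batch_size: 100)
    @sink = sink
    @batch_size = batch_size
    @pending = []
    @lock = Mutex.new
  end

  # Queue one document; write the whole batch once it is full.
  def add(document)
    batch = nil
    @lock.synchronize do
      @pending << document
      batch = @pending.slice!(0, @pending.size) if @pending.size >= @batch_size
    end
    @sink.insert_many(batch) if batch
  end

  # Write whatever is still buffered, for example at the end of a search.
  def flush
    batch = @lock.synchronize { @pending.slice!(0, @pending.size) }
    @sink.insert_many(batch) unless batch.empty?
  end
end

# Stand-in sink that records every batch it receives.
class RecordingSink
  attr_reader :batches

  def initialize
    @batches = []
  end

  def insert_many(documents)
    @batches << documents
  end
end

sink = RecordingSink.new
buffer = WriteBuffer.new(sink, batch_size: 3)
7.times { |i| buffer.add({ "offer_id" => i }) }
buffer.flush
puts sink.batches.map(&:size).inspect
```

The trade-off is latency against write volume: a larger batch size means fewer round trips, but results reach the cache slightly later.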
Problems with MongoDB
• Hotel data too big to denormalize
• In separate collection
→ Solved with app-level “join”
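An app-level join usually means two lookups and a merge in application code. The sketch below shows that pattern under assumptions: the collection contents, the field names (`hotel_id`, `_id`, `name`), and the sample values are made up, and `fetch_hotels` stands in for a single query with an `$in` filter against the separate hotel collection.

```ruby
# Hypothetical data: offers are cached in one collection, the bulky
# hotel records in another, linked by hotel_id. Values are made up.
OFFERS = [
  { "hotel_id" => 10, "price" => 80 },
  { "hotel_id" => 11, "price" => 120 },
  { "hotel_id" => 10, "price" => 95 }
].freeze

HOTELS = [
  { "_id" => 10, "name" => "Hotel Adler" },
  { "_id" => 11, "name" => "Pension Berg" },
  { "_id" => 12, "name" => "Unused Inn" }
].freeze

# Stand-in for one query with an $in filter on _id against the hotel
# collection; here it just filters the in-memory array.
def fetch_hotels(ids)
  HOTELS.select { |hotel| ids.include?(hotel["_id"]) }
end

# App-level "join": one lookup for all referenced hotels, then the
# merge happens in Ruby instead of in the database.
def join_offers_with_hotels(offers)
  ids = offers.map { |offer| offer["hotel_id"] }.uniq
  hotels_by_id = fetch_hotels(ids).map { |hotel| [hotel["_id"], hotel] }.to_h
  offers.map { |offer| offer.merge("hotel" => hotels_by_id[offer["hotel_id"]]) }
end
```

Collecting the ids first keeps this at two queries per request, rather than one hotel lookup per offer.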
Problems with MongoDB
• Data consistency
• Typical caching problem
• Updates to MySQL also in MongoDB
→ Solved with callbacks in ActiveRecord
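The callback approach can be sketched as follows. To keep the example runnable without Rails, it uses a plain-Ruby stand-in for the model: in a real ActiveRecord model the two hook methods would be registered with `after_save` and `after_destroy` instead of being called by hand. The `CacheStore` class, the `Offer` model, and its fields are assumptions for illustration, not the code from the talk.

```ruby
# Stand-in for the MongoDB cache: documents keyed by the MySQL row id.
class CacheStore
  attr_reader :documents

  def initialize
    @documents = {}
  end

  def upsert(id, document)
    @documents[id] = document
  end

  def delete(id)
    @documents.delete(id)
  end
end

# Plain-Ruby stand-in for an ActiveRecord model. In Rails the hooks
# below would be declared as `after_save :sync_to_cache` and
# `after_destroy :remove_from_cache`.
class Offer
  attr_reader :id
  attr_accessor :price

  def initialize(id, price, cache)
    @id = id
    @price = price
    @cache = cache
  end

  def save
    # ... the write to MySQL would happen here ...
    sync_to_cache
    true
  end

  def destroy
    # ... the delete from MySQL would happen here ...
    remove_from_cache
    true
  end

  private

  # Mirror the current state of the row into the cache.
  def sync_to_cache
    @cache.upsert(id, { "price" => price })
  end

  # Drop the cached copy once the row is gone.
  def remove_from_cache
    @cache.delete(id)
  end
end
```

Tying the sync to model callbacks means every code path that saves through the model also updates the cache; writes that bypass the model, such as raw SQL, would still need separate handling.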
Thank you