Just-In-Time Scalability: Agile Methods to Support Massive Growth

Eric Ries and Chris Hondl's MySQL conference presentation on Just-In-Time Scalability. http://startuplessonslearned.blogspot.com/2008/09/just-in-time-scalability.html

Page 1: Just In Time Scalability  Agile Methods To Support Massive Growth Presentation

Just-In-Time Scalability: Agile Methods to Support Massive Growth

What is IMVU?

 

Behind the scenes...

IMVU is LAMP, plus...

• Perlbal • Memcached • Solr • MogileFS • plus...

• BuildBot • eAccelerator • Linux (Debian) • memcached • Nagios • Perl • Roundup • rrd • Subversion

• ADODB • b2evolution • Coppermine • feed2js • FreeTag • Incutio XML-RPC • jrcache • JSON-PHP • Magpie • osCommerce • phpBB • Phorum • SimpleTest • Selenium

• Audiere • Boost • Cal3D • CFL • NSIS • Pixomatic • Python • pywin32 • SCons • wxPython

Before and After Architecture

Before

We started with a small site, a mess of open source, and a small team that didn't know much about scaling. 

After

We ended with a large site, a medium-sized team, and an architecture that has scaled.

We never stopped. We used a roadmap and a compass, made weekly changes in direction, regularly shipped code on Wednesday to handle the next weekend's capacity constraints, and shipped new features the whole time.  

Before and After Architecture (1/4)

November

Before and After Architecture (2/4)

December

Before and After Architecture (3/4)

February

Before and After Architecture (4/4)

May

Advanced planning vs. fast response

“Driving”

• Continuously figure out what is going to go wrong soon

• Quickly fix it, without breaking something else

• Get feedback along the way

“Rocket ship”

• Figure out in advance what is going to go wrong

• Build a plan that prevents those things from happening

• Execute your plan

• Get feedback when done

Questions to ask

“Driving”

• How do you know you will be able to fix the problem in time?

• How can you be sure you won't cause collateral damage?

• How can you be sure you won't code yourself into a corner?

“Rocket ship”

• Are you sure you know what is going to happen?

• Are you sure you can execute?

• Can you afford it?

• Do you need feedback?

Continuous Ship

• Deploy new software quickly
  o At IMVU, time from check-in to production = 20 minutes

• Tell a good change from a bad change (quickly)

• Revert a bad change quickly

• Work in small batches
  o At IMVU, a large batch = 3 days' worth of work

• Break large projects down into small batches

• Don't have the same problem twice – fix the root cause of each class of problems

IMVU pushes code to production 20-30 times every day

Cluster Immune System

What it looks like to ship one piece of code to production:

• Run tests locally (SimpleTest, Selenium)
  o Everyone has a complete sandbox

• Continuous Integration Server (BuildBot)
  o All tests must pass or “shut down the line”
  o Automatic feedback if the team is going too fast

• Incremental deploy
  o Monitor cluster and business metrics in real-time
  o Reject changes that move metrics out-of-bounds

• Alerting & Predictive monitoring (Nagios)
  o Monitor all metrics that stakeholders care about
  o If any metric goes out-of-bounds, wake somebody up
  o Use historical trends to predict acceptable bounds

When customers see a failure:
  o Fix the problem for customers
  o Improve your defenses at each level
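The "incremental deploy" and "predictive monitoring" bullets can be illustrated with a minimal sketch. It is written in Python for brevity and is not IMVU's actual tooling. The function names, the mean plus or minus three standard deviations rule, and the rollout fractions are all assumptions made for illustration; the slides only say that bounds come from historical trends and that out-of-bounds changes are rejected.

```python
from statistics import mean, stdev

def acceptable_bounds(history, tolerance=3.0):
    """Derive (low, high) bounds for one metric from its past samples.

    Assumption: bounds are mean +/- tolerance * sample standard deviation.
    """
    center = mean(history)
    spread = stdev(history)
    return center - tolerance * spread, center + tolerance * spread

def out_of_bounds(history, current, tolerance=3.0):
    """Return the sorted names of metrics whose current value is outside bounds."""
    bad = []
    for name, samples in history.items():
        low, high = acceptable_bounds(samples, tolerance)
        if not low <= current[name] <= high:
            bad.append(name)
    return sorted(bad)

def incremental_deploy(history, sample_after_stage, stages=(0.1, 0.5, 1.0)):
    """Roll out stage by stage and stop as soon as any metric leaves its bounds.

    sample_after_stage(fraction) is a hypothetical callback that returns the
    current metric values once `fraction` of the cluster runs the new code.
    """
    for fraction in stages:
        violations = out_of_bounds(history, sample_after_stage(fraction))
        if violations:
            return ("reverted", fraction, violations)
    return ("deployed", 1.0, [])
```

A caller would feed it both cluster metrics (error rate, load) and business metrics (sign-ups, purchases), since the slide asks for monitoring of everything stakeholders care about.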

Case Study: Sharding

Problem: Spread write queries across multiple databases

Solution:

• Intercept and redirect queries based on SQL comments

• Move one table or sub-system at a time
  o Our experience was that one engineer horizontally partitions one table or small sub-system in one week

• New engineers figure this out in about 5 minutes

Before:

db_query("INSERT INTO inventory (customers_id, products_id) VALUES ($customer_id, $product_id)");

After:

db_query("/*shard customer://$customer_id */ INSERT INTO inventory (customers_id, products_id) VALUES ($customer_id, $product_id)");

• Learning: cross-shard joins & transactions aren’t required
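How a query layer might act on the shard comment can be sketched as follows. This is a Python illustration, not the PHP code behind db_query. The regular expression, the function names, the "master" default, and the modulo mapping from customer id to shard are all assumptions; the slides do not say how IMVU mapped ids to databases.

```python
import re

# Matches a hint such as: /*shard customer://42 */
SHARD_COMMENT = re.compile(r"/\*\s*shard\s+(\w+)://(\d+)\s*\*/")

def parse_shard_hint(sql):
    """Return (entity, id) from a shard comment, or None if there is none."""
    match = SHARD_COMMENT.search(sql)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

def pick_database(sql, shards, default="master"):
    """Choose which database should receive the query.

    Queries without a hint keep going to the default database, which is
    what lets tables be moved one at a time.
    Assumption: ids are spread across shards by simple modulo.
    """
    hint = parse_shard_hint(sql)
    if hint is None:
        return default
    _entity, entity_id = hint
    return shards[entity_id % len(shards)]
```

Because the hint travels inside the SQL string, call sites change by one comment and unannotated queries behave exactly as before.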

Case Study: Caching

Problem: Cache frequently read data to memcached

Solution:

• Intercept and cache queries based on SQL comments

db_query_cache(BUDDY_CACHE_TIME, "/*shard customer://$customer_id */ /*cache-class customer://$customer_id/buddies */ SELECT friend_id, buddy_order FROM customers_friends WHERE customers_id=$customer_id");

-----------------

db_query("/*shard customer://$customer_id */ DELETE FROM customers_friends WHERE customers_id = $customer_id AND friend_id = $friend_id");
db_flush_cacheclass("customer://$customer_id/buddies");

• Learning: Flushing the cache is critical to users and performance
  – When a customer spends $24.95, they want the benefits immediately

• Learning: Test the cache behavior for critical systems
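The cache-class idea, where reads are cached under a class named in the SQL comment and writes flush that class, can be sketched like this. It is a Python illustration with an in-memory dict standing in for memcached. The QueryCache class, its method names, and the injected run_query callable are hypothetical; they mirror db_query_cache and db_flush_cacheclass only loosely.

```python
import re
import time

# Matches a hint such as: /*cache-class customer://7/buddies */
CACHE_CLASS = re.compile(r"/\*\s*cache-class\s+(\S+)\s*\*/")

class QueryCache:
    """In-memory stand-in for memcached, grouped by cache class."""

    def __init__(self, run_query, clock=time.monotonic):
        self.run_query = run_query  # callable that really hits the database
        self.clock = clock
        self.entries = {}           # cache class -> {sql: (expires_at, rows)}

    def query_cache(self, ttl, sql):
        """Return cached rows for sql, querying the database on a miss."""
        match = CACHE_CLASS.search(sql)
        if match is None:
            return self.run_query(sql)  # no hint: never cached
        bucket = self.entries.setdefault(match.group(1), {})
        hit = bucket.get(sql)
        if hit is not None and hit[0] > self.clock():
            return hit[1]
        rows = self.run_query(sql)
        bucket[sql] = (self.clock() + ttl, rows)
        return rows

    def flush_cacheclass(self, cache_class):
        """Drop every cached query in the class, e.g. after a write."""
        self.entries.pop(cache_class, None)
```

Grouping entries by class means a write only needs to know the class name, not every SELECT that might have cached the affected rows. A test for a critical path would assert that a read after the flush reaches the database again.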

Case Study: Steering Data Design

Problem: Improve database schemas and data design to meet scalability requirements without downtime

Solution:

• Measure to find the real problems (harder than it sounds)

• Migrate to new design that takes advantage of sharding and/or caching

Case Study: Steering Data Design

Problem: You can’t bulk move large frequently accessed data

Solution:

• Copy on read
  – Use when you are read bound
  – Reads check cache, new location, and copy to new location if missing
  – Writes go to new location if data has been migrated, otherwise old

• Copy on write
  – Use when you are write bound
  – Reads check cache, new location, then old location
  – Writes go to new location, copying to new location if missing

• Copy all
  – Use when file system fills up
  – Reads & writes go to new location, falling back to old location if missing
  – Cron copies data a few records at a time
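The copy-on-read strategy above can be sketched as follows. It is a Python illustration in which plain dicts stand in for the old location, the new location, and memcached. The CopyOnReadStore class is hypothetical, and leaving the old copy in place after migration is an assumption, since the slides do not say when old data is removed.

```python
class CopyOnReadStore:
    """Migrates records from `old` to `new` lazily, as they are read."""

    def __init__(self, old, new):
        self.old = old    # legacy location, e.g. the original table
        self.new = new    # destination, e.g. the sharded table
        self.cache = {}   # stand-in for memcached

    def read(self, key):
        """Check cache, then new location; copy from old if missing."""
        if key in self.cache:
            return self.cache[key]
        if key not in self.new:
            if key not in self.old:
                return None
            self.new[key] = self.old[key]  # migrate on first read
        value = self.new[key]
        self.cache[key] = value
        return value

    def write(self, key, value):
        """Write to new location if the record has migrated, otherwise old."""
        target = self.new if key in self.new else self.old
        target[key] = value
        self.cache.pop(key, None)  # flush so the next read sees the write
```

Only records that are actually read get moved, so the migration cost is spread over normal traffic instead of one bulk copy.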

“Thank You for Listening!”