Startup Scalability Strategies
Frank Mashraqi
Hello!
Agenda
• Keeping score: What to measure?
• Discerning the difference: What to focus on?
• Finding your path: Which way to go?
• Choosing your architecture: How to partition?
• Walking the line: How to balance?
• Building your team: Who to hire?
• Thinking ahead: What about the future?
• Offloading scalability: Is it for me?
Keeping Score: What to Measure?
client response time
memory utilization
throughput
cache utilization
reads per second
thread thrashing
disk utilization
disk response time
failure rates
resource utilization
IO wait
swap utilization
cache prunes
threshold exceptions
exceeding high or low water marks
cache hit ratio
writes per second
threads created
transactions per second
connections usage
distribution of data
disk saturation
locking statistics
growth rate
queries per shard
threads running
CPU saturation
memory / IO contention
connections per shard
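A few of these metrics are derived from raw counters rather than read directly. The sketch below is one possible way to compute a cache hit ratio, a per-second rate (reads, writes, or transactions), and a high/low water mark check. All names (`CacheCounters`, `cache_hit_ratio`, `rate_per_second`, `check_water_marks`) and the threshold values are illustrative assumptions, not part of the original talk.

```python
from dataclasses import dataclass


@dataclass
class CacheCounters:
    """Raw counters as a cache might expose them (hypothetical shape)."""
    hits: int
    misses: int


def cache_hit_ratio(counters: CacheCounters) -> float:
    """Fraction of lookups served from cache; 0.0 when there were no lookups."""
    total = counters.hits + counters.misses
    return counters.hits / total if total else 0.0


def rate_per_second(count_start: int, count_end: int, elapsed_seconds: float) -> float:
    """Turn two samples of a monotonically increasing counter into a rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_end - count_start) / elapsed_seconds


def check_water_marks(value: float, low: float, high: float) -> str:
    """Classify a measurement against low and high water marks."""
    if value < low:
        return "below_low"
    if value > high:
        return "above_high"
    return "ok"
```

Sampling the same counter twice and dividing by the interval works for reads, writes, or transactions per second alike; the water mark check is what would raise a threshold exception in a monitoring system.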
Performance != High Availability != Scalability
Performance
Ability to process or execute a task compared to time and resources used
High Availability
Ability of a system to ensure a certain degree of operational continuity
Scalability
Ability to handle growing amounts of traffic gracefully, or the ability to be readily enlarged
Pick any two!!
• Consistency
  – Choose if you don’t care about 24/7 availability or accommodating high traffic
• Availability
  – Choose for a site that must be available 24/7
• Partition-Tolerance
  – Choose for a high-traffic website
Vertical or Horizontal?
Vertical
• Aka scaling up
• Adding resources to a node
  – Getting a bigger server
  – Using faster CPUs
• A server that is twice as fast can cost more than twice as much
Horizontal
• Aka scaling out
• Adding more nodes
• Cost efficient
  – Commodity hardware
  – Increased management complexity
• “More complex” programming model
  – Right foundation
• Throughput and latency between nodes
How to Partition?
• Functional partitioning
• Key based partitioning
  – (users ending in 01 go to server 1)
• Range based partitioning
  – (records ranging from 2M to 4M go to server 8)
• Directory server based partitioning
  – (no pre-defined partitioning scheme, instead a lookup is required)
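The key based, range based, and directory based schemes above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production router: the shard count of 100, the server names, the range boundaries other than the 2M to 4M example, and every function and class name here are hypothetical.

```python
import bisect
from typing import Dict, List


def key_based_shard(user_id: int, num_shards: int = 100) -> int:
    """Key based: the trailing digits of the id pick the server.

    With 100 shards (an assumption), ids ending in 01 map to server 1.
    """
    return user_id % num_shards


# Range based: each entry is the exclusive upper bound of a range.
# Only the 2M to 4M -> server 8 mapping comes from the slide; the
# neighbouring ranges are made up for illustration.
RANGE_UPPER_BOUNDS: List[int] = [2_000_000, 4_000_000, 6_000_000]
RANGE_SERVERS: List[str] = ["server-7", "server-8", "server-9"]


def range_based_shard(record_id: int) -> str:
    """Range based: find the first range whose upper bound exceeds the id."""
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, record_id)
    if index >= len(RANGE_SERVERS):
        raise ValueError("record_id is outside every configured range")
    return RANGE_SERVERS[index]


class DirectoryPartitioner:
    """Directory based: no fixed scheme, every key is looked up explicitly."""

    def __init__(self) -> None:
        self._directory: Dict[str, str] = {}

    def assign(self, key: str, server: str) -> None:
        self._directory[key] = server

    def lookup(self, key: str) -> str:
        # Raises KeyError for keys that were never assigned.
        return self._directory[key]
```

The trade-off shows up in the code: the key and range schemes need no extra lookup but bake the layout into a formula or table, while the directory costs one lookup per request and in return lets any key move to any server.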
How to balance?
• Balance is easier if the foundation is right
• Use agile methodologies
• Technical debt is expensive
• Technical mortgage is a KILLER!
Before and After
• What to do before you get big?
  – Lay the right foundation
  – Ability to shard / partition
  – Decouple components
  – Effectively cache
  – Have a plan in place
• What can wait until after things start to grow?
  – Now focus on micro optimizations
  – Acquiring and upgrading hardware
  – Performance optimizations and OS tuning
  – Implementing High Availability and Disaster Recovery
  – Use a CDN
What skills / hires are most crucial to dealing with scalability?
Best Practices
• Go Asynchronous
• Go Stateless
• “Best IO is No IO”
  – Cache effectively using shared cache and monitor utilization
• Decouple as much as possible
• Build using APIs
  – Easy to scale development and deployment and open up your service
• Virtualize/Abstract everything
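To make the "cache effectively and monitor utilization" point concrete, here is a rough read-through cache that counts hits and misses and reports how full it is. It is a sketch only: an in-process dict stands in for a real shared cache, the eviction policy (drop the oldest inserted entry) is a simplification, and `MonitoredCache`, `loader`, and `max_entries` are hypothetical names.

```python
from typing import Any, Callable, Dict, Hashable


class MonitoredCache:
    """Read-through cache with hit/miss counters (in-process stand-in)."""

    def __init__(self, loader: Callable[[Hashable], Any], max_entries: int = 1024) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._loader = loader            # the slow path, e.g. a database read
        self._max_entries = max_entries
        self._store: Dict[Hashable, Any] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        if key in self._store:
            self.hits += 1               # served without touching the backend
            return self._store[key]
        self.misses += 1
        value = self._loader(key)
        if len(self._store) >= self._max_entries:
            # Evict the oldest inserted entry; dicts keep insertion order.
            self._store.pop(next(iter(self._store)))
        self._store[key] = value
        return value

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def utilization(self) -> float:
        """Fraction of the configured capacity currently in use."""
        return len(self._store) / self._max_entries
```

Every hit is a backend read that never happened, which is the "Best IO is No IO" idea; watching `hit_ratio()` and `utilization()` over time indicates whether the cache is sized sensibly.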
How to blow up?
Can scalability be outsourced? aka Can Cloud fix Twitter?
• Amazon
• Google AppEngine
• Rackspace
• AppNexus
• 10Gen
• Other providers?
Things to take away
• Focus on scalability, the rest will follow
• Horizontal is better, Vertical is costly
• Go Asynchronous
• Architect so you don’t have to re-architect
• Choose two out of Consistency, Availability and Partition-Tolerance
• Measure utilization first, then performance
• Choose the right infrastructure & invest in the right skills
Appendix
• Notes/Tips: http://mashraqi.com/2008/09/startonomics-startup-scalability.html
• Personal blog: http://mashraqi.com
• Twitter: http://twitter.com/mashraqi
• MySQL Blog: http://mysqldatabaseadministration.blogspot.com
• Email: [email protected]