blaine
Scaling Twitter - Slides for a talk presented at the SDForum Silicon Valley Ruby Conference 2007 on Twitter's challenges scaling Rails.
Big Bird. (scaling twitter)
Rails Scales. (but not out of the box)
First, Some Facts
• 600 requests per second. Growing fast.
• 180 Rails Instances (Mongrel). Growing fast.
• 1 Database Server (MySQL) + 1 Slave.
• 30-odd Processes for Misc. Jobs
• 8 Sun X4100s
• Many users, many updates.
[Chart: Joy and Pain by month, October through April]
IM IN UR RAILZ
MAKIN EM GO FAST
1. Realize Your Site is Slow
2. Optimize the Database
3. Cache the Hell out of Everything
4. Scale Messaging
5. Deal With Abuse
6. Profit
It’s Easy, Really.
{ Part the First }
The More You Know
We Failed at This.
Don’t Be Like Us
• Munin
• Nagios
• AWStats & Google Analytics
• Exception Notifier / Exception Logger
• Immediately add reporting to track problems.
Test Everything
• Start Before You Start
• No Need To Be Fancy
• Tests Will Save Your Life
• Agile Becomes Important When Your Site Is Down
Benchmarks? Let your users do it.
<!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:00 in 409 ms (d 88 / r 307). thank you, come again. -->
<!-- served to you through a copper wire by raven.twitter.com at 22 Apr 15:01 in 450 ms (d 96 / r 337). thank you, come again. -->
<!-- served to you through a copper wire by quetzal at 22 Apr 15:01 in 384 ms (d 70 / r 297). thank you, come again. -->
<!-- served to you through a copper wire by sampaati at 22 Apr 15:02 in 343 ms (d 102 / r 217). thank you, come again. -->
<!-- served to you through a copper wire by kolea.twitter.com at 22 Apr 15:02 in 235 ms (d 87 / r 130). thank you, come again. -->
<!-- served to you through a copper wire by firebird at 22 Apr 15:03 in 2094 ms (d 643 / r 1445). thank you, come again. -->
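Footers like these can be mined for per-host timings. A minimal sketch, assuming `d` is database time and `r` is render time (the slides do not say so); `FOOTER_PATTERN`, `parse_footer`, and `slowest` are hypothetical names, not part of the talk:

```ruby
# Hypothetical pattern for the timing footer format shown in the slides.
FOOTER_PATTERN = %r{by (?<host>\S+) at (?<at>.+?) in (?<total>\d+) ms \(d (?<db>\d+) / r (?<render>\d+)\)}

# Parse one footer comment into a hash, or return nil if it does not match.
def parse_footer(line)
  m = FOOTER_PATTERN.match(line)
  return nil unless m
  {
    host: m[:host],
    at: m[:at],
    total_ms: m[:total].to_i,
    db_ms: m[:db].to_i,         # assumption: "d" means database time
    render_ms: m[:render].to_i  # assumption: "r" means render time
  }
end

# Return the parsed footer with the largest total time, skipping non-matches.
def slowest(lines)
  lines.map { |line| parse_footer(line) }.compact.max_by { |f| f[:total_ms] }
end
```

Fed the comments collected from real page views, `slowest` would single out the request served by firebird.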
The Database { Part the Second }
“The Next Application I Build is Going to Be Easily Partitionable” - S. Butterfield
Too Late.
Index Everything
class AddIndex < ActiveRecord::Migration
  def self.up
    add_index :users, :email
  end

  def self.down
    remove_index :users, :email
  end
end
Repeat for any column that appears in a WHERE clause
Rails won’t do this for you.
Denormalize A Lot
class DenormalizeFriendsIds < ActiveRecord::Migration
  def self.up
    add_column "users", "friends_ids", :text
  end

  def self.down
    remove_column "users", "friends_ids"
  end
end
class Friendship < ActiveRecord::Base
  belongs_to :user
  belongs_to :friend

  after_create :add_to_denormalized_friends
  after_destroy :remove_from_denormalized_friends

  def add_to_denormalized_friends
    user.friends_ids << friend.id
    user.friends_ids.uniq!
    user.save_without_validation
  end

  def remove_from_denormalized_friends
    user.friends_ids.delete(friend.id)
    user.save_without_validation
  end
end
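The callbacks treat `friends_ids` as an array, while the migration stores it in a `:text` column, so the list has to be serialized somewhere. A plain-Ruby sketch of that round trip, assuming YAML as the encoding; `UserRow` is a hypothetical stand-in for the model, not code from the talk:

```ruby
require "yaml"

# Hypothetical stand-in for a users row whose friends_ids column is plain text.
class UserRow
  attr_reader :friends_ids_text

  def initialize
    @friends_ids_text = [].to_yaml
  end

  # Decode the text column into an array of integer ids.
  def friends_ids
    YAML.load(@friends_ids_text)
  end

  # Encode the array back into the text column.
  def friends_ids=(ids)
    @friends_ids_text = ids.to_yaml
  end

  # Add an id once, however many times it is requested.
  def add_friend(id)
    self.friends_ids = (friends_ids + [id]).uniq
  end

  def remove_friend(id)
    self.friends_ids = friends_ids - [id]
  end
end
```

In a Rails model, declaring `serialize :friends_ids` is one way to get comparable behavior; whether the talk's code did so is not shown.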
Don’t be Stupid
bob.friends.map(&:email)
Status.count()
"email like '%#{search}%'"
That’s where we are.
Seriously. If your Rails application is doing anything more complex than that, you’re doing something wrong*.
* or you observed the First Rule of Butterfield.
Partitioning Comes Later. (we’ll let you know how it goes)
The Cache { Part the Third }
MemCache
MemCache
MemCache!
class Status < ActiveRecord::Base
  class << self
    def count_with_memcache(*args)
      # Only the unconditional count is cached; pass any other call through.
      return count_without_memcache(*args) unless args.empty?
      count = CACHE.get("status_count")
      if count.nil?
        count = count_without_memcache
        CACHE.set("status_count", count)
      end
      count
    end
    alias_method_chain :count, :memcache
  end

  after_create :increment_memcache_count
  after_destroy :decrement_memcache_count
  ...
end
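The read-through pattern can be tried without Rails or a memcached server. A minimal sketch, assuming an in-memory stand-in for the cache; `FakeCache` and `CachedCount` are hypothetical names, and the increment and decrement logic is one plausible reading of the callbacks the slide elides:

```ruby
# Hypothetical in-memory stand-in for a memcache client (get/set only).
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Read-through counter: compute the slow count once, then adjust it in place.
class CachedCount
  def initialize(cache, key, &slow_count)
    @cache = cache
    @key = key
    @slow_count = slow_count
  end

  # Return the cached count, computing and storing it on a miss.
  def value
    count = @cache.get(@key)
    if count.nil?
      count = @slow_count.call
      @cache.set(@key, count)
    end
    count
  end

  def increment
    adjust(1)
  end

  def decrement
    adjust(-1)
  end

  private

  # Only adjust a value that is already cached; on a miss the next read recomputes it.
  def adjust(delta)
    count = @cache.get(@key)
    @cache.set(@key, count + delta) unless count.nil?
  end
end
```

The get-then-set in `adjust` is not atomic, so concurrent writers could lose an update; a real deployment would likely want the cache's own atomic increment.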
class User < ActiveRecord::Base
  def friends_statuses
    ids = CACHE.get("friends_statuses:#{id}")
    Status.find(:all, :conditions => ["id IN (?)", ids])
  end
end
class Status < ActiveRecord::Base
  after_create :update_caches

  def update_caches
    user.friends_ids.each do |friend_id|
      ids = CACHE.get("friends_statuses:#{friend_id}")
      ids.pop
      ids.unshift(id)
      CACHE.set("friends_statuses:#{friend_id}", ids)
    end
  end
end
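The fan-out on write can be sketched in plain Ruby. This is a variation on the slide's code, not a copy of it: a Hash stands in for `CACHE`, cache misses are skipped instead of raising, and the list is trimmed to a cap instead of popped. `TIMELINE_LIMIT` and `push_status` are hypothetical names:

```ruby
TIMELINE_LIMIT = 20 # hypothetical cap on cached ids per timeline

# Prepend status_id to each friend's cached id list, trimming to the cap.
# A plain Hash stands in for the memcache client here.
def push_status(cache, friend_ids, status_id, limit = TIMELINE_LIMIT)
  friend_ids.each do |friend_id|
    key = "friends_statuses:#{friend_id}"
    ids = cache[key]
    next if ids.nil? # cache miss: leave it for a later read to rebuild
    cache[key] = [status_id, *ids].first(limit)
  end
end
```

Each new status costs one cache write per friend, which is the trade made to keep timeline reads to a single cache lookup plus one `IN` query.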
ActiveRecord
The Future
90% API Requests. Cache Them!
“There are only two hard things in CS: cache invalidation and naming things.”
– Phil Karlton, via Tim Bray
Messaging { Part the Fourth }
You Already Knew All That Other Stuff, Right?
[Diagram: three Producers, one Message Queue, three Consumers]
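The producer, queue, and consumer arrangement can be sketched inside one process with Ruby's thread-safe `Queue`. This only illustrates the shape; the talk's queue runs between processes. `run_pipeline` and the `:done` sentinel are hypothetical:

```ruby
require "thread"

# Run producers and consumers against one shared queue and return
# every processed message (order is not guaranteed).
def run_pipeline(producer_count, consumer_count, per_producer)
  queue = Queue.new
  results = Queue.new

  producers = Array.new(producer_count) do |p|
    Thread.new do
      per_producer.times { |i| queue << "p#{p}-m#{i}" }
    end
  end

  consumers = Array.new(consumer_count) do
    Thread.new do
      # Each consumer stops when it pops its own :done sentinel.
      while (msg = queue.pop) != :done
        results << msg.upcase
      end
    end
  end

  producers.each(&:join)
  consumer_count.times { queue << :done }
  consumers.each(&:join)

  Array.new(results.size) { results.pop }
end
```

Producers never wait on consumers here; they only wait on the queue, which is the decoupling the diagram is after.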
DRb
• The Good:
  • Stupid Easy
  • Reasonably Fast
• The Bad:
  • Kinda Flaky
  • Zero Redundancy
  • Tightly Coupled
[Diagram: Jabber Client (DRb); Presence, Incoming Messages, Outgoing Messages; ejabberd; MySQL]
Server
  DRb.start_service 'druby://localhost:10000', myobject
Client
  myobject = DRbObject.new_with_uri('druby://localhost:10000')
Rinda
• Shared Queue (TupleSpace)
• Built with DRb
• RingyDingy makes it stupid easy
• See Eric Hodel’s documentation
• O(N) for take(). Sigh.
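A minimal sketch of a TupleSpace used as a shared queue, run locally in one process without DRb or RingyDingy. The tuple layout (`:message`, id, body) is an assumption made for illustration:

```ruby
require "rinda/tuplespace"

space = Rinda::TupleSpace.new

# Producers write tuples; the first element is used as a tag here.
space.write([:message, 1, "hello"])
space.write([:message, 2, "world"])
space.write([:presence, "bob", "online"])

# take removes and returns a tuple matching the template;
# nil acts as a wildcard for that position.
message = space.take([:message, nil, nil])
presence = space.take([:presence, "bob", nil])

# read_all returns the remaining matches without removing them.
remaining = space.read_all([:message, nil, nil])
```

`take` is the call the O(N) bullet complains about: each one has to match the template against stored tuples.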
SELECT * FROM messages WHERE substring(truncate(id,0),-2,1) = #{@fugly_dist_idx}
Timestamp: 12/22/06 01:53:14 (4 months ago)
Author: lattice
Message: Fugly. Seriously. Fugly.
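As read here, the query spreads messages across consumers by the second-to-last digit of the id: `substring(..., -2, 1)` takes one character starting two from the end. A sketch of the same arithmetic in Ruby, valid for ids with at least two digits; `dist_idx` and `partition` are hypothetical names:

```ruby
# Tens digit of an integer id, matching the query's second-to-last character.
def dist_idx(id)
  (id / 10) % 10
end

# Group message ids by the consumer index that would select them.
def partition(ids)
  ids.group_by { |id| dist_idx(id) }
end

groups = partition([1234, 1240, 1241, 1257])
# groups[4] holds the ids a consumer with index 4 would fetch.
```

Because the expression wraps the `id` column in functions, an index on `id` is unlikely to help, which may be part of why the commit message is so unhappy.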
It Scales. (except it stopped on Tuesday)
Options
• ActiveMQ (Java)
• RabbitMQ (Erlang)
• MySQL + Lightweight Locking
• Something Else?
Erlang?
What are you doing?
Stabbing my eyes out with a fork.
Starling
• Ruby, will be ported to something faster
• 4000 transactional msgs/s
• First pass written in 4 hours
• Speaks MemCache (set, get)
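"Speaks MemCache" means a queue can be driven with the two verbs any memcache client already has: `set` to enqueue, `get` to dequeue. A minimal in-memory sketch of that interface; `TinyQueueServer` is hypothetical and is not Starling's code, which also handles the network protocol and transactions:

```ruby
# Hypothetical in-memory queue exposing memcache-style set/get.
class TinyQueueServer
  def initialize
    @queues = Hash.new { |hash, name| hash[name] = [] }
  end

  # set appends a message to the named queue.
  def set(name, message)
    @queues[name].push(message)
    true
  end

  # get removes and returns the oldest message, or nil if the queue is empty.
  def get(name)
    @queues[name].shift
  end
end

server = TinyQueueServer.new
server.set("outgoing", "hello")
server.set("outgoing", "world")
first = server.get("outgoing")
```

Reusing the memcache verbs means producers and consumers need no new client library, only a different host and port.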
Use Messages to Invalidate Cache
(it’s really not that hard)
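One way this could look: writers enqueue the cache keys they have made stale, and a consumer deletes them. A plain-Ruby sketch, assuming an Array as the queue and a Hash as the cache; `drain_invalidations` is a hypothetical name:

```ruby
# Consume invalidation messages and delete the named cache keys.
# queue: an Array used as a FIFO; cache: a Hash standing in for memcache.
# Returns the keys that were actually present and removed.
def drain_invalidations(queue, cache)
  deleted = []
  until queue.empty?
    key = queue.shift
    if cache.key?(key)
      cache.delete(key)
      deleted << key
    end
  end
  deleted
end

cache = { "status_count" => 10, "friends_statuses:7" => [3, 2, 1] }
queue = ["friends_statuses:7", "friends_statuses:8"]
removed = drain_invalidations(queue, cache)
```

The request that changed the data only has to enqueue a message; the cache work happens outside the request cycle.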
Abuse { Part the Fifth }
The Italians
9000 friends in 24 hours (doesn’t scale)
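The slides do not say how this was handled. One possible mitigation, sketched here purely as an assumption, is a per-user sliding-window limit on follows; `FollowLimiter` and its limit and window values are hypothetical:

```ruby
# Sliding-window limiter: at most `limit` follows per user per `window` seconds.
class FollowLimiter
  def initialize(limit, window)
    @limit = limit
    @window = window
    @events = Hash.new { |hash, user_id| hash[user_id] = [] }
  end

  # Return true and record the follow if the user is under the limit at
  # time `now` (in seconds); return false without recording otherwise.
  def allow?(user_id, now)
    recent = @events[user_id]
    recent.reject! { |t| t <= now - @window } # drop follows outside the window
    return false if recent.size >= @limit
    recent << now
    true
  end
end

limiter = FollowLimiter.new(2, 24 * 60 * 60)
```

Passing `now` in explicitly keeps the class testable; a caller would typically supply `Time.now.to_i`.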
http://flickr.com/photos/heather/464504545/
http://flickr.com/photos/curiouskiwi/165229284/
http://flickr.com/photo_zoom.gne?id=42914103&size=l
http://flickr.com/photos/madstillz/354596905/
http://flickr.com/photos/laughingsquid/382242677/
http://flickr.com/photos/bng/46678227/