Scaling Rails with performance in mind from the beginning - Tom Caspy - un.orthodoxgeek.co.il

Refactoring for performance and profit a real life use case (By: Tom Caspy "Tikal")

Performance on Rails: how to treat your code throughout your application's life cycle for the best code style and performance.

Scaling Rails with performance in mind from the beginning

Tom Caspy

Performance - what it actually is

well, code which does what it's supposed to, and doesn't do it as slow as rails 3.0's boot time.

in every part of a project's lifecycle, the way we treat performance is very different.

When you're young

When you're young and naive

when you start with a project, and it's still small on traffic, write naive code!

Do TDD!

To avoid this :)

Write short and concise code

Don't bother with premature optimization

(when you prematurely optimize, this happens)

READ!

prepare for growth, because you're optimistic and all that. make sure you'll know what to do when shit gets real.

be naive but not TOO naive, though

there are some things which just scream - don't do this! it's gonna suck, BAD!

the n+1 query issue is a good example of too naive code.

The controller

The view

The view

The problem

we have an array of users, and when we iterate over that array we reach for profile_image and for posts, which triggers two queries to the DB for each user, ending up with 2n+1 queries, n being the number of users.

The solution

ActiveRecord's includes preloads the associations, so they are fetched in two queries instead of 2n queries.

The new controller

now there are only 3 queries, instead of 2n+1 (n being the number of users).

note that this might not be the right thing to do in larger scale projects. you might want to cache the profile image in redis, for instance, and completely avoid bringing in the profile_image object from the database.
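The controller and view code from these slides did not survive extraction, so what follows is a plain-Ruby sketch of the idea rather than the original code. It fakes the database with a hypothetical `FakeDB` class that counts queries, to show why reaching for `profile_image` and `posts` per user costs 2n+1 queries while loading them in batches costs 3. In a Rails app the batched version would presumably be written with something like `User.includes(:profile_image, :posts)`.

```ruby
# Hypothetical in-memory stand-in for a database that counts every query.
class FakeDB
  attr_reader :query_count

  def initialize
    @query_count = 0
    @users  = [{ id: 1, name: "ann" }, { id: 2, name: "bob" }, { id: 3, name: "cat" }]
    @images = { 1 => "ann.png", 2 => "bob.png", 3 => "cat.png" }
    @posts  = { 1 => ["hello"], 2 => ["first", "second"], 3 => [] }
  end

  def all_users
    @query_count += 1
    @users
  end

  # One query per call: what a lazily loaded association does per user.
  def image_for(user_id)
    @query_count += 1
    @images[user_id]
  end

  def posts_for(user_id)
    @query_count += 1
    @posts[user_id]
  end

  # One query for many users: the idea behind eager loading.
  def images_for(user_ids)
    @query_count += 1
    @images.slice(*user_ids)
  end

  def posts_for_all(user_ids)
    @query_count += 1
    @posts.slice(*user_ids)
  end
end

# Naive: 1 query for the users, then 2 queries per user (2n + 1 in total).
def naive_rows(db)
  db.all_users.map do |u|
    [u[:name], db.image_for(u[:id]), db.posts_for(u[:id]).size]
  end
end

# Eager: 1 query for users, 1 for all images, 1 for all posts (3 in total).
def eager_rows(db)
  users  = db.all_users
  ids    = users.map { |u| u[:id] }
  images = db.images_for(ids)
  posts  = db.posts_for_all(ids)
  users.map { |u| [u[:name], images[u[:id]], posts[u[:id]].size] }
end

naive_db = FakeDB.new
eager_db = FakeDB.new
NAIVE_RESULT  = naive_rows(naive_db)
EAGER_RESULT  = eager_rows(eager_db)
NAIVE_QUERIES = naive_db.query_count
EAGER_QUERIES = eager_db.query_count
puts "naive: #{NAIVE_QUERIES} queries, eager: #{EAGER_QUERIES} queries"
```

Both versions build the same rows; only the number of round trips to the (fake) database differs.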

The importance of TDD

One of the roles I took on when I arrived at FTBPro was kickstarting and leading the move to TDD. We also wrote a bunch of specs for our legacy code. The difference was incredible.

Daily deploys

(instead of weekly deploys)

New code's clean and awesome

More focus on features

because code's fairly covered, there are fewer issues that come up in production (fewer being relative, yeah?)

Upgrading made easy

we moved from rails 3.0 to 3.2 within two weeks. mostly because a vast majority of the issues were discovered in tests.

But this talk is about performance!

When writing TDD, your code will be faster.

● TDD forces you to write short and atomic methods

● we try to make these methods fast because we hate slow specs :)

● code doesn't fail on production, because if it fails, we know about it before deployment.

● no long-running methods, because they're short and concise

More performance specific TDD

using rspec you can test the time a method takes to run, and set a threshold above which the spec fails!

when using the bullet gem, you can make a spec fail when a controller action triggers n+1 queries.

Do benchmarks and performance tests
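A minimal sketch of the time-threshold idea, written in plain Ruby so it runs without RSpec installed. The helper names `elapsed_seconds` and `assert_faster_than!`, the method under test, and the budgets are all hypothetical, not taken from the talk. Inside an RSpec example the same measurement could feed an assertion such as `expect(elapsed).to be < budget`.

```ruby
# Measure the wall-clock time of a block using a monotonic clock.
def elapsed_seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Raise if the block takes longer than the budget (in seconds).
def assert_faster_than!(budget, label)
  took = elapsed_seconds { yield }
  raise "#{label} took #{took.round(3)}s, budget is #{budget}s" if took > budget
  took
end

# Hypothetical method under test.
def sum_of_squares(n)
  (1..n).sum { |i| i * i }
end

# Passes: the work is far below the (generous, assumed) 0.5s budget.
FAST_RUN = assert_faster_than!(0.5, "sum_of_squares") { sum_of_squares(1_000) }

# Fails on purpose: sleeping 50ms blows a 10ms budget.
SLOW_ERROR =
  begin
    assert_faster_than!(0.01, "sleepy") { sleep 0.05 }
    nil
  rescue RuntimeError => e
    e.message
  end
```

Timing assertions are sensitive to machine load, so budgets usually need generous headroom to keep the spec from flapping on a busy CI box.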

original code - written without tests

Rewrite - the specs

the actual code does exactly the same thing, but it's much shorter and much more readable. because it's TDD, every method does only one thing and is tested well.

Conclusion - do TDD!

● code is shorter

● easier to maintain

● it's tested so when it breaks we know it before it's on production

● when we need to refactor or change it, we can be fairly certain it will still work as intended because of the tests.

When you're growing

Now, you start growing, and there are growing pains

● because you've written TDD, when you optimize, you're not going to break anything (or are, but will see it when tests run)

● your code is short and concise, so optimizing it will be easy

● because you didn't optimize anything, you'll feel what needs to be optimized first (using newrelic and the like)

● again, don't optimize what's easy to optimize, optimize the parts which start causing pain.

How to get the feelin`

Newrelic

shows you what's hurting the most

And gives you a breakdown of that

Google Analytics

Browse your site (that's crazy!!)

Listen to users

they may come and complain, or they may just go away. use google analytics to look for pages with an unusually high bounce rate.

Custom tools

statsd and graphite can be quite handy
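For context, a StatsD timing metric is just a short text line sent over UDP, which is part of why instrumenting code with it is cheap. The sketch below is an assumption-laden illustration: the metric name, the `timed` helper, and the host and port (8125 is StatsD's conventional default) are hypothetical, and a real app would more likely use an existing client gem.

```ruby
require "socket"

# Build a StatsD timing line: "<name>:<milliseconds>|ms".
def statsd_timing_line(name, seconds)
  "#{name}:#{(seconds * 1000).round}|ms"
end

# Time a block and send the metric over UDP, fire-and-forget.
def timed(name, host: "127.0.0.1", port: 8125)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  begin
    socket = UDPSocket.new
    socket.send(statsd_timing_line(name, elapsed), 0, host, port)
  rescue SystemCallError, SocketError
    # Metrics must never break the request, so delivery problems are ignored.
  ensure
    socket&.close
  end
  result
end

# Hypothetical metric name; the block stands in for the real work.
TABLE = timed("league_page.build_table") { (1..5).map { |i| i * 2 } }
```

Because UDP is connectionless, the call returns even when no StatsD daemon is listening, so the instrumented code path stays fast.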

Real life example

in FTBPro, we have a score table for each league, which gets updated daily(ish) from an external source.

We noticed in Newrelic that the league page took a long time to load. A short investigation pointed to the table, which led to a tiny change in the code.

Before

After

What? wait! it looks the same!

well, almost. there are two changes - one is a tiny change in variable names to make the code more readable.

the second change is that we used a caching mechanism to bring in the team (called Subject in our code) without making any queries.

the difference was HUGE. time to build the table when cache was cold went down from 7 seconds to 0.5 seconds.

So - what have we done exactly?

● we removed an n+1 query not by including stuff, but by avoiding the query altogether

● we used a caching mechanism for teams, which takes the team's nick (Barcelona can be referred to as barca, or F.C. Barcelona) and returns the cached team.

● used that cache to speed up a very painful part of the site by a lot.

● and yes, of course the view is cached so the rebuild of the table only happens once a day.
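The before and after code from the slides was not preserved, so here is a hedged plain-Ruby sketch of what a nickname-keyed team cache could look like. `Team`, `TeamCache`, the normalization rule, and the sample data are all hypothetical; the real implementation presumably sat on top of a shared cache store rather than an in-process Hash.

```ruby
Team = Struct.new(:id, :name)

# Hypothetical nickname cache: every known alias points at one cached team,
# so rendering a table row needs no database query.
class TeamCache
  attr_reader :loads

  def initialize(&loader)
    @loader = loader # called once to fetch all teams with their aliases
    @by_nick = nil
    @loads = 0
  end

  def fetch(nick)
    warm! if @by_nick.nil?
    @by_nick[normalize(nick)]
  end

  private

  def warm!
    @loads += 1
    @by_nick = {}
    @loader.call.each do |team, nicks|
      ([team.name] + nicks).each { |n| @by_nick[normalize(n)] = team }
    end
  end

  # "F.C. Barcelona", "fc barcelona" and "FC Barcelona" share one key.
  def normalize(nick)
    nick.downcase.gsub(/[^a-z0-9]/, "")
  end
end

CACHE = TeamCache.new do
  # Stand-in for the single expensive query that loads teams and aliases.
  { Team.new(1, "Barcelona") => ["barca", "F.C. Barcelona"],
    Team.new(2, "Real Madrid") => ["real"] }
end

ROWS = ["Barca", "fc barcelona", "Real", "Barcelona"].map { |nick| CACHE.fetch(nick) }
puts ROWS.map(&:name).inspect
```

However many rows the table has, the loader runs once, which is the difference between avoiding the query and merely batching it.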

When you need to refactor, or rewrite.

refactoring is taking code and changing it, while rewriting is starting from scratch.

different reasons for refactoring or rewriting:

● code is causing performance issues

● code is too clumsy, and makes debugging very hard and costly

● code just looks horrid

● Tom said so.

But when do we rewrite and when is it enough just to refactor?

When to refactor

● code is generally ok, maintainable and worth keeping

● small changes would get the desired result easily

● code is well covered with specs

● we're too damn lazy to rewrite it all (yes, it's a valid reason, lazy programmers create short code)

When should we just throw it away and rewrite?

● if maintaining the code costs more than rewriting it, rewrite, and do it well!

● if the code does not have any test coverage and is untestable.

● when code looks like the Flying Spaghetti Monster

● when it was written by Avi Tzurel :)

make sure that the new code is good. if you rewrite shit code into new shit code, you've done nothing!

A little bit about queues

DelayedJob, Resque, Sidekiq, they all got strange names with typos in them. They all save us from hell.

Move long running stuff to the background!

Let's talk about user registration - a user comes to the site, signs in with facebook, we get his image, his facebook friends, etc. It takes a while, even a long while.

Put it aside!

Calculating all that stuff takes a long time.

It doesn't have to be that way. We really only need to save the user's name and facebook details, and that's it. We'll do the rest in the background, using one of the queueing mechanisms Ruby has to offer us. This will allow us to give the user a better, faster experience.
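A rough sketch of the same split in plain Ruby, using an in-process `Queue` and a `Thread` as a stand-in for DelayedJob, Resque or Sidekiq. The `register` and `enrich` methods, the user fields, and the sample values are hypothetical. The worker is started after the request here only to keep the demo deterministic; a real worker would run continuously in its own process.

```ruby
JOBS  = Queue.new
USERS = {}

# Hypothetical slow step: fetching the image and friends from Facebook.
def enrich(user_id)
  sleep 0.05 # pretend this is a slow external API call
  USERS[user_id][:image]   = "image-for-#{user_id}.png"
  USERS[user_id][:friends] = ["friend-a", "friend-b"]
end

# The request path: save the minimum, enqueue the rest, return right away.
def register(user_id, name)
  USERS[user_id] = { name: name, image: nil, friends: nil }
  JOBS << user_id
  USERS[user_id]
end

# What the user's request sees: a saved record, with the slow parts pending.
AFTER_REQUEST = register(1, "dana").dup

# A background worker that drains the queue until it sees :stop.
worker = Thread.new do
  while (job = JOBS.pop) != :stop
    enrich(job)
  end
end

JOBS << :stop
worker.join

# What is stored once the background job has finished.
AFTER_JOB = USERS[1]
```

The response is built from `AFTER_REQUEST`, so the page can render before the image and friends list exist and fill them in later.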

Starting to get seriously huge

(ok, maybe this isn't a good image)

Hitting large scale

Q - when do you know you've hit large scale?

A - when your servers crash daily.

now, when you've reached that, you know you need to do some really drastic stuff to adjust to your new position.

A quick detour to the land of DevOps

● handling large scale requires a lot of resources, and managing these resources effectively.

● cloud services such as Amazon AWS give companies some simple tools to handle scale very well.

● but if you don't know what you're doing, call for help :)

FTBpro's setup on AWS

Mysql with RDS

RDS is Amazon's managed relational database service, which we use for mysql. It's optimized and easy to set up, and it saves us a lot of time on system administration.

Memcached with elasticache

Elasticache is the Amazon memcached service. same as RDS, it saves us the time and bother of messing with memcached servers.

Custom redis server

thinking about moving to cloud services to save us the trouble.

Web servers with nginx+unicorn

nginx+unicorn are like milk and cookies. With the right setup we also get zero-downtime deploys, which are awesome.

Resque servers

they're also built for automatic scaling. just because we're awesome!

CDN cache with cotendo (akamai)

logged out users don't even touch the web servers - their content is served by the CDN.

Build it for quick and automatic scale

● self-deploying servers - when you start the server from its image, it will deploy to itself and start serving traffic / run resque workers

● adding servers is automatic - when there's high traffic, start them up, then kill them when traffic's low

● this allows us to pay the minimum for hosting, while keeping scalability

careful with these self-deploying robots! make sure they know the robot rules...

The rules:

1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.

2. A robot must obey any orders given to it by human beings, except where such orders would conflict with the First Law.

3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

ok, back to Ruby (kind of)

When reaching massive scale, we'd start looking for custom solutions - relational dbs would stay forever, but some things should be moved to other customized solutions.

● consider using mongo for document-like data

● consider using neo4j or other graph dbs for representing graph data (sorry Avi, mongo ain't no graph DB!)

And don't forget to stay naive!

being large scale, but still fun and lean, can be hard, but pulling it off is worth it!

Thanks for not falling asleep!

Questions?

Tom Caspy