(c) 2009 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc.. All rights reserved. 1.0 1 Sunday, September 27, 2009

Making Facebook Faster

DESCRIPTION

Slides from talk on Frontend Performance Engineering delivered to Velocity 2009 by David Wei and Changhao Jiang

Page 2: Making Facebook Faster

Making Facebook faster: Frontend performance engineering

Velocity 2009, Jun 24, 2009, San Jose, CA

David Wei and Changhao Jiang

Page 3: Making Facebook Faster

Agenda

1 Site speed matters

2 Performance monitoring

3 Static resource management

4 Ajaxification

5 Client side cache

Page 4: Making Facebook Faster

Site speed matters!

First things first: site speed matters.

Page 5: Making Facebook Faster

Site speed matters: large scale

200 million users, more than 4 billion page views / day

▪ 10 ms per page = more than 1 man-year per day

= more than 5 human lifetimes per year

Facebook cares about site speed. … So yes, we care about site speed.

At our scale, 200 million users generate more than 4 billion page loads per day.

If we speed up each page load by 10 ms, in aggregate we save our users more than 1 man-year of time per day; accumulated over a year, that is more than 5 human lifetimes.

Site speed also affects our bottom line. Experiments show that if we reduce latency by 600 ms, the user click rate improves by more than 5%. We are currently running an in-depth experiment on the impact of latency.
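The arithmetic behind the "1 man-year per day" claim can be checked with a short sketch. The page-view count and the 10 ms saving come from the slide; the 80-year lifetime is an assumption made here only to turn years into "lifetimes".

```javascript
// Back-of-the-envelope check of the slide's numbers.
const PAGE_VIEWS_PER_DAY = 4e9;            // from the slide
const SAVED_MS_PER_PAGE = 10;              // from the slide
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60;
const LIFETIME_YEARS = 80;                 // assumption, not from the talk

// Total time saved per day across all page views, expressed in years.
function savedYearsPerDay(pageViews, savedMs) {
  const savedSeconds = (pageViews * savedMs) / 1000;
  return savedSeconds / SECONDS_PER_YEAR;
}

const yearsPerDay = savedYearsPerDay(PAGE_VIEWS_PER_DAY, SAVED_MS_PER_PAGE);
const lifetimesPerYear = (yearsPerDay * 365) / LIFETIME_YEARS;

console.log(yearsPerDay.toFixed(2));       // about 1.27 years saved per day
console.log(lifetimesPerYear.toFixed(1));  // about 5.8 lifetimes per year
```

Both figures land just above the slide's "more than 1" and "more than 5" bounds.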

Page 6: Making Facebook Faster

Site speed matters: emerging

• Agile development

On the other hand, a site like Facebook faces huge challenges in performance optimization. Here are a few major ones…

Move fast, no stable code base

Fast development: every week we release a new version of the site with hundreds of code changes, and tens of small code changes are pushed every day. So the code base is never stable, and there is no time to stop for pure optimization.

Page 7: Making Facebook Faster

Site speed matters: emerging

• Agile development

• Deep integration

Deep integration: each Facebook home page is customized for a particular user, with features developed by many teams. Some are applications by third-party developers and some are internal Facebook features, depending on which features and applications the user has adopted. It also takes a lot of JavaScript to run them.

Page 8: Making Facebook Faster

Site speed matters: emerging

• Agile development

• Deep integration

• Viral adoption

Viral adoption: it is very hard to predict whether a feature released today will be used by 1 million or 10 million users next week, so it is difficult to optimize beforehand. The infrastructure has to adapt to the growth of user adoption.

Page 9: Making Facebook Faster

• Agile development

• Deep integration

• Viral adoption

• Heavily interactive

… this talk, we will share our experience on how to make a site faster with these challenges

Heavy interaction: our pages have many dynamic features that rely on JavaScript. For example, the in-browser chat and the application dock provide a very convenient user experience, but they also take a lot of JavaScript to run.

Page 10: Making Facebook Faster

Site speed matters: emerging

• Agile development

• Deep integration

• Viral adoption

• Heavily interactive

In summary, we have a lot of challenges.

These challenges are actually essential to making Facebook a paradise for people who want to build new things: you can write something cool tonight and push it out tomorrow to 200 million users. At the same time, they make site performance hard to predict and maintain.

In this talk, we will share our experience on how to optimize front-end performance under these challenges.

Page 11: Making Facebook Faster

Site speed: end-to-end latency experienced by

▪ From a user request to the presentation of the page at the browser, interactive:
▪ Network Transfer Time
▪ Server Generation Time
▪ Client Render Time

[Diagram: FB Server, Content Distribution Network (CDN), and Browsers, annotated with GenTime, NetTime, and RenderTime]

Before going into details, we define our problem domain.

We define end-to-end user latency as the time from when the user starts a page request to when the page is presented in the browser and interactive.

There are three components of latency in this process:

Network transfer time is the time from the user's browser to the Facebook server and back.
Server generation time is the time spent on the Facebook servers.
Client render time is the time the browser spends parsing the HTML, loading JavaScript/CSS/images, and rendering the content.

Page 12: Making Facebook Faster

Site speed: end-to-end latency experienced by

User latency = RenderTime + NetTime + GenTime

▪ RenderTime: ~50% of end-user latency

▪ NetTime: ~25% of end-user latency

▪ GenTime: ~25% of end-user latency

Looking at Facebook's user latency, client-side render time is about 50% of the end-to-end latency; network time and server-side generation time are about 25% each.
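As an illustration of the decomposition, the sketch below sums the three components over a set of samples and reports each component's share. The sample format and field names are hypothetical; the talk does not show how the shares were actually computed.

```javascript
// Hypothetical per-sample timings in milliseconds; field names are assumptions.
function latencyBreakdown(samples) {
  const totals = { renderTime: 0, netTime: 0, genTime: 0 };
  for (const s of samples) {
    totals.renderTime += s.renderTime;
    totals.netTime += s.netTime;
    totals.genTime += s.genTime;
  }
  // User latency = RenderTime + NetTime + GenTime
  const userLatency = totals.renderTime + totals.netTime + totals.genTime;
  const share = {};
  for (const key of Object.keys(totals)) {
    share[key] = userLatency === 0 ? 0 : totals[key] / userLatency;
  }
  return { userLatency, share };
}

const breakdown = latencyBreakdown([
  { renderTime: 2000, netTime: 1000, genTime: 1000 },
  { renderTime: 1000, netTime: 500, genTime: 500 },
]);
console.log(breakdown.share); // render 0.5, net 0.25, gen 0.25 for this made-up data
```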

Page 13: Making Facebook Faster

In this talk, we focus on the biggest chunk: render time.

Page 14: Making Facebook Faster

Cavalry: Site speed monitoring

Page 15: Making Facebook Faster

User-based measurement

What's our speed?

▪ sampling 1/10000 page loads

[Diagram: timeline of a sampled page load between the server and the browser, with events for first bytes of HTML, JS, all content loaded / page interactive, and report]

To make the site faster, the first question we want to ask is: what is our site speed?

There are usually two approaches: run in-house tests, or sample real users. We did both and found the second approach much more helpful.

We learned lessons from the first approach: our pages are vastly different for different users, and Facebook employees are most likely outliers, because they tend to have many more features enabled than normal users and have installed many plugins such as Firebug and IE developer tools. Even finding a "typical" user is hard, as our users' usage behavior changes all the time.

Our approach is to take samples from our users. We run JavaScript measurement on a sample of page loads, 1 in 10,000, to measure the real speed. The red arrows are the events that we record.

This gives us a real picture of what site speed looks like for Facebook.
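A minimal sketch of such sampled measurement follows. Only the 1/10000 rate comes from the slide; `shouldSample`, `createPageTimer`, and the event names are hypothetical, and the random source and clock are injectable so the logic can be exercised outside a browser.

```javascript
const SAMPLE_RATE = 1 / 10000; // from the slide

// Decide whether this page load is measured. `random` defaults to Math.random.
function shouldSample(random = Math.random) {
  return random() < SAMPLE_RATE;
}

// Collects named timestamps for one page load and builds a small report.
function createPageTimer(now = Date.now) {
  const marks = {};
  return {
    mark(name) {
      marks[name] = now();
    },
    report() {
      return {
        // Time from the first bytes of HTML until the page is interactive.
        renderTime: marks.interactive - marks.firstBytes,
      };
    },
  };
}

// Usage with a fake clock standing in for the browser's clock.
let fakeClock = 1000;
const timer = createPageTimer(() => fakeClock);
timer.mark('firstBytes');
fakeClock = 3500;
timer.mark('interactive');
const report = timer.report();
console.log(report.renderTime);
```

In a real page the report would be sent back to the server only when `shouldSample()` returns true, so the cost of measurement is paid by a small fraction of page loads.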

By the way, we load our JavaScript before our CSS, because the JavaScript is downloaded in parallel, along with CSS and images.

Page 16: Making Facebook Faster

The last thing I want to point out on this slide is that we load our JavaScript before our CSS, which violates the common best practice of putting CSS in front of JS. However, we download most of our JavaScript in parallel, so putting JS at the top lets JS, CSS, and images all download in parallel. Half a year ago, we tested this and found it to be faster. We are running another set of experiments to see whether things have changed.

Page 17: Making Facebook Faster

Cavalry: Day-to-day monitoring

What's our speed?

▪ Collect gen time / network transfer time and render time

[Diagram: Network Time, GenTime, and browser onload time feed Cavalry Logs, which drive daily site speed monitoring]

We combine the JS measurement with our server-side measurement of page generation time and network round-trip time, and put it all into a database.

Now we can yell to the company: "Hey, the site is slower today!"

However, we still don't know who caused it. We continuously launch different features every week, and it is hard to stop and test for performance.

Page 18: Making Facebook Faster

Cavalry: Project-based analysis

Who made it faster / slower?

▪ Integrated with Launch System

[Diagram: Launch System, Network Time, GenTime, and browser onload time feed Cavalry Logs, which drive daily site speed monitoring and project-based regression detection]

1. The second step of our measurement is to hook the logs up with our launch system. For each measurement sample, we record which new features were launched in the page load.

2. When there is a regression, we can go over the samples and identify the feature launch that caused it.

3. This makes the corresponding team much more responsive to a regression.

4. Then there is still a question: "Why is it slow? How can I fix it?"
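One simple way to "go over the samples" is to compare the mean latency of page loads with and without each launched feature. The sketch below does exactly that; the log format, function names, and feature names are hypothetical, and the talk does not describe Cavalry's actual analysis.

```javascript
function mean(values) {
  return values.reduce((a, b) => a + b, 0) / values.length;
}

// samples: [{ latencyMs, features: [...] }], a hypothetical log format.
// Returns features ranked by how much slower page loads with them are.
function rankFeaturesByLatencyDelta(samples) {
  const features = new Set(samples.flatMap((s) => s.features));
  const ranked = [];
  for (const feature of features) {
    const withFeature = samples
      .filter((s) => s.features.includes(feature))
      .map((s) => s.latencyMs);
    const withoutFeature = samples
      .filter((s) => !s.features.includes(feature))
      .map((s) => s.latencyMs);
    if (withoutFeature.length === 0) continue; // no baseline to compare against
    ranked.push({ feature, deltaMs: mean(withFeature) - mean(withoutFeature) });
  }
  return ranked.sort((a, b) => b.deltaMs - a.deltaMs);
}

const ranked = rankFeaturesByLatencyDelta([
  { latencyMs: 3000, features: ['newChat'] },
  { latencyMs: 3200, features: ['newChat', 'newDock'] },
  { latencyMs: 2000, features: ['newDock'] },
  { latencyMs: 2100, features: [] },
]);
console.log(ranked[0]); // the feature with the largest latency delta
```

A difference of means ignores confounding between features, so it is a starting point for investigation rather than proof of cause.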

Page 19: Making Facebook Faster

Cavalry: Numeric metrics

Why are we fast / slow? How can I fix it?

▪ YSlow-like technical metrics

[Diagram: Gate Keeper, Network Time, GenTime, and browser onload time feed Cavalry Logs, which drive daily site speed monitoring, project-based regression detection, regression analysis, and YSlow-like metrics]

To answer the "why" question, YSlow is a good tool.

1. We instrument a subset of the YSlow metrics in our sampled page loads. We measure the number of images, DOM nodes, script tags, HTML bytes, CSS rules, and so on. These metrics can indicate what causes a performance regression.

2. What is still missing is a mapping from the YSlow-like metrics to actual time (ms).

Page 20: Making Facebook Faster

“WWW” in performance monitoring: What? Who? Why?

▪ User-based measurement: unbiased, representative results

▪ Feature-launch integration: identify the regression

▪ Technical metrics: define actionable items for improvement

The missing part is prioritization: how much time, in ms, do we save if we reduce the number of CSS rules by 10%, versus moving the JS down to the bottom?

Page 21: Making Facebook Faster

Haste: Static resource management

Page 22: Making Facebook Faster

Why do we need SR Management?

• Day 1: Some smart engineers start a project!

“Let’s write a new page with features A, B and C!”

<Print css tag for feature A>

<Print css tag for feature B>

<Print css tag for feature C>

<print HTML of feature A>

<print HTML of feature B>

<print HTML of feature C>

Page 23: Making Facebook Faster

Why do we need SR Management?

• Day 2: Some smart engineers run PageSpeed and think…

“A & B & C are always used; let’s package them together!”

<Print css tag for feature A>

<Print css tag for feature B>

<Print css tag for feature C>

<print HTML of feature A>

<print HTML of feature B>

<print HTML of feature C>

Page 24: Making Facebook Faster

Why do we need SR Management?

• Day 2: Awesome!

<Print css tag for feature A & B & C>

<print HTML of feature A>

<print HTML of feature B>

<print HTML of feature C>

Page 25: Making Facebook Faster

Why do we need SR Management?

• Day 3: feature C evolves…

<Print css tag for feature A & B & C>

<print HTML of feature A>

<print HTML of feature B>

if (users_signup_for_C()) { <print HTML of feature C> }

Page 26: Making Facebook Faster

Why do we need SR Management?

• Day 3:

<Print css tag for feature A & B & C>

<print HTML of feature A>

<print HTML of feature B>

if (users_signup_for_C()) { <print HTML of feature C> }

A & B are always used, while C is not…

Page 27: Making Facebook Faster

Why do we need SR Management?

• Day 4: feature C is deprecated

<Print css tag for feature A & B & C>

<print HTML of feature A>

<print HTML of feature B>

// no one uses C { <print HTML of feature C> }

Page 28: Making Facebook Faster

Why do we need SR Management?

• Day 4: we start to send unused bits

<Print css tag for feature A & B & C>

<print HTML of feature A>

<print HTML of feature B>

// no one uses C { <print HTML of feature C> }

It is hard to remember that we should remove C here.

Page 29: Making Facebook Faster

Why do we need SR Management?

• One month later…

<Print css tag for feature A & B & C & D & E & F & G…>

if (F is used) { <print HTML of feature F> }

<print HTML of feature G>

if (F is not used) { <print HTML of feature E> }

Thousands of dead CSS rules in the package.

Page 30: Making Facebook Faster

Static Resource Management

Challenges:

• Deep Integration

• Viral Adoption

• Agile Development

Responses:

• Separate requirement declaration and delivery of static resources

• Requirement declaration: lives with HTML generation

• Delivery: Globally optimized

Deep integration: each page has many features.
Viral adoption: usage patterns change quickly.
Agile development: features change fast.

Page 31: Making Facebook Faster

Haste: Static Resource Management

• Back to Day 1:

require_static(A_css); <render HTML of feature A>

require_static(B_css); <render HTML of feature B>

require_static(C_css);<render HTML of feature C>

<deliver all required CSS>

<print all rendered HTML>

Separate Declaration from actual Delivery

Global Optimization on Delivery

Requirement Declaration lives with HTML
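The separation of declaration from delivery can be sketched as follows. This is a minimal illustration, not Haste's implementation: `createPage`, `requireStatic`, the render functions, and the file names are all hypothetical stand-ins for the `require_static` idea on the slide.

```javascript
// Declarations are collected while features render; delivery happens once.
function createPage() {
  const required = new Set(); // declared resources, de-duplicated
  const html = [];
  return {
    requireStatic(resource) {
      required.add(resource);
    },
    render(fragment) {
      html.push(fragment);
    },
    // Emit all required CSS first, then all rendered HTML.
    flush() {
      const links = [...required].map(
        (href) => `<link rel="stylesheet" href="${href}">`
      );
      return links.concat(html).join('\n');
    },
  };
}

function renderFeatureA(page) {
  page.requireStatic('a.css');
  page.render('<div>A</div>');
}

function renderFeatureC(page, userSignedUp) {
  if (!userSignedUp) return; // C's CSS is only declared when C is rendered
  page.requireStatic('c.css');
  page.render('<div>C</div>');
}

const page = createPage();
renderFeatureA(page);
renderFeatureC(page, false);
const output = page.flush();
console.log(output);
```

Because each declaration sits next to the HTML that needs it, removing or gating a feature automatically removes its CSS from the page, which is the failure mode the "Day 3" and "Day 4" slides describe.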

Page 32: Making Facebook Faster

Haste: Global Optimization

Online process

require_static(A_css); <render HTML of feature A>

require_static(B_css); <render HTML of feature B>

require_static(C_css); <render HTML of feature C>

<deliver all required CSS>

<print all rendered HTML>

Offline analysis

Usage pattern logs -> Clustering algorithms -> “Optimal” packages
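The talk does not say which clustering algorithm is used. As a stand-in, the sketch below greedily groups resources whose usage across logged page loads is similar, measured by Jaccard similarity. The function names, the threshold, and the log format are assumptions.

```javascript
// Jaccard similarity between two sets of page-load indices.
function jaccard(a, b) {
  let shared = 0;
  for (const x of a) if (b.has(x)) shared++;
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// pageLoads: array of resource lists, one per logged page load.
function buildPackages(pageLoads, threshold = 0.8) {
  const usage = new Map(); // resource -> set of page-load indices using it
  pageLoads.forEach((resources, i) => {
    for (const r of resources) {
      if (!usage.has(r)) usage.set(r, new Set());
      usage.get(r).add(i);
    }
  });

  // Greedy pass: join the first package whose seed resource is used similarly.
  const packages = []; // each entry: { loads: seed usage set, members: [...] }
  for (const [resource, loads] of usage) {
    const home = packages.find((p) => jaccard(p.loads, loads) >= threshold);
    if (home) home.members.push(resource);
    else packages.push({ loads, members: [resource] });
  }
  return packages.map((p) => p.members);
}

const packages = buildPackages([
  ['jobs.js', 'timeeditor.js', 'resumepro.js'],
  ['jobs.js', 'timeeditor.js', 'resumepro.js'],
  ['chat.js'],
  ['chat.js', 'dock.js'],
]);
console.log(packages);
```

Files that are always requested together end up in one package, while a file that is only sometimes requested with another stays separate, so the package boundaries follow observed usage rather than the directory layout.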

Page 33: Making Facebook Faster

Haste: Trace-based Packaging

Nov 2008 => May 2009

Date      # of JS files  # of JS bytes  # of pkg at a home.php  # of bytes at a home.php
Nov 2008  461            4.4 MB         29                      629 KB
May 2009  729            5.9 MB         14                      560 KB

The number of JS files increased by about 60% and their total size by about 30%, while the number of packages sent for home.php was halved and the bytes sent dropped by about 10%.

find | grep -v '\.svn' | grep -v intern | grep -c '\.css$'
find | grep -v '\.svn' | grep -v intern | grep '\.css$' | xargs cat > /tmp/dwei_2008

Page 34: Making Facebook Faster

Haste: Trace-based Packaging

Nov 2008 => May 2009

'js/careers/jobs.js', 'js/lib/ui/timeeditor.js', 'resume/js/resumepro.js', 'resume/js/resumesection.js'

Developers think that timeeditor.js is a library file; in fact, it is only used on one production page (careers). On the other hand, it turns out that the “resume” functionality is almost always used on the careers page.

Page 35: Making Facebook Faster

Haste: Trace-based Packaging

Nov 2008 => May 2009

Date      # of CSS files  # of CSS bytes  # of pkg at a home.php  # of bytes at a home.php
Nov 2008  487             1.7 MB          24                      69 KB
May 2009  706             1.9 MB          15                      64 KB

CSS is a similar story.

Page 36: Making Facebook Faster

Haste: Trace-based Analysis

Potentials for image sprites too!

• Thousands of virtual gifts with static images, which to sprite?

The same trace-based analysis techniques can be used for image spriting too:

Page 37: Making Facebook Faster

Haste: Trace-based Analysis

Potentials for image sprites too!

• The answer is…

The answer is…

In retrospect, this is pretty straightforward.

Page 38: Making Facebook Faster

Haste: Trace-based Analysis

Adaptive Performance Optimization

• JS / CSS package optimization

• Guidance for image spriting

• Guidance of progressive rendering

Once we separate the declaration and delivery of static resources, we have plenty of room for automatic optimization through trace analysis.

You can do automatic packaging, automatic spriting, and even automatic progressive rendering: look at the most frequently used resources and flush them out before generating the page.
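Picking resources to flush early can be sketched as a frequency count over the usage logs. The function name, the log format, and the 90% cutoff below are assumptions made for illustration; the talk gives no concrete rule.

```javascript
// Returns resources requested by at least `minShare` of logged page loads.
function pickEarlyFlushResources(pageLoads, minShare = 0.9) {
  const counts = new Map();
  for (const resources of pageLoads) {
    // Count each resource at most once per page load.
    for (const r of new Set(resources)) {
      counts.set(r, (counts.get(r) || 0) + 1);
    }
  }
  return [...counts]
    .filter(([, n]) => n / pageLoads.length >= minShare)
    .map(([resource]) => resource);
}

const early = pickEarlyFlushResources([
  ['base.css', 'chat.js'],
  ['base.css'],
  ['base.css', 'gifts.js'],
]);
console.log(early); // only resources used by nearly every page load
```

Resources that pass the cutoff are safe bets to send before the server has finished generating the page, because almost every page ends up requiring them anyway.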

Page 39: Making Facebook Faster

Quickling: Ajaxify the Facebook site

Page 40: Making Facebook Faster

Remove redundant work via Ajax

[Diagram: timeline of a user session over Page 1 to Page 4, with a load/unload cycle for each page; legend: full page load, Ajax call]

Page 42: Making Facebook Faster

Remove redundant work via Ajax

[Diagram: the same user session over Page 1 to Page 4 shown twice: with full page loads there is a load/unload cycle for each page; with Ajax calls there is a single load/unload cycle for the whole session]

Page 43: Making Facebook Faster

How does Quickling work?

Page 44: Making Facebook Faster

How does Quickling work?

1. User clicks a link or back/forward button

Page 45: Making Facebook Faster

How does Quickling work?

1. User clicks a link or back/forward button

2. Quickling sends an Ajax request to the server

3. Response arrives

Page 46: Making Facebook Faster

How does Quickling work?

1. User clicks a link or back/forward button

2. Quickling sends an Ajax request to the server

3. Response arrives

4. Quickling blanks the content area

Page 47: Making Facebook Faster

How does Quickling work?

1. User clicks a link or back/forward button

2. Quickling sends an Ajax request to the server

3. Response arrives

4. Quickling blanks the content area

5. Download JavaScript/CSS

Page 48: Making Facebook Faster

How does Quickling work?

1. User clicks a link or back/forward button

2. Quickling sends an Ajax request to the server

3. Response arrives

4. Quickling blanks the content area

5. Download JavaScript/CSS

6. Show new content

Page 49: Making Facebook Faster

LinkController

Intercept user clicks on links

▪ Dynamically attach a handler to all link clicks:

$('a').click(function() {
  // 'payload' is a JSON encoded response from the server
  $.get(this.href, function(payload) {
    // Dynamically load 'js', 'css' resources for this page.
    bootload(payload.bootload, function() {
      // Swap in the new page's content
      $('#content').html(payload.html);
      // Execute the onloadRegister'ed js code
      execute(payload.onload);
    });
  }, 'json');
  // Stop the browser from following the link with a full page load
  return false;
});

Page 50: Making Facebook Faster

HistoryManager

Enable ‘Back/Forward’ buttons for AJAX requests

▪ Set target page URL as the fragment of the URL

▪ http://www.facebook.com/home.php

▪ http://www.facebook.com/home.php#/cjiang?ref=profile

▪ http://www.facebook.com/home.php#/friends/?ref=tn
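The fragment convention in the example URLs can be sketched with two small helpers. `targetPath` and `withFragment` are hypothetical names; the slide only shows the URL shapes, not the HistoryManager code.

```javascript
// Returns the path of the page to display for a given URL.
function targetPath(url) {
  const hashIndex = url.indexOf('#');
  if (hashIndex !== -1 && url[hashIndex + 1] === '/') {
    return url.slice(hashIndex + 1); // target page is encoded in the fragment
  }
  return new URL(url).pathname; // plain URL, no Ajax navigation yet
}

// Builds the address-bar URL after an Ajax navigation to `path`.
function withFragment(currentUrl, path) {
  return currentUrl.split('#')[0] + '#' + path;
}

console.log(targetPath('http://www.facebook.com/home.php#/cjiang?ref=profile'));
console.log(withFragment('http://www.facebook.com/home.php', '/friends/?ref=tn'));
```

Changing only the fragment does not trigger a page load, but it does create a browser history entry, which is what lets the Back and Forward buttons work across Ajax navigations.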

Page 51: Making Facebook Faster

Bootloader

Load static resources via ‘script’, ‘link’ tag injection

function requestResource(type, source) {
  var h = document.getElementsByTagName('head')[0];
  switch (type) {
    case 'js':
      var script = document.createElement('script');
      script.src = source;
      script.type = 'text/javascript';
      h.appendChild(script);
      break;
    case 'css':
      var link = document.createElement('link');
      link.rel = "stylesheet";
      link.type = "text/css";
      link.media = "all";
      link.href = source;
      h.appendChild(link);
      break;
  }
}

Page 52: Making Facebook Faster

Other details

▪ All pages now share a single global JavaScript scope:
  ▪ Explicitly reclaim resources or reset states before leaving a page
  ▪ Stub out setTimeout and setInterval
▪ All CSS rules will be accumulated
  ▪ Name-spacing CSS rules with page-specific information
▪ Busy indicator
▪ iframe transport
▪ Permanent link
  ▪ prelude inlined js code to redirect if necessary
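One way to read "stub out setTimeout and setInterval" is to route timers through a wrapper that remembers them, so everything a page started can be cancelled when the user leaves it. The sketch below assumes that reading; `createTimerScope` and its methods are hypothetical names, not Quickling's API.

```javascript
// Tracks timers started by the current page so they can be cleared together.
function createTimerScope(host = globalThis) {
  const timeouts = new Set();
  const intervals = new Set();
  return {
    setTimeout(fn, ms) {
      const id = host.setTimeout(() => {
        timeouts.delete(id); // a fired timeout no longer needs tracking
        fn();
      }, ms);
      timeouts.add(id);
      return id;
    },
    setInterval(fn, ms) {
      const id = host.setInterval(fn, ms);
      intervals.add(id);
      return id;
    },
    // Call right before the next page's content is swapped in.
    clearAll() {
      timeouts.forEach((id) => host.clearTimeout(id));
      intervals.forEach((id) => host.clearInterval(id));
      timeouts.clear();
      intervals.clear();
    },
    pendingCount() {
      return timeouts.size + intervals.size;
    },
  };
}

const scope = createTimerScope();
scope.setTimeout(() => console.log('never runs'), 10000);
scope.setInterval(() => console.log('never runs'), 10000);
const pendingBefore = scope.pendingCount();
scope.clearAll();
const pendingAfter = scope.pendingCount();
console.log(pendingBefore, pendingAfter);
```

Without this kind of cleanup, a timer registered by one page would keep firing after the content area has been replaced, because the JavaScript scope now outlives the page.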

Page 53: Making Facebook Faster

Current status

▪ Turned on for Firefox and IE users (>90% of users)

▪ ~60% of page hits to Facebook site are Quickling requests

Page 54: Making Facebook Faster

Performance improvement

40% ~ 50% reduction in render time

Page 55: Making Facebook Faster

PageCache: Cache visited pages at client side

Page 56: Making Facebook Faster

PageCache

Cache user visited pages in browsers

▪ Motivation:
  ▪ A typical user session:
    ▪ home -> profile -> photo -> home -> notes -> home -> photo -> photo
  ▪ Some pages are likely to be revisited soon (temporal locality)
    ▪ Home page visited every 3 ~ 5 page views
    ▪ Back/Forward button

Page 57: Making Facebook Faster

How does PageCache work?

1. User clicks a link or back button

2. Quickling sends an Ajax request to the server

3. Response arrives

4. Quickling blanks the content area

5. Download JavaScript/CSS

6. Show new content

Page 58: Making Facebook Faster

How does PageCache work?

1. User clicks a link or back button

2. Quickling sends an Ajax request to the server

3. Response arrives

3.5 Save response in cache

4. Quickling blanks the content area

5. Download JavaScript/CSS

6. Show new content

Page 60: Making Facebook Faster

How does PageCache work?

1. User clicks a link or back button

2. Find page in the cache

3. Response arrives

4. Quickling blanks the content area

5. Download JavaScript/CSS

6. Show new content
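The two flows above (fetch and save on a miss, reuse on a hit) can be sketched with an in-memory map. `createPageCache`, `load`, and the fake fetch function are hypothetical; the real PageCache is not shown in the slides.

```javascript
// fetchPage(url, done) stands in for the Ajax request Quickling would send.
function createPageCache(fetchPage) {
  const cache = new Map(); // url -> payload, kept in memory only
  return {
    load(url, onReady) {
      if (cache.has(url)) {
        onReady(cache.get(url), true); // step 2: found in the cache
        return;
      }
      fetchPage(url, (payload) => {
        cache.set(url, payload); // step 3.5: save response in cache
        onReady(payload, false);
      });
    },
  };
}

// Usage with a fake server that counts how often it is hit.
let serverHits = 0;
const fakeFetch = (url, done) => {
  serverHits++;
  done({ html: `<p>${url}</p>` });
};

const pageCache = createPageCache(fakeFetch);
const hits = [];
pageCache.load('/home.php', (payload, fromCache) => hits.push(fromCache));
pageCache.load('/profile.php', (payload, fromCache) => hits.push(fromCache));
pageCache.load('/home.php', (payload, fromCache) => hits.push(fromCache));
console.log(hits, serverHits);
```

The second visit to the home page is served without a server round trip, which is where the saving on page hits comes from. A production version would also need the consistency mechanisms described on the following slides.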

Page 61: Making Facebook Faster

Cache consistency 1: Incremental updates

Cached version

We provide functions that let programmers register a JavaScript function to be called right before a cached page is shown. The home page uses this to refresh ads and fetch the latest stories.

Page 62: Making Facebook Faster

Cache consistency 1: Incremental updates

Cached version Restored version

Page 63: Making Facebook Faster

Cache consistency 1: Incremental updates

Poll server for incremental updates via Ajax calls.

▪ Allow registering JavaScript functions to be called right before cached page is shown.
▪ Used by home page to refresh ‘ads’, fetch latest stories

Cached version Restored version
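A minimal sketch of such a registration hook is shown below. `createCachedPage`, `onBeforeShow`, and the page shape are hypothetical, and the hooks here patch the page synchronously, whereas the real ones would poll the server via Ajax.

```javascript
// Wraps a cached page and runs registered hooks before it is shown again.
function createCachedPage(snapshot) {
  const beforeShow = [];
  return {
    onBeforeShow(fn) {
      beforeShow.push(fn);
    },
    restore() {
      // Start from a copy of the cached page, then let each hook patch it.
      return beforeShow.reduce((page, fn) => fn(page), { ...snapshot });
    },
  };
}

const home = createCachedPage({ stories: ['old story'], ad: 'ad #1' });
home.onBeforeShow((page) => ({ ...page, ad: 'ad #2' }));
home.onBeforeShow((page) => ({
  ...page,
  stories: ['new story', ...page.stories],
}));

const restored = home.restore();
console.log(restored);
```

The cached copy supplies most of the page immediately, and only the parts that are expected to change between visits are refreshed.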

Page 64: Making Facebook Faster

Cache consistency 2: In-page writes

Cached version

Page 65: Making Facebook Faster

Cache consistency 2: In-page writes

Cached version Restored version

Page 66: Making Facebook Faster

Cache consistency 2: In-page writes

Record and replay

▪ Automatically record all state-changing operations in a cached page
▪ Automatically replay those operations when cached page is restored.

Cached version Restored version
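Record and replay can be sketched with operations modeled as pure functions over a page state. The recorder API and the `like` / `addComment` operations are hypothetical examples; the slides do not say how operations are captured in practice.

```javascript
// Records every state-changing operation so it can be replayed later.
function createRecorder() {
  const log = [];
  return {
    // Apply an operation now and remember it for later replays.
    apply(state, op) {
      log.push(op);
      return op(state);
    },
    // Re-run all recorded operations on top of the cached state.
    replay(cachedState) {
      return log.reduce((state, op) => op(state), cachedState);
    },
  };
}

// Example operations, each returning a new state instead of mutating.
const like = (page) => ({ ...page, likes: page.likes + 1 });
const addComment = (text) => (page) => ({
  ...page,
  comments: [...page.comments, text],
});

const cachedVersion = { likes: 0, comments: [] };
const recorder = createRecorder();
let live = recorder.apply(cachedVersion, like);
live = recorder.apply(live, addComment('nice photo'));

const restoredVersion = recorder.replay(cachedVersion);
console.log(restoredVersion);
```

Replaying on top of the cached copy brings the restored page to the same state the user last saw, without a round trip to the server.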

Page 67: Making Facebook Faster

Cache consistency 3: Cross-page writes

Cached version

Page 68: Making Facebook Faster

Cache consistency 3: Cross-page writes

Cached version State-changing op

Page 69: Making Facebook Faster

Cache consistency 3: Cross-page writes

Cached version Restored version State-changing op

Page 70: Making Facebook Faster

Cache consistency 3: Cross-page writes

Server side invalidation

▪ Instrument the server-side database access API; whenever a write operation is detected, send a signal to the client to invalidate the cache.

Cached version Restored version State-changing op
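The sketch below simulates both sides of that signal in one process: a database wrapper that notes writes during a request, and a client handler that clears its cache when the response carries the flag. All names and the `invalidateCache` field are assumptions, and clearing the whole cache is the simplest possible policy, not necessarily the one Facebook uses.

```javascript
// Server side: a database wrapper that remembers whether a write happened.
function createInstrumentedDb() {
  const rows = new Map();
  let wroteDuringRequest = false;
  return {
    read(key) {
      return rows.get(key);
    },
    write(key, value) {
      rows.set(key, value);
      wroteDuringRequest = true;
    },
    // Build the response envelope and reset the flag for the next request.
    finishRequest(payload) {
      const response = { payload, invalidateCache: wroteDuringRequest };
      wroteDuringRequest = false;
      return response;
    },
  };
}

// Client side: drop cached pages when the server signals a write.
function handleResponse(clientCache, response) {
  if (response.invalidateCache) clientCache.clear();
  return response.payload;
}

const db = createInstrumentedDb();
const clientCache = new Map([['/home.php', '<cached home>']]);

handleResponse(clientCache, db.finishRequest('profile page')); // read-only request
const sizeAfterRead = clientCache.size;

db.write('status', 'hello');
handleResponse(clientCache, db.finishRequest('status posted')); // request with a write
const sizeAfterWrite = clientCache.size;

console.log(sizeAfterRead, sizeAfterWrite);
```

Instrumenting the data access layer means feature code does not have to remember to invalidate anything, which fits the deck's theme of keeping optimizations robust under fast-changing code.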

Page 71: Making Facebook Faster

Current status

▪ Deployed on production

▪ Only cache in memory

▪ Only turned on for home page

Page 72: Making Facebook Faster

~20% savings on page hits to home page

Page 73: Making Facebook Faster

Performance improvement

3X ~ 4X speedup in render time vs Quickling

Page 74: Making Facebook Faster

Summary

Page 75: Making Facebook Faster

Summary

▪ Performance monitoring: What, Who, and Why (“WWW”)
▪ Static resource management: Adaptive to fast evolution
▪ Ajaxify the website.
▪ Client side caching of user visited pages

Measurement: we need to answer three questions: what is the speed, who made it faster or slower, and why it is faster or slower.

Static resource management: it needs to adapt to the fast evolution of code changes and user adoption.

Ajaxifying a website whose pages share a lot of common work within a user session removes the redundant work and improves user-perceived performance.

Caching a user's visited pages on the client side can reduce the server's overall load and improve user-perceived performance.

Page 76: Making Facebook Faster

Thank you!
