Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Flywheel: Google's Data Compression Proxy for the Mobile WebVictor Agababov, Michael Buettner, Victor Chudnovsky,
Mark Cogan, Ben Greenstein, Shane McDaniel, Michael Piatek, Colin Scott*, Matt Welsh, Bolian Yin
Currently a graduate student at UC Berkeley*
Dominant Access Tech: Mobile
• Mobile devices are increasingly dominant
• Growth is greatest in emerging markets
2
Mobile Data is Expensive
3
Source: blog.jana.com
Emerging markets have highest costs!
Flywheel• Flywheel: proxy service that optimizes HTTP
response size
• Three years of deployment experience, part of Chrome for Android, iOS, Desktop
• Currently serving millions of users & billions of requests per day
4
What Flywheel Does
5
Total bytes: 9764 Total bytes: 5565
Transcode to WebP
Pick quality level
Minify CSS, JS
GZip text objects
6
7
None of our optimizations are novel
What lessons did we learn from building and
operating Flywheel?
Key lessons: • Highly challenging to maintain good performance • Tussles are pervasive, ongoing, & time-consuming
Outline
• Is a proxy really needed?
• What’s hard about engineering Flywheel?
• Does Flywheel meet our goals?
• What can be learned?
9
The web isn’t well optimized for mobile
• Case in point:
• 42% of HTML response bytes not compressed
10
Hard to keep up with best practices
• New optimizations: WebP, SDCH, HTTP/2,... rolled out as often as every 6 weeks
• Heterogeneity of mobile devices increasing
• Need: an optimizing compiler for the mobile web 11
Need: Optimizing service for the mobile web
Outline
• Is a proxy really needed?
• What’s hard about engineering Flywheel?
• Does Flywheel meet our goals?
• What can be learned?
12
Design Constraints
• Opt-in deployment model
• Transparent to users
• HTTP only (No HTTPS)
13
Challenge: trading off latency vs compression
14
Indirection through Flywheel often increases RTT
RTT’
RTT>
Network latency is dominant performance factor (Page size is not a dominant factor!)
HTTPOrigin
Google datacenter
Cache
Optimization services
Optimization servicesOptimization
services
Fetch bots
Proxy
Fetch router
Flywheel Design
15
GET /
Fetch router maintains connection affinity
HTTPOrigin
Google datacenter
Cache
Optimization services
Optimization servicesOptimization
services
Fetch bots
Proxy
Fetch router
Flywheel Design
16
200
Optimizations: image transcoding, GZip, minification …
200
Separate optimization services for isolation, provisioning
HTTPOrigin
Selective Proxying
17
GET /
Fetch objects on critical path from origin
HTTPOrigin
Indirection through Flywheel often increases RTT
Selective Proxying
18
Fetch objects on critical path from origin
GET /i.jpg
Proxy objects that yield high data reduction
HTTPOrigin
Challenge: Accommodating Tussles
19
Mechanism: HTTP fallback (blockable canary requests)
Need to be policy-neutral
CANARY
Outline
• Is a proxy really needed?
• What’s hard about engineering Flywheel?
• Does Flywheel meet our goals?
• What can be learned?
20
Evaluation• This talk:
• Primary goal: reduce web page size
• Secondary goal: maintain good performance
• Paper:
• Fault tolerance
21
Workload: Geographic Adoption
22
Adoption highest in developing markets
Country Adoption
Worldwide 10.5%
Brazil 17%
Russia 16.5%
Indonesia 16.3%
Mexico 15.5%
USA 9.5%
Workload: Page Footprints
23
Workload: Page Footprints
24
Majority of pages are small
Workload: Page Footprints
25
97% of bytes come from top 5% largest pagesMajority of
pages are small
Type % of Bytes Savings Share of Benefit
Total 100% 58% -
Images 74.12% 66.40% 85%
HTML 9.64% 38.43% 6%
JavaScript 9.10% 41.09% 6%
CSS 1.81% 52.10% 2%
Other 5.33% 9.23% 1%
Data Reduction
26
Savings = 1 - (outgoing bytes / incoming bytes) * 100
Type % of Bytes Savings Share of Benefit
Total 100% 58% -
Images 74.12% 66.40% 85%
HTML 9.64% 38.43% 6%
JavaScript 9.10% 41.09% 6%
CSS 1.81% 52.10% 2%
Other 5.33% 9.23% 1%
Data Reduction
27
Savings = 1 - (outgoing bytes / incoming bytes) * 100
Type % of Bytes Savings Share of Benefit
Total 100% 58% -
Images 74.12% 66.40% 85%
HTML 9.64% 38.43% 6%
JavaScript 9.10% 41.09% 6%
CSS 1.81% 52.10% 2%
Other 5.33% 9.23% 1%
Data Reduction
28Images are bulk of bytes & savings
Savings = 1 - (outgoing bytes / incoming bytes) * 100
Reduction Across Users
29
Reduction Across Users
30
Median data reduction: 50%
Reduction Across Users
31
Reduction Across Users
32
Overall data reduction: 27%
Performance: Methodology
• Goal: don’t degrade performance
Compare:
• Random sampling of Flywheel users
• Holdback experimental group
33
Simple Model of Page Load TimeLoad time of
subresource si = propagation delay + transmission delay + computation time
Page load time = Σ si on critical path
Critical path = longest chain of
dependent subresources
Time
Page Load Time
35
Seco
nds
0
4
8
12
16
20
24
28
32
36
40
Quantile (page loads)
Median 70th 80th 90th 95th 99th
36.13
13.89
8.95
5.373.78
2.21
39.38
14.61
9.22
5.383.68
2.08
Holdback Flywheel
Flywheel improves performance only
in the tail
Why is this?• Recall our workload:
• Long tail of large pages
• Most pages are small
36
Propagation delay dominant factor
Transmission delay dominant factor
• Good way to understand propagation delay: time to first byte (TTFB)
Seco
nds
0
0.6
1.2
1.8
2.4
3
3.6
4.2
4.8
5.4
6
Quantile (requests)
Median 70th 80th 90th 95th 99th
5.81
1.90
1.16
0.690.490.30
5.06
1.69
1.00
0.550.360.19
Holdback Flywheel
0.19
Time to First Byte
37
Most users’ direct path to the origin is shorter than the indirect path
Outline
• Is a proxy really needed?
• What’s hard about engineering Flywheel?
• Does Flywheel meet our goals?
• What can be learned?
38
Lessons Learned
Many performance optimizations we expected to have impact
did not at Web scale
39
Disappointing Optimizations• Preconnect: open TCP connection with origin early
40
bar.com
foo.com..<head><script src=“bar.com/js”>...
• Increases reused connections from 73% to 80%
• Yields less than 2% decrease in median PLT
foo.com
bar.com
Already a strong tendency for connection affinity
Disappointing Optimizations• Prefetch: request cacheable subresources early
41
foo.com..<head><script src=“bar.com/js”>...
• Increases cache hit ratio from 22% to 32%
• Yields less than 2% decrease in median PLT
GET /js
/js
foo.com
bar.com
Cacheable items often aren’t on the critical path
Lessons Learned
Improving PLT is highly challenging
• Compression doesn’t help (much)
• Difficult to target critical path
42
Lessons Learned
43
If you want widespread adoption, you must
accommodate policy issues!
Lessons Learned
44
Many more measurement findings and lessons in the
paper!
Conclusion• Flywheel shows it’s possible to provide 58%
average HTTP data reduction at web scale
• Data reduction is the easy part
• Maintaining performance and accommodating tussles are the hard part
45
[email protected] [email protected]/?q=Chrome+Data+Saver