20
HotPar HotPar 10 10 Towards Parallelizing the Layout Engine of Firefox* Carmen Badea, UC Irvine Mohammad R. Haghighat, Intel Corporation Alex Nicolau, UC Irvine Alex Veidenbaum, UC Irvine June 14 June 14 th th , 2010 , 2010 *This work was supported in part by the National Science Foundation under grant NSF CCF-0811882

Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

Embed Size (px)

Citation preview

Page 1: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotParHotPar’’1010

Towards Parallelizing the Layout Engine of Firefox*

Carmen Badea, UC Irvine

Mohammad R. Haghighat, Intel Corporation

Alex Nicolau, UC Irvine

Alex Veidenbaum, UC Irvine

June 14June 14thth, 2010, 2010

*This work was supported in part by the National Science Foundation under grant NSF CCF-0811882

Page 2: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 2

OutlineOutline

Motivation

CSS Rule Matching Implementation in Firefox

Profiling/Tracing - Strategy and Results

Parallel CSS Rule Matching

Performance Results

Summary

Page 3: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 3

Parallelizing the Firefox Layout Engine Parallelizing the Firefox Layout Engine -- MotivationMotivation

Mozilla Firefox*

� ~24% market share of all the total

web browsers

� Improved performance since the recent adoption of the tracing JIT JavaScript* engine (TraceMonkey*)

� Layout Engine• heavy performance hitter component

• 40% of the total execution time• VTune* experiments on Intel platforms

Source: Net Applications (Browser Market Share)

http://marketshare.hitslink.com/

* Atom, Core, and VTune are trademarks or registered trademark of Intel Corporation or its subsidiaries in the United States or other countries. JavaScript is a trademark of Sun Microsystems, registered in the United States and other countries. Firefox is the trademark of Mozilla. *Other names and brands may be claimed as the property of their respective owners.

Page 4: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 4

Parallelizing the Firefox Layout Engine Parallelizing the Firefox Layout Engine -- MotivationMotivation

Firefox Layout Engine

� Cascading Style Sheets (CSS) rule matching

• hot part of layout

• amenable to parallelization

Layout ~42% execution

CSS rule matching ~32% of the layout

Page 5: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 5

Cascading Style Sheets: A bit of backgroundCascading Style Sheets: A bit of background

� CSS: designed to allow the

separation between

webpage content (HTML)

and presentation (e.g.

colors, fonts, etc.)

� Enables multiple pages to

share formatting features

and reduces the amount of

repetition in specifying

layout styles

Simple selector

Combination of simple selectors

CSS rule structure

*

* Image source: http://css.maxdesign.com.au/

Page 6: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 6

OutlineOutline

Motivation

CSS Rule Matching Implementation in Firefox

Profiling/Tracing - Strategy and Results

Parallel CSS Rule Matching

Performance Results

Summary

Page 7: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 7

CSS Rule matching implementation in FirefoxCSS Rule matching implementation in Firefox

CSS Rule matching process in Firefox

� Does a style rule applies to a certain HTML element ?

� Series of iterations

• each iteration executes a SelectorMatches call

• One for each ancestor up the tree

CSS style rule examples

HTML sample document tree

* Samples courtesy of http://css.maxdesign.com.au/

*

Page 8: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 8

OutlineOutline

Motivation

CSS Rule Matching Implementation in Firefox

Profiling/Tracing - Strategy and Results

Parallel CSS Rule Matching

Performance Results

Summary

Page 9: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 9

Profiling/Tracing StrategyProfiling/Tracing Strategy

Firefox CSS Rule matching process

� Algorithm split into two main branches

• Sibling selector

• Descendant selector

� Four exit points

• Implemented an extensive tracing algorithm

• Execution patterns of the CSS rule matching process

Which is most executed?

Which is the most often taken?

VProfLibrary

Page 10: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 10

Profiling/Tracing ResultsProfiling/Tracing Results

Benchmarks

� Mozilla Firefox Page Load Tests

• A collection of 390 web pages

from all over the world

• Test the page loading process in

Firefox

� Zimbra* Collaboration Suite

• AJAX-based rich-browser

integrated suite of email,

calendar, contacts, VoIP, etc.

Page 11: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 11

Profiling/Tracing Results for Mozilla Page Load TestsProfiling/Tracing Results for Mozilla Page Load Tests

Firefox Page Load Tests - % Total iterations

covered by the most frequent execution patterns

0

20

40

60

80

100

120

1 101 201 301 401 501 601 701 801 901 1001 1101 1201 1301 1401 1501

� Selector matching process heavily

biased towards descendant selectors

• 99% of the time for both benchmarks

� One exit point taken in the majority of

cases

• Dominant pattern: rule non-match

• Zimbra Collaboration Suite : ~ 89%

• Mozilla Page Load Tests : ~ 66%

� Mozilla Page Load Tests

• top 25 most frequent execution patterns

cover ~ 80% of all executed iterations

Page 12: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 12

OutlineOutline

Motivation

CSS Rule Matching Implementation in Firefox

Profiling/Tracing - Strategy and Results

Parallel CSS Rule Matching

Performance Results

Summary

Page 13: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 13

CSS Rule Matching CSS Rule Matching –– Parallel ImplementationParallel Implementation

Thread1

Thread2

ul em {color:blue}

SelectorMatches1

SelectorMatches2

SelectorMatches3

SelectorMatches4

Sequential

SelectorMatches1

SelectorMatches2

SelectorMatches3

SelectorMatches4

Parallel

Original Parallelized

Page 14: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 14

OutlineOutline

Motivation

CSS Rule Matching Implementation in Firefox

Profiling/Tracing - Strategy and Results

Parallel CSS Rule Matching

Performance Results

Summary

Page 15: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 15

Parallelized CSS Rule Match in Firefox Parallelized CSS Rule Match in Firefox -- Performance Results Performance Results

� Flexible and configurable

� Parallel configuration parameters

• Default iteration chunk size

• # of worker threads available

• Threshold on the # of ancestors necessary to enable parallel CSS

matching

�We have tested

• 7 configurations for the Zimbra Collaboration Suite

• 12 configurations for the Firefox Page Load Tests

• Kept first two parameters fixed and varied the last one

• 2 threads w/ default 4-iterations chunk

Page 16: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 16

Parallelized CSS Rule Matching: Parallelized CSS Rule Matching: Mozilla Tests Mozilla Tests

Performance ResultsPerformance Resultsx-Speedup of Firefox Page Load Tests - NHM

(max speedup = 1.84x)

0.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

� Mozilla Firefox Page Load Tests

• Comprise ~ 400 distinct web page load tests

� Experiments carried out using 2

worker threads

� Mozilla Page Load Tests performance

results on Nehalem

• ~ 87% benchmarks show speedups

• ~ 1.8x maximum speedup observed

� Mozilla Page Load Tests performance

results on Atom

• ~ 60% benchmarks show speedups

• ~ 1.6x maximum speedup observed

• Hyper-threaded

x-Speedups of Firefox Page Load Tests - ATOM

(max speedup = 1.6x)

0.00

0.20

0.40

0.60

0.80

1.00

1.20

1.40

1.60

1.80

1 15 29 43 57 71 85 99 113 127 141 155 169 183 197 211 225 239 253 267 281 295 309 323 337 351 365 379

Page 17: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 17

Parallelized CSS Rule Matching: Parallelized CSS Rule Matching: Zimbra Suite Zimbra Suite

Performance ResultsPerformance Results

x-Speedup of Zimbra Collaboration Suite - NHM

(max. speedup = 1.61x)

0.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

Test_

1

Test_

4

Test_

7

Test_

10

Test_

13

Test_

16

Test_

19

Test_

22

Test_

25

Test_

28

Test_

31

Test_

34

Test_

37

Test_

40

Test_

43

Zimbra performance results on Nehalem

• ~ 50% benchmarks show speedups

• 14% of the benchmarks are not affected performance-wise

• ~1.6x maximum speedup observed

The experiments were carried out on a Nehalem machine, with 2 worker threads

Page 18: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 18

SummarySummary

� Implemented a fully-functional, parallelized CSS rule matching

component in Firefox

• Mozilla Page Load Tests

• ~ 1.8x maximum speedup on NHM

• ~ 1.6x maximum speedup on Atom

• Zimbra Collaboration Suite: ~ 1.6x maximum speedup on NHM

� Submitted the parallel CSS rule matching code to Mozilla

• Excellent feedback from Firefox layout engine component owners

• Incorporated their suggestions into the existing parallelized code

• Awaiting adoption in a future release

Page 19: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 19

ReferencesReferences

� Mozilla Project [http://www.mozilla.com/en-US/]

� Zimbra Collaboration Suite [http://www.zimbra.com/]

� Cascading Style Sheets (CSS) [http://www.w3.org/Style/CSS/]

� Intel VTune Performance Analyzer [http://software.intel.com/en-us/intel-vtune/]

� VProf Value Profiling Library [Mozilla Source Code]

� Leo Meyerovich and Ras Bodik, Fast and Parallel Webpage Layout, WWW 2010: 19th International World Wide Web Conference (Raleigh, NC, USA), 2010.

� Chris Grier, Shuo Tang, and Samuel T. King, Secure Web Browsing with the OP Web Browser, SP ’08: Proceedings of the 2008 IEEE Symposium on Security and Privacy (Washington, DC, USA), IEEE Computer Society, 2008, pp. 402–416.

� Jungwoo Ha, Mohammad R. Haghighat, Shengnan Cong, and Kathryn S. McKinley, A Concurrent Trace-based Just-In-Time Compiler for Single-threaded JavaScript, 2009 Workshop on Parallel Execution of Sequential Programs on Multicore Architectures (PESPMA 2009), in conjunction with ISCA.

Page 20: Towards Parallelizing the Layout Engine of Firefox - · PDF fileTowards Parallelizing the Layout Engine of Firefox* Carmen Badea, ... Enables multiple pages to ... • ~ 50% benchmarks

HotPar’10 20

Thank you!Thank you!