The Next Leap in JavaScript Performance Mohammad Reza Haghighat Senior Principal Engineer, Intel Corporation October 20, 2014

The Next Leap in JavaScript Performance - HTML5 DevConf Haghighat... · The Next Leap in JavaScript Performance Mohammad Reza Haghighat Senior Principal Engineer, Intel Corporation


The Next Leap in JavaScript Performance Mohammad Reza Haghighat

Senior Principal Engineer, Intel Corporation

October 20, 2014

Agenda

• HTML5 - The New Lingua Franca?

• Exposing the full power of modern hardware to JavaScript*

• Bringing Perceptual Computing to the web platform

• Supporting JavaScript programming in Internet of Things (IoT)

• Summary

HTML5 – The New Lingua Franca?

The native-code/web spiral:

• 1991: APPS (.exe), native code on the PC

• 2001: WEB (HTML, Flash*), “Write once, run on any browser”

• 2009: APPS (iOS*, Android*, Windows*), app stores as walled gardens

• 2015: WEB (HTML5), “Write Once, Run Everywhere”

“New open standards created in the mobile era, such as HTML5, will win on mobile devices.” – Steve Jobs

“If you want to do something that is universal, no question, world is going HTML5.” – Steve Ballmer

“It looks to me like HTML5 will eventually become a way almost all applications are built, including those on new phones.” – Eric Schmidt

Web: The Ubiquitous Software Platform and the Application Model of the Future

• Big Data & Content

• Rich Capabilities

• Social

• Contextual

• Crowdsourced

• Sensors

• “Things”


Astounding JavaScript* Performance With asm.js

asm.js: a highly optimizable low-level subset of JavaScript, defined by Mozilla

• Targeting asm.js† achieves ~1.5x native running time

• Epic* Games Unreal Engine* 3: over 1M lines of C/C++ code compiled to JavaScript* by Mozilla* and Epic (http://www.unrealengine.com/html5/)

Pipeline (figure): C/C++, LLVM Bitcode, Emscripten, asm.js JavaScript*, web; very efficient code generated by Firefox* JIT

† Courtesy of Mozilla's Alon Zakai & Luke Wagner (http://people.mozilla.org/~lwagner/gdc-pres/gdc-2014.html#/)
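To make the "optimizable subset" idea concrete, here is a minimal hand-written sketch in the asm.js style. It is not Emscripten output, and the names `AddModule` and `add` are illustrative only. The `| 0` coercions act as type annotations telling the engine that every value is a 32-bit integer; in an engine without asm.js support the same code simply runs as ordinary JavaScript.

```javascript
// Illustrative asm.js-style module (hand-written; names are hypothetical).
function AddModule(stdlib, foreign, heap) {
  "use asm";

  function add(a, b) {
    a = a | 0;          // parameter annotation: a is an int
    b = b | 0;          // parameter annotation: b is an int
    return (a + b) | 0; // return annotation: result is a signed int
  }

  return { add: add };
}

// The three arguments are the standard library, foreign imports, and heap.
const addModule = AddModule(globalThis, {}, new ArrayBuffer(0x10000));

console.log(addModule.add(2, 3));          // 5
console.log(addModule.add(0x7fffffff, 1)); // -2147483648 (wraps like a C int)
```

Because every operation has a statically known type, a JIT can compile such code ahead of time without type guards, which is where the near-native running time comes from.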

The March of Parallelism

Modern processors utilize parallelism to deliver high performance within a constrained power budget

Intel® Advanced Vector Extensions (figure; timeline labels 2010-2013: 32 nm Tock, 22 nm Tick, 22 nm Tock):

• 1X = 128-bit, since 2001

• AVX: 256-bit floating point

• AVX2: FMA and integer support

• AVX-512 (Next Gen Intel® Xeon Phi™): 512-bit vectors

Three 2X steps: 8X peak SIMD operations per core over 4 generations
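One way to read the 8X figure is sketched below. It assumes 32-bit float elements and counts a fused multiply-add (FMA) as two operations per lane; both are assumptions for illustration, not statements taken from the slide.

```javascript
// Lanes per register = register width in bits / element width in bits.
function lanes(registerBits, elementBits) {
  return registerBits / elementBits;
}

const FLOAT32_BITS = 32;
const laneCounts = [128, 256, 512].map((bits) => lanes(bits, FLOAT32_BITS));
console.log(laneCounts); // [ 4, 8, 16 ]

// Width grows 4x from 128-bit to 512-bit; FMA doubles the operations per lane.
const widthGain = laneCounts[2] / laneCounts[0]; // 4
const fmaGain = 2;                               // assumption: FMA = multiply + add
console.log(widthGain * fmaGain);                // 8
```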

Optimizing Web Runtimes for Parallelism

Web runtimes need to be parallel end-to-end

Stages (figure): Parse + build DOM, JavaScript*, Layout Engine, Render. CPU: mainly single-threaded. GPU: parallel.

Breakdown by component (pie chart):

• Render: 35%

• Layout: 33%

• Other: 21%

• JS: 11%

• HTML5 runtimes of today do not scale with the number of cores

• Parallelism is needed for both responsiveness and energy efficiency

Parallel Parsing and Compilation

Background JIT compilers now in Chrome*, Firefox*, Internet Explorer*, Safari*

PESPMA 2009

Epic* Citadel* profile on Firefox* (figure): four threads for JavaScript* parsing and compilation, one thread for JS and GFX execution; timeline labels: bootstrap, launch

Cycle Breakdown by Categories (chart):

• Categories, in legend order: js::compile, gfx::compile, os::others, js::parse, js::others, browser::others, os::mem, js::jitted, gfx::exec

• Values, as extracted: 43.6, 16.6, 12.8, 6.7, 6.4, 6.2, 4.6, 2.2, 0.9

Layout Engine: a performance bottleneck

Browser layout engine is a bottleneck but amenable to parallelism

• Workloads: Mozilla* Firefox* Page-Load Tests, Zimbra* Collaboration Suite*

• Layout Engine: ~42% of execution

• CSS rule matching: ~33% of the layout, e.g. ul em {color:blue}

Source: Towards Parallelizing the Browser Layout Engine, HotPar 2010
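As a rough illustration of what CSS rule matching involves, the sketch below matches a descendant selector such as `ul em` against a toy tree. The node shape (`tag`, `parent`) and the function names are hypothetical, and real engines handle far more selector types; this only covers tag names joined by the descendant combinator.

```javascript
// Toy DOM node: a tag name plus a parent pointer (null at the root).
function node(tag, parent = null) {
  return { tag, parent };
}

// Match a descendant selector like "ul em" against one element.
// Matching runs right to left: the element must match the last part,
// then each earlier part must match some ancestor further up the tree.
function matchesDescendantSelector(element, selector) {
  const parts = selector.trim().split(/\s+/);
  if (element.tag !== parts[parts.length - 1]) return false;

  let ancestor = element.parent;
  for (let i = parts.length - 2; i >= 0; i--) {
    // Walk up until an ancestor with the required tag is found.
    while (ancestor !== null && ancestor.tag !== parts[i]) {
      ancestor = ancestor.parent;
    }
    if (ancestor === null) return false;
    ancestor = ancestor.parent;
  }
  return true;
}

const ul = node("ul");
const li = node("li", ul);
const emInList = node("em", li);
const emInParagraph = node("em", node("p"));

console.log(matchesDescendantSelector(emInList, "ul em"));      // true
console.log(matchesDescendantSelector(emInParagraph, "ul em")); // false
```

Each element is matched independently of the others and only reads the tree, which is one reason this stage lends itself to parallel execution.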

Parallel JavaScript*

The goal of Parallel JavaScript is to enable data parallelism in web applications

• Started at Intel Labs, now with Mozilla*

• Extends JavaScript* with a data-parallel API

• Designed for multi-core CPUs and GPUs

• Simple, portable, and secure

Array increment example:

Sequential: A.map(function(a) {return a+1;});

Parallel: A.mapPar(function(a) {return a+1;});

Accelerated animation of 3D avatars: more characters and more realism
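The sketch below illustrates why a data-parallel map can give the same answer as the sequential one: when the element function is pure, the array can be split into chunks that are processed independently and then stitched back together. `mapChunked` is a hypothetical helper written for this illustration; it runs the chunks one after another and is not the `mapPar` API itself.

```javascript
// Hypothetical stand-in for a data-parallel map: split, map each chunk, rejoin.
// Chunks are processed sequentially here; a parallel runtime could hand each
// chunk to a different core because no chunk depends on another.
function mapChunked(array, fn, chunkCount) {
  const chunkSize = Math.ceil(array.length / chunkCount);
  const pieces = [];
  for (let start = 0; start < array.length; start += chunkSize) {
    const chunk = array.slice(start, start + chunkSize);
    pieces.push(chunk.map((value) => fn(value)));
  }
  return [].concat(...pieces);
}

const A = [1, 2, 3, 4, 5];
const increment = function (a) { return a + 1; };

console.log(A.map(increment));              // [ 2, 3, 4, 5, 6 ]
console.log(mapChunked(A, increment, 2));   // [ 2, 3, 4, 5, 6 ]
```

If the callback mutated shared state, the two results could differ, which is why a data-parallel API needs the element function to be free of side effects.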

SIMD – Single Instruction, Multiple Data

SIMD operations deliver great performance & power efficiency

Scalar Operation (figure): four separate additions

Ax + Bx = Cx

Ay + By = Cy

Az + Bz = Cz

Aw + Bw = Cw

SIMD Operation of Vector Length 4 (figure): one addition across all four lanes

(Ax, Ay, Az, Aw) + (Bx, By, Bz, Bw) = (Cx, Cy, Cz, Cw)

Intel® Architecture currently has SIMD operations of vector length 4, 8, 16

SIMD - A Gap Between JavaScript* and Native

SIMD in JavaScript further reduces the performance gap

A Google*/Intel/Mozilla* ECMA TC39 Joint Project

• Bugzilla*: https://bugzilla.mozilla.org/show_bug.cgi?id=894105

• John McCutchan’s strawman proposal: http://wiki.ecmascript.org/doku.php?id=strawman:simd_number

Figure panels: C++ code for list average; “Proposed” JavaScript* code; SIMD code by ICC

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
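The code shown on the slide did not survive extraction, so the following is only a sketch of the list-average idea in plain JavaScript. It emulates a four-lane accumulation with a `Float32Array` of partial sums rather than using any SIMD API; the function names are illustrative.

```javascript
// Scalar version: one addition per element.
function averageScalar(values) {
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum / values.length;
}

// Four-lane version: each loop iteration updates four partial sums,
// which is the access pattern a single 4-wide SIMD add would cover.
function averageFourLanes(values) {
  const partial = new Float32Array(4);
  const vectorEnd = values.length - (values.length % 4);
  for (let i = 0; i < vectorEnd; i += 4) {
    partial[0] += values[i];
    partial[1] += values[i + 1];
    partial[2] += values[i + 2];
    partial[3] += values[i + 3];
  }
  // Horizontal sum of the lanes, then the leftover tail elements.
  let sum = partial[0] + partial[1] + partial[2] + partial[3];
  for (let i = vectorEnd; i < values.length; i++) {
    sum += values[i];
  }
  return sum / values.length;
}

const list = Float32Array.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
console.log(averageScalar(list));    // 5.5
console.log(averageFourLanes(list)); // 5.5
```

With real SIMD hardware the four lane updates collapse into one instruction; for arbitrary inputs the two versions may differ in the last bits because floating-point addition is not associative.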

SIMD.JS – The API

Our Firefox* Prototype

Our SIMD prototype delivers 3x~4x Mandelbrot speedup†

† Initial support for float32x4 and int32x4

Demo: Combining SIMD and Higher-Level Parallelism

Our Chromium* Prototype

SIMD speedup is nicely multiplied by WebWorkers† (WW: Number of WebWorkers)

† Source: Intel® Peter Jensen: https://github.com/PeterJensen/mandelbrot

SIMD Speedups on our Chromium* Prototype

Excellent early results while still focused on functionality

Figure (bar chart): SIMD x-times faster than non-SIMD; plotted speedups range from 2.7 to 11.8. Theoretical speedup limit is 4.

• Benchmarks: Transpose4x4, AOBench, Mandelbrot, MatrixMultiplication, VertexTransform, ShiftRows, Matrix4x4Inverse, and their Average

• Platform 1: 3rd Generation Intel® Core™ i7 processor (3667U) @ 2.00 GHz, 32-bit, Ubuntu* 13

• Platform 2: 3rd Generation Intel® Core™ i7 processor (3667U) @ 2.00 GHz, 64-bit, Ubuntu* 13

• Platform 3: Intel® Atom™ processor Z3770 @ 1.46GHz, Android* 4.4

SIMD.JS benchmarks: https://github.com/johnmccutchan/ecmascript_simd/tree/master/src/benchmarks

SIMD.JS Proposal and Polyfill API

• SIMD Number (Google’s John McCutchan & Intel’s Peter Jensen): http://wiki.ecmascript.org/doku.php?id=strawman:simd_number

• Polyfill API: https://github.com/johnmccutchan/ecmascript_simd

• Types: float32x4, int32x4, Float32x4Array, Int32x4Array

• Constructors: float32x4(x,y,z,w), float32x4.zero(), float32x4.splat(s)

• Operations: abs, neg, add, sub, mul, div, clamp, min, max, reciprocal, reciprocalSqrt, scale, sqrt, shuffle, shuffleMix, withX, withY, withZ, withW, lessThan, lessThanOrEqual, equal, notEqual, greaterThanOrEqual, greaterThan, bitsToInt32x4, toInt32x4, …

The joint Google*/Intel/Mozilla* SIMD.JS proposal was approved to advance to the next stage of ECMAScript* TC39 standardization†

† A copy of the TC39 Presentation: http://esdiscuss.org/notes/2014-07/simd-128-tc39.pdf
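To give a feel for how the constructors and lane-wise operations listed above fit together, here is a small emulation built on `Float32Array`. It is a hypothetical stand-in written for this illustration (`f32x4` and its helpers are not the polyfill or the proposed API, and their exact signatures are assumptions); it only mirrors the names `zero`, `splat`, `add`, and `mul` from the list.

```javascript
// Hypothetical 4-lane float vector backed by a Float32Array (emulation only).
function f32x4(x, y, z, w) {
  return Float32Array.of(x, y, z, w);
}

// Constructors mirroring the names on the slide.
f32x4.splat = (s) => f32x4(s, s, s, s);
f32x4.zero = () => f32x4.splat(0);

// Lane-wise operations: each returns a new 4-lane vector.
f32x4.add = (a, b) => a.map((value, lane) => value + b[lane]);
f32x4.mul = (a, b) => a.map((value, lane) => value * b[lane]);

const a = f32x4(1, 2, 3, 4);
const c = f32x4.add(a, f32x4.splat(10));   // C = A + B on all four lanes
const half = f32x4.mul(c, f32x4.splat(0.5));

console.log(Array.from(c));    // [ 11, 12, 13, 14 ]
console.log(Array.from(half)); // [ 5.5, 6, 6.5, 7 ]
```

In a native implementation each `add` or `mul` would map to a single 128-bit instruction instead of a four-iteration loop.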

Emscripten now targets SIMD.JS

Emscripten generates SIMD.JS from C++ SIMD intrinsics & auto-vectorized code (C/C++ to JavaScript*)

Near-native SIMD.JS speedup

Speedup over Scalar JS (bar chart):

• Scalar JS: 1.00

• Scalar C++: 2.03

• SIMD JS: 7.18

• SIMD C++: 8.13

Crosswalk in Brief

Application Runtime

Follow us at @xwalk_project

crosswalk-project.org

Open Source, using Blink* & Chromium*

Today on Android* and Tizen*

Easy addition of extensible APIs

Easy access to device APIs

Intel® platform capabilities

Latest HTML5 features in packaged web apps

Focuses on security, performance and standards compliance

Based on web technologies: HTML5, CSS3, JavaScript*

Updated & released to the latest Chromium every 6 weeks


Intel® XDK – Cross-platform Development Kit

Develop, debug, profile, and build responsive web & hybrid apps

Free at http://xdk.intel.com

Remote debugging & profiling



Toward Perceptual Computing†

Devices sense & perceive user actions in a natural & intuitive way

• Speech Recognition

• Close-Range Tracking

• Gesture Recognition

• 2D/3D Object Tracking

• Facial Analysis

† Source: Intel® Perceptual Computing SDK: www.intel.com/software/perceptual

Reinventing Everyday Usages

Perceptual Computing opens up new dimensions in interacting with machines

• Learning & Education

• 3D Scanning and Sharing: scan it, share it, customize & print it

• Immersive Collaboration

• Gaming

• Out-of-reach Device Input

Enabling 3D Camera on Web Platform

3D Camera

• Beyond color: additional per-pixel distance

• Intel® RealSense™ on PC & tablets soon

Applications

• Real-time hand/finger/object tracking

• 3D scanning

• Video conferencing

Depth on Web Platform†

• Media Capture Depth Stream Extension

• Rendering & post-processing: <video>, <canvas>, WebGL* and SIMD.JS

• Streaming: transmit as MediaStream via WebRTC RTCPeerConnection

† Source: Intel® Ningxin Hu: https://github.com/huningxin/depth_stream_examples
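As a sketch of the post-processing step mentioned above, the function below turns a frame of per-pixel distances into grayscale RGBA pixels that a `<canvas>` could display. The frame format is an assumption made for this example (one unsigned 16-bit distance per pixel, with 0 meaning "no reading"), and `depthToRgba` is a hypothetical helper, not part of the proposed extension.

```javascript
// Convert a depth frame into RGBA bytes: nearer pixels are brighter.
// Assumptions: `depth` is a Uint16Array with one distance per pixel,
// 0 means no reading, and distances beyond `maxDistance` are clamped.
function depthToRgba(depth, maxDistance) {
  const rgba = new Uint8ClampedArray(depth.length * 4);
  for (let i = 0; i < depth.length; i++) {
    const distance = Math.min(depth[i], maxDistance);
    const level = depth[i] === 0
      ? 0
      : Math.round(255 * (1 - distance / maxDistance));
    rgba[4 * i] = level;     // red
    rgba[4 * i + 1] = level; // green
    rgba[4 * i + 2] = level; // blue
    rgba[4 * i + 3] = 255;   // fully opaque
  }
  return rgba;
}

// Four sample pixels: no reading, near, at the limit, beyond the limit.
const frame = Uint16Array.of(0, 250, 1000, 2000);
const pixels = depthToRgba(frame, 1000);
console.log(Array.from(pixels.slice(4, 8))); // [ 191, 191, 191, 255 ]
```

In a browser, the returned array can be wrapped with `new ImageData(pixels, width, height)` and drawn with the 2D context's `putImageData`. The per-pixel loop is independent across pixels, which is the kind of work SIMD or WebGL can speed up.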

Proposed Media Capture Depth Stream Extension†

Architecture (figure): Web Application; getUserMedia (WebRTC) API; Browser or HTML5 runtime, which provides an RGB Stream and a Depth Stream

† Source: http://w3c.github.io/mediacapture-depth/

Wireless Display for the Web

Unlock exciting new user experiences in HTML5

• Gaming

• Presentation

• Media Sharing/Casting†

† Big Buck Bunny video: http://www.bigbuckbunny.org/

HTML5 Presentation API Proposal†

New standards-based feature for the cross-platform web

• Connects web content to screens around you

• Hides display connection technologies from the developer: Apple* AirPlay*, Microsoft* PlayTo*, Google* Chromecast*, Miracast*, Intel® WiDi

• Simple, high-level API, easy to use

http://webscreens.github.io/presentation-api/

† Source: Intel® Dominik Röttsches


Intel® XDK IoT Edition

Companion Apps

Streamlined Workflow: Design, Test, and Build Tools

• Quick start samples and templates

• Built-in editor and emulators

• UI Frameworks and Apache Cordova* APIs

• Test and debug tools

• Integration with Cloud Services APIs

Design and build cross-platform companion apps easily for Android*, iOS*, and Windows*

Intel® XDK IoT Edition

JavaScript* apps on IoT devices

Integrated Development Environment: Create, Debug, and Run Tools

• JavaScript allows easy on-board app development and deployment for many IoT devices

• Use JavaScript to define behavior of IoT device

• Deploy, run, debug on IoT device with JavaScript

• Integration with cloud, web services, and sensors through JavaScript APIs

Workflow (figure), between the Development System and the IoT Device: edit JavaScript app, send app to device, run app remotely, remote debug

Demo: Programming Internet of Things using Intel® XDK IoT Edition

RGB Lighting†

Internet of Things (IoT) Device (Intel® Galileo):

• PWM LED Controller on I2C bus

• RGB LED

• Node.js with Socket.io server

HTML App (Lenovo* K900):

• Socket.io connection to IoT device

• Change lighting color

• Cordova* App

Both made using: Intel® XDK IoT Edition

† Source: Intel® Dan Yocom: http://xdk-software.intel.com/iot_edition_demo_video
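The demo's source is not reproduced in the slides, so the following is only a sketch of one piece the device side would plausibly need: turning a color chosen in the HTML app into per-channel PWM duty cycles. The `"#rrggbb"` message format and the function name `colorToDutyCycles` are assumptions for illustration; the Socket.io transport and the I2C write to the LED controller are deliberately left out.

```javascript
// Convert a "#rrggbb" color string into PWM duty cycles between 0 and 1.
// Assumption: the app sends colors as 6-digit hex strings.
function colorToDutyCycles(hex) {
  const match = /^#([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (match === null) {
    throw new RangeError('expected a color like "#rrggbb", got ' + hex);
  }
  // Each channel is a byte (0..255); scale it to a duty cycle (0..1).
  const [r, g, b] = match.slice(1).map((pair) => parseInt(pair, 16) / 255);
  return { r, g, b };
}

const duty = colorToDutyCycles("#ff0080");
console.log(duty.r); // 1
console.log(duty.g); // 0
```

On the device, a message handler would call a function like this and then write the three duty cycles to the PWM controller; the controller's register layout and resolution depend on the hardware used.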


Summary

• HTML5 is closing the gaps with native models

• SIMD in JavaScript* enables a large new class of high-performance apps

• JavaScript is about to get a lot faster for such domains as gaming

• Depth Camera support in HTML5 WebRTC enables exciting use cases

• JavaScript is proliferating rapidly in Internet of Things

• Intel® XDK supports end-to-end programming for Internet of Things

• HTML5 is the application model of the future


Call to Action

Download Firefox* Nightly and experience† the benefits of SIMD.JS

Leverage the power of SIMD.JS through Intel® XDK and Crosswalk

Download Intel® XDK free at http://xdk.intel.com

† SIMD.JS demos: http://peterjensen.github.io/idf2014-simd

Intel® Developer Zone

• Free tools and code samples

• Technical articles, forums and tutorials

• Connect with Intel and industry experts

• Get development support

• Build relationships

Tools. Knowledge. Community.

software.intel.com

Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. 
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm Intel, Core, Atom, Xeon Phi, RealSense, Look Inside and the Intel logo are trademarks of Intel Corporation in the United States and other countries.

*Other names and brands may be claimed as the property of others. Copyright ©2014 Intel Corporation.
