Web Workers

Contents

1. Need for Web Workers

2. Introduction

Types of Web Workers

3. Web Workers API

4. Web-Worker support in browser

“localhost” bug

5. Working of Web Workers

Message Passing Model

Communicating with a Dedicated Web Worker

Communicating with a Shared Web Worker

Example of Dedicated Web Worker

Example of Shared Web Worker

Terminating a worker

6. Error handling and debugging

7. Advantages of using web workers

8. Disadvantages of using web workers

9. Conclusion

10. References


Problems with JavaScript Concurrency (Need for Web Workers)

JavaScript is a single-threaded environment, meaning that multiple scripts cannot run at the same time. As an example, imagine a site that needs to handle UI events, query and process large amounts of API data, and manipulate the DOM. Unfortunately, all of that cannot be done simultaneously because of limitations in the browsers' JavaScript runtime: script execution happens within a single thread. The downside is that a CPU-intensive piece of JavaScript can render the page unresponsive or slow it to a crawl. If the script takes long enough, the browser prompts the user to ask whether they want to stop the unresponsive script.

[Figure: Unresponsive Script dialog box]

Developers mimic concurrency by using techniques like setTimeout(), setInterval(), XMLHttpRequest and event handlers. Although all of these features run asynchronously, they do not run in parallel: their callbacks are processed only after the currently executing script has yielded.
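The point about yielding can be seen in a minimal sketch (the 50 ms busy loop is only a stand-in for a CPU-intensive task). A timer with a 0 ms delay is requested first, yet its callback cannot run until the busy script has finished:

```javascript
// Record the order in which things happen.
const order = [];

// Ask for a callback "immediately" (0 ms delay).
setTimeout(function () {
  order.push("timer callback");
}, 0);

// Stand-in for a CPU-intensive task: keep the single thread busy for ~50 ms.
order.push("busy loop start");
const start = Date.now();
while (Date.now() - start < 50) {
  // spin
}
order.push("busy loop end");

// The timer expired long ago, but its callback has still not run,
// because the current script has not yielded yet.
console.log(order.join(" -> ")); // busy loop start -> busy loop end
```

The callback is appended only after the script ends, which is why long-running code on the main thread freezes the page no matter how many timers are pending.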

Web Workers – Introduction

The Web Workers specification defines an API for spawning background scripts in our web application. Web Workers allow us to do things like fire up long-running scripts to handle computationally intensive tasks without blocking the UI or the scripts that handle user interactions (the window stays responsive to inputs like clicks and scrolling, even while processing).

Workers run in separate threads and communicate with the page through message passing, thus bringing true multi-threading to JavaScript.

Types of Web Workers

There are two types of Web Workers: Dedicated Workers and Shared Workers. The difference between the two:

Dedicated Workers
Dedicated Workers are linked to the script that created them (called the owner or creator).
A Dedicated Web Worker is targeted at applications requiring point-to-point communication.

Shared Workers
Shared Web Workers are named, so that any script running in the same origin/domain can communicate with them, either by the URL of the script used to create them or by name.
Shared Web Workers are targeted at communication with multiple producers and consumers.
A Shared Worker exposes more of the Messaging API components.

http://www.whatwg.org/specs/web-workers/current-work/

Web Workers API

// Check if Web Workers are supported
if (typeof Worker !== "undefined") {
    document.getElementById("support").innerHTML =
        "Your browser supports HTML5 Web Workers";
}

// Create a new worker
// The URL is for a JavaScript file on the same origin
var worker = new Worker("echoWorker.js");

// From the main page: send a message to the worker
worker.postMessage("Here's a message for you");

// Process incoming messages
function messageHandler(e) {
    // process message from worker (the payload is in e.data)
}

// Add event listener
worker.addEventListener("message", messageHandler, true);

// Handle errors (errorHandler is a function we define, like messageHandler)
worker.addEventListener("error", errorHandler, true);

// Stop worker
worker.terminate();

// From the Web Worker (echoWorker.js)

// Load additional JavaScript in the worker
importScripts("helper.js", "anotherHelper.js");

function messageHandler(e) {
    postMessage("worker says: " + e.data + " too");
}

// Add event listener
addEventListener("message", messageHandler, true);

// Using a Web Worker within a Web Worker
var subWorker = new Worker("subWorker.js");

Checking Web-Worker support in browser

/* Check if Web Workers are supported */
function getWebWorkerSupport() {
    return typeof Worker !== "undefined";
}

Before we write any web worker related code, we must find out whether our browser supports web workers.

At the time of writing, Shared Web Workers are supported in Chrome, Safari and Opera, while Dedicated Web Workers are implemented by Firefox 3.5, Safari 4 and Chrome. Mozilla Firefox 4 does not support Shared Web Workers.
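Since support differs between the two worker types, the check can be extended to report both. The sketch below is a hypothetical helper (the name getWorkerSupport and its scope parameter are not part of any API); it takes the global object as an argument so that it can also be tried outside a browser.

```javascript
// Hypothetical helper: reports which worker constructors a global scope exposes.
function getWorkerSupport(scope) {
  return {
    dedicated: typeof scope.Worker !== "undefined",
    shared: typeof scope.SharedWorker !== "undefined",
  };
}

// In a page we would pass the real global object.
const support = getWorkerSupport(globalThis);
console.log("Dedicated workers supported:", support.dedicated);
console.log("Shared workers supported:", support.shared);
```

A page could use the result to fall back to a dedicated worker per window when shared workers are missing.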

“localhost” bug

When we try to run a Worker script in Chrome from the local file system rather than from a web server, an error is reported, because Workers are restricted by the Same Origin Policy. The exact error is:

"Uncaught Error: SECURITY_ERR: DOM Exception 18"

The Same Origin Policy is an important security concept for a number of browser-side programming languages, such as JavaScript. The policy permits scripts running on pages originating from the same site to access each other's methods and properties with no specific restrictions, but prevents access to most methods and properties across pages on different sites.

The behavior of same-origin checks and related mechanisms is not well-defined in a number of corner cases, such as for protocols that do not have a clearly defined host name or port associated with their URLs (file:, data:, etc.).

Loading a local file, even with a relative URL, is the same as loading a file with the file: protocol. So the question to ask is whether we are viewing the page via the file:/// protocol or over http://. When the worker's .js file is loaded as a local file, Chrome refuses it for security reasons; we have to serve the page from a web server for the security checks to process it correctly. Alternatively, we can force the issue by starting Chrome like this:

chrome.exe --allow-file-access-from-files

http://en.wikipedia.org/wiki/Same_origin_policy


Working of Web Workers

Message Passing Model

Messages passed between the main page and workers are copied, not shared. It appears that the object is being passed directly to the worker even though the worker is running in a separate, dedicated space. In actuality, the object is serialized as it is handed to the worker and subsequently deserialized on the other end. The page and the worker do not share the same instance, so a duplicate is created on each pass. Early browser implementations did this by automatically JSON encoding/decoding the value on either end; current browsers use the structured clone algorithm instead.
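The copy semantics can be tried without a worker. This sketch assumes a runtime that provides the global structuredClone() function (current browsers, Node.js 17 or later), which applies the same structured clone algorithm that postMessage() uses in current browsers; the message object is made up for illustration.

```javascript
// A made-up message, as a page might send it to a worker.
const original = { cmd: "CalculatePi", samples: [1, 2, 3], sentAt: new Date(0) };

// structuredClone() stands in for the serialize/deserialize round trip
// that happens when a message is posted.
const copy = structuredClone(original);

// The receiver gets its own instance: changing it leaves the sender's data alone.
copy.samples.push(4);

console.log(copy === original);           // false
console.log(original.samples.length);     // 3
console.log(copy.samples.length);         // 4
console.log(copy.sentAt instanceof Date); // true (a JSON round trip would give a string)
```

Because every pass creates a duplicate, large messages cost both time and memory, which is worth keeping in mind when deciding how much data to hand to a worker.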

[Figure: the page and worker.js (WorkerGlobalScope) each hold a port; postMessage and onmessage exchange data over the messaging channel of the Web Messaging infrastructure]

WorkerGlobalScope

Workers have their own JavaScript context, separate from the renderer. The global scope (this) is NOT window, so a worker has:

No DOM access
No window
No document
No cookies
No storage (although Chrome now provides the Web Database API)

Common functions (across all implementations):

postMessage
Event support: addEventListener, dispatchEvent, removeEventListener
importScripts
location (read only)
navigator
XMLHttpRequest
setTimeout()/clearTimeout() and setInterval()/clearInterval()

Web Messaging Infrastructure

Web Messaging enables cross-document communication in a more secure way. Allowing scripts from one site to act freely on another (cross-site scripting) would open a security hole in the browser, so it is disabled for security reasons. Cross-document communication is nevertheless important for building Web Applications, so Web Messaging has been architected for security as well as communication capability.

Web Messaging protocols pass around a MessageEvent object. Its "data" attribute contains the message payload; "data" is a string in the simplest examples, but it can be of any type.

Web Workers leverage the Web Messaging channel infrastructure. A MessageChannel connects two MessagePorts. The specification refers to this setup as "entangling" the ports. A call to postMessage on a MessagePort puts data across the channel. Each MessagePort maintains a message queue. Messages posted on one port of the MessageChannel are sent to the other port of the MessageChannel and vice versa. MessagePorts receive a message via an "onmessage" function.

Web Workers extend the Web Messaging infrastructure by supporting posting to an array of MessagePorts. MessagePort arrays are handy for multiple notifications.
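The entangled ports described above can be tried on their own, without any worker. This is a minimal sketch, assuming a runtime with a global MessageChannel (current browsers, Node.js 15 or later); the message contents are made up.

```javascript
// Creating a MessageChannel gives us two entangled ports.
const channel = new MessageChannel();
const received = [];

// Assigning onmessage also starts the port's message queue.
channel.port2.onmessage = function (event) {
  // event is a MessageEvent; the payload is in event.data
  received.push(event.data);
  console.log("port2 received:", event.data);
  if (received.length === 2) {
    // Close the channel once we are done so nothing is kept alive.
    channel.port1.close();
  }
};

// Messages posted on one port are queued and delivered, in order, to the other.
channel.port1.postMessage("hello");
channel.port1.postMessage({ foo: "structured", bar: ["data"] });

// Delivery is asynchronous: nothing has arrived while this script is still running.
console.log("delivered so far:", received.length); // delivered so far: 0
```

A dedicated worker hides this pair of ports behind worker.postMessage() and onmessage, while a shared worker exposes its port explicitly.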

Communicating with a dedicated worker

Dedicated workers use MessagePort objects behind the scenes, and thus support all

the same features, such as sending structured data, transferring binary data, and

transferring other ports.

To receive messages from a dedicated worker, use the onmessage event handler

IDL attribute on the Worker object:

worker.onmessage = function (event) { ... };

We can also use the addEventListener() method.

The implicit MessagePort used by dedicated workers has its port message

queue implicitly enabled when it is created, so there is no equivalent to

the MessagePort interface's start() method on the Worker interface.

To send data to a worker, use the postMessage() method. Structured data can be

sent over this communication channel. To send ArrayBuffer objects efficiently (by

transferring them rather than cloning them), list them in an array in the second

argument.

worker.postMessage({
    operation: 'find-edges',
    input: buffer, // an ArrayBuffer object
    threshold: 0.6
}, [buffer]);

To receive a message inside the worker, the onmessage event handler IDL

attribute is used.

onmessage = function (event) { ... };

Again, we can also use the addEventListener() method.

In either case, the data is provided in the event object's data attribute.

To send messages back, we again use postMessage(). It supports the structured data

in the same manner.

postMessage(event.data.input, [event.data.input]); // transfer the buffer back
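The difference between transferring and cloning can be observed without a worker. In this sketch, structuredClone() with its transfer option stands in for worker.postMessage(message, [buffer]); this assumes a runtime that provides structuredClone (current browsers, Node.js 17 or later). The buffer size and the value written into it are arbitrary.

```javascript
// An ArrayBuffer with one recognizable byte in it.
const buffer = new ArrayBuffer(1024);
new Uint8Array(buffer)[0] = 42;

const message = { operation: "find-edges", input: buffer, threshold: 0.6 };

// Stand-in for worker.postMessage(message, [buffer]):
// the buffer is moved to the receiver instead of being copied.
const receivedMessage = structuredClone(message, { transfer: [buffer] });

console.log(buffer.byteLength);                        // 0 (the sender's buffer is detached)
console.log(receivedMessage.input.byteLength);         // 1024
console.log(new Uint8Array(receivedMessage.input)[0]); // 42
```

After a transfer the sender can no longer use the buffer, which is why the worker in the text above transfers it back when it has finished.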


Communicating with a shared worker

Shared workers are identified in one of two ways: either by the URL of the script used to create them, or by an explicit name. When a worker is created by name, the URL used by the first page to create the worker with that name is the URL of the script that will be used for that worker. This allows multiple applications on a domain to all use a single shared worker to provide a common service, without the applications having to keep track of a common URL for the script used to provide the service.

In either case, shared workers are scoped by origin. Two different sites using the same names will not collide.

Creating shared workers is done using the SharedWorker() constructor. This

constructor takes the URL to the script to use for its first argument, and the name

of the worker, if any, as the second argument.

var worker = new SharedWorker('service.js');

Communicating with shared workers is done with explicit MessagePort objects.

The object returned by the SharedWorker() constructor holds a reference to the port

on its port attribute.

worker.port.onmessage = function (event) { ... };

worker.port.postMessage('some message');

worker.port.postMessage({ foo: 'structured', bar:

['data', 'also', 'possible']});

Inside the shared worker, new clients of the worker are announced using

the connect event. The port for the new client is given by the event object's source

attribute.

onconnect = function (event) {
    var newPort = event.source;
    // set up a listener
    newPort.onmessage = function (event) { ... };
    // send a message back to the port
    newPort.postMessage('ready!'); // can also send structured data
};

A shared worker will remain active as long as one window has a connection to it.


Example of Dedicated Worker

The code below calculates the value of pi. It requires looping many, many times to reach any real accuracy, and that is really processor intensive. Web workers are not used here.

<html>
<head>
<script type="text/javascript">
function CalculatePi() {
    var loop = document.getElementById("loop");
    var c = parseInt(loop.value, 10);
    var f = parseFloat(loop.value);
    var Pi = 0, n = 1;
    try {
        // reject input that is not a whole, positive number
        if (isNaN(c) || f != c) {
            throw "errInvalidNumber";
        } else if (c <= 0) {
            throw "errNegativeNumber";
        }
        // each cycle adds two terms of the series 4/1 - 4/3 + 4/5 - 4/7 + ...
        for (var i = 0; i < c; i++) {
            Pi = Pi + (4 / n) - (4 / (n + 2));
            n = n + 4;
        }
        document.getElementById("PiValue").innerHTML = Pi;
    } catch (e) {
        var msg = "Input Error: ";
        if (e == "errInvalidNumber")
            msg += "Invalid number.";
        else if (e == "errNegativeNumber")
            msg += "Input must be positive.";
        else
            msg += e.message;
        alert(msg);
    }
}
</script>
</head>
<body>
<label for="loop">Enter the number of cycles:</label>
<input id="loop" type="number" value="100" />
<input type="button" onclick="CalculatePi()" value="Calculate Pi" />
<br> <br>
<div id="PiValue">PI value appears here</div>
</body>
</html>

For small values of 'number of cycles' the user interface does not block and the value is computed almost instantly. When we enter values in the millions and above, two things happen: we get a fairly accurate value of pi, and the interface slows to a crawl.

[Figure: result of running the above code for 10000000000 cycles]

Code with web workers

pi.htm (main thread):

<html>
<head>
<script type="text/javascript">
function launchPiWebWorker() {
    var worker = new Worker('pi.js');

    worker.onmessage = function (e) {
        document.getElementById("PiValue").innerHTML = e.data.PiValue;
    };

    worker.onerror = function (e) {
        alert('Error: Line ' + e.lineno + ' in ' + e.filename + ': ' + e.message);
    };

    // start the worker
    worker.postMessage({
        'cmd': 'CalculatePi',
        'value': document.getElementById("loop").value
    });
}
</script>
</head>
<body>
<label for="loop">Enter the number of cycles:</label>
<input id="loop" type="number" value="100" />
<input type="button" onclick="launchPiWebWorker()" value="Calculate Pi" />
<br>
<br>
<div id="PiValue">PI value appears here</div>
</body>
</html>

Worker file pi.js:

function CalculatePi(loop) {
    var c = parseInt(loop, 10);
    var f = parseFloat(loop);
    var Pi = 0, n = 1;
    // an uncaught error thrown here is reported to worker.onerror in the page
    if (isNaN(c) || f != c) {
        throw new Error("Invalid number.");
    } else if (c <= 0) {
        throw new Error("Input must be positive.");
    }
    for (var i = 0; i < c; i++) {
        Pi = Pi + (4 / n) - (4 / (n + 2));
        n = n + 4;
    }
    self.postMessage({'PiValue': Pi});
}

// wait for the start 'CalculatePi' message
// e is the event and e.data contains the object sent by the page
self.onmessage = function (e) {
    CalculatePi(e.data.value);
};

The above code uses a worker to compute the value of pi. It does not block the user interface, because the calculation is done in a separate thread, i.e. in the worker.

This snippet will not run in Chrome over the “file://” protocol, for the security reasons described above under the “localhost” bug.

(I have checked this in Chrome version 19.0.1084.52)
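The calculation itself depends neither on the DOM nor on the worker API, so its accuracy can be examined separately. In this sketch, calculatePi is a stand-alone version of the loop used above (the lower-case name and the chosen cycle counts are only for illustration):

```javascript
// Each cycle adds two terms of the series 4/1 - 4/3 + 4/5 - 4/7 + ...
function calculatePi(cycles) {
  let pi = 0;
  let n = 1;
  for (let i = 0; i < cycles; i++) {
    pi = pi + 4 / n - 4 / (n + 2);
    n = n + 4;
  }
  return pi;
}

// The error shrinks only in proportion to 1/cycles,
// which is why a useful result needs millions of cycles.
for (const cycles of [10, 1000, 1000000]) {
  const value = calculatePi(cycles);
  console.log(cycles, value, "error:", Math.abs(Math.PI - value));
}
```

Each extra digit of accuracy costs roughly ten times more cycles, which makes this series a convenient way to generate a long-running task for the worker.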


Example of a Shared Worker

When we have a web application with multiple windows, each needing access to a worker thread, we do not really want to create a new thread in each window, because it takes time and system resources to create each worker thread.

The ability to share a single worker thread among all windows from the same origin comes as a great benefit in this case.

The following is the simplest way to create a SharedWorker thread that multiple

windows from the same origin can make use of:

// Window 1

var aSharedWorker = new SharedWorker("SharedWorker.js");

// Window 2

var aSharedWorker = new SharedWorker("SharedWorker.js");

The SharedWorker object accepts an optional 2nd parameter in the constructor that

serves as the name of the worker.

Most of the time having one shared worker will give the needed functionality. If

we simply have a desire to add more parallel processing, the shared worker can

always spawn web workers of its own.

What if we run into a scenario where we have a need for several windows to share

several workers rather than just the one?

That's where the 2nd parameter of the SharedWorker constructor comes into play.

We can create several different SharedWorker threads by specifying different

names when creating the worker objects.

The following is an example of two windows each sharing two worker threads

'Worker1' and 'Worker2':

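// Window 1 - Shared Worker 1 & 2
var aSharedWorker1 = new SharedWorker("SharedWorker.js", "Worker1");
var aSharedWorker2 = new SharedWorker("SharedWorker.js", "Worker2");

// Window 2 - Shared Worker 1 & 2
var aSharedWorker1 = new SharedWorker("SharedWorker.js", "Worker1");
var aSharedWorker2 = new SharedWorker("SharedWorker.js", "Worker2");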

A very good example of using shared workers is available at http://coolaj86.github.com/html5-shared-web-worker-examples

NOTE: At the time of writing, shared workers do not work in Firefox, and in Chrome they work only over the http:// protocol.

Terminating the Web Workers

Once the main page starts a Worker thread, the thread does not terminate by itself. The calling page has to explicitly ask the Worker to terminate. This may become necessary because each new Worker consumes precious browser resources, which we will want to reclaim once the Worker's task is no longer required.

worker.terminate();

Once a worker is terminated, it cannot be restarted; a new worker has to be created if needed.

The close() function can also be used to close the worker from within itself.

self.onmessage = function(e) {

if (e.data == "STOP!") self.close();

};

Error Handling and Debugging

Whenever an uncaught runtime script error occurs in one of the worker's scripts, if the error did not occur while handling a previous script error, the user agent must report the error at the URL of the resource that contained the script, with the position (line number and column number) where the error occurred, in the origin of the scripts running in the worker, using the WorkerGlobalScope object's onerror attribute.

If the implicit port connecting the worker to its Worker object has been

disentangled (i.e. if the parent worker has been terminated), then the user

agent must act as if the Worker object had no error event handler and as if

that worker's onerror attribute was null.

There are some browser differences to note here:

Chrome 5 and Safari 5 both just pass the error as a string to the error

handler in the thread

Firefox 3.6.8 and 4.0 beta 2 pass in an ErrorEvent object to the error

handler in the thread.

All browsers (Chrome 5, Safari 5, Firefox 3.6.8 / 4.0 beta 2) implement the

dedicated worker instance error event in the same way by passing in the

ErrorEvent object. When it comes to shared workers, however, the shared

worker object instance cannot trigger the onerror event in Chrome 5 or

Safari 5. It appears that for shared workers the onerror event will only be

triggered for the shared worker instance if there was a network error while

the worker thread was being created.

The following is an example of attaching to the onerror event of a dedicated

worker thread (the example will also work for shared workers with the exception

that with shared workers postMessage needs to be called on a port):

// Attach to the global error handler of the thread
onerror = OnErrorHandler;

function OnErrorHandler(e) {
    // In Chrome 5/Safari 5, 'e' is a string for both dedicated
    // and shared workers within the thread
    if (typeof e == "string") {
        postMessage("Error Message: " + e);
    } else {
        // Dedicated worker in Firefox... (Firefox does not yet
        // support shared workers)
        postMessage("Error Message: " + e.message +
            " File Name: " + e.filename +
            " Line Number: " + e.lineno);
    }
}

// to test the error handler, throw an error
throw "This is a test error";

The message attribute must return the value it was initialized to. When the object is

created, this attribute must be initialized to the empty string. It represents the error

message.

The filename attribute must return the value it was initialized to. When the object

is created, this attribute must be initialized to the empty string. It represents the

absolute URL of the script in which the error originally occurred.

The lineno attribute must return the value it was initialized to. When the object is

created, this attribute must be initialized to zero. It represents the line number

where the error occurred in the script.


Web Workers for which scenarios?

Image processing using the data extracted from the <canvas> or <video> elements. We can divide the image into several zones and push them to different Workers that will work in parallel. We then benefit from the new generation of multi-core CPUs.

Parsing a large amount of data retrieved by an XMLHttpRequest call. If the time needed to process this data is significant, we had better do it in the background inside a Web Worker to avoid freezing the UI thread. We then keep a responsive application.

Background text analysis: as we potentially have more CPU time available when using Web Workers, we can now think about new scenarios in JavaScript. For instance, we could imagine parsing in real time what the user is currently typing without impacting the UI experience. Think about an application like Word (of the Office Web Apps suite) leveraging such a possibility: background search in dictionaries to help the user while typing, automatic correction, etc.

Concurrent requests against a local database. IndexedDB will allow what Local Storage cannot offer us: a thread-safe storage environment for our Web Workers.

Prefetching and/or caching data for later use

Code syntax highlighting or other real-time text formatting

Background I/O or polling of web services

Processing large arrays or humungous JSON responses

Updating many rows of a local web database

Analyzing video or audio data
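For the image processing scenario in the list above, the first step is to decide which rows of the image each worker receives. The following is a minimal sketch of that step only; splitIntoZones is a hypothetical helper name, and in a page each zone would then be sent to its own worker with postMessage().

```javascript
// Hypothetical helper: split `height` image rows into `zoneCount` contiguous zones.
// endRow is exclusive, so a zone covers rows startRow .. endRow - 1.
function splitIntoZones(height, zoneCount) {
  const zones = [];
  const base = Math.floor(height / zoneCount);
  const extra = height % zoneCount; // the first `extra` zones get one more row
  let startRow = 0;
  for (let i = 0; i < zoneCount; i++) {
    const rows = base + (i < extra ? 1 : 0);
    zones.push({ startRow: startRow, endRow: startRow + rows });
    startRow += rows;
  }
  return zones;
}

// Ten rows shared between four workers.
console.log(splitIntoZones(10, 4));
// [ {startRow: 0, endRow: 3}, {startRow: 3, endRow: 6},
//   {startRow: 6, endRow: 8}, {startRow: 8, endRow: 10} ]
```

Keeping the zones contiguous means each worker can be handed a single slice of the pixel data, and the page can write each result back at the zone's startRow.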

Advantages of Web Workers

The Worker interface spawns real OS-level threads, and concurrency can

cause interesting effects in our code if we aren't careful. However, in the

case of web workers, the carefully controlled communication points with

other threads mean that it's actually very hard to cause concurrency

problems. There's no access to non-thread safe components or the DOM and

we have to pass specific data in and out of a thread through serialized

objects.

Web Workers are not ridden with classic concurrency problems such as deadlocks and race conditions.

A Worker makes a natural sandbox for running untrusted code because it cannot access page content or cookies. “Jsandbox is an open source JavaScript sandboxing library that makes use of HTML5 web workers. Jsandbox makes it possible to run untrusted JavaScript without having to worry about any potential dangers.”

Much of the danger comes from the script being executed on the same origin, where it can use:
– XMLHttpRequest
– OpenDatabase etc.
However, new Worker() accepts same-domain scripts only, while the communication API allows for cross-origin messaging using postMessage.

Multiple windows (viewers) can be opened that all view the same item, for instance a map. All the windows share the same map information, with a single shared worker coordinating all the viewers. Each viewer can move around independently, but if any of them sets data on the map, all the viewers are updated. (This feature of shared web workers can be used in our project.)
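The coordinating worker for such a map scenario could look roughly like the sketch below. It is only an illustration of the idea, not code from an existing project: the handler name handleConnect and the shape of the update messages are assumptions. The port of each new client is read from event.ports[0], which is the same port the connect event also exposes as event.source.

```javascript
// Sketch of a shared worker script that relays map updates to every viewer.
const ports = []; // one MessagePort per connected window

function handleConnect(event) {
  const port = event.ports[0];
  ports.push(port);
  port.onmessage = function (message) {
    // Relay the update to every connected viewer, including the sender.
    for (const p of ports) {
      p.postMessage(message.data);
    }
  };
}

// Inside a real shared worker, the browser fires "connect" for each new client.
globalThis.onconnect = handleConnect;
```

Each window would create the worker with new SharedWorker(...) and listen on worker.port.onmessage; a fuller version would also remove ports of windows that have been closed.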

Disadvantages of using Web Workers

In implementations where postMessage can only transfer strings between threads, there is a conversion cost. It is very rare that data requiring analysis is solely string based; mostly we are working with other types as well, such as numbers, Booleans and dates, and the cost of converting (serializing) these data types to and from strings can be huge.

Web workers are not intended to be used in large numbers and are expected to be long-lived. Worker threads have a high start-up performance cost as well as a high memory cost per worker instance.

Functions cannot be sent:

var func = function (e) { return e; };
postMessage(func); // Not allowed

Multi-threaded processes are difficult to debug.
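The restriction on sending functions noted above can be reproduced without a worker, again assuming a runtime that provides structuredClone() (current browsers, Node.js 17 or later) as a stand-in for the cloning done by postMessage(). A common way around it, sketched here with made-up command names, is to send a description of the work and let the receiving side look up the function itself.

```javascript
const func = function (e) { return e; };

// Cloning a function fails, just as posting it to a worker would.
let cloneError = null;
try {
  structuredClone(func);
} catch (err) {
  cloneError = err;
}
console.log("function could be cloned:", cloneError === null); // false

// Workaround: send plain data naming the task; the receiver owns the functions.
const handlers = {
  identity: function (value) { return value; },
  double: function (value) { return value * 2; },
};
const task = structuredClone({ cmd: "double", value: 21 });
console.log(handlers[task.cmd](task.value)); // 42
```

The 'cmd' field used in the pi example earlier follows the same pattern: the page names the operation and the worker decides which of its own functions to run.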

Conclusion

As browser-based apps continue to become more complex, and CPUs gain

more cores, there will be a natural need to offload work into separate

threads. HTML5 Web Workers will likely form a big part of this and

combining them with jQuery Deferred objects can make it simpler for

developers to write simple, easy-to-read, parallel code, without adding any

extra overhead.

JavaScript web workers are in their infancy and the use cases are limited.

Browser support varies from patchy to non-existent and debugging is tough.

References

1. http://www.w3.org/TR/workers/

2. http://cggallant.blogspot.in/2010/08/deeper-look-at-html-5-web-workers.html

3. http://www.html5rocks.com/en/tutorials/workers/basics/
