
Web Client performance

Herea Adrian

[email protected]

This paper brings together information on ways to improve speed on the web client. We look at the tips offered by YSlow, how to follow them, and which of them are worth applying. We also look at how to write better JavaScript code to improve performance.


Prologue

The Pareto Principle

Economist Vilfredo Pareto found in 1897 that about 80 percent of Italy's wealth

was owned by about 20 percent of the population. This has become the 80/20 rule or

the Pareto principle, which is often applied to a variety of disciplines. Although some

say it should be adjusted to a 90/10 rule, this rule of thumb applies to everything from

employee productivity and quality control to programming.

Barry Boehm found that 20 percent of a program consumes 80 percent of the

execution time. He also found that 20 percent of software modules are responsible for

80 percent of the errors. Donald Knuth found that more than 50 percent of a program's

run time is usually due to less than 4 percent of the code. Clearly, a small portion of

code accounts for the majority of program execution time.

The importance of performance

500 ms slower = 20% drop in traffic (Google)

400 ms slower = 5-9% drop in full-page traffic* (Yahoo!)

100 ms slower = 1% drop in sales (Amazon)

Users leaving before the page finishes loading


Tips for a fast web page inspired by YSlow

YSlow by Yahoo! is a small plugin for Firefox and Firebug. It looks at a page and uses Yahoo!'s Best Practices for Speeding Up Your Web Site to tell you how to make the page load faster. At least that is the idea behind it. In practice, however, not all of its suggestions are useful or even meaningful.

1. Make Fewer HTTP Requests

If the browser has to make fewer requests, it can display your page faster.

There are three reasons for this. The first is that every HTTP request carries a small amount of network overhead, so the more requests you remove, the less overall network traffic there is.

The second reason is that browsers limit the number of simultaneous HTTP requests they make to a web server (to a hostname, specifically). Older browsers were limited to 2. Some newer browsers have raised that limit to 6 or 8, but they still have a limit. When a browser reaches its limit, it has to wait for requests to finish before starting new ones, so the more requests a page needs, the more queuing occurs.

The third reason is specific to JavaScript: browsers download and execute only one JavaScript file at a time. This is done because JavaScript can modify the DOM, redirect the page, or do any number of things that may affect which resources need to be downloaded. So even if a browser can run 8 requests in parallel, JavaScript files are still downloaded sequentially. (There are efforts to improve this in WebKit and Gecko.)

It is generally true that fewer requests lead to faster display, although some types of pages need more images and similar resources as part of their content. To account for this, YSlow looks at only three types of request: JavaScript files, style sheet files, and images referenced in style sheets (CSS images). For each type, if you have too many, your score starts to drop. The limits are 3 scripts, 2 style sheets, and 6 CSS images.

Following these suggestions can really improve performance, although the limit on CSS images may be too strict.

2. Use a Content Delivery Network

Content delivery networks let you spread your content out on a geographically

dispersed network of servers so it can be delivered to your users more quickly. This is

especially good to do for components of your page such as images and scripts, rather

than the core content that you may need to serve dynamically.

YSlow restricts its attention to scripts, images, CSS images, style sheets, and Flash objects. For each of these, it matches the URL against a list of known CDN URL patterns. You can also add your own CDN patterns if you need to. Every one of those files that doesn't match a CDN costs you points.

Most sites can ignore this advice: almost all sites do not get enough traffic to justify a CDN, and CDNs are too expensive for most webmasters.

3. Add an Expires Header

Save requests for your return visitors by letting them know how long page

elements are good.

Expires headers tell the browser when an asset 'expires' from its cache. When you set the date years in the future, the browser caches the asset and does not ask the website for it again until that date.

Expires headers look like this:

Expires: Wed, 15 Apr 2020 20:00:00 GMT

The assets you want to set Expires headers on are things that don't change much, like images, CSS, and JavaScript. Of course, CSS and JavaScript might change every week or two, and a far-future expiry date could leave your users with stale copies. The way around this is to change the filename of an asset whenever you update it. This can be done manually or with a build script of some sort.
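The renaming step can be automated. The following is a minimal sketch, assuming a Node.js build script; the function name `versionedName` and the eight-character hash length are illustrative choices, not something prescribed here. It derives part of the file name from a hash of the file's contents, so the name changes only when the content does.

```javascript
const crypto = require("crypto");

// Hypothetical helper: ("site.css", contents) -> "site.<hash>.css".
function versionedName(fileName, contents) {
  const hash = crypto
    .createHash("md5")
    .update(contents)
    .digest("hex")
    .slice(0, 8);
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    // No extension (or a dotfile): append the hash at the end.
    return fileName + "." + hash;
  }
  return fileName.slice(0, dot) + "." + hash + fileName.slice(dot);
}
```

The pages that reference the asset then have to be updated to use the new name, which is why this is usually done in the same build step.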

You can set expires headers in Apache by adding this to your httpd.conf or

.htaccess file:

<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
Header set Expires "Wed, 15 Apr 2020 20:00:00 GMT"
</FilesMatch>

Again YSlow turns its eye to scripts, images, CSS images, style sheets, and Flash objects. If any of them lacks an Expires header that lets it live at least 48 hours (or a Cache-Control header that does the same thing), down goes your score.

This is the single biggest part of your YSlow score, and for good reason. Expires headers really do help speed things up and they are easy to add. In Apache, it is as simple as adding a couple of mod_expires directives to your .htaccess file.

4. Gzip Components

If you compress your content, it will reach your users faster.

Images are generally already compressed by virtue of their file formats (JPEG, GIF, etc.), so this time YSlow focuses on everything else. If any of those components aren't compressed, you lose points fast. Even one infraction and you have lost your A. Miss four and you get an F.

Computers are fast and networks are slow. Compressing your files just makes sense and is easy to do. Apache's mod_deflate (or mod_gzip for the 1.3 crowd) lets you do it with a few small changes to your .htaccess file.

5. Put CSS at the Top

Give your browser all the style information upfront so it can get the layout right the

first time.

Every style sheet used that is not mentioned in the header costs you a letter grade.

Do what you can. It is a good idea, but is not always possible if you are including

dynamic content from other sites, such as widgets, ads, etc. that include their own

CSS.

6. Move Scripts to the Bottom

Let the browser finish laying out your page before you start running all sorts of

dynamic scripts.

As noted under #1, browsers download only one JavaScript file at a time. They also block rendering of any content that comes after a script in the document. So when your JavaScript is referenced at the top of the page, as on most web pages, it blocks the rest of the page. The solution is to move scripts below all your content.

YSlow scores this the same way as #5, except that every script mentioned in the header costs you a letter grade.

On the one hand, they do have a point about how the page will load. On the other hand, the scripts will download sooner if they are referenced in the header, and onLoad exists to defer running code until the page is ready. And many widgets and ads have to be included inline, so you can't win anyway. Do what you want.

7. Avoid CSS Expressions

Don't make the browser work harder than necessary for your style sheets.

You start this one with barely an A and work down from there. Every expression costs you just two points, so you can get away with a few and still get a B.

You should avoid CSS expressions, although not necessarily for speed reasons. You should avoid them because they are a nonstandard Microsoft “extension” that will do nothing but cause you headaches as you try to make your site work cross-browser. Anyone who gets less than an A here deserves their headaches.


8. Make JavaScript and CSS External

Moving style sheets and scripts out of your (X)HTML will allow them to be

cached by browsers for subsequent page views on your site.

The code for this test is a stub that always gives you an “n/a” with a note saying that it only makes sense for home pages.

Make them external if they are used on multiple pages of your site. It will help on subsequent pages. On the other hand, if something is specific to one page, feel free to inline it with impunity.

9. Reduce DNS Lookups

The fewer host names a browser has to resolve, the faster it can load your page.

YSlow counts all the host names for all the components. Each host name past the second costs you 5 points. So four or fewer is an A, five or six a B, and so on.

Most of the time, if multiple host names are in use, it is because you are including widgets or ads from somewhere else or you are using a CDN (see #2) to speed things up. Nice catch-22 there. Try to minimize the number of hosts, but don't sweat it too much for things like ads and CDNs, as your users' browsers may already have resolved those hosts.
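To see how many host names a page pulls in, the counting rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in this section (5 points for each host name past the second), not YSlow's actual code; the function names `countHosts` and `dnsScore` are made up for the example, and the standard `URL` class is assumed to be available.

```javascript
// Count the distinct host names among a list of component URLs.
function countHosts(urls) {
  const hosts = new Set();
  for (const u of urls) {
    hosts.add(new URL(u).hostname);
  }
  return hosts.size;
}

// Apply the penalty described in the text: 5 points per host name
// past the second, starting from 100 and never going below 0.
function dnsScore(urls) {
  const extra = Math.max(0, countHosts(urls) - 2);
  return Math.max(0, 100 - 5 * extra);
}
```

With four distinct hosts the sketch yields 90, and with six it yields 80, which matches the "four or fewer is an A, five or six a B" reading if an A starts at 90.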

10. Minify JavaScript

While compressing JavaScript saves bandwidth and time (see #4), it turns out that compression works even better on JavaScript that has been “minified.”

YSlow does a clever thing here and just looks for whitespace and comments in your JS files. If it finds them, the file is definitely not minified. You lose 10 points per non-minified file, so one offense is still an A, but you drop a letter grade for each additional infraction.

The recommendation is to minify once you have your scripts debugged. Debugging minified code is self-flagellation. The difference between compression on a minified script and a non-minified one is small but measurable, so it is probably worth the one-time effort. Compressor Rater can help you find the best compression overall for your JavaScript.

11. Avoid Redirects

Bouncing between different URLs causes more requests and more time, so avoid

them.


YSlow looks at each component to check for redirects. As with non-minified JS files, you lose 10 points per redirect. One redirect is okay, but any past that and your grade drops fast.

Redirects are a useful tool, but they can be overused. Avoid them if you don't have a good reason to use them.

12. Remove Duplicate Scripts

Don't load twice what you can load once.

Any JS file loaded more than once costs you five points, which means you can still get an A with three duplicates.

Do your best. Duplicated scripts are rarely intentional, but they do happen all the time. Every page that has more than one AdSense block on it is guilty, which is probably why YSlow cuts you some slack.

13. Configure ETags

ETags help reduce duplicate requests, so using them is good.

YSlow doesn't just look for ETag headers; it looks for ones that it considers valid. This means that it expects any application-generated ETag headers to match those generated by your web server. As with several of the other tests, YSlow only looks at scripts, images, CSS images, style sheets, and Flash objects.

Don't bother. ETags are helpful, but Expires headers accomplish much of the same job. Furthermore, YSlow's implementation is, in my opinion, broken. There is nothing in the HTTP specification that requires a specific format for ETags. A web application should be able to generate whatever form of ETag it wants, provided it can use it to determine whether the content of the page has changed. A commonly used technique is to use an MD5 of the content, but YSlow deems this unacceptable.
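The MD5 technique mentioned above could look roughly like this. It is a minimal sketch assuming a Node.js server; `makeETag` and `statusFor` are hypothetical helper names, and the comparison is deliberately simplified (a real If-None-Match header may also carry a list of tags, a weak `W/` prefix, or `*`).

```javascript
const crypto = require("crypto");

// Build an ETag value (a quoted string) from an MD5 digest of the body.
function makeETag(body) {
  return '"' + crypto.createHash("md5").update(body).digest("hex") + '"';
}

// Pick the status code for a conditional GET.
// ifNoneMatch is the raw If-None-Match request header, or undefined.
function statusFor(body, ifNoneMatch) {
  // 304 Not Modified when the client's cached copy matches the content.
  return ifNoneMatch === makeETag(body) ? 304 : 200;
}
```

Because the tag depends only on the content, it stays the same across servers and restarts and changes whenever the content changes.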


Optimizing JavaScript for Execution Speed

As our JavaScript applications get larger and more sophisticated, efficient scripting becomes increasingly important and harder to ignore. Back in the days when all that JavaScript could do was change a document's background color or validate a simple form, inefficient code was common, and the browser had no problem with it. Now, especially with the language's marriage with DHTML and the resulting ability to create almost full-blown applications, efficiency is no longer something we can sweep under the rug and forget about.

JavaScript can benefit from many of the same speed-optimization techniques that are used in other languages, like C [1,2] and Java. Algorithms and data structures, caching frequently used values, loop unrolling and hoisting, removing tail recursion, and strength-reduction techniques all have a place in your JavaScript optimization toolbox. However, how you interact with the Document Object Model (DOM) in large part determines how efficiently your code executes.

Unlike other programming languages, JavaScript manipulates web pages through a

relatively sluggish API, the DOM. Interacting with the DOM is almost always more

expensive than straight computations. After choosing the right algorithm and data

structure and refactoring, your next consideration should be minimizing DOM

interaction and I/O operations.

With most programming languages, you can trade space for time complexity and vice versa. But on the web, JavaScript files must be downloaded. Unlike desktop applications, where you can trade another kilobyte or two for speed, with JavaScript you have to balance execution speed against file size.

Unlike C, with its optimizing compilers that increase execution speed and decrease

file size, JavaScript is an interpreted language that usually is run over a network

connection (unless you count Netscape's Rhino, which can compile and optimize

JavaScript into Java byte code for embedded applications). This makes JavaScript

relatively slow compared to compiled languages. However, most scripts are usually so

small and fast that users won't notice any speed degradation.

Many would agree that it's just a matter of time before JavaScript graduates to become a full-blown language like C or Java. Practicing responsible and efficient coding now can save you a lot of work in the future.

Design Levels

A hierarchy of optimization levels exists for JavaScript, what Bentley and others call design levels [6]. First come the global changes, like using the right algorithms and data structures, which can speed up your code by orders of magnitude. Next comes refactoring, which restructures code in a disciplined way into a simpler, more efficient form [7]. Then comes minimizing DOM interaction and I/O or HTTP requests. Finally, if performance is still a problem, use local optimizations like caching frequently used values to save on recalculation costs. Here is a summary of the optimization process:


1. Choose the right algorithm and data structure.

2. Refactor to simplify code.

3. Minimize DOM and I/O interaction.

4. Use local optimizations last.

When optimizing your code, start at the highest level and work your way down until

the code executes fast enough. For maximum speed, work at multiple levels.

Measure Your Changes

Measurement is a key part of the optimization process. Use the simplest algorithms

and data structures you can, and measure your code's performance to see whether you

need to make any changes. Use timing commands or profilers to locate any

bottlenecks. Optimize these hot spots one at a time, and measure any improvement.

You can use the Date object to time individual snippets:

<script type="text/javascript">
function DoBench(x){
  // Time Bench1 (local variable) or Bench2 (global variable).
  var startTime,endTime,gORl='local';
  if(x==1){
    startTime=new Date().getTime();
    Bench1();
    endTime=new Date().getTime();
  }else{
    gORl='global';
    startTime=new Date().getTime();
    Bench2();
    endTime=new Date().getTime();
  }
  alert('Elapsed time using '+gORl+' variable: '+((endTime-startTime)/1000)+' seconds.');
}
...
</script>

This is useful when comparing one technique to another. But for larger projects,

only a profiler will do. Mozilla.org includes the Venkman profiler in the Mozilla

browser distribution to help optimize your JavaScript.
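The Date-based timing shown earlier can be wrapped in a small reusable helper. This is a sketch rather than a replacement for a profiler; `timeIt` and the `sumTo1000` workload are hypothetical names introduced only for the example. Running the function many times helps because a single call is often too fast for millisecond resolution.

```javascript
// Run fn `iterations` times and return the elapsed wall-clock time in ms.
function timeIt(fn, iterations) {
  const start = new Date().getTime();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  return new Date().getTime() - start;
}

// Example workload (hypothetical): sum the integers from 1 to 1000.
function sumTo1000() {
  let total = 0;
  for (let i = 1; i <= 1000; i++) {
    total += i;
  }
  return total;
}

// Elapsed milliseconds for 10,000 runs; the value varies by machine.
const elapsedMs = timeIt(sumTo1000, 10000);
```

To compare two techniques, time both with the same iteration count and repeat the measurement a few times, since individual runs can be noisy.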

Algorithms and Data Structures

As we learn in computer science classes, global optimizations (such as algorithm

and data structure choices) determine in large part the overall performance of our

programs. For larger values of "n," or the number of input elements, the complexity of

running time can dominate any local optimization concerns. This complexity is

expressed in O-notation, where complexity or "order" is expressed as a function of n.

Table 10.1 shows some examples.

Notation     Name          Example
O(1)         constant      array index, simple statements
O(log n)     logarithmic   binary search
O(n)         linear        string comparison, sequential search
O(n log n)   n log n       quicksort and heapsort
O(n^2)       quadratic     simple selection and insertion sorting methods (two loops)
O(n^3)       cubic         matrix multiplication of n x n matrices
O(2^n)       exponential   set partitioning (traveling salesman)

Array access or simple statements are constant-time operations, or O(1). Well-crafted quicksorts run in n log n time, or O(n log n). Two nested for loops take on the order of n x n, or O(n^2), time. For low values of n, choose simple data structures and algorithms. As your data grows, use lower-order algorithms and data structures that will scale for larger inputs.
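As a concrete instance of moving to a lower-order algorithm, here is a sketch of the two search strategies named in the table: sequential search, which is O(n), and binary search, which is O(log n) but requires the array to be sorted. The function names are illustrative.

```javascript
// O(n): check every element until the target is found.
function sequentialSearch(items, target) {
  for (let i = 0; i < items.length; i++) {
    if (items[i] === target) {
      return i;
    }
  }
  return -1;
}

// O(log n): halve the search range at each step.
// Assumes `sorted` is in ascending order.
function binarySearch(sorted, target) {
  let low = 0;
  let high = sorted.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    if (sorted[mid] === target) {
      return mid;
    }
    if (sorted[mid] < target) {
      low = mid + 1;
    } else {
      high = mid - 1;
    }
  }
  return -1;
}
```

For a handful of elements the simpler sequential version is fine; the difference only starts to matter as n grows.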

Use built-in functions whenever possible (like the Math object), because these are

generally faster than custom replacements. For critical inner loops, measure your

changes because performance can vary among different browsers.


Refactor to Simplify Code

Refactoring is the art of reworking your code to a more simplified or efficient form

in a disciplined way. Refactoring is an iterative process:

1. Write correct, well-commented code that works.

2. Get it debugged.

3. Streamline and refine by refactoring the code to replace complex sections with

shorter, more efficient code.

4. Mix well, and repeat.

Refactoring clarifies, refines, and in many cases speeds up your code. Here's a

simple example that replaces an assignment with an initialization. So instead of this:

function foo() {

var i;

// ....

i = 5;

}

Do this:

function foo() {

var i = 5;

// ....

}

Minimize DOM Interaction and I/O

Interacting with the DOM is significantly more complicated than arithmetic

computations, which makes it slower. When the JavaScript interpreter encounters a

scoped object, the engine resolves the reference by looking up the first object in the

chain and working its way through the next object until it finds the referenced

property. To maximize object resolution speed, minimize the scope chain of objects.


Each node reference within an element's scope chain means more lookups for the

browser. Keep in mind that there are exceptions, like the window object, which is

faster to fully reference. So instead of this:

var link = location.href;

Do this:

var link = window.location.href;

Minimize Object and Property Lookups

Object-oriented techniques encourage encapsulation by tacking sub-nodes and

methods onto objects. However, object-property lookups are slow, especially if there

is an evaluation. So instead of this:

for(var i = 0; i < 1000; i++)

a.b.c.d(i);

Do this:

var e = a.b.c.d;

for(var i = 0; i < 1000; i++)

e(i);

Reduce the number of dots (object.property) and brackets (object["property"]) in your program by caching frequently used objects and properties. Nested properties are the worst offenders (object.property.property.property).
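One caveat when caching, shown here as a sketch with a hypothetical `app.stats.counter` object: if the cached property is a method that relies on `this`, calling it through a bare variable loses its receiver. Caching the owning object, or a bound copy of the method, keeps the call correct while still removing the repeated lookups.

```javascript
// Hypothetical nested object whose method depends on `this`.
const app = {
  stats: {
    counter: {
      total: 0,
      add: function (n) {
        this.total += n;
      }
    }
  }
};

// Cache the owning object once, outside the loop.
// Calling counter.add(i) keeps `this` pointing at the counter.
const counter = app.stats.counter;
for (let i = 0; i < 1000; i++) {
  counter.add(i);
}

// Alternative: cache a bound copy of the method itself.
const add = counter.add.bind(counter);
add(1);
```

For plain functions that do not use `this`, caching the function reference directly works as described in the text.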

Here is an example of minimizing lookups in a loop. Instead of this:

for (i=0; i<someArrayOrObject.length; i++)

Do this:

for (var i=0, n=someArrayOrObject.length; i<n; i++)


Also, accessing a named property or object requires a lookup. When possible, refer

to the object or property directly by using an index into an object array. So instead of

this:

var form = document.f2; // refer to form by name

Do this:

var form = document.forms[1]; // refer to form by position

Shorten Scope Chains

Every time a function executes, JavaScript creates an execution context that defines its own little world for local variables. Each execution context has an associated scope chain that defines the context's place in the program's hierarchy of scopes. The scope chain lists the objects that are searched, in order, when evaluating a variable name. Each time a JavaScript program begins executing, certain built-in objects are created.

The global object lists the properties (global variables) and predefined values and

functions (Math, parseInt(), etc.) that are available to all JavaScript programs.

Each time a function executes, a temporary call object is created. The function's

arguments and variables are stored as properties of its call object. Local variables are

properties of the call object.

Within each call object is the calling scope. Each nested function defines a new child of that scope. When JavaScript looks up a variable (called variable name resolution), the interpreter looks first in the local scope, then in its parent, then in the parent of that scope, and so on until it hits the global scope. In other words, JavaScript looks at the first item in the scope chain, and if it doesn't find the variable, it moves up the chain until it reaches the global object.

That's why global variables are slow to resolve. They are worst-case scenarios for object lookups.

During execution, only with statements and catch clauses affect the scope

chain.
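The lookup order described in this section can be illustrated with a short sketch. The names `scale`, `sumScaled`, and `shadowDemo` are invented for the example; the point is that a name declared inside the function is found first, so copying an outer value into a local once moves the repeated lookups to the front of the scope chain.

```javascript
// An outer (top-level) variable: found last when resolving a name
// from inside a function.
var scale = 3;

function sumScaled(values) {
  // Copy the outer variable into a local once; inside the loop,
  // `localScale` is resolved in the function's own scope.
  var localScale = scale;
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i] * localScale;
  }
  return total;
}

function shadowDemo() {
  var scale = 10; // a local with the same name is found first
  return scale;
}
```

How much this saves depends on the engine, so it is worth measuring before restructuring code around it.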

Avoid with Statements

The with statement extends the scope chain temporarily with a computed object,

executes a statement with this longer scope chain, and then restores the original scope

chain. This can save you typing time, but cost you execution time. Each additional

child node you refer to means more work for the browser in scanning the global

namespace of your document. So instead of this:

with (document.formname) {
  field1.value = "one";
  field2.value = "two";...
}

Do this:

var form = document.formname;
form.field1.value = "one";
form.field2.value = "two";

Cache the object or property reference instead of using with, and use this variable for repeated references. The with statement is also forbidden in ECMAScript 5 strict mode, so it is best avoided.

Add Complex Subtrees Offline

When you are adding complex content to your page (like a table), you will find it is

faster to build your DOM node and all its sub-nodes offline before adding it to the

document. So instead of this (see Code Sample 1):

Code Sample 1 Adding Complex Subtrees Online

var tableEl, rowEl, cellEl, i, j;
var numRows = 10;
var numCells = 5;
tableEl = document.createElement("TABLE");
// The table is attached to the live document before it is filled in.
tableEl = document.body.appendChild(tableEl);
for (i = 0; i < numRows; i++) {
  rowEl = document.createElement("TR");
  for (j = 0; j < numCells; j++) {
    cellEl = document.createElement("TD");
    cellEl.appendChild(document.createTextNode("[row " + i + " cell " + j + "]"));
    rowEl.appendChild(cellEl);
  }
  tableEl.appendChild(rowEl);
}

Do this (see Code Sample 2):

Code Sample 2 Adding Complex Subtrees Offline

var tableEl, rowEl, cellEl, i, j;
var numRows = 10;
var numCells = 5;
tableEl = document.createElement("TABLE");
for (i = 0; i < numRows; i++) {
  rowEl = document.createElement("TR");
  for (j = 0; j < numCells; j++) {
    cellEl = document.createElement("TD");
    cellEl.appendChild(document.createTextNode("[row " + i + " cell " + j + "]"));
    rowEl.appendChild(cellEl);
  }
  tableEl.appendChild(rowEl);
}
// The finished table is attached to the document in a single step.
document.body.appendChild(tableEl);

Code Sample 1 adds the table object to the page immediately after it is created and adds the rows afterward. This runs much slower because the browser must update the page display every time a new row is added. Code Sample 2 runs faster because it adds the resulting table object last, via document.body.appendChild().

Edit Subtrees Offline

In a similar fashion, when you are manipulating subtrees of a document, first

remove the subtree, modify it, and then re-add it. DOM manipulation causes large

parts of the tree to recalculate the display, slowing things down. Also,

createElement() is slow compared to cloneNode(). When possible, create a

template subtree, and then clone it to create others, only changing what is necessary.

Let's combine these two optimizations into one example. So instead of this (see Code

Sample 3):

Code Sample 3 Editing Subtrees Online

var ul = document.getElementById("myUL");

for (var i = 0; i < 200; i++) {

ul.appendChild(document.createElement("LI"));

}

Do this (see Code Sample 4):

Code Sample 4 Editing Subtrees Offline

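var ul = document.getElementById("myUL");
var li = document.createElement("LI");
var parent = ul.parentNode;
var next = ul.nextSibling; // remember where the list sat among its siblings
parent.removeChild(ul);
for (var i = 0; i < 200; i++) {
  ul.appendChild(li.cloneNode(true));
}
parent.insertBefore(ul, next); // re-add the list at its original position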


By editing your subtrees offline, you'll realize significant performance gains. The more complex the source document, the better the gain. Substituting cloneNode for createElement adds an extra boost.
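A related option, not used in the samples above, is the standard DOM `DocumentFragment`: the new nodes are collected in a detached fragment and attached to the live list in one call, so the list itself never has to leave the page. The following is a sketch under that assumption; `appendItems` is a hypothetical helper that takes the document as a parameter.

```javascript
// Build `count` list items inside a detached DocumentFragment, then
// attach them to the live list with a single appendChild call.
// `doc` is the document object; `list` is an element already in the page.
function appendItems(doc, list, count) {
  const fragment = doc.createDocumentFragment();
  const template = doc.createElement("LI");
  for (let i = 0; i < count; i++) {
    // Clone the template instead of creating each element from scratch.
    fragment.appendChild(template.cloneNode(true));
  }
  // Appending a fragment moves all of its children into the list.
  list.appendChild(fragment);
  return list;
}

// In a page this might be called as:
// appendItems(document, document.getElementById("myUL"), 200);
```

Whether this beats removing and re-adding the subtree depends on the browser and the page, so it is worth timing both.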

Concatenate Long Strings

By the same token, avoid multiple document.writes in favor of one

document.write of a concatenated string. So instead of this:

document.write(' string 1');

document.write(' string 2');

document.write(' string 3');

document.write(' string 4');

Do this:

var txt = ' string 1'+

' string 2'+

' string 3'+

' string 4';

document.write(txt);
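When the pieces are produced in a loop rather than written out by hand, one common pattern is to collect them in an array and join them once at the end. This is a sketch with a hypothetical `buildRows` helper; it is not claimed to be faster than `+` in every engine, only to keep the output to a single write.

```javascript
// Collect the pieces in an array and join them once at the end.
function buildRows(count) {
  const parts = [];
  for (let i = 1; i <= count; i++) {
    parts.push("<li>item " + i + "</li>");
  }
  return "<ul>" + parts.join("") + "</ul>";
}

// In a page, the result could then be written with one call,
// for example: document.write(buildRows(4));
```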

Access NodeLists Directly

NodeLists are lists of elements from object properties like .childNodes and

methods like getElementsByTagName(). Because these objects are live

(updated immediately when the underlying document changes), they are memory

intensive and can take up many CPU cycles. If you need a NodeList for only a

moment, it is faster to index directly into the list. Browsers are optimized to access

node lists this way. So instead of this:


nl = document.getElementsByTagName("P");

for (var i = 0; i < nl.length; i++) {

p = nl[i];

}

Do this:

for (var i = 0; (p = document.getElementsByTagName("P")[i]); i++) {
  // work with p here
}

In most cases, this is faster than caching the NodeList. In the second example, the

browser doesn't need to create the node list object. It needs only to find the element at

index i at that exact moment.

Use Object Literals

Object literals work like array literals by assigning entire complex data types to

objects with just one command. So instead of this:

car = new Object();

car.make = "Honda";

car.model = "Civic";

car.transmission = "manual";

car.miles = 1000000;

car.condition = "needs work";

Do this:

car = {
  make: "Honda",
  model: "Civic",
  transmission: "manual",
  miles: 1000000,
  condition: "needs work"
}

This saves space and avoids repeated references to the object.

Local Optimizations

Okay, you've switched to a better algorithm and revamped your data structure.

You've refactored your code and minimized DOM interaction, but speed is still an

issue. It is time to tune your code by tweaking loops and expressions to speed up hot

spots. In his classic book, Writing Efficient Programs (Prentice Hall, 1982), Jon

Bentley revealed 27 optimization guidelines for writing efficient programs. These

code-tuning rules are actually low-level refactorings that fall into five categories:

space for time and vice versa, loops, logic, expressions, and procedures. In this

section, I touch on some highlights.

Trade Space for Time

Many of the optimization techniques you can read about in Bentley's book and

elsewhere trade space (more code) for time (more speed). You can add more code to

your scripts to achieve higher speed by "defactoring" hot spots to run faster. By

augmenting objects to store additional data or making it more easily accessible, you

can reduce the time required for common operations.

In JavaScript, however, any additional speed should be balanced against any

additional program size. Optimize hot spots, not your entire program. You can

compensate for this tradeoff by packing and compressing your scripts.

Augment Data Structures

Douglas Bagnall employed data structure augmentation in the minuscule 5K chess game that he created for the 2002 5K contest (http://www.the5k.org/). Bagnall used augmented data structures and binary arithmetic to make his game fast and small. The board consists of a 120-element array, containing numbers representing either pieces, empty squares, or "off-the-board" squares. The off-the-board squares speed up the testing of the sides, preventing bishops, etc., from wrapping from one edge to the other while they're moving, without expensive positional tests.

Each element in his 120-item linear array contains a single number that represents

the status of each square. So instead of this:

board=[16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,16,

16,16,2,3,4,5,6,2,3,4,5,16,....]

He did this:

bstring="ggggggggggggggggggggg23456432gg11111111gg0000 ... g";
board=[]; // declared here so the snippet stands alone
for (z=0;z<120;z++){
  board[z]=parseInt(bstring.charAt(z),35);
}

This base-35 value represents the squares on the board (parseInt using a radix of

35). As alpha "g" corresponds to 16 (the 5th bit; that is, bit 4), Bagnall says he

actually could have used base-17 instead of 35. Perhaps this will leave room for future

enhancements.

Each position on the board is encoded like this:

bit 4 (16): 0 = on board, 1 = off board.

bit 3 (8): 0 = white, 1 = black.

bits 0-2(7): 0 = empty, non-zero = the piece type:

1 - pawn

2 - rook

3 - knight

4 - bishop

5 - queen


6 - king

So to test the color of a piece, movingPiece, you'd use the following:

ourCol=movingPiece & 8; // what color is it? 8=black, 0=white
movingPiece &= 7; // now we have the color info, dump it.
if(movingPiece > 1){ // If it is not a pawn.

Bagnall also checks that the piece exists (because the preceding code will return

white for an empty square), so he checks that movingPiece is non-empty. To see his

code and the game in action, visit the following sites:

http://halo.gen.nz/chess/

http://halo.gen.nz/chess/main-branch/ (the actual code)
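To make the bit layout listed above concrete, here is a small decoding sketch. It is not Bagnall's code; `decodeSquare`, `PIECES`, and `offBoardValue` are names introduced for illustration, and only the masks (16, 8, and 7) come from the encoding described in the text.

```javascript
// Piece names indexed by the value of bits 0-2.
const PIECES = ["empty", "pawn", "rook", "knight", "bishop", "queen", "king"];

// Decode one board value using the bit layout described above.
function decodeSquare(value) {
  return {
    offBoard: (value & 16) !== 0,             // bit 4
    color: (value & 8) ? "black" : "white",   // bit 3
    piece: PIECES[value & 7]                  // bits 0-2
  };
}

// parseInt with radix 35 maps the character "g" to 16,
// which is the off-board marker.
const offBoardValue = parseInt("g", 35);
```

Note that an empty square (value 0) decodes as "white" and "empty", which is why the piece must be checked as well as the color.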

Cache Frequently Used Values

One of the most effective techniques you can use to speed up your JavaScript is to
cache frequently used values. When you cache frequently used expressions and
objects, you do not need to recompute them. So instead of this (see Code Sample 5):

Code Sample 5 A Loop That Needs Caching and Fewer Evaluations

var d=35;

for (var i=0; i<1000; i++) {

y += Math.sin(d)*10;

}

Do this (see Code Sample 6):

Code Sample 6 Caching Complex Calculations Out of a Loop

var d=35;

var math_sind = Math.sin(d)*10;

for (var i=0; i<1000; i++) {

y += math_sind;

}

Because Math is a global object, declaring the math_sind variable also avoids

resolving to a global object for each iteration. You can combine this technique with

minimizing DOM interaction by caching frequently used object or property

references. Simplify the calculations within your loops and their conditionals.
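
When the argument changes on every pass, the result cannot be moved out of the loop, but the lookup of the global Math object still can. The sketch below is an illustration only: sumSines is a hypothetical function, not one of the original code samples, and whether the alias is measurably faster depends on the JavaScript engine.

```javascript
// Hypothetical example: keep a local reference to Math.sin so the loop body
// does not have to resolve the global Math object on every pass.
function sumSines(count) {
  var sin = Math.sin; // local alias, resolved once
  var y = 0;
  for (var i = 0; i < count; i++) {
    y += sin(i) * 10;
  }
  return y;
}
```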

Cache your objects

One of the best kept secrets to boosting script performance is to cache your objects.

Often times, your script will repeatedly access a certain object, as in the following

demonstration:

<script type="text/javascript">

for (i=0;i<document.images.length;i++)

document.images[i].src="blank.gif"

</script>

In the above, the object "document.images" is accessed multiple times. The
code is inefficient, since the browser must dynamically look up
"document.images" twice during each pass through the loop (once to test
i<document.images.length, and once to access and change the image's src). If you
have 10 images on the page, for example, that's 20 calls to the Images object right
there. Excessive calls to JavaScript objects can wear down the browser, not to
mention your computer's memory.

The term "cache your object" means storing a repeatedly accessed object inside a
user-defined variable, and using that variable instead in subsequent references to the
object. The performance improvement can be significant. Here's a modified version of
the initial script using object caching:

<script type="text/javascript">

var theimages=document.images

for (i=0;i<theimages.length;i++)

theimages[i].src="blank.gif"

</script>

Not only is the number of times document.images[] is referenced cut in half with

the above, but for each time it is referenced, the browser doesn't have to go through

document.images first, but goes straight to its containing array.

Remember to use object caching when calling highly nested DHTML objects, like

document.all.myobject, or document.layers.firstlayer etc.
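
The saving can be made visible outside a browser with a stand-in object. In this sketch, fakeDocument is a hypothetical replacement for document, and its images property is a getter that counts how often it is looked up. Real DOM lookups cost more than this getter, so the sketch shows only the number of lookups, not their cost.

```javascript
// Stand-in for document.images so the sketch runs without a browser.
var lookups = 0;
var images = [];
for (var k = 0; k < 10; k++) {
  images.push({ src: 'photo' + k + '.gif' });
}
var fakeDocument = {};
Object.defineProperty(fakeDocument, 'images', {
  get: function () { lookups++; return images; }
});

// Looks up fakeDocument.images in the test and in the body of every pass.
function blankUncached() {
  lookups = 0;
  for (var i = 0; i < fakeDocument.images.length; i++) {
    fakeDocument.images[i].src = 'blank.gif';
  }
  return lookups;
}

// Looks up fakeDocument.images once and reuses the cached reference.
function blankCached() {
  lookups = 0;
  var theimages = fakeDocument.images;
  for (var i = 0; i < theimages.length; i++) {
    theimages[i].src = 'blank.gif';
  }
  return lookups;
}
```

With 10 stand-in images, the uncached loop performs two lookups per pass plus one for the final failed test, while the cached loop performs a single lookup.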

Cache your scripts

You've "cashed in" your objects... another way to enhance script performance is to
cache the entire script, by including it in a .js file. The technique causes the browser to
load the script in question only once, and recall it from cache should the page be
reloaded or revisited.

<script type="text/javascript"

src="imagescript.js"></script>

Use script caching when a script is extremely large, or embedded across multiple

pages.

Understand the cost of your objects

The fact is, some JavaScript objects are less forgiving on the browser than others.
Identifying exactly which ones isn't easy (and isn't the goal here), but just being
aware of this fact is important.

Take, for example, these two properties:

-object.innerText //IE only

-object.innerHTML

Did you know that the second property demands multiple times the system

resources to call than the first? If all you're changing is the textual content of a <div>

or <span> and in IE only, innerText would definitely be the more efficient choice.

Other examples are the CSS properties "display" and "visibility"; the former is
significantly more expensive than the latter.

Store Precomputed Results

For expensive functions (like sin()), you can precompute values and store the

results. You can use a lookup table (O(1)) to handle any subsequent function calls

instead of recomputing the function (which is expensive). So instead of this:

function foo(i) {

if (i < 10) {return i * i - i;}

}

Do this:

values = [0*0-0, 1*1-1, 2*2-2, ..., 9*9-9];

function foo(i) {

if (i < 10) {return values[i];}

}

This technique is often used with trigonometric functions for animation purposes.

A sine wave makes an excellent approximation of the acceleration and deceleration of

a body in motion:

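var sin = []; // lookup table, indexed by whole degrees

for (var i=1; i<=360; i++) {

sin[i] = Math.sin(i * Math.PI / 180); // Math.sin expects radians

}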

In JavaScript, this technique is less effective than it is in a compiled language like

C. Unchanging values are computed at compile time in C, while in an interpreted

language like JavaScript, they are computed at runtime.
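
A lookup table of this kind might be wrapped as follows. This is a sketch under stated assumptions: angles are whole degrees, the names SIN_TABLE and fastSin are hypothetical, and degrees are converted to radians because Math.sin expects radians.

```javascript
// Precompute the sine of every whole degree once.
var SIN_TABLE = [];
for (var deg = 0; deg < 360; deg++) {
  SIN_TABLE[deg] = Math.sin(deg * Math.PI / 180);
}

// Look the value up instead of recomputing it. Angles outside 0-359,
// including negative angles, are wrapped into range first.
function fastSin(degrees) {
  return SIN_TABLE[((degrees % 360) + 360) % 360];
}
```

The table costs 360 numbers of memory, which is the space traded for time.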

Use Local versus Global Variables

Reducing the scope of your variables is not only good programming practice, it is

faster. So instead of this (see Code Sample 7):

Code Sample 7 Loop with Global Variable

function MyInnerLoop(){

for(i=0;i<1000;i++);

}

Do this (see Code Sample 8):

Code Sample 8 Loop with Local Variable

function MyInnerLoop(){

for(var i=0;i<1000;i++);

}

Local variables are 60 percent to 26 times faster than global variables for tight

inner loops. This is due in part to the fact that global variables require more time to

search up the function's scope chain. Local variables are properties of the function's

call object and are searched first. Netscape 6 in particular is slow in using global

variables. Mozilla 1.1 has improved speed, but this technique is relevant to all

browsers. See Scott Porter's local versus global test at
http://javascript-games.org/articles/local_global_bench.html.

Trade Time for Space

Conversely, you can trade time for space complexity by densely packing your data

and code into a more compact form. By recomputing information, you can decrease

the space requirements of a program at the cost of increased execution time.

Packing

Packing decreases storage and transmission costs by increasing the time to

compact and retrieve the data. Sparse arrays and overlaying data into the same space

at different times are two examples of packing. Removing spaces and comments are

two more examples of packing. Substituting shorter strings for longer ones can also

help pack data into a more compact form.
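
To make the trade concrete, here is a small sketch of one packing scheme, run-length encoding, applied to a string like the chess board shown earlier. Run-length encoding is chosen here purely for illustration and is not a technique named in the text, and the function names pack and unpack are hypothetical. The packed form is shorter to store and send, but costs extra time to expand.

```javascript
// Pack a string as pairs: the character, then its run length as one
// base-36 digit (so a single run is capped at 35 characters).
function pack(text) {
  var out = '';
  var i = 0;
  while (i < text.length) {
    var ch = text.charAt(i);
    var run = 1;
    while (run < 35 && text.charAt(i + run) === ch) {
      run++;
    }
    out += ch + run.toString(36);
    i += run;
  }
  return out;
}

// Expand the pairs again; this is the time paid for the saved space.
function unpack(packed) {
  var out = '';
  for (var i = 0; i < packed.length; i += 2) {
    var run = parseInt(packed.charAt(i + 1), 36);
    for (var k = 0; k < run; k++) {
      out += packed.charAt(i);
    }
  }
  return out;
}
```

Data with few repeated characters can grow under this scheme, so it only pays off for inputs with long runs.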

Interpreters

Interpreters reduce program space requirements by replacing common sequences

with more compact representations.

Some 5K competitors (http://www.the5k.org/) combine these two techniques to

create self-extracting archives of their JavaScript pages, trading startup speed for

smaller file sizes (http://www.dithered.com/experiments/compression/). See Chapter

9, "Optimizing JavaScript for Download Speed," for more details.

Optimize Loops

Most hot spots are inner loops, which are commonly used for searching and

sorting. There are a number of ways to optimize the speed of loops: removing or

simplifying unnecessary calculations, simplifying test conditions, loop flipping and

unrolling, and loop fusion. The idea is to reduce the cost of loop overhead and to

include only repeated calculations within the loop.

Combine Tests to Avoid Compound Conditions

"An efficient inner loop should contain as few tests as possible, and preferably

only one."14 Try to simulate exit conditions of the loop by other means. One

technique is to embed sentinels at the boundary of data structures to reduce the cost of

testing searches. Sentinels are commonly used for arrays, linked lists, and binary

search trees. In JavaScript, however, arrays have the length property built-in, at least

after version 1.2, so array boundary sentinels are more useful for arrays in languages

like C.
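
Even so, the idea carries over. A minimal sketch of a sentinel search follows. The helper name sentinelIndexOf is hypothetical, and it temporarily appends the target to the array so that the inner loop needs only one test instead of two.

```javascript
// Linear search with a sentinel: because the target is guaranteed to be
// found, the loop does not need a separate bounds check.
function sentinelIndexOf(list, target) {
  var n = list.length;
  list[n] = target;   // place the sentinel after the last real item
  var i = 0;
  while (list[i] !== target) {
    i++;
  }
  list.length = n;    // remove the sentinel again
  return i < n ? i : -1;
}
```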

One example from Scott Porter of JavaScript-Games.org is splitting an array of

numeric values into separate arrays for extracting the data for a background collision

map in a game. The following example of using sentinels also demonstrates the

efficiency of the switch statement:

var serialData=new Array(-1,10,23,53,223,-1,32,98,45,32,32,25,

-1,438,54,26,84,-1,487,43,11);

var splitData=new Array();

function init(){

var ix=-1,n=0,s,l=serialData.length;

for(;n<l;n++){

s=serialData[n];

switch(s){ // switch blocks are much more efficient

case -1 : // than if... else if... else if...

splitData[++ix]=new Array();

break;

default :

splitData[ix].push(s);

}

}

alert(splitData.length);

}

Scott Porter explains the preceding code using some assembly language and the

advantage of using the switch statement:

"Here, -1 is the sentinel value used to split the data blocks. Switch blocks should

always be used where possible, as it's so much faster than an if—else series. This is

because with the if else statements, a test must be made for each "if" statement,

whereas switch blocks generate vector jump tables at compile time so NO test is

actually required in the underlying code! It's easier to show with a bit of assembly

language code. So an if/else statement:

if(n==12)

someBlock();

else if(n==26)

someOtherBlock();

becomes something like this in assembly:

cmp eax,12;

jz someBlock;

cmp eax,26;

jz someOtherBlock;

Whereas a switch statement:

switch(a){

case 12 :

someBlock();

break;

case 26 :

someOtherBlock();

break;

}

becomes something like this in assembly:

jmp [VECTOR_LIST+eax];

where VECTOR_LIST would be a list of pointers to the address of the start of the

someBlock and someOtherBlock functions. At least this would be the method if the

switch were based on a numeric value. For string values I'd imagine eax would be

replaced by a pointer to the location of a string for the comparison.

As you can see, the longer the if...else if... block became, the more efficient the

switch block would become in comparison."15
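
Whether a given JavaScript engine actually compiles a switch into a jump table is not guaranteed. If you want table-style dispatch regardless of the engine, one option is an explicit lookup table of functions, sketched below. The names handlers and dispatch are hypothetical, and someBlock and someOtherBlock are stubbed so that the example runs.

```javascript
// Stubs standing in for the two blocks from the quoted example.
function someBlock() { return 'someBlock ran'; }
function someOtherBlock() { return 'someOtherBlock ran'; }

// The table plays the role of VECTOR_LIST: one entry per handled value.
var handlers = {};
handlers[12] = someBlock;
handlers[26] = someOtherBlock;

function dispatch(n) {
  // One table lookup replaces the chain of comparisons.
  if (Object.prototype.hasOwnProperty.call(handlers, n)) {
    return handlers[n]();
  }
  return null; // no handler registered for this value
}
```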

Next, let's look at some ways to minimize loop overhead. Using the right

techniques, you can speed up a for loop by two or even three times.

Hoist Loop-Invariant Code

Move loop-invariant code out of loops (otherwise called code motion out of
loops) to speed their execution. Rather than recomputing the same value in each
iteration, move it outside the loop and compute it only once. So instead of this:

for (i=0;i<iter;i++) {

d=Math.sqrt(y);

j+=i*d;

}

Do this:

d=Math.sqrt(y);

for (i=0;i<iter;i++) {

j+=i*d;

}

Reverse Loops

Reversing loop conditions so that they count down instead of up can double the

speed of loops. Counting down to zero with the decrement operator (i--) is faster than

counting up to a number of iterations with the increment operator (i++). So instead of

this (see Code Sample 9):

Code Sample 9 A Normal for Loop Counts Up

function loopNormal() {

for (var i=0;i<iter;i++) {

// do something here

}

}

Do this (see Code Sample 10):

Code Sample 10 A Reversed for Loop Counts Down

function loopReverse() {

for (var i=iter;i>0;i--) {

// do something here

}

}
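
When the loop body indexes an array, the reversed counter has to run from length-1 down to 0 rather than from iter down to 1. A sketch, with hypothetical function names, that sums an array both ways:

```javascript
// Counting up: the test compares i with the length on every pass.
function sumForward(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Counting down: i-- is both the test and the decrement, so inside the
// body i runs from values.length-1 down to 0.
function sumReverse(values) {
  var total = 0;
  for (var i = values.length; i--;) {
    total += values[i];
  }
  return total;
}
```

Reversal is only safe when the order of processing does not matter.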

Flip Loops

Loop flipping moves the loop conditional from the top to the bottom of the loop.
The theory is that the do while construct is faster than a for loop. So a normal loop
(see Code Sample 9) would look like this flipped (see Code Sample 11):

Code Sample 11 A Flipped Loop Using do while

function loopDoWhile() {

var i=0;

do

{

i++;

}

while (i<iter);

}

In JavaScript, however, this technique gives poor results. IE 5 Mac gives

inconsistent results, while IE and Netscape for Windows are 3.7 to 4 times slower.

The problem is the complexity of the conditional and the increment operator.

Remember that we're measuring loop overhead here, so small changes in structure and

conditional strength can make a big difference. Instead, combine the flip with a

reverse count (see Code Sample 12):

Code Sample 12 Flipped Loop with Reversed Count

function loopDoWhileReverse() {

var i=iter;

do

{

i--;

}

while (i>0);

}

This technique is more than twice as fast as a normal loop and slightly faster than a
flipped loop in IE5 Mac. Even better, simplify the conditional even more by using the
decrement as a conditional like this (see Code Sample 13):

Code Sample 13 Flipped Loop with Improved Reverse Count

function loopDoWhileReverse2() {

var i=iter-1;

do

{

// do something here

}

while (i--);

}

This technique is over three times faster than a normal for loop. Note the

decrement operator doubles as a conditional; when it gets to zero, it evaluates as false.

One final optimization is to substitute the pre-decrement operator for the
post-decrement operator for the conditional (see Code Sample 14).

Code Sample 14 Flipped Loop with Optimized Reverse Count

function loopDoWhileReverse3() {

var i=iter;

do

{

// do something here

}

while (--i);

}

This technique is over four times faster than a normal for loop. This last condition
assumes that i is greater than zero. Table 10.2 shows the results for each loop type
listed previously for IE5 on my Mac PowerBook.

Table 10.2 Loop Optimizations Compared

                  Normal   Do while   Reverse   Do while   Do while   Do while
                                                Reverse    Reverse2   Reverse3
Total time (ms)   2022     1958       1018      932        609        504
Cycle time (ms)   0.0040   0.0039     0.0020    0.0012     0.0012     0.0010
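
These figures come from one browser on one machine, so it is worth re-measuring in your own target environment. A minimal timing harness might look like this. The helper timeLoop is hypothetical, and the two loop functions are counting versions of Code Samples 9 and 14 that take the iteration count as a parameter.

```javascript
// Counting version of the normal for loop (Code Sample 9).
function loopNormal(iter) {
  var count = 0;
  for (var i = 0; i < iter; i++) {
    count++;
  }
  return count;
}

// Counting version of the flipped, reversed loop (Code Sample 14).
// Assumes iter is greater than zero.
function loopDoWhileReverse3(iter) {
  var count = 0;
  var i = iter;
  do {
    count++;
  } while (--i);
  return count;
}

// Returns the elapsed wall-clock time in milliseconds for one run.
function timeLoop(loopFn, iter) {
  var start = new Date().getTime();
  loopFn(iter);
  return new Date().getTime() - start;
}
```

Calling timeLoop(loopNormal, 500000) and timeLoop(loopDoWhileReverse3, 500000) several times and comparing the results gives a rough picture. Single runs are noisy, so repeat each measurement.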

Unroll or Eliminate Loops

Unrolling a loop reduces the cost of loop overhead by decreasing the number of

times you check the loop condition. Essentially, loop unrolling increases the number

of computations per iteration. To unroll a loop, you perform two or more of the same

statements for each iteration, and increment the counter accordingly. So instead of

this:

var iter = number_of_iterations;

for (var i=0;i<iter;i++) {

foo();

}

Do this:

var iter = multiple_of_number_of_unroll_statements;

for (var i=0;i<iter;) {

foo();i++;

foo();i++;

foo();i++;

foo();i++;

foo();i++;

foo();i++;

}

I've unrolled this loop six times, so the number of iterations must be a multiple of

six. The effectiveness of loop unrolling depends on the number of operations per

iteration. Again, the simpler, the better. For simple statements, loop unrolling in

JavaScript can speed inner loops by as much as 50 to 65 percent. But what if the

number of iterations is not known beforehand? That's where techniques like Duff's

Device come in handy.

Duff's Device

Invented by programmer Tom Duff while he was at Lucasfilm Ltd. in 1983,16

Duff's Device generalizes the loop unrolling process. Using this technique, you can

unroll loops to your heart's content without knowing the number of iterations

beforehand. The original algorithm combined a do-while and a switch statement. The

technique combines loop unrolling, loop reversal, and loop flipping. So instead of this
(see Code Sample 15):

Code Sample 15 Normal for Loop

testVal=0;

iterations=500125;

for (var i=0;i<iterations;i++) {

// modify testVal here

}

16. Tom Duff, "Tom Duff on Duff's Device" [electronic mailing list], (Linköping,
Sweden: Lysator Academic Computer Society, 10 November 1983 [archived
reproduction]), available from the Internet at
http://www.lysator.liu.se/c/duffs-device.html. Duff describes the loop unrolling
technique he developed while at Lucasfilm Ltd.

Do this (see Code Sample 16):

Code Sample 16 Duff's Device

function duffLoop(iterations) {

var testVal=0;

// Begin actual Duff's Device

// Original JS Implementation by Jeff Greenberg 2/2001

var n = iterations / 8;

var caseTest = iterations % 8;

do {

switch (caseTest)

{

case 0: [modify testVal here];

case 7: [ditto];

case 6: [ditto];

case 5: [ditto];

case 4: [ditto];

case 3: [ditto];

case 2: [ditto];

case 1: [ditto];

}

caseTest=0;

}

while (--n > 0);

}

Like a normal unrolled loop, the number of loop iterations (n = iterations/8) is a

multiple of the degree of unrolling (8, in this example). Unlike a normal unrolled

loop, the modulus (caseTest = iterations % 8) handles the remainder of any leftover

iterations through the switch/case logic. This technique is 8 to 44 percent faster in

IE5+, and it is 94 percent faster in NS 4.7.
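
Code Sample 16 leaves the loop body as a placeholder. A runnable sketch that applies the same structure to the items of an array is shown below. The names duffForEach and process are hypothetical, Math.ceil is used so that n is a whole number of passes, and an empty array is handled up front because the do while body always runs at least once.

```javascript
// Duff's Device applied to an array: the first pass handles the
// remainder (length % 8), every later pass handles a full group of 8.
function duffForEach(items, process) {
  var iterations = items.length;
  if (iterations === 0) {
    return;
  }
  var n = Math.ceil(iterations / 8);
  var caseTest = iterations % 8;
  var i = 0;
  do {
    switch (caseTest) { // cases fall through on purpose
      case 0: process(items[i++]);
      case 7: process(items[i++]);
      case 6: process(items[i++]);
      case 5: process(items[i++]);
      case 4: process(items[i++]);
      case 3: process(items[i++]);
      case 2: process(items[i++]);
      case 1: process(items[i++]);
    }
    caseTest = 0;
  } while (--n > 0);
}
```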

Fast Duff's Device

You can avoid the complex do/switch logic by unrolling Duff's Device into two
loops. So instead of the original, do this (see Code Sample 17):

Code Sample 17 Fast Duff's Device

function duffFastLoop8(iterations) {

// from an anonymous donor to Jeff Greenberg's site

var testVal=0;

var n = iterations % 8;

while (n--)

{

testVal++;

}

n = parseInt(iterations / 8);

while (n--)

{

testVal++;

testVal++;

testVal++;

testVal++;

testVal++;

testVal++;

testVal++;

testVal++;

}

}

This technique is about 36 percent faster than the original Duff's Device on IE5
Mac. Even better, optimize the loop constructs by converting the while decrement to a
do while pre-decrement like this (see Code Sample 18):

Code Sample 18 Faster Duff's Device

function duffFasterLoop8(iterations) {

var testVal=0;

var n = iterations % 8;

if (n>0) {

do

{

testVal++;

}

while (--n); // n must be greater than 0 here

}

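n = parseInt(iterations / 8);

if (n>0) { // guard: with fewer than 8 iterations, --n would never reach 0

do

{

testVal++;

testVal++;

testVal++;

testVal++;

testVal++;

testVal++;

testVal++;

testVal++;

}

while (--n); // n must be greater than 0 here

}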

}

This optimized Duff's Device is 39 percent faster than the original and 67 percent
faster than a normal for loop (see Table 10.3).

Table 10.3 Duff's Device Improved

500,125 iterations   Normal for loop   Duff's Device   Duff's Fast   Duff's Faster
Total time (ms)      1437              775             493           469
Cycle time (ms)      0.00287           0.00155         0.00099       0.00094

How Much to Unroll?

To test the effect of different degrees of loop unrolling, I tested large iteration
loops with between 1 and 15 identical statements for the Faster Duff's Device. Table
10.4 shows the results.

Table 10.4 Faster Duff's Device Unrolled

Degree            1         2         3         4         5         6         7         8         9
Total time (ms)   925       661       576       533       509       490       482       469       467
Cycle time (ms)   0.00184   0.00132   0.00115   0.00106   0.00101   0.00097   0.00096   0.00093   0.00093

As you can see in Table 10.4, the effect diminishes as the degree of loop unrolling

increases. Even after two statements, the time to loop through many iterations is less

than 50 percent of a normal for loop. Around seven statements, the time is cut by two-

thirds. Anything over eight reaches a point of diminishing returns. Depending on your

requirements, I recommend that you choose to unroll critical loops by between four

and eight statements for Duff's Device.

Fuse Loops

If you have two loops in close proximity that use the same number of iterations

(and don't affect each other), you can combine them into one loop. So instead of this:

for (i=0; i<j; i++) {

sumserv += serv(i);

}

for (i=0; i<j; i++) {

prodfoo *= foo(i);

}

Do this:

for (i=0; i<j; i++) {

sumserv += serv(i);

prodfoo *= foo(i);

}

Fusing loops avoids the additional overhead of another loop control structure and is

more compact.

Loop benchmarking test suite result

The loop benchmark at http://blogs.sun.com/greimer/resource/loop-test.html# shows
the difference in performance between different ways of coding loops in JavaScript.

Accessing the length property is more expensive on HTML collections than on
arrays, depending on the browser. In those cases, caching it made a huge difference.
However, HTML collections are live, so a cached value may fail if the underlying
DOM is modified during looping. On the other hand, HTML collections will never be
sparse, so the best way to loop an HTML collection might just be to ignore the length
property altogether and combine the test with the item lookup.
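
The combined test-and-lookup form can be sketched as follows. A plain array of objects stands in for the HTML collection so that the example runs outside a browser, and collectTagNames is a hypothetical helper. The loop stops at the first falsy item, which is safe for collections of element nodes but not for arrays that may contain 0, "" or null.

```javascript
// Loop without touching the length property: the assignment in the test
// fetches the next item and ends the loop when the lookup comes back empty.
function collectTagNames(nodes) {
  var names = [];
  for (var i = 0, node; node = nodes[i++];) {
    names.push(node.tagName);
  }
  return names;
}
```

In a browser the same loop could be handed a live collection such as the result of document.getElementsByTagName.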

Test environment: Mozilla Firefox 3.0.14 / Windows XP SP3 / Celeron D 2.8 GHz

Native Array (length=1000, looped 100 times)

Basic for loop: 15ms
    for (var i=0; i<arr.length; i++) { }
For loop, but caching the length: 13ms
    for (var i=0, len=arr.length; i<len; i++) { }
While loop that imitates a for loop: 13ms
    var i = 0; while (i < arr.length) { i++; }
While loop that imitates a for loop, caching the length: 7ms
    var i = 0, len = arr.length; while (i < len) { i++; }
While loop in reverse, simplifying the test condition: 4ms
    var i = arr.length; while (i--) { }
do ... while loop in reverse: 4ms
    var i = arr.length-1; do { } while (i--);
for loop in reverse: 4ms
    for (var i=arr.length; i--;) { }
While looping by popping values (this fails on sparse arrays): 26ms
    var x; while (x = arr.pop()) { }
for ... in loop: 56ms
    for (var i in arr) { }
for ... in loop, with integer test: 258ms
    var isInt = /(^[0-9]$)|(^[1-9][0-9]+$)/; for (var i in arr) { if(!isInt.test(i)){continue;} }
For loop, testing on existence rather than length (this fails on sparse arrays): 8ms
    for (var i=0; arr[i]; i++) { }
For loop, testing on existence rather than length, plus array lookup: 23ms
    for (var i=0; arr[i]; i++) { var x = arr[i]; }
For loop, testing on existence rather than length, array lookup is combined with test: 10ms
    for (var i=0, x; x = arr[i++];) { }
For reference: 14ms
    for (var i=0, len=arr.length; i<len; i++) { var x = arr[i]; }
Array.forEach() native implementation: 17ms
    arr.forEach(function(x){});
For reference against forEach(): 29ms
    var f=function(x){}; for (var i=0, len=arr.length; i<len; i++) { f(arr[i]); }

Sparse Native Array (length=12705, sporadically populated with 1000 items,
looped 100 times)

Basic for loop: 245ms
    for (var i=0; i<sarr.length; i++) { }
For loop, but caching the length: 128ms
    for (var i=0, len=sarr.length; i<len; i++) { }
While loop that imitates a for loop: 225ms
    var i = 0; while (i < sarr.length) { i++; }
While loop that imitates a for loop, caching the length: 105ms
    var i = 0, len = sarr.length; while (i < len) { i++; }
While loop in reverse, simplifying the test condition: 43ms
    var i = sarr.length; while (i--) { }
do ... while loop in reverse: 47ms
    var i = sarr.length-1; do { } while (i--);
for loop in reverse: 55ms
    for (var i=sarr.length; i--;) { }
for ... in loop: 60ms
    for (var i in sarr) { }
for ... in loop, with integer test: 271ms
    var isInt = /(^[0-9]$)|(^[1-9][0-9]+$)/; for (var i in sarr) { if(!isInt.test(i)){continue;} }
Array.forEach() native implementation: 278ms
    sarr.forEach(function(x){});
For reference against forEach(): 590ms
    var f=function(x){}; for (var i=0, len=sarr.length; i<len; i++) { f(sarr[i]); }

HTML Collection (length=1000, looped 100 times)

Basic for loop: 189ms
    for (var i=0; i<hColl.length; i++) { }
For loop, but caching the length: 10ms
    for (var i=0, len=hColl.length; i<len; i++) { }
While loop that imitates a for loop: 187ms
    var i = 0; while (i < hColl.length) { i++; }
While loop that imitates a for loop, caching the length: 8ms
    var i = 0, len = hColl.length; while (i < len) { i++; }
While loop in reverse, simplifying the test condition: 4ms
    var i = hColl.length; while (i--) { }
do ... while loop in reverse: 3ms
    var i = hColl.length-1; do { } while (i--);
for loop in reverse: 5ms
    for (var i=hColl.length; i--;) { }
for ... in loop: 230ms
    for (var i in hColl) { }
for ... in loop, with integer test: 451ms
    var isInt = /(^[0-9]$)|(^[1-9][0-9]+$)/; for (var i in hColl) { if(!isInt.test(i)){continue;} }
For loop, testing on existence rather than length (this fails on sparse arrays): 289ms
    for (var i=0; hColl[i]; i++) { }
For loop, testing on existence rather than length, plus array lookup: 548ms
    for (var i=0; hColl[i]; i++) { var x = hColl[i]; }
For loop, testing on existence rather than length, array lookup is combined with test: 289ms
    for (var i=0, x; x = hColl[i++];) { }
For loop, testing on existence rather than length, array lookup is combined with test, item() instead of array brackets: 795ms
    for (var i=0, x; x = hColl.item(i++);) { }
For reference: 299ms
    for (var i=0, len=hColl.length; i<len; i++) { var x = hColl[i]; }

Expression Tuning

As regular expression connoisseurs can attest, tuning expressions themselves can

speed up things considerably. Count the number of operations within critical loops

and try to reduce their number and strength.

If the evaluation of an expression is costly, replace it with a less-expensive

operation. Assuming that a is greater than 0, instead of this:

a > Math.sqrt(b);

Do this:

a*a > b;

Or even better:

var c = a*a;

c>b;

Strength reduction is the process of simplifying expensive operations like

multiplication, division, and modulus into cheap operations like addition, OR, AND,

and shifting. Loop conditions and statements should be as simple as possible to

minimize loop overhead. Here's an example from Code Sample 10. So instead of this:

for (var i=iter;i>0;i--)

Do this:

var i=iter-1;

do {} while (i--);

This technique simplifies the test condition from an inequality to a decrement,

which also doubles as an exit condition once it reaches zero.
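
Two further strength reductions can be sketched in a few lines. The function names are hypothetical, and the bitwise forms assume non-negative integers that fit in 32 bits. As with the other techniques here, any gain depends on the JavaScript engine, so measure before and after.

```javascript
// a > Math.sqrt(b) rewritten without the square root.
// Assumes a is greater than 0 and b is not negative.
function exceedsRoot(a, b) {
  return a * a > b;
}

// n % 8 rewritten as a bitwise AND with 7.
function mod8(n) {
  return n & 7;
}

// Math.floor(n / 8) rewritten as a right shift by 3.
function div8(n) {
  return n >> 3;
}
```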

Miscellaneous Tuning Tips

You can use many techniques to "bum" CPU cycles from your code to cool down

hot spots. Logic rules include short-circuiting monotone functions, reordering tests to

place the least-expensive one first, and eliminating Boolean variables with if/else

logic. You also can shift bits to reduce operator strength, but the speed-up is minimal

and not consistent in JavaScript.

Be sure to pass arrays by reference because this method is faster in JavaScript. If a

routine calls itself last, you can adjust the arguments and branch back to the top,

saving the overhead of another procedure call. This is called removing tail recursion.
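
As an illustration of removing tail recursion, the sketch below shows a function that calls itself last and the equivalent loop. Both function names are hypothetical.

```javascript
// Tail-recursive form: the recursive call is the last thing the function does.
function sumToRecursive(n, total) {
  if (n === 0) {
    return total;
  }
  return sumToRecursive(n - 1, total + n);
}

// Same computation with the tail call removed: adjust the values and
// branch back to the top of the loop instead of calling again.
function sumToLoop(n) {
  var total = 0;
  while (n !== 0) {
    total += n;
    n = n - 1;
  }
  return total;
}
```

The loop version also avoids growing the call stack, so it keeps working for values of n that are far too large for the recursive version.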

Flash ActionScript Optimization

Like JavaScript, ActionScript is based on the ECMAScript standard. Unlike

JavaScript, the ActionScript interpreter is embedded within Macromedia's popular

Flash plug-in and has different performance characteristics than JavaScript. Although

the techniques used in this chapter will work for Flash, two additional approaches are

available to Flash programmers. You can speed up Flash performance by replacing

slower methods with the prototype command and hand-tune your code with Flasm.

Flasm is a command-line assembler/disassembler of Flash ActionScript bytecode.

It disassembles your entire SWF file, allowing you to perform optimizations by hand

and replace all actions in the original SWF with your optimized routines. See

http://flasm.sourceforge.net/#optimization for more information.

You can replace slower methods in ActionScript by rewriting these routines and

replacing the originals with the prototype method. The Prototype site

(http://www.layer51.com/proto/) provides free Flash functions redefined for speed or

flexibility. These functions boost performance for versions up to Flash 5. Flash MX

has improved performance, but these redefined functions can still help.

Bibliography

Website Optimization: Speed, Search Engine & Conversion Rate Secrets

Andrew King,

O'Reilly Media, Inc.; 1 edition (15 Jul 2008)

Image Optimization: How Many of These 7 Mistakes Are You Making

Stoyan Stefanov (Yahoo! Inc)

2:00pm Tuesday, 06/24/2008

http://en.oreilly.com/velocity2008/public/schedule/detail/2405

High Performance Web Sites - 14 Rules for Faster-Loading Web Sites

by Steve Souders

http://stevesouders.com/hpws/rules.php

Even Faster Web Sites

Steve Souders (Google)

http://sites.google.com/site/io/even-faster-web-sites

Optimization Impact

By Patrick Meenan

http://blog.patrickmeenan.com/

YSlow: Yahoo's Problems Are Not Your Problems

by Jeff Atwood

http://www.codinghorror.com/blog/archives/000932.html

http://performance.webpagetest.org:8080/

http://developer.yahoo.com/performance/

http://developer.yahoo.com/performance/rules.htm

http://www.ryandoherty.net/

http://www.hostscope.com/c/templature/

http://www.slideshare.net/stoyan/yslow-20-presentation

http://blogs.sun.com/greimer/resource/loop-test.html

http://www.thewojogroup.com/2008/10/10-easy-steps-to-great-website-optimization/

http://code.google.com/intl/es/speed/articles/

Other online JavaScript tutorials and forums/blogs about JavaScript and web page
performance.

Content

Web Client performance .......................................................................................... 1 Prologue ................................................................................................................... 2

The Pareto Principle ............................................................................................. 2 The importance of performance ........................................................................... 2

Advices for fast web page inspired by YSlow ......................................................... 3 1. Make Fewer HTTP Requests ........................................................................... 3 2. Use a Content Delivery Network ..................................................................... 3 3. Add an Expires Header .................................................................................... 4 4. Gzip Components ............................................................................................. 4 5. Put CSS at the Top ........................................................................................... 5 6. Move Scripts to the Bottom ............................................................................. 5 7. Avoid CSS Expressions ................................................................................... 5 8. Make JavaScript and CSS External .................................................................. 6 9. Reduce DNS Lookups ...................................................................................... 6 10. Minify JavaScript ........................................................................................... 6 11. Avoid Redirects .............................................................................................. 6 12. Remove Duplicate Scripts .............................................................................. 7 13. Configure ETags ............................................................................................ 7

Optimizing JavaScript For Execution Speed ..... 8
  Design Levels ..... 8
  Measure Your Changes ..... 9
  Algorithms and Data Structures ..... 10
  Refactor to Simplify Code ..... 11
  Minimize DOM Interaction and I/O ..... 11
    Minimize Object and Property Lookups ..... 12
    Shorten Scope Chains ..... 13
    Avoid with Statements ..... 13
  Add Complex Subtrees Offline ..... 14
  Edit Subtrees Offline ..... 16
  Concatenate Long Strings ..... 17
  Access NodeLists Directly ..... 17
  Use Object Literals ..... 18
  Local Optimizations ..... 19
  Trade Space for Time ..... 19
    Cache Frequently Used Values ..... 21
    Cache your objects ..... 22
    Cache your scripts ..... 23
    Understand the cost of your objects ..... 23
    Store Precomputed Results ..... 24
    Use Local versus Global Variables ..... 25
  Trade Time for Space ..... 25
    Packing ..... 25
    Interpreters ..... 26
  Optimize Loops ..... 26

    Combine Tests to Avoid Compound Conditions ..... 26
    Hoist Loop-Invariant Code ..... 28
    Reverse Loops ..... 29
    Flip Loops ..... 30
    Unroll or Eliminate Loops ..... 32
    Duff's Device ..... 33
    Fast Duff's Device ..... 35
    How Much to Unroll? ..... 38
    Fuse Loops ..... 38
  Loop benchmarking test suite result ..... 39
  Expression Tuning ..... 42
  Miscellaneous Tuning Tips ..... 43
  Flash ActionScript Optimization ..... 43

Bibliography ..... 45
Content ..... 46