JSZap: Compressing JavaScript Code
Martin Burtscher, UT Austin
Ben Livshits & Ben Zorn, Microsoft Research
Gaurav Sinha, IIT Kanpur
A Web 2.0 Application Dissected
• 70,000+ lines of JavaScript code downloaded
• 2,855 functions
• 1+ MB of code
• Talks to 14 backend services (traffic, images, directions, ads, …)
Lots of JavaScript being Transmitted
[Bar chart: fraction of download that is JavaScript, 0% to 100%, for www.live.com, spreadsheets.google, maps.live, chi.lexigame, hotmail, gmail, dropthings, maps.google, pageflakes, and bunny hunt]
Up to 85% of a Web 2.0 app is JavaScript code!
AJAX: Tension Headaches
• Execution can’t start without the code
• Move code to client for responsiveness
JavaScript on the Wire
[Diagram labels: JavaScript, crunch, gzip, gzip -d, parser, AST, JSZap]
JSZap Approach
• Represent JavaScript as AST instead of source
• Serialize the compressed AST
• Decompress directly into AST on client
• Use gzip as 2nd-level (de-)compressor
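The four steps above can be sketched on a toy tree. This is only an illustration under stated assumptions: the tuple-based AST, the node kinds, and the JSON container are made up for the example and are not JSZap's actual format, and `zlib` (DEFLATE, the algorithm inside gzip) stands in for the second-level compressor.

```python
import json
import zlib

# Toy AST for the expression a * b + c: (kind, children...) tuples, with
# identifier leaves written as ("id", name). Node kinds are hypothetical.
AST = ("+", ("*", ("id", "a"), ("id", "b")), ("id", "c"))

ARITY = {"+": 2, "*": 2}  # number of children for each interior node kind

def serialize(node, kinds, names):
    """Pre-order walk: one stream of node kinds, one of identifier names."""
    kinds.append(node[0])
    if node[0] == "id":
        names.append(node[1])
    else:
        for child in node[1:]:
            serialize(child, kinds, names)

def deserialize(kinds, names):
    """Rebuild the tree directly from the streams, without parsing source."""
    kind_iter, name_iter = iter(kinds), iter(names)

    def build():
        kind = next(kind_iter)
        if kind == "id":
            return ("id", next(name_iter))
        return (kind,) + tuple(build() for _ in range(ARITY[kind]))

    return build()

def compress(ast):
    kinds, names = [], []
    serialize(ast, kinds, names)
    payload = json.dumps({"kinds": kinds, "names": names}).encode("utf-8")
    return zlib.compress(payload)  # second-level compressor

def decompress(blob):
    streams = json.loads(zlib.decompress(blob).decode("utf-8"))
    return deserialize(streams["kinds"], streams["names"])

restored = decompress(compress(AST))
```

The receiver never sees source text: `decompress` goes from bytes straight to a tree, which is the property the slide is after.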
Benefits of AST-based Compression
Reduced Latency
• Compression: less to transmit
• ASTs are blasted directly into the browser
Reduced Network Bandwidth
• Reduces mobile charges
• Reduces operator network costs: better for servers
Correctness, Security, and other Benefits
• Ensures well-formedness of code
• Can use to check language subsets easily (AdSafe)
• Caching incremental updates
• Unblocking HTML parser
JSZap Compression
[Diagram: JavaScript → JSZap → gzip; inside JSZap the code is split into three streams: (1) productions, (2) identifiers, (3) literals]
GZIP is a formidable opponent
JSZap vs. GZIP
Size in KB     JSZap    gzip
Literals         5.4     5.4
Identifiers     18.4    19.0
Productions      8.4    11.5
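Reading the chart labels as (JSZap, gzip) pairs of 5.4/5.4, 18.4/19.0, and 8.4/11.5 KB for literals, identifiers, and productions (an assumption about how the extracted labels pair up), the overall saving can be checked with a few lines of arithmetic:

```python
# Sizes in KB as (JSZap, gzip) pairs. The pairing of numbers to bars is an
# assumption read off the chart labels.
sizes_kb = {
    "literals": (5.4, 5.4),
    "identifiers": (18.4, 19.0),
    "productions": (8.4, 11.5),
}

jszap_total = sum(jszap for jszap, _ in sizes_kb.values())
gzip_total = sum(gz for _, gz in sizes_kb.values())
savings = 1 - jszap_total / gzip_total

print(f"JSZap {jszap_total:.1f} KB, gzip {gzip_total:.1f} KB, savings {savings:.1%}")
# prints: JSZap 32.2 KB, gzip 35.9 KB, savings 10.3%
```

Under this reading, most of the gain comes from the production stream, and the total is in line with the roughly 10% figure quoted later in the talk.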
Talk Outline
1) productions
2) identifiers
3) literals
evaluation on real code
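Background: ASTs
Expression: a * b + c
Grammar:
1) E → E + T
2) E → T
3) T → T * F
4) T → F
5) F → id
Tree (production number at each node): + (1) at the root with children * (3) and c (5); * has children a (5) and b (5)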
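To make the link between the grammar and the numbered tree concrete, here is a small sketch that writes the parse tree of a * b + c by hand and lists the productions in pre-order. The tuple layout is an assumption for illustration, and the operator tokens are left out of the child lists. Collapsing the single-nonterminal productions 2 and 4 leaves the node labels 1, 3, 5, 5, 5.

```python
# Grammar: 1) E -> E + T   2) E -> T   3) T -> T * F   4) T -> F   5) F -> id
# Parse tree for a * b + c as (production number, children); leaves are names.
PARSE_TREE = (1, [
    (2, [(3, [(4, [(5, ["a"])]), (5, ["b"])])]),
    (4, [(5, ["c"])]),
])

def production_stream(node):
    """Pre-order list of the production numbers used in the tree."""
    number, children = node
    stream = [number]
    for child in children:
        if isinstance(child, tuple):
            stream.extend(production_stream(child))
    return stream

def drop_unit_productions(node):
    """Collapse nodes whose only child is another nonterminal (rules 2 and 4)."""
    number, children = node
    if len(children) == 1 and isinstance(children[0], tuple):
        return drop_unit_productions(children[0])
    return (number, [drop_unit_productions(c) if isinstance(c, tuple) else c
                     for c in children])

full = production_stream(PARSE_TREE)
ast_only = production_stream(drop_unit_productions(PARSE_TREE))
print(full)      # [1, 2, 3, 4, 5, 5, 4, 5]
print(ast_only)  # [1, 3, 5, 5, 5]
```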
A Simple JavaScript Example
var y = 2;
function foo () {
  var x = "jscrunch";
  var z = 3;
  z = y + y;
}
x = "jszap";
Identifier Stream
y foo x z z y y x
Literal Stream
"jscrunch" 2 3 "jszap"
Production Stream
1 3 4 ... 1 3 4 ...
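The identifier and literal streams can be approximated with a tiny tokenizer. This is a rough sketch, not how JSZap works internally: it uses a regular expression rather than a real JavaScript parser, knows only the two keywords this snippet uses, and skips the production stream, which needs a full parse. It emits literals in source order, which differs from the order listed on the slide.

```python
import re

SOURCE = '''
var y = 2;
function foo () {
  var x = "jscrunch";
  var z = 3;
  z = y + y;
}
x = "jszap";
'''

KEYWORDS = {"var", "function"}  # only the keywords this snippet uses

TOKEN = re.compile(r'''
    (?P<string>"[^"]*")            # double-quoted string literal
  | (?P<number>\d+)                # integer literal
  | (?P<word>[A-Za-z_$][\w$]*)     # identifier or keyword
''', re.VERBOSE)

def split_streams(source):
    """Return (identifiers, literals) in the order they appear in the source."""
    identifiers, literals = [], []
    for match in TOKEN.finditer(source):
        if match.lastgroup == "word":
            if match.group() not in KEYWORDS:
                identifiers.append(match.group())
        else:
            literals.append(match.group())
    return identifiers, literals

identifiers, literals = split_streams(SOURCE)
print(" ".join(identifiers))  # y foo x z z y y x
```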
Benchmarking JSZap
Benchmark name   Source lines   Source bytes
gmonkey                   922         17,382
getDOMHash              1,136         25,467
bing1                   3,758         77,891
bingmap1                3,473         80,066
livemsg1                5,307         93,982
bingmap2                9,726        113,393
facebook1               5,886        141,469
livemsg2                7,139        156,282
officelive1            22,016        668,051
• JavaScript files up to 22K LOC
• Variety of app types
• Both hand-generated and machine-generated code
• gzipped everything
Components of JavaScript Source
[Stacked bar chart: share of productions, identifiers, and literals, 0% to 100%, for each benchmark from gmonkey to officelive1]
• None of the categories can be ignored
• Identifiers become more prominent with code growth
Compressing the Production Stream
• Frequency-based production renaming
• Differential encoding: 26 and 57 => 2 and 3
• Chain rule: eliminate predictable productions
• Tree-based prediction-by-partial-match
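As an illustration of the first bullet only, frequency-based renaming can be sketched as follows. The production IDs in the sample stream are made up, and the tie-breaking rule is an assumption chosen to keep the mapping deterministic.

```python
from collections import Counter

def rename_by_frequency(productions):
    """Give the most frequent production IDs the smallest new IDs."""
    counts = Counter(productions)
    # Descending count; ties broken by original ID so the result is stable.
    ranked = sorted(counts, key=lambda p: (-counts[p], p))
    mapping = {old: new for new, old in enumerate(ranked)}
    return [mapping[p] for p in productions], mapping

stream = [57, 26, 57, 57, 103, 26, 57]  # made-up production IDs
renamed, mapping = rename_by_frequency(stream)
print(renamed)  # [0, 1, 0, 0, 2, 1, 0]
```

Small, repeated values are easier for a byte-oriented second-level compressor such as gzip to exploit, and the decoder only needs the inverse of `mapping` to recover the original IDs.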
PPMC
• Consider compressing: if (P) then X else X
• Should be very compressible
• if (P) then ...abc... else ...abc...
[Tree diagram: the condition P and two identical subtrees X]
• Tree context used to build a predictor
• Provides the next likely child node given context C and child position p
• Arithmetic coding: more likely nodes get shorter IDs
• See paper for details
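The idea of predicting a child from its tree context can be sketched with a much simpler model than real PPMC. The assumptions here are significant: a single context of (parent kind, child position), add-one smoothing instead of PPM escape symbols, and an ideal cost of -log2(p) bits per node in place of an actual arithmetic coder. Node kinds are hypothetical.

```python
import math
from collections import Counter, defaultdict

class TreePredictor:
    """Adaptive model: counts child kinds seen per (parent kind, position)."""

    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.counts = defaultdict(Counter)

    def probability(self, context, kind):
        seen = self.counts[context]
        # Add-one smoothing so unseen kinds keep a nonzero probability.
        return (seen[kind] + 1) / (sum(seen.values()) + len(self.alphabet))

    def update(self, context, kind):
        self.counts[context][kind] += 1

def cost_in_bits(tree, model, context=("root", 0)):
    """Ideal coding cost of a tree: sum of -log2(p) over its nodes."""
    kind, children = tree
    bits = -math.log2(model.probability(context, kind))
    model.update(context, kind)
    for position, child in enumerate(children):
        bits += cost_in_bits(child, model, (kind, position))
    return bits

X = ("call", [("id", []), ("id", [])])
tree = ("if", [("id", []), X, X])       # if (P) then X else X
model = TreePredictor(["if", "call", "id"])
first = cost_in_bits(tree, model)        # model starts with no knowledge
second = cost_in_bits(tree, model)       # same tree again, model now trained
```

The second pass costs fewer bits than the first because every context has already been seen, which is the effect the slide describes as more likely nodes getting shorter IDs.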
Production Compression with PPMC
[Bar chart: production compression relative to gzip (gzip = 1), y-axis 50% to 100%, for each benchmark from gmonkey to officelive1; labeled value: 0.6772]
Compressing the Identifier Stream
• Symbol tables instead of identifier stream:
  – Compress redundancy: offset into table
  – Global or local symbol tables
  – Use variable-length encoding
• Other techniques:
  – Sort symbols by frequency
  – Rename local variables
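A minimal sketch of the symbol-table idea, combined with sorting symbols by frequency: each identifier occurrence is replaced by an offset into a table of unique names. It uses a single global table and ignores scoping and local renaming, which is a simplification of what the slide lists.

```python
from collections import Counter

def build_symbol_table(identifiers):
    """Return (table, offsets): unique names by frequency, plus table indexes."""
    counts = Counter(identifiers)
    first_seen = {}
    for position, name in enumerate(identifiers):
        first_seen.setdefault(name, position)
    # Most frequent first; ties keep source order.
    table = sorted(counts, key=lambda name: (-counts[name], first_seen[name]))
    index = {name: i for i, name in enumerate(table)}
    return table, [index[name] for name in identifiers]

stream = "y foo x z z y y x".split()
table, offsets = build_symbol_table(stream)
decoded = [table[i] for i in offsets]
print(table)    # ['y', 'x', 'z', 'foo']
print(offsets)  # [0, 3, 1, 2, 2, 0, 0, 1]
```

Each name is now stored once, and frequent names get the smallest offsets, which is what makes a variable-length encoding of the offsets worthwhile.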
Variable-length Encoding for Identifiers
[Decision tree: the tests "is global?", "is renamed local", and "fits in 1 byte?" select among the code prefixes 00…, 01…, 11…, and 10…]
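One way such a prefix scheme could work is sketched below. The assignment of prefixes to categories, the 6-bit and 14-bit index widths, and the byte layout are all assumptions made for this example; the slide only shows the three tests and the four prefixes.

```python
# Hypothetical 2-bit tags: the mapping to categories is assumed, not taken
# from the slide.
TAG_RENAMED_LOCAL = 0b00
TAG_OTHER_LOCAL = 0b01
TAG_GLOBAL_2BYTE = 0b10
TAG_GLOBAL_1BYTE = 0b11

def encode(is_global, is_renamed_local, index):
    """Encode one identifier reference as 1 or 2 bytes: 2-bit tag, then index."""
    if not is_global:
        if index >= 1 << 6:
            raise ValueError("local index must fit in 6 bits in this sketch")
        tag = TAG_RENAMED_LOCAL if is_renamed_local else TAG_OTHER_LOCAL
        return bytes([tag << 6 | index])
    if index < 1 << 6:  # fits in 1 byte
        return bytes([TAG_GLOBAL_1BYTE << 6 | index])
    if index >= 1 << 14:
        raise ValueError("global index must fit in 14 bits in this sketch")
    return bytes([TAG_GLOBAL_2BYTE << 6 | index >> 8, index & 0xFF])

def decode(data, pos=0):
    """Return (tag, index, next position) for the reference starting at pos."""
    tag, low = data[pos] >> 6, data[pos] & 0x3F
    if tag == TAG_GLOBAL_2BYTE:
        return tag, low << 8 | data[pos + 1], pos + 2
    return tag, low, pos + 1

short = encode(True, False, 5)    # frequent global: one byte
long_ = encode(True, False, 300)  # rarer global: two bytes
```

Combined with a frequency-sorted symbol table, the most common identifiers would land in the one-byte forms.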
Variable-Length Identifier Encoding
[Stacked bar chart: breakdown of identifier encodings, 0% to 100%, for each benchmark from gmonkey to officelive1; legend: parent local 2byte local 1byte local builtin global 2byte global 1byte]
Symbol Tables: Effectiveness
[Bar chart: identifier size relative to no symbol table (NoST = 1), y-axis 80% to 100%, for each benchmark from gmonkey to officelive1; series: Global ST and VarEnc; labeled values: 0.943 and 89%]
Compressing Literals
• Symbol tables
• Grouping literals by type
• Prefixes and postfixes
• These techniques result in 5-10% savings compared to gzip
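Grouping literals by type can be sketched as splitting one mixed stream into per-type streams. The explicit tag list below is an assumption added so the example can round-trip on its own; in an AST-based format the surrounding tree could already indicate which kind of literal comes next.

```python
def group_by_type(literals):
    """Split a mixed literal stream into per-type streams plus a tag stream."""
    tags, strings, numbers = [], [], []
    for value in literals:
        if isinstance(value, str):
            tags.append("s")
            strings.append(value)
        else:
            tags.append("n")
            numbers.append(value)
    return tags, strings, numbers

def regroup(tags, strings, numbers):
    """Inverse of group_by_type: interleave the streams in original order."""
    string_iter, number_iter = iter(strings), iter(numbers)
    return [next(string_iter) if tag == "s" else next(number_iter)
            for tag in tags]

literals = [2, "jscrunch", 3, "jszap"]
tags, strings, numbers = group_by_type(literals)
print(strings)  # ['jscrunch', 'jszap']
print(numbers)  # [2, 3]
```

Keeping similar values next to each other gives a general-purpose compressor longer runs of similar bytes to work with.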
Average JSZap Compression: 10%
[Bar chart: JSZap compression relative to gzip (gzip = 1), y-axis 80% to 100%, for each benchmark from gmonkey to officelive1; labeled value: 0.8792]
[Pie chart: Productions 26%, Identifiers 57%, Literals 17%]
13% savings
Summary and Conclusions
• JSZap: AST-based compression for JavaScript
• Propose a range of techniques for compressing
  – Productions
  – Identifiers
  – Literals
• Preliminary results are encouraging: 10% savings over gzip
• Future focus
  – Latency measurements
  – Browser integration
Well-formedness
Security (AdSafe)
AST representation
Unblocking HTML parser
Caching and incremental updates
Compression with JSZap
Questions?