HTML Preloading: Faster Loading of HTML Embedded Objects
By Salim Batlouni


Presentation for the HTML Preloading project


Page 1: HTML Preloading Presentation

HTML Preloading: Faster Loading of HTML Embedded Objects

By Salim Batlouni

Page 2: HTML Preloading Presentation

HTML Page

Page 3: HTML Preloading Presentation

Overhead

Client / Server timeline:

--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML; client sends HTTP request for image
--- 3 RTT --- Image begins to arrive

Server: reads from disk (for the HTML); reads from disk (for the image)

Packets, in order: SYN, SYN, ACK DAT, ACK DAT, ACK DAT, ACK DAT

Adapted from 'The Case for Persistent-Connection HTTP', Jeffrey C. Mogul

Page 4: HTML Preloading Presentation

Objective

Forecast/preload http requests and send responses ahead of time.

Page 5: HTML Preloading Presentation

My Idea

Adapted from 'The Case for Persistent-Connection HTTP', Jeffrey C. Mogul

Old (client requests each image)

--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML; client sends HTTP request for image
--- 3 RTT --- Image begins to arrive

Packets, in order: SYN, SYN, ACK DAT, ACK DAT, ACK DAT, ACK DAT

New (server preloads)

--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML
--- 3 RTT --- Image begins to arrive

Server: reads from disk; forecasts dependents; reads images from disk

Client: PROXY SERVER keeps on receiving files

Packets, in order: SYN, SYN, ACK DAT, ACK DAT, DAT
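As a rough illustration of why removing the per-image request matters, the timelines can be turned into a back-of-the-envelope model. This is only a sketch under stated assumptions: one connection, objects fetched strictly one after another, one round trip per explicit request, and no server processing time. The class and method names (`RttModel`, `loadMs`, and so on) are hypothetical and do not come from the project code.

```java
// Back-of-the-envelope latency model. Illustrative assumption only,
// not the model or the measurements used in the presentation.
final class RttModel {
    /** Without preloading: 1 RTT handshake + 1 RTT for the HTML + 1 RTT per embedded object. */
    static int roundTripsWithoutPreload(int embeddedObjects) {
        return 2 + embeddedObjects;
    }

    /** With preloading: the server pushes objects after the HTML, so no extra request round trips. */
    static int roundTripsWithPreload(int embeddedObjects) {
        return 2;
    }

    /** Pure transmission time in milliseconds: kilobytes * 8 bits / kbps gives seconds. */
    static double transferMs(double totalKiloBytes, double bandwidthKbps) {
        return totalKiloBytes * 8.0 / bandwidthKbps * 1000.0;
    }

    /** Estimated page load time: round-trip waiting plus transmission time. */
    static double loadMs(int roundTrips, double rttMs, double totalKiloBytes, double bandwidthKbps) {
        return roundTrips * rttMs + transferMs(totalKiloBytes, bandwidthKbps);
    }
}
```

In this model the transmission term is identical in both cases, so the saving is simply the number of embedded objects times the round-trip time, which is why pages with many small objects stand to gain the most.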

Page 6: HTML Preloading Presentation

HTTP Traffic Data

2006 data shows:

More than 50% of requests are for images

Images are the smallest files in HTTP traffic

Source: Y. C. Chehadeh, A. Z. Hatahe, A. E. Agamy, M. A. Bamakhrama, S. A. Banawan, "Investigating Distribution of Data of HTTP Traffic: An Empirical Study," IEEE, 2006

Page 7: HTML Preloading Presentation

Prefetching HTML requests

New

--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML
--- 3 RTT --- Image begins to arrive

Server: reads from disk; parses HTML; reads images from disk

Client: PROXY SERVER keeps on receiving files

Packets, in order: SYN, SYN, ACK DAT, ACK DAT, DAT

Source: Abdolreza Abhari, Adam Serbinski, "Improving the Delivery of Multimedia Embedded In HTML Over HTTP on Wireless Networks," IEEE, 2008

Page 8: HTML Preloading Presentation

Preloading vs Multi-TCP

Preloading Concurrent connections

Source: Improving the Delivery of Multimedia Embedded In HTML Over HTTP on Wireless Networks

✓More benefits

receiving at 6 Mbps and sending at 800 Kbps. The client was situated on a much faster network. The purpose of this first environment is to simulate a typical busy network as would be encountered in a real application of this modified HTTP protocol. The second environment tested consists of a client connected to a server via 100 Mbps LAN. We use the second test to determine if or how the performance of preloading changes with the speed of the network. We expect that as the network latency decreases, so will the effects of preloading.

In our tests, we ran the server containing mirrors of the web pages at http://www.amazon.com, http://www.aol.com, http://www.google.ca, http://www.mapquest.com, and http://www.wikipedia.com. On the client system, we ran the lightweight proxy server capable of interacting with a web server employing modified HTTP protocol. The proxy server used is modified from what would be used by an end user in that it responds to a signal to immediately clear its cache. When the server and the proxy processes are running, the client software, in our case GNU Wget, is run to download the given web page.

Our testing procedure consisted of running GNU Wget in a loop 100 times to download each website being tested. The time it takes for Wget to retrieve the entire web page is recorded in a log for every iteration of the loop. Between every run of Wget, the proxy cache is cleared. Immediately after running a test using the modified HTTP protocol, the same test is performed again, but without using the proxy. When the proxy is not used, the header indicating the client's capability of receiving responses using the modified HTTP protocol is not included, resulting in the server reverting to standard HTTP protocol. In this manner, we are able to test using precisely the same server and client in order to eliminate any chance of contaminating the results by using alternate software that may perform differently than ours.

Following testing, the logs were processed to determine the average page load times for each web page tested. Each web page tested using modified HTTP protocol was compared to the results from the same web page using standard HTTP protocol. The difference in performance for each web page was compared with the difference in performance for the other web pages. The comparisons were made to determine the relationship between the performance of modified HTTP protocol and the number of embedded objects.

6. RESULTS

Our modification to HTTP protocol allows the server to anticipate future requests from the client, in turn allowing the server to forward the responses to future requests to the client without explicit requests. From the log files generated by our testing, we can show an improvement in web page delivery time as anticipated.

As previously explained, our testing client, GNU Wget, is not capable of concurrent connections. Due to the nature of our experiment, it was necessary to use a client that can be run repeatedly from a script. A normal web browser is capable of creating concurrent connections to web servers, which causes an improvement in performance over Wget; however, it is not possible to run a normal web browser and record page load times automatically, forcing us to use Wget. Table 1 and Figure 2 show the results of manually testing 5 web pages sequentially, concurrently, and through preloading. These web pages were tested 10 times each connecting to Apache HTTP Server, which is designed to support concurrent connections, using Mozilla Firefox 2.0.0.2 [11] with the Fasterfox 2.0.0 plugin [12]. The Fasterfox plugin is used for disabling concurrent connections and measuring page load times. Figure 3 shows the results from testing server to client preloading as described using Wget.

Table 1: Preloading time compared with sequential and concurrent downloading of a web site (s)

Website     Sequential   Concurrent   Preloading
Amazon      15.017       14.796       4.216
AOL         8.956        8.567        3.163
Google      0.257        0.249        0.263
MapQuest    4.915        4.371        2.327
Wikipedia   4.830        4.320        2.012

[Figure 2: bar chart of time (s), axis 0.00 to 16.00, per website (AMAZON.COM, AOL.COM, GOOGLE.CA, MAPQUEST.COM, WIKIPEDIA.ORG) for Sequential, Concurrent, and Preloading]

Figure 2: Preloading compared with sequential and concurrent downloading.

The results of Table 1 and Figure 2 are not accurate over long term, since each value is the average over just 10 trials. However, Table 1 does show a general tendency that compared to the effects of preloading, the effect of downloading web pages using concurrent connections is minimal. We use the results of this test with Firefox to validate the results of further testing where preloading is compared only to sequential downloading using GNU Wget.

Clearly, the improvements resulting in preloading embedded objects from server to client far outweigh the


✓More benefits

Page 9: HTML Preloading Presentation

Multi-TCP

Preethi Natarajan, Fred Baker, and Paul D. Amer, Multiple TCP Connections Improve HTTP Throughput − Myth or Fact?, Communication and Networks Consortium, 2008 and on

Works better in networks with high-bandwidth last hops
(developed world)

Might hurt networks with low-bandwidth last hops
(ADSL, developing world)

In both cases, preloading shows better performance

Page 10: HTML Preloading Presentation

Server Scheme

Step 1: No Preloading

some-website.com (browser requests files):
header.jpg, other.jpg, footer.jpg

some-website.com/stuff (browser requests highlighted files; highlighting not preserved here):
header.jpg, diff.jpg, footer.jpg, two.jpg

Server keeps track of requested files.

Page 11: HTML Preloading Presentation

Solution

Step 2: Preloading

some-website.com (server sends files):
header.jpg, other.jpg, footer.jpg

some-website.com/stuff (server sends highlighted files; highlighting not preserved here):
header.jpg, diff.jpg, footer.jpg, two.jpg
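The two steps (first record which files browsers request for a page, then send those files without waiting to be asked) could be sketched roughly as below. This is a minimal illustration, not the project's Graph.java: the names `DependencyGraph`, `record`, and `forecast` are hypothetical, the request counter and `minCount` threshold are assumptions, and the slides do not say how the server decides which page an object request belongs to, so here the caller supplies the page explicitly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a page -> embedded-object dependency tracker.
final class DependencyGraph {
    // page -> (object -> number of times it was requested for that page)
    private final Map<String, Map<String, Integer>> edges = new LinkedHashMap<>();

    /** Step 1: remember that `object` was requested as part of `page`. */
    void record(String page, String object) {
        edges.computeIfAbsent(page, p -> new LinkedHashMap<>())
             .merge(object, 1, Integer::sum);
    }

    /** Step 2: objects seen at least `minCount` times for `page`, in first-seen order. */
    List<String> forecast(String page, int minCount) {
        List<String> result = new ArrayList<>();
        Map<String, Integer> seen = edges.get(page);
        if (seen == null) {
            return result; // unknown page: nothing to preload
        }
        for (Map.Entry<String, Integer> e : seen.entrySet()) {
            if (e.getValue() >= minCount) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

A threshold above 1 is one possible way to avoid pushing files that were requested only once by accident; whether the original server applies any such filter is not stated in the slides.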

Page 12: HTML Preloading Presentation

Implementation

Local Server Local Browser

Dummynet

Network simulator
Command-line tool

Simulates:
Finite queues
Bandwidth limitations
Communication delays

L. Rizzo, "Dummynet: a simple approach to the evaluation of network protocols," Sigcomm, 1997.

Page 13: HTML Preloading Presentation

Implementation

Local Server

WebServer.java
Very simple, multi-threaded HTTP server
Made by Sun
Modified to Preload & Send

Dummynet

Local Browser

Browser
Proxy Server
Also based on Sun's simple server

http://java.sun.com/developer/technicalArticles/Networking/Webserver/WebServer.java

Page 14: HTML Preloading Presentation

Server

✓ Client Handler (Main.java)

✓ Graph (Graph.java)

✓ ProxySender (ProxySender.java)

Page 15: HTML Preloading Presentation

Server: Step 1

Components: (1) Client Handler, (2) Graph, (3) ProxySender

Requests on TCP Port 8080: first.html, header.jpg, im.jpg, footer.jpg

Graph: first.html -> header.jpg, im.jpg, footer.jpg (each edge labeled 1)

...

Page 16: HTML Preloading Presentation

Server: Preloading

Components: (1) Client Handler, (2) Graph, (3) ProxySender

Requests on TCP Port 8080: first.html

Graph: first.html -> header.jpg, im.jpg, footer.jpg (each edge labeled 1)

Response to Port 8085: header.jpg[data] im.jpg[data] footer.jpg[data]

Page 17: HTML Preloading Presentation

Client

✓ Client (Main.java)

✓ Parser (Parser.java)

✓ ProxyServer (ProxyServer.java)

Page 18: HTML Preloading Presentation

Client: Thread 1

Components: (1) Client, (2) Parser, (3) Cache

Client: sends request for first.html to Port 8080

Parser: parses the HTML file and requests embedded objects: header.jpg, im.jpg, footer.jpg

Cache: header.jpg, im.jpg, footer.jpg
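The parsing step (find the embedded objects named in the HTML so they can be looked up in the cache or requested) might look something like the sketch below. It is not the project's Parser.java: `EmbeddedObjectParser` and `imageSources` are hypothetical names, and it assumes that only `<img src="...">` references with quoted attribute values matter. A regular expression is a simplification; a real client would more likely use an HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extract image URLs from an HTML string.
final class EmbeddedObjectParser {
    // Matches <img ... src="value"> or <img ... src='value'>, case-insensitively.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Returns the src values of all img tags, in document order. */
    static List<String> imageSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(1));
        }
        return sources;
    }
}
```

For a page such as first.html on the slide, the returned list would be the three image names, which the client can then check against the cache before sending any request.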

Page 19: HTML Preloading Presentation

Client: Thread 2

Components: (1) ProxyServer, (2) Cache

ProxyServer: listens to Port 8085 and receives header.jpg[data] im.jpg[data] footer.jpg[data]

Cache: header.jpg, im.jpg, footer.jpg
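The slide shows the preloaded stream only as a sequence of file names each followed by its data; the actual wire format is not given. The sketch below assumes one possible framing (an object count, then for each object a name, a length, and the bytes) to show how a sender could write the stream and how the proxy could read it into a cache. `PreloadFraming`, `encode`, and `decode` are hypothetical names, and the framing itself is an assumption rather than the project's protocol.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical length-prefixed framing for "name[data]name[data]..." streams.
final class PreloadFraming {
    /** Sender side: serialize files as count, then (name, length, bytes) per file. */
    static byte[] encode(Map<String, byte[]> files) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(files.size());
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                out.writeUTF(file.getKey());          // file name
                out.writeInt(file.getValue().length); // payload length
                out.write(file.getValue());           // payload bytes
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Proxy side: read the stream back into a name -> bytes cache. */
    static Map<String, byte[]> decode(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int count = data.readInt();
            Map<String, byte[]> cache = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String name = data.readUTF();
                byte[] body = new byte[data.readInt()];
                data.readFully(body); // block until the whole payload has arrived
                cache.put(name, body);
            }
            return cache;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real deployment `decode` would be given the input stream of the socket accepted on the proxy's listening port; an explicit length per object is what lets the reader know where one image ends and the next name begins.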

Page 20: HTML Preloading Presentation

Simulation

56kbps → 1024kbps

0 → 83 embedded objects

10.4kB → 46.1kB average

Preload vs No Preload

Page 21: HTML Preloading Presentation

Preload vs No Preload

[Bar chart: 512kbps Connection, No Preload vs Preload; vertical axis 0 to 15000 (units not stated). Pages: Google (0 obj, 0 kB), Scholar (1, 12), Apple (44, 22.5), Amazon (45, 14.75), AppStore (48, 46.1), Gardlen (60, 14), NYTimes (83 obj, 10.4 kB)]

Page 22: HTML Preloading Presentation

Preload vs No Preload

[Bar chart: 56kbps Connection, No Preload vs Preload; vertical axis 0 to 70000 (units not stated). Pages: Google (0 obj, 0 kB), Scholar (1, 12), Apple (44, 22.5), Amazon (45, 14.75), AppStore (48, 46.1), Gardlen (60, 14), NYTimes (83 obj, 10.4 kB)]

Page 23: HTML Preloading Presentation

My Idea

Adapted from 'The Case for Persistent-Connection HTTP', Jeffrey C. Mogul

Old (client requests each image)

--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML; client sends HTTP request for image
--- 3 RTT --- Image begins to arrive

Packets, in order: SYN, SYN, ACK DAT, ACK DAT, ACK DAT, ACK DAT

New (server preloads)

--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML
--- 3 RTT --- Image begins to arrive

Server: reads from disk; forecasts dependents; reads images from disk

Client: PROXY SERVER keeps on receiving files

Packets, in order: SYN, SYN, ACK DAT, ACK DAT, DAT

Page 24: HTML Preloading Presentation

No Preloading

[Chart: vertical axis 0 to 70000 (units not stated). Pages: Google (0 obj, 0 kB), Apple (44, 22.5), AppStore (48, 46.1), NYTimes (83 obj, 10.4 kB). Series: 1024, 512, 256, 128, 56]

Page 25: HTML Preloading Presentation

With Preloading

[Chart: vertical axis 0 to 70000 (units not stated). Pages: Google (0 obj, 0 kB), Apple (44, 22.5), AppStore (48, 46.1), NYTimes (83 obj, 10.4 kB). Series: 1024, 512, 256, 128, 56]

Page 26: HTML Preloading Presentation

Benefit

[Chart: Percent Decrease (axis 0 to 80) against Number of Embedded Objects. Pages: Google (0 obj, 0 kB), Scholar (1, 12), Apple (44, 22.5), Amazon (45, 14.75), AppStore (48, 46.1), Gardlen (60, 14), NYTimes (83 obj, 10.4 kB). Series: 1024 kbps, 512 kbps, 256 kbps, 128 kbps, 56 kbps]

Page 27: HTML Preloading Presentation

Benefit

[Chart: Percent Increase (axis 0 to 80) against Number of Embedded Objects. Pages: Google (0) 0, Scholar (1) 12, Apple (44) 22.5, Amazon (45) 14.75, AppStore (48) 46.1, Gardlen (60) 14, NYTimes (83) 10.4. Series: 1024 kbps, 512 kbps, 256 kbps, 128 kbps, 56 kbps]

Page 28: HTML Preloading Presentation

Benefit

[Chart: Percent Increase (axis 0 to 90.0) against Number of Embedded Objects. Pages: Google (0) 0, Scholar (1) 12, Apple (44) 22.5, Amazon (45) 14.75, AppStore (48) 46.1, Gardlen (60) 14, NYTimes (83) 10.4. Series: 1024 kbps, 512 kbps, 256 kbps, 128 kbps, 56 kbps]

Page 29: HTML Preloading Presentation

File Sizes

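[Chart, panel "No Preloading": vertical axis 0 to 70000 (units not stated); horizontal axis file sizes 0, 10.4, 12, 14, 14.75, 22.5, 46.1. Series: 1024, 512, 256, 128, 56]

[Chart, panel "Preloading": vertical axis 0 to 20000 (units not stated); horizontal axis file sizes 0, 10.4, 12, 14, 14.75, 22.5, 46.1. Series: 1024, 512, 256, 128, 56]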

Page 30: HTML Preloading Presentation

Conclusion

256 kbps connection, 83 embedded objects

16 seconds vs 5 seconds
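To put the headline figures into the same terms as a percent decrease, the arithmetic can be written out as a short sketch. It assumes the decrease is measured relative to the load time without preloading; the slides do not spell out the formula behind the Benefit charts, and `Benefit` and `percentDecrease` are hypothetical names.

```java
// Hypothetical helper: relative reduction in load time, in percent.
final class Benefit {
    /** (without - with) / without * 100, assuming the no-preload time is the baseline. */
    static double percentDecrease(double withoutPreloadSeconds, double withPreloadSeconds) {
        return (withoutPreloadSeconds - withPreloadSeconds) / withoutPreloadSeconds * 100.0;
    }
}
```

Under that assumption, going from 16 seconds to 5 seconds is `Benefit.percentDecrease(16, 5)`, which evaluates to 68.75, that is, roughly a two-thirds reduction in load time.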

Page 31: HTML Preloading Presentation

Conclusion

✓ Feasible through proxy servers
✓ Significant improvement
✓ Especially viable for the developing world

Page 32: HTML Preloading Presentation

Future Work

✓ Study dynamic adaptation
✓ Synchronize ProxySender with Server

Page 33: HTML Preloading Presentation

HTML Preloading: Faster loading of HTML embedded objects

By Salim Batlouni
