Presentation for the HTML Preloading project
HTML Preloading
Faster Loading of HTML Embedded Objects
By Salim Batlouni
HTML Page Overhead
Client and server timeline:
--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML; server reads from disk
--- 2 RTT --- Client parses HTML; client sends HTTP request for image; server reads from disk
--- 3 RTT --- Image begins to arrive
Packets: SYN, SYN, ACK, DAT, ACK, DAT, ACK, DAT, ACK, DAT
Adapted from ‘The Case for Persistent-Connection HTTP’, Jeffrey C. Mogul
Objective
Forecast HTTP requests and send (preload) the responses ahead of time.
My Idea
Old (client and server):
--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML; client sends HTTP request for image
--- 3 RTT --- Image begins to arrive
Packets: SYN, SYN, ACK, DAT, ACK, DAT, ACK, DAT, ACK, DAT
New (client and server):
--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML
--- 3 RTT --- Image begins to arrive
Server reads from disk, forecasts dependents, and reads images from disk
PROXY SERVER keeps on receiving files
Packets: SYN, SYN, ACK, DAT, ACK, DAT, DAT
Adapted from ‘The Case for Persistent-Connection HTTP’, Jeffrey C. Mogul
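The gap between the two timelines can be illustrated with a toy count of request round trips. This is only a sketch under simplifying assumptions that do not come from the slides: objects are fetched one at a time over a single connection, every request costs exactly one round trip, and transmission and disk time are ignored. The function names are hypothetical.

```python
def request_round_trips_without_preloading(embedded_objects: int) -> int:
    """Round trips spent on the handshake and on explicit requests."""
    handshake = 1      # client opens TCP connection
    html_request = 1   # client sends HTTP request for HTML
    # Assumption: each embedded object needs its own request round trip.
    return handshake + html_request + embedded_objects


def request_round_trips_with_preloading(embedded_objects: int) -> int:
    """Round trips when the server sends forecast dependents unrequested."""
    handshake = 1
    html_request = 1
    # Assumption: preloaded objects follow the HTML without further requests,
    # so the count does not grow with the number of embedded objects.
    return handshake + html_request


if __name__ == "__main__":
    for n in (0, 1, 44, 83):
        old = request_round_trips_without_preloading(n)
        new = request_round_trips_with_preloading(n)
        print(f"{n} objects: {old} round trips without preloading, {new} with")
```

Under these assumptions the cost without preloading grows linearly with the number of embedded objects, which is the overhead the idea targets.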
HTTP Traffic Data
Y. C. Chehadeh, A. Z. Hatahe, A. E. Agamy, M. A. Bamakhrama, S. A. Banawan, Investigating Distribution of Data of HTTP Traffic: An Empirical Study, IEEE, 2006
2006 data shows:
More than 50% of requests are for images
Images are the smallest files in HTTP traffic
Prefetching HTML requests
New (client and server):
--- 0 RTT --- Client opens TCP connection
--- 1 RTT --- Client sends HTTP request for HTML
--- 2 RTT --- Client parses HTML
--- 3 RTT --- Image begins to arrive
Server reads from disk, parses HTML, and reads images from disk
PROXY SERVER keeps on receiving files
Packets: SYN, SYN, ACK, DAT, ACK, DAT, DAT
Abdolreza Abhari, Adam Serbinski, Improving the Delivery of Multimedia Embedded In HTML Over HTTP on Wireless Networks, IEEE, 2008
Preloading vs Multi-TCP
Preloading Concurrent connections
Source: Improving the Delivery of Multimedia Embedded In HTML Over HTTP on Wireless Networks
✓More benefits
receiving at 6 Mbps and sending at 800 Kbps. The client was situated on a much faster network. The purpose of this first environment is to simulate a typical busy network as would be encountered in a real application of this modified HTTP protocol. The second environment tested consists of a client connected to a server via 100 Mbps LAN. We use the second test to determine if or how the performance of preloading changes with the speed of the network. We expect that as the network latency decreases, so will the effects of preloading.
In our tests, we ran the server containing mirrors of the web pages at http://www.amazon.com, http://www.aol.com, http://www.google.ca, http://www.mapquest.com, and http://www.wikipedia.com. On the client system, we ran the lightweight proxy server capable of interacting with a web server employing modified HTTP protocol. The proxy server used is modified from what would be used by an end user in that it responds to a signal to immediately clear its cache. When the server and the proxy processes are running, the client software, in our case GNU Wget, is run to download the given web page.
Our testing procedure consisted of running GNU Wget in a loop 100 times to download each website being tested. The time it takes for Wget to retrieve the entire web page is recorded in a log for every iteration of the loop. Between every run of Wget, the proxy cache is cleared. Immediately after running a test using the modified HTTP protocol, the same test is performed again, but without using the proxy. When the proxy is not used, the header indicating the client's capability of receiving responses using the modified HTTP protocol is not included, resulting in the server reverting to standard HTTP protocol. In this manner, we are able to test using precisely the same server and client in order to eliminate any chance of contaminating the results by using alternate software that may perform differently than ours.
Following testing, the logs were processed to determine the average page load times for each web page tested. Each web page tested using modified HTTP protocol was compared to the results from the same web page using standard HTTP protocol. The difference in performance for each web page was compared with the difference in performance for the other web pages. The comparisons were made to determine the relationship between the performance of modified HTTP protocol and the number of embedded objects.
6. RESULTS
Our modification to HTTP protocol allows the server to anticipate future requests from the client, in turn allowing the server to forward the responses to future requests to the client without explicit requests. From the log files generated by our testing, we can show an improvement in web page delivery time as anticipated.
As previously explained, our testing client, GNU Wget, is not capable of concurrent connections. Due to the nature of our experiment, it was necessary to use a client that can be run repeatedly from a script. A normal web browser is capable of creating concurrent connections to web servers, which causes an improvement in performance over Wget, however, it is not possible to run a normal web browser and record page load times automatically, forcing us to use Wget. Table 1 and Figure 2 show the results of manually testing 5 web pages sequentially, concurrently, and through preloading. These web pages were tested 10 times each connecting to Apache HTTP Server, which is designed to support concurrent connections, using Mozilla Firefox 2.0.0.2 [11] with the Fasterfox 2.0.0 plugin [12]. The Fasterfox plugin is used for disabling concurrent connections and measuring page load times. Figure 3 shows the results from testing server to client preloading as described using Wget.
Table 1. Preloading time compared with sequential and concurrent downloading of a web site (s)
Website     Sequential   Concurrent   Preloading
Amazon      15.017       14.796       4.216
Aol         8.956        8.567        3.163
Google      0.257        0.249        0.263
mapquest    4.915        4.371        2.327
wikipedia   4.830        4.320        2.012
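To make the table easier to read, the relative saving of preloading over sequential downloading can be computed directly from its rows. The following is a small sketch; the numbers are copied from Table 1, while the helper name `percent_reduction` is ours and not from the paper.

```python
# Page load times in seconds, copied from Table 1:
# (sequential, concurrent, preloading)
TABLE_1 = {
    "Amazon": (15.017, 14.796, 4.216),
    "Aol": (8.956, 8.567, 3.163),
    "Google": (0.257, 0.249, 0.263),
    "mapquest": (4.915, 4.371, 2.327),
    "wikipedia": (4.830, 4.320, 2.012),
}


def percent_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Reduction of preloading relative to sequential downloading, per site.
reductions = {
    site: percent_reduction(sequential, preloading)
    for site, (sequential, _concurrent, preloading) in TABLE_1.items()
}

for site, value in reductions.items():
    print(f"{site}: {value:.1f}% less time than sequential")
```

A negative value means preloading was slower than sequential downloading for that site, which in this table happens only for the Google row.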
[Bar chart: time (s) per website (amazon.com, aol.com, google.ca, mapquest.com, wikipedia.org) for sequential, concurrent, and preloading downloads; y-axis from 0.00 to 16.00]
Figure 2: Preloading compared with sequential and concurrent downloading.
The results of Table 1 and Figure 2 are not accurate over long term, since each value is the average over just 10 trials. However, Table 1 does show a general tendency that compared to the effects of preloading, the effect of downloading web pages using concurrent connections is minimal. We use the results of this test with Firefox to validate the results of further testing where preloading is compared only to sequential downloading using GNU Wget.
Clearly, the improvements resulting in preloading embedded objects from server to client far outweigh the
Multi-TCP
Preethi Natarajan, Fred Baker, and Paul D. Amer, Multiple TCP Connections Improve HTTP Throughput − Myth or Fact?, Communication and Networks Consortium, 2008 and on
Works better in networks with high-bandwidth last hops (developed world)
Might hurt networks with low-bandwidth last hops (ADSL, developing world)
In both cases, preloading shows better performance
Server Scheme
Step 1: No Preloading
some-website.com: header.jpg, other.jpg, footer.jpg (browser requests files)
some-website.com/stuff: header.jpg, diff.jpg, footer.jpg, two.jpg (browser requests highlighted files)
Server keeps track of requested files.
Solution
Step 2: Preloading
some-website.com: header.jpg, other.jpg, footer.jpg (server sends files)
some-website.com/stuff: header.jpg, diff.jpg, footer.jpg, two.jpg (server sends highlighted files)
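The two steps of the scheme can be sketched as a small bookkeeping structure: first record which files are requested for each page, then use that record to forecast what to send. This is a minimal illustration in Python rather than the Java of the actual project; the class `DependencyTracker`, its methods, and the `min_count` threshold are hypothetical, and the sketch does not model which files are "highlighted" on the slide.

```python
from collections import defaultdict


class DependencyTracker:
    """Remembers which embedded files were requested for each page."""

    def __init__(self):
        # page -> {embedded file -> number of times it was requested}
        self.counts = defaultdict(lambda: defaultdict(int))

    def record(self, page, requested_file):
        """Step 1: no preloading, just keep track of requested files."""
        self.counts[page][requested_file] += 1

    def forecast(self, page, min_count=1):
        """Step 2: files to preload for a page, in alphabetical order."""
        files = self.counts.get(page, {})
        return sorted(name for name, n in files.items() if n >= min_count)


tracker = DependencyTracker()
for name in ("header.jpg", "other.jpg", "footer.jpg"):
    tracker.record("some-website.com", name)
for name in ("header.jpg", "diff.jpg", "footer.jpg", "two.jpg"):
    tracker.record("some-website.com/stuff", name)

print(tracker.forecast("some-website.com"))
print(tracker.forecast("some-website.com/stuff"))
```

Keeping a count per file, instead of a plain set, leaves room for a threshold so that files requested only occasionally are not preloaded.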
Implementation
Local Server Local Browser
L. Rizzo, “Dummynet: a simple approach to the evaluation of network protocols,” Sigcomm, 1997.
Dummynet
Network simulator
Command-line tool
Simulates: finite queues, bandwidth limitations, communication delays
Implementation
Local Server, Dummynet, Local Browser
Server: WebServer.java, a very simple, multi-threaded HTTP server made by Sun, modified to preload and send
Browser: proxy server, also based on Sun’s simple server
http://java.sun.com/developer/technicalArticles/Networking/Webserver/WebServer.java
Server
✓ Client Handler: Main.java
✓ Graph: Graph.java
✓ ProxySender: ProxySender.java
Server: Step 1
1. Client Handler: receives requests on TCP port 8080 for first.html, header.jpg, im.jpg, footer.jpg
2. Graph: first.html is linked to header.jpg, im.jpg, and footer.jpg; each edge is labeled 1
3. ProxySender
...
Server: Preloading
1. Client Handler: receives the request for first.html on TCP port 8080
2. Graph: first.html is linked to header.jpg, im.jpg, and footer.jpg; each edge is labeled 1
3. ProxySender: sends the response to port 8085: header.jpg[data] im.jpg[data] footer.jpg[data]
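The slide shows the preloaded response as a sequence of file names each followed by its data, but it does not say how the receiver tells where one file ends and the next begins. One possible framing, given purely as an assumption and written in Python rather than the project's Java, is to prefix each entry with the lengths of its name and data. The functions `pack_files` and `unpack_files` are hypothetical.

```python
import struct

# Hypothetical entry header: 2-byte name length, 4-byte data length,
# both unsigned and in network byte order.
ENTRY_HEADER = struct.Struct("!HI")


def pack_files(files):
    """Serialize (name, data) pairs into one payload for the proxy."""
    out = bytearray()
    for name, data in files:
        encoded_name = name.encode("utf-8")
        out += ENTRY_HEADER.pack(len(encoded_name), len(data))
        out += encoded_name
        out += data
    return bytes(out)


def unpack_files(payload):
    """Inverse of pack_files: recover the list of (name, data) pairs."""
    files = []
    offset = 0
    while offset < len(payload):
        name_len, data_len = ENTRY_HEADER.unpack_from(payload, offset)
        offset += ENTRY_HEADER.size
        name = payload[offset:offset + name_len].decode("utf-8")
        offset += name_len
        data = payload[offset:offset + data_len]
        offset += data_len
        files.append((name, data))
    return files


sample = [("header.jpg", b"\x01\x02"), ("im.jpg", b""), ("footer.jpg", b"abc")]
payload = pack_files(sample)
print(unpack_files(payload))
```

Length prefixes let binary image data pass through unchanged, since no byte sequence has to be reserved as a separator.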
Client
✓ Client: Main.java
✓ Parser: Parser.java
✓ ProxyServer: ProxyServer.java
Client: Thread 1
1. Client: sends the request for first.html to port 8080
2. Parser: parses the HTML file and finds header.jpg, im.jpg, footer.jpg
3. Cache: embedded objects header.jpg, im.jpg, footer.jpg are requested
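The parse-then-request step of this thread can be sketched with Python's standard `html.parser` module (the project itself uses a Java `Parser.java`, which is not shown in the slides). The sketch assumes that embedded objects are `<img>` tags and that objects already present in the cache are not requested again; both are assumptions, and the names `EmbeddedObjectParser` and `objects_to_request` are hypothetical.

```python
from html.parser import HTMLParser


class EmbeddedObjectParser(HTMLParser):
    """Collects the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            # Skip missing values and duplicates of an already seen object.
            if name == "src" and value and value not in self.objects:
                self.objects.append(value)


def objects_to_request(html, cache):
    """Embedded objects found in the HTML that are not in the cache yet."""
    parser = EmbeddedObjectParser()
    parser.feed(html)
    return [name for name in parser.objects if name not in cache]


page = (
    "<html><body>"
    '<img src="header.jpg"><img src="im.jpg"><img src="footer.jpg">'
    "</body></html>"
)
# Assumption: the cache maps file names to data filled in by the proxy thread.
cache = {"header.jpg": b"..."}
print(objects_to_request(page, cache))
```

With preloading working as intended, the cache would already hold every embedded object by the time the HTML is parsed, and the returned list would be empty.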
Client: Thread 2
1. ProxyServer: listens to port 8085 and receives header.jpg[data] im.jpg[data] footer.jpg[data]
2. Cache: stores header.jpg, im.jpg, footer.jpg
Simulation
56kbps → 1024kbps
0 → 83 embedded objects
10.4kB → 46.1kB average
Preload vs No Preload: 512kbps Connection
[Chart: No Preload vs Preload per site; y-axis from 0 to 15000. Sites (embedded objects, kB): Google (0 obj, 0 kB), Scholar (1, 12), Apple (44, 22.5), Amazon (45, 14.75), AppStore (48, 46.1), Gardlen (60, 14), NYTimes (83 obj, 10.4 kB)]
Preload vs No Preload: 56kbps Connection
[Chart: No Preload vs Preload per site; y-axis from 0 to 70000. Sites (embedded objects, kB): Google (0 obj, 0 kB), Scholar (1, 12), Apple (44, 22.5), Amazon (45, 14.75), AppStore (48, 46.1), Gardlen (60, 14), NYTimes (83 obj, 10.4 kB)]
No Preloading
[Chart: y-axis from 0 to 70000; sites Google (0 obj, 0 kB), Apple (44, 22.5), AppStore (48, 46.1), NYTimes (83 obj, 10.4 kB); series 1024, 512, 256, 128, 56 kbps]
With Preloading
[Chart: y-axis from 0 to 70000; sites Google (0 obj, 0 kB), Apple (44, 22.5), AppStore (48, 46.1), NYTimes (83 obj, 10.4 kB); series 1024, 512, 256, 128, 56 kbps]
Benefit
[Chart: Percent Decrease (y-axis, 0 to 80) against Number of Embedded Objects (x-axis); sites Google (0 obj, 0 kB), Scholar (1, 12), Apple (44, 22.5), Amazon (45, 14.75), AppStore (48, 46.1), Gardlen (60, 14), NYTimes (83 obj, 10.4 kB); series 1024, 512, 256, 128, 56 kbps]
Benefit
[Chart: Percent Increase (y-axis, 0 to 80) against Number of Embedded Objects (x-axis); sites Google (0, 0), Scholar (1, 12), Apple (44, 22.5), Amazon (45, 14.75), AppStore (48, 46.1), Gardlen (60, 14), NYTimes (83, 10.4); series 1024, 512, 256, 128, 56 kbps]
Benefit
[Chart: Percent Increase (y-axis, 0 to 90.0) against Number of Embedded Objects (x-axis); sites Google (0, 0), Scholar (1, 12), Apple (44, 22.5), Amazon (45, 14.75), AppStore (48, 46.1), Gardlen (60, 14), NYTimes (83, 10.4); series 1024, 512, 256, 128, 56 kbps]
File Sizes
[Chart: No Preloading; y-axis from 0 to 70000; x-axis values 0, 10.4, 12, 14, 14.75, 22.5, 46.1; series 1024, 512, 256, 128, 56]
[Chart: Preloading; y-axis from 0 to 20000; x-axis values 0, 10.4, 12, 14, 14.75, 22.5, 46.1; series 1024, 512, 256, 128, 56]
Conclusion
256kbps connection, 83 embedded objects
16 seconds vs 5 seconds
Conclusion
✓ Feasible through proxy servers
✓ Significant improvement
✓ Especially viable for the developing world
Future Work
✓ Study dynamic adaptation
✓ Synchronize ProxySender with Server
End of Presentation