Hybrid Prefetching for WWW Proxy Servers Yui-Wen Horng , Wen-Jou Lin , Hsing Mei Department of Computer Science and Information Engineering Fu Jen Catholic University, Taiwan, R.O.C International Conference on Parallel and Distributed Systems,1998 Mikt Tien [email protected]


Page 1: Hybrid Prefetching for WWW Proxy Servers

Hybrid Prefetching for WWW Proxy Servers

Yui-Wen Horng, Wen-Jou Lin, Hsing Mei

Department of Computer Science and Information Engineering

Fu Jen Catholic University, Taiwan, R.O.C.

International Conference on Parallel and Distributed Systems, 1998

Mikt Tien

[email protected]

Syslab Yan Zen

Page 2: Hybrid Prefetching for WWW Proxy Servers

Outline

1.Introduction

2.Related Work

3.Prefetching Mechanism

4.Experiment Results

5.Conclusion and Future Work

Page 3: Hybrid Prefetching for WWW Proxy Servers

1.Introduction

Depending on the location of the cache, we can classify caches into three types: client cache, server cache, and proxy cache.

Some studies show that the maximum possible hit rate of a proxy cache is about 30%-50%. Prefetching is a clear way to overcome this limit.

Prefetchers can likewise be classified into three types: client prefetcher, server prefetcher, and proxy prefetcher.

A client prefetcher can analyze one user's requests to predict future requests, while a proxy prefetcher can gather information from multiple clients and multiple servers.

Page 4: Hybrid Prefetching for WWW Proxy Servers

2.Related Work

Interactive Prefetching Proxy Server (Wcol) (Content Parsing)

-- Gets linked documents, including images, by parsing HTML pages.

-- Advantage: the cache hit rate is more than 60%.

-- Disadvantage: the traffic is 4.12 times larger than that of a normal caching proxy, and parsing HTML adds overhead to the server.
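The content-parsing idea can be sketched in a few lines. This is only an illustration using Python's standard `html.parser`, not Wcol's actual code; the class name `LinkExtractor`, the function `extract_links`, and the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect hyperlink and inline-image URLs found in one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []  # absolute URLs, in order of first appearance

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            target = attrs["href"]
        elif tag == "img" and attrs.get("src"):
            target = attrs["src"]
        else:
            return
        url = urljoin(self.base_url, target)  # resolve relative links
        if url not in self.links:             # skip duplicates
            self.links.append(url)


def extract_links(base_url, html):
    """Return the prefetch candidates linked from the given page."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


page = '<a href="/a.html">A</a><img src="logo.gif"><a href="/a.html">again</a>'
candidates = extract_links("http://example.com/dir/index.html", page)
```

Every page that passes through the proxy would need this parsing step, which is where the extra CPU overhead and traffic of pure content parsing come from.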

Page 5: Hybrid Prefetching for WWW Proxy Servers

Related Work(cont.)

Top-10 Approach

-- Requires cooperation between web server, proxy, and client browser. Higher-level servers know which documents are popular among their lower-level clients.

-- Advantage: the hit rate is more than 40%, and the traffic increase is no more than 10% in most cases.

-- Disadvantage: to achieve good prediction, every proxy and server must follow the same policy. This is the major obstacle to implementation.

Page 6: Hybrid Prefetching for WWW Proxy Servers

Related Work(cont.)

Predictive Prefetching

-- The prefetcher is installed in the client but communicates with a prediction engine that is part of the web server. The engine tracks client request sequences and builds a dependency graph that contains probability information, so the prefetcher can prefetch the files with high access probability.

-- Disadvantage: requires a specially designed protocol or a modification to HTTP.

Page 7: Hybrid Prefetching for WWW Proxy Servers

Related Work(cont.)

Prefetching Files System for WWW Servers

-- Uses the “Referer” information contained in HTTP request messages to build an access probability graph. “Referer” is an HTTP request header that indicates the URL from which the requested URL was linked.

-- Advantage: the response time can be reduced by more than 20%.

-- Disadvantage: not all requests contain this header, and it takes time to accumulate enough data to build the graph.

Page 8: Hybrid Prefetching for WWW Proxy Servers

Related Work(cont.)

Our approach

-- A hybrid prefetcher that both parses HTML and builds an access probability graph. To prefetch more intelligently, both access popularity and access probability are considered.

Page 9: Hybrid Prefetching for WWW Proxy Servers

3.Prefetching Mechanism

Page 10: Hybrid Prefetching for WWW Proxy Servers

3.1 Problem 1: How to find more documents that may be requested in the near future?

Prefetch by Parsing HTML

-- It needs no information from past request history and can find related URLs even if the requested URL was never retrieved before.

-- However, it increases the server overhead and the traffic.

Page 11: Hybrid Prefetching for WWW Proxy Servers

3.1 Problem 1: How to find more documents that may be requested in the near future? (cont.)

Prefetch by Referer

-- Builds a “Referer link graph”.

-- The accumulated weight of each node and edge can also be used to calculate the access probability, which is useful for prefetching.

-- Disadvantage: maintaining the graph increases memory overhead, and not all requests contain Referer information.
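A minimal sketch of such a graph follows. The slides do not give the exact probability formula, so the estimate used here (edge weight divided by the weight of the referring node) is an assumption, and the names `RefererLinkGraph`, `record`, and `probability` are hypothetical.

```python
from collections import defaultdict


class RefererLinkGraph:
    """Weighted link graph built from (Referer, requested URL) pairs."""

    def __init__(self):
        # node weight: how many times each URL was requested
        self.node_weight = defaultdict(int)
        # edge weight: referer -> url -> how many times the link was followed
        self.edge_weight = defaultdict(lambda: defaultdict(int))

    def record(self, url, referer=None):
        """Update the graph for one request; referer may be missing."""
        self.node_weight[url] += 1
        if referer:
            self.edge_weight[referer][url] += 1

    def probability(self, referer, url):
        """Assumed estimate of P(url is requested | referer was requested)."""
        visits = self.node_weight.get(referer, 0)
        if visits == 0:
            return 0.0
        followed = self.edge_weight.get(referer, {}).get(url, 0)
        # The referring page may have been served from a browser cache,
        # so the edge count can exceed the node count; cap at 1.0.
        return min(1.0, followed / visits)


graph = RefererLinkGraph()
for _ in range(3):
    graph.record("/index.html")
graph.record("/a.html", referer="/index.html")
graph.record("/a.html", referer="/index.html")
graph.record("/b.html", referer="/index.html")
p_a = graph.probability("/index.html", "/a.html")
p_b = graph.probability("/index.html", "/b.html")
```

The two nested dictionaries grow with every distinct URL and link seen, which illustrates the memory overhead noted above.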

Page 12: Hybrid Prefetching for WWW Proxy Servers

3.1 Problem 1: How to find more documents that may be requested in the near future? (cont.)

Hybrid Prefetch

-- If the Referer exists, use it to build the “Referer link graph”; otherwise, parse the HTML file to build the link graph.

-- Fewer HTML files require parsing than in the first approach, so the CPU overhead is smaller.
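One possible reading of this rule, as a sketch: a request that carries a Referer adds the edge referer -> URL without any parsing, and only a request without one has its HTML body parsed. The names `update_link_graph`, `link_graph`, and `stats` are hypothetical, and the exact dispatch in the original system may differ.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefParser(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


link_graph = defaultdict(set)         # page URL -> URLs it links to
stats = {"referer": 0, "parsed": 0}   # how often each path was taken


def update_link_graph(url, referer=None, html=None):
    """Hybrid rule: prefer the Referer header, fall back to parsing HTML."""
    if referer:
        link_graph[referer].add(url)  # cheap: no parsing needed
        stats["referer"] += 1
    elif html is not None:
        parser = HrefParser()
        parser.feed(html)             # expensive path, taken less often
        for href in parser.hrefs:
            link_graph[url].add(urljoin(url, href))
        stats["parsed"] += 1


update_link_graph("http://example.com/a.html",
                  referer="http://example.com/index.html")
update_link_graph("http://example.com/index.html",
                  html='<a href="b.html">B</a>')
```

The `stats` counter makes the CPU saving visible: only requests that take the `parsed` path pay the parsing cost.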

Page 13: Hybrid Prefetching for WWW Proxy Servers

3.1 Problem 1: How to find more documents that may be requested in the near future? (cont.)

Prefetch by Directory

-- Assumption: related documents are usually placed in the same directory on the web server.

-- If the directory structure of the web site does not agree with this assumption, the ratio of successful prefetching may be low.
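The directory assumption can be illustrated as follows. This sketch assumes the proxy keeps a list of URLs it has already seen (`known_urls`); that list and the function names are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def directory_key(url):
    """Return (host, directory) for a URL, e.g. ('example.com', '/docs')."""
    parts = urlsplit(url)
    return parts.netloc, posixpath.dirname(parts.path)


def same_directory_candidates(requested_url, known_urls):
    """Known URLs on the same host and in the same directory as the request."""
    key = directory_key(requested_url)
    return [u for u in known_urls
            if u != requested_url and directory_key(u) == key]


known_urls = [
    "http://example.com/docs/a.html",
    "http://example.com/docs/b.html",
    "http://example.com/docs/sub/c.html",   # different directory
    "http://other.org/docs/d.html",         # different host
]
candidates = same_directory_candidates("http://example.com/docs/a.html",
                                       known_urls)
```

A site that groups pages by date or by file type rather than by topic would yield candidates that are rarely requested, which is the failure case described above.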

Page 14: Hybrid Prefetching for WWW Proxy Servers

3.2 Problem 2: How to increase the ratio of prefetched documents that are actually requested?

Popularity Constraint

-- Build a table to track the popularity of each requested document. The table is updated whenever a new request arrives.

Probability Constraint --

Page 15: Hybrid Prefetching for WWW Proxy Servers

3.2 Problem 2: How to increase the ratio of prefetched documents that are actually requested? (cont.)

Combined Constraint

-- Combines both constraints with a logical “OR”: a document is prefetched if it passes either constraint.
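A sketch of the combined test, assuming each constraint is a simple threshold comparison. The threshold values and the names `record_request` and `should_prefetch` are hypothetical; the access probability is taken as an input from whatever link graph the proxy maintains.

```python
from collections import Counter

popularity = Counter()  # popularity table: URL -> number of requests seen


def record_request(url):
    """Update the popularity table when a new request arrives."""
    popularity[url] += 1


def should_prefetch(url, access_probability,
                    min_popularity=3, min_probability=0.5):
    """Combined constraint: pass if EITHER threshold is met (logical OR)."""
    popular = popularity[url] >= min_popularity        # popularity constraint
    probable = access_probability >= min_probability   # probability constraint
    return popular or probable


for _ in range(3):
    record_request("/hot.html")

hot_unlikely = should_prefetch("/hot.html", access_probability=0.1)
cold_likely = should_prefetch("/cold.html", access_probability=0.9)
cold_unlikely = should_prefetch("/cold.html", access_probability=0.1)
```

Using OR rather than AND admits more candidates: a popular document is prefetched even when its link is rarely followed from the current page, and the reverse also holds.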

Page 16: Hybrid Prefetching for WWW Proxy Servers

4.Experiment Results

Experiment A

Page 17: Hybrid Prefetching for WWW Proxy Servers
Page 18: Hybrid Prefetching for WWW Proxy Servers
Page 19: Hybrid Prefetching for WWW Proxy Servers
Page 20: Hybrid Prefetching for WWW Proxy Servers

Experiment B - Popularity Constraint (threshold)

prefetch level = 2, cache size = 10 MB

Page 21: Hybrid Prefetching for WWW Proxy Servers
Page 22: Hybrid Prefetching for WWW Proxy Servers
Page 23: Hybrid Prefetching for WWW Proxy Servers
Page 24: Hybrid Prefetching for WWW Proxy Servers

Experiment B - Probability Constraint

Page 25: Hybrid Prefetching for WWW Proxy Servers
Page 26: Hybrid Prefetching for WWW Proxy Servers
Page 27: Hybrid Prefetching for WWW Proxy Servers
Page 28: Hybrid Prefetching for WWW Proxy Servers

5.Conclusion and Future Work

The hybrid prefetching technique is effective at improving the hit rate of a caching proxy, and its prediction accuracy is higher than that of the other methods.

It can achieve a cache hit rate of more than 70% while keeping the traffic increase below 40%.

Our experiments also show that separate caches perform better than one common cache when the total cache size is small.