Design and Implementation of Page Replacement Algorithm for Web Proxy Caching

Yogesh Niranjan, Computer Science and Engineering, Lakshmi Narain College of Technology, Indore, India
Shailendra Tiwari, Computer Science and Engineering, Lakshmi Narain College of Technology, Indore, India

Abstract
With an ever-increasing emphasis of human activity on the Internet, the World Wide Web is growing extensively, which results in heavy network traffic. Proxy caching is one remedy for this traffic and the associated latency. Proxy caching has been used to enhance the performance of user access to popular web content, and web proxy caching is a well-known technique for reducing access latencies and bandwidth consumption. As in other caching systems, a replacement policy is necessary for determining when and what to evict from the cache, and many proxy caching algorithms have been proposed earlier. This paper proposes a page replacement algorithm for a proxy server. The simulation results show that it performs better than other algorithms such as LRU, LFU and FIFO.

Keywords: Web cache, latency, access time, proxy server, page replacement algorithm.

1. Introduction
The Internet is a global system of interconnected computer networks: a network of networks consisting of millions of private, public, academic and government networks. The number of Internet users increases day by day, and satisfying every user efficiently might not be possible; users suffer from low response times, network traffic, server load and so on. Hence, to overcome latency, traffic and server load, a proxy server is connected between the server and the client. Thanks to the cache memory on the proxy server, caching allows large organizations to significantly reduce their upstream bandwidth usage. Web proxy caches play an important role in reducing server loads, client request latencies, and network traffic. As in other caching systems, a replacement policy is necessary for determining when and what to evict from the cache, and many proxy caching algorithms have been proposed and evaluated. This paper analyzes the distribution of current web contents and re-evaluates several proxy cache replacement algorithms, including LFU, LRU and FIFO.

A proxy server that passes all requests and replies unmodified is usually called a gateway or sometimes a tunneling proxy. A proxy server can be placed on the user's local computer or at various points between the user and the destination servers on the Internet [1]. A proxy server provides many types of services to its clients and fulfills their requests. When a request arrives at the proxy server, the proxy checks its cache first. On a hit, the proxy proceeds at high speed and the page is served to the client; on a miss, the request is forwarded to the specified web server. When the cache is saturated and a request arrives for a page that is not in the cache, the page replacement algorithm decides which page has to be evicted so that space can be allocated for the new page. The page replacement algorithm therefore determines how well the cache is utilized. The objective of this paper is the design and implementation of a proxy caching algorithm that can be used to reduce network traffic and bandwidth consumption.

A web cache is a mechanism for the temporary storage (caching) of web documents, such as HTML pages and images, to reduce bandwidth usage, server load, and perceived lag. A web cache stores copies of documents passing through it; subsequent requests may be satisfied from the cache if certain conditions are met.
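To make the hit/miss flow just described concrete, the following is a minimal sketch in C# (the language named later in Section 5). It is illustrative only and not the authors' implementation; the class name, the fixed object-count capacity and the fetchFromOrigin delegate are assumptions introduced for the example.

using System;
using System.Collections.Generic;
using System.Linq;

// Minimal sketch of the proxy lookup flow: on a hit the cached page is served
// directly; on a miss the request is forwarded to the specified web server,
// and if the cache is saturated a replacement policy must first evict a page.
class ProxyCacheSketch
{
    private readonly int capacity;                         // assumed fixed limit on cached objects
    private readonly Dictionary<string, string> cache =
        new Dictionary<string, string>();                  // URL -> cached page body

    public ProxyCacheSketch(int capacity) { this.capacity = capacity; }

    public string Serve(string url, Func<string, string> fetchFromOrigin)
    {
        if (cache.TryGetValue(url, out string page))
            return page;                                   // cache hit: the proxy serves the page itself

        page = fetchFromOrigin(url);                       // cache miss: forward to the origin web server
        if (cache.Count >= capacity)
        {
            string victim = cache.Keys.First();            // placeholder victim; Section 4 discusses
            cache.Remove(victim);                          // LRU, LFU, FIFO and the proposed CSPC policy
        }
        cache[url] = page;                                 // keep a copy for future requests
        return page;
    }
}

In this sketch the eviction step is only a placeholder; the replacement policies compared later in the paper plug in at exactly that point.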
2. Related Work
Web proxy tracing and caching are highly active research areas. Recent studies of Web traffic include analyses of Web access traces from the perspective of browsers and proxies. Earlier tracing studies were limited in request rate, number of requests, and diversity of population. The most recent tracing studies have been larger and more diverse. In addition to static analysis, some studies have also used trace-driven cache simulation to characterize the locality and sharing properties of very large traces and to study the effects of cookies, aborted connections, and persistent connections on the performance of proxy caching [2].

Gonzalez-Canete et al. [3] studied six replacement algorithms: LRU, LFU and LFUDA, together with three developed specifically for web documents, GD-Size, GDSF and GD. They concluded that no single replacement policy outperforms the others for all content types.

Prischepa [4] analyzes the effectiveness of the LFU-K replacement policy for the purpose of caching on a proxy server. Cao and Irani [5] introduced GreedyDual-Size, which incorporates locality with cost and size concerns in a simple, non-parameterized fashion for high performance. Golan [6] proposes an optimal offline algorithm for replacement in a multilevel cache, based on an algorithm for the relaxed list update problem and the DEMOTE operation.

Shiva Shankar Reddy P. and Swetha L. [7] propose a new method of caching for HTTP proxy servers that uses lower bandwidth by maintaining a cache of Internet objects. V. Sathiyamoorthi and Murali Bhaskaran [8] discuss various data preprocessing techniques carried out on the proxy server access log to generate web access patterns, which are then used in further applications.

Martin Arlitt, Ludmila Cherkasova, John Dilley, Richard Friedrich and Tai Jin [9] introduce virtual caches, an approach for improving the performance of the cache for multiple metrics simultaneously. Yong Zhen Guo, Kotagiri Ramamohanarao and Laurence A. F. Park [10] propose a web page prefetching technique, which must be able to predict the next set of pages that will be accessed by users; a PageRank-like algorithm is proposed for conducting this web page prediction.

R. Gupta and Tokekar [11] have presented a preeminent pair of replacement algorithms for the L1 and L2 caches of a proxy server. According to them, the access patterns of the L1 and L2 caches are different, so a replacement algorithm that gives efficient results for L1 may not be suitable for the L2 cache. They concluded that the pair of algorithms is more efficient than using the same algorithm for both caches.

Most studies of replacement policies have been done for the L1 cache. Replacement algorithms can be recency based, frequency based, or may follow both aspects; examples include Least Recently Used (LRU) [12,13], Least Recently Used-K (LRU-K) [14] and Most Recently Used (MRU) [12,15]. Factor et al. [16] propose a policy, Karma, which uses application hints to partition the cache and to manage each range of blocks with the policy best suited to its access pattern.

John Dilley et al. [17] report on the implementation and characterization of two newly proposed cache policies, LFU with Dynamic Aging (LFUDA) and GreedyDual-Size with Frequency (GDSF), in the Squid cache. The combination of replacement algorithm and offered workload determines the efficiency of the cache in optimizing the utilization of system resources.

3. Issues of Web Proxy Caching
Due to the explosive and ever-growing size of the web, distributed caching has received considerable attention. The major aim of a cache is to move frequently accessed information closer to the users. A caching system should improve performance for end users, network operators, and content providers. Caching is recognized as an effective way to speed up web access, reduce the latency perceived by users, reduce network traffic, reduce server load, and improve response time for users.

A. Load Balancing
This situation occurs whenever a large number of clients wish to simultaneously access data or obtain some service from a local cache with a single server. If the site is not provisioned to deal with all of these clients simultaneously, service may be degraded or lost. Several approaches to overcoming this issue have been proposed. The most frequently used method is caching: copies of popular pages or services are stored throughout the Internet, which spreads the work of serving a page or service across several servers.

B. Transparency
Transparency of cache systems enables users to get the benefits of caches without knowing that they exist, or without knowing their physical location. The advantages of this technique are ease of use, no configuration required of the end user, and the fact that users cannot bypass the cache.


C. Scalability
It is vital that the cache system be scalable as the number of users and servers increases. Caches can be organized as clustered or cooperative caches, or as stand-alone caches. Stand-alone caches are better suited for individual systems and are easier to maintain. However, cooperation between caches can provide more information about cached data, which can be communicated between caches without referring to the originating servers.

D. Cache Miss
Cache systems should be capable of efficiently handling cache misses. When a request misses in the cache, a decision must be taken on where to forward the request. A cache system should also decide which data to cache, or whether all cached data should be treated equally.
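As an illustration of these two decisions, the fragment below sketches one possible rule set in C#: choose a forwarding target for the missed request, and decide whether the fetched object is worth admitting to the cache. The sibling-cache check, the no-store flag and the 1 MB size limit are assumptions made for the example, not rules taken from the paper.

using System;

// Sketch of the two miss-handling decisions named above:
// (1) where to forward a request that missed in the cache, and
// (2) whether the fetched object should be cached at all.
static class MissHandlingSketch
{
    // Forward to a cooperating sibling cache when it advertises the URL, otherwise to the origin server.
    public static string ChooseForwardTarget(string url, Func<string, bool> siblingHasUrl)
    {
        return siblingHasUrl(url) ? "sibling-cache" : "origin-server";
    }

    // Not all objects need to be treated equally: responses marked no-store or larger
    // than an (assumed) 1 MB admission limit are passed through without being cached.
    public static bool ShouldCache(long objectSizeBytes, bool markedNoStore)
    {
        return !markedNoStore && objectSizeBytes <= 1_048_576;
    }
}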

4. Design and Implementation of the Proposed Proxy Caching Algorithm
Web proxy caching is a well-known strategy for improving the performance of Web-based systems by keeping Web objects that are likely to be used in the near future closer to the client. Most current Web browsers still employ traditional caching policies that are not efficient for Web caching.

In Web caching, when the client requests a page from a server, the page is fetched from the server and the response is given back to the client. According to the locations where objects are cached, Web caching technology can be classified into three categories: client's browser caching, client-side proxy caching, and server-side proxy caching.

In Client Side Proxy Caching (CSPC), a caching server acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource available from a different server. The proxy server evaluates the request according to its filtering rules; for example, it may filter traffic by IP address or protocol. If the request is validated by the filter, the proxy provides the resource by connecting to the relevant server and requesting the service on behalf of the client. A proxy server may optionally alter the client's request or the server's response, and sometimes it may serve the request without contacting the specified server at all; in this case, it serves the response from its local cache.
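The filtering step mentioned above (by IP address or protocol) can be sketched as follows; the allowed address range and protocol set are assumptions chosen for the example, not a policy given in the paper.

using System;
using System.Collections.Generic;
using System.Net;

// Sketch of the proxy's request-evaluation step: the request is checked against
// simple filtering rules (client IP address and protocol) before the proxy serves
// it from its cache or contacts the specified server on the client's behalf.
static class RequestFilterSketch
{
    private static readonly HashSet<string> AllowedSchemes =
        new HashSet<string> { "http", "https" };           // assumed protocol rule

    public static bool IsAllowed(IPAddress client, Uri requestUri)
    {
        // Assumed IP rule: only IPv4 clients in the 10.0.0.0/8 private range may use this proxy.
        byte[] addr = client.GetAddressBytes();
        bool clientAllowed = addr.Length == 4 && addr[0] == 10;

        bool protocolAllowed = AllowedSchemes.Contains(requestUri.Scheme);
        return clientAllowed && protocolAllowed;
    }
}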

Figure 1 shows a proxy server with cache memory, which offers several benefits: it reduces network traffic, reduces latency, and reduces the load on the web server. This architecture also inherently supports faster browsing of web pages. In this system, when the proxy cache is saturated and a new page request arrives at the proxy, a page replacement algorithm decides which page has to be evicted from the cache. The efficiency of the system depends on the page replacement algorithm.

Client Side Proxy Caching Algorithm (CSPC)
There has been extensive theoretical and empirical work on exploring web caching policies that perform best under different performance metrics. Many algorithms have been proposed and found effective for web proxy caching. These algorithms range from simple traditional schemes such as Least Recently Used (LRU), Least Frequently Used (LFU), First-In First-Out (FIFO), and various size-based algorithms, to more complex hybrid algorithms such as LRU-Threshold, which resembles LRU with a size limit on single cache elements; Lowest Relative Value (LRV), which uses cost, size and last reference time to calculate its utility; and GreedyDual, which combines locality, size and cost considerations into a single online algorithm.
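For reference, the following is a generic, textbook-style sketch of the simplest of these schemes, LRU, using a dictionary for lookup and a linked list for recency order. It is not the implementation evaluated in Section 5; pages are simply identified by integers, matching the numeric reference string used there.

using System.Collections.Generic;

// Generic textbook-style LRU sketch: a dictionary gives O(1) lookup, a linked
// list keeps pages in recency order, and the least recently used page is the
// victim when the cache is full.
class LruCacheSketch
{
    private readonly int capacity;
    private readonly Dictionary<int, LinkedListNode<int>> index =
        new Dictionary<int, LinkedListNode<int>>();
    private readonly LinkedList<int> order = new LinkedList<int>(); // front = most recently used

    public LruCacheSketch(int capacity) { this.capacity = capacity; }

    // Returns true on a hit; on a miss the page is loaded, evicting the LRU page if needed.
    public bool Access(int page)
    {
        if (index.TryGetValue(page, out LinkedListNode<int> node))
        {
            order.Remove(node);
            order.AddFirst(node);                 // hit: promote to most recently used
            return true;
        }
        if (index.Count >= capacity)
        {
            int victim = order.Last.Value;        // miss with a full cache: evict the LRU page
            order.RemoveLast();
            index.Remove(victim);
        }
        index[page] = order.AddFirst(page);       // admit the missed page as most recently used
        return false;
    }
}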

Figure 1: Proxy Server Caching. Clients C1, C2, ..., Cn connect through a proxy server (with its own cache and processor) to the origin server over the Internet.


Table I shows the proposed Client Side Web Proxy Caching algorithm. This work focuses on the algorithm for replacing documents. As the study of web cache characteristics has progressed, algorithms for replacing documents based on the statistics of collected web data have been proposed. Such schemes consider the following factors:

- Document reference frequency
- Document size
- Consistency of documents
- Freshness of documents

Efficient schemes combine more than one of these factors in their implementation of the web cache. Some algorithms also consider different cache architectures to improve caching performance.

Table I: Client Side Proxy Cache Algorithm (CSPC)
1. WHILE there is a page p in the cache in the current window
2.     Serve the first such p and mark the page
3. IF all pages in the cache are marked
4.     Unmark all the pages
5. Evict a randomly chosen unmarked page from the cache
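The listing in Table I is terse, so the sketch below gives one possible reading of it in C#: a page served from the cache is marked; when an eviction is needed and every cached page is marked, all marks are cleared, and the victim is then picked at random among the unmarked pages. The handling of the request window and of the mark bit for a newly admitted page is not specified in the listing, so those choices are an interpretation, not the authors' code.

using System;
using System.Collections.Generic;
using System.Linq;

// One possible reading of the CSPC listing in Table I (an interpretation, not the
// authors' implementation). Pages carry a mark bit; serving a cached page marks it,
// and eviction picks a random unmarked page, clearing all marks first if necessary.
class CspcSketch
{
    private readonly int capacity;
    private readonly Dictionary<int, bool> markOf = new Dictionary<int, bool>(); // page -> mark bit
    private readonly Random random = new Random();

    public CspcSketch(int capacity) { this.capacity = capacity; }

    // Returns true on a hit; on a miss the page is admitted, evicting if the cache is saturated.
    public bool Access(int page)
    {
        if (markOf.ContainsKey(page))
        {
            markOf[page] = true;                          // steps 1-2: serve p and mark the page
            return true;
        }
        if (markOf.Count >= capacity)
        {
            if (markOf.Values.All(m => m))                // step 3: all pages in the cache are marked
                foreach (int p in markOf.Keys.ToList())
                    markOf[p] = false;                    // step 4: unmark all the pages

            List<int> unmarked = markOf.Where(kv => !kv.Value).Select(kv => kv.Key).ToList();
            int victim = unmarked[random.Next(unmarked.Count)];
            markOf.Remove(victim);                        // step 5: evict a random unmarked page
        }
        markOf[page] = false;  // admit the new page; the listing does not say whether it starts marked
        return false;
    }
}

Replaying the same reference string through this class and through an LRU, LFU or FIFO policy is enough to reproduce a comparison of the kind reported in Table II; the exact numbers, of course, depend on the trace.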

5. Experimental Result
The proposed algorithm (CSPC) was developed on Windows XP using C# .NET. A unique identification number is allotted to each unique URL in the proxy server log; these numbers are taken as the reference string that becomes the input to the algorithms. The results of the algorithms, in the form of hit rates, are shown in Table II.
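The paper does not list the code for this step, but the mapping it describes (one unique identification number per unique URL in the proxy log) and the hit-rate measure used in Table II can be sketched as follows. The assumption that the URL is the last whitespace-separated field of a log line is made only for illustration.

using System;
using System.Collections.Generic;

// Sketch of building the numeric reference string from proxy-log URLs, and of
// measuring the hit rate (hits / requests) reported in Table II.
static class ReferenceStringSketch
{
    public static List<int> BuildReferenceString(IEnumerable<string> logLines)
    {
        var idOf = new Dictionary<string, int>();          // URL -> unique identification number
        var reference = new List<int>();
        foreach (string line in logLines)
        {
            string[] fields = line.Split(' ');
            string url = fields[fields.Length - 1];        // assumed: URL is the last field of the line
            if (!idOf.TryGetValue(url, out int id))
                idOf[url] = id = idOf.Count + 1;           // allot the next unused number to a new URL
            reference.Add(id);
        }
        return reference;
    }

    // Replays the reference string against any policy exposing an Access(page) -> hit? method
    // (for example the LRU or CSPC sketches above) and returns the hit ratio.
    public static double HitRatio(IEnumerable<int> reference, Func<int, bool> access)
    {
        int hits = 0, total = 0;
        foreach (int page in reference)
        {
            total++;
            if (access(page)) hits++;
        }
        return total == 0 ? 0.0 : (double)hits / total;
    }
}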

6. Conclusion
This paper concentrated on exploring a client-side proxy caching algorithm that is well suited for a proxy server. A real trace of web references was obtained from the log details of a proxy server, and for the simulation a numeric reference string was obtained by giving a numeric identity to each of the URLs. The simulation shows that the proposed Client Side Proxy Caching algorithm performs better than other algorithms such as LRU, LFU and FIFO; the CSPC algorithm improves the hit ratio by approximately 10.67%. After exhaustive simulation experiments it is concluded that, for proxy caching, the hit ratio performance of CSPC is better than that of the other algorithms.

Table II: Hit rate of the replacement algorithms for increasing numbers of requests

No. of Requests    LRU      LFU      FIFO     CSPC
100                0.400    0.410    0.390    0.460
300                0.486    0.496    0.483    0.563
500                0.490    0.496    0.480    0.560
800                0.548    0.551    0.545    0.625
1000               0.591    0.596    0.585    0.664
1200               0.635    0.640    0.631    0.691
1500               0.634    0.638    0.630    0.701
1800               0.658    0.658    0.653    0.711
2000               0.677    0.679    0.671    0.737

Figure 2: Hit Ratio Analysis (hit ratio of LRU, LFU, FIFO and CSPC versus number of requests).


7. References

[1] David A. Maltz and Pravin Bhagwat, "Improving HTTP Caching Proxy Performance with TCP Tap", Technical report, IBM, March 1998.
[2] A. Feldman et al., "Performance of Web Proxy Caching in Heterogeneous Bandwidth Environments", in Proceedings of INFOCOM '99, 1999.
[3] F. J. Gonzalez-Canete, E. Casilari and Alicia Trivino-Cabrera, "Characterizing Document Types to Evaluate Web Cache Replacement Policies", International Conference on Information Technology (ITNG), 2007.
[4] Vladimir V. Prischepa, "An Efficient Web Caching Algorithm based on LFU-K Replacement Policy", Spring Young Researchers' Colloquium on Databases and Information Systems, 2004.
[5] P. Cao and S. Irani, "Cost-Aware WWW Proxy Caching Algorithms", in Proc. USENIX Symposium on Internet Technologies and Systems, Monterey, CA, 1997.
[6] Gala Golan, "Multilevel Cache Management Based on Application Hints", Computer Science Department, Technion, Haifa 32000, Israel, November 24, 2003.
[7] Shiva Shankar Reddy P. and Swetha L., "Analysis and Design of Enhanced HTTP Proxy Caching Server", International Journal of Computer Technology, Vol. 2 (3), 537-541.
[8] V. Sathiyamoorthi and Murali Bhaskaran, "Data Preprocessing Techniques for Pre-Fetching and Caching of Web Data through Proxy Server", International Journal of Computer Science and Network Security, Vol. 11, 2011.
[9] Martin Arlitt, Ludmila Cherkasova, John Dilley, Richard Friedrich and Tai Jin, "Evaluating Content Management Techniques for Web Proxy Caches", ACM SIGMETRICS Performance Evaluation Review, Vol. 27, Issue 4, March 2000.
[10] Yong Zhen Guo, Kotagiri Ramamohanarao and Laurence A. F. Park, "Personalized PageRank for Web Page Prediction Based on Access Time-Length and Frequency", in Proceedings of the 2007 IEEE/WIC/ACM International Conference on Web Intelligence.
[11] R. Gupta and Tokekar, "Preeminent Pair of Replacement Algorithms for L1 and L2 Cache for Proxy Server", First Asian Himalayas International Conference on Internet (AH-ICI), 2009.
[12] Abraham Silberschatz and Peter Baer Galvin, Operating System Concepts, Addison-Wesley, 1997.
[13] A. Dan and D. Towsley, "An Approximate Analysis of the LRU and FIFO Buffer Replacement Schemes", in Proceedings of ACM SIGMETRICS, Boulder, Colorado, United States, 1990, pp. 143-152.
[14] E. J. O'Neil, P. E. O'Neil and G. Weikum, "The LRU-K Page Replacement Algorithm for Database Disk Buffering", in Proc. ACM SIGMOD International Conference on Management of Data, pp. 297-306, May 1993.
[15] M. J. Bach, The Design of the UNIX Operating System, Englewood Cliffs, NJ: Prentice-Hall, 1986.
[16] Michael Factor, Assaf Schuster and Gala Yadgar, "Multilevel Cache Management Based on Application Hints", Technion Computer Science Department Technical Report CS-2006.
[17] John Dilley, Martin Arlitt and Stephane Perret, "Enhancement and Validation of Squid's Cache Replacement Policy", Internet Systems and Applications Laboratory, HP Laboratories Palo Alto, HPL-1999-69, May 1999.
