Design and Implementation of Page Replacement Algorithm for Web Proxy Caching

Yogesh Niranjan, Computer Science and Engineering, Lakshmi Narain College of Technology, Indore, India
Shailendra Tiwari, Computer Science and Engineering, Lakshmi Narain College of Technology, Indore, India

Abstract
With an ever-increasing emphasis of human activity on the Internet, the World Wide Web is growing extensively, which results in heavy network traffic. Proxy caching is one remedy for this traffic and the associated latency. Proxy caching has been used to enhance the performance of user access to popular web content, and web proxy caching is a well-known technique for reducing access latencies and bandwidth consumption. As in other caching systems, a replacement policy is necessary for determining when and what to evict from the cache, and many proxy caching algorithms have been proposed earlier. This paper proposes a page replacement algorithm for a proxy server. The simulation results show that it performs better than other algorithms such as LRU, LFU and FIFO.

Keywords: Web cache, latency, access time, proxy server, page replacement algorithm.

1. Introduction
The Internet is a global system of interconnected computer networks: a network of networks consisting of millions of private, public, academic and government networks. The number of Internet users increases day by day, and satisfying every user efficiently might not be possible; users suffer from low response times, network traffic, server load and so on. Hence, to overcome latency, traffic and server load, a proxy server is connected between the server and the client. Thanks to the cache memory on the proxy server, caching allows large organizations to significantly reduce their upstream bandwidth usage. Web proxy caches play an important role in reducing server loads, client request latencies, and network traffic. As in other caching systems, a replacement policy is necessary for determining when and what to evict from the cache, and many proxy caching algorithms have been proposed and evaluated. This paper analyzes the distribution of current web contents and re-evaluates several proxy cache replacement algorithms, including LFU, LRU and FIFO.

A proxy server that passes all requests and replies unmodified is usually called a gateway or sometimes a tunneling proxy. A proxy server can be placed on the user's local computer or at various points between the user and the destination servers on the Internet [1]. A proxy server provides many types of services to its clients and fulfills their requests. When a request arrives at the proxy server, the proxy checks its cache first. On a hit, the proxy proceeds at high speed and the page is served to the client; on a miss, the request is forwarded to the specified web server. When the cache is saturated and a request arrives for a page that is not in the cache, the page replacement algorithm decides which page has to be evicted so that space can be allocated for the new page. The page replacement algorithm therefore determines how well the cache is utilized. The objective of this paper is the design and implementation of a proxy caching algorithm that can be used to reduce network traffic and bandwidth consumption.

A web cache is a mechanism for the temporary storage (caching) of web documents, such as HTML pages and images, to reduce bandwidth usage, server load, and perceived lag. A web cache stores copies of documents passing through it; subsequent requests may be satisfied from the cache if certain conditions are met.
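To make the hit/miss flow just described concrete, the following is a minimal sketch in C# (the language named later in Section 5). It is illustrative only and not the authors' implementation; the class name, the fixed object-count capacity and the fetchFromOrigin delegate are assumptions introduced for the example.

using System;
using System.Collections.Generic;
using System.Linq;

// Minimal sketch of the proxy lookup flow: on a hit the cached page is served
// directly; on a miss the request is forwarded to the specified web server,
// and if the cache is saturated a replacement policy must first evict a page.
class ProxyCacheSketch
{
    private readonly int capacity;                         // assumed fixed limit on cached objects
    private readonly Dictionary<string, string> cache =
        new Dictionary<string, string>();                  // URL -> cached page body

    public ProxyCacheSketch(int capacity) { this.capacity = capacity; }

    public string Serve(string url, Func<string, string> fetchFromOrigin)
    {
        if (cache.TryGetValue(url, out string page))
            return page;                                   // cache hit: the proxy serves the page itself

        page = fetchFromOrigin(url);                       // cache miss: forward to the origin web server
        if (cache.Count >= capacity)
        {
            string victim = cache.Keys.First();            // placeholder victim; Section 4 discusses
            cache.Remove(victim);                          // LRU, LFU, FIFO and the proposed CSPC policy
        }
        cache[url] = page;                                 // keep a copy for future requests
        return page;
    }
}

In this sketch the eviction step is only a placeholder; the replacement policies compared later in the paper plug in at exactly that point.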
2. Related Work
Web proxy tracing and caching are highly active research areas. Recent studies of Web traffic include analyses of Web access traces from the perspective of browsers and proxies. Earlier tracing studies were limited in request rate, number of requests, and diversity of population. The most recent tracing studies have been larger and more diverse. In addition to static analysis, some studies have also used trace-driven cache simulation to characterize the locality and sharing properties of very large traces and to study the effects of cookies, aborted connections, and persistent connections on the performance of proxy caching [2].

Gonzalez-Canete et al. [3] studied six replacement algorithms: LRU, LFU and LFUDA, together with three developed specifically for web documents, GD-Size, GDSF and GD. They concluded that no single replacement policy outperforms the others for all content types.

Prischepa [4] analyzes the effectiveness of the LFU-K replacement policy for the purpose of caching on a proxy server. Cao and Irani [5] introduced GreedyDual-Size, which incorporates locality with cost and size concerns in a simple, non-parameterized fashion for high performance. Golan [6] proposes an optimal offline algorithm for replacement in a multilevel cache, based on an algorithm for the relaxed list update problem and the DEMOTE operation.

Shiva Shankar Reddy P. and Swetha L. [7] propose a new method of caching for HTTP proxy servers that uses lower bandwidth by maintaining a cache of Internet objects. V. Sathiyamoorthi and Murali Bhaskaran [8] discuss various data preprocessing techniques carried out on the proxy server access log to generate web access patterns, which are then used in further applications.

Martin Arlitt, Ludmila Cherkasova, John Dilley, Richard Friedrich and Tai Jin [9] introduce virtual caches, an approach for improving the performance of the cache for multiple metrics simultaneously. Yong Zhen Guo, Kotagiri Ramamohanarao and Laurence A. F. Park [10] propose a web page prefetching technique, which must be able to predict the next set of pages that will be accessed by users; a PageRank-like algorithm is proposed for conducting this web page prediction.

R. Gupta and Tokekar [11] have presented a preeminent pair of replacement algorithms for the L1 and L2 caches of a proxy server. According to them, the access patterns of the L1 and L2 caches are different, so a replacement algorithm that gives efficient results for L1 may not be suitable for the L2 cache. They concluded that the pair of algorithms is more efficient than using the same algorithm for both caches.

Most studies of replacement policies have been done for the L1 cache. Replacement algorithms can be recency based, frequency based, or may follow both aspects; examples include Least Recently Used (LRU) [12,13], Least Recently Used-K (LRU-K) [14] and Most Recently Used (MRU) [12,15]. Factor et al. [16] propose a policy, Karma, which uses application hints to partition the cache and to manage each range of blocks with the policy best suited to its access pattern.

John Dilley et al. [17] report on the implementation and characterization of two newly proposed cache policies, LFU with Dynamic Aging (LFUDA) and GreedyDual-Size with Frequency (GDSF), in the Squid cache. The combination of replacement algorithm and offered workload determines the efficiency of the cache in optimizing the utilization of system resources.

3. Issues of Web Proxy Caching
Due to the explosive and ever-growing size of the web, distributed caching has received considerable attention. The major aim of a cache is to move frequently accessed information closer to the users. A caching system should improve performance for end users, network operators, and content providers. Caching is recognized as an effective way to speed up web access, reduce the latency perceived by users, reduce network traffic, reduce server load, and improve response time for users.

A. Load Balancing
This situation occurs whenever a large number of clients wish to simultaneously access data or obtain some service from a local cache with a single server. If the site is not provisioned to deal with all of these clients simultaneously, service may be degraded or lost. Several approaches to overcoming this issue have been proposed. The most frequently used method is caching: copies of popular pages or services are stored throughout the Internet, which spreads the work of serving a page or service across several servers.

B. Transparency
Transparency of cache systems enables users to get the benefits of caches without knowing that they exist, or without knowing their physical location. The advantages of this technique are ease of use, no configuration required of the end user, and the fact that users cannot bypass the cache.


C. Scalability
It is vital that the cache system be scalable as the number of users and servers increases. Caches can be organized as clustered or cooperative caches, or as stand-alone caches. Stand-alone caches are better suited for individual systems and are easier to maintain. However, cooperation between caches can provide more information about cached data, which can be communicated between caches without referring to the originating servers.

D. Cache Miss
Cache systems should be capable of efficiently handling cache misses. When a request misses in the cache, a decision must be taken on where to forward the request. A cache system should also decide which data to cache, or whether all cached data should be treated equally.
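As an illustration of these two decisions, the fragment below sketches one possible rule set in C#: choose a forwarding target for the missed request, and decide whether the fetched object is worth admitting to the cache. The sibling-cache check, the no-store flag and the 1 MB size limit are assumptions made for the example, not rules taken from the paper.

using System;

// Sketch of the two miss-handling decisions named above:
// (1) where to forward a request that missed in the cache, and
// (2) whether the fetched object should be cached at all.
static class MissHandlingSketch
{
    // Forward to a cooperating sibling cache when it advertises the URL, otherwise to the origin server.
    public static string ChooseForwardTarget(string url, Func<string, bool> siblingHasUrl)
    {
        return siblingHasUrl(url) ? "sibling-cache" : "origin-server";
    }

    // Not all objects need to be treated equally: responses marked no-store or larger
    // than an (assumed) 1 MB admission limit are passed through without being cached.
    public static bool ShouldCache(long objectSizeBytes, bool markedNoStore)
    {
        return !markedNoStore && objectSizeBytes <= 1_048_576;
    }
}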

4. Design and Implementation of the Proposed Proxy Caching Algorithm
Web proxy caching is a well-known strategy for improving the performance of Web-based systems by keeping Web objects that are likely to be used in the near future closer to the client. Most current Web browsers still employ traditional caching policies that are not efficient for Web caching.

In Web caching, when the client requests a page from a server, the page is fetched from the server and the response is given back to the client. According to the locations where objects are cached, Web caching technology can be classified into three categories: client's browser caching, client-side proxy caching, and server-side proxy caching.

In Client Side Proxy Caching (CSPC), a caching server acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource available from a different server. The proxy server evaluates the request according to its filtering rules; for example, it may filter traffic by IP address or protocol. If the request is validated by the filter, the proxy provides the resource by connecting to the relevant server and requesting the service on behalf of the client. A proxy server may optionally alter the client's request or the server's response, and sometimes it may serve the request without contacting the specified server at all; in this case, it serves the response from its local cache.
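The filtering step mentioned above (by IP address or protocol) can be sketched as follows; the allowed address range and protocol set are assumptions chosen for the example, not a policy given in the paper.

using System;
using System.Collections.Generic;
using System.Net;

// Sketch of the proxy's request-evaluation step: the request is checked against
// simple filtering rules (client IP address and protocol) before the proxy serves
// it from its cache or contacts the specified server on the client's behalf.
static class RequestFilterSketch
{
    private static readonly HashSet<string> AllowedSchemes =
        new HashSet<string> { "http", "https" };           // assumed protocol rule

    public static bool IsAllowed(IPAddress client, Uri requestUri)
    {
        // Assumed IP rule: only IPv4 clients in the 10.0.0.0/8 private range may use this proxy.
        byte[] addr = client.GetAddressBytes();
        bool clientAllowed = addr.Length == 4 && addr[0] == 10;

        bool protocolAllowed = AllowedSchemes.Contains(requestUri.Scheme);
        return clientAllowed && protocolAllowed;
    }
}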

Figure 1 shows a proxy server with cache memory, which offers several benefits: it reduces network traffic, reduces latency, and reduces the load on the web server. This architecture also inherently supports faster browsing of web pages. In this system, when the proxy cache is saturated and a new page request arrives at the proxy, a page replacement algorithm decides which page has to be evicted from the cache. The efficiency of the system depends on the page replacement algorithm.

Client Side Proxy Caching Algorithm (CSPC)
There has been extensive theoretical and empirical work on exploring web caching policies that perform best under different performance metrics. Many algorithms have been proposed and found effective for web proxy caching. These algorithms range from simple traditional schemes such as Least Recently Used (LRU), Least Frequently Used (LFU), First-In First-Out (FIFO), and various size-based algorithms, to more complex hybrid algorithms such as LRU-Threshold, which resembles LRU with a size limit on single cache elements; Lowest Relative Value (LRV), which uses cost, size and last reference time to calculate its utility; and GreedyDual, which combines locality, size and cost considerations into a single online algorithm.
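For reference, the following is a generic, textbook-style sketch of the simplest of these schemes, LRU, using a dictionary for lookup and a linked list for recency order. It is not the implementation evaluated in Section 5; pages are simply identified by integers, matching the numeric reference string used there.

using System.Collections.Generic;

// Generic textbook-style LRU sketch: a dictionary gives O(1) lookup, a linked
// list keeps pages in recency order, and the least recently used page is the
// victim when the cache is full.
class LruCacheSketch
{
    private readonly int capacity;
    private readonly Dictionary<int, LinkedListNode<int>> index =
        new Dictionary<int, LinkedListNode<int>>();
    private readonly LinkedList<int> order = new LinkedList<int>(); // front = most recently used

    public LruCacheSketch(int capacity) { this.capacity = capacity; }

    // Returns true on a hit; on a miss the page is loaded, evicting the LRU page if needed.
    public bool Access(int page)
    {
        if (index.TryGetValue(page, out LinkedListNode<int> node))
        {
            order.Remove(node);
            order.AddFirst(node);                 // hit: promote to most recently used
            return true;
        }
        if (index.Count >= capacity)
        {
            int victim = order.Last.Value;        // miss with a full cache: evict the LRU page
            order.RemoveLast();
            index.Remove(victim);
        }
        index[page] = order.AddFirst(page);       // admit the missed page as most recently used
        return false;
    }
}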

Figure 1: Proxy Server Caching. Clients C1, C2, ..., Cn connect through a proxy server (with its own cache and processor) to the origin server over the Internet.


Table I shows the proposed Client Side Web Proxy Caching algorithm. This work focuses on the algorithm for replacing documents. As the study of web cache characteristics has progressed, algorithms for replacing documents based on the statistics of collected web data have been proposed. Such schemes consider the following factors:

- Document reference frequency
- Document size
- Consistency of documents
- Freshness of documents

Efficient schemes combine more than one of these factors in their implementation of the web cache. Some algorithms also consider different cache architectures to improve caching performance.

Table I: Client Side Proxy Cache Algorithm (CSPC)
1. WHILE there is a page p in the cache in the current window
2.     Serve the first such p and mark the page
3. IF all pages in the cache are marked
4.     Unmark all the pages
5. Evict a randomly chosen unmarked page from the cache
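The listing in Table I is terse, so the sketch below gives one possible reading of it in C#: a page served from the cache is marked; when an eviction is needed and every cached page is marked, all marks are cleared, and the victim is then picked at random among the unmarked pages. The handling of the request window and of the mark bit for a newly admitted page is not specified in the listing, so those choices are an interpretation, not the authors' code.

using System;
using System.Collections.Generic;
using System.Linq;

// One possible reading of the CSPC listing in Table I (an interpretation, not the
// authors' implementation). Pages carry a mark bit; serving a cached page marks it,
// and eviction picks a random unmarked page, clearing all marks first if necessary.
class CspcSketch
{
    private readonly int capacity;
    private readonly Dictionary<int, bool> markOf = new Dictionary<int, bool>(); // page -> mark bit
    private readonly Random random = new Random();

    public CspcSketch(int capacity) { this.capacity = capacity; }

    // Returns true on a hit; on a miss the page is admitted, evicting if the cache is saturated.
    public bool Access(int page)
    {
        if (markOf.ContainsKey(page))
        {
            markOf[page] = true;                          // steps 1-2: serve p and mark the page
            return true;
        }
        if (markOf.Count >= capacity)
        {
            if (markOf.Values.All(m => m))                // step 3: all pages in the cache are marked
                foreach (int p in markOf.Keys.ToList())
                    markOf[p] = false;                    // step 4: unmark all the pages

            List<int> unmarked = markOf.Where(kv => !kv.Value).Select(kv => kv.Key).ToList();
            int victim = unmarked[random.Next(unmarked.Count)];
            markOf.Remove(victim);                        // step 5: evict a random unmarked page
        }
        markOf[page] = false;  // admit the new page; the listing does not say whether it starts marked
        return false;
    }
}

Replaying the same reference string through this class and through an LRU, LFU or FIFO policy is enough to reproduce a comparison of the kind reported in Table II; the exact numbers, of course, depend on the trace.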

5. Experimental Result
The proposed algorithm (CSPC) was developed on Windows XP using C# .NET. A unique identification number is allotted to each unique URL in the proxy server log; these numbers are taken as the reference string that becomes the input to the algorithms. The results of the algorithms, in the form of hit rates, are shown in Table II.
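The paper does not list the code for this step, but the mapping it describes (one unique identification number per unique URL in the proxy log) and the hit-rate measure used in Table II can be sketched as follows. The assumption that the URL is the last whitespace-separated field of a log line is made only for illustration.

using System;
using System.Collections.Generic;

// Sketch of building the numeric reference string from proxy-log URLs, and of
// measuring the hit rate (hits / requests) reported in Table II.
static class ReferenceStringSketch
{
    public static List<int> BuildReferenceString(IEnumerable<string> logLines)
    {
        var idOf = new Dictionary<string, int>();          // URL -> unique identification number
        var reference = new List<int>();
        foreach (string line in logLines)
        {
            string[] fields = line.Split(' ');
            string url = fields[fields.Length - 1];        // assumed: URL is the last field of the line
            if (!idOf.TryGetValue(url, out int id))
                idOf[url] = id = idOf.Count + 1;           // allot the next unused number to a new URL
            reference.Add(id);
        }
        return reference;
    }

    // Replays the reference string against any policy exposing an Access(page) -> hit? method
    // (for example the LRU or CSPC sketches above) and returns the hit ratio.
    public static double HitRatio(IEnumerable<int> reference, Func<int, bool> access)
    {
        int hits = 0, total = 0;
        foreach (int page in reference)
        {
            total++;
            if (access(page)) hits++;
        }
        return total == 0 ? 0.0 : (double)hits / total;
    }
}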

6. Conclusion
This paper concentrated on exploring a client-side proxy caching algorithm that is well suited for a proxy server. A real trace of web references was obtained from the log details of a proxy server, and for the simulation a numeric reference string was obtained by giving a numeric identity to each of the URLs. The simulation shows that the proposed Client Side Proxy Caching algorithm performs better than other algorithms such as LRU, LFU and FIFO; the CSPC algorithm improves the hit ratio by approximately 10.67%. After exhaustive simulation experiments it is concluded that, for proxy caching, the hit ratio performance of CSPC is better than that of the other algorithms.

Table II: Hit rate of the replacement algorithms for increasing numbers of requests

No. of Requests    LRU      LFU      FIFO     CSPC
100                0.400    0.410    0.390    0.460
300                0.486    0.496    0.483    0.563
500                0.490    0.496    0.480    0.560
800                0.548    0.551    0.545    0.625
1000               0.591    0.596    0.585    0.664
1200               0.635    0.640    0.631    0.691
1500               0.634    0.638    0.630    0.701
1800               0.658    0.658    0.653    0.711
2000               0.677    0.679    0.671    0.737

Figure 2: Hit Ratio Analysis (hit ratio of LRU, LFU, FIFO and CSPC versus number of requests).


7. References

[1] David A. Maltz and Pravin Bhagwat, "Improving HTTP Caching Proxy Performance with TCP Tap", Technical report, IBM, March 1998.
[2] A. Feldman et al., "Performance of Web Proxy Caching in Heterogeneous Bandwidth Environments", in Proceedings of INFOCOM '99, 1999.
[3] F. J. Gonzalez-Canete, E. Casilari and Alicia Trivino-Cabrera, "Characterizing Document Types to Evaluate Web Cache Replacement Policies", International Conference on Information Technology (ITNG), 2007.
[4] Vladimir V. Prischepa, "An Efficient Web Caching Algorithm based on LFU-K Replacement Policy", Spring Young Researchers' Colloquium on Databases and Information Systems, 2004.
[5] P. Cao and S. Irani, "Cost-Aware WWW Proxy Caching Algorithms", in Proc. USENIX Symposium on Internet Technologies and Systems, Monterey, CA, 1997.
[6] Gala Golan, "Multilevel Cache Management Based on Application Hints", Computer Science Department, Technion, Haifa 32000, Israel, November 24, 2003.
[7] Shiva Shankar Reddy P. and Swetha L., "Analysis and Design of Enhanced HTTP Proxy Caching Server", International Journal of Computer Technology, Vol. 2 (3), 537-541.
[8] V. Sathiyamoorthi and Murali Bhaskaran, "Data Preprocessing Techniques for Pre-Fetching and Caching of Web Data through Proxy Server", International Journal of Computer Science and Network Security, Vol. 11, 2011.
[9] Martin Arlitt, Ludmila Cherkasova, John Dilley, Richard Friedrich and Tai Jin, "Evaluating Content Management Techniques for Web Proxy Caches", ACM SIGMETRICS Performance Evaluation Review, Vol. 27, Issue 4, March 2000.
[10] Yong Zhen Guo, Kotagiri Ramamohanarao and Laurence A. F. Park, "Personalized PageRank for Web Page Prediction Based on Access Time-Length and Frequency", in Proceedings of the 2007 IEEE/WIC/ACM International Conference on Web Intelligence.
[11] R. Gupta and Tokekar, "Preeminent Pair of Replacement Algorithms for L1 and L2 Cache for Proxy Server", First Asian Himalayas International Conference on Internet (AH-ICI), 2009.
[12] Abraham Silberschatz and Peter Baer Galvin, Operating System Concepts, Addison-Wesley, 1997.
[13] A. Dan and D. Towsley, "An Approximate Analysis of the LRU and FIFO Buffer Replacement Schemes", in Proceedings of ACM SIGMETRICS, Boulder, Colorado, United States, 1990, pp. 143-152.
[14] E. J. O'Neil, P. E. O'Neil and G. Weikum, "The LRU-K Page Replacement Algorithm for Database Disk Buffering", in Proc. ACM SIGMOD International Conference on Management of Data, pp. 297-306, May 1993.
[15] M. J. Bach, The Design of the UNIX Operating System, Englewood Cliffs, NJ: Prentice-Hall, 1986.
[16] Michael Factor, Assaf Schuster and Gala Yadgar, "Multilevel Cache Management Based on Application Hints", Technion Computer Science Department Technical Report CS-2006.
[17] John Dilley, Martin Arlitt and Stephane Perret, "Enhancement and Validation of Squid's Cache Replacement Policy", Internet Systems and Applications Laboratory, HP Laboratories Palo Alto, HPL-1999-69, May 1999.
