
An Improvised Dynamic and Semantic based Web Cache Replacement Policy

K Geetha^a and N Ammasai Gounden^b

^a Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli-15, Tamilnadu, India 620015. Contact: [email protected]

^b Department of Electrical and Electronics Engineering, National Institute of Technology, Tiruchirappalli-15, Tamilnadu, India 620015

This paper proposes a web cache replacement policy based on the semantic content of the pages cached at the client side. Two models, the Clustered Model (CM) and the Relational Model (RM), are proposed. They focus on dynamicity, which refers to the dynamic nature of the content, and on semantic content, which captures the relation among the information available in cached web pages; hence the name DynaSem. The proposed policy marks pages for eviction in priority order of the Eviction Index (EI) in CM and the Relation Index (RI) in RM. CM uses an interface with a web browser incorporated into it. A trie data structure, which makes searching more efficient, has been framed to store the well-known categories of cached content as clusters. Pages with the highest EI are marked for eviction. RM employs a technique to reveal the relation among cached documents. It evicts the documents that are least related (minimum RI) to an incoming document that needs to be stored, so that only related documents are cached; the contents of the cache therefore represent the documents that are of interest to the user and that are more static in nature. The proposed policy incorporates two algorithms: one to find the dynamic count of a given web page 'P' and the other to find the semantic relation between the cached pages. Both models (CM and RM) are used to establish the semantic relation. The policy has been evaluated by model-driven simulation with an input set consisting of a few web pages. The parameters pertinent to cache replacement algorithms are computed, and the results show an improvement over the original semantic-based policies.

Keywords: Web caching, Replacement Policies, Eviction, Semantic Relation, Dynamism.

1. INTRODUCTION

Web caching is the process of storing frequently accessed web pages or documents. The increasing demand for web services has created the need for web caching, which can reduce Internet traffic, download time, network bandwidth usage, server load, and perceived lag. A cache, being a limited resource in terms of size, becomes saturated quite frequently, and hence evictions have to be made often. In wireless networks in particular, the client cache at a mobile terminal is very small, which demands frequent replacement. The state of the art offers a multitude of policies based on recency, frequency, size, or some function combining these parameters.

A cache server stores web objects (e.g., HTML pages, images, and files) locally for use by future requests to those objects. As cache size is finite, a cache replacement policy is needed to manage cache content. If a cache is full when an object needs to be stored, the policy will determine which object is to be evicted to make room for the new object.

However, in practical implementations, replacement usually takes place before the cache is really full. The cache uses two watermarks, high and low, to guide the replacement process. If the total size of the cached objects exceeds the high watermark, the policy evicts objects until the low watermark is reached. The advantage of doing this is that it reduces the overhead of invoking the policy on demand. The goal of the replacement policy is to make the best use of available resources, including disk space, processing power, server load, and network bandwidth. The increasing use of the internet and its emerging appli-
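The two-watermark scheme described above can be illustrated with a minimal sketch. The class name `WatermarkCache`, the `high` and `low` fractions, and the use of LRU to pick victims are assumptions made for illustration only; they are not taken from the paper, and any victim-selection rule could be substituted.

```python
from collections import OrderedDict


class WatermarkCache:
    """Toy cache that evicts from the high watermark down to the low one.

    LRU is used here only as a stand-in victim-selection rule (an
    assumption); it is not the policy proposed in the paper.
    """

    def __init__(self, capacity_kb, high=0.9, low=0.7):
        self.high_kb = capacity_kb * high   # start evicting above this size
        self.low_kb = capacity_kb * low     # stop evicting at or below this size
        self.objects = OrderedDict()        # url -> size in KB, oldest first
        self.used_kb = 0
        self.evictions = 0

    def get(self, url):
        """Return True on a hit and refresh the object's recency."""
        if url not in self.objects:
            return False
        self.objects.move_to_end(url)
        return True

    def put(self, url, size_kb):
        """Store an object, then run replacement if the high mark is crossed."""
        if url in self.objects:
            self.used_kb -= self.objects.pop(url)
        self.objects[url] = size_kb
        self.used_kb += size_kb
        if self.used_kb > self.high_kb:
            self._evict_to_low_watermark()

    def _evict_to_low_watermark(self):
        # One invocation removes several objects, so the policy does not
        # have to run again on every subsequent insertion.
        while self.used_kb > self.low_kb and self.objects:
            _, size_kb = self.objects.popitem(last=False)
            self.used_kb -= size_kb
            self.evictions += 1
```

With a 100 KB cache, `high=0.9` and `low=0.5`, inserting ten 10 KB objects triggers a single replacement pass on the tenth insertion, which removes the five oldest objects.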


International Journal of Information Processing, 5(3), 14-25, 2011. ISSN: 0973-8215. IK International Publishing House Pvt. Ltd., New Delhi, India.


[Figure 11 plot: number of evictions (y-axis, 0 to 30) versus cache size in KB (x-axis, 10000 to 100000) for the LRU, SEMANTIC, SEMALRU and DYNASEM policies.]

Figure 11. Number of replacements vs. varying cache size for different policies.

... DynaSem policy outperforms the other policies in all the above metrics.

4. CONCLUSIONS AND FUTURE WORK

This work aims at improving the semantic-based web cache replacement policy by considering the level of dynamism among pages that possess the same relation. A modified policy termed 'DynaSem' has been developed. A formal framework for the DynaSem policy has been designed that incorporates two kinds of logic: dynamic and semantic relations. Rating documents by their dynamic count leads to a conservative improvement in most of the performance metrics used to evaluate the cache replacement policy. The semantic relation has been established using the clustered model and the relational model. Both models aim at identifying the dependencies between the cached documents and the new incoming document. The request set is generated by a simulation process. The clustered model can be chosen if the user's access pattern explores all the sub-links provided in a web site.

The relational model can be preferred if the user switches between various web sites for related information. Though both models have the same time complexity, better results can be obtained with CM in spite of its increased space complexity. Using model-driven simulation, the performance of the DynaSem policy has been analyzed for related and unrelated request sets. Even if the user access pattern is not related, this policy does not deteriorate much relative to the other policies, namely LRU, SEMANTIC and SEMALRU. Ranking documents by their dynamism shows commendable results, and hence dynamism can be used as a vital parameter for tuning the performance of all prevalent replacement strategies that ignore file relations and communication overhead.

An increasingly important technique to enhance web caching performance is to prefetch web pages. Prefetching can happen in a predictive manner or in an interactive manner. For predictive prefetching, the proposed policy can be used to predict the reference probability of new requests after analyzing the user access pattern. The experiments conducted in this work are restricted to an isolated cache. A possible extension would be to experiment with a grid of cooperating caches. The policy addresses the interests of a single user and can be extended to serve a group of users. Instead of the trie structure used in CM, the standard vector model that is widely employed in search engines for information retrieval could also be adopted.
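To make the vector-model alternative mentioned above more concrete, the following is a minimal sketch of how a relation-based eviction choice might look. It is not the authors' RI or dynamic-count computation, which this excerpt does not define. The cosine similarity over raw term counts, the tie-break that prefers evicting the more dynamic page, and the names `term_vector`, `cosine` and `choose_victim` are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (assumed stand-in for a document vector)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def choose_victim(cache, incoming_text):
    """Pick the URL to evict.

    cache maps url -> (page_text, dynamic_count). The page least related
    to the incoming document is chosen; among equally related pages the
    one with the higher dynamic count goes first (assumed tie-break).
    """
    incoming = term_vector(incoming_text)

    def rank(url):
        text, dynamic_count = cache[url]
        relation = cosine(term_vector(text), incoming)
        return (relation, -dynamic_count)

    return min(cache, key=rank)
```

For example, with a cached page about web cache policies and two unrelated sports pages, an incoming document about semantic web caching would mark the more frequently changing sports page for eviction first.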

ACKNOWLEDGEMENT

The authors would like to thank Mr. N Ramasubramanian, Associate Professor, Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli, for permitting the simulation work to be conducted in his lab and for his valuable suggestions in making this work possible.


K Geetha was born in Tiruchirappalli, Tamilnadu, India, on June 24, 1972. She received the B.E. degree in Computer Science and Engineering from Regional Engineering College, Tiruchirappalli, India, in 1996 and the M.Tech. degree in computer science from the National Institute of Technology (formerly REC), Tiruchirappalli, India, in 2003. She is currently pursuing her doctoral programme in the area of Web caching and its applications. She worked as a Research Associate at the National Institute of Technology, Tiruchirappalli, India, from 2000 to 2006 and taught courses on Computer Architecture, Computer Networking, and Digital Systems and Microprocessors to graduate and undergraduate students. Since 2009 she has been working as an Assistant Professor in the Department of Computer Science and Engineering, Bharathidasan University, Tiruchirappalli, India. She has also guided projects related to network security, web technology and computer architecture.

N Ammasai Gounden was born in Coimbatore, Tamilnadu, India, on October 5, 1955. He received the B.E. degree from the College of Engineering, Guindy, India (Madras University) in 1978 and the M.E. degree in control systems from P.S.G. College of Technology, Coimbatore, India (Madras University) in 1980. He received the Ph.D. degree from Bharathidasan University, Tiruchirappalli, India, in 1990. Currently, he is a professor in the Department of Electrical and Electronics Engineering, National Institute of Technology, Tiruchirappalli, where he has been working since 1982. His areas of interest are power electronic applications in renewable energy systems, energy conversion, use of programmable digital controllers in hybrid renewable systems, pattern recognition and memory architecture.