Web Cache

YEE VANG

WEB CACHE

INTRODUCTION

• The Internet has many users
  • Issues with access latency (lag)
  • Servers crashing

• How to solve this?
  • One solution: web caching

WEB CACHE

• What is web cache?

• Cache – “a place of storage”

• Web cache – “a place to store websites or web objects”

WEB CACHING

• Web Caching

• A technique that can reduce:

  • Access latency: “the time it takes for a request to be completed”

  • Network congestion: “occurs when a link or node is carrying so much data that its quality of service deteriorates”

WEB CACHING

• How does it reduce user access latency and network congestion?

• No-cache example
  • A movie storage room in the next building
  • Contains one copy of every movie
  • One worker

WEB CACHING

• Cache example

• The same as the previous example

• A movie storage room in the next building
• Contains one copy of every movie

• One worker

• A movie rack that can hold five movies at a time, to simulate a movie cache

WEB CACHING

• In the example

• Customer -> User

• Movie -> Web Pages

• Worker -> ISP

• Movie Storage Room -> Origin Server

• Movie Rack -> Web Cache

CACHE HIT/CACHE HIT RATE

• Cache hit
  • Occurs when a request can be satisfied by the web cache.

• In the movie store example
  • Hit?

• Cache hit rate
  • The percentage of requests that score a cache hit, i.e. are satisfied by a previously cached object
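
The hit-rate arithmetic can be sketched in a few lines of Python. The function name `hit_rate` and the customer counts are illustrative assumptions, not figures from the slides.

```python
def hit_rate(hits: int, requests: int) -> float:
    """Return the cache hit rate as a percentage of all requests."""
    if requests == 0:
        return 0.0  # no requests yet, so no meaningful rate
    return 100.0 * hits / requests


# Hypothetical movie-rack day: 10 customers, 4 found their movie on the rack.
print(hit_rate(4, 10))  # 40.0
```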

CACHE MISS

• Cache miss

• Occurs when a request cannot be satisfied by the web cache.

• In the movie example
  • Miss?

WEB CACHING

• Pros

• Can reduce Internet bandwidth usage
  • If a request can be satisfied by the web cache

• Reduces the workload of the origin server
  • By storing previously requested web objects in a web cache

• Reduces user access latency
  • When a cache hit occurs

WEB CACHING

• Cons

• Not every web object is cacheable
  • Websites that generate dynamic data
  • Content that requires an active connection
  • https://

• Stale cache
  • Cached objects that are out of date

• Bottleneck at the proxy server (in proxy caching)

TYPES OF WEB CACHE

• Browser Cache

• Proxy Cache

• Reverse Proxy Cache

BROWSER CACHE

• Cache stored at the client level
  • Meaning the cache is actually stored on the user’s computer

• e.g. temporary Internet files

Mozilla/Netscape C:\Users\Profile\AppData\Roaming\Mozilla\Profiles\[random.string].slt\

Firefox C:\Users\Profile\AppData\Roaming\Mozilla\Firefox\Profiles\[random.string]\

Thunderbird C:\Users\Profile\AppData\Roaming\Thunderbird\Profiles\[random.string]\

[http://www.holgermetzger.de/pdl.html]

BROWSER CACHE

• Advantages of Browser Cache

• Stored locally
  • On a cache hit it saves bandwidth
  • Decrease in access latency

• User pattern
  • The same user has a higher probability of browsing the same website each day.

BROWSER CACHE

• Disadvantages of Browser Cache

• Takes up hard drive space

• Stale objects
  • Caching always carries the risk of serving a stale object.

• Stored locally
  • Only serves one computer.

PROXY CACHE

• The cache is stored on a proxy server

• The proxy server usually serves more than one user

• Acts as a gateway to the Internet for a large company or institution

http://www.codeproject.com/KB/web-cache/ExploringCaching/cache_array.jpg

PROXY CACHE

• Requests are directed to the proxy server instead of the origin server.

• On a cache hit
  • The proxy returns the requested object to the user.

• On a cache miss
  • The request is forwarded to the origin server.

PROXY CACHE

• Advantages
  • Serves more than one client
    • A cache hit can occur even when different users make the same request.
  • Gateway
    • Companies can limit what users can access.

• Disadvantages
  • Serves more than one client
    • Can be overloaded.
  • Gateway
    • When the proxy server is down, all users are disconnected from the Internet.

REVERSE PROXY CACHE

• Serves the origin server

• Essentially a proxy server that sits in front of the origin server.

http://odino.org/images/proxy-cache.jpg

REVERSE PROXY CACHE

• What happens when a request is made?

• It is directed to the reverse proxy cache server

• On a cache hit
  • The object is returned to the user

• On a cache miss
  • The request is forwarded to the origin server
  • A copy is stored on the reverse proxy server
  • A copy is sent back to the user
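
The hit/miss flow above can be sketched as a small Python class. This is a minimal in-memory illustration, not a real proxy: `ReverseProxyCache`, `fake_origin` and the `/logo.png` path are hypothetical names, and the origin server is assumed to be a plain function.

```python
from typing import Callable, Dict, List


class ReverseProxyCache:
    """Toy reverse proxy cache: serve from memory, else ask the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:           # cache hit: return the stored copy
            self.hits += 1
            return self._store[url]
        self.misses += 1                 # cache miss: forward to the origin
        body = self._fetch(url)
        self._store[url] = body          # keep one copy on the proxy
        return body                      # send one copy back to the user


origin_calls: List[str] = []


def fake_origin(url: str) -> bytes:
    """Stand-in for the origin server; records every request it receives."""
    origin_calls.append(url)
    return f"content of {url}".encode()


cache = ReverseProxyCache(fake_origin)
cache.get("/logo.png")   # miss: goes to the origin
cache.get("/logo.png")   # hit: the origin is not contacted again
```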

REVERSE PROXY CACHE

• Advantages

• Reduces the workload of the origin server
  • An object can be requested once, cached on the reverse proxy server, and served to many clients without contacting the origin server again

• Static files can be cached
  • e.g. CSS files, JavaScript files, logos
  • Allows the origin server to spend more of its capacity on dynamic content

REVERSE PROXY CACHE

• Disadvantages

• Bottleneck
  • Many users making requests at the same time

• Stale cache / old files
  • Risk of cache hits on stale objects; static files can also be outdated

WEB CACHING ARCHITECTURE

• Two main web caching architectures
  • Hierarchical
  • Distributed

• They both use the network shown below

[3]

HIERARCHICAL CACHING ARCHITECTURE

• There is more than one level of cache between the users and the origin server

• Typically employs more than one type of cache

• There are parent, child and sibling relationships between caches.


• First level of cache – Institutional Network
• Second level of cache – Regional Network
• Third level of cache – National Network
• Parents? Child? Siblings?

[3]


• When a request is made
  • It is sent to the level-one cache

• If the level-one cache cannot satisfy the request
  • It is forwarded to the level-two cache

• If the level-two cache cannot satisfy the request
  • It is forwarded to the next level.

• If the request reaches the last level and still cannot be satisfied, it is forwarded to the origin server
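
The level-by-level lookup can be sketched as follows. The slides do not say what happens on the way back, so this sketch assumes that every level the request passed through keeps a copy; the function name `hierarchical_lookup` and the dictionary-based caches are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def hierarchical_lookup(
    url: str,
    levels: List[Dict[str, str]],
    origin: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each cache level in order; fall back to the origin server."""
    for depth, cache in enumerate(levels, start=1):
        if url in cache:
            body = cache[url]
            # Assumption: lower levels that missed keep a copy for next time.
            for lower in levels[: depth - 1]:
                lower[url] = body
            return body, f"level {depth}"
    body = origin(url)                   # every level missed
    for cache in levels:                 # assumption: all levels store a copy
        cache[url] = body
    return body, "origin"


# Institutional, regional and national caches as plain dictionaries.
institutional: Dict[str, str] = {}
regional: Dict[str, str] = {}
national: Dict[str, str] = {"/a": "A"}
levels = [institutional, regional, national]

print(hierarchical_lookup("/a", levels, lambda url: "from origin"))
```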


• Advantages
  • Multiple levels of cache offer more chances for a cache hit
  • Leads to decreased access latency
  • Also reduces the workload on the origin servers

• Disadvantages
  • Every level added to the hierarchy adds delay
  • On a cache miss there is a slight increase in latency
  • Higher-level cache servers are expensive

DISTRIBUTED CACHING ARCHITECTURE

• Caches are stored at the institutional level
• Regional and national levels are eliminated
• The institutional caches in the distributed system are siblings of each other.

[3]


• What is special in the distributed caching architecture?

• Each institutional cache can contact its sibling cache

• So each cache can know what is in the other caches

• They can receive objects from their sibling



• Query-Based Approach – Internet Cache Protocol (ICP)
  • The request is sent to the configured institutional cache server
  • On a cache miss, the request is broadcast to the institutional cache’s sibling caches.
  • If a sibling cache contains the requested object, it sends the object to the immediate institutional cache. The immediate institutional cache then stores a copy and sends another copy to the client.
  • If no sibling contains the requested object, a timeout occurs.
    • At that point the immediate institutional cache forwards the request to the origin server.
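
The query-based flow can be simulated in Python without any networking. This sketch does not send real ICP messages: `SiblingCache`, `query_siblings`, `handle_request` and the simulated `reply_delay` are all hypothetical, and a sibling whose reply would arrive after the timeout is simply treated as silent. Storing the origin's answer locally is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SiblingCache:
    store: Dict[str, str] = field(default_factory=dict)
    reply_delay: float = 0.01            # simulated seconds until it answers


def query_siblings(
    url: str, siblings: List[SiblingCache], timeout: float
) -> Optional[str]:
    """Return the object from the first sibling that answers in time."""
    for sibling in siblings:
        answered_in_time = sibling.reply_delay <= timeout
        if answered_in_time and url in sibling.store:
            return sibling.store[url]
    return None                          # nobody answered: treated as a timeout


def handle_request(
    url: str,
    local: Dict[str, str],
    siblings: List[SiblingCache],
    origin: Callable[[str], str],
    timeout: float = 0.05,
) -> Tuple[str, str]:
    if url in local:
        return local[url], "local hit"
    body = query_siblings(url, siblings, timeout)
    source = "sibling"
    if body is None:                     # timeout: go to the origin server
        body = origin(url)
        source = "origin"
    local[url] = body                    # the immediate cache keeps a copy
    return body, source
```

A sibling with a large `reply_delay` models a peer that is too slow to be useful, which is the case where the request falls through to the origin server.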



• Directory-Based Approach – Cache Digest (Squid)

• In this approach metadata is used.

• Each cache is aware of its siblings’ contents.

• When a request is made, it is sent to the immediate institutional cache.

• On a cache miss, the institutional cache checks its metadata to see whether any of its sibling caches contains the requested object.

• If so, the object is requested from that sibling.

• If not, the request is forwarded to the origin server
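
A minimal sketch of the directory-based idea, assuming each cache keeps a plain Python set of the URLs its siblings advertise. Squid's actual Cache Digests are compact Bloom-filter summaries; the exact set used here is a simplification, and `DigestCache` and its methods are hypothetical names.

```python
from typing import Callable, Dict, Set, Tuple


class DigestCache:
    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, str] = {}
        self.siblings: Dict[str, "DigestCache"] = {}
        self.sibling_digests: Dict[str, Set[str]] = {}

    def add_sibling(self, other: "DigestCache") -> None:
        """Register a sibling and take a snapshot of what it holds."""
        self.siblings[other.name] = other
        self.sibling_digests[other.name] = set(other.store)

    def get(self, url: str, origin: Callable[[str], str]) -> Tuple[str, str]:
        if url in self.store:
            return self.store[url], "local hit"
        # Local miss: consult the metadata instead of querying every sibling.
        for name, digest in self.sibling_digests.items():
            if url in digest:
                body = self.siblings[name].store.get(url)
                if body is not None:     # the snapshot may be out of date
                    self.store[url] = body
                    return body, f"sibling {name}"
        body = origin(url)               # no sibling advertises the object
        self.store[url] = body
        return body, "origin"
```

Compared with the query-based approach, no broadcast and no timeout are needed, at the cost of keeping the metadata reasonably fresh.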


• Advantages

• If sibling cache servers are grouped by common interests
  • More chance of a cache hit

• If sibling cache servers are assigned based on proximity
  • Faster response time


• Disadvantages

• If sibling cache servers are grouped by common interests
  • The servers may be too far apart
  • Increase in access latency

• If sibling cache servers are assigned based on proximity
  • The servers may not share common interests
  • Less chance of a cache hit

WEB CACHE COHERENCY

• Web cache coherency
  • Is the cache up to date?

• Web cache coherency mechanism
  • Validation check

• When a web object is first received
  • It is time-stamped

• When the cached object is used, the cache server makes a validation check by sending the time stamp to the origin server, which can then tell whether the object has changed since that time
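
The validation check can be sketched with a simulated origin server and integer clock ticks instead of real time. In HTTP the analogous mechanism is a conditional request with `If-Modified-Since`; the classes `Origin` and `ValidatingCache` below are hypothetical and only mimic that exchange in memory.

```python
from typing import Dict, Optional, Tuple


class Origin:
    def __init__(self) -> None:
        # url -> (body, time of last modification)
        self.objects: Dict[str, Tuple[str, int]] = {}

    def put(self, url: str, body: str, now: int) -> None:
        self.objects[url] = (body, now)

    def get_if_modified_since(self, url: str, stamp: Optional[int]) -> Optional[str]:
        """Return the body only if it changed after `stamp`, else None."""
        body, modified = self.objects[url]
        if stamp is not None and modified <= stamp:
            return None                  # cached copy is still valid
        return body


class ValidatingCache:
    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        # url -> (body, time stamp taken when the object was received)
        self.store: Dict[str, Tuple[str, int]] = {}

    def get(self, url: str, now: int) -> Tuple[str, str]:
        cached = self.store.get(url)
        stamp = cached[1] if cached else None
        body = self.origin.get_if_modified_since(url, stamp)
        if body is None and cached is not None:
            return cached[0], "validated"
        self.store[url] = (body, now)    # new or updated object: re-stamp it
        return body, "fetched"
```

Every use of a cached object still costs one round trip to the origin server, but the object itself is only transferred when it has changed.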

WEB CACHE COHERENCY

• Web cache coherency mechanism
  • Callback

• When a web object is cached, the cache server receives a callback promise for the object from the origin server.

• Callback promise – a promise that the origin server will notify the cache server if the object has been updated

• So the cached object is up to date as long as the cache server has not received a notification from the origin server
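
A minimal in-memory sketch of the callback idea. `CallbackOrigin`, `CallbackCache` and the `invalidate` notification are hypothetical names; this sketch assumes the cache simply drops an object when it is notified and fetches it again on the next request.

```python
from typing import Dict, List


class CallbackCache:
    def __init__(self, origin: "CallbackOrigin") -> None:
        self.origin = origin
        self.store: Dict[str, str] = {}

    def invalidate(self, url: str) -> None:
        """Called by the origin when the object has been updated."""
        self.store.pop(url, None)

    def get(self, url: str) -> str:
        if url not in self.store:        # never cached, or invalidated
            self.store[url] = self.origin.fetch(url, self)
        return self.store[url]           # no notification so far: up to date


class CallbackOrigin:
    def __init__(self) -> None:
        self.objects: Dict[str, str] = {}
        # url -> caches that hold a callback promise for it
        self.promises: Dict[str, List[CallbackCache]] = {}

    def fetch(self, url: str, cache: CallbackCache) -> str:
        self.promises.setdefault(url, []).append(cache)   # issue the promise
        return self.objects[url]

    def update(self, url: str, body: str) -> None:
        self.objects[url] = body
        for cache in self.promises.pop(url, []):          # keep the promise
            cache.invalidate(url)
```

The design choice is the opposite of the validation check: the origin server carries the bookkeeping, and the cache never has to ask.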

WEB CACHE COHERENCY

• Web cache coherency mechanism
  • Expiration

• When an object is cached, an expiration date is assigned to it

• The object is valid until its expiration date

• The first request for the object after its expiration date is sent to the origin server again.

• At that time a new expiration date is assigned to the object
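
The expiration mechanism can be sketched as a cache with a fixed time-to-live. The class name `ExpiringCache`, the single `ttl` for all objects, and the integer clock passed in as `now` are assumptions made for the illustration.

```python
from typing import Callable, Dict, Tuple


class ExpiringCache:
    def __init__(self, origin: Callable[[str], str], ttl: int) -> None:
        self.origin = origin
        self.ttl = ttl
        # url -> (body, expiration time)
        self.store: Dict[str, Tuple[str, int]] = {}

    def get(self, url: str, now: int) -> Tuple[str, str]:
        entry = self.store.get(url)
        if entry is not None and now < entry[1]:
            return entry[0], "hit"       # still valid, origin not contacted
        body = self.origin(url)          # first request after expiration
        self.store[url] = (body, now + self.ttl)   # assign a new expiry
        return body, "fetched"
```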

CACHE PLACEMENT AND REPLACEMENT POLICIES

• How cached objects are replaced

• Random
  • A random cached object is replaced.

• Size
  • The largest cached object is replaced first

• FIFO – First In First Out
  • The oldest cached object is eliminated first
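
The three simple policies differ only in how the victim is chosen, which a short Python sketch can show side by side. The function `choose_victim`, the policy strings and the sample sizes are hypothetical.

```python
import random
from typing import Dict, List


def choose_victim(
    policy: str,
    sizes: Dict[str, int],
    insertion_order: List[str],
    rng=random,
) -> str:
    """Pick the cached object to evict under the given policy."""
    if policy == "random":
        return rng.choice(insertion_order)
    if policy == "size":
        return max(insertion_order, key=lambda url: sizes[url])  # largest first
    if policy == "fifo":
        return insertion_order[0]                                 # oldest first
    raise ValueError(f"unknown policy: {policy}")


sizes = {"/a": 10, "/b": 50, "/c": 20}   # object sizes in KB (made up)
order = ["/a", "/b", "/c"]               # order in which they were cached

print(choose_victim("size", sizes, order))   # /b
print(choose_victim("fifo", sizes, order))   # /a
```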


• LRU – Least Recently Used
  • The cached object that has not been requested for the longest time is eliminated first

• LRU/MIN – Least Recently Used Minimum
  • The first document whose size is larger than or equal to the size of the new document is removed

• HLRU – History Least Recently Used
  • Records how many times each cached object is used
  • Elimination is based on
    • LRU
    • Least used
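
Plain LRU is the easiest of these to sketch. The version below uses Python's `collections.OrderedDict` and counts capacity in objects rather than bytes, which is a simplification; the class name `LRUCache` is hypothetical, and LRU/MIN and HLRU are not covered.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self.items:
            return None                      # cache miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key: str, value: str) -> None:
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used


rack = LRUCache(capacity=2)
rack.put("movie A", "...")
rack.put("movie B", "...")
rack.get("movie A")                          # A is now the most recent
rack.put("movie C", "...")                   # B is evicted, not A
print(list(rack.items))                      # ['movie A', 'movie C']
```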


• LFU – Least Frequently Used
  • Cached objects are sorted by how frequently they are used
  • On a cache hit, the counter for the hit object is incremented by one.
  • The list is then re-ordered
  • The web object with the lowest count is replaced first

• LFU – Aging
  • Same as LFU
  • The average count of all cached objects is monitored
  • When the average count reaches a threshold, all counts are reset to zero
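
LFU with the aging rule can be sketched as follows. Instead of keeping a sorted list, this sketch looks up the minimum count at eviction time, which gives the same victim; `LFUAgingCache` and `avg_threshold` are hypothetical names, and capacity is counted in objects.

```python
from typing import Dict, Optional


class LFUAgingCache:
    def __init__(self, capacity: int, avg_threshold: float) -> None:
        self.capacity = capacity
        self.avg_threshold = avg_threshold
        self.values: Dict[str, str] = {}
        self.counts: Dict[str, int] = {}

    def _age_if_needed(self) -> None:
        """Reset every counter once the average count reaches the threshold."""
        if not self.counts:
            return
        average = sum(self.counts.values()) / len(self.counts)
        if average >= self.avg_threshold:
            for key in self.counts:
                self.counts[key] = 0

    def get(self, key: str) -> Optional[str]:
        if key not in self.values:
            return None
        self.counts[key] += 1                # cache hit: increment the counter
        self._age_if_needed()
        return self.values[key]

    def put(self, key: str, value: str) -> None:
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)   # lowest count
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts.setdefault(key, 0)
```

Without aging, an object that was popular long ago can keep a high count and never be evicted; the periodic reset gives newer objects a chance.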


• LRV – Lowest Relative Value
  • Each cached object is assigned a cost value
  • Objects with the lowest cost value are replaced first

• GD – Greedy Dual
  • Each cached object is assigned a cost value
  • The lowest-cost object is replaced first
  • All remaining cached objects then have their cost lowered by the replaced object’s cost
  • Each time a cached object is accessed, its cost is reset to its original cost
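
The Greedy Dual rules listed above translate almost line for line into Python. This sketch follows the slide's description only; `GreedyDualCache` is a hypothetical name, the cost values are supplied by the caller, and capacity is counted in objects.

```python
from typing import Dict, Optional


class GreedyDualCache:
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.values: Dict[str, str] = {}
        self.original_cost: Dict[str, float] = {}
        self.current_cost: Dict[str, float] = {}

    def get(self, key: str) -> Optional[str]:
        if key not in self.values:
            return None
        # On access, the cost is reset to its original value.
        self.current_cost[key] = self.original_cost[key]
        return self.values[key]

    def put(self, key: str, value: str, cost: float) -> None:
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.current_cost, key=self.current_cost.get)
            reduction = self.current_cost[victim]
            del self.values[victim]
            del self.original_cost[victim]
            del self.current_cost[victim]
            # Every remaining object is lowered by the victim's cost.
            for other in self.current_cost:
                self.current_cost[other] -= reduction
        self.values[key] = value
        self.original_cost[key] = cost
        self.current_cost[key] = cost
```

Objects that are expensive to fetch survive longer, but an expensive object that is never accessed again is gradually worn down and eventually evicted.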

CONCLUSION

• Web caching helps reduce:

• Network congestion

• User access latency

• Workload on the origin server

QUESTIONS?

• Questions?

REFERENCES

• [1] Barish, G., & Obraczka, K. (2000). World Wide Web caching: trends and techniques. IEEE Communications Magazine, 38(5), 178-184. doi:10.1109/35.841844. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=841844&isnumber=18201

• [2] Bakiras, S., Loukopoulos, T., Papadias, D., & Ahmad, I. (2005). Adaptive schemes for distributed web caching. Journal of Parallel and Distributed Computing. Retrieved from http://www.cs.ust.hk/~dimitris/PAPERS/JPDC05-DWC.pdf

• [3] Biersack, E. W., Rodriguez, P., & Spanner, C. (2001). Analysis of Web caching architectures: hierarchical and distributed caching. IEEE/ACM Transactions on Networking, 9(4), 404-418. doi:10.1109/90.944339. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=944339&isnumber=20434

• [4] Das, S., Dykes, S. G., & Jeffery, C. L. (1999). Taxonomy and design analysis for distributed Web caching. Proceedings of the 32nd Annual Hawaii International Conference on System Sciences (HICSS-32), 8, 10. doi:10.1109/HICSS.1999.773040. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=773040&isnumber=16788

• [5] Davison, B. D. (2001). A Web caching primer. IEEE Internet Computing, 5(4), 38-45. doi:10.1109/4236.939449. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=939449&isnumber=20329

• [6] Dubois, M., & Jeong, J. (2002, June). Cost-sensitive cache replacement algorithms. In R. Bianchini (Chair), Second Workshop on Caching, Coherence, and Consistency, New York, NY, USA. Retrieved from http://www.research.rutgers.edu/~wc3/papers/dubois.pdf.gz

• [7] Geetha, K., Gounden, N. A., & Monikandan, S. (2009). SEMALRU: An implementation of modified web cache replacement algorithm. Nature & Biologically Inspired Computing, 1406-1410. doi:10.1109/NABIC.2009.5393711. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5393711&isnumber=5393306

• [8] Hassanein, H., Liang, Z., & Liang, P. (2002). Performance comparison of alternative Web caching techniques. Proceedings of the Seventh International Symposium on Computers and Communications (ISCC 2002), 213-218. doi:10.1109/ISCC.2002.1021681. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1021681&isnumber=21983

• [9] (n.d.). Reverse Proxy Caching. In Cisco ACNS Caching and Streaming Configuration Guide (5th ed., pp. 6-1). San Jose, CA: Cisco Systems, Inc. Document OL-4070-01. Retrieved from http://www.cisco.com/en/US/docs/app_ntwk_services/waas/acns/v51/configuration/local/guide/a51cag.pdf

• [10] Tay, T. T., & Wijesundara, M. N. (2002). Distributed Web caching. The 8th International Conference on Communication Systems (ICCS 2002), 2, 1142-1146. doi:10.1109/ICCS.2002.1183311. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1183311&isnumber=26554

• [11] Vakali, A. (2000). LRU-based algorithms for web cache replacement. In K. Bauknecht, S. Kumar Madria & G. Pernul (Eds.), Electronic Commerce and Web Technologies, First International Conference (pp. 409-418). Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.59.5504&rep=rep1&type=pdf
