Cache Resolution/Routing
Most Web caching schemes deploy many Web caches scattered over the Internet
Main challenge: how to quickly locate a cache holding the desired document
No necessary relationship between a document's origin server and its cache location
Naive schemes lead to unmanageably large cache routing tables
Cache Resolution/Routing
Out-of-date cache routing information leads to cache misses
An ideal cache routing algorithm minimizes the cost of a cache miss
Cache Resolution/Routing
Common approach – build a caching distribution tree from popular servers towards high-demand sources
– Resolution via cache routing tables / hash functions
– Works well for popular documents
Cache Routing Table
Malpani – make a group of caches function as one
– A cache is selected arbitrarily
– In case of a miss: use IP multicast (why?)
– Redirection
Cache Routing Table
Harvest – organize caches in a hierarchy
– Internet Cache Protocol: ICP
– In case of a miss: query siblings, then go upward
Cache Routing Table
Adaptive Web Caching – mesh of caches
– Distribution trees are built
– Overlapping multicast groups
– No root node will be overloaded
– For less popular objects: a long journey through the mesh
Hashing Function
Cache Array Routing Protocol: CARP
– “Query-less” caching via a hash function
– Based on an “array membership list” and the URL to determine the exact cache location
– Proxy removal: reassign only 1/n of the URLs and distribute a new hash function
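The CARP idea can be sketched with a highest-random-weight hash. A minimal sketch, assuming MD5 for scoring — the real protocol specifies its own hash function and load factors, and `carp_pick` is a hypothetical name:

```python
import hashlib

def carp_pick(url, proxies):
    """Pick the cache responsible for `url` without any query traffic.

    For each proxy in the array membership list, combine a hash of the
    proxy name with the URL and choose the proxy with the highest
    combined score ("highest random weight"). Every member holding the
    same list computes the same answer.
    """
    def score(proxy):
        digest = hashlib.md5((proxy + url).encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(proxies, key=score)
```

Because each URL belongs to whichever proxy scores highest for it, removing one proxy only remaps the URLs that proxy owned — roughly 1/n of them — which is the reassignment property described above.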
Prefetching
Caching documents at proxies improves Web performance, but the benefit is limited
Maximum cache hit rate < 50%
Prefetching
Prefetching must be effective (why?)
Prefetching can be applied in 3 ways:
– Between browser clients and Web servers
– Between proxies and Web servers
– Between browser clients and proxies
Between browser clients and Web Servers
Cunha – use a collection of Web client traces
– How to predict a user’s future Web accesses from his past Web accesses
– Two types of users: net surfers and conservatives
Between browser clients and Web Servers
Conservative users – easy to guess which document they will access next
– Prefetching pays off well
Net surfers – all documents have roughly equal probability of being accessed
– The price to be paid in extra bandwidth is too high
Between proxies and Web Servers
Markatos – Top-10 approach
– Web servers push their most popular documents (the top 10) to Web proxies
– Web proxies push popular documents to Web clients
– Web servers can anticipate > 40% of clients’ requests
– Requires cooperation from Web servers
Between proxies and Web Servers
Performance:
– Top-10 manages to prefetch up to 60% of future requests
– with less than a 20% corresponding increase in traffic
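The server-side bookkeeping behind Top-10 can be sketched as a request counter that periodically reports the ten most popular documents. A minimal sketch — `Top10Server` is a hypothetical name, and the real scheme also uses thresholds to decide when pushing is worthwhile:

```python
from collections import Counter

class Top10Server:
    """Counts document requests so the server can push its ten most
    popular documents to proxies at the end of each period."""

    def __init__(self):
        self.hits = Counter()

    def record(self, url):
        """Called once per client request for `url`."""
        self.hits[url] += 1

    def top10(self):
        """The documents to push, most popular first."""
        return [url for url, _ in self.hits.most_common(10)]
```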
Between browser clients and proxies
Fan – reduce latency by prefetching between caching proxies and browsers
– Relies on the proxy to predict which cached documents a user might reference next
– Uses idle time between user requests to push documents to the user
– Reduces client latency by 23%
Prefetching - summary
The first two approaches increase WAN traffic
The last approach affects traffic over modems/LANs
Cache placement/replacement
A good document placement/replacement algorithm can yield a high hit rate
Cache placement – has not been well studied
Cache replacement – can be classified into 3 categories: traditional, key-based and cost-based policies
Cache replacement – traditional policies
Least Recently Used – LRU
Least Frequently Used – LFU
Pitkow/Recker – LRU, except if all objects are accessed within the same day
Cache replacement – key-based policies
Size – evicts the largest object (why?)
LRU-MIN – biased in favor of smaller objects
– Evicts the LRU object with size > S; if none, size > S/2, then S/4, etc.
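The LRU-MIN rule can be sketched as follows, assuming the cache is an `OrderedDict` mapping URL to size, ordered from least to most recently used; `lru_min_evict` is a hypothetical helper name:

```python
from collections import OrderedDict

def lru_min_evict(cache, needed):
    """Free at least `needed` bytes from `cache` (URL -> size, LRU first).

    Look for objects of size >= threshold, starting with threshold =
    needed, and evict the least recently used one that qualifies; if
    none qualifies, halve the threshold and try again. This biases
    eviction towards large objects while keeping small ones cached.
    """
    evicted, freed, threshold = [], 0, needed
    while freed < needed and cache:
        candidates = [u for u, s in cache.items() if s >= threshold]
        if candidates:
            victim = candidates[0]  # least recently used qualifying object
            freed += cache.pop(victim)
            evicted.append(victim)
        elif threshold > 1:
            threshold //= 2
        else:
            # Threshold exhausted: fall back to plain LRU.
            victim, size = next(iter(cache.items()))
            cache.pop(victim)
            freed += size
            evicted.append(victim)
    return evicted
```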
Cache replacement – key-based policies
LRU-Threshold – LRU, but objects with size > threshold are never cached
Lowest Latency First – evicts the object with the lowest download latency
Cache replacement – cost-based policies
GreedyDual-Size – associates a cost with each object
– Evicts the object with the lowest cost/size value
Server-assisted – models the value of caching an object in terms of its fetching cost, size and cache prices
– Evicts the object with the lowest value
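One common formulation of GreedyDual-Size gives each object the value H = L + cost/size, where L is an "inflation" term raised to the evicted object's H on each eviction, so long-idle objects eventually lose to fresh ones. A minimal sketch, assuming unit fetch cost by default — class and attribute names are illustrative:

```python
class GreedyDualSize:
    """GreedyDual-Size cache: evicts the object with the lowest
    H = L + cost/size, inflating L on each eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.L = 0.0
        self.H = {}      # url -> current value
        self.sizes = {}  # url -> size in bytes

    def access(self, url, size, cost=1.0):
        if url in self.H:
            # Hit: restore the object's value relative to current L.
            self.H[url] = self.L + cost / size
            return
        # Miss: evict lowest-valued objects until the new one fits.
        while self.used + size > self.capacity and self.H:
            victim = min(self.H, key=self.H.get)
            self.L = self.H[victim]  # inflate the baseline
            self.used -= self.sizes.pop(victim)
            del self.H[victim]
        self.H[url] = self.L + cost / size
        self.sizes[url] = size
        self.used += size
```

With equal costs, the lowest H belongs to the largest stale object, so the policy naturally blends recency, size, and cost.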
Cache coherency
Caches provide lower access latency
Side effect: stale pages
Every Web cache must keep the pages in its cache up to date
Cache coherency
HTTP commands that assist Web proxies in maintaining cache coherence:
HTTP GET
Conditional GET: HTTP GET combined with the If-Modified-Since header
Pragma: no-cache
Last-Modified: date
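The conditional-GET exchange can be sketched as the decision the origin (or proxy) makes when a request carries If-Modified-Since: answer 304 Not Modified if the resource is unchanged, else 200 with a fresh copy. `respond` is a hypothetical helper, assuming standard HTTP date strings:

```python
from email.utils import parsedate_to_datetime

def respond(if_modified_since, last_modified):
    """Status code for a GET with an optional If-Modified-Since header.

    304 -> resource unchanged since the client's copy, send no body
    200 -> resource changed (or request was unconditional), send body
    """
    if if_modified_since is not None:
        ims = parsedate_to_datetime(if_modified_since)
        lm = parsedate_to_datetime(last_modified)
        if lm <= ims:
            return 304
    return 200
```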
Cache coherence mechanisms
Current cache coherency schemes provide two types of consistency:
– Strong cache consistency
– Weak cache consistency
Strong cache consistency
Client validation – polling every time
– Cached resources are potentially out-of-date
– If-Modified-Since is sent with each access to the proxy
– Many 304 responses
Strong cache consistency
Server invalidation
– Upon detecting a resource change, the server sends invalidation messages
– The server must keep track of lists of clients
– The lists can become out-of-date
Weak cache consistency
Piggyback Cache Validation (PCV): on every communication between a proxy and a server, the proxy piggybacks a list of cached resources for validation
Weak cache consistency
Piggyback Server Invalidation (PSI): on every communication between a server and a proxy, the server piggybacks a list of resources that have changed since the last access
Weak cache consistency
Combination of PSI and PCV: the choice depends on the time since the proxy last requested invalidations
– If the time is small: PSI
– For longer gaps: PCV
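The hybrid choice can be sketched as a simple threshold rule; the function name and the one-hour cutoff are illustrative assumptions, not values from the original work:

```python
def choose_mechanism(seconds_since_last_contact, threshold=3600.0):
    """After a short gap the server's list of changed resources is
    small, so piggybacking server invalidations (PSI) is cheap; after
    a long gap that list grows, so the proxy piggybacks its own
    validation requests (PCV) instead."""
    return "PSI" if seconds_since_last_contact <= threshold else "PCV"
```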