
Proxy Cache and YOU

By Stuart H. Schwartz

What is cache anyway?

• The general idea of cache is simple…

• Buffer data from a slow, large source within a (usually) smaller, faster source.

• We can free up bus traffic (or bandwidth) and also speeeeeed up access to data.

• Used everywhere in modern computers.
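As a minimal sketch of that idea, the snippet below puts a small dictionary in front of a pretend slow source. The names `slow_source`, `read` and `slow_reads` are made up for illustration; a real cache would also need a size limit and an eviction policy, which is what the rest of this talk is about.

```python
slow_reads = 0  # counts how often we had to go to the slow source


def slow_source(key):
    """Stand-in for the slow, large source (disk, network, ...)."""
    global slow_reads
    slow_reads += 1
    return f"data for {key}"


cache = {}  # the smaller, faster store


def read(key):
    """Serve from the cache when possible, otherwise fetch and remember."""
    if key not in cache:
        cache[key] = slow_source(key)
    return cache[key]


for key in ["a", "b", "a", "a", "b"]:
    read(key)

print(slow_reads)  # 2: only the first request for each key reaches the slow source
```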

Why are we here? Today..?

• Learn a little about:

– Cache

– Web cache

– Problems of web cache

– Solutions to the problems of web cache

– Becoming s-m-r-t Smart!

Ok.. Cool… How does this apply to the inter-ma-web?

• Well, the web can be thought of as another I/O medium, like a very very very large number of external devices on a USB bus.

• Caching data at various locations between the requesting process and the data source can really help speed things up for the local machine and across the web.

So, why not just use regular caching methods?

• Some small differences exist between web cache and the cache used in a local machine's hardware architecture.

• On a local machine, the program using the memory usually changes the values. Once all changes are made, or an eviction occurs, the values are written back to main memory. This is not the case for web cache, which is generally pulled in read-only.

There’s more…

• Traditional replacement policies work under the assumption that blocks of data are pulled into cache in uniform sizes, and that the cost of pulling data in is relatively consistent for every block of data that could be requested...

• Again, web cache is different. Individual web objects can vary in size and can also vary in network cost for retrieval.

So… who cares?

• It's not a big problem until the web cache must evict data. Common eviction policies aren’t designed to handle non-uniform block sizes or non-uniform retrieval costs.

• Evicting the object that will be used furthest in the future is optimal only when size and cost are equal across all objects.

That sucks… so what do we do?

• We come up with a policy that will evict objects from the web cache in such an order as to maximize some metric we care about… or have someone smarter than us come up with one.

Wait, what metrics are you talking about? Isn’t it just hit or miss?

• Now that there is more to the problem, there are more metrics we might care about minimizing or maximizing. Some include:

• Object hit rate

• Byte hit rate

• Latency

• Network Hops

• $
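The first two metrics can disagree badly, which is why it matters which one we pick. Here is a minimal sketch; the request log and the `hit_rates` helper are made up for illustration, and each log entry is (object size in bytes, whether it was served from cache).

```python
def hit_rates(log):
    """Return (object hit rate, byte hit rate) for a list of (size_bytes, was_hit)."""
    hits = sum(1 for _, was_hit in log if was_hit)
    hit_bytes = sum(size for size, was_hit in log if was_hit)
    total_bytes = sum(size for size, _ in log)
    return hits / len(log), hit_bytes / total_bytes


# Hypothetical log: three small objects hit, one large object missed.
log = [(1_000, True), (1_000, True), (1_000, True), (97_000, False)]
object_rate, byte_rate = hit_rates(log)
print(f"object hit rate: {object_rate:.2f}, byte hit rate: {byte_rate:.2f}")
# object hit rate: 0.75, byte hit rate: 0.03
```

Three out of four requests were hits, yet only 3% of the bytes came from cache, so a policy tuned for one metric can look poor on the other.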

So, hasn’t anyone looked into this?

• YES! Some replacement policies exist, like:

– Least Recently Used

– Least Frequently Used

– LRU-Threshold

– Log(Size) + LRU

– Hyper-G

And..

• Pitkow/Recker

• Lowest-Latency-First

• Hybrid

• Lowest Relative Value

• GreedyDual-Size

Whoa… That’s a lot of algorithms… What’s the best?

• Well, that’s a tough call. They all have their advantages and disadvantages, but GreedyDual-Size has been shown in testing to perform very well under normal conditions.

• GreedyDual-Size does well at maximizing object hits or byte hits, and also does well at minimizing latency or network hops… but it can only be set up to do one of those at any given time.

How do you know GreedyDual-Size performs well? Huh?

• Proving optimality is pretty difficult in such a complicated problem.

• Instead, we can replay recorded sets of web access requests (traces) to simulate how well each algorithm performs on the specific metrics discussed before.
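A trace-driven simulation can be sketched in a few lines. The example below replays a tiny made-up trace through a byte-limited LRU cache and reports object and byte hit rates. The function name `simulate_lru`, the trace, and the choice not to cache objects larger than the whole cache are all assumptions for illustration; real studies use traces with many thousands of requests.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity_bytes):
    """Replay (url, size_bytes) requests through a byte-limited LRU cache.

    Returns (object hit rate, byte hit rate).
    """
    cache = OrderedDict()  # url -> size, least recently used first
    used = 0
    hits = hit_bytes = total_bytes = 0
    for url, size in trace:
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # assumption: objects larger than the cache are never stored
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU object
            used -= evicted_size
        cache[url] = size
        used += size
    return hits / len(trace), hit_bytes / total_bytes


# Hypothetical trace: one large object pushes two popular small ones out.
trace = [("/a", 40), ("/b", 40), ("/a", 40), ("/big", 70), ("/a", 40), ("/b", 40)]
obj_rate, byte_rate = simulate_lru(trace, capacity_bytes=100)
print(f"object hit rate {obj_rate:.2f}, byte hit rate {byte_rate:.2f}")
```

Swapping in a different eviction policy and replaying the same trace is how the algorithms can be compared on equal footing.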

Locality is still a factor.

• In machine cache, locality is a big factor: data that is logically close to already-accessed data is more likely to be accessed next than data far away.

• The same goes for temporal locality: data that has been recently accessed is more likely to be accessed again than data that has not been accessed in a while.

Locality on the web

• Studies have shown that web access also follows some of the same patterns.

• Data within a single web site is more likely to be accessed next than data from another site.

• Data that has been accessed most recently is likely to be accessed again. Another odd property is that this generally occurs in k * 24 hour cycles.

Back to GreedyDual-Size

• GreedyDual-Size is an eviction policy that attempts to perform very well without fine-tuning any heuristics based on network behavior.

• It is based on the tried and true idea of Least Recently Used but also adds provisions for different object sizes and different network costs to bring the object in.

How does it work?

• Well, GreedyDual-Size works by associating a value (which we call H) with every object that is in cache. H = Cost / Size, where Cost is some abstract cost of bringing the object into cache, and Size is the size of the object in bytes.

• This simple Cost / Size relationship works very well at maximizing or minimizing desired metrics.
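To get a feel for H, here is a small worked example. The object names and sizes are made up, and every cost is set to 1 as a simplifying assumption, so H is just 1 / Size and larger objects get smaller H values.

```python
def h_value(cost, size_bytes):
    """H = Cost / Size, as defined above."""
    return cost / size_bytes


# Hypothetical objects: name -> (cost, size in bytes), all with cost 1.
objects = {
    "icon.png": (1, 2_000),
    "page.html": (1, 20_000),
    "video.mp4": (1, 2_000_000),
}

# Sort from lowest H to highest H.
ranked = sorted(objects, key=lambda name: h_value(*objects[name]))
for name in ranked:
    print(f"{name:10s} H = {h_value(*objects[name]):.7f}")
```

With equal costs the 2 MB video has the lowest H and the 2 KB icon the highest; raising the cost of an expensive-to-fetch object raises its H in proportion.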

Ok… then what?

• When it comes time to evict an object, we pick the object with the lowest H value to evict.

• Then, we subtract that H value from all the objects still in memory, essentially depreciating their H value as evictions occur over time.

• If an object in memory is accessed again, we restore its H value to the original Cost / Size.
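Taken literally, those three steps can be sketched as the class below. This is a deliberately naive version: the class name, the default cost of 1, and the rule that objects larger than the whole cache are not stored are assumptions for illustration.

```python
class NaiveGreedyDualSize:
    """Literal GreedyDual-Size: subtract the evicted H from every survivor."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.h = {}     # key -> current H value
        self.size = {}  # key -> size in bytes

    def access(self, key, size, cost=1.0):
        """Return True on a hit, False on a miss."""
        if key in self.h:
            self.h[key] = cost / size  # a hit restores H to Cost / Size
            return True
        if size > self.capacity:
            return False  # assumption: too big to cache at all
        while self.used + size > self.capacity:
            victim = min(self.h, key=self.h.get)  # object with the lowest H
            lowest = self.h.pop(victim)
            self.used -= self.size.pop(victim)
            for other in self.h:                  # depreciate everyone else: O(n)
                self.h[other] -= lowest
        self.h[key] = cost / size
        self.size[key] = size
        self.used += size
        return False


cache = NaiveGreedyDualSize(capacity_bytes=100)
cache.access("a", 50)  # H = 1/50
cache.access("b", 25)  # H = 1/25
cache.access("c", 50)  # no room: "a" has the lowest H and is evicted
print(sorted(cache.h))  # ['b', 'c']
```

After the eviction, "b" has been depreciated from 1/25 to 1/50, so it is now on equal footing with the newly inserted "c".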

Pseudo Code

Set L = 0 (once, when the cache starts)

For each requested Object:
    If Object is in memory
        Set H for the object to L + Cost(Object) / Size(Object)
        Return object
    Else
        While memory cannot fit Size(Object)
            L = minimum H value of objects in memory
            Evict object with H value of L
        End While
        Insert Object into memory
        Set Object’s H = L + Cost(Object) / Size(Object)
    End If

Wow, that’s crazy, what does that do?

• If we subtracted the minimum H value from every object in memory, each eviction would cost O(n), where n is the number of objects in cache, and that is unreasonable.

• Instead, this pseudo code keeps the objects in a heap (priority queue) ordered by H value. And instead of depreciating the H values of objects already in memory, we just appreciate the H value of new objects coming into memory by L, which ‘remembers’ the depreciation so far.

• This allows evictions and insertions to occur in O(log n) time, which is very reasonable.
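One way to turn the pseudo code into running code is sketched below with Python's `heapq` module. The class name, the default cost of 1, the lazy removal of outdated heap entries, and the rule that oversized objects are not stored are all choices made for this sketch rather than part of the algorithm. Keys are assumed to be mutually comparable (for example, all strings) so that heap ties can be broken.

```python
import heapq


class GreedyDualSize:
    """GreedyDual-Size with a heap and an inflation value L."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.L = 0.0
        self.entries = {}  # key -> (H, size) for objects currently cached
        self.heap = []     # (H, key) pairs; may contain outdated pairs

    def access(self, key, size, cost=1.0):
        """Return True on a hit, False on a miss."""
        if key in self.entries:
            h = self.L + cost / size
            if h != self.entries[key][0]:
                self.entries[key] = (h, size)
                heapq.heappush(self.heap, (h, key))  # the old pair becomes stale
            return True
        if size > self.capacity:
            return False  # assumption: too big to cache at all
        while self.used + size > self.capacity:
            h, victim = heapq.heappop(self.heap)
            current = self.entries.get(victim)
            if current is None or current[0] != h:
                continue  # stale pair left behind by an earlier hit or eviction
            self.L = h    # L 'remembers' the H of the latest eviction
            del self.entries[victim]
            self.used -= current[1]
        h = self.L + cost / size
        self.entries[key] = (h, size)
        heapq.heappush(self.heap, (h, key))
        self.used += size
        return False


cache = GreedyDualSize(capacity_bytes=100)
cache.access("a", 50)
cache.access("b", 25)
cache.access("c", 50)  # evicts "a"; L becomes 1/50
cache.access("b", 25)  # hit: H for "b" is refreshed to L + 1/25
cache.access("d", 50)  # evicts "c", the object with the lowest H
print(sorted(cache.entries))  # ['b', 'd']
```

Pushing a fresh pair on every hit and skipping stale ones on pop keeps each operation at heap cost without needing a decrease-key operation; the trade-off is that the heap can temporarily hold more pairs than there are cached objects.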

Performance Results

• Comparing the performance of GreedyDual-Size with LRU, Size, Hybrid and LRV yielded very promising results.

• Using a sample trace set, GreedyDual-Size performs better than its competitors on hit ratio, incurring only a 5% miss rate when the cache is 5% of the total data size.

• LRV (Lowest Relative Value) sometimes performs better than GreedyDual-Size, but this can be attributed to the fact that LRV is customized for the patterns of the network, whereas GreedyDual-Size is generic.

Main flavors of GreedyDual-Size

• GD-Size(1)

– Set all the network costs to 1; this aims to achieve maximum object hit rate.

• GD-Size(Packets)

– Set the network cost to the number of packets required for an object; this aims to reduce network traffic.

• GD-Size(Latency)

– Account for network latency and improve response times.

• GD-Size(Average Latency)

– Take an average of the network latency; works better on larger caches.

• GD-Size(Hops)

– Use the number of network hops as the cost. Works the best.

• GD-Size(Weighted Hops)

– Number of network hops weighted by the number of packets to transfer.
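The flavors differ only in the Cost plugged into H = Cost / Size, so they can be sketched as interchangeable cost functions. Everything named here is hypothetical: the `Fetch` record, the 536-byte packet payload, and the 2 extra packets per request are assumptions chosen for illustration. GD-Size(Average Latency) is left out because it needs a running history of past fetches.

```python
import math
from dataclasses import dataclass


@dataclass
class Fetch:
    """Hypothetical record of what it took to fetch one object."""
    size_bytes: int
    latency_s: float
    hops: int


PAYLOAD_BYTES = 536  # assumed payload per packet


def packets(f):
    """Estimated packets for one fetch; the 2 extra packets are an assumption."""
    return 2 + math.ceil(f.size_bytes / PAYLOAD_BYTES)


COSTS = {
    "GD-Size(1)": lambda f: 1.0,
    "GD-Size(Packets)": lambda f: float(packets(f)),
    "GD-Size(Latency)": lambda f: f.latency_s,
    "GD-Size(Hops)": lambda f: float(f.hops),
    "GD-Size(Weighted Hops)": lambda f: float(f.hops * packets(f)),
}


def h_value(flavor, f):
    """H = Cost / Size for the chosen flavor."""
    return COSTS[flavor](f) / f.size_bytes


fetch = Fetch(size_bytes=5_360, latency_s=0.2, hops=8)
for flavor in COSTS:
    print(f"{flavor:24s} cost = {COSTS[flavor](fetch):6.1f}  H = {h_value(flavor, fetch):.6f}")
```

Because the eviction logic never looks inside the cost, switching flavors is a one-line change, which matches the point that the policy can only optimize one metric at a time.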

Use GreedyDual-Size!

• GreedyDual-Size(Hops) and GreedyDual-Size(Weighted Hops) work the best at minimizing latency and network traffic as well as maximizing hit rate.

• They are simple to implement with no need for custom-i-zation.

• Using these cache replacement algorithms for all levels of web cache would yield a faster internet!

Cache is useful

• Cache is useful in many ways and a good cache replacement policy is the key to making it perform well.

• A well-performing cache can bring us web pages, images and videos at much faster rates than no cache at all or a cache with a poor replacement policy.

• Cache can bring us media like…

LASER CATS! Laser fast!

This picture came from the internet
