1
On Filter Effects in Web Caching Hierarchies
Carey Williamson
Department of Computer Science
University of Calgary
2
Introduction
“The Web is both a blessing and a curse…”
Blessing: Internet available to the masses; seamless exchange of information
Curse: Internet available to the masses; stress on networks, protocols, servers, users
Motivation: techniques to improve the performance and scalability of the Web
3
Why is the Web so slow?
Client-side bottlenecks (PC, modem). Solution: better access technologies
Server-side bottlenecks (busy Web site). Solution: faster, scalable server designs
Network bottlenecks (Internet congestion). Solutions: caching, replication; improved protocols for client-server communication
4
Example of a Web Proxy Cache
[Diagram: five Web clients, one proxy server, and three Web servers]
5
Our Previous Work
Evaluation of Canada’s national Web caching infrastructure for CANARIE’s CA*net II backbone
Workload characterization and evaluation of CA*net II Web caching hierarchy (IEEE Network, May/June 2000)
Developed Web proxy caching simulator for trace-driven simulation evaluation of Web proxy caching architectures
Developed synthetic Web proxy workload generator called ProWGen [Busari/Williamson INFOCOM 2001]
CA*net II Web Caching Hierarchy (Dec 1998)
[Map labels: USask; CANARIE (Ottawa); To NLANR]
(selected measurement points for our traffic analyses; 6-9 months of data from each)
Caching Hierarchy Overview
[Diagram: clients (C) at the bottom, with proxies arranged in three levels above them]
Level                      Cache size   Cache hit ratio (empirically observed)
Regional/Univ.             5-10 GB      30-40%
National                   10-20 GB     15-20%
Top-Level/International    20-50 GB     5-10%
8
Some Observations on Multi-Level Caching...
Caching hierarchy not very effective, due to a “diminishing returns” effect
Reason: workload characteristics change as you move up the caching hierarchy (due to filtering effects, etc.)
Bigger caches aren’t really the answer
Better caching system design might be...
9
Research Goals
Develop better understanding of cache filter effects (intuitively, quantitatively)
Try to do something about it!
Idea #1: Try different cache replacement policies at different levels of hierarchy
Idea #2: Try partitioning cache content in overall hierarchy based on size or type to limit replication, etc.
10
Talk Overview
Background/Motivation
Understanding Cache Filtering Effects
Exploiting Cache Filtering Effects
Summary and Conclusions
11
Part I: Understanding Cache Filter Effects
12
Simulation Model
[Diagram: Web Clients, Lower Level (Children) proxy servers, Upper Level (Parent) proxy server, Web Servers]
13
Experimental Methodology
Trace-driven simulation (empirical traces)
Multi-factor experimental design
Cache size: 1 MB to 32 GB
Cache Replacement Policy:
  Recency-based LRU (currently active docs)
  Frequency-based LFU-Aging (popular docs)
  Size-based GD-Size (favours smaller docs)
Analyze workload characteristics
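To make the methodology concrete, here is a minimal sketch of a trace-driven simulation with two of the three policies. It is not the simulator used in this work: the class names, the `access` interface, the byte-capacity admission rule, and the toy trace are all assumptions for illustration. GD-Size is written as GreedyDual-Size with a uniform cost of 1, and LFU-Aging is left out for brevity.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-capacity cache that evicts the least recently used document."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.docs = OrderedDict()  # doc id -> size, least recently used first

    def access(self, doc, size):
        """Return True on a hit; on a miss, admit the document if it fits."""
        if doc in self.docs:
            self.docs.move_to_end(doc)  # mark as most recently used
            return True
        if size > self.capacity:
            return False  # larger than the whole cache: never admitted
        while self.used + size > self.capacity:
            _, old_size = self.docs.popitem(last=False)
            self.used -= old_size
        self.docs[doc] = size
        self.used += size
        return False


class GDSizeCache:
    """GreedyDual-Size with cost 1: priority H = L + 1/size, evict lowest H."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.inflation = 0.0  # the running value L
        self.docs = {}  # doc id -> (H, size)

    def access(self, doc, size):
        if doc in self.docs:
            # A hit restores the document's priority relative to the current L.
            self.docs[doc] = (self.inflation + 1.0 / size, size)
            return True
        if size > self.capacity:
            return False
        while self.used + size > self.capacity:
            victim = min(self.docs, key=lambda d: self.docs[d][0])
            h, old_size = self.docs.pop(victim)
            self.inflation = h  # L rises to the priority of the evicted doc
            self.used -= old_size
        self.docs[doc] = (self.inflation + 1.0 / size, size)
        self.used += size
        return False


def simulate(trace, cache):
    """Replay (doc, size) requests; return (hit count, miss stream)."""
    hits = 0
    misses = []
    for doc, size in trace:
        if cache.access(doc, size):
            hits += 1
        else:
            misses.append((doc, size))
    return hits, misses


# Toy trace (hypothetical): one large document competing with two small ones.
trace = [("a", 100), ("b", 900), ("c", 100), ("a", 100)]
lru_hits, lru_misses = simulate(trace, LRUCache(1000))
gds_hits, gds_misses = simulate(trace, GDSizeCache(1000))
```

On this toy trace LRU evicts the small document "a" to make room for "c", while GD-Size evicts the large document "b" instead, so only GD-Size hits on the final request. The miss stream returned by `simulate` is what the next level of a hierarchy would see.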
14
Web Workload Characteristics
“One-timers” (60-70% of docs are referenced only once, so caching them is useless!)
Zipf-like document referencing popularity
Heavy-tailed file size distribution (i.e., most files small, but most bytes are in big files)
Zero correlation between document size and document popularity (debate!)
Temporal locality (temporal correlation between recent past and near future references) [Mahanti et al. PER 2000]
15
Zipf-Like Referencing
An intrinsic “power-law” relationship in the way that humans organize, access, and use information (e.g., library books, English words in text, movie rentals, Web sites, Web pages, ...)
Plotting item popularity versus relative rank on a log-log scale results in a straight line
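The straight-line property suggests a simple way to estimate the Zipf slope of a trace: rank documents by request count and fit a least-squares line to log(count) versus log(rank). The sketch below assumes a hypothetical helper name, `zipf_slope`, and a small synthetic trace built so that the document of rank r is requested 1200/r times.

```python
import math
from collections import Counter


def zipf_slope(requests):
    """Least-squares slope of log(request count) versus log(popularity rank)."""
    counts = sorted(Counter(requests).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic trace: document r (r = 1..6) is requested 1200 // r times.
requests = [doc for doc in range(1, 7) for _ in range(1200 // doc)]
slope = zipf_slope(requests)  # close to -1 for this pure 1/rank profile
```

A slope near -1 corresponds to classic Zipf behaviour; "Zipf-like" profiles have other negative slopes.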
16
Example: Zipf-Like Document Popularity Profile for UofS Trace
17
Quiz Time: What do you get AFTER the cache?
[Figure: four candidate plots, labelled (a) to (d)]
Answer: (c)
24
Simulation Results for Input Workload Traces with Different Initial Zipf Slopes
25
The Magnitude of the Filter Effect Depends on Cache Size
26
Filter Effect Depends on Cache Replacement Policy
27
Filter Effect is Most Pronounced at First-Level Cache
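A small experiment can illustrate the filter effect under simplifying assumptions: a synthetic Zipf-like request stream, unit-size documents, and an LRU cache whose capacity is counted in documents. All function names and parameter values here are illustrative, not taken from the simulations reported in this talk. The sketch compares how concentrated the request stream is before the cache and in the miss stream after it.

```python
import random
from collections import Counter, OrderedDict


def zipf_trace(num_docs, num_requests, alpha, seed):
    """Synthetic request stream in which rank r is drawn with weight 1/r**alpha."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_docs + 1)]
    return rng.choices(range(1, num_docs + 1), weights=weights, k=num_requests)


def lru_miss_stream(trace, capacity_docs):
    """Requests that an LRU cache holding `capacity_docs` documents passes on."""
    cache = OrderedDict()
    misses = []
    for doc in trace:
        if doc in cache:
            cache.move_to_end(doc)
        else:
            misses.append(doc)
            cache[doc] = True
            if len(cache) > capacity_docs:
                cache.popitem(last=False)  # evict least recently used
    return misses


def top_share(stream, k):
    """Fraction of requests in `stream` that go to its k most requested docs."""
    counts = Counter(stream)
    return sum(count for _, count in counts.most_common(k)) / len(stream)


before = zipf_trace(num_docs=1000, num_requests=20000, alpha=1.0, seed=1)
after = lru_miss_stream(before, capacity_docs=100)
share_before = top_share(before, 10)
share_after = top_share(after, 10)
```

In this setup the cache absorbs most requests for the hottest documents, so the ten most requested documents account for a smaller share of the miss stream than of the input stream. Varying `capacity_docs` or chaining a second call to `lru_miss_stream` gives a rough feel for the cache-size and first-level effects named in the slide titles above.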
28
Part II: Exploiting Cache Filter Effects
29
Research Questions: Multi-Level Caches
In a multi-level caching hierarchy, can overall caching performance be improved by using different cache replacement policies at different levels of the hierarchy?
In a multi-level caching hierarchy, can overall performance be improved by keeping disjoint document sets at each level of the hierarchy?
30
Simulation Model
[Diagram: Web Clients, Lower Level (Children) proxy servers, Upper Level (Parent) proxy server, Web Servers; child workloads with Complete Overlap, Partial Overlap (50%), or No Overlap]
31
Performance Metrics
Document Hit Ratio (HR): percent of requested docs found in cache
Byte Hit Ratio (BHR): percent of requested bytes found in cache
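The two metrics can diverge sharply when document sizes vary. A minimal sketch, assuming a per-request log of (size in bytes, hit or miss) pairs; the function name and the sample log are hypothetical:

```python
def hit_ratios(log):
    """Return (HR, BHR) in percent for an iterable of (size_bytes, was_hit)."""
    log = list(log)
    total_docs = len(log)
    total_bytes = sum(size for size, _ in log)
    hit_docs = sum(1 for _, was_hit in log if was_hit)
    hit_bytes = sum(size for size, was_hit in log if was_hit)
    return 100.0 * hit_docs / total_docs, 100.0 * hit_bytes / total_bytes


# Three small hits and one large miss: high HR, low BHR.
log = [(1000, True), (1000, True), (1000, True), (97000, False)]
hr, bhr = hit_ratios(log)  # hr is 75%, bhr is 3%
```

This is why a policy that favours small documents can look good on HR while paying a penalty on BHR.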
32
Experiment 1: Different Policies at Different Levels of the Hierarchy
[Figure: (a) Hit Ratio and (b) Byte Hit Ratio versus Cache Size (MB), with curves for Children (LRU), Parent (GD-Size), Parent (LFU-Aging), and Parent (LRU)]
33
[Figure: Hit Ratio versus Cache Size (MB), with curves for Children (LRU), Parent (GD-Size), Parent (LFU-Aging), and Parent (LRU)]
35
Experiment 2: Sensitivity to Workload Overlap
The greater the degree of workload overlap amongst the child proxies, the greater the role for the parent cache
In the “no overlap” scenario, the parent cache has negligible hit ratios, particularly when child caches are large
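The overlap effect can be reproduced in a toy two-level model. Everything in the sketch below is an assumption made for illustration: two LRU children and one LRU parent with capacities counted in documents, a synthetic Zipf-like workload, and a rule that the most popular `overlap` fraction of ranks maps to document ids shared by both children while the remaining ranks are private to each child.

```python
import random
from collections import OrderedDict


class LRU:
    """LRU cache whose capacity is counted in documents."""

    def __init__(self, capacity_docs):
        self.capacity = capacity_docs
        self.docs = OrderedDict()

    def access(self, doc):
        """Return True on a hit; on a miss, insert and evict if over capacity."""
        if doc in self.docs:
            self.docs.move_to_end(doc)
            return True
        self.docs[doc] = True
        if len(self.docs) > self.capacity:
            self.docs.popitem(last=False)
        return False


def parent_hit_ratio(overlap, num_docs=500, num_requests=20000,
                     child_size=50, parent_size=100, seed=7):
    """Parent hit ratio (%) when the children's document sets overlap by `overlap`."""
    rng = random.Random(seed)
    shared = int(overlap * num_docs)  # ranks below this are common to both children
    weights = [1.0 / rank for rank in range(1, num_docs + 1)]
    ranks = rng.choices(range(num_docs), weights=weights, k=num_requests)
    children = [LRU(child_size), LRU(child_size)]
    parent = LRU(parent_size)
    parent_requests = 0
    parent_hits = 0
    for rank in ranks:
        child = rng.randrange(2)  # which child proxy receives this request
        doc = ("shared", rank) if rank < shared else (child, rank)
        if not children[child].access(doc):
            # Only child misses are forwarded to the parent.
            parent_requests += 1
            if parent.access(doc):
                parent_hits += 1
    return 100.0 * parent_hits / parent_requests


hr_complete = parent_hit_ratio(1.0)  # both children request the same documents
hr_none = parent_hit_ratio(0.0)      # the children's document sets are disjoint
```

With complete overlap, a document missed by one child has often been fetched recently on behalf of the other, so the parent hit ratio is higher than in the disjoint case. Passing `overlap=0.5` gives a partial-overlap variant.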
39
Experiment 3: Size-based Partitioning
Partition files across the two levels of the hierarchy based on size (e.g., keep small files at the lower level and large files at the upper level, or vice versa)
Three size thresholds for “small”: 5,000 bytes, 10,000 bytes, 100,000 bytes
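The partitioning rule itself is a one-line decision per document. A minimal sketch, assuming a hypothetical function name and treating a file exactly at the threshold as "small" (the slides do not say which side the boundary falls on):

```python
def caching_level(size_bytes, threshold_bytes, small_at_lower=True):
    """Return 'lower' or 'upper': the only level allowed to cache this file."""
    is_small = size_bytes <= threshold_bytes  # boundary handling is an assumption
    return "lower" if is_small == small_at_lower else "upper"


THRESHOLDS = (5_000, 10_000, 100_000)

# Where an 8,000-byte file is cached under each threshold.
placement = {t: caching_level(8_000, t) for t in THRESHOLDS}

# The reversed policy: large files at the lower level, small files at the upper.
reversed_placement = caching_level(8_000, 5_000, small_at_lower=False)
```

An 8,000-byte file counts as large under the 5,000-byte threshold and as small under the other two, so its home level changes with the threshold. Because each file is cached at only one level, the two levels hold disjoint content, which limits replication across the hierarchy.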
40
Small files at the lower level; large files at the upper level
[Figure: Hit Ratio and Byte Hit Ratio versus Cache Size (MB) for size thresholds of 5,000 bytes and 10,000 bytes, with curves for Children (LRU), Parent (LRU), Parent (LFU-Aging), and Parent (GD-Size)]
41
Large files at the lower level; small files at the upper level
[Figure: Hit Ratio and Byte Hit Ratio versus Cache Size (MB) for size thresholds of 5,000 bytes and 10,000 bytes, with curves for Children (LRU), Parent (LRU), Parent (LFU-Aging), and Parent (GD-Size)]
42
Summary: Multi-Level Caches
Different policies at different levels:
  LRU/LFU-Aging at the lower level + GD-Size at the upper level provided improvement in performance
  GD-Size + GD-Size provided better performance in hit ratio, but with some penalty in byte hit ratio
Size-threshold approach:
  small files at the lower level + large files at the upper level provided improvement in performance
  reversing this policy offered no performance advantage
43
Conclusions
Existing multi-level caching hierarchies are not always that effective, due to cache filtering effects
“Heterogeneous” caching architectures may better exploit workload characteristics and improve Web caching performance
44
For More Information...
M. Busari, “Simulation Evaluation of Web Caching Hierarchies”, M.Sc. Thesis, Dept of Computer Science, U. Saskatchewan, June 2000
C. Williamson, “On Filter Effects in Web Caching Hierarchies”, ACM Transactions on Internet Technology, 2002 (to appear).
Email: [email protected]
http://www.cpsc.ucalgary.ca/~carey/