On Filter Effects in Web Caching Hierarchies

Carey Williamson
Department of Computer Science
University of Calgary

Introduction

“The Web is both a blessing and a curse…”

Blessing: Internet available to the masses
  Seamless exchange of information

Curse: Internet available to the masses
  Stress on networks, protocols, servers, users

Motivation: techniques to improve the performance and scalability of the Web

Why is the Web so slow?

Client-side bottlenecks (PC, modem)
  Solution: better access technologies

Server-side bottlenecks (busy Web site)
  Solution: faster, scalable server designs

Network bottlenecks (Internet congestion)
  Solutions: caching, replication; improved protocols for client-server communication

Example of a Web Proxy Cache

[Figure: five Web clients connect through a single proxy server to three Web servers]

Our Previous Work

Evaluation of Canada’s national Web caching infrastructure for CANARIE’s CA*net II backbone

Workload characterization and evaluation of CA*net II Web caching hierarchy (IEEE Network, May/June 2000)

Developed Web proxy caching simulator for trace-driven simulation evaluation of Web proxy caching architectures

Developed synthetic Web proxy workload generator called ProWGen [Busari/Williamson INFOCOM 2001]

CA*net II Web Caching Hierarchy (Dec 1998)

[Figure: map of the caching hierarchy, including a link to NLANR; USask and CANARIE (Ottawa) are marked as selected measurement points for our traffic analyses, with 6-9 months of data from each]

Caching Hierarchy Overview

[Figure: tree of proxy caches with clients (C) at the leaves]

Level                      Cache Size   Cache Hit Ratio (empirically observed)
Regional/Univ.             5-10 GB      30-40%
National                   10-20 GB     15-20%
Top-Level/International    20-50 GB     5-10%

Some Observations on Multi-Level Caching...

Caching hierarchy not very effective, due to a “diminishing returns” effect

Reason: workload characteristics change as you move up the caching hierarchy (due to filtering effects, etc.)

Bigger caches aren’t really the answer

Better caching system design might be...

Research Goals

Develop better understanding of cache filter effects (intuitively, quantitatively)

Try to do something about it!
  Idea #1: Try different cache replacement policies at different levels of hierarchy
  Idea #2: Try partitioning cache content in overall hierarchy based on size or type to limit replication, etc.

Talk Overview

Background/Motivation
Understanding Cache Filtering Effects
Exploiting Cache Filtering Effects
Summary and Conclusions

Part I: Understanding Cache Filter Effects

Simulation Model

[Figure: two-level caching hierarchy. Web clients send requests to lower-level (child) proxy servers, which sit below an upper-level (parent) proxy server, which in turn sits in front of the Web servers]

Experimental Methodology

Trace-driven simulation (empirical traces)

Multi-factor experimental design
  Cache size: 1 MB to 32 GB
  Cache replacement policy:
    Recency-based LRU (currently active docs)
    Frequency-based LFU-Aging (popular docs)
    Size-based GD-Size (favours smaller docs)

Analyze workload characteristics
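The trace-driven approach can be illustrated with a minimal sketch. It assumes a trace is a list of (document id, size in bytes) requests and that the cache uses LRU replacement; the name `simulate_lru` and the trace format are illustrative assumptions, not the simulator used in this work. The returned miss stream is the filtered workload that the next level of a hierarchy would see.

```python
from collections import OrderedDict

def simulate_lru(trace, capacity_bytes):
    """Replay (doc_id, size) requests through an LRU cache.

    Returns (hits, misses); misses is the filtered request stream
    that would be forwarded to the next level.
    """
    cache = OrderedDict()   # doc_id -> size, least recently used first
    used = 0                # bytes currently held in the cache
    hits, misses = [], []
    for doc_id, size in trace:
        if doc_id in cache:
            cache.move_to_end(doc_id)       # mark as most recently used
            hits.append((doc_id, size))
            continue
        misses.append((doc_id, size))
        if size > capacity_bytes:
            continue                        # too large to cache at all
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU doc
            used -= evicted_size
        cache[doc_id] = size
        used += size
    return hits, misses

# Toy trace (hypothetical data): three documents of 4 bytes each.
trace = [("a", 4), ("b", 4), ("a", 4), ("c", 4), ("b", 4), ("a", 4)]
hits, misses = simulate_lru(trace, capacity_bytes=8)
print(len(hits), len(misses))
```

Running the same trace over a range of `capacity_bytes` values gives one factor of the multi-factor design; swapping the eviction rule gives the other.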

Web Workload Characteristics

“One-timers”: 60-70% of docs are requested only once, so caching them is useless!

Zipf-like document referencing popularity

Heavy-tailed file size distribution (i.e., most files small, but most bytes are in big files)

Zero correlation between document size and document popularity (debate!)

Temporal locality (temporal correlation between recent past and near future references) [Mahanti et al. PER 2000]
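As a small illustration of the first characteristic, the fraction of one-timers in a trace can be measured by counting references per document. This is a sketch under the assumption that a trace is simply a sequence of document ids; the function name is hypothetical.

```python
from collections import Counter

def one_timer_fraction(doc_ids):
    """Fraction of distinct documents that are referenced exactly once."""
    counts = Counter(doc_ids)
    if not counts:
        raise ValueError("empty trace")
    one_timers = sum(1 for c in counts.values() if c == 1)
    return one_timers / len(counts)

# Toy trace (hypothetical data): "a" is requested twice, the rest once.
print(one_timer_fraction(["a", "b", "a", "c", "d"]))
```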

Zipf-Like Referencing

An intrinsic “power-law” relationship in the way that humans organize, access, and use information (e.g., library books, English words in text, movie rentals, Web sites, Web pages, ...)

Plotting item popularity versus relative rank on a log-log scale yields a straight line
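One way to quantify the straight line is to fit its slope by least squares on the log-log data. The sketch below assumes a trace is a sequence of document ids; `zipf_slope` is a hypothetical helper, and an ordinary least-squares fit is only one of several ways to estimate the slope.

```python
import math
from collections import Counter

def zipf_slope(doc_ids):
    """Least-squares slope of log(reference count) versus log(rank)."""
    counts = sorted(Counter(doc_ids).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct documents")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy reference stream (hypothetical data): the document of rank r
# is requested 1000 // r times, so the slope should be close to -1.
stream = [r for r in range(1, 51) for _ in range(1000 // r)]
print(round(zipf_slope(stream), 2))
```

Applying the same fit to the requests that miss in a cache gives a simple numerical handle on how the popularity profile changes after the cache.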

Example: Zipf-Like Document Popularity Profile for UofS Trace

Quiz Time: What do you get AFTER the cache?

[Figure: four candidate plots, labelled (a) through (d), revealed one at a time]

Answer: (c)

Simulation Results for Input Workload Traces with Different Initial Zipf Slopes

The Magnitude of the Filter Effect Depends on Cache Size

Filter Effect Depends on Cache Replacement Policy
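To make the policy dependence concrete, here is a minimal sketch of a size-based policy. It assumes that GD-Size refers to the GreedyDual-Size algorithm with a uniform cost of 1 per document; the slides only name the policy, so the details below (including the function name and the linear-time victim search) are assumptions for illustration.

```python
def simulate_gd_size(trace, capacity_bytes, cost=1.0):
    """Replay (doc_id, size) requests through a GreedyDual-Size cache.

    Each cached document carries a value H = L + cost / size. The
    document with the smallest H is evicted and L is raised to that H,
    so small documents tend to stay longer. Returns the number of hits.
    Sizes are assumed to be positive.
    """
    sizes = {}        # doc_id -> size in bytes
    value = {}        # doc_id -> current H value
    inflation = 0.0   # the running value L
    used = 0
    hits = 0
    for doc_id, size in trace:
        if doc_id in value:
            hits += 1
            value[doc_id] = inflation + cost / sizes[doc_id]  # refresh H
            continue
        if size > capacity_bytes:
            continue                          # too large to cache
        while used + size > capacity_bytes:
            victim = min(value, key=value.get)  # smallest H goes first
            inflation = value[victim]
            used -= sizes.pop(victim)
            del value[victim]
        sizes[doc_id] = size
        value[doc_id] = inflation + cost / size
        used += size
    return hits

# Toy trace (hypothetical data): one 8-byte document and three 1-byte ones.
trace = [("big", 8), ("s1", 1), ("s2", 1), ("big", 8),
         ("s3", 1), ("s1", 1), ("s2", 1)]
print(simulate_gd_size(trace, capacity_bytes=10))
```

In this toy trace the large document is evicted to make room for `s3` even though it was the most recently used, which is the behaviour that distinguishes a size-based policy from a recency-based one.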

Filter Effect is Most Pronounced at First-Level Cache

Part II: Exploiting Cache Filter Effects

Research Questions: Multi-Level Caches

In a multi-level caching hierarchy, can overall caching performance be improved by using different cache replacement policies at different levels of the hierarchy?

In a multi-level caching hierarchy, can overall performance be improved by keeping disjoint document sets at each level of the hierarchy?
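A two-level hierarchy of this kind can be sketched as follows. The sketch assumes that each request is tagged with the child proxy that receives it, that a child miss is forwarded to the parent, and that both levels cache every document they fetch (so document sets are not disjoint). The class and function names are hypothetical, and both levels use LRU here only to keep the example short.

```python
from collections import OrderedDict

class LRUCache:
    """Byte-capacity LRU cache; sizes are assumed positive and stable."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.docs = OrderedDict()   # doc_id -> size, LRU first
        self.used = 0

    def request(self, doc_id, size):
        """Return True on a hit; on a miss, cache the document."""
        if doc_id in self.docs:
            self.docs.move_to_end(doc_id)
            return True
        if size <= self.capacity:
            while self.used + size > self.capacity:
                _, evicted = self.docs.popitem(last=False)
                self.used -= evicted
            self.docs[doc_id] = size
            self.used += size
        return False

def simulate_hierarchy(trace, children, parent):
    """trace holds (child_index, doc_id, size) requests in arrival order."""
    child_hits = parent_hits = 0
    for child_index, doc_id, size in trace:
        if children[child_index].request(doc_id, size):
            child_hits += 1
        elif parent.request(doc_id, size):   # only child misses reach here
            parent_hits += 1
    return child_hits, parent_hits, len(trace)

# Toy trace (hypothetical data): two children that each hold one document.
trace = [(0, "x", 1), (1, "x", 1), (0, "y", 1), (0, "x", 1), (1, "x", 1)]
children = [LRUCache(1), LRUCache(1)]
print(simulate_hierarchy(trace, children, LRUCache(4)))
```

Replacing the parent with an object that exposes the same `request` method but a different eviction rule corresponds to the first research question; refusing to cache certain documents at one level corresponds to the second.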

Simulation Model

[Figure: two-level caching hierarchy of Web clients, lower-level (child) proxy servers, an upper-level (parent) proxy server, and Web servers, annotated with three workload overlap scenarios: Complete Overlap, Partial Overlap (50%), and No Overlap]

Performance Metrics

Document Hit Ratio (HR)
  Percent of requested docs found in cache

Byte Hit Ratio (BHR)
  Percent of requested bytes found in cache
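The two metrics can diverge sharply when document sizes vary, as this short sketch shows. It assumes each request is recorded as a (size in bytes, hit or miss) pair; the function name and record format are illustrative.

```python
def hit_ratios(requests):
    """requests: iterable of (size_in_bytes, was_hit) pairs.

    Returns (HR, BHR) as percentages.
    """
    total_docs = hit_docs = total_bytes = hit_bytes = 0
    for size, was_hit in requests:
        total_docs += 1
        total_bytes += size
        if was_hit:
            hit_docs += 1
            hit_bytes += size
    if total_docs == 0 or total_bytes == 0:
        raise ValueError("need at least one request with non-zero size")
    return 100.0 * hit_docs / total_docs, 100.0 * hit_bytes / total_bytes

# Toy log (hypothetical data): three small hits and one large miss.
hr, bhr = hit_ratios([(1_000, True), (1_000, True),
                      (1_000, True), (97_000, False)])
print(hr, bhr)
```

Here three of four requests hit, but they account for only 3,000 of the 100,000 requested bytes, so HR is high while BHR is low.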

Experiment 1: Different Policies at Different Levels of the Hierarchy

[Figure: (a) Hit Ratio and (b) Byte Hit Ratio versus Cache Size (MB), with curves for Children (LRU) and for Parent under GD-Size, LFU-Aging, and LRU]

[Figure: Hit Ratio versus Cache Size (MB), with curves for Children (LRU) and for Parent under GD-Size, LFU-Aging, and LRU]

Experiment 2: Sensitivity to Workload Overlap

The greater the degree of workload overlap amongst the child proxies, the greater the role for the parent cache

In the “no overlap” scenario, the parent cache has negligible hit ratios, particularly when child caches are large

Experiment 3: Size-based Partitioning

Partition files across the two levels of the hierarchy based on size (e.g., keep small files at the lower level and large files at the upper level, or vice versa)

Three size thresholds for “small”:
  5,000 bytes
  10,000 bytes
  100,000 bytes
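The partitioning rule itself is simple to state in code. The sketch below assumes that a file counts as "small" when its size is at most the threshold (whether the boundary is inclusive is an assumption), and that each file may be cached at exactly one level; the function name is hypothetical.

```python
def caching_level(size_bytes, threshold_bytes, small_at_lower=True):
    """Return "lower" or "upper": the only level allowed to cache the file."""
    is_small = size_bytes <= threshold_bytes
    # Small files go to the lower level when small_at_lower is True,
    # and the assignment is reversed when it is False.
    if is_small == small_at_lower:
        return "lower"
    return "upper"

# An 8,000-byte file under the three thresholds from the experiment.
for threshold in (5_000, 10_000, 100_000):
    print(threshold, caching_level(8_000, threshold))
```

Passing `small_at_lower=False` gives the "vice versa" configuration, with large files at the lower level and small files at the upper level.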

Small files at the lower level; large files at the upper level

[Figure: Hit Ratio and Byte Hit Ratio versus Cache Size (MB) for size thresholds of 5,000 bytes and 10,000 bytes, with curves for Children (LRU) and for Parent under LRU, LFU-Aging, and GD-Size]

Large files at the lower level; small files at the upper level

[Figure: Hit Ratio and Byte Hit Ratio versus Cache Size (MB) for size thresholds of 5,000 bytes and 10,000 bytes, with curves for Children (LRU) and for Parent under LRU, LFU-Aging, and GD-Size]

Summary: Multi-Level Caches

Different policies at different levels
  LRU/LFU-Aging at the lower level + GD-Size at the upper level provided improvement in performance
  GD-Size + GD-Size provided better performance in hit ratio, but with some penalty in byte hit ratio

Size-threshold approach
  Small files at the lower level + large files at the upper level provided improvement in performance
  Reversing this policy offered no performance advantage

Conclusions

Existing multi-level caching hierarchies are not always that effective, due to cache filtering effects

“Heterogeneous” caching architectures may better exploit workload characteristics and improve Web caching performance

For More Information...

M. Busari, “Simulation Evaluation of Web Caching Hierarchies”, M.Sc. Thesis, Dept of Computer Science, U. Saskatchewan, June 2000

C. Williamson, “On Filter Effects in Web Caching Hierarchies”, ACM Transactions on Internet Technology, 2002 (to appear).

Email: [email protected]
Web: http://www.cpsc.ucalgary.ca/~carey/