Page 1

Distributed Caching Algorithms for Content Distribution Networks

Sem Borst†, Varun Gupta⋆, Anwar Walid†

†Alcatel-Lucent Bell Labs, ⋆CMU

BCAM Seminar

Bilbao, September 30, 2010

Page 2

Introduction

Scope: personalized/on-demand delivery of high-definition video through service provider

• CatchUp TV / PauseLive TV features

• NPVR (Network Personal Video Recorder) capabilities

• Movie libraries / VoD (Video-on-Demand)

• User-generated content

Unicast nature defies conventional broadcast TV paradigm

Page 3

Caching strategies

Focus: ‘hierarchical’ network architecture

Store popular content closer to network edge to reduce traffic load, capital expense and performance bottlenecks

[Figure: hierarchical architecture VHO – IO – CO – DSLAM – STB, annotated with the fraction of content stored and traffic served at each level]

Page 4

Caching strategies (cont’d)

Typically there are caches installed at only one or two levels

[Figure: same hierarchy with caches installed at only two levels, annotated with percentages of content stored and traffic served]

Page 5

Caching strategies (cont’d)

Two interrelated problems

• Design: optimal cache locations and sizes

(joint work with Marty Reiman)

• Operation: efficient (dynamic) placement of content items


Page 6

Popularity statistics

Cache effectiveness strongly depends on locality / commonality in user requests

[Figure: request frequencies vs. content item ranks 1, 2, 3, ..., N]

Page 7

Popularity statistics (cont’d)

Empirical data suggests that rank statistics resemble Zipf-Mandelbrot distribution

Relative frequency of n-th most popular item is

$$p_n = \frac{H}{(q+n)^{\alpha}}, \qquad n = 1, \ldots, N,$$

with

• α ≥ 0: shape parameter

• q ≥ 0: shift parameter

• $H = \left[ \sum_{n=1}^{N} \frac{1}{(q+n)^{\alpha}} \right]^{-1}$: normalization constant

Ideal hit ratio for cache of size B ≤ N is

$$R = \sum_{n=1}^{B} \frac{H}{(q+n)^{\alpha}}$$
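To make this concrete, here is a minimal Python sketch of the Zipf-Mandelbrot frequencies and the ideal hit ratio; the parameter values in the example call are arbitrary illustrations, not values from the talk:

```python
import numpy as np

def zipf_mandelbrot(N, alpha, q=0.0):
    """Relative request frequencies p_n = H / (q + n)^alpha, n = 1..N."""
    weights = 1.0 / (q + np.arange(1, N + 1)) ** alpha
    return weights / weights.sum()   # dividing by the sum applies H

def ideal_hit_ratio(p, B):
    """Hit ratio R of a cache holding the B most popular items."""
    return p[:B].sum()

# Example: N = 10,000 items, assumed alpha = 0.8 and q = 10
p = zipf_mandelbrot(N=10_000, alpha=0.8, q=10.0)
for B in (100, 500, 1000):
    print(B, round(float(ideal_hit_ratio(p, B)), 3))
```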


Page 8

Popularity statistics (cont’d)

Shape parameter α varies with content type, and strongly impacts cache effectiveness

[Figure: hit ratio R as function of shape parameter α for cache sizes B = 100, 500, 1000 and population of N = 10,000 content items]

Page 9

Popularity statistics (cont’d)

Zipf-Mandelbrot distribution is inherently static, and difficult to reconcile with dynamic phenomena

• Dynamic content ingestion and removal

• Time-varying popularity, request-at-most-once

Both adverse and favorable implications

• Requires agile caching policies and (implicit) popularity estimation, negatively affecting caching performance

• Causes popularity distribution to be steeper (higher α values over shorter time intervals), improving potential caching effectiveness


Page 10

Optimal content placement

Consider symmetric scenario (cache sizes, popularity distributions)

For now, assume strictly hierarchical topology: content can only be requested from parent node

Caches should be filled with most popular content items from lowest level up

[Figure: strictly hierarchical topology VHO – IO – CO – DSLAM – STB]

Page 11

Greedy content placement strategy

Whenever node receives request for item, its local ‘popularity estimate’ for that item is updated

If requested item is not stored in local cache, then

• Request is forwarded to parent node

• Popularity estimate for requested item is compared with that for ‘marginal’ item, which may then be evicted and replaced

Provable ‘convergence’ to optimal content placement
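A minimal single-node sketch of this update rule; the request-counter popularity estimate and the tie-breaking are illustrative assumptions, not details prescribed by the talk:

```python
class GreedyCache:
    """One node in the hierarchy: keep the items with the highest
    local popularity estimates (here: plain request counts)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.count = {}       # popularity estimate per item
        self.stored = set()   # items currently cached locally

    def request(self, item):
        self.count[item] = self.count.get(item, 0) + 1
        if item in self.stored:
            return "hit"
        # Miss: request would be forwarded to the parent node (not modeled).
        if len(self.stored) < self.capacity:
            self.stored.add(item)
        else:
            # Compare with the 'marginal' (least popular) stored item.
            marginal = min(self.stored, key=lambda i: self.count[i])
            if self.count[item] > self.count[marginal]:
                self.stored.remove(marginal)   # evict marginal item
                self.stored.add(item)          # replace with requested item
        return "miss"
```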


Page 12

Optimal content placement (cont’d)

Relies on two strong (though reasonable) assumptions

• Symmetric popularity distributions and cache sizes

• Strictly hierarchical topology

What if popularity distributions are spatially heterogeneous?

Or what if content can be requested from peers as well?


Page 13

Optimal content placement (cont’d)

Assume there are caches installed at only two levels

[Figure: hierarchy with caches installed at two levels only]

Page 14

Optimal content placement (cont’d)

Consider cluster of M nodes at same level in hierarchy

Cluster nodes are either directly connected or indirectly via common parent node

[Figure: root node above parent node 0 above leaf nodes 1, 2, ..., M]

Page 15

Optimal content placement (cont’d)

Some notation

• c_0: transfer cost from root node (labeled −1) to parent node 0

• c_i: transfer cost from parent node 0 to leaf node i

• c_{ij} < c_0 + c_i: transfer cost from leaf node j to leaf node i

Then

$$f_{ij} := \begin{cases} c_0 + c_i & j = i \\ c_0 & j = 0 \\ 0 & j = -1 \\ c_0 + c_i - c_{ij} > 0 & j \neq -1, 0, i \end{cases}$$

represents transport cost savings achieved by transferring data to leaf node i from node j instead of root node
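In code, the savings function translates directly (parameter names are illustrative):

```python
def savings(i, j, c0, c, c_pair):
    """Transport cost savings f_ij of serving leaf node i from node j
    instead of from the root node (labeled -1).

    c0           -- cost root -> parent
    c[i]         -- cost parent -> leaf i
    c_pair[i][j] -- cost leaf j -> leaf i, assumed < c0 + c[i]
    """
    if j == i:      # item already cached at leaf i itself
        return c0 + c[i]
    if j == 0:      # served from the parent node
        return c0
    if j == -1:     # served from the root: no savings
        return 0
    return c0 + c[i] - c_pair[i][j]   # served from peer j; positive by assumption
```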

Page 16

Optimal content placement (cont’d)

Problem of maximizing cost savings may be formulated as

$$\max \sum_{i=1}^{M} \sum_{n=1}^{N} s_n d_{in} \sum_{j=0}^{M} f_{ij} x_{jin} \tag{1}$$

sub

$$\sum_{n=1}^{N} s_n x_{in} \le B_i, \quad i = 0, 1, \ldots, M \tag{2}$$

$$x_{jin} \le x_{jn}, \quad i = 1, \ldots, M, \ j = 0, 1, \ldots, M, \ n = 1, \ldots, N \tag{3}$$

$$\sum_{j=0}^{M} x_{jin} \le 1, \quad i = 1, \ldots, M, \ n = 1, \ldots, N, \tag{4}$$

with B_i denoting cache size of i-th node, s_n size of n-th item, d_{in} demand for n-th item at i-th node; here x_{jn} indicates whether item n is stored at node j, and x_{jin} whether requests for item n at leaf node i are served from node j
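As a sanity check, a small sketch that evaluates objective (1) and feasibility of (2)–(4) for a candidate placement; the representation of the decision variables as dicts keyed by index tuples is an illustrative assumption:

```python
from itertools import product

def cost_savings(M, N, s, d, f, x_serve):
    """Objective (1); x_serve[(j, i, n)] = 1 if leaf i gets item n from node j."""
    return sum(s[n] * d[(i, n)] * f[(i, j)] * x_serve.get((j, i, n), 0)
               for i, n, j in product(range(1, M + 1), range(N), range(M + 1)))

def feasible(M, N, s, B, x_store, x_serve):
    """Constraints (2)-(4)."""
    capacity = all(sum(s[n] * x_store.get((i, n), 0) for n in range(N)) <= B[i]
                   for i in range(M + 1))                                  # (2)
    only_if_stored = all(x_serve.get((j, i, n), 0) <= x_store.get((j, n), 0)
                         for i, j, n in product(range(1, M + 1),
                                                range(M + 1), range(N)))   # (3)
    at_most_once = all(sum(x_serve.get((j, i, n), 0) for j in range(M + 1)) <= 1
                       for i, n in product(range(1, M + 1), range(N)))     # (4)
    return capacity and only_if_stored and at_most_once
```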


Page 17

Inter-level cache cooperation

Allow for heterogeneous demands, but assume c_{ij} = ∞, i.e., content can only be fetched from parent node and not from peers

For compactness, denote $c_{\min} := \min_{i=1,\ldots,M} c_i$

Proposition

For arbitrary demands, greedy content placement strategy is guaranteed to achieve at least fraction

$$\frac{(M-1)c_{\min} + M c_0}{(M-1)c_{\min} + (2M-1)c_0} \ge \frac{M}{2M-1}$$

of maximum achievable cost savings (for example, at least 10/19 ≈ 53% for M = 10)


Page 18

Intra-level cache collaboration

Now suppose content can be requested from peers as well

Intra-level connectivity allows distributed caches to cooperate and act as single logical cache, and makes caching at lower levels more cost-effective

Greedy optimization of local hit rate will lead to complete replication of cache content

Cache cooperation improves aggregate hit rate across cache cluster, at expense of lower local hit rate

Optimal trade-off and degree of replication depend on cost of intra-level transfers relative to transfers from parent or root node


Page 19

Intra-level cache cooperation (cont’d)

Assume symmetric transport cost, cache sizes and demands: B_i ≡ B, c_i ≡ c, c_{ij} ≡ c′, and d_{in} ≡ d_n

For compactness, denote c′′ := M(c + c_0) − (M − 1)c′ > c′

Problem (1)–(4) may be simplified to

$$\max \sum_{n=1}^{N} s_n d_n \left( c'' p_n + (M-1) c' q'_n + M c_0 x_{0n} \right) \tag{5}$$

sub

$$\sum_{n=1}^{N} s_n x_{0n} \le B_0 \tag{6}$$

$$\sum_{n=1}^{N} s_n \left( p_n + (M-1) q'_n \right) \le M B \tag{7}$$

$$p_n + x_{0n} \le 1, \quad n = 1, \ldots, N \tag{8}$$

$$q'_n + x_{0n} \le 1, \quad n = 1, \ldots, N \tag{9}$$

Knapsack-type problem structure
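Under the natural reading of the variables (p_n: one copy of item n at some leaf, q′_n: copies at the remaining M − 1 leaves, x_{0n}: copy at parent), the knapsack structure invites a greedy-by-value-density sketch; the version below assumes B_0 = 0 and s_n ≡ 1 and is a heuristic illustration, not the talk's exact algorithm:

```python
def greedy_symmetric_placement(d, M, B, c0, c, c_prime):
    """Fill the M*B leaf slots greedily. Each item n offers two 'pieces':
      - first copy:  value c''*d_n, takes 1 slot
      - replication: value c'*d_n per extra copy, takes M-1 slots
    Pieces are taken in decreasing order of value per slot."""
    c2 = M * (c + c0) - (M - 1) * c_prime            # c''
    pieces = [(c2 * dn, 1, n, "single") for n, dn in enumerate(d)]
    pieces += [(c_prime * dn * (M - 1), M - 1, n, "replicate")
               for n, dn in enumerate(d)]
    pieces.sort(key=lambda p: p[0] / p[1], reverse=True)

    slots, single, replicated = M * B, set(), set()
    for value, size, n, kind in pieces:
        if size > slots:
            continue
        if kind == "single" and n not in single:
            single.add(n); slots -= size
        elif kind == "replicate" and n in single and n not in replicated:
            replicated.add(n); slots -= size
    return single, replicated   # p_n = 1 on `single`, q'_n = 1 on `replicated`
```

Since c′′ > c′, the first-copy piece of an item always outranks its replication piece, so the order of insertion is consistent.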

Page 20

Intra-level cache collaboration (cont’d)

Optimal solution of content placement problem has relatively simple structure

Distinguish between two cases

• Mc ≥ (M − 1)c′: more advantageous to store un-replicated content in leaf nodes than in parent node

• Mc ≤ (M − 1)c′: more attractive to store un-replicated content in parent node than in leaf nodes

with c cost between parent and leaf node and c′ cost between two leaf nodes


Page 21

Case Mc ≥ (M − 1)c′

[Figure: root node, parent node, and leaf nodes 1, 2, ..., M with color-coded placement tiers]

Four popularity ‘tiers’

• Highly popular (red): replicated in all leaves: p_n = 1, q′_n = 1

• Fairly popular (pink): stored in single leaf: p_n = 1

• Mildly popular (yellow): stored in parent node: x_{0n} = 1

• Hardly popular (green): stored in root node only


Page 22

Case Mc ≥ (M − 1)c′ (cont’d)

[Figure: structure of optimal placement: popularity thresholds n_0, n_1, n_2 mark where p_n, q′_n and x_{0n} switch between 0 and 1]


Page 23

Case Mc ≤ (M − 1)c′

[Figure: root node, parent node, and leaf nodes 1, 2, ..., M with color-coded placement tiers]

Four popularity ‘tiers’

• Highly popular (red): replicated in all leaves: p_n = 1, q′_n = 1

• Fairly popular (pink): stored in common parent: x_{0n} = 1

• Mildly popular (yellow): stored in single leaf: p_n = 1

• Hardly popular (green): stored in root node only


Page 24

Case Mc ≤ (M − 1)c′ (cont’d)

[Figure: structure of optimal placement for this case: popularity thresholds n_0, n_1, n_2 mark where p_n, q′_n and x_{0n} switch between 0 and 1]


Page 25

Local-Greedy algorithm

For convenience, assume B_0 = 0 and s_n = 1 for all n = 1, ..., N

If requested item is not stored in local cache, then

• Item is fetched from peer if cached elsewhere in cluster, and otherwise from root node

• Value of requested item is compared with ‘marginal’ cache value, i.e., value provided by marginal item in local cache, which may then be evicted and replaced

$$\text{Value of item } n = \begin{cases} c' d_n & \text{if stored elsewhere in cluster} \\ c'' d_n & \text{otherwise} \end{cases}$$
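A compact sketch of one Local-Greedy step; representing the cluster as a dict `caches` mapping each leaf node to its set of stored items is an illustrative layout, not prescribed by the talk:

```python
def item_value(n, node, caches, d, c_prime, c2):
    """c'*d_n if some other node in the cluster stores n, else c''*d_n."""
    elsewhere = any(n in items for m, items in caches.items() if m != node)
    return (c_prime if elsewhere else c2) * d[n]

def local_greedy_step(node, n, caches, d, c_prime, c2, K):
    """Handle a request for item n at `node` (cache capacity K items)."""
    cache = caches[node]
    if n in cache:
        return                      # local hit
    # Miss: fetch from a peer if cached in cluster, else from root (not modeled)
    if len(cache) < K:
        cache.add(n)
        return
    marginal = min(cache,
                   key=lambda m: item_value(m, node, caches, d, c_prime, c2))
    if item_value(n, node, caches, d, c_prime, c2) > \
            item_value(marginal, node, caches, d, c_prime, c2):
        cache.remove(marginal)      # evict marginal item
        cache.add(n)                # cache requested item
```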


Page 26

Local-Greedy algorithm (cont’d)

May get stuck in suboptimal configuration

[Figure: globally optimal configuration vs. local optimum]

• Duplicating red item less valuable than single yellow item

• Duplicating yellow item less valuable than single green item


Page 27

Local-Greedy algorithm (cont’d)

Performance guarantees (competitive ratios)

• Symmetric demands: within factor 4/3 from optimal

• Arbitrary demands: within factor 2 from optimal


Page 28

Numerical experiments

• M = 10 leaf nodes, each with cache of size B = 1 TB

• Unit transport costs c_0 = 2, c = 1, c′ = 1

• N = 10,000 content items, with common size S = 2 GB

• Each leaf node can store K = B/S = 500 content items


Page 29

Numerical experiments (cont’d)

• Each leaf receives average of 1 request every 160 sec, i.e., total request rate per leaf is ν = 0.00625 sec⁻¹

• Zipf-Mandelbrot popularity distribution with shape parameter α and shift parameter q, i.e.,

$$p_n = \frac{H}{(q+n)^{\alpha}}, \qquad n = 1, \ldots, N,$$

with normalization constant

$$H = \left[ \sum_{n=1}^{N} \frac{1}{(q+n)^{\alpha}} \right]^{-1}$$

• Request rate for n-th item at each leaf node is d_n = p_n ν


Page 30

Gains from cooperative caching

Compare minimum bandwidth cost with that in two other scenarios

• Full replication: each leaf node stores K most popular items

• No replication: only single copy of MK most popular items is stored in one of leaf nodes

Without caching, bandwidth cost would be MνS(c + c_0) = 10 × 0.00625 × 2 × 3 = 0.375 GBps = 3 Gbps
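A sketch reproducing this baseline and the full/no-replication costs; the accounting (peer transfers cost c′, a root transfer costs c + c_0, single copies spread uniformly over leaves) is our reading of the model, not code from the talk:

```python
import numpy as np

def zipf_mandelbrot(N, alpha, q=0.0):
    w = 1.0 / (q + np.arange(1, N + 1)) ** alpha
    return w / w.sum()

def bandwidth_costs_gbps(alpha, q=0.0, M=10, N=10_000, K=500,
                         S=2.0, nu=0.00625, c0=2.0, c=1.0, c_prime=1.0):
    p = zipf_mandelbrot(N, alpha, q)
    no_cache = M * nu * S * (c + c0) * 8        # GBps -> Gbps
    full = no_cache * (1 - p[:K].sum())         # misses on top-K go to root
    in_cluster = p[:M * K].sum()                # single copies of top M*K items
    no_rep = M * nu * S * 8 * (in_cluster * (M - 1) / M * c_prime
                               + (1 - in_cluster) * (c + c0))
    return no_cache, full, no_rep

print(bandwidth_costs_gbps(alpha=0.8))          # no_cache is 3 Gbps, as above
```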


Page 31

[Figure: bandwidth cost as function of shape parameter α for various scenarios and cache sizes]


Page 32

Some observations

• Caching effectiveness improves as popularity distribution gets steeper: bandwidth costs markedly decrease with increasing values of α

• Even when collective cache space can only hold 10% of total content, bandwidth costs reduce to fraction of those without caching, as long as value of α is not too low

• The better of full and zero replication is often not much worse than optimal content placement; however, neither full nor zero replication performs well across entire range of α values

• Critical to adjust degree of replication to steepness of popularity distribution; Local-Greedy algorithm does just that


Page 33

Performance of Local-Greedy algorithm

Various leaf nodes receive requests over time, sampled from Zipf-Mandelbrot popularity distribution

If requested item is not presently stored, node decides whether to cache it and, if so, which currently stored item to evict

Distinguish three scenarios for initial placement

• Full replication: each leaf node stores 500 top items

• No replication: only single copy of 5000 top items is stored in one of leaf nodes

• Random: each leaf stores 500 randomly selected items

In optimal placement, items 1 through 165 fully replicated, and single copies of items 166 through 3515 stored


Page 34

[Figure: performance ratio as function of number of requests, with static or dynamic popularities]


Page 35

[Figure: bandwidth savings as function of number of requests, with inaccurate popularity estimates]


Page 36

Some observations

• Local-Greedy algorithm gets progressively closer to optimum as system receives more requests and replaces items over time

• After only 3000 requests (for a population of 10,000 items) Local-Greedy algorithm has come to within 1% of optimum, and stays there

• Performs markedly better than worst-case ratio of 3/4 might suggest

• While algorithm seems to ‘converge’ for all three initial states, scenario with no replication appears to be most favorable one, due to fact that in optimal placement only items 1 through 165 are fully replicated
