arXiv:1501.06003v1 [cs.IT] 24 Jan 2015

Improved Lower Bounds for Coded Caching

Hooshang Ghasemi and Aditya Ramamoorthy
Dept. of Electrical & Computer Eng., Iowa State University, Ames, IA 50011

Emails: ghasemi, [email protected]

Abstract—Content delivery networks often employ caching to reduce transmission rates from the central server to the end users. Recently, the technique of coded caching was introduced whereby coding in the caches and coded transmission signals from the central server are considered. Prior results in this area demonstrate that (a) carefully designing placement of content in the caches and (b) designing appropriate coded delivery signals allow for a system where the delivery rates can be significantly smaller than conventional schemes. However, matching upper and lower bounds on the transmission rates have not yet been obtained. In this work, we derive tighter lower bounds on coded caching rates than were known previously. We demonstrate that this problem can equivalently be posed as one of optimally labeling the leaves of a directed tree. Several examples that demonstrate the utility of our bounds are presented.

I. INTRODUCTION

Content distribution over the Internet is an important problem and is the core business of several enterprises such as YouTube, Netflix, Hulu, etc. One commonly used technique to facilitate content delivery is caching [1], whereby relatively popular server content is stored in local cache memory at the end users. When files are requested by the users, the cached content is first used to serve them. The remainder of the content is obtained from the server. This reduces the number of bits transmitted from the server on average. In conventional approaches to caching, coding in the content of the cache and/or coding in the transmission from the server are typically not considered.

The work of [2] introduced the problem of coded caching, where there is a server with $N$ files and $K$ users, each with a cache of size $M$. The users are connected to the server by a shared link. In each time slot each user requests one of the $N$ files. There are two distinct phases in coded caching. In the placement phase, the content of the caches is populated; this phase should not depend on the actual user requests (which are assumed to be arbitrary). In the delivery phase, the server transmits a signal of rate $R$ over the shared link that serves to satisfy the demands of each of the users. The work of [2] demonstrates that a carefully designed placement scheme and a corresponding delivery scheme achieve a rate that is significantly lower than conventional caching. The work of [2] also shows that their achievable rate is within a factor of 12 of the cutset lower bound for all values of $N$, $K$ and $M$. There have been some subsequent contributions in this area. Decentralized coded caching, where the placement phase is such that each user stores a random portion of each file, was investigated in [3]. Schemes where the popularity of the files is taken into account appeared in [4]. A hierarchical scheme where there are multiple levels of caches was considered in [5].

In this work our main contribution is in developing improved lower bounds on the rate for the coded caching problem. The computation of this lower bound can be posed as a labeling problem on a directed tree. This paper is organized as follows. Section II contains the problem formulation. Section III develops the proposed lower bound. Section IV presents several examples of the proposed lower bound and compares it with prior results.

II. PROBLEM FORMULATION

Let $[m] = \{1, \dots, m\}$. The coded caching problem can be formally described as follows. Let $\{W_n\}_{n=1}^{N}$ denote $N$ independent random variables (representing the files), each uniformly distributed over $[2^F]$. The $i$-th user requests the file $W_{d_i}$, where $d_i \in [N]$. An $(M, R)$ system consists of

• Caching functions $Z_i = \phi_i(W_1, \dots, W_N)$.

• Encoding functions $\varphi_{d_1,\dots,d_K}(W_1, \dots, W_N)$, so that the delivery phase signal is $X_{d_1,\dots,d_K} = \varphi_{d_1,\dots,d_K}(W_1, \dots, W_N)$.

• Decoding functions for the $k$-th user, $\mu_{d_1,\dots,d_K;k}(X_{d_1,\dots,d_K}, Z_k)$, $k = 1, \dots, K$, so that the decoded file is $\hat{W}_{d_1,\dots,d_K;k} = \mu_{d_1,\dots,d_K;k}(X_{d_1,\dots,d_K}, Z_k)$.

The probability of error is defined as

$$\max_{(d_1,\dots,d_K) \in [N]^K} \; \max_{k \in [K]} \; P\big(\hat{W}_{d_1,\dots,d_K;k} \neq W_{d_k}\big).$$

Definition 1: The pair $(M, R)$ is said to be achievable if for every $\epsilon > 0$, there exists a file size $F$ large enough so that there exists an $(M, R)$ caching scheme with probability of error at most $\epsilon$. We define

$$R^\star(M) = \inf\{R : (M, R) \text{ is achievable}\}.$$

In this work we are interested in tight lower bounds on $R^\star(M)$.

A. Preliminaries

Definition 2: Directed in-tree. A directed graph $\mathcal{T} = (V, A)$ is called a directed in-tree if there is one designated node called the root such that from any other vertex $v \in V$ there is exactly one directed path from $v$ to the root.

The nodes in a directed in-tree that do not have any incoming edges are referred to as the leaves. The remaining nodes, excluding the leaves and the root, are called internal nodes. Each node in a directed in-tree has at most one outgoing edge. We have the following definitions for a node $v \in V$.

$$\mathrm{out}(v) = \{u \in V : (v, u) \in A\},$$

$$\mathrm{in}(v) = \{u \in V : (u, v) \in A\}.$$
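The definitions above map onto a small data structure. The following Python sketch is illustrative only: the class name `InTree`, its method names, and the string node names (borrowed from Fig. 1) are assumptions made for the example and do not appear in the paper.

```python
class InTree:
    """Directed in-tree given as a list of arcs (v, u), meaning an edge v -> u."""

    def __init__(self, arcs, root):
        self.root = root
        self.succ = {}   # out(v): each node has at most one outgoing edge
        self.pred = {}   # in(v): nodes with an edge into v
        for v, u in arcs:
            if v in self.succ:
                raise ValueError(f"node {v} has more than one outgoing edge")
            self.succ[v] = u
            self.pred.setdefault(u, []).append(v)
        self.nodes = set(self.succ) | set(self.pred) | {root}

    def out(self, v):
        """out(v) as a set (empty for the root)."""
        return {self.succ[v]} if v in self.succ else set()

    def inn(self, v):
        """in(v) as a set (empty for a leaf)."""
        return set(self.pred.get(v, []))

    def leaves(self):
        """Nodes without incoming edges, in sorted order."""
        return sorted(v for v in self.nodes if not self.inn(v))

    def path_to_root(self, v):
        """The unique directed path from v to the root, as a list of nodes."""
        path = [v]
        while path[-1] != self.root:
            path.append(self.succ[path[-1]])
        return path


# Tree of Fig. 1: v1, v2 -> u1; v3, v4 -> u2; u1, u2 -> u*; u* -> v* (root).
arcs = [("v1", "u1"), ("v2", "u1"), ("v3", "u2"), ("v4", "u2"),
        ("u1", "u*"), ("u2", "u*"), ("u*", "v*")]
tree = InTree(arcs, root="v*")
```

Storing only the successor of each node is enough because an in-tree node has at most one outgoing edge; the predecessor lists are derived from the same arcs.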

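[Figure: directed in-tree with leaves $v_1$ ($Z_1$), $v_2$ ($X_{123}$), $v_3$ ($Z_2$), $v_4$ ($X_{312}$), internal nodes $u_1$, $u_2$, $u^*$ and root $v^*$. The edges leaving $u_1$ and $u_2$ are each labeled $W_1$; the edge $(u^*, v^*)$ is labeled $W_2, W_3$.]

Fig. 1: Problem instance for Example 1. For clarity of presentation, only the $W_{\mathrm{new}}(u)$ label has been shown on the edges.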

In this work, we exclusively work with trees in which the in-degree of the root equals 1. There is a natural topological order in $\mathcal{T}$ whereby for nodes $u \in \mathcal{T}$ and $v \in \mathcal{T}$, we say that $u \succ v$ if there exists a sequence of edges that can be traversed to reach $v$ from $u$. This sequence of edges is denoted $\mathrm{path}(u, v)$.

Let $D = \cup_{d_1 \in [N], \dots, d_K \in [N]} \{X_{d_1,\dots,d_K}\}$. Suppose that we are given a directed in-tree denoted $\mathcal{T}$, with $\ell$ leaves denoted $v_1, \dots, v_\ell$. Furthermore, assume that each node $v \in \mathcal{T}$ is assigned a label, denoted $\mathrm{label}(v)$, which is a subset of $\{W_1, \dots, W_N\} \cup \{Z_1, \dots, Z_K\} \cup D$. Moreover, we also specify $W(v) \subseteq \{W_1, \dots, W_N\}$, $Z(v) \subseteq \{Z_1, \dots, Z_K\}$ and $D(v) \subseteq D$ so that $\mathrm{label}(v) = W(v) \cup Z(v) \cup D(v)$. In our formulation, the leaf nodes $v_i$, $i = 1, \dots, \ell$, are such that $W(v_i) = \emptyset$.

Definition 3: We say that a singleton source subset $\{W_i\}$ is recoverable from the pair $(Z_j, X_{d_1,\dots,d_K})$ if $d_j = i$. Similarly, for a given set of caches $Z' \subset \{Z_1, \dots, Z_K\}$ and delivery phase signals $D' \subseteq D$, we define a set $\mathrm{Rec}(Z', D') \subseteq \{W_1, \dots, W_N\}$ to be the subset of the sources that can be recovered from pairs of the form $(Z_i, X_J)$ where $Z_i \in Z'$ and $J$ is a multiset of cardinality $K$ with entries from $[N]$ such that $X_J \in D'$.

For nodes $u, v \in \mathcal{T}$, we let $\Delta(u, v) = \mathrm{Rec}(Z(u), D(v))$. For a given node $u \in \mathcal{T}$, we define

$$W_{\mathrm{new}}(u) = \Delta(u, u) \setminus W(u), \tag{1}$$

i.e., $W_{\mathrm{new}}(u)$ is the subset of sources that can be recovered from $(Z(u), D(u))$ and that are distinct from $W(u)$. A word about notation: we let the entropy of a set of random variables equal the joint entropy of all the random variables in the set. We also let $[x]^+ = \max(x, 0)$.
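Definition 3 and eq. (1) can be sketched in a few lines of Python. The encoding is an assumption made for illustration: a cache $Z_j$ is represented by the user index `j` (1-based), a delivery signal $X_{d_1,\dots,d_K}$ by the demand tuple `(d_1, ..., d_K)`, and a file $W_i$ by the integer `i`. The function names `rec` and `w_new` are hypothetical.

```python
def rec(caches, deliveries):
    """Rec(Z', D'): files recoverable from pairs (Z_j, X_d), Z_j in Z', X_d in D'.

    caches     -- set of user indices j (1-based), standing for Z_j
    deliveries -- set of demand tuples d = (d_1, ..., d_K), standing for X_d
    User j recovers file d_j from the pair (Z_j, X_d).
    """
    return {d[j - 1] for j in caches for d in deliveries}


def w_new(Z_u, D_u, W_u):
    """Eq. (1): W_new(u) is Delta(u, u) minus W(u), with Delta(u, u) = Rec(Z(u), D(u))."""
    return rec(Z_u, D_u) - W_u


# Labels taken from Example 1: Z_1 with X_123 recovers W_1; Z_2 with X_312 also recovers W_1.
at_u1 = rec({1}, {(1, 2, 3)})
at_u2 = rec({2}, {(3, 1, 2)})
# At u*, both caches and both signals are available and W_1 is already known.
at_u_star = w_new({1, 2}, {(1, 2, 3), (3, 1, 2)}, {1})
```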

III. LOWER BOUND ON $R^\star(M)$

Given a directed tree $\mathcal{T}$ and appropriate labels on its leaves $v_1, \dots, v_\ell$, where we assume that $W(v_i) = \emptyset$ for $i = 1, \dots, \ell$, we claim that Algorithm 1 generates an inequality of the form $\alpha R^\star + \beta M \ge L(\alpha, \beta)$. We demonstrate this by means of the following example and defer the proof of the general case to the Appendix.

Algorithm 1 Labeling Algorithm

Input: $\mathcal{T} = (V, A)$ with leaves $v_1, \dots, v_\ell$ and $\{\mathrm{label}(v_i)\}_{i=1}^{\ell}$, such that $W(v_i) = \emptyset$, $i = 1, \dots, \ell$.

Initialization:
for $i \leftarrow 1, \dots, \ell$ do
    $W_{\mathrm{new}}(v_i) = \Delta(v_i, v_i)$.
    $x_{(v_i, \mathrm{out}(v_i))} = W_{\mathrm{new}}(v_i)$.
    $y_{(v_i, \mathrm{out}(v_i))} = |W_{\mathrm{new}}(v_i)|$.
end for
while there exists an unlabeled edge do
    Pick an unlabeled node $u \in V$ such that all edges in $\mathrm{in\text{-}edge}(u)$ are labeled.
    $W(u) = \cup_{v \in \mathrm{in}(u)} W(v) \cup W_{\mathrm{new}}(v)$.
    $Z(u) = \cup_{v \in \mathrm{in}(u)} Z(v)$.
    $D(u) = \cup_{v \in \mathrm{in}(u)} D(v)$.
    Entropy-label$(u)$: $H(Z(u) \cup D(u) \mid W(u))$.
    $W_{\mathrm{new}}(u) = \Delta(u, u) \setminus W(u)$.
    $x_{(u, \mathrm{out}(u))} = W_{\mathrm{new}}(u)$.
    $y_{(u, \mathrm{out}(u))} = |W_{\mathrm{new}}(u)|$.
end while
Output: $L = \sum_{e \in A} y_e$.

Example 1: Consider a system with $N = K = 3$, and the directed in-tree $\mathcal{T}$ with labeling $\mathrm{label}(v_1) = \{Z_1\}$, $\mathrm{label}(v_2) = \{X_{123}\}$, $\mathrm{label}(v_3) = \{Z_2\}$ and $\mathrm{label}(v_4) = \{X_{312}\}$ (see Fig. 1). It can be observed that an application of Algorithm 1 gives us the inequality $2R^\star + 2M \ge 4$. This can be justified as follows.

$$\begin{aligned}
2R^\star F + 2MF &\ge H(Z_1, X_{123}) + H(Z_2, X_{312}) \\
&= I(W_1; Z_1, X_{123}) + H(Z_1, X_{123} \mid W_1) + I(W_1; Z_2, X_{312}) + H(Z_2, X_{312} \mid W_1) \\
&\overset{(a)}{\ge} F(1 - \epsilon) + F(1 - \epsilon) + H(Z_1, Z_2, X_{123}, X_{312} \mid W_1) \\
&= 2F(1 - \epsilon) + I(W_2, W_3; Z_1, Z_2, X_{123}, X_{312} \mid W_1) + H(Z_1, Z_2, X_{123}, X_{312} \mid W_1, W_2, W_3) \\
&\overset{(b)}{\ge} 2F(1 - \epsilon) + 2F(1 - \epsilon) = 4F(1 - \epsilon),
\end{aligned}$$

where inequality (a) holds since conditioning reduces entropy and, e.g., $I(W_1; Z_1, X_{123}) \ge F - \epsilon F$ (by Fano's inequality). The other inequality can be shown to hold in a similar manner. Similarly, inequality (b) holds by the independence of the $W_i$'s and Fano's inequality. This holds for arbitrary $\epsilon > 0$ and $F$ large enough. Dividing throughout by $F$ we have the required result.
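The labeling procedure of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: caches are encoded as user indices, delivery signals as demand tuples, files as integers, and the function name `labeling_bound` and its argument layout are hypothetical. The recursion processes a node only after all of its predecessors, which is one valid order for the "pick an unlabeled node" step.

```python
def labeling_bound(arcs, leaf_labels):
    """Run the labeling procedure on an in-tree.

    arcs        -- list of edges (v, u) meaning v -> u; the root has no outgoing edge
    leaf_labels -- dict leaf -> ("Z", j) for cache Z_j, or ("X", d) for signal X_d
    Returns (alpha, beta, L, y), where y maps each edge to |W_new| of its tail node.
    """
    succ = dict(arcs)
    pred = {}
    for v, u in arcs:
        pred.setdefault(u, []).append(v)
    root = next(n for n in pred if n not in succ)

    W, Z, D, new = {}, {}, {}, {}

    def visit(u):
        if u in leaf_labels:                      # leaves start with W(u) empty
            kind, val = leaf_labels[u]
            W[u] = set()
            Z[u] = {val} if kind == "Z" else set()
            D[u] = {val} if kind == "X" else set()
        else:                                     # merge the labels of in(u)
            W[u], Z[u], D[u] = set(), set(), set()
            for v in pred[u]:
                visit(v)
                W[u] |= W[v] | new[v]
                Z[u] |= Z[v]
                D[u] |= D[v]
        recovered = {d[j - 1] for j in Z[u] for d in D[u]}   # Delta(u, u)
        new[u] = recovered - W[u]                            # W_new(u)

    visit(root)
    y = {(v, u): len(new[v]) for v, u in arcs}
    alpha = sum(1 for kind, _ in leaf_labels.values() if kind == "X")
    beta = sum(1 for kind, _ in leaf_labels.values() if kind == "Z")
    return alpha, beta, sum(y.values()), y


# Example 1 (N = K = 3), tree of Fig. 1.
arcs = [("v1", "u1"), ("v2", "u1"), ("v3", "u2"), ("v4", "u2"),
        ("u1", "u*"), ("u2", "u*"), ("u*", "v*")]
labels = {"v1": ("Z", 1), "v2": ("X", (1, 2, 3)),
          "v3": ("Z", 2), "v4": ("X", (3, 1, 2))}
alpha, beta, L, y = labeling_bound(arcs, labels)
```

On this input the sketch returns $\alpha = 2$, $\beta = 2$ and $L = 4$, matching the inequality $2R^\star + 2M \ge 4$ derived above.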

It can be observed that at each internal node, certain cache signals and delivery phase signals meet, e.g., $Z_1$ and $X_{123}$ meet at node $u_1$ in Fig. 1. The outgoing edge of an internal node is labeled by the new files that are recovered at the node, e.g., at $u_1$ the signals $Z_1$ and $X_{123}$ recover the file $W_1$. We call a file new if it has not been recovered upstream of a given node. It can be seen that this labeling is in one-to-one correspondence with inequality (a) in Example 1 above. In a similar manner, at $u^*$ one can recover all the files $W_1, \dots, W_3$; however, only the set $\{W_2, W_3\}$ is labeled on edge $(u^*, v^*)$ as $W_1$ was recovered upstream. This intuition is formalized in the Appendix (Lemma 2), where it is shown that a valid lower bound is always obtained when applying Algorithm 1.

Definition 4: Problem Instance. Consider a given tree $\mathcal{T}$ with leaves $v_i$, $i = 1, \dots, \ell$, that are labeled as discussed above. Let $\alpha = \sum_{i=1}^{\ell} |D(v_i)|$ and $\beta = \sum_{i=1}^{\ell} |Z(v_i)|$. Suppose that the lower bound computed by Algorithm 1 equals $L$. We define the associated problem instance as $P(\mathcal{T}, \alpha, \beta, L, N, K)$. We also define $\bar{\alpha} = |\cup_{i=1}^{\ell} D(v_i)|$ and $\bar{\beta} = |\cup_{i=1}^{\ell} Z(v_i)|$, i.e., the numbers of distinct delivery phase signals and distinct caches among the leaf labels. A problem instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$ is optimal if all instances of the form $P'(\mathcal{T}', \alpha, \beta, L', N, K)$ are such that $L' \le L$.

It is not too hard to see that it suffices to consider directed trees whose internal nodes have an in-degree of at least two. In particular, if $u$ has in-degree equal to 1, it is evident that $W_{\mathrm{new}}(u) = \emptyset$ and thus $y_{(u, \mathrm{out}(u))} = 0$. In addition, we show in the Appendix (Claim 5) that w.l.o.g. it suffices to consider trees where internal nodes have in-degree at most two. Therefore, we will assume that all internal nodes have in-degree equal to two.

We can also conclude that each leaf $v$ in an instance $P$ is such that either $|Z(v)| = 1$ or $|D(v)| = 1$ but not both. If $|Z(v)| = 1$, we call $v$ a cache node; if $|D(v)| = 1$ we call it a delivery phase node. In the subsequent discussion we will assume that the delivery phase nodes are labeled in an arbitrary order $v_1, \dots, v_\alpha$ and the cache nodes $v_{\alpha+1}, \dots, v_{\alpha+\beta}$, where we note that $\alpha + \beta = \ell$. Moreover, we let $\mathcal{D} = \{v_1, \dots, v_\alpha\}$ and $\mathcal{C} = \{v_{\alpha+1}, \dots, v_{\alpha+\beta}\}$.

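We now explore some characteristics of optimal problem instances. In the tree $\mathcal{T}$ corresponding to problem instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$, consider an internal node $u$ and the edge $e = (u, v)$. The incoming edges into $u$, denoted $(u_l, u)$ and $(u_r, u)$, are the last edges of the disjoint left and right subtrees, denoted $\mathcal{T}_{u(l)}$ and $\mathcal{T}_{u(r)}$ respectively. Each of these subtrees defines a problem instance: $P_l = P(\mathcal{T}_{u(l)}, \alpha_l, \beta_l, L_l, N, K)$ and $P_r = P(\mathcal{T}_{u(r)}, \alpha_r, \beta_r, L_r, N, K)$. We define $\mathcal{D}_{u(r)} = \{v \in \mathcal{D} : v \in \mathcal{T}_{u(r)}\}$ and $\mathcal{C}_{u(r)} = \{v \in \mathcal{C} : v \in \mathcal{T}_{u(r)}\}$, with similar definitions for $\mathcal{D}_{u(l)}$ and $\mathcal{C}_{u(l)}$. We also let $\mathcal{D}_u = \mathcal{D}_{u(l)} \cup \mathcal{D}_{u(r)}$ and $\mathcal{C}_u = \mathcal{C}_{u(l)} \cup \mathcal{C}_{u(r)}$.

Let $\Gamma_l = \cup_{v \in \mathcal{T}_{u(l)}} W_{\mathrm{new}}(v)$ and $\Gamma_r = \cup_{v \in \mathcal{T}_{u(r)}} W_{\mathrm{new}}(v)$, i.e., $\Gamma_l$ and $\Gamma_r$ are the subsets of $\{W_1, \dots, W_N\}$ that are used up in the problem instances $P_l$ and $P_r$ respectively. We shall often need to reason about the files recovered at the node $u$. Accordingly, we have the following definitions.

$$\Delta_{rl}(u) = \mathrm{Rec}(Z(u_r) \setminus Z(u_l), D(u_l)),$$

$$\Delta_{lr}(u) = \mathrm{Rec}(Z(u_l) \setminus Z(u_r), D(u_r)).$$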

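It can be observed that we have

$$W_{\mathrm{new}}(u) = \Delta(u, u) \setminus W(u) = \big(\Delta_{rl}(u) \cup \Delta_{lr}(u)\big) \setminus W(u). \tag{2}$$

The second equality holds since $\mathrm{Rec}(Z(u_l), D(u_l)) \cup \mathrm{Rec}(Z(u_r), D(u_r)) \subseteq W(u)$. Note that based on Algorithm 1, we can conclude that

$$W(u) = \cup_{v \in \{u_r, u_l\}} W(v) \cup W_{\mathrm{new}}(v) = \cup_{v \succ u} W_{\mathrm{new}}(v). \tag{3}$$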

Often, we will need to refer to the singleton file subset that is recovered from $v \in \mathcal{D}$ and $v' \in \mathcal{C}$ where $v$ and $v'$ meet at node $u$. In this case, we denote

$$\Delta_{(v, v')}(u) = \mathrm{Rec}(Z(v'), D(v)).$$

For any pair of leaf nodes $v_i$ and $v_j$ where $i, j \in \{1, \dots, \alpha + \beta\}$, we say that $v_i$ and $v_j$ meet at node $u$ if there exist $\mathrm{path}(v_i, u)$ and $\mathrm{path}(v_j, u)$ in $\mathcal{T}$ such that $\mathrm{path}(v_i, u) \cap \mathrm{path}(v_j, u) = \emptyset$. In our subsequent discussion, we will often modify a given problem instance $P$ to arrive at a different problem instance $P'$. In this situation we will use the superscript $P$ or $P'$ to refer to the appropriate instance, e.g., $W^{P}_{\mathrm{new}}(u)$ will refer to $W_{\mathrm{new}}(u)$ in the instance $P$.

For a problem instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$, it may be possible that $\bar{\beta} < \beta$, where $\bar{\beta} = |\cup_{i=1}^{\ell} Z(v_i)|$ is the number of distinct caches among the leaves. However, given such an instance, we can convert it into another instance where $\bar{\beta} = \beta$ without reducing the value of $L$. In fact the following stronger statement holds (see Appendix A for a proof).

Claim 1: For a problem instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$, consider an internal node $u^*$ with associated problem instances $P_l = P(\mathcal{T}_{u^*(l)}, \alpha_l, \beta_l, L_l, N, K)$ and $P_r = P(\mathcal{T}_{u^*(r)}, \alpha_r, \beta_r, L_r, N, K)$ such that at least one of conditions (i) – (iii) below is true.

(i) $\bar{\beta}_l < \min(\beta_l, K)$.

(ii) $\bar{\beta}_r < \min(\beta_r, K)$.

(iii) $\bar{\beta} < \min(\beta, K)$.

Then, there exists another problem instance $P'(\mathcal{T}', \alpha, \beta, L', N, K)$ where $L' \ge L$ and none of the conditions (i) – (iii) hold.

Henceforth we will assume w.l.o.g. that $\bar{\beta} = \beta$ and that Claim 1 holds. Our next lemma shows a structural property of problem instances. Namely, for an instance where $L < \alpha \min(\beta, K)$, increasing the number of files allows us to increase the value of $L$. This lemma is a key ingredient in our proof of the main theorem.

Lemma 1: Let $P = P(\mathcal{T}, \alpha, \beta, L, N, K)$ be an instance where $L < \alpha \min(\beta, K)$. Then, we can construct a new instance $P' = P(\mathcal{T}', \alpha, \beta, L', N + 1, K)$, where $L' = L + 1$.

Informally, another property of optimal problem instances is that the same file is recovered as many times as possible at the same level of the tree. For instance, in Fig. 1, $W_1$ is recovered in both $\mathcal{T}_{u^*(l)}$ and $\mathcal{T}_{u^*(r)}$. In fact, intuitively it is clear that the same set of files can be reused in any subtrees of an internal node. Our next claim formalizes this intuition.

Claim 2: Consider an instance $P = P(\mathcal{T}, \alpha, \beta, L, N, K)$. At any node $u \in \mathcal{T}$, suppose w.l.o.g. that $|\Gamma_l| \ge |\Gamma_r|$. If there exists a node $u$ such that $\Gamma_r \nsubseteq \Gamma_l$, then there exists another instance $P'(\mathcal{T}', \alpha, \beta, L', N', K)$ such that $N' < N$ and $L' \ge L$.

Definition 5: Saturation number. Consider an instance $P^*(\mathcal{T}^*, \alpha, \beta, L^*, N^*, K)$, where $L^* = \alpha \min(\beta, K)$, such that for all problem instances of the form $P(\mathcal{T}, \alpha, \beta, L^*, N, K)$, we have $N^* \le N$. We call $N^*$ the saturation number of instances with parameters $(\alpha, \beta, K)$ and denote it by $N_{\mathrm{sat}}(\alpha, \beta, K)$.

In essence, for given $\alpha$, $\beta$ and $K$, saturated instances are the most efficient in using the number of available files. It is easy to see that $N_{\mathrm{sat}}(\alpha, \beta, K) \le \alpha \min(\beta, K)$, since one can construct an instance with lower bound $\alpha \min(\beta, K)$ when $\alpha \min(\beta, K) \le N$ (see remark 1 in the proof of Lemma 1).

Definition 6: Atomic problem instance. For a given optimal problem instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$ it is possible that there exist other optimal problem instances $P_i(\alpha_i, \beta_i, L_i, N, K)$, $i = 1, \dots, m$, with $m \ge 2$ such that $\sum_{i=1}^{m} \alpha_i = \alpha$, $\sum_{i=1}^{m} \beta_i = \beta$ and $\sum_{i=1}^{m} L_i = L$, i.e., the value of $L$ follows from appropriately combining smaller problems. In this case we call the instance $P$ non-atomic. Conversely, if such smaller problem instances do not exist, we call $P$ an atomic problem instance.

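Claim 3: Let $P(\mathcal{T}, \alpha, \beta, L, N, K)$ be an instance where $\beta \le K$ and $L = \alpha\beta$. Then, $\bar{\alpha} = \alpha$, i.e., all delivery phase leaves carry distinct signals.

Let $\rho(u) = \alpha_l [\min(\beta_r, K - \beta_l)]^+ + \alpha_r [\min(\beta_l, K - \beta_r)]^+$.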

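Claim 4: In instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$, consider an internal node $u$. We have

$$|W_{\mathrm{new}}(u)| \le \min\big(\rho(u), \; N - |\Gamma_l \cup \Gamma_r|\big).$$

Proof: From eq. (2) it follows that

$$|W_{\mathrm{new}}(u)| \le |\Delta_{rl}(u) \setminus W(u)| + |\Delta_{lr}(u) \setminus W(u)|.$$

Next, we observe that

$$\begin{aligned}
|\Delta_{rl}(u) \setminus W(u)| &= |\mathrm{Rec}(Z(u_r) \setminus Z(u_l), D(u_l)) \setminus W(u)| \\
&\le |Z(u_r) \setminus Z(u_l)| \times |D(u_l)| \\
&\overset{(a)}{\le} \alpha_l \times [\min(\beta_r, K - \beta_l)]^+,
\end{aligned}$$

where inequality (a) holds since $|D(u_l)| = \alpha_l$ and $|Z(u_r) \setminus Z(u_l)| \le [\min(\beta_r, K - \beta_l)]^+$. We can bound $|\Delta_{lr}(u) \setminus W(u)|$ in a similar manner to obtain the first inequality. To see the second inequality we note that instances $P_l$ and $P_r$ recover a total of $|\Gamma_l \cup \Gamma_r|$ sources. As the total number of sources is $N$, $|W_{\mathrm{new}}(u)| \le N - |\Gamma_l \cup \Gamma_r|$.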
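The bound of Claim 4 is easy to evaluate numerically. The Python sketch below is illustrative; the function names `rho` and `w_new_upper_bound` and the argument `gamma_union` (standing for $|\Gamma_l \cup \Gamma_r|$) are assumptions introduced for the example.

```python
def rho(alpha_l, beta_l, alpha_r, beta_r, K):
    """rho(u) = alpha_l*[min(beta_r, K - beta_l)]^+ + alpha_r*[min(beta_l, K - beta_r)]^+."""
    def pos(x):
        return max(x, 0)
    return (alpha_l * pos(min(beta_r, K - beta_l))
            + alpha_r * pos(min(beta_l, K - beta_r)))


def w_new_upper_bound(alpha_l, beta_l, alpha_r, beta_r, K, N, gamma_union):
    """Claim 4: |W_new(u)| <= min(rho(u), N - |Gamma_l U Gamma_r|)."""
    return min(rho(alpha_l, beta_l, alpha_r, beta_r, K), N - gamma_union)


# Node u* of Example 1: each subtree has one delivery leaf and one cache leaf,
# N = K = 3, and the subtrees jointly use up one file (W_1).
bound_at_u_star = w_new_upper_bound(1, 1, 1, 1, K=3, N=3, gamma_union=1)

# If both subtrees already contain K or more cache leaves, rho vanishes.
rho_saturated = rho(2, 3, 2, 3, K=3)
```

For Example 1 the bound evaluates to 2, which is met with equality since $W_{\mathrm{new}}(u^*) = \{W_2, W_3\}$.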

The following theorem and its corollary are the main results of our paper and can be used to identify optimal problem instances.

Theorem 1: Suppose that there exists an optimal and atomic problem instance $P_o(\mathcal{T} = (V, A), \alpha, \beta, L_o, N, K)$. Then, there exists an optimal and atomic problem instance $P^*(\mathcal{T}^* = (V^*, A^*), \alpha, \beta, L^*, N, K)$, where $L^* = L_o$, with the following properties. Let us denote the last edge in $P^*$ by $(u^*, v^*)$. Let $P^*_l = P(\mathcal{T}^*_{u^*(l)}, \alpha_l, \beta_l, L^*_l, N_l, K)$ and $P^*_r = P(\mathcal{T}^*_{u^*(r)}, \alpha_r, \beta_r, L^*_r, N_r, K)$. Then, we have

$$L^*_l = \alpha_l \min(\beta_l, K),$$

$$L^*_r = \alpha_r \min(\beta_r, K), \text{ and}$$

$$L^* = \min\big(\alpha \min(\beta, K), \; L^*_l + L^*_r + N - N_0\big), \tag{4}$$

where $N_0 = \max(N_{\mathrm{sat}}(\alpha_l, \beta_l, K), N_{\mathrm{sat}}(\alpha_r, \beta_r, K))$ (see footnote 1). Furthermore, at least one of $\beta_l$ or $\beta_r$ is strictly smaller than $K$.

Footnote 1: As the instance is atomic, we have $N \ge N_0$.

Proof: Note that we assume that the problem instance $P_o$ is atomic. This implies that $W^{P_o}_{\mathrm{new}}(u^*) \neq \emptyset$. Using Claim 1 we can assert that $\bar{\beta}_l = \beta_l$ and $\bar{\beta}_r = \beta_r$.

Suppose that $L^*_l < \alpha_l \min(\beta_l, K)$. We apply the result of Lemma 1, by noting that $N_l < N$, and conclude that there exists another instance $P^{**}_l = P(\mathcal{T}^{**}_{u^*(l)}, \alpha_l, \beta_l, L^*_l + 1, N_l + 1, K)$ that can replace $P^*_l$, where the new file is denoted $W^*$. We also note that in $P_o$, $W^* \in W^{P_o}_{\mathrm{new}}(u^*)$. Let us denote the new instance $P'_o$. We emphasize that the nature of the modification in Lemma 1 is such that $\Delta^{P'_o}_{rl}(u^*) = \Delta^{P_o}_{rl}(u^*)$ and $\Delta^{P'_o}_{lr}(u^*) = \Delta^{P_o}_{lr}(u^*)$. Moreover, we note that $W^{P'_o}(u^*) = W^{P_o}(u^*) \cup \{W^*\}$. We have

$$\begin{aligned}
W^{P'_o}_{\mathrm{new}}(u^*) &= \big(\Delta^{P'_o}_{rl}(u^*) \cup \Delta^{P'_o}_{lr}(u^*)\big) \setminus W^{P'_o}(u^*) \\
&= \big(\Delta^{P_o}_{rl}(u^*) \cup \Delta^{P_o}_{lr}(u^*)\big) \setminus \big(W^{P_o}(u^*) \cup \{W^*\}\big) \\
&= W^{P_o}_{\mathrm{new}}(u^*) \setminus \{W^*\} \quad (\text{since } W^* \in \Delta^{P_o}_{rl}(u^*) \cup \Delta^{P_o}_{lr}(u^*)).
\end{aligned}$$

Based on this argument, we can immediately conclude that we cannot have both $L^*_l < \alpha_l \min(\beta_l, K)$ and $L^*_r < \alpha_r \min(\beta_r, K)$, as the file $W^*$ can be used to simultaneously modify the instance $P^*_r$. Upon this modification, we can conclude that $L^*$ can be increased by one, which contradicts the optimality of the instance $P_o$. Thus we assume that $L^*_r = \alpha_r \min(\beta_r, K)$. We can repeatedly apply the operation of moving files from $W^{P_o}_{\mathrm{new}}(u^*)$ to $P^*_l$ until we have $L^*_l = \alpha_l \min(\beta_l, K)$. It has to be the case that $|W^{P_o}_{\mathrm{new}}(u^*)| > \alpha_l \min(\beta_l, K) - N_l$ so that we can repeatedly apply the operation of moving the files, for if this were not true, the instance $P_o$ would not be atomic.

We will denote the instance that we arrive at after completing these modifications by $P^*$. We can also observe at this point that if we have $\beta_l \ge K$ and $\beta_r \ge K$, then $W^{P^*}_{\mathrm{new}}(u^*) = \emptyset$ (by Claim 4), which implies that the original instance $P_o$ is not atomic. Thus, either $\beta_l$ or $\beta_r$ or both have to be strictly smaller than $K$. In the discussion below we assume w.l.o.g. that $\beta_r < K$. It is easy to see that

$$L^* = L^*_l + L^*_r + |W^{P^*}_{\mathrm{new}}(u^*)|.$$

By Claim 4, we have that

$$|W^{P^*}_{\mathrm{new}}(u^*)| \le \min\big(\rho(u^*), \; N - \max(|\Gamma^*_l|, |\Gamma^*_r|)\big).$$

For an optimal instance, we claim that the above inequality is met with equality. This is because by Claim 2, we have either $\Gamma^*_l \subseteq \Gamma^*_r$ or $\Gamma^*_r \subseteq \Gamma^*_l$, so that there are $N - \max(|\Gamma^*_l|, |\Gamma^*_r|)$ new files that are available to be recovered at $u^*$.

Moreover, as $\beta_r < K$ and $L^*_r = \alpha_r \min(\beta_r, K)$, we can conclude that $\bar{\alpha}_r = \alpha_r$ by Claim 3. Next, we observe that if $\beta_l < K$, we can again use Claim 3 to conclude $\bar{\alpha}_l = \alpha_l$. On the other hand, if $\beta_l \ge K$, then $[\min(\beta_r, K - \beta_l)]^+ = 0$. In both cases, we can conclude that

$$|W^{P^*}_{\mathrm{new}}(u^*)| = \min\big(\rho(u^*), \; N - \max(|\Gamma^*_l|, |\Gamma^*_r|)\big),$$

where $\rho(u^*) = \alpha_l \times [\min(\beta_r, K - \beta_l)]^+ + \alpha_r \times [\min(\beta_l, K - \beta_r)]^+$. It is easy to verify that

$$\alpha \min(\beta, K) = \alpha_l \min(\beta_l, K) + \alpha_r \min(\beta_r, K) + \rho(u^*).$$
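The identity can be checked by a short case analysis. The sketch below assumes, as the construction at $u^*$ implies, that $\alpha = \alpha_l + \alpha_r$ and $\beta = \beta_l + \beta_r$ with all quantities non-negative integers.

```latex
% Terms carrying the factor \alpha_l:
\begin{align*}
\alpha_l \min(\beta_l, K) + \alpha_l [\min(\beta_r, K - \beta_l)]^+
  &= \begin{cases}
       \alpha_l \bigl(\beta_l + \min(\beta_r, K - \beta_l)\bigr)
         = \alpha_l \min(\beta_l + \beta_r, K), & \beta_l < K, \\[2pt]
       \alpha_l K + 0
         = \alpha_l \min(\beta_l + \beta_r, K), & \beta_l \ge K,
     \end{cases} \\
  &= \alpha_l \min(\beta, K).
\end{align*}
% By symmetry the terms carrying \alpha_r sum to \alpha_r \min(\beta, K), so
\begin{align*}
\alpha_l \min(\beta_l, K) + \alpha_r \min(\beta_r, K) + \rho(u^*)
  = (\alpha_l + \alpha_r) \min(\beta, K) = \alpha \min(\beta, K).
\end{align*}
```

In the first case the positive part is inactive because $\beta_r \ge 0$ and $K - \beta_l > 0$; in the second case it clips the non-positive value $K - \beta_l$ to zero.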

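It follows that

$$L^* = \min\big(\alpha \min(\beta, K), \; L^*_l + L^*_r + N - \max(|\Gamma^*_l|, |\Gamma^*_r|)\big).$$

Note that if $L^* < \alpha \min(\beta, K)$ we have

$$\begin{aligned}
|W^{P^*}_{\mathrm{new}}(u^*)| &= N - \max(|\Gamma^*_l|, |\Gamma^*_r|) \qquad (5) \\
&\le N - \max(N_{\mathrm{sat}}(\alpha_l, \beta_l, K), N_{\mathrm{sat}}(\alpha_r, \beta_r, K)).
\end{aligned}$$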

We claim that for $P^*$ to be optimal, $P^*_l$ and $P^*_r$ have to be such that $N_l = N_{\mathrm{sat}}(\alpha_l, \beta_l, K)$ and $N_r = N_{\mathrm{sat}}(\alpha_r, \beta_r, K)$. To see this, note that by the definition of the saturation number, problem instances $P'_l(\mathcal{T}'_l, \alpha_l, \beta_l, L'_l, N'_l, K)$ and $P'_r(\mathcal{T}'_r, \alpha_r, \beta_r, L'_r, N'_r, K)$ exist such that $L'_l = L^*_l$, $L'_r = L^*_r$, $N'_l = N_{\mathrm{sat}}(\alpha_l, \beta_l, K)$ and $N'_r = N_{\mathrm{sat}}(\alpha_r, \beta_r, K)$. W.l.o.g. let us assume $N'_l \ge N'_r$. By Claims 1 and 2, problem instances $P'_l$ and $P'_r$ can be modified in such a way that $\bar{\beta}'_l = \min(\beta_l, K)$, $\bar{\beta}'_r = \min(\beta_r, K)$ and $\Gamma'_r \subseteq \Gamma'_l$. Also, $\cup_{v \in \mathcal{C}'_l} Z(v)$ and $\cup_{v \in \mathcal{C}'_r} Z(v)$ have minimum intersection. Now, consider problem instance $P'(\mathcal{T}', \alpha, \beta, L', N, K)$ with last edge $(u', v')$ so that $P'_l$ and $P'_r$ are the instances of $u'_l$ and $u'_r$ respectively. By setting $\Delta_{(v, v')}(u') \in \{W_1, \dots, W_N\} \setminus \Gamma'_l$, we can modify $P'$ such that $|W_{\mathrm{new}}(u')| = N - \max(N'_l, N'_r)$. Thus, we have $L = L^*_l + L^*_r + N - \max(N'_l, N'_r)$, and since $L^* \ge L$, equality holds in eq. (5).

Fig. 2: Comparison of lower bounds (rate $R$ versus cache size $M$) for (a) Case I: $N = 6$, $K = 2$; (b) Case II: $N = 6$, $K = 3$; (c) Case III: $N = 15$, $K = 4$; (d) Case IV: $N = 64$, $K = 12$. Blue curve: proposed lower bound, Red curve: achievable rate, Black curve: cutset lower bound.

Corollary 1: Suppose that there exists an optimal and atomic problem instance $P_o(\mathcal{T} = (V, A), \alpha, \beta, L_o, N, K)$. Consider problem instances $P'_l(\alpha'_l, \beta'_l, L'_l, N, K)$ and $P'_r(\alpha'_r, \beta'_r, L'_r, N, K)$ such that $\alpha'_l + \alpha'_r = \alpha$ and $\beta'_l + \beta'_r = \beta$, and such that $N \ge N'_0 = \max(N_{\mathrm{sat}}(\alpha'_l, \beta'_l, K), N_{\mathrm{sat}}(\alpha'_r, \beta'_r, K))$. Then we have

$$L_o \ge \min\big(\alpha \min(\beta, K), \; L'_l + L'_r + N - N'_0\big).$$

Proof: The result follows by applying the arguments in the proof of Theorem 1 to the problem instance where $P^*_l$ and $P^*_r$ are replaced by $P'_l$ and $P'_r$ respectively.

IV. DISCUSSION

Our first observation is that the cutset bound in [2] is a special case of the bound in eq. (4). In particular, suppose that $\alpha = \lfloor N/s \rfloor$, $\beta = s$ for $s = 1, \dots, \min(N, K)$. Note that $\alpha\beta \le N$. Thus, it is easy to construct a problem instance where $L = \alpha\beta$ (see remark 1 in the proof of Lemma 1). This also follows from observing that $N_{\mathrm{sat}}(\alpha, \beta, K) \le \alpha\beta$.
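With $\alpha = \lfloor N/s \rfloor$, $\beta = s$ and $L = \alpha\beta$, the inequality $\alpha R^\star + \beta M \ge L$ rearranges to $R^\star \ge s - sM/\lfloor N/s \rfloor$. The following Python sketch evaluates the best such bound over $s$; the function name `cutset_bound` is an assumption, and exact rational arithmetic is used only to make the result easy to compare with the values quoted later in this section.

```python
from fractions import Fraction


def cutset_bound(N, K, M):
    """Best bound of the form R >= s - s*M/floor(N/s), over s = 1, ..., min(N, K).

    The value can be negative for large M, in which case it is weaker than
    the trivial bound R >= 0.
    """
    M = Fraction(M)
    return max(s - s * M / (N // s) for s in range(1, min(N, K) + 1))


# N = 64, K = 12, M = 16/3 (the setting of Example 3 below).
example_bound = cutset_bound(64, 12, Fraction(16, 3))

# N = 4, K = 3, M = 1: the three cutset inequalities are
# 4R + M >= 4, 2R + 2M >= 4 and R + 3M >= 3.
small_bound = cutset_bound(4, 3, 1)
```

For $N = 64$, $K = 12$, $M = 16/3$ the maximum is attained at $s = 7$ and equals $77/27$.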

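Suppose that for a coded caching system with $N$ files and $K$ users, we first apply the cutset bound with certain $\alpha_1$ and $\beta_1$ such that $\alpha_1\beta_1 < N$. This in turn implies that $N_{\mathrm{sat}}(\alpha_1, \beta_1, K) < N$. Using Corollary 1 we can instead attempt to lower bound $2\alpha_1 R^\star + 2\beta_1 M$ and obtain the following inequality.

$$\begin{aligned}
2\alpha_1 R^\star + 2\beta_1 M &\ge 2\alpha_1\beta_1 + N - N_{\mathrm{sat}}(\alpha_1, \beta_1, K) \\
\implies \alpha_1 R^\star + \beta_1 M &\ge \alpha_1\beta_1 + \big(N - N_{\mathrm{sat}}(\alpha_1, \beta_1, K)\big)/2,
\end{aligned}$$

which is strictly better than the cutset bound.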

Example 2: Consider a system containing a server with four files and three users, i.e., $N = 4$ and $K = 3$. The cutset bounds corresponding to the given system are $4R^\star + M \ge 4$, $2R^\star + 2M \ge 4$ and $R^\star + 3M \ge 3$. Consider the second bound, $2R^\star + 2M \ge 4$, and instead attempt to obtain a lower bound on $4R^\star + 4M$.

In this case, by exhaustive enumeration, it can be verified that $N_{\mathrm{sat}}(2, 2, 3) = 3 < N$. Using Corollary 1, this results in the lower bound $L^* \ge \min(4 \times 3, \; 2 \times 4 + 4 - N_{\mathrm{sat}}(2, 2, 3)) = 9$, i.e., $4R^\star + 4M \ge 9$. Thus we can conclude that $R^\star + M \ge 2.25$, which is better than the cutset bound $R^\star + M \ge 2$.
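The arithmetic of Example 2 can be reproduced with a small helper that evaluates the right-hand side of Corollary 1. This is a sketch; the function name `corollary_bound` and its parameter names are hypothetical, and the values of $L'_l$, $L'_r$ and $N'_0$ are inputs supplied by the caller rather than computed.

```python
from fractions import Fraction


def corollary_bound(alpha, beta, K, N, L_l, L_r, N0):
    """Right-hand side of Corollary 1: min(alpha*min(beta, K), L_l + L_r + N - N0)."""
    if N < N0:
        raise ValueError("Corollary 1 requires N >= N0")
    return min(alpha * min(beta, K), L_l + L_r + N - N0)


# Example 2: N = 4, K = 3. Two sub-instances with (alpha', beta') = (2, 2),
# each with L' = 2 * min(2, 3) = 4, and N_sat(2, 2, 3) = 3 (value from the text).
L = corollary_bound(alpha=4, beta=4, K=3, N=4, L_l=4, L_r=4, N0=3)

# 4R + 4M >= L  implies  R + M >= L / 4.
per_unit = Fraction(L, 4)
```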

Theorem 1 can be leveraged effectively if it can also yield the optimal values of $\alpha_l, \beta_l$ and $\alpha_r, \beta_r$. However, currently we do not have an algorithm for picking them in an optimal manner. Moreover, we also do not have an algorithm for finding $N_{\mathrm{sat}}(\alpha, \beta, K)$. Thus, we have to use Corollary 1 with an appropriate upper bound on $N_{\mathrm{sat}}(\alpha, \beta, K)$ in general.

Our proposed algorithm for upper bounding $N_{\mathrm{sat}}(\alpha, \beta, K)$ is discussed in the Appendix (Algorithm 3). Setting $\alpha_l = \lceil \alpha/2 \rceil$, $\beta_l = \lfloor \beta/2 \rfloor$ in Theorem 1 and applying this approach to upper bound the saturation number, we can obtain the results plotted in Fig. 2.

Example 3: Consider a system with $N = 64$, $K = 12$ and cache size $M = 16/3$. In this case the cutset bound provides the lower bound $R^\star(M) \ge 77/27 \approx 2.852$. Now, using the approach of Theorem 1 for $\alpha = 12$, $\beta = 8$, $(\alpha_l, \beta_l) = (\alpha_r, \beta_r) = (6, 4)$ yields $12R^\star + 8M \ge \min(12 \times 8, \; 24 + 24 + 64 - N_{\mathrm{sat}}(6, 4, 12))$. Using Algorithm 3 to upper bound $N_{\mathrm{sat}}$, we have $N_{\mathrm{sat}}(6, 4, 12) \le 17$; therefore $12R^\star + 8M \ge 95$ and $R^\star(M) \ge 157/36 \approx 4.361$. This is significantly closer to the achievable rate of 5.5 (from [2]).
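The numbers in Example 3 can be re-derived step by step with exact fractions. In the sketch below the variable names are illustrative, and the value 17 is the upper bound on $N_{\mathrm{sat}}(6, 4, 12)$ quoted in the text, not something the code computes.

```python
from fractions import Fraction

N, K, M = 64, 12, Fraction(16, 3)
alpha, beta = 12, 8

# Each half has (alpha', beta') = (6, 4), so L' = 6 * min(4, K) = 24.
L_l = L_r = 6 * min(4, K)

# Upper bound on N_sat(6, 4, 12) taken from the text (Algorithm 3).
N_sat_upper = 17

# Replacing N_sat by an upper bound can only decrease the right-hand side,
# so the resulting inequality remains valid.
L = min(alpha * min(beta, K), L_l + L_r + N - N_sat_upper)

# alpha * R + beta * M >= L  implies  R >= (L - beta * M) / alpha.
rate_lower_bound = (L - beta * M) / alpha
```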

REFERENCES

[1] D. Wessels, Web Caching. O'Reilly, 2001.

[2] M. Maddah-Ali and U. Niesen, "Fundamental limits of caching," IEEE Trans. on Info. Th., vol. 60, no. 5, pp. 2856–2867, May 2014.

[3] ——, "Decentralized coded caching attains order-optimal memory-rate tradeoff," Networking, IEEE/ACM Transactions on, 2014 (to appear).

[4] U. Niesen and M. Maddah-Ali, "Coded caching with nonuniform demands," in Computer Communications Workshops (INFOCOM WKSHPS), 2014 IEEE Conference on, April 2014, pp. 221–226.

[5] N. Karamchandani, U. Niesen, M. Maddah-Ali, and S. Diggavi, "Hierarchical coded caching," in Information Theory (ISIT), 2014 IEEE International Symposium on, June 2014, pp. 2142–2146.

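APPENDIX

Lemma 2: Algorithm 1 always provides a valid lower bound on $\alpha R^\star + \beta M$, where $\alpha = \sum_{i=1}^{\ell} |D(v_i)|$ and $\beta = \sum_{i=1}^{\ell} |Z(v_i)|$.

Proof: Consider any internal node $v \in \mathcal{T}$. We have

$$\begin{aligned}
\sum_{u \in \mathrm{in}(v)} H(Z(u) \cup D(u) \mid W(u) \cup W_{\mathrm{new}}(u))
&\overset{(a)}{\ge} \sum_{u \in \mathrm{in}(v)} H(Z(u) \cup D(u) \mid W(v)) \\
&\overset{(b)}{\ge} H(Z(v) \cup D(v) \mid W(v)) \\
&\overset{(c)}{=} I(W_{\mathrm{new}}(v); Z(v) \cup D(v) \mid W(v)) + H(Z(v) \cup D(v) \mid W(v) \cup W_{\mathrm{new}}(v)),
\end{aligned}$$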

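where inequality (a) holds since $W(u) \cup W_{\mathrm{new}}(u) \subseteq W(v)$ and conditioning decreases entropy, (b) holds since $\cup_{u \in \mathrm{in}(v)} Z(u) = Z(v)$ and $\cup_{u \in \mathrm{in}(v)} D(u) = D(v)$, and (c) holds by the definition of mutual information. Let $V_{\mathrm{int}}$ denote the set of internal nodes in $\mathcal{T}$. Let $v^*$ denote the root and $(u^*, v^*)$ denote its incoming edge. Then,

$$\sum_{v \in V_{\mathrm{int}}} \sum_{u \in \mathrm{in}(v)} H(Z(u) \cup D(u) \mid W(u) \cup W_{\mathrm{new}}(u)) \ge \sum_{v \in V_{\mathrm{int}}} y_{(v, \mathrm{out}(v))} + \sum_{v \in V_{\mathrm{int}}} H(Z(v) \cup D(v) \mid W(v) \cup W_{\mathrm{new}}(v)),$$

where we have ignored the infinitesimal terms introduced due to Fano's inequality (for convenience of presentation). Note that the RHS of the inequality above contains the entropy label of all nodes $v \in V_{\mathrm{int}}$ (including $u^*$). On the other hand, the LHS contains the entropy label of all nodes including the leaf nodes but excluding the node $u^*$. Canceling the common terms, we obtain

$$\sum_{i=1}^{\ell} H(Z(v_i) \cup D(v_i) \mid W_{\mathrm{new}}(v_i)) \ge \sum_{v \in V_{\mathrm{int}}} y_{(v, \mathrm{out}(v))} + H(Z(u^*) \cup D(u^*) \mid W(u^*), W_{\mathrm{new}}(u^*)),$$

since $W(v_i) = \emptyset$ for $i = 1, \dots, \ell$. We can therefore conclude that

$$\sum_{i=1}^{\ell} H(Z(v_i), D(v_i)) \ge \sum_{v \in V} y_{(v, \mathrm{out}(v))} \tag{6}$$

$$\implies \sum_{i=1}^{\ell} H(Z(v_i)) + \sum_{i=1}^{\ell} H(D(v_i)) \ge \sum_{v \in V} y_{(v, \mathrm{out}(v))}. \tag{7}$$

Noting that $M \ge H(Z(v_i))$ and $R^\star \ge H(D(v_i))$, we have the required result.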

Claim 5: Consider a problem instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$ such that there exists a node $u \in \mathcal{T}$ with $|\mathrm{in}(u)| \ge 3$. Then, there exists another instance $P'(\mathcal{T}', \alpha, \beta, L', N, K)$ where $L' \ge L$ and $|\mathrm{in}(u)| \le 2$ for all nodes $u \in \mathcal{T}'$.

Proof: We iteratively modify $P$ to arrive at an instance where every node has in-degree at most two. Towards this end, we first identify a node $u$ with in-degree $\delta \ge 3$ such that no other node of in-degree at least 3 is topologically higher than it.

Fig. 3: Tree modification example. A node $u$ with predecessors $v'_1, v'_2, v'_3, \dots, v'_\delta$ is replaced by an in-tree with the same leaves, internal nodes $u'_1, u'_2, \dots$ and root $u$.

We modify the instance $P$ by replacing $u$ with a directed in-tree where each node has in-degree exactly two. Specifically, arbitrarily number the nodes in $\mathrm{in}(u)$ as $v'_1, \dots, v'_\delta$. We replace the node $u$ with a directed in-tree $\mathcal{T}_u$ with leaves $v'_1, \dots, v'_\delta$ and root $u$. $\mathcal{T}_u$ has $\delta - 2$ internal nodes numbered $u'_1, \dots, u'_{\delta-2}$ such that $\mathrm{in}(u'_i) = \{u'_{i-1}, v'_{i+1}\}$, where $u'_0 = v'_1$ (see Fig. 3). Let us denote the new instance by $P_o = P_o(\mathcal{T}_o, \alpha, \beta, L_o, N, K)$. We claim that $L_o \ge L$. To see this, suppose that $W^* \in W^{P}_{\mathrm{new}}(u)$. We show that $W^* \in \cup_{u' \in \mathcal{T}_u} W^{P_o}_{\mathrm{new}}(u')$. This ensures that $L_o \ge L$. To see this we note that

$$Z^{P}(u) = Z^{P_o}(u), \quad D^{P}(u) = D^{P_o}(u), \quad \text{and thus} \quad \Delta^{P}(u, u) = \Delta^{P_o}(u, u).$$

Thus, if $W^* \in W^{P}_{\mathrm{new}}(u)$, there exists an internal node $u'_i \in \mathcal{T}_u$ with the smallest index $i \in \{1, \dots, \delta - 2\}$ such that $W^* \in \Delta^{P_o}(u'_i, u'_i)$. Note that if $i > 1$, we have $W^* \in W^{P_o}_{\mathrm{new}}(u'_i)$ since $W^* \notin \Delta^{P_o}(u'_{i-1}, u'_{i-1})$, which in turn implies that $W^* \notin W^{P_o}(u'_i)$. On the other hand, if $i = 1$, then a similar argument holds since it is easy to see that $W^* \notin W^{P_o}(u'_1)$.

Note that the modification in instance $P$ can only affect nodes that are downstream of $u$. Now consider $u'$ such that $u \in \mathrm{in}(u')$. It is evident that $Z^{P_o}(u') = Z^{P}(u')$ and $D^{P_o}(u') = D^{P}(u')$. Moreover, $W^{P_o}(u') = \cup_{v \in \mathrm{in}(u')} W^{P_o}(v) \cup W^{P_o}_{\mathrm{new}}(v)$. Now for $v \neq u$, $W^{P_o}(v) = W^{P}(v)$ and $W^{P_o}_{\mathrm{new}}(v) = W^{P}_{\mathrm{new}}(v)$ as there are no changes in the corresponding subtrees. Moreover, as $\Delta^{P}(u, u) = \Delta^{P_o}(u, u)$, we have that $W^{P_o}(u) \cup W^{P_o}_{\mathrm{new}}(u) = W^{P}(u) \cup W^{P}_{\mathrm{new}}(u)$. This implies that $W^{P_o}(u') = W^{P}(u')$. Thus, we can conclude that $W^{P_o}_{\mathrm{new}}(u') = W^{P}_{\mathrm{new}}(u')$. Applying an inductive argument we can conclude that $W^{P_o}_{\mathrm{new}}(u') = W^{P}_{\mathrm{new}}(u')$ for all $u'$ such that $u \succ u'$.

The above process can iteratively be applied to every node in the instance that is of in-degree at least three. Thus, we have the required result.
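The tree modification used in this proof can be sketched as a small Python routine. This is only an illustration of the construction: the function name `binarize`, the predecessor-dictionary representation, and the tuple names `(u, i)` for the new internal nodes $u'_i$ are assumptions. The last step, $\mathrm{in}(u) = \{u'_{\delta-2}, v'_\delta\}$, is inferred from the requirement that $\mathcal{T}_u$ has root $u$ and leaves $v'_1, \dots, v'_\delta$.

```python
def binarize(pred, u):
    """Replace node u (in-degree delta >= 3) by a chain of in-degree-2 nodes.

    pred -- dict mapping a node to the list of its predecessors, i.e. in(node)
    Returns a new dict; the input is left unchanged.
    """
    new_pred = {k: list(v) for k, v in pred.items()}
    vs = pred[u]                         # v'_1, ..., v'_delta (arbitrary numbering)
    delta = len(vs)
    if delta < 3:
        return new_pred
    prev = vs[0]                         # u'_0 = v'_1
    for i in range(1, delta - 1):        # i = 1, ..., delta - 2
        node = (u, i)                    # stands for u'_i (hypothetical naming)
        new_pred[node] = [prev, vs[i]]   # in(u'_i) = {u'_{i-1}, v'_{i+1}}
        prev = node
    new_pred[u] = [prev, vs[-1]]         # in(u) = {u'_{delta-2}, v'_delta}
    return new_pred


# A node with four predecessors becomes a chain of three in-degree-2 nodes.
original = {"u": ["a", "b", "c", "d"]}
modified = binarize(original, "u")
```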

We define the function $\psi : \mathcal{D} \times \mathcal{C} \to \{0, 1\}$ that allows us to express $L$ in another way. For nodes $v_i \in \mathcal{D}$, $v' \in \mathcal{C}$ we can define their meeting point $u \in \mathcal{T}$. The function $\psi(v_i, v')$ is determined by means of Algorithm 2, where the sequence in which we pick the nodes $v_1, \dots, v_\alpha$ is fixed. Each element of $W_{\mathrm{new}}(u)$ can be recovered from multiple pairs of nodes that meet there. The array $\Omega(u, \delta_u)$ keeps track of the first time the file $\delta_u$ is encountered. The function $\psi(v_i, v')$ takes the value 1 if the file $W^*$ recovered from the pair $(Z(v'), D(v_i))$ at $u$ belongs to $W_{\mathrm{new}}(u)$ and has not been encountered before, and 0 otherwise. A formal description is given in Algorithm 2.

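Algorithm 2 Computing $\psi$

Input: $P(\mathcal{T}, \alpha, \beta, L, N, K)$, array $\Omega(u, \delta_u)$, where $u \in \mathcal{T}$, $\delta_u \subseteq W_{\mathrm{new}}(u)$, $|\delta_u| = 1$.

Initialization:
for all $u \in \mathcal{T}$, $\delta_u \subseteq W_{\mathrm{new}}(u)$ where $|\delta_u| = 1$ do
    $\Omega(u, \delta_u) \leftarrow 0$.
end for
for $i \leftarrow 1$ to $\alpha$ do
    for all $v' \in \mathcal{C}$ do
        $\delta_u = \Delta_{(v_i, v')}(u)$, where $u$ is the node at which $v_i$ and $v'$ meet.
        if $\delta_u \subseteq W_{\mathrm{new}}(u)$ and $\Omega(u, \delta_u) = 0$ then
            $\psi(v_i, v') \leftarrow 1$ and $\Omega(u, \delta_u) \leftarrow 1$.
        else
            $\psi(v_i, v') \leftarrow 0$.
        end if
    end for
end for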

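Claim 6: For an instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$ the following equality holds.

$$L = \sum_{i=1}^{\alpha} \sum_{v' \in \mathcal{C}} \psi(v_i, v'). \tag{8}$$

Proof: We first note that at the end of the algorithm above, $\Omega(u, \delta_u) = 1$ for all $u \in \mathcal{T}$ and all $\delta_u \subseteq W_{\mathrm{new}}(u)$, $|\delta_u| = 1$. To see this, suppose that there is a $u_1 \in \mathcal{T}$ and a singleton subset $\delta_{u_1}$ of $W_{\mathrm{new}}(u_1)$ such that $\Omega(u_1, \delta_{u_1}) = 0$. Now $\delta_{u_1}$ is recovered from some delivery phase node and cache node, otherwise it would not be a subset of $W_{\mathrm{new}}(u_1)$. As our algorithm considers all pairs of delivery phase nodes and cache nodes, at the end of the algorithm it has to be the case that $\Omega(u_1, \delta_{u_1}) = 1$.

Next, we note that for each pair $(u_1, \delta_{u_1})$ where $u_1 \in \mathcal{T}$ and $\delta_{u_1}$ is a singleton subset of $W_{\mathrm{new}}(u_1)$, we can identify a unique pair of nodes $(v_i, v')$, where $v_i \in \mathcal{D}$ and $v' \in \mathcal{C}$, such that $\psi(v_i, v')$ and $\Omega(u_1, \delta_{u_1})$ are set to 1 at the same step of the algorithm. The remaining pairs $(v_i, v')$ that cannot be put in one-to-one correspondence with a pair $(u_1, \delta_{u_1})$ are such that $\psi(v_i, v')$ is set to 0. Moreover, as $\sum_{u \in \mathcal{T}} \sum_{\delta_u \subseteq W_{\mathrm{new}}(u), |\delta_u| = 1} \Omega(u, \delta_u) = \sum_{u \in \mathcal{T}} |W_{\mathrm{new}}(u)| = L$, it follows that $L = \sum_{i=1}^{\alpha} \sum_{v' \in \mathcal{C}} \psi(v_i, v')$.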

A. Proof of Claim 1

Proof: W.l.o.g. let us assume that condition (i), namely $\bar{\beta}_l < \min(\beta_l, K)$, holds. This implies that there is a set of leaves in $\mathcal{T}_{u^*(l)}$, denoted $\{v_{i_1}, \dots, v_{i_m}\}$, such that $Z(v_{i_1}) = \cdots = Z(v_{i_m}) = \{Z_j\}$. Let $\Lambda = \{u \in \mathcal{T}_{u^*(l)} : (v_{i_a}, v_{i_b}) \text{ meet at } u, \text{ for all distinct } v_{i_a}, v_{i_b} \in \{v_{i_1}, \dots, v_{i_m}\}\}$. We identify $u_0 \in \Lambda$ such that no element of $\Lambda$ is topologically higher than $u_0$ (note that $u_0$ may not be unique) and let $v^*_{i_a}$ and $v^*_{i_b}$ be the corresponding nodes in $\{v_{i_1}, \dots, v_{i_m}\}$ that meet at $u_0$. W.l.o.g. we assume that $v^*_{i_b} \in \mathcal{T}_{u_0(r)}$ and $v^*_{i_a} \in \mathcal{T}_{u_0(l)}$.

We construct instance $P'$ as follows. Choose a member of $\{Z_1, \dots, Z_K\} \setminus \cup_{v' \in \mathcal{C}_{u^*(l)}} Z(v')$ and denote it by $Z_k$. We set $Z^{P'}(v^*_{i_b}) = \{Z_k\}$. Also, for any $u \in \mathcal{D}_{u_0(r)}$ with $D^{P}(u) = \{X_{d_1,\dots,d_K}\}$, we set $D^{P'}(u) = \{X_{d'_1,\dots,d'_K}\}$ such that $d'_j = d_k$, $d'_k = d_j$ and $d'_i = d_i$ for $i \notin \{j, k\}$.

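We now show that $L' \geq L$. In particular, for $u \in \mathcal{T}_{u_0(l)}$, we have $W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u)$. Also we claim that $W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u)$ for $u \in \mathcal{T}_{u_0(r)}$. To see this, note that for $v \in \mathcal{D}_{u_0(r)}$ and $v' \in \mathcal{C}_{u_0(r)}$ we have $\Delta^{P'}(v', v) = \Delta^{P}(v', v)$ if $Z(v') \notin \{Z_j, Z_k\}$. If $Z^{P'}(v') = Z_k$ and $D^{P'}(v) = X_{d'_1, \dots, d'_K}$ then

$$\begin{aligned}
\Delta^{P'}(v', v) &= \mathrm{Rec}(Z_k, X_{d'_1, \dots, d'_K}) \\
&= W_{d'_k} = W_{d_j} \\
&= \mathrm{Rec}(Z_j, X_{d_1, \dots, d_K}) \\
&= \Delta^{P}(v', v).
\end{aligned}$$

Furthermore, note that there does not exist any $v' \in \mathcal{C}_{u_0(r)}$ such that $Z(v') = Z_j$, since we picked $u_0$ such that no element of $\Lambda$ is topologically higher than $u_0$. It is not hard to see that this in turn implies that $W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u)$ for $u \in \mathcal{T}_{u_0(r)}$.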

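It follows therefore that $W^{P'}(u_0) = W^{P}(u_0)$ (see eq. (3)). Let us now consider the other nodes. As the changes are applied only to $\mathcal{T}_{u_0(r)}$, label$(u)$ changes only for nodes $u$ such that $u_0 \succ u$. Consider the subset of internal nodes $U = \{u_0, u_1, \dots, u_t\}$ such that $u_i \succ u_{i+1}$, i.e., the set of internal nodes including $u_0$ and all nodes downstream of $u_0$, where $u_t$ is the root node. W.l.o.g. we assume that $u_{i-1} \in \mathcal{T}_{u_i(l)}$ for $i \geq 1$. We now show that $\cup_{u \in U} W^{P}_{\mathrm{new}}(u) \subseteq \cup_{u \in U} W^{P'}_{\mathrm{new}}(u)$. Towards this end we have the following observations for $u \in U$:

$$Z^{P'}(u) = Z^{P}(u) \cup \{Z_k\} \quad \text{(from the construction of } P'\text{)},$$

$$\Delta^{P'}(u, u) = \cup_{v \in \mathcal{D}_u} \Delta^{P'}(u, v).$$

Now, for $v \notin \mathcal{D}_{u_0(r)}$ we have $D^{P'}(v) = D^{P}(v)$ so that

$$\begin{aligned}
\Delta^{P'}(u, v) &= \mathrm{Rec}(Z^{P'}(u), D^{P'}(v)) \\
&= \mathrm{Rec}(Z^{P'}(u), D^{P}(v)) \\
&\supseteq \Delta^{P}(u, v) \quad \text{since } Z^{P'}(u) \supseteq Z^{P}(u).
\end{aligned}$$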

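Conversely, for $v \in \mathcal{D}_{u_0(r)}$ we have

$$\mathrm{Rec}(\{Z_j, Z_k\}, D^{P'}(v)) = \mathrm{Rec}(\{Z_j, Z_k\}, D^{P}(v)),$$

and

$$\mathrm{Rec}(Z_i, D^{P'}(v)) = \mathrm{Rec}(Z_i, D^{P}(v))$$

for $Z_i \notin \{Z_j, Z_k\}$. Now note that $\{Z_k, Z_j\} \subset Z^{P'}(u)$ so that

$$\begin{aligned}
\Delta^{P'}(u, v) &= \mathrm{Rec}(Z^{P'}(u), D^{P'}(v)) \\
&= \mathrm{Rec}(Z^{P'}(u), D^{P}(v)) \\
&\supseteq \mathrm{Rec}(Z^{P}(u), D^{P}(v)) \\
&= \Delta^{P}(u, v),
\end{aligned}$$

since $Z^{P'}(u) \supset Z^{P}(u)$. We can therefore conclude that

$$\Delta^{P}(u, u) = \cup_{v \in \mathcal{D}_u} \Delta^{P}(u, v) \subseteq \cup_{v \in \mathcal{D}_u} \Delta^{P'}(u, v) = \Delta^{P'}(u, u).$$

Now we consider a $W^* \in W^{P}_{\mathrm{new}}(u_i)$, so that $W^* \in \Delta^{P}(u_i, u_i)$, which by the above condition means that $W^* \in \Delta^{P'}(u_i, u_i)$. Thus either $W^* \in W^{P'}_{\mathrm{new}}(u_i)$ or $W^* \in W^{P'}(u_i)$. In the latter case there exists a node $u_{i'}$ with $0 \leq i' < i$ such that $W^* \in W^{P'}_{\mathrm{new}}(u_{i'})$, since we have shown that $W^{P'}(u_0) = W^{P}(u_0)$.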

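Thus, we observe that

$$\begin{aligned}
L' &= \sum_{u \in \mathcal{T}',\, u \notin U} |W^{P'}_{\mathrm{new}}(u)| + |\cup_{u \in U} W^{P'}_{\mathrm{new}}(u)| \\
&\geq \sum_{u \in \mathcal{T},\, u \notin U} |W^{P}_{\mathrm{new}}(u)| + |\cup_{u \in U} W^{P}_{\mathrm{new}}(u)| \\
&= L,
\end{aligned}$$

where the inequality in the second line holds since $\sum_{u \in \mathcal{T}',\, u \notin U} |W^{P'}_{\mathrm{new}}(u)| = \sum_{u \in \mathcal{T},\, u \notin U} |W^{P}_{\mathrm{new}}(u)|$ and $|\cup_{u \in U} W^{P'}_{\mathrm{new}}(u)| \geq |\cup_{u \in U} W^{P}_{\mathrm{new}}(u)|$.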

To conclude, we note that the above modification of the original instance can be iteratively repeated until we have $|Z^{P'}(u^*_l)| = \min(\beta_l, K)$. Following this, we can repeat the process for the right subtree if $|Z^{P'}(u^*_r)| < \min(\beta_r, K)$, to ensure that $|Z^{P'}(u^*_r)| = \min(\beta_r, K)$.

It remains to show that $|Z(u^*)| = \min(\beta, K)$. Towards this end, we consider a variation of the above argument. Let $v^*$ be the root of $\mathcal{T}$ so that $(u^*, v^*)$ is the last edge in $\mathcal{T}$. By applying the above arguments we have $|Z(u^*_l)| = \min(\beta_l, K)$ and $|Z(u^*_r)| = \min(\beta_r, K)$. Note that if either $\beta_l$ or $\beta_r$ is larger than or equal to $K$, then $|Z(u^*)| = |Z(u^*_l) \cup Z(u^*_r)| = K$, or equivalently $|Z(u^*)| = \min(\beta, K)$. So we only consider the case where both $\beta_l$ and $\beta_r$ are smaller than $K$ and $|Z(u^*)| < \min(\beta, K)$. In this case it is easy to see that there exists a unique $v^*_{i_1} \in \mathcal{C}_{u^*(r)}$ and a unique $v^*_{i_2} \in \mathcal{C}_{u^*(l)}$ such that $Z(v^*_{i_1}) = Z(v^*_{i_2}) = Z_j$ (for some $Z_j$).

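We pick $Z_k \in \{Z_1, \dots, Z_K\} \setminus Z(u^*)$ and construct $P'(\mathcal{T}', \alpha, \beta, L', N, K)$ from $P$ by applying the following changes. We set $Z^{P'}(v^*_{i_1}) = Z_k$, and for any $v \in \mathcal{D}_{u^*(r)}$ such that $D^{P}(v) = X_{d_1, \dots, d_K}$ we set $D^{P'}(v) = X_{d'_1, \dots, d'_K}$ where $d'_j = d_k$, $d'_k = d_j$ and $d'_i = d_i$ for all $i \notin \{j, k\}$.

Since the changes only affect $\mathcal{T}_{u^*(r)}$, we have $W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u)$ for all $u \in \mathcal{T}_{u^*(l)}$. Also, by arguments similar to the ones made earlier, we can show that $W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u)$ for $u \in \mathcal{T}_{u^*(r)}$; therefore $W^{P'}(u^*) = W^{P}(u^*)$. Furthermore, since $Z^{P'}(u^*) = Z^{P}(u^*) \cup \{Z_k\}$ we have $W^{P}_{\mathrm{new}}(u^*) \subseteq W^{P'}_{\mathrm{new}}(u^*)$. Thus, we conclude that $L' \geq L$ while $|Z^{P'}(u^*)| = |Z^{P}(u^*)| + 1$.

If still $|Z^{P'}(u^*)| < \min(\beta, K)$, then we keep applying the above changes until we construct $P'$ such that $|Z^{P'}(u^*)| = \min(\beta, K)$.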

Proof of Lemma 1

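Proof: For a node $v_i$, where $1 \leq i \leq \alpha$, we have

$$\sum_{v' \in \mathcal{C}} \psi(v_i, v') \leq |\cup_{v' \in \mathcal{C}} Z(v')| = \min(\beta, K). \tag{9}$$

Summing eq. (9) over $i = 1, \dots, \alpha$ and using eq. (8), we can conclude that $L \leq \alpha \min(\beta, K)$.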

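Remark 1: If $N \geq \alpha \min(\beta, K)$, then it is easy to construct an instance such that $L = \alpha \min(\beta, K)$. Specifically, pick any directed tree on $\alpha + \beta$ leaves. Suppose that nodes $v \in \mathcal{D}$, $v' \in \mathcal{C}$ meet at node $u$. We label the leaves such that $|\cup_{(v, v') \in \mathcal{D} \times \mathcal{C}} \Delta_{v, v'}(u)| = \alpha \min(\beta, K)$.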

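Given the conditions of the theorem, it is evident that there exists an index $i^* \in \{1, \dots, \alpha\}$ such that $\sum_{v' \in \mathcal{C}} \psi(v_{i^*}, v') < \min(\beta, K)$. We set $i^*$ to be the smallest such index. Let $\Pi_1(v_{i^*}) = \{v' \in \mathcal{C} : \psi(v_{i^*}, v') = 1\}$ and $\Pi_0(v_{i^*}) = \{v' \in \mathcal{C} : \psi(v_{i^*}, v') = 0,\ Z(v') \nsubseteq \cup_{v \in \Pi_1(v_{i^*})} Z(v)\}$. Note that $\Pi_0(v_{i^*})$ is non-empty since $|\cup_{v' \in \mathcal{C}} Z(v')| = \min(\beta, K)$ and $\sum_{v' \in \mathcal{C}} \psi(v_{i^*}, v') < \min(\beta, K)$.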

Next, we determine the set of nodes where $v_{i^*}$ and the nodes in $\Pi_0(v_{i^*})$ meet, i.e., we define $\Lambda_0(v_{i^*}) = \{u \in \mathcal{T} : \exists\, v' \in \Pi_0(v_{i^*}) \text{ such that } v_{i^*} \text{ and } v' \text{ meet at } u\}$. Note that there is a topological ordering on the nodes in $\Lambda_0(v_{i^*})$. Pick the node $u^* \in \Lambda_0(v_{i^*})$ such that no element of $\Lambda_0(v_{i^*})$ is topologically higher than $u^*$. Let the corresponding node in $\Pi_0(v_{i^*})$ be denoted by $v_{j^*}$, where $j^* \in \{\alpha + 1, \dots, \alpha + \beta\}$.

Suppose that $Z(v_{j^*}) = Z_k$ and that $D(v_{i^*}) = X_{d_1, \dots, d_K}$. We modify the instance $P$ as follows. Set $d_k = N + 1$ (i.e., the index of the $(N+1)$-th file). Thus, the only change is in $D(v_{i^*})$. Let us denote the new instance by $P' = P(\mathcal{T}', \alpha, \beta, L', N + 1, K)$.

We now analyze the value of $L'$. W.l.o.g. we assume that $v_{i^*} \in \mathcal{T}'_{u^*(l)}$ and $v_{j^*} \in \mathcal{T}'_{u^*(r)}$. Note that $W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u)$ for $u \in \mathcal{T}'_{u^*(r)}$, as the subtree $\mathcal{T}'_{u^*(r)}$ is identical to $\mathcal{T}_{u^*(r)}$. We also have

$$W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u) \quad \text{for } u \in \mathcal{T}'_{u^*(l)}.$$

To see this, suppose that this is not true. This implies that the file $W_{N+1}$ is recovered at some node in $\mathcal{T}'_{u^*(l)}$, i.e., there exists $v' \in \mathcal{C}$ such that $v' \in \mathcal{T}'_{u^*(l)}$ and $Z(v') = Z_k$. However, this is a contradiction, since it implies the existence of a node that is topologically higher than $u^*$ where $v_{i^*}$ and $v'$ meet. It follows that $W^{P'}(u^*) = W^{P}(u^*)$.

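Next, we claim that $W^{P'}_{\mathrm{new}}(u^*) = W^{P}_{\mathrm{new}}(u^*) \cup \{W_{N+1}\}$. To see this, consider the following series of arguments. Let the singleton subset $\Delta^{P}(v_{i^*}, v_{j^*}) = \{W^*\}$. Note that $\psi^{P}(v_{i^*}, v_{j^*}) = 0$. This implies that there exist $v \in \mathcal{D}_{u^*}$ and $v' \in \mathcal{C}_{u^*}$ such that $v$ and $v'$ meet at $u^*$ and recover the file $W^*$, where $(v, v') \neq (v_{i^*}, v_{j^*})$. Thus, as $Z^{P'}(u^*) = Z^{P}(u^*)$, we can conclude that

$$\begin{aligned}
\Delta^{P'}(u^*, u^*) &= \mathrm{Rec}(Z^{P}(u^*), D^{P'}(u^*)) \\
&= \Delta^{P}(u^*, u^*) \cup \{W_{N+1}\}.
\end{aligned}$$

Furthermore, we have

$$\begin{aligned}
W^{P'}_{\mathrm{new}}(u^*) &= \Delta^{P'}(u^*, u^*) \setminus W^{P'}(u^*) \\
&= \left(\Delta^{P}(u^*, u^*) \cup \{W_{N+1}\}\right) \setminus W^{P}(u^*) \\
&= W^{P}_{\mathrm{new}}(u^*) \cup \{W_{N+1}\} \quad (\text{since } W_{N+1} \notin W^{P}(u^*)).
\end{aligned}$$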

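For $u$ such that $u^* \succ u$ we inductively argue that $W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u)$. To see this, suppose that $u^* = u_r$. It is evident that $\Delta^{P'}_{rl}(u) = \Delta^{P}_{rl}(u)$. Next, $\Delta^{P'}_{lr}(u) = \Delta^{P}_{lr}(u)$ since $Z_k \notin Z(u_l) \setminus Z(u_r)$. Thus, as $W_{N+1} \notin \Delta^{P}_{rl}(u) \cup \Delta^{P}_{lr}(u)$,

$$\begin{aligned}
W^{P'}_{\mathrm{new}}(u) &= \left(\Delta^{P'}_{rl}(u) \cup \Delta^{P'}_{lr}(u)\right) \setminus W^{P'}(u) \\
&= \left(\Delta^{P}_{rl}(u) \cup \Delta^{P}_{lr}(u)\right) \setminus W^{P'}(u) \\
&= \left(\Delta^{P}_{rl}(u) \cup \Delta^{P}_{lr}(u)\right) \setminus \left(W^{P}(u) \cup \{W_{N+1}\}\right) \\
&= \left(\Delta^{P}_{rl}(u) \cup \Delta^{P}_{lr}(u)\right) \setminus W^{P}(u) \\
&= W^{P}_{\mathrm{new}}(u).
\end{aligned}$$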


Next, we note that $W(u) = W(u_r) \cup W_{\mathrm{new}}(u_r) \cup W(u_l) \cup W_{\mathrm{new}}(u_l)$. It is evident that $W^{P'}(u_l) = W^{P}(u_l)$ and $W^{P'}_{\mathrm{new}}(u_l) = W^{P}_{\mathrm{new}}(u_l)$. Next, $W^{P'}(u_r) = W^{P'}(u^*) = W^{P}(u^*)$ (from above) and $W^{P'}_{\mathrm{new}}(u^*) = W^{P}_{\mathrm{new}}(u^*) \cup \{W_{N+1}\}$, so that $W^{P'}(u) = W^{P}(u) \cup \{W_{N+1}\}$.

As the induction hypothesis, we assume that for any node $u$ downstream of $u^*$, we have $W^{P'}_{\mathrm{new}}(u) = W^{P}_{\mathrm{new}}(u)$ and $W^{P'}(u) = W^{P}(u) \cup \{W_{N+1}\}$. Consider a node $u'$ such that $u'_r = u$. As before, we have $W^{P'}(u'_l) = W^{P}(u'_l)$ and $W^{P'}_{\mathrm{new}}(u'_l) = W^{P}_{\mathrm{new}}(u'_l)$. Moreover, we have $W^{P'}(u'_r) = W^{P}(u'_r) \cup \{W_{N+1}\}$ and $W^{P'}_{\mathrm{new}}(u'_r) = W^{P}_{\mathrm{new}}(u'_r)$ by the induction hypothesis, so that $W^{P'}(u') = W^{P}(u') \cup \{W_{N+1}\}$. Next, we argue similarly as above that $\Delta^{P'}_{rl}(u') = \Delta^{P}_{rl}(u')$ and $\Delta^{P'}_{lr}(u') = \Delta^{P}_{lr}(u')$, and the sequence of equations above can be used to conclude that $W^{P'}_{\mathrm{new}}(u') = W^{P}_{\mathrm{new}}(u')$.

We conclude that $L' = L + 1$.

B. Proof of Claim 2

Proof: We pick one of the nodes where $\Gamma_r \nsubseteq \Gamma_l$ and apply the following arguments. We denote this node by $u^*$. Note that since $|\Gamma_l| \geq |\Gamma_r|$, there exists an injective mapping $\phi : \Gamma_r \setminus \Gamma_l \to \Gamma_l \setminus \Gamma_r$. Let $Z(u^*_r) = \{Z_{i_1}, \dots, Z_{i_m}\}$.

We construct the instance $P'$ as follows. For a $v \in \mathcal{D}_{u^*_r}$, suppose $D(v) = X_{d_1, \dots, d_K}$. For $j = 1, \dots, m$, if $d_{i_j} \in \Gamma_r \setminus \Gamma_l$, we replace it by $\phi(d_{i_j})$; otherwise, we leave it unchanged. In other words, we modify the delivery phase signals so that the files that are recovered in $\mathcal{T}_{u^*(r)}$ are a subset of those recovered in $\mathcal{T}_{u^*(l)}$.
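The relabeling step can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `relabel_demand`, the 0-based positions and the particular choice of $\phi$ (pairing elements in sorted order) are hypothetical, since the argument only requires that some injective map exists.

```python
def relabel_demand(demand, cache_idx, gamma_l, gamma_r):
    """Relabel one delivery signal X_{d_1,...,d_K} from the right subtree.

    demand:    list of file indices (d_1, ..., d_K), stored 0-based.
    cache_idx: positions i_1, ..., i_m of the caches present in Z(u*_r).
    gamma_l, gamma_r: sets of file indices recovered in the left and
                      right subtrees (assumed |gamma_l| >= |gamma_r|).
    """
    only_r = sorted(gamma_r - gamma_l)
    only_l = sorted(gamma_l - gamma_r)
    # |gamma_l| >= |gamma_r| implies |gamma_l \ gamma_r| >= |gamma_r \ gamma_l|.
    assert len(only_r) <= len(only_l)
    # One possible injective map phi: pair the elements in sorted order.
    phi = dict(zip(only_r, only_l))
    new_demand = list(demand)
    for i in cache_idx:
        # Replace d_i by phi(d_i) if it lies in gamma_r \ gamma_l.
        new_demand[i] = phi.get(demand[i], demand[i])
    return new_demand, phi


# Made-up example: file 4 is recovered only on the right, so it is remapped.
new_demand, phi = relabel_demand(
    demand=[4, 2, 7], cache_idx=[0, 1], gamma_l={1, 2, 3}, gamma_r={2, 4}
)
print(new_demand, phi)  # [1, 2, 7] {4: 1}
```

After the relabeling, every file requested through the caches in `cache_idx` belongs to `gamma_l`, which is the subset property used in the rest of the proof.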

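As our change amounts to a simple relabeling of the sources, for $u \in \mathcal{T}_{u^*(r)}$ we have $|W^{P'}_{\mathrm{new}}(u)| = |W^{P}_{\mathrm{new}}(u)|$. Furthermore, the relabeling of the sources only affects $u \in \mathcal{T}'$ such that $u^* \succ u$. Note that $W^{P'}(u^*) \subset W^{P}(u^*)$ (the inclusion is strict since there is at least one source in $\Gamma_r \setminus \Gamma_l$ that is mapped to $\Gamma_l \setminus \Gamma_r$), since we have $\Gamma^{P'}_r \subseteq \Gamma^{P'}_l$ and $\Gamma^{P'}_l = \Gamma^{P}_l$.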

Now, we note that

$$\Delta^{P'}_{rl}(u^*) = \Delta^{P}_{rl}(u^*), \quad \text{and} \quad \Delta^{P'}_{lr}(u^*) = \Delta^{P}_{lr}(u^*),$$

where the first equality holds since $Z^{P}(u^*_r) = Z^{P'}(u^*_r)$, $Z^{P}(u^*_l) = Z^{P'}(u^*_l)$ and $D^{P}(u^*_l) = D^{P'}(u^*_l)$. The second equality holds since our modification to the delivery phase signals in $\mathcal{T}_{u^*(r)}$ does not affect files that are recovered from $Z^{P}(u^*_l) \setminus Z^{P}(u^*_r)$. It follows therefore that $|W^{P'}_{\mathrm{new}}(u^*)| \geq |W^{P}_{\mathrm{new}}(u^*)|$.

We make an inductive argument for nodes $u$ that are downstream of $u^*$; w.l.o.g. we assume that $u^* \in \mathcal{T}_{u(r)}$. Specifically, our inductive hypothesis is that for a node $u$ that is downstream of $u^*$, we have $W^{P'}(u) \subseteq W^{P}(u)$, $\Delta^{P'}_{rl}(u) = \Delta^{P}_{rl}(u)$ and $\Delta^{P'}_{lr}(u) = \Delta^{P}_{lr}(u)$.

Now consider a node $u'$ downstream of $u$ such that $u'_r = u$. We have $W(u') = W(u'_l) \cup W_{\mathrm{new}}(u'_l) \cup W(u) \cup W_{\mathrm{new}}(u)$. Note that we can express $W(u) \cup W_{\mathrm{new}}(u) = W(u) \cup \Delta_{rl}(u) \cup \Delta_{lr}(u)$. It is evident that $W^{P'}(u'_l) = W^{P}(u'_l)$ and $W^{P'}_{\mathrm{new}}(u'_l) = W^{P}_{\mathrm{new}}(u'_l)$. Moreover, by the induction hypothesis, $W^{P'}(u) \subseteq W^{P}(u)$ and $\Delta^{P'}_{rl}(u) \cup \Delta^{P'}_{lr}(u) = \Delta^{P}_{rl}(u) \cup \Delta^{P}_{lr}(u)$. Thus, the induction step is proved. We conclude therefore that $L' \geq L$ and that $N' < N$.

Algorithm 3 Upper Bound on $N_{sat}(\alpha, \beta, K)$

Input: $\alpha$, $\beta$ and $K$.
Initialization
  Let $(u^*, v^*)$ be the last edge and set $U_{new} = \{u^*\}$.
  Set $Z(u^*)$ to be any subset of $\{Z_1, \dots, Z_K\}$ of size $\min(\beta, K)$, and set $\beta(u^*) = \beta$, $\alpha(u^*) = \alpha$.
  $\mathcal{C} = \emptyset$ and $\mathcal{D} = \emptyset$.
end Initialization
procedure CACHE NODES LABELING
  while $U_{new}$ is nonempty do
    Pick $u$ from $U_{new}$, create nodes $u_l$ and $u_r$ and edges $(u_l, u)$ and $(u_r, u)$, and add them to $\mathcal{T}$.
    Set $\alpha_l(u) = \lceil \alpha(u)/2 \rceil$, $\beta_l(u) = \lfloor \beta(u)/2 \rfloor$ and $\alpha_r(u) = \alpha(u) - \alpha_l(u)$, $\beta_r(u) = \beta(u) - \beta_l(u)$.
    Set $Z(u_l)$ and $Z(u_r)$ to be subsets of $Z(u)$ of sizes $\min(\beta_l(u), K)$ and $\min(\beta_r(u), K)$ respectively, with minimum intersection.
    Remove $u$ from $U_{new}$.
    if $\alpha_l(u) + \beta_l(u) \geq 2$ then
      Add $u_l$ to $U_{new}$.
    else
      If $\beta_l(u) == 1$ add $u_l$ to $\mathcal{C}$, otherwise to $\mathcal{D}$.
    end if
    if $\alpha_r(u) + \beta_r(u) \geq 2$ then
      Add $u_r$ to $U_{new}$.
    else
      If $\beta_r(u) == 1$ add $u_r$ to $\mathcal{C}$, otherwise to $\mathcal{D}$.
    end if
  end while
end procedure
Set $U_{unlab} = \{u \in \mathcal{T} : \mathrm{in}(u) \subset \mathcal{C} \cup \mathcal{D}\}$ and $U_{lab} = \emptyset$.
Set $n_v = 0$ for all $v \in \mathcal{C} \cup \mathcal{D}$.
procedure DELIVERY NODES LABELING
  while $U_{unlab}$ is not empty do
    Pick $u \in U_{unlab}$ such that $\mathrm{in}(u) \subseteq U_{lab}$ and set $I_l = \{i : Z_i \in Z(u_l) \setminus Z(u_r)\}$ and $I_r = \{i : Z_i \in Z(u_r) \setminus Z(u_l)\}$.
    Set $\rho_{rl} = |I_r| \times |\mathcal{D}_{u(l)}|$ and $\rho_{lr} = |I_l| \times |\mathcal{D}_{u(r)}|$.
    Set $j = 1$, $m_u = \max(n_{u_l}, n_{u_r})$ and $n_u = \rho_{rl} + \rho_{lr} + m_u$.
    for all $v \in \mathcal{D}_{u(l)}$ do
      Let $D(v) = X_{d_1, \dots, d_K}$.
      for $i \in I_r$ do
        Set $d_i = j + m_u$.
        $j \leftarrow j + 1$.
      end for
    end for
    for all $v \in \mathcal{D}_{u(r)}$ do
      Let $D(v) = X_{d_1, \dots, d_K}$.
      for $i \in I_l$ do
        Set $d_i = j + m_u$.
        $j \leftarrow j + 1$.
      end for
    end for
    Remove $u$ from $U_{unlab}$ and add it to $U_{lab}$.
    If $\mathrm{out}(u) \notin U_{lab}$ add it to $U_{unlab}$.
  end while
end procedure
Output: $N_{sat}(\alpha, \beta, K) = n_{v^*}$.


C. Proof of Claim 3

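Proof: Suppose that this is not the case. This means that there exist at least two delivery nodes $v_{i_1}, v_{i_2} \in \mathcal{D}$ such that $D(v_{i_1}) = D(v_{i_2})$. Let $u^*$ be the node where $v_{i_1}$ and $v_{i_2}$ meet and, w.l.o.g., let $i_1 < i_2$ and $v_{i_2} \in \mathcal{T}_{u^*(r)}$. In eq. (9) we show that $\sum_{v' \in \mathcal{C}} \psi(v, v') \leq \min(\beta, K)$ for any $v \in \mathcal{D}$. As $L = \alpha\beta$, we conclude that $\sum_{v' \in \mathcal{C}} \psi(v, v') = \beta$ for any $v \in \mathcal{D}$. Next, as $i_1 < i_2$ and $v_{i_1}, v_{i_2}$ have the same label (and any two leaves meet at some node), we have that $\psi(v_{i_2}, v') = 0$ for $v' \notin \mathcal{C}_{u^*(r)}$. From this, we can conclude that $\mathcal{C}_{u^*(r)} = \mathcal{C}$. However, this implies that $\mathrm{Rec}(Z(u^*), D(v_{i_1})) \setminus W(u^*) = \emptyset$, since all files in $\mathrm{Rec}(Z(u^*), D(v_{i_1}))$ have already been recovered in the subtree $\mathcal{T}_{u^*(r)}$ and $|\cup_{v' \in \mathcal{C}} Z(v')| = \beta$. This in turn implies that $\psi(v_{i_1}, v') = 0$ for all $v' \in \mathcal{C}$, which contradicts the fact that $L = \alpha\beta$.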

D. Upper bound on $N_{sat}(\alpha, \beta, K)$

Note that, by the definition of the saturation number, any problem instance $P(\mathcal{T}, \alpha, \beta, L, N, K)$ where $L = \alpha \min(\beta, K)$ can be used to find an upper bound for $N_{sat}(\alpha, \beta, K)$. We claim that the problem instance constructed in Algorithm 3 is such that $L = \alpha \min(\beta, K)$; hence the output of the algorithm is a valid upper bound on $N_{sat}(\alpha, \beta, K)$. To see this, we have the following argument.

For fixed $u \in \mathcal{T}$, by the algorithm, for $v \in \mathcal{D}_{u(l)}$ with $D(v) = X_{d_1, \dots, d_K}$ and $i \in I_r$ such that $Z_i = Z(v')$ for some $v' \in \mathcal{C}_{u(r)}$, we have $\Delta_{(v, v')}(u) = \{W_{d_i}\}$; therefore $W_{d_i} \in \Delta(u, u)$. It is easy to verify that $W_{d_i} \in W_{\mathrm{new}}(u)$, where $d_i = j + m_u$, $j \in \{1, \dots, \rho_{lr} + \rho_{rl}\}$. Note that $d_i = j + m_u > m_u \geq n_v$ for any $v \succ u$. Now assume that $W_{d_i} \notin W_{\mathrm{new}}(u)$. Since $W_{d_i} \in \Delta(u, u)$, this means that there exists a node $v \succ u$ where $W_{d_i} \in W_{\mathrm{new}}(v)$, so that $d_i = j' + m_v$. But $d_i = j' + m_v \leq n_v$, which contradicts the fact that $d_i > n_v$; therefore $W_{d_i} \in W_{\mathrm{new}}(u)$.

By our algorithm, for different choices of $i \in I_r$ and $v \in \mathcal{D}_{u(l)}$ there exists a distinct $j \in \{1, \dots, \rho_{rl}\}$ such that $d_i = j + m_u$; therefore $d_i$ takes all values in $\{m_u + 1, \dots, m_u + \rho_{rl}\}$. By the same argument for $\mathcal{D}_{u(r)}$, $I_l$ and $j \in \{\rho_{rl} + 1, \dots, \rho_{rl} + \rho_{lr}\}$, the number of distinct values for $d_i$ is $\rho_{rl} + \rho_{lr}$. Moreover, since $W_{d_i} \in W_{\mathrm{new}}(u)$, it is easy to see that $|W_{\mathrm{new}}(u)| \geq \rho_{rl} + \rho_{lr}$. Note that in the cache labeling phase of the algorithm we choose $Z(u_l)$ and $Z(u_r)$ to have the minimum possible intersection. Therefore, $|I_r| = [\min(\beta_r(u), K - \beta_l(u))]^+$. A similar argument holds for $|I_l|$ and we can conclude that

$$\begin{aligned}
|W_{\mathrm{new}}(u)| &\geq \rho_{rl} + \rho_{lr} \\
&= \alpha_l(u)\,[\min(\beta_r(u), K - \beta_l(u))]^+ + \alpha_r(u)\,[\min(\beta_l(u), K - \beta_r(u))]^+.
\end{aligned} \tag{10}$$

Along with the matching upper bound in Claim 4, we can assert that equality holds in eq. (10).

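Now we claim that $\sum_{v \succeq u} |W_{\mathrm{new}}(v)| = \alpha(u) \min(\beta(u), K)$ by induction. It is easy to verify that this is true when $u \in \mathcal{C} \cup \mathcal{D}$, since for leaves either $Z(u)$ or $D(u)$ is empty; therefore $W_{\mathrm{new}}(u) = \emptyset$ and either $\alpha(u)$ or $\beta(u)$ is zero. Assume that the equation holds for $u_l$ and $u_r$; we show that it also holds for $u$. To see this, note that

$$\begin{aligned}
\sum_{v \succeq u} |W_{\mathrm{new}}(v)| &= \sum_{v \succeq u_l} |W_{\mathrm{new}}(v)| + \sum_{v \succeq u_r} |W_{\mathrm{new}}(v)| + |W_{\mathrm{new}}(u)| \\
&= \alpha_l(u) \min(\beta_l(u), K) + \alpha_r(u) \min(\beta_r(u), K) + |W_{\mathrm{new}}(u)| \\
&= \alpha_l(u) \left\{ \min(\beta_l(u), K) + [\min(\beta_r(u), K - \beta_l(u))]^+ \right\} \\
&\quad + \alpha_r(u) \left\{ \min(\beta_r(u), K) + [\min(\beta_l(u), K - \beta_r(u))]^+ \right\},
\end{aligned}$$

where we used the size of $W_{\mathrm{new}}(u)$ in eq. (10) and the fact that the induction hypothesis holds for $u_l$ and $u_r$. It is easy to verify that $\min(\beta_l(u), K) + [\min(\beta_r(u), K - \beta_l(u))]^+ = \min(\beta(u), K)$, where $\beta_l(u) + \beta_r(u) = \beta(u)$. Therefore,

$$\sum_{v \succeq u} |W_{\mathrm{new}}(v)| = \alpha(u) \min(\beta(u), K).$$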
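The identity used in the last step can be checked numerically. The snippet below is a small brute-force check over an arbitrarily chosen range of parameters; the helper names `pos` and `identity_holds` are hypothetical and only serve this illustration.

```python
def pos(x):
    """The [x]^+ operator: max(x, 0)."""
    return max(x, 0)


def identity_holds(beta_l, beta_r, K):
    """Check min(b_l, K) + [min(b_r, K - b_l)]^+ == min(b_l + b_r, K)."""
    lhs = min(beta_l, K) + pos(min(beta_r, K - beta_l))
    return lhs == min(beta_l + beta_r, K)


# Exhaustive check on a small grid (the ranges are arbitrary choices).
all_ok = all(
    identity_holds(bl, br, K)
    for K in range(1, 13)
    for bl in range(0, 25)
    for br in range(0, 25)
)
print(all_ok)  # True
```

The two cases behind the check are $\beta_l \geq K$, where the bracket vanishes and both sides equal $K$, and $\beta_l < K$, where the left side equals $\beta_l + \min(\beta_r, K - \beta_l)$.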

So the equation $\sum_{v \succeq u} |W_{\mathrm{new}}(v)| = \alpha(u) \min(\beta(u), K)$ holds for any $u \in \mathcal{T}$ in Algorithm 3. Applying this to the node $u^*$, where $\alpha(u^*) = \alpha$ and $\beta(u^*) = \beta$, completes our claim.
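The counting performed by Algorithm 3 can be sketched without building the tree explicitly. The code below is a hedged illustration rather than the authors' implementation: the function name `nsat_upper_bound` is hypothetical, it uses $|I_r| = [\min(\beta_r(u), K - \beta_l(u))]^+$ and $|\mathcal{D}_{u(l)}| = \alpha_l(u)$ as stated in the argument above, and it assumes that the algorithm's output coincides with the counter $n_{u^*}$ at the top internal node.

```python
from functools import lru_cache


def pos(x):
    """The [x]^+ operator: max(x, 0)."""
    return max(x, 0)


def nsat_upper_bound(alpha, beta, K):
    """Return (n, L) for the instance built by the balanced splitting rule.

    n is the number of files used by the labeling (the upper bound on the
    saturation number) and L is the sum of |W_new(u)| over all nodes.
    """

    @lru_cache(maxsize=None)
    def visit(a, b):
        if a + b < 2:                 # a leaf: nothing new is recovered
            return 0, 0
        a_l, b_l = (a + 1) // 2, b // 2      # ceil(a/2), floor(b/2)
        a_r, b_r = a - a_l, b - b_l
        # rho_rl = |I_r| * |D_u(l)| and rho_lr = |I_l| * |D_u(r)|
        rho_rl = pos(min(b_r, K - b_l)) * a_l
        rho_lr = pos(min(b_l, K - b_r)) * a_r
        n_l, L_l = visit(a_l, b_l)
        n_r, L_r = visit(a_r, b_r)
        n_u = rho_rl + rho_lr + max(n_l, n_r)
        return n_u, rho_rl + rho_lr + L_l + L_r

    return visit(alpha, beta)


print(nsat_upper_bound(2, 2, 2))  # (3, 4)
```

For $\alpha = \beta = K = 2$ the sketch returns $n = 3$ and $L = 4 = \alpha \min(\beta, K)$, in line with the claim proved above. Because sibling subtrees reuse file indices (through the maximum in `n_u`), $n$ can be smaller than $L$.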