
Optimal quantum source coding with quantum side information at the encoder and decoder

Jon Yard∗, Igor Devetak†

Abstract—Consider many instances of an arbitrary quadripartite pure state of four quantum systems ABCD. Alice holds the AC part of each state, Bob holds B, while D represents all other parties correlated with ABC. Alice is required to redistribute the C systems to Bob while asymptotically preserving the overall purity. We prove that this is possible using Q qubits of communication and E ebits of shared entanglement between Alice and Bob, provided that $Q \geq \tfrac{1}{2} I(C;D|B)$ and $Q + E \geq H(C|B)$, proving the optimality of the Luo-Devetak outer bound. The optimal qubit rate provides the first known operational interpretation of quantum conditional mutual information. We also show how our protocol leads to a fully operational proof of strong subadditivity and uncover a general organizing principle, in analogy to thermodynamics, that underlies the optimal rates.

Index Terms—Quantum information, source coding, side information.

I. INTRODUCTION

THE most fundamental problem in communication theory is the two-terminal source coding problem. Here one user, say Alice, attempts to describe a source of information to another user, whom we call Bob. If the information source is modeled by a sequence of independent and identically distributed (i.i.d.) random variables X, one can ask for the ultimate rate at which the source can be described, in units of bits per sample. It is required that Alice's description allow Bob to perfectly recreate the source sequence with high probability, although decreasing the error probability generally requires block coding on longer source sequences. According to Shannon's noiseless channel coding theorem [1], this ultimate rate is given by the Shannon entropy

\[ H(X) = -\sum_x p(x) \log p(x). \]

Intuitively, Shannon entropy can be understood as a measure of the information contained in the random variable X. Because Shannon entropy answers the question regarding the optimal rate for data compression, one says that the corresponding protocol for data compression provides an operational interpretation of Shannon entropy.
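As a small illustration of the entropy formula above, the following Python sketch evaluates H(X) in bits for a finite probability vector. The function name `shannon_entropy` and the two example sources are our own choices for illustration; base-2 logarithms are assumed, matching the "bits per sample" convention.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H(X) in bits of a finite probability vector."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p = 0 contribute nothing (0 log 0 = 0 by convention).
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative sources: a fair coin and a heavily biased coin.
fair_coin = shannon_entropy([0.5, 0.5])
biased_coin = shannon_entropy([0.9, 0.1])
```

The fair coin needs one full bit per sample, while the biased coin can in principle be compressed to under half a bit per sample when long blocks are coded together.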

∗ Institute for Quantum Information, California Institute of Technology, Pasadena, California, USA; CNLS (Quantum Initiative), CCS-3, Los Alamos National Laboratory, Los Alamos, NM, USA.
† Electrical Engineering Department, University of Southern California, USA.

Suppose now that Bob had some a priori information about X, in the form of a correlated random variable Y. In this case, Slepian and Wolf demonstrated [2] that Alice would only need to send to Bob at a rate given by the conditional entropy

\[ H(X|Y) = H(XY) - H(Y) \]

and that, surprisingly, Alice would not need to know Bob's side information to accomplish this task. The so-called Slepian-Wolf protocol for data compression with side information provides an operational interpretation of conditional entropy. Intuitively, one thinks of H(X|Y) as a measure of the information that is to be gained by learning X for one who already knows Y. Note that there is no advantage if Alice has additional side information regarding X, and that shared common randomness between Alice and Bob is also of no help.
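The conditional entropy rate can be evaluated directly from a joint distribution. The sketch below is only an illustration under our own assumptions: the joint distribution is passed as a dictionary keyed by (x, y) pairs, and the example source (a fair bit X observed by Bob through a channel that flips it with probability 0.1) is hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """H(X|Y) = H(XY) - H(Y) for a joint distribution {(x, y): p}."""
    marginal_y = {}
    for (_, y), p in joint.items():
        marginal_y[y] = marginal_y.get(y, 0.0) + p
    return entropy(joint.values()) - entropy(marginal_y.values())

# X is a fair bit; Y equals X flipped with probability 0.1.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
rate = conditional_entropy(joint)
```

For this source H(X) is one bit, but the rate needed once Bob holds Y drops to the binary entropy of the flip probability, a little under half a bit.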

In this paper, we provide a complete solution to a general quantum counterpart of the above scenario. We find that, in contrast to the classical case, additional Alice side information changes the problem, while quantum mechanical entanglement between Alice and Bob, the quantum analog of shared common randomness, is a useful resource. Our problem is fully quantum in a sense introduced by Schumacher [3], where Alice is asked to transfer part of a pure quantum state to Bob, while preserving the purity of the global state. For this, we consider a pure state of four quantum systems $|\psi\rangle^{ABCD}$. Initially, the A and C systems are held by Alice, while B is in the possession of Bob. We refer to D as the reference system and assume that it is inaccessible to both Alice and Bob. We determine the cost for Alice and Bob to "redistribute" the state, so that it is Bob who holds C instead of Alice, thereby transferring the quantum information in C to Bob. Specifically, we analyze the corresponding asymptotic scenario, asking that many copies of the same state be redistributed as above, while requiring that the redistributed states have arbitrarily high fidelity with the originals in the asymptotic limit.

To achieve this task, we allow the use of two fundamental quantum mechanical resources. First, Alice may send qubits (two-level quantum systems) to Bob over a noiseless quantum channel. Second, we allow Alice and Bob to use pre-existing entanglement, shared between themselves in the form of Bell states

\[ |\Phi^+\rangle = \frac{1}{\sqrt{2}} \bigl( |00\rangle + |11\rangle \bigr). \]

We refer to such a state as an ebit (entangled bit). We do not separately consider classical communication, because it can be used with entanglement to simulate qubit channels via teleportation. The asymptotic cost to redistribute C as above is given in terms of the number Q of qubits sent and the number E of ebits consumed, per copy of the state. We allow the entanglement cost E to be negative, in which case the corresponding protocol generates entanglement rather than consuming it. Our main result (Theorem 1) proves the optimality of the Luo-Devetak outer bound [4] for this problem, demonstrating that it is possible to redistribute the state $|\psi\rangle^{ABCD}$ as above if and only if

\[ Q \geq \tfrac{1}{2} I(C;D|B), \qquad Q + E \geq H(C|B). \tag{1} \]

[Figure 1: plot of the (Q, E) plane showing the region bounded by the lines $Q = \tfrac{1}{2} I(C;D|B)$ and $Q + E = H(C|B)$, with the corner point at $E = \tfrac{1}{2} I(A;C) - \tfrac{1}{2} I(B;C)$.]

Fig. 1. The shaded region contains the cost pairs from (1) at which it is possible to redistribute the C part of $|\psi\rangle^{ABCD}$ from Alice to Bob. The figure corresponds to the case I(A;C) > I(B;C); otherwise the corner point, which corresponds to the optimal cost pair in (2), would be in the upper-left quadrant.

arXiv:0706.2907v2 [quant-ph] 26 Sep 2020

This region is depicted in Figure 1. The quantities in these bounds, conditional mutual information and conditional entropy, are defined in Section I-A. Simultaneously minimizing the qubit rate Q and the total sum rate Q + E gives the optimal cost pair

\[ Q^* = \tfrac{1}{2} I(C;D|B), \qquad E^* = \tfrac{1}{2} I(A;C) - \tfrac{1}{2} I(B;C). \tag{2} \]
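As a consistency check, the corner point (2) can be seen to lie on the line Q + E = H(C|B) of (1). The derivation below is a sketch that uses only the definitions of Section I-A together with the purity of the global state on ABCD, which gives H(BD) = H(AC) and H(BCD) = H(A).

```latex
\begin{aligned}
Q^* &= \tfrac{1}{2}\bigl[H(BC) + H(BD) - H(B) - H(BCD)\bigr]
     = \tfrac{1}{2}\bigl[H(BC) + H(AC) - H(B) - H(A)\bigr], \\
E^* &= \tfrac{1}{2}\bigl[H(A) + H(C) - H(AC)\bigr]
     - \tfrac{1}{2}\bigl[H(B) + H(C) - H(BC)\bigr]
     = \tfrac{1}{2}\bigl[H(A) - H(AC) - H(B) + H(BC)\bigr], \\
Q^* + E^* &= H(BC) - H(B) = H(C|B).
\end{aligned}
```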

The optimal qubit cost gives the first known operational interpretation of quantum conditional mutual information. In Section IV, we show that $Q^*$ cannot be negative, which leads to an operational proof of the celebrated strong subadditivity inequality [5]. This proof differs from other such operational proofs [6], [7] in that it follows solely from a direct coding theorem and not from a converse proof. In [8], where our main result was first announced, we showed that $Q^*$ is symmetric under time-reversal, where now Bob redistributes C back to Alice, while $E^*$ is anti-symmetric. The former gives an intuitive understanding of the curious identity

\[ I(C;D|A) = I(C;D|B), \]

which holds on every quadripartite pure state. We comment further on this feature in Section V. We also demonstrated there that the corresponding protocol is perfectly composable. This constitutes an exact solution to a quantum analog of a result of Cover and Equitz [9] on the successive refinement of classical information, although the classical problem is only known to be exactly soluble in the presence of a Markov condition.
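The identity I(C;D|A) = I(C;D|B) can be verified in two lines. This is a sketch using the definition of conditional mutual information from Section I-A and the fact that complementary subsystems of a pure state on ABCD have equal entropies, for example H(AD) = H(BC) and H(ACD) = H(B).

```latex
\begin{aligned}
I(C;D|A) &= H(AC) + H(AD) - H(A) - H(ACD) = H(AC) + H(BC) - H(A) - H(B), \\
I(C;D|B) &= H(BC) + H(BD) - H(B) - H(BCD) = H(BC) + H(AC) - H(B) - H(A).
\end{aligned}
```

Both right-hand sides coincide, so the two conditional mutual informations are equal on every pure state of ABCD.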

By assuming that various subsystems are trivial, the state redistribution problem generalizes numerous tasks that were previously considered in the literature while giving an optimal protocol suited for any and all of them. As we discuss in Section V (in particular see Figure 4) and also during the proof of our main theorem in Section III-B, these tasks include Schumacher compression [3], state merging and splitting [7], [10], [11], [12], and entanglement concentration and dilution [13]. We depart from previous nomenclature with regard to the merging and splitting problems; our convention for this paper is detailed in Section III-B.

The paper is organized as follows. In the next subsection we fix our notational conventions. The following section gives an introduction to the resource calculus. There we also formally state the main result, Theorem 1, which is proved in Section III. In Section IV, we show how our results yield a fully operational proof of strong subadditivity which, unlike previous operational proofs, is logically independent even from the subadditivity of entropy. We conclude with a discussion in Section V where we reflect on the main result and provide a novel thermodynamic interpretation of the optimal rates.

A. Notational conventions

Throughout this paper, we assume familiarity with standard background material in quantum information theory; for a general reference, the reader is referred to [14]. We use capital Roman letters such as A, B, C to denote Hilbert spaces. We write |A| for the dimension of A and use a superscripted label to associate a state to a Hilbert space, by writing $\rho^A$ or $|\varphi\rangle^A$. Computational basis states of A are denoted with lower case Roman letters as in $|i\rangle^A$. Tensor products of Hilbert spaces are written $AB = A \otimes B$. Given a pure state $|\varphi\rangle^{AB}$, we abbreviate $\varphi^{AB} = |\varphi\rangle\langle\varphi|^{AB}$, while writing its partial traces as $\varphi^A = \mathrm{Tr}_B\, \varphi^{AB}$. We write $\pi^A$ for the maximally mixed state on A, and given two isomorphic Hilbert spaces A and A′, we write

\[ |\Phi\rangle^{AA'} = \frac{1}{\sqrt{|A|}} \sum_{i=1}^{|A|} |i\rangle^{A} |i\rangle^{A'} \]

for the unique maximally entangled state associated with the isomorphism $|i\rangle^A \mapsto |i\rangle^{A'}$. A quantum channel is a completely positive, trace-preserving linear map $\mathcal{N}^{A \to B}$ from density matrices on A to those on B. Given an isometry $V^{A \to B}$, we will abbreviate its adjoint action on density matrices as $\mathcal{V}(\rho) \equiv V \rho V^\dagger$. A partial isometry is an isometry when restricted to its support subspace.

For the von Neumann entropy of a density matrix $\varphi^A$ we write

\[ H(A) \equiv -\mathrm{Tr}\, \varphi^A \log_2 \varphi^A. \]

When the underlying state could be ambiguous we write $H(A)_\varphi$. Given a multipartite state $\varphi^{ABC}$, various entropic quantities can be defined in exact analogy to the classical case (see e.g. [15]). Quantum conditional entropy is defined [16] as

\[ H(A|B) = H(AB) - H(B), \]

quantum mutual information [16] is

\[ I(A;B) = H(A) + H(B) - H(AB) \]

and quantum conditional mutual information is given by

\[ I(A;B|C) = H(A|C) + H(B|C) - H(AB|C). \]


Observe that the conditional quantities above cannot generally be interpreted as averages, unless the conditioning system is purely classical. Furthermore, notice that conditional entropy can in fact be negative, as it is for any pure entangled state on AB. On the other hand, I(A;B|C) is never negative, a fact that is known as strong subadditivity [5]. In Section IV, we show how our main result leads to a self-contained proof of strong subadditivity.
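To make the negativity of conditional entropy concrete, the following sketch evaluates H(A|B) for a pure state of two qubits. It is an illustration under our own assumptions: the state is given by its four amplitudes, the function names are hypothetical, and the closed-form eigenvalues of a single-qubit density matrix are used so that only the standard library is needed.

```python
import math

def binary_entropy(p):
    """Entropy in bits of the distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def conditional_entropy_pure_two_qubit(c00, c01, c10, c11):
    """H(A|B) = H(AB) - H(B) for the pure state sum_ij c_ij |i>_A |j>_B."""
    norm = abs(c00) ** 2 + abs(c01) ** 2 + abs(c10) ** 2 + abs(c11) ** 2
    # The determinant of either reduced state equals |det(c)|^2 after normalization.
    det = abs(c00 * c11 - c01 * c10) ** 2 / norm ** 2
    # A qubit density matrix has eigenvalues (1 +/- sqrt(1 - 4 det)) / 2.
    lam = 0.5 * (1 + math.sqrt(max(0.0, 1 - 4 * det)))
    h_b = binary_entropy(lam)
    h_ab = 0.0  # the global state is pure
    return h_ab - h_b

s = 1 / math.sqrt(2)
bell = conditional_entropy_pure_two_qubit(s, 0, 0, s)      # maximally entangled
product = conditional_entropy_pure_two_qubit(1, 0, 0, 0)   # unentangled
```

For the Bell state the conditional entropy is −1, the most negative value possible for a qubit, while for a product state it vanishes.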

II. RESOURCE INEQUALITIES

It will be convenient for us to use the high-level notation of resource inequalities [17], [18] to express our main result, as well as to describe various intermediate protocols introduced during the proof. We use a more elementary formulation than [18] which is nonetheless sufficient for our purposes.

A. Finite resource inequalities

A single ebit shared between Alice and Bob is denoted [qq]. The notation [q → q] represents a noiseless qubit channel from Alice to Bob, while a noiseless classical bit channel is written [c → c]. A finite resource inequality is an expression such as

\[ [q \to q] \geq [c \to c], \qquad [q \to q] \geq [qq] \]

meaning that the resource on the left can simulate the one on the right. The above two examples respectively signify that a qubit channel can be used to send classical bits (by signaling with orthogonal pure states), or otherwise can be used to distribute entanglement (by transmitting halves of locally prepared ebits). Addition of two resources may be regarded as having each of them available. In this way, for instance, the existence of the quantum teleportation and superdense coding protocols are proofs of the respective finite resource inequalities

\[ [qq] + 2[c \to c] \geq [q \to q], \qquad [qq] + [q \to q] \geq 2[c \to c]. \tag{3} \]

B. Approximate resource inequalities

Given two quantum states ρ and σ of the same quantum system, we may judge their closeness using either the trace distance $\|\rho - \sigma\|_1$ or the fidelity $F(\rho, \sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1^2$. Note that when one of the states is pure, $F(|\varphi\rangle, \sigma) = \langle\varphi|\sigma|\varphi\rangle$. A useful characterization of fidelity, Uhlmann's theorem, says that if $|\psi\rangle$ is a purification of ρ, then F(ρ, σ) is the maximum of $|\langle\psi|\phi\rangle|^2$ over all purifications $|\phi\rangle$ of σ. Fidelity and trace distance are related by the inequalities

\[ F(\rho, \sigma) \geq 1 - \|\rho - \sigma\|_1 \tag{4} \]

\[ \|\rho - \sigma\|_1 \leq 2\sqrt{1 - F(\rho, \sigma)}. \tag{5} \]

Therefore, fidelity and trace distance are equivalent distance measures when one is interested in arbitrarily good approximations of states, as we are here. An approximate resource inequality

\[ \sum_i a_i \geq_\varepsilon \sum_j b_j \]

is a finite resource inequality that holds with an error of ε in the following sense. Consider acting on half of a maximally entangled state with each target resource $b_j$ that is a channel, and call the resulting global state Ω. Note that Ω should also contain the $b_j$ that are quantum states. Now, let Ω′ be the simulated version of this state, obtained by using the resources $a_i$. We require that Ω and Ω′ are ε-close in either trace distance or fidelity. The particular measure is not important, as we are ultimately concerned with asymptotics, where ε can be arbitrarily small.
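The inequalities (4) and (5) relating fidelity and trace distance can be checked numerically in the simplest setting. The sketch below assumes commuting (simultaneously diagonal) states, so that each state reduces to a probability vector; the helper names are hypothetical, and the trace distance carries no factor of 1/2, following the convention of the text.

```python
import math

def fidelity_classical(p, q):
    """F = (sum_i sqrt(p_i q_i))^2, the fidelity of two commuting states."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q)) ** 2

def trace_distance_classical(p, q):
    """||rho - sigma||_1 = sum_i |p_i - q_i| for commuting states."""
    return sum(abs(a - b) for a, b in zip(p, q))

def satisfies_bounds(p, q, tol=1e-12):
    """Check inequalities (4) and (5) for the diagonal states p and q."""
    f = fidelity_classical(p, q)
    t = trace_distance_classical(p, q)
    lower = f >= 1 - t - tol                             # inequality (4)
    upper = t <= 2 * math.sqrt(max(0.0, 1 - f)) + tol    # inequality (5)
    return lower and upper

ok_distinct = satisfies_bounds([0.5, 0.5], [0.9, 0.1])
ok_identical = satisfies_bounds([0.3, 0.7], [0.3, 0.7])
```

Both bounds become tight only in the limit of nearly identical states, which is the regime that matters for the asymptotic statements that follow.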

C. Asymptotic resource inequalities

The notion of a finite resource inequality can be generalized to that of an asymptotic resource inequality. This is a formal expression of the form

\[ \sum_i R^{(i)}_{\mathrm{in}}\, a_i \succeq \sum_j R^{(j)}_{\mathrm{out}}\, b_j. \tag{6} \]

Here the $a_i$ and $b_j$ are resources and the rates $R^{(i)}_{\mathrm{in}}$ and $R^{(j)}_{\mathrm{out}}$ are nonnegative real numbers. We shall consider the inequality (6) to be shorthand for the following formal statement: for every ε > 0, every set of rates $R'^{(i)}_{\mathrm{in}} > R^{(i)}_{\mathrm{in}}$, $R'^{(j)}_{\mathrm{out}} < R^{(j)}_{\mathrm{out}}$ and all sufficiently large n, the approximate resource inequality

\[ \sum_i \lfloor n R'^{(i)}_{\mathrm{in}} \rfloor\, a_i \geq_\varepsilon \sum_j \lfloor n R'^{(j)}_{\mathrm{out}} \rfloor\, b_j \]

holds. Below, we use Greek letters to denote linear combinations of finite resources that appear in asymptotic resource inequalities. In some asymptotic resource inequalities, we may only require a sublinear amount o(n) of a particular input resource. In such cases, we write $o\,a + \beta \succeq \gamma$ if we have $R\,a + \beta \succeq \gamma$ for every R > 0.

It will also be convenient for us to extend the definition of asymptotic resource inequalities to have negative rates on the left. Such rates are interpreted as meaning that the corresponding resources are generated rather than consumed. Formally, these resources should be negated and moved to the right. Let us introduce two powerful lemmas that are the raison d'être for the entire formalism of asymptotic resource inequalities and which play important roles in our proofs.

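Lemma 1 (Composition lemma [18]):

\[ \alpha \succeq \beta \ \text{ and } \ \beta \succeq \gamma \ \Rightarrow\ \alpha \succeq \gamma. \]

Lemma 2 (Cancellation lemma [18]): Given rates that satisfy $R_{\mathrm{in}} > R_{\mathrm{out}} \geq 0$,

\[ R_{\mathrm{in}}\, a + \beta \succeq R_{\mathrm{out}}\, a + \gamma \ \Rightarrow\ (R_{\mathrm{in}} - R_{\mathrm{out}})\, a + \beta \succeq \gamma. \]

Otherwise, if $R_{\mathrm{out}} \geq R_{\mathrm{in}} \geq 0$, then

\[ R_{\mathrm{in}}\, a + \beta \succeq R_{\mathrm{out}}\, a + \gamma \ \Rightarrow\ o\,a + \beta \succeq (R_{\mathrm{out}} - R_{\mathrm{in}})\, a + \gamma. \]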

D. Distributed states

In Schumacher data compression, Alice wishes to transmit the C parts of many instances of the state $|\psi\rangle^{CD}$ to Bob while asymptotically preserving the entanglement with D. We introduce the following notation to describe the corresponding coding theorem:

\[ \psi^{C|\emptyset} + H(C)\,[q \to q] \succeq \psi^{\emptyset|C}. \]

The notation $\psi^{C|\emptyset}$ indicates that Alice holds the C parts of many i.i.d. instances of some fixed purification $|\psi\rangle^{CD}$ of the density matrix $\psi^C$, while Bob holds nothing. On the right, the expression $\psi^{\emptyset|C}$ refers to the same purifications as on the left, only it is Bob who is holding the C systems. In other words, Alice attempts to simulate identity channels from the systems C in her lab to identical systems C located in Bob's lab. This channel is only required to work well when the input is equal to $\psi^C$. In [18], the formalism of relative resources was introduced for these purposes, though our alternate notation is sufficient for our needs. State redistribution involves a purification $|\psi\rangle^{ABCD}$ of a tripartite density matrix $\psi^{ABC}$. We denote the distributed states before and after the protocol as $\psi^{AC|B}$ and $\psi^{A|CB}$, since Alice begins by holding AC and ends by only holding A. The rates in an asymptotic resource inequality involving such distributed states will in general be entropic expressions evaluated on the implicit but arbitrary purification into a reference system D. Using this notation, we again state the main result:

Theorem 1:

\[ \psi^{AC|B} + Q\,[q \to q] + E\,[qq] \succeq \psi^{A|CB} \tag{7} \]

if and only if Q and E satisfy (1), i.e. are contained in the region depicted in Figure 1.

The converse part of the proof of Theorem 1, i.e. that Q and E must satisfy (1), is proved in [4]. We thus focus on proving a coding theorem showing that (7) is satisfied whenever Q and E satisfy (1). Because [q → q] ≥ [qq], it suffices for us to demonstrate (7) for the corner point $(Q^*, E^*)$ defined in (2).

III. PROOF OF THEOREM 1

To prove Theorem 1 we will demonstrate the existence of the following auxiliary protocol that transfers $C^n$ to Bob and has the desired net communication and entanglement cost:

Theorem 2:

\[ \psi^{AC|B} + \tfrac{1}{2} I(C;BD)\,[q \to q] + \tfrac{1}{2} I(A;C)\,[qq] \succeq \psi^{A|CB} + \tfrac{1}{2} I(B;C)\,[q \to q] + \tfrac{1}{2} I(B;C)\,[qq]. \]

Together with the cancellation lemma (Lemma 2), Theorem 2 yields a proof of Theorem 1. However, observe that if I(B;C) ≥ I(A;C), the cancellation lemma still requires a sublinear amount of entanglement on the left. Similarly, if we have I(C;BD) = I(B;C) (i.e. if strong subadditivity is saturated), a sublinear amount of communication will also be required. However, because [q → q] ≥ [qq], the additional entanglement cost can be absorbed into the communication rate and is therefore only relevant if the state $\psi^{CBD}$ saturates strong subadditivity. We discuss this point further in Section V.
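The net rates obtained by applying the cancellation lemma to Theorem 2 can be written out explicitly. The following is a sketch that uses only the chain rule for mutual information, which follows from the definitions of Section I-A.

```latex
\begin{aligned}
I(C;BD) &= I(C;B) + I(C;D|B), \\
\tfrac{1}{2} I(C;BD) - \tfrac{1}{2} I(B;C) &= \tfrac{1}{2} I(C;D|B) = Q^*, \\
\tfrac{1}{2} I(A;C) - \tfrac{1}{2} I(B;C) &= E^*.
\end{aligned}
```

Cancelling the qubit channels and the ebits that appear on both sides of Theorem 2 therefore leaves exactly the corner point (2) as the net cost.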

We prove Theorem 2 by means of another protocol that simulates coherent channels [19]. A coherent channel [q → qq] is a type of quantum feedback channel that is an isometry from Alice to Alice and Bob:

\[ |0\rangle^A |0\rangle^B \langle 0|^A + |1\rangle^A |1\rangle^B \langle 1|^A. \]

Using a coherent version of teleportation, where Alice and Bob apply only local unitaries, it is known that [19]

\[ [qq] + 2[q \to qq] \geq 2[qq] + [q \to q]. \]

Repeated concatenation yields the following asymptotic resource inequality [19]:

\[ 2[q \to qq] \succeq [q \to q] + [qq]. \tag{8} \]

In fact, the opposite direction holds as a finite resource inequality, but it will not be useful for us here. In this paper, we devote most of our efforts toward proving the following theorem which, when combined with (8) and the composition lemma (Lemma 1), provides a proof of Theorem 2.

Theorem 3:

\[ \psi^{AC|B} + \tfrac{1}{2} I(C;BD)\,[q \to q] + \tfrac{1}{2} I(A;C)\,[qq] \succeq \psi^{A|CB} + I(B;C)\,[q \to qq]. \]
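The step from Theorem 3 to Theorem 2 can be sketched as follows. We assume here, as is standard in the resource calculus of [18] but not spelled out in this section, that an asymptotic resource inequality may be multiplied by a nonnegative rate and that a common resource may be added to both sides.

```latex
\begin{aligned}
2\,[q \to qq] &\succeq [q \to q] + [qq]
  && \text{(inequality (8))} \\
I(B;C)\,[q \to qq] &\succeq \tfrac{1}{2} I(B;C)\,[q \to q] + \tfrac{1}{2} I(B;C)\,[qq]
  && \text{(scaling by } \tfrac{1}{2} I(B;C)\text{)} \\
\psi^{A|CB} + I(B;C)\,[q \to qq] &\succeq
  \psi^{A|CB} + \tfrac{1}{2} I(B;C)\,[q \to q] + \tfrac{1}{2} I(B;C)\,[qq]
  && \text{(adding } \psi^{A|CB}\text{)}
\end{aligned}
```

Composing the last line with Theorem 3 via Lemma 1 gives the statement of Theorem 2.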

A. Proof of Theorem 3

Our proof of Theorem 3 relies on the following one-shot version. We call this a "robust" one-shot protocol because the error bound is robust to small perturbations in the underlying state (cf. [20]). We delay the proof of this theorem until Section III-B.

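Theorem 4 (Robust one-shot redistribution protocol): Let a pure state $|\psi\rangle^{ABCD}$ and a maximally entangled state $|\Phi\rangle^{\tilde{A}\tilde{B}}$ be given, where $|\tilde{A}| = |\tilde{B}|$ divides |C|. Suppose that $|\varphi\rangle^{ABCD}$ and $|\phi\rangle^{ABCD}$ are states satisfying

\[ \max\Bigl\{ \bigl\| \psi^{ABCD} - \varphi^{ABCD} \bigr\|_1,\ \bigl\| \psi^{ABCD} - \phi^{ABCD} \bigr\|_1 \Bigr\} \leq \varepsilon \]

for some $\varepsilon \leq (6 - 4\sqrt{2})^2 \approx 0.1177$. Then there exist a quantum system S with $|S| = |C|/|\tilde{B}|$, κ encoding isometries $V_k^{\tilde{A}AC \to AS}$ and a decoding isometry $W^{SB\tilde{B} \to BCK}$ under which

\[ \frac{1}{\kappa} \sum_{k=1}^{\kappa} \langle k|^K \langle\psi|^{ABCD}\, W V_k\, |\psi\rangle^{ABCD} |\Phi\rangle^{\tilde{A}\tilde{B}} \geq 1 - \eta \tag{9} \]

where η is equal to

\[ 6\sqrt{\varepsilon} + 4 \left( \frac{|C|\, \|\varphi^{BD}\|_0\, \|\varphi^{BCD}\|_2^2}{|S|^2} \right)^{1/4} + \frac{4\kappa\, \|\phi^{BC}\|_0\, \|\phi^{B}\|_\infty}{|C|}. \tag{10} \]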

Now we show how to apply Theorem 4 to pure states of the form $(|\psi\rangle^{ABCD})^{\otimes n}$ to obtain a proof of Theorem 3. This is accomplished via the following theorem. The direct part is proved in [12], while the converse part follows from standard arguments in classical information theory (see e.g. [15]).

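Theorem 5 (Method of types): Let a tripartite state $|\psi\rangle^{ABC}$ be given. For every ε, δ > 0 and all sufficiently large n, there are projections $\Pi^{A^n}_\delta$, $\Pi^{B^n}_\delta$, and $\Pi^{C^n}_\delta$ such that for $T \in \{A, B, C\}$,

\[ \mathrm{Tr}\, \Pi^{T^n}_\delta (\psi^T)^{\otimes n} \geq 1 - \varepsilon \]

\[ 2^{nH(T) - n\delta} \leq \mathrm{Tr}\, \Pi^{T^n}_\delta \leq 2^{nH(T) + n\delta}. \]

Also, the normalized version $|\varphi\rangle^{A^nB^nC^n}$ of the subnormalized state

\[ \bigl( \Pi^{A^n}_\delta \otimes \Pi^{B^n}_\delta \otimes \Pi^{C^n}_\delta \bigr) \bigl( |\psi\rangle^{ABC} \bigr)^{\otimes n} \]

satisfies

\[ \bigl\| \varphi^{A^nB^nC^n} - (\psi^{ABC})^{\otimes n} \bigr\|_1 \leq \varepsilon \]

and for each $T \in \{A, B, C, AB, BC, AC\}$,

\[ 2^{nH(T) - n\delta} \leq \bigl\| \varphi^{T^n} \bigr\|_0 \leq 2^{nH(T) + n\delta} \]

\[ 2^{-nH(T) - n\delta} \leq \bigl\| \varphi^{T^n} \bigr\|_2^2 \leq 2^{-nH(T) + n\delta} \]

\[ 2^{-nH(T) - n\delta} \leq \bigl\| \varphi^{T^n} \bigr\|_\infty \leq 2^{-nH(T) + n\delta}. \]

The entropies in these bounds are evaluated on $|\psi\rangle^{ABC}$. Additionally, the normalized version $|\Psi\rangle^{A^nB^nC^n}$ of the subnormalized state

\[ \bigl( \mathbb{1}^{A^n} \otimes \mathbb{1}^{B^n} \otimes \Pi^{C^n}_\delta \bigr) \bigl( |\psi\rangle^{ABC} \bigr)^{\otimes n} \]

satisfies

\[ \bigl\| \Psi^{A^nB^nC^n} - (\psi^{ABC})^{\otimes n} \bigr\|_1 \leq \varepsilon. \]

Finally, there is a δ-independent constant c > 0 such that we may take $\varepsilon = 2^{-nc\delta^2}$ in all of the above bounds.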
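The flavor of the bounds in Theorem 5 can be seen in the classical case. The sketch below is an analogy only and is not the quantum construction of [12]: it counts the entropy-typical sequences of n i.i.d. Bernoulli(p) bits and sums their probability. The function name and the parameter values n = 200, p = 0.2, δ = 0.1 are hypothetical choices for illustration.

```python
import math

def typical_set_stats(n, p, delta):
    """Entropy, size and probability of the entropy-typical set of n Bernoulli(p) bits.

    A sequence x is called typical here if |-(1/n) log2 Pr(x) - H| <= delta.
    """
    h = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    size, prob = 0, 0.0
    for k in range(n + 1):  # k = number of ones in the sequence
        log_prob = k * math.log2(p) + (n - k) * math.log2(1 - p)
        if abs(-log_prob / n - h) <= delta:
            count = math.comb(n, k)          # sequences with exactly k ones
            size += count
            prob += count * 2.0 ** log_prob  # their total probability
    return h, size, prob

n, p, delta = 200, 0.2, 0.1
h, size, prob = typical_set_stats(n, p, delta)
exponent = math.log2(size) / n  # growth rate of the typical set, in bits
```

The exponent lies between H − δ and H + δ while the typical set already carries most of the probability, mirroring the dimension and weight bounds that the typical projections satisfy.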

Proof of Theorem 3: We will apply Theorem 5 two separate times to the state $(|\psi\rangle^{ABCD})^{\otimes n}$, obtaining two auxiliary states that control the main quantities in the error bound (10) of Theorem 4. For the first, we consider $|\psi\rangle^{ABCD}$ to be a tripartite state of the systems A, C, BD. We thus obtain, for every δ > 0 and all sufficiently large n, a state $|\varphi\rangle^{A^nB^nC^nD^n}$ that is ε-close to $|\psi\rangle^{\otimes n}$ in trace distance for $\varepsilon = 2^{-nc\delta^2}$, such that the matrix norms in the second term of (10) have the appropriate exponential bounds. With respect to the partition AD, B, C, we similarly obtain another state $|\phi\rangle^{A^nB^nC^nD^n}$ such that the operator norms in the last term of (10) are bounded accordingly. Alice initiates the protocol by Schumacher compressing the system $C^n$. For this, she performs the projective measurement $\{\Pi^{C^n}_\delta,\ \mathbb{1}^{C^n} - \Pi^{C^n}_\delta\}$ on $C^n$. According to Theorem 5, the first outcome occurs with probability at least 1 − ε. In this case, the global state is replaced by the normalized version $|\Psi\rangle^{A^nB^nC^nD^n}$ of the projected state $\Pi^{C^n}_\delta (|\psi\rangle^{ABCD})^{\otimes n}$. In case the other outcome occurs, Alice declares an error and the protocol is aborted. We condition on the first case. In what follows, we identify $|\Psi\rangle^{A^nB^nC^nD^n}$ with its restriction $|\Psi\rangle^{A^nB^nC_\delta D^n}$ to the support $C_\delta$ of the typical projection $\Pi^{C^n}_\delta$.

By the triangle inequality, each of $|\varphi\rangle^{A^nB^nC^nD^n}$ and $|\phi\rangle^{A^nB^nC^nD^n}$ is 2ε-close to $|\Psi\rangle^{A^nB^nC_\delta D^n}$ in trace distance because all three states are ε-close to $(|\psi\rangle^{ABCD})^{\otimes n}$. Therefore, the one-shot theorem (Theorem 4) implies that there exist a quantum system S and a maximally entangled state $|\Phi\rangle^{\tilde{A}\tilde{B}}$ with $|\tilde{A}| \cdot |S| = |C_\delta|$, together with κ encoding isometries $V_k^{\tilde{A}A^nC_\delta \to A^nS}$ and a decoding isometry $W^{SB^n\tilde{B} \to B^nC_\delta K_{\mathrm{out}}}$ satisfying (9) and (10) with ε replaced by 2ε. If Alice applies one of the isometries $V_k$ uniformly at random and sends S to Bob, after which he applies W, the system $C_\delta$ will be transferred with high global fidelity. By measuring $K_{\mathrm{out}}$, Bob can, on the average, identify Alice's encoding. Rather than send Bob classical information, Alice can instead simulate a coherent channel from a system $K_{\mathrm{in}}$ to $K_{\mathrm{in}}K_{\mathrm{out}}$ by applying a controlled isometry

\[ V = \sum_k |k\rangle\langle k|^{K_{\mathrm{in}}} \otimes V_k^{\tilde{A}A^nC_\delta \to A^nS}. \]

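If she tries to send half of a maximally entangled state $|\Phi\rangle^{K'K_{\mathrm{in}}}$, the global pure state $|\Omega\rangle$ on $A^nB^nC_\delta D^nK'K_{\mathrm{in}}K_{\mathrm{out}}$ that results from the protocol is

\[ |\Omega\rangle = W V\, |\Psi\rangle^{A^nB^nC_\delta D^n} |\Phi\rangle^{K'K_{\mathrm{in}}} |\Phi\rangle^{\tilde{A}\tilde{B}}. \]

It is then immediate from (9) that

\[ \langle\Phi|^{K'K_{\mathrm{in}}K_{\mathrm{out}}} \langle\Psi|^{A^nB^nC_\delta D^n}\, |\Omega\rangle \geq 1 - \eta, \]

where

\[ |\Phi\rangle^{K'K_{\mathrm{in}}K_{\mathrm{out}}} = \frac{1}{\sqrt{\kappa}} \sum_{k=1}^{\kappa} |k\rangle^{K'} |k\rangle^{K_{\mathrm{in}}} |k\rangle^{K_{\mathrm{out}}} \]

is a GHZ state. The corresponding fidelity is thus bounded by $(1 - \eta)^2 \geq 1 - 2\eta$. By monotonicity of fidelity, we obtain

\[ F\bigl( |\Phi\rangle^{K'K_{\mathrm{in}}K_{\mathrm{out}}},\ \Omega^{K'K_{\mathrm{in}}K_{\mathrm{out}}} \bigr) \geq 1 - 2\eta \tag{11} \]

and with (5), we similarly find that

\[ \bigl\| \Omega^{A^nB^nC_\delta D^n} - \Psi^{A^nB^nC_\delta D^n} \bigr\|_1 \leq 2\sqrt{2\eta}. \]

Because $\Psi^{A^nB^nC_\delta D^n}$ is ε-close to $(\psi^{ABCD})^{\otimes n}$ in trace distance, the triangle inequality implies that

\[ \bigl\| \Omega^{A^nB^nC_\delta D^n} - (\psi^{ABCD})^{\otimes n} \bigr\|_1 \leq 2\sqrt{2\eta} + \varepsilon. \tag{12} \]

We may combine the estimates (11) and (12) using Lemma 2 from [21], yielding

\[ \begin{aligned} F\Bigl( |\Phi\rangle^{K'K_{\mathrm{in}}K_{\mathrm{out}}} \bigl( |\psi\rangle^{ABCD} \bigr)^{\otimes n},\ |\Omega\rangle \Bigr) &\geq 1 - \bigl\| \Omega^{A^nB^nC_\delta D^n} - (\psi^{ABCD})^{\otimes n} \bigr\|_1 - 3\Bigl( 1 - F\bigl( |\Phi\rangle^{K'K_{\mathrm{in}}K_{\mathrm{out}}},\ \Omega^{K'K_{\mathrm{in}}K_{\mathrm{out}}} \bigr) \Bigr) \\ &\geq 1 - \varepsilon - 2\sqrt{2\eta} - 6\eta. \end{aligned} \tag{13} \]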

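Now we only need to bound the two main terms in the expression (10) for η. Taking $|S| = 2^{nQ}$ and $\kappa = 2^{nR}$, the first main quantity in (10) satisfies

\[ \begin{aligned} \frac{|C_\delta|\, \bigl\| \varphi^{B^nD^n} \bigr\|_0\, \bigl\| \varphi^{B^nC^nD^n} \bigr\|_2^2}{|S|^2} &\leq 2^{n[H(C) + H(BD) - H(BCD) - 2Q] + 3n\delta} \\ &= 2^{n[I(C;BD) - 2Q] + 3n\delta} \end{aligned} \tag{14} \]

and thus tends to zero exponentially fast provided that

\[ Q \geq \tfrac{1}{2} I(C;BD) + 2\delta. \tag{15} \]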


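For the second term,

\[ \begin{aligned} \frac{\kappa\, \bigl\| \phi^{B^nC^n} \bigr\|_0\, \bigl\| \phi^{B^n} \bigr\|_\infty}{|C_\delta|} &\leq 2^{n[R + H(BC) - H(C) - H(B)] + 3n\delta} \\ &= 2^{n[R - I(B;C)] + 3n\delta} \end{aligned} \]

so that if R ≤ I(B;C) − 4δ, this term also goes to zero exponentially with n. For sufficiently large n, each of these terms is less than $\varepsilon = 2^{-nc\delta^2}$, giving

\[ \eta \leq 6\sqrt{2\varepsilon} + 4\varepsilon^{1/4} + 4\varepsilon. \]

Therefore, the overall fidelity (13) is at least $1 - 6\varepsilon^{1/8}$ when ε is sufficiently small. Recall the identity

\[ H(C) = \tfrac{1}{2} I(C;BD) + \tfrac{1}{2} I(A;C). \]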
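The identity just recalled follows directly from the purity of the global state. As a short sketch, using H(BD) = H(AC) and H(BCD) = H(A) for a pure state on ABCD:

```latex
\begin{aligned}
I(C;BD) &= H(C) + H(BD) - H(BCD) = H(C) + H(AC) - H(A), \\
I(A;C)  &= H(A) + H(C) - H(AC), \\
\tfrac{1}{2} I(C;BD) + \tfrac{1}{2} I(A;C) &= H(C).
\end{aligned}
```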

If Q obeys (15), Theorem 5 implies that $|C_\delta| \leq 2^{n[H(C) + \delta]}$. Therefore the protocol uses entanglement at rate

\[ E_{\mathrm{in}} = \frac{1}{n} \log |C_\delta| - Q \leq \tfrac{1}{2} I(A;C) - \delta. \]

Because δ > 0 can be taken arbitrarily small, it follows that whenever

\[ Q > \tfrac{1}{2} I(C;BD), \qquad E_{\mathrm{in}} > \tfrac{1}{2} I(A;C), \qquad \text{and} \qquad R < I(B;C), \]

we have, for all sufficiently large n,

\[ \psi^{AC|B} + \lfloor nQ \rfloor\,[q \to q] + \lfloor nE_{\mathrm{in}} \rfloor\,[qq] \ \geq_{6\varepsilon^{1/8}}\ \psi^{A|CB} + \lfloor nR \rfloor\,[q \to qq]. \]

Since this holds for arbitrarily small ε > 0 (in fact, it even holds for ε → 0 exponentially fast with n), the asymptotic resource inequality of Theorem 3 follows.
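The qualifier "when ε is sufficiently small" in the fidelity estimate of the proof above can be made quantitative. The sketch below simply evaluates the two sides of the claim: it inserts the bound on η into the error expression of (13) and compares the result with $6\varepsilon^{1/8}$. The function names are hypothetical, and the two sample values of ε are our own choices.

```python
import math

def eta_bound(eps):
    """Upper bound on eta from the proof: 6 sqrt(2 eps) + 4 eps^(1/4) + 4 eps."""
    return 6 * math.sqrt(2 * eps) + 4 * eps ** 0.25 + 4 * eps

def infidelity_bound(eps):
    """Error term in (13): eps + 2 sqrt(2 eta) + 6 eta, using the bound on eta."""
    eta = eta_bound(eps)
    return eps + 2 * math.sqrt(2 * eta) + 6 * eta

def within_claimed_bound(eps):
    """True if the error term is at most 6 eps^(1/8)."""
    return infidelity_bound(eps) <= 6 * eps ** 0.125

small_enough = within_claimed_bound(1e-16)
too_large = within_claimed_bound(1e-8)
```

With these formulas the claimed bound holds at ε = 10^{-16} but not yet at ε = 10^{-8}, because the leading term $2\sqrt{2\eta}$ is only slightly smaller than $6\varepsilon^{1/8}$. This is harmless for the proof, since ε decays exponentially in n.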

B. Proof of Theorem 4

Our proof of Theorem 4 makes essential use of the following robust one-shot decoupling lemma, which is proved in the appendix. After stating the lemma, we briefly recall how it is used in two previously studied special cases of our redistribution result, to help the reader understand the context into which it fits with our proof.

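Lemma 3 (Robust one-shot decoupling): Let a density matrix $\psi^{CE}$ be given, fix ε > 0 and let $\varphi^{CE}$ be any state satisfying $\|\psi^{CE} - \varphi^{CE}\|_1 \leq \varepsilon$. Fix a unitary decomposition $W^{C \to SB}$ of C into subsystems and define, for each $U^{C \to C}$,

\[ \psi^{SBE}_U = W U \psi^{CE} U^\dagger W^\dagger. \]

Then the average state

\[ \bar{\psi}^{SBE} = \int_{\mathcal{U}(C)} \psi^{SBE}_U \, dU \]

satisfies

\[ \bigl\| \bar{\psi}^{BE} - \pi^B \otimes \psi^E \bigr\|_1 \leq 2\varepsilon + \sqrt{\frac{|C|\, \|\varphi^E\|_0\, \|\varphi^{CE}\|_2^2}{|S|^2}}. \tag{16} \]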

The state redistribution problem generalizes two previously considered tasks which nonetheless play a role in our proof. Since our nomenclature differs from past writings, we pause briefly to describe our conventions. When A is trivial or is otherwise regarded as part of the reference D, we follow [7] in calling the corresponding task state merging because Alice is asked to "merge" C with B. When B is trivial we call the task state splitting because Alice must "split" C apart from A. In [11], [12], these tasks were respectively called "fully quantum Slepian-Wolf" and "fully quantum reverse Shannon", with the additional understanding that the involved resources are quantum communication and entanglement. On the other hand, [7], [10] introduced a protocol for the state merging problem, the so-called "state merging protocol", that only allows the use of quantum entanglement and classical communication. In this paper, when we speak of protocols for merging and splitting, we shall mean protocols with fully quantum resources in the sense of [11], [12], reserving the term "merging with classical communication" for protocols in the sense of [7], [10]. In Section V, we show how our protocol generalizes these latter protocols when the communication is limited to be only classical.

[Figure 2: circuit diagrams for merging (left), with rates $\tfrac{1}{2} I(C;D)$ and $\tfrac{1}{2} I(B;C)$, and splitting (right), with rates $\tfrac{1}{2} I(C;D)$ and $\tfrac{1}{2} I(A;C)$.]

Fig. 2. Circuits for merging (left) and splitting (right), related by time-reversal and swapping A ↔ B. We have included the rates one gets by applying the method of types (Theorem 5) to the one-shot decoupling lemma (Lemma 3). Note that for merging, the random encoding determines the decoding, while for splitting, the random decoding determines the encoding.

Given $|\psi\rangle^{BCD}$, an optimal protocol for merging C with B was given by proving the inequality

\[ \psi^{C|B} + \tfrac{1}{2} I(C;D)\,[q \to q] \succeq \psi^{\emptyset|CB} + \tfrac{1}{2} I(B;C)\,[qq]. \]

Together with the method of types (Theorem 5), the above decoupling lemma provides an immediate proof of this resource inequality. Indeed, if Alice encodes with a random unitary, Lemma 3 ensures that a system $\tilde{A}$ (identified with B in the lemma), which will hold Alice's half of the generated entanglement, is approximately maximally mixed and decoupled from the reference D. This can easily be shown to imply that $\tilde{A}$ is maximally entangled with BS (see [12], or compare with the proof of Theorem 4 below). Because all transformations are unitary and the global state is pure, this ensures that Bob can apply a local isometry to reconstruct C, while at the same time obtaining the other half of the generated entanglement. This scenario is illustrated on the left of Figure 2.

Given $|\psi\rangle^{ACD}$, a circuit for splitting is obtained by running a merging circuit in reverse (while swapping the labels A ↔ B), yielding the inequality

\[ \psi^{AC|\emptyset} + \tfrac{1}{2} I(C;D)\,[q \to q] + \tfrac{1}{2} I(A;C)\,[qq] \succeq \psi^{A|C}. \]

[Figure 3: circuit diagram with rates $\tfrac{1}{2} I(C;BD)$ and $\tfrac{1}{2} I(A;C)$, in which Alice applies an encoding labeled by k and Bob measures k before decoding.]

Fig. 3. Circuit for using Bob's side information to piggyback extra classical (or coherent) information through the fully quantum reverse Shannon circuit on the right of Figure 2.

The corresponding circuit is pictured on the right of Figure 2.

We prove Theorem 4 as follows. If Bob's side information is considered as part of the reference (i.e. is disregarded as side information), the fully quantum reverse Shannon protocol can be used to transfer C from Alice to Bob, at least making use of Alice's side information. By a modification of that protocol provided below, Bob's side information can be utilized to simulate the required coherent channels [q → qq] as follows. Rather than choosing a single random unitary for the decoding, we choose exponentially many (roughly $2^{nI(B;C)}$ when we apply the method of types to the one-shot result). We further guarantee that if Alice chooses one of the corresponding encodings uniformly at random, Bob can, on average, correctly distinguish that encoding in order to apply the correct decoding. Thus, it is possible for Alice to "piggyback" classical information on the transmitted qubits, which Bob can access by means of his side information (cf. [22], [23]). We further ensure that this can all be done coherently, where Alice instead applies a superposition of encodings by using a controlled isometry that is controlled by an arbitrary quantum state. The circuit we construct for performing this task non-coherently is illustrated in Figure 3.

Our proof of Theorem 4 relies on two other lemmas. First, we require the operator inequality [24]:

Lemma 4: If $0 \leq \Pi \leq \mathbb{1}$ and $\Pi \leq \Lambda$, then

\[ \mathbb{1} - \Lambda^{-1/2} \Pi \Lambda^{-1/2} \leq 2(\mathbb{1} - \Pi) + 4(\Lambda - \Pi). \]

We will also use the following coherification lemma, which allows us to convert protocols that transmit classical information to ones that simulate coherent channels. We give a short proof in the appendix.

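Lemma 5: Given a pure state $|\psi\rangle^{XY}$ and κ unitaries $U_k^{X \to X}$, let $|\psi_k\rangle^{XY} = U_k |\psi\rangle^{XY}$. Given any other set of pure states $|\psi'_k\rangle^{XY}$ and a POVM $\{\Lambda_k^X\}$ on X, there are complex phases $\alpha_k$ such that the isometry

\[ L^{X \to XK} = \sum_k \bigl( \alpha_k U_k^\dagger \sqrt{\Lambda_k} \bigr) \otimes |k\rangle^K \]

satisfies

\[ \frac{1}{\kappa} \sum_{k=1}^{\kappa} \langle k| \langle\psi| L |\psi'_k\rangle \geq 1 - 2\bigl( P + \sqrt{1 - F} \bigr), \]

where

\[ P = 1 - \frac{1}{\kappa} \sum_k \mathrm{Tr}\, \psi_k \Lambda_k, \qquad F = \frac{1}{\kappa} \sum_k |\langle\psi_k|\psi'_k\rangle|^2. \]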

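Proof of Theorem 4: As in the statement of the theorem, we fix nearby states $|\varphi\rangle$ and $|\phi\rangle$ and let $W^{C \to S\tilde{B}}$ be any unitary decomposition of C into subsystems. Independently choose κ unitaries $U_1, \ldots, U_\kappa$ according to the Haar measure on $\mathcal{U}(C)$. For each k, define the states

\[ |\psi_k\rangle^{ABCD} = U_k |\psi\rangle^{ABCD} \]

\[ |\psi_k\rangle^{AS\tilde{B}BD} = W U_k |\psi\rangle^{ABCD} = W |\psi_k\rangle^{ABCD}. \]

We define the decoupling fidelity for $\psi_k^{\tilde{B}BD}$ as

\[ F_k = F\bigl( \psi_k^{\tilde{B}BD},\ \pi^{\tilde{B}} \otimes \psi_k^{BD} \bigr). \]

Since $|\Phi\rangle^{\tilde{A}\tilde{B}} |\psi\rangle^{ABCD}$ is a purification of $\pi^{\tilde{B}} \otimes \psi^{BD}$, Uhlmann's theorem implies that there is an isometry $\tilde{V}_k^{AS \to \tilde{A}AC}$ under which

\[ F_k = \bigl| \langle\psi_k|^{AS\tilde{B}BD}\, \tilde{V}_k^\dagger\, |\Phi\rangle^{\tilde{A}\tilde{B}} |\psi\rangle^{ABCD} \bigr|^2. \]

To send the message k, Alice will apply the isometry $V_k = \tilde{V}_k^\dagger$. We now define

\[ |\psi'_k\rangle^{ABCD} = W^\dagger V_k |\Phi\rangle^{\tilde{A}\tilde{B}} |\psi\rangle^{ABCD}, \]

which is the state that is created after Alice performs $V_k$ and gives S to Bob, who then applies $W^{\dagger\, S\tilde{B} \to C}$. We may therefore equivalently write

\[ F_k = \bigl| \langle\psi_k|^{ABCD} |\psi'_k\rangle^{ABCD} \bigr|^2. \]

The average decoupling fidelity is a random variable

\[ F_{\mathrm{ave}} = \frac{1}{\kappa} \sum_{k=1}^{\kappa} F_k \]

that depends on the random choice of unitaries. We lower bound its expectation as follows. Define the average states with respect to Haar measure dU as

\[ \bar{\psi}^{ABCD} = \int_{\mathcal{U}(C)} \psi^{ABCD}_U \, dU \]

\[ \bar{\psi}^{AS\tilde{B}BD} = W \bar{\psi}^{ABCD} W^\dagger. \]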

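We now use the robust one-shot decoupling lemma (Lemma 3) to bound the expectation of $F_{\mathrm{ave}}$ over the random choice of unitaries:

\[ \begin{aligned} 1 - \mathbb{E} F_{\mathrm{ave}} &= 1 - \frac{1}{\kappa} \sum_{k=1}^{\kappa} \mathbb{E} F_k \\ &= 1 - F\bigl( \bar{\psi}^{\tilde{B}BD},\ \pi^{\tilde{B}} \otimes \psi^{BD} \bigr) \\ &\leq 2\varepsilon + \sqrt{\frac{|C|\, \|\varphi^{BD}\|_0\, \|\varphi^{BCD}\|_2^2}{|S|^2}}. \end{aligned} \]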

A related estimate to be used later is

E√

1− Fave ≤√

1− EFave

≤√

2ε+


2

2

|S|2

)1/4

, (17)

which follows by concavity and the inequality√x+ y ≤√

x+√y, valid for x, y ≥ 0.

Next, we consider Bob's ability to distinguish the states $\psi_k'^{BC}$. For this, we design a measurement that distinguishes the nearby states $\varphi_k^{BC} = U_k\varphi^{BC}U_k^\dagger$. Let $\Pi$ be the projection onto the support of $\varphi^{BC}$ and define

$$\Pi_k = U_k\Pi U_k^\dagger,$$

while defining the “pretty good measurement”

$$\Lambda = \sum_{k=1}^{\kappa}\Pi_k, \qquad \Lambda_k = \Lambda^{-1/2}\,\Pi_k\,\Lambda^{-1/2}.$$

The probability that this measurement fails to identify the state $\psi_k'^{BC}$ is

$$P_k = \operatorname{Tr}(\mathbb{1} - \Lambda_k)\,\psi_k'^{BC}.$$

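Observe that

$$\begin{aligned}
\bigl|P_k - \operatorname{Tr}(\mathbb{1}-\Lambda_k)\varphi_k^{BC}\bigr| &\le \bigl\|\psi_k'^{BC} - \varphi_k^{BC}\bigr\|_1 \\
&\le \bigl\|\psi_k'^{BC} - \psi_k^{BC}\bigr\|_1 + \bigl\|\psi_k^{BC} - \varphi_k^{BC}\bigr\|_1 \\
&\le 2\sqrt{1-F_k} + \epsilon \\
&\equiv D_k.
\end{aligned}$$

Because $x \mapsto \sqrt{x}$ is concave, we have

$$D_{\rm ave} \equiv \frac{1}{\kappa}\sum_{k=1}^{\kappa} D_k \;\le\; \epsilon + 2\sqrt{1-F_{\rm ave}}.$$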

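Therefore, the average of the $P_k$ can be bounded using Lemma 4, obtaining a random variable satisfying

$$\begin{aligned}
P_{\rm ave} \equiv \frac{1}{\kappa}\sum_{k=1}^{\kappa}P_k
&\le D_{\rm ave} + \frac{1}{\kappa}\sum_{k=1}^{\kappa}\operatorname{Tr}(\mathbb{1}-\Lambda_k)\varphi_k^{BC} \\
&\le D_{\rm ave} + \frac{1}{\kappa}\sum_{k=1}^{\kappa}\Bigl(2\bigl(1 - \operatorname{Tr}\Pi_k\varphi_k^{BC}\bigr) + 4\sum_{k'\neq k}\operatorname{Tr}\Pi_{k'}\varphi_k^{BC}\Bigr) \\
&= D_{\rm ave} + \frac{4}{\kappa}\sum_{k=1}^{\kappa}\sum_{k'\neq k}\operatorname{Tr}\Pi_{k'}\varphi_k^{BC}.
\end{aligned}$$

The last line holds because for each $k$, $\Pi_k$ projects onto the support of $\varphi_k^{BC}$. By taking the expectation over the random choice of unitaries, this yields

$$\begin{aligned}
\mathbb{E}P_{\rm ave} &\le \mathbb{E}D_{\rm ave} + 4\kappa\,\mathbb{E}\operatorname{Tr}\Pi_1\varphi_2^{BC} \\
&= \mathbb{E}D_{\rm ave} + 4\kappa\operatorname{Tr}\bigl[\mathbb{E}\Pi_1\,\mathbb{E}\varphi_2^{BC}\bigr] \\
&= \mathbb{E}D_{\rm ave} + 4\kappa\operatorname{Tr}\bigl[\mathbb{E}\Pi_1\,(\pi^{C}\otimes\varphi_2^{B})\bigr] \\
&\le \mathbb{E}D_{\rm ave} + \frac{4\kappa\,\|\varphi^{BC}\|_0\,\|\varphi^{B}\|_\infty}{|C|} \\
&\le 2\,\mathbb{E}\sqrt{1-F_{\rm ave}} + \epsilon + \frac{4\kappa\,\|\varphi^{BC}\|_0\,\|\varphi^{B}\|_\infty}{|C|}.
\end{aligned} \tag{18}$$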

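We now apply Lemma 5 with $X = BC$ and $Y = AD$, giving an isometry $L^{BC\to BCK}$ under which

$$\frac{1}{\kappa}\sum_{k=1}^{\kappa}\langle k|\langle\psi|L|\psi'_k\rangle \;\ge\; 1 - 2\bigl(P_{\rm ave} + \sqrt{1-F_{\rm ave}}\bigr).$$

Taking expectations, we find that

$$\begin{aligned}
1 - \mathbb{E}\frac{1}{\kappa}\sum_{k=1}^{\kappa}\langle k|\langle\psi|L|\psi'_k\rangle
&\le 2\,\mathbb{E}\sqrt{1-F_{\rm ave}} + 2\,\mathbb{E}P_{\rm ave} \\
&\le 4\,\mathbb{E}\sqrt{1-F_{\rm ave}} + \epsilon + \frac{4\kappa\,\|\varphi^{BC}\|_0\,\|\varphi^{B}\|_\infty}{|C|} \\
&\le 6\sqrt{\epsilon} + 4\left(\frac{|C|\,\|\phi^{BD}\|_0\,\|\phi^{BCD}\|_2^2}{|S|^2}\right)^{1/4} + \frac{4\kappa\,\|\varphi^{BC}\|_0\,\|\varphi^{B}\|_\infty}{|C|}.
\end{aligned}$$

The second inequality is by (18) while the third is due to (17) and holds for $\epsilon \le (6 - 4\sqrt{2})^2$. We may then conclude that for a particular value of the randomness, the same bound holds without the expectations. Finally, we define Bob's decoding isometry to be $\widetilde W = LW^\dagger$, completing the proof.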
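The threshold on $\epsilon$ quoted for the third inequality can be recovered by a short calculation. Substituting (17) into the term $4\,\mathbb{E}\sqrt{1-F_{\rm ave}}$ leaves an $\epsilon$-dependent contribution $4\sqrt{2\epsilon} + \epsilon$, and the display below (a worked step added here for the reader, not part of the original argument) shows when this is at most $6\sqrt{\epsilon}$:

```latex
\begin{align*}
4\sqrt{2\epsilon} + \epsilon \le 6\sqrt{\epsilon}
&\iff \sqrt{\epsilon}\,\bigl(\sqrt{\epsilon} - (6 - 4\sqrt{2})\bigr) \le 0 \\
&\iff \epsilon \le (6 - 4\sqrt{2})^{2} \approx 0.118 .
\end{align*}
```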

IV. AN OPERATIONAL PROOF OF STRONG SUBADDITIVITY

Let $|\psi\rangle^{ABCD}$ be an arbitrary pure state. In this section, we show how our results lead to an operational proof of strong subadditivity, i.e. that $I(C;D|B) \ge 0$. By discarding some resources on the right in Theorem 2, we obtain:

$$\psi^{AC|B} + \tfrac{1}{2}I(C;BD)\,[q\to q] + \tfrac{1}{2}I(A;C)\,[qq] \;\ge\; \tfrac{1}{2}I(B;C)\,[q\to q].$$

Intuitively, it makes sense that we should have

$$I(C;BD) - I(B;C) = I(C;D|B) \ge 0$$

since otherwise, a noiseless qubit channel could be used to faithfully transmit more than one qubit in the presence of entanglement between the sender and receiver. Of course this inequality is guaranteed by strong subadditivity. However, our aim is to provide an alternative proof of this fundamental inequality. The above asymptotic resource inequality implies that for every $\epsilon, \delta > 0$ and all sufficiently large $n$, we have

$$\Psi^{L|L'} + \bigl\lfloor \tfrac{n}{2}I(C;BD) + n\delta \bigr\rfloor\,[q\to q] \;\ge_{\epsilon}\; \bigl\lfloor \tfrac{n}{2}I(B;C) \bigr\rfloor\,[q\to q]. \tag{19}$$

$\Psi^{L|L'}$ represents prior entanglement between Alice and Bob. Its precise form is irrelevant for our argument; we lose no generality by assuming it is pure. Now consider the following lemma, whose proof we delay until the end of this section.

Lemma 6: Let $X$ and $Y$ be quantum systems and let $|\Psi\rangle^{L|L'}$ be arbitrary. Consider an attempted simulation

$$\mathcal{N}^{X\to X}(\rho^{X}) = \mathcal{D}^{YL'\to X}\bigl(\mathcal{E}^{XL\to Y}\otimes\mathbb{1}^{L'}\bigr)\bigl(\rho^{X}\otimes\Psi^{L|L'}\bigr)$$

of the identity quantum channel $\mathrm{id}^{X\to X}$ by the possibly smaller one $\mathrm{id}^{Y\to Y}$, assisted by the bipartite state $|\Psi\rangle^{L|L'}$. If $|\Phi\rangle^{X'X}$ is maximally entangled, then the entanglement fidelity [25] satisfies

$$F\bigl(|\Phi\rangle^{X'X},\, (\mathbb{1}^{X'}\otimes\mathcal{N})(\Phi^{X'X})\bigr) \;\le\; \frac{|Y|}{|X|}. \tag{20}$$

Plugging in $|Y| = 2^{\lfloor \frac{n}{2}I(C;BD) + n\delta\rfloor}$ and $|X| = 2^{\lfloor \frac{n}{2}I(B;C)\rfloor}$ to (20), we find that the entanglement fidelity is upper bounded by $2^{\lfloor \frac{n}{2}I(C;D|B) + n(\delta + \frac{1}{n})\rfloor}$. Suppose now that strong subadditivity were not satisfied. Then, for some sufficiently small $\delta > 0$ the entanglement fidelity would tend to zero exponentially fast with $n$. However, (19) implies that for sufficiently large $n$, the entanglement fidelity can be made arbitrarily close to 1. Therefore $I(C;D|B) \ge 0$.
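The chain-rule identity $I(C;BD) - I(B;C) = I(C;D|B)$ that drives the exponent above is easy to check numerically. The following is a minimal sketch, assuming a randomly drawn pure state on four small systems; the dimensions, the seed and the helper names (`random_pure_state`, `entropy`, `mutual_info`, `cond_mutual_info`) are choices made for this illustration, and the nonnegativity it observes is of course just strong subadditivity rather than independent evidence for it.

```python
import numpy as np

def random_pure_state(dims, rng):
    """Random normalized state vector, stored as a tensor with one axis per system."""
    psi = rng.normal(size=dims) + 1j * rng.normal(size=dims)
    return psi / np.linalg.norm(psi)

def entropy(psi, keep):
    """Von Neumann entropy (bits) of the marginal of |psi> on the axes in `keep`."""
    keep = sorted(keep)
    if not keep or len(keep) == psi.ndim:
        return 0.0  # empty marginal, or the pure global state
    rest = [i for i in range(psi.ndim) if i not in keep]
    rows = int(np.prod([psi.shape[i] for i in keep]))
    mat = np.transpose(psi, keep + rest).reshape(rows, -1)
    # Squared singular values are the eigenvalues of the reduced density matrix.
    probs = np.linalg.svd(mat, compute_uv=False) ** 2
    probs = probs[probs > 1e-15]
    return float(-np.sum(probs * np.log2(probs)))

def mutual_info(psi, x, y):
    """I(X;Y) = H(X) + H(Y) - H(XY)."""
    return entropy(psi, x) + entropy(psi, y) - entropy(psi, x + y)

def cond_mutual_info(psi, x, y, z):
    """I(X;Y|Z) = H(XZ) + H(YZ) - H(XYZ) - H(Z)."""
    return (entropy(psi, x + z) + entropy(psi, y + z)
            - entropy(psi, x + y + z) - entropy(psi, z))

A, B, C, D = [0], [1], [2], [3]          # axis labels of the four systems
rng = np.random.default_rng(1)
psi = random_pure_state((2, 3, 2, 3), rng)

lhs = mutual_info(psi, C, B + D) - mutual_info(psi, B, C)
cmi = cond_mutual_info(psi, C, D, B)
print(f"I(C;BD) - I(B;C) = {lhs:.6f},  I(C;D|B) = {cmi:.6f}")
```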

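Proof of Lemma 6: Let $\{E_i\}$ and $\{D_j\}$ be Kraus matrices for the encoding $\mathcal{E}^{XL\to Y}$ and decoding $\mathcal{D}^{YL'\to X}$. Fixing orthonormal bases of $L$ and $L'$ that Schmidt-decompose the assistance state as

$$|\Psi\rangle^{L|L'} = \sum_\ell \sqrt{\lambda_\ell}\,|\ell\rangle^{L}|\ell\rangle^{L'},$$

the above Kraus matrices can be written in block form

$$E_i = \bigl[E_{i1} \;\cdots\; E_{i|L|}\bigr], \qquad D_j = \bigl[D_{j1} \;\cdots\; D_{j|L|}\bigr].$$

Because these maps are trace-preserving, we have

$$\sum_i E_i^\dagger E_i = \mathbb{1}^{XL}, \qquad \sum_j D_j^\dagger D_j = \mathbb{1}^{YL'},$$

which in turn implies that

$$\sum_i E_{i\ell}^\dagger E_{i\ell'} = \delta_{\ell\ell'}\mathbb{1}^{X}, \qquad \sum_j D_{j\ell}^\dagger D_{j\ell'} = \delta_{\ell\ell'}\mathbb{1}^{Y}. \tag{21}$$

The overall map $\mathcal{N}^{X\to X}$ has Kraus matrices

$$N_{ij} = \sum_\ell \sqrt{\lambda_\ell}\,D_{j\ell}E_{i\ell}.$$

The entanglement fidelity (20) can be written as [25]:

$$F\bigl(|\Phi\rangle^{X'X},\, (\mathbb{1}^{X'}\otimes\mathcal{N})(\Phi^{X'X})\bigr) = \sum_{ij}\bigl|\operatorname{Tr}N_{ij}\pi^{X}\bigr|^2 = \frac{1}{|X|^2}\sum_{ij}\bigl|\operatorname{Tr}N_{ij}\bigr|^2.$$

On the other hand,

$$\begin{aligned}
\sum_{ij}\bigl|\operatorname{Tr}N_{ij}\bigr|^2 &= \sum_{i\ell j}\lambda_\ell\bigl|\operatorname{Tr}D_{j\ell}E_{i\ell}\bigr|^2 \\
&\le \sum_{i\ell j}\lambda_\ell\,|Y|\operatorname{Tr}E_{i\ell}^\dagger D_{j\ell}^\dagger D_{j\ell}E_{i\ell} && (22) \\
&= |Y|\sum_\ell \lambda_\ell\operatorname{Tr}\sum_i E_{i\ell}^\dagger\Bigl(\sum_j D_{j\ell}^\dagger D_{j\ell}\Bigr)E_{i\ell} \\
&= |Y|\sum_\ell \lambda_\ell\operatorname{Tr}\mathbb{1}^{X} && (23) \\
&= |Y|\cdot|X|.
\end{aligned}$$

Above, (22) holds because for each $i$, $j$ and $\ell$, there is a rank $|Y|$ projection $P$ satisfying $PD_{j\ell}E_{i\ell} = D_{j\ell}E_{i\ell}$, while the Cauchy-Schwarz inequality implies

$$\bigl|\operatorname{Tr}PD_{j\ell}E_{i\ell}\bigr|^2 \le \bigl(\operatorname{Tr}P^\dagger P\bigr)\cdot\bigl(\operatorname{Tr}E_{i\ell}^\dagger D_{j\ell}^\dagger D_{j\ell}E_{i\ell}\bigr) = |Y|\operatorname{Tr}E_{i\ell}^\dagger D_{j\ell}^\dagger D_{j\ell}E_{i\ell}.$$

Equation (23) follows from the identities (21) and the last line holds because the squares of the Schmidt coefficients sum to unity. This proves the lemma.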
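The bound (20) can be illustrated numerically in the simplest special case, where the assistance state is trivial ($|L| = |L'| = 1$) so that the Kraus matrices have a single block. The sketch below draws a random encoder $X \to Y$ and decoder $Y \to X$ and evaluates $\sum_{ij}|\operatorname{Tr}D_jE_i|^2/|X|^2$. The dimensions, the number of Kraus operators and the helper names (`random_kraus`, `entanglement_fidelity`) are assumptions made for this example; it does not exercise the entanglement-assisted case treated in the proof.

```python
import numpy as np

def random_kraus(d_in, d_out, n_kraus, rng):
    """Kraus operators of a random channel, cut from a random isometry.

    Requires n_kraus * d_out >= d_in so that the QR factor has orthonormal columns.
    """
    shape = (n_kraus * d_out, d_in)
    z = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    v, _ = np.linalg.qr(z)                      # v^dagger v = identity on the input
    return [v[i * d_out:(i + 1) * d_out, :] for i in range(n_kraus)]

def entanglement_fidelity(enc, dec, d_x):
    """sum_ij |Tr(D_j E_i)|^2 / |X|^2 for the composed channel X -> Y -> X."""
    return sum(abs(np.trace(d @ e)) ** 2 for e in enc for d in dec) / d_x ** 2

rng = np.random.default_rng(3)
d_x, d_y = 4, 2                                  # |X| > |Y|: the channel is too small
enc = random_kraus(d_x, d_y, n_kraus=3, rng=rng) # encoder E: X -> Y
dec = random_kraus(d_y, d_x, n_kraus=2, rng=rng) # decoder D: Y -> X

# Trace preservation of the encoder: sum_i E_i^dagger E_i = identity on X.
completeness = sum(e.conj().T @ e for e in enc)
fid = entanglement_fidelity(enc, dec, d_x)
print(f"entanglement fidelity = {fid:.4f}, bound |Y|/|X| = {d_y / d_x:.4f}")
```

Random encoders and decoders typically fall well below the bound; the point of the check is only that the ratio $|Y|/|X|$ is never exceeded.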

V. DISCUSSION

State redistribution is the most general unidirectional two-terminal fully quantum source coding problem. It consists of moving a subsystem of a multipartite pure state between two spatially separated parties when the sender and receiver each hold subsystems, which are regarded as quantum side information. We have identified the cost, in terms of entanglement and transmitted qubits, for performing state redistribution, by presenting a protocol that uses these two resources at optimal rates, i.e. that matches the Luo-Devetak outer bound [4]. Our proof that this protocol exists consists of a new resource inequality that, when combined with other known results, implies that an optimal protocol exists. The optimal lower bound on the achievable communication rates provides the first known operational interpretation of quantum conditional mutual information. Technically, we provide an interpretation for one half of the conditional mutual information; nonetheless, we observed in [8] that by teleportation, we obtain a bona fide interpretation of conditional mutual information (i.e. without the 1/2) as the optimal communication rate when only classical communication is allowed in the sense of [7], [10]. While operational interpretations of quantum mutual information are known [6], [26], these do not simply lead to one for the conditional quantity by naively subtracting mutual informations. Instead, one requires a proof consisting of a protocol (as found here) achieving rates arbitrarily close to the desired quantity, together with a converse (as in [4]) demonstrating optimality.

Our interpretation provides an explanation of the quadripartite pure state identity $I(C;D|A) = I(C;D|B)$ because the essential reversibility of our protocol implies that the communication cost is the same in both directions. Indeed, with the exception of the Schumacher compression step, which is essentially reversible because it succeeds with high probability, the protocol constructed to prove Theorem 3 consists entirely of isometries. Moreover, the additional steps used to arrive at Theorem 1 introduce at most a “sublinear amount” of nonunitarity. Throughout this paper, we have adhered to the convention of always conditioning on Bob's side information, although this was an arbitrary notational choice. We thus interpret quantum conditional mutual information, as it appears throughout this paper, as a measure of the quantum correlations between $C$ and $D$, from the perspective of either $A$ or $B$.

| Problem | A | B | D |
|---|---|---|---|
| state redistribution | • | • | • |
| state merging | ◦ | • | • |
| state splitting | • | ◦ | • |
| Schumacher compression | ◦ | ◦ | • |
| entanglement concentration | ◦ | • | ◦ |
| entanglement dilution | • | ◦ | ◦ |
| concentration + dilution | • | • | ◦ |

Fig. 4. State redistribution reduces to other known problems when various subsystems, represented here by open circles (◦), are trivial ($C$ is always nontrivial in these settings). We exclude the trivial problem consisting of just a pure state on $C$, which can be solved with no nonlocal resources at all.

In Figure 4, we illustrate several special cases of state redistribution. Our protocol yields optimal protocols for the problems listed there, at least with regard to the rates at which resources are consumed or generated. Respectively disregarding Alice's or Bob's side information gives optimal protocols for state merging and state splitting (recall our nomenclature from Section III-B), which can also be obtained by simply combining Theorem 5 and the robust decoupling lemma (Lemma 3). Furthermore, when both parties lack side information we recover (albeit somewhat trivially) Schumacher data compression. As pointed out in [8], the formal time-reversal duality between merging and splitting observed in [11] is embodied in a more natural way by our new protocol, which is in fact self-dual with respect to time reversal. In [8], we also observed the intuitively satisfying (but nonetheless surprising) fact that successive redistribution can be performed optimally using the optimal redistribution protocol.

Other protocols are obtained when $D$ is trivial, in which case strong subadditivity is saturated, $I(C;D|B) = 0$, and thus any positive communication rate is achievable by our protocol. When either $A$ or $B$ is also trivial, we respectively obtain protocols for entanglement concentration and dilution [13], and when both $A$ and $B$ are nontrivial, state redistribution gives an alternate approach to first concentrating the $AC|B$ entanglement then diluting the $A|CB$ entanglement [27], each of which gives a net entanglement cost of $H(A) - H(B)$. Note that [20], [27] showed that diluting EPR entanglement into i.i.d. pure states requires a nonzero (but sublinear) communication cost to achieve any constant error, while exponentially small error requires any nonzero communication rate. We therefore must expect the same with even the most generic state redistribution instances that saturate strong subadditivity. Here, the states are such that $C$ is conditionally decoupled from the reference $D$ given $A$ or $B$ and, up to local unitaries, have the form

$$\sum_x \sqrt{p_x}\,|x\rangle^{A'}|x\rangle^{B'}|\psi_x\rangle^{A_CB_CC}|\varphi_x\rangle^{A_DB_DD}.$$

As pointed out in [8] (with a sign error in the published version) this type of state can be redistributed with entanglement cost

$$\sum_x p_x\bigl(H(A_C) - H(B_C)\bigr)_{\psi_x}.$$

An interesting problem that we do not address in this paper is to more carefully account for sublinear terms in the overall cost for redistribution. Besides giving more precise estimates when the overall rates are zero, a more careful study might provide a better understanding of transformations between non-maximally entangled states as considered in [27], [28]. In particular, we note that while exponentially small error is generically possible with our protocol, this might not be possible when sublinear amounts of resources are used.

Because the main technical part of our proof is proved in a one-shot fashion, it could possibly be applied to more general quantum sources that do not satisfy the i.i.d. property but are instead structured in some other way; for instance, to ground states of many-body Hamiltonians in statistical physics. In particular, there are intriguing connections between state redistribution and topological entanglement entropy, which is a characteristic of topologically ordered ground states of gapped 2D quantum spin systems. These connections will be pursued elsewhere.

It could be useful for such applications to have a more direct proof of Theorem 1 that does not use coherent channels or the cancellation lemma. While it would be most desirable to have a one-shot version of Theorem 1, it might be more natural (see Note Added) to find a one-shot version of the related resource inequality

$$\psi^{AC|B} + \tfrac{1}{2}I(C;D|B)\,[q\to q] + \tfrac{1}{2}I(A;C)\,[qq] \;\ge\; \psi^{A|BC} + \tfrac{1}{2}I(B;C)\,[qq].$$

The corresponding circuit for this case makes the time-reversal symmetry most apparent, as illustrated in Figure 5.

We expect state redistribution to be a useful primitive for studying more complicated state transfer problems. Most generally, one can imagine $n$ spatially separated parties all holding various parts of a global multipartite state, wishing to shuffle their subsystems around in some arbitrary but predetermined way. There is a multitude of ways that redistribution could be applied to give achievable rate regions for such problems, where each round of communication would fit our general setting, although they would most likely be suboptimal in general. A simple example along these lines, for which the optimal solution is not yet known, was considered in [29], where Alice and Bob wish to swap two systems. Perhaps judicious use of state redistribution can lead to new achievable rates for this or related problems by optimizing over ways of splitting the systems to be swapped into subsystems.

Fig. 5. Potential one-shot redistribution circuit making time-reversal symmetry apparent. (Circuit diagram: wires $A$, $B$, $C$; unitaries $U_{\rm enc}$ and $U_{\rm dec}$; resources $\tfrac{1}{2}I(C;D|B)$, $\tfrac{1}{2}I(A;C)$, $\tfrac{1}{2}I(B;C)$.)

Apparently, one half of the mutual information plays a central role in characterizing the optimal rates in this paper. In the following somewhat mysterious fashion, this quantity can be considered as a “measure” of the correlations between two subsystems. By analogy with thermodynamics, it is possible to identify an underlying heuristic organizing principle governing our optimal rates that perhaps could lend itself to further generalizations of redistribution. The main task of state redistribution is to transform between two configurations of the subsystems as follows:

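$$AC\,\big|\,B\,\big|\,D \;\longrightarrow\; A\,\big|\,CB\,\big|\,D.$$

Let $A_{\rm initial/final}$ (resp. $B_{\rm initial/final}$) denote the systems Alice (resp. Bob) holds at the beginning/end of the protocol. Consider the following “dynamic potentials” relative to Alice→Bob communication:

$$K^{A\to B}_{\rm initial} \equiv \tfrac{1}{2}I(A_{\rm initial};D) = \tfrac{1}{2}I(AC;D)$$

$$K^{A\to B}_{\rm final} \equiv \tfrac{1}{2}I(A_{\rm final};D) = \tfrac{1}{2}I(A;D).$$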

We interpret these as indicating the correlations between Alice's systems and the reference, both before and after redistribution. By the chain rule $I(AC;D) = I(A;D) + I(C;D|A)$, the optimal qubit rate for redistribution is easily shown to equal the difference between the dynamic potentials

$$K^{A\to B}_{\rm initial} - K^{A\to B}_{\rm final} = \tfrac{1}{2}I(C;D|A) = \tfrac{1}{2}I(C;D|B).$$

We are therefore operationally justified in interpreting this difference as measuring the correlations with the reference that Alice must transfer to Bob to redistribute the state. Analogously, we may also define “static potentials”

$$S^{A\to B}_{\rm initial} \equiv \tfrac{1}{2}I(A_{\rm initial};B_{\rm initial}) = \tfrac{1}{2}I(AC;B)$$

$$S^{A\to B}_{\rm final} \equiv \tfrac{1}{2}I(A_{\rm final};B_{\rm final}) = \tfrac{1}{2}I(A;BC)$$

that indicate the correlations between Alice's and Bob's systems at each stage of redistribution. Similarly, the optimal ebit rate can be shown to equal the difference of the static potentials

$$S^{A\to B}_{\rm final} - S^{A\to B}_{\rm initial} = \tfrac{1}{2}I(A;C) - \tfrac{1}{2}I(B;C).$$

It is operationally justifiable to consider this difference as the amount of excess correlation between Alice and Bob that is involved in going between the two configurations.

Relative to the Bob→Alice direction, in which Bob starts with $BC$ and ends with $B$, the dynamic potentials are subtracted from a constant

$$K^{B\to A}_{\rm initial/final} = H(D) - K^{A\to B}_{\rm final/initial}$$

while the static potentials obey

$$S^{A\to B}_{\rm initial/final} = S^{B\to A}_{\rm final/initial}.$$

Subtracting these potentials as above, we find that

$$K^{B\to A}_{\rm final} - K^{B\to A}_{\rm initial} = K^{A\to B}_{\rm final} - K^{A\to B}_{\rm initial},$$

while

$$S^{A\to B}_{\rm final} - S^{A\to B}_{\rm initial} = -\bigl(S^{B\to A}_{\rm final} - S^{B\to A}_{\rm initial}\bigr),$$

providing another explanation of the symmetry properties of the optimal rates. One could imagine generalizations of the above where more complicated potentials are defined for redistribution problems involving many more parties. However, we expect it would be challenging to find operational justifications for such theories.
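The entropic identities behind these potentials can be checked on a concrete example. The sketch below is an illustration under stated assumptions (a randomly drawn four-party pure state with small, arbitrarily chosen dimensions, and helper names `entropy`, `mi`, `cmi` introduced here). It evaluates the potentials $\tfrac{1}{2}I(AC;D)$, $\tfrac{1}{2}I(A;D)$, $\tfrac{1}{2}I(AC;B)$ and $\tfrac{1}{2}I(A;BC)$ and compares their gaps with $\tfrac{1}{2}I(C;D|B)$ and $\tfrac{1}{2}I(A;C) - \tfrac{1}{2}I(B;C)$.

```python
import numpy as np

def entropy(psi, keep):
    """Entropy (bits) of the marginal of the pure-state tensor `psi` on axes `keep`."""
    keep = sorted(keep)
    if not keep or len(keep) == psi.ndim:
        return 0.0
    rest = [i for i in range(psi.ndim) if i not in keep]
    rows = int(np.prod([psi.shape[i] for i in keep]))
    sv = np.linalg.svd(np.transpose(psi, keep + rest).reshape(rows, -1),
                       compute_uv=False)
    p = sv[sv > 1e-12] ** 2          # eigenvalues of the reduced state
    return float(-np.sum(p * np.log2(p)))

def mi(psi, x, y):
    """Mutual information I(X;Y)."""
    return entropy(psi, x) + entropy(psi, y) - entropy(psi, x + y)

def cmi(psi, x, y, z):
    """Conditional mutual information I(X;Y|Z)."""
    return (entropy(psi, x + z) + entropy(psi, y + z)
            - entropy(psi, x + y + z) - entropy(psi, z))

rng = np.random.default_rng(7)
psi = rng.normal(size=(2, 2, 3, 2)) + 1j * rng.normal(size=(2, 2, 3, 2))
psi /= np.linalg.norm(psi)
A, B, C, D = [0], [1], [2], [3]      # axis labels of the four systems

# Dynamic potentials: Alice's correlation with the reference before and after.
k_initial, k_final = 0.5 * mi(psi, A + C, D), 0.5 * mi(psi, A, D)
# Static potentials: Alice-Bob correlation before and after.
s_initial, s_final = 0.5 * mi(psi, A + C, B), 0.5 * mi(psi, A, B + C)

qubit_rate = 0.5 * cmi(psi, C, D, B)
ebit_rate = 0.5 * mi(psi, A, C) - 0.5 * mi(psi, B, C)

print(f"gap of dynamic potentials {k_initial - k_final:.6f} vs qubit rate {qubit_rate:.6f}")
print(f"gap of static potentials  {s_final - s_initial:.6f} vs ebit rate  {ebit_rate:.6f}")
```

For a pure state on $ABCD$ one also has $\tfrac{1}{2}I(B;D) + \tfrac{1}{2}I(AC;D) = H(D)$ and $\tfrac{1}{2}I(BC;D) + \tfrac{1}{2}I(A;D) = H(D)$, which is the sense in which the potentials of the two parties are complementary.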

ACKNOWLEDGMENTS

We would like to thank Charlie Bennett for suggesting the circuit pictured in Figure 5 and Toby Berger for encouraging us to find a quantum counterpart to the classical result on successive refinement of information. Igor Devetak was supported in part by the NSF grants CCF-0524811 and CCF-0545845 (CAREER). Jon Yard's research at Caltech was supported from the NSF under the grant PHY-0456720. His research at LANL is supported by the Center for Nonlinear Studies (CNLS), the Quantum Institute and the LDRD program of the U.S. Department of Energy.

NOTE ADDED

After a preprint of this article was made available, a one-shot version of our main result along the lines of Figure 5 was found [30], [31].

APPENDIX

Here we collect the proofs of some auxiliary results used in the proof of Theorem 4. Our proof of the robust decoupling lemma (Lemma 3) relies on the following non-robust version from [12].

Lemma 7 (One-shot decoupling): Let a density matrix $\phi^{CE}$ be given and fix a unitary decomposition $W^{C\to SB}$ of $C$ into subsystems. For each unitary $U^{C\to C}$, define

$$\phi_U^{SBE} = WU\phi^{CE}U^\dagger W^\dagger.$$

Then

$$\int_{\mathcal{U}(C)}\bigl\|\phi_U^{BE} - \pi^{B}\otimes\phi^{E}\bigr\|_1^2\,dU \;\le\; \frac{|C|\,\|\phi^{E}\|_0\,\|\phi^{CE}\|_2^2}{|S|^2}. \tag{24}$$

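Proof of Lemma 3: By convexity of the trace norm

$$\bigl\|\overline{\psi}^{BE} - \pi^{B}\otimes\psi^{E}\bigr\|_1 \le \int_{\mathcal{U}(C)}\bigl\|\psi_U^{BE} - \pi^{B}\otimes\psi^{E}\bigr\|_1\,dU,$$

where $dU$ is Haar measure on $\mathcal{U}(C)$. We use the triangle inequality to bound the integrand:

$$\begin{aligned}
\bigl\|\psi_U^{BE} - \pi^{B}\otimes\psi^{E}\bigr\|_1 &\le \bigl\|\phi_U^{BE} - \pi^{B}\otimes\phi^{E}\bigr\|_1 && (25) \\
&\quad + \bigl\|\psi_U^{BE} - \phi_U^{BE}\bigr\|_1 && (26) \\
&\quad + \bigl\|\pi^{B}\otimes\psi^{E} - \pi^{B}\otimes\phi^{E}\bigr\|_1. && (27)
\end{aligned}$$

The second term is bounded using monotonicity, unitary invariance of the trace norm, and the assumed $\epsilon$-closeness of $\psi^{CE}$ and $\phi^{CE}$:

$$\bigl\|\psi_U^{BE} - \phi_U^{BE}\bigr\|_1 \le \bigl\|\psi_U^{SBE} - \phi_U^{SBE}\bigr\|_1 = \bigl\|\psi^{CE} - \phi^{CE}\bigr\|_1 \le \epsilon. \tag{28}$$

Similarly, the last term satisfies

$$\bigl\|\pi^{B}\otimes\psi^{E} - \pi^{B}\otimes\phi^{E}\bigr\|_1 = \bigl\|\psi^{E} - \phi^{E}\bigr\|_1 \le \bigl\|\psi^{CE} - \phi^{CE}\bigr\|_1 \le \epsilon.$$

Because $x \mapsto x^2$ is convex, the integral of the first term satisfies

$$\left(\int_{\mathcal{U}(C)}\bigl\|\phi_U^{BE} - \pi^{B}\otimes\phi^{E}\bigr\|_1\,dU\right)^2 \le \int_{\mathcal{U}(C)}\bigl\|\phi_U^{BE} - \pi^{B}\otimes\phi^{E}\bigr\|_1^2\,dU.$$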

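The lemma follows by applying Lemma 7 to this integral.

Proof of Lemma 5: To begin, note that we may choose the complex phases so that $\langle k|\langle\psi|L|\psi'_k\rangle = |\langle k|\langle\psi|L|\psi'_k\rangle|$. Now

$$\langle k|\langle\psi|L|\psi'_k\rangle \ge \bigl(\langle k|\langle\psi|L|\psi'_k\rangle\bigr)^2 \ge |\langle k|\langle\psi|L|\psi_k\rangle|^2 - \|L(\psi_k) - L(\psi'_k)\|_1.$$

Because $0 \le \Lambda_k \le \mathbb{1}$, we have

$$|\langle k|\langle\psi|L|\psi_k\rangle|^2 = |\langle\psi_k|\sqrt{\Lambda_k}|\psi_k\rangle|^2 \ge |\langle\psi_k|\Lambda_k|\psi_k\rangle|^2 = \bigl(\operatorname{Tr}\psi_k\Lambda_k\bigr)^2.$$

Furthermore, unitary invariance of the trace norm and (5) imply that

$$\|L(\psi_k) - L(\psi'_k)\|_1 = \|\psi_k - \psi'_k\|_1 \le 2\sqrt{1 - |\langle\psi_k|\psi'_k\rangle|^2}.$$

Therefore,

$$\langle k|\langle\psi|L|\psi'_k\rangle \ge \bigl(\operatorname{Tr}\psi_k\Lambda_k\bigr)^2 - 2\sqrt{1 - |\langle\psi_k|\psi'_k\rangle|^2}.$$

Finally, because the functions $x \mapsto x^2$ and $x \mapsto -\sqrt{x}$ are convex, we find that

$$\frac{1}{\kappa}\sum_{k=1}^{\kappa}\langle k|\langle\psi|L|\psi'_k\rangle \ge (1-P)^2 - 2\sqrt{1-F} \ge 1 - 2\bigl(P + \sqrt{1-F}\bigr)$$

as required.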
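For readers who want the averaging step spelled out, the display below sketches it using the shorthand $a_k = \operatorname{Tr}\psi_k\Lambda_k$ and $f_k = |\langle\psi_k|\psi'_k\rangle|^2$ (both abbreviations are introduced here for convenience), so that $\frac{1}{\kappa}\sum_k a_k = 1 - P$ and $\frac{1}{\kappa}\sum_k f_k = F$. The first two lines are Jensen's inequality for $x \mapsto x^2$ and $x \mapsto -\sqrt{x}$; the third drops the nonnegative term $P^2$.

```latex
\begin{align*}
\frac{1}{\kappa}\sum_{k=1}^{\kappa} a_k^{2}
  &\ge \Bigl(\frac{1}{\kappa}\sum_{k=1}^{\kappa} a_k\Bigr)^{2} = (1-P)^{2}, \\
\frac{1}{\kappa}\sum_{k=1}^{\kappa} \sqrt{1-f_k}
  &\le \sqrt{1 - \frac{1}{\kappa}\sum_{k=1}^{\kappa} f_k} = \sqrt{1-F}, \\
(1-P)^{2} - 2\sqrt{1-F}
  &= 1 - 2P + P^{2} - 2\sqrt{1-F}
  \;\ge\; 1 - 2\bigl(P + \sqrt{1-F}\bigr).
\end{align*}
```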

REFERENCES

[1] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379–423 and 623–656, July and October 1948.

[2] D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. 19, pp. 471–480, 1973.

[3] B. Schumacher, “Quantum coding,” Phys. Rev. A, vol. 51, no. 4, pp. 2738–2747, Apr 1995.

[4] Z. Luo and I. Devetak, “Channel simulation with quantum side information,” IEEE Trans. Inform. Theory, vol. 55, no. 3, pp. 1331–1342, 2009, arXiv:quant-ph/0611008.

[5] E. Lieb and M. B. Ruskai, “Proof of the strong subadditivity of quantum-mechanical entropy,” J. Math. Phys., vol. 14, no. 12, pp. 1938–1941, 1973.

[6] B. Groisman, S. Popescu, and A. Winter, “On the quantum, classical and total amount of correlations in a quantum state,” Phys. Rev. A, vol. 72, p. 032317, 2005, arXiv:quant-ph/0410091.

[7] M. Horodecki, J. Oppenheim, and A. Winter, “Partial quantum information,” Nature, vol. 436, pp. 673–676, 2005, arXiv:quant-ph/0505062.

[8] I. Devetak and J. Yard, “Exact cost of redistributing multipartite quantum states,” Phys. Rev. Lett., vol. 100, no. 23, p. 230501, June 2008, arXiv:quant-ph/0612050.

[9] T. M. Cover and W. H. R. Equitz, “Successive refinement of information,” IEEE Trans. Inform. Theory, vol. 37, no. 2, pp. 269–275, 1991.

[10] M. Horodecki, J. Oppenheim, and A. Winter, “Quantum state merging and negative information,” Commun. Math. Phys., vol. 269, no. 1, pp. 107–136, January 2007, arXiv:quant-ph/0512247.

[11] I. Devetak, “A triangle of dualities: reversibly decomposable channels, source-channel duality, and time reversal,” Phys. Rev. Lett., vol. 97, p. 140503, 2006, arXiv:quant-ph/0505138.

[12] A. Abeyesinghe, I. Devetak, P. Hayden, and A. Winter, “The mother of all protocols: Restructuring quantum information's family tree,” 2006, arXiv:quant-ph/0606225.

[13] C. H. Bennett, H. J. Bernstein, S. Popescu, and B. Schumacher, “Concentrating partial entanglement by local operations,” Phys. Rev. A, vol. 53, no. 4, pp. 2046–2052, Apr 1996.

[14] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge, UK: Cambridge University Press, 2000.

[15] T. M. Cover and J. A. Thomas, Elements of Information Theory, ser. Series in Telecommunication. New York: John Wiley and Sons, 1991.

[16] N. J. Cerf and C. Adami, “Negative entropy and information in quantum mechanics,” Phys. Rev. Lett., vol. 79, pp. 5194–5197, 1997.

[17] I. Devetak, A. W. Harrow, and A. Winter, “A family of quantum protocols,” Phys. Rev. Lett., vol. 93, p. 230504, 2004, arXiv:quant-ph/0308044.

[18] I. Devetak, A. W. Harrow, and A. Winter, “A resource framework for quantum Shannon theory,” IEEE Trans. Inform. Theory, vol. 54, no. 10, pp. 4587–4618, 2008, arXiv:quant-ph/0512015.

[19] A. W. Harrow, “Coherent communication of classical messages,” Phys. Rev. Lett., vol. 92, p. 097902, 2004, arXiv:quant-ph/0307091.

[20] P. Hayden and A. Winter, “Communication cost of entanglement transformations,” Phys. Rev. A, vol. 67, no. 1, p. 012326, Jan 2003, arXiv:quant-ph/0204092.

[21] J. Yard, I. Devetak, and P. Hayden, “Capacity theorems for quantum multiple access channels – Classical-quantum and quantum-quantum capacity regions,” IEEE Trans. Inform. Theory, vol. 54, no. 7, pp. 3091–3113, August 2008, arXiv:quant-ph/0501045.

[22] M. Horodecki, P. Horodecki, R. Horodecki, D. Leung, and B. Terhal, “Classical capacity of a noiseless quantum channel assisted by noisy entanglement,” Quantum Information and Computation, vol. 1, no. 3, pp. 70–78, 2001.

[23] C. H. Bennett, P. W. Shor, J. A. Smolin, and A. V. Thapliyal, “Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem,” IEEE Trans. Inform. Theory, vol. 48, no. 10, p. 2637, 2002, arXiv:quant-ph/0106052.

[24] M. Hayashi and H. Nagaoka, “General formulas for capacity of classical-quantum channels,” IEEE Trans. Inform. Theory, vol. 49, pp. 1753–1768, 2003.

[25] B. Schumacher, “Sending entanglement through noisy quantum channels,” Phys. Rev. A, vol. 54, no. 4, pp. 2614–2628, 1996, arXiv:quant-ph/9604023.

[26] B. Schumacher and M. Westmoreland, “Quantum mutual information and the one-time pad,” arXiv:quant-ph/0604207.

[27] A. Harrow and H. K. Lo, “A tight lower bound on the classical communication cost of entanglement dilution,” IEEE Trans. Inform. Theory, vol. 50, no. 2, pp. 319–327, Feb. 2004, arXiv:quant-ph/0204096.

[28] B. Fortescue and H.-K. Lo, “Inefficiency and classical communication bounds for conversion between partially entangled pure bipartite states,” Phys. Rev. A, vol. 72, no. 3, p. 032336, Sep 2005.

[29] J. Oppenheim and A. Winter, “Uncommon information,” arXiv:quant-ph/0511082.

[30] J. Oppenheim, “State redistribution as merging: introducing the coherent relay,” May 2008, arXiv:0805.1065.

[31] M. Ye, Y. Bai, and Z. D. Wang, “Quantum state redistribution based on a generalized decoupling,” May 2008, arXiv:0805.1542.
