
Strong converse exponent for classical-quantum channel coding

Milán Mosonyi¹,² and Tomohiro Ogawa³

¹ Física Teòrica: Informació i Fenòmens Quàntics, Universitat Autònoma de Barcelona
² Mathematical Institute, Budapest University of Technology and Economics
³ Graduate School of Information Systems, University of Electro-Communications, Tokyo

Beyond I.I.D., Banff 2015

Main result

• When coding at a rate $R$ above the Holevo capacity of a classical-quantum channel $W:\mathcal X\to\mathcal S(\mathcal H)$, the optimal asymptotics of the success probability is
$$P_s \sim e^{-nH_{R,c}(W)},\qquad H_{R,c}(W):=\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\Big[R-\sup_{P}\inf_{\sigma}\sum_{x\in\mathcal X}P(x)\,D_\alpha^*(W(x)\|\sigma)\Big],$$
where $D_\alpha^*(\cdot\|\cdot)$ is the sandwiched Rényi divergence.

• This gives an operational interpretation of a Rényi quantity outside the context of hypothesis testing.

• The proof uses a new family of quantum Rényi divergences:
$$D_\alpha^\flat(\varrho\|\sigma):=\frac{1}{\alpha-1}\log\operatorname{Tr}e^{\alpha\log\varrho+(1-\alpha)\log\sigma}.$$


Strong converse things

• Achievable rate: coding below this rate, the error probability can be made to vanish asymptotically.

• Strong converse rate: coding above this rate forces the error probability to go to one asymptotically.

• Strong converse property: the smallest strong converse rate coincides with the largest achievable rate.

• Strong converse exponent: the exact exponent of the decaying success probability at a given rate above the smallest strong converse rate.



Binary state discrimination

• Two candidates for the true state of a system: $H_0:\varrho$ vs. $H_1:\sigma$.

• Many identical copies are available: $H_0:\varrho^{\otimes n}$ vs. $H_1:\sigma^{\otimes n}$.

• The decision is based on a binary POVM $(T,\,I-T)$ on $\mathcal H^{\otimes n}$.

• Error probabilities: $\alpha_n(T):=\operatorname{Tr}\varrho^{\otimes n}(I_n-T)$ (first kind), $\beta_n(T):=\operatorname{Tr}\sigma^{\otimes n}T$ (second kind).

• Trade-off: $\min_{0\le T\le I}\{\alpha_n(T)+\beta_n(T)\}>0$ unless $\varrho^{\otimes n}\perp\sigma^{\otimes n}$.

• Quantum Stein's lemma:¹ $\alpha_n(T_n)\to 0 \implies \beta_n(T_n)\sim e^{-nD(\varrho\|\sigma)}$ is the optimal decay, where $D(\varrho\|\sigma):=\operatorname{Tr}\varrho(\log\varrho-\log\sigma)$ is the relative entropy.²

¹ Hiai, Petz 1991; Ogawa, Nagaoka 2001. ² Umegaki 1962.



Quantifying the trade-off

• Stein's lemma: $\alpha_n(T_n)\to 0 \implies \beta_n(T_n)\sim e^{-nD_1(\varrho\|\sigma)}$.

• Direct domain: quantum Hoeffding bound¹
$$\beta_n(T_n)\sim e^{-nr}\ \implies\ \alpha_n(T_n)\sim e^{-nH_r},\qquad r<D_1(\varrho\|\sigma).$$
Converse domain: quantum Han–Kobayashi bound²
$$\beta_n(T_n)\sim e^{-nr}\ \implies\ \alpha_n(T_n)\sim 1-e^{-nH^*_r},\qquad r>D_1(\varrho\|\sigma).$$

• Hoeffding (anti-)divergences:
$$H_r:=\sup_{0<\alpha<1}\frac{\alpha-1}{\alpha}\big[r-D_\alpha(\varrho\|\sigma)\big],\qquad H^*_r:=\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[r-D^*_\alpha(\varrho\|\sigma)\big].$$

¹ Hayashi; Nagaoka; Audenaert, Nussbaum, Szkoła, Verstraete; 2006. ² Mosonyi, Ogawa 2013.

Quantum Rényi divergences

• Quantum Rényi divergences:¹
$$D_\alpha(\varrho\|\sigma):=\frac{1}{\alpha-1}\log\operatorname{Tr}\varrho^\alpha\sigma^{1-\alpha},\qquad D^*_\alpha(\varrho\|\sigma):=\frac{1}{\alpha-1}\log\operatorname{Tr}\big(\varrho^{\frac12}\sigma^{\frac{1-\alpha}{\alpha}}\varrho^{\frac12}\big)^{\alpha}.$$

• The right quantum definition is
$$D^q_\alpha(\varrho\|\sigma):=\begin{cases} D_\alpha(\varrho\|\sigma), & \alpha\in[0,1),\\ D^*_\alpha(\varrho\|\sigma), & \alpha\in(1,+\infty].\end{cases}$$

• Supported by further binary hypothesis testing results: Hayashi, Tomamichel 2014; Cooney, Mosonyi, Wilde 2014.

¹ Petz 1986; Müller-Lennert, Dupuis, Szehr, Fehr, Tomamichel 2013; Wilde, Winter, Yang 2013.


Yet another quantum Rényi divergence

• New quantum Rényi divergence connected to classical variational formulas:
$$D^\flat_\alpha(\varrho\|\sigma):=\frac{1}{\alpha-1}\log\operatorname{Tr}e^{\alpha\log\varrho+(1-\alpha)\log\sigma}.$$

A new Rényi divergence

Hoeffding (anti-)divergences in the commuting case $\varrho\sigma=\sigma\varrho$:
$$H_r=\sup_{0<\alpha<1}\frac{\alpha-1}{\alpha}\big[r-D_\alpha(\varrho\|\sigma)\big]=\inf_{D(\tau\|\sigma)\le r}D(\tau\|\varrho),$$
$$H^*_r=\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[r-D^*_\alpha(\varrho\|\sigma)\big]=\inf_{D(\tau\|\sigma)\le r}\big\{D(\tau\|\varrho)+r-D(\tau\|\sigma)\big\}.$$

In the noncommuting case $\varrho\sigma\ne\sigma\varrho$, both equalities fail in general.

With the new divergence $D^\flat_\alpha$, the variational identities are restored even when $\varrho\sigma\ne\sigma\varrho$:
$$\sup_{0<\alpha<1}\frac{\alpha-1}{\alpha}\big[r-D^\flat_\alpha(\varrho\|\sigma)\big]=\inf_{D(\tau\|\sigma)\le r}D(\tau\|\varrho),$$
$$\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[r-D^\flat_\alpha(\varrho\|\sigma)\big]=\inf_{D(\tau\|\sigma)\le r}\big\{D(\tau\|\varrho)+r-D(\tau\|\sigma)\big\}.$$

Direct Rényi divergence

$$Q_\alpha(\varrho\|\sigma):=\operatorname{Tr}\varrho^\alpha\sigma^{1-\alpha}$$
"old", "Petz type", "WYD type", "$f$-divergence type", "non-sandwiched" — here: "direct Rényi divergence".

• Operational interpretation: $\alpha\in(0,1)$, Hoeffding bound.¹

• Recovery from the classical case: Nussbaum–Szkoła distributions.² For $\varrho=\sum_i r_iP_i$, $\sigma=\sum_j s_jQ_j$, set
$$p(i,j):=r_i\operatorname{Tr}P_iQ_j,\qquad q(i,j):=s_j\operatorname{Tr}P_iQ_j;\qquad\text{then}\quad Q_\alpha(\varrho\|\sigma)=Q_\alpha(p\|q).$$

• Variational expression: ?

• Joint convexity/concavity and monotonicity:³ $\alpha\in[0,2]$.

¹ Hayashi 2006; Nagaoka 2006. ² Nussbaum, Szkoła 2006. ³ Lieb 1973; Ando 1979; Petz 1986.

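The Nussbaum–Szkoła construction is concrete enough to check numerically. Below is a minimal numpy sketch (the helper names and the random test states are mine, not from the slides): it builds $(p,q)$ from the spectral data of two random full-rank states and verifies $Q_\alpha(\varrho\|\sigma)=Q_\alpha(p\|q)$ for a few values of $\alpha$.

```python
import numpy as np

def nussbaum_szkola(rho, sigma):
    """Nussbaum-Szkola pair (p, q) built from the spectral data of rho, sigma."""
    r, U = np.linalg.eigh(rho)
    s, V = np.linalg.eigh(sigma)
    overlap = np.abs(U.conj().T @ V) ** 2   # Tr P_i Q_j for rank-1 eigenprojections
    p = r[:, None] * overlap
    q = s[None, :] * overlap
    return p.ravel(), q.ravel()

def Q_petz(rho, sigma, a):
    """Q_alpha(rho||sigma) = Tr rho^a sigma^(1-a) via eigendecompositions."""
    wr, U = np.linalg.eigh(rho)
    ws, V = np.linalg.eigh(sigma)
    A = (U * wr**a) @ U.conj().T
    B = (V * ws**(1 - a)) @ V.conj().T
    return np.trace(A @ B).real

rng = np.random.default_rng(7)
G1 = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
G2 = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = G1 @ G1.conj().T; rho /= np.trace(rho).real
sigma = G2 @ G2.conj().T; sigma /= np.trace(sigma).real

p, q = nussbaum_szkola(rho, sigma)
for a in (0.3, 0.7, 1.5):
    assert abs(np.sum(p**a * q**(1 - a)) - Q_petz(rho, sigma, a)) < 1e-8
```

Note that $p$ and $q$ are genuine probability distributions on pairs $(i,j)$, so the quantum quantity is literally a classical Rényi quantity of this pair.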

Converse Rényi divergence

$$Q^*_\alpha(\varrho\|\sigma):=\operatorname{Tr}\big(\varrho^{\frac12}\sigma^{\frac{1-\alpha}{\alpha}}\varrho^{\frac12}\big)^{\alpha}$$
"new", "sandwiched", "minimal" — here: "converse Rényi divergence".

• Operational interpretation:¹ $\alpha\in(1,+\infty)$, strong converse exponent.

• Recovery from the classical case:¹,² with $\sigma=\sum_j s_jQ_j$ and the pinching map $\mathcal P_\sigma(X):=\sum_j Q_jXQ_j$,
$$D^*_\alpha(\varrho\|\sigma)=\lim_{n\to+\infty}\frac1n D_\alpha\big(\mathcal P_{\sigma^{\otimes n}}\varrho^{\otimes n}\,\big\|\,\sigma^{\otimes n}\big).$$

• Variational expression:³ with $s(\alpha)=\operatorname{sign}(\alpha-1)$,
$$Q^*_\alpha(\varrho\|\sigma)=s(\alpha)\sup_{H\ge0}s(\alpha)\Big\{\alpha\operatorname{Tr}H\varrho+(1-\alpha)\operatorname{Tr}\big(H^{\frac12}\sigma^{\frac{\alpha-1}{\alpha}}H^{\frac12}\big)^{\frac{\alpha}{\alpha-1}}\Big\}.$$

• Joint convexity/concavity and monotonicity: $\alpha\in[1/2,+\infty)$.

¹ Mosonyi, Ogawa 2013; ² Hayashi, Tomamichel 2014; ³ Frank, Lieb 2013.

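The pinching map is easy to realize numerically. A minimal numpy sketch (function names and random test data are mine): it implements $\mathcal P_\sigma$, checks that the pinched state commutes with $\sigma$ and keeps the trace, and checks the single-copy data-processing relation $D_\alpha(\mathcal P_\sigma\varrho\|\sigma)\le D^*_\alpha(\varrho\|\sigma)$, consistent with the limit formula above.

```python
import numpy as np

def mpow(A, p):
    """Fractional power of a Hermitian PSD matrix (eigenvalues > 0 assumed if p < 0)."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.conj().T

def rand_state(d, seed):
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def pinching(sigma, X, tol=1e-8):
    """P_sigma(X) = sum_j Q_j X Q_j, with Q_j the spectral projections of sigma."""
    w, V = np.linalg.eigh(sigma)
    out = np.zeros_like(X)
    done = np.zeros(len(w), dtype=bool)
    for i in range(len(w)):
        if done[i]:
            continue
        idx = np.abs(w - w[i]) < tol      # group (numerically) equal eigenvalues
        done |= idx
        Q = V[:, idx] @ V[:, idx].conj().T
        out = out + Q @ X @ Q
    return out

rho, sigma = rand_state(3, 0), rand_state(3, 1)
pinched = pinching(sigma, rho)

# the pinched state commutes with sigma and keeps the trace
assert np.allclose(pinched @ sigma, sigma @ pinched)
assert np.isclose(np.trace(pinched).real, 1.0)

# single-copy data processing: D_alpha(P_sigma(rho)||sigma) <= D*_alpha(rho||sigma)
a = 2.0
D_pinched = np.log(np.trace(mpow(pinched, a) @ mpow(sigma, 1 - a)).real) / (a - 1)
h = mpow(rho, 0.5)
D_sand = np.log(np.trace(mpow(h @ mpow(sigma, (1 - a) / a) @ h, a)).real) / (a - 1)
assert D_pinched <= D_sand + 1e-9
```

Since $\mathcal P_\sigma\varrho$ and $\sigma$ commute, the left-hand quantity is effectively classical, which is exactly what the limit formula exploits.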

Variational Rényi divergence

$$Q^\flat_\alpha(\varrho\|\sigma):=\operatorname{Tr}e^{\alpha\log\varrho+(1-\alpha)\log\sigma}$$

• Operational interpretation: ?

• Recovery from the classical case: ?

• Variational expression:¹
$$Q^\flat_\alpha(\varrho\|\sigma)=\max_{\tau\ge0}\big\{\operatorname{Tr}\tau-\alpha D(\tau\|\varrho)-(1-\alpha)D(\tau\|\sigma)\big\},$$
$$\log Q^\flat_\alpha(\varrho\|\sigma)=\max_{\tau\in\mathcal S(\mathcal H)}\big\{-\alpha D(\tau\|\varrho)-(1-\alpha)D(\tau\|\sigma)\big\},$$
with the maximum attained at $\tau_\alpha=e^{\alpha\log\varrho+(1-\alpha)\log\sigma}/\operatorname{Tr}(\ldots)$.

• Joint concavity and monotonicity: $\alpha\in[0,1]$.

¹ Mosonyi, Ogawa 2013.


Variational Rényi divergence

$$Q^\flat_\alpha(\varrho\|\sigma):=\operatorname{Tr}e^{\alpha\log\varrho+(1-\alpha)\log\sigma}$$

• Variational expression:
$$Q^\flat_\alpha(\varrho\|\sigma)=\max_{\tau\ge0}\big\{\operatorname{Tr}\tau-\alpha D(\tau\|\varrho)-(1-\alpha)D(\tau\|\sigma)\big\},\qquad \log Q^\flat_\alpha(\varrho\|\sigma)=\max_{\tau\in\mathcal S(\mathcal H)}\big\{-\alpha D(\tau\|\varrho)-(1-\alpha)D(\tau\|\sigma)\big\}.$$

• Equivalent forms:¹
$$\operatorname{Tr}e^{H+\log A}=\max_{\tau>0}\big\{\operatorname{Tr}\tau+\operatorname{Tr}\tau H-D(\tau\|A)\big\},\qquad \log\operatorname{Tr}e^{H+\log A}=\max_{\tau\in\mathcal S(\mathcal H)}\big\{\operatorname{Tr}\tau H-D(\tau\|A)\big\}.$$

• Recovered with $H=\alpha\log\varrho$, $A=\sigma^{1-\alpha}$.

¹ Tropp 2011; Hiai, Petz 1993.

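The closed-form maximizer makes the variational formula easy to sanity-check. A minimal numpy sketch (helper names and random test states are mine): for $\alpha\in(0,1)$ it verifies that $\tau_\alpha$ attains $\log Q^\flat_\alpha$ exactly.

```python
import numpy as np

def eigfun(A, f):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def rand_state(d, seed):
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def rel_ent(t, s):
    """Umegaki relative entropy D(t||s) for full-rank states."""
    return np.trace(t @ (eigfun(t, np.log) - eigfun(s, np.log))).real

a = 0.5
rho, sigma = rand_state(3, 0), rand_state(3, 1)
M = a * eigfun(rho, np.log) + (1 - a) * eigfun(sigma, np.log)
Q_flat = np.trace(eigfun(M, np.exp)).real

# the maximizer tau_alpha = e^M / Tr e^M attains log Q_flat in the variational formula
tau = eigfun(M, np.exp) / Q_flat
value = -a * rel_ent(tau, rho) - (1 - a) * rel_ent(tau, sigma)
assert abs(value - np.log(Q_flat)) < 1e-8
```

The check is in fact an algebraic identity: with $\log\tau_\alpha=M-\log Q^\flat_\alpha$, the weighted relative entropies telescope to $\log Q^\flat_\alpha$.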

Ordering

$$D^*_\alpha\ \le\ D_\alpha\ \le\ D^\flat_\alpha,\quad \alpha\in[0,1),\qquad\qquad D^\flat_\alpha\ \le\ D^*_\alpha\ \le\ D_\alpha,\quad \alpha>1.$$

One set of inequalities is equivalent to the Araki–Lieb–Thirring inequality (the comparison of $D^*_\alpha$ with $D_\alpha$), the other to the Golden–Thompson inequality (the comparisons involving $D^\flat_\alpha$).
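These orderings can be observed numerically. A minimal numpy sketch (all helper names and random test states are mine) evaluates the three divergences on random full-rank states and checks both chains.

```python
import numpy as np

def eigfun(A, f):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def rand_state(d, seed):
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def D_petz(r, s, a):
    Q = np.trace(eigfun(r, lambda w: w**a) @ eigfun(s, lambda w: w**(1 - a))).real
    return np.log(Q) / (a - 1)

def D_sand(r, s, a):
    h = eigfun(r, np.sqrt)
    inner = h @ eigfun(s, lambda w: w**((1 - a) / a)) @ h
    return np.log(np.trace(eigfun(inner, lambda w: w**a)).real) / (a - 1)

def D_flat(r, s, a):
    M = a * eigfun(r, np.log) + (1 - a) * eigfun(s, np.log)
    return np.log(np.trace(eigfun(M, np.exp)).real) / (a - 1)

rho, sigma = rand_state(4, 2), rand_state(4, 3)
eps = 1e-10
for a in (0.3, 0.8):                  # alpha in (0, 1)
    assert D_sand(rho, sigma, a) <= D_petz(rho, sigma, a) + eps
    assert D_petz(rho, sigma, a) <= D_flat(rho, sigma, a) + eps
for a in (1.5, 3.0):                  # alpha > 1
    assert D_flat(rho, sigma, a) <= D_sand(rho, sigma, a) + eps
    assert D_sand(rho, sigma, a) <= D_petz(rho, sigma, a) + eps
```

For commuting inputs all three coincide; the gaps only open up for noncommuting pairs such as the random states used here.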

Classical-quantum channels: definition

• Classical-quantum channel: $W:\mathcal X\to\mathcal S(\mathcal H)$, with an arbitrary input alphabet $\mathcal X$.
Special case: $\mathcal X=\mathcal S(\mathcal H_A)$ and $W$ CPTP — a quantum channel.

• I.i.d. extensions:
$$W^{\otimes n}:\ x\mapsto W(x_1)\otimes\ldots\otimes W(x_n),\qquad x\in\mathcal X^n.$$
Special case: a quantum channel with product encoding.

Holevo capacity

• Lifted channel:
$$\mathcal W:\mathcal X\to\mathcal S(\mathcal H_X\otimes\mathcal H),\qquad \mathcal W(x):=|x\rangle\langle x|\otimes W(x),$$
where $\{|x\rangle\langle x|:x\in\mathcal X\}$ is a set of orthogonal rank-1 projections on $\mathcal H_X$.

• For a finitely supported probability distribution $P\in\mathcal P_f(\mathcal X)$:
$$\mathcal W(P):=\sum_{x\in\mathcal X}P(x)|x\rangle\langle x|\otimes W(x),$$
with marginals
$$\operatorname{Tr}_{\mathcal H}\mathcal W(P)=\sum_{x\in\mathcal X}P(x)|x\rangle\langle x|=:\hat P,\qquad \operatorname{Tr}_{\mathcal H_X}\mathcal W(P)=\sum_{x\in\mathcal X}P(x)W(x)=:W(P).$$

• Holevo quantity:
$$\chi(W,P):=D\big(\mathcal W(P)\,\big\|\,\hat P\otimes W(P)\big).$$
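For a finite alphabet the Holevo quantity is directly computable. A minimal numpy sketch (the toy two-input qubit channel and all helper names are mine, purely illustrative): it evaluates $\chi(W,P)$ both from the lifted-state definition above and via the standard identity $\chi(W,P)=\sum_xP(x)D(W(x)\|W(P))$.

```python
import numpy as np

def logm_psd(A, tol=1e-12):
    """Matrix log of a PSD Hermitian matrix on its support (0 log 0 := 0)."""
    w, V = np.linalg.eigh(A)
    lw = np.where(w > tol, np.log(np.maximum(w, tol)), 0.0)
    return (V * lw) @ V.conj().T

def rel_ent(r, s):
    """Umegaki relative entropy, assuming supp(r) is contained in supp(s)."""
    return np.trace(r @ (logm_psd(r) - logm_psd(s))).real

# toy cq channel: two pure qubit outputs with overlap cos(theta)
theta = 0.4
v0 = np.array([1.0, 0.0]); v1 = np.array([np.cos(theta), np.sin(theta)])
W = [np.outer(v0, v0.conj()).astype(complex), np.outer(v1, v1.conj()).astype(complex)]
P = [0.5, 0.5]
Wavg = sum(p * w for p, w in zip(P, W))            # average output W(P)

# chi as an average divergence ...
chi_avg = sum(p * rel_ent(w, Wavg) for p, w in zip(P, W))

# ... and from the lifted state W(P) = sum_x P(x)|x><x| (x) W(x)
lifted = np.zeros((4, 4), dtype=complex)
lifted[:2, :2] = P[0] * W[0]
lifted[2:, 2:] = P[1] * W[1]
ref = np.kron(np.diag(P).astype(complex), Wavg)    # P_hat (x) W(P)
chi_lifted = rel_ent(lifted, ref)

assert abs(chi_avg - chi_lifted) < 1e-8
```

The two forms agree because the lifted state is block-diagonal in the classical register, so the relative entropy decomposes into the average single-letter divergences.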

Rényi capacities

$$\mathcal W(P):=\sum_{x\in\mathcal X}P(x)|x\rangle\langle x|\otimes W(x)$$

• Rényi mutual informations:
$$\chi^{(t)}_{\alpha,1}(W,P):=\inf_{\sigma\in\mathcal S(\mathcal H)}D^{(t)}_\alpha\big(\mathcal W(P)\,\big\|\,\hat P\otimes\sigma\big),\qquad \chi^{(t)}_{\alpha,2}(W,P):=\inf_{\sigma\in\mathcal S(\mathcal H)}\sum_xP(x)\,D^{(t)}_\alpha(W(x)\|\sigma).$$

• Rényi capacities:
$$\chi^{(t)}_\alpha(W):=\sup_{P\in\mathcal P_f(\mathcal X)}\chi^{(t)}_{\alpha,1}(W,P)=\sup_{P\in\mathcal P_f(\mathcal X)}\chi^{(t)}_{\alpha,2}(W,P)=\inf_{\sigma\in\mathcal S(\mathcal H)}\sup_{x\in\mathcal X}D^{(t)}_\alpha(W(x)\|\sigma).$$


Classical-quantum channels: codes

• Code: $\mathcal C_n=(\mathcal E_n,\mathcal D_n)$, with $\mathcal E_n=(x_1,\ldots,x_{M_n})\in(\mathcal X^n)^{M_n}$ and a POVM $(\mathcal D_n(1),\ldots,\mathcal D_n(M_n))$.

• Size of the code: $|\mathcal C_n|=M_n$.

• Average success probability:
$$P_s(\mathcal C_n):=\frac{1}{|\mathcal C_n|}\sum_{m=1}^{|\mathcal C_n|}\operatorname{Tr}W^{\otimes n}(\mathcal E_n(m))\,\mathcal D_n(m).$$

• Channel capacity:¹
$$C(W):=\sup_{\{\mathcal C_n\}_{n\in\mathbb N}}\Big\{\liminf_{n\to+\infty}\frac1n\log|\mathcal C_n|\ :\ \lim_{n\to+\infty}P_s(\mathcal C_n)=1\Big\}=\chi(W).$$

• Strong converse exponent:
$$|\mathcal C_n|\sim e^{nr}\ \implies\ P_s(\mathcal C_n)\sim e^{-n\cdot sc(r)}.$$

¹ Holevo 1997; Schumacher, Westmoreland 1997.
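The success-probability formula is easy to exercise on a toy instance. A minimal numpy sketch (the two-codeword pure-state channel and the choice of the square-root "pretty good" measurement as decoder are mine, purely illustrative): it evaluates $P_s(\mathcal C_1)$ for a one-shot, two-message code.

```python
import numpy as np

def mpow(A, p, tol=1e-12):
    """Fractional power of a Hermitian PSD matrix, restricted to its support."""
    w, V = np.linalg.eigh(A)
    wp = np.where(w > tol, np.maximum(w, tol)**p, 0.0)
    return (V * wp) @ V.conj().T

# toy cq channel: two pure qubit outputs; codewords = the two single-letter inputs
theta = 0.3
v0 = np.array([1.0, 0.0]); v1 = np.array([np.cos(theta), np.sin(theta)])
W = [np.outer(v0, v0.conj()).astype(complex), np.outer(v1, v1.conj()).astype(complex)]

# square-root ("pretty good") measurement for the uniform two-message code
S = W[0] + W[1]
Sinv = mpow(S, -0.5)
povm = [Sinv @ w @ Sinv for w in W]       # POVM elements, summing to the identity here

Ps = 0.5 * sum(np.trace(W[m] @ povm[m]).real for m in range(2))
assert 0.5 < Ps < 1.0                     # better than guessing, but imperfect overlap
```

Any other decoder POVM could be plugged into the same formula; the square-root measurement is just a convenient, channel-adapted choice.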

Strong converse exponent: lower bound

Lemma:
$$sc(r)\ \ge\ \sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[r-\chi^*_\alpha(W)\big].$$

This follows by a standard argument due to Nagaoka, using the monotonicity of $D^*_\alpha$ for $\alpha>1$.
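A sketch of the Nagaoka-type argument, reconstructed from the cited method (the flagged-state bookkeeping below is my paraphrase, not taken from the slides):

```latex
\begin{align*}
  &\text{Encode the code into flagged states and test with the decoder:}\\
  &\Omega_n := \tfrac{1}{M_n}\textstyle\sum_m |m\rangle\langle m|\otimes W^{\otimes n}(\mathcal E_n(m)),\qquad
   \Sigma_n := \tfrac{1}{M_n}\textstyle\sum_m |m\rangle\langle m|\otimes \sigma^{\otimes n},\\
  &T := \textstyle\sum_m |m\rangle\langle m|\otimes \mathcal D_n(m),\qquad
   \operatorname{Tr}\Omega_n T = P_s(\mathcal C_n),\qquad
   \operatorname{Tr}\Sigma_n T = M_n^{-1} = e^{-nr}.\\[2pt]
  &\text{Monotonicity of } D^*_\alpha\ (\alpha>1) \text{ under the measurement } (T,\,I-T) \text{ gives}\\
  &D^*_\alpha(\Omega_n\|\Sigma_n)
     \;\ge\; \tfrac{1}{\alpha-1}\log\!\big(P_s(\mathcal C_n)^\alpha\, e^{-nr(1-\alpha)}\big)
     \;=\; \tfrac{\alpha}{\alpha-1}\log P_s(\mathcal C_n) + nr.\\[2pt]
  &\text{Optimizing over } \sigma \text{ and using additivity, } D^*_\alpha(\Omega_n\|\Sigma_n)\le n\,\chi^*_\alpha(W), \text{ hence}\\
  &\tfrac1n\log P_s(\mathcal C_n) \;\le\; -\tfrac{\alpha-1}{\alpha}\big[r-\chi^*_\alpha(W)\big].
\end{align*}
```

Taking the supremum over $\alpha>1$ yields the Lemma.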

Dueck-Körner upper bound

Theorem:¹
$$sc(r)\ \le\ \inf_{P\in\mathcal P_f(\mathcal X)}\ \inf_{V:\mathcal X\to\mathcal S(\mathcal H)}\big\{D(\mathcal V(P)\|\mathcal W(P))+|r-\chi(V,P)|^+\big\}.$$

Proof idea:
$$\operatorname{Tr}\big(V^{\otimes n}(\mathcal E_n(k))-e^{na}W^{\otimes n}(\mathcal E_n(k))\big)_+\ \ge\ \operatorname{Tr}\big(V^{\otimes n}(\mathcal E_n(k))-e^{na}W^{\otimes n}(\mathcal E_n(k))\big)\mathcal D_n(k),$$
and hence
$$P_s(W^{\otimes n},\mathcal C_n)\ \ge\ e^{-na}\Big\{P_s(V^{\otimes n},\mathcal C_n)-\frac{1}{M_n}\sum_{k=1}^{M_n}\operatorname{Tr}\big(V^{\otimes n}(\mathcal E_n(k))-e^{na}W^{\otimes n}(\mathcal E_n(k))\big)_+\Big\}.$$

¹ Dueck, Körner 1979; Mosonyi, Ogawa 2014.

Dueck-Körner upper bound

Proof idea (continued): take the expectation $\mathbb E$ over random coding with $M_n=\lceil e^{nr}\rceil$:
$$\mathbb E\big[P_s(W^{\otimes n},\mathcal C_n)\big]\ \ge\ e^{-na}\Big\{\mathbb E\big[P_s(V^{\otimes n},\mathcal C_n)\big]-\sum_{x\in\mathcal X^n}P^n(x)\operatorname{Tr}\big(V^{\otimes n}(x)-e^{na}W^{\otimes n}(x)\big)_+\Big\}$$
$$=\ e^{-na}\Big\{\mathbb E\big[P_s(V^{\otimes n},\mathcal C_n)\big]-\operatorname{Tr}\big(\mathcal V(P)^{\otimes n}-e^{na}\mathcal W(P)^{\otimes n}\big)_+\Big\}.$$

Dueck-Körner upper bound

Proof idea (conclusion): if $r<\chi(V,P)$ and $a>D(\mathcal V(P)\|\mathcal W(P))$, then $\mathbb E\big[P_s(W^{\otimes n},\mathcal C_n)\big]\ge e^{-na}$, whence
$$sc(r)\ \le\ \inf_{V:\,\chi(V,P)>r}D(\mathcal V(P)\|\mathcal W(P)),\qquad sc(r)\ \le\ \inf_{V:\,\chi(V,P)\le r}\big\{D(\mathcal V(P)\|\mathcal W(P))+r-\chi(V,P)\big\}.$$


Dueck-Körner upper bound

• Theorem:
$$sc(r)\ \le\ \inf_{P\in\mathcal P_f(\mathcal X)}\inf_{V:\mathcal X\to\mathcal S(\mathcal H)}\big\{D(\mathcal V(P)\|\mathcal W(P))+|r-\chi(V,P)|^+\big\}.$$

• Theorem:
$$\inf_{V:\mathcal X\to\mathcal S(\mathcal H)}\big\{D(\mathcal V(P)\|\mathcal W(P))+|r-\chi(V,P)|^+\big\}\ =\ \sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[r-\chi^\flat_{\alpha,2}(W,P)\big].$$

• Theorem:
$$\inf_P\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[r-\chi^\flat_{\alpha,2}(W,P)\big]\ =\ \sup_{\alpha>1}\inf_P\frac{\alpha-1}{\alpha}\big[r-\chi^\flat_{\alpha,2}(W,P)\big]\ =\ \sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[r-\sup_P\chi^\flat_{\alpha,2}(W,P)\big]\ =\ \sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[r-\sup_P\chi^\flat_{\alpha,1}(W,P)\big].$$


Dueck-Körner upper bound

• For every $W$ and every $r$ there exist codes $\{\mathcal C_k\}_{k\in\mathbb N}$ with rate $r$ such that
$$\liminf_k\frac1k\log P_s(W^{\otimes k},\mathcal C_k)\ \ge\ -\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\Big[r-\sup_P\chi^\flat_{\alpha,1}(W,P)\Big].$$

• Let $\sigma_m$ be a universal symmetric state on $\mathcal H^{\otimes m}$:
$$\forall\,\omega\in\mathcal S_{\mathrm{symm}}(\mathcal H^{\otimes m}):\ \omega\le v_{m,d}\,\sigma_m,\qquad v_{m,d}\le(m+1)^{\frac{(d+2)(d-1)}{2}}.$$

• Pinched channel:
$$W_m:\ x\mapsto \mathcal E_{\sigma_m}\big(W(x_1)\otimes\ldots\otimes W(x_m)\big),\qquad x\in\mathcal X^m.$$


Dueck-Körner upper bound

• Applying this to the pinched channel $W_m$: for every $r_m$ there exist codes $\{\mathcal C_k\}_{k\in\mathbb N}$ with rate $r_m$ such that
$$\liminf_k\frac1k\log P_s(W_m^{\otimes k},\mathcal C_k)\ \ge\ -\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\Big[r_m-\sup_P\chi^\flat_{\alpha,1}(W_m,P)\Big].$$

Dueck-Körner upper bound

• Construct codes $\{\tilde{\mathcal C}_n\}_{n\in\mathbb N}$ for $W$ with rate $r$ such that
$$\liminf_n\frac1n\log P_s(W^{\otimes n},\tilde{\mathcal C}_n)\ =\ \frac1m\liminf_k\frac1k\log P_s(W_m^{\otimes k},\mathcal C_k)\ \ge\ -\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\Big[r-\frac1m\sup_P\chi^\flat_{\alpha,1}(W_m,P)\Big].$$

Dueck-Körner upper bound

The pinched capacities dominate the sandwiched ones:
$$\chi^\flat_\alpha(W_m)\ =\ \sup_{P_m\in\mathcal P_f(\mathcal X^m)}\chi^\flat_{\alpha,1}(\mathcal E_mW^{\otimes m},P_m)\ \ge\ \sup_{P\in\mathcal P_f(\mathcal X)}\chi^\flat_{\alpha,1}(\mathcal E_mW^{\otimes m},P^{\otimes m})$$
$$\ge\ \sup_{P\in\mathcal P_f(\mathcal X)}\chi^*_{\alpha,1}(W^{\otimes m},P^{\otimes m})-3\log v_{m,d}\ =\ m\sup_{P\in\mathcal P_f(\mathcal X)}\chi^*_{\alpha,1}(W,P)-3\log v_{m,d}\ =\ m\,\chi^*_\alpha(W)-3\log v_{m,d}.$$


Dueck-Körner upper bound

Combining the last two steps, the constructed codes $\{\tilde{\mathcal C}_n\}_{n\in\mathbb N}$ with rate $r$ satisfy
$$\liminf_n\frac1n\log P_s(W^{\otimes n},\tilde{\mathcal C}_n)\ \ge\ -\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\Big[r-\frac1m\sup_P\chi^\flat_{\alpha,1}(W_m,P)\Big]\ \ge\ -\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\Big[r-\sup_P\chi^*_{\alpha,1}(W,P)\Big]-f(m),$$
using $\chi^\flat_\alpha(W_m)\ge m\,\chi^*_\alpha(W)-3\log v_{m,d}$.

Summary

• When coding at a rate $R$ above the Holevo capacity of a classical-quantum channel $W:\mathcal X\to\mathcal S(\mathcal H)$, the optimal asymptotics of the success probability is
$$P_s\sim e^{-nH_{R,c}(W)},\qquad H_{R,c}(W):=\sup_{\alpha>1}\frac{\alpha-1}{\alpha}\big[R-\chi^*_\alpha(W)\big],$$
where $\chi^*_\alpha$ is defined from the sandwiched Rényi divergence $D^*_\alpha(\cdot\|\cdot)$.

• This gives an operational interpretation of the Rényi capacity $\chi^*_\alpha(W)$ for $\alpha>1$.

• The proof uses a new family of quantum Rényi divergences:
$$D^\flat_\alpha(\varrho\|\sigma):=\frac{1}{\alpha-1}\log\operatorname{Tr}e^{\alpha\log\varrho+(1-\alpha)\log\sigma}.$$