1. Rényi divergences and hypothesis testing problems
Milán Mosonyi 1,2
1 Física Teòrica: Informació i Fenòmens Quàntics, Universitat Autònoma de Barcelona
2 Mathematical Institute, Budapest University of Technology and Economics
Paris 2015

3. Binary state discrimination
• Two candidates for the true state of a system: H_0: ρ vs. H_1: σ.
• Many identical copies are available: H_0: ρ^⊗n vs. H_1: σ^⊗n on H^⊗n.
• Decision is based on a binary POVM (T, I − T).
• Error probabilities: α_n(T) := Tr ρ^⊗n (I − T) (first kind), β_n(T) := Tr σ^⊗n T (second kind).
• Trade-off: min_{0≤T≤I} { α_n(T) + β_n(T) } > 0 unless ρ^⊗n ⊥ σ^⊗n.
• Quantum Stein's lemma¹: α_n(T_n) → 0 ⟹ β_n(T_n) ∼ e^{−n D_1(ρ‖σ)} is the optimal decay, where D_1(ρ‖σ) := Tr ρ (log ρ − log σ) is the relative entropy².
¹ Hiai, Petz 1991; Ogawa, Nagaoka 2001. ² Umegaki 1962.
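When ρ and σ commute, the scheme above reduces to classical i.i.d. hypothesis testing, which is easy to simulate. A minimal numerical sketch (assuming NumPy; the distributions p, q and the threshold slack are arbitrary illustrative choices) runs a likelihood-ratio test and compares the decay rate of β_n with D_1:

```python
import numpy as np

# Commuting (classical) special case: H0: p vs. H1: q, n i.i.d. copies.
p = np.array([0.7, 0.3])
q = np.array([0.4, 0.6])

# Relative entropy D_1(p||q) = sum_x p(x) (log p(x) - log q(x)).
D1 = float(np.sum(p * (np.log(p) - np.log(q))))

def beta_n(n, slack=0.1):
    """Second-kind error of the likelihood-ratio test
    T_n = { log p^n(x) - log q^n(x) >= n (D1 - slack) },
    whose first-kind error vanishes as n grows."""
    pn, qn = p, q
    for _ in range(n - 1):  # build the i.i.d. product distributions
        pn, qn = np.kron(pn, p), np.kron(qn, q)
    llr = np.log(pn) - np.log(qn)      # log-likelihood ratio per string
    accept = llr >= n * (D1 - slack)   # region where we accept H0
    return float(qn[accept].sum())

for n in (5, 10, 15):
    print(f"n={n:2d}  -log(beta_n)/n = {-np.log(beta_n(n)) / n:.4f}  (D1 = {D1:.4f})")
```

The Neyman–Pearson argument guarantees β_n ≤ e^{−n(D_1 − slack)} for this test, so the printed rate sits just above D_1 − slack and approaches D_1 as the slack shrinks.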

5. Relative entropy
• The (quantum) Stein's lemma gives an operational interpretation to the (quantum) relative entropy (Kullback–Leibler divergence).
• Notion of "distance" on the state space.
• All relevant information measures are derived from it (?):
  entropy: H(ρ) := −D_1(ρ‖I);
  mutual information: I(A:B)_ρ := D_1(ρ_AB ‖ ρ_A ⊗ ρ_B);
  all sorts of channel capacities, etc.
• Statistical divergence Δ on the state space:
  (1) Δ(ρ‖σ) ≥ 0, and Δ(ρ‖σ) = 0 ⟺ ρ = σ;
  (2) Δ(Φ(ρ)‖Φ(σ)) ≤ Δ(ρ‖σ) for any stochastic map Φ. Example: f-divergences.
  (3) Operational interpretation?
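Property (2), contraction under stochastic maps, can be checked numerically in the classical case. A small sketch, assuming NumPy; the distributions and the random column-stochastic matrix Φ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def kl(p, q):
    """Classical relative entropy D_1(p||q), assuming full support."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

# A random column-stochastic matrix: Phi @ p is again a distribution.
Phi = rng.random((4, 3))
Phi /= Phi.sum(axis=0, keepdims=True)

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])

# Property (2): statistical divergences contract under stochastic maps.
print(kl(p, q), ">=", kl(Phi @ p, Phi @ q))
```

The same data-processing inequality is what singles out f-divergences classically, and monotonicity under CPTP maps is its quantum analogue.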

7. Other statistical divergences
• Trace-norm distance: for H_0: ρ vs. H_1: σ,
  min_{0≤T≤I} { α(T) + β(T) } = 1 − (1/2)‖ρ − σ‖_1,   Δ_Tr(ρ‖σ) := (1/2)‖ρ − σ‖_1.
• Chernoff bound theorem¹:
  1 − (1/2)‖ρ^⊗n − σ^⊗n‖_1 ∼ e^{−n C(ρ,σ)},
  C(ρ,σ) := − inf_{0<α<1} (α−1) D_α(ρ‖σ)   (Chernoff divergence),
  D_α(ρ‖σ) := 1/(α−1) log Tr ρ^α σ^{1−α}   (Rényi divergences).
¹ Nussbaum, Szkoła 2006; Audenaert et al. 2006.
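For commuting states the Chernoff bound can be tested directly. The sketch below (assuming NumPy; p, q and the grid resolution are illustrative choices) computes C(p,q) by a grid search over α and compares it with the decay of 1 − (1/2)‖p^⊗n − q^⊗n‖_1:

```python
import numpy as np

p = np.array([0.7, 0.3])
q = np.array([0.4, 0.6])

# Chernoff divergence C(p,q) = -min_{0<a<1} log sum_x p(x)^a q(x)^(1-a),
# approximated on a grid over alpha.
alphas = np.linspace(0.01, 0.99, 199)
C = -min(float(np.log(np.sum(p**a * q**(1 - a)))) for a in alphas)

def err(n):
    """1 - (1/2)||p^n - q^n||_1, i.e. twice the optimal equal-prior error."""
    pn, qn = p, q
    for _ in range(n - 1):
        pn, qn = np.kron(pn, p), np.kron(qn, q)
    return float(1.0 - 0.5 * np.abs(pn - qn).sum())

for n in (5, 10, 15):
    print(f"n={n:2d}  -log(err)/n = {-np.log(err(n)) / n:.4f}  (C = {C:.4f})")
```

Classically, min(x, y) ≤ x^α y^{1−α} gives the non-asymptotic bound err(n) ≤ e^{−nC}; the printed rates approach C from above as n grows.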

9. Quantifying the trade-off
• Stein's lemma: α_n(T_n) → 0 ⟹ β_n(T_n) ∼ e^{−n D_1(ρ‖σ)}.
• Direct domain (quantum Hoeffding bound¹): β_n(T_n) ∼ e^{−nr}, r < D_1(ρ‖σ) ⟹ α_n(T_n) ∼ e^{−n H_r}.
• Converse domain (quantum Han–Kobayashi bound²): β_n(T_n) ∼ e^{−nr}, r > D_1(ρ‖σ) ⟹ α_n(T_n) ∼ 1 − e^{−n H*_r}.
• Hoeffding divergences:
  H_r := sup_{0<α<1} (α−1)/α [r − D_α(ρ‖σ)],
  H*_r := sup_{1<α} (α−1)/α [r − D*_α(ρ‖σ)].
¹ Hayashi; Nagaoka; Audenaert, Nussbaum, Szkoła, Verstraete; 2006. ² Mosonyi, Ogawa 2013.
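The Hoeffding exponent H_r is a one-dimensional optimization once D_α is available. A classical sketch (assuming NumPy; the distributions and grid resolution are arbitrary choices):

```python
import numpy as np

p = np.array([0.7, 0.3])
q = np.array([0.4, 0.6])

def renyi(a):
    """Classical Renyi divergence D_alpha(p||q)."""
    return float(np.log(np.sum(p**a * q**(1 - a))) / (a - 1))

D1 = float(np.sum(p * (np.log(p) - np.log(q))))

def hoeffding(r):
    """Direct-domain exponent H_r = sup_{0<a<1} (a-1)/a [r - D_a(p||q)],
    approximated on a grid over alpha."""
    alphas = np.linspace(1e-3, 1 - 1e-3, 999)
    return max((a - 1) / a * (r - renyi(a)) for a in alphas)

for r in (0.05, 0.10, 0.15):  # rates below D1
    print(f"r = {r:.2f}  H_r = {hoeffding(r):.4f}  (D1 = {D1:.4f})")
```

As expected from the slide, H_r is positive for r < D_1, decreases in r, and vanishes as r → D_1.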

11. Quantum Rényi divergences
• For p, q probability distributions on X and α ∈ [0,+∞) \ {1}:
  D_α(p‖q) := 1/(α−1) log Σ_x p(x)^α q(x)^{1−α}.
• Quantum Rényi divergences¹:
  D_α(ρ‖σ) := 1/(α−1) log Tr ρ^α σ^{1−α},
  D*_α(ρ‖σ) := 1/(α−1) log Tr (ρ^{1/2} σ^{(1−α)/α} ρ^{1/2})^α.
• The right quantum extension is
  D^q_α(ρ‖σ) := D_α(ρ‖σ) for α ∈ [0,1),  D*_α(ρ‖σ) for α ∈ (1,+∞].
¹ Petz 1986; Müller-Lennert, Dupuis, Szehr, Fehr, Tomamichel 2013; Wilde, Winter, Yang 2013.
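Both quantum Rényi divergences can be evaluated for small density matrices via eigendecompositions. A sketch assuming NumPy; the qubit states are arbitrary illustrative choices:

```python
import numpy as np

def mpow(A, t):
    """Fractional power of a positive definite Hermitian matrix."""
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.conj().T

def petz(rho, sigma, a):
    """Petz Renyi divergence: log Tr[rho^a sigma^(1-a)] / (a - 1)."""
    return float(np.log(np.trace(mpow(rho, a) @ mpow(sigma, 1 - a)).real) / (a - 1))

def sandwiched(rho, sigma, a):
    """Sandwiched Renyi divergence:
    log Tr[(rho^(1/2) sigma^((1-a)/a) rho^(1/2))^a] / (a - 1)."""
    r = mpow(rho, 0.5)
    inner = r @ mpow(sigma, (1 - a) / a) @ r
    return float(np.log(np.trace(mpow(inner, a)).real) / (a - 1))

# Two faithful (full-rank) qubit states, chosen arbitrarily.
rho = np.array([[0.8, 0.2], [0.2, 0.2]], dtype=complex)
sigma = np.array([[0.5, -0.1], [-0.1, 0.5]], dtype=complex)

for a in (0.5, 2.0):
    print(f"a = {a}:  D_a = {petz(rho, sigma, a):.4f}  D*_a = {sandwiched(rho, sigma, a):.4f}")
```

For commuting (simultaneously diagonal) states the two definitions coincide with the classical formula; for non-commuting states the Araki–Lieb–Thirring inequality D*_α ≤ D_α separates them.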

12. Mathematical properties
• Both D_α and D*_α are monotone increasing in α, and
  lim_{α→1} D^(v)_α(ρ‖σ) = D_1(ρ‖σ) := D(ρ‖σ) := Tr ρ (log ρ − log σ).
• Araki–Lieb–Thirring inequality: D*_α(ρ‖σ) ≤ D_α(ρ‖σ), α ∈ [0,+∞].
  Equality for α = 1 and for commuting states.
• Monotonicity under stochastic maps Φ:
  D_α(Φ(ρ)‖Φ(σ)) ≤ D_α(ρ‖σ), α ∈ [0,2],
  D*_α(Φ(ρ)‖Φ(σ)) ≤ D*_α(ρ‖σ), α ∈ [1/2,+∞]
  ⟹ D^q_α(Φ(ρ)‖Φ(σ)) ≤ D^q_α(ρ‖σ), α ∈ [0,+∞].

13. The fidelity
• D*_α(ρ‖σ) := 1/(α−1) log Tr (ρ^{1/2} σ^{(1−α)/α} ρ^{1/2})^α.
• For α = 1/2:
  D*_{1/2}(ρ‖σ) = −2 log Tr (ρ^{1/2} σ ρ^{1/2})^{1/2} = −2 log F(ρ,σ).
• Operational interpretation??
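The α = 1/2 identity above can be instantiated numerically: at α = 1/2 the exponent (1−α)/α equals 1, so the sandwiched trace functional reduces to the fidelity. A sketch assuming NumPy; the qubit states are illustrative:

```python
import numpy as np

def mpow(A, t):
    """Fractional power of a positive definite Hermitian matrix."""
    w, V = np.linalg.eigh(A)
    return (V * w**t) @ V.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = Tr (rho^(1/2) sigma rho^(1/2))^(1/2)."""
    r = mpow(rho, 0.5)
    return float(np.trace(mpow(r @ sigma @ r, 0.5)).real)

# Arbitrary faithful qubit states for illustration.
rho = np.array([[0.8, 0.2], [0.2, 0.2]], dtype=complex)
sigma = np.array([[0.5, -0.1], [-0.1, 0.5]], dtype=complex)

# Sandwiched divergence at alpha = 1/2: sigma^((1-a)/a) = sigma^1 = sigma.
a = 0.5
r = mpow(rho, 0.5)
Dstar_half = float(np.log(np.trace(mpow(r @ sigma @ r, a)).real) / (a - 1))
print(Dstar_half, -2 * np.log(fidelity(rho, sigma)))
```

Since 1/(α−1) = −2 at α = 1/2, both printed numbers are −2 log F(ρ,σ), as the slide states.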

14. More Rényi divergences
• In classical information theory, trade-offs in many problems are quantified by Rényi divergences and derived quantities.
• Does the same hold in the quantum case? Probably yes.
• Do we get any other notions of Rényi divergences apart from D_α and D*_α? Probably not.
• What are the right (= operational) definitions of the Rényi extensions of information quantities? E.g., Rényi mutual information, Rényi capacity, Rényi conditional mutual information?

15. More Rényi divergences
• Rényi mutual information: I^(v)_α(A:B)_ρ := inf_{σ_B} D^(v)_α(ρ_AB ‖ ρ_A ⊗ σ_B).
  Operational interpretation? Yes, for all quantum values¹: hypothesis testing
  H_0: ρ^⊗n_AB vs. H_1: ρ^⊗n_A ⊗ S(H^⊗n_B).
• Rényi–Holevo capacities: for a channel W: X → S(H_B),
  χ^(v)_α(W) := sup { I^(v)_α(X:B)_ρ : ρ_XB = Σ_x p(x) |x⟩⟨x|_X ⊗ W(x) }.
  Operational interpretation² for α > 1 and (v) = ∗: strong converse exponent of classical-quantum channel coding.
• Channel Rényi mutual information: for N: A → B CPTP,
  I^(v)_α(N) := sup_{ψ_RA} I^(v)_α(R:B)_{N(ψ_RA)}.
  Partial results (Cooney, Mosonyi, Wilde 2014).
¹ Hayashi, Tomamichel 2014; ² Mosonyi, Ogawa 2014.
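In the classical case the Rényi mutual information is a minimization over the marginal q_B, which for binary B is one-dimensional. A rough sketch (assuming NumPy; the joint distribution and grid are illustrative, and the grid search only approximates the infimum):

```python
import numpy as np

def renyi(p, q, a):
    """Classical Renyi divergence D_a(p||q) for probability vectors p, q."""
    return float(np.log(np.sum(p**a * q**(1 - a))) / (a - 1))

# An illustrative classical joint distribution p_AB on {0,1} x {0,1}.
pAB = np.array([[0.4, 0.1],
                [0.1, 0.4]])
pA, pB = pAB.sum(axis=1), pAB.sum(axis=0)

def renyi_mi(a):
    """I_a(A:B) = inf_{q_B} D_a(p_AB || p_A (x) q_B), approximated by a
    1-d grid over binary q_B = (t, 1 - t)."""
    best = np.inf
    for t in np.linspace(0.01, 0.99, 99):
        qB = np.array([t, 1 - t])
        best = min(best, renyi(pAB.ravel(), np.outer(pA, qB).ravel(), a))
    return best

for a in (0.5, 0.9, 2.0):
    print(f"a = {a}:  I_a(A:B) ~ {renyi_mi(a):.4f}")
```

Since D_α is monotone in α pointwise, the infimum over q_B inherits monotonicity in α, and taking q_B = p_B gives an upper bound on I_α.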

16. More Rényi divergences
• Channel Rényi divergences: for N_i: A → B CPTP,
  D^(v)_α(N_1‖N_2) := sup_{ψ_RA} D^(v)_α(N_1(ψ_RA) ‖ N_2(ψ_RA)).
  Operational interpretation? Trivial one for all quantum values.
  Non-trivial one for α > 1, (v) = ∗, and N_2(·) = R_σ(·) := σ Tr(·) the replacer channel (Cooney, Mosonyi, Wilde 2014).

17. Binary channel discrimination
• Two candidates for the identity of a channel: H_0: N_0 vs. H_1: N_1;
  n independent uses: H_0: N_0^⊗n vs. H_1: N_1^⊗n.
• Adaptive discrimination strategy: each input may depend on the outputs of the previous uses; binary measurement at the end.
• Non-adaptive strategy: input φ_{R^n A^n} ⟹ output N_i^⊗n(φ_{R^n A^n}).
  Product strategy: φ_{R^n A^n} = φ_RA^⊗n ⟹ output (N_i(φ_RA))^⊗n.

18. Binary channel discrimination
• Output: ρ_{R^n B^n} (if N = N_0) or σ_{R^n B^n} (if N = N_1); measurement (T_n, I − T_n) at the end.
• Error probabilities:
  β^x_ε(N_0^⊗n‖N_1^⊗n) := inf { Tr σ_{R^n B^n} T_n : Tr ρ_{R^n B^n}(I − T_n) ≤ ε },
  α^x_r(N_0^⊗n‖N_1^⊗n) := inf { Tr ρ_{R^n B^n}(I − T_n) : Tr σ_{R^n B^n} T_n ≤ 2^{−nr} },
  where x = pr or x = ad.

19. Trade-off exponents with product strategies
• Error probabilities:
  β^x_ε(N_0^⊗n‖N_1^⊗n) := inf { Tr σ_{R^n B^n} T_n : Tr ρ_{R^n B^n}(I − T_n) ≤ ε },
  α^x_r(N_0^⊗n‖N_1^⊗n) := inf { Tr ρ_{R^n B^n}(I − T_n) : Tr σ_{R^n B^n} T_n ≤ 2^{−nr} }.
• If only product strategies are allowed (x = pr):
  lim_{n→+∞} −(1/n) log β^x_ε(N_0^⊗n‖N_1^⊗n) = D(N_0‖N_1) := sup_{ψ_RA} D(N_0(ψ_RA)‖N_1(ψ_RA)),
  lim_{n→+∞} −(1/n) log α^x_{n,r} = H_r(N_0‖N_1) := sup_{ψ_RA} H_r(N_0(ψ_RA)‖N_1(ψ_RA)),
  lim_{n→+∞} −(1/n) log (1 − α^x_{n,r}) = H*_r(N_0‖N_1) := inf_{ψ_RA} H*_r(N_0(ψ_RA)‖N_1(ψ_RA)).

20. Channel divergences
• Channel Hoeffding (anti-)divergences:
  H_r(N_0‖N_1) = sup_{ψ_RA} H_r(N_0(ψ_RA)‖N_1(ψ_RA)),
  H*_r(N_0‖N_1) = inf_{ψ_RA} H*_r(N_0(ψ_RA)‖N_1(ψ_RA)).
• Alternative expressions (due to minimax):
  H_r(N_0‖N_1) = sup_{0<α<1} (α−1)/α [r − D_α(N_0‖N_1)],
  H*_r(N_0‖N_1) = sup_{1<α} (α−1)/α [r − D*_α(N_0‖N_1)],
  where D_α(N_0‖N_1) and D*_α(N_0‖N_1) are the channel Rényi divergences:
  D_α(N_0‖N_1) := sup_{ψ_RA} D_α(N_0(ψ_RA)‖N_1(ψ_RA)),
  D*_α(N_0‖N_1) := sup_{ψ_RA} D*_α(N_0(ψ_RA)‖N_1(ψ_RA)).
