Empirical Properties of Good Channel Codes


  1. Empirical Properties of Good Channel Codes. Qinghua (Devon) Ding, June 8, 2020, The Chinese University of Hong Kong.

  3. Introduction

  6. Shannon's Channel Coding Theorem. The capacity is $C = \max_{P_X} I(X;Y)$. If $R > C$, then for every code $\mathcal{C}$, $P(m \neq \hat{m}) \to 1$. If $R < C$, then there exists a code $\mathcal{C}$ with $P(m \neq \hat{m}) \to 0$. The maximizer $P_X^* = \arg\max_{P_X} I(X;Y)$ need not be unique.
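
The remark that $P_X^*$ need not be unique can be checked numerically. The sketch below is only an illustration: the three-input channel (two inputs with identical output distributions), the grid resolution, and all helper names are assumptions and do not come from the slides.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(px, W):
    """I(X;Y) in bits, where W[x][y] = P(Y=y | X=x) and px is the input distribution."""
    nx, ny = len(W), len(W[0])
    py = [sum(px[x] * W[x][y] for x in range(nx)) for y in range(ny)]
    return entropy(py) - sum(px[x] * entropy(W[x]) for x in range(nx))

# Hypothetical channel: inputs 1 and 2 have identical output distributions.
W = [[0.9, 0.1],
     [0.1, 0.9],
     [0.1, 0.9]]

# Evaluate I(X;Y) on a grid over the probability simplex (step 1/20).
steps = 20
grid = [(i / steps, j / steps, (steps - i - j) / steps)
        for i in range(steps + 1) for j in range(steps + 1 - i)]
values = [mutual_information(px, W) for px in grid]
capacity = max(values)

# Every grid point whose mutual information matches the maximum.
maximizers = [px for px, v in zip(grid, values) if capacity - v < 1e-9]

print(round(capacity, 4), len(maximizers))
```

Several grid points attain the maximum here: any split of the remaining mass between the two identical inputs gives the same mutual information.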

  9. Result I: Characterization of $P_X^*$. Consider a channel $W = (p_1, \ldots, p_{|\mathcal{X}|})$ and denote $r = (H(p_1), \ldots, H(p_m))$, where $m = |\mathcal{X}|$. Given some $P_X^* \in \arg\max_{P_X} I(X;Y)$ (e.g. by the Blahut-Arimoto algorithm), the whole set of capacity-achieving input distributions is $\{P_X^*\} = \left( P_X^* + \ker \begin{pmatrix} W \\ r \end{pmatrix} \right) \cap \mathbb{R}^m_+$. There is no analytical solution in general (footnote 1). Footnote 1: A non-linear equation system for $P_X^*$ is developed in [Mur53] and its follow-up works.
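
The slide mentions the Blahut-Arimoto algorithm only by name. The following is a minimal sketch of the textbook iteration, not the speaker's code; the function name, the iteration count, and the Z-channel test case are assumptions chosen for illustration.

```python
import math

def blahut_arimoto(W, iters=200):
    """Approximate a capacity-achieving input distribution of a DMC.

    W[x][y] = P(Y=y | X=x). Returns (px, capacity_in_bits).
    """
    nx, ny = len(W), len(W[0])

    def divergences(px):
        # d[x] = D(W(.|x) || P_Y) in nats, with P_Y induced by px.
        py = [sum(px[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        return [sum(W[x][y] * math.log(W[x][y] / py[y])
                    for y in range(ny) if W[x][y] > 0) for x in range(nx)]

    px = [1.0 / nx] * nx  # start from the uniform distribution
    for _ in range(iters):
        d = divergences(px)
        weights = [px[x] * math.exp(d[x]) for x in range(nx)]
        total = sum(weights)
        px = [w / total for w in weights]

    d = divergences(px)
    capacity = sum(px[x] * d[x] for x in range(nx)) / math.log(2)
    return px, capacity

# Illustrative Z-channel (assumption): input 0 is noiseless, input 1 flips w.p. 1/2.
z_channel = [[1.0, 0.0],
             [0.5, 0.5]]
px_star, cap = blahut_arimoto(z_channel)
print(px_star, cap)
```

For this Z-channel the closed-form answer is known ($P_X^*(1) = 2/5$ and $C = \log_2(5/4)$), which gives a quick sanity check on the iteration.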

  12. Optimizing Input Distribution. Suppose $W$ has a unique capacity-achieving input distribution (CAID) $P_X^*$ with $\mathrm{supp}(P_X^*) = \mathcal{X}' \subset \mathcal{X}$. Claim. The distributions $\{p_i, i \in \mathcal{X}'\}$ must be linearly independent. Proof by contrapositive (details later).

  14. Property of Random Code Ensemble. With $C = \max_{P_X} I(X;Y)$: if $R > C$, then for every code $\mathcal{C}$, $P(W \neq \hat{W}) \to 1$; if $R < C$, then there exists a code $\mathcal{C}$ with $P(W \neq \hat{W}) \to 0$. Here $W$ denotes the transmitted message and $\hat{W}$ its estimate. The random code ensemble is capacity-achieving.

  17. Property of Random Code Ensemble. Random codes: each codeword symbol is drawn i.i.d. from $P_X^*$, where $P_X^* \in \arg\max_{P_X} I(X;Y)$. Empirical independence of a codeword pair $(x, x')$: $\#\{ i : (x_i, x'_i) = (a, b) \} \approx n P_X^*(a) P_X^*(b)$ (footnote 2). Observation. With high probability, random codes have “most” (a $1 - o(1)$ fraction) of their codeword pairs empirically independent. Footnote 2: This condition is different from [HV93, PV13, SV97].

  18. Property of Random Code Ensemble. Random codes: each codeword symbol is drawn i.i.d. from $P_X^*$, where $P_X^* \in \arg\max_{P_X} I(X;Y)$. Generalization to $k = O(1)$, e.g. for triples: $\#\{ i : (x_i, x'_i, x''_i) = (a, b, c) \} \approx n P_X^*(a) P_X^*(b) P_X^*(c)$. With high probability, random codes have “most” codeword $k$-tuples empirically independent.
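
The empirical independence condition can be illustrated by a small simulation: draw $k$ codewords i.i.d. from an input distribution, compute their joint type, and compare it with the $k$-fold product in $L_1$ distance. This is a sketch under stated assumptions: the distribution `px`, the block length, the seed, and the helper names are illustrative, and a single random $k$-tuple stands in for "most $k$-tuples of a random code".

```python
import itertools
import random
from collections import Counter

def joint_type(codewords):
    """Empirical joint distribution of the symbol tuples (x_i, x'_i, ...)."""
    n = len(codewords[0])
    counts = Counter(zip(*codewords))
    return {sym: c / n for sym, c in counts.items()}

def l1_to_product(codewords, px):
    """L1 distance between the joint type and the k-fold product of px."""
    k = len(codewords)
    tau = joint_type(codewords)
    dist = 0.0
    for sym in itertools.product(range(len(px)), repeat=k):
        prod = 1.0
        for s in sym:
            prod *= px[s]
        dist += abs(tau.get(sym, 0.0) - prod)
    return dist

random.seed(0)
px = [0.3, 0.7]      # illustrative input distribution (assumption)
n, k = 20000, 3      # block length and tuple size (assumptions)
codewords = [random.choices(range(len(px)), weights=px, k=n) for _ in range(k)]
distance = l1_to_product(codewords, px)
print(distance)
```

The distance shrinks as the block length $n$ grows, which is the sense of "$\approx$" in the counting condition above.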

  21. Result II: Necessary Conditions for Good Codes. A capacity-achieving code (or good code) satisfies $R = C - \epsilon$ and $P(m \neq \hat{m}) \to 0$. Theorem (Property of Good Codes). For any DMC with unique $P_X^*$, any good code for it must have a $1 - o(1)$ fraction of its codeword $k$-tuples empirically independent. Similar results hold for the AWGN channel.

  22. Advertisement. Parallel work [ZVJ20] on Quadratically Constrained Two-Way Adversarial Channels, ISIT 2020: https://sites.google.com/view/yihan/

  26. Result III: Non-universality of Good Codes. Two channels $W$ and $W'$ are similar iff $P_X^* = P_{X'}^*$ and $C = C'$. Observation. The random code ensemble with symbols drawn i.i.d. $\sim P_X^*$ achieves capacity under vanishing error probability for all similar channels. Theorem (Non-universality of Good Codes). There exist similar DMCs $W, W'$ and a code $\mathcal{C}$ that is capacity-achieving for $W$ such that no expurgation of $\mathcal{C}$ with the same rate is good for $W'$.

  30. Proof Ideas

  32. Proof of the Characterization Result. For a DMC $W$, given some $P_X^* \in \arg\max_{P_X} I(X;Y)$, we have $\{P_X^*\} = \left( P_X^* + \ker \begin{pmatrix} W \\ r \end{pmatrix} \right) \cap \mathbb{R}^m_+$. Proof by standard linear algebra.

  34. Proof of the Characterization Result. Generalizing to the $k$-use channel, we have $\{P_{X^k}^*\} = \left( P_X^{*\otimes k} + \ker \begin{pmatrix} W^{\otimes k} \\ r^{(k)} \end{pmatrix} \right) \cap \mathbb{R}^{m^k}_+$. Consider the following noisy typewriter channel: $W = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 \\ 0 & 1/2 & 1/2 & 0 \\ 0 & 0 & 1/2 & 1/2 \\ 1/2 & 0 & 0 & 1/2 \end{pmatrix}$. Although $C$ and $P_Y^*$ tensorize, $P_{X^k}^*$ does not tensorize.
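
The non-tensorization remark can be checked on a 4-input noisy typewriter. The sketch below assumes the usual form of that channel (each input goes to itself or its cyclic successor with probability 1/2, rows indexed by input); the direction `delta`, the step sizes, and the helper names are illustrative choices, not taken from the slides.

```python
import itertools
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(px, W):
    """I(X;Y) in bits, where W[x][y] = P(Y=y | X=x)."""
    nx, ny = len(W), len(W[0])
    py = [sum(px[x] * W[x][y] for x in range(nx)) for y in range(ny)]
    return entropy(py) - sum(px[x] * entropy(W[x]) for x in range(nx))

# Noisy typewriter (assumed form): x -> x or x+1 (mod 4), each with probability 1/2.
W = [[0.5 if y in (x, (x + 1) % 4) else 0.0 for y in range(4)] for x in range(4)]

# A direction that leaves both the output distribution and H(Y|X) unchanged.
delta = [1, -1, 1, -1]

# Single use: the uniform input and a perturbed input give the same I(X;Y).
uniform = [0.25] * 4
perturbed = [0.25 + d / 8 for d in delta]
single_use = (mutual_information(uniform, W), mutual_information(perturbed, W))

# Two uses: build the product channel on input/output pairs.
pairs = list(itertools.product(range(4), repeat=2))
W2 = [[W[x1][y1] * W[x2][y2] for (y1, y2) in pairs] for (x1, x2) in pairs]

# Joint input with uniform marginals that is NOT a product distribution.
joint = [1 / 16 + delta[x1] * delta[x2] / 32 for (x1, x2) in pairs]
two_use = mutual_information(joint, W2)

print(single_use, two_use)
```

The joint input has uniform marginals but is not the product of them, yet it still attains twice the single-use value, so the set of optimal inputs of the two-use channel is larger than the set of products.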

  35. Proof of the Linear Independence Lemma. Suppose $W$ has a unique CAID $P_X^*$ with $\mathrm{supp}(P_X^*) = \mathcal{X}' \subset \mathcal{X}$. Claim. The distributions $\{p_i, i \in \mathcal{X}'\}$ must be linearly independent. Proof by contrapositive.

  38. Proof of the Linear Independence Lemma. Suppose linear independence does not hold. Then we can find a feasible direction $\delta \neq 0$ such that $\langle r, \delta \rangle = 0$ and $W\delta = 0$. Hence $I(P_X^* + \epsilon\delta; P_Y) = C$ for small enough $\epsilon$, which contradicts the uniqueness of $P_X^*$.
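
One way to fill in why such a direction exists is sketched below. It is a reconstruction of the standard argument under the notation of the slides ($W\delta = \sum_i \delta_i p_i$ and $r = (H(p_i))_i$), and is not necessarily the speaker's exact proof.

```latex
% Assumed notation: W\delta = \sum_i \delta_i p_i and r = (H(p_i))_i, as in the slides.
\begin{align*}
I(P_X) &= H(W P_X) - \langle r, P_X \rangle
  && \text{since } H(Y \mid X) = \textstyle\sum_i P_X(i)\, H(p_i), \\
I(P_X^* + \epsilon\delta) &= H(W P_X^*) - \langle r, P_X^* \rangle
   - \epsilon \langle r, \delta \rangle
   = C - \epsilon \langle r, \delta \rangle
  && \text{whenever } W\delta = 0 .
\end{align*}
```

If $\{p_i, i \in \mathcal{X}'\}$ are linearly dependent, there is a $\delta \neq 0$ supported on $\mathcal{X}'$ with $W\delta = 0$. Summing the coordinates of $W\delta$ gives $\sum_i \delta_i = 0$, so $P_X^* + \epsilon\delta$ is a valid distribution for all sufficiently small $|\epsilon|$. Since $C - \epsilon \langle r, \delta \rangle \le C$ must hold for both signs of $\epsilon$, it follows that $\langle r, \delta \rangle = 0$.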

  41. Proof of the Empirical Independence Property. Consider a discrete memoryless channel $W$ with unique $P_X^*$. Claim. Any good code $\mathcal{C}$ for $W$ has the property that, for all $\delta > 0$, $P_{x_1, \ldots, x_k \sim \mathcal{C}} \left( \| \tau_{x_1, \ldots, x_k} - P_X^{*\otimes k} \|_1 > \delta \right) \to 0$, where $\tau_{x_1, \ldots, x_k}$ denotes the empirical joint distribution of the codeword tuple. Proof by considering the $k$-use channel.

  48. Proof of the Empirical Independence Property (Cont’d). Consider the AWGN$(P, N)$ channel, denoted $W$. Claim. Any good code for $W$ has the property that, for all $\delta > 0$, $P_{X_1, \ldots, X_k \sim \mathcal{C}} \left( \exists\, i \neq j,\ |\langle X_i, X_j \rangle| > \delta n \right) \to 0$. Proof by contrapositive.
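
To make the near-orthogonality condition concrete, the sketch below draws a few Gaussian codewords of power $P$ and checks the inner-product bound. The Gaussian codebook is only a stand-in for a good code (the claim itself is about arbitrary good codes), and the power, block length, $\delta$, seed, and helper names are all illustrative assumptions.

```python
import itertools
import random

random.seed(1)
P = 1.0          # power constraint (illustrative)
n, k = 20000, 4  # block length and number of sampled codewords (assumptions)
delta = 0.1      # tolerance in the claim (illustrative)

# Each codeword has i.i.d. N(0, P) entries, so its squared norm is about n * P.
codewords = [[random.gauss(0.0, P ** 0.5) for _ in range(n)] for _ in range(k)]

def inner(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Largest pairwise correlation among the sampled codewords.
max_corr = max(abs(inner(codewords[i], codewords[j]))
               for i, j in itertools.combinations(range(k), 2))
nearly_orthogonal = max_corr <= delta * n

print(max_corr, delta * n, nearly_orthogonal)
```

For independent Gaussian codewords the inner product has standard deviation about $P\sqrt{n}$, which is much smaller than $\delta n$ at this block length.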

  50. Properties of Good Channel Codes. Suppose the codewords are empirically correlated. Then we can extract a large subcode that is good for another channel with $P' < P$, a contradiction.

  51. Non-universality of Good Channel Codes

  53. Non-universality of Good Channel Codes. Consider the similar channels BEC$(H(p))$ and BSC$(p)$ (footnote 2). Claim. There exists a good code for BEC$(H(p))$ such that no expurgated subcode of the same rate can be good for BSC$(p)$. Footnote 2: The figures are from [CT12].
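
That BEC$(H(p))$ and BSC$(p)$ are similar in the sense defined earlier (same capacity, same optimal input) can be verified numerically. The value $p = 0.11$, the grid search, and the helper names below are assumptions for illustration only.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(px, W):
    """I(X;Y) in bits, where W[x][y] = P(Y=y | X=x)."""
    nx, ny = len(W), len(W[0])
    py = [sum(px[x] * W[x][y] for x in range(nx)) for y in range(ny)]
    return entropy(py) - sum(px[x] * entropy(W[x]) for x in range(nx))

p = 0.11                    # illustrative crossover probability
h = entropy([p, 1 - p])     # erasure probability H(p) of the matching BEC

bsc = [[1 - p, p], [p, 1 - p]]
bec = [[1 - h, h, 0.0], [0.0, h, 1 - h]]   # outputs: 0, erasure, 1

grid = [i / 1000 for i in range(1001)]

def best_input(W):
    """Grid search over P_X = (a, 1 - a); returns (a, I(X;Y))."""
    a = max(grid, key=lambda t: mutual_information([t, 1 - t], W))
    return a, mutual_information([a, 1 - a], W)

bsc_opt, bsc_cap = best_input(bsc)
bec_opt, bec_cap = best_input(bec)
print(bsc_opt, bsc_cap, bec_opt, bec_cap)
```

Both searches return the uniform input and the same capacity $1 - H(p)$, which is what makes this pair a natural test case for non-universality.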
