Polar Codes for the Deletion Channel: Weak and Strong Polarization

Ido Tal (Technion), Henry D. Pfister (Duke), Arman Fazeli (UCSD), Alexander Vardy (UCSD)
Big picture first

A polar coding scheme for the deletion channel, where:
◮ The deletion channel has constant deletion probability δ
◮ A hidden-Markov input distribution¹ is fixed
◮ The code rate converges to the information rate
◮ The error probability decays like $2^{-\Lambda^\gamma}$, where $\gamma < 1/3$ and Λ is the codeword length
◮ Decoding complexity is at most $O(\Lambda^{1+3\gamma})$
◮ Achieves hidden-Markov capacity! Equals true capacity?
◮ Key ideas:
    ◮ Polarization operations defined for trellises
    ◮ Polar codes modified to have guard bands of '0' symbols

¹ i.e., a function of an aperiodic, irreducible, finite-state Markov chain

1 / 21
A brief history of the binary deletion channel

◮ Early Work: Levenshtein [Lev66] and Dobrushin [Dob67]
◮ LDPC Codes + Turbo Equalization: Davey-MacKay [DM01]
◮ Coding and Capacity Bounds by Mitzenmacher [Mit09] and many more: [FD10], [MTL12], [CK15], [RD15], [Che19]
◮ Polar codes: [TTVM17], [TFVL17], [TFV18]
◮ Our Contributions:
    ◮ Proof of weak polarization for constant deletion rate
    ◮ Strong polarization for constant deletion rate with guard bands
    ◮ Our trellis perspective also establishes weak polarization for channels with insertions, deletions, and substitutions

2 / 21
Hidden-Markov input process

Example: (1, ∞) run-length constraint

[State diagram: two states, with transitions labeled α and 1 − α out of state 0 and a transition labeled 1 out of state 1]

◮ Input process is $(X_j)$, $j \in \mathbb{Z}$
◮ Marginalization of $(S_j, X_j)$, $j \in \mathbb{Z}$
◮ State $(S_j)$, $j \in \mathbb{Z}$, is Markov, stationary, irreducible, aperiodic
◮ For all $j$, it holds that
$P_{S_j, X_j \mid S_{j-1}} = P_{S_j, X_j \mid S_{-\infty}^{j-1},\, X_{-\infty}^{j-1}}$

(a sampling sketch follows below)

3 / 21
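As a concrete illustration (not part of the original slides), here is a minimal Python sketch of sampling from the (1, ∞) run-length-limited input process above. The transition structure is our reading of the state diagram, so treat it as an assumption, and `sample_rll_input` is a hypothetical helper name.

```python
import random

def sample_rll_input(N, alpha):
    """Sample N bits from a (1, infinity) run-length-limited
    hidden-Markov source: no two consecutive '1' symbols.

    Assumed structure: state 0 emits '1' with probability alpha
    (moving to state 1) and '0' otherwise (staying in state 0);
    state 1 must emit '0' and return to state 0."""
    state, bits = 0, []
    for _ in range(N):
        if state == 0 and random.random() < alpha:
            bits.append(1)
            state = 1
        else:
            bits.append(0)
            state = 0
    return bits

# Example: no two adjacent 1s ever appear.
x = sample_rll_input(20, alpha=0.4)
assert all(not (a == 1 and b == 1) for a, b in zip(x, x[1:]))
```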
Code rate

The code rate of our scheme approaches
$I(X; Y) = \lim_{N \to \infty} \frac{1}{N} H(\mathbf{X}) - \lim_{N \to \infty} \frac{1}{N} H(\mathbf{X} \mid \mathbf{Y})$

◮ $\mathbf{X} = (X_1, \ldots, X_N)$ is the hidden-Markov input
◮ $\mathbf{Y}$ is the deletion channel output

4 / 21
Theorem (Strong polarization)

Fix a regular hidden-Markov input process. For any fixed $\gamma \in (0, 1/3)$, the rate of our coding scheme approaches the mutual-information rate between the input process and the deletion channel output. For large enough blocklength Λ, the probability of error is at most $2^{-\Lambda^\gamma}$.

5 / 21
Uniform input process

◮ It is known that a memoryless input distribution is suboptimal
◮ However, to keep this talk simple, we will assume that the input process is uniform, and thus memoryless
◮ That is, the $X_i$ are i.i.d. and $\mathrm{Ber}(1/2)$

6 / 21
The polar transform

◮ Let $x = (x_1, \ldots, x_N) \in \{0, 1\}^N$ be a vector of length $N = 2^n$
◮ Define
    ◮ minus transform: $x^{[0]} \triangleq (x_1 \oplus x_2,\, x_3 \oplus x_4,\, \ldots,\, x_{N-1} \oplus x_N)$
    ◮ plus transform: $x^{[1]} \triangleq (x_2, x_4, \ldots, x_N)$
◮ Both are vectors of length $N/2$
◮ Define $x^{[b_1, b_2, \ldots, b_\lambda]}$ recursively: $z = x^{[b_1, b_2, \ldots, b_{\lambda-1}]}$, $x^{[b_1, b_2, \ldots, b_\lambda]} = z^{[b_\lambda]}$
◮ The polar transform of $x$ is $u = (u_1, u_2, \ldots, u_N)$, where for $i = 1 + \sum_{j=1}^{n} b_j 2^{n-j}$ we have $u_i = x^{[b_1, b_2, \ldots, b_n]}$ (a code sketch of this recursion follows)

7 / 21
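As an aside (our illustration, not from the slides), a minimal Python sketch of the recursive polar transform just defined; `polar_transform` is a hypothetical helper name.

```python
def minus(x):
    """Minus transform: XOR of adjacent pairs, length N/2."""
    return [x[i] ^ x[i + 1] for i in range(0, len(x), 2)]

def plus(x):
    """Plus transform: every second entry, length N/2."""
    return x[1::2]

def polar_transform(x):
    """Return u = (u_1, ..., u_N), where u_i is the length-1 vector
    x^[b_1, ..., b_n] and (b_1, ..., b_n) is the binary expansion of
    i - 1, most significant bit first."""
    if len(x) == 1:
        return x
    # The [0] branch collects the indices with leading bit b_1 = 0,
    # the [1] branch those with b_1 = 1.
    return polar_transform(minus(x)) + polar_transform(plus(x))

# Example: for x = (1, 1, 0, 0) the transform gives u = (0, 0, 1, 0).
assert polar_transform([1, 1, 0, 0]) == [0, 0, 1, 0]
```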
Polarization of trellises

◮ The decoder sees the received sequence $\mathbf{y}$
◮ Ultimately, we want an efficient method of calculating
$P(U_i = \hat{u}_i \mid U_1^{i-1} = \hat{u}_1^{i-1},\, \mathbf{Y} = \mathbf{y})$
◮ Towards this end, let us first show an efficient method of calculating the joint probability $P(\mathbf{X} = \mathbf{x}, \mathbf{Y} = \mathbf{y})$
◮ Generalizes the SC trellis decoder of Wang et al. [WLH14], and the polar decoder for deletions by Tian et al. [TFVL17]

8 / 21
Deletion channel trellis

[Trellis figure: input indices $x_1, \ldots, x_4$ along one axis, outputs $y_1 = 0$, $y_2 = 1$, $y_3 = 1$ along the other]

◮ Example: N = 4 inputs with length-3 output 011
◮ Edge labels: blue $x_j = 0$ and red $x_j = 1$
◮ Direction: diagonal = no deletion and horizontal = deletion

(a dynamic-programming sketch over this trellis follows)

9 / 21
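A minimal sketch (ours, not the authors' code) of evaluating $P(\mathbf{X} = \mathbf{x}, \mathbf{Y} = \mathbf{y})$ by dynamic programming over the trellis above, assuming i.i.d. Ber(1/2) inputs so that each edge carries weight δ/2 (deletion) or (1 − δ)/2 (transmission); `joint_prob` is a hypothetical name.

```python
def joint_prob(x, y, delta):
    """P(X = x, Y = y) for i.i.d. Ber(1/2) inputs over a deletion
    channel with deletion probability delta.

    dp[i] = probability that the inputs processed so far equal the
    given prefix of x and emitted exactly the outputs y[:i]."""
    M = len(y)
    dp = [1.0] + [0.0] * M
    for xj in x:
        new = [0.0] * (M + 1)
        for i in range(M + 1):
            # horizontal edge: x_j deleted, output position unchanged
            new[i] += dp[i] * delta / 2
            # diagonal edge: x_j transmitted, must match y_{i+1}
            if i < M and xj == y[i]:
                new[i + 1] += dp[i] * (1 - delta) / 2
        dp = new
    return dp[M]

# Example matching the slide: N = 4 inputs, received output 011.
p = joint_prob([0, 1, 1, 0], [0, 1, 1], delta=0.1)
```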
Deletion channel trellis and the minus operation

[Trellis figure: the four sections for $x_1, \ldots, x_4$ merged into two sections for $x_1 \oplus x_2$ and $x_3 \oplus x_4$; merged edges carry products of the original weights δ/2 and (1 − δ)/2]

◮ Half as many sections representing twice the channel uses
◮ Edge weight is product of edge weights along length-2 paths
◮ Edge label (i.e., color) is the xor of labels along length-2 paths

(a matrix-based sketch of this merge follows)

10 / 21
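One way to make the merge concrete (our sketch under assumed conventions, not the authors' implementation): represent each trellis section as a pair of (M+1) × (M+1) weight matrices indexed by the input label b, where entry (i, i′) is the weight of the edge labeled b from output position i to i′. The minus operation then sums matrix products over label pairs that XOR to the new label. Function names are hypothetical.

```python
import numpy as np

def base_section(y, delta):
    """One trellis section per channel use, as two (M+1) x (M+1)
    matrices: W[b][i, i'] is the weight of the edge labeled b from
    output position i to i' (uniform inputs, hence the factor 1/2)."""
    M = len(y)
    W = {0: np.zeros((M + 1, M + 1)), 1: np.zeros((M + 1, M + 1))}
    for b in (0, 1):
        for i in range(M + 1):
            W[b][i, i] = delta / 2                 # horizontal: deletion
            if i < M and y[i] == b:
                W[b][i, i + 1] = (1 - delta) / 2   # diagonal: transmission
    return W

def minus_section(W1, W2):
    """Merge two adjacent sections: the new label is the XOR of the
    two labels, and the new weight matrix multiplies the weights
    along each length-2 path."""
    return {b: sum(W1[b1] @ W2[b1 ^ b] for b1 in (0, 1)) for b in (0, 1)}

# With these sections, P(X = x, Y = y) is the (0, M) entry of the
# ordered product of W[x_j]; for x = 0110 and y = 011:
y, delta = [0, 1, 1], 0.1
W = base_section(y, delta)
P = W[0] @ W[1] @ W[1] @ W[0]
# P[0, len(y)] agrees with the joint_prob sketch shown earlier.
```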
Weak polarization

Theorem
For any $\epsilon > 0$,
$\lim_{N \to \infty} \frac{1}{N} \left|\left\{ i \in [N] \,:\, H(U_i \mid U_1^{i-1}, \mathbf{Y}) \in [\epsilon, 1 - \epsilon] \right\}\right| = 0$

The proof follows along similar lines as the seminal proof:
◮ Define a tree process
◮ Show that the process is a submartingale
◮ Show that the submartingale can only converge to 0 or 1

All the above follow easily, once we notice the following:
◮ Let $X \odot X'$ be two concatenated inputs to the channel
◮ Denote the corresponding output $Y \odot Y'$
◮ Then, $H(A \mid B, Y \odot Y') \ge H(A \mid B, Y, Y')$, since the concatenation $Y \odot Y'$ is a function of the pair $(Y, Y')$, and conditioning on more information can only reduce entropy

11 / 21
Strong polarization

◮ Fix $N = 2^n$, $n_0 = \lfloor \gamma \cdot n \rfloor$ and $n_1 = \lceil (1 - \gamma) \cdot n \rceil$
◮ Define $N_0 = 2^{n_0}$ and $N_1 = 2^{n_1}$
◮ Let $X_1, X_2, \ldots, X_{N_1}$ be i.i.d. blocks of length $N_0$
◮ Suppose the channel input is $X_1 \odot X_2 \odot \cdots \odot X_{N_1}$
◮ Decoder sees $Y_1 \odot Y_2 \odot \cdots \odot Y_{N_1}$
◮ If only we had a genie to "punctuate" the output into $Y_1, Y_2, \ldots, Y_{N_1}$, proving strong polarization would be easy...

12 / 21
A "good enough" genie

◮ We would like this: [figure]
◮ We will settle for this: [figure]
◮ No head...
◮ No tail...

13 / 21
A "good enough" genie

◮ Decoder sees $Y_1 \odot Y_2 \odot \cdots \odot Y_{N_1}$
◮ Decoder wants a genie to punctuate the above into $Y_1, Y_2, \ldots, Y_{N_1}$
◮ Our "good enough" genie will give the decoder $Y_1^\star, Y_2^\star, \ldots, Y_{N_1}^\star$, where $Y_i^\star$ is $Y_i$ with leading and trailing '0' symbols removed
◮ Asymptotically, we have sacrificed nothing because $I(\mathbf{X}; \mathbf{Y}) = I(\mathbf{X}; \mathbf{Y}^\star)$

14 / 21
Building our genie

◮ Guard bands added at the encoder
◮ Denote $x = x_{\mathrm{I}} \odot x_{\mathrm{II}} \in \mathcal{X}^{2^n}$, where $\mathcal{X} = \{0, 1\}$,
$x_{\mathrm{I}} = x_1^{2^{n-1}} \in \mathcal{X}^{2^{n-1}}$, and $x_{\mathrm{II}} = x_{2^{n-1}+1}^{2^n} \in \mathcal{X}^{2^{n-1}}$
◮ That is, instead of transmitting $x$, we transmit $g(x)$, where
$g(x) \triangleq \begin{cases} x & \text{if } n \le n_0 \\ g(x_{\mathrm{I}}) \odot \underbrace{00\ldots0}_{\ell_n} \odot g(x_{\mathrm{II}}) & \text{if } n > n_0, \end{cases}$
with $\ell_n \triangleq 2^{\lfloor (1 - \epsilon)(n - 1) \rfloor}$
◮ $\epsilon$ is a 'small' constant

(a recursive sketch of g follows)

15 / 21
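A minimal recursive sketch of the guard-band map g above (our illustration; `add_guard_bands` is a hypothetical name, and inputs are assumed to be lists of bits whose length is a power of two).

```python
def add_guard_bands(x, n0, eps):
    """Compute g(x): below block length 2**n0, transmit x as-is;
    otherwise recurse on both halves and join them with a guard
    band of 2**floor((1 - eps) * (n - 1)) zeros."""
    n = len(x).bit_length() - 1          # len(x) == 2**n
    if n <= n0:
        return x
    ell = 2 ** int((1 - eps) * (n - 1))  # floor, argument is >= 0
    half = len(x) // 2
    return (add_guard_bands(x[:half], n0, eps)
            + [0] * ell
            + add_guard_bands(x[half:], n0, eps))

# Example: an n = 3 block with n0 = 1 gets guard bands at two levels.
g = add_guard_bands([1] * 8, n0=1, eps=0.1)
```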
The genie in action

[Figure: the input $X = X_{\mathrm{I}} \odot X_{\mathrm{II}}$, transmitted block $G = G_{\mathrm{I}} \odot G^\triangle \odot G_{\mathrm{II}}$, received block $Y = Y_{\mathrm{I}} \odot Y^\triangle \odot Y_{\mathrm{II}}$, and $Z = Z_{\mathrm{I}} \odot Z^\triangle \odot Z_{\mathrm{II}}$]

◮ $Z$ is $Y$ with leading and trailing '0' symbols removed
◮ Guard band $Z^\triangle$ removed by splitting $Z$ in half, and then removing leading and trailing '0' symbols from each half
◮ Genie successful if the middle of $Z$ falls in the guard band

(a sketch of the recursive split follows)

16 / 21
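A minimal sketch of the decoder-side split (hypothetical helper names; assumes the recursion depth matches the number of guard-band levels).

```python
def strip_zeros(z):
    """Remove leading and trailing '0' symbols."""
    i, j = 0, len(z)
    while i < j and z[i] == 0:
        i += 1
    while j > i and z[j - 1] == 0:
        j -= 1
    return z[i:j]

def genie_split(y, levels):
    """Recover the punctuated outputs Y*_1, ..., Y*_{2**levels}:
    strip outer zeros, cut at the midpoint, recurse on both halves.
    The split succeeds whenever each midpoint lands inside the
    guard band separating the corresponding halves."""
    z = strip_zeros(y)
    if levels == 0:
        return [z]
    mid = len(z) // 2
    return (genie_split(z[:mid], levels - 1)
            + genie_split(z[mid:], levels - 1))
```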
Conclusions

◮ Strong polarization for the deletion channel with constant deletion probability δ
◮ The error rate $2^{-\Lambda^\gamma}$ comes from balancing strong polarization against guard-band failure
◮ If the capacity of the deletion channel is achievable by hidden-Markov inputs, then our scheme achieves capacity!

17 / 21