Fast Polarization for Processes with Memory
Joint work with Eren Şaşoğlu and Boaz Shuval
Polar codes in one slide
[Figure: channel W with input X_1^N and output Y_1^N]
Polar coding
◮ Information vector: Ũ_1^k
◮ Padding: U_1^N = f(Ũ_1^k)
◮ Encoding: X_1^N = U_1^N · G_N^{-1}
◮ Decoding: successively deduce U_i from U_1^{i-1} and Y_1^N
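Added for illustration (not part of the talk): a minimal sketch of the polar transform over GF(2), assuming the common choice G_N = F ⊗ · · · ⊗ F (the n-fold Kronecker power of Arıkan's kernel F = [[1, 0], [1, 1]]), with the bit-reversal permutation omitted. Because this matrix is its own inverse mod 2, the same routine computes both U_1^N = X_1^N · G_N and X_1^N = U_1^N · G_N^{-1}.

```python
import numpy as np

def polar_matrix(n):
    """Return G_N as the n-fold Kronecker power of F = [[1, 0], [1, 1]], N = 2^n."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    G = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        G = np.kron(G, F)
    return G

def polar_transform(x):
    """Compute x · G_N over GF(2). Since G_N squares to the identity mod 2,
    applying the same map twice recovers x, so this also serves as G_N^{-1}."""
    n = int(np.log2(len(x)))
    return (np.asarray(x, dtype=np.uint8) @ polar_matrix(n)) % 2

x = np.random.randint(0, 2, 8)
u = polar_transform(x)
assert np.array_equal(polar_transform(u), x)   # involution check
```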
Polar codes in two slides: [Arıkan:09], [ArıkanTelatar:09]
◮ Setting: binary-input, symmetric, memoryless channel
◮ Polar transform: U_1^N = X_1^N · G_N; U_1^N uniform ⟺ X_1^N uniform
◮ Low entropy indices: fix β < 1/2,
      Λ_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) < 2^{-N^β} }
◮ Polarization: let X_1^N be uniform; then
      lim_{N→∞} (1/N) |Λ_N| = I(X_1; Y_1)
◮ Coding scheme:
  ◮ For i ∈ Λ_N, set U_i equal to information bits (uniform)
  ◮ Set the remaining U_i to uniform values, revealed to the decoder
  ◮ Transmit X_1^N = U_1^N · G_N^{-1} as the codeword
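Added for illustration: a toy numerical check of the polarization statement on a BEC(ε), where the Bhattacharyya parameters of the synthetic channels evolve exactly as Z → 2Z − Z² and Z → Z²; the fraction of indices with tiny Z approaches the capacity 1 − ε, and the fraction with Z near 1 approaches ε. The thresholds and parameters below are arbitrary.

```python
# Exact Bhattacharyya evolution for the BEC: a synthetic channel with parameter z
# splits into 2z - z^2 (the "minus" branch) and z^2 (the "plus" branch).
def bec_synthetic_Z(eps, n):
    zs = [eps]
    for _ in range(n):
        zs = [w for z in zs for w in (2*z - z*z, z*z)]
    return zs

eps, n = 0.3, 16                              # BEC(0.3), N = 2^16 synthetic channels
zs = bec_synthetic_Z(eps, n)
N = len(zs)
low  = sum(z < 1e-6     for z in zs) / N      # approaches capacity 1 - eps
high = sum(z > 1 - 1e-6 for z in zs) / N      # approaches eps
print(low, high)
```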
In this talk
Setting: binary-input, symmetric, memoryless channel
(with "memoryless" crossed out: we drop the memoryless assumption)
Polar codes: [Şaşoğlu+:09], [KoradaUrbanke:10], [HondaYamamoto:13]
◮ Setting: memoryless i.i.d. process (X_i, Y_i), i = 1, …, N
◮ For simplicity: assume X_i binary
◮ Polar transform: U_1^N = X_1^N · G_N
◮ Index sets:
      Low entropy:  Λ_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) < 2^{-N^β} }
      High entropy: Ω_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) > 1/2 − 2^{-N^β} }
◮ Polarization:
      lim_{N→∞} (1/N) |Λ_N| = 1 − H(X_1 | Y_1)
      lim_{N→∞} (1/N) |Ω_N| = H(X_1 | Y_1)
Polar codes: [Şaşoğlu+:09], [KoradaUrbanke:10], [HondaYamamoto:13]
Optimal rate for:
◮ Coding for non-symmetric memoryless channels
◮ Coding for memoryless channels with non-binary inputs
◮ (Lossy) compression of memoryless sources
Question
◮ How to handle memory?
Roadmap
Index sets
      Low entropy:  Λ_N(ε) = { i : P_error(U_i | U_1^{i-1}, Y_1^N) < ε }
      High entropy: Ω_N(ε) = { i : P_error(U_i | U_1^{i-1}, Y_1^N) > 1/2 − ε }
Plan
◮ Define a framework for handling memory
◮ Establish:
  ◮ Slow polarization: for fixed ε > 0,
        lim_{N→∞} (1/N) |Λ_N(ε)| = 1 − H⋆(X|Y)
        lim_{N→∞} (1/N) |Ω_N(ε)| = H⋆(X|Y)
    where H⋆(X|Y) = lim_{N→∞} (1/N) H(X_1^N | Y_1^N)
  ◮ Fast polarization: the above also holds for ε = 2^{-N^β}
  ◮ What is β?
A framework for memory
◮ Process: (X_i, Y_i, S_i), i = 1, …, N
◮ Finite number of states: S_i ∈ S, where |S| < ∞
◮ Hidden state: S_i is unknown to encoder and decoder
◮ Probability distribution: P(x_i, y_i, s_i | s_{i-1})
  ◮ Stationary: the same for all i
  ◮ Markov: P(x_i, y_i, s_i | s_{i-1}) = P(x_i, y_i, s_i | {x_j, y_j, s_j}_{j<i})
◮ State sequence: aperiodic and irreducible Markov chain
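Added for illustration: a minimal sampler for this framework. The caller supplies the stationary distribution π of the state chain and a kernel drawing (x, y, s) from P(x, y, s | s_prev); the function names, the kernel signature, and the Gilbert-Elliott-style instance at the bottom are illustrative assumptions, not from the talk.

```python
import random

def sample_process(pi, kernel, N, rng=random):
    """Draw (x_i, y_i, s_i), i = 1..N, from a stationary hidden-state model.

    pi     -- dict {state: probability}, stationary distribution of the state chain
    kernel -- function(prev_state, rng) -> (x, y, s), one draw from P(x, y, s | s_prev)
    """
    states, probs = zip(*pi.items())
    s = rng.choices(states, probs)[0]          # start the chain in stationarity
    xs, ys, ss = [], [], []
    for _ in range(N):
        x, y, s = kernel(s, rng)
        xs.append(x); ys.append(y); ss.append(s)
    return xs, ys, ss

# Example-1-style instance: i.i.d. uniform input, two channel states "good"/"bad"
# with crossover 0.01 / 0.2, and symmetric state flips with probability 0.1
# (so the uniform state distribution is stationary).
def ge_kernel(s_prev, rng):
    s = s_prev if rng.random() < 0.9 else 1 - s_prev
    x = rng.randint(0, 1)
    p = 0.01 if s == 0 else 0.2
    y = x ^ (rng.random() < p)
    return x, y, s

xs, ys, ss = sample_process({0: 0.5, 1: 0.5}, ge_kernel, N=16)
```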
Example 1
◮ Model: finite-state channel P_s(y | x), s ∈ S
◮ Input distribution: X_i i.i.d. and independent of the state
◮ State transition: π(s_i | s_{i-1})
◮ Distribution: P(x_i, y_i, s_i | s_{i-1}) = P(x_i) π(s_i | s_{i-1}) P_{s_i}(y_i | x_i)
Example 2
◮ Model: ISI + noise
      Y_i = h_0 X_i + h_1 X_{i-1} + · · · + h_m X_{i-m} + noise
◮ Input: X_i has memory, P(x_i | x_{i-1}, x_{i-2}, …, x_{i-m}, x_{i-m-1})
◮ State: S_i = [ X_i  X_{i-1}  · · ·  X_{i-m} ]
◮ Distribution: for x_i, s_i, s_{i-1} compatible,
      P(x_i, y_i, s_i | s_{i-1}) = P_noise(y_i | h^T s_i) · P(x_i | s_{i-1})
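Added for illustration: a small sketch of this example with a first-order Markov input (a simplification of the general P(x_i | x_{i-1}, …, x_{i-m-1})) and additive Gaussian noise; the taps h, the flip probability, and the noise level are placeholder values.

```python
import random

def sample_isi(h, N, flip=0.3, sigma=0.5, rng=random):
    """Y_i = sum_j h[j] * X_{i-j} + Gaussian noise; the state is the last m+1 inputs."""
    m = len(h) - 1
    state = [0] * (m + 1)                  # [X_i, X_{i-1}, ..., X_{i-m}], initially all zero
    xs, ys = [], []
    for _ in range(N):
        x_prev = state[0]
        x = x_prev if rng.random() > flip else 1 - x_prev   # Markov input: flip w.p. `flip`
        state = [x] + state[:-1]                            # shift the state register
        y = sum(hj * xj for hj, xj in zip(h, state)) + rng.gauss(0, sigma)
        xs.append(x); ys.append(y)
    return xs, ys

xs, ys = sample_isi(h=[1.0, 0.6, 0.2], N=10)
```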
Example 3
◮ Model: (d, k)-RLL constrained system with noise
◮ Here: a (1, ∞)-RLL constrained input X_1^N passed through a BSC(p) to produce Y_1^N
[Figure: two-state Markov chain for the (1, ∞)-RLL input (from state 0, emit 1 with probability α or 0 with probability 1 − α; from state 1, emit 0), combined with the BSC(p) output; the transition labels x_i / y_i carry probabilities such as α(1 − p) and (1 − α)p, giving P(x_i, y_i, s_i | s_{i-1})]
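Added for illustration: a sketch that samples this example, generating a (1, ∞)-RLL input with the two-state chain above and observing it through a BSC(p); α and p are placeholder values.

```python
import random

def sample_rll_bsc(alpha, p, N, rng=random):
    """(1, inf)-RLL source through BSC(p): after a 1 the next input must be 0."""
    s = 0                                   # state = previous input bit
    xs, ys = [], []
    for _ in range(N):
        if s == 1:
            x = 0                           # constraint: no two consecutive 1s
        else:
            x = 1 if rng.random() < alpha else 0
        y = x ^ (rng.random() < p)          # BSC(p) flip
        xs.append(x); ys.append(y)
        s = x
    return xs, ys

xs, ys = sample_rll_bsc(alpha=0.4, p=0.1, N=20)
```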
Example 4
◮ Model: lossy compression of a source with memory
      [Figure: Y_1^N → lossy compression → X_1^N]
◮ Source distribution: P_s(y), s ∈ S
◮ State transition: π(s_i | s_{i-1})
◮ Distortion: test channel P(x | y)
◮ Distribution: P(x_i, y_i, s_i | s_{i-1}) = π(s_i | s_{i-1}) P_{s_i}(y_i) P(x_i | y_i)
Polar codes: [Şaşoğlu:11], [ŞaşoğluTal:16], [ShuvalTal:17]
◮ Setting: process (X_i, Y_i, S_i), i = 1, …, N, with memory, as above
◮ Hidden state: state unknown to encoder and decoder
◮ Polar transform: U_1^N = X_1^N · G_N
      The U_i are neither independent nor identically distributed
◮ Index sets: fix β < 1/2,
      Low entropy:  Λ_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) < 2^{-N^β} }
      High entropy: Ω_N = { i : P_error(U_i | U_1^{i-1}, Y_1^N) > 1/2 − 2^{-N^β} }
◮ Polarization:
      lim_{N→∞} (1/N) |Λ_N| = 1 − H⋆(X|Y)
      lim_{N→∞} (1/N) |Ω_N| = H⋆(X|Y)
  where H⋆(X|Y) = lim_{N→∞} (1/N) H(X_1^N | Y_1^N)
Achievable rate
◮ Achievable rate: in all examples, R approaches
      I⋆(X; Y) = lim_{N→∞} (1/N) I(X_1^N; Y_1^N)
◮ Successive cancellation: [Wang+:15]
Mixing
Consider the process (X_i, Y_i); the state is hidden.
[Figure: two blocks, (X_1, Y_1), …, (X_L, Y_L) and (X_{M+1}, Y_{M+1}), …, (X_N, Y_N), separated by a gap of length M − L]
Then there exist ψ(k), k ≥ 0, such that
      P_{X_1^L, Y_1^L, X_{M+1}^N, Y_{M+1}^N} ≤ ψ(M − L) · P_{X_1^L, Y_1^L} · P_{X_{M+1}^N, Y_{M+1}^N}
where:
◮ ψ(0) < ∞
◮ ψ(k) → 1 as k → ∞
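Added for illustration: one way to see why ψ(0) is finite in this framework, sketched for adjacent blocks (M = L) under the stated Markov and stationarity assumptions; conditioning on the state S_L that separates the two blocks gives

```latex
\begin{align*}
P\bigl(x_1^L, y_1^L,\, x_{L+1}^N, y_{L+1}^N\bigr)
  &= \sum_{s} P\bigl(x_1^L, y_1^L, S_L = s\bigr)\, P\bigl(x_{L+1}^N, y_{L+1}^N \mid S_L = s\bigr) \\
  &= \sum_{s} \frac{P\bigl(x_1^L, y_1^L, S_L = s\bigr)\, P\bigl(x_{L+1}^N, y_{L+1}^N, S_L = s\bigr)}{\pi(s)} \\
  &\le \Bigl(\max_{s} \tfrac{1}{\pi(s)}\Bigr)
       \sum_{s} P\bigl(x_1^L, y_1^L, S_L = s\bigr)\,
       \sum_{s'} P\bigl(x_{L+1}^N, y_{L+1}^N, S_L = s'\bigr) \\
  &= \Bigl(\max_{s} \tfrac{1}{\pi(s)}\Bigr)\, P\bigl(x_1^L, y_1^L\bigr)\, P\bigl(x_{L+1}^N, y_{L+1}^N\bigr),
\end{align*}
```

so one may take ψ(0) = max_s 1/π(s), the quantity that reappears as ψ̂ on the proof slide for the case with memory.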
Three parameters
◮ Joint distribution P(x, y)
◮ For simplicity: X ∈ {0, 1}
◮ Parameters:
      Entropy:        H(X|Y) = − Σ_{x,y} P(x, y) log P(x | y)
      Bhattacharyya:  Z(X|Y) = 2 Σ_y √( P(0, y) P(1, y) )
      T.V. distance:  K(X|Y) = Σ_y | P(0, y) − P(1, y) |
◮ Connections:
      H ≈ 0 ⟺ Z ≈ 0 ⟺ K ≈ 1
      H ≈ 1 ⟺ Z ≈ 1 ⟺ K ≈ 0
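Added for illustration: a small routine computing the three parameters from a joint pmf table P(x, y) with x ∈ {0, 1} (entropy in bits), handy for checking the stated connections numerically; the BSC instance at the bottom is an arbitrary choice.

```python
import math

def parameters(P):
    """P[x][y]: joint pmf with x in {0, 1}. Returns (H(X|Y), Z(X|Y), K(X|Y))."""
    ys = range(len(P[0]))
    H = -sum(P[x][y] * math.log2(P[x][y] / (P[0][y] + P[1][y]))
             for x in (0, 1) for y in ys if P[x][y] > 0)
    Z = 2 * sum(math.sqrt(P[0][y] * P[1][y]) for y in ys)
    K = sum(abs(P[0][y] - P[1][y]) for y in ys)
    return H, Z, K

# Uniform X through a BSC(0.001): X is nearly determined by Y, so H and Z come out
# small while K is close to 1.
p = 0.001
P = [[0.5 * (1 - p), 0.5 * p], [0.5 * p, 0.5 * (1 - p)]]
print(parameters(P))
```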
Three processes
For n = 1, 2, …   (underlying process {X_i, Y_i, S_i}; we observe {X_i, Y_i})
◮ N = 2^n
◮ U_1^N = X_1^N · G_N
◮ Pick B_n ∈ {0, 1} uniform, i.i.d.
◮ Random index from {1, 2, …, N}: i = 1 + (B_1 B_2 · · · B_n)_2
◮ Processes:
      Entropy:        H_n = H(U_i | U_1^{i-1}, Y_1^N)
      Bhattacharyya:  Z_n = Z(U_i | U_1^{i-1}, Y_1^N)
      T.V. distance:  K_n = K(U_i | U_1^{i-1}, Y_1^N)
Proof: the memoryless case
Slow polarization
◮ H_n ∈ (ε, 1 − ε)  ⟹  |H_{n+1} − H_n| > 0
◮ Low entropy set:  (1/N) |Λ_N| → 1 − H(X_1 | Y_1) as n → ∞
◮ High entropy set: (1/N) |Ω_N| → H(X_1 | Y_1) as n → ∞
Fast polarization
◮ Z_{n+1} ≤ 2 Z_n      if B_{n+1} = 0
  Z_{n+1} ≤ Z_n^2      if B_{n+1} = 1
◮ New:
  K_{n+1} ≤ K_n^2      if B_{n+1} = 0
  K_{n+1} ≤ 2 K_n      if B_{n+1} = 1
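Added for illustration: for a BEC these recursions hold with equality (Z⁻ = 2Z − Z², Z⁺ = Z², and K = 1 − Z at every step), so one can follow the (Z_n, K_n) process along random branches B_1, B_2, … and watch it polarize; the parameters below are arbitrary.

```python
import random

def branch_ZK(eps, n, rng=random):
    """Follow one random branch B_1..B_n for a BEC(eps); return (Z_n, K_n)."""
    z = eps
    for _ in range(n):
        if rng.random() < 0.5:      # B = 0: "minus" transform
            z = 2 * z - z * z
        else:                       # B = 1: "plus" transform
            z = z * z
    return z, 1 - z                 # for the BEC, K = 1 - Z

random.seed(0)
for _ in range(5):
    print(branch_ZK(eps=0.3, n=30))   # typically (Z, K) near (0, 1) or (1, 0)
```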
Proof: the case with memory
(underlying process {X_i, Y_i, S_i}; we observe {X_i, Y_i})
Slow polarization
◮ H_n ∈ (ε, 1 − ε)  ⟹  |H_{n+1} − H_n| > 0
◮ Low entropy set:  (1/N) |Λ_N| → 1 − H⋆(X|Y) as n → ∞
◮ High entropy set: (1/N) |Ω_N| → H⋆(X|Y) as n → ∞
Fast polarization
◮ Z_{n+1} ≤ 2 ψ Z_n      if B_{n+1} = 0
  Z_{n+1} ≤ ψ Z_n^2      if B_{n+1} = 1
◮ K_{n+1} ≤ ψ̂ K_n^2     if B_{n+1} = 0
  K_{n+1} ≤ 2 ψ̂ K_n     if B_{n+1} = 1
  where ψ̂ = ψ(0) = max_s 1/π(s), and π is the stationary state distribution
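Added for illustration (informal): the constant factors can be absorbed by a rescaling. Setting W_n = ψ Z_n turns the plus branch into a clean squaring, leaving only a constant multiple on the minus branch:

```latex
W_n := \psi Z_n
\quad\Longrightarrow\quad
W_{n+1} \;\le\;
\begin{cases}
2\psi\, W_n, & B_{n+1} = 0,\\
W_n^2,       & B_{n+1} = 1.
\end{cases}
```

Since Z_n ≤ W_n, a recursion of this form is what the standard fast-polarization bootstrapping argument handles once W_n is already known to approach 0; which β < 1/2 survives in the presence of memory is the question the talk turns to next.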
Notation
◮ Two consecutive blocks: (X_1^N, Y_1^N) and (X_{N+1}^{2N}, Y_{N+1}^{2N})
◮ Polar transform:
      U_1^N = X_1^N · G_N
      V_1^N = X_{N+1}^{2N} · G_N
◮ Random index: i = 1 + (B_1 B_2 · · · B_n)_2
◮ Notation:
      Q_i = (U_1^{i-1}, Y_1^N)
      R_i = (V_1^{i-1}, Y_{N+1}^{2N})
Slow polarization
(Recall: U_1^N = X_1^N · G_N, V_1^N = X_{N+1}^{2N} · G_N, Q_i = (U_1^{i-1}, Y_1^N), R_i = (V_1^{i-1}, Y_{N+1}^{2N}))
◮ H_n is a supermartingale
      H_n = H(U_i | Q_i) = H(V_i | R_i)
      H_{n+1} = H(U_i + V_i | Q_i, R_i)           if B_{n+1} = 0
      H_{n+1} = H(V_i | U_i + V_i, Q_i, R_i)      if B_{n+1} = 1
By the chain rule:
      E[H_{n+1} | H_n, …] = (1/2) [ H(U_i + V_i | Q_i, R_i) + H(V_i | U_i + V_i, Q_i, R_i) ]
                          = (1/2) H(U_i + V_i, V_i | Q_i, R_i)
                          = (1/2) H(U_i, V_i | Q_i, R_i)
                          ≤ (1/2) H(U_i | Q_i) + (1/2) H(V_i | R_i)
                          = H_n
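Added for illustration: a tiny numerical check of the two identities used in the chain-rule step, with the conditioning on (Q_i, R_i) dropped for brevity: for any joint pmf of two bits, H(U + V) + H(V | U + V) = H(U, V) ≤ H(U) + H(V).

```python
import math, random

def H(pmf):
    """Entropy in bits of a pmf given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

random.seed(1)
w = [random.random() for _ in range(4)]
t = sum(w)
p = {(u, v): w[2*u + v] / t for u in (0, 1) for v in (0, 1)}   # random joint pmf of (U, V)

H_UV = H(list(p.values()))
H_U  = H([p[0, 0] + p[0, 1], p[1, 0] + p[1, 1]])
H_V  = H([p[0, 0] + p[1, 0], p[0, 1] + p[1, 1]])
# (U + V mod 2, V) is a relabeling of (U, V), so its joint entropy equals H(U, V).
q = {(u ^ v, v): p[u, v] for u in (0, 1) for v in (0, 1)}
H_S   = H([q[0, 0] + q[0, 1], q[1, 0] + q[1, 1]])              # H(U + V)
H_V_S = H(list(q.values())) - H_S                              # H(V | U + V)

assert abs(H_S + H_V_S - H_UV) < 1e-9      # chain rule
assert H_UV <= H_U + H_V + 1e-9            # subadditivity
```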
Slow polarization
Convergence
◮ H_n is a supermartingale
◮ 0 ≤ H_n ≤ 1
⟹ H_n converges a.s. and in L^1 to some H_∞
Polarization
◮ A priori, H_∞ ∈ [0, 1]
◮ We need: H_∞ ∈ {0, 1}
◮ This would be easy if (U_i, Q_i) and (V_i, R_i) were independent
◮ They are not: Y_N is part of Q_i and Y_{N+1} is part of R_i, and these are adjacent symbols of a process with memory
◮ But: for almost all i, we have I(U_i; V_i | Q_i, R_i) < ε
◮ Enough? No. We need to show that Q_i and R_i cannot cooperate to stop polarization