Polar Coding for Processes with Memory

Eren Şaşoğlu¹   Ido Tal²

¹Intel   ²Technion
◮ Well known: polarization occurs for a memoryless process
◮ Our setting: a process with memory
◮ Mild assumption: $\psi$-mixing, with $\psi_0 < \infty$
◮ New: both weak and fast polarization occur under this mild assumption
◮ New: an example of a stationary periodic process that does not polarize
Process:
◮ $(X_j, Y_j, S_j)_{j=-\infty}^{\infty}$
◮ Polarization applied to $X_j$: $U_1^N = X_1^N G_N$ (a code sketch of this transform follows below)
◮ $Y_j$: channel output / side information
◮ $S_j$: process state (usually hidden)

Entropy rate:
$$H_{X|Y} = \lim_{N\to\infty} \frac{1}{N} H(X_1^N \mid Y_1^N)$$
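To make the transform concrete, here is a minimal sketch of computing $U_1^N = X_1^N G_N$ over GF(2), assuming the common convention $G_N = F^{\otimes n}$ with $F = \bigl(\begin{smallmatrix} 1 & 0 \\ 1 & 1 \end{smallmatrix}\bigr)$ and no bit-reversal permutation (the slides do not spell out the convention):

```python
import numpy as np

def polar_transform(x):
    """Compute u = x F^{⊗n} over GF(2), where F = [[1, 0], [1, 1]].

    Uses the block structure F^{⊗n} = [[F', 0], [F', F']] with
    F' = F^{⊗(n-1)}: for x = (a, b), u = ((a + b) F', b F').
    Note: some papers define G_N with an extra bit-reversal permutation.
    """
    x = np.asarray(x, dtype=np.uint8)
    N = len(x)
    if N == 1:
        return x
    half = N // 2
    left = polar_transform(x[:half] ^ x[half:])  # (a + b) F'
    right = polar_transform(x[half:])            # b F'
    return np.concatenate([left, right])

print(polar_transform([1, 0, 1, 1]))  # -> [1 1 0 1]
```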
Theorem (Weak polarization)
If the process is $\psi$-mixing with $\psi_0 < \infty$, then for all $\epsilon > 0$,
$$\lim_{N\to\infty} \frac{1}{N} \left|\left\{ i : H(U_i \mid U_1^{i-1}, Y_1^N) > 1-\epsilon \right\}\right| = H_{X|Y},$$
$$\lim_{N\to\infty} \frac{1}{N} \left|\left\{ i : H(U_i \mid U_1^{i-1}, Y_1^N) < \epsilon \right\}\right| = 1 - H_{X|Y}.$$

Theorem (Fast polarization)
If the process is $\psi$-mixing with $\psi_0 < \infty$, then for all $\beta < 1/2$,
$$\lim_{N\to\infty} \frac{1}{N} \left|\left\{ i : Z(U_i \mid U_1^{i-1}, Y_1^N) < 2^{-N^\beta} \right\}\right| = 1 - H_{X|Y}.$$
Missing: fast polarization to entropy 1...

Even so, the above theorems imply:
◮ a polar coding transmission scheme for the Gilbert-Elliott channel (a simulation sketch follows below)
  [Figure: two-state Markov chain; good state acts as BSC($p_g$), bad state as BSC($p_b$); cross-transitions with probabilities $q_g$, $q_b$, self-loops $1 - q_g$, $1 - q_b$]
◮ a polar coding lossless compression scheme for sources with memory
  [Figure: the same two-state Markov chain; good state emits Ber($p_g$), bad state emits Ber($p_b$)]

See also: R. Wang, J. Honda, H. Yamamoto, R. Liu, and Y. Hou, "Construction of polar codes for channels with memory," in Proc. IEEE Inform. Theory Workshop (ITW'2015), Jeju Island, Korea, 2015, pp. 187–191.
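For intuition, a minimal simulation sketch of the Gilbert-Elliott channel; the parameter names and the exact reading of the transition labels $q_g$, $q_b$ in the figure are our assumptions:

```python
import numpy as np

def gilbert_elliott(x, p_g, p_b, q_g, q_b, rng=None):
    """Pass bits x through a Gilbert-Elliott channel.

    A hidden two-state Markov chain selects a BSC per symbol:
    good state -> BSC(p_g), bad state -> BSC(p_b).
    Assumed convention: Pr(good -> bad) = q_g, Pr(bad -> good) = q_b.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=np.uint8)
    bad = rng.random() < q_g / (q_g + q_b)  # stationary Pr(bad state)
    y = np.empty_like(x)
    for j in range(len(x)):
        y[j] = x[j] ^ (rng.random() < (p_b if bad else p_g))
        bad = (rng.random() >= q_b) if bad else (rng.random() < q_g)
    return y

y = gilbert_elliott(np.zeros(16, dtype=np.uint8),
                    p_g=0.01, p_b=0.3, q_g=0.1, q_b=0.2)
```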
Theorem (Periodic processes may not polarize)
The stationary periodic Markov process below does not polarize.
[Figure: four states on a deterministic cycle $0 \to 1 \to 2 \to 3 \to 0$; $X \sim \mathrm{Ber}(1/2)$ in states 0 and 1, $X = 0$ in states 2 and 3]
Indeed, for all $5N/8 < i \le 6N/8$,
$$\left| H(U_i \mid U_1^{i-1}) - \tfrac{1}{2} \right| \le \epsilon_N, \qquad \lim_{N\to\infty} \epsilon_N = 0.$$
[Figure: the four-state periodic process from the previous slide]

Lemma
Consider the stationary Markov process depicted in the figure. Then, for $N \ge 8$, the following holds: for all $5N/8 < i \le 6N/8$ we have
$$H(U_i \mid U_1^{i-1}, S_1 = s_1) = \begin{cases} 0 & \text{if } s_1 \in \{1, 3\}, \\ 1 & \text{if } s_1 \in \{0, 2\}, \end{cases}$$
and hence
$$H(U_i \mid U_1^{i-1}, S_1) = \tfrac{1}{2}.$$
[Figure: the four-state periodic process from the previous slide]

◮ Table: distribution of $U_1^5$ for $N = 8$ and the four possible initial states (an enumeration sketch follows below)

  $S_1$    $(U_2, U_4)$     $(U_1, U_3, U_5)$
  0        $U_4 = 0$
  1        i.i.d.           $U_5 = U_3$
  2        $U_4 = U_2$
  3        i.i.d.           $U_5 = U_3 + U_1$

◮ First column: differentiates between $S_1 = 0$, $S_1 = 2$, and $S_1 \in \{1, 3\}$
◮ Second column: differentiates between $S_1 = 1$ and $S_1 = 3$
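A hedged verification sketch (our construction, reusing `polar_transform` from the earlier sketch): enumerate every realization of $X_1^8$ for each initial state and inspect the support of $(U_1, \ldots, U_5)$; which exact linear relations appear depends on the transform convention (with or without bit reversal), so treat this as exploratory:

```python
from itertools import product
import numpy as np

def all_x(s1, N=8):
    """Enumerate all realizations of X_1^N when the state chain starts at s1.

    States cycle 0 -> 1 -> 2 -> 3 -> 0; X ~ Ber(1/2) in states 0 and 1,
    X = 0 in states 2 and 3, as in the figure.
    """
    states = [(s1 + j) % 4 for j in range(N)]
    free = [j for j, s in enumerate(states) if s in (0, 1)]
    for bits in product([0, 1], repeat=len(free)):
        x = np.zeros(N, dtype=np.uint8)
        x[free] = bits
        yield x

for s1 in range(4):
    support = {tuple(polar_transform(x)[:5]) for x in all_x(s1)}
    print(f"S1 = {s1}: (U_1,...,U_5) takes {len(support)} values")
```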
[Figure: the four-state periodic process from the previous slide]

◮ Counterexamples for other periods $p$?
◮ Specifically, is it important that $p$ divides $2^n$?
A process $T_j = (X_j, Y_j, S_j)$ is $\psi$-mixing if there is a sequence $\psi_0, \psi_1, \ldots$ with $\lim_{k\to\infty} \psi_k = 1$, such that
$$\Pr(A \cap B) \le \psi_k \Pr(A) \Pr(B)$$
for all $A \in \sigma(T_{-\infty}^0)$ and $B \in \sigma(T_{k+1}^\infty)$.

Graphically:
$$\cdots, T_{-2}, T_{-1}, T_0 \;\big|\; \underbrace{T_1, T_2, \ldots, T_{k-1}, T_k}_{\text{gap of length } k} \;\big|\; T_{k+1}, T_{k+2}, T_{k+3}, \cdots$$

i.i.d. / aperiodic Markov / aperiodic hidden Markov $\Longrightarrow$ $\psi_0 < \infty$ (a one-line check for the i.i.d. case follows below).
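As a sanity check (our observation, not on the slide): an i.i.d. process is $\psi$-mixing with $\psi_k = 1$ for every $k$, since the past and the future are independent:
$$\Pr(A \cap B) = \Pr(A)\Pr(B) \le 1 \cdot \Pr(A)\Pr(B), \qquad A \in \sigma(T_{-\infty}^0), \; B \in \sigma(T_{k+1}^\infty).$$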
◮ Let $N = 2^n$ and $1 \le i \le N$.
◮ Notation:
$$U_1^N = X_1^N G_N, \qquad V_1^N = X_{N+1}^{2N} G_N,$$
$$Q_i = (Y_1^N, U_1^{i-1}), \qquad R_i = (Y_{N+1}^{2N}, V_1^{i-1}).$$
◮ Notation for independent blocks:
  ◮ Let $(\hat{X}_1^{2N}, \hat{Y}_1^{2N})$ be distributed as $P_{X_1^N Y_1^N} \cdot P_{X_{N+1}^{2N} Y_{N+1}^{2N}}$
  ◮ Define the corresponding variables $\hat{U}_i, \hat{V}_i, \hat{Q}_i, \hat{R}_i$ as above
◮ Bhattacharyya parameter: for $U$ and $Q$, define
$$Z(U \mid Q) = 2 \sum_q \sqrt{P_{U,Q}(0, q) \cdot P_{U,Q}(1, q)}$$
(a numeric sketch follows below).
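A minimal numeric sketch of this definition (the joint-pmf layout and the example numbers are our choices):

```python
import numpy as np

def bhattacharyya(p_uq):
    """Z(U|Q) = 2 * sum_q sqrt(P(0,q) * P(1,q)) for a joint pmf
    given as a 2 x |Q| array: row = value of U, column = value of Q."""
    return 2.0 * float(np.sum(np.sqrt(p_uq[0] * p_uq[1])))

# Example: U ~ Ber(1/2), and Q is U observed through a BSC(e).
e = 0.1
p = np.array([[0.5 * (1 - e), 0.5 * e],
              [0.5 * e,       0.5 * (1 - e)]])
print(bhattacharyya(p))  # 2 * sqrt(e * (1 - e)) ≈ 0.6
```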
Proof of fast polarization:
$$\begin{aligned}
Z(U_i + V_i \mid Q_i, R_i) &= 2\sum_{q,r} \sqrt{P_{U_i+V_i, Q_i, R_i}(0, q, r) \cdot P_{U_i+V_i, Q_i, R_i}(1, q, r)} \\
&\le 2\sum_{q,r} \sqrt{\psi_0 P_{\hat{U}_i+\hat{V}_i, \hat{Q}_i, \hat{R}_i}(0, q, r) \cdot \psi_0 P_{\hat{U}_i+\hat{V}_i, \hat{Q}_i, \hat{R}_i}(1, q, r)} \\
&= \psi_0 \cdot Z(\hat{U}_i + \hat{V}_i \mid \hat{Q}_i, \hat{R}_i) \\
&\le \psi_0 \cdot 2 Z(\hat{U}_i \mid \hat{Q}_i) = \psi_0 \cdot 2 Z(U_i \mid Q_i).
\end{aligned}$$
In a similar manner, we show
$$Z(V_i \mid U_i + V_i, Q_i, R_i) \le \psi_0 \cdot Z(U_i \mid Q_i)^2.$$
Now apply the result of Arıkan and Telatar (ISIT 2009), assuming weak polarization.
Proof of weak polarization: Recall our notation:
$$U_1^N = X_1^N G_N, \qquad V_1^N = X_{N+1}^{2N} G_N,$$
$$Q_i = (Y_1^N, U_1^{i-1}), \qquad R_i = (Y_{N+1}^{2N}, V_1^{i-1}).$$

Lemma: If $\psi_0 < \infty$, then for any $\epsilon > 0$, the fraction of indices $i$ for which
$$I(U_i; R_i \mid Q_i) < \epsilon, \qquad I(V_i; Q_i \mid R_i) < \epsilon, \qquad I(U_i; V_i \mid Q_i, R_i) < \epsilon$$
approaches 1 as $N \to \infty$.
Proof:
$$\begin{aligned}
\log(\psi_0) &\ge E\left[\log \frac{p_{X_1^{2N} Y_1^{2N}}}{p_{X_1^N Y_1^N} \cdot p_{X_{N+1}^{2N} Y_{N+1}^{2N}}}\right] \\
&= I(X_1^N Y_1^N;\, X_{N+1}^{2N} Y_{N+1}^{2N}) \\
&= I(U_1^N Y_1^N;\, V_1^N Y_{N+1}^{2N}) \\
&\ge I(U_1^N;\, V_1^N Y_{N+1}^{2N} \mid Y_1^N) \\
&= \sum_{i=1}^N I(U_i;\, V_1^N Y_{N+1}^{2N} \mid Y_1^N U_1^{i-1}) \\
&= \sum_{i=1}^N I(U_i;\, V_{i+1}^N, V_i, R_i \mid Q_i)
\end{aligned}$$
◮ At most $\sqrt{N \log(\psi_0)}$ terms inside the sum exceed $\sqrt{\log(\psi_0)/N}$ (the counting step is spelled out below)
◮ The $i$-th term is at least as large as both $I(U_i; R_i \mid Q_i)$ and $I(U_i; V_i \mid Q_i, R_i)$
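The counting step, spelled out (our arithmetic): the terms are nonnegative and sum to at most $\log(\psi_0)$, so if $m$ of them exceed $\sqrt{\log(\psi_0)/N}$, then
$$\log(\psi_0) \ge m \sqrt{\log(\psi_0)/N} \quad \Longrightarrow \quad m \le \sqrt{N \log(\psi_0)},$$
and the fraction of such indices, $m/N \le \sqrt{\log(\psi_0)/N}$, vanishes as $N \to \infty$.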
Lemma: Let $(X_i, Y_i)$ be stationary and $\psi$-mixing. For all $\xi > 0$, there exist $N_0$ and $\delta(\xi) > 0$ such that for all $N > N_0$ and all $\{0,1\}$-valued random variables $A = f(X_1^N, Y_1^N)$ and $B = f(X_{N+1}^{2N}, Y_{N+1}^{2N})$,
$$p_A(0) \in (\xi, 1 - \xi) \quad \text{implies} \quad p_{AB}(0, 1) > \delta(\xi).$$
Proof: Define the random variable $C = f(X_{2N+1}^{3N}, Y_{2N+1}^{3N})$. We have
$$\begin{aligned}
2\, p_{AB}(0, 1) &= p_{AB}(0, 1) + p_{BC}(0, 1) \\
&\ge p_{ABC}(0, 1, 1) + p_{ABC}(0, 0, 1) \\
&= p_{AC}(0, 1) \\
&= p_A(0) - p_{AC}(0, 0) \\
&\ge p_A(0)\,(1 - \psi_N\, p_C(0)) \\
&= p_A(0)\,(1 - \psi_N\, p_A(0))
\end{aligned}$$
◮ The first and last equalities are due to stationarity
◮ Since $p_A(0) \in (\xi, 1 - \xi)$ and $\psi_N \to 1$, there exists $N_0$ such that the last term is bounded away from 0 for all $N > N_0$ (see the arithmetic below)
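One way to make the last bullet concrete (our arithmetic, not on the slide): on $(\xi, 1-\xi)$ the concave function $p \mapsto p\,(1 - \psi_N p)$ attains its minimum at an endpoint, so
$$2\, p_{AB}(0, 1) \;\ge\; \min\bigl\{ \xi (1 - \psi_N \xi),\; (1 - \xi)(1 - \psi_N (1 - \xi)) \bigr\} \;\xrightarrow[N \to \infty]{}\; \xi (1 - \xi),$$
and hence any $\delta(\xi)$ slightly below $\xi(1 - \xi)/2$ works once $N$ is large enough.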
◮ The above two lemmas are the essence of the proof
◮ A proof for the case of finite memory was given in the Ph.D. thesis of Şaşoğlu
◮ The current proof is more general, and easier to follow (though there are similarities)