Statistical Mechanical Analysis of Low-Density Parity-Check Codes on General Markov Channel
Ryuhei Mori and Toshiyuki Tanaka
SITA2011, Iwate, 30 November

Concept
It has been shown that large deviations theory (the method of types) is useful for understanding the results of the replica method [Mori 2011].
In this work, large deviations theory (the method of types) for Markov chains is applied to models that include a Markov structure.

Types [Csiszár 1977]
■ $\mathcal{X}$: a finite set
■ $P_x(a)$: the fraction of $a \in \mathcal{X}$ in $x \in \mathcal{X}^N$
Example: For $\mathcal{X} = \{a, b, c\}$ and $x = [a, a, b, a, c, c, a, b]$,
$$P_x(a) = 4/8, \quad P_x(b) = 2/8, \quad P_x(c) = 2/8.$$
$$\mathcal{P}_N(\mathcal{X}) := \{P_x \mid x \in \mathcal{X}^N\}, \qquad |\mathcal{P}_N(\mathcal{X})| = \binom{N + |\mathcal{X}| - 1}{|\mathcal{X}| - 1}$$
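The definition of a type and the count of distinct types can be checked by brute force for a small block length. The following is a minimal Python sketch; the helper name `type_of` and the choice N = 5 are illustrative assumptions, not part of the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

def type_of(x, alphabet):
    """Return the type P_x as a tuple of exact fractions, one per symbol."""
    counts = Counter(x)
    return tuple(Fraction(counts[a], len(x)) for a in alphabet)

alphabet = ("a", "b", "c")

# The example sequence from the slide.
x = ["a", "a", "b", "a", "c", "c", "a", "b"]
P_x = type_of(x, alphabet)

# Enumerate every sequence of a small length N (assumed value) and collect
# the distinct types, then compare with the binomial-coefficient formula.
N = 5
types = {type_of(seq, alphabet) for seq in product(alphabet, repeat=N)}
num_types_formula = comb(N + len(alphabet) - 1, len(alphabet) - 1)
```

Exact fractions are used so that two sequences with the same symbol counts always map to the same dictionary key.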
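
Number of sequences of a particular type
$$\mathcal{T}_P(N) := \{x \in \mathcal{X}^N \mid P_x = P\}$$
$$|\mathcal{T}_P(N)| = \binom{N}{NP(a)\;\; NP(b)\;\; NP(c)} := \frac{N!}{(NP(a))!\,(NP(b))!\,(NP(c))!}$$
$$|\mathcal{T}_P(N)| \approx \exp\{N H(P)\}$$
Usage:
$$\sum_{x \in \mathcal{X}^N} f(P_x) = \sum_{P \in \mathcal{P}_N(\mathcal{X})} |\mathcal{T}_P(N)|\, f(P)$$
For example, when $f(x)$ depends on $x \in \{0,1\}^N$ only through its number of ones,
$$\sum_{x \in \{0,1\}^N} f(x) = \sum_{i=0}^{N} \binom{N}{i} f(i).$$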
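The usage identity, which trades a sum over all sequences for a sum over types weighted by type-class sizes, can be verified numerically. The sketch below is an assumption-laden illustration: the test function `f`, the helpers `type_class_size` and `compositions`, and the value N = 6 are all hypothetical choices.

```python
from collections import Counter
from itertools import product
from math import factorial

def type_class_size(counts):
    """Multinomial coefficient N! / prod_a (N P(a))! for integer counts."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size

def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def f(counts):
    """Hypothetical function of the type, given through its symbol counts."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)

alphabet = ("a", "b", "c")
N = 6  # assumed small block length so that |X|^N stays enumerable

# Left-hand side: sum over all |X|^N sequences.
lhs = 0.0
for seq in product(alphabet, repeat=N):
    cnt = Counter(seq)
    lhs += f(tuple(cnt[a] for a in alphabet))

# Right-hand side: sum over types, each weighted by its type-class size.
rhs = sum(type_class_size(c) * f(c) for c in compositions(N, len(alphabet)))
```

The left-hand side has 729 terms and the right-hand side only 28, which is the point of the method: the number of types grows polynomially in N while the number of sequences grows exponentially.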

Sanov's theorem
$$Q^N(x) = \prod_{a \in \mathcal{X}} Q(a)^{N P_x(a)} = \exp\Big\{N \sum_{a \in \mathcal{X}} P_x(a) \log Q(a)\Big\} = \exp\{-N[H(P_x) + D(P_x \| Q)]\}$$
$$E[\exp\{N g(P_{X_1 X_2 \dots X_N})\}] = \sum_{x \in \mathcal{X}^N} Q^N(x) \exp\{N g(P_x)\}$$
$$= \sum_{P \in \mathcal{P}_N(\mathcal{X})} |\mathcal{T}_P(N)| \exp\{-N[H(P) + D(P \| Q)]\} \exp\{N g(P)\}$$
$$\approx \sum_{P \in \mathcal{P}_N(\mathcal{X})} \exp\{-N(D(P \| Q) - g(P))\}$$
$$\approx \sup_{P \in \mathcal{P}(\mathcal{X})} \exp\{-N(D(P \| Q) - g(P))\} \quad \text{(Laplace method)}$$
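The chain of approximations can be checked on a binary alphabet. The sketch below assumes $Q(1) = q$ and a linear test function $g(P) = \theta P(1)$; this choice is made only because the expectation then has the closed form $(1 - q + q e^{\theta})^N$, so both the sum over types and the supremum can be compared against it. The values of `q`, `theta`, and `N` are illustrative.

```python
from math import exp, lgamma, log

def log_binom(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def kl(p, q):
    """Binary KL divergence D((1-p, p) || (1-q, q)) in nats, with 0 log 0 = 0."""
    total = 0.0
    for a, b in ((p, q), (1 - p, 1 - q)):
        if a > 0:
            total += a * log(a / b)
    return total

q, theta, N = 0.3, 1.5, 2000  # assumed illustrative values

def g(p):
    """Hypothetical linear test function of the type, g(P) = theta * P(1)."""
    return theta * p

# (1/N) log E[exp{N g(P_X)}] computed as a sum over the N + 1 binary types,
# using log-sum-exp for numerical stability.
log_terms = [
    log_binom(N, k) + k * log(q) + (N - k) * log(1 - q) + N * g(k / N)
    for k in range(N + 1)
]
m = max(log_terms)
rate_exact = (m + log(sum(exp(t - m) for t in log_terms))) / N

# Laplace approximation: sup_P { g(P) - D(P || Q) }, evaluated on a grid.
grid = 10000
rate_laplace = max(g(k / grid) - kl(k / grid, q) for k in range(grid + 1))

# Closed form available for this particular linear g.
closed_form = log(1 - q + q * exp(theta))
```

For a linear $g$ the exponent is exact at every $N$; for a nonlinear $g$ the two rates would agree only in the limit $N \to \infty$.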

The second order types
■ $\mathcal{X}$: a finite set
■ $P^{(2)}_x(a, b)$: the fraction of a pair of successive symbols $(a, b) \in \mathcal{X}^2$ in $x \in \mathcal{X}^N$
Example: For $\mathcal{X} = \{a, b, c\}$ and $x = [a, a, b, a, c, c, a, b]$,
$$P^{(2)}_x(a,a) = 1/7, \quad P^{(2)}_x(a,b) = 2/7, \quad P^{(2)}_x(a,c) = 1/7,$$
$$P^{(2)}_x(b,a) = 1/7, \quad P^{(2)}_x(b,b) = 0/7, \quad P^{(2)}_x(b,c) = 0/7,$$
$$P^{(2)}_x(c,a) = 1/7, \quad P^{(2)}_x(c,b) = 0/7, \quad P^{(2)}_x(c,c) = 1/7.$$
$$\mathcal{P}^{(2)}_N(\mathcal{X}) := \{P^{(2)}_x \mid x \in \mathcal{X}^N\}, \qquad |\mathcal{P}^{(2)}_N(\mathcal{X})| \sim d(|\mathcal{X}|)\, N^{|\mathcal{X}|^2 - |\mathcal{X}|}.$$
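The second order type of the example sequence can be reproduced by counting successive pairs. This is a minimal Python sketch; the function name `second_order_type` is an assumption for illustration.

```python
from collections import Counter
from fractions import Fraction

def second_order_type(x):
    """Return the fraction of each successive pair (x_i, x_{i+1}) in x.

    Pairs that never occur are simply absent from the returned dict.
    """
    pair_counts = Counter(zip(x, x[1:]))
    n_pairs = len(x) - 1
    return {pair: Fraction(c, n_pairs) for pair, c in pair_counts.items()}

# The example sequence from the slide: 8 symbols, hence 7 successive pairs.
x = ["a", "a", "b", "a", "c", "c", "a", "b"]
P2 = second_order_type(x)
```

Looking up `P2.get(("b", "b"), 0)` returns 0 for pairs that do not appear, which matches the 0/7 entries of the example.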

Number of sequences of a particular type
$$\mathcal{T}^{(2)}_{P_{X,Y}}(N) := \{x \in \mathcal{X}^N \mid P^{(2)}_x = P_{X,Y}\}$$
$$|\mathcal{T}^{(2)}_{P_{X,Y}}(N)| = C \prod_{x \in \mathcal{X}} \binom{N P_X(x)}{\{N P_{X,Y}(x, y)\}_{y \in \mathcal{X}}}$$
[Whittle 1955] [Billingsley 1961].
$$|\mathcal{T}^{(2)}_{P_{X,Y}}(N)| \approx \exp\{N H(X \mid Y)\}, \qquad P_X \approx P_Y$$

One-dimensional Ising model
$$p(x) := \frac{1}{Z(N)} \exp\Big\{-J \sum_{i=1}^{N-1} x_i x_{i+1} - h \sum_{i=1}^{N} x_i\Big\}$$
$$Z(N) := \sum_{x \in \{+1,-1\}^N} \exp\Big\{-J \sum_{i=1}^{N-1} x_i x_{i+1} - h \sum_{i=1}^{N} x_i\Big\}$$
[Figure: factor graph of the chain, with variable nodes $x_1, x_2, \dots, x_{N-1}, x_N$, single-variable factors $e^{-h x_i}$, and pairwise factors $e^{-J x_i x_{i+1}}$.]

The method of transfer matrix
$$Z_N(x_1, x_N) := \sum_{(x_2, \dots, x_{N-1}) \in \{+1,-1\}^{N-2}} \exp\Big\{-J \sum_{i=1}^{N-1} x_i x_{i+1} - h \sum_{i=1}^{N} x_i\Big\}.$$
$$\begin{pmatrix} Z_N(+1,+1) & Z_N(+1,-1) \\ Z_N(-1,+1) & Z_N(-1,-1) \end{pmatrix} = \begin{pmatrix} Z_{N-1}(+1,+1) & Z_{N-1}(+1,-1) \\ Z_{N-1}(-1,+1) & Z_{N-1}(-1,-1) \end{pmatrix} \begin{pmatrix} \exp\{-J-h\} & \exp\{+J+h\} \\ \exp\{+J-h\} & \exp\{-J+h\} \end{pmatrix}$$
$$= \begin{pmatrix} \exp\{-h\} & 0 \\ 0 & \exp\{+h\} \end{pmatrix} \begin{pmatrix} \exp\{-J-h\} & \exp\{+J+h\} \\ \exp\{+J-h\} & \exp\{-J+h\} \end{pmatrix}^{N-1}$$
$$Z(N) = \sum_{(x_1, x_N) \in \mathcal{X}^2} Z_N(x_1, x_N) \sim \lambda_{\max}^N.$$
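The transfer matrix recursion can be compared against direct enumeration of the partition function. The sketch below uses plain Python lists for the 2×2 matrices; the coupling values `J` and `h`, the helper names, and the block lengths are assumptions chosen for illustration.

```python
from itertools import product
from math import exp, log, sqrt

J, h = 0.3, 0.2  # assumed illustrative couplings
spins = (+1, -1)

def Z_brute(N):
    """Partition function by enumerating all 2^N spin configurations."""
    total = 0.0
    for x in product(spins, repeat=N):
        energy = J * sum(x[i] * x[i + 1] for i in range(N - 1)) + h * sum(x)
        total += exp(-energy)
    return total

# Transfer matrix T[s][t] = exp{-J s t - h t}, rows and columns ordered (+1, -1).
T = [[exp(-J * s * t - h * t) for t in spins] for s in spins]

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def Z_transfer(N):
    """Partition function as the sum of entries of diag(e^-h, e^+h) T^(N-1)."""
    M = [[exp(-h), 0.0], [0.0, exp(+h)]]
    for _ in range(N - 1):
        M = matmul(M, T)
    return sum(sum(row) for row in M)

# Largest eigenvalue of the 2x2 matrix T from its trace and determinant.
tr = T[0][0] + T[1][1]
det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
lam_max = (tr + sqrt(tr * tr - 4 * det)) / 2

free_energy_estimate = log(Z_transfer(500)) / 500
```

Enumeration costs $2^N$ operations while the recursion costs $O(N)$, and `free_energy_estimate` approaches `log(lam_max)` as the block length grows.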

The method of types for Markov chain
$$Z(N) = \sum_{x \in \{+1,-1\}^N} \exp\Big\{-J \sum_{i=1}^{N-1} x_i x_{i+1} - h \sum_{i=1}^{N} x_i\Big\}$$
$$= \sum_{P_{S,T} \in \mathcal{P}^{(2)}_N} \big|\mathcal{T}^{(2)}_{P_{S,T}}(N)\big| \cdot \exp\Big\{-J (N-1) \sum_{(s,t) \in \{+1,-1\}^2} P_{S,T}(s,t)\, s t - h N \sum_{t \in \{+1,-1\}} P_T(t)\, t\Big\}$$
$$\lim_{N \to \infty} \frac{1}{N} \log Z(N) = \sup_{P_{ST},\, P_S = P_T} \{H(S \mid T) - J \langle ST \rangle - h \langle T \rangle\}$$
$$= \sup_{P_{ST},\, P_S = P_T} \{H(S, T) - H(T) - J \langle ST \rangle - h \langle T \rangle\}$$
The maximization problem can be solved by the method of Lagrange multipliers.
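The variational formula can be checked numerically against the largest eigenvalue of the transfer matrix. The sketch below replaces the Lagrange-multiplier solution with a brute-force grid search, which is a swapped-in technique used only for verification. On a binary alphabet the constraint $P_S = P_T$ is equivalent to $P_{ST}(+1,-1) = P_{ST}(-1,+1)$, so two free parameters suffice. The values of `J`, `h`, and the grid resolution `n` are assumptions.

```python
from math import exp, log, sqrt

J, h = 0.3, 0.2  # assumed illustrative couplings

def xlogx(p):
    """p log p with the convention 0 log 0 = 0."""
    return p * log(p) if p > 0 else 0.0

def objective(p_pp, p_pm, p_mm):
    """H(S,T) - H(T) - J<ST> - h<T> when P(+,-) = P(-,+) = p_pm."""
    joint_entropy = -(xlogx(p_pp) + 2 * xlogx(p_pm) + xlogx(p_mm))
    p_plus = p_pp + p_pm  # common marginal P_S(+1) = P_T(+1)
    marginal_entropy = -(xlogx(p_plus) + xlogx(1 - p_plus))
    mean_st = p_pp + p_mm - 2 * p_pm
    mean_t = 2 * p_plus - 1
    return joint_entropy - marginal_entropy - J * mean_st - h * mean_t

# Grid search over joint laws with equal marginals:
# P(+,+) = i/n, P(+,-) = P(-,+) = j/(2n), P(-,-) = (n - i - j)/n.
n = 300
sup_value = max(
    objective(i / n, j / (2 * n), (n - i - j) / n)
    for i in range(n + 1)
    for j in range(n + 1 - i)
)

# Largest eigenvalue of the transfer matrix [[a, b], [c, d]].
a, b = exp(-J - h), exp(+J + h)
c, d = exp(+J - h), exp(-J + h)
log_lam_max = log(((a + d) + sqrt((a - d) ** 2 + 4 * b * c)) / 2)
```

The grid maximum never exceeds `log_lam_max` and approaches it as the grid is refined.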

Free energy of 1d Ising model 1/2
Lemma 1.
$$\lim_{N \to \infty} \frac{1}{N} \log Z_N = \operatorname*{supextr}_{m_{LR \to v}} \{\log Z_w - \log Z_v\},$$
where supextr stands for the supremum among saddle points.
$$Z_w = \sum_{(s,t) \in \{+1,-1\}^2} m_{LR \to v}(t)\, m_{LR \to v}(s) \exp\{-Jst - hs - ht\}$$
$$Z_v = \sum_{t \in \{+1,-1\}} m_{LR \to v}(t)^2 \exp\{-ht\}.$$
The saddle point equation is
$$m_{LR \to v}(t) = \frac{1}{Z_{LR \to v}} \sum_{s \in \{+1,-1\}} m_{LR \to v}(s) \exp\{-Jst - hs\}.$$
This is the equation of belief propagation on the 1d Ising model of infinite size!
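The saddle point equation can be iterated as a belief propagation update until the message stops changing. The sketch below assumes that messages are normalised to sum to one, so that the normalisation constant plays the role of $Z_{LR \to v}$. The couplings, the uniform initial message, and the iteration count are illustrative assumptions.

```python
from math import exp, log, sqrt

J, h = 0.3, 0.2  # assumed illustrative couplings
spins = (+1, -1)

def bp_update(m):
    """One BP step: m(t) <- (1/Z) sum_s m(s) exp{-J s t - h s}.

    Returns the normalised message and the normalisation constant Z.
    """
    unnormalised = {
        t: sum(m[s] * exp(-J * s * t - h * s) for s in spins) for t in spins
    }
    Z_msg = sum(unnormalised.values())
    return {t: v / Z_msg for t, v in unnormalised.items()}, Z_msg

# Start from a uniform message (assumed initialisation) and iterate.
m = {+1: 0.5, -1: 0.5}
Z_msg = 1.0
for _ in range(200):
    m, Z_msg = bp_update(m)

# Evaluate log Z_w - log Z_v at the fixed point.
Z_w = sum(m[t] * m[s] * exp(-J * s * t - h * s - h * t)
          for s in spins for t in spins)
Z_v = sum(m[t] ** 2 * exp(-h * t) for t in spins)
bethe_value = log(Z_w) - log(Z_v)

# Largest eigenvalue of the transfer matrix for comparison.
a, b = exp(-J - h), exp(+J + h)
c, d = exp(+J - h), exp(-J + h)
log_lam_max = log(((a + d) + sqrt((a - d) ** 2 + 4 * b * c)) / 2)
```

The update is a power iteration, so `Z_msg` converges to the largest eigenvalue, and at the fixed point `bethe_value` coincides with `log_lam_max`.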

Free energy of 1d Ising model 2/2
$$\lim_{N \to \infty} \frac{1}{N} \log Z_N = \log Z_{LR \to v}$$
where
$$m_{LR \to v}(t) = \frac{1}{Z_{LR \to v}} \sum_{s \in \{+1,-1\}} m_{LR \to v}(s) \exp\{-Jst - hs\}.$$
Here, $Z_{LR \to v}$ and $m_{LR \to v}$ are an eigenvalue and an eigenvector of
$$\begin{pmatrix} \exp\{-J-h\} & \exp\{+J+h\} \\ \exp\{+J-h\} & \exp\{-J+h\} \end{pmatrix},$$
which is the transfer matrix. Hence,
$$\lim_{N \to \infty} \frac{1}{N} \log Z_N = \log \lambda_{\max}.$$
The method of types is useful for more complicated problems.
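
LDPC codes on memoryless channel
$$p(x \mid y) := \frac{1}{Z} \prod_a f(x_{\partial a}) \prod_{i=1}^{N} W(y_i \mid x_i)$$
$$Z := \sum_{x \in \mathcal{X}^N} \prod_a f(x_{\partial a}) \prod_{i=1}^{N} W(y_i \mid x_i).$$
$$f(x) := \mathbb{I}\Big\{\sum_j x_j = 0\Big\}$$
$$p(y) := \frac{1}{Z_0} \sum_{x \in \mathcal{X}^N} \prod_a f(x_{\partial a}) \prod_{i=1}^{N} W(y_i \mid x_i)$$
$$Z_0 := \sum_{x \in \mathcal{X}^N} \prod_a f(x_{\partial a}).$$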

Conditional entropy and free energy
$$p(x \mid y) := \frac{1}{Z} \prod_a f(x_{\partial a}) \prod_{i=1}^{N} W(y_i \mid x_i)$$
$$Z := \sum_{x \in \mathcal{X}^N} \prod_a f(x_{\partial a}) \prod_{i=1}^{N} W(y_i \mid x_i).$$
$$E[H(X \mid Y)] = E[\log Z] - E[\log W(Y \mid X)]$$
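
Disordered system and replica method
$$\lim_{N \to \infty} \frac{1}{N} E[\log Z] = \lim_{N \to \infty} \frac{1}{N} \left. \frac{\partial \log E[Z^n]}{\partial n} \right|_{n=0}$$
$$= \lim_{N \to \infty} \frac{1}{N} \lim_{n \to 0} \frac{1}{n} \log E[Z^n] \overset{?}{=} \lim_{n \to 0} \frac{1}{n} \lim_{N \to \infty} \frac{1}{N} \log E[Z^n]$$
For non-negative integer $n$,
$$Z^n = \Big(\sum_{x \in \mathcal{X}^N} \prod_a f(x_{\partial a})\Big)^n = \sum_{\boldsymbol{x} \in (\mathcal{X}^n)^N} \prod_a \prod_{i=1}^{n} f(x^{(i)}_{\partial a}).$$
$Z^n$ can be regarded as a partition function of a new model in which
$$\mathcal{X} \longrightarrow \mathcal{X}^n, \qquad f(x) \longrightarrow \prod_{i=1}^{n} f(x^{(i)}).$$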
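The identity for integer $n$ can be confirmed by brute force on a toy factor graph. The sketch below assumes a hypothetical graph with four binary variables and two parity checks; neither the graph nor the helper names come from the slides.

```python
from itertools import product

# Hypothetical toy factor graph: 4 binary variables, 2 parity checks,
# each check listed by the indices of its neighbouring variables.
N = 4
checks = [(0, 1, 2), (1, 2, 3)]

def f(bits):
    """Parity-check factor: 1 if the bits sum to zero modulo 2, else 0."""
    return 1 if sum(bits) % 2 == 0 else 0

def Z():
    """Partition function of the original model over X^N with X = {0, 1}."""
    total = 0
    for x in product((0, 1), repeat=N):
        weight = 1
        for a in checks:
            weight *= f([x[j] for j in a])
        total += weight
    return total

def Z_replicated(n):
    """Partition function of the replicated model over (X^n)^N.

    Each variable node carries an n-tuple, and each factor is the product
    of the original factor applied to every replica separately.
    """
    replica_alphabet = list(product((0, 1), repeat=n))
    total = 0
    for xs in product(replica_alphabet, repeat=N):
        weight = 1
        for a in checks:
            for i in range(n):
                weight *= f([xs[j][i] for j in a])
        total += weight
    return total
```

Because the replicated factor is a plain product over replicas, the replicated sum factorises into $n$ independent copies of the original sum, which is why `Z_replicated(n)` equals `Z() ** n`.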
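
Types on factor graphs [Vontobel 2010]
■ $\nu(x)$, $x \in \mathcal{X}$: a type of variable nodes
■ $\mu(\boldsymbol{x})$, $\boldsymbol{x} \in \mathcal{X}^r$: a type of factor nodes
Example with $\mathcal{X} = \{0, 1\}$:
[Figure: factor graph with eight variable nodes labeled 0, 0, 1, 1, 0, 0, 0, 1 and four factor nodes with local assignments (0010), (0001), (0100), (0111).]
$$\mu(0010) = 1/4, \quad \mu(0001) = 1/4, \quad \mu(0100) = 1/4, \quad \mu(0111) = 1/4, \quad \text{otherwise } \mu(\boldsymbol{x}) = 0.$$
$$\nu(0) = 5/8, \quad \nu(1) = 3/8.$$
There is a constraint between $\nu(x)$ and $\mu(\boldsymbol{x})$. More precisely, $\nu(x)$ is uniquely determined from $\mu(\boldsymbol{x})$.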
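The statement that the variable-node type is determined by the factor-node type can be made concrete. The sketch below assumes a regular graph, so that each edge is counted once from either side and $\nu(x) = \frac{1}{r} \sum_{k=1}^{r} \sum_{\boldsymbol{x}: x_k = x} \mu(\boldsymbol{x})$. The function name and the string encoding of local assignments are illustrative assumptions.

```python
from fractions import Fraction

def variable_type_from_factor_type(mu):
    """Compute nu(x) = (1/r) * sum_k sum_{u : u_k = x} mu(u).

    `mu` maps each local assignment (a string of r symbols) to its fraction.
    """
    nu = {}
    for assignment, weight in mu.items():
        r = len(assignment)
        for symbol in assignment:
            nu[symbol] = nu.get(symbol, Fraction(0)) + weight / r
    return nu

# The factor-node type from the example on the slide.
mu = {
    "0010": Fraction(1, 4),
    "0001": Fraction(1, 4),
    "0100": Fraction(1, 4),
    "0111": Fraction(1, 4),
}
nu = variable_type_from_factor_type(mu)
```

The computation is an average of the $r$ single-coordinate marginals of $\mu$, which reproduces the values $\nu(0) = 5/8$ and $\nu(1) = 3/8$ of the example.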
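
Contribution of particular types to a partition function
$$Z = \sum_{\boldsymbol{x} \in \mathcal{X}^N} \prod_a f(\boldsymbol{x}_{\partial a}) = \sum_{\nu, \mu} N(\nu, \mu) \prod_{\boldsymbol{x} \in \mathcal{X}^r} f(\boldsymbol{x})^{\frac{\ell}{r} N \mu(\boldsymbol{x})} =: \sum_{\nu, \mu} Z(\nu, \mu).$$
$$E[N(\nu, \mu)] = \binom{N}{\{N \nu(x)\}_{x \in \mathcal{X}}} \binom{\frac{\ell}{r} N}{\{\frac{\ell}{r} N \mu(\boldsymbol{x})\}_{\boldsymbol{x} \in \mathcal{X}^r}} \frac{\prod_{x \in \mathcal{X}} (N \nu(x) \ell)!}{(N \ell)!}.$$
$$\lim_{N \to \infty} \frac{1}{N} \log E[Z(\nu, \mu)] = \frac{\ell}{r} H(\mu) - (\ell - 1) H(\nu) + \frac{\ell}{r} \sum_{\boldsymbol{x} \in \mathcal{X}^r} \mu(\boldsymbol{x}) \log f(\boldsymbol{x}).$$
Minus Bethe free energy of mini (averaged) model [Mori 2011].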

Free energy of LDPC codes on memoryless channel
$$\lim_{N \to \infty} \frac{1}{N} \log E[Z^n] = \sup_{P_X,\, P_{U_1, \dots, U_r}} \bigg\{ \frac{\ell}{r} H(U_1, \dots, U_r) - (\ell - 1) H(X) + \frac{\ell}{r} \Big\langle \log \prod_{k=0}^{n} f(U^{(k)}) \Big\rangle + \Big\langle \log \sum_{y \in \mathcal{Y}} \prod_{k=0}^{n} W(y \mid X^{(k)}) \Big\rangle \bigg\} - R$$
Here, $X$ and $U_1, \dots, U_r$ are random variables on $\mathcal{X}^{n+1}$ satisfying
■ $X$ and $U_K$ have the same distribution
where $K$ denotes the uniform random variable on the set $\{1, \dots, r\}$.
The saddle point equation for the replica symmetric solution is equivalent to the density evolution of the belief propagation [Mori 2011].

LDPC codes on general Markov channel
■ $\mathcal{S}$: a set of states
■ $V(t \mid y, x, s)$: a transition probability for $x \in \mathcal{X}$, $y \in \mathcal{Y}$ and $s, t \in \mathcal{S}$
$$p(x \mid y) := \frac{1}{Z} \sum_{s \in \mathcal{S}^N} \prod_a f(x_{\partial a}) \prod_{i=1}^{N} W(y_i \mid x_i, s_i)\; V_0(s_1) \prod_{i=1}^{N-1} V(s_{i+1} \mid y_i, x_i, s_i)$$
$$Z := \sum_{x \in \mathcal{X}^N} \sum_{s \in \mathcal{S}^N} \prod_a f(x_{\partial a}) \prod_{i=1}^{N} W(y_i \mid x_i, s_i) \cdot V_0(s_1) \prod_{i=1}^{N-1} V(s_{i+1} \mid y_i, x_i, s_i).$$

Free energy of LDPC codes on general Markov channel
Main result of this work
$$\lim_{N \to \infty} \frac{1}{N} \log E[Z^n] = \sup \bigg\{ H(X_1, S_1 \mid X_2, S_2) - \ell H(X_1, S_1) + \frac{\ell}{r} H(U_1, \dots, U_r, T_1, \dots, T_r) + \frac{\ell}{r} \Big\langle \log \prod_{k=0}^{n} f(U^{(k)}) \Big\rangle$$
$$\qquad + \Big\langle \log \sum_{y \in \mathcal{Y}} \prod_{k=0}^{n} W(y \mid X_2^{(k)}, S_2^{(k)})\, V(S_1^{(k)} \mid y, X_2^{(k)}, S_2^{(k)}) \Big\rangle \bigg\} - R.$$
■ $(X_1, S_1)$ and $(X_2, S_2)$ have the same distribution
■ $(X_1, S_1)$ and $(U_K, T_K)$ have the same distribution
where $K$ denotes the uniform random variable on the set $\{1, \dots, r\}$.
The saddle point equation is equivalent to the density evolution of the belief propagation (joint iterative decoder).

The dicode erasure channel
DEC($\epsilon$) is defined for $\mathcal{X} = \mathcal{S} = \{0, 1\}$, $\mathcal{Y} = \{-1, 0, +1, *\}$ as
$$W(y \mid x, s) = \begin{cases} 1 - \epsilon, & y = x - s \\ \epsilon, & y = * \end{cases}$$
$$V(s' \mid y, x, s) = 1, \quad \text{for } s' = x.$$
The density evolution can be described by one parameter [Pfister and Siegel 2008].
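The channel definition translates directly into a short simulator. The sketch below is illustrative only: the function name, the string `"*"` for the erasure symbol, and the initial state `s0 = 0` are assumptions, since the slide does not specify the initial state distribution $V_0$ for this channel.

```python
import random

ERASURE = "*"

def dicode_erasure_channel(x, eps, rng, s0=0):
    """Pass a bit sequence x through DEC(eps).

    The state is the previous input bit (V(s' | y, x, s) = 1 for s' = x).
    Each output is erased with probability eps and otherwise equals x - s.
    The initial state s0 is an assumed value.
    """
    y = []
    s = s0
    for bit in x:
        if rng.random() < eps:
            y.append(ERASURE)
        else:
            y.append(bit - s)
        s = bit  # deterministic state transition
    return y

x = [1, 1, 0, 1]
noiseless = dicode_erasure_channel(x, 0.0, random.Random(0))
all_erased = dicode_erasure_channel(x, 1.0, random.Random(0))
```

With `eps = 0.0` the output is the difference sequence of the input, and with `eps = 1.0` every symbol is erased, which gives two quick sanity checks on the two branches of $W$.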