Local Optimality Certificates for LP Decoding of Tanner Codes
Nissim Halabi    Guy Even
11th Haifa Workshop on Interdisciplinary Applications of Graph Theory, Combinatorics and Algorithms
May 2011
Error Correcting Codes – Worst Case vs. Average Case Analysis
An [N, K] linear code C – a K-dimensional subspace of the vector space {0,1}^N.
Worst case analysis – assumes an adversarial channel. E.g., how many bit flips, in any pattern, can decoding recover from? Up to d_min/2, hence Pr(fail : worst case) ~ p^(d_min/2).
Average case analysis – assumes a probabilistic channel. E.g., given that every bit is flipped independently with probability p, what is the probability that decoding succeeds? Possibly, Pr(fail : avg. case) << p^(d_min/2).
Error Correcting Codes for Memoryless Binary-Input Output-Symmetric Channels (1)
Encoding -> codeword c ∈ C ⊆ {0,1}^N -> Noisy Channel -> noisy codeword y -> Channel Decoding -> ĉ ∈ {0,1}^N.
A Memoryless Binary-Input Output-Symmetric (MBIOS) channel is characterized by a conditional probability function P(y | c):
Errors occur randomly and are independent from bit to bit (memoryless).
Transmitted symbols are binary (binary input).
Errors affect '0's and '1's with equal probability (output symmetric).
Example: Binary Symmetric Channel (BSC) with crossover probability p: y_i = c_i with probability 1-p, and y_i = 1-c_i with probability p.
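As a concrete illustration of the BSC (not from the slides), here is a minimal sketch that flips each transmitted bit independently with probability p; the function and variable names are illustrative.

```python
import numpy as np

def bsc(codeword, p, seed=0):
    """Flip each bit of the codeword independently with probability p."""
    rng = np.random.default_rng(seed)
    flips = rng.random(len(codeword)) < p
    return (np.asarray(codeword) ^ flips).astype(int)

c = np.zeros(7, dtype=int)   # transmitted codeword (here the all-zeros word)
y = bsc(c, p=0.1)            # noisy received word
print(y)
```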
Error Correcting Codes for Memoryless Binary-Input Output-Symmetric Channels (2)
Encoding -> codeword c ∈ C ⊆ {0,1}^N -> Noisy Channel -> noisy codeword y -> Channel Decoding -> ĉ ∈ {0,1}^N.
Log-Likelihood Ratio (LLR) λ_i for a received observation y_i:
λ_i = ln( P_{Y_i|X_i}(y_i | 0) / P_{Y_i|X_i}(y_i | 1) )
λ_i > 0  ⇒  y_i is more likely to be '0'
λ_i < 0  ⇒  y_i is more likely to be '1'
Replace the received word y by the LLR vector λ.
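For the BSC with crossover probability p and outputs y_i ∈ {0,1}, the LLR takes only two values, ±ln((1-p)/p). A small sketch (illustrative names, not from the slides):

```python
import numpy as np

def bsc_llr(y, p):
    """LLR of each received bit for a BSC: +ln((1-p)/p) if y_i = 0,
    -ln((1-p)/p) if y_i = 1."""
    return (1 - 2 * np.asarray(y)) * np.log((1 - p) / p)

print(bsc_llr([0, 1, 0], p=0.1))   # [ 2.197..., -2.197...,  2.197... ]
```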
Maximum-Likelihood (ML) Decoding
Maximum-likelihood (ML) decoding for any memoryless binary-input channel:
x^ML(λ) = argmin_{x ∈ C} ⟨λ, x⟩
Maximum-likelihood (ML) decoding formulated as a linear program:
x^ML(λ) = argmin_{x ∈ C} ⟨λ, x⟩ = argmin_{x ∈ conv(C)} ⟨λ, x⟩
However, conv(C) has no efficient representation.
(Figure: the codewords C ⊆ {0,1}^N are the vertices of conv(C) ⊆ [0,1]^N.)
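Since ML decoding minimizes ⟨λ, x⟩ over the (exponentially large) codebook, a brute-force sketch over an explicit list of codewords illustrates the objective; this is feasible only for tiny codes, and the names are illustrative (not from the slides).

```python
import numpy as np

def ml_decode(codewords, llr):
    """Brute-force ML decoding: return the codeword minimizing <llr, x>."""
    return min(codewords, key=lambda x: np.dot(llr, x))

# tiny [3,1] repetition code
C = [np.array([0, 0, 0]), np.array([1, 1, 1])]
llr = np.array([-2.2, 2.2, 2.2])   # bit 0 looks like '1', bits 1-2 look like '0'
print(ml_decode(C, llr))            # [0 0 0]
```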
Linear Programming (LP) Decoding
Linear Programming (LP) decoding [Fel03, FWK05] – relaxation of the polytope conv(C):
x^LP(λ) = argmin_{x ∈ P} ⟨λ, x⟩
The relaxed polytope P satisfies:
(1) All codewords x ∈ C are vertices of P.
(2) All new vertices are fractional (therefore, the new vertices are not in C).
(3) P has an efficient representation.
Solve the LP over P:
x^LP(λ) integral  ⇒  x^LP(λ) = x^ML(λ): the LP decoder finds the ML codeword.
x^LP(λ) fractional  ⇒  the LP decoder fails.
(Figure: conv(C) ⊆ P ⊆ [0,1]^N; the fractional vertices of P lie outside conv(C).)
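A minimal sketch of LP decoding, not from the talk, using Feldman's parity-check relaxation (each check contributes the "forbidden set" inequalities of its convex hull); this is practical only for checks of small degree. The names H, llr, lp_decode and the [7,4] Hamming example are illustrative assumptions.

```python
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

def lp_decode(H, llr):
    """Return the LP-decoding optimum x in [0,1]^N, or None on solver failure."""
    m, n = H.shape
    A_ub, b_ub = [], []
    for j in range(m):
        nbrs = np.flatnonzero(H[j])            # variables in check j
        for k in range(1, len(nbrs) + 1, 2):   # odd-sized subsets S
            for S in combinations(nbrs, k):
                row = np.zeros(n)
                row[list(S)] = 1.0
                row[[i for i in nbrs if i not in S]] = -1.0
                A_ub.append(row)               # sum_{S} x_i - sum_{N(j)\S} x_i <= |S|-1
                b_ub.append(len(S) - 1)
    res = linprog(c=llr, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0.0, 1.0)] * n, method="highs")
    return res.x if res.success else None

# [7,4] Hamming code, all-zeros codeword sent over a BSC with p = 0.05
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
p = 0.05
y = np.zeros(7); y[2] = 1                      # one bit flipped by the channel
llr = (1 - 2 * y) * np.log((1 - p) / p)        # BSC log-likelihood ratios
x_lp = lp_decode(H, llr)
print(x_lp)   # expected: the all-zeros vector, i.e., an integral LP optimum
```

If the printed vector is integral, it is the ML codeword; a fractional optimum would correspond to the "LP decoder fails" case above.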
Tanner Codes [Tan81]
Factor graph representation of Tanner codes: a Tanner graph G = (I ∪ J, E) with variable nodes I and local-code nodes J.
Every local-code node C_j is associated with a linear code of length deg_G(C_j).
Tanner code C(G) and codewords x:
x ∈ C(G)  ⇔  for every local-code node C_j, the projection of x onto the neighborhood of C_j is a codeword of the local code C_j.
d* = min_j d_min(local code C_j) – the minimum local distance.
Extended local-code C_j ⊆ {0,1}^N: extend the local code to the bits outside the local-code node (those bits are unconstrained).
(Figure: Tanner graph with variable nodes x_1, ..., x_10 and local-code nodes C_1, ..., C_8.)
Example: Expander codes [SS'96] – the Tanner graph is an expander; simple bit-flipping decoding algorithm.
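A minimal sketch (illustrative data structures, not from the slides) of checking membership in a Tanner code: a word belongs to C(G) iff its projection onto the neighborhood of every local-code node is a codeword of that local code.

```python
import numpy as np

def in_tanner_code(x, local_codes):
    """local_codes: list of (neighbor_indices, set_of_local_codewords)."""
    x = np.asarray(x)
    return all(tuple(int(b) for b in x[list(nbrs)]) in codewords
               for nbrs, codewords in local_codes)

# toy example: 4 bits, two local codes, each a length-2 repetition code
local_codes = [((0, 1), {(0, 0), (1, 1)}),
               ((2, 3), {(0, 0), (1, 1)})]
print(in_tanner_code([1, 1, 0, 0], local_codes))   # True
print(in_tanner_code([1, 0, 0, 0], local_codes))   # False
```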
LP Decoding of Tanner Codes
Maximum-likelihood (ML) decoding:
x^ML(λ) = argmin_{x ∈ conv(C)} ⟨λ, x⟩, where C = ∩_j (extended local-code C_j).
Linear Programming (LP) decoding [following Fel03, FWK05]:
x^LP(λ) = argmin_{x ∈ P} ⟨λ, x⟩, where P = ∩_j conv(extended local-code C_j).
(Figure: P as the intersection of conv(extended local-code C_1), conv(extended local-code C_2), ...)
Criteria of Interest
Let λ ∈ ℝ^N denote an LLR vector received from the channel. Let x ∈ C(G) denote a codeword. Consider the following questions:
Is x = x^ML(λ), the unique ML solution?  NP-hard.
Is x = x^LP(λ), the unique LP solution?
Is there an efficient test, given x and λ, that certifies optimality with one-sided error: "definitely yes" / "maybe no"?
E.g., an efficient test via local computations – a "Local Optimality" criterion.
Combinatorial Characterization of Local Optimality (1)
Let x ∈ C(G) ⊆ {0,1}^N and f ∈ [0,1]^N.
[Fel03] Define the relative point x ⊕ f by (x ⊕ f)_i = |x_i − f_i|.
Consider a finite set B ⊆ [0,1]^N.
Definition: A codeword x ∈ C is locally optimal for λ ∈ ℝ^N if for all vectors β ∈ B:
⟨λ, x ⊕ β⟩ > ⟨λ, x⟩.
Goal: find a set B such that:
(1) x ∈ LO(λ)  ⇒  x = x^ML(λ) and ML(λ) is unique.
(2) x ∈ LO(λ)  ⇒  x = x^LP(λ) and LP(λ) is unique.
(3) Pr[ ∀β ∈ B: ⟨λ, β⟩ > 0 | c = 0^N ] = 1 − o(1)  (all-zeros assumption; LP decoding fails only if ⟨λ, β⟩ ≤ 0 for some β ∈ B).
(Figure: nested regions ML(λ), LP(λ) integral, LO(λ).)
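A direct sketch of the local-optimality test implied by this definition, with B given as an explicit list of vectors in [0,1]^N; the names are illustrative and the example set B is hypothetical, not from the slides.

```python
import numpy as np

def relative_point(x, f):
    """[Fel03] relative point: (x (+) f)_i = |x_i - f_i|."""
    return np.abs(np.asarray(x, dtype=float) - np.asarray(f, dtype=float))

def is_locally_optimal(x, llr, B):
    """x is locally optimal for llr iff <llr, x (+) beta> > <llr, x> for all beta in B."""
    base = np.dot(llr, x)
    return all(np.dot(llr, relative_point(x, beta)) > base for beta in B)

# toy check: x = all-zeros, B = two fractional vectors
x = np.zeros(3)
llr = np.array([1.0, 2.0, -0.5])
B = [np.array([0.5, 0.0, 0.5]), np.array([0.0, 0.5, 0.5])]
print(is_locally_optimal(x, llr, B))   # True: both relative points cost more than x
```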
Combinatorial Characterization of Local Optimality (2)
Goal: find a set B such that:
(1) x ∈ LO(λ)  ⇒  x = x^ML(λ) and ML(λ) is unique.
(2) x ∈ LO(λ)  ⇒  x = x^LP(λ) and LP(λ) is unique.
(3) Pr[ ∀β ∈ B: ⟨λ, β⟩ > 0 | c = 0^N ] = 1 − o(1).
Suppose we have properties (1) and (2). A large support(β) implies property (3) (e.g., via Chernoff-like bounds).
If B = C, then: x ∈ LO(λ)  ⇒  x = x^ML(λ) and ML(λ) is unique. However, such β have a GLOBAL structure – how do we analyze property (3)?
Combinatorial Characterization of Local Optimality (2)
Goal: find a set B such that:
(1) x ∈ LO(λ)  ⇒  x = x^ML(λ) and ML(λ) is unique.
(2) x ∈ LO(λ)  ⇒  x = x^LP(λ) and LP(λ) is unique.
(3) Pr[ ∀β ∈ B: ⟨λ, β⟩ > 0 | c = 0^N ] = 1 − o(1).
Suppose we have properties (1) and (2). A large support(β) implies property (3) (e.g., via Chernoff-like bounds).
For analysis purposes, consider structures with a local nature: B is a set of TREES [following KV'06].
Strengthen the analysis by introducing layer weights [following ADS'09] – better bounds on Pr[ ∀β ∈ B: ⟨λ, β⟩ > 0 | c = 0^N ].
Finally, height(subtrees of G) < ½·girth(G) = O(log N). Take path-prefix trees instead – their height is not bounded by the girth!
Path-Prefix Tree
Consider a graph G = (V, E) and a node r ∈ V:
V̂ – the set of all backtrackless paths in G emanating from node r with length at most h.
T_r^h(G) = (V̂, Ê) – the path-prefix tree of G rooted at node r with height h.
(Figure: a graph G and its path-prefix tree T_r^4(G) rooted at r = v_1; the same vertex of G may appear as several tree nodes.)
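A sketch (illustrative representation, not from the slides) that builds the path-prefix tree by enumerating all backtrackless paths of length at most h from r; each tree node is identified with the path leading to it, and a tree edge extends a path by one non-backtracking step.

```python
def path_prefix_tree(adj, r, h):
    """adj: dict mapping each node of G to its list of neighbors.
    Returns the tree as a dict: path (tuple of G-nodes) -> list of child paths."""
    tree = {(r,): []}
    frontier = [(r,)]
    for _ in range(h):
        next_frontier = []
        for path in frontier:
            last = path[-1]
            prev = path[-2] if len(path) > 1 else None
            for u in adj[last]:
                if u != prev:                  # backtrackless: never step back
                    child = path + (u,)
                    tree[path].append(child)
                    tree[child] = []
                    next_frontier.append(child)
        frontier = next_frontier
    return tree

# toy graph: a 4-cycle v1-v2-v3-v4
adj = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
T = path_prefix_tree(adj, r=1, h=4)
print(len(T))   # 9 tree nodes: paths of length <= 4 from v1
```

Note that at height 4 the root vertex v_1 reappears as a tree node, illustrating why path-prefix trees are not bounded by the girth of G.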
d-Tree
Consider a path-prefix tree T_r^h(G) of a Tanner graph G = (I ∪ J, E).
A d-tree T = T[r, h, d] is a subgraph of T_r^h(G) such that:
root(T) = r,
∀v ∈ T ∩ I: deg_T(v) = deg_G(v),
∀c ∈ T ∩ J: deg_T(c) = d.
A 2-tree is a skinny tree / minimal deviation. A d-tree is not necessarily a valid configuration!
(Figure: examples of a 2-tree, a 3-tree, and a 4-tree, each rooted at v_0.)
Cost of a Projected Weighted Subtree
Consider layer weights ω: {1, ..., h} → ℝ, and a subtree T_r̂ of a path-prefix tree T_r^{2h}(G).
Define a weight function ω_{T_r̂}: V̂ → ℝ for the subtree induced by ω.
π_G(ω_{T_r̂}) ∈ ℝ^N – the projection of the weighted subtree to the Tanner graph G.
(Figure: a weighted subtree T_r̂ with layer weights ω(1), ω(2), and its projection π_G(ω_{T_r̂}) obtained by accumulating the weights of the copies of each variable node of G.)
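The projection to G accumulates, for each variable node v of G, the weights of all copies of v in the subtree. A minimal sketch under the assumption that the weighted subtree is given as a dict from tree nodes (paths) to their already-computed weights; the representation and names are illustrative, not from the slides.

```python
from collections import defaultdict

def project_to_graph(subtree_weights):
    """subtree_weights: dict mapping a path (tuple of G-nodes) to its weight.
    Returns pi_G(omega): the total weight landing on each node of G."""
    projected = defaultdict(float)
    for path, w in subtree_weights.items():
        projected[path[-1]] += w   # a tree node is a copy of its endpoint in G
    return dict(projected)

# two copies of node 2 in the subtree contribute to the same entry of pi_G
weights = {(1,): 1.0, (1, 2): 0.5, (1, 4, 3, 2): 0.5}
print(project_to_graph(weights))   # {1: 1.0, 2: 1.0}
```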