Algebraic Structure in Network Information Theory
Michael Gastpar (EPFL / Berkeley)
European Information Theory School, Antalya, Turkey, April 2012
Slides jointly with Bobak Nazer (Boston Univ.); download slides from linx.epfl.ch under "Teaching".
Motivation
(Figures: channel models of increasing complexity: a point-to-point channel $p(y|x)$, a two-user multiple-access channel $p(y|x_1, x_2)$, and a three-user network $p(y_1, y_2, y_3 | x_1, x_2, x_3)$.)
Outline
I. Discrete Alphabets
II. AWGN Channels
III. Network Applications
Point-to-Point Channels

$w \to \mathcal{E} \to \mathbf{x} \to p(y|x) \to \mathbf{y} \to \mathcal{D} \to \hat{w}$

The Usual Suspects:
• Message $w \in \{0,1\}^k$ and estimate $\hat{w} \in \{0,1\}^k$
• Encoder $\mathcal{E}: \{0,1\}^k \to \mathcal{X}^n$ and decoder $\mathcal{D}: \mathcal{Y}^n \to \{0,1\}^k$
• Input $\mathbf{x} \in \mathcal{X}^n$ and output $\mathbf{y} \in \mathcal{Y}^n$
• Memoryless channel: $p(\mathbf{y}|\mathbf{x}) = \prod_{i=1}^{n} p(y_i | x_i)$
• Rate $R = \frac{k}{n}$
• (Average) probability of error: $\mathbb{P}\{\hat{w} \neq w\} \to 0$ as $n \to \infty$. Assume $w$ is uniform over $\{0,1\}^k$.
i.i.d. Random Codes
• Generate $2^{nR}$ codewords $\mathbf{x} = [X_1\ X_2\ \cdots\ X_n]$ independently and elementwise i.i.d. according to some distribution $p_X$:
  $p(\mathbf{x}) = \prod_{i=1}^{n} p_X(x_i)$
  (Figure: codewords scattered at random over the grid $\{0, 1, \ldots, q-1\}^n$.)
• Bound the average error probability for a random codebook.
• If the average performance over codebooks is good, there must exist at least one good fixed codebook.
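A minimal sketch of this codebook draw, assuming numpy; the function name `iid_codebook` and its parameters are ours, not from the slides:

```python
import numpy as np

def iid_codebook(n, R, p_X, rng):
    """Draw 2^(nR) length-n codewords, each entry i.i.d. from p_X over {0,...,q-1}."""
    q = len(p_X)
    num_codewords = int(round(2 ** (n * R)))
    return rng.choice(q, size=(num_codewords, n), p=p_X)

rng = np.random.default_rng(0)
# Example: 2^(8 * 0.5) = 16 binary codewords of length 8 with Bern(1/2) entries.
codebook = iid_codebook(n=8, R=0.5, p_X=[0.5, 0.5], rng=rng)
print(codebook.shape)  # (16, 8)
```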
(Weak) Joint Typicality
• Two sequences $\mathbf{x}$ and $\mathbf{y}$ are (weakly) jointly typical if
  $\left| -\frac{1}{n} \log p(\mathbf{x}) - H(X) \right| < \epsilon$,
  $\left| -\frac{1}{n} \log p(\mathbf{y}) - H(Y) \right| < \epsilon$, and
  $\left| -\frac{1}{n} \log p(\mathbf{x}, \mathbf{y}) - H(X,Y) \right| < \epsilon$.
• For our considerations, weak typicality is convenient as it can also be stated in terms of differential entropies.
• If the pairs $(x_i, y_i)$ are drawn i.i.d. according to the joint distribution $p(x, y)$, the probability that $\mathbf{x}$ and $\mathbf{y}$ are jointly typical goes to 1 as $n$ goes to infinity.
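As a sanity check, here is a small sketch (our own; the helper name `is_weakly_typical` is assumed) that tests the first, marginal condition. The joint condition is the same computation applied to $p(x,y)$ and $H(X,Y)$:

```python
import numpy as np

def is_weakly_typical(x, p_X, eps):
    """True if | -(1/n) log2 p(x) - H(X) | < eps for a sequence x over {0,...,q-1}."""
    p_X = np.asarray(p_X, dtype=float)
    H = -np.sum(p_X * np.log2(p_X))                      # entropy H(X) in bits
    empirical = -np.sum(np.log2(p_X[np.asarray(x)])) / len(x)
    return abs(empirical - H) < eps

rng = np.random.default_rng(0)
p_X = [0.8, 0.2]
x = rng.choice(2, size=10_000, p=p_X)       # long i.i.d. sequence from p_X
print(is_weakly_typical(x, p_X, eps=0.05))  # True with high probability (WLLN)
```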
Joint Typicality Decoding
The decoder looks for a codeword that is jointly typical with the received sequence $\mathbf{y}$.
Error Events:
1. The transmitted codeword $\mathbf{x}$ is not jointly typical with $\mathbf{y}$. $\Rightarrow$ Low probability by the Weak Law of Large Numbers.
2. Another codeword $\tilde{\mathbf{x}}$ is jointly typical with $\mathbf{y}$.
Cuckoo's Egg Lemma: Let $\tilde{\mathbf{x}}$ be an i.i.d. sequence that is independent of the received sequence $\mathbf{y}$. Then
  $\mathbb{P}\{(\tilde{\mathbf{x}}, \mathbf{y}) \text{ is jointly typical}\} \leq 2^{-n(I(X;Y) - 3\epsilon)}$.
See Cover and Thomas.
Point-to-Point Capacity
• We can upper bound the probability of error via the union bound and the Cuckoo's Egg Lemma:
  $\mathbb{P}\{\hat{w} \neq w\} \leq \sum_{\tilde{w} \neq w} \mathbb{P}\{(\mathbf{x}(\tilde{w}), \mathbf{y}) \text{ is jointly typical}\} \leq 2^{nR} \cdot 2^{-n(I(X;Y) - 3\epsilon)} = 2^{-n(I(X;Y) - R - 3\epsilon)}$
• If $R < I(X;Y)$, the probability of error can be driven to zero as the blocklength increases.
Theorem (Shannon '48): The capacity of a point-to-point channel is $C = \max_{p_X} I(X;Y)$.
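As a worked instance (our example, not from this slide): for a BSC with crossover probability $p$, the uniform input maximizes $I(X;Y)$, giving $C = 1 - h_B(p)$:

```python
import numpy as np

def h_B(p):
    """Binary entropy function in bits."""
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

# BSC capacity: the maximizing input distribution is uniform, C = 1 - h_B(p).
print(1 - h_B(0.11))  # about 0.5 bits per channel use
```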
Linear Codes
• Linear codebook: a linear map between messages and codewords (instead of a lookup table).
q-ary Linear Codes:
• Represent the message $\mathbf{w}$ as a length-$k$ vector over $\mathbb{F}_q$.
• Codewords $\mathbf{x}$ are length-$n$ vectors over $\mathbb{F}_q$.
• The encoding process is just a matrix multiplication, $\mathbf{x} = G\mathbf{w}$:

$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1k} \\ g_{21} & g_{22} & \cdots & g_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ g_{n1} & g_{n2} & \cdots & g_{nk} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_k \end{bmatrix}$$

• Recall that, for prime $q$, operations over $\mathbb{F}_q$ are just mod-$q$ operations over the reals.
• Rate $R = \frac{k}{n} \log q$
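A quick sketch of the encoding step, assuming numpy and prime $q$ so that mod-$q$ arithmetic implements $\mathbb{F}_q$:

```python
import numpy as np

def linear_encode(G, w, q):
    """Codeword x = Gw, with all arithmetic mod q (q prime)."""
    return (G @ w) % q

rng = np.random.default_rng(0)
q, n, k = 5, 8, 3
G = rng.integers(0, q, size=(n, k))   # generator matrix over F_q
w = rng.integers(0, q, size=k)        # message as a length-k vector over F_q
x = linear_encode(G, w, q)            # length-n codeword
print(x)
```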
Random Linear Codes
• A linear code looks like a regular subsampling of the elements of $\mathbb{F}_q^n$. (Figure: codewords form a regular, lattice-like pattern in the grid $\{0, 1, \ldots, q-1\}^n$.)
• Random linear code: generate each element $g_{ij}$ of the generator matrix $G$ elementwise i.i.d. according to a uniform distribution over $\{0, 1, 2, \ldots, q-1\}$.
• How are the codewords distributed?
Codeword Distribution
It is convenient to instead analyze the shifted ensemble $\bar{\mathbf{x}} = G\mathbf{w} \oplus \mathbf{v}$, where $\mathbf{v}$ is an i.i.d. uniform sequence. (See Gallager.)
Shifted Codeword Properties:
1. Marginally uniform over $\mathbb{F}_q^n$. For a given message $\mathbf{w}$, the codeword $\bar{\mathbf{x}}$ looks like an i.i.d. uniform sequence:
   $\mathbb{P}\{\bar{\mathbf{x}} = \mathbf{x}\} = \frac{1}{q^n}$ for all $\mathbf{x} \in \mathbb{F}_q^n$
2. Pairwise independent. For $\mathbf{w}_1 \neq \mathbf{w}_2$, the codewords $\bar{\mathbf{x}}_1, \bar{\mathbf{x}}_2$ are independent:
   $\mathbb{P}\{\bar{\mathbf{x}}_1 = \mathbf{x}_1, \bar{\mathbf{x}}_2 = \mathbf{x}_2\} = \frac{1}{q^{2n}} = \mathbb{P}\{\bar{\mathbf{x}}_1 = \mathbf{x}_1\}\, \mathbb{P}\{\bar{\mathbf{x}}_2 = \mathbf{x}_2\}$
A small Monte Carlo check of both properties follows below.
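This is a toy sketch of our own, with tiny parameters ($q = 2$, $n = 3$, $k = 2$) chosen so the full joint distribution over codeword pairs can be tabulated:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
q, n, k = 2, 3, 2
w1, w2 = np.array([1, 0]), np.array([0, 1])   # two distinct messages

counts = Counter()
trials = 200_000
for _ in range(trials):
    G = rng.integers(0, q, size=(n, k))       # fresh random generator matrix
    v = rng.integers(0, q, size=n)            # fresh random shift
    x1 = tuple((G @ w1 + v) % q)
    x2 = tuple((G @ w2 + v) % q)
    counts[(x1, x2)] += 1

# Pairwise independence: all q^(2n) = 64 pairs occur, each with
# empirical frequency close to 1/64 ~ 0.0156.
print(len(counts), max(counts.values()) / trials)
```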
Achievable Rates
• The Cuckoo's Egg Lemma only requires independence between the true codeword $\mathbf{x}(w)$ and the other codeword $\mathbf{x}(\tilde{w})$. From the union bound:
  $\mathbb{P}\{\hat{w} \neq w\} \leq \sum_{\tilde{w} \neq w} \mathbb{P}\{(\mathbf{x}(\tilde{w}), \mathbf{y}) \text{ is jointly typical}\} \leq 2^{-n(I(X;Y) - R - 3\epsilon)}$
• This is exactly what pairwise independence gives us.
• Thus, there exists a good fixed generator matrix $G$ and shift $\mathbf{v}$ for any rate $R < I(X;Y)$ where $X$ is uniform.
Removing the Shift

$\mathbf{w} \to \mathcal{E} \to \bar{\mathbf{x}} \to \oplus\, \mathbf{z} \to \bar{\mathbf{y}} \to \mathcal{D} \to \hat{\mathbf{w}}$

• For a binary symmetric channel (BSC), the output can be written as the modulo-2 sum of the input and i.i.d. Bernoulli(p) noise:
  $\bar{\mathbf{y}} = \bar{\mathbf{x}} \oplus \mathbf{z} = G\mathbf{w} \oplus \mathbf{v} \oplus \mathbf{z}$
• Due to this symmetry, the probability of error depends only on the realization of the noise vector $\mathbf{z}$. $\Rightarrow$ For a BSC, $\mathbf{x} = G\mathbf{w}$ is a good code as well.
• We can now assume the existence of good generator matrices for channel coding.
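A sketch of the resulting unshifted scheme over $\mathbb{F}_2$ (our own toy setup; it simulates the channel only, no decoder is implemented):

```python
import numpy as np

def bsc(x, p, rng):
    """Pass a binary vector through a BSC(p): flip each bit independently w.p. p."""
    z = (rng.random(len(x)) < p).astype(int)   # i.i.d. Bernoulli(p) noise
    return (x + z) % 2                         # y = x XOR z

rng = np.random.default_rng(0)
n, k, p = 16, 4, 0.05
G = rng.integers(0, 2, size=(n, k))
w = rng.integers(0, 2, size=k)
x = (G @ w) % 2        # unshifted linear codeword x = Gw
y = bsc(x, p, rng)     # channel output; errors depend only on the noise z
```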
Random i.i.d. vs. Random Linear
• What have we gained from linearity (so far)? Simplified encoding. (The decoder is still quite complex.)
• What have we lost? We can only achieve $R = I(X;Y)$ for uniform $X$, instead of $\max_{p_X} I(X;Y)$.
• In fact, this is a fundamental limitation of group codes (Ahlswede '71).
• Workarounds: symbol remapping (Gallager '68), nested linear codes.
• Are random linear codes strictly worse than random i.i.d. codes?
Slepian-Wolf Problem

$\mathbf{s}_1 \to \mathcal{E}_1 \xrightarrow{R_1} \mathcal{D}$, $\quad \mathbf{s}_2 \to \mathcal{E}_2 \xrightarrow{R_2} \mathcal{D} \to (\hat{\mathbf{s}}_1, \hat{\mathbf{s}}_2)$

• Jointly i.i.d. sources: $p(\mathbf{s}_1, \mathbf{s}_2) = \prod_{i=1}^{m} p_{S_1 S_2}(s_{1i}, s_{2i})$
• Rate region: the set of rates $(R_1, R_2)$ such that the encoders can send $\mathbf{s}_1$ and $\mathbf{s}_2$ to the decoder with vanishing probability of error:
  $\mathbb{P}\{(\hat{\mathbf{s}}_1, \hat{\mathbf{s}}_2) \neq (\mathbf{s}_1, \mathbf{s}_2)\} \to 0$ as $m \to \infty$
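A sketch of sampling such a source, assuming the joint pmf is given as a $q \times q$ table `P` (the numeric values below anticipate the doubly symmetric binary source used later in this section):

```python
import numpy as np

def joint_iid_source(P, m, rng):
    """Draw m i.i.d. pairs (s1_i, s2_i) from a joint pmf P[a, b] over {0,...,q-1}^2."""
    q = P.shape[0]
    flat = rng.choice(q * q, size=m, p=P.flatten())
    return flat // q, flat % q            # decode the flat index back into (s1, s2)

rng = np.random.default_rng(0)
p = 0.11
# Doubly symmetric binary source: S1 ~ Bern(1/2), S2 = S1 through a BSC(p).
P = np.array([[0.5 * (1 - p), 0.5 * p],
              [0.5 * p,       0.5 * (1 - p)]])
s1, s2 = joint_iid_source(P, m=10, rng=rng)
print(s1, s2)
```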
Random Binning
• Codebook 1: independently and uniformly assign each source sequence $\mathbf{s}_1$ a label in $\{1, 2, \ldots, 2^{mR_1}\}$.
• Codebook 2: independently and uniformly assign each source sequence $\mathbf{s}_2$ a label in $\{1, 2, \ldots, 2^{mR_2}\}$.
• Decoder: look for a jointly typical pair $(\hat{\mathbf{s}}_1, \hat{\mathbf{s}}_2)$ within the received bin. Union bound:
  $\mathbb{P}\{\exists \text{ jointly typical } (\tilde{\mathbf{s}}_1, \tilde{\mathbf{s}}_2) \neq (\mathbf{s}_1, \mathbf{s}_2) \text{ in bin } (\ell_1, \ell_2)\} \leq |\{\text{jointly typical } (\tilde{\mathbf{s}}_1, \tilde{\mathbf{s}}_2)\}| \cdot 2^{-m(R_1 + R_2)} \leq 2^{m(H(S_1, S_2) + \epsilon)}\, 2^{-m(R_1 + R_2)}$
• Need $R_1 + R_2 > H(S_1, S_2)$.
• Similarly, $R_1 > H(S_1 | S_2)$ and $R_2 > H(S_2 | S_1)$.
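An illustrative sketch of the binning step (our own construction: an explicit lookup table over all $q^m$ sequences, so only toy-sized $m$ is feasible):

```python
import numpy as np

def random_bins(m, R, q, rng):
    """Assign every sequence in {0,...,q-1}^m an i.i.d. uniform bin label.

    Stored as an explicit q^m-entry table, purely for illustration.
    """
    num_bins = int(round(2 ** (m * R)))
    return rng.integers(0, num_bins, size=q ** m)

def bin_of(s, q, table):
    """Look up a sequence's bin by treating it as a base-q integer."""
    idx = 0
    for symbol in s:
        idx = idx * q + int(symbol)
    return table[idx]

rng = np.random.default_rng(0)
table1 = random_bins(m=8, R=0.5, q=2, rng=rng)   # encoder 1's bin assignment
print(bin_of([1, 0, 1, 1, 0, 0, 1, 0], q=2, table=table1))
```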
Slepian-Wolf Problem: Binning Illustration
(Figure: a grid of bin pairs indexed by $1, 2, \ldots, 2^{nR_1}$ and $1, 2, \ldots, 2^{nR_2}$; each source pair lands in one cell, and the decoder searches its cell for a jointly typical pair.)
Random Linear Binning
• Assume the source symbols take values in $\mathbb{F}_q$.
• Codebook 1: generate a matrix $G_1$ with i.i.d. uniform entries drawn from $\mathbb{F}_q$. Each sequence $\mathbf{s}_1$ is binned via matrix multiplication, $\mathbf{w}_1 = G_1 \mathbf{s}_1$.
• Codebook 2: generate a matrix $G_2$ with i.i.d. uniform entries drawn from $\mathbb{F}_q$. Each sequence $\mathbf{s}_2$ is binned via matrix multiplication, $\mathbf{w}_2 = G_2 \mathbf{s}_2$.
• Bin assignments are uniform and pairwise independent (except for $\mathbf{s}_\ell = \mathbf{0}$).
• We can apply the same union-bound analysis as for random binning; see the sketch below.
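The same sketch with linear binning replacing the lookup table. The bin label is now a length-$k$ vector over $\mathbb{F}_q$, giving rate $R = \frac{k}{m} \log q$ (parameters are ours):

```python
import numpy as np

def linear_bin(G, s, q):
    """Bin label w = Gs over F_q, computed mod q (q prime)."""
    return (G @ s) % q

rng = np.random.default_rng(0)
q, m, k = 2, 10, 6
G1 = rng.integers(0, q, size=(k, m))   # encoder 1's random binning matrix
s1 = rng.integers(0, q, size=m)        # source sequence
w1 = linear_bin(G1, s1, q)             # bin label sent to the decoder
print(w1)
```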
Slepian-Wolf Rate Region
Slepian-Wolf Theorem: Reliable compression is possible if and only if
  $R_1 \geq H(S_1 | S_2)$,
  $R_2 \geq H(S_2 | S_1)$,
  $R_1 + R_2 \geq H(S_1, S_2)$.
Example: Doubly symmetric binary source: $S_1 \sim \text{Bern}(1/2)$, $U \sim \text{Bern}(p)$, $S_2 = S_1 \oplus U$. Here $H(S_1|S_2) = H(S_2|S_1) = h_B(p)$ and $H(S_1, S_2) = 1 + h_B(p)$.
(Figure: the S-W rate region, bounded by $R_1 = h_B(p)$, $R_2 = h_B(p)$, and the sum-rate line $R_1 + R_2 = 1 + h_B(p)$.)
Random linear binning is as good as random i.i.d. binning!
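Plugging in the example (our numeric check, with $p = 0.11$ chosen so that $h_B(p) \approx 0.5$):

```python
import numpy as np

def h_B(p):
    """Binary entropy function in bits."""
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

p = 0.11
print("R1      >=", h_B(p))        # H(S1|S2) = h_B(p), about 0.5
print("R2      >=", h_B(p))        # H(S2|S1) = h_B(p), about 0.5
print("R1 + R2 >=", 1 + h_B(p))    # H(S1,S2) = 1 + h_B(p), about 1.5
```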
Körner-Marton Problem

$\mathbf{s}_1 \to \mathcal{E}_1 \xrightarrow{R_1} \mathcal{D}$, $\quad \mathbf{s}_2 \to \mathcal{E}_2 \xrightarrow{R_2} \mathcal{D} \to \hat{\mathbf{u}}$

• Binary sources: $\mathbf{s}_1$ is i.i.d. Bernoulli(1/2); $\mathbf{s}_2$ is $\mathbf{s}_1$ corrupted by Bernoulli(p) noise.
• The decoder wants only the modulo-2 sum $\mathbf{u} = \mathbf{s}_1 \oplus \mathbf{s}_2$.
• Rate region: the set of rates $(R_1, R_2)$ such that there exist encoders and decoders with vanishing probability of error: $\mathbb{P}\{\hat{\mathbf{u}} \neq \mathbf{u}\} \to 0$ as $m \to \infty$.
Are any rate savings possible over sending $\mathbf{s}_1$ and $\mathbf{s}_2$ in their entirety?
Random Binning
• Sending $\mathbf{s}_1$ and $\mathbf{s}_2$ with random binning requires $R_1 + R_2 > 1 + h_B(p)$.
• What happens if we use rates such that $R_1 + R_2 < 1 + h_B(p)$?
• There will be exponentially many pairs $(\mathbf{s}_1, \mathbf{s}_2)$ consistent with each received bin pair!
• This would be fine if all pairs in a bin had the same sum $\mathbf{s}_1 \oplus \mathbf{s}_2$, but the probability of that event goes to zero exponentially fast. The toy simulation below makes this concrete.
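This is a simplified simulation of our own: it ignores typicality and simply counts the distinct sums that occur within one received bin pair at too low a sum rate:

```python
import numpy as np

rng = np.random.default_rng(0)
m, R = 10, 0.4                         # R1 = R2 = 0.4, so R1 + R2 = 0.8 is too low
num_bins = int(round(2 ** (m * R)))    # 16 bins per encoder

# All 2^m binary sequences, one per row.
seqs = ((np.arange(2 ** m)[:, None] >> np.arange(m)) & 1).astype(int)

bins1 = rng.integers(0, num_bins, size=2 ** m)   # encoder 1's random bins
bins2 = rng.integers(0, num_bins, size=2 ** m)   # encoder 2's random bins

# All (s1, s2) pairs consistent with one received bin pair, say (0, 0):
b1 = seqs[bins1 == 0]
b2 = seqs[bins2 == 0]
sums = {tuple((s1 + s2) % 2) for s1 in b1 for s2 in b2}
# Many candidate pairs and almost as many distinct sums: the bins carry
# essentially no information about u = s1 XOR s2.
print(len(b1) * len(b2), "pairs,", len(sums), "distinct sums")
```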
Körner-Marton Problem: Random Binning Illustration
(Figure: the same grid of bin pairs indexed by $1, 2, \ldots, 2^{nR_1}$ and $1, 2, \ldots, 2^{nR_2}$; at low rates each cell contains many candidate pairs $(\mathbf{s}_1, \mathbf{s}_2)$ with differing sums.)