Computation problem in sequential decoding

◮ Computation in sequential decoding is a random quantity, depending on the code rate R and the noise realization
◮ Bursts of noise create barriers for the depth-first search algorithm, necessitating excessive backtracking in the search
◮ Still, the average computation per decoded digit in sequential decoding can be kept bounded provided the code rate R is below the cutoff rate
    $R_0 \triangleq -\log \sum_y \Bigl(\sum_x Q(x)\sqrt{W(y|x)}\Bigr)^2$
◮ So, SD solves the coding problem for rates below $R_0$
◮ Indeed, SD was the method of choice in space communications, albeit briefly
References on complexity of sequential decoding

◮ Achievability: Wozencraft (1957), Reiffen (1962), Fano (1963), Stiglitz and Yudkin (1964)
◮ Converse: Jacobs and Berlekamp (1967)
◮ Refinements: Wozencraft and Jacobs (1965), Savage (1966), Gallager (1968), Jelinek (1968), Forney (1974), Arıkan (1986), Arıkan (1994)
Sequential decoding and the cutoff rate
Guessing and cutoff rate
Boosting the cutoff rate
Pinsker’s scheme
Massey’s scheme
Polar coding
A computational model for sequential decoding

◮ SD visits nodes at level N in a certain order
◮ No “look-ahead” assumption: SD forgets what it saw beyond level N upon backtracking
◮ Complexity measure $G_N$: the number of nodes searched (visited) at level N until the correct node is visited for the first time
A bound on computational complexity

◮ Let R be a fixed code rate.
◮ There exist tree codes of rate R such that
    $E[G_N] \le 1 + 2^{-N(R_0 - R)}$.
◮ Conversely, for any tree code of rate R,
    $E[G_N] \gtrsim 1 + 2^{-N(R_0 - R)}$.
The Guessing Problem

◮ Alice draws a sample of a random variable X ∼ P.
◮ Bob wishes to determine X by asking questions of the form “Is X equal to x?”, which are answered truthfully by Alice.
◮ Bob’s goal is to minimize the expected number of questions until he gets a YES answer.
Guessing with Side Information

◮ Alice samples (X, Y) ∼ P(x, y).
◮ Bob observes Y and is to determine X by asking the same type of questions, “Is X equal to x?”
◮ The goal is to minimize the expected number of guesses.
Optimal guessing strategies

◮ Let G be the number of guesses to determine X.
◮ The expected number of guesses is given by
    $E[G] = \sum_{x \in \mathcal{X}} P(x)\, G(x)$
◮ A guessing strategy minimizes E[G] if
    $P(x) > P(x') \implies G(x) < G(x')$.
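A minimal Python sketch of this strategy (the function name and the example distribution are illustrative, not from the slides): sort the probabilities in decreasing order and charge i guesses to the i-th most likely value.

```python
# Sketch: expected number of guesses under the optimal strategy.
# Guessing in decreasing order of probability minimizes E[G].

def expected_guesses(P):
    """E[G*] = sum_i i * p_(i), with p_(1) >= p_(2) >= ... (optimal order)."""
    p_sorted = sorted(P, reverse=True)
    return sum(i * p for i, p in enumerate(p_sorted, start=1))

P = [0.5, 0.25, 0.125, 0.125]
print(expected_guesses(P))  # 1.875
```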
Upper bound on guessing effort

For any optimal guessing function,
    $E[G^*(X)] \le \Bigl(\sum_x \sqrt{P(x)}\Bigr)^2$.

Proof. Under an optimal strategy, any $x'$ guessed no later than $x$ has $P(x') \ge P(x)$, so each such $x'$ contributes at least 1 to $\sqrt{P(x')/P(x)}$, giving
    $G^*(x) \le \sum_{\text{all } x'} \sqrt{P(x')/P(x)}$.
Therefore
    $E[G^*(X)] \le \sum_x P(x) \sum_{x'} \sqrt{P(x')/P(x)} = \Bigl(\sum_x \sqrt{P(x)}\Bigr)^2$.
Lower bound on guessing effort

For any guessing function for a target r.v. X with M possible values,
    $E[G(X)] \ge (1 + \ln M)^{-1} \Bigl(\sum_x \sqrt{P(x)}\Bigr)^2$

For the proof we use the following variant of Hölder’s inequality.
Lemma

Let $a_i$, $p_i$ be positive numbers. Then
    $\sum_i a_i p_i \ge \Bigl(\sum_i a_i^{-1}\Bigr)^{-1} \Bigl(\sum_i \sqrt{p_i}\Bigr)^2$.

Proof. Let $\lambda = 1/2$ and put $A_i = a_i^{-\lambda}$, $B_i = a_i^{\lambda} p_i^{\lambda}$ in Hölder’s inequality
    $\sum_i A_i B_i \le \Bigl(\sum_i A_i^{1/(1-\lambda)}\Bigr)^{1-\lambda} \Bigl(\sum_i B_i^{1/\lambda}\Bigr)^{\lambda}$.
Proof of Lower Bound

$E[G(X)] = \sum_{i=1}^{M} i\, p_G(i)$   (where $p_G(i)$ is the probability of the $i$-th guessed value)
$\qquad \ge \Bigl(\sum_{i=1}^{M} 1/i\Bigr)^{-1} \Bigl(\sum_{i=1}^{M} \sqrt{p_G(i)}\Bigr)^2$
$\qquad = \Bigl(\sum_{i=1}^{M} 1/i\Bigr)^{-1} \Bigl(\sum_x \sqrt{P(x)}\Bigr)^2$
$\qquad \ge (1 + \ln M)^{-1} \Bigl(\sum_x \sqrt{P(x)}\Bigr)^2$

The first inequality applies the Lemma with $a_i = i$; the last step uses $\sum_{i=1}^{M} 1/i \le 1 + \ln M$.
Essence of the inequalities

For any set of real numbers $p_1 \ge p_2 \ge \cdots \ge p_M > 0$,
    $1 \ge \frac{\sum_{i=1}^{M} i\, p_i}{\bigl(\sum_{i=1}^{M} \sqrt{p_i}\bigr)^2} \ge (1 + \ln M)^{-1}$
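A quick numeric check of this sandwich (illustrative code; the random distribution is arbitrary):

```python
import math
import random

def guessing_ratio(P):
    """Return sum_i i*p_(i) / (sum_i sqrt(p_i))^2 with p sorted decreasingly."""
    p = sorted(P, reverse=True)
    eg = sum(i * pi for i, pi in enumerate(p, start=1))
    return eg / sum(math.sqrt(pi) for pi in p) ** 2

random.seed(0)
w = [random.random() for _ in range(16)]
P = [x / sum(w) for x in w]
r, M = guessing_ratio(P), len(P)
print(1 / (1 + math.log(M)), "<=", r, "<=", 1)  # both inequalities hold
```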
Guessing Random Vectors

◮ Let $X = (X_1, \ldots, X_n) \sim P(x_1, \ldots, x_n)$.
◮ Guessing X means asking questions of the form “Is X = x?” for possible values $x = (x_1, \ldots, x_n)$ of X.
◮ Notice that coordinate-wise probes of the type “Is $X_i = x_i$?” are not allowed.
Complexity of Vector Guessing

Suppose $X_i$ has $M_i$ possible values, $i = 1, \ldots, n$. Then
    $1 \ge \frac{E[G^*(X_1, \ldots, X_n)]}{\bigl(\sum_{x_1,\ldots,x_n} \sqrt{P(x_1,\ldots,x_n)}\bigr)^2} \ge [1 + \ln(M_1 \cdots M_n)]^{-1}$

In particular, if $X_1, \ldots, X_n$ are i.i.d. ∼ P with a common alphabet $\mathcal{X}$,
    $1 \ge \frac{E[G^*(X_1, \ldots, X_n)]}{\bigl(\sum_{x \in \mathcal{X}} \sqrt{P(x)}\bigr)^{2n}} \ge [1 + n \ln |\mathcal{X}|]^{-1}$
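The i.i.d. bound says the guessing effort grows exponentially, with per-symbol growth factor $(\sum_x \sqrt{P(x)})^2$, i.e. $E[G^*]^{1/n}$ approaches that constant. A small numeric illustration (sketch; brute force over the product distribution, so only small n):

```python
import math
from itertools import product

def eg_opt(P):
    p = sorted(P, reverse=True)
    return sum(i * pi for i, pi in enumerate(p, start=1))

P = [0.7, 0.2, 0.1]
limit = sum(math.sqrt(p) for p in P) ** 2   # the per-symbol growth factor
for n in (1, 2, 4, 6):
    joint = [math.prod(t) for t in product(P, repeat=n)]  # i.i.d. joint pmf
    print(n, eg_opt(joint) ** (1 / n), "->", limit)
```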
Guessing with Side Information

◮ (X, Y): a pair of random variables with a joint distribution P(x, y).
◮ Y known; X to be guessed as before.
◮ G(x|y): the number of guesses when X = x, Y = y.
Lower Bound

For any guessing strategy,
    $E[G(X|Y)] \ge (1 + \ln M)^{-1} \sum_y \Bigl(\sum_x \sqrt{P(x,y)}\Bigr)^2$
where M is the number of possible values of X.

Proof.
$E[G(X|Y)] = \sum_y P(y)\, E[G(X|Y=y)]$
$\qquad \ge \sum_y P(y)\, (1 + \ln M)^{-1} \Bigl(\sum_x \sqrt{P(x|y)}\Bigr)^2$
$\qquad = (1 + \ln M)^{-1} \sum_y \Bigl(\sum_x \sqrt{P(x,y)}\Bigr)^2$
Upper bound

Optimal guessing functions satisfy
    $E[G^*(X|Y)] \le \sum_y \Bigl(\sum_x \sqrt{P(x,y)}\Bigr)^2$.

Proof.
$E[G^*(X|Y)] = \sum_y P(y) \sum_x P(x|y)\, G^*(x|y)$
$\qquad \le \sum_y P(y) \Bigl(\sum_x \sqrt{P(x|y)}\Bigr)^2$
$\qquad = \sum_y \Bigl(\sum_x \sqrt{P(x,y)}\Bigr)^2$.
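A sketch combining both quantities for guessing with side information (the dict-based joint pmf and the helper names are illustrative assumptions):

```python
import math

P = {(0, 0): 0.4, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.4}  # joint pmf {(x, y): prob}
ys = {y for (_, y) in P}

def col(y):
    """Probabilities P(x, y) for fixed y."""
    return [p for (x, yy), p in P.items() if yy == y]

# E[G*(X|Y)] = sum_y sum_x P(x,y) G*(x|y): within each y, guess in decreasing order.
eg = sum(sum(i * p for i, p in enumerate(sorted(col(y), reverse=True), 1)) for y in ys)

# Upper bound: sum_y (sum_x sqrt(P(x,y)))^2
ub = sum(sum(math.sqrt(p) for p in col(y)) ** 2 for y in ys)

print(eg, "<=", ub)  # 1.2 <= 1.8
```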
Generalization to Random Vectors

For optimal guessing functions,
    $1 \ge \frac{E[G^*(X_1, \ldots, X_k \mid Y_1, \ldots, Y_n)]}{\sum_{y_1,\ldots,y_n} \bigl(\sum_{x_1,\ldots,x_k} \sqrt{P(x_1,\ldots,x_k,y_1,\ldots,y_n)}\bigr)^2} \ge [1 + \ln(M_1 \cdots M_k)]^{-1}$
where $M_i$ denotes the number of possible values of $X_i$.
A “guessing” decoder

◮ Consider a block code with M codewords $x_1, \ldots, x_M$ of block length N.
◮ Suppose a codeword is chosen at random and sent over a channel W.
◮ Given the channel output y, a “guessing decoder” decodes by asking questions of the form “Is the correct codeword the m-th one?”, to which it receives a truthful YES or NO answer.
◮ On a NO answer it repeats the question with a new m.
◮ The complexity C of this decoder is the number of questions until a YES answer.
Optimal guessing decoder

An optimal guessing decoder is one that minimizes the expected complexity E[C]. Clearly, E[C] is minimized by generating the guesses in decreasing order of the likelihoods $W(y|x_m)$:

$x_{i_1}$ ← 1st guess (the most likely codeword given y)
$x_{i_2}$ ← 2nd guess (2nd most likely codeword given y)
...
$x_{i_L}$ ← correct codeword obtained; guessing stops

The complexity C equals the number of guesses L.
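A toy simulation of such a decoder on a BSC with a random codebook (all parameter values and names here are illustrative assumptions, not from the slides):

```python
import math
import random

random.seed(1)
eps, N, M = 0.1, 8, 16                       # BSC(0.1), block length 8, 16 codewords
code = [[random.randint(0, 1) for _ in range(N)] for _ in range(M)]

def loglik(x, y):
    """log W(y|x) for the BSC: depends only on the Hamming distance."""
    d = sum(a != b for a, b in zip(x, y))
    return d * math.log(eps) + (N - d) * math.log(1 - eps)

m = random.randrange(M)                      # codeword chosen at random
y = [b ^ (random.random() < eps) for b in code[m]]   # send over the BSC

# Guessing decoder: probe codewords in decreasing order of likelihood.
order = sorted(range(M), key=lambda k: -loglik(code[k], y))
C = order.index(m) + 1                       # complexity = number of guesses L
print(C)
```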
Application to the guessing decoder

◮ A block code $\mathcal{C} = \{x_1, \ldots, x_M\}$ with $M = e^{NR}$ codewords of block length N.
◮ A codeword X chosen at random and sent over a DMC W.
◮ Given the channel output vector Y, the decoder guesses X.

This is a special case of guessing with side information, where
    $P(X = x, Y = y) = e^{-NR} \prod_{i=1}^{N} W(y_i | x_i), \quad x \in \mathcal{C}$
Cutoff rate bound

$E[G^*(X|Y)] \ge [1 + NR]^{-1} \sum_y \Bigl(\sum_x \sqrt{P(x,y)}\Bigr)^2$
$\qquad = [1 + NR]^{-1} e^{NR} \sum_y \Bigl(\sum_x Q_N(x) \sqrt{W_N(y|x)}\Bigr)^2$
$\qquad \ge [1 + NR]^{-1} e^{N(R - R_0(W))}$

(here $Q_N$ denotes the empirical distribution of the codewords and $W_N$ the N-letter channel), where
    $R_0(W) = \max_Q \Bigl\{ -\ln \sum_y \Bigl(\sum_x Q(x) \sqrt{W(y|x)}\Bigr)^2 \Bigr\}$
is the channel cutoff rate.
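A numeric sketch of $R_0$ for a fixed input distribution Q (for the BSC the uniform Q is optimal by symmetry, which this sketch simply assumes rather than verifying by maximizing over Q):

```python
import math

def cutoff_rate_bits(W, Q):
    """R_0 for input distribution Q: -log2 sum_y (sum_x Q(x) sqrt(W(y|x)))^2."""
    inner = [sum(Q[x] * math.sqrt(W[x][y]) for x in range(len(Q)))
             for y in range(len(W[0]))]
    return -math.log2(sum(v * v for v in inner))

eps = 0.1
W = [[1 - eps, eps], [eps, 1 - eps]]         # BSC transition matrix W[x][y]
print(cutoff_rate_bits(W, [0.5, 0.5]))       # equals 1 - log2(1 + 2*sqrt(eps*(1-eps)))
```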
Sequential decoding and the cutoff rate
Guessing and cutoff rate
Boosting the cutoff rate
Pinsker’s scheme
Massey’s scheme
Polar coding
Boosting the cutoff rate

◮ It was clear almost from the beginning that $R_0$ was at best shaky in its role as a limit to practical communications
◮ There were many attempts to boost the cutoff rate by devising clever schemes for searching a tree
◮ One striking example is Pinsker’s scheme, which displayed the strange nature of $R_0$
Sequential decoding and the cutoff rate
Guessing and cutoff rate
Boosting the cutoff rate
Pinsker’s scheme
Massey’s scheme
Polar coding
Binary Symmetric Channel

We will describe Pinsker’s scheme using the BSC example:

◮ Capacity: $C = 1 + \epsilon \log_2 \epsilon + (1-\epsilon) \log_2(1-\epsilon)$
◮ Cutoff rate: $R_0 = \log_2 \frac{2}{1 + 2\sqrt{\epsilon(1-\epsilon)}}$
Capacity and cutoff rate for the BSC

[Figure: plots of $R_0$ and $C$, and of the ratio $R_0/C$, versus the crossover probability $\epsilon$]
Pinsker’s scheme

Based on the observations that, as $\epsilon \to 0$,
    $\frac{R_0(\epsilon)}{C(\epsilon)} \to 1$ and $R_0(\epsilon) \to 1$,
Pinsker (1965) proposed a concatenation scheme that achieved capacity within constant average cost per decoded bit, irrespective of the level of reliability.
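A quick check of these limits for the BSC (illustrative):

```python
import math

def capacity(eps):      # C = 1 - h(eps) in bits
    return 1 + eps * math.log2(eps) + (1 - eps) * math.log2(1 - eps)

def cutoff(eps):        # R_0 = log2( 2 / (1 + 2 sqrt(eps(1-eps))) )
    return math.log2(2 / (1 + 2 * math.sqrt(eps * (1 - eps))))

for eps in (0.1, 0.01, 0.001):
    print(eps, cutoff(eps) / capacity(eps))   # ratio tends to 1 as eps -> 0
```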
Pinsker’s scheme

[Figure: data streams $d_1, \ldots, d_{K_2}$ enter $K_2$ identical convolutional encoders (CE); their outputs $u_1, \ldots, u_{K_2}$ are mapped by a block encoder to $x_1, \ldots, x_{N_2}$ and sent over $N_2$ independent copies of W; a block decoder (ML) recovers $\hat{u}_1, \ldots, \hat{u}_{K_2}$, which $K_2$ independent sequential decoders (SD) turn into $\hat{d}_1, \ldots, \hat{d}_{K_2}$.]

The inner block code does the initial clean-up at huge but finite complexity; the outer convolutional encoding (CE) and sequential decoding (SD) boost the reliability at little extra cost.
Discussion

◮ Although Pinsker’s scheme made a very strong theoretical point, it was not practical.
◮ There were many more attempts to get around the $R_0$ barrier in the 1960s:
  ◮ D. Falconer, “A Hybrid Sequential and Algebraic Decoding Scheme,” Sc.D. thesis, Dept. of Elec. Eng., M.I.T., 1966.
  ◮ I. Stiglitz, “Iterative sequential decoding,” IEEE Transactions on Information Theory, vol. 15, no. 6, pp. 715–721, Nov. 1969.
  ◮ F. Jelinek and J. Cocke, “Bootstrap hybrid decoding for symmetrical binary input channels,” Inform. Contr., vol. 18, no. 3, pp. 261–298, Apr. 1971.
◮ It is fair to say that none of these schemes had any practical impact.
$R_0$ as practical capacity

◮ The failure to beat the cutoff rate bound in a meaningful manner despite intense efforts elevated $R_0$ to the status of a “realistic” limit to reliable communications
◮ $R_0$ appears as the key figure-of-merit for communication system design in the influential works of the period:
  ◮ Wozencraft and Jacobs, Principles of Communication Engineering, 1965
  ◮ Wozencraft and Kennedy, “Modulation and demodulation for probabilistic coding,” IEEE Trans. Inform. Theory, 1966
  ◮ Massey, “Coding and modulation in digital communications,” Zürich, 1974
◮ Forney (1995) gives a first-hand account of this situation in his Shannon Lecture “Performance and Complexity”
Other attempts to boost the cutoff rate

Efforts to beat the cutoff rate continue to this day:

◮ D. J. Costello and F. Jelinek, 1972.
◮ P. R. Chevillat and D. J. Costello Jr., 1977.
◮ F. Hemmati, 1990.
◮ B. Radosavljevic, E. Arıkan, B. Hajek, 1992.
◮ J. Belzile and D. Haccoun, 1993.
◮ S. Kallel and K. Li, 1997.
◮ E. Arıkan, 2006.
◮ ...

In fact, polar coding originates from such attempts.
Sequential decoding and the cutoff rate
Guessing and cutoff rate
Boosting the cutoff rate
Pinsker’s scheme
Massey’s scheme
Polar coding
The $R_0$ debate

A case study by McEliece (1980) cast a big doubt on the significance of $R_0$ as a practical limit:

◮ McEliece’s study was concerned with a Pulse Position Modulation (PPM) scheme, modeled as a q-ary erasure channel
◮ Capacity: $C(q) = (1-\epsilon)\log q$
◮ Cutoff rate: $R_0(q) = \log \frac{q}{1 + (q-1)\epsilon}$
◮ As the bandwidth (q) grew, $\frac{R_0(q)}{C(q)} \to 0$
◮ Algebraic coding (Reed–Solomon) scored a big win over probabilistic coding!

[Figure: q-ary erasure channel — each input $i \in \{1, \ldots, q\}$ is received correctly with probability $1-\epsilon$ or erased (output “?”) with probability $\epsilon$.]
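The collapse of the ratio is easy to reproduce numerically (sketch; logs in base 2):

```python
import math

eps = 0.1
for q in (2, 4, 16, 256, 4096):
    C = (1 - eps) * math.log2(q)                 # capacity of the q-ary erasure channel
    R0 = math.log2(q / (1 + (q - 1) * eps))      # its cutoff rate
    print(q, round(R0 / C, 3))                   # ratio -> 0 as q grows
```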
Massey meets the challenge

◮ Massey (1981) showed that there was a different way of doing coding and modulation on a q-ary erasure channel that boosted $R_0$ effortlessly
◮ Paradoxically, as Massey restored the status of $R_0$, he exhibited the “flaky” nature of this parameter
Channel splitting to boost cutoff rate (Massey, 1981)

◮ Begin with a quaternary erasure channel (QEC)

[Figure: QEC with inputs 1–4, each delivered intact with probability $1-\epsilon$ or erased (output “?”) with probability $\epsilon$.]
Channel splitting to boost cutoff rate (Massey, 1981)

◮ Relabel the inputs with binary labels 00, 01, 10, 11

[Figure: the same QEC with inputs relabeled 00, 01, 10, 11; erasures output “??”.]
Channel splitting to boost cutoff rate (Massey, 1981)

◮ Split the QEC into two binary erasure channels (BEC)
◮ BECs fully correlated: erasures occur jointly

[Figure: each coordinate of the binary label sees a BEC with inputs 0, 1 and erasure output “?”.]
Capacity, cutoff rate for one QEC vs two BECs

Ordinary coding of QEC: E → QEC → D.
Independent coding of BECs: two parallel chains E → BEC → D.

$C(\text{QEC}) = 2(1-\epsilon)$    $C(\text{BEC}) = 1-\epsilon$
$R_0(\text{QEC}) = \log \frac{4}{1+3\epsilon}$    $R_0(\text{BEC}) = \log \frac{2}{1+\epsilon}$

◮ $C(\text{QEC}) = 2 \times C(\text{BEC})$
◮ $R_0(\text{QEC}) \le 2 \times R_0(\text{BEC})$, with equality iff $\epsilon = 0$ or 1.
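A numeric illustration of the splitting gain (sketch; logs in base 2):

```python
import math

for eps in (0.1, 0.3, 0.5, 0.9):
    r0_qec = math.log2(4 / (1 + 3 * eps))        # cutoff rate of the QEC
    r0_bec = math.log2(2 / (1 + eps))            # cutoff rate of one BEC
    print(eps, round(r0_qec, 3), round(2 * r0_bec, 3))  # 2*R0(BEC) >= R0(QEC)
```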
Cutoff rate improvement by splitting

[Figure: capacity and cutoff rate (bits) versus erasure probability $\epsilon$, plotting the QEC capacity, the 2 × BEC cutoff rate, and the QEC cutoff rate; the split cutoff rate lies above the QEC cutoff rate.]
Comparison of Pinsker’s and Massey’s schemes

◮ Pinsker
  ◮ Construct a superchannel by combining independent copies of a given DMC W
  ◮ Split the superchannel into correlated subchannels
  ◮ Ignore correlations between the subchannels; encode and decode them independently
  ◮ Can be used universally
  ◮ Can achieve capacity
  ◮ Not practical
◮ Massey
  ◮ Split the given DMC W into correlated subchannels
  ◮ Ignore correlations between the subchannels; encode and decode them independently
  ◮ Applicable only to specific channels
  ◮ Cannot achieve capacity
  ◮ Practical