Previously... Forward and converse proof of the rate-distortion theorem

Lecture 14, S. Cheng (OU-Tulsa), November 28, 2017

Review. Previously: forward and converse proof of the rate-distortion theorem.

Overview. This time: method of types, universal source coding, large deviation theory.


  4. Method of types: Example
     Let $X \in \{1, 2, 3\}$ and $x^N = 11321$. Its empirical distribution is $p_{x^N}(1) = 3/5$, $p_{x^N}(2) = 1/5$, $p_{x^N}(3) = 1/5$.
     The type class $T(p_{x^N}) = \{11123, 11132, 11231, 11321, \cdots\}$ contains all sequences with three 1's, one 2, and one 3, so $|T(p_{x^N})| = \frac{5!}{3!\,1!\,1!} = 20$.
     In general, $|T(p)| = \frac{N!}{(Np(x_1))!\,(Np(x_2))!\,(Np(x_3))! \cdots}$.
     Actually, we don't care too much what $|T(p)|$ is exactly; bounds for $|T(p)|$ are provided later on.
     For any sequence $y$ in $T(p_{x^N})$, $p(y) = q(1)^3 q(2) q(3)$, where $q(\cdot)$ is the true distribution.
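The counting in this example can be checked with a short Python sketch. The helper names `type_class_size` and `type_class` are illustrative and not part of the lecture; the brute-force enumeration is only sensible for very short sequences.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def type_class_size(seq):
    """Multinomial count N! / prod(count(a)!) of sequences sharing seq's type."""
    size = factorial(len(seq))
    for count in Counter(seq).values():
        size //= factorial(count)
    return size

def type_class(seq):
    """Brute-force enumeration of the type class (tiny N only)."""
    return {"".join(p) for p in permutations(seq)}

x = "11321"
print(type_class_size(x))   # size from the multinomial formula
print(len(type_class(x)))   # size from explicit enumeration
```

Both numbers should agree with the $5!/(3!\,1!\,1!)$ count on the slide.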

  10. Method of types: Type sequence probability
      Even though we have seen this in the coin toss example, let's restate it more formally.
      Theorem 1. If $x^N \in T(p)$ and $q(\cdot)$ is the true distribution of $X$, the probability of getting $x^N$ from sampling $q(\cdot)$ $N$ times, denoted $q^N(x^N)$, is given by $2^{-N(H(p) + KL(p \| q))}$.
      Proof.
      $q^N(x^N) = \prod_{i=1}^{N} q(x_i) = 2^{\sum_{i=1}^{N} \log q(x_i)} = 2^{\sum_{a \in \mathcal{X}} N(a|x^N) \log q(a)}$
      $= 2^{-N \sum_{a \in \mathcal{X}} -p_{x^N}(a) \log q(a)}$
      $= 2^{-N \left( \sum_{a \in \mathcal{X}} p(a) \log \frac{p(a)}{q(a)} - \sum_{a \in \mathcal{X}} p(a) \log p(a) \right)}$
      $= 2^{-N(H(p) + KL(p \| q))}$
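As a numerical sanity check of Theorem 1, the sketch below compares the direct product of probabilities with the closed form for the sequence 11321. The distribution `q` is a made-up example, not one from the lecture, and the helper names are illustrative.

```python
from collections import Counter
from math import isclose, log2, prod

def empirical(seq):
    """Empirical distribution (type) of a sequence."""
    n = len(seq)
    return {a: c / n for a, c in Counter(seq).items()}

def entropy(p):
    return -sum(v * log2(v) for v in p.values() if v > 0)

def kl(p, q):
    return sum(v * log2(v / q[a]) for a, v in p.items() if v > 0)

q = {"1": 0.5, "2": 0.3, "3": 0.2}   # hypothetical true distribution
x = "11321"
p = empirical(x)

direct = prod(q[a] for a in x)                           # q(1)^3 q(2) q(3)
via_theorem = 2 ** (-len(x) * (entropy(p) + kl(p, q)))   # Theorem 1
print(direct, via_theorem, isclose(direct, via_theorem))
```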

  13. Method of types: Probability of a sequence in the "typical" class
      If $x^N \in T(q)$, where $q(\cdot)$ is the true distribution of $X$, then $q^N(x^N) = 2^{-NH(q)} = 2^{-NH(X)}$.
      Remarks:
      Note that the probability is exactly equal to $2^{-NH(X)}$.
      Recall that this is what the probability of a typical sequence is supposed to be. Therefore, any $x^N$ in $T(q)$ is a typical sequence ($T(q) \subset A_\epsilon^N(X)$).

  18. Method of types: Set of all empirical distributions $\mathcal{P}_N(X)$
      Denote by $\mathcal{P}_N(X)$ the set of all empirical distributions of $X$ in a length-$N$ sequence.
      Example: if $X \in \{0, 1\}$,
      $\mathcal{P}_N(X) = \left\{ (p_X(0), p_X(1)) : \left(\frac{0}{N}, \frac{N}{N}\right), \left(\frac{1}{N}, \frac{N-1}{N}\right), \cdots, \left(\frac{N}{N}, \frac{0}{N}\right) \right\}$
      Note that $|\mathcal{P}_N(X)| = N + 1$.
      Since a type is uniquely characterized by a distribution of $X$ in a length-$N$ sequence, each element $p$ of $\mathcal{P}_N(X)$ corresponds to a type $T(p)$, and the number of types is $|\mathcal{P}_N(X)|$.

  21. Method of types: Number of types
      It is not too difficult to count the exact number of types. In practice, though, we don't bother as long as we know that the number is relatively "small".
      Theorem 2. $|\mathcal{P}_N(X)| \le (N + 1)^{|\mathcal{X}|}$
      Proof. Each type is specified by the empirical probability of each outcome of $X$, and the possible values of each empirical probability are $\frac{0}{N}, \frac{1}{N}, \cdots, \frac{N}{N}$ ($N + 1$ of them). Since there are $|\mathcal{X}|$ outcomes, the number of types is bounded by $(N + 1)^{|\mathcal{X}|}$.
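To see how loose the bound in Theorem 2 is, one can enumerate the types directly. The sketch below represents a type by its vector of counts; the function name `types` is illustrative, and the comparison with the binomial coefficient is the standard stars-and-bars count, stated here as a known fact rather than something from the slides.

```python
from itertools import product
from math import comb

def types(N, k):
    """All count vectors (n_1, ..., n_k) summing to N; each defines the type n/N."""
    return [c for c in product(range(N + 1), repeat=k) if sum(c) == N]

for N, k in [(5, 2), (5, 3), (10, 3)]:
    exact = len(types(N, k))
    # exact count, stars-and-bars count, and the (N+1)^|X| bound
    print(N, k, exact, comb(N + k - 1, k - 1), (N + 1) ** k)
```

For a binary alphabet the exact count is $N + 1$, matching the earlier example, while the bound gives $(N+1)^2$.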

  29. Method of types: Size of a type class
      Recall that $|T(p)| = \frac{N!}{(Np(x_1))!\,(Np(x_2))!\,(Np(x_3))! \cdots}$, but the following bounds are much more useful in practice.
      Theorem 3. $\frac{1}{(N + 1)^{|\mathcal{X}|}} 2^{NH(p)} \le |T(p)| \le 2^{NH(p)}$
      Proof. Let's assume $p(\cdot)$ is the actual distribution of $X$ here.
      Upper bound:
      $1 \ge \sum_{x^N \in T(p)} p^N(x^N) = \sum_{x^N \in T(p)} 2^{-NH(p)} = |T(p)| \, 2^{-NH(p)}$
      Lower bound, using the fact that $T(p)$ is the most probable type class when $p$ is the true distribution:
      $1 = \sum_{\hat p \in \mathcal{P}_N} \Pr(T(\hat p)) \le \sum_{\hat p \in \mathcal{P}_N} \max_{\tilde p} \Pr(T(\tilde p)) = \sum_{\hat p \in \mathcal{P}_N} \Pr(T(p)) \le (N + 1)^{|\mathcal{X}|} \Pr(T(p)) = (N + 1)^{|\mathcal{X}|} \, |T(p)| \, 2^{-NH(p)}$
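The bounds of Theorem 3 can be checked numerically for small cases. This is a minimal sketch in which a type is given by its count vector; the helper names are illustrative, and the small tolerance only guards against floating-point rounding.

```python
from itertools import product
from math import factorial, log2

def type_class_size(counts):
    """Exact |T(p)| = N! / prod(n_a!) for the type with the given counts."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size

def entropy_from_counts(counts):
    N = sum(counts)
    return -sum(c / N * log2(c / N) for c in counts if c > 0)

def theorem3_bounds(counts):
    """Return (lower, upper) bounds on |T(p)| from Theorem 3."""
    N, k = sum(counts), len(counts)
    upper = 2 ** (N * entropy_from_counts(counts))
    return upper / (N + 1) ** k, upper

counts = (3, 1, 1)                      # the type of 11321
lower, upper = theorem3_bounds(counts)
print(lower, type_class_size(counts), upper)

# Check every type for N = 6 over a ternary alphabet.
all_ok = all(
    theorem3_bounds(c)[0] <= type_class_size(c) <= theorem3_bounds(c)[1] + 1e-9
    for c in product(range(7), repeat=3) if sum(c) == 6
)
print(all_ok)
```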

  31. Method of types: Probability of a type class
      Theorem 4. Let the true distribution of $X$ be $q(\cdot)$. Then
      $\frac{2^{-N \, KL(p \| q)}}{(N + 1)^{|\mathcal{X}|}} \le \Pr(T(p)) \le 2^{-N \, KL(p \| q)}$
      Proof. From Theorem 1, each sequence in $T(p)$ has probability $2^{-N(H(p) + KL(p \| q))}$, and from Theorem 3, $\frac{1}{(N + 1)^{|\mathcal{X}|}} 2^{NH(p)} \le |T(p)| \le 2^{NH(p)}$. Therefore
      $\frac{1}{(N + 1)^{|\mathcal{X}|}} 2^{NH(p)} \, 2^{-N(H(p) + KL(p \| q))} \le \Pr(T(p)) \le 2^{NH(p)} \, 2^{-N(H(p) + KL(p \| q))}$
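A small numerical check of Theorem 4 follows. It computes $\Pr(T(p))$ exactly as (class size) times (per-sequence probability) and compares it with the two bounds. The distribution `q` is a made-up example and the helper names are illustrative.

```python
from itertools import product
from math import factorial, log2

def type_class_size(counts):
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size

def kl_from_counts(counts, q):
    """KL(p || q) in bits, where p is the type with the given counts."""
    N = sum(counts)
    return sum(c / N * log2((c / N) / qa) for c, qa in zip(counts, q) if c > 0)

def type_class_probability(counts, q):
    """Exact Pr(T(p)): every sequence in the class has the same probability."""
    per_sequence = 1.0
    for c, qa in zip(counts, q):
        per_sequence *= qa ** c
    return type_class_size(counts) * per_sequence

q = (0.5, 0.3, 0.2)          # hypothetical true distribution
counts = (3, 1, 1)           # the type of 11321
N, k = sum(counts), len(q)

prob = type_class_probability(counts, q)
upper = 2 ** (-N * kl_from_counts(counts, q))
lower = upper / (N + 1) ** k
print(lower, prob, upper)

# The type classes partition all length-N sequences, so their probabilities sum to 1.
total = sum(type_class_probability(c, q)
            for c in product(range(N + 1), repeat=k) if sum(c) == N)
print(total)
```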

  36. Method of types: Summary of types
      Type class $T(p)$ contains all sequences with empirical distribution $p$. That is, $T(p) = \left\{ x^N : \frac{N(a|x^N)}{N} = p(a) \right\}$.
      All sequences in the type class $T(p)$ have the same probability ($q(\cdot)$ is the true distribution): $q^N(x^N) = 2^{-N(H(p) + KL(p \| q))}$.
      There are about $2^{NH(p)}$ sequences in $T(p)$: $\frac{1}{(N + 1)^{|\mathcal{X}|}} 2^{NH(p)} \le |T(p)| \le 2^{NH(p)}$.
      The probability of getting a sequence in $T(p)$ is about $2^{-N \, KL(p \| q)}$. More precisely, $\frac{2^{-N \, KL(p \| q)}}{(N + 1)^{|\mathcal{X}|}} \le \Pr(T(p)) \le 2^{-N \, KL(p \| q)}$.
      There are at most $(N + 1)^{|\mathcal{X}|}$ types.

  39. Universal source coding: Rationale
      For the compression schemes (such as Huffman coding) that we discussed earlier in this class, one needs to know the source distribution ahead of time to design the encoder and decoder.
      Question: Is it possible to construct a compression scheme without knowing the source distribution and still perform as well?
      Answer: Yes, at least theoretically → universal source coding.

  47. Universal source coding: Theory of universal source coding
      Given any source $Q$ with $H(Q) < R$, there exists a length-$N$ universal code of rate $R$ such that the source can be decoded losslessly as $N \to \infty$.
      Proof. Let $R_N = R - \frac{|\mathcal{X}| \log(N + 1)}{N}$, and consider the set of sequences $A = \{ x^N : H(p_{x^N}) < R_N \}$ as the code book. Note that the rate is less than $R$, as
      $|A| = \sum_{p : H(p) < R_N} |T(p)| \le \sum_{p : H(p) < R_N} 2^{NH(p)} < \sum_{p : H(p) < R_N} 2^{N R_N} \le (N + 1)^{|\mathcal{X}|} \, 2^{N R_N} = 2^{N \left( R_N + \frac{|\mathcal{X}| \log(N + 1)}{N} \right)} = 2^{NR}$
      Encoder: given an input, check whether it is in $A$ and output its index if so. Otherwise, declare failure.
      Decoder: simply map the index back to the sequence.
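The code book construction in this proof can be sketched for a toy case. The brute-force enumeration below is exponential in $N$ and is meant only as an illustration; the function names, the use of a sorted list for indexing, and the parameter values $N = 16$, $R = 0.9$ are assumptions made for the example, not part of the lecture.

```python
from collections import Counter
from itertools import product
from math import log2

def empirical_entropy(seq):
    """Entropy (bits) of the empirical distribution of seq."""
    n = len(seq)
    return -sum(c / n * log2(c / n) for c in Counter(seq).values())

def build_codebook(alphabet, N, R):
    """A = {x^N : H(p_x) < R_N}, with R_N = R - |X| log2(N+1) / N."""
    R_N = R - len(alphabet) * log2(N + 1) / N
    return sorted("".join(s) for s in product(alphabet, repeat=N)
                  if empirical_entropy(s) < R_N)

def encode(codebook, x):
    """Index of x in the code book, or None to declare failure."""
    try:
        return codebook.index(x)
    except ValueError:
        return None

def decode(codebook, index):
    return codebook[index]

N, R = 16, 0.9
A = build_codebook("01", N, R)
print(len(A), 2 ** (N * R))             # |A| stays below 2^(NR)

x = "0000000100000000"                   # low empirical entropy: encodable
print(decode(A, encode(A, x)) == x)
print(encode(A, "0101010101010101"))     # high empirical entropy: failure (None)
```

Note that the encoder never uses the source distribution: membership in the code book depends only on the empirical entropy of the input.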

  54. Universal source coding: Theory of universal source coding, proof (cont'd)
      An error occurs when the input is not in $A$, so by Theorems 2 and 4 the probability of error $P_e$ satisfies
      $P_e = \sum_{p : H(p) \ge R_N} \Pr(T(p)) \le \sum_{p : H(p) \ge R_N} \max_{\tilde p : H(\tilde p) \ge R_N} \Pr(T(\tilde p)) \le (1 + N)^{|\mathcal{X}|} \, 2^{-N \min_{\tilde p : H(\tilde p) \ge R_N} KL(\tilde p \| q)}$
      If $H(q) < R$, then since $R_N \to R$ as $N$ increases, we can find some $N_0$ such that $H(q) < R_N$ for all $N \ge N_0$.
      Therefore, any $p$ in $\{ p : H(p) \ge R_N \}$ cannot be the same as $q$ $\Rightarrow$ $\min_{\tilde p : H(\tilde p) \ge R_N} KL(\tilde p \| q) > 0$ for $N \ge N_0$.
      Because $R_N$ increases with $N$, this minimum is bounded below by its value at $N_0$, so the exponential decay dominates the polynomial factor.
      Hence, $P_e \to 0$ as $N \to \infty$.
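For a binary source the error probability of this scheme can be computed exactly, since each type is determined by the number of ones. The sketch below treats the error event as the input falling outside the code book, that is, empirical entropy not below $R_N$. The source parameter `q1 = 0.1` (so $H(q) \approx 0.47 < R$), the rate $R = 0.9$, and the function names are assumptions chosen for illustration.

```python
from math import comb, log2

def binary_entropy(x):
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def error_probability(N, R, q1):
    """Exact P_e for a Bernoulli(q1) source: total mass of types outside the code book."""
    R_N = R - 2 * log2(N + 1) / N            # |X| = 2 for a binary alphabet
    return sum(
        comb(N, k) * q1 ** k * (1 - q1) ** (N - k)
        for k in range(N + 1)
        if binary_entropy(k / N) >= R_N      # type not in A, so encoding fails
    )

for N in (16, 50, 200, 500):
    print(N, error_probability(N, R=0.9, q1=0.1))
```

The printed values should shrink rapidly with $N$, in line with the exponential bound in the proof.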

  70. Universal source coding: Lempel-Ziv coding
      Its variants are widely used by compression tools almost everywhere (zip, pkzip, tiff, etc.).
      Main ideas:
      Construct a dictionary including all previously seen segments.
      The bits needed to send a new segment can be reduced by taking advantage of known segments in the dictionary.
      Example: let's compress 10110111011110111.
      First, parse the sequence into segments that haven't been seen before ⇒
      index:   1  2  3   4   5    6    7   8
      segment: 1, 0, 11, 01, 110, 111, 10, 111
      Encode each segment into a representation containing a pair of numbers: 1) the index of the segment (excluding the last bit) in the dictionary; 2) the last bit
      ⇒ (0, 1), (0, 0), (1, 1), (2, 1), (3, 0), (3, 1), (1, 0), (6, ∅)
      Encode the representation into a bit stream. Note that as the dictionary grows, the number of bits needed to store the index increases
      ⇒ 0100011101011001110010110
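The parsing and encoding steps of this example can be sketched as follows. The function names are illustrative. The index width rule, `max(1, ceil(log2 k))` bits for the k-th segment, is an assumption inferred from the bit stream shown on the slide rather than something the slide states; real Lempel-Ziv variants differ in such details.

```python
def lz_parse(bits):
    """Split bits into new segments; return (prefix index, last bit) pairs."""
    dictionary = {"": 0}                 # index 0 is the empty prefix
    pairs, segment = [], ""
    for b in bits:
        segment += b
        if segment not in dictionary:
            dictionary[segment] = len(dictionary)
            pairs.append((dictionary[segment[:-1]], b))
            segment = ""
    if segment:                          # leftover that is already in the dictionary
        pairs.append((dictionary[segment], None))
    return pairs

def lz_to_bits(pairs):
    """Serialize pairs; the k-th index uses max(1, ceil(log2 k)) bits (assumed rule)."""
    out = []
    for k, (index, bit) in enumerate(pairs, start=1):
        width = max(1, (k - 1).bit_length())
        out.append(format(index, f"0{width}b"))
        if bit is not None:
            out.append(bit)
    return "".join(out)

pairs = lz_parse("10110111011110111")
print(pairs)
print(lz_to_bits(pairs))
```

Under that width rule the output reproduces the pairs and the bit stream of the example, with `None` standing in for ∅.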

  76. Universal source coding: Lempel-Ziv decoding
      Decode the bitstream back to the pair representation:
      0100011101011001110010110 ⇒ (0, 1), (0, 0), (1, 1), (2, 1), (3, 0), (3, 1), (1, 0), (6, ∅)
      Build the dictionary and decode. After the first five pairs:
      index:   1  2  3   4   5
      segment: 1  0  11  01  110
      ⇒ 101101110
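The decoding steps can be sketched in the same style. As before, the function names are illustrative and the index width rule (`max(1, ceil(log2 k))` bits for the k-th pair) is an assumption inferred from the example bit stream. A pair whose index is followed by no further bits is treated as the final (index, ∅) pair.

```python
def lz_from_bits(stream):
    """Read (index, bit) pairs back from the bit stream (assumed width rule)."""
    pairs, pos, k = [], 0, 1
    while pos < len(stream):
        width = max(1, (k - 1).bit_length())
        index = int(stream[pos:pos + width], 2)
        pos += width
        if pos < len(stream):
            bit = stream[pos]
            pos += 1
        else:
            bit = None                   # final segment was already in the dictionary
        pairs.append((index, bit))
        k += 1
    return pairs

def lz_rebuild(pairs):
    """Rebuild the dictionary pair by pair and concatenate the segments."""
    segments = [""]                      # index 0 is the empty prefix
    for index, bit in pairs:
        segments.append(segments[index] + (bit or ""))
    return "".join(segments[1:])

pairs = lz_from_bits("0100011101011001110010110")
print(pairs)
print(lz_rebuild(pairs))
```

Running the remaining pairs through the same dictionary update recovers the full input 10110111011110111.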
