Lecture 3: Source Coding

I-Hsiang Wang
Department of Electrical Engineering
National Taiwan University
ihwang@ntu.edu.tw

October 19, 2014
Meta Description: The Source Coding Problem

[Figure: Source → s[1:N] → Source Encoder → b[1:K] → Source Decoder → ŝ[1:N] → Destination]

1. Encoder: represent the source sequence s[1:N] by a binary source codeword w := b[1:K] ∈ [0 : 2^K − 1], with K as small as possible.
2. Decoder: from the source codeword w, reconstruct the source sequence either losslessly or within a certain distortion.
3. Efficiency: determined by the code rate R := K/N bits/symbol time.
Decoding Criteria

[Figure: Source → s[1:N] → Source Encoder → b[1:K] → Source Decoder → ŝ[1:N] → Destination]

Naturally, one would think of two different decoding criteria for the source coding problem:
1. Exact: the reconstructed sequence ŝ[1:N] = s[1:N].
2. Lossy: the reconstructed sequence ŝ[1:N] ≠ s[1:N], but is within a prescribed distortion.
Let us begin with some simple analysis of the system with the exact recovery criterion.

For N fixed, if the decoder would like to reconstruct s[1:N] exactly for all possible s[1:N] ∈ S^N, then it is simple to see that the smallest K must satisfy

  2^(K−1) < |S|^N ≤ 2^K  ⟹  K = ⌈N log|S|⌉.

Why? Because every possible sequence has to be uniquely represented by K bits!

As a consequence, it seems that if we require exact reconstruction, it is impossible to have data compression. What is going wrong?
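As a quick numeric check of the formula above, here is a minimal Python sketch; the function name and the ternary-alphabet example are illustrative, not from the lecture.

```python
from math import ceil, log2

def exact_codeword_length(alphabet_size: int, n: int) -> int:
    """Smallest K with 2^K >= |S|^N, i.e. K = ceil(N * log2 |S|)."""
    return ceil(n * log2(alphabet_size))

# 100 symbols from a ternary alphabet need ceil(100 * log2(3)) = 159 bits.
print(exact_codeword_length(3, 100))  # 159
```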
Random Source

Recall: data compression is possible because there is redundancy in the source sequence.

One of the simplest ways to capture redundancy is to model the source as a random process. (Another reason to use a random source model is due to engineering reasons, as mentioned in Lecture 1.) Redundancy comes from the fact that different symbols in S are drawn with different probabilities.

With a random source model, immediately there are two approaches one can take to demonstrate data compression:
- Allow variable codeword lengths for symbols with different probabilities, rather than fixing the length to be K.
- Allow (almost) lossless reconstruction rather than exact recovery.
Block-to-Variable Source Coding

[Figure: Source → s[1:N] → Source Encoder → b[1:K] (variable length) → Source Decoder → ŝ[1:N] → Destination]

The key difference here is that we allow K to depend on the realization of the source, s[1:N]. The definition of the code rate is modified to R := E[K]/N.

Using variable codeword lengths is intuitive – for symbols with higher probability, we tend to use shorter codewords to represent them.

In this lecture we will introduce an optimal block-to-variable source code, called the Huffman code, which can achieve the minimum compression rate for a given distribution of the random source.

Note: the decoding criterion here is exact reconstruction.
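The following is a minimal Python sketch of the standard greedy Huffman construction for a symbol-by-symbol (N = 1) code; the function name and the example pmf are illustrative assumptions, not taken from the lecture.

```python
import heapq

def huffman_code(pmf):
    """Build a binary Huffman code for a pmf given as {symbol: probability}.
    Returns a dict mapping each symbol to a '0'/'1' codeword string."""
    if len(pmf) == 1:  # degenerate case: a single symbol still needs one bit
        return {sym: "0" for sym in pmf}
    # Heap entries: (subtree probability, tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # two least probable subtrees
        p1, _, c1 = heapq.heappop(heap)
        # Prepend one more bit to every codeword in each merged subtree.
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(pmf)
avg_len = sum(pmf[s] * len(code[s]) for s in pmf)
print(code, avg_len)  # average length 1.75 bits/symbol
```

For this dyadic pmf the average codeword length equals the source entropy, 1.75 bits per symbol, which is the minimum compression rate mentioned above.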
(Almost) Lossless Decoding Criterion

Another way to let the randomness kick in: allow non-exact recovery.

Key features of this approach:
- Focus on the asymptotic regime where N → ∞; instead of error-free reconstruction, the criterion is relaxed to vanishing error probability.
- Compared to the previous approach, where the analysis is mainly combinatorial, the analysis here is mostly based on probabilistic arguments.

To be precise, we turn our focus to finding the smallest possible R = K/N given that the error probability

  P_e^(N) := Pr{S[1:N] ≠ Ŝ[1:N]} → 0 as N → ∞.
Outline

In this lecture, we shall
1. First, introduce a powerful tool called typical sequences, and use typical sequences to prove a lossless source coding theorem.
2. Second, introduce block-to-variable source coding schemes, especially Huffman codes, and prove their optimality.

In both cases, we will show that the minimum compression rate is equal to the entropy of the random source.

We shall begin with the simplest case where the random process {S[t] | t = 1, 2, ...} consists of i.i.d. random variables S[t] ∼ p_S, which is called a discrete memoryless source (DMS).
1. Typical Sequences and a Lossless Source Coding Theorem
2. Weakly Typical Sequences and Sources with Memory
3. Summary
Overview of Typicality Methods

Goal: Understand and exploit the probabilistic asymptotic properties of an i.i.d. randomly generated sequence S[1:N] for coding.

Key Observation: When N → ∞, one often observes that a substantially small set of sequences becomes "typical" and contributes almost the whole probability, while the others become "atypical".

For lossless reconstruction with vanishing error probability, we can use shorter codewords to label "typical" sequences and ignore "atypical" ones.

Note: There are several notions of typicality and various definitions in the literature. In this lecture, we give two definitions: (robust) typicality and weak typicality.

Notation: For notational convenience, we shall use the following interchangeably: x[t] ↔ x_t, x[1:N] ↔ x^N.
Typical Sequence

Roughly speaking, a (robust) typical sequence is a sequence whose empirical distribution is close to the actual distribution.

For a sequence x^n, its empirical p.m.f. is given by the frequency of occurrence of each symbol in the sequence:

  π(a | x^n) := (1/n) ∑_{i=1}^n I{x_i = a}.

Due to the law of large numbers, π(a | x^n) → p_X(a) for all a ∈ X as n → ∞, if x^n is drawn i.i.d. based on p_X. With high probability, the empirical p.m.f. does not deviate too much from the actual p.m.f.

Definition 1 (Typical Sequence)
For X ∼ p_X and ϵ ∈ (0, 1), a sequence x^n is called ϵ-typical if
  |π(a | x^n) − p_X(a)| ≤ ϵ p_X(a), ∀ a ∈ X.
The typical set is defined as the collection of all ϵ-typical length-n sequences, denoted by T_ϵ^(n)(X).
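As a small illustration of the definition, here is a Python sketch that computes the empirical p.m.f. and checks robust ϵ-typicality; the function names are illustrative, not from the lecture.

```python
from collections import Counter

def empirical_pmf(seq):
    """pi(a | x^n): relative frequency of each symbol a in the sequence x^n."""
    n = len(seq)
    counts = Counter(seq)
    return {a: counts[a] / n for a in counts}

def is_typical(seq, pmf, eps):
    """Robust eps-typicality: |pi(a|x^n) - p_X(a)| <= eps * p_X(a) for all a."""
    pi = empirical_pmf(seq)
    return all(abs(pi.get(a, 0.0) - p) <= eps * p for a, p in pmf.items())

# A length-10 Ber(1/2) sequence with five 1s is 0.2-typical.
print(is_typical([0, 1, 1, 0, 1, 0, 0, 1, 0, 1], {0: 0.5, 1: 0.5}, 0.2))  # True
```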
Note: In the following, if the context is clear, we will write "T_ϵ^(n)" instead of "T_ϵ^(n)(X)".

Example 1
Consider a random bit sequence generated i.i.d. based on Ber(1/2). Let us set ϵ = 0.2 and n = 10. What is T_ϵ^(n)? How large is the typical set?

Sol: Based on the definition, an n-sequence x^n is ϵ-typical iff π(0 | x^n) ∈ [0.4, 0.6] and π(1 | x^n) ∈ [0.4, 0.6]. In other words, the number of "0"s in the sequence should be 4, 5, or 6. Hence, T_ϵ^(n) consists of all length-10 sequences with 4, 5, or 6 "0"s.

The size of T_ϵ^(n) is (10 choose 4) + (10 choose 5) + (10 choose 6) = 672.
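A one-line Python check of this count, as a hedged sketch of the same computation:

```python
from math import comb

# A sequence is 0.2-typical for Ber(1/2) with n = 10 iff it has 4, 5, or 6 zeros.
size = sum(comb(10, k) for k in (4, 5, 6))
print(size)  # 210 + 252 + 210 = 672, out of 2^10 = 1024 sequences in total
```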
Properties of Typical Sequences

Proposition 1 (Properties of Typical Sequences and Typical Set)
Let p(x^n) := Pr{X^n = x^n} = ∏_{i=1}^n p_X(x_i), that is, the probability that the DMS generates the sequence x^n. Similarly, p(A) := Pr{X^n ∈ A} denotes the probability of a set A.

1. ∀ x^n ∈ T_ϵ^(n)(X), 2^(−n(H(X)+δ(ϵ))) ≤ p(x^n) ≤ 2^(−n(H(X)−δ(ϵ))), where δ(ϵ) = ϵ H(X).
   (by definition of typical sequences and entropy)
2. lim_{n→∞} p(T_ϵ^(n)(X)) = 1, i.e., p(T_ϵ^(n)(X)) ≥ 1 − ϵ for n large enough.
   (by the law of large numbers (LLN))
3. |T_ϵ^(n)(X)| ≤ 2^(n(H(X)+δ(ϵ))).
   (by summing up the lower bound in property 1 over the typical set)
4. |T_ϵ^(n)(X)| ≥ (1 − ϵ) 2^(n(H(X)−δ(ϵ))) for n large enough.
   (by the upper bound in property 1, and property 2)
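Property 2 can be seen numerically with a small Monte Carlo sketch; the Ber(0.3) source, ϵ = 0.1, and the function name are illustrative assumptions, not from the lecture.

```python
import random

def typical_prob_estimate(p1, n, eps, trials=10_000, seed=0):
    """Monte Carlo estimate of Pr{X^n in T_eps^(n)(X)} for a Ber(p1) source."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ones = sum(rng.random() < p1 for _ in range(n))
        pi1, pi0 = ones / n, 1 - ones / n
        if abs(pi1 - p1) <= eps * p1 and abs(pi0 - (1 - p1)) <= eps * (1 - p1):
            hits += 1
    return hits / trials

for n in (10, 100, 1000):
    # The estimated probability of the typical set grows toward 1 as n increases.
    print(n, typical_prob_estimate(0.3, n, 0.1))
```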