Lecture 5: Channel Coding over Continuous Channels




  1. Lecture 5: Channel Coding over Continuous Channels
     I-Hsiang Wang
     Department of Electrical Engineering, National Taiwan University
     ihwang@ntu.edu.tw
     November 14, 2014

  2. From Discrete-Valued to Continuous-Valued (1)
     So far we have focused on discrete-valued (and finite-alphabet) r.v.'s:
     - Entropy and mutual information for discrete-valued r.v.'s.
     - Lossless source coding for discrete stationary sources.
     - Channel coding over discrete memoryless channels.
     In this lecture we extend the basic principles and fundamental theorems to continuous-valued sources and channels. In particular:
     - Mutual information for continuous-valued r.v.'s.
     - Channel coding with input cost over continuous memoryless channels. (Main example: Gaussian channel capacity.)
     - Lossy source coding for continuous stationary sources.
     We skip lossy source coding (rate distortion theory) in this course.

  3. From Discrete-Valued to Continuous-Valued (2)
     Main technique for extending coding theorems from the discrete-valued, finite-alphabet world to the continuous-valued world: discretization.
     Advantages:
     - No need for new tools (e.g., typicality) for continuous-valued r.v.'s.
     - Extends naturally to multi-terminal settings: one can focus on discrete memoryless networks.
     Outline:
     1 Differential entropy
     2 Channel coding with input cost over DMC
     3 Gaussian channel capacity

  4. Disclaimer: due to time constraints, we will not be 100% rigorous in deriving the results in this lecture. Instead, you can find rigorous treatments in the references.
     Reading:
     1 Mutual information and differential entropy: Chapter 2.2, El Gamal & Kim [1]; Chapter 8, Cover & Thomas [2]; Chapter 15, Moser [5].
     2 Gaussian channel capacity: Chapters 3.3 and 3.4, El Gamal & Kim [1]; Chapter 9, Cover & Thomas [2]; Chapter 16, Moser [5].
     Remark: using discretization to derive the achievability of Gaussian channel capacity follows [1]. [2] uses weak typicality for continuous random variables; [5] uses a threshold decoder, similar to weak typicality in spirit.

  5. Outline
     1 Mutual Information and Differential Entropy
     2 Channel Coding with Input Cost
     3 Gaussian Channel Capacity
     4 Summary

  6. Entropy of a Continuous Random Variable
     Question: what is the entropy of a continuous real-valued random variable X?
     Suppose X has the probability density function (p.d.f.) f(x). Let us discretize X to answer this question, as follows:
     - Partition ℝ into length-∆ intervals: ℝ = ∪_{k=-∞}^{∞} [k∆, (k+1)∆).
     - Suppose that f(x) is continuous; then by the mean-value theorem, ∀ k ∈ ℤ, ∃ x_k ∈ [k∆, (k+1)∆) such that f(x_k) = (1/∆) ∫_{k∆}^{(k+1)∆} f(x) dx.
     - Set [X]_∆ := x_k if X ∈ [k∆, (k+1)∆), with p.m.f. p(x_k) = f(x_k)∆.
     Observation: lim_{∆→0} H([X]_∆) = H(X) (intuitively), while
       H([X]_∆) = -∑_{k=-∞}^{∞} (f(x_k)∆) log(f(x_k)∆) = -∆ ∑_{k=-∞}^{∞} f(x_k) log f(x_k) - log ∆   (using ∑_k f(x_k)∆ = 1)
                → -∫_{-∞}^{∞} f(x) log f(x) dx + ∞ = ∞ as ∆ → 0.
     We conclude that H(X) = ∞ if -∫_{-∞}^{∞} f(x) log f(x) dx = E[log(1/f(X))] exists.
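As a quick numerical sanity check (an added illustration, not part of the slides), the sketch below quantizes an assumed X ~ N(0, 1) with bin width ∆ and shows that H([X]_∆) tracks h(X) - log₂ ∆, hence blows up as ∆ → 0; the density, grid range, and bin widths are choices made only for this demo.

```python
# Minimal sketch: H([X]_Delta) ≈ h(X) - log2(Delta) for a standard Gaussian X.
import numpy as np

def gaussian_pdf(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

h_X = 0.5 * np.log2(2 * np.pi * np.e)             # differential entropy of N(0,1) in bits

for delta in [1.0, 0.1, 0.01, 0.001]:
    x = np.arange(-10, 10, delta) + delta / 2     # bin representatives (midpoints)
    p = gaussian_pdf(x) * delta                   # p(x_k) ≈ f(x_k) * Delta (mean-value theorem)
    H = -np.sum(p * np.log2(p))                   # entropy of the quantized r.v. [X]_Delta
    print(f"Delta={delta:7.3f}  H([X]_Delta)={H:7.3f}  h(X)-log2(Delta)={h_X - np.log2(delta):7.3f}")
```

The printed pairs agree more and more closely as ∆ shrinks, and both diverge, matching the conclusion H(X) = ∞.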

  7. Differential Entropy
     It is quite intuitive that the entropy of a continuous random variable can be arbitrarily large, because it can take infinitely many possible values. Hence, in general it is impossible to losslessly compress a continuous source with finite rate. Instead, lossy source coding is done.
     Yet, for continuous r.v.'s, it turns out to be useful to define the counterparts of entropy and conditional entropy, as follows:
     Definition 1 (Differential entropy and conditional differential entropy)
     The differential entropy of a continuous r.v. X with p.d.f. f(x) is defined as h(X) := E[log(1/f(X))], if the (improper) integral exists.
     The conditional differential entropy of a continuous r.v. X given Y, where (X, Y) has joint p.d.f. f(x, y) and conditional p.d.f. f(x|y), is defined as h(X|Y) := E[log(1/f(X|Y))], if the (improper) integral exists.
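A minimal Monte Carlo sketch of the definition h(X) = E[log(1/f(X))] (added here, not from the slides): the expectation is estimated by averaging log(1/f(X)) over samples of an assumed X ~ N(0, 1) and compared with the known closed form ½ ln(2πe).

```python
# Monte Carlo estimate of h(X) = E[log 1/f(X)] for X ~ N(0,1), in nats.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)                 # samples of X ~ N(0, 1)
log_f = -0.5 * x**2 - 0.5 * np.log(2 * np.pi)      # log f(x) for the standard Gaussian p.d.f.
h_estimate = np.mean(-log_f)                       # sample average of log(1/f(X))
print(h_estimate, 0.5 * np.log(2 * np.pi * np.e))  # both ≈ 1.4189 nats
```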

  8. Mutual Information between Continuous Random Variables
     How about mutual information between two continuous real-valued random variables X and Y, with joint p.d.f. f_{X,Y}(x, y) and marginal p.d.f.'s f_X(x) and f_Y(y)? Again, we use discretization:
     - Partition the ℝ² plane into ∆ × ∆ squares: ℝ² = ∪_{k,j=-∞}^{∞} I_k^∆ × I_j^∆, where I_k^∆ := [k∆, (k+1)∆).
     - Suppose that f_{X,Y}(x, y) is continuous; then by the mean-value theorem (MVT), ∀ k, j ∈ ℤ, ∃ (x_k, y_j) ∈ I_k^∆ × I_j^∆ such that f_{X,Y}(x_k, y_j) = (1/∆²) ∫_{I_k^∆ × I_j^∆} f_{X,Y}(x, y) dx dy.
     - Set ([X]_∆, [Y]_∆) := (x_k, y_j) if (X, Y) ∈ I_k^∆ × I_j^∆, with p.m.f. p(x_k, y_j) = f_{X,Y}(x_k, y_j) ∆².
     - By the MVT, ∀ k, j ∈ ℤ, ∃ x̃_k ∈ I_k^∆ and ỹ_j ∈ I_j^∆ such that p(x_k) := ∫_{I_k^∆} f_X(x) dx = f_X(x̃_k) ∆ and p(y_j) := ∫_{I_j^∆} f_Y(y) dy = f_Y(ỹ_j) ∆.

  9. Mutual Information between Continuous Random Variables
     Observation: lim_{∆→0} I([X]_∆; [Y]_∆) = I(X; Y) (intuitively), while
       I([X]_∆; [Y]_∆) = ∑_{k,j=-∞}^{∞} p(x_k, y_j) log [ p(x_k, y_j) / (p(x_k) p(y_j)) ]
                       = ∑_{k,j=-∞}^{∞} (f_{X,Y}(x_k, y_j) ∆²) log [ f_{X,Y}(x_k, y_j) ∆² / (f_X(x̃_k) ∆ · f_Y(ỹ_j) ∆) ]
                       = ∑_{k,j=-∞}^{∞} ∆² f_{X,Y}(x_k, y_j) log [ f_{X,Y}(x_k, y_j) / (f_X(x̃_k) f_Y(ỹ_j)) ]
                       → ∫_{-∞}^{∞} ∫_{-∞}^{∞} f_{X,Y}(x, y) log [ f_{X,Y}(x, y) / (f_X(x) f_Y(y)) ] dx dy as ∆ → 0.
     Hence, I(X; Y) = E[ log ( f(X, Y) / (f(X) f(Y)) ) ], if the improper integral exists.
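Unlike H([X]_∆), the discretized mutual information converges to a finite limit. The sketch below (an added example; the bivariate Gaussian with correlation ρ = 0.8 and the finite grid are assumptions) evaluates I([X]_∆; [Y]_∆) on a ∆ × ∆ grid and compares it with the closed-form value -½ ln(1 - ρ²).

```python
# I([X]_Delta; [Y]_Delta) -> I(X;Y) for an assumed bivariate Gaussian, in nats.
import numpy as np

rho = 0.8                                          # correlation coefficient (assumed example)

def f_xy(x, y):                                    # bivariate Gaussian p.d.f., unit variances
    q = (x**2 - 2 * rho * x * y + y**2) / (1 - rho**2)
    return np.exp(-q / 2) / (2 * np.pi * np.sqrt(1 - rho**2))

def f_marg(t):                                     # standard Gaussian marginal p.d.f.
    return np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

exact = -0.5 * np.log(1 - rho**2)                  # closed-form I(X;Y)
for delta in [0.5, 0.2, 0.1]:
    t = np.arange(-8, 8, delta) + delta / 2        # bin centers along each axis
    X, Y = np.meshgrid(t, t)
    p_xy = f_xy(X, Y) * delta**2                   # p(x_k, y_j) ≈ f_{X,Y}(x_k, y_j) * Delta^2
    p_x = f_marg(t) * delta                        # p(x_k) ≈ f_X(x_k) * Delta (same for Y)
    I = np.sum(p_xy * np.log(p_xy / np.outer(p_x, p_x)))
    print(f"Delta={delta}: I([X]_D;[Y]_D) ≈ {I:.4f} nats (exact: {exact:.4f})")
```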

 10. Mutual Information
     Unlike entropy, which is only well-defined for discrete random variables, in general we can define the mutual information between two real-valued random variables (not necessarily continuous or discrete) as follows.
     Definition 2 (Mutual information)
     The mutual information between two random variables X and Y is defined as
       I(X; Y) = sup_{P,Q} I([X]_P; [Y]_Q),
     where the supremum is taken over all pairs of partitions P and Q of ℝ.
     We have the following theorem immediately from the previous discussion:
     Theorem 1 (Mutual information between two continuous r.v.'s)
       I(X; Y) := E[ log ( f(X, Y) / (f(X) f(Y)) ) ] = h(X) - h(X|Y).
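A quick closed-form check of Theorem 1 (an added illustration; the jointly Gaussian pair with unit variances and correlation ρ is an assumption): h(X) - h(X|Y) matches the direct expectation formula for I(X; Y).

```python
# Theorem 1 check for jointly Gaussian (X, Y) with unit variances, correlation rho.
import numpy as np

rho = 0.8                                                    # assumed correlation
h_X = 0.5 * np.log(2 * np.pi * np.e)                         # h(X), X ~ N(0,1), in nats
h_X_given_Y = 0.5 * np.log(2 * np.pi * np.e * (1 - rho**2))  # X | Y=y ~ N(rho*y, 1 - rho^2)
I_direct = -0.5 * np.log(1 - rho**2)                         # E[log f(X,Y)/(f(X)f(Y))]
print(h_X - h_X_given_Y, I_direct)                           # both ≈ 0.5108 nats
```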

 11. Properties that Extend to Continuous R.V.'s
     Proposition 1 (Chain rule)
       h(X, Y) = h(X) + h(Y|X),   h(Xⁿ) = ∑_{i=1}^{n} h(X_i | X^{i-1}).
     Proposition 2 (Conditioning reduces differential entropy)
       h(X|Y) ≤ h(X),   h(X|Y, Z) ≤ h(X|Z).
     Proposition 3 (Non-negativity of mutual information)
       I(X; Y) ≥ 0,   I(X; Y|Z) ≥ 0.
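These propositions can be verified in closed form for the same assumed jointly Gaussian pair (an added illustration, not part of the lecture):

```python
# Closed-form checks of the chain rule, conditioning, and non-negativity, in nats.
import numpy as np

rho = 0.8
h_X = 0.5 * np.log(2 * np.pi * np.e)                          # h(X) = h(Y) for N(0,1)
h_Y_given_X = 0.5 * np.log(2 * np.pi * np.e * (1 - rho**2))   # conditional variance 1 - rho^2
h_XY = 0.5 * np.log((2 * np.pi * np.e)**2 * (1 - rho**2))     # 0.5*log((2*pi*e)^2 * det(Cov))
print(np.isclose(h_XY, h_X + h_Y_given_X))    # chain rule: h(X,Y) = h(X) + h(Y|X) -> True
print(h_Y_given_X <= h_X)                     # conditioning reduces differential entropy -> True
print(h_X - h_Y_given_X >= 0)                 # I(X;Y) = h(Y) - h(Y|X) >= 0 -> True
```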

 12. New Properties of Differential Entropy
     Differential entropy can be negative.
     Example 1 (Differential entropy of a uniform r.v.)
     For a r.v. X ~ Unif[a, b], that is, with p.d.f. f_X(x) = 1/(b-a) · I{a ≤ x ≤ b}, its differential entropy is h(X) = log(b - a).
     Since b - a can be made arbitrarily small, h(X) = log(b - a) can be negative. Hence, the non-negativity property of entropy cannot be extended to differential entropy.
     Scaling changes the differential entropy.
     Consider X ~ Unif[0, 1], so that 2X ~ Unif[0, 2]. Then h(X) = log 1 = 0 and h(2X) = log 2 = 1 bit, so h(X) ≠ h(2X). This is in sharp contrast to entropy, where H(X) = H(g(X)) as long as g(·) is an invertible function.
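A small empirical check of the scaling effect (an added sketch; the quantization-based estimator, bin width, and sample size are assumptions): h(X) ≈ 0 bits for X ~ Unif[0, 1], while h(2X) ≈ 1 bit.

```python
# Estimate differential entropy via h(X) ≈ H([X]_Delta) + log2(Delta), in bits.
import numpy as np

def diff_entropy_estimate(samples, delta=1e-3):
    # quantize into length-Delta bins and add back log2(Delta)
    bins = np.floor(samples / delta).astype(np.int64)
    _, counts = np.unique(bins, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p)) + np.log2(delta)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=1_000_000)
print(diff_entropy_estimate(x))        # ≈ 0 bits, matching h(X) = log2(1 - 0) = 0
print(diff_entropy_estimate(2 * x))    # ≈ 1 bit,  matching h(2X) = h(X) + log2(2)
```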
