Lecture 1: Capacity of the Gaussian Channel
Mikael Skoglund, Theoretical Foundations of Wireless
• Basic concepts in information theory: Appendix B
• Capacity of the Gaussian channel: Appendix B, Ch. 5.1–3

Entropy and Mutual Information I
• Entropy for a discrete random variable X with alphabet \mathcal{X} and pmf p(x) \triangleq \Pr(X = x), \forall x \in \mathcal{X}:
    H(X) \triangleq -\sum_{x \in \mathcal{X}} p(x) \log p(x)
• H(X) = the average amount of uncertainty removed when observing X = the information obtained
• It holds that 0 \le H(X) \le \log |\mathcal{X}|
• Entropy for an n-tuple X_1^n \triangleq (X_1, \ldots, X_n):
    H(X_1^n) = H(X_1, \ldots, X_n) = -\sum_{x_1^n} p(x_1^n) \log p(x_1^n)
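Not part of the original slides: a minimal numerical sketch of the entropy definition and the bound 0 ≤ H(X) ≤ log|𝒳|, using an example pmf chosen here for illustration.

```python
# Entropy H(X) = -sum_x p(x) log2 p(x), checked against the maximum log2|X|.
import numpy as np

def entropy(p):
    """Entropy in bits of a pmf given as an array of probabilities."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # 0 log 0 = 0 by convention
    return float(-np.sum(p * np.log2(p)))

p_uniform = [0.25, 0.25, 0.25, 0.25]
p_skewed  = [0.7, 0.1, 0.1, 0.1]
print(entropy(p_uniform), np.log2(4))   # 2.0  2.0  (maximum, log2|X|)
print(entropy(p_skewed))                # about 1.36 < 2.0
```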
Entropy and Mutual Information II
• Conditional entropy of Y given X = x:
    H(Y \mid X = x) \triangleq -\sum_{y \in \mathcal{Y}} p(y \mid x) \log p(y \mid x)
• H(Y | X = x) = the average information obtained when observing Y when it is already known that X = x
• Conditional entropy of Y given X (on average):
    H(Y \mid X) \triangleq \sum_{x \in \mathcal{X}} p(x) H(Y \mid X = x)
• Define g(x) = H(Y | X = x). Then H(Y | X) = E[g(X)].
• Chain rule: H(X, Y) = H(Y | X) + H(X) (cf. p(x, y) = p(y | x) p(x))

Entropy and Mutual Information III
• Mutual information:
    I(X; Y) \triangleq \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x) p(y)}
• I(X; Y) = the average information about X obtained when observing Y (and vice versa)
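Not from the slides: a sketch that checks the chain rule H(X,Y) = H(X) + H(Y|X) and computes I(X;Y) on a small joint pmf (all quantities in bits); the joint pmf is an illustrative choice.

```python
import numpy as np

def H(p):                       # entropy of any pmf, flattened
    p = np.asarray(p, float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

p_xy = np.array([[0.30, 0.10],   # rows index x, columns index y
                 [0.20, 0.40]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

H_xy = H(p_xy)
H_y_given_x = sum(p_x[i] * H(p_xy[i] / p_x[i]) for i in range(len(p_x)))
I_xy = H(p_x) + H(p_y) - H_xy            # I(X;Y) = H(X) + H(Y) - H(X,Y)

print(np.isclose(H_xy, H(p_x) + H_y_given_x))   # chain rule holds -> True
print(I_xy)                                      # mutual information >= 0
```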
Entropy and Mutual Information IV
(Venn-diagram illustration relating H(X), H(Y), H(X|Y), H(Y|X), I(X;Y) and H(X,Y).)
    I(X; Y) = I(Y; X)
    I(X; Y) = H(Y) - H(Y \mid X) = H(X) - H(X \mid Y)
    I(X; Y) = H(X) + H(Y) - H(X, Y)
    I(X; X) = H(X)
    H(X, Y) = H(X) + H(Y \mid X) = H(Y) + H(X \mid Y)

Entropy and Mutual Information V
• For a continuous random variable X with pdf f(x), the differential entropy is
    h(X) = -\int f(x) \log f(x) \, dx
• For E[X^2] = \sigma^2, h(X) \le \frac{1}{2} \log(2\pi e \sigma^2) [bits], with equality only for X Gaussian
• Mutual information:
    I(X; Y) = \iint f(x, y) \log \frac{f(x, y)}{f(x) f(y)} \, dx \, dy
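Not from the slides: a sketch of the Gaussian maximum-entropy bound h(X) ≤ ½ log₂(2πeσ²) for E[X²] = σ². A zero-mean uniform density with the same second moment is used as the comparison point, since both differential entropies have closed forms.

```python
import numpy as np

sigma2 = 2.0
h_gauss = 0.5 * np.log2(2 * np.pi * np.e * sigma2)   # equality case (Gaussian)

# Zero-mean uniform on [-a, a] has variance a^2/3, so a = sqrt(3*sigma2),
# and differential entropy log2(2a).
a = np.sqrt(3 * sigma2)
h_unif = np.log2(2 * a)

print(h_gauss, h_unif, h_unif <= h_gauss)   # bound satisfied, gap ~0.25 bit
```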
Jensen's Inequality
• For f : \mathbb{R}^n \to \mathbb{R} convex and a random X \in \mathbb{R}^n,
    f(E[X]) \le E[f(X)]
• The reverse inequality holds for f concave
• For f strictly convex (or strictly concave),
    f(E[X]) = E[f(X)] \implies \Pr(X = E[X]) = 1

Fano's Inequality
• Consider the following estimation problem (discrete RVs):
    X = random variable of interest
    Y = observed random variable
    \hat{X} = f(Y) = estimate of X based on Y
• Define the probability of error as P_e = \Pr(\hat{X} \ne X)
• Fano's inequality lower bounds P_e:
    h(P_e) + P_e \log(|\mathcal{X}| - 1) \ge H(X \mid Y)
  [where h(x) = -x \log x - (1 - x) \log(1 - x) is the binary entropy function]
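Not from the slides: a numerical check of Fano's inequality for the MAP estimator X̂(y) = argmax_x p(x|y) on a toy 3×3 joint pmf chosen here for illustration.

```python
import numpy as np

def H(p):
    p = np.asarray(p, float).ravel(); p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

p_xy = np.array([[0.30, 0.05, 0.05],    # rows: x in {0,1,2}, cols: y in {0,1,2}
                 [0.05, 0.25, 0.05],
                 [0.05, 0.05, 0.15]])
p_y = p_xy.sum(axis=0)

# H(X|Y) and the MAP error probability Pe = Pr(Xhat != X)
H_x_given_y = sum(p_y[j] * H(p_xy[:, j] / p_y[j]) for j in range(3))
Pe = 1.0 - sum(p_xy[:, j].max() for j in range(3))

hb = -Pe*np.log2(Pe) - (1 - Pe)*np.log2(1 - Pe)   # binary entropy h(Pe)
print(hb + Pe*np.log2(3 - 1) >= H_x_given_y)      # Fano bound holds -> True
```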
The Gaussian Channel I
(Block diagram: message \omega \to encoder \alpha \to x_m \to channel adding noise w_m \to y_m \to decoder \beta \to \hat{\omega}.)
• At time m: transmitted symbol x_m \in \mathcal{X} = \mathbb{R}, received symbol y_m \in \mathcal{Y} = \mathbb{R}, noise w_m \in \mathbb{R}
• The noise \{w_m\} is i.i.d. Gaussian N(0, \sigma^2)
• A memoryless Gaussian transition density (noise variance \sigma^2):
    f(y \mid x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y - x)^2}{2\sigma^2} \right)

The Gaussian Channel II
• Coding for the Gaussian channel, subject to an average power constraint
• Equally likely information symbols \omega \in I_M = \{1, \ldots, M\}
• An (M, n) code with power constraint P:
  1. Power-limited codebook C = \{x_1^n(1), \ldots, x_1^n(M)\}, with
       n^{-1} \sum_{m=1}^{n} x_m^2(i) \le P, \quad i \in I_M
  2. Encoding: \omega = i \Rightarrow x_1^n = \alpha(i) = x_1^n(i) transmitted
  3. Decoding: y_1^n received \Rightarrow \hat{\omega} = \beta(y_1^n)
• One symbol → one codeword → n channel uses
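Not from the slides: a sketch of the channel model itself, pushing one power-limited length-n sequence through y_m = x_m + w_m; the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, P, sigma2 = 1000, 1.0, 0.25

x = rng.normal(0.0, np.sqrt(P), size=n)     # one (hypothetical) codeword
x *= np.sqrt(P / np.mean(x**2))             # enforce n^-1 * sum x_m^2 <= P
w = rng.normal(0.0, np.sqrt(sigma2), size=n)
y = x + w                                   # memoryless Gaussian channel output

print(np.mean(x**2) <= P + 1e-12)           # power constraint satisfied
print(np.mean(y**2))                        # approximately P + sigma2
```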
Capacity
• A rate R \triangleq \frac{\log M}{n} is achievable (subject to the power constraint P) if there exists a sequence of (\lceil 2^{nR} \rceil, n) codes with codewords satisfying the power constraint, and such that the average probability of error P_e^{(n)} = \Pr(\hat{\omega} \ne \omega) tends to 0 as n \to \infty.
• The capacity C is the supremum of all achievable rates.

A Lower Bound for C I
• Gaussian random code design: Fix
    f(x) = \frac{1}{\sqrt{2\pi(P - \varepsilon)}} \exp\left( -\frac{x^2}{2(P - \varepsilon)} \right)
  for a small \varepsilon > 0, and draw a codebook C_n = \{x_1^n(1), \ldots, x_1^n(M)\} i.i.d. according to f(x_1^n) = \prod_m f(x_m).
• Mutual information: Let
    I_\varepsilon = \iint f(y \mid x) f(x) \log \frac{f(y \mid x)}{f(y)} \, dx \, dy = \frac{1}{2} \log\left( 1 + \frac{P - \varepsilon}{\sigma^2} \right)
  where f(y) = \int f(y \mid x) f(x) \, dx
• I_\varepsilon is the mutual information between input and output when the channel is "driven by" f(x) = N(0, P - \varepsilon)
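Not from the slides: a Monte Carlo sketch of I_ε for the channel driven by a N(0, P−ε) input, compared against the closed-form value ½ log₂(1 + (P−ε)/σ²); the numerical parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
P, eps, sigma2, N = 1.0, 0.01, 0.25, 200_000

x = rng.normal(0.0, np.sqrt(P - eps), size=N)
y = x + rng.normal(0.0, np.sqrt(sigma2), size=N)

# log2 f(y|x)/f(y): both densities are Gaussian, with f(y) = N(0, P - eps + sigma2)
var_y = P - eps + sigma2
log_fyx = -0.5*np.log2(2*np.pi*sigma2) - (y - x)**2 / (2*sigma2) / np.log(2)
log_fy  = -0.5*np.log2(2*np.pi*var_y)  - y**2 / (2*var_y) / np.log(2)

print(np.mean(log_fyx - log_fy))              # Monte Carlo estimate of I_eps
print(0.5*np.log2(1 + (P - eps)/sigma2))      # closed form, ~1.16 bits
```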
A Lower Bound for C II
• Encoding: A message \omega \in I_M is encoded as x_1^n(\omega)
• Transmission: Received sequence y_1^n = x_1^n(\omega) + w_1^n, where the w_m are i.i.d. zero-mean Gaussian with E[w_m^2] = \sigma^2
• Decoding: For any sequences x_1^n and y_1^n, let
    f_n = f_n(x_1^n, y_1^n) = \frac{1}{n} \log \frac{f(y_1^n \mid x_1^n)}{f(y_1^n)} = \frac{1}{n} \sum_{m=1}^{n} \log \frac{f(y_m \mid x_m)}{f(y_m)}
  and let T_\varepsilon^{(n)} be the set of (x_1^n, y_1^n) such that f_n > I_\varepsilon - \varepsilon. Declare \hat{\omega} = i if x_1^n(i) is the only codeword such that (x_1^n(i), y_1^n) \in T_\varepsilon^{(n)}, and in addition
    n^{-1} \sum_{m=1}^{n} x_m^2(i) \le P;
  otherwise set \hat{\omega} = 0.

A Lower Bound for C III
• Average probability of error:
    \pi_n = \Pr(\hat{\omega} \ne \omega) = [by symmetry] = \Pr(\hat{\omega} \ne 1 \mid \omega = 1)
  with "Pr" taken over the random codebook and the noise
• Let
    E_0 = \{ n^{-1} \sum_m x_m^2(1) > P \}
  and
    E_i = \{ (x_1^n(i), x_1^n(1) + w_1^n) \in T_\varepsilon^{(n)} \};
  then
    \pi_n = P(E_0 \cup E_1^c \cup E_2 \cup \cdots \cup E_M) \le P(E_0) + P(E_1^c) + \sum_{i=2}^{M} P(E_i)
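Not from the slides: a sketch of the decoder's threshold statistic f_n. For the transmitted codeword f_n concentrates near I_ε (law of large numbers), while for an independent wrong codeword its mean is negative, so the threshold I_ε − ε separates the two for large n; parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
P, eps, sigma2, n = 1.0, 0.01, 0.25, 2000
I_eps = 0.5*np.log2(1 + (P - eps)/sigma2)

def f_n(x, y):
    """(1/n) sum_m log2 f(y_m|x_m)/f(y_m) for Gaussian input and noise."""
    var_y = P - eps + sigma2
    log_fyx = -0.5*np.log2(2*np.pi*sigma2) - (y - x)**2/(2*sigma2)/np.log(2)
    log_fy  = -0.5*np.log2(2*np.pi*var_y)  - y**2/(2*var_y)/np.log(2)
    return np.mean(log_fyx - log_fy)

x_true  = rng.normal(0, np.sqrt(P - eps), n)   # the transmitted codeword
x_other = rng.normal(0, np.sqrt(P - eps), n)   # an independent codeword
y = x_true + rng.normal(0, np.sqrt(sigma2), n)

print(f_n(x_true, y), I_eps - eps)   # statistic exceeds the threshold
print(f_n(x_other, y))               # typically far below the threshold
```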
A Lower Bound for C IV
• For n sufficiently large, we have
  • P(E_0) < \varepsilon
  • P(E_1^c) < \varepsilon
  • P(E_i) \le 2^{-n(I_\varepsilon - \varepsilon)}, i = 2, \ldots, M
  that is, since M - 1 \le 2^{nR},
    \pi_n \le 2\varepsilon + 2^{-n(I_\varepsilon - R - \varepsilon)}
⇒ For the average code, R < I_\varepsilon - \varepsilon ⇒ \pi_n \to 0 as n \to \infty
⇒ There exists at least one code with P_e^{(n)} \to 0 for R < I_\varepsilon - \varepsilon
⇒ C \ge \frac{1}{2} \log\left( 1 + \frac{P}{\sigma^2} \right)

An Upper Bound for C I
• Consider any sequence of codes that can achieve the rate R
• Fano ⇒
    R \le \frac{1}{n} \sum_{m=1}^{n} I(x_m(\omega); y_m) + \alpha_n
  where \alpha_n = n^{-1} + R P_e^{(n)} \to 0 as n \to \infty, and where
    I(x_m(\omega); y_m) = h(y_m) - h(w_m) = h(y_m) - \frac{1}{2} \log 2\pi e \sigma^2
• Since E[y_m^2] = P_m + \sigma^2, where P_m = M^{-1} \sum_{i=1}^{M} x_m^2(i), we get
    h(y_m) \le \frac{1}{2} \log 2\pi e (\sigma^2 + P_m)
  and hence I(x_m(\omega); y_m) \le \frac{1}{2} \log(1 + P_m / \sigma^2).
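Not from the slides: a quick evaluation of the random-coding bound π_n ≤ 2ε + 2^{-n(I_ε − R − ε)}; for any fixed R < I_ε − ε the second term vanishes exponentially in n. The parameter values are illustrative.

```python
import numpy as np

P, sigma2, eps = 1.0, 0.25, 0.01
I_eps = 0.5*np.log2(1 + (P - eps)/sigma2)   # about 1.16 bits per channel use
R = 0.9                                      # any rate below I_eps - eps

for n in (10, 100, 1000):
    bound = 2*eps + 2.0**(-n*(I_eps - R - eps))
    print(n, bound)        # decreases toward 2*eps as n grows
```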
An Upper Bound for C II
Thus
    R \le \frac{1}{n} \sum_{m=1}^{n} \frac{1}{2} \log\left( 1 + \frac{P_m}{\sigma^2} \right) + \alpha_n \le \frac{1}{2} \log\left( 1 + \frac{n^{-1} \sum_m P_m}{\sigma^2} \right) + \alpha_n \le \frac{1}{2} \log\left( 1 + \frac{P}{\sigma^2} \right) + \alpha_n \to \frac{1}{2} \log\left( 1 + \frac{P}{\sigma^2} \right) \quad \text{as } n \to \infty
⇒ for all achievable R, due to Jensen (concavity of the log) and the power constraint,
    C \le \frac{1}{2} \log\left( 1 + \frac{P}{\sigma^2} \right)

Coding Theorem for the Gaussian Channel
Theorem. A memoryless Gaussian channel with noise variance \sigma^2 and power constraint P has capacity
    C = \frac{1}{2} \log\left( 1 + \frac{P}{\sigma^2} \right)
That is, all rates R < C and no rates R > C are achievable.
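Not from the slides: the capacity formula of the coding theorem evaluated for a few signal-to-noise ratios, as a minimal sketch.

```python
import numpy as np

def awgn_capacity(P, sigma2):
    """C = (1/2) log2(1 + P/sigma^2), in bits per channel use."""
    return 0.5 * np.log2(1.0 + P / sigma2)

for snr_db in (0, 10, 20):
    snr = 10.0**(snr_db / 10.0)                      # P/sigma^2
    print(snr_db, "dB:", awgn_capacity(snr, 1.0))    # 0.5, ~1.73, ~3.33 bits
```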
The Gaussian Waveform Channel I
(Block diagram: x(t) on (-T/2, T/2) \to linear filter H(f) \to additive Gaussian noise with spectral density N(f) \to y(t) observed on (-T/2, T/2).)
• Linear-filter waveform channel with Gaussian noise:
  • independent Gaussian noise with spectral density N(f)
  • linear filter H(f)
  • input confined to (-T/2, T/2)
  • output measured over (-T/2, T/2)
  • codebook C = \{x_1(t), \ldots, x_M(t)\}
  • power constraint
      \frac{1}{T} \int_{-T/2}^{T/2} x_i^2(t) \, dt \le P
  • rate R = \frac{\log M}{T}

The Gaussian Waveform Channel II
• Capacity (in bits per second):
    C = \frac{1}{2} \int_{F(\beta)} \log \frac{\beta \, |H(f)|^2}{N(f)} \, df, \qquad P = \int_{F(\beta)} \left( \beta - \frac{N(f)}{|H(f)|^2} \right) df
  where F(\beta) = \{ f : N(f) \, |H(f)|^{-2} \le \beta \}, for different \beta \in (0, \infty).
• That is, there exist codes such that arbitrarily low error probability is possible as long as R = \frac{\log M}{T} < C, as T \to \infty. For R > C the error probability is > 0.
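Not from the slides: a discretized water-filling sketch for the waveform-channel capacity. The water level β is found by bisection so that the allocated power matches P, and the capacity integral is evaluated on the same frequency grid. The particular N(f), |H(f)|² and grid below are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

f = np.linspace(-5.0, 5.0, 2001)           # frequency grid (Hz)
df = f[1] - f[0]
N = 1e-3 * np.ones_like(f)                  # flat noise spectral density
H2 = 1.0 / (1.0 + (f / 2.0)**2)             # |H(f)|^2, a low-pass filter
noise_eq = N / H2                           # equivalent noise N(f)/|H(f)|^2
P = 1.0

def allocated_power(beta):
    return np.sum(np.maximum(beta - noise_eq, 0.0)) * df

lo, hi = noise_eq.min(), noise_eq.min() + P     # bracket for the water level
while allocated_power(hi) < P:
    hi *= 2.0
for _ in range(100):                            # bisection on beta
    beta = 0.5 * (lo + hi)
    lo, hi = (beta, hi) if allocated_power(beta) < P else (lo, beta)

in_band = noise_eq <= beta                      # the set F(beta)
C = 0.5 * np.sum(np.log2(beta / noise_eq[in_band])) * df
print(beta, C)                                  # water level and capacity [bit/s]
```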