A General Formula for Channel Capacity

1 Definitions

• Information variable $\omega \in \{1, \ldots, M\}$, $p(i) = \Pr(\omega = i)$
• Channel input $X \in \mathcal{X}$ and output $Y \in \mathcal{Y}$, finite alphabets
• Codewords $\{x_1^N(i) : i = 1, \ldots, M\}$, $x_n \in \mathcal{X}$
• Rate $R = N^{-1} \ln M$
• A sequence of channel uses, $\Pr(Y_1^N = y_1^N \mid X_1^N = x_1^N) = p(y_1^N \mid x_1^N)$, defined for each $N$, including $N \to \infty$; that is, a discrete channel with completely arbitrary memory behavior
• Decoder: $\hat{\omega} = i$ if $Y_1^N \in F_i$, where $\{F_i\}$ is a partition of $\mathcal{Y}^N$
• Error probabilities:
$$P_e^{(N)} = \sum_{i=1}^{M} \Pr\left(Y_1^N \in F_i^c \mid X_1^N = x_1^N(i)\right) p(i)$$
$$\lambda^{(N)} = \max_{1 \le i \le M} \Pr\left(Y_1^N \in F_i^c \mid X_1^N = x_1^N(i)\right)$$
• Information density:
$$i_N(x_1^N; y_1^N) = \ln \frac{p(x_1^N, y_1^N)}{p(x_1^N)\, p(y_1^N)}$$
• Liminf in probability of $\{A_n\}$: $\alpha = \mathrm{liminf}_p\, A_n$ is the supremum of all $\alpha$ for which $\Pr(A_n \le \alpha) \to 0$ as $n \to \infty$
• Rate $R$ is achievable if there exists a sequence of codes such that $\lambda^{(N)} \to 0$ as $N \to \infty$
• $C$ = supremum of all achievable rates
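To make the information density concrete, here is a minimal numerical sketch in Python (my own illustration, not part of the notes; the crossover probability, block length, and trial count are assumed values). For a binary symmetric channel used memorylessly with a uniform input, $N^{-1} i_N$ is an average of i.i.d. per-letter densities, so sampling it shows the concentration that the liminf in probability captures.

    # A minimal sketch (illustration only; delta, N, and the trial count are
    # assumed values). For a BSC used memorylessly with uniform input,
    # p(y) = 1/2 per letter, so the per-letter information density is
    # ln p(y|x) - ln(1/2), and only the flip pattern matters.
    import numpy as np

    rng = np.random.default_rng(0)
    delta, N, trials = 0.1, 1000, 5000

    flips = rng.random((trials, N)) < delta           # BSC crossover events
    ln_pyx = np.where(flips, np.log(delta), np.log(1 - delta))
    density = (ln_pyx - np.log(0.5)).mean(axis=1)     # N^{-1} i_N, one per trial

    I_xy = np.log(2) + delta * np.log(delta) + (1 - delta) * np.log(1 - delta)
    print(f"mean of N^-1 i_N: {density.mean():.4f} nats  (I(X;Y) = {I_xy:.4f})")
    print(f"empirical 1st percentile: {np.quantile(density, 0.01):.4f}")

The sample mean matches $I(X;Y)$ and the lower quantiles sit just below it, which is exactly what $\mathrm{liminf}_p\, N^{-1} i_N = I(X;Y)$ predicts for this well-behaved channel; the general formula below is needed precisely when such convergence fails.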
2 Feinstein's Lemma and a Converse

Lemma 1 Given $M$, $a > 0$, and an input distribution $p(x_1^N)$, there exist $x_1^N(i) \in \mathcal{X}^N$, $i = 1, \ldots, M$, and a partition $F_1, \ldots, F_M$ of $\mathcal{Y}^N$ such that
$$\Pr\left(Y_1^N \notin F_i \mid X_1^N = x_1^N(i)\right) \le M e^{-a} + \Pr\left(i_N(X_1^N; Y_1^N) \le a\right)$$
In particular, choosing $a = \ln M + N\gamma$, with $\gamma > 0$, gives
$$\Pr\left(Y_1^N \notin F_i \mid X_1^N = x_1^N(i)\right) \le e^{-\gamma N} + \Pr\left(\frac{1}{N}\, i_N(X_1^N; Y_1^N) \le \frac{1}{N} \ln M + \gamma\right)$$

Lemma 1 (Feinstein's lemma [1]) implies that for any given $p(x_1^N)$ there exists a code of rate $R$ such that, for any $\gamma > 0$ and $N > 0$,
$$\lambda^{(N)} \le e^{-\gamma N} + \Pr\left(\frac{1}{N}\, i_N(X_1^N; Y_1^N) \le R + \gamma\right)$$
where
$$i_N(x_1^N; y_1^N) = \ln \frac{p(x_1^N, y_1^N)}{p(x_1^N)\, p(y_1^N)} = \ln \frac{p(y_1^N \mid x_1^N)}{\sum_{x_1^N} p(y_1^N \mid x_1^N)\, p(x_1^N)}$$
for the given $p(x_1^N)$ and $p(y_1^N \mid x_1^N)$ (the latter given by the channel under consideration).

Proof We use the notation $x = x_1^N$, $y = y_1^N$, $\bar{X} = \mathcal{X}^N$, and $\bar{Y} = \mathcal{Y}^N$ for simplicity, where $N$ is the fixed codeword length. Define
$$G = \{(x, y) : i_N(x, y) > a\}$$
Set $\varepsilon = M e^{-a} + \Pr(i_N \le a) = M e^{-a} + P(G^c)$ and assume $\varepsilon < 1$; since $M e^{-a} > 0$ this gives $P(G^c) < \varepsilon < 1$, and therefore
$$\Pr(i_N > a) = P(G) > 1 - \varepsilon > 0$$
Letting $G_x = \{y : (x, y) \in G\}$, this implies that, defining
$$A = \{x : P(G_x \mid x) > 1 - \varepsilon\},$$
it holds that $P(A) > 0$ (otherwise $P(G) = \sum_x P(G_x \mid x)\, p(x) \le 1 - \varepsilon$, a contradiction). Choose $x_1 \in A$ and let $F_1 = G_{x_1}$. Next choose, if possible, $x_2 \in A$ such that $P(G_{x_2} - F_1 \mid x_2) > 1 - \varepsilon$ and let $F_2 = G_{x_2} - F_1$. Continue in this way until either $M$ points have been selected or all points in $A$ have been exhausted. That is, given $\{x_j, F_j\}$, $j = 1, \ldots, i - 1$, find an $x_i \in A$ for which
$$P\Big(G_{x_i} - \bigcup_{j < i} F_j \,\Big|\, x_i\Big) > 1 - \varepsilon$$
and let $F_i = G_{x_i} - \bigcup_{j < i} F_j$. If this terminates before $M$ points have been collected, denote the final point's index by $n$. Observe that, by construction,
$$P(F_i^c \mid x_i) < \varepsilon, \quad i = 1, \ldots, n,$$
and hence the lemma will be proved if we can show that $n$ cannot be strictly less than $M$.
Define $F = \bigcup_{i=1}^{n} F_i$ and consider the probability
$$P(G) = P(G \cap (\bar{X} \times F)) + P(G \cap (\bar{X} \times F^c))$$
The first term is bounded as
$$P(G \cap (\bar{X} \times F)) \le P(\bar{X} \times F) = P(F) = \sum_{i=1}^{n} P(F_i)$$
Let
$$f(x, y) = \frac{p(x, y)}{p(x)\, p(y)}$$
(i.e., $i_N = \ln f(x, y)$). Since $f(x_i, y) > e^{a}$ on $G_{x_i}$, we get
$$P(F_i) = \sum_{y \in F_i} p(y) \le \sum_{y \in G_{x_i}} p(y) \le \sum_{y \in G_{x_i}} p(y)\, f(x_i, y)\, e^{-a} = e^{-a} \sum_{y \in G_{x_i}} p(y \mid x_i) \le e^{-a}$$
and hence
$$P(G \cap (\bar{X} \times F)) \le n e^{-a}$$
Now consider
$$P(G \cap (\bar{X} \times F^c)) = \sum_x P(G \cap (\bar{X} \times F^c) \mid x)\, p(x) = \sum_x P(G_x \cap F^c \mid x)\, p(x) = \sum_x P\Big(G_x - \bigcup_{i=1}^{n} F_i \,\Big|\, x\Big)\, p(x)$$
Defining
$$B = \Big\{x : P\Big(G_x - \bigcup_{i=1}^{n} F_i \,\Big|\, x\Big) > 1 - \varepsilon\Big\}$$
it must hold that $P(B) = 0$, or there would be a point $x_{n+1}$ for which
$$P\Big(G_{x_{n+1}} - \bigcup_{i=1}^{n} F_i \,\Big|\, x_{n+1}\Big) > 1 - \varepsilon,$$
contradicting the assumption that the selection terminated at $n$. Hence
$$P(G \cap (\bar{X} \times F^c)) \le 1 - \varepsilon$$
so we get
$$P(G) \le n e^{-a} + 1 - \varepsilon$$
From the definition of $\varepsilon$ we also have
$$P(G) = 1 - P(G^c) = 1 - \varepsilon + M e^{-a}$$
so $M \le n$ must hold, completing the proof.
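The selection step of the proof can be played out numerically. The following toy sketch (my own construction, not from the notes; the channel, $a$, and $M$ are arbitrary choices) enumerates a short BSC block, forms $G$, and picks codewords exactly as in the proof. It only demonstrates the greedy selection; the final merge of leftover outputs into $F_M$ to complete the partition is omitted.

    # A toy enactment (illustration only) of the greedy selection in the proof:
    # a BSC used N times with uniform input, G = {(x, y) : i_N(x, y) > a}, and
    # codewords kept while P(G_x minus earlier F_j | x) > 1 - eps. Parameters
    # are chosen so the loop runs quickly, not to give a good code.
    from itertools import product
    from math import exp, log

    delta, N, M, gamma = 0.1, 8, 4, 0.2
    a = log(M) + gamma * N                  # a = ln M + N*gamma, as in the lemma

    seqs = list(product((0, 1), repeat=N))
    p_x = 1 / len(seqs)                     # uniform input; BSC output is then uniform too

    def p_y_given_x(y, x):
        d = sum(u != v for u, v in zip(y, x))
        return delta**d * (1 - delta)**(N - d)

    def i_N(x, y):                          # ln p(y|x) - ln p(y), with p(y) = 2^{-N}
        return log(p_y_given_x(y, x)) - log(p_x)

    # eps = M e^{-a} + P(i_N <= a), as in the proof
    P_Gc = sum(p_x * p_y_given_x(y, x)
               for x in seqs for y in seqs if i_N(x, y) <= a)
    eps = M * exp(-a) + P_Gc
    assert eps < 1                          # the lemma needs this

    codebook, used = [], set()
    for x in seqs:
        F_i = {y for y in seqs if i_N(x, y) > a} - used   # G_x minus earlier sets
        if sum(p_y_given_x(y, x) for y in F_i) > 1 - eps:
            codebook.append(x)
            used |= F_i
        if len(codebook) == M:
            break

    print(f"eps = {eps:.3f}; selected {len(codebook)} of M = {M} codewords")

With these parameters $G_x$ collapses to $\{x\}$ and the sketch finds all $M$ codewords, consistent with the lemma's guarantee that the selection cannot stop before $M$ when $\varepsilon < 1$.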
Let a reliable code sequence be a sequence of codes that achieves $\lambda^{(N)} \to 0$ at a fixed rate $R < C$. Since
$$P_e^{(N)} = \sum_{i=1}^{M} \Pr\left(Y_1^N \in F_i^c \mid X_1^N = x_1^N(i)\right) p(i) \le \lambda^{(N)},$$
it holds, for a reliable code sequence, that $P_e^{(N)} \to 0$ for any $\{p(i)\}$. Hence, if a sequence of codes gives $P_e^{(N)}$ bounded away from zero for all $N$, the sequence cannot be reliable. Thus, to prove a converse we can assume, without loss of generality, that $p(i) = M^{-1}$ and study the resulting average error probability $P_e^{(N)}$.

The following lemma is adapted from [2].

Lemma 2 Assume that $\{x_1^N(i)\}_{i=1}^{M}$ is the codebook of any code used to encode equiprobable information symbols $\omega \in \{1, \ldots, M\}$, and let $\{F_i\}_{i=1}^{M}$ be the corresponding decoding sets. Then
$$P_e^{(N)} = \frac{1}{M} \sum_{i=1}^{M} \Pr\left(Y_1^N \notin F_i \mid X_1^N = x_1^N(i)\right) \ge \Pr\left(N^{-1} i_N(X_1^N; Y_1^N) \le N^{-1} \ln M - \gamma\right) - e^{-\gamma N}$$
for any $\gamma > 0$, where $i_N(x_1^N; y_1^N)$ is evaluated with $p(x_1^N) = 1/M$, i.e., with the input uniform over the codebook.

Proof As before, we use the notation $x = x_1^N$, $y = y_1^N$, where $N$ is the fixed codeword length. Let $\varepsilon = P_e^{(N)}$, $\beta = e^{-\gamma N}$, and
$$L = \{(x, y) : p(x \mid y) \le \beta\}$$
and note that
$$P(L) = \Pr\left(p(X_1^N \mid Y_1^N) \le e^{-\gamma N}\right) = \Pr\left(N^{-1} i_N \le N^{-1} \ln M - \gamma\right)$$
since $i_N = \ln \frac{p(x \mid y)}{p(x)} = \ln p(x \mid y) + \ln M$ when $p(x) = 1/M$. We hence need to show that $P(L) \le \varepsilon + \beta$ holds for any code $\{x_i\}$, with $x_i = x_1^N(i)$, and decoding sets $\{F_i\}$. Letting $L_i = \{y : p(x_i \mid y) \le \beta\}$, we can write
$$P(L) = \sum_i M^{-1} P(L_i \mid x_i) = \sum_i M^{-1} P(L_i \cap F_i^c \mid x_i) + \sum_i M^{-1} P(L_i \cap F_i \mid x_i)$$
$$\le \sum_i M^{-1} P(F_i^c \mid x_i) + \sum_i M^{-1} P(L_i \cap F_i \mid x_i)$$
$$= \varepsilon + \sum_i \sum_{y \in L_i \cap F_i} p(x_i \mid y)\, p(y) \le \varepsilon + \beta \sum_i \sum_{y \in L_i \cap F_i} p(y)$$
$$\le \varepsilon + \beta \sum_i \sum_{y \in F_i} p(y) \le \varepsilon + \beta$$
where the third line uses $M^{-1} p(y \mid x_i) = p(x_i, y) = p(x_i \mid y)\, p(y)$, and the final inequality holds because the $F_i$ are disjoint.
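Lemma 2 can be checked exhaustively on a toy example (my own illustration, not from the notes). Take uncoded transmission over a BSC: the codebook is all of $\{0,1\}^N$ with equiprobable codewords and $F_i = \{x_i\}$, so the rate $\ln 2$ exceeds capacity and the bound is non-trivial. With $p(x) = 1/M$ uniform we get $p(y) = 2^{-N}$, and the event $N^{-1} i_N \le N^{-1} \ln M - \gamma$ reduces to $p(y \mid x) \le e^{-\gamma N}$, which depends only on the number of bit flips $d$.

    # An exhaustive check (illustration only) of Lemma 2's bound, using the
    # reduction described above: only the number of flips d matters.
    from math import comb, exp

    delta, N, gamma = 0.1, 10, 0.2          # assumed parameters
    M = 2**N                                # codebook = all of {0,1}^N

    def p_flip(d):                          # p(y|x) when y differs from x in d places
        return delta**d * (1 - delta)**(N - d)

    P_e = 1 - (1 - delta)**N                # exact error probability with F_i = {x_i}

    prob = sum(comb(N, d) * p_flip(d)       # P(N^{-1} i_N <= N^{-1} ln M - gamma)
               for d in range(N + 1) if p_flip(d) <= exp(-gamma * N))

    print(f"P_e = {P_e:.3f}  >=  {prob:.3f} - {exp(-gamma * N):.3f}"
          f" = {prob - exp(-gamma * N):.3f}")

With $\delta = 0.1$, $N = 10$, $\gamma = 0.2$ this prints $P_e \approx 0.651 \ge 0.516$: the error probability of a code operating above capacity is bounded away from zero, as the converse requires.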
A General Formula for Channel Capacity [2]

Theorem 1
$$C = \sup_{\{p(x_1^N)\}} \mathrm{liminf}_p\, \frac{1}{N}\, i_N(X_1^N; Y_1^N)$$
where the supremum is over all possible sequences $\{p(x_1^N)\} = \{p(x_1^N)\}_{N=1}^{\infty}$.

Proof Let
$$R^* = \mathrm{liminf}_p\, \frac{1}{N}\, i_N(X_1^N; Y_1^N)$$
for any given $\{p(x_1^N)\}$, and let
$$C^* = \sup_{\{p(x_1^N)\}} R^*$$
For any $\delta > 0$, set $R = R^* - \delta$. In Feinstein's lemma, fix $N$, let $\gamma = \delta/2$, and note that
$$\Pr\left(\frac{1}{N}\, i_N(X_1^N; Y_1^N) \le R + \delta/2\right) = \Pr\left(\frac{1}{N}\, i_N(X_1^N; Y_1^N) \le R^* - \delta/2\right)$$
and, because of the definition of $R^*$,
$$\lim_{N \to \infty} \Pr\left(\frac{1}{N}\, i_N(X_1^N; Y_1^N) \le R^* - \delta/2\right) = 0$$
Thus $R$ is an achievable rate for any $\{p(x_1^N)\}$ and $\delta > 0$, which means that $C \ge C^*$.

Now assume, for $\gamma > 0$, that $R = C^* + 2\gamma$ is the rate of any code of length $N$ that encodes equally likely symbols, and note in that case that
$$\Pr\left(N^{-1} i_N(X_1^N; Y_1^N) \le R - \gamma\right) = \Pr\left(N^{-1} i_N(X_1^N; Y_1^N) \le C^* + \gamma\right)$$
As $N \to \infty$ this probability cannot vanish, due to the definition of $C^*$. Hence, by Lemma 2, $R$ is not achievable for any $\gamma > 0$, which means that $C \le C^*$.

3 Example

Assume that
$$p(y_1^N \mid x_1^N) = p(y_1 \mid x_1) \cdots p(y_N \mid x_N)$$
(a stationary and memoryless channel). In [2, Theorem 10] it is shown that for such channels the $p(x_1^N)$ that achieves the supremum in the formula for $C$ is of the form
$$p(x_1^N) = p(x_1) \cdots p(x_N)$$
That is, the optimal input distribution is stationary and memoryless. Hence, assuming this form for $p(x_1^N)$, it holds that
$$\mathrm{liminf}_p\, \frac{1}{N}\, i_N(X_1^N; Y_1^N) = I(X; Y)$$
evaluated for $p(x) = p(x_1)$ and $p(y \mid x) = p(y_1 \mid x_1)$, since $N^{-1} i_N$ is then an average of i.i.d. per-letter densities and converges in probability to the mutual information by the weak law of large numbers [3]. Hence we get Shannon's formula
$$C = \sup_{p(x)} I(X; Y)$$
(where the sup is a max, since $I(X; Y)$ is concave and continuous in $p(x)$ over the compact input simplex).
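Shannon's formula can also be evaluated numerically. A standard way to compute $\sup_{p(x)} I(X;Y)$ for a discrete memoryless channel is the Blahut-Arimoto iteration; the sketch below (my own code, with an assumed crossover probability, not part of the notes) applies it to a BSC and compares against the closed form $\ln 2 - H(\delta)$ in nats.

    # A sketch of the Blahut-Arimoto iteration for sup_p I(X;Y) over a DMC
    # (illustration only; delta and the iteration count are assumed values).
    import numpy as np

    def blahut_arimoto(W, iters=200):
        """W[x, y] = p(y|x). Returns (capacity estimate in nats, optimizing p(x))."""
        p = np.full(W.shape[0], 1.0 / W.shape[0])   # start from the uniform input
        for _ in range(iters):
            q = p @ W                               # induced output distribution p(y)
            # D[x] = exp( D_KL( W[x, :] || q ) ), guarding the 0 log 0 case
            logs = np.where(W > 0, np.log(np.where(W > 0, W, 1.0) / q), 0.0)
            D = np.exp((W * logs).sum(axis=1))
            C = np.log(p @ D)                       # lower bound, tight at convergence
            p = p * D / (p @ D)                     # multiplicative update
        return C, p

    delta = 0.1
    W = np.array([[1 - delta, delta],
                  [delta, 1 - delta]])
    C, p_opt = blahut_arimoto(W)
    closed = np.log(2) + delta * np.log(delta) + (1 - delta) * np.log(1 - delta)
    print(f"Blahut-Arimoto: C = {C:.6f} nats with p(x) = {np.round(p_opt, 4)}")
    print(f"ln 2 - H(0.1) : C = {closed:.6f} nats")

The iteration converges for any DMC because $I(X;Y)$ is concave in $p(x)$, the same property that makes the sup in Shannon's formula a max.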
References

[1] A. Feinstein, "A new basic theorem of information theory," IRE Transactions on Information Theory, vol. 4, no. 4, pp. 2–22, Sept. 1954.

[2] S. Verdú and T. S. Han, "A general formula for channel capacity," IEEE Transactions on Information Theory, vol. 40, no. 4, pp. 1147–1157, July 1994.

[3] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, 1991.