Source Coding with Lists and Rényi Entropy
or: The Honey-Do Problem

Amos Lapidoth, ETH Zurich
October 8, 2013

Joint work with Christoph Bunte.
A Task from your Spouse

Using a fixed number of bits, your spouse reminds you of one of the following tasks:
• Honey, don’t forget to feed the cat.
• Honey, don’t forget to go to the dry-cleaner.
• Honey, don’t forget to pick up my parents at the airport.
• Honey, don’t forget the kids’ violin concert.
• ...

The combinatorial approach requires
$$\#\text{ of bits} = \left\lceil \log_2(\#\text{ of tasks}) \right\rceil.$$
It guarantees that you’ll know what to do...
The Information-Theoretic Approach

• Model the tasks as elements of $\mathcal{X}^n$ generated IID $\sim P$.
• Ignore the atypical sequences.
• Index the typical sequences using $\approx nH(X)$ bits.
• Send the index.
• Typical tasks will be communicated error-free.

Any married person knows how ludicrous this is: what if the task is atypical? Yes, this is unlikely, but:
• You won’t even know it!
• Are you ok with the consequences?
Improved Information-Theoretic Approach

• First bit indicates whether the task is typical.
• You’ll know when the task is lost in transmission.

What are you going to do about it?
• If I were you, I would perform them all.
• Yes, I know there are exponentially many of them.
• Are you beginning to worry about the expected number of tasks?

You could perform a subset of the tasks.
• You’ll get extra points for effort.
• But what if the required task is not in the subset?
• Are you ok with the consequences?
Our Problem

• A source generates $X^n \in \mathcal{X}^n$ IID $\sim P$.
• The sequence is described using $nR$ bits.
• Based on the description, a list is generated that is guaranteed to contain $X^n$.
• For which rates $R$ can we find descriptions and corresponding lists with expected listsize arbitrarily close to 1?

More generally, we’ll look at the $\rho$-th moment of the listsize.
What if you are not in a Relationship?

Should you tune out?
Rényi Entropy

$$H_\alpha(X) = \frac{\alpha}{1-\alpha}\,\log\left(\sum_{x\in\mathcal{X}} P(x)^\alpha\right)^{1/\alpha}$$

Alfréd Rényi (1921–1970)
A Homework Problem

Show that
1. $\lim_{\alpha\to 1} H_\alpha(X) = H(X)$.
2. $\lim_{\alpha\to 0} H_\alpha(X) = \log|\operatorname{supp} P|$.
3. $\lim_{\alpha\to\infty} H_\alpha(X) = -\log\max_{x\in\mathcal{X}} P(x)$.
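A quick numerical companion to the homework (ours, not from the talk): the sketch below evaluates $H_\alpha(X)$ in bits via the equivalent form $\frac{1}{1-\alpha}\log_2\sum_x P(x)^\alpha$ and checks the three limits on a small PMF. The helper name renyi_entropy is our own.

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha (in bits) of the PMF p.

    The orders alpha = 1 (Shannon entropy) and alpha = inf (min-entropy)
    are handled separately, since 1/(1-alpha) * log2(sum p^alpha) is
    singular there.
    """
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                             # restrict to the support
    if alpha == 1:
        return -np.sum(p * np.log2(p))       # H(X)
    if np.isinf(alpha):
        return -np.log2(p.max())             # -log2 max P(x)
    return np.log2(np.sum(p ** alpha)) / (1 - alpha)

p = [0.5, 0.25, 0.125, 0.125]                # H(X) = 1.75 bits, |supp P| = 4
print(renyi_entropy(p, 0.999))               # close to H(X) = 1.75
print(renyi_entropy(p, 1e-9))                # close to log2 |supp P| = 2
print(renyi_entropy(p, 200.0))               # close to -log2 max P(x) = 1
```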
Do not Tune Out

• Our problem gives an operational meaning to $H_{\frac{1}{1+\rho}}(X)$, $\rho > 0$ (i.e., $0 < \alpha < 1$).
• It reveals many of its properties.
• And it motivates the conditional Rényi entropy.
Lossless List Source Codes

• Rate-$R$ blocklength-$n$ source code with list decoder:
$$f_n\colon \mathcal{X}^n \to \{1,\dots,2^{nR}\}, \qquad \lambda_n\colon \{1,\dots,2^{nR}\} \to 2^{\mathcal{X}^n}.$$
• The code is lossless if
$$x^n \in \lambda_n(f_n(x^n)), \quad \forall\, x^n \in \mathcal{X}^n.$$
• $\rho$-th listsize moment ($\rho > 0$):
$$E\bigl[|\lambda_n(f_n(X^n))|^\rho\bigr] = \sum_{x^n\in\mathcal{X}^n} P^n(x^n)\,|\lambda_n(f_n(x^n))|^\rho.$$
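To make these definitions concrete, here is a minimal sketch (ours, not from the talk) of a lossless list source code: the encoder gives each of the $M-1$ most probable sequences its own message and maps everything else to one shared message, so every $x^n$ lies in its decoded list by construction. With such small $n$ the moments are far from their asymptotic values; the point is only to illustrate the objects $f_n$, $\lambda_n$, and the $\rho$-th listsize moment.

```python
import itertools
import numpy as np

def listsize_moment(P, n, M, rho):
    """rho-th listsize moment of a toy lossless list source code."""
    seqs = list(itertools.product(range(len(P)), repeat=n))
    prob = {x: np.prod([P[s] for s in x]) for x in seqs}     # P^n(x^n)
    ranked = sorted(seqs, key=prob.get, reverse=True)
    # Encoder f_n: M-1 singleton messages plus one catch-all message.
    f = {x: min(i, M - 1) for i, x in enumerate(ranked)}
    sizes = np.bincount(list(f.values()), minlength=M)       # |lambda_n(m)|
    return sum(prob[x] * sizes[f[x]] ** rho for x in seqs)

P, n, rho = [0.9, 0.1], 8, 1.0
for R in [0.25, 0.5, 0.75]:                                  # rate in bits
    print(R, listsize_moment(P, n, M=2 ** round(n * R), rho=rho))
```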
The Main Result on Lossless List Source Codes

Theorem.
1. If $R > H_{\frac{1}{1+\rho}}(X)$, then there exist $(f_n, \lambda_n)_{n\ge 1}$ such that
$$\lim_{n\to\infty} E\bigl[|\lambda_n(f_n(X^n))|^\rho\bigr] = 1.$$
2. If $R < H_{\frac{1}{1+\rho}}(X)$, then
$$\lim_{n\to\infty} E\bigl[|\lambda_n(f_n(X^n))|^\rho\bigr] = \infty.$$
Some Properties of $H_{\frac{1}{1+\rho}}(X)$

1. Nondecreasing in $\rho$. (Monotonicity of $\rho \mapsto a^\rho$ when $a \ge 1$.)
2. $H(X) \le H_{\frac{1}{1+\rho}}(X) \le \log|\mathcal{X}|$.
($R < H(X)$ ⇒ listsize $\ge 2$ w.p. tending to one. And $R = \log|\mathcal{X}|$ can guarantee listsize $= 1$.)
3. $\lim_{\rho\to 0} H_{\frac{1}{1+\rho}}(X) = H(X)$.
($R > H(X)$ ⇒ prob(listsize $\ge 2$) decays exponentially. For small $\rho$ this decay beats $|\lambda_n(f_n(X^n))|^\rho$, which cannot exceed $e^{n\rho\log|\mathcal{X}|}$.)
4. $\lim_{\rho\to\infty} H_{\frac{1}{1+\rho}}(X) = \log|\operatorname{supp}(P)|$.
($R < \log|\operatorname{supp}(P)|$ ⇒ $\exists\, x_0 \in \operatorname{supp}(P)^n$ for which $|\lambda_n(f_n(x_0))| \ge e^{n(\log|\operatorname{supp}(P)| - R)}$. Since $P^n(x_0) \ge p_{\min}^n$, where $p_{\min} = \min\{P(x) : x \in \operatorname{supp}(P)\}$,
$$\sum_{x} P^n(x)\,|\lambda_n(f_n(x))|^\rho \ge e^{n\rho\left(\log|\operatorname{supp}(P)| - R - \frac{1}{\rho}\log\frac{1}{p_{\min}}\right)}.$$
Hence $R$ is not achievable if $\rho$ is large.)
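A short numerical check (ours) of properties 1, 3, and 4: with $\alpha = \frac{1}{1+\rho}$, sweeping $\rho$ from near 0 to very large should produce values increasing from about $H(X)$ toward $\log_2|\operatorname{supp} P|$.

```python
import numpy as np

def renyi(p, alpha):
    """Rényi entropy in bits; fine for 0 < alpha < 1, which is all we need."""
    p = np.asarray(p, dtype=float)
    return np.log2(np.sum(p ** alpha)) / (1 - alpha)

p = [0.5, 0.25, 0.125, 0.125]          # H(X) = 1.75 bits, |supp P| = 4
for rho in [0.01, 0.1, 1.0, 10.0, 1000.0]:
    print(rho, renyi(p, 1.0 / (1.0 + rho)))
# The printed values increase with rho, from about 1.75 toward log2(4) = 2.
```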
Sketch of the Direct Part

1. Partition each type class $T_Q$ into $2^{nR}$ lists of lengths $\approx 2^{-nR}|T_Q| \approx 2^{n(H(Q)-R)}$.
2. Describe the type of $x^n$ using $o(n)$ bits.
3. Describe the list containing $x^n$ using $nR$ bits.
4. $\Pr(X^n \in T_Q) \approx 2^{-nD(Q\|P)}$ and the number of types is small (polynomial in $n$), so
$$\sum_Q \Pr(X^n \in T_Q)\left(2^{n(H(Q)-R)}\right)^\rho \le 1 + 2^{-n\rho\left(R - \max_Q\{H(Q) - \rho^{-1}D(Q\|P)\} - \delta_n\right)},$$
where $\delta_n \to 0$.
5. By Arıkan ’96,
$$\max_Q\bigl\{H(Q) - \rho^{-1}D(Q\|P)\bigr\} = H_{\frac{1}{1+\rho}}(X).$$
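A numerical sanity check (ours) of the identity in step 5 for a binary source: maximize $H(Q) - D(Q\|P)/\rho$ over binary PMFs $Q$ on a fine grid and compare against $H_{\frac{1}{1+\rho}}(X)$; all logs base 2.

```python
import numpy as np

def H(q):                      # binary Shannon entropy in bits
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

def D(q, p):                   # binary relative entropy D(Q||P) in bits
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return q * np.log2(q / p) + (1 - q) * np.log2((1 - q) / (1 - p))

p, rho = 0.2, 2.0
qs = np.linspace(1e-6, 1 - 1e-6, 100001)
lhs = np.max(H(qs) - D(qs, p) / rho)           # max over the grid

alpha = 1.0 / (1.0 + rho)
rhs = np.log2(p ** alpha + (1 - p) ** alpha) / (1 - alpha)
print(lhs, rhs)                # both approx 0.896 for these parameters
```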
The Key to the Converse

Lemma. If
1. $P$ is a PMF on a finite nonempty set $\mathcal{X}$,
2. $\mathcal{L}_1, \dots, \mathcal{L}_M$ is a partition of $\mathcal{X}$,
3. $L(x) \triangleq |\mathcal{L}_j|$ if $x \in \mathcal{L}_j$,
then
$$\sum_{x\in\mathcal{X}} P(x)\,L^\rho(x) \ge M^{-\rho}\left(\sum_{x\in\mathcal{X}} P(x)^{\frac{1}{1+\rho}}\right)^{1+\rho}.$$
A Simple Identity for the Proof of the Lemma

$$\sum_{x\in\mathcal{X}} \frac{1}{L(x)} = M.$$

Proof:
$$\sum_{x\in\mathcal{X}} \frac{1}{L(x)} = \sum_{j=1}^{M}\sum_{x\in\mathcal{L}_j} \frac{1}{L(x)} = \sum_{j=1}^{M}\sum_{x\in\mathcal{L}_j} \frac{1}{|\mathcal{L}_j|} = \sum_{j=1}^{M} 1 = M.$$
Proof of the Lemma

1. Recall Hölder’s Inequality: If $p, q > 1$ and $1/p + 1/q = 1$, then
$$\sum_x a(x)b(x) \le \left(\sum_x a(x)^p\right)^{1/p}\left(\sum_x b(x)^q\right)^{1/q}, \qquad a(\cdot), b(\cdot) \ge 0.$$
2. Rearranging gives
$$\sum_x a(x)^p \ge \left(\sum_x b(x)^q\right)^{-p/q}\left(\sum_x a(x)b(x)\right)^p.$$
3. Choose $p = 1+\rho$, $q = (1+\rho)/\rho$, $a(x) = P(x)^{\frac{1}{1+\rho}} L(x)^{\frac{\rho}{1+\rho}}$, and $b(x) = L(x)^{-\frac{\rho}{1+\rho}}$, and note that $\sum_x \frac{1}{L(x)} = M$. (Indeed, then $a(x)^p = P(x)L^\rho(x)$, $b(x)^q = 1/L(x)$, and $a(x)b(x) = P(x)^{\frac{1}{1+\rho}}$, which yields the lemma.)
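A quick numerical check (ours) of the lemma and of the simple identity: draw a random PMF on a small alphabet, partition the alphabet into $M$ nonempty lists, and compare the two sides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sym, M, rho = 10, 3, 1.5

P = rng.random(n_sym)
P /= P.sum()                                   # random PMF on {0,...,9}
# Random partition into M lists, each guaranteed nonempty.
labels = np.concatenate([np.arange(M), rng.integers(M, size=n_sym - M)])
sizes = np.bincount(labels, minlength=M)       # |L_j|
L = sizes[labels]                              # L(x) = size of x's list

assert np.isclose(np.sum(1.0 / L), M)          # the simple identity
lhs = np.sum(P * L.astype(float) ** rho)
rhs = M ** (-rho) * np.sum(P ** (1 / (1 + rho))) ** (1 + rho)
print(lhs >= rhs, lhs, rhs)                    # the lemma: lhs >= rhs
```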
Converse

1. WLOG assume $\lambda_n(m) = \{x^n \in \mathcal{X}^n : f_n(x^n) = m\}$.
2. ⇒ The lists $\lambda_n(1), \dots, \lambda_n(2^{nR})$ partition $\mathcal{X}^n$.
3. $\lambda_n(f_n(x^n))$ is the list containing $x^n$.
4. By the lemma:
$$\sum_{x^n\in\mathcal{X}^n} P_X^n(x^n)\,|\lambda_n(f_n(x^n))|^\rho \ge 2^{-n\rho R}\left(\sum_{x^n\in\mathcal{X}^n} P_X^n(x^n)^{\frac{1}{1+\rho}}\right)^{1+\rho} = 2^{n\rho\left(H_{\frac{1}{1+\rho}}(X) - R\right)}.$$

Recall the lemma:
$$\sum_{x\in\mathcal{X}} P(x)\,L^\rho(x) \ge M^{-\rho}\left(\sum_{x\in\mathcal{X}} P(x)^{\frac{1}{1+\rho}}\right)^{1+\rho}.$$
How to Define Conditional Rényi Entropy?

Should it be defined as
$$\sum_{y\in\mathcal{Y}} P_Y(y)\,H_\alpha(X \mid Y = y)\,?$$

Consider $Y$ as side information to both encoder and decoder, $(X_i, Y_i) \sim$ IID $P_{XY}$.
You and your spouse hopefully have something in common...
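For reference, the averaging candidate on this slide is easy to compute; the sketch below (ours) evaluates it for a toy joint PMF. It is only the candidate under discussion here, not necessarily the "right" definition.

```python
import numpy as np

def renyi(p, alpha):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return np.log2(np.sum(p ** alpha)) / (1 - alpha)

def naive_conditional_renyi(P_XY, alpha):
    """The averaging candidate: sum_y P_Y(y) * H_alpha(X | Y = y)."""
    P_XY = np.asarray(P_XY, dtype=float)
    P_Y = P_XY.sum(axis=0)
    return sum(P_Y[y] * renyi(P_XY[:, y] / P_Y[y], alpha)
               for y in range(P_XY.shape[1]) if P_Y[y] > 0)

P_XY = np.array([[0.4, 0.1],   # rows indexed by x, columns by y
                 [0.1, 0.4]])
print(naive_conditional_renyi(P_XY, alpha=0.5))
```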
Lossless List Source Codes with Side-Information

• $(X_1, Y_1), (X_2, Y_2), \dots \sim$ IID $P_{XY}$.
• $Y^n$ is side-information.
• Rate-$R$ blocklength-$n$ source code with list decoder:
$$f_n\colon \mathcal{X}^n \times \mathcal{Y}^n \to \{1,\dots,2^{nR}\}, \qquad \lambda_n\colon \{1,\dots,2^{nR}\} \times \mathcal{Y}^n \to 2^{\mathcal{X}^n}.$$
• Lossless property:
$$x^n \in \lambda_n\bigl(f_n(x^n, y^n), y^n\bigr), \quad \forall\, (x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n.$$
• $\rho$-th listsize moment: $E\bigl[|\lambda_n(f_n(X^n, Y^n), Y^n)|^\rho\bigr]$.