Entropy and Lossless Coding

Bernd Girod: EE398A Image and Video Compression
1. Lossless compression in lossy compression systems

Almost every lossy compression system contains a lossless compression system.

[Figure: block diagram of a lossy compression system — Transform → Quantizer → Lossless Encoder → Lossless Decoder → Dequantizer → Inverse Transform; the lossless encoder/decoder pair forms the embedded lossless compression system.]

We discuss the basics of lossless compression first, then move on to lossy compression.

2. Topics in lossless compression

- Binary decision trees and variable-length coding
- Entropy and bit-rate
- Prefix codes, Huffman codes, Golomb codes
- Joint entropy, conditional entropy, sources with memory
- Fax compression standards
- Arithmetic coding

3. Example: 20 Questions

- Alice thinks of an outcome (from a finite set), but does not disclose her selection.
- Bob asks a series of yes/no questions to uniquely determine the outcome chosen. The goal of the game is to ask as few questions as possible on average.
- Our goal: design the best strategy for Bob.

4. Example: 20 Questions (cont.)

Which strategy is better?

[Figure: two candidate question trees for the outcomes A–F, with branches labeled 0 (= no) and 1 (= yes).]

Observation: the collection of questions and answers yields a binary code for each outcome.

5. Fixed-length codes

[Figure: a balanced binary tree assigning 3-bit code words to the eight outcomes A–H.]

- Average description length for K outcomes: $l_{av} = \lceil \log_2 K \rceil$
- Optimum for equally likely outcomes (verify by modifying the tree).
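For the eight-outcome tree above, the formula gives $l_{av} = \lceil \log_2 8 \rceil = 3$ bits per outcome, matching the depth of the balanced tree.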

6. Variable-length codes

- If outcomes are NOT equally probable:
  - Use shorter descriptions for likely outcomes.
  - Use longer descriptions for less likely outcomes.
- Intuition: optimum balanced code trees, i.e., trees with equally likely outcomes, can be pruned to yield unbalanced trees with unequal leaf probabilities. The unbalanced code trees thus obtained are also optimum.
- Hence, an outcome of probability $p$ should require about $\left\lceil \log_2 \frac{1}{p} \right\rceil$ bits.
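A quick numerical check of this intuition (probabilities chosen here for illustration, not taken from the slide): an outcome with probability $p = 1/2$ should get about $\log_2 2 = 1$ bit, while a less likely outcome with $p = 1/8$ should get about $\log_2 8 = 3$ bits.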

7. Entropy of a random variable

- Consider a discrete, finite-alphabet random variable X:
  - Alphabet $\mathcal{X} = \{a_0, a_1, a_2, \ldots, a_{K-1}\}$
  - PMF $f_X(x) = P(X = x)$ for each $x \in \mathcal{X}$
- Information associated with the event $X = x$: $h_X(x) = -\log_2 f_X(x)$
- Entropy of X is the expected value of that information: $H(X) = E\{h_X(X)\} = -\sum_{x \in \mathcal{X}} f_X(x) \log_2 f_X(x)$
- Unit: bits
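A minimal sketch of the definition in code (not from the course materials); zero-probability terms are skipped, consistent with $-p \log_2 p \to 0$:

```python
import math

def entropy(pmf):
    # H(X) = -sum over x of f_X(x) * log2 f_X(x), in bits
    return -sum(p * math.log2(p) for p in pmf if p > 0)

print(entropy([0.5, 0.5]))                 # fair coin: 1.0 bit
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits
```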

8. Information and entropy: properties

- Information is non-negative: $h_X(x) \ge 0$
- Information $h_X(x)$ strictly increases with decreasing probability $f_X(x)$.
- Boundedness of entropy: $0 \le H(X) \le \log_2 |\mathcal{X}|$. Equality on the left if only one outcome can occur; equality on the right if all outcomes are equally likely.
- Very likely and very unlikely events do not substantially change entropy: $-p \log_2 p \to 0$ for $p \to 0$ or $p \to 1$.
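The limiting behavior at $p \to 0$, a step the slide leaves implicit, follows from l'Hôpital's rule:

$\lim_{p \to 0^+} (-p \log_2 p) = \lim_{p \to 0^+} \frac{-\log_2 p}{1/p} = \lim_{p \to 0^+} \frac{-1/(p \ln 2)}{-1/p^2} = \lim_{p \to 0^+} \frac{p}{\ln 2} = 0$

At $p = 1$, $-p \log_2 p = 0$ directly.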

9. Example: Binary random variable

$H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p)$

[Figure: plot of $H(X)$ versus $p$ on $[0, 1]$; the curve is 0 at the deterministic points $p = 0$ and $p = 1$ and peaks at 1 bit for equally likely outcomes, $p = 0.5$.]
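Two sample evaluations (computed here, not read off the plot): $H(0.5) = 1$ bit, while $H(0.1) = -0.1 \log_2 0.1 - 0.9 \log_2 0.9 \approx 0.469$ bits, illustrating how a strongly biased binary source carries much less than 1 bit per outcome.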

10. Entropy and bit-rate

- Consider an IID random process (or “source”) $\{X_n\}$ where each sample (or “symbol”) possesses identical entropy $H(X)$. $H(X)$ is called the “entropy rate” of the random process.
- Noiseless Source Coding Theorem [Shannon, 1948]:
  - The entropy $H(X)$ is a lower bound for the average word length $R$ of a decodable variable-length code for the symbols.
  - Conversely, the average word length $R$ can approach $H(X)$ if sufficiently large blocks of symbols are encoded jointly.
- Redundancy of a code: $\varrho = R - H(X) \ge 0$

11. Variable-length codes

- Given an IID random process $\{X_n\}$ with alphabet $\mathcal{X}$ and PMF $f_X(x)$.
- Task: assign a distinct code word $c_x$ to each element $x \in \mathcal{X}$, where $c_x$ is a string of $\|c_x\|$ bits, such that each symbol can be determined even if the code words are directly concatenated in a bitstream. Codes with this property are said to be “uniquely decodable.”
- Prefix codes:
  - No code word is a prefix of any other code word.
  - Uniquely decodable, symbol by symbol, in natural order $0, 1, 2, \ldots, n, \ldots$
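A minimal sketch of symbol-by-symbol decoding (the code table here is illustrative, not from the slides): because no code word is a prefix of another, the first match while scanning the bitstream is always the correct symbol, with no lookahead needed.

```python
code = {"a0": "0", "a1": "10", "a2": "110", "a3": "111"}  # a prefix code
decode_table = {v: k for k, v in code.items()}

def decode(bits):
    symbols, word = [], ""
    for b in bits:
        word += b
        if word in decode_table:  # first match is the symbol
            symbols.append(decode_table[word])
            word = ""
    return symbols

bitstream = "".join(code[s] for s in ["a1", "a0", "a3", "a0", "a1"])
print(bitstream)          # 100111010
print(decode(bitstream))  # ['a1', 'a0', 'a3', 'a0', 'a1']
```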

12. Example of a non-decodable code

- The code (read off the bit groupings below): $c_{a_0} = 0$, $c_{a_1} = 01$, $c_{a_2} = 10$, $c_{a_3} = 11$.
- Encode the sequence of source symbols $a_0, a_2, a_3, a_0, a_1$. Resulting bitstream: 0 10 11 0 01.
- Encode the sequence of source symbols $a_1, a_0, a_3, a_0, a_1$. Resulting bitstream: 01 0 11 0 01.
- Same bitstream (01011001) for different sequences of source symbols: ambiguous, not uniquely decodable.
- BTW: not a prefix code, since $c_{a_0} = 0$ is a prefix of $c_{a_1} = 01$.
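The ambiguity can be verified directly (code table as reconstructed above):

```python
code = {"a0": "0", "a1": "01", "a2": "10", "a3": "11"}

bits1 = "".join(code[s] for s in ["a0", "a2", "a3", "a0", "a1"])
bits2 = "".join(code[s] for s in ["a1", "a0", "a3", "a0", "a1"])
print(bits1, bits2, bits1 == bits2)  # 01011001 01011001 True
```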

13. Unique decodability: McMillan and Kraft conditions

- Necessary condition for unique decodability [McMillan]: $\sum_{x \in \mathcal{X}} 2^{-\|c_x\|} \le 1$
- Given a set of code word lengths $\|c_x\|$ satisfying the McMillan condition, a corresponding prefix code always exists [Kraft].
- Hence, the McMillan inequality is both necessary and sufficient. It is also known as the Kraft inequality or the Kraft-McMillan inequality.
- No loss by considering only prefix codes.
- The prefix code is not unique.
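A one-function check of the condition (length sets chosen here for illustration):

```python
def kraft_sum(lengths):
    # McMillan/Kraft sum over the code word lengths ||c_x||
    return sum(2.0 ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 3]))  # 1.0  -> a prefix code with these lengths exists
print(kraft_sum([2, 2, 2, 2]))  # 1.0  -> e.g. a fixed-length code
print(kraft_sum([1, 1, 2]))     # 1.25 -> no uniquely decodable code has these lengths
```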

14. Prefix decoder

[Figure: decoder architecture. An input buffer feeds a shift register wide enough to hold the longest code word. The register contents address a code word LUT, which emits the decoded symbol, and a code word length LUT, which advances the buffer by $\|c_x\|$ bits.]
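A software sketch of this architecture (the code table is assumed for illustration; the slide specifies only the structure): every possible shift-register content maps, via suffix padding, to a (symbol, length) pair, and the read position advances by the decoded code word's length.

```python
code = {"a0": "0", "a1": "10", "a2": "110", "a3": "111"}
max_len = max(len(c) for c in code.values())  # shift register width

# Build the LUT: each code word, extended by all possible suffix bits,
# addresses the same (symbol, code word length) entry.
lut = {}
for sym, cw in code.items():
    pad = max_len - len(cw)
    for i in range(2 ** pad):
        suffix = format(i, "b").zfill(pad) if pad else ""
        lut[cw + suffix] = (sym, len(cw))

def decode(bits):
    symbols, pos = [], 0
    while pos < len(bits):
        window = bits[pos:pos + max_len].ljust(max_len, "0")  # register contents
        sym, length = lut[window]
        symbols.append(sym)
        pos += length  # advance ||c_x|| bits
    return symbols

print(decode("100111010"))  # ['a1', 'a0', 'a3', 'a0', 'a1']
```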

15. Binary trees and prefix codes

- Any binary tree can be converted into a prefix code by traversing the tree from root to leaves.

[Figure: an example tree whose leaves carry the code words 00, 01, 10, 111, 1100, 1101.]

- Any prefix code corresponding to a binary tree meets the McMillan condition with equality: $\sum_{x \in \mathcal{X}} 2^{-\|c_x\|} = 1$. For the example tree: $2^{-2} + 2^{-2} + 2^{-2} + 2^{-3} + 2^{-4} + 2^{-4} = 1$.

16. Binary trees and prefix codes (cont.)

- Augmenting the binary tree by two new nodes does not change the McMillan sum: a leaf at depth $l$ contributing $2^{-l}$ is replaced by two leaves at depth $l + 1$ contributing $2^{-(l+1)} + 2^{-(l+1)} = 2^{-l}$.
- Pruning the binary tree does not change the McMillan sum, by the same identity in reverse.
- McMillan sum for the simplest binary tree: $2^{-1} + 2^{-1} = 1$.

17. Instantaneous variable-length encoding without redundancy: example

- A code without redundancy, i.e., $\varrho = 0$ and $R = H(X)$, requires all individual code word lengths $l_k = -\log_2 f_X(a_k)$.
- All probabilities would therefore have to be binary fractions: $f_X(a_k) = 2^{-l_k}$
- In the example: $H(X) = 1.75$ bits and $R = 1.75$ bits, so $\varrho = 0$.
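One dyadic PMF consistent with the quoted numbers (an illustrative choice, not reproduced from the slide's table) is $f_X = \{1/2, 1/4, 1/8, 1/8\}$, giving lengths $l_k = 1, 2, 3, 3$ and

$H(X) = \tfrac{1}{2} \cdot 1 + \tfrac{1}{4} \cdot 2 + \tfrac{1}{8} \cdot 3 + \tfrac{1}{8} \cdot 3 = 1.75 \text{ bits} = R$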

18. Huffman code

- The design algorithm for variable-length codes proposed by Huffman (1952) always finds a code with minimum redundancy.
- Obtain the code tree as follows (a compact implementation is sketched below):
  1. Pick the two symbols with the lowest probabilities and merge them into a new auxiliary symbol.
  2. Calculate the probability of the auxiliary symbol.
  3. If more than one symbol remains, repeat steps 1 and 2 for the new auxiliary alphabet.
  4. Convert the code tree into a prefix code.
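A sketch of the merge procedure (illustrative code, not the course's reference implementation): a heap keeps the two lowest-probability entries on top, and each merge prepends one more bit to the code words in the merged subtrees.

```python
import heapq
from itertools import count

def huffman(pmf):
    # pmf: dict symbol -> probability; returns dict symbol -> code word
    tiebreak = count()  # avoids comparing dicts when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in pmf.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)  # two lowest probabilities (step 1)
        p1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))  # auxiliary symbol (step 2)
    return heap[0][2]

print(huffman({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
# {'a': '0', 'b': '10', 'c': '110', 'd': '111'} (up to 0/1 label swaps)
```

On this dyadic PMF, the one used as an illustration for slide 17 above, the Huffman code reaches the entropy exactly: $R = H(X) = 1.75$ bits.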

19. Huffman code: example

[Figure: probability table and Huffman code tree for the example source.]

- Fixed-length coding: $R_{fixed} = 4$ bits/symbol
- Huffman code: $R_{Huffman} = 2.77$ bits/symbol
- Entropy: $H(X) = 2.69$ bits/symbol
- Redundancy of the Huffman code: $\varrho = 0.08$ bits/symbol

20. Redundancy of a prefix code for a general distribution

- Huffman code redundancy: $0 \le \varrho < 1$ bit/symbol
- Theorem: for any distribution $f_X$, a prefix code can be found whose rate $R$ satisfies
  $H(X) \le R < H(X) + 1$
- Proof:
  - Left-hand inequality: Shannon's noiseless coding theorem.
  - Right-hand inequality: choose code word lengths $\|c_x\| = \lceil -\log_2 f_X(x) \rceil$. These lengths satisfy the Kraft inequality, since $\sum_x 2^{-\lceil -\log_2 f_X(x) \rceil} \le \sum_x f_X(x) = 1$, so a prefix code with these lengths exists. The resulting rate is
  $R = \sum_{x \in \mathcal{X}} f_X(x) \lceil -\log_2 f_X(x) \rceil < \sum_{x \in \mathcal{X}} f_X(x) \left(1 - \log_2 f_X(x)\right) = H(X) + 1$
