
Coding and Data Compression Mathias Winther Madsen - PowerPoint PPT Presentation

Coding and Data Compression. Mathias Winther Madsen (mathias.winther@gmail.com), Institute for Logic, Language, and Computation, University of Amsterdam, March 2015.


  1. Coding and Data Compression. Mathias Winther Madsen, mathias.winther@gmail.com, Institute for Logic, Language, and Computation, University of Amsterdam, March 2015.

  2. Information Theory. [Figure: messages M and received signals E, with $2^{H_y(x)T}$ reasonable causes for each E and $2^{H_x(y)T}$ reasonable effects for each M.] Claude Shannon: “A Mathematical Theory of Communication,” Bell System Technical Journal, 1948.

  3. Information Theory THE COIEF DIFFIOULTY ALOCE FOUOD OT FIRST WAS IN OAOAGING HER FLAOINGO: SHE SUCCEODEO ON GO OTIOG IOS BODY OUOKEO AOAO, COMFOROABLY EOOOGO, UNDER OER O OM, WITO OTS O O OS HANGIOG DOO O, BOT OENEOAO OY, OUST AS SO O HOD OOT OTS O OCK NOCEO O SOROIGHTEOEO O OT, ANO WOS O O ONG TO OIOE TO O HEDGEHOG O OLOW WOTH ITS O OAD, O O WOULO TWOST O OSEOF OOUO O ANO O O OK OP IN HOR OACO, O OTO OUO O A O O OZOED EO OREOSOOO O O O O SHO COUOD O O O O O O O O O OSO O OG O O O OAO OHO O O: AOD WHON O O O OAO OOO O O O O O O O DOO O, O OD O OS GOIOG O O BO O ON O O OIO, O O O OS O O OY O OOOOO O O O O O O O O O O O OT TO O OEOGO O O O O OD O OROLO O O O O O O OF, O O O O O O O O OHO O O O O O O O O O O O O O O O O O O

  4. The Hartley Measure. Definition: the Hartley measure of uncertainty is $H = \log_2 |\Omega|$. Ralph V. L. Hartley: “Transmission of Information,” Bell System Technical Journal, 1928.

  5. The Hartley Measure ♠♣♥♦ ♣♠♥♦ ♠♣♦♥ ♣♠♦♥ ♠♥♣♦ ♣♥♠♦ ♠♦♣♥ ♣♦♠♥ ♠♥♦♣ ♣♥♦♠ ♠♦♥♣ ♣♦♥♠ ♠♣♦♥ ♣♠♦♥ ♠♣♥♦ ♣♠♥♦ ♠♦♣♥ ♣♦♠♥ ♠♥♣♦ ♣♥♠♦ ♠♦♥♣ ♣♦♥♠ ♠♥♦♣ ♣♥♦♠ $H = \log_2 24 \approx 4.58$

  6. The Hartley Measure 00000 00001 00010 00011 00100 00101 00110 00111 01000 01001 01010 01011 01100 01101 01110 01111 10000 10001 10010 10011 10100 10101 10110 10111 $H = \log_2 24 \approx 4.58$

  7. The Hartley Measure ♠♣♥♦ ♣♠♥♦ ♠♣♦♥ ♣♠♦♥ ♠♥♣♦ ♣♥♠♦ ♠♦♣♥ ♣♦♠♥ ♠♥♦♣ ♣♥♦♠ ♠♦♥♣ ♣♦♥♠ ♠♣♦♥ ♣♠♦♥ ♠♣♥♦ ♣♠♥♦ ♠♦♣♥ ♣♦♠♥ ♠♥♣♦ ♣♥♠♦ ♠♦♥♣ ♣♦♥♠ ♠♥♦♣ ♣♥♦♠ $H = \log_2 24 \approx 4.58$

  8. The Hartley Measure ♠♣♥♦ – – – ♠♥♣♦ – – – ♠♥♦♣ – – – ♠♣♦♥ – – – ♠♦♣♥ – – – ♠♦♥♣ – – – $H = \log_2 6 \approx 2.58$

  9. The Hartley Measure 000 – – – 001 – – – 010 – – – 011 – – – 100 – – – 101 – – – $H = \log_2 6 \approx 2.58$

  10. The Hartley Measure ♠♣♥♦ – – – ♠♥♣♦ – – – ♠♥♦♣ – – – ♠♣♦♥ – – – ♠♦♣♥ – – – ♠♦♥♣ – – – $H = \log_2 6 \approx 2.58$

  11. The Hartley Measure – – – – ♠♥♣♦ – – – ♠♥♦♣ – – – – – – – – – – – – – – – $H = \log_2 2 = 1.00$

  12. The Hartley Measure – – – – ♠♥♣♦ – – – – – – – – – – – – – – – – – – – $H = \log_2 1 = 0.00$
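
The counts on slides 5 to 12 can be reproduced with a short Python sketch. The helper name `hartley` and the ASCII letters standing in for the four suits are illustrative choices, not taken from the slides; the sketch assumes each observation fixes one more position in the ordering.

```python
from itertools import permutations
from math import ceil, log2

def hartley(outcomes):
    """Hartley measure in bits: log2 of the number of possible outcomes."""
    return log2(len(outcomes))

# S, C, H, D stand in for spades, clubs, hearts, diamonds (an assumption
# made here to keep the example ASCII-only).
orders = ["".join(p) for p in permutations("SCHD")]       # 24 orderings
spade_first = [o for o in orders if o[0] == "S"]          # 6 remain
heart_second = [o for o in spade_first if o[1] == "H"]    # 2 remain
club_third = [o for o in heart_second if o[2] == "C"]     # 1 remains

stages = [
    ("nothing known", orders),
    ("first suit known", spade_first),
    ("first two suits known", heart_second),
    ("first three suits known", club_third),
]
for name, outcomes in stages:
    bits = hartley(outcomes)
    # ceil(bits) is the number of binary digits needed to label every outcome.
    print(f"{name}: {len(outcomes)} outcomes, H = {bits:.2f} bits, "
          f"{ceil(bits)} binary digits suffice")
```

Rounding the measure up gives the label lengths used on slides 6 and 9: five binary digits for 24 outcomes and three for 6.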

  13. The Hartley Measure. $H = \log k$? $H = \log(\infty)$?

  14. Entropy. The Shannon entropy: $H = E\left[\log \frac{1}{p(X)}\right] = \sum_x p(x) \log \frac{1}{p(x)}$. [Figure: two plots of p(x), with horizontal axes x and −log p(x).]
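
As a minimal sketch of the definition on slide 14, the sum can be evaluated directly in Python. The function name `entropy` and both example distributions are assumptions made for illustration; the values plotted on the slide are not recoverable from the transcript.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits: sum of p * log2(1/p) over outcomes with p > 0."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

skewed = [0.5, 0.25, 0.25]   # hypothetical distribution on three outcomes
uniform = [1 / 24] * 24      # 24 equally likely outcomes

print(entropy(skewed))
print(entropy(uniform), log2(24))
```

For a uniform distribution every term equals log2 of the number of outcomes, so the Shannon entropy agrees with the Hartley measure.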

  15. Entropy. [Figure: plot with vertical axis H; both axes run from 0 to 1.]

  16. Entropy

  17. Entropy. [Figure: diagram with branches labelled p and 1 − p, alongside a plot of H against p for p from 0 to 1.]

  18. Entropy. Properties of the entropy:
      1. Non-negative: $H \geq 0$.
      2. Decomposes: $H(X \times Y) = H(X) + H(Y \mid X)$.
      3. Reduced (on average) by information: $H(X) \geq H(X \mid Y)$.
      Definition (conditional entropy): $H(X \mid Y) = E_Y\left[ H(X \mid Y = y) \right] = \sum_y p(y)\, H(X \mid Y = y)$.
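
The three properties on slide 18 can be checked numerically on a small example. The joint distribution below is hypothetical, and the helper names `marginal` and `conditional_entropy` are introduced only for this sketch.

```python
from collections import defaultdict
from math import log2

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution of coordinate `axis` of a joint {(x, y): p}."""
    totals = defaultdict(float)
    for pair, p in joint.items():
        totals[pair[axis]] += p
    return dict(totals)

def conditional_entropy(joint, given_axis):
    """H(other | given) = sum over g of p(g) * H(other | given = g)."""
    total = 0.0
    for g, pg in marginal(joint, given_axis).items():
        if pg == 0:
            continue
        # Distribution of the other coordinate once the given one equals g.
        cond = [p / pg for pair, p in joint.items() if pair[given_axis] == g]
        total += pg * entropy(cond)
    return total

# Hypothetical joint distribution p(x, y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

h_xy = entropy(joint.values())                           # H(X x Y)
h_x = entropy(marginal(joint, 0).values())               # H(X)
h_y_given_x = conditional_entropy(joint, given_axis=0)   # H(Y | X)
h_x_given_y = conditional_entropy(joint, given_axis=1)   # H(X | Y)

print(h_xy, h_x + h_y_given_x)   # the two sides of the decomposition
print(h_x, h_x_given_y)          # conditioning does not increase entropy
```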

  19. Huffman Coding
        x             a     b     c     d     e
        Pr{X = x}    .05   .15   .20   .25   .35
      David A. Huffman: “A Method for the Construction of Minimum-Redundancy Codes,” Proceedings of the Institute of Radio Engineers, 1952.
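
A minimal sketch of Huffman's construction applied to the five-symbol distribution on slide 19: repeatedly merge the two least probable groups, prefixing one with 0 and the other with 1. The function name `huffman_code` and the tie-breaking counter are choices made for this sketch; which branch gets 0 or 1 is arbitrary, so the codewords may differ from those on the slides while the codeword lengths agree.

```python
import heapq
from math import log2

def huffman_code(probs):
    """Return a dict mapping each symbol to a binary codeword string."""
    # Heap entries: (probability, tie-breaker, {symbol: partial codeword}).
    # The integer tie-breaker keeps Python from ever comparing the dicts.
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # least probable group
        p1, _, group1 = heapq.heappop(heap)   # second least probable group
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a": 0.05, "b": 0.15, "c": 0.20, "d": 0.25, "e": 0.35}
code = huffman_code(probs)

mean_length = sum(probs[s] * len(code[s]) for s in probs)
source_entropy = sum(p * log2(1 / p) for p in probs.values())

for sym in sorted(code):
    print(sym, code[sym])
print("expected length:", mean_length, "entropy:", source_entropy)
```

The expected codeword length comes out slightly above the entropy of the source, which is the lower bound for any uniquely decodable binary code.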

  20. Huffman Coding

  21. Huffman Coding
        x   Code           p       −log p   k
        A   1001           .0634    3.98     4
        B   011101         .0135    6.21     6
        C   00011          .0242    5.37     5
        D   10100          .0321    4.96     5
        E   001            .0980    3.35     3
        F   101111         .0174    5.84     6
        G   101011         .0165    5.92     6
        H   11011          .0438    4.51     5
        I   0110           .0552    4.18     4
        J   011100000      .0009   10.17     9
        K   0111001        .0061    7.35     7
        L   10110          .0336    4.89     5
        M   101110         .0174    5.85     6
        N   0101           .0551    4.18     4
        O   1000           .0622    4.01     4
        P   110100         .0180    5.80     6
        Q   0111000100     .0008   10.33    10
        R   0000           .0470    4.41     4
        S   0100           .0502    4.32     4
        T   1100           .0729    3.78     4
        U   00010          .0234    5.42     5
        V   0111110        .0075    7.06     7
        W   011110         .0156    6.00     6
        X   011100001      .0014    9.46     9
        Y   101010         .0160    5.97     6
        Z   01110001011    .0005   11.04    11
        ¶   0111111        .0084    6.89     7
        _   111            .1741    2.52     3
        ’   011100011      .0019    9.06     9
        ,   1101011        .0117    6.42     7
        .   1101010        .0109    6.52     7
        ?   01110001010    .0003   11.56    11
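
Because no codeword in a Huffman code is a prefix of another, a bit stream can be decoded greedily from left to right. The sketch below uses five codewords copied from slide 21; the function names `encode` and `decode`, the sample message, and the reading of `_` as the space symbol are assumptions made for illustration.

```python
# Codewords for E, T, A, H and "_" as listed on slide 21.
CODE = {"E": "001", "T": "1100", "A": "1001", "H": "11011", "_": "111"}

def encode(text, code):
    """Concatenate the codewords of the symbols in `text`."""
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    """Greedy left-to-right decoding; valid because the code is prefix-free."""
    inverse = {word: sym for sym, word in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:        # a complete codeword has been read
            out.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)

message = "THE_HAT"                  # hypothetical sample message
bits = encode(message, CODE)
print(bits, len(bits))
print(decode(bits, CODE))
```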
