Struktur Data & Algoritme (Data Structures & Algorithms)
Tree: Application
Denny (denny@cs.ui.ac.id)
Suryana Setiawan (setiawan@cs.ui.ac.id)
Fakultas Ilmu Komputer, Universitas Indonesia
Even Semester 2004/2005
Version 2.0 - Internal Use Only
SDA/HUFF/V2.0

Objectives
• Understand one file compression technique (Huffman).

Outline
• Compression
• Huffman compression

Compression
• Process:
  • Encoding: raw → compressed
  • Decoding: compressed → raw
• Types of compression:
  • Lossy: MPEG, JPEG
  • Lossless
• Compression algorithms:
  • RLE (Run-Length Encoding)
  • Lempel-Ziv
  • Huffman encoding
• Compression performance depends on the file type.
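
To make the encode/decode round trip concrete, here is a minimal sketch of run-length encoding, the simplest algorithm in the list above. The function names `rle_encode` and `rle_decode` are illustrative and not from the slides, and real RLE formats pack the counts into bytes rather than Python tuples.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of equal characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)

raw = "aaaabbbcca"
packed = rle_encode(raw)
print(packed)  # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]

# Lossless: decoding the encoded form gives back the raw input exactly.
assert rle_decode(packed) == raw
```

RLE only pays off when the input has long runs, which is one reason performance depends on the file type.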

Huffman Compression
• Example: "If a woodchuck could chuck wood!"
• 32 characters × 8 bits = 256 bits
• 13 distinct characters → 4 bits per character
• Fixed-length compressed code: 32 × 4 = 128 bits
• Variable-length bit strings improve compression further.
• Using prefix codes: no code word is a prefix of another.

Huffman Compression
• Frequently occurring letters: short representations.
• Infrequent letters: long representations.
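
The fixed-length arithmetic for the woodchuck example can be checked with a few lines of Python. This is only a sketch of the counting argument; the variable names are illustrative.

```python
import math

text = "If a woodchuck could chuck wood!"

distinct = len(set(text))                        # number of different characters
bits_per_char = math.ceil(math.log2(distinct))   # smallest fixed width that fits them all

ascii_bits = len(text) * 8                       # one byte per character
fixed_bits = len(text) * bits_per_char           # shortest fixed-length code

print(len(text), distinct, bits_per_char)  # 32 13 4
print(ascii_bits, fixed_bits)              # 256 128
```

Four bits can name 16 symbols, so three of the 16 code words go unused; a variable-length code avoids that waste.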

Huffman Encoding: comparison
Fixed-length code, frequencies a: 8, i: 5, u: 3, e: 3
• a = 00 → 16 bits
• i = 01 → 10 bits
• u = 10 → 6 bits
• e = 11 → 6 bits
Total: 38 bits

Huffman Encoding: comparison
Variable-length code from the tree 19 → (a: 8, 11 → (i: 5, 6 → (u: 3, e: 3)))
• a = 0 → 8 bits
• i = 10 → 10 bits
• u = 110 → 9 bits
• e = 111 → 9 bits
Total: 36 bits
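
The two totals can be recomputed directly from the frequencies and code words on the slides. In this sketch, `total_bits` is an illustrative helper name; the fixed-length total works out to 19 letters × 2 bits = 38 bits.

```python
# Letter frequencies and the two code tables from the comparison slides.
freq = {"a": 8, "i": 5, "u": 3, "e": 3}
fixed = {"a": "00", "i": "01", "u": "10", "e": "11"}
variable = {"a": "0", "i": "10", "u": "110", "e": "111"}

def total_bits(freq, code):
    """Sum of (frequency x code length) over all letters."""
    return sum(freq[ch] * len(code[ch]) for ch in freq)

print(total_bits(freq, fixed))     # 38
print(total_bits(freq, variable))  # 36
```

The saving comes entirely from the most frequent letter, a, dropping from 2 bits to 1.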
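
Huffman Encoding
[Figure: Huffman tree for the example string, each branch labeled 0 or 1; leaves: d, c, space, o, u, k, w, h, !, I, f, a, l]

Huffman Encoding
[Figure: the same tree annotated with weights. Leaf frequencies: o: 5, c: 5, space: 5, d: 3, u: 3, k: 2, w: 2, h: 2, !: 1, I: 1, f: 1, a: 1, l: 1. Internal node weights: 32 (root), 19, 13, 10, 9, 7, 6, 4, 4, 3, 2, 2]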

Huffman Encoding
char = code (freq)
! = 0000 (1)         I = 10000 (1)
a = 00010 (1)        f = 10001 (1)
l = 00011 (1)        h = 1001 (2)
u = 001 (3)          c = 101 (5)
d = 010 (3)          space = 110 (5)
k = 0110 (2)         o = 111 (5)
w = 0111 (2)
Cost: ∑ d_i × f_i = 111 bits ≈ 43% of 256 bits, where d_i is the code length and f_i the frequency of character i.

Huffman Encoding: steps
[Figure: start of the construction. Each character is a single-node tree weighted by its frequency: o: 5, c: 5, space: 5, u: 3, d: 3, k: 2, w: 2, h: 2, I: 1, f: 1, !: 1, a: 1, l: 1]
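
The code table can be exercised with a short encoder and decoder. This is a sketch using the code words listed on the slide; `encode` and `decode` are illustrative names, and the decoder relies on the prefix property, so it can emit a character as soon as the bits read so far match a code word.

```python
# Code words from the slide's table (the space character is written as " ").
CODE = {
    "!": "0000", "a": "00010", "l": "00011", "u": "001",
    "d": "010", "k": "0110", "w": "0111",
    "I": "10000", "f": "10001", "h": "1001",
    "c": "101", " ": "110", "o": "111",
}

def encode(text, code):
    """Concatenate the code word of every character."""
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    """Read bits left to right, emitting a character at each code-word match."""
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:   # unambiguous because no word is a prefix of another
            out.append(inverse[current])
            current = ""
    return "".join(out)

text = "If a woodchuck could chuck wood!"
bits = encode(text, CODE)
print(len(bits))  # 111
assert decode(bits, CODE) == text
```

No separators are stored between code words; the prefix property alone makes the bit string decodable.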

Huffman Encoding: steps
[Figures: the forest after each merge. At every step the two lowest-weight trees are joined under a new root whose weight is the sum of the two. New root weights, in order: 2, 2, 3, 4, 4, 6, 7, 9, 10, 13, 19, 32. The weight-1 leaves (I, f, !, a, l) and k end up in the subtrees of weight 2, 2, 3 and 4; the other merges are 4 = h + w, 6 = u + d, 7 = 3 + 4, 9 = c + (h, w), 10 = o + space, 13 = 7 + 6, 19 = 10 + 9, and 32 = 19 + 13 (the root)]

Huffman Encoding: steps
[Figure: the finished tree with leaves o, c, I, h, w, k, u, d, a, l, f, !]
Total: 111 bits

Summary
• Huffman encoding uses frequency information to compress a file.
• The most frequent characters get the shortest prefix codes, and the least frequent get the longest.
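
The merge steps shown earlier can be sketched with a priority queue. This is a minimal illustration, not the course's reference implementation: `huffman_code` is a hypothetical name, leaves are plain characters, internal nodes are `(left, right)` tuples, and a running counter breaks ties between equal weights. Different tie-breaking yields different trees, but every Huffman tree for this input costs the same 111 bits.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(text):
    """Build a prefix code for text by repeatedly merging the two lightest trees."""
    freq = Counter(text)
    if not freq:
        return {}
    tiebreak = count()  # unique sequence numbers so tree payloads are never compared
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct character still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # lightest tree
        f2, _, right = heapq.heappop(heap)   # second lightest tree
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: 0 goes left, 1 goes right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: record the accumulated bits
            code[node] = prefix
    walk(heap[0][2], "")
    return code

text = "If a woodchuck could chuck wood!"
code = huffman_code(text)
print(sum(len(code[ch]) for ch in text))  # 111
```

The individual code words may differ from the table on the slides because ties are broken differently, while the total length stays the same.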

Further Reading
• Chapter 12

What's Next
• Hash Tables