Trees IV: Huffman Trees

Logistics
• Project 2
  – Minimal submission due Sunday
  – Please don’t miss the minimum submission
• Submit early!
• Submit often!

Logistics
• Exam 2
  – Next Wednesday, May 1st
  – Will cover:
    • Recursion
    • Analysis of Algorithms
    • Searching
    • Sorting
    • Trees
  – QA session Monday.

Huffman Trees
• Another “real live” application of binary trees
• Binary (i.e. 0, 1) encoding of characters

Huffman Trees
• Suppose we want to “encode” a text message into a sequence of 1’s and 0’s:
  – Each character will be given a binary code
  – No code for one character is a prefix of the code for another character
  – More frequently used characters have shorter codes.

Huffman Trees
• Example:
  – Say we wish to encode 5 characters a, b, c, d, e
  – One possible encoding:

    Char  Code
    a     000
    b     001
    c     010
    d     011
    e     100
Huffman Trees
• Using this encoding

    Char  Code
    a     000
    b     001
    c     010
    d     011
    e     100

• abca = 000 001 010 000

Huffman Trees
• Suppose we are given frequency of occurrences for each character

    Char  Occurrence  Code
    a     12 %        000
    b     40 %        11
    c     15 %        01
    d      8 %        001
    e     25 %        10

Huffman Trees
• Chars that occur more frequently have shorter codes.

    Char  Occurrence  Code
    a     12 %        000
    b     40 %        11
    c     15 %        01
    d      8 %        001
    e     25 %        10

Huffman Trees
• No code is a prefix of any other code

    Char  Code
    a     000
    b     11
    c     01
    d     001
    e     10

• Using this encoding
  – abca = 000 11 01 000
  – 10 bits
• Using last encoding
  – abca = 000 001 010 000
  – 12 bits

Huffman Trees
• This sequence of codes can be represented by a binary tree:
  – Leaves represent characters
  – Following left child represents appending a 0 to a code
  – Following right child represents appending a 1 to a code

Huffman Trees
• To obtain a code for a given character
  – Start at the root
  – Find a path to the character’s leaf node
  – Append a 0 to a code every time you follow a left child
  – Append a 1 to a code every time you follow a right child.
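The root-to-leaf procedure above can be sketched in a few lines of Python. This is only an illustration, not the lab solution: the nested-tuple tree representation and the names TREE, codes_from_tree, encode and is_prefix_free are assumptions made for this example. The tree is the one implied by the variable-length code table (a 000, d 001, c 01, e 10, b 11).

```python
# Assumed representation (not from the slides): a leaf is a one-character
# string, an internal node is a (left, right) tuple.
TREE = ((("a", "d"), "c"), ("e", "b"))

def codes_from_tree(node, prefix=""):
    """Return {char: code} by walking every root-to-leaf path."""
    if isinstance(node, str):        # leaf: the path taken so far is the code
        return {node: prefix}
    left, right = node
    table = codes_from_tree(left, prefix + "0")          # left child appends 0
    table.update(codes_from_tree(right, prefix + "1"))   # right child appends 1
    return table

def encode(text, table):
    """Concatenate the code of each character in text."""
    return "".join(table[ch] for ch in text)

def is_prefix_free(table):
    """True if no code is a prefix of a different code."""
    codes = list(table.values())
    return not any(a != b and b.startswith(a) for a in codes for b in codes)

CODES = codes_from_tree(TREE)
# The fixed-length table from the first example, for comparison.
FIXED = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100"}

print(CODES)
print(len(encode("abca", FIXED)), "bits with the fixed-length codes")
print(len(encode("abca", CODES)), "bits with the variable-length codes")
print(is_prefix_free(CODES))
```

Under these assumptions, encoding abca takes 12 bits with the fixed-length table and 10 bits with the variable-length one, matching the counts on the slides.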
Huffman Tree

                (root)
              0/      \1
             •          •
          0/   \1    0/   \1
          •     c    e     b
       0/   \1
       a     d

    Char  Code
    a     000
    b     11
    c     01
    d     001
    e     10

Huffman Tree
• Find the code for a
  – Path from the root: left, left, left
  – Code: 0 0 0

Huffman Tree
• Find the code for e
  – Path from the root: right, left
  – Code: 1 0

Decoding using a Huffman Tree
• Start with the root
• For each “bit” follow an edge
• When you get to leaf, write the char associated with the leaf
• Go back to the root.

Decoding using a Huffman Tree
• Decode: 00100101100011101
  – 001 001 01 10 001 11 01
  – d d c e d b c

Huffman Coding
• How to build these Huffman trees
  – Given:
    • Set of characters to be encoded
    • A “weight” assigned to each character (indicating its frequency of occurrence).
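The decoding steps listed above (start at the root, follow one edge per bit, emit a character at each leaf, return to the root) can be sketched as follows. The nested-tuple tree and the function name decode are assumptions made for this illustration; the tree encodes the same codes as the table on the slides.

```python
# Assumed representation (not from the slides): a leaf is a one-character
# string, an internal node is a (left, right) tuple.
TREE = ((("a", "d"), "c"), ("e", "b"))

def decode(bits, root):
    """Decode a string of '0'/'1' characters by walking the tree."""
    out = []
    node = root                      # start with the root
    for bit in bits:
        # Follow the left edge on '0'; any other character is treated as '1'.
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):    # reached a leaf: write its character
            out.append(node)
            node = root              # go back to the root
    if node is not root:
        raise ValueError("bit string ends in the middle of a code")
    return "".join(out)

print(decode("00100101100011101", TREE))  # ddcedbc
```

Because no code is a prefix of another, the walk never has to guess where one character ends and the next begins; the first leaf reached is always the right one.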
Huffman Coding
• How to build these Huffman trees
  1. Begin with a forest of trees. All trees are one node with the weight equal to the weight of the character.
  2. Repeat until there is only 1 tree
     1. Choose 2 trees T1 and T2 with the smallest weights and combine, creating a new tree with left subtree = T1 and right subtree = T2. The weight of the new tree is the sum of the two weights.

Huffman Coding
• Start: Each char is its own tree

    12 (a)   40 (b)   15 (c)   8 (d)   25 (e)

• Combine trees with smallest weights: a & d

    20 [12 (a), 8 (d)]   40 (b)   15 (c)   25 (e)

Huffman Coding
• Combine trees with smallest weights: the 20 tree & c

    35 [20 [12 (a), 8 (d)], 15 (c)]   40 (b)   25 (e)

Huffman Coding
• Combine trees with smallest weights: the 35 tree & e

    60 [35 [20 [12 (a), 8 (d)], 15 (c)], 25 (e)]   40 (b)

Huffman Coding
• Finally, combine the last 2 trees

    100 [60 [35 [20 [12 (a), 8 (d)], 15 (c)], 25 (e)], 40 (b)]

Huffman Coding
• Resulting codes (left edge = 0, right edge = 1)

    Char  Code
    a     0000
    b     1
    c     001
    d     0001
    e     01
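The forest-merging loop above can be sketched with a priority queue. This is a minimal illustration under stated assumptions, not the Lab 9 solution: it uses Python's heapq module, represents a leaf as a one-character string and an internal node as a (left, right) tuple, and the names build_huffman_tree, codes_from_tree and WEIGHTS are made up for this example. It also assumes at least one character is given.

```python
import heapq
from itertools import count

def build_huffman_tree(weights):
    """Greedily merge the two lightest trees until one tree remains."""
    tiebreak = count()  # unique counter so equal weights never compare trees
    heap = [(w, next(tiebreak), ch) for ch, w in weights.items()]
    heapq.heapify(heap)                      # forest of one-node trees
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)      # smallest weight
        w2, _, t2 = heapq.heappop(heap)      # second smallest weight
        # New tree: left subtree = T1, right subtree = T2, weight = sum.
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))
    return heap[0][2]

def codes_from_tree(node, prefix=""):
    """Return {char: code}; left edges append 0, right edges append 1."""
    if isinstance(node, str):
        return {node: prefix or "0"}         # lone-leaf tree still gets one bit
    left, right = node
    table = codes_from_tree(left, prefix + "0")
    table.update(codes_from_tree(right, prefix + "1"))
    return table

WEIGHTS = {"a": 12, "b": 40, "c": 15, "d": 8, "e": 25}
tree = build_huffman_tree(WEIGHTS)
codes = codes_from_tree(tree)
for ch in sorted(codes):
    print(ch, codes[ch])
```

The exact bit patterns depend on which subtree is placed on the left at each merge, so they need not match the table on the slides. The code lengths do match: 4 bits for a and d, 3 for c, 2 for e, and 1 for b.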
Huffman Coding
• This is an example of a greedy algorithm.
  – Only considers information available during a given iteration.
  – Local decision → Global solution

Summary
• Huffman Coding
  – Used to encode characters into 0’s and 1’s
  – More frequent characters have shorter codes
  – Result represented by a binary tree
  – Built using a greedy algorithm
• You will be implementing the Huffman coding algorithm in Lab 9.
  – Questions?

Next time
• Introduction to hashing