CS 758/858: Algorithms
http://www.cs.unh.edu/~ruml/cs758

Greedy
Huffman Coding

Wheeler Ruml (UNH), Class 12, CS 758
Greedy Algorithms
Greedy

Make best local choice, then solve remaining subproblem.

E.g., optimal solution uses the greedy choice + optimal solution to remaining subproblem.

Unlike DP, haven't already solved subproblems, don't need to pick 'best' subsolution to use.
Activity Selection

Given n activities, {1, 2, ..., n}, the i-th activity corresponding to an interval starting at s(i) and finishing at f(i), find a compatible set (no two chosen intervals overlap) with maximum size.

Make a choice: at each step, select the next activity to include in the set.

Is there a rule?
"Rules" for Activity Selection

■ Earliest start time
■ Earliest finish time
■ Smallest interval
■ Least conflicts

Try to make a decision that is good locally, before solving remaining subproblem.
Is best decision independent of remaining solution?
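One way to test a candidate rule is to run it on a small instance and compare the size of the result. The sketch below is illustrative only: the helper names `conflicts` and `greedy_by` and the four-activity instance are assumptions, not part of the slides, and activities are assumed to be (start, finish) pairs with start < finish. It covers three of the four rules; least conflicts is left out for brevity.

```python
from typing import Callable, List, Tuple

Activity = Tuple[int, int]  # (start, finish), assumed start < finish


def conflicts(a: Activity, b: Activity) -> bool:
    """Two activities conflict when their intervals overlap (touching ends do not)."""
    return a[0] < b[1] and b[0] < a[1]


def greedy_by(rule: Callable[[Activity], int],
              activities: List[Activity]) -> List[Activity]:
    """Repeatedly take the activity minimizing `rule`, then drop everything it conflicts with."""
    remaining = list(activities)
    chosen: List[Activity] = []
    while remaining:
        t = min(remaining, key=rule)
        chosen.append(t)
        # t overlaps itself, so this filter also removes t.
        remaining = [s for s in remaining if not conflicts(s, t)]
    return chosen


# Hypothetical instance: one long activity plus a short one straddling two others.
acts = [(0, 20), (1, 6), (5, 8), (7, 12)]

by_start = greedy_by(lambda a: a[0], acts)          # earliest start time
by_length = greedy_by(lambda a: a[1] - a[0], acts)  # smallest interval
by_finish = greedy_by(lambda a: a[1], acts)         # earliest finish time

print(len(by_start), len(by_length), len(by_finish))
```

On this instance, earliest start time and smallest interval each select a single activity, while earliest finish time selects two, which is the maximum possible here.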
The Algorithm

Make greedy choice, then solve remaining subproblem:

R ← all activities
A ← {}
while R ≠ {}
    let t = activity in R with earliest finish time
    R ← R \ {s : s conflicts with t}    (t conflicts with itself, so t is removed too)
    A ← A ∪ {t}
return A

Is this optimal?
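The pseudocode above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: activities are (start, finish) pairs, an activity that starts exactly when another finishes is treated as compatible, and the function name `select_activities` and the sample data are illustrative. Sorting once by finish time stands in for the repeated "earliest finish time in R" step.

```python
from typing import List, Tuple

Activity = Tuple[int, int]  # (start, finish)


def select_activities(activities: List[Activity]) -> List[Activity]:
    """Greedy activity selection by earliest finish time."""
    chosen: List[Activity] = []
    last_finish = float("-inf")
    # Scanning in finish-time order visits candidates in the order the loop would pick them.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        # Skipping an overlapping activity plays the role of R <- R \ {conflicts}.
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen


# Hypothetical sample data.
example = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
           (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
schedule = select_activities(example)
print(schedule)  # [(1, 4), (5, 7), (8, 11), (12, 16)]
```

With the sort, this version takes O(n log n) time; the set-based loop in the pseudocode is easier to reason about in the proofs but rescans R on each iteration.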
Proving Greedy Optimal

Need to show:

1. greedy choice is optimal: there exists an optimal solution that uses it
2. optimal substructure: the remaining subproblem can be solved the same way
The Greedy Choice Property

Prove that first choice in optimal solution can be made greedily:

■ Let ⟨a_1, a_2, ..., a_i⟩ be an optimal schedule, ordered by time.
■ If a_1 is the activity with the earliest finish time then the greedy choice is within some optimal solution.
■ If a_1 is not the activity with the earliest finish time then there must exist an activity b with an earlier finish time (f(b) < f(a_1)).
■ b will be compatible with a_2, since f(b) < f(a_1) ≤ s(a_2), so ⟨b, a_2, ..., a_i⟩ is also an optimal solution.

This applies recursively to the subproblems: recall that ⟨a_2, ..., a_i⟩ is an optimal sub-solution.
Optimal Substructure

Prove that optimal solution contains optimal solution to remaining subproblem after greedy choice:

■ Let ⟨a_1, a_2, ..., a_i⟩ be an optimal schedule.
■ For the sake of contradiction, assume ⟨a_k, ..., a_i⟩ is a suboptimal sub-schedule for the time after activity a_{k−1}.
■ So, there exists a sequence ⟨b_1, ..., b_j⟩ that is a better schedule for this time interval (j > i − k + 1, since ⟨a_k, ..., a_i⟩ has i − k + 1 activities).
■ Then, ⟨a_1, ..., a_{k−1}, b_1, ..., b_j⟩ must be a better schedule.
■ Then, our optimal schedule was suboptimal: contradiction!
■ So our assumption must not hold. Sub-schedule must be optimal.
Summary of Greedy Algorithms

Make best local choice, then solve remaining subproblem.

E.g., optimal solution uses the greedy choice + optimal solution to remaining subproblem.

1. prove greedy choice is safe (an optimal solution uses that choice): substitute greedy choice in optimal solution
2. prove optimal substructure (optimal solution uses optimal solutions of subproblems): assume suboptimal, then derive contradiction
Break

■ Thu Oct 3 asst6 due, asst7 out
■ Fri Oct 4 review Q&A
■ Tue Oct 8 midterm
■ Thu Oct 10 graphs, asst8 out
■ Tue Oct 15 is a Mon: no class, but asst7 due
■ Thu Oct 17 components
Huffman Coding
The Problem

Given a table of character frequencies, find a set of prefix-free codewords that minimizes encoding length:

\[ B(T) = \sum_{c \in C} f(c) \cdot d_T(c) \]

where f(c) is the frequency of character c and d_T(c) is its depth in the code tree T, i.e., the length of its codeword.

c    f(c)   code
a    5      1
b    2      00
c    1      01

a a a b a b a c ⇒ 1 1 1 00 1 00 1 01

regular ASCII: 8 bytes = 64 bits ⇒ 11 bits (∼83% smaller)
fixed size: 8 × 2 bits = 16 bits ⇒ 11 bits (∼31% smaller)
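The arithmetic on this slide can be checked with a short Python sketch. The code table and message are taken from the slide; the helper names `encoded_length` and `is_prefix_free` are illustrative assumptions.

```python
from collections import Counter
from typing import Dict

# Code table and message from the slide.
code: Dict[str, str] = {"a": "1", "b": "00", "c": "01"}
message = "aaababac"


def encoded_length(freq: Dict[str, int], code: Dict[str, str]) -> int:
    """B(T): sum over characters of frequency times codeword length."""
    return sum(f * len(code[c]) for c, f in freq.items())


def is_prefix_free(code: Dict[str, str]) -> bool:
    """True when no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


freq = Counter(message)                          # a: 5, b: 2, c: 1
bits = "".join(code[ch] for ch in message)       # concatenated codewords
cost = encoded_length(freq, code)

print(bits, cost)                                # 11100100101 11
print(round(100 * (1 - cost / 64)))              # 83 (% smaller than 8-bit ASCII)
print(round(100 * (1 - cost / 16)))              # 31 (% smaller than 2-bit fixed size)
```

The computed cost equals the length of the concatenated bit string, which is what B(T) measures.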
Code Structure

■ frequent characters will have shorter codes
■ every internal node in the optimal code tree has two children
The Algorithm

Distinguish elements by penalizing the two least frequent:

C ← characters c tagged by frequency f(c)
Q ← Make-Min-Heap(C)
for i = 1 to |C| − 1 do
    let z be a new tree node
    z.left ← Extract-Min(Q)
    z.right ← Extract-Min(Q)
    f(z) ← f(z.left) + f(z.right)
    Heap-Insert(Q, z)
return Extract-Min(Q)

What's the worst-case time complexity?
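The pseudocode above maps closely onto Python's `heapq` module. The following is a minimal sketch, not a reference implementation: the `Node` class, the function names `huffman` and `code_lengths`, and the tie-breaking counter are assumptions introduced here, and the input is assumed to be a non-empty frequency table.

```python
import heapq
from itertools import count
from typing import Dict, Optional


class Node:
    """Tree node: leaves carry a character, internal nodes carry two children."""

    def __init__(self, freq: int, char: Optional[str] = None,
                 left: "Optional[Node]" = None, right: "Optional[Node]" = None):
        self.freq, self.char, self.left, self.right = freq, char, left, right


def huffman(freqs: Dict[str, int]) -> Node:
    """Build a Huffman tree by repeatedly merging the two least frequent nodes."""
    tie = count()  # unique tie-breaker so heapq never has to compare Node objects
    heap = [(f, next(tie), Node(f, c)) for c, f in freqs.items()]
    heapq.heapify(heap)                               # Make-Min-Heap
    for _step in range(len(freqs) - 1):
        f_left, _, left = heapq.heappop(heap)         # Extract-Min
        f_right, _, right = heapq.heappop(heap)       # Extract-Min
        z = Node(f_left + f_right, left=left, right=right)
        heapq.heappush(heap, (z.freq, next(tie), z))  # Heap-Insert
    return heapq.heappop(heap)[2]


def code_lengths(node: Node, depth: int = 0) -> Dict[str, int]:
    """Map each character to its depth in the tree, i.e. its codeword length."""
    if node.char is not None:
        return {node.char: depth}
    return {**code_lengths(node.left, depth + 1),
            **code_lengths(node.right, depth + 1)}


freqs = {"a": 5, "b": 2, "c": 1}
lengths = code_lengths(huffman(freqs))
cost = sum(freqs[c] * lengths[c] for c in freqs)
print(lengths, cost)
```

For the frequency table from the earlier example (a: 5, b: 2, c: 1) this yields codeword lengths 1, 2, 2 and a total cost of 11 bits. Each of the |C| − 1 iterations performs a constant number of heap operations costing O(log |C|) each, so this sketch runs in O(|C| log |C|) time.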
Proving that Greedy is Optimal

Show that

1. greedy choice is optimal (optimal solution can use greedy choice)
2. the greedy choice plus an optimal solution to the remaining subproblem is an optimal solution for the larger problem
The Greedy Choice is Optimal

Any code without greedy choice can be improved by it:

Let x and y be the two least frequent characters and a and b be siblings at the deepest depth in T. If they are not the same, we can swap x and y for a and b without increasing the cost of the code.

Proof: Consider swapping x and a to get T′.

\[
\begin{aligned}
B(T) - B(T') &= \sum_{c \in C} f(c) \cdot d_T(c) - \sum_{c \in C} f(c) \cdot d_{T'}(c) \\
&= f(a) \cdot d_T(a) + f(x) \cdot d_T(x) - f(a) \cdot d_{T'}(a) - f(x) \cdot d_{T'}(x) \\
&= f(a) \cdot d_T(a) + f(x) \cdot d_T(x) - f(a) \cdot d_T(x) - f(x) \cdot d_T(a) \\
&= (f(a) - f(x)) (d_T(a) - d_T(x)) \\
&\ge 0
\end{aligned}
\]

Both factors are non-negative: f(a) ≥ f(x) because x is a least frequent character, and d_T(a) ≥ d_T(x) because a is at the deepest depth in T.
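The identity in the proof can be checked numerically on a small tree. The frequencies and depths below are a hypothetical example chosen for illustration (x and y least frequent, a and b siblings at the deepest level); the helper name `cost` is also an assumption.

```python
from typing import Dict


def cost(freq: Dict[str, int], depth: Dict[str, int]) -> int:
    """B(T) for a tree described only by the depth of each character."""
    return sum(freq[c] * depth[c] for c in freq)


# Hypothetical tree T: x sits near the root, a and b are the deepest siblings.
freq = {"x": 1, "y": 2, "a": 5, "b": 7}
depth_T = {"x": 1, "y": 2, "a": 3, "b": 3}

# T' is T with the positions (and therefore depths) of x and a exchanged.
depth_T_prime = dict(depth_T, x=depth_T["a"], a=depth_T["x"])

difference = cost(freq, depth_T) - cost(freq, depth_T_prime)
closed_form = (freq["a"] - freq["x"]) * (depth_T["a"] - depth_T["x"])

print(cost(freq, depth_T), cost(freq, depth_T_prime))  # 41 33
print(difference, closed_form)                         # 8 8
```

Here the swap lowers the cost from 41 to 33, and the difference matches the closed form (f(a) − f(x))(d_T(a) − d_T(x)) = 4 · 2 = 8.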