cs4102 algorithms

CS4102 Algorithms, Fall 2018



  1. CS4102 Algorithms Fall 2018 • Warm up: Decode the line below into English (hint: use Google or Wolfram Alpha) ·· ·-·· ·· -·- · ·- ·-·· --· --- ·-· ·· - ···· -- ···


  3. Interval Scheduling Run Time
     Find event ending earliest, add to solution. Remove it and all conflicting events. Repeat until all events removed, return solution.
     Equivalent way:
       StartTime = 0
       For each interval (in order of finish time):      O(n)
         if start of interval < StartTime:               O(1)
           do nothing
         else:
           add interval to solution                      O(1)
           StartTime = end of interval
     The loop is O(n) in total; sorting the intervals by finish time beforehand costs O(n log n).
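
The "equivalent way" loop can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the name `schedule`, the `(start, end)` tuple representation, and the sample `EVENTS` list are assumptions, and it further assumes non-negative start times and that an interval may begin exactly when the previous one ends.

```python
def schedule(intervals):
    """Greedy interval scheduling: repeatedly keep the earliest-finishing
    interval that does not conflict with the ones already chosen."""
    chosen = []
    start_time = 0  # earliest time at which the next chosen interval may start
    # Sorting by finish time is the O(n log n) step; the loop itself is O(n).
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start < start_time:
            continue  # conflicts with the last chosen interval: do nothing
        chosen.append((start, end))
        start_time = end
    return chosen


# Hypothetical sample data, not taken from the slides.
EVENTS = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
```

Calling `schedule(EVENTS)` keeps `(1, 4)`, `(5, 7)` and `(8, 11)`.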

  4. Interval Scheduling Algorithm • Find event ending earliest, add to solution • Remove it and all conflicting events • Repeat until all events removed, return solution

  5. Today’s Keywords • Greedy Algorithms • Choice Function • Prefix-free code • Compression • Huffman Code

  6. CLRS Readings • Chapter 16

  7. Homeworks • HW6 Due Friday Nov 9 @11pm – Written (use LaTeX) – DP and Greedy

  8. Sam Morse • Engineer and artist

  9. Message Encoding • Problem: need to electronically send a message to two people at a distance. • Channel for message is binary (either on or off)

  10. Character Frequency Encoding
      How can we do it?
      Message:
        wiggle, wiggle, wiggle like a gypsy queen
        wiggle, wiggle, wiggle all dressed in green
      • Take the message, send it over character-by-character with an encoding
      Character  Frequency  Encoding
      a          2          0000
      d          2          0001
      e          13         0010
      g          14         0011
      i          8          0100
      k          1          0101
      l          9          0110
      n          3          0111
      p          1          1000
      q          1          1001
      r          2          1010
      s          3          1011
      u          1          1100
      w          6          1101
      y          2          1110

  11. How efficient is this?
      Message:
        wiggle wiggle wiggle like a gypsy queen
        wiggle wiggle wiggle all dressed in green
      Each character requires 4 bits: $\ell_c = 4$
      Cost of encoding: $B(T, \{f_c\}) = \sum_c f_c \ell_c = 68 \cdot 4 = 272$ bits
      Better Solution: Allow for different characters to have different-size encodings (high frequency → short code)
      Character  Frequency $f_c$  Encoding (table $T$)
      a          2                0001
      d          2                0010
      e          13               0011
      g          14               0100
      i          8                0101
      k          1                0110
      l          9                0111
      n          3                1000
      p          1                1001
      q          1                1010
      r          2                1011
      s          3                1100
      u          1                1101
      w          6                1110
      y          2                1111
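
As a rough check of the frequency table and the fixed-length cost, the counts can be recomputed from the message. The sketch below is only illustrative; the names `MESSAGE`, `letter_frequencies` and `fixed_length_cost` are assumptions, and punctuation and spaces are ignored, as the slide's table appears to do.

```python
from collections import Counter
from math import ceil, log2

MESSAGE = ("wiggle, wiggle, wiggle like a gypsy queen "
           "wiggle, wiggle, wiggle all dressed in green")


def letter_frequencies(text):
    """Count letters only; spaces and punctuation are skipped."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def fixed_length_cost(freqs):
    """Return (bits per character, total bits) when every character
    gets a codeword of the same length."""
    bits = max(1, ceil(log2(len(freqs))))  # 15 distinct letters need 4 bits
    return bits, bits * sum(freqs.values())
```

With the slide's message, `fixed_length_cost(letter_frequencies(MESSAGE))` gives 4 bits per character and 272 bits in total.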

  12. More efficient coding • In the cost $\sum_c f_c \ell_c$: when the character frequency $f_c$ is big, make the codeword size $\ell_c$ small

  13. Morse Code [Figure: Morse code, annotated with character frequency and codeword size.]

  14. Problem with Morse Code • Ambiguous decoding: without pauses between letters, one signal (for example ·-·-) can be decoded as A A, ET ET, R T, or EN T
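
The ambiguity can be made concrete by enumerating every way to split a signal into codewords. This is a small sketch under stated assumptions: `MORSE` is only a five-letter subset of the real Morse alphabet (these five codewords are standard), written with ASCII `.` and `-`, and the function name `decodings` is hypothetical.

```python
# Subset of International Morse code, enough to show the ambiguity.
MORSE = {"A": ".-", "E": ".", "N": "-.", "R": ".-.", "T": "-"}


def decodings(signal, table=MORSE):
    """Return every letter sequence whose concatenated codewords equal signal."""
    if not signal:
        return [""]  # exactly one way to decode the empty signal
    results = []
    for letter, code in table.items():
        if signal.startswith(code):
            rest = signal[len(code):]
            results += [letter + tail for tail in decodings(rest, table)]
    return results
```

For the signal `.-.-` the result includes AA, ETET, RT and ENT, because codewords such as `.` (E) are prefixes of others such as `.-` (A).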

  15. Prefix-Free Code
      • A prefix-free code is a codeword table $T$ such that for any two characters $c_1, c_2$, if $c_1 \neq c_2$ then $code(c_1)$ is not a prefix of $code(c_2)$
      Example table:
        g  0
        e  10
        l  110
        i  1110
        w  11110
        …  …
      Example: 1111011100011010 splits as 11110 1110 0 0 110 10 = w i g g l e
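
A short sketch of why the prefix-free property matters for decoding: the decoder can emit a character as soon as the bits read so far match a codeword, with no lookahead. The table `CODE` is the part of the slide's example that is shown (the elided entries are left out), and the function names are assumptions.

```python
CODE = {"g": "0", "e": "10", "l": "110", "i": "1110", "w": "11110"}


def is_prefix_free(table):
    """True if no codeword is a prefix of a different character's codeword."""
    words = list(table.values())
    return not any(i != j and b.startswith(a)
                   for i, a in enumerate(words)
                   for j, b in enumerate(words))


def decode(bits, table):
    """Decode a bit string left to right; assumes table is prefix-free."""
    inverse = {code: ch for ch, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # a full codeword has been read
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)
```

Here `decode("1111011100011010", CODE)` yields `"wiggle"`, matching the slide's example.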

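  16. Binary Trees = Prefix-free Codes
      • I can represent any prefix-free code as a binary tree
      • I can create a prefix-free code from any binary tree
      [Figure: two binary trees with edges labeled 0 and 1; each leaf is a character, and its codeword is the sequence of edge labels on the path from the root.]
      Code for the first tree:   g 0, e 10, l 110, i 1110, w 11110, …
      Code for the second tree:  g 00, e 01, l 10, i 110, w 111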
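
The tree-to-code direction can be sketched by walking the tree and recording the edge labels on each root-to-leaf path. The nested-tuple representation (a leaf is a one-character string, an internal node is a `(left, right)` pair) and the names `TREE` and `codes_from_tree` are assumptions made for this illustration; `TREE` encodes the second code table from the slide.

```python
def codes_from_tree(tree, prefix=""):
    """Map each leaf character to the 0/1 path leading to it.
    Left edges are labeled 0 and right edges 1."""
    if isinstance(tree, str):  # leaf
        return {tree: prefix}
    left, right = tree
    table = codes_from_tree(left, prefix + "0")
    table.update(codes_from_tree(right, prefix + "1"))
    return table


# Tree for the code g 00, e 01, l 10, i 110, w 111.
TREE = (("g", "e"), ("l", ("i", "w")))
```

Because characters sit only at leaves, no codeword can be a prefix of another, which is why any binary tree yields a prefix-free code.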

  17. Goal: Shortest Prefix-Free Encoding • Input: A set of character frequencies $\{f_c\}$ • Output: A prefix-free code $T$ which minimizes $B(T, \{f_c\}) = \sum_c f_c \ell_c$ • Huffman Coding!!

  18. Greedy Algorithms
      • Require Optimal Substructure
        – Solution to larger problem contains the solution to a smaller one
        – Only one subproblem to consider!
      • Idea:
        1. Identify a greedy choice property
           • How to make a choice guaranteed to be included in some optimal solution
        2. Repeatedly apply the choice property until no subproblems remain

  19. Huffman Algorithm • Choose the least frequent pair, combine into a subtree
      Forest: G:14 E:13 L:9 I:8 W:6 N:3 S:3 A:2 D:2 R:2 Y:2 K:1 P:1 Q:1 U:1

  20. Huffman Algorithm • Choose the least frequent pair, combine into a subtree
      Forest: G:14 E:13 L:9 I:8 W:6 N:3 S:3 A:2 D:2 R:2 Y:2 K:1 P:1 2[Q:1, U:1]
      (w[x, y] denotes a subtree of total weight w with children x and y.)
      Subproblem of size $n - 1$!

  21. Huffman Algorithm • Choose the least frequent pair, combine into a subtree
      Forest: G:14 E:13 L:9 I:8 W:6 N:3 S:3 A:2 D:2 R:2 Y:2 2[Q:1, U:1] 2[K:1, P:1]

  22. Huffman Algorithm • Choose the least frequent pair, combine into a subtree
      Forest: G:14 E:13 L:9 I:8 W:6 N:3 S:3 A:2 D:2 R:2 Y:2 4[2[Q:1, U:1], 2[K:1, P:1]]

  23. Huffman Algorithm • Choose the least frequent pair, combine into a subtree
      Forest: G:14 E:13 L:9 I:8 W:6 N:3 S:3 A:2 D:2 4[R:2, Y:2] 4[2[Q:1, U:1], 2[K:1, P:1]]

  24. Huffman Algorithm • Choose the least frequent pair, combine into a subtree
      Forest: G:14 E:13 L:9 I:8 W:6 N:3 S:3 4[R:2, Y:2] 4[A:2, D:2] 4[2[Q:1, U:1], 2[K:1, P:1]]

  25. Huffman Algorithm • Choose the least frequent pair, combine into a subtree
      Forest: G:14 E:13 L:9 I:8 W:6 6[N:3, S:3] 4[R:2, Y:2] 4[A:2, D:2] 4[2[Q:1, U:1], 2[K:1, P:1]]

  26. Huffman Algorithm • Choose the least frequent pair, combine into a subtree
      [Figure: the completed tree after all merges, with edges labeled 0 and 1; its root has weight 68, the total number of characters in the message.]
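
The repeated "combine the least frequent pair" step maps naturally onto a min-heap. The following is a minimal sketch, assuming a non-empty frequency table; the names `FREQS`, `huffman_code` and `cost` are hypothetical, and ties are broken by insertion order, so the resulting tree may differ in shape from the one drawn on the slides while having the same total cost.

```python
import heapq
from itertools import count

FREQS = {"g": 14, "e": 13, "l": 9, "i": 8, "w": 6, "n": 3, "s": 3,
         "a": 2, "d": 2, "r": 2, "y": 2, "k": 1, "p": 1, "q": 1, "u": 1}


def huffman_code(freqs):
    """Build a prefix-free code by repeatedly merging the two lightest trees."""
    tiebreak = count()  # unique counter so heap entries never compare trees
    heap = [(f, next(tiebreak), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # least frequent subtree
        f2, _, t2 = heapq.heappop(heap)  # second least frequent subtree
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))

    table = {}

    def walk(tree, prefix):
        if isinstance(tree, str):  # leaf: record its codeword
            table[tree] = prefix or "0"  # single-character alphabet edge case
        else:
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")

    walk(heap[0][2], "")
    return table


def cost(freqs, table):
    """Total encoded length: sum of frequency times codeword length."""
    return sum(f * len(table[ch]) for ch, f in freqs.items())
```

For the slide's frequencies, `cost(FREQS, huffman_code(FREQS))` comes to 230 bits, compared with 68 × 4 = 272 bits for the fixed-length encoding.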

  27. Exchange argument
      • Shows correctness of a greedy algorithm
      • Idea:
        – Show exchanging an item from an arbitrary optimal solution with your greedy choice makes the new solution no worse
        – How to show my sandwich is at least as good as yours:
          • Show: “I can remove any item from your sandwich, and it would be no worse by replacing it with the same item from my sandwich”

  28. Showing Huffman is Optimal
      • Overview:
        – Show that there is an optimal tree in which the least frequent characters are siblings
          • Exchange argument
        – Show that making them siblings and solving the new smaller sub-problem results in an optimal solution
          • Proof by contradiction

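  29. Showing Huffman is Optimal
      • First Step: Show any optimal tree is “full” (each node has either 0 or 2 children)
      [Figure: a tree $T$ over leaves W, R, Y in which the red subtree containing R and Y hangs below a node with only one child, next to the tree $T'$ with that node removed.]
      $T'$ is a “better” tree than $T$, because all codes in the red subtree are shorter in $T'$, without creating any longer codes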

  30. Huffman Exchange Argument
      • Claim: if $c_1, c_2$ are the least-frequent characters, then there is an optimal prefix-free code s.t. $c_1, c_2$ are siblings
        – i.e. codes for $c_1, c_2$ are the same length and differ only by their last bit
      Case 1: Consider some optimal tree $T_{opt}$. If $c_1, c_2$ are siblings in this tree, then claim holds
      [Figure: $T_{opt}$ with $c_1$ and $c_2$ as sibling leaves.]

  31. Huffman Exchange Argument
      • Claim: if $c_1, c_2$ are the least-frequent characters, then there is an optimal prefix-free code s.t. $c_1, c_2$ are siblings
        – i.e. codes for $c_1, c_2$ are the same length and differ only by their last bit
      Case 2: Consider some optimal tree $T_{opt}$, in which $c_1, c_2$ are not siblings
      Let $a, b$ be the two characters of lowest depth that are siblings (Why must they exist?)
      Idea: show that swapping $c_1$ with $a$ does not increase cost of the tree. Similar for $c_2$ and $b$
      Assume: $f_{c_1} \le f_a$ and $f_{c_2} \le f_b$
      [Figure: $T_{opt}$ with $c_1$ and $c_2$ at separate positions and siblings $a, b$ at the lowest depth.]

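  32. Case 2: $c_1, c_2$ are not siblings in $T_{opt}$
      • Claim: the least-frequent characters ($c_1, c_2$) are siblings in some optimal tree
      $a, b$ = lowest-depth siblings
      Idea: show that swapping $c_1$ with $a$ does not increase cost of the tree.
      Assume: $f_{c_1} \le f_a$
      $B(T_{opt}) = C + f_{c_1} \ell_{c_1} + f_a \ell_a$
      $B(T') = C + f_{c_1} \ell_a + f_a \ell_{c_1}$
      where $C$ is the cost contributed by all other characters, $\ell_{c_1}, \ell_a$ are the codeword lengths in $T_{opt}$, and $T'$ is $T_{opt}$ with $c_1$ and $a$ swapped
      [Figure: $T_{opt}$ and $T'$ side by side; $c_1$ and $a$ have exchanged positions.]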

  33. Case 2: $c_1, c_2$ are not siblings in $T_{opt}$
      • Claim: the least-frequent characters ($c_1, c_2$) are siblings in some optimal tree
      $a, b$ = lowest-depth siblings
      Idea: show that swapping $c_1$ with $a$ does not increase cost of the tree.
      Assume: $f_{c_1} \le f_a$
      $B(T_{opt}) = C + f_{c_1} \ell_{c_1} + f_a \ell_a$
      $B(T') = C + f_{c_1} \ell_a + f_a \ell_{c_1}$
      Goal: $B(T_{opt}) - B(T') \ge 0 \Rightarrow T'$ optimal
      $B(T_{opt}) - B(T') = C + f_{c_1} \ell_{c_1} + f_a \ell_a - (C + f_{c_1} \ell_a + f_a \ell_{c_1})$
                          $= f_{c_1} \ell_{c_1} + f_a \ell_a - f_{c_1} \ell_a - f_a \ell_{c_1}$
                          $= f_{c_1} (\ell_{c_1} - \ell_a) + f_a (\ell_a - \ell_{c_1})$
                          $= (f_a - f_{c_1})(\ell_a - \ell_{c_1})$

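  34. Case 2: $c_1, c_2$ are not siblings in $T_{opt}$
      • Claim: the least-frequent characters ($c_1, c_2$) are siblings in some optimal tree
      $a, b$ = lowest-depth siblings
      Idea: show that swapping $c_1$ with $a$ does not increase cost of the tree.
      Assume: $f_{c_1} \le f_a$
      $B(T_{opt}) = C + f_{c_1} \ell_{c_1} + f_a \ell_a$
      $B(T') = C + f_{c_1} \ell_a + f_a \ell_{c_1}$
      $B(T_{opt}) - B(T') = (f_a - f_{c_1})(\ell_a - \ell_{c_1})$
      Here $f_a - f_{c_1} \ge 0$ because $c_1$ is a least-frequent character, and $\ell_a - \ell_{c_1} \ge 0$ because $a$ is at the lowest depth of the tree
      So $B(T_{opt}) - B(T') \ge 0$. Since $T_{opt}$ is optimal, $B(T') \ge B(T_{opt})$ as well, hence $T'$ is also optimal!
      [Figure: $T_{opt}$ and $T'$ side by side; $c_1$ and $a$ have exchanged positions.]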
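
The swap identity can be sanity-checked numerically. The sketch below uses a small made-up tree, described only by leaf depths; all names and numbers are hypothetical, and the starting tree is not claimed to be optimal. The identity $B(T) - B(T') = (f_a - f_{c_1})(\ell_a - \ell_{c_1})$ holds for any tree, which is all this check exercises.

```python
def tree_cost(freqs, depths):
    """B(T): sum over characters of frequency times leaf depth."""
    return sum(freqs[ch] * depths[ch] for ch in freqs)


def swap_leaves(depths, x, y):
    """Return the depth table after exchanging the positions of leaves x and y."""
    swapped = dict(depths)
    swapped[x], swapped[y] = depths[y], depths[x]
    return swapped


# Hypothetical example: c1 is least frequent but shallow, a is deep.
freqs = {"c1": 1, "c2": 2, "a": 5, "b": 7}
depths = {"c1": 1, "c2": 2, "a": 3, "b": 3}  # a and b are the deepest siblings

before = tree_cost(freqs, depths)
after = tree_cost(freqs, swap_leaves(depths, "c1", "a"))
predicted = (freqs["a"] - freqs["c1"]) * (depths["a"] - depths["c1"])
```

In this example `before - after` equals `predicted`, and both are non-negative, so moving the rarer character deeper did not increase the cost.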
