Data Structures in Java
Lecture 20: Algorithm Design Techniques
12/2/2015
Daniel Bauer
Algorithms and Problem Solving
• Purpose of algorithms: find solutions to problems.
• Data structures provide ways of organizing data so that problems can be solved more efficiently.
• Examples: hash maps provide constant-time access by key; heaps provide a cheap way to explore different possibilities in order.
• When confronted with a new problem, how do we:
  • get an idea of how difficult it is?
  • develop an algorithm to solve it?
Common Types of Algorithms
• Greedy Algorithms
• Divide and Conquer
• Dynamic Programming
• We have already seen some examples of each.
• We will look at the general techniques and some additional examples.
Greedy Algorithms
“Take what you can get now”
• The algorithm runs in multiple “phases” or “steps”. In each phase, a local decision is made that appears to be good.
• Making a local decision is fast (often O(log N) time). Examples: Dijkstra’s, Prim’s, Kruskal’s.
• Greedy algorithms assume that making locally optimal decisions leads to a global optimum.
• This works for some problems.
• For many others it doesn’t, but greedy algorithms are still useful for finding approximate solutions.
ASCII Encoding
• The ASCII code contains 128 characters (about 100 printable characters plus special characters).
• Each character needs 7 bits of space.

Character  Decimal  Binary
⋮
A          65       1000001
B          66       1000010
C          67       1000011
D          68       1000100
E          69       1000101
⋮
a          97       1100001
b          98       1100010
c          99       1100011
d          100      1100100
e          101      1100101
⋮

Can we store data more efficiently?
A 5-Character Alphabet (plus space and newline)

Character  Decimal  Binary
a          0        000
e          1        001
i          2        010
s          3        011
t          4        100
space      5        101
newline    6        110
A 5-Character Alphabet (plus space and newline)
Assume we see each character with a certain frequency in a text file. We can then compute the total number of bits required to store the file.

Character  Code  Frequency  Total bits
a          000   10         30
e          001   15         45
i          010   12         36
s          011   3          9
t          100   4          12
space      101   13         39
newline    110   1          3
                 Total:     174
Prefix Trees

[Figure: a complete binary tree of depth 3, edges labeled 0/1, with leaves a, e, i, s, t, sp, nl at codes 000 through 110.]

Character  Binary
a          000
e          001
i          010
s          011
t          100
space      101
newline    110

The code of a character is read off the path from the root. File size = Σ_i d_i · f_i, where d_i is the depth of character i in the tree and f_i is the frequency of i in the file.

Can we restructure the tree to minimize the file size?
Prefix Trees
The prefix “11” is not used for any character other than nl, so nl can be moved up one level, to code “11”.

[Figure: the same tree with nl moved to position “11” (depth 2); a, e, i, s, t remain at depth 3, sp at “101”.]

Character  Binary
a          000
e          001
i          010
s          011
t          100
space      101
newline    11

Reducing nl’s depth d_nl reduces the file size Σ_i d_i · f_i.
Prefix Trees
We cannot place characters on interior nodes, or else encoded sequences would be ambiguous.

[Figure: a tree with characters on interior nodes; the bit string 000110 can then be segmented in more than one way, e.g. as 00·01·10 or as 000·11·0, giving two different decodings.]

If characters appear only at leaves, no codeword is a prefix of another codeword (a prefix code), so every bit string has at most one decoding.
Huffman Code

Character  Code   Frequency
e          01     15
sp         11     13
i          10     12
a          001    10
t          0001   4
s          00000  3
nl         00001  1

[Figure: the corresponding prefix tree; this tree minimizes the file size Σ_i d_i · f_i.]

• All characters are at leaves.
• Frequent characters have short codes.
• Rare characters have long codes.
Huffman Code
Example: the encoded string 0000001001000100001 segments uniquely as 00000·01·001·0001·00001 = s e a t nl.
Total size: 146 bits.
• This example: saves 16% space compared to the standard 3-bit code (146 vs. 174 bits).
• Typically compression is much better (for larger files and alphabets).
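To make the decoding step concrete, here is a minimal Java sketch of walking a prefix tree bit by bit (an illustration added here, not code from the lecture; the Node and PrefixDecoder names are made up):

```java
// A Node is either an interior node with two children or a leaf holding a character.
class Node {
    Node zero, one;   // children for bits 0 and 1
    Character symbol; // non-null only at leaves

    Node(Node zero, Node one) { this.zero = zero; this.one = one; }
    Node(char symbol) { this.symbol = symbol; }
    boolean isLeaf() { return symbol != null; }
}

class PrefixDecoder {
    // Walk the tree from the root; every time we reach a leaf, emit its
    // character and restart at the root. Decoding is unambiguous because
    // characters appear only at leaves (no codeword is a prefix of another).
    static String decode(Node root, String bits) {
        StringBuilder out = new StringBuilder();
        Node cur = root;
        for (int i = 0; i < bits.length(); i++) {
            cur = (bits.charAt(i) == '0') ? cur.zero : cur.one;
            if (cur.isLeaf()) {
                out.append(cur.symbol);
                cur = root;
            }
        }
        return out.toString();
    }
}
```

With the tree from the slide, decode(root, "0000001001000100001") produces s, e, a, t, newline, matching the segmentation above.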
Huffman’s Algorithm

Initial forest (one single-node tree per character):
a:10  e:15  i:12  s:3  t:4  sp:13  nl:1

• Maintain a forest of prefix trees.
• Weight of a tree T = sum of the frequencies of the characters in T.
Huffman’s Algorithm
• In every phase: choose the two trees with the smallest weight and merge them.
Phase 1: merge s (3) and nl (1) into T1 (weight 4). Forest: a:10  e:15  i:12  sp:13  t:4  T1:4
Huffman’s Algorithm
• In every phase: choose the two trees with the smallest weight and merge them.
Phase 2: merge T1 (4) and t (4) into T2 (weight 8). Forest: a:10  e:15  i:12  sp:13  T2:8
Huffman’s Algorithm
• In every phase: choose the two trees with the smallest weight and merge them.
Phase 3: merge T2 (8) and a (10) into T3 (weight 18). Forest: e:15  i:12  sp:13  T3:18
Huffman’s Algorithm
• In every phase: choose the two trees with the smallest weight and merge them.
Phase 4: merge i (12) and sp (13) into T4 (weight 25). Forest: e:15  T3:18  T4:25
Huffman’s Algorithm
• In every phase: choose the two trees with the smallest weight and merge them.
Phase 5: merge e (15) and T3 (18) into T5 (weight 33). Forest: T4:25  T5:33
Huffman’s Algorithm
Phase 6: merge T4 (25) and T5 (33) into T6 (weight 58). T6 is the final Huffman tree.
• Selecting the two minimum-weight trees: O(log N) each. We do this N − 1 times: O(N log N) overall.
• This is clearly a greedy algorithm: in every phase we merge the two lowest-weight trees.
• Keep the trees of the forest on a heap (priority queue), as in the sketch below.
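A minimal Java sketch of this construction using java.util.PriorityQueue as the heap (an illustration added here, not code from the lecture; the HuffmanTree and build names are made up):

```java
import java.util.PriorityQueue;

// Each tree carries its weight = sum of the frequencies of its leaf characters.
class HuffmanTree implements Comparable<HuffmanTree> {
    int weight;
    Character symbol;          // non-null only for single-character leaves
    HuffmanTree left, right;

    HuffmanTree(char symbol, int weight) { this.symbol = symbol; this.weight = weight; }
    HuffmanTree(HuffmanTree left, HuffmanTree right) {
        this.left = left;
        this.right = right;
        this.weight = left.weight + right.weight; // merged tree's weight
    }

    @Override public int compareTo(HuffmanTree other) {
        return Integer.compare(weight, other.weight);
    }

    // Repeatedly merge the two lowest-weight trees in the forest.
    // Each poll/add is O(log N); the N - 1 merges give O(N log N) total.
    static HuffmanTree build(char[] symbols, int[] frequencies) {
        PriorityQueue<HuffmanTree> forest = new PriorityQueue<>();
        for (int i = 0; i < symbols.length; i++)
            forest.add(new HuffmanTree(symbols[i], frequencies[i]));
        while (forest.size() > 1)
            forest.add(new HuffmanTree(forest.poll(), forest.poll()));
        return forest.poll();
    }
}
```

Calling build with the slide's alphabet and frequencies (a:10, e:15, i:12, s:3, t:4, sp:13, nl:1) yields a tree of weight 58. Ties may be broken differently than on the slides, so the individual codes can differ, but any Huffman tree gives the same optimal total of 146 bits.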
Divide and Conquer Algorithms
• Algorithms consist of two parts:
• Divide: decompose the problem into smaller sub-problems and solve each sub-problem recursively (down to the base case).
• Conquer: solve the problem by combining the solutions to the sub-problems.
Divide and Conquer: Example Algorithms
• Merge Sort, Quick Sort.
• Binary Search.
• Towers of Hanoi.
• These algorithms work efficiently because:
  • the subproblems are independent, and
  • solving the subproblems first makes the overall problem easier.
Merge Sort
• Split the array in half, recursively sort each half.
• Merge the two sorted lists. (A sketch follows the trace below.)

Input:             34 8 64 2 51 32 21 1
Split:             34 8 64 2 | 51 32 21 1
Sort each half:    2 8 34 64 | 1 21 32 51
Merge:             1 2 8 21 32 34 51 64
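A compact Java sketch of this procedure (an illustrative implementation, not code from the lecture):

```java
import java.util.Arrays;

public class MergeSort {
    public static void sort(int[] a) {
        if (a.length <= 1) return;                        // base case: already sorted
        int mid = a.length / 2;
        int[] left  = Arrays.copyOfRange(a, 0, mid);      // divide into two halves
        int[] right = Arrays.copyOfRange(a, mid, a.length);
        sort(left);                                       // recursively sort each half
        sort(right);
        merge(a, left, right);                            // conquer: merge sorted halves
    }

    // Merge two sorted arrays back into a in O(N),
    // always taking the smaller of the two front elements.
    private static void merge(int[] a, int[] left, int[] right) {
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length)
            a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        while (i < left.length)  a[k++] = left[i++];
        while (j < right.length) a[k++] = right[j++];
    }
}
```

Running sort on {34, 8, 64, 2, 51, 32, 21, 1} produces {1, 2, 8, 21, 32, 34, 51, 64}, matching the trace above.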
Merge Sort Running Time
• Base case: N = 1 (sort a 1-element list): T(1) = 1.
• Recurrence: T(N) = 2 T(N/2) + N
  (recursively sort each half, then merge the two halves in O(N)).
Running Time Analysis for Merge Sort (and Quick Sort with a Perfect Pivot)
Assume N = 2^k. Divide the recurrence T(N) = 2 T(N/2) + N by N:
  T(N)/N = T(N/2)/(N/2) + 1
Telescoping this over the k = log N levels of recursion:
  T(N)/N = T(1)/1 + log N
so T(N) = N log N + N = O(N log N).
Running Time of Divide and Conquer Algorithms: the “Master Theorem”
Most divide and conquer algorithms have a running time equation of the form
  T(N) = a T(N/b) + Θ(N^k),   where a ≥ 1 and b > 1.
The “Master Theorem” states that this recurrence relation has the following solution:
  T(N) = O(N^(log_b a))   if a > b^k
  T(N) = O(N^k log N)     if a = b^k
  T(N) = O(N^k)           if a < b^k
Master Theorem: Merge Sort
Example: Merge Sort has T(N) = 2 T(N/2) + Θ(N), so a = 2, b = 2, k = 1.
Since a = b^k = 2, this is Case 2: T(N) = O(N^k log N) = O(N log N).
Dynamic Programming Algorithms
• In some cases, naive recursive algorithms (such as the ones used for Divide and Conquer) are very inefficient.
• That’s because the solution to a sub-problem is needed more than once, and plain recursion recomputes it every time.
• Merge Sort works because each partition is processed exactly once.
• Dynamic Programming algorithms solve this problem by systematically recording the solutions to sub-problems in a table and re-using them later (see the sketch below).
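The textbook illustration of this idea is computing Fibonacci numbers (an example added here, not from the lecture). Plain recursion solves the same sub-problems exponentially often; recording each answer in a table makes the computation linear:

```java
public class Fibonacci {
    // Memoized Fibonacci: table[n] caches fib(n) once computed,
    // so each sub-problem is solved exactly once -> O(N) total.
    static long fibMemo(int n, long[] table) {
        if (n <= 1) return n;                // base cases: fib(0) = 0, fib(1) = 1
        if (table[n] != 0) return table[n];  // reuse a recorded sub-solution
        table[n] = fibMemo(n - 1, table) + fibMemo(n - 2, table);
        return table[n];
    }

    public static void main(String[] args) {
        System.out.println(fibMemo(40, new long[41])); // prints 102334155
    }
}
```

Without the table, the same call would take time exponential in N, because fib(N − 2) is recomputed inside both fib(N) and fib(N − 1).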