CPSC 490 DP Part 2: Max-IS on Arrays, LCS, Recovery, and Binary Exponentiation

Lucca Siaudzionis and Jack Spalding-Jamieson, University of British Columbia, 2020/01/23


  1. CPSC 490 DP Part 2: Max-IS on Arrays, LCS, Recovery, and Binary Exponentiation
     Lucca Siaudzionis and Jack Spalding-Jamieson
     2020/01/23
     University of British Columbia

  2. Announcements
     • Again, assignment 1 is due this Saturday!!!!!!!!
     • Assignment 2 to be released shortly after assignment 1 is closed.
     • Extra office hours by Lucca tomorrow 3PM-4:30PM at ICCS X241 (announced on Piazza).

  3. What DP is About
     (Image.) Source: Non deterministic memes for NP complete teens

  4. Maximum Independent-Set on a Line
     Input: An array of n positive integers.
     Output: The maximum sum of chosen integers, such that no two chosen values are adjacent in the array.
     Example array: 4 5 4 1 1 8 3 5

  5. Maximum Independent-Set on a Line
     Input: An array of n positive integers.
     Output: The maximum sum of chosen integers, such that no two chosen values are adjacent in the array.
     Example array: 4 5 4 1 1 8 3 5
     Answer: 21

  6. Maximum Independent-Set on a Line: Formulating a Recurrence
     The first thing we need for any DP problem is a recurrence. For each element of the array, we need to make a choice: include it or exclude it. Let's consider everything back to front!

     $$\mathrm{OPT}_n = \max(\mathrm{array}_n + \mathrm{OPT}_{n-2},\ \mathrm{OPT}_{n-1})$$

     We are defining the optimal value at the end using smaller optimal values. Given those smaller values, the time to compute the new optimal value is constant.

     Base cases: $\mathrm{OPT}_1 = \mathrm{array}_1$ and $\mathrm{OPT}_2 = \max(\mathrm{array}_1, \mathrm{array}_2)$.

  7. Maximum Independent-Set on a Line: Implementing our Recurrence (1)
     Like last class, this leads to a simple naive implementation:

         fun max_is(arr, n):
             if n == 0:
                 return 0
             if n == 1:
                 return arr[0]
             return max(arr[n-1] + max_is(arr, n-2), max_is(arr, n-1))

     Runtime: $O(2^n)$

  8. Maximum Independent-Set on a Line: Implementing our Recurrence (2)
     Like last class again, we can memoize the simple naive implementation:

         memo: hashmap (or array) from n -> max_is(arr, n); initially empty

         fun max_is(arr, n):
             if n == 0:
                 return 0
             if n == 1:
                 return arr[0]
             if n in memo:
                 return memo[n]
             memo[n] = max(arr[n-1] + max_is(arr, n-2), max_is(arr, n-1))
             return memo[n]

     Runtime: $O(n)$. The function only computes a result once for each value up to n; every later call is a lookup.
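
The memoized pseudocode above could be written in C++ roughly as follows. This is a minimal sketch, not the course's reference solution: the names `max_is_top_down` and `solve`, the use of 64-bit sums, and the `-1` "not computed" sentinel (valid only because all inputs are positive) are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: best sum using only the first n elements of arr.
// memo[k] == -1 means "not computed yet"; this relies on all inputs being positive.
std::int64_t solve(const std::vector<std::int64_t>& arr, std::size_t n,
                   std::vector<std::int64_t>& memo) {
    if (n == 0) return 0;
    if (n == 1) return arr[0];
    if (memo[n] != -1) return memo[n];  // already computed: just look it up
    memo[n] = std::max(arr[n - 1] + solve(arr, n - 2, memo),  // include arr[n-1]
                       solve(arr, n - 1, memo));              // exclude arr[n-1]
    return memo[n];
}

// Entry point: allocates the memo table and asks for the full array.
std::int64_t max_is_top_down(const std::vector<std::int64_t>& arr) {
    std::vector<std::int64_t> memo(arr.size() + 1, -1);
    return solve(arr, arr.size(), memo);
}
```

On the example array from the earlier slides, `max_is_top_down({4, 5, 4, 1, 1, 8, 3, 5})` evaluates to 21. Note that the recursion depth grows linearly with the array size, which may matter for very large inputs.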

  9. Maximum Independent-Set on a Line: Bottom-Up Implementation
     Often, it is much faster to create an iterative implementation of our recurrence. This is called a "bottom-up" implementation.

         fun max_is(arr):
             memo: hashmap (or array) from n -> max_is(arr, n); initially empty
             memo[0] = 0
             memo[1] = arr[0]
             for n from 2 to arr.size:
                 memo[n] = max(arr[n-1] + memo[n-2], memo[n-1])
             return memo[arr.size]

     Runtime: $O(n)$
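
A possible C++ rendering of the bottom-up idea is sketched below. The function name `max_is_bottom_up` and the 64-bit element type are assumptions; the table `opt` plays the role of `memo` in the pseudocode, with `opt[k]` holding the answer for the first `k` elements.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bottom-up max-IS on an array: fills opt[0..n] in increasing order of k.
std::int64_t max_is_bottom_up(const std::vector<std::int64_t>& arr) {
    const std::size_t n = arr.size();
    if (n == 0) return 0;  // guard so that arr[0] below is always valid
    std::vector<std::int64_t> opt(n + 1, 0);
    opt[0] = 0;       // no elements: sum is 0
    opt[1] = arr[0];  // one element: take it (inputs are positive)
    for (std::size_t k = 2; k <= n; ++k) {
        // Either include arr[k-1] (and skip its neighbour) or exclude it.
        opt[k] = std::max(arr[k - 1] + opt[k - 2], opt[k - 1]);
    }
    return opt[n];
}
```

Since each entry depends only on the two entries before it, the table could be replaced by two rolling variables if memory matters; the full table is kept here to mirror the pseudocode.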

  10. Longest Common Subsequence: Problem Statement
      Input: Two strings s and t.
      Output: The length of the longest common subsequence of s and t.

  11. Longest Common Subsequence: Intuition & Recurrence
      Since we have two pieces of data (two strings!), we can no longer just reduce by removing the last character/element. Instead, we can consider removing at least one of the two last characters! This gives us a two-dimensional recurrence on the first n characters of s and the first m characters of t (strings are 0-indexed, and $\mathrm{OPT}[n, 0] = \mathrm{OPT}[0, m] = 0$):

      $$\mathrm{OPT}[n, m] = \begin{cases} 1 + \mathrm{OPT}[n-1, m-1] & \text{if } s[n-1] = t[m-1] \\ \max(\mathrm{OPT}[n-1, m],\ \mathrm{OPT}[n, m-1]) & \text{otherwise} \end{cases}$$

      We have a 2D table of computations.
      Time complexity: Each entry in our table takes constant time to compute, given the previous values, so in total we have $O(|s||t|)$ time.

  12. Longest Common Subsequence: Example (1)

                 a  c  b  a
              0  0  0  0  0
           a  0
           c  0
           a  0
           b  0

  13. Longest Common Subsequence: Example (2)

                 a  c  b  a
              0  0  0  0  0
           a  0  1  1  1  1
           c  0  1  2  2  2
           a  0  1  2  2  3
           b  0  1  2  3  3

      The arrows (drawn on the original slide) tell us where an entry in the table came from.

  14. Longest Common Subsequence: Simple Bottom-Up Implementation (C++)

          #include <algorithm>
          #include <string>
          #include <vector>
          using namespace std;

          int lcs(const string &s, const string &t) {
              int n = s.size(), m = t.size();
              // memo[is][it] = LCS length of the first is chars of s and first it chars of t
              vector<vector<int>> memo(n+1, vector<int>(m+1));
              for (int is = 1; is <= n; ++is)
                  for (int it = 1; it <= m; ++it) {
                      if (s[is-1] == t[it-1])
                          memo[is][it] = 1 + memo[is-1][it-1];
                      else
                          memo[is][it] = max(memo[is][it-1], memo[is-1][it]);
                  }
              return memo[n][m];
          }

      This code is really short! Most of the difficulty in DP problems is in finding the recurrence.

  15. Longest Common Subsequence: Recovering the Solution
      We now have the length of the LCS. What is the LCS though? Can we get the actual string? Let's look at those arrows again:

                 a  c  b  a
              0  0  0  0  0
           a  0  1  1  1  1
           c  0  1  2  2  2
           a  0  1  2  2  3
           b  0  1  2  3  3

      We can actually store these arrows!

  16. Longest Common Subsequence: Recovery Implementation

          string lcs_recovery(const string &s, const string &t) {
              int n = s.size(), m = t.size();
              // [... LCS Code here ...]
              string str; int is = n, it = m;
              while (is > 0 && it > 0) {
                  if (s[is-1] == t[it-1]) {
                      str += s[is-1]; --is; --it;
                  }
                  else if (memo[is][it] == memo[is][it-1]) --it;
                  else --is;
              }
              reverse(str.begin(), str.end());
              return str;
          }
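
For reference, the table-filling step and the walk-back step can be combined into one self-contained function. The sketch below assumes the same recurrence and the same tie-breaking as the slide code; the name `lcs_string` is hypothetical. Instead of storing arrows explicitly, it re-derives each arrow from the table values while walking back from the bottom-right corner.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns one longest common subsequence of s and t (not just its length).
std::string lcs_string(const std::string& s, const std::string& t) {
    const std::size_t n = s.size(), m = t.size();
    // memo[i][j] = LCS length of the first i chars of s and the first j chars of t.
    std::vector<std::vector<int>> memo(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 1; i <= n; ++i)
        for (std::size_t j = 1; j <= m; ++j)
            memo[i][j] = (s[i - 1] == t[j - 1])
                             ? 1 + memo[i - 1][j - 1]
                             : std::max(memo[i][j - 1], memo[i - 1][j]);

    // Walk back from (n, m), following where each entry came from.
    std::string out;
    std::size_t i = n, j = m;
    while (i > 0 && j > 0) {
        if (s[i - 1] == t[j - 1]) {  // diagonal arrow: this character is in the LCS
            out += s[i - 1];
            --i;
            --j;
        } else if (memo[i][j] == memo[i][j - 1]) {
            --j;  // value came from the left
        } else {
            --i;  // value came from above
        }
    }
    std::reverse(out.begin(), out.end());  // characters were collected back to front
    return out;
}
```

With the strings from the example table, `lcs_string("acab", "acba")` yields `"acb"`, which has the length 3 shown in the bottom-right cell. Other LCS strings of the same length can exist; which one is returned depends on the tie-breaking order.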

  17. Discussion Problem: Levenshtein Distance
      Input: Two strings s and t.
      Output: The minimum number of "edits" to transform s into t, where an edit is one of:
      • Insertion of a new character
      • Deletion of an existing character
      • Substitution of a character

  18. Discussion Problem: Levenshtein Distance – Insight
      This problem seems very similar to the longest common subsequence problem. We can do something very similar, where $\mathrm{OPT}(i, j)$ is the distance between the first i characters of s and the first j characters of t (characters 1-indexed):

      $$\mathrm{OPT}(i, j) = \begin{cases} \max(i, j) & \text{if } i = 0 \lor j = 0 \\ \mathrm{OPT}(i-1, j-1) & \text{if } s_i = t_j \\ \min(\mathrm{OPT}(i-1, j-1) + 1,\ \mathrm{OPT}(i-1, j) + 1,\ \mathrm{OPT}(i, j-1) + 1) & \text{otherwise} \end{cases}$$

      Compared to LCS, the maximum is swapped for a minimum, and the cases are subtly different.
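
The recurrence above translates to a bottom-up table in much the same way as LCS. The following is a sketch under that recurrence; the function name `edit_distance` is an assumption, and the strings are 0-indexed in code, so $s_i$ corresponds to `s[i - 1]`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance between s and t via the 2D recurrence.
int edit_distance(const std::string& s, const std::string& t) {
    const std::size_t n = s.size(), m = t.size();
    // opt[i][j] = edits needed to turn the first i chars of s into the first j chars of t.
    std::vector<std::vector<int>> opt(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 0; i <= n; ++i) opt[i][0] = static_cast<int>(i);  // delete all i chars
    for (std::size_t j = 0; j <= m; ++j) opt[0][j] = static_cast<int>(j);  // insert all j chars
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            if (s[i - 1] == t[j - 1]) {
                opt[i][j] = opt[i - 1][j - 1];  // last characters match: no edit
            } else {
                opt[i][j] = 1 + std::min({opt[i - 1][j - 1],  // substitution
                                          opt[i - 1][j],      // deletion from s
                                          opt[i][j - 1]});    // insertion into s
            }
        }
    }
    return opt[n][m];
}
```

For instance, `edit_distance("kitten", "sitting")` is 3 (two substitutions and one insertion).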

  19. Fibonacci Numbers: Even Faster
      Input: A positive number $n \le 10^{18}$.
      Output: The nth Fibonacci number $F_n$, modulo $10^9 + 7$.
      Why modulo $10^9 + 7$? Fibonacci numbers grow exponentially, so for large n the exact value is far too large to store in memory.

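  20. Fibonacci Numbers: A Matrix
      The nth Fibonacci number is defined solely in terms of the (n−1)th and (n−2)th Fibonacci numbers:

      $$F_n = F_{n-1} + F_{n-2}$$

      We can represent this as a vector addition:

      $$\begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix} = \begin{pmatrix} F_{n-1} \\ F_{n-1} \end{pmatrix} + \begin{pmatrix} F_{n-2} \\ 0 \end{pmatrix}$$

      We can represent this addition as a matrix-vector multiplication!

      $$\begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_{n-1} \\ F_{n-2} \end{pmatrix}$$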

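  21. Fibonacci Numbers: Matrix Exponentiation
      We now have a recurrence based on a matrix, equivalent to our recurrence from before:

      $$\begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_{n-1} \\ F_{n-2} \end{pmatrix}$$

      What does this look like as it expands outwards?

      $$\begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{2} \begin{pmatrix} F_{n-2} \\ F_{n-3} \end{pmatrix}$$

      In general:

      $$\begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{n-2} \begin{pmatrix} F_2 \\ F_1 \end{pmatrix}$$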

  22. Fibonacci Numbers: A First Algorithm with Matrix Exponentiation
      Using our general recurrence, all we really need to do are $n - 2$ matrix multiplications to compute

      $$\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{n-2} = M^{n-2}.$$

      If $n - 2$ were a power of two, we could do something much faster: say $n - 2 = 2^k$. We want to find $M^{2^k}$. Equivalently, we want to find $(M^2)^{2^{k-1}}$. Repeating, we get $(((((M^2)^2)^2)^2) \ldots)^2$, with k squarings in total. Each squaring halves the exponent that remains.
      We can actually compute every $M^{2^i}$ for $i$ up to $\lfloor \log_2 n \rfloor$ using $O(\log n)$ matrix multiplications, since each one is the square of the previous one.

  23. A Faster Algorithm for Matrix Exponentiation
      Suppose that in binary, $n - 2 = 1001_2$. Then, $n - 2 = 2^3 + 2^0$. So

      $$M^{n-2} = M^{2^3 + 2^0} = M^{2^3} M^{2^0}.$$

      In general, $M^{n-2}$ is equal to the product of the matrices $M^{2^i}$ for which bit $i$ is a 1 in the binary representation of $n - 2$.
      This gives us an $O(\log n)$ solution to Fibonacci numbers!

  24. A Faster Algorithm for Matrix Exponentiation: Pseudocode
      The following algorithm returns the exponentiated matrix $M^n$:

          fun mat_exp(M, n):
              res: matrix; initialized to identity matrix
              while n > 0:
                  if n mod 2 == 1:
                      res = res * M
                  M = M * M
                  n = floor(n / 2)
              return res
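
As a concrete instance, the pseudocode can be specialised to 2×2 matrices modulo $10^9 + 7$ and used to compute $F_n$ with $F_1 = F_2 = 1$. This is a sketch only: the type alias `Mat` and the helper names `mat_mul` and `fib` are assumptions, and entries are kept below the modulus so that each product fits in an unsigned 64-bit integer.

```cpp
#include <array>
#include <cstdint>

using Mat = std::array<std::array<std::uint64_t, 2>, 2>;
const std::uint64_t MOD = 1000000007ULL;

// 2x2 matrix product modulo MOD. Entries stay below MOD, so a product of two
// entries is below about 1e18 and fits in 64 bits.
Mat mat_mul(const Mat& a, const Mat& b) {
    Mat c{};  // zero-initialized
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                c[i][j] = (c[i][j] + a[i][k] * b[k][j]) % MOD;
    return c;
}

// Binary exponentiation: multiplies in M^(2^i) for every 1 bit i of n.
Mat mat_exp(Mat m, std::uint64_t n) {
    Mat res = {{{1, 0}, {0, 1}}};  // identity matrix
    while (n > 0) {
        if (n & 1) res = mat_mul(res, m);
        m = mat_mul(m, m);
        n >>= 1;
    }
    return res;
}

// n-th Fibonacci number modulo MOD, assuming n >= 1 and F_1 = F_2 = 1.
std::uint64_t fib(std::uint64_t n) {
    if (n <= 2) return 1;
    const Mat m = {{{1, 1}, {1, 0}}};
    const Mat p = mat_exp(m, n - 2);
    // (F_n, F_{n-1}) = M^(n-2) * (F_2, F_1); only the top entry is needed.
    return (p[0][0] + p[0][1]) % MOD;
}
```

For example, `fib(10)` gives 55, and a call such as `fib(1000000000000000000ULL)` needs only about 60 loop iterations.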

  25. Discussion Problem: General Fibonacci Queries
      Input: A number $q \le 10^5$ of queries, followed by q lines of three numbers each. On each query line, there will be three values $f_1, f_2, n \le 10^{18}$.
      Output: For each query, output the nth Fibonacci number, modulo $10^9 + 7$, where the first and second Fibonacci numbers are redefined to be $f_1$ and $f_2$ (instead of being 1 and 1).

  26. Discussion Problem: General Fibonacci Queries – Insight
      Even though the initial values of the numbers may change, the matrices we can use do not. The matrix $M^{n-2}$ for each query can still be computed in logarithmic time, and then applied to the vector $(f_2, f_1)$ in constant time.
      To save some time, we could also pre-compute all the power-of-two matrices $M^{2^i}$ (although this may not be necessary if your implementation is fast enough already).
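
One way the precomputation idea might look in C++ is sketched below. The names `precompute_powers` and `general_fib` are hypothetical, and the choice to apply each selected $M^{2^i}$ directly to the vector (rather than first multiplying the matrices together) is a design decision of this sketch, not something the slides prescribe. It works because all powers of the same matrix commute.

```cpp
#include <array>
#include <cstdint>
#include <vector>

using Mat = std::array<std::array<std::uint64_t, 2>, 2>;
const std::uint64_t MOD = 1000000007ULL;

// 2x2 matrix product modulo MOD (entries stay below MOD, so products fit in 64 bits).
Mat mat_mul(const Mat& a, const Mat& b) {
    Mat c{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                c[i][j] = (c[i][j] + a[i][k] * b[k][j]) % MOD;
    return c;
}

// powers[i] = M^(2^i) mod MOD for i = 0..63, enough for any 64-bit exponent.
std::vector<Mat> precompute_powers() {
    std::vector<Mat> powers;
    Mat cur = {{{1, 1}, {1, 0}}};
    for (int i = 0; i < 64; ++i) {
        powers.push_back(cur);
        cur = mat_mul(cur, cur);  // square to get the next power of two
    }
    return powers;
}

// Answers one query: n-th term of the sequence with F_1 = f1, F_2 = f2 (n >= 1).
std::uint64_t general_fib(const std::vector<Mat>& powers, std::uint64_t f1,
                          std::uint64_t f2, std::uint64_t n) {
    f1 %= MOD;
    f2 %= MOD;
    if (n == 1) return f1;
    if (n == 2) return f2;
    std::uint64_t hi = f2, lo = f1;  // the vector (F_2, F_1)
    std::uint64_t e = n - 2;
    for (int i = 0; e > 0; ++i, e >>= 1) {
        if (e & 1) {  // bit i of n-2 is set: apply M^(2^i) to the vector
            const Mat& p = powers[i];
            const std::uint64_t new_hi = (p[0][0] * hi + p[0][1] * lo) % MOD;
            const std::uint64_t new_lo = (p[1][0] * hi + p[1][1] * lo) % MOD;
            hi = new_hi;
            lo = new_lo;
        }
    }
    return hi;
}
```

A caller would build the table once with `precompute_powers()` and reuse it for every query, so each query costs at most 64 matrix-vector products.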

  27. Discussion Problem - Longest Increasing Subsequence (LIS)
      Find the length of the longest increasing subsequence (LIS) in an array of $N \le 100{,}000$ integers.
      Example: [10, 11, 1, 5, 3, 9, -1, 7, 25]

  28. Discussion Problem - Longest Increasing Subsequence (LIS) – Insight (1)
      With a good enough implementation, we might be able to "squeeze" an $O(n^2)$ solution through. Try the recurrence

      $$\mathrm{OPT}[j] = 1 + \max_{i < j,\ a[i] < a[j]} \mathrm{OPT}[i],$$

      where the max over an empty set is taken to be 0. Computing this will take $O(n^2)$ time, but will have a very small constant.
      By storing some additional data, we can compute this max function more quickly.
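
The quadratic recurrence can be sketched directly in C++, as in `lis_quadratic` below. The second function, `lis_fast`, is one well-known $O(n \log n)$ alternative based on binary search over a "smallest tail" array; it is included as an illustration of what "storing some additional data" could mean and is not necessarily the approach the slides go on to present. Both function names are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// O(n^2): opt[j] = length of the longest strictly increasing subsequence ending at a[j].
int lis_quadratic(const std::vector<int>& a) {
    const std::size_t n = a.size();
    std::vector<int> opt(n, 1);  // every element alone is a subsequence of length 1
    int best = 0;
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < j; ++i)
            if (a[i] < a[j]) opt[j] = std::max(opt[j], 1 + opt[i]);
        best = std::max(best, opt[j]);
    }
    return best;
}

// O(n log n): tails[k] = smallest possible last value of a strictly increasing
// subsequence of length k + 1 seen so far. tails is always sorted.
int lis_fast(const std::vector<int>& a) {
    std::vector<int> tails;
    for (int x : a) {
        auto it = std::lower_bound(tails.begin(), tails.end(), x);
        if (it == tails.end())
            tails.push_back(x);  // x extends the longest subsequence found so far
        else
            *it = x;  // x is a smaller tail for subsequences of this length
    }
    return static_cast<int>(tails.size());
}
```

On the example array `[10, 11, 1, 5, 3, 9, -1, 7, 25]`, both functions return 4 (for instance 1, 5, 9, 25).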
