  1. CSC373 Week 4: Dynamic Programming (contd), Network Flow (start)
     Nisarg Shah

  2. Recap
     • Dynamic Programming Basics
       ➢ Optimal substructure property
       ➢ Bellman equation
       ➢ Top-down (memoization) vs bottom-up implementations
     • Dynamic Programming Examples
       ➢ Weighted interval scheduling
       ➢ Knapsack problem
       ➢ Single-source shortest paths
       ➢ Chain matrix product

  3. This Lecture
     • Some more DP
       ➢ Edit distance (aka sequence alignment)
       ➢ Traveling salesman problem (TSP)
     • Start of network flow
       ➢ Problem statement
       ➢ Ford-Fulkerson algorithm
       ➢ Running time
       ➢ Correctness

  4. Edit Distance
     • Edit distance (aka sequence alignment) problem
       ➢ How similar are strings X = x_1, …, x_m and Y = y_1, …, y_n?
     • Suppose we can delete or replace symbols
       ➢ We can do these operations on any symbol in either string
       ➢ How many deletions & replacements does it take to match the two strings?

  5. Edit Distance
     • Example: ocurrance vs occurrence
       ➢ One alignment needs 6 replacements and 1 deletion; a better alignment needs only 1 replacement and 1 deletion

  6. Edit Distance
     • Edit distance problem
       ➢ Input
         o Strings X = x_1, …, x_m and Y = y_1, …, y_n
         o Cost d(a) of deleting symbol a
         o Cost r(a, b) of replacing symbol a with b
           • Assume r is symmetric, so r(a, b) = r(b, a)
       ➢ Goal
         o Compute the minimum total cost for matching the two strings
     • Optimal substructure?
       ➢ Want to delete/replace at one end and recurse

  7. Edit Distance
     • Optimal substructure
       ➢ Goal: match x_1, …, x_m and y_1, …, y_n
       ➢ Consider the last symbols x_m and y_n
       ➢ Three options:
         o Delete x_m, and optimally match x_1, …, x_{m−1} and y_1, …, y_n
         o Delete y_n, and optimally match x_1, …, x_m and y_1, …, y_{n−1}
         o Match x_m and y_n, and optimally match x_1, …, x_{m−1} and y_1, …, y_{n−1}
       ➢ Hence in the DP, we need to compute the optimal solutions for matching x_1, …, x_i with y_1, …, y_j for all (i, j)

  8. Edit Distance
     • E[i, j] = edit distance between x_1, …, x_i and y_1, …, y_j
     • Bellman equation

       E[i, j] = 0                        if i = j = 0
               = d(y_j) + E[i, j − 1]     if i = 0 ∧ j > 0
               = d(x_i) + E[i − 1, j]     if i > 0 ∧ j = 0
               = min{A, B, C}             otherwise

       where A = d(x_i) + E[i − 1, j], B = d(y_j) + E[i, j − 1], C = r(x_i, y_j) + E[i − 1, j − 1]

     • O(m ⋅ n) time, O(m ⋅ n) space (a code sketch of this recurrence follows below)
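
A minimal bottom-up sketch of this recurrence in Python (not from the slides); the cost functions d and r are assumed to be supplied by the caller, and the example at the end uses unit costs:

```python
def edit_distance(X, Y, d, r):
    """Edit distance between strings X and Y.

    d(a)    -- cost of deleting symbol a (assumed non-negative)
    r(a, b) -- cost of replacing a with b (assumed symmetric, with r(a, a) = 0)
    """
    m, n = len(X), len(Y)
    # E[i][j] = edit distance between the i-prefix of X and the j-prefix of Y
    E = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):          # match x_1..x_i against the empty string
        E[i][0] = E[i - 1][0] + d(X[i - 1])
    for j in range(1, n + 1):          # match the empty string against y_1..y_j
        E[0][j] = E[0][j - 1] + d(Y[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            E[i][j] = min(d(X[i - 1]) + E[i - 1][j],                 # delete x_i
                          d(Y[j - 1]) + E[i][j - 1],                 # delete y_j
                          r(X[i - 1], Y[j - 1]) + E[i - 1][j - 1])   # match/replace
    return E[m][n]

# Example with unit costs (delete = 1, replace = 1 unless the symbols already match).
unit_del = lambda a: 1
unit_rep = lambda a, b: 0 if a == b else 1
print(edit_distance("ocurrance", "occurrence", unit_del, unit_rep))  # -> 2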

  9. Edit Distance
     • E[i, j] = 0 if i = j = 0; d(y_j) + E[i, j − 1] if i = 0 ∧ j > 0; d(x_i) + E[i − 1, j] if i > 0 ∧ j = 0; min{A, B, C} otherwise,
       where A = d(x_i) + E[i − 1, j], B = d(y_j) + E[i, j − 1], C = r(x_i, y_j) + E[i − 1, j − 1]
     • Space complexity can be improved to O(m + n) (see the sketch below)
       ➢ To compute E[⋅, j], we only need E[⋅, j − 1] stored
       ➢ So we can forget E[⋅, j] as soon as we reach j + 2
       ➢ But this is not enough if we want to compute the actual solution (sequence of operations)
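
A sketch of the linear-space variant described above, reusing the same hypothetical d and r cost functions; it keeps only the previous column, so it returns the distance but not the sequence of operations:

```python
def edit_distance_linear_space(X, Y, d, r):
    """Same recurrence as above, but only one previous column is kept: O(m + n) space."""
    m, n = len(X), len(Y)
    prev = [0.0] * (m + 1)                  # column j - 1, i.e. E[., j - 1]
    for i in range(1, m + 1):
        prev[i] = prev[i - 1] + d(X[i - 1])
    for j in range(1, n + 1):
        cur = [prev[0] + d(Y[j - 1])] + [0.0] * m   # cur[0] = E[0, j]
        for i in range(1, m + 1):
            cur[i] = min(d(X[i - 1]) + cur[i - 1],          # delete x_i
                         d(Y[j - 1]) + prev[i],             # delete y_j
                         r(X[i - 1], Y[j - 1]) + prev[i - 1])
        prev = cur                          # forget column j - 1
    return prev[m]
```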

  10. Hirschberg’s Algorithm (this slide is not in the scope of the course)
     • The optimal solution can be computed in O(m ⋅ n) time and O(m + n) space too!

  11. Hirschberg’s Algorithm (this slide is not in the scope of the course)
     • Key idea nicely combines divide & conquer with DP
     • Edit distance graph [figure: grid graph with edge costs d(x_i) and d(y_j)]

  12. Hirschberg’s Algorithm (this slide is not in the scope of the course)
     • Observation (can be proved by induction)
       ➢ E[i, j] = length of the shortest path from (0, 0) to (i, j) in the edit distance graph

  13. Hirschberg’s Algorithm (this slide is not in the scope of the course)
     • Lemma
       ➢ The shortest path from (0, 0) to (m, n) passes through (q, n/2), where q minimizes [length of the shortest path from (0, 0) to (q, n/2)] + [length of the shortest path from (q, n/2) to (m, n)]

  14. Hirschberg’s Algorithm (this slide is not in the scope of the course)
     • Idea
       ➢ Find q using DP (a forward pass up to column n/2 and a backward pass from column n/2, each in linear space)
       ➢ Recursively (divide & conquer) find the shortest paths from (0, 0) to (q, n/2) and from (q, n/2) to (m, n); a sketch of the first step follows below
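
Below is a hedged sketch of the key step only (finding the crossing row q of the lemma), under the same assumed d and r cost interface as above; the full algorithm would then recurse on the two halves, which is omitted here:

```python
def last_column(X, Y, d, r):
    """E[i, len(Y)] for all i, computed in linear space (as in the rolling-column sketch)."""
    m = len(X)
    col = [0.0] * (m + 1)
    for i in range(1, m + 1):
        col[i] = col[i - 1] + d(X[i - 1])
    for y in Y:
        new = [col[0] + d(y)] + [0.0] * m
        for i in range(1, m + 1):
            new[i] = min(d(X[i - 1]) + new[i - 1],
                         d(y) + col[i],
                         r(X[i - 1], y) + col[i - 1])
        col = new
    return col

def middle_crossing_row(X, Y, d, r):
    """Row q where some shortest (0,0)->(m,n) path crosses column n//2."""
    m, h = len(X), len(Y) // 2
    fwd = last_column(X, Y[:h], d, r)               # fwd[i] = dist from (0,0) to (i, h)
    bwd = last_column(X[::-1], Y[h:][::-1], d, r)   # bwd[k] = dist from (m-k, h) to (m, n)
    return min(range(m + 1), key=lambda i: fwd[i] + bwd[m - i])
```

In the full algorithm, the recursion on the two halves still costs O(m ⋅ n) in total, since each level of recursion halves n while the row ranges partition {0, …, m}.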

  15. Application: Protein Matching

  16. Traveling Salesman
     • Input
       ➢ Directed graph G = (V, E)
       ➢ Distance d_{i,j} is the distance from node i to node j
     • Output
       ➢ Minimum distance which needs to be traveled to start from some node v, visit every other node exactly once, and come back to v
         o That is, the minimum cost of a Hamiltonian cycle

  17. Traveling Salesman
     • Approach
       ➢ Let’s start at node v_1 = 1
         o It’s a cycle, so the starting point does not matter
       ➢ Want to visit the other nodes in some order, say v_2, …, v_n
       ➢ Total distance is d_{1,v_2} + d_{v_2,v_3} + ⋯ + d_{v_{n−1},v_n} + d_{v_n,1}
         o Want to minimize this distance
     • Naïve solution
       ➢ Check all possible orderings (see the brute-force sketch below)
       ➢ (n − 1)! orderings, which is roughly (n/e)^n by Stirling’s approximation
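
A brute-force sketch of “check all orderings”, assuming a 0-indexed n x n distance matrix d (node 1 of the slides is index 0 here):

```python
from itertools import permutations

def tsp_brute_force(d):
    """Try every ordering of the other n-1 nodes after fixing node 0 as the start: (n-1)! tours."""
    n = len(d)
    best = float("inf")
    for order in permutations(range(1, n)):
        tour = (0,) + order + (0,)                          # cycle back to the start
        cost = sum(d[a][b] for a, b in zip(tour, tour[1:]))
        best = min(best, cost)
    return best
```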

  18. Traveling Salesman
     • DP Approach
       ➢ Consider v_n (the last node before returning to v_1 = 1)
         o If v_n = c:
           • We now want to find the optimal order of visiting the nodes in {2, …, n} ∖ {c}
           • So we will need to keep track of which subset of nodes we need to visit and where we need to end
       ➢ OPT[S, c] = minimum total distance of starting at 1, visiting each node in S exactly once, and ending at c ∈ S (without counting the distance for returning from c to 1)
         o Then the answer to our original problem can easily be computed as min_{c ∈ S} OPT[S, c] + d_{c,1}, where S = {2, …, n}

  19. Traveling Salesman
     • DP Approach
       ➢ To compute OPT[S, c], we condition over the vertex which is visited right before c
     • Bellman equation

       OPT[S, c] = min_{m ∈ S ∖ {c}} OPT[S ∖ {c}, m] + d_{m,c}   (with base case OPT[{c}, c] = d_{1,c})

       Final solution = min_{c ∈ {2, …, n}} OPT[{2, …, n}, c] + d_{c,1}

     • Time: O(n ⋅ 2^n) calls, O(n) time per call ⇒ O(n² ⋅ 2^n)
       ➢ Much better than the naïve solution, which takes roughly (n/e)^n time (a code sketch follows below)
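
A sketch of this recurrence (the Held-Karp algorithm) in Python, with the subset S encoded as a bitmask over nodes 2, …, n; the 0-indexed distance matrix d is an assumed interface, with node 1 of the slides as index 0:

```python
def tsp_held_karp(d):
    """O(n^2 * 2^n) time, O(n * 2^n) space, following the Bellman equation above."""
    n = len(d)
    if n == 1:
        return 0
    full = (1 << (n - 1)) - 1          # bitmask over indices 1..n-1 (slide nodes 2..n)
    INF = float("inf")
    OPT = [[INF] * n for _ in range(full + 1)]
    for S in range(1, full + 1):       # subsets in increasing order, so S \ {c} comes first
        for c in range(1, n):
            if not (S >> (c - 1)) & 1:
                continue               # c must belong to S
            rest = S & ~(1 << (c - 1))           # S \ {c}
            if rest == 0:
                OPT[S][c] = d[0][c]              # base case: OPT[{c}, c] = d_{1,c}
            else:
                OPT[S][c] = min(OPT[rest][m] + d[m][c]
                                for m in range(1, n) if (rest >> (m - 1)) & 1)
    # add the final hop back to node 1 (index 0)
    return min(OPT[full][c] + d[c][0] for c in range(1, n))
```

For instance, tsp_held_karp([[0, 2, 9], [1, 0, 6], [10, 7, 0]]) returns 17, the cost of the tour 1 → 3 → 2 → 1 in the slides’ labelling.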

  20. Traveling Salesman
     • Bellman equation

       OPT[S, c] = min_{m ∈ S ∖ {c}} OPT[S ∖ {c}, m] + d_{m,c}

       Final solution = min_{c ∈ {2, …, n}} OPT[{2, …, n}, c] + d_{c,1}

     • Space complexity: O(n ⋅ 2^n)
       ➢ But computing the optimal solutions with |S| = k only requires storing the optimal solutions with |S| = k − 1
     • Question: Using this observation, how much can we reduce the space complexity?

  21. DP Concluding Remarks
     • Key steps in designing a DP algorithm
       ➢ “Generalize” the problem first
         o E.g. instead of computing the edit distance between strings X = x_1, …, x_m and Y = y_1, …, y_n, we compute E[i, j] = edit distance between the i-prefix of X and the j-prefix of Y for all (i, j)
         o The right generalization is often obtained by looking at the structure of the “subproblem” which must be solved optimally to get an optimal solution to the overall problem
       ➢ Remember the difference between DP and divide-and-conquer
       ➢ Sometimes you can save quite a bit of space by only storing solutions to those subproblems that you need in the future

  22. Network Flow

  23. Network Flow
     • Input
       ➢ A directed graph G = (V, E)
       ➢ Edge capacities c : E → ℝ_{≥0}
       ➢ Source node s, target node t
     • Output
       ➢ Maximum “flow” from s to t

  24. Network Flow
     • Assumptions (for simplicity)
       ➢ No edge enters s
       ➢ No edge comes out of t
       ➢ Each edge capacity c(e) is a non-negative integer
         o Later, we’ll see what happens when c(e) can be a rational number

  25. Network Flow
     • Flow
       ➢ An s-t flow is a function f : E → ℝ_{≥0}
       ➢ Intuitively, f(e) is the “amount of material” carried on edge e

  26. Network Flow
     • Constraints on flow f (checked in the sketch below)
       1. Respecting capacities: ∀e ∈ E : 0 ≤ f(e) ≤ c(e)
       2. Flow conservation: ∀v ∈ V ∖ {s, t} : Σ_{e into v} f(e) = Σ_{e leaving v} f(e)
          o Flow in = flow out at every node other than s and t
          o Flow out at s = flow in at t
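
To make the two constraints concrete, here is a small sketch (not from the slides) that checks feasibility of a given flow and computes its value; the edge representation, a dict mapping directed edges (u, v) to numbers, is an assumption for illustration:

```python
def is_feasible_flow(V, cap, f, s, t):
    """Check the two flow constraints; cap and f map directed edges (u, v) to numbers."""
    # 1. Respecting capacities: 0 <= f(e) <= c(e) for every edge e.
    for e, c in cap.items():
        if not (0 <= f.get(e, 0) <= c):
            return False
    # 2. Flow conservation: flow in = flow out at every node other than s and t.
    for v in V:
        if v in (s, t):
            continue
        flow_in = sum(val for (u, w), val in f.items() if w == v)
        flow_out = sum(val for (u, w), val in f.items() if u == v)
        if flow_in != flow_out:
            return False
    return True

def flow_value(f, s):
    """Value of the flow: total flow leaving the source s (equals the flow into t)."""
    return sum(val for (u, w), val in f.items() if u == s)
```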
