1 Dynamic Programming Formula
Divide a problem into a polynomial number of smaller subproblems.
Solve each subproblem, recording its answer in an array.
Do a case analysis where each case uses the subproblems in a different way.
Compare the cases to find the optimal solution for the current problem.

2-1 String Similarity
How similar are two strings?
ocurrance vs. occurrence

Slides18 - Sequence Alignment.key - April 10, 2019
2-4 String Similarity
How similar are two strings?
ocurrance vs. occurrence

o c u r r a n c e -
o c c u r r e n c e
6 mismatches, 1 gap

o c - u r r a n c e
o c c u r r e n c e
1 mismatch, 1 gap

o c - u r r a - n c e
o c c u r r - e n c e
0 mismatches, 3 gaps

3 Edit Distance
Applications. Basis for Unix diff. Speech recognition. Computational biology. Spam filter.
Edit distance. Gap penalty $\delta$; mismatch penalty $\alpha_{pq}$. Cost = sum of gap and mismatch penalties.

C T G A C C T A C C T
C C T G A C T A C A T
Cost: $\alpha_{TC} + \alpha_{GT} + \alpha_{AG} + 2\alpha_{CA}$

- C T G A C C T A C C T
C C T G A C - T A C A T
Cost: $2\delta + \alpha_{CA}$
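The cost rule on the Edit Distance slide can be sketched in a few lines of Python. This is only an illustration: the function names `gapped_alignment_cost` and `unit_mismatch`, the use of `-` as the gap symbol, and the concrete penalties (gap penalty 2, mismatch penalty 1) are assumptions made for the example and are not fixed by the slide.

```python
def gapped_alignment_cost(top, bottom, delta, alpha):
    """Sum gap and mismatch penalties over the columns of a gapped alignment.

    top, bottom: equal-length strings, with '-' marking a gap (assumed convention).
    delta: gap penalty. alpha: function giving the mismatch penalty for a pair.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    cost = 0
    for p, q in zip(top, bottom):
        if p == "-" and q == "-":
            raise ValueError("a column cannot hold two gaps")
        if p == "-" or q == "-":
            cost += delta          # one symbol is left unmatched
        else:
            cost += alpha(p, q)    # 0 for a match under unit_mismatch
    return cost


def unit_mismatch(p, q):
    """Hypothetical penalty table: 0 for equal symbols, 1 otherwise."""
    return 0 if p == q else 1


# The two alignments from the Edit Distance slide, with assumed delta = 2.
no_gaps = gapped_alignment_cost("CTGACCTACCT", "CCTGACTACAT", 2, unit_mismatch)
two_gaps = gapped_alignment_cost("-CTGACCTACCT", "CCTGAC-TACAT", 2, unit_mismatch)
print(no_gaps, two_gaps)
```

Under these assumed penalties the first alignment pays for five mismatches and the second for two gaps plus one mismatch; with other penalty values the comparison can come out differently.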
4 Sequence Alignment
Goal: Given two strings $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$, find an alignment of minimum cost.
An alignment $M$ is a set of ordered pairs $x_i$-$y_j$ such that each item occurs in at most one pair and there are no crossings.
The pairs $x_i$-$y_j$ and $x_{i'}$-$y_{j'}$ cross if $i < i'$ but $j > j'$.

Example: occurernce vs. occurrence

o c c u r e r n c e
o c c u r r e n c e
Crossing: pairing each transposed letter with its twin ($x_6$-$y_7$ and $x_7$-$y_6$) is not allowed.

o c c u r e r n c e
o c c u r r e n c e
2 mismatches: pairing position by position is a valid alignment.
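The definition of an alignment can be checked mechanically. The sketch below assumes pairs are given as 1-based index tuples `(i, j)` standing for $x_i$-$y_j$; the function name `is_alignment` is made up for this illustration.

```python
def is_alignment(pairs):
    """Return True if no index is reused and no two pairs cross.

    pairs: iterable of (i, j) tuples, meaning x_i is paired with y_j.
    """
    pairs = list(pairs)
    xs = [i for i, _ in pairs]
    ys = [j for _, j in pairs]
    # Each item may occur in at most one pair.
    if len(set(xs)) != len(xs) or len(set(ys)) != len(ys):
        return False
    # Sorted by i, the j values must increase; any decrease is a crossing.
    ordered = sorted(pairs)
    return all(j1 < j2 for (_, j1), (_, j2) in zip(ordered, ordered[1:]))


# occurernce vs. occurrence: pairing the transposed e and r with their twins.
print(is_alignment([(5, 5), (6, 7), (7, 6), (8, 8)]))
# Pairing the same positions one to one instead.
print(is_alignment([(5, 5), (6, 6), (7, 7), (8, 8)]))
```

Sorting by `i` first means only neighbouring pairs need to be compared, since a sequence with no adjacent decrease has no decrease at all.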
5 Sequence Alignment
Example: CTACCG vs. TACATG.
Solution: $M = \{x_2\text{-}y_1,\ x_3\text{-}y_2,\ x_4\text{-}y_3,\ x_5\text{-}y_4,\ x_6\text{-}y_6\}$.

x1 x2 x3 x4 x5    x6
C  T  A  C  C  -  G
-  T  A  C  A  T  G
   y1 y2 y3 y4 y5 y6

6 Sequence Alignment
What are the subproblems?
What are the cases?
What is the solution for each case?
How do you find the optimal solution from the cases?

7 Sequence Alignment Case Analysis
Consider the last characters of the strings X and Y. Call them $x_m$ and $y_n$.
Case 1: $x_m$ and $y_n$ are aligned.
Case 2: $x_m$ is not matched.
Case 3: $y_n$ is not matched.
Case 4: Neither $x_m$ nor $y_n$ is matched.
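To connect the pair-set view of an alignment with its cost, the sketch below charges the mismatch penalty for every pair in $M$ and the gap penalty for every unmatched character. The function names and the concrete penalties (gap penalty 2, mismatch penalty 1) are assumptions for illustration only.

```python
def matching_cost(x, y, pairs, delta, alpha):
    """Cost of an alignment given as 1-based (i, j) pairs over strings x and y."""
    # Every matched pair pays its mismatch penalty (0 when the symbols agree).
    matched = sum(alpha(x[i - 1], y[j - 1]) for i, j in pairs)
    # Every character that appears in no pair pays the gap penalty.
    unmatched = (len(x) - len(pairs)) + (len(y) - len(pairs))
    return matched + delta * unmatched


def unit_mismatch(p, q):
    """Hypothetical penalty table: 0 for equal symbols, 1 otherwise."""
    return 0 if p == q else 1


# The solution M for CTACCG vs. TACATG from the example slide.
M = [(2, 1), (3, 2), (4, 3), (5, 4), (6, 6)]
print(matching_cost("CTACCG", "TACATG", M, 2, unit_mismatch))
```

In this example $x_1$ and $y_5$ are unmatched and $x_5$-$y_4$ pairs C with A, so the cost is two gap penalties plus one mismatch penalty.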
8 Solution 1
B O G U S
B O N G O
Labels: match, match, mismatch, mismatch, mismatch
Cost = 3 mismatches

9 Solution 2
B O G U S
B O N G O
Labels: 3 matches, 1 mismatch, 2 skips
Cost = 1 mismatch + 2 skips

10 Solution 3
B O G U S
B O N G O
Labels: 3 matches, 4 skips
Cost = 4 skips
11 Which is best?
3 mismatches: BONGO, BOGUS
1 mismatch + 2 skips: BONGO, BOGUS
4 skips: BONGO, BOGUS

12 Sequence Alignment Cost Analysis
Consider the last characters of the strings X and Y. Call them $x_m$ and $y_n$.
Case 1: $x_m$ and $y_n$ are aligned.
$\mathrm{OPT}(X, Y) = \alpha_{x_m y_n} + \mathrm{OPT}(x_1 \ldots x_{m-1},\ y_1 \ldots y_{n-1})$
Case 2: $x_m$ is not matched.
$\mathrm{OPT}(X, Y) = \delta + \mathrm{OPT}(x_1 \ldots x_{m-1},\ y_1 \ldots y_n)$
Case 3: $y_n$ is not matched.
$\mathrm{OPT}(X, Y) = \delta + \mathrm{OPT}(x_1 \ldots x_m,\ y_1 \ldots y_{n-1})$
Case 4: Neither $x_m$ nor $y_n$ is matched. Covered by cases 2 and 3.

13 Sequence Alignment: Algorithm
Alignment(m, n, x_1 x_2 ... x_m, y_1 y_2 ... y_n, δ, α) {
   for i = 0 to m
      M[i, 0] = i * δ
   for j = 0 to n
      M[0, j] = j * δ
   for i = 1 to m
      for j = 1 to n
         M[i, j] = min(α[x_i, y_j] + M[i-1, j-1],
                       δ + M[i-1, j],
                       δ + M[i, j-1])
   return M[m, n]
}
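A direct Python translation of the Alignment pseudocode might look like the following sketch. The name `min_alignment_cost`, passing the mismatch penalty as a function rather than a table, and the penalties used in the usage line (gap penalty 2, mismatch penalty 1) are assumptions for illustration.

```python
def min_alignment_cost(x, y, delta, alpha):
    """Bottom-up table fill; M[i][j] is the optimal cost for x[:i] vs. y[:j]."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    # Aligning a prefix against the empty string leaves every character unmatched.
    for i in range(m + 1):
        M[i][0] = i * delta
    for j in range(n + 1):
        M[0][j] = j * delta
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = min(
                alpha(x[i - 1], y[j - 1]) + M[i - 1][j - 1],  # case 1: aligned
                delta + M[i - 1][j],                          # case 2: x_i unmatched
                delta + M[i][j - 1],                          # case 3: y_j unmatched
            )
    return M[m][n]


def unit_mismatch(p, q):
    """Hypothetical penalty table: 0 for equal symbols, 1 otherwise."""
    return 0 if p == q else 1


print(min_alignment_cost("bogus", "bongo", 2, unit_mismatch))
```

The two nested loops fill $(m+1)(n+1)$ entries with constant work each, so the running time and space are both proportional to $mn$. Python strings are 0-indexed, which is why $x_i$ appears as `x[i - 1]`.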
14 Sequence Alignment Example
x = boit; y = boot
$\alpha = 0$ for a match
$\alpha = 1$ for a mismatch
$\delta = 2$

Table M, with the boundary entries filled in:

          b    o    n    g    o
     0    2    4    6    8   10
b    2
o    4
g    6
u    8
s   10
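The remaining entries of the table can be filled in by running the recurrence. The sketch below assumes the table's row labels (b o g u s) and column labels (b o n g o) are the two strings being aligned, and uses the penalties listed on the slide; the function name `alignment_table` is made up for this illustration.

```python
def alignment_table(x, y, delta=2, match=0, mismatch=1):
    """Return the full table M for x (rows) against y (columns)."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        M[i][0] = i * delta
    for j in range(n + 1):
        M[0][j] = j * delta
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            alpha = match if x[i - 1] == y[j - 1] else mismatch
            M[i][j] = min(alpha + M[i - 1][j - 1],
                          delta + M[i - 1][j],
                          delta + M[i][j - 1])
    return M


table = alignment_table("bogus", "bongo")
# Print the table with the same labels as the slide.
print("     " + "".join(f"{c:>4}" for c in " bongo"))
for label, row in zip(" bogus", table):
    print(f"{label:>4} " + "".join(f"{v:>4}" for v in row))
```

The bottom-right entry is the optimal cost for the whole strings; under these penalties it is 3, which matches the three-mismatch solution.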