Global and local alignments



  1. Global and local alignments

  2. Global vs. local alignments
     • Global: align all nucleotides
     • Local: align subsequences with best score

     Align these sequences: GCAT, GCT (match = 1, mismatch = -1, gap = -1)

     global alignment:        local alignment: ?
         GCAT
         GC-T
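The global score for this example can be checked with a short dynamic-programming sketch of the Needleman-Wunsch recurrence. This is only an illustration: the function name `needleman_wunsch_matrix` is hypothetical, and placing GCAT along the columns and GCT along the rows is an assumed layout.

```python
def needleman_wunsch_matrix(a, b, match=1, mismatch=-1, gap=-1):
    """Fill the global-alignment score matrix (rows follow b, columns follow a)."""
    rows, cols = len(b) + 1, len(a) + 1
    M = [[0] * cols for _ in range(rows)]
    # First row and first column: leading gaps accumulate the gap score.
    for j in range(1, cols):
        M[0][j] = j * gap
    for i in range(1, rows):
        M[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[j - 1] == b[i - 1] else mismatch
            M[i][j] = max(
                M[i - 1][j] + gap,    # top: gap in a
                M[i][j - 1] + gap,    # left: gap in b
                M[i - 1][j - 1] + s,  # diagonal: match or mismatch
            )
    return M


nw = needleman_wunsch_matrix("GCAT", "GCT")
global_score = nw[-1][-1]  # bottom-right cell holds the global alignment score
```

The bottom-right cell, `global_score`, is 2, which agrees with scoring GCAT over GC-T by hand: three matches and one gap give 1 + 1 - 1 + 1 = 2.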

  3. We can make local alignments using the Smith-Waterman algorithm
     Like Needleman-Wunsch, with 2 changes:
     • Don't allow negative scores, set them to 0
     • Backtrack from cell with highest score, stop at 0

  4. We can make local alignments using the Smith-Waterman algorithm

     Needleman-Wunsch

              -    G    C    A    T
         -    0   -1   -2   -3   -4
         G   -1    1    0   -1   -2
         C   -2    0    2    1    0
         T   -3   -1    1    1    2

         GCAT
         GC-T

  5. We can make local alignments using the Smith-Waterman algorithm

     Needleman-Wunsch

              -    G    C    A    T
         -    0   -1   -2   -3   -4
         G   -1    1    0   -1   -2
         C   -2    0    2    1    0
         T   -3   -1    1    1    2

         GCAT
         GC-T

     Smith-Waterman

              -    G    C    A    T
         -    0    0    0    0    0
         G    0    1    0    0    0
         C    0    0    2    1    0
         T    0    0    1    1    2

         GC
         GC

  6. We can make local alignments using the Smith-Waterman algorithm

     Needleman-Wunsch

              -    G    C    A    T
         -    0   -1   -2   -3   -4
         G   -1    1    0   -1   -2
         C   -2    0    2    1    0
         T   -3   -1    1    1    2

         GCAT
         GC-T

     Smith-Waterman

              -    G    C    A    T
         -    0    0    0    0    0
         G    0    1    0    0    0
         C    0    0    2    1    0
         T    0    0    1    1    2

         GC        GCAT
         GC   or   GC-T
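The two changes can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: the function names `smith_waterman` and `traceback` are hypothetical, and the order in which the traceback tries moves (diagonal, then left, then top) is an assumption.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Fill the local-alignment score matrix (rows follow b, columns follow a)."""
    rows, cols = len(b) + 1, len(a) + 1
    M = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[j - 1] == b[i - 1] else mismatch
            # Change 1: never let a score drop below 0.
            M[i][j] = max(0, M[i - 1][j] + gap, M[i][j - 1] + gap,
                          M[i - 1][j - 1] + s)
    return M


def traceback(M, a, b, i, j, match=1, mismatch=-1, gap=-1):
    """Walk back from cell (i, j) until a 0 is reached."""
    top, bottom = [], []
    while i > 0 and j > 0 and M[i][j] > 0:
        s = match if a[j - 1] == b[i - 1] else mismatch
        if M[i][j] == M[i - 1][j - 1] + s:   # diagonal move
            top.append(a[j - 1]); bottom.append(b[i - 1])
            i, j = i - 1, j - 1
        elif M[i][j] == M[i][j - 1] + gap:   # left move: gap in b
            top.append(a[j - 1]); bottom.append("-")
            j -= 1
        else:                                # top move: gap in a
            top.append("-"); bottom.append(b[i - 1])
            i -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


a, b = "GCAT", "GCT"
sw = smith_waterman(a, b)
best = max(max(row) for row in sw)
# Change 2: start the traceback from every cell holding the highest score.
starts = [(i, j) for i, row in enumerate(sw)
          for j, v in enumerate(row) if v == best]
local_alignments = [traceback(sw, a, b, i, j) for i, j in starts]
```

With the slide's scoring, `sw` reproduces the Smith-Waterman matrix above, and `local_alignments` holds both tied alignments: GC over GC, and GCAT over GC-T.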

  7. Smith-Waterman algorithm, mathematical form

     $$
     M(0, j) = 0 \quad \text{(first row)}, \qquad M(i, 0) = 0 \quad \text{(first column)}
     $$

     $$
     M(i, j) = \max \left( \begin{array}{ll}
     0 & \\
     M(i-1, j) + p & \text{top} \\
     M(i, j-1) + p & \text{left} \\
     M(i-1, j-1) + s(a_j, b_i) & \text{diagonal}
     \end{array} \right)
     $$

     s(a_j, b_i) = match/mismatch score for sites j and i in sequences a and b
     p = gap score (gap = -1 in the example above)
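As a worked instance of the recurrence, take the earlier example with a = GCAT along the columns and b = GCT along the rows, and assume p is the gap score of -1. For the cell in row C, column A (i = 2, j = 3), the sites a_3 = A and b_2 = C differ, so s(a_3, b_2) = -1:

$$
M(2,3) = \max \left( \begin{array}{l}
0 \\
M(1,3) + p = 0 - 1 = -1 \\
M(2,2) + p = 2 - 1 = 1 \\
M(1,2) + s(a_3, b_2) = 0 - 1 = -1
\end{array} \right) = 1
$$

The maximum comes from the left neighbor, which corresponds to the gap in GC-T.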

  8. BLAST (Basic Local Alignment Search Tool)

  9. BLAST is the primary method for finding sequences in modern sequence databases

  10. Image from: http://www.ncbi.nlm.nih.gov/books/NBK62051/

  11. Primary BLAST quality metric: E value
      The Expectation value or E value represents the number of different alignments with scores equivalent to or better than the one observed that are expected to occur in a database search by chance. The lower the E value, the more significant the score and the alignment.
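As background not stated on the slide, the E value for a raw alignment score S is commonly written in the Karlin-Altschul form below. Here m and n are assumed to denote the query length and the total database length, and K and λ are statistical parameters that depend on the scoring system:

$$
E = K \, m \, n \, e^{-\lambda S}
$$

Under this form, E falls exponentially as the score S rises and grows in proportion to the database size, which is why the same alignment is less significant in a larger database.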

  12. Anatomy of a BLAST result

  13. Anatomy of a BLAST result: sequence we found (subject sequence)

  14. Anatomy of a BLAST result: E value

  15. Anatomy of a BLAST result: number and % of exact matches, near matches, and no matches

  16. Anatomy of a BLAST result: exact match

  17. Anatomy of a BLAST result: near match (positive)

  18. Anatomy of a BLAST result: no match

  19. Anatomy of a BLAST result
