
Review Session I, CS 466, Wesley Wei Qian, March 10th 2020



  1. Review Session I CS 466 Wesley Wei Qian March 10th 2020

  2. Midterm Exam
      ● This Thursday!
        ○ 03/12 class time
      ● Different building!
        ○ Natural History Building Room 2079
      ● Topics

  3. Topics we have covered:
      ● Molecular Biology
      ● Probability and Statistics
      ● Sequence and Alignment
      ● Pattern Matching
      ● BLAST

  4. Molecular Biology
      - Molecules
        - DNA
        - RNA
        - Protein (polypeptide)
      - Molecular Process
        - Transcription
        - Translation
        - Protein folding
        - Gene splicing: intron/exon
        - Gene regulation
      - Genome

  5-11. Probability and Statistics

  12. Sequence and Alignment

  13. Sequence and Alignment
      ● Global Alignment
      ● Local Alignment

  14-19. Sequence and Alignment
      Global Alignment: DDOGC vs DOG
      +1 Match; -1 Mismatch; -2 Gap.

            *    D    D    O    G    C
       *    0   -2   -4   -6   -8  -10
       D   -2    1   -1   -3   -5   -7
       O   -4   -1    0    0   -2   -4
       G   -6   -3   -2   -1    1   -1

      Two optimal alignments from the traceback (score -1):

       DDOGC      DDOGC
       D-OG-      -DOG-
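The global alignment table for DDOGC vs DOG can be reproduced with a short dynamic program. This is a minimal sketch of the standard Needleman-Wunsch fill, using the slide's scores (+1 match, -1 mismatch, -2 gap); the function name `global_alignment_table` is an assumption made for illustration, not something from the slides.

```python
def global_alignment_table(a, b, match=1, mismatch=-1, gap=-2):
    """Fill the global alignment table; columns follow a, rows follow b."""
    rows, cols = len(b) + 1, len(a) + 1
    F = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for j in range(1, cols):
        F[0][j] = j * gap
    for i in range(1, rows):
        F[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if b[i - 1] == a[j - 1] else mismatch
            F[i][j] = max(
                F[i - 1][j - 1] + s,  # align b[i-1] with a[j-1]
                F[i - 1][j] + gap,    # b[i-1] against a gap
                F[i][j - 1] + gap,    # a[j-1] against a gap
            )
    return F

table = global_alignment_table("DDOGC", "DOG")
for row in table:
    print(row)
```

The bottom-right cell is the optimal global score. The two nested loops visit every cell once, so the fill takes time and space proportional to the product of the two sequence lengths.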

  20-24. Sequence and Alignment
      Local Alignment: DDOGC vs DOG
      +1 Match; -1 Mismatch; -2 Gap.

            *    D    D    O    G    C
       *    0    0    0    0    0    0
       D    0    1    1    0    0    0
       O    0    0    0    2    0    0
       G    0    0    0    0    3    1

      Best local alignment (score 3):

       -DOG-
        DOG
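The local alignment table differs from the global one in two ways: every cell is clamped at 0, and the traceback starts from the largest cell rather than the bottom-right corner. A minimal sketch of the standard Smith-Waterman procedure with the slide's scores follows; the function name `local_alignment` and its return shape are assumptions made for illustration.

```python
def local_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Return (table, best score, aligned part of a, aligned part of b)."""
    rows, cols = len(b) + 1, len(a) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if b[i - 1] == a[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere.
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a 0 cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if b[i - 1] == a[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(a[j - 1]); bottom.append(b[i - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append("-"); bottom.append(b[i - 1]); i -= 1
        else:
            top.append(a[j - 1]); bottom.append("-"); j -= 1
    return H, best, "".join(reversed(top)), "".join(reversed(bottom))

H, score, top, bottom = local_alignment("DDOGC", "DOG")
print(score, top, bottom)
```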

  25-26. Sequence and Alignment: Complexity?

  27. Sequence and Alignment Scoring function and BLOSUM matrix rounding factor

  28. Sequence and Alignment Affine Gap Penalty
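With an affine gap penalty, opening a gap costs more than extending one, so a gap of length k costs `gap_open + (k - 1) * gap_extend`. The slide does not spell out the recurrences, so the following is a sketch of one common three-table formulation (often attributed to Gotoh), not necessarily the one used in class. The names `affine_global_score`, `gap_open`, and `gap_extend`, and the default penalty values, are assumptions.

```python
NEG = float("-inf")

def affine_global_score(a, b, match=1, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Optimal global alignment score under an affine gap penalty."""
    n, m = len(a), len(b)
    # M: a[i-1] aligned with b[j-1]
    # X: a[i-1] aligned with a gap; Y: b[j-1] aligned with a gap
    M = [[NEG] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Extending an existing gap is cheaper than opening a new one.
            X[i][j] = max(M[i - 1][j] + gap_open, X[i - 1][j] + gap_extend, Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open, Y[i][j - 1] + gap_extend, X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])

print(affine_global_score("AAAA", "AA"))
```

Setting `gap_open` equal to `gap_extend` reduces this to the linear gap model, which gives a quick sanity check against the DDOGC vs DOG example.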

  29. Pattern Matching

  30. Pattern Matching: Naive Approach
      ● K: number of patterns
      ● N: average length of pattern
      ● M: length of the query string
      ● Running Time: O(KMN)
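The O(KMN) bound comes from three nested levels of work: each pattern, each start position in the query string, and each character comparison. A minimal sketch, where the function name `naive_search` and the example patterns are assumptions chosen for illustration:

```python
def naive_search(patterns, text):
    """Return sorted (start, pattern) pairs for every occurrence in text."""
    hits = []
    for p in patterns:                                # K patterns
        for start in range(len(text) - len(p) + 1):   # up to M start positions
            if text[start:start + len(p)] == p:       # up to N character comparisons
                hits.append((start, p))
    return sorted(hits)

naive_hits = naive_search(["he", "she", "his", "hers"], "ushers")
print(naive_hits)
```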

  31. Pattern Matching: Keyword Tree
      ● K: number of patterns
      ● N: average length of pattern
      ● M: length of the query string
      ● Running Time: O(KN + NM)
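The two terms of O(KN + NM) correspond to building the tree from all patterns (KN) and then walking down from the root once per start position in the query string (up to N steps for each of M positions). A minimal sketch using nested dictionaries as tree nodes; the node layout and the names `build_keyword_tree` and `keyword_tree_search` are assumptions made for illustration.

```python
def new_node():
    # "word" holds the pattern that ends at this node, if any.
    return {"next": {}, "word": None}

def build_keyword_tree(patterns):
    root = new_node()
    for p in patterns:                     # O(KN) total characters inserted
        node = root
        for ch in p:
            node = node["next"].setdefault(ch, new_node())
        node["word"] = p
    return root

def keyword_tree_search(root, text):
    hits = []
    for start in range(len(text)):         # M start positions
        node, i = root, start
        # Walk down the tree while the text keeps matching: up to N steps.
        while i < len(text) and text[i] in node["next"]:
            node = node["next"][text[i]]
            if node["word"] is not None:
                hits.append((start, node["word"]))
            i += 1
    return hits

tree = build_keyword_tree(["he", "she", "his", "hers"])
tree_hits = keyword_tree_search(tree, "ushers")
print(tree_hits)
```

All patterns are checked in a single walk per start position, which is where the factor K disappears from the search term.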

  32. Pattern Matching: Aho-Corasick
      ● K: number of patterns
      ● N: average length of pattern
      ● M: length of the query string
      ● Running Time: O(KN + M)
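Aho-Corasick avoids restarting from the root at every position by adding failure links to the keyword tree, so the query string is scanned only once. The sketch below is one common way to write it with list-based state tables; the names `build_automaton` and `aho_corasick_search` are assumptions, and reporting the matches adds work proportional to the number of matches on top of O(KN + M).

```python
from collections import deque

def build_automaton(patterns):
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognized on reaching each state
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({}); fail.append(0); out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first pass: depth-1 states keep their failure link at the root.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target (proper suffixes).
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out

def aho_corasick_search(patterns, text):
    goto, fail, out = build_automaton(patterns)
    hits, s = [], 0
    for i, ch in enumerate(text):          # single pass over the query string
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits

ac_hits = aho_corasick_search(["he", "she", "his", "hers"], "ushers")
print(sorted(ac_hits))
```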

  33. Pattern Matching Aho-Corasick One more example: http://blog.ivank.net/aho-corasick-algorithm-in-as3.html

  34. Pattern Matching: A different setting
      So far: fixed patterns with various query strings… What if we have a fixed string but different query patterns?
      compile the patterns -> compile the string

  35. Pattern Matching: Suffix Tree for {abcabx}
      ● N: average length of pattern
      ● M: length of the query string
      ● Running time: O(M^2 + N)
      Build a keyword tree: {abcabx, bcabx, cabx, abx, bx, x}

  36. Pattern Matching: Suffix Tree for {abcabx}
      ● N: average length of pattern
      ● M: length of the query string
      ● Running time: O(M^2 + N)
      O(M + N) with Ukkonen's algorithm, but that is not required!
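The O(M^2 + N) construction is the keyword tree built from all suffixes of the string, followed by a walk of at most N steps per query pattern. The sketch below builds that uncompressed tree (a suffix trie, without the edge compression of a true suffix tree and without Ukkonen's algorithm); the names `build_suffix_trie` and `contains` are assumptions made for illustration.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a nested-dict trie: O(M^2) work."""
    root = {}
    for start in range(len(text)):     # M suffixes
        node = root
        for ch in text[start:]:        # each suffix has up to M characters
            node = node.setdefault(ch, {})
    return root

def contains(root, pattern):
    """Check whether pattern occurs in the indexed string: O(N) steps."""
    node = root
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True

trie = build_suffix_trie("abcabx")
print(contains(trie, "cab"), contains(trie, "abd"))
```

Once the string is compiled, each new pattern costs only its own length to look up, which is the point of switching from compiling the patterns to compiling the string.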

  37. Good luck!
