Review Session I
CS 466
Wesley Wei Qian
March 10th, 2020
Midterm Exam
● This Thursday!
  ○ 03/12 class time
● Different building!
  ○ Natural History Building Room 2079
● Topics
Topics we have covered:
● Molecular Biology
● Probability and Statistics
● Sequence and Alignment
● Pattern Matching
● BLAST
Molecular Biology
- Molecules
  - DNA
  - RNA
  - Protein (polypeptide)
- Molecular Process
  - Transcription
  - Translation
  - Protein folding
  - Gene splicing: intron/exon
  - Gene regulation
- Genome
Probability and Statistics
Sequence and Alignment
● Global Alignment
● Local Alignment
Sequence and Alignment
Global Alignment: DDOGC vs DOG
+1 Match; -1 Mismatch; -2 Gap.

      *    D    D    O    G    C
 *    0   -2   -4   -6   -8  -10
 D   -2    1   -1   -3   -5   -7
 O   -4   -1    0    0   -2   -4
 G   -6   -3   -2   -1    1   -1

Optimal alignments (score -1):
DDOGC      DDOGC
D-OG-      -DOG-
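The table for DDOGC vs DOG can be reproduced with a short dynamic-programming sketch. This is only an illustration under the slide's scoring (+1 match, -1 mismatch, -2 gap); the function names `global_alignment_table` and `all_optimal_alignments` are made up for this example and are not from the course material.

```python
def global_alignment_table(s, t, match=1, mismatch=-1, gap=-2):
    """Fill the global alignment table; columns follow s, rows follow t."""
    rows, cols = len(t) + 1, len(s) + 1
    table = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):          # first row: s aligned against gaps only
        table[0][j] = j * gap
    for i in range(1, rows):          # first column: t aligned against gaps only
        table[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score = match if t[i - 1] == s[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + score,  # align t[i-1] with s[j-1]
                table[i - 1][j] + gap,        # gap in s
                table[i][j - 1] + gap,        # gap in t
            )
    return table


def all_optimal_alignments(s, t, table, match=1, mismatch=-1, gap=-2):
    """Trace back from the bottom-right cell along every optimal path."""
    results = []

    def walk(i, j, top, bottom):
        if i == 0 and j == 0:
            results.append((top[::-1], bottom[::-1]))
            return
        if i > 0 and j > 0:
            score = match if t[i - 1] == s[j - 1] else mismatch
            if table[i][j] == table[i - 1][j - 1] + score:
                walk(i - 1, j - 1, top + s[j - 1], bottom + t[i - 1])
        if j > 0 and table[i][j] == table[i][j - 1] + gap:
            walk(i, j - 1, top + s[j - 1], bottom + "-")
        if i > 0 and table[i][j] == table[i - 1][j] + gap:
            walk(i - 1, j, top + "-", bottom + t[i - 1])

    walk(len(t), len(s), "", "")
    return results


table = global_alignment_table("DDOGC", "DOG")
alignments = all_optimal_alignments("DDOGC", "DOG", table)
for row in table:
    print(row)
for top, bottom in alignments:
    print(top, bottom)
```

The table has (len(t) + 1) x (len(s) + 1) cells and each cell looks at three neighbors, which is a useful starting point when thinking about the running time.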
Sequence and Alignment
Local Alignment: DDOGC vs DOG
+1 Match; -1 Mismatch; -2 Gap.

      *    D    D    O    G    C
 *    0    0    0    0    0    0
 D    0    1    1    0    0    0
 O    0    0    0    2    0    0
 G    0    0    0    0    3    1

Best local alignment (score 3): substring DOG of DDOGC against DOG
DOG
DOG
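The local alignment table differs from the global one in two ways: every cell is floored at 0, and the answer is the best cell anywhere in the table rather than the bottom-right corner. A minimal sketch under the slide's scoring; the name `local_alignment_table` is invented for this example.

```python
def local_alignment_table(s, t, match=1, mismatch=-1, gap=-2):
    """Fill the local alignment table; columns follow s, rows follow t."""
    rows, cols = len(t) + 1, len(s) + 1
    table = [[0] * cols for _ in range(rows)]   # first row and column stay 0
    best, best_cell = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            score = match if t[i - 1] == s[j - 1] else mismatch
            table[i][j] = max(
                0,                            # start a fresh local alignment
                table[i - 1][j - 1] + score,  # align t[i-1] with s[j-1]
                table[i - 1][j] + gap,        # gap in s
                table[i][j - 1] + gap,        # gap in t
            )
            if table[i][j] > best:
                best, best_cell = table[i][j], (i, j)
    return table, best, best_cell


local_table, best, best_cell = local_alignment_table("DDOGC", "DOG")
for row in local_table:
    print(row)
print(best, best_cell)
```

Traceback (not shown here) would start at `best_cell` and stop at the first cell holding 0.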
Sequence and Alignment
Complexity?
Sequence and Alignment
Scoring function and BLOSUM matrix rounding factor
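As a reminder of where substitution scores come from: BLOSUM-style entries are scaled log-odds ratios rounded to integers. The sketch below assumes the slide's "rounding factor" refers to the scale applied before rounding (2, i.e. half-bit units, as in BLOSUM62). The function name and the frequencies in the usage lines are hypothetical, not real BLOSUM data.

```python
import math


def log_odds_score(p_ab, q_a, q_b, scale=2.0):
    """Rounded, scaled log-odds score for aligning residues a and b.

    p_ab : observed frequency of the aligned pair (a, b)
    q_a, q_b : background frequencies of a and b
    scale : factor applied before rounding (assumption: 2 gives half-bit units)
    """
    return round(scale * math.log2(p_ab / (q_a * q_b)))


# Hypothetical frequencies, chosen only to make the arithmetic easy to follow.
more_often_than_chance = log_odds_score(0.02, 0.1, 0.1)    # ratio of about 2
less_often_than_chance = log_odds_score(0.005, 0.1, 0.1)   # ratio of about 1/2
print(more_often_than_chance, less_often_than_chance)
```

Pairs seen more often than chance get a positive score, pairs seen less often get a negative one.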
Sequence and Alignment
Affine Gap Penalty
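The difference between a linear and an affine gap penalty is easiest to see as two cost functions of the gap length. This sketch assumes the convention that the first gap position pays the opening penalty and each further position pays the extension penalty; other texts charge open plus extend for the first position. The parameter values are hypothetical.

```python
def linear_gap_cost(length, gap=-2):
    """Every gap position costs the same."""
    return gap * length


def affine_gap_cost(length, gap_open=-3, gap_extend=-1):
    """Opening a gap is expensive, extending it is cheaper (assumed convention)."""
    if length == 0:
        return 0
    return gap_open + gap_extend * (length - 1)


for k in (1, 2, 4):
    print(k, linear_gap_cost(k), affine_gap_cost(k))
```

With these hypothetical values one long gap of length 4 costs -6 under the affine scheme, while four separate gaps of length 1 cost -12, so the affine scheme prefers fewer, longer gaps.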
Pattern Matching
Pattern Matching
Naive Approach
● K: number of patterns
● N: average length of pattern
● M: length of the query string
Running Time: O(KMN)
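The O(KMN) bound comes directly from three nested levels of work, as this sketch shows. The function name `naive_multi_match` is invented for this example.

```python
def naive_multi_match(patterns, text):
    """Try every pattern at every start position of the text."""
    hits = []
    for pattern in patterns:                                  # K patterns
        for start in range(len(text) - len(pattern) + 1):     # up to M starts
            if text[start:start + len(pattern)] == pattern:   # up to N comparisons
                hits.append((start, pattern))
    return sorted(hits)


naive_hits = naive_multi_match(["DOG", "OG", "CAT"], "DDOGC")
print(naive_hits)
```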
Pattern Matching
Keyword Tree
● K: number of patterns
● N: average length of pattern
● M: length of the query string
Running Time: O(KN + NM)
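A keyword tree shares the work across patterns: building it costs O(KN), and each of the M start positions then walks down the tree for at most N steps. A minimal sketch using nested dictionaries; the function names and the use of `None` as an end-of-pattern key are choices made for this example only.

```python
def build_keyword_tree(patterns):
    """Insert every pattern into a trie of nested dicts: O(KN) total."""
    root = {}
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node[None] = pattern      # None marks "a pattern ends here"
    return root


def keyword_tree_search(root, text):
    """Thread the text through the tree from every start position: O(NM)."""
    hits = []
    for start in range(len(text)):
        node = root
        for pos in range(start, len(text)):
            ch = text[pos]
            if ch not in node:
                break             # no pattern continues with this character
            node = node[ch]
            if None in node:
                hits.append((start, node[None]))
    return hits


tree = build_keyword_tree(["DOG", "OG", "DO"])
tree_hits = keyword_tree_search(tree, "DDOGC")
print(tree_hits)
```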
Pattern Matching
Aho-Corasick
● K: number of patterns
● N: average length of pattern
● M: length of the query string
Running Time: O(KN + M)
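Aho-Corasick adds failure links to the keyword tree so the text is scanned once without restarting from each position. The sketch below is one common way to write it (list-based states, breadth-first construction of failure links); the function names are invented here and this is not necessarily the formulation used in lecture.

```python
from collections import deque


def build_automaton(patterns):
    """Keyword tree plus failure links and merged outputs."""
    goto, fail, output = [{}], [0], [[]]      # state 0 is the root
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(pattern)
    queue = deque(goto[0].values())           # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:    # follow failure links upward
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def aho_corasick_search(patterns, text):
    """Scan the text once, reporting (start, pattern) for every occurrence."""
    goto, fail, output = build_automaton(patterns)
    hits, state = [], 0
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in output[state]:
            hits.append((pos - len(pattern) + 1, pattern))
    return hits


ac_hits = aho_corasick_search(["DOG", "OG", "DO"], "DDOGC")
print(ac_hits)
```

Reporting every occurrence adds a term for the number of matches on top of O(KN + M).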
Pattern Matching
Aho-Corasick
One more example: http://blog.ivank.net/aho-corasick-algorithm-in-as3.html
Pattern Matching
A different setting
Fixed patterns with various query strings… what if we have a fixed string but different query patterns?
Compile the patterns -> compile the string
Pattern Matching
Suffix Tree for {abcabx}
● N: average length of pattern
● M: length of the query string
Running time: O(M^2 + N)
Build a keyword tree: {abcabx, bcabx, cabx, abx, bx, x}
O(M + N) if we use Ukkonen's algorithm, but that is not required!
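The "keyword tree of all suffixes" construction can be sketched directly. Note that this builds an uncompressed suffix trie in O(M^2) time, not a compressed suffix tree and not Ukkonen's algorithm; the function names are invented for this example.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a trie of nested dicts: O(M^2)."""
    root = {}
    for start in range(len(text)):
        node = root
        for pos in range(start, len(text)):
            node = node.setdefault(text[pos], {})
    return root


def contains(root, pattern):
    """A pattern occurs in the text iff it is a prefix of some suffix: O(N)."""
    node = root
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


suffix_trie = build_suffix_trie("abcabx")
for query in ("cab", "abx", "abd"):
    print(query, contains(suffix_trie, query))
```

Once the string is compiled, each new pattern costs only its own length to look up, which is the point of the "compile the string" setting.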
Good luck!