Sequence Alignment
Mark Voorhies
5/20/2015

Exercise: Scoring an ungapped alignment

Given two sequences and a scoring matrix, find the offset that yields the best scoring ungapped alignment.

    def score(S, x, y):
        """Return alignment score for subsequences x and y
        for scoring matrix S (represented as a dict)"""
        assert len(x) == len(y)
        return sum(S[i][j] for (i, j) in zip(x, y))

    def subseqs(x, y, i):
        """Return subsequences of x and y for offset i."""
        if i > 0:
            y = y[i:]
        elif i < 0:
            x = x[-i:]
        L = min(len(x), len(y))
        return x[:L], y[:L]

    def ungapped(S, x, y):
        """Return best offset, score, and alignment between
        sequences x and y for scoring matrix S
        (represented as a dict)."""
        best = None
        best_score = None
        for i in range(-len(x) + 1, len(y)):
            (sx, sy) = subseqs(x, y, i)
            s = score(S, sx, sy)
            # Compare against None explicitly: "s > None" raises
            # a TypeError in Python 3.
            if best_score is None or s > best_score:
                best_score = s
                best = i
        return best, best_score, subseqs(x, y, best)

Dotplots

1. Unbiased view of all ungapped alignments of two sequences
2. Noise can be filtered by applying a smoothing window to the diagonals.
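The two points above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names `dotplot` and `smooth`, and the parameters `window` and `min_matches`, are assumptions chosen for this example. A cell is marked when the two residues are identical, and the filter keeps a cell only when enough matches fall inside a window centred on it along its diagonal.

```python
def dotplot(x, y):
    """Return a len(y)-by-len(x) grid with 1 where y[r] == x[c], else 0."""
    return [[1 if a == b else 0 for a in x] for b in y]


def smooth(dots, window=3, min_matches=2):
    """Mark a cell when at least min_matches cells are matches within a
    window of the given length centred on it along its diagonal.
    (window and min_matches are illustrative defaults.)"""
    rows, cols = len(dots), len(dots[0])
    half = window // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            hits = 0
            for k in range(-half, half + 1):
                rr, cc = r + k, c + k
                # Cells outside the grid contribute nothing.
                if 0 <= rr < rows and 0 <= cc < cols:
                    hits += dots[rr][cc]
            if hits >= min_matches:
                out[r][c] = 1
    return out


def show(dots, x, y):
    """Print the grid with x across the top and y down the side."""
    print("  " + " ".join(x))
    for label, row in zip(y, dots):
        print(label + " " + " ".join("*" if v else "." for v in row))


if __name__ == "__main__":
    x, y = "AGCGGTA", "GAGCGGA"
    show(smooth(dotplot(x, y)), x, y)
```

Each diagonal of the grid corresponds to one offset of the ungapped alignment exercise, so a long run of marks on a diagonal indicates a high scoring offset.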

Exercise: Scoring a gapped alignment

1. Given two equal length gapped sequences (where “-” represents a gap) and a scoring matrix, calculate an alignment score with a -1 penalty for each base aligned to a gap.
2. Write a new scoring function with separate penalties for opening a zero length gap (e.g., G = -11) and extending an open gap by one base (e.g., E = -1).

\[ S_{\text{gapped}}(x, y) = S(x, y) + \sum_{i}^{\text{gaps}} \bigl( G + E \cdot \operatorname{len}(i) \bigr) \]
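One possible reading of the formula above, as a hedged sketch rather than the intended solution: aligned residue pairs are scored from the matrix, and every maximal run of "-" in either sequence is charged G plus E times its length. The helper `simple_matrix` and its +1/-1 values are hypothetical, introduced only so the example runs. The sketch assumes no column has a gap in both sequences.

```python
import re


def simple_matrix(alphabet="ACGT", match=1, mismatch=-1):
    """Hypothetical scoring matrix as a dict of dicts (illustration only)."""
    return {a: {b: (match if a == b else mismatch) for b in alphabet}
            for a in alphabet}


def score_gapped(S, x, y, G=-11, E=-1):
    """Score two equal-length gapped sequences.

    Residue pairs are scored with S; each run of '-' in x or y adds
    G + E * (run length). With G=0 and E=-1 this reduces to a flat
    -1 penalty per base aligned to a gap (part 1 of the exercise).
    """
    assert len(x) == len(y)
    total = 0
    for a, b in zip(x, y):
        if a != "-" and b != "-":
            total += S[a][b]
    for seq in (x, y):
        for gap in re.finditer(r"-+", seq):
            total += G + E * len(gap.group())
    return total


if __name__ == "__main__":
    S = simple_matrix()
    print(score_gapped(S, "AG--CT", "AGTTCA", G=0, E=-1))
    print(score_gapped(S, "AG--CT", "AGTTCA"))
```

Counting gap runs per sequence means a gap in x followed directly by a gap in y is charged as two separate openings; other conventions are possible.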

How many ways can we align two sequences?

Binomial formula:
\[ \binom{k}{r} = \frac{k!}{(k-r)!\,r!} \qquad \binom{2n}{n} = \frac{(2n)!}{n!\,n!} \]

Stirling’s approximation:
\[ x! \approx \sqrt{2\pi}\, x^{x+\frac{1}{2}} e^{-x} \]
\[ \binom{2n}{n} \approx \frac{2^{2n}}{\sqrt{\pi n}} \]
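A quick numerical check of the approximation, as a sketch: the function names `exact_count` and `stirling_count` are made up for this example, and `math.comb` requires Python 3.8 or later. The ratio of the two values approaches 1 as n grows, while the count itself grows roughly as 4^n, which is why enumerating alignments is not practical.

```python
import math


def exact_count(n):
    """Exact central binomial coefficient (2n choose n)."""
    return math.comb(2 * n, n)


def stirling_count(n):
    """Approximation 2^(2n) / sqrt(pi * n) obtained from Stirling's formula."""
    return 2 ** (2 * n) / math.sqrt(math.pi * n)


if __name__ == "__main__":
    for n in (5, 10, 50, 100):
        exact = exact_count(n)
        approx = stirling_count(n)
        print(n, exact, f"{approx:.4g}", f"ratio={exact / approx:.4f}")
```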

Dynamic Programming

Needleman-Wunsch

            A   G   C   G   G   T   A

    G
    A
    G
    C
    G
    G
    A

Needleman-Wunsch

            A   G   C   G   G   T   A
        0  -1  -2  -3  -4  -5  -6  -7
    G  -1
    A  -2
    G  -3
    C  -4
    G  -5
    G  -6
    A  -7

Needleman-Wunsch

            A   G   C   G   G   T   A
        0  -1  -2  -3  -4  -5  -6  -7
    G  -1  -1
    A  -2
    G  -3
    C  -4
    G  -5
    G  -6
    A  -7

Candidate alignments for the first cell:

    A     A -     - A
    G     - G     G -

Needleman-Wunsch

            A   G   C   G   G   T   A
        0  -1  -2  -3  -4  -5  -6  -7
    G  -1  -1   0  -1  -2  -3  -4  -5
    A  -2   0  -1  -1  -2  -3  -4  -3
    G  -3  -1   1   0   0  -1  -2  -3
    C  -4  -2   0   2   1   0  -1  -2
    G  -5  -3  -1   1   3   2   1   0
    G  -6  -4  -2   0   2   4   3   2
    A  -7  -5  -3  -1   1   3   3   4
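The filled matrix can be reproduced with a short fill routine. The slides do not state the scoring scheme; the values used here (match +1, mismatch -1, gap -1) are an assumption inferred from the numbers in the matrix, and the function name `nw_fill` is hypothetical. Each cell takes the best of three moves: diagonal (align two residues), up (gap in the top sequence), or left (gap in the side sequence).

```python
def nw_fill(x, y, match=1, mismatch=-1, gap=-1):
    """Fill a Needleman-Wunsch score matrix with a linear gap penalty.

    x runs across the columns and y down the rows, as on the slides.
    The default scores are inferred from the slide matrix (assumption).
    """
    rows, cols = len(y) + 1, len(x) + 1
    F = [[0] * cols for _ in range(rows)]
    # First row and column: a prefix aligned entirely to gaps.
    for j in range(1, cols):
        F[0][j] = j * gap
    for i in range(1, rows):
        F[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if y[i - 1] == x[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # diagonal: align residues
                          F[i - 1][j] + gap,     # up: gap in x
                          F[i][j - 1] + gap)     # left: gap in y
    return F


if __name__ == "__main__":
    for row in nw_fill("AGCGGTA", "GAGCGGA"):
        print(" ".join(f"{v:3d}" for v in row))
```

With these assumed scores, the bottom-right cell is 4, the score of the best global alignment of the two example sequences.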

Homework

Implement Needleman-Wunsch global alignment with zero gap opening penalties. Try attacking the problem in this order:

1. Initialize and fill in a dynamic programming matrix by hand (e.g., try reproducing the example from my slides on paper).
2. Write a function to create the dynamic programming matrix and initialize the first row and column.
3. Write a function to fill in the rest of the matrix.
4. Rewrite the initialize and fill steps to store pointers to the best sub-solution for each cell.
5. Write a backtrace function to read the optimal alignment from the filled in matrix.

If that isn’t enough to keep you occupied, read the dynamic programming references from the class website. Try to articulate in your own words the logic for the speed-ups and trade-offs in the Myers and Miller approach.
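For steps 4 and 5, one way the pointer storage and backtrace could look is sketched below. It is not the reference solution: the function name `needleman_wunsch`, the string pointer labels, and the +1/-1/-1 scores are all assumptions made for this example. Ties are broken in favour of the diagonal move, so other equally good alignments may exist.

```python
def needleman_wunsch(x, y, match=1, mismatch=-1, gap=-1):
    """Global alignment with a linear gap penalty (no gap opening cost).

    Returns (score, aligned_x, aligned_y). Scores are assumed values.
    """
    rows, cols = len(y) + 1, len(x) + 1
    F = [[0] * cols for _ in range(rows)]
    ptr = [[None] * cols for _ in range(rows)]  # ptr[0][0] stays None
    for j in range(1, cols):
        F[0][j], ptr[0][j] = j * gap, "left"
    for i in range(1, rows):
        F[i][0], ptr[i][0] = i * gap, "up"
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if y[i - 1] == x[j - 1] else mismatch
            choices = [(F[i - 1][j - 1] + s, "diag"),
                       (F[i - 1][j] + gap, "up"),
                       (F[i][j - 1] + gap, "left")]
            # max returns the first best choice, so diag wins ties.
            F[i][j], ptr[i][j] = max(choices, key=lambda c: c[0])

    # Backtrace from the bottom-right cell to the origin.
    ax, ay = [], []
    i, j = len(y), len(x)
    while ptr[i][j] is not None:
        move = ptr[i][j]
        if move == "diag":
            ax.append(x[j - 1])
            ay.append(y[i - 1])
            i, j = i - 1, j - 1
        elif move == "up":
            ax.append("-")
            ay.append(y[i - 1])
            i -= 1
        else:
            ax.append(x[j - 1])
            ay.append("-")
            j -= 1
    return F[-1][-1], "".join(reversed(ax)), "".join(reversed(ay))


if __name__ == "__main__":
    score, ax, ay = needleman_wunsch("AGCGGTA", "GAGCGGA")
    print(score)
    print(ax)
    print(ay)
```

Storing a pointer per cell costs as much memory as the score matrix itself; that quadratic space cost is the kind of trade-off the homework's reading on the Myers and Miller approach addresses.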
