Sequence Alignment: Linear Space

Sequence Alignment: Linear Space

Q. Can we avoid using quadratic space?

Easy. Optimal value in O(m + n) space and O(mn) time.
- Compute OPT(i, •) from OPT(i-1, •).
- No longer a simple way to recover alignment itself.

Theorem. [Hirschberg 1975] Optimal alignment in O(m + n) space and O(mn) time.
- Clever combination of divide-and-conquer and dynamic programming.
- Inspired by idea of Savitch from complexity theory.
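The "easy" part, computing only the optimal value, can be sketched as follows. This is a minimal illustration, not code from the slides: the function names `alignment_cost` and `mismatch`, the default gap penalty `delta=1`, and the 0/1 mismatch cost are all assumptions chosen for the example. Only two rows of the table are alive at any time.

```python
def mismatch(a, b):
    """Assumed mismatch cost alpha: 0 for equal symbols, 1 otherwise."""
    return 0 if a == b else 1


def alignment_cost(x, y, delta=1, alpha=mismatch):
    """Return OPT(m, n), keeping only rows i-1 and i of the DP table."""
    m, n = len(x), len(y)
    prev = [j * delta for j in range(n + 1)]  # row 0: OPT(0, j) = j * delta
    for i in range(1, m + 1):
        curr = [i * delta] + [0] * n  # OPT(i, 0) = i * delta
        for j in range(1, n + 1):
            curr[j] = min(
                alpha(x[i - 1], y[j - 1]) + prev[j - 1],  # match x_i with y_j
                delta + prev[j],                          # x_i unmatched
                delta + curr[j - 1],                      # y_j unmatched
            )
        prev = curr  # row i-1 is no longer needed
    return prev[n]


print(alignment_cost("kitten", "sitting"))
```

Because earlier rows are discarded, the table can no longer be traced back to recover the alignment, which is the difficulty the theorem addresses.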

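Sequence Alignment: Linear Space

Edit distance graph.
- Let f(i, j) be the length of the shortest path from (0, 0) to (i, j).
- Observation: f(i, j) = OPT(i, j).

[Figure: edit distance graph, a grid of nodes from 0-0 to m-n with rows ε, x1, x2, x3 and columns ε, y1, ..., y6. The diagonal edge into node i-j has cost α_{x_i y_j}; the horizontal and vertical edges have cost δ.]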
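One way to see the observation, assuming the edge costs shown in the figure (δ for a gap, α for a match or mismatch): the shortest-path lengths satisfy the same recurrence and boundary conditions as OPT.

```latex
\begin{aligned}
f(0, 0) &= 0, \qquad f(i, 0) = i\,\delta, \qquad f(0, j) = j\,\delta, \\
f(i, j) &= \min\bigl\{\, \alpha_{x_i y_j} + f(i-1, j-1),\;\; \delta + f(i-1, j),\;\; \delta + f(i, j-1) \,\bigr\}
\quad \text{for } i, j \ge 1 .
\end{aligned}
```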

Sequence Alignment: Linear Space

Edit distance graph.
- Let f(i, j) be the length of the shortest path from (0, 0) to (i, j).
- Can compute f(•, j) for any j in O(mn) time and O(m + n) space.

[Figure: edit distance graph with column j highlighted.]

Sequence Alignment: Linear Space

Edit distance graph.
- Let g(i, j) be the length of the shortest path from (i, j) to (m, n).
- Can compute by reversing the edge orientations and inverting the roles of (0, 0) and (m, n).

[Figure: edit distance graph with edge orientations reversed. The diagonal edge at node i-j has cost α_{x_i y_j}; the horizontal and vertical edges have cost δ.]

Sequence Alignment: Linear Space

Edit distance graph.
- Let g(i, j) be the length of the shortest path from (i, j) to (m, n).
- Can compute g(•, j) for any j in O(mn) time and O(m + n) space.

[Figure: edit distance graph with column j highlighted.]
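A minimal sketch of the backward computation, sweeping columns from n down to j so that (m, n) plays the role of the source. The names `backward_column` and `mismatch`, the default `delta=1`, and the 0/1 mismatch cost are assumptions for illustration, not part of the slides. Only one column is kept in memory at a time.

```python
def mismatch(a, b):
    """Assumed mismatch cost alpha: 0 for equal symbols, 1 otherwise."""
    return 0 if a == b else 1


def backward_column(x, y, j, delta=1, alpha=mismatch):
    """Return [g(0, j), ..., g(m, j)], where g(i, j) is the cost of
    aligning the suffixes x[i:] and y[j:] (0-based slices)."""
    m, n = len(x), len(y)
    col = [(m - i) * delta for i in range(m + 1)]  # column n: g(i, n)
    for jj in range(n - 1, j - 1, -1):
        new = [0] * m + [(n - jj) * delta]  # bottom row: g(m, jj)
        for i in range(m - 1, -1, -1):
            new[i] = min(
                alpha(x[i], y[jj]) + col[i + 1],  # diagonal edge
                delta + new[i + 1],               # vertical edge (gap in y)
                delta + col[i],                   # horizontal edge (gap in x)
            )
        col = new  # column jj + 1 is no longer needed
    return col


print(backward_column("ab", "b", 0))
```

With j = 0 the first entry is g(0, 0), which equals the optimal alignment cost of the full strings.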

Sequence Alignment: Linear Space

Observation 1. The cost of the shortest path that uses (i, j) is f(i, j) + g(i, j).

[Figure: edit distance graph with node i-j highlighted.]

Sequence Alignment: Linear Space

Observation 2. Let q be an index that minimizes f(q, n/2) + g(q, n/2). Then there is a shortest path from (0, 0) to (m, n) that uses (q, n/2).

[Figure: edit distance graph with column n/2 and node (q, n/2) highlighted.]
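The reasoning behind Observation 2 can be spelled out briefly, taking n/2 to mean ⌊n/2⌋ (an assumption; the slides do not say how odd n is rounded). Every edge increases the column index by at most one, so any path from (0, 0) to (m, n) must visit some node in column n/2. Combined with Observation 1:

```latex
\mathrm{OPT}(m, n)
  \;=\; \min_{0 \le i \le m} \bigl[\, f(i, n/2) + g(i, n/2) \,\bigr]
  \;=\; f(q, n/2) + g(q, n/2) .
```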

Sequence Alignment: Linear Space

Divide: find index q that minimizes f(q, n/2) + g(q, n/2) using DP.
- Record node (q, n/2) as part of the optimal path.

Conquer: recursively compute optimal alignment in each piece.

[Figure: edit distance graph split at column n/2 and node (q, n/2) into two sub-problems.]
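The divide and conquer steps can be put together as in the sketch below. It is an illustration under stated assumptions, not the slides' own code: the names `hirschberg`, `column_costs`, `aligned_cost`, and `mismatch` are hypothetical, `delta=1` with a 0/1 mismatch cost is an arbitrary choice, and `-` is assumed not to occur in the inputs. For brevity it slices strings, which copies them; an index-based version would be needed to stay strictly within O(m + n) space.

```python
def mismatch(a, b):
    """Assumed mismatch cost alpha: 0 for equal symbols, 1 otherwise."""
    return 0 if a == b else 1


def column_costs(x, y, delta, alpha):
    """Cost of aligning x[:i] with all of y, for i = 0..len(x)."""
    col = [i * delta for i in range(len(x) + 1)]
    for j in range(1, len(y) + 1):
        new = [j * delta] + [0] * len(x)
        for i in range(1, len(x) + 1):
            new[i] = min(alpha(x[i - 1], y[j - 1]) + col[i - 1],
                         delta + col[i],
                         delta + new[i - 1])
        col = new
    return col


def hirschberg(x, y, delta=1, alpha=mismatch):
    """Return (aligned_x, aligned_y), with '-' marking gaps."""
    m, n = len(x), len(y)
    if m == 0:
        return "-" * n, y
    if n == 0:
        return x, "-" * m
    if n == 1:
        # Base case: match y[0] with its cheapest partner, or leave it unmatched.
        i = min(range(m), key=lambda k: alpha(x[k], y[0]))
        if alpha(x[i], y[0]) + (m - 1) * delta <= (m + 1) * delta:
            return x, "-" * i + y + "-" * (m - 1 - i)
        return x + "-", "-" * m + y
    mid = n // 2
    # Divide: f(i, mid) from the left half, g(i, mid) from the reversed right half.
    f = column_costs(x, y[:mid], delta, alpha)
    g = column_costs(x[::-1], y[mid:][::-1], delta, alpha)[::-1]
    q = min(range(m + 1), key=lambda i: f[i] + g[i])
    # Conquer: solve the two pieces on either side of node (q, mid).
    left_x, left_y = hirschberg(x[:q], y[:mid], delta, alpha)
    right_x, right_y = hirschberg(x[q:], y[mid:], delta, alpha)
    return left_x + right_x, left_y + right_y


def aligned_cost(ax, ay, delta=1, alpha=mismatch):
    """Total cost of an alignment given as two gapped strings."""
    return sum(delta if "-" in (a, b) else alpha(a, b) for a, b in zip(ax, ay))


ax, ay = hirschberg("kitten", "sitting")
print(ax, ay, aligned_cost(ax, ay))
```

Reversing both strings lets the same forward routine produce g, mirroring the slides' remark about reversing edge orientations.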

Sequence Alignment: Running Time Analysis Warmup

Theorem. Let T(m, n) = max running time of algorithm on strings of length at most m and n. T(m, n) = O(mn log n).

$$T(m, n) \le 2\,T(m, n/2) + O(mn) \;\Rightarrow\; T(m, n) = O(mn \log n)$$

Remark. Analysis is not tight because the two sub-problems are of size (q, n/2) and (m - q, n/2). In the next slide, we save the log n factor.
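The implication can be checked by unrolling the recurrence. This sketch writes the O(mn) term as cmn for some constant c and assumes T(m, 1) = O(m), neither of which is stated on the slide:

```latex
\begin{aligned}
T(m, n) &\le 2\,T(m, n/2) + cmn \\
        &\le 4\,T(m, n/4) + 2\,cmn \\
        &\le \cdots \\
        &\le 2^{k}\,T(m, n/2^{k}) + k\,cmn .
\end{aligned}
```

Taking k = log_2 n gives T(m, n) ≤ n T(m, 1) + cmn log_2 n = O(mn log n).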

Sequence Alignment: Running Time Analysis

Theorem. Let T(m, n) = max running time of algorithm on strings of length at most m and n. T(m, n) = O(mn).

Pf. (by induction on n)
- O(mn) time to compute f(•, n/2) and g(•, n/2) and find index q.
- T(q, n/2) + T(m - q, n/2) time for two recursive calls.
- Choose constant c so that:

$$
\begin{aligned}
T(m, 2) &\le cm \\
T(2, n) &\le cn \\
T(m, n) &\le cmn + T(q, n/2) + T(m - q, n/2)
\end{aligned}
$$

- Base cases: m = 2 or n = 2.
- Inductive hypothesis: T(m, n) ≤ 2cmn.

$$
\begin{aligned}
T(m, n) &\le T(q, n/2) + T(m - q, n/2) + cmn \\
        &\le 2cqn/2 + 2c(m - q)n/2 + cmn \\
        &= cqn + cmn - cqn + cmn \\
        &= 2cmn
\end{aligned}
$$
