Combinatorial RNA Design: Designability and Structure-Approximating Algorithm
Jozef Haleš (1), Ján Maňuch (1, 3), Yann Ponty (1, 2), Ladislav Stacho (1)
(1) Simon Fraser University, Canada
(2) Pacific Institute for Mathematical Sciences, Canada
(3) University of British Columbia, Canada
Ján Maňuch, CPM 2015
Outline: RNA Secondary Structures, Our Results, Open Problems
RNA Structures
RNA is composed of four bases: adenine (A), guanine (G), cytosine (C) and uracil (U).
Source: http://www.mpi-inf.mpg.de/departments/d1/projects/CompBio/align.html
Representations of Secondary Structures
A structure is a pair $(n, P)$, where $n$ is the number of bases and $P$ is a set of pairs $(i, j)$ with $1 \le i < j \le n$, each representing a base pair between the $i$-th base and the $j$-th base.
[Figure: one 68-base structure shown in three representations: a planar drawing, an arc diagram, and a tree representation. In the tree, the children of Root are [1,1], [2,2], [3,3], [4,66], [67,67], [68,68]; a node [i,j] with i < j is a base pair and a leaf [i,i] is an unpaired position.]
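The pair $(n, P)$ can be built mechanically from a textual encoding of a structure. The sketch below assumes the common dot-bracket notation (matching parentheses for base pairs, dots for unpaired bases), which this slide does not itself introduce; the function name `parse_dot_bracket` is illustrative only.

```python
def parse_dot_bracket(db: str) -> tuple[int, set[tuple[int, int]]]:
    """Convert a dot-bracket string into (n, P) with 1-based positions."""
    stack: list[int] = []              # positions of '(' still waiting for a partner
    pairs: set[tuple[int, int]] = set()
    for pos, ch in enumerate(db, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.add((stack.pop(), pos))   # innermost open bracket pairs with pos
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return len(db), pairs


n, P = parse_dot_bracket("((.)(..))")
print(n, sorted(P))  # 9 [(1, 9), (2, 4), (5, 8)]
```

Because brackets are matched with a stack, every structure produced this way is free of crossing pairs.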
Pseudoknot-Free Secondary Structures
[Figure: a pseudoknot-free structure next to a pseudoknotted structure.]
Let $\mathcal{S}_n$ denote the set of all pseudoknot-free structures with $n$ bases.
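The slide distinguishes the two cases only by picture. Under the usual definition, assumed here, a structure is pseudoknot-free when no two base pairs $(i, j)$ and $(k, l)$ cross, that is, when $i < k < j < l$ never holds. A minimal sketch of that test, with an illustrative function name:

```python
from itertools import combinations


def is_pseudoknot_free(pairs: set[tuple[int, int]]) -> bool:
    """Return True if every position is in at most one pair and no two pairs cross."""
    positions = [p for pair in pairs for p in pair]
    if len(positions) != len(set(positions)):
        return False                     # some base would be paired twice
    # After sorting, the first pair of each combination starts no later than the second.
    for (i, j), (k, l) in combinations(sorted(pairs), 2):
        if i < k < j < l:                # the two arcs cross
            return False
    return True


print(is_pseudoknot_free({(1, 9), (2, 4), (5, 8)}))  # True
print(is_pseudoknot_free({(1, 3), (2, 4)}))          # False
```

The quadratic pairwise check is chosen for clarity; a stack-based scan would do the same job in linear time.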
RNA Folding
Let $M$ be an energy model. The RNA folding problem looks for the minimum free energy (MFE) structure(s) of a sequence.
Problem RNA-FOLD$_M$
Input: RNA sequence $w$
Output: the set of pseudoknot-free (PKF) structures $\arg\min_{S \in \mathcal{S}_{|w|}} E_M(w, S)$.
Assuming an additive energy model, which adds up local contributions, one structure in RNA-FOLD$_M(w)$ can be found in time $O(n^3 / \log n)$ using dynamic programming [Nussinov, Jacobson (1980), Frid et al. (2010), etc.].
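As an illustration of the dynamic-programming idea, here is a sketch of the classical cubic-time recurrence for base-pair maximization. It is the plain $O(n^3)$ version, not the $O(n^3 / \log n)$ speed-up cited above, and it assumes that any two complementary positions may pair (no minimum hairpin length). It returns only the optimal number of pairs, not a structure.

```python
# Watson-Crick pairs only (no G-U wobble pairs).
WC = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def max_wc_pairs(w: str) -> int:
    """Maximum number of non-crossing Watson-Crick pairs in sequence w."""
    n = len(w)
    if n == 0:
        return 0
    # best[i][j] = optimum for the substring w[i..j] (0-based, inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                  # case: position j stays unpaired
            for k in range(i, j):                   # case: position j pairs with k
                if (w[k], w[j]) in WC:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    score = max(score, left + inner + 1)
            best[i][j] = score
    return best[0][n - 1]


print(max_wc_pairs("ACAGGUUCU"))  # 3
```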
Energy Models
Turner model: the free energy is the sum of loop energies. Source: [Lorenz, Clote (2011)]
Simplified models:
- Base-pair maximization (Watson-Crick model) $\mathcal{W}$: count the number of Watson-Crick base pairs (C·G and A·U).
- Base-pair sum: sum of the energy contributions of base pairs ($\delta_B(x, x')$); usually includes the weak base pair G·U.
- Stacked base pairs: sum of the energy contributions of consecutively nested pairs ($\delta_S(x, x', y, y')$).
- Nearest neighbor
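One way to write the three simplified models as energy functions of a sequence $w$ and a structure $S = (n, P)$ is sketched below. The sign convention (each Watson-Crick pair contributes $-1$), the $+\infty$ penalty for non-complementary pairs, and the index convention for stacked pairs are assumptions made for illustration; the slide gives only the informal descriptions.

```latex
\begin{align*}
E_{\mathcal{W}}(w, S) &=
  \begin{cases}
    -|P| & \text{if } \{w_i, w_j\} \in \{\{C, G\}, \{A, U\}\} \text{ for all } (i, j) \in P,\\
    +\infty & \text{otherwise,}
  \end{cases}\\
E_{B}(w, S) &= \sum_{(i, j) \in P} \delta_B(w_i, w_j),\\
E_{S}(w, S) &= \sum_{\substack{(i, j) \in P\\ (i+1,\, j-1) \in P}} \delta_S(w_i, w_j, w_{i+1}, w_{j-1}).
\end{align*}
```

With this convention, minimizing $E_{\mathcal{W}}$ is the same as maximizing the number of Watson-Crick base pairs.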
RNA Design Problem
Let $M$ be an energy model.
Problem RNA-DESIGN$_{M, \Sigma, \Delta}$
Input: secondary structure $S$ and energy distance $\Delta > 0$
Output: an RNA sequence $w \in \Sigma^\star$, called a design for $S$, such that
$\forall S' \in \mathcal{S}_{|w|} \setminus \{S\}: E_M(w, S') \ge E_M(w, S) + \Delta$,
or $\varnothing$ if no such sequence exists.
RNA Design Problem (simplified)
Simplified formulation for the Watson-Crick model $\mathcal{W}$ and $\Delta = 1$:
Problem RNA-DESIGN$_\Sigma$
Input: secondary structure $S$
Output: an RNA sequence $w \in \Sigma^\star$, called a design for $S$, such that RNA-FOLD$_{\mathcal{W}}(w) = \{S\}$, or $\varnothing$ if no such sequence exists.
Example
a. Target secondary structure S: ((.)(..))
b. Invalid sequence for S: GGACAGGUC
c. Design for S: ACAGGUUCU
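The condition RNA-FOLD$_{\mathcal{W}}(w) = \{S\}$ can be checked by brute force on small inputs. The sketch below enumerates every pseudoknot-free set of Watson-Crick pairs on $w$, keeps those with the maximum number of pairs, and compares the result with the target. It assumes no minimum hairpin length, takes exponential time, and uses illustrative function names; the two 9-base sequences are read off the example on this slide.

```python
WC = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def structures(w: str, i: int, j: int):
    """Yield each pseudoknot-free set of WC pairs on w[i..j] (0-based) exactly once."""
    if i >= j:
        yield frozenset()
        return
    yield from structures(w, i + 1, j)              # position i unpaired
    for k in range(i + 1, j + 1):                   # position i paired with k
        if (w[i], w[k]) in WC:
            for inner in structures(w, i + 1, k - 1):
                for outer in structures(w, k + 1, j):
                    yield inner | outer | {(i + 1, k + 1)}   # store 1-based pair


def fold(w: str) -> set[frozenset]:
    """All structures with the maximum number of Watson-Crick pairs."""
    candidates = list(structures(w, 0, len(w) - 1))
    best = max(len(s) for s in candidates)
    return {s for s in candidates if len(s) == best}


def is_design(w: str, target: set[tuple[int, int]]) -> bool:
    """True if the target is the unique optimal structure of w."""
    return fold(w) == {frozenset(target)}


S = {(1, 9), (2, 4), (5, 8)}          # the target ((.)(..))
print(is_design("GGACAGGUC", S))      # False
print(is_design("ACAGGUUCU", S))      # True
```

GGACAGGUC fails because the structure with pairs (1,9), (3,8), (4,6) also reaches three Watson-Crick pairs, so the optimum is not unique.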
Let Designable($\Sigma$) be the set of all structures for which there exists a design.
Our Results: Definitions and notations
Given a secondary structure $S$, let Unpaired$_S$ be the set of all unpaired positions of $S$.
[Figure: example structure with Unpaired$_S = \{4, 8\}$.]
$S$ is saturated if Unpaired$_S = \varnothing$. Let Saturated be the set of all saturated structures.
[Figure: a structure that is not saturated next to a saturated structure.]
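Both definitions translate directly into set operations on $(n, P)$. In the sketch below, the 8-base structure is hypothetical: it was chosen so that its unpaired set is {4, 8}, and it is not necessarily the structure drawn on the slide. Function names are illustrative.

```python
def unpaired_positions(n: int, pairs: set[tuple[int, int]]) -> set[int]:
    """Positions 1..n that do not take part in any base pair."""
    paired = {pos for pair in pairs for pos in pair}
    return {pos for pos in range(1, n + 1) if pos not in paired}


def is_saturated(n: int, pairs: set[tuple[int, int]]) -> bool:
    """A structure is saturated when it has no unpaired position."""
    return not unpaired_positions(n, pairs)


# Hypothetical structure "(().)()." with pairs (1,5), (2,3), (6,7).
print(sorted(unpaired_positions(8, {(1, 5), (2, 3), (6, 7)})))  # [4, 8]
print(is_saturated(8, {(1, 5), (2, 3), (6, 7)}))                # False
print(is_saturated(4, {(1, 4), (2, 3)}))                        # True
```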