
RNA Bioinformatics Beyond the One Sequence-One Structure Paradigm



  1. RNA Bioinformatics Beyond the One Sequence-One Structure Paradigm. Peter Schuster, Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA. 2008 Molecular Informatics and Bioinformatics, Collegium Budapest, 27.–29.03.2008

  2. Web-Page for further information: http://www.tbi.univie.ac.at/~pks

  3. 1. Computation of RNA equilibrium structures 2. Inverse folding and neutral networks 3. Evolutionary optimization of structure 4. Suboptimal conformations and kinetic folding

  4. 1. Computation of RNA equilibrium structures 2. Inverse folding and neutral networks 3. Evolutionary optimization of structure 4. Suboptimal conformations and kinetic folding

  5. Definition of RNA structure. Sequence, from 5'-end to 3'-end: GCGGAU UUA GCUC AGUUGGGA GAGC CCAGA G CUGAAGA UCUGG AGGUC CUGUG UUCGAUC CACAG A AUUCGC ACCA. Each nucleobase N_k is one of A, U, G, C. [Figure: chemical formula of the ribose-phosphate backbone carrying the bases N_1 to N_4, with Na+ counterions, running from the 5'-end to the 3'-end.]

  6. Number of sequences: $N = 4^n$. Number of structures: $N_S < 3^n$. Criterion: minimum free energy (mfe). Rules: _ ( _ ) _ ∈ {AU, CG, GC, GU, UA, UG}. A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs

  7. Conventional definition of RNA secondary structures

  8. H-type pseudoknot

  9. Counting the numbers of structures of chain length n → n+1: $S_{n+1} = S_n + \sum_{j=1}^{n-1} S_j \cdot S_{n-j-1}$. M.S. Waterman, T.F. Smith (1978) Math. Biosci. 42:257-266
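
The counting recursion $S_{n+1} = S_n + \sum_{j=1}^{n-1} S_j \cdot S_{n-j-1}$ can be evaluated directly. The following is a minimal sketch; the initial values S_0 = S_1 = 1 (only the open chain exists) are an assumption that is not stated on the slide, and the function name is hypothetical.

```python
def count_secondary_structures(n_max):
    """Return [S_0, ..., S_n_max] from S_{n+1} = S_n + sum_{j=1}^{n-1} S_j * S_{n-j-1}.

    Assumption (not on the slide): S_0 = S_1 = 1.
    """
    S = [1, 1]
    for n in range(1, n_max):
        # First term: the new nucleotide stays unpaired.
        # Sum: the new nucleotide pairs and splits the chain into two parts.
        S.append(S[n] + sum(S[j] * S[n - j - 1] for j in range(1, n)))
    return S[: n_max + 1]


print(count_secondary_structures(8))
```

Under these assumptions the numbers grow roughly exponentially with chain length, which is the point of the slide: even short sequences have a large space of possible secondary structures.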

  10. Restrictions on physically acceptable mfe-structures: λ ≥ 3 and σ ≥ 2

  11. Size restriction of elements: (i) hairpin loop: $n_{loop} \ge \lambda$; (ii) stack: $n_{stack} \ge \sigma$.
      $S_{m+1} = \Xi_{m+1} + \Phi_{m-1}$
      $\Xi_{m+1} = S_m + \sum_{k=\lambda+2\sigma-2}^{m-2} \Phi_k \cdot S_{m-k-1}$
      $\Phi_{m+1} = \sum_{k=\sigma-1}^{\lfloor (m-\lambda+1)/2 \rfloor} \Xi_{m-2k+1}$
      $S_n$ is the number of structures of a sequence with chain length n. Recursion formula for the number of physically acceptable stable structures. I.L. Hofacker, P. Schuster, P.F. Stadler. 1998. Discr. Appl. Math. 89:177-207
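
The restricted recursion (S = Ξ + Φ, with hairpin loops of at least λ and stacks of at least σ) can be sketched with memoized functions. This is an illustration under assumptions: the indices are shifted so that each function takes the chain length itself as its argument, and the base cases (S = 1 for chain lengths below 2, Ξ_0 = 1, empty sums equal to 0) are not given on the slide. The function name is hypothetical.

```python
from functools import lru_cache


def count_restricted_structures(n, lam=3, sigma=2):
    """Count structures of chain length n with hairpin loops >= lam and stacks >= sigma.

    Assumed base cases (not on the slide): S(0) = S(1) = 1, Xi(0) = 1.
    """

    @lru_cache(maxsize=None)
    def S(m):
        if m < 2:
            return 1
        # Either not enclosed by a pair (Xi) or enclosed by a valid stack (Phi).
        return Xi(m) + Phi(m - 2)

    @lru_cache(maxsize=None)
    def Xi(m):
        if m == 0:
            return 1
        total = S(m - 1)  # last nucleotide unpaired
        for k in range(lam + 2 * sigma - 2, m - 2):  # k = lam+2*sigma-2 .. m-3
            total += Phi(k) * S(m - k - 2)
        return total

    @lru_cache(maxsize=None)
    def Phi(m):
        total = 0
        for k in range(sigma - 1, (m - lam) // 2 + 1):
            total += Xi(m - 2 * k)
        return total

    return S(n)


print([count_restricted_structures(n) for n in range(6, 12)])
```

With λ = 1 and σ = 1 the sketch reproduces the unrestricted Waterman-Smith numbers, which is a useful consistency check on the assumed base cases.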

  12. RNA sequence: GUAUCGAAAUACGUAGCGUAUGGGGAUGCUGGACGGUCCCAUCGGUACUCCA. RNA folding (inputs: biophysical chemistry, thermodynamics and kinetics; empirical parameters) yields the RNA structure of minimal free energy. Uses of RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function. Sequence, structure, and design

  13. [Figure: conformations S_0^(h) to S_9^(h) arranged along a free energy axis; S_0^(h) is the minimum of free energy, S_1^(h) to S_9^(h) are suboptimal conformations.] The minimum free energy structures on a discrete space of conformations

  14. Elements of RNA secondary structures as used in free energy calculations: $\Delta G_0^{300} = \sum_{\text{stacks of base pairs}} g_{ij,kl} + \sum_{\text{hairpin loops}} h(n_l) + \sum_{\text{bulges}} b(n_b) + \sum_{\text{internal loops}} i(n_i) + \dots$

  15. Maximum matching. An example of a dynamic programming computation of the maximum number of base pairs. Back tracking yields the structure(s).

     i\j    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
            G  G  C  G  C  G  C  C  C  G  G  C  G  C  C
     1  G   *  *  1  1  1  1  2  3  3  3  4  4  5  6  6
     2  G      *  *  0  1  1  2  2  2  3  3  4  4  5  6
     3  C         *  *  0  1  1  1  2  3  3  3  4  5  5
     4  G            *  *  0  1  1  2  2  2  3  4  5  5
     5  C               *  *  0  1  1  2  2  3  4  4  4
     6  G                  *  *  1  1  1  2  3  3  3  4
     7  C                     *  *  0  1  2  2  2  2  3
     8  C                        *  *  1  1  1  2  2  2
     9  C                           *  *  1  1  2  2  2
    10  G                              *  *  1  1  1  2
    11  G                                 *  *  0  1  1
    12  C                                    *  *  0  1
    13  G                                       *  *  1
    14  C                                          *  *
    15  C                                             *

      Recursion over the segments [i,k-1] and [k+1,j]: $X_{i,j+1} = \max\{ X_{i,j},\ \max_{i \le k \le j-1} ( (X_{i,k-1} + 1 + X_{k+1,j}) \cdot \rho_{k,j+1} ) \}$
      Minimum free energy computations are based on empirical energies
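
The maximum matching recursion, $X_{i,j+1} = \max\{ X_{i,j}, \max_{i \le k \le j-1} ( (X_{i,k-1} + 1 + X_{k+1,j}) \cdot \rho_{k,j+1} ) \}$, can be sketched as a table fill followed by back tracking. The code below is an illustration, not the speaker's implementation: it uses 0-based indices, takes ρ = 1 for the pairs AU, UA, GC, CG, GU, UG, and returns only one of possibly several optimal structures. All function names are hypothetical.

```python
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def fill_table(seq):
    """X[i][j] = maximum number of base pairs on seq[i..j] (inclusive, 0-based)."""
    n = len(seq)
    X = [[0] * n for _ in range(n)]
    for span in range(2, n):  # a pair needs at least one base in between
        for i in range(n - span):
            j = i + span
            best = X[i][j - 1]  # position j unpaired
            for k in range(i, j - 1):  # position j paired with k
                if seq[k] + seq[j] in PAIRS:
                    left = X[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + X[k + 1][j - 1])
            X[i][j] = best
    return X


def backtrack(seq, X):
    """Recover one optimal structure in dot-bracket notation."""
    out = ["."] * len(seq)
    todo = [(0, len(seq) - 1)]
    while todo:
        i, j = todo.pop()
        if j - i < 2:
            continue
        if X[i][j] == X[i][j - 1]:
            todo.append((i, j - 1))
            continue
        for k in range(i, j - 1):
            left = X[i][k - 1] if k > i else 0
            if seq[k] + seq[j] in PAIRS and left + 1 + X[k + 1][j - 1] == X[i][j]:
                out[k], out[j] = "(", ")"
                todo.append((i, k - 1))
                todo.append((k + 1, j - 1))
                break
    return "".join(out)


seq = "GGCGCGCCCGGCGCC"  # the 15-nucleotide example from the slide
table = fill_table(seq)
print(table[0][len(seq) - 1], backtrack(seq, table))
```

Maximum matching only counts base pairs. A minimum free energy computation follows the same dynamic programming scheme but scores stacks and loops with empirical energy parameters instead of counting pairs.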

  16. 1. Computation of RNA equilibrium structures 2. Inverse folding and neutral networks 3. Evolutionary optimization of structure 4. Suboptimal conformations and kinetic folding

  17. RNA sequence: GUAUCGAAAUACGUAGCGUAUGGGGAUGCUGGACGGUCCCAUCGGUACUCCA. RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function. Inverse folding of RNA: biotechnology, design of biomolecules with predefined structures and functions. Inverse folding algorithm: iterative determination of a sequence for the given secondary structure. RNA structure of minimal free energy. Sequence, structure, and design

  18. Compatibility of sequences and structures

  19. Compatibility of sequences and structures

  20. Inverse folding algorithm.
      $I_0 \to I_1 \to I_2 \to I_3 \to I_4 \to \dots \to I_k \to I_{k+1} \to \dots \to I_t$
      $S_0 \to S_1 \to S_2 \to S_3 \to S_4 \to \dots \to S_k \to S_{k+1} \to \dots \to S_t$
      $I_{k+1} = M_k(I_k)$ and $\Delta d_S(S_k, S_{k+1}) = d_S(S_{k+1}, S_t) - d_S(S_k, S_t) < 0$
      M ... base or base pair mutation operator
      $d_S(S_i, S_j)$ ... distance between the two structures $S_i$ and $S_j$
      'Unsuccessful trial' ... termination after n steps
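
The acceptance rule of the inverse folding walk (keep a mutant only if its structure is closer to the target S_t, stop as unsuccessful after a fixed number of steps) can be sketched as follows. Several pieces are assumptions made for the sake of a runnable example: `toy_fold` is a maximum matching stand-in and not a minimum free energy fold, the structure distance is the Hamming distance between dot-bracket strings, and only single-base mutations are used although the slide also allows base pair mutations. All names are hypothetical.

```python
import random

PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def toy_fold(seq):
    """Stand-in for an mfe fold: one maximum matching structure (assumption)."""
    n = len(seq)
    X = [[0] * n for _ in range(n)]
    for span in range(2, n):
        for i in range(n - span):
            j = i + span
            best = X[i][j - 1]
            for k in range(i, j - 1):
                if seq[k] + seq[j] in PAIRS:
                    best = max(best, (X[i][k - 1] if k > i else 0) + 1 + X[k + 1][j - 1])
            X[i][j] = best
    out, todo = ["."] * n, [(0, n - 1)]
    while todo:
        i, j = todo.pop()
        if j - i < 2:
            continue
        if X[i][j] == X[i][j - 1]:
            todo.append((i, j - 1))
            continue
        for k in range(i, j - 1):
            left = X[i][k - 1] if k > i else 0
            if seq[k] + seq[j] in PAIRS and left + 1 + X[k + 1][j - 1] == X[i][j]:
                out[k], out[j] = "(", ")"
                todo += [(i, k - 1), (k + 1, j - 1)]
                break
    return "".join(out)


def structure_distance(s1, s2):
    """Hamming distance between two dot-bracket strings (assumed choice of d_S)."""
    return sum(a != b for a, b in zip(s1, s2))


def inverse_fold(target, fold=toy_fold, max_steps=2000, seed=0):
    """Adaptive walk: accept I_{k+1} = M_k(I_k) only if d_S to the target decreases."""
    rng = random.Random(seed)
    seq = [rng.choice("ACGU") for _ in target]
    dist = structure_distance(fold("".join(seq)), target)
    history = [dist]
    for _ in range(max_steps):  # give up after max_steps ('unsuccessful trial')
        if dist == 0:
            break
        pos = rng.randrange(len(seq))
        trial = seq.copy()
        trial[pos] = rng.choice([b for b in "ACGU" if b != seq[pos]])
        d = structure_distance(fold("".join(trial)), target)
        if d < dist:
            seq, dist = trial, d
            history.append(d)
    return "".join(seq), dist, history


print(inverse_fold("((((...))))"))
```

The `fold` argument is deliberately pluggable, so the stand-in can be replaced by a real minimum free energy folding routine without changing the walk itself.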

  21. Approach to the target structure S_k in the inverse folding algorithm

  22. The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

  23. Space of genotypes: $I = \{ I_1, I_2, I_3, I_4, \dots, I_N \}$; Hamming metric. Space of phenotypes: $S = \{ S_1, S_2, S_3, S_4, \dots, S_M \}$; metric (not required). $N \gg M$. Mapping: $\psi(I_j) = S_k$. Inversion: $G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$. A mapping and its inversion

  24. 1. Computation of RNA equilibrium structures 2. Inverse folding and neutral networks 3. Evolutionary optimization of structure 4. Suboptimal conformations and kinetic folding

  25. Structure of phenylalanyl-tRNA as target structure; randomly chosen initial sequence

  26. Evolution in silico W. Fontana, P. Schuster, Science 280 (1998), 1451-1455

  27.–39. Evolution of RNA molecules as a Markov process and its analysis by means of the relay series (13 slides with the same caption)

  40. Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$ with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$. Selection constraint: population size, N = # RNA molecules, is controlled by the flow, $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$. Mutation rate: p = 0.001 / site × replication. The flow reactor as a device for studies of evolution in vitro and in silico
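
The two ingredients of the flow reactor model, a replication rate that falls with the Hamming distance between a molecule's structure and the target structure, and replication with a per-site error rate p, can be sketched in a few lines. The values of α and γ are not given on the slide, so the defaults below are placeholders, and the function names are hypothetical.

```python
import random


def hamming(a, b):
    """Hamming distance between two equally long strings."""
    return sum(x != y for x, y in zip(a, b))


def replication_rate(structure, target, alpha=1.0, gamma=1.0):
    """f_k = gamma / (alpha + d_H(S_k, S_target)); alpha and gamma are placeholder values."""
    return gamma / (alpha + hamming(structure, target))


def replicate(seq, p=0.001, rng=random):
    """Copy a sequence, replacing each site by a different base with probability p."""
    copy = []
    for base in seq:
        if rng.random() < p:
            copy.append(rng.choice([b for b in "ACGU" if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


target = "((((...))))"
print(replication_rate("((((...))))", target))  # structure equals the target
print(replication_rate("...........", target))  # open chain, far from the target
print(replicate("GGGGAAACCCC", p=0.001, rng=random.Random(0)))
```

A full simulation would additionally need a folding routine to assign a structure to each new sequence and a dilution step that keeps the population size near its mean value; both are omitted here.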

  41. In silico optimization in the flow reactor: Evolutionary Trajectory

  42. 28 neutral point mutations during a long quasi-stationary epoch. Transition inducing point mutations change the molecular structure. Neutral point mutations leave the molecular structure unchanged. Neutral genotype evolution during phenotypic stasis

  43. A sketch of optimization on neutral networks

  44. Randomly chosen initial structure; phenylalanyl-tRNA as target structure

  45. Application of molecular evolution to problems in biotechnology

  46. 1. Computation of RNA equilibrium structures 2. Inverse folding and neutral networks 3. Evolutionary optimization of structure 4. Suboptimal conformations and kinetic folding

  47. RNA secondary structures derived from a single sequence

  48. An algorithm for the computation of all suboptimal structures of RNA molecules using the same concept for retrieval as applied in the sequence alignment algorithm by M.S. Waterman and T.F. Smith. Math.Biosci. 42:257-266, 1978.

  49. An algorithm for the computation of RNA folding kinetics
