
Prediction and Analysis of RNA Secondary Structures, Peter Schuster

Prediction and Analysis of RNA Secondary Structures. Peter Schuster, Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien. RNA Secondary Structures in Dijon, Dijon, 24.–26.06.2002. Three-dimensional structure


  1. Prediction and Analysis of RNA Secondary Structures. Peter Schuster, Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien. RNA Secondary Structures in Dijon, Dijon, 24.–26.06.2002

  2. Three-dimensional structure of phenylalanyl-transfer-RNA

  3. RNA Secondary Structures and their Properties. RNA secondary structures are listings of Watson-Crick and GU wobble base pairs, which are free of knots and pseudoknots. Secondary structures are folding intermediates in the formation of full three-dimensional structures. D. Thirumalai, N. Lee, S. A. Woodson, and D. K. Klimov. Annu. Rev. Phys. Chem. 52:751-762 (2001)
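
The definition above can be turned into a small check. The following is a minimal sketch, not code from the talk: it assumes a structure is given as a list of 0-based index pairs `(i, j)`, and the function name `is_secondary_structure` is hypothetical. It tests only three things: each pair is Watson-Crick or GU wobble, no base takes part in two pairs, and no two pairs cross (the pseudoknot condition). Steric constraints such as a minimum hairpin loop size are deliberately left out.

```python
# Allowed pairs: Watson-Crick (AU, GC) and GU wobble, in either orientation.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def is_secondary_structure(seq, pairs):
    """Return True if `pairs` is a pseudoknot-free list of allowed base pairs.

    `pairs` holds 0-based tuples (i, j) with i < j (an assumed input format).
    """
    used = set()
    for i, j in pairs:
        if not (0 <= i < j < len(seq)):
            return False  # index out of range or wrongly ordered
        if (seq[i], seq[j]) not in ALLOWED:
            return False  # not a Watson-Crick or GU wobble pair
        if i in used or j in used:
            return False  # a base may take part in at most one pair
        used.update((i, j))
    # Two pairs (i, j) and (k, l) cross when i < k < j < l: a pseudoknot.
    for i, j in pairs:
        for k, l in pairs:
            if i < k < j < l:
                return False
    return True


if __name__ == "__main__":
    print(is_secondary_structure("GGGAAAUCC", [(0, 8), (1, 7), (2, 6)]))
    print(is_secondary_structure("GGAACCUU", [(0, 5), (2, 7)]))
```

In the usage lines, the first pair list is nested and the second contains two crossing pairs.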

  4. Definition and formation of the secondary structure of phenylalanyl-tRNA (figure with three panels, each running from the 5'-end to the 3'-end: sequence, secondary structure with positions 10 to 70 marked, and symbolic notation). Sequence: GCGGAU UUA GCUC AGDDGGGA GAGC M CCAGA CUGAAYA UCUGG AGMUC CUGUG TPCGAUC CACAG A AUUCGC ACCA

  5. Circle representation of tRNAphe (figure; positions 10 to 70 marked between the 5'-end and the 3'-end)

  6. Tree representation of tRNAphe (figure labels: virtual root, 5'-end, 3'-end)

  7. Mountain representation of tRNAphe (figure; positions 10 to 76 marked between the 5'-end and the 3'-end)
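
As an illustration of the mountain representation, here is a minimal sketch that assumes the structure is written in dot-bracket form (`(` and `)` for paired bases, `.` for unpaired ones). The talk does not state which symbolic notation it uses, so this input format, the height convention (number of base pairs that enclose or include a position), and the name `mountain_heights` are all assumptions.

```python
def mountain_heights(dot_bracket):
    """Height profile of a dot-bracket structure.

    Convention assumed here: the height at a position is the number of
    base pairs enclosing it, counting the pair the position itself is in.
    """
    heights = []
    level = 0
    for ch in dot_bracket:
        if ch == "(":
            level += 1              # step up at an opening base
            heights.append(level)
        elif ch == ")":
            heights.append(level)   # closing base sits at the pair's height
            level -= 1              # then step down
            if level < 0:
                raise ValueError("unbalanced structure: too many ')'")
        elif ch == ".":
            heights.append(level)   # plateau for an unpaired base
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    if level != 0:
        raise ValueError("unbalanced structure: too many '('")
    return heights


if __name__ == "__main__":
    print(mountain_heights("((..))"))
```

Stacked pairs appear as slopes, loops as plateaus, and separate stems as separate peaks, which is what makes the plot readable for long molecules.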

  8. Mountain representation used in structure prediction of medium size RNA molecules

  9. Mountain representation used in structure prediction of large RNA molecules

  10. Different notions of RNA structure (figure: a free energy axis with the structures S0 to S10 and the numbered suboptimal states above them). Minimum free energy structure: T = 0 K, t → ∞. Suboptimal structures: T > 0 K, t → ∞. Kinetic structures: T > 0 K, t finite.

  11. RNA Minimum Free Energy Structures. Efficient algorithms based on dynamic programming are available for the computation of secondary structures for given sequences. Inverse folding algorithms compute sequences for given secondary structures. M. Zuker and P. Stiegler. Nucleic Acids Res. 9:133-148 (1981). Vienna RNA Package: http://www.tbi.univie.ac.at (includes inverse folding, suboptimal structures, kinetic folding, etc.). I. L. Hofacker, W. Fontana, P. F. Stadler, L. S. Bonhoeffer, M. Tacker, and P. Schuster. Mh. Chem. 125:167-188 (1994)
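
To give a feel for the dynamic programming idea, the sketch below uses a deliberately simplified stand-in: it maximizes the number of base pairs (a Nussinov-style recursion) instead of minimizing a free energy. It is not the Zuker-Stiegler energy model and not the Vienna RNA Package API; the function name `max_base_pairs` and the `min_loop` parameter (minimum number of unpaired bases in a hairpin, assumed to be 3) are illustrative choices.

```python
# Watson-Crick and GU wobble pairs, in either orientation.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq` (simplified stand-in
    for minimum free energy folding; counts pairs, not energies)."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            score = best[i][j - 1]
            # Case 2: j pairs with some k, leaving >= min_loop bases between.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))
```

The table is filled from short subsequences to long ones, so every subproblem is solved once; this is the same structural idea that makes the energy-based algorithms efficient, with loop energies replacing the simple pair count.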

  12. Inverse folding. The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion. Figure (minimum free energy criterion), 1st to 5th trial: UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC; GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG; UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG; CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG; GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG

  13. Criterion of Minimum Free Energy (figure: sequence space and shape space). Sequences shown: UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC; GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG; UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG; CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG; GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG

  14. Point mutations as moves in sequence space (figure: the sequences ....GC CA UC...., ....GC GA UC...., ....GC CU UC.... and ....GC GU UC...., connected by moves labeled with the Hamming distances $d_H = 1$ and $d_H = 2$)

  15. The Hamming distance induces a metric in sequence space. Example: two sequences sharing the segments CGTCGTTACAATTTA, GTTATGTGCGAATTC, CAAATT, AAAA and ACAAGAG..... and differing at four positions (G, A, G, T in $S_1$; A, C, A, C in $S_2$), so that $d_H(S_1, S_2) = 4$. Metric properties: (i) $d_H(S_1, S_1) = 0$; (ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$; (iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$.
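
The Hamming distance and its three metric properties can be checked on a small example. This is a minimal sketch; the function name `hamming` is a choice made here, and the four six-letter test sequences are the fragments from the point-mutation slide written without gaps.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    seqs = ["GCCAUC", "GCGAUC", "GCCUUC", "GCGUUC"]
    print(hamming(seqs[0], seqs[1]), hamming(seqs[0], seqs[3]))
    # Check the metric properties (i) to (iii) on all triples.
    for s1, s2, s3 in product(seqs, repeat=3):
        assert hamming(s1, s1) == 0
        assert hamming(s1, s2) == hamming(s2, s1)
        assert hamming(s1, s3) <= hamming(s1, s2) + hamming(s2, s3)
```

A single point mutation moves a sequence to a neighbor at distance 1, so the Hamming distance counts the minimum number of point mutations between two sequences.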

  16. Sequence space of binary sequences of chain length n = 5. Binary sequences are encoded by their decimal equivalents, with C = 0 and G = 1; for example, "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc. Mutant classes: class 0: 0; class 1: 1, 2, 4, 8, 16; class 2: 3, 5, 6, 9, 10, 12, 17, 18, 20, 24; class 3: 7, 11, 13, 14, 19, 21, 22, 25, 26, 28; class 4: 15, 23, 27, 29, 30; class 5: 31.
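
The encoding and the mutant classes can be reproduced in a few lines. This sketch assumes that the mutant class of a sequence is its Hamming distance from the reference sequence "0" (CCCCC), which for this encoding equals the number of 1 bits; the helper names `encode` and `mutant_classes` are hypothetical.

```python
def encode(index, n=5):
    """Decimal index -> binary string of length n -> C/G sequence (C=0, G=1)."""
    bits = format(index, f"0{n}b")
    return bits.replace("0", "C").replace("1", "G")


def mutant_classes(n=5):
    """Group all 2**n indices by their number of 1 bits, i.e. by their
    Hamming distance from the all-C reference sequence."""
    classes = {k: [] for k in range(n + 1)}
    for index in range(2 ** n):
        classes[bin(index).count("1")].append(index)
    return classes


if __name__ == "__main__":
    print(encode(14), encode(29))
    for k, members in mutant_classes().items():
        print(k, members)
```

The class sizes 1, 5, 10, 10, 5, 1 are the binomial coefficients for n = 5.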

  17. Mapping from sequence space into phenotype space and into fitness values: $S_k = \psi(I_{\cdot})$ maps sequence space into phenotype space, and $f_k = f(S_k)$ maps phenotype space into the non-negative numbers.

  18. $S_k = \psi(I_{\cdot})$, $f_k = f(S_k)$ (figure: sequence space, phenotype space, non-negative numbers)

  19. $S_k = \psi(I_{\cdot})$, $f_k = f(S_k)$ (figure: sequence space, phenotype space, non-negative numbers)

  20. Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length. The number of these sequences, $N = 4^n$, becomes very large with increasing chain length and makes exhaustive numerical computation prohibitive. Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
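
The random graph approach can be sketched for a very small sequence space. The code below is a toy illustration under stated assumptions, not the procedure used for the figures: it draws a fixed number of distinct sequences uniformly at random (equivalent to inserting nodes until the target size is reached), joins sequences at Hamming distance 1, and lists the connected components. All function names and the `seed` parameter are hypothetical.

```python
import random
from itertools import product


def neighbors(seq, alphabet):
    """All sequences at Hamming distance 1 from `seq`."""
    for pos, old in enumerate(seq):
        for new in alphabet:
            if new != old:
                yield seq[:pos] + new + seq[pos + 1:]


def random_neutral_network(n, size, alphabet="AUGC", seed=0):
    """Pick `size` distinct random sequences of length n as network nodes."""
    rng = random.Random(seed)
    space = ["".join(p) for p in product(alphabet, repeat=n)]
    return set(rng.sample(space, size))


def components(network, alphabet="AUGC"):
    """Connected components of the network, largest first."""
    seen, result = set(), []
    for start in network:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:  # depth-first search over one-mutation neighbors
            s = stack.pop()
            comp.add(s)
            for t in neighbors(s, alphabet):
                if t in network and t not in seen:
                    seen.add(t)
                    stack.append(t)
        result.append(comp)
    return sorted(result, key=len, reverse=True)


if __name__ == "__main__":
    net = random_neutral_network(n=4, size=60)  # 60 of 4**4 = 256 sequences
    print([len(c) for c in components(net)])
```

Enumerating the whole space is only feasible for short chains, which is exactly the limitation described above; for realistic chain lengths the nodes would have to be sampled without listing all $4^n$ sequences.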

  21. Random graph approach to neutral networks: sketch of sequence space, step 00

  22. Random graph approach to neutral networks: sketch of sequence space, step 01

  23. Random graph approach to neutral networks: sketch of sequence space, step 02

  24. Random graph approach to neutral networks: sketch of sequence space, step 03

  25. Random graph approach to neutral networks: sketch of sequence space, step 04

  26. Random graph approach to neutral networks: sketch of sequence space, step 05

  27. Random graph approach to neutral networks: sketch of sequence space, step 10

  28. Random graph approach to neutral networks: sketch of sequence space, step 15

  29. Random graph approach to neutral networks: sketch of sequence space, step 25

  30. Random graph approach to neutral networks: sketch of sequence space, step 50

  31. Random graph approach to neutral networks: sketch of sequence space, step 75

  32. Random graph approach to neutral networks: sketch of sequence space, step 100

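  33. Mean degree of neutrality and connectivity of neutral networks. Neutral network (pre-image of the structure $S_k$): $G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$. Mean degree of neutrality: $\bar{\lambda}_k = \sum_{j \in G_k} \lambda_j^{(k)} / |G_k|$ (example in the figure: $\lambda_j = 12/27$). Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)}$, where $\kappa$ is the alphabet size (AUGC: $\kappa = 4$). If $\bar{\lambda}_k > \lambda_{cr}$, the network $G_k$ is connected; if $\bar{\lambda}_k < \lambda_{cr}$, the network $G_k$ is not connected. Threshold values: $\kappa = 2$: $\lambda_{cr} = 0.5$; $\kappa = 3$: $\lambda_{cr} = 0.4226$; $\kappa = 4$: $\lambda_{cr} = 0.3700$.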
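
The threshold values quoted on this slide follow directly from the formula for the connectivity threshold. The short sketch below recomputes them; the function names are hypothetical, and using 12/27 as the mean degree of neutrality in the usage line is purely an illustrative input, since the slide gives that value for a single sequence.

```python
def connectivity_threshold(kappa):
    """lambda_cr = 1 - kappa**(-1/(kappa - 1)) for alphabet size kappa >= 2."""
    if kappa < 2:
        raise ValueError("alphabet size must be at least 2")
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


def is_connected(mean_neutrality, kappa):
    """Random-graph prediction: connected if the mean degree of neutrality
    exceeds the threshold for the given alphabet size."""
    return mean_neutrality > connectivity_threshold(kappa)


if __name__ == "__main__":
    for kappa in (2, 3, 4):
        print(kappa, round(connectivity_threshold(kappa), 4))
    print(is_connected(12 / 27, kappa=4))  # illustrative input only
```

The threshold decreases with alphabet size, so a four-letter alphabet needs a smaller fraction of neutral neighbors for a connected network than a two-letter one.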

  34. A multi-component neutral network (figure label: giant component)

  35. A connected neutral network

  36. Suboptimal RNA Secondary Structures. Michael Zuker. On finding all suboptimal foldings of an RNA molecule. Science 244 (1989), 48-52. Stefan Wuchty, Walter Fontana, Ivo L. Hofacker, Peter Schuster. Complete suboptimal folding of RNA and the stability of secondary structures. Biopolymers 49 (1999), 145-165

  37. Example of a small RNA molecule: n=30. Sequence: AAAGGGCACAGGGUGAUUUCAAUAAUUUUA. Total number of structures including all suboptimal conformations, stable and unstable (with ΔG0 > 0): #conformations = 1 416 661. Figure: minimum free energy structure with 5'-end and 3'-end marked.

  38. Density of states of suboptimal structures of the RNA molecule with the sequence: AAAGGGCACAGGGUGAUUUCAAUAAUUUUA

  39. Partition Function of RNA Secondary Structures. John S. McCaskill. The equilibrium function and base pair binding probabilities for RNA secondary structure. Biopolymers 29 (1990), 1105-1119. Ivo L. Hofacker, Walter Fontana, Peter F. Stadler, L. Sebastian Bonhoeffer, Manfred Tacker, Peter Schuster. Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie 125 (1994), 167-188

  40. Example of a small RNA molecule: n=28. Sequence: UUGGAGUACACAACCUGUACACUCUUUC. Example of a small RNA molecule with two low-lying suboptimal conformations which contribute substantially to the partition function (figure with 5'-end and 3'-end marked).

  41. "Dot plot" of the minimum free energy structure (lower triangle) and the partition function (upper triangle) of a small RNA molecule (n=28) with low energy suboptimal configurations. Sequence on both axes: UUGGAGUACACAACCUGUACACUCUUUC. Minimum free energy configuration: ΔG0 = -5.39 kcal/mole. First suboptimal configuration: ΔE(0→1) = 0.50 kcal/mole. Second suboptimal configuration: ΔE(0→2) = 0.55 kcal/mole.
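
To see why suboptimal configurations 0.50 and 0.55 kcal/mole above the minimum matter, one can compare Boltzmann weights. The sketch below is a rough illustration under two loud assumptions: the temperature is taken as 37 °C (310.15 K), which the slides do not state, and the sum is truncated to the three configurations named on the slide, whereas the real partition function runs over all structures. The function name `boltzmann_probabilities` is hypothetical.

```python
import math

R = 0.0019872  # gas constant in kcal/(mol K)


def boltzmann_probabilities(energies, temperature_k=310.15):
    """Relative Boltzmann probabilities for a list of free energies
    (kcal/mole). The temperature default is an assumption."""
    rt = R * temperature_k
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)       # truncated partition function
    return [w / z for w in weights]


if __name__ == "__main__":
    # Minimum free energy configuration and the two suboptimal ones.
    energies = [-5.39, -5.39 + 0.50, -5.39 + 0.55]
    for e, p in zip(energies, boltzmann_probabilities(energies)):
        print(f"{e:6.2f} kcal/mole  p = {p:.3f}")
```

Under these assumptions the two suboptimal configurations together carry a weight comparable to that of the minimum free energy configuration, so base pairs from all three show up in the partition function half of the dot plot.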

  42. Phenylalanyl-tRNA as an example for the computation of the partition function (figure with three panels, each running from the 5'-end to the 3'-end: sequence, secondary structure with positions 10 to 70 marked, and symbolic notation). Sequence: GCGGAU UUA GCUC AGDDGGGA GAGC M CCAGA CUGAAYA UCUGG AGMUC CUGUG TPCGAUC CACAG A AUUCGC ACCA

  43. tRNAphe without modified bases (figure with 5'-end and 3'-end marked). First suboptimal configuration: ΔE(0→1) = 0.43 kcal/mole.
