Inverse folding (minimum free energy criterion)

The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

Sequences shown for the 1st to 5th trial:
UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC
GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG
UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG
CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG
GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG
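The search described above can be illustrated with a minimal adaptive-walk sketch. It is not the algorithm behind the slides: `toy_fold` is a hypothetical stand-in that maximises the number of base pairs (Nussinov-style) instead of minimising free energy, `structure_distance` is assumed to be the Hamming distance between dot-bracket strings, and all function names and parameters are illustrative.

```python
import random

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_fold(seq, min_loop=3):
    """Stand-in for MFE folding: maximise the number of base pairs."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    # Trace back one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


def structure_distance(a, b):
    """Assumed distance: Hamming distance between dot-bracket strings."""
    return sum(x != y for x, y in zip(a, b))


def inverse_fold(target, alphabet="AUGC", max_steps=2000, rng=None):
    """Adaptive walk: keep point mutations that do not increase the distance."""
    rng = rng or random.Random()
    seq = [rng.choice(alphabet) for _ in target]  # initial trial sequence
    dist = structure_distance(toy_fold("".join(seq)), target)
    for _ in range(max_steps):
        if dist == 0:
            break
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(alphabet)
        new_dist = structure_distance(toy_fold("".join(seq)), target)
        if new_dist <= dist:
            dist = new_dist
        else:
            seq[pos] = old  # reject the mutation
    # None marks an unsuccessful trial.
    return "".join(seq) if dist == 0 else None
```

A call such as `inverse_fold("((...))", rng=random.Random(1))` returns either a sequence whose toy fold equals the target or `None` for an unsuccessful trial. A realistic version would replace `toy_fold` with a thermodynamic folding routine.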
Approach to the target structure in the inverse folding algorithm. [Figure legend: initial trial sequences; stop sequence of an unsuccessful trial; intermediate compatible sequences; target sequence.]
RNA clover-leaf secondary structures of sequences with chain length n = 76. [Figure: four structures, panels A to D.]
Search for clover-leaf structures by means of the inverse folding algorithm

Number of successful inverse foldings out of 1000 trials, for the structures A to D:

Alphabet   A        B         C         D
AU         -        -         -         -
AUG        -        4 ± 2     24 ± 8    30 ± 6
AUGC       790      900       940       960
UGC        570      630       710       740
GC         64 ± 6   89 ± 15   84 ± 10   77 ± 5
Theory of sequence-structure mappings

P. Schuster, W. Fontana, P.F. Stadler, I.L. Hofacker, From sequences to shapes and back: A case study in RNA secondary structures. Proc. Roy. Soc. London B 255 (1994), 279-284
W. Grüner, R. Giegerich, D. Strothmann, C. Reidys, I.L. Hofacker, P. Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. I. Neutral networks. Mh. Chem. 127 (1996), 355-374
W. Grüner, R. Giegerich, D. Strothmann, C. Reidys, I.L. Hofacker, P. Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. II. Structure of neutral networks and shape space covering. Mh. Chem. 127 (1996), 375-389
C.M. Reidys, P.F. Stadler, P. Schuster, Generic properties of combinatory maps. Bull. Math. Biol. 59 (1997), 339-397
I.L. Hofacker, P. Schuster, P.F. Stadler, Combinatorics of RNA secondary structures. Discr. Appl. Math. 89 (1998), 177-207
C.M. Reidys, P.F. Stadler, Combinatory landscapes. SIAM Review 44 (2002), 3-54
Sequence-structure relations are highly complex, and only the simplest case can be studied. An example is the folding of RNA sequences into RNA structures represented in coarse-grained form as secondary structures. The RNA sequence-structure relation is understood as a mapping from the space of RNA sequences into a space of RNA structures.
Mapping from sequence space into phenotype space and into function: $S_k = \psi(I_{.})$, $f_k = f(S_k)$. [Figure: sequence space → phenotype space → non-negative numbers.]
The pre-image of the structure $S_k$ in sequence space is the neutral network $G_k$.
Neutral networks are sets of sequences forming the same structure. $G_k$ is the pre-image of the structure $S_k$ in sequence space:

$G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$

The set is converted into a graph by connecting all pairs of sequences at Hamming distance one.

Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length $n$. The number of sequences, $N = 4^n$, grows exponentially with chain length and quickly becomes prohibitive for numerical computation.

Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches that of the neutral network to be studied.
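The step from a set of sequences to a graph can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the four-member `network` is a made-up toy set, not the result of folding real sequences.

```python
from collections import deque


def one_error_mutants(seq, alphabet="AUGC"):
    """Yield all sequences at Hamming distance one from seq."""
    for pos, base in enumerate(seq):
        for other in alphabet:
            if other != base:
                yield seq[:pos] + other + seq[pos + 1:]


def connected_components(network, alphabet="AUGC"):
    """Split a set of sequences into components of the Hamming-distance-one graph."""
    unvisited = set(network)
    components = []
    while unvisited:
        start = unvisited.pop()
        component = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for mutant in one_error_mutants(current, alphabet):
                if mutant in unvisited:
                    unvisited.remove(mutant)
                    component.add(mutant)
                    queue.append(mutant)
        components.append(component)
    # Largest component first.
    return sorted(components, key=len, reverse=True)


# Toy example: three mutually reachable sequences and one isolated sequence.
network = {"AA", "AU", "UU", "GC"}
components = connected_components(network)
sizes = [len(c) for c in components]
```

Edges are never stored explicitly: each sequence has only $(\kappa - 1) n$ one-error mutants, so looking them up in a set is cheaper than comparing all pairs of sequences.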
Random graph approach to neutral networks: sketch of sequence space, shown as an animation at steps 00, 01, 02, 03, 04, 05, 10, 15, 25, 50, 75 and 100.
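Mean degree of neutrality and connectivity of neutral networks

$G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$

Degree of neutrality of a single sequence (example): $\lambda_j = 12/27$

Mean degree of neutrality: $\bar{\lambda}_k = \frac{\sum_{I_j \in G_k} \lambda_j(k)}{|G_k|}$

Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)}$

Alphabet size $\kappa$ (AUGC: $\kappa = 4$):

κ    λ_cr
2    0.5
3    0.4226
4    0.3700

$\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
$\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected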
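The tabulated thresholds can be checked numerically. The sketch below assumes the threshold formula $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$ and takes the degree of neutrality of a sequence to be the fraction of its one-error mutants that lie in the same network. Function names and the three-member toy network are illustrative assumptions.

```python
def critical_neutrality(kappa):
    """Connectivity threshold for an alphabet of size kappa."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


def degree_of_neutrality(seq, network, alphabet):
    """Fraction of the one-error mutants of seq that belong to the network."""
    mutants = [
        seq[:pos] + other + seq[pos + 1:]
        for pos, base in enumerate(seq)
        for other in alphabet
        if other != base
    ]
    return sum(m in network for m in mutants) / len(mutants)


def mean_degree_of_neutrality(network, alphabet):
    """Average of the degree of neutrality over all sequences of the network."""
    return sum(degree_of_neutrality(s, network, alphabet) for s in network) / len(network)


thresholds = {kappa: critical_neutrality(kappa) for kappa in (2, 3, 4)}

# Made-up toy network over the two-letter alphabet AU (kappa = 2).
toy_network = {"AA", "AU", "UU"}
toy_mean = mean_degree_of_neutrality(toy_network, "AU")
predicted_connected = toy_mean > critical_neutrality(2)
```

The threshold comes from random graph theory and holds for large networks, so the comparison for a three-member toy set only illustrates the bookkeeping.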
A multi-component neutral network with a giant component
A connected neutral network
Computed degree of neutrality for the tRNA neutral network

Alphabet   Degree of neutrality
AUGC       0.27 ± 0.07
UGC        0.26 ± 0.07
GC         0.06 ± 0.03
Definition of compatibility of sequences and structures. [Figure: a structure shown with compatible sequences and with an incompatible sequence.]
Neutral network $G_k$ and compatible set $C_k$: $G_k \subset C_k$. The compatible set $C_k$ of a structure $S_k$ consists of all sequences that form $S_k$ either as their minimum free energy structure (neutral network $G_k$) or as one of their suboptimal structures.
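Membership in the compatible set can be tested without any folding. The sketch below assumes dot-bracket notation for structures and the usual pairing rules (AU, GC and GU pairs); the function names are illustrative.

```python
ALLOWED_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def base_pairs(structure):
    """Return the list of (i, j) base pairs of a dot-bracket structure."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(seq, structure):
    """True if every base pair of the structure can be formed by the sequence."""
    if len(seq) != len(structure):
        return False
    return all(seq[i] + seq[j] in ALLOWED_PAIRS for i, j in base_pairs(structure))
```

For example, `is_compatible("GUGAAACGC", "(((...)))")` is true, while `is_compatible("GGGAAAACC", "(((...)))")` is false because the innermost pair would be G-A. Compatibility is only a necessary condition: a compatible sequence may still fold into a different minimum free energy structure, which is why the neutral network is a subset of the compatible set.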
A sequence at the intersection of two neutral networks is compatible with both structures. [Figure: one sequence shown in its minimum free energy conformation $S_0$ and in a suboptimal conformation $S_1$.]
The intersection of two compatible sets is always non-empty: $C_1 \cap C_2 \neq \emptyset$. [Figure: neutral networks $G_1$ and $G_2$ inside their compatible sets $C_1$ and $C_2$.]
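One way to see why the intersection cannot be empty is to construct a member of it. The sketch below is an illustrative construction, not taken from the slides. Each position has at most one partner per structure, so the union of the two base-pair sets splits into paths and cycles. Every cycle alternates between pairs of the two structures and therefore has even length, which means the positions can be two-coloured and labelled G and C. Dot-bracket input and all names are assumptions.

```python
from collections import deque


def base_pairs(structure):
    """Return the list of (i, j) base pairs of a dot-bracket structure."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.append((stack.pop(), pos))
    return pairs


def sequence_compatible_with_both(structure_1, structure_2):
    """Build a GC-only sequence compatible with both structures."""
    n = len(structure_1)
    if n != len(structure_2):
        raise ValueError("structures must have the same length")
    # Union graph: each position has at most one partner per structure.
    neighbours = [[] for _ in range(n)]
    for structure in (structure_1, structure_2):
        for i, j in base_pairs(structure):
            neighbours[i].append(j)
            neighbours[j].append(i)
    # Two-colour every component by breadth-first search.
    colour = [None] * n
    for start in range(n):
        if colour[start] is not None:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbours[i]:
                if colour[j] is None:
                    colour[j] = 1 - colour[i]
                    queue.append(j)
    # Paired positions always receive opposite colours, i.e. G opposite C.
    return "".join("G" if c == 0 else "C" for c in colour)
```

For the two structures `(((....)))` and `((....))..` the construction yields a sequence in which every pair of either structure is a G-C pair. Unpaired positions are set to G here, although any nucleotide would do.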
1. Introduction
2. A few experiments
3. Analysing neutral networks
4. Mechanisms of neutral evolution
Optimization of RNA molecules in silico

W. Fontana, P. Schuster, A computer model of evolutionary optimization. Biophysical Chemistry 26 (1987), 123-147
W. Fontana, W. Schnabl, P. Schuster, Physical aspects of evolutionary optimization and adaptation. Phys. Rev. A 40 (1989), 3301-3321
M.A. Huynen, W. Fontana, P.F. Stadler, Smoothness within ruggedness. The role of neutrality in adaptation. Proc. Natl. Acad. Sci. USA 93 (1996), 397-401
W. Fontana, P. Schuster, Continuity in evolution. On the nature of transitions. Science 280 (1998), 1451-1455
W. Fontana, P. Schuster, Shaping space. The possible and the attainable in RNA genotype-phenotype mapping. J. Theor. Biol. 194 (1998), 491-515
B.M.R. Stadler, P.F. Stadler, G.P. Wagner, W. Fontana, The topology of the possible: Formal spaces underlying patterns of evolutionary change. J. Theor. Biol. 213 (2001), 241-274
The flow reactor as a device for studies of evolution in vitro and in silico. [Figure: stock solution flowing into the reaction mixture.] Fitness function: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_S(I_k, I_\tau)$.
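A fitness function of this kind, which grows as the structure approaches the target, can be sketched in a few lines. The sketch assumes the form $f = \gamma / (\alpha + \Delta d)$ with positive constants. The parameter names `gamma` and `alpha`, their default values, and the use of the Hamming distance between dot-bracket strings as the structure distance are all assumptions made for illustration.

```python
def structure_distance(structure, target):
    """Assumed distance: Hamming distance between dot-bracket strings."""
    if len(structure) != len(target):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(structure, target))


def fitness(structure, target, gamma=1.0, alpha=1.0):
    """Fitness decreases as the distance to the target structure grows."""
    return gamma / (alpha + structure_distance(structure, target))


target = "((..))"
f_target = fitness(target, target)      # distance 0
f_mutant = fitness("(....)", target)    # distance 2
```

With these defaults the target structure itself has the maximal fitness `gamma / alpha`, and `alpha > 0` keeps that value finite.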
Randomly chosen initial structure; phenylalanyl-tRNA as target structure.
The molecular quasispecies in sequence space. [Figure: concentration plotted over sequence space, showing the master sequence, the mutant cloud and "off-the-cloud" mutations.]
Evolutionary dynamics including molecular phenotypes. [Diagram: genotype-phenotype mapping, $S = \psi(I)$; evaluation of the phenotype, $f = f(S)$; mutation, $Q$; sequences $I_1, \ldots, I_{n+1}$ with fitness values $f_1, \ldots, f_{n+1}$.]
In silico optimization in the flow reactor: trajectory (biologists' view). [Plot: evolutionary trajectory; y-axis: average distance from initial structure, $50 - \Delta d_S$, 0 to 50; x-axis: time in arbitrary units, 0 to 1250.]
In silico optimization in the flow reactor: trajectory (physicists' view). [Plot: evolutionary trajectory; y-axis: average structure distance to target, $\Delta d_S$, 0 to 50; x-axis: time in arbitrary units, 0 to 1250.]
Reconstruction of the relay series. [Plot in each step: evolutionary trajectory near the end of the run (time up to 1250), average structure distance to target $\Delta d_S$, with the relay steps numbered.]
End conformation of optimization: 44
Reconstruction of the last step: 43 → 44
Reconstruction of the last-but-one step: 42 → 43 (→ 44)
Reconstruction of step 41 → 42 (→ 43 → 44)
Reconstruction of step 40 → 41 (→ 42 → 43 → 44)
Evolutionary process and its reconstruction: 39 → 40 → 41 → 42 → 43 → 44
Change in RNA sequences during the final five relay steps 39 → 44. [Figure legend: transition-inducing point mutations; neutral point mutations.]
In silico optimization in the flow reactor: trajectory and relay steps. [Plot: evolutionary trajectory with relay steps marked; y-axis: average structure distance to target, $\Delta d_S$, 0 to 50; x-axis: time in arbitrary units, 0 to 1250.]
In silico optimization in the flow reactor: uninterrupted presence. [Same plot with relay steps and periods of uninterrupted presence marked.]
Neutral genotype evolution during phenotypic stasis. [Plot detail: evolutionary trajectory for times 0 to 500 (arbitrary units), average structure distance to target $\Delta d_S$, with uninterrupted presence and the relay steps 08 to 14 marked. Figure legend: transition-inducing point mutations; neutral point mutations.]
A random sequence of minor or continuous transitions in the relay series. [Plot detail: evolutionary trajectory for times 750 to 1250 (arbitrary units), average structure distance to target $\Delta d_S$, with uninterrupted presence and the relay steps 20 to 35 marked; structures shown: 18, 20, 21, 19, 26, 28, 31, 29.]
Minor or continuous transitions occur frequently on single point mutations: shortening of stacks, elongation of stacks, opening of constrained stacks (multi-loop).
Reconstruction of a main transition, 36 → 37 (→ 38): the main transition leading to the clover leaf. [Plot: evolutionary trajectory near the end of the run (time up to 1250), average structure distance to target $\Delta d_S$, with the relay steps numbered.]
In silico optimization in the flow reactor: main transitions. [Plot: evolutionary trajectory with relay steps and main transitions marked; y-axis: average structure distance to target, $\Delta d_S$, 0 to 50; x-axis: time in arbitrary units, 0 to 1250.]
Main or discontinuous transitions are structural innovations and occur rarely on single point mutations: shift, roll-over, flip, double flip, closing of constrained stacks (multi-loop).
In silico optimization in the flow reactor. [Plot: evolutionary trajectory with relay steps, main transitions and uninterrupted presence marked; y-axis: average structure distance to target, $\Delta d_S$, 0 to 50; x-axis: time in arbitrary units, 0 to 1250.]
Statistics of evolutionary trajectories. [Table columns: population size $N$; number of replications $\langle n_{rep} \rangle$; number of transitions $\langle n_{tr} \rangle$; number of main transitions $\langle n_{dtr} \rangle$.] The number of main transitions or evolutionary innovations is constant.
Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure, corresponding to three main transitions. [Structures shown: 00, 09, 31, 44.]