Hierarchical and multi-criteria optimization systematics modeled on RNA selection
Peter Schuster
Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien
BBAW Studiengruppe: Strukturbildung und Innovation, Berlin, 21-22 November 2003
[Figure: chemical formula of the RNA backbone from the 5'-end to the 3'-end, with nucleobases $N_k$ = A, U, G, C, next to an example sequence drawn as a secondary structure from the 5'-end to the 3'-end (positions 10 to 70 marked).]
Definition of RNA structure
Sequence (5'-end to 3'-end): GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA
[Figure: the secondary structure of this sequence as a graph (positions 10 to 70 marked) and, below it, the same structure in symbolic notation.]
A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
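The equivalence between the symbolic notation and the graph can be illustrated with a short sketch. It assumes the symbolic notation is the usual parentheses (dot-bracket) string, where `(` and `)` mark the two partners of a base pair and `.` marks an unpaired nucleotide; the function name `pair_table` is hypothetical and not taken from the slides.

```python
def pair_table(structure: str) -> dict[int, int]:
    """Map every paired position (0-based) to the position of its partner.

    Assumes dot-bracket notation: '(' opens a base pair, ')' closes the
    most recently opened one, '.' is an unpaired nucleotide.
    """
    open_positions = []  # stack of '(' positions still waiting for a partner
    pairs = {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {i}")
            j = open_positions.pop()
            pairs[i] = j
            pairs[j] = i
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return pairs


# A hairpin with a two-pair stack and a three-nucleotide loop.
print(pair_table("((...))"))
```

Because every well-formed string yields exactly one set of base pairs and vice versa, the string carries the same information as the drawn graph.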
[Figure: conformations $S_0^{(h)}$ to $S_9^{(h)}$ of one sequence ordered by free energy $\Delta G_0$. $S_0^{(h)}$ is the minimum of free energy; $S_1^{(h)}$ to $S_9^{(h)}$ are suboptimal conformations.]
The minimum free energy structures on a discrete space of conformations
[Figure: three panels on a common free energy axis. $T = 0$ K, $t \to \infty$: minimum free energy structure $S_0$. $T > 0$ K, $t \to \infty$: suboptimal structures $S_0, S_1, \dots$ $T > 0$ K, $t$ finite: kinetic structures, shown as a tree of conformations $S_0$ to $S_{10}$ and higher states up to 49.]
Different notions of RNA structure including suboptimal conformations
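[Figure: free energy $\Delta G_0$ along a "reaction coordinate", showing two local minima joined by a saddle point, and the corresponding "barrier tree" in which the saddle point becomes the branching node above the two minima.]
Definition of a "barrier tree"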
[Figure: barrier tree of one sequence with conformations $S_0$ to $S_{10}$ and higher states up to 49. Panels: kinetic folding (finite folding time, reaching $S_1$ or $S_0$) and suboptimal structures (limit $t \to \infty$).]
A typical energy landscape of a sequence with two (meta)stable conformations
Kinetics of RNA refolding between a long-lived metastable conformation and the minimum free energy structure
Minimal hairpin loop size: $n_{lp} \ge 3$
Minimal stack length: $n_{st} \ge 2$
Recursion formula for the number of acceptable RNA secondary structures
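The recursion itself did not survive in the text of this slide. As a hedged illustration of how such a count works, the sketch below uses the classical recursion for secondary structures with a minimal hairpin loop size only; it does not enforce the minimal stack length $n_{st} \ge 2$, so it counts more structures than the slide's formula would. The function name and parameter names are hypothetical.

```python
def count_structures(n: int, min_loop: int = 3) -> int:
    """Count secondary structures on a chain of n nucleotides in which every
    hairpin loop has at least `min_loop` unpaired nucleotides.

    Assumption: only the loop-size constraint is applied; the stack-length
    constraint from the slide (n_st >= 2) is NOT enforced here.
    """
    counts = [1] + [0] * n  # counts[0] = 1: the empty chain has one structure
    for length in range(1, n + 1):
        # Case 1: the last nucleotide is unpaired.
        total = counts[length - 1]
        # Case 2: the last nucleotide pairs with position k (1-based).
        # The pair encloses length - k - 1 nucleotides, which must be >= min_loop,
        # and leaves k - 1 nucleotides to its left.
        for k in range(1, length - min_loop):
            total += counts[k - 1] * counts[length - k - 1]
        counts[length] = total
    return counts[n]


for n in (5, 10, 20, 40):
    print(n, count_structures(n))
```

Even in this simplified form the count grows exponentially with chain length, although far more slowly than the $4^n$ sequences over the AUGC alphabet.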
Computed numbers of minimum free energy structures over different nucleotide alphabets
P. Schuster, Molecular insights into evolution of phenotypes. In: J. Crutchfield & P. Schuster, Evolutionary Dynamics. Oxford University Press, New York 2003, pp. 163-215.
Sequence space → Structure space: $S_k = \psi(I_k)$
Structure space → Real numbers: $f_k = f(S_k)$
Mapping from sequence space into structure space and into function
[Figure: example sequences and their folded secondary structures, drawn from the 5'-end to the 3'-end.]
GGCGCGCCCGGCGCC
GUAUCGAAAUACGUAGCGUAUGGGGAUGCUGGACGGUCCCAUCGGUACUCCA
UGGUUACGCGUUGGGGUAACGAAGAUUCCGAGAGGAGUUUAGUGACUAGAGG
Folding of RNA sequences into secondary structures of minimal free energy, $\Delta G_0^{300}$
Hamming distance, example: $d_H(S_1, S_2) = 4$
(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$
The Hamming distance between structures in parentheses notation forms a metric in structure space
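A minimal sketch of this distance, assuming structures of equal length written as parentheses strings and compared position by position. The two example structures are made up for illustration and are not the ones shown on the slide.

```python
def hamming_distance(s1: str, s2: str) -> int:
    """Number of positions at which two equal-length structure strings differ."""
    if len(s1) != len(s2):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(s1, s2))


# Hypothetical example: opening the two inner base pairs of a four-pair stack
# changes four symbols, two '(' and two ')'.
closed = "((((....))))"
partly_open = "((........))"
print(hamming_distance(closed, partly_open))  # 4
```

Properties (i) and (ii) follow directly from the position-wise comparison; the triangle inequality (iii) holds because a position that differs between the first and third structure must differ in at least one of the two intermediate comparisons.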
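Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$, the Hamming distance between structure $S_k$ and the target structure $S_\tau$
[Figure: structures in structure space with their rate constants $f_0, \dots, f_7$ and the target value $f_\tau$.]
Evaluation of RNA secondary structures yields replication rate constants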
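The evaluation step can be sketched as follows, assuming the rate law has the form $f_k = \gamma / (\alpha + \Delta d_S^{(k)})$ with $\Delta d_S^{(k)}$ the Hamming distance between structure $S_k$ and the target. The default values `gamma=1.0` and `alpha=1.0` are placeholders chosen for the example, not values from the slides.

```python
def replication_rate(structure: str, target: str,
                     gamma: float = 1.0, alpha: float = 1.0) -> float:
    """Replication rate constant of a structure relative to a target structure.

    Assumed form: f = gamma / (alpha + d), where d is the Hamming distance
    between the two parentheses strings. gamma and alpha are illustrative.
    """
    if len(structure) != len(target):
        raise ValueError("structures must have the same length")
    distance = sum(a != b for a, b in zip(structure, target))
    return gamma / (alpha + distance)


target = "((((....))))"
for candidate in ("............", "((........))", "((((....))))"):
    print(candidate, replication_rate(candidate, target))
```

With this form the rate is largest, $\gamma / \alpha$, when the target is reached and decreases monotonically with the structure distance, so selection for faster replication drives the population toward the target structure.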
[Figure: flow reactor in which stock solution flows into the reaction mixture.]
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Selection constraint: the number of RNA molecules is controlled by the flow, $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
The flow reactor as a device for studies of evolution in vitro and in silico
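A toy version of the in silico flow reactor can make the selection constraint concrete. This is a sketch under strong simplifying assumptions, all of them introduced here and not taken from the slides: sequences are not folded, so the Hamming distance between a sequence and a target sequence stands in for the structure distance; the population size is held exactly at a fixed capacity instead of fluctuating around a mean; and all names and parameter values are hypothetical.

```python
import random

ALPHABET = "AUGC"


def distance(a: str, b: str) -> int:
    """Hamming distance; a stand-in for the structure distance to the target."""
    return sum(x != y for x, y in zip(a, b))


def mutate(sequence: str, rate: float, rng: random.Random) -> str:
    """Copy a sequence; each base is redrawn at random with probability `rate`
    (the redrawn base may coincide with the original one)."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base
                   for base in sequence)


def run_reactor(target: str, capacity: int = 50, steps: int = 2000,
                rate: float = 0.02, alpha: float = 1.0, seed: int = 0) -> list[str]:
    """Replication with mutation under a dilution flow of constant capacity."""
    rng = random.Random(seed)
    start = "".join(rng.choice(ALPHABET) for _ in target)
    population = [start] * capacity
    for _ in range(steps):
        # Templates are chosen in proportion to their replication rate.
        weights = [1.0 / (alpha + distance(s, target)) for s in population]
        template = rng.choices(population, weights=weights, k=1)[0]
        population.append(mutate(template, rate, rng))
        # Dilution flow: one randomly chosen molecule leaves the reactor.
        population.pop(rng.randrange(len(population)))
    return population


final = run_reactor("GGCGCGCCCGGCGCC")
print(len(final), sum(distance(s, "GGCGCGCCCGGCGCC") for s in final) / len(final))
```

Replacing `distance` with a structure distance computed after folding each sequence would bring the sketch closer to the setup described on the slide.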
[Figure: concentration plotted over sequence space, showing the master sequence, the surrounding mutant cloud, and "off-the-cloud" mutations.]
The molecular quasispecies in sequence space
[Figure: cycle of three steps. Genotype-phenotype mapping: $S_k = \psi(I_k)$. Evaluation of the phenotype: $f_k = f(S_k)$. Mutation (matrix $Q$) acting on the genotypes $I_1, I_2, \dots, I_n, I_{n+1}$ with rate constants $f_1, f_2, \dots, f_n, f_{n+1}$.]
Evolutionary dynamics including molecular phenotypes
[Figure: evolutionary trajectory. Vertical axis: average structure distance from the initial structure, $50 - \Delta d_S$ (0 to 50). Horizontal axis: time in arbitrary units (0 to 1250).]
In silico optimization in the flow reactor: Trajectory (biologists' view)
[Figure: evolutionary trajectory. Vertical axis: average structure distance to target, $\Delta d_S$ (0 to 50). Horizontal axis: time in arbitrary units (0 to 1250).]
In silico optimization in the flow reactor: Trajectory (physicists' view)
[Movies: one panel for the AUGC alphabet, one for the GC alphabet.]
Movies of optimization trajectories over the AUGC and the GC alphabet
[Figure: histogram. Vertical axis: frequency (0 to 0.2). Horizontal axis: runtime of trajectories (0 to 5000).]
Statistics of the lengths of trajectories from initial structure to target (AUGC sequences)
[Figure, repeated on six consecutive slides: average structure distance to target $\Delta d_S$ and number of relay step (36 to 44) along the evolutionary trajectory, time 0 to 1250. Each slide adds one earlier structure of the relay series.]
End conformation of the optimization: relay step 44
Reconstruction of the last step 43 → 44
Reconstruction of the last-but-one step 42 → 43 (→ 44)
Reconstruction of step 41 → 42 (→ 43 → 44)
Reconstruction of step 40 → 41 (→ 42 → 43 → 44)
Reconstruction of the relay series: evolutionary process 39 → 40 → 41 → 42 → 43 → 44
[Figure legend: transition-inducing point mutations; neutral point mutations.]
Change in RNA sequences during the final five relay steps 39 → 44
[Figure: evolutionary trajectory with relay steps marked. Vertical axis: average structure distance to target, $\Delta d_S$ (0 to 50). Horizontal axis: time in arbitrary units (0 to 1250).]
In silico optimization in the flow reactor: Trajectory and relay steps
[Figure: section of the evolutionary trajectory, time 0 to 500 (arbitrary units), showing the average structure distance to target $\Delta d_S$, the number of relay step (08 to 14), and the uninterrupted presence of each structure. Annotation: 28 neutral point mutations during a long quasi-stationary epoch. Legend: transition-inducing point mutations; neutral point mutations.]
Neutral genotype evolution during phenotypic stasis
[Figure, shown over three slides: section of the evolutionary trajectory, time 750 to 1250 (arbitrary units), showing the average structure distance to target $\Delta d_S$, the number of relay step (20 to 35), and the uninterrupted presence of each structure, together with the structures at relay steps 18 to 31.]
A random sequence of minor or continuous transitions in the relay series
[Figure: evolutionary trajectory with relay steps and main transitions marked. Vertical axis: average structure distance to target, $\Delta d_S$ (0 to 50). Horizontal axis: time in arbitrary units (0 to 1250).]
In silico optimization in the flow reactor: Main transitions
[Figure: structures at relay steps 00, 09, 31 and 44.]
Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure, corresponding to three main transitions.
[Figure: two histograms, one for main transitions and one for all transitions. Vertical axis: frequency (0 to 0.3). Horizontal axis: number of transitions (0 to 100).]
Statistics of the numbers of transitions from initial structure to target (AUGC sequences)