Kinetic differential equations:
\[ \frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n;\, k_1, k_2, \ldots, k_m); \quad i = 1, 2, \ldots, n \]
Reaction diffusion equations:
\[ \frac{\partial x_i}{\partial t} = D_i \nabla^2 x_i + f_i(x_1, x_2, \ldots, x_n;\, k_1, k_2, \ldots, k_m); \quad i = 1, 2, \ldots, n \]
General conditions: $T$, $p$, pH, $I$, ...
Initial conditions: $x_i(0)$; $i = 1, 2, \ldots, n$
Parameter set: $k_j(T, p, \mathrm{pH}, I, \ldots;\, x_1, x_2, \ldots, x_n)$; $j = 1, 2, \ldots, m$
Boundary conditions (boundary $s$, normal unit vector $\hat{u}$):
Dirichlet: $x_i^{s} = f(\vec{r}, t)$; $i = 1, 2, \ldots, n$
Neumann: $\partial x_i^{s} / \partial u = \hat{u} \cdot \nabla x_i^{s} = f(\vec{r}, t)$; $i = 1, 2, \ldots, n$
Data from measurements: $x_i(t_k)$; $i = 1, 2, \ldots, n$; $k = 1, 2, \ldots, N$
[Figure: concentration $x_i$ versus time $t$]
The inverse problem of chemical reaction kinetics
Neurobiology Neural networks, collective properties, nonlinear dynamics, signalling, ... A single neuron signaling to a muscle fiber
The human brain: $10^{11}$ neurons connected by $\approx 10^{13}$ to $10^{14}$ synapses
Evolutionary biology: Optimization through variation and selection, relation between genotype, phenotype, and function, ...

Time scales of evolutionary change

Organism                        Generation time   10 000 generations   10^6 generations   10^7 generations
RNA molecules                   10 sec            27.8 h = 1.16 d      115.7 d            3.17 a
                                1 min             6.94 d               1.90 a             19.01 a
Bacteria                        20 min            138.9 d              38.03 a            380 a
                                10 h              11.40 a              1 140 a            11 408 a
Higher multicellular organisms  10 d              274 a                27 380 a           273 800 a
                                20 a              200 000 a            2 × 10^7 a         2 × 10^8 a
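The entries of the time-scale table follow from multiplying generation time by the number of generations. A minimal sketch of that arithmetic, assuming that the unit "a" denotes a Julian year of 365.25 days (the function name `elapsed_years` is illustrative and not from the slides):

```python
# Assumption: one year ("a") is a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def elapsed_years(generation_time_s, generations):
    """Total elapsed time in years for a given generation time in seconds."""
    return generation_time_s * generations / SECONDS_PER_YEAR


# RNA molecules replicating every 10 s, followed for 10^7 generations.
rna_years = elapsed_years(10, 1e7)
# Bacteria dividing every 10 h, followed for 10 000 generations.
bacteria_years = elapsed_years(10 * 3600, 1e4)
print(round(rna_years, 2), round(bacteria_years, 2))
```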
Definition of RNA structure
[Figure: chemical formula of the RNA sugar-phosphate backbone from the 5'-end to the 3'-end with nucleobases $N_1, N_2, N_3, N_4, \ldots$; $N_k \in \{A, U, G, C\}$; counterions Na+]
Sequence (grouped as on the slide), 5'-end to 3'-end: GCGGAU UUA GCUC AGUUGGGA GAGC CCAGA G CUGAAGA UCUGG AGGUC CUGUG UUCGAUC CACAG A AUUCGC ACCA
[Figure: secondary structure with position numbers 10 to 70]
James D. Watson, 1928- , and Francis Crick, 1916- , Nobel Prize 1962
1953 – 2003: fifty years double helix
The three-dimensional structure of a short double helical stack of B-DNA
Sequence (5'-end to 3'-end): GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA
Secondary structure: [Figure: secondary structure drawn from 5'-end to 3'-end with position numbers 10 to 70]
Complementary replication as the simplest copying mechanism of RNA
[Figure: a plus strand (G C C C G, 5' to 3') serves as template; synthesis in two steps yields a duplex of plus strand and minus strand (C G G G C); complex dissociation releases plus strand and minus strand]
Complementarity is determined by Watson-Crick base pairs: G≡C and A=U
Reaction scheme: $(A) + I_i \xrightarrow{\; f_i \;} I_i + I_i$; $i = 1, 2, \ldots, n$
\[ \frac{dx_i}{dt} = f_i x_i - x_i \Phi = x_i (f_i - \Phi); \quad \Phi = \sum_j f_j x_j; \quad \sum_j x_j = 1; \quad i, j = 1, 2, \ldots, n \]
$[I_i] = x_i \ge 0$; $i = 1, 2, \ldots, n$; $[A] = a = \text{constant}$
$f_m = \max\{ f_j;\ j = 1, 2, \ldots, n \}$; $x_m(t) \to 1$ for $t \to \infty$
Reproduction of organisms or replication of molecules as the basis of selection
Selection equation: $[I_i] = x_i \ge 0$, $f_i > 0$
\[ \frac{dx_i}{dt} = x_i (f_i - \phi), \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f} \]
Mean fitness or dilution flux, $\phi(t)$, is a non-decreasing function of time:
\[ \frac{d\phi}{dt} = \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = \overline{f^2} - \left(\bar{f}\right)^2 = \operatorname{var}\{f\} \ge 0 \]
Solutions are obtained by integrating factor transformation:
\[ x_i(t) = \frac{x_i(0) \exp(f_i t)}{\sum_{j=1}^{n} x_j(0) \exp(f_j t)}; \quad i = 1, 2, \ldots, n \]
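The closed-form solution of the selection equation can be evaluated directly. The following is a minimal sketch, not taken from the slides: the initial concentrations `x0` and the rate constants `f` are arbitrary illustrative values, and the helper names are hypothetical. It evaluates $x_i(t)$ and the mean fitness $\phi(t)$ at a few time points, so that the non-decreasing behaviour of $\phi$ can be inspected numerically.

```python
import math


def selection_solution(x0, f, t):
    """x_i(t) = x_i(0) exp(f_i t) / sum_j x_j(0) exp(f_j t)."""
    weights = [xi * math.exp(fi * t) for xi, fi in zip(x0, f)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_fitness(x, f):
    """phi = sum_j f_j x_j."""
    return sum(xi * fi for xi, fi in zip(x, f))


# Illustrative values (assumptions, not data from the slides).
x0 = [0.5, 0.3, 0.2]
f = [1.0, 1.5, 2.0]

phis = [mean_fitness(selection_solution(x0, f, t), f) for t in range(21)]
final = selection_solution(x0, f, 20)
print(phis[0], phis[-1], final)
```

In this example the variant with the largest rate constant takes over the population, in line with $x_m(t) \to 1$ for $t \to \infty$.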
$s = (f_2 - f_1) / f_1$; $f_2 > f_1$; $x_1(0) = 1 - 1/N$; $x_2(0) = 1/N$
[Figure: fraction of advantageous variant (0 to 1) versus time in generations (0 to 1000), curves for $s = 0.1$, $s = 0.02$ and $s = 0.01$]
Selection of advantageous mutants in populations of N = 10 000 individuals
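For two variants the solution of the selection equation reduces to a logistic curve, $x_2(t) = 1 / \left(1 + (N - 1)\, e^{-s f_1 t}\right)$. The sketch below evaluates it under the assumption that time is measured in generations of the resident variant, i.e. $f_1 = 1$ per generation; this choice of time unit and the function names are assumptions for illustration.

```python
import math


def advantageous_fraction(s, n_pop, t, f1=1.0):
    """Fraction x_2(t) of the advantageous variant, starting from x_2(0) = 1/N."""
    return 1.0 / (1.0 + (n_pop - 1) * math.exp(-s * f1 * t))


def half_time(s, n_pop, f1=1.0):
    """Time at which the advantageous variant reaches a fraction of one half."""
    return math.log(n_pop - 1) / (s * f1)


for s in (0.1, 0.02, 0.01):
    print(s, round(half_time(s, 10_000), 1))
```

Under these assumptions the half times for $N = 10\,000$ are roughly 92, 461 and 921 generations for $s = 0.1$, $0.02$ and $0.01$, which fits within the time range of 0 to 1000 generations shown on the slide.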
[Figure: copying of a plus strand (G C C C G, 5' to 3') with three kinds of errors: point mutation, insertion (for example GAAUCCCGAA to GAAUCCCGUCCCGAA) and deletion]
Mutations in nucleic acids represent the mechanism of variation of genotypes.
Theory of molecular evolution
M. Eigen, Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58 (1971), 465-526
C.J. Thompson, J.L. McBride, On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules. Math. Biosci. 21 (1974), 127-142
B.L. Jones, R.H. Enns, S.S. Rangnekar, On the theory of selection of coupled macromolecular systems. Bull. Math. Biol. 38 (1976), 15-28
M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften 64 (1977), 541-565
M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften 65 (1978), 7-41
M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part C: The realistic hypercycle. Naturwissenschaften 65 (1978), 341-369
J. Swetina, P. Schuster, Self-replication with errors - A model for polynucleotide replication. Biophys. Chem. 16 (1982), 329-345
J.S. McCaskill, A localization threshold for macromolecular quasispecies from continuously distributed replication rates. J. Chem. Phys. 80 (1984), 5194-5202
M. Eigen, J. McCaskill, P. Schuster, The molecular quasispecies. Adv. Chem. Phys. 75 (1989), 149-263
C. Reidys, C. Forst, P. Schuster, Replication and mutation on neutral networks. Bull. Math. Biol. 63 (2001), 57-94
Reaction scheme: $(A) + I_j \xrightarrow{\; f_j Q_{ji} \;} I_j + I_i$; $i = 1, 2, \ldots, n$ (correct copy for $i = j$ with rate $f_j Q_{jj}$, mutants $I_1, I_2, \ldots, I_n$ with rates $f_j Q_{j1}, f_j Q_{j2}, \ldots, f_j Q_{jn}$)
\[ \frac{dx_i}{dt} = \sum_j f_j Q_{ji} x_j - x_i \Phi; \quad \Phi = \sum_j f_j x_j; \quad \sum_j x_j = 1; \quad \sum_i Q_{ij} = 1 \]
$[I_i] = x_i \ge 0$; $i = 1, 2, \ldots, n$; $[A] = a = \text{constant}$
\[ Q_{ij} = (1 - p)^{\,l - d(i,j)}\, p^{\,d(i,j)} \]
$p$: error rate per digit
$l$: chain length of the polynucleotide
$d(i,j)$: Hamming distance between $I_i$ and $I_j$
Chemical kinetics of replication and mutation as parallel reactions
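The mutation matrix can be tabulated explicitly for short chains. The sketch below does so for binary sequences, for which $Q_{ij} = (1-p)^{l-d(i,j)}\, p^{d(i,j)}$ satisfies $\sum_i Q_{ij} = 1$ exactly; restricting to a two-letter alphabet is an assumption made here for illustration (with four letters each specific wrong digit would carry probability $p/3$). The function names are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions in which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mutation_matrix(length, p):
    """Q[i][j] = (1 - p)^(l - d(i,j)) * p^d(i,j) over all binary sequences."""
    seqs = ["".join(bits) for bits in product("01", repeat=length)]
    q = [
        [(1 - p) ** (length - hamming(a, b)) * p ** hamming(a, b) for b in seqs]
        for a in seqs
    ]
    return seqs, q


seqs, q = mutation_matrix(length=5, p=0.02)
column_sums = [sum(q[i][j] for i in range(len(seqs))) for j in range(len(seqs))]
print(len(seqs), min(column_sums), max(column_sums))
```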
Single point mutations as moves in sequence space
[Figure: 2D sketch of sequence space with city-block distance. ....GCCAUC.... is at Hamming distance $d_H = 1$ from ....GCGAUC.... and from ....GCCUUC...., and at $d_H = 2$ from ....GCGUUC....]
Sequence space of binary sequences of chain length n = 5
Binary sequences are encoded by their decimal equivalents, with C = 0 and G = 1. For example, "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc.
Mutant class 0: 0
Mutant class 1: 1, 2, 4, 8, 16
Mutant class 2: 3, 5, 6, 9, 10, 12, 17, 18, 20, 24
Mutant class 3: 7, 11, 13, 14, 19, 21, 22, 25, 26, 28
Mutant class 4: 15, 23, 27, 29, 30
Mutant class 5: 31
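The mutant classes are the sets of sequences at a fixed Hamming distance from the reference sequence 00000, i.e. they are grouped by the number of 1-digits. A short sketch that reproduces the grouping and the decimal encoding (the function names are illustrative):

```python
def mutant_classes(n):
    """Group the decimal codes 0 .. 2^n - 1 by their number of 1-digits."""
    classes = {k: [] for k in range(n + 1)}
    for value in range(2 ** n):
        classes[bin(value).count("1")].append(value)
    return classes


def decode(value, n, zero="C", one="G"):
    """Translate a decimal code into a sequence over the two-letter alphabet."""
    return "".join(one if bit == "1" else zero for bit in format(value, f"0{n}b"))


classes = mutant_classes(5)
print(classes[2])
print(decode(14, 5), decode(29, 5))
```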
$I_1$: CGTCGTTACAATTTA G GTTATGTGCGAATTC A CAAATT G AAAA T ACAAGAG.....
$I_2$: CGTCGTTACAATTTA A GTTATGTGCGAATTC C CAAATT A AAAA C ACAAGAG.....
(the four differing positions are set apart)
Hamming distance $d_H(I_1, I_2) = 4$
(i) $d_H(I_1, I_1) = 0$
(ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
(iii) $d_H(I_1, I_3) \le d_H(I_1, I_2) + d_H(I_2, I_3)$
The Hamming distance between sequences induces a metric in sequence space
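The three metric properties can be checked by brute force on a small sequence space. The sketch below defines the Hamming distance and tests properties (i) to (iii) on all sequences of length 3 over the GC alphabet; the choice of this small test space is an assumption made only to keep the check fast.

```python
from itertools import product


def hamming_distance(a, b):
    """Number of mismatching positions between two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


space = ["".join(s) for s in product("GC", repeat=3)]

identity = all(hamming_distance(a, a) == 0 for a in space)
symmetry = all(
    hamming_distance(a, b) == hamming_distance(b, a) for a in space for b in space
)
triangle = all(
    hamming_distance(a, c) <= hamming_distance(a, b) + hamming_distance(b, c)
    for a in space
    for b in space
    for c in space
)
print(identity, symmetry, triangle)
```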
Mutation-selection equation: $[I_i] = x_i \ge 0$, $f_i > 0$, $Q_{ij} \ge 0$
\[ \frac{dx_i}{dt} = \sum_{j=1}^{n} f_j Q_{ji} x_j - x_i \phi, \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f} \]
Solutions are obtained after integrating factor transformation by means of an eigenvalue problem:
\[ x_i(t) = \frac{\sum_{k=0}^{n-1} \ell_{ik}\, c_k(0) \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} \ell_{jk}\, c_k(0) \exp(\lambda_k t)}; \quad i = 1, 2, \ldots, n; \qquad c_k(0) = \sum_{i=1}^{n} h_{ki}\, x_i(0) \]
\[ W \doteq \{ W_{ij} = f_j Q_{ji};\ i, j = 1, 2, \ldots, n \}; \quad L = \{ \ell_{ij};\ i, j = 1, 2, \ldots, n \}; \quad L^{-1} = H = \{ h_{ij};\ i, j = 1, 2, \ldots, n \} \]
\[ L^{-1} \cdot W \cdot L = \Lambda = \{ \lambda_k;\ k = 0, 1, \ldots, n-1 \} \]
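For $t \to \infty$ the solution is dominated by the largest eigenvalue $\lambda_0$, so the stationary distribution is the corresponding eigenvector normalized to $\sum_i x_i = 1$. The sketch below approximates it by power iteration instead of a full diagonalization. Several ingredients are assumptions for illustration only: binary sequences of length 4, a single-peak fitness landscape (master sequence with $f = 10$, all others $f = 1$), the uniform error model for $Q$, and all function names.

```python
from itertools import product


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def value_matrix(seqs, fitness, p):
    """W[i][j] = f_j * Q_ji: production of sequence i on template j."""
    length = len(seqs[0])
    w = []
    for target in seqs:
        row = []
        for template in seqs:
            d = hamming(target, template)
            row.append(fitness[template] * (1 - p) ** (length - d) * p ** d)
        w.append(row)
    return w


def stationary_distribution(w, iterations=500):
    """Power iteration for the dominant eigenvector, normalized to sum 1."""
    n = len(w)
    x = [1.0 / n] * n
    for _ in range(iterations):
        y = [sum(w[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]
    return x


seqs = ["".join(b) for b in product("01", repeat=4)]
fitness = {s: (10.0 if s == "0000" else 1.0) for s in seqs}  # hypothetical landscape

accurate = stationary_distribution(value_matrix(seqs, fitness, p=0.01))
random_copy = stationary_distribution(value_matrix(seqs, fitness, p=0.5))
print(round(accurate[0], 3), round(random_copy[0], 4))
```

With accurate replication the master sequence dominates the stationary distribution; at $p = 0.5$ every digit is copied at random and the distribution becomes uniform.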
[Figure: stationary mutant distribution versus error rate $p = 1 - q$ (axis ticks 0.00, 0.05, 0.10), with the regions "Quasispecies" and "Uniform distribution"]
Quasispecies as a function of the replication accuracy q
[Figure: concentration over sequence space, showing the master sequence and the mutant cloud]
The molecular quasispecies in sequence space
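[Figure: concentration simplex with corners $e_1$, $e_2$, $e_3$, coordinates $x_1$, $x_2$, $x_3$ and the directions $\ell_0$, $\ell_1$, $\ell_2$]
The quasispecies on the concentration simplex $S_3 = \left\{ x_i \ge 0,\ i = 1, 2, 3;\ \sum_{i=1}^{3} x_i = 1 \right\}$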
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
[Figure: structures with rate constants $f_0, f_1, \ldots, f_7$ and the target with $f_\tau$]
Evaluation of RNA secondary structures yields replication rate constants
Hamming distance $d_H(S_1, S_2) = 4$
(i) $d_H(S_1, S_1) = 0$
(ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
(iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$
The Hamming distance between structures in parentheses notation forms a metric in structure space
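Structures written in parentheses notation are strings over the symbols `(`, `)` and `.`, so their Hamming distance is a character-by-character comparison. The sketch below computes it and feeds it into a rate constant that decreases with the distance to the target structure, $f_k = \gamma / (\alpha + d_H(S_k, S_\tau))$. The parameter values $\gamma = \alpha = 1$, the example structures and the function names are assumptions for illustration.

```python
def structure_distance(s1, s2):
    """Hamming distance between two structures in parentheses notation."""
    if len(s1) != len(s2):
        raise ValueError("structures must have equal length")
    return sum(a != b for a, b in zip(s1, s2))


def replication_rate(structure, target, gamma=1.0, alpha=1.0):
    """f_k = gamma / (alpha + d_H(S_k, S_target)); gamma and alpha are hypothetical."""
    return gamma / (alpha + structure_distance(structure, target))


target = "((((....))))"
candidate = "(((......)))"  # one base pair fewer than the target
print(structure_distance(candidate, target))
print(replication_rate(candidate, target), replication_rate(target, target))
```

A structure identical to the target obtains the maximal rate constant $\gamma / \alpha$, and every mismatching position lowers it.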
[Figure: flow reactor with stock solution and reaction mixture]
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Selection constraint: the number of RNA molecules is controlled by the flow, $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
The flow reactor as a device for studies of evolution in vitro and in silico
[Figure: two secondary structures] Randomly chosen initial structure; phenylalanyl-tRNA as target structure
[Figure: concentration over sequence space, showing the master sequence, the mutant cloud and "off-the-cloud" mutations]
The molecular quasispecies in sequence space
Genotype-phenotype mapping: $S_k = \psi(I_k)$
Evaluation of the phenotype: $f_k = f(S_k)$
Mutation: $Q$
[Figure: replication-mutation network of genotypes $I_1, I_2, \ldots, I_n, I_{n+1}$ with rate constants $f_1, f_2, \ldots, f_n, f_{n+1}$]
Evolutionary dynamics including molecular phenotypes
[Figure: evolutionary trajectory; average distance from initial structure, $50 - \Delta d_S$ (0 to 50), versus time in arbitrary units (0 to 1250)]
In silico optimization in the flow reactor: Trajectory (biologists' view)
[Figure: evolutionary trajectory; average structure distance to target, $\Delta d_S$ (0 to 50), versus time in arbitrary units (0 to 1250)]
In silico optimization in the flow reactor: Trajectory (physicists' view)
[Figure, shown on each of the following six slides: evolutionary trajectory, average structure distance to target $\Delta d_S$ versus time (0 to 1250), with the relay steps numbered 36 to 44]
Relay step 44: End conformation of optimization
Relay steps 43, 44: Reconstruction of the last step 43 → 44
Relay steps 42, 43, 44: Reconstruction of last-but-one step 42 → 43 (→ 44)
Relay steps 41 to 44: Reconstruction of step 41 → 42 (→ 43 → 44)
Relay steps 40 to 44: Reconstruction of step 40 → 41 (→ 42 → 43 → 44)
Relay steps 39 to 44: Evolutionary process and its reconstruction. Reconstruction of the relay series
[Figure legend: transition inducing point mutations; neutral point mutations]
Change in RNA sequences during the final five relay steps 39 → 44
[Figure: evolutionary trajectory and relay steps; average structure distance to target, $\Delta d_S$ (0 to 50), versus time in arbitrary units (0 to 1250)]
In silico optimization in the flow reactor: Trajectory and relay steps
[Figure: evolutionary trajectory, average structure distance to target $\Delta d_S$ versus time in arbitrary units (0 to 500), with number of relay step (08 to 14) and uninterrupted presence; 28 neutral point mutations during a long quasi-stationary epoch. Legend: transition inducing point mutations; neutral point mutations]
Neutral genotype evolution during phenotypic stasis
[Figure: evolutionary trajectory with relay steps and main transitions; average structure distance to target, $\Delta d_S$ (0 to 50), versus time in arbitrary units (0 to 1250)]
In silico optimization in the flow reactor: Main transitions
[Figure: structures at relay steps 00, 09, 31 and 44]
Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure, corresponding to three main transitions.
Movies of optimization trajectories over the AUGC and the GC alphabet
[Figure: histogram of frequency (0 to 0.2) versus runtime of trajectories (0 to 5000)]
Statistics of the lengths of trajectories from initial structure to target (AUGC sequences)
Statistics of trajectories and relay series (mean values of log-normal distributions)

Alphabet   Runtime   Transitions   Main transitions   No. of runs
AUGC       385.6     22.5          12.6               1017
GUC        448.9     30.5          16.5               611
GC         2188.3    40.0          20.6               107
Inverse folding of RNA secondary structures (minimum free energy criterion)
The idea of the inverse folding algorithm is to search for sequences that form a given RNA secondary structure under the minimum free energy criterion.
Structure
[Figure: a secondary structure drawn from 5'-end to 3'-end, next to the same structure filled with a compatible sequence] Structure; compatible sequence
[Figure: a larger secondary structure drawn from 5'-end to 3'-end, next to the same structure filled with a compatible sequence] Structure; compatible sequence
[Figure: structure and compatible sequence, 5'-end to 3'-end]
Single nucleotides: A, U, G, C
Base pairs: AU, UA; GC, CG; GU, UG
[Figure: the same structure filled with a sequence that cannot form all required base pairs] Structure; incompatible sequence
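Compatibility can be tested mechanically: a sequence is compatible with a structure if every pair of positions that is paired in the structure carries one of the six allowed base pairs AU, UA, GC, CG, GU, UG. A minimal sketch, assuming the structure is given in parentheses notation; the example structure, the sequences and the function names are illustrative.

```python
ALLOWED_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def base_pairs(structure):
    """Return the list of paired positions (i, j) of a parentheses string."""
    stack, pairs = [], []
    for position, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(sequence, structure):
    """True if all paired positions can form an allowed base pair."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS for i, j in base_pairs(structure))


hairpin = "((((....))))"
print(is_compatible("GGGGAAAACCCC", hairpin))
print(is_compatible("GGGUAAAAGCCC", hairpin))
print(is_compatible("GGGGAAAACCCA", hairpin))
```

Compatibility is necessary but not sufficient: a compatible sequence may still fold into a different minimum free energy structure.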
[Figure labels: initial trial sequences; intermediate compatible sequences; stop sequence of an unsuccessful trial; target sequence; target structure $S_k$]
Approach to the target structure $S_k$ in the inverse folding algorithm
Inverse folding of RNA secondary structures (minimum free energy criterion)
[Figure: 1st, 2nd, 3rd, 4th and 5th trial]
The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.
Theory of genotype – phenotype mapping
P. Schuster, W. Fontana, P.F. Stadler, I.L. Hofacker, From sequences to shapes and back: A case study in RNA secondary structures. Proc. Roy. Soc. London B 255 (1994), 279-284
W. Grüner, R. Giegerich, D. Strothmann, C. Reidys, I.L. Hofacker, P. Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. I. Neutral networks. Mh. Chem. 127 (1996), 355-374
W. Grüner, R. Giegerich, D. Strothmann, C. Reidys, I.L. Hofacker, P. Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. II. Structure of neutral networks and shape space covering. Mh. Chem. 127 (1996), 375-389
C.M. Reidys, P.F. Stadler, P. Schuster, Generic properties of combinatory maps. Bull. Math. Biol. 59 (1997), 339-397
I.L. Hofacker, P. Schuster, P.F. Stadler, Combinatorics of RNA secondary structures. Discr. Appl. Math. 89 (1998), 177-207
C.M. Reidys, P.F. Stadler, Combinatory landscapes. SIAM Review 44 (2002), 3-54
[Figure: sequence space → structure space → real numbers, with $S_k = \psi(I_.)$ and $f_k = f(S_k)$]
Mapping from sequence space into structure space and into function
The pre-image of the structure $S_k$ in sequence space is the neutral network $G_k$
Neutral networks are sets of sequences forming the same structure. $G_k$ is the pre-image of the structure $S_k$ in sequence space:
\[ G_k = \psi^{-1}(S_k) \doteq \{ I_j \mid \psi(I_j) = S_k \} \]
The set is converted into a graph by connecting all pairs of sequences of Hamming distance one.
Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length $n$. The number of sequences, $N = 4^n$, grows exponentially with chain length and quickly becomes prohibitive for numerical computation.
Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
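Turning a neutral set into a graph and splitting it into connected components can be sketched as follows. The sketch takes the set of neutral sequences as given, since computing it would require a folding program; the toy set below is invented for illustration and is not the result of any folding computation, and the function names are hypothetical.

```python
from collections import deque


def point_mutants(sequence, alphabet="AUGC"):
    """Yield all sequences at Hamming distance one from the given sequence."""
    for position, original in enumerate(sequence):
        for letter in alphabet:
            if letter != original:
                yield sequence[:position] + letter + sequence[position + 1:]


def connected_components(neutral_set, alphabet="AUGC"):
    """Split a set of sequences into components linked by single point mutations."""
    remaining = set(neutral_set)
    components = []
    while remaining:
        start = remaining.pop()
        component, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in point_mutants(current, alphabet):
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


toy_neutral_set = {"GGCC", "GGCU", "GACU", "CCGG"}  # hypothetical example
components = connected_components(toy_neutral_set)
print(sorted(len(c) for c in components))
print(4 ** 4)  # size of the complete sequence space for chain length n = 4
```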
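\[ G_k = \psi^{-1}(S_k) \doteq \{ I_j \mid \psi(I_j) = S_k \} \]
Mean degree of neutrality:
\[ \bar{\lambda}_k = \frac{\sum_{j \in G_k} \lambda_j^{(k)}}{|G_k|}; \qquad \text{example: } \lambda_j = 12/27 = 0.444 \]
Connectivity threshold:
\[ \lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)} \]
Alphabet size $\kappa$ (AUGC: $\kappa = 4$):

κ   λ_cr    Alphabets
2   0.5     GC, AU
3   0.423   GUC, AUG
4   0.370   AUGC

$\bar{\lambda}_k > \lambda_{cr}$: network $G_k$ is connected
$\bar{\lambda}_k < \lambda_{cr}$: network $G_k$ is not connected
Mean degree of neutrality and connectivity of neutral networks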
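The threshold values in the table follow from $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$. A short sketch that evaluates the formula for the three alphabet sizes and compares the example value $12/27$ with the threshold (the function name is illustrative):

```python
def connectivity_threshold(kappa):
    """lambda_cr = 1 - kappa^(-1 / (kappa - 1)) for alphabet size kappa."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))


thresholds = {kappa: connectivity_threshold(kappa) for kappa in (2, 3, 4)}
example_neutrality = 12 / 27

for kappa, value in thresholds.items():
    print(kappa, round(value, 3), example_neutrality > value)
```

A degree of neutrality of 0.444 lies above the threshold for three-letter and four-letter alphabets but below the threshold of 0.5 for two-letter alphabets.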
A connected neutral network
A multi-component neutral network (figure label: giant component)
[Figure: four cloverleaf secondary structures drawn from 5'-end to 3'-end with position numbers 10 to 70]
Degree of neutrality of cloverleaf RNA secondary structures over different alphabets

Alphabet   Degree of neutrality λ (three value columns as on the slide)
AU         -                -                0.073 ± 0.032
AUG        -                0.217 ± 0.051    0.201 ± 0.056
AUGC       0.275 ± 0.064    0.279 ± 0.063    0.313 ± 0.058
UGC        0.263 ± 0.071    0.257 ± 0.070    0.250 ± 0.064
GC         0.052 ± 0.033    0.057 ± 0.034    0.068 ± 0.034
Reference for postulation and in silico verification of neutral networks
[Figure: structure $S_k$, neutral network $G_k$ and compatible set $C_k$, with $G_k \subset C_k$]
The compatible set $C_k$ of a structure $S_k$ consists of all sequences which form $S_k$ either as their minimum free energy structure (the neutral network $G_k$) or as one of their suboptimal structures.