Different kinds of robustness in genetic and metabolic networks

Peter Schuster, Institut für Theoretische Chemie und Molekulare Strukturbiologie der Universität Wien. Seminar lecture, Linz, 15.12.2003. Genomics and proteomics: large scale data


  1. Kinetic differential equations:
     $$\frac{dx_i}{dt} = f_i(x_1, x_2, \ldots, x_n;\, k_1, k_2, \ldots, k_m); \quad i = 1, 2, \ldots, n$$
     Reaction diffusion equations:
     $$\frac{\partial x_i}{\partial t} = D_i \nabla^2 x_i + f_i(x_1, x_2, \ldots, x_n;\, k_1, k_2, \ldots, k_m); \quad i = 1, 2, \ldots, n$$
     General conditions: $T$, $p$, pH, $I$, ...
     Initial conditions: $x_i(0)$; $i = 1, 2, \ldots, n$
     Parameter set: $k_j(T, p, \mathrm{pH}, I, \ldots;\, x_1, x_2, \ldots, x_n)$; $j = 1, 2, \ldots, m$
     Boundary conditions: boundary ... $s$; normal unit vector ... $\hat{u}$
     Dirichlet: $x_i^s = f(\mathbf{r}, t)$; $i = 1, 2, \ldots, n$
     Neumann: $\partial x_i^s / \partial u = \hat{u} \cdot \nabla x_i^s = f(\mathbf{r}, t)$; $i = 1, 2, \ldots, n$
     Data from measurements: $x_i(t_k)$; $i = 1, 2, \ldots, n$; $k = 1, 2, \ldots, N$
     [Plot: concentration $x_i$ versus time $t$.]
     The inverse problem of chemical reaction kinetics

  2. Neurobiology: neural networks, collective properties, nonlinear dynamics, signalling, ... [Figure: a single neuron signalling to a muscle fiber.]

  3. The human brain: $10^{11}$ neurons connected by ≈ $10^{13}$ to $10^{14}$ synapses

  4. Evolutionary biology: optimization through variation and selection, relation between genotype, phenotype, and function, ...

     | | Generation time | 10 000 generations | 10^6 generations | 10^7 generations |
     |---|---|---|---|---|
     | RNA molecules | 10 sec | 27.8 h = 1.16 d | 115.7 d | 3.17 a |
     | | 1 min | 6.94 d | 1.90 a | 19.01 a |
     | Bacteria | 20 min | 138.9 d | 38.03 a | 380 a |
     | | 10 h | 11.40 a | 1 140 a | 11 408 a |
     | Higher multicellular organisms | 10 d | 274 a | 27 380 a | 273 800 a |
     | | 20 a | 200 000 a | 2 × 10^7 a | 2 × 10^8 a |

     Time scales of evolutionary change (h = hours, d = days, a = years)

  5. [Figure: chemical structure of an RNA chain running from the 5'-end to the 3'-end, with nucleobases $N_k$ = A, U, G, C attached to a ribose-phosphate backbone with Na+ counterions, shown next to the sequence GCGGAU UUA GCUC AGUUGGGA GAGC CCAGA G CUGAAGA UCUGG AGGUC CUGUG UUCGAUC CACAG A AUUCGC ACCA and its secondary structure with positions numbered 10 to 70.] Definition of RNA structure

  6. James D. Watson, 1928– , and Francis Crick, 1916– , Nobel Prize 1962. 1953 – 2003: fifty years double helix. [Figure: the three-dimensional structure of a short double helical stack of B-DNA.]

  7. Sequence (5'-end to 3'-end): GCGGAUUUAGCUCAGDDGGGAGAGCMCCAGACUGAAYAUCUGGAGMUCCUGUGTPCGAUCCACAGAAUUCGCACCA. [Figure: secondary structure of this sequence with positions numbered 10 to 70.]

  8. [Figure: the plus strand GCCCG serves as template for the synthesis of the minus strand CGGGC; the plus-minus complex then dissociates into the two single strands.] Complementary replication as the simplest copying mechanism of RNA. Complementarity is determined by Watson-Crick base pairs: G≡C and A=U.

  9. Reaction scheme: $(A) + I_i \overset{f_i}{\longrightarrow} I_i + I_i$; $i = 1, 2, \ldots, n$
     $$\frac{dx_i}{dt} = f_i x_i - x_i \Phi = x_i (f_i - \Phi); \quad \Phi = \sum_j f_j x_j; \quad \sum_j x_j = 1; \quad i, j = 1, 2, \ldots, n$$
     $[I_i] = x_i \ge 0$; $i = 1, 2, \ldots, n$; $[A] = a$ = constant
     $f_m = \max\{f_j;\ j = 1, 2, \ldots, n\}$: $x_m(t) \to 1$ for $t \to \infty$
     Reproduction of organisms or replication of molecules as the basis of selection

  10. Selection equation: $[I_i] = x_i \ge 0$, $f_i > 0$
      $$\frac{dx_i}{dt} = x_i (f_i - \phi), \quad i = 1, 2, \ldots, n; \quad \sum_{i=1}^{n} x_i = 1; \quad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$$
      Mean fitness or dilution flux, $\phi(t)$, is a non-decreasing function of time:
      $$\frac{d\phi}{dt} = \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = \overline{f^2} - (\bar{f})^2 = \mathrm{var}\{f\} \ge 0$$
      Solutions are obtained by integrating factor transformation:
      $$x_i(t) = \frac{x_i(0) \exp(f_i t)}{\sum_{j=1}^{n} x_j(0) \exp(f_j t)}; \quad i = 1, 2, \ldots, n$$
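
The closed-form solution of the selection equation, $x_i(t) = x_i(0) e^{f_i t} / \sum_j x_j(0) e^{f_j t}$, can be evaluated directly. The following is a minimal sketch, not part of the lecture; the initial concentrations and fitness values are made-up illustration values, and the function names are hypothetical.

```python
import math

def selection_solution(x0, f, t):
    """Closed-form solution x_i(t) = x_i(0) e^{f_i t} / sum_j x_j(0) e^{f_j t}."""
    fmax = max(f)  # shift exponents by the largest rate to avoid overflow; the ratio is unchanged
    weights = [xi * math.exp((fi - fmax) * t) for xi, fi in zip(x0, f)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_fitness(x, f):
    """Mean fitness (dilution flux) phi = sum_j f_j x_j."""
    return sum(xi * fi for xi, fi in zip(x, f))

# Illustration values (assumptions, not taken from the slides)
x0 = [0.5, 0.3, 0.2]
f = [1.0, 1.5, 2.0]

# phi(t) sampled at a few times; it should never decrease
phis = [mean_fitness(selection_solution(x0, f, t), f) for t in (0.0, 1.0, 5.0, 20.0)]
```

In this sketch the variant with the largest $f_i$ takes over the population for large $t$, and the sampled values of $\phi$ increase monotonically towards that largest rate.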

  11. $s = (f_2 - f_1) / f_1$; $f_2 > f_1$; $x_1(0) = 1 - 1/N$; $x_2(0) = 1/N$. [Plot: fraction of advantageous variant (0 to 1) versus time in generations (0 to 1000), with curves for s = 0.1, s = 0.02 and s = 0.01.] Selection of advantageous mutants in populations of N = 10 000 individuals
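
For two variants with $x_1(0) = 1 - 1/N$ and $x_2(0) = 1/N$, the closed-form solution of the selection equation reduces to $x_2(t) = 1 / (1 + (N-1) e^{-s f_1 t})$ with $s = (f_2 - f_1)/f_1$. The sketch below evaluates this expression and inverts it; it assumes $f_1 = 1$ per generation so that $t$ counts generations, which is an assumption of this example and is not stated on the slide.

```python
import math

def fraction_advantageous(t, s, N, f1=1.0):
    """x_2(t) for x_2(0) = 1/N and selective advantage s = (f2 - f1) / f1."""
    return 1.0 / (1.0 + (N - 1) * math.exp(-s * f1 * t))

def time_to_fraction(y, s, N, f1=1.0):
    """Time at which the advantageous variant reaches the fraction y (0 < y < 1)."""
    return math.log((N - 1) * y / (1.0 - y)) / (s * f1)

# Time to reach 50 % for the three selective advantages named on the slide
half_times = {s: time_to_fraction(0.5, s, 10_000) for s in (0.1, 0.02, 0.01)}
```

Under these assumptions the half-time scales as $1/s$, so a tenfold smaller advantage needs a tenfold longer time.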

  12. [Figure: replication of a plus strand with three kinds of copying errors: insertion (GAAUCCCGAA → GAAUCCCGUCCCGAA), deletion (GAAUCCCGAA → GAAUCCA) and point mutation.] Mutations in nucleic acids represent the mechanism of variation of genotypes.

  13. Theory of molecular evolution
      M. Eigen, Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften 58 (1971), 465-526
      C.J. Thompson, J.L. McBride, On Eigen's theory of the self-organization of matter and the evolution of biological macromolecules. Math. Biosci. 21 (1974), 127-142
      B.L. Jones, R.H. Enns, S.S. Rangnekar, On the theory of selection of coupled macromolecular systems. Bull. Math. Biol. 38 (1976), 15-28
      M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften 64 (1977), 541-565
      M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften 65 (1978), 7-41
      M. Eigen, P. Schuster, The hypercycle. A principle of natural self-organization. Part C: The realistic hypercycle. Naturwissenschaften 65 (1978), 341-369
      J. Swetina, P. Schuster, Self-replication with errors - A model for polynucleotide replication. Biophys. Chem. 16 (1982), 329-345
      J.S. McCaskill, A localization threshold for macromolecular quasispecies from continuously distributed replication rates. J. Chem. Phys. 80 (1984), 5194-5202
      M. Eigen, J. McCaskill, P. Schuster, The molecular quasispecies. Adv. Chem. Phys. 75 (1989), 149-263
      C. Reidys, C. Forst, P. Schuster, Replication and mutation on neutral networks. Bull. Math. Biol. 63 (2001), 57-94

  14. Reaction scheme: $(A) + I_j \overset{f_j Q_{ji}}{\longrightarrow} I_j + I_i$; the template $I_j$ yields the copies $I_1, I_2, \ldots, I_n$ with rates $f_j Q_{j1}, f_j Q_{j2}, \ldots, f_j Q_{jn}$, and $f_j Q_{jj}$ is the rate of error-free replication.
      $$\frac{dx_i}{dt} = \sum_j f_j Q_{ji} x_j - x_i \Phi; \quad \Phi = \sum_j f_j x_j; \quad \sum_j x_j = 1; \quad \sum_i Q_{ij} = 1$$
      $[I_i] = x_i \ge 0$; $i = 1, 2, \ldots, n$; $[A] = a$ = constant
      $$Q_{ij} = (1 - p)^{l - d(i,j)}\, p^{d(i,j)}$$
      $p$ ... error rate per digit; $l$ ... chain length of the polynucleotide; $d(i,j)$ ... Hamming distance between $I_i$ and $I_j$
      Chemical kinetics of replication and mutation as parallel reactions
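
The mutation matrix $Q_{ij} = (1-p)^{l-d(i,j)} p^{d(i,j)}$ can be tabulated for short chains. This sketch assumes binary sequences (two-letter alphabet), for which the formula as written yields rows that sum to one; the function names are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(a, b))

def mutation_matrix(l, p):
    """Q[i][j] = (1 - p)^(l - d(i,j)) * p^d(i,j) over all binary sequences of length l."""
    seqs = ["".join(s) for s in product("01", repeat=l)]
    Q = []
    for a in seqs:
        row = []
        for b in seqs:
            d = hamming(a, b)
            row.append((1 - p) ** (l - d) * p ** d)
        Q.append(row)
    return seqs, Q

# Example: chain length 3, error rate 0.1 per digit (illustration values)
seqs, Q = mutation_matrix(3, 0.1)
row_sums = [sum(row) for row in Q]
```

Each row sums to one because the number of sequences at distance $d$ is the binomial coefficient $\binom{l}{d}$, so the row sum is the binomial expansion of $((1-p) + p)^l$.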

  15. [Figure: 2D sketch of sequence space with the sequences ....GCCAUC...., ....GCGAUC...., ....GCCUUC.... and ....GCGUUC....; neighbouring sequences are at Hamming distance $d_H = 1$, the two double mutants at $d_H = 2$; city-block distance in sequence space.] Single point mutations as moves in sequence space

  16. Binary sequences are encoded by their decimal equivalents: C = 0 and G = 1, for example, "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc.

      | Mutant class | Sequences |
      |---|---|
      | 0 | 0 |
      | 1 | 1, 2, 4, 8, 16 |
      | 2 | 3, 5, 6, 9, 10, 12, 17, 18, 20, 24 |
      | 3 | 7, 11, 13, 14, 19, 21, 22, 25, 26, 28 |
      | 4 | 15, 23, 27, 29, 30 |
      | 5 | 31 |

      Sequence space of binary sequences of chain length n = 5
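
The mutant classes of binary sequences can be generated by grouping the decimal equivalents by their number of 1-bits, i.e. by the Hamming distance from the all-zero sequence. A minimal sketch, with C = 0 and G = 1 as on the slide; the function names are hypothetical.

```python
def mutant_classes(n):
    """Group the decimal equivalents 0 .. 2^n - 1 by their Hamming distance from 00...0."""
    classes = {k: [] for k in range(n + 1)}
    for value in range(2 ** n):
        classes[bin(value).count("1")].append(value)
    return classes

def decode(value, n):
    """Translate a decimal equivalent back into a C/G sequence (C = 0, G = 1)."""
    bits = format(value, "0{}b".format(n))
    return bits.replace("0", "C").replace("1", "G")

classes = mutant_classes(5)
```

For n = 5 the class sizes are the binomial coefficients 1, 5, 10, 10, 5, 1, and `decode(14, 5)` gives `CGGGC`.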

  17. [Figure: two aligned sequences sharing the segments CGTCGTTACAATTTA, GTTATGTGCGAATTC, CAAATT, AAAA and ACAAGAG....., which differ in four positions (G, A, G, T in one sequence versus A, C, A, C in the other).]
      Hamming distance $d_H(I_1, I_2) = 4$
      (i) $d_H(I_1, I_1) = 0$
      (ii) $d_H(I_1, I_2) = d_H(I_2, I_1)$
      (iii) $d_H(I_1, I_3) \le d_H(I_1, I_2) + d_H(I_2, I_3)$
      The Hamming distance between sequences induces a metric in sequence space
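
The three metric properties of the Hamming distance can be checked by brute force on a small sequence space. This is an illustrative sketch only; the function names are hypothetical, and the test sequences are the hexamers GCCAUC, GCGAUC and GCGUUC that appear in the sequence-space sketch of the lecture.

```python
from itertools import product

def hamming_distance(a, b):
    """Number of positions at which two sequences of equal length differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def is_metric_on(sequences):
    """Check identity, symmetry and the triangle inequality on all triples."""
    for a, b, c in product(sequences, repeat=3):
        if hamming_distance(a, a) != 0:
            return False
        if hamming_distance(a, b) != hamming_distance(b, a):
            return False
        if hamming_distance(a, c) > hamming_distance(a, b) + hamming_distance(b, c):
            return False
    return True

# All 4^3 = 64 AUGC sequences of chain length 3
all_trimers = ["".join(s) for s in product("AUGC", repeat=3)]
metric_ok = is_metric_on(all_trimers)
```

A single point mutation changes the distance by one, for example `hamming_distance("GCCAUC", "GCGAUC")` is 1 and `hamming_distance("GCCAUC", "GCGUUC")` is 2.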

  18. Mutation-selection equation: $[I_i] = x_i \ge 0$, $f_i > 0$, $Q_{ij} \ge 0$
      $$\frac{dx_i}{dt} = \sum_{j=1}^{n} f_j Q_{ji} x_j - x_i \phi, \quad i = 1, 2, \ldots, n; \quad \sum_{i=1}^{n} x_i = 1; \quad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$$
      Solutions are obtained after integrating factor transformation by means of an eigenvalue problem:
      $$x_i(t) = \frac{\sum_{k=0}^{n-1} \ell_{ik}\, c_k(0) \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} \ell_{jk}\, c_k(0) \exp(\lambda_k t)}; \quad i = 1, 2, \ldots, n; \quad c_k(0) = \sum_{i=1}^{n} h_{ki}\, x_i(0)$$
      $$W = \{ f_i Q_{ij};\ i, j = 1, 2, \ldots, n \}; \quad L = \{ \ell_{ij};\ i, j = 1, 2, \ldots, n \}; \quad L^{-1} = H = \{ h_{ij};\ i, j = 1, 2, \ldots, n \}$$
      $$L^{-1} \cdot W \cdot L = \Lambda = \{ \lambda_k;\ k = 0, 1, \ldots, n-1 \}$$
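
The stationary solution of the mutation-selection equation is the eigenvector of the value matrix belonging to the largest eigenvalue. The sketch below approximates it by power iteration instead of a full diagonalization, which is a simplification chosen for this example. It further assumes binary sequences and a single-peak fitness landscape (one master sequence with a higher rate than all others); the rate values and function names are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(a, b))

def stationary_quasispecies(l, p, f_master=10.0, f_other=1.0, iterations=300):
    """Dominant eigenvector of W, W[i][j] = f_j * Q_ji, normalized to sum 1."""
    seqs = ["".join(s) for s in product("01", repeat=l)]
    master = "0" * l
    f = [f_master if s == master else f_other for s in seqs]
    n = len(seqs)
    W = []
    for i in range(n):
        row = []
        for j in range(n):
            d = hamming(seqs[i], seqs[j])
            row.append(f[j] * (1 - p) ** (l - d) * p ** d)  # template j, product i
        W.append(row)
    x = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        y = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]  # renormalize so that sum_i x_i = 1
    return seqs, x

seqs, x_low = stationary_quasispecies(l=5, p=0.01)   # accurate replication
_, x_high = stationary_quasispecies(l=5, p=0.5)      # random replication
```

With a small error rate the master sequence dominates the stationary distribution, while at $p = 0.5$ all entries of $Q$ are equal for binary sequences and the distribution becomes uniform.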

  19. [Plot: stationary distribution versus error rate p = 1 - q (0.00 to 0.10), with the regions labelled "Quasispecies" and "Uniform distribution".] Quasispecies as a function of the replication accuracy q

  20. [Figure: master sequence and mutant cloud; axes: concentration versus sequence space.] The molecular quasispecies in sequence space

  21. [Figure: concentration simplex with corners $e_1$, $e_2$, $e_3$, coordinates $x_1$, $x_2$, $x_3$ and the labels $l_0$, $l_1$, $l_2$.] The quasispecies on the concentration simplex $S_3 = \{ x_i \ge 0,\ i = 1, 2, 3;\ \sum_{i=1}^{3} x_i = 1 \}$

  22. Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$. [Figure: structures with the associated rate constants $f_0, f_1, \ldots, f_7$ and $f_\tau$.] Evaluation of RNA secondary structures yields replication rate constants

  23. Hamming distance $d_H(S_1, S_2) = 4$
      (i) $d_H(S_1, S_1) = 0$
      (ii) $d_H(S_1, S_2) = d_H(S_2, S_1)$
      (iii) $d_H(S_1, S_3) \le d_H(S_1, S_2) + d_H(S_2, S_3)$
      The Hamming distance between structures in parentheses notation forms a metric in structure space

  24. [Figure: flow reactor containing the reaction mixture and fed from a stock solution.] Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$. Selection constraint: the number of RNA molecules is controlled by the flow, $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$. The flow reactor as a device for studies of evolution in vitro and in silico

  25. [Figures: a randomly chosen initial structure, and phenylalanyl-tRNA as target structure, with positions numbered 10 to 70.]

  26. [Figure: master sequence, mutant cloud and "off-the-cloud" mutations; axes: concentration versus sequence space.] The molecular quasispecies in sequence space

  27. Genotype-phenotype mapping: $S_j = \psi(I_j)$. Evaluation of the phenotype: $f_j = f(S_j)$. Mutation: $Q$. [Figure: cycle of mutation, genotype-phenotype mapping and evaluation for the genotypes $I_1, I_2, \ldots, I_n, I_{n+1}$ with rate constants $f_1, f_2, \ldots, f_n, f_{n+1}$.] Evolutionary dynamics including molecular phenotypes

  28. [Plot: evolutionary trajectory; y-axis: average distance from initial structure, $50 - \Delta d_S$ (0 to 50); x-axis: time in arbitrary units (0 to 1250).] In silico optimization in the flow reactor: Trajectory (biologists' view)

  29. [Plot: evolutionary trajectory; y-axis: average structure distance to target, $\Delta d_S$ (0 to 50); x-axis: time in arbitrary units (0 to 1250).] In silico optimization in the flow reactor: Trajectory (physicists' view)

  30. [Plot: evolutionary trajectory and relay steps; axes: average structure distance to target $\Delta d_S$ and number of relay step (36 to 44) versus time; structure 44 shown.] End conformation of optimization

  31. [Same plot; structures 43 and 44 shown.] Reconstruction of the last step 43 → 44

  32. [Same plot; structures 42, 43 and 44 shown.] Reconstruction of last-but-one step 42 → 43 (→ 44)

  33. [Same plot; structures 41 to 44 shown.] Reconstruction of step 41 → 42 (→ 43 → 44)

  34. [Same plot; structures 40 to 44 shown.] Reconstruction of step 40 → 41 (→ 42 → 43 → 44)

  35. [Same plot; structures 39 to 44 shown, with the evolutionary process running forward and the reconstruction running backward.] Reconstruction of the relay series

  36. [Legend: transition inducing point mutations; neutral point mutations.] Change in RNA sequences during the final five relay steps 39 → 44

  37. [Plot: evolutionary trajectory with relay steps marked; y-axis: average structure distance to target, $\Delta d_S$ (0 to 50); x-axis: time in arbitrary units (0 to 1250).] In silico optimization in the flow reactor: Trajectory and relay steps

  38. [Plot: evolutionary trajectory during the first 500 time units (arbitrary units); axes: average structure distance to target $\Delta d_S$ and number of relay step (08 to 14); annotations: uninterrupted presence; 28 neutral point mutations during a long quasi-stationary epoch; legend: transition inducing point mutations, neutral point mutations.] Neutral genotype evolution during phenotypic stasis

  39. [Plot: evolutionary trajectory with relay steps and main transitions marked; y-axis: average structure distance to target, $\Delta d_S$ (0 to 50); x-axis: time in arbitrary units (0 to 1250).] In silico optimization in the flow reactor: Main transitions

  40. [Figure: structures labelled 00, 09, 31 and 44.] Three important steps in the formation of the tRNA clover leaf from a randomly chosen initial structure, corresponding to three main transitions.

  41. Movies of optimization trajectories over the AUGC and the GC alphabet

  42. [Histogram: frequency (0 to 0.2) versus runtime of trajectories (0 to 5000).] Statistics of the lengths of trajectories from initial structure to target (AUGC sequences)

  43. Statistics of trajectories and relay series (mean values of log-normal distributions)

      | Alphabet | Runtime | Transitions | Main transitions | No. of runs |
      |---|---|---|---|---|
      | AUGC | 385.6 | 22.5 | 12.6 | 1017 |
      | GUC | 448.9 | 30.5 | 16.5 | 611 |
      | GC | 2188.3 | 40.0 | 20.6 | 107 |

  44. Inverse folding of RNA secondary structures (minimum free energy criterion). The idea of the inverse folding algorithm is to search for sequences that form a given RNA secondary structure under the minimum free energy criterion.

  45. Structure

  46. [Figure: a structure and a compatible sequence drawn onto it, from 5'-end to 3'-end.] Structure and compatible sequence

  47. [Figure: a larger structure and a compatible sequence drawn onto it, from 5'-end to 3'-end.] Structure and compatible sequence

  48. [Figure: the same structure and compatible sequence.] Single nucleotides: A, U, G, C. Base pairs: AU, UA; GC, CG; GU, UG. Structure and compatible sequence

  49. [Figure: the same structure with a sequence that cannot form all of its base pairs.] Structure and incompatible sequence
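
Compatibility of a sequence with a structure can be tested mechanically: every base pair of the structure must be occupied by one of the six allowed pairs AU, UA, GC, CG, GU, UG. The sketch below assumes the structure is given in parentheses (dot-bracket) notation; the hairpin and the two sequences are made-up illustration values, and the function names are hypothetical.

```python
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def base_pairs(structure):
    """Return the list of paired positions (i, j) of a dot-bracket structure."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError("unexpected symbol: " + symbol)
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs

def is_compatible(sequence, structure):
    """True if every base pair of the structure can be formed by the sequence."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in base_pairs(structure))

hairpin = "((((....))))"                                 # hypothetical example structure
compatible = is_compatible("GGGCAAAAGCUC", hairpin)      # pairs GC, GU, GC, CG
incompatible = is_compatible("GGGCAAAAGCAC", hairpin)    # contains a G-A mismatch
```

Compatibility is only a necessary condition: a compatible sequence can form the structure, but the structure need not be its minimum free energy structure.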

  50. [Figure: paths in sequence space with the labels: initial trial sequences; stop sequence of an unsuccessful trial; intermediate compatible sequences; target sequence; target structure $S_k$.] Approach to the target structure $S_k$ in the inverse folding algorithm

  51. Inverse folding of RNA secondary structures (minimum free energy criterion). [Figure: 1st to 5th trial.] The inverse folding algorithm searches for sequences that form a given RNA secondary structure under the minimum free energy criterion.

  52. Theory of genotype – phenotype mapping
      P. Schuster, W. Fontana, P.F. Stadler, I.L. Hofacker, From sequences to shapes and back: A case study in RNA secondary structures. Proc. Roy. Soc. London B 255 (1994), 279-284
      W. Grüner, R. Giegerich, D. Strothmann, C. Reidys, I.L. Hofacker, P. Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. I. Neutral networks. Mh. Chem. 127 (1996), 355-374
      W. Grüner, R. Giegerich, D. Strothmann, C. Reidys, I.L. Hofacker, P. Schuster, Analysis of RNA sequence structure maps by exhaustive enumeration. II. Structure of neutral networks and shape space covering. Mh. Chem. 127 (1996), 375-389
      C.M. Reidys, P.F. Stadler, P. Schuster, Generic properties of combinatory maps. Bull. Math. Biol. 59 (1997), 339-397
      I.L. Hofacker, P. Schuster, P.F. Stadler, Combinatorics of RNA secondary structures. Discr. Appl. Math. 89 (1998), 177-207
      C.M. Reidys, P.F. Stadler, Combinatory landscapes. SIAM Review 44 (2002), 3-54

  53. $S_k = \psi(I_.)$; $f_k = f(S_k)$. [Figure: sequence space → structure space → real numbers.] Mapping from sequence space into structure space and into function

  54. $S_k = \psi(I_.)$; $f_k = f(S_k)$. [Figure: sequence space → structure space → real numbers.]

  55. $S_k = \psi(I_.)$; $f_k = f(S_k)$. [Figure: sequence space → structure space → real numbers.] The pre-image of the structure $S_k$ in sequence space is the neutral network $G_k$

  56. Neutral networks are sets of sequences forming the same structure. $G_k$ is the pre-image of the structure $S_k$ in sequence space: $G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$. The set is converted into a graph by connecting all sequences of Hamming distance one. Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length $n$. This number, $N = 4^n$, becomes very large with increasing length and is prohibitive for numerical computations. Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
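
Converting a set of neutral sequences into a graph, by connecting all sequences of Hamming distance one, and splitting it into connected components can be sketched as follows. The neutral set used here is a made-up toy example rather than the result of folding, and the function names are hypothetical.

```python
from collections import deque

def neighbours(seq, alphabet="AUGC"):
    """All sequences at Hamming distance one from seq."""
    for i, current in enumerate(seq):
        for letter in alphabet:
            if letter != current:
                yield seq[:i] + letter + seq[i + 1:]

def components(neutral_set, alphabet="AUGC"):
    """Connected components of the graph whose edges join sequences at distance one."""
    remaining = set(neutral_set)
    result = []
    while remaining:
        start = remaining.pop()
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search restricted to the neutral set
            seq = queue.popleft()
            for nb in neighbours(seq, alphabet):
                if nb in remaining:
                    remaining.remove(nb)
                    component.add(nb)
                    queue.append(nb)
        result.append(component)
    return result

toy_neutral_set = {"AAA", "AAU", "AUU", "GGG"}   # hypothetical pre-image of one structure
parts = components(toy_neutral_set)
sequence_space_size = 4 ** 3                      # N = 4^n for chain length n = 3
```

Each sequence has $(\kappa - 1) n$ neighbours for alphabet size $\kappa$, so the search only inspects neighbours and never enumerates the full sequence space.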

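  57. $G_k = \psi^{-1}(S_k) = \{ I_j \mid \psi(I_j) = S_k \}$
      Degree of neutrality of a single sequence, e.g. $\lambda_j = 12/27 = 0.444$; mean degree of neutrality: $\bar{\lambda}_k = \sum_{j \in G_k} \lambda_j^{(k)} / |G_k|$
      Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa - 1)}$; alphabet size $\kappa$, e.g. AUGC: $\kappa = 4$

      | $\kappa$ | $\lambda_{cr}$ | Alphabets |
      |---|---|---|
      | 2 | 0.5 | GC, AU |
      | 3 | 0.423 | GUC, AUG |
      | 4 | 0.370 | AUGC |

      $\bar{\lambda}_k > \lambda_{cr}$ ... network $G_k$ is connected
      $\bar{\lambda}_k < \lambda_{cr}$ ... network $G_k$ is not connected
      Mean degree of neutrality and connectivity of neutral networks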
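
The connectivity threshold $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$ is easy to evaluate for the alphabet sizes listed on the slide. A minimal sketch with hypothetical function names; reading the example value 12/27 as a mean degree of neutrality to compare against the threshold is an assumption of this example.

```python
def connectivity_threshold(kappa):
    """lambda_cr = 1 - kappa^(-1 / (kappa - 1)) for alphabet size kappa >= 2."""
    return 1.0 - kappa ** (-1.0 / (kappa - 1))

def is_connected(mean_neutrality, kappa):
    """Random-graph prediction: connected if the mean neutrality exceeds lambda_cr."""
    return mean_neutrality > connectivity_threshold(kappa)

thresholds = {kappa: round(connectivity_threshold(kappa), 3) for kappa in (2, 3, 4)}
example = is_connected(12 / 27, kappa=4)   # 0.444 compared with the AUGC threshold
```

The threshold decreases with growing alphabet size, so the same degree of neutrality is more likely to yield a connected network over AUGC than over a two-letter alphabet.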

  58. A connected neutral network

  59. [Figure label: giant component.] A multi-component neutral network

  60. [Figure: cloverleaf secondary structures with positions numbered 10 to 70.] Degree of neutrality $\bar{\lambda}$ (mean ± standard deviation) by alphabet; the columns refer to the structures shown in the figure:

      | Alphabet | | | |
      |---|---|---|---|
      | AU | - | - | 0.073 ± 0.032 |
      | AUG | - | 0.217 ± 0.051 | 0.201 ± 0.056 |
      | AUGC | 0.275 ± 0.064 | 0.279 ± 0.063 | 0.313 ± 0.058 |
      | UGC | 0.263 ± 0.071 | 0.257 ± 0.070 | 0.250 ± 0.064 |
      | GC | 0.052 ± 0.033 | 0.057 ± 0.034 | 0.068 ± 0.034 |

      Degree of neutrality of cloverleaf RNA secondary structures over different alphabets

  61. Reference for postulation and in silico verification of neutral networks

  62. [Figure: structure $S_k$, neutral network $G_k$ and compatible set $C_k$, with $G_k \subset C_k$.] The compatible set $C_k$ of a structure $S_k$ consists of all sequences which form $S_k$ as their minimum free energy structure (the neutral network $G_k$) or as one of their suboptimal structures.
