Mathematical Modeling of Evolution: Solved and Open Problems. Peter Schuster, Institut für Theoretische Chemie, Universität Wien, Austria, and The Santa Fe Institute, Santa Fe, New Mexico, USA. Emerging Modeling Methodologies in Medicine and Biology, Edinburgh, 20.-24.07.2009
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
1. Darwin, Mendel, and evolutionary optimization 2. Evolution as an exercise in chemical kinetics 3. Genotype – phenotype mappings in biopolymers 4. Neutrality in evolution 5. Extending the notion of structure 6. Simulation of molecular evolution 7. Some origins of complexity in biology
Three necessary conditions for Darwinian evolution are: 1. Multiplication, 2. Variation, and 3. Selection. Biologists distinguish the genotype (the genetic information) and the phenotype (the organism and all its properties). The genotype is unfolded in development and yields the phenotype. Variation operates on the genotype through mutation and recombination, whereas the phenotype is the target of selection. Without human intervention, natural selection is based on the number of fertile progeny in forthcoming generations, which is called fitness. Question: Is Darwinian evolution optimizing fitness?
Reproduction of organisms or replication of molecules as the basis of selection: $x_j(t) = N_j(t) / \sum_{i=1}^{n} N_i(t)$; with $f_m = \max\{f_j;\ j = 1, 2, \ldots, n\}$, $x_m(t) \to 1$ for $t \to \infty$.
Selection equation: $[X_i] = x_i \ge 0$, $f_i \ge 0$
$\frac{dx_i}{dt} = x_i \, (f_i - \phi)$, $i = 1, 2, \ldots, n$; $\sum_{i=1}^{n} x_i = 1$; $\phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$
The mean fitness or dilution flux, $\phi(t)$, is a non-decreasing function of time:
$\frac{d\phi}{dt} = \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = \overline{f^2} - (\bar{f})^2 = \mathrm{var}\{f\} \ge 0$
Solutions are obtained by integrating factor transformation:
$x_i(t) = \frac{x_i(0) \exp(f_i t)}{\sum_{j=1}^{n} x_j(0) \exp(f_j t)}$, $i = 1, 2, \ldots, n$
The mean reproduction rate or mean fitness, $\phi(t)$, is optimized in populations.
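As a minimal sketch of the closed-form solution quoted above, the following Python snippet evaluates $x_i(t)$ and the mean fitness $\phi(t)$ at a few time points. The initial concentrations and fitness values are illustrative assumptions, not values from the slides.

```python
import math

def selection_solution(x0, f, t):
    """Closed-form solution x_i(t) of dx_i/dt = x_i (f_i - phi)."""
    # The common factor exp(f_max * t) cancels in the ratio; removing it avoids overflow.
    f_max = max(f)
    weights = [x * math.exp((fi - f_max) * t) for x, fi in zip(x0, f)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_fitness(x, f):
    """phi = sum_j f_j x_j."""
    return sum(xi * fi for xi, fi in zip(x, f))

# Hypothetical example values (assumptions for illustration only).
x0 = [0.5, 0.3, 0.2]
f = [1.0, 2.0, 3.0]
times = [0.0, 1.0, 2.0, 5.0, 20.0]

trajectory = [selection_solution(x0, f, t) for t in times]
phis = [mean_fitness(x, f) for x in trajectory]

for t, x, phi in zip(times, trajectory, phis):
    print(f"t={t:5.1f}  x={[round(v, 4) for v in x]}  phi={phi:.4f}")
```

In this example the variant with the largest $f_i$ takes over the population and $\phi(t)$ never decreases along the sampled times, as the variance formula requires.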
Gregor Mendel, 1822-1884. Mendel's rules of inheritance: white and red colors of flowers
Ronald Aylmer Fisher and the other scholars of population genetics, John Burdon Sanderson Haldane and Sewall Wright, reconciled the theory of natural selection with Mendelian genetics. Ronald A. Fisher, The Genetical Theory of Natural Selection (1930). Sewall Wright, Evolution in Mendelian Populations (1931). J.B.S. Haldane, The Causes of Evolution (1932). Portrait caption: Ronald Fisher, 1890-1962, mathematician, statistician, and founder of population genetics.
Sexual reproduction and recombination
Fisher's selection equation: $[X_i] = x_i \ge 0$, $g_{ij} \ge 0$, $g_{ij} = g_{ji}$
$\frac{dx_i}{dt} = x_i \left( \sum_{j=1}^{n} g_{ij} x_j - \phi \right) = x_i \, (f_i - \phi)$, $i = 1, 2, \ldots, n$
$f_i = \sum_{j=1}^{n} g_{ij} x_j$; $\sum_{i=1}^{n} x_i = 1$; $\phi = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij} x_i x_j = \sum_{i=1}^{n} f_i x_i = \bar{f}$
The mean fitness or dilution flux, $\phi(t)$, is a non-decreasing function of time:
$\frac{d\phi}{dt} = 2 \sum_{i=1}^{n} f_i \frac{dx_i}{dt} = 2 \left( \overline{f^2} - (\bar{f})^2 \right) = 2\,\mathrm{var}\{f\} \ge 0$
Fisher's fundamental theorem of natural selection is valid for independent genes (single locus model) and autosomal symmetry, $g_{ij} = g_{ji}$.
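The monotonic growth of $\phi$ under Fisher's selection equation can be checked numerically. The sketch below uses a plain explicit Euler step, which is a simplification chosen here for brevity, and a hypothetical symmetric three-allele matrix $g_{ij}$; neither the matrix entries nor the step size come from the slides.

```python
def fisher_rhs(x, g):
    """Right-hand side of dx_i/dt = x_i (f_i - phi) with f_i = sum_j g_ij x_j."""
    n = len(x)
    f = [sum(g[i][j] * x[j] for j in range(n)) for i in range(n)]
    phi = sum(f[i] * x[i] for i in range(n))
    return [x[i] * (f[i] - phi) for i in range(n)], phi

def integrate_fisher(x0, g, dt=0.01, steps=2000):
    """Explicit Euler integration; returns the final state and the recorded phi values."""
    x = list(x0)
    phis = []
    for _ in range(steps):
        dx, phi = fisher_rhs(x, g)
        phis.append(phi)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x, phis

# Hypothetical symmetric fitness matrix, g_ij = g_ji (assumption for illustration).
g = [[1.0, 2.0, 1.5],
     [2.0, 1.0, 1.8],
     [1.5, 1.8, 1.2]]
x0 = [0.6, 0.3, 0.1]

x_final, phis = integrate_fisher(x0, g)
print("phi(0) =", round(phis[0], 4), " phi(end) =", round(phis[-1], 4))
print("final allele frequencies:", [round(v, 4) for v in x_final])
```

Symmetry of $g$ is what makes the factor 2 in $d\phi/dt$ appear; with an asymmetric matrix the same code runs, but the monotonicity is no longer guaranteed.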
The symmetric three-allele case
1971 1977 1988 Chemical kinetics of molecular evolution
Accuracy of replication: $Q = q_1 \cdot q_2 \cdot q_3 \cdots q_n$. Template-induced nucleic acid synthesis proceeds from the 5'-end to the 3'-end.
Kinetics of RNA replication. C.K. Biebricher, M. Eigen, W.C. Gardiner, Jr. Biochemistry 22:2544-2559, 1983.
Complementary replication as the simplest molecular mechanism of reproduction:
$\frac{dx_1}{dt} = f_2 x_2$ and $\frac{dx_2}{dt} = f_1 x_1$
$x_1 = \sqrt{f_2}\, \xi_1$, $x_2 = \sqrt{f_1}\, \xi_2$, $\zeta = \xi_1 + \xi_2$, $\eta = \xi_1 - \xi_2$, $f = \sqrt{f_1 f_2}$
$\eta(t) = \eta(0)\, e^{-ft}$, $\zeta(t) = \zeta(0)\, e^{ft}$
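A small sketch of the plus-minus replication solution, assuming the scaling $x_1 = \sqrt{f_2}\,\xi_1$, $x_2 = \sqrt{f_1}\,\xi_2$, under which the sum $\zeta$ grows as $e^{ft}$ and the difference $\eta$ decays as $e^{-ft}$ with $f = \sqrt{f_1 f_2}$. The rate constants and initial values are hypothetical.

```python
import math

def complementary_replication(x1_0, x2_0, f1, f2, t):
    """Closed-form x1(t), x2(t) for dx1/dt = f2*x2, dx2/dt = f1*x1."""
    f = math.sqrt(f1 * f2)
    # Scaled variables: x1 = sqrt(f2) * xi1, x2 = sqrt(f1) * xi2.
    xi1_0 = x1_0 / math.sqrt(f2)
    xi2_0 = x2_0 / math.sqrt(f1)
    zeta = (xi1_0 + xi2_0) * math.exp(f * t)    # growing mode
    eta = (xi1_0 - xi2_0) * math.exp(-f * t)    # decaying mode
    xi1 = 0.5 * (zeta + eta)
    xi2 = 0.5 * (zeta - eta)
    return math.sqrt(f2) * xi1, math.sqrt(f1) * xi2

# Hypothetical rate constants; start with the plus strand only.
f1, f2 = 2.0, 0.5
for t in (0.0, 1.0, 5.0, 30.0):
    x1, x2 = complementary_replication(1.0, 0.0, f1, f2, t)
    ratio = x1 / x2 if x2 > 0 else float("inf")
    print(f"t={t:5.1f}  x1={x1:.4e}  x2={x2:.4e}  x1/x2={ratio:.4f}")
```

Once the decaying mode has died out, the plus-minus ensemble grows exponentially with rate $f$ and the strand ratio settles at $x_1/x_2 = \sqrt{f_2/f_1}$.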
Replication and mutation are parallel chemical reactions.
Chemical kinetics of replication and mutation as parallel reactions. Normalization of the mutation probabilities: $\sum_{i=1}^{N} Q_{ji} = 1$.
Factorization of the value matrix W separates mutation and fitness effects.
Mutation-selection equation: $[I_i] = x_i \ge 0$, $f_i \ge 0$, $Q_{ij} \ge 0$
$\frac{dx_i}{dt} = \sum_{j=1}^{n} Q_{ij} f_j x_j - x_i \phi$, $i = 1, 2, \ldots, n$; $\sum_{i=1}^{n} x_i = 1$; $\phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$
Solutions are obtained after integrating factor transformation by means of an eigenvalue problem:
$x_i(t) = \frac{\sum_{k=0}^{n-1} \ell_{ik}\, c_k(0) \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} \ell_{jk}\, c_k(0) \exp(\lambda_k t)}$, $i = 1, 2, \ldots, n$; $c_k(0) = \sum_{i=1}^{n} h_{ki}\, x_i(0)$
$W = \{Q_{ij} f_j;\ i, j = 1, 2, \ldots, n\}$; $L = \{\ell_{ij};\ i, j = 1, 2, \ldots, n\}$; $L^{-1} = H = \{h_{ij};\ i, j = 1, 2, \ldots, n\}$
$L^{-1} \cdot W \cdot L = \Lambda = \{\lambda_k;\ k = 0, 1, \ldots, n-1\}$
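For long times the eigenvalue solution is dominated by the largest eigenvalue $\lambda_0$, so the stationary distribution is the corresponding eigenvector of $W$ normalized to sum one. The sketch below checks this with power iteration and a simple Euler integration of the differential equation. The 3x3 mutation matrix $Q$ is a hypothetical example with unit column sums; the fitness values 1.9, 2.0, 2.1 are borrowed from the quasispecies example later in the slides, whose actual $Q$ is not given.

```python
def mat_vec(w, x):
    """Matrix-vector product W x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def normalize(x):
    s = sum(x)
    return [v / s for v in x]

def dominant_eigenpair(w, iterations=2000):
    """Power iteration; eigenvector normalized to sum 1."""
    x = normalize([1.0] * len(w))
    for _ in range(iterations):
        x = normalize(mat_vec(w, x))
    lam = sum(mat_vec(w, x))  # equals lambda_0 because sum(x) == 1
    return lam, x

def integrate(w, f, x0, dt=0.01, steps=30000):
    """Explicit Euler integration of dx_i/dt = sum_j W_ij x_j - x_i phi."""
    x = list(x0)
    for _ in range(steps):
        phi = sum(fj * xj for fj, xj in zip(f, x))
        wx = mat_vec(w, x)
        x = [xi + dt * (wxi - xi * phi) for xi, wxi in zip(x, wx)]
    return x

# Hypothetical mutation matrix (columns sum to 1) and fitness values.
Q = [[0.98, 0.01, 0.01],
     [0.01, 0.98, 0.01],
     [0.01, 0.01, 0.98]]
f = [1.9, 2.0, 2.1]
W = [[Q[i][j] * f[j] for j in range(3)] for i in range(3)]  # W_ij = Q_ij f_j

lam0, x_stationary = dominant_eigenpair(W)
x_ode = integrate(W, f, [1.0, 0.0, 0.0])
print("lambda_0 =", round(lam0, 6))
print("eigenvector:", [round(v, 6) for v in x_stationary])
print("ODE limit:  ", [round(v, 6) for v in x_ode])
```

Because the columns of $Q$ sum to one, the stationary mean fitness $\phi$ coincides with $\lambda_0$, which gives a quick consistency check on any implementation.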
Fitness landscapes showing error thresholds
$Q_{ij} = p^{d_{ij}^{H}} (1 - p)^{n - d_{ij}^{H}}$; $p = 1 - q$. Error threshold: individual sequences, n = 10, σ = 2 and d = 0, 1.0, 1.85
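To make the mutation matrix above concrete, the following sketch builds $W_{ij} = Q_{ij} f_j$ for all binary sequences of a short chain length and computes the stationary fraction of the master sequence for several error rates $p$. Binary sequences, the single-peak landscape (master fitness 2, all others 1), and the chain length n = 5 are assumptions made here to keep the example small; they are not the n = 10 landscapes shown on the slide. The standard estimate places the threshold near $p \approx \ln \sigma / n$, which can be compared with the printed values.

```python
from itertools import product

def hamming(a, b):
    """Hamming distance between two equal-length sequences."""
    return sum(ca != cb for ca, cb in zip(a, b))

def value_matrix(n, p, f_master=2.0, f_other=1.0):
    """W_ij = Q_ij f_j with Q_ij = p^d (1-p)^(n-d), d = Hamming distance."""
    seqs = list(product((0, 1), repeat=n))
    f = [f_master if k == 0 else f_other for k in range(len(seqs))]  # master = all zeros
    return [[p ** hamming(si, sj) * (1 - p) ** (n - hamming(si, sj)) * f[j]
             for j, sj in enumerate(seqs)] for si in seqs]

def stationary(w, iterations=300):
    """Dominant eigenvector of W by power iteration, normalized to sum 1."""
    x = [1.0 / len(w)] * len(w)
    for _ in range(iterations):
        y = [sum(wij * xj for wij, xj in zip(row, x)) for row in w]
        s = sum(y)
        x = [v / s for v in y]
    return x

def master_fraction(n, p):
    """Stationary relative concentration of the master sequence."""
    return stationary(value_matrix(n, p))[0]

n = 5
fractions = {p: master_fraction(n, p) for p in (0.01, 0.05, 0.2)}
for p, x_m in fractions.items():
    print(f"p={p:.2f}  Q=(1-p)^n={(1 - p) ** n:.3f}  master fraction={x_m:.3f}")
```

The master sequence dominates at small $p$ and drops to a minor component once $(1-p)^n \sigma$ falls below one, which is the qualitative picture behind the error threshold.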
The error threshold in replication: quasispecies; driving virus populations through the threshold
Three necessary conditions for Darwinian evolution are: 1. Multiplication, 2. Variation, and 3. Selection. All three conditions are fulfilled not only by cellular organisms but also by nucleic acid molecules (DNA or RNA) in suitable cell-free experimental assays: Darwinian evolution in the test tube. Portrait caption: Charles Darwin, 1809-1882.
Application of molecular evolution to problems in biotechnology
Artificial evolution in biotechnology and pharmacology. G.F. Joyce. 2004. Directed evolution of nucleic acid enzymes. Annu. Rev. Biochem. 73:791-836. C. Jäckel, P. Kast, and D. Hilvert. 2008. Protein design by directed evolution. Annu. Rev. Biophys. 37:153-173. S.J. Wrenn and P.B. Harbury. 2007. Chemical evolution as a tool for molecular discovery. Annu. Rev. Biochem. 76:331-349.
Selection of quasispecies with $f_1 = 1.9$, $f_2 = 2.0$, $f_3 = 2.1$, and $p = 0.01$; parametric plot on $S_3$ with constant level sets of $\phi$
The Darwinian mechanism and optimization of fitness
Phenomenon | Optimization of fitness | Unique selection outcome
Selection | yes | yes
Recombination and selection (independent genes) | yes | no
Recombination and selection (interacting genes) | no | no
Mutation and selection | no | yes
The Darwinian mechanism of variation and selection is a very powerful optimization heuristic.
Complexity in molecular evolution. Mutation and selection: the largest eigenvalue and eigenvector, $\lambda_0$ and $\ell_0$, follow from diagonalization of the matrix W ("complicated but not complex"). W = G · F, with mutation matrix and fitness landscape; the fitness landscape is "complex". The mapping from sequence to structure is "complex".
RNA structure: the molecular phenotype. Sequence, 5'-end to 3'-end: GCGGAU UUA GCUC AGUUGGGA GAGC CCAGA G CUGAAGA UCUGG AGGUC CUGUG UUCGAUC CACAG A AUUCGC ACCA, with $N_k \in \{A, U, G, C\}$. [Figure: chemical structure of the RNA backbone from the 5'-end to the 3'-end, showing nucleotides $N_1$ to $N_4$ linked by phosphate groups with Na+ counterions.]
Number of sequences: $N = 4^n$; number of structures: $N_S < 3^n$. Criterion: minimum free energy (mfe). Rules: _ ( _ ) _ $\in$ {AU, CG, GC, GU, UA, UG}. A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
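The symbolic notation can be read mechanically: each opening parenthesis pairs with its matching closing parenthesis, and a sequence is compatible with a structure only if every pair is one of the six allowed base pairs. The sketch below assumes the common dot-bracket convention with "." for unpaired positions; the function names and the hairpin example are hypothetical.

```python
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}

def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 0-based, from a dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)

def is_compatible(sequence, structure):
    """True if every base pair of the structure is an allowed pair in the sequence."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in parse_dot_bracket(structure))

# Hypothetical hairpin: four stacked pairs closing a three-nucleotide loop.
structure = "((((...))))"
print(parse_dot_bracket(structure))
print(is_compatible("GGGCAAAGCCC", structure))
print(is_compatible("AGGCAAAGCCC", structure))
```

Compatibility is only a necessary condition: whether a compatible sequence actually folds into the structure depends on the minimum free energy criterion, which this sketch does not evaluate.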
The inverse folding algorithm searches for sequences that form a given RNA structure.
One error neighborhood – Surrounding of an RNA molecule of chain length n=50 in sequence and shape space
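The size of the one-error neighborhood follows directly from the alphabet: each of the n positions can be replaced by three other nucleotides, giving 3n neighbors, or 150 for n = 50. A minimal sketch that enumerates them, using a hypothetical 50-nucleotide sequence:

```python
ALPHABET = "AUGC"

def one_error_neighbors(sequence):
    """All sequences at Hamming distance 1 from the given RNA sequence."""
    neighbors = []
    for pos, base in enumerate(sequence):
        for other in ALPHABET:
            if other != base:
                neighbors.append(sequence[:pos] + other + sequence[pos + 1:])
    return neighbors

# Hypothetical sequence of chain length n = 50 (not taken from the slides).
sequence = "ACGU" * 12 + "AC"
neighbors = one_error_neighbors(sequence)
print(len(sequence), len(neighbors))
```

Mapping each neighbor to its structure with a folding program would then give the corresponding surrounding in shape space; that step is outside the scope of this sketch.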