a few thoughts on graphs in chemistry and biology

A few Thoughts on Graphs in Chemistry and Biology, Peter Schuster



  1. A few Thoughts on Graphs in Chemistry and Biology. Peter Schuster, 19th LL-Seminar on Graph Theory, ÖAW, 25.04.2002

  2. Graphs are seen as valuable tools to order and classify information in various scientific disciplines at an intermediate stage of knowledge or level of approximation. Such stages are, for example, • collection or harvesting of data, • ordering of data according to new categories and development of models for qualitative analysis, • development of models for quantitative analysis and accurate predictions.

  3. Graphs are considered here as tools to • distinguish chemical isomers, • describe the flux in chemical reaction networks, • define biological species by their phylogenetic descent, and • model genotype-phenotype maps in case of neutrality.

  4. Chemists have used graphs to distinguish isomers since the second half of the nineteenth century. Atoms are nodes and chemical bonds are edges. In hydrocarbons, which contain only carbon and hydrogen atoms, the position of an atom in the graph is sufficient to predict its nature: H atoms form one bond and are attached to one edge, whereas C atoms always form four bonds and are connected to four edges.
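
The degree rule for hydrocarbons can be illustrated with a minimal Python sketch. It is not part of the talk: the helper names `saturate`, `formula` and `skeleton_degrees` and the integer node labels are assumptions made for this example. A carbon skeleton is filled up with hydrogen nodes until every carbon has degree 4, and the atom types are then read back from the node degrees alone.

```python
from collections import Counter

def saturate(carbon_edges, n_carbons):
    """Attach hydrogen nodes until every carbon node has degree 4."""
    adj = {c: set() for c in range(n_carbons)}
    for a, b in carbon_edges:
        adj[a].add(b)
        adj[b].add(a)
    next_id = n_carbons
    for c in range(n_carbons):
        while len(adj[c]) < 4:
            adj[c].add(next_id)      # new hydrogen node bonded to carbon c
            adj[next_id] = {c}
            next_id += 1
    return adj

def formula(adj):
    """Read the atom type off the node degree: 4 -> C, 1 -> H."""
    counts = Counter("C" if len(nb) == 4 else "H" for nb in adj.values())
    return f"C{counts['C']}H{counts['H']}"

def skeleton_degrees(carbon_edges, n_carbons):
    """Sorted degrees of the carbon-only subgraph (a simple fingerprint)."""
    deg = Counter()
    for a, b in carbon_edges:
        deg[a] += 1
        deg[b] += 1
    return sorted(deg[c] for c in range(n_carbons))

# Carbon skeletons of the two butane isomers (node labels are arbitrary).
n_butane = [(0, 1), (1, 2), (2, 3)]    # unbranched chain
isobutane = [(0, 1), (0, 2), (0, 3)]   # one central carbon

print(formula(saturate(n_butane, 4)), skeleton_degrees(n_butane, 4))
print(formula(saturate(isobutane, 4)), skeleton_degrees(isobutane, 4))
```

Both graphs have the same composition, but their carbon skeletons differ. The sorted degree list is only a convenient fingerprint that happens to separate these two isomers; it is not a general isomorphism test.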

  5. D. J. Cram and G. S. Hammond, Organic Chemistry, McGraw-Hill, New York 1959, p. 18

  6. $\mathrm{C}_n\mathrm{H}_{2n+2}$, n = 1, 2, 3, 4, 5: methane, ethane, propane, n-butane, isobutane, n-pentane, isopentane, neopentane. Formulas of the eight simplest alkanes as graphs, which allow for the distinction of isomers, e.g. n- and isobutane, or n-, iso- and neopentane.

  7. C6H6: hexa-2,4-diyne (dimethyldiacetylene), benzene, hexa-1,2,4,5-tetraene (diallene). Graphs allow for a distinction of single, double and triple bonds.

  8. C2H6O: dimethyl ether and ethanol. Carbon, hydrogen and oxygen atoms are distinguished by the degree of the corresponding nodes: d(H) = 1, d(O) = 2, and d(C) = 4.
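
As a hedged illustration of the degree rule d(H) = 1, d(O) = 2, d(C) = 4, the following Python sketch stores the two C2H6O isomers as unlabelled graphs and recovers the elements from the node degrees. The edge lists, the integer node labels and the helper names `build`, `elements` and `oxygen_neighbours` are assumptions for this example, not taken from the slides.

```python
from collections import Counter

ELEMENT_BY_DEGREE = {1: "H", 2: "O", 4: "C"}

def build(edges):
    """Undirected graph as an adjacency dictionary."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def elements(adj):
    """Assign an element to every node from its degree alone."""
    return {node: ELEMENT_BY_DEGREE[len(nb)] for node, nb in adj.items()}

def oxygen_neighbours(adj):
    """Elements bonded to the (single) oxygen node."""
    elem = elements(adj)
    oxygen = next(n for n, e in elem.items() if e == "O")
    return sorted(elem[n] for n in adj[oxygen])

# Node labels carry no chemical information; only the edges matter.
ethanol = build([(0, 1), (1, 2), (2, 8),
                 (0, 3), (0, 4), (0, 5), (1, 6), (1, 7)])
dimethyl_ether = build([(0, 2), (2, 1),
                        (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (1, 8)])

print(Counter(elements(ethanol).values()), oxygen_neighbours(ethanol))
print(Counter(elements(dimethyl_ether).values()), oxygen_neighbours(dimethyl_ether))
```

Both graphs contain two nodes of degree 4, one of degree 2 and six of degree 1, yet the oxygen node is bonded to C and H in ethanol and to two C atoms in dimethyl ether, which is what separates the isomers.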

  9. C6H6, benzene: the benzene molecule cannot be described by a single graph.

  10. CH3X: methane (X = H), methyl fluoride (X = F), methyl chloride (X = Cl), methyl bromide (X = Br), methyl iodide (X = I). Different atoms forming one bond: H, F, Cl, Br, and I.

  11. Ethane, C2H6; 1,1-dichloroethane and 1,2-dichloroethane, C2H4Cl2. Two isomers that cannot be distinguished by means of their graphs when atoms are identified by node degree alone, since H and Cl both form one bond.

  12. Paul Karrer, Lehrbuch der organischen Chemie, Georg Thieme Verlag, Stuttgart 1959, p.737

  13. Paul Karrer, Lehrbuch der organischen Chemie, Georg Thieme Verlag, Stuttgart 1959, p.949

  14. Molecular structure of the formamide molecule. [Figure: structural formula annotated with bond lengths between 1.00 and 1.35 Å and bond angles between 112.7° and 124.7°; 1 Å = $10^{-10}$ m]

  15. Molecular structure of an association complex between a protein and a nucleic acid

  16. Chemists use directed graphs to model reaction mechanisms in chemical kinetics.

  17. Paul Karrer, Lehrbuch der organischen Chemie, Georg Thieme Verlag, Stuttgart 1959, p.479

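  18. Reaction graph of a kinetic mechanism. [Figure: directed graph whose nodes include the states A + B + C + D, AB + C + D, AD + B + C, ABD + C, ACD + B and ACE + B, together with states involving E, C and EC]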

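  19. Reaction graph of a kinetic mechanism with rate constants. [Figure: the same directed graph, with nodes including A + B + C + D, AB + C + D, AD + B + C, ABD + C, ACD + B and ACE + B and states involving E, C and EC; the edges are labelled with the rate constants $k_1, \dots, k_8$ and $k_{-1}, \dots, k_{-4}$]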
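
A reaction graph with rate constants can be turned into mass-action rate equations edge by edge. The Python sketch below is only an illustration under stated assumptions: it uses a single reversible step A + B ⇌ AB with made-up rate constants and does not reproduce the mechanism shown in the figure. The names `reactions` and `rates` are hypothetical.

```python
from math import prod

# Directed edges of the reaction graph:
# (source complex, target complex, rate constant).
# The numerical values are invented for this example.
reactions = [
    (("A", "B"), ("AB",), 2.0),   # forward step
    (("AB",), ("A", "B"), 0.5),   # reverse step
]

def rates(conc):
    """Mass-action time derivatives d[X]/dt for every species in conc."""
    ddt = {species: 0.0 for species in conc}
    for source, target, k in reactions:
        flux = k * prod(conc[s] for s in source)   # flux along this edge
        for s in source:
            ddt[s] -= flux
        for s in target:
            ddt[s] += flux
    return ddt

print(rates({"A": 1.0, "B": 1.0, "AB": 0.0}))
print(rates({"A": 0.5, "B": 0.5, "AB": 0.5}))
```

Each directed edge contributes one flux term, so adding a reaction to the mechanism means adding one entry to the edge list.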

  20. Biochemical Pathways: the reaction network of cellular metabolism published by Boehringer-Ingelheim. [Figure: wall chart with grid coordinates A to L and 1 to 10]

  21. The citric acid or Krebs cycle (enlarged from previous slide).

  22. Biologists use directed graphs in the form of trees to distinguish biological species by their descent. The concept of evolution allows the wealth of species to be ordered by phylogenetic relationship. The direction of development and the time ordering are introduced by the fossil record.
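
A tree of descent can be stored as a map from each node to its parent, which makes questions about common ancestry straightforward. The sketch below is a minimal illustration in Python; the node names are placeholders rather than real taxa, and `lineage` and `common_ancestor` are hypothetical helper names.

```python
# Directed tree stored as child -> parent; the root has no parent.
parent = {
    "root": None,
    "X": "root", "Y": "root",
    "x1": "X", "x2": "X",
    "y1": "Y",
}

def lineage(node):
    """Path from a node back to the root, following the direction of descent."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def common_ancestor(a, b):
    """Most recent node shared by the lineages of a and b."""
    seen = set(lineage(a))
    return next(n for n in lineage(b) if n in seen)

print(lineage("x1"))
print(common_ancestor("x1", "x2"))
print(common_ancestor("x1", "y1"))
```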

  23. Charles Darwin, The Origin of Species, 6th edition. Everyman's Library, Vol. 811, Dent, London, pp. 121-122. [Figure with a time axis]

  24. Phylogenetic tree of the animal kingdom. Lynn Margulis & Karlene V. Schwartz, Five Kingdoms. An illustrated guide to the Phyla of Life on Earth. W.H. Freeman & Co., San Francisco, 1982, p. 160.

  25. Phylogenetic tree of the animal kingdom, with the times $t_1$, $t_2$ and $t_3$ marked on the time axis. Lynn Margulis & Karlene V. Schwartz, Five Kingdoms. An illustrated guide to the Phyla of Life on Earth. W.H. Freeman & Co., San Francisco, 1982, p. 160.

  26. The genotypes or genomes of individuals and of species (reproductively related ensembles of individuals) are DNA sequences. They change from generation to generation through mutation and recombination. Genotypes unfold into phenotypes or organisms, which are the targets of the evolutionary selection process. Point mutations are single nucleotide exchanges. The Hamming distance of two sequences is the minimal number of single nucleotide exchanges that converts one sequence into the other.

  27. Point mutations as moves in sequence space: starting from ....GCCAUC...., the sequences ....GCGAUC.... and ....GCCUUC.... are each at Hamming distance $d_H = 1$, and ....GCGUUC.... is at $d_H = 2$.

  28. Two sequences $S_1$ and $S_2$ share the segments CGTCGTTACAATTTA, GTTATGTGCGAATTC, CAAATT, AAAA and ACAAGAG..... and differ in the four positions between them, which read G, A, G, T in one sequence and A, C, A, C in the other: Hamming distance $d_H(S_1,S_2) = 4$. The Hamming distance induces a metric in sequence space: (i) $d_H(S_1,S_1) = 0$, (ii) $d_H(S_1,S_2) = d_H(S_2,S_1)$, (iii) $d_H(S_1,S_3) \le d_H(S_1,S_2) + d_H(S_2,S_3)$.
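
The Hamming distance and its metric properties can be checked with a short Python sketch. The function name `hamming` and the brute-force check over all sequences of length 2 are choices made for this illustration; the six-letter test sequences are the fragments shown on the point-mutation slide.

```python
from itertools import product

def hamming(s1, s2):
    """Number of positions at which two equal-length sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(s1, s2))

print(hamming("GCCAUC", "GCGAUC"))   # one point mutation
print(hamming("GCCAUC", "GCGUUC"))   # two point mutations

# Brute-force check of the three metric properties on all AUGC words of length 2.
seqs = ["".join(p) for p in product("AUGC", repeat=2)]
metric_ok = all(
    hamming(x, x) == 0
    and hamming(x, y) == hamming(y, x)
    and hamming(x, z) <= hamming(x, y) + hamming(y, z)
    for x in seqs for y in seqs for z in seqs
)
print(metric_ok)
```

The exhaustive check only confirms the properties for this small space; the general statement follows because each position contributes 0 or 1 to the distance independently.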

  29. Sequence space of binary sequences of chain length n = 5. Binary sequences are encoded by their decimal equivalents with C = 0 and G = 1, for example "0" ≡ 00000 = CCCCC, "14" ≡ 01110 = CGGGC, "29" ≡ 11101 = GGGCG, etc. Mutant classes: class 0: 0; class 1: 1, 2, 4, 8, 16; class 2: 3, 5, 6, 9, 10, 12, 17, 18, 20, 24; class 3: 7, 11, 13, 14, 19, 21, 22, 25, 26, 28; class 4: 15, 23, 27, 29, 30; class 5: 31.
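
The decimal encoding and the mutant classes can be reproduced with a few lines of Python. This is a sketch for illustration; the function names `encode` and `mutant_classes` are assumptions, and a mutant class is taken here to be the set of sequences at a given Hamming distance from the reference 00000, i.e. with a given number of G's.

```python
def encode(seq):
    """Decimal equivalent of a C/G sequence with C = 0 and G = 1."""
    return int(seq.translate(str.maketrans("CG", "01")), 2)

def mutant_classes(n):
    """Group all binary sequences of length n by their number of 1 digits."""
    classes = {k: [] for k in range(n + 1)}
    for value in range(2 ** n):
        classes[bin(value).count("1")].append(value)
    return classes

print(encode("CCCCC"), encode("CGGGC"), encode("GGGCG"))
for k, members in mutant_classes(5).items():
    print(k, members)
```

The class sizes follow the binomial coefficients of n = 5, so the classes contain 1, 5, 10, 10, 5 and 1 sequences.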

  30. The RNA model considers RNA sequences as genotypes and simplified RNA structures, called secondary structures, as phenotypes. The mapping from genotypes into phenotypes is many-to-one; hence it is redundant and not invertible. Genotypes, i.e. RNA sequences, that are mapped onto the same phenotype, i.e. the same RNA secondary structure, form neutral networks. Neutral networks are represented by graphs in sequence space.
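
The idea of a neutral network as the pre-image of a many-to-one map can be sketched in Python with a toy phenotype function. Note the assumptions: `toy_phenotype` is an invented stand-in and not an RNA folding algorithm, the alphabet is reduced to C and G, and the helper names `preimages` and `neutral_edges` are hypothetical.

```python
from collections import defaultdict
from itertools import product

def toy_phenotype(seq):
    """Invented stand-in for folding: do the first and last base form a C-G pair?"""
    return "paired" if {seq[0], seq[-1]} == {"C", "G"} else "open"

def preimages(n, alphabet="CG"):
    """Group all sequences of length n by the phenotype they map to."""
    networks = defaultdict(set)
    for letters in product(alphabet, repeat=n):
        seq = "".join(letters)
        networks[toy_phenotype(seq)].add(seq)
    return networks

def neutral_edges(network):
    """Edges join sequences of the same network at Hamming distance 1."""
    return {
        (a, b)
        for a in network for b in network
        if a < b and sum(x != y for x, y in zip(a, b)) == 1
    }

networks = preimages(3)
print(sorted(networks["paired"]))
print(sorted(neutral_edges(networks["paired"])))
```

Eight sequences map onto only two phenotypes, so the map cannot be inverted. In this toy case the network of the "paired" phenotype falls into two separate pieces, which shows that a neutral network need not be connected.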

  31. Three-dimensional structure of phenylalanyl-transfer-RNA

  32. Definition and formation of the secondary structure of phenylalanyl-tRNA. Sequence (5'-end to 3'-end): GCGGAU UUA GCUC AGDDGGGA GAGC M CCAGA CUGAAYA UCUGG AGMUC CUGUG TPCGAUC CACAG A AUUCGC ACCA. [Figure panels: sequence, secondary structure with positions 10 to 70 marked, and symbolic notation]

  33. Criterion of minimum free energy. [Figure: sequences in sequence space are assigned to structures in shape space] Example sequences: UUUAGCCAGCGCGAGUCGUGCGGACGGGGUUAUCUCUGUCGGGCUAGGGCGC, GUGAGCGCGGGGCACAGUUUCUCAAGGAUGUAAGUUUUUGCCGUUUAUCUGG, UUAGCGAGAGAGGAGGCUUCUAGACCCAGCUCUCUGGGUCGUUGCUGAUGCG, CAUUGGUGCUAAUGAUAUUAGGGCUGUAUUCCUGUAUAGCGAUCAGUGUCCG, GUAGGCCCUCUUGACAUAAGAUUUUUCCAAUGGUGGGAGAUGGCCAUUGCAG.

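  34. Mapping from sequence space into phenotype space and into fitness values: $S_k = \psi(I_{\cdot})$ maps sequence space into phenotype space, and $f_k = f(S_k)$ maps phenotype space into the non-negative numbers.

  35. $S_k = \psi(I_{\cdot})$ maps sequence space into phenotype space, and $f_k = f(S_k)$ maps phenotype space into the non-negative numbers.

  36. $S_k = \psi(I_{\cdot})$ maps sequence space into phenotype space, and $f_k = f(S_k)$ maps phenotype space into the non-negative numbers.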

  37. Neutral networks of small RNA molecules can be computed by exhaustive folding of complete sequence spaces, i.e. all RNA sequences of a given chain length n. Their number, $N = 4^n$, becomes very large with increasing chain length, which makes exhaustive folding prohibitive for numerical computation. Neutral networks can be modelled by random graphs in sequence space. In this approach, nodes are inserted randomly into sequence space until the size of the pre-image, i.e. the number of neutral sequences, matches the neutral network to be studied.
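
The random graph approach can be sketched in Python as follows. This is an illustration under assumptions, not the procedure used in the talk: nodes are drawn uniformly without replacement until the target size is reached, two nodes are joined when their Hamming distance is 1, and the names `neighbours`, `random_neutral_network` and `components` as well as the fixed seed are choices made for this example.

```python
import random
from itertools import product

def neighbours(seq, alphabet):
    """All sequences at Hamming distance 1 from seq."""
    for i, old in enumerate(seq):
        for new in alphabet:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]

def random_neutral_network(n, size, alphabet="AUGC", seed=0):
    """Insert `size` distinct, randomly chosen nodes into sequence space."""
    rng = random.Random(seed)
    space = ["".join(p) for p in product(alphabet, repeat=n)]
    return set(rng.sample(space, size))

def components(nodes, alphabet="AUGC"):
    """Connected components of the induced subgraph, largest first."""
    unvisited = set(nodes)
    found = []
    while unvisited:
        start = unvisited.pop()
        comp, stack = {start}, [start]
        while stack:
            current = stack.pop()
            for nb in neighbours(current, alphabet):
                if nb in unvisited:
                    unvisited.remove(nb)
                    comp.add(nb)
                    stack.append(nb)
        found.append(comp)
    return sorted(found, key=len, reverse=True)

network = random_neutral_network(n=4, size=100)
parts = components(network)
print(len(network), [len(p) for p in parts])
```

Enumerating the whole space is only feasible for short chains, since it contains $4^n$ sequences; the sketch is meant to show the construction, not to scale.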

  38. Random graph approach to neutral networks, step 00 (sketch of sequence space)

  39. Random graph approach to neutral networks, step 01 (sketch of sequence space)

  40. Random graph approach to neutral networks, step 02 (sketch of sequence space)

  41. Random graph approach to neutral networks, step 03 (sketch of sequence space)

  42. Random graph approach to neutral networks, step 04 (sketch of sequence space)

  43. Random graph approach to neutral networks, step 05 (sketch of sequence space)

  44. Random graph approach to neutral networks, step 10 (sketch of sequence space)

  45. Random graph approach to neutral networks, step 15 (sketch of sequence space)

  46. Random graph approach to neutral networks, step 25 (sketch of sequence space)

  47. Random graph approach to neutral networks, step 50 (sketch of sequence space)

  48. Random graph approach to neutral networks, step 75 (sketch of sequence space)

  49. Random graph approach to neutral networks, step 100 (sketch of sequence space)

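  50. Mean degree of neutrality and connectivity of neutral networks. The neutral network of structure $S_k$ is the pre-image $G_k = \psi^{-1}(S_k) = \{\, I_j \mid \psi(I_j) = S_k \,\}$. With the degree of neutrality $\lambda_j$ of a single sequence (in the example shown, $\lambda_j = 12/27$), the mean degree of neutrality is $\lambda_k = \sum_j \lambda_j^{(k)} / |G_k|$. Connectivity threshold: $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$, where $\kappa$ is the alphabet size (AUGC: $\kappa = 4$). For $\lambda_k > \lambda_{cr}$ the network $G_k$ is connected; for $\lambda_k < \lambda_{cr}$ the network $G_k$ is not connected. Threshold values: $\kappa = 2$: $\lambda_{cr} = 0.5$; $\kappa = 3$: $\lambda_{cr} = 0.4226$; $\kappa = 4$: $\lambda_{cr} = 0.3700$.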
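
The connectivity threshold can be evaluated numerically with a small Python sketch. The closed form used here is $\lambda_{cr} = 1 - \kappa^{-1/(\kappa-1)}$, which reproduces the three tabulated threshold values; the function names `lambda_cr` and `mean_neutrality` are assumptions for this example, and the degree of neutrality of a sequence is taken to be the fraction of its one-error neighbours that belong to the same network.

```python
def lambda_cr(kappa):
    """Connectivity threshold for an alphabet of size kappa."""
    return 1 - kappa ** (-1 / (kappa - 1))

def mean_neutrality(network, alphabet="AUGC"):
    """Average fraction of one-error neighbours that stay inside the network."""
    total = 0.0
    for seq in network:
        nbrs = [seq[:i] + b + seq[i + 1:]
                for i, a in enumerate(seq) for b in alphabet if b != a]
        total += sum(nb in network for nb in nbrs) / len(nbrs)
    return total / len(network)

for kappa in (2, 3, 4):
    print(kappa, round(lambda_cr(kappa), 4))

# The example value 12/27 lies above the threshold for the AUGC alphabet.
print(12 / 27 > lambda_cr(4))
```

Comparing a measured mean degree of neutrality with `lambda_cr(4)` then indicates on which side of the threshold a given network of AUGC sequences lies.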

  51. A multi-component neutral network. [Figure label: giant component]

  52. A connected neutral network
