Interpreting the Molecular Tree of Life: What Happened in Early Evolution? Norm Pace MCD Biology University of Colorado-Boulder nrpace@colorado.edu
Outline
• What is the “Tree of Life”?
-- Historical – Conceptually a tree of organisms, but …
-- Molecular trees, constraints, controversies and the Next-Gen stall in expanding the Tree
-- It’s not a “simple” tree of organisms: Pangenome
• How do we know where LUCA is on the molecular map? When was that?
• What was the nature of earliest life, the early lines of descent?
-- How to predict? How to go deeper than LUCA?
-- Paradoxes – may be interesting
Haeckel, 1866
Stanier, 1960s Whittaker, 1969
Carl Woese, early 1980s RNase T1 fingerprint
Woese, 1977 and 1987: Eukaryotes, Bacteria, Archaebacteria (Archaea)
Why rRNA sequences for the backbone of the universal tree?
• Universally present
• The most conservative sequence in biology – even into pre-cellular life.
• No lateral transfer – reflects the genetic machinery.
Expanding the Tree: Into the Natural Microbial World
Sample → DNA → rDNA PCR → library → clone → sequence
Sample → DNA → rDNA PCR → next-gen sequence
Expansion of the Bacterial Tree
Doolittle Confusogram
Swithers and Katz, Microbe 2013
And then came genome sequences …….
Pangenome – the collection of genes accessible to a “phylotype”
Tenaillon et al., Nature Rev. Microbiol. 8:207 (2010) (Genes)
Lukjancenko et al., Mic. Ecol. 60:708 (2010) (Gene families)
E.g. Gene contents of different “strains” of Escherichia coli: “Pangenome”
Strain A, Strain B, Strain C
What’s with all the “lateral transfer”?
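The strain comparison above can be sketched with plain set operations. This is only an illustration: the gene names and strain contents below are hypothetical placeholders, not data from the cited papers, and `pangenome_summary` is a name invented here. The pangenome is taken as the union of all strains' genes and the core as their intersection.

```python
def pangenome_summary(strains):
    """Summarize gene content across strains.

    strains: dict mapping strain name -> set of gene names (hypothetical data).
    """
    gene_sets = list(strains.values())
    pan = set().union(*gene_sets)           # every gene seen in any strain
    core = set.intersection(*gene_sets)     # genes shared by all strains
    accessory = pan - core                  # genes missing from at least one strain
    unique = {}
    for name, genes in strains.items():
        others = set().union(*(g for n, g in strains.items() if n != name))
        unique[name] = genes - others       # genes found only in this strain
    return {"pan": pan, "core": core, "accessory": accessory, "unique": unique}


# Hypothetical gene contents for three strains (illustration only).
strains = {
    "A": {"rrsA", "rpoB", "lacZ", "stx1"},
    "B": {"rrsA", "rpoB", "lacZ", "papG"},
    "C": {"rrsA", "rpoB", "hlyA"},
}
summary = pangenome_summary(strains)
print(sorted(summary["core"]), len(summary["pan"]))
```

In this toy case the core stays small while the pangenome grows with each added strain, which is the pattern the slide points at.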
5µm Jed Fuhrman
Pangenome: the world of Jean-Baptiste Lamarck (1744-1829)
Tree of Life? Tree of what?? What gene(s) to use??
• Core genes – rRNA, others – cellular line of descent
• Concatenated core genes – with care only
• Concatenated genomes – Ugh.
There is no such thing as a tree of organisms.
Note that no single gene or sequence is uniformly useful in phylogenetic analyses throughout the ToL.
Making Sense of Sequences: Molecular Phylogeny
1. Align sequences so that “homologous” residues are juxtaposed.
2. Count the number of differences between pairs of sequences -- this is some measure of “evolutionary distance” that separates the organisms.
3. Calculate the “tree”, the relatedness map, that most accurately represents all the pairwise differences.
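Steps 2 and 3 can be sketched in a few lines, assuming step 1 (alignment) is already done. The sequences below are hypothetical toy data, and UPGMA is used only because it is the simplest distance-based clustering to show. It assumes a constant rate of change and is not necessarily the method behind the trees in these slides. The names `p_distance` and `upgma` are chosen here for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions that differ, ignoring gapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(names, dist):
    """Cluster by average linkage; returns the topology as nested tuples.

    dist: dict keyed by frozenset({a, b}) -> pairwise distance.
    """
    sizes = {name: 1 for name in names}     # cluster -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        na, nb = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Distance to the new cluster is the size-weighted average.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Hypothetical pre-aligned toy sequences (illustration only).
seqs = {
    "bact1": "ACGTACGTAC",
    "bact2": "ACGTACGTAA",
    "arch1": "ACGAACTTGC",
    "euk1": "ACGAACTTGG",
}
dist = {
    frozenset((a, b)): p_distance(seqs[a], seqs[b])
    for a, b in combinations(seqs, 2)
}
tree = upgma(list(seqs), dist)
print(tree)
```

Note that UPGMA places a root by construction. That root is an artifact of the constant-rate assumption and says nothing about where the real root lies.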
Experimental tree, late 1990s
Baldauf et al., 2000
Problems in resolving Deep-branching topology: • Representation • Uncertainty
Cumulative Number of Sequences
[Figure: cumulative number of sequences by year, 1989-2007, plotted for Total, Bacteria, Eucarya and Archaea, with an inset for 1989-1998. Now 900,000.]
BUT - Next-Gen Problems!!
• Next-gen sequences are short – you may get a (low level) taxon call, but only ~70% of the time with environmental seqs, and you can’t do phylogeny with “unclassified” seqs.
• Pipelines, in dealing with pyro-babble, toss novel seqs that don’t fit the training set – and throw out the new stuff!
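[Figure: inferred sequence change* (y-axis, 0.0 to 1.2) plotted against observed sequence change (x-axis, 0.0 to 0.5), with the curve split into observed change and unseen change. Species-level, phylum-level and domain-level variation are marked at increasing amounts of change.]
* Inferred sequence change: $K_{nuc} = -\frac{3}{4}\ln\left(1 - \frac{4}{3}D\right)$, where $D$ is the observed sequence change.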
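The correction formula on this slide is easy to evaluate directly. The sketch below assumes D is the observed fraction of differing positions between two aligned sequences. The function name `inferred_change` is made up for illustration. The "unseen" change is the gap between the inferred and observed values.

```python
import math


def inferred_change(d):
    """Apply K = -3/4 * ln(1 - 4/3 * D) to an observed difference D.

    d: observed fraction of differing positions; must satisfy 0 <= d < 0.75,
    because the logarithm is undefined at and beyond 0.75.
    """
    if not 0 <= d < 0.75:
        raise ValueError("observed difference must be in [0, 0.75)")
    return -0.75 * math.log(1 - (4 / 3) * d)


for observed in (0.1, 0.3, 0.5):
    k = inferred_change(observed)
    unseen = k - observed   # change hidden by repeated hits at the same site
    print(f"observed={observed:.2f} inferred={k:.3f} unseen={unseen:.3f}")
```

The unseen portion grows quickly as the observed difference rises, which is why deep branches carry more uncertainty than shallow ones.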
Where is the “root” – LUCA? Woese 1987
• “Rooting” a tree requires an “outgroup” – not available with a universal tree.
• Solution (Dayhoff, 1970s): “paralogous rooting” -- use trees based on in-group paralogs. (Recall that “homologs” are of three kinds: “Orthologs”, “Paralogs” and “Xenologs.”)
Paralogs you can still recognize include:
• Elongation Factors Tu and G
• Membrane ATP Synthase α and β
• tRNAs metF and met
Each pair gives the three-domain (3-D) tree, and the members of each pair are homologs.
EF-Tu/EF-G alignment, residues 1-70 (rows: Tu, G)
Rooting the Big Tree
[Figure: paired EF-G and EF-Tu subtrees, each with Bacteria, Archaea and Eucarya branches]
Woese 1990
When was LUCA? >3.5 billion years ago
What was LUCA?
• Not a “genetic cell”. More likely a “state”: communal, interdependent, replicating foci.
• Early phylogenetic lines would have differentiated with acquisition of intermolecular specificity.
• Radiations at the base of the domains could occur only after development of the sophistication necessary for independent vertical lines of descent.
Paradox: How is it that chemiosmosis was in place before the biochemical/genetic membrane? Maybe the first membrane was abiological.
CLASH: The Big Tree vs. the Common Wisdom
• The eukaryote nuclear line of descent is not a late arrival; rather, it is as old as cellular life.
• The prokaryote-eukaryote model of evolution is wrong and needs to be banished from the lexicon of biology.
Where did the eukaryotic cell come from?
• Mitochondria and chloroplasts came from specific bacterial phyla, Proteobacteria and Cyanobacteria.
• But the nuclear line is primordial and older than cyanobacteria.
The modern kind of eucaryotic cell, complete with chloroplast (and probably mitochondrion), was in place by >3 billion years ago!
Models of Biological Organization and Evolution: Procaryote-Eucaryote (the textbook tale) vs. Three Domains
Procaryote/Eucaryote: The Test (Woese, 1977)
1. All eucaryotes are specifically related to one another. True
2. All procaryotes are related to the exclusion of eucaryotes. False
3. Procaryotes gave rise to (more advanced) eucaryotes. False
The End – Thank you!