phylogenetic tree

Phylogenetic tree Michael Schroeder Biotechnology Center TU - PowerPoint PPT Presentation



  1. Phylogenetic tree Michael Schroeder Biotechnology Center TU Dresden

  2. Phylogenetic trees • Motivation • Rooted and unrooted trees • Rooted trees: Hierarchical clustering • Drawing trees • Unrooted trees: Neighbour joining

  3. Origin of mitochondria in eukaryotes? Sequence comparison (BLAST) of 601 mitochondrial yeast genes to bacteria and archaea

  4. Origin of mitochondria in eukaryotes? Sequence comparison (BLAST) of 601 mitochondrial yeast genes to bacteria and archaea. [Figure panels: Bacteria, Archaea] Horiike et al., Nat Cell Biol, 2001. Adapted from Campbell and Heyer, Discovering genomics, proteomics, bioinformatics.

  5. Darwin's Tree of Life

  6. Tree of Life with 2.3 million species (opentreeoflife.org)

  7. Phylogeny • Taxonomists classify and group organisms • Aristotle, De Partibus Animalium: • … discuss each separate species – man, lion, ox, … • or … deal first with the attributes which they have in common …

  8. Schools of Taxonomists • Goal: create taxonomy • Approach: phenotype, phylogeny • 3 schools: (1) phenotype only; (2) evolutionary taxonomists: phenotype (+ phylogeny); (3) cladists: phylogeny (+ phenotype)

  9. West Nile virus in New York

  10. When did Homo sapiens leave Africa? • Recent-Africa hypothesis: hundred(s) of thousands of years ago • Multi-regional hypothesis: million(s) of years ago

  11. • 53 humans • Outgroup: chimpanzee

  12. Clustal W: over 50 000 citations

  13. Clustal W uses phylogenetic trees as guide trees for multiple sequence alignment (Thompson, NAR, 1994)

  14. Phylogenetic trees • Motivation • Rooted and unrooted trees • Rooted trees: Hierarchical clustering • Drawing trees • Unrooted trees: Neighbour joining

  15. Topixgallery.com

  16. Bifurcating Trees • Ancestral node (root) • Internal node (hypothetical ancestor) • Bifurcating = two descendants • Terminal node (leaf) • Edge or branch • The leaves A, B, C, D can be genes, proteins, populations, species, ...
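
The vocabulary on this slide maps directly onto a small data structure. The following is a minimal sketch, not part of the original deck: the `Node` class, the helper names, and the tree shape ((A,B),(C,D)) are assumptions chosen for illustration, since the shape of the slide's figure is not recoverable.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    """Node of a rooted bifurcating tree; a leaf has a label and no children."""
    label: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

def leaves(node: Node) -> List[str]:
    """Labels of the terminal nodes, left to right."""
    if node.is_leaf():
        return [node.label]
    return leaves(node.left) + leaves(node.right)

def count_internal(node: Node) -> int:
    """Number of internal nodes (hypothetical ancestors), root included."""
    if node.is_leaf():
        return 0
    return 1 + count_internal(node.left) + count_internal(node.right)

def count_edges(node: Node) -> int:
    """Every internal node of a bifurcating tree contributes two child edges."""
    return 2 * count_internal(node)

# Assumed example shape: ((A,B),(C,D))
tree = Node(left=Node(left=Node("A"), right=Node("B")),
            right=Node(left=Node("C"), right=Node("D")))

print(leaves(tree))          # ['A', 'B', 'C', 'D']
print(count_internal(tree))  # 3
print(count_edges(tree))     # 6
```

With m = 4 leaves this gives m-1 = 3 internal nodes and 2m-2 = 6 edges, as expected for a rooted bifurcating tree.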

  17. Unrooted and Rooted Trees: "The principal uses of these numbers will be ... to frighten taxonomists."

  18. Unrooted and Rooted Trees [Figure: unrooted and rooted trees over the three leaves A, B, C]

  19. Unrooted and Rooted Trees [Figure: unrooted and rooted trees over the four leaves A, B, C, D]

  20. Unrooted and Rooted Trees 8,200,794,532,637,891,559,375 rooted trees for 20 leaves (equivalently, unrooted trees for 21 leaves)! Felsenstein, 1978. To get a feeling: 8,200,794,532,637,891,559,375 ms is roughly 20 times the age of the universe. By Michael Schroeder, Biotec

  21. Unrooted and Rooted Trees
      • A rooted tree with m leaves has m-1 internal nodes and 2m-2 edges
      • An unrooted tree with m leaves has m-2 internal nodes and 2m-3 edges
      • Let T_unroot(m) be the number of unrooted trees with m leaves
      • Given an unrooted tree with m leaves, an extra leaf can be added to any of the 2m-3 edges to make a tree with m+1 leaves
      • T_unroot(m+1) = (2m-3) T_unroot(m), with T_unroot(3) = 1
      • This is satisfied by T_unroot(m) = (2m-5)!!
      • Double factorial = factorial leaving out every other number, e.g. 7!! = 7 * 5 * 3 * 1 = 105
      Felsenstein, 1978
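
The counting argument can be checked numerically. The sketch below uses illustrative function names that are not from the slides. The rooted count (2m-3)!! is an added assumption that the slide does not state: it follows because an unrooted tree can be rooted on any of its 2m-3 edges.

```python
def double_factorial(n: int) -> int:
    """n * (n-2) * (n-4) * ... down to 1 or 2; defined as 1 for n <= 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def count_unrooted(m: int) -> int:
    """Unrooted bifurcating trees with m >= 3 labelled leaves: (2m-5)!!."""
    return double_factorial(2 * m - 5)

def count_rooted(m: int) -> int:
    """Rooted bifurcating trees with m >= 2 labelled leaves: (2m-3)!! (assumption, see lead-in)."""
    return double_factorial(2 * m - 3)

def count_unrooted_by_recurrence(m: int) -> int:
    """Same count via T(k+1) = (2k-3) * T(k), starting from T(3) = 1."""
    count = 1
    for k in range(3, m):
        count *= 2 * k - 3
    return count

for m in (3, 4, 5, 10):
    assert count_unrooted(m) == count_unrooted_by_recurrence(m)

print(count_unrooted(4), count_rooted(4))  # 3 15
print(count_unrooted(20))                  # 221643095476699771875
print(count_rooted(20))                    # 8200794532637891559375
```

Python integers have arbitrary precision, so the 22-digit values are exact.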

  22. Consequence: Algorithms that generate all trees, score each one, and pick the best are infeasible because there are too many trees. Alternatives: hierarchical clustering and neighbour joining

  23. Phylogenetic trees • Motivation • Rooted and unrooted trees • Rooted trees: Hierarchical clustering • Drawing trees • Unrooted trees: Neighbour joining

  24. Hierarchical clustering
      • Input: Pairwise distances between sequences
      • Output: A tree of clusters of sequences

                 A    B    C    D    E
            A         2    6   10    9
            B              5    9    8
            C                   4    5
            D                        3
            E

      [Figure: tree over the leaves A, B, C, D, E]

  25. Hierarchical clustering

      Before:
                 A    B    C    D    E
            A         2    6   10    9
            B              5    9    8
            C                   4    5
            D                        3
            E

      After merging A and B:
                     (A,B)    C    D    E
            (A,B)             5    9    8
            C                      4    5
            D                           3
            E

      Tree so far: (A,B)

  26. Hierarchical clustering

      Before:
                     (A,B)    C    D    E
            (A,B)             5    9    8
            C                      4    5
            D                           3
            E

      After merging D and E:
                     (A,B)    C    (D,E)
            (A,B)             5      8
            C                        4
            (D,E)

      Trees so far: (A,B) and (D,E)

  27. Hierarchical clustering

      Before:
                     (A,B)    C    (D,E)
            (A,B)             5      8
            C                        4
            (D,E)

      After merging C and (D,E):
                         (A,B)    (C,(D,E))
            (A,B)                     5
            (C,(D,E))

      Trees so far: (A,B) and (C,(D,E))

  28. Hierarchical clustering

      Before:
                         (A,B)    (C,(D,E))
            (A,B)                     5
            (C,(D,E))

      After merging (A,B) and (C,(D,E)): a single cluster ((A,B),(C,(D,E)))

      Final tree: ((A,B),(C,(D,E)))

  29. Algorithm

      const m   number of original sequences
      var   U   a set of current trees, initially one tree for each original sequence
            D   the distance between the trees in U
      begin
        U = the set of one tree (each of one node) for each original sequence
        while |U| > 1 do
          (u, v) = the roots of two trees in U with the least distance in D
          make a new tree with root w and with u and v as children
          calculate the length of the edges (v, w) and (u, w)
          for each root x of the trees in U - {u, v} do
            D(x, w) = calculate the distance between x and the new node w
          end
          U = (U - {u, v}) ∪ {w}    (update U)
        end
      end
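
One way to turn this pseudocode into running code is sketched below. Several choices are assumptions rather than part of the slides: trees are nested tuples, the distance rule is passed in as a function (single linkage by default, which matches the numbers of the five-sequence walk-through), and edge lengths are not computed. The distance at which each merge happens is recorded instead.

```python
from itertools import combinations

def single_linkage(d_xu, d_xv):
    # One possible rule for D(x, w): the smaller of the two old distances.
    return min(d_xu, d_xv)

def cluster(labels, leaf_dist, linkage=single_linkage):
    """Greedy clustering loop; trees are nested tuples, leaves are strings."""
    d = {frozenset(pair): dist for pair, dist in leaf_dist.items()}  # D
    roots = list(labels)                                             # U
    merges = []                                  # (new root, merge distance)
    while len(roots) > 1:
        # (u, v) = the two roots with the least distance in D
        u, v = min(combinations(roots, 2), key=lambda pair: d[frozenset(pair)])
        w = (u, v)                               # new root with children u, v
        merges.append((w, d[frozenset((u, v))]))
        rest = [x for x in roots if x not in (u, v)]   # U - {u, v}
        for x in rest:
            d[frozenset((x, w))] = linkage(d[frozenset((x, u))],
                                           d[frozenset((x, v))])
        roots = rest + [w]                       # U = (U - {u, v}) union {w}
    return roots[0], merges

# Upper triangle of the five-sequence example matrix.
example = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10, ("A", "E"): 9,
    ("B", "C"): 5, ("B", "D"): 9, ("B", "E"): 8,
    ("C", "D"): 4, ("C", "E"): 5, ("D", "E"): 3,
}
tree, merges = cluster("ABCDE", example)
print(tree)                                # (('A', 'B'), ('C', ('D', 'E')))
print([height for _, height in merges])    # [2, 3, 4, 5]
```

Ties between equally close pairs are broken by iteration order here. That is an arbitrary choice, and a different tie-break can yield a different tree.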

  30. Hierarchical Clustering: distance to the new cluster w = (u,v)

      Example distances:
                 A    B    C
            A         1    2
            B              1
            C

      • Single linkage: D(x,w) = min { D(x,u), D(x,v) }
        Example: distance from (A,B) to C is 1
      • Complete linkage: D(x,w) = max { D(x,u), D(x,v) }
        Example: distance from (A,B) to C is 2
      • Average linkage (WPGMA, weighted pair group method with arithmetic mean): D(x,w) = ( D(x,u) + D(x,v) ) / 2
        Example: distance from (A,B) to C is 1.5
      • More general (UPGMA, unweighted pair group method using arithmetic mean): D(x,w) = ( m_u D(x,u) + m_v D(x,v) ) / (m_u + m_v), where m_u is the number of leaves in the subtree u

      Note: WPGMA is called "weighted" because u and v may contain different numbers of leaves, so a plain average of the two distances weights the individual leaves unequally.

      Question: What's the difference between UPGMA and WPGMA?
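
The four update rules can be written as small functions and checked against the three-leaf example. This is a sketch with illustrative function names. The cluster sizes default to 1, and the last two lines use assumed sizes of 1000 and 1 to show how far the two averages can drift apart.

```python
def single_linkage(d_xu, d_xv, m_u=1, m_v=1):
    return min(d_xu, d_xv)

def complete_linkage(d_xu, d_xv, m_u=1, m_v=1):
    return max(d_xu, d_xv)

def wpgma(d_xu, d_xv, m_u=1, m_v=1):
    # Plain average; the cluster sizes are ignored.
    return (d_xu + d_xv) / 2

def upgma(d_xu, d_xv, m_u=1, m_v=1):
    # Average weighted by the number of leaves in u and v.
    return (m_u * d_xu + m_v * d_xv) / (m_u + m_v)

# Example: w = (A, B), x = C, with D(C, A) = 2 and D(C, B) = 1.
d_CA, d_CB = 2, 1
print(single_linkage(d_CA, d_CB))    # 1
print(complete_linkage(d_CA, d_CB))  # 2
print(wpgma(d_CA, d_CB))             # 1.5
print(upgma(d_CA, d_CB))             # 1.5

# With unequal cluster sizes (assumed: 1000 leaves vs. 1 leaf) the averages differ.
print(wpgma(2, 98, 1000, 1))            # 50.0
print(round(upgma(2, 98, 1000, 1), 2))  # 2.1
```

When both clusters have the same size, WPGMA and UPGMA agree. They diverge only when the sizes differ.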

  31. Are WPGMA and UPGMA the same?
      • Subtree D has 1000 leaves (m_D = 1000)
      • Subtree E has 1 leaf (m_E = 1)
      • Distance from (D,E) to F is
        • (2 + 98) / 2 = 50 for WPGMA
        • (1000*2 + 1*98) / (1000 + 1) ≈ 2.10 for UPGMA

                 D    E    F
            D         1    2
            E             98
            F

  32. UPGMA (unweighted pair group method using arithmetic mean)

      Input:
                 A    B    C    D    E
            A         3    7    8   10
            B              6    8    7
            C                   4    5
            D                        6
            E

      After merging A and B:
                     (A,B)    C    D    E
            (A,B)           6.5    8  8.5
            C                      4    5
            D                           6
            E

      After merging C and D:
                     (A,B)   (C,D)     E
            (A,B)             7.25   8.5
            (C,D)                    5.5
            E

      After merging (C,D) and E:
                         (A,B)   ((C,D),E)
            (A,B)                   7.67
            ((C,D),E)

      UPGMA: (2*7.25 + 1*8.5) / 3 = 7.67
      WPGMA: (7.25 + 8.5) / 2 = 7.875
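
The UPGMA numbers of this example can be reproduced with a short script. It is a sketch under assumptions: nested tuples stand in for trees, ties are broken by iteration order, and the names `upgma` and `leaves` are illustrative. The script also checks a known property of UPGMA, namely that the distance between two clusters equals the plain mean over all leaf pairs across them.

```python
from itertools import combinations, product

leaf_dist = {frozenset(pair): dist for pair, dist in {
    ("A", "B"): 3, ("A", "C"): 7, ("A", "D"): 8, ("A", "E"): 10,
    ("B", "C"): 6, ("B", "D"): 8, ("B", "E"): 7,
    ("C", "D"): 4, ("C", "E"): 5, ("D", "E"): 6,
}.items()}

def leaves(tree):
    """Leaf labels below a tree given as nested tuples of strings."""
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])

def upgma(labels):
    d = dict(leaf_dist)
    roots = list(labels)
    heights = []                      # distance at which each merge happens
    while len(roots) > 1:
        u, v = min(combinations(roots, 2), key=lambda pair: d[frozenset(pair)])
        heights.append(d[frozenset((u, v))])
        m_u, m_v = len(leaves(u)), len(leaves(v))
        rest = [x for x in roots if x not in (u, v)]
        w = (u, v)
        for x in rest:
            d_xu, d_xv = d[frozenset((x, u))], d[frozenset((x, v))]
            d[frozenset((x, w))] = (m_u * d_xu + m_v * d_xv) / (m_u + m_v)
        roots = rest + [w]
    return roots[0], heights

tree, heights = upgma("ABCDE")
print([round(h, 2) for h in heights])   # [3, 4, 5.5, 7.67]

# Mean over all leaf pairs between {A, B} and {C, D, E}.
pairs = list(product("AB", "CDE"))
mean = sum(leaf_dist[frozenset(p)] for p in pairs) / len(pairs)
print(round(mean, 2))                   # 7.67
```

The last merge distance, 23/3 ≈ 7.67, equals the mean of the six leaf-to-leaf distances (46/6). The plain WPGMA average of 7.25 and 8.5 would give 7.875 instead.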

  33. Does linkage method change trees?

                 A    B    C    D
            A         1    2    5
            B              4    5
            C                   3
            D

      [Figure: two trees over the leaves A, B, C, D]
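
Running single and complete linkage on this matrix answers the question. The sketch below is illustrative rather than taken from the slides: `cluster` takes Python's built-in `min` or `max` as the update rule, and `newick` is a hypothetical helper that prints a tree with sorted children so that topologies can be compared as strings.

```python
from itertools import combinations

def cluster(labels, leaf_dist, pick):
    """pick=min gives single linkage, pick=max gives complete linkage."""
    d = {frozenset(pair): dist for pair, dist in leaf_dist.items()}
    roots = list(labels)
    while len(roots) > 1:
        u, v = min(combinations(roots, 2), key=lambda pair: d[frozenset(pair)])
        rest = [x for x in roots if x not in (u, v)]
        w = (u, v)
        for x in rest:
            d[frozenset((x, w))] = pick(d[frozenset((x, u))], d[frozenset((x, v))])
        roots = rest + [w]
    return roots[0]

def newick(tree):
    """Bracket notation with sorted children, independent of merge order."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(sorted(newick(child) for child in tree)) + ")"

dist = {
    ("A", "B"): 1, ("A", "C"): 2, ("A", "D"): 5,
    ("B", "C"): 4, ("B", "D"): 5, ("C", "D"): 3,
}
print(newick(cluster("ABCD", dist, min)))  # (((A,B),C),D)
print(newick(cluster("ABCD", dist, max)))  # ((A,B),(C,D))
```

Single linkage attaches C to (A,B) because C is close to A. Complete linkage looks at the larger distance to B and pairs C with D instead, so the two methods produce different trees.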

  34. Summary • Applications of phylogenetic trees • Clustal W • Hierarchical clustering • Linkage methods
