Assessing Phylogenetic Hypotheses and Phylogenetic Data

Assessing Phylogenetic Hypotheses and Phylogenetic Data
• We use numerical phylogenetic methods because most data include potentially misleading evidence of relationships
• We should not be content with constructing phylogenetic hypotheses but should also assess what ‘confidence’ we can place in our hypotheses
• This is not always simple! (but do not despair!)

Assessing Data Quality
• We expect (or hope) our data will be well structured and contain strong phylogenetic signal
• We can test this using randomization tests of explicit null hypotheses
• The behaviour, or some measure of the quality, of our real data is contrasted with that of comparable but phylogenetically uninformative data produced by randomization of the data

Random Permutation
Random permutation reduces any correlation among characters to that expected by chance alone. It preserves the number of taxa, characters and character states in each character (and the theoretical maximum and minimum tree lengths).

Original structured data, with strong correlations among characters:

‘TAXA’   ‘CHARACTERS’
         1 2 3 4 5 6 7 8
R-P      R P R P R P R P
A-E      A E A E A E A E
N-R      N R N R N R N R
D-M      D M D M D M D M
O-U      O U O U O U O U
M-T      M T M T M T M T
L-E      L E L E L E L E
Y-D      Y D Y D Y D Y D

Randomly permuted data, with any correlation among characters due to chance:

‘TAXA’   ‘CHARACTERS’
         1 2 3 4 5 6 7 8
R-P      N U D E R T O U
A-E      R E A P L E A D
N-R      M R M M A D N P
D-M      L T R E Y M D R
O-U      D E Y U D E Y M
M-T      O M O T O U L T
L-E      Y D N D M P M E
Y-D      A P L R N R R E
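The permutation shown on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the matrix is stored as a list of taxon rows and that states are shuffled among taxa independently within each character; the function name `permute_matrix` is hypothetical and is not taken from PAUP or any other package.

```python
import random

def permute_matrix(matrix, rng):
    """Shuffle states among taxa independently within each character (column)."""
    columns = [list(col) for col in zip(*matrix)]  # one list per character
    for col in columns:
        rng.shuffle(col)                           # same states, new taxa
    return [list(row) for row in zip(*columns)]    # back to one row per taxon

# The slide's example: taxon "R-P" has states R P R P R P R P, and so on.
taxa = ["R-P", "A-E", "N-R", "D-M", "O-U", "M-T", "L-E", "Y-D"]
original = [list(name.replace("-", "") * 4) for name in taxa]

permuted = permute_matrix(original, random.Random(1))
for name, row in zip(taxa, permuted):
    print(name, " ".join(row))
```

Because each column is only reordered, the number of taxa, the number of characters and the set of states in each character are unchanged, which is the property the slide relies on.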

Matrix Randomization Tests
• Compare some measure of data quality/hierarchical structure for the real and many randomly permuted data sets
• This allows us to define a test statistic for the null hypothesis that the real data are no better structured than randomly permuted and phylogenetically uninformative data
• A permutation tail probability (PTP) is the proportion of data sets with as good a measure of quality as, or a better measure than, the real data
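A PTP calculation might look like the sketch below. Several things here are assumptions rather than part of the slides: the real data set is counted as one of the data sets (so 99 permutations give a smallest possible PTP of 0.01), lower values of the measure are treated as better, and `distinct_characters` is a deliberately crude stand-in for a real measure such as the length of the most parsimonious tree. All function names are hypothetical.

```python
import random

def permute_matrix(matrix, rng):
    """Shuffle states among taxa independently within each character."""
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]

def ptp(matrix, measure, n_perm=99, seed=0):
    """Proportion of data sets whose measure is as good as (<=) the real one."""
    rng = random.Random(seed)
    real = measure(matrix)
    as_good = 1  # assumption: the real data set is included in the count
    for _ in range(n_perm):
        if measure(permute_matrix(matrix, rng)) <= real:
            as_good += 1
    return as_good / (n_perm + 1)

def distinct_characters(matrix):
    """Toy measure (not tree length): number of distinct character columns."""
    return len(set(zip(*matrix)))

taxa = ["R-P", "A-E", "N-R", "D-M", "O-U", "M-T", "L-E", "Y-D"]
matrix = [list(name.replace("-", "") * 4) for name in taxa]
print(ptp(matrix, distinct_characters))
```

In a real analysis the measure would be computed by a phylogenetics program for each permuted matrix; only the counting step is illustrated here.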

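Structure of Randomization Tests
• Reject the null hypothesis if, for example, fewer than 5% of random permutations have as good a measure as, or a better measure than, the real data
[Figure: frequency distribution of the measure of data quality (e.g. tree length, ML, pairwise incompatibilities) for the permuted data, on an axis running from GOOD to BAD with a 95% cutoff. Real data on the GOOD side of the cutoff PASS TEST (reject null hypothesis); otherwise FAIL TEST.]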

Matrix Randomization Tests
• Measures of data quality include:
1. Tree length for most parsimonious trees - the shorter the tree length the better the data (PAUP*)
2. Numbers of pairwise incompatibilities between characters (pairs of incongruent characters) - the fewer character conflicts the better the data
3. Skewness of the distribution of tree lengths (PAUP)
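The second measure, pairwise incompatibility, is easy to illustrate for the simplest case. The sketch below assumes binary characters with no missing data, where two characters are incompatible when all four state combinations (00, 01, 10, 11) occur among the taxa; multistate characters need a more involved test that is not shown. The function names are hypothetical.

```python
from itertools import combinations

def incompatible(char_a, char_b):
    """Binary characters conflict if all four state pairs occur among taxa."""
    return len(set(zip(char_a, char_b))) == 4

def count_incompatibilities(matrix):
    """Number of incompatible character pairs; fewer means better data."""
    characters = list(zip(*matrix))  # one tuple of states per character
    return sum(incompatible(a, b) for a, b in combinations(characters, 2))

# Rows are taxa, columns are three binary characters (made-up example data).
example = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
print(count_incompatibilities(example))
```

Here characters 1 and 2 conflict, as do characters 2 and 3, while characters 1 and 3 are compatible. A count like this can be used as the measure in a randomization test without building any trees.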

Matrix Randomization Tests
Ciliate SSUrDNA (Min = 430, Max = 927)

          Real data                   Randomly permuted
MPTs      1                           3
L         618                         792
CI        0.696                       0.543
RI        0.714                       0.272
PTP       0.01                        0.68
PC-PTP    0.001                       0.737
Result    Significantly non random    Not significantly different from random

[Figure: the single MPT for the real data and the strict consensus of the 3 MPTs for the randomly permuted data. Taxa in displayed order, real data: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Gruberia, Euplotes, Tetrahymena. Randomly permuted: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Tracheloraphis, Spirostomum, Euplotes, Gruberia.]
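The CI values on this slide can be reproduced from Min and L. The sketch below assumes that Min is the minimum possible tree length for the data set and uses the standard ensemble consistency index, CI = Min / L; the function name is hypothetical.

```python
def consistency_index(min_length, observed_length):
    """Ensemble consistency index: minimum possible length / observed length."""
    return min_length / observed_length

MIN_LENGTH = 430  # "Min" on the slide

for label, length in [("real data", 618), ("randomly permuted", 792)]:
    print(label, round(consistency_index(MIN_LENGTH, length), 3))
```

Permutation leaves Min unchanged, so the drop in CI comes entirely from the longer trees required by the permuted data.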

Skewness of Tree Length Distributions
• Studies with random (and phylogenetically uninformative) data showed that the distribution of tree lengths tends to be normal
• In contrast, phylogenetically informative data are expected to have a strongly skewed distribution with few shortest trees and few trees nearly as short
[Figures: two histograms of NUMBER OF TREES against Tree length, each with the shortest tree marked.]

Skewness of Tree Length Distributions
• Skewness of tree length distributions can be used as a measure of data quality in randomization tests
• It is measured with the g1 statistic in PAUP
• Significance cut-offs for data sets of up to eight taxa have been published based on randomly generated data (rather than randomly permuted data)
• PAUP does not perform the more direct randomization test
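For readers who want to see what a skewness statistic of this kind measures, here is a minimal sketch using the usual moment-based definition, g1 = m3 / m2^(3/2), where m2 and m3 are the second and third central moments of the tree lengths. Whether PAUP uses exactly this form (for example, without a small-sample correction) is an assumption, and the tree lengths below are made up for illustration.

```python
def g1(tree_lengths):
    """Moment-based skewness of a list of tree lengths."""
    n = len(tree_lengths)
    mean = sum(tree_lengths) / n
    m2 = sum((x - mean) ** 2 for x in tree_lengths) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in tree_lengths) / n  # third central moment
    return m3 / m2 ** 1.5

# Symmetric distribution, as expected for uninformative data.
symmetric = [10, 11, 11, 12, 12, 12, 13, 13, 14]
# One much shorter tree and few trees nearly as short: a long left tail.
left_skewed = [8, 11, 12, 12, 13, 13, 13, 13, 13]

print(g1(symmetric), g1(left_skewed))
```

A symmetric distribution gives g1 of zero, while a distribution with a few unusually short trees gives a negative value, so more strongly negative g1 corresponds to the skewed pattern the slides associate with informative data.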

