Phylogenetic trees IV: Maximum Likelihood
Gerhard Jäger
Words, Bones, Genes, Tools
February 28, 2018
Theory
Recap: Continuous time Markov model

P(t) = \begin{pmatrix} s + re^{-t} & r - re^{-t} \\ s - se^{-t} & r + se^{-t} \end{pmatrix}, \qquad \pi = (s, r)

[Figure: example tree with branch lengths l_1, ..., l_8]
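As a quick numerical check, here is a minimal sketch (not from the slides) of the two-state transition matrix above; the function name transition_matrix and the values t = 0.7, s = 0.4, r = 0.6 are made up for illustration.

```python
import numpy as np

def transition_matrix(t, s, r):
    """P(t) for the two-state CTMC with equilibrium distribution pi = (s, r)."""
    e = np.exp(-t)
    return np.array([[s + r * e, r - r * e],
                     [s - s * e, r + s * e]])

P = transition_matrix(t=0.7, s=0.4, r=0.6)
print(P.sum(axis=1))             # each row sums to 1
print(np.array([0.4, 0.6]) @ P)  # pi @ P(t) == pi: the chain stays in equilibrium
```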
Likelihood of a tree
- background reading: Ewens and Grant (2005), 15.7
- simplifying assumption: evolution at different branches is independent
- suppose we know probability distributions v_t and v_b over the states at the top and the bottom of branch l_k
- L(l_k) = v_t^T P(l_k) v_b

[Figure: example tree with branch lengths l_1, ..., l_8]
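A minimal sketch of the branch formula L(l_k) = v_t^T P(l_k) v_b, reusing the transition_matrix helper from the sketch above; the distributions and the branch length are made-up illustration values.

```python
import numpy as np  # assumes transition_matrix() from the sketch above

v_top    = np.array([0.3, 0.7])  # distribution over states (0, 1) at the top of branch l_k
v_bottom = np.array([0.9, 0.1])  # distribution at the bottom of the branch
l_k = 0.5

L_branch = v_top @ transition_matrix(l_k, s=0.4, r=0.6) @ v_bottom  # v_t^T P(l_k) v_b
print(L_branch)
```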
Likelihood of a tree
- likelihoods of the states (0, 1) at the root are v_1^T P(l_1) · v_2^T P(l_2) (multiplied element by element)
- log-likelihoods: log(v_1^T P(l_1)) + log(v_2^T P(l_2))
- log-likelihood of a larger tree: recursively apply this method from tips to root

[Figure: two tips with state distributions v_1 and v_2, joined to the root by branches l_1 and l_2]
Likelihood of a tree

L(\text{mother})_i = \prod_{d \in \text{daughters}} \Big( \sum_{1 \le j \le n} P(t)_{i,j}\, L(d)_j \Big)
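A sketch of this recursion (pruning from the tips to the root) under assumptions not fixed by the slide: binary characters, a tree encoded as nested tuples where a leaf is ("name", branch_length) and an internal node is ((child, ...), branch_length), and the transition_matrix helper from the earlier sketch.

```python
import numpy as np  # assumes transition_matrix() from the earlier sketch

def conditional_likelihoods(node, tip_states, s, r):
    """Return (L, branch_length), where L[i] is the likelihood of the data below
    `node`, given state i at the top of `node`."""
    children, length = node
    if isinstance(children, str):                 # leaf: tip_states gives its observed state (0 or 1)
        L = np.zeros(2)
        L[tip_states[children]] = 1.0
        return L, length
    L = np.ones(2)
    for child in children:
        L_child, l_child = conditional_likelihoods(child, tip_states, s, r)
        L *= transition_matrix(l_child, s, r) @ L_child   # sum over the child's states
    return L, length

# three-taxon example with made-up branch lengths and states
cherry = ((("A", 0.2), ("B", 0.3)), 0.4)          # internal node above A and B
root   = ((cherry, ("C", 0.9)), 0.0)              # root has no branch above it
L_root, _ = conditional_likelihoods(root, {"A": 1, "B": 1, "C": 0}, s=0.4, r=0.6)
```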
(Log-)Likelihood of a tree
- overall likelihood of the entire tree depends on the probability distribution at the root
- if we assume that the root node is in equilibrium: L(tree) = (s, r)^T L(root)
- does not depend on the location of the root (→ time reversibility)
- this is essentially identical to the Sankoff algorithm for parsimony: weight(i, j) = log P(l_k)_{ij}
- weight matrix depends on branch length → needs to be recomputed for each branch
- this is for one character; the likelihood for all data is the product of the likelihoods for each character
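Continuing the sketch above: the tree likelihood weights the root vector by the equilibrium distribution (s, r), and per-character log-likelihoods are added. The data format (a list of dictionaries mapping tip names to 0/1) is an assumption for illustration.

```python
import numpy as np  # assumes conditional_likelihoods() from the sketch above

def tree_log_likelihood(root, characters, s, r):
    """Sum of log L(tree) over characters, with L(tree) = (s, r)^T L(root)."""
    pi = np.array([s, r])
    total = 0.0
    for tip_states in characters:         # one dict {tip name: 0/1} per character
        L_root, _ = conditional_likelihoods(root, tip_states, s, r)
        total += np.log(pi @ L_root)
    return total
```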
(Log-)Likelihood of a tree
- likelihood of a tree depends on:
  - branch lengths
  - rates for each character
- likelihood for a tree topology:

L(\text{topology}) = \max_{l_k \,:\, k \text{ is a branch}} L(\text{tree} \mid \vec{l})
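A sketch of this maximization over branch lengths for a fixed topology, using a generic numerical optimizer; the wrapper that turns a vector of branch lengths into a tree log-likelihood (here simply called log_likelihood) is assumed glue code, not something specified on the slide.

```python
import numpy as np
from scipy.optimize import minimize

def topology_log_likelihood(log_likelihood, n_branches):
    """Return max over branch lengths of log L(tree | l); lengths are kept positive
    by optimizing their logarithms."""
    neg_ll = lambda x: -log_likelihood(np.exp(x))
    result = minimize(neg_ll, x0=np.zeros(n_branches), method="L-BFGS-B")
    return -result.fun, np.exp(result.x)   # best log-likelihood and branch lengths
```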
(Log-)Likelihood of a tree
Where do we get the rates from? Different options, in increasing order of complexity:
1. s = r = 0.5 for all characters
2. r = empirical relative frequency of state 1 in the data (identical for all characters)
3. a certain proportion p_inv (value to be estimated) of characters are invariant
4. rates are gamma distributed
Gamma-distributed rates
- we want to allow rates to vary, but not too much
- common method (no real justification except for mathematical convenience):
- rate matrix is multiplied with a coefficient λ_i for character i
- equilibrium distribution is identical for all characters
- λ_i is a random variable drawn from a Gamma distribution:

L(\lambda_i = x) = \frac{\beta^\beta x^{\beta - 1} e^{-\beta x}}{\Gamma(\beta)}
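A minimal check (not from the slides) that the density above is the Gamma distribution with shape β and rate β, which has mean 1; SciPy parameterizes the Gamma by scale = 1/rate.

```python
import numpy as np
from scipy.stats import gamma
from scipy.special import gamma as gamma_fn

beta, x = 2.0, 1.3
manual = beta**beta * x**(beta - 1) * np.exp(-beta * x) / gamma_fn(beta)
print(np.isclose(manual, gamma(a=beta, scale=1 / beta).pdf(x)))  # True
print(gamma(a=beta, scale=1 / beta).mean())                      # 1.0: average rate is unchanged
```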
Gamma-distributed rates
- overall likelihood of a tree topology: integrate over all λ_i, weighted by the Gamma likelihood
- computationally impractical
- in practice: split the Gamma distribution into n discrete bins (usually n = 4) and approximate the integration via a Hidden Markov Model
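A sketch of the discrete approximation: split the Gamma into n equal-probability bins and represent each bin by a single rate. Using the bin medians is one common choice; the slide does not fix the exact discretization, so that part is an assumption.

```python
import numpy as np
from scipy.stats import gamma

def discrete_gamma_rates(beta, n=4):
    """Median rate of each of n equal-probability bins of a Gamma(beta, beta) distribution."""
    bin_medians = (np.arange(n) + 0.5) / n       # 1/8, 3/8, 5/8, 7/8 for n = 4
    return gamma(a=beta, scale=1 / beta).ppf(bin_medians)

print(discrete_gamma_rates(beta=0.5))
# per-character likelihoods are then averaged over these rate categories
```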
Modeling decisions to make

aspect of model             possible choices     number of parameters to estimate
branch lengths              unconstrained        2n − 3 (n is the number of taxa)
                            ultrametric          n − 1
equilibrium probabilities   uniform              0
                            empirical            0
                            ML estimate          1
rate variation              none                 0
                            Gamma distributed    1
invariant characters        none                 0
                            p_inv                1

This could be continued: you can build in rate variation across branches, you can fit the number of Gamma categories, ...
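The parameter counts in the table can be summarized in a small helper (hypothetical, and counting empirical equilibrium probabilities as 0 free parameters, as in the table):

```python
def n_free_parameters(n_taxa, branch_lengths, eq_probs, rate_variation, invariant_chars):
    """Number of parameters to estimate for one model specification from the table."""
    count  = {"unconstrained": 2 * n_taxa - 3, "ultrametric": n_taxa - 1}[branch_lengths]
    count += {"uniform": 0, "empirical": 0, "ML estimate": 1}[eq_probs]
    count += {"none": 0, "Gamma": 1}[rate_variation]
    count += {"none": 0, "p_inv": 1}[invariant_chars]
    return count

# richest model for the 25-taxon running example: 47 + 1 + 1 + 1 = 50 parameters
print(n_free_parameters(25, "unconstrained", "ML estimate", "Gamma", "p_inv"))
```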
Model selection tradeoff
- rich models are better at detecting patterns in the data, but are prone to over-fitting
- parsimonious models are less vulnerable to overfitting, but may miss important information
- standard issue in statistical inference
- one possible heuristic: Akaike Information Criterion (AIC)

AIC = −2 × log-likelihood + 2 × number of free parameters

- the model minimizing AIC is to be preferred
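The criterion itself is a one-liner; the two candidate models and their log-likelihoods below are invented purely to show the comparison.

```python
def aic(log_likelihood, n_free_parameters):
    return -2 * log_likelihood + 2 * n_free_parameters

candidates = {                      # made-up numbers, just to illustrate the selection step
    "ultrametric, empirical, no rate variation": (-7980.0, 24),
    "unconstrained, ML, Gamma + p_inv":          (-7910.0, 50),
}
best = min(candidates, key=lambda name: aic(*candidates[name]))
print(best)
```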
Example: Model selection for cognacy data

[Table: AIC values for all 24 model combinations (branch lengths: ultrametric vs. unconstrained; equilibrium probabilities: uniform, empirical, ML estimate; rate variation: none vs. Gamma; invariant characters: none vs. p_inv), evaluated on the UPGMA tree. AIC values range from 15981.94 to 17519.75; the models with uniform equilibrium probabilities fare clearly worse than those with empirical or ML-estimated probabilities.]
Tree search
- ML computation gives us the likelihood of a tree topology, given data and a model
- ML tree:
  - heuristic search to find the topology maximizing the likelihood
  - optimize branch lengths to maximize the likelihood for that topology
- computationally very demanding!
- for the 25 taxa in our running example, ML tree search for the full model requires several hours on a single processor; parallelization helps
- ideally, one would want to do 24 heuristic tree searches, one for each model specification, and pick the tree + model with the lowest AIC
- in practice one has to make compromises
Running example
Running example: cognacy data
- ultrametric: AIC = 7972
- unconstrained branch lengths: AIC = 7929

[Figures: ML trees for the cognacy data under the ultrametric and the unconstrained model, with the 25 Indo-European languages of the running example at the tips]
Running example: WALS data
- ultrametric: AIC = 2828
- unconstrained branch lengths: AIC = 2752

[Figures: ML trees for the WALS data under the ultrametric and the unconstrained model, with the 25 Indo-European languages of the running example at the tips]
Running example: phonetic data
- ultrametric: AIC = 90575
- unconstrained branch lengths: AIC = 89871

[Figures: ML trees for the phonetic data under the ultrametric and the unconstrained model, with the 25 Indo-European languages of the running example at the tips]
Wrapping up
- ML is conceptually superior to MP (let alone distance methods):
  - different mutation rates for different characters are inferred from the data
  - the possibility of multiple mutations is taken into account, depending on branch lengths
  - side effect of the likelihood computation: the probability distribution over character states at each internal node can be read off
- disadvantages:
  - computationally demanding
  - many parameter settings make model selection difficult
  - the ultrametric constraint makes branch-length optimization even more demanding (note that the ultrametric trees in our example are sometimes better even though they have higher AIC)
- computationally more expensive ⇒ not feasible for larger data sets
Ewens, W. and G. Grant (2005). Statistical Methods in Bioinformatics: An Introduction. Springer, New York.