Change-point Detection on a Tree to Study Evolutionary Adaptation - PowerPoint PPT Presentation

  1. Change-point Detection on a Tree to Study Evolutionary Adaptation from Present-day Species. Cécile Ané (1, 2), Paul Bastide (3, 4), Mahendra Mariadassou (4), Stéphane Robin (3). (1) Department of Statistics, University of Wisconsin–Madison, WI, 53706, USA; (2) Department of Botany, University of Wisconsin–Madison, WI, 53706, USA; (3) UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, 75005, Paris, France; (4) MaIAGE, INRA, Université Paris-Saclay, 78352 Jouy-en-Josas, France. 19 April 2016

  2. Stochastic Processes on Trees | Identifiability Problems and Counting Issues | Statistical Inference | Turtles Data Set. Introduction. [Figure: turtles phylogenetic tree with habitats (Jaffe et al., 2011); labelled tips include Dermochelys coriacea and Homopus areolatus; time axis running from 200 to 0.] How can we explain the diversity while accounting for the phylogenetic correlations? Modelling: a shifted stochastic process on the phylogeny. CA, PB, MM, SR, Change-point Detection on a Tree, 2/19

  3. Outline: 1. Stochastic Processes on Trees; 2. Identifiability Problems and Counting Issues; 3. Statistical Inference; 4. Turtles Data Set.

  4. Stochastic Processes on Trees: Principle of the Modeling | Shifts | Equivalency OU/BM. Stochastic Process on a Tree (Felsenstein, 1985). Only tip values are observed. Brownian Motion: $\mathrm{Var}[A \mid R] = \sigma^2 t$ and $\mathrm{Cov}[A; B \mid R] = \sigma^2 t_{AB}$. [Figure: a tree with root R and nodes A to H, with the times $t$ and $t_{AB}$ marked on it, next to simulated phenotype trajectories; axes: time (0 to 800) and phenotype.]
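
The two Brownian-motion formulas on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-tip tree, its branch lengths and the value of sigma2 are hypothetical, and $t_{AB}$ is read as the time between the root and the most recent common ancestor of A and B.

```python
# Hypothetical ultrametric tree ((A:300,B:300)H:500,C:800)R, stored as
# child -> (parent, branch length). Names and lengths are illustrative only.
PARENT = {"H": ("R", 500.0), "A": ("H", 300.0), "B": ("H", 300.0), "C": ("R", 800.0)}

def path_to_root(node):
    """Return the nodes met from `node` up to the root, both included."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]][0])
    return path

def depth(node):
    """Time elapsed between the root and `node`."""
    return sum(PARENT[n][1] for n in path_to_root(node) if n in PARENT)

def shared_time(a, b):
    """Time between the root and the most recent common ancestor of a and b."""
    ancestors_of_a = set(path_to_root(a))
    mrca = next(n for n in path_to_root(b) if n in ancestors_of_a)
    return depth(mrca)

def bm_covariance(tips, sigma2):
    """Cov[i; j | R] = sigma2 * t_ij; on the diagonal t_ii is the root-to-tip time."""
    return [[sigma2 * shared_time(i, j) for j in tips] for i in tips]

cov = bm_covariance(["A", "B", "C"], sigma2=0.01)
```

In this toy tree A and B share 500 time units of history, so their covariance is sigma2 * 500, while C branches off at the root and is uncorrelated with both.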

  5. BM vs OU. BM: equation $dW(t) = \sigma\, dB(t)$; stationary state: none; variance $\sigma_{ij} = \sigma^2 t_{ij}$. OU: equation $dW(t) = \sigma\, dB(t) + \alpha [\beta(t) - W(t)]\, dt$; stationary state: $\mu = \beta_0$, $\gamma^2 = \sigma^2 / (2\alpha)$; variance $\sigma_{ij} = \gamma^2 e^{-\alpha (t_i + t_j)} (e^{2 \alpha t_{ij}} - 1)$. [Figure: one simulated trajectory $W(t)$ per process, phenotype against time (0 to 800); the OU panel is annotated with $\beta$, $t_{1/2} = \ln(2)/\alpha$ and $(1 - e^{-\alpha t})\beta$.]
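
To make the BM/OU comparison concrete, here is a minimal Python sketch of the two variance formulas and of the half-life $t_{1/2} = \ln(2)/\alpha$. The function names and the numerical values are illustrative assumptions, not taken from the talk. The sketch also checks two limits numerically: for very small alpha the OU covariance approaches the BM value $\sigma^2 t_{ij}$, and for a tip far from the root the OU variance approaches the stationary variance $\gamma^2$.

```python
import math

def half_life(alpha):
    """Half-life t_1/2 = ln(2) / alpha."""
    return math.log(2.0) / alpha

def bm_covariance(t_ij, sigma2):
    """BM: sigma_ij = sigma^2 * t_ij."""
    return sigma2 * t_ij

def ou_covariance(t_i, t_j, t_ij, sigma2, alpha):
    """OU: sigma_ij = gamma^2 * exp(-alpha (t_i + t_j)) * (exp(2 alpha t_ij) - 1)."""
    gamma2 = sigma2 / (2.0 * alpha)  # stationary variance gamma^2
    # expm1(x) = exp(x) - 1, computed accurately when x is tiny
    return gamma2 * math.exp(-alpha * (t_i + t_j)) * math.expm1(2.0 * alpha * t_ij)

# Illustrative values: two tips at time 800 that share 500 time units.
sigma2, t_i, t_j, t_ij = 0.01, 800.0, 800.0, 500.0
weak = ou_covariance(t_i, t_j, t_ij, sigma2, alpha=1e-9)    # nearly the BM covariance
strong = ou_covariance(t_i, t_i, t_i, sigma2, alpha=0.05)   # nearly gamma^2 = 0.1
```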

  6. Shifts. BM, shifts in the mean: $m_{\text{child}} = m_{\text{parent}} + \delta$. OU, shifts in the optimal value: $\beta_{\text{child}} = \beta_{\text{parent}} + \delta$. [Figure, shown in successive animation steps: a tree with root R and tips A to E, with shifts $\delta$ placed on branches, next to simulated BM and OU phenotype trajectories against time (0 to 800).]

  10. Linear Regression Model. [Figure: tree with internal nodes $Z_1$ (root, value $\mu$) to $Z_4$ and tips $Y_1$ to $Y_5$, carrying the shifts $\delta_1$, $\delta_2$, $\delta_3$.] With the nodes ordered $(Z_1, Z_2, Z_3, Z_4, Y_1, Y_2, Y_3, Y_4, Y_5)$: $\Delta = (\mu, \delta_1, 0, 0, \delta_2, 0, \delta_3, 0, 0)^\top$; $T = \begin{pmatrix} 1&0&0&1&1&0&0&0&0 \\ 1&0&0&1&0&1&0&0&0 \\ 1&1&0&0&0&0&1&0&0 \\ 1&1&1&0&0&0&0&1&0 \\ 1&1&1&0&0&0&0&0&1 \end{pmatrix}$ (rows $Y_1$ to $Y_5$); $T\Delta = (\mu + \delta_2,\ \mu,\ \mu + \delta_1 + \delta_3,\ \mu + \delta_1,\ \mu + \delta_1)^\top$. BM: $Y = T \Delta^{BM} + E^{BM}$.
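
The product $T\Delta$ on this slide can be reproduced with a short Python sketch. The topology below is read off the 0/1 matrix $T$ (each row marks a tip and its ancestors), and the ordering of $\Delta$ is inferred from the $T\Delta$ vector shown on the slide; the numerical values of $\mu$ and the shifts are made up for illustration.

```python
# Tree read off the slide's matrix T, stored as child -> parent.
NODES = ["Z1", "Z2", "Z3", "Z4", "Y1", "Y2", "Y3", "Y4", "Y5"]
PARENT = {"Z2": "Z1", "Z4": "Z1", "Z3": "Z2", "Y1": "Z4", "Y2": "Z4",
          "Y3": "Z2", "Y4": "Z3", "Y5": "Z3"}
TIPS = ["Y1", "Y2", "Y3", "Y4", "Y5"]

def ancestors_and_self(node):
    """Return the set made of `node` and all of its ancestors."""
    found = {node}
    while node in PARENT:
        node = PARENT[node]
        found.add(node)
    return found

def tree_matrix():
    """T[i][j] = 1 if node j is tip i or one of its ancestors, else 0."""
    return [[1 if n in ancestors_and_self(tip) else 0 for n in NODES] for tip in TIPS]

def tip_means(T, delta):
    """Plain matrix-vector product T @ delta."""
    return [sum(t * d for t, d in zip(row, delta)) for row in T]

mu, d1, d2, d3 = 10.0, 1.0, 2.0, 4.0  # illustrative numbers, not from the talk
delta = [mu, d1, 0.0, 0.0, d2, 0.0, d3, 0.0, 0.0]
T = tree_matrix()
means = tip_means(T, delta)  # (mu + d2, mu, mu + d1 + d3, mu + d1, mu + d1)
```

Writing the tip means as $T\Delta$ turns the search for shifts into a sparse linear regression problem: each nonzero entry of $\Delta$ other than the root value is a shift on the branch above the corresponding node.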

  11. Linear Regression Model. [Figure: the same tree as on the previous slide, with root value $\lambda$ and shifts $\delta_1$, $\delta_2$, $\delta_3$.] With the nodes ordered $(Z_1, Z_2, Z_3, Z_4, Y_1, Y_2, Y_3, Y_4, Y_5)$ and $w_i$ the $i$-th diagonal entry of $W(\alpha)$: $\Delta = (\lambda, \delta_1, 0, 0, \delta_2, 0, \delta_3, 0, 0)^\top$; $T W(\alpha) \Delta = (\lambda + w_5 \delta_2,\ \lambda,\ \lambda + w_2 \delta_1 + w_7 \delta_3,\ \lambda + w_2 \delta_1,\ \lambda + w_2 \delta_1)^\top$; $W(\alpha) = \mathrm{Diag}(1 - e^{-\alpha (h - t_{\mathrm{pa}(i)})},\ 1 \le i \le m + n)$; $\lambda = \mu e^{-\alpha h} + \beta_0 (1 - e^{-\alpha h})$. BM: $Y = T \Delta^{BM} + E^{BM}$. OU: $Y = T W(\alpha) \Delta^{OU} + E^{OU}$.
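
The OU version of the regression can be sketched the same way. Several things here are assumptions rather than statements from the slide: $h$ is taken to be the common root-to-tip height, $t_{\mathrm{pa}(i)}$ the time of the parent of node $i$ measured from the root, the node times are hypothetical, and the root entry of $W(\alpha)$ is set to 1 so that $\lambda$ enters undamped, as it does in the $TW(\alpha)\Delta$ vector on the slide.

```python
import math

NODES = ["Z1", "Z2", "Z3", "Z4", "Y1", "Y2", "Y3", "Y4", "Y5"]
PARENT = {"Z2": "Z1", "Z4": "Z1", "Z3": "Z2", "Y1": "Z4", "Y2": "Z4",
          "Y3": "Z2", "Y4": "Z3", "Y5": "Z3"}
# Hypothetical times of the internal nodes (distance from the root).
TIME = {"Z1": 0.0, "Z2": 4.0, "Z3": 7.0, "Z4": 6.0}
H = 10.0  # hypothetical tree height: every tip sits at time H

def ou_weights(alpha):
    """Diagonal of W(alpha): w_i = 1 - exp(-alpha * (H - t_pa(i)))."""
    weights = []
    for node in NODES:
        if node in PARENT:
            weights.append(1.0 - math.exp(-alpha * (H - TIME[PARENT[node]])))
        else:
            weights.append(1.0)  # assumption: the root entry (lambda) is not damped
    return weights

def root_term(mu, beta0, alpha):
    """lambda = mu * exp(-alpha H) + beta0 * (1 - exp(-alpha H))."""
    decay = math.exp(-alpha * H)
    return mu * decay + beta0 * (1.0 - decay)

def ou_tip_means(delta, alpha):
    """Entries of T W(alpha) delta: weighted shifts summed from each tip up to the root."""
    w = ou_weights(alpha)
    weighted = {node: w[k] * delta[k] for k, node in enumerate(NODES)}
    means = []
    for tip in NODES[4:]:
        total, node = weighted[tip], tip
        while node in PARENT:
            node = PARENT[node]
            total += weighted[node]
        means.append(total)
    return means

alpha = 0.5
lam = root_term(mu=1.0, beta0=3.0, alpha=alpha)
delta = [lam, 1.0, 0.0, 0.0, 2.0, 0.0, 4.0, 0.0, 0.0]  # (lambda, d1, 0, 0, d2, 0, d3, 0, 0)
w = ou_weights(alpha)
means = ou_tip_means(delta, alpha)
```

Under these assumptions a shift on an old branch is weighted close to 1 (the process has had time to reach its new optimum), while a shift on a recent branch is damped.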

  12. Identifiability Problems and Counting Issues: Identifiability Problems | Number of Parsimonious Solutions | Number of Models with K Shifts. Equivalencies. Number of shifts K fixed, several equivalent solutions. [Figure, shown in two animation steps: copies of one tree with different allocations of the shifts; labels: $\mu$, $\mu + \delta_1$, $\mu + \delta_2$ (node values) and $\delta_1$, $\delta_2$, $\delta_2 - \delta_1$, $\delta_1 - \delta_2$ (shifts).] Problem of over-parametrization: parsimonious configurations.
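
A small numerical check illustrates why several allocations are equivalent. The sketch below uses a hypothetical three-tip tree (not the one in the figure) and two allocations built from the slide's labels: $\delta_1$ followed by $\delta_2 - \delta_1$ along nested branches, versus $\delta_1$ and $\delta_2$ on two separate tip branches. Both use two shifts and give the same tip means, so the tips alone cannot tell them apart.

```python
# Hypothetical tree: root R with children Y1 and Z; Z has children Y2 and Y3.
PARENT = {"Y1": "R", "Z": "R", "Y2": "Z", "Y3": "Z"}
TIPS = ["Y1", "Y2", "Y3"]

def tip_means(mu, shifts):
    """Add to the root value mu every shift met on the path from each tip to the root."""
    means = []
    for tip in TIPS:
        total, node = mu, tip
        while node in PARENT:
            total += shifts.get(node, 0.0)  # shift on the branch ending at `node`
            node = PARENT[node]
        means.append(total)
    return means

mu, d1, d2 = 0.0, 1.0, 3.0           # illustrative numbers
config_a = {"Z": d1, "Y3": d2 - d1}  # delta_1 on the inner branch, delta_2 - delta_1 below it
config_b = {"Y2": d1, "Y3": d2}      # delta_1 and delta_2 on the two tip branches

means_a = tip_means(mu, config_a)
means_b = tip_means(mu, config_b)
```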

  14. Parsimonious Solution: Definition. Definition (Parsimonious Allocation): given a coloring of the tips, a parsimonious allocation of the shifts is one that uses the minimum number of shifts.
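
The minimum number of shifts for a given tip coloring can be computed with a standard Sankoff-style dynamic program with unit costs. The slide does not say how the authors compute it, so this is a sketch of one common technique rather than their method. It assumes that every shift changes the color of the subtree below it, so that counting shifts amounts to counting branches whose two ends have different colors. The tree is the five-tip example used earlier in the talk; the colors are made up.

```python
# Tree stored as parent -> list of children; tips are the nodes without children.
CHILDREN = {"Z1": ["Z4", "Z2"], "Z4": ["Y1", "Y2"], "Z2": ["Y3", "Z3"], "Z3": ["Y4", "Y5"]}

def min_shifts(tip_colors, root="Z1"):
    """Fewest branches whose two ends differ in color, over all internal colorings."""
    colors = sorted(set(tip_colors.values()))

    def cost(node):
        # cost(node)[c] = fewest shifts strictly below `node` when `node` has color c
        if node not in CHILDREN:
            return {c: (0 if c == tip_colors[node] else float("inf")) for c in colors}
        child_costs = [cost(child) for child in CHILDREN[node]]
        return {
            c: sum(min(cc[k] + (k != c) for k in colors) for cc in child_costs)
            for c in colors
        }

    return min(cost(root).values())

# Four distinct colors on five tips need at least three shifts.
four_colors = {"Y1": "a", "Y2": "b", "Y3": "c", "Y4": "d", "Y5": "d"}
two_colors = {"Y1": "a", "Y2": "a", "Y3": "b", "Y4": "b", "Y5": "b"}
```

An allocation is parsimonious for a coloring exactly when its number of shifts equals the value returned here; the root color is left free because the root value is itself a parameter.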
