Phylogenies - PowerPoint PPT Presentation


  1. Phylogenies

  2. Phylogenies describe history

  3. Phylogenies describe history Haeckel. 1879.

  4. Phylogenies describe history Pace. 1997. Science.

  5. Phylogenies are the result of branching processes

  6. Timeseries and phylogeny are dual outcomes of an infectious process

  7. Epidemic process [figure: time axis]

  8. Epidemic process [figure: count vs. time]

  9. Epidemic process: Can ask for the probability of observing this timeseries given epidemiological parameters β and γ. [figure: count vs. time]

  10. Epidemic process [figure: time axis]

  11. Epidemic process: Sample some individuals [figure: time axis]

  12. Epidemic branching process [figure: time axis]

  13. Epidemic branching process [figure: time axis]

  14. Epidemic branching process: Can ask for the probability of observing this tree given epidemiological parameters β and γ. [figure: time axis]

  15. The coalescent Assume equilibrium number of infecteds. Call this equilibrium N.

  16. The coalescent Sample some individuals

  17. The coalescent: Each generation, there is a small chance of coalescence for each pair: $\Pr(\text{coal} \mid i = 2) = \frac{1}{N}$

  18. The coalescent: Probability of coalescence scales quadratically with lineage count: $\Pr(\text{coal}) = \binom{i}{2} \frac{1}{N} = \frac{i(i-1)}{2N}$

  19. The coalescent

  20. The coalescent

  21. The coalescent

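  22. The coalescent: $T_i \sim \text{Exponential}\left(\frac{2N}{i(i-1)}\right)$, parameterized by its mean, where $T_i$ is the waiting time while $i$ lineages remain. [figure: tree with intervals $T_2$ and $T_3$ marked]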
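
The waiting-time distribution above can be turned into a small simulation. The following is a minimal sketch, assuming time is measured in generations and that each pair coalesces at rate 1/N, as on the preceding slides; the function names and the seed are illustrative and not part of the original deck.

```python
import random

def expected_waiting_time(i, n_pop):
    """Mean time (in generations) during which i lineages persist: 2N / (i (i - 1))."""
    return 2.0 * n_pop / (i * (i - 1))

def simulate_coalescent_intervals(n_samples, n_pop, rng):
    """Draw T_n, ..., T_2 for a sample of n_samples lineages.

    Returns a dict mapping lineage count i to the simulated waiting time T_i.
    """
    intervals = {}
    for i in range(n_samples, 1, -1):
        # i (i - 1) / 2 pairs, each coalescing at rate 1/N
        rate = i * (i - 1) / (2.0 * n_pop)
        intervals[i] = rng.expovariate(rate)
    return intervals

def expected_tmrca(n_samples, n_pop):
    """Sum of the mean waiting times; it telescopes to 2N (1 - 1/n)."""
    return sum(expected_waiting_time(i, n_pop) for i in range(2, n_samples + 1))

# Illustrative run: 10 sampled lineages in a population of N = 1000.
rng = random.Random(1)  # arbitrary seed, chosen only for reproducibility
intervals = simulate_coalescent_intervals(10, 1000, rng)
tree_height = sum(intervals.values())
```

Because every mean waiting time is proportional to N, doubling N doubles the expected height of the tree, which is the effect the population-size slides illustrate.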

  23. Demo

  24. Population size affects tree shape: The rate of coalescence is inversely proportional to the population size N, so expected coalescence times grow linearly with N. [figure: example trees for N = 500, 1000, 2000, 5000, 10000 and 20000]

  25. Changing population size Constant size Growing population

  26. Changing population size Constant size Growing population

  27. Given a phylogeny, how can we learn about the evolutionary process that underlies it? Generally, we want to know: $p(\text{model} \mid \text{data})$. Bayes' rule: $p(\text{model} \mid \text{data}) \propto p(\text{data} \mid \text{model})\, p(\text{model})$. Often referred to as: posterior ∝ likelihood × prior.

  28. Notation: λ is the coalescent model, µ the mutation model, D the sequence data and τ the phylogeny. In this case, we have: $p(\lambda \mid \tau) \propto p(\tau \mid \lambda)\, p(\lambda)$. However, we don't observe the tree directly: $p(\tau, \mu \mid D) \propto p(D \mid \tau, \mu)\, p(\tau)\, p(\mu)$. We integrate over uncertainty: $p(\lambda \mid D) \propto \int p(D \mid \tau, \mu)\, p(\tau \mid \lambda)\, p(\lambda)\, p(\mu)\, d\tau\, d\mu$

  29. BEAST: Bayesian Evolutionary Analysis by Sampling Trees

  30. Integration through Markov chain Monte Carlo [figure: axes $x_1$ and $x_2$]

  31. Integration through Markov chain Monte Carlo [figure: axes $x_1$ and $x_2$]

  32. Metropolis-Hastings algorithm: Starting from state θ, propose a new state θ*. For the following, this proposal must be symmetric, i.e. $Q(\theta \to \theta^*) = Q(\theta^* \to \theta)$. If the new state is more likely, always accept. If the new state is less likely, accept with probability equal to the ratio of the new state's probability to the old state's. Acceptance probability: $A(\theta \to \theta^*) = \min\left(1, \frac{p(\theta^*)}{p(\theta)}\right)$. Simple example: $p(x) = 0.2$ and $p(y) = 0.8$, so $A(x \to y) = \min(1, 0.8/0.2) = 1$ and $A(y \to x) = 0.2/0.8 = 0.25$. Mass moving from x to y: $p(x)\, A(x \to y) = 0.2 \times 1 = 0.2$. Mass moving from y to x: $p(y)\, A(y \to x) = 0.8 \times 0.25 = 0.2$.
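
The two-state example above can be checked numerically. The following is a minimal sketch, assuming the proposal always suggests the other state (which is symmetric, since both directions are proposed with probability 1); the function names, step count and seed are illustrative and not taken from the deck.

```python
import random

def acceptance(p_current, p_proposed):
    """Metropolis acceptance probability for a symmetric proposal."""
    return min(1.0, p_proposed / p_current)

def metropolis_two_state(p, n_steps, rng):
    """Sample from a two-state target by always proposing the other state.

    p maps each state name to its (possibly unnormalized) target probability.
    Returns the fraction of steps spent in each state.
    """
    states = list(p)
    current = states[0]
    counts = {s: 0 for s in states}
    for _ in range(n_steps):
        proposed = states[1] if current == states[0] else states[0]
        if rng.random() < acceptance(p[current], p[proposed]):
            current = proposed
        counts[current] += 1
    return {s: counts[s] / n_steps for s in states}

target = {"x": 0.2, "y": 0.8}

# Detailed balance: the probability mass flowing each way is the same.
flow_xy = target["x"] * acceptance(target["x"], target["y"])
flow_yx = target["y"] * acceptance(target["y"], target["x"])

# Long-run visit frequencies should approach the target probabilities.
freqs = metropolis_two_state(target, 100_000, random.Random(42))
```

Equal flows in both directions are what make the target distribution stationary for the chain, so the visit frequencies settle near 0.2 and 0.8.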

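  33. BEAST will produce samples from: λ (coalescent model), µ (mutation model) and τ (phylogeny)

  34. Use a ‘skyline’ demographic model [figure: tree divided into intervals with population sizes $N_1$, $N_2$, $N_3$, $N_4$]

  35. Use a ‘skyline’ demographic model [figure: tree divided into intervals with population sizes $N_1$, $N_2$, $N_3$, $N_4$]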

  36. Practical part 1

  37. Estimating $R_0$ from timeseries data: The initial growth rate is $r(0) = \beta - \gamma$, with r = 0.20 per day for 1918 influenza. We know the approximate recovery rate $\gamma \approx 0.25$. We can solve for β and hence $R_0$: $\beta = r + \gamma \approx 0.45$ and $R_0 = \frac{\beta}{\gamma} \approx \frac{0.45}{0.25} \approx 1.8$. [figure: individuals (log scale) vs. days]

  38. Growth rate of pandemic H1N1: r = 0.11 per day; $\beta = 0.11 + 0.33 = 0.44$ per day; $R_0 = 0.44 / 0.33 = 1.33$. [figure: laboratory confirmed cases (log scale), March to May]
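
The arithmetic on the two slides above follows one recipe: add the recovery rate to the observed growth rate to get β, then divide by the recovery rate. A minimal sketch, with a hypothetical function name and the slide values as inputs:

```python
def r0_from_growth_rate(r, gamma):
    """Return (beta, R0) from the early growth rate r and recovery rate gamma, both per day."""
    beta = r + gamma  # rearranged from r = beta - gamma
    return beta, beta / gamma

# 1918 influenza: r = 0.20 per day, gamma = 0.25 per day
beta_1918, r0_1918 = r0_from_growth_rate(0.20, 0.25)

# Pandemic H1N1: r = 0.11 per day, gamma = 0.33 per day
beta_h1n1, r0_h1n1 = r0_from_growth_rate(0.11, 0.33)
```

The estimate is only as good as the assumed recovery rate: the same growth rate implies a larger $R_0$ when infections last longer (smaller γ).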

  39. Generation time τ of infection: At the beginning of the epidemic, new infections emerge at rate β, so $\tau = \frac{1}{2 \beta S(0)} = \frac{1}{2 \times 0.36} = 1.39$. Final susceptible fraction: $S(\infty) = e^{-R_0 (1 - S(\infty))}$. At the end of the epidemic: $\tau = \frac{1}{2 \beta S(\infty)} = \frac{1}{2 \times 0.36 \times 0.84} = 1.65$. [figures: individuals (log scale) vs. days; τ vs. time]
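
The final susceptible fraction is defined only implicitly, so it has to be found numerically. The sketch below uses bisection, which is one reasonable choice rather than the method used for the slide. The slide does not state the $R_0$ behind $S(\infty) = 0.84$; the value 0.36 / 0.33 used here is an assumption that borrows γ = 0.33 from the H1N1 slide and happens to land near 0.84.

```python
import math

def final_susceptible_fraction(r0, tol=1e-12):
    """Solve S = exp(-R0 (1 - S)) for the root strictly between 0 and 1 by bisection."""
    if r0 <= 1.0:
        raise ValueError("a root below 1 exists only for R0 > 1")

    def g(s):
        return s - math.exp(-r0 * (1.0 - s))

    lo, hi = 0.0, 1.0 - 1e-6  # stop short of the trivial root S = 1
    if g(hi) <= 0.0:
        raise ValueError("R0 is too close to 1 for this bracket")
    # g(lo) < 0 and g(hi) > 0, so the root lies in between
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def generation_time(beta, s):
    """Generation time tau = 1 / (2 beta S) for susceptible fraction S."""
    return 1.0 / (2.0 * beta * s)

beta = 0.36                    # from the slide
assumed_r0 = beta / 0.33       # ASSUMPTION: gamma = 0.33 is not stated on this slide
s_inf = final_susceptible_fraction(assumed_r0)
tau_start = generation_time(beta, 1.0)
tau_end = generation_time(beta, s_inf)
```

As susceptibles are depleted, each infection takes longer to produce the next one, which is why τ is larger at the end of the epidemic than at the start.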

  40. Effective population sizes of flu vs measles. Influenza A (H3N2): $N_e \tau$ = 7.2 years; $N_e$ = 1050 infections (duration of infection of 5 days); N = 70 million infections (prevalence); off by a factor of 6,700. Measles: $N_e \tau$ = 124.6 years; $N_e$ = 8270 infections (duration of infection of 11 days); N = 0.9 million infections (prevalence); off by a factor of 110. [figures: estimates over time, 1970 to 2010 for influenza and 1950 to 2010 for measles]

  41. Practical part 2

  42. Continuous time Markov chains (CTMCs): Two states A and B, with rates $\mu_{AB} = 3$ (A to B) and $\mu_{BA} = 1$ (B to A). Equilibrium frequencies: q(A) = 0.25 and q(B) = 0.75, from $p_{t \to \infty}(A) = \frac{\mu_{BA}}{\mu_{AB} + \mu_{BA}}$ and $p_{t \to \infty}(B) = \frac{\mu_{AB}}{\mu_{AB} + \mu_{BA}}$. [figure: probability in state X vs. time]

  43. CTMCs on trees: Transition matrix with $\mu_{AB} = 3$, $\mu_{BA} = 1$, t = 0.2 (rows are the starting state): A to A 0.59, A to B 0.41, B to A 0.14, B to B 0.86. [figure: tree with tips B, A, A]
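
The transition matrix above can be reproduced from the rates using the standard closed-form solution for a two-state chain. This is a minimal sketch; the function name and the nested-dict layout are illustrative choices, not something shown in the deck.

```python
import math

def two_state_transition_matrix(mu_ab, mu_ba, t):
    """Transition probabilities P[start][end] after time t for a two-state CTMC.

    mu_ab is the rate of A -> B and mu_ba the rate of B -> A.
    """
    total = mu_ab + mu_ba
    eq_a = mu_ba / total          # equilibrium frequency of A
    eq_b = mu_ab / total          # equilibrium frequency of B
    decay = math.exp(-total * t)  # how much memory of the starting state remains
    return {
        "A": {"A": eq_a + eq_b * decay, "B": eq_b * (1.0 - decay)},
        "B": {"A": eq_a * (1.0 - decay), "B": eq_b + eq_a * decay},
    }

P = two_state_transition_matrix(3.0, 1.0, 0.2)
rounded = {start: {end: round(p, 2) for end, p in row.items()} for start, row in P.items()}
```

As t grows, the decay term vanishes and both rows converge to the equilibrium frequencies 0.25 and 0.75.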

  44. Integrate over internal states: Transition matrix with $\mu_{AB} = 3$, $\mu_{BA} = 1$, t = 0.2: A to A 0.59, A to B 0.41, B to A 0.14, B to B 0.86. [figure: four copies of the tree with tips B, A, A, one per assignment of states to the two internal nodes; branches are labeled with transition probabilities and roots with equilibrium frequencies 0.25 (A) or 0.75 (B)]

  45. Integrate over internal states: Same transition matrix and trees, now with the probability of each assignment: Pr = 0.0211 and Pr = 0.0073 (root frequency 0.25); Pr = 0.0036 and Pr = 0.0109 (root frequency 0.75).

  46. Integrate over internal states: $p(D \mid \tau, \mu) = 0.0211 + 0.0073 + 0.0036 + 0.0109 = 0.0429$. Relative contributions of the four internal-state assignments: 49%, 17%, 8%, 25%.
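
The sum over internal states can be written out directly. The sketch below assumes a topology inferred from the branch labels on the slides, not stated in the text: the root connects to one tip A and to an internal node, which in turn connects to tip B and the other tip A, with every branch of length 0.2. It uses unrounded transition probabilities, so its results differ slightly from the slide's figures, which appear to multiply the rounded two-decimal entries.

```python
import math
from itertools import product

MU_AB, MU_BA, BRANCH_LENGTH = 3.0, 1.0, 0.2
TOTAL = MU_AB + MU_BA
EQUILIBRIUM = {"A": MU_BA / TOTAL, "B": MU_AB / TOTAL}  # 0.25 and 0.75

def transition(src, dst, t=BRANCH_LENGTH):
    """Probability of ending in dst after time t, starting in src."""
    decay = math.exp(-TOTAL * t)
    if src == dst:
        return EQUILIBRIUM[dst] + (1.0 - EQUILIBRIUM[dst]) * decay
    return EQUILIBRIUM[dst] * (1.0 - decay)

def joint_probability(root, inner):
    """Probability of tips (B, A, A) together with one assignment of internal states.

    ASSUMED topology: root -> {inner node, tip A}; inner node -> {tip B, tip A}.
    """
    return (EQUILIBRIUM[root]
            * transition(root, "A")     # root to its tip A
            * transition(root, inner)   # root to the inner node
            * transition(inner, "B")    # inner node to tip B
            * transition(inner, "A"))   # inner node to tip A

# One term per (root state, inner state) assignment; their sum is p(D | tree, rates).
terms = {(root, inner): joint_probability(root, inner)
         for root, inner in product("AB", repeat=2)}
likelihood = sum(terms.values())
shares = {states: p / likelihood for states, p in terms.items()}
```

Enumerating assignments works for two internal nodes but grows exponentially with tree size; larger trees are normally handled with Felsenstein's pruning algorithm, which computes the same sum one node at a time.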

  47. Practical part 3
