Monophyletic concordance between species trees and gene genealogies


  1. Monophyletic concordance between species trees and gene genealogies with multiple mergers. Bjarki Eldon and James Degnan. Phylomania 2010, University of Tasmania, November 4-5, 2010.

  2. Low offspring number models. Kingman (1982) introduced the $n$-coalescent from an exchangeable Cannings offspring model. Let $\nu_i$ denote the number of offspring of individual $i$, with $E[\nu_1^k] < \infty$ as $N \to \infty$ for $k \ge 1$. Möhle and Sagitov (2001) characterised coalescent processes based on the timescale $c_N$:
     $$c_N = \frac{E[\nu_1(\nu_1 - 1)]}{N - 1}$$

  3. Conditions for convergence to Kingman's coalescent. Wright-Fisher and Moran models are exchangeable Cannings models with
     $$\lim_{N \to \infty} \frac{E[\nu_1(\nu_1 - 1)(\nu_1 - 2)]}{N^2 c_N} = 0,$$
     implying $c_N \to 0$ and convergence to Kingman's coalescent.
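As a quick check of the condition above, the sketch below evaluates $c_N$ and the third-moment ratio exactly for the Wright-Fisher model. It assumes the standard fact that $\nu_1$ is Binomial$(N, 1/N)$ in that model; the function name is illustrative and not from the talk.

```python
from fractions import Fraction


def wright_fisher_moments(N):
    """Exact c_N and third-moment ratio for a Wright-Fisher population of size N.

    Assumes nu_1 ~ Binomial(N, p) with p = 1/N, whose factorial moments are
    E[nu(nu-1)] = N(N-1)p^2 and E[nu(nu-1)(nu-2)] = N(N-1)(N-2)p^3.
    """
    p = Fraction(1, N)
    second = N * (N - 1) * p**2
    third = N * (N - 1) * (N - 2) * p**3
    c_N = second / (N - 1)              # timescale from the previous slide
    ratio = third / (N**2 * c_N)        # must vanish as N grows
    return c_N, ratio


for N in (10, 100, 1000):
    c_N, ratio = wright_fisher_moments(N)
    print(N, float(c_N), float(ratio))
```

Under this assumption $c_N = 1/N$ and the ratio equals $(N-1)(N-2)/N^3$, so both tend to zero.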

  4. High variance in offspring distribution. The ecology, reproductive biology, and genetics of a diverse group of marine organisms suggest that many offspring are contributed by few individuals (Beckenbach 1994; Hedgecock 1994). Direct genotyping of parents and offspring provides evidence of large families in Pacific oyster (Boudry et al. 2002) and lion-paw scallop (Petersen et al. 2008). Cod, oysters, mussels, barnacles, sea stars, plants?

  5. Evidence for large offspring distribution:
     - broadcast spawning and external fertilization
     - high initial mortality
     - very large population sizes
     - low genetic diversity
     - large number of singleton genetic variants

  6. Λ coalescent allows multiple mergers. Donnelly and Kurtz (1999), Pitman (1999), and Sagitov (1999) independently introduced a multiple merger coalescent, the Λ-coalescent, with coalescence rate
     $$\lambda_{b,k} = \binom{b}{k} \int_0^1 x^k (1 - x)^{b-k} x^{-2} \, \Lambda(dx)$$
     Kingman's coalescent is obtained if $\Lambda = \delta_0$. For simultaneous multiple merger coalescent processes, see Schweinsberg (2000) and Möhle and Sagitov (2001).

  7. Schweinsberg's heavy-tail model (Schweinsberg 2003). Each individual produces a random number $X_i$ of potential offspring; for $C > 0$ and $a > 0$ and constant population size $N$,
     $$P[X_i \ge k] \sim C / k^a \quad \text{and} \quad E[X_i] > 1$$
     From the pool of potential offspring, sample without replacement to form the new generation.
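One generation of this model can be sketched as follows. The specific law $P[X \ge k] = k^{-a}$ (that is, $C = 1$) and the function names are assumptions chosen for illustration; the talk only specifies the tail behaviour.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def potential_offspring(a):
    """Draw X with P[X >= k] = k**(-a) for integers k >= 1 (assumed law, C = 1)."""
    u = 1.0 - random.random()              # uniform on (0, 1], never zero
    return int(u ** (-1.0 / a))            # floor of a Pareto-type variable


def next_generation(N, a):
    """Return the family size of each of the N parents after one generation."""
    pool = [potential_offspring(a) for _ in range(N)]
    cum = list(accumulate(pool))           # cum[i] = potential offspring of parents 0..i
    total = cum[-1]                        # total >= N because every X >= 1
    chosen = random.sample(range(total), N)    # sampling without replacement
    sizes = [0] * N
    for idx in chosen:
        sizes[bisect_right(cum, idx)] += 1     # parent that owns offspring idx
    return sizes


random.seed(1)
print(sorted(next_generation(20, 1.5), reverse=True))
```

Sampling indices from `range(total)` avoids building the whole pool in memory, which matters because a heavy tail occasionally produces a very large $X_i$.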

  8. Coalescent process depends on $a$. Coalescent timescale in units of $c_N \sim N^{a-1}$ if $1 < a < 2$.

     | case | coalescent | coalescence rate |
     |---|---|---|
     | $a \ge 2$ | Kingman coalescent | $\binom{b}{2}$ |
     | $1 \le a < 2$ | $\Lambda \sim \mathrm{Beta}(2-a, a)$ | $\binom{b}{k} \frac{B(k-a,\, b-k+a)}{B(2-a,\, a)}$ |
     | $0 < a < 1$ | Ξ-coalescent | |
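The Beta-coalescent rate for $1 \le a < 2$ can be evaluated directly with the gamma function. This is a minimal sketch; the function names are illustrative, and `math.gamma` overflows for roughly $b > 170$, where log-gamma would be needed instead.

```python
from math import comb, gamma


def beta_fn(x, y):
    """Euler Beta function B(x, y) for x, y > 0."""
    return gamma(x) * gamma(y) / gamma(x + y)


def beta_coalescent_rate(b, k, a):
    """Total rate of k-mergers among b lines for Lambda = Beta(2 - a, a), 1 <= a < 2."""
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return comb(b, k) * beta_fn(k - a, b - k + a) / beta_fn(2 - a, a)


# Rates of 2-, 3-, 4- and 5-mergers among five lines for a = 1.5.
print([round(beta_coalescent_rate(5, k, 1.5), 4) for k in range(2, 6)])
```

For two lines the rate is 1 for every $a$, and as $a \to 2$ the pairwise rate approaches $\binom{b}{2}$ while larger mergers vanish, matching the Kingman case.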

  9. A modified Moran model (Eldon and Wakeley 2006), in which the offspring number $U$ is random rather than fixed at one as in the usual Moran model:
     $$P[U = u] = (1 - \varepsilon_N)\, \delta_2 + \varepsilon_N\, \delta_{[\psi N]} \quad \text{and} \quad \varepsilon_N \sim 1 / N^{\gamma}, \ \gamma > 0$$

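  10. Coalescent process depends on $\gamma$. Coalescent timescale is $N_\gamma = \min(N^\gamma, N^2)$, $\gamma > 0$.

      | case | coalescence rate | timescale |
      |---|---|---|
      | $\gamma > 2$ | $\binom{n}{2}$ | $N^2$ |
      | $\gamma = 2$ | $\binom{b}{k} \left( \delta_2 + \psi^k (1 - \psi)^{b-k} \right)$ | $N^2$ |
      | $\gamma < 2$ | $\binom{b}{k} \psi^k (1 - \psi)^{b-k}$ | $N^\gamma$, $1 < \gamma < 2$ |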

  11. Ratios of coalescence times for $\Lambda = K + \Lambda_\psi$. [Figure: ratios $R_1$ (◦), $R_2$ (△), $R_3$ (▽), $R_4$ (⋄), and $R_{n-1}$ (+) plotted against sample size $n$.]

  12. Ratios of coalescence times for $\Lambda = \mathrm{Beta}(0.9, 1.1)$. [Figure: ratios $R_1$ (◦), $R_2$ (△), $R_3$ (▽), $R_4$ (⋄), and $R_{n-1}$ (+) plotted against sample size $n$.]

  13. Ratios of coalescence times for $\Lambda = \mathrm{Beta}(0.1, 1.9)$. [Figure: ratios $R_1$ (◦), $R_2$ (△), $R_3$ (▽), $R_4$ (⋄), and $R_{n-1}$ (+) plotted against sample size $n$.]

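  14. Monophyletic concordance for Λ coalescents. [Figure: diagram labelled with time $t$ and species $A$ and $B$.]

  15. Not monophyletic concordance. [Figure: diagram labelled with time $t$ and species $A$ and $B$.]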

  16. General form for $P[MC]$ for two species:
      $$P[MC] = \sum_{m_A, m_B} P[MC; m_A, m_B] \, P[m_A, m_B]$$
      with $P[m_A, m_B] = G_{n_A, m_A}(t) \, G_{n_B, m_B}(t)$ and
      $$P[MC; m_A, m_B] = \sum_{k=2}^{m_A + m_B} \beta_{m_A + m_B,\, k} \left( \binom{m_A}{k} P[MC; m_A - k + 1, m_B] + \binom{m_B}{k} P[MC; m_A, m_B - k + 1] \right) \Big/ \binom{m_A + m_B}{k}$$
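The recursion for $P[MC; m_A, m_B]$ can be sketched as a memoised function. Two assumptions are made here that the slide does not state: $\beta_{n,k}$ is read as the probability that the next merger among $n$ lines involves exactly $k$ of them, and the recursion stops at $P[MC; 1, 1] = 1$. The names `make_p_mc` and `kingman_rate` are illustrative.

```python
from functools import lru_cache
from math import comb


def make_p_mc(rate):
    """Build P[MC; m_A, m_B] from rate(n, k), the total rate of k-mergers among n lines."""

    def beta(n, k):
        # Assumed meaning of beta_{n,k}: probability the next merger has size k.
        total = sum(rate(n, j) for j in range(2, n + 1))
        return rate(n, k) / total

    @lru_cache(maxsize=None)
    def p_mc(m_a, m_b):
        if m_a == 1 and m_b == 1:
            return 1.0                      # assumed base case: one line per species
        n = m_a + m_b
        prob = 0.0
        for k in range(2, n + 1):
            # Only mergers among lines of a single species keep monophyly.
            within_a = comb(m_a, k) * p_mc(m_a - k + 1, m_b) if k <= m_a else 0.0
            within_b = comb(m_b, k) * p_mc(m_a, m_b - k + 1) if k <= m_b else 0.0
            prob += beta(n, k) * (within_a + within_b) / comb(n, k)
        return prob

    return p_mc


def kingman_rate(n, k):
    """Kingman's coalescent: only pairwise mergers, at total rate C(n, 2)."""
    return float(comb(n, 2)) if k == 2 else 0.0


p_mc = make_p_mc(kingman_rate)
print(p_mc(2, 2), p_mc(3, 3))
```

Both arguments must be at least 1. Under Kingman's coalescent the sketch gives $P[MC; 2, 2] = 1/9$: the first pairwise merger is within a species with probability $1/3$, and so is the second.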

  17. Computing $G_{i,j}(t)$. $G_{i,j}(t)$ is the probability of $j$ lines at time $t$ when starting from $i$ lines at time zero within one population. A vector $c$ of ordered mergers associated with Kingman's coalescent is simply $\{2, 2, \ldots, 2\}$. By way of example, starting from 10 lines, a coalescence sequence could be $\{3, 2, 5, 3\}$ in a Λ coalescent. Conditioning on the embedded chain, or the order of mergers, gives the transition probabilities
      $$\beta_{i,j} = \begin{cases} q_{i,j} \Big/ \sum_{k \ne i} q_{i,k} & \text{if } i \ne j \\ 0 & \text{otherwise} \end{cases}$$

  18. The rate matrix $Q_A$ of $(A_t; t \ge 0)$ is
      $$q_{j,i} = \binom{j}{j - i + 1} \int_0^1 x^{j-i-1} (1 - x)^{i-1} \, \Lambda(dx)$$
      $$q_{j,j} = -\sum_{i=1}^{j-1} q_{j,i}, \quad 2 \le j \le n$$
      $$q_{j,i} = 0, \quad \text{otherwise}$$
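A minimal sketch of assembling $Q_A$ from a merger-rate function follows. The Beta$(2-a, a)$ rates are used only as an illustrative choice of $\Lambda$; `rate_matrix` and `beta_rate` are hypothetical names, and the matrix is stored 1-based with an unused row and column 0.

```python
from math import comb, gamma


def beta_rate(b, k, a):
    """k-merger rate among b lines for Lambda = Beta(2 - a, a) (illustrative choice)."""
    def B(x, y):
        return gamma(x) * gamma(y) / gamma(x + y)
    return comb(b, k) * B(k - a, b - k + a) / B(2 - a, a)


def rate_matrix(n, rate):
    """Return Q as an (n+1) x (n+1) nested list; index 0 is unused padding.

    Q[j][i] is the rate of jumping from j to i lines, i.e. a merger of
    j - i + 1 lines, and each diagonal entry makes its row sum to zero.
    """
    Q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(2, n + 1):
        for i in range(1, j):
            Q[j][i] = rate(j, j - i + 1)
        Q[j][j] = -sum(Q[j][1:j])
    return Q


Q = rate_matrix(4, lambda b, k: beta_rate(b, k, 1.5))
for row in Q[1:]:
    print([round(x, 4) for x in row[1:]])
```

The matrix is lower triangular because the number of lines can only decrease, which is what makes the diagonal entries its eigenvalues.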

  19. Using eigenvectors and eigenvalues of $Q_A$. The eigenvalues of $Q_A$ are its diagonal entries $q_{k,k} = -\alpha(k)$, where $\alpha(k)$ is the total coalescence rate from $k$ lines. The left eigenvector $l^{(k)} = (l^{(k)}_1, \ldots, l^{(k)}_n)$ and the right eigenvector $r^{(k)} = (r^{(k)}_1, \ldots, r^{(k)}_n)$ are obtained by the recursions
      $$l^{(k)}_j = \frac{q_{j+1,j}\, l^{(k)}_{j+1} + \cdots + q_{k,j}\, l^{(k)}_k}{q_{k,k} - q_{j,j}}, \quad 1 \le j < k$$
      $$r^{(k)}_j = \frac{q_{j,k}\, r^{(k)}_k + \cdots + q_{j,j-1}\, r^{(k)}_{j-1}}{q_{k,k} - q_{j,j}}, \quad 1 < k < j \le n$$

  20. The spectral decomposition of $Q_A$ yields the transition probabilities $G_{i,j}(t) \equiv P[A_t = j \mid A_0 = i]$ as
      $$G_{i,j}(t) = \sum_{k=j}^{i} e^{-\alpha(k) t} \, r^{(k)}_i \, l^{(k)}_j$$
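The eigenvector recursions and the spectral sum can be sketched in a few lines. This assumes the normalisation $l^{(k)}_k = r^{(k)}_k = 1$ (not stated on the slides), distinct diagonal entries of $Q_A$, and $e^{-\alpha(k)t} = e^{q_{k,k}t}$. Kingman's coalescent is used as the example rate matrix, and all function names are illustrative.

```python
from math import comb, exp


def kingman_Q(n):
    """Rate matrix of Kingman's coalescent (1-based; row and column 0 unused)."""
    Q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(2, n + 1):
        Q[j][j - 1] = float(comb(j, 2))
        Q[j][j] = -Q[j][j - 1]
    return Q


def transition_probs(Q, n, t):
    """G[i][j] = P[A_t = j | A_0 = i] for a lower-triangular rate matrix Q.

    Assumes distinct diagonal entries and l_k^(k) = r_k^(k) = 1.
    """
    G = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        l = [0.0] * (n + 1)              # left eigenvector, supported on 1..k
        r = [0.0] * (n + 1)              # right eigenvector, supported on k..n
        l[k] = r[k] = 1.0
        for j in range(k - 1, 0, -1):
            num = sum(Q[m][j] * l[m] for m in range(j + 1, k + 1))
            l[j] = num / (Q[k][k] - Q[j][j])
        for j in range(k + 1, n + 1):
            num = sum(Q[j][m] * r[m] for m in range(k, j))
            r[j] = num / (Q[k][k] - Q[j][j])
        w = exp(Q[k][k] * t)             # e^{-alpha(k) t} with alpha(k) = -q_kk
        for i in range(k, n + 1):
            for j in range(1, k + 1):
                G[i][j] += w * r[i] * l[j]
    return G


G = transition_probs(kingman_Q(4), 4, 0.5)
print([round(x, 4) for x in G[4][1:]])
```

With this normalisation $l^{(k)} \cdot r^{(k)} = 1$, so no extra scaling factor is needed in the sum.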

  21. Transition probabilities $G_{i,j}$ for $i = 3$, where $q_{3,2}$ and $q_{3,3}$ denote the rates of a pairwise and a triple merger among three lines:
      $$G_{3,2}(t) = \frac{q_{3,2}}{q_{3,2} + q_{3,3}} P[T_3 \le t,\ T_3 + T_2 > t]$$
      $$G_{3,1}(t) = \frac{q_{3,2}}{q_{3,2} + q_{3,3}} P[T_3 + T_2 \le t] + \frac{q_{3,3}}{q_{3,2} + q_{3,3}} P[T_3 \le t]$$
      $$G_{3,3}(t) = P[T_3 > t]$$
      and $G_{3,1}(t) + G_{3,2}(t) + G_{3,3}(t) = 1$.

  22. An example with $\Lambda_\psi$. Process with infinitesimal parameters
      $$q_{ij} = \binom{i}{i - j + 1} \psi^{i-j+1} (1 - \psi)^{j-1}$$
      For $i = 3$ we obtain, with $\alpha(i) \equiv \sum_{k=1}^{i-1} q_{ik}$,
      $$G_{3,2}(t) = \frac{3}{2} \left( e^{-\alpha(2) t} - e^{-\alpha(3) t} \right)$$
      $$G_{3,1}(t) = 1 - \frac{3}{2} e^{-\alpha(2) t} + \frac{1}{2} e^{-\alpha(3) t}$$
      $$G_{3,3}(t) = e^{-\alpha(3) t}$$
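The factor $3/2$ in $G_{3,2}(t)$ can be checked by hand. The short derivation below assumes $q_{3,2} = 3\psi^2(1-\psi)$, $q_{3,1} = \psi^3$ and $q_{2,1} = \psi^2$ with $0 < \psi < 1$, and conditions on the time $s$ of the first merger.

```latex
\begin{align*}
\alpha(3) &= q_{3,2} + q_{3,1} = 3\psi^{2}(1-\psi) + \psi^{3} = 3\psi^{2} - 2\psi^{3},
\qquad \alpha(2) = q_{2,1} = \psi^{2},\\
\alpha(3) - \alpha(2) &= 2\psi^{2}(1-\psi),\\
G_{3,2}(t) &= \int_0^t q_{3,2}\, e^{-\alpha(3)s}\, e^{-\alpha(2)(t-s)}\, ds
  = \frac{q_{3,2}}{\alpha(3)-\alpha(2)}\left(e^{-\alpha(2)t} - e^{-\alpha(3)t}\right)
  = \frac{3}{2}\left(e^{-\alpha(2)t} - e^{-\alpha(3)t}\right).
\end{align*}
```

Since $G_{3,3}(t) = e^{-\alpha(3)t}$, the remaining probability $1 - G_{3,2}(t) - G_{3,3}(t)$ reproduces the stated $G_{3,1}(t)$.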

  23. In general,
      $$G_{i,j}(t) = \sum_{c \in C_{i,j}} g_c(t), \quad 1 \le j < i$$
      in which $c$ is a coalescence sequence, or a particular order of mergers in going from $i$ to $j$ sequences. The number of possible sequences is $|C_{i,j}| = 2^{i-j-1}$.
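The count $2^{i-j-1}$ can be confirmed by enumerating the sequences. In this sketch a coalescence sequence is represented as a tuple of merger sizes, which is an assumed encoding that matches the $\{3, 2, 5, 3\}$ example given earlier; the function name is illustrative.

```python
def coalescence_sequences(i, j):
    """Yield every ordered tuple of merger sizes taking i lines down to j lines.

    A merger of k lines removes k - 1 lines, so the sizes k_1, ..., k_l
    satisfy sum(k_m - 1) = i - j with every k_m >= 2.
    """
    if i == j:
        yield ()
        return
    for k in range(2, i - j + 2):
        for rest in coalescence_sequences(i - k + 1, j):
            yield (k,) + rest


seqs = list(coalescence_sequences(5, 1))
print(len(seqs), seqs)
```

Each sequence corresponds to a composition of $i - j$ into positive parts $k_m - 1$, which is where the power of two comes from.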

  24. $$g_c(t) = \begin{cases} p(c)\, P[T(c) \le t,\ T(c) + T_j > t] & \text{if } j > 1 \\ p(c)\, P[T(c) \le t] & \text{if } j = 1 \\ P[T_i > t] & \text{if } j = i \end{cases}$$
      in which
      $$P[T(c) \le t,\ T(c) + T_j > t] = e^{-\alpha(j) t} \sum_{k=1}^{l} \frac{\gamma_k}{\beta(i_k, j)} \left( 1 - e^{-\beta(i_k, j) t} \right)$$
      with $\beta(i_k, j) \equiv \alpha(i_k) - \alpha(j)$; and
      $$P[T(c) \le t] = \sum_{k=1}^{l} \gamma'_k \left( 1 - e^{-\alpha(i_k) t} \right)$$

  25. Example: two species. The probability $P[MC]$ of monophyletic concordance for two lines from each of two species, with $\alpha_X(i) = \sum_{1 \le k \le i-1} q_{ik}$ (for species $X$):
      $$P[MC] = (1 - e^{-\alpha_A(2) t})(1 - e^{-\alpha_B(2) t}) + e^{-\alpha_A(2) t}(1 - e^{-\alpha_B(2) t})\, \beta_{3,2} / 3 + (1 - e^{-\alpha_A(2) t})\, e^{-\alpha_B(2) t}\, \beta_{3,2} / 3 + e^{-\alpha_A(2) t}\, e^{-\alpha_B(2) t}\, \beta_{4,2}\, \beta_{3,2} / 9$$
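The two-species expression is easy to evaluate numerically. In this sketch $\beta_{3,2}$ and $\beta_{4,2}$ are read as the probabilities that the next merger among three (respectively four) ancestral lines is pairwise, which is an interpretation rather than something the slide states; the function and parameter names are illustrative.

```python
from math import exp


def p_mc_two_by_two(t, alpha_a, alpha_b, beta32, beta42):
    """P[MC] for two lines from each of two species that diverged at time t.

    alpha_a, alpha_b: pair-coalescence rates alpha_A(2) and alpha_B(2).
    beta32, beta42: assumed probabilities that the next merger among 3
    (resp. 4) lines in the ancestral population is a pairwise merger.
    """
    ea, eb = exp(-alpha_a * t), exp(-alpha_b * t)
    return ((1 - ea) * (1 - eb)                 # both pairs merged before t
            + ea * (1 - eb) * beta32 / 3        # only the B pair merged
            + (1 - ea) * eb * beta32 / 3        # only the A pair merged
            + ea * eb * beta42 * beta32 / 9)    # neither pair merged


# Kingman's coalescent: unit pair rate and pairwise mergers only.
for t in (0.0, 1.0, 4.0):
    print(t, round(p_mc_two_by_two(t, 1.0, 1.0, 1.0, 1.0), 4))
```

With Kingman parameters the value at $t = 0$ is $1/9$ and it rises towards 1 as the divergence time grows.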

  26. Two species and two lines each. [Figure, two panels. First panel: ◦ $\Lambda_\psi$; △ $K + \Lambda_\psi$; x-axis: psi. Second panel: ◦ $\mathrm{Beta}(2-a, a)$; x-axis: $a$.]

  27. Two species and two lines each. [Figure: ◦ $\Lambda_{0.05}$; △ $K + \Lambda_{0.05}$; ⋄ $\mathrm{Beta}(0.95, 1.05)$; + $K$; x-axis: time $t$.]

  28. Two species and two lines each. [Figure: ◦ $\Lambda_{0.99}$; △ $K + \Lambda_{0.99}$; ⋄ $\mathrm{Beta}(0.05, 1.95)$; + $K$; x-axis: time $t$.]

  29. Two species and three lines each. [Figure, two panels. First panel: ◦ $\Lambda_\psi$; △ $K + \Lambda_\psi$. Second panel: ◦ $\mathrm{Beta}(2-a, a)$.]

  30. Two species and three lines each. [Figure: ◦ $\Lambda_{0.05}$; △ $K + \Lambda_{0.05}$; ⋄ $\mathrm{Beta}(0.95, 1.05)$; x-axis: time $t$.]

  31. Two species and three lines each. [Figure: ◦ $\Lambda_{0.95}$; △ $K + \Lambda_{0.95}$; ⋄ $\mathrm{Beta}(0.05, 1.95)$; x-axis: time $t$.]

  32. Recursive approach for $s$ species. Let $\tilde{n} = n_1 + \cdots + n_s$, in which $n_i$ denotes the number of ancestral lines for species $i$ in a population; and let $n = (n_1, \ldots, n_s)$. Then
      $$P[MC; n] = \sum_{k=2}^{\tilde{n}} \sum_{r=1}^{s} \beta_{\tilde{n}, k} \binom{n_r}{k} P[MC; m] \Big/ \binom{\tilde{n}}{k}$$
      in which $m = (n_1, n_2, \ldots, n_{r-1}, n_r - k + 1, n_{r+1}, \ldots, n_s)$ and
      $$P[MC; (0, 0, \ldots, 0, 1)] = P[MC; (0, 0, \ldots, 0, 1, 1)] = 1$$

  33. Three species and two lines each ($t_1 = 1$, $t_2 = 2$). [Figure: ◦ $\Lambda_{0.05}$; △ $K + \Lambda_{0.05}$; ⋄ $\mathrm{Beta}(0.95, 1.05)$; x-axis: $\psi = a - 1$.]
