types of codon models

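2017-07-29

part 3: analysis of natural selection pressure

types of codon models

"omega models": the rate of substitution from codon i to codon j (i ≠ j) is

\[
Q_{ij} =
\begin{cases}
0 & \text{if } i \text{ and } j \text{ differ at } > 1 \text{ nucleotide position} \\
\pi_j & \text{for synonymous tv.} \\
\kappa \pi_j & \text{for synonymous ts.} \\
\omega \pi_j & \text{for non-synonymous tv.} \\
\omega \kappa \pi_j & \text{for non-synonymous ts.}
\end{cases}
\]

tv. = transversion, ts. = transition, κ = transition/transversion rate ratio, ω = dN/dS, π_j = equilibrium frequency of codon j.

Goldman and Yang (1994); Muse and Gaut (1994)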
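The five cases of Q_ij can be read as a small decision procedure. The Python sketch below is an illustration only, not the deck's or PAML's implementation: the function and variable names are hypothetical, it assumes the standard genetic code, and it treats stop codons as outside the state space (rate 0), as in models built on the 61 sense codons.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (a in PURINES) == (b in PURINES)


def rate(codon_i, codon_j, kappa, omega, pi_j):
    """Off-diagonal rate Q_ij for one pair of codons (hypothetical helper)."""
    aa_i, aa_j = GENETIC_CODE[codon_i], GENETIC_CODE[codon_j]
    if aa_i == "*" or aa_j == "*":
        return 0.0  # assumption: stop codons are not states of the model
    diffs = [(a, b) for a, b in zip(codon_i, codon_j) if a != b]
    if len(diffs) != 1:
        return 0.0  # identical codons, or more than one nucleotide change
    q = pi_j
    if is_transition(*diffs[0]):
        q *= kappa  # transition: multiply by kappa
    if aa_i != aa_j:
        q *= omega  # non-synonymous: multiply by omega
    return q
```

For example, `rate("CTG", "TTG", kappa, omega, pi_j)` falls in the "synonymous ts." case (Leu to Leu through a C/T change), while `rate("GTG", "CTG", kappa, omega, pi_j)` falls in the "non-synonymous tv." case (Val to Leu through a G/C change).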

this codon model: "M0"

The "omega model" rate matrix Q_ij (Goldman and Yang 1994; Muse and Gaut 1994) with:
• same ω for all branches
• same ω for all sites

[Figure: four-taxon tree (x1 to x4) with branch lengths t, every branch labelled ω0, beside a codon alignment with sequences indexed by j and sites indexed by k.]

two basic types of models

• branch models (ω varies among branches)
• site models (ω varies among sites)

[Figure: the same tree and alignment; for branch models some branches are labelled ω1 instead of ω0, and for site models the alignment sites are labelled ω0 or ω1.]

interpretation of a branch model

[Figure: four-taxon tree (x1 to x4); branches are labelled ω0 except for the branches labelled ω1.]

episodic adaptive evolution of a novel function with ω1 > 1

branch models*

variation (ω) among branches         approach
Yang, 1998                           fixed effects
Bielawski and Yang, 2003             fixed effects
Seo et al. 2004                      auto-correlated rates
Kosakovsky Pond and Frost, 2005      genetic algorithm
Dutheil et al. 2012                  clustering algorithm

* these methods can be useful when selection pressure is strongly episodic

site models*

[Figure: codon alignment; each site (column) has its own ω.]

variation (ω) among sites                              approach
Yang and Swanson, 2002                                 fixed effects (ML)
Bao, Gu and Bielawski, 2006                            fixed effects (ML)
Massingham and Goldman, 2005                           site wise (LRT)
Kosakovsky Pond and Frost, 2005                        site wise (LRT)
Nielsen and Yang, 1998                                 mixture model (ML)
Kosakovsky Pond, Frost and Muse, 2005                  mixture model (ML)
Huelsenbeck and Dyer, 2004; Huelsenbeck et al. 2006    mixture (Bayesian)
Rubinstein et al. 2011                                 mixture model (ML)
Bao, Gu, Dunn and Bielawski 2008 & 2011                mixture (LiBaC/MBC)
Murrell et al. 2013                                    mixture (Bayesian)

• useful when some sites evolve under diversifying selection pressure over long periods of time
• this is not a comprehensive list

site models: discrete model (M3)

[Figure: bar chart of the proportion of sites in each class, with ω0 = 0.01, ω1 = 1.0, ω2 = 2.0.]

mixture-model likelihood:

\[
P(x_h) = \sum_{i=0}^{K-1} p_i \, P(x_h \mid \omega_i)
\]

Each P(x_h | ω_i) comes from the conditional likelihood calculation (see part 1).
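A minimal Python sketch of the mixture-model likelihood may help make the sum concrete. It assumes the conditional likelihoods P(x_h | ω_i) have already been computed on the tree (see part 1); the function names and any numbers passed in are hypothetical, and sites are treated as independent.

```python
import math


def site_likelihood(proportions, conditional_likelihoods):
    """P(x_h) = sum over classes i of p_i * P(x_h | omega_i) for one codon site."""
    return sum(p * lik for p, lik in zip(proportions, conditional_likelihoods))


def log_likelihood(proportions, per_site_conditionals):
    """Sum of log P(x_h) over all sites, assuming sites are independent."""
    return sum(
        math.log(site_likelihood(proportions, conditionals))
        for conditionals in per_site_conditionals
    )
```

Here `proportions` holds p_0 ... p_{K-1} and each entry of `per_site_conditionals` holds the K conditional likelihoods of one site, so a call such as `log_likelihood([0.5, 0.3, 0.2], [[0.1, 0.2, 0.3], [0.4, 0.1, 0.05]])` returns the log-likelihood of a two-site toy alignment.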

interpretation of a sites-model

[Figure: bar chart of the proportion of sites in each class, with ω0 = 0.01, ω1 = 1.0, ω2 = 2.0; the ω2 class holds 5% of sites.]

diversifying selection (frequency dependent) at 5% of sites with ω2 = 2

models for variation among branches & sites

[Figure: four-taxon tree (x1 to x4) with branches labelled ω0 or ω1, beside a codon alignment (sequences indexed by j, sites by k) whose sites are labelled ω0 or ω1.]

• branch models (ω varies among branches)
• site models (ω varies among sites)
• branch-site models (combines the features of above models)

models for variation among branches & sites

variation (ω) among branches & sites      approach
Yang and Nielsen, 2002                    fixed+mixture (ML)
Forsberg and Christiansen, 2003           fixed+mixture (ML)
Bielawski and Yang, 2004                  fixed+mixture (ML)
Guindon et al., 2004                      switching (ML)
Zhang et al. 2005                         fixed+mixture (ML)
Kosakovsky Pond et al. 2011, 2012         full mixture (ML)

* these methods can be useful when selection pressures change over time at just a fraction of sites
* it can be a challenge to apply these methods properly (more about this later)

branch-site "Model B"

mixture-model likelihood:

\[
P(x_h) = \sum_{i=0}^{K-1} p_i \, P(x_h \mid \omega_i)
\]

[Figure: bar chart of the proportion of sites in each class, with ω = 0.01, ω = 0.90 and ω = 5.55; the ω = 5.55 class applies to the foreground branch only.]

ω for background branches are from site-classes 1 and 2 (0.01 or 0.90)

two scenarios can yield branch-sites with dN/dS > 1

[Figure: bar chart of the proportion of sites in each class, with ω = 0.01, ω = 0.90 and ω_FG = 5.55; the ω_FG class holds 10% of sites and applies to the foreground (FG) branch only.]

• 10% of sites have shifting balance on a fixed peak (same function)
• episodic adaptive evolution at 10% of sites for novel function

branch-site codon models cannot tell which scenario is correct without external information!

Jones et al. (2016) MBE

model-based inference

"omega models": the rate matrix Q_ij of Goldman and Yang (1994) and Muse and Gaut (1994)

model based inference

3 analytical tasks:
task 1. parameter estimation (e.g., ω)
task 2. hypothesis testing
task 3. make predictions (e.g., sites having ω > 1)

task 1: parameter estimation

t, κ, ω = unknown constants estimated by ML
π's = empirical [GY: F3×4 or F61 in Lab]

use a numerical hill-climbing algorithm to maximize the likelihood function
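To illustrate what "numerical hill-climbing" means, here is a toy Python sketch. It does not use a codon model: as a loudly labelled stand-in, it maximizes the Jukes-Cantor (JC69) log-likelihood of the distance t between two nucleotide sequences, because that one-parameter problem has a closed-form answer to check against. The step-halving search and all names are assumptions for illustration, not the optimizer used in the lab software.

```python
import math


def jc69_log_likelihood(t, n_sites, n_diff):
    """JC69 log-likelihood of distance t (constant term n_sites*log(1/4) dropped)."""
    # p = probability that a site differs; expm1 keeps p > 0 for very small t.
    p = -0.75 * math.expm1(-4.0 * t / 3.0)
    return n_diff * math.log(p / 3.0) + (n_sites - n_diff) * math.log1p(-p)


def hill_climb(log_lik, start, step=0.1, tol=1e-8):
    """Move uphill in fixed steps; halve the step when neither neighbour is better."""
    x, best = start, log_lik(start)
    while step > tol:
        moved = False
        for candidate in (x + step, x - step):
            if candidate <= 0.0:
                continue  # the parameter must stay positive
            value = log_lik(candidate)
            if value > best:
                x, best, moved = candidate, value, True
                break
        if not moved:
            step /= 2.0
    return x, best


# Hypothetical data: 100 aligned sites, 10 of which differ.
t_hat, lnl_hat = hill_climb(lambda t: jc69_log_likelihood(t, 100, 10), start=0.5)
```

For this toy problem the estimate can be compared with the analytical value `-0.75 * math.log(1 - 4 * 0.1 / 3)`. With several parameters (t, κ, ω) no closed form is available, which is why a numerical search is used.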

task 1: parameter estimation

Parameters: t and ω
Gene: acetylcholine α receptor
Sequences: human and mouse (with their common ancestor)
lnL = -2399

[Figure, captioned "Sooner or later you'll get it".]

task 2: statistical significance

task 1. parameter estimation (e.g., ω) ✔
task 2. hypothesis testing: LRT
task 3. prediction / site identification

task 2: likelihood ratio test for positive selection

H0: variable selective pressure but NO positive selection (M1)
H1: variable selective pressure with positive selection (M2)

Compare the statistic below with a χ² distribution:

\[
2\Delta\ell = 2(\ell_1 - \ell_0)
\]

[Figure: site-class distributions. Model 1a (M1a): classes at ω̂ = 0.5 and (ω = 1). Model 2a (M2a): classes at ω̂ = 0.5, (ω = 1) and ω̂ = 3.25.]

task 2: likelihood ratio test for positive selection

H0: Beta distributed variable selective pressure (M7)
H1: Beta plus positive selection (M8)

Compare the statistic below with a χ² distribution:

\[
2\Delta\ell = 2(\ell_1 - \ell_0)
\]

[Figure: distribution of sites over the ω ratio. M7: beta distribution on 0 to 1. M8: beta distribution on 0 to 1 plus a class with ω > 1.]
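The test itself is a few lines of arithmetic. The Python sketch below computes 2Δl and a χ² p-value using only the standard library; the closed-form tail probabilities are exact for 1 and 2 degrees of freedom. The choice of degrees of freedom is an assumption not stated on the slides (2 is the value commonly used for both the M1a vs M2a and M7 vs M8 comparisons), and the log-likelihoods in the example are hypothetical.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """2 * (l1 - l0), where l0 and l1 are maximized log-likelihoods."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, for df = 1 or 2 only."""
    if x <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Hypothetical log-likelihoods for the null and alternative models.
statistic = lrt_statistic(lnl_null=-2399.0, lnl_alt=-2390.0)
p_value = chi2_sf(statistic, df=2)  # assumption: 2 degrees of freedom
```

A small p-value rejects H0 in favour of the model that allows a class of sites with ω > 1.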

task 3: identify the selected sites

task 1. parameter estimation (e.g., ω) ✔
task 2. hypothesis testing ✔
task 3. prediction / site identification: Bayes' rule

task 3: which sites have dN/dS > 1

[Figure: bar chart of the proportion of sites in each class, above a codon alignment.]

model: 9% have ω > 1
Bayes' rule: site 4, 12 & 13
structure: sites are in contact

review the mixture likelihood (model M3)

\[
P(x_h) = \sum_{i=0}^{K-1} p(\omega_i) \, P(x_h \mid \omega_i)
\]

P(x_h) = total probability; p(ω_i) = prior; P(x_h | ω_i) = likelihood

[Figure: bar chart of the proportion of sites in each class.]

ω0 = 0.03, ω1 = 0.40, ω2 = 14.1
p0 = 0.85, p1 = 0.10, p2 = 0.05

Bayes' rule for identifying selected sites

Site class 0: ω0 = .03, 85% of codon sites
Site class 1: ω1 = .40, 10% of codon sites
Site class 2: ω2 = 14, 5% of codon sites

\[
P(\omega_2 \mid x_h) = \frac{P(\omega_2) \, P(x_h \mid \omega_2)}{\sum_{i=0}^{K-1} P(\omega_i) \, P(x_h \mid \omega_i)}
\]

P(ω2 | x_h) = posterior probability of hypothesis (ω2)
P(ω2) = prior probability of hypothesis (ω2)
P(x_h | ω2) = likelihood of hypothesis (ω2)
denominator = marginal probability (total probability) of the data
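Bayes' rule for one site can be sketched in Python as follows. The prior proportions are the ones on the slide (0.85, 0.10, 0.05); the conditional likelihoods for the example site are hypothetical numbers chosen only to show the calculation, and the function name is not from any package.

```python
def posterior_probabilities(priors, conditional_likelihoods):
    """Posterior P(omega_i | x_h) for every site class i at one codon site."""
    joint = [p * lik for p, lik in zip(priors, conditional_likelihoods)]
    total = sum(joint)  # marginal (total) probability of the data at this site
    return [j / total for j in joint]


priors = [0.85, 0.10, 0.05]          # p0, p1, p2 from the slide
site_likelihoods = [1e-6, 1e-5, 4e-4]  # hypothetical P(x_h | omega_i)
posteriors = posterior_probabilities(priors, site_likelihoods)
```

The last entry of `posteriors` is P(ω2 | x_h). A site is flagged as a candidate for positive selection when this value is high, even though only 5% of sites belong to that class a priori.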
