Implementation of robust methods for locating quantitative trait loci in R

Andreas Baierl and Andreas Futschik, Institute of Statistics and Decision Support Systems, University of Vienna. Overview: introduction to QTL mapping; analysis of QTL data (modified BIC, robust methods); implementation and simulations in R.


  1. Implementation of robust methods for locating quantitative trait loci in R
     Andreas Baierl and Andreas Futschik
     Institute of Statistics and Decision Support Systems, University of Vienna

     Overview
     • Introduction to QTL mapping
     • Analysis of QTL data
       – modified BIC
       – Robust methods
     • Implementation and Simulations in R

     Locating quantitative trait loci (QTL)
     Quantitative trait: a character that is influenced by many genes. Many relevant traits are quantitative: height, yield, ...
     Quantitative trait locus (QTL): a gene (functional sequence of bases) that influences a certain quantitative trait.
     Relevant questions:
     - How many genes influence a trait (how many QTL)?
     - Find the exact positions of the QTL
     - (Estimate the size of the genetic effects)

     Background
     • A gene can take different forms (alleles); evolution occurred in small steps
     • The contribution of genetic effects to the total (phenotypic) variation of a trait (heritability) determines the rate at which characters respond to selection (environmental variance reduces the efficiency of the response).
       trait value = genetic influence + environmental influence
     • Partitioning genotypic variance into components with different impact on selection: additive and non-additive gene effects (epistasis)
       -> dependency on background population
       evolutionary reason: stabilization of phenotype
     phenotype: the form taken by some character in a specific individual.
     genotype: the genetic makeup of an individual.

     Robust Methods for QTL Mapping in R, Andreas Baierl

  2. Data from experimental crosses
     [Diagram: crossing scheme. F0 parents (A A and a a) give the F1 generation; F1 is either backcrossed (BACKCROSS) or intercrossed to give F2 (INTERCROSS)]

     Data matrix for backcross design

     Indiv.   QT     marker.1   marker.2   ...   marker.m
     1        34.3   AA         Aa         ...   AA
     2        65.4   Aa         AA         ...   *
     3        23.2   Aa         *          ...   Aa
     4        45.4   AA         AA         ...   Aa
     ...      ...    ...        ...        ...   ...

     ~ 50-500 markers
     ~ 200-1000 individuals

     Genetic map
     The distance between markers is usually estimated from the recombination frequency.
     If a marker is close to a QTL, then the marker genotype will be associated with the QTL genotype. (There would be a 1-1 correspondence if there were no recombinations.)
     No linkage between chromosomes.
     [Figure: genetic map; marker locations (cM, 0 to 100) on chromosomes 1 to 20]

     Analysis of QTL data
     Find NUMBER, POSITIONS, EFFECT TYPES and SIZES of QTL
     Challenges:
     • large number of possible models (main effects + interactions = m + m(m-1)/2 ~ 100 + 5,000)
       -> efficient search strategy
       -> correct for test multiplicity
     • deviation from normality of the conditional distribution of the trait given the marker genotypes (especially with heavy tails or outliers)
     • recover unobserved / wrong / missing genotype information
     • confounding of effect types
     • selection bias for effect sizes, especially for small effects
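The genetic map above places markers by distance in cM, estimated from recombination frequencies. The slides do not say which map function is used; the sketch below assumes Haldane's map function (no crossover interference), and the function names are hypothetical.

```python
import math


def recombination_to_cm(r):
    """Haldane map distance in cM for a recombination frequency r in [0, 0.5).

    Assumption: Haldane's map function, d = -50 * ln(1 - 2r) cM.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def cm_to_recombination(d_cm):
    """Inverse of Haldane's map function: recombination frequency for d_cm cM."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))
```

Under this assumption a recombination frequency of 0.1 corresponds to roughly 11.2 cM, and the recombination frequency approaches 0.5 (no linkage) as the distance grows.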

  3. Methods for QTL mapping
     • marker based
       – univariate: ANOVA on single markers
       – multiple: multiple regression
     • estimation of QTL location
       – univariate: interval mapping, composite interval mapping
       – multiple: conditional interval mapping, multiple interval mapping, Bayesian (Sen & Churchill)
     • strict Bayesian approach

     Multiple regression approach
     X_ij: genotype of the i-th individual (out of n) at the j-th marker (out of m)
       X_ij = ½ if the individual has genotype AA (homozygous)
       X_ij = -½ if the individual has genotype Aa (heterozygous)
     I: subset of the set N = {1,...,m}
     U: subset of N x N
     ε_i: random error term with distribution f

     Model selection
     aim: identify the correct model, not minimise the prediction error
     -> criterion for inclusion and exclusion of variables
     • cross validation / bootstrap
     • AIC: n log(RSS) + 2k
       minimises prediction error
     • BIC: n log(RSS) + k log(n)
       more conservative than AIC; especially for small n, BIC chooses too many QTL
     n: sample size
     k = p + q = number of main effects (p) and interaction effects (q) under consideration
     RSS: residual sum of squares (assuming normal error distribution!)
     -> efficient search strategy: forward selection + backward elimination step

     Behaviour of BIC depending on n & # of predictors
     BIC_Mi: BIC of the 1-dimensional model M_i
     N: number of 1-dimensional models
     n: sample size
     Every model has the same probability to be selected -> more likely to select a large model.
     [Figure: type 1 error under M0 (0.00 to 0.30) against n (0 to 10000) for 1, 5, 10, 20, 30 and 100 predictors]
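The model equation of the multiple regression approach is not part of the transcript. A form consistent with the symbols defined above (X_ij, I, U, ε_i), and with the model used in Bogdan et al. (2004), would be the following; the names Y_i, μ, β_j and γ_uv are assumptions introduced here for the trait value, the overall mean, the main effects and the two-way interaction (epistasis) effects.

```latex
Y_i = \mu
      + \sum_{j \in I} \beta_j X_{ij}
      + \sum_{(u,v) \in U} \gamma_{uv} X_{iu} X_{iv}
      + \varepsilon_i ,
\qquad i = 1, \dots, n
```

In this notation p = |I| is the number of main effects and q = |U| the number of interaction effects, so that k = p + q in the AIC and BIC penalties.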

  4. modified BIC
     Additional penalty term dependent on the number of predictors under consideration (Bogdan et al. 2004)
     modified BIC: [formula]
     with
     E(p): expected number of main effects
     E(q): expected number of epistasis (= interaction) effects
     E(p) = E(q) = 2.2 controls the Type I error at a level of 5% (for n = 200)

     Comparison of mBIC and BIC
     [Figure: type 1 error under M0 (0.00 to 0.20) against n (0 to 5000) for BIC and mBIC, with 5 predictors (+10 two-way interaction terms)]

     Deviations from Normality
     • Typically, non-parametric methods based on ranks are used
     • Here we use robust regression techniques, in particular M-estimators: minimise another measure of distance instead of the residual sum of squares. Popular alternatives are:
     [Figure: four panels on the range -6 to 6: rho.huber (k=0.05), rho.huber (k=1.3), rho.bisquare, rho.hampel]

     Robust model selection criterion
     [formula]
     still consistent under quite general conditions on the error distribution (Martin, 1980)
     but the performance of BIC*_ρ depends on ρ and the error distribution: Jurečková and Sen (1996) derived the limiting distribution for [formula]
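The mBIC formula is not part of the transcript. As a minimal sketch, the least-squares version published in Bogdan et al. (2004) is mBIC = n log(RSS) + (p + q) log(n) + 2p log(l - 1) + 2q log(u - 1), with l = m / E(p) and u = (m(m-1)/2) / E(q). The code below assumes this is the form meant on the slide; the function and argument names are hypothetical, and it does not cover the robust variant.

```python
import math


def bic(rss, n, k):
    """Standard BIC in the slide's notation: n log(RSS) + k log(n)."""
    return n * math.log(rss) + k * math.log(n)


def mbic(rss, n, p, q, m, exp_p=2.2, exp_q=2.2):
    """Modified BIC, assuming the form given in Bogdan et al. (2004).

    rss: residual sum of squares, n: sample size,
    p, q: number of main and interaction effects in the model,
    m: number of markers,
    exp_p, exp_q: expected numbers of effects E(p), E(q); 2.2 as on the slide.
    """
    n_pairs = m * (m - 1) // 2          # number of possible two-way interactions
    l = m / exp_p
    u = n_pairs / exp_q
    if l <= 1.0 or u <= 1.0:
        raise ValueError("m is too small relative to E(p) or E(q)")
    extra = 2.0 * p * math.log(l - 1.0) + 2.0 * q * math.log(u - 1.0)
    return bic(rss, n, p + q) + extra
```

With no effects in the model the two criteria coincide; every added main effect costs an extra 2 log(l - 1), and every interaction an extra 2 log(u - 1), which is the larger of the two because there are many more candidate interactions than markers.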

  5. Limiting Distribution
     We showed that [formula] has the following property: [formula] with [formula] and error distribution f(x).

     Values for normalisation constant c_e
     for L2: c_e = 1

     Robust mBIC
     In practice, c_e and therefore the error distribution f(x) have to be estimated.
     This leads to a robust version of the mBIC: [formula] with [formula]

     Simulation Setup
     2 chromosomes with 11 markers each (m=22)
     200 individuals (n=200)
     1 additive effect
     1 epistasis effect
     error distributions: Normal, Laplace, Cauchy, Tukey, χ²
     estimators: L2, Huber (k=0.05) ~ L1, Huber (k=1.3), Bisquare, Hampel
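The estimators listed in the simulation setup differ only in the loss function ρ that replaces the squared residual. The sketch below writes out the textbook forms of the Huber, bisquare and Hampel ρ functions. The Huber constants k=0.05 and k=1.3 come from the slides; the bisquare constant c=4.685 and the Hampel constants a=2, b=4, c=8 are illustrative assumptions, since the slides do not state them.

```python
def rho_l2(x):
    """Least-squares loss, for comparison."""
    return 0.5 * x * x


def rho_huber(x, k=1.3):
    """Huber loss: quadratic for |x| <= k, linear beyond."""
    a = abs(x)
    return 0.5 * a * a if a <= k else k * a - 0.5 * k * k


def rho_bisquare(x, c=4.685):
    """Tukey bisquare loss: bounded, constant for |x| >= c (c is an assumed value)."""
    if abs(x) >= c:
        return c * c / 6.0
    u = 1.0 - (x / c) ** 2
    return c * c / 6.0 * (1.0 - u ** 3)


def rho_hampel(x, a=2.0, b=4.0, c=8.0):
    """Hampel three-part loss (a, b, c are assumed values)."""
    t = abs(x)
    if t <= a:
        return 0.5 * t * t
    if t <= b:
        return a * t - 0.5 * a * a
    if t <= c:
        return a * b - 0.5 * a * a + 0.5 * a * ((c - b) ** 2 - (c - t) ** 2) / (c - b)
    return 0.5 * a * (b + c - a)
```

For a very small k the Huber loss is close to k|x|, which is why Huber (k=0.05) behaves approximately like an L1 estimator, while a large k brings it close to L2.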

  6. Simulation Results
     Percentage of correctly identified effects and false discovery rate
     [Figure: percentage (0.0 to 0.6) by error distribution (Normal, Laplace, Cauchy, Tukey, Chisq, Chisq.med) for Huber-0.05, Huber-1.3, Bisquare, Hampel, L2-mBIC and L2-BIC]

     Implementation in R
     • Robust regression using procedure rlm of package MASS
     • program structure:
       – parameter specification
       – generate realisation of genetic setup
       – estimation of error distribution and c_e
       – in each forward step: estimate likelihood for m + m(m-1)/2 models
       – generate output
     • simulations:
       – 1000 replications
       – n=200-500, m=20-120

     References
     • Baierl, A., M. Bogdan, F. Frommlet and A. Futschik, 2006. On locating multiple interacting quantitative trait loci in intercross designs. To appear in Genetics.
     • Bogdan, M., J. K. Ghosh and R. W. Doerge, 2004. Modifying the Schwarz Bayesian information criterion to locate multiple interacting quantitative trait loci. Genetics 167: 989-999.
     • Broman, K. W. and T. P. Speed, 2002. A model selection approach for the identification of quantitative trait loci in experimental crosses. J Roy Stat Soc B 64: 641-656.
     • Jurečková, J. and P. K. Sen, 1996. Robust statistical procedures: asymptotics and interrelations. Wiley, New York.
     • Sen, S. and G. A. Churchill, 2001. A statistical framework for quantitative trait mapping. Genetics 159: 371-387.
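The forward step in the program structure above evaluates m + m(m-1)/2 candidate models, one per main effect and one per two-way interaction. A minimal Python sketch of that enumeration follows (the implementation described in the talk is in R; the function name and the tuple representation are hypothetical, and terms already in the model are not excluded here).

```python
from itertools import combinations


def candidate_terms(m):
    """List all single-term candidates for a forward step with m markers.

    Main effects are 1-tuples (j,), two-way interactions are 2-tuples (u, v)
    with u < v; markers are indexed from 0.
    """
    main_effects = [(j,) for j in range(m)]
    interactions = list(combinations(range(m), 2))
    return main_effects + interactions


# Simulation setup from the slides: m = 22 markers
terms = candidate_terms(22)
print(len(terms))  # 22 main effects + 231 interactions = 253
```

For m = 100 the same count gives 100 + 4950 terms, which is why an efficient search strategy and a correction for multiplicity are needed.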
