
Predicting tissue-specific effects of rare genetic variants



  1. Predicting tissue-specific effects of rare genetic variants. Farhan Damani, Biological Data Sciences 2016

  2. Goal: develop a framework to predict tissue-specific regulatory effects of rare variants

  3. Rare variants are abundant and potentially high-impact • Rare variants defined with minor allele frequency < 1% • Enriched for deleterious functional classes [Figure labels: number of variants, minor allele frequency, DAF, CADD score] Eynard et al. BMC Genetics 2015; Kircher et al. Nature Genetics 2014. Slide: Alexis Battle

  4. Tissue-specific functionality • Understanding tissue-specific consequences of noncoding genetic variation is critical to understanding complex traits [Figure: overlap of functional common variants; labels: tissue type, cell type] Backenroth et al. bioRxiv 2016; Aguet et al. bioRxiv 2016

  5. Challenges • Even fewer reliable labels in the tissue-specific setting • Each individual tissue has a low sample size (RNA-seq) • Limited samples for each rare SNV

  6. GTEx Project Data • WGS from 148 donors • 114 of European ancestry used here • 8555 RNA-seq samples from 44 tissues from 522 donors [Figure labels: 44 tissues; 148 individuals (WGS); 522 individuals (RNA-seq samples)]

  7. Expression outliers • What are expression outliers? • Enrichment of functional variants among outliers. Li et al. The impact of rare variation. bioRxiv http://biorxiv.org/content/early/2016/09/09/074443
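
The slide does not spell out how outliers are called. As a minimal sketch, assuming an outlier is an individual whose expression z-score for a gene exceeds a cutoff in absolute value (the later slides plot gene expression as z-scores), the idea could look like the following. The function names, the input layout, and the cutoff of 2.0 are illustrative assumptions, not values taken from the talk.

```python
from statistics import mean, stdev

def zscores(values):
    """Standardize one gene's expression values across individuals."""
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # no variation, so no outliers
    return [(v - mu) / sd for v in values]

def expression_outliers(expr_by_individual, threshold=2.0):
    """Return {individual: z} for individuals with |z| >= threshold.

    `threshold` is a hypothetical cutoff chosen for illustration.
    """
    ids = list(expr_by_individual)
    z = zscores([expr_by_individual[i] for i in ids])
    return {i: zi for i, zi in zip(ids, z) if abs(zi) >= threshold}

# Toy data for one gene: nine typical individuals and one extreme one.
expr = {f"ind{k}": v for k, v in
        enumerate([10, 11, 9, 10, 10, 11, 9, 10, 10, 30], start=1)}
outliers = expression_outliers(expr)
print(sorted(outliers))
```

In the tissue-specific setting the same computation would be run per tissue, which is where the small per-tissue sample sizes noted earlier make the z-scores noisy.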

  8. Genomic features (1) regulatory elements (2) variant predictor summary statistics - Variant effect predictor - CADD - DANN - …

  9. Genomic features • Tissue-specific promoters/enhancers • Conservation scores • Transcription factor binding sites • CpG sites • ChromHMM. ENCODE Project Consortium. PLoS Biology 2011.

  10. Related work on tissue-shared effects. Li et al. The impact of rare variation. bioRxiv http://biorxiv.org/content/early/2016/09/09/074443

  11. Learning tissue-specific effects as individual tasks [Diagram labels: C, ?, λ1, λ2, λ3, λ4, λ5; tissue groups: Brain, Artery+Fats, Muscles, Epithelial, Digestive]

  12. Learning tissue-specific effects as individual tasks [Diagram labels: C, ?, λ1, λ2, λ3, λ4, λ5; tissue groups: Brain, Artery+Fats, Muscles, Epithelial, Digestive] • Expression outliers are noisier when based on smaller sets of tissues

  13. Graphical model (observed and unobserved nodes) • Boxes represent replicates: M tissues; N individual-by-gene samples [Diagram: nodes g, q, r, e; plates N and M]

  14. Graphical model: sample-level component • Presence of rare regulatory variant • Genomic annotations • Genomic annotations coefficients [Diagram: nodes g, r; plate N]

  15. Graphical model: sample-level component • Leak probability • Gene expression • Presence of common variant • Expression-covariate parameter [Diagram: nodes g, q, r, e; plate N]

  16. Graphical model: tissue-specific influence • Tissue-specific genomic annotations coefficient • Tissue-specific transfer parameter • Global genomic annotations coefficient [Diagram: nodes g, q, r, e; plates N and M]

  17. Graphical model: global influence • Global genomic annotations coefficient • Global transfer parameter [Diagram: nodes g, q, r, e; plates N and M]

  18. Graphical model • We want to infer p(regulatory variant | data) [Diagram: nodes g, q, r, e; plates N and M]
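
The slides name the pieces of the model (genomic annotations, a latent regulatory-variant indicator, presence of a common variant, an expression observation, and a leak probability) but the conditional distributions did not survive in this transcript. The sketch below is therefore an illustration under stated assumptions, not the talk's model: it assumes a logistic prior on the latent indicator `r` given annotations `g`, and a noisy-OR likelihood for a binary outlier status `e` with parents `r` and the common-variant indicator `q`. The names `beta`, `phi_r`, `phi_q`, and `leak` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def noisy_or(parents, strengths, leak):
    """P(e = 1 | parents) under a noisy-OR with a leak term (assumed form)."""
    p_off = 1.0 - leak
    for on, s in zip(parents, strengths):
        if on:
            p_off *= 1.0 - s
    return 1.0 - p_off

def posterior_regulatory(g, beta, e, q, phi_r, phi_q, leak):
    """p(r = 1 | g, q, e), computed by enumerating the two states of r."""
    prior = sigmoid(sum(b * x for b, x in zip(beta, g)))  # assumed logistic prior
    lik = {}
    for r in (0, 1):
        p_e1 = noisy_or((r, q), (phi_r, phi_q), leak)
        lik[r] = p_e1 if e == 1 else 1.0 - p_e1
    num = prior * lik[1]
    return num / (num + (1.0 - prior) * lik[0])

# With a flat prior of 0.5, an observed outlier raises the posterior,
# and a nearby common variant partly explains the outlier away.
p_outlier = posterior_regulatory([1.0], [0.0], e=1, q=0, phi_r=0.8, phi_q=0.3, leak=0.05)
p_explained = posterior_regulatory([1.0], [0.0], e=1, q=1, phi_r=0.8, phi_q=0.3, leak=0.05)
print(round(p_outlier, 3), round(p_explained, 3))
```

Because `r` is a single binary variable per sample, this enumeration is exact, which is the property an exact-inference E-step relies on.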

  19. Objective function

  20. Objective function

  21. Hyperparameter setting • (transfer parameters) Bootstrap estimation • (leak probability) Categorical distribution

  22. Optimizing the objective using EM • Expectation step: exact inference • Maximization step: coordinate gradient descent; noisy-OR update
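
The objective itself is not recoverable from this transcript, so the following is only a toy sketch of the EM pattern named on the slide, for a deliberately simplified single-tissue model. It assumes a logistic prior on a latent indicator `r`, a noisy-OR link from `r` to a binary outlier status `e` with a fixed leak, an L2 penalty on the coefficients, and a fixed step size. All names (`beta`, `phi`, `leak`, `step`, `l2`) are hypothetical, and the multi-task transfer parameters are not modeled here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def prior(beta, g):
    """p(r = 1 | g) under a logistic model (assumed form)."""
    return sigmoid(sum(b * x for b, x in zip(beta, g)))

def e_step(G, E, beta, phi, leak):
    """Exact posterior p(r = 1 | g, e) per sample; r is binary, so enumerate."""
    p_on = 1.0 - (1.0 - leak) * (1.0 - phi)  # noisy-OR: p(e = 1 | r = 1)
    post = []
    for g, e in zip(G, E):
        pr = prior(beta, g)
        l1 = p_on if e else 1.0 - p_on
        l0 = leak if e else 1.0 - leak
        post.append(pr * l1 / (pr * l1 + (1.0 - pr) * l0))
    return post

def m_step(G, E, post, beta, leak, step=0.5, l2=0.1):
    """Coordinate-wise gradient step on beta, then a noisy-OR update for phi."""
    beta = list(beta)
    n = len(G)
    for j in range(len(beta)):
        # Gradient of the expected log-prior w.r.t. beta[j], with an L2 penalty.
        grad = sum((p - prior(beta, g)) * g[j] for g, p in zip(G, post)) / n
        beta[j] += step * (grad - l2 * beta[j])
    # Posterior-weighted estimate of p(e = 1 | r = 1), mapped back to phi.
    a = sum(p * e for p, e in zip(post, E)) / sum(post)
    a = min(max(a, leak), 1.0 - 1e-6)
    phi = 1.0 - (1.0 - a) / (1.0 - leak)
    return beta, phi

def run_em(G, E, leak=0.05, iters=20):
    beta, phi = [0.0] * len(G[0]), 0.5
    for _ in range(iters):
        post = e_step(G, E, beta, phi, leak)
        beta, phi = m_step(G, E, post, beta, leak)
    return beta, phi, e_step(G, E, beta, phi, leak)

# Toy data: an intercept plus one annotation; outliers co-occur with high scores.
G = [[1.0, 1.0], [1.0, 0.9], [1.0, 0.1], [1.0, 0.0], [1.0, 0.8], [1.0, 0.2]]
E = [1, 1, 0, 0, 1, 0]
beta, phi, posteriors = run_em(G, E)
print([round(p, 2) for p in posteriors])
```

The gradient step makes this a generalized EM: each M-step moves the coefficients uphill rather than solving for them exactly.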

  23. Results

  24. Allelic imbalance presents strong evidence for regulatory variation • Strong evidence of causal cis-regulatory impact • Almost all rare variants in our cohort are heterozygous • Battle et al. Genome Research 2013 • Zhang et al. Nature Methods 2009: “we found that the variation of allelic ratios in gene expression among different cell lines was primarily explained by genetic variations…” • Yan et al. Science 2002: “We estimated that this approach could confidently identify variations when the differences between expression of the two alleles differed by more than 20%.”
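
The talk does not say how allelic imbalance was quantified. As a minimal sketch, assuming read counts for the two alleles at a heterozygous site are available, one common summary is the distance of the reference-allele ratio from 0.5, optionally paired with an exact two-sided binomial test against equal expression. The function names and inputs below are illustrative assumptions.

```python
from math import comb

def allelic_imbalance(ref_reads, alt_reads):
    """|ref / (ref + alt) - 0.5|: 0 means balanced, 0.5 means one allele only."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return abs(ref_reads / total - 0.5)

def binomial_p_value(ref_reads, alt_reads):
    """Exact two-sided binomial test of the null p(ref) = 0.5.

    Sums the probability of every outcome that is no more likely than the
    observed one; integer arithmetic keeps the comparison exact.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads)
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n

print(allelic_imbalance(30, 10), binomial_p_value(30, 10))
```

Heterozygosity matters here because both alleles must be present in the same individual for their read counts to be compared.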

  25. Posteriors are predictive of allelic imbalance

  26. [Figure panels: Muscle, Brain]

  27. [Figure panels: Artery+Fats, Epithelial]

  28. [Figure panel: Digestive]

  29. Our predictions are also confident

  30. Rare regulatory variant nearby GCAT [Figure labels: P(regulatory variant | data); allelic imbalance at the 24.75 percentile and the 91.2 percentile; Brain, Muscle]

  31. Conclusion • We developed a framework for regulatory rare variant prediction • We compared our predictions to measured allelic imbalance • Presents an opportunity for researchers with WGS and (limited) RNA-seq to reliably identify functional rare variants

  32. Thank you! • Battle Lab: Yungil Kim, Ben Strober, Alexis Battle • Montgomery Lab: Xin Li, Joe Davis, Emily Tsang, Zachary Zappala, Stephen Montgomery • GTEx Consortium, Pistritto Fellowship, NIH, NIMH, Searle Scholar Program

  33. Tissue groups with similar behavior

  34. Case 1: Extreme expression across tissues [Figure axes: gene expression (z-score), tissue type]

  35. Model predictions: p(regulatory variant | data) [Figure labels: Multi-task: Brain; Multi-task: Not Brain; RIVER; Shared Logistic Regression]

  36. Case 2: Extreme expression in brain tissues [Figure axes: gene expression (z-score), tissue type]

  37. Model predictions: p(regulatory variant | data) [Figure labels: Multi-task: Brain; RIVER; Multi-task: Not Brain; Shared Logistic Regression]
