Genome-wide analysis of DNA methylation in samples from the Genotype-Tissue Expression (GTEx) project


  1. Genome-wide analysis of DNA methylation in samples from the Genotype-Tissue Expression (GTEx) project. Peter Hickey (@PeteHaitch). Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health; Single Cell Open Research Endeavour (SCORE), Walter and Eliza Hall Institute of Medical Research. Slides: www.bit.ly/AGTA2018

  2. GTEx to eGTEx via a ‘pilot’ study

  3. The Genotype-Tissue Expression (GTEx) project is an ongoing effort to build a comprehensive public resource to study [human] tissue-specific gene expression and regulation. - GTEx Consortium, 2015, Science 348, 648–660

  4. [eGTEx] extends the GTEx project to combine gene expression with additional intermediate molecular measurements on the same tissues. - eGTEx Project, 2017, Nat. Genet. 49, 1664–1670

  5. Hmm, this eGTEx study is gonna be huge. And the human brain is hella cool. Let’s do a pilot study. - Artist’s impression of conversation in Hansen and Feinberg labs, c. 2015

  6. BrainEpigenome (the ‘pilot’ study). Rizzardi, L.*, Hickey, P.F.*, et al. Neuronal brain region-specific DNA methylation and chromatin accessibility are associated with neuropsychiatric disease heritability. bioRxiv (2017), doi:10.1101/120386 (in press, Nature Neuroscience). UCSC Track Hub: www.bit.ly/BrainEpigenomeHub

  7. Map of human brain methylome was limited (c. 2015)
     • Little whole genome bisulfite sequencing (WGBS) data
     • Few (if any) biological replicates
     • Mostly bulk tissue samples
     • Few brain region-specific differentially methylated regions (DMRs) 1,2
     1 Davies, M. N. et al. Functional annotation of the human brain methylome identifies tissue-specific epigenetic variation across brain and blood. Genome Biol. 13, R43 (2012).
     2 Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
     http://epigenomesportal.ca/ihec/grid.html (Build: 2017-10)

  8. A good map requires biological replicates, multiple brain regions, and multiple cell types
     [Figure: study design, samples arranged by tissue and donor.]
     Assays: WGBS (bulk), n = 27; WGBS (NeuN sorted), n = 45; ATAC-seq (NeuN sorted), n = 22; RNA-seq (NeuN sorted), n = 20.
     Tissues: BRNCTXB (frontal cortex), BRNACC (anterior cingulate cortex), BRNHPP (hippocampus), BRNNCC (nucleus accumbens).

  9. Bulk tissue samples are uninformative for brain region-specific mCG due to variation of neuronal proportion in sampled tissue
     [Figure; axis: Tissue.]

  10. Let’s try fluorescence activated nuclei sorting (FANS)
      [Figure: study design, samples arranged by tissue and donor.]
      Assays: WGBS (bulk), n = 27; WGBS (NeuN+, NeuN-), n = 45; ATAC-seq (NeuN sorted), n = 22; RNA-seq (NeuN sorted), n = 20.
      Tissues: BRNCTXB (frontal cortex), BRNACC (anterior cingulate cortex), BRNHPP (hippocampus), BRNNCC (nucleus accumbens).

  11. And let’s do some more assays
      [Figure: study design, samples arranged by tissue and donor.]
      Assays: WGBS (bulk), n = 27; WGBS (NeuN+, NeuN-), n = 45; ATAC-seq (NeuN+, NeuN-), n = 22; RNA-seq (NeuN+, NeuN-), n = 20.
      Tissues: BRNCTXB (frontal cortex), BRNACC (anterior cingulate cortex), BRNHPP (hippocampus), BRNNCC (nucleus accumbens).

  12. FANS & WGBS reveals brain region-specificity of mCG in NeuN+ (but not NeuN-) samples
      CG-DMRs by comparison:
      • NeuN+ vs. NeuN-: n = 100,875 *; size = 70.0 Mb
      • NeuN+: n = 13,074 *; size = 11.9 Mb
      • NeuN-: n = 114 *; size = 0.1 Mb
      * 21,802 novel DMRs

  13. NeuN+ samples: mCH shows limited strand specificity, ‘tracks’ mCG, and can be used to identify CH-DMRs
      [Figure, left: methylation tracks at the GABBR2 locus, chr9:101,348,685-101,404,045 (width = 55,361, extended = 15,000), showing mCG (S), mCG (L), mCA (+), mCA (−), mCT (+), mCT (−) and CH-DMRs.]
      [Figure, right: PCA of NeuN+ mCH (1 kb bins); PC1 (22.5%) vs. PC2 (8.0%). Points are coloured by region: BRNCTXB (BA9), BRNACC (BA24), BRNHPP (HC), BRNNCC (NAcc); and labelled by context and strand: A = mCA (+), a = mCA (−), T = mCT (+), t = mCT (−).]
      CH-DMRs:
      • NeuN+: n = 15,029 +; size = 39.6 Mb ++
      + Before merging across strand and context
      ++ After merging across strand and context

  14. Enrichment of DMRs over genomic features
      [Figure: heatmap of log2(OR), colour scale −4 to 4. Columns: CH-DMRs (NeuN+), CG-DMRs (NeuN+), CG-DMRs (POS; union). Rows: OCR (union), H3K27ac, FANTOM5, DEGs, DEG promoters, intronic, Shelves, exonic, three_utr, Shores, promoter, CGI, OpenSea, SINE, DNA, Simple_repeat, Low_complexity, five_utr, intergenic, LTR, LINE, Satellite.]

  15. Enrichment of DMRs over genomic features (same heatmap, annotated)
      • CG-DMRs and CH-DMRs co-occur
      • CG-DMRs are enhancer-centric
      • CH-DMRs are enriched over …

  16. Enrichment of DMRs over genomic features (same heatmap, annotated)
      • CG-DMRs and CH-DMRs co-occur
      • CG-DMRs are enhancer-centric
      • CH-DMRs are enriched over genes (D…

  17. Enrichment of DMRs over genomic features (same heatmap, annotated)
      • CG-DMRs and CH-DMRs co-occur
      • CG-DMRs are enhancer-centric
      • CH-DMRs are DEG-centric
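
The enrichment heatmap reports log2 odds ratios (log2(OR)) per genomic feature. The slides do not say how the odds ratios were computed, so the following is only a minimal sketch of one common approach: a 2x2 table of units (for example CpGs) inside or outside DMRs versus inside or outside a feature. The function name, the pseudocount, and all counts are hypothetical and chosen purely for illustration.

```python
import math


def log2_odds_ratio(dmr_in, dmr_out, bg_in, bg_out, pseudocount=0.5):
    """Return log2(OR) for a 2x2 overlap table.

    dmr_in  : units (e.g. CpGs) inside DMRs and inside the feature
    dmr_out : units inside DMRs but outside the feature
    bg_in   : units outside DMRs but inside the feature
    bg_out  : units outside DMRs and outside the feature

    A pseudocount is added to every cell so that empty cells do not
    produce a division by zero (an assumption of this sketch).
    """
    a = dmr_in + pseudocount
    b = dmr_out + pseudocount
    c = bg_in + pseudocount
    d = bg_out + pseudocount
    return math.log2((a * d) / (b * c))


# Hypothetical counts, not taken from the talk.
counts = {
    "enhancer-like": (400, 600, 1000, 9000),
    "satellite": (5, 995, 200, 9800),
}

log2_or = {feature: log2_odds_ratio(*cells) for feature, cells in counts.items()}

for feature, value in log2_or.items():
    direction = "enriched" if value > 0 else "depleted"
    print(f"{feature}: log2(OR) = {value:+.2f} ({direction})")
```

A positive value means the feature is over-represented inside DMRs relative to the background, a negative value means it is depleted, and zero means no association.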

  18. CG-DMRs in NeuN+ samples are enriched for GWAS heritability of neuropsychiatric traits
      Stratified linkage disequilibrium score regression *
      • 27 ‘brain-linked’ traits (e.g., schizophrenia, ADHD)
      • 3 ‘non-brain-linked’ traits (e.g., height)
      * Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. (2015), doi: 10.1038/ng.3404

  19. eGTEx (work in progress). eGTEx Project. Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nature Genetics (2017), doi: 10.1038/ng.3969

  20. eGTEx study design. eGTEx Project. Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nature Genetics (2017), doi: 10.1038/ng.3969

  21. Re-wrote bsseq to process and analyse eGTEx-sized (and bigger) datasets
      • Processed data is too large to store and operate on in-memory (10s – 100s of GB)
      • Data stored on-disk in HDF5 file
      • Improved parallelization of key steps
        • Importing files
        • Smoothing
        • DMR calling
        • Permutation testing
      • Available through Bioconductor
        • https://bioconductor.org/packages/bsseq/
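
The idea behind keeping processed data on disk is that only one block of the matrix needs to be in memory at any time. bsseq itself is an R/Bioconductor package backed by HDF5; the sketch below is not its implementation. It is a standard-library Python analogy in which a flat binary file stands in for the HDF5 file, rows are loci, columns are samples, and the function names `write_matrix` and `blockwise_row_means` are hypothetical.

```python
import os
import tempfile
from array import array


def write_matrix(path, rows):
    """Write rows of floats to a flat binary file (row-major, float64)."""
    with open(path, "wb") as fh:
        for row in rows:
            array("d", row).tofile(fh)


def blockwise_row_means(path, n_cols, block_rows=2):
    """Compute per-row means while holding at most block_rows rows in memory."""
    means = []
    with open(path, "rb") as fh:
        while True:
            block = array("d")
            try:
                block.fromfile(fh, block_rows * n_cols)
            except EOFError:
                # Fewer items than requested were left in the file;
                # the items that were read are still kept in `block`.
                pass
            if not block:
                break
            for start in range(0, len(block), n_cols):
                row = block[start:start + n_cols]
                means.append(sum(row) / len(row))
    return means


# Toy methylation proportions: 5 loci (rows) x 3 samples (columns).
toy = [
    [0.1, 0.9, 0.5],
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0],
    [0.2, 0.4, 0.6],
    [0.8, 0.8, 0.8],
]

with tempfile.TemporaryDirectory() as tmpdir:
    matrix_path = os.path.join(tmpdir, "methylation.bin")
    write_matrix(matrix_path, toy)
    means = blockwise_row_means(matrix_path, n_cols=3, block_rows=2)

print([round(m, 3) for m in means])
```

Because each block is independent, the same loop structure is also what makes steps such as smoothing or permutation testing straightforward to distribute across workers, one block (or chromosome) per task.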

  22. mCG distinguishes eGTEx samples by tissue

  23. eGTEx NeuN+ samples are (mostly) consistent with BrainEpigenome NeuN+ samples

  24. eGTEx NeuN+ samples are (mostly) consistent with BrainEpigenome NeuN+ samples

  25. eGTEx NeuN+ samples are (mostly) consistent with BrainEpigenome NeuN+ samples

  26. 5-group: 16x as many CG-DMRs in eGTEx NeuN+ samples as in BrainEpigenome NeuN+ samples
      CG-DMRs by comparison:
      • 5-group: n = 181,146; size = 196.9 Mb

  27. Basal ganglia: Discover 2x as many CG-DMRs in eGTEx NeuN+ samples as in BrainEpigenome NeuN+ samples
      CG-DMRs by comparison:
      • 5-group: n = 181,146; size = 196.9 Mb
      • Basal ganglia: n = 16,866; size = 24.0 Mb

  28. Hippocampus: What the hell is going on?
      CG-DMRs by comparison:
      • 5-group: n = 181,146; size = 196.9 Mb
      • Basal ganglia: n = 16,866; size = 24.0 Mb
      • Hippocampus: n = 11,702; size = 24.4 Mb
