Shiny-phyloseq: Web Application for Interactive Microbiome Analysis with Provenance Tracking
Paul J. McMurdie, Research Associate, Prof. Susan Holmes Group, Statistics Department, Stanford University
Overview
• Intro to Microbiome Research
• phyloseq: a microbiome BioC package
• (RNA-Seq methods solve a microbiome problem)
• Shiny-phyloseq: a shiny interface to phyloseq
What are microbes? Cell structure (they don’t all look like this)
What are microbes? Ancestry of Life: Eukaryota, Bacteria, Archaea
http://en.wikipedia.org/wiki/Tree_of_life_(biology)
What is a microbiome? The totality of microbes in a defined environment, especially their genomes and interactions with each other and the surrounding environment.
• A population of a single species/strain is a culture: extremely rare outside of the lab, apart from some infections
• A microbiome is a mixed population of different microbial species (a microbial ecosystem)
Why study microbiomes?
• Earth Microbiome Project: oceans, soils, waterways
• Wastewater Treatment
• Human Microbiomes
• Cow Rumen
• Deep-Sea Hydrothermal Vent
Human Body Sites, HMP
• >10 times more microbial cells than human cells
• The entire human microbiome weighs less than 2 kg
Fecal Transplants (Clostridium difficile infection)
Borody, et al. (2011) Nature Rev Gastroenterology & Hepatology
Why is microbiome research new?
Bias for cultivable microbes, especially pathogens
• Culture-based methods fail to detect most microbes
• Microbes are easy to miss (except pathogens)
• Most microbes are NOT pathogens (even the human-associated ones)
Availability of tools limited to the last 3 decades
• PCR, fast & cheap DNA sequencing, microarrays, etc.
• Discovery of culture-independent techniques: 16S rRNA
How do we query microbiomes? 16S rRNA [Figures: ribosome; ribosome in action]
How do we query microbiomes?
• Universal (e.g. 16S rRNA) gene census
• Shotgun Metagenome Sequencing
• Transcriptomics (shotgun mRNA)
• Proteomics (protein fragments)
• Metabolomics (excreted chemicals)
[Figure: Number of Microbial Species Counted]
Microbiome data heterogeneity and processing
Paul J. McMurdie, Statistics Department & CEHG, Stanford University, with contributions from Prof. Susan Holmes
[Diagram: microbiome samples; amplify 16S rRNA (barcoded); demultiplex and species clustering]
phyloseq data structure & API
Data import (source, import function, component, class):
• matrix, otu_table: OTU Abundance (otu_table)
• data.frame, sample_data: Sample Variables (sample_data)
• matrix, tax_table: Taxonomy Table (taxonomyTable)
• ape package (read.tree, read.nexus), read_tree: Phylogenetic Tree (phylo)
• Biostrings package (DNAStringSet, RNAStringSet, AAStringSet): Reference Seq. (XStringSet)
Experiment Data (some components optional):
• Class: phyloseq
• Constructor: phyloseq
• Slots: otu_table, sam_data, tax_table, phy_tree, refseq
Accessors: otu_table, sample_data, tax_table, phy_tree, refseq, get_taxa, get_samples, get_variable, nsamples, ntaxa, rank_names, sample_names, sample_sums, sample_variables, taxa_names, taxa_sums
Processors: filter_taxa, merge_phyloseq, merge_samples, merge_taxa, prune_samples, prune_taxa, subset_taxa, subset_samples, tip_glom, tax_glom
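A minimal sketch of the constructor and a few of the accessors and processors named on this slide. The toy counts, taxonomy, and sample variables below are invented purely for illustration and are not from the talk; only the phyloseq function names come from the slide.

```r
library(phyloseq)

# Toy OTU abundance matrix (4 OTUs x 3 samples); values are made up.
otu_mat <- matrix(
  c(10, 0, 5,
     3, 8, 0,
     0, 2, 7,
     1, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:3))
)

# Toy taxonomy table (character matrix), one row per OTU.
tax_mat <- matrix(
  c("Bacteria", "Bacteroidetes",
    "Bacteria", "Bacteroidetes",
    "Bacteria", "Firmicutes",
    "Archaea",  "Euryarchaeota"),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), c("Kingdom", "Phylum"))
)

# Toy sample variables (data.frame), one row per sample.
sam_df <- data.frame(
  SampleType      = c("Feces", "Soil", "Ocean"),
  HumanAssociated = c(TRUE, FALSE, FALSE),
  row.names       = paste0("S", 1:3)
)

# Build the experiment-level object from its components.
ps <- phyloseq(
  otu_table(otu_mat, taxa_are_rows = TRUE),
  tax_table(tax_mat),
  sample_data(sam_df)
)

# Accessors
ntaxa(ps)        # number of OTUs
nsamples(ps)     # number of samples
rank_names(ps)   # taxonomic ranks present
sample_sums(ps)  # total counts per sample

# Processor: merge OTUs that share the same Phylum.
ps_phylum <- tax_glom(ps, taxrank = "Phylum")
ntaxa(ps_phylum)
```

No tree or reference sequences are supplied here; the object is built from the three tabular components only.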
phyloseq work flow
• Input: OTU cluster output, sample data
• Import: import_biom, import_mothur, import_pyrotagger, import_qiime, import_RDP (result: phyloseq, raw)
• Preprocessing: filter_taxa, filterfun_sample, genefilter_sample, prune_taxa, prune_samples, subset_taxa, subset_samples, transform_sample_counts (result: phyloseq, processed)
• Direct Plots: plot_richness, plot_tree, plot_bar
• Summary / Exploratory Graphics: distance, ordinate, plot_ordination, plot_network, plot_heatmap
• Inference, Testing: bootstrap, permutation tests, regression, discriminant analysis, multiple testing, gap statistic, clustering, procrustes
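The preprocessing and exploratory-graphics stages of this workflow can be sketched as follows. This is an illustrative pass, not the analysis from the talk: it assumes the GlobalPatterns example dataset bundled with phyloseq, and the filter threshold and the choice of Bray-Curtis distance are arbitrary assumptions.

```r
library(phyloseq)

# Example dataset shipped with phyloseq (stands in for the import step).
data(GlobalPatterns)

# Preprocessing: keep OTUs with more than 5 counts in over 20% of samples.
# The threshold is an arbitrary choice for illustration.
gp <- filter_taxa(
  GlobalPatterns,
  function(x) sum(x > 5) > 0.2 * length(x),
  prune = TRUE
)

# Preprocessing: convert counts to relative abundance within each sample.
gp_rel <- transform_sample_counts(gp, function(x) x / sum(x))

# Summary / exploratory: NMDS ordination on Bray-Curtis distances.
ord <- ordinate(gp_rel, method = "NMDS", distance = "bray")

# Graphics: plot_ordination returns a ggplot object.
p <- plot_ordination(gp_rel, ord, color = "SampleType")
print(p)
```

Because the plot functions return ggplot objects, further layers or themes from ggplot2 can be added to `p` before printing.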
phyloseq graphics [Figure: six example panels]
• plot_ordination(): NMDS, wUF; axes NMDS1 and NMDS2, colored by SampleType
• plot_heatmap(): bray-curtis, NMDS; OTU by SampleType, shaded by Abundance
• plot_network(): Enterotype data, bray-curtis, max.dist=0.25; legend SeqTech (Illumina, Pyro454, Sanger) and Enterotype
• plot_tree(): Bacteroidetes-only, merged samples, tip_glom=0.1; legend SampleType, Abundance, Order
• plot_bar(): Bacteroidetes-only; Abundance by SampleType, filled by Family
• plot_richness(): S.obs, S.chao1, S.ACE; Number of OTUs by Human Associated Samples (FALSE/TRUE), colored by SampleType
Side Note: BioC tools for microbiome
edgeR, DESeq(2), metagenomeSeq perform better than popular alternatives in differential abundance detection:
McMurdie and Holmes (2014) PLoS Comp Biol, DOI: 10.1371/journal.pcbi.1003531
http://joey711.github.io/waste-not-supplemental/
[Figure: gene counts (genes by samples) shown alongside species counts (species by samples)]
Acknowledgements
• Susan Holmes: Postdoc Advisor, Mentor, Co-author
• Holmes Group: Helpful advice and feedback
• Wolfgang Huber: Helpful advice and feedback re: DESeq(2)
• BioC and CRAN: Support, Feedback, Distribution of phyloseq and biom
• RStudio: Shiny, RStudio IDE
• Hadley Wickham: ggplot2, reshape2, plyr R packages
Shiny-phyloseq Live Demo
How to Run:
install.packages("shiny")
shiny::runGitHub("shiny-phyloseq", "joey711")
http://joey711.github.io/shiny-phyloseq/
End. Questions?