Cbio 16S analysis pipeline
Katie Lennard
Microbiome analysis workflow
• Data preprocessing (UCT High Performance Cluster)
• Import data into R
• Exploratory analyses
  • Summary barplots
  • Beta diversity: NMDS/PCoA
  • Annotated heatmaps
• Downstream analyses
  • Differential abundance testing
  • Correlation analyses
  • Unsupervised classification
  • Biomarker discovery: random forests
Customized .R script to make your life easier
• Convert from phyloseq object to metagenomeSeq object
• Get the lowest available taxonomic annotation for each OTU and merge counts at this level
• Heatmap (using NMF package) customized for phyloseq objects
  • Can easily specify a subset of taxa and/or samples to plot
  • Select annotation colours
  • Select distance function for clustering
  • Choose to merge taxa at a given level (e.g. Genus) or plot individual OTUs
• Generic barplot function built on phyloseq plot_bar()
  • Specify subset of samples
  • Filter OTUs so very rare ones (that just clog up the legend) are excluded
  • Merge at any taxonomic level (Family, Genus, etc.)
• Differential abundance testing + heatmap of significant results
  • Built around metagenomeSeq's fitZig() and MRfulltable() functions
  • NB: currently only set up for two-class categorical comparisons
• Correlations testing + correlation plot of significant results
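To make the "lowest available taxonomic annotation" step concrete, here is a minimal, language-neutral sketch in Python. It is not the R script itself: the function names, the rank list, and the toy OTU table are all illustrative assumptions. Each OTU is labelled with its deepest non-missing rank, and counts of OTUs sharing a label are summed per sample.

```python
from collections import defaultdict

# Assumed rank order, from broadest to most specific (illustrative only).
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]


def lowest_annotation(taxonomy):
    """Return the deepest rank that has a name, e.g. 'Genus:Bacteroides'."""
    for rank in reversed(RANKS):
        name = taxonomy.get(rank)
        if name:  # skips missing ranks, None and empty strings
            return f"{rank}:{name}"
    return "Unclassified"


def merge_counts(otu_counts, otu_taxonomy):
    """Sum per-sample counts of all OTUs that share a lowest annotation."""
    merged = defaultdict(lambda: defaultdict(int))
    for otu, sample_counts in otu_counts.items():
        label = lowest_annotation(otu_taxonomy.get(otu, {}))
        for sample, count in sample_counts.items():
            merged[label][sample] += count
    return {label: dict(samples) for label, samples in merged.items()}


# Toy data (hypothetical OTU IDs, sample IDs and counts).
otu_taxonomy = {
    "OTU_1": {"Family": "Bacteroidaceae", "Genus": "Bacteroides"},
    "OTU_2": {"Family": "Bacteroidaceae", "Genus": "Bacteroides"},
    "OTU_3": {"Family": "Lachnospiraceae", "Genus": None},  # no genus call
}
otu_counts = {
    "OTU_1": {"S1": 10, "S2": 0},
    "OTU_2": {"S1": 5, "S2": 7},
    "OTU_3": {"S1": 2, "S2": 3},
}

merged = merge_counts(otu_counts, otu_taxonomy)
for label, samples in merged.items():
    print(label, samples)
```

In this toy table, OTU_1 and OTU_2 collapse into a single genus-level row, while OTU_3 falls back to its family because no genus was assigned.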
Customized .R script to make your life easier (continued)
• For PICRUSt data: takes the output from PICRUSt's metagenome_contributions.py, together with the taxonomic annotation for the OTUs in this table, and summarizes the contribution of each taxon (Family, Genus, etc.) to ONE SPECIFIC KEGG gene, e.g. K02030
• Random forests analysis on the OTU table of a supplied phyloseq object
  • The data are randomly divided into a training set (two thirds of the data) and a test set (the remaining third, not used for training)
  • Results are printed to screen and written to file, including: most important taxa, AUC, PPV, NPV, OOB errors, class errors
  • Option to specify the top N taxa to see how they perform
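The two-thirds/one-third split and two of the reported metrics (PPV and NPV) can be sketched as follows. This is a rough Python illustration under stated assumptions, not the script's R code. The helper names, the sample IDs, and the "case"/"control" labels are hypothetical, and no random forest is fitted here: the predictions are made up purely to show how the metrics are computed on the held-out test set.

```python
import random


def split_two_thirds(sample_ids, seed=0):
    """Randomly assign two thirds of the samples to training, the rest to test."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = (2 * len(ids)) // 3
    return ids[:n_train], ids[n_train:]


def ppv_npv(true_labels, predicted_labels, positive):
    """Positive and negative predictive value from paired labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical sample IDs.
samples = [f"S{i}" for i in range(1, 16)]
train, test = split_two_thirds(samples)
print(len(train), "training samples;", len(test), "test samples")

# Made-up test-set labels and predictions, for illustration only.
truth = ["case", "case", "control", "control", "control"]
pred = ["case", "control", "control", "control", "case"]
ppv, npv = ppv_npv(truth, pred, positive="case")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Keeping the test samples out of training is what makes the test-set metrics an honest estimate. OOB error, by contrast, is estimated within the training set from the samples each tree did not see.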
Random Forests output example
The 16S accreditation dataset: first look
• Number of OTUs: 181 (140 retained after filtering)
• Number of samples: 15
• Sample data summary (columns = Treatment; rows = Dog):

       0  1  2  3  4
    B  1  1  1  1  1
    G  1  1  1  1  1
    K  1  1  1  1  1
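A Dog-by-Treatment summary like the one above is a cross-tabulation of the sample metadata. Here is a small Python sketch of that tabulation. It assumes, as the summary shows, one sample per dog per treatment. The metadata is reconstructed from the dog and treatment labels only, because the real sample IDs are not given here.

```python
from collections import Counter

dogs = ["B", "G", "K"]
treatments = [0, 1, 2, 3, 4]

# One (Dog, Treatment) record per sample: 3 dogs x 5 treatments.
sample_data = [(dog, treatment) for dog in dogs for treatment in treatments]

# Count samples in each Dog x Treatment cell.
counts = Counter(sample_data)

# Print the table with treatments as columns and dogs as rows.
print("   " + "  ".join(str(t) for t in treatments))
for dog in dogs:
    print(f"{dog}  " + "  ".join(str(counts[(dog, t)]) for t in treatments))
```

A count of 1 in every cell means the design is balanced but has no replicates within a dog-treatment combination, which is worth bearing in mind when choosing downstream tests.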