
Cbio 16S analysis pipeline (Katie Lennard)


  1. Cbio 16S analysis pipeline Katie Lennard

  2. Microbiome analysis workflow: data preprocessing (UCT High Performance Cluster)

  3. Microbiome analysis workflow: import data into R

  4. Microbiome analysis workflow, exploratory analyses: summary barplots

  5. Microbiome analysis workflow, exploratory analyses: beta diversity (NMDS/PCoA)
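The beta diversity step can be illustrated with a minimal base R sketch. The slides do not say which distance is used, so Bray-Curtis is an assumption here, and the count matrix is made up. PCoA is computed with `cmdscale()` (classical multidimensional scaling); NMDS would need an additional package and is not shown.

```r
# Toy OTU count matrix: rows are samples, columns are OTUs (made-up values).
counts <- matrix(
  c(10, 0, 5,
     8, 1, 6,
     0, 9, 2,
     1, 7, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("S", 1:4), paste0("OTU", 1:3))
)

# Convert to relative abundances so sequencing depth does not dominate.
rel <- counts / rowSums(counts)

# Bray-Curtis dissimilarity between two abundance vectors (assumed metric).
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Fill the full pairwise dissimilarity matrix.
n <- nrow(rel)
bc <- matrix(0, n, n, dimnames = list(rownames(rel), rownames(rel)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    bc[i, j] <- bray_curtis(rel[i, ], rel[j, ])
  }
}

# PCoA: classical multidimensional scaling of the dissimilarity matrix.
pcoa <- cmdscale(as.dist(bc), k = 2)
print(round(pcoa, 3))
```

Samples with similar community composition (here S1 and S2, or S3 and S4) should land close together on the first axis.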

  6. Microbiome analysis workflow, exploratory analyses: annotated heatmaps

  7. Microbiome analysis workflow, downstream analyses: differential abundance testing

  8. Microbiome analysis workflow, downstream analyses: correlations

  9. Microbiome analysis workflow, downstream analyses: unsupervised classification

  10. Microbiome analysis workflow, downstream analyses: biomarker discovery (random forests)

  11. Customized .R script to make your life easier
      • Convert from a phyloseq object to a metagenomeSeq object
      • Get the lowest available taxonomic annotation for each OTU and merge counts at this level
      • Heatmap (using the NMF package) customized for phyloseq objects
          • Can easily specify a subset of taxa and/or samples to plot
          • Select annotation colours
          • Select the distance function for clustering
          • Choose to merge taxa at a given level (e.g. Genus) or plot individual OTUs
      • Generic barplot function built on phyloseq's plot_bar()
          • Specify a subset of samples
          • Filter OTUs so that very rare ones (which just clog up the legend) are excluded
          • Merge at any taxonomic level (Family, Genus, etc.)
      • Differential abundance testing + heatmap of significant results
          • Built around metagenomeSeq's fitZig() and MRfulltable() functions
          • NB: currently only set up for two-class categorical comparisons
      • Correlations testing + correlation plot of significant results
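The "lowest available taxonomic annotation" step can be sketched in base R as follows. This is only an illustration under stated assumptions, not the script from the slides: the taxonomy and count tables are made up, unassigned ranks are assumed to be `NA`, and the name `lowest_annotation` is hypothetical.

```r
# Toy taxonomy table: one row per OTU, NA where a rank is unassigned.
tax <- matrix(
  c("Firmicutes",    "Lachnospiraceae", "Blautia",
    "Firmicutes",    "Lachnospiraceae", NA,
    "Bacteroidetes", NA,                NA,
    "Firmicutes",    "Lachnospiraceae", "Blautia"),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), c("Phylum", "Family", "Genus"))
)

# Toy counts: rows are OTUs, columns are samples.
counts <- matrix(
  c(5, 2,
    1, 0,
    3, 4,
    7, 1),
  ncol = 2, byrow = TRUE,
  dimnames = list(rownames(tax), c("S1", "S2"))
)

# For each OTU, take the deepest rank that is not NA.
lowest_annotation <- apply(tax, 1, function(ranks) {
  assigned <- ranks[!is.na(ranks)]
  if (length(assigned) == 0) "Unclassified" else unname(assigned[length(assigned)])
})

# Sum the counts of OTUs that share the same lowest annotation.
merged <- rowsum(counts, group = lowest_annotation)
print(merged)
```

Here OTU1 and OTU4 are both annotated to genus Blautia and are merged, while OTU2 and OTU3 keep their Family-level and Phylum-level labels. Real taxonomy tables may mark unassigned ranks differently (for example with empty strings), which would need handling before this step.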

  12. Customized .R script to make your life easier
      • For PICRUSt data: takes the output from PICRUSt's metagenome_contributions.py, together with the taxonomic annotation for the OTUs included in this table, and summarizes the contribution of each taxon (Family, Genus, etc.) to one specific KEGG gene, e.g. K02030
      • Random forests analysis on the OTU table of a supplied phyloseq object
          • The data are randomly divided into a training set (two thirds of the data) and a test set (the remaining third, not used for training)
          • Results are printed to screen and written to file, including the most important taxa, AUC, PPV, NPV, OOB errors and class errors
          • Option to specify the top N taxa to see how they perform
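The two-thirds/one-third split and the PPV/NPV summary can be sketched in base R. This is a hedged illustration only: the labels and predictions are made up, the helper `ppv_npv()` is hypothetical, and the model fitting itself (for example with a random forest package) is left out.

```r
set.seed(42)  # arbitrary seed so the split is reproducible

# Random split of 30 hypothetical samples: two thirds training, one third test.
n <- 30
train_idx <- sample(seq_len(n), size = round(2 * n / 3))
test_idx  <- setdiff(seq_len(n), train_idx)

# Hypothetical helper: PPV and NPV from true and predicted class labels.
ppv_npv <- function(truth, predicted, positive) {
  tp <- sum(predicted == positive & truth == positive)
  fp <- sum(predicted == positive & truth != positive)
  tn <- sum(predicted != positive & truth != positive)
  fn <- sum(predicted != positive & truth == positive)
  c(PPV = tp / (tp + fp), NPV = tn / (tn + fn))
}

# Made-up test-set labels and predictions; in practice the predictions
# would come from a model fitted on the training samples only.
truth     <- c(rep("case", 5), rep("control", 5))
predicted <- c("case", "case", "case", "case", "control",
               "control", "control", "control", "case", "case")

metrics <- ppv_npv(truth, predicted, positive = "case")
print(metrics)
```

With these toy labels there are 4 true positives, 2 false positives, 3 true negatives and 1 false negative, so PPV is 4/6 and NPV is 3/4.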

  13. Random Forests output example

  14. Random Forests output example

  15. The 16S accreditation dataset: first look
      • Number of OTUs: 181 (140 retained after filtering)
      • Number of samples: 15
      • Sample data summary (columns = Treatment; rows = Dog):

              0 1 2 3 4
            B 1 1 1 1 1
            G 1 1 1 1 1
            K 1 1 1 1 1
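A sample summary of this shape can be produced with a cross-tabulation in base R. The sketch below rebuilds a hypothetical metadata table matching the design described (three dogs, five treatments, one sample each); the data frame name `sample_data_df` is an assumption, and the real metadata would normally be taken from the imported dataset.

```r
# Hypothetical sample metadata: three dogs (B, G, K), each under treatments 0 to 4.
sample_data_df <- expand.grid(
  Treatment = 0:4,
  Dog = c("B", "G", "K"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are dogs, columns are treatments.
design <- table(Dog = sample_data_df$Dog, Treatment = sample_data_df$Treatment)
print(design)
```

Every cell equal to 1 indicates a fully crossed design with no replicates per dog and treatment combination.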

  16. The 16S accreditation dataset: first look

  17. The 16S accreditation dataset: first look

  18. The 16S accreditation dataset: first look
