
Towards standard, accessible and reproducible Metabolomics


  1. Towards standard, accessible and reproducible Metabolomics Reza Salek PhD Metabolism and Molecular Informatics The European Bioinformatics Institute (EMBL-EBI) Email: Reza.salek@ebi.ac.uk The 1st International Electronic Conference on Metabolomics

  2. EBI Databases and services • Literature and ontologies: PubMC, GO • Genomes: Ensembl, Ensembl Genomes, EGA • Protein families, motifs and domains: InterPro • Functional genomics: ArrayExpress, Expression Atlas • Nucleotide sequence: ENA • Macromolecular: PDBe • Protein activity: IntAct, PRIDE • Pathways: Reactome • Protein Sequences: UniProt • Cheminformatics & Metabolism: MetaboLights, ChEBI • Systems: BioModels, BioSamples • Chemogenomics: ChEMBL

  3. Is data growth FAIR?

  4. Metabolomics Standards Initiative (MSI) working groups (WG) • Lives at http://msi-workgroups.sourceforge.net • 5 Workgroups • Biological context metadata WG • Chemical analysis WG • Data processing WG • Ontology WG • Exchange format WG • Reference: Roy Goodacre, Metabolomics (2014) 10:5-7

  5. Data sharing repositories • Metabolomics Workbench: http://www.metabolomicsworkbench.org/ • MetaboLights: http://ebi.ac.uk/metabolights/

  6. OmicsDI – Collection of omics data sources: TX, PX (ProteomeXchange), EGA, MX (MetabolomeXchange)

  7. Leading to data discovery

  8. OmicsDI

  9. Capturing Metadata: ISA-Tab format • Developed a user-friendly way to capture standards-compliant metadata • https://github.com/ISA-tools/ISAcreator • https://github.com/ISA-tools/ISAcreator/wiki/API • https://github.com/ISA-tools/ISATab-Viewer

  10. ISAcreator – Using Ontologies

  11. Data Standards: What is XML? • XML stands for EXtensible Markup Language • XML is a markup language much like HTML • XML was designed to carry data, not to display data • XML is designed to be self-descriptive • Example (NMR analysis): All spectra were recorded on a <Varian NMR Instrument>Varian VNMRS 600 NMR Spectrometer</Varian NMR Instrument> operating at a proton NMR frequency of <Irradiation frequency>599.83 <Megahertz>MHz</Megahertz></Irradiation frequency> using a <cryoprobe>5 mm inverse detection cryoprobe</cryoprobe>. <acquisition nucleus>1H</acquisition nucleus> NMR spectra were recorded […].
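
The annotated sentence on this slide is illustrative: element names containing spaces are not well-formed XML. As a minimal sketch, the same NMR metadata could be carried in well-formed XML and read back with Python's standard library. The element names (`nmr_analysis`, `instrument`, `irradiation_frequency`, `probe`, `acquisition_nucleus`) and the `unit` attribute are assumptions made for this example, not part of any metabolomics standard; only the values come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed rendering of the slide's NMR example.
# Element and attribute names are illustrative assumptions.
xml_text = """
<nmr_analysis>
  <instrument>Varian VNMRS 600 NMR Spectrometer</instrument>
  <irradiation_frequency unit="MHz">599.83</irradiation_frequency>
  <probe>5 mm inverse detection cryoprobe</probe>
  <acquisition_nucleus>1H</acquisition_nucleus>
</nmr_analysis>
"""

root = ET.fromstring(xml_text.strip())

# Because each value is tagged, it can be addressed by name
# rather than recovered from free text.
metadata = {child.tag: child.text for child in root}

frequency_element = root.find("irradiation_frequency")
frequency_mhz = float(frequency_element.text)
unit = frequency_element.get("unit")

print(metadata["instrument"], frequency_mhz, unit)
```

The point of the sketch is the slide's claim that XML carries data in a self-descriptive way: a program can ask for the irradiation frequency directly instead of parsing a methods paragraph.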

  12. Generating ISA-Tab metadata files from metabolomics XML data
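
One way to picture this step is a small converter that reads instrument metadata from XML and writes a tab-delimited table with ISA-Tab-style column headers. The sketch below rests on several assumptions: the XML layout and element names are hypothetical, the `COLUMN_MAP` mapping and the `xml_to_assay_table` function are invented for illustration, the sample name `C5` is a placeholder, and the output is a simplified table rather than a complete, validated ISA-Tab assay file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML metadata; element names are illustrative assumptions.
XML_TEXT = """<nmr_analysis>
  <instrument>Varian VNMRS 600 NMR Spectrometer</instrument>
  <irradiation_frequency>599.83</irradiation_frequency>
  <probe>5 mm inverse detection cryoprobe</probe>
  <acquisition_nucleus>1H</acquisition_nucleus>
</nmr_analysis>"""

# Assumed mapping from XML element names to ISA-Tab-style
# "Parameter Value[...]" column headers.
COLUMN_MAP = {
    "instrument": "Parameter Value[Instrument]",
    "irradiation_frequency": "Parameter Value[Irradiation frequency]",
    "probe": "Parameter Value[Probe]",
    "acquisition_nucleus": "Parameter Value[Acquisition nucleus]",
}


def xml_to_assay_table(xml_text, sample_name):
    """Return a two-line tab-delimited table (header + one row)."""
    root = ET.fromstring(xml_text)
    header = ["Sample Name"]
    row = [sample_name]
    for tag, column in COLUMN_MAP.items():
        element = root.find(tag)
        header.append(column)
        # Missing elements become empty cells instead of raising.
        has_text = element is not None and element.text
        row.append(element.text.strip() if has_text else "")
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerow(row)
    return buffer.getvalue()


table = xml_to_assay_table(XML_TEXT, "C5")
print(table)
```

A real conversion would also need the investigation and study files and ontology annotations, which this sketch leaves out.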

  13. MetaboLights – Study Validation Status • Reference: MetaboLights - an open-access general-purpose repository for metabolomics studies and associated meta-data. Nucl. Acids Res. (2012) doi:10.1093/nar/gks1004

  14. MetaboLights – Study Validation details

  15. Tools: the way forward!

  16. Current way and ideal

  17. Complex analysis pipelines (DIMS workflow diagram) • Samples: QC C5 S3 S7 C1 C10 QC S1 C3 S5 C7 S6 QC .. • Technical triplicates per sample (C5, C5’, C5’’; S3, S3’, S3’’; ..) • DIMS data collection → instrument .RAW files (IRF) • Averaged transients (AT) • Apodisation, zero-filling and FFT; TIC filtering → frequency spectra (FS) • Mass calibration and SIM-stitching, using a calibrant list → stitched peak lists (SPL) • Replicate filtering → replicate filtered peak lists (RFPL), including RFPL blank • Sample filtering and blank filtering → sample filtered peak matrix (SFPM) • Missing-value filtering • PQN normalisation, batch correction and spectral cleaning → SFPM PQN; SFPM PQN + BATCH; SFPM PQN + BATCH + CLEAN • Impute missing values using KNN → SFPM PQN + KNN; SFPM PQN + BATCH + KNN; SFPM PQN + BATCH + CLEAN + KNN • Glog transformation → SFPM PQN + KNN + GLOG; SFPM PQN + BATCH + KNN + GLOG; SFPM PQN + BATCH + CLEAN + KNN + GLOG
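
Two of the later steps named on this slide, PQN normalisation and the glog transformation, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline's actual implementation: the function names `pqn_normalise` and `glog` are hypothetical, the reference spectrum is taken as the per-peak median across samples, all intensities are assumed positive with no missing values (the KNN imputation step is omitted), and the glog form used is ln(x + sqrt(x² + λ)) with λ left as a free parameter that would normally be optimised on QC samples.

```python
import math
from statistics import median


def pqn_normalise(matrix):
    """Probabilistic quotient normalisation of a peak matrix.

    `matrix` is a list of samples, each a list of positive peak
    intensities in the same peak order.
    """
    n_peaks = len(matrix[0])
    # Reference spectrum: median intensity of each peak across samples.
    reference = [median(row[j] for row in matrix) for j in range(n_peaks)]
    normalised = []
    for row in matrix:
        # Quotient of each peak against the reference; the median
        # quotient estimates the sample's overall dilution factor.
        quotients = [x / r for x, r in zip(row, reference) if r > 0]
        factor = median(quotients)
        normalised.append([x / factor for x in row])
    return normalised


def glog(x, lam=1.0):
    """Generalised log transform: ln(x + sqrt(x^2 + lam))."""
    return math.log(x + math.sqrt(x * x + lam))


# Three samples that differ only by dilution (x1, x2, x4).
peaks = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [4.0, 8.0, 12.0],
]
normalised = pqn_normalise(peaks)
transformed = [[glog(x) for x in row] for row in normalised]
print(normalised)
```

In this toy input the three samples are the same profile at different dilutions, so PQN maps them onto one another; glog then compresses the intensity range while staying defined at zero.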

  18. PhenoMeNal - Goal (diagram) • Roles: tool maker, data producer, infrastructure provider • Components: packaged tool (container), data, compute infrastructure • Brought together through the PhenoMeNal VRE Portal

  19. Key objectives • Understand the computational needs of the Metabolomics Community. • Integrate and scale existing Open Source tools into a well-tested e-infrastructure.

  20. Major revolution

  21. Same in software • Developer’s • Collaborator’s • PI’s • Cluster • Cloud

  22. VRE Portal • Three usability rounds • 80% functionality running • Public instance access • App Library, hooked to EGI AppDB • Documentation • http://portal.phenomenal-h2020.eu/

  23. MetaboLights – The team • Kenneth Haug, Reza Salek, Kalai Jayaseelan, Mark Williams, Venkata Chandrasekhar, Keeva Cochrane, Jose Ramon Macias Gonzalez, Xuefei Li (MRC), Christoph Steinbeck, Jules Griffin (UC/MRC) • Previous: Paula de Matos, Mark Rijnbeek, Tejasvi Mahendraker, Pablo Conesa

  24. EBI PhenoMeNal – The team • Kenneth Haug, Reza Salek, Pablo Moreno, Sijin He, Christoph Steinbeck, Namrata Kale

  25. COSMOS consortium

  26. PhenoMeNal consortium

  27. Funding and Collaborators
