
Genomics Virtual Laboratory, Mike Pheasant (UQ) and Andrew Lonie



  1. Genomics Virtual Laboratory. Mike Pheasant (UQ), Andrew Lonie (VLSCI)

  2. What is the Genomics Virtual Laboratory? A NeCTAR-funded, nationally distributed platform for genomics, built on the Research Cloud and RDSI.

  3. NeCTAR? Research Cloud? RDSI?
     ● NeCTAR = National eResearch Collaboration Tools and Resources: http://www.nectar.org.au
     ● NeCTAR Research Cloud: http://www.nectar.org.au/research-cloud
     ● RDSI = Research Data Storage Initiative: http://www.rdsi.uq.edu.au/

  4. What is the Genomics Virtual Laboratory? A NeCTAR-funded, nationally distributed platform for genomic analyses.
     Infrastructure
     ● Workflow management system
     ● Bioinformatics toolkit (for command-line users)
     ● Visualisation services
     ● Scalable compute infrastructure
     Resources
     ● Tutorials and exemplar workflows targeted at common high-throughput genomics tasks
     ● Data catalogues and coordination centres
     ● Subscription-based support

  5. What is the Genomics Virtual Lab?

  6. Workflow platforms

  7. Workflow platforms
     Interactive platforms for developing genomics workflows and for interactive data analysis
     ● Galaxy
     ● GenePattern, others possible (Bioflow, ...)
     What's Galaxy? "an open, web-based platform for performing accessible, reproducible, and transparent genomic science." http://galaxyproject.org
     ● Accessible: users without programming experience can easily specify parameters and run tools and workflows
     ● Reproducible: Galaxy captures information so that any user can repeat and understand a complete computational analysis
     ● Transparent: users share and publish analyses via the web

  8. Visualisation platforms

  9. Cluster-on-the-cloud

  10. Cluster-on-the-cloud
     ● CloudBioLinux: Linux with a comprehensive, actively maintained suite of bioinformatics tools. http://cloudbiolinux.org/
     ● CloudMan: platform for launching and scaling CloudBioLinux clusters and Galaxy clusters on the cloud. http://usecloudman.org
     ● Research Cloud: ~25,000 CPUs to be spread across 6-10 research centres around Australia, to host research activities 'on demand'. http://www.nectar.org.au/research-cloud

  11. Data catalogues

  12. Data catalogues
     ● UCSC databases
     ● Ensembl databases
     ● ENCODE
     ● dbSNP, HapMap
     ● ICGC, COSMIC
     ● BPA Framework Datasets
       ○ sarcoma
       ○ wheat
       ○ soil diversity

  13. Tutorials and workshops

  14. Tutorials and education resources
     NGS School: summer schools, 2-day workshops
     Galaxy-based online tutorials:
     ● Intro to NGS
     ● Genome browsers
     ● Common analyses
       ○ Differential gene expression
       ○ Variant calling
       ○ ChIP-seq
       ○ ...

  15. Exemplar best practice workflows

  16. Exemplar workflows
     ● Variant calling: GATK best-practice
       ○ microbial
       ○ cancer-optimised
     ● RNA-seq differential expression
     ● Fusion gene discovery from RNA-seq
     ● MicroRNA analysis
     ● De novo genome and transcriptome assembly
     ● Metagenomics
     ● ChIP-seq
     ● Variant annotation
     ● Pathway analysis
     ● Methylation

  17. Support

  18. Genomics Informatics Network
     Institutional subscriptions:
     ● genomics support (% of FTE)
     ● large compute and data resources
     ● managed instances of GVL
     ● new GVL tool development
     ● advocacy to funding bodies for resources
     ● communities of best practice

  19. Or... roll your own GVL

  20. Progress and timelines
     Dec 2012: Prototype at Qld (UQ) and Vic (UoM)
     ● Galaxy
     ● UCSC browser + databases
     ● Bioinformatics cluster-on-the-cloud
     ● Initial tutorials and exemplars
     Jun 2013: Production at Qld (UQ) and Vic (UoM), prototype at other Research Cloud nodes
     ● Data coordination centres, data catalogues
     Dec 2013:
     ● Additional workflows and tutorials
     ● Additional nodes
     Jun 2014: Operations (support centres, subscriptions)
