
Pathway Analysis, Jenny Wu



  1. Introduction to Next Generation Sequencing (NGS) Data Analysis and Pathway Analysis Jenny Wu

  2. Outline • Introduction to NGS data analysis in Cancer Genomics – NGS applications in cancer research – Typical NGS workflows and pipeline – Open source software with GUI • Pathway Analysis and Software • Pathway Analysis goals and concepts • Commercial and open source pathway analysis software • Data analysis resources • Summary

  3. Next Generation Sequencing. Massively parallel sequencing: a single run can generate hundreds of millions of short sequences (up to 250 bp) in a short time at a low per-base cost. • Illumina/Solexa GA II, HiSeq 2500, 3000, X • Roche/454 FLX, Titanium • Life Technologies/Applied Biosystems SOLiD. Reviews: Michael Metzker (2010) Nature Reviews Genetics 11:31; Quail et al. (2012) BMC Genomics 13:341.

  4. NGS in Cancer Genomics (Shyr et al. 2013)

  5. Data Analysis is the Bottleneck: Informatics (wall.hms.harvard.edu)

  6. Basic NGS Workflow [Figure, from Olson et al.] Steps shown: isolation of material; end repair, size selection; PCR amplification; library QC; cluster generation; instrument operation; QC and pipeline analysis; data interpretation.

  7. High Throughput Data Analysis Overview Olson et al.

  8. Many Analysis Pipelines Start with Read Mapping. Typical data analysis pipelines: genotyping (GATK), RNA-seq (Tuxedo). Sources: http://www.broadinstitute.org/gsa/wiki/images/7/7a/Overall_flow.jpg http://www.broadinstitute.org/gatk/guide/topic?name=intro http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html

  9. Cancer NGS Data Analysis Pipeline-Software [Figure: tasks and the software used for each] Raw reads → (FastQC, FASTX-Toolkit, Trimmomatic) → Analysis-ready reads → (BWA, STAR) → Mapped reads → (……). Data visualization: IGV, IGB, UCSC GB……
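
The step from raw reads to analysis-ready reads can be illustrated with a minimal Python sketch. It is not how FastQC, FASTX-Toolkit, or Trimmomatic work internally. It only shows the idea of a quality filter, under the assumptions that the input is four-line FASTQ with Phred+33 quality encoding and that reads are kept when their mean quality reaches a threshold. The function names and the threshold of 20 are hypothetical choices for illustration.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a FASTQ stream with 4 lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) ends parsing
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@' from the name


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (assumes Phred+33 encoding)."""
    return sum(ord(ch) - offset for ch in qual) / len(qual)


def filter_reads(handle, min_mean_quality=20.0):
    """Keep reads whose mean quality is at least min_mean_quality (illustrative cutoff)."""
    for name, seq, qual in read_fastq(handle):
        if qual and mean_phred(qual) >= min_mean_quality:
            yield name, seq, qual


# Two made-up reads: 'I' encodes Phred 40, '!' encodes Phred 0.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n")
kept = [name for name, _, _ in filter_reads(example)]
print(kept)
```

In practice the dedicated tools listed on the slide also handle adapter removal, per-base trimming, and paired-end reads, which this sketch leaves out.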

  10. Cancer NGS Application Specific Software [Figure] Mapped reads → ……: SomaticSniper, VarScan2, MuTect, GATK, FreeBayes, Pindel, CNVnator, Cufflinks, MISO, DESeq2, MACS2, SISSRs, Bismark, BS Seeker

  11. Open Source Software with GUI • Galaxy: Web-based platform for analysis of large datasets. http://hpc-galaxy.oit.uci.edu/root https://main.g2.bx.psu.edu/ https://usegalaxy.org/ • GENE-E: Java-based matrix visualization and analysis platform; includes heatmap, clustering, filtering etc. http://www.broadinstitute.org/cancer/software/GENE-E

  12. Commercial software for NGS analysis • Easy to use, no command-line skills required • Usually platform-independent • Little to no learning curve ◦ Limited flexibility ◦ Harder to publish

  13. Outline • Introduction to NGS data analysis in Cancer Genomics – NGS applications in cancer research – Typical NGS workflows and pipeline – Open source software with GUI • Pathway Analysis and Software • Pathway Analysis goals and concepts • Commercial and open source pathway analysis software • Data analysis resources • Summary

  14. Why Pathway Analysis • A logical next step in any high-throughput experiment • Goal: to characterize the biological meaning of joint changes in gene expression • Why? Often, groups of genes with related functions change together

  15. Pathway and Network Analysis. Pathway analysis methods: • Functional category over-representation: discrete test for significance (BiNGO, DAVID, IPA etc.) • Continuous test (GSEA, PAGE) • Signaling Pathway Impact Analysis (iPathway Guide). Network analysis: (WGCNA, Cytoscape etc.)

  16. Functional Category Enrichment • Discrete tests: enrichment for groups in gene lists – Select a gene list at some predefined cutoff – For each gene list and functional category, cross-tabulate to get a 2×2 contingency table – Test for significance using Fisher's exact test – FDR correction for multiple hypothesis testing

                             Differentially   Not differentially   Total
                             expressed        expressed
        In the pathway       a                b                    a+b
        Not in the pathway   c                d                    c+d
        Total                a+c              b+d                  n
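
The discrete test on this slide can be sketched in plain Python. This is a minimal illustration, not the implementation used by BiNGO, DAVID, or IPA. It assumes a one-sided (over-representation) Fisher's exact test on the a, b, c, d counts of the 2×2 table, and it assumes Benjamini-Hochberg as the FDR correction, which the slide does not specify. The pathway names and counts are made up for illustration.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test p-value for over-representation.

    a: differentially expressed (DE) genes in the pathway
    b: non-DE genes in the pathway
    c: DE genes not in the pathway
    d: non-DE genes not in the pathway
    Returns P(X >= a) under the hypergeometric null.
    """
    n = a + b + c + d
    in_pathway = a + b
    de_total = a + c
    upper = min(in_pathway, de_total)
    tail = sum(
        comb(in_pathway, k) * comb(n - in_pathway, de_total - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, de_total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (a, b, c, d) tables for three made-up pathways.
tables = {
    "pathway_A": (12, 38, 88, 1862),
    "pathway_B": (3, 47, 97, 1853),
    "pathway_C": (8, 12, 92, 1888),
}
names = list(tables)
raw_p = [fisher_enrichment_p(*tables[name]) for name in names]
for name, p, q in zip(names, raw_p, benjamini_hochberg(raw_p)):
    print(f"{name}: p={p:.3g}, FDR-adjusted={q:.3g}")
```

The exact binomial arithmetic with `math.comb` avoids any external dependency; for large gene universes a library routine such as SciPy's `fisher_exact` would be the more usual choice.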

  17. Functional Categories in Pathway Analysis • Gene Ontology – Biological Process – Molecular Function – Cellular Component • Pathway Databases – KEGG – BioCarta – Broad Institute (MSigDB) – Commercial knowledge bases such as IPA • Other – Transcription factor targets – Protein complexes – Self-defined

  18. Commercial and Open Source Pathway Analysis Software

  19. Ingenuity Pathway Analysis Tool

  20. IPA Input file

  21. IPA results page

  22. Resources in NGS data analysis. Public forums: Computational resources available at UCI: • HPC: open source software • CLCbio, IPA, JMP Genomics…

  23. Summary • NGS technologies are transforming cancer research. • Data analysis is a crucial part of NGS applications. • Pathway analysis concepts and software • Data analysis resources. Thank you!
