computational genomics

Computational Genomics Francisco García García BIER - PowerPoint PPT Presentation



  1. Computational Genomics Francisco García García BIER fgardos@gmail.com Máster en Biotecnología Biomédica. UPV

  2. Why are we interested in Computational Genomics? The overall goal: apply computational methods to biomedical and biotechnological problems. Research interests: the development and application of novel bioinformatics methods aimed at discovering new drugs; identification of genes or proteins that may be considered therapeutic targets; personalized medicine: tools for discovery and diagnosis. Introduction Why Computational Genomics?

  3. Computational Genomics. Omics sciences: Genomics, Transcriptomics, Metabolomics, Lipidomics, Proteomics, Epigenomics. Introduction Omics sciences

  4. Computational Genomics How do these technologies work? Introduction High throughput technologies: microarrays

  5. Computational Genomics How do these technologies work? Reference genome Introduction High throughput technologies: Next Generation Sequencing

  6. Computational Genomics. Biological knowledge: Gene Ontology; KEGG pathways; Biocarta pathways; regulatory elements (miRNA, CisRed, transcription factor binding sites); InterPro motifs; gene expression in tissues; bioentities from literature. Clinical knowledge: ClinVar, HUMSAVAR, HGMD, COSMIC. Introduction Clinical and biological databases

  7. Computational Genomics Introduction Personalized Medicine

  8. Computational Genomics + Introduction Personalized Medicine

  9. Description of the sessions. 3 sessions (7 hours) on the use of web tools for the analysis and interpretation of sequencing data. All the documentation (presentations + exercises) that we will need during these days will be available at this link: http://bioinfo.cipf.es/mbb/. Also on Poliformat. Instructors: Marta Hidalgo and Paco García. The sessions will take a practical approach, and we will introduce only the concepts needed for the exercises. Introduction Máster en Biotecnología Biomédica. UPV.

  10. Program
      Session 1
      • Introduction to NGS technologies.
      • Studies for detecting genomic variation. Genomic data analysis pipeline.
      • How to detect mutations of interest in whole-exome studies? Exercises with the web tool BiERapp.
      Session 2
      • Genomic variation studies: targeted genomic sequencing.
      • How to design a gene panel? How to analyze and interpret gene panel data? Exercises with TEAM.
      • Spanish genetic variability. CSVS database.
      • Transcriptomic studies with NGS data. Expression data analysis pipeline. How to analyze RNA-Seq data with the Babelomics suite?
      Session 3
      • Analysis of transcriptomic data in the context of signaling pathways.
      • Exercises with the web tools hipathia and PathAct.
      Introduction Máster en Biotecnología Biomédica. UPV.

  11. Web tools to analyze omic data BIER fgardos@gmail.com Máster en Biotecnología Biomédica. UPV

  12. NGS Data Analysis Pipeline. Common steps: Fastq, sequence preprocessing, Fastq, alignment, BAM, visualization. Resequencing data analysis: BAM, variant calling, VCF, variant annotation, prioritization. RNA-Seq data analysis: BAM, RNA-Seq processing, count matrix, RNA-Seq data analysis, functional analysis. Introduction NGS data analysis: pipelines

  13. Fastq format. We could say “it is a fasta with qualities”:
      1. Header (like the fasta but starting with “@”)
      2. Sequence (string of nt)
      3. “+” and sequence ID (optional)
      4. Encoded quality of the sequence
      Example:
      @SEQ_ID
      GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
      +
      !''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
      Introduction NGS data analysis: files format
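The four-line layout described on this slide can be read with a few lines of Python. This is a minimal sketch, not the method used in the course: the names `FastqRecord`, `parse_fastq` and `phred_scores` are hypothetical, and the quality decoding assumes the common Phred+33 encoding, which the slide does not state.

```python
from typing import Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    seq_id: str
    sequence: str
    quality: str


def parse_fastq(lines: List[str]) -> Iterator[FastqRecord]:
    """Yield one record per group of four lines: header, sequence, '+', quality."""
    cleaned = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(cleaned) - 3, 4):
        header, seq, plus, qual = cleaned[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at non-empty line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality strings differ in length")
        yield FastqRecord(header[1:], seq, qual)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode quality characters to integer scores (assumes Phred+33 by default)."""
    return [ord(char) - offset for char in quality]


# The record shown on the slide.
record_lines = [
    "@SEQ_ID",
    "GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT",
    "+",
    "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65",
]
records = list(parse_fastq(record_lines))
scores = phred_scores(records[0].quality)
print(records[0].seq_id, len(records[0].sequence), scores[:5])
```

Under the Phred+33 assumption, “!” decodes to 0 (the lowest quality) and each following ASCII character adds one to the score.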

  14. BAM/SAM format
      @PG ID:HPG-Aligner VN:1.0
      @SQ SN:20 LN:63025520
      HWI-ST700660_138:2:2105:7292:79900#2@0/1 16 20 76703 254 76= * 0 0 GTTTAGATACTGAAAGGTACATACTTCTTTGTAGGAACAAGCTATCATGCTGCATTTCTATAATATCACATGAATA GIJGJLGGFLILGGIEIFEKEDELIGLJIHJFIKKFELFIKLFFGLGHKKGJLFIIGKFFEFFEFGKCKFHHCCCF AS:i:254 NH:i:1 NM:i:0
      HWI-ST700660_138:2:2208:6911:12246#2@0/1 16 20 76703 254 76= * 0 0 GTTTAGATACTGAAAGGTACATACTTCTTTGTAGGAACAAGCTATCATGCTGCATTTCTATAATATCACATGAATA HHJFHLGFFLILEGIKIEEMGEDLIGLHIHJFIKKFELFIKLEFGKGHEKHJLFHIGKFFDFFEFGKDKFHHCCCF AS:i:254 NH:i:1 NM:i:0
      HWI-ST700660_138:2:1201:2973:62218#2@0/1 0 20 76655 254 76M * 0 0 AACCCCAAAAATGTTGGAAGAATAATGTAGGACATTGCAGAAGACGATGTTTAGATACTGAAAGGGACATACTTCT FEFFGHHHGGHFKCCJKFHIGIFFIFLDEJKGJGGFKIHLFIJGIEGFLDEDFLFGEIIMHHIKL$BBGFFJIEHE AS:i:254 NH:i:1 NM:i:1
      HWI-ST700660_138:2:1203:21395:164917#2@0/1 256 20 68253 254 4M1D72M * 0 0 NCACCCATGATAGACCAGTAAAGGTGACCACTTAAATTCCTTGCTGTGCAGTGTTCTGTATTCCTCAGGACACAGA #4@ADEHFJFFEJDHJGKEFIHGHBGFHHFIICEIIFFKKIFHEGJEHHGLELEGKJMFGGGLEIKHLFGKIKHDG AS:i:254 NH:i:3 NM:i:1
      HWI-ST700660_138:2:1105:16101:50526#6@0/1 16 20 126103 246 53M4D23M * 0 0 AAGAAGTGCAAACCTGAAGAGATGCATGTAAAGAATGGTTGGGCAATGTGCGGCAAAGGGACTGCTGTGTTCCAGC FEHIGGHIGIGJI6FCFHJIFFLJJCJGJHGFKKKKGIJKHFFKIFFFKHFLKHGKJLJGKILLEFFLIHJIEIIB AS:i:368 NH:i:1 NM:i:4
      SAM Specification: http://samtools.sourceforge.net/SAM1.pdf
      Introduction NGS data analysis: files format
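To make the alignment fields on this slide easier to read, the following sketch decodes the FLAG, CIGAR and optional tags of one record. It is an illustration under stated assumptions: the function names are hypothetical, fields are tab-separated as the SAM specification requires (the transcript above shows spaces), and the example reuses the first fields of the fourth record on the slide with SEQ and QUAL replaced by `*` to keep the line short.

```python
import re
from typing import Dict, List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
CONSUMES_REFERENCE = set("MDN=X")  # operations that advance along the reference
CONSUMES_QUERY = set("MIS=X")      # operations that advance along the read


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string such as '4M1D72M' into (length, operation) pairs."""
    if cigar == "*":
        return []
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]


def parse_sam_record(line: str) -> Dict[str, object]:
    """Decode the mandatory fields and optional tags of one alignment line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 fields")
    flag, pos = int(fields[1]), int(fields[3])
    cigar = parse_cigar(fields[5])
    ref_span = sum(n for n, op in cigar if op in CONSUMES_REFERENCE)
    tags: Dict[str, object] = {}
    for tag in fields[11:]:
        name, kind, value = tag.split(":", 2)
        tags[name] = int(value) if kind == "i" else value
    return {
        "read": fields[0],
        "chrom": fields[2],
        "start": pos,  # 1-based leftmost mapped position
        "end": pos + ref_span - 1 if ref_span else pos,
        "read_length": sum(n for n, op in cigar if op in CONSUMES_QUERY),
        "unmapped": bool(flag & 0x4),
        "reverse_strand": bool(flag & 0x10),
        "secondary": bool(flag & 0x100),
        "tags": tags,
    }


# Fourth record from the slide, with SEQ and QUAL shortened to '*'.
example = "\t".join([
    "HWI-ST700660_138:2:1203:21395:164917#2@0/1", "256", "20", "68253", "254",
    "4M1D72M", "*", "0", "0", "*", "*", "AS:i:254", "NH:i:3", "NM:i:1",
])
alignment = parse_sam_record(example)
print(alignment["start"], alignment["end"], alignment["secondary"], alignment["tags"])
```

Here FLAG 256 marks a secondary alignment, and the one-base deletion in `4M1D72M` makes the alignment span 77 reference bases for a 76-base read.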

  15. VCF format http://www.1000genomes.org/ Introduction NGS data analysis: files format

  16. Counts (sample × gene matrix) Introduction NGS data analysis: files format
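The slide shows the VCF format only as an image, so the following is a minimal sketch of how one data line could be read, assuming the eight fixed tab-separated columns defined by the VCF specification (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The names `VcfRecord`, `parse_info` and `parse_vcf_line` are hypothetical, and the example line is an illustrative one in the style of the specification, not data from the course.

```python
from typing import Dict, List, NamedTuple, Optional, Union


class VcfRecord(NamedTuple):
    chrom: str
    pos: int
    var_id: str
    ref: str
    alt: List[str]
    qual: Optional[float]
    filter_status: str
    info: Dict[str, Union[str, bool]]


def parse_info(info: str) -> Dict[str, Union[str, bool]]:
    """Turn 'NS=3;DP=14;DB' into a dict; keys without a value become True."""
    if info == ".":
        return {}
    result: Dict[str, Union[str, bool]] = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse the eight fixed columns of a VCF data line (not a '#' header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line has at least 8 columns")
    chrom, pos, var_id, ref, alt, qual, filter_status, info = fields[:8]
    return VcfRecord(
        chrom, int(pos), var_id, ref, alt.split(","),
        None if qual == "." else float(qual),
        filter_status, parse_info(info),
    )


# Illustrative data line (assumed values, not taken from the slides).
line = "\t".join(["20", "14370", "rs6054257", "G", "A", "29", "PASS",
                  "NS=3;DP=14;AF=0.5;DB;H2"])
variant = parse_vcf_line(line)
print(variant.chrom, variant.pos, variant.ref, variant.alt, variant.info["DP"])
```

INFO values are kept as strings here; a fuller parser would use the types declared in the file's `##INFO` header lines.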

  17. Transcriptomic Studies BIER fgardos@gmail.com Máster en Biotecnología Biomédica. UPV

  18. RNA-Seq Data Analysis Pipeline (primary and secondary analysis): 1. Sequence preprocessing 2. Mapping 3. Quantification 4. Normalization 5. Differential expression 6. Functional profiling Babelomics 5 RNA-Seq Data Analysis

  19. Babelomics 5 http://babelomics.bioinfo.cipf.es/ Babelomics 5 Analyzing omics data + functional profiling

  20. Differential Expression. Workflow: UPLOAD DATA, EDIT DATA, NORMALIZATION + DIFFERENTIAL EXPRESSION, FUNCTIONAL PROFILING. Babelomics 5 Analyzing omics data + functional profiling

  21. Supervised and Unsupervised Classification. Workflow: UPLOAD DATA, NORMALIZE DATA (RPKM, TMM), EDIT DATA, CLUSTERING, PREDICTORS. Babelomics 5 Analyzing omics data + functional profiling
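As a rough illustration of the RPKM normalization named on this slide, the sketch below applies the standard definition (reads per kilobase of gene length per million mapped reads) to one sample. The function name `rpkm`, the gene names and all numbers are made up for the example; this is not the Babelomics implementation, and TMM is not covered because it needs a reference sample and trimming steps.

```python
from typing import Dict


def rpkm(counts: Dict[str, int], gene_lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """RPKM = count * 1e9 / (gene length in bp * total mapped reads in the sample)."""
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("sample has no mapped reads")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_reads)
        for gene, count in counts.items()
    }


# Hypothetical read counts for one sample and hypothetical gene lengths.
sample_counts = {"geneA": 100, "geneB": 400, "geneC": 500}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 500}

normalized = rpkm(sample_counts, lengths)
for gene, value in normalized.items():
    print(gene, value)
```

In this toy sample geneB has four times the raw count of geneA but only twice its RPKM, because it is twice as long.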

  22. Signaling Pathways Analysis http://hipathia.babelomics.org/ hiPathia Signaling Pathways Analysis

  23. Genomic Variation Studies BIER fgardos@gmail.com Máster en Biotecnología Biomédica. UPV

  24. Genomics Data Analysis Pipeline (primary and secondary analysis): 1. Sequence preprocessing 2. Mapping 3. Variant calling 4. Variant prioritization Pipeline Resequencing Data Analysis

  25. How do we prioritize variants in whole exome studies? http://courses.babelomics.org/bierapp/ BIER BiERapp Discovering variants

  26. Introduction. Whole-exome sequencing has become a fundamental tool for discovering genes related to familial diseases, but it is difficult to find the causal mutation among the enormous background of variants. There are different scenarios, so we need different and immediate prioritization strategies. A vast amount of biological knowledge is available in many databases. We need a tool that integrates this information and filters immediately to select candidate variants related to the disease. BiERapp Discovering variants
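The kind of filtering described on this slide can be sketched as a chain of simple tests on annotated variants. Everything below is an assumption made for illustration and not BiERapp's actual logic: the `Variant` fields, the 1% frequency cut-off, the list of consequence terms and the recessive inheritance scenario (affected samples homozygous for the alternate allele, carrier parents heterozygous).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Variant:
    chrom: str
    pos: int
    consequence: str           # Sequence Ontology term from annotation
    population_af: float       # allele frequency in a reference population
    genotypes: Dict[str, str]  # sample name -> "0/0", "0/1" or "1/1"


# Assumed set of consequence types treated as potentially damaging.
DAMAGING = {"missense_variant", "stop_gained", "frameshift_variant",
            "splice_donor_variant", "splice_acceptor_variant"}


def is_candidate(variant: Variant, affected: List[str], carriers: List[str],
                 max_af: float = 0.01) -> bool:
    """Keep rare, potentially damaging variants that fit a recessive pattern."""
    if variant.population_af > max_af:
        return False
    if variant.consequence not in DAMAGING:
        return False
    if any(variant.genotypes.get(sample) != "1/1" for sample in affected):
        return False
    if any(variant.genotypes.get(sample) != "0/1" for sample in carriers):
        return False
    return True


# Hypothetical trio: one affected child and two unaffected carrier parents.
trio = {"child": "1/1", "mother": "0/1", "father": "0/1"}
variants = [
    Variant("1", 1000, "missense_variant", 0.0005, trio),
    Variant("1", 2000, "missense_variant", 0.12, trio),        # too common
    Variant("2", 3000, "synonymous_variant", 0.0005, trio),    # not damaging
    Variant("3", 4000, "stop_gained", 0.0002,
            {"child": "0/1", "mother": "0/1", "father": "0/0"}),  # wrong pattern
]
candidates = [v for v in variants
              if is_candidate(v, affected=["child"], carriers=["mother", "father"])]
print([(v.chrom, v.pos) for v in candidates])
```

A different scenario, such as a dominant or de novo model, would change only the genotype tests, which is why the slide asks for different prioritization strategies.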

  27. How does BiERapp work? Multisample VCF file, BiERapp, filterings; VARIANT, CellBase. BiERapp Discovering variants

  28. Input: VCF file. Primary and secondary analysis: 1. Sequence preprocessing 2. Mapping 3. Variant calling (VCF files) 4. Variant prioritization (BiERapp) BiERapp Discovering variants

  29. Can I interpret sequencing data for diagnosis? http://courses.babelomics.org/team/ BIER TEAM Targeted Enrichment Analysis and Management

  30. Gene panel, sequencing data and biological knowledge (ClinVar, HGMD, HUMSAVAR, COSMIC) go into TEAM to produce a diagnostic. TEAM Targeted Enrichment Analysis and Management

  31. Gene panel. TEAM inputs: 1. VCF files 2. Gene panel. Databases: ClinVar, HGMD, HUMSAVAR, COSMIC. TEAM Targeted Enrichment Analysis and Management

  32. CSVS: CIBERER Spanish Variant Server. Repository of variant frequencies in the Spanish population http://csvs.babelomics.org/ CSVS CIBERER Spanish Variant Server

  33. CIBERER Spanish Variant Server CSVS Local genetic variability

  34. Tool interface http://csvs.babelomics.org/ CSVS CIBERER Spanish Variant Server

  35. Genome Maps. Genome viewer that interacts with functional databases http://genomemaps.org/ Genome Maps A next-generation web-based genome browser

  36. Tool interface Genome Maps A next-generation web-based genome browser

  37. Cell Maps. Tool for modeling and visualizing biological networks http://cellmaps.babelomics.org/ Cell Maps Visualizing and integrating biological networks
