Development of Genomics Plugins in i2b2

Lori Phillips, MS
AUG Meeting, June 18, 2013

Big Picture - Data flow of next-gen sequencing
Base calls from the sequencer
FASTQ: files with base calls
SAM: with standard alignment
VCF: digests variants
GVF: maps to ontologies
De-identified Data Warehouse

Importing NGS variant output into i2b2
Variant Call Format (VCF) -> [ANNOVAR] -> Gene Annotated VCF -> Genome Variation Format (GVF) -> i2b2 Observation fact

Pipeline - VCF to VCF-ANNO
VCF:
    1 1105366 . T C . PASS AA=T;AC=4;AN=114;DP=3251 GT:DP 1/0:54
Step: ANNOVAR*
VCF-ANNO:
    exonic TTLL10 1 1105366 1105366 T C 1 1105366 . T C . PASS AA=T;AC=4;AN=114;DP=3251 GT:DP 1/0:54
*Wang K, Li M, Hakonarson H. ANNOVAR: Functional annotation of genetic variants from next-generation sequencing data. Nucleic Acids Research, 38:e164, 2010 (www.openbioinformatics.org/annovar)
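To make the VCF columns on this slide concrete, here is a minimal Python sketch that splits the example line into named fields. The `VcfRecord` type and `parse_vcf_line` function are hypothetical names introduced only for illustration; they are not part of ANNOVAR or the i2b2 plugin, and the sketch assumes a single-sample line like the one shown.

```python
from typing import Dict, NamedTuple


class VcfRecord(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    filter_status: str
    info: Dict[str, str]
    sample: Dict[str, str]


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse one single-sample VCF data line (illustrative only)."""
    # Real VCF files are tab-delimited; split() also accepts the
    # space-separated form shown on the slide.
    chrom, pos, _id, ref, alt, _qual, flt, info, fmt, sample = line.split()[:10]
    info_map: Dict[str, str] = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")  # flag entries get an empty value
        info_map[key] = value
    # FORMAT keys (GT:DP) pair up positionally with the sample values (1/0:54).
    sample_map = dict(zip(fmt.split(":"), sample.split(":")))
    return VcfRecord(chrom, int(pos), ref, alt, flt, info_map, sample_map)


record = parse_vcf_line(
    "1 1105366 . T C . PASS AA=T;AC=4;AN=114;DP=3251 GT:DP 1/0:54"
)
print(record.pos, record.info["DP"], record.sample["GT"])
```

The genotype (`GT`) and read depth (`DP`) come from the per-sample column, while `AA`, `AC`, `AN` and the site-level `DP` come from the INFO column.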

Pipeline - VCF-ANNO to GVF
VCF-ANNO:
    exonic TTLL10 1 1105366 1105366 T C 1 1105366 . T C . PASS AA=T;AC=4;AN=114;DP=3251 GT:DP 1/0:54
Step: 2GVF*
GVF:
    chr1 VCF SNV 1105366 1105366 . + . ID=1;Reference_seq=T;Variant_seq=C;Variant_feature=exonic;Gene=TTLL10;Genotype=heterozygous
*Kong, Sek-Won, Lee, Joon, Boston Children's Hospital (Perl script), modified for ANNOVAR by Lori Phillips
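The conversion on this slide can be sketched in Python as follows. This is not the Perl script credited above; it is an illustrative reimplementation under stated assumptions: the input has the 17 whitespace-separated columns shown on the slide, every variant is a single-base SNV on the + strand, and zygosity is derived from whether the two `GT` alleles match. The function names `zygosity` and `vcf_anno_to_gvf` are hypothetical.

```python
def zygosity(genotype: str) -> str:
    """Return 'homozygous' if all alleles match, else 'heterozygous'."""
    alleles = genotype.replace("|", "/").split("/")
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"


def vcf_anno_to_gvf(line: str, variant_id: int) -> str:
    """Convert one VCF-ANNO line (slide layout) to one GVF line."""
    f = line.split()
    # Columns 0-6 are the ANNOVAR annotation; columns 7-16 are the VCF record.
    feature, gene, chrom, start, end, ref, alt = f[:7]
    fmt, sample = f[15], f[16]
    gt = dict(zip(fmt.split(":"), sample.split(":")))["GT"]
    attributes = ";".join([
        f"ID={variant_id}",
        f"Reference_seq={ref}",
        f"Variant_seq={alt}",
        f"Variant_feature={feature}",  # attribute name as shown on the slide
        f"Gene={gene}",                # attribute name as shown on the slide
        f"Genotype={zygosity(gt)}",
    ])
    # GVF columns: seqid, source, type, start, end, score, strand, phase, attributes.
    # "SNV" and "+" are hard-coded assumptions for this sketch.
    return "\t".join([f"chr{chrom}", "VCF", "SNV", start, end, ".", "+", ".", attributes])


anno = ("exonic TTLL10 1 1105366 1105366 T C "
        "1 1105366 . T C . PASS AA=T;AC=4;AN=114;DP=3251 GT:DP 1/0:54")
print(vcf_anno_to_gvf(anno, 1))
```

A production converter would also need to handle indels, multi-allelic sites and missing genotypes, none of which appear in the slide's example.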

Pipeline - GVF to I2B2 records
GVF:
    chr1 VCF SNV 1105366 1105366 . + . ID=1;Reference_seq=T;Variant_seq=C;Variant_feature=exonic;Gene=TTLL10;Genotype=heterozygous
Step: GVF2I2B2
I2B2:
• 1880001024|1000000024|"SO:0001483"|"@"|"2010-03-03 00:00:00"|"@"|1||||||||||||||"GVF2I2B2"|
• 1880001024|1000000024|"SO:0001483"|"@"|"2010-03-03 00:00:00"|"SO:0000340"|1|"T"|"chr1"||||||||||||"GVF2I2B2"| (chr1)
• 1880001024|1000000024|"SO:0001483"|"@"|"2010-03-03 00:00:00"|"SEQ:Start"|1|"N"|"E"|1105366|||||||||||"GVF2I2B2"| (start position)
• 1880001024|1000000024|"SO:0001483"|"@"|"2010-03-03 00:00:00"|"SEQ:End"|1|"N"|"E"|1105366|||||||||||"GVF2I2B2"| (end position)
• 1880001024|1000000024|"SO:0001483"|"@"|"2010-03-03 00:00:00"|"SO:0001029"|1|"T"|"+"||||||||||||"GVF2I2B2"| (+ strand)
• 1880001024|1000000024|"SO:0001483"|"@"|"2010-03-03 00:00:00"|"SEQ:Zygosity"|1|"T"|"heterozygous"||||||||||||"GVF2I2B2"| (heterozygous)
• 1880001024|1000000024|"SO:0001483"|"@"|"2010-03-03 00:00:00"|"SEQ:HUGO"|1|"T"|"TTLL10"||||||||||||"GVF2I2B2"| (associated gene)
• 1880001024|1000000024|"SO:0001483"|"@"|"2010-03-03 00:00:00"|"SO:0001791"|1||||||||||||||"GVF2I2B2"| (exonic variant)
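The record layout above can be reproduced with a short Python sketch. It is not the GVF2I2B2 code itself. It assumes the field order read off the slide (encounter, patient, concept, provider, start date, modifier, instance, value type, text value, numeric value, ten empty fields, source system, trailing delimiter) and uses only the code mappings that appear in the example. The names `fact_row`, `gvf_to_facts`, `TYPE_TO_SO` and `FEATURE_TO_SO` are hypothetical.

```python
# Only the mappings visible on the slide; a real loader would need many more.
TYPE_TO_SO = {"SNV": "SO:0001483"}
FEATURE_TO_SO = {"exonic": "SO:0001791"}
SOURCE = "GVF2I2B2"


def q(value: str) -> str:
    """Wrap a text value in double quotes, as in the slide's records."""
    return f'"{value}"'


def fact_row(encounter, patient, concept, start_date,
             modifier="@", valtype="", tval="", nval="") -> str:
    fields = [
        str(encounter), str(patient), q(concept), q("@"), q(start_date),
        q(modifier), "1",                      # instance number is 1 in the example
        q(valtype) if valtype else "",
        q(tval) if tval else "",
        str(nval),
    ]
    fields += [""] * 10                         # unused columns before the source system
    fields.append(q(SOURCE))
    return "|".join(fields) + "|"              # records end with a trailing delimiter


def gvf_to_facts(gvf_line, patient_num, encounter_num, start_date):
    """Expand one tab-delimited GVF line into eight pipe-delimited records."""
    (seqid, _source, so_type, start, end,
     _score, strand, _phase, attrs) = gvf_line.rstrip("\n").split("\t")
    attr = dict(item.split("=", 1) for item in attrs.split(";") if item)
    concept = TYPE_TO_SO[so_type]

    def row(**kw):
        return fact_row(encounter_num, patient_num, concept, start_date, **kw)

    return [
        row(),                                                    # the variant itself
        row(modifier="SO:0000340", valtype="T", tval=seqid),      # chromosome
        row(modifier="SEQ:Start", valtype="N", tval="E", nval=start),
        row(modifier="SEQ:End", valtype="N", tval="E", nval=end),
        row(modifier="SO:0001029", valtype="T", tval=strand),     # strand
        row(modifier="SEQ:Zygosity", valtype="T", tval=attr["Genotype"]),
        row(modifier="SEQ:HUGO", valtype="T", tval=attr["Gene"]), # associated gene
        row(modifier=FEATURE_TO_SO[attr["Variant_feature"]]),     # exonic variant
    ]
```

Calling `gvf_to_facts` with the slide's GVF line, patient number 1000000024, encounter number 1880001024 and start date "2010-03-03 00:00:00" yields one base record plus seven modifier records in the same order as the slide.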

Genomics Import Plugin

Mapping file
    ##genome-build hg18
    ##file-date 2010-07-07
    #sample|patient_num|encounter_num
    NA12878|1000000090|1880003090
    NA12891|1000000093|1880003093
    NA12892|1000000094|1880003094
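As an illustration of how a mapping file like this could be read, the Python sketch below treats `##` lines as key/value metadata, the single `#` line as a column header, and the remaining lines as pipe-delimited sample rows. The function name `parse_mapping` and the returned structure are assumptions made for this example, not the plugin's actual interface.

```python
from typing import Dict, Tuple


def parse_mapping(text: str) -> Tuple[Dict[str, str], Dict[str, Tuple[int, int]]]:
    """Return (metadata, {sample: (patient_num, encounter_num)})."""
    meta: Dict[str, str] = {}
    samples: Dict[str, Tuple[int, int]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            # e.g. "##genome-build hg18" -> {"genome-build": "hg18"}
            key, _, value = line[2:].partition(" ")
            meta[key] = value.strip()
        elif line.startswith("#"):
            continue  # column header: sample|patient_num|encounter_num
        else:
            sample, patient_num, encounter_num = line.split("|")
            samples[sample] = (int(patient_num), int(encounter_num))
    return meta, samples


mapping_text = """\
##genome-build hg18
##file-date 2010-07-07
#sample|patient_num|encounter_num
NA12878|1000000090|1880003090
NA12891|1000000093|1880003093
NA12892|1000000094|1880003094
"""
meta, samples = parse_mapping(mapping_text)
print(meta["genome-build"], samples["NA12878"])
```

The sample-to-patient lookup is what lets each VCF sample column be written out under the right patient and encounter numbers.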

Bulk Loader Status

Bulk Loading Observations
1. Send the i2b2 file to the FR
2. Tell the CRC the file is ready to load
3. SSIS package loads the i2b2 file to the observation_fact table

Navigating NGS Variant Data with Sequence Ontology
Combination of concepts and modifiers to identify:
• An SNV/SNP located on a 3' UTR
• An SNV/SNP associated with a certain gene
• An SNV/SNP of specified zygosity

Gene Association Modifier

Specifying Gene Association Modifier

Building a Translational Genomic Query
• Group 1: SNV/SNP with HGNC Gene Symbol modifier of "PPARG"

Building a Translational Genomic Query
• Group 2: SNV/SNP with exon variant modifier
• Note that "Items instance will be same" is selected on the panels

Building a Translational Genomic Query
• Group 3: Diabetes Mellitus
• Select "Treat Independently" for this panel
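The logic of the three groups can be sketched in Python over an in-memory list of facts. This is only an approximation of what the query tool does, based on one reading of the panel options: groups 1 and 2 must match modifiers on the same observation instance, while group 3 is matched per patient. The tuple layout, the function name `patients_matching` and the diagnosis code `DX:DIABETES` are hypothetical; the SO and SEQ codes are the ones shown in the pipeline slides.

```python
from collections import defaultdict

SNV = "SO:0001483"           # SNV concept code from the pipeline slides
EXON_VARIANT = "SO:0001791"  # exonic variant modifier from the pipeline slides


def patients_matching(facts, gene, diagnosis_concepts):
    """facts: iterable of (patient, concept, instance, modifier, tval)."""
    modifiers_by_instance = defaultdict(set)
    diagnosed = set()
    for patient, concept, instance, modifier, tval in facts:
        if concept in diagnosis_concepts:
            diagnosed.add(patient)            # group 3: treated independently
        elif concept == SNV:
            modifiers_by_instance[(patient, instance)].add((modifier, tval))
    # Groups 1 and 2: both modifiers must sit on the same SNV instance.
    with_variant = {
        patient
        for (patient, _instance), mods in modifiers_by_instance.items()
        if ("SEQ:HUGO", gene) in mods
        and any(modifier == EXON_VARIANT for modifier, _ in mods)
    }
    return with_variant & diagnosed


facts = [
    (1, SNV, 1, "SEQ:HUGO", "PPARG"), (1, SNV, 1, EXON_VARIANT, ""),
    (1, "DX:DIABETES", 1, "@", ""),
    (2, SNV, 1, "SEQ:HUGO", "PPARG"), (2, SNV, 2, EXON_VARIANT, ""),
    (2, "DX:DIABETES", 1, "@", ""),
    (3, SNV, 1, "SEQ:HUGO", "PPARG"), (3, SNV, 1, EXON_VARIANT, ""),
]
print(patients_matching(facts, "PPARG", {"DX:DIABETES"}))
```

Patient 2 is excluded because the gene and exon modifiers belong to different variant instances, and patient 3 is excluded because there is no diabetes fact.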

Run the query

Summary
• A Genomics plug-in was created to generate observation_fact files from VCF files.
• A bulk loader was written in native (SQL Server) code to allow rapid loading of 2-5 million rows per patient into the observation_fact table.
• The Sequence Ontology (available at NCBO), which is associated with the GVF format, can be used to query next-generation sequencing data imported into i2b2.
