
Scalable differential transcript usage analysis for single-cell applications
JEROEN GILIS
EuroBioc2019 presentation
Promotor: Prof. Lieven Clement
Supervisor: Dr. Koen Van den Berge

Differential Transcript Usage (DTU)
[Figure: schematic with the labels DNA, transcription, pre-mRNA, alternative splicing, isoform M1, isoform M2, translation, normal metabolism and tumorigenesis. Panels: gene-level analysis (expression level, cpm) and transcript-level analysis (relative usage, %) for M1 and M2.]
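The contrast between the gene-level and the transcript-level panels can be sketched with a few lines of Python. This is only an illustration: the counts, the gene name `geneA` and the condition names are hypothetical, and only the isoform labels M1 and M2 come from the slide.

```python
from collections import defaultdict


def relative_usage(counts, tx2gene):
    """Return each transcript's share of its gene's total count in one sample."""
    gene_totals = defaultdict(float)
    for tx, count in counts.items():
        gene_totals[tx2gene[tx]] += count
    usage = {}
    for tx, count in counts.items():
        total = gene_totals[tx2gene[tx]]
        # A gene with zero total count has no defined usage.
        usage[tx] = count / total if total > 0 else float("nan")
    return usage


# Hypothetical counts: both isoforms belong to the same (made-up) gene.
tx2gene = {"M1": "geneA", "M2": "geneA"}
condition_a = {"M1": 80, "M2": 20}
condition_b = {"M1": 20, "M2": 80}

usage_a = relative_usage(condition_a, tx2gene)
usage_b = relative_usage(condition_b, tx2gene)

# The gene-level total is identical in both conditions,
# while the relative usage of M1 and M2 is swapped.
gene_total_a = sum(condition_a.values())
gene_total_b = sum(condition_b.values())
```

In this toy case a gene-level analysis sees no change, whereas the usage of each isoform differs between the two conditions, which is the situation DTU analysis targets.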

Method development
• Our workflow unlocks edgeR for DTU analysis
DGE: $Y_{gi} \sim NB(\mu_{gi}, \phi_g)$, $\log(\mu_{gi}) = \eta_{gi}$, $\eta_{gi} = \beta_0 + \beta^{C}_{gc} + \log(S_i)$

Method development
• Our workflow unlocks edgeR for DTU analysis
DTE: $Y_{ti} \sim NB(\mu_{ti}, \phi_t)$, $\log(\mu_{ti}) = \eta_{ti}$, $\eta_{ti} = \beta_0 + \beta^{C}_{tc} + \log(S_i)$

Method development
• Our workflow unlocks edgeR for DTU analysis
DTU: $Y_{ti} \sim NB(\mu_{ti}, \phi_t)$, $\log(\mu_{ti}) = \eta_{ti}$, $\eta_{ti} = \beta_0 + \beta^{C}_{tc} + \log(T_{ti})$
• Our workflow uses the gene-level counts (total counts, $T_{ti}$) as offsets in the GLM framework (edgeR-total)
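Moving the offset to the left-hand side gives log(µ_ti / T_ti) = β_0 + β_tc, so the linear predictor describes the expected share of the gene's total count rather than the expected count itself. The sketch below shows one way the offsets log(T_ti) could be computed in plain Python. The function name, the input layout and the example counts are assumptions made for illustration; this is not the authors' implementation, and it assumes every gene total is positive.

```python
import math
from collections import defaultdict


def total_count_offsets(counts, tx2gene):
    """Compute log(T_ti): the log of the gene-level total for each transcript and sample.

    counts  : {transcript: [count in sample 1, ..., count in sample m]}
    tx2gene : {transcript: gene}
    Assumes every gene has a positive total count in every sample.
    """
    n_samples = len(next(iter(counts.values())))
    gene_totals = defaultdict(lambda: [0.0] * n_samples)
    for tx, row in counts.items():
        totals = gene_totals[tx2gene[tx]]
        for i, count in enumerate(row):
            totals[i] += count
    # Every transcript of a gene receives the same per-sample offset.
    return {tx: [math.log(t) for t in gene_totals[tx2gene[tx]]] for tx in counts}


# Hypothetical example: one gene with two transcripts, two samples.
counts = {"tx1": [10, 5], "tx2": [30, 15]}
tx2gene = {"tx1": "geneA", "tx2": "geneA"}
offsets = total_count_offsets(counts, tx2gene)
```

In a real analysis, genes with a zero total in some sample need separate handling (filtering or a small prior count), because log(0) is undefined.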

Method development
• DEXSeq

Counts
         Sample 1   …   Sample m
Tx 1     112        …   15
Tx t     …          …   …
Tx n     62         …   348

'other' counts
         Sample 1   …   Sample m
Tx 1     25         …   3
Tx t     …          …   …
Tx n     88         …   212

• Our second workflow uses the 'other' counts as offsets (edgeR-other)
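A minimal sketch of how 'other' counts could be derived, assuming they are defined as the gene total minus the transcript's own count (the summed counts of all other transcripts of the same gene). The function name and the two-transcript gene are hypothetical; the example is constructed so that Tx1 reproduces the values 112/25 and 15/3 shown for Tx 1 in the tables above.

```python
def other_counts(counts, tx2gene):
    """Return, per transcript and sample, the summed counts of the gene's other transcripts.

    counts  : {transcript: [count in sample 1, ..., count in sample m]}
    tx2gene : {transcript: gene}
    """
    gene_totals = {}
    for tx, row in counts.items():
        gene = tx2gene[tx]
        previous = gene_totals.get(gene, [0] * len(row))
        gene_totals[gene] = [a + b for a, b in zip(previous, row)]
    # 'other' count = gene total minus the transcript's own count.
    return {
        tx: [total - own for total, own in zip(gene_totals[tx2gene[tx]], row)]
        for tx, row in counts.items()
    }


# Hypothetical gene with exactly two transcripts.
counts = {"Tx1": [112, 15], "Tx2": [25, 3]}
tx2gene = {"Tx1": "geneA", "Tx2": "geneA"}
others = other_counts(counts, tx2gene)
```

With only two transcripts per gene, each transcript's 'other' count is simply its partner's count. With more transcripts it is the sum over all the remaining ones.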

Performance evaluation on real bulk data
GTEx dataset, Nature Genetics 45, 580-585 (2013)
[Figure: performance panels for the 5v5, 10v10 and 75v75 comparisons. Methods: DEXSeq, edgeR_total, edgeR_other, limma_diffsplice, DRIMSeq.]

Scalability benchmark on real single-cell data
• Our workflow performs a DTU analysis between two groups of 512 cells in ~20 minutes
• DEXSeq scales quadratically

Single-cell transcriptomics case study
Dataset from Buettner et al., Nature Biotechnology 33, 155-160 (2015)
• Dataset: 288 mouse embryonic stem cells in different cell cycle stages (G1, S and G2M)
• Runtime: < 2 minutes
• Significant enrichment in cell cycle processes
• Several DTU genes are:
  ✦ Biologically relevant
  ✦ Not picked up in a gene-level analysis
  ✦ Clearly differentially used when visualised
[Figure: Ccdc86, proportions of Tx1, Tx2 and Tx3 by phase (G1, S), with significance marked ***. Each dot represents an individual cell, and its size is weighted by the total expression of the gene in that cell.]

Single-cell transcriptomics case study
Buettner dataset, Nature Biotechnology 33, 155-160 (2015)
• Dataset: 288 mouse embryonic stem cells in different cell cycle stages (G1, S and G2M)
• Runtime: < 2 minutes for offset-based methods
• Significant enrichment in cell cycle processes
• Some DTU genes display clear DTU in visualisation and are biologically relevant
• The edgeR_other method yields a large number of (false) positive results; sensitive to outliers (?)
• Discrepancy between edgeR-total and limma diffsplice; to be assessed formally in a single-cell benchmark
[Figure: Eef1d, proportions of Tx8 by phase (G1, G2M), annotated with limma diffsplice, edgeR-total, edgeR-other and ***.]

Take-home messages
We are developing a workflow for studying DTU that:
1. Has a performance similar to that of DEXSeq
2. Correctly controls the false discovery rate
3. Scales towards large transcriptomics datasets


Background - DTU

Background - DEXSeq
• Input: matrix of transcript-level counts (e.g. from Salmon or kallisto)
[Figure labels: transcript-level counts, complementary counts]
• Statistical model: $Y_{ti} \sim NB(\mu_{ti}, \phi_t)$, $\log(\mu_{ti}) = \eta_{ti}$, $\eta_{ti} = \beta^{S}_{ti} + \beta^{T}_{t} + \beta^{TC}_{tci}$

Parametric bulk simulation study
Dataset from Love et al., F1000Research, 7:952 (2018)
[Figure: performance panels for the 3v3, 6v6 and 10v10 comparisons. Methods: DEXSeq, edgeR_total, edgeR_other, limma_diffsplice, DRIMSeq.]

GTEx dataset, stringent filtering
[Figure: performance of DEXSeq, edgeR_total, edgeR_other, limma_diffsplice and DRIMSeq.]

Love dataset, stringent filtering
[Figure: performance of DEXSeq, edgeR_total, edgeR_other, limma_diffsplice and DRIMSeq.]

Other parametric bulk simulations and additional methods
[Figure: panels Love 6v6, Van den Berge 5v5 (1) and Van den Berge 5v5 (2). Methods: DEXSeq, edgeR_total, edgeR_other, limma_diffsplice, DRIMSeq, NBSplice, edgeR_diffsplice.]

Results - Scalability
• Methods that require sample-level intercepts scale quadratically with the number of cells
• edgeR is one order of magnitude faster than DESeq2
• All methods scale linearly with the number of transcripts
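One way to see why sample-level intercepts lead to quadratic scaling is to count the entries of a per-transcript design matrix. The sketch below rests on a simplifying assumption that is not stated on the slides: a model with sample-level intercepts has two rows per cell (the transcript count and its complementary count) and one intercept column per cell, whereas an offset-based model has one row per cell and one column per group. The function name and parameters are hypothetical.

```python
def design_matrix_entries(n_cells, sample_intercepts, n_groups=2):
    """Count design-matrix entries for one transcript under a simplified, assumed layout."""
    if sample_intercepts:
        rows = 2 * n_cells          # transcript count and complementary count per cell
        cols = n_cells + n_groups   # one intercept per cell plus the group effects
    else:
        rows = n_cells              # one row per cell, totals enter as offsets
        cols = n_groups             # only the group effects
    return rows * cols


# Effect of doubling the number of cells under each layout.
ratio_with_intercepts = (
    design_matrix_entries(1024, True) / design_matrix_entries(512, True)
)
ratio_offset_based = (
    design_matrix_entries(1024, False) / design_matrix_entries(512, False)
)
```

Under these assumptions, doubling the number of cells roughly quadruples the matrix with sample-level intercepts but only doubles the offset-based one. Actual runtimes also depend on the fitting algorithm, so this only illustrates the trend.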

