Analyzing ICS Assays Using a BioConductor Pipeline
Greg Finak, Mike Jiang
Gottardo Lab, Fred Hutchinson Cancer Research Center

Highlights
● Our interest was in demonstrating the utility of simple, automated flow analysis tools in BioConductor.
● Pipeline uses only core BioConductor toolset.
● Knowledge-driven gating strategy mimics manual analysis.
● Methodology is fast, reproducible, and easy to interpret.
● Difficulty: dealing with rare populations.

Outline
● Preprocessing and Gating
  ○ Sequential normalization
  ○ Gating strategy
● Challenge 3a (ENV / GAG classification)
● Challenge 3b (Responder / Non-Responder Calls)

Data Description
● 48 individuals (240 FCS files)
  ○ 5 FCS files for each individual:
    ■ 2 stimulations (ENV/GAG)
    ■ 2 negative controls and 1 positive control
  ○ Training: 27 subjects (135 FCS files)
  ○ Testing: 21 subjects (105 FCS files)
● Data are compensated, transformed, and partially gated (for singlets, live cells, and lymphocytes).
● Markers
  ○ CD3/CD4/CD8
  ○ TNFa/IL4/IFNg/IL2
● Goals
  ○ Classify the antigen stimulation
  ○ Classify each sample as either a responder or non-responder

Sequential normalization/Gating

BioConductor Tools
● flowCore (updated)
  ○ Core functionality for flow cytometry data analysis in BioConductor (flowSets, flowFrames).
● flowStats (updated)
  ○ Convenience methods for 1D, 2D gating (rangeGate and quadGate)
  ○ Flow cytometry data normalization (warpSet, warpSetNCDF, warpSetNCDFLowMem)
● ncdfFlow (new)
  ○ netCDF (disk-based) analysis of large flow cytometry data sets.
  ○ All flowCore functionality on netCDF-backed flow data in R.
● flowWorkspace (new)
  ○ Import FlowJo workspaces into R/BioConductor. Used to prepare and distribute challenge 3 data for flowCAP II.

Normalization on CD3

RangeGate on CD3 channel (gated population: CD3+)

Normalization on CD4,CD8

QuadrantGate on CD4 vs CD8 (gated populations: CD4+CD8-, CD4-CD8+)

RangeGate on Cytokine channels
● The major peak is modeled using a robust mean and standard deviation.
● Outlier detection in the positive direction is used to identify positive cells.
(Gated populations shown: TNFa+, IL4+)
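
The two bullets above can be sketched in a few lines of Python. This is only an illustration, not the rangeGate implementation in flowStats: it assumes the robust mean is the median, the robust standard deviation is the scaled median absolute deviation (MAD), and that events more than k robust standard deviations above the peak are called positive. The function names, the constant k = 3, and the simulated data are all hypothetical.

```python
import random
import statistics

def robust_positive_gate(values, k=3.0):
    """Return a cutoff above which events are called positive.

    The major (negative) peak is summarized by the median and a
    MAD-based standard deviation. k is a hypothetical tuning constant.
    """
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    robust_sd = 1.4826 * mad  # scales the MAD to match the SD of a normal peak
    return center + k * robust_sd

def proportion_positive(values, cutoff):
    """Fraction of events strictly above the cutoff."""
    return sum(v > cutoff for v in values) / len(values)

# Simulated channel: a large negative peak plus a rare positive population.
rng = random.Random(0)
events = [rng.gauss(1.0, 0.3) for _ in range(5000)]
events += [rng.gauss(4.0, 0.3) for _ in range(50)]

cutoff = robust_positive_gate(events)
frac = proportion_positive(events, cutoff)
print(f"cutoff={cutoff:.2f}, proportion positive={frac:.4f}")
```

Using the median and MAD keeps the rare positive events from inflating the estimated spread of the main peak, which matters when the positive population is small.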

RangeGate on Cytokine channels of CD4+ (gated populations: IFNg+, IL2+)

Env/Gag classification
● Use the proportion of cytokine+ cells as features
  ○ 4 from CD4+
  ○ 4 from CD8+
● Use paired data.
  ○ If one sample of a pair is ENV, the other is necessarily GAG.
  ○ ENV is systematically higher than GAG for each sample pair in the training data.
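
A minimal sketch of the paired rule, assuming the eight proportions are simply summed into one score per sample and the higher-scoring sample of each pair is labeled ENV. The slides do not say how the features were combined, so the summation, the function name, and the example values below are hypothetical.

```python
def classify_pair(sample_a, sample_b):
    """Label the two stimulated samples of one subject as ENV and GAG.

    Each sample is a dict mapping a feature name to the proportion of
    cytokine-positive cells. The sample with the larger total is called
    ENV; ties go to the first sample.
    """
    score_a = sum(sample_a.values())
    score_b = sum(sample_b.values())
    return ("ENV", "GAG") if score_a >= score_b else ("GAG", "ENV")

# Hypothetical proportions for one subject (only 4 of the 8 features shown).
first = {"CD4_IL2": 0.0010, "CD4_IFNg": 0.0012, "CD8_IL2": 0.0004, "CD8_IFNg": 0.0020}
second = {"CD4_IL2": 0.0004, "CD4_IFNg": 0.0005, "CD8_IL2": 0.0002, "CD8_IFNg": 0.0006}

labels = classify_pair(first, second)
print(labels)
```

Because the labels within a pair are mutually exclusive, only the ordering of the two samples matters, not an absolute threshold.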

Responder / non-responder calls
● For each patient, does a sample respond to the stimulation?
● The usual approach is to use Fisher's exact test on the count data.
● We take a Bayesian approach...
● Fit a standard Beta-Binomial model to raw counts from each stimulation/control pair.
● Estimate the posterior probability that the proportion of positive cells in the stimulated sample is greater than in the control.

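Beta-Binomial Model
● Data: a 2 x 2 table of cell counts (rows: Stimulated, Unstimulated; columns: Positive, Negative).
● Prior: shrinkage factors are estimated from the data.
● Posterior: via Monte Carlo.
● Finally, estimate the posterior probability that the stimulated proportion is greater than the unstimulated proportion.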
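
The Monte Carlo step can be sketched as follows. This is a simplified illustration, not the authors' model: it assumes independent Beta(a, b) priors on the stimulated and unstimulated proportions with fixed, hypothetical hyperparameters (a = b = 1), whereas the slides estimate the shrinkage from the data. With a Beta prior and binomial counts, each posterior is again a Beta distribution, so it can be sampled directly.

```python
import random

def prob_stim_greater(pos_s, neg_s, pos_u, neg_u,
                      a=1.0, b=1.0, draws=20000, seed=0):
    """Monte Carlo estimate of P(p_stim > p_unstim | counts).

    pos_*/neg_* are positive and negative cell counts for the stimulated
    (s) and unstimulated (u) samples. a and b are assumed prior
    hyperparameters, fixed here rather than estimated from the data.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        # Conjugate update: Beta(a + positives, b + negatives).
        p_s = rng.betavariate(a + pos_s, b + neg_s)
        p_u = rng.betavariate(a + pos_u, b + neg_u)
        hits += p_s > p_u
    return hits / draws

# Hypothetical counts: 60 of 10,000 positive when stimulated, 10 of 10,000 in the control.
p_response = prob_stim_greater(60, 9940, 10, 9990)
p_null = prob_stim_greater(10, 9990, 10, 9990)
print(p_response, p_null)
```

When the stimulated and control counts are identical, the estimate sits near 0.5, which is why a cutoff close to 1 is needed before calling a response.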

Response Prediction
● Calibrate the posterior probabilities using the training data.
● Decision tree to choose cutoff and features.
● Features used for classification were IL2 and IFNg|IL2.
● 2 non-responders were misclassified as responders on the training data.
(Figure: decision tree with splits at <=0.99991 / >0.99991, >0 / =0, and <=0.013 / >0.013)
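
Choosing a cutoff on the training data can be illustrated with a single-split search over one feature. This is a hedged sketch of the general idea only: the slides use a decision tree over several features, while this example handles one feature and picks the threshold with the fewest training errors. The function name and the example scores and labels are hypothetical.

```python
def best_cutoff(scores, labels):
    """Pick the threshold on one feature that minimizes training errors.

    Samples with score > cutoff are called responders (label True).
    Candidate cutoffs are the observed scores. Returns (cutoff, errors).
    """
    best_errors = len(labels) + 1
    best_value = None
    for candidate in sorted(set(scores)):
        errors = sum((s > candidate) != y for s, y in zip(scores, labels))
        if errors < best_errors:
            best_errors, best_value = errors, candidate
    return best_value, best_errors

# Hypothetical posterior probabilities and known responder status.
train_scores = [0.2, 0.5, 0.9, 0.9995, 0.99995, 1.0]
train_labels = [False, False, False, False, True, True]

cutoff_value, n_errors = best_cutoff(train_scores, train_labels)
print(cutoff_value, n_errors)
```

A full tree repeats this search within each resulting subset and across features, which is how several thresholds can appear in one fitted tree.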
