A Flexible Probe Level Approach to Improving the Quality and Relevance of Affymetrix Microarray Data
Chris Harbron, Discovery Statistics, AstraZeneca
Non-Clinical Statistics Conference, Leuven, September 2008

Microarrays
• Enable measurement of the expression levels of many thousands of genes simultaneously
• Provide a detailed description of the biology at a molecular level

Uses Of Gene Expression In The Pharmaceutical Industry
Stages: Drug Discovery, Drug Development, Marketed Products
• Identification of drug targets
• Understanding Modes Of Action
• Personalised Medicine
• Understanding Drug Safety
• Support For Existing & Identifying New Indications
• Biomarkers For Early Assessment Of Efficacy

Microarrays
• Best thing about microarrays:
  – Analyse 1000s of genes simultaneously
  – Won’t miss anything
• Worst thing about microarrays:
  – Analyse 1000s of genes simultaneously
  – Can end up missing the interesting results in a mass of false positives

Reducing False Positives: Filtering
• Often people try to reduce the false positives issue by pre-filtering the genes before analysis
  – Present / Absent calls, Variability, Minimum / average expression level
• And by subsequently selecting arbitrary cut-offs post-analysis
  – p-value & fold change
• Lots of arbitrary choices
• May miss things – some properties may not directly translate across platforms and species
• Present / Absent calls based on differences between PM (perfect match) & MM (mismatch) probes
  – Assumes no signal in MM, which we know to be untrue
  – Also affected by GC content of middle base
  – Arbitrary cut-off from significance test

3d fdr
Maximise confidence by considering a balance of 3 parameters:
• Quality & Relevance of Probe Sets
• Evidence Of Separation (statistical test)
• Size Of Separation (statistical test)
Adaptation of:
• 2d fdr (Ploner et al)
• Informative Genes (Talloen et al)
Result: ranking of probesets, combining all 3 parameters, with a measure of confidence

3 Correlated Criteria
• Size Of Separation (statistical test)
• Evidence Of Separation (statistical test)
• Quality & Relevance of Probes
Test Statistic = Difference / Variability
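The relationship "Test Statistic = Difference / Variability" can be illustrated with a two-group t-statistic, the test the summary slide says the method was shown for. This is a minimal sketch only: the Welch (unequal variance) form, the function name `welch_t` and the expression values are assumptions made for illustration, not taken from the talk.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Two-group t-statistic: difference in means divided by its standard error."""
    difference = mean(group_a) - mean(group_b)            # size of separation
    variability = math.sqrt(variance(group_a) / len(group_a)
                            + variance(group_b) / len(group_b))
    return difference / variability                        # evidence of separation

# Hypothetical log2 expression values for one probeset in two groups.
treated = [8.1, 8.4, 8.0, 8.3]
control = [7.2, 7.5, 7.1, 7.4]
t_stat = welch_t(treated, control)
```

The same difference in means gives a smaller statistic when the within-group spread is larger, which is why size and evidence of separation are correlated but not interchangeable.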

Assessing False Positives
Local False Discovery Rate (fdr): expected proportion of genes with observed statistic Z=z which are false positives

$$\mathrm{fdr}(z) = \pi_0 \, \frac{f_0(z)}{f(z)}$$

where $\pi_0$ is the proportion of truly non-DE genes, $f_0(z)$ is the density for non-DE genes, and $f(z)$ is the observed density.

[Figure: density plot with regions annotated fdr ~ 0.5 and fdr ~ 0]
Distinct from, but related to, global FDR
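As a rough illustration of the local fdr definition on this slide (proportion of truly non-DE genes, times the non-DE density, divided by the observed density), the sketch below estimates the observed density with a Gaussian kernel. Several things here are assumptions for illustration rather than details from the talk: the standard normal null, the fixed bandwidth, the value `pi0=0.9`, the function names and the toy z-scores.

```python
from statistics import NormalDist

def gaussian_kde(z, data, bandwidth):
    """Gaussian kernel estimate of the observed density f(z)."""
    return sum(NormalDist(d, bandwidth).pdf(z) for d in data) / len(data)

def local_fdr(z, data, pi0, bandwidth=0.5):
    """fdr(z) = pi0 * f0(z) / f(z), capped at 1."""
    f0 = NormalDist().pdf(z)                 # assumed theoretical null: N(0, 1)
    f = gaussian_kde(z, data, bandwidth)     # observed density estimate
    return min(1.0, pi0 * f0 / f)

# Toy data (hypothetical): 90 null-like z-scores plus 10 shifted ones.
null_like = [NormalDist().inv_cdf((i + 0.5) / 90) for i in range(90)]
shifted = [3.5 + 0.1 * i for i in range(10)]
z_scores = null_like + shifted

fdr_centre = local_fdr(0.0, z_scores, pi0=0.9)   # middle of the null distribution
fdr_tail = local_fdr(4.0, z_scores, pi0=0.9)     # where the shifted genes sit
```

In this toy set the estimate is high near z = 0, where almost everything observed is explained by the null, and small in the tail, where the observed density far exceeds what the null predicts.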

2d fdr
Ploner et al, Bioinformatics 2006
• Extends the concept of fdr to the joint distribution of two statistics
• Calculates the likelihood of each probeset being a false positive, based on a combination of significance and difference
[Figure: -log10 p-value plotted against log fold change (difference between groups)]
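To make the idea of an fdr over a joint distribution concrete, here is a deliberately simplified sketch that bins (log fold change, -log10 p-value) pairs on a 2-D grid and compares null counts with observed counts in each cell. It is not the published procedure: the raw unsmoothed bins, the cell widths, `pi0=1.0`, the function names and all data values are assumptions, and the null pairs are simply supplied (in practice they might come from permuting group labels).

```python
from collections import Counter

def cell(point, width):
    """Map a (log fold change, -log10 p) pair to a 2-D grid cell."""
    return (int(point[0] // width[0]), int(point[1] // width[1]))

def fdr2d(observed, null, pi0=1.0, width=(1.0, 1.0)):
    """Per-probeset fdr: pi0 * null proportion / observed proportion in its cell."""
    obs_counts = Counter(cell(p, width) for p in observed)
    null_counts = Counter(cell(p, width) for p in null)
    result = []
    for p in observed:
        c = cell(p, width)
        f = obs_counts[c] / len(observed)    # observed proportion, always > 0
        f0 = null_counts[c] / len(null)      # null proportion, may be 0
        result.append(min(1.0, pi0 * f0 / f))
    return result

# Hypothetical statistics: four unremarkable probesets and two far from the null.
observed = [(0.1, 0.2), (0.3, 0.5), (-0.2, 0.4), (0.4, 0.1), (2.5, 4.2), (2.8, 4.6)]
null = [(0.2, 0.3), (-0.1, 0.6), (0.5, 0.2), (0.1, 0.8), (-0.4, 0.1), (0.3, 0.4)]
fdr_values = fdr2d(observed, null)
```

A probeset is scored by how crowded its region of the plane is under the null relative to the observed data, so both statistics contribute jointly instead of through two separate cut-offs.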

Informative / Non-Informative Calls & The PCPV Statistic
I/NI Calls - Talloen et al, Bioinformatics 2007
– Makes use of the multiple probes in an Affymetrix probeset
– Bayesian estimate of a signal to noise ratio
– If a probeset is informative, then the same pattern should be seen within all the probes within the probeset
– Binary classification
PCPV statistic uses a similar concept
– Percentage of total variation in probe intensity explained by the first principal component
– Continuous measure of information
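The PCPV definition above (percentage of total probe variation explained by the first principal component) can be sketched without any external libraries. The details below are assumptions for illustration: a covariance-based PCA on a samples-by-probes matrix, power iteration to find the leading eigenvalue, the function name `pcpv`, and the two toy probesets. The talk does not state how intensities were scaled or which PCA routine was used.

```python
import math
import random

def pcpv(intensities, iterations=200):
    """Percentage of total probe variance explained by the first principal component.

    intensities: list of samples, each a list of probe intensities for one probeset.
    """
    n, p = len(intensities), len(intensities[0])
    means = [sum(row[j] for row in intensities) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in intensities]
    # Probe-by-probe sample covariance matrix.
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    total = sum(cov[i][i] for i in range(p))         # total variance (trace)
    if total == 0:
        return 0.0
    rng = random.Random(0)                            # fixed seed for repeatability
    v = [rng.random() + 0.1 for _ in range(p)]
    for _ in range(iterations):                       # power iteration
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v approximates the leading eigenvalue.
    top = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return 100.0 * top / total

# Hypothetical probesets: 3 probes each, rows are samples.
coherent = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12], [5, 10, 15]]
incoherent = [[9, 9, 9], [9, 7, 7], [7, 9, 7], [7, 7, 9]]
pcpv_coherent = pcpv(coherent)
pcpv_incoherent = pcpv(incoherent)
```

In the first toy probeset every probe follows the same pattern across samples, so one component captures essentially all the variation. In the second the probes are mutually uncorrelated with equal variance, so the first component captures only about a third.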

Informative / Non-Informative Calls: Relationship To PCPV
[Figure: a non-informative probe set has a low PCPV statistic; an informative probe set has a high PCPV statistic]

Informative / Non-Informative Calls & The PCPV Statistic
• If a probeset has a low PCPV statistic, i.e. its constituent probes are not correlated, then either:
  – It’s just measuring noise, i.e. there are no differences between the samples
    • Low levels of expression dominated by noise
    • No variation in expression between samples
  – It’s an unreliable set of probes
• Either way, it’s not very interesting
• Conversely, a high PCPV statistic doesn’t necessarily mean that the gene is interesting in the sense of changing with what we are interested in, e.g. treatment

Higher PCPV Statistics Have More Interesting Profiles

Probes With Higher PCPV Statistics Tend To Be More Interesting But not exclusively so

3d fdr Stratified PCPV
• Probeset Quality & Relevance: calculate the PCPV statistic for each probeset (% of total probe variation in 1st PC)
• Stratify probesets by PCPV statistic
• Significance & Difference: calculate 2d fdr within each stratum of probesets
• Combine data across strata and rank probesets by fdr
Result: ranking of probesets, combining all 3 parameters, with a measure of confidence
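The stratify, estimate, combine and rank flow on this slide can be sketched as follows. This is an outline under stated assumptions, not the author's implementation: equal-sized strata, `n_strata=2`, and the record layout are hypothetical, and a simple one-dimensional local fdr (`local_fdr`, standard normal null over a Gaussian kernel estimate) stands in for the 2d fdr that the talk applies within each stratum.

```python
from statistics import NormalDist

def local_fdr(stats, bandwidth=0.5):
    """Stand-in fdr estimator: N(0, 1) null density over a Gaussian KDE of the stratum."""
    null = NormalDist()
    out = []
    for z in stats:
        f = sum(NormalDist(s, bandwidth).pdf(z) for s in stats) / len(stats)
        out.append(min(1.0, null.pdf(z) / f))
    return out

def stratified_fdr(probesets, n_strata=2, fdr_fn=local_fdr):
    """probesets: list of (name, pcpv, statistic). Returns (name, fdr) sorted by fdr."""
    ordered = sorted(probesets, key=lambda p: p[1])       # order by PCPV
    size = -(-len(ordered) // n_strata)                    # ceiling division
    ranked = []
    for start in range(0, len(ordered), size):
        stratum = ordered[start:start + size]
        fdrs = fdr_fn([p[2] for p in stratum])             # fdr within this stratum only
        ranked.extend((p[0], f) for p, f in zip(stratum, fdrs))
    return sorted(ranked, key=lambda r: r[1])              # combine strata, rank by fdr

# Hypothetical probesets: (name, PCPV %, test statistic).
probesets = [("a", 20, 0.1), ("b", 25, -0.5), ("c", 30, 0.8), ("d", 35, -1.2),
             ("e", 70, 0.3), ("f", 80, 3.5), ("g", 85, -0.4), ("h", 90, 4.0)]
ranking = stratified_fdr(probesets)
```

Because the observed density is estimated separately in each stratum, a large statistic in a stratum rich in signal receives a lower fdr than it would if diluted by the low-PCPV probesets, which is the effect the slide describes.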

3d fdr Stratified PCPV
[Figure: expected distribution of non-DE genes compared with the observed distribution in each panel]
Entire Set Of Probes (fdr ~ 0.75) = High Quality Probes (fdr ~ 0.5) + Low Quality Probes (fdr ~ 0.95)

3d fdr Results
[Figure: 2d fdr compared with 3d fdr]
• Increase in confidence (lower fdr) for high relevance probesets
• Decrease in confidence (higher fdr) for lower relevance probesets
• High confidence probesets (low fdr) enriched, but not exclusively, from higher relevance probesets

3d fdr Results

Applicable Over Different DataSets
• Selected 10 datasets with available covariate information at random from GEO
• Consistently able to detect genes with more confidence using the 3d fdr approach

Summary
• Single ordering of genes combining different properties on a rational basis
• A gene which is outstanding on one parameter, but not others, could still be selected for further investigation
  – Will get missed with standard “and” selection
• Removes arbitrary filtering decisions
• Tried a robust PCA (as RMA fitting is a robust method – median polish)
  – Little change
• Shown for a 2-group t-test – easily extended to ANOVA or regression situation or any other test statistic

Back Up Slides

Relationship Of PCPV to Other Quality Filters
[Figure: Informative ProbeSets compared with Non-Informative ProbeSets]
