Class discrimination for microarray studies
Vlad Popovici, Swiss Institute of Bioinformatics (SIB)
February 5th, 2008

Outline
1. Introduction
2. Discriminant analysis
3. Performance assessment
4. Estimating the performance parameters

Introduction
Example: ER status prediction
Questions:
- How to decide which patient is ER+ and which is ER-?
- What is the expected error?
- What if I prefer to detect most of ER+?

Know your problem!
Remember: good study ←→ clear objectives.
Problems:
- Class comparison: find genes differentially expressed between predefined classes;
- Class prediction: predict one of the predefined classes using the gene expressions;
- Class discovery: cluster analysis; define new classes using clusters of genes/specimens.

Class prediction
Typical applications:
- predict treatment response
- predict patient relapse
- predict the phenotype
- toxicogenomics: predict which chemicals are toxic
- ...
Characteristics:
- supervised learning: requires labelled training data
- the goal is prediction accuracy
- uses some measure of similarity
- relies on feature selection
- quite often incorrectly used

Usage problems:
- improper methodological approach:
  ◮ a well-fitted model does not ensure good prediction (overfitted model)
  ◮ too many features used in the model (curse of dimensionality)
  ◮ feature selection on the full dataset(!)
- reproducibility:
  ◮ improper/insufficient validation
  ◮ batch effects unaccounted for
  ◮ insufficiently documented
- therapeutic relevance
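The "feature selection on the full dataset(!)" problem can be illustrated with a small simulation. This is a minimal sketch, not a recommended pipeline: the data are pure noise, and the t-like score, the nearest-centroid classifier, and the sizes (n = 40, p = 2000, k = 10) are all assumptions made for the illustration.

```python
import numpy as np

def top_features(X, y, k):
    """Indices of the k features with the largest absolute t-like score."""
    a, b = X[y == 1], X[y == -1]
    diff = a.mean(axis=0) - b.mean(axis=0)
    spread = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.argsort(-np.abs(diff / spread))[:k]

def nearest_centroid_predict(X_train, y_train, x):
    """Assign x to the class whose training mean is closest."""
    d_pos = np.linalg.norm(x - X_train[y_train == 1].mean(axis=0))
    d_neg = np.linalg.norm(x - X_train[y_train == -1].mean(axis=0))
    return 1 if d_pos < d_neg else -1

def loocv_accuracy(X, y, k, select_inside):
    """Leave-one-out accuracy, with features chosen inside or outside the loop."""
    n = len(y)
    global_idx = top_features(X, y, k)  # uses ALL samples: the flawed protocol
    hits = 0
    for i in range(n):
        train = np.arange(n) != i
        idx = top_features(X[train], y[train], k) if select_inside else global_idx
        pred = nearest_centroid_predict(X[train][:, idx], y[train], X[i, idx])
        hits += int(pred == y[i])
    return hits / n

rng = np.random.default_rng(0)
n, p, k = 40, 2000, 10
X = rng.normal(size=(n, p))      # pure noise: no feature carries class information
y = np.repeat([1, -1], n // 2)   # arbitrary labels

biased = loocv_accuracy(X, y, k, select_inside=False)
honest = loocv_accuracy(X, y, k, select_inside=True)
print(f"selection on full data: {biased:.2f}; selection inside CV: {honest:.2f}")
```

Because the labels are unrelated to the data, any accuracy well above chance in the first estimate comes from the selection step having already seen the left-out sample.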

Data acquisition: everything up to (and including) normalization.
Design decisions (should be taken before real modeling):
- feature selection method(s)
- classifier(s)
- performance criterion
Model construction (model design): DO NOT USE ALL DATA AT ONCE!!
- feature selection
- classifier design
- performance estimation
- model selection
External validation: other datasets; clinical trials (phase II and III).

Discriminant analysis
Goal: find a separation boundary between the classes.
[Figure: two classes separated by the boundary f(x) = 0, with f(x) > 0 on one side and f(x) < 0 on the other.]

Representing data
[Figure: expression matrix with probesets (1007_s_at, 1053_at, 117_at, ..., probeset i, ..., 211584_s_at, 211585_at) as rows, giving p features, and Tumor 1, Tumor 2, ..., Tumor k, ..., Tumor n as columns.]
- each element to be classified is a vector
- usually we classify tumors/samples/patients → use columns
- p ≫ n
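The layout described above can be sketched with a toy matrix. The expression values and the choice of three tumors are hypothetical; only the probeset names are taken from the slide.

```python
import numpy as np

# Rows are probesets (features), columns are tumors (samples).
probesets = ["1007_s_at", "1053_at", "117_at", "211584_s_at", "211585_at"]
tumors = ["Tumor 1", "Tumor 2", "Tumor 3"]
expr = np.array([
    [7.1, 6.8, 7.4],
    [5.2, 5.9, 5.0],
    [8.3, 8.1, 8.8],
    [4.4, 4.9, 4.1],
    [6.0, 6.2, 5.7],
])  # hypothetical log signals

p, n = expr.shape        # p features, n samples
x_k = expr[:, 1]         # the vector to classify for Tumor 2: one column

# Many numerical routines expect samples as rows, so a transposed view is handy.
X = expr.T               # shape (n, p)
print(p, n, X.shape)
```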

Formalism:
- Data points: $X = \{x_i \in \mathbb{R}^p \mid i = 1, \dots, n\}$; $x$ could be the log ratios or log signals.
- Labels: $Y = \{y_i \in \{\omega_1, \dots, \omega_k\} \mid i = 1, \dots, n\}$ ($k$ classes), e.g. $\omega_1$ = pCR and $\omega_2$ = non-pCR.
- Easier: take $y_i \in \{1, 2, \dots, k\}$, or $y_i \in \{-1, +1\}$ for two classes.
2-class problem (dichotomy): given a finite set of points $X$ and their corresponding labels $Y$ (a training set), find a discriminant function $f$ such that
$$f(x) \begin{cases} > 0 & \text{for } x \in \omega_1, \\ < 0 & \text{for } x \in \omega_2, \end{cases}$$
for all $x$.

Assumptions:
- the training set is representative of the whole population
- the characteristics of the population do not change over time (e.g. same experimental conditions)
Comments:
- "for all x": i.e. infer a rule that works for unseen data (generalization)
- perfect classification of the training data does not ensure generalization; e.g. a rule that merely memorizes f(x_i) = y_i will hardly work on new data
- as stated, the problem is ill-posed: there are infinitely many solutions
- real data is noisy: usually there is no perfect solution
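The point about memorizing the training set can be made concrete. A minimal sketch under stated assumptions: the data and labels are random, and the lookup-table "classifier" with a fixed default guess is a hypothetical construction used only to show perfect training accuracy without generalization.

```python
import numpy as np

rng = np.random.default_rng(1)
X_train = rng.normal(size=(30, 5))
y_train = rng.choice([-1, 1], size=30)   # labels unrelated to X: nothing to learn
X_new = rng.normal(size=(30, 5))
y_new = rng.choice([-1, 1], size=30)

# A "rule" that stores every training point together with its label.
table = {x.tobytes(): int(label) for x, label in zip(X_train, y_train)}

def memorizer(x, default=1):
    """Return the stored label for a known point, a fixed guess otherwise."""
    return table.get(x.tobytes(), default)

train_acc = np.mean([memorizer(x) == t for x, t in zip(X_train, y_train)])
new_acc = np.mean([memorizer(x) == t for x, t in zip(X_new, y_new)])
print(f"training accuracy: {train_acc:.2f}; accuracy on new data: {new_acc:.2f}")
```

On the new points the rule can only fall back on its default guess, so its accuracy there is just the share of new samples that happen to carry that label.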

Linear discriminants
With $x = (x_1, \dots, x_p)^t$ and $w = (w_1, \dots, w_p)^t$ (here $x_i$ is the $i$-th component of $x$), the linear discriminant functions are
$$f(x) = w^t x + w_0 = \sum_{i=1}^{p} w_i x_i + w_0, \qquad w, x \in \mathbb{R}^p,\; w_0 \in \mathbb{R}.$$
New problem: optimize some criterion
$$(w^*, w_0^*) = \arg\max_{w, w_0} J(X, Y; w, w_0).$$
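Evaluating a linear discriminant and reading off the class from its sign can be sketched as follows. The weights, offset, and points are hypothetical, and assigning ties at f(x) = 0 to the second class is an arbitrary choice for this sketch.

```python
import numpy as np

def linear_discriminant(X, w, w0):
    """f(x) = w^t x + w0 for every row x of X."""
    return X @ w + w0

def classify(X, w, w0):
    """+1 where f(x) > 0, -1 otherwise (ties at f(x) = 0 go to -1 here)."""
    return np.where(linear_discriminant(X, w, w0) > 0, 1, -1)

w = np.array([1.0, -2.0])    # hypothetical weights
w0 = 0.5                     # hypothetical offset
X = np.array([[3.0, 1.0],    # f = 3.0 - 2.0 + 0.5 =  1.5
              [0.0, 1.0],    # f = 0.0 - 2.0 + 0.5 = -1.5
              [1.5, 1.0]])   # f = 1.5 - 2.0 + 0.5 =  0.0 (on the boundary)

scores = linear_discriminant(X, w, w0)
labels = classify(X, w, w0)
print(scores, labels)
```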

Geometry of the linear discriminants
[Figure: geometric view of a point x, the weight vector w and the offset w0.]

Fisher’s LDA
Fisher’s criterion:
$$J(w) = \frac{w^t S_B w}{w^t S_W w}$$
where
- $S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^t$ is the between-class scatter matrix ($\mu_i$ is the average of class $i$);
- $S_W = \frac{1}{n - 2}\left(n_1 \hat{\Sigma}_1 + n_2 \hat{\Sigma}_2\right)$ is the pooled within-class covariance matrix ($\hat{\Sigma}_i$ is the estimated covariance of class $i$).
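A short sketch, using only the definitions above, of why maximizing $J$ gives a closed-form direction. It assumes $S_W$ is invertible; since $S_W$ has rank at most $n - 2$, this fails when $p > n - 2$ unless $S_W$ is regularized or restricted in some way.

```latex
\begin{aligned}
\nabla_w J(w) = 0
  &\;\Longrightarrow\; (w^t S_W w)\, S_B w = (w^t S_B w)\, S_W w
   \;\Longrightarrow\; S_B w = J(w)\, S_W w, \\
S_B w &= (\mu_1 - \mu_2)\,\underbrace{(\mu_1 - \mu_2)^t w}_{\text{scalar}}
   \;\propto\; \mu_1 - \mu_2, \\
\text{hence}\quad w &\;\propto\; S_W^{-1}(\mu_1 - \mu_2).
\end{aligned}
```

Only the direction of $w$ matters here, because $J(w)$ is unchanged when $w$ is rescaled.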

Fisher’s criterion (in plain English)
Find the direction $w$ along which the two classes are best separated (in some sense).
Solution: $w = S_W^{-1}(\mu_1 - \mu_2)$
$w_0 = ??$
◮ assuming the data is normal with equal covariances: closed-form formula
◮ alternative: estimate $w_0$ by line search
◮ can be used to embed prior probabilities in the classifier
[Figure: the direction w and the threshold w0.]
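The solution above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the author's implementation: the function name `fisher_lda` and the `prior1` parameter are hypothetical, the data are synthetic, and the offset uses the closed form for the normal, equal-covariance case, $w_0 = -\tfrac{1}{2} w^t(\mu_1 + \mu_2) + \ln\frac{\pi_1}{\pi_2}$, where $\pi_i$ are the class priors.

```python
import numpy as np

def fisher_lda(X1, X2, prior1=0.5):
    """Return (w, w0) for two classes given as (samples x features) arrays."""
    n1, n2 = len(X1), len(X2)
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled within-class covariance: (n1 * Sigma1 + n2 * Sigma2) / (n - 2)
    S1 = np.cov(X1, rowvar=False, bias=True)
    S2 = np.cov(X2, rowvar=False, bias=True)
    S_W = (n1 * S1 + n2 * S2) / (n1 + n2 - 2)
    # Solve S_W w = mu1 - mu2 instead of forming the inverse explicitly.
    w = np.linalg.solve(S_W, mu1 - mu2)
    # Assumption: normal classes with equal covariances; priors shift the offset.
    w0 = -0.5 * w @ (mu1 + mu2) + np.log(prior1 / (1.0 - prior1))
    return w, w0

rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], size=(50, 2))   # synthetic class 1
X2 = rng.normal(loc=[4.0, 4.0], size=(50, 2))   # synthetic class 2

w, w0 = fisher_lda(X1, X2)
f1 = X1 @ w + w0    # should be mostly > 0
f2 = X2 @ w + w0    # should be mostly < 0
accuracy = (np.sum(f1 > 0) + np.sum(f2 < 0)) / (len(X1) + len(X2))
print(f"training accuracy: {accuracy:.2f}")
```

With `prior1=0.5` the log term vanishes and the threshold sits midway between the projected class means. This is training accuracy on well-separated toy data, so it says nothing about performance on real expression data.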

Versions of Fisher’s DA
Under the normality assumption:
- if the covariance matrices are equal and the features are uncorrelated: Diagonal LDA (the covariance matrices are diagonal);
- if the covariance matrices are not equal: Quadratic DA.
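Diagonal LDA keeps only the per-feature variances, so no matrix inversion is needed. A minimal sketch, assuming equal priors and synthetic data; the function name `diagonal_lda`, the sizes, and the 20 shifted features are hypothetical choices for the illustration.

```python
import numpy as np

def diagonal_lda(X1, X2):
    """Return (w, w0) using only the diagonal of the pooled covariance."""
    n1, n2 = len(X1), len(X2)
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled per-feature variance: the diagonal of S_W.
    var = (n1 * X1.var(axis=0) + n2 * X2.var(axis=0)) / (n1 + n2 - 2)
    w = (mu1 - mu2) / var
    w0 = -0.5 * w @ (mu1 + mu2)   # assumption: equal priors
    return w, w0

rng = np.random.default_rng(0)
n_per_class, p = 15, 500           # many more features than samples
shift = np.zeros(p)
shift[:20] = 2.0                   # hypothetical: 20 informative features

X1 = rng.normal(size=(n_per_class, p)) + shift
X2 = rng.normal(size=(n_per_class, p))
w, w0 = diagonal_lda(X1, X2)

# Fresh samples from the same synthetic populations.
X1_new = rng.normal(size=(200, p)) + shift
X2_new = rng.normal(size=(200, p))
acc = (np.sum(X1_new @ w + w0 > 0) + np.sum(X2_new @ w + w0 < 0)) / 400
print(f"accuracy on new synthetic samples: {acc:.2f}")
```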

Versions of Fisher’s DA
[Figure from Duda, Hart & Stork, Pattern Classification.]

A Bayesian perspective
- $p(x \mid \omega_i)$: class-conditional distribution
- from Bayes’ formula:
$$p(\omega_i \mid x) = \frac{p(x \mid \omega_i)\, p(\omega_i)}{p(x)}, \qquad \text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$
- optimal decision (Bayes decision rule): decide $x \in \omega_1$ if $p(\omega_1 \mid x) > p(\omega_2 \mid x)$, otherwise decide $x \in \omega_2$
- $p(\omega_i) = ?$ $p(x \mid \omega_i) = ?$
[Figure: 2 classes, $\omega_1$, $\omega_2$; one continuous feature (e.g. log expression).]
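One possible answer to the two open questions, sketched for a single continuous feature: estimate the priors from class frequencies and model each class-conditional density as a Gaussian. Both choices are assumptions made for this illustration, and the training values are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def posteriors(x, params, priors):
    """p(omega_i | x) for each class, via Bayes' formula."""
    joint = np.array([gaussian_pdf(x, m, s) * pr
                      for (m, s), pr in zip(params, priors)])
    return joint / joint.sum()   # joint.sum() is the evidence p(x)

# Hypothetical training values of one log-expression feature.
x1 = np.array([8.9, 9.4, 10.1, 9.7, 9.2, 10.3])   # class omega_1
x2 = np.array([6.8, 7.5, 7.1, 6.4])               # class omega_2

# p(omega_i): class frequencies; p(x | omega_i): Gaussian fitted per class.
priors = np.array([len(x1), len(x2)]) / (len(x1) + len(x2))
params = [(x1.mean(), x1.std(ddof=1)), (x2.mean(), x2.std(ddof=1))]

post = posteriors(9.5, params, priors)
decision = 1 if post[0] > post[1] else 2   # Bayes decision rule
print(post, decision)
```

Class frequencies are sensible priors only if the training set reflects the population proportions, which ties back to the representativeness assumption.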
