Lecture 15: Microarray analysis (Classification)
Silly Quiz
• Social networking site: how can you find people with interests similar to yours?
Gene Expression Data
• Gene expression data: each row corresponds to a gene; each column (sample s1, s2, …) corresponds to an expression value.
• Can we separate the experiments into two or more classes?
• Given a training set of two classes, can we build a classifier that places a new experiment in one of the two classes?
Formalizing Classification
• Classification problem: find a surface (hyperplane) that will separate the classes.
• Given a new sample point, its class is then determined by which side of the surface it lies on.
• How do we find the hyperplane? How do we find the side that a point lies on?
[Figure: example expression values for genes g1 and g2 across six samples]
Basic geometry
• What is ||x||²? What is x/||x||, for x = (x1, x2)?
• Dot product:
  xᵀy = x1·y1 + x2·y2
      = ||x||·||y|| cos(θx) cos(θy) + ||x||·||y|| sin(θx) sin(θy)
      = ||x||·||y|| cos(θx − θy)
Dot Product
• Let β be a unit vector: ||β|| = 1.
• Recall that βᵀx = ||x|| cos θ, the length of the projection of x onto β.
• What is βᵀx if x is orthogonal (perpendicular) to β?
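As a quick numerical check of the projection identity above, here is a small sketch (assuming NumPy; the example vectors are made up for illustration):

```python
import numpy as np

x = np.array([3.0, 4.0])        # example point, ||x|| = 5
beta = np.array([1.0, 0.0])     # unit vector along the x1-axis

proj = beta @ x                 # beta^T x = ||x|| cos(theta)
print(proj)                     # 3.0: the length of x's shadow on beta

x_perp = np.array([0.0, 2.0])   # orthogonal to beta
print(beta @ x_perp)            # 0.0: orthogonal vectors have zero dot product
```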
Hyperplane
• How can we define a hyperplane L?
• Find the unit vector β that is perpendicular (normal) to the hyperplane.
Points on the hyperplane
• Consider a hyperplane L defined by a unit vector β and distance β0.
• Notes:
  – For all x ∈ L, xᵀβ must be the same: xᵀβ = β0.
  – For any two points x1, x2 ∈ L: (x1 − x2)ᵀβ = 0.
Hyperplane properties
• Given an arbitrary point x, what is the distance from x to the plane L?
  – D(x, L) = βᵀx − β0
• When are points x1 and x2 on different sides of the hyperplane? (When their signed distances βᵀx − β0 have opposite signs.)
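A minimal sketch of these two operations, using the convention D(x, L) = βᵀx − β0 from this slide and assuming β is a unit normal (function names are illustrative):

```python
import numpy as np

def signed_distance(x, beta, beta0):
    """Signed distance from x to the hyperplane {x : beta^T x = beta0},
    assuming beta is a unit vector."""
    return beta @ x - beta0

def same_side(x1, x2, beta, beta0):
    """Two points lie on the same side iff their signed distances share a sign."""
    return signed_distance(x1, beta, beta0) * signed_distance(x2, beta, beta0) > 0
```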
Separating by a hyperplane
• Input: a training set of +ve and −ve examples.
• Goal: find a hyperplane that separates the two classes.
• Classification: a new point x is +ve if it lies on the +ve side of the hyperplane, −ve otherwise.
• In two dimensions the hyperplane is represented by the line {x : −β0 + β1·x1 + β2·x2 = 0}.
Error in classification
• An arbitrarily chosen hyperplane might not separate the two classes; we need to minimize a misclassification error.
• Error: sum of distances of the misclassified points from the hyperplane.
• Let yi = −1 for a +ve example i, and yi = 1 otherwise. Over the set M of misclassified points,
  D(β, β0) = Σ_{i ∈ M} yi (xiᵀβ + β0)
• Other definitions are also possible.
Gradient Descent
• The function D(β) defines the error.
• We follow an iterative refinement: in each step, refine β so that the error is reduced.
• Gradient descent is an approach to such iterative refinement:
  β ← β − ρ · D′(β)
  where ρ is the step size.
Rosenblatt’s perceptron learning algorithm
  D(β, β0) = Σ_{i ∈ M} yi (xiᵀβ + β0)
  ∂D(β, β0)/∂β  = Σ_{i ∈ M} yi xi
  ∂D(β, β0)/∂β0 = Σ_{i ∈ M} yi
⇒ Update rule:
  β  ← β  − ρ Σ_{i ∈ M} yi xi
  β0 ← β0 − ρ Σ_{i ∈ M} yi
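A sketch of this update rule in NumPy, following the slide's label convention (yi = −1 for +ve examples, +1 otherwise), so a point is misclassified exactly when yi(xiᵀβ + β0) is positive; the learning rate ρ and iteration cap are illustrative choices:

```python
import numpy as np

def perceptron(X, y, rho=0.1, max_iter=1000):
    """Sketch of the perceptron update above.
    X: (n x d) array of points; y: length-n array with y[i] = -1 for +ve
    examples and +1 for -ve examples.  A point is misclassified when
    y[i] * (x_i . beta + beta0) >= 0; points exactly on the hyperplane are
    counted as misclassified so the zero initialization makes progress."""
    n, d = X.shape
    beta, beta0 = np.zeros(d), 0.0
    for _ in range(max_iter):
        M = y * (X @ beta + beta0) >= 0          # boolean mask of misclassified points
        if not M.any():                          # everything classified correctly: done
            break
        # gradient step from the slide: subtract rho * sum over M of y_i x_i (and y_i)
        beta -= rho * (y[M][:, None] * X[M]).sum(axis=0)
        beta0 -= rho * y[M].sum()
    return beta, beta0
```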
Classification based on perceptron learning
• Use Rosenblatt’s algorithm to compute the hyperplane L = (β, β0).
• Assign x to class 1 if f(x) = βᵀx + β0 ≥ 0, and to class 2 otherwise.
Perceptron learning
• If many separating hyperplanes are possible, the algorithm does not choose between them.
• If the data is not linearly separable, it does not terminate, and this is hard to detect.
• The time to convergence is not well understood.
Linear Discriminant analysis
• Provides an alternative approach to classification with a linear function.
• Project all points, including the class means, onto the vector β.
• We want to choose β such that:
  – the difference of the projected means is large;
  – the variance within each group is small.
Choosing the right β
• β1 is a better choice than β2, as the variance within each group is small and the difference of the projected means is large.
• How do we compute the best β?
Linear Discriminant analysis
• Fisher Criterion:
  max_β (difference of projected means)² / (sum of projected variances)
LDA cont’d
• What is the projection of a point x onto β? Answer: βᵀx.
• What is the distance between the projected means m̃1 and m̃2?
  (m̃1 − m̃2)² = (βᵀ(m1 − m2))²
LDA Cont’d
  Scatter between the projected means:
  (m̃1 − m̃2)² = (βᵀ(m1 − m2))² = βᵀ(m1 − m2)(m1 − m2)ᵀβ = βᵀ S_B β,  where S_B = (m1 − m2)(m1 − m2)ᵀ.
  Scatter within sample: s̃1² + s̃2², where
  s̃1² = Σ_{x ∈ D1} (βᵀ(x − m1))² = βᵀ S1 β,
  so s̃1² + s̃2² = βᵀ(S1 + S2)β = βᵀ S_w β.
  Fisher Criterion: max_β (βᵀ S_B β) / (βᵀ S_w β)
LDA
  Let max_β (βᵀ S_B β) / (βᵀ S_w β) = λ.
  Then (S_B − λ S_w) β = 0
  ⇒ S_B β = λ S_w β
  ⇒ λ β = S_w⁻¹ S_B β
  ⇒ β = S_w⁻¹ (m1 − m2)   (up to scale, since S_B β always points along m1 − m2)
  Therefore, a simple computation (a matrix inverse) is sufficient to compute the ‘best’ separating hyperplane.
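A minimal NumPy sketch of this computation; the slides only derive the direction β, so the midpoint threshold β0 used below is an added assumption:

```python
import numpy as np

def fisher_lda(X1, X2):
    """Sketch of Fisher's LDA direction: beta = Sw^{-1} (m1 - m2).
    X1, X2 are (n_i x d) arrays of samples from the two classes."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = (X1 - m1).T @ (X1 - m1)          # within-class scatter of class 1
    S2 = (X2 - m2).T @ (X2 - m2)          # within-class scatter of class 2
    Sw = S1 + S2
    beta = np.linalg.solve(Sw, m1 - m2)   # Sw^{-1} (m1 - m2), without an explicit inverse
    # assumed threshold: midpoint of the projected class means
    beta0 = -0.5 * beta @ (m1 + m2)
    return beta, beta0

# classify x as class 1 when beta @ x + beta0 >= 0 (assumed sign convention)
```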
End of Lecture 15
Maximum Likelihood discrimination
• Consider the simple case of single-dimensional data.
• Compute a distribution of the values in each class.
Maximum Likelihood discrimination
• Suppose we knew the distribution of points in each class ωi.
  – We can compute Pr(x | ωi) for all classes i, and take the maximum.
• The true distribution is not known, so usually we assume that it is Gaussian.
ML discrimination
• Use a Bayesian approach to identify the class for each sample:
  Pr(ωi | x) = Pr(x | ωi) Pr(ωi) / Σ_j Pr(x | ωj) Pr(ωj)
• With a Gaussian class-conditional density,
  P(x) = (1 / (σ √(2π))) exp( −(x − µ)² / (2σ²) ),
  the discriminant function is
  g_i(x) = ln(Pr(x | ωi)) + ln(Pr(ωi)) ≅ −(x − µi)² / (2σi²) + ln(Pr(ωi))
ML discrimination recipe (1-dimensional case)
• We know the form of the distribution for each class, but not the parameters.
• Estimate the mean and variance for each class.
• For a new point x, compute the discrimination function g_i(x) for each class i.
• Choose argmax_i g_i(x) as the class for x.
ML discrimination
• Suppose all the points are in one dimension, and all classes are normally distributed.
  Pr(ωi | x) = Pr(x | ωi) Pr(ωi) / Σ_j Pr(x | ωj) Pr(ωj)
  g_i(x) = ln(Pr(x | ωi)) + ln(Pr(ωi)) ≅ −(x − µi)² / (2σi²) − ln(σi) + ln(Pr(ωi))
• Choose argmin_i [ (x − µi)² / (2σi²) + ln(σi) − ln(Pr(ωi)) ]
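A sketch of the 1-dimensional recipe (estimate per-class parameters, then take the argmin above); the function names and estimator choices are illustrative:

```python
import numpy as np

def gaussian_1d_params(values):
    """Estimate (mu, sigma) for one class from its 1-D training values."""
    return np.mean(values), np.std(values, ddof=1)

def classify_1d(x, params, priors):
    """1-D ML discrimination rule from the slide:
    pick argmin_i (x - mu_i)^2 / (2 sigma_i^2) + ln(sigma_i) - ln(Pr(omega_i)).
    `params` is a list of (mu_i, sigma_i); `priors` a list of Pr(omega_i)."""
    scores = [((x - mu) ** 2) / (2 * s ** 2) + np.log(s) - np.log(p)
              for (mu, s), p in zip(params, priors)]
    return int(np.argmin(scores))
```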
ML discrimination (multi-dimensional case)
  Sample mean: µ̂ = (1/n) Σ_i x_i
  Sample covariance matrix: Σ̂ = (1/(n − 1)) Σ_k (x_k − µ̂)(x_k − µ̂)ᵀ
ML discrimination (multi-dimensional case)
  p(x | ωi) = 1 / ( (2π)^{d/2} |Σ|^{1/2} ) · exp( −(1/2) (x − m)ᵀ Σ⁻¹ (x − m) )
  g_i(x) = ln(p(x | ωi)) + ln(P(ωi))
  Compute argmax_i g_i(x).
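A sketch of the multi-dimensional case in NumPy, combining the estimates from the previous slide with the Gaussian discriminant g_i(x) above; it assumes each class gets its own mean and covariance and that the estimated covariance is invertible:

```python
import numpy as np

def fit_class(X):
    """Per-class sample mean and covariance (previous slide's estimators)."""
    mu = X.mean(axis=0)
    diffs = X - mu
    Sigma = diffs.T @ diffs / (X.shape[0] - 1)
    return mu, Sigma

def discriminant(x, mu, Sigma, prior):
    """g_i(x) = ln p(x | omega_i) + ln P(omega_i) for a Gaussian class model."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)   # (x - mu)^T Sigma^{-1} (x - mu)
    log_norm = -0.5 * (d * np.log(2 * np.pi) + np.log(np.linalg.det(Sigma)))
    return log_norm - 0.5 * quad + np.log(prior)

# assign x to argmax_i discriminant(x, mu_i, Sigma_i, prior_i)
```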
Supervised classification summary
• Most techniques for supervised classification are based on the notion of a separating hyperplane.
• The ‘optimal’ separation can be computed using various combinatorial (perceptron), algebraic (LDA), or statistical (ML) analyses.
Review of microarray analysis
The dynamic picture of cellular activity
• Each cell is continuously active:
  – genes are transcribed into RNA;
  – RNA is translated into proteins;
  – proteins are post-translationally (PT) modified and transported;
  – proteins perform various cellular functions.
• Can we probe the cell dynamically?
  – Which transcripts are active? (transcript profiling)
  – Which proteins are active? (proteomic profiling)
  – Which proteins interact? (gene regulation)
Other static analyses are possible
• Examples (from the diagram): genome assembly, genomic analysis / population genetics, sequence analysis, gene finding, protein sequence analysis, ncRNA.
Silly Quiz
• Who are these people, and what is the occasion?
Genome Sequencing and Assembly
DNA Sequencing
• DNA is double-stranded.
• The strands are separated, and a polymerase is used to copy the second strand.
• Special (chain-terminating) bases terminate this process early.
Sequencing
• A break (termination) at T is shown in the figure.
• Measuring the fragment lengths using electrophoresis allows us to get the position of each T.
• The same can be done with every nucleotide; fluorescent labeling can help separate the different nucleotides.