4/8/09
CSCI1950-Z Computational Methods for Biology
Lecture 17
Ben Raphael
April 6, 2009
http://cs.brown.edu/courses/csci1950-z/

Classification
Binary classification
Given a set of examples $(x_i, y_i)$, where $y_i = \pm 1$, from an unknown distribution $D$.
Design a function $f: \mathbb{R}^n \to \{-1, +1\}$ that optimally assigns additional samples $x_i$ to one of two classes.
Supervised learning
• $(x_i, y_i)$: training data.
• $x_i^{(j)}$: feature.
• $\mathbb{R}^n$: feature space.
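As an illustration of the setup above, here is a minimal sketch of one possible choice of $f$: a linear rule $f(x) = \mathrm{sign}(w \cdot x + b)$ fitted with the perceptron update. The perceptron, the toy data, and the names `predict`, `train_perceptron`, and `training_data` are assumptions made for this example; the slides do not prescribe a particular learning algorithm here.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def predict(w: Vector, b: float, x: Vector) -> int:
    """Linear decision rule f(x) = sign(w . x + b); a score of 0 maps to +1."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


def train_perceptron(
    data: List[Tuple[Vector, int]], epochs: int = 100
) -> Tuple[List[float], float]:
    """Fit (w, b) with the perceptron rule; stops early after a mistake-free pass."""
    n = len(data[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            if predict(w, b, x) != y:
                # Move the hyperplane toward the misclassified example.
                w = [wj + y * xj for wj, xj in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


# Hypothetical training data: pairs (x_i, y_i) with x_i in R^2 and y_i in {-1, +1}.
training_data = [
    ((2.0, 1.0), 1),
    ((3.0, 2.5), 1),
    ((-1.0, -2.0), -1),
    ((-2.5, -0.5), -1),
]

w, b = train_perceptron(training_data)
print("w =", w, "b =", b)
print("f((4, 4)) =", predict(w, b, (4.0, 4.0)))
```

The perceptron only converges when the two classes are linearly separable, which holds for this toy set; it makes no claim of optimality in the sense of the slide.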
Classification in 2D

Hard vs. Soft Option
Decision Surfaces
• Linear
• Nonlinear

Embedding in Higher Dimension
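A small sketch of why embedding in a higher dimension can help, under assumed toy data: the 1D points below cannot be split by a single threshold, but after the map $x \mapsto (x, x^2)$ a linear rule separates them. The data, the map `embed`, and the weights $w = (0, 1)$, $b = -1$ are illustrative choices, not taken from the slides.

```python
from typing import List, Tuple

# Hypothetical 1D data: class +1 lies far from the origin, class -1 near it.
points: List[Tuple[float, int]] = [
    (-2.0, 1), (-1.5, 1), (-0.5, -1), (0.0, -1), (0.5, -1), (1.5, 1), (2.0, 1),
]


def separable_in_1d(data: List[Tuple[float, int]]) -> bool:
    """Distinct 1D points are linearly separable iff the label changes
    at most once when the points are read in sorted order."""
    labels = [y for _, y in sorted(data)]
    changes = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return changes <= 1


def embed(x: float) -> Tuple[float, float]:
    """Map a point from R^1 into R^2 by appending its square."""
    return (x, x * x)


def classify_embedded(x: float) -> int:
    """Linear rule in the embedded space: sign(w . phi(x) + b), w = (0, 1), b = -1."""
    x1, x2 = embed(x)
    score = 0.0 * x1 + 1.0 * x2 - 1.0
    return 1 if score >= 0 else -1


print("separable in 1D:", separable_in_1d(points))
print("separated after embedding:",
      all(classify_embedded(x) == y for x, y in points))
```

The linear surface $x^2 = 1$ in the embedded space corresponds to a nonlinear decision surface (two thresholds, at $x = \pm 1$) in the original space.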
Cancer Classification (Ben-Dor et al. JCB 2000)
• Colon cancer (Alon et al. (1999))
  – 62 samples. 6000 measured genes. 2000 selected.
• Ovary (Schummer et al. (1999))
  – 15 cancerous, 13 normal, 4 other tissues. 100,000 cDNA clones.
• Leukemia (Golub et al. (1999))
  – 25 AML. 47 ALL. 7,129 measured genes.

Cancer Classification Results (Ben-Dor et al. JCB 2000)
Clustering method (CAST or hierarchical): select similarity threshold to maximize compatibility with sample labeling.
Less than margin
Linear kernel: $K(x, y) = x \cdot y$
Quadratic kernel: $K(x, y) = (x \cdot y + 1)^2$
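The two kernels above can be written directly as functions. The sketch below also checks, for 2D inputs, that the quadratic kernel equals an ordinary dot product after an explicit feature map into $\mathbb{R}^6$. The map `phi` is a standard identity added here for illustration (it is not on the slides), and the input vectors are made up.

```python
import math
from typing import Sequence, Tuple


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """K(x, y) = x . y"""
    return dot(x, y)


def quadratic_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """K(x, y) = (x . y + 1)^2"""
    return (dot(x, y) + 1.0) ** 2


def phi(x: Sequence[float]) -> Tuple[float, ...]:
    """Explicit feature map for 2D inputs such that
    phi(x) . phi(y) = (x . y + 1)^2."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (x1 * x1, x2 * x2, r2 * x1 * x2, r2 * x1, r2 * x2, 1.0)


# Hypothetical inputs.
x = (1.0, 2.0)
y = (3.0, -1.0)

print("linear:", linear_kernel(x, y))
print("quadratic:", quadratic_kernel(x, y))
print("via feature map:", dot(phi(x), phi(y)))
```

The kernel computes the same value as the embedded dot product without ever forming the higher-dimensional vectors, which matters when the number of features is in the thousands, as in the gene expression data sets listed above.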
Results
Colon cancer data