
Structured Multicategory Support Vector Machine with ANOVA decomposition

Yoonkyung Lee, Department of Statistics, The Ohio State University (www.stat.ohio-state.edu/~yklee)


  1. Structured Multicategory Support Vector Machine with ANOVA decomposition
     Yoonkyung Lee, Department of Statistics, The Ohio State University
     www.stat.ohio-state.edu/~yklee

  2. Predictive learning
     - A training data set $\{(x_i, y_i),\ i = 1, \dots, n\}$.
     - Functional relationship $f$ between $x = (x_1, \dots, x_d)$ and $y$.
       Regression: continuous $y$. Classification: categorical $y$.
     - Goodness: prediction accuracy for a given loss $L(y, f(x))$, and interpretation.

  3. Support Vector Machines
     Vapnik (1995), http://www.kernel-machines.org
     Find $f(x) = b + h(x)$ with $h \in \mathcal{H}_K$ minimizing
     $$\frac{1}{n}\sum_{i=1}^{n} \bigl(1 - y_i f(x_i)\bigr)_+ + \lambda \|h\|^2_{\mathcal{H}_K}.$$
     Then $\hat{f}(x) = \hat{b} + \sum_{i=1}^{n} \hat{c}_i K(x_i, x)$.
     - Competitive classification accuracy.
     - Flexibility: implicit embedding through the kernel.
     - Handles high dimensional data.
     - A black box unless the embedding is explicit.
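
A minimal sketch of how the objective on this slide can be evaluated once $f$ is written as a kernel expansion. It assumes binary labels in $\{-1, +1\}$, the hinge loss, and a Gaussian kernel; the slide does not fix a kernel, and the names `gaussian_kernel`, `svm_objective`, `decision`, and `sigma` are chosen here for illustration only.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||X[i] - Z[j]||^2 / (2 sigma^2))."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Z**2, axis=1)[None, :]
          - 2.0 * X @ Z.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def svm_objective(b, c, K, y, lam):
    """Average hinge loss plus lam * ||h||^2 for h = sum_i c_i K(x_i, .)."""
    f = b + K @ c                          # fitted values f(x_i)
    hinge = np.maximum(0.0, 1.0 - y * f)   # (1 - y_i f(x_i))_+
    return hinge.mean() + lam * c @ K @ c  # ||h||^2 = c' K c

def decision(b, c, X_train, X_new, sigma=1.0):
    """Classify new points by the sign of f(x) = b + sum_i c_i K(x_i, x)."""
    return np.sign(b + gaussian_kernel(X_new, X_train, sigma) @ c)

# Illustrative data (assumed, not from the talk).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
K = gaussian_kernel(X, X)

# With f identically zero every hinge term equals 1, so the objective is 1.0.
value_at_zero = svm_objective(0.0, np.zeros(20), K, y, lam=0.1)
```

Minimizing `svm_objective` over `b` and `c` is a quadratic program; the sketch only evaluates the criterion and the resulting decision rule.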

  4. Feature Selection
     - The best subset selection.
     - Nonnegative garrote [Breiman, Technometrics (1995)]
     - Least Absolute Shrinkage and Selection Operator [Tibshirani, JRSS (1996)]
     - COmponent Selection and Smoothing Operator [Lin & Zhang, Technical Report (2003)]
     - Structural modelling with sparse kernels [Gunn & Kandola, Machine Learning (2002)]

  5. ANOVA decomposition
     Wahba (1990)
     Function:
     $$f(x) = b + \sum_{\alpha=1}^{d} f_\alpha(x_\alpha) + \sum_{\alpha<\beta} f_{\alpha\beta}(x_\alpha, x_\beta) + \cdots$$
     Functional space: $f \in \mathcal{H} = \bigotimes_{\alpha=1}^{d} \bigl(\{1\} \oplus \bar{\mathcal{H}}_\alpha\bigr)$,
     $$\mathcal{H} = \{1\} \oplus \sum_{\alpha=1}^{d} \bar{\mathcal{H}}_\alpha \oplus \sum_{\alpha<\beta} \bigl(\bar{\mathcal{H}}_\alpha \otimes \bar{\mathcal{H}}_\beta\bigr) \oplus \cdots$$
     Reproducing kernel (r.k.):
     $$K(x, x') = 1 + \sum_{\alpha=1}^{d} K_\alpha(x, x') + \sum_{\alpha<\beta} K_{\alpha\beta}(x, x') + \cdots$$
     Modification of r.k. by rescaling parameters $\theta \ge 0$:
     $$K_\theta(x, x') = 1 + \sum_{\alpha=1}^{d} \theta_\alpha K_\alpha(x, x') + \sum_{\alpha<\beta} \theta_{\alpha\beta} K_{\alpha\beta}(x, x') + \cdots$$
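
A hedged sketch of a rescaled ANOVA kernel built from main-effect and two-way interaction components. It assumes covariates scaled to $[0, 1]$, the cubic-spline reproducing kernel written with scaled Bernoulli polynomials for each main effect, and products of main-effect kernels for the interactions. The function names and the dictionary layout of `theta` are illustrative choices, not taken from the talk.

```python
import numpy as np
from itertools import combinations

def spline_kernel_1d(s, t):
    """Cubic-spline reproducing kernel on [0, 1] (scaled Bernoulli polynomials)."""
    s = np.asarray(s, float)[:, None]
    t = np.asarray(t, float)[None, :]
    k1 = lambda u: u - 0.5
    k2 = lambda u: (k1(u) ** 2 - 1.0 / 12.0) / 2.0
    k4 = lambda u: (k1(u) ** 4 - k1(u) ** 2 / 2.0 + 7.0 / 240.0) / 24.0
    return k1(s) * k1(t) + k2(s) * k2(t) - k4(np.abs(s - t))

def anova_components(X, Z, two_way=True):
    """Gram matrices keyed by (a,) for main effects and (a, b) for interactions."""
    d = X.shape[1]
    main = [spline_kernel_1d(X[:, a], Z[:, a]) for a in range(d)]
    comps = {(a,): main[a] for a in range(d)}
    if two_way:
        for a, b in combinations(range(d), 2):
            comps[(a, b)] = main[a] * main[b]  # tensor-product kernel
    return comps

def rescaled_kernel(comps, theta):
    """K_theta = 1 + sum over components of theta[key] * K[key]."""
    K = np.ones_like(next(iter(comps.values())))
    for key, Kc in comps.items():
        K = K + theta[key] * Kc
    return K

# Illustrative data (assumed): 5 points, 2 covariates in [0, 1].
rng = np.random.default_rng(1)
X = rng.uniform(size=(5, 2))
comps = anova_components(X, X)
K_full = rescaled_kernel(comps, {key: 1.0 for key in comps})
K_null = rescaled_kernel(comps, {key: 0.0 for key in comps})
```

Setting a rescaling parameter to zero removes that component from the kernel, which is what makes $\theta$ usable for selection.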

  6. $\ell_1$ penalty on $\theta$
     Truncating $\mathcal{H}$ to $\mathcal{F} = \{1\} \oplus \bigoplus_{\nu=1}^{q} \mathcal{F}_\nu$, find $f(x) \in \mathcal{F}$ minimizing
     $$\frac{1}{n}\sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr) + \lambda \sum_{\nu=1}^{q} \theta_\nu^{-1} \|P^\nu f\|^2.$$
     Then $\hat{f}(x) = \hat{b} + \sum_{i=1}^{n} \hat{c}_i \Bigl[\sum_{\nu=1}^{q} \theta_\nu K_\nu(x_i, x)\Bigr]$.
     For sparsity, minimize
     $$\frac{1}{n}\sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr) + \lambda \sum_{\nu=1}^{q} \theta_\nu^{-1} \|P^\nu f\|^2 + \lambda_\theta \sum_{\nu=1}^{q} \theta_\nu$$
     subject to $\theta_\nu \ge 0$ for all $\nu$.
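
A short worked step, assuming the finite expansion $f = b + \sum_i c_i \sum_\nu \theta_\nu K_\nu(x_i, \cdot)$ and writing $\mathbf{K}_\nu$ for the $n \times n$ Gram matrix of component $\nu$ and $\mathbf{c}$ for the coefficient vector (notation introduced here for illustration). It shows why the weighted penalty is linear in $\theta$ once the coefficients are held fixed:

```latex
\begin{align*}
P^\nu f &= \theta_\nu \sum_{i=1}^{n} c_i K_\nu(x_i, \cdot), \\
\|P^\nu f\|^2 &= \theta_\nu^2 \, \mathbf{c}^\top \mathbf{K}_\nu \mathbf{c}, \\
\theta_\nu^{-1} \|P^\nu f\|^2 &= \theta_\nu \, \mathbf{c}^\top \mathbf{K}_\nu \mathbf{c}, \\
\lambda \sum_{\nu=1}^{q} \theta_\nu^{-1} \|P^\nu f\|^2 + \lambda_\theta \sum_{\nu=1}^{q} \theta_\nu
  &= \sum_{\nu=1}^{q} \bigl(\lambda \, \mathbf{c}^\top \mathbf{K}_\nu \mathbf{c} + \lambda_\theta\bigr) \theta_\nu .
\end{align*}
```

With nonnegative weights on each $\theta_\nu$ and the constraint $\theta_\nu \ge 0$, the penalty acts like an $\ell_1$ penalty and can drive some rescaling parameters exactly to zero.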

  7. Structured MSVM
     Lee, Lin & Wahba, JASA (2004)
     Find $f = (f^1, \dots, f^k) = \bigl(b^1 + h^1(x), \dots, b^k + h^k(x)\bigr)$ with the sum-to-zero constraint minimizing
     $$\frac{1}{n}\sum_{i=1}^{n} L(y_i) \cdot \bigl(f(x_i) - y_i\bigr)_+ + \frac{\lambda}{2} \sum_{j=1}^{k} \sum_{\nu=1}^{q} \theta_\nu^{-1} \|P^\nu h^j\|^2 + \lambda_\theta \sum_{\nu=1}^{q} \theta_\nu$$
     subject to $\theta_\nu \ge 0$ for $\nu = 1, \dots, q$.
     By the representer theorem,
     $$\hat{f}^j(x) = \hat{b}^j + \sum_{i=1}^{n} \hat{c}_i^j \sum_{\nu=1}^{q} \theta_\nu K_\nu(x_i, x).$$
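
A small sketch of the multicategory hinge loss from Lee, Lin & Wahba (2004), assuming equal misclassification costs: class $j$ is coded as a $k$-vector with 1 in position $j$ and $-1/(k-1)$ elsewhere, and only the coordinates of the wrong classes are penalized. The helper names `class_codes` and `msvm_hinge` are illustrative.

```python
import numpy as np

def class_codes(labels, k):
    """Code class j as a k-vector: 1 in position j, -1/(k-1) elsewhere."""
    labels = np.asarray(labels)
    Y = np.full((len(labels), k), -1.0 / (k - 1))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

def msvm_hinge(F, labels):
    """Average of L(y_i) . (f(x_i) - y_i)_+ with equal misclassification costs."""
    labels = np.asarray(labels)
    n, k = F.shape
    Y = class_codes(labels, k)
    cost = np.ones((n, k))
    cost[np.arange(n), labels] = 0.0   # no penalty on the true class coordinate
    return np.mean(np.sum(cost * np.maximum(F - Y, 0.0), axis=1))

labels = [0, 1, 2]
Y = class_codes(labels, k=3)
loss_perfect = msvm_hinge(Y, labels)             # f equals the class code
loss_zero = msvm_hinge(np.zeros((3, 3)), labels) # f identically zero
predicted = np.argmax(Y, axis=1)                 # classify by the largest component
```

The class codes sum to zero within each row, matching the sum-to-zero constraint placed on $f$.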

  8. Updating Algorithm
     Denoting the objective function by $\Phi(\theta, b, C)$:
     Initialize $\theta^{(0)} = (1, \dots, 1)^t$ and $(b^{(0)}, C^{(0)}) = \arg\min \Phi(\theta^{(0)}, b, C)$.
     At the $m$-th step ($m = 1, 2, \dots$):
     - $\theta$-step: Find $\theta^{(m)}$ minimizing $\Phi(\theta, b^{(m-1)}, C^{(m-1)})$ with $(b^{(m-1)}, C^{(m-1)})$ fixed.
     - $c$-step: Find $(b^{(m)}, C^{(m)})$ minimizing $\Phi(\theta^{(m)}, b, C)$ with $\theta^{(m)}$ fixed.
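
The alternating scheme can be sketched on a simplified problem. This is not the structured MSVM itself: to keep the example short and solver-free, it swaps the multicategory hinge loss for squared error loss with a single real-valued response, so the $c$-step has a closed form and the $\theta$-step is a bound-constrained convex quadratic problem. It also assumes additive spline components on covariates in $[0, 1]$. All function names, the simulated data, and the tuning values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def spline_kernel_1d(s, t):
    """Cubic-spline reproducing kernel on [0, 1] (scaled Bernoulli polynomials)."""
    s, t = np.asarray(s, float)[:, None], np.asarray(t, float)[None, :]
    k1 = lambda u: u - 0.5
    k2 = lambda u: (k1(u) ** 2 - 1.0 / 12.0) / 2.0
    k4 = lambda u: (k1(u) ** 4 - k1(u) ** 2 / 2.0 + 7.0 / 240.0) / 24.0
    return k1(s) * k1(t) + k2(s) * k2(t) - k4(np.abs(s - t))

def phi(theta, b, c, Ks, y, lam, lam_theta):
    """Squared-error fit + lam * c'K_theta c + lam_theta * sum(theta)."""
    K = np.tensordot(theta, Ks, axes=1)        # K_theta = sum_nu theta_nu K_nu
    resid = y - b - K @ c
    return resid @ resid / len(y) + lam * c @ K @ c + lam_theta * theta.sum()

def c_step(theta, Ks, y, lam):
    """Closed-form minimizer over (b, c) with theta fixed."""
    n = len(y)
    M = np.tensordot(theta, Ks, axes=1) + n * lam * np.eye(n)
    Minv_y = np.linalg.solve(M, y)
    Minv_1 = np.linalg.solve(M, np.ones(n))
    b = Minv_y.sum() / Minv_1.sum()
    return b, Minv_y - b * Minv_1

def theta_step(theta, b, c, Ks, y, lam, lam_theta):
    """Minimize over theta >= 0 with (b, c) fixed, starting from the current theta."""
    G = np.stack([K @ c for K in Ks], axis=1)  # column nu is K_nu c
    w = lam * (c @ G) + lam_theta              # linear weight on each theta_nu
    r = y - b
    def fun(t):
        e = r - G @ t
        return e @ e / len(y) + w @ t, -2.0 * G.T @ e / len(y) + w
    res = minimize(fun, theta, jac=True, method="L-BFGS-B",
                   bounds=[(0.0, None)] * len(theta))
    return res.x

def fit(X, y, lam=1e-3, lam_theta=1e-3, n_iter=5):
    Ks = np.stack([spline_kernel_1d(X[:, a], X[:, a]) for a in range(X.shape[1])])
    theta = np.ones(len(Ks))                   # theta^(0) = (1, ..., 1)
    b, c = c_step(theta, Ks, y, lam)
    history = [phi(theta, b, c, Ks, y, lam, lam_theta)]
    for _ in range(n_iter):
        new_theta = theta_step(theta, b, c, Ks, y, lam, lam_theta)
        # Accept the theta update only if it does not increase the objective.
        if phi(new_theta, b, c, Ks, y, lam, lam_theta) <= history[-1]:
            theta = new_theta
        b, c = c_step(theta, Ks, y, lam)
        history.append(phi(theta, b, c, Ks, y, lam, lam_theta))
    return theta, b, c, history

# Simulated data (assumed): only the first of four covariates matters.
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 4))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=60)
theta, b, c, history = fit(X, y)
```

Each step minimizes the same objective over one block of parameters, so `history` should be non-increasing. With the hinge loss used in the talk, both steps would need a constrained optimization solver in place of the closed form.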

  9. A toy example
     Scatter plot and visualization of the $\theta$-step.
     [Figure: two panels, one with axes $x_1$ and $x_2$ and one with axes $\theta_1$ and $\theta_2$.]

  10. The trajectory of $\theta$
      Two-way interaction spline kernel with $\lambda$ tuned by GCKL.
      [Figure: $\theta$ against $\log_2(\lambda_\theta)$, from $-15$ to $0$; curves labeled 1, 2, and 1x2.]

  11. Classification boundaries
      Ordinary MSVM (0.3970) vs. structured MSVM (0.3967).
      [Figure: two panels of classification boundaries, each with axes $x_1$ and $x_2$ on $[0, 1]$.]

  12. An “apple” example
      […] depends only on […].
      Additive spline kernel with 5-fold CV ([…], […], and […]).
      [Figure: $\theta$ against $\log_2(\lambda_\theta)$, from $-10$ to $0$.]

  13. Gene selection: microarray data
      - 2308 genes and four tumor types [Khan et al., Nature Medicine (2001)]
      - 46 positive rescaling parameters out of 500.
      [Figure: $\theta$ against gene rank, from 0 to 500.]

  14. Concluding remarks
      - Integrate feature selection with learning the classification rule.
      - Enhance interpretation without compromising prediction accuracy.
      - Characterize the solution path for effective computation and tuning.
      - Tailor the structure of the component penalty for refined selection.

  15. Joint work with Yuwon Kim (SNU), Ja-Yong Koo (Inha Univ.), and Sangjun Lee (SNU) in Korea. Manuscript to be posted at www.stat.ohio-state.edu/~yklee.
