

1. NIPS03 Workshop of Feature Extraction and Feature Selection Challenge
Feature Selection using/for Transductive Support Vector Machine (TSVM)
Mr. Zhili Wu, Dr. Chun-hung Li
Department of Computer Science, Hong Kong Baptist University

2. Feature Selection Using/for Transductive SVM (TSVM) – Contents
• Introduction to feature selection
• Why TSVM works
• Technique sharing – not limited to TSVM
• Several technique highlights
• Conclusion
• Your comments & doubts

3. Feature Selection Using/for Transductive SVM (TSVM) – Feature Selection
Feature selection (competition) – impact of Weston's dataset selection
• Your algorithm A*
• Others' algorithms A_1, ..., A_n
• You have M = 2^d - 1 possible feature sets for a d-dimensional dataset: F_1, ..., F_M
• L(A, D(F_i)) = loss of algorithm A on dataset D(F_i)
• Your goal: find a feature set F* in F_1, ..., F_M so that L(A*, D(F*)) < min_{i=1,...,n} L(A_i, D(F*))
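
As an illustration of the L(A, D(F)) notation above, here is a minimal sketch (not from the slides) that scores candidate feature sets for one algorithm by cross-validated error; the use of scikit-learn's LinearSVC as the algorithm A and the size cap `d_max` are assumptions for illustration, since exhaustive search over all 2^d - 1 subsets is only feasible for tiny d.

```python
# Minimal sketch: pick the feature set F* minimizing the estimated loss L(A, D(F)).
from itertools import combinations

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC


def best_feature_set(X, y, d_max=4):
    """Return the feature subset with the lowest cross-validated error."""
    d = X.shape[1]
    best_loss, best_subset = np.inf, None
    for k in range(1, min(d, d_max) + 1):
        for subset in combinations(range(d), k):
            # L(A, D(F)) estimated as 1 - CV accuracy of algorithm A on D(F)
            acc = cross_val_score(LinearSVC(dual=False), X[:, subset], y, cv=5).mean()
            loss = 1.0 - acc
            if loss < best_loss:
                best_loss, best_subset = loss, subset
    return best_subset, best_loss
```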

4. Feature Selection Using/for Transductive SVM (TSVM) – "No Free Feature" Theorem
• From "No Free Lunch" (Weston, NIPS 2002)
• The generalization error of two datasets, averaged over all algorithms, is the same:
  E_A[ R_gen^A [D] ] = E_A[ R_gen^A [D'] ]
• Since any two feature sets induce two new datasets:
  E_A[ R_gen^A [D(F)] ] = E_A[ R_gen^A [D(F')] ]
• Consequence: techniques are very important!

5. Technique 1 – Transductive SVM (TSVM)
Transductive SVM (SVMLight by Joachims)
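
The slide itself only names the method; for reference, a sketch of the standard transductive SVM training problem as formulated by Joachims (1999), which SVMLight solves (the notation here is the usual one from that paper, not necessarily that of the original slide):

```latex
\begin{aligned}
\min_{y_1^*,\dots,y_k^*,\; \mathbf{w},\, b,\; \xi,\, \xi^*} \quad
  & \tfrac{1}{2}\lVert \mathbf{w}\rVert^2
    + C \sum_{i=1}^{n} \xi_i
    + C^* \sum_{j=1}^{k} \xi_j^* \\
\text{s.t.} \quad
  & y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1 - \xi_i
    \quad (\text{labeled examples}) \\
  & y_j^*\,(\mathbf{w}\cdot\mathbf{x}_j^* + b) \ge 1 - \xi_j^*
    \quad (\text{unlabeled examples; the labels } y_j^* \text{ are optimized}) \\
  & \xi_i \ge 0,\; \xi_j^* \ge 0
\end{aligned}
```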

6. Technique 1 – Transductive SVM (TSVM) – Simpler Explanation
A simpler explanation of TSVM:
1. Train an SVM on the labeled data only.
2. Predict labels for the unlabeled data, assigning a fixed fraction to Pos and the rest to Neg.
3. Train on the whole dataset; switch some Pos/Neg pairs according to a goodness measure, and repeat step 3.
4. Repeat steps 2 & 3 until the unlabeled data contribute enough (i.e., their influence on the objective has been raised to the target level).
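
A rough sketch of this loop follows (an approximation using scikit-learn's SVC as the base learner rather than SVMLight; the pair-switching rule, the stopping criterion, and the default `pos_fraction` are simplified assumptions, not the slides' exact procedure):

```python
# Simplified TSVM-style loop: label unlabeled data to a fixed positive fraction,
# then repeatedly retrain and flip the worst-placed Pos/Neg pair.
import numpy as np
from sklearn.svm import SVC


def simple_tsvm(X_lab, y_lab, X_unl, pos_fraction=0.5, max_iter=20):
    # Step 1: train on labeled data only.
    clf = SVC(kernel="linear").fit(X_lab, y_lab)

    # Step 2: assign the top `pos_fraction` of unlabeled points (by decision value) to +1.
    scores = clf.decision_function(X_unl)
    n_pos = int(round(pos_fraction * len(X_unl)))
    y_unl = np.full(len(X_unl), -1)
    y_unl[np.argsort(-scores)[:n_pos]] = 1

    X_all = np.vstack([X_lab, X_unl])
    for _ in range(max_iter):
        # Step 3: train on the whole (labeled + self-labeled) dataset.
        y_all = np.concatenate([y_lab, y_unl])
        clf = SVC(kernel="linear").fit(X_all, y_all)

        # Switch the most "misplaced" Pos/Neg pair (a crude goodness measure).
        margins = clf.decision_function(X_unl) * y_unl
        pos_idx = np.where(y_unl == 1)[0]
        neg_idx = np.where(y_unl == -1)[0]
        worst_pos = pos_idx[np.argmin(margins[pos_idx])]
        worst_neg = neg_idx[np.argmin(margins[neg_idx])]
        if margins[worst_pos] >= 0 and margins[worst_neg] >= 0:
            break  # no badly placed pair left, so stop
        y_unl[worst_pos], y_unl[worst_neg] = -1, 1

    return clf, y_unl
```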

7. Technique 1 – Transductive SVM (TSVM) – Why It Works for the FS Competition
Why TSVM works for the FS competition:
• Unlabeled (validation + test) data are provided
• Accuracy is the first-priority measure
• The fraction of Pos/Neg unlabeled samples is provided
• Also, effective & compatible tools:
  - Dr. Chih-Jen Lin's LIBSVM
  - LIBSVM + SVMLight

8. Feature Selection Using/for Transductive SVM (TSVM) – Technique Summary
Datasets (columns): Arcene, Gisette, Dexter, Dorothea, Madelon
• Normalization: "Normalize 1" (0 mean, unit std); D_ij / sqrt(row-sum * col-sum); 7~20 PCs by PCA
• Score: Fisher score, F-score, odds ratio, F-score
• Feature scaling: scale features by F-score
• Kernel: RBF (C=2^5, g=2^-6) / Poly 2 / Linear / Linear (C+/C- = 19.5) / RBF (g=1, C=1), for Arcene / Gisette / Dexter / Dorothea / Madelon respectively
• Transduction: Yes / Yes / Yes / No / Yes (Arcene / Gisette / Dexter / Dorothea / Madelon)
• Further feature reduction: use the weight vector w to select and rescale features
• Remarks: model selection by MI, BNS, BER score, F-score; T-test; CV seems to overfit?
• BER & rank (by submissions on 1st Dec): 1.58 (11th), 4.4 (6th), 11.52 (11th)
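
One concrete item from the summary, the D_ij / sqrt(row-sum * col-sum) normalization, sketched in Python (the function name, the dense-array assumption, and the epsilon guard are mine, not from the slides):

```python
# Normalize each entry by the square root of its row sum times its column sum:
#   D_ij -> D_ij / sqrt(rowsum_i * colsum_j)
import numpy as np


def rowcol_normalize(D, eps=1e-12):
    row_sums = D.sum(axis=1, keepdims=True)   # shape (n, 1)
    col_sums = D.sum(axis=0, keepdims=True)   # shape (1, m)
    return D / np.sqrt(row_sums * col_sums + eps)
```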

9. Feature Selection Using/for Transductive SVM (TSVM) – Technique Highlights
Madelon – a Fisher-score variant
• Score per feature: (µ+ - µ-) / (s+ + s-)
• 13 features are selected
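
A short sketch of this score, taking µ± and s± to be the per-class mean and standard deviation of each feature (my reading of the slide's notation; the ranking step is only an assumed usage example):

```python
# Fisher-score variant per feature: (mu_pos - mu_neg) / (s_pos + s_neg)
import numpy as np


def fisher_variant(X, y, eps=1e-12):
    pos, neg = X[y == 1], X[y == -1]
    num = pos.mean(axis=0) - neg.mean(axis=0)
    den = pos.std(axis=0) + neg.std(axis=0) + eps
    return num / den


# e.g. keep the 13 top-scoring features (by absolute score), as on Madelon:
# top13 = np.argsort(-np.abs(fisher_variant(X, y)))[:13]
```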

10. Feature Selection Using/for Transductive SVM (TSVM) – Technique Highlights
Dorothea – odds ratio

Contingency counts for a binary feature (truth by rows, feature value by columns):
                 value 0   value 1
  class -1          a         b
  class +1          c         d

• ExpProb odds ratio [1] – for unbalanced classes:
  exp( P(1|class+) - P(1|class-) ) = exp( d/(c+d) - b/(a+b) )
• Other measures like BNS [2], MI, ...
• Is BER a score indicating goodness of features? The balanced error rate (BER) is the average of the errors on each class:
  BER = 0.5 * ( b/(a+b) + c/(c+d) )

[1] Feature selection for unbalanced class distribution and Naïve Bayes, Dunja Mladenic and Marko Grobelnik.
[2] An Extensive Empirical Study of Feature Selection Metrics for Text Classification, George Forman, JMLR 2003 special issue on variable and feature selection.
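
A small sketch of the two quantities above, computed from a single binary feature column x and labels y in {-1, +1} (the helper names and the -1/+1 label encoding are my assumptions):

```python
# Scores from the contingency counts a, b, c, d of one binary feature:
#   a = #(class -1, value 0), b = #(class -1, value 1),
#   c = #(class +1, value 0), d = #(class +1, value 1)
import numpy as np


def expprob_odds_ratio(x, y):
    """exp( P(value=1 | class +1) - P(value=1 | class -1) ) for a binary feature x."""
    a = np.sum((y == -1) & (x == 0))
    b = np.sum((y == -1) & (x == 1))
    c = np.sum((y == 1) & (x == 0))
    d = np.sum((y == 1) & (x == 1))
    return np.exp(d / (c + d) - b / (a + b))


def balanced_error_rate(a, b, c, d):
    """BER = 0.5 * (b/(a+b) + c/(c+d)), treating the feature value as a prediction."""
    return 0.5 * (b / (a + b) + c / (c + d))
```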

11. Feature Selection Using/for Transductive SVM (TSVM) – Technique Highlights
Dexter: a simple Linear-TSVM-RFE
1. Prune some features using scores that are easy to calculate.
2. Rescale the remaining features by their scores.
3. Train a linear TSVM (with good generalization ability).
4. Calculate the feature weight vector w.
5. Rank features and rescale features by w.
6. Repeat steps 3~5 until a balance of feature relevance & accuracy is reached.
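
A condensed sketch of steps 3~5 as a recursive loop; for brevity it uses an ordinary (inductive) linear SVM in place of the linear TSVM, and the stopping size `n_keep` and pruning fraction are assumed parameters, not values from the slides:

```python
# RFE-style loop: train a linear SVM, rescale/rank features by |w|, drop the
# weakest ones, and repeat. (A stand-in for the Linear-TSVM-RFE used on Dexter.)
import numpy as np
from sklearn.svm import LinearSVC


def linear_svm_rfe(X, y, n_keep=300, drop_fraction=0.2):
    active = np.arange(X.shape[1])
    X_cur = X.astype(float).copy()
    while len(active) > n_keep:
        clf = LinearSVC(dual=False).fit(X_cur, y)   # step 3 (linear TSVM in the slides)
        w = np.abs(clf.coef_.ravel())               # step 4: feature weights
        X_cur = X_cur * w                           # step 5: rescale features by w
        n_drop = max(1, int(drop_fraction * len(active)))
        keep = np.argsort(w)[n_drop:]               # drop the lowest-weight features
        active, X_cur = active[keep], X_cur[:, keep]
    return active
```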

12. Feature Selection Using/for Transductive SVM (TSVM) – Conclusion
1. No Free Feature
2. TSVM
3. Techniques
   1. Scoring methods
   2. TSVM-RFE
4. Other important issues not mentioned:
   1. Model selection
   2. Normalization
   3. ...

13. Your Comments! Thanks!
