
Large scale greedy feature-selection for multi-target learning



  1. Large scale greedy feature-selection for multi-target learning
     Antti Airola, Tapio Pahikkala et al.
     ECML 2015 BigTargets Workshop

  2. Overview
     Joint work with many authors.
     University of Turku: Antti Airola, Pekka Naula, Tapio Pahikkala, Tapio Salakoski (multi-target greedy RLS).

  3. Overview
     Large scale feature selection for multi-target learning.
     Task: select a minimal set of common features that allows accurate predictions over all target tasks.
     Greedy RLS: greedy regularized least-squares, linear time in #inputs, #features, #outputs, #selected.
     Highlights from experiments:
     - Broad-DREAM Gene Essentiality Prediction Challenge
     - Outperforms multi-task Lasso for small feature budgets
     - Also scales to full genome-wide association studies: thousands of samples, hundreds of thousands of features (recent PhD thesis: Sebastian Okser)

  4. Motivation
     Why feature selection?
     1. Accuracy: the regularizing effect of avoiding overfitting leads to better generalization.
     2. Interpretability: obtain a small set of features understandable by a human expert.
     3. Budget constraints: obtaining features costs time and money.

  5. Model sparsity
     Feature-by-target coefficient matrices (8 features, 4 targets):

     W_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 2 & 0 \end{pmatrix}, \quad
     W_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 2 & 3 & -1 & 2 \\ 0 & 0 & 0 & 0 \\ 3 & 1 & 4 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}

     W_1: 8 features needed for prediction. W_2: 2 features needed for prediction.

  6. Learning task
     Least-squares formulation:

     \arg\min_{W \in \mathbb{R}^{d \times t}} \|XW - Y\|_F^2 \quad \text{subject to} \quad C(W)

     Notation:
     X: data matrix; Y: output matrix; W: model coefficients; \|\cdot\|_F: Frobenius norm; C(\cdot): constraint (regularizer).

  7. Multi-task Lasso (baseline)
     Multi-task Lasso (Zhang, 2006):

     \arg\min_{W \in \mathbb{R}^{d \times t}} \|XW - Y\|_F^2 \quad \text{subject to} \quad \sum_{i=1}^{d} \max_j |W_{i,j}| \le r

     The L_{1,\infty} norm enforces sparsity in the number of features; r > 0 is the regularization parameter.
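As a rough point of reference, a baseline of this kind can be approximated with scikit-learn's MultiTaskLasso. Note that scikit-learn penalizes the L_{2,1} mixed norm rather than the L_{1,\infty} norm above, so this is only a stand-in for the exact formulation; the data, variable names, and regularization value are illustrative.

    import numpy as np
    from sklearn.linear_model import MultiTaskLasso

    # Toy data: 100 samples, 50 features, 3 targets (all values illustrative).
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 50))
    W_true = np.zeros((50, 3))
    W_true[:5] = rng.standard_normal((5, 3))            # only 5 informative features
    Y = X @ W_true + 0.1 * rng.standard_normal((100, 3))

    # MultiTaskLasso couples the targets through an L_{2,1} penalty, so a feature
    # is either used by all targets or dropped for all of them.
    model = MultiTaskLasso(alpha=0.1).fit(X, Y)
    selected = np.flatnonzero(np.any(model.coef_ != 0, axis=0))  # coef_ is t x d
    print(selected)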

  8. Greedy RLS
     Greedy RLS (proposed):

     \arg\min_{W \in \mathbb{R}^{d \times t}} \|XW - Y\|_F^2 \quad \text{subject to} \quad \|W\|_F^2 < r \ \text{and} \ |\{ i \mid \exists j: W_{i,j} \neq 0 \}| \le k

     r > 0: regularization parameter; k > 0: constraint on the number of selected features.
     Solving this exactly would require a search over the power set of features, so heuristics are needed.

  9. Greedy RLS
     Greedy regularized least-squares (Greedy RLS):
     - Starting from the empty feature set, at each step add the feature that reduces the leave-one-out cross-validation error the most.
     - Stop once k features have been selected.

  10. Greedy RLS
      Algorithm 1: Multi-target greedy RLS

      S ← ∅                                          ⊲ selected features, common to all tasks
      while |S| < k do                               ⊲ select k features
          e ← ∞
          b ← 0
          for i ∈ {1, ..., d} \ S do                 ⊲ test all remaining features
              e_avg ← 0
              for j ∈ {1, ..., t} do
                  e_{i,j} ← L(X_{:,S∪{i}}, Y_{:,j})  ⊲ LOO error for task j
                  e_avg ← e_avg + e_{i,j} / t
              if e_avg < e then
                  e ← e_avg
                  b ← i
          S ← S ∪ {b}                                ⊲ feature with the lowest LOO error
      W ← A(X_{:,S}, Y)                              ⊲ train the final models
      return W, S
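A minimal wrapper-style sketch of Algorithm 1, assuming scikit-learn's Ridge as the black-box regularized least-squares solver L and explicit leave-one-out cross-validation; the function name greedy_rls_naive and the default regularization value are my own. This is the naive variant whose cost the next slide counts in solver calls, not the optimized algorithm.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    def greedy_rls_naive(X, Y, k, lam=1.0):
        """Greedily pick k feature indices minimizing the average LOO error over all targets."""
        n, d = X.shape
        selected = []
        for _ in range(k):
            best_err, best_i = np.inf, -1
            for i in range(d):
                if i in selected:
                    continue
                cols = selected + [i]
                # Leave-one-out predictions for all targets with the candidate feature added.
                preds = cross_val_predict(Ridge(alpha=lam, fit_intercept=False),
                                          X[:, cols], Y, cv=LeaveOneOut())
                err = np.mean((preds - Y) ** 2)   # averaged over samples and targets
                if err < best_err:
                    best_err, best_i = err, i
            selected.append(best_i)              # feature with the lowest LOO error
        final_model = Ridge(alpha=lam, fit_intercept=False).fit(X[:, selected], Y)
        return final_model, selected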

  11. Greedy RLS
      - Could be implemented as a general wrapper calling a black-box solver.
      - A naive implementation needs #selected × #features × #targets × #CV-rounds solver calls!
      - Matrix-algebraic optimization of the feature-addition and leave-one-out computations (for all targets simultaneously) gives a linear time algorithm in #inputs, #features, #outputs, #selected; a small check of the leave-one-out shortcut follows below.
      P. Naula, A. Airola, T. Salakoski and T. Pahikkala. Multi-label learning under feature extraction budgets. Pattern Recognition Letters, 2014.
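The key shortcut is the closed-form leave-one-out error of regularized least-squares: with G = (K + λI)^{-1}, K = X_S X_S^T, and dual coefficients a = G y, the LOO residual for sample j equals a_j / G_jj, so no model ever has to be retrained. A small NumPy check of this identity (data and names are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, lam = 20, 5, 1.0
    X = rng.standard_normal((n, d))
    y = rng.standard_normal(n)

    G = np.linalg.inv(X @ X.T + lam * np.eye(n))   # (K + lam*I)^{-1}
    a = G @ y                                      # dual coefficients

    loo_fast = a / np.diag(G)                      # closed-form LOO residuals

    # Brute force: refit without sample j and predict on it.
    loo_slow = np.empty(n)
    for j in range(n):
        idx = np.arange(n) != j
        w = np.linalg.solve(X[idx].T @ X[idx] + lam * np.eye(d), X[idx].T @ y[idx])
        loo_slow[j] = y[j] - X[j] @ w

    print(np.allclose(loo_fast, loo_slow))         # True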

  12. Greedy RLS
      Algorithm 2: Multi-target greedy RLS (fast matrix-algebraic version)

      A ← λ^{-1} Y
      g ← λ^{-1} 1
      C ← λ^{-1} X
      S ← ∅
      while |S| < k do
          e ← ∞
          b ← 0
          for i ∈ {1, ..., d} \ S do
              u ← C_{:,i} (1 + (X_{:,i})^T C_{:,i})^{-1}
              e_i ← 0
              Ã ← A − u ((X_{:,i})^T A)
              for h ∈ {1, ..., t} do
                  for j ∈ {1, ..., n} do
                      g̃_j ← g_j − u_j C_{j,i}
                      e_i ← e_i + (g̃_j)^{-2} (Ã_{j,h})^2
              if e_i < e then
                  e ← e_i
                  b ← i
          u ← C_{:,b} (1 + (X_{:,b})^T C_{:,b})^{-1}
          A ← A − u ((X_{:,b})^T A)
          for j ∈ {1, ..., n} do
              g_j ← g_j − u_j C_{j,b}
          C ← C − u ((X_{:,b})^T C)
          S ← S ∪ {b}
      W ← (X_{:,S})^T A
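A compact NumPy sketch of the updates above, assuming X is the n×d data matrix, Y the n×t target matrix, lam the regularization parameter and k the feature budget; variable names mirror the pseudocode, while the function name greedy_rls_fast is my own. Each candidate feature is scored with a rank-one Sherman-Morrison update of the dual solution instead of a refit, which is what gives the claimed linear-time behaviour.

    import numpy as np

    def greedy_rls_fast(X, Y, k, lam=1.0):
        """Multi-target greedy RLS with closed-form LOO scoring (cf. Algorithm 2)."""
        n, d = X.shape
        A = Y / lam                      # dual coefficients (lam*I)^{-1} Y
        g = np.full(n, 1.0 / lam)        # diagonal of (K + lam*I)^{-1}, K = X_S X_S^T
        C = X / lam                      # (K + lam*I)^{-1} X
        S = []
        for _ in range(k):
            best_err, best_i = np.inf, -1
            for i in range(d):
                if i in S:
                    continue
                x = X[:, i]
                u = C[:, i] / (1.0 + x @ C[:, i])     # rank-one update vector
                A_new = A - np.outer(u, x @ A)        # dual coefficients if i were added
                g_new = g - u * C[:, i]               # updated inverse diagonal
                err = np.sum((A_new / g_new[:, None]) ** 2)   # LOO error over all targets
                if err < best_err:
                    best_err, best_i = err, i
            x = X[:, best_i]
            u = C[:, best_i] / (1.0 + x @ C[:, best_i])
            A -= np.outer(u, x @ A)
            g -= u * C[:, best_i]
            C -= np.outer(u, x @ C)
            S.append(best_i)
        W = X[:, S].T @ A                # primal coefficients of the selected features
        return W, S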

  13. Benchmarking greedy RLS and multi-task Lasso
      Table: Mulan datasets (Tsoumakas et al. 2011).

      Data set    | domain  | labels | features | instances
      Scene       | image   |      6 |      294 |      2407
      Yeast       | biology |     14 |      103 |      2417
      Emotions    | music   |      6 |       72 |       593
      Mediamill*  | text    |      9 |      120 |     41583
      Delicious   | text    |    983 |      500 |     16105
      Tmc2007     | text    |     22 |    49060 |     28596

  14. Greedy RLS vs. Lasso
      [Figure: macro-averaged AUC as a function of the number of selected features on the Scene and Yeast data sets, comparing MT-Lasso and ML-gRLS.]

  15. Greedy RLS vs. Lasso
      [Figure: macro-averaged AUC as a function of the number of selected features on the Emotions and Mediamill data sets, comparing MT-Lasso and ML-gRLS.]

  16. Greedy RLS vs. Lasso
      [Figure: macro-averaged AUC as a function of the number of selected features on the Delicious and Tmc2007 data sets, comparing MT-Lasso and ML-gRLS.]

  17. Conclusion
      - Greedy RLS: a linear time algorithm for (multi-target) feature selection.
      - Selects joint features for all target tasks.
      - Competitive when the number of features to be selected is small.
      - Applications in genome-wide association studies.
      - RLScore open source implementation at https://github.com/aatapa/RLScore
