Multi-Target Regression via Random Linear Target Combinations

Grigorios Tsoumakas, Eleftherios Spyromitros-Xioufis, Aikaterini Vrekou, Ioannis Vlahavas
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece


  1. Multi-Target Regression via Random Linear Target Combinations. Grigorios Tsoumakas, Eleftherios Spyromitros-Xioufis, Aikaterini Vrekou, Ioannis Vlahavas. Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece

  2. Multi-Target Regression (also known as multivariate or multi-output regression)

                         𝑞 input variables           𝑟 continuous output variables
                         Y_1    Y_2   …   Y_q        Z_1    Z_2   …   Z_r
     training examples   0.12   1     …   12         0.14   10    …   -1.3
                         2.34   9     …   -5         4.15   12    …   -2.0
                         1.22   3     …   40         1.01   28    …   -5.3
     unknown instances   2.18   2     …   8          ?      ?     …   ?
                         1.76   7     …   23         ?      ?     …   ?

  3. Applications
     • Ecological modeling
       • Predicting physical and chemical properties of soil (forestry, agriculture) and water
     • Economics
       • Sales and price forecasting for multiple products
     • Energy
       • Solar/wind energy production forecasting
       • Load forecasting
     • We expect a rise in popularity
       • Internet of Things, Smart Cities
     Images are logos of corresponding multi-target regression competitions hosted at

  4. Inspiration: transfer of ideas [1] from multi-label classification to multi-target regression

     multi-label classification                  multi-target regression
     RAkEL [2]                                   ?
     random subset of labels                     random subset of targets
     all combinations of binary label values     ?

     [1] E. Spyromitros-Xioufis, G. Tsoumakas, W. Groves, I. Vlahavas, Multi-Label Classification Methods for Multi-Target Regression, arXiv:1211.6581 [cs.LG]
     [2] G. Tsoumakas, I. Vlahavas, Random k-Labelsets: An Ensemble Method for Multilabel Classification, Proc. ECML PKDD 2007, pp. 406-417, Warsaw, Poland, 2007

  5. Inspiration: transfer of ideas [1] from multi-label classification to multi-target regression

     multi-label classification                  multi-target regression
     RAkEL [2]                                   RLC
     random subset of labels                     random subset of targets
     all combinations of binary label values     a random linear combination of targets

     [1] E. Spyromitros-Xioufis, G. Tsoumakas, W. Groves, I. Vlahavas, Multi-Label Classification Methods for Multi-Target Regression, arXiv:1211.6581 [cs.LG]
     [2] G. Tsoumakas, I. Vlahavas, Random k-Labelsets: An Ensemble Method for Multilabel Classification, Proc. ECML PKDD 2007, pp. 406-417, Warsaw, Poland, 2007

  7. Sketching RLC: 𝑟 targets → 𝑠 ≫ 𝑟 targets, built as random linear combinations of the original targets

          original targets           new targets
          Z_1    Z_2    Z_3          A_1    A_2    A_3    A_4    A_5    A_6
     1    -0.5   -0.2    0.4
     2     0.5   -0.3   -1
     3     0.9    0.6   -0.5
     4    -0.8   -0.5   -0.9
     5    -0.5    0.6    0.7
     6     0.4    0.1    0.8
     7    -0.2   -0.3    0.8
     8    -0.4   -0.4   -0.9

  8. Sketching RLC: 𝑟 targets → 𝑠 ≫ 𝑟 targets, built as random linear combinations of the original targets

     𝐴 = 𝑍𝐷, where 𝐷 is an 𝑟 × 𝑠 coefficient matrix of standard uniform values

          original targets           new targets
          Z_1    Z_2    Z_3          A_1     A_2     A_3     A_4     A_5     A_6
     1    -0.5   -0.2    0.4          0.26   -0.23    0.22    0.66    0       0.5
     2     0.5   -0.3   -1           -0.73    0.05   -0.42   -1.2     0.32   -0.25
     3     0.9    0.6   -0.5         -0.29    0.48   -0.3    -0.99   -0.14   -1.02
     4    -0.8   -0.5   -0.9         -0.68   -0.83    0.28   -0.33    0.38    0.89
     5    -0.5    0.6    0.7          0.55   -0.14    0.54    0.93   -0.38    0.1
     6     0.4    0.1    0.8          0.57    0.52   -0.2     0.48   -0.2    -0.37
     7    -0.2   -0.3    0.8          0.53    0.1     0       0.84   -0.04    0.31
     8    -0.4   -0.4   -0.9         -0.67   -0.55    0.08   -0.57    0.34    0.52

     𝐷 =   0      0.7   -0.6   -0.6    0     -0.8
           0.1    0      0.4    0     -0.4   -0.5
           0.7    0.3    0      0.9   -0.2    0

     For an unknown instance, a multi-target regression model predicts the new targets; the original targets are then obtained by solving a system of 𝑠 linear equations with 𝑟 unknowns:

          predicted new targets:   -0.2    1    0.2   -0.5   -0.4    0.5
          original targets:         ?      ?    ?
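
The encoding and decoding steps on this slide can be sketched in NumPy. The arrays `Z`, `D` and `a_hat` below are copied from the slide's example; the names themselves and the choice of `numpy.linalg.lstsq` as the solver are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

# Original targets (8 examples x r = 3 targets), copied from the slide.
Z = np.array([
    [-0.5, -0.2,  0.4],
    [ 0.5, -0.3, -1.0],
    [ 0.9,  0.6, -0.5],
    [-0.8, -0.5, -0.9],
    [-0.5,  0.6,  0.7],
    [ 0.4,  0.1,  0.8],
    [-0.2, -0.3,  0.8],
    [-0.4, -0.4, -0.9],
])

# r x s coefficient matrix (r = 3, s = 6), copied from the slide.
D = np.array([
    [0.0, 0.7, -0.6, -0.6,  0.0, -0.8],
    [0.1, 0.0,  0.4,  0.0, -0.4, -0.5],
    [0.7, 0.3,  0.0,  0.9, -0.2,  0.0],
])

# Encoding: each new target is a linear combination of the original targets.
A = Z @ D                      # shape (8, 6)

# Decoding: predicted new targets for one unknown instance (from the slide).
a_hat = np.array([-0.2, 1.0, 0.2, -0.5, -0.4, 0.5])

# Solve D^T z = a_hat (s equations, r unknowns) in the least-squares sense.
z_hat, *_ = np.linalg.lstsq(D.T, a_hat, rcond=None)
print(z_hat)                   # estimated values of the 3 original targets
```

Because this `D` has full row rank, decoding an exact row of `A` returns the corresponding row of `Z`; for a model's noisy predictions the least-squares solution is only the best fit.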

  9. Some More Details
     • Assumption: original targets take values from the same domain
     • Parameter 𝑙 ∈ {2, …, 𝑟}: number of targets being combined
     • Each original target is involved in 𝑠𝑙/𝑟 new targets
     • 𝐴 = 𝑍𝐷, where 𝐷 is an 𝑟 × 𝑠 coefficient matrix of standard uniform values

          original targets
          Z_1    Z_2    Z_3
     1    -0.5   -0.2    0.4
     2     0.5   -0.3   -1
     3     0.9    0.6   -0.5
     4    -0.8   -0.5   -0.9
     5    -0.5    0.6    0.7
     6     0.4    0.1    0.8
     7    -0.2   -0.3    0.8
     8    -0.4   -0.4   -0.9

     𝐷 =   0      0.7   -0.6   -0.6    0     -0.8
           0.1    0      0.4    0     -0.4   -0.5
           0.7    0.3    0      0.9   -0.2    0
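
As a quick check of the 𝑠𝑙/𝑟 statement, the non-zero pattern of the slide's example matrix can be counted directly. This is a minimal sketch in plain Python; the variable names are illustrative, and reading 𝑙 off the number of non-zero entries per column is an assumption about how the example was built.

```python
# Coefficient matrix from the slide (r = 3 rows, s = 6 columns).
D = [
    [0.0, 0.7, -0.6, -0.6,  0.0, -0.8],
    [0.1, 0.0,  0.4,  0.0, -0.4, -0.5],
    [0.7, 0.3,  0.0,  0.9, -0.2,  0.0],
]
r, s = len(D), len(D[0])

# Targets combined in each new target (non-zero entries per column).
per_column = [sum(1 for i in range(r) if D[i][j] != 0) for j in range(s)]

# New targets each original target takes part in (non-zero entries per row).
per_row = [sum(1 for v in row if v != 0) for row in D]

l = per_column[0]
print(per_column)   # [2, 2, 2, 2, 2, 2]
print(per_row)      # [4, 4, 4]
print(s * l / r)    # 4.0
```

In this example every new target combines 𝑙 = 2 original targets, and each original target appears in 6 · 2 / 3 = 4 new targets.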

  10. Relation to Output Coding

      Methods grouped by motivation and by number of new outputs (𝑠 > 𝑟 or 𝑠 ≤ 𝑟):
      • improve accuracy: RLC, [2,3]
      • improve computational complexity: [1], [4]

      [1] D. Hsu, S. Kakade, J. Langford, T. Zhang, Multi-label prediction via compressed sensing, NIPS 2009, pp. 772-780
      [2] Y. Zhang, J.G. Schneider, Multi-label output codes using canonical correlation analysis, AISTATS 2011
      [3] Y. Zhang, J.G. Schneider, Maximum margin output coding, ICML 2012, icml.cc / Omnipress
      [4] F. Tai, H.T. Lin, Multilabel classification with principal label space transformation, Neural Computation 24(9), 2012, pp. 2508-2542

  11. Experimental Setup: Methods
      • ST
        • One regression model per target using gradient boosting
      • MORF
        • Multi-objective random forest of 100 trees
      • RLC
        • Multi-target regression algorithm: ST
        • Solving system of linear equations: least squares
      All code and specific experimental setup available at MULAN
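
The RLC configuration listed above (one single-target model per new target, least-squares decoding) can be sketched end to end as follows. This is an illustrative sketch, not the MULAN code: the function names are hypothetical, the data are synthetic, and ordinary least squares stands in for the gradient boosting learner used in the experiments. How the 𝑙 targets of each combination are chosen is also an assumption (uniformly at random, without balancing).

```python
import numpy as np

def random_coefficients(r, s, l, rng):
    """r x s matrix; each column has l non-zero standard uniform entries."""
    D = np.zeros((r, s))
    for j in range(s):
        chosen = rng.choice(r, size=l, replace=False)   # targets to combine
        D[chosen, j] = rng.uniform(0.0, 1.0, size=l)
    return D

def fit_st(X, T):
    """Stand-in ST learner: one least-squares linear model per column of T."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])       # intercept column
    W, *_ = np.linalg.lstsq(Xb, T, rcond=None)
    return W

def predict_st(W, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W

def rlc_fit_predict(X_train, Z_train, X_test, s, l, rng):
    r = Z_train.shape[1]
    D = random_coefficients(r, s, l, rng)
    A_train = Z_train @ D                               # s new targets
    W = fit_st(X_train, A_train)                        # s single-target models
    A_hat = predict_st(W, X_test)                       # shape (n_test, s)
    # Decode every test row: solve D^T z = a_hat in the least-squares sense.
    Z_hat_T, *_ = np.linalg.lstsq(D.T, A_hat.T, rcond=None)
    return Z_hat_T.T                                    # shape (n_test, r)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))                            # synthetic inputs
Z = X @ rng.normal(size=(5, 3)) + 0.1 * rng.normal(size=(60, 3))
Z_hat = rlc_fit_predict(X[:40], Z[:40], X[40:], s=50, l=2, rng=rng)
print(Z_hat.shape)                                      # (20, 3)
```

With a linear stand-in learner and a full-rank coefficient matrix, the decoded predictions coincide with those of plain ST, so the sketch only shows the data flow; any accuracy difference would have to come from a non-linear base learner such as gradient boosting.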

  12. Experimental Setup: Datasets

      #       Name                                    Abbreviation         Examples       Features     Targets
      1,2     Airline Ticket Price 1 / 2              atp1d / atp7d        337 / 296      411          6
      3       Electrical Discharge Machining          edm                  154            16           2
      4,5     Occupational Employment Survey 1 / 2    oes1997 / oes2010    334 / 403      263 / 298    16
      6,7     River Flow 1 / 2                        rf1 / rf2            9125           64 / 576     8
      8,9     Solar Flare 1 / 2                       sf1969 / sf1978      323 / 1066     26 / 27      3
      10,11   Supply Chain Management 1 / 2           scm1d / scm20d       9803 / 8966    280 / 61     16
      12      Water Quality                           wq                   1060           16           14

  13. Studying the 𝑠 Parameter: aRRMSE (average relative root mean squared error) of our method (y-axis) with respect to 𝑠 (x-axis), averaged across all datasets and all 𝑙 values
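
For reference, aRRMSE can be computed as below. This is a sketch of the usual definition (per-target RMSE relative to the error of always predicting the target mean, then averaged over targets). The function name and the optional `train_means` argument are illustrative, and falling back to the mean of the evaluated values when no training means are supplied is an assumption.

```python
import math

def arrmse(y_true, y_pred, train_means=None):
    """Average relative RMSE over targets; inputs are lists of rows."""
    n, r = len(y_true), len(y_true[0])
    if train_means is None:
        # Assumption: use the mean of the evaluated values as the baseline.
        train_means = [sum(row[j] for row in y_true) / n for j in range(r)]
    total = 0.0
    for j in range(r):
        sq_err = sum((y_true[i][j] - y_pred[i][j]) ** 2 for i in range(n))
        sq_base = sum((y_true[i][j] - train_means[j]) ** 2 for i in range(n))
        total += math.sqrt(sq_err / sq_base)    # RRMSE of target j
    return total / r

y_true = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
mean_pred = [[2.0, 20.0]] * 3
print(arrmse(y_true, mean_pred))   # 1.0: no better than predicting the mean
print(arrmse(y_true, y_true))      # 0.0: perfect predictions
```

Values below 1 indicate a model that beats the mean predictor on average; a constant target would make the denominator zero and needs separate handling.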

  14. Studying the 𝑙 Parameter: aRRMSE of our method (y-axis) on the atp1d dataset with respect to 𝑠 (x-axis) for 𝑙 ∈ {2, 3, 4, 5, 6}

  15. Comparative Results
      • Results for 𝑙 = 2/3 and 𝑠 = 500 models

      Wins:losses (row vs. column)
                RLC     ST      MORF
      RLC       -       10:2    8:4
      ST        2:10    -       7:5
      MORF      4:8     5:7     -

      Avg. rank
                RLC     ST      MORF
                1.5     2.25    2.25

      • ST with gradient boosting appears to be a strong baseline
      • RLC is better than ST and MORF
      • Wilcoxon signed-rank test at 95% shows a statistically significant difference between RLC and ST

  16. Pairwise Target Correlations: heat-map of the pairwise target correlations for the scm20d dataset

                 atp1d    atp7d    edm      sf1969   sf1978   oes10    oes97    rf1      rf2      scm1d    scm20d   wq
      gain (%)   3.6      2.6      4.6      5.0      3.1      7.9      2.5      -1.3     -2.0     1.6      1.4      1.3
      median     0.8013   0.6306   0.0051   0.2242   0.1484   0.8479   0.7952   0.4077   0.4077   0.6526   0.5785   0.0751
      stdev      0.0788   0.1602   -        1.1247   1.2006   0.0972   0.0785   0.3125   0.3125   0.1316   0.1483   0.0717

  17. Pairwise Target Correlations
      • No strong correlation between the median of pairwise target correlations and the gain of our approach over ST (𝑆 = 0.15)
      • The higher the variance of the pairwise target correlations, the more difficult for our approach to improve over ST (𝑆 = −0.68)
      • Between dataset variants, higher median leads to higher gains
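
The first figure can be reproduced from the per-dataset gain and median values reported with the heat-map slide, assuming 𝑆 denotes the Pearson correlation coefficient (the slide does not name the statistic). The sketch below uses plain Python and an illustrative helper `pearson`.

```python
import math

# Per-dataset values as reported in the deck (order: atp1d, atp7d, edm, sf1969,
# sf1978, oes10, oes97, rf1, rf2, scm1d, scm20d, wq).
gain = [3.6, 2.6, 4.6, 5.0, 3.1, 7.9, 2.5, -1.3, -2.0, 1.6, 1.4, 1.3]
median = [0.8013, 0.6306, 0.0051, 0.2242, 0.1484, 0.8479,
          0.7952, 0.4077, 0.4077, 0.6526, 0.5785, 0.0751]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

print(round(pearson(median, gain), 2))   # 0.15
```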

  18. Recap
      • Our approach
        • Constructs new targets by taking random linear combinations of existing targets
        • Solves a linear equation system at prediction time
      • Relation to multi-label classification methods
        • RAkEL, output coding
      • Results
        • RLC is significantly better than a strong baseline
        • RLC is better than a state-of-the-art approach
        • Interesting viewpoint of average target correlations

  19. Future Work
      • Alternative randomization
        • Gaussian matrices, sparse Rademacher matrices
      • Theoretical understanding
        • WHY and WHEN it works
      • Increase ensemble diversity
        • Multiple coefficient matrices (e.g. 5 x 100 vs 1 x 500)
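
The two alternative randomizations named above could be generated roughly as follows. This is only a sketch of what such matrices might look like: the slide gives no construction, so the function names, the `density` parameter and the unscaled ±1 entries are all assumptions.

```python
import numpy as np

def gaussian_matrix(r, s, rng):
    """r x s matrix with independent standard normal entries."""
    return rng.standard_normal((r, s))

def sparse_rademacher_matrix(r, s, density, rng):
    """r x s matrix with entries in {-1, 0, +1}.

    Each entry is non-zero with probability `density` (hypothetical
    parameter); non-zero entries are +1 or -1 with equal probability.
    """
    signs = rng.choice([-1.0, 1.0], size=(r, s))
    mask = rng.random((r, s)) < density
    return signs * mask

rng = np.random.default_rng(0)
G = gaussian_matrix(3, 6, rng)
R = sparse_rademacher_matrix(3, 6, density=0.5, rng=rng)
print(G.shape, R.shape)   # (3, 6) (3, 6)
```

Either matrix could take the place of the uniform coefficient matrix in 𝐴 = 𝑍𝐷; whether that helps is exactly the open question the slide raises.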

  20. Multi-Target Regression via Random Linear Target Combinations. Thank you! Grigorios Tsoumakas, Eleftherios Spyromitros-Xioufis, Aikaterini Vrekou, Ioannis Vlahavas. Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
