Multi-Target Regression via Random Linear Target Combinations
Grigorios Tsoumakas, Eleftherios Spyromitros-Xioufis, Aikaterini Vrekou, Ioannis Vlahavas
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
Multi-target regression (also known as multivariate or multi-output regression)
d input variables X1, X2, …, Xd
m continuous output variables Y1, Y2, …, Ym
Training examples: input and output values are known
Unknown instances: output values (?) are to be predicted
Images are logos of corresponding multi-target regression competitions hosted at
Transfer of ideas from multi-label classification to multi-target regression [1]
[1] E. Spyromitros-Xioufis, G. Tsoumakas, W. Groves, I. Vlahavas, Multi-Label Classification Methods for Multi-Target Regression, arXiv:1211.6581 [cs.LG]
[2] G. Tsoumakas, I. Vlahavas, Random k-Labelsets: An Ensemble Method for Multilabel Classification, Proc. ECML PKDD 2007, pp. 406-417, Warsaw, Poland, 2007
[Figure: a table of 8 training examples with three original target columns is mapped to a table with six new target columns, each a random linear combination of the original targets]
m targets → r ≫ m targets via random linear combinations
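[Figure: the coefficient matrix, with entries such as 0.7, 0.1, 0.4, 0.3, 0.9, maps the original targets to the new targets]
m × r coefficient matrix D
Z = YD
multi-target regression model
Prediction for an unknown instance (?, ?, ?): solving a system of r linear equations with m unknowns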
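The prediction step above can be sketched with ordinary least squares. This is a minimal illustration, not the authors' implementation (which the slides place in MULAN): the names `coef`, `combined_pred` and `recover_targets` are hypothetical, the coefficient matrix is dense and random for simplicity, and exact combined values stand in for model predictions.

```python
import numpy as np

def recover_targets(combined_pred, coef):
    """Least-squares solution of  targets @ coef = combined_pred.

    coef has shape (m, r); each row of combined_pred gives r equations
    in the m unknown original target values of one instance.
    """
    # lstsq solves A x = b, so transpose: A = coef.T (r x m), b = combined_pred.T (r x n)
    solution, *_ = np.linalg.lstsq(coef.T, combined_pred.T, rcond=None)
    return solution.T

rng = np.random.default_rng(0)
m, r = 3, 6                                   # 3 original targets, 6 new targets
coef = rng.uniform(size=(m, r))               # assumption: dense random coefficients
true_targets = rng.uniform(size=(2, m))       # two hypothetical unknown instances
combined_pred = true_targets @ coef           # stand-in for the model's predictions
recovered = recover_targets(combined_pred, coef)
print(np.allclose(recovered, true_targets))
```

With r > m the system is overdetermined, so noisy predictions of the new targets are averaged out by the least-squares fit rather than propagated one-to-one.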
m targets, m × r coefficient matrix D, Z = YD
Assumption: original targets take values from the same domain
Parameter k ∈ {2, …, m}: number of targets being combined
Each original target is involved in rk/m new targets
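A minimal sketch of how such a coefficient matrix could be built, under stated assumptions: weights are drawn uniformly from [0, 1] (the slide's example values fall in that range, but the distribution is not stated), and the k targets of each combination are chosen uniformly at random, so each original target takes part in rk/m combinations only on average. The function name `random_coefficient_matrix` is hypothetical.

```python
import numpy as np

def random_coefficient_matrix(m, r, k, seed=0):
    """Return an m x r matrix whose columns each have exactly k non-zero weights."""
    rng = np.random.default_rng(seed)
    coef = np.zeros((m, r))
    for j in range(r):
        chosen = rng.choice(m, size=k, replace=False)    # k distinct original targets
        coef[chosen, j] = rng.uniform(0.0, 1.0, size=k)  # assumption: uniform weights
    return coef

m, r, k = 3, 6, 2
coef = random_coefficient_matrix(m, r, k)
targets = np.random.default_rng(1).uniform(size=(8, m))  # 8 examples, m original targets
combined = targets @ coef                                # 8 examples, r new targets
print(combined.shape)
print((coef != 0).sum(axis=1))  # how many new targets each original target is involved in
```

Here r·k/m = 4, so the printed per-target counts sum to 12 and average 4; a balanced selection scheme would be needed to make every count exactly 4.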
Motivation and number of new targets (r > m or r ≤ m)
improve accuracy: RLC, [2,3]
improve computational complexity: [1], [4]
[1] D. Hsu, S. Kakade, J. Langford, T. Zhang, Multi-label prediction via compressed sensing, NIPS 2009, pp. 772–780
[2] Y. Zhang, J.G. Schneider, Multi-label output codes using canonical correlation analysis, AISTATS 2011
[3] Y. Zhang, J.G. Schneider, Maximum margin output coding, ICML 2012, icml.cc / Omnipress
[4] F. Tai, H.T. Lin, Multilabel classification with principal label space transformation, Neural Computation 24(9), 2012, pp. 2508–2542
All code and specific experimental setup available at MULAN
#      Name                                  Abbreviation       Examples     Features   Targets
1,2    Airline Ticket Price 1 / 2            atp1d / atp7d      337 / 296    411        6
3      Electrical Discharge Machining        edm                154          16         2
4,5    Occupational Employment Survey 1 / 2  oes1997 / oes2010  334 / 403    263 / 298  16
6,7    River Flow 1 / 2                      rf1 / rf2          9125         64 / 576   8
8,9    Solar Flare 1 / 2                     sf1969 / sf1978    323 / 1066   26 / 27    3
10,11  Supply Chain Management 1 / 2         scm1d / scm20d     9803 / 8966  280 / 61   16
12     Water Quality                         wq                 1060         16         14
Figure: average aRRMSE of our method (y-axis) with respect to r (x-axis), across all datasets and all k values
Figure: aRRMSE of our method (y-axis) on the atp1d dataset with respect to r (x-axis), for k ∈ {2, 3, 4, 5, 6}
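The slides do not spell out aRRMSE. The sketch below assumes the usual definition, the average over targets of the relative root mean squared error, where each target's error is measured relative to always predicting that target's training-set mean. The function name `arrmse` and the toy numbers are hypothetical.

```python
import numpy as np

def arrmse(y_true, y_pred, y_train_mean):
    """Average relative RMSE over targets.

    y_true, y_pred: arrays of shape (n_examples, n_targets)
    y_train_mean:   array of shape (n_targets,), per-target mean on the training set
    """
    squared_error = ((y_true - y_pred) ** 2).sum(axis=0)           # model error per target
    baseline_error = ((y_true - y_train_mean) ** 2).sum(axis=0)    # mean-predictor error
    return float(np.mean(np.sqrt(squared_error / baseline_error)))

y_true = np.array([[1.0, 10.0], [2.0, 14.0], [4.0, 12.0]])
train_mean = np.array([2.0, 11.0])                 # assumed training-set means
mean_prediction = np.tile(train_mean, (3, 1))      # always predict the training mean
print(arrmse(y_true, mean_prediction, train_mean)) # the mean predictor scores 1.0
print(arrmse(y_true, y_true, train_mean))          # a perfect predictor scores 0.0
```

Values below 1 therefore indicate a model that beats the mean predictor on average across targets.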
Method     RLC   ST    MORF
Avg. rank  1.5   2.25  2.25
Wilcoxon signed-rank test at 95% shows a statistically significant difference between RLC and ST
Pairwise wins:losses over the 12 datasets
ST vs. RLC: 2:10
MORF vs. RLC: 4:8 (RLC vs. MORF: 8:4)
MORF vs. ST: 5:7
appears to be a strong baseline
RLC is better than ST and MORF
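As a consistency check, the average ranks can be re-derived from the pairwise win:loss counts. The sketch assumes that each count reads as wins:losses of the named method against its opponent (ST vs. RLC 2:10, MORF vs. RLC 4:8, MORF vs. ST 5:7) and that there are no ties, in which case a method's rank on a dataset is 1 plus the number of methods that beat it.

```python
n_datasets = 12
# (method, opponent): number of datasets on which `method` beats `opponent`
wins = {("ST", "RLC"): 2, ("MORF", "RLC"): 4, ("MORF", "ST"): 5}
methods = ["RLC", "ST", "MORF"]

def wins_of(a, b):
    """Wins of a over b; the reverse direction follows from the no-ties assumption."""
    if (a, b) in wins:
        return wins[(a, b)]
    return n_datasets - wins[(b, a)]

# Average rank = 1 + (total losses against all other methods) / number of datasets
avg_rank = {
    a: 1 + sum(wins_of(b, a) for b in methods if b != a) / n_datasets
    for a in methods
}
print(avg_rank)  # {'RLC': 1.5, 'ST': 2.25, 'MORF': 2.25}
```

The result matches the reported average ranks, which supports reading the counts this way.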
Dataset:   atp1d atp7d edm sf1969 sf1978 oes10 oes97 rf1 rf2 scm1d scm20d wq
gain (%):  3.6 2.6 4.6 5.0 3.1 7.9 2.5 1.6 1.4 1.3
median:    0.8013 0.6306 0.0051 0.2242 0.1484 0.8479 0.7952 0.4077 0.4077 0.6526 0.5785 0.0751
stdev:     0.0788 0.1602 0.1483 0.0717
Figure: heat-map of the pairwise target correlations for the scm20d dataset
The higher the variance of the pairwise target correlations, the more difficult it is for our approach to improve over ST (correlation = −0.68)
No strong correlation between the median
Between dataset variants, higher median leads to higher gains
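The median and stdev rows summarise the pairwise target correlations of each dataset. A minimal sketch of one way to compute them, assuming Pearson correlation, signed (not absolute) values, and the population standard deviation, none of which the slides specify; the function name is hypothetical.

```python
import numpy as np

def pairwise_target_correlation_stats(targets):
    """Median and standard deviation of the pairwise correlations between target columns."""
    corr = np.corrcoef(targets, rowvar=False)   # m x m Pearson correlation matrix
    upper = np.triu_indices_from(corr, k=1)     # each unordered target pair once
    pairs = corr[upper]
    return float(np.median(pairs)), float(np.std(pairs))

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
# Hypothetical data: three targets that share a common component plus independent noise
targets = base + 0.5 * rng.normal(size=(200, 3))
median, stdev = pairwise_target_correlation_stats(targets)
print(median, stdev)
```

A high median with a low stdev means the targets are uniformly strongly related, which is the regime where the slides report the larger gains between dataset variants.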