Multi-Target Regression via Random Linear Target Combinations - PowerPoint PPT Presentation

Multi-Target Regression via Random Linear Target Combinations
Grigorios Tsoumakas, Eleftherios Spyromitros-Xioufis, Aikaterini Vrekou, Ioannis Vlahavas
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece

Multi-Target Regression
Also known as multivariate or multi-output regression: 𝑞 input variables, 𝑟 continuous output variables.
                      𝑌1     𝑌2    …   𝑌𝑞    |   𝑍1     𝑍2    …   𝑍𝑟
training examples     0.12   1     …   12    |   0.14   10    …   -1.3
training examples     2.34   9     …   -5    |   4.15   12    …   -2.0
training examples     1.22   3     …   40    |   1.01   28    …   -5.3
unknown instances     2.18   2     …   8     |   ?      ?     …   ?
unknown instances     1.76   7     …   23    |   ?      ?     …   ?

Applications
• Ecological modeling
  • Predicting physical and chemical properties of soil (forestry, agriculture) and water
• Economics
  • Sales and price forecasting for multiple products
• Energy
  • Solar/wind energy production forecasting
  • Load forecasting
• We expect a rise in popularity
  • Internet of Things, Smart Cities
Images are logos of corresponding multi-target regression competitions hosted at


Inspiration: transfer of ideas from multi-label classification to multi-target regression [1]
multi-label classification                  multi-target regression
RAkEL [2]                                   RLC
random subset of labels                     random subset of targets
all combinations of binary label values     a random linear combination of targets
[1] E. Spyromitros-Xioufis, G. Tsoumakas, W. Groves, I. Vlahavas, Multi-Label Classification Methods for Multi-Target Regression, arXiv:1211.6581 [cs.LG]
[2] G. Tsoumakas, I. Vlahavas, Random k-Labelsets: An Ensemble Method for Multilabel Classification, Proc. ECML PKDD 2007, pp. 406-417, Warsaw, Poland, 2007


Sketching RLC
𝑟 targets → 𝑠 ≫ 𝑟 targets: random linear combinations of the original targets, 𝑎 = 𝑍𝐷
Original targets 𝑍 (8 examples, 𝑟 = 3):
      𝒛𝟐      𝒛𝟑      𝒛𝟒
1    -0.5    -0.2     0.4
2     0.5    -0.3    -1
3     0.9     0.6    -0.5
4    -0.8    -0.5    -0.9
5    -0.5     0.6     0.7
6     0.4     0.1     0.8
7    -0.2    -0.3     0.8
8    -0.4    -0.4    -0.9
𝑟 × 𝑠 coefficient matrix 𝐷 of standard uniform values (𝑟 = 3, 𝑠 = 6):
      0       0.7    -0.6    -0.6     0      -0.8
      0.1     0       0.4     0      -0.4    -0.5
      0.7     0.3     0       0.9    -0.2     0
New targets 𝑎 = 𝑍𝐷:
      𝒜𝟐      𝒜𝟑      𝒜𝟒      𝒜𝟓      𝒜𝟔      𝒜𝟕
1     0.26   -0.23    0.22    0.66    0       0.5
2    -0.73    0.05   -0.42   -1.2     0.32   -0.25
3    -0.29    0.48   -0.3    -0.99   -0.14   -1.02
4    -0.68   -0.83    0.28   -0.33    0.38    0.89
5     0.55   -0.14    0.54    0.93   -0.38    0.1
6     0.57    0.52   -0.2     0.48   -0.2    -0.37
7     0.53    0.1     0       0.84   -0.04    0.31
8    -0.67   -0.55    0.08   -0.57    0.34    0.52
Prediction for an unknown instance: the multi-target regression model outputs the 𝑠 new targets
     -0.2     1       0.2    -0.5    -0.4     0.5
and the 𝑟 original targets (? ? ?) are obtained by solving a system of 𝑠 linear equations with 𝑟 unknowns.
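The arithmetic on this slide can be checked with a few lines of NumPy. This is a minimal sketch, not the authors' code: the names `Z`, `D`, `A`, `z_hat` and `decode` are chosen here for illustration, and `numpy.linalg.lstsq` is assumed as the solver for the overdetermined system.

```python
import numpy as np

# Original targets Z (8 examples x r = 3 targets), copied from the slide.
Z = np.array([
    [-0.5, -0.2,  0.4],
    [ 0.5, -0.3, -1.0],
    [ 0.9,  0.6, -0.5],
    [-0.8, -0.5, -0.9],
    [-0.5,  0.6,  0.7],
    [ 0.4,  0.1,  0.8],
    [-0.2, -0.3,  0.8],
    [-0.4, -0.4, -0.9],
])

# Coefficient matrix D (r = 3 x s = 6), copied from the slide.
D = np.array([
    [0.0, 0.7, -0.6, -0.6,  0.0, -0.8],
    [0.1, 0.0,  0.4,  0.0, -0.4, -0.5],
    [0.7, 0.3,  0.0,  0.9, -0.2,  0.0],
])

# New targets: each column is a linear combination of the original targets.
A = Z @ D


def decode(new_target_values, D):
    """Recover r original target values from s new-target values by
    solving the overdetermined system D.T @ z = new_target_values
    in the least-squares sense."""
    solution, *_ = np.linalg.lstsq(D.T, new_target_values, rcond=None)
    return solution


# Sanity check: decoding an exact row of A gives back the matching row of Z.
recovered = decode(A[0], D)

# Predicted new targets for the unknown instance shown on the slide.
z_hat = np.array([-0.2, 1.0, 0.2, -0.5, -0.4, 0.5])
estimate = decode(z_hat, D)

print(np.round(A, 2))
print(recovered)
print(estimate)
```

Because the predicted new targets come from separate models, the six equations are generally not exactly consistent, which is why a least-squares solution is used rather than an exact solve.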

Some More Details
• Assumption: the original targets take values from the same domain
• Parameter 𝑙 ∈ {2, …, 𝑟}: the number of targets being combined in each new target
• 𝐷 is an 𝑟 × 𝑠 coefficient matrix of standard uniform values, and 𝑎 = 𝑍𝐷
• Each original target is involved in 𝑠𝑙/𝑟 new targets; in the running example (𝑟 = 3, 𝑠 = 6, 𝑙 = 2), every row of 𝐷 has 6 · 2 / 3 = 4 non-zero coefficients
The example matrices 𝑍 (8 × 3) and 𝐷 (3 × 6) are the same as on the Sketching RLC slide.
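One way to build such a coefficient matrix is sketched below. The slides do not say how the 𝑙 targets of each new target are selected, so drawing them uniformly at random per column is an assumption here, as are the function name `random_coefficient_matrix` and its `seed` parameter. Under this assumption the count 𝑠𝑙/𝑟 holds on average across targets (the counts always sum to 𝑠𝑙), not necessarily for every single target.

```python
import numpy as np


def random_coefficient_matrix(r, s, l, seed=0):
    """Return an r x s matrix whose columns each combine l randomly chosen targets."""
    if not 2 <= l <= r:
        raise ValueError("l must lie in 2..r")
    rng = np.random.default_rng(seed)
    D = np.zeros((r, s))
    for j in range(s):
        # Targets combined in new target j (assumption: uniform random subset).
        chosen = rng.choice(r, size=l, replace=False)
        # Standard uniform weights for the chosen targets; the rest stay zero.
        D[chosen, j] = rng.uniform(0.0, 1.0, size=l)
    return D


r, s, l = 3, 6, 2
D = random_coefficient_matrix(r, s, l)

# Number of new targets each original target takes part in.
per_target = np.count_nonzero(D, axis=1)
print(per_target, per_target.mean(), s * l / r)
```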

Relation to Output Coding
Output coding methods differ in their motivation and in the number of new targets relative to the original ones (𝑠 > 𝑟 or 𝑠 ≤ 𝑟):
• Improve accuracy: RLC, [2,3]
• Improve computational complexity: [1], [4]
[1] Hsu, D., Kakade, S., Langford, J., Zhang, T.: Multi-label prediction via compressed sensing. In: NIPS 2009, pp. 772–780
[2] Zhang, Y., Schneider, J.G.: Multi-label output codes using canonical correlation analysis. In: AISTATS 2011
[3] Zhang, Y., Schneider, J.G.: Maximum margin output coding. In: ICML 2012, icml.cc / Omnipress
[4] Tai, F., Lin, H.T.: Multilabel classification with principal label space transformation. Neural Computation 24(9), 2012, pp. 2508–2542

Experimental Setup: Methods
• ST
  • One regression model per target using gradient boosting
• MORF
  • Multi-objective random forest of 100 trees
• RLC
  • Multi-target regression algorithm: ST
  • Solving system of linear equations: least squares
All code and specific experimental setup available at MULAN
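The RLC configuration above (ST on the new targets, then least squares) can be sketched end to end as follows. This is an illustration under stated assumptions, not the MULAN implementation: `fit_linear` is a plain ordinary-least-squares stand-in for the gradient boosting base learner, the data are synthetic, and all function names are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Stand-in base learner: ordinary least squares with an intercept.
    NOT the gradient boosting used in the experiments."""
    Xb = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda X_new: np.column_stack([X_new, np.ones(len(X_new))]) @ w


def rlc_fit(X, Z, D, fit_base=fit_linear):
    """ST on the new targets: one independent model per column of Z @ D."""
    A = Z @ D
    return [fit_base(X, A[:, j]) for j in range(A.shape[1])]


def rlc_predict(models, D, X_new):
    """Predict the s new targets, then decode to the r original targets
    by least squares (s equations, r unknowns per instance)."""
    A_hat = np.column_stack([m(X_new) for m in models])    # shape (n, s)
    Z_hat, *_ = np.linalg.lstsq(D.T, A_hat.T, rcond=None)  # shape (r, n)
    return Z_hat.T


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Z = X @ rng.normal(size=(4, 3)) + 0.1 * rng.normal(size=(50, 3))
D = rng.uniform(size=(3, 12))  # r = 3 original targets, s = 12 new targets

models = rlc_fit(X, Z, D)
Z_hat = rlc_predict(models, D, X[:5])
print(Z_hat.shape)
```

With a base learner that is linear in the targets, such as this stand-in, decoding returns the same predictions as fitting each original target directly (provided 𝐷 has full row rank), so the sketch only shows the data flow. Any benefit would have to come from a base learner, such as gradient boosting, whose fitted model depends non-linearly on the target values.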

Experimental Setup: Datasets
#       Name                                   Abbreviation        Examples      Features    Targets
1,2     Airline Ticket Price 1 / 2             atp1d / atp7d       337 / 296     411         6
3       Electrical Discharge Machining         edm                 154           16          2
4,5     Occupational Employment Survey 1 / 2   oes1997 / oes2010   334 / 403     263 / 298   16
6,7     River Flow 1 / 2                       rf1 / rf2           9125          64 / 576    8
8,9     Solar Flare 1 / 2                      sf1969 / sf1978     323 / 1066    26 / 27     3
10,11   Supply Chain Management 1 / 2          scm1d / scm20d      9803 / 8966   280 / 61    16
12      Water Quality                          wq                  1060          16          14

Studying the 𝑠 Parameter
Figure: average aRRMSE of our method (y-axis) with respect to 𝑠 (x-axis), across all datasets and all 𝑙 values.
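The slides do not define aRRMSE. The sketch below assumes the definition commonly used in the multi-target regression literature: for each target, the root of the summed squared errors divided by the summed squared errors of predicting the training-set mean, averaged over targets. The function name `arrmse` and its `train_means` argument are chosen here for illustration.

```python
import numpy as np


def arrmse(Y_true, Y_pred, train_means):
    """Average over targets of sqrt(sum (y - y_hat)^2 / sum (y - train_mean)^2)."""
    Y_true = np.asarray(Y_true, dtype=float)
    Y_pred = np.asarray(Y_pred, dtype=float)
    train_means = np.asarray(train_means, dtype=float)
    squared_error = ((Y_true - Y_pred) ** 2).sum(axis=0)
    baseline_error = ((Y_true - train_means) ** 2).sum(axis=0)
    return float(np.sqrt(squared_error / baseline_error).mean())


# Toy values, for illustration only (two targets on different scales).
Y = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
means = Y.mean(axis=0)

perfect = arrmse(Y, Y, means)                        # exact predictions
baseline = arrmse(Y, np.tile(means, (3, 1)), means)  # always predict the mean
print(perfect, baseline)
```

Under this definition a value below 1 means the model does better than predicting the mean of each target, and lower is better.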

Studying the 𝑙 Parameter
Figure: aRRMSE of our method (y-axis) on the atp1d dataset with respect to 𝑠 (x-axis), for 𝑙 ∈ {2, 3, 4, 5, 6}.

Comparative Results
• Results for 𝑙 = 2/3 and 𝑠 = 500 models
Wins:losses of the row method against the column method:
        RLC     ST      MORF
RLC     -       10:2    8:4
ST      2:10    -       7:5
MORF    4:8     5:7     -
Average rank:
        RLC     ST      MORF
        1.5     2.25    2.25
• ST with gradient boosting appears to be a strong baseline
• RLC is better than ST and MORF
• Wilcoxon signed-rank test at 95% shows a statistically significant difference between RLC and ST

Pairwise Target Correlations
Figure: heat-map of the pairwise target correlations for the scm20d dataset.
dataset    gain (%)    median    stdev
atp1d       3.6        0.8013    0.0788
atp7d       2.6        0.6306    0.1602
edm         4.6        0.0051    -
sf1969      5.0        0.2242    1.1247
sf1978      3.1        0.1484    1.2006
oes10       7.9        0.8479    0.0972
oes97       2.5        0.7952    0.0785
rf1        -1.3        0.4077    0.3125
rf2        -2.0        0.4077    0.3125
scm1d       1.6        0.6526    0.1316
scm20d      1.4        0.5785    0.1483
wq          1.3        0.0751    0.0717
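The median and stdev columns summarise the correlations between all pairs of targets in a dataset. A minimal sketch of such a summary follows. It assumes Pearson correlations, signed rather than absolute values, and the sample standard deviation, none of which the slides specify. The function name `pairwise_correlation_summary` is hypothetical.

```python
import numpy as np


def pairwise_correlation_summary(Z):
    """Median and sample standard deviation of the correlations
    between all unordered pairs of target columns in Z."""
    corr = np.corrcoef(Z, rowvar=False)             # r x r correlation matrix
    upper = corr[np.triu_indices_from(corr, k=1)]   # one value per pair
    median = float(np.median(upper))
    # With a single pair the sample standard deviation is undefined.
    stdev = float(np.std(upper, ddof=1)) if upper.size > 1 else float("nan")
    return median, stdev


# Toy targets, for illustration only: column 1 rises with column 0, column 2 falls.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Z_three = np.column_stack([x, 2 * x + 1, -x])
Z_two = Z_three[:, :2]

median3, stdev3 = pairwise_correlation_summary(Z_three)
median2, stdev2 = pairwise_correlation_summary(Z_two)
print(median3, stdev3, median2, stdev2)
```

A dataset with two targets has a single pair, so no spread can be computed. That would be consistent with the "-" in the stdev column for edm, which has 2 targets.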

Pairwise Target Correlations
• No strong correlation between the median of the pairwise target correlations and the gain of our approach over ST (𝑆 = 0.15)
• The higher the variance of the pairwise target correlations, the more difficult it is for our approach to improve over ST (𝑆 = −0.68)
• Between dataset variants, a higher median leads to higher gains
The per-dataset gain (%), median and stdev values are those of the table on the previous slide.
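The first coefficient can be recomputed from the per-dataset gain and median values. The sketch below assumes that the reported coefficient is Pearson's correlation, which the slides do not state. The variable names are chosen here.

```python
import numpy as np

# Per-dataset values copied from the gain (%) and median rows, in the order
# atp1d, atp7d, edm, sf1969, sf1978, oes10, oes97, rf1, rf2, scm1d, scm20d, wq.
gain = np.array([3.6, 2.6, 4.6, 5.0, 3.1, 7.9, 2.5, -1.3, -2.0, 1.6, 1.4, 1.3])
median = np.array([0.8013, 0.6306, 0.0051, 0.2242, 0.1484, 0.8479,
                   0.7952, 0.4077, 0.4077, 0.6526, 0.5785, 0.0751])

# Pearson correlation between gain over ST and median pairwise target correlation.
pearson = float(np.corrcoef(gain, median)[0, 1])
print(round(pearson, 2))
```

Under that assumption the result is close to the reported 0.15, which supports reading the coefficient as a Pearson correlation over the 12 datasets.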

Recap
• Our approach
  • Constructs new targets by taking random linear combinations of existing targets
  • Solves a linear equation system at prediction time
• Relation to multi-label classification methods
  • RAkEL, output coding
• Results
  • RLC is significantly better than a strong baseline
  • RLC is better than a state-of-the-art approach
  • Interesting viewpoint of average target correlations

Future Work
• Alternative randomization
  • Gaussian matrices, sparse Rademacher matrices
• Theoretical understanding
  • WHY and WHEN it works
• Increase ensemble diversity
  • Multiple coefficient matrices (e.g. 5 × 100 vs 1 × 500)

Multi-Target Regression via Random Linear Target Combinations
Thank you!
Grigorios Tsoumakas, Eleftherios Spyromitros-Xioufis, Aikaterini Vrekou, Ioannis Vlahavas
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece
