  1. Learning the parameters of a multiple criteria sorting method from large sets of assignment examples
     Olivier Sobrie 1,2 - Vincent Mousseau 1 - Marc Pirlot 2
     1 École Centrale de Paris - Laboratoire de Génie Industriel
     2 University of Mons - Faculty of Engineering
     November 14, 2013

  2. Outline
     1 Introduction
     2 Algorithm
     3 Experiments
     4 Conclusion

  3. Introduction - Introductory example
     Application: lung cancer
     Categories: C_3: no cancer, C_2: curable cancer, C_1: incurable cancer, with C_3 ≻ C_2 ≻ C_1
     ◮ 9394 patients analyzed
     ◮ Monotone attributes (number of cigarettes per day, age, ...)
     ◮ Output variable: no cancer, curable cancer, incurable cancer
     ◮ Goal: predict the risk of lung cancer for other patients on the basis of their attributes

  4. Introduction - MR-Sort procedure
     Main characteristics
     ◮ Sorting procedure
     ◮ Simplified version of the ELECTRE TRI procedure [Yu, 1992]
     ◮ Based on an axiomatic characterization [Słowiński et al., 2002, Bouyssou and Marchant, 2007a, Bouyssou and Marchant, 2007b]
     Parameters
     ◮ Profiles' performances (b_{h,j} for h = 1, ..., p − 1; j = 1, ..., n)
     ◮ Criteria weights (w_j for j = 1, ..., n)
     ◮ Majority threshold (λ)
     Assignment rule
       a ∈ C_h ⇔ Σ_{j : a_j ≥ b_{h−1,j}} w_j ≥ λ and Σ_{j : a_j ≥ b_{h,j}} w_j < λ
     [Figure: profiles b_0, b_1, ..., b_p delimiting categories C_1, ..., C_p, drawn over criteria crit 1 to crit 5]
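To make the rule concrete, here is a minimal Python sketch of MR-Sort assignment. It is an illustration, not the authors' code: the function and variable names are invented, criteria are assumed to be maximized, and the fictive boundary profiles b_0 and b_p are handled implicitly.

```python
# Minimal sketch of the MR-Sort assignment rule (illustrative only).
# Criteria are assumed to be maximized.

def mr_sort_assign(a, profiles, weights, lam):
    """Assign alternative `a` (list of n performances) to a category 1..p.

    `profiles` is [b_1, ..., b_{p-1}], each a list of n values, with b_h
    dominating b_{h-1}. `weights` sum to 1; `lam` is the majority threshold.
    """
    p = len(profiles) + 1  # number of categories
    # Walk down from the highest profile; a lands in the highest category h
    # such that a coalition of weight >= lambda supports "a outranks b_{h-1}".
    for h in range(p, 1, -1):
        b = profiles[h - 2]  # profile b_{h-1}
        support = sum(w for a_j, b_j, w in zip(a, b, weights) if a_j >= b_j)
        if support >= lam:
            return h
    return 1  # every alternative outranks the fictive lowest profile b_0

# Example: 5 criteria, equal weights, two profiles -> three categories
weights = [0.2] * 5
profiles = [[10, 10, 10, 10, 10], [15, 15, 15, 15, 15]]
print(mr_sort_assign([12, 16, 9, 14, 11], profiles, weights, lam=0.8))  # -> 2
```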

  5. Introduction - Inferring the parameters
     What already exists to infer MR-Sort parameters?
     ◮ A mixed integer program learning the parameters of an MR-Sort model [Leroy et al., 2011]
     ◮ A metaheuristic learning the parameters of an ELECTRE TRI model [Doumpos et al., 2009]
     ◮ Neither is suitable for large problems: computing time becomes huge as the number of parameters or examples increases
     Our objective
     ◮ Learn an MR-Sort model from a large set of assignment examples
     ◮ Do it with an efficient algorithm, i.e. one that can handle 1000 alternatives, 10 criteria and 5 categories

  6. Algorithm - Principle of the metaheuristic
     Input parameters
     ◮ Assignment examples
     ◮ Performances of the examples on the n criteria
     Objective
     ◮ Learn an MR-Sort model compatible with as many assignment examples as possible, i.e. maximize the classification accuracy
       CA = (number of examples correctly restored) / (total number of examples)
     What we know
     ◮ Easy: learning only the weights and the majority threshold
     ◮ Difficult: learning only the profiles
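In code, CA is simply the fraction of examples whose model assignment matches the decision maker's; a short sketch reusing the hypothetical mr_sort_assign above:

```python
# Classification accuracy of an MR-Sort model on assignment examples
# (sketch; `examples` is a list of (performance_vector, dm_category) pairs).
def classification_accuracy(examples, profiles, weights, lam):
    hits = sum(1 for a, h_dm in examples
               if mr_sort_assign(a, profiles, weights, lam) == h_dm)
    return hits / len(examples)
```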

  7. Algorithm - Metaheuristic to learn all the parameters
     Algorithm
       Generate a population of N_model models, with profiles initialized by a heuristic
       repeat
         for all models M of the set do
           Learn the weights and the majority threshold with a linear program, using the current profiles
           Adjust the profiles with a heuristic N_it times, using the current weights and threshold
         end for
         Reinitialize the ⌊N_model / 2⌋ models giving the worst CA
       until the stopping criterion is met
     Stopping criterion
     The stopping criterion is met when one model has a CA equal to 1 or when the algorithm has run N_o times.
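The loop can be written down compactly. Below is an illustrative Python skeleton: the helpers init_profiles, learn_weights_lp and adjust_profiles stand for the initialization heuristic, the linear program and the adjustment heuristic sketched on the surrounding slides, and all names and default values are assumptions.

```python
def learn_mr_sort(examples, n_model=10, n_it=20, n_o=100):
    """Sketch of the metaheuristic: evolve a population of MR-Sort models."""
    population = [init_profiles(examples) for _ in range(n_model)]
    best = None
    for _ in range(n_o):                      # at most N_o outer iterations
        scored = []
        for profiles in population:
            # LP step: weights and threshold for the current profiles
            weights, lam = learn_weights_lp(examples, profiles)
            # Heuristic step: N_it passes of profile adjustment
            profiles = adjust_profiles(examples, profiles, weights, lam, n_it)
            ca = classification_accuracy(examples, profiles, weights, lam)
            scored.append((ca, profiles, weights, lam))
        scored.sort(key=lambda s: s[0], reverse=True)
        best = scored[0]
        if best[0] == 1.0:                    # one model restores every example
            break
        # Keep the best half; reinitialize the floor(N_model/2) worst models
        survivors = [s[1] for s in scored[: n_model - n_model // 2]]
        population = survivors + [init_profiles(examples)
                                  for _ in range(n_model // 2)]
    return best
```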

  8. Algorithm - Profiles initialization
     Principle
     ◮ Done by a heuristic
     ◮ On each criterion j, give the profile a performance such that CA would be maximal for the alternatives belonging to C_h and C_{h+1} if w_j were equal to 1
     ◮ Take the probability of belonging to a category into account
     Example 1: where should the profile be set on criterion j?
       a_i (increasing performance on j):  a_1   a_2   a_3   a_4   a_5   a_6
       Category assigned by the DM:        C_1   C_1   C_1   C_2   C_2   C_2
       P(a_i ∈ C_1) = P(a_i ∈ C_2) = 1/2; C_2 ≻ C_1
     ◮ The heuristic sets a_{3,j} < b_h ≤ a_{4,j}

  9. Algorithm - Profiles initialization
     Principle: same heuristic as on the previous slide
     Example 2: where should the profile be set on criterion j?
       a_i (increasing performance on j):  a_1   a_2   a_3   a_4   a_5   a_6
       Category assigned by the DM:        C_1   C_1   C_1   C_2   C_1   C_2
       P(a_i ∈ C_1) = 2/3, P(a_i ∈ C_2) = 1/3; C_2 ≻ C_1
     ◮ The heuristic again sets a_{3,j} < b_h ≤ a_{4,j}
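One plausible reading of this heuristic on a single criterion, as a Python sketch. The slides only state the principle; the inverse-frequency weighting below is an assumption that reproduces both examples, and all names are invented.

```python
# Sketch of the profile-initialization heuristic on one criterion
# (one plausible reading of the slides, not the authors' exact rule).
# Candidates for b_{h,j} are the observed performances; each candidate is
# scored by the number of C_h and C_{h+1} examples it would classify
# correctly if criterion j alone decided (w_j = 1), each example counting
# inversely to its category's frequency so that the dominant category
# does not swamp the choice.

def init_profile_on_criterion(values_low, values_high):
    """values_low: performances on j of examples in C_h,
    values_high: performances on j of examples in C_{h+1}.
    Returns a candidate value for b_{h,j} (higher values assumed better)."""
    total = len(values_low) + len(values_high)
    p_low = len(values_low) / total
    p_high = 1.0 - p_low
    candidates = sorted(set(values_low) | set(values_high))

    def score(b):
        ok_high = sum(1 / p_high for v in values_high if v >= b)
        ok_low = sum(1 / p_low for v in values_low if v < b)
        return ok_high + ok_low

    return max(candidates, key=score)

# Example 2 of the slides: a_1, a_2, a_3, a_5 in C_1 and a_4, a_6 in C_2
low = [1.0, 2.0, 3.0, 5.0]
high = [4.0, 6.0]
print(init_profile_on_criterion(low, high))  # -> 4.0, i.e. b_h = a_{4,j}
```

Without the inverse-frequency weighting, candidates a_{4,j} and above a_{6,j} would tie in Example 2; the weighting breaks the tie in favor of a_{3,j} < b_h ≤ a_{4,j}, as the slide indicates.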

  10. Algorithm - Learning the weights and the majority threshold
      Principle
      ◮ Maximize the classification accuracy of the model
      ◮ Use a linear program with no binary variables
      Linear program (x_i, x'_i, y_i, y'_i ≥ 0 are slack variables; a_i S_j b_h means a_i is at least as good as b_h on criterion j; δ is a small positive constant; a_i is correctly restored when x'_i = y'_i = 0)
        Objective: min Σ_{a_i ∈ A} (x'_i + y'_i)                                    (1)
        Σ_{j | a_i S_j b_{h−1}} w_j − x_i + x'_i = λ      ∀ a_i ∈ A_h, h = 2, ..., p      (2)
        Σ_{j | a_i S_j b_h} w_j + y_i − y'_i = λ − δ      ∀ a_i ∈ A_h, h = 1, ..., p − 1  (3)
        Σ_{j=1}^{n} w_j = 1                                                         (4)
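An illustrative encoding of (1)-(4) with PuLP. This is a sketch under assumptions, not the authors' implementation: λ is bounded to [1/2, 1] as is conventional for MR-Sort, examples are (performance_vector, dm_category) pairs, and delta is an arbitrarily small constant.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, value

def outranking_criteria(a, b):
    # indices j such that a S_j b, i.e. a is at least as good as b on j
    return [j for j, (aj, bj) in enumerate(zip(a, b)) if aj >= bj]

def learn_weights_lp(examples, profiles, delta=1e-4):
    """Sketch of the LP (1)-(4): profiles fixed, weights and lambda free."""
    n = len(profiles[0])
    p = len(profiles) + 1                       # number of categories
    prob = LpProblem("mrsort_weights", LpMinimize)
    w = [LpVariable(f"w_{j}", lowBound=0) for j in range(n)]
    lam = LpVariable("lam", lowBound=0.5, upBound=1)
    slacks = []
    for i, (a, h) in enumerate(examples):
        if h > 1:                               # (2): a_i outranks b_{h-1}
            x = LpVariable(f"x_{i}", lowBound=0)
            xp = LpVariable(f"xp_{i}", lowBound=0)
            prob += (lpSum(w[j] for j in outranking_criteria(a, profiles[h - 2]))
                     - x + xp == lam)
            slacks.append(xp)
        if h < p:                               # (3): a_i does not outrank b_h
            y = LpVariable(f"y_{i}", lowBound=0)
            yp = LpVariable(f"yp_{i}", lowBound=0)
            prob += (lpSum(w[j] for j in outranking_criteria(a, profiles[h - 1]))
                     + y - yp == lam - delta)
            slacks.append(yp)
    prob += lpSum(slacks)                       # (1): minimize total violation
    prob += lpSum(w) == 1                       # (4)
    prob.solve()
    return [value(wj) for wj in w], value(lam)
```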

  11. Algorithm - Learning the profiles
      Case 1: alternative a_1 classified in C_2 instead of C_1 (C_2 ≻ C_1)
      ◮ a_1 is classified by the DM into category C_1
      ◮ a_1 is classified by the model into category C_2
      ◮ a_1 outranks b_1
      ◮ The profile is too low on one or several criteria (shown in red on the slide)
      ◮ Moving b_1 up by δ_{b_1,j} on enough of these criteria would make a_1 no longer outrank b_1
      [Figure: a_1 lies above profile b_1 on several criteria, with candidate upward moves δ_{b_1,1}, δ_{b_1,2}, δ_{b_1,3}, δ_{b_1,4}]
      w_j = 0.2 for j = 1, ..., 5; λ = 0.8

  12. Algorithm - Learning the profiles
      Case 2: alternative a_2 classified in C_1 instead of C_2 (C_2 ≻ C_1)
      ◮ a_2 is classified by the DM into category C_2
      ◮ a_2 is classified by the model into category C_1
      ◮ a_2 doesn't outrank b_1
      ◮ The profile is too high on one or several criteria (shown in blue on the slide)
      ◮ If the profile is moved down by δ_{b_1,2,4} on g_4 and/or by δ_{b_1,2,5} on g_5, the alternative will be correctly classified
      [Figure: a_2 lies below profile b_1 on criteria 4 and 5, with candidate downward moves δ_{b_1,4} and δ_{b_1,5}]
      w_j = 0.2 for j = 1, ..., 5; λ = 0.8

  13. Algorithm - Learning the profiles
      ◮ V^{+δ}_{h,j} (resp. V^{−δ}_{h,j}): the set of alternatives misclassified in C_{h+1} instead of C_h (resp. in C_h instead of C_{h+1}) for which moving the profile b_h by +δ (resp. −δ) on criterion j results in a correct assignment
      Illustration
      ◮ C_2 ≻ C_1; w_j = 0.2 for j = 1, ..., 5; λ = 0.8
      ◮ a_3 ∈ A_1 according to the model, a_3 ∈ A_2 according to the DM
      [Figure: moving profile b_1 down by δ on one criterion would correctly reassign a_3]
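A sketch of how V^{+δ}_{h,j} and V^{−δ}_{h,j} could be computed, with invented names. For simplicity it tests a move just past each alternative's value on criterion j and ignores interactions with the other profiles, which the full adjustment heuristic would have to account for.

```python
# Sketch: candidate profile moves for b_h on criterion j (illustrative).
# `mis_up` holds alternatives the model put in C_{h+1} while the DM put
# them in C_h; `mis_down` holds the converse misclassifications.

def v_plus_delta(mis_up, b_h, j, weights, lam):
    """Alternatives fixed by moving b_{h,j} up just past their value."""
    fixed = []
    for a in mis_up:
        if a[j] >= b_h[j]:  # criterion j currently supports a outranking b_h
            support = sum(w for aj, bj, w in zip(a, b_h, weights) if aj >= bj)
            if support - weights[j] < lam:  # dropping j breaks the majority
                fixed.append(a)
    return fixed

def v_minus_delta(mis_down, b_h, j, weights, lam):
    """Alternatives fixed by moving b_{h,j} down to their value."""
    fixed = []
    for a in mis_down:
        if a[j] < b_h[j]:  # criterion j currently does not support a
            support = sum(w for aj, bj, w in zip(a, b_h, weights) if aj >= bj)
            if support + weights[j] >= lam:  # adding j reaches the majority
                fixed.append(a)
    return fixed
```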
