  1. Adaptive Operator Selection via Online Learning and Fitness Landscape Metrics. Pietro Consoli, Leandro L. Minku, Xin Yao. CERCIA, School of Computer Science, University of Birmingham, United Kingdom. www.cs.bham.ac.uk/~pac265, p.a.consoli@cs.bham.ac.uk. P. Consoli (University of Birmingham), The 43rd CREST Open Workshop.

  2. Outline: 1. Motivation; 2. Adaptive Crossover Selection (Fitness Landscape Metrics, Online Learning); 3. Case Study (CARP); 4. Experimental Studies (Results); 5. Future Work; 6. Conclusions.

  3. Adaptive Crossover Selection. Different crossover operators may produce offspring with different characteristics: exploration, fitness, and transmission of good traits. We can therefore expect different search results on certain instances.

  4. Adaptive Crossover Selection. For each instance, the best-performing operator (left) becomes the operator chosen by ACS (right):
     Inst | Op A | Op B | Op C | Op D | ACS(Op)
     1    | best | x    | x    | x    | A
     2    | best | x    | x    | x    | A
     3    | x    | x    | x    | best | D
     4    | x    | best | x    | x    | B
     5    | x    | x    | x    | best | D
     6    | x    | x    | best | x    | C
     7    | x    | x    | best | x    | C
     8    | x    | x    | x    | best | D
     Adaptive Crossover Selection: adaptively select the best crossover operator to use during the search process.

  5. Adaptive Crossover Selection: Dynamic Scenario. Different periods of the search might have different best crossover operators; a dynamic ACS is therefore potentially better than a static one.

  6. Research Questions. RQ1: State-of-the-art approaches for Credit Assignment consider the use of just one measure (usually fitness). Is this enough to characterize the current population distribution? RQ2: What Operator Selection Rule can handle a set of measures?

  7. RQ1: Population Characterization through FLA. Fitness Landscape Analysis (FLA): create a more “aware” snapshot of the current population distribution.
     1. Perform a set of 4 online FLA techniques during each generation:
        1. Average Escape Probability [1] (Evolvability);
        2. Average Δ-Fitness of the neutral networks [2] (Neutrality);
        3. Average neutrality ratio [2] (Neutrality);
        4. Dispersion Metric [3] (Population Distribution).
     2. FLA is used not to predict hardness but to learn more about the current population distribution.
     [1] Lu, G., Li, J., Yao, X. - "Fitness-probability cloud and a measure of problem hardness for evolutionary algorithms" - 2011
     [2] Vanneschi, L., Pirola, Y., Collard, P. - "A Quantitative Study of Neutrality in GP Boolean Landscapes" - 2006
     [3] Lunacek, M., Whitley, D. - "The Dispersion Metric and the CMA Evolution Strategy" - 2006
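     The per-generation feature vector could be sketched roughly as follows. This is a minimal sketch using simplified readings of three of the metrics, not the definitions from the cited papers: minimisation is assumed (CARP minimises cost), the function names are hypothetical, and the `distance` function between solutions is left to the caller.

```python
from itertools import combinations
from statistics import mean


def escape_probability(fitness, neighbour_fitnesses):
    """Fraction of sampled neighbours that strictly improve on `fitness`
    (lower is better here). A simplified reading of evolvability."""
    better = sum(1 for f in neighbour_fitnesses if f < fitness)
    return better / len(neighbour_fitnesses)


def neutrality_ratio(fitness, neighbour_fitnesses):
    """Fraction of sampled neighbours with exactly the same fitness."""
    same = sum(1 for f in neighbour_fitnesses if f == fitness)
    return same / len(neighbour_fitnesses)


def dispersion(population, fitnesses, distance, best_fraction=0.5):
    """Mean pairwise distance among the best individuals minus the mean
    pairwise distance of the whole population (assumed simplification)."""
    def mean_pairwise(items):
        return mean(distance(a, b) for a, b in combinations(items, 2))

    ranked = [x for _, x in sorted(zip(fitnesses, population), key=lambda p: p[0])]
    n_best = max(2, int(len(ranked) * best_fraction))
    return mean_pairwise(ranked[:n_best]) - mean_pairwise(ranked)


# Toy data: one solution with four sampled neighbours, and a 1-D population.
ep = escape_probability(10, [8, 10, 12, 9])
nr = neutrality_ratio(10, [8, 10, 12, 9])
disp = dispersion([0.0, 1.0, 2.0, 10.0], [1, 2, 3, 4], lambda a, b: abs(a - b))
```

     Averaging the first two values over all individuals of a generation would give population-level features in the spirit of the "average" metrics listed above.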

  8. RQ2: Credit Assignment through Online Learning. Detecting changes in the search is analogous to tracking Concept Drift in Online Learning; Concept Drift is a change in the underlying distribution of the samples during the learning process; Online Learning can be used to learn the relationship between the FLA results (input features) and the credit measure (output feature); we use Dynamic Weighted Majority (DWM) with Regression Trees as base learners.

  9. Dynamic Weighted Majority
     DWM(p, β, θ, , τ):
       initialize a set of experts and assign an initial weight w_j = 1 to each;
       create a window wTS of the last training instances;
       for all instances (x_i, y_i) do
         update wTS;
         for all experts e_j do
           λ_i = predict(e_j, x_i);
           if |λ_i − y_i| > τ and i mod p = 0 then w_j = β * w_j; end
           if w_j < θ and i mod p = 0 then delete expert e_j; end
         end
         normalize weights (maximum weight equal to 1);
         calculate global prediction σ_i (weighted average prediction);
         if |σ_i − y_i| > τ and i mod p = 0 then create new expert e_j and train it with wTS; end
         train all experts with the new instance (x_i, y_i);
         return σ_i;
       end
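     The loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the base learner is a hypothetical nearest-neighbour averager standing in for regression trees, `window_size` is an assumed parameter, and an expert is penalised when its error exceeds τ, as in the usual formulation of DWM.

```python
from collections import deque


class NearestNeighbourExpert:
    """Stand-in base learner (the slides use regression trees)."""

    def __init__(self, instances=()):
        self.memory = list(instances)

    def predict(self, x):
        if not self.memory:
            return 0.0
        dists = [sum((a - b) ** 2 for a, b in zip(xi, x)) for xi, _ in self.memory]
        nearest = min(dists)
        targets = [y for d, (_, y) in zip(dists, self.memory) if d == nearest]
        return sum(targets) / len(targets)  # average over equally near instances

    def train(self, x, y):
        self.memory.append((x, y))


class DWMRegressor:
    def __init__(self, p=1, beta=0.5, theta=0.01, tau=0.5, window_size=3):
        self.p, self.beta, self.theta, self.tau = p, beta, theta, tau
        self.window = deque(maxlen=window_size)  # wTS: last training instances
        self.experts = [NearestNeighbourExpert()]
        self.weights = [1.0]
        self.i = 0

    def step(self, x, y):
        """Process one instance (x, y) and return the global prediction."""
        self.i += 1
        self.window.append((x, y))
        update = self.i % self.p == 0
        preds = [e.predict(x) for e in self.experts]
        if update:
            # Penalise experts whose error exceeds the tolerance tau.
            self.weights = [w * self.beta if abs(pr - y) > self.tau else w
                            for w, pr in zip(self.weights, preds)]
            # Delete weak experts, always keeping at least the strongest one.
            keep = [j for j, w in enumerate(self.weights) if w >= self.theta]
            keep = keep or [self.weights.index(max(self.weights))]
            self.experts = [self.experts[j] for j in keep]
            self.weights = [self.weights[j] for j in keep]
            preds = [preds[j] for j in keep]
            # Normalise so that the maximum weight equals 1.
            top = max(self.weights)
            self.weights = [w / top for w in self.weights]
        sigma = sum(w * pr for w, pr in zip(self.weights, preds)) / sum(self.weights)
        if update and abs(sigma - y) > self.tau:
            # The ensemble as a whole was wrong: add an expert trained on the window.
            self.experts.append(NearestNeighbourExpert(self.window))
            self.weights.append(1.0)
        for e in self.experts:
            e.train(x, y)
        return sigma


# Toy drift: the target jumps from 1.0 to 5.0 after 10 instances.
model = DWMRegressor()
for _ in range(10):
    before = model.step((0.0,), 1.0)
for _ in range(30):
    after = model.step((0.0,), 5.0)
```

     In the toy run the ensemble tracks the first concept, then recovers after the jump because experts trained on the recent window replace the penalised ones.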

  10. Capacitated Arc Routing Problem. Case Study: Crossover Operator Selection using the MAENS algorithm for the Capacitated Arc Routing Problem [4]; considers the use of a suite of four different crossover operators; Credit Assignment Mechanism: Proportional Reward (PR); we exploit the Local Search of MAENS* to perform the FLA techniques without extra computational cost.
     [4] K. Tang, Y. Mei, X. Yao - "Memetic Algorithm with Extended Neighborhood Search for Capacitated Arc Routing Problems" - 2009

  11. Credit Mechanism. Credit Assignment Mechanism: percentage of offspring generated by each operator surviving to the next generation. Proportional Reward: $PR^{(i)}_t = \frac{|\{x \in pop_{t+1} : x \text{ generated by operator } i\}|}{|pop_{t+1}|}$. This measures the indirect effect of the crossover operator; we entrust the selection/ranking operator of the algorithm to evaluate the individuals.
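     As a worked illustration of the Proportional Reward, the following minimal sketch counts, for each operator, the share of the next population it generated. The function name and the toy population are assumptions; `None` marks individuals that were not produced by any crossover operator in this generation, and the operator labels are the four names shown in the selection-rate figures.

```python
from collections import Counter


def proportional_reward(next_population_origins, operators):
    """Share of pop_{t+1} generated by each operator.

    `next_population_origins` holds, for every individual of pop_{t+1},
    the label of the operator that generated it (or None).
    """
    counts = Counter(next_population_origins)
    size = len(next_population_origins)
    return {op: counts[op] / size for op in operators}


operators = ["gsbx", "grx", "pbx", "spbx"]
origins = ["gsbx", "grx", "gsbx", None, "pbx", "gsbx", None, "spbx"]
rewards = proportional_reward(origins, operators)
```

     With this toy population of eight individuals, gsbx produced three survivors and each other operator one, so the rewards need not sum to 1 when some individuals come from the previous generation.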

  12. CARP - instance

  13. CARP - instance. Each arc (task) has a service cost and a demand; constraints: number of vehicles and capacity; objective function: minimize the total service cost; proved NP-hard in 1981; many real-world applications (e.g. waste collection, road gritting).

  14. MAENS*-II. FLA is performed during each iteration; basic Operator Selection Rule: select the operator with the largest instantaneous reward, in order to reduce the bias from previous performance; Credit Assignment through DWM.
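     The selection rule can be sketched as an argmax over predicted rewards. This is a minimal sketch under assumptions that the slides do not confirm: one hypothetical predictor per operator maps the current FLA feature vector to a predicted reward, and ties are broken by dictionary order.

```python
def select_operator(features, predictors):
    """Return the operator whose predictor gives the largest reward
    for the current FLA features, together with all predictions."""
    predicted = {op: predict(features) for op, predict in predictors.items()}
    best = max(predicted, key=predicted.get)
    return best, predicted


# Toy predictors: constant functions standing in for learned models.
predictors = {
    "gsbx": lambda f: 0.2,
    "grx": lambda f: 0.5,
    "pbx": lambda f: 0.1,
    "spbx": lambda f: 0.2,
}
chosen, predicted = select_operator((0.3, 0.1, 0.05, -1.2), predictors)
```

     Because only the current prediction is used, an operator that performed well earlier in the search gets no carry-over advantage.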

  15. Experimental studies. Experiments conducted on a set of 42 non-easy CARP instances belonging to the egl, val and Beullens' benchmark sets; average fitness values calculated over 30 independent runs; in order to provide a lower bound and a term of comparison for the results, an Oracle using only the Proportional Reward is built. 1. Tested optimization results against MAENS* and the Oracle; 2. Tested prediction ability against the Oracle.

  16. MAENS*-II vs MAENS*. MAENS* uses MAB and Proportional Reward; MAENS*-II wins the comparison with MAENS* on 20 instances and loses on 18 out of 42 instances; a Wilcoxon signed-rank test over the set of instances suggests that there is no statistical difference between the results achieved by the two algorithms; 6 instances show statistically different results using a Wilcoxon rank-sum test on each pair of results:
     Instance | MAENS*-II avg fitness | MAENS*-II std | MAENS* avg fitness | MAENS* std
     D23      | 767.67                | 7.39          | 769.83             | 12.28
     E15      | 1604.33               | 5.59          | 1602.50            | 6.68
     E19      | 1442.00               | 4.58          | 1442.67            | 4.23
     F19      | 732.50                | 9.64          | 735.17             | 9.35
     egl-s1-B | 6397.59               | 12.70         | 6399.90            | 16.38
     egl-s2-B | 13171.41              | 29.49         | 13179.07           | 26.11

  17. MAENS*-II vs Oracle. The Oracle achieves better results on 40 instances; on 2 instances MAENS*-II managed to achieve better results than the Oracle; if the Oracle represents the bound achievable using only PR, then the use of FLA+PR can enhance the optimization ability of the algorithm.

  18. Prediction Ability: Oracle. Figure: Oracle Selection Rates on instance egl-s2-B (legend: gsbx, grx, pbx, spbx; axis ticks run from 0 to 1 and from 13971.10 down to 13138.98).

  19. Prediction Ability: MAENS*. Figure: MAENS* Selection Rates on instance egl-s2-B (legend: gsbx, grx, pbx, spbx; axis ticks run from 0 to 0.6 and from 13977.76 down to 13162.09).

  20. Prediction Ability: MAENS*-II. Figure: MAENS*-II Selection Rates on instance egl-s2-B (legend: gsbx, grx, pbx, spbx; axis ticks run from 0 to 0.8 and from 13983.50 down to 13156.15).

  21. Future work. Integrated with a Reinforcement Learning (RL) mechanism with concurrent use of the operators; tested the use of a diversity-based reward measure; improved results when using RL; outperformed the state of the art on large-scale CARP instances.
