Adaptive Operator Selection via Online Learning and Fitness Landscape Metrics
Pietro Consoli, Leandro L. Minku, Xin Yao
CERCIA, School of Computer Science, University of Birmingham, United Kingdom
www.cs.bham.ac.uk/~pac265
p.a.consoli@cs.bham.ac.uk
P. Consoli (University of Birmingham), The 43rd CREST Open Workshop, 1 / 23
Outline
1. Motivation
2. Adaptive Crossover Selection
   - Fitness Landscape Metrics
   - Online Learning
3. Case Study
   - CARP
4. Experimental Studies
   - Results
5. Future Work
6. Conclusions
Adaptive Crossover Selection
Different crossover operators might lead to offspring with different characteristics:
- Exploration
- Fitness
- Good traits transmission
We can therefore expect different search results on certain instances.
Adaptive Crossover Selection
Inst   Op A   Op B   Op C   Op D   =>   ACS(Op)
1      best   x      x      x           A
2      best   x      x      x           A
3      x      x      x      best        D
4      x      best   x      x           B
5      x      x      x      best        D
6      x      x      best   x           C
7      x      x      best   x           C
8      x      x      x      best        D
Adaptive Crossover Selection: adaptively select the best crossover operator to use during the search process.
Adaptive Crossover Selection: Dynamic Scenario
- Dynamic scenario: different periods of the search might have different best crossover operators;
- Dynamic ACS is potentially better than static ACS, which keeps one operator for the whole run.
Research Questions
1. State-of-the-art approaches for Credit Assignment consider the use of just one measure (usually fitness). Is this enough to characterize the current population distribution?
2. What Operator Selection Rule can handle a set of measures?
RQ1: Population Characterization through FLA
Fitness Landscape Analysis (FLA): create a more "aware" snapshot of the current population distribution.
1. Perform a set of 4 online FLA techniques during each generation:
   1. Average Escape Probability [1] (Evolvability);
   2. Average Δ-Fitness of the neutral networks [2] (Neutrality);
   3. Average neutrality ratio [2] (Neutrality);
   4. Dispersion Metric [3] (Population Distribution).
2. FLA is used not to predict hardness but to learn more about the current population distribution.
[1] Lu G., Li J., Yao X. - "Fitness-probability cloud and a measure of problem hardness for evolutionary algorithms" - 2011
[2] Vanneschi L., Pirola Y., Collard P. - "A Quantitative Study of Neutrality in GP Boolean Landscapes" - 2006
[3] Lunacek M., Whitley D. - "The Dispersion Metric and the CMA Evolution Strategy" - 2006
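As a rough illustration of what such per-generation measures could look like, the sketch below estimates an escape probability, a neutrality ratio and a dispersion value from sampled neighbour fitnesses. It is not the authors' implementation: the function names, the `best_fraction` parameter, the tolerance `tol` and the user-supplied `distance` function are all assumptions, fitness is assumed to be minimized, and the Δ-fitness of neutral networks is not covered.

```python
from itertools import combinations
from statistics import mean


def escape_probability(fitness, neighbor_fitnesses):
    """Share of sampled neighbours that are strictly better (lower cost)."""
    if not neighbor_fitnesses:
        return 0.0
    better = sum(f < fitness for f in neighbor_fitnesses)
    return better / len(neighbor_fitnesses)


def neutrality_ratio(fitness, neighbor_fitnesses, tol=1e-9):
    """Share of sampled neighbours with (almost) the same fitness."""
    if not neighbor_fitnesses:
        return 0.0
    neutral = sum(abs(f - fitness) <= tol for f in neighbor_fitnesses)
    return neutral / len(neighbor_fitnesses)


def mean_pairwise_distance(points, distance):
    """Average distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    return mean(distance(a, b) for a, b in pairs) if pairs else 0.0


def dispersion(population, fitnesses, distance, best_fraction=0.5):
    """Mean pairwise distance of the best individuals minus that of everyone.

    `distance` is a hypothetical, problem-specific distance between solutions.
    """
    ranked = [p for _, p in sorted(zip(fitnesses, population), key=lambda t: t[0])]
    n_best = max(2, int(len(ranked) * best_fraction))
    return mean_pairwise_distance(ranked[:n_best], distance) - mean_pairwise_distance(ranked, distance)
```

Averaging the first two functions over all individuals of a generation would give the "average" variants named on the slide. A negative dispersion value suggests that the best individuals are closer to each other than the population as a whole.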
RQ2: Credit Assignment through Online Learning
- Detection of changes is analogous to Concept Drift tracking in Online Learning;
- Concept Drift: change of the underlying distribution of the samples during the learning process;
- Online learning can be used to learn the relationship between FLA results (input features) and the credit measure (output feature);
- Dynamic Weighted Majority (DWM) using Regression Trees as base learners.
Dynamic Weighted Majority
DWM(p, β, θ, , τ):
initialize a set of experts and assign an initial weight w_j = 1 to each;
create a window of the last training instances wTS(x_i);
forall the instances (x_i, y_i) do
    update wTS;
    forall the experts e_j do
        λ_i = predict(e_j, x_i);
        if |λ_i − y_i| > τ and i mod p = 0 then
            w_j = β ∗ w_j;
        end
        if w_j < θ and i mod p = 0 then
            delete expert e_j;
        end
    end
    normalize weights (maximum weight equal to 1);
    calculate global prediction σ_i (weighted average prediction);
    if |σ_i − y_i| > τ and i mod p = 0 then
        create new expert e_j and train with wTS;
    end
    train all experts with the new instance (x_i, y_i);
    return σ_i;
end
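The loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' code: `MeanExpert` is a hypothetical stand-in for the regression trees used as base learners, the class and parameter names are illustrative, an expert is treated as wrong when its absolute error exceeds `tau`, and a safeguard (not on the slide) keeps at least one expert alive.

```python
from collections import deque


class MeanExpert:
    """Hypothetical stand-in for a regression tree: predicts the mean target seen."""

    def __init__(self, window=()):
        self.total, self.count = 0.0, 0
        for x, y in window:
            self.train(x, y)

    def train(self, x, y):
        self.total += y
        self.count += 1

    def predict(self, x):
        return self.total / self.count if self.count else 0.0


class DWMRegressor:
    def __init__(self, p=1, beta=0.5, theta=0.01, tau=0.1, window_size=3,
                 make_expert=MeanExpert):
        self.p, self.beta, self.theta, self.tau = p, beta, theta, tau
        self.make_expert = make_expert
        self.experts = [make_expert()]
        self.weights = [1.0]
        self.window = deque(maxlen=window_size)  # last training instances (wTS)
        self.i = 0

    def update(self, x, y):
        """Process one instance and return the global prediction made for it."""
        self.i += 1
        self.window.append((x, y))
        periodic = self.i % self.p == 0
        if periodic:
            # Penalize experts whose error exceeds the threshold tau.
            self.weights = [
                w * self.beta if abs(e.predict(x) - y) > self.tau else w
                for w, e in zip(self.weights, self.experts)
            ]
            # Delete experts whose weight fell below theta.
            pairs = list(zip(self.weights, self.experts))
            kept = [pair for pair in pairs if pair[0] >= self.theta]
            if not kept:  # safeguard (an assumption): keep the best expert
                kept = [max(pairs, key=lambda pair: pair[0])]
            # Normalize so that the maximum weight equals 1.
            top = max(w for w, _ in kept)
            self.weights = [w / top for w, _ in kept]
            self.experts = [e for _, e in kept]
        # Global prediction: weighted average of the expert predictions.
        sigma = sum(w * e.predict(x) for w, e in zip(self.weights, self.experts))
        sigma /= sum(self.weights)
        # Train the existing experts on the new instance.
        for e in self.experts:
            e.train(x, y)
        # If the ensemble was wrong, add an expert trained on the window,
        # which already contains the new instance.
        if periodic and abs(sigma - y) > self.tau:
            self.experts.append(self.make_expert(self.window))
            self.weights.append(1.0)
        return sigma
```

Feeding the model a target that jumps from one level to another shows the intended behaviour: the error is large right after the change, then shrinks as experts trained on the recent window gain relative weight and outdated experts are discarded.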
Capacitated Arc Routing Problem
- Case Study: Crossover Operator Selection using the MAENS algorithm for the Capacitated Arc Routing Problem [4];
- Considers the use of a suite of four different crossover operators;
- Credit Assignment Mechanism: Proportional Reward (PR);
- We exploit the Local Search of MAENS* to perform the FLA techniques without extra computational cost.
[4] Tang K., Mei Y., Yao X. - "Memetic Algorithm with Extended Neighborhood Search for Capacitated Arc Routing Problems" - 2009
Credit Mechanism
Credit Assignment Mechanism: percentage of offspring generated by each operator surviving to the next generation.
Proportional Reward:
$$PR^{(i)}_t = \frac{\left|\{x \in pop_{t+1} : x \text{ generated by operator } i\}\right|}{\left|pop_{t+1}\right|}$$
- Indirect effect of the crossover operator;
- We entrust the selection/ranking operator of the algorithm to evaluate the individuals.
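A small sketch of how the Proportional Reward could be computed is given below. It assumes, purely for illustration, that each individual of the next population is tagged with the name of the operator that generated it, or with `None` if it was not generated by any operator in this generation. The function name and the tagging scheme are hypothetical; the operator names are those shown in the selection-rate figures.

```python
from collections import Counter


def proportional_reward(next_population_origins, operators):
    """Fraction of the next population generated by each operator."""
    size = len(next_population_origins)
    counts = Counter(next_population_origins)
    # An empty population yields zero reward for every operator.
    return {op: (counts[op] / size if size else 0.0) for op in operators}


origins = ["gsbx", "grx", "gsbx", None, "pbx", "gsbx", None, "grx"]
rewards = proportional_reward(origins, ["gsbx", "grx", "pbx", "spbx"])
print(rewards)
```

The rewards do not need to sum to 1, because individuals that were not produced by any operator still count in the denominator.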
CARP - instance
- each arc (task) has a service cost and a demand;
- constraints: number of vehicles and capacity;
- objective function: minimize the total service cost;
- proved NP-Hard in 1981;
- many real-world applications (e.g. waste collection, road gritting).
MAENS*-II
- FLA is performed during each iteration;
- basic Operator Selection Rule: choose the operator with the largest instantaneous reward, in order to reduce the bias of previous performance;
- Credit Assignment through DWM.
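The selection rule can be illustrated with a short sketch. It assumes one reward predictor per operator (for instance a DWM model) that maps the current FLA features to a predicted reward. The function name, the feature tuple and the lambda predictors are hypothetical placeholders, not part of MAENS*-II as presented.

```python
def select_operator(fla_features, predictors):
    """Return the operator whose predictor reports the largest reward.

    `predictors` maps an operator name to a callable taking the FLA features.
    Ties go to the operator listed first.
    """
    predicted = {op: predict(fla_features) for op, predict in predictors.items()}
    return max(predicted, key=predicted.get)


# Toy predictors standing in for trained models (values are illustrative only).
predictors = {
    "gsbx": lambda features: 0.20,
    "grx": lambda features: 0.35,
    "pbx": lambda features: 0.30,
    "spbx": lambda features: 0.15,
}
chosen = select_operator((0.4, 0.1, 0.2, -1.5), predictors)
print(chosen)
```

Because the rule looks only at the current prediction, an operator that performed well in the past gains no advantage unless the predictor still expects it to do well on the current population.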
Experimental studies
- Experiments conducted on a set of 42 non-easy CARP instances belonging to the egl, val and Beullens' benchmark sets;
- Average fitness values calculated over 30 independent runs;
- In order to provide a lower bound and a term of comparison for the results, an Oracle using only the Proportional Reward is built;
1. Tested optimization results against MAENS* and the Oracle;
2. Tested prediction ability against the Oracle.
MAENS*-II vs MAENS*
- MAENS* uses MAB and Proportional Reward;
- MAENS*-II wins the comparison with MAENS* on 20 instances and loses on 18 out of 42 instances;
- A Wilcoxon signed-rank test over the set of instances suggests that there is no statistical difference between the results achieved by the two algorithms;
- 6 instances show statistically different results using the Wilcoxon rank-sum test on each pair of results.
Instance    MAENS*-II avg fitness   MAENS*-II std   MAENS* avg fitness   MAENS* std
D23         767.67                  7.39            769.83               12.28
E15         1604.33                 5.59            1602.50              6.68
E19         1442.00                 4.58            1442.67              4.23
F19         732.50                  9.64            735.17               9.35
egl-s1-B    6397.59                 12.70           6399.90              16.38
egl-s2-B    13171.41                29.49           13179.07             26.11
MAENS*-II vs Oracle
- The Oracle achieves better results on 40 instances;
- On 2 instances MAENS*-II managed to achieve better results than the Oracle;
- If the Oracle shows the bound achievable using only PR, then the use of FLA+PR can enhance the optimization ability of the algorithm.
Prediction Ability: Oracle
Figure: Oracle Selection Rates on instance egl-s2-B (selection rates between 0 and 1 for the operators gsbx, grx, pbx and spbx; axis labelled with fitness values from 13971.10 down to 13138.98).
Prediction Ability: MAENS*
Figure: MAENS* Selection Rates on instance egl-s2-B (selection rates between 0 and 0.6 for the operators gsbx, grx, pbx and spbx; axis labelled with fitness values from 13977.76 down to 13162.09).
Prediction Ability: MAENS*-II
Figure: MAENS*-II Selection Rates on instance egl-s2-B (selection rates between 0 and 0.8 for the operators gsbx, grx, pbx and spbx; axis labelled with fitness values from 13983.50 down to 13156.15).
Future work
- Integrated with a Reinforcement Learning mechanism with concurrent use of the operators;
- Tested the use of a diversity-based reward measure;
- Improved results when using RL;
- Outperformed the state of the art on Large Scale CARP instances.