Simple On-the-Fly Parameter Selection

Presentation at GECCO 2018 by Carola Doerr (CNRS and Sorbonne University, Paris, France) and Markus Wagner (University of Adelaide, Australia).


  1. Simple On-the-Fly Parameter Selection
     Carola Doerr, CNRS and Sorbonne University, Paris, France
     Markus Wagner, University of Adelaide, Australia
     Presentation at GECCO 2018
     Carola Doerr, Markus Wagner: Simple On-the-Fly Parameter Selection Mechanisms for Two Classical Discrete Black-Box Optimization Benchmark Problems
     Carola Doerr and Markus Wagner: Simple On-the-Fly Parameter Selection

  2. The Parameter Selection Problem
     - Evolutionary algorithms and related iterative optimization heuristics are parametrized algorithms
     - Example: $(\mu + \lambda)$ EAs
     - Parameters:
       - Memory size $\mu$
       - Offspring population size $\lambda$
       - Crossover rate
       - Mutation rate, search radius, etc.
       - Selective pressure
     - How shall I set these parameters to get a well-performing EA?

  3. Parameter Tuning vs. Parameter Control
     - Parameter Tuning:
       - Initial set of experiments
       - Deduce reasonable parameter settings
       - Does not have to be done manually; a number of powerful, ready-to-use tools are available: irace, SPOT, ParamILS, SMAC, GGA, ...
     - Parameter Control:
       - 2 main differences:
         - Parameters are set while optimizing
         - Parameters change over time
       - Key motivation: different parameter values can be optimal in different stages of an optimization process

  4. Goals of Parameter Control
     - to identify good parameter values “on the fly”
     - to track good parameter values when they change during the optimization process

  5. Parameter Control
     - Example: LeadingOnes: LO(110110101010) = 2
     - Randomized Local Search: flip $k$ bits, keep the better of parent and offspring
     - $k_{\mathrm{opt}}(x) = \lfloor n / (\mathrm{LO}(x) + 1) \rfloor$

  6. Parameter Control
     - Example: LeadingOnes: LO(110110101010) = 2
     - Randomized Local Search: flip $k$ bits, keep the better of parent and offspring
     - $n = 1000$, $k = \lfloor n / (\mathrm{LO}(x) + 1) \rfloor$

  7. Parameter Control
     - Example: LeadingOnes: LO(110110101010) = 2
     - Randomized Local Search: flip $k$ bits, keep the better of parent and offspring
     - $n = 1000$, $k = \lfloor n / (\mathrm{LO}(x) + 1) \rfloor$: 22% smaller optimization time
     - How can I find/predict such a dependence?
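
A minimal Python sketch of the LeadingOnes example and of RLS with a fitness-dependent number of flipped bits. The function names, the uniformly random starting point, the acceptance of ties, and the use of floor(n / (LO(x) + 1)) as the flip count are assumptions made for illustration; the slides do not show code.

```python
import random


def leading_ones(x):
    """Number of consecutive 1-bits at the start of the bit list x."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def k_fitness_dependent(n, lo_value):
    """Assumed flip count: floor(n / (LO(x) + 1)); at least 1 while LO(x) < n."""
    return n // (lo_value + 1)


def rls(n, choose_k, rng, max_iters=10**7):
    """Randomized Local Search on LeadingOnes.

    Flips exactly k distinct bits per iteration and keeps the better of
    parent and offspring (ties go to the offspring, an assumption).
    Returns the number of iterations used.
    """
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = leading_ones(x)
    iters = 0
    while fx < n and iters < max_iters:
        k = choose_k(n, fx)
        y = x[:]
        for i in rng.sample(range(n), k):
            y[i] = 1 - y[i]
        fy = leading_ones(y)
        if fy >= fx:
            x, fx = y, fy
        iters += 1
    return iters


if __name__ == "__main__":
    rng = random.Random(42)
    print("LO(110110101010) =", leading_ones([1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0]))
    print("dynamic k:", rls(100, k_fitness_dependent, rng), "iterations")
    print("static k=1:", rls(100, lambda n, lo: 1, rng), "iterations")
```

Passing `lambda n, lo: 1` as `choose_k` gives classical one-bit-flip RLS, so the two variants can be compared over repeated runs.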

  8. Good News: You Don’t Have to!
     - Easy mechanisms which find close-to-optimal parameter values automatically
     - [Figure: mutation strength (log scale, 1 to 1000) vs. LO(x) (0 to 250); series: optimal mutation strength, avg. mutation strength of adaptive EA]

  9. Good News: You Don’t Have to!
     - With close-to-optimal performance
     - [Figure: mutation strength (log scale, 1 to 1000) and avg. hitting time (x 1000, 0 to 35) vs. LO(x) (0 to 250); series: optimal mutation strength, avg. mutation strength of adaptive EA, avg. hitting time of dynamic (1+1) EA, avg. hitting time of best static RLS, avg. hitting time of best dynamic RLS]

  10. Good News: You Don’t Have to!
      - Running time for update strengths $A = 2$, $b = 1/2$ (empirical)
        - around 20.5% performance gain over the (1+1) EA$_{>0}$ with static mutation rate $p = 1/n$
        - 14% performance gain over RLS
        - larger gains possible for other combinations of $A$ and $b$

  11. Success-Based Multiplicative Update Rule
      - Create offspring $y$ through standard bit mutation with mutation probability $p$
      - $A > 1$, $b < 1$

  12. Success-Based Multiplicative Update Rule
      - Standard bit mutation, conditioned on flipping at least one bit
      - $A > 1$, $b < 1$
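
The update rule can be sketched as a (1+1) EA whose mutation probability is multiplied by A after a successful iteration and by b after an unsuccessful one. This is a sketch under stated assumptions, not the authors' implementation: the function names, the initial rate 1/n, the bounds `p_min = 1/n**2` and `p_max = 1/2`, counting ties as successes, and treating fitness n as the optimum are all choices made here for illustration.

```python
import random


def one_max(x):
    """Number of 1-bits in x."""
    return sum(x)


def mutate_at_least_one(x, p, rng):
    """Standard bit mutation with rate p, resampled until at least one bit flips."""
    n = len(x)
    while True:
        flips = [i for i in range(n) if rng.random() < p]
        if flips:
            break
    y = x[:]
    for i in flips:
        y[i] = 1 - y[i]
    return y


def adaptive_one_plus_one_ea(f, n, A, b, rng, p_min=None, p_max=0.5, max_iters=10**6):
    """(1+1) EA with success-based multiplicative update of the mutation rate.

    Requires A > 1 and 0 < b < 1. Returns (iterations, final mutation rate).
    The bounds p_min and p_max are assumptions of this sketch.
    """
    if not (A > 1 and 0 < b < 1):
        raise ValueError("need A > 1 and 0 < b < 1")
    if p_min is None:
        p_min = 1.0 / n**2
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = f(x)
    p = 1.0 / n  # assumed initial mutation rate
    iters = 0
    while fx < n and iters < max_iters:
        y = mutate_at_least_one(x, p, rng)
        fy = f(y)
        if fy >= fx:
            # success: accept the offspring and increase the rate
            x, fx = y, fy
            p = min(A * p, p_max)
        else:
            # failure: keep the parent and decrease the rate
            p = max(b * p, p_min)
        iters += 1
    return iters, p


if __name__ == "__main__":
    iters, p = adaptive_one_plus_one_ea(one_max, 100, A=2.0, b=0.5, rng=random.Random(1))
    print(iters, "iterations, final mutation rate", p)
```

Resampling until at least one bit flips is one simple way to realise the conditioning; for very small rates it becomes slow, and sampling the number of flipped bits from a conditional binomial distribution would be a faster alternative.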

  13. LeadingOnes
      - Average optimization time for different combinations of $A$ and $b$ (101 independent runs)
      - For comparison: RLS needs $n^2/2$ iterations ($0.5 \times 10^4$ and $3.125 \times 10^4$ above); the (1+1) EA$_{>0}$ needs $0.54 \times 10^4$ and $3.4 \times 10^4$ iterations, respectively

  14. LeadingOnes
      - Average optimization time for different combinations of $A$ and $b$ (101 independent runs)
      - For comparison: RLS needs $n^2/2$ iterations ($0.5 \times 10^4$, $3.125 \times 10^4$, and $1.25 \times 10^5$ above); the (1+1) EA$_{>0}$ needs $0.54 \times 10^4$, $3.4 \times 10^4$, and $1.35 \times 10^5$ iterations, respectively

  15. LeadingOnes
      - Average optimization time for different combinations of $A$ and $b$ (101 independent runs)
      - For comparison: RLS needs $n^2/2$ iterations ($= 1.25 \times 10^5$ for $n = 500$); the (1+1) EA$_{>0}$ needs $1.35 \times 10^5$ iterations

  16. 1/5-th Success Rules
      - 1/5-th success rule:
        - originally from continuous optimization [Rechenberg, Devroye, Schumer/Steiglitz]
        - (1+1) ES optimizing sphere $f(x) = \sum_{i=1}^{n} x_i^2$
        - When success rate > 1/5: increase search radius
        - When success rate < 1/5: decrease search radius
      - In discrete optimization, e.g., [Kern/Müller/Hansen/Büche/Ocenasek/Koumoutsakos04, Auger09]:
        - When success rate ≈ 1/5, parameter value should be stable
      - In our algorithm: if $f(y) \ge f(x)$: $p \leftarrow \min\{Ap, 1/2\}$, else $p \leftarrow \max\{bp, 1/n^2\}$
      - $b = A^{-1/4}$, since $A b^4 = 1 \Leftrightarrow b = (1/A)^{1/4}$

  18. Results for the 1/5-th Success Rule
      - LO, $n = 500$, 100 independent runs
      - RLS performance: 125,000 iterations
      - [Figure: plot over $A$ from 1 to 2, y-axis from 105,000 to 125,000]

  19. 1:x Success Rules
      - A priori, there is no reason to restrict ourselves to a 1:5 success ratio
      - We can also try different success rules
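
The stability idea behind these rules ("when the success rate is about 1/5, the parameter value should be stable") can be turned into a small computation. If a success multiplies the parameter by A and a failure multiplies it by b, then one success followed by x - 1 failures leaves the parameter unchanged exactly when A * b^(x-1) = 1. The sketch below assumes that a 1:x rule means one success per x iterations; the function name is hypothetical.

```python
def failure_factor(A, x=5):
    """Return b with A * b**(x - 1) == 1, i.e. b = A**(-1 / (x - 1)).

    A is the success factor (> 1); x is the denominator of the assumed
    1:x success ratio (one success per x iterations, x >= 2).
    """
    if A <= 1:
        raise ValueError("A must be greater than 1")
    if x < 2:
        raise ValueError("x must be at least 2")
    return A ** (-1.0 / (x - 1))


if __name__ == "__main__":
    for x in range(2, 9):
        print(f"1:{x} rule, A=2.0 -> b={failure_factor(2.0, x):.4f}")
```

Under this convention the 1:5 rule with A = 2 gives b = 2^(-1/4), and larger x pushes b closer to 1, so each failure shrinks the parameter less.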

  20. Average Optimization Times of 1:x Rules
      - LO, $n = 500$, 100 independent runs
      - RLS performance: 125,000 iterations
      - [Figure: plot over $A$ from 1 to 2, y-axis from 95,000 to 125,000; legend: 2, 3, 4, 5, 6, 7, 8]

  21. Overall Performance Summary
      - 50% of all configurations with $1 < A \le 2.5$ and $0.4 \le b < 1$ are better than RLS by at least 13%

  22. Results for OneMax
      Average Runtime on OneMax for Different Dimensions (average optimization time by dimension n)

      Algorithm         n=100    n=500   n=1000   n=2000   n=3000
      RLS                 445    3,050    6,871   14,809   23,814
      RLS_opt             436    2,974    6,690   14,722   23,507

  23. Results for OneMax
      Average Runtime on OneMax for Different Dimensions (average optimization time by dimension n)

      Algorithm         n=100    n=500   n=1000   n=2000   n=3000
      (1+1) EA_>0         679    4,756   10,574   24,352   37,256
      RLS                 445    3,050    6,871   14,809   23,814
      RLS_opt             436    2,974    6,690   14,722   23,507

  24. Results for OneMax
      Average Runtime on OneMax for Different Dimensions (average optimization time by dimension n)

      Algorithm         n=100    n=500   n=1000   n=2000   n=3000
      (1+1) EA_>0         679    4,756   10,574   24,352   37,256
      A=1.11, b=0.66      447    3,039    6,749   15,134   23,726
      A=1.2, b=0.85       450    3,059    6,751   14,801   23,558
      A=1.3, b=0.75       450    3,033    6,801   14,974   23,715
      A=2.0, b=0.5        455    3,013    6,753   14,613   23,417
      RLS                 445    3,050    6,871   14,809   23,814
      RLS_opt             436    2,974    6,690   14,722   23,507
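
The relative gains implied by these OneMax numbers can be checked with a few lines of Python. The runtimes are copied from the slide; the helper names and the definition of gain as (baseline - value) / baseline are assumptions made for illustration.

```python
DIMENSIONS = [100, 500, 1000, 2000, 3000]

# Average optimization times on OneMax, copied from the slide.
RUNTIMES = {
    "(1+1) EA_>0": [679, 4756, 10574, 24352, 37256],
    "A=2.0, b=0.5": [455, 3013, 6753, 14613, 23417],
    "RLS": [445, 3050, 6871, 14809, 23814],
    "RLS_opt": [436, 2974, 6690, 14722, 23507],
}


def relative_gain(baseline, value):
    """Fraction by which value improves on baseline (negative means worse)."""
    return (baseline - value) / baseline


def gains(algorithm, baseline):
    """Relative gain of algorithm over baseline for each dimension n."""
    return {
        n: relative_gain(base, val)
        for n, base, val in zip(DIMENSIONS, RUNTIMES[baseline], RUNTIMES[algorithm])
    }


if __name__ == "__main__":
    for baseline in ("(1+1) EA_>0", "RLS"):
        for n, g in gains("A=2.0, b=0.5", baseline).items():
            print(f"n={n}: {100 * g:+.1f}% vs. {baseline}")
```

A negative value means the adaptive configuration needed more iterations than the baseline at that dimension.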

  25. Heatmaps for OneMax

  26. Percentage of Configs better than the (1+1) EA$_{>0}$ by at least a given percentage
      - [Figure: % of configurations (0% to 100%) vs. % better (0% to 40%); legend: 100, 500, 1000, 1500, 2000]
      - Even better results if we restrict to configurations with $1 < A \le 2.5$ and $0.4 \le b < 1$
