Tuning numerical parameters of algorithms: sampling and stochasticity handling
Z. Yuan, T. Stützle, M. Birattari, M. Montes de Oca
IRIDIA, CoDE, Université Libre de Bruxelles, Brussels, Belgium
zyuan@ulb.ac.be, iridia.ulb.ac.be/~zyuan


  2. Outline
     1. The tuning problem
     2. Tuning algorithm
        ◮ Sampling in parameter space
        ◮ Budget allocation for ranking and selection: F-Race
        ◮ Combine F-Race with sampling method
        ◮ Iterated F-Race (Birattari et al. 2010)
        ◮ Post-selection mechanism
     3. Experimental results

  4. Configuration of parameterized algorithms
     Algorithm components
     ◮ categorical parameters
       ◮ choice of neighborhood in local search
       ◮ choice of crossover and mutation in EAs
       ◮ type of perturbation in iterated local search
     ◮ numerical parameters (real-valued or integer)
       ◮ crossover and mutation rates
       ◮ tabu list length
       ◮ perturbation strength

  5. Importance of the tuning problem
     ◮ improvement over default settings, manual tuning
     ◮ reduction of development time and human intervention
     ◮ empirical studies, comparisons of algorithms
     ◮ support for end users of algorithms

  13. Tuning problem: formal definition (Birattari et al. 2002)
      The tuning problem can be defined as a tuple $\langle \Theta, I, P_I, P_C, \mathcal{C} \rangle$
      ◮ $\Theta$: set of candidate configurations.
      ◮ $I$: set of instances. $P_I$: probability measure over $I$.
      ◮ $c(\theta, i)$: random variable representing the cost measure of a configuration $\theta \in \Theta$ on instance $i \in I$.
      ◮ $C \subset \Re$: range of $c$. $P_C$: probability measure over the set $C$.
      ◮ $\mathcal{C}(\theta) = \mathcal{C}(\theta \mid \Theta, I, P_I, P_C)$: performance expectation:
        $$\mathcal{C}(\theta) = E_{I,C}[c] = \int c \, dP_C(c \mid \theta, i) \, dP_I(i) \tag{1}$$
      ◮ The objective is to find a performance-optimizing configuration $\bar{\theta}$:
        $$\bar{\theta} = \arg\min_{\theta \in \Theta} \mathcal{C}(\theta) \tag{2}$$
      ◮ Analytical solution not possible, hence estimate expected cost in a Monte Carlo fashion
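
The Monte Carlo estimation mentioned on this slide can be illustrated with a minimal sketch: draw instances from the instance distribution, run the (stochastic) algorithm once per draw, and average the observed costs. The names `estimate_expected_cost`, `sample_instance`, and `run_algorithm` are hypothetical placeholders and do not come from the slides.

```python
import random


def estimate_expected_cost(theta, sample_instance, run_algorithm, n_runs, rng):
    """Monte Carlo estimate of the expected cost of configuration theta.

    sample_instance(rng) draws one instance (stands in for P_I).
    run_algorithm(theta, instance, rng) returns one observed cost
    (stands in for a draw of c(theta, i)).
    Both callables are assumptions made for this sketch.
    """
    total = 0.0
    for _ in range(n_runs):
        instance = sample_instance(rng)                 # i drawn from P_I
        total += run_algorithm(theta, instance, rng)    # one draw of c(theta, i)
    return total / n_runs


def toy_sample_instance(rng):
    # Toy instance: a "difficulty" offset drawn uniformly from [0, 1].
    return rng.uniform(0.0, 1.0)


def toy_run_algorithm(theta, instance, rng):
    # Toy cost: quadratic in theta, shifted by the instance, plus noise.
    return (theta - 0.3) ** 2 + instance + rng.gauss(0.0, 0.1)
```

Calling `estimate_expected_cost(0.3, toy_sample_instance, toy_run_algorithm, 1000, random.Random(0))` returns a noisy estimate; a larger `n_runs` reduces the variance at the price of more algorithm runs.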

  14. Tuning problem is an optimization problem
      Variables: mixed discrete-continuous, conditional variables
      Objective:
      ◮ black-box
      ◮ stochastic
        ◮ due to stochasticity of the algorithm
        ◮ due to sampling of instances

  16. Solving the tuning problem: our approach
      ◮ sampling in parameter space
      ◮ budget allocation for ranking and selection under stochasticity: F-Race
      ◮ combine budget allocator with sampling methods
      Open question: the trade-off between allocating budget to sampling new points and to evaluating already sampled points.

  17. Sampling in parameter space
      ◮ focus on numerical parameters
      ◮ usually low dimension, low budget
      ◮ sampling in established tuners: ad-hoc, factorial design, Kriging approximation
      ◮ our work studies state-of-the-art derivative-free optimizers: BOBYQA, CMA-ES, and MADS (Yuan et al. 2010, 2012a)
      [Figure: two panels, "Average rank of algorithms across numbers of parameters in MMAS" and "Average rank of algorithms across numbers of parameters in cPSO"; x-axis: number of parameters; y-axis: average rank; curves: CMAES, MADS, IRS, URS, BOBYQA]
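
As a point of reference for what sampling numerical parameters can look like in its simplest form, the sketch below draws configurations uniformly at random from a box of lower and upper bounds. It is only an illustrative baseline, not one of the optimizers studied in the talk; the function name `sample_uniform` and the parameter names in the usage note are hypothetical.

```python
import random


def sample_uniform(bounds, n, rng, integer=()):
    """Draw n configurations uniformly at random from a box.

    bounds maps a parameter name to a (low, high) pair.
    Names listed in `integer` are sampled as integers (both ends inclusive);
    all other parameters are sampled as real values.
    """
    configs = []
    for _ in range(n):
        config = {}
        for name, (low, high) in bounds.items():
            if name in integer:
                config[name] = rng.randint(low, high)   # integer parameter
            else:
                config[name] = rng.uniform(low, high)   # real-valued parameter
        configs.append(config)
    return configs
```

For example, `sample_uniform({"mutation_rate": (0.0, 1.0), "tabu_length": (1, 50)}, 20, random.Random(1), integer={"tabu_length"})` yields 20 candidate configurations that a budget allocator could then evaluate.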

  34. F-Race (Birattari et al. 2002)
      ◮ start with a set of initial candidates
      ◮ consider a stream of instances
      ◮ sequentially evaluate candidates
      ◮ discard statistically worse candidates as detected by Friedman test
      [Figure: race diagram with axes labeled Θ and i]
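
The racing loop listed on this slide can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the rank-based Friedman statistic and the pairwise comparison on rank sums are written as commonly given in the racing literature and should be checked against the original paper, the chi-square quantile uses the Wilson-Hilferty approximation, and a normal quantile stands in for the Student's t quantile. All function names, `alpha`, and `min_instances` are assumptions made for the example.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 (lowest cost); tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for idx in order[pos:end + 1]:
            ranks[idx] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def chi2_quantile(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + NormalDist().inv_cdf(p) * math.sqrt(a)) ** 3


def f_race(candidates, instances, cost, alpha=0.05, min_instances=5):
    """Race candidates over a stream of instances; return the survivors.

    cost(theta, instance) returns one observed cost (lower is better).
    """
    alive = list(range(len(candidates)))
    results = []  # one dict per instance: candidate index -> observed cost
    for k, instance in enumerate(instances, start=1):
        results.append({j: cost(candidates[j], instance) for j in alive})
        if k < max(min_instances, 2) or len(alive) < 2:
            continue
        m = len(alive)
        rank_rows = [average_ranks([row[j] for j in alive]) for row in results]
        rank_sums = [sum(row[c] for row in rank_rows) for c in range(m)]
        denom = sum(r * r for row in rank_rows for r in row) - k * m * (m + 1) ** 2 / 4
        if denom <= 0:  # every instance fully tied: no evidence yet
            continue
        stat = (m - 1) * sum((r - k * (m + 1) / 2) ** 2 for r in rank_sums) / denom
        if stat <= chi2_quantile(1 - alpha, m - 1):
            continue  # no significant difference detected; keep racing
        # Critical rank-sum difference (normal quantile replaces Student's t).
        spread = 2 * k * (1 - stat / (k * (m - 1))) * denom / ((k - 1) * (m - 1))
        crit = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(max(0.0, spread))
        best = min(rank_sums)
        alive = [j for j, r in zip(alive, rank_sums) if r - best <= crit]
        if len(alive) == 1:
            break
    return [candidates[j] for j in alive]
```

With a noise-free toy cost such as `lambda theta, i: (theta - 0.3) ** 2 + i`, the race over `[0.0, 0.3, 0.5, 0.9]` keeps only `0.3` once the minimum number of instances has been seen; with noisy costs, candidates are dropped gradually as evidence accumulates.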
