
Monte Carlo Tree Search for Algorithm Configuration: MOSAIC (PowerPoint presentation transcript)



  1. Monte Carlo Tree Search for Algorithm Configuration: MOSAIC. Herilalaina Rakotoarison and Michèle Sebag. TAU, CNRS, INRIA, LRI, Université Paris-Sud. NeurIPS MetaLearning Workshop, Dec. 8, 2018.

  2. Monte Carlo Tree Search for Algorithm Configuration: MOSAIC. Herilalaina Rakotoarison and Michèle Sebag. TAU (Tackling the Underspecified), CNRS, INRIA, LRI, Université Paris-Sud. NeurIPS MetaLearning Workshop, Dec. 8, 2018.

  3. AutoML: Algorithm Selection and Configuration. A mixed optimization problem: find λ* ∈ argmin_{λ ∈ Λ} L(λ, P), with λ a pipeline and L the predictive loss on dataset P. Modes: offline hyper-parameter setting; online hyper-parameter setting. Approaches: Bayesian optimization (SMAC, Auto-sklearn, Auto-WEKA, BOHB) [Hutter et al., 11; Feurer et al., 15; Kotthoff et al., 17; Falkner et al., 18]; evolutionary computation [Olson et al., 16; Choromanski et al., 18]; bilevel optimization [Franceschi et al., 17, 18]; reinforcement learning [Andrychowicz et al., 16; Drori et al., 18].
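The mixed search space above (a categorical algorithm choice coupled with its continuous hyper-parameters) can be made concrete with a small sketch. The loss landscape, the algorithm names, and the random-search baseline below are illustrative stand-ins, not part of MOSAIC; the point is only the shape of the problem, λ = (algorithm, hyper-parameter) and the goal argmin L(λ, P).

```python
import random

# Toy version of the mixed AutoML problem: a "pipeline" lambda pairs a
# categorical choice (which algorithm) with a continuous hyper-parameter,
# and we seek argmin_lambda L(lambda, P). The loss below is synthetic,
# standing in for a validation loss on dataset P.

ALGORITHMS = ("svm", "forest", "knn")

def loss(algorithm, hyper):
    # Hypothetical landscape: each algorithm has its own optimal
    # hyper-parameter value and its own best achievable loss.
    optimum = {"svm": 0.3, "forest": 0.7, "knn": 0.5}[algorithm]
    base = {"svm": 0.10, "forest": 0.05, "knn": 0.20}[algorithm]
    return base + (hyper - optimum) ** 2

def random_search(budget, rng):
    # Simplest baseline: sample pipelines uniformly, keep the best.
    best = None
    for _ in range(budget):
        lam = (rng.choice(ALGORITHMS), rng.random())
        cand = (loss(*lam), lam)
        if best is None or cand < best:
            best = cand
    return best

rng = random.Random(42)
best_loss, best_lam = random_search(500, rng)
print(best_lam, round(best_loss, 3))
```

On this landscape random search finds the "forest" branch near its optimum; the approaches listed on the slide (Bayesian optimization, evolutionary computation, bilevel optimization, reinforcement learning, and MCTS in MOSAIC) are all ways of spending the same budget more cleverly.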

  4. Monte Carlo Tree Search [Kocsis & Szepesvári, 06; Gelly & Silver, 07]. For game playing when there is no good evaluation function and the search space is huge. Upper Confidence Tree (UCT): gradually grow the search tree. Building blocks: select the next action (bandit-based phase) [Auer et al., 02]; add a node (a leaf of the search tree); select further actions at random (random phase); compute the instant reward; update the information in all visited nodes. Returned solution: the path visited most often. Within learning: feature selection [Gaudel & Sebag, 10]; active learning [Rolet, Teytaud & Sebag, 09]. [Figure: the explored search tree; successive animation frames highlight the bandit-based phase, the new node, and the random phase.]

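The building blocks on the MCTS slide can be sketched end to end. This is a minimal, generic UCT loop on a toy problem (choose five binary actions; the reward is the fraction of 1s), not the MOSAIC algorithm itself; the constant C, the depth, and all names are illustrative assumptions.

```python
import math
import random

DEPTH = 5          # length of an action sequence
C = math.sqrt(2)   # exploration constant in the UCB1 score

class Node:
    def __init__(self, path=()):
        self.path = path      # actions taken so far
        self.children = {}    # action -> Node
        self.visits = 0
        self.value = 0.0      # sum of rewards backed up through this node

    def ucb1(self, child):
        # Bandit score [Auer et al., 02]: mean reward + exploration bonus.
        return (child.value / child.visits
                + C * math.sqrt(math.log(self.visits) / child.visits))

def rollout(path):
    # Random phase: complete the sequence with random actions,
    # then compute the instant reward.
    while len(path) < DEPTH:
        path = path + (random.choice((0, 1)),)
    return sum(path) / DEPTH

def iterate(root):
    # 1. Bandit-based phase: descend through fully expanded nodes.
    node, visited = root, [root]
    while len(node.path) < DEPTH and len(node.children) == 2:
        node = max(node.children.values(), key=node.ucb1)
        visited.append(node)
    # 2. Add a node (a new leaf of the search tree).
    if len(node.path) < DEPTH:
        action = random.choice([a for a in (0, 1) if a not in node.children])
        child = Node(node.path + (action,))
        node.children[action] = child
        node = child
        visited.append(child)
    # 3. Random phase and instant reward.
    reward = rollout(node.path)
    # 4. Update information in all visited nodes.
    for n in visited:
        n.visits += 1
        n.value += reward
    return reward

def best_path(root):
    # Returned solution: follow the most visited child at each level.
    node, path = root, []
    while node.children:
        node = max(node.children.values(), key=lambda c: c.visits)
        path.append(node.path[-1])
    return path

random.seed(0)
root = Node()
for _ in range(2000):
    iterate(root)
print(best_path(root))
```

After a few thousand iterations the most-visited path concentrates on the all-ones sequence, illustrating how the bandit-based phase trades off exploring rarely tried actions against exploiting high-reward ones.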
