Towards Assessing the Impact of Bayesian Optimization’s Own Hyperparameters
Marius Lindauer, Matthias Feurer, Katharina Eggensperger, André Biedenkapp & Frank Hutter
DSO@IJCAI 2019
Motivation
1. Hyperparameter optimization is crucial to achieve peak performance!
2. Bayesian optimization is a successful approach for that!
Quick Recap on Bayesian Optimization
Figure: the Bayesian optimization loop — update the predictive model on the observations so far, optimize the acquisition function to choose where to evaluate next, and evaluate the target function there.
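To make the loop concrete, below is a minimal sketch in Python (an illustration, not the implementation used in this work): a Gaussian-process model with an expected-improvement acquisition function, minimizing a toy 1-D target. The toy target, search range, and budget are arbitrary choices for the example.

```python
# Minimal 1-D Bayesian optimization loop (illustrative sketch only).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def target(x):                                  # toy target function to minimize
    return np.sin(3 * x) + 0.1 * x ** 2

def expected_improvement(mu, sigma, best):      # EI for minimization
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

np.random.seed(0)
grid = np.linspace(-3, 3, 500).reshape(-1, 1)   # candidate points for the acquisition
X = np.random.uniform(-3, 3, size=(3, 1))       # initial design
y = target(X).ravel()

for _ in range(20):                             # BO iterations
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(grid, return_std=True)      # update predictive model
    ei = expected_improvement(mu, sigma, y.min())       # acquisition values
    x_next = grid[np.argmax(ei)].reshape(1, -1)         # choose where to evaluate next
    X = np.vstack([X, x_next])
    y = np.append(y, target(x_next).ravel())

print("best value found:", y.min())
```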
Related Work
Bayesian optimization can be improved by:
● Changing transformations of the target function [2]
● Changing its initial design [2,4]
● Tuning the model on- and offline [1,3]
● Changing the acquisition function [5,6]

[1] G. Malkomes and R. Garnett. Automating Bayesian optimization with Bayesian optimization. NeurIPS 2018
[2] D. Jones et al. Efficient global optimization of expensive black-box functions. JGO 1998
[3] J. Snoek et al. Scalable Bayesian optimization using deep neural networks. ICML 2015
[4] D. Brockhoff et al. The impact of initial designs on the performance of MATSuMoTo on the noiseless BBOB-2015 testbed: A preliminary study. GECCO 2015
[5] V. Picheny et al. A benchmark of kriging-based infill criteria for noisy optimization. Structural and Multidisciplinary Optimization 2013
[6] M. Hoffman et al. Portfolio allocation for Bayesian optimization. UAI 2011
Goal: Meta-Optimization
Figure: an outer optimizer wraps the Bayesian optimization loop (update the predictive model, optimize the acquisition function, evaluate the target function) and tunes its hyperparameters.
Similar to: N. Dang, L. Pérez Cáceres, P. De Causmaecker, and T. Stützle. Configuring irace using surrogate configuration benchmarks. GECCO 2017
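The sketch below illustrates this outer loop under simplifying assumptions: plain random search stands in for the algorithm configurator, and run_bo is any user-supplied callable that runs Bayesian optimization with a given hyperparameter configuration and returns its meta-loss.

```python
# Sketch of the meta-optimization idea: an outer optimizer searches over BO's own hyperparameters.
import random

def random_search_meta_optimizer(search_space, run_bo, n_iterations=100, seed=0):
    """search_space: dict mapping hyperparameter name -> list of choices.
    run_bo: callable(config) -> meta-loss of one full inner BO run (expensive!)."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_iterations):
        config = {name: rng.choice(choices) for name, choices in search_space.items()}
        loss = run_bo(config)                    # one full inner BO run
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss
```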
Research Questions
1. How large is the impact of tuning Bayesian optimization’s own hyperparameters?
2. How well does this transfer to similar target functions?
3. How well does this transfer to different target functions?
4. Which hyperparameters are actually important?
What do we need to tune BO’s hyperparameters?
1. Search space
2. Target functions
3. Meta-loss function to be optimized
4. Optimizer
Ingredients
1. Search space: three model families (GP-ML, GP-MAP, RF), each with
   ● model hyperparameters
   ● initial design
   ● acquisition function
   ● transformation of the target function
   (see the sketch below)
2. Target functions
3. Meta-loss function to be optimized
4. Optimizer
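As an illustration of how such a joint space can be encoded, the sketch below uses plain Python; the concrete hyperparameter names and value sets are assumptions for the example, not the exact space used in the paper. The model hyperparameters are conditional on the chosen model family.

```python
# Illustrative encoding of a BO-hyperparameter search space (values are assumptions).
import random

SEARCH_SPACE = {
    "model": ["GP-ML", "GP-MAP", "RF"],
    "initial_design": ["random", "sobol", "latin-hypercube"],
    "acquisition_function": ["EI", "PI", "LCB"],
    "transformation": ["none", "log"],
}

# Model hyperparameters are conditional on the chosen model family.
MODEL_HYPERPARAMETERS = {
    "GP-ML":  {"kernel": ["Matern52", "RBF"]},
    "GP-MAP": {"kernel": ["Matern52", "RBF"], "prior": ["lognormal", "horseshoe"]},
    "RF":     {"num_trees": [10, 30, 100], "min_samples_in_leaf": [1, 3, 5]},
}

def sample_configuration(rng=random):
    config = {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}
    for name, choices in MODEL_HYPERPARAMETERS[config["model"]].items():
        config[name] = rng.choice(choices)
    return config

print(sample_configuration())
```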
Ingredients
1. Search space
2. Target functions:
   ● SVMs: 10 datasets, 3 continuous and 1 categorical hyperparameter
   ● NNs: 6 datasets, 6 continuous hyperparameters
   ● Artificial functions: 10 functions, 2-6 continuous hyperparameters
   → Meta-optimization is quite expensive
   → Use artificial functions and surrogate benchmark problems
3. Meta-loss function to be optimized
4. Optimizer
Ingredients
1. Search space
2. Target functions
3. Meta-loss function to be optimized:
   ● Measure good anytime performance
   ● Compare across multiple functions
   ● Hit the optimum accurately
   (a sketch of such a meta-loss follows below)
4. Optimizer
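A minimal sketch of such a meta-loss, assuming it is the average log10 regret of the incumbent over all iterations (the exact formula used in the paper may differ); averaging this value over several target functions rewards good anytime performance across a whole family.

```python
# Average log-regret of one BO run (lower is better); an assumption about the exact formula.
import numpy as np

def average_log_regret(trajectory, f_opt, eps=1e-10):
    """trajectory: observed function values, one per iteration.
    f_opt: known optimum of the target function."""
    incumbent = np.minimum.accumulate(np.asarray(trajectory, dtype=float))
    regret = np.maximum(incumbent - f_opt, eps)     # clip to avoid log(0)
    return float(np.mean(np.log10(regret)))

# Example: configurations that find good values late are penalized.
print(average_log_regret([3.0, 1.5, 0.4, 0.11, 0.101], f_opt=0.1))
```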
Ingredients
1. Search space
2. Target functions
3. Meta-loss function to be optimized
4. Optimizer → algorithm configuration
How Large is the Impact of Tuning?
Figure: average log-regret (lower is better).
LOFO: running the meta-optimizer on all but one function from a family, then rerunning the best found configuration on the left-out function.
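A sketch of the LOFO protocol under two assumed interfaces: tune(train_functions) is any meta-optimizer that returns a BO configuration, and evaluate(config, function) runs BO with that configuration on one function and returns its meta-loss.

```python
# Leave-one-function-out (LOFO) evaluation of tuned BO hyperparameters (protocol sketch).
from statistics import mean

def leave_one_function_out(family, tune, evaluate):
    """family: dict mapping function name -> target function of one benchmark family.
    tune: callable(train_functions) -> BO configuration (hypothetical meta-optimizer).
    evaluate: callable(config, function) -> meta-loss of BO with `config` on `function`."""
    losses = {}
    for held_out, function in family.items():
        train_functions = [f for name, f in family.items() if name != held_out]
        config = tune(train_functions)                   # tune BO on all but one function
        losses[held_out] = evaluate(config, function)    # test on the left-out function
    return losses, mean(losses.values())
```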
Important Hyperparameters
Ablation analysis [1] showed:
→ Only a small set of hyperparameters is important
→ Which hyperparameters are important depends on the model
Figure: most important hyperparameters according to ablation for Bayesian optimization with random forests on the artificial function family.
[1] C. Fawcett and H. H. Hoos. Analysing differences between algorithm configurations through ablation. J. Heuristics 2016
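For intuition, the sketch below implements a simplified greedy ablation path in the spirit of Fawcett & Hoos (not their actual tool): starting from the default configuration, it repeatedly flips the single hyperparameter (to its tuned value) whose flip improves the meta-loss the most; hyperparameters flipped early with large gains are the important ones.

```python
# Simplified greedy ablation path from a default to a tuned configuration (sketch).
def ablation_path(default, tuned, loss):
    """default, tuned: dicts of hyperparameter values; loss: callable(config) -> float."""
    current = dict(default)
    path = []
    remaining = [name for name in tuned if tuned[name] != default[name]]
    while remaining:
        candidates = []
        for name in remaining:
            candidate = dict(current, **{name: tuned[name]})   # flip one hyperparameter
            candidates.append((loss(candidate), name, candidate))
        best_loss, best_name, best_config = min(candidates)    # keep the best single flip
        path.append((best_name, best_loss))                    # order of flips ~ importance
        current = best_config
        remaining.remove(best_name)
    return path

# Toy usage: the hyperparameter whose early flip reduces the loss most is the important one.
toy_loss = lambda c: {"EI": 0.0, "PI": 0.5}[c["acq"]] + {"log": 0.0, "none": 0.1}[c["transform"]]
print(ablation_path({"acq": "PI", "transform": "none"}, {"acq": "EI", "transform": "log"}, toy_loss))
```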
Wrap-Up
→ Hyperparameter optimization for Bayesian optimization is important
Open questions and future work:
- How to handle this in practice?
- How to measure the similarity of target functions?