Surrogate Benchmarks for Hyperparameter Optimization
Katharina Eggensperger and Frank Hutter, University of Freiburg (Albert-Ludwigs-Universität Freiburg), {eggenspk,fh}@cs.uni-freiburg.de
Holger Hoos and Kevin Leyton-Brown, University of British Columbia, {hoos,kevinlb}@cs.ubc.ca
Problem: Evaluating methods for hyperparameter optimization is expensive!
Outline
• Benchmarking Hyperparameter Optimization Methods
• Constructing Surrogates
• Using Surrogate Benchmarks
Bayesian Optimization Methods
• The optimizer maintains an internal model M over the configuration space Λ and uses it to propose a configuration λi.
• The algorithm is run with configuration λi, and the observed performance f(λi) is returned to the optimizer.
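To make this loop concrete, here is a minimal, self-contained sketch in Python; it is an illustration, not the authors' implementation. The internal model M is a scikit-learn Gaussian process, candidates are drawn at random, and a crude "optimistic" acquisition picks the next configuration; the names benchmark and sample_configuration are assumed interfaces. The benchmark call is the expensive step that surrogates later replace.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def bayesian_optimization(benchmark, sample_configuration, n_iterations=30, n_candidates=500):
        # benchmark(lam) -> observed performance f(lam); sample_configuration() -> random config vector
        X = [sample_configuration()]
        y = [benchmark(X[-1])]                                     # one initial (expensive) evaluation
        for _ in range(n_iterations - 1):
            model = GaussianProcessRegressor(normalize_y=True)     # internal model M
            model.fit(np.array(X), np.array(y))
            candidates = np.array([sample_configuration() for _ in range(n_candidates)])
            mean, std = model.predict(candidates, return_std=True)
            lam = candidates[np.argmin(mean - std)]                # crude optimistic acquisition (lower bound)
            X.append(lam)
            y.append(benchmark(lam))                               # expensive: train and validate the algorithm
        best = int(np.argmin(y))
        return X[best], y[best]

Real optimizers such as those compared in this work use more sophisticated models and acquisition functions, but the interaction pattern with the benchmark is the same.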
What do we need for an empirical comparison?
• Standard benchmark problems
• Easy-to-use software
Then: run each optimizer on each benchmark X multiple times.
But evaluating X is expensive.
Benchmarking hyperparameter optimization methods
Example: a neural network with configuration space Λ, containing categorical and conditional hyperparameters.
Benchmarking hyperparameter optimization methods
[Figure: neural network benchmark]
Next: Constructing Surrogates
Surrogate Benchmark X′
• Cheap to evaluate
• Can be used like the real benchmark X
• Behaves like X
Real benchmark: configuration λ → benchmark X → performance f(λ)
Surrogate benchmark: configuration λ → regression model X′ → performance f(λ)
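A surrogate benchmark can then expose exactly the same call interface as the real benchmark. The sketch below is an illustrative assumption (the file name, feature encoding, and scikit-learn/joblib usage are not HPOlib specifics): it loads a stored regression model X′ and answers each configuration query with a predicted performance in milliseconds.

    import joblib
    import numpy as np

    class SurrogateBenchmark:
        """Cheap drop-in replacement for the real benchmark X."""

        def __init__(self, model_path="surrogate_nn.pkl"):
            self.model = joblib.load(model_path)          # regression model trained on <lambda, f(lambda)> data

        def __call__(self, config_vector):
            # config_vector: numerical encoding of a configuration lambda from the space Lambda
            x = np.asarray(config_vector, dtype=float).reshape(1, -1)
            return float(self.model.predict(x)[0])        # predicted performance, ~milliseconds per call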
Constructing a Surrogate for Benchmark X
1. Collect data
2. Choose a regression model
3. Train and store the model
1. Collect data for benchmark X
Training data: ⟨λ1, f(λ1)⟩, …, ⟨λn, f(λn)⟩
• Dense sampling in high-performance regions: run hyperparameter optimizers on benchmark X
• Good overall coverage: run random search on benchmark X
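One possible way to assemble this training set, sketched below with illustrative interfaces rather than the HPOlib API: the trajectories of the hyperparameter optimizers contribute the densely sampled high-performance regions, and additional random-search evaluations provide coverage of the rest of Λ.

    def collect_training_data(benchmark, optimizer_runs, sample_configuration, n_random=1000):
        """Gather <lambda, f(lambda)> pairs for training the surrogate model."""
        data = []
        for run in optimizer_runs:            # configurations already evaluated by each optimizer on X
            data.extend(run)                  # each run is a list of (lambda, f(lambda)) pairs
        for _ in range(n_random):             # random search: good overall coverage of Lambda
            lam = sample_configuration()      # assumed: draws a random configuration from Lambda
            data.append((lam, benchmark(lam)))
        return data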
2. Choice of Regression Models
Candidates: ridge regression, linear regression, k-nearest neighbours, gradient boosting, random forests, Gaussian processes, Bayesian neural networks, SVMs
2. Choice of Regression Models
Can we quantify the performance of a new optimizer, i.e. does the surrogate generalize to configurations proposed by an optimizer whose data it has not seen?
Leave-one-optimizer-out setting:
• Train the model on data gathered by all but one optimizer
• Test on the data of the remaining optimizer
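A sketch of this evaluation protocol (the random forest and the metrics below are illustrative choices; the study also evaluated the other model families): for each optimizer, the surrogate is trained on the data of all other optimizers and judged on how well it predicts the held-out optimizer's observations.

    import numpy as np
    from scipy.stats import spearmanr
    from sklearn.ensemble import RandomForestRegressor

    def leave_one_optimizer_out(data_by_optimizer):
        # data_by_optimizer: dict mapping optimizer name -> (X, y), where X holds encoded
        # configurations and y the observed performances gathered by that optimizer
        results = {}
        for held_out, (X_test, y_test) in data_by_optimizer.items():
            X_train = np.vstack([X for name, (X, _) in data_by_optimizer.items() if name != held_out])
            y_train = np.concatenate([y for name, (_, y) in data_by_optimizer.items() if name != held_out])
            model = RandomForestRegressor(n_estimators=100).fit(X_train, y_train)
            y_pred = model.predict(X_test)
            rmse = float(np.sqrt(np.mean((y_pred - y_test) ** 2)))
            rho, _ = spearmanr(y_test, y_pred)    # rank correlation: does the model order configurations correctly?
            results[held_out] = {"rmse": rmse, "spearman": rho}
        return results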
2. Choice of Regression Models
Leave-one-optimizer-out results on the neural network benchmark:
[Figure: true vs. predicted performance for Random Forest, Gaussian Process, k-nearest-neighbour, Gradient Boosting, and nuSVR]
2. Choice of Regression Models
Based on these results, random forests and Gaussian processes are chosen as surrogate models.
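Step 3 then reduces to fitting the chosen model on all collected data and persisting it. A minimal sketch with scikit-learn and joblib follows; the tooling and function name are assumptions for illustration, not the authors' exact code.

    import joblib
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.gaussian_process import GaussianProcessRegressor

    def train_and_store(X, y, model_path="surrogate_nn.pkl", kind="rf"):
        # X: encoded configurations; y: observed performances f(lambda)
        X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
        model = RandomForestRegressor(n_estimators=100) if kind == "rf" else GaussianProcessRegressor(normalize_y=True)
        model.fit(X, y)
        joblib.dump(model, model_path)        # the stored model is what the surrogate benchmark loads
        return model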
Next: Using Surrogate Benchmarks
Using Surrogate Benchmarks: Neural Network
[Figure: optimization trajectories on the real benchmark compared with the GP-based and RF-based surrogate benchmarks]
• One optimization run: about 40 h on the real benchmark, versus between under 200 s and under 1.5 h on the surrogate benchmarks
• The whole comparison: about 50 days
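Reusing the helpers sketched earlier (all names are illustrative), swapping the real benchmark for the surrogate changes nothing in the optimizer's code; only the cost per evaluation drops.

    surrogate = SurrogateBenchmark("surrogate_nn.pkl")        # milliseconds per evaluation
    best_cfg, best_val = bayesian_optimization(surrogate,     # identical call as on the real benchmark
                                               sample_configuration,
                                               n_iterations=200)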
Applications
• Extensive testing at early development stages
• Fast comparison of different hyperparameter optimization methods
• Meta-optimization of existing hyperparameter optimization methods
Conclusion
Can we construct cheap-to-evaluate and realistic hyperparameter optimization benchmarks?
Yes, based on random forest and Gaussian process regression models.
But some work remains to be done for high-dimensional benchmarks.
Thank you for your attention.
This presentation was supported by an ECCAI travel award and the ECCAI sponsors.
More information on hyperparameter optimization benchmarks can be found at automl.org/hpolib
[Backup slides: Regression models; Benchmarks]