Hyperparameter Optimization using Hyperopt Yassine Alouini - Paul - PowerPoint PPT Presentation

Hyperparameter Optimization using Hyperopt
Yassine Alouini - Paul Coursaux
03/11/2016
@qucit @YassineAlouini

About us

Yassine
● Data Scientist @ Qucit
● Centrale Paris & Cambridge
● Quora’s Top Writer 2016

Paul
● Data Scientist @ Qucit
● Centrale Paris
● Market finance in London
● Horse riding

Outline
1. Hyperparameters in Machine Learning
2. How to Choose Hyperparameters?
3. Tree-structured Parzen Estimation Approach
4. Live-coding Example

1. Hyperparameters in Machine Learning

What are hyperparameters?

Parameters:
$\text{Rent} = a_1 \times \text{surface} + a_2 \times \text{distance to city center} + \dots$

Hyperparameters:
$\text{RMSE}_{\text{LASSO}} = \text{RMSE} + \alpha \times (|a_1| + \dots)$
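To make the distinction concrete, here is a minimal sketch of the penalized loss from this slide. The coefficients are the parameters (learned from data), while α is the hyperparameter (fixed before training). The toy data, the coefficient values, and the function names are illustrative assumptions, not taken from the talk.

```python
import math

def predict_rent(coeffs, features):
    """Linear model: rent = a1 * surface + a2 * distance to city center + ..."""
    return sum(a * x for a, x in zip(coeffs, features))

def rmse(coeffs, X, y):
    """Root mean squared error of the linear model on (X, y)."""
    squared = [(predict_rent(coeffs, row) - target) ** 2 for row, target in zip(X, y)]
    return math.sqrt(sum(squared) / len(squared))

def lasso_objective(coeffs, X, y, alpha):
    """RMSE plus an L1 penalty on the coefficients, weighted by alpha."""
    return rmse(coeffs, X, y) + alpha * sum(abs(a) for a in coeffs)

# Hypothetical toy data: (surface in m^2, distance to city center in km) -> rent
X = [(30.0, 5.0), (50.0, 2.0), (80.0, 10.0)]
y = [600.0, 1100.0, 1500.0]
coeffs = (20.0, -10.0)  # parameters a1, a2 (normally fitted, fixed here for illustration)

for alpha in (0.0, 0.1, 1.0):  # hyperparameter values chosen by the practitioner
    print(alpha, lasso_objective(coeffs, X, y, alpha))
```

With the coefficients held fixed, increasing α only increases the penalty term; during training, a larger α pushes the fitted coefficients toward zero.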

The impact of hyperparameters

2. How to Choose Hyperparameters?

Cross-validation

Cross-validation makes it possible to choose the hyperparameter(s) with the best generalization capabilities while making efficient use of the data.

Figure credit: http://vinhkhuc.github.io/2015/03/01/how-many-folds-for-cross-validation.html
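The idea can be sketched in a few lines of plain Python. The example below is an illustration under stated assumptions: it uses a one-feature lasso without intercept, with the objective (1 / 2n) Σ (y − a·x)² + α·|a| (a common squared-error formulation, slightly different from the RMSE form shown earlier), which has a closed-form solution via soft thresholding. The synthetic data and the candidate α values are made up.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clamping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_1d(xs, ys, alpha):
    """Closed-form minimizer of (1 / (2n)) * sum((y - a * x)^2) + alpha * |a|."""
    n = len(xs)
    xy = sum(x * y for x, y in zip(xs, ys)) / n
    xx = sum(x * x for x in xs) / n
    return soft_threshold(xy, alpha) / xx

def mse(a, xs, ys):
    """Mean squared error of the fitted slope a on held-out data."""
    return sum((y - a * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_val_score(xs, ys, alpha, n_folds=5):
    """Average held-out MSE over n_folds interleaved folds."""
    indices = list(range(len(xs)))
    scores = []
    for fold in range(n_folds):
        test_idx = indices[fold::n_folds]
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        a = fit_lasso_1d([xs[i] for i in train_idx], [ys[i] for i in train_idx], alpha)
        scores.append(mse(a, [xs[i] for i in test_idx], [ys[i] for i in test_idx]))
    return sum(scores) / n_folds

# Synthetic data (assumption): y = 2x + small Gaussian noise
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(100)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]

alphas = [0.001, 0.01, 0.1, 1.0]
cv_scores = {alpha: cross_val_score(xs, ys, alpha) for alpha in alphas}
best_alpha = min(cv_scores, key=cv_scores.get)
print(cv_scores, best_alpha)
```

Every data point is used for both training and validation across the folds, which is what makes the procedure data-efficient.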

How to choose the points to cross-validate?
● Grid search
● Random search

Credits:
https://medium.com/rants-on-machine-learning/smarter-parameter-sweeps-or-why-grid-search-is-plain-stupid-c17d97a0e881#.db7060phq
https://districtdatalabs.silvrback.com/visual-diagnostics-for-more-informed-machine-learning-part-3
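The difference between the two strategies shows up when counting how many distinct values each one tries per hyperparameter. The sketch below assumes two hyperparameters scaled to [0, 1] and a budget of 9 evaluations; the function names are illustrative.

```python
import itertools
import random

def grid_points(n_per_axis, low=0.0, high=1.0):
    """Cartesian product of n_per_axis evenly spaced values on each of two axes."""
    step = (high - low) / (n_per_axis - 1)
    axis = [low + i * step for i in range(n_per_axis)]
    return list(itertools.product(axis, axis))

def random_points(n_points, low=0.0, high=1.0, seed=0):
    """n_points pairs drawn independently and uniformly from the square."""
    rng = random.Random(seed)
    return [(rng.uniform(low, high), rng.uniform(low, high)) for _ in range(n_points)]

grid = grid_points(3)    # 3 x 3 = 9 evaluations
rand = random_points(9)  # also 9 evaluations

# Number of distinct values explored along the first hyperparameter
print(len({p[0] for p in grid}), len({p[0] for p in rand}))
```

For the same budget, the grid tests only three distinct values of each hyperparameter, whereas random search tests a new value on every draw. This matters when only one of the hyperparameters strongly affects the score.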

3. Tree-structured Parzen Estimation Approach

Sequential Model-based Global Optimization

The Expected Improvement

$\mathrm{EI}_{\varepsilon^*}(\alpha) = \int \max(\varepsilon^* - \varepsilon, 0)\, p_M(\varepsilon \mid \alpha)\, d\varepsilon$
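As an illustration, suppose the model's predictive distribution p_M(ε | α) at some α is Gaussian with mean μ and standard deviation σ. This is an assumption made for the sketch only (TPE itself does not model p_M this way). The integral then has the closed form (ε* − μ)·Φ(z) + σ·φ(z) with z = (ε* − μ) / σ, which the code checks against a direct numerical integration.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, eps_star):
    """Closed form of the EI integral when p_M is N(mu, sigma^2)."""
    z = (eps_star - mu) / sigma
    return (eps_star - mu) * normal_cdf(z) + sigma * normal_pdf(z)

def expected_improvement_numeric(mu, sigma, eps_star, n_steps=20000):
    """Midpoint-rule approximation of the same integral, used as a sanity check."""
    low = mu - 10.0 * sigma  # the density below this point is negligible
    if eps_star <= low:
        return 0.0
    width = (eps_star - low) / n_steps
    total = 0.0
    for i in range(n_steps):
        eps = low + (i + 0.5) * width
        density = normal_pdf((eps - mu) / sigma) / sigma
        total += (eps_star - eps) * density * width
    return total

# Hypothetical values: predicted loss 1.0 +/- 0.5, best loss so far 0.8
print(expected_improvement(1.0, 0.5, 0.8))
print(expected_improvement_numeric(1.0, 0.5, 0.8))
```

Only losses below the threshold ε* contribute to the integral, so a point is attractive either because its predicted loss is low or because its uncertainty is high.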

How to Optimize the EI? (1)

How to Optimize the EI? (2)
● Lasso model on the Boston Housing Dataset
● Distribution of the suggested αs
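The following is a heavily simplified, one-dimensional sketch of the TPE idea, not hyperopt's actual implementation. Past trials are split into a "good" group (the best fraction γ) and a "bad" group, a Parzen (kernel density) estimator is fitted to each, giving l(x) and g(x), and the next point is the candidate that maximizes l(x) / g(x), which is equivalent to maximizing the expected improvement in the TPE formulation. The fixed bandwidth, the search range for log10(α), and the toy loss standing in for a cross-validated Lasso score are all assumptions.

```python
import math
import random

def kde(points, bandwidth):
    """Gaussian Parzen estimator over points, returned as a density function."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm
    return density

def tpe_suggest(history, rng, low, high, gamma=0.25, n_candidates=50, bandwidth=0.5):
    """Suggest the next x from past (x, loss) pairs; needs at least 2 trials."""
    ranked = sorted(history, key=lambda pair: pair[1])
    n_good = max(1, int(math.ceil(gamma * len(ranked))))
    good = [x for x, _ in ranked[:n_good]]
    bad = [x for x, _ in ranked[n_good:]]
    l, g = kde(good, bandwidth), kde(bad, bandwidth)
    # Sample candidates from l(x): pick a good point and add Gaussian noise,
    # then clip to the search range.
    candidates = [
        min(max(rng.gauss(rng.choice(good), bandwidth), low), high)
        for _ in range(n_candidates)
    ]
    # The small constant guards against division by zero far from all bad points.
    return max(candidates, key=lambda x: l(x) / (g(x) + 1e-12))

def toy_loss(log_alpha):
    """Stand-in for a cross-validated score; minimal at alpha = 10^-2."""
    return (log_alpha + 2.0) ** 2

rng = random.Random(0)
low, high = -4.0, 1.0  # search range for log10(alpha)
history = []
for _ in range(5):     # random warm-up trials
    x = rng.uniform(low, high)
    history.append((x, toy_loss(x)))
initial_best = min(loss for _, loss in history)

for _ in range(30):    # model-based trials
    x = tpe_suggest(history, rng, low, high)
    history.append((x, toy_loss(x)))

best_log_alpha, best_loss = min(history, key=lambda pair: pair[1])
print(best_log_alpha, best_loss)
```

Plotting a histogram of the suggested values in `history` would give the kind of "distribution of the suggested αs" picture this slide refers to: suggestions concentrate where past losses were low.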

4. Live-coding Example

Description of the dataset
● IMDb dataset
● Dataset publicly available (from Kaggle)

Credits: screenshot, 24/10/2016, https://www.kaggle.com/deepmatrix/imdb-5000-movie-dataset

Movies having the best score Credits: http://www.impawards.com/1974/towering_inferno.html, http://www.impawards.com/1994/shawshank_redemption_ver1.html, http://ruthusher.com/wordpress/wp-includes/js/godfather-poster

Movies having the worst score Credits: https://en.wikipedia.org/wiki/Justin_Bieber:_Never_Say_Never, http://www.movieinsider.com/m766/foodfight, http://www.moviepostershop.com/superbabies-baby-geniuses-2-movie-poster-2004

Task
● Predict the IMDb movie score
● Gradient Boosting algorithm (XGBoost package)
● 3 hyperparameter optimization strategies
  ○ A naive grid search
  ○ An expert grid search (*)
  ○ The TPE algorithm (hyperopt package)

(*) http://blog.kaggle.com/2016/07/21/approaching-almost-any-machine-learning-problem-abhishek-thakur/

Features description
● 28 features:
  ○ 14 movie-related
  ○ 4 review-related
  ○ 10 cast-related
● 16 kept:
  ○ 11 numerical
  ○ 5 categorical
● 12 removed

Live demo

Our code is available here: https://github.com/yassineAlouini/hyperparameters-optimization-talk

Conclusion
● Outperforms the standard methods in most cases
● Search space matters
● Other Python libraries: Spearmint, BayesOpt, Scikit-Optimize
● Distributed optimization (using MongoDB)

Thanks for your attention.
Question time
Qucit is hiring!

References
● https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf
● https://conference.scipy.org/proceedings/scipy2013/pdfs/bergstra_hyperopt.pdf
● https://github.com/scikit-optimize
● http://jaberg.github.io/hyperopt/
● https://github.com/JasperSnoek/spearmint
● https://github.com/fmfn/BayesianOptimization
● http://xgboost.readthedocs.io/en/latest/
● http://www.cs.ubc.ca/~hutter/papers/13-BayesOpt_EmpiricalFoundation.pdf
