Knowing The What, But Not The Where in Bayesian Optimization
Vu Nguyen & Michael A. Osborne
University of Oxford
Black-box Optimization
The relationship from $x$ to $y$ is through the black-box: the input $x$ is passed to the black-box $f(x)$, which returns the output $y = f(x)$. We are looking for the maximizer of $f$.
[Figure: input $x$ $\to$ black-box $f(x)$ $\to$ output $y = f(x)$; a plot of output against input showing a few evaluations $f(x_1), f(x_2), f(x_3)$ and the maximizer we are looking for.]
Properties of Black-box Function
$f : x \in \mathbb{R}^d \to y \in \mathbb{R}$, $y = f(x)$, mapping input $x$ to output $y$.
The functional form is not known (e.g., no parametric model such as $y = wx + \epsilon$).
No derivative form: $\partial f / \partial x$ is not available.
Expensive to evaluate (in time and cost).
Nothing is known about the function, except a few evaluations $y = f(x)$.
Bayesian Optimization Overview
Make a series of evaluations $x_1, x_2, \dots, x_T$ of the black-box $f(x)$, refining the model after each one.
Surrogate function: the GP gives the predictive mean $\mu(x)$ and predictive variance $\sigma(x)$.
Acquisition function: $\alpha(x) = \mu(x) + \kappa \times \sigma(x)$, trading off exploitation (via $\mu$) against exploration (via $\sigma$).
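To make these pieces concrete, here is a minimal sketch of a GP surrogate with a UCB-style acquisition $\alpha(x) = \mu(x) + \kappa\,\sigma(x)$, using scikit-learn; the kernel, $\kappa$, and the toy objective are illustrative choices, not settings from the talk.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def ucb_acquisition(gp, X_candidates, kappa=2.0):
    """alpha(x) = mu(x) + kappa * sigma(x): exploit via mu, explore via sigma."""
    mu, sigma = gp.predict(X_candidates, return_std=True)
    return mu + kappa * sigma

# Toy usage: fit the surrogate on a few evaluations, then pick the next point.
X_obs = np.array([[0.1], [0.4], [0.9]])
y_obs = np.sin(3.0 * X_obs).ravel()              # stand-in for the expensive black-box f
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2)).fit(X_obs, y_obs)

X_grid = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
x_next = X_grid[np.argmax(ucb_acquisition(gp, X_grid))]
```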
Outline
Bayesian Optimization
Bayes Opt with Known Optimum Value
Knowing Optimum Value of The Black-Box
We consider situations where the optimum value $f^* = \max_x f(x)$ is known, and the goal is to find $x^* = \arg\max_x f(x)$.
Examples of Knowing Optimal Value of The Black-Box
Deep reinforcement learning: CartPole: 200; Pong: 18; Frozen Lake: 0.79 ± 0.05; InvertedPendulum: 950.
Classification: Skin dataset: accuracy 100.
Inverse optimization: given a database and a target property $f^*$, identifying a corresponding data point $x^*$.
What can $f^*$ tell us about $f$?
1. $f^*$ tells us about the upper bound: $f^* \ge f(x), \forall x$.
2. $f^*$ tells us that the function reaches $f^*$ at some point.
Transformed Gaussian process
$f(x) = f^* - \frac{1}{2} g^2(x)$, with $g(x) \sim \mathcal{GP}(\sqrt{2 f^*}, k)$.
Because $\frac{1}{2} g^2(x) \ge 0$, this condition ensures that $f^* \ge f(x), \forall x$ (property 1).
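As a minimal sketch (my own illustration, not the authors' code), the transformation can be applied directly to the observed values: each $y_i = f(x_i)$ maps to $g_i = \sqrt{2\,(f^* - y_i)} \ge 0$, and a GP is fitted on $g$ instead of $f$. For simplicity this uses scikit-learn's default zero-mean prior rather than the $\sqrt{2 f^*}$ prior mean shown above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def fit_transformed_gp(X_obs, y_obs, f_star):
    """Fit a GP on g, where f(x) = f_star - 0.5 * g(x)**2."""
    g_obs = np.sqrt(2.0 * (f_star - y_obs))      # well-defined because f_star >= y_i
    return GaussianProcessRegressor(kernel=RBF(length_scale=0.2)).fit(X_obs, g_obs)
```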
We want to control the surrogate using $f^*$
Push down (property 1): the surrogate must not go above $f^*$.
[Figure: the standard GP $\mu(x)$ goes above $f^*$; the transformed GP stays below $f^*$.]
Transformed Gaussian process
$f(x) = f^* - \frac{1}{2} g^2(x)$, with $\frac{1}{2} g^2(x) \ge 0$ and $g(x) \sim \mathcal{GP}(0, k)$: a zero-mean prior!
This condition encourages that there is a point where $g(x) = 0$ and thus $f^* = f(x)$ (property 2).
We want to control the surrogate using $f^*$
Lift up (property 2): the surrogate should reach $f^*$.
[Figure: the standard GP $\mu(x)$ does not reach $f^*$; the transformed GP reaches $f^*$.]
Transformed Gaussian process
Linearization using Taylor expansion:
$f(x) \approx f^* - \frac{1}{2}\mu_g^2(x) - \mu_g(x)\left[g(x) - \mu_g(x)\right] = f^* + \frac{1}{2}\mu_g^2(x) - \mu_g(x)\,g(x)$
A linear transformation of a GP remains Gaussian, so the predictive distribution is $f(x) \sim \mathcal{N}\!\left(\mu_f(x), \sigma_f^2(x)\right)$ with
$\mu_f(x) = f^* - \frac{1}{2}\mu_g^2(x)$ and $\sigma_f^2(x) = \mu_g^2(x)\,\sigma_g^2(x)$.
The Taylor expansion is very accurate at the mode, which is $\mu_g(x)$.
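A small sketch of this linearised predictive distribution, assuming a GP already fitted on $g$ (e.g. via the earlier snippet): $\mu_f(x) = f^* - \tfrac{1}{2}\mu_g^2(x)$ and $\sigma_f^2(x) = \mu_g^2(x)\,\sigma_g^2(x)$.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def predict_f(gp_g, X, f_star):
    """Approximate posterior of f from the GP posterior of g via the Taylor linearisation."""
    mu_g, sigma_g = gp_g.predict(X, return_std=True)
    mu_f = f_star - 0.5 * mu_g ** 2              # mu_f(x) = f* - 0.5 * mu_g(x)^2
    sigma_f = np.abs(mu_g) * sigma_g             # sigma_f(x) = |mu_g(x)| * sigma_g(x)
    return mu_f, sigma_f

# Toy usage with transformed targets g_i = sqrt(2 * (f_star - y_i)).
f_star = 1.0
X_obs = np.array([[0.1], [0.4], [0.9]])
y_obs = np.sin(3.0 * X_obs).ravel()
gp_g = GaussianProcessRegressor(kernel=RBF(0.2)).fit(X_obs, np.sqrt(2.0 * (f_star - y_obs)))
mu_f, sigma_f = predict_f(gp_g, np.linspace(0.0, 1.0, 50).reshape(-1, 1), f_star)
```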
Outline
Bayesian Optimization
Bayes Opt with Known Optimum Value $f^*$
  Problem definition
  Exploiting $f^*$
    Building a better surrogate model
    Making informed decisions
Confidence Bound Minimization
Under the GP surrogate model, we have with high probability the lower and upper bound
$\mu_t(x) - \sqrt{\beta_t}\,\sigma_t(x) \le f(x) \le \mu_t(x) + \sqrt{\beta_t}\,\sigma_t(x), \forall x$,
where $\beta_t$ is defined following [Srinivas et al., 2010]; both bounds can be estimated for every $x$.
At the optimum this means
$\mu_t(x^*) - \sqrt{\beta_t}\,\sigma_t(x^*) \le f(x^*) = f^* \le \mu_t(x^*) + \sqrt{\beta_t}\,\sigma_t(x^*)$:
the location $x^*$ is unknown, but the value $f^*$ is known.
Confidence Bound Minimization
The best candidate for $x^*$ is where the bound is tight:
$x_t = \arg\min_x \left|\mu_t(x) - f^*\right| + \sqrt{\beta_t}\,\sigma_t(x)$.
The inequality becomes an equality at the true $x^*$ location, where
$\mu_t(x^*) - \sqrt{\beta_t}\,\sigma_t(x^*) = f^* = \mu_t(x^*) + \sqrt{\beta_t}\,\sigma_t(x^*)$,
i.e. when $\mu_t(x^*) = f^*$ and $\sigma_t(x^*) = 0$.
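A minimal sketch of this selection rule, taking the GP predictive mean and standard deviation as inputs; the value of $\beta_t$ below is only a placeholder (the slide points to Srinivas et al. 2010 for its schedule).

```python
import numpy as np

def cbm_acquisition(mu, sigma, f_star, beta_t=4.0):
    """Confidence-bound-style score: small where the interval is tight around f_star."""
    return np.abs(mu - f_star) + np.sqrt(beta_t) * sigma

# The next evaluation is where the score is smallest, e.g.:
# x_next = X_grid[np.argmin(cbm_acquisition(mu, sigma, f_star))]
```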
Expected Regret Minimization
Regret: $r(x) = f^* - f(x)$, where $f^* = \max_x f(x) \ge f(x), \forall x$.
Finding the optimum location $x^*$ is equivalent to minimizing the regret.
We can select the next point by minimizing the expected regret.
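For reference, if the surrogate gives a Gaussian posterior $f(x) \sim \mathcal{N}(\mu(x), \sigma^2(x))$, the expectation of the non-negative part of the regret has a standard closed form (a textbook Gaussian identity, stated here as background; the acquisition on the next slide has exactly this form):

$$
\mathbb{E}\big[\max\big(f^* - f(x),\, 0\big)\big]
  = \big(f^* - \mu(x)\big)\,\Phi(z) + \sigma(x)\,\phi(z),
  \qquad z = \frac{f^* - \mu(x)}{\sigma(x)},
$$

where $\phi$ and $\Phi$ denote the standard normal PDF and CDF.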
Expected Regret Minimization
Using an analytical derivation, we obtain the closed-form computation for ERM:
$\alpha^{\mathrm{ERM}}(x) = \sigma(x)\,\phi(z) + \left(f^* - \mu(x)\right)\Phi(z)$, where $z = \dfrac{f^* - \mu(x)}{\sigma(x)}$,
$\phi$ is the Gaussian PDF, $\Phi$ is the Gaussian CDF, and $\mu(x), \sigma(x)$ are the GP predictive mean and uncertainty.
See the paper for details!
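A minimal sketch of this closed form (my reading of the slide, not the authors' released code); the next point is chosen by minimising this score over candidates.

```python
import numpy as np
from scipy.stats import norm

def erm_acquisition(mu, sigma, f_star):
    """alpha_ERM(x) = sigma(x)*phi(z) + (f_star - mu(x))*Phi(z), z = (f_star - mu(x)) / sigma(x)."""
    sigma = np.maximum(sigma, 1e-12)             # guard against zero predictive uncertainty
    z = (f_star - mu) / sigma
    return sigma * norm.pdf(z) + (f_star - mu) * norm.cdf(z)

# x_next = X_grid[np.argmin(erm_acquisition(mu_f, sigma_f, f_star))]
```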
Illustration
[Figure: existing baselines tend to explore elsewhere; the proposed method correctly identifies the true (unknown) optimum location.]
The GP transformation is helpful in high dimensions.
XGBoost Classification and DRL
[Figures: Skin dataset (UCI), $f^* = 100$; CartPole (DRL), $f^* = 200$.]
Mis-specified $f^*$ will degrade the performance.
Under-specified $f^*$ (smaller than the true $f^*$): more serious, as the algorithm will get stuck.
Over-specified $f^*$ (greater than the true $f^*$): less serious, but still poor performance.
Take Home Messages
Bayes opt is efficient for optimizing black-box functions.
When the optimum value is known, we can exploit this knowledge for better optimization.
Question and Answer
vu@robots.ox.ac.uk
@nguyentienvu
https://ntienvu.github.io