Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior




  1. Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior. Zi Wang*, Beomjoon Kim*, Leslie Pack Kaelbling. Dec 5 @ NeurIPS 18, Poster #22

  2. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
Challenges:
• f is expensive to evaluate
• f is multi-peak
• no gradient information
• evaluations can be noisy

  3. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
[Plot: a multi-peak function f(x) over x ∈ [-2, 2]]
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
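The loop on this slide can be sketched in a few lines of NumPy. Everything below is an illustrative assumption rather than the talk's setup: the RBF kernel, its lengthscale, the toy objective, and the confidence multiplier 2.0 are all made up for demonstration.

```python
import numpy as np

def rbf(a, b, ell=0.3):
    """Squared-exponential kernel (an illustrative choice of k)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at test points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    sol = np.linalg.solve(K, Ks)
    mu = sol.T @ y
    var = np.diag(rbf(Xs, Xs) - Ks.T @ sol)
    return mu, np.maximum(var, 1e-12)

# Toy objective: multi-peak, queried without gradients, as on the slide.
f = lambda x: np.sin(3 * x) + 0.5 * np.cos(7 * x)
grid = np.linspace(-2, 2, 200)

X = np.array([-1.5, 0.0, 1.5])           # initial queries
y = f(X)
for t in range(10):                      # the BO loop
    mu, var = gp_posterior(X, y, grid)
    ucb = mu + 2.0 * np.sqrt(var)        # upper-confidence-bound acquisition
    x_next = grid[np.argmax(ucb)]        # choose a new query point
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))          # (noise-free evaluations for simplicity)

best = grid[np.argmax(gp_posterior(X, y, grid)[0])]
```

Each iteration conditions the GP on all queries so far and evaluates the point where the optimistic estimate (mean plus a confidence width) is largest, trading off exploitation against exploration.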

  4. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
[Plot: a multi-peak function f(x) over x ∈ [-2, 2]]
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
How to choose the prior?

  5. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
[Plot: a multi-peak function f(x) over x ∈ [-2, 2]]
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
• re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations
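The "re-estimate the prior parameters" step can be illustrated with a grid search over an RBF lengthscale that maximizes the marginal data likelihood. The data-generating lengthscale of 0.5, the noise level, and the candidate grid below are all made-up assumptions, not values from the talk.

```python
import numpy as np

def neg_log_marginal_likelihood(X, y, ell, noise=1e-3):
    """-log p(y | X, ell) for a zero-mean GP with an RBF kernel."""
    d = X[:, None] - X[None, :]
    K = np.exp(-0.5 * (d / ell) ** 2) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(X) * np.log(2 * np.pi)

# Synthetic data drawn from a GP whose true lengthscale is 0.5.
rng = np.random.default_rng(0)
X = np.linspace(0, 5, 40)
d = X[:, None] - X[None, :]
y = rng.multivariate_normal(np.zeros(40), np.exp(-0.5 * (d / 0.5) ** 2) + 1e-6 * np.eye(40))

# Re-estimate the prior parameter: pick the lengthscale that maximizes the
# marginal data likelihood (a grid search, for simplicity).
candidates = np.linspace(0.1, 2.0, 50)
nlls = [neg_log_marginal_likelihood(X, y, ell) for ell in candidates]
ell_hat = candidates[int(np.argmin(nlls))]
```

In practice this step is done with gradient-based optimizers over all kernel hyperparameters; the grid search keeps the sketch short. Note the circularity the next slides point out: the hyperparameters are fit to data that were themselves collected using the current hyperparameters.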

  6. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
Which comes first: data or prior?
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
• re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations

  7. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
Which comes first: data or prior?
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
• re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations
Hard to analyze.

  8. Bayesian optimization with an unknown GP prior
[Plot: data collected on f vs. the prior model]

  9. Bayesian optimization with an unknown GP prior
[Plot: data collected on f vs. the prior model]
Our problem setup: use past experience with similar functions as the meta-training data to break the circular dependencies.

  10. Meta Bayesian optimization with an unknown GP prior
[Diagram: offline phase | online phase]

  11. Meta Bayesian optimization with an unknown GP prior
Offline phase: estimate the GP prior from offline data sampled from the same prior.
[Plot: estimated prior μ̂(x) ± 3 √k̂(x)]
Estimated prior: (μ̂, k̂)
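One simple way to estimate a discrete GP prior from offline data, in the spirit of this slide, is to take the sample mean and sample covariance across the N meta-training functions observed on a shared grid; both are unbiased estimators of the prior mean and kernel. The synthetic ground-truth prior below (sinusoidal mean, RBF kernel) is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
m, N = 30, 200
x = np.linspace(0, 1, m)
d = x[:, None] - x[None, :]
mu_true = np.sin(2 * np.pi * x)                            # unknown prior mean
k_true = np.exp(-0.5 * (d / 0.2) ** 2) + 1e-9 * np.eye(m)  # unknown prior kernel

# Offline phase: N functions from the same prior, observed on a shared grid.
Y = rng.multivariate_normal(mu_true, k_true, size=N)       # meta-training data, N x m

# Unbiased estimators of the prior from the offline data:
mu_hat = Y.mean(axis=0)                                    # sample mean
k_hat = (Y - mu_hat).T @ (Y - mu_hat) / (N - 1)            # sample covariance
```

The estimate needs no hand-picked kernel family at all: the prior is read directly off the meta-training data, which is what removes the circular dependency between data and prior.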

  12. Meta Bayesian optimization with an unknown GP prior
Offline phase: estimate the GP prior from offline data sampled from the same prior.
Online phase: construct unbiased estimators of the posterior and use a variant of GP-UCB.
[Plots: estimated prior μ̂(x) ± 3 √k̂(x); posterior μ̂_0(x) ± ζ_1 √k̂_0(x)]
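A sketch of the online phase, assuming prior estimates (μ̂, k̂) are already given on a discrete grid: condition them on the observations so far and query the maximizer of the upper confidence bound μ̂_t(x) + ζ_{t+1} √k̂_t(x). The stand-in prior, the test function, and the fixed ζ = 3 below are illustrative assumptions (in the paper, ζ_t is set by the theory).

```python
import numpy as np

def posterior_on_grid(mu_hat, k_hat, idx, y_obs, noise=1e-4):
    """Condition an estimated discrete GP prior (mu_hat, k_hat) on
    observations y_obs made at grid indices idx."""
    K = k_hat[np.ix_(idx, idx)] + noise * np.eye(len(idx))
    Ks = k_hat[idx, :]                         # cross-covariance, |idx| x m
    sol = np.linalg.solve(K, Ks)
    mu_t = mu_hat + sol.T @ (y_obs - mu_hat[idx])
    k_t = np.diag(k_hat - Ks.T @ sol)
    return mu_t, np.maximum(k_t, 0.0)

# Stand-in estimated prior on a 50-point grid (mu_hat, k_hat are assumptions).
x = np.linspace(0, 1, 50)
d = x[:, None] - x[None, :]
mu_hat = np.zeros(50)
k_hat = np.exp(-0.5 * (d / 0.15) ** 2)

f = np.sin(6 * x)             # illustrative test function to optimize
idx, y_obs = [0], [f[0]]      # start from a single observation
for t in range(8):
    mu_t, k_t = posterior_on_grid(mu_hat, k_hat, np.array(idx), np.array(y_obs))
    zeta = 3.0                # confidence width, zeta_{t+1} on the slides
    i_next = int(np.argmax(mu_t + zeta * np.sqrt(k_t)))
    idx.append(i_next)
    y_obs.append(f[i_next])
```

The slides that follow animate exactly this loop: with each new evaluation the confidence band μ̂_t(x) ± ζ_{t+1} √k̂_t(x) tightens around the observations, and the acquisition moves toward the maximum.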

  13. Meta Bayesian optimization with an unknown GP prior (animation: next online iteration)
[Plot: posterior μ̂_1(x) ± ζ_2 √k̂_1(x)]

  14. Meta Bayesian optimization with an unknown GP prior (animation: next online iteration)
[Plot: posterior μ̂_2(x) ± ζ_3 √k̂_2(x)]

  15. Meta Bayesian optimization with an unknown GP prior (animation: next online iteration)
[Plot: posterior μ̂_3(x) ± ζ_4 √k̂_3(x)]

  16. Meta Bayesian optimization with an unknown GP prior (animation: next online iteration)
[Plot: posterior μ̂_4(x) ± ζ_5 √k̂_4(x)]

  17. Effect of N, the number of meta-training functions
[Plots: posterior μ̂_t(x) ± ζ_{t+1} √k̂_t(x) on x ∈ [0, 1], for N = 1000 and N = 100]
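The N-dependence this slide compares can be probed numerically: the error of the estimated prior mean shrinks roughly like 1/√N as more meta-training functions are used. The zero-mean synthetic prior below is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 25
x = np.linspace(0, 1, m)
d = x[:, None] - x[None, :]
k_true = np.exp(-0.5 * (d / 0.2) ** 2) + 1e-9 * np.eye(m)

def mean_estimation_error(N):
    """Max error of the estimated prior mean from N sampled functions
    (the true prior mean is zero here by construction)."""
    Y = rng.multivariate_normal(np.zeros(m), k_true, size=N)
    return np.abs(Y.mean(axis=0)).max()

# Average over repetitions: the error shrinks roughly like 1 / sqrt(N).
err_small = np.mean([mean_estimation_error(100) for _ in range(20)])
err_large = np.mean([mean_estimation_error(1000) for _ in range(20)])
```

With ten times as many meta-training functions, the estimated prior mean is roughly three times more accurate, which is why the N = 1000 confidence bands on the slide are tighter than the N = 100 ones.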

  18. Bounding the regret of meta BO with an unknown GP prior
Theorem (finite input space; results for continuous input spaces @ poster #22).
Important assumptions:
• the meta-training functions come from the same prior as the test function
• there are enough meta-training functions (N ≳ T)
Given T observations on the test function f, with high probability the simple regret R_T shrinks at a rate of roughly √(log T / T), up to an additive term that vanishes as N − T grows, and converges to C σ, where σ is the observation noise and C is a constant (≈ 10 for the linear kernel).

  19. Empirical results on block picking and placing
Meta-training data: f_1, f_2, …, f_N; test function f; N = 1500.
[Plots: max observed value vs. number of evaluations of the test function (our method, TransLearn, UCB, Rand), and max observed value vs. proportion of meta-training data used (our method, UCB)]

  20. Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior
Poster #22. More results on:
• estimation details for discrete and continuous input spaces
• regret bounds for compact input spaces in R^d
• regret bounds for probability of improvement in the meta-learning setting
• empirical results on robotics tasks
https://ziw.mit.edu/meta_bo
