Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior




  1. Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior. Zi Wang*, Beomjoon Kim*, Leslie Pack Kaelbling. Dec 5 @ NeurIPS 18, Poster #22

  2. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
Challenges:
• f is expensive to evaluate
• f is multi-peak
• no gradient information
• evaluations can be noisy

  3. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
[Plot: a multi-peak function f(x) over x ∈ [-2, 2]]
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
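The loop on this slide can be sketched in a few lines of NumPy. Everything below is an illustrative assumption rather than the talk's setup: the RBF kernel, its lengthscale, the toy objective, and the confidence multiplier 2.0 are all made up for demonstration.

```python
import numpy as np

def rbf(a, b, ell=0.3):
    """Squared-exponential kernel (an illustrative choice of k)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at test points Xs."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    sol = np.linalg.solve(K, Ks)
    mu = sol.T @ y
    var = np.diag(rbf(Xs, Xs) - Ks.T @ sol)
    return mu, np.maximum(var, 1e-12)

# Toy objective: multi-peak, queried without gradients, as on the slide.
f = lambda x: np.sin(3 * x) + 0.5 * np.cos(7 * x)
grid = np.linspace(-2, 2, 200)

X = np.array([-1.5, 0.0, 1.5])           # initial queries
y = f(X)
for t in range(10):                      # the BO loop
    mu, var = gp_posterior(X, y, grid)
    ucb = mu + 2.0 * np.sqrt(var)        # upper-confidence-bound acquisition
    x_next = grid[np.argmax(ucb)]        # choose a new query point
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))          # (noise-free evaluations for simplicity)

best = grid[np.argmax(gp_posterior(X, y, grid)[0])]
```

Each iteration conditions the GP on all queries so far and evaluates the point where the optimistic estimate (mean plus a confidence width) is largest, trading off exploitation against exploration.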

  4. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
[Plot: a multi-peak function f(x) over x ∈ [-2, 2]]
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
How to choose the prior?

  5. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
[Plot: a multi-peak function f(x) over x ∈ [-2, 2]]
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
• re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations
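The "re-estimate the prior parameters" step can be illustrated with a grid search over an RBF lengthscale that maximizes the marginal data likelihood. The data-generating lengthscale of 0.5, the noise level, and the candidate grid below are all made-up assumptions, not values from the talk.

```python
import numpy as np

def neg_log_marginal_likelihood(X, y, ell, noise=1e-3):
    """-log p(y | X, ell) for a zero-mean GP with an RBF kernel."""
    d = X[:, None] - X[None, :]
    K = np.exp(-0.5 * (d / ell) ** 2) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(X) * np.log(2 * np.pi)

# Synthetic data drawn from a GP whose true lengthscale is 0.5.
rng = np.random.default_rng(0)
X = np.linspace(0, 5, 40)
d = X[:, None] - X[None, :]
y = rng.multivariate_normal(np.zeros(40), np.exp(-0.5 * (d / 0.5) ** 2) + 1e-6 * np.eye(40))

# Re-estimate the prior parameter: pick the lengthscale that maximizes the
# marginal data likelihood (a grid search, for simplicity).
candidates = np.linspace(0.1, 2.0, 50)
nlls = [neg_log_marginal_likelihood(X, y, ell) for ell in candidates]
ell_hat = candidates[int(np.argmin(nlls))]
```

In practice this step is done with gradient-based optimizers over all kernel hyperparameters; the grid search keeps the sketch short. Note the circularity the next slides point out: the hyperparameters are fit to data that were themselves collected using the current hyperparameters.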

  6. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
Which comes first: data or prior?
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
• re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations

  7. Bayesian optimization
Goal: x* = argmax_{x ∈ 𝔜} f(x)
Which comes first: data or prior?
Challenges: f is expensive to evaluate; f is multi-peak; no gradient information; evaluations can be noisy.
Assume a GP prior f ∼ GP(μ, k).
LOOP:
• choose new query point(s) to evaluate
• compute the posterior GP model
• re-estimate the prior parameters, e.g. by maximizing the marginal data likelihood every few iterations
Hard to analyze.

  8. Bayesian optimization with an unknown GP prior
[Plot: data collected on f vs. the prior model]

  9. Bayesian optimization with an unknown GP prior
[Plot: data collected on f vs. the prior model]
Our problem setup: use past experience with similar functions as the meta-training data to break the circular dependencies.

  10. Meta Bayesian optimization with an unknown GP prior
[Diagram: offline phase | online phase]

  11. Meta Bayesian optimization with an unknown GP prior
Offline phase: estimate the GP prior from offline data sampled from the same prior.
[Plot: estimated prior μ̂(x) ± 3 √k̂(x)]
Estimated prior: (μ̂, k̂)
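One simple way to estimate a discrete GP prior from offline data, in the spirit of this slide, is to take the sample mean and sample covariance across the N meta-training functions observed on a shared grid; both are unbiased estimators of the prior mean and kernel. The synthetic ground-truth prior below (sinusoidal mean, RBF kernel) is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
m, N = 30, 200
x = np.linspace(0, 1, m)
d = x[:, None] - x[None, :]
mu_true = np.sin(2 * np.pi * x)                            # unknown prior mean
k_true = np.exp(-0.5 * (d / 0.2) ** 2) + 1e-9 * np.eye(m)  # unknown prior kernel

# Offline phase: N functions from the same prior, observed on a shared grid.
Y = rng.multivariate_normal(mu_true, k_true, size=N)       # meta-training data, N x m

# Unbiased estimators of the prior from the offline data:
mu_hat = Y.mean(axis=0)                                    # sample mean
k_hat = (Y - mu_hat).T @ (Y - mu_hat) / (N - 1)            # sample covariance
```

The estimate needs no hand-picked kernel family at all: the prior is read directly off the meta-training data, which is what removes the circular dependency between data and prior.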

  12. Meta Bayesian optimization with an unknown GP prior
Offline phase: estimate the GP prior from offline data sampled from the same prior.
Online phase: construct unbiased estimators of the posterior and use a variant of GP-UCB.
[Plots: estimated prior μ̂(x) ± 3 √k̂(x); posterior μ̂_0(x) ± ζ_1 √k̂_0(x)]
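A sketch of the online phase, assuming prior estimates (μ̂, k̂) are already given on a discrete grid: condition them on the observations so far and query the maximizer of the upper confidence bound μ̂_t(x) + ζ_{t+1} √k̂_t(x). The stand-in prior, the test function, and the fixed ζ = 3 below are illustrative assumptions (in the paper, ζ_t is set by the theory).

```python
import numpy as np

def posterior_on_grid(mu_hat, k_hat, idx, y_obs, noise=1e-4):
    """Condition an estimated discrete GP prior (mu_hat, k_hat) on
    observations y_obs made at grid indices idx."""
    K = k_hat[np.ix_(idx, idx)] + noise * np.eye(len(idx))
    Ks = k_hat[idx, :]                         # cross-covariance, |idx| x m
    sol = np.linalg.solve(K, Ks)
    mu_t = mu_hat + sol.T @ (y_obs - mu_hat[idx])
    k_t = np.diag(k_hat - Ks.T @ sol)
    return mu_t, np.maximum(k_t, 0.0)

# Stand-in estimated prior on a 50-point grid (mu_hat, k_hat are assumptions).
x = np.linspace(0, 1, 50)
d = x[:, None] - x[None, :]
mu_hat = np.zeros(50)
k_hat = np.exp(-0.5 * (d / 0.15) ** 2)

f = np.sin(6 * x)             # illustrative test function to optimize
idx, y_obs = [0], [f[0]]      # start from a single observation
for t in range(8):
    mu_t, k_t = posterior_on_grid(mu_hat, k_hat, np.array(idx), np.array(y_obs))
    zeta = 3.0                # confidence width, zeta_{t+1} on the slides
    i_next = int(np.argmax(mu_t + zeta * np.sqrt(k_t)))
    idx.append(i_next)
    y_obs.append(f[i_next])
```

The slides that follow animate exactly this loop: with each new evaluation the confidence band μ̂_t(x) ± ζ_{t+1} √k̂_t(x) tightens around the observations, and the acquisition moves toward the maximum.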

  13. Meta Bayesian optimization with an unknown GP prior (animation: next online iteration)
[Plot: posterior μ̂_1(x) ± ζ_2 √k̂_1(x)]

  14. Meta Bayesian optimization with an unknown GP prior (animation: next online iteration)
[Plot: posterior μ̂_2(x) ± ζ_3 √k̂_2(x)]

  15. Meta Bayesian optimization with an unknown GP prior (animation: next online iteration)
[Plot: posterior μ̂_3(x) ± ζ_4 √k̂_3(x)]

  16. Meta Bayesian optimization with an unknown GP prior (animation: next online iteration)
[Plot: posterior μ̂_4(x) ± ζ_5 √k̂_4(x)]

  17. Effect of N, the number of meta-training functions
[Plots: posterior μ̂_t(x) ± ζ_{t+1} √k̂_t(x) on x ∈ [0, 1], for N = 1000 and N = 100]
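The N-dependence this slide compares can be probed numerically: the error of the estimated prior mean shrinks roughly like 1/√N as more meta-training functions are used. The zero-mean synthetic prior below is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 25
x = np.linspace(0, 1, m)
d = x[:, None] - x[None, :]
k_true = np.exp(-0.5 * (d / 0.2) ** 2) + 1e-9 * np.eye(m)

def mean_estimation_error(N):
    """Max error of the estimated prior mean from N sampled functions
    (the true prior mean is zero here by construction)."""
    Y = rng.multivariate_normal(np.zeros(m), k_true, size=N)
    return np.abs(Y.mean(axis=0)).max()

# Average over repetitions: the error shrinks roughly like 1 / sqrt(N).
err_small = np.mean([mean_estimation_error(100) for _ in range(20)])
err_large = np.mean([mean_estimation_error(1000) for _ in range(20)])
```

With ten times as many meta-training functions, the estimated prior mean is roughly three times more accurate, which is why the N = 1000 confidence bands on the slide are tighter than the N = 100 ones.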

  18. Bounding the regret of meta BO with an unknown GP prior
Theorem (finite input space; results for continuous input spaces @ poster #22).
Important assumptions:
• the meta-training functions come from the same prior as the test function
• there are enough meta-training functions (N ≳ T)
Given T observations on the test function f, with high probability the simple regret R_T shrinks at a rate of roughly √(log T / T), up to an additive term that vanishes as N − T grows, and converges to C σ, where σ is the observation noise and C is a constant (≈ 10 for the linear kernel).

  19. Empirical results on block picking and placing
Meta-training data: f_1, f_2, …, f_N; test function f; N = 1500.
[Plots: max observed value vs. number of evaluations of the test function (our method, TransLearn, UCB, Rand), and max observed value vs. proportion of meta-training data used (our method, UCB)]

  20. Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior
Poster #22. More results on:
• estimation details for discrete and continuous input spaces
• regret bounds for compact input spaces in R^d
• regret bounds for probability of improvement in the meta-learning setting
• empirical results on robotics tasks
https://ziw.mit.edu/meta_bo
