
Safe Exploration for Optimization with Gaussian Processes Yanan Sui - PowerPoint PPT Presentation



  1. Safe Exploration for Optimization with Gaussian Processes
     Yanan Sui (Caltech), Alkis Gotovos (ETH Zurich), Joel W. Burdick (Caltech), Andreas Krause (ETH Zurich)

  2. Better safe than sorry (youtube.com/user/mattessons)
     Safe Exploration for Optimization with Gaussian Processes, Alkis Gotovos

  3. Therapeutic spinal cord stimulation (girardgibbs.com, sjm.com)
     ◮ Find electrode configurations that maximize muscle activity
     ◮ Bad configurations may cause pain or have negative effects on treatment

  4. Goal
     Optimize an unknown reward function via sequential sampling AND remain “safe” throughout the process

  8. Problem statement
     ◮ Finite decision set $D$
     ◮ Unknown reward function $f : D \to \mathbb{R}$
     ◮ Safety threshold $h \in \mathbb{R}$
     ◮ Seed set $S_0$ of safe decisions ($\forall x \in S_0,\ f(x) \ge h$)

  11. Problem statement
      Sequential sampling
      ◮ For $t = 1, 2, \dots$
        ◮ select $x_t \in D$
        ◮ observe $f(x_t) + n_t$
      Goal
      ◮ Find $x^* \in \operatorname{argmax}_{x \in D} f(x)$
      ◮ Remain safe: $\forall t \ge 1,\ f(x_t) \ge h$

  15. Related work
      ◮ Bayesian optimization: function evaluation is expensive
      ◮ Various proposed criteria, e.g.,
        ◮ Expected improvement [Mockus et al., 1974]
        ◮ UCB [Auer, 2002] [Srinivas et al., 2010]
      ◮ Related variants
        ◮ Level set estimation [Gotovos et al., 2013]
        ◮ Bayesian optimization with constraints [Gardner et al., 2014]
      ◮ Gaussian processes popular for modeling the unknown function

  21. Gaussian process regression
      [Figure built up over slides 16 to 21; recovered labels: axis $x$, curves $u_t(x)$ and $\ell_t(x)$]
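
The figure on this slide is lost, but its labels $u_t(x)$ and $\ell_t(x)$ suggest upper and lower confidence bounds around a GP posterior. A minimal sketch of how such bounds might be computed, assuming a squared-exponential kernel, a noise variance of 0.01, and a confidence scaling `beta = 4.0` (none of these values come from the slides; the function names are hypothetical):

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D point arrays a and b (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_confidence_bounds(x_obs, y_obs, x_query, noise_var=0.01, beta=4.0):
    """Return (lower, upper) = posterior mean -/+ sqrt(beta) * posterior std."""
    K = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)  # shape (n_obs, n_query)
    # Posterior mean: K_s^T K^{-1} y
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    # Posterior variance: k(x, x) - diag(K_s^T K^{-1} K_s)
    prior_var = np.diag(rbf_kernel(x_query, x_query))
    var = prior_var - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    std = np.sqrt(np.clip(var, 0.0, None))  # clip guards against tiny negatives
    half_width = np.sqrt(beta) * std
    return mean - half_width, mean + half_width

# Hypothetical data: three noisy-free samples of sin(x), used only for illustration.
x_obs = np.array([0.0, 1.0, 2.5])
y_obs = np.sin(x_obs)
x_query = np.linspace(0.0, 3.0, 7)
lower, upper = gp_confidence_bounds(x_obs, y_obs, x_query)
```

The band is narrow at sampled points and widens between them, which is the behavior the later slides rely on when choosing where to sample next.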

  24. GP-UCB
      ◮ Use upper confidence bounds for optimistic sampling
      ◮ $x_t = \operatorname{argmax}_{x \in D} u_t(x)$
      ◮ Sublinear regret under suitable conditions on $f$ [Srinivas et al., 2010]
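
The selection rule $x_t = \operatorname{argmax}_{x \in D} u_t(x)$ can be sketched over a finite decision set as follows. This is an illustration only: the posterior means and standard deviations are made-up numbers, and the form $u_t(x) = \mu + \sqrt{\beta}\,\sigma$ with `beta = 4.0` is an assumption, since the slides do not define $u_t$ explicitly.

```python
import math

def gp_ucb_select(means, stds, beta=4.0):
    """Return (index, bounds) where index maximizes u(x) = mean + sqrt(beta) * std."""
    upper = [m + math.sqrt(beta) * s for m, s in zip(means, stds)]
    best = max(range(len(upper)), key=upper.__getitem__)
    return best, upper

# Hypothetical posterior over a four-point decision set.
means = [0.2, 0.5, 0.4, 0.1]
stds = [0.05, 0.10, 0.30, 0.20]
chosen, upper = gp_ucb_select(means, stds)
```

Here the point with the highest posterior mean is index 1, but the rule picks index 2 because its larger uncertainty gives it the highest upper bound. That optimism is what drives exploration, and it is also why plain GP-UCB gives no safety guarantee.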

  25. GP-UCB example ($t = 0$)

  26. GP-UCB example ($t = 5$)

  27. GP-UCB example ($t = 10$)

  28. GP-UCB example ($t = 20$)

  30. Certifying safety
      ◮ Assume that $f$ is $L$-Lipschitz continuous w.r.t. a metric $d$
      ◮ If for some safe $x$ we know $f(x)$, then a safety certificate for $x'$ is $f(x) - L\,d(x, x') \ge h$
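
The certificate $f(x) - L\,d(x, x') \ge h$ translates directly into a check over the known safe points. A minimal sketch, assuming a 1-D decision space with $d(a, b) = |a - b|$ and exactly known values of $f$ at the safe points (the function name and the numbers are hypothetical):

```python
def certified_safe(x_new, safe_points, f_values, lipschitz, threshold,
                   dist=lambda a, b: abs(a - b)):
    """True if some known safe x satisfies f(x) - L * d(x, x_new) >= h."""
    return any(
        f_values[x] - lipschitz * dist(x, x_new) >= threshold
        for x in safe_points
    )

# Hypothetical example: one safe point x = 0.0 with f(x) = 1.0, L = 2, h = 0.
f_values = {0.0: 1.0}
L, h = 2.0, 0.0
near = certified_safe(0.4, [0.0], f_values, L, h)  # 1.0 - 0.8 = 0.2, certified
far = certified_safe(0.6, [0.0], f_values, L, h)   # 1.0 - 1.2 = -0.2, not certified
```

A point that fails the check is not necessarily unsafe; it simply cannot be certified from what is currently known.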

  42. Certifying safety
      [Figure built up over slides 31 to 42; recovered labels: $S_0$ and $\bar{R}_0(S_0)$]

  46. Reachability
      [Figure built up over slides 43 to 46; recovered label: $\bar{R}_\epsilon(S_0)$]

  49. Reconsidering optimization
      ◮ Initial goal of finding $f^* = \max_{x \in D} f(x)$ is unrealistic
      ◮ Instead, aim for the $\epsilon$-reachable maximum $f^*_\epsilon = \max_{x \in \bar{R}_\epsilon(S_0)} f(x)$
      ◮ Smaller $\epsilon$ → stricter goal → need more samples
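
The slides do not spell out how $\bar{R}_\epsilon(S_0)$ is built. The sketch below assumes a one-step rule in the spirit of the safety certificate, $R_\epsilon(S) = S \cup \{x \in D : \exists x' \in S,\ f(x') - \epsilon - L\,d(x', x) \ge h\}$, applied repeatedly until nothing new is added. That definition, the toy function, and all names are assumptions for illustration; the code also reads the true $f$, which an actual algorithm would not have.

```python
def reachable_closure(seed, decisions, f, lipschitz, threshold, eps,
                      dist=lambda a, b: abs(a - b)):
    """Grow the seed set with the assumed one-step rule until a fixed point."""
    reached = set(seed)
    changed = True
    while changed:
        changed = False
        for x in decisions:
            if x in reached:
                continue
            if any(f[s] - eps - lipschitz * dist(s, x) >= threshold
                   for s in reached):
                reached.add(x)
                changed = True
    return reached

def eps_reachable_max(seed, decisions, f, lipschitz, threshold, eps):
    """Maximum of f over the eps-reachable set."""
    reached = reachable_closure(seed, decisions, f, lipschitz, threshold, eps)
    return max(f[x] for x in reached)

# Hypothetical 1-Lipschitz function on the grid 0..6; point 6 is unsafe (f < h).
f = {0: 1.2, 1: 1.05, 2: 1.8, 3: 2.5, 4: 1.6, 5: 0.7, 6: -0.2}
decisions = sorted(f)
coarse = eps_reachable_max({0}, decisions, f, 1.0, 0.0, eps=0.1)
fine = eps_reachable_max({0}, decisions, f, 1.0, 0.0, eps=0.01)
fine_set = reachable_closure({0}, decisions, f, 1.0, 0.0, eps=0.01)
```

With `eps=0.1` expansion stalls at point 1 and the reachable maximum is 1.2; with `eps=0.01` it passes that bottleneck and reaches the peak at point 3, while the unsafe point 6 is never included. This mirrors the last bullet: a smaller $\epsilon$ sets a more ambitious target.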
