Adversarially Robust Optimization with Gaussian Processes

  1. Adversarially Robust Optimization with Gaussian Processes
     Ilija Bogunovic, Jonathan Scarlett, Stefanie Jegelka, Volkan Cevher
     Conference on Neural Information Processing Systems (Dec 2018)

  2. Gaussian Process Optimization
     Non-robust problem: $x^* = \arg\max_{x \in D \subset \mathbb{R}^d} f(x)$
     Setting: GP/Bayesian optimization
     ‣ Unknown utility function $f$, modeled by a Gaussian process: $f \sim \mathrm{GP}(\mu, \kappa)$
     ‣ Sequentially query the unknown function $f$
     ‣ Noisy and expensive point evaluations
     [Figure: function plot with the optimum marked]
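The GP model in the setting above can be illustrated with a small posterior computation. This is a minimal sketch, not the authors' code: the zero prior mean, the squared-exponential kernel, the lengthscale, the noise variance, and the names `rbf_kernel` and `gp_posterior` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D arrays a and b (assumed choice of kappa)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise_var=1e-2, lengthscale=0.2):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_obs, x_obs, lengthscale) + noise_var * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query, lengthscale)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y, computed through the Cholesky factor
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance kappa(x, x) = 1 for this kernel
    return mean, np.sqrt(np.maximum(var, 0.0))

# Three noisy-looking observations of a hypothetical function
x_obs = np.array([0.1, 0.5, 0.9])
y_obs = np.sin(6 * x_obs)
mean, std = gp_posterior(x_obs, y_obs, np.array([0.5, 3.0]))
```

The posterior is confident near queried inputs and reverts to the prior far from them, which is what makes each additional expensive query informative.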

  4. Adversarially Robust GP Optimization
     Robust problem: $x^* = \arg\max_{x \in D \subset \mathbb{R}^d} \min_{\delta \in \Delta_\epsilon(x)} f(x + \delta)$
     Set of input perturbations: $\Delta_\epsilon(x) = \{ x' - x : \mathrm{dist}(x, x') \le \epsilon \}$
     Motivation: adversarial attacks, implementation errors, etc.
     Setting:
     ‣ Unknown utility function $f$, modeled by a Gaussian process: $f \sim \mathrm{GP}(\mu, \kappa)$
     ‣ Sequentially query the unknown function $f$
     ‣ Noisy and expensive point evaluations
     [Figure: original function and perturbed function, with the non-robust optimum and the robust optimum marked]
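The difference between the non-robust and robust optimum can be seen on a toy example. The sketch below assumes a known, hypothetical 1-D function `f` with a tall narrow peak and a lower broad peak, a grid standing in for $D$, and $\mathrm{dist}(x, x') = |x - x'|$ with perturbed points restricted to the grid; none of these choices come from the slides.

```python
import numpy as np

def f(x):
    """Hypothetical utility: tall narrow peak at 0.2, lower broad peak at 0.7."""
    return np.exp(-((x - 0.2) / 0.02) ** 2) + 0.8 * np.exp(-((x - 0.7) / 0.15) ** 2)

def robust_value(x, eps, grid):
    """Worst-case value of f over grid points x' with |x' - x| <= eps."""
    neighbours = grid[np.abs(grid - x) <= eps + 1e-12]
    return f(neighbours).min()

grid = np.linspace(0.0, 1.0, 1001)
eps = 0.05

# Non-robust optimum: maximise f directly
x_nonrobust = grid[np.argmax(f(grid))]

# Robust optimum: maximise the worst case over the perturbation set
robust_vals = np.array([robust_value(x, eps, grid) for x in grid])
x_robust = grid[np.argmax(robust_vals)]
```

Here the non-robust optimum sits on the narrow peak near 0.2, where a perturbation of size 0.05 drops the value to almost zero, while the robust optimum sits on the broad peak near 0.7.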

  8. Robust Algorithm: StableOpt
     Non-robust BO methods: Thompson [Thompson ’33], PI [Kushner ’64], EI [Mockus et al. ’78], GP-UCB [Srinivas et al. ’11], ES [Hennig et al. ’12], GP-UCB-PE [Contal et al. ’13], BamSOO [Wang et al. ’14], PES [Hernandez-Lobato et al. ’14], MRS [Metzen ’16], GLASSES [Gonzalez et al. ’15], OPES [Hoffman & Ghahramani ’15], TruVaR [Bogunovic et al. ’16], MES [Wang & Jegelka ’17], FITBO [Ru et al. ’18], KG [Wu et al. ’17], the list goes on…
     Robust algorithm: StableOpt
     Round $t$:
     ‣ Choose: $\tilde{x}_t = \arg\max_{x \in D} \min_{\delta \in \Delta_\epsilon(x)} \mathrm{ucb}_{t-1}(x + \delta)$
     ‣ Select: $\delta_t = \arg\min_{\delta \in \Delta_\epsilon(\tilde{x}_t)} \mathrm{lcb}_{t-1}(\tilde{x}_t + \delta)$
     ‣ Observe noisy function value at $\tilde{x}_t + \delta_t$
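The three steps of a StableOpt round can be sketched on a 1-D grid as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the squared-exponential kernel, the constant confidence parameter `beta`, the grid discretisation of $D$ and $\Delta_\epsilon$, and the final reporting rule (maximising the worst-case lower confidence bound) are all choices made here for brevity, and every function name is hypothetical.

```python
import numpy as np

def rbf(a, b, ls=0.1):
    """Squared-exponential kernel (assumed)."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def posterior(x_obs, y_obs, grid, noise_var, ls=0.1):
    """Zero-mean GP posterior mean and standard deviation on the grid."""
    if len(x_obs) == 0:
        return np.zeros_like(grid), np.ones_like(grid)
    K = rbf(x_obs, x_obs, ls) + noise_var * np.eye(len(x_obs))
    K_s = rbf(x_obs, grid, ls)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def worst_case(values, grid, eps):
    """Per grid point x: min of `values` over grid points within eps, and its index."""
    near = np.abs(grid[:, None] - grid[None, :]) <= eps + 1e-12
    masked = np.where(near, values[None, :], np.inf)
    return masked.min(axis=1), masked.argmin(axis=1)

def stableopt(f, grid, eps, rounds, beta=4.0, noise_std=0.05, seed=0):
    rng = np.random.default_rng(seed)
    x_obs, y_obs = np.empty(0), np.empty(0)
    for _ in range(rounds):
        mean, std = posterior(x_obs, y_obs, grid, noise_std ** 2)
        ucb = mean + np.sqrt(beta) * std
        lcb = mean - np.sqrt(beta) * std
        # Choose: maximise the worst-case upper confidence bound
        robust_ucb, _ = worst_case(ucb, grid, eps)
        i_tilde = int(np.argmax(robust_ucb))
        # Select: perturbation minimising the lower confidence bound around x~_t
        _, worst_idx = worst_case(lcb, grid, eps)
        x_query = grid[int(worst_idx[i_tilde])]  # this is x~_t + delta_t
        # Observe a noisy function value at the perturbed point
        y = f(x_query) + noise_std * rng.normal()
        x_obs, y_obs = np.append(x_obs, x_query), np.append(y_obs, y)
    # Reporting rule (assumption of this sketch): best worst-case lcb
    mean, std = posterior(x_obs, y_obs, grid, noise_std ** 2)
    robust_lcb, _ = worst_case(mean - np.sqrt(beta) * std, grid, eps)
    return grid[int(np.argmax(robust_lcb))], x_obs

grid = np.linspace(0.0, 1.0, 101)
x_reported, queried = stableopt(lambda x: np.sin(3 * x), grid, eps=0.05, rounds=15)
```

Note that the algorithm queries the perturbed point $\tilde{x}_t + \delta_t$ rather than $\tilde{x}_t$ itself, so the observations are collected where the current model believes the adversary would push the input.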

  9. Theoretical Result
     Theorem: StableOpt guarantees that if $T \gtrsim \gamma_T / \eta^2$, then the reported point $x^{(T)}$ satisfies the following with high probability:
     $\min_{\delta \in \Delta_\epsilon(x^{(T)})} f(x^{(T)} + \delta) \ge \max_{x \in D \subset \mathbb{R}^d} \min_{\delta \in \Delta_\epsilon(x)} f(x + \delta) - \eta$,
     where
     $T$: total number of points queried
     $\eta$: target accuracy
     $\gamma_T$: kernel-dependent information quantity
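For a known test function, the inequality in the theorem can be checked numerically by measuring how far a reported point's worst-case value falls below the best achievable worst-case value. The sketch below assumes a hypothetical two-peak function, a grid in place of $D$, and absolute-value distance; `robust_gap` is an illustrative helper, not something defined in the slides.

```python
import numpy as np

def f(x):
    """Hypothetical utility: tall narrow peak at 0.2, lower broad peak at 0.7."""
    return np.exp(-((x - 0.2) / 0.02) ** 2) + 0.8 * np.exp(-((x - 0.7) / 0.15) ** 2)

def robust_values(f, grid, eps):
    """Worst-case value of f within eps of each grid point."""
    near = np.abs(grid[:, None] - grid[None, :]) <= eps + 1e-12
    return np.where(near, f(grid)[None, :], np.inf).min(axis=1)

def robust_gap(f, grid, eps, x_reported):
    """Best worst-case value minus the worst-case value at the reported point."""
    g = robust_values(f, grid, eps)
    i = int(np.argmin(np.abs(grid - x_reported)))  # nearest grid point
    return g.max() - g[i]

grid = np.linspace(0.0, 1.0, 501)
gap_at_broad_peak = robust_gap(f, grid, 0.05, 0.7)
gap_at_narrow_peak = robust_gap(f, grid, 0.05, 0.2)
```

A reported point meets target accuracy $\eta$ exactly when its gap is at most $\eta$; in this toy case the broad peak has a gap of about zero, while the narrow peak misses by roughly 0.7.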

  11. Variations
      Robustness to unknown parameters:
      • Goal: Choose $x$ robust to different $\theta$: $\max_{x \in D} \min_{\theta \in \Theta} f(x, \theta)$
      • Application: Tuning hyperparameters robust to different data types
      Robust group identification: the input space is partitioned into groups $G_1, G_2, \dots, G_k$
      • Goal: Identify the group with the highest worst-case function value: $\max_{G \in \mathcal{G}} \min_{x \in G} f(x)$
      • Application: Robust group movie recommendation
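Both variations reduce to a max-min over a finite table once function values are available. The numbers, group names, and variable names below are made up for illustration; in the actual setting $f$ is unknown, so confidence bounds from the GP model would presumably take the place of these exact values.

```python
import numpy as np

# Robustness to unknown parameters: rows are candidate x, columns are theta.
# Hypothetical scores f(x, theta).
scores = np.array([
    [0.9, 0.2, 0.8],   # strong for two thetas, poor for one
    [0.6, 0.5, 0.7],   # moderate everywhere
    [0.4, 0.4, 0.4],   # uniformly mediocre
])
worst_per_x = scores.min(axis=1)      # min over theta for each x
best_x = int(np.argmax(worst_per_x))  # max over x of the worst case

# Robust group identification: pick the group whose weakest member is best.
values = np.array([0.9, 0.1, 0.5, 0.6, 0.3, 0.8])  # hypothetical f(x) per item
groups = {"G1": [0, 1], "G2": [2, 3], "G3": [4, 5]}
worst_per_group = {name: values[idx].min() for name, idx in groups.items()}
best_group = max(worst_per_group, key=worst_per_group.get)
```

In the first table the row with the highest single score is not selected, because its worst case over $\theta$ is the lowest; the same reasoning picks the group with no weak member over groups containing the top-scoring items.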

  12. Adversarially Robust Optimization with Gaussian Processes
      Ilija Bogunovic, Jonathan Scarlett, Stefanie Jegelka, Volkan Cevher
      Poster #24, Wed Dec 5th, 05:00 -- 07:00 PM @ Room 210 & 230 AB
