Adversarially Robust Optimization with Gaussian Processes Ilija Bogunovic, Jonathan Scarlett, Stefanie Jegelka, Volkan Cevher Conference on Neural Information Processing Systems (Dec 2018)
Gaussian Process Optimization
Non-robust problem: $x^* = \arg\max_{x \in D \subset \mathbb{R}^d} f(x)$
Setting: GP/Bayesian optimization
‣ Unknown utility function $f$, modeled by a Gaussian process: $f \sim \mathrm{GP}(\mu, \kappa)$
‣ Sequentially query the unknown function $f$
‣ Noisy and expensive point evaluations
[Figure: function with its optimum marked]

Adversarially Robust GP Optimization
Robust problem: $x^* = \arg\max_{x \in D \subset \mathbb{R}^d} \min_{\delta \in \Delta_\epsilon(x)} f(x + \delta)$
Set of input perturbations: $\Delta_\epsilon(x) = \{ x' - x : \mathrm{dist}(x, x') \le \epsilon \}$
Motivation: adversarial attacks, implementation errors, etc.
Setting:
‣ Unknown utility function $f$, modeled by a Gaussian process: $f \sim \mathrm{GP}(\mu, \kappa)$
‣ Sequentially query the unknown function $f$
‣ Noisy and expensive point evaluations
[Figure: original function and perturbed function, with the non-robust optimum and the robust optimum marked]

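The difference between the two optima can be illustrated on a known function. The sketch below is only an illustration under stated assumptions: the two-peak test function, the 1-D grid, the choice dist(x, x') = |x − x'|, and the helper name `robust_objective` are all hypothetical and do not come from the slides.

```python
import numpy as np

def robust_objective(f_vals, xs, eps):
    """Worst-case value min over |x' - x| <= eps of f(x'), for each grid point x."""
    # Pairwise distances between grid points (dist is assumed to be |x - x'|).
    dist = np.abs(xs[:, None] - xs[None, :])
    # Points outside the eps-ball are set to +inf so they never attain the min.
    masked = np.where(dist <= eps + 1e-12, f_vals[None, :], np.inf)
    return masked.min(axis=1)

xs = np.linspace(0.0, 1.0, 201)
# Hypothetical function: a tall narrow peak at 0.2 and a lower, wider peak at 0.7.
f_vals = (1.0 * np.exp(-((xs - 0.2) / 0.02) ** 2)
          + 0.8 * np.exp(-((xs - 0.7) / 0.15) ** 2))

eps = 0.05
g_vals = robust_objective(f_vals, xs, eps)

x_nonrobust = xs[np.argmax(f_vals)]  # maximizes f itself
x_robust = xs[np.argmax(g_vals)]     # maximizes the worst case over the eps-ball
print(f"non-robust optimum: {x_nonrobust:.3f}, robust optimum: {x_robust:.3f}")
```

With these assumed values the non-robust optimum sits on the narrow peak, where a perturbation of size 0.05 removes almost all of the function value, while the robust optimum sits on the wide peak.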
Robust Algorithm: StableOpt
Non-robust BO methods: Thompson [Thompson ’33], PI [Kushner ’64], EI [Mockus et al. ’78], GP-UCB [Srinivas et al. ’11], ES [Hennig et al. ’12], GP-UCB-PE [Contal et al. ’13], BamSOO [Wang et al. ’14], PES [Hernandez-Lobato et al. ’14], MRS [Metzen ’16], GLASSES [Gonzalez et al. ’15], OPES [Hoffman & Ghahramani ’15], TruVaR [Bogunovic et al. ’16], MES [Wang & Jegelka ’17], FITBO [Ru et al. ’18], KG [Wu et al. ’17], the list goes on…
Robust algorithm: StableOpt
Round $t$:
‣ Choose: $\tilde{x}_t = \arg\max_{x \in D} \min_{\delta \in \Delta_\epsilon(x)} \mathrm{ucb}_{t-1}(x + \delta)$
‣ Select: $\delta_t = \arg\min_{\delta \in \Delta_\epsilon(\tilde{x}_t)} \mathrm{lcb}_{t-1}(\tilde{x}_t + \delta)$
‣ Observe noisy function value at $\tilde{x}_t + \delta_t$

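The three steps of a StableOpt round can be sketched on a 1-D grid as follows. This is a minimal sketch under several assumptions that are not on the slides: a zero-mean GP with a squared-exponential kernel and fixed lengthscale, a constant confidence parameter `beta`, dist(x, x') = |x − x'|, a hypothetical test function `f`, and a reporting rule that returns the chosen point with the largest worst-case lower confidence bound. All function and parameter names are illustrative.

```python
import numpy as np

def rbf(a, b, ls=0.1):
    """Squared-exponential kernel on 1-D inputs (lengthscale is an assumed value)."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(X, y, xs, noise_var):
    """Posterior mean and standard deviation of a zero-mean GP on the grid xs."""
    K = rbf(X, X) + noise_var * np.eye(len(X))
    Ks = rbf(xs, X)
    mean = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def stableopt(f, xs, eps, T=30, beta=4.0, noise_std=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # near[i, j] is True when grid point x_j lies in the eps-ball around x_i.
    near = np.abs(xs[:, None] - xs[None, :]) <= eps + 1e-12
    X, y = np.empty(0), np.empty(0)
    best_i, best_val = 0, -np.inf
    for _ in range(T):
        if len(X) == 0:
            mean, std = np.zeros(len(xs)), np.ones(len(xs))  # GP prior
        else:
            mean, std = gp_posterior(X, y, xs, noise_std ** 2)
        ucb = mean + np.sqrt(beta) * std
        lcb = mean - np.sqrt(beta) * std
        # Choose: the point whose worst-case UCB over its eps-ball is largest.
        i = int(np.argmax(np.where(near, ucb[None, :], np.inf).min(axis=1)))
        # Select: the perturbed point with the smallest LCB inside that ball.
        j = int(np.argmin(np.where(near[i], lcb, np.inf)))
        # Assumed reporting rule: remember the chosen point with the best worst-case LCB.
        if lcb[j] > best_val:
            best_i, best_val = i, lcb[j]
        # Observe: a noisy function value at the perturbed point.
        X = np.append(X, xs[j])
        y = np.append(y, f(xs[j]) + noise_std * rng.standard_normal())
    return xs[best_i], X, y

def f(x):
    # Hypothetical objective: narrow tall peak at 0.2, wide lower peak at 0.7.
    return np.exp(-((x - 0.2) / 0.03) ** 2) + 0.8 * np.exp(-((x - 0.7) / 0.15) ** 2)

xs = np.linspace(0.0, 1.0, 101)
x_reported, X, y = stableopt(f, xs, eps=0.05)
print("reported point:", x_reported)
```

Note that the function is evaluated at the perturbed point, not at the chosen point itself; the optimistic choice and the pessimistic perturbation are what distinguish this loop from a plain GP-UCB loop.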
Theoretical Result
Theorem: StableOpt guarantees that if $T \gtrsim \gamma_T / \eta^2$, then the reported point $x^{(T)}$ satisfies the following w.h.p.:
$\min_{\delta \in \Delta_\epsilon(x^{(T)})} f(x^{(T)} + \delta) \ge \max_{x \in D \subset \mathbb{R}^d} \min_{\delta \in \Delta_\epsilon(x)} f(x + \delta) - \eta,$
where
$T$: total number of points queried
$\eta$: target accuracy
$\gamma_T$: kernel-dependent information quantity

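To make the condition on $T$ concrete, here is a worked instance under assumptions that are not on the slide: a squared-exponential kernel, for which the bound $\gamma_T = O((\log T)^{d+1})$ is known from the GP-UCB analysis [Srinivas et al. ’11], and a reading of $\gtrsim$ as hiding constants and logarithmic factors. The symbol $r_\epsilon$ is a hypothetical name for the gap that the theorem bounds.

```latex
% r_eps: hypothetical name for the robust optimality gap of the reported point
r_\epsilon\big(x^{(T)}\big)
  := \max_{x \in D} \min_{\delta \in \Delta_\epsilon(x)} f(x + \delta)
   - \min_{\delta \in \Delta_\epsilon(x^{(T)})} f\big(x^{(T)} + \delta\big)
  \le \eta
  \quad \text{whenever} \quad T \gtrsim \frac{\gamma_T}{\eta^2}.

% Assumed squared-exponential kernel in dimension d:
\gamma_T = O\big((\log T)^{d+1}\big)
  \;\Longrightarrow\;
  T \gtrsim \frac{(\log T)^{d+1}}{\eta^2} \ \text{suffices.}
```

Under this reading, halving the target accuracy $\eta$ requires roughly four times as many queries, up to logarithmic factors.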
Variations
Robustness to unknown parameters:
• Goal: Choose $x$ robust to different $\theta$: $\max_{x \in D} \min_{\theta \in \Theta} f(x, \theta)$
• Application: Tuning hyperparameters robust to different data types
Robust group identification: Input space is partitioned into groups $G_1, G_2, \dots, G_k$
• Goal: Identify the group with the highest worst-case function value: $\max_{G \in \{G_1, \dots, G_k\}} \min_{x \in G} f(x)$
• Application: Robust group movie recommendation

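Both variations share the same max-min structure, which can be seen on small tables of known function values. The sketch below only illustrates the two objectives, not the GP-based algorithm; the tables, group names, and the helper names `robust_to_parameters` and `best_group` are hypothetical.

```python
import numpy as np

def robust_to_parameters(F):
    """F[i, j] = f(x_i, theta_j); return the index i maximizing the worst case over theta."""
    return int(np.argmax(F.min(axis=1)))

def best_group(f_vals, groups):
    """groups maps a group name to indices into f_vals; return the name of the
    group with the highest worst-case (minimum) function value."""
    return max(groups, key=lambda g: f_vals[groups[g]].min())

# Rows are candidate x, columns are parameter values theta (made-up numbers).
F = np.array([[0.9, 0.1, 0.8],
              [0.6, 0.5, 0.7],
              [0.3, 0.4, 0.2]])
print(robust_to_parameters(F))    # row minima are 0.1, 0.5, 0.2, so this prints 1

# Six inputs partitioned into three groups (made-up numbers).
f_vals = np.array([0.9, 0.2, 0.6, 0.5, 0.4, 0.7])
groups = {"G1": [0, 1], "G2": [2, 3], "G3": [4, 5]}
print(best_group(f_vals, groups))  # group minima are 0.2, 0.5, 0.4, so this prints G2
```

In the first table, row 0 contains the single largest value but has the worst minimum, so the robust choice is row 1; the group example shows the same effect, since G1 holds the largest individual value yet loses on its worst case.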
Adversarially Robust Optimization with Gaussian Processes Ilija Bogunovic, Jonathan Scarlett, Stefanie Jegelka, Volkan Cevher Poster #24 Wed Dec 5th 05:00 -- 07:00 PM @ Room 210 & 230 AB