Machine Learning Research Group
Asynchronous Batch Bayesian Optimisation with Improved Local Penalisation
Ahsan Alvi, Binxin Ru, Jan-Peter Calliess, Stephen Roberts, Michael A. Osborne
ICML 2019
Talk Overview
• Bayesian optimisation (BO) recap
• Synchronous vs asynchronous BO
• Our Method
  – Design of penaliser
  – Locally estimated Lipschitz constant
• Empirical results
1. Bayesian Optimisation (BO)
• To solve the global optimisation problem $x^* = \arg\min_{x \in \mathcal{X}} f(x)$
• The objective function $f(\cdot)$ is:
  – Non-convex
  – Expensive
  – Noisy
1. Bayesian Optimisation (BO)
• Objective: $x^* = \arg\min_{x \in \mathcal{X}} f(x)$
• Model: $f \sim \mathcal{GP}(\mu_t, K_t)$
• Next query point: $x_{t+1} = \arg\max_{x \in \mathcal{X}} \alpha_t(x)$
[Figure: two plots over x ∈ [0, 1].]
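The loop on this slide (fit a GP, maximise an acquisition function, evaluate, repeat) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the zero-mean GP with an RBF kernel, the negative lower-confidence-bound acquisition, the grid search over [0, 1], and every name and parameter value (`rbf_kernel`, `gp_posterior`, `bo_minimise`, `lengthscale`, `kappa`, `noise`) are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel between two 1-D arrays of inputs.
    sq = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    # Zero-mean GP posterior mean and standard deviation at x_test.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    solved = np.linalg.solve(K, K_s)           # K^{-1} K_s, shape (n, m)
    mean = solved.T @ y_train
    var = 1.0 - np.sum(K_s * solved, axis=0)   # prior variance k(x, x) = 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

def bo_minimise(f, n_iters=15, kappa=2.0):
    grid = np.linspace(0.0, 1.0, 201)          # candidate set standing in for X
    x_obs = np.array([0.1, 0.9])               # hypothetical initial design
    y_obs = f(x_obs)
    for _ in range(n_iters):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        acq = -(mean - kappa * std)            # alpha_t(x): negative LCB (assumed choice)
        x_next = grid[np.argmax(acq)]          # x_{t+1} = argmax alpha_t(x)
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, f(x_next))
    return x_obs[np.argmin(y_obs)], y_obs.min()

# Toy objective chosen for illustration only.
x_best, y_best = bo_minimise(lambda x: (x - 0.3) ** 2)
```

The grid search stands in for a proper acquisition optimiser; on a continuous domain a gradient-based or multi-start optimiser would normally take its place.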
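2. Synchronous Batch BO
• Enable multiple evaluations in parallel
[Figure: evaluation timelines. Sequential BO: a single worker C1 runs evaluations 1, 2 one after another. Sync Batch BO (B=3): workers C1, C2, C3 run evaluations 1, 2, 3 in parallel, then 4, 5, 6. Horizontal axis: Time.]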
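2. Asynchronous Batch BO
• Maximise utilisation of parallel workers
[Figure: evaluation timelines for workers C1, C2, C3. Sync Batch BO (B=3): evaluations 1, 2, 3 run in parallel, then 4, 5, 6. Async Batch BO (B=3): C1 runs 1, 4, 7; C2 runs 2, 5, 9; C3 runs 3, 6, 8. Horizontal axis: Time.]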
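The utilisation argument can be illustrated with a small scheduling sketch. It compares the total wall-clock time when each batch waits for its slowest evaluation (sync) against handing the next evaluation to whichever worker frees up first (async). The function names and the evaluation durations are hypothetical, and the sketch ignores that in real async BO the choice of each new point depends on which results have already arrived.

```python
import heapq

def sync_makespan(durations, n_workers):
    # Each batch of n_workers evaluations ends when its slowest member ends.
    total = 0.0
    for start in range(0, len(durations), n_workers):
        total += max(durations[start:start + n_workers])
    return total

def async_makespan(durations, n_workers):
    # A min-heap of times at which each worker next becomes free.
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    for d in durations:
        t = heapq.heappop(free_at)      # earliest free worker takes the next job
        heapq.heappush(free_at, t + d)
    return max(free_at)

# Hypothetical evaluation times for six evaluations on three workers.
durations = [1.0, 3.0, 2.0, 3.0, 1.0, 1.0]
print(sync_makespan(durations, 3))   # 6.0
print(async_makespan(durations, 3))  # 4.0
```

With these durations the synchronous schedule leaves two workers idle while the slowest evaluation in each batch finishes, which is the idle time the asynchronous schedule removes.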
3. Our Method
• A new async batch BO method: Penalising Locally for Asynchronous Bayesian Optimisation on k Workers (PLAyBOOK)
$x_q = \arg\max_{x \in \mathcal{X}} \left\{ \alpha(x) \prod_{i=1}^{q-1} \psi(x \mid x_i) \right\}$
  – $x_q$: new point assigned to the free worker
  – $x_1, \dots, x_{q-1}$: busy locations (points under evaluation at busy workers)
3. Our Method
• Penalising Locally for Asynchronous Bayesian Optimisation on k Workers (PLAyBOOK)
$x_q = \arg\max_{x \in \mathcal{X}} \left\{ \alpha(x) \prod_{i=1}^{q-1} \psi(x \mid x_i) \right\}$
• We show empirically that PLAyBOOK outperforms, in both time and sample efficiency:
  – other async BO methods
  – its sync. variants
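The selection rule $x_q = \arg\max_x \alpha(x) \prod_i \psi(x \mid x_i)$ can be sketched over a finite candidate set as below. This is an illustrative sketch under several assumptions: acquisition values are non-negative (so that multiplying by a penaliser in [0, 1] lowers them), the penaliser is the hard min-form from the penaliser-design slide, and the function name `select_next_point` together with all of its arguments and the example numbers are hypothetical.

```python
import numpy as np

def select_next_point(candidates, acq_values, busy_points,
                      mu_busy, sigma_busy, M, L_hat):
    # candidates: (n, d) array; acq_values: (n,) non-negative acquisition values.
    candidates = np.asarray(candidates, dtype=float)
    penalised = np.asarray(acq_values, dtype=float).copy()
    for x_i, mu_i, s_i in zip(busy_points, mu_busy, sigma_busy):
        dist = np.linalg.norm(candidates - np.asarray(x_i, dtype=float), axis=1)
        # Hard penaliser: min{ L_hat * ||x - x_i|| / (|mu_i - M| + sigma_i), 1 }
        psi = np.minimum(L_hat * dist / (abs(mu_i - M) + s_i), 1.0)
        penalised *= psi
    return candidates[np.argmax(penalised)]

# Hypothetical 1-D example: one busy worker is evaluating x = 0.0.
cands = [[0.0], [0.5], [1.0]]
acq = [1.0, 0.9, 0.2]
x_q = select_next_point(cands, acq, busy_points=[[0.0]],
                        mu_busy=[0.0], sigma_busy=[0.5], M=1.0, L_hat=1.0)
print(x_q)  # [0.5]
```

Without any busy points the unpenalised maximiser `[0.0]` would be returned; the penaliser pushes the free worker away from the location already under evaluation.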
4. Penaliser design
• Our hard penaliser (HLP): $\psi_{\mathrm{HLP}}(x \mid x_q) = \min\left\{ \frac{\hat{L} \lVert x - x_q \rVert}{\lvert \mu(x_q) - M \rvert + \sigma(x_q)},\ 1 \right\}$
• LP (Gonzalez et al., 2016): $\psi_{\mathrm{LP}}(x \mid x_q) = \Phi\left( \frac{\hat{L} \lVert x - x_q \rVert - \lvert \mu(x_q) - M \rvert}{\sigma(x_q)} \right)$
[Figure: two plots over x ∈ [-10, 10] with values in [0, 1], comparing LP and HLP. Labels: $x_{b1}, x_{b2}$, LP: $x_3$, HLP: $x_3$.]
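The two penalisers above can be written out directly. The sketch below assumes $\Phi$ is the standard normal CDF and treats $M$, $\hat{L}$, $\mu(x_q)$ and $\sigma(x_q)$ as given scalars; the function names and the example values are hypothetical.

```python
import math
import numpy as np

def hard_local_penaliser(x, x_q, mu_q, sigma_q, M, L_hat):
    # HLP: min{ L_hat * ||x - x_q|| / (|mu_q - M| + sigma_q), 1 }
    dist = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(x_q, dtype=float))
    return float(min(L_hat * dist / (abs(mu_q - M) + sigma_q), 1.0))

def local_penaliser(x, x_q, mu_q, sigma_q, M, L_hat):
    # LP: Phi((L_hat * ||x - x_q|| - |mu_q - M|) / sigma_q)
    dist = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(x_q, dtype=float))
    z = (L_hat * dist - abs(mu_q - M)) / sigma_q
    return 0.5 * math.erfc(-z / math.sqrt(2.0))   # standard normal CDF at z

# Hypothetical values: busy point at 0 with mu = 0, sigma = 0.5, M = 1, L_hat = 1.
args = dict(x_q=[0.0], mu_q=0.0, sigma_q=0.5, M=1.0, L_hat=1.0)
hlp_at_busy = hard_local_penaliser([0.0], **args)   # exactly 0 at the busy point
lp_at_busy = local_penaliser([0.0], **args)         # small but strictly positive
```

With these values the HLP is exactly zero at the busy location, whereas the LP stays strictly positive there, so only the HLP rules out re-selecting a point that is already being evaluated.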
5. Empirical Results: Async. vs. Sync.
• PLAyBOOK-HL on Ackley 5-D with B=4 and B=16
[Figure: panels for B=4 and B=16. X-axis: Run Time in one set of panels, No. of Func. Evaluations in the other.]
5. Empirical Results: Async. methods
• Tuning 9 hyperparameters of a CNN for CIFAR-10
[Figure: panels for B=2 and B=4.]
Thank you! Meet us at poster #213!