Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments
Kirthevasan Kandasamy, Willie Neiswanger, Reed Zhang, Akshay Krishnamurthy, Jeff Schneider, Barnabás Póczos
ICML 2019
Example 1: Active Learning in Parametric Models
(Expensive) Blackbox System
Goal: Learn the parameter θ in as few experiments as possible.
Algorithms: Active-Set-Select (Chaudhuri et al. 2015)
Example 2: Blackbox Optimisation
(Expensive) Blackbox System
Goal: Find $\operatorname{argmax}_x f_\theta(x)$ in as few experiments as possible.
Algorithms: UCB (Srinivas et al. 2010, Auer 2002), EI (Jones et al. 1998).
Adaptive Goal Oriented Design of Experiments
[Figure: design loop. Labels: Experiment; Update model with results; (Bayesian) Model; Algorithm; Next design to test; Recommendation; Application Specific Goal]
◮ Blackbox Optimisation
◮ Active Learning
◮ Active Quadrature (Osborne et al. 2012)
◮ Active Level Set Estimation (Gotovos et al. ’13)
◮ Active Search (Ma et al. ’17)
◮ Active Posterior Estimation (Kandasamy et al. ’15)
Issues:
◮ New goal/setting ⟹ New algorithm?
◮ Algorithms tend to depend on the model and vice versa.
Adaptive Goal Oriented Design of Experiments
1. System:
◮ An unknown parameter θ completely specifies the system.
◮ A prior $P(\theta)$ and a likelihood $P(Y \mid X, \theta)$.
2. Goal:
◮ Collect data $D_n = \{(x_t, y_{x_t})\}_{t=1}^{n}$ to maximise a user-specified reward function $\lambda(\theta, D_n)$.
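To make the role of the reward function concrete, here is a minimal Python sketch of two possible choices of λ(θ, D_n) for the same system. Everything in it is an illustrative assumption rather than something taken from the slides: the linear response `f`, the least-squares estimate, and both reward definitions are hypothetical stand-ins for "blackbox optimisation" and "active learning" style goals.

```python
from typing import List, Tuple

Data = List[Tuple[float, float]]  # collected pairs (x_t, y_t)


def f(theta: float, x: float) -> float:
    """Hypothetical noiseless system response: a line through the origin."""
    return theta * x


def optimisation_reward(theta: float, data: Data) -> float:
    """Optimisation-style goal: best true function value among queried designs."""
    return max(f(theta, x) for x, _ in data)


def estimation_reward(theta: float, data: Data) -> float:
    """Learning-style goal: negative squared error of a least-squares estimate.

    Assumes at least one queried design has x != 0.
    """
    theta_hat = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
    return -(theta_hat - theta) ** 2


data = [(1.0, 2.0), (2.0, 4.0)]
print(optimisation_reward(2.0, data))
print(estimation_reward(2.0, data))
```

Both rewards take the true parameter θ as an argument, so neither can be evaluated by the experimenter while θ is unknown; only the choice of reward changes between goals, not the system description.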
Algorithm: Myopic Posterior Sampling (MPS)
Inspired by Posterior (Thompson) Sampling (Thompson 1933).
At each time step, myopically choose the action by assuming that a posterior sample $\theta' \sim P(\theta \mid \text{past experiments})$ is the true parameter.
Only requires that we can sample from the posterior.
- Many probabilistic programming tools are available today.
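The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes a finite set of candidate parameters (so the posterior can be updated exactly), it reads "myopically" as maximising the one-step expected reward under the sampled parameter, and the names `mps`, `likelihood`, and `cumulative_reward` are hypothetical.

```python
import random


def mps(prior, likelihood, reward, true_theta, n_steps, rng):
    """Toy MPS loop; likelihood[k][x][y] = P(Y = y | X = x, theta = k)."""
    n_thetas = len(prior)
    n_designs = len(likelihood[0])
    n_outcomes = len(likelihood[0][0])
    posterior = list(prior)
    data = []
    for _ in range(n_steps):
        # 1. Draw a parameter from the current posterior.
        theta_s = rng.choices(range(n_thetas), weights=posterior)[0]

        # 2. Treat theta_s as the truth and score each design by its
        #    one-step expected reward (assumed meaning of "myopic").
        def expected_reward(x):
            return sum(
                likelihood[theta_s][x][y] * reward(theta_s, data + [(x, y)])
                for y in range(n_outcomes)
            )

        x = max(range(n_designs), key=expected_reward)
        # 3. Run the experiment on the real system.
        y = rng.choices(range(n_outcomes), weights=likelihood[true_theta][x])[0]
        data.append((x, y))
        # 4. Exact Bayes update over the finite parameter set.
        posterior = [p * likelihood[k][x][y] for k, p in enumerate(posterior)]
        total = sum(posterior)
        posterior = [p / total for p in posterior]
    return data, posterior


# Hypothetical system: 3 parameters, 3 designs, binary outcome.
# Under parameter k, design x gives y = 1 with probability 0.9 if x == k, else 0.1.
K = 3
likelihood = [[[0.1, 0.9] if x == k else [0.9, 0.1] for x in range(K)]
              for k in range(K)]


def cumulative_reward(theta, data):
    # Sum of true success probabilities of the queried designs.
    return sum(likelihood[theta][x][1] for x, _ in data)


data, posterior = mps([1 / K] * K, likelihood, cumulative_reward,
                      true_theta=2, n_steps=100, rng=random.Random(0))
print(posterior)
```

With this particular cumulative reward, step 2 always picks the design that is best under the sampled parameter, so the loop behaves like Thompson sampling on a Bernoulli bandit. Other rewards change only the `reward` argument; the sampling and update steps stay the same.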
Theory
Theorem (Informal): Under certain conditions, MPS is competitive with a globally optimal oracle that knows θ.
Proof ideas from adaptive submodularity and bandits.
Prior work: With adaptive submodularity, myopic planning algorithms are good when the reward is known a priori.
This work:
◮ $\lambda(\theta, D_n)$: reward not known a priori.
◮ A myopic learning + planning algorithm is good in adaptive submodular environments.
Experiments
[Figure: four plots comparing MPS with RAND, an Oracle, and task-specific baselines. Panels: Active Learning (Synthetic Example); Active Level Set Estimation (Luminous Red Galaxies); Active Posterior Estimation (Type Ia Supernova); Application Specific Goal (Electrolyte Design). Baselines shown: Chaudhuri et al. '15, Gotovos et al. '13, Kandasamy et al. '15.]
Willie, Reed, Akshay, Jeff, Barnabás
Code: github.com/kirthevasank/mps
Poster: #262