Preference-Based Bayesian Optimization in High Dimensions with Human Feedback
Myra Cheng, Ellen Novoseller, Maegan Tucker, Richard Cheng, Joel Burdick, Yisong Yue
California Institute of Technology
LineCoSpar Algorithm
● Gaussian process-based model of the underlying utilities
● Iteratively update the posterior from preference feedback
● Learn in high dimensions via 1-D subspaces

At every iteration:
● The human user gives pairwise preferences and coactive feedback
● The Bayesian preference model of utilities is updated over the 1-D subspace and visited actions
● Actions are selected via posterior sampling
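The action-selection step can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a unit-box action space, an RBF kernel, and a zero-mean Gaussian over utilities standing in for the preference posterior; the posterior update from pairwise preferences and coactive feedback is omitted. The names `line_through`, `rbf_kernel`, and `sample_action` are hypothetical.

```python
import numpy as np

def line_through(point, lower, upper, rng, n_points=20):
    """Discretize a random line through `point`, restricted to the box [lower, upper]."""
    direction = rng.normal(size=point.shape)
    direction /= np.linalg.norm(direction)
    # Range of step sizes t for which point + t * direction stays inside the box.
    t_a = (lower - point) / direction
    t_b = (upper - point) / direction
    t_min = np.max(np.minimum(t_a, t_b))
    t_max = np.min(np.maximum(t_a, t_b))
    ts = np.linspace(t_min, t_max, n_points)
    # Clip to guard against floating-point overshoot at the endpoints.
    return np.clip(point + ts[:, None] * direction, lower, upper)

def rbf_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel (an assumed choice, not taken from the poster)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def sample_action(candidates, mean, cov, rng):
    """Posterior sampling: draw one utility sample and return its maximizer."""
    jitter = 1e-6 * np.eye(len(candidates))  # numerical stabilizer
    utilities = rng.multivariate_normal(mean, cov + jitter)
    return candidates[np.argmax(utilities)]

rng = np.random.default_rng(0)
dim = 6  # e.g. the 6-D exoskeleton setting
lower, upper = np.zeros(dim), np.ones(dim)
best = np.full(dim, 0.5)                            # placeholder for the current best action
visited = rng.uniform(lower, upper, size=(3, dim))  # placeholder for previously visited actions

# The model is restricted to a 1-D subspace plus the visited actions.
candidates = np.vstack([line_through(best, lower, upper, rng), visited])
mean = np.zeros(len(candidates))         # stand-in for the posterior mean
cov = rbf_kernel(candidates, candidates)  # stand-in for the posterior covariance
action = sample_action(candidates, mean, cov, rng)
```

Restricting the model to points on one line plus the visited actions keeps the covariance matrix small (23 x 23 here) regardless of the dimension of the action space, which is the idea behind learning via 1-D subspaces.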
Validated in User Studies
● Cartpole Simulation (4-D)
● Wearable Exoskeleton (6-D)