Solving Dynamic Bandit Problems and Decentralized Games using the Kalman Bayesian Learning Automaton
By Stian Berg
Supervisor: Ole-Christoffer Granmo, University of Agder
Introduction
• Thesis topic: Evaluation of a novel approach to dynamic bandit problems
• Bandit problem example: Link relevance
Stationary bandit problem
Dynamic bandit problem
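As a rough illustration of the difference between the two settings, the sketch below simulates Gaussian rewards for a set of options. In the stationary case the true mean of each option is fixed; in the dynamic case each mean drifts over time. The class name `GaussianBandit`, the random-walk drift model and all parameter values are assumptions made for this example and are not taken from the thesis.

```python
import random


class GaussianBandit:
    """Bandit with Gaussian rewards; means drift if drift_std > 0 (assumed model)."""

    def __init__(self, means, noise_std=1.0, drift_std=0.0, seed=0):
        self.means = list(means)
        self.noise_std = noise_std   # observation noise on each reward
        self.drift_std = drift_std   # 0.0 gives a stationary problem
        self.rng = random.Random(seed)

    def pull(self, arm):
        """Return a noisy reward for `arm`, then let every mean drift one step."""
        reward = self.rng.gauss(self.means[arm], self.noise_std)
        if self.drift_std > 0.0:
            self.means = [m + self.rng.gauss(0.0, self.drift_std) for m in self.means]
        return reward


stationary = GaussianBandit([0.0, 1.0], drift_std=0.0)
dynamic = GaussianBandit([0.0, 1.0], drift_std=0.1)
for _ in range(100):
    stationary.pull(0)
    dynamic.pull(0)

print("stationary means:", stationary.means)  # unchanged
print("dynamic means:", dynamic.means)        # have wandered away from the start
```

In the dynamic case an option that was best early on may stop being best, so a learner has to keep re-checking options it has already tried.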
The Kalman Bayesian Learning Automaton (KBLA)
• Kalman filtering
  • Position tracking
  • Robot navigation
  • Electronic equipment
  • Stock estimation
  • Forecasting
  • Computer vision
• KBLA
  • Kalman filtering adapted to work in a bandit setting
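One way to picture "Kalman filtering adapted to work in a bandit setting" is sketched below: each option gets its own scalar Kalman filter tracking a Gaussian belief about its current mean reward, and the option to play is chosen by drawing one sample from each belief and taking the largest. This is a minimal sketch under stated assumptions, not necessarily the exact formulation used in the thesis. The class name `KalmanBanditSketch`, the random-walk reward model, and the values of `obs_var`, `drift_var`, `prior_mean` and `prior_var` are all hypothetical choices for the example.

```python
import random


class KalmanBanditSketch:
    """Per-arm scalar Kalman filter with posterior sampling (illustrative only)."""

    def __init__(self, n_arms, obs_var=1.0, drift_var=0.01,
                 prior_mean=0.0, prior_var=100.0, seed=0):
        self.mean = [prior_mean] * n_arms
        self.var = [prior_var] * n_arms
        self.obs_var = obs_var      # assumed variance of the reward noise
        self.drift_var = drift_var  # assumed variance of the per-step drift
        self.rng = random.Random(seed)

    def select(self):
        """Draw one sample per arm from its Gaussian belief; play the largest."""
        samples = [self.rng.gauss(m, v ** 0.5) for m, v in zip(self.mean, self.var)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        """Predict step for every arm, then correct step for the played arm."""
        # Predict: every mean may have drifted, so every belief gets wider.
        self.var = [v + self.drift_var for v in self.var]
        # Correct: standard scalar Kalman update for the arm that was observed.
        gain = self.var[arm] / (self.var[arm] + self.obs_var)
        self.mean[arm] += gain * (reward - self.mean[arm])
        self.var[arm] *= (1.0 - gain)


# Toy run against two options whose true means drift as a random walk.
env_rng = random.Random(1)
true_means = [0.0, 2.0]
agent = KalmanBanditSketch(n_arms=2)
pulls = [0, 0]
for _ in range(2000):
    arm = agent.select()
    reward = env_rng.gauss(true_means[arm], 1.0)
    agent.update(arm, reward)
    pulls[arm] += 1
    true_means = [m + env_rng.gauss(0.0, 0.1) for m in true_means]

print("pulls per option:", pulls)
```

Because the variance of an unplayed option keeps growing, its samples eventually become large enough for it to be tried again, which is what lets this kind of scheme follow changing rewards. The two variance parameters control how quickly that happens, and they have to be chosen by the user.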
Summary of results
• Among the top performers in all experiments
• Scaled rather well with the number of options
• Could handle various types of feedback
However:
• May need significant tuning for good performance
Conclusion
• Empirical evaluation of the KBLA
  • Performance
  • Scalability
  • Robustness
• Overall we believe this is a very promising approach
• Further work
  • Parameter problem
  • Combining ideas from other bandit algorithms