


  1. Solving Dynamic Bandit Problems and Decentralized Games using the Kalman Bayesian Learning Automaton
     By Stian Berg
     Supervisor: Ole-Christoffer Granmo, University of Agder

  2. Introduction
     • Thesis topic: Evaluation of a novel approach to dynamic bandit problems
     • Bandit problem example: Link relevance

  3. Stationary bandit problem

  4. Dynamic bandit problem
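The difference between the stationary and dynamic settings named on slides 3 and 4 can be sketched in a few lines. This is only an illustration under stated assumptions: rewards are taken to be Gaussian, and the dynamic case is modelled as a random walk on each option's mean reward. The class name `GaussianBandit` and the parameters `noise_std` and `drift_std` are hypothetical and do not come from the slides.

```python
import random


class GaussianBandit:
    """Toy bandit environment (hypothetical, for illustration only).

    drift_std == 0 gives a stationary problem: the mean rewards never change.
    drift_std > 0 gives a dynamic problem: every mean takes a random-walk
    step after each pull (an assumed model of how the rewards change).
    """

    def __init__(self, means, noise_std=1.0, drift_std=0.0, seed=None):
        self.means = list(means)
        self.noise_std = noise_std
        self.drift_std = drift_std
        self.rng = random.Random(seed)

    def pull(self, arm):
        # Noisy reward drawn around the current true mean of the chosen arm.
        reward = self.rng.gauss(self.means[arm], self.noise_std)
        # In the dynamic case, all true means drift, whether played or not.
        if self.drift_std > 0.0:
            self.means = [m + self.rng.gauss(0.0, self.drift_std)
                          for m in self.means]
        return reward


stationary = GaussianBandit([0.0, 1.0], drift_std=0.0, seed=1)
dynamic = GaussianBandit([0.0, 1.0], drift_std=0.1, seed=1)
for _ in range(100):
    stationary.pull(0)
    dynamic.pull(0)
```

After the loop, `stationary.means` is unchanged while `dynamic.means` has wandered away from its starting values, so an option that was best early on need not remain best.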

  5. The Kalman Bayesian Learning Automaton (KBLA)
     • Kalman filtering
       • Position tracking
       • Robot navigation
       • Electronic equipment
       • Stock estimation
       • Forecasting
       • Computer vision
     • KBLA
       • Kalman filtering adapted to work in a bandit setting
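One way to picture "Kalman filtering adapted to work in a bandit setting" is the minimal sketch below. It is not the thesis implementation. It assumes a scalar Kalman filter per option, a random-walk model for the true mean rewards, Gaussian observation noise, and option selection by drawing one sample from each option's current estimate and playing the largest. The class name `KalmanBanditSketch` and the parameters `obs_var`, `trans_var`, `prior_mean` and `prior_var` are hypothetical names chosen for illustration.

```python
import random


class KalmanBanditSketch:
    """Illustrative Kalman-filter bandit learner (not the thesis code)."""

    def __init__(self, n_arms, obs_var=1.0, trans_var=0.01,
                 prior_mean=0.0, prior_var=100.0, seed=None):
        self.obs_var = obs_var      # assumed variance of the reward noise
        self.trans_var = trans_var  # assumed variance of the per-step drift
        self.means = [prior_mean] * n_arms
        self.variances = [prior_var] * n_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Draw one sample per arm from its Gaussian estimate, play the largest.
        samples = [self.rng.gauss(m, v ** 0.5)
                   for m, v in zip(self.means, self.variances)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        for i in range(len(self.means)):
            # Predict step: uncertainty grows for every arm, because the
            # true means may have drifted since the last time step.
            predicted_var = self.variances[i] + self.trans_var
            if i == arm:
                # Correct step: standard scalar Kalman update for the
                # arm that was actually played.
                gain = predicted_var / (predicted_var + self.obs_var)
                self.means[i] += gain * (reward - self.means[i])
                self.variances[i] = (1.0 - gain) * predicted_var
            else:
                self.variances[i] = predicted_var


learner = KalmanBanditSketch(n_arms=2, obs_var=1.0, trans_var=0.5,
                             prior_mean=0.0, prior_var=0.5, seed=0)
learner.update(arm=0, reward=2.0)
chosen = learner.select_arm()
```

In this sketch the two variance parameters must be supplied by the user, and the growing variance of unplayed options is what keeps the learner revisiting them when rewards may have changed.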

  6. Summary of results
     • Among the top performers in all experiments
     • Scaled rather well with the number of options
     • Could handle various types of feedback
     However:
     • May need significant tuning for good performance

  7. Conclusion
     • Empirical evaluation of the KBLA
       • Performance
       • Scalability
       • Robustness
     • Overall we believe this is a very promising approach
     • Further work
       • Parameter problem
       • Combining ideas from other bandit algorithms
