Markov Decision Process



  3. Markov Decision Process. [Figure: agent-environment loop. The agent receives state s_k and reward r_k and sends action a_k to the environment, which returns reward r_{k+1} and state s_{k+1}.]
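
The loop in the figure can be sketched in a few lines of Python. This is only an illustration: `TwoStateEnv`, its reward rule, and the `go_to_state_one` policy are hypothetical and do not come from the slides. The sketch shows the agent choosing a_k from s_k and the environment answering with r_{k+1} and s_{k+1}.

```python
class TwoStateEnv:
    """Hypothetical toy environment: states 0 and 1, actions 0 (stay) and 1 (switch)."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        # Action 1 flips the state; action 0 leaves it unchanged.
        if action == 1:
            self.state = 1 - self.state
        # Assumed reward rule: 1.0 whenever the new state is 1, else 0.0.
        reward = 1.0 if self.state == 1 else 0.0
        return self.state, reward


def run_episode(env, policy, steps):
    """Run the agent-environment loop and record (s_k, a_k, r_{k+1}, s_{k+1})."""
    s = env.state
    total = 0.0
    trajectory = []
    for _ in range(steps):
        a = policy(s)                 # agent: choose a_k from s_k
        s_next, r_next = env.step(a)  # environment: return s_{k+1}, r_{k+1}
        trajectory.append((s, a, r_next, s_next))
        total += r_next
        s = s_next
    return total, trajectory


def go_to_state_one(s):
    """Hypothetical fixed policy: switch when in state 0, stay when in state 1."""
    return 1 if s == 0 else 0


total, trajectory = run_episode(TwoStateEnv(), go_to_state_one, steps=5)
print(total)          # 5.0
print(trajectory[0])  # (0, 1, 1.0, 1)
```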


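  9. [Figure: actor-critic agent. Inside the agent, the critic (value function) receives state s_k and reward r_k and passes a TD error to the actor (policy), which sends action a_k to the environment. The environment returns reward r_{k+1} and state s_{k+1}.]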
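
A minimal tabular sketch of one actor-critic update, assuming the common form of the TD error, delta = r_{k+1} + gamma * V(s_{k+1}) - V(s_k). The slide shows only the block diagram, so the softmax policy over action preferences, the step sizes `alpha` and `beta`, and the discount `gamma` are assumptions made for illustration.

```python
import math


def softmax(preferences):
    """Turn the actor's action preferences into action probabilities."""
    m = max(preferences)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in preferences]
    z = sum(exps)
    return [e / z for e in exps]


def actor_critic_step(V, prefs, s, a, r_next, s_next,
                      alpha=0.5, beta=0.1, gamma=0.9):
    """Apply one actor-critic update in place and return the TD error."""
    # Critic: TD error from the observed transition.
    delta = r_next + gamma * V[s_next] - V[s]
    # Critic update: move V(s) toward the one-step target.
    V[s] += alpha * delta
    # Actor update (assumed preference rule): reinforce a if delta > 0.
    prefs[s][a] += beta * delta
    return delta


V = [0.0, 0.0]                     # value estimates for two states
prefs = [[0.0, 0.0], [0.0, 0.0]]   # preferences for two actions per state
delta = actor_critic_step(V, prefs, s=0, a=1, r_next=1.0, s_next=1)
policy_in_state_0 = softmax(prefs[0])
```

After this single step the TD error is positive, so V(0) rises and the actor makes action 1 slightly more likely in state 0.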


  14. L. Busoniu, R. Babuska, and B. De Schutter, “A comprehensive survey of multiagent reinforcement learning,” IEEE Trans. Systems, Man and Cybernetics-Part C: Applications and Reviews, vol. 38, no. 2, Mar. 2008.


  17. [Figure labels: temporal-difference RL, game theory, direct policy search.] (Busoniu et al., 2008)

  18. Task type (columns) versus agent awareness (rows) (Busoniu et al., 2008):

      | Agent awareness | Cooperative | Competitive | Mixed |
      |---|---|---|---|
      | Independent | Coordination-free | Opponent-independent | Agent-independent |
      | Tracking | Coordination-based | - | Agent-tracking |
      | Aware | Indirect coordination | Opponent-aware | Agent-aware |


  20. [Figure: agents 1 and 2 facing an obstacle; each can move left (L), straight (S), or right (R).] Joint Q-values (Busoniu et al., 2008):

      | Q | L2 | S2 | R2 |
      |---|---|---|---|
      | L1 | 10 | -5 | 0 |
      | S1 | -5 | -10 | -5 |
      | R1 | -10 | -5 | 10 |
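
The Q-table on this slide has two equally good joint actions, which is what makes coordination necessary. The sketch below, using the values as read from the slide, finds both optima and shows what happens if each agent picks its part from a different one. The shared tie-breaking order in `choose_by_convention` is an assumption added for illustration, not a method stated on the slide.

```python
ACTIONS = ["L", "S", "R"]

# Joint Q-values Q[(a1, a2)], values as read from the slide's table.
Q = {
    ("L", "L"): 10, ("L", "S"): -5,  ("L", "R"): 0,
    ("S", "L"): -5, ("S", "S"): -10, ("S", "R"): -5,
    ("R", "L"): -10, ("R", "S"): -5, ("R", "R"): 10,
}


def optimal_joint_actions(q):
    """Return every joint action that attains the maximum Q-value."""
    best = max(q.values())
    return [joint for joint, value in q.items() if value == best]


def choose_by_convention(q, order=ACTIONS):
    """Break ties with an ordering both agents share (assumed convention)."""
    return min(
        optimal_joint_actions(q),
        key=lambda joint: (order.index(joint[0]), order.index(joint[1])),
    )


optima = optimal_joint_actions(Q)
print(optima)  # [('L', 'L'), ('R', 'R')]

# Agent 1 follows the first optimum, agent 2 the second: a miscoordination.
miscoordinated = (optima[0][0], optima[1][1])
print(miscoordinated, Q[miscoordinated])  # ('L', 'R') 0

agreed = choose_by_convention(Q)
print(agreed, Q[agreed])  # ('L', 'L') 10
```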

  21. C. Guestrin, M. Lagoudakis, and R. Parr, “Coordinated reinforcement learning,” in Proc. Int’l Conf. Machine Learning (ICML-02), Jul. 2002. [Figure: graph of four agents (1 to 4) labeled with Q1, Q2, Q3, Q4 and f4.]


  24. [Figure: agents 1 and 2, each able to move left (L) or right (R).] Q-values for each agent (Busoniu et al., 2008):

      | Q1 | L2 | R2 |
      |---|---|---|
      | L1 | 0 | 1 |
      | R1 | -10 | 10 |

      | Q2 | L2 | R2 |
      |---|---|---|
      | L1 | 0 | -1 |
      | R1 | 10 | -10 |
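
In these tables Q2 is the negative of Q1, so the game is zero-sum and agent 1 can reason about its worst case. The sketch below computes a maximin action over pure actions only, using the Q1 values as read from the slide. Restricting to pure actions is a simplification made here; minimax solutions in general may need mixed strategies.

```python
# Agent 1's Q-values Q1[a1][a2], values as read from the slide's table.
Q1 = {
    "L1": {"L2": 0, "R2": 1},
    "R1": {"L2": -10, "R2": 10},
}


def maximin_action(q):
    """Pick the action whose worst-case value over opponent actions is largest."""
    worst_case = {a1: min(row.values()) for a1, row in q.items()}
    best = max(worst_case, key=worst_case.get)
    return best, worst_case[best]


action, value = maximin_action(Q1)
print(action, value)  # L1 0
```

Moving right could earn 10 but risks -10 if the opponent moves left, while moving left guarantees at least 0.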

  25. M. L. Littman, “Markov games as a framework for multi-agent reinforcement learning,” in Proc. Int’l Conf. Machine Learning (ICML-94), Jul. 1994.


  27. [Figure: agents 1 and 2 choosing between a left room and a right room; each can move left (L) or right (R).] Q-values for each agent (Busoniu et al., 2008):

      | Q1 | L2 | R2 |
      |---|---|---|
      | L1 | 0 | 3 |
      | R1 | 2 | 0 |

      | Q2 | L2 | R2 |
      |---|---|---|
      | L1 | 0 | 2 |
      | R1 | 3 | 0 |
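
For a general-sum game like this one, a natural check is which joint actions are pure-strategy Nash equilibria, meaning neither agent gains by deviating alone. The sketch below enumerates them by brute force from the two tables as read from the slide. The function name and the restriction to pure strategies are choices made for this illustration.

```python
ACTIONS_1 = ["L1", "R1"]
ACTIONS_2 = ["L2", "R2"]

# Q-values indexed by joint action (a1, a2), as read from the slide's tables.
Q1 = {("L1", "L2"): 0, ("L1", "R2"): 3, ("R1", "L2"): 2, ("R1", "R2"): 0}
Q2 = {("L1", "L2"): 0, ("L1", "R2"): 2, ("R1", "L2"): 3, ("R1", "R2"): 0}


def pure_nash_equilibria(q1, q2):
    """Return joint actions where no agent can improve by deviating alone."""
    equilibria = []
    for a1 in ACTIONS_1:
        for a2 in ACTIONS_2:
            # Agent 1 cannot do better against a2 by switching its own action.
            best_for_1 = all(q1[(a1, a2)] >= q1[(b1, a2)] for b1 in ACTIONS_1)
            # Agent 2 cannot do better against a1 by switching its own action.
            best_for_2 = all(q2[(a1, a2)] >= q2[(a1, b2)] for b2 in ACTIONS_2)
            if best_for_1 and best_for_2:
                equilibria.append((a1, a2))
    return equilibria


equilibria = pure_nash_equilibria(Q1, Q2)
print(equilibria)  # [('L1', 'R2'), ('R1', 'L2')]
```

Both equilibria put the agents in different rooms, but agent 1 does better in the first and agent 2 in the second, so the agents still have to agree on which one to play.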
