Path following with reinforcement learning for autonomous cars

  1. Path following with reinforcement learning for autonomous cars - Mozzam Motiwala (IAS)

  2. Index ● Basics of Reinforcement Learning ● Model Based vs Model Free Reinforcement Learning ● Autonomous Car collision avoidance

  3. What is Reinforcement Learning? ● Learning by trial and error, based only on a reward signal [1] ● Exploration vs. exploitation?
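The exploration vs. exploitation question on slide 3 can be illustrated with a multi-armed bandit. The following is a minimal sketch, not taken from the slides: it assumes an epsilon-greedy rule, Gaussian rewards with unit variance, and hypothetical arm means chosen only for illustration.

```python
import random

def run_bandit(true_means, epsilon=0.1, steps=2000, seed=0):
    """Epsilon-greedy action selection on a Gaussian multi-armed bandit."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms   # running estimate of each arm's mean reward
    counts = [0] * n_arms        # number of times each arm was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                            # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])   # exploit
        reward = rng.gauss(true_means[arm], 1.0)  # assumed reward noise
        counts[arm] += 1
        # Incremental sample-average update of the estimate.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical arm means; the last arm is clearly the best one.
estimates, counts = run_bandit([0.0, 0.5, 3.0])
```

With `epsilon=0` the agent could lock onto the first arm that happens to look good; a small amount of random exploration lets it discover the better arm and then exploit it.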

  4. Markov Decision Process [1] Reward Function? Policy? Optimal Policy? Transition Function?

  5. Some terminology ● Value Function: ● Action Value Function: ● Why a discount factor?
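The formulas on slide 5 did not survive in this transcript. As a hedged reminder, the standard definitions from [1] are given below; the notation ($G_t$, $\gamma$, $v_\pi$, $q_\pi$) follows that textbook and is an assumption about what the slide showed.

```latex
% Discounted return from time step t, with discount factor 0 <= gamma <= 1
G_t = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}

% Value function of policy pi
v_\pi(s) = \mathbb{E}_\pi\left[ G_t \mid S_t = s \right]

% Action value function of policy pi
q_\pi(s, a) = \mathbb{E}_\pi\left[ G_t \mid S_t = s,\ A_t = a \right]
```

Choosing $\gamma < 1$ keeps the return finite in tasks that never terminate and weights near-term rewards more heavily than distant ones.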

  6. Gridworld [1]

  7. Finding Optimal Policy [1]
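Slides 6 and 7 only reference figures from [1]. As a sketch of how an optimal policy can be found in a gridworld, the code below runs value iteration. The 4x4 layout, the two terminal corners, and the reward of -1 per step are assumptions modelled on a textbook example, not details recovered from the slides.

```python
def value_iteration(n=4, gamma=1.0, theta=1e-9):
    """Value iteration on an n x n gridworld with terminal corners and -1 per step."""
    terminals = {(0, 0), (n - 1, n - 1)}
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
    V = {(r, c): 0.0 for r in range(n) for c in range(n)}

    def step(state, move):
        r, c = state[0] + move[0], state[1] + move[1]
        if 0 <= r < n and 0 <= c < n:
            return (r, c)
        return state  # moving off the grid leaves the state unchanged

    while True:
        delta = 0.0
        for s in V:
            if s in terminals:
                continue
            # Bellman optimality backup: best one-step lookahead value.
            best = max(-1.0 + gamma * V[step(s, m)] for m in moves)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(moves, key=lambda m: V[step(s, m)])
              for s in V if s not in terminals}
    return V, policy

V, policy = value_iteration()
```

In this setting each converged value is the negative number of steps to the nearest terminal corner, and the greedy policy moves along a shortest path.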

  8. Cart Pole Balancing Problem

  9. Index ● Basics of Reinforcement Learning ● Model Based vs Model Free Reinforcement Learning ● Autonomous Car collision avoidance

  10. Model-based By a model of the environment we mean anything that an agent can use to predict how the environment will respond to its actions[2].

  11. Example ● What's next? Now let's sample from it to adjust the policy.

  12. Why model-based RL? Reduced number of interactions with the real environment while learning. Advantages? ● Fast ● Needs less data Problems? ● What if the model is wrong? Types: Neural Network Model, Gaussian Process Model, etc.

  13. Model Based+ Model Free [2]
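Slide 13 cites the Dyna architecture [2], which combines direct learning from real experience with planning on a learned model. The following is a minimal tabular Dyna-Q sketch under stated assumptions: the 1-D corridor environment, the hyperparameters, and all function names are hypothetical and chosen only to keep the example small.

```python
import random
from collections import defaultdict

def dyna_q(n_states=6, episodes=30, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a toy corridor: start at state 0, goal at the last state."""
    rng = random.Random(seed)
    actions = (-1, +1)
    Q = defaultdict(float)   # Q[(state, action)]
    model = {}               # learned model: (state, action) -> (reward, next_state)
    goal = n_states - 1

    def env_step(s, a):
        s2 = min(max(s + a, 0), goal)            # walls at both ends
        return (1.0 if s2 == goal else 0.0), s2

    def greedy(s):
        best = max(Q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if Q[(s, a)] == best])

    def update(s, a, r, s2):
        # One-step Q-learning backup; no bootstrapping from the goal state.
        bootstrap = 0.0 if s2 == goal else max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (r + gamma * bootstrap - Q[(s, a)])

    for _ in range(episodes):
        s = 0
        while s != goal:
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2 = env_step(s, a)
            update(s, a, r, s2)                  # model-free: learn from real experience
            model[(s, a)] = (r, s2)              # model learning
            for _ in range(planning_steps):      # model-based: replay simulated experience
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return Q

Q = dyna_q()
```

Setting `planning_steps=0` reduces the sketch to plain Q-learning; increasing it reuses each real transition many times, which is the data-efficiency argument from slide 12.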

  14. Results [1]

  15. Why better result? [1]

  16. Index ● Basics of Reinforcement Learning ● Model Based vs Model Free Reinforcement Learning ● Autonomous Car Collision Avoidance

  17. Application: Autonomous Car ● Why Reinforcement Learning? Problems with traditional methods: ● Slow ● Assumptions Learning in RL: ● Adapting to the environment ● Learning from mistakes

  18. Generalized Computation Graph ● Self-supervised Deep Reinforcement Learning with Generalized Computation Graphs (GCG) for Robot Navigation [3] ● H = 1: Model-Free ● H = N (length of episode): Model-Based

  19. Model Details ● Deep RNN as model ● Model output 1 = current reward ŷ: robot's speed ● Model output 2 = future value-to-go (value of the state) b̂: distance travelled before collision ● Policy evaluation function: ● Policy evaluation by sampling k random action sequences and selecting the one with the maximum reward.
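The last bullet of slide 19 describes choosing actions by sampling k random action sequences and keeping the best-scoring one. The sketch below illustrates that idea only. The slide's policy evaluation function is missing from this transcript, so the scoring used here (discounted predicted rewards over the horizon plus a discounted value-to-go term) is an assumption, and `toy_model` is a hypothetical stand-in for the learned RNN, not the model from [3].

```python
import random

def toy_model(state, actions, gamma=0.9):
    """Hypothetical stand-in for the learned model.

    Returns (predicted per-step rewards, predicted value-to-go after the last step).
    """
    rewards = []
    x = state
    for a in actions:
        x = x + a                    # toy dynamics: 1-D position
        rewards.append(-abs(x))      # toy reward: stay close to the origin
    value_to_go = -abs(x) / (1.0 - gamma)   # crude estimate of future value
    return rewards, value_to_go

def score(state, actions, model, gamma=0.9):
    """Discounted predicted rewards over the horizon plus discounted value-to-go."""
    rewards, value_to_go = model(state, actions)
    horizon = len(actions)
    discounted = sum(gamma ** h * r for h, r in enumerate(rewards))
    return discounted + gamma ** horizon * value_to_go

def plan(state, model, horizon=5, k=200, seed=0):
    """Sample k random action sequences and return the first action of the best one."""
    rng = random.Random(seed)
    candidates = [[rng.choice((-1.0, 0.0, 1.0)) for _ in range(horizon)]
                  for _ in range(k)]
    best = max(candidates, key=lambda seq: score(state, seq, model))
    return best[0], best

action, sequence = plan(3.0, toy_model)
```

Only the first action of the selected sequence is executed; the agent then replans from the next state. With `horizon=1` the score leans almost entirely on the value-to-go term, which mirrors the H = 1 (model-free) end of the spectrum on slide 18.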

  20. GCG : Algorithm [3]

  21. Evaluation and Results [3]

  22. Summary ● Benefits of Reinforcement Learning ● Model-Free vs Model-Based ● Combined approach that subsumes Model-free and Model-based

  23. References 1. R. Sutton and A. Barto, Reinforcement Learning: An Introduction. 2. R. Sutton, “Dyna, an Integrated Architecture for Learning, Planning, and Reacting,” in AAAI, 1991. 3. G. Kahn, A. Villaflor, B. Ding, P. Abbeel, and S. Levine, “Self-Supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation,” in IEEE International Conference on Robotics and Automation, 2018.

  24. Questions?
