ai basics
play

AI Basics Heechul Yun Acknowledgement: Many slides are adopted from - PowerPoint PPT Presentation

AI Basics Heechul Yun Acknowledgement: Many slides are adopted from Berkeleys CS188 AI slide deck. Markov Decision Process (MDP) In MDP, Markov means action outcomes depend only on the current state Andrey Markov (1856-1922)


  1. AI Basics Heechul Yun Acknowledgement: Many slides are adopted from Berkeley’s CS188 AI slide deck.

  2. Markov Decision Process (MDP) • In MDP, “Markov” means action outcomes depend only on the current state Andrey Markov (1856-1922)

  3. Markov Decision Process (MDP) • Markov decision process s – Set of states S a – Start state s 0 – Set of actions A s, a – Transitions P(s’ |s,a) (or T(s,a,s ’)) – Rewards R(s,a,s ’ ) (and discount  ) s,a,s ’ s ’ • Policy – Choice of action for each state • Utility – Sum of (discounted) rewards

  4. Q-Learning • Q-Learning: sample-based Q-value iteration • Learn Q(s,a) values as you go – Receive a sample ( s,a,s’,r ) – Consider your old estimate: – Consider your new sample estimate: – Incorporate the new estimate into a running average:

  5. Q-learning • Problems – Q(s,a) table grows exponentially – Cannot deal with complex problems

  6. A Neuron Image credit: Lex Fridman, MIT

  7. Neural Network • Regular neural network • Convolutional neural network

  8. DQN • Q function is estimated with a neural network • Plus a few “tricks” – Experience replay to improve robustness

  9. DQN Implementations • Plenty on the web

  10. OpenAI Gym https://gym.openai.com/

  11. OpenAI Gym • Easy record & replay [link] https://gym.openai.com/

  12. OpenAI Universe https://universe.openai.com/

  13. Udacity Self-Driving Car Simulator [link]

  14. Resources • OpenAI – https://gym.openai.com/ – https://universe.openai.com/ • Udacity self-driving car simul – https://github.com/udacity/self-driving-car-sim • AI lectures for autonomous systems – http://ai.berkeley.edu/ – http://rll.berkeley.edu/deeprlcourse/ – http://selfdrivingcars.mit.edu/ – David Silver, Tutorial: Deep Reinforcement Learning

Recommend


More recommend