
Meta Reinforcement Learning, Chelsea Finn



  1. Meta Reinforcement Learning
     Chelsea Finn

  2. Why are humans so good at RL?
     People have prior experience.
     People have an existing representation of the world.
     Can we learn a representation under which RL is fast?
     Key idea: explicitly optimize for such a representation.
     “Learn how to reinforcement learn”

  3. Outline
     Meta-RL Problem Formulation & Examples
     Method Classes: Recurrent Models, Gradient-Based Models
     Challenges & Latest Developments

  4. The Meta-Learning Problem
     Supervised Learning: Inputs, Outputs, Data (formulas not recovered)
     Meta Supervised Learning: Inputs, Outputs, Data (formulas not recovered)
     Why is this view useful? It reduces the problem to the design & optimization of f.
     Finn & Levine. Meta-learning and Universality: Deep Representation… ICLR 2018

  5. Example: Few-Shot Classification
     Given 1 example of 5 classes: classify new examples.
     [Figure: training data and test set for the new task; meta-training over training classes.]
     Diagram adapted from Ravi & Larochelle ’17

  6. Meta-RL Example: Maze Navigation
     Given a small amount of experience, learn to solve the task,
     by learning how to learn many other tasks.
     Diagram adapted from Duan et al. ’17

  7. The Meta Reinforcement Learning Problem
     Reinforcement Learning: Inputs, Outputs, Data (formulas not recovered)
     Meta Reinforcement Learning: Inputs, Outputs, Data (formulas not recovered)
     The training input is k rollouts from (symbol not recovered).
     The data is a dataset of datasets, one collected for each task.
     Design & optimization of f *and* collecting appropriate data (learning to explore).
     Finn. Learning to Learn with Gradients. PhD Thesis 2018

  8. Meta-RL Example: Maze Navigation
     [Meta] test time: given a small amount of experience, learn to solve the task,
     [meta] train time: by learning how to learn many other tasks (the meta-training tasks).
     Diagram adapted from Duan et al. ’17

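  9. The Meta Reinforcement Learning Problem
     Meta Reinforcement Learning, Episodic Variant:
     Inputs: k rollouts from (symbol not recovered). Outputs: (formula not recovered)
     Meta Reinforcement Learning, Online Variant:
     Inputs: 1…k timesteps from (symbol not recovered). Outputs: (formula not recovered)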

  10. Outline
     Meta-RL Problem Formulation & Examples
     Method Classes: Recurrent Models, Gradient-Based Models
     Challenges & Latest Developments
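
The episodic variant of the meta-RL problem described on the slides (a function f that receives k rollouts collected on a task and must then act well on that task) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the multi-armed bandit task distribution, the names `sample_task`, `collect_rollouts`, and `f_episodic`, and the uniform-random exploration policy are hypothetical and do not come from the slides. In particular, `f_episodic` is a hand-written rule standing in for the learned f, so the sketch shows only the input/output interface, not a meta-learning method.

```python
import random
from typing import List, Tuple

# One rollout is a list of (action, reward) pairs. The bandit has no state,
# which keeps the sketch short; a real meta-RL task would also record states.
Rollout = List[Tuple[int, float]]


def sample_task(rng: random.Random, n_arms: int = 3) -> List[float]:
    """Hypothetical task distribution: each task is a bandit whose arms
    pay reward 1 with a task-specific probability."""
    return [rng.random() for _ in range(n_arms)]


def collect_rollouts(task: List[float], k: int, horizon: int,
                     rng: random.Random) -> List[Rollout]:
    """Collect k rollouts on one task with a uniform-random exploration
    policy (an assumption; the slides treat exploration as something to learn)."""
    rollouts = []
    for _ in range(k):
        rollout = []
        for _ in range(horizon):
            action = rng.randrange(len(task))
            reward = 1.0 if rng.random() < task[action] else 0.0
            rollout.append((action, reward))
        rollouts.append(rollout)
    return rollouts


def f_episodic(d_train: List[Rollout], n_arms: int) -> int:
    """Episodic interface: map the k training rollouts to an action.
    Here f is a fixed rule (pick the arm with the best empirical mean);
    in meta-RL this mapping would be optimized across many tasks."""
    totals = [0.0] * n_arms
    counts = [0] * n_arms
    for rollout in d_train:
        for action, reward in rollout:
            totals[action] += reward
            counts[action] += 1
    # Arms that were never pulled get mean 0.0.
    means = [totals[a] / counts[a] if counts[a] else 0.0 for a in range(n_arms)]
    return max(range(n_arms), key=lambda a: means[a])


if __name__ == "__main__":
    rng = random.Random(0)
    task = sample_task(rng)
    d_train = collect_rollouts(task, k=5, horizon=10, rng=rng)
    print("chosen arm:", f_episodic(d_train, n_arms=len(task)))
```

An online variant under the same assumptions would call f after every timestep with the 1…k transitions seen so far, rather than once after k complete rollouts.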
