10703 Deep Reinforcement Learning
Reinforcement Learning in Humans and Animals
Tom Mitchell, October 29, 2018
Reading: Barto & Sutton, Chapter 15
Tom Mitchell, October 2018

Outline
• RL in primates
• RL in humans
• Error signals and predictive coding

Reward based learning in primates
Dopamine As Reward Signal [Schultz et al., Science, 1997]
$\text{error} = r_t + \gamma V(s_{t+1}) - V(s_t)$
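The error signal above can be illustrated with a minimal TD(0) sketch of a cue-then-reward trial. Everything in it is an assumption made for illustration, not something taken from the slides or from Schultz et al.: the function name `run_trial`, the eight-step trial, the learning rate, the discount factor, and the choice to fix the pre-cue value at 0 (the cue is treated as unpredictable). Under these assumptions the error starts out at the reward, moves to the cue after training, and goes negative when an expected reward is omitted.

```python
def run_trial(values, reward_step, alpha=0.3, gamma=1.0, reward=1.0, learn=True):
    """Run one cue -> reward trial and return the TD error at every step.

    values[t] is the predicted future reward t steps after cue onset.
    The pre-cue state is assumed to have value 0 (the cue arrives
    unpredictably), so errors[0] is the response to the cue itself and
    errors[t + 1] is the error for the transition out of step t.
    """
    n = len(values)
    errors = [gamma * values[0] - 0.0]  # cue onset: no reward, V(pre-cue) = 0
    for t in range(n):
        r = reward if t == reward_step else 0.0
        v_next = values[t + 1] if t + 1 < n else 0.0  # value after the trial ends is 0
        error = r + gamma * v_next - values[t]        # error = r_t + gamma V(s_{t+1}) - V(s_t)
        if learn:
            values[t] += alpha * error                # move V(s_t) toward its target
        errors.append(error)
    return errors


REWARD_STEP = 5            # hypothetical: reward arrives 5 steps after the cue
values = [0.0] * 8         # hypothetical trial length of 8 steps

first = run_trial(values, REWARD_STEP)                  # naive: error at the reward
for _ in range(500):
    run_trial(values, REWARD_STEP)                      # training trials
trained = run_trial(values, REWARD_STEP, learn=False)   # error has moved to the cue
omitted = run_trial(values, REWARD_STEP, reward=0.0, learn=False)  # reward withheld

print("first trial, error at reward:", round(first[REWARD_STEP + 1], 3))
print("trained, error at cue:", round(trained[0], 3))
print("trained, error at reward:", round(trained[REWARD_STEP + 1], 3))
print("reward omitted, error at reward time:", round(omitted[REWARD_STEP + 1], 3))
```

The discount factor is set to 1 here only so that the limiting values are easy to read; with a smaller discount the cue response would be scaled down accordingly.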
Reward based learning in humans
RL Models for Human Learning [Seymour et al., Nature 2004]
One Theory of RL in the Brain, from [Nieuwenhuis et al.]
• Basal ganglia monitor events and predict future rewards
• When the prediction is revised upward (downward), this causes an increase (decrease) in the activity of midbrain dopaminergic neurons, influencing ACC
• This dopamine-based activation somehow results in revising the reward prediction function, possibly through direct influence on the basal ganglia and via prefrontal cortex

Neuron Level Learning Mechanisms
• Hebbian learning
  – fire together, wire together
• Spike Timing Dependent Plasticity (STDP)
  – if the incoming neuron fires before the outgoing neuron, then strengthen the connection
  – if the incoming neuron fires after the outgoing neuron, then weaken the connection
• Reward modulated STDP
  – less understood
  – in some neurons, STDP appears to occur only if neuromodulator (e.g., dopamine) activity follows firing within up to 10 sec

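The STDP rules listed above can be sketched in a few lines. This is an illustrative toy, not a model from the lecture: the slide gives only the sign of the weight change and the 10 sec window, so the exponential shape of the timing curve, the amplitudes `a_plus` and `a_minus`, the time constant `tau`, and both function names are assumptions.

```python
import math


def stdp_delta(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=0.02):
    """Weight change for one pre/post spike pair (times in seconds).

    Pre before post strengthens the connection; pre after post weakens it.
    The exponential decay with |dt| is an assumed, commonly used shape.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        return -a_minus * math.exp(dt / tau)
    return 0.0


def reward_modulated_delta(t_pre, t_post, t_dopamine, window=10.0):
    """Apply the STDP change only if dopamine follows the spike pair in time.

    t_dopamine is the time of neuromodulator activity, or None if there was
    none. The change is gated by a hard window (up to 10 sec after firing).
    """
    t_pair = max(t_pre, t_post)  # time at which the pair is complete
    if t_dopamine is None or not (0.0 <= t_dopamine - t_pair <= window):
        return 0.0
    return stdp_delta(t_pre, t_post)


print(stdp_delta(0.000, 0.010) > 0)                 # pre leads post: strengthen
print(stdp_delta(0.010, 0.000) < 0)                 # pre lags post: weaken
print(reward_modulated_delta(0.000, 0.010, 5.0))    # dopamine within 10 sec: change applied
print(reward_modulated_delta(0.000, 0.010, 20.0))   # dopamine too late: no change
```

A hard on/off window is the simplest reading of the slide; a decaying eligibility trace would be a natural alternative, but that goes beyond what the slide states.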
Summary: Temporal Difference ML Model Predicts Dopaminergic Neuron Activity during Learning
• Evidence now of neural reward signals from
  – Direct neural recordings in monkeys
  – fMRI in humans (1 mm spatial resolution)
  – EEG in humans (1-10 msec temporal resolution)
• Dopaminergic responses encode Temporal Difference error
• Some differences, and efforts to refine the model
  – How/where is the value function encoded in the brain?
  – Study timing (e.g., does the basal ganglia learn faster than PFC?)
  – Role of prior knowledge, rehearsal of experience, multi-task learning?

Predictive Coding
[Rao & Ballard, Nature, 1999]
