Michael J. Frank Laboratory for Neural Computation and Cognition - PowerPoint PPT Presentation

Clustering and generalization of abstract structures in reinforcement learning
Michael J. Frank
Laboratory for Neural Computation and Cognition, Brown University

Reinforcement learning in neural nets and AI
Mnih et al, 2015, Nature

But nets show failure to transfer learned knowledge
[Figure panels: Breakout (trained on); Offset Paddle Breakout. Agent: Asynchronous Advantage Actor-Critic (A3C)]
Kansky et al, 2017. See also Witty et al 2018, arXiv

What can we learn from limitations of models and humans?
Trade-offs
• Limited WM capacity (curse of dimensionality)
• Multi-tasking (shared representations enhance learning & generalization)
• Robustness to task contingencies (OpAL vs RL)
• (Catastrophic) interference in episodic memory
• Unsupervised (Hebbian) vs supervised
• Motor learning: hierarchical structure
• In defense of “small” problems:
  - need to understand key elements
  - link to neural data / experiments
  - Theory of Everything before Theory of Anything!

Why does motor learning develop so slowly in humans?
• Standard story: infants born early due to large head, small birth canal
• ‘Fourth trimester’
• But 3 month old infants are still pretty incompetent (from babycenter.com): ‘you no longer need to support his head. When he’s on his stomach he can lift his head and chest. He can open and close his hands..’
• Hypothesis: human brain is wired to discover latent generalizable structure, which is initially inefficient (see Werchan et al 2016!)

Humans learn contextualized rule structures
[Figure labels: driving rules, UK; driving rules, Montreal; and…]

A key structure: Task-sets (TS)
[Diagram: Cue 1 links stimuli to actions]

Task-sets (TS)
[Diagram: context C1 with stimulus-action pairs S1-A1, S2-A2, S3-A3]

Task-sets (TS)
[Diagram: contexts C1 to C6, each with its own stimulus-action pairs S_i1-A_i1 through S_i6-A_i6]

Abstracting Task-set rules
Latent task-set space
[Diagram: contexts C1 to C6 clustered onto latent task-sets TS1 (S_i1-A_i1) and TS2 (S_i2-A_i2)]
Collins & Frank 2013

Popularity Prior on Task-set rules
[Diagram: contexts C1 to C6 clustered onto latent task-sets TS1 (S_i1-A_i1) and TS2 (S_i2-A_i2); new context C7]
CRP prior on TS in a new context (C7):

$$P_0(TS = TS_j \mid C_{new}) = \frac{N(TS_j \mid C^*)}{\alpha + \sum_i N(TS_i \mid C^*)}$$

$$P_0(TS = \mathrm{new} \mid C_{new}) = \frac{\alpha}{\alpha + \sum_i N(TS_i \mid C^*)}$$

Collins & Frank 2013
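
The CRP prior above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `crp_prior` and the example counts are hypothetical, and N(TS_i | C*) is read here as the number of previously seen contexts assigned to task-set i.

```python
def crp_prior(counts, alpha):
    """Chinese Restaurant Process prior over task-sets for a new context.

    counts: list where counts[i] stands for N(TS_i | C*), assumed here to be
            the number of known contexts clustered on task-set i.
    alpha:  concentration parameter (> 0); larger values favor new task-sets.
    Returns (probabilities of reusing each existing task-set,
             probability of creating a new task-set).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = alpha + sum(counts)
    existing = [n / total for n in counts]
    new = alpha / total
    return existing, new


# Hypothetical example: TS1 used in 2 contexts, TS2 used in 4 contexts.
existing, new = crp_prior([2, 4], alpha=1.0)
print(existing, new)
```

More popular task-sets receive a higher prior in the new context, and the remaining mass (proportional to alpha) is reserved for a task-set that has not been seen yet.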

Ability to create new Task-set rules
Latent task-set space: unknown size
[Diagram: contexts C1 to C6 clustered onto TS1 (S_i1-A_i1) and TS2 (S_i2-A_i2); new context C7 assigned to TS new (S_i-A_i)]
Collins & Frank 2013

Linking algorithmic model and neural network model
[Diagram: CTS-model alongside Neural Network-model; labels TSi, Ai, BG, DA]
Both models are approximations of the same process: TS space building
Collins & Frank, Psych Review, 2013

Clustering vs partitioning task space in frontostriatal circuits via RL
[Figure labels: Old TS; New TS; generalization & transfer; RL; fitted clustering prior; C-PFC sparseness]
Model mimicry: C-TS and hierarchical neural net are approximations of same structure building process
Collins & Frank 2013; 2016; Frank & Badre, 2012

Vector reward prediction errors: “actor-specific” computations
[Figure labels: “Mixture of Experts”; hierarchical task; mixture; flat task]
• DA signals are tailored to computations of underlying FC-BG circuit
  - “Mixture of Experts” (Frank & Badre 2012; fMRI: Badre & Frank 2012; Collins & Frank 2013 …)
  - Vector RPEs

Appending to latent task structures: extrapolating beyond the identity mapping

        Initial phase    Transfer phase 1
        S1    S2         S3    S4
  C0    A1    A2         A1    A4
  C1    A1    A2         A1    A4
  C2    A3    A4         A3    A2

C3, C4: listed on the slide without mappings

[Diagram: contexts C0, C1, C2 (labeled 1/4, 1/4, 1/2) linked to TS1 and TS2; actions A1 to A4]
[Figure: proportion correct vs. trial # per input pattern for C0, C1 vs. C2, in the initial phase and in phase 2; panels for subjects (N=34) and model]

Can subjects generalize learned rules to new contexts?
[Diagram: contexts C0 to C4 linked to task-sets TS1, TS2, TS3]

        Initial phase    Transfer phase 1    Transfer phase 2
        S1, S2           S3, S4
  C0    TS1              TS1
  C1    TS1              TS1
  C2    TS2              TS2
  C3                                         TS old
  C4                                         TS new

[Diagram: contexts C0 to C4 linked to TS1, TS2, TS3; actions A1 to A4]
[Figure: proportion correct vs. trial # per input pattern for C3 (TS old) vs. C4 (TS new); panels for subjects (N = 34) and model]

Prediction error: PE = reward - expectation
[Diagram: contexts C0, C1, C2 linked to TS1 and TS2; actions A1 to A4; feedback sequences labeled Correct, Correct, PE, with one PE marked as structure learning PE]
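
As a small illustration of the prediction error defined on this slide, the sketch below computes PE = reward - expectation and uses it in a delta-rule update. The update rule, the learning rate, and the function names are assumptions for illustration (a standard RL convention); the slide itself only states the PE definition.

```python
def prediction_error(reward, expectation):
    """PE = reward - expectation, as stated on the slide."""
    return reward - expectation


def delta_update(expectation, reward, learning_rate=0.1):
    """Assumed delta-rule update: move the expectation toward the reward.

    Returns (updated expectation, prediction error).
    """
    pe = prediction_error(reward, expectation)
    return expectation + learning_rate * pe, pe


# Hypothetical trial: expected reward 0.5, reward received 1.0.
new_expectation, pe = delta_update(0.5, 1.0, learning_rate=0.2)
print(new_expectation, pe)
```

A positive PE raises the expectation and a negative PE lowers it; repeated correct feedback therefore drives the PE toward zero.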

Prediction error (PE) in EEG signal; structure PE in EEG signal
For each subject: EEG(trial) ~ β_0 + β_PE · PE(trial) + β_Str · StructurePE(trial)
This yields β_PE(electrodes, time) and β_Str(electrodes, time).
[Figure: average β_PE (PE effect) and unique effect of structure learning PE vs. time from feedback (ms), in ROI1 and ROI2; significance markers: ** * ns ** *]
Collins & Frank (2016), Cognition
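
The trial-wise regression above can be sketched with ordinary least squares for a single subject, electrode, and time point. This is an assumed, simplified setup on synthetic data: the function name `fit_trial_regression` and the generated regressors are hypothetical, and the original analysis may differ in preprocessing and estimation details.

```python
import numpy as np


def fit_trial_regression(eeg, pe, structure_pe):
    """Fit EEG(trial) ~ b0 + b_PE * PE(trial) + b_Str * StructurePE(trial).

    All inputs are 1-D arrays with one value per trial, for one electrode
    and one time point. Returns the array [b0, b_PE, b_Str].
    """
    design = np.column_stack([np.ones_like(pe), pe, structure_pe])
    betas, *_ = np.linalg.lstsq(design, eeg, rcond=None)
    return betas


# Synthetic, noise-free example with known coefficients (illustration only).
rng = np.random.default_rng(0)
pe = rng.normal(size=200)
structure_pe = rng.normal(size=200)
eeg = 0.5 + 2.0 * pe - 1.0 * structure_pe

betas = fit_trial_regression(eeg, pe, structure_pe)
print(betas)
```

Repeating the fit over every electrode and time point would give the β_PE(electrodes, time) and β_Str(electrodes, time) maps named on the slide. Because both regressors enter the same model, β_Str reflects the effect of structure PE that is not already explained by PE.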

Structure PE signal predicts transfer.
[Figure: P(Correct) vs. iteration # for C3 (TS old) and C4 (TS new); new-context prior shown as % choices of TS1, TS2, and other actions; unique effect of structure learning PE in ROI1+2]
Collins & Frank, Cognition, accepted

Structure learning
• It affords transfer
• It depends on clustering priors
• It informs neural representations of reward predictions
[Figure thumbnails: proportion correct vs. trial # per input pattern (C0, C1 vs. C2); PE effect and unique effect of structure learning PE]

Neural model & EEG: TS switch effects

• No early clustering benefit: early structure learning is costly
• Structure learning affords transfer of new information within learned clusters
• Structure learning affords transfer of known rules to new contexts, with popularity clustering prior
• Neural signatures of hierarchical prediction errors predict structure learning / transfer: Badre & Frank 2012; Collins et al 2014, 2016

        Initial phase    Transfer phase 1    Transfer phase 2
        S1, S2           S3, S4
  C0    TS1              TS1
  C1    TS1              TS1
  C2    TS2              TS2
  C3                                         TS old
  C4                                         TS new

Do we build structure a priori?
[Figure: new context, Old TS vs. New TS (*); N = 33]
Significant whole group positive transfer
Werchan et al, 2016, JNeurosci

Share: Physical movements (mappings from sounds to notes)
Share: Chord progression, rhythm, etc (desired sound / song)

Need compositionality: reuse flute mappings to play a song usually played on guitar
[Figure label: Piccolo]
