fundamentals of computational neuroscience 2e

Fundamentals of Computational Neuroscience 2e December 31, 2009 - PowerPoint PPT Presentation



  1. Fundamentals of Computational Neuroscience 2e, December 31, 2009. Chapter 9: Modular networks, motor control, and reinforcement learning

  2. Mixture of experts. [Diagram: the input feeds expert networks 1 to n and a gating network; an integration network combines them into the output.] Panels: A. Absolute function, f(x) = abs(x). B. Mixture of experts for the absolute function.

  3. The ‘what-and-where’ task. Panels: A. Model retina with sample image. B. Without bias towards short connections. C. With bias towards short connections. [Plots in B and C: output node # against hidden node #.] Jacobs and Jordan (1992)

  4. Coupled attractor networks. Panels: A. Coupled attractor networks (node group 1, node group 2, connections between groups). B. The left-right universe with letters. [Binary letter patterns omitted.]

  5. Limit on modularity. Panels: A. Load capacity (α_c: load capacity, against g: relative intermodular strength; curves for m = 1, m = 2, and m = 4). B. Bounds on intermodular strength (g: relative intramodular strength, against m: number of modules).

  6. Sequence learning. Panels: A. Modular attractor model (module A and module B with an input pathway and weights w_AA, w_AB, w_BA, w_BB). B. Time evolution of overlaps (overlap in A and overlap in B against time [τ]). Lawrence, Trappenberg and Fine (2006); Sommer and Wennekers (2005)

  7. Working memory. [Diagram labels: PFC, PMC, HCMP.] O’Reilly, Braver, and Cohen (1999)

  8. Limit on working memory. Panels: A. One object. B. Two objects. C. Four objects. [Each plot: node number against time.]

  9. Motor learning and control. [Block diagram labels: desired state, motor command generator, motor command, controlled object, actual state, sensory system, disturbance, afferent, re-afferent.]

  10. Forward model controller. [Block diagram labels: desired state, motor command generator, motor command, controlled object, actual state, sensory system, disturbance, forward dynamic model, forward output model, afferent, re-afferent.]

  11. Inverse model controller. [Block diagram labels: inverse model, desired state, motor command generator, motor command, controlled object, actual state, sensory system, disturbance, afferent, re-afferent.]

  12. Cerebellum. [Diagram labels: molecular layer, Purkinje layer, granular layer; stellate cell, basket cell, Purkinje neuron, Golgi cell, granule cell; parallel fibre, climbing fibre, mossy fibre; excitatory synapse, inhibitory synapse; intracerebellar and vestibular nuclei, spinal cord, external cuneate nucleus, reticular nuclei, inferior olive, pontine nuclei.]

  13. Reinforcement learning

  14. Basal ganglia. Panels: A. Outline of basic BG anatomy (cerebral cortex, caudate nucleus, putamen, thalamus, subthalamic nucleus, globus pallidus, substantia nigra pars compacta and pars reticulata, superior colliculus). C. Recordings of SNc neurons and simulations (stimulus A, stimulus B, reward, no reward; rhat against episode for patterns 2 to 4).

  15. Temporal difference learning. Panels: A. Linear predictor node. B. Temporal delta rule. C. Temporal difference rule. [Diagram symbols: inputs r_1^in(t) to r_4^in(t), delayed input r_j^in(t-1), reward r(t), predictions V(t) and V(t-1), discounted prediction γV(t); pathways marked slow and fast.]

  16. Actor-critic and Q-learning. Panels: B. Actor-critic model of BG (cerebral cortex (frontal), thalamus, striatum with striosomal and matrix modules, pallidum, SNc; basal ganglia as actor and critic; reward prediction, action selection, primary reinforcement; node labels F, C, TH, SPm, SPs, ST, PD, DA). D. Q-learning model of BG (cerebral cortex with state/action coding; state, action, primary reinforcement).

  17. Actor-critic controller. [Block diagram labels: critic, reinforcement signal, desired state, motor command generator (actor), motor command, controlled object, actual state, sensory system, disturbance, afferent, re-afferent.]

  18. Further Readings
      Robert A. Jacobs, Michael I. Jordan, and Andrew G. Barto (1991), Task decomposition through competition in a modular connectionist architecture: the what and where tasks, in Cognitive Science 15: 219–250.
      Geoffrey Hinton (1999), Products of experts, in Proceedings of the Ninth International Conference on Artificial Neural Networks, ICANN ’99, 1:1–6.
      Yaneer Bar-Yam (1997), Dynamics of complex systems, Addison-Wesley.
      Edmund T. Rolls and Simon M. Stringer (1999), A model of the interaction between mood and memory, in Network: Computation in Neural Systems 12: 89–109.
      N. J. Nilsson (1965), Learning machines: foundations of trainable pattern-classifying systems, McGraw-Hill.
      O. G. Selfridge (1958), Pandemonium: a paradigm of learning, in the mechanization of thought processes, in Proceedings of a Symposium Held at the National Physical Laboratory, November 1958, 511–27, London HMSO.
      Marvin Minsky (1986), The society of mind, Simon & Schuster.
      Akira Miyake and Priti Shah (eds.) (1999), Models of working memory, Cambridge University Press.
      Daniel M. Wolpert, R. Chris Miall, and Mitsuo Kawato (1998), Internal models in the cerebellum, in Trends in Cognitive Sciences 2: 338–47.
      Edmund T. Rolls and Alessandro Treves (1998), Neural networks and brain function, Oxford University Press.
      James C. Houk, Joel L. Davis, and David G. Beiser (eds.) (1995), Models of information processing in the basal ganglia, MIT Press.
      Richard S. Sutton and Andrew G. Barto (1998), Reinforcement learning: an introduction, MIT Press.
      Peter Dayan and Laurence F. Abbott (2001), Theoretical Neuroscience, MIT Press.
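
The mixture-of-experts idea for f(x) = abs(x) on slide 2 can be illustrated with a small sketch. This is not taken from the slides: it assumes two hand-set linear experts (one computing x, one computing -x) and a softmax gating network whose gain `gate_gain` is a made-up parameter. Nothing is learned here; the point is only how gating weights blend expert outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts_abs(x, gate_gain=50.0):
    """Approximate abs(x) by gating two linear experts.

    Assumptions (illustrative only): expert 1 outputs x, expert 2 outputs -x,
    and the gating network is a softmax over the scores (+gain*x, -gain*x).
    """
    expert_outputs = [x, -x]
    gate = softmax([gate_gain * x, -gate_gain * x])
    # Integration step: gate-weighted sum of the expert outputs.
    return sum(g * y for g, y in zip(gate, expert_outputs))


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, mixture_of_experts_abs(x))
```

With this gate the output equals x * tanh(gate_gain * x), so a larger gain gives a sharper corner at x = 0.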
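
The temporal difference rule for a linear predictor node on slide 15 can be sketched as follows. The slides give only the diagram, so the details here are assumptions: the prediction is V(t) = Σ_j w_j r_j^in(t), the error is δ(t) = r(t) + γV(t) − V(t−1), and weights change by ε δ(t) r_j^in(t−1). The one-hot time-step inputs, the single reward at the end of each episode, and the names `gamma` and `epsilon` are all illustrative choices.

```python
def td_linear_predictor(n_steps=5, episodes=2000, gamma=0.9, epsilon=0.1):
    """Train a linear predictor with a temporal difference rule.

    Assumed toy task: each episode has n_steps time steps, each with its own
    one-hot input, followed by a terminal step with zero input and reward 1.
    """
    weights = [0.0] * n_steps

    def value(inputs):
        # Linear predictor node: V = sum_j w_j * r_j_in
        return sum(w * x for w, x in zip(weights, inputs))

    for _ in range(episodes):
        prev_inputs = None
        for t in range(n_steps + 1):
            # One-hot input for step t; all zeros at the terminal step.
            inputs = [1.0 if j == t else 0.0 for j in range(n_steps)]
            reward = 1.0 if t == n_steps else 0.0
            if prev_inputs is not None:
                # TD error: r(t) + gamma * V(t) - V(t-1)
                delta = reward + gamma * value(inputs) - value(prev_inputs)
                for j in range(n_steps):
                    weights[j] += epsilon * delta * prev_inputs[j]
            prev_inputs = inputs
    return weights


if __name__ == "__main__":
    print(td_linear_predictor())
```

In this toy setting the weight for each time step approaches the discounted future reward, so earlier steps end up with smaller predictions than later ones when `gamma` is below 1.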
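
Slide 16 names Q-learning without giving an update equation. As a hedged illustration, the standard tabular Q-learning update can be written as below; the table layout, the function name `q_update`, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are assumptions for this sketch and do not describe the basal ganglia model on the slide.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q_table maps each state to a list of action values (assumed layout).
    Update: Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return q_table[state][action]


if __name__ == "__main__":
    q = {0: [0.0, 0.0], 1: [0.0, 2.0]}
    print(q_update(q, state=0, action=1, reward=1.0, next_state=1))
```

The update uses the best available action value in the next state regardless of which action is taken next, which is what distinguishes Q-learning from an actor-critic scheme with a separate value estimate.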
