Speculations on possible brain substrates of symbolic processing and structured I/O from memory

Adam Marblestone, CS 379c, Stanford, 2019 (slides based on pre-DeepMind work, much of it with Ken Hayworth)


  1. Speculations on possible brain substrates of symbolic processing and structured I/O from memory Adam Marblestone CS 379c Stanford 2019 (slides based on pre-DeepMind work, much of it with Ken Hayworth)

  2. A tentative high-level template for AI cognitive architectures, based on some interpretations of modern neuroscience (such as it is)

  3. A tentative high-level template for AI cognitive architectures, based on some interpretations of modern neuroscience (such as it is). But raises more questions than it answers...

  4. Psychological inspirations for knowledge representation in AI cognitive architectures... and assumptions to question

     Working memory: reverberating activity? qualitatively similar to ongoing activity in an LSTM?
     -- but, in cortex? cortico-thalamic loops? unstructured versus pre-structured? variables/slots?
     -- gating / routing of access to/from working memory

     Episodic memory: rapid plasticity in hippocampus, supports pattern completion, linked to diverse cortical representations
     -- many open questions… temporal, spatial, predictive and other relational organizing principles?
     -- how is it consolidated into semantic memory or other cortically-encoded knowledge?
     -- free association, chunking, hierarchical contexts...
     -- how are memory recall, offline replay + prospective planning linked with RL?
     -- interplay of feature-based generalization and sparse, arbitrary pattern-separated codes?
     -- ...

     Semantic memory: knowledge-graph like representations in cortical association areas?
     -- distinct from episodic memory? distinct from “unstructured” cortical weights?
     -- is this a distinct architecture, or something that emerges from the other systems?

     Procedural memory: cortico-striatal synapses governing basal-ganglia action selection? selectable cortical programs?

     Other: how is the information encoded (e.g., based on which loss functions) before entering any of the above systems? are VAE-like “latent vectors” able to capture enough structure, when trained with the right loss functions, e.g., see MERLIN predictive losses? or does one need something more like “capsules” or other architectural features?

  5. Neural Turing Machine: originally framed as extension to LSTM “working memory”

  6. NTM arguably solves long-standing complaints about lack of symbolic “variable binding” in NNs (e.g., Gary Marcus)

  7. Can we forge tighter links with neuroscience to constrain architectural choices for working + episodic memory analogs, symbolic structures, dynamic routing, and training procedures in ANNs?

  8. neural attractors/assemblies/ensembles (cf., Hopfield…) http://fourier.eng.hmc.edu/e161/lectures/figures/energylandscape.gif

  9. (cf., Hopfield…) https://github.com/adammarblestone/AssociativeMemories
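
      As a concrete reference point for the attractor idea, here is a minimal sketch of a textbook Hopfield network in plain Python. It is not taken from the linked repository; the function names (`train_hopfield`, `recall`) and the tiny 8-unit patterns are assumptions chosen for illustration. Patterns are stored with a Hebbian outer-product rule, and a corrupted cue is pulled back to the nearest stored pattern (pattern completion).

```python
def sign(v):
    # Threshold unit: ties (v == 0) resolve to +1 by convention.
    return 1 if v >= 0 else -1

def train_hopfield(patterns):
    # Hebbian outer-product rule with zero self-connections.
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, sweeps=5):
    # Asynchronous updates: each unit sees the latest state of the others.
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            h = sum(w[i][j] * state[j] for j in range(n))
            state[i] = sign(h)
    return state

# Two mutually orthogonal +/-1 patterns (illustrative, not from the slides).
patterns = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
]
w = train_hopfield(patterns)

cue = list(patterns[0])
cue[0] = -cue[0]            # corrupt one unit of the first pattern
recovered = recall(w, cue)  # settles back into the stored attractor
```

      With orthogonal patterns the stored states are exact fixed points, so the one-bit error is corrected on the first sweep.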

  10. Information represented via assemblies/attractors

  11. Information represented via assemblies/attractors

  12. Sequences of point attractors in the hippocampus?

  13. Sequences of point attractors in the hippocampus?
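
      One standard way to get a network to step through a sequence of point attractors is an asymmetric Hebbian rule that associates each pattern with its successor. The sketch below assumes that rule and synchronous updates; it is a toy illustration of the concept, not a claim about the hippocampal mechanism, and the names `train_sequence` and `step` are hypothetical.

```python
def sign(v):
    return 1 if v >= 0 else -1

def train_sequence(patterns):
    # Asymmetric Hebbian rule: pattern k drives pattern k+1 (cyclically).
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for k, p in enumerate(patterns):
        nxt = patterns[(k + 1) % len(patterns)]
        for i in range(n):
            for j in range(n):
                w[i][j] += nxt[i] * p[j] / n
    return w

def step(w, state):
    # One synchronous update of all units.
    return [sign(sum(wij * sj for wij, sj in zip(row, state))) for row in w]

# Three mutually orthogonal +/-1 patterns (illustrative).
sequence = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]
w_seq = train_sequence(sequence)

state = list(sequence[0])
visited = [state]
for _ in range(3):
    state = step(w_seq, state)
    visited.append(state)
# visited walks through pattern 0 -> 1 -> 2 -> back to 0
```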

  14. The attractors may be in cortico-thalamo-cortical loops

  15. Thalamic Latches and Working Memory Buffers (McFarland & Haber; Murray Sherman)

  16. Thalamic Latches and Working Memory Buffers Assumption: Information necessary to select an assembly passes through thalamus between cortical buffers

  17. Gated communication using thalamic relay of attractors
      Idea: Thalamic relay + attractor implementation of “dynamically partitionable auto-associative neural network” (Hayworth 2012)
      • Global attractors/assemblies/ensembles shared across source > thalamic relay > destination buffers
      • Gating the thalamic relay off allows “partitioning” of the buffers
      • Gating the thalamic relay on allows information to be “copied” from a source buffer to a destination buffer, forcing the destination buffer to occupy an attractor globally shared with that of the source
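
      The gating idea above can be sketched with two attractor buffers that store the same patterns, plus a gated relay term added to the destination's input. This is a toy under stated assumptions, not the Hayworth (2012) model itself: the `settle` function, the relay `gain` of 2.0, and the 8-unit patterns are all hypothetical choices. The gain is set larger than the largest possible recurrent input so that an open relay always wins.

```python
def sign(v):
    return 1 if v >= 0 else -1

def train_hopfield(patterns):
    # Hebbian weights shared by both buffers (the "global" attractors).
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def settle(w, state, relay_input=None, gate=0, gain=2.0, sweeps=3):
    # Destination buffer dynamics; the relay contributes only when gate == 1.
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            h = sum(w[i][j] * state[j] for j in range(n))
            if gate and relay_input is not None:
                h += gain * relay_input[i]
            state[i] = sign(h)
    return state

symbol_a = [1, -1, 1, -1, 1, -1, 1, -1]
symbol_b = [1, 1, -1, -1, 1, 1, -1, -1]
w = train_hopfield([symbol_a, symbol_b])

source = symbol_a        # source buffer holds symbol A
destination = symbol_b   # destination buffer holds symbol B

# Relay gated off: buffers are partitioned, destination keeps its own symbol.
partitioned = settle(w, destination, relay_input=source, gate=0)

# Relay gated on: destination is forced into the source's attractor.
copied = settle(w, destination, relay_input=source, gate=1)

# Relay gated off again: the copied symbol is held by recurrence (latched).
latched = settle(w, copied, gate=0)
```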

  18. Cortico-thalamic latched memory buffer

  19. Cortico-thalamic latched memory buffer Assembly/attractor/ensemble shared across connected cortical and thalamic areas…

  20. “Copy and paste” of symbols using partitionable attractors Hayworth and Marblestone 2018

  21. “Copy and paste” of symbols using partitionable attractors Sequence of gating operations for copy-and-paste of assemblies (cf., symbolic variable binding) During training / symbol allocation... Hayworth 2009

  22. “Copy and paste” of symbols using partitionable attractors Sequence of gating operations for copy-and-paste of assemblies (cf., symbolic variable binding) Later, executing a routing operation... Hayworth 2009

  23. “Copy and paste” of symbols using partitionable attractors Sequence of gating operations for copy-and-paste of assemblies (cf., symbolic variable binding) Hayworth and Marblestone 2018
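
      At the symbolic level, a sequence of gating operations behaves like a small program of register-to-register copies. The sketch below abstracts away the attractor dynamics entirely and treats each buffer as a named slot; the `Buffers` class, the buffer names, and the swap program are hypothetical illustrations, not the operation sequence shown on the slides.

```python
class Buffers:
    """Named buffers whose contents move only through explicit gate openings."""

    def __init__(self, names):
        self.content = {name: None for name in names}
        self.log = []  # record of gating operations, in order

    def load(self, name, symbol):
        # Stand-in for a symbol arriving from perception or memory.
        self.content[name] = symbol

    def gate_copy(self, src, dst):
        # Opening the relay src -> dst overwrites dst with src's symbol;
        # src keeps its own content, as in a copy rather than a move.
        self.content[dst] = self.content[src]
        self.log.append((src, dst))

bufs = Buffers(["A", "B", "scratch"])
bufs.load("A", "dog")
bufs.load("B", "cat")

# A three-step gating program that swaps the contents of A and B.
program = [("A", "scratch"), ("B", "A"), ("scratch", "B")]
for src, dst in program:
    bufs.gate_copy(src, dst)
```

      The point of the abstraction is that the same program works for any symbols loaded into A and B, which is the sense in which the buffers act as variables.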

  24. “Copy and paste” of symbols using partitionable attractors
      “Latch” and “relay” control via basal ganglia discrete outputs
      discrete inhibitory/disinhibitory control over target thalamic areas/relays/latches?
      - Evolutionarily ancient (homologies to simplest vertebrate brains, e.g., zebrafish)
      - Does RL
      - BG and superior colliculus may also contain innate control structures that could drive “training routines” / “internal curricula” / “bootstrap cost functions”...
      Lisman 2015

  25. Clamping in target patterns for “contrastive” learning Hayworth and Marblestone 2018

  26. Clamping in target patterns for “contrastive” learning
      Explicit basal ganglia directed control over the learning of invariances (not just unsupervised “slow feature” finding)?
      Example:
      - Basal ganglia recognizes boundaries of “episode” with a given object (BG learns this policy via reinforcement learning?)
      - BG “clamps” target patterns into thalamo-cortical target buffer
      - BG trains upstream sensory hierarchy to map varying input to clamped target
      - Target pattern may be retrieved from memory on subsequent episode?
      Hayworth and Marblestone 2018
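
      The example above can be sketched as supervised learning in which the "label" is simply whatever pattern is clamped into the target buffer for the duration of an episode. The code below swaps in a plain perceptron-style error rule as the learning rule, which is an assumption made for simplicity and not something the slides specify; `train_to_clamped_targets`, the toy views, and the target codes are all hypothetical.

```python
def sign(v):
    return 1 if v >= 0 else -1

def forward(weights, x):
    # One layer of threshold units standing in for the sensory hierarchy.
    return [sign(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def train_to_clamped_targets(episodes, n_in, n_out, epochs=20, lr=0.5):
    weights = [[0.0] * n_in for _ in range(n_out)]
    for _ in range(epochs):
        for views, target in episodes:
            # The same clamped target is held while the views vary.
            for x in views:
                y = forward(weights, x)
                for k in range(n_out):
                    err = target[k] - y[k]  # zero when the unit is correct
                    for i in range(n_in):
                        weights[k][i] += lr * err * x[i]
    return weights

# Each episode: several views of one object, plus the clamped target pattern.
episodes = [
    ([[1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]], [1, -1]),   # object 1
    ([[0, 0, 1, 1], [0, 0, 1, 0], [0, 0, 0, 1]], [-1, 1]),   # object 2
]
weights = train_to_clamped_targets(episodes, n_in=4, n_out=2)

# After training, every view of an object maps to that object's target code.
invariant = all(
    forward(weights, x) == target
    for views, target in episodes
    for x in views
)
```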

  27. Structured I/O from an associative memory Unstructured associative code Hayworth and Marblestone 2018

  28. Structured I/O from an associative memory Structured representation across multiple buffers Hayworth and Marblestone 2018

  29. A crude, very partial, and speculative “integrative picture” Hayworth and Marblestone 2018

  30. Returning to the current situation re integrated memory-based RL architectures in AI

  31. Returning to the current situation re integrated memory-based RL architectures in AI Basically “soft attention” over a set of memory “slots”, with cosine-distance based similarity lookup…
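
      For reference, here is a minimal sketch of that lookup: a softmax over cosine similarities between a query key and each memory slot, followed by a weighted read. The sharpening parameter `beta`, the function names, and the 3-slot memory are assumptions for illustration and are not tied to any particular published architecture.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # epsilon guards against zero vectors

def soft_read(memory, key, beta=10.0):
    # Attention weights: softmax of scaled cosine similarity to each slot.
    scores = [beta * cosine(key, slot) for slot in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    # Read vector: attention-weighted sum of slot contents.
    dim = len(memory[0])
    read = [sum(w * slot[d] for w, slot in zip(weights, memory)) for d in range(dim)]
    return weights, read

memory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
key = [0.9, 0.1, 0.0]  # noisy query closest to slot 0
weights, read = soft_read(memory, key)
```

      Because every slot receives a nonzero weight, the read is differentiable with respect to both the key and the memory contents, which is what makes this kind of lookup trainable end to end.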

  32. What about structured routing / potential thalamus analogs?

  33. What about structured routing / potential thalamus analogs?
