4/28/2014

Covert skill learning in a cortical-basal ganglia circuit
Charlesworth, J.D., Warren, T.L., & Brainard, M.S. (2012). Covert skill learning in a cortical-basal ganglia circuit. Nature 486, 251-255. doi: 10.1038/nature11078
BIONB 4110
April 28, 2014
Presented by: Jennifer Hoots and Professor Carl Hopkins

The Scope of This Talk
Journal
Institution
Authors
Background
Figures
Their Conclusions
Further Thoughts

The Journal
Nature
Published weekly
Impact factor 38.597
Interdisciplinary
International
Peer-reviewed

The Institution
University of California San Francisco
W. M. Keck Center for Integrative Neuroscience
Department of Physiology
Neuroscience Graduate Program

The Authors
Jonathan D. Charlesworth
Studied molecular biology as an undergraduate at Princeton University ‘07
PhD in neuroscience at University of California San Francisco ’12
Postdoc at Neurotek (now thync)
Current senior scientist at thync
Performed the experiments with APV in RA
Analyzed the data
Jonathan Charlesworth. Retrieved from: http://blogs.princeton.edu/paw/2012/05/tiger-of-the-we-115/

The Authors
Timothy L. Warren
Studied at Harvard as an undergraduate
UCSF graduate student at the time
Performed the experiments with LMAN inactivations
Timothy L. Warren. Retrieved from: http://keck.ucsf.edu/~twarren/
The Authors
Michael S. Brainard
Principal Investigator at University of California, San Francisco
Howard Hughes Medical Institute professor
Also a professor of physiology and psychiatry
BS, biochemistry at Harvard University
PhD, neurobiology, Stanford University
Howard Hughes Medical Institute (2014). Retrieved from: http://www.hhmi.org/scientists/michael-brainard

In the "actor-critic" models of reinforcement learning, three events must occur for learning to occur. What are these three events and how do they influence learning?

“Actor/Critic Models of Reinforcement Learning”
Reinforcement learning or “trial and error” learning was first characterized in Thorndike’s (1911) “Law of Effect”, which states that a random action that produces a satisfying effect is more likely to occur again in that same situation.
The three conditions for reinforcement learning are:
1) The situation (context, state, timing).
2) The action (what the animal or “actor” did: a motor act, a plan or a thought).
3) The reward.
Thus: If, in a given situation, after a given action, a reward occurs (i.e. a satisfying effect or a sense of comfort), then the action will be more likely to occur again in that same situation. By contrast, if, in a given situation, an action is followed by a negative reward (i.e. one that produces discomfort or dissatisfaction), the action will be less likely to occur again in the same context.
Reward, which can be either positive or negative, is input from the “critic”.
Therefore: If an action occurs in a given context and is followed by input from the critic, the action becomes more or less likely to be repeated in that context.
Thorndike, Edward Lee (1911) Animal Intelligence. Macmillan, New York, 297 pp.

The authors use birdsong as an example of learned behavior. What is the evidence that birds actually learn their songs?
1) P. Marler and W. Thorpe, working in Cambridge, England in the 1950s, discovered that chaffinches sang only 2 songs as adults, but that the songs were different from one geographic area to another (dialects).
2) To prove that the birds were learning their songs, they raised birds in acoustic isolation.
3) If tutored with a sound from a tape recorder, the isolated bird will copy the tutor song as an adult. Birds presented with a different dialect copied the tutor’s song, not their own native dialect.
4) White-crowned sparrows in California (P. Marler) had similar geographic dialects and similar learning rules.
5) Birds that are deafened before they learn to sing will sing an aberrant song; if they are deafened after they have learned to sing, the deafening has no effect.
6) Male birds often sing exactly the same song as their father. There is a lot of variation in the songs within a species, but sons will often replicate the exact same syllables that their father sang.
Zebra Finch (Taeniopygia guttata)
Native to deserts of Australia.
Huge flocks, migratory.

Neuroanatomy of song production
The brain areas involved in song production were established by tract-tracing studies done in the 1970s by Fernando Nottebohm (silver degeneration techniques).

LMAN is essential for song learning but not for song production
Science (1984), pp. 901-903
Abnormal song after LMAN lesion
Normal song after control lesion to forebrain
Brainard and Doupe (2000) develop error model for song learning

Recordings of single identified units in HVC during bouts of natural singing showed three different but stereotyped patterns of firing with respect to the vocal output. Single units were identified by antidromic stimulation of area X and RA.
1) Units that project to area X in the anterior forebrain pathway fire in bursts, one to four times per motif.
2) Units projecting to RA fire very rarely, phase locked to no more than one syllable per motif. This is a sparse code for one piece of the song.
3) Interneurons within HVC fire tonically throughout the song.
Alexay A. Kozhevnikov, Michale S. Fee (2007) Singing-Related Activity of Identified HVC Neurons in the Zebra Finch. Journal of Neurophysiology 97, 4271-4283.

Purpose & Hypothesis
Purpose: To further resolve the function of cortical-basal ganglia circuits in trial and error skill learning.
“…learning requires the reinforcement of exploratory behavioural variation generated by the AFP; therefore, preventing the AFP from contributing to behavioural variation during training should prevent trial-and-error learning.”

Simplified model for how the brain controls a complex, learned vocalization.
1) Neurons in HVC fire in sparse code, one neuron per syllable. Each neuron connects to the next neuron in the timing chain.
2) HVC neurons send output to one or more RA neurons. RA neurons fire at syllable-specific times in the song. RA codes for individual muscle contractions within the song.
3) Each syllable is composed of a complex of muscle contractions linked to the active units in RA.
Anthony Leonardo and Michale S. Fee (2005) J. Neurosci.

Study Organism
Bengalese finches (Lonchura striata domestica)
Adult males (more than 120 days old)
Housed in sound-attenuating chambers
All recorded songs were undirected (no female present)
Beckham, R. (2013) Society Finch - Lonchura striata domestica. efinch.com. Retrieved from: http://www.efinch.com/species/society.htm

Training
There is trial-by-trial variation in stable adult song
A computerized system monitors pitch variation and delivers real-time auditory disruption to a subset of those variations
Birds adjust their song to avoid the disruption
Threshold for avoiding white noise was set at about the baseline median FF (fundamental frequency) performance
White noise was delivered for 4-14 hrs while birds were awake
Tumer, E.C. & Brainard, M.S. (2007) Performance variability enables adaptive plasticity of “crystallized” adult birdsong. Nature. doi: 10.1038/nature06390
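The training procedure above can be sketched as a toy simulation. This is a minimal illustration only, not the authors' model or analysis code. Several things here are assumptions: the 2400 Hz baseline pitch, the 30 Hz spread, the learning rate, the choice that renditions below the threshold receive white noise, and the update rule that nudges the mean pitch toward renditions that escaped white noise. Only the threshold at the baseline median FF and the trial-by-trial variation come from the slide.

```python
import random


def simulate_training(n_renditions=2000, baseline_ff=2400.0, sd=30.0,
                      learning_rate=0.05, seed=0):
    """Toy pitch-contingent white-noise training.

    Returns the mean FF (Hz) after each rendition. All numeric defaults
    are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    threshold = baseline_ff   # threshold at the baseline median FF (from the slide)
    mean_ff = baseline_ff     # hypothetical baseline pitch in Hz
    history = []
    for _ in range(n_renditions):
        ff = rng.gauss(mean_ff, sd)     # trial-by-trial variation in pitch
        hit_by_noise = ff < threshold   # assumed direction: low renditions get white noise
        if not hit_by_noise:
            # Assumed rule: shift the mean pitch toward renditions that escaped.
            mean_ff += learning_rate * (ff - mean_ff)
        history.append(mean_ff)
    return history


if __name__ == "__main__":
    trained = simulate_training()
    no_variation = simulate_training(sd=0.0)
    print(f"final mean FF with variation:    {trained[-1]:.1f} Hz")
    print(f"final mean FF without variation: {no_variation[-1]:.1f} Hz")
```

With variation, the mean pitch drifts away from the white-noise threshold. With `sd=0.0` every rendition equals the mean, so the assumed rule has nothing to reinforce and the pitch stays at baseline. That case mirrors only the premise of the hypothesis quoted on the Purpose & Hypothesis slide; it says nothing about what the paper actually found.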
Figure 1 (a & b)
Tumer, E.C. & Brainard, M.S. (2007) Performance variability enables adaptive plasticity of “crystallized” adult birdsong. Nature. doi: 10.1038/nature06390

Figure 1 (c-g)

Figure 2 (a)
APV infusion
Use APV to block LMAN input to RA
Brainard, M.S., & Doupe, A.J. (2000) Auditory feedback in learning and maintenance of vocal behaviour. Nature Reviews Neuroscience 1, 31-40. doi: 10.1038/35036205

APV Injection
Reverse microdialysis technique diffuses the solution into the brain area (RA) across the membrane of the implanted probe
48 hrs of ACSF was dialysed
NMDAR antagonist DL-APV (DL-2-Amino-5-phosphonopentanoic acid) was dialysed for at least 1.5 hrs before white noise training
Switched solution back to ACSF and prevented birds from singing for at least 1.5 hrs to allow washout before the first song recording after training
Sigma-Aldrich Co. LLC. (2014). Retrieved from: http://www.sigmaaldrich.com/catalog/product/sigma/a5282?lang=en&region=US

Figure 2 (b & c)