
A reinforcement learning model of song acquisition in the bird - PowerPoint PPT Presentation

A reinforcement learning model of song acquisition in the bird. Michale Fee, McGovern Institute, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology. 9.54, November 12, 2014.


  1. A reinforcement learning model of song acquisition in the bird Michale Fee McGovern Institute Department of Brain and Cognitive Sciences Massachusetts Institute of Technology 9.54 November 12, 2014

  2. Structure of zebra finch song. [Figure: song with repeated motifs; frequency axis 0 to 10 kHz; 1 s scale bar; note (~10 ms), syllable (~100 ms)]

  3. Songbirds learn to sing by imitating their parents. [Figure: subsong, plastic song, and crystallized song compared with the tutor song; similarity to tutor increases and variability decreases]

  4. Overview
     • The songbird as a model system for understanding how the brain generates and learns complex sequential behaviors
     • Review some current understanding of the mechanisms of song production
     • Describe progress in elucidating the role of cortical and basal ganglia circuits in song learning
     • Some speculations on how insights from the songbird may inform our understanding of mammalian BG function

  5. A circuit for vocal production. [Diagram labels: motor pathway; HVC, RA (cortex); Uva (thalamus); nXII] Nottebohm et al, 1976, 1982

  6. Antidromic activation allows identification of RA-projecting neurons in HVC. [Diagram labels: extracellular recording electrode, stimulation electrode, HVC, RA, X] Hahnloser, Kozhevnikov and Fee, 2002

  7. HVC neurons burst throughout the song. Bird A: 66 bursts, 40 neurons. Bird B: 56 bursts, 44 neurons. Bird C: 91 bursts, 64 neurons. [Figure: burst number vs. time for each bird; 100 ms scale bars] Lynch, Okubo and Fee, in preparation

  8. Activity of RA neurons during singing. [Figure labels: extracellular recording electrode, HVC, RA, motif] Leonardo and Fee, 2005; Yu and Margoliash, 1996

  9. Simple sequence generation circuit Sparse representation of time Output Leonardo and Fee, 2005

  10. Simple sequence generation circuit Sparse representation of time Output Leonardo and Fee, 2005

  11. HVC is the ‘clock’ of the song motor pathway. Brain cooling to localize dynamics: bilateral cooling of HVC causes uniform slowing of the song. [Figure: song at currents of 0.25 A, 0.0 A, -0.25 A, -0.5 A, -0.75 A; labels HVC, RA, nXII; 5 mm scale bar] Long and Fee, Nature 2008

  12. A simple reinforcement model of song learning. [Diagram labels: auditory memory, auditory feedback, song evaluation, error/reinforcement signal, song motor system, exploratory variability, song] Doya and Sejnowski, 1989
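The loop named on slide 12 (exploratory variability, song evaluation against an auditory memory, and a reinforcement signal fed back to the motor system) can be sketched as a toy simulation. This is a minimal illustration under stated assumptions, not the Doya and Sejnowski model itself: the single `pitch` parameter, the running `baseline`, and all numeric values are hypothetical choices made for the example.

```python
import random


def learn_song(target_pitch=600.0, start_pitch=500.0, n_renditions=5000,
               noise_sd=10.0, learning_rate=0.05, seed=0):
    """Toy trial-and-error learner for one motor parameter (pitch, in Hz).

    Assumptions (not from the slides): one scalar parameter, Gaussian
    exploratory noise, and a running-average error used as the baseline.
    """
    rng = random.Random(seed)
    pitch = start_pitch                         # song motor system output
    baseline = abs(start_pitch - target_pitch)  # running estimate of typical error
    for _ in range(n_renditions):
        noise = rng.gauss(0.0, noise_sd)        # exploratory variability
        sung = pitch + noise                    # song as actually produced
        error = abs(sung - target_pitch)        # evaluation against auditory memory
        reinforcement = baseline - error        # positive when better than usual
        # Move the motor parameter toward variations that were reinforced.
        pitch += learning_rate * reinforcement * noise / noise_sd
        baseline += 0.1 * (error - baseline)    # slowly update the expected error
    return pitch
```

Calling `learn_song()` returns the learned pitch after training. The learner never sees the sign of the error; it only correlates its own variability with the scalar reinforcement signal.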

  13. A separate circuit for song learning. [Diagram labels: motor pathway (HVC, RA, nXII); anterior forebrain pathway (AFP) (LMAN, DLM, Area X); cortex, thalamus, basal ganglia; instructive signal]
     • The learning pathway is not necessary for adult song production, but is required for learning (Bottjer, 1984; Scharff and Nottebohm, 1991)
     • Bottjer proposed that the AFP transmits an instructive signal that guides plasticity in the motor pathway

  14. Separate premotor pathways for stereotyped song and variability Sequence generator HVC RA Variability generator LMAN nXII Kao et al, 2005 Ölveczky et al, 2005 Aronov et al, 2008 Stepanek and Doupe, 2010

  15. Separate premotor pathways for stereotyped song and variability Sequence generator TTX or Muscimol HVC RA Variability generator LMAN nXII Kao et al, 2005 Ölveczky et al, 2005 Aronov et al, 2008 Stepanek and Doupe, 2010

  16. Transient inactivation of the learning pathway. [Diagram labels: HVC, RA, LMAN, nXII; 55-day-old bird] Ölveczky, Andalman, and Fee, 2005

  17. LMAN drives exploratory variability in song. [Figure: residual pitch (Hz, -20 to 20) vs. time (ms, 0 to 60), LMAN intact vs. LMAN inactivated]

  18. LMAN also drives early song ‘babbling’. [Figure: LMAN intact vs. LMAN inactivated; scale bars 30 dB, 250 ms] Goldberg and Fee, 2011

  19. HVC lesions abolish all stereotyped song structure HVC Motor Pathway RA Learning Pathway (AFP) LMAN nXII

  20. HVC lesions abolish all stereotyped song structure. [Figure: pre and post HVC lesion in a subsong bird, a plastic song bird, and an adult bird] Aronov, Andalman and Fee, Science 2008. Transient pharmacological inactivation of HVC produces the same effect.

  21. The basal ganglia are not necessary for subsong or vocal variability in juvenile birds. [Figure: subsong pre-lesion and post-lesion; diagram labels HVC, RA, LMAN, X, nXIIts, DLM; scale bars 30 dB, 250 ms]
     • Lesions of the BG have little or no acute effect on juvenile song variability.
     • Local cooling in LMAN slows the timescales of babbling, indicating that exploratory vocal variability is generated by local circuit dynamics within LMAN.
     Goldberg and Fee, 2011

  22. Separate premotor pathways for stereotyped song and variability Sequence generator HVC RA Variability generator LMAN nXII Kao et al, 2005 Ölveczky et al, 2005 Aronov et al, 2008 Stepanek and Doupe, 2010

  23. Separate premotor pathways for stereotyped song and variability Sequence generator HVC Instructive signal RA Variability generator LMAN nXII Kao et al, 2005 Ölveczky et al, 2005 Aronov et al, 2008 Stepanek and Doupe, 2010

  24. Separate premotor pathways for stereotyped song and variability Sequence generator HVC Instructive signal RA Variability generator LMAN nXII DLM Area X

  25. Song learning is slow. [Figure: harmonic stack, pupil vs. tutor, at days of training 5, 8, 12, 20, 30; pitch values shown (Hz): 606, 568, 554, 551, 596, 607] Tchernichovski, Mitra, Lints, Nottebohm, 2001

  26. Experimental control of song learning. [Diagram labels: mic, DSP, speaker; vocalized and heard song; targeted region of syllable; pitch (kHz, 0.5 to 0.6) and pitch threshold; feedback noise; brain, cranial airsac] Andalman and Fee 2009; Tumer and Brainard 2007

  27. Conditional auditory feedback drives pitch learning. [Figure: pitch (Hz, 550 to 650) at 0 h, 2 h, 4 h; 25 ms scale bar] Tumer and Brainard 2007; Andalman and Fee 2009
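The conditional feedback protocol on slides 26 and 27 (noise played depending on whether pitch in the targeted region crosses a threshold) can be sketched as follows. The rule used here is an assumption for illustration: on "up" days noise hits renditions below the threshold, and on "down" days renditions above it. The function names, the toy learner, and all parameter values are hypothetical and not taken from the cited experiments.

```python
import random


def noise_triggered(pitch_hz, threshold_hz, direction="up"):
    """Return True if feedback noise is played for this rendition.

    Assumed rule: 'up' days punish pitch below threshold,
    'down' days punish pitch above threshold.
    """
    if direction == "up":
        return pitch_hz < threshold_hz
    return pitch_hz > threshold_hz


def simulate_day(mean_pitch, threshold_hz, direction="up", n_renditions=1000,
                 noise_sd=10.0, learning_rate=0.02, seed=1):
    """Toy bird that shifts its mean pitch to escape the feedback noise."""
    rng = random.Random(seed)
    escape_rate = 0.5  # running estimate of how often noise is avoided
    for _ in range(n_renditions):
        variation = rng.gauss(0.0, noise_sd)  # rendition-to-rendition variability
        hit = noise_triggered(mean_pitch + variation, threshold_hz, direction)
        reward = 0.0 if hit else 1.0
        # Reinforce variations that escaped noise more often than expected.
        mean_pitch += learning_rate * (reward - escape_rate) * variation
        escape_rate += 0.05 * (reward - escape_rate)
    return mean_pitch
```

With the threshold fixed at the starting pitch, `simulate_day(600.0, 600.0, "up")` drifts upward and the `"down"` case drifts downward, which is the qualitative pattern the up-day and down-day labels on the slides refer to.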

  28. Many days of sequential learning. [Figure: pitch (Hz) vs. days post hatch (120 to 165), with expanded views of days 141 to 144 and 161 to 164; histograms of observations vs. Δ pitch (Hz, -50 to 50) during the day and overnight, for up days and down days]

  29. Where does this learning occur in the song control circuit? AFP-driven variability Motor pathway Motor parameter space

  30. Where does this learning occur in the song control circuit? AFP-driven bias AFP-driven variability Motor pathway Motor parameter space Error gradient (reduced error)

  31. Where does this learning occur in the song control circuit? AFP-driven bias AFP-driven variability Motor pathway Plasticity in motor pathway Motor parameter space Error gradient (reduced error)

  32. Where does this learning occur in the song control circuit? [Two diagrams, AFP-driven bias and plasticity in motor pathway, each with labels HVC, LMAN, RA, X, nXIIts, DLM; motor pathway and anterior forebrain pathway (AFP)]

  33. [Figure: drug-delivery probe over LMAN (cap, outlet tube, inflow tube, drug reservoir, dialysis membrane, dental acrylic, skull); pitch (Hz, 470 to 600) across alternating PBS and TTX infusions, with 25 Hz and 2 h scale bars; histograms of observations vs. Δ pitch (Hz, -50 to 50) on up days and down days, for TTX and for vehicle] Andalman and Fee, PNAS 2009

  34. Does AFP-driven variability become biased to reduce vocal errors? Yes!! [Diagram labels: motor parameter space; error gradient (reduced error); motor pathway; AFP-driven variability; AFP-driven error-reducing bias; plasticity in motor pathway]

  35. Is all song learning mediated by AFP bias? [Figure: many days of sequential learning; pitch (Hz, 500 to 700) vs. days post hatch (120 to 165)]

  36. Is all song learning mediated by AFP bias? [Figure, two panels: pitch (Hz, 500 to 600) for LMAN(+) and LMAN(-) relative to baseline vs. days post-hatch (120 to 165)]

  37. AFP bias is highly predictive of motor pathway plasticity within the next 24 hours. [Figure: Δm (Hz) vs. estimated AFP bias (Hz, down days inverted) at lags of -2 d, -1 d, and 0 d; schematic of pitch over Day 1, Day 2, Day 3 and night, with symbols β and Δm; correlation coefficient (r²) vs. lag (days, -4 to 4)] Andalman and Fee, 2009; Warren et al, 2011

  38. Motor pathway plasticity appears to ‘integrate’ AFP bias. [Diagram over Day 1, Day 2, Day 3: AFP-driven variability, AFP-driven bias, motor pathway, motor pathway plasticity; motor parameter space; error gradient (reduced error)]
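The idea on slides 37 and 38, that the motor pathway accumulates the AFP bias from one day to the next, can be sketched as a simple day-by-day update. This is an illustrative sketch only: the update rule, the cap `max_bias`, the `gain` parameter, and the numbers are assumptions, not values from the cited studies. Pitch with LMAN active is modeled as motor pathway pitch plus AFP bias, and pitch with LMAN inactivated as motor pathway pitch alone.

```python
def simulate_consolidation(target=600.0, start=500.0, n_days=30,
                           max_bias=25.0, gain=1.0):
    """Toy model: the motor pathway integrates the previous day's AFP bias.

    Returns a list of (day, pitch_lman_intact, pitch_lman_inactivated).
    All parameters are hypothetical illustration values.
    """
    motor = start  # pitch produced by the motor pathway alone
    history = []
    for day in range(n_days):
        error = target - motor
        # AFP bias points down the error gradient but is limited in size.
        bias = max(-max_bias, min(max_bias, error))
        history.append((day, motor + bias, motor))
        # Overnight, the motor pathway absorbs a fraction of the bias.
        motor += gain * bias
    return history
```

With `gain=1.0`, the LMAN-inactivated pitch on each day equals the LMAN-intact pitch of the previous day, so the motor pathway trails the AFP-biased song by one day until both settle at the target.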
