COMP 546 Lecture 21: Cochlea to brain, Source Localization (Tues. April 3, 2018)



  1. COMP 546 Lecture 21: Cochlea to brain, Source Localization. Tues. April 3, 2018

  2. Ear (figure labels): pinna, auditory canal, cochlea; outer, middle, and inner ear.

  3. Eye vs. Ear (which ear structures correspond?). Eye: • Lens • Retina • Photoreceptors (light -> chemical) • Ganglion cells (spikes) • Optic nerve. Ear: • ? • ? • ? • ? • ?

  4. Eye vs. Ear. Eye: • Lens • Retina • Photoreceptors (light -> chemical) • Ganglion cells (spikes) • Optic nerve. Ear: • Outer ear • Cochlea • Hair cells (mechanical -> chemical) • Ganglion cells (spikes) • Vestibulocochlear nerve

  5. Basilar Membrane (BM): BM fibres have bandpass mechanical frequency responses. (Figure labels: 20,000 Hz at one end of the BM, 20 Hz at the other.)

  6. Basilar Membrane: Place code (“tonotopic”). Nerve cells (hair + ganglion) are distributed along the BM. They have similar bandpass frequency response functions. (Figure labels: 20,000 Hz, 20 Hz.)

  7. Bandpass responses (more details next lecture). (Figure: frequency axis with ticks at 0, 1000, 2000, 3000, 4000, …, 22,000.)

  8. Neural coding of sound in cochlea • The basilar membrane responds to sound by vibrating. • Hair cells at each BM location release neurotransmitter that signals the BM amplitude at that location. • Ganglion cells respond to neurotransmitter signals by spiking.

  9. Louder sound within frequency band → greater amplitude of BM vibration at that location → greater release of neurotransmitter by hair cell → greater probability of spike at each peak of filtered wave

  10. Hair cell neurotransmitter release can signal exact timing of BM amplitude peaks for frequencies up to ~2 kHz. For higher frequencies, hair cells encode only the envelope of BM vibrations.

  11. Timing of ganglion cell spikes for frequencies up to 2 kHz (“phase locking”). Hair cells release more neurotransmitter at BM amplitude peaks. Ganglion cells respond to neurotransmitter peaks by spiking. This allows exact timing of BM vibrations to be encoded by spikes. (Figure: BM vibration and the resulting spikes.)

  12. Ganglion cells cannot spike faster than 500 times per second, so we need many ganglion cells for each hair cell: 3,000 hair cells and 30,000 ganglion cells in each cochlea (left and right). (Figure label: cochlear nerve, to brain.)

  13. “Volley” code: different ganglion cells at the same spatial position on the BM.
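
The arithmetic behind the volley code can be sketched as follows. This is a minimal illustration, not the lecture's model: it uses the 500 spikes per second ceiling from slide 12 and assumes, purely for simplicity, that cells take turns marking peaks in round-robin order (real ganglion cells spike probabilistically). The function names are hypothetical.

```python
import math

MAX_RATE_HZ = 500.0  # per-cell spiking ceiling quoted on slide 12


def cells_needed(tone_hz, max_rate_hz=MAX_RATE_HZ):
    """Smallest number of cells that can jointly mark every peak of a tone."""
    return max(1, math.ceil(tone_hz / max_rate_hz))


def volley_assignment(tone_hz, n_peaks):
    """Assign successive peaks to cells in round-robin order (assumed scheme).

    Each cell then fires at tone_hz / cells_needed(tone_hz), which never
    exceeds the per-cell ceiling.
    """
    n = cells_needed(tone_hz)
    return [peak % n for peak in range(n_peaks)]
```

For a 2 kHz tone, `cells_needed(2000.0)` gives 4, so at least four cells at the same BM position would be required to mark every peak.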

  14. From cochlea to brain stem: cochlea → cochlear nucleus → lateral and medial superior olive (LSO, MSO) … → auditory cortex. (Figure label: BRAIN STEM.)

  15. Tonotopic maps. (Figure labels: cochlea, auditory nerve, cochlear nucleus (CN), lateral superior olive (LSO), medial superior olive (MSO); frequency axis from high ω to low ω.)

  16. Binaural Hearing. (Figure labels: LSO, MSO, and CN on each side, each with a high ω to low ω axis.)

  17. Binaural Hearing: MSO combines low frequency signals. (Figure labels: LSO, MSO, and CN on each side; high ω and low ω regions.)

  18. Binaural Hearing: LSO combines high frequency signals. (Figure labels: LSO, MSO, and CN on each side; high ω regions.)

  19. Levels of Analysis, from high to low: - what is the task? what problem is being solved? - brain areas and pathways - neural coding - neural mechanisms

  20. For high frequency bands, • the head casts a shadow • the timing of the peaks cannot be accurately coded by the spikes (only the rate of spikes is informative). For low frequency bands, • the head casts only a weak shadow • the timing of the peaks can be encoded by spikes.

  21. Duplex theory of binaural hearing (Rayleigh, 1907) • level differences computed for higher frequencies (ILD: interaural level differences) • timing differences computed for lower frequencies (ITD: interaural timing differences)

  22. Level differences (high frequencies): Excitatory input comes from the ear on the same side. Inhibitory input comes from the ear on the opposite side. (Figure labels: LSO, MSO, and CN on each side; + and − inputs from the high ω regions.)

  23. Timing differences (low frequencies): Sum excitatory input from both sides. Reminiscent of binocular complex cells in V1? (Figure labels: LSO, MSO, and CN on each side; + inputs from the low ω regions.)

  24. Jeffress Model (1948) for timing differences. http://auditoryneuroscience.com/topics/jeffress-model-animation (Figure: detectors A to E receiving input from the left ear and from the right ear.)

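  25. Spike timing precision required for Jeffress Model? With distance $= \frac{1}{10}$ millimetres and speed of spike $= 10$ metres second$^{-1}$: $\Delta\,\text{time} = \frac{\text{distance}}{\text{speed}} = \frac{1}{100}$ millisecond. See Exercises 19 Q2c. (Figure: detectors A to E between the inputs from the right ear and from the left ear.)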
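
The unit conversion on slide 25 is easy to get wrong by a factor of 1000, so here is the same arithmetic written out in SI units. The two input values are the ones on the slide; the variable names are only illustrative.

```python
# Values from slide 25, converted to SI units.
distance_m = 0.1e-3      # 1/10 millimetre, expressed in metres
speed_m_per_s = 10.0     # spike conduction speed in metres per second

delta_t_s = distance_m / speed_m_per_s   # time = distance / speed, in seconds
delta_t_ms = delta_t_s * 1e3             # the same interval in milliseconds

print(f"required timing precision: {delta_t_ms:.2f} ms")
```

The result is 0.01 ms, that is, 1/100 of a millisecond.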

  26. Jeffress model remains controversial. It is not known exactly how “coincidence detection” occurs in MSO. (Figure: coincidence detection by detectors A to E, repeated for each low frequency band ω.)

  27. Naïve Computational Model of Source Localization (recall lecture 20): $I_l(t) = \alpha\, I_r(t - \tau) + \epsilon(t)$, where $\alpha$ is the shadow model, $\tau$ is the delay, and $\epsilon(t)$ is the error. Find the $\alpha$ and $\tau$ that minimize $\sum_{t=1}^{T} \{ I_l(t) - \alpha\, I_r(t - \tau) \}^2$ where $\tau < 0.5$ ms.
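
One way to carry out the minimization on slide 27 is sketched below, under assumptions the slide does not state: signals are sampled lists, the delay is searched over whole samples in `[-max_delay, max_delay]` (so 0.5 ms corresponds to `max_delay = int(0.0005 * fs)` for a sampling rate `fs`), and samples that fall outside the recording are dropped. For a fixed delay the best $\alpha$ has a closed form, $\alpha = \sum_t I_l(t) I_r(t-\tau) / \sum_t I_r(t-\tau)^2$, so only $\tau$ needs a search. The function name is hypothetical.

```python
def fit_shadow_and_delay(left, right, max_delay):
    """Return (alpha, tau, mse) minimizing the mean squared error of
    left[t] - alpha * right[t - tau] over integer delays tau.

    The error is averaged over the overlapping samples so that delays
    with a shorter overlap are not favoured (an assumed convention).
    """
    best = None
    for tau in range(-max_delay, max_delay + 1):
        # Pair left[t] with right[t - tau] wherever both samples exist.
        pairs = [(left[t], right[t - tau])
                 for t in range(len(left))
                 if 0 <= t - tau < len(right)]
        energy = sum(r * r for _, r in pairs)
        if not pairs or energy == 0.0:
            continue  # nothing to fit at this delay
        # Closed-form least-squares gain for this delay.
        alpha = sum(l * r for l, r in pairs) / energy
        mse = sum((l - alpha * r) ** 2 for l, r in pairs) / len(pairs)
        if best is None or mse < best[2]:
            best = (alpha, tau, mse)
    return best
```

If the left signal is the right signal attenuated by 0.5 and delayed by 3 samples, the search should recover `alpha` close to 0.5 and `tau` equal to 3.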

  28. Timing difference: find the $\tau$ that maximizes $\sum_{t} I_l(t)\, I_r(t - \tau)$. Level difference: $10 \log_{10} \frac{\sum_{t=1}^{T} I_l(t)^2}{\sum_{t=1}^{T} I_r(t)^2}$.
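
The two quantities on slide 28 can be computed directly from sampled signals. The sketch below assumes integer sample delays and ignores samples that fall outside the recording; both choices, and the function names, are illustrative rather than taken from the lecture.

```python
import math


def itd_by_cross_correlation(left, right, max_delay):
    """Return the delay tau (in samples) maximizing sum_t left[t] * right[t - tau]."""
    def corr(tau):
        return sum(left[t] * right[t - tau]
                   for t in range(len(left))
                   if 0 <= t - tau < len(right))
    return max(range(-max_delay, max_delay + 1), key=corr)


def ild_db(left, right):
    """Level difference in dB: 10 log10 of the left/right energy ratio."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)
```

With this sign convention a positive delay means the left signal lags the right one, and a negative level difference means the left ear receives less energy. Halving the amplitude at the left ear, for example, gives about -6 dB.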

  29. For each low frequency band $j$, find the $\tau$ that maximizes $\sum_{t} I^{j}_{left}(t)\, I^{j}_{right}(t - \tau)$ (or use a summation model similar to binocular cells or the Jeffress model). An estimated value of delay $\tau$ in frequency band $j$ is consistent with various possible source directions $(\phi, \theta)$. This is similar to the cone of confusion, but more general because of frequency dependence.

  30. For each high frequency band $j$, compute the interaural level difference (ILD): $ILD^{j} = 10 \log_{10} \frac{\sum_{t=1}^{T} I^{j}_{left}(t)^2}{\sum_{t=1}^{T} I^{j}_{right}(t)^2}$. What does each $ILD^{j}$ tell us?

  31. Recall the head related impulse response function (HRIR) from last lecture. If the source direction is $(\phi, \theta)$, and $g^{j}(t)$ is the filter for band $j$, then: $I^{j}_{left}(t; \phi, \theta) = g^{j}(t) * h_{left}(t; \phi, \theta) * I_{src}(t; \phi, \theta)$ and $I^{j}_{right}(t; \phi, \theta) = g^{j}(t) * h_{right}(t; \phi, \theta) * I_{src}(t; \phi, \theta)$.

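  32. Take the Fourier transform and apply the convolution theorem: $I^{j}_{left}(\omega; \phi, \theta) = g^{j}(\omega)\, h_{left}(\omega; \phi, \theta)\, I_{src}(\omega; \phi, \theta)$ and $I^{j}_{right}(\omega; \phi, \theta) = g^{j}(\omega)\, h_{right}(\omega; \phi, \theta)\, I_{src}(\omega; \phi, \theta)$.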

  33. If there is just one source direction $(\phi, \theta)$, then dividing the two equations of slide 32 cancels $g^{j}(\omega)$ and $I_{src}$, so for each frequency $\omega$ within band $j$: $\frac{I^{j}_{left}(\omega)}{I^{j}_{right}(\omega)} \approx \frac{h_{left}(\omega; \phi, \theta)}{h_{right}(\omega; \phi, \theta)}$.

  34. One can show using Parseval’s theorem of Fourier transforms that if $h_{left}(\omega; \phi, \theta)$ and $h_{right}(\omega; \phi, \theta)$ are approximately constant within band $j$, then: $\frac{\sum_{t=1}^{T} I^{j}_{left}(t)^2}{\sum_{t=1}^{T} I^{j}_{right}(t)^2} \approx \frac{| h^{j}_{left}(\phi, \theta) |^2}{| h^{j}_{right}(\phi, \theta) |^2}$.

  35. The ear can measure the left-hand side of the slide 34 relation (the ratio of band energies at the two ears), and can infer source directions $(\phi, \theta)$ that are consistent with it.

  36. Interaural Level Difference (dB) as a function of $(\phi, \theta)$ for two fixed $\omega$: 700 Hz and 11,000 Hz. Each iso-contour in each frequency band is consistent with a measured level difference (dB). Figure source: https://auditoryneuroscience.com/topics/acoustic-cues-sound-location

  37. Monaural spectral cues (spatial localization with one ear?): $I^{j}(t; \phi, \theta) = g^{j}(t) * h(t; \phi, \theta) * I_{src}(t; \phi, \theta)$, so $I^{j}(\omega; \phi, \theta) = g^{j}(\omega)\, h(\omega; \phi, \theta)\, I_{src}(\omega; \phi, \theta)$. If the source is noise, then all frequencies make the same contribution on average. The pattern of peaks and notches across bands will be due to the HRTF, not to the source.
