Josh McDermott, Dept. of Brain and Cognitive Sciences, MIT, May 6, 2015


Josh McDermott
Dept. of Brain and Cognitive Sciences, MIT
May 6, 2015
NSF Speech Technology Workshop

My research group: Laboratory for Computational Audition
Psychology (experiments in humans), Neuroscience (auditory neuroscience), Engineering (machine algorithms)
• We study auditory scene analysis and sound recognition
• Contact with speech technology through assistive devices and machine intelligence
• Funded by McDonnell Foundation and NSF

Recent approach in our lab: train deep convolutional neural networks (CNNs) on speech tasks, compare representations to brain
• So far: word recognition, speaker identification in noise
• CNN performs about as well as humans
• Can use CNN as a hypothesis about neural representation

Ability of shallow vs. deep CNN layers to predict brain responses provides insights into computational complexity:
[Figure: prediction of responses in primary auditory cortex and speech-selective cortex as a function of CNN layer, from shallow to deep]

Using speech analysis/synthesis to manipulate grouping cues:
• STRAIGHT decomposes speech into excitation and filtering.
• Excitation modeled sinusoidally
• Altered to inharmonic, or replaced with noise to simulate whispering
• Do these manipulations affect ability to segregate speech?
Joint work with Kawahara & Ellis

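The excitation manipulations above can be illustrated with a toy synthesis. The sketch below is not STRAIGHT. It only assumes a sinusoidal excitation built from partials at multiples of f0, an inharmonic version in which each partial above the fundamental is shifted by a random offset, and Gaussian noise as a stand-in for whispered excitation. The function names, the jitter range, and the choice to leave the fundamental fixed are assumptions made for illustration.

```python
import math
import random


def partial_frequencies(f0, n_partials, jitter=0.0, seed=0):
    """Frequencies of the excitation partials.

    jitter=0 gives a harmonic series (k * f0). A nonzero jitter shifts each
    partial above the fundamental by a random offset of up to jitter * f0
    (an assumed jitter scheme, for illustration only).
    """
    rng = random.Random(seed)
    freqs = []
    for k in range(1, n_partials + 1):
        offset = rng.uniform(-jitter, jitter) * f0 if k > 1 else 0.0
        freqs.append(k * f0 + offset)
    return freqs


def synthesize(freqs, duration, sr):
    """Sum of equal-amplitude sinusoids, normalized by the number of partials."""
    n = int(duration * sr)
    return [
        sum(math.sin(2 * math.pi * f * t / sr) for f in freqs) / len(freqs)
        for t in range(n)
    ]


def noise_excitation(duration, sr, seed=0):
    """Gaussian noise as a crude stand-in for whispered (unvoiced) excitation."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(int(duration * sr))]


harmonic = synthesize(partial_frequencies(200.0, 5), 0.25, 8000)
inharmonic = synthesize(partial_frequencies(200.0, 5, jitter=0.3, seed=1), 0.25, 8000)
whispered = noise_excitation(0.25, 8000)
```

The harmonic signal repeats every sr / f0 samples, while the jittered one has no common period, which is the grouping cue the manipulation removes.
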
Task: Type in all the words you hear.
Stimuli: a single word (“WORD”) or a word pair (“WORD 1” + “WORD 2”)
• Single word recognition similar for all conditions.
• For word pairs, recognition worse for inharmonic than harmonic speech, suggestive of effect on segregation.
• But much larger effect of whispering.
• Potentially suggestive of importance of sparsity.
[Figure: mean number of correct words (axis 0 to 0.9) for single words and word pairs; conditions: Harmonic, Jittered, Whispered]

Reverberation profoundly distorts sound signals:
[Figure: dry vs. reverberant signal]
Problem for machine speech recognition:
[Figure: percent errors]
Reverberation is also a challenge for hearing-impaired listeners.

Characterizing the distribution of real-world reverberation
What is the empirical distribution of environmental impulse responses (IRs)?
IR Measurement
• Broadcast fixed source signal
• Record resulting reverberant signal
• From this, infer environmental IR
IR Survey
• 24 text messages/day
• Phone returns GPS coordinates
• Participants reply to text with photo, address

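The IR measurement steps above (broadcast a known source, record the result, infer the IR) can be sketched as a deconvolution. The slides do not say which source signal or inference method was used, so the noise source, the regularized spectral division, and all function names below are assumptions for illustration. A naive DFT keeps the example free of dependencies; it is only practical for short signals.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def convolve(x, h):
    """Linear convolution: the effect of the room on the broadcast signal."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def estimate_ir(source, recorded, ir_len, eps=1e-9):
    """Infer the IR by regularized spectral division: H = Y X* / (|X|^2 + eps)."""
    n = len(recorded)
    X = dft(source + [0.0] * (n - len(source)))  # zero-pad source to recording length
    Y = dft(recorded)
    H = [y * x.conjugate() / (abs(x) ** 2 + eps) for x, y in zip(X, Y)]
    h = dft(H, inverse=True)
    return [v.real for v in h[:ir_len]]


# Simulated measurement: a known noise source played through a toy decaying IR.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(64)]
true_ir = [math.exp(-t / 4.0) * (-1) ** t for t in range(16)]
recorded = convolve(source, true_ir)
estimated_ir = estimate_ir(source, recorded, len(true_ir))
```

In a real room the recording also contains background noise, so the regularization constant `eps` would need to be far larger than in this noise-free simulation.
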
Everyday impulse responses are pretty stereotyped
• Exponential decay
• Faster at high frequencies
• Exaggerated asymmetry in large rooms
• Suggests prior for dereverberation …
271 IRs from 301 surveyed locations
[Figure: frequency asymmetry (skew of subband RT60) vs. mean subband RT60 (s), log axis from 10^-2 to 10^0; series: Survey, KEMAR HATS 8m; 1st and 4th quartiles marked]

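RT60 is the time taken for reverberant energy to decay by 60 dB. One common way to estimate it from an IR is Schroeder backward integration followed by a straight-line fit to the decay curve in dB. The sketch below assumes that method and a synthetic, purely exponential IR; the slides do not state how the subband RT60 values were computed, and the function names and fit range are illustrative choices. Applying the same estimate to a bandpass-filtered IR would give a per-subband RT60.

```python
import math


def synthetic_ir(rt60, duration, sr):
    """Toy IR whose amplitude envelope falls by 60 dB after rt60 seconds."""
    k = 3.0 * math.log(10.0) / rt60  # 20*log10(exp(-k * rt60)) == -60
    return [(-1) ** n * math.exp(-k * n / sr) for n in range(int(duration * sr))]


def schroeder_decay_db(ir):
    """Energy remaining after each sample, in dB relative to the total energy."""
    energy = 0.0
    edc = []
    for v in reversed(ir):
        energy += v * v
        edc.append(energy)
    edc.reverse()
    total = edc[0]
    return [10.0 * math.log10(e / total) for e in edc if e > 0]


def estimate_rt60(ir, sr, hi_db=-5.0, lo_db=-25.0):
    """Least-squares line through the decay curve between hi_db and lo_db,
    extrapolated to a 60 dB drop."""
    decay = schroeder_decay_db(ir)
    pts = [(i / sr, d) for i, d in enumerate(decay) if lo_db <= d <= hi_db]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    slope = sum((t - mean_t) * (d - mean_d) for t, d in pts) / sum(
        (t - mean_t) ** 2 for t, _ in pts
    )
    return -60.0 / slope  # slope is in dB per second and negative


sr = 1000
rt60_estimate = estimate_rt60(synthetic_ir(0.5, 1.0, sr), sr)
```

Fitting only the -5 to -25 dB portion avoids the direct sound at the start and the truncated tail at the end of the decay curve.
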
Challenges to Impacting Technology
• Lack of large high-quality labeled data sets in some domains
  • Emotional speech
  • Environmental sounds
• Cultural divides between neuroscience and engineering
  • Different meetings, departments, jargon, funders
  • Possibly getting worse?
• Workshops help, particularly if students have access


