Toward a goal of understanding the mind Naomi Feldman University of Maryland December 4, 2014
Language in the mind How do learners acquire speech sound categories? How do listeners perceive speech in noisy situations? How do listeners represent the speech they hear? Higher-level questions about grammar, discourse, etc.
A common approach Behavioral and neural data from humans in the lab Cognitive model of the phenomenon being studied measurements from the lab
A recurring theme in cognition Well-designed systems are tuned to fit their environment: “emergent” (neural modeling), “optimal” (cognitive science), “data-driven” (computer science). To understand the mind, we need to study the environment.
Understanding the mind From computing/engineering (characteristics of the environment): collections of data from the environment (e.g., corpora); features that help systems generalize from those data. From cognitive/brain science (measurements from the lab): behavioral and neural data from humans in the lab; cognitive model of the phenomenon being studied.
“Big data for cognitive science” - Jimmy Lin
An example from language How is speech represented? Dynamic signal that’s changing continuously Contains information in both frequency and time Caitlin Richter Aren Jansen Supported by NSF BCS-1320410
Developing representations Speech perception becomes tuned to the native language. 6 months: some language-specific perception of vowels; 6-8 months: discriminate non-native consonant contrasts; 10-12 months: poor discrimination of non-native consonant contrasts. [Timeline: age in months, 6 to 12] (Werker & Tees, 1984; Kuhl et al., 1992)
Representation matters People are much better than speech recognition systems at generalizing from the data they hear (Bob vs. Siri). Non-native listeners often fail to perceive unfamiliar phonetic distinctions.
How is speech represented? From computing/engineering (characteristics of the environment): speech corpora from many different languages; effective methods for representing the speech signal. From cognitive/brain science (measurements from the lab): extensive data from human listeners; cognitive models of language acquisition and processing.
How is speech represented? Cognitive model that connects distributions of sounds in the input to performance on a laboratory task (Feldman et al., 2009). [Figure labels: distribution of sounds in the input; listeners’ performance on a discrimination task (“same” or “different”?)]
How is speech represented? Different representations imply different input distributions. Which representations best predict human discrimination data? [Figure labels: Speech Representation 1; Speech Representation 2; listeners’ performance on a discrimination task (“same” or “different”?)]
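One way to make this comparison concrete is to score each candidate representation by the log likelihood it assigns to listeners' same/different responses. The following is a minimal sketch under stated assumptions, not the model from Feldman et al. (2009) or Richter et al. (in prep): stimuli are one-dimensional, each representation is a hypothetical embedding function plus Gaussian sound categories with equal priors, a "same" response is assumed to mean that both stimuli were assigned to the same category, and the response counts are invented toy values.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def category_posterior(x, categories):
    """Posterior over categories for stimulus x, assuming equal priors.

    categories is a list of (mean, sd) pairs (hypothetical sound categories).
    """
    likelihoods = [gaussian_pdf(x, mean, sd) for mean, sd in categories]
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


def p_same(x1, x2, categories):
    """Probability that two stimuli are assigned to the same category."""
    post1 = category_posterior(x1, categories)
    post2 = category_posterior(x2, categories)
    return sum(a * b for a, b in zip(post1, post2))


def log_likelihood(trials, embed, categories, eps=1e-9):
    """Log likelihood of same/different counts under one representation.

    trials: list of (stimulus1, stimulus2, n_same, n_total).
    The binomial coefficient is omitted because it is the same for every
    representation and so does not affect the comparison.
    """
    total = 0.0
    for s1, s2, n_same, n_total in trials:
        p = p_same(embed(s1), embed(s2), categories)
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += n_same * math.log(p) + (n_total - n_same) * math.log(1.0 - p)
    return total


# Invented toy data: listeners call a far pair "different" and a near pair "same".
toy_trials = [(0.0, 4.0, 1, 20), (0.0, 1.0, 19, 20)]

# Two hypothetical representations: (embedding function, categories as (mean, sd)).
rep1 = (lambda x: x, [(0.0, 1.0), (4.0, 1.0)])          # well-separated categories
rep2 = (lambda x: 0.25 * x, [(0.0, 1.0), (1.0, 1.0)])   # heavily overlapping categories

ll_rep1 = log_likelihood(toy_trials, *rep1)
ll_rep2 = log_likelihood(toy_trials, *rep2)
best = "Representation 1" if ll_rep1 > ll_rep2 else "Representation 2"
print(best, ll_rep1, ll_rep2)
```

In this toy setup the representation whose categories are well separated along the stimulus continuum assigns a higher log likelihood to the invented responses. With real data, the embedding would be an actual feature extractor and the categories would be estimated from a corpus.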
Speaker normalization [Figure: log likelihood (0 to -1800) as a function of number of dimensions (1 to 12), for unnormalized vs. normalized representations] (Richter et al., in prep)
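Speaker normalization can be illustrated with a very simple stand-in. The sketch below z-scores each feature dimension within each speaker. This technique is an assumption chosen for brevity and is not claimed to be the normalization used in Richter et al. (in prep). The function name, speaker labels, and formant-like values are all invented for illustration.

```python
import math
from collections import defaultdict


def normalize_by_speaker(tokens):
    """Z-score each feature dimension separately for each speaker.

    tokens: list of (speaker_id, feature_vector) pairs.
    Returns a list of (speaker_id, normalized_vector) in the same order.
    """
    by_speaker = defaultdict(list)
    for speaker, vec in tokens:
        by_speaker[speaker].append(vec)

    stats = {}
    for speaker, vecs in by_speaker.items():
        n = len(vecs)
        dims = len(vecs[0])
        means = [sum(v[d] for v in vecs) / n for d in range(dims)]
        sds = []
        for d in range(dims):
            var = sum((v[d] - means[d]) ** 2 for v in vecs) / n
            sds.append(math.sqrt(var) or 1.0)  # avoid dividing by zero
        stats[speaker] = (means, sds)

    normalized = []
    for speaker, vec in tokens:
        means, sds = stats[speaker]
        normalized.append(
            (speaker, [(x - m) / s for x, m, s in zip(vec, means, sds)])
        )
    return normalized


# Invented toy tokens: speaker "B" produces the same two vowels as speaker "A",
# but with every value scaled up by a factor of 1.2.
toy_tokens = [
    ("A", [300.0, 2300.0]),
    ("A", [700.0, 1200.0]),
    ("B", [360.0, 2760.0]),
    ("B", [840.0, 1440.0]),
]

normalized_tokens = normalize_by_speaker(toy_tokens)
for speaker, vec in normalized_tokens:
    print(speaker, vec)
```

After normalization, the corresponding tokens from the two invented speakers land on the same coordinates, which is the sense in which normalization removes speaker-specific variation from the representation.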
A central role for the mind From computing/engineering (characteristics of the environment): collections of data from the environment (e.g., corpora); features that help systems generalize from those data. From cognitive/brain science (measurements from the lab): behavioral and neural data from humans in the lab; cognitive model of the phenomenon being studied.
Benefit to cognitive science Ecological validity for evaluating hypotheses about cognitive representations of speech Engineering tools provide hypotheses and insights regarding cognitive representations Methods for normalizing across speakers (Wegmann et al., 1996) RASTA is essentially an edge detector for speech (Hermansky & Morgan, 1994)
Benefit to engineering Speech representations that yield good performance on speech recognition tasks also predict human data best (Richter et al., in prep) Can cognitive models of phonetic learning improve zero-resource speech recognition systems that learn representations from unlabeled data?
Connections to neuroscience? Existing data on neural activity when listening to speech (e.g., Mesgarani et al., 2014; Näätänen et al., 1997; Toscano et al., 2010) Use neural data to investigate relationships between neural activation patterns and cognitive representations How do feature representations relate to neural activations computed from the same stretch of speech?