Speech Processing 15-492/18-492 Human Speech Processing Phonetics and Phonology This lecture is recorded
The vocal tract
From meat to voice
• Blow air through lungs
• Vibrate larynx
• Vocal tract shape defines resonance
• Obstructions modify sound
  • Tongue, teeth, lips, velum (nasal passage)
The ear
From sound to brain waves
• Sound waves
• Vibrate ear drum
• Cause fluid in cochlea to vibrate
• Spiral cochlea
• Vibrate hairs inside cochlea
• Different frequencies vibrate different hairs
• Converts time domain to frequency domain
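The time-domain to frequency-domain conversion can be illustrated on a computer with a discrete Fourier transform. The following is a minimal sketch, not a model of the cochlea: the sample rate, the tone frequencies and the function name `dft_magnitudes` are all assumptions chosen for illustration.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin (0 .. n/2) of a real signal."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid at bin frequency k.
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags

# Hypothetical test signal: a 100 Hz tone plus a quieter 240 Hz tone.
SAMPLE_RATE = 800   # samples per second (assumed)
N = 400             # number of samples, so each bin is 2 Hz wide
signal = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 240 * t / SAMPLE_RATE)
          for t in range(N)]

mags = dft_magnitudes(signal)
bin_hz = SAMPLE_RATE / N
# Pick the two strongest bins and convert their indices to Hz.
strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peaks_hz = sorted(k * bin_hz for k in strongest)
print(peaks_hz)
```

Each DFT bin responds to one frequency, loosely analogous to different hairs responding to different frequencies.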
From grunts to meaning
• Grunts and vocalization
• Lots of variation available
  • (continuous systems – not discrete)
• Noises become distinct, recognizable
• Grow into languages, dialects and idiolects
• What are the fundamental units?
Articulatory Movements
Electromagnetic Articulograph
Phonemes
• Defined as fundamental units of speech
• If you change one, it can change the meaning
  • “pat” to “bat”
  • “pat” to “pam”
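The "change one unit, change the meaning" test can be sketched as a minimal-pair check over phoneme sequences. The transcriptions and the function names below are assumptions for illustration, written in the ARPAbet-style symbols used elsewhere in these slides.

```python
def differing_positions(a, b):
    """Return the indices where two equal-length phoneme lists differ,
    or None if the lengths differ."""
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

def is_minimal_pair(a, b):
    """True if the two words differ in exactly one phoneme."""
    diff = differing_positions(a, b)
    return diff is not None and len(diff) == 1

# Assumed transcriptions of the slide's examples.
PAT = ["P", "AE", "T"]
BAT = ["B", "AE", "T"]
PAM = ["P", "AE", "M"]

print(is_minimal_pair(PAT, BAT))  # differ only in the first phoneme
print(is_minimal_pair(PAT, PAM))  # differ only in the last phoneme
print(is_minimal_pair(BAT, PAM))  # differ in two positions
```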
IPA
• International Phonetic Alphabet
• Defines everything
  • All vowels, consonants, modifications
  • All distinctions, for all languages
• Uses an extended Latin character set to do it
  • But it can be hard to type in computer programs
• Organized by
  • Vowels
  • Consonants
Vowel Space
• One or two banded frequencies (formants)
Consonant Chart
• Place and Manner of Articulation
Wikipedia: IPA
English (US) Vowels
AA  wAshington
AE  fAt, bAd
AH  bUt, hUsh
AO  lAWn, mAll
AW  hOW, sOUth
AX  About, cAnoe
AY  hIde, bUY
EH  gEt, fEAther
ER  makER, sEARch
EY  gAte, EIght
IH  bIt, shIp
IY  bEAt, shEEp
OW  lOne, nOse
OY  tOY, OYster
UH  fUll
UW  fOOl
English Consonants
• Stops: P, B, T, D, K, G
• Fricatives: F, V, HH, S, Z, SH, ZH
• Affricates: CH, JH
• Nasals: N, M, NG
• Glides: L, R, Y, W
• Note: voiced vs unvoiced:
  • P vs B, F vs V
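The consonant classes above map naturally onto a lookup table. This is a minimal sketch: the names `MANNER`, `VOICING_PAIRS`, `manner_of` and `voicing_partner` are hypothetical, and only the P/B and F/V pairs come from the slide. The remaining voicing pairs are standard phonetics added for completeness.

```python
# Manner-of-articulation classes, as listed on the slide.
MANNER = {
    "stop": ["P", "B", "T", "D", "K", "G"],
    "fricative": ["F", "V", "HH", "S", "Z", "SH", "ZH"],
    "affricate": ["CH", "JH"],
    "nasal": ["N", "M", "NG"],
    "glide": ["L", "R", "Y", "W"],
}

# (unvoiced, voiced) pairs. Only the first two appear on the slide;
# the rest are an addition, not taken from the lecture.
VOICING_PAIRS = [
    ("P", "B"), ("F", "V"),
    ("T", "D"), ("K", "G"), ("S", "Z"), ("SH", "ZH"), ("CH", "JH"),
]

def manner_of(phone):
    """Return the manner class of a consonant, or None if not listed."""
    for manner, phones in MANNER.items():
        if phone in phones:
            return manner
    return None

def voicing_partner(phone):
    """Return the consonant that differs only in voicing, or None."""
    for unvoiced, voiced in VOICING_PAIRS:
        if phone == unvoiced:
            return voiced
        if phone == voiced:
            return unvoiced
    return None

print(manner_of("SH"), voicing_partner("P"), voicing_partner("M"))
```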
Number of Phonemes in Language
• US English: 43
• UK English: 44
• Japanese: 25
• Hindi: 81
• Numbers aren’t definite though
  • Depends on who you ask,
  • And what you want it for
Not all variation is Phonemic
• Phonology: linguistically discrete units
  • May be a number of different ways to say them
  • /r/ trill (Scottish or Spanish) vs US way
• Phonetics vs Phonemics
  • Phonetics: all sounds
  • Phonemics: discrete units
  • /t/ in US English: becomes “flap”
    • “water” / w ao t er /
    • “water” / w ao dx er /
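The /t/ to flap change can be written as a rewrite rule over phoneme sequences. The sketch below assumes a deliberately simplified rule (a "t" between two vowels becomes "dx") and ignores stress, which matters in real US English flapping. The vowel set and the function name `apply_flap` are assumptions for illustration.

```python
# Lowercase ARPAbet-style vowel symbols (assumed inventory).
VOWELS = {"aa", "ae", "ah", "ao", "aw", "ax", "ay", "eh",
          "er", "ey", "ih", "iy", "ow", "oy", "uh", "uw"}

def apply_flap(phones):
    """Replace 't' with the flap 'dx' when it sits between two vowels.

    Simplified: real flapping also depends on stress, which this
    sketch does not model.
    """
    out = list(phones)
    for i in range(1, len(phones) - 1):
        if (phones[i] == "t"
                and phones[i - 1] in VOWELS
                and phones[i + 1] in VOWELS):
            out[i] = "dx"
    return out

print(apply_flap(["w", "ao", "t", "er"]))   # the slide's "water" example
print(apply_flap(["s", "t", "aa", "p"]))    # "stop": no vowel before the t
```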
Dialect and Idiolect
• Variation within language (and speakers)
• Phonetic
  • “Don” vs “Dawn”, “Cot” vs “Caught”
  • R deletion (Haavaad vs Harvard)
• Word choice:
  • Y’all, Yins
  • Politeness levels
Not all languages use the same set
• Aspirated stops (Korean, Hindi)
  • P vs PH
  • English uses both, but doesn’t care
  • Pot vs sPot (place hand over mouth)
• L-R in Japanese not phonological
• US English dialects:
  • Mary, Merry, Marry
• Scottish English vs US English
  • No distinction between “pull” and “pool”
  • Distinction between: “for” and “four”
Different language dimensions
• Vowel length
  • Bit vs beat
  • Japanese: shujin (husband) vs shuujin (prisoner)
• Tones
  • F0 (tune) used phonetically
  • Chinese, Thai, Burmese
• Clicks
  • Xhosa
Co-articulation
• Voicing actually doesn’t always stop
  • “have honey”, “impossible”
• Nasalized vowels, lip rounding
  • “min” vs “bit”, “sow” vs “see”
• Lexical stress:
  • EMphasis, emPHAsis
  • PROject, proJECT
• Reduction, contraction
  • “A boy is riding a bike”
  • “I want to go to Disneyland.”
  • “I will go tomorrow”
Prosody
• Intonation
  • Tune
• Duration
  • How long/short each phoneme is
• Phrasing
  • Where the breaks are
Intonation (F0)
• Rate of vibration during voiced speech
  • Males: 80-140 times a second
  • Females: 130-220 times a second
  • Children: 180-320 times a second
• Used for:
  • Emphasis
  • Style: questions, statements, confidence etc.
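F0 can be estimated from a waveform by finding the lag at which the signal best matches a shifted copy of itself (autocorrelation). This is a minimal sketch on a synthetic signal, not a production pitch tracker. The function name `estimate_f0`, the sample rate and the test tone are assumptions. The search range of 80 to 320 Hz is taken from the ranges listed above.

```python
import math

def estimate_f0(samples, sample_rate, f0_min=80.0, f0_max=320.0):
    """Estimate F0 in Hz by picking the autocorrelation peak whose lag
    corresponds to a frequency between f0_min and f0_max."""
    min_lag = int(sample_rate / f0_max)   # shortest period considered
    max_lag = int(sample_rate / f0_min)   # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(samples[t] * samples[t + lag]
                    for t in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Hypothetical voiced sound: 200 Hz fundamental plus one harmonic.
SAMPLE_RATE = 8000
voiced = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 400 * t / SAMPLE_RATE)
          for t in range(800)]

f0 = estimate_f0(voiced, SAMPLE_RATE)
print(round(f0, 1))
```

Real speech needs more care, for example voiced/unvoiced detection and normalization, because the speaker ranges overlap and octave errors are common.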
Intonation Contour
Intonation Information
• Large pitch range (female)
• Authoritative since it goes down at the end
  • News reader
• Emphasis for Finance H*
• Final has a rise: more information to come
• Female American newsreader from WBUR (Boston University Radio)
Words
• Words
  • The things with space around them (sort of)
  • Chinese, Thai, Japanese don’t use spaces
  • Speech doesn’t use spaces
  • Blackboard vs Black Board
• English
  • Morphology: walk, walks, walking, walked
• Japanese
  • Morphology: aruku, arukimasu, arukimashita, aruite, arukitai, arukitakatta, arukemasu, …
Speech Acts
• Words aren’t always what they seem
  • Can you pass the salt?
  • Boston. Boston! Boston?
  • Yeah, right
• Multiple ways to say the same thing:
  • I want to go to Boston.
  • Yes
Human Speech
• Human production and perception
  • Quite different from computers
• Phonology
  • Defining the alphabet of speech
  • Different languages make different distinctions
• Intonation
  • How it’s said