
Speech Processing 15-492/18-492 Human Speech Processing Phonetics - PowerPoint PPT Presentation



  1. Speech Processing 15-492/18-492
     - Human Speech Processing
     - Phonetics and Phonology

  2. The vocal tract

  3. From meat to voice
     - Blow air through lungs
     - Vibrate larynx
     - Vocal tract shape defines resonance
     - Obstructions modify sound
       - Tongue, teeth, lips, velum (nasal passage)

  4. The ear

  5. From sound to brain waves
     - Sound waves
       - Vibrate ear drum
       - Cause fluid in cochlea to vibrate
     - Spiral cochlea
       - Vibrate hairs inside cochlea
       - Different frequencies vibrate different hairs
       - Converts time domain to frequency domain
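The time-to-frequency conversion attributed to the cochlea on slide 5 can be illustrated with a discrete Fourier transform. This is only a loose analogy, not a model of the ear: the sample rate, the two test tones, and the helper name `dft_magnitudes` are assumptions made for this sketch.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the magnitude of each non-negative-frequency DFT bin."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    return mags


SAMPLE_RATE = 8000  # Hz (assumed for illustration)
N = 400             # 50 ms of signal, so each bin is 20 Hz wide

# Time-domain signal: a mix of a 200 Hz tone and a quieter 1000 Hz tone.
signal = [
    math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
    for t in range(N)
]

mags = dft_magnitudes(signal)
bin_hz = SAMPLE_RATE / N

# Frequency domain: the two strongest bins sit at the two tone frequencies.
strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peak_freqs = sorted(k * bin_hz for k in strongest)
print(peak_freqs)
```

Each DFT bin plays roughly the role of a group of hairs tuned to one frequency: the mixed waveform goes in, and energy per frequency comes out.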

  6. From grunts to meaning
     - Grunts and vocalization
       - Lots of variation available
         - (continuous systems, not discrete)
     - Noises become distinct, recognizable
     - Grow into languages, dialects and idiolects
     - What are the fundamental units?

  7. Articulatory Movements

  8. Electromagnetic Articulograph

  9. Phonemes
     - Defined as fundamental units of speech
       - If you change one, it can change the meaning
       - “pat” to “bat”
       - “pat” to “pam”
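The "change one unit, change the meaning" test on slide 9 is the minimal-pair test. A minimal sketch follows, assuming words are already given as phoneme lists in the symbol set used on the later slides; the function name `is_minimal_pair` is made up for this example.

```python
def is_minimal_pair(a, b):
    """True if two phoneme sequences differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


# Hand-written transcriptions of the slide's examples.
PAT = ["P", "AE", "T"]
BAT = ["B", "AE", "T"]
PAM = ["P", "AE", "M"]

print(is_minimal_pair(PAT, BAT))  # differ only in the first phoneme
print(is_minimal_pair(PAT, PAM))  # differ only in the last phoneme
print(is_minimal_pair(BAT, PAM))  # differ in two positions
```

If two different words form a minimal pair, the two sounds that differ are evidence of distinct phonemes in that language.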

  10. Vowel Space • One or two banded frequencies (formants)

  11. English (US) Vowels
      - AA: wAshington
      - AE: fAt, bAd
      - AH: bUt, hUsh
      - AO: lAWn, mAll
      - AW: hOW, sOUth
      - AX: About, cAnoe
      - AY: hIde, bUY
      - EH: gEt, fEAther
      - ER: makER, sEARch
      - EY: gAte, EIght
      - IH: bIt, shIp
      - IY: bEAt, shEEp
      - OW: lOne, nOse
      - OY: tOY, OYster
      - UH: fUll
      - UW: fOOl

  12. English Consonants
      - Stops: P, B, T, D, K, G
      - Fricatives: F, V, HH, S, Z, SH, ZH
      - Affricates: CH, JH
      - Nasals: N, M, NG
      - Glides: L, R, Y, W
      - Note: voiced vs unvoiced:
        - P vs B, F vs V
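The consonant classes on slide 12 map naturally onto a lookup table. In this sketch the class membership is copied from the slide. Only P/B and F/V are given there as voicing pairs; the remaining pairs are standard phonetics added as an assumption, and the helper names are hypothetical.

```python
# Manner classes as listed on the slide.
CONSONANT_CLASSES = {
    "stop": ["P", "B", "T", "D", "K", "G"],
    "fricative": ["F", "V", "HH", "S", "Z", "SH", "ZH"],
    "affricate": ["CH", "JH"],
    "nasal": ["N", "M", "NG"],
    "glide": ["L", "R", "Y", "W"],
}

# Unvoiced -> voiced. P/B and F/V come from the slide; the rest are
# standard pairs added for this example.
VOICED_COUNTERPART = {
    "P": "B", "T": "D", "K": "G",
    "F": "V", "S": "Z", "SH": "ZH",
    "CH": "JH",
}


def manner_of(phone):
    """Return the manner class of a consonant, or None if it is not listed."""
    for manner, phones in CONSONANT_CLASSES.items():
        if phone in phones:
            return manner
    return None


def voicing_partner(phone):
    """Return the phone that differs only in voicing, or None if unpaired."""
    if phone in VOICED_COUNTERPART:
        return VOICED_COUNTERPART[phone]
    for unvoiced, voiced in VOICED_COUNTERPART.items():
        if voiced == phone:
            return unvoiced
    return None


print(manner_of("JH"), voicing_partner("P"), voicing_partner("V"))
```

Nasals and glides have no entry in the pairing table because English has no unvoiced counterparts for them.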

  13. Number of Phonemes in Language
      - US English: 43
      - UK English: 44
      - Japanese: 25
      - Hindi: 81
      - Numbers aren’t definite though
        - Depends on who you ask,
        - And what you want it for

  14. Not all variation is Phonetic
      - Phonology: linguistically discrete units
        - May be a number of different ways to say them
        - /r/ trill (Scottish or Spanish) vs US way
      - Phonetics vs Phonemics
        - Phonemics: discrete units
        - Phonetics: all sounds
      - /t/ in US English: becomes “flap”
        - “water” / w ao t er /
        - “water” / w ao dx er /
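The flap on slide 14 can be written as a rewrite rule from a phonemic string to a phonetic one. The sketch below is deliberately simplified: it flaps every /t/ between two vowels and ignores stress, so it would wrongly flap words like "attack". The vowel set is the lowercase form of the slide 11 symbols, and `flap_t` is a made-up name.

```python
# Vowel symbols from the English (US) vowel slide, in lowercase.
VOWELS = {
    "aa", "ae", "ah", "ao", "aw", "ax", "ay", "eh",
    "er", "ey", "ih", "iy", "ow", "oy", "uh", "uw",
}


def flap_t(phones):
    """Replace /t/ with the flap /dx/ when it sits between two vowels.

    Simplification: real US English flapping also depends on stress.
    """
    out = list(phones)
    for i in range(1, len(phones) - 1):
        if (
            phones[i] == "t"
            and phones[i - 1] in VOWELS
            and phones[i + 1] in VOWELS
        ):
            out[i] = "dx"
    return out


print(flap_t(["w", "ao", "t", "er"]))  # phonemic "water" -> phonetic form
print(flap_t(["p", "ae", "t"]))        # final /t/ is left alone
```

The input is the phonemic form and the output is one phonetic realization of it, which is the distinction the slide draws.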

  15. Dialect and Idiolect
      - Variation within language (and speakers)
      - Phonetic
        - “Don” vs “Dawn”, “Cot” vs “Caught”
        - R deletion (Haavaad vs Harvard)
      - Word choice:
        - Y’all, Yinz
        - Politeness levels

  16. Not all languages use the same set
      - Aspirated stops (Korean, Hindi)
        - P vs PH
        - English uses both, but doesn’t care
        - Pot vs sPot (place hand over mouth)
      - L-R in Japanese not phonological
      - US English dialects:
        - Mary, Merry, Marry
      - Scottish English vs US English
        - No distinction between “pull” and “pool”
        - Distinction between: “for” and “four”

  17. Different language dimensions
      - Vowel length
        - Bit vs beat
        - Japanese: shujin (husband) vs shuujin (prisoner)
      - Tones
        - F0 (tune) used phonetically
        - Chinese, Thai, Burmese
      - Clicks
        - Xhosa

  18. Co-articulation
      - Voicing actually doesn’t always stop
        - “have honey”, “impossible”
      - Nasalized voices, lip rounding
        - “min” vs “bit”, “sow” vs “see”
      - Lexical stress:
        - EMphasis, emPHAsis
        - PROject, proJECT
      - Reduction, contraction
        - “A boy is riding a bike”
        - “I want to go to Disneyland.”
        - “I will go tomorrow”

  19. Prosody
      - Intonation
        - Tune
      - Duration
        - How long or short each phoneme is
      - Phrasing
        - Where the breaks are

  20. Intonation (F0)
      - Rate of vibration during voiced speech
        - Males: 80-140 times a second
        - Females: 130-220 times a second
        - Children: 180-320 times a second
      - Used for:
        - Emphasis
        - Style: questions, statements, confidence etc.
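F0 as a "rate of vibration" can be estimated by finding the lag at which a voiced signal best matches a shifted copy of itself. The sketch below uses plain autocorrelation on a synthetic 100 Hz sine, searching only the 80-320 Hz span covered by the slide's ranges. The sample rate, the test signal, and the name `estimate_f0` are assumptions, and real pitch trackers are considerably more robust than this.

```python
import math


def estimate_f0(samples, sample_rate, min_f0=80.0, max_f0=320.0):
    """Estimate F0 in Hz by picking the autocorrelation peak.

    The default search range spans the male-to-child F0 values on the slide.
    """
    min_lag = int(sample_rate / max_f0)
    max_lag = int(sample_rate / min_f0)
    best_lag = min_lag
    best_score = float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n_terms = len(samples) - lag
        # Mean product of the signal with itself shifted by `lag` samples.
        score = sum(samples[t] * samples[t + lag] for t in range(n_terms)) / n_terms
        if score > best_score:
            best_score = score
            best_lag = lag
    return sample_rate / best_lag


SAMPLE_RATE = 8000  # Hz (assumed)
# 100 ms of a pure 100 Hz tone standing in for voiced speech.
voiced = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(800)]

f0 = estimate_f0(voiced, SAMPLE_RATE)
print(f0)
```

A 100 Hz tone repeats every 80 samples at this sample rate, so the best-matching lag is 80 and the estimate falls inside the male range listed above.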

  21. Intonation Contour

  22. Intonation Information
      - Large pitch range (female)
      - Authoritative since goes down at the end
        - News reader
      - Emphasis for Finance H*
      - Final has a rise: more information to come
      - Female American newsreader from WBUR (Boston University Radio)

  23. Intonation Examples
      - Fixed durations, flat F0.
      - Decline F0
      - “hat” accents on stressed syllables
      - accents and end tones
      - statistically trained
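The first three examples on slide 23 can be pictured as successive edits to a per-frame F0 contour. The sketch below is a toy illustration only: the frame counts, the Hz values, the rectangular accent shape, and all function names are assumptions, not the method used to produce the lecture's examples.

```python
def flat_contour(n_frames, f0=110.0):
    """Flat F0: every frame has the same pitch."""
    return [f0] * n_frames


def declining_contour(n_frames, start=130.0, end=90.0):
    """Declining F0: pitch falls linearly from start to end."""
    if n_frames == 1:
        return [start]
    step = (end - start) / (n_frames - 1)
    return [start + i * step for i in range(n_frames)]


def add_hat_accent(contour, first, last, height=20.0):
    """Raise F0 by `height` Hz over frames first..last-1.

    This is a crude rectangular "hat" placed on a stressed syllable.
    """
    out = list(contour)
    for i in range(first, last):
        out[i] += height
    return out


declined = declining_contour(5, start=130.0, end=90.0)
accented = add_hat_accent(flat_contour(5, f0=100.0), 1, 3)
print(declined)
print(accented)
```

Accents with end tones, and statistically trained contours, need a richer model than this and are not sketched here.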

  24. Words
      - Words
        - The things with space around them (sort of)
        - Chinese, Thai, Japanese don’t use spaces
        - Speech doesn’t use spaces
          - Blackboard vs Black Board
        - English
          - Morphology: walk, walks, walking, walked
        - Japanese
          - Morphology: aruku, arukimasu, arukimashita, aruite, arukitai, arukitakatta, arukemasu, ….

  25. Speech Acts
      - Words aren’t always what they seem
        - Can you pass the salt?
        - Boston. Boston! Boston?
        - Yeah, right
      - Multiple ways to say the same thing:
        - I want to go to Boston.
        - Yes

  26. Human Speech
      - Human production and perception
        - Quite different from computers
      - Phonology
        - Defining the alphabet of speech
        - Different languages make different distinctions
      - Intonation
        - How it’s said
