Automatic Detection and Classification of Prosodic Events

  1. Automatic Detection and Classification of Prosodic Events • Thesis Proposal • Andrew Rosenberg • December 12, 2007

  2. Introduction Intonation • “What” v. “How” • Dimensions of Prosodic Variation • Speaking Rate • Pitch Range • Voice Quality • Loudness • Accenting* • Phrasing* A. Rosenberg - Thesis Proposal - 12/12/07 2

  3. Introduction Prosodic Events • Categorical Phenomena • Accenting • Acoustic excursion which makes a word “prominent” or “stand out” from its surroundings • Phrasing • “Perceived disjuncture” between words

  4. Introduction Accenting • Directs the listener's attention to a concept • Contrast • Topic • Information Status • Example: Eileen is pro-English. • Expected accenting goes unnoticed • Unexpected accenting leads to unexpected meaning • A: Is Eileen pro-French? B: Eileen is pro-English.

  5. Introduction Phrasing • Phrasing defines an acoustic unit • Physiologically necessary • Communicatively useful • Attachment Example: Anna will win Manny. • Phrase final tones indicate: • How phrases are composed • Example: I need some oregano (H-) and marjoram (H-) and some fresh basil (L-) okay? • Pragmatic and discourse effect • Example: Mariana made the marmalade.

  6. Introduction Why Prosodic Events • Consensus • Understanding • Availability

  7. Introduction Goals • Provide prosodic information to SLP systems • Develop novel techniques for classification and detection • Increase understanding of the acoustic and lexical influences on the use of prosodic events

  8. Outline • Detection of Prosodic Events • Classification of Prosodic Events • Applications

  9. Outline • Detection of Prosodic Events • Pitch Accent • Phrase Boundary • Integrated Prosodic Event Detection • Classification of Prosodic Events • Applications

  10. Pitch Accent Detection • Recognition of Acoustic Excursion • Acoustic Correlates • Pitch • Energy* • Duration • Previous Approaches • [Wightman & Ostendorf 1994, Conkie et al. 1999, Sun 2002, Marsi et al. 2003, Gregory 2004, Ananthakrishnan et al. 2005, Tamburini 2006, Chaolei 2007, Levow 2008, inter alia]

  11. Pitch Accent Detection Basic Assumptions • Unit of Analysis: Syllable vs. Word • Use of Lexical or Syntactic Information • Supervised vs. Unsupervised Learning

  12. Pitch Accent Detection Experiments • Feature Representation • Pitch - min, max, stdev, mean, rms • Energy - min, max, stdev, mean, rms • Duration • Context Normalization of max and mean • Range and z-score normalization over nine static context windows • Speaker Normalization (z-score) • Naïve Bayes, J48, SVM, Boosting, Bagging, Dagging*
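
The feature representation on the slide above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names, the per-word list-of-frames input, and the `before`/`after` window parameters are assumptions (the slide does not say which nine context windows were used, or whether the population or sample standard deviation was taken).

```python
import math
import statistics


def aggregate(values):
    """Return min, max, stdev, mean, and rms of one contour (pitch or energy)."""
    return {
        "min": min(values),
        "max": max(values),
        "stdev": statistics.pstdev(values),  # population stdev: an assumption
        "mean": statistics.fmean(values),
        "rms": math.sqrt(sum(v * v for v in values) / len(values)),
    }


def zscore(x, reference):
    """Z-score of x against a reference sample (a speaker or a context window)."""
    mu = statistics.fmean(reference)
    sigma = statistics.pstdev(reference)
    return 0.0 if sigma == 0 else (x - mu) / sigma


def range_norm(x, reference):
    """Position of x within the min-max range of the reference sample."""
    lo, hi = min(reference), max(reference)
    return 0.0 if hi == lo else (x - lo) / (hi - lo)


def context_features(word_values, index, before, after):
    """Normalize the max and mean of word `index` over a window of
    `before` preceding and `after` following words (hypothetical window shape)."""
    start = max(0, index - before)
    end = min(len(word_values), index + after + 1)
    window = [v for word in word_values[start:end] for v in word]
    stats = aggregate(word_values[index])
    return {
        "max_z": zscore(stats["max"], window),
        "mean_z": zscore(stats["mean"], window),
        "max_range": range_norm(stats["max"], window),
        "mean_range": range_norm(stats["mean"], window),
    }
```

Speaker normalization would reuse `zscore` with all of a speaker's frames as the reference sample; calling `context_features` once per window shape would yield one feature group per context window.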

  13. Pitch Accent Detection Results [Figure: bar chart with a vertical axis from 75.0 to 90.0 and a “Human Agreement” reference level, comparing Naïve Bayes, J48, Boosting, Bagging, and SVM on BDC-spon, BDC-read, BU-RNC, and TDT-4; bar values are not recoverable from this transcript]

  14. Pitch Accent Detection Spectral Analysis • Spectral Balance • [Sluijter & Van Heuven 1996, 1997, Fant 2000, Heldner 1999] [My name is Randy Keller]

  16. Pitch Accent Detection Spectral Analysis • Examined the predictive power of 210 frequency regions [Rosenberg & Hirschberg 2006] etc. [My name is Randy Keller]

  17. Pitch Accent Detection Spectral Analysis Findings • There is a significant difference in the predictive power of energy information in frequency regions (14.8%) • >99.9% of data points are correctly classified by at least one classifier • Majority voting leads to ~81.8% correct classification using only energy features • Worse than SVM, but better than J48 and Boosting detection
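
The two aggregate measures in these findings (majority voting, and the share of data points that at least one classifier gets right) can be sketched as below. This is an illustrative sketch under assumptions: labels are 0 (unaccented) or 1 (accented), each classifier contributes one prediction per token, and ties go to 0; none of these details are stated on the slide.

```python
def majority_vote(predictions):
    """predictions: one list of 0/1 labels per classifier.
    Returns one voted label per token (ties resolve to 0, an assumption)."""
    n_classifiers = len(predictions)
    return [
        1 if 2 * sum(token_votes) > n_classifiers else 0
        for token_votes in zip(*predictions)
    ]


def accuracy(predicted, gold):
    """Fraction of tokens whose predicted label matches the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def oracle_coverage(predictions, gold):
    """Fraction of tokens labeled correctly by at least one classifier."""
    hits = sum(
        1 for token_votes, g in zip(zip(*predictions), gold) if g in token_votes
    )
    return hits / len(gold)
```

With 210 frequency-region classifiers, `predictions` would hold 210 lists; `oracle_coverage` is an upper bound on what any combination scheme over those classifiers could reach.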

  18. Pitch Accent Detection Correcting Classifier • Can pitch and duration information be combined with these results to improve pitch accent detection accuracy? [Rosenberg & Hirschberg 2007] • For each of 210 energy-based classifiers, train a second pitch- and duration-based classifier to correct the predictions of the energy classifiers
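
One way to read the correcting-classifier idea is sketched below. The slide does not say how a correction is applied, so this sketch assumes that each corrector predicts whether its energy classifier is wrong on a token and that a flagged prediction is flipped before voting. The function names are hypothetical, and the corrector itself (trained on pitch and duration features) is represented only by its 0/1 outputs.

```python
def corrector_targets(energy_predictions, gold):
    """Training targets for one corrector: 1 where the energy classifier erred."""
    return [int(p != g) for p, g in zip(energy_predictions, gold)]


def apply_correction(energy_predictions, flip_decisions):
    """Flip every energy prediction that the corrector flags as wrong."""
    return [1 - p if flip else p for p, flip in zip(energy_predictions, flip_decisions)]


def corrected_vote(energy_predictions_per_band, flips_per_band):
    """Correct each band's predictions, then take a majority vote per token
    (ties resolve to 0, an assumption)."""
    corrected = [
        apply_correction(preds, flips)
        for preds, flips in zip(energy_predictions_per_band, flips_per_band)
    ]
    n_bands = len(corrected)
    return [1 if 2 * sum(votes) > n_bands else 0 for votes in zip(*corrected)]
```

In this reading, `corrector_targets` builds the labels each second-stage classifier is trained on, and `corrected_vote` plays the role of the aggregator applied after correction.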

  19. Pitch Accent Detection Correcting Classifier Diagram [Diagram: Filters → Energy Classifiers → Correctors → Aggregator (∑)]

  20. Pitch Accent Detection Correcting Classifier Results [Figure: bar chart with a vertical axis from 75.0 to 90.0 and a “Human Agreement” reference level, comparing Boosting, Bagging, SVM, Energy Voting, and Corrected Voting on BDC-spon, BDC-read, and TDT-4; bar values are not recoverable from this transcript]

  21. Pitch Accent Detection Proposed Work • Define Word Boundaries using ASR Transcripts • Inclusion of Syntactic Features: 1. Extend the Feature Vector 2. Syntactic-Class-Dependent Modeling • Penn Treebank, Collapsed Classes, Function v. Content 3. Model Combination

  22. Outline • Detection of Prosodic Events • Pitch Accent • Phrase Boundary • Integrated Prosodic Event Detection • Classification of Prosodic Events • Applications

  23. Phrase Boundary Detection • “Perceived Disjuncture” • Intermediate v. Intonational phrases • Acoustic Features • Silence • Pre-boundary Lengthening* • The final syllable in a phrase has increased duration • Declination Line Reset • Pitch and intensity decrease over the duration of a phrase

  24. Phrase Boundary Detection Experiments • Reuse the feature vector from pitch accent detection experiments • Include Pitch and Energy Reset Features • Classify word boundaries as intonational and intermediate phrase boundaries • Naïve Bayes, J48, SVM*
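
A pitch or energy reset feature can be sketched as the jump in a contour across a word boundary. The definition below is an assumption made for illustration (mean of the first few frames after the boundary minus the mean of the last few frames before it); the slide does not give the exact formulation, and `n_frames` is a hypothetical parameter.

```python
import statistics


def reset_feature(before, after, n_frames=3):
    """Mean of the first n_frames after a boundary minus the mean of the
    last n_frames before it. A large positive value suggests a reset."""
    tail = before[-n_frames:]
    head = after[:n_frames]
    if not tail or not head:
        return 0.0
    return statistics.fmean(head) - statistics.fmean(tail)


def boundary_reset_features(words_pitch, words_energy, n_frames=3):
    """One feature dict per word boundary (between word i and word i + 1)."""
    features = []
    for i in range(len(words_pitch) - 1):
        features.append({
            "pitch_reset": reset_feature(words_pitch[i], words_pitch[i + 1], n_frames),
            "energy_reset": reset_feature(words_energy[i], words_energy[i + 1], n_frames),
        })
    return features
```

Because pitch and intensity tend to decline over a phrase, a boundary followed by a new phrase would be expected to show a positive reset, while a word-internal continuation would show a value near zero.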

  25. Phrase Boundary Detection SVM Results - Full Intonational Phrases [Figure: bar chart with a vertical axis from 0 to 100 showing Baseline, Accuracy, and Difference for BDC-read, BDC-spon, BU-RNC, TDT-4, Communicator, IBM TTS, and Trains; bar values are not recoverable from this transcript]

  26. Phrase Boundary Detection SVM Results - Full Intonational Phrases [Figure: bar chart with a vertical axis from 0 to 1.0 showing F-Measure and Baseline* for BDC-read, BDC-spon, BU-RNC, TDT-4, Communicator, IBM TTS, and Trains; bar values are not recoverable from this transcript]

  27. Phrase Boundary Detection Proposed Work • Inclusion of Lexical Features • Similar to pitch accent inclusion approaches • Pre-boundary lengthening • Requires syllable information • Force-aligned from manual word boundaries • ASR phone hypotheses

  28. Outline • Detection of Prosodic Events • Pitch Accent • Phrase Boundary • Integrated Prosodic Event Detection • Classification of Prosodic Events • Applications

  29. Integrated Prosodic Event Detection • Pitch accents can improve phrase boundary detection [Wang & Hirschberg 1992] • Hypothesis: Phrase boundaries can improve pitch accent detection • Accents “stand out” from context. • Phrase boundaries define acoustic context.

  30. Integrated Prosodic Event Detection Proposed Approaches • Simultaneous Detection • 4-way classification {acc, non} × {phrase, non} • Preliminary results show improved performance on pitch accent and phrase boundary on some corpora • Iterative Detection • Detect pitch accents. Use these to detect phrase boundaries. Use these to detect accents. Repeat. • Classifier Fusion • Dynamic Bayesian Model
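
The first two proposed approaches can be sketched as follows: a joint label encoding for the 4-way classification, and a loop for iterative detection. This is a sketch under assumptions: the integer encoding, the detector call signatures, and the stopping rule (stop when labels no longer change, or after `max_rounds`) are hypothetical, since the slide only says "Repeat".

```python
JOINT_LABELS = ["non/non", "non/phrase", "acc/non", "acc/phrase"]


def encode(accented, boundary):
    """Map an (accent, phrase boundary) pair to one of four joint classes."""
    return 2 * int(accented) + int(boundary)


def decode(joint):
    """Recover the (accent, phrase boundary) pair from a joint class index."""
    return bool(joint // 2), bool(joint % 2)


def iterative_detection(tokens, detect_accents, detect_boundaries, max_rounds=5):
    """Alternate the two detectors, feeding each the other's latest output.

    detect_accents(tokens, boundaries) and detect_boundaries(tokens, accents)
    are caller-supplied callables; boundaries is None on the first pass."""
    accents = detect_accents(tokens, None)
    boundaries = None
    for _ in range(max_rounds):
        new_boundaries = detect_boundaries(tokens, accents)
        new_accents = detect_accents(tokens, new_boundaries)
        if new_accents == accents and new_boundaries == boundaries:
            break  # labels are stable, so further rounds change nothing
        accents, boundaries = new_accents, new_boundaries
    return accents, boundaries
```

A single 4-way classifier trained on `encode(...)` labels gives simultaneous detection; `decode` then yields separate accent and boundary predictions for scoring each task on its own.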

  31. Outline • Detection of Prosodic Events • Classification of Prosodic Events • Pitch Accent Type • Phrase-final Tone • Applications

  33. Prosodic Event Categorization Accent Types and Phrase-final Tones • Intonation can be described by a sequence of low (L) and high (H) tones [Pierrehumbert 1980, Silverman 1992] • Accents: L*, H* • Complex tones: L+H*, L*+H, H+!H* • Intermediate Phrase-final tones (Phrase Accents): • Phrase Accents: L-, H- • Intonational Phrase-final tones (Phrase Accent + Boundary Tone): • L-L%, L-H%, H-L%, H-H%
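
The label inventory on this slide can be written down as a small data structure, which also shows how an intonational phrase-final tone decomposes into a phrase accent plus a boundary tone. The constant and function names below are hypothetical and chosen only for this sketch; the labels themselves are the ones listed on the slide.

```python
from itertools import product

PITCH_ACCENTS = ["L*", "H*", "L+H*", "L*+H", "H+!H*"]
PHRASE_ACCENTS = ["L-", "H-"]      # intermediate phrase-final tones
BOUNDARY_TONES = ["L%", "H%"]

# Intonational phrase-final tones: every phrase accent with every boundary tone.
INTONATIONAL_FINAL = [pa + bt for pa, bt in product(PHRASE_ACCENTS, BOUNDARY_TONES)]


def split_phrase_final(label):
    """Split a phrase-final label into (phrase accent, boundary tone).
    The boundary tone is None for an intermediate phrase."""
    if label in PHRASE_ACCENTS:
        return label, None
    if label in INTONATIONAL_FINAL:
        return label[:2], label[2:]
    raise ValueError(f"not a phrase-final tone label: {label!r}")
```

Classifying phrase-final tones could then be framed either as one decision over the six labels or as two smaller decisions, one over phrase accents and one over boundary tones.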

