COMPUTATIONAL PARALINGUISTICS AND WHAT WE MIGHT GET FROM PHONETICS


  1. COMPUTATIONAL PARALINGUISTICS AND WHAT WE MIGHT GET FROM PHONETICS / SPEECH SCIENCE
     Anton Batliner, May 7th, 2015, NSF, Arlington

  2. The Topic
     Paralinguistics: not what but how
     the person(s) behind
     The Interspeech Computational Paralinguistic Challenges
     ● 2009: emotion (children's speech)
     ● 2010: age & gender, affect (level of interest)
     ● 2011: intoxication (+/- alcoholised), sleepiness
     ● 2012: personality (big 5), likability, pathology
     ● 2013: social signals, conflict, emotion, autism
     ● 2014: physical load, cognitive load
     ● 2015: degree of nativeness, Parkinson's condition, eating condition
     The Book
     Computational Paralinguistics: Emotion, Affect and Personality in Speech and Language Processing. Björn Schuller & Anton Batliner, 344 pages, 2014, Wiley.

  3. Cultural Clashes: Phonetics (Speech Science) vs. Speech Processing
     interpretation
       Phonetics: phonetics/knowledge-based: we don't really know what's happening because only what we are looking for is what we get.
       Speech Processing: brute force: we don't know what's happening, but we know how good we can be (roughly).
     data
       Phonetics: small, laboratory, controlled; manual (labels, segmentation)
       Speech Processing: large, real-life; automatic pre-processing
     features
       Phonetics: few (low resolution, high generalisation)
       Speech Processing: many, brute forcing, MFCC (high resolution)
     processing
       Phonetics: basic, inferential statistics; (M)anova, mixed models; significance, effect size
       Speech Processing: ML / Pattern Recognition; (fusion of) classifiers / regression
     driving force
       Phonetics: description, explanation, models
       Speech Processing: performance, applications
     both: what can we model, convey, teach?
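
The "significance, effect size" entry above can be made concrete with a small sketch. The slide does not name a particular effect-size measure; Cohen's d with a pooled standard deviation is assumed here as one common choice, and the loudness values and group names are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Made-up loudness values (dB) for two hypothetical speaker groups.
control = [68.0, 70.0, 72.0, 69.0, 71.0]
patients = [64.0, 66.0, 65.0, 67.0, 63.0]

d = cohens_d(control, patients)
print(f"Cohen's d = {d:.2f}")
```

Unlike a p-value, d does not shrink or grow with sample size alone, which is one reason it is useful when comparing small laboratory data with large real-life corpora.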

  4. What to do: CP Challenges
     challenges
     ● ML procedures, multi-modality, acoustic normalisation
     ● cross-corpus/language/culture databases
     ● speaker normalisation/adaptation
     ● confusions: hits vs. severe wrong assignments
     ● 'most important' features (from phonetics)
     ● hybrid approach: same constellation, a few features based on tradition / phonetic evidence vs. brute force feature sets with/without feature reduction/selection
     ● interests: performance, interpretation, usability in applications
     ● loudness in Parkinson's Condition – primary feature, to teach
     ● speech tempo in non-nativeness – secondary feature, not to teach
     ● speaker overlap in conflict – primary but: different cultures! – to teach
     ● variability in depression or autism – cover feature, maybe to teach
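
One way to read "confusions: hits vs. severe wrong assignments" is sketched below for ordered classes (for example, a graded label such as degree of nativeness). The slide gives no definition of "severe"; the sketch assumes an error is severe when predicted and gold class are at least two steps apart. The function name, the `severe_distance` parameter, and the labels are hypothetical.

```python
from collections import Counter


def confusion_summary(gold, predicted, severe_distance=2):
    """Split predictions on ordinal classes into hits, near misses, and severe errors."""
    pairs = Counter(zip(gold, predicted))  # (gold, predicted) -> count
    hits = sum(n for (g, p), n in pairs.items() if g == p)
    severe = sum(n for (g, p), n in pairs.items() if abs(g - p) >= severe_distance)
    return {
        "hits": hits,
        "near_misses": len(gold) - hits - severe,
        "severe": severe,
        "confusions": pairs,
    }


# Made-up labels on a three-step scale: 0 = low, 1 = medium, 2 = high.
gold = [0, 0, 1, 1, 2, 2, 2, 0]
predicted = [0, 1, 1, 1, 2, 0, 2, 2]

summary = confusion_summary(gold, predicted)
print(summary["hits"], summary["near_misses"], summary["severe"])
```

Two systems with the same accuracy can differ in the severe count, which is what matters when the output is used in an application.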

  5. Features: Hybrid approach
     [Diagram: three pipelines, each leading from features through processing to performance and interpretation; two outcomes are marked "?"]
     ● brute force: huge feature vector x
     ● phonetics: hand-picked, few features y
     ● hybrid: huge feature vector x plus phonetic knowledge (y=x); usability in applications
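
A minimal sketch of the hybrid idea, under stated assumptions: every feature in a (here tiny) brute-force set is ranked by how well it separates two groups, and the ranks of the phonetically motivated, hand-picked features are then looked up. The ranking criterion (absolute standardized mean difference), the feature names, and all values are made up for illustration; this is not the procedure from the slides.

```python
from math import sqrt
from statistics import mean, variance


def separation(a, b):
    """Absolute standardized mean difference between two groups for one feature."""
    pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (len(a) + len(b) - 2)
    return abs(mean(a) - mean(b)) / sqrt(pooled)


# Hypothetical feature values per group; a real brute-force set has thousands of features.
group_a = {
    "loudness": [70, 72, 71, 69],
    "tempo": [4.0, 4.2, 3.9, 4.1],
    "mfcc_1": [1.0, -1.0, 0.5, -0.5],
    "mfcc_2": [2.0, 2.5, 1.5, 2.0],
}
group_b = {
    "loudness": [64, 66, 65, 63],
    "tempo": [4.1, 4.0, 4.2, 3.9],
    "mfcc_1": [0.8, -0.9, 0.4, -0.6],
    "mfcc_2": [1.0, 1.5, 0.5, 1.0],
}
hand_picked = ["loudness", "tempo"]  # assumed phonetically motivated features

# Rank all features, best separating first, then look up the hand-picked ones.
ranking = sorted(group_a, key=lambda name: separation(group_a[name], group_b[name]), reverse=True)
ranks = {name: ranking.index(name) + 1 for name in hand_picked}
print(ranking, ranks)
```

In this toy data the hand-picked loudness feature ranks first and tempo last, so the brute-force ranking and the phonetic expectation can be checked against each other.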

  6. A Bandanna Approach
     Thank you for your attention
