Emotions in IVR Systems: Anger and Frustration
Emotions in Speech, November 9
Yves Scherrer

Articles:
- J. Liscombe, G. Riccardi & D. Hakkani-Tür: Using Context to Improve Emotion Detection in Spoken Dialog Systems
- L. Devillers & L. Vidrascu: Real-life Emotions Detection with Lexical and Paralinguistic Cues on Human-Human Call Center Dialogs

Overview

Setting of data collection:
- Liscombe et al.: phone account information, automated dialog system
- Devillers & Vidrascu: medical emergencies, human-human interactions

Motivation:
- Liscombe et al.: improve customer satisfaction
- Devillers & Vidrascu: study real-life speech in highly emotive situations

Studied emotions:
- Liscombe et al.: negative vs. non-negative (but 7 emotions annotated)
- Devillers & Vidrascu: Anger, Fear, Relief, Sadness (but finer-grained annotation)

Corpus used for experiments:
- Liscombe et al.: 5,690 dialogs, 20,013 user turns
- Devillers & Vidrascu: 680 dialogs, 2,258 speaker turns

Training-test split:
- Liscombe et al.: 75% - 25%
- Devillers & Vidrascu: 72% - 28%

Machine learning method:
- Liscombe et al.: boosting algorithm
- Devillers & Vidrascu: log-likelihood ratio (linguistic), SVM (paralinguistic)

Features

Lexical features / linguistic cues:
- Liscombe et al.: trigrams of user utterances
- Devillers & Vidrascu: unigrams of user utterances, stemming

Prosodic features / paralinguistic cues:
- Liscombe et al.: loudness (energy), pitch contour (F0), speaking rate, voice quality (jitter), hesitations, turn-final pitch contour, pitch accent; normalized by gender
- Devillers & Vidrascu: loudness (energy), pitch contour (F0), speaking rate, voice quality (jitter, ...), disfluency (pauses), non-linguistic events (mouth noise, crying, ...); normalized by speaker

Dialog act features:
- Liscombe et al.: domain-dependent dialog act tag
- Devillers & Vidrascu: --

Contextual features:
- Liscombe et al.: differentials of prosodic features, transcriptions, repetition measure, and dialog acts of the 2 previous turns
- Devillers & Vidrascu: --
Liscombe et al.

Motivation:
- Two sources of user frustration:
  - the reason of the call (complaint about a bill, ...)
  - frustration arising from interaction problems with the spoken dialog system
- Goal: detect the problem, then try to repair it or transfer to a human operator
- How could a spoken dialog system "repair" an interaction?

Corpus:
- 20,013 user turns from 5,690 dialogs
- Emotional states: positive/neutral, somewhat frustrated, very frustrated, somewhat angry, very angry, somewhat other negative, very other negative
- Simplified set: positive/neutral vs. negative (Wise choice?)
- Inter-annotator agreement:
  - 0.32 Cohen's Kappa for the full set ("fair agreement")
  - 0.42 for the simplified set ("moderate agreement")

Automatic Classification

Features used:
- 1 lexical feature
- 17 prosodic features
- 1 dialog act feature
- 61 contextual features

2000 iterations with the BoosTexter boosting algorithm. Each user turn must be classified as negative or non-negative given the set of 80 features.

Boosting:
- "Boosting is a general method of producing a very accurate prediction rule by combining rough and moderately inaccurate 'rules of thumb.'" (http://www.cs.princeton.edu/~schapire/boost.html)
- Can a set of weak learners create a single strong learner?
  - A weak learner is a classifier which is only slightly correlated with the true classification (it can label examples better than random guessing).
  - A strong learner is a classifier that is arbitrarily well correlated with the true classification.
  (http://en.wikipedia.org/wiki/Boosting)
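To make the weak-vs.-strong learner idea concrete, here is a minimal boosting sketch in Python. It is not BoosTexter (which is not available as a Python library); it uses scikit-learn's AdaBoost, whose default weak learner is a depth-1 decision stump, over word n-gram features. The utterances and labels are invented toy examples, not data from the corpus.

```python
# Minimal sketch of the boosting idea: many weak "rules of thumb" are combined
# into one stronger negative / non-negative classifier. NOT BoosTexter; the
# utterances below are toy examples, not corpus data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import AdaBoostClassifier

turns = [
    "I want to talk to a person not a machine",
    "my account balance please",
    "you charged me twenty dollars too much",
    "yes thank you that is correct",
]
labels = ["negative", "non-negative", "negative", "non-negative"]

# Unigram-to-trigram features of the transcription, as in the lexical feature set.
vectorizer = CountVectorizer(ngram_range=(1, 3))
X = vectorizer.fit_transform(turns).toarray()

# Each boosting iteration adds one weak learner (by default a depth-1 decision
# stump, i.e. a single "rule of thumb"). BoosTexter ran 2000 iterations; a much
# smaller number suffices for this toy data.
clf = AdaBoostClassifier(n_estimators=50)
clf.fit(X, labels)

# Prediction on an unseen toy utterance.
print(clf.predict(vectorizer.transform(["let me speak to a human"]).toarray()))
```

The real system additionally feeds the prosodic, dialog act, and contextual features described on the following slides into the same learner, for a total of 80 features per turn.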
Lexical features:
- Unigrams, bigrams and trigrams of the transcription
- Result: words correlating with a negative user state include
  - dollars, cents, call
  - person, human, speak, talking, machine
  - oh, sigh
- What can these results tell us about emotion annotation?

Prosodic features:
- Energy (loudness)
- F0 (pitch contour)
- Speaking rate
- Turn-final pitch contour
- Pitch accent
- Voice quality (jitter)
Normalization:
- Speaker normalization not possible (data sparsity)
- Gender normalization

Dialog act features:
- 65 specific, domain-dependent dialog act tags, e.g. Yes, Customer_Rep, Account_Balance
- Why should these tags work better than the words of the utterances?

Contextual features:
- First-order differentials of prosodic features wrt. the 2 previous utterances
- Second-order differentials of prosodic features wrt. the 2 previous utterances (Why?)
- Transcriptions of the 2 previous utterances
- Measure of repetition (Levenshtein edit distance)
- Dialog acts of the 2 previous user turns
- Dialog acts of the 2 previous system turns
- Once frustrated, always frustrated?
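A small sketch of how the contextual features above could be computed for a single user turn. The prosodic values and token sequences are invented, and the second-order differential is read here as "the change of the change" across the two previous turns, which is an assumption rather than the paper's exact definition.

```python
# Sketch of the contextual features over a 3-turn window (toy values, not
# corpus data). The exact feature definitions in Liscombe et al. may differ.

def levenshtein(a, b):
    """Edit distance between two token sequences, used as a repetition measure."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

# One prosodic measurement (say, mean F0) for turns t-2, t-1, t (toy values).
mean_f0 = [182.0, 190.5, 213.0]
words = [["my", "account", "balance"],
         ["account", "balance"],
         ["i", "said", "account", "balance"]]

# First-order differentials w.r.t. the two previous utterances.
d1_prev1 = mean_f0[2] - mean_f0[1]
d1_prev2 = mean_f0[2] - mean_f0[0]
# Second-order differential: is the change itself increasing?
d2 = (mean_f0[2] - mean_f0[1]) - (mean_f0[1] - mean_f0[0])

# Repetition: a small edit distance to the previous turn suggests the user is
# repeating themselves, a possible sign of frustration.
repetition = levenshtein(words[2], words[1])

print(d1_prev1, d1_prev2, d2, repetition)
```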
Results (Liscombe et al.)

No surprises... What do you think about these results?

Accuracy rate:
- Baseline (majority class): 73.1%
- Lexical + prosodic features: 76.1%
- Lexical + prosodic + dialog act features: 77.0%
- Lexical + prosodic + dialog act + context: 79.0%

Devillers & Vidrascu

Motivations:
- "The context of emergency gives a larger palette of complex and mixed emotions."
- Emotions in emergency situations are more extreme, and are "really felt in a natural way."
- Debate on acted vs. real emotions
- Ethical concerns?

Corpus:
- 688 dialogs, avg. 48 turns per dialog
- Annotation:
  - Decisions of 2 annotators are combined in a soft vector: emotion mixtures (a small sketch of such a combination follows below)
  - 8 coarse-level emotions, 21 fine-grained emotions
- Inter-annotator agreement for client turns: 0.57 (moderate)
- Consistency checks:
  - Self-reannotation procedure (85% similarity)
  - Perception test (no details given)

Classification

Restrict the corpus to:
- utterances from callers
- utterances annotated with one of the following non-mixed emotions: Anger, Fear, Relief, Sadness (justification for this choice?)

This yields 2258 utterances from 680 speakers.
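For illustration, here is one way two annotators' decisions could be merged into a soft emotion vector. The exact weighting scheme of Devillers & Vidrascu is not given in the slides, so the 2:1 split between a hypothetical major and minor label is an assumption; labels and values are illustrative only.

```python
# Sketch: combine two annotators' decisions into a soft emotion vector.
# The actual weighting used by Devillers & Vidrascu is not specified here;
# this assumes each annotator contributes equal mass, split 2:1 between a
# major and an optional minor label.
from collections import defaultdict

EMOTIONS = ["Anger", "Fear", "Relief", "Sadness", "Neutral"]

def soft_vector(annotations):
    """annotations: one (major_label, minor_label_or_None) pair per annotator."""
    vec = defaultdict(float)
    for major, minor in annotations:
        if minor is None:
            vec[major] += 1.0
        else:
            vec[major] += 2.0 / 3.0  # assumed major/minor split
            vec[minor] += 1.0 / 3.0
    total = sum(vec.values())
    return {e: vec[e] / total for e in EMOTIONS}

# Annotator 1 hears pure Fear; annotator 2 hears Fear mixed with some Sadness.
print(soft_vector([("Fear", None), ("Fear", "Sadness")]))
# Only turns whose annotation is non-mixed are kept for the experiments.
```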
Lexical cue model

Log-likelihood ratio:
- 4 unigram emotion models (1 for each emotion)
- A general task-specific model
- Interpolation coefficient to avoid data sparsity problems
- A coefficient of 0.75 gave the best results
(a minimal sketch of this interpolated model is given after the results below)

Stemming:
- Cut inflectional suffixes (more important for morphologically rich languages like French)
- Improves overall recognition rates by 12-13 points

Paralinguistic (prosodic) cues

100 features, fed into an SVM classifier:
- F0 (pitch contour) and spectral features (formants)
- Energy (loudness)
- Voice quality (jitter, shimmer, ...)
- Speaking rate, silences, pauses, filled pauses
- Mouth noise, laughter, crying, breathing
Normalized by speaker.

Here: ~24 client turns per dialog; Liscombe et al.: 3.5 client turns per dialog -> data sparsity

Voice quality:
- Jitter: cycle-to-cycle variation of pitch in the voice
- Shimmer: cycle-to-cycle variation of loudness in the voice
- NHR: noise-to-harmonics ratio
- HNR: harmonics-to-noise ratio

Results (Devillers & Vidrascu)

                        Anger   Fear   Relief   Sadness   Total
Number of utterances       49    384      107       100     640
Lexical cues              59%    90%      86%       34%     78%
Prosodic cues             39%    64%      58%       57%   59.8%

Relief is associated with lexical markers like "thanks" or "I agree".
"Sadness is more prosodic or syntactic than lexical." Comments?
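As announced above, here is a minimal sketch of the interpolated unigram lexical model: each emotion model is linearly interpolated with the general task-specific model (coefficient 0.75), and the utterance is assigned to the highest-scoring emotion. The counts, the example tokens, and the use of a plain log-likelihood maximum instead of the paper's exact log-likelihood ratio formulation are simplifying assumptions.

```python
# Sketch of interpolated unigram emotion models (lambda = 0.75). Toy counts;
# the paper's exact log-likelihood ratio formulation may differ in detail.
import math
from collections import Counter

LAMBDA = 0.75  # interpolation coefficient reported to give the best results

def unigram_model(token_lists):
    counts = Counter(tok for toks in token_lists for tok in toks)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# Toy training utterances per emotion (already stemmed, in a real setup).
training = {
    "Anger":   [["unacceptable", "wait", "hour"]],
    "Fear":    [["help", "breath", "quick"]],
    "Relief":  [["thank", "agree", "good"]],
    "Sadness": [["alone", "tire", "cry"]],
}
emotion_models = {e: unigram_model(u) for e, u in training.items()}
general_model = unigram_model([u for us in training.values() for u in us])

def score(tokens, emotion, floor=1e-6):
    total = 0.0
    for w in tokens:
        p_e = emotion_models[emotion].get(w, 0.0)
        p_g = general_model.get(w, 0.0)
        p = LAMBDA * p_e + (1 - LAMBDA) * p_g  # smooth with the general model
        total += math.log(max(p, floor))       # floor avoids log(0)
    return total

utterance = ["thank", "good"]
scores = {e: score(utterance, e) for e in emotion_models}
print(max(scores, key=scores.get))  # "Relief" for this toy utterance
```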
Results: comparison

                                            Liscombe et al.      Devillers & Vidrascu
Baseline                                    73.1% (majority)     25% (random)
Lexical / linguistic features               --                   78%
Prosodic / paralinguistic features          75.2% (see thesis)   59.8%
Lexical + prosodic features                 76.1%                --
Lexical + prosodic + dialog act features    77.0%                --
Lexical + prosodic + dialog act + context   79.0%                --