Criteria and metrics for thresholded AU detection
Jeff Girard and Jeff Cohn
University of Pittsburgh, Affect Analysis Group (http://pitt.edu/~emotion)
BeFIT Workshop, ICCV 2011

Facial Action Coding System
• Facial action informs:
  – Emotion Displays
  – Pain Displays
  – Social Signaling
• FACS Coding System:
  – Anatomically-based
  – Common vocabulary
  – Objective and reliable
  – Action Unit 6+12+25+26
November 13, 2011 BeFIT Workshop, ICCV 2011

Automatic AU Detection
• FACS training, reliability, and manual coding are time-consuming
• Automatic AU detection would be faster and enable “online” coding
• However, automatic coding requires a trained classifier
• Classifier training requires time, expertise, and ground truth coding
• Classifiers are considered corpus-specific; thus a new classifier is usually trained for each corpus

Classifier Strategy Comparison

Novel Classifier Training
[Diagram boxes: Data Collection; Groundtruth Coding (subset); Classifier Training (subset); Automatic Coding]
Strengths:
+ Classifier trained on same database
Limitations:
- Requires ground truth coding
- Requires classifier training

Naïve Classifier Implementation
[Diagram boxes: Classifier from other Database; Data Collection; Automatic Coding]
Strengths:
+ Requires no ground truth coding
+ Requires no classifier training
Limitations:
- Classifier not trained on same database

Threshold Analysis Alternative
[Diagram boxes: Classifier from other Database; Data Collection; Threshold Analysis (subset); Automatic Coding]
Strengths:
+ Requires no new classifier training
+ Threshold optimized for current database
Limitations:
- Classifier not trained on current database
- Requires some ground truth coding

Step 1 – Obtain Classifier
SVM Classifier
• SIFT Feature Data
• Radial Basis Kernel
• 3-fold cross-validation
RU-FACS-1 Database
• Classifier Training Set – 17 subjects (97000 frames)
• Classifier Testing Set – 11 subjects (67000 frames)
• False Opinion Paradigm

Step 2 – Implement Classifier
Spectrum Database
• Threshold Training Set – 23 subjects (88000 frames)
• Threshold Testing Set – 13 subjects (37000 frames)
• HRSD-17 Depression Interview
[Figure: example images labeled 4, 10, 12, 14]

Step 3 – Identify Potential Thresholds
[Figure: SVM_12. SVM Decision Value (y-axis) plotted against Frame Number (x-axis).]
• Find minimum and maximum SVM decision values for each AU
• Separate this range into equal steps to identify potential thresholds
• This study compared 250 thresholds for each of the action units
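
The bullets above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `candidate_thresholds`, the toy decision values, and the choice to include both endpoints of the range are assumptions.

```python
def candidate_thresholds(decision_values, n_thresholds=250):
    """Return n_thresholds equally spaced values from the minimum to the
    maximum decision value (both endpoints included, by assumption)."""
    lo, hi = min(decision_values), max(decision_values)
    if n_thresholds < 2 or lo == hi:
        return [lo]
    step = (hi - lo) / (n_thresholds - 1)
    return [lo + i * step for i in range(n_thresholds)]


# Toy SVM decision values for one action unit (made-up numbers).
svm_12 = [-0.9, -0.4, 0.1, 0.6, 1.0, 0.3, -0.2]
thresholds = candidate_thresholds(svm_12)
print(len(thresholds), thresholds[0], thresholds[-1])
```

The same function would be called once per action unit, since each AU has its own range of decision values.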

Step 4 – Generate Predictions
[Figure: SVM_12 decision values with the threshold marked, and the resulting Thresholded Prediction, both plotted against frame number.]
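
A minimal sketch of this step, assuming a frame is labeled as showing the action unit when its decision value is strictly greater than the threshold. The slide does not say whether the comparison is strict, and the name `threshold_predictions` and the toy values are illustrative.

```python
def threshold_predictions(decision_values, threshold):
    """Label a frame 1 (AU present) when its decision value exceeds the
    threshold, else 0. The strict '>' is an assumption."""
    return [1 if v > threshold else 0 for v in decision_values]


# Toy SVM decision values (made-up numbers).
svm_12 = [-0.9, -0.4, 0.1, 0.6, 1.0, 0.3, -0.2]
predicted = threshold_predictions(svm_12, threshold=0.25)
print(predicted)  # [0, 0, 0, 1, 1, 1, 0]
```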

Step 5 – Compare to Groundtruth
[Figure: Thresholded Prediction and Groundtruth Labels, both plotted against frame number.]
Accuracy = 0.855
F1 = 0.756
Kappa = 0.656
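
The three metrics can be computed from the frame-level confusion counts. The sketch below uses the standard definitions and assumes that "Kappa" means Cohen's kappa. The function names and toy labels are illustrative, and the toy data will not reproduce the values on the slide.

```python
def confusion_counts(predicted, truth):
    """Count true/false positives and negatives for binary 0/1 labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    return tp, tn, fp, fn


def accuracy(predicted, truth):
    tp, tn, fp, fn = confusion_counts(predicted, truth)
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(predicted, truth):
    tp, tn, fp, fn = confusion_counts(predicted, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(predicted, truth):
    """Agreement corrected for the agreement expected by chance."""
    tp, tn, fp, fn = confusion_counts(predicted, truth)
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Toy labels (made-up): 2 true positives, 3 true negatives, 1 FP, 1 FN.
predicted = [0, 0, 0, 1, 1, 1, 0]
truth = [0, 0, 1, 1, 1, 0, 0]
print(accuracy(predicted, truth), f1_score(predicted, truth),
      cohen_kappa(predicted, truth))
```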

Step 6 – Identify Optimal Thresholds
[Figure: two panels, "AU_4 Threshold Training" and "AU_10 Threshold Training". Each plots Accuracy, F1, and Kappa (y-axis: Score on Performance Metric) against Threshold Value (x-axis).]

Step 6 – Identify Optimal Thresholds (continued)
[Figure: two panels, "AU_12 Threshold Training" and "AU_14 Threshold Training". Each plots Accuracy, F1, and Kappa (y-axis: Score on Performance Metric) against Threshold Value (x-axis).]
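
One way to read Step 6 is as a grid search: score every candidate threshold on the threshold training set and keep the one with the highest value of the chosen metric. The sketch below does this for F1 on made-up data. The names `best_threshold` and `f1_score`, the strict `>` comparison, and the tie-breaking rule (the lowest threshold wins) are assumptions, not details taken from the slides.

```python
def f1_score(predicted, truth):
    """F1 for binary 0/1 labels; 0.0 when there are no positives at all."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(decision_values, truth, metric, n_thresholds=250):
    """Return (threshold, score) for the equally spaced candidate that
    maximizes metric(predicted, truth). Ties keep the lowest threshold."""
    lo, hi = min(decision_values), max(decision_values)
    step = (hi - lo) / (n_thresholds - 1)
    best_thr, best_score = None, float("-inf")
    for i in range(n_thresholds):
        thr = lo + i * step
        predicted = [1 if v > thr else 0 for v in decision_values]
        score = metric(predicted, truth)
        if score > best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score


# Toy threshold-training data (made-up numbers).
values = [-0.9, -0.4, 0.1, 0.6, 1.0, 0.3, -0.2]
truth = [0, 0, 1, 1, 1, 1, 0]
thr, score = best_threshold(values, truth, f1_score)
print(thr, score)
```

Passing a different metric function (accuracy or kappa, for example) selects a different optimum, which is what the per-metric curves in the figures compare.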

Overall Results in Testing Subset
[Figure: bar chart comparing Naïve Classifier and Threshold Analysis on Accuracy, F1, and Kappa (y-axis: Score on Performance Metric). Significance labels shown, in the order listed on the slide: p < .0001, p < .002, p < .0001.]

Comparison to FERA Winner
[Figure: bar chart of Overall F1 for Naïve Implementation, Threshold Analysis, and FERA Winner*.]

Performance Gains by Action Unit
[Figure: bar chart of Percent Increase in Performance (y-axis) on Accuracy, F1, and Kappa for AU_4, AU_10, AU_12, and AU_14.]

Future Directions
• Additional Databases and Action Units
• Additional Feature and Classifier types
• Determine required size of training set
• Smoothing to remove noise in predictions
• Compare directly to Novel Classifier Training
Contact: jmg174@pitt.edu

Predictions with Smoothing
[Figure: Thresholded Prediction, Thresholded Prediction (with smoothing), and Groundtruth Labels, each plotted against frame number.]
Without smoothing: Accuracy = 0.855, F1 = 0.756, Kappa = 0.656
With smoothing: Accuracy = 0.896, F1 = 0.826, Kappa = 0.754
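
The slides do not say which smoothing method was used. As one plausible illustration only, the sketch below applies a sliding-window majority vote (equivalent to a median filter on binary labels). The function name `smooth_predictions`, the window size, and the toy sequence are all assumptions.

```python
def smooth_predictions(predicted, window=5):
    """Replace each frame's label with the majority label in a centered
    window. The window is truncated at the ends of the sequence."""
    half = window // 2
    smoothed = []
    for i in range(len(predicted)):
        neighborhood = predicted[max(0, i - half): i + half + 1]
        smoothed.append(1 if 2 * sum(neighborhood) > len(neighborhood) else 0)
    return smoothed


# Toy prediction with an isolated spike (frame 2) and a one-frame gap (frame 7).
noisy = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0]
print(smooth_predictions(noisy, window=3))
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]
```

A wider window removes longer runs of noise but also erases genuinely brief action unit events, so the window size would need tuning on ground truth data.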

Threshold Training Set
[Figure: bar chart of training set results comparing Naïve Implementation and Threshold Analysis on Accuracy, F1, and Kappa (y-axis: Score on Performance Metric).]

Results by Threshold Type
[Figure: bar chart comparing five threshold types (Zero, maxAc, maxF1, maxKa, EER) on Accuracy, F1, and Kappa (y-axis: Score on Performance Metric).]
The threshold that maximized Accuracy performed poorly on F1 and Kappa.
The thresholds that maximized F1 or Kappa, and the EER threshold, performed best on all metrics.
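
The slide does not define the EER threshold. Assuming EER stands for equal error rate, as is usual, it is the threshold at which the false positive rate and the false negative rate are as close to each other as possible. A minimal sketch under that assumption, with illustrative names and made-up data:

```python
def error_rates(decision_values, truth, threshold):
    """Return (false positive rate, false negative rate) at a threshold.
    Requires at least one positive and one negative frame in truth."""
    pairs = list(zip(decision_values, truth))
    fp = sum(1 for v, t in pairs if v > threshold and t == 0)
    fn = sum(1 for v, t in pairs if v <= threshold and t == 1)
    return fp / truth.count(0), fn / truth.count(1)


def eer_threshold(decision_values, truth, n_thresholds=250):
    """Pick the equally spaced candidate where FPR and FNR are closest."""
    lo, hi = min(decision_values), max(decision_values)
    step = (hi - lo) / (n_thresholds - 1)
    candidates = [lo + i * step for i in range(n_thresholds)]

    def gap(thr):
        fpr, fnr = error_rates(decision_values, truth, thr)
        return abs(fpr - fnr)

    return min(candidates, key=gap)


# Toy data (made-up): one negative frame scores above one positive frame.
values = [-0.9, -0.4, 0.1, 0.6, 1.0, 0.3, -0.2]
truth = [0, 0, 1, 1, 1, 0, 0]
thr = eer_threshold(values, truth)
print(thr, error_rates(values, truth, thr))
```

Unlike the accuracy-maximizing threshold, this criterion weights errors on the rare class as heavily as errors on the common class.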
