Criteria and metrics for thresholded AU detection
Jeff Girard and Jeff Cohn
University of Pittsburgh – Affect Analysis Group
http://pitt.edu/~emotion
BeFIT Workshop, ICCV 2011
Facial Action Coding System
• Facial action informs:
  – Emotion Displays
  – Pain Displays
  – Social Signaling
• FACS Coding System:
  – Anatomically-based
  – Common vocabulary
  – Objective and reliable
  – Action Unit 6+12+25+26
November 13, 2011 BeFIT Workshop, ICCV 2011
Automatic AU Detection
• FACS training, reliability, and manual coding are time consuming
• Automatic AU detection would be faster and enable “online” coding
• However, automatic coding requires a trained classifier
• Classifier training requires time, expertise, and ground truth coding
• Classifiers are considered corpus-specific; thus a new classifier is usually trained for each corpus
Classifier Strategy Comparison
Novel Classifier Training
• Diagram stages: Data Collection, Groundtruth Coding (subset), Classifier Training (subset), Automatic Coding
• Strengths:
  + Classifier trained on same database
• Limitations:
  - Requires ground truth coding
  - Requires classifier training
Naïve Classifier Implementation
• Diagram stages: Classifier from other Database, Data Collection, Automatic Coding
• Strengths:
  + Requires no ground truth coding
  + Requires no classifier training
• Limitations:
  - Classifier not trained on same database
Threshold Analysis Alternative
• Diagram stages: Classifier from other Database, Data Collection, Threshold Analysis (subset), Automatic Coding
• Strengths:
  + Requires no new classifier training
  + Threshold optimized for current database
• Limitations:
  - Classifier not trained on current database
  - Requires some ground truth coding
Step 1 – Obtain Classifier
SVM Classifier
• SIFT Feature Data
• Radial Basis Kernel
• 3-fold cross-validation
RU-FACS-1 Database
• Classifier Training Set
  – 17 subjects (97000 frames)
• Classifier Testing Set
  – 11 subjects (67000 frames)
• False Opinion Paradigm
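For reference, the "SVM decision value" that later steps threshold can be written in the conventional form below for a radial basis kernel. The symbols (support-vector weights \alpha_i, labels y_i, bias b, kernel width \gamma) are standard SVM notation and are assumptions here; they do not appear on the slides.

```latex
K(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\gamma \,\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}\right),
\qquad
f(\mathbf{x}) = \sum_{i} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b
```

A conventional SVM labels a frame positive when f(x) > 0. Threshold analysis, as described in the following steps, replaces that fixed zero with a value tuned on the new database.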
Step 2 – Implement Classifier
Spectrum Database
• Threshold Training Set
  – 23 subjects (88000 frames)
• Threshold Testing Set
  – 13 subjects (37000 frames)
• HRSD-17 Depression Interview
[Figure: panels labeled 10, 12, 14, 4]
Step 3 – Identify Potential Thresholds
[Figure: SVM_12; SVM Decision Value (-1 to 1) vs. Frame Number (0 to 1200)]
• Find minimum and maximum SVM decision values for each AU
• Separate this range into equal steps to identify potential thresholds
• This study compared 250 thresholds for each of the action units
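The candidate-threshold step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `candidate_thresholds`, the example values, and the choice to include both endpoints of the range are assumptions.

```python
def candidate_thresholds(decision_values, n_thresholds=250):
    """Return n_thresholds equally spaced values spanning min..max (inclusive).

    Including both endpoints is an assumption; the slides only say the
    range is separated into equal steps.
    """
    lo, hi = min(decision_values), max(decision_values)
    if n_thresholds < 2 or lo == hi:
        return [lo]
    step = (hi - lo) / (n_thresholds - 1)
    return [lo + i * step for i in range(n_thresholds)]


# Hypothetical SVM decision values for one action unit
values = [-0.9, -0.2, 0.1, 0.75, 0.4]
grid = candidate_thresholds(values, n_thresholds=5)
print(grid)
```

With the default of 250, this yields one grid per action unit, matching the number of thresholds the study reports comparing.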
Step 4 – Generate Predictions
[Figure: SVM_12 decision values with Threshold line, and the resulting Thresholded Prediction, frames 0 to 1200]
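Generating predictions from a threshold amounts to a per-frame comparison. The sketch below assumes a strict "greater than" rule and uses made-up decision values; the slides do not say how ties at the threshold are handled.

```python
def apply_threshold(decision_values, threshold):
    """Label a frame 1 (AU present) when its decision value exceeds the threshold.

    Strict '>' is an assumption; '>=' would be an equally plausible reading.
    """
    return [1 if v > threshold else 0 for v in decision_values]


# Hypothetical decision values for six consecutive frames
frames = [-0.8, -0.1, 0.3, 0.6, 0.2, -0.4]
predictions = apply_threshold(frames, threshold=0.25)
print(predictions)
```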
Step 5 – Compare to Groundtruth
[Figure: Thresholded Prediction and Groundtruth Labels, frames 0 to 1200]
Accuracy = 0.855, F1 = 0.756, Kappa = 0.656
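The three metrics can be computed from frame-level confusion counts. The sketch below uses the standard definitions of accuracy, F1, and Cohen's kappa for binary labels; the helper names and the toy label sequences are assumptions, and the slides do not show how the reported values were computed.

```python
def confusion_counts(predicted, truth):
    """Return (tp, fp, fn, tn) for two equal-length binary sequences."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def f1(tp, fp, fn, tn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Hypothetical per-frame labels
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
counts = confusion_counts(predicted, truth)
print(accuracy(*counts), f1(*counts), kappa(*counts))
```

Accuracy counts agreement on absent frames as much as on present ones, so it can stay high when an AU is rare; F1 ignores true negatives and kappa corrects for chance agreement.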
Step 6 – Identify Optimal Thresholds
[Figure: AU_4 Threshold Training and AU_10 Threshold Training; Score on Performance Metric (Accuracy, F1, Kappa; 0 to 1) vs. Threshold Value (AU_4: -6 to 6; AU_10: -4 to 10)]
Step 6 – Identify Optimal Thresholds (continued)
[Figure: AU_12 Threshold Training and AU_14 Threshold Training; Score on Performance Metric (Accuracy, F1, Kappa; 0 to 1) vs. Threshold Value (AU_12: -4 to 4; AU_14: -5 to 20)]
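Selecting an optimal threshold can be sketched as a search over candidate values that keeps whichever scores highest on a chosen metric. The function names, the toy data, and the tie-breaking rule (the first maximum wins) are assumptions for illustration; F1 is used here, but any metric with the same signature could be passed in.

```python
def f1_score(predicted, truth):
    """F1 from two equal-length binary sequences (0.0 when undefined)."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(decision_values, truth, thresholds, metric):
    """Return (threshold, score) for the candidate with the highest metric.

    Ties go to the first candidate encountered; that rule is an assumption.
    """
    best_t, best_score = None, float("-inf")
    for t in thresholds:
        predicted = [1 if v > t else 0 for v in decision_values]
        score = metric(predicted, truth)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Hypothetical threshold-training data for one action unit
values = [-1.0, -0.5, 0.2, 0.4, 0.9, 1.5]
truth = [0, 0, 0, 1, 1, 1]
thresholds = [-1.0, -0.5, 0.0, 0.3, 0.5, 1.0]
chosen, score = best_threshold(values, truth, thresholds, f1_score)
print(chosen, score)
```

In the workflow described on these slides, the search would run on the threshold training set, and the chosen value would then be applied unchanged to the threshold testing set.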
Overall Results in Testing Subset
[Figure: Naïve Classifier vs. Threshold Analysis; Score on Performance Metric (0 to 1) for Accuracy, F1, and Kappa]
• Accuracy: p < .0001
• F1: p < .002
• Kappa: p < .0001
Comparison to FERA Winner
[Figure: Overall F1 (0 to 1) for Naïve Implementation, Threshold Analysis, and FERA Winner*]
Performance Gains by Action Unit
[Figure: Percent Increase in Performance (0% to 80%) in Accuracy, F1, and Kappa for AU_4, AU_10, AU_12, and AU_14]
Future Directions
• Additional Databases and Action Units
• Additional Feature and Classifier types
• Determine required size of training set
• Smoothing to remove noise in predictions
• Compare directly to Novel Classifier Training
jmg174@pitt.edu
Predictions with Smoothing
[Figure: Thresholded Prediction, Thresholded Prediction (with smoothing), and Groundtruth Labels, frames 0 to 1200]
• Thresholded Prediction: Accuracy = 0.855, F1 = 0.756, Kappa = 0.656
• Thresholded Prediction (with smoothing): Accuracy = 0.896, F1 = 0.826, Kappa = 0.754
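The slides do not say which smoothing method was used. As one plausible illustration only, a sliding-window majority vote (equivalent to a median filter on binary labels) removes isolated single-frame flips. The function name, window size, and example sequence below are assumptions.

```python
def majority_smooth(labels, window=5):
    """Replace each label with the majority label in a centered window.

    The window is truncated at the sequence edges; a tie in a truncated
    window resolves to 0. Both choices are assumptions of this sketch.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        segment = labels[max(0, i - half):min(len(labels), i + half + 1)]
        smoothed.append(1 if 2 * sum(segment) > len(segment) else 0)
    return smoothed


# Hypothetical noisy thresholded prediction
noisy = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0]
smoothed = majority_smooth(noisy, window=3)
print(smoothed)
```

In this toy sequence the isolated positive at frame 2 is removed and the one-frame gap at frame 7 is filled, which is the kind of noise the slide's comparison suggests smoothing addresses.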
Threshold Training Set
[Figure: Training Set; Naïve Implementation vs. Threshold Analysis; Score on Performance Metric (0 to 1) for Accuracy, F1, and Kappa]
Results by Threshold Type
[Figure: thresholds Zero, maxAc, maxF1, maxKa, and EER; Score on Performance Metric (0 to 1) for Accuracy, F1, and Kappa]
• The threshold that maximized Accuracy performed poorly on F1 and Kappa.
• The thresholds that maximized F1 or Kappa, and the EER threshold, performed best on all metrics.
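Unlike the maxAc, maxF1, and maxKa thresholds, the EER (equal error rate) threshold is not chosen by maximizing a metric. It is the point where the false positive rate and the false negative rate are equal, or as close as the candidate grid allows. A minimal sketch, with hypothetical function names and toy data:

```python
def error_rates(decision_values, truth, threshold):
    """Return (false positive rate, false negative rate) at a threshold."""
    fp = fn = positives = negatives = 0
    for v, t in zip(decision_values, truth):
        predicted = 1 if v > threshold else 0
        if t == 1:
            positives += 1
            fn += predicted == 0
        else:
            negatives += 1
            fp += predicted == 1
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


def eer_threshold(decision_values, truth, thresholds):
    """Pick the candidate whose FPR and FNR are closest to each other."""
    def gap(t):
        fpr, fnr = error_rates(decision_values, truth, t)
        return abs(fpr - fnr)
    return min(thresholds, key=gap)


# Hypothetical decision values and groundtruth labels
values = [-1.0, -0.6, -0.2, 0.1, 0.3, 0.5, 0.8, 1.2]
truth = [0, 0, 0, 1, 0, 1, 1, 1]
thresholds = [-0.8, -0.4, 0.0, 0.2, 0.4, 0.6, 1.0]
eer_t = eer_threshold(values, truth, thresholds)
print(eer_t, error_rates(values, truth, eer_t))
```

Because it balances the two error types instead of counting correct frames, the EER threshold does not favor the majority class when an AU is rare.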