High-Resolution Comprehensive 3-D Dynamic Database for Facial Articulation Analysis
Bogdan J. Matuszewski, Wei Quan, Lik-Kwan Shark
bmatuszewski1@uclan.ac.uk
Applied Digital Signal and Image Processing (ADSIP) Research Centre
School of Computing, Engineering and Physical Sciences
University of Central Lancashire (UCLan), Preston PR1 2HE, UK
Presentation Outline
• Motivation
• Structure of the Hi4D-ADSIP database
• Facial expression validation
• Facial dysfunction analysis
• Conclusions
Facial Articulation Databases
A representative sample of the existing databases
Hi4D-ADSIP database
Motivation
• High-resolution 3D dynamic facial scans represent the facial structure related to the "internal" face anatomy, rather than only the external appearance, more closely; such data therefore promise greater applicability for biomedical applications:
- head and neck radiation therapy;
- corrective plastic surgery;
- quantitative assessment of neurological conditions (stroke, Bell's palsy, Parkinson's disease);
- aging.
• 3D dynamic data should enable the construction of more accurate facial models, e.g. for HCI, biometrics and security:
- facial composites from crime witness accounts (efit/evofit) – there is some evidence that facial expressions and facial dynamics can improve the success rate of such systems.
But the Hi4D-ADSIP database has been designed for general use!
Hi4D-ADSIP database
Experimental Setup
• 3D facial sequences captured at 60 fps using six cameras with 2352×1728 pixels each (a scanner from Dimensional Imaging was used).
• Audio synchronised with the 3D recordings.
• Sessions also recorded on a camcorder.
Hi4D-ADSIP database
Database Structure
• Currently there are 80 "control" subjects in the database. 65 of them are undergraduate students from the Performing Arts Department at UCLan; the rest are postgraduate students and staff from the university. They are of different ethnic origins, with ages ranging between 18 and 60; 48 are female and 32 are male.
• Seven expressions were performed (acted) by each subject: Anger, Disgust, Fear, Happiness, Sadness, Surprise and Pain, at three levels of intensity: 'mild', 'normal' and 'extreme'. Additionally, each subject was asked to perform mouth and eyebrow articulations, as well as to read five phrases typically used in the assessment of some neurological conditions, again at three intensity levels.
• Recorded sequences last between three and five seconds.
• In total there are 3,360 recorded 3D sequences (80 subjects × 14 articulations × 3 intensity levels), amounting to ~610,000 3D face models.
Three levels of disgust
Mouth and eyebrows articulation
Standard phrases used in the assessment of neurological patients:
1. You know how
2. Down to earth
3. I got home from work
4. Near the table in the dining room
5. They heard him speak on the radio last night
Hi4D-ADSIP database
Facial Expression Validation
Each recorded video clip is assessed by 5 observers. To make the task manageable, during an observation session each observer assesses 105 video clips (5 subjects with 7 expressions at 3 expression intensity levels), with subjects assigned to the observer randomly. Observers are asked to provide a confidence rating for each observed sequence, with values in the range 0 to 100%. For a given video clip, the ratings can be distributed over the various expressions as long as the scores add up to 100%. Each expression has an associated mean confidence vector.
Confidence scores have a grand mean of 60%; by actor: 54%–80%; by expression: 35%–83%; by level: 57%–65%.
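As a concrete illustration of this bookkeeping, the sketch below averages the observers' confidence vectors for each acted expression; the resulting rows are the mean confidence vectors and, stacked together, form the human-observer confusion matrix shown later. The data layout and all names (ratings, mean_confidence_vectors) are illustrative assumptions, not the authors' code.

import numpy as np

EXPRESSIONS = ["Anger", "Disgust", "Fear", "Happiness",
               "Sadness", "Surprise", "Pain"]

# ratings: {(subject, expression, level, observer): 7-vector of confidence
# scores summing to 100} -- an assumed layout for the observer data.
def mean_confidence_vectors(ratings):
    """Return a 7x7 array whose row i is the mean confidence vector for
    acted expression i; the diagonal holds the 'correct' confidences."""
    sums = np.zeros((7, 7))
    counts = np.zeros(7)
    for (subject, expr, level, observer), vec in ratings.items():
        vec = np.asarray(vec, dtype=float)
        assert abs(vec.sum() - 100.0) < 1e-6  # scores must add up to 100%
        i = EXPRESSIONS.index(expr)
        sums[i] += vec
        counts[i] += 1
    return sums / counts[:, None]

Because the design is balanced (the same number of clips per expression), the grand mean confidence is simply the average of the diagonal entries of this matrix.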
Hi4D-ADSIP database
Facial Expression Validation
Happiness expressions were given high confidence scores of 83% on average, whereas fear expressions were rated worst, with an average confidence score of only 35%. Also, the normal intensity level was on average rated somewhat better than mild, and extreme on average somewhat better than normal. Interestingly, for two expressions, happiness and pain, the extreme-level confidence scores were lower than those for the normal level.

Intensity   Anger   Disgust  Fear    Happiness  Sadness  Surprise  Pain    Mean
            (%)     (%)      (%)     (%)        (%)      (%)       (%)     (%)
Mild        59.44   50.42    25.13   86.50      39.63    60.50     75.62   56.75
Normal      60.92   53.07    32.66   85.67      51.71    61.50     79.33   60.69
Extreme     64.36   61.50    48.25   78.21      61.70    64.75     75.33   64.87
Mean        61.57   54.99    35.34   83.46      51.01    62.25     76.76   60.77

Human observers' mean confidence results for the seven expressions
Hi4D-ADSIP database
Facial Expression Validation

            Anger   Disgust  Fear    Happiness  Sadness  Surprise  Pain
            (%)     (%)      (%)     (%)        (%)      (%)       (%)
Anger       61.57   18.38    4.22    0.56       2.93     3.60      8.75
Disgust     16.37   54.99    7.93    0.58       8.31     5.24      6.58
Fear        3.84    9.69     35.34   0.00       9.86     33.71     7.56
Happiness   0.44    1.44     2.64    83.46      2.36     7.82      1.83
Sadness     7.67    14.24    7.43    1.64       51.01    15.72     2.29
Surprise    2.19    9.33     7.23    0.69       5.25     62.25     13.06
Pain        1.06    1.68     9.10    6.17       3.65     1.58      76.76

Human observer confidence confusion matrix (rows: acted expression; columns: perceived expression; each row sums to 100%, and the diagonal matches the mean confidence scores above)
Hi4D-ADSIP database
Baseline Automatic Facial Expression Recognition
The recognition rates for the different facial expressions vary between 98% for the happiness expression and 68% for the disgust expression. This difference between recognition rates mirrors the results obtained for human observers, with happiness having the highest recognition rate; anger, surprise and pain medium; and disgust and sadness the lowest recognition rates.

            Anger   Disgust  Fear    Happiness  Sadness  Surprise  Pain
            (%)     (%)      (%)     (%)        (%)      (%)       (%)
Anger       82.70   5.30     6.67    0.00       1.30     1.30      2.67
Disgust     10.67   68.00    6.67    0.00       8.00     0.00      6.67
Fear        6.67    0.00     82.70   0.00       2.70     5.30      2.67
Happiness   0.00    0.00     0.00    98.70      0.00     1.33      0.00
Sadness     9.30    4.30     1.30    1.30       77.30    0.00      6.67
Surprise    0.00    0.00     1.30    2.67       2.67     81.33     0.00
Pain        8.00    1.30     2.67    0.00       2.67     0.00      85.33

Confusion matrix of the kNN classifier with the Fisher-face representation
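The presentation does not detail the feature extraction or classifier settings; as a rough sketch of how such a baseline is typically assembled, the pipeline below implements the classical Fisher-face idea (PCA followed by LDA) with a kNN classifier on top, using scikit-learn. The parameters n_pca and k are illustrative assumptions.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

def make_fisherface_knn(n_pca=100, k=3):
    # Fisher-faces: PCA first, to avoid a singular within-class scatter
    # matrix, then LDA (at most n_classes - 1 = 6 discriminant axes),
    # and finally a kNN classifier in the discriminant space.
    return make_pipeline(
        PCA(n_components=n_pca),
        LinearDiscriminantAnalysis(),
        KNeighborsClassifier(n_neighbors=k),
    )

# X: one feature vector per sequence (e.g. flattened, registered face
# geometry); y: expression labels 0..6.
# clf = make_fisherface_knn().fit(X_train, y_train)
# cm = sklearn.metrics.confusion_matrix(y_test, clf.predict(X_test))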
Facial dysfunction analysis
The constructed database can benefit research and development in diverse applications, including psychology, security, biometrics, entertainment and medical assessment. To the best of the authors' knowledge this is, to date, the most comprehensive repository of 3D dynamic facial articulations.
In parallel to the proposed "control subjects" database, a "clinical subjects" database is also being constructed for stroke, Bell's palsy and Parkinson's disease. These databases are currently being used in studies on the detection and quantification of facial dysfunction in neurological patients. Based on the analysis of facial asymmetry, preliminary results from the ongoing study on stroke patients suggest that dynamic 3D optical scanning is a feasible technique for the accurate and robust quantification of facial paresis.
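The asymmetry analysis itself is not detailed here; purely to illustrate the kind of measure involved, the sketch below mirrors bilateral 3D landmarks across a sagittal plane fitted to midline landmarks and reports the mean residual distance. The landmark layout, the plane-fitting step and the choice of measure are all assumptions, not the study's actual method.

import numpy as np

def asymmetry_index(landmarks, pairs, midline):
    """landmarks: (N, 3) array of 3D facial landmarks for one frame;
    pairs: list of (left_idx, right_idx) bilateral landmark pairs;
    midline: indices of landmarks on the facial midline.
    Returns the mean left-right mirroring residual (scan units, e.g. mm)."""
    mid = landmarks[midline]
    c = mid.mean(axis=0)
    # Sagittal plane through the midline points: its normal is the
    # direction of least spread of those points (last right singular vector).
    _, _, vt = np.linalg.svd(mid - c)
    n = vt[-1]
    total = 0.0
    for left, right in pairs:
        p = landmarks[left]
        mirrored = p - 2.0 * np.dot(p - c, n) * n  # reflect across the plane
        total += np.linalg.norm(mirrored - landmarks[right])
    return total / len(pairs)

Applied frame by frame to a 3D sequence, such an index yields an asymmetry curve whose magnitude and dynamics can be compared between the control and clinical recordings.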
Conclusions
• The presentation introduced the Hi4D-ADSIP 3D dynamic facial articulation database.
• Currently it contains 3,360 recorded sequences showing 14 different articulations acquired from 80 subjects.
• Another ~20 older (>50 years old) subjects will be added soon to reduce the age bias in the database.
• A clinical version of the database for neurological patients (stroke, Bell's palsy and Parkinson's disease) is under construction; it currently contains 32 subjects (mostly stroke and Parkinson's disease).
• Subject to final validation of the tracking results and completion of the human observer validation, the database will soon (January/February 2012) be made publicly available for research (non-profit) purposes. A sample of the database is currently available (for non-profit use); please send an email to bmatuszewski1@uclan.ac.uk. We would be grateful for feedback, as it will help us improve the database for the full release.