Facial Expression Detection using Patch-based Eigen-face Isomap Networks
By: Sohini Roychowdhury, Assistant Professor, Department of Electrical and Computer Engineering, University of Washington, Bothell, WA, USA
Outline
• Introduction
• Facial Patch Creation
• Eigen-Face Creation
• Facial Network Clustering
• Facial Network Analysis
• Results
• Conclusions
Introduction
• Automated Facial Expression Detection:
  • Useful for real-time security surveillance systems and social networks [1].
  • Challenges due to variations in: pose, lighting, imaging distortions, expression, and occlusions.
• Motivation:
  • Patched faces have better expression clustering performance than full faces.
  • Clustering minimizes training data complexity.
• Goal: To design a network-based expression classification system with low computational time complexity.
(Image sources: http://mostepicstuff.com/app-that-changes-your-facial-expression-to-cartoon-look/ and http://www.smithsonianmag.com/innovation/app-captures-emotions-real-time-180951878/?no-ist)
Prior Work
• Two categories of existing facial expression detection algorithms:
  1. Based on extracting feature vectors from parts of a face, such as the eyes, nose, mouth, and chin, with the help of deformable templates [2][3]. High computational complexity.
  2. Based on information-theoretic concepts such as the principal component analysis method [4-6]. Not very effective; a large training data set is required.
• The proposed method involves:
  • Guided patch creation followed by Isomap clustering of the patched Eigen-faces for unsupervised classification.
  • Two classification tasks are performed:
    1. Classification of images with occlusions (mainly glasses and beards).
    2. Classification of smiling faces.
• Low computational time complexity:
  • Unsupervised classification requires a runtime of less than 1 second for a dataset of 80 images of original dimension [112x92] each, on a 2.6 GHz, 2 GB RAM laptop.
Key Contributions
1. Facial Expression Network-based clustering requires only 2 training data samples for expression clustering.
2. Facial Expression Network analysis identifies the faces at the edge of the expression clusters as vital expression detectors. Network centrality and flow-based measures can further demonstrate the expression information flow in the networks.
Data Set: 80 images, corresponding to the 1st and 10th image per person for 40 people [2x40=80 images], are used from the ORL Database of Faces [7]. Each image of dimension [112x92] is resized to [90x90] for computational simplicity.
Facial Patch Creation
Fig 1: Extraction of high-pass filtered regions of interest and face patches corresponding to the eye and mouth regions, respectively.
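A minimal Python sketch of this step, assuming OpenCV for resizing and filtering; the specific high-pass filter and the eye/mouth crop coordinates below are illustrative, not taken from the slide:

import numpy as np
import cv2  # assumed dependency for resizing and filtering

def create_face_patches(face_gray):
    # Resize to [90x90] as in the slides, then high-pass filter the face by
    # subtracting a Gaussian-blurred (low-pass) copy from the original.
    face = cv2.resize(face_gray, (90, 90)).astype(np.float32)
    low_pass = cv2.GaussianBlur(face, (9, 9), 3)
    high_pass = face - low_pass

    # Illustrative patch boundaries for the eye and mouth regions of interest;
    # the exact crop coordinates are not specified on the slide.
    eye_patch = high_pass[20:45, 5:85]
    mouth_patch = high_pass[60:85, 20:70]
    return eye_patch, mouth_patch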
Eigen-Face Creation [6]
• For each image 'I', the Karhunen-Loeve expansion [4] is applied to find the vectors that best represent the distribution of face images $\{I_1, I_2, \ldots, I_n\}$, where n = 80 images.
• The average face is the 0th Eigen vector, computed as $\bar{I} = \frac{1}{n}\sum_{i=1}^{n} I_i$.
• The difference of each face from the average is computed as $\Phi_i = I_i - \bar{I}$.
• The differences $\{\Phi_i\}_{i=1}^{n}$ are subjected to PCA to find a set of 'n' orthonormal vectors $\{u_i\}_{i=1}^{n}$ which best describe the distribution of images.
Method:
• Let the covariance matrix be $C_{ov} = \frac{1}{n}\sum_{i=1}^{n} \Phi_i \Phi_i^T = AA^T$, where $A = [\Phi_1, \Phi_2, \ldots, \Phi_n]$.
• For computational feasibility, consider the eigenvectors $v_i$ of $A^TA$: since $A^TAv_i = \mu_i v_i$ implies $AA^T(Av_i) = \mu_i (Av_i)$, the vectors $Av_i$ are eigen vectors of $C_{ov}$.
• Construct a matrix of dimension [nxn] as $L = A^TA$, where $L_{l,m} = \Phi_l^T \Phi_m$.
• The 'n' Eigen-vectors of 'L' ($\{v_i\}_{i=1}^{n}$) are then extracted. These Eigen-vectors determine the linear combinations of the 'n' faces that form the Eigen-Faces $\{u_i\}_{i=1}^{n}$, where $u_i = \sum_{j=1}^{n} v_{i,j}\,\Phi_j$.
• Matrix 'L' represents the signature of each face in terms of an 'n'-dimensional vector.
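A minimal NumPy sketch of the computation above (a sketch following the standard eigenface notation, not the author's code):

import numpy as np

def eigen_faces(images):
    # images: array of shape (n, h, w); rows of the flattened matrix are faces.
    n = images.shape[0]
    X = images.reshape(n, -1).astype(np.float64)

    mean_face = X.mean(axis=0)   # average face (0th Eigen vector)
    Phi = X - mean_face          # differences from the average face
    A = Phi.T                    # A = [Phi_1, ..., Phi_n], shape (h*w, n)

    # Surrogate n x n matrix L = A^T A; its eigenvectors v_i yield the
    # eigenvectors A v_i of the full covariance matrix A A^T.
    L = A.T @ A
    eigvals, V = np.linalg.eigh(L)
    V = V[:, np.argsort(eigvals)[::-1]]  # order by decreasing eigenvalue

    U = A @ V                                      # Eigen-Faces: u_i = sum_j v_{j,i} Phi_j
    U /= np.linalg.norm(U, axis=0, keepdims=True)  # normalize each Eigen-Face
    return mean_face, U, L   # rows of L act as the n-dimensional face signatures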
Example of Eigen-Faces
Fig 2: The 0th Eigen vector followed by 15 principal Eigen-Faces for the 1st face of the 1st person in the ORL data set.
Isomap-based Clustering
• For the matrix $L_{[n \times n]}$, Isomap [8] is used for lower-dimensional embedding using multidimensional scaling.
• Matrix 'L' is reduced to an unweighted network (G), where each image 'i' is connected to its 'k' Euclidean neighbors in the high-dimensional space.
• Network G = (Y, E), where $\{Y_i\}_{i=1}^{n}$ represent the signatures of the Eigen-Faces as vertices/nodes, and 'E' is an edge matrix such that $E_{o,p} = 1$ represents a directed link between nodes $Y_o, Y_p$ and $E_{o,p} = 0$ represents no link between nodes $Y_o, Y_p$.
• The two faces (nodes) with the largest Euclidean distance between them are selected as cluster representatives, i.e., if $D_{i,j}$ represents the distance between nodes (i, j), then $\{Z_1, Z_2\} = \arg\max_{i,j} D_{i,j}$, such that $Z_1$ belongs to cluster 1 and $Z_2$ belongs to cluster 2.
• Based on its distance from $Z_1$ and $Z_2$, every other node is assigned to the closest cluster.
Fig 3: Isomap-based clustering using full faces.
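A minimal sketch of this two-cluster step, using scikit-learn's Isomap as a stand-in for the embedding described on the slide (an assumed dependency; parameter values are illustrative):

import numpy as np
from sklearn.manifold import Isomap  # assumed dependency for the embedding

def two_cluster_isomap(L, k=5, n_components=2):
    # Lower-dimensional embedding of the n x n signature matrix via Isomap,
    # which internally builds a k-nearest-neighbor graph and applies MDS.
    Y = Isomap(n_neighbors=k, n_components=n_components).fit_transform(L)

    # Pairwise Euclidean distances D_{i,j} between embedded nodes.
    D = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)

    # Cluster representatives Z1, Z2: the pair of nodes with the largest distance.
    z1, z2 = np.unravel_index(np.argmax(D), D.shape)

    # Assign every node to the closer representative (0 -> Z1's cluster, 1 -> Z2's cluster).
    labels = (D[:, z2] < D[:, z1]).astype(int)
    return Y, labels, (z1, z2)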
Results
Task 1: Eye occlusion detection (classification of faces with glasses)
Comparison of Isomap-based clustering using full-face Eigen-faces vs. patched eye (I_e) Eigen-faces.
Fig 4a: Isomap-based clustering using full faces (Isomap created using k=5). Fig 4b: Isomap-based clustering using patched faces (Isomap created using k=5).
Task 2: Smile detection (classification of smiling faces)
Comparison of Isomap-based clustering using full-face Eigen-faces vs. patched eye (I_e) Eigen-faces.
Fig 5a: Isomap-based clustering using full faces (Isomap created using k=3). Fig 5b: Isomap-based clustering using patched faces (Isomap created using k=7).
Method | Sensitivity | Specificity | Accuracy | Isomap k | Residual | AUC
Task 1: Classification of facial occlusions
Full Face Eigen-Faces | 0.6896 | 0.7450 | 0.725 | 5 | 0.0603 | 0.7031
Patched Eigen-Faces | 0.7586 | 0.6862 | 0.725 | 5 | 0.0275 | 0.7245
Task 2: Classification of smile
Full Face Eigen-Faces | 0.1428 | 0.8667 | 0.55 | 3 | 0.02605 | 0.5111
Patched Eigen-Faces | 0.75 | 0.5556 | 0.6625 | 7 | 0.0132 | 0.6319
Fig 6a: Clustering ROC for Task 1 obtained by varying parameter 'k' over [3-21]. Fig 6b: Clustering ROC for Task 2 obtained by varying parameter 'k' over [3-21].
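A small, hypothetical sketch of how the sensitivity/specificity/accuracy of an unsupervised two-cluster labeling can be scored and swept over 'k' to trace an ROC; it reuses the two_cluster_isomap helper sketched earlier and is not the author's evaluation code:

import numpy as np

def clustering_scores(pred, truth):
    # Sensitivity, specificity and accuracy of a two-cluster labeling against a
    # binary ground truth. Because the clustering is unsupervised, the
    # cluster-to-class mapping that maximizes accuracy is selected.
    pred, truth = np.asarray(pred), np.asarray(truth)
    best = None
    for labels in (pred, 1 - pred):
        tp = np.sum((labels == 1) & (truth == 1))
        tn = np.sum((labels == 0) & (truth == 0))
        fp = np.sum((labels == 1) & (truth == 0))
        fn = np.sum((labels == 0) & (truth == 1))
        sens = tp / (tp + fn) if (tp + fn) else 0.0
        spec = tn / (tn + fp) if (tn + fp) else 0.0
        acc = (tp + tn) / truth.size
        if best is None or acc > best[2]:
            best = (sens, spec, acc)
    return best

# Sweeping the neighborhood size 'k' traces out the ROC points, e.g.:
# roc = [clustering_scores(two_cluster_isomap(L, k)[1], truth)[:2] for k in range(3, 22, 2)]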
Network Analysis
The nodes (faces) with the top-2 highest betweenness centrality (B) and Eigen centrality (EC) are identified for the facial networks.
Task 1:
(Full Face Network and Patched Face Network for Task 1, with the maximum-betweenness and maximum-centrality nodes highlighted; legend: links, nodes, max. betweenness, max. centrality.)
Full Face Network: B1 = 753.16, B2 = 640.95; EC1 = 0.27, EC2 = 0.25.
Patched Face Network: B1 = 1154, B2 = 1052; EC1 = 0.3865, EC2 = 0.3167.
Patched faces have high centrality for occlusion clustering.
Task 2:
(Full Face Network and Patched Face Network for Task 2, with the maximum-betweenness and maximum-centrality nodes highlighted; legend: links, nodes, max. betweenness, max. centrality.)
Full Face Network: B1 = 703, B2 = 664; EC1 = 0.3058, EC2 = 0.2632.
Patched Face Network: B1 = 2629, B2 = 1588; EC1 = 0.296, EC2 = 0.292.
Patched faces have high centrality for smile clustering.
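A minimal sketch of how such centrality values can be computed, using NetworkX (assumed dependency) on a k-nearest-neighbor graph built from the embedded signatures; an undirected graph is used here for simplicity, whereas the slides define directed links:

import numpy as np
import networkx as nx  # assumed dependency for the centrality measures

def top_central_faces(Y, k=5, top=2):
    # Build a k-nearest-neighbor face network from the embedded signatures Y
    # and report the nodes with the highest betweenness and eigenvector centrality.
    n = len(Y)
    D = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)

    G = nx.Graph()
    G.add_nodes_from(range(n))
    for i in range(n):
        for j in np.argsort(D[i])[1:k + 1]:   # k nearest neighbors, skipping self
            G.add_edge(i, int(j))

    betweenness = nx.betweenness_centrality(G, normalized=False)
    eigen = nx.eigenvector_centrality_numpy(G)

    top_b = sorted(betweenness, key=betweenness.get, reverse=True)[:top]
    top_ec = sorted(eigen, key=eigen.get, reverse=True)[:top]
    return top_b, top_ec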