
Facial Expression Recognition, YING SHEN, SSE, TONGJI UNIVERSITY (PowerPoint presentation)

Facial Expression Recognition, YING SHEN, SSE, TONGJI UNIVERSITY. Outline: Introduction; Facial expression recognition; Appearance-based vs. model-based; Active appearance model (AAM)


  1. Pre-requisite: matrix differentiation • When the function $y(x)$ is a vector and the variable $x$ is a vector, the derivative is the matrix of partial derivatives (the Jacobian). Example: for $x = (x_1, x_2, x_3)^T$ and $y(x) = (y_1(x), y_2(x))^T$ with $y_1(x) = x_1^2 + 3x_3$ and $y_2(x) = x_2^2 + 2x_3$,
$$\frac{d y^T}{d x} = \begin{pmatrix} \partial y_1/\partial x_1 & \partial y_2/\partial x_1 \\ \partial y_1/\partial x_2 & \partial y_2/\partial x_2 \\ \partial y_1/\partial x_3 & \partial y_2/\partial x_3 \end{pmatrix} = \begin{pmatrix} 2x_1 & 0 \\ 0 & 2x_2 \\ 3 & 2 \end{pmatrix}$$

  2. Pre-requisite: matrix differentiation • When the function is a scalar and the variable is a matrix, $f(X)$, $X \in \mathbb{R}^{m \times n}$, the derivative is defined as
$$\frac{df(X)}{dX} = \begin{pmatrix} \frac{\partial f}{\partial x_{11}} & \frac{\partial f}{\partial x_{12}} & \cdots & \frac{\partial f}{\partial x_{1n}} \\ \vdots & & & \vdots \\ \frac{\partial f}{\partial x_{m1}} & \frac{\partial f}{\partial x_{m2}} & \cdots & \frac{\partial f}{\partial x_{mn}} \end{pmatrix}$$

  3. Pre-requisite: matrix differentiation • Useful results: (1) for $x, a \in \mathbb{R}^{n \times 1}$, $\dfrac{d(a^T x)}{dx} = \dfrac{d(x^T a)}{dx} = a$. How to prove?

  4. Pre-requisite: matrix differentiation • Useful results:
(2) $\dfrac{d(Ax)}{dx} = A^T$
(3) $\dfrac{d(x^T A)}{dx} = A$
(4) $\dfrac{d(x^T A x)}{dx} = (A + A^T)x$
(5) $\dfrac{d(a^T X b)}{dX} = a b^T$
(6) $\dfrac{d(a^T X^T b)}{dX} = b a^T$
(7) $\dfrac{d(x^T x)}{dx} = 2x$
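The identities above can be sanity-checked numerically. Below is a small numpy sketch (my own illustration, not part of the lecture) that compares results (1), (4) and (7) against central-difference gradients:

```python
import numpy as np

# Numerical check of the matrix-calculus identities via central differences.
rng = np.random.default_rng(0)
n = 4
a = rng.standard_normal(n)
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

def num_grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# (1) d(a^T x)/dx = a
g1 = num_grad(lambda v: a @ v, x)
assert np.allclose(g1, a, atol=1e-6)

# (4) d(x^T A x)/dx = (A + A^T) x
g4 = num_grad(lambda v: v @ A @ v, x)
assert np.allclose(g4, (A + A.T) @ x, atol=1e-5)

# (7) d(x^T x)/dx = 2x
g7 = num_grad(lambda v: v @ v, x)
assert np.allclose(g7, 2 * x, atol=1e-6)
```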

  5. Outline • Introduction • Facial expression recognition • Appearance-based vs. model-based • Active appearance model (AAM) • Pre-requisite • Principal component analysis • Procrustes analysis • ASM • Delaunay triangulation • AAM

  6. Principal Component Analysis (PCA) • PCA converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components • The number of principal components is less than or equal to the number of original variables • The transformation is defined so that the first principal component has the largest possible variance, and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components

  7. Principal Component Analysis (PCA) • Illustration: data points $(x, y)$: (2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0), (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9) • Along which orientation do the data points scatter most? How to find it? De-correlation!

  8. Principal Component Analysis (PCA) • Identify the orientation with the largest variance. Suppose $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{p \times n}$ contains $n$ data points, and each data point is $p$-dimensional. We want to find a unit vector $\omega$, $\|\omega\| = 1$, along which the projected data have the largest variance.

  9. Principal Component Analysis (PCA) • Identify the orientation with the largest variance:
$$\mathrm{var}(\omega^T X) = \frac{1}{n-1} \sum_{i=1}^{n} \left(\omega^T x_i - \omega^T \mu\right)^2 = \omega^T \left( \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^T \right) \omega = \omega^T C \omega$$
where $\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$ and $C = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^T$ is the covariance matrix.

  10. Principal Component Analysis (PCA) • Identify the orientation with the largest variance. Since $\omega$ is a unit vector, $\omega^T \omega = 1$. Based on the Lagrange multiplier method, we need to solve
$$\arg\max_{\omega}\; \omega^T C \omega - \lambda\left(\omega^T \omega - 1\right)$$
Setting the derivative to zero, $\frac{d}{d\omega}\left[\omega^T C \omega - \lambda(\omega^T \omega - 1)\right] = 2C\omega - 2\lambda\omega = 0$, so $C\omega = \lambda\omega$: $\omega$ is an eigenvector of $C$. Thus
$$\max_{\omega} \mathrm{var}(\omega^T X) = \max_{\omega} \omega^T C \omega = \max_{\omega} \lambda\, \omega^T \omega = \max \lambda$$
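The eigenvector result can be illustrated empirically. This numpy sketch (my own, with an assumed anisotropic point cloud) checks that no unit direction attains a larger projected variance than the top eigenvector of $C$:

```python
import numpy as np

# var(w^T X) = w^T C w <= lambda_max, with equality at the top eigenvector.
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 500)) * np.array([[3.0], [0.5]])  # anisotropic cloud
Xc = X - X.mean(axis=1, keepdims=True)
C = Xc @ Xc.T / (X.shape[1] - 1)             # covariance matrix

evals, evecs = np.linalg.eigh(C)             # eigenvalues in ascending order
lam_max, w_star = evals[-1], evecs[:, -1]

# Random unit directions never beat the top eigenvector's variance
thetas = rng.uniform(0, 2 * np.pi, 1000)
ws = np.stack([np.cos(thetas), np.sin(thetas)])
variances = np.einsum('ij,ji->i', ws.T @ C, ws)   # w_i^T C w_i for each direction

assert variances.max() <= lam_max + 1e-9
assert np.isclose(w_star @ C @ w_star, lam_max)
```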

  11. Principal Component Analysis (PCA) • Identify the orientation with the largest variance. Thus $\omega_1$ should be the eigenvector of $C$ corresponding to the largest eigenvalue of $C$. • What is the orientation $\omega_2$, orthogonal to $\omega_1$, along which the data have the second largest variation? Answer: it is the eigenvector associated with the second largest eigenvalue $\lambda_2$ of $C$, and that variance is $\lambda_2$.

  12. Principal Component Analysis (PCA) • Result: the eigenvectors of $C$ form a set of orthogonal basis vectors; they are referred to as the Principal Components of the original data $X$. You can consider the PCs as a set of orthogonal coordinates. Under such a coordinate system, the variables are not correlated.

  13. Principal Component Analysis (PCA) • Express data in PCs. Suppose $\{\omega_1, \omega_2, \ldots, \omega_p\} \subset \mathbb{R}^p$ are the PCs derived from $X$. Then a data point $x_i$ can be linearly represented by $\{\omega_1, \omega_2, \ldots, \omega_p\}$, and the representation coefficients are
$$c_i = \begin{pmatrix} \omega_1^T \\ \omega_2^T \\ \vdots \\ \omega_p^T \end{pmatrix} x_i$$
Actually, $c_i$ gives the coordinates of $x_i$ in the new coordinate system spanned by $\{\omega_1, \omega_2, \ldots, \omega_p\}$.

  14. Principal Component Analysis (PCA) • Illustration. The data points $(x, y)$ as a matrix:
$$X = \begin{pmatrix} 2.5 & 0.5 & 2.2 & 1.9 & 3.1 & 2.3 & 2.0 & 1.0 & 1.5 & 1.1 \\ 2.4 & 0.7 & 2.9 & 2.2 & 3.0 & 2.7 & 1.6 & 1.1 & 1.6 & 0.9 \end{pmatrix}$$
The scatter matrix $\sum_i (x_i - \bar{x})(x_i - \bar{x})^T = \begin{pmatrix} 5.549 & 5.539 \\ 5.539 & 6.449 \end{pmatrix}$ (the covariance matrix scaled by $n-1$; its eigenvectors are the same). Eigenvalues: 11.5562 and 0.4418. Corresponding eigenvectors: $\omega_1 = \begin{pmatrix} 0.6779 \\ 0.7352 \end{pmatrix}$, $\omega_2 = \begin{pmatrix} -0.7352 \\ 0.6779 \end{pmatrix}$
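The numbers on this slide can be reproduced directly in numpy (a check I added, not part of the lecture):

```python
import numpy as np

# Reproduce the slide's example: scatter matrix, eigenvalues, eigenvectors.
X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
              [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]])
Xc = X - X.mean(axis=1, keepdims=True)   # centre the data
S = Xc @ Xc.T                            # scatter matrix = covariance * (n-1)
evals, evecs = np.linalg.eigh(S)         # eigenvalues in ascending order

print(np.round(S, 3))      # [[5.549 5.539] [5.539 6.449]]
print(np.round(evals, 4))  # [ 0.4418 11.5562]
```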

  15. Principal Component Analysis (PCA) • Illustration (figure: the data points plotted with the two principal directions)

  16. Principal Component Analysis (PCA) • Illustration: coordinates of the data points in the new coordinate system:
$$newC = \begin{pmatrix} \omega_1^T \\ \omega_2^T \end{pmatrix} X = \begin{pmatrix} 0.6779 & 0.7352 \\ -0.7352 & 0.6779 \end{pmatrix} X = \begin{pmatrix} 3.459 & 0.854 & 3.623 & 2.905 & 4.307 & 3.544 & 2.532 & 1.487 & 2.193 & 1.407 \\ -0.211 & 0.107 & 0.348 & 0.094 & -0.245 & 0.139 & -0.386 & 0.011 & -0.018 & -0.199 \end{pmatrix}$$

  17. Principal Component Analysis (PCA) • Illustration: coordinates of the data points in the new coordinate system. Draw $newC$ on the plot. In the new coordinate system, the two variables are uncorrelated!
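De-correlation can be verified numerically: projecting the data onto the two PCs drives the off-diagonal covariance to (numerically) zero. A small numpy check of my own:

```python
import numpy as np

# Project the data onto the PCs and verify the new variables are uncorrelated.
X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
              [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]])
W = np.array([[0.6779, 0.7352],      # first PC as a row
              [-0.7352, 0.6779]])    # second PC as a row
newC = W @ X

# Off-diagonal covariance of the projected data is essentially zero
cov_new = np.cov(newC)
assert abs(cov_new[0, 1]) < 1e-3
print(np.round(newC, 3))
```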

  18. Principal Component Analysis (PCA) • Data dimension reduction with PCA. Suppose $X = \{x_i\}_{i=1}^{n}$, $x_i \in \mathbb{R}^p$, and $\{\omega_i\}_{i=1}^{p}$ are the PCs. If all $p$ PCs are used,
$$c_i = \begin{pmatrix} \omega_1^T \\ \omega_2^T \\ \vdots \\ \omega_p^T \end{pmatrix} x_i$$
is still $p$-dimensional. If only $m$ PCs $\{\omega_i\}_{i=1}^{m}$ ($m < p$) are used, $c_i$ will be $m$-dimensional. That is, the dimension of the data is reduced!

  19. Principal Component Analysis (PCA) • Illustration: coordinates of the data points in the new coordinate system, $newC = \begin{pmatrix} 0.6779 & 0.7352 \\ -0.7352 & 0.6779 \end{pmatrix} X$. If only the first PC (corresponding to the largest eigenvalue) is retained:
$$newC = (0.6779 \;\; 0.7352)\, X = (3.459 \;\; 0.854 \;\; 3.623 \;\; 2.905 \;\; 4.307 \;\; 3.544 \;\; 2.532 \;\; 1.487 \;\; 2.193 \;\; 1.407)$$

  20. Principal Component Analysis (PCA) • Illustration (figure: all PCs used vs. only 1 PC used). Dimension reduction!

  21. Principal Component Analysis (PCA) • Illustration: if only the first PC (corresponding to the largest eigenvalue) is retained, $newC = (3.459 \;\; 0.854 \;\; 3.623 \;\; 2.905 \;\; 4.307 \;\; 3.544 \;\; 2.532 \;\; 1.487 \;\; 2.193 \;\; 1.407)$. How to recover $newC$ to the original space? Easy:
$$\hat{X} = \begin{pmatrix} 0.6779 \\ 0.7352 \end{pmatrix} (3.459 \;\; 0.854 \;\; 3.623 \;\; 2.905 \;\; 4.307 \;\; 3.544 \;\; 2.532 \;\; 1.487 \;\; 2.193 \;\; 1.407)$$
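The reduce-then-recover round trip is two matrix products. A minimal numpy sketch of my own, following the slide's numbers:

```python
import numpy as np

# Keep only the first PC, then map the 1-D coordinates back into 2-D.
X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1],
              [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]])
w1 = np.array([0.6779, 0.7352])    # first principal component

newC = w1 @ X                      # reduced 1-D representation (10 values)
X_rec = np.outer(w1, newC)         # back-projection: all points lie on the w1 line

assert X_rec.shape == X.shape
print(np.round(newC, 3))           # 3.459 0.854 3.623 ...
```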

  22. Principal Component Analysis (PCA) • Illustration (figure: original data vs. data recovered when only 1 PC is used)

  23. Outline • Introduction • Facial expression recognition • Appearance-based vs. model-based • Active appearance model (AAM) • Pre-requisite • Principal component analysis • Procrustes analysis • ASM • Delaunay triangulation • AAM

  24. Procrustes analysis • Who is Procrustes?

  25. Procrustes analysis • Problem: • Suppose we have two sets of point correspondence pairs $(m_1, m_2, \ldots, m_N)$ and $(n_1, n_2, \ldots, n_N)$. • We want to find a scale factor $s$, an orthogonal matrix $R$ and a vector $T$ that minimize
$$\sum_{i=1}^{N} \left\| m_i - (s R\, n_i + T) \right\|^2 \qquad (1)$$

  26. Procrustes analysis • In 2D space: (figure)

  27. Procrustes analysis • How to compute $s$, $R$ and $T$? • We assume that there is a similarity transform between the point sets $\{m_i\}_{i=1}^N$ and $\{n_i\}_{i=1}^N$ • Find $s$, $R$ and $T$ to minimize
$$e^2 = \sum_{i=1}^{N} \left\| m_i - (s R\, n_i + T) \right\|^2 \qquad (1)$$
• Let $\bar{m} = \frac{1}{N}\sum_{i=1}^{N} m_i$, $\bar{n} = \frac{1}{N}\sum_{i=1}^{N} n_i$, $m'_i = m_i - \bar{m}$, $n'_i = n_i - \bar{n}$ • Note that $\sum_{i=1}^{N} m'_i = 0$ and $\sum_{i=1}^{N} n'_i = 0$

  28. Procrustes analysis • Then:
$$m_i - (s R\, n_i + T) = m'_i - s R\, n'_i - e_0, \qquad e_0 = T - (\bar{m} - s R\, \bar{n})$$
$e_0$ is independent of $\{m'_i, n'_i\}$, so (1) can be rewritten as
$$e^2 = \sum_{i=1}^{N} \left\| m'_i - s R\, n'_i - e_0 \right\|^2 = \sum_{i=1}^{N} \left\| m'_i - s R\, n'_i \right\|^2 - 2 e_0 \cdot \sum_{i=1}^{N} (m'_i - s R\, n'_i) + N \|e_0\|^2$$
The middle term vanishes because $\sum_i m'_i = \sum_i n'_i = 0$, leaving
$$e^2 = \sum_{i=1}^{N} \left\| m'_i - s R\, n'_i \right\|^2 + N \|e_0\|^2$$
The variables are separated and can be minimized separately: setting $e_0 = 0$ gives $T = \bar{m} - s R\, \bar{n}$, so once $s$ and $R$ are known, $T$ is determined.

  29. Procrustes analysis • The problem then simplifies to: how to minimize $\sum_{i=1}^{N} \| m'_i - s R\, n'_i \|^2$ (consider its geometric meaning here). We revise the error term as a symmetrical one:
$$\frac{1}{s} \sum_{i=1}^{N} \left\| m'_i - s R\, n'_i \right\|^2 = \frac{1}{s} \sum_{i=1}^{N} \|m'_i\|^2 - 2 \sum_{i=1}^{N} m'_i \cdot (R\, n'_i) + s \sum_{i=1}^{N} \|n'_i\|^2$$
The variables are separated. Writing $P^2 = \sum_i \|m'_i\|^2$, $Q^2 = \sum_i \|n'_i\|^2$, $D = \sum_i m'_i \cdot (R\, n'_i)$:
$$\frac{1}{s} P^2 - 2D + s Q^2 = \left( \frac{P}{\sqrt{s}} - \sqrt{s}\, Q \right)^2 + 2(PQ - D)$$

  30. Procrustes analysis • The first term vanishes when $\frac{P}{\sqrt{s}} - \sqrt{s}\, Q = 0$, i.e.
$$s = \frac{P}{Q} = \sqrt{ \frac{\sum_{i=1}^{N} \|m'_i\|^2}{\sum_{i=1}^{N} \|n'_i\|^2} } \quad \text{Determined!}$$
The problem then simplifies to: how to maximize $D = \sum_{i=1}^{N} m'_i \cdot (R\, n'_i)$ (note that $D$ is a real number):
$$D = \sum_{i=1}^{N} m_i'^T R\, n'_i = \mathrm{trace}\left( R \sum_{i=1}^{N} n'_i\, m_i'^T \right) = \mathrm{trace}(R H), \qquad H = \sum_{i=1}^{N} n'_i\, m_i'^T$$
Now we are looking for an orthogonal matrix $R$ that maximizes the trace of $RH$.

  31. Procrustes analysis • Lemma: for any positive semi-definite matrix $C$ and any orthogonal matrix $B$, $\mathrm{trace}(C) \geq \mathrm{trace}(BC)$. Proof: from the positive (semi-)definite property of $C$, write $C = A A^T$ where $A$ is non-singular in the definite case. Let $a_i$ be the $i$th column of $A$. Then
$$\mathrm{trace}(B A A^T) = \mathrm{trace}(A^T B A) = \sum_i a_i^T B\, a_i$$
According to the Schwarz inequality, $a_i^T B\, a_i \leq \sqrt{(a_i^T a_i)(a_i^T B^T B\, a_i)} = a_i^T a_i$, since $B$ is orthogonal. Hence
$$\mathrm{trace}(BC) = \mathrm{trace}(B A A^T) \leq \sum_i a_i^T a_i = \mathrm{trace}(A A^T) = \mathrm{trace}(C)$$

  32. Procrustes analysis • Consider the SVD of $H = \sum_{i=1}^{N} n'_i\, m_i'^T$: $H = U \Lambda V^T$. According to the properties of the SVD, $U$ and $V$ are orthogonal matrices and $\Lambda$ is a diagonal matrix with nonnegative elements. Now let $X = V U^T$; note that $X$ is orthogonal. We have
$$X H = V U^T U \Lambda V^T = V \Lambda V^T$$
which is positive semi-definite. Thus, from the lemma, $\mathrm{trace}(XH) \geq \mathrm{trace}(BXH)$ for any orthogonal matrix $B$; since every orthogonal matrix $\Psi$ can be written as $BX$, $\mathrm{trace}(XH) \geq \mathrm{trace}(\Psi H)$ for any orthogonal matrix $\Psi$. It's time to go back to our objective now: $R$ should be $X = V U^T$.

  33. Procrustes analysis • Now, $s$, $R$ and $T$ are all determined:
$$H = \sum_{i=1}^{N} n'_i\, m_i'^T = U \Lambda V^T, \qquad R = V U^T, \qquad s = \sqrt{\frac{\sum_{i=1}^{N} \|m'_i\|^2}{\sum_{i=1}^{N} \|n'_i\|^2}}, \qquad T = \bar{m} - s R\, \bar{n}$$

  34. Procrustes analysis • Problem: given two configuration matrices $X$ and $Y$ with the same dimensions, find the rotation, scale and translation that bring $Y$ as close as possible to $X$. 1) Find the centroids (mean values of the columns) of $X$ and $Y$; call them $\bar{x}$ and $\bar{y}$. 2) Subtract the corresponding mean from each column; call the new matrices $X_{new}$ and $Y_{new}$. 3) Compute $Y_{new}^T X_{new}$ and its SVD: $Y_{new}^T X_{new} = U D V^T$. 4) Find the rotation matrix $R = V U^T$ and the scale factor $s = \sqrt{\sum_i \|X_{new,i}\|^2 \,/\, \sum_i \|Y_{new,i}\|^2}$. 5) Find the translation vector $T = \bar{x} - s R\, \bar{y}$.
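The five steps above fit in a few lines of numpy. A minimal sketch (function name and the row-points convention are my own) with a round-trip check against a known similarity transform:

```python
import numpy as np

# Similarity Procrustes: find s, R, T with x_i ~ s R y_i + T (points as rows).
def procrustes(X, Y):
    """Fit s, R, T minimising ||X - (s Y R^T + T)||_F for row-point arrays."""
    xbar, ybar = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - xbar, Y - ybar
    U, D, Vt = np.linalg.svd(Yc.T @ Xc)     # H = Y_new^T X_new = U D V^T
    R = Vt.T @ U.T                          # R = V U^T (orthogonal; may be a
                                            # reflection for degenerate data)
    s = np.sqrt((Xc ** 2).sum() / (Yc ** 2).sum())
    T = xbar - s * R @ ybar
    return s, R, T

# Round-trip check with a known scale, rotation and translation
rng = np.random.default_rng(2)
Y = rng.standard_normal((20, 2))
th = 0.7
R_true = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
X = 1.7 * Y @ R_true.T + np.array([2.0, -1.0])

s, R, T = procrustes(X, Y)
assert np.isclose(s, 1.7)
assert np.allclose(R, R_true)
assert np.allclose(T, [2.0, -1.0])
```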

  35. Some modifications • Sometimes it is necessary to weight some variables down and others up. In these cases Procrustes analysis can be performed using weights; we want to minimize the function
$$M = \mathrm{tr}\left( W (X - AY)(X - AY)^T \right)$$
• This modification can be taken into account by computing the SVD of $Y^T W X$ instead of $Y^T X$

  36. Outline • Introduction • Facial expression recognition • Appearance-based vs. model-based • Active appearance model (AAM) • Pre-requisite • Principal component analysis • Procrustes analysis • ASM • Delaunay triangulation • AAM

  37. Active shape model (ASM) • Collect training samples • 30 neutral faces • 30 smiling faces

  38. ASM • Add landmarks • 26 points per face • $X = [x_1\; x_2\; \ldots\; x_N]$; $N = 60$.

  39. ASM • Align the training faces together using Procrustes analysis. Transform $x_i$ to $x_j$: rotation ($R_i$), scale ($s_i$), translation ($T_i$):
$$\min Z_i^T Z_i, \qquad Z_i = x_j - \left( s_i R_i\, x_i + T_i \right)$$
Consider a weight matrix $W$: let $D_{kl}$ be the distance between points $k$ and $l$ in one image and $V_{D_{kl}}$ the variance of $D_{kl}$ across the different images. Then
$$W = \mathrm{diag}(w_1, \ldots, w_m), \qquad w_k = \left( \sum_{l} V_{D_{kl}} \right)^{-1}$$
where $m$ is the dimension of $x_i$.

  40. ASM • We want to minimize $E = Z_i^T W Z_i$. 1) Find the centroids (mean values of the columns) of $x_i$ and $x_j$. 2) Subtract the corresponding mean from each column. 3) Compute the SVD $x_{i\_new} W x_{j\_new}^T = U D V^T$. 4) Find the rotation matrix from $U$ and $V$, and the scale factor and translation vector, as in the Procrustes procedure of slide 34.

  41. ASM • Align the training faces using Procrustes analysis • Steps: 1. Align the other faces with the first face. 2. Compute the mean face $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$. 3. Align the training faces with the mean face. 4. Repeat steps 2 and 3 until the discrepancies between the training faces and the mean face no longer change.

  42. ASM • Faces after alignment (figure)

  43. ASM • Construct models of faces • Compute the mean face $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$ • Calculate the covariance matrix
$$S = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^T$$
• Find its eigenvalues $(\lambda_1, \lambda_2, \ldots, \lambda_m)$ and eigenvectors $P = (p_1, p_2, \ldots, p_m)$ • Choose the first $t$ largest eigenvalues such that $\sum_{i=1}^{t} \lambda_i \geq f_v V_T$, where $V_T = \sum_i \lambda_i$ is the total variance; usually $f_v = 0.98$. Dimension reduction: $26 \times 2 = 52 \to 27$.
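The model-building step above reduces to an eigendecomposition plus a cumulative-variance threshold. A numpy sketch with synthetic shape vectors standing in for the 60 aligned faces (the synthetic data and variable names are my own assumptions):

```python
import numpy as np

# PCA on shape vectors; pick the smallest t with sum(lambda_1..t) >= f_v * V_T.
rng = np.random.default_rng(3)
N, dim = 60, 52                        # 60 faces, 26 landmarks x 2 coordinates
basis = rng.standard_normal((dim, 5))  # 5 underlying modes of variation (toy)
shapes = basis @ rng.standard_normal((5, N)) + 0.01 * rng.standard_normal((dim, N))

xbar = shapes.mean(axis=1, keepdims=True)
S = (shapes - xbar) @ (shapes - xbar).T / (N - 1)   # covariance matrix
evals, P = np.linalg.eigh(S)
evals, P = evals[::-1], P[:, ::-1]     # reorder to descending eigenvalues

f_v = 0.98
cum_frac = np.cumsum(evals) / evals.sum()
t = int(np.searchsorted(cum_frac, f_v) + 1)   # smallest t reaching the threshold
print(t)   # few modes suffice: the synthetic shapes have ~5 real modes
```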

  44. ASM • We can approximate a training face $x$ as $x \approx \bar{x} + P b$, with $b = P^T (x - \bar{x})$ • $b_i$ is called the $i$th mode of the model • Constraint: $|b_i| \leq 3\sqrt{\lambda_i}$
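The approximation and the plausibility constraint can be sketched in a few lines (toy basis and numbers are my own, chosen so the clamping is visible):

```python
import numpy as np

# Project a shape into the model and clamp each b_i to +/- 3 sqrt(lambda_i).
def fit_and_clamp(x, xbar, P, evals):
    b = P.T @ (x - xbar)              # b = P^T (x - xbar)
    lim = 3.0 * np.sqrt(evals)
    b = np.clip(b, -lim, lim)         # enforce |b_i| <= 3 sqrt(lambda_i)
    return xbar + P @ b, b            # x ~ xbar + P b

xbar = np.zeros(4)
P = np.eye(4)[:, :2]                  # two modes, identity basis (toy example)
evals = np.array([4.0, 1.0])          # limits: 3*sqrt(4)=6 and 3*sqrt(1)=3

x = np.array([10.0, -2.0, 0.3, 0.0])
x_hat, b = fit_and_clamp(x, xbar, P, evals)
assert np.allclose(b, [6.0, -2.0])    # first mode clamped at 6
assert np.allclose(x_hat, [6.0, -2.0, 0.0, 0.0])
```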

  45. ASM • Effect of varying the first three shape parameters in turn between $\pm 3$ s.d. from the mean value (figure: $-3$ s.d., mean, $+3$ s.d. for each mode)

  46. ASM • Active Shape Model algorithm: 1. Examine a region of the image around each point $x_i$ to find the best nearby match $x_i'$. 2. Update the parameters $(T, s, R, b)$ to best fit the newly found points $x'$. 3. Apply constraints to the parameters $b$ to ensure plausible shapes (e.g. limit so $|b_i| < 3\sqrt{\lambda_i}$). 4. Repeat until convergence. Q1: How to find corresponding points in a new image?

  47. ASM • Find the initial corresponding points • Detect the face using the Viola-Jones face detector • Estimate the positions of the eye centers, nose center, and mouth center • Align the corresponding positions on the mean face to the estimated positions of the eye centers, nose center, and mouth center on the new face ($s_{init}$, $R_{init}$, $T_{init}$): $x_{init} = s_{init} R_{init}\, \bar{x} + T_{init}$

  48. ASM • Construct local features for each point in the training samples • For a given point, we sample along a profile $k$ pixels on either side of the model point in the $i$th training image ($2k+1$ pixels in total) • Instead of sampling absolute grey-level values, we sample derivatives and put them in a vector $g_i$ • Normalize the sample: $g_i \leftarrow g_i / \sum_j |g_{ij}|$

  49. ASM • For each training image, we can get a set of normalized samples $\{g_1, g_2, \ldots, g_N\}$ for point $i$ • We assume that these $g_i$ are distributed as a multivariate Gaussian, and estimate their mean $\bar{g}_i$ and covariance $S_{g_i}$ • This gives a statistical model for the grey-level profile about point $i$ • Given a new sample $g_s$, the distance of $g_s$ to $\bar{g}_i$ can be computed using the Mahalanobis distance:
$$d(g_s, \bar{g}_i) = (g_s - \bar{g}_i)^T S_{g_i}^{-1} (g_s - \bar{g}_i)$$
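The profile model above is a mean, a covariance, and a quadratic form. A self-contained numpy sketch (synthetic profiles stand in for the training images; the regularisation term is my own addition to keep the covariance invertible):

```python
import numpy as np

# Grey-level profile model for one landmark: normalised derivative profiles,
# their mean and covariance, and the Mahalanobis distance of a new sample.
rng = np.random.default_rng(4)
k = 5
raw = np.cumsum(rng.standard_normal((30, 2 * k + 2)), axis=1)  # 30 toy profiles

g = np.diff(raw, axis=1)                       # sample derivatives, not grey levels
g = g / np.abs(g).sum(axis=1, keepdims=True)   # normalise: g <- g / sum_j |g_ij|

g_mean = g.mean(axis=0)
S_g = np.cov(g, rowvar=False) + 1e-8 * np.eye(g.shape[1])  # regularised covariance
S_inv = np.linalg.inv(S_g)

def mahalanobis(gs):
    d = gs - g_mean
    return float(d @ S_inv @ d)

assert mahalanobis(g_mean) == 0.0              # the mean has distance zero
assert mahalanobis(g[0]) >= 0.0                # distances are non-negative
```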

  50. ASM • During search we sample a profile of $m$ pixels on either side of each initial point ($m > k$, $2m+1$ pixels in total) on the new face, and choose the displacement $dx_i$ whose sub-profile $g_s$ minimizes
$$d(g_s, \bar{g}_i) = (g_s - \bar{g}_i)^T S_{g_i}^{-1} (g_s - \bar{g}_i)$$

  51. ASM • Some constraints on $dx_i$: • $|dx_i| = 0$ if $|d_{best}| \leq \delta$ • $|dx_i| = 0.5\, d_{best}$ if $\delta < |d_{best}| \leq d_{max}$ • $|dx_i| = 0.5\, d_{max}$ if $|d_{best}| > d_{max}$

  52. ASM • Apply one iteration of the ASM algorithm: 1. Examine a region of the image around each point $x_i$ to find the best nearby match $x_i' = x_i + dx_i$. 2. Update the parameters $(T, s, R, b)$ to best fit the newly found points $x'$. 3. Apply constraints to the parameters $b$ to ensure plausible shapes (e.g. limit so $|b_i| < 3\sqrt{\lambda_i}$). 4. Repeat until convergence. Q2: How to find $T$, $s$, $R$, and $b$ for $x'$?

  53. ASM • Suppose we want to match a model $x$ to a new set of image points $y$ • We wish to find $(T, s, R, b)$ that minimize
$$\min \left| y - (s R\, x + T) \right|^2 = \min \left| y - \left( s R (\bar{x} + P b) + T \right) \right|^2$$

  54. ASM • A simple iterative approach to achieving the minimum: 1. Initialize the shape parameters $b$ to zero (the mean shape). 2. Generate the model point positions using $x = \bar{x} + P b$. 3. Find the pose parameters $(s, R, T)$ which best align the model points $x$ to the currently found points $y$: $\min |y - (s R\, x + T)|^2$. 4. Project $y$ into the model co-ordinate frame by inverting the transformation: $y' = R^{-1}(y - T)/s$. 5. Project $y'$ into the tangent plane to $\bar{x}$ by scaling: $y'' = y' / (y' \cdot \bar{x})$. 6. Update the model parameters to match $y''$: $b = P^T (y'' - \bar{x})$. 7. If not converged, return to step 2.

  55. ASM • Apply one iteration of the ASM algorithm: 1. Examine a region of the image around each point $x_i$ to find the best nearby match $x_i' = x_i + dx_i$. 2. Update the parameters $(T, s, R, b)$ to best fit the newly found points $x'$, using the algorithm of slide 54. 3. Apply constraints to the parameters $b$ to ensure plausible shapes (e.g. limit so $|b_i| < 3\sqrt{\lambda_i}$). 4. Repeat until convergence.

  56. ASM • So now we have a vector $b$ for the new face • Classify the facial expression using $b$

  57. Classification • Training: • Compute $b$ for each training face • Train a classifier (e.g. SVM, neural network) using $\{b_1, b_2, \ldots, b_N\}$ and the corresponding labels • Test: • Use the trained classifier to classify the mode vector $b_{new}$ of a new face
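To make the classification stage concrete without pulling in an SVM library, here is a toy stand-in of my own: a nearest-class-mean classifier on synthetic mode vectors $b$ (the slides suggest an SVM or a neural network for the real system):

```python
import numpy as np

# Nearest-class-mean classification of mode vectors b (toy stand-in).
rng = np.random.default_rng(5)
b_neutral = rng.standard_normal((30, 8)) + np.array([1.5] + [0.0] * 7)
b_smile   = rng.standard_normal((30, 8)) - np.array([1.5] + [0.0] * 7)

centroids = np.stack([b_neutral.mean(axis=0), b_smile.mean(axis=0)])
labels = ["neutral", "smile"]

def classify(b_new):
    """Return the label of the closest class mean."""
    d = np.linalg.norm(centroids - b_new, axis=1)
    return labels[int(np.argmin(d))]

assert classify(np.array([2.0, 0, 0, 0, 0, 0, 0, 0])) == "neutral"
assert classify(np.array([-2.0, 0, 0, 0, 0, 0, 0, 0])) == "smile"
```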

  58. Outline • Introduction • Facial expression recognition • Appearance-based vs. model-based • Active appearance model (AAM) • Pre-requisite • Principal component analysis • Procrustes analysis • ASM • Delaunay triangulation • AAM

  59. Delaunay triangulation • Terrains by interpolation • To build a model of the terrain surface, we can start with a number of sample points where we know the height.

  60. Delaunay triangulation • How do we interpolate the height at other points? • The height $f(p)$ is defined at each point $p$ in $P$ • How can we most naturally approximate the height of points not in $P$? Letting $f(p)$ = height of the nearest sample point for points not in $P$ does not look natural.

  61. Delaunay triangulation • Better option: triangulation • Determine a triangulation of $P$ in $\mathbb{R}^2$, then raise the points to the desired height • Triangulation: a planar subdivision whose bounded faces are triangles with vertices from $P$

  62. Delaunay triangulation • Formal definition • Let P = { p 1 ,…, p n } be a point set. • Maximal planar subdivision : a subdivision S such that no edge connecting two vertices can be added to S without destroying its planarity • A triangulation of P is a maximal planar subdivision with vertex set P . Facial expression recognition Page 87

  63. Delaunay triangulation • A triangulation is made of triangles • The outer polygon must be the convex hull • Internal faces must be triangles, otherwise they could be triangulated further

  64. Delaunay triangulation • For $P$ consisting of $n$ points, $k$ of which lie on the convex hull of $P$, every triangulation contains • $2n - 2 - k$ triangles • $3n - 3 - k$ edges
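These counts can be checked with scipy's Delaunay triangulation on a random point set in general position (an illustration of my own, assuming scipy is available):

```python
import numpy as np
from scipy.spatial import Delaunay, ConvexHull

# Verify: 2n - 2 - k triangles and 3n - 3 - k edges, with k hull points.
rng = np.random.default_rng(6)
P = rng.random((40, 2))          # random points: general position w.h.p.
n = len(P)

tri = Delaunay(P)
k = len(ConvexHull(P).vertices)  # number of points on the convex hull

# Collect the undirected edges of all triangles
edges = {tuple(sorted(e))
         for s in tri.simplices
         for e in [(s[0], s[1]), (s[1], s[2]), (s[0], s[2])]}

assert len(tri.simplices) == 2 * n - 2 - k
assert len(edges) == 3 * n - 3 - k
```

The formulas hold for any triangulation of $P$, not only the Delaunay one; they follow from Euler's formula for planar graphs.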

  65. Delaunay triangulation • But which triangulation? Facial expression recognition Page 90

  66. Delaunay triangulation • Some triangulations are "better" than others • Avoid skinny triangles, i.e. maximize the minimum angle of the triangulation

  67. Delaunay triangulation • Let $T$ be a triangulation of $P$ with $m$ triangles. Its angle vector is $A(T) = (\alpha_1, \ldots, \alpha_{3m})$, where $\alpha_1, \ldots, \alpha_{3m}$ are the angles of $T$ sorted by increasing value. • Let $T'$ be another triangulation of $P$. $A(T)$ is larger than $A(T')$ iff there exists an $i$ such that $\alpha_j = \alpha'_j$ for all $j < i$ and $\alpha_i > \alpha'_i$ • $T$ is angle optimal if $A(T) \geq A(T')$ for all triangulations $T'$ of $P$

  68. Delaunay triangulation • If two triangles form a convex quadrilateral, we can obtain an alternative triangulation by performing an edge flip on their shared edge. • The edge $e = p_i p_j$ is illegal if $\min_{1 \leq l \leq 6} \alpha_l < \min_{1 \leq l \leq 6} \alpha'_l$, where $\alpha_l$ and $\alpha'_l$ are the six angles of the two triangles before and after the flip • Flipping an illegal edge increases the angle vector

  69. Delaunay triangulation • If a triangulation $T$ contains an illegal edge $e$, we can make $A(T)$ larger by flipping $e$. • In this case, $T$ is an illegal triangulation.

  70. Delaunay triangulation • We can use Thales's theorem to test if an edge is legal without calculating angles. Theorem: Let $C$ be a circle, $\ell$ a line intersecting $C$ in points $a$ and $b$, and $p, q, r, s$ points lying on the same side of $\ell$. Suppose that $p, q$ lie on $C$, $r$ lies inside $C$, and $s$ lies outside $C$. Then $\angle arb > \angle apb = \angle aqb > \angle asb$.

  71. Delaunay triangulation • If $p_i, p_j, p_k, p_l$ form a convex quadrilateral and do not lie on a common circle, exactly one of $p_i p_j$ and $p_k p_l$ is an illegal edge. Lemma: the edge $p_i p_j$ is illegal iff $p_l$ lies in the interior of the circle $C$ through $p_i$, $p_j$ and $p_k$.
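The lemma reduces the legality test to an in-circle predicate, which has a standard determinant form (this sketch is my own; with $a, b, c$ in counter-clockwise order, $d$ lies inside their circumcircle iff the determinant is positive):

```python
import numpy as np

# In-circle predicate: is d strictly inside the circumcircle of CCW a, b, c?
def in_circle(a, b, c, d):
    m = np.array([[a[0]-d[0], a[1]-d[1], (a[0]-d[0])**2 + (a[1]-d[1])**2],
                  [b[0]-d[0], b[1]-d[1], (b[0]-d[0])**2 + (b[1]-d[1])**2],
                  [c[0]-d[0], c[1]-d[1], (c[0]-d[0])**2 + (c[1]-d[1])**2]])
    return float(np.linalg.det(m)) > 0

# Circumcircle of (0,0), (1,0), (0,1) has centre (0.5, 0.5)
a, b, c = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)   # counter-clockwise order
assert in_circle(a, b, c, (0.5, 0.5))          # the centre is inside
assert not in_circle(a, b, c, (2.0, 2.0))      # a far point is outside
```

Note that floating-point determinants can misclassify nearly-cocircular points; robust implementations use exact arithmetic for this predicate.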

  72. Delaunay triangulation • A legal triangulation is a triangulation that does not contain any illegal edge. • Computing legal triangulations: 1. Compute a triangulation of the input points $P$. 2. Flip illegal edges of this triangulation until all edges are legal. • The algorithm terminates because there is a finite number of triangulations. • Too slow to be interesting…

  73. Delaunay triangulation • Before we can understand an interesting solution to the terrain problem, we need to understand Delaunay graphs. • The Delaunay graph of a set of points $P$ is the dual graph of the Voronoi diagram of $P$

  74. Delaunay triangulation • Voronoi diagram and Delaunay graph • Let $P$ be a set of $n$ points in the plane • The Voronoi diagram $\mathrm{Vor}(P)$ is the subdivision of the plane into Voronoi cells $V(p)$ for all $p \in P$ • Let $G$ be the dual graph of $\mathrm{Vor}(P)$ • The Delaunay graph $DG(P)$ is the straight-line embedding of $G$

  75. Delaunay triangulation • Voronoi diagram and Delaunay graph • Calculate $\mathrm{Vor}(P)$ • Place one vertex in each cell of $\mathrm{Vor}(P)$
