Factor Analysis

Factor Analysis. Leibny Paola García Perera, Carnegie Mellon University; Tecnológico de Monterrey, Campus Monterrey, Mexico; Universidad de Zaragoza, Spain. With Bhiksha Raj, Juan Arturo Nolazco Flores, Eduardo Lleida. PowerPoint PPT Presentation.


  1. Factor Analysis
     Leibny Paola García Perera
     Carnegie Mellon University
     Tecnológico de Monterrey, Campus Monterrey, Mexico
     Universidad de Zaragoza, Spain
     Bhiksha Raj, Juan Arturo Nolazco Flores, Eduardo Lleida

  2. Agenda
     - Introduction
       - Motivation: dimension reduction
       - Modeling: covariance matrix
     - Factor Analysis (FA)
       - Geometrical explanation
       - Formulation (the equations)
       - EM algorithm
       - Comparison with PCA and PPCA
       - Example with numbers
     - Applications
       - Speaker Verification: Joint Factor Analysis (JFA)
       - Some results
     - References

  3. Introduction
     Problem: lots of data, each observation a P-dimensional feature vector.
     Example:

       Y = [ a_11  a_12  a_13  ...  a_1P ]
           [ a_21  a_22  a_23  ...  a_2P ]
           [ a_31  a_32  a_33  ...  a_3P ]
           [  ...                        ]
           [ a_N1  a_N2  a_N3  ...  a_NP ]      (N feature vectors, P >> 1)

     Can we reduce the number of dimensions, to cut computing time and simplify the process?
     YES!

  4. Introduction: Covariance matrix
     What can give us information about the data? (Just for this special case:)
     The covariance matrix. It lets us
     - get rid of unimportant information;
     - think of continuous factors that control the data.

  5. Factor Analysis (FA)
     What is Factor Analysis?
     - Analysis of the covariance in the observed variables (y),
     - in terms of a few (latent) common factors,
     - plus a specific error.
     [Path diagram: latent factors x1, x2, x3 connect through loadings λ1, λ2, λ3 to observed variables y1, y2, y3, y4, each observed variable with its own specific error ε1, ε2, ε3, ε4.]

  6. Factor Analysis (FA): Geometrical Representation
     [Figure: the observation y in the space (x1, x2, x3), built from the mean µ plus the loading directions λ1 and λ2 scaled by the factors, plus the error ε.]

  7. Factor Analysis (FA): Formulation (the equations)
     Form:          y − µ = Λx + ε
     Assumptions:   E(x) = E(ε) = 0,   E(xxᵀ) = I

       y → P × 1   data vector
       µ → P × 1   mean vector
       Λ → P × R   loading matrix
       x → R × 1   factor vector
       ε → P × 1   error vector

       E(εεᵀ) = Ψ = diag(ψ_11, …, ψ_PP)
       Cov(y, x) = Λ
       Σ = E[(y − µ)(y − µ)ᵀ] = ΛΛᵀ + Ψ      (full rank!)

  8. Factor Analysis (FA): Formulation (the equations)
     Now that we have checked the matrix dimensions, the model:

       p(x)        = N(x | 0, I)
       p(y | x, θ) = N(y | µ + Λx, Ψ)

     Quick notes: p(x, y), p(y), and p(x | y) are all Gaussians!

  9. Factor Analysis (FA): Formulation (the equations)
     Now we can compute the marginal:

       p(y | θ) = ∫ p(x) p(y | x, θ) dx = N(y | µ, ΛΛᵀ + Ψ)

     This marginal is… a Gaussian! Compute its expected value and covariance:

       E(y) = E(µ + Λx + ε) = E(µ) + Λ E(x) + E(ε) = µ

       Cov(y) = E[(y − µ)(y − µ)ᵀ]
              = E[(µ + Λx + ε − µ)(µ + Λx + ε − µ)ᵀ]
              = E[(Λx + ε)(Λx + ε)ᵀ]
              = Λ E(xxᵀ) Λᵀ + E(εεᵀ) = ΛΛᵀ + Ψ
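The mean and covariance above can be checked by sampling the generative model directly. A minimal sketch in Python with NumPy, using toy sizes and variable names that are assumptions of this example, not from the slides:

```python
import numpy as np

# Toy sizes (assumed for illustration): P observed dims, R factors, N samples.
rng = np.random.default_rng(0)
P, R, N = 4, 2, 20000

Lam = rng.normal(size=(P, R))              # loading matrix Lambda
psi = rng.uniform(0.2, 0.6, size=P)        # diagonal of Psi
mu = rng.normal(size=P)                    # mean vector

x = rng.normal(size=(N, R))                # factors: x ~ N(0, I)
eps = rng.normal(size=(N, P)) * np.sqrt(psi)   # errors: eps ~ N(0, Psi)
Y = mu + x @ Lam.T + eps                   # observations: y = mu + Lambda x + eps

emp_mean = Y.mean(axis=0)                  # should approach mu
emp_cov = np.cov(Y, rowvar=False)          # should approach Lam Lam^T + Psi
model_cov = Lam @ Lam.T + np.diag(psi)
```

With N large, the empirical mean and covariance of Y converge to µ and ΛΛᵀ + Ψ, matching the derivation on the slide.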

  10. Factor Analysis (FA): Formulation (the equations)
      So factor analysis is a constrained-covariance Gaussian model!

        p(y | θ) = N(y | µ, ΛΛᵀ + Ψ)

      So what is the covariance?

        Cov(y) = Λ Λᵀ + diag(ψ_11, …, ψ_PP)

  11. Factor Analysis (FA): Formulation (the equations)
      How can we compute the likelihood function?

        ℓ(θ, D) = −(N/2) log |ΛΛᵀ + Ψ| − (1/2) Σ_n (y_n − µ)ᵀ (ΛΛᵀ + Ψ)⁻¹ (y_n − µ)

        ℓ(θ, D) = −(N/2) log |Σ| − (1/2) tr( Σ⁻¹ Σ_n (y_n − µ)(y_n − µ)ᵀ )

        ℓ(θ, D) = −(N/2) log |Σ| − (N/2) tr( Σ⁻¹ S )

      S is the sample data covariance matrix.
      Conclusion: a constrained model kept close to the sample covariance!
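The sum-over-points form and the trace form of this log-likelihood are algebraically identical. A quick numerical check, sketched with an arbitrary positive-definite covariance (all names here are assumptions of the example):

```python
import numpy as np

rng = np.random.default_rng(1)
P, N = 3, 400
mu = rng.normal(size=P)
A = rng.normal(size=(P, P))
Sigma = A @ A.T + np.eye(P)                    # some positive-definite model covariance
Y = rng.multivariate_normal(mu, Sigma, size=N)

D = Y - mu
S = D.T @ D / N                                # sample covariance about mu
Sigma_inv = np.linalg.inv(Sigma)

# Sum-over-points form (additive constants dropped):
ll_sum = -N / 2 * np.linalg.slogdet(Sigma)[1] \
         - 0.5 * np.einsum('ni,ij,nj->', D, Sigma_inv, D)
# Trace form: -(N/2) log|Sigma| - (N/2) tr(Sigma^{-1} S)
ll_tr = -N / 2 * np.linalg.slogdet(Sigma)[1] \
        - N / 2 * np.trace(Sigma_inv @ S)
```

The trace form only needs the sample covariance S, which is why S is a sufficient statistic for fitting Σ.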

  12. Factor Analysis (FA): Formulation (the equations)
      So we need sufficient statistics:

        mean:        Σ_n y_n
        covariance:  Σ_n (y_n − µ)(y_n − µ)ᵀ

  13. Factor Analysis (FA): Expectation Maximization
      How to estimate µ?
      - Just compute the mean of the data.
      And the rest of the parameters, Λ and Ψ?
      - Expectation Maximization.

  14. Factor Analysis (FA): Expectation Maximization
      Advantages:
      - Focuses on maximizing the likelihood.
      Disadvantages:
      - Need to know the distribution.
      - No analytical solution.

  15. Factor Analysis (FA): Expectation Maximization
      Remember the EM algorithm?

      E-step:   q_n^{t+1} = p(x_n | y_n, θ^t)

      M-step:   θ^{t+1} = argmax_θ Σ_n ∫ q_n^{t+1}(x_n) log p(y_n, x_n | θ) dx_n

  16. Factor Analysis (FA): Expectation Maximization
      What do we need?

      E-step: the conditional probability!

        q_n^{t+1} = p(x_n | y_n, θ^t) = N(x_n | m_n, Σ_n)

      M-step: the log-likelihood of the complete data, for:

        Λ^{t+1} = argmax_Λ Σ_n E_{q_n^{t+1}}[ ℓ(x_n, y_n) ]
        Ψ^{t+1} = argmax_Ψ Σ_n E_{q_n^{t+1}}[ ℓ(x_n, y_n) ]

  17. Factor Analysis (FA): Expectation Maximization
      What else is needed? p(x | y).
      Let's start with the joint:

        p( [x; y] ) = N( [x; y] | [0; µ],  [ I    Λᵀ      ]
                                           [ Λ    ΛΛᵀ + Ψ ] )

      Remember that

        Cov(x, y) = E[(x − 0)(y − µ)ᵀ] = E[x (Λx + ε)ᵀ] = E(xxᵀ) Λᵀ = Λᵀ
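The block covariance above implies that log p(x, y) = log p(x) + log p(y | x) exactly. A small numerical verification sketch; the helper `gauss_logpdf` and all sizes are assumptions of this example, not from the slides:

```python
import numpy as np

def gauss_logpdf(v, mean, cov):
    """Log-density of N(mean, cov) evaluated at v."""
    d = v - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (len(v) * np.log(2 * np.pi) + logdet
                   + d @ np.linalg.solve(cov, d))

rng = np.random.default_rng(4)
P, R = 4, 2
Lam = rng.normal(size=(P, R))
Psi = np.diag(rng.uniform(0.5, 1.5, size=P))
x = rng.normal(size=R)
y = Lam @ x + np.sqrt(np.diag(Psi)) * rng.normal(size=P)  # mean taken as 0

# Joint Gaussian with block covariance [[I, Lam^T], [Lam, Lam Lam^T + Psi]]
joint_cov = np.block([[np.eye(R), Lam.T],
                      [Lam, Lam @ Lam.T + Psi]])
lhs = gauss_logpdf(np.concatenate([x, y]), np.zeros(R + P), joint_cov)
rhs = gauss_logpdf(x, np.zeros(R), np.eye(R)) + gauss_logpdf(y, Lam @ x, Psi)
```

Agreement of `lhs` and `rhs` confirms the off-diagonal block Cov(x, y) = Λᵀ.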

  18. Factor Analysis (FA): Expectation Maximization
      Now, remembering the Gaussian conditioning formulas:

        p(x | y) = N(x | m, V)
        m = Λᵀ (ΛΛᵀ + Ψ)⁻¹ (y − µ)
        V = I − Λᵀ (ΛΛᵀ + Ψ)⁻¹ Λ

      Remember the matrix inversion lemma?

        Σ⁻¹ = (ΛΛᵀ + Ψ)⁻¹ = Ψ⁻¹ − Ψ⁻¹ Λ (I + Λᵀ Ψ⁻¹ Λ)⁻¹ Λᵀ Ψ⁻¹

      Thanks to the lemma, we only invert the diagonal Ψ and an R × R matrix instead of the full P × P covariance, which is much more efficient when R << P.
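The inversion-lemma route can be verified numerically. A minimal sketch (toy sizes assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
P, R = 6, 2
Lam = rng.normal(size=(P, R))
psi = rng.uniform(0.5, 2.0, size=P)
Psi_inv = np.diag(1.0 / psi)

# Direct route: invert the full P x P covariance.
direct = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
# Lemma route: only an R x R inverse (plus the trivial diagonal inverse).
core = np.linalg.inv(np.eye(R) + Lam.T @ Psi_inv @ Lam)
woodbury = Psi_inv - Psi_inv @ Lam @ core @ Lam.T @ Psi_inv
```

The two matrices agree to machine precision, while the lemma route never forms a dense P × P inverse.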

  19. Factor Analysis (FA): Expectation Maximization
      We finally obtain:

        p(x | y) = N(x | m, V)
        V = (I + Λᵀ Ψ⁻¹ Λ)⁻¹
        m = V Λᵀ Ψ⁻¹ (y − µ)

  20. Factor Analysis (FA): Expectation Maximization
      Some nice observations:

        p(x | y) = N(x | m, V)
        V = (I + Λᵀ Ψ⁻¹ Λ)⁻¹
        m = V Λᵀ Ψ⁻¹ (y − µ)

      The posterior mean is just a linear operation on y!
      And the posterior covariance V does not depend on the observed data!
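The efficient posterior formulas agree with direct Gaussian conditioning on the joint. A numerical sketch comparing the two (sizes and names assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
P, R = 5, 2
Lam = rng.normal(size=(P, R))
psi = rng.uniform(0.5, 2.0, size=P)
mu = rng.normal(size=P)
y = rng.normal(size=P)

Sigma = Lam @ Lam.T + np.diag(psi)
# Direct Gaussian conditioning (needs a P x P solve):
m_direct = Lam.T @ np.linalg.solve(Sigma, y - mu)
V_direct = np.eye(R) - Lam.T @ np.linalg.solve(Sigma, Lam)
# Efficient form from the slide (only an R x R inverse):
V = np.linalg.inv(np.eye(R) + Lam.T @ (Lam / psi[:, None]))
m = V @ Lam.T @ ((y - mu) / psi)
```

Note that V is computed without touching y at all, which is the "covariance does not depend on the data" observation made in code.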

  21. Factor Analysis (FA): Expectation Maximization
      How does it look?
      [Figure: the observation y and the mean µ in the space (x1, x2, x3).]

  22. Factor Analysis (FA): Expectation Maximization
      Let's subtract the mean for our computation.
      The likelihood for the complete data is:

        ℓ(Λ, Ψ) = Σ_n log p(x_n, y_n)
                = Σ_n [ log p(x_n) + log p(y_n | x_n) ]
                = −(N/2) log |Ψ| − (1/2) Σ_n x_nᵀ x_n
                  − (1/2) Σ_n (y_n − Λx_n)ᵀ Ψ⁻¹ (y_n − Λx_n)

      Dropping the x_nᵀ x_n term, which involves neither Λ nor Ψ:

        ℓ(Λ, Ψ) = −(N/2) log |Ψ| − (N/2) tr( S Ψ⁻¹ )

        with  S = (1/N) Σ_n (y_n − Λx_n)(y_n − Λx_n)ᵀ

  23. Factor Analysis (FA): Expectation Maximization
      Now, let's compute the M-step! (Almost there!)
      We need the derivatives of the log-likelihood:

        ∂ℓ(Λ, Ψ)/∂Λ    = Ψ⁻¹ Σ_n y_n x_nᵀ − Ψ⁻¹ Λ Σ_n x_n x_nᵀ
        ∂ℓ(Λ, Ψ)/∂Ψ⁻¹  = (N/2) Ψ − (N/2) S

      and their expectations with respect to q^t (writing E[x_n] = m_n and
      E[x_n x_nᵀ] = V_n = V + m_n m_nᵀ):

        E[∂ℓ/∂Λ]    = Ψ⁻¹ Σ_n y_n m_nᵀ − Ψ⁻¹ Λ Σ_n V_n
        E[∂ℓ/∂Ψ⁻¹]  = (N/2) Ψ − (N/2) E[S]

  24. Factor Analysis (FA): Expectation Maximization
      Finally, set the derivatives to zero and solve:

        Λ^{t+1} = ( Σ_n y_n m_nᵀ ) ( Σ_n V_n )⁻¹

        Ψ^{t+1} = (1/N) diag( Σ_n y_n y_nᵀ − Λ^{t+1} Σ_n m_n y_nᵀ )
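Putting slides 15 to 24 together, the whole procedure fits in a short EM loop. A minimal sketch in Python with NumPy; the function and variable names are mine, not from the slides, and the monotone rise of the tracked likelihood is the standard EM guarantee:

```python
import numpy as np

def fa_em(Y, R, n_iter=50, seed=0):
    """Fit the FA model y ~ N(mu, Lam Lam^T + Psi) by EM.
    Y: (N, P) data; R: number of factors. Returns mu, Lam, psi, log-liks."""
    rng = np.random.default_rng(seed)
    N, P = Y.shape
    mu = Y.mean(axis=0)                   # ML estimate of the mean (slide 13)
    Yc = Y - mu
    S = Yc.T @ Yc / N                     # sample covariance
    Lam = rng.normal(scale=0.1, size=(P, R))
    psi = np.diag(S).copy()
    lls = []
    for _ in range(n_iter):
        # E-step: p(x_n | y_n) = N(m_n, V); V is shared by all n (slide 19).
        PsiInvLam = Lam / psi[:, None]                 # Psi^{-1} Lam
        V = np.linalg.inv(np.eye(R) + Lam.T @ PsiInvLam)
        M = Yc @ PsiInvLam @ V                         # row n is m_n^T
        # M-step: Lam <- (sum y_n m_n^T)(sum E[x_n x_n^T])^{-1} (slide 24)
        Exx = N * V + M.T @ M
        Lam = (Yc.T @ M) @ np.linalg.inv(Exx)
        # Psi <- (1/N) diag(sum y_n y_n^T - Lam sum m_n y_n^T)
        psi = np.maximum(np.diag(S - Lam @ (M.T @ Yc) / N), 1e-6)
        # Track the marginal log-likelihood (constants dropped, slide 11).
        Sig = Lam @ Lam.T + np.diag(psi)
        lls.append(-N / 2 * (np.linalg.slogdet(Sig)[1]
                             + np.trace(np.linalg.solve(Sig, S))))
    return mu, Lam, psi, np.array(lls)

# Usage on synthetic data drawn from a true FA model:
rng = np.random.default_rng(7)
P, R, N = 6, 2, 2000
Lam_true = rng.normal(size=(P, R))
psi_true = rng.uniform(0.3, 0.8, size=P)
X = rng.normal(size=(N, R))
Y = X @ Lam_true.T + rng.normal(size=(N, P)) * np.sqrt(psi_true)
mu_hat, Lam_hat, psi_hat, lls = fa_em(Y, R)
```

Each iteration costs only R × R inverses thanks to the posterior formulas of slide 19, and the tracked likelihood never decreases, which is a useful correctness check when implementing this.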
