F = [F_1, F_2, ..., F_m]' is the m-dimensional vector of common factors


  1. Factor analysis (cf. sections 9.1-9.3)

     Example 9.8: We register the scores in 6 subject areas for 220 students. [Sample correlation matrix] There are indications for 3 “groups” of variables: {1, 2}, {3}, and {4, 5, 6}.

     Example 9.3: In a consumer preference study a random sample of customers was asked to rate several attributes of a new product using a 7-point scale. [Sample correlation matrix] The variables seem to form two “groups” of variables: {1, 3} and {2, 4, 5}.

     An observable random vector $\mathbf{X}$ of dimension $p$ has mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. The (orthogonal) factor model postulates that $\mathbf{X}$ depends linearly on unobservable random variables (latent variables) $F_1, F_2, \ldots, F_m$, called common factors, and $p$ additional (unobservable) sources of variation $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$, called errors or specific factors.

     Model formulation:

     $$
     \begin{aligned}
     X_1 - \mu_1 &= l_{11} F_1 + l_{12} F_2 + \cdots + l_{1m} F_m + \varepsilon_1 \\
     X_2 - \mu_2 &= l_{21} F_1 + l_{22} F_2 + \cdots + l_{2m} F_m + \varepsilon_2 \\
     &\;\;\vdots \\
     X_p - \mu_p &= l_{p1} F_1 + l_{p2} F_2 + \cdots + l_{pm} F_m + \varepsilon_p
     \end{aligned}
     $$

     The coefficient $l_{ij}$ is called the loading of the $i$-th variable on the $j$-th factor.

     In matrix notation the factor model takes the form:

     $$\mathbf{X} - \boldsymbol{\mu} = \mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon}$$

     Here $\mathbf{L} = \{l_{ij}\}$ is the $p \times m$ matrix of factor loadings, $\mathbf{F} = [F_1, F_2, \ldots, F_m]'$ is the $m$-dimensional vector of common factors, and $\boldsymbol{\varepsilon} = [\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p]'$ is the $p$-vector of errors.

     Assumptions:

     $$
     \begin{aligned}
     E(\mathbf{F}) &= \mathbf{0}, & \operatorname{Cov}(\mathbf{F}) &= E(\mathbf{F}\mathbf{F}') = \mathbf{I} \\
     E(\boldsymbol{\varepsilon}) &= \mathbf{0}, & \operatorname{Cov}(\boldsymbol{\varepsilon}) &= E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}') = \boldsymbol{\Psi} = \operatorname{diag}\{\psi_1, \psi_2, \ldots, \psi_p\} \\
     \operatorname{Cov}(\boldsymbol{\varepsilon}, \mathbf{F}) &= \mathbf{0}
     \end{aligned}
     $$
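
To make the model formulation concrete, the following is a minimal sketch that draws observations from X = μ + LF + ε using only the Python standard library. The function name `simulate_factor_model`, the values of `mu`, `L`, and `psi`, and the choice of Gaussian factors and errors are all assumptions made for illustration; the model itself only specifies means and covariances.

```python
import random

def simulate_factor_model(mu, L, psi, n, seed=0):
    """Draw n observations from X = mu + L F + eps.

    Assumption for this sketch: F and eps are Gaussian. The factor model
    only requires E(F) = 0, Cov(F) = I, E(eps) = 0, Cov(eps) = diag(psi),
    and Cov(eps, F) = 0.
    """
    rng = random.Random(seed)
    p, m = len(L), len(L[0])
    sample = []
    for _ in range(n):
        # Common factors: independent, mean 0, variance 1.
        F = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # Specific factors: independent of F, mean 0, variance psi[i].
        eps = [rng.gauss(0.0, psi[i] ** 0.5) for i in range(p)]
        x = [mu[i] + sum(L[i][j] * F[j] for j in range(m)) + eps[i]
             for i in range(p)]
        sample.append(x)
    return sample

# Hypothetical illustrative values with p = 4 variables and m = 2 factors.
mu = [10.0, 20.0, 30.0, 40.0]
L = [[4.0, 1.0], [7.0, 2.0], [-1.0, 6.0], [1.0, 8.0]]
psi = [2.0, 4.0, 1.0, 3.0]

X = simulate_factor_model(mu, L, psi, n=5000)

# The sample mean should be close to mu, since E(F) = 0 and E(eps) = 0.
xbar = [sum(x[i] for x in X) / len(X) for i in range(len(mu))]
print([round(v, 2) for v in xbar])
```

Only X would be observed in practice; F and ε are generated here purely to show how the model produces data.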

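  2. The model implies a structure of the covariance matrix $\boldsymbol{\Sigma}$ of $\mathbf{X}$. To find this structure, we first note that

     $$
     \begin{aligned}
     (\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'
     &= (\mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon})(\mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon})' \\
     &= \mathbf{L}\mathbf{F}(\mathbf{L}\mathbf{F})' + \boldsymbol{\varepsilon}(\mathbf{L}\mathbf{F})' + \mathbf{L}\mathbf{F}\boldsymbol{\varepsilon}' + \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}' \\
     &= \mathbf{L}(\mathbf{F}\mathbf{F}')\mathbf{L}' + (\boldsymbol{\varepsilon}\mathbf{F}')\mathbf{L}' + \mathbf{L}(\mathbf{F}\boldsymbol{\varepsilon}') + \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}'
     \end{aligned}
     $$

     Therefore

     $$
     \begin{aligned}
     \boldsymbol{\Sigma} = \operatorname{Cov}(\mathbf{X})
     &= E\{(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'\} \\
     &= \mathbf{L}\,E(\mathbf{F}\mathbf{F}')\,\mathbf{L}' + E(\boldsymbol{\varepsilon}\mathbf{F}')\,\mathbf{L}' + \mathbf{L}\,E(\mathbf{F}\boldsymbol{\varepsilon}') + E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}') \\
     &= \mathbf{L}\mathbf{I}\mathbf{L}' + \mathbf{0}\mathbf{L}' + \mathbf{L}\mathbf{0}' + \boldsymbol{\Psi} \\
     &= \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}
     \end{aligned}
     $$

     We may write

     $$\sigma_{ii} = \operatorname{Var}(X_i) = \underbrace{\sum_{j=1}^{m} l_{ij}^2}_{\text{communality}} + \underbrace{\psi_i}_{\text{specific variance}} \;\overset{\text{def}}{=}\; h_i^2 + \psi_i$$

     Also

     $$\operatorname{Cov}(X_i, X_k) = \sum_{j=1}^{m} l_{ij}\, l_{kj}, \qquad i \neq k$$

     Further we have

     $$(\mathbf{X} - \boldsymbol{\mu})\mathbf{F}' = (\mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon})\mathbf{F}' = \mathbf{L}(\mathbf{F}\mathbf{F}') + \boldsymbol{\varepsilon}\mathbf{F}'$$

     so that

     $$\operatorname{Cov}(\mathbf{X}, \mathbf{F}) = E\{(\mathbf{X} - \boldsymbol{\mu})\mathbf{F}'\} = \mathbf{L}\,E(\mathbf{F}\mathbf{F}') + E(\boldsymbol{\varepsilon}\mathbf{F}') = \mathbf{L}$$

     This gives $\operatorname{Cov}(X_i, F_j) = l_{ij}$. The factor loadings are the covariances between the observable variables and the unobservable factors.

     Example 9.1: Consider the covariance matrix $\boldsymbol{\Sigma}$ [matrix]. A direct computation shows that it can be written in the form $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$ [matrices].

     When we have no structure for the covariance matrix $\boldsymbol{\Sigma}$, it depends on $p(p+1)/2$ parameters, namely the $p$ variances $\sigma_{ii}$ and the $p(p-1)/2$ covariances $\sigma_{ik}$, $i \neq k$.

     When the relation $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$ holds, the covariance matrix $\boldsymbol{\Sigma}$ may be expressed by $p(m+1)$ parameters, namely the $pm$ factor loadings $l_{ij}$ and the $p$ specific variances $\psi_i$.

     Unfortunately, many covariance matrices $\boldsymbol{\Sigma}$ cannot be expressed as $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$ with $m$ (much) smaller than $p$. (Example 9.2 discusses one problem that may occur.)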
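
The covariance structure Σ = LL' + Ψ, the communalities, and the two parameter counts can be checked with a few lines of arithmetic. The sketch below uses plain Python lists; the helper names (`implied_covariance`, `communalities`, `parameter_counts`) and the numerical values of `L` and `psi` are illustrative assumptions chosen for this example, not values recovered from the slides.

```python
def implied_covariance(L, psi):
    """Return Sigma = L L' + Psi, with Psi = diag(psi)."""
    p, m = len(L), len(L[0])
    return [[sum(L[i][j] * L[k][j] for j in range(m))
             + (psi[i] if i == k else 0)
             for k in range(p)]
            for i in range(p)]

def communalities(L):
    """Return h_i^2 = sum_j l_ij^2 for each variable i."""
    return [sum(l * l for l in row) for row in L]

def parameter_counts(p, m):
    """Return (unstructured count p(p+1)/2, factor-model count p(m+1))."""
    return p * (p + 1) // 2, p * (m + 1)

# Hypothetical loadings (p = 4, m = 2) and specific variances.
L = [[4, 1], [7, 2], [-1, 6], [1, 8]]
psi = [2, 4, 1, 3]

Sigma = implied_covariance(L, psi)
h2 = communalities(L)

for row in Sigma:
    print(row)
# [19, 30, 2, 12]
# [30, 57, 5, 23]
# [2, 5, 38, 47]
# [12, 23, 47, 68]

# Each variance splits into communality plus specific variance.
print([(h2[i], psi[i], Sigma[i][i]) for i in range(4)])
# [(17, 2, 19), (53, 4, 57), (37, 1, 38), (65, 3, 68)]

print(parameter_counts(12, 2))  # (78, 36)
```

The last line shows why the structure is attractive when m is small relative to p: with p = 12 and m = 2 the factor model needs 36 parameters instead of 78. For very small p the saving can vanish (p = 4, m = 2 gives 12 against 10).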

  3. There is some inherent ambiguity associated with the factor model when $m > 1$. To look closer at this, let $\mathbf{T}$ be an $m \times m$ orthogonal matrix. The factor model may then be reformulated as

     $$\mathbf{X} - \boldsymbol{\mu} = \mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon} = \mathbf{L}\mathbf{T}\mathbf{T}'\mathbf{F} + \boldsymbol{\varepsilon} = \mathbf{L}^{*}\mathbf{F}^{*} + \boldsymbol{\varepsilon}$$

     where $\mathbf{L}^{*} = \mathbf{L}\mathbf{T}$ and $\mathbf{F}^{*} = \mathbf{T}'\mathbf{F}$.

     Now

     $$E(\mathbf{F}^{*}) = \mathbf{T}'E(\mathbf{F}) = \mathbf{0}, \qquad \operatorname{Cov}(\mathbf{F}^{*}) = \mathbf{T}'\operatorname{Cov}(\mathbf{F})\,\mathbf{T} = \mathbf{T}'\mathbf{T} = \mathbf{I}$$

     Therefore it is impossible on the basis of observations to distinguish the loadings $\mathbf{L}$ from the loadings $\mathbf{L}^{*}$. Thus the factor loadings $\mathbf{L}$ are determined only up to an orthogonal matrix $\mathbf{T}$.

     There are different methods for estimation of the factor model. We will consider:

     • estimation using principal components
     • maximum likelihood estimation

     For both methods the solution may be rotated by multiplication by an orthogonal matrix to simplify the interpretation of factors (to be described later).
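
The rotational ambiguity can be checked numerically: for any orthogonal T, the rotated loadings L* = LT give the same product L*L*' = LL', and hence the same Σ. The sketch below uses a 2 × 2 rotation matrix as one example of an orthogonal T; the helper names, the loadings `L`, and the angle are assumptions made for illustration.

```python
import math

def matmul(A, B):
    """Plain-list matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def rotation(theta):
    """A 2 x 2 rotation matrix, which is orthogonal: T'T = I."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical loadings with p = 4 and m = 2, and an arbitrary angle.
L = [[4.0, 1.0], [7.0, 2.0], [-1.0, 6.0], [1.0, 8.0]]
T = rotation(math.pi / 6)

L_star = matmul(L, T)  # rotated loadings L* = L T

LLt = matmul(L, transpose(L))
LLt_star = matmul(L_star, transpose(L_star))

# Largest entrywise difference between L L' and L* L*'.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(LLt, LLt_star)
               for a, b in zip(row_a, row_b))

print(L_star[0])         # differs from L[0] = [4.0, 1.0]
print(max_diff < 1e-9)   # True: both loadings imply the same covariance
```

Since Ψ is unaffected by T, the two sets of loadings imply identical covariance matrices, which is why a rotation can be chosen freely to aid interpretation.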
