

  1. Consequences of measurement error Psychology 588: Covariance structure and factor models

2. Scaling indeterminacy of latent variables
• Scale of a latent variable is arbitrary and "determined" by a convention for convenience
• Typically set to a variance of one (the factor analysis convention) or to be identical to an arbitrarily chosen indicator's scale; by centering indicator variables, we set the latent variables' means to zero
• Consider the following transformation:

$$x_j = \nu_j + \lambda_j \xi + \delta_j, \quad j = 1, \ldots, J; \qquad \xi^* = a + b\,\xi, \quad b \neq 0$$

$$x_j = \nu_j^* + \lambda_j^* \xi^* + \delta_j, \qquad \nu_j^* = \nu_j - \frac{\lambda_j a}{b}, \quad \lambda_j^* = \frac{\lambda_j}{b}$$

3. If all J indicators are considered simultaneously, vector notation is more convenient:

$$\mathbf{x} = \boldsymbol{\nu} + \boldsymbol{\lambda}\,\xi + \boldsymbol{\delta}, \qquad \xi^* = a + b\,\xi$$

$$\mathbf{x} = \left(\boldsymbol{\nu} - \frac{\boldsymbol{\lambda}\,a}{b}\right) + \frac{\boldsymbol{\lambda}}{b}\,\xi^* + \boldsymbol{\delta} = \boldsymbol{\nu}^* + \boldsymbol{\lambda}^*\,\xi^* + \boldsymbol{\delta}$$

meaning that the linear transformation of ξ can be exactly compensated by the accordingly transformed ν* = ν − λa/b and λ* = λ/b, leaving the errors δ unchanged (i.e., the same fit)
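A quick numeric check of this indeterminacy (a sketch with made-up values; the loadings, `a`, and `b` are arbitrary choices, not from the slides): rescaling ξ while adjusting ν and λ accordingly reproduces x exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
J, n = 3, 5                       # 3 indicators, 5 observations (illustrative)
nu = np.array([1.0, 2.0, 3.0])    # intercepts
lam = np.array([1.0, 0.8, 1.2])   # loadings
xi = rng.normal(size=n)           # latent scores
delta = rng.normal(scale=0.3, size=(n, J))  # measurement errors

a, b = 5.0, 2.0                   # arbitrary rescaling of the latent variable
xi_star = a + b * xi
nu_star = nu - lam * a / b
lam_star = lam / b

x = nu + np.outer(xi, lam) + delta
x_star = nu_star + np.outer(xi_star, lam_star) + delta
print(np.allclose(x, x_star))     # True: identical fit, errors unchanged
```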

4. What's so great about measurement errors in equations?
• Regression weights and correlations are interpreted under the implicit assumption that the "operationally defined" variables involve no measurement error --- hardly realistic for theoretical constructs (e.g., self-esteem, IQ, etc.)
• Ignoring measurement error leads to inconsistent estimates
• We will see the consequences of ignoring measurement error

5. Univariate consequences
• Consider a mean-included equation for X (hours worked per week) as an indicator of ξ (achievement motivation):

$$X = \nu + \lambda\xi + \delta, \qquad E(\delta) = 0, \quad \mathrm{cov}(\xi, \delta) = 0$$

$$E(X) = \nu + \lambda\,E(\xi), \qquad \mathrm{var}(X) = \lambda^2\,\mathrm{var}(\xi) + \mathrm{var}(\delta)$$

• Given only one indicator per latent variable, the intercept and loading (i.e., weight) are simply scaling constants for ξ
• However, if the ξ scale is set comparable to the X scale (i.e., λ = 1), we see that var(X) is an over-estimate of ϕ = var(ξ) if δ is not included in the equation
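The over-estimation is easy to see in a small simulation (parameter values made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
phi = 4.0                          # var(xi), the latent variance
theta = 1.0                        # var(delta), the error variance
xi = rng.normal(scale=np.sqrt(phi), size=n)
X = 40 + xi + rng.normal(scale=np.sqrt(theta), size=n)  # nu = 40, lambda = 1

print(X.var())   # ~5.0 = phi + var(delta), over-estimating phi = 4.0
```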

6. Bivariate relation and simple regression
• True data structure (path diagram: ξ → η with weight γ and disturbance ζ; x indicates ξ with error δ, y indicates η with error ε; η: job satisfaction, y: satisfaction scale):

$$\eta = \gamma\xi + \zeta, \qquad x = \lambda_1\xi + \delta, \qquad y = \lambda_2\eta + \varepsilon$$

• cov(x, y) is an unbiased estimate of cov(ξ, η) with λ₁ = λ₂ = 1, since no other variables (δ and ε) can explain cov(x, y):

$$\mathrm{cov}(x, y) = \mathrm{cov}(\xi + \delta,\; \eta + \varepsilon) = \mathrm{cov}(\xi, \eta)$$

7. • From the previous equations, and by analogy with y = γ*x + ζ* if measurement errors are ignored,

$$\gamma^* = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} = \left(\frac{\phi}{\phi + \mathrm{var}(\delta)}\right)\gamma$$

The parenthesized ratio (the reliability of x) becomes 1 only with no measurement error; otherwise γ* is an attenuated version of γ, and $\hat{\gamma}^* = s_{xy}/s_x^2$ is an inconsistent estimator of γ
• If λ₁, λ₂ ≠ 1, the bias of the regression weight has an additional factor of λ₂/λ₁ --- but such scaling is unusual when there is only one indicator per latent variable
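A simulation sketch of the attenuation, with λ₁ = λ₂ = 1 and made-up variances:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
gamma, phi, theta = 0.7, 1.0, 0.5        # true weight, var(xi), var(delta)
xi = rng.normal(scale=np.sqrt(phi), size=n)
eta = gamma * xi + rng.normal(scale=0.5, size=n)   # disturbance zeta
x = xi + rng.normal(scale=np.sqrt(theta), size=n)  # lambda_1 = 1
y = eta                                            # lambda_2 = 1

gamma_star = np.cov(x, y)[0, 1] / x.var()
print(gamma_star)                   # ~0.467: attenuated
print(gamma * phi / (phi + theta))  # rho_xx * gamma = 0.7 * (1/1.5) ~ 0.467
```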

8. • Correlations:

$$\rho_{xy}^2 = \frac{\mathrm{cov}^2(x, y)}{\mathrm{var}(x)\,\mathrm{var}(y)} = \frac{\mathrm{cov}^2(\xi, \eta)}{\left[\mathrm{var}(\xi) + \mathrm{var}(\delta)\right]\left[\mathrm{var}(\eta) + \mathrm{var}(\varepsilon)\right]} = \rho_{\xi\eta}^2\,\rho_{xx}\,\rho_{yy}$$

which shows an attenuation of the "true" correlation due to measurement error, with the familiar correction formula:

$$\rho_{\xi\eta} = \frac{\rho_{xy}}{\sqrt{\rho_{xx}\,\rho_{yy}}}$$
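The correction formula can be checked numerically; here the reliabilities are computed from the simulated latent scores, which stands in for knowing them in practice:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500_000
xi = rng.normal(size=n)
eta = 0.6 * xi + rng.normal(scale=0.8, size=n)
x = xi + rng.normal(scale=0.5, size=n)   # reliability rho_xx = 1/1.25 = 0.8
y = eta + rng.normal(scale=0.7, size=n)

rho_xy = np.corrcoef(x, y)[0, 1]
rho_xx = xi.var() / x.var()              # known only in a simulation
rho_yy = eta.var() / y.var()
print(np.corrcoef(xi, eta)[0, 1])        # "true" latent correlation
print(rho_xy / np.sqrt(rho_xx * rho_yy)) # corrected estimate, close to true
```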

9. Consequences in multiple regression
• True data structure (path diagram: ξ₁, ξ₂, ξ₃ → η with weights γ₁, γ₂, γ₃ and disturbance ζ; each ξᵢ indicated by xᵢ with error δᵢ and loading 1; η indicated by y with error ε and loading 1):

$$\eta = \boldsymbol{\gamma}'\boldsymbol{\xi} + \zeta, \qquad \mathbf{x} = \boldsymbol{\xi} + \boldsymbol{\delta}, \qquad y = \eta + \varepsilon$$

with Λₓ = I and λ_y = 1
• Ignoring measurement errors: y = γ*′x + ζ*
• Covariances:

$$\boldsymbol{\sigma}_{\xi y} = \mathrm{cov}(\boldsymbol{\xi}, y) = \mathrm{cov}(\boldsymbol{\xi},\; \boldsymbol{\gamma}'\boldsymbol{\xi} + \zeta) = \boldsymbol{\Phi\gamma}$$

$$\boldsymbol{\sigma}_{xy} = \mathrm{cov}(\mathbf{x}, y) = \mathrm{cov}(\boldsymbol{\xi} + \boldsymbol{\delta},\; \boldsymbol{\gamma}'\boldsymbol{\xi} + \zeta) = \boldsymbol{\Phi\gamma}$$

10. • By analogy with y = γ*′x + ζ*,

$$\boldsymbol{\gamma}^* = \boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\sigma}_{xy} = (\boldsymbol{\Phi} + \boldsymbol{\Theta}_\delta)^{-1}\boldsymbol{\Phi\gamma}$$

Without measurement error (Θ_δ = 0), γ* = γ; otherwise γ* ≠ γ
• Alternatively written: $\boldsymbol{\gamma}^* = \boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{x\xi}\,\boldsymbol{\gamma}$ since Σ_{xξ} = Φ --- where Σ_{xx}⁻¹Σ_{xξ} is the OLS estimator of B in ξᵢ = B′xᵢ + eᵢ, i.e., the regression weights for prediction of ξ by x; again, without measurement error, Σ_{xx}⁻¹Σ_{xξ} = I
• Note: in Bollen (pp. 159-168), γ, σ_{xy}, σ_{ξy} are meant to be Γ, Σ_{xy}, Σ_{ξy}, respectively, for the multiple regression model
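A small numpy check of γ* = (Φ + Θ_δ)⁻¹Φγ, using made-up Φ, Θ_δ, and γ (not Bollen's examples):

```python
import numpy as np

Phi = np.array([[1.0, 0.3, 0.2],
                [0.3, 1.0, 0.4],
                [0.2, 0.4, 1.0]])           # cov(xi)
Theta = np.diag([0.5, 0.2, 0.0])            # error variances; x3 error-free
gamma = np.array([0.7, -0.3, 0.5])

Sigma_xx = Phi + Theta                      # cov(x) when x = xi + delta
sigma_xy = Phi @ gamma                      # cov(x, y) = Phi gamma
gamma_star = np.linalg.solve(Sigma_xx, sigma_xy)
print(gamma_star)                           # != gamma unless Theta = 0
print(np.linalg.solve(Phi, sigma_xy))       # Theta = 0 recovers gamma exactly
```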

11. • As a very simplified case, suppose x₁ is the only fallible indicator:

$$x_1 = \xi_1 + \delta_1, \qquad x_i = \xi_i, \quad i = 2, \ldots, q$$

with the true and estimated regression equations:

$$y = \gamma_1\xi_1 + \gamma_2\xi_2 + \cdots + \gamma_q\xi_q + \zeta$$

$$y = \gamma_1^* x_1 + \gamma_2^*\xi_2 + \cdots + \gamma_q^*\xi_q + \zeta^*$$

• In this special case, the regression weight matrix has a simple multiplicative form of bias (hint: use Σ_{xx} = Φ + Θ_δ):

$$\boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{x\xi} = (\boldsymbol{\Phi} + \boldsymbol{\Theta}_\delta)^{-1}\boldsymbol{\Phi} = \begin{bmatrix} b_1 & \mathbf{0}'_{q-1} \\ \mathbf{b}_{(2)} & \mathbf{I}_{q-1} \end{bmatrix}, \qquad \mathbf{b}_{(2)} = (b_2, \ldots, b_q)'$$

12. • Consequently, the resulting bias factors are:

$$\gamma_1^* = b_1\,\gamma_1, \qquad \gamma_i^* = \gamma_i + b_i\,\gamma_1, \quad i = 2, \ldots, q$$

◦ The bias factor b₁ for x₁ is less than 1 in absolute value (1 without measurement error), and so γ₁* is biased toward 0 --- b₁ is the regression weight on x₁ in ξ₁ = b₀ + b₁x₁ + b₂ξ₂ + … + b_qξ_q
◦ Consequences for xᵢ, i = 2,…,q are additive, depending on the relationship between ξ₁ and ξᵢ holding all other IVs constant, and on γ₁
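The partitioned form and the resulting bias factors can be verified numerically (illustrative matrices assumed):

```python
import numpy as np

Phi = np.array([[1.0, 0.4, 0.3],
                [0.4, 1.0, 0.2],
                [0.3, 0.2, 1.0]])
Theta = np.diag([0.5, 0.0, 0.0])      # only xi1's indicator is fallible
gamma = np.array([0.6, 0.2, -0.4])

B = np.linalg.solve(Phi + Theta, Phi)  # Sigma_xx^{-1} Sigma_x_xi
b = B[:, 0]                            # weights for predicting xi1 from x
gamma_star = B @ gamma

print(np.round(B, 3))                  # first column = b, remainder = I
print(gamma_star[0], b[0] * gamma[0])            # gamma1* = b1 * gamma1
print(gamma_star[1], gamma[1] + b[1] * gamma[0]) # additive bias for x2
```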

13. • So far, all reasoning is based on rather unrealistic assumptions:
◦ Only a single indicator per latent variable, so that its loading becomes simply a scaling constant
◦ Only one fallible IV
• Without such assumptions (e.g., all IVs fallible), the consequences of measurement error become too complicated to simplify algebraically --- there is no particularly simple form of Σ_{xx}⁻¹Σ_{xξ}
• One clear conclusion: all estimates are inconsistent --- systematically different from what they are meant to be

14. • Consequence for standardization:

$$\gamma_i^{*(\text{standardized})} = \gamma_i^*\,\sqrt{\frac{\mathrm{var}(x_i)}{\mathrm{var}(y)}}, \qquad \mathrm{var}(x_i) = \phi_{ii} + \mathrm{var}(\delta_i)$$

so the error-inflated var(xᵢ) distorts the standardized weight further
• Consequence for the SMC is similar to the bivariate case:

$$\mathrm{plim}\, R^{*2} \leq \mathrm{plim}\, R^2$$

• What should we do with essentially omnipresent measurement error?
◦ Use SEM, which allows for measurement errors in the model --- though we are limited in certain models regarding model identification (e.g., Table 5.1, p. 164)
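A sketch of the SMC attenuation in the bivariate case, where the ratio R*²/R² should approach the reliability of x (made-up parameters):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300_000
xi = rng.normal(size=n)
y = 0.8 * xi + rng.normal(scale=0.6, size=n)
x = xi + rng.normal(scale=0.5, size=n)      # reliability rho_xx = 0.8

r2_true = np.corrcoef(xi, y)[0, 1] ** 2     # R^2 with the error-free IV
r2_star = np.corrcoef(x, y)[0, 1] ** 2      # attenuated R^2
print(r2_true, r2_star, r2_star / r2_true)  # ratio ~ rho_xx = 0.8
```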

15. Correlated errors of measurement
• The consequence for regression weights is further complicated:

$$\boldsymbol{\gamma}^* = \boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\Sigma}_{x\xi}\,\boldsymbol{\gamma} + \boldsymbol{\Sigma}_{xx}^{-1}\boldsymbol{\sigma}_{\delta\varepsilon}$$

For simple regression:

$$\gamma^* = \left(\frac{\phi}{\phi + \mathrm{var}(\delta)}\right)\gamma + \frac{\mathrm{cov}(\delta, \varepsilon)}{\mathrm{var}(x)}$$

Now, |γ*| is not necessarily < |γ|
• If correlated measurement errors occur only within the IVs (i.e., σ_δε = 0, Σ_{xx} = Φ + Θ_δ where Θ_δ is not diagonal), γ* = Σ_{xx}⁻¹Σ_{xξ}γ still holds (but the bias factor will have a more complicated form, also involving the off-diagonal entries of Θ_δ)
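A demonstration that positively correlated δ and ε can push γ* above γ rather than attenuate it (illustrative covariances):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400_000
gamma, phi = 0.5, 1.0
xi = rng.normal(size=n)
eta = gamma * xi + rng.normal(scale=0.5, size=n)
# correlated measurement errors for x and y: cov(delta, eps) = 0.3
errs = rng.multivariate_normal([0, 0], [[0.4, 0.3], [0.3, 0.4]], size=n)
x, y = xi + errs[:, 0], eta + errs[:, 1]

gamma_star = np.cov(x, y)[0, 1] / x.var()
print(gamma_star)                        # ~0.571 > gamma = 0.5
print((gamma * phi + 0.3) / (phi + 0.4)) # rho_xx*gamma + cov(d,e)/var(x)
```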

16. With multiple equations
• In path models with sequential causal paths, the consequences of measurement error are very hard to generalize simply --- see the union sentiment (Fig. 5.2, p. 169) and SES (Fig. 5.4, p. 173) examples
• If reliabilities are known, the corresponding error variances can be constrained; if unknown, the error variances may be modeled as free parameters, provided they are identifiable
• To keep in mind: we need more than one indicator per latent variable for identifiability and statistical testing --- leading to measurement models with multiple indicators, or CFA
