Structural Equation Modeling Using gllamm, confa and gmm
Stas Kolenikov
Department of Statistics, University of Missouri-Columbia
Joint work with Kenneth Bollen (UNC)
July 15, 2010
Goals of the talk
1 Introduce structural equation models
2 Describe Stata packages to fit them:
  • confa: a 5/8” hex wrench
  • gllamm: a Swiss-army tomahawk
  • gmm: do-it-yourself kit
3 Example: daily functioning in NHANES
First, some theory
• Introduction
• Structural equation models
  • Formulation
  • Path diagrams
  • Identification
  • Estimation
• Stata tools for SEM
  • gllamm
  • confa
  • gmm + sem4gmm
• NHANES daily functioning
• Outlets
• References
Structural equation modeling (SEM)
• Standard multivariate technique in social sciences
• Incorporates constructs that cannot be directly observed:
  • psychology: level of stress
  • sociology: quality of democratic institutions
  • biology: genotype and environment
  • health: difficulty in personal functioning
• Special cases:
  • linear regression
  • confirmatory factor analysis
  • simultaneous equations
  • errors-in-variables and instrumental variables regression
Origins of SEM
Path analysis of Sewall Wright (1918)
⊗
Causal modeling of Hubert Blalock (1961)
⊗
Factor analysis estimation of Karl Jöreskog (1969)
⊗
Econometric simultaneous equations of Arthur Goldberger (1972)
Structural equations model
Latent variables:
\[ \eta = \alpha_\eta + B \eta + \Gamma \xi + \zeta \tag{1} \]
Measurement model for observed variables:
\[ y = \alpha_y + \Lambda_y \eta + \varepsilon \tag{2} \]
\[ x = \alpha_x + \Lambda_x \xi + \delta \tag{3} \]
$\xi$, $\zeta$, $\varepsilon$, $\delta$ are uncorrelated with one another.
Jöreskog (1973), Bollen (1989), Yuan & Bentler (2007)
Implied moments
Denoting
\[ V[\xi] = \Phi, \quad V[\zeta] = \Psi, \quad V[\varepsilon] = \Theta_\varepsilon, \quad V[\delta] = \Theta_\delta, \quad E[\xi] = \mu_\xi, \]
\[ R = \Lambda_y (I - B)^{-1}, \quad z = \begin{pmatrix} x \\ y \end{pmatrix}, \]
obtain
\[ \mu(\theta) \equiv E[z] = \begin{pmatrix} \alpha_x + \Lambda_x \mu_\xi \\ \alpha_y + R (\alpha_\eta + \Gamma \mu_\xi) \end{pmatrix} \tag{4} \]
\[ \Sigma(\theta) \equiv V[z] = \begin{pmatrix} \Lambda_x \Phi \Lambda_x' + \Theta_\delta & \Lambda_x \Phi \Gamma' R' \\ R \Gamma \Phi \Lambda_x' & R (\Gamma \Phi \Gamma' + \Psi) R' + \Theta_\varepsilon \end{pmatrix} \tag{5} \]
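As a small numeric illustration of the x-block of (4) and (5), the sketch below computes the implied mean and covariance of x for a model with a single latent variable ξ. The function name and the parameter values are hypothetical and chosen only for illustration; the full model with η needs matrix algebra that is omitted here.

```python
def implied_moments(alpha_x, lam, mu_xi, phi, theta):
    """Implied mean vector and covariance matrix of x for one latent factor.

    alpha_x, lam, theta: intercepts, loadings and error variances (length p).
    mu_xi, phi: mean and variance of the single latent variable xi.
    """
    p = len(lam)
    # Mean: alpha_x + Lambda_x * mu_xi
    mu = [alpha_x[i] + lam[i] * mu_xi for i in range(p)]
    # Covariance: Lambda_x * Phi * Lambda_x' + Theta_delta (diagonal)
    sigma = [[lam[i] * phi * lam[j] + (theta[i] if i == j else 0.0)
              for j in range(p)] for i in range(p)]
    return mu, sigma


# Hypothetical parameter values, not taken from the talk.
mu, sigma = implied_moments(
    alpha_x=[0.0, 1.0, 2.0], lam=[1.0, 0.5, 2.0],
    mu_xi=3.0, phi=2.0, theta=[1.0, 1.0, 1.0])
```

Every off-diagonal element of the implied covariance matrix is a product of two loadings and the factor variance, which is the structure the estimators try to match to the sample covariance matrix.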
Path diagrams
[Figure: path diagram with latent variables ξ1 and η1, observed variables x1, x2, x3, y1, y2, y3 and z1, errors δ1, δ2, δ3, ε1, ε2, ε3 and ζ1, and parameters λ, β, φ, θ and σ1.]
Identification
Before proceeding to estimation, the researcher needs to verify that the SEM is identified:
\[ \Pr\{ X : f(X, \theta) = f(X, \theta') \Rightarrow \theta = \theta' \} = 1 \]
Different parameter values should give rise to different likelihoods/objective functions, either globally or locally in a neighborhood of a point in the parameter space.
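A quick first screen for identification is the counting rule (often called the t-rule): a model cannot be identified if it has more free parameters than distinct sample moments. The rule is not stated on the slide and is only a necessary condition, not a sufficient one; the function names below are hypothetical.

```python
def moment_count(p, mean_structure=False):
    """Number of distinct sample moments for p observed variables."""
    n_cov = p * (p + 1) // 2          # distinct elements of S
    return n_cov + (p if mean_structure else 0)


def passes_counting_rule(p, n_params, mean_structure=False):
    """Necessary (not sufficient) condition: no more parameters than moments."""
    return n_params <= moment_count(p, mean_structure)


# One factor, three indicators, first loading fixed at 1:
# 2 loadings + 1 factor variance + 3 error variances = 6 parameters.
three_indicators_ok = passes_counting_rule(p=3, n_params=6)

# One factor, two indicators, first loading fixed at 1:
# 1 loading + 1 factor variance + 2 error variances = 4 parameters, 3 moments.
two_indicators_ok = passes_counting_rule(p=2, n_params=4)
```

Passing this check does not establish identification; it only rules out models that are certainly underidentified.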
Likelihood
• Normal data ⇒ the likelihood is a function of the sufficient statistic $(\bar z, S)$:
\[ -2 \log L(\theta, Y, X) \sim n \ln \det \Sigma(\theta) + n \, \mathrm{tr}[\Sigma^{-1}(\theta) S] + n (\bar z - \mu(\theta))' \Sigma^{-1}(\theta) (\bar z - \mu(\theta)) \to \min_\theta \tag{6} \]
• Generalized latent variable approach for mixed response (normal, binomial, Poisson, ordinal, within the same model):
\[ -2 \log L(\theta, Y, X) \sim -2 \sum_{i=1}^{n} \ln \int f(y_i, x_i \mid \xi, \zeta; \theta) \, dF(\xi, \zeta \mid \theta) \tag{7} \]
Bartholomew & Knott (1999), Skrondal & Rabe-Hesketh (2004)
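To make the normal-theory objective (6) concrete, here is a minimal sketch that evaluates its right-hand side for two observed variables, so that the determinant and inverse can be written out by hand. The function names and the example matrices are hypothetical.

```python
import math


def det2(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def inv2(a):
    """Inverse of a 2x2 matrix."""
    d = det2(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]


def neg2_loglik(n, zbar, S, mu, Sigma):
    """Right-hand side of (6), up to an additive constant, for two variables."""
    Si = inv2(Sigma)
    # tr[Sigma^{-1} S]
    trace = sum(Si[i][j] * S[j][i] for i in range(2) for j in range(2))
    # (zbar - mu)' Sigma^{-1} (zbar - mu)
    d = [zbar[0] - mu[0], zbar[1] - mu[1]]
    quad = sum(d[i] * Si[i][j] * d[j] for i in range(2) for j in range(2))
    return n * (math.log(det2(Sigma)) + trace + quad)


# Hypothetical sample moments.
S_example = [[2.0, 1.0], [1.0, 2.0]]
zbar_example = [0.0, 0.0]

# Saturated fit (Sigma = S, mu = zbar) versus an independence-type guess.
fit_saturated = neg2_loglik(100, zbar_example, S_example, zbar_example, S_example)
fit_identity = neg2_loglik(100, zbar_example, S_example, zbar_example,
                           [[1.0, 0.0], [0.0, 1.0]])
```

The saturated model attains the smallest value of the objective, so any restricted Σ(θ) can only do as well or worse; the difference between the two values is what a likelihood ratio test is built on.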
Estimation methods
• (quasi-)MLE
• Weighted least squares:
\[ s = \mathrm{vech}\, S, \quad \sigma(\theta) = \mathrm{vech}\, \Sigma(\theta), \]
\[ F = (s - \sigma(\theta))' V_n (s - \sigma(\theta)) \to \min_\theta \tag{8} \]
  where $V_n$ is the weighting matrix:
  • Optimal: $\hat V_n^{(1)} = \hat V[s - \sigma(\theta)]^{-1}$ (Browne 1984)
  • Simplistic: least squares, $V_n^{(2)} = I$
  • Diagonally weighted least squares: $\hat V_n^{(3)} = (\mathrm{diag}\, \hat V[s - \sigma])^{-1}$
• Model-implied instrumental variables limited information estimator (Bollen 1996)
• Bounded influence/outlier-robust methods (Yuan, Bentler & Chan 2004, Moustaki & Victoria-Feser 2006)
• Empirical likelihood
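The discrepancy function (8) can be sketched in a few lines: stack the non-redundant elements of S and Σ(θ) with vech, then form the quadratic form in their difference. The function names are hypothetical, and the weight matrix is simply supplied by the caller; with no weight matrix the sketch falls back to V = I, the plain least squares case.

```python
def vech(m):
    """Stack the lower triangle of a symmetric matrix column by column."""
    p = len(m)
    return [m[i][j] for j in range(p) for i in range(j, p)]


def wls_discrepancy(S, Sigma, V=None):
    """F = (s - sigma)' V (s - sigma); V=None means V = I (least squares)."""
    r = [a - b for a, b in zip(vech(S), vech(Sigma))]
    k = len(r)
    if V is None:
        return sum(x * x for x in r)
    return sum(r[i] * V[i][j] * r[j] for i in range(k) for j in range(k))


# Hypothetical sample and model-implied covariance matrices.
S_obs = [[2.0, 1.0], [1.0, 3.0]]
Sigma_model = [[1.0, 0.0], [0.0, 1.0]]

f_uls = wls_discrepancy(S_obs, Sigma_model)
# A diagonal weight matrix, as in diagonally weighted least squares.
f_dwls = wls_discrepancy(S_obs, Sigma_model,
                         V=[[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.5]])
```

In an actual estimator this function would be handed to a numerical optimizer over θ; only the evaluation step is shown here.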
Goodness of fit
• The estimated model $\Sigma(\hat\theta)$ is often related to the “saturated” model $\Sigma \equiv S$ and/or the independence model $\Sigma_0 = \mathrm{diag}\, S$
• Likelihood formulation ⇒ likelihood ratio test, asymptotically $\chi^2_k$
• Non-normal data: LRT statistic $\sim \sum_j w_j \chi^2_1$, can be Satterthwaite-adjusted towards the mean and variance of the appropriate $\chi^2_k$ (Satorra & Bentler 1994, Yuan & Bentler 1997)
• Analogies with regression $R^2$ attempted, about three dozen fit indices available (Marsh, Balla & Hau 1996)
• Reliability of indicators: $R^2$ in regression of an indicator on its latent variable
• Signs and magnitudes of coefficient estimates
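For an indicator with a single latent variable behind it, x_j = α_j + λ_j ξ + δ_j, the reliability in the list above is the share of the indicator's variance explained by ξ, that is λ_j² φ / (λ_j² φ + θ_j). A minimal sketch, with a hypothetical function name and made-up parameter values:

```python
def indicator_reliability(lam, phi, theta):
    """R^2 of an indicator regressed on its latent variable.

    lam: loading, phi: variance of the latent variable,
    theta: variance of the measurement error.
    """
    explained = lam * lam * phi
    return explained / (explained + theta)


# Hypothetical estimates for two indicators of the same factor.
r2_strong = indicator_reliability(lam=1.0, phi=2.0, theta=1.0)
r2_weak = indicator_reliability(lam=0.5, phi=2.0, theta=1.0)
```

With equal error variances, the indicator with the smaller loading carries less information about the latent variable.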
Now, some tools
• Stata tools for SEM
  • gllamm
  • confa
  • gmm + sem4gmm
gllamm
Generalized Linear Latent And Mixed Models (Skrondal & Rabe-Hesketh 2004, Rabe-Hesketh, Skrondal & Pickles 2005, Rabe-Hesketh & Skrondal 2008)
• Exploits commonalities between latent and mixed models
• Adds GLM-like links and family functions to them
• Allows heterogeneous response (different exponential family members)
• Allows multiple levels
• Maximum likelihood via numeric integration of random effects and latent variables (Gauss-Hermite quadrature, adaptive quadrature); hence one of the most computationally demanding packages ever
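Integration over a normally distributed latent variable by quadrature can be sketched with a three-point Gauss-Hermite rule, rescaled to the standard normal density. This is a toy illustration of the idea, not gllamm's implementation; the function names and the one-indicator model are assumptions made for the example.

```python
import math

# Three-point Gauss-Hermite rule rescaled to the standard normal density:
# nodes 0 and +/- sqrt(3), weights 2/3 and 1/6.
NODES = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
WEIGHTS = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]


def normal_expectation(g):
    """Approximate E[g(xi)] for xi ~ N(0, 1) by a weighted sum over the nodes."""
    return sum(w * g(x) for w, x in zip(WEIGHTS, NODES))


def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def marginal_density(y, lam, theta):
    """Approximate the integral of f(y | xi) over xi ~ N(0, 1),
    where y | xi ~ N(lam * xi, theta) (hypothetical one-indicator model)."""
    return normal_expectation(lambda xi: normal_pdf(y, lam * xi, theta))


second_moment = normal_expectation(lambda x: x ** 2)
fourth_moment = normal_expectation(lambda x: x ** 4)
density_at_zero = marginal_density(0.0, lam=1.0, theta=1.0)
```

Three points integrate polynomials up to degree five exactly, but a density as integrand is only approximated. Each additional point and each additional latent dimension multiplies the number of function evaluations, which is where the computational cost comes from.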
gllamm
• One line of data per dependent variable × unit
• Requires reshape long transformation of indicators for latent variable models
• Measurement model: eq() option
• Structural model: geq() bmatrix() options
• Families and links: family() fv() link() lv()
• Tricks that Stas commonly uses:
  • make sure the model is correctly specified: trace noest options
  • good starting values speed up convergence: from() option
  • number of integration points gives tradeoff between speed and accuracy: nip() option
  • get an idea about the speed: dot option
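The "one line of data per dependent variable × unit" layout can be pictured with a small language-neutral sketch. It mimics what a wide-to-long reshape does to the indicators; it is not Stata syntax, and the variable names (id, y1, y2, y3, item) are hypothetical.

```python
def reshape_long(rows, id_key, indicators):
    """Turn one row per unit into one row per (unit, indicator) pair."""
    long_rows = []
    for row in rows:
        for k, name in enumerate(indicators, start=1):
            # "item" records which indicator the response came from.
            long_rows.append({id_key: row[id_key], "item": k, "y": row[name]})
    return long_rows


# Hypothetical wide data: two units, three indicators each.
wide = [
    {"id": 1, "y1": 3, "y2": 4, "y3": 5},
    {"id": 2, "y1": 1, "y2": 0, "y3": 2},
]
long_data = reshape_long(wide, "id", ["y1", "y2", "y3"])
```

After the reshape, all responses sit in a single column and the item index identifies the indicator, so indicator-specific intercepts and loadings can be expressed through item dummies.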