Mixtures of Rasch Models

  1. Mixtures of Rasch Models
     Hannah Frick, Friedrich Leisch, Achim Zeileis, Carolin Strobl
     http://www.uibk.ac.at/statistics/

     Introduction
     - Rasch model for measuring latent traits
     - Model assumption: item parameter estimates do not depend on the person sample
     - Violated in case of differential item functioning (DIF)
     - Several approaches to test for DIF:
       - LR tests, Wald tests
       - Rasch trees
       - Mixture models
     - Here: two versions of the mixture model approach

     Rasch Model
     - Probability for person $i$ to solve item $j$:
       $$P(Y_{ij} = y_{ij} \mid \theta_i, \beta_j) = \frac{e^{y_{ij}(\theta_i - \beta_j)}}{1 + e^{\theta_i - \beta_j}}$$
     - $y_{ij}$: response by person $i$ to item $j$
     - $\theta_i$: ability of person $i$
     - $\beta_j$: difficulty of item $j$

     ML Estimation
     - Factorization of the full likelihood on the basis of the scores $r_i = \sum_{j=1}^{m} y_{ij}$:
       $$L(\theta, \beta) = f(y \mid \theta, \beta) = h(y \mid r, \theta, \beta)\, g(r \mid \theta, \beta) = h(y \mid r, \beta)\, g(r \mid \theta, \beta)$$
     - Joint ML: joint estimation of $\beta$ and $\theta$ is inconsistent
     - Marginal ML: assume a distribution for $\theta$ and integrate it out in $g(r \mid \theta, \beta)$
     - Conditional ML: assume $g(r) = g(r \mid \theta, \beta)$ as given, or that it does not depend on $\theta$, $\beta$ (but potentially on other parameters). Hence, $g(r)$ is a nuisance term and only $h(y \mid r, \beta)$ needs to be maximized.
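The Rasch probability and the conditional likelihood term $h(y \mid r, \beta)$ from the slides above can be illustrated with a small sketch. It relies on the standard result that, for binary items, $h(y \mid r, \beta) = \exp(-\sum_j y_j \beta_j) / \gamma_r(\beta)$, where $\gamma_r$ is the elementary symmetric function of order $r$ in $e^{-\beta_j}$. The function names are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import math
from itertools import product


def rasch_prob(y, theta, beta):
    """P(Y_ij = y | theta_i, beta_j) for a binary response y in {0, 1}."""
    return math.exp(y * (theta - beta)) / (1.0 + math.exp(theta - beta))


def elementary_symmetric(beta):
    """Return [gamma_0, ..., gamma_m], the elementary symmetric functions
    of eps_j = exp(-beta_j), built up item by item (summation algorithm)."""
    eps = [math.exp(-b) for b in beta]
    gamma = [1.0] + [0.0] * len(eps)
    for j, e in enumerate(eps, start=1):
        # Update from high order to low so gamma[r - 1] is still the old value.
        for r in range(j, 0, -1):
            gamma[r] += e * gamma[r - 1]
    return gamma


def cond_loglik_pattern(y, beta):
    """log h(y | r, beta) for one response pattern y with score r = sum(y)."""
    r = sum(y)
    gamma = elementary_symmetric(beta)
    return -sum(yj * bj for yj, bj in zip(y, beta)) - math.log(gamma[r])


# Hypothetical item difficulties, for illustration only.
beta_example = [-1.0, 0.0, 1.0]

# Given the score r = 1, the conditional probabilities of all patterns
# with that score add up to one; the ability theta does not appear at all.
patterns_r1 = [y for y in product([0, 1], repeat=3) if sum(y) == 1]
total_r1 = sum(math.exp(cond_loglik_pattern(y, beta_example)) for y in patterns_r1)
```

Because $\theta$ cancels out of $h$, the item parameters can be estimated without specifying anything about the ability distribution, which is the point of the conditional ML approach.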

  2. Mixture Models
     - Mixture models are a tool to model data with unobserved heterogeneity caused by, e.g., (latent) groups
     - Mixture density = $\sum$ weight $\times$ component
     - Weights are a priori probabilities for the components
     - Components are densities or (regression) models

     Mixtures of Rasch Models
     - Mixture of the full likelihoods by Rost (1990):
       $$f(y \mid \pi, \psi, \beta) = \prod_{i=1}^{n} \sum_{k=1}^{K} \pi_k\, \psi_{r_i,k}\, h(y_i \mid r_i, \beta_k) \quad \text{with } \psi_{r_i,k} = g_k(r_i)$$
     - Mixture of the conditional likelihoods:
       $$f(y \mid \pi, \beta) = \prod_{i=1}^{n} \sum_{k=1}^{K} \pi_k\, h(y_i \mid r_i, \beta_k)$$

     Parameter Estimation
     - EM algorithm by Dempster, Laird and Rubin (1977)
     - Group membership is seen as a missing value
     - Optimization is done iteratively by alternating estimation of group membership (E-step) and component densities (M-step)
     - E-step:
       $$\hat{p}_{ik} = \frac{\hat{\pi}_k\, h(y_i \mid r_i, \hat{\beta}_k)}{\sum_{g=1}^{K} \hat{\pi}_g\, h(y_i \mid r_i, \hat{\beta}_g)}$$
     - M-step, for each component separately:
       $$\hat{\beta}_k = \operatorname*{argmax}_{\beta_k} \sum_{i=1}^{n} \hat{p}_{ik} \log h(y_i \mid r_i, \beta_k)$$

     Number of Components
     - How can the number of components $K$ be established?
     - A priori known number of groups in the data
     - LR test: regularity conditions are not fulfilled
       → Distribution under $H_0$ unknown
       → Bootstrap necessary
     - Information criteria: AIC, BIC, ICL
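The E-step and the information criteria named on this slide can be sketched as follows. The sketch takes a matrix `log_h` with entries $\log h(y_i \mid r_i, \hat{\beta}_k)$ as given and leaves the M-step optimization over $\beta_k$ to whatever conditional ML routine is available. Several parts are assumptions rather than content of the slides: the function names are hypothetical, the weight update (mean posterior per component) is the usual EM update for mixture weights, the number of free parameters is passed in by the caller, and the ICL is computed with one common approximation (BIC plus twice the entropy of the posteriors). The slides do not say which ICL variant was used.

```python
import math


def e_step(log_h, pi):
    """Posterior class probabilities p[i][k], proportional to
    pi[k] * exp(log_h[i][k]). Weights in pi are assumed strictly positive.
    Uses the log-sum-exp trick for numerical stability."""
    post = []
    for row in log_h:
        a = [math.log(p) + lh for p, lh in zip(pi, row)]
        m = max(a)
        w = [math.exp(v - m) for v in a]
        s = sum(w)
        post.append([v / s for v in w])
    return post


def update_weights(post):
    """Standard EM update for the mixture weights: mean posterior per component."""
    n = len(post)
    n_comp = len(post[0])
    return [sum(row[k] for row in post) / n for k in range(n_comp)]


def mixture_loglik(log_h, pi):
    """Log of the mixture of conditional likelihoods, summed over persons."""
    total = 0.0
    for row in log_h:
        a = [math.log(p) + lh for p, lh in zip(pi, row)]
        m = max(a)
        total += m + math.log(sum(math.exp(v - m) for v in a))
    return total


def information_criteria(loglik, n_params, n_obs, post):
    """AIC, BIC and an entropy-based ICL approximation (smaller is better)."""
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(n_obs)
    entropy = -sum(p * math.log(p) for row in post for p in row if p > 0.0)
    icl = bic + 2.0 * entropy
    return aic, bic, icl
```

With this definition the ICL equals the BIC when every person is assigned to one class with certainty and grows as the classes overlap, so it penalizes poorly separated components in addition to model size.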

  3. Simulation Design
     - 10 items, 1800 people, equal group sizes
     - Latent groups in item and/or person parameters:
       - A: $\beta_1 = \beta_2$, $\theta_1 = \theta_2$
       - B: $\beta_1 \neq \beta_2$, $\theta_1 = \theta_2$
       - C: $\beta_1 \neq \beta_2$, $\theta_1 \neq \theta_2$

     Item Parameters
     - A: one latent class (no DIF)
     - B/C: two latent classes (DIF)
     [Figure: two panels of item difficulty against item number (items 1 to 10); the right panel shows separate profiles for $\beta_1$ and $\beta_2$]

     Person Parameters
     - A/B: $\theta_1 = \theta_2$
     - C: $\theta_1 \neq \theta_2$
     [Figure: two panels of density against ability; the right panel shows separate densities for $\theta_1$ and $\theta_2$]

     Criteria for Goodness of Fit
     - Number of components
     - Rand index: agreement between true and estimated partition
     - Mean residual sum of squares: agreement between true and estimated (item) parameter vector
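A minimal sketch of how data for a setting like scenario B could be generated and how the Rand index compares two partitions. Only the design numbers (10 items, 1800 people, two equally sized groups) come from the slide. The item difficulties, the standard normal abilities, the seed, and all names below are hypothetical choices made for illustration; the actual parameter values appear only in the figures.

```python
import math
import random


def simulate_person(theta, beta, rng):
    """Draw one binary response vector under the Rasch model."""
    return [1 if rng.random() < 1.0 / (1.0 + math.exp(-(theta - b))) else 0
            for b in beta]


def rand_index(labels_a, labels_b):
    """Share of person pairs on which two partitions agree, i.e. pairs put
    together in both partitions or put apart in both partitions."""
    n = len(labels_a)
    agree = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_a = labels_a[i] == labels_a[j]
            same_b = labels_b[i] == labels_b[j]
            agree += same_a == same_b
    return agree / (n * (n - 1) / 2)


rng = random.Random(1)  # arbitrary seed, for reproducibility only

# Hypothetical difficulties: evenly spaced in class 1, reversed in class 2 (DIF).
beta_1 = [-2.0 + 4.0 * j / 9 for j in range(10)]
beta_2 = list(reversed(beta_1))

# Two latent classes of 900 persons each, same ability distribution (scenario B).
true_class = [0] * 900 + [1] * 900
data = []
for c in true_class:
    theta = rng.gauss(0.0, 1.0)  # assumed standard normal abilities
    data.append(simulate_person(theta, beta_1 if c == 0 else beta_2, rng))
```

The Rand index only looks at pairs of persons, so it is unaffected by how the estimated classes happen to be numbered: `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0.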

  4. A: No Latent Classes (No DIF)
     [Figure: bar chart by criterion (AIC, BIC, ICL) against number of components (1 to 3)]

     B: Two Latent Classes (DIF)
     [Figure: bar chart by criterion (AIC, BIC, ICL) against number of components (1 to 3)]

     C: Latent Structure in Item and Person Parameters (DIF + Ability Differences)
     [Figure: bar chart by criterion (AIC, BIC, ICL) against number of components (1 to 4)]

     Rand Index (C): Accuracy of Clustering
     Latent Structure in Item and Person Parameters (DIF + Ability Differences)
     [Figure: box plots of the Rand index for AIC, BIC, and ICL; axis range roughly 0.75 to 0.95]

  5. Log Mean Residual SSQ (C): Accuracy of Item Parameter Estimates
     Latent Structure in Item and Person Parameters (DIF + Ability Differences)
     [Figure: box plots of the log mean residual sum of squares for AIC, BIC, and ICL]

     Summary and Outlook
     - Model suitable for detecting latent classes with DIF
     - Model also suitable when a latent structure in the person parameters is present
     - AIC tends to overestimate the correct number of classes; BIC and ICL work well
     - Clustering of the observations works well
     - Estimation of the item parameters in the components works reasonably well
     - Comparison with Rost's MRM to follow

     Literature
     - Arthur Dempster, Nan Laird, and Donald Rubin. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society. Series B, 39(1): 1–38, 1977.
     - Bettina Grün and Friedrich Leisch. Flexmix Version 2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters. Journal of Statistical Software, 28(4): 1–35, 2008.
     - Georg Rasch. Probabilistic Models for Some Intelligence and Attainment Tests. The University of Chicago Press, 1960.
     - Jürgen Rost. Rasch Models in Latent Classes: An Integration of Two Approaches to Item Analysis. Applied Psychological Measurement, 14(3): 271–282, 1990.
     - Carolin Strobl. Das Rasch-Modell - Eine verständliche Einführung für Studium und Praxis. Rainer Hampp Verlag, 2010.
