Hierarchical Linear Modelling


Expectation: After completing the workshop you will be able to: understand the data structure for multilevel data analysis; develop the appropriate models to


  1. Steps: (1) Clarifying the research questions; (2) Choosing an appropriate parameter estimator; (3) Assessing the need for MLM; (4) Building the level-1 model; (5) Building the level-2 model; (6) Multilevel effect size reporting; (7) Likelihood ratio model testing

  2. Steps in Running HLM Analysis. Three models are typically run: (1) Fully unconditional model – No independent variables are specified – Used to determine if there is sufficient variance among groups to justify using HLM (intraclass correlation); (2) Partially conditional model – Predictors are added at level 1; (3) Fully conditional model – Level 2 (and 3) predictors are modeled on the intercept and/or slopes to determine their effects on the outcome measure or on relationships between predictors and outcome

  3. Time for some Algebra! ■ You must learn some of the basic mathematical notation used in multilevel modeling. – As we will see, the program HLM uses this notation to express the models that you estimate. – Understanding these basic symbols and expressions will allow you to tackle more complex analyses, and understand other researchers’ more complex analyses.

  4. A level-1 model: multiple students in one school (familiar OLS equation): $Y_i = \beta_0 + \beta_1 X_i + r_i$, where $Y_i$ is the student’s math achievement score, $\beta_0$ is average achievement within the school (intercept), $\beta_1$ is the average effect of SES on achievement (slope), $X_i$ is the student’s standardized SES (independent variable), and $r_i$ is the unique effect for student i (error term). ■ The student is viewed as having average achievement in the school, plus a deviation due to SES, plus a positive or negative deviation due to the unique circumstances of the student.

  5. A level-1 model: multiple students in multiple schools: $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij}$, where $Y_{ij}$ is the achievement of student i in school j, $\beta_{0j}$ is average achievement within school j, $\beta_{1j}$ is the average effect of SES on achievement for school j, $X_{ij}$ is the standardized SES of student i in school j, and $r_{ij}$ is the unique effect for student i in school j. ■ Now we are estimating the equation from before for each school. Each school can have a different average achievement (or intercept), and a different impact of SES on achievement (or slope).

  6. Need to make some additional assumptions about the coefficients, because they vary: $r_{ij} \sim N(0, \sigma^2)$; $E(\beta_{0j}) = \gamma_{00}$, $\mathrm{Var}(\beta_{0j}) = \tau_{00}$; $E(\beta_{1j}) = \gamma_{10}$, $\mathrm{Var}(\beta_{1j}) = \tau_{11}$; $\mathrm{Cov}(\beta_{0j}, \beta_{1j}) = \tau_{01}$. ■ Student-level errors are normally distributed. ■ Gammas: we expect the average achievement for school j to be equal to the average school mean across all schools, and the slope of SES for school j to equal the average of the slopes across all schools. ■ Taus: these are the variances of the intercepts and slopes, and the covariance between them.

  7. Level-2 model: explaining the Level-1 coefficients ■ Since our intercepts and slopes vary by school, we can now model why they vary. ■ Suppose we hypothesize that levels of achievement and impact of SES are related to whether a school is public or Catholic. ■ We need equations for the intercept and slope to describe our hypothesis: $\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}$ (intercept) and $\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}$ (slope coefficient), where $\beta_{0j}$ is average achievement within school j, $\beta_{1j}$ is the average effect of SES on achievement for school j, and $W_j$ indicates whether school j is Catholic.

  8. Level-2 model (continued)

  9. So math achievement of an individual student in school j is explained by: mean achievement in public schools, plus the impact of a school being Catholic on mean achievement (if j is Catholic); the effect of SES on achievement, plus the impact of a school being Catholic on how SES affects achievement (again, if j is Catholic); and student- and school-specific error terms.

  10. Multilevel Regression Model. Some examples from multilevel regression modeling: Lowest (individual) level: ■ $Y_{ij} = b_{0j} + b_{1j} X_{ij} + e_{ij}$ and at the second (group) level: ■ $b_{0j} = g_{00} + g_{01} Z_j + u_{0j}$ ■ $b_{1j} = g_{10} + g_{11} Z_j + u_{1j}$ Combining: ■ $Y_{ij} = g_{00} + g_{10} X_{ij} + g_{01} Z_j + g_{11} Z_j X_{ij} + u_{1j} X_{ij} + u_{0j} + e_{ij}$
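The substitution that produces the combined equation can be checked numerically. The sketch below is only an illustration: the function names and the randomly drawn values are assumptions, not part of the workshop material. It evaluates the model level by level and as a single equation, then confirms the two agree.

```python
import random

def two_level(g00, g01, g10, g11, x, z, u0, u1, e):
    """Evaluate the model level by level."""
    b0 = g00 + g01 * z + u0   # level-2 equation for the intercept
    b1 = g10 + g11 * z + u1   # level-2 equation for the slope
    return b0 + b1 * x + e    # level-1 equation

def combined(g00, g01, g10, g11, x, z, u0, u1, e):
    """Evaluate the single combined (mixed) equation."""
    return g00 + g10 * x + g01 * z + g11 * z * x + u1 * x + u0 + e

# Arbitrary (hypothetical) values for every term in the model.
rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    args = [rng.uniform(-3, 3) for _ in range(9)]
    max_gap = max(max_gap, abs(two_level(*args) - combined(*args)))

print(max_gap)  # differs from zero only by floating-point rounding
```

The cross-level interaction term, g11 * z * x, appears only because the slope equation contains a level-2 predictor.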

  11. Hands-on Session with HLM Software: HLM 7 Student Version

  12. Starting HLM • Prepare data; • Identify variables to be included in the model; • Develop hypothesized model; and • Install HLM program

  13. HSB DATA. Our data file is a subsample from the 1982 High School and Beyond Survey and is used extensively in Hierarchical Linear Models by Raudenbush and Bryk. The data file, called hsb, consists of 7185 students nested in 160 schools. The outcome variable of interest is the student-level (level 1) math achievement score (mathach). The variable ses is the socio-economic status of a student and therefore is at the student level. The variable meanses is the school mean of ses and therefore is at the school level (level 2). The variable sector is an indicator variable indicating whether a school is public or Catholic and is therefore a school-level variable. There are 90 public schools (sector=0) and 70 Catholic schools (sector=1) in the sample.

  14. Example ■ Using HSB data ■ Questions: (1) Is multilevel modeling needed for mathematics achievement scores? (2) Is there a relationship between SES and student-level mathematics achievement scores? (3) Does the effect of SES on mathematics achievement scores vary significantly across schools? (4) Is the effect of SES on mathematics achievement moderated by MEANSES and SECTOR?

  15. Inform HLM of the Input and Make MDM File

  16. Inform HLM with the data and analysis command

  17. Steps (screenshot with numbered callouts 1 to 10)

  18. Choose Variables for Level_1. Data file: HSB1.sav

  19. Choose Variables for Level_2. Data file: HSB2.sav

  20. Specify the Model (screenshot with numbered callouts 1 and 2)

  21. NULL/UNCONDITIONAL MODEL. Also known as Random Effect Model

  22. Fully Unconditional Model ■ The fully unconditional model is run with no predictors to determine if a significant portion of the variance in achievement is between schools, which would indicate that HLM should be used to analyze these data.

  23. Purpose of Null Model ■ It is used as the baseline model to compare the results of more elaborate models, ■ It can estimate the grand mean of mathematics achievement ($\gamma_{00}$) with adjustment for clustering of students within schools and for different sample sizes across schools, ■ It can estimate variance components at the student level ($\sigma^2$) and the school level ($\tau_{00}$).

  24. Null/Unconditional Model ■ The null model is used for two purposes: (1) It is the basis for calculating the intra-class correlation coefficient (ICC), which is the usual test of whether multilevel modelling is needed; and (2) It outputs the deviance statistic (-2LL) and other coefficients used as a baseline for comparing later, more complex models.

  25. Null Model. The level-1 model: $Y_{ij} = \beta_{0j} + r_{ij}$ (1), where $Y_{ij}$ = mathematics achievement for student i in school j; $\beta_{0j}$ = the average mathematics achievement for school j; $r_{ij}$ = error term representing a unique effect associated with student i in school j, assumed to have a normal distribution with a mean of zero and a level-one variance, $\sigma^2$.

  26. Null Model. The level-2 model: $\beta_{0j} = \gamma_{00} + u_{0j}$ (2), where $\gamma_{00}$ = the intercept, representing the grand mean or overall average of mathematics achievement; $u_{0j}$ = the error term, representing a unique effect associated with school j, assumed to have a normal distribution with a mean of zero and a level-two variance, $\tau_{00}$.

  27. Null Model ■ Combining the two equations gives the mixed model: $Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
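To see what the null model implies for data, the following sketch simulates scores from the mixed equation and recovers the variance components with the classical one-way ANOVA estimator. This is an illustration under stated assumptions, not the estimation routine HLM uses: the design is balanced with 45 students per school (a simplification of 7185 students in 160 schools), the function names are hypothetical, and the "true" parameter values are simply the estimates reported later in the slides.

```python
import random

def simulate_null_model(k, m, gamma00, tau00, sigma2, seed=0):
    """Draw Y_ij = gamma00 + u_0j + r_ij for k schools of m students each."""
    rng = random.Random(seed)
    schools = []
    for _ in range(k):
        u0j = rng.gauss(0.0, tau00 ** 0.5)            # school-level effect
        schools.append([gamma00 + u0j + rng.gauss(0.0, sigma2 ** 0.5)
                        for _ in range(m)])            # student-level errors
    return schools

def anova_variance_components(schools):
    """One-way random-effects ANOVA estimates for a balanced design."""
    k, m = len(schools), len(schools[0])
    means = [sum(s) / m for s in schools]
    grand = sum(means) / k
    msb = m * sum((mj - grand) ** 2 for mj in means) / (k - 1)
    msw = sum((y - mj) ** 2 for s, mj in zip(schools, means) for y in s) / (k * (m - 1))
    tau00_hat = max((msb - msw) / m, 0.0)              # between-school variance
    return grand, tau00_hat, msw                       # msw estimates sigma^2

data = simulate_null_model(k=160, m=45, gamma00=12.64, tau00=8.61, sigma2=39.15)
grand_hat, tau_hat, sigma2_hat = anova_variance_components(data)
icc_hat = tau_hat / (tau_hat + sigma2_hat)
print(round(grand_hat, 2), round(tau_hat, 2), round(sigma2_hat, 2), round(icc_hat, 3))
```

With unbalanced data, as in the real HSB file, the estimates come from maximum likelihood rather than these closed-form expressions.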

  28. Null or Unconditional Model. The level-1 outcome, $Y_{ij}$, is a function of a random intercept term, expressed as $\beta_{0j}$ in output, and a level-1 residual error term ($r_{ij}$). The level-1 intercept, in turn, is a function of the grand mean ($\gamma_{00}$) across level-2 units, which are schools in this example, plus a random error term ($u_{0j}$), signifying that the intercept is modelled as a random effect. Substituting the right-hand side of the level-2 equation into the level-1 equation gives the mixed model equation for the null random intercept model.

  29. Annotated Results 1 ■ Reliability in HLM ≠ ordinary reliability ■ Reliability for the intercept in HLM indicates to what extent the intercept measures can discriminate among schools in their average achievement. ■ Low reliability does not mean lack of precision.

  30. Annotated Results 2 ■ The reliability of the random effect of the level-1 intercept is the average reliability of the level-2 units. ■ It measures the overall reliability of the OLS estimates for each of the intercepts. The reliability estimate for this model is .901. ■ This indicates that the sample means tend to be quite reliable as indicators of the true school means.

  31. Annotated Results 3 ■ In the final estimation of fixed effects, the intercept is 12.64 (SE = .24) and differs from zero. This value is the grand mean of mathematics achievement. ■ A 95% confidence interval for the grand mean, based on its standard error, is 12.64 ± 1.96*(0.24) = (12.17, 13.11). This interval describes the precision of the grand-mean estimate; it is not the plausible values range for the school means, which is based on the between-school variance.

  32. Annotated Results 4 ■ The estimated between variance, $\tau_{00}$, corresponds to the term intercept1 in the output of the final estimation of variance components, and the estimated within variance, $\sigma^2$, corresponds to the term level-1 in the same output section. For this model, $\tau_{00}$ is 8.61 and $\sigma^2$ is 39.15.

  33. Annotated Results 5: Final estimation of variance. At the school level, $\tau_{00}$ is the variance of the true school means, $\beta_{0j}$, around the grand mean, $\gamma_{00}$. The estimated variability in these school means is $\hat{\tau}_{00} = 8.61$. To measure the magnitude of the variation among schools in their mean achievement levels, we can calculate the plausible values range for these means. Under the normality assumption, we would expect 95% of the school means to fall within the range $\hat{\gamma}_{00} \pm 1.96(\hat{\tau}_{00})^{1/2}$, which yields $12.64 \pm 1.96(8.61)^{1/2} = (6.89, 18.39)$. This indicates a substantial range in average achievement levels among schools in the sample data.
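The 95% plausible values range for the school means is simple arithmetic on the grand mean and the between-school variance. A minimal sketch follows; the function name is an assumption made for illustration, and the inputs are the estimates quoted in the slides.

```python
import math

def plausible_value_range(mean, variance, z=1.96):
    """Range expected to contain about 95% of normally distributed
    school-level coefficients with the given mean and variance."""
    half_width = z * math.sqrt(variance)
    return mean - half_width, mean + half_width

# Grand mean 12.64 and between-school variance 8.61 from the null model.
low, high = plausible_value_range(12.64, 8.61)
print(round(low, 2), round(high, 2))  # 6.89 18.39
```

The same function applies to any random coefficient, for example a slope with its own mean and variance.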

  34. Annotated Results 6 ■ Statistically significant between-school variance (variance at school level) indicates that school average mathematics achievement varies significantly across schools.

  35. Effective sample size ■ A higher ICC value indicates greater dependence among observations within schools – Effective sample size is smaller than observed sample size ■ Effective n = mk / (1 + ICC*(m-1)) – where n = sample size, m = number of students per school and k = number of schools ■ If ICC=1, effective n is equal to the number of schools (k) ■ If ICC=0, effective n is equal to the observed n (i.e., mk) ■ In general, effective n lies between k and mk
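The effective sample size formula and its two boundary cases can be checked directly. The sketch below assumes a balanced design; the function name and the example sizes (50 schools of 10 students) are hypothetical.

```python
def effective_sample_size(m, k, icc):
    """Effective n = mk / (1 + ICC * (m - 1)) for k groups of m members."""
    return m * k / (1 + icc * (m - 1))

# Hypothetical design: 50 schools with 10 students each.
print(effective_sample_size(10, 50, 1.0))  # complete dependence: number of schools
print(effective_sample_size(10, 50, 0.0))  # independence: observed n = mk
print(effective_sample_size(10, 50, 0.18)) # somewhere between k and mk
```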

  36. Calculating ICC ■ Based on the covariance estimates, we can compute the intra-class correlation: 8.61431/(8.61431 + 39.14831) = .18. ■ This tells us the portion of the total variance that occurs between schools.
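The ICC calculation above can be reproduced in a few lines. The function name is an assumption made for illustration; the two variance components are the ones quoted in the slide.

```python
def intraclass_correlation(tau00, sigma2):
    """Share of total variance that lies between groups."""
    return tau00 / (tau00 + sigma2)

icc = intraclass_correlation(8.61431, 39.14831)
print(round(icc, 2))  # 0.18
```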

  37. Calculating the Intra-class Correlation Coefficient (ICC)

  38. ADDING VARIABLE AT THE STUDENT LEVEL. Random Coefficient Model

  39. Adding Predictors at Level_1

  40. Notes on the Results 1 ■ The model we fit was $\text{mathach}_{ij} = \beta_{0j} + \beta_{1j}(\text{SES} - \text{meanses}) + r_{ij}$, $\beta_{0j} = \gamma_{00} + u_{0j}$, $\beta_{1j} = \gamma_{10} + u_{1j}$ ■ Filling in the parameter estimates we get $\text{mathach}_{ij} = \beta_{0j} + \beta_{1j}(\text{SES} - \text{meanses}) + r_{ij}$, $\beta_{0j} = 12.64 + u_{0j}$, $\beta_{1j} = 2.19 + u_{1j}$, $V(u_{0j}) = 8.68$, $V(u_{1j}) = .68$, $V(r_{ij}) = 36.7$ ■ In a single equation our model would be written as: $\text{mathach}_{ij} = \gamma_{00} + u_{0j} + (\gamma_{10} + u_{1j})(\text{SES} - \text{meanses}) + r_{ij} = \gamma_{00} + \gamma_{10}(\text{SES} - \text{meanses}) + u_{0j} + u_{1j}(\text{SES} - \text{meanses}) + r_{ij}$

  41. Notes on the Results 2 ■ The estimate for the variance of the slope for group-centered SES is 0.68. The p-value is .003. Because the test is statistically significant, we reject the hypothesis that there is no difference in slopes among schools. ■ The 95% plausible value range for the school means is 12.64 ± 1.96*(8.68)^(1/2) = (6.87, 18.41). ■ The 95% plausible value range for the SES-achievement slope is 2.19 ± 1.96*(.68)^(1/2) = (.57, 3.81).

  42. Notes on the Results 3 ■ The coefficient for the constant is the predicted math achievement when all predictors are 0; hence, when group-centered SES is 0 (a student whose SES equals the school mean), math achievement is predicted to be 12.64.

  43. Notes on the Results 4 ■ Notice that the residual variance is now 36.70, compared to the residual variance of 39.15 in the one-way ANOVA with random effects (unconditional means) model. ■ We can compute the proportion of variance explained at level 1 as (39.15 - 36.70) / 39.15 = .063. This suggests that using student-level SES as a predictor of math achievement reduced the within-school variance by 6.3%. ■ The correlation between the intercept and the slope is .019. It seems that they are not highly correlated.
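The proportion of variance explained compares a variance component before and after predictors are added. A minimal sketch, with a hypothetical function name and the level-1 variances quoted above:

```python
def proportion_variance_explained(var_baseline, var_conditional):
    """Proportional reduction in a variance component relative to a baseline model."""
    return (var_baseline - var_conditional) / var_baseline

# Level-1 residual variance: 39.15 in the null model, 36.70 after adding SES.
r2_level1 = proportion_variance_explained(39.15, 36.70)
print(round(r2_level1, 3))  # 0.063
```

The same function can be applied to a level-2 variance component, using the level-2 variance from the appropriate baseline model.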

  44. Calculating Proportion of Variance Explained

  45. THE INTERCEPT AND SLOPE AS THE OUTCOME MODEL. Final Model

  46. This model is referred to as an intercepts- and slopes-as-outcomes model

  47. Research Questions ■ Do MEANSES and SECTOR significantly predict the intercept? ■ Do MEANSES and SECTOR significantly predict the within-school slope? ■ How much variation in the intercepts and slopes is explained using SECTOR and MEANSES as predictors?

  48. Interpreting the Final Results. For intercept: ■ MEANSES is positively related to school mean math achievement. ■ Catholic schools have higher mean achievement than do public schools, after controlling for the effect of MEANSES.

  49. Interpreting the Final Results. For slope: ■ Schools with higher MEANSES have larger SES slopes than schools with low MEANSES. ■ Catholic schools have significantly weaker SES slopes on average than do public schools.

  50. Reporting in Table

  51. Final Model ■ The estimate for the variance of the SES slope is .15 with p-value .369; hence, we fail to reject the null hypothesis that no variation among the SES slopes remains unexplained after controlling for the MEANSES and SECTOR effects. ■ The correlation between the level-1 intercept and the slope for SES is given as .32 in the earlier part of the output. ■ There is variation that remains unexplained even after controlling for the MEANSES and SECTOR effects.

  52. Assessing Model Fit ■ Using deviance statistics ■ Using proportion of variance explained ■ Using other indicators: AIC and BIC
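AIC and BIC can be computed from the deviance (-2LL) that the output reports. The sketch below uses the common definitions AIC = deviance + 2p and BIC = deviance + p ln(n). Everything numeric in it is hypothetical: the deviance values and parameter counts are placeholders rather than results from the HSB analysis, and the choice of n for BIC (students here, though some authors use the number of groups) is an assumption.

```python
import math

def aic(deviance, n_params):
    """Akaike information criterion from the deviance (-2 log-likelihood)."""
    return deviance + 2 * n_params

def bic(deviance, n_params, n_obs):
    """Bayesian information criterion; n_obs is taken as the level-1 sample size."""
    return deviance + n_params * math.log(n_obs)

# Placeholder values for two nested models fitted with full maximum likelihood.
baseline = {"deviance": 47000.0, "n_params": 3}
extended = {"deviance": 46700.0, "n_params": 6}
n_students = 7185

for name, fit in (("baseline", baseline), ("extended", extended)):
    print(name,
          aic(fit["deviance"], fit["n_params"]),
          round(bic(fit["deviance"], fit["n_params"], n_students), 1))

# The deviance difference is the likelihood ratio statistic, referred to a
# chi-square distribution with df equal to the difference in parameter counts.
lr_stat = baseline["deviance"] - extended["deviance"]
lr_df = extended["n_params"] - baseline["n_params"]
print(lr_stat, lr_df)
```

Lower AIC or BIC indicates the preferred model.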

  53. Estimation Specification ■ REML (restricted maximum likelihood) versus FML (full maximum likelihood) – REML and FML will usually produce similar results for the level-1 residual ($\sigma^2$), but there can be noticeable differences for the variance-covariance matrix of the random effects. – REML is the default estimation method in HLM. – If the number of level-2 units is large, then the difference will be small. – If the number of level-2 units is small, then FML variance estimates will be smaller than REML estimates, leading to artificially short confidence intervals and problematic significance tests. ■ Nested models – If the fixed effects are the same, and there are fewer random effects in the reduced model, then either REML or FML is fine. – If one model has fewer fixed effects and possibly fewer random effects, then use FML to compare models.

  54. SOME PRACTICAL ASPECTS OF MULTILEVEL MODELING

  55. Questions to Answer ■ Can you use multilevel techniques to study your dependent variable? ■ Should you use multilevel techniques to study your dependent variable? ■ How will you center your level-1 and level-2 predictors? ■ Which of the level-1 coefficients will be explained at level-2? I.e., are they fixed or random? ■ How does my model perform?

  56. Can I use HLM? ■ HLM requires a large amount of data. ■ Minimum: – number of groups: 30, but most recommend 50+ – number of individuals within groups: 5-10, but can be as low as 1 – average group size: 10; more is better.

  57. Should I Use HLM? ■ How much of the variance in your dependent variable is explained by group membership? ■ Intra-class correlation coefficient (ICC) = var between groups / (var between groups + var within groups): $\rho = \tau_{00} / (\tau_{00} + \sigma^2)$ ■ Remember, $\tau_{00}$ is the variance of the intercepts, or the school means, and $\sigma^2$ is the student-level variance.

  58. Centering variables ■ Whether and how you center is a very important decision: interpretation of results depends on your choice. ■ Important because the intercept at level-1 is also a dependent variable. ■ Centering – Refers to subtracting a mean from your independent variables. – The transformed value for an individual measures how much they deviate (+/-) from the mean.

  59. Centering variables ■ Suppose we center verbal SAT scores around a student mean of 500.

              Actual score   Centered score
      Steve       800             300
      Claire      750             250
      Bill        500               0
      Paul        200            -300

      ■ How would we interpret a regression coefficient if all variables were similarly transformed?

  60. Centering variables ■ Why would we want to center? – Variable may lack a natural zero point, such as SAT score. – Stability of estimates at level-1 is affected by the location of variables. – Location at level-2 is less important. ■ Centering in multilevel models presents a unique challenge because different centering choices have a significant impact on how the parameter estimate is interpreted.

  61. Centering variables ■ Generally two types of centering are used in HLM for a specific variable: – Grand mean centering – subtract the mean for the entire sample from each observation in the sample. – Group mean centering – subtract the mean for each group from each member of the group. ■ To fully understand the implications of centering, see the discussion in Raudenbush and Bryk (2002) pp. 134-149.
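The two centering options differ only in which mean is subtracted. The sketch below shows both on a small made-up data set; the function names, the SES values and the school labels are all hypothetical.

```python
def grand_mean_center(values):
    """Subtract the mean of the entire sample from each observation."""
    grand_mean = sum(values) / len(values)
    return [v - grand_mean for v in values]

def group_mean_center(values, groups):
    """Subtract each group's own mean from the members of that group."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

# Hypothetical SES scores for five students in two schools.
ses = [0.5, -0.5, 1.0, 2.0, 3.0]
school = ["A", "A", "B", "B", "B"]

grand_centered = grand_mean_center(ses)
group_centered = group_mean_center(ses, school)
print(grand_centered)
print(group_centered)  # [0.5, -0.5, -1.0, 0.0, 1.0]
```

After group-mean centering, every school's centered scores average zero, so the level-1 intercept becomes the school's own mean on the outcome.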

  62. Grand Mean Centering ■ Grand mean centering is scoring variables as deviations from their sample means. ■ An example would be scoring occupational status as a deviation from the mean occupational status in the entire sample, that is, scoring how high or low people are relative to the average. ■ In multivariate analyses, predictor variables that are grand-mean centered generate mathematically identical predicted values to those from the same model estimated on the original, conventionally scored variables.

  63. Grand Mean Centering ■ Some writers still claim that grand mean centering reduces multicollinearity, particularly when the regression includes many interactions, and most especially when these are cross-level interactions (Bickel 2007; Preacher 2003). ■ Another advantage of grand mean centering is that it allows one to interpret the intercept as the predicted mean on the dependent variable when all the predictors are set to zero (Paccagnella 2006).

  64. Grand Mean Centering ■ It is also sometimes said that grand-mean centering facilitates regression coefficient interpretation, particularly for cross-level interactions when a variable is continuous (Bickel 2007; Kenny et al. 1998; Hox 2010). ■ Hox (2010) reports that convergence tends to be achieved more frequently and analyses run faster using grand-mean centering.

  65. Grand Mean Centering

  66. Group Mean Centering ■ Group mean centering refers to scoring variables in multi-level models as deviations from the mean of their macro-level group. ■ An example would be scoring occupational status as a deviation from the mean status in the respondent’s own country. ■ In group-mean centering, Nigerian clerks would be high (because they are high compared to the average Nigerian) while Swiss clerks would be low (because they are low compared to the average Swiss).

  67. Group Mean Centering ■ Raudenbush and Bryk (2002) also posit that group-mean centering can reduce bias in random component variance estimates. ■ Paccagnella (2006) points to the benefit of group-mean centering for researchers interested in ‘‘separating the between-group and the within-group components from the total variation to investigate how groups (contexts) affect the student performances, explicitly accounting for the group structure into the model’’. ■ Group-mean centering (also known as within-group deviation scoring) is widely used in many disciplines, and widely recommended.

  68. Group Mean Centering
