Regression Modeling in R: Case Studies REGRESSION MODELING IN R : CASE STUDIES Mixed E�ect Models Danielle Quinn PhD Candidate, Memorial University
Regression Modeling in R: Case Studies Before you start modeling...
Regression Modeling in R: Case Studies Before you start modeling... orchids site abundance richness humidity tree_age 1 a 11 7 59.5 14 2 a 10 4 70.4 12 3 a 13 4 73.4 9
Regression Modeling in R: Case Studies What is the research question? How does tree age in�uence the number of orchid species present?
Regression Modeling in R: Case Studies Grouped data data belonging to the same group are correlated this violates assumptions about the independence of observations
Regression Modeling in R: Case Studies Fixed e�ects linear_glm <- glm(richness ~ tree_age + site, data = orchids, family = "gaussian") Fixed e�ect : an unknown constant that is estimated from the data
Regression Modeling in R: Case Studies The cost of �xed e�ects Treating site as a �xed e�ect means that we are estimating parameters for each of the eight individual sites linear_glm$coefficients (Intercept) tree_age siteb sitec sited sitee 2.77185153 0.09323541 0.78985312 -2.06398531 -2.21142636 -0.68811751 sitef siteg siteh -1.41014688 -2.17285272 -1.79616157 Need to have su�cient data for each group Adding parameters costs us degrees of freedom
Regression Modeling in R: Case Studies What are we actually trying to do with our data? How does tree age in�uence the number of orchid species present?
Regression Modeling in R: Case Studies Random e�ects Random e�ect : the model aims to estimate the distribution of the e�ect rather than estimate the e�ect itself as a constant Concerned with the wider population rather than the individuals sampled
Regression Modeling in R: Case Studies Random intercept model Estimates a distribution of the random e�ect of site on intercept random = ~1 | site speci�es a random intercept model where site will act as a random e�ect to in�uence the intercept of the linear model
Regression Modeling in R: Case Studies REGRESSION MODELING IN R : CASE STUDIES Time to practice!
Regression Modeling in R: Case Studies REGRESSION MODELING IN R : CASE STUDIES Model Selection and Interpretation Danielle Quinn PhD Candidate, Memorial University
Regression Modeling in R: Case Studies Model comparison How do we know if we need a random intercept model?
Regression Modeling in R: Case Studies Fit methods for comparing models Gaussian GLM: Iteratively Reweighted Least Squares gaussian_glm <- glm(richness ~ tree_age, data = orchids, family = "gaussian") GLS linear model: Generalized Least Squares (GLS) gls_model <- gls(richess ~ tree_age, data = orchids) Mixed e�ects model: Restricted Maximum Likelihood (REML) random_int_model <- lme(richness ~ tree_age, random = ~1 | site, data = orchids, method = "REML")
Regression Modeling in R: Case Studies Applying a likelihood ratio test # Apply likelihood ratio test to compare models anova(gls_model, random_int_model) Model df AIC BIC logLik Test L.Ratio p-val gls_model 1 3 623.9579 633.1456 -308.9789 random_int_model 2 4 577.7961 590.0464 -284.8980 1 vs 2 48.16181 <.00 What does the p-value tell us? Has the AIC value improved?
Regression Modeling in R: Case Studies Correcting the p-value Model df AIC BIC logLik Test L.Ratio p-val gls_model 1 3 623.9579 633.1456 -308.9789 random_int_model 2 4 577.7961 590.0464 -284.8980 1 vs 2 48.16181 <.00 LR <- ((-308.9789) - (-284.8980)) * -2 (1 - pchisq(LR, 1)) * 0.5 1.962319e-12
Regression Modeling in R: Case Studies Calculating variance of random intercept random_int_model Linear mixed-effects model fit by REML Data: orchids Log-restricted-likelihood: -284.898 Fixed: richness ~ tree_age (Intercept) tree_age 1.63093599 0.08444773 Random effects: Formula: ~1 | site (Intercept) Residual StdDev: 1.019076 1.33274 Number of Observations: 160 Number of Groups: 8 Estimated variance of random intercept (d) = 1.019076 ^ 2 = 1.038516
Regression Modeling in R: Case Studies Extracting variance of random intercept VarCorr(random_int_model) site = pdLogChol(1) Variance StdDev (Intercept) 1.038515 1.019076 Residual 1.776197 1.332740 # Extract variance value from position 1,1 VarCorr(random_int_model)[1,1] 1.038515
Regression Modeling in R: Case Studies REGRESSION MODELING IN R : CASE STUDIES Take a closer look at your random intercept model
Regression Modeling in R: Case Studies REGRESSION MODELING IN R : CASE STUDIES Visualizing a random intercept model Danielle Quinn PhD Candidate, Memorial University
Regression Modeling in R: Case Studies Where do we start? ggplot(orchids) + geom_jitter(aes(x = tree_age, y = richness, col = site))
Regression Modeling in R: Case Studies Population level predictions Generated from the �xed component of the model Overall relationship between response and �xed predictor variables
Regression Modeling in R: Case Studies Generating population level predictions pred_df.fixed <- data.frame(tree_age = seq(from = 5, to = 20, length = 10)) # Generate population level predictions pred_df.fixed$predicted <- predict(random_int_model, pred_df.fixed, level = 0) head(pred_df.fixed) tree_age predicted 1 5.000000 2.053175 2 6.666667 2.193921 3 8.333333 2.334667 4 10.000000 2.475413 5 11.666667 2.616159 6 13.333333 2.756906
Regression Modeling in R: Case Studies Visualizing population level predictions ggplot(orchids) + geom_jitter(aes(x = tree_age, y = richness, col = site)) + geom_line(aes(x = tree_age, y = predicted), size = 2, data = pred_df.fixed)
Regression Modeling in R: Case Studies Visualizing population level predictions
Regression Modeling in R: Case Studies Generating random e�ect predictions pred_df.random <- expand.grid(tree_age = seq(from = 5, to = 20, length = 10 site = unique(orchids$site)) # Generate random effect predictions pred_df.random$random <- predict(random_int_model, pred_df.random, level = 1) pred_df.random tree_age site random 1 5.000000 a 3.202164 2 6.666667 a 3.342910 3 8.333333 a 3.483656 ... ... 78 16.66667 h 2.519369 79 18.33333 h 2.660115 80 20.00000 h 2.800861
Regression Modeling in R: Case Studies Visualizing the random e�ect ggplot(orchids) + geom_jitter(aes(x = tree_age, y = richness, col = site)) + geom_line(aes(x = tree_age, y = predicted), size = 2, data = pred_df.fixed) + geom_line(aes(x = tree_age, y = random, col = site), data = pred_df.random)
Regression Modeling in R: Case Studies Visualizing the random e�ect
Regression Modeling in R: Case Studies REGRESSION MODELING IN R : CASE STUDIES Time to practice!
Recommend
More recommend