PS 405 – Week 7 Section: Interactions D.J. Flynn February 25, 2014
Today’s plan ◮ Review: the multiplicative interaction model ◮ Estimating/interpreting interaction models in R ◮ Plotting/visualizing interactive effects in R
Why interactions? In political science, we often make conditional hypotheses (e.g., an increase in X leads to an increase in Y when condition Z is met, but not when condition Z is absent). For example, the abstract from Jerit and Barabas (2012) says: We know little about...how individual-level partisan motivation interacts with the information environment... Using survey data as well as an experiment with diverse subjects, we demonstrate...a selective pattern of learning in which partisans have higher levels of knowledge for facts that confirm their world view... This basic relationship is exaggerated on topics receiving high levels of media coverage.
Quick review of interactions ◮ We use interaction terms (“multiplicative interaction models”) to test conditional hypotheses ◮ By using interactions, we’re testing moderation (NOT mediation ) ◮ Example from lecture: For public school students, higher undergrad GPAs are associated with greater success in grad school – but not for private school students.
The basic interaction model $Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + \epsilon_i$, where $Y$ is the DV, $X$ and $Z$ are regressors, $XZ$ is the interaction, and $\epsilon$ is the residual.
What each term represents $Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + \epsilon_i$ Notice: ◮ When $Z_i = 0$, we have $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$. ◮ Thus, the marginal effect of X when $Z = 0$ is $\partial Y / \partial X = \beta_1$. ◮ When $Z_i = 1$, we have $Y_i = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) X_i + \epsilon_i$. ◮ Thus, the marginal effect of X when $Z = 1$ is $\partial Y / \partial X = \beta_1 + \beta_3$.
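A minimal sketch of this in R (simulated data; the variable names and true coefficient values are invented for illustration):

# Simulate data with a binary moderator z (illustrative values only)
set.seed(42)
n <- 500
x <- rnorm(n)
z <- rbinom(n, 1, 0.5)
y <- 1 + 2*x + 0.5*z + 1.5*x*z + rnorm(n)   # true effect of x: 2 when z = 0, 3.5 when z = 1

fit <- lm(y ~ x*z)
b   <- coef(fit)
b["x"]              # estimated marginal effect of x when z = 0 (beta_1)
b["x"] + b["x:z"]   # estimated marginal effect of x when z = 1 (beta_1 + beta_3)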
Brambor, Clark, and Golder’s (2006) interaction rules¹ 1. Use interaction terms whenever the hypothesis is conditional in nature. 2. Include all constitutive terms in the model. 3. Do NOT interpret the coefficients on constitutive terms as unconditional marginal effects. 4. Do NOT forget to calculate substantively meaningful marginal effects and standard errors. (coming...) ¹ Source: Brambor, Thomas, William Roberts Clark, and Matt Golder. 2006. “Understanding Interaction Models: Improving Empirical Analyses.” Political Analysis 14: 63–82.
Constitutive terms ◮ Constitutive terms are those elements that “constitute” the interaction term: ◮ if you include XZ, you should also include X and Z. ◮ if you include X², you should also include X. ◮ if you include X³, you should also include X and X², and so on. ◮ Always include constitutive terms (even when the effect of X when Z = 0 is not theoretically important).
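As an aside on R formula syntax (a sketch reusing the simulated x, z, and y above; this is general R behavior rather than something specific to these slides): x*z and poly(x, 3) expand to include the constitutive terms automatically, while x:z and I(x^2) on their own do not.

# These formulas include all constitutive terms automatically:
lm(y ~ x*z)          # expands to x + z + x:z
lm(y ~ poly(x, 3))   # includes linear, quadratic, and cubic terms

# These do NOT, and should generally be avoided:
lm(y ~ x:z)          # interaction only; omits x and z
lm(y ~ I(x^2))       # quadratic term only; omits x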
If you omit a constitutive term, then there is potential for omitted variable bias. Suppose the true model is $Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + \epsilon_i$, but you estimate $Y_i = \beta_0 + \beta_1 X_i + \beta_3 X_i Z_i + \epsilon_i$, omitting $Z$. If $\beta_2 \neq 0$ and $Z$ is correlated with an included regressor, then the estimates of $\beta_0$, $\beta_1$, and $\beta_3$ will be biased (and the standard errors wrong).
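A small simulation sketch of this bias (illustrative values; z is built to be correlated with x):

# Omitted-constitutive-term bias: a quick simulation
set.seed(1)
n <- 5000
x <- rnorm(n)
z <- rbinom(n, 1, plogis(x))            # z is correlated with x
y <- 1 + 2*x + 3*z + 1.5*x*z + rnorm(n)

coef(lm(y ~ x*z))       # full model: estimates close to the true 1, 2, 3, 1.5
coef(lm(y ~ x + x:z))   # z omitted: intercept, x, and x:z estimates shift away from the truth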
Standard errors/hypothesis testing ◮ So far we’ve reviewed how to calculate marginal effects. But what about the significance of those effects? ◮ We can’t just look at the significance of the interaction coefficient, because the effect of X on Y could be significant at some levels of Z but not at others. ◮ The easiest case: dummy variable interactions.
◮ Suppose we have the standard interaction model with a binary Z: $Y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + \epsilon_i$ ◮ The marginal effect of X when $Z = 0$ is $\beta_1$, and its standard error is just $\sqrt{\mathrm{var}(\hat{\beta}_1)}$. ◮ The marginal effect of X when $Z = 1$ is $\beta_1 + \beta_3$, and its standard error is $\sqrt{\mathrm{var}(\hat{\beta}_1) + Z^2 \mathrm{var}(\hat{\beta}_3) + 2Z\,\mathrm{cov}(\hat{\beta}_1, \hat{\beta}_3)}$, which at $Z = 1$ reduces to $\sqrt{\mathrm{var}(\hat{\beta}_1) + \mathrm{var}(\hat{\beta}_3) + 2\,\mathrm{cov}(\hat{\beta}_1, \hat{\beta}_3)}$.
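In practice we can pull every piece of this formula from the estimated variance-covariance matrix with vcov(). A sketch, continuing with the fit object from the simulation above:

# Marginal effect of x when z = 1, with its standard error
V  <- vcov(fit)
me <- coef(fit)["x"] + coef(fit)["x:z"]
se <- sqrt(V["x", "x"] + V["x:z", "x:z"] + 2 * V["x", "x:z"])
c(estimate = unname(me), std.error = se,
  t = unname(me) / se)   # compare |t| to about 1.96 for a 5% test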
◮ However, if we’re interacting categorical or continuous terms with more than two levels, then things get messier. ◮ Logic: Z could take many different values, so we need to know the significance of the effect of X at all of those values. ◮ To do this, we need marginal effect plots and confidence intervals. ◮ Let’s estimate an interactive model and create a helpful plot...
Interactions in R There are two ways to specify interactions in R. I think this way is best because it automatically includes the constitutive terms for you (the second, more explicit way is shown in the sketch below):

x <- rnorm(100, 10, 2)
z <- rnorm(100, 8, 1)
y <- x*z + rnorm(100, 9, 2)
int.model <- lm(y ~ x*z)
summary(int.model)

Example with real data:

library(car)   # provides the Prestige data
prestige.model <- lm(prestige ~ income*education + type, data=Prestige)
summary(prestige.model)
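For reference, the second way spells out the constitutive terms yourself; it produces an identical fit (a quick check, continuing the simulated example above):

# Equivalent, more explicit specification: x*z is shorthand for x + z + x:z
int.model2 <- lm(y ~ x + z + x:z)
all.equal(coef(int.model), coef(int.model2))   # TRUE: same coefficients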
Let’s interpret...

summary(prestige.model)

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)      -1.780e+01  7.594e+00  -2.344 0.021212 *
income            3.786e-03  9.445e-04   4.008 0.000124 ***
education         5.104e+00  7.766e-01   6.572 2.93e-09 ***
typeprof          5.479e+00  3.714e+00   1.475 0.143574
typewc           -3.584e+00  2.428e+00  -1.476 0.143303
income:education -2.102e-04  6.977e-05  -3.012 0.003347 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
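To make the negative interaction coefficient concrete, we can compute the marginal effect of income at a low and a high education value directly from the coefficients (the values 8 and 14 are chosen only for illustration):

b <- coef(prestige.model)
b["income"] + b["income:education"] * 8    # effect of income at 8 years of education
b["income"] + b["income:education"] * 14   # smaller effect of income at 14 years of education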
Visualizing interactions with the effects package

install.packages("effects")
library(effects)
income.effect <- effect("income*education", prestige.model)
plot(income.effect, as.table=T)
[Figure: income*education effect plot. Predicted prestige (roughly 30 to 90) plotted against income (5,000 to 25,000), with separate panels for different levels of education.]
We could look at the other side of the interaction: the effect of education on prestige at different levels of income: plot(income.effect, x.var="education", as.table=T)
[Figure: income*education effect plot. Predicted prestige (roughly 30 to 90) plotted against education (8 to 14), with separate panels for different levels of income.]
Another example: using the same data, let’s look at the effect of income on prestige, conditional on the type of profession:

prestige.model2 <- lm(prestige ~ income*type + education, data=Prestige)
income.effect <- effect("income*type", prestige.model2)
plot(income.effect, layout=c(3,1))
[Figure: income*type effect plot. Predicted prestige plotted against income (5,000 to 25,000) in three panels: type : bc, type : prof, and type : wc.]
You might have noticed that none of these plots showed a truly continuous-by-continuous interaction surface: the effects package handled the income*education interaction by fixing education at a handful of values. That’s because continuous-by-continuous interactions are very challenging to interpret. You have a few options in these situations:² ◮ collapse one of the continuous variables into categories and treat it as a factor ◮ construct more elaborate marginal effect plots (see the sketch below) ² See Thomas Leeper’s page on this: thomasleeper.com/Rcourse/Tutorials/olsinteractionplots2.html
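Here is one sketch of the second option for the Prestige model: plotting the marginal effect of income across the observed range of education with a 95% confidence band, using the standard error formula from earlier (this is my own illustration, not code from Leeper’s tutorial):

# Marginal effect of income across education, with a 95% confidence band
b  <- coef(prestige.model)
V  <- vcov(prestige.model)
ed <- seq(min(Prestige$education), max(Prestige$education), length.out = 100)
me <- b["income"] + b["income:education"] * ed
se <- sqrt(V["income", "income"] +
           ed^2 * V["income:education", "income:education"] +
           2 * ed * V["income", "income:education"])
plot(ed, me, type = "l", xlab = "Education",
     ylab = "Marginal effect of income on prestige")
lines(ed, me + 1.96 * se, lty = 2)   # upper 95% bound
lines(ed, me - 1.96 * se, lty = 2)   # lower 95% bound
abline(h = 0, col = "gray")          # reference line at zero effect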
Concluding notes on the effects package ◮ One of the more flexible packages you’ll encounter ◮ You can also use it to get the estimated effects at different levels of X, along with their SEs, without any plots: just print income.effect (see the sketch below) ◮ You can also make tons of alterations to the plots (e.g., change from lines to confidence bars, re-label axes just like in standard R plots, etc.) ◮ The official effects manual is here. A more user-friendly intro is here.
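For example (a brief sketch using standard methods for effects objects; strictly speaking these report adjusted predictions with standard errors rather than marginal effects):

# Numeric output from the effects package, no plotting
summary(income.effect)               # fitted values with 95% confidence intervals
head(as.data.frame(income.effect))   # fit, se, lower, upper as a data frame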
A final point: marginal effects vs. predictions Everything we did today is about calculating marginal effects of X on Y (at different levels of Z), or, equivalently, marginal effects of Z on Y (at different levels of X). To get marginal effects, we don’t care about the other regressors in the model. If we were instead interested in computing predicted values for each observation in the dataset ($\hat{Y}_i$), we would care about the coefficients on the other regressors. This is easy: we just solve for the linear prediction, exactly as in non-interactive models.
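A minimal sketch of that last point, using the prestige.model estimated above (the newdata values are hypothetical):

# Predicted values (Y-hat) for the observations used in the fit
head(fitted(prestige.model))

# Or a prediction for one hypothetical profession, with a confidence interval
new <- data.frame(income = 10000, education = 12, type = "prof")
predict(prestige.model, newdata = new, interval = "confidence")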