DataCamp: Building Response Models in R

Model extensions part 1: Dummy variables
Kathrin Gruber, Assistant Professor of Econometrics, Erasmus University Rotterdam

Understanding dummy variables

    aggregate(log(SALES) ~ DISPLAY, FUN = mean, data = sales.data)

      DISPLAY log(SALES)
    1       0   4.194953
    2       1   4.657477
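
The group means above are exactly what a regression on a single dummy recovers. A minimal sketch on simulated data, assuming hypothetical columns `display` and `sales` that stand in for DISPLAY and SALES (the numbers are made up, not taken from sales.data):

```r
# Simulated stand-in for sales.data; all names and values are hypothetical.
set.seed(1)
display <- rep(c(0, 1), times = c(80, 40))
sales <- exp(4.2 + 0.46 * display + rnorm(120, sd = 0.5))
toy <- data.frame(display, sales)

# Mean of log(sales) per display group
group.means <- aggregate(log(sales) ~ display, FUN = mean, data = toy)

# Regression of log(sales) on the dummy
toy.model <- lm(log(sales) ~ display, data = toy)

# Intercept = mean of log(sales) without display;
# slope = difference between the two group means.
mean.diff <- group.means[2, 2] - group.means[1, 2]
all.equal(unname(coef(toy.model)[1]), group.means[1, 2])
all.equal(unname(coef(toy.model)[2]), mean.diff)
```

Both comparisons hold because a model with one 0/1 predictor and an intercept fits each group by its own mean.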

The effect of display on sales

    dummy.model <- lm(log(SALES) ~ DISPLAY, data = sales.data)
    coef(dummy.model)

    (Intercept)     DISPLAY
         4.1950      0.4625

Average unit sales for no-DISPLAY:

    exp(coef(dummy.model)[1])

    (Intercept)
       66.35063

Relative change in unit sales when switching from no-DISPLAY to DISPLAY:

    exp(coef(dummy.model)[2]) - 1

This gives about 0.588, so unit sales are roughly 59% higher with a display. The subtraction has to happen after exponentiating: exp(b) - 1, not exp(b - 1).
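
The back-transformation can be checked by hand from the two group means reported earlier. This is a small arithmetic sketch; the variable names `b0`, `b1`, `baseline`, `with.display`, and `pct.change` are introduced here for illustration only:

```r
# Values copied from the aggregate() output on the slide
b0 <- 4.194953               # mean log(SALES) without display (the intercept)
b1 <- 4.657477 - 4.194953    # DISPLAY coefficient = difference in group means

baseline <- exp(b0)          # unit sales without display, back on the original scale
with.display <- exp(b0 + b1) # unit sales with display
pct.change <- exp(b1) - 1    # relative change implied by the dummy coefficient

# The ratio of the two levels gives the same relative change
all.equal(with.display / baseline - 1, pct.change)
round(pct.change, 3)
```

Because the model is fitted on log(SALES), the dummy acts as a multiplier exp(b1) on the original scale, which is why the intercept cancels out of the relative change.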

The effect of multiple dummies on sales (1)

    aggregate(log(SALES) ~ DISPLAY + COUPON + DISPLAYCOUPON, FUN = mean, data = sales.data)

      DISPLAY COUPON DISPLAYCOUPON log(SALES)
    1       0      0             0   3.797571
    2       1      0             0   4.657477
    3       0      1             0   5.557327
    4       0      0             1   5.946818

The effect of multiple dummies on sales (2)

    dummy.model <- lm(log(SALES) ~ DISPLAY + COUPON + DISPLAYCOUPON, data = sales.data)
    coef(dummy.model)

      (Intercept)       DISPLAY        COUPON DISPLAYCOUPON
           3.7976        0.8599        1.7598        2.1492
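
With mutually exclusive dummies, each coefficient is the gap between that group's mean and the baseline mean (no promotion). The sketch below reproduces the reported coefficients from the four cell means; the vector name `cell.means` and the label `none` are assumptions made for this example:

```r
# Mean log(SALES) per promotion type, copied from the aggregate() output
cell.means <- c(none = 3.797571, DISPLAY = 4.657477,
                COUPON = 5.557327, DISPLAYCOUPON = 5.946818)

# Each dummy coefficient = group mean minus the baseline mean
effects <- round(cell.means[-1] - cell.means["none"], 4)
effects

# On the original scale each promotion multiplies baseline sales by exp(effect)
multipliers <- exp(effects)
multipliers
```

The rounded differences are 0.8599, 1.7598, and 2.1492, matching coef(dummy.model), and the baseline mean 3.797571 is the intercept.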

What about price?

    lm(update(dummy.model, . ~ . + PRICE), data = sales.data)

    Call:
    lm(formula = update(dummy.model, . ~ . + PRICE), data = sales.data)

    Coefficients:
      (Intercept)        DISPLAY         COUPON  DISPLAYCOUPON          PRICE
           3.4310         0.8747         1.7646         2.1630         0.3123
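
The `. ~ . + PRICE` notation in update() means "keep the response and all existing predictors, then add PRICE". A short sketch using the built-in mtcars data as a stand-in, since sales.data is not available here; the model names are hypothetical:

```r
# mtcars ships with R and is used only as a stand-in for sales.data
base.fit <- lm(mpg ~ wt, data = mtcars)

# update() refits the model with the modified formula
via.update <- update(base.fit, . ~ . + hp)

# Writing the full formula out gives the same fit
direct <- lm(mpg ~ wt + hp, data = mtcars)

formula(via.update)
all.equal(coef(via.update), coef(direct))
```

update() already returns a refitted model, so in this sketch it is used on its own rather than wrapped in another lm() call.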

Model extensions part 2: Lagged effects

About lags

Carry-over effect
    Time span between marketing activities and response.
    Evaluation of several time periods by back-shifting.

How to lag?

    library(dplyr)  # lag() here is dplyr::lag(); stats::lag() does not shift the values of a plain vector
    head(cbind(sales.data$PRICE, lag(sales.data$PRICE, n = 1)))

             [,1]     [,2]
    [1,] 1.090000       NA
    [2,] 1.271818 1.090000
    [3,] 1.271818 1.271818
    [4,] 1.271818 1.271818
    [5,] 1.271818 1.271818
    [6,] 1.271818 1.271818
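
Back-shifting can also be done in base R, which makes the mechanics explicit. The helper `lag1()` and the `price` vector below are hypothetical and exist only for this sketch:

```r
# Shift a vector back by n periods, padding the start with NA.
# lag1() is a hypothetical helper, not part of any package.
lag1 <- function(x, n = 1) {
  stopifnot(n >= 0, n <= length(x))
  c(rep(NA, n), x[seq_len(length(x) - n)])
}

price <- c(1.09, 1.27, 1.27, 1.35)  # made-up prices
cbind(price, price.lag = lag1(price))
```

Row t of the lagged column holds the price from period t - 1, so the first row has no predecessor and is NA.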

Adding lagged price effects

    Price.lag <- lag(sales.data$PRICE)
    lag.model <- lm(log(SALES) ~ PRICE + Price.lag, data = sales.data)
    coef(lag.model)

    (Intercept)       PRICE   Price.lag
          3.906      -4.579       4.935
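
A lagged predictor costs one observation, because the first period has no previous value. The sketch below illustrates this on simulated data and stores the lag as a column of the data frame, which is one possible design choice; every name and number here is an assumption:

```r
# Simulated weekly data; names and coefficients are hypothetical.
set.seed(42)
n <- 60
price <- round(runif(n, 1.0, 1.4), 2)
price.lag <- c(NA, price[-n])  # previous period's price

# Response depends on current and lagged price (first value is NA)
log.sales <- 4 - 3 * price + 2 * price.lag + rnorm(n, sd = 0.3)
toy <- data.frame(log.sales, price, price.lag)

toy.lag.model <- lm(log.sales ~ price + price.lag, data = toy)
coef(toy.lag.model)

# lm() silently drops the row with the missing lag
nobs(toy.lag.model)
```

Keeping the lag inside the data frame means the model does not depend on a free-standing vector in the workspace staying aligned with the data.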

More lags

    Coupon.lag <- lag(sales.data$COUPON)
    lm(update(lag.model, . ~ . + COUPON + Coupon.lag), data = sales.data)

    Call:
    lm(formula = update(lag.model, . ~ . + COUPON + Coupon.lag), data = sales.data)

    Coefficients:
    (Intercept)        PRICE    Price.lag       COUPON   Coupon.lag
         3.8327      -4.5050       4.8426       0.9697       0.3840
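
What's the value added?

    # Remaining lagged dummies, built the same way as Price.lag and Coupon.lag
    Display.lag <- lag(sales.data$DISPLAY)
    DisplayCoupon.lag <- lag(sales.data$DISPLAYCOUPON)

    lag.model <- lm(log(SALES) ~ PRICE + Price.lag + DISPLAY + Display.lag +
                      COUPON + Coupon.lag + DISPLAYCOUPON + DisplayCoupon.lag,
                    data = sales.data)

    plot(log(SALES) ~ 1, data = sales.data)
    # The first observation is dropped because its lags are NA, so pad the fitted values
    lines(c(NA, fitted.values(lag.model)) ~ 1)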
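
The manual `c(NA, ...)` padding above can be avoided by fitting with `na.action = na.exclude`, in which case fitted() returns one value per original row. A sketch on simulated data, with all names hypothetical:

```r
# Simulated data with one lagged predictor; names are hypothetical.
set.seed(7)
n <- 50
price <- round(runif(n, 1.0, 1.4), 2)
price.lag <- c(NA, price[-n])
log.sales <- 4 - 3 * price + 2 * price.lag + rnorm(n, sd = 0.3)
toy <- data.frame(log.sales, price, price.lag)

# na.exclude drops incomplete rows for estimation but remembers their position
toy.fit <- lm(log.sales ~ price + price.lag, data = toy, na.action = na.exclude)
fits <- fitted(toy.fit)

length(fits)   # same length as the data
is.na(fits[1]) # the row with the missing lag is padded with NA

# Observed versus fitted over time (only drawn in an interactive session)
if (interactive()) {
  plot(toy$log.sales, xlab = "Period", ylab = "log(sales)")
  lines(fits)
}
```

This keeps fitted values aligned with the original rows even when missing values occur somewhere other than the first period.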

How many extensions are needed?

Summarizing the model

    summary(extended.model)

    Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
    (Intercept)         2.2561     0.5654   3.991 0.000117 ***
    PRICE              -2.6857     0.7921  -3.390 0.000959 ***
    Price.lag           3.9920     0.7959   5.016 1.96e-06 ***
    DISPLAY             0.4570     0.1279   3.572 0.000521 ***
    Display.lag         0.5097     0.1180   4.319 3.36e-05 ***
    COUPON              1.7531     0.1576  11.121  < 2e-16 ***
    Coupon.lag         -0.2098     0.1567  -1.339 0.183344
    DISPLAYCOUPON       2.0087     0.2017   9.960  < 2e-16 ***
    DisplayCoupon.lag   0.4489     0.2112   2.126 0.035695 *

    Residual standard error: 0.5 on 114 degrees of freedom
      (1 observation deleted due to missingness)
    Multiple R-squared:  0.7135,    Adjusted R-squared:  0.6934
    F-statistic:  35.5 on 8 and 114 DF,  p-value: < 2.2e-16

Statistical significance

In summary(extended.model), Coupon.lag is the only predictor that is not significant at the 5% level (p = 0.183). DisplayCoupon.lag is significant at the 5% level only (p = 0.036); all other predictors have p < 0.001.
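
The p-values can be pulled out of the summary as a plain numeric vector instead of being read off the printout. A sketch using the built-in mtcars data as a stand-in for sales.data; the model and its variables are assumptions for illustration:

```r
# mtcars is a stand-in; extended.model itself is not available here
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# coef(summary(...)) returns the coefficient table as a numeric matrix
coef.table <- coef(summary(fit))
colnames(coef.table)

# Column of p-values, named by coefficient
p.values <- coef.table[, "Pr(>|t|)"]

# Predictors that are not significant at the 5% level
names(p.values)[p.values > 0.05]
```

Applied to extended.model, the same indexing would flag the predictors whose p-value exceeds the chosen threshold.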

Dropping predictors

    AIC(extended.model)
    [1] 189.21

    AIC(lm(update(extended.model, . ~ . - Coupon.lag), data = sales.data))
    [1] 189.1284
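
AIC trades goodness of fit against the number of estimated parameters: AIC = -2 * logLik + 2 * k, and the model with the lower value is preferred. The sketch below recomputes it by hand on the built-in mtcars data, used as a stand-in; the model names are hypothetical:

```r
# Stand-in models on mtcars
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
reduced <- update(full, . ~ . - qsec)

# AIC by hand: -2 * log-likelihood + 2 * number of estimated parameters
ll <- logLik(full)
k <- attr(ll, "df")  # regression coefficients plus the error variance
manual.aic <- -2 * as.numeric(ll) + 2 * k
all.equal(manual.aic, AIC(full))

# Compare the two candidate models; lower is better
c(full = AIC(full), reduced = AIC(reduced))
```

Dropping a predictor always lowers the log-likelihood slightly, so the AIC falls only if the saved penalty of 2 outweighs that loss.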

Eliminating predictors

    library(MASS)
    final.model <- stepAIC(extended.model, direction = "backward", trace = FALSE)
    summary(final.model)

    Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
    (Intercept)         2.1887     0.5651   3.873 0.000179 ***
    PRICE              -2.6888     0.7949  -3.383 0.000982 ***
    Price.lag           4.0267     0.7982   5.045 1.71e-06 ***
    DISPLAY             0.4524     0.1283   3.525 0.000609 ***
    Display.lag         0.5447     0.1155   4.717 6.78e-06 ***
    COUPON              1.7635     0.1580  11.161  < 2e-16 ***
    DISPLAYCOUPON       1.9954     0.2021   9.872  < 2e-16 ***
    DisplayCoupon.lag   0.4839     0.2103   2.301 0.023182 *

    Residual standard error: 0.5017 on 115 degrees of freedom
      (1 observation deleted due to missingness)
    Multiple R-squared:  0.709,     Adjusted R-squared:  0.6913
    F-statistic: 40.04 on 7 and 115 DF,  p-value: < 2.2e-16
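
Backward elimination starts from the full model and removes one predictor at a time for as long as doing so lowers the AIC. A self-contained sketch on the built-in mtcars data, used as a stand-in for sales.data; the choice of predictors is arbitrary:

```r
library(MASS)  # provides stepAIC()

# Stand-in full model on mtcars
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Drop predictors one at a time while the AIC keeps improving
reduced <- stepAIC(full, direction = "backward", trace = FALSE)

formula(reduced)                 # predictors that survived
kept <- attr(terms(reduced), "term.labels")
dropped <- setdiff(attr(terms(full), "term.labels"), kept)
dropped                          # predictors that were eliminated

# The selected model is never worse than the starting model in terms of AIC
AIC(reduced) <= AIC(full)
```

Setting `trace = TRUE` prints each elimination step, which helps when checking why a particular predictor was kept or dropped.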