

  1. BUS41100 Applied Regression Analysis
Week 6: Clustered Data and Panels
Robust Standard Errors, Fixed and Random Effects
Max H. Farrell
The University of Chicago Booth School of Business

  2. Clustering
No more time series. Back to SLR. Our assumptions were:
Y_i = β_0 + β_1 X_i + ε_i,   ε_i iid ∼ N(0, σ²),
which in particular means COV(ε_i, ε_j) = 0 for all i ≠ j.
Clustering allows each observation to have
◮ unknown correlation with a small number of others
◮ . . . in a known pattern.
◮ Examples:
  ◮ Children in classrooms in schools
  ◮ Firms in industries
  ◮ Products made by companies
◮ How much independent information?

  3. The SLR model with clustering
Y_i = β_0 + β_1 X_i + ε_i,   but no longer ε_i iid ∼ N(0, σ²).
Instead:
V[ε_i] = σ_i², and
COV(ε_i, ε_j) = σ_ij  if i ≠ j but in the same cluster,
              = 0     otherwise.
So only standard errors change!
◮ Same slope β_1 for everyone
Cluster methods aim for robustness:
◮ No assumptions about σ_i² and σ_ij
◮ Assume we have many clusters g = 1, . . . , G, each with a small number of observations n_g:  N = Σ_{g=1}^{G} n_g
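
The cluster-robust variance this slide motivates (and which `cluster.vcov` estimates in R later in the deck) is a "sandwich": the usual (X'X)⁻¹ bread around a meat term that sums the score over clusters instead of single observations. As a rough sketch outside the slides' R code (toy data; the function name and layout are my own, in plain Python):

```python
# Sketch of the cluster-robust (sandwich) slope variance for SLR.
# Illustrative only -- not the course's R implementation.
def ols_cluster_se(x, y, cluster):
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    # OLS coefficients via the normal equations
    b1 = (n * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / n
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # bread: (X'X)^{-1} for the design X = [1, x]
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    # meat: sum over clusters g of (X_g' e_g)(X_g' e_g)'
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for g in set(cluster):
        s0 = sum(e for e, c in zip(resid, cluster) if c == g)
        s1 = sum(e * xi for e, xi, c in zip(resid, x, cluster) if c == g)
        for i, si in enumerate((s0, s1)):
            for j, sj in enumerate((s0, s1)):
                meat[i][j] += si * sj
    # sandwich: bread * meat * bread; report the slope's standard error
    mm = lambda a, b: [[sum(a[i][k] * b[k][j] for k in range(2))
                        for j in range(2)] for i in range(2)]
    v = mm(mm(inv, meat), inv)
    return b0, b1, v[1][1] ** 0.5
```

When every cluster contains a single observation this collapses to the heteroskedasticity-robust (White) variance; positively correlated residuals within a cluster inflate the meat, and hence the standard error, without changing the slope estimate.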

  4. Example: Patents and R&D in 1991, by firm.id
> head(D91)
     year sector    rdexp firm.id patents
1449 1991      4 6.287435       1      55
1450 1991      5 5.150736       2      67
1451 1991      2 4.172710       3      55
1452 1991      2 6.127538       4      83
1453 1991     11 4.866621       5       0
1454 1991      5 7.696947       6       4
Are these rows independent? If they were . . .
> D91$newY <- log(D91$patents + 1)
> summary(slr <- lm(newY ~ log(rdexp), data=D91))
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -3.9226     0.7551  -5.195 5.54e-07
log(rdexp)    4.1723     0.4531   9.208  < 2e-16
Residual standard error: 1.451 on 179 degrees of freedom

  5. What happens when errors are correlated?
◮ If ε_i > 0 we expect ε_j > 0 (if σ_ij > 0)
⇒ Both observation i and j are above the line.
[Scatter plot: No. of Patents vs. log(R&D Expenditure)]

  6. We want our inference to be robust to this problem.
> library(multiwayvcov); library(lmtest)
> vcov.slr <- cluster.vcov(slr, D91$sector)
> coeftest(slr, vcov.slr)
t test of coefficients:
            Estimate Std. Error t value  Pr(>|t|)
(Intercept) -3.92263    0.90933 -4.3138 2.649e-05
log(rdexp)   4.17226    0.56036  7.4457 3.920e-12
> summary(slr)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -3.9226     0.7551  -5.195 5.54e-07
log(rdexp)    4.1723     0.4531   9.208  < 2e-16
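
The direction of the change on this slide (same estimates, larger standard errors) has a standard back-of-the-envelope version not shown in the deck: the Moulton design-effect factor, which says naive SEs understate the truth by roughly √(1 + (m − 1)ρ) for m observations per cluster with intra-cluster error correlation ρ. A minimal sketch, with illustrative numbers of my own choosing:

```python
# Moulton design-effect factor: approximate SE inflation from clustering
# with m observations per cluster and intra-cluster correlation rho.
# (Textbook approximation, not output from the lecture's data.)
def moulton_factor(m, rho):
    return (1 + (m - 1) * rho) ** 0.5

# e.g. 10 firms per sector with rho = 0.3 inflates SEs by ~92%
print(round(moulton_factor(10, 0.3), 2))  # 1.92
```

Even modest within-cluster correlation can roughly double a standard error, which is why the slide's clustered SEs are noticeably larger than `summary(slr)`'s.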

  7. Can we just control for clusters? No!
◮ Not different slopes (and intercepts?) for each cluster
. . . we want one slope with the right standard error!
> coeftest(slr, vcov.slr)
            Estimate Std. Error t value  Pr(>|t|)
(Intercept) -3.92263    0.90933 -4.3138 2.649e-05
log(rdexp)   4.17226    0.56036  7.4457 3.920e-12
> slr.dummies <- lm(newY ~ log(rdexp) + as.factor(sector) - 1, data=D91)
> summary(slr.dummies)
                   Estimate Std. Error t value Pr(>|t|)
log(rdexp)           4.5007     0.5145   8.747 2.43e-15
as.factor(sector)1  -5.8800     0.9235  -6.367 1.83e-09
as.factor(sector)2  -3.4714     0.8794  -3.947 0.000117
...

  8. Can we just control for clusters? No!
◮ Not different slopes (and intercepts?) for each cluster
. . . we want one slope with the right standard error!
[Scatter plot: No. of Patents vs. log(R&D Expenditure)]

  9. Panel Data
So far we have seen i.i.d. data and time series data. Panel data combines these:
◮ units i = 1, . . . , n
◮ followed over time periods t = 1, . . . , T
⇒ dependent over time, possibly clustered
More and more datasets are panels, also called longitudinal data:
◮ Tracking consumer decisions
◮ Firm financials over time
◮ Macro data across countries
◮ Students in classrooms over several grades
Distinct from a repeated cross-section:
◮ New units sampled each time ⇒ independent over time

  10. The linear regression model for panel data:
Y_{i,t} = β_1 X_{i,t} + α_i + γ_t + ε_{i,t}
Familiar pieces, just like SLR:
◮ β_1 – the general trend, same as always. (Where's β_0?)
◮ Y_{i,t}, X_{i,t}, ε_{i,t} – outcome, predictor, mean-zero idiosyncratic shock (clustered?)
What's new:
◮ α_i – unit-specific effects. Different people are different!
  ◮ Cars: Camry/Tundra/Sienna. S&P500: Hershey/UPS/Wynn
◮ γ_t – time-specific effects. Different years are different!
  ◮ For now, γ_t = 0. Same concepts/methods.
Just the familiar same slope, different intercepts model! Well, almost . . .

  11. Estimation strategy depends on how we think about α_i
1. α_i = 0  ⇒  Y_{i,t} = β_1 X_{i,t} + ε_{i,t}
   ◮ lm on N = nT observations. Cluster if needed.
2. random effects: cor(α_i, X_{i,t}) = 0
   ◮ Still possible to use lm on N = nT (and cluster on unit) . . .
     Y_{i,t} = β_1 X_{i,t} + ε̃_{i,t},   ε̃_{i,t} = α_i + ε_{i,t}
   ◮ . . . but lots of variance!
3. fixed effects: cor(α_i, X_{i,t}) ≠ 0
   ◮ same slope, but n different intercepts!
     Y_{i,t} = β_1 X_{i,t} + α_i + ε_{i,t}
   ◮ Too many parameters to estimate. The patent data has n = 181.
   ◮ No time-invariant X_{i,t} = X_i.
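
Strategy 3's n intercepts never need to be estimated directly: the "within" transformation demeans Y and X inside each unit, which sweeps out every α_i and leaves a single-slope regression. A minimal sketch of that trick (toy data and function name of my own, in plain Python rather than the course's R):

```python
# The "within" (fixed-effects) slope: demean Y and X by unit, then OLS.
# Equivalent to the dummy-variable regression, without n intercepts.
from collections import defaultdict

def within_slope(ids, x, y):
    # accumulate per-unit sums to form unit means
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for i, xi, yi in zip(ids, x, y):
        s = sums[i]
        s[0] += xi; s[1] += yi; s[2] += 1
    # demean within each unit, killing the unit effect alpha_i
    xd, yd = [], []
    for i, xi, yi in zip(ids, x, y):
        sx, sy, m = sums[i]
        xd.append(xi - sx / m)
        yd.append(yi - sy / m)
    # slope of OLS through the origin on the demeaned data
    num = sum(a * b for a, b in zip(xd, yd))
    den = sum(a * a for a in xd)
    return num / den

# Two units with very different intercepts but a common slope of 3:
ids = [1, 1, 1, 2, 2, 2]
x   = [0, 1, 2, 0, 1, 2]
y   = [10, 13, 16, -5, -2, 1]
print(within_slope(ids, x, y))  # 3.0
```

This is the computation behind the deck's `plm(..., model="within")` call: pooled OLS on the raw data would be pulled around by the intercept differences, while the within slope recovers the common 3 exactly. It also shows why time-invariant regressors drop out: if X_{i,t} = X_i, the demeaned x is identically zero.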

  12. The real patent data is a panel with clustering:
◮ unit is a firm: i = 1, . . . , 181
◮ time is year = 1983, . . . , 1991
◮ clustered by sector?
> table(D$year)
1983 1984 1985 1986 1987 1988 1989 1990 1991
 181  181  181  181  181  181  181  181  181
> table(D$firm.id, D$year)
    1983 1984 1985 1986 1987 1988 1989 1990 1991
  1    1    1    1    1    1    1    1    1    1
  2    1    1    1    1    1    1    1    1    1
  3    1    1    1    1    1    1    1    1    1
  4    1    1    1    1    1    1    1    1    1
  5    1    1    1    1    1    1    1    1    1
...

  13. Estimation in R: using lm or the plm package.
1. α_i = 0
> slr <- lm(newY ~ log(rdexp), data=D)
> plm.pooled <- plm(newY ~ log(rdexp), data=D,
+                   index=c("firm.id", "year"), model="pooling")
2. random effects: cor(α_i, X_{i,t}) = 0
> vcov.model <- cluster.vcov(slr, D$firm.id)
> coeftest(slr, vcov.model)
> plm.random <- plm(newY ~ log(rdexp), data=D,
+                   index=c("firm.id", "year"), model="random")
3. fixed effects: cor(α_i, X_{i,t}) ≠ 0
> many.dummies <- lm(newY ~ log(rdexp) + as.factor(firm.id) - 1, data=D)
> plm.fixed <- plm(newY ~ log(rdexp), data=D,
+                  index=c("firm.id", "year"), model="within")

  14. Choosing between fixed or random effects.
◮ Fixed effects are more general, more realistic: isolate changes due to X vs. due to a specific person.
◮ If the α_i don't matter, then b_RE ≈ b_FE
> phtest(plm.random, plm.fixed)
        Hausman Test
data:  newY ~ log(rdexp)
chisq = 22.162, df = 1, p-value = 2.506e-06
alternative hypothesis: one model is inconsistent
Using year fixed effects (γ_t):
> lm(newY ~ log(rdexp) + as.factor(year) - 1, data=D)
> plm(newY ~ log(rdexp), data=D,
+     index=c("firm.id", "year"), model="within", effect="time")
Both firm and year fixed effects → effect="twoways"
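
The statistic `phtest` reports compares the two estimates against the extra variance the fixed-effects estimator pays. For a single coefficient it reduces to a one-line formula, sketched here with illustrative numbers of my own (not the slide's output):

```python
# The idea behind the Hausman test, for one coefficient: under the
# random-effects null, b_RE is efficient, so
#     H = (b_FE - b_RE)^2 / (V_FE - V_RE)   ~   chi-squared(1).
# Function name and inputs are illustrative.
def hausman_1d(b_fe, v_fe, b_re, v_re):
    return (b_fe - b_re) ** 2 / (v_fe - v_re)

# A gap of 0.5 between the estimates, against a variance gap of 0.02:
print(hausman_1d(2.0, 0.04, 2.5, 0.02))  # 12.5
```

An H well above the chi-squared(1) critical value of about 3.84 rejects random effects, which is what the slide's chisq = 22.162 (p ≈ 2.5e-06) does for the patent data.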

  15. Clustered Panels
A panel is not exempt from the concern of clustered data.
Y_{i,t} = β_1 X_{i,t} + α_i + γ_t + ε_{i,t},   cor(ε_{i1,t1}, ε_{i2,t2}) = 0?
> summary(plm.fixed)
           Estimate Std. Error t-value  Pr(>|t|)
log(rdexp)  2.22611    0.22642   9.832 < 2.2e-16
> vcov <- cluster.vcov(many.dummies, D$sector)
> coeftest(plm.fixed, vcov)
           Estimate Std. Error t value Pr(>|t|)
log(rdexp)  2.22611    0.80872  2.7527 0.005985
↪ Four times less information!
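
The "four times" can be read directly off the slide's own output: the ratio of the clustered to the naive standard error is the factor by which the t-statistic shrinks.

```python
# Ratio of clustered to naive standard errors from the slide's output:
# clustering by sector divides the t-statistic by roughly this factor.
naive_se = 0.22642      # from summary(plm.fixed)
clustered_se = 0.80872  # from coeftest with sector-clustered vcov
print(round(clustered_se / naive_se, 2))  # 3.57
```

The slope estimate is unchanged; only the amount of independent information backing it is revised downward.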
