Cluster Robust Inference with Heterogeneous Clusters


  1. Cluster Robust Inference with Heterogeneous Clusters
     Joint work with Chang Lee and Drew Carter
     Douglas G. Steigerwald, UC Santa Barbara, July 2018
     D. Steigerwald (UCSB) Cluster Robust July 2018 1 / 32

  2. Empirical Framework
     Kuhn et alia (AER 2011) measure the consumption impact of a shock to a neighbor's income.
     - 410 postal codes ($g$), each with 4 to 105 households ($i$); the number of postal codes grows with $n$
     - $c_{gi} = \alpha_0 + \alpha_{fe} + \beta_1 \cdot \mathrm{win}_g + \beta_2 \cdot \mathrm{income}_{gi} + u_{gi}$
     - $V$: covariance matrix of the OLS estimator of the coefficients
     - $\widehat{V}$: cluster-robust variance estimator
     Baseline beliefs for this empirical setting:
     1. $\widehat{V}$ is known to be consistent
     2. $\widehat{V}$ removes the downward bias in the OLS estimator of $V$
     3. Degrees of freedom for hypothesis testing are at least 410
        1. 410 for the t test of $H_0: \beta_1 = 0$
        2. $n$ for the t test of $H_0: \beta_2 = 0$

  3. Research Response: Our Findings
     - $c_{gi} = \alpha_0 + \alpha_{fe} + \beta_1 \cdot \mathrm{win}_g + \beta_2 \cdot \mathrm{income}_{gi} + u_{gi}$
     - $V$: covariance matrix of the OLS estimator of the coefficients
     - $\widehat{V}$: cluster-robust variance estimator
     Our findings for this empirical setting:
     1. "$\widehat{V}$ is known to be consistent" is false
        1. consistency was previously established when group designs (cluster sizes) are equal
        2. we establish consistency when group designs (cluster sizes) vary
        3. $\widehat{V}$ is inconsistent for $\alpha_{fe}$
     2. "$\widehat{V}$ removes the downward bias in the OLS estimator of $V$" is false
        1. $\widehat{V}$ may have downward bias
     3. "Degrees of freedom for hypothesis testing are at least 410" is false
        1. $\widehat{V}$ is a function only of between-cluster variation
        2. degrees of freedom are at most 410 for the t test of either $H_0: \beta_1 = 0$ or $H_0: \beta_2 = 0$
        3. variation in designs (cluster sizes) reduces the degrees of freedom below 410

  4. Road Map
     - Data sets with a growing number of clusters
     - Interest focuses on a cluster-invariant regressor
       - no cluster fixed effects
     - Consistency with cluster homogeneity: White (1984)
     - Finite sample behavior: Cameron, Gelbach and Miller (2008)
     1. Consistency with cluster heterogeneity
        1. allow cluster sizes to vary
        2. number of clusters tends to infinity
     2. Guide to finite sample behavior that reflects cluster heterogeneity
        1. effective number of clusters
        2. smaller than the number of clusters
     3. Guidelines for empirical research

  5. Cluster Structure
     Data generating process:
     $y_{gi} = \beta_0 + \beta_1 x_g + \beta_2 z_{gi} + u_{gi}$
     - $y_{gi}$: observation $i$ in cluster $g$
     - $n_g$: number of observations in cluster $g$, with $\sum_g n_g = n$
     - $G$: number of clusters
     Error covariance matrix:
     $\Omega = \begin{bmatrix} \Omega_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & \Omega_G \end{bmatrix}$
     - $\Omega_g$ is unrestricted (positive definite)

  6. Robust Test Statistic (Shah, Holt and Folsom 1977)
     - $a$: selection vector, with $H_0: a^T \beta = 0$
     - $\hat{\beta}$: OLS estimator of $\beta$, with variance
       $V = \mathrm{Var}\left[ (X^T X)^{-1} \sum_{g=1}^{G} X_g^T u_g \right]$
     - Test statistic:
       $Z = \dfrac{a^T \hat{\beta}}{\sqrt{a^T \widehat{V} a}}$
     - Cluster-robust variance estimator:
       $\widehat{V} = (X^T X)^{-1} \left[ \sum_{g=1}^{G} X_g^T \hat{u}_g \hat{u}_g^T X_g \right] (X^T X)^{-1}$
       - robust to arbitrary structure of $\Omega_g$
       - allows $n_g$ to vary
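
The sandwich estimator and test statistic on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names, the toy cluster sizes, and the simulated data are all assumptions made here, and no small-sample correction is applied.

```python
import numpy as np

def cluster_robust_variance(X, y, cluster_ids):
    """OLS coefficients and the cluster-robust (sandwich) variance estimate."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        idx = cluster_ids == g
        score = X[idx].T @ resid[idx]        # X_g^T u_hat_g, a k-vector
        meat += np.outer(score, score)       # X_g^T u_hat_g u_hat_g^T X_g
    return beta_hat, XtX_inv @ meat @ XtX_inv

def robust_z(X, y, cluster_ids, a):
    """Z = a' beta_hat / sqrt(a' V_hat a) for the selection vector a."""
    beta_hat, V_hat = cluster_robust_variance(X, y, cluster_ids)
    return (a @ beta_hat) / np.sqrt(a @ V_hat @ a)

# Toy data (hypothetical): 10 clusters of unequal size.
rng = np.random.default_rng(0)
sizes = np.array([40, 5, 5, 10, 20, 8, 12, 30, 6, 14])
cluster_ids = np.repeat(np.arange(len(sizes)), sizes)
n = sizes.sum()
x = np.tile([0.0, 1.0], len(sizes) // 2)[cluster_ids]   # cluster-invariant regressor
z = rng.uniform(size=n)                                  # cluster-varying regressor
u = rng.normal(size=len(sizes))[cluster_ids] + rng.normal(size=n)
y = 1.0 + 0.5 * z + u                                    # beta_1 = 0 in this toy draw
X = np.column_stack([np.ones(n), x, z])

beta_hat, V_hat = cluster_robust_variance(X, y, cluster_ids)
z_stat = robust_z(X, y, cluster_ids, a=np.array([0.0, 1.0, 0.0]))
```

Only the per-cluster score vectors enter the middle term, which is why the estimator places no restriction on the within-cluster structure of $\Omega_g$.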

  7. Consistency: Theorem 1
     Assumptions:
     - $\Omega_g$ not identical over $g$
     - $X_g$ not identical over $g$
     - $n_g$ not constant over $g$
     If $G \to \infty$ as $n \to \infty$, then
     $\dfrac{a^T \widehat{V} a}{a^T V a} \xrightarrow{MS} 1$
     (convergence in mean square), which leads directly to
     $Z \xrightarrow{H_0} N(0, 1)$

  8. Remark 1: Convergence Is Governed by $G$, Not $n$
     - $A_g = (X^T X)^{-1} X_g^T X_g$
     - $\hat{\beta}_g$: OLS estimator based only on $X_g$
     - $\widehat{V} = \sum_g A_g (\hat{\beta}_g - \hat{\beta})(\hat{\beta}_g - \hat{\beta})^T A_g^T$
     - $\widehat{V}$ is a function only of between-cluster variation
     - consistency requires $G \to \infty$
     - in $y_{gi} = \beta_0 + \beta_1 x_g + \beta_2 z_{gi} + u_{gi}$, the behavior of $Z$ is governed by $G$ even for a test of $\beta_2$
     - if there is no cluster correlation, each observation is a cluster and $G = n$
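
The between-cluster representation can be checked numerically against the usual sandwich form. The sketch below uses simulated toy data (an assumption made here). Because the cluster-invariant regressor is collinear with the intercept inside a single cluster, $\hat{\beta}_g$ is taken to be any least-squares solution, computed with `np.linalg.lstsq`; the identity only needs the normal equations $X_g^T X_g \hat{\beta}_g = X_g^T y_g$ to hold.

```python
import numpy as np

rng = np.random.default_rng(1)
sizes = np.array([30, 6, 9, 15, 4, 20, 11, 5])           # hypothetical cluster sizes
ids = np.repeat(np.arange(len(sizes)), sizes)
n = sizes.sum()
x = np.tile([0.0, 1.0], len(sizes) // 2)[ids]            # cluster-invariant regressor
z = rng.uniform(size=n)
X = np.column_stack([np.ones(n), x, z])
y = 1.0 + 0.5 * z + rng.normal(size=len(sizes))[ids] + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

V_sandwich = np.zeros((3, 3))
V_between = np.zeros((3, 3))
for g in range(len(sizes)):
    Xg, yg, ug = X[ids == g], y[ids == g], resid[ids == g]
    # Sandwich form: (X'X)^{-1} X_g' u_hat_g u_hat_g' X_g (X'X)^{-1}
    s = XtX_inv @ Xg.T @ ug
    V_sandwich += np.outer(s, s)
    # Between-cluster form: A_g (beta_g - beta)(beta_g - beta)' A_g'
    A_g = XtX_inv @ Xg.T @ Xg
    beta_g = np.linalg.lstsq(Xg, yg, rcond=None)[0]      # a least-squares solution
    d = A_g @ (beta_g - beta)
    V_between += np.outer(d, d)

same = np.allclose(V_sandwich, V_between)
```

The two forms agree because $X_g^T \hat{u}_g = X_g^T X_g (\hat{\beta}_g - \hat{\beta})$, so each cluster contributes a single term no matter how many observations it contains.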

  9. Remark 2: Inconsistent Testing
     - $\widehat{V}$ is a function only of between-cluster variation
     - consistency of $\widehat{V}$ depends on $G$ growing
     - the test is inconsistent for
       - a coefficient estimator that depends on a fixed subset of clusters
     - leading examples
       - controls that correspond to a group of clusters
       - cluster-specific controls (cluster fixed effects)

  10. Cluster Heterogeneity and Asymptotic Approximation
      What gives rise to cluster heterogeneity? For example:
      1. unequal cluster sizes
      2. equal cluster sizes, but variation in $\Omega_g$
      3. equal cluster sizes and constant $\Omega_g$, but variation in $X_g$
      - the majority of empirical studies have cluster heterogeneity
      - convergence of $Z$ requires $G \to \infty$
      - Is $G$ an accurate guide to performance under heterogeneity?

  11. Cluster Heterogeneity Measure
      The analysis leads to a natural measure of heterogeneity.
      - For each cluster:
        $\gamma_g = a^T (X^T X)^{-1} X_g^T \Omega_g X_g (X^T X)^{-1} a$
        - depends on which coefficients are under test through $a$
      - For the entire sample:
        $\Gamma = \dfrac{\frac{1}{G} \sum_{g=1}^{G} (\gamma_g - \bar{\gamma})^2}{\bar{\gamma}^2}$
        - the (squared) coefficient of variation of $\gamma_g$
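
To make the two measures concrete, the sketch below computes $\gamma_g$ and $\Gamma$ from per-cluster blocks. The function name and the intercept-only toy design are assumptions chosen so the result can be verified by hand: with $\Omega_g = I$ and a single intercept column, $\gamma_g = n_g / n^2$.

```python
import numpy as np

def heterogeneity(X_blocks, Omega_blocks, a):
    """Per-cluster gamma_g and the sample-wide squared coefficient of variation Gamma."""
    XtX = sum(Xg.T @ Xg for Xg in X_blocks)
    w = np.linalg.solve(XtX, a)                       # (X'X)^{-1} a
    gamma = np.array([w @ Xg.T @ Og @ Xg @ w
                      for Xg, Og in zip(X_blocks, Omega_blocks)])
    gamma_bar = gamma.mean()
    Gamma = np.mean((gamma - gamma_bar) ** 2) / gamma_bar ** 2
    return gamma, Gamma

a = np.array([1.0])

# Two clusters of sizes 1 and 3: gamma is proportional to (1, 3).
X_unequal = [np.ones((1, 1)), np.ones((3, 1))]
Omega_unequal = [np.eye(1), np.eye(3)]
gamma_u, Gamma_u = heterogeneity(X_unequal, Omega_unequal, a)

# Two clusters of equal size: no heterogeneity.
X_equal = [np.ones((2, 1)), np.ones((2, 1))]
Omega_equal = [np.eye(2), np.eye(2)]
gamma_e, Gamma_e = heterogeneity(X_equal, Omega_equal, a)
```

In the unequal case $\gamma = (1/16, 3/16)$, so the mean is $2/16$, the variance is $(1/16)^2$, and $\Gamma = 1/4$; in the equal case $\Gamma = 0$.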

  12. Finite Sample Behavior of the Cluster-Robust Estimator
      The leading term in the asymptotic behavior of $Z$ is governed by
      - $G$ under homogeneity
        - the number of clusters is a guide to inference
      - $\dfrac{G}{1 + \Gamma}$ under heterogeneity
        - inference is guided by the effective number of clusters, $\mathrm{ENC} = \dfrac{G}{1 + \Gamma}$

  13. Magnitude of Cluster Correction
      - example: if $\Gamma = 2$, then $\mathrm{ENC} = \dfrac{G}{3}$
      - a different order of magnitude than the standard bias correction, $G - k$
      - as $n \to \infty$, ENC governs the mean-squared error of $\widehat{V}$
      - cluster heterogeneity increases
        - variation in $\widehat{V}$
        - bias in $\widehat{V}$
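
A quick numerical comparison shows how far the two corrections can diverge. The values $G = 100$ and $k = 3$ below are illustrative assumptions, not figures from this slide; only $\Gamma = 2$ comes from the example above.

```python
def effective_number_of_clusters(G, Gamma):
    """ENC = G / (1 + Gamma)."""
    return G / (1.0 + Gamma)

G, k = 100, 3                      # illustrative values (assumed)
enc = effective_number_of_clusters(G, Gamma=2.0)
dof_standard = G - k               # standard correction subtracts the regressor count

print(f"ENC = {enc:.1f}, G - k = {dof_standard}")
```

Subtracting $k$ changes the count by a few units, while dividing by $1 + \Gamma$ rescales it: here the effective count is one third of $G$.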

  14. Laboratory Performance Framework
      $y_{gi} = \beta_0 + \beta_1 x_g + \beta_2 z_{gi} + u_{gi}$
      Error components model:
      $u_{gi} = \varepsilon_g + v_{gi}$
      - $\varepsilon_g \overset{iid}{\sim} N(0, 1)$, independently of $v_{gi} \mid X \sim N(0, c z_{gi}^2)$
      Correlation matrix for cluster $g$:
      $\begin{bmatrix} 1 & & \rho_{ij} \\ & \ddots & \\ \rho_{ij} & & 1 \end{bmatrix}$ where $\rho_{ij} = \dfrac{1}{\sqrt{1 + c z_{gi}^2} \sqrt{1 + c z_{gj}^2}}$
      - $c = 500$: nearly uncorrelated (heteroskedastic)
      - $c = 0$: perfectly correlated (homoskedastic)
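
The error components model and its implied correlation matrix can be sketched as follows. The function names and the example values of $z_{gi}$ are assumptions for illustration. Since $\mathrm{Var}(u_{gi}) = 1 + c z_{gi}^2$ and $\mathrm{Cov}(u_{gi}, u_{gj}) = 1$ for $i \neq j$, the off-diagonal correlations follow directly.

```python
import numpy as np

def cluster_correlation(z_g, c):
    """Correlation matrix of u_g = eps_g + v_g with Var(v_gi | X) = c * z_gi^2."""
    s = np.sqrt(1.0 + c * z_g ** 2)          # standard deviation of each u_gi
    R = 1.0 / np.outer(s, s)                 # rho_ij for i != j
    np.fill_diagonal(R, 1.0)
    return R

def draw_errors(z_g, c, rng):
    """One draw of the error vector for a single cluster."""
    eps = rng.normal()                                   # shared cluster component
    v = rng.normal(scale=np.sqrt(c) * np.abs(z_g))       # heteroskedastic component
    return eps + v

z_g = np.array([0.5, 0.8, 0.3])              # hypothetical regressor values
R_perfect = cluster_correlation(z_g, c=0.0)
R_weak = cluster_correlation(z_g, c=500.0)
u_perfect = draw_errors(z_g, 0.0, np.random.default_rng(0))
```

With $c = 0$ the idiosyncratic component vanishes, so every error in a cluster equals $\varepsilon_g$ and all correlations are 1. With $c = 500$ the correlation between the first two observations is $1/\sqrt{126 \cdot 321}$, roughly 0.005.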

  15. Design Variation
      2500 observations divided into 100 groups, with $x_g \overset{iid}{\sim} \mathrm{Bernoulli}(.5)$ and $z_{gi} \overset{iid}{\sim} U(0, 1)$
      1. Cluster sizes
         1. design 1: $n_1 = 25$, $n_2 = \cdots = n_{100} = 25$
         2. design 2: $n_1 = 124$, $n_2 = \cdots = n_{100} = 24$
         3. ...
         4. design 10: $n_1 = 916$, $n_2 = \cdots = n_{100} = 16$
      2. Error cluster correlation
         1. $c = 500$: correlation $\approx 0$, heteroskedastic
         2. ...
         3. $c = 0$: correlation $= 1$, homoskedastic
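
The cluster-size designs listed above share one pattern: 99 clusters of a common size and one cluster that absorbs the remaining observations. The helper below (a hypothetical name) reproduces the three designs the slide states explicitly. The intermediate designs are elided in the source and are not filled in here.

```python
import numpy as np

def cluster_sizes(small, n_total=2500, n_clusters=100):
    """One large cluster followed by n_clusters - 1 clusters of size `small`."""
    first = n_total - (n_clusters - 1) * small
    return np.array([first] + [small] * (n_clusters - 1))

# The three designs stated on the slide, keyed by design number.
designs = {1: cluster_sizes(25), 2: cluster_sizes(24), 10: cluster_sizes(16)}

# Regressors for one simulated data set (the seed is arbitrary).
rng = np.random.default_rng(0)
x_g = rng.integers(0, 2, size=100)       # Bernoulli(.5), one value per cluster
z_gi = rng.uniform(size=2500)            # U(0, 1), one value per observation

for number, sizes in designs.items():
    print(number, sizes[0], sizes[1], sizes.sum())
```

Each design keeps the total at 2500 observations, so only the concentration of observations in the first cluster changes.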

  16. Impact of Design on Effective Number of Clusters
      Effective number of clusters: $\dfrac{G}{1 + \Gamma(\Omega, X)}$
      - $\Gamma(\Omega, X)$: measure of cluster heterogeneity
      1. Cluster size variation
         1. increasing cluster size variation reduces ENC
      2. Realized values for $X$
         1. data sets with unequal values for $x_g$ reduce ENC
      3. Cluster error correlation
         1. as the cluster error correlation increases, ENC is more sensitive to variation in $x_g$
      - for each set of cluster sizes and value of $c$: generate 1000 values of $X$
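
The effect of cluster size variation on ENC can be illustrated with a single draw of $X$ per design, rather than the 1000 draws described above. The sketch assumes the error components covariance $\Omega_g = \mathbf{1}\mathbf{1}^T + c \, \mathrm{diag}(z_{g}^2)$, under which $\gamma_g = (\mathbf{1}^T t_g)^2 + c \sum_i z_{gi}^2 t_{gi}^2$ with $t_g = X_g (X^T X)^{-1} a$. Function names and the seed are assumptions.

```python
import numpy as np

def cluster_sizes(small, n_total=2500, n_clusters=100):
    """One large cluster followed by n_clusters - 1 clusters of size `small`."""
    first = n_total - (n_clusters - 1) * small
    return np.array([first] + [small] * (n_clusters - 1))

def enc_for_design(sizes, c, a, rng):
    """ENC = G / (1 + Gamma) for one simulated X under the error components model."""
    G = len(sizes)
    ids = np.repeat(np.arange(G), sizes)
    starts = np.concatenate(([0], np.cumsum(sizes)[:-1]))   # first row of each cluster
    x = rng.integers(0, 2, size=G).astype(float)[ids]
    z = rng.uniform(size=ids.size)
    X = np.column_stack([np.ones(ids.size), x, z])
    t = X @ np.linalg.solve(X.T @ X, a)                     # stacked t_g vectors
    # gamma_g = t_g' Omega_g t_g, with Omega_g = 1 1' + c diag(z_g^2)
    gamma = (np.add.reduceat(t, starts) ** 2
             + c * np.add.reduceat(z ** 2 * t ** 2, starts))
    Gamma = np.mean((gamma - gamma.mean()) ** 2) / gamma.mean() ** 2
    return G / (1.0 + Gamma)

rng = np.random.default_rng(0)
a = np.array([0.0, 1.0, 0.0])                               # selects beta_1
enc_design_1 = enc_for_design(cluster_sizes(25), c=0.0, a=a, rng=rng)
enc_design_10 = enc_for_design(cluster_sizes(16), c=0.0, a=a, rng=rng)
print(enc_design_1, enc_design_10)
```

Because $\Gamma \geq 0$, ENC can never exceed $G = 100$; concentrating 916 observations in one cluster makes that cluster's $\gamma_g$ dominate and pulls ENC far below the equal-size design.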

  17. Impact of Design on ENC

  18. Impact of Effective Number of Clusters on MSE of Cluster-Robust Variance Estimator
      Mean-squared error:
      $\mathrm{MSE}\left( \dfrac{a^T \widehat{V} a}{a^T V a} \,\middle|\, X \right) \approx \dfrac{2}{G} (1 + \Gamma)$
      - reducing the ENC increases the MSE for $\widehat{V}$
      1. MSE is conditional on the realization of $X$
         1. 5 values of $X$ are generated for each set of cluster sizes and value of $c$
         2. for each value of $X$, 1000 values of $u$ are generated

  21. MSE of Cluster-Robust Variance Estimator: Cluster-Invariant Regressor
      $y_{gi} = \beta_0 + \beta_1 x_g + \beta_2 z_{gi} + u_{gi}$
      Estimator of the variance of $\hat{\beta}_1$:
      - MSE is impacted by bias
      - bias is driven by variation in cluster size
      With variation in cluster sizes, the cluster-robust standard error can be significantly downward biased for the cluster-invariant regressor.

  22. MSE of Cluster-Robust Variance Estimator: Cluster-Varying Regressor
      $y_{gi} = \beta_0 + \beta_1 x_g + \beta_2 z_{gi} + u_{gi}$
      Estimator of the variance of $\hat{\beta}_2$:
      - the MSE impact depends on $c$
      - if $c = 500$ (no error cluster correlation): bias impacts the MSE
        - bias is driven by variation in cluster size
      - if $c < 500$ (error cluster correlation): variation dominates
      With error cluster correlation, the cluster-robust standard error can be highly variable for the cluster-varying regressor.

  23. Empirical Test Size for Cluster-Robust t Test
      $y_{gi} = \beta_0 + \beta_1 x_g + \beta_2 z_{gi} + u_{gi}$
      - test of $H_0: \beta_1 = 0$ (cluster-invariant regressor)
        - small ENC → downward bias in cluster-robust s.e. → large empirical test size
      - test of $H_0: \beta_2 = 0$
        - small ENC → greater variation in cluster-robust s.e. → variation in empirical test size
      The impact is most pronounced for the hypothesis test of $\beta_1$.
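
A small Monte Carlo sketch shows how an empirical test size for $H_0: \beta_1 = 0$ could be computed under this framework. It is an illustration under assumptions, not a reproduction of the authors' experiment: it uses one fixed draw of $X$ per design, 200 replications, a normal critical value of 1.96, and no small-sample correction; the function name and seed are hypothetical.

```python
import numpy as np

def rejection_rate(sizes, c, reps, rng, crit=1.96):
    """Share of replications in which the cluster-robust t test rejects beta_1 = 0."""
    G = len(sizes)
    ids = np.repeat(np.arange(G), sizes)
    starts = np.concatenate(([0], np.cumsum(sizes)[:-1]))   # first row of each cluster
    x = rng.integers(0, 2, size=G).astype(float)[ids]       # cluster-invariant regressor
    z = rng.uniform(size=ids.size)                          # cluster-varying regressor
    X = np.column_stack([np.ones(ids.size), x, z])
    XtX_inv = np.linalg.inv(X.T @ X)
    rejections = 0
    for _ in range(reps):
        # Error components: shared eps_g plus v_gi with variance c * z_gi^2.
        u = rng.normal(size=G)[ids] + rng.normal(size=ids.size) * np.sqrt(c) * z
        y = u                                               # all betas zero, so H0 holds
        beta = XtX_inv @ (X.T @ y)
        resid = y - X @ beta
        scores = np.add.reduceat(X * resid[:, None], starts, axis=0)  # rows: X_g' u_hat_g
        V = XtX_inv @ (scores.T @ scores) @ XtX_inv
        rejections += abs(beta[1] / np.sqrt(V[1, 1])) > crit
    return rejections / reps

rng = np.random.default_rng(0)
equal = np.full(100, 25)                                    # design 1
unequal = np.array([916] + [16] * 99)                       # design 10
size_equal = rejection_rate(equal, c=0.0, reps=200, rng=rng)
size_unequal = rejection_rate(unequal, c=0.0, reps=200, rng=rng)
print(size_equal, size_unequal)
```

Comparing the two printed rates against the nominal 5% level gives a rough sense of the size distortion for each design; more replications and several draws of $X$ would be needed for stable estimates.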
