Gov 2000: 13. Panel Data and Clustering
Matthew Blackwell
Fall 2016

1. Panel Data
2. First Differencing Methods
3. Fixed Effects Methods
4. Clustering
5. What’s next for you?

Where are we? Where are we going?
• Up until now: the linear regression model, its assumptions, and violations of those assumptions
• This week: what can we do with panel data?

1/ Panel Data

Motivation
• Relationship between democracy and infant mortality?
• Compare levels of democracy with levels of infant mortality, but…
• Democratic countries are different from non-democracies in ways that we can’t measure?
  ▶ they are richer or developed earlier
  ▶ provide benefits more efficiently
  ▶ possess some cultural trait correlated with better health outcomes
• If we have data on countries over time, can we make any progress in spite of these problems?

Ross data
ross <- foreign::read.dta("../data/ross-democracy.dta")
head(ross[, c("cty_name", "year", "democracy", "infmort_unicef")])
##      cty_name year democracy infmort_unicef
## 1 Afghanistan 1965         0            230
## 2 Afghanistan 1966         0             NA
## 3 Afghanistan 1967         0             NA
## 4 Afghanistan 1968         0             NA
## 5 Afghanistan 1969         0             NA
## 6 Afghanistan 1970         0            215

Notation for panel data
• Units, $i = 1, \ldots, n$
• Time, $t = 1, \ldots, T$
• Time is a typical application, but applies to other groupings:
  ▶ counties within states
  ▶ states within countries
  ▶ people within countries, etc.
• Panel data: large $n$, relatively short $T$
• Time series, cross-sectional (TSCS) data: smaller $n$, large $T$ (a political science term, mostly)

Model
$y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + a_i + u_{it}$
• $\mathbf{x}_{it}$ is a vector of covariates (possibly time-varying)
• $a_i$ is an unobserved time-constant unit effect (“fixed effect”)
• $u_{it}$ are the unobserved time-varying “idiosyncratic” errors
• $v_{it} = a_i + u_{it}$ is the combined unobserved error: $y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + v_{it}$
• Assume that if we could measure $a_i$, we would have the right model: $\mathbb{E}[u_{it} \mid \mathbf{x}_{it}, a_i] = 0$
  ▶ Note that this implies $u_{it}$ is uncorrelated with $\mathbf{x}_{it}$, so that $\mathbb{E}[u_{it} \mid \mathbf{x}_{it}] = 0$.

Pooled OLS
• Pooled OLS: pool all observations into one regression
• Treats all unit-periods (each $it$) as an iid unit.
• Has two problems:
  1. Variance is wrong
  2. Possible violation of zero conditional mean errors
• Both problems arise out of ignoring the unmeasured heterogeneity inherent in $a_i$

Pooled OLS with Ross data
pooled.mod <- lm(log(kidmort_unicef) ~ democracy + log(GDPcur),
                 data = ross)
summary(pooled.mod)
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   9.7640     0.3449    28.3   <2e-16 ***
## democracy    -0.9552     0.0698   -13.7   <2e-16 ***
## log(GDPcur)  -0.2283     0.0155   -14.8   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.795 on 646 degrees of freedom
##   (5773 observations deleted due to missingness)
## Multiple R-squared: 0.504, Adjusted R-squared: 0.503
## F-statistic: 329 on 2 and 646 DF, p-value: <2e-16

Unmeasured heterogeneity
• If the unit effect $a_i$ is uncorrelated with $\mathbf{x}_{it}$, no problem for consistency!
  ▶ ⇒ $\mathbb{E}[v_{it} \mid \mathbf{x}_{it}] = \mathbb{E}[a_i + u_{it} \mid \mathbf{x}_{it}] = 0$.
  ▶ Just run pooled OLS (but worry about SEs).
• But $a_i$ is often correlated with $\mathbf{x}_{it}$ so that $\mathbb{E}[a_i \mid \mathbf{x}_{it}] \neq 0$.
  ▶ Example: democratic institutions correlated with unmeasured aspects of health outcomes, like quality of health system or a lack of ethnic conflict.
  ▶ Ignore the heterogeneity ⇒ correlation between the combined error and the independent variables.
  ▶ ⇒ $\mathbb{E}[v_{it} \mid \mathbf{x}_{it}] = \mathbb{E}[a_i + u_{it} \mid \mathbf{x}_{it}] \neq 0$
• Pooled OLS will be biased and inconsistent because zero conditional mean error fails for the combined error.
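The bias from ignoring the unit effect can be seen in a small simulation. This is only a sketch on made-up data, not the Ross data: the sample sizes, the 0.8 loading of the covariate on the unit effect, and the true coefficient of 1 are all assumptions chosen for illustration.

```r
# Hypothetical panel: n units, 2 periods, unit effect a_i correlated with x_it.
set.seed(2016)
n         <- 500
n_periods <- 2
id   <- rep(seq_len(n), each = n_periods)
a    <- rnorm(n)                          # unobserved unit effect a_i
x    <- 0.8 * a[id] + rnorm(n * n_periods) # covariate depends on a_i
u    <- rnorm(n * n_periods)              # idiosyncratic error u_it
beta <- 1                                 # true coefficient (assumed)
y    <- beta * x + a[id] + u

# Pooled OLS leaves a_i in the error term, so the error is correlated with x.
pooled <- lm(y ~ x)
coef(pooled)["x"]
```

Under this setup the pooled slope converges to 1 + Cov(x, a)/Var(x) = 1 + 0.8/1.64 ≈ 1.49 rather than 1, so the estimate should land well above the true coefficient.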

Panel data
• Panel data (sometimes) allows us to estimate coefficients consistently even when zero conditional mean error is violated.
• Two approaches that leverage repeated observations:
  ▶ Differencing: look at changes over time.
  ▶ Fixed effects: look at relationships within units.
• These approaches can help address time-constant unmeasured confounding.

2/ First Differencing Methods

First differencing
• One approach: compare changes over time
• Intuitively, changes over time will be free of time-constant unobserved heterogeneity
• Two time periods:
  $y_{i1} = \mathbf{x}_{i1}'\boldsymbol{\beta} + a_i + u_{i1}$
  $y_{i2} = \mathbf{x}_{i2}'\boldsymbol{\beta} + a_i + u_{i2}$
• Look at the change in $y$ over time:
  $\Delta y_i = y_{i2} - y_{i1}$
  $= (\mathbf{x}_{i2}'\boldsymbol{\beta} + a_i + u_{i2}) - (\mathbf{x}_{i1}'\boldsymbol{\beta} + a_i + u_{i1})$
  $= (\mathbf{x}_{i2}' - \mathbf{x}_{i1}')\boldsymbol{\beta} + (a_i - a_i) + (u_{i2} - u_{i1})$
  $= \Delta\mathbf{x}_i'\boldsymbol{\beta} + \Delta u_i$

First differences model
$\Delta y_i = \Delta\mathbf{x}_i'\boldsymbol{\beta} + \Delta u_i$
• Coefficient on the levels $\mathbf{x}_{it}$ = the coefficient on the changes $\Delta\mathbf{x}_i$
• Time-constant unobserved heterogeneity $a_i$ drops out
• Zero conditional mean error holds for the differenced model if $\mathbb{E}[\Delta u_i \mid \Delta\mathbf{x}_i] = 0$.
  ▶ Stronger than $\mathbb{E}[u_{it} \mid \mathbf{x}_{it}, a_i] = 0$ because it requires assumptions about relationships between $u_{i2}$ and $\mathbf{x}_{i1}$.
• No perfect collinearity: $\mathbf{x}_{it}$ has to change over time for some units
• Under these modified assumptions, we can run regular OLS on the differences
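Running OLS on the differences can be sketched by hand in base R. The data below are simulated, not the Ross data; the two-period design, the 0.8 loading of the covariate on the unit effect, and the true coefficient of 1 are assumptions made purely for illustration.

```r
# Hypothetical two-period panel with a unit effect that is correlated with x.
set.seed(2016)
n    <- 500
a    <- rnorm(n)                 # time-constant unit effect a_i
x1   <- 0.8 * a + rnorm(n)       # covariate in period 1
x2   <- 0.8 * a + rnorm(n)       # covariate in period 2
beta <- 1                        # true coefficient (assumed)
y1   <- beta * x1 + a + rnorm(n)
y2   <- beta * x2 + a + rnorm(n)

# Pooled OLS on levels: a_i stays in the error term.
pooled <- lm(c(y1, y2) ~ c(x1, x2))

# First differences: a_i cancels, then regular OLS on the changes.
dy <- y2 - y1
dx <- x2 - x1
fd <- lm(dy ~ dx)

coef(pooled)[2]   # slope on levels, pulled away from beta
coef(fd)["dx"]    # slope on changes, should be close to beta
```

The intercept in `lm(dy ~ dx)` picks up any common change between the two periods, which is zero in this simulation.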

First differences in R
library(plm)
fd.mod <- plm(log(kidmort_unicef) ~ democracy + log(GDPcur), data = ross,
              index = c("id", "year"), model = "fd")
summary(fd.mod)
## Oneway (individual) effect First-Difference Model
##
## Call:
## plm(formula = log(kidmort_unicef) ~ democracy + log(GDPcur),
##     data = ross, model = "fd", index = c("id", "year"))
##
## Unbalanced Panel: n=166, T=1-7, N=649
##
## Residuals :
##    Min. 1st Qu.  Median 3rd Qu.    Max.
## -0.9060 -0.0956  0.0468  0.1410  0.3950
##
## Coefficients :
##             Estimate Std. Error t-value Pr(>|t|)
## (intercept)  -0.1495     0.0113  -13.26   <2e-16 ***
## democracy    -0.0449     0.0242   -1.85    0.064 .
## log(GDPcur)  -0.1718     0.0138  -12.49   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 23.5
## Residual Sum of Squares: 17.8
## R-Squared : 0.246
## Adj. R-Squared : 0.244
## F-statistic: 78.1367 on 2 and 480 DF, p-value: <2e-16

Differences-in-differences
• Often called “diff-in-diff”, it is a special kind of FD model
• Let $x_{it}$ be an indicator of a unit being “treated” at time $t$.
• Focus on two periods where:
  ▶ $x_{i1} = 0$ for all $i$
  ▶ $x_{i2} = 1$ for the “treated group”
• Here is the basic model:
  $y_{it} = \beta_0 + \delta_0 d_t + \beta_1 x_{it} + a_i + u_{it}$
• $d_t$ is a dummy variable for the second time period
  ▶ $d_2 = 1$ and $d_1 = 0$
• $\beta_1$ is the quantity of interest: it’s the effect of being treated

Diff-in-diff mechanics
• Let’s take differences:
  $(y_{i2} - y_{i1}) = \delta_0 + \beta_1 (x_{i2} - x_{i1}) + (u_{i2} - u_{i1})$
• $(x_{i2} - x_{i1}) = 1$ only for the treated group
• $(x_{i2} - x_{i1}) = 0$ only for the control group
• $\delta_0$: the difference in the average outcome from period 1 to period 2 in the untreated group
• $\beta_1$ represents the additional change in $y$ over time (on top of $\delta_0$) associated with being in the treatment group.
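To make the roles of the two parameters explicit, one can take conditional expectations of the differenced equation within each group. The short derivation below assumes the differenced error has mean zero in both groups, which is the zero conditional mean condition for the differenced model.

```latex
\begin{align*}
\mathbb{E}[y_{i2} - y_{i1} \mid x_{i2} = 1] &= \delta_0 + \beta_1 \cdot 1 + \mathbb{E}[u_{i2} - u_{i1} \mid x_{i2} = 1] = \delta_0 + \beta_1 \\
\mathbb{E}[y_{i2} - y_{i1} \mid x_{i2} = 0] &= \delta_0 + \beta_1 \cdot 0 + \mathbb{E}[u_{i2} - u_{i1} \mid x_{i2} = 0] = \delta_0 \\
\mathbb{E}[y_{i2} - y_{i1} \mid x_{i2} = 1] - \mathbb{E}[y_{i2} - y_{i1} \mid x_{i2} = 0] &= \beta_1
\end{align*}
```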

Diff-in-diff interpretation
• Key idea: comparing the changes over time in the control group to the changes over time in the treated group.
• The difference between these differences is our estimate of the causal effect:
  $\hat{\beta}_1 = \overline{\Delta y}_{\text{treated}} - \overline{\Delta y}_{\text{control}}$
• Why more credible than simply looking at the treatment/control differences in period 2?
  $y_{i2} = (\beta_0 + \delta_0) + \beta_1 x_{i2} + a_i + u_{i2}$
• $a_i$ might be correlated with the treatment
• Unmeasured reasons why the treated group has higher or lower outcomes than the control group
• ⇒ bias due to violation of zero conditional mean error
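The contrast between the diff-in-diff estimate and the period-2 comparison can be sketched with simulated data. Everything numeric here is an assumption for illustration: a true treatment effect of -1, a common time trend of 0.5, and treated units whose unit effects are shifted up by 2 at baseline.

```r
# Hypothetical two-period design: nobody treated in period 1,
# half the units treated in period 2.
set.seed(2016)
n       <- 400
treated <- rep(c(0, 1), each = n / 2)
a       <- rnorm(n) + 2 * treated    # unit effect correlated with treatment
delta0  <- 0.5                       # common change from period 1 to 2
beta1   <- -1                        # true treatment effect (assumed)
y1 <- 3 + a + rnorm(n)
y2 <- 3 + delta0 + beta1 * treated + a + rnorm(n)

# Diff-in-diff: difference in average changes between the two groups.
dy        <- y2 - y1
did_means <- mean(dy[treated == 1]) - mean(dy[treated == 0])

# The same number from OLS on the differenced outcome.
did_reg <- unname(coef(lm(dy ~ treated))["treated"])

# Naive period-2 comparison: contaminated by the baseline gap in a_i.
naive <- mean(y2[treated == 1]) - mean(y2[treated == 0])

c(did_means = did_means, did_reg = did_reg, naive = naive)
```

The two diff-in-diff numbers agree because a regression on a single binary regressor reproduces the difference in group means. The naive comparison targets beta1 + 2 in this setup, so it should even have the wrong sign.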

Example: Lyall (2009)
• Does Russian shelling of villages cause insurgent attacks?
  $\text{attacks}_{it} = \beta_0 + \delta_0 d_t + \beta_1 \text{shelling}_{it} + a_i + u_{it}$
• We might think that artillery shelling by Russians is targeted to places where the insurgency is the strongest
• That is, part of the village fixed effect, $a_i$, might be correlated with whether or not shelling occurs, $x_{it}$
• This would cause our pooled estimates to be biased
• Instead Lyall takes a diff-in-diff approach: compare attacks over time for shelled and non-shelled villages:
  $\Delta \text{attacks}_i = \delta_0 + \beta_1 \Delta \text{shelling}_i + \Delta u_i$
