Differences-in-Differences Estimator: Example Card and Krueger - PowerPoint PPT Presentation

Differences-in-Differences Estimator: Example Card and Krueger (1994) Example : Card D. and Krueger A. (1994) Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania, American Economic Review,


  1. Differences-in-Differences Estimator: Example Card and Krueger (1994) • Example: Card D. and Krueger A. (1994) “Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania”, American Economic Review, Vol. 84, No. 4, pp. 772-793. • We mention this paper as an example of a natural experiment or quasi-experiment. • Effect of minimum wages on employment (a classic and controversial question in labour economics).

  2. Differences-in-Differences Estimator: Example Card and Krueger (1994) • On April 1, 1992, New Jersey (NJ) increased the state minimum wage from $4.25 to $5.05. Pennsylvania (PA)’s minimum wage stayed at $4.25. • The authors surveyed about 400 fast-food stores in NJ and PA, both before and after the minimum wage increase in NJ. • The differences-in-differences strategy amounts to comparing the change in employment in NJ to the change in employment in PA.

  3. Differences-in-Differences Estimator and Common Trend Assumption • The key assumption for DID is that the outcome in the treatment and control groups would have followed the same time trend in the absence of the treatment. • This does not mean that they have to have the same mean of the outcome! • The common trend assumption cannot be tested directly, but pre-treatment data are often used to check that the two groups' trends were similar before the treatment.
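The comparison of changes described in slides 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration only: the employment figures are made up and are not the Card and Krueger data, and the helper name `did_estimate` is hypothetical.

```python
from statistics import mean

# Hypothetical store-level employment (full-time-equivalent workers).
# Made-up numbers for illustration; NOT the Card and Krueger (1994) data.
employment = {
    ("NJ", "before"): [20.0, 22.0, 18.0],
    ("NJ", "after"): [21.0, 23.0, 19.0],
    ("PA", "before"): [24.0, 23.0, 22.0],
    ("PA", "after"): [22.0, 21.0, 20.0],
}


def did_estimate(data, treated, control):
    """Change in the treated group's mean minus change in the control group's mean."""
    change_treated = mean(data[(treated, "after")]) - mean(data[(treated, "before")])
    change_control = mean(data[(control, "after")]) - mean(data[(control, "before")])
    return change_treated - change_control


did = did_estimate(employment, "NJ", "PA")
print(f"DID estimate: {did:.2f}")
```

Note that the two groups start from different employment levels in this toy data; only the changes enter the estimate, which is why equal means are not required.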

  4. Example of TTTC for time series: Causal Effects Using Bayesian Structural Time-Series Models (Brodersen et al. 2015) • Brodersen et al. (2015): new methodology to estimate the causal effect of an intervention using time series data. • Goal: propose a new approach to infer the causal impact of a market intervention, such as the launch of a new product, or the effect of an advertising campaign. • Very useful method for private sector firms that want to assess the profitability of investment decisions. – No coincidence that all authors of the article work at Google. – They develop the R package CausalImpact to perform the estimation.

  5. Example of TTTC for time series: Causal Effects Using Bayesian Structural Time-Series Models (Brodersen et al. 2015) • Main idea: generalize the DID estimator to a time-series setting by modelling the counterfactual of a time series observed both before and after a given intervention. • Example: estimate the causal effect of an ad campaign on number of clicks to a website – Causal effect: difference between the observed number of clicks and the number of clicks that would have been observed absent the ad campaign.
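The idea of modelling the counterfactual can be illustrated with a deliberately simplified stand-in. The sketch below fits an ordinary least squares line on the pre-period and projects it into the post-period; it is not the Bayesian structural time-series model of Brodersen et al. and not the CausalImpact API. The click counts, the single control series, and the helper `fit_line` are all assumptions made for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical daily clicks (made-up numbers).
control = [100, 110, 105, 120, 115, 125, 130, 128]  # series unaffected by the campaign
observed = [210, 230, 220, 250, 240, 275, 285, 281]  # series exposed to the campaign
n_pre = 5  # the campaign starts after the fifth observation

# Train on the pre-period only.
intercept, slope = fit_line(control[:n_pre], observed[:n_pre])

# Predict the counterfactual in the post-period and compare with what was observed.
counterfactual = [intercept + slope * x for x in control[n_pre:]]
effects = [y - c for y, c in zip(observed[n_pre:], counterfactual)]
cumulative_effect = sum(effects)

print("pointwise effects:", effects)
print("cumulative effect:", cumulative_effect)
```

Unlike a two-period DID, this produces one effect estimate per post-period observation, so the evolution of the effect over time is visible.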

  6. Example of TTTC for time series: Causal Effects Using Bayesian Structural Time-Series Models (Brodersen et al. 2015) • Advantages with respect to DID: – DID is based on a static regression model that assumes i.i.d. data, while time series clearly are not i.i.d. – DID considers only two points in time: before and after the treatment. However, the way in which the treatment effect changes over time can be very important. – DID applied to time series data imposes restrictions on how the variables used to characterize the control group are selected.

  7. CausalImpact Package in R • https://google.github.io/CausalImpact/CausalImpact.html#installing-the-package • Tutorial: https://www.youtube.com/watch?v=GTgZfCltMm8 • Methodology: Bayesian structural time-series model. • Find time series that are correlated with the outcome of interest but are not themselves affected by the treatment: – Examples: markets where no action was taken, stock markets, weather, search queries from Google Trends. – “Unmoved movers” that indicate an interest in the industry or market where a firm operates without being directly affected by the action taken. • Train the model on the pre-period and apply it to the post-period. • A spike-and-slab prior is used to automatically find useful predictors.

  8. CausalImpact Package in R • Caveats and checks: – Back-test the model to guard against spurious correlations between time series that happen to predict the outcome well before the treatment: • Run the model on a period before the treatment and check that it finds no effect there. – Rule of thumb: use between 5 and 20 predictor time series to balance the explanatory power of the predictors. • Open question: how to use this method to calculate the impact of multiple events that overlap over time? • Precision of the estimates: – The longer you predict into the future, the wider the CI. – The stronger the predictor time series and the less noisy the outcome, the tighter the CI.
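The back-test mentioned above can be sketched as a placebo exercise: use only pre-treatment data, pretend the intervention happened earlier than it did, and check that the estimated "effect" is close to zero. As before, this uses a plain least squares line rather than the package's Bayesian model, and the data, the split point `n_train`, and the helper `fit_line` are assumptions for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical PRE-treatment data only (made-up numbers, no real intervention inside).
control_pre = [100, 110, 105, 120, 115, 108, 112, 118]
outcome_pre = [211, 229, 220, 251, 240, 225, 235, 245]
n_train = 5  # pretend the intervention happened after the fifth observation

# Fit on the first part of the pre-period, predict the remaining "placebo" window.
intercept, slope = fit_line(control_pre[:n_train], outcome_pre[:n_train])
placebo_effects = [
    y - (intercept + slope * x)
    for x, y in zip(control_pre[n_train:], outcome_pre[n_train:])
]
mean_placebo_effect = mean(placebo_effects)

print("placebo effects:", placebo_effects)
print("mean placebo effect:", mean_placebo_effect)
```

A placebo effect that is large relative to the noise in the outcome would suggest that the predictors track the outcome only by coincidence, or that the model is misspecified.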

  9. Different Approaches of Program Evaluation 1. Run an experiment and use a simple differences estimator. 2. Use observational data to construct the counterfactual: a. Selection on observables: – Unconfoundedness assumption: we assume that we observe all X variables that affect both the participation decision and the outcome. • Differences-in-Differences (DID) • Matching • Regression discontinuity (RD) b. Selection on unobservables: – We assume participation depends on unobserved variables. • Instrumental variable estimation • Control function approach

  10. Use Observational Data to Construct the Counterfactuals • How to construct the counterfactual when we cannot simply use a differences estimator (DE)? • NOTE: we cannot simply use the DE because of unobservables affecting the treatment and the control groups differently over time. 1. Selection on observables: – Unconfoundedness assumption (UA): we assume that we observe all X variables that affect both the participation decision and the outcome. – Differences-in-Differences (DID): if the UA is satisfied, we can control for all relevant Xs and make sure the common trend assumption is satisfied. • Common trend assumption: absent treatment, the change in the treated outcome would have been the same as the change in the non-treated outcome. – Matching – Regression discontinuity (RD)
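The common trend assumption stated in words above can be written compactly as follows. The notation is an assumption introduced here for illustration: $Y_{0,t}$ is the outcome without treatment in period $t \in \{\text{before}, \text{after}\}$, and $D$ indicates the treated group.

```latex
% Common trend: absent treatment, both groups would have changed by the same amount
E\left[ Y_{0,\text{after}} - Y_{0,\text{before}} \mid D = 1 \right]
  = E\left[ Y_{0,\text{after}} - Y_{0,\text{before}} \mid D = 0 \right]

% Under this assumption, the DID estimand is
\delta_{\text{DID}}
  = \left( E[Y_{\text{after}} \mid D = 1] - E[Y_{\text{before}} \mid D = 1] \right)
  - \left( E[Y_{\text{after}} \mid D = 0] - E[Y_{\text{before}} \mid D = 0] \right)
```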

  11. Matching Estimator • Intuitive idea: we use a group of observed variables Z to form matches between individuals in the treatment and control groups. • Example: Angrist (1998) uses matching to estimate the effect of military service on earnings later in life. – Treatment: D=1 if someone was in the military. – Matching idea: conditional on the observed variables Z that are used to select soldiers (age, schooling and test scores), having been in the military is independent of potential future earnings. – Matching estimator: conditional on Z, the impact of military service on earnings can be estimated by comparing the earnings of those who were in the military to the earnings of those who were not.

  12. Matching Estimator • The matching method compares the outcomes of treated participants with those of matched non-treated individuals, where matches are chosen on the basis of similarity in observed characteristics. – Covariate-specific treatment-control comparisons, weighted using the distribution of covariates among the treated. • Main advantage of matching estimators: they typically do not require specifying a functional form for the outcome equation and are therefore not susceptible to bias from a wrong functional form.
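The "covariate-specific comparison weighted by the distribution of covariates among the treated" can be sketched with exact matching on a single discrete covariate. The records, the cell labels, and the helper name `att_exact_matching` are hypothetical; real applications usually match on several covariates or on a propensity score.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (z, d, y) = (covariate cell, treatment indicator, outcome).
records = [
    ("low", 1, 12.0), ("low", 1, 14.0), ("low", 0, 10.0), ("low", 0, 12.0),
    ("high", 1, 23.0), ("high", 0, 19.0), ("high", 0, 17.0), ("high", 0, 18.0),
]


def att_exact_matching(rows):
    """Average effect on the treated from within-cell treated-control differences."""
    cells = defaultdict(lambda: {0: [], 1: []})
    for z, d, y in rows:
        cells[z][d].append(y)

    weighted_sum, n_treated = 0.0, 0
    for groups in cells.values():
        if not groups[0] or not groups[1]:
            continue  # cell lacks treated or control units: no match available
        diff = mean(groups[1]) - mean(groups[0])
        weighted_sum += len(groups[1]) * diff  # weight by number of treated in the cell
        n_treated += len(groups[1])
    return weighted_sum / n_treated


att = att_exact_matching(records)
print(f"Matching estimate of the effect on the treated: {att:.2f}")
```

No functional form for the outcome is specified anywhere: the estimate is built only from cell means, which is the advantage the slide points to.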

  13. Assumptions of Matching Approach • Assume you have access to data on treated and untreated individuals (D=1 and D=0) • Assume you also have access to a set of Z variables whose distribution, F(.), is not affected by D: F(Z|D,Y1,Y0)=F(Z|Y1,Y0)

  15. Assumptions of Matching Approach 1. Selection on Observables (Unconfoundedness Assumption) – There exists a set of observed characteristics Z such that outcomes are independent of treatment conditional on Z, i.e. treatment assignment is “strictly ignorable” given Z (Rosenbaum and Rubin 1983). 2. Common Support Assumption – Assumption 2 is required so that matches for D=0 and D=1 observations can effectively be found.
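The formal statements are not reproduced in this transcript. In the standard notation of the matching literature, given here as a sketch and using the symbols $Y_0$, $Y_1$, $D$ and $Z$ from the slides, the two assumptions are usually written as:

```latex
% Assumption 1 (unconfoundedness): potential outcomes are independent of treatment given Z
(Y_0, Y_1) \perp D \mid Z

% Assumption 2 (common support): every value of Z occurs among both treated and untreated
0 < \Pr(D = 1 \mid Z) < 1
```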

  18. Implications of Assumptions • If Assumptions 1 and 2 are satisfied, then the problem of determining the average treatment effect can be solved by substituting the Y0 distribution observed for “matched-on-Z non-participants” for the missing Y0 distribution of participants. • For Assumption 1 to hold, individuals cannot select into the program based on anticipated treatment impact. • Assumption 1 implies:
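The slide's own formula is not reproduced in this transcript. One standard consequence of conditional independence, written out here as a sketch rather than as the author's exact expression, is that the untreated outcome has the same conditional mean in both groups, which is what justifies the substitution described above:

```latex
% Mean independence of the untreated outcome, given Z
E[Y_0 \mid D = 1, Z] = E[Y_0 \mid D = 0, Z]

% Hence the average effect on the treated is identified from observed outcomes
E[Y_1 - Y_0 \mid D = 1]
  = E_{Z \mid D = 1}\left[ E[Y \mid D = 1, Z] - E[Y \mid D = 0, Z] \right]
```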
