


  1. Social Experiments. Handbook of Econometrics. James J. Heckman and Edward J. Vytlacil, The University of Chicago and Stanford University. Handout Draft, July 28, 2004.

  2. 1 Social Experiments. 1.1 Introduction. Consider ideal experiments with no compliance or attrition problems. There are two distinct cases for the application of the method of randomized trials. The first case advocates randomization to identify structural parameters. The second and more recent case seeks to use randomization to identify treatment parameters.

  3. 1.2 Treatment Effects vs. Structural Parameters. Marschak (1953): the goal of structural estimation is to solve a variety of decision problems (see footnote 1). Decision problems:
     (a) evaluating the effectiveness of an existing policy,
     (b) projecting the effectiveness of a policy to different environments from the one where it was experienced,
     (c) forecasting the effects of a new policy, never previously experienced.
     Footnote 1: Recall the opening sentence of his seminal article: “Knowledge is useful if it helps us make the best decisions” (Marschak, 1953, p. 1).

  4. Marschak (1953) realized that for certain decision problems, knowledge of individual structural parameters, or of any structural parameter, is unnecessary. A second, and neglected, contribution of his paper is the notion of decision-specific parameters. Prototypical problem: determining the impact of taxes on labor supply $h$.

  5. Interior solution labor supply equation: $h$ is hours of work, $W$ is wages, $X$ is other variables including assets, and $\varepsilon$ denotes an unobservable.
     (3-1) $h = h(W, X, \varepsilon)$.
     Additively separable version of the Marshallian causal function:
     (3-2) $h = h(W, X) + \varepsilon$.

  6. Ceteris paribus effects of $W$ and $X$ on $h$:
     (3-1a) $h = h(W, X, \varepsilon, \theta)$,
     where $\theta$ is a low-dimensional parameter that generates $h$. Separable version:
     (3-2a) $h = h(W, X, \theta) + \varepsilon$.
     A linear-in-parameters Cowles Commission type representation of $h$:
     (3-3) $h = \alpha' X + \beta \ln W + \varepsilon$.

  7. Distinguish three cases.
     (1) The tax $t$ has been implemented in the past and we wish to forecast the effects of the tax in the future in a population with the same distribution of $(X, \varepsilon)$ variables as prevailed when measurements of tax variation were made.
     (2) The tax $t$ has been implemented in the past but we wish to project the effects of the same tax to a different population of $(X, \varepsilon)$ variables.
     (3) The tax has never been implemented and we wish to forecast the effect of a tax either on an initial population used to estimate (1) or on a different population.

  8. Suppose the goal of the analysis is to determine the effect of taxes on average labor supply in a relevant population with distribution $G(W, X, \varepsilon)$. Case One. In case one, we have data from the same population for which we wish to construct a forecast. In the randomized trial, persons face tax rate $t_j$ with $\Pr(T = t_j \mid X, W, \varepsilon) = \Pr(T = t_j \mid X, W)$.

  9. From each regime we can identify
     (3-4) $E(h \mid W, X, t_j) = \int h(W(1 - t_j), X, \varepsilon)\, dG(\varepsilon \mid X, W, t_j)$.
     For the entire population:
     (3-4a) $E(h \mid t_j) = \int h(W(1 - t_j), X, \varepsilon)\, dG(\varepsilon, X, W \mid t_j)$.

  10. Knowledge of (3-4) or (3-4a) from the historical data suffices for case one: no knowledge of any Marshallian causal function or structural parameter is required to do policy analysis. Case Two. Case two resembles case one except for one crucial difference. To project the same policy onto a different population, it is necessary to break (3-4) or (3-4a) into its components and determine $h(W(1 - t_j), X, \varepsilon)$ separately from $G(\varepsilon, X, W, t)$.

  11. Assumptions Required to Project. (1) Knowledge of $h(\cdot)$ is needed on the new population. This may entail determination of $h$ on a different support from that used to determine $h$ in the target population. Structural estimation comes into its own: parametric structure (3-3). (2) Knowledge of $G(\cdot)$ for the target population is also required. Exogeneity enters as a crucial facilitating assumption:
     (A-1) $(X, W, T) \perp\!\!\!\perp \varepsilon$.

  12. Under (A-1), $G(\varepsilon \mid X, W, T) = G(\varepsilon)$ (see footnote 2). Assume the distribution of unobservables is the same in the sample as in the forecast or target regime, $G(\varepsilon) = G'(\varepsilon)$, where $G'(\varepsilon)$ is the distribution of unobservables in the target regime. Then
     $E(h \mid W, X, t_j) = \int h(W(1 - t_j), X, \varepsilon)\, dG(\varepsilon)$.
     Can determine $h(\cdot)$ over the new support of $X$. If $G' \neq G$, we face a new problem.
     Footnote 2: There are many definitions of exogeneity. Assumption (A-1) is often supplemented by the additional assumption that the distribution of $X$ does not depend on the parameters of the model (e.g. $\theta$ in (2') or (1')).

  13. Case Three. Third case: knowledge of the target population. Taxes operate through the term $W(1 - t)$. No wage variation in samples ... If wages vary in the presample period, the support of $W(1 - t) \overset{\text{def}}{=} W^*$ in the target regime is contained in the support of $W$ in the historical regime, the conditional distributions of $W^*$ and $W$ given $X, \varepsilon$ are the same, and the supports of $(X, \varepsilon)$ are the same in both regimes:
     (a) $\text{Support}(W^*)_{\text{target}} \subseteq \text{Support}(W)_{\text{historical}}$,
     (b) $G(w^* \mid X, \varepsilon)_{\text{target}} = G(w \mid X, \varepsilon)_{\text{historical}}$,
     (c) $\text{Support}(X, \varepsilon)_{\text{target}} = \text{Support}(X, \varepsilon)_{\text{historical}}$.

  14. $W^* = W(1 - t)$. Under (a), can find a counterpart value of $W(1 - t) = W^*$ in the target population. If these conditions are not met, it is necessary to build up the $G$ and the $h$ functions over the new supports using the appropriate distributions. We then enter the realm where structural estimation is required.

  15. 1.3 Two Different Cases For Social Experiments. Demonstrate the contrasting nature of the two cases for social experiments. Present a form of experimentation that identifies ... The first, historically older, case seeks to use randomization to identify Marshallian causal functions. “Treatment” is a tax policy: a proportional tax on wages (see footnote 3).
     Footnote 3: Historically, randomization was first used in economics to vary wage and income parameters facing individuals in order to estimate wage and income effects in labor supply, to examine the implications of negative income taxes on labor supply. Part of the goal of randomization was to produce variation in wages and incomes to determine estimates of income and substitution effects. See Cain and Watts (1973). Ashenfelter (1983) shows how estimates of income and substitution effects can be used to estimate the impact of negative income taxes on labor supply.

  16. Determine how labor supply responds to taxes $t$ in an experimentally determined population. Labor supply equation is $h = h(t, W, X, \varepsilon)$; taxes $T$ are assigned to persons so that
     (A-5) $(T \perp\!\!\!\perp \varepsilon) \mid (W, X)$.

  17. Thus $\Pr(T = t \mid W, X, \varepsilon) = \Pr(T = t \mid W, X)$. Assuming full compliance, compute the labor supply given $t$ (“treatment” or taxes) as
     $E(h \mid t, W, X) = \int h(t, W, X, \varepsilon)\, dG(\varepsilon \mid t, W, X) = \int h(t, W, X, \varepsilon)\, dG(\varepsilon \mid W, X)$.
     For tax rate $t'$,
     $E(h \mid t', W, X) = \int h(t', W, X, \varepsilon)\, dG(\varepsilon \mid W, X)$.

  18. Form contrast:
     $E(h \mid t, W, X) - E(h \mid t', W, X) = \int [h(t, W, X, \varepsilon) - h(t', W, X, \varepsilon)]\, dG(\varepsilon \mid W, X)$.
     May remove the conditioning on $(W, X)$ by integrating out $(W, X)$. Population average treatment effect for taxes $(t, t')$ is
     $E_{F_c}(h \mid t) - E_{F_c}(h \mid t') = \int [E(h \mid t, W, X) - E(h \mid t', W, X)]\, dF_c(W, X)$.

  19. Applying the results of the experiment to a new population, or forecasting the effects of tax rates not previously experienced, requires the same types of adjustments described in Lecture 2. Decompose $E(h \mid t, W, X)$ into $h(\cdot)$ and $G(\cdot)$. Additive Separability Helps. If $h = h(W, t, X) + \varepsilon$, then
     $E(h \mid W, X, t) - E(h \mid W, X, t') = h(W, X, t) - h(W, X, t')$.
     Treatment effect is the difference between two Marshallian causal functions.

  20. Specialize the Marshallian causal functions:
     $h = \alpha_0 + \alpha_1 \ln(W(1 - t)) + \alpha_2' X + \varepsilon = \alpha_0 + \alpha_1 \ln W + \alpha_1 \ln(1 - t) + \alpha_2' X + \varepsilon$.
     $E(h \mid W, t, X) - E(h \mid W, t', X) = \alpha_1 [\ln(1 - t) - \ln(1 - t')]$.
     $\alpha_1$ is identifiable from the treatment effects. Randomization governed by (A-5) does not identify $\alpha_2$. Generally, $\varepsilon$ and $W$ are stochastically dependent; variation induced in $T$ by a randomization that implements (A-5) does not make $W$ or $X$ exogenous.

  21. Social experiments only identify treatment terms and terms that interact with treatment. Main effects for $(W, X)$ are not identified. Consider the additively separable case $h(W, X, t, \varepsilon) = h(W, X, t) + \varepsilon$. Under it we can recover $h(W, X, t) - h(W, X, t')$. Decompose $h(W, X, t)$ into a main effect $\varphi(W, X)$ and an interaction term plus main effect for treatment, $\eta(W, X, t)$. May write $h(W, X, t) = \varphi(W, X) + \eta(W, X, t)$. $\varphi(W, X)$ differences out of all contrasts. Only differences in $\eta(W, X, t)$ can be identified.

  22. Randomization identifies the treatment effect by balancing the bias (not by creating exogeneity between “right hand” variables and error term and identifying Marshallian causal parameters). A consequence of (A-5): $E(\varepsilon \mid t', W, X) = E(\varepsilon \mid t, W, X)$. “Control functions” (or conditional bias terms) are balanced.

  23. If we seek to project the findings from one experiment to a new population with the same ..., the task is greatly simplified by assuming $\varepsilon \perp\!\!\!\perp (W, X)$. It is then no longer necessary to determine the distribution of $\varepsilon$ given $W, X$, i.e. $G(\varepsilon \mid W, X)$.
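Support condition (a) on slide 13 can be checked mechanically once wage samples are in hand. The following is a minimal Python sketch, not a procedure from the handout: the function names `after_tax_wages` and `support_contained` are hypothetical, and the support of $W$ is approximated by the range of observed values, which is an assumption that ignores any gaps inside that range. Conditions (b) and (c) involve the unobservable $\varepsilon$ and are not checked here.

```python
def after_tax_wages(wages, t):
    """Return W* = W(1 - t) for each wage under a proportional tax rate t."""
    return [w * (1.0 - t) for w in wages]


def support_contained(w_historical, w_target, t):
    """Check condition (a): range of W* in the target regime lies inside
    the range of W observed in the historical regime.

    Assumption: the support is approximated by [min, max] of the sample.
    """
    w_star = after_tax_wages(w_target, t)
    lo, hi = min(w_historical), max(w_historical)
    return lo <= min(w_star) and max(w_star) <= hi


# Hypothetical hourly wages, for illustration only.
HISTORICAL_WAGES = [5.0, 10.0, 20.0, 40.0]
TARGET_WAGES = [10.0, 20.0, 40.0]
```

With these made-up wages, a 25 percent tax keeps every after-tax wage inside the historical range, while a 60 percent tax pushes the lowest after-tax wage below it, which is the situation where slide 14 says structural estimation is required.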
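The integration over $(W, X)$ on slide 18 reduces to a weighted sum when $(W, X)$ takes finitely many values. The sketch below works under that discreteness assumption. The cell labels, conditional means, and weights are invented for illustration, and the function names are hypothetical.

```python
def conditional_contrasts(mean_h_t, mean_h_t_prime):
    """E(h | t, W, X) - E(h | t', W, X) for each (W, X) cell."""
    return {cell: mean_h_t[cell] - mean_h_t_prime[cell] for cell in mean_h_t}


def population_ate(contrasts, weights):
    """Integrate the conditional contrasts against cell weights F_c(W, X)."""
    total = sum(weights[cell] for cell in contrasts)
    return sum(contrasts[cell] * weights[cell] for cell in contrasts) / total


# Hypothetical mean weekly hours by wage cell under tax rates t and t'.
MEAN_H_T = {"low wage": 30.0, "high wage": 38.0}
MEAN_H_T_PRIME = {"low wage": 34.0, "high wage": 40.0}

CONTRASTS = conditional_contrasts(MEAN_H_T, MEAN_H_T_PRIME)
# Equal cell weights versus a population with more high-wage workers.
ATE_EQUAL_WEIGHTS = population_ate(CONTRASTS, {"low wage": 0.5, "high wage": 0.5})
ATE_REWEIGHTED = population_ate(CONTRASTS, {"low wage": 0.25, "high wage": 0.75})
```

The same conditional contrasts give different population averages under different weights. Reweighting in this way is only a valid projection if the conditional contrasts themselves carry over to the new population.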
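The identification claims on slides 20 and 22 can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' code: the parameter values, the normal distributions, the way $\varepsilon$ is made to depend on $X$ and $\ln W$, and the function names are all hypothetical. The tax rate is assigned by a coin flip, which is stronger than (A-5) but implies it. Because the contrast $\alpha_1[\ln(1-t) - \ln(1-t')]$ does not depend on $(W, X)$, the unconditional difference in mean hours between the two arms is used.

```python
import math
import random


def simulate_experiment(n=200_000, a0=1.0, a1=0.5, a2=0.3,
                        t=0.5, t_prime=0.1, seed=0):
    """Draw (treated, x, h, eps) with T randomized but W, X endogenous."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        ln_w = 2.0 + 0.5 * rng.gauss(0.0, 1.0)
        # eps depends on X and ln W, so neither regressor is exogenous.
        eps = 0.8 * x + 0.6 * (ln_w - 2.0) + rng.gauss(0.0, 1.0)
        # Coin-flip assignment: T is independent of (W, X, eps).
        treated = rng.random() < 0.5
        tax = t if treated else t_prime
        h = a0 + a1 * (ln_w + math.log(1.0 - tax)) + a2 * x + eps
        rows.append((treated, x, h, eps))
    return rows


def mean(values):
    return sum(values) / len(values)


def summarize(rows, t=0.5, t_prime=0.1):
    """Treatment-contrast estimate of alpha_1, naive slope on X, eps balance."""
    h_t = [h for treated, _, h, _ in rows if treated]
    h_tp = [h for treated, _, h, _ in rows if not treated]
    eps_t = [e for treated, _, _, e in rows if treated]
    eps_tp = [e for treated, _, _, e in rows if not treated]
    # Contrast divided by ln(1 - t) - ln(1 - t') recovers alpha_1.
    a1_hat = (mean(h_t) - mean(h_tp)) / (math.log(1.0 - t) - math.log(1.0 - t_prime))
    # Least-squares slope of h on x alone: alpha_2 plus the dependence of eps on x.
    xs = [x for _, x, _, _ in rows]
    hs = [h for _, _, h, _ in rows]
    x_bar, h_bar = mean(xs), mean(hs)
    slope = (sum((x - x_bar) * (h - h_bar) for x, h in zip(xs, hs))
             / sum((x - x_bar) ** 2 for x in xs))
    # eps is observable here only because the data are simulated.
    return {"a1_hat": a1_hat, "naive_x_slope": slope,
            "eps_gap": mean(eps_t) - mean(eps_tp)}
```

Calling `summarize(simulate_experiment())` should give `a1_hat` close to the assumed 0.5 and a mean gap in $\varepsilon$ between arms close to zero, while the slope on $X$ settles near 1.1 rather than the assumed $\alpha_2 = 0.3$. Randomization balances the bias across arms without making $X$ exogenous.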
