Econ 2148, fall 2019: Instrumental variables II, continuous treatment

Maximilian Kasy
Department of Economics, Harvard University

Recall instrumental variables part I
◮ Origins of instrumental variables: systems of linear structural equations. Strong restriction: constant causal effects.
◮ Modern perspective: potential outcomes, allowing for heterogeneity of causal effects.
◮ Binary case:
  1. Keep the IV estimand and reinterpret it in a more general setting: Local Average Treatment Effect (LATE).
  2. Keep the average treatment effect (ATE) as the object of interest: partial identification (bounds).

Agenda instrumental variables part II
◮ Continuous treatment case:
  1. Restricting heterogeneity in the structural equation: nonparametric IV (conditional moment equalities).
  2. Restricting heterogeneity in the first stage: control functions.
  3. Linear IV: continuous version of LATE.

Takeaways for this part of class
◮ We can write linear IV in three numerically equivalent ways:
  1. As the ratio $\operatorname{Cov}(Z, Y) / \operatorname{Cov}(Z, X)$.
  2. As a regression of $Y$ on the first stage predicted values $\hat{X}$.
  3. As a regression of $Y$ on $X$, controlling for the first stage residual $V$.
◮ The literature on IV identification with continuous treatment generalizes these ideas to non-linear settings.

Takeaways continued
1. Moment restrictions:
  ◮ Assume one-dimensional additive heterogeneity in the structural equation of interest.
  ◮ ⇒ nonparametric regression of $Y$ on the nonparametric prediction $\hat{X}$.
2. Control functions:
  ◮ Assume one-dimensional heterogeneity in the first stage relationship.
  ◮ ⇒ $X$ is independent of structural heterogeneity conditional on $V = F_{X|Z}(X \mid Z)$.
3. Continuous LATE:
  ◮ No restrictions on heterogeneity.
  ◮ Interpret the linear IV coefficient as a weighted average derivative.

Alternative ways of writing the linear IV estimand
◮ Linear triangular system:
  $$Y = \beta_0 + \beta_1 X + U$$
  $$X = \gamma_0 + \gamma_1 Z + V$$
◮ Exogeneity (randomization) conditions: $\operatorname{Cov}(Z, U) = 0$, $\operatorname{Cov}(Z, V) = 0$.
◮ Relevance condition: $\operatorname{Cov}(Z, X) = \gamma_1 \operatorname{Var}(Z) \neq 0$.
◮ Under these conditions,
  $$\beta_1 = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}.$$

Moment conditions
◮ Write $\operatorname{Cov}(Z, U) = 0$ as $\operatorname{Cov}(Z, Y - \beta_0 - \beta_1 X) = 0$.
◮ Let $\hat{X}$ be the predicted value from a first stage regression, $\hat{X} = \gamma_0 + \gamma_1 Z$.
◮ Multiply $\operatorname{Cov}(Z, U)$ by $\gamma_1$ to obtain $\operatorname{Cov}(\hat{X}, Y - \beta_0 - \beta_1 X) = 0$, and note $\operatorname{Cov}(\hat{X}, X) = \operatorname{Var}(\hat{X})$, to get
  $$\beta_1 = \frac{\operatorname{Cov}(\hat{X}, Y)}{\operatorname{Var}(\hat{X})}.$$
◮ ⇒ two-stage least squares!

Conditional moment equalities
◮ Under the stronger mean independence restriction $E[U \mid Z] \equiv 0$,
  $$0 = E[(Y - \beta_0 - \beta_1 X) \mid Z = z] = E[Y \mid Z = z] - \beta_0 - \beta_1 E[X \mid Z = z]$$
  for all $z$.
◮ This is a "conditional moment equality."
◮ It suggests a two-stage estimator:
  1. Regress both $Y$ and $X$ (nonparametrically or linearly) on $Z$.
  2. Then regress $E[Y \mid Z = z]$ or $Y$ (linearly) on $E[X \mid Z = z]$.
◮ ⇒ two-stage least squares!

Control function perspective
◮ $V$ is the residual of a first stage regression of $X$ on $Z$.
◮ Consider a regression of $Y$ on $X$ and $V$,
  $$Y = \delta_0 + \delta_1 X + \delta_2 V + W.$$
◮ Partial regression formula:
  ◮ $\delta_1$ is the coefficient of a regression of $\tilde{Y}$ on $\tilde{X}$ (or of $Y$ on $\tilde{X}$),
  ◮ where $\tilde{Y}$, $\tilde{X}$ are the residuals of regressions on $V$.
◮ By construction:
  $$\tilde{X} = \gamma_0 + \gamma_1 Z = \hat{X}$$
  $$\tilde{Y} = \beta_0 + \beta_1 \tilde{X} + \tilde{U}$$
◮ $\operatorname{Cov}(Z, U) = \operatorname{Cov}(Z, V) = 0$ implies $\operatorname{Cov}(\tilde{X}, \tilde{U}) = 0$, and thus $\delta_1 = \beta_1$.

Recap
◮ Three numerically equivalent estimands:
  1. The slope $\operatorname{Cov}(Z, Y) / \operatorname{Cov}(Z, X)$.
  2. The two-stage least squares slope from the regression
     $$Y = \beta_0 + \beta_1 \hat{X} + \tilde{U},$$
     where $\tilde{U} = (\beta_1 V + U)$, and $\hat{X}$ is the first stage predicted value $\hat{X} = \gamma_0 + \gamma_1 Z$.
  3. The slope of the regression with control
     $$Y = \delta_0 + \delta_1 X + \delta_2 V + W,$$
     where the control function $V$ is given by the first stage residual, $V = X - \gamma_0 - \gamma_1 Z$.
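The numerical equivalence in the recap can be checked on simulated data. The following is a minimal sketch, not part of the original slides: the data-generating process, all parameter values, and the helper `ols` are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical linear triangular system (all coefficients are assumptions).
z = rng.normal(size=n)
v = rng.normal(size=n)
u = 0.8 * v + rng.normal(size=n)   # U is correlated with V, so X is endogenous
x = 1.0 + 0.5 * z + v              # first stage
y = 2.0 + 1.5 * x + u              # structural equation, beta_1 = 1.5


def ols(columns, target):
    """Least-squares coefficients of target on a constant and the given columns."""
    design = np.column_stack([np.ones(len(target)), *columns])
    return np.linalg.lstsq(design, target, rcond=None)[0]


# 1. Ratio of sample covariances Cov(Z, Y) / Cov(Z, X).
beta_ratio = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

# 2. Two-stage least squares: regress Y on the first stage fitted values.
gamma = ols([z], x)
x_hat = gamma[0] + gamma[1] * z
beta_2sls = ols([x_hat], y)[1]

# 3. Control function: regress Y on X and the first stage residual.
v_hat = x - x_hat
beta_cf = ols([x, v_hat], y)[1]

print(beta_ratio, beta_2sls, beta_cf)
```

In sample, the three numbers agree up to floating-point error, whereas a plain regression of `y` on `x` would be biased in this design because `u` and `v` are correlated.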

Roadmap
◮ Nonparametric IV estimators generalize these approaches in different ways, dropping the linearity assumptions:
  1. If heterogeneity in the structural equation is one-dimensional: conditional moment equalities.
  2. If heterogeneity in the first stage is one-dimensional: control functions.
  3. Without heterogeneity restrictions: continuous versions of the LATE result for the linear IV estimand.
◮ Objects of interest:
  ◮ Average structural function (ASF): $\bar{g}(x) = E[g(x, U)]$.
  ◮ Quantile structural function (QSF): $g^{\tau}(x)$, defined by $P(g(x, U) < g^{\tau}(x)) = \tau$.
  ◮ Weighted averages of the marginal causal effect: $\int E[\omega_x \cdot g'(x, U)] \, dx$ for weights $\omega_x$.

Moment restrictions: Approach I, conditional moment restrictions (nonparametric IV)
◮ Consider the following generalization of the linear model:
  $$Y = g(X) + U$$
  $$X = h(Z, V)$$
  $$Z \perp (U, V)$$
◮ Here the ASF $\bar{g}$ equals $g$.
Practice problem
◮ Under these assumptions, write out the conditional expectation $E[Y \mid Z = z]$ as an integral with respect to $dP(X \mid Z = z)$.
◮ Consider the special case where both $X$ and $Z$ have finite support of size $n_x$ and $n_z$, and rewrite the integral as a matrix multiplication.

Moment restrictions: Solution
◮ Using additivity of the structural equation, and independence,
  $$k(z) = E[Y \mid Z = z] = E[g(X) \mid Z = z] + E[U \mid Z = z] = E[g(X) \mid Z = z] = \int g(x) \, dP(X = x \mid Z = z).$$
◮ In the finite support case, let
  ◮ $k = (k(z_1), \ldots, k(z_{n_z}))$, $g = (g(x_1), \ldots, g(x_{n_x}))$,
  ◮ and let $P$ be the $n_z \times n_x$ matrix with entries $P(X = x \mid Z = z)$.
◮ Then the integral equation can be written as $k = P \cdot g$.

Moment restrictions: Completeness
◮ The function $k(z) = E[Y \mid Z = z]$ and the conditional distribution $P_{X|Z}$ are identified.
◮ In the finite-support case, the equation $k = P \cdot g$ implies that $g$ is identified if the matrix $P$ has full column rank $n_x$.
◮ The analogue of the full rank condition for the continuous case (integral equation) is called "completeness."
◮ Completeness requires that variation in $Z$ induces enough variation in $X$, like the "instrument relevance" condition in the linear case.
◮ Completeness is a feature of the observable distribution $P_{X|Z}$, in contrast to the conditions of exogeneity / exclusion, or restrictions on heterogeneity.
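A small numerical illustration of the finite-support rank condition may help here. This is a sketch under assumptions: the matrix `P`, the vector `g_true`, and the alternative `g_other` are made-up values, not taken from the slides.

```python
import numpy as np

# Hypothetical conditional distribution P(X = x_j | Z = z_i), n_z = 4, n_x = 3.
# Each row sums to one.
P = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.4, 0.4, 0.2],
])
g_true = np.array([1.0, -2.0, 0.5])   # hypothetical values g(x_1), g(x_2), g(x_3)
k = P @ g_true                        # k(z_i) = E[Y | Z = z_i]

# Full column rank: g is the unique solution of k = P g.
rank = np.linalg.matrix_rank(P)
g_recovered = np.linalg.lstsq(P, k, rcond=None)[0]

# Failure of the rank condition: Z carries no information about X,
# so every row of the matrix is the same distribution.
P_flat = np.tile(P[0], (4, 1))
g_other = g_true + np.array([1.0, -3.0, -1.0])   # shift lies in the null space of P_flat
same_k = np.allclose(P_flat @ g_true, P_flat @ g_other)

print(rank, g_recovered, np.linalg.matrix_rank(P_flat), same_k)
```

With the full-rank `P`, the recovered vector matches `g_true`. With `P_flat`, two different candidate functions produce the same observable `k`, so `g` is not identified.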

Moment restrictions: Ill posed inverse problem
◮ Even if completeness holds, estimation in the continuous case is complicated by the "ill posed inverse" problem.
◮ Consider the discrete case. The vector $g$ is identified from
  $$g = (P'P)^{-1} P'k.$$
◮ Suppose that $P'P$ has eigenvalues close to zero. Then $g$ is very sensitive to minor changes in $P'k$.
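The sensitivity described above can be made concrete in the discrete case. The sketch below uses a made-up matrix `P` whose last two columns are nearly identical, so that $P'P$ has an eigenvalue close to zero. The ridge (Tikhonov) penalty in `solve_g` is one common form of regularization, included here as an assumption for illustration; the slides do not prescribe a specific method.

```python
import numpy as np

eps = 1e-4
# Hypothetical P(X = x_j | Z = z_i); columns 2 and 3 differ only by eps.
P = np.array([
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.2, 0.4 + eps, 0.4 - eps],
    [0.6, 0.2 - eps, 0.2 + eps],
])
g_true = np.array([1.0, -2.0, 0.5])
k = P @ g_true


def solve_g(P, k, ridge=0.0):
    """Solve (P'P + ridge * I) g = P'k; ridge = 0 gives g = (P'P)^{-1} P'k."""
    A = P.T @ P + ridge * np.eye(P.shape[1])
    return np.linalg.solve(A, P.T @ k)


# A tiny perturbation of k, standing in for estimation error.
delta = np.array([0.0, 0.0, 1e-3, -1e-3])

min_eig = np.linalg.eigvalsh(P.T @ P).min()
g_clean = solve_g(P, k)
g_noisy = solve_g(P, k + delta)
amplification = np.linalg.norm(g_noisy - g_clean) / np.linalg.norm(delta)

# With a ridge penalty the solution is biased but far less sensitive.
ridge_shift = np.linalg.norm(
    solve_g(P, k + delta, ridge=1e-3) - solve_g(P, k, ridge=1e-3)
)

print(min_eig, amplification, ridge_shift)
```

The unregularized solution moves by several orders of magnitude more than the perturbation in `k`, while the ridge solution barely moves. The price of the penalty is bias, which is one reason convergence rates in this setting are slow.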

Moment restrictions
◮ Continuous analog: notation
  $$\tilde{k}(z) = E[Y \mid Z = z] \, f_Z(z)$$
  $$(P g)(z) = \int g(x) \, f_{X,Z}(x, z) \, dx$$
  $$(P' k)(x) = \int k(z) \, f_{X,Z}(x, z) \, dz$$
  $$T = P' \circ P$$
◮ Thus the moment conditions can be rewritten as $\tilde{k} = P g$, or $P' \tilde{k} = T g$.
◮ Therefore $g = T^{-1} P' \tilde{k}$, if the inverse of $T$ exists, which is equivalent to completeness.

Moment restrictions
◮ $T$ is a linear, self-adjoint (≈ symmetric), positive definite operator on $L^2$.
◮ Functional analysis: if $\int\!\!\int f_{X,Z}(x, z)^2 \, dx \, dz < \infty$, then
  ◮ 0 is the unique accumulation point of the eigenvalues of $T$,
  ◮ and the eigenvectors form an orthonormal basis of $L^2$.
◮ Implication: $g$ is not a continuous function of $P' \tilde{k}$ in $L^2$.
◮ Minor estimation errors for $\tilde{k}$ can translate into arbitrarily large estimation errors for $g$.
◮ Takeaway: estimation needs to use regularization, and convergence rates are slow.

Moment restrictions: Estimation using series
◮ Implementation is surprisingly simple.
◮ Use the series approximation $g(x) \approx \sum_{j=1}^{k} \beta_j \phi_j(x)$.
◮ Then we get
  $$E[\phi_{j'}(Z) Y] \approx \sum_{j=1}^{k} \beta_j E[\phi_{j'}(Z) \phi_j(X)],$$
◮ and thus
  $$\beta \approx \left( E[\phi_{j'}(Z) \phi_j(X)] \right)^{-1}_{j,j'} \left( E[\phi_{j'}(Z) Y] \right)_{j'}.$$
◮ Sample analog: two-stage least squares, where the regressors $\phi_j(X)$ are instrumented by the instruments $\phi_{j'}(Z)$.
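The series estimator can be sketched in a few lines. Everything specific below is an assumption made for illustration: the quadratic polynomial basis in `basis`, the structural function with coefficients `beta_true`, and the simulated design. The slides do not specify a basis or a number of terms.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical design: additive heterogeneity U, correlated with first stage V.
z = rng.normal(size=n)
v = rng.normal(size=n)
u = v + 0.5 * rng.normal(size=n)
x = z + v

beta_true = np.array([0.0, 1.0, 0.5])   # g(x) = x + 0.5 * x^2 (assumed)


def basis(t):
    """Polynomial basis phi_1(t) = 1, phi_2(t) = t, phi_3(t) = t^2."""
    return np.column_stack([np.ones_like(t), t, t**2])


y = basis(x) @ beta_true + u

phi_x = basis(x)   # regressors phi_j(X)
phi_z = basis(z)   # instruments phi_j'(Z)

# Sample analog of beta = (E[phi(Z) phi(X)'])^{-1} E[phi(Z) Y].
beta_iv = np.linalg.solve(phi_z.T @ phi_x / n, phi_z.T @ y / n)

# Naive series regression of Y on phi(X), ignoring endogeneity.
beta_ols = np.linalg.lstsq(phi_x, y, rcond=None)[0]

print(beta_iv, beta_ols)
```

In this design the instrumented coefficients should be close to `beta_true`, while the naive regression overstates the linear coefficient because `u` is correlated with `x`. With as many instruments as basis functions the system is just identified, so solving the moment equations directly coincides with two-stage least squares.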

Moment restrictions: Additive one-dimensional heterogeneity is crucial for conditional moment equality
◮ Consider the following non-additive example:
  $$Y = X^2 \cdot U$$
  $$X = Z + V$$
  $$(U, V) \sim N\left(0, \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}\right)$$
◮ Average structural function: $\bar{g}(x) = E[x^2 \cdot U] = 0$.
◮ The conditional moment equality is solved by $\tilde{g}(x) = x$:
  $$E[Y - \tilde{g}(X) \mid Z = z] = E[(Z + V)^2 U \mid Z = z] - z = 2 z E[VU] + E[V^2 U] - z = 0.$$
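The counterexample can be checked by Monte Carlo. The sketch below holds $Z$ fixed at a few values (the grid `z_values`, the sample size, and the seed are arbitrary assumptions) and averages over draws of $(U, V)$ from the stated bivariate normal distribution.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# (U, V) jointly normal with unit variances and covariance 0.5.
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
u, v = rng.multivariate_normal(np.zeros(2), cov, size=n).T

z_values = (-1.0, 0.0, 2.0)   # arbitrary evaluation points for Z
mean_y = {}
mean_residual = {}
for z in z_values:
    x = z + v                 # first stage X = Z + V with Z held at z
    y = x**2 * u              # non-additive structural equation
    mean_y[z] = y.mean()                 # approximates E[Y | Z = z] = z
    mean_residual[z] = (y - x).mean()    # approximates E[Y - g~(X) | Z = z] = 0

print(mean_y)
print(mean_residual)
```

The simulated $E[Y \mid Z = z]$ tracks $z$, so $\tilde{g}(x) = x$ satisfies the conditional moment equality even though the average structural function is identically zero.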
