
Strategy-proof estimators for simple regression By Javier Perote



  1. Strategy-proof estimators for simple regression By Javier Perote (University of Salamanca) and Juan Perote-Peña (University of Zaragoza)

  2. MOTIVATION
     • First, this is the continuation of a research project that introduces private information and strategic considerations into well-known “aggregation” and “decision” techniques such as:
       – Operations Research (PERT, queuing theory, linear programming, …)
       – Multicriteria decision making
       – Clustering techniques
       – Econometrics
     • Are these techniques “robust” to individual manipulation using the private information?

  3. MOTIVATION
     • Secondly, strategic data manipulation evokes the literature on “robustness” to random contamination and on outlier detection: most of the estimators proposed in that literature use the properties of the median to aggregate data
     • Interestingly, the median as an allocation device to aggregate information is strategy-proof in some contexts, i.e., when individuals have “single-peaked” preferences on a single dimension in public goods allocation problems
     • Can the incentives literature (from social choice theory) answer questions in econometrics?

  4. STRUCTURE OF THE PAPER
     • First, we argue that the informational problem can be very important in some econometric studies. Therefore, designing estimators that are robust to data manipulation can be useful
     • Secondly, we examine the most popular estimator, OLS, and show that it may lead to sample contamination (it is NOT robust)
     • Then, we propose a whole family of estimators for the simple regression case that can be proved to be immune to this kind of data contamination
     • Finally, we compare some of them with OLS in a Monte Carlo experiment

  5. WHAT KIND OF PROBLEM?
     • Some econometric problems use information reported or declared by agents or individuals, as in questionnaires. This is the agents' private information: it cannot be easily and costlessly observed or verified
     • The information extracted from the data is (or can be) used to allocate “something” or to assess policies that might be important to the agents
     • Therefore, the agents might be tempted to report false information if they think that the data-handling process can be profitably manipulated

  6. AN EXAMPLE
     • A big firm or a government department has a number of divisions (perhaps located in different regions)
     • Measures of the output “produced” by the divisions (for instance, the number of clients served in a month) cannot be verified without substantial costs (inventory costs, monitoring costs, etc.)
     • Therefore, the information about each division's output is privately owned by the division manager, who reports it to the firm's manager

  7. THE MODEL WITH THE EXAMPLE
     • Some of the inputs affecting each division's output are known to the planner (the firm's boss), maybe because the planner himself “allocated” them in the past (e.g., the number of workers in each division, the estimated demand in each region, the division's monthly budget, etc.)
     • $N = \{1, 2, \ldots, n\}$: set of divisions (= agents)
     • $\forall i, j \in N$: each agent is also an “observation”
     • $\forall i \in N$, $y_i$: division $i$'s measure of (true) output
     • $\forall i \in N$, $\tilde{y}_i$: division $i$'s reported output

  8. THE MODEL WITH THE EXAMPLE
     • $\forall i \in N$, $x_i$: publicly known explanatory variable
     • True data generating process: $y_i = \beta_0 + \beta_1 x_i + e_i$, where $i = 1, \ldots, n$ and $e_i$ is an i.i.d. $N(0, \sigma)$ random variable (error term or random shock)
     • Let $(X, Y)$ be the true sample, with
       $X = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_i \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}$ and $Y = \begin{bmatrix} y_1 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{bmatrix}$
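
The data generating process on this slide can be simulated in a few lines. The sketch below is only an illustration: the function name `simulate_true_sample`, the parameter values, and the reading of σ as the standard deviation of the error term are assumptions, not taken from the slides.

```python
import random

def simulate_true_sample(x, beta0, beta1, sigma, seed=0):
    """Draw y_i = beta0 + beta1 * x_i + e_i with i.i.d. normal errors e_i.

    Assumption: sigma is used as the standard deviation of e_i.
    """
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]

# Hypothetical inputs known to the planner, and hypothetical parameters.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = simulate_true_sample(x, beta0=1.0, beta1=2.0, sigma=0.5)

# The matrix X from the slide: a column of ones next to the x_i.
X = [[1.0, xi] for xi in x]
```

Setting `sigma=0.0` removes the shock, so the simulated values fall exactly on the line, which is a quick sanity check of the setup.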

  9. THE MODEL WITH THE EXAMPLE
     • A regression estimator is a function “$T$” of the sample $(X, Y)$: $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)' = T(X, Y)$
     • The estimated or predicted values of the response variable for each observation are generated as $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \ \ \forall i \in N$
     • The residuals $\hat{e}_i$, $\forall i \in N$, are the differences $\hat{e}_i = y_i - \hat{y}_i$
     • The most widely used estimator is the OLS one: $\hat{\beta}_{OLS} = \arg\min \sum_{i=1}^{n} \hat{e}_i^2, \ \ \forall (X, Y)$
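
For simple regression the OLS minimisation has the familiar closed form, which the following sketch implements. The function names `ols` and `residuals` and the sample values are illustrative choices, not part of the presentation.

```python
def ols(x, y):
    """Return (b0, b1) minimising the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1

def residuals(x, y, b0, b1):
    """Residuals e_i = y_i - (b0 + b1 * x_i)."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Hypothetical sample lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1 = ols(x, y)
```

On this noiseless sample the fit recovers the intercept 1 and slope 2, and all residuals vanish.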

  10. THE MODEL WITH THE EXAMPLE
      • When the true sample $(X, Y)$ is known to the planner, OLS is the minimum-variance unbiased estimator (good properties)
      • But when the true sample is unknown, the only information received by the planner is $(X, \tilde{Y})$ instead of $(X, Y)$. Applying OLS to the reported sample $(X, \tilde{Y})$ only maintains the good properties when no agent lies (i.e., $\tilde{Y} = Y$)
      • QUESTION: In which cases will the agents lie?

  11. THE MODEL WITH THE EXAMPLE
      • We must assume some “preferences” guiding the agents' declaring behaviour. We opt for the…
      • SINGLE-PEAKEDNESS ASSUMPTION:
      • Agent $i \in N$ with true response value $y_i$ has single-peaked preferences $R_i$ on the real line $E$ if:
      • (i) $\forall v \in E$, $v \neq y_i$: $\ y_i \, P_i \, v$
      • (ii) $\forall v, v'$ with $v' > v > 0$: $\ (y_i + v) \, P_i \, (y_i + v')$ and $(y_i - v) \, P_i \, (y_i - v')$
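
As an illustration (not taken from the slides), one familiar family of single-peaked preferences is generated by a distance-based utility $u_i$, where $u_i$ is a hypothetical name introduced here:

```latex
u_i(v) = -\,\lvert v - y_i \rvert ,
\qquad
v \; R_i \; v' \iff u_i(v) \ge u_i(v') .
```

Under this assumption the peak $y_i$ is strictly preferred to every other point, and on either side of the peak a point farther from $y_i$ is strictly worse. These preferences are also symmetric around the peak, which single-peakedness alone does not require.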

  12. EXAMPLE OF SINGLE-PEAKEDNESS
      • Possible single-peaked preferences for $i \in N$
      • [Figure: preference “intensity” plotted over the real line $E$ of predicted values $\hat{y}_i$, with the true value $y_i$ marked]

  13. THE MODEL WITH THE EXAMPLE
      • Let us use the partitioned notation $Y = (y_i, Y_{-i})$
      • Def: The regression estimator $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)' = T(X, \tilde{y}_i, \tilde{Y}_{-i})$ is manipulable at sample $(X, \tilde{Y}) \in Z$ by observation $i \in \{1, \ldots, n\}$ if $\exists R_i(y_i) \in \Re$, $\exists \tilde{y}_i \in E$, $\tilde{y}_i \neq y_i$, such that
        $[\hat{\beta}_0(X, \tilde{y}_i, \tilde{Y}_{-i}) + \hat{\beta}_1(X, \tilde{y}_i, \tilde{Y}_{-i}) \, x_i] \ P_i \ [\hat{\beta}_0(X, y_i, \tilde{Y}_{-i}) + \hat{\beta}_1(X, y_i, \tilde{Y}_{-i}) \, x_i]$
      • Def: The regression estimator $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)' = T(X, \tilde{y}_i, \tilde{Y}_{-i})$ is strategy-proof if it is NOT manipulable at any sample $(X, \tilde{Y}) \in Z$ for any observation $i \in \{1, \ldots, n\}$
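
The manipulability definition can be probed numerically. The sketch below searches a finite grid of reports for one that moves agent i's own prediction strictly closer to the true value than truthful reporting does. Several things here are assumptions made for illustration: the helper name `profitable_misreport`, the use of distance to the true value as the agent's (symmetric single-peaked) preference, and the sample values. A grid search that finds nothing does not prove strategy-proofness; it only fails to find a counterexample.

```python
def ols(x, y):
    """Closed-form OLS for simple regression; returns (b0, b1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def profitable_misreport(estimator, x, reports, i, y_true_i, candidates):
    """Return the candidate report that brings agent i's prediction closest
    to y_true_i, if it strictly beats truthful reporting; otherwise None.

    Assumption: agent i prefers predictions closer to y_true_i.
    """
    def gap(report_i):
        y = list(reports)
        y[i] = report_i                  # others' reports stay fixed
        b0, b1 = estimator(x, y)
        return abs(b0 + b1 * x[i] - y_true_i)

    best_report, best_gap = None, gap(y_true_i)
    for r in candidates:
        g = gap(r)
        if g < best_gap - 1e-12:         # strict improvement only
            best_report, best_gap = r, g
    return best_report

# Hypothetical sample: agent 2 (index 1) has true value 1.0.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_true = [1.0, 1.0, 3.0, 4.0, 5.0]
grid = [k / 10 for k in range(-50, 51)]
lie = profitable_misreport(ols, x, y_true, 1, y_true[1], grid)
```

With OLS as the estimator the search returns a report below the true value, so OLS is manipulable at this sample in the sense of the definition. An estimator that ignores the reports altogether, such as `lambda x, y: (0.0, 1.0)`, yields `None`.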

  14. SOME EXAMPLES
      • The workers' union's wage setting problem
      • [Figure: diagram with labels $L_i$, $w_i$, $\tilde{y}_i = w_i L_i + FB_i$ and $p_i q_i - r K_i$]

  15. SOME EXAMPLES
      • The efficiency frontier estimation problem
      • DGP: $\log r_i = \beta_0 + \beta_1 \log \sigma_i + e_i$
      • [Figure: $\log r_i$ plotted against $\log \sigma_i$, with a fitted line of intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$]

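  16. SOME EXAMPLES
      • The tax pay-as-you-go rates allocation problem
      • $t_i$: PAYG average tax rate; $I_i$: income
      • PAYG tax schedule: $t_i = \hat{\beta}_0 + \hat{\beta}_1 I_i$
      • [Figure: $t_i$ plotted against $I_i$, with rate levels 20% and 30% and income level $10,000 marked, and a fitted line of intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$]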

  17. OLS IS NOT STRATEGY-PROOF
      • Example: true response variables for 5 observations
      • [Figure: $y_i$ plotted against $x_i$, with observation 2 marked at $(x_2, y_2)$]

  18. OLS IS NOT STRATEGY-PROOF
      • Example: the OLS estimator generates the regression line
      • [Figure: $y_i$ plotted against $x_i$ with the OLS regression line, and observation 2 marked at $(x_2, y_2)$]

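  19. OLS IS NOT STRATEGY-PROOF
      • Example: agent 2 lies, $\tilde{y}_2 \neq y_2$
      • The regression line shifts slightly downwards
      • And the new prediction for $x_2$ is closer to the true $y_2$
      • By lying and under-reporting $y_2$, agent 2 can be better off
      • [Figure: $\tilde{y}_i$ plotted against $x_i$, showing $y_2$ and $\tilde{y}_2$ at $x_2$ and the shifted regression line]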
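
The effect described on this slide can be reproduced with a small numeric case. The five data points below are hypothetical (the slides do not give the values behind the figure), and the helper `ols` is an illustrative closed-form implementation.

```python
def ols(x, y):
    """Closed-form OLS for simple regression; returns (b0, b1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical true sample; agent 2 sits at x_2 = 2 with y_2 = 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_true = [1.0, 1.0, 3.0, 4.0, 5.0]

# Everyone tells the truth: the fitted line passes above agent 2's point.
b0, b1 = ols(x, y_true)
pred_truth = b0 + b1 * x[1]

# Agent 2 under-reports (0 instead of 1); the other reports are unchanged.
y_reported = [1.0, 0.0, 3.0, 4.0, 5.0]
c0, c1 = ols(x, y_reported)
pred_lie = c0 + c1 * x[1]

gap_truth = abs(pred_truth - y_true[1])  # distance from true y_2 when honest
gap_lie = abs(pred_lie - y_true[1])      # distance from true y_2 after lying
```

Here the prediction at $x_2$ is about 1.7 under truthful reporting and about 1.4 after the lie, so the gap to the true value 1 shrinks from roughly 0.7 to roughly 0.4.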

  20. A STRATEGY-PROOF ESTIMATOR
      • Only recommended for the case of $Z = (X, \tilde{Y})$ such that $x_i > 0 \ \forall i \in N$ and $\beta_0 = 0$
      • It is an extension of the median voter theorem: the MV estimator, defined as
        $\hat{\beta}_1 = \operatorname{med}_{i \in N} \left\{ \dfrac{\tilde{y}_i}{x_i} \right\}, \qquad \hat{\beta}_0 = \operatorname{med} \, (\tilde{y}_i - \hat{\beta}_1 x_i)$

  21. A STRATEGY-PROOF ESTIMATOR
      • The MV estimator of the previous slide, illustrated for the case of 5 observations
      • [Figure: $\tilde{y}_i$ plotted against $x_i$, with observation 2 marked at $(x_2, \tilde{y}_2)$]

  22. A STRATEGY-PROOF ESTIMATOR
      • The MV estimator of slide 20: $\hat{\beta}_1$ is the median of the slopes
      • [Figure: $\tilde{y}_i$ plotted against $x_i$]

  23. A STRATEGY-PROOF ESTIMATOR
      • The MV estimator of slide 20: $\hat{\beta}_1$ is the median of the slopes
      • [Figure: $\tilde{y}_i$ plotted against $x_i$, with the five slopes labelled 2, 1, 3, 4, 5]
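
A minimal sketch of the MV estimator as defined on slide 20, using the standard-library median. The function name `mv_estimator`, the sample values and the grid of candidate reports are assumptions for illustration. The check at the end only covers this one sample and a finite grid, with agent 2 assumed to prefer predictions closer to the true value; it is not a proof of strategy-proofness.

```python
from statistics import median

def mv_estimator(x, y_reported):
    """MV estimator: slope = median of y_i / x_i, then
    intercept = median of y_i - slope * x_i. Requires every x_i > 0."""
    if any(xi <= 0 for xi in x):
        raise ValueError("the MV estimator assumes x_i > 0 for all i")
    b1 = median(yi / xi for xi, yi in zip(x, y_reported))
    b0 = median(yi - b1 * xi for xi, yi in zip(x, y_reported))
    return b0, b1

# Hypothetical sample; agent 2 (index 1) has true value 1.0.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_true = [1.3, 1.0, 3.6, 3.2, 5.5]

b0, b1 = mv_estimator(x, y_true)
gap_truth = abs(b0 + b1 * x[1] - y_true[1])

def gap_after_report(r):
    """Distance between agent 2's prediction and true value when reporting r."""
    y = list(y_true)
    y[1] = r
    c0, c1 = mv_estimator(x, y)
    return abs(c0 + c1 * x[1] - y_true[1])

# No report on this grid brings the prediction closer than telling the truth.
grid = [k / 10 for k in range(-50, 101)]
no_gain = all(gap_after_report(r) >= gap_truth - 1e-9 for r in grid)
```

In this sample the median slope is 1.1. Under-reporting leaves agent 2's slope below the median, so the fitted line does not move, while over-reporting can only raise the prediction further above the true value.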
