A Notion of Sufficiency for Statistical Modelling of Interval Data

T. Augustin, E. Endres, M.E.G.V. Cattaneo, P. Fink, J. Plaß, U. Pötter, M. Seitz, G. Schollmeyer, A. Wiencierz. Durham, WPMSIIP 2016.


  1. A Notion of Sufficiency for Statistical Modelling of Interval Data. T. Augustin, E. Endres, M.E.G.V. Cattaneo, P. Fink, J. Plaß, U. Pötter, M. Seitz, G. Schollmeyer, A. Wiencierz. Durham, WPMSIIP 2016. (Augustin et al., A Notion of Sufficiency for Interval Data, 1 / 49)

  2. (1) Interval Data, (2) Reliable Inference instead of Overprecision, (3) Generalized Linear Models; Maximum Likelihood Estimation, (4) Collecting Regions from Estimating Equations, (5) Envelopes of Estimating Equations: One Dimensional Case, (6) Penalty Approach, (7) MLE-Equivalence, (8) Concluding Remarks

  3. Interval Data

  4. Interval Data. Interval data, and more generally “imprecise”, “coarse”, “messy”, “deficient” data, are quite common. There is an underlying true value that is not observed in the granularity originally intended: the epistemic point of view (cp., e.g., Couso & Dubois (2014, IJAR), Couso, Dubois & Sánchez (2014, Springer)). Finite precision of measurements; response effects like heaping; anonymization; compliance, increase of response rate; special case: missing data; categorical data: indecision between certain alternatives; matching of data. A better name would be “non-idealized data”.

  5. The two-layers perspective. [Diagram: ideal $Y_i$ and ideal $X_i$, related by effects; an observation model leads from each ideal variable to the observable $Y_i$ and observable $X_i$; the observables yield the data, on which inference is based.]

  6. Interval Data: Example. German General Social Survey (ALLBUS) 2010: 2827 observations from Germany in total, 2000 report personal income (30% missing). An additional 10% report only income brackets. [Figure: histogram of reported income; vertical axis: frequencies (0 to 100), horizontal axis: 0 to 8000.]

  7. Interval Data: Example. (1) We see heaping at 1000 €, 2000 €, ..., less so at 500 €, 1500 €, ... (2) Both heaping and grouping depend on the amount of income reported. (3) Missingness (some 20% of the data) might as well depend on the amount of income. Consequences: (1) Missingness, grouping, and heaping will rarely conform to the assumption of “coarsening at random” (CAR). (2) Missingness, grouping, and heaping add an additional type of uncertainty apart from classical statistical uncertainty. This uncertainty can’t be decreased by sampling more data. Use credible inference procedures that do not rely on unsustainable “assumptions”!

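  10. Probability Model. Joint distribution of exact and interval-valued random variables with marginal distributions $P$ (exact data) and $P^*$ (observable, e.g. coarsened, data). [Diagram: the underlying space $(\Omega, \mathring{\mathcal{F}}, \mathring{P})$ carries the exact variables $(X, Y)$, with values in the ideal, exact model $((\mathcal{X} \times \mathcal{Y}), \mathcal{F}, P)$, and the coarse variables $(\mathfrak{X}, \mathfrak{Y})$, with values in the deficiency model $((\mathcal{X}^* \times \mathcal{Y}^*), \mathcal{F}^*, P^*)$, where $\mathcal{X}^* \subset \mathcal{P}(\mathcal{X})$ and $\mathcal{Y}^* \subset \mathcal{P}(\mathcal{Y})$; the link between the two models is labelled “Assumptions”.] For coarse data: consistency condition (error freeness) $\Pr(X \in \mathfrak{X}, Y \in \mathfrak{Y}) = 1$.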
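
As a rough illustration of the deficiency model and the error-freeness condition, the following sketch coarsens exact values into brackets and checks that every exact value lies in its observed set. The bracket edges, the values, and the helper name `coarsen` are assumptions made up for this example; they are not taken from the talk.

```python
import bisect

# Hypothetical bracket edges (an assumption, not from the talk).
EDGES = [0.0, 1000.0, 2000.0, 3000.0, 8000.0]


def coarsen(y, edges=EDGES):
    """Map an exact value y to the closed bracket (lo, hi) that contains it."""
    if not edges[0] <= y <= edges[-1]:
        raise ValueError("value outside the assumed support")
    # Index of the last edge <= y, clamped so the top edge falls into the last bracket.
    i = min(bisect.bisect_right(edges, y) - 1, len(edges) - 2)
    return edges[i], edges[i + 1]


exact = [250.0, 1250.0, 2000.0, 7999.0]   # made-up exact values
observed = [coarsen(y) for y in exact]     # what the analyst gets to see

# Error-freeness: each exact value lies in its observed set.
error_free = all(lo <= y <= hi for y, (lo, hi) in zip(exact, observed))
print(observed)
print(error_free)
```

Here the coarsening depends only on the exact value itself; a mechanism of this kind is in general not “coarsening at random”.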

  11. Reliable Inference instead of Overprecision

  12. Interval Data: Representations. Epistemic point of view: Couso & Dubois (2014, IJAR), Couso, Dubois & Sánchez (2014, Springer). We represent interval-valued data as follows: $$\mathfrak{x} := [\underline{x}, \overline{x}] = \{ (x_1, \ldots, x_n) \mid \underline{x}_1 \le x_1 \le \overline{x}_1, \ldots, \underline{x}_n \le x_n \le \overline{x}_n \},$$ where it is assumed that the intervals contain the actual, underlying, “true” $x \in \mathfrak{x}$. Analogously for the $Y$-variable.
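
A minimal sketch of this representation, assuming the data are stored as two sequences of componentwise lower and upper bounds. The class name `IntervalVector` and the numbers are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class IntervalVector:
    """Interval-valued data given by componentwise lower and upper bounds."""
    lower: Sequence[float]
    upper: Sequence[float]

    def __post_init__(self):
        if len(self.lower) != len(self.upper):
            raise ValueError("lower and upper must have the same length")
        if any(lo > up for lo, up in zip(self.lower, self.upper)):
            raise ValueError("each lower bound must not exceed its upper bound")

    def contains(self, x: Sequence[float]) -> bool:
        """True if lower_i <= x_i <= upper_i holds for every component i."""
        return len(x) == len(self.lower) and all(
            lo <= xi <= up for lo, xi, up in zip(self.lower, x, self.upper)
        )


# Made-up bounds; a precisely observed value is a degenerate interval.
x_box = IntervalVector(lower=[1000.0, 2000.0, 1500.0],
                       upper=[1500.0, 3000.0, 1500.0])
inside = x_box.contains([1200.0, 2500.0, 1500.0])
outside = x_box.contains([1200.0, 3500.0, 1500.0])
print(inside, outside)
```

The check `contains` is the componentwise condition from the set definition above, applied to one candidate vector of “true” values.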

  13. Manski’s Law of Decreasing Credibility. “The credibility of inference decreases with the strength of the assumptions maintained.” (Manski (2003, p. 1)) [Diagram labels: Reliability!? Credibility?]

  14. Reliable Inference Instead of Overprecision!! Consequences from Manski’s Law of Decreasing Credibility: Adding untenable assumptions to produce a precise solution may destroy the credibility of the statistical analysis, and therefore its relevance for the subject matter questions. Make realistic assumptions and let the data speak for themselves! Extreme case: Consider the set of all models that are compatible with the data (and then successively add further assumptions, if desirable). The results may be imprecise, but they are more reliable. The extent of imprecision is related to the data quality! As a welcome by-product: clarification of the implications of certain assumptions. Often still sufficient to answer the subject matter question.
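
To make the “set of all compatible models” idea concrete, here is a small sketch for the simplest possible target, the sample mean of interval-valued observations. It is only an illustration under stated assumptions: the function name `mean_bounds`, the assumed support for a missing value, and all numbers are made up. Because the mean is increasing in every observation, the smallest and largest compatible means are obtained from the lower and upper endpoints.

```python
def mean_bounds(lower, upper):
    """Smallest and largest sample mean compatible with the intervals."""
    if len(lower) != len(upper) or not lower:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(lower)
    return sum(lower) / n, sum(upper) / n


# Made-up observations: exact value, bracket, missing value, exact value.
SUPPORT = (0.0, 8000.0)  # assumed range for a missing value
intervals = [(1200.0, 1200.0), (1000.0, 2000.0), SUPPORT, (2500.0, 2500.0)]
wide = mean_bounds([lo for lo, _ in intervals], [hi for _, hi in intervals])

# An additional (hypothetical) assumption about the missing value narrows the bounds.
intervals[2] = (500.0, 4000.0)
narrow = mean_bounds([lo for lo, _ in intervals], [hi for _, hi in intervals])

print(wide)
print(narrow)
```

The width of the resulting interval reflects the data quality, and comparing `wide` with `narrow` shows what the extra assumption contributes.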

  15. Work in that direction. Interval analysis/reliable computing, i.i.d. case: e.g. Nguyen, Kreinovich, Wu, Xiang (2011, Springer). Linear regression, e.g.: Rohwer & Pötter (2001, Juventa); Manski & Tamer (2002, Econometrica); Chernozhukov, Hong & Tamer (2007, Econometrica); Beresteanu & Molinari (2008, Econometrica); Cattaneo & Wiencierz (2012, IntJApproxReason); Beresteanu, Molchanov & Molinari (2012, J Econometrics); Bontemps, Magnac & Maurin (2012, Econometrica); Schollmeyer & Augustin (2015, IntJApproxReason). What to do with generalized linear models? Logit regression: Plass, Augustin, Cattaneo, Schollmeyer (2015, ISIPTA). Seitz (2015, Springer Best Masters).

  16. Generalized Linear Models; Maximum Likelihood Estimation

  17. Basic Notation, Regression Models. $n$ observations (“large”); $\mathbf{Y} = (Y_1, \ldots, Y_n)^T$ response variable; $\mathbf{X} = (X_1, \ldots, X_n)^T$ covariates; $(X_i, Y_i)_{i = 1, \ldots, n}$ i.i.d.; here $Y_i$ is one-dimensional, of metric, ordinal, or categorical scale; $X_i$ is $p$-dimensional (metric or binary). Joint distribution: density with respect to an appropriate dominating measure, $$f_{(\mathbf{X}, \mathbf{Y})}(\mathbf{x}, \mathbf{y}) = \prod_{i=1}^{n} f_{(X_i, Y_i)}(x_i, y_i) = \prod_{i=1}^{n} \underbrace{f_{Y_i \mid X_i}(y_i \mid x_i)}_{\text{model}} \cdot f_{X_i}(x_i).$$

  18. Typically only $f_{Y \mid X}(\cdot)$ is parametrized; $f_X(\cdot)$ is assumed to contain ancillary information. Regression parameters $\gamma = (\gamma_0, \gamma_1, \ldots, \gamma_p)^T$, further parameter $\delta$. Parametric model for $[Y_i \mid X_i]$. Here: generalized linear model.

  19. Generalized Linear Models. E.g. Fahrmeir, Kneib, Lang, Marx (2013, Springer). Generalizing linear regression $Y_i = \gamma_0 + \gamma_1' X_i + \zeta_i \iff Y_i \mid X_i \sim N(X_i' \gamma, \tau^2)$ to other distributions: Gamma distribution, inverse Gaussian, Beta distribution; Poisson distribution $\to$ count data; Bernoulli/multinomial distribution $\to$ categorical data: logit/probit model. $$f(y_i \,\|\, \xi_i, \delta) = \mathrm{const}(y_i, \delta) \cdot \exp\left( \frac{\xi_i y_i - b(\xi_i)}{\delta} \right), \quad i = 1, \ldots, n,$$ $$\xi_i = \gamma_0 + \gamma_1 \cdot x_{i1} + \cdots + \gamma_p \cdot x_{ip}.$$ Exponential family with individual canonical parameter $\xi_i = \begin{pmatrix} 1 \\ X_i \end{pmatrix}' \gamma$ (“canonical link”).
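
The exponential-family form can be checked numerically for two textbook cases with $\delta = 1$: the Bernoulli distribution with $b(\xi) = \log(1 + e^{\xi})$ and the Poisson distribution with $b(\xi) = e^{\xi}$ and $\mathrm{const}(y) = 1/y!$. The sketch below is an illustration only; the function names and the parameter values are assumptions chosen for this example.

```python
import math


def linear_predictor(gamma, x):
    """xi_i = gamma_0 + gamma_1 * x_i1 + ... + gamma_p * x_ip."""
    return gamma[0] + sum(g * xj for g, xj in zip(gamma[1:], x))


def bernoulli_density(y, xi):
    """exp(xi * y - b(xi)) with b(xi) = log(1 + exp(xi)), delta = 1, const = 1."""
    return math.exp(xi * y - math.log1p(math.exp(xi)))


def poisson_density(y, xi):
    """const(y) * exp(xi * y - b(xi)) with b(xi) = exp(xi), const(y) = 1 / y!."""
    return math.exp(xi * y - math.exp(xi)) / math.factorial(y)


gamma = [-0.5, 0.8]                    # made-up regression parameters
xi = linear_predictor(gamma, [1.0])    # canonical parameter for x_i = 1

# Both expressions are proper probability functions: they sum to one.
bernoulli_total = bernoulli_density(0, xi) + bernoulli_density(1, xi)
poisson_total = sum(poisson_density(y, xi) for y in range(50))
print(bernoulli_total, poisson_total)
```

With the canonical link, the Bernoulli case is the logit model: `bernoulli_density(1, xi)` equals $e^{\xi} / (1 + e^{\xi})$.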

  20. DGP. [Diagram: DGP and Data, connected by arrows in both directions.]

  21. Maximum Likelihood Estimation. After having observed the data, reinterpret the density as a function of the parameters, describing how plausible it is that each parameter value has produced the data. Maximum likelihood estimator (MLE): root of the derivative of the log-likelihood $\to$ score function $$\mathrm{score}(\gamma) = \frac{1}{\delta} \sum_{i=1}^{n} \bigl( Y_i - E(Y_i \mid X_i) \bigr) \begin{pmatrix} 1 \\ X_i \end{pmatrix}.$$
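
As a hedged illustration of the score equation, the sketch below evaluates the score for a logit model with a single covariate and $\delta = 1$ and searches for its root with Newton's method (Fisher scoring). The data, the function names, and the choice of solver are assumptions made for this example, and the data are precise, not interval-valued.

```python
import math


def mean_response(gamma, x):
    """E(Y_i | X_i = x) in the logit model (inverse of the canonical link)."""
    return 1.0 / (1.0 + math.exp(-(gamma[0] + gamma[1] * x)))


def score(gamma, xs, ys, delta=1.0):
    """(1 / delta) * sum_i (y_i - E(Y_i | x_i)) * (1, x_i)."""
    s0 = s1 = 0.0
    for x, y in zip(xs, ys):
        residual = y - mean_response(gamma, x)
        s0 += residual
        s1 += residual * x
    return s0 / delta, s1 / delta


def fit_mle(xs, ys, tol=1e-10, max_iter=50):
    """Find a root of the score by Newton steps, starting from gamma = (0, 0)."""
    gamma = [0.0, 0.0]
    for _ in range(max_iter):
        s0, s1 = score(gamma, xs, ys)
        if max(abs(s0), abs(s1)) < tol:
            break
        # Fisher information [[a, b], [b, c]] = sum_i mu_i (1 - mu_i) (1, x_i)(1, x_i)'.
        a = b = c = 0.0
        for x in xs:
            mu = mean_response(gamma, x)
            w = mu * (1.0 - mu)
            a += w
            b += w * x
            c += w * x * x
        det = a * c - b * b
        # Newton update: gamma + information^{-1} * score (2 x 2 inverse written out).
        gamma[0] += (c * s0 - b * s1) / det
        gamma[1] += (a * s1 - b * s0) / det
    return gamma


# Made-up, non-separable data, so that a finite MLE exists.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0, 0, 1, 0, 1, 1]
gamma_hat = fit_mle(xs, ys)
print(gamma_hat, score(gamma_hat, xs, ys))
```

At the returned estimate both components of the score are numerically zero, which is the root condition stated on the slide.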
