Robust Regression with Coarse Data
Marco Cattaneo and Andrea Wiencierz
Department of Statistics, LMU Munich
Statistische Woche 2011, Leipzig, Germany
21 September 2011

coarse data

[Figure: unobserved precise data; observed coarse data]

◮ in the literature, two kinds of general approaches to regression with coarse data:
  ◮ represent the observed coarse data by few precise values (e.g., intervals by center and width), and apply standard regression methods to those values: see for instance Domingues et al. (2010)
  ◮ apply standard regression methods to all possible precise data compatible with the observed coarse data, and consider the range of outcomes as the imprecise result: see for example Ferson et al. (2007)
◮ LIR (Likelihood-based Imprecise Regression): new regression method directly applicable to coarse data (Cattaneo and Wiencierz, 2011)

nonparametric likelihood

◮ precise data (unobserved): random variables $V_i = (X_i, Y_i) \in \mathcal{X} \times \mathbb{R}$
◮ coarse data (observed): random sets $V_i^* \subseteq \mathcal{X} \times \mathbb{R}$
◮ nonparametric model: $\mathcal{P}$ is the set of all probability measures such that
  ◮ $(V_1, V_1^*), \ldots, (V_n, V_n^*)$ i.i.d.
  ◮ $P(V_i \in V_i^*) \geq 1 - \varepsilon$ (where $\varepsilon \in [0, 1]$ is fixed)
◮ the observed (coarse) data $V_1^* = A_1, \ldots, V_n^* = A_n$ induce the (normalized) likelihood function $\mathrm{lik} : \mathcal{P} \to [0, 1]$ with

  $$\mathrm{lik}(P) = \frac{P(V_1^* = A_1, \ldots, V_n^* = A_n)}{\max_{P' \in \mathcal{P}} P'(V_1^* = A_1, \ldots, V_n^* = A_n)}$$

regression problem

◮ regression functions: $\mathcal{F}$ is a certain set of functions $f : \mathcal{X} \to \mathbb{R}$
◮ absolute residuals: $R_{f,i} = |Y_i - f(X_i)|$
◮ for each function $f \in \mathcal{F}$, the quantiles of the distribution of the absolute residuals $R_{f,i}$ can be estimated even under the nonparametric model $\mathcal{P}$
◮ the regression problem can be interpreted as the minimization of the $p$-quantile of the distribution of the absolute residuals $R_{f,i}$ (where $p \in (0, 1)$ is fixed)
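To make the quantile-minimization view concrete, here is a minimal sketch for the special case of precise data. It uses a plain sample quantile in place of the likelihood-based estimate under $\mathcal{P}$, so it only illustrates the objective, not the LIR method itself. The data, the two candidate functions, and the helper names `abs_residuals` and `empirical_quantile` are all hypothetical.

```python
import math

def abs_residuals(f, xs, ys):
    """Absolute residuals |y_i - f(x_i)| for precise data."""
    return [abs(y - f(x)) for x, y in zip(xs, ys)]

def empirical_quantile(values, p):
    """Smallest value v such that at least a fraction p of the values are <= v."""
    ordered = sorted(values)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]

# Hypothetical precise data: four points near y = x and one gross outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.2, 2.9, 10.0]

# Hypothetical finite set F of candidate regression functions.
candidates = {
    "identity": lambda x: x,
    "steeper": lambda x: 2.0 * x,
}

p = 0.5
scores = {
    name: empirical_quantile(abs_residuals(f, xs, ys), p)
    for name, f in candidates.items()
}
best = min(scores, key=scores.get)
print(best, scores)
```

Because only the $p$-quantile of the absolute residuals enters the objective, the outlier at x = 4 has no influence on which candidate is chosen here.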

generalized LQS regression

◮ likelihood-based confidence interval for the $p$-quantile of the distribution of the absolute residuals $R_{f,i}$ (where $Q_f(P)$ is the interval of all $p$-quantiles of $R_{f,i}$ under $P$, and $\beta \in (0, 1)$ is fixed):

  $$C_f = \bigcup_{P \in \mathcal{P} \,:\, \mathrm{lik}(P) > \beta} Q_f(P)$$

◮ point estimate: $f_{LRM}$ is the function in $\mathcal{F}$ minimizing $\sup C_f$ (Likelihood-based Region Minimax: see Cattaneo, 2007)
◮ $f_{LRM}$ has a simple geometrical interpretation: $B_{f_{LRM}, q_{LRM}}$ is the thinnest band of the form $B_{f,q} = \{(x, y) \in \mathcal{X} \times \mathbb{R} : |y - f(x)| \leq q\}$ containing at least $k$ coarse data (where $k > (p + \varepsilon)\,n$ depends on $n, \varepsilon, p, \beta$), for all $f \in \mathcal{F}$ and all $q \in [0, +\infty)$
◮ when the observed data are in fact precise, $f_{LRM}$ corresponds to the LQS (Least Quantile of Squares) estimate with quantile $k/n$
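The geometrical interpretation can be sketched in code under some strong simplifying assumptions that are not in the slides: the coarse data are axis-aligned rectangles, $\mathcal{F}$ is the set of lines $a + bx$, "containing" means that the whole rectangle lies inside the band, $k$ is supplied by the user (the slides do not give the formula linking it to $n, \varepsilon, p, \beta$), and the search is a brute-force pass over a hypothetical grid of candidate lines. All function names are illustrative only.

```python
from itertools import product

def upper_residual(rect, a, b):
    """Smallest q such that the rectangle lies inside the band |y - (a + b*x)| <= q.

    For a line, |y - a - b*x| is convex in (x, y), so its maximum over a
    rectangle is attained at one of the four corners.
    """
    (x_lo, x_hi), (y_lo, y_hi) = rect
    return max(abs(y - (a + b * x))
               for x, y in product((x_lo, x_hi), (y_lo, y_hi)))

def band_width(rects, a, b, k):
    """Smallest q such that the band around a + b*x contains at least k rectangles."""
    return sorted(upper_residual(r, a, b) for r in rects)[k - 1]

def thinnest_band(rects, k, intercepts, slopes):
    """Brute-force search over a grid of candidate lines; returns (q, a, b)."""
    return min((band_width(rects, a, b, k), a, b)
               for a, b in product(intercepts, slopes))

# Hypothetical interval data: four unit squares along y = x and one outlier.
rects = [
    ((0.0, 1.0), (0.0, 1.0)),
    ((1.0, 2.0), (1.0, 2.0)),
    ((2.0, 3.0), (2.0, 3.0)),
    ((3.0, 4.0), (3.0, 4.0)),
    ((4.0, 5.0), (20.0, 21.0)),
]

q, a, b = thinnest_band(
    rects,
    k=4,  # assumed value, chosen by hand for this illustration
    intercepts=[-1.0, -0.5, 0.0, 0.5, 1.0],
    slopes=[0.0, 0.5, 1.0, 1.5, 2.0],
)
print(q, a, b)
```

With degenerate rectangles (each interval reduced to a single point), `band_width` becomes the $k$-th smallest absolute residual, which is consistent with the LQS remark for precise data. A grid search is used only for brevity; it is not claimed to be how the authors compute $f_{LRM}$.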
