Outline Skew-symmetric distributions ST Flexible likelihood Numerical illustrations Closing
Robust inference based on flexible parametric families of distributions
Adelchi Azzalini (Università di Padova, Italia)
ICORS, Parma, June 2009
Outline of the talk
- skew-symmetric families of distributions
- flexible likelihood for robust inference
- some numerical comparison
Skew-symmetric distributions: Introduction
A generator of distributions
- context: families of continuous distributions on $\mathbb{R}^d$
- start from a density $f_0$ symmetric around 0: $f_0(x) = f_0(-x)$ for $x \in \mathbb{R}^d$
- choose a real-valued $w(x)$ such that $w(-x) = -w(x)$
- choose a scalar cdf $G(\cdot)$ with symmetric pdf $G'(\cdot)$
- then $f(x) = 2\, f_0(x)\, G\{w(x)\}$ is a skew-symmetric pdf
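The construction above can be checked numerically in one dimension. The sketch below is only an illustration: the choices of a standard normal $f_0$, a logistic $G$ and the odd function $w(x) = 3x - x^3$ are assumptions made for the example, not taken from the talk, and the helper names are hypothetical. It shows that $2 f_0(x) G\{w(x)\}$ integrates to 1 whatever odd $w$ is used, because $G\{w(-x)\} = 1 - G\{w(x)\}$.

```python
import math

def norm_pdf(x):
    """Standard normal density, used here as the symmetric base f0."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def logistic_cdf(u):
    """Logistic cdf (its pdf is symmetric), written to avoid overflow."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def w_odd(x):
    """An arbitrary odd function, chosen only for illustration."""
    return 3.0 * x - x ** 3

def skew_symmetric_pdf(x, f0, G, w):
    """Density 2 * f0(x) * G(w(x)) built from the three ingredients."""
    return 2.0 * f0(x) * G(w(x))

def f(x):
    return skew_symmetric_pdf(x, norm_pdf, logistic_cdf, w_odd)

def integrate(func, a, b, n=4000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

if __name__ == "__main__":
    print("total mass:", integrate(f, -10.0, 10.0))
```

Replacing `w_odd` by any other odd function, or `logistic_cdf` by another cdf with symmetric density, should leave the total mass unchanged.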
Basic case: skew-normal distribution (d = 1)
Choose N(0, 1) ingredients: $f_0(x) = \phi(x)$, $G = \Phi$, $w(x) = \alpha x$
and get $f(x) = 2\, \phi(x)\, \Phi(\alpha x)$
[Figure: skew-normal densities in two panels, one for α = −2, −5, −20 and one for α = 2, 5, 20]
Regulate both skewness and kurtosis
Select $f_0$ from a symmetric family with adjustable tails. Interesting cases:
- Exponential power (Subbotin, 1923): $f_0(x) \propto \exp\left( -\frac{\|x\|_\Omega^{\nu}}{\nu} \right)$
- Student's t: $f_0(x) \propto \left( 1 + \frac{\|x\|_\Omega^{2}}{\nu} \right)^{-\frac{\nu + d}{2}}$
In both cases $\nu$ regulates the tail thickness
Various options for the skewing factor
Skew-t distribution (case d = 1)
- let $Z \sim \text{Skew-normal}(\alpha)$; then a natural form of skew-t (ST) variate is $X = \dfrac{Z}{\sqrt{\chi^2_\nu / \nu}}$
- density is $f(x) = 2\, t_\nu(x)\, T_{\nu+1}\{w(x)\}$, where $w(x) = \alpha x \sqrt{\dfrac{\nu + 1}{\nu + x^2}}$
- Note: $f(x)$ is of skew-symmetric type
- Note: a multivariate version exists
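The ST density above can be evaluated directly from its formula. The following is a minimal sketch using only the Python standard library; since the standard library has no Student's t cdf, $T_{\nu+1}$ is approximated here by Simpson's rule, which is a stand-in chosen for self-containment and not part of the talk. Function names are hypothetical.

```python
import math

def t_pdf(x, nu):
    """Student's t density t_nu(x)."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

def t_cdf(x, nu, n=200):
    """Student's t cdf T_nu(x): Simpson's rule on [0, |x|] plus symmetry.

    n must be even; accuracy degrades if |x| is very large.
    """
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / n
    s = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = s * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def st_pdf(x, alpha, nu):
    """Skew-t density 2 * t_nu(x) * T_{nu+1}(w(x))."""
    w = alpha * x * math.sqrt((nu + 1) / (nu + x * x))
    return 2.0 * t_pdf(x, nu) * t_cdf(w, nu + 1)

def total_mass(alpha, nu, lo=-40.0, hi=40.0, n=2000):
    """Trapezoidal check that the density integrates to about 1."""
    h = (hi - lo) / n
    total = 0.5 * (st_pdf(lo, alpha, nu) + st_pdf(hi, alpha, nu))
    for i in range(1, n):
        total += st_pdf(lo + i * h, alpha, nu)
    return total * h

if __name__ == "__main__":
    print("mass for alpha=2, nu=5:", total_mass(2.0, 5.0))
    print("alpha=0 gives t_5:", st_pdf(1.0, 0.0, 5.0), t_pdf(1.0, 5.0))
```

With $\alpha = 0$ the skewing factor equals 1/2 and the density reduces to the symmetric $t_\nu$, which gives a quick sanity check of any implementation.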
Skew-t distribution: example of densities
[Figure: skew-t densities in two panels, ν = 5 and ν = 1, each with curves for α = 0, 2, 5, 20; the α = 0 curves are labelled t_5 and t_1; horizontal axis from −2 to 4]
A flexible distribution
- Consider ST as a general-purpose tool for statistical modelling
- Combines high flexibility for skewness and for the tails: α regulates skewness ($\alpha \in \mathbb{R}^d$), ν regulates the tail thickness ($\nu > 0$)
- Make use of the tail parameter to accommodate "outliers", possibly non-symmetrically distributed
- (Ideal in the d-dimensional case: a tail parameter for each component)
Regression models with ST errors
- fitted model: $y = x^\top \beta + \varepsilon$, $\varepsilon \sim (\text{scale factor}) \times \text{ST}$
- estimate parameters via MLE (or Bayesian approach, according to taste)
- adjust intercept because $E\{\text{ST}\} \neq 0$
- various options:
  - intercept = $\hat\beta_0 + E\{\varepsilon\}$ ... needs $\hat\nu > 1$
  - intercept = $\hat\beta_0 + \mathrm{median}(\varepsilon)$ ... use this
  - others...
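To see why the intercept needs adjusting, one can simulate ST errors and look at their mean and median. The sketch below makes assumptions not stated on the slide: it draws the skew-normal variate through the standard representation $Z = \delta |U_0| + \sqrt{1 - \delta^2}\, U_1$ with $\delta = \alpha / \sqrt{1 + \alpha^2}$ and independent standard normals $U_0, U_1$, takes the $\chi^2_\nu$ variate independent of $Z$, and uses the known expression $E\{X\} = \delta \sqrt{\nu/\pi}\, \Gamma((\nu - 1)/2) / \Gamma(\nu/2)$, valid for $\nu > 1$. The values α = 2, ν = 5 and the function names are illustrative only.

```python
import math
import random
import statistics

def sample_st(alpha, nu, rng):
    """One draw of X = Z / sqrt(chi2_nu / nu), Z skew-normal(alpha)."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    z = (delta * abs(rng.gauss(0.0, 1.0))
         + math.sqrt(1.0 - delta * delta) * rng.gauss(0.0, 1.0))
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.f.
    return z / math.sqrt(chi2 / nu)

def st_mean(alpha, nu):
    """Theoretical mean of the ST variate; only defined for nu > 1."""
    if nu <= 1:
        raise ValueError("the mean exists only for nu > 1")
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    ratio = math.exp(math.lgamma((nu - 1) / 2) - math.lgamma(nu / 2))
    return delta * math.sqrt(nu / math.pi) * ratio

def simulate(alpha=2.0, nu=5.0, n=100_000, seed=1):
    """Return (sample mean, sample median) of n simulated ST errors."""
    rng = random.Random(seed)
    draws = [sample_st(alpha, nu, rng) for _ in range(n)]
    return sum(draws) / n, statistics.median(draws)

if __name__ == "__main__":
    mean, median = simulate()
    print("sample mean  :", mean, "theory:", st_mean(2.0, 5.0))
    print("sample median:", median)
```

Both location summaries are away from zero when α ≠ 0, so the fitted $\hat\beta_0$ alone is not the level of the regression line. The median-based adjustment remains usable when $\hat\nu \le 1$, where the mean does not exist.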
Flexible distribution approach vs M-estimation
- M-estimates converge to the solution of a non-linear equation: $\lambda(\theta) := E\{\psi(X, \theta)\} = 0$
- In the simple location case: $\lambda(\theta) := E\{\psi(X - \theta)\} = 0$
- What are we estimating? If the error distribution is not symmetric, there is no explicit solution
- In the "robust likelihood" approach we estimate the parameters of the error distribution
- Note: there is empirical evidence that real data have asymmetric outliers
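The question "what are we estimating?" can be made concrete by solving $\lambda(\theta) = 0$ numerically for an asymmetric error distribution. The sketch below is a hedged illustration under assumptions that do not come from the talk: ψ is Huber's function with tuning constant 1.345, and the error law is a hypothetical mixture 0.9 N(0, 1) + 0.1 N(5, 1). The expectation is computed by the trapezoidal rule and the root is found by bisection.

```python
import math

def huber_psi(u, k=1.345):
    """Huber's psi: identity in [-k, k], clipped outside (assumed choice)."""
    return max(-k, min(k, u))

def norm_pdf(x, mu=0.0, sd=1.0):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def error_pdf(x, pi_mix=0.10, mu1=5.0):
    """Hypothetical asymmetric error density: (1 - pi) N(0,1) + pi N(mu1,1)."""
    return (1.0 - pi_mix) * norm_pdf(x) + pi_mix * norm_pdf(x, mu1, 1.0)

def lam(theta, lo=-10.0, hi=15.0, n=2500):
    """lambda(theta) = E{psi(X - theta)} by the trapezoidal rule."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * huber_psi(x - theta) * error_pdf(x)
    return total * h

def solve_location(a=-1.0, b=5.0, tol=1e-8):
    """Bisection for the root of lam, which is decreasing in theta."""
    while b - a > tol:
        mid = 0.5 * (a + b)
        if lam(mid) > 0.0:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)

if __name__ == "__main__":
    print("root of lambda(theta):", solve_location())
    print("mean of the mixture  :", 0.10 * 5.0)
```

In this example the root lies strictly between the centre of the main component (0) and the mean of the mixture (0.5), and it has no closed form: it depends on ψ, on its tuning constant and on the contamination.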
A simple regression example (Yohai, 1987)
International phone calls from Belgium (Yohai, 1987)
[Figure: calls (0 to 20) against year (50 to 70), points labelled N and T, with fitted lines LS and MM; a second version of the slide adds the ST line]
A classical benchmark: stackloss data
loss function = $\sum_{i=1}^{n} |y_i - \hat{y}_i|^p$

  p      0.5     1       2
  LS     30.1    49.7    178.8
  MM     27.1    45.3    222.8
  LTS    25.9    44.7    241.7
  ST     25.0    43.4    240.0

(n = 21 with 3 covariates)
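The loss function used in this comparison is simple to compute from observed and fitted values. The sketch below uses a hypothetical helper name and made-up numbers for the usage example; it does not reproduce the stackloss figures in the table.

```python
def lp_loss(y, fitted, p):
    """Sum over i of |y_i - fitted_i| ** p."""
    return sum(abs(yi - fi) ** p for yi, fi in zip(y, fitted))

if __name__ == "__main__":
    # Made-up observed and fitted values, for illustration only.
    y = [1.0, 2.0, 4.0]
    fitted = [1.0, 1.0, 0.0]
    for p in (0.5, 1, 2):
        print("p =", p, "loss =", lp_loss(y, fitted, p))
```

Small values of p downweight large residuals, while p = 2 is the criterion that least squares minimises, which is why LS has the smallest entry in that column.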
Regression with contaminated normal errors
Simulate data from model: $y = \beta_0 + \beta_1 x + \varepsilon$, where $\varepsilon \sim (1 - \pi)\, N(0, 1) + \pi\, N(\mu_1, 3)$
- $\beta_0 = 0$, $\beta_1 = 2$
- $\pi = 0.05,\ 0.10$
- $\mu_1 = 2.5,\ 5,\ 10$
- replicates: $10^4$ in each case
[Figure: distribution of errors]
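A reduced version of this simulation design can be sketched as follows, fitting only least squares. Several details are assumptions, since the slide does not state them: the 3 in $N(\mu_1, 3)$ is read as a variance, the covariate is drawn uniformly on (0, 1), the sample size is 100, and only 2000 replicates are used to keep the run short. Function names are hypothetical.

```python
import random

def simulate_errors(n, pi_mix, mu1, rng, var1=3.0):
    """Draw n errors from (1 - pi) N(0, 1) + pi N(mu1, var1)."""
    errors = []
    for _ in range(n):
        if rng.random() < pi_mix:
            errors.append(rng.gauss(mu1, var1 ** 0.5))  # contaminating part
        else:
            errors.append(rng.gauss(0.0, 1.0))
    return errors

def ls_fit(x, y):
    """Least-squares intercept and slope for simple linear regression."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def monte_carlo(pi_mix=0.10, mu1=5.0, n=100, reps=2000,
                beta0=0.0, beta1=2.0, seed=2009):
    """Average LS estimates of (beta0, beta1) over reps simulated samples."""
    rng = random.Random(seed)
    sum0 = sum1 = 0.0
    for _ in range(reps):
        x = [rng.random() for _ in range(n)]  # assumed design
        eps = simulate_errors(n, pi_mix, mu1, rng)
        y = [beta0 + beta1 * xi + e for xi, e in zip(x, eps)]
        b0, b1 = ls_fit(x, y)
        sum0 += b0
        sum1 += b1
    return sum0 / reps, sum1 / reps

if __name__ == "__main__":
    b0, b1 = monte_carlo()
    print("average LS intercept:", b0, "(true beta0 = 0)")
    print("average LS slope    :", b1, "(true beta1 = 2)")
```

Because the errors are independent of the covariate and have mean $\pi \mu_1$, the LS intercept absorbs that shift while the slope stays centred on $\beta_1$. Under these assumptions the average LS intercept should therefore sit near 0.5 for π = 0.10 and μ₁ = 5.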
Simulation: Root Mean Square Error for $\beta_0$
[Figure: root mean square error of $\hat\beta_0$ against $\mu_1$ for LS, MM, LTS and ST, in two panels: contamination = 5% and contamination = 10%]
Simulation: Root Mean Square Error for $\beta_1$
[Figure: root mean square error of $\hat\beta_1$ against $\mu_1$ for LS, MM, LTS and ST, in two panels: contamination = 5% and contamination = 10%]
Summary
- ST and other flexible families of distributions allow regulation of skewness and kurtosis
- the corresponding likelihood inference appears reliable even when used outside the parametric class
- advantages are:
  - a probability model is fitted to the data
  - the quantities being estimated are explicitly known
References & resources
- Genton, M. G. (2004, Skew-elliptical distributions...). Edited volume.
- Azzalini, A. (2005, Scand. J. Stat., vol. 32). Review paper with discussion.
- Azzalini, A. & Genton, M. G. (2008). Robust likelihood methods based on the skew-t and related distributions. Int. Statist. Rev., 76, 106–129.
- Resources: http://azzalini.stat.unipd.it/SN/