COMPSTAT 2010 * Score moment estimates, Zdeněk Fabián

Zdeněk Fabián, Institute of Computer Sciences, Prague. August 17, 2010.


  1. COMPSTAT 2010 * Score moment estimates. Zdeněk Fabián, Institute of Computer Sciences, Prague. August 17, 2010.

  4. Motivation. Apart from the fact that the ML estimates $\hat\theta_{ML}$ are often influenced by outliers, the solution $f(x; \hat\theta_{ML})$ of the parametric estimation problem has some other drawbacks:
     - Instead of $f(x; \hat\theta_{ML})$, a few numbers characterizing the data would be useful in further analysis. However, the moments $m_k = E(X - m_1)^k$, $m_1 = EX$, are often unwieldy expressions containing special functions, and moments of heavy-tailed distributions may not exist, so the approach $\hat m_k = m_k(\hat\theta_{ML})$ is not used.
     - Complex problems are solved by using 'pure' data, not data 'adapted' to the assumed model by an adequate inference function (Pearson correlation coefficient).

  6. Problem.
     - The reason: the score function $r(x;\theta) = (r_{\theta_1}, \dots, r_{\theta_m})$, $r_{\theta_j}(x;\theta) = \frac{\partial}{\partial\theta_j} \log f(x;\theta)$, is a vector function, suitable for estimation of parameters, but too complicated to afford useful proposals of sensible numeric characteristics of distributions and too complicated to be used in more complex problems.
     - The problem: to find a relevant scalar inference function $S(x;\theta)$ reflecting basic features of the model distribution, and to use the moments $M_k(\theta) = \int_{\mathcal{X}} S^k(x;\theta)\, f(x;\theta)\, dx$ for generalized moment estimates.

  8. Location distributions.
     - Location distribution $g(y - \mu)$, $\mu \in \mathbb{R}$, $g$ unimodal, regular, with support $\mathbb{R}$.
     - Scalar score: $r_\mu(y;\mu) = \frac{\partial}{\partial\mu} \log g(y - \mu) = S_G(y - \mu)$, where the function $S_G(y) = -\frac{g'(y)}{g(y)}$ is obtained by differentiating with respect to the variable.
     - Scalar score of a distribution with support $\mathbb{R}$: $S_G(y;\theta) = -\frac{1}{g(y;\theta)} \frac{d}{dy} g(y;\theta)$.
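
As a small illustration of $S_G(y) = -g'(y)/g(y)$, the sketch below evaluates the score for two prototype densities chosen here as assumptions (the slides do not name them): the standard normal, where $S_G(y) = y$, and the standard Cauchy, where $S_G(y) = 2y/(1+y^2)$. The helper names are hypothetical, and the closed forms are compared with a central-difference approximation.

```python
import math

def normal_pdf(y):
    """Standard normal density (an illustrative prototype, not from the slides)."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def cauchy_pdf(y):
    """Standard Cauchy density (an illustrative prototype, not from the slides)."""
    return 1.0 / (math.pi * (1.0 + y * y))

def score_normal(y):
    """Closed-form S_G(y) = -g'(y)/g(y) for the standard normal: equals y."""
    return y

def score_cauchy(y):
    """Closed-form S_G(y) for the standard Cauchy: 2y / (1 + y^2), a bounded function."""
    return 2.0 * y / (1.0 + y * y)

def numeric_score(g, y, h=1e-5):
    """Central-difference approximation of -g'(y)/g(y)."""
    return -(g(y + h) - g(y - h)) / (2.0 * h * g(y))

if __name__ == "__main__":
    for y in (-2.0, 0.5, 10.0):
        print(y, score_normal(y), score_cauchy(y), numeric_score(cauchy_pdf, y))
```

The Cauchy score stays bounded as $|y|$ grows, while the normal score is linear in $y$.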

  9. Log-location distributions I. The log-location distribution (Lawless 2003) $F$ of the random variable $X = \eta^{-1}(Y)$ with support $\mathcal{X} = (0, \infty)$ has density $f(x;\tau) = g(u)\, \eta'(x)$, where $g(y - \mu)$ is the density of the 'prototype' distribution on $\mathbb{R}$, $u = \eta(x) - \eta(\tau)$, and the 'log-location' parameter $\tau = \eta^{-1}(\mu)$ is the 'image' of the location $\mu$ of the prototype.

  12. Log-location distributions II.
     - Theorem: $\frac{\partial}{\partial\tau} \log f(x;\tau) = S_G(u)\, \eta'(\tau)$.
     - Transformation-based score (t-score): $T(x;\tau) \equiv S_G(u) = -\frac{1}{f(x;\tau)} \frac{d}{dx}\left( \frac{1}{\eta'(x)} f(x;\tau) \right)$.
     - Scalar score: $S_\tau(x) = \eta'(\tau)\, T(x;\tau)$.
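
The identity $T(x;\tau) \equiv S_G(u)$ can be checked numerically. The sketch below assumes a standard normal prototype and $\eta = \log$ (so $f$ is a lognormal density with unit shape); both choices and the function names are illustrative assumptions, not taken from the slides. For this prototype $S_G(u) = u = \log(x/\tau)$.

```python
import math

def t_score(f, eta_prime, x, h=1e-6):
    """Numerical t-score: T(x) = -(1/f(x)) * d/dx [ f(x) / eta'(x) ]."""
    def ratio(z):
        return f(z) / eta_prime(z)
    derivative = (ratio(x + h) - ratio(x - h)) / (2.0 * h)
    return -derivative / f(x)

def lognormal_pdf(x, tau):
    """f(x; tau) = g(u) * eta'(x), with g standard normal, eta = log, u = log(x) - log(tau)."""
    u = math.log(x) - math.log(tau)
    return math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * x)

tau = 2.0
x = 3.5
T_numeric = t_score(lambda z: lognormal_pdf(z, tau), lambda z: 1.0 / z, x)
T_exact = math.log(x / tau)        # S_G(u) = u for the standard normal prototype
S_tau = (1.0 / tau) * T_numeric    # scalar score eta'(tau) * T(x; tau)

if __name__ == "__main__":
    print(T_numeric, T_exact, S_tau)
```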

  15. Generalizations.
     - $F$ on a general interval support $\mathcal{X} \subseteq \mathbb{R}$, $\eta : \mathcal{X} \to \mathbb{R}$.
     - t-score (a general concept): $T(x;\theta) = -\frac{1}{f(x;\theta)} \frac{d}{dx}\left( \frac{1}{\eta'(x)} f(x;\theta) \right)$, where (Johnson, 1949) $\eta(x) = \log(x - a)$ if $\mathcal{X} = (a, \infty)$ and $\eta(x) = \log\frac{x - a}{b - x}$ if $\mathcal{X} = (a, b)$.
     - However, to use the relation $\frac{\partial}{\partial\tau} \log f(x;\theta) = \eta'(\tau)\, T(x;\theta)$, $\theta$ has to be in the form $\theta = (\eta^{-1}(\mu), \theta_2, \dots, \theta_m)$.
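
A minimal sketch of the two Johnson mappings and the resulting t-score follows. The function names and the uniform density on $(0, 1)$ used as a test case are assumptions made for illustration; for that density the formula gives $T(x) = 2x - 1$.

```python
import math

def eta(x, a, b=math.inf):
    """Mapping of the support (a, b) onto the real line; b = inf means (a, inf)."""
    if math.isinf(b):
        return math.log(x - a)
    return math.log((x - a) / (b - x))

def eta_prime(x, a, b=math.inf):
    """Derivative of eta with respect to x."""
    if math.isinf(b):
        return 1.0 / (x - a)
    return (b - a) / ((x - a) * (b - x))

def t_score(f, x, a, b=math.inf, h=1e-6):
    """Numerical t-score: -(1/f(x)) * d/dx [ f(x) / eta'(x) ] by central differences."""
    def ratio(z):
        return f(z) / eta_prime(z, a, b)
    return -(ratio(x + h) - ratio(x - h)) / (2.0 * h * f(x))

def uniform_pdf(z):
    """Uniform density on (0, 1), an illustrative case not taken from the slides."""
    return 1.0

T_at_03 = t_score(uniform_pdf, 0.3, 0.0, 1.0)   # compare with 2 * 0.3 - 1
T_at_05 = t_score(uniform_pdf, 0.5, 0.0, 1.0)   # compare with 0

if __name__ == "__main__":
    print(T_at_03, T_at_05)
```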

  18. Starting point.
     - $S_\tau(x;\tau) = \eta'(\tau)\, T(x;\tau)$.
     - Example: $f(x;\tau) = \frac{1}{\tau} e^{-x/\tau}$, $T(x;\tau) = x/\tau - 1$, $S_\tau(x;\tau) = \frac{1}{\tau}(x/\tau - 1)$.
     - $\tau$ is usually taken as a scale parameter, but $\tau = \eta^{-1}(\mu)$ and $T(\tau;\theta) = 0$. Perhaps the most important value is not the parameter, but the 'center' of the distribution itself.
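
The exponential example can be verified by hand. The steps below assume $\eta(x) = \log x$, the mapping given for support $(0, \infty)$ with $a = 0$, so that $1/\eta'(x) = x$ and $f'(x;\tau)/f(x;\tau) = -1/\tau$:

```latex
\begin{align*}
T(x;\tau) &= -\frac{1}{f(x;\tau)}\,\frac{d}{dx}\bigl(x\,f(x;\tau)\bigr)
           = -\frac{f(x;\tau) + x\,f'(x;\tau)}{f(x;\tau)}
           = -1 + \frac{x}{\tau} = \frac{x}{\tau} - 1, \\
S_\tau(x;\tau) &= \eta'(\tau)\,T(x;\tau)
           = \frac{1}{\tau}\Bigl(\frac{x}{\tau} - 1\Bigr)
           = \frac{\partial}{\partial\tau}\bigl(-\log\tau - x/\tau\bigr)
           = \frac{\partial}{\partial\tau}\log f(x;\tau).
\end{align*}
```

Setting $x = \tau$ gives $T(\tau;\tau) = 0$, consistent with $\tau$ acting as the center of the distribution.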

  21. Definitions.
     - Measure of central tendency: the t-mean $x^*(\theta)$, the solution of $T(x;\theta) = 0$.
     - Inference function: the scalar score $S(x;\theta) \equiv \eta'(x^*)\, T(x;\theta)$.
     - $E_\theta S^2$ is the Fisher information for $x^*$.
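
When $T(x;\theta) = 0$ has no convenient closed-form solution, the t-mean can be located numerically. The sketch below uses plain bisection and assumes that $T$ is increasing and changes sign on the bracketing interval, which need not hold for every model. The gamma density with shape `alpha` and scale `beta` is an illustrative choice not taken from the slides; with $\eta = \log$ its t-score is $x/\beta - \alpha$, so the root should be $\alpha\beta$.

```python
def t_mean(T, lo, hi, tol=1e-12, max_iter=200):
    """Bisection root of T on [lo, hi]; assumes T(lo) < 0 < T(hi) and T increasing."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if T(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def gamma_t_score(x, alpha, beta):
    """t-score of the gamma density (shape alpha, scale beta) with eta = log."""
    return x / beta - alpha

alpha, beta = 2.5, 1.2
x_star = t_mean(lambda x: gamma_t_score(x, alpha, beta), 1e-6, 100.0)

def scalar_score(x):
    """S(x) = eta'(x*) T(x), with eta'(x*) = 1/x* for eta = log."""
    return gamma_t_score(x, alpha, beta) / x_star

if __name__ == "__main__":
    print(x_star, scalar_score(4.0))
```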

  22. Example: Scalar scores of the beta-prime distribution. $f(x) = \frac{1}{B(p,q)} \frac{x^{p-1}}{(x+1)^{p+q}}$, $T(x) = \frac{qx - p}{x + 1}$, $x^* = \frac{p}{q}$, $S(x) = \frac{q}{p} \cdot \frac{qx - p}{x + 1}$.
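
The beta-prime t-score $T(x) = (qx - p)/(x + 1)$ can be cross-checked against the general formula $T(x) = -\frac{1}{f(x)} \frac{d}{dx}\bigl(x f(x)\bigr)$ for $\eta = \log$. The sketch below does this by central differences; the function names and the parameter values $p = 2$, $q = 3$ are arbitrary choices for illustration.

```python
import math

def beta_prime_pdf(x, p, q):
    """f(x) = x^(p-1) / (B(p, q) * (x + 1)^(p + q)) on (0, inf)."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return math.exp((p - 1.0) * math.log(x) - (p + q) * math.log1p(x) - log_beta)

def t_score_closed(x, p, q):
    """Closed-form t-score (q x - p) / (x + 1)."""
    return (q * x - p) / (x + 1.0)

def t_score_numeric(x, p, q, h=1e-6):
    """-(1/f(x)) * d/dx [x f(x)], the t-score with eta(x) = log x."""
    def xf(z):
        return z * beta_prime_pdf(z, p, q)
    return -(xf(x + h) - xf(x - h)) / (2.0 * h * beta_prime_pdf(x, p, q))

def scalar_score(x, p, q):
    """S(x) = eta'(x*) T(x) with x* = p/q, so eta'(x*) = q/p."""
    return (q / p) * t_score_closed(x, p, q)

p, q = 2.0, 3.0
t_mean = p / q   # the point where the t-score vanishes

if __name__ == "__main__":
    print(t_score_numeric(1.7, p, q), t_score_closed(1.7, p, q), scalar_score(1.7, p, q))
```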

  25. Consequences.
     - Measure of variability: the score variance, the reciprocal of the Fisher information, $\omega^2(\theta) = \frac{1}{E_\theta S^2}$.
     - 'Center' and 'radius' of the distribution: $x^*(\theta)$, $\omega(\theta)$.
     - Estimates: what matters is not the estimates of $\theta$, but the sample t-mean $\hat x^* = x^*(\hat\theta_{ML})$ and the sample score standard deviation $\hat\omega = \omega(\hat\theta_{ML})$, which make it possible to compare results for various models with different parameters.
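
To make the 'center' and 'radius' concrete, the sketch below evaluates both for the beta-prime model. It relies on the closed form $E S^2 = q^3 / \bigl(p\,(p + q + 1)\bigr)$, which is derived here and not stated in the slides: with $B = X/(1+X) \sim \mathrm{Beta}(p, q)$ one has $T = (p+q)B - p$, hence $E T^2 = pq/(p+q+1)$. The closed form is compared with a midpoint-rule integral, and the parameter values passed in stand in for ML estimates $\hat p$, $\hat q$ that would have to be obtained separately.

```python
import math

def beta_prime_center_radius(p, q):
    """t-mean x* = p/q and score standard deviation omega = 1/sqrt(E S^2).

    Uses E S^2 = q^3 / (p (p + q + 1)), a closed form derived for this
    sketch (an assumption relative to the slides).
    """
    fisher = q ** 3 / (p * (p + q + 1.0))
    return p / q, 1.0 / math.sqrt(fisher)

def second_score_moment(p, q, n=20000):
    """Midpoint-rule value of E S^2, integrating over b = x/(1+x) in (0, 1)."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    total = 0.0
    for i in range(n):
        b = (i + 0.5) / n
        x = b / (1.0 - b)
        s = (q / p) * (q * x - p) / (x + 1.0)          # scalar score S(x)
        density = math.exp((p - 1.0) * math.log(b)
                           + (q - 1.0) * math.log(1.0 - b) - log_beta)
        total += s * s * density
    return total / n

center, radius = beta_prime_center_radius(2.0, 3.0)

if __name__ == "__main__":
    print(center, radius, second_score_moment(2.0, 3.0))
```

Because $x^*$ and $\omega$ are expressed on the scale of the data, the same pair of numbers can be computed for a different fitted model and compared directly.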
