COMPSTAT 2010 * Score moment estimates
Zdeněk Fabián
Institute of Computer Sciences, Prague
August 17, 2010
Motivation

Apart from the fact that the ML estimates $\hat\theta_{ML}$ are often influenced by outliers, the solution $f(x; \hat\theta_{ML})$ of the parametric estimation problem has some other drawbacks:

Instead of $f(x; \hat\theta_{ML})$, a few numbers characterizing the data would be useful in further analysis. However, the moments $m_k = E(X - m_1)^k$, $m_1 = EX$, are often awkward expressions containing special functions, and moments of heavy-tailed distributions may not exist, so the approach $\hat m_k = m_k(\hat\theta_{ML})$ is not used.

Complex problems are solved by using 'pure' data, not data 'adapted' to the assumed model by an adequate inference function (example: the Pearson correlation coefficient).
Problem

The reason: The score function
$$r(x; \theta) = (r_{\theta_1}, \dots, r_{\theta_m}), \qquad r_{\theta_j}(x; \theta) = \frac{\partial}{\partial\theta_j} \log f(x; \theta),$$
is a vector function, suitable for estimation of parameters, but too complicated to suggest sensible numeric characteristics of distributions and too complicated to be used in more complex problems.

The problem: To find a relevant scalar inference function $S(x; \theta)$ reflecting basic features of the model distribution, and to use the moments
$$M_k(\theta) = \int_{\mathcal{X}} S^k(x; \theta)\, f(x; \theta)\, dx$$
for generalized moment estimates.
Location distributions

Location distribution $g(y - \mu)$, $\mu \in \mathbb{R}$, with $g$ unimodal, regular, and with support $\mathbb{R}$.

Scalar score:
$$r_\mu(y; \mu) = \frac{\partial}{\partial\mu} \log g(y - \mu) = S_G(y - \mu),$$
where the function
$$S_G(y) = -\frac{g'(y)}{g(y)}$$
is obtained by differentiating with respect to the variable.

Scalar score of a distribution with support $\mathbb{R}$:
$$S_G(y; \theta) = -\frac{1}{g(y; \theta)} \frac{d}{dy}\, g(y; \theta).$$
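As a brief illustration (the two densities below are chosen here as examples and do not appear on the slide), the score $S_G$ can be worked out for a light-tailed and a heavy-tailed prototype:

```latex
% Standard normal prototype
g(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}
\quad\Longrightarrow\quad
S_G(y) = -\frac{g'(y)}{g(y)} = y .

% Standard Cauchy prototype
g(y) = \frac{1}{\pi (1 + y^2)}
\quad\Longrightarrow\quad
S_G(y) = -\frac{g'(y)}{g(y)} = \frac{2y}{1 + y^2} .
```

The normal score is unbounded, whereas the Cauchy score is bounded and tends to zero for large $|y|$, which is one way to see why the choice of model determines how strongly outlying observations act on score-based quantities.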
Log-location distributions I

The log-location distribution (Lawless 2003) $F$ of the random variable $X = \eta^{-1}(Y)$ with support $\mathcal{X} = (0, \infty)$ has density
$$f(x; \tau) = g(u)\, \eta'(x),$$
where $g(y - \mu)$ is the density of the 'prototype' distribution on $\mathbb{R}$, $u = \eta(x) - \eta(\tau)$, and the 'log-location' parameter $\tau = \eta^{-1}(\mu)$ is the 'image' of the location $\mu$ of the prototype.
Log-location distributions II

Theorem.
$$\frac{\partial}{\partial\tau} \log f(x; \tau) = S_G(u)\, \eta'(\tau)$$

Transformation-based score (t-score):
$$T(x; \tau) \equiv S_G(u) = -\frac{1}{f(x; \tau)} \frac{d}{dx} \left( \frac{1}{\eta'(x)}\, f(x; \tau) \right)$$

Scalar score:
$$S_\tau(x) = \eta'(\tau)\, T(x; \tau)$$
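Both identities follow from the chain rule. The derivation below is a sketch filled in here, not taken from the slides; it uses only $f(x; \tau) = g(u)\,\eta'(x)$, $u = \eta(x) - \eta(\tau)$ and $S_G = -g'/g$:

```latex
% Theorem: differentiate log f with respect to tau (eta'(x) does not depend on tau)
\frac{\partial}{\partial\tau} \log f(x;\tau)
  = \frac{\partial}{\partial\tau} \log g\bigl(\eta(x) - \eta(\tau)\bigr)
  = \frac{g'(u)}{g(u)} \cdot \bigl(-\eta'(\tau)\bigr)
  = S_G(u)\, \eta'(\tau).

% t-score: f / eta'(x) = g(u), so differentiate g(u) with respect to x
-\frac{1}{f(x;\tau)} \frac{d}{dx}\!\left( \frac{f(x;\tau)}{\eta'(x)} \right)
  = -\frac{1}{g(u)\,\eta'(x)} \cdot g'(u)\, \eta'(x)
  = -\frac{g'(u)}{g(u)}
  = S_G(u).
```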
Generalizations

$F$ on a general interval support $\mathcal{X} \subseteq \mathbb{R}$, $\eta : \mathcal{X} \to \mathbb{R}$.

t-score (a general concept):
$$T(x; \theta) = -\frac{1}{f(x; \theta)} \frac{d}{dx} \left( \frac{1}{\eta'(x)}\, f(x; \theta) \right),$$
where (Johnson, 1949)
$$\eta(x) = \begin{cases} \log(x - a) & \text{if } \mathcal{X} = (a, \infty), \\[4pt] \log \dfrac{x - a}{b - x} & \text{if } \mathcal{X} = (a, b). \end{cases}$$

However, to use the relation
$$\frac{\partial}{\partial\tau} \log f(x; \theta) = \eta'(\tau)\, T(x; \theta),$$
$\theta$ has to be in the form $\theta = (\eta^{-1}(\mu), \theta_2, \dots, \theta_m)$.
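The general t-score can be evaluated numerically for any smooth density on an interval. The following is a minimal sketch, assuming Johnson's mappings as given above and a central finite difference for the derivative; the function names `johnson_eta_prime` and `t_score` and the step size `h` are illustrative choices, not part of the talk.

```python
import math


def johnson_eta_prime(x, a, b=math.inf):
    """Derivative eta'(x) of Johnson's mapping for support (a, b)."""
    if math.isinf(b):
        # eta(x) = log(x - a) on (a, infinity)
        return 1.0 / (x - a)
    # eta(x) = log((x - a) / (b - x)) on (a, b)
    return (b - a) / ((x - a) * (b - x))


def t_score(f, x, a, b=math.inf, h=1e-5):
    """Approximate T(x) = -(1/f(x)) d/dx [ f(x) / eta'(x) ].

    The derivative is taken by a central difference with step h,
    so x must lie at least h inside the support (a, b).
    """
    def g(z):
        return f(z) / johnson_eta_prime(z, a, b)

    derivative = (g(x + h) - g(x - h)) / (2.0 * h)
    return -derivative / f(x)


# Uniform density on (0, 1): here f / eta' = x (1 - x), so T(x) = 2x - 1.
uniform_t = t_score(lambda x: 1.0, 0.25, a=0.0, b=1.0)

# Exponential density with tau = 2 on (0, infinity): T(x) = x / tau - 1.
tau = 2.0
exponential_t = t_score(lambda x: math.exp(-x / tau) / tau, 3.0, a=0.0)
```

For the uniform density the t-score vanishes at $x = 1/2$, so the zero of $T$ sits at the midpoint of the support in that case.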
Starting point

$$S_\tau(x; \tau) = \eta'(\tau)\, T(x; \tau)$$

Example:
$$f(x; \tau) = \frac{1}{\tau}\, e^{-x/\tau}, \qquad T(x; \tau) = \frac{x}{\tau} - 1, \qquad S_\tau(x; \tau) = \frac{1}{\tau} \left( \frac{x}{\tau} - 1 \right)$$

$\tau$ is usually taken as a scale parameter, but $\tau = \eta^{-1}(\mu)$ and $T(\tau; \theta) = 0$. Perhaps the most important value is not the parameter, but the 'center' of the distribution itself.
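The exponential example can be checked step by step. The derivation below is a sketch added here; it assumes the mapping $\eta(x) = \log x$ for the support $(0, \infty)$, so that $\eta'(x) = 1/x$:

```latex
% t-score of the exponential density f(x;tau) = (1/tau) exp(-x/tau)
T(x;\tau)
  = -\frac{1}{f(x;\tau)} \frac{d}{dx}\bigl( x\, f(x;\tau) \bigr)
  = -1 - x\, \frac{f'(x;\tau)}{f(x;\tau)}
  = -1 + \frac{x}{\tau}.

% Scalar score, and agreement with the ordinary score for tau
S_\tau(x;\tau) = \eta'(\tau)\, T(x;\tau)
  = \frac{1}{\tau}\left( \frac{x}{\tau} - 1 \right),
\qquad
\frac{\partial}{\partial\tau} \log f(x;\tau)
  = -\frac{1}{\tau} + \frac{x}{\tau^2}
  = S_\tau(x;\tau).
```

Setting $T(x; \tau) = 0$ gives $x = \tau$, which is the sense in which $\tau$ acts as the 'center' of this distribution.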
Definitions

Measure of central tendency: the t-mean $x^*(\theta)$, defined as the solution of
$$T(x; \theta) = 0.$$

Inference function: the scalar score
$$S(x; \theta) \equiv \eta'(x^*)\, T(x; \theta).$$

$E_\theta S^2$ is the Fisher information for $x^*$.
Example: Scalar scores of the beta-prime distribution

$$f(x) = \frac{1}{B(p, q)} \frac{x^{p-1}}{(x + 1)^{p+q}}, \qquad T(x) = \frac{qx - p}{x + 1}, \qquad x^* = \frac{p}{q}, \qquad S(x) = \frac{q}{p} \cdot \frac{qx - p}{x + 1}$$
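The beta-prime formulas can be checked numerically. The sketch below is illustrative only: it assumes $\eta(x) = \log x$ (so $\eta'(x^*) = q/p$), uses hypothetical function names, and approximates the score moments $M_k = E\,S^k$ by a midpoint rule after the substitution $b = x/(x+1)$, which is adequate for $p, q \ge 1$.

```python
import math


def beta_prime_pdf(x, p, q):
    """Density x^(p-1) / ((x+1)^(p+q) B(p, q)) on (0, infinity)."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return math.exp((p - 1) * math.log(x) - (p + q) * math.log(x + 1) - log_beta)


def t_score(x, p, q):
    """t-score T(x) = (q x - p) / (x + 1)."""
    return (q * x - p) / (x + 1)


def t_mean(p, q):
    """t-mean x* = p / q, the zero of the t-score."""
    return p / q


def scalar_score(x, p, q):
    """Scalar score S(x) = eta'(x*) T(x), with eta'(x*) = 1 / x* = q / p."""
    return (q / p) * t_score(x, p, q)


def score_moment(k, p, q, n=20000):
    """Midpoint-rule approximation of M_k = integral of S^k f over (0, inf).

    Substitution: x = b / (1 - b) with b in (0, 1), dx = db / (1 - b)^2.
    """
    total = 0.0
    for i in range(n):
        b = (i + 0.5) / n
        x = b / (1.0 - b)
        total += scalar_score(x, p, q) ** k * beta_prime_pdf(x, p, q) / (1.0 - b) ** 2
    return total / n


p, q = 2.0, 3.0
first_moment = score_moment(1, p, q)    # should be close to 0
second_moment = score_moment(2, p, q)   # E S^2, Fisher information for x*
```

A short calculation (not on the slide) using $x/(x+1) \sim \mathrm{Beta}(p, q)$ gives $E\,S^2 = q^3 / \bigl(p\,(p + q + 1)\bigr)$, which equals 2.25 for $p = 2$, $q = 3$; the quadrature should agree with that value to several digits.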
Consequences

Measure of variability: the score variance, the reciprocal of the Fisher information:
$$\omega^2(\theta) = \frac{1}{E_\theta S^2}$$

'Center' and 'radius' of the distribution: $x^*(\theta)$, $\omega(\theta)$.

Estimates: What matters is not the estimate of $\theta$ but the sample t-mean $\hat x^* = x^*(\hat\theta_{ML})$ and the sample score standard deviation $\hat\omega = \omega(\hat\theta_{ML})$, which make it possible to compare results for various models with different parameters.
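To make the 'center' and 'radius' concrete, here is a small sketch for two models. The closed forms used are derived here under the assumption $\eta(x) = \log x$ and are not stated on the slides: for the exponential model $E\,S^2 = 1/\tau^2$, so $x^* = \omega = \tau$; for the beta-prime model $E\,S^2 = q^3 / (p\,(p+q+1))$. The function names and the sample values are hypothetical.

```python
import math


def exponential_center_radius(sample):
    """Plug-in t-mean and score standard deviation for the exponential model.

    The ML estimate of tau is the sample mean. With T = x/tau - 1 and
    S = T / tau, E S^2 = 1 / tau^2, so both x* and omega equal tau.
    """
    tau_hat = sum(sample) / len(sample)
    return tau_hat, tau_hat


def beta_prime_center_radius(p, q):
    """t-mean and score standard deviation of the beta-prime model.

    x* = p / q and omega^2 = 1 / E S^2 = p (p + q + 1) / q^3
    (closed form derived for this sketch, assuming eta = log).
    """
    x_star = p / q
    omega = math.sqrt(p * (p + q + 1) / q ** 3)
    return x_star, omega


# Hypothetical data and parameter values, for illustration only.
sample = [0.4, 1.1, 0.7, 2.3, 0.5]
exp_center, exp_radius = exponential_center_radius(sample)
bp_center, bp_radius = beta_prime_center_radius(2.0, 3.0)
```

In practice the beta-prime parameters would be replaced by their ML estimates; the point of the sketch is only that both models are summarized by the same pair $(x^*, \omega)$, which is what makes results comparable across models.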