New approaches for statistical modelling


  1. The Double Pareto Lognormal Distribution (dPlN) | Algebraic structures concerning probability densities | Edgeworth expansions
     New approaches for statistical modelling
     Jelena Jocković
     ADVISORS: Pepa Ramírez Cobo, Prof. Fernando López Blázquez
     DOC-COURSE IMUS, University of Seville, May 25 2010

  2. Outline
     • The Double Pareto Lognormal Distribution (dPlN)
     • Algebraic structures concerning probability densities
     • Edgeworth expansions

  3. Heavy tailed distributions
     • samples with some extreme values
     • cannot be modelled by the normal distribution
     • applications: insurance, finance, hydrology, internet traffic...
     • models: Pareto, Pareto mixtures, Log-normal, ..., dPlN
     dPlN introduced in: Reed, W. and Jorgensen, M. (2004). The Double Pareto Lognormal distribution - a new parametric model for size distributions. Communications in Statistics, Theory and Methods, 33(8):1733-1753.

  4. dPlN: Definition
     • Reed and Jorgensen, 2004.
     • Define $Y = W + Z$ with $W$ and $Z$ independent, where $Z \sim N(\nu, \tau^2)$ and $W \sim f_W(w)$ (skewed Laplace distribution):
       $$f_W(w) = \begin{cases} \dfrac{\alpha\beta}{\alpha+\beta}\, e^{\beta w} & \text{for } w \le 0, \\[2mm] \dfrac{\alpha\beta}{\alpha+\beta}\, e^{-\alpha w} & \text{for } w > 0, \end{cases} \qquad \alpha, \beta > 0$$
     • $Y \sim NL(\alpha, \beta, \nu, \tau)$ and $X = \exp(Y) \sim dPlN(\alpha, \beta, \nu, \tau)$.
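The definition above can be turned into a sampler almost directly. The sketch below is only an illustration, not code from the talk: it assumes the standard representation of the skewed Laplace variable as $W = E_1/\alpha - E_2/\beta$ with $E_1, E_2$ independent standard exponentials (integrating the two exponential densities reproduces $f_W$), and the function name `sample_dpln` is hypothetical.

```python
import math
import random


def sample_dpln(alpha, beta, nu, tau, n, seed=0):
    """Draw n values of X = exp(W + Z) following the slide's definition.

    Assumption (not on the slide): W is generated as a difference of two
    independent exponentials, which has the skewed Laplace density f_W.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # expovariate(rate) has mean 1/rate, so this is E1/alpha - E2/beta.
        w = rng.expovariate(alpha) - rng.expovariate(beta)
        z = rng.gauss(nu, tau)  # Z ~ N(nu, tau^2), tau is the standard deviation
        samples.append(math.exp(w + z))
    return samples
```

A quick sanity check is that the sample mean of $\log X$ should be close to $\nu + 1/\alpha - 1/\beta$, the mean of $Y = W + Z$.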

  5. CDF
     $$P[X \le x] = \Phi\!\left(\frac{\log x - \nu}{\tau}\right) - \frac{\beta x^{-\alpha}}{\alpha+\beta} \exp\!\left(\alpha\nu + \frac{\alpha^2\tau^2}{2}\right) \Phi\!\left(\frac{\log x - \nu - \alpha\tau^2}{\tau}\right) + \frac{\alpha x^{\beta}}{\alpha+\beta} \exp\!\left(-\beta\nu + \frac{\beta^2\tau^2}{2}\right) \Phi^c\!\left(\frac{\log x - \nu + \beta\tau^2}{\tau}\right)$$
     where $\Phi^c = 1 - \Phi$.
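A direct transcription of this CDF may help when checking it numerically. This is a naive sketch under stated assumptions: the name `dpln_cdf` is hypothetical, `tau` is the standard deviation of the normal component, and no care is taken about overflow for extreme `x` or very large `alpha`, `beta`.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_sf(z):
    """Standard normal survival function, Phi^c(z) = 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def dpln_cdf(x, alpha, beta, nu, tau):
    """P[X <= x] for X ~ dPlN(alpha, beta, nu, tau), term by term as on the slide."""
    if x <= 0.0:
        return 0.0
    y = math.log(x)
    upper = math.exp(alpha * nu + 0.5 * alpha ** 2 * tau ** 2)
    lower = math.exp(-beta * nu + 0.5 * beta ** 2 * tau ** 2)
    t1 = norm_cdf((y - nu) / tau)
    t2 = beta * x ** (-alpha) * upper * norm_cdf((y - nu - alpha * tau ** 2) / tau)
    t3 = alpha * x ** beta * lower * norm_sf((y - nu + beta * tau ** 2) / tau)
    return t1 - t2 / (alpha + beta) + t3 / (alpha + beta)
```

The function should increase from 0 to 1 as `x` grows, which is an easy check on the signs of the three terms.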

  6. PDF, Moments
     • PDF
       $$f(x) = \frac{\beta}{\alpha+\beta} f_1(x) + \frac{\alpha}{\alpha+\beta} f_2(x),$$
       $$f_1(x) = \alpha x^{-\alpha-1} \exp\!\left(\alpha\nu + \frac{\alpha^2\tau^2}{2}\right) \Phi\!\left(\frac{\log x - \nu - \alpha\tau^2}{\tau}\right),$$
       $$f_2(x) = \beta x^{\beta-1} \exp\!\left(-\beta\nu + \frac{\beta^2\tau^2}{2}\right) \Phi^c\!\left(\frac{\log x - \nu + \beta\tau^2}{\tau}\right).$$
     • Moments: The MGF does not exist in closed form. However, for $r < \alpha$ the moments $E[X^r]$ can be obtained.
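The slide does not write the moments out. Under the definition $X = \exp(W + Z)$ with $W$ a difference of scaled exponentials, independence gives $E[X^r] = \frac{\alpha\beta}{(\alpha-r)(\beta+r)} \exp(r\nu + r^2\tau^2/2)$ for $r < \alpha$. The sketch below, with hypothetical function names, compares that expression with a plain trapezoidal integration of the PDF after the substitution $x = e^y$; it is a rough numerical check, not a robust integrator.

```python
import math


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_sf(z):
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def dpln_pdf(x, alpha, beta, nu, tau):
    """Mixture form of the dPlN density from the slide (x > 0)."""
    y = math.log(x)
    f1 = (alpha * x ** (-alpha - 1.0)
          * math.exp(alpha * nu + 0.5 * alpha ** 2 * tau ** 2)
          * norm_cdf((y - nu - alpha * tau ** 2) / tau))
    f2 = (beta * x ** (beta - 1.0)
          * math.exp(-beta * nu + 0.5 * beta ** 2 * tau ** 2)
          * norm_sf((y - nu + beta * tau ** 2) / tau))
    return (beta * f1 + alpha * f2) / (alpha + beta)


def dpln_moment(r, alpha, beta, nu, tau):
    """Closed-form E[X^r], valid only for r < alpha."""
    if r >= alpha:
        raise ValueError("moment of order r exists only for r < alpha")
    return (alpha * beta / ((alpha - r) * (beta + r))
            * math.exp(r * nu + 0.5 * r ** 2 * tau ** 2))


def dpln_moment_numeric(r, alpha, beta, nu, tau, lo=-30.0, hi=40.0, steps=14000):
    """Trapezoidal estimate of E[X^r] = integral of e^{(r+1)y} f(e^y) dy."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp((r + 1.0) * y) * dpln_pdf(math.exp(y), alpha, beta, nu, tau)
    return total * h
```

With `r = 0` the numerical integral should return approximately 1, which also confirms that the mixture weights in the PDF are in the right places.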

  7. dPlN properties
     • Power law tail behaviour:
       $$f(x) \sim \alpha A(\alpha, \nu, \tau)\, x^{-\alpha-1}, \quad x \to \infty, \qquad f(x) \sim \beta A(-\beta, \nu, \tau)\, x^{\beta-1}, \quad x \to 0,$$
       up to the mixture weights $\beta/(\alpha+\beta)$ and $\alpha/(\alpha+\beta)$ of the PDF, where $A(\theta, \nu, \tau) = \exp(\theta\nu + \theta^2\tau^2/2)$.
     • Closure under power-law transformations: if $X \sim dPlN(\alpha, \beta, \nu, \tau^2)$ and $a, b > 0$, then
       $$W = aX^b \sim dPlN(\alpha/b,\ \beta/b,\ b\nu + \log a,\ b^2\tau^2)$$
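The closure property can be checked numerically through the CDF, since $P[aX^b \le w] = P[X \le (w/a)^{1/b}]$. The sketch below is an illustration with hypothetical names; it parametrises the normal part by the standard deviation `tau`, so the slide's $b^2\tau^2$ becomes `b * tau`.

```python
import math


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_sf(z):
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def dpln_cdf(x, alpha, beta, nu, tau):
    """dPlN CDF in the three-term form (tau is a standard deviation)."""
    if x <= 0.0:
        return 0.0
    y = math.log(x)
    t1 = norm_cdf((y - nu) / tau)
    t2 = (beta * x ** (-alpha) * math.exp(alpha * nu + 0.5 * alpha ** 2 * tau ** 2)
          * norm_cdf((y - nu - alpha * tau ** 2) / tau))
    t3 = (alpha * x ** beta * math.exp(-beta * nu + 0.5 * beta ** 2 * tau ** 2)
          * norm_sf((y - nu + beta * tau ** 2) / tau))
    return t1 + (t3 - t2) / (alpha + beta)


def power_transform_params(alpha, beta, nu, tau, a, b):
    """Parameters of W = a * X**b according to the closure property."""
    return alpha / b, beta / b, b * nu + math.log(a), b * tau


def closure_gap(w, alpha, beta, nu, tau, a, b):
    """Difference between the two ways of computing P[a X^b <= w]."""
    direct = dpln_cdf((w / a) ** (1.0 / b), alpha, beta, nu, tau)
    via_params = dpln_cdf(w, *power_transform_params(alpha, beta, nu, tau, a, b))
    return abs(direct - via_params)
```

For any `w > 0` the gap should be at the level of floating-point rounding.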

  8. New results
     $X, Y \sim dPlN$, $Z, W \sim NL$. What is:
     • $Z + W$, $Z - W$?
     • $X \cdot Y$, $X / Y$?
     • $X + Y$, $X - Y$?

  9. New results
     $X, Y \sim dPlN$, $Z, W \sim NL$. What is:
     • $Z + W$, $Z - W$? obtained
     • $X \cdot Y$, $X / Y$? obtained
     • $X + Y$, $X - Y$? very hard! $\int \exp(ax)\, \varphi(x + b)\, \Phi(x)\, dx$ (!?)

  10. Future work
      • more general formulas for NL, dPlN
      • lack of identifiability of dPlN:
        $$f(x_1, x_2, \ldots, x_n \mid \theta) = f(x_1, x_2, \ldots, x_n \mid \theta'), \quad \theta \ne \theta',$$
        where $\theta = (\alpha, \beta, \nu, \tau)$ and $f$ is the likelihood function. Sometimes, parameters are not estimated well!
      • queueing models

  11. dPlN and queueing systems
      • queueing systems are closely related to heavy tailed modelling (congestion in teletraffic systems, ruin problems in insurance...)
        Cooper, R. (1981). Introduction to Queueing Theory. North Holland, 2nd edition.
      • GI/M/c described in:
        Ausin, M., Lillo, R., and Wiper, M. (2007). Bayesian control of the number of servers in a GI/M/c queueing system. Journal of Statistical Planning and Inference, 137:3043-3057.
      • dPlN/M/1, M/dPlN/1 analyzed in:
        Ramirez, P., Lillo, R., Wilson, S., and Wiper, M. (2010). Bayesian inference for Double Pareto Lognormal queues. To appear in Annals of Applied Statistics.
      • next: dPlN/G/c queueing system! optimizing number of servers!

  12. Motivation
      • bringing together applied probability and algebra (known applications in analysis of variance, multivariate analysis and stationary processes)
        Some classical references:
        Girardin, V. and Senoussi, R. (2003). Semigroup stationary processes and spectral representation. Bernoulli, 9(5):857-876.
        Grenander, U. (1963). Probabilities on Algebraic Structures. John Wiley, New York.
        Hannan, E. (1965). Group representations and applied probability. J. Appl. Prob., 2:1-68.
      • What is the family of densities $\langle f \rangle$?

  13. Motivation
      • bringing together applied probability and algebra (known applications in analysis of variance, multivariate analysis and stationary processes)
        Some classical references:
        Girardin, V. and Senoussi, R. (2003). Semigroup stationary processes and spectral representation. Bernoulli, 9(5):857-876.
        Grenander, U. (1963). Probabilities on Algebraic Structures. John Wiley, New York.
        Hannan, E. (1965). Group representations and applied probability. J. Appl. Prob., 2:1-68.
      • What is the family of densities $\langle f \rangle$?
      Most distribution families are not closed under convolution. We have to define new operations!

  14. New results
      Gamma distribution:
      $$V = \{ g(\alpha, \beta) \mid \alpha > 0, \beta > 0 \}, \qquad g(\alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}.$$
      $\oplus : V \times V \to V$, $\quad g(\alpha, \beta) \oplus g(\alpha_1, \beta_1) = g(\alpha\alpha_1, \beta\beta_1)$
      $\otimes : \mathbb{R} \times V \to V$, $\quad c \otimes g(\alpha, \beta) = g(\alpha^c, \beta^c)$
      inner product: $\langle g(\alpha, \beta), g(\alpha_1, \beta_1) \rangle = \log\alpha \cdot \log\alpha_1 + \log\beta \cdot \log\beta_1$
      Then, the structure $(V, \mathbb{R}, \oplus, \otimes, \langle \cdot \rangle)$ is a pre-Hilbert space.
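The operations on the Gamma family act only on the parameter pair, so they are easy to experiment with. The sketch below is a hedged illustration: a density $g(\alpha, \beta)$ is represented by the tuple `(alpha, beta)`, and the function names are hypothetical. It lets one check numerically that $g(1, 1)$ plays the role of the zero vector and that the inner product is linear in its first argument.

```python
import math


def oplus(g, h):
    """g (+) h: multiply the parameters componentwise."""
    return (g[0] * h[0], g[1] * h[1])


def otimes(c, g):
    """c (x) g: raise both parameters to the power c."""
    return (g[0] ** c, g[1] ** c)


def inner(g, h):
    """Inner product defined through the logarithms of the parameters."""
    return math.log(g[0]) * math.log(h[0]) + math.log(g[1]) * math.log(h[1])


ZERO = (1.0, 1.0)  # neutral element of (+): g (+) ZERO == g


def negative(g):
    """Additive inverse with respect to (+)."""
    return otimes(-1.0, g)
```

For example, `oplus(g, negative(g))` returns `ZERO` up to rounding, and `inner(g, g)` is zero only for `g == ZERO`.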

  15. New results
      Normal distribution:
      $$V = \{ f(\nu, \tau^2) \mid \nu \in \mathbb{R}, \tau^2 > 0 \}, \qquad f(\nu, \tau^2) = \frac{1}{\tau\sqrt{2\pi}}\, e^{-\frac{1}{2}\frac{(x-\nu)^2}{\tau^2}}.$$
      $\oplus : V \times V \to V$, $\quad f(\nu, \tau^2) \oplus f(\nu_1, \tau_1^2) = f(\nu + \nu_1, \tau^2\tau_1^2)$
      $\otimes : \mathbb{R} \times V \to V$, $\quad c \otimes f(\nu, \tau^2) = f(c\nu, (\tau^2)^c)$
      inner product: $\langle f(\nu, \tau^2), f(\nu_1, \tau_1^2) \rangle = \nu\nu_1 + \log\tau^2 \cdot \log\tau_1^2$
      Then, the structure $(V, \mathbb{R}, \oplus, \otimes, \langle \cdot \rangle)$ is a pre-Hilbert space.
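One way to see why the axioms hold for the normal family is to pass to coordinates. The map $T$ below is notation introduced here for illustration and does not appear on the slides; it sends each density to a point of $\mathbb{R}^2$ and turns the new operations into the usual ones.

```latex
\[
T : V \to \mathbb{R}^2, \qquad T\bigl(f(\nu,\tau^2)\bigr) = (\nu,\ \log\tau^2),
\]
\[
T\bigl(f \oplus f_1\bigr) = T(f) + T(f_1), \qquad
T\bigl(c \otimes f\bigr) = c\,T(f), \qquad
\langle f, f_1 \rangle = T(f) \cdot T(f_1).
\]
```

Since $T$ is a bijection onto $\mathbb{R}^2$, the vector-space and inner-product axioms for $(V, \oplus, \otimes, \langle \cdot \rangle)$ follow from those of the Euclidean plane.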

  16. Conclusions and future work
      Operations $\oplus$ and $\otimes$ can be applied to:
      • any family of densities defined by two real parameters (at least one positive)
      • moment generating functions, characteristic functions (example: stable distributions)

  17. Motivation
      Central Limit Theorem:
      $$\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le \frac{C_0}{n^{1/2}}$$
      Error may be too large! A way to improve it:
      $$\left| F_n(x) - \sum_{j=0}^{k} \frac{A_j(x)}{n^{j/2}} \right| \le \frac{C_k(x)}{n^{(k+1)/2}}, \qquad A_0(x) = \Phi(x)$$

  18. Definition
      $F$ - d.f. to be approximated, $f$ - its c.f., $\{\kappa_r\}$ - its cumulants.
      We want to find the expansion based on a d.f. $\Psi$, with c.f. $\psi$ and cumulants $\{\gamma_r\}$:
      $$f(t) = \exp\left\{ \sum_{r=1}^{+\infty} \frac{(\kappa_r - \gamma_r)(it)^r}{r!} \right\} \psi(t) \quad \text{(holds)}$$
      Under certain conditions and after applying the inverse Fourier transform:
      $$F(x) = \exp\left\{ \sum_{r=1}^{+\infty} \frac{(\kappa_r - \gamma_r)(-D_x)^r}{r!} \right\} \Psi(x) \quad \text{(Charlier differential series)}$$
      $D_x$ - differential operator with respect to $x$

  19. Edgeworth expansions - definition
      $$F_n(x) = P\left[ \frac{\sqrt{n}}{\sigma} \left( \frac{X_1 + X_2 + \cdots + X_n}{n} - \mu \right) \le x \right],$$
      $X_i$ - iid r.v. with mean $\mu$ and variance $\sigma^2$, $\Phi$ - standard normal distribution.
      Collecting terms according to powers of $n$... Edgeworth expansion:
      $$f_n(t) = \left( 1 + \sum_{j=1}^{\infty} \frac{P_j(it)}{n^{j/2}} \right) \exp(-t^2/2), \qquad P_j \text{ - polynomial of degree } 3j,$$
      $$F_n(x) = \Phi(x) + \sum_{j=1}^{\infty} \frac{P_j(-D_x)}{n^{j/2}}\, \Phi(x)$$
      ($P_j$ - Cramér-Edgeworth polynomials)
      The series can be truncated after finitely many terms, with an error that becomes arbitrarily small as $n$ grows.
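To make the expansion concrete, the sketch below keeps only the $j = 1$ term, which for standardized sums reduces to the well-known skewness correction $F_n(x) \approx \Phi(x) - \frac{\lambda_3}{6\sqrt{n}}(x^2 - 1)\varphi(x)$, with $\lambda_3$ the skewness of $X_i$. The test case is an assumption chosen for illustration, not an example from the talk: $X_i \sim \text{Exp}(1)$ (so $\mu = \sigma = 1$, $\lambda_3 = 2$), for which the exact $F_n$ is available through the Erlang distribution of the sum. Function names are hypothetical.

```python
import math


def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def edgeworth_one_term(x, n, skewness):
    """Phi(x) plus the n^(-1/2) Edgeworth term (skewness correction)."""
    return norm_cdf(x) - skewness * (x * x - 1.0) * norm_pdf(x) / (6.0 * math.sqrt(n))


def exact_cdf_exp(x, n):
    """Exact F_n(x) for iid Exp(1) summands: the sum S_n is Erlang(n, 1)."""
    s = n + x * math.sqrt(n)  # standardized mean <= x  <=>  S_n <= s
    if s <= 0.0:
        return 0.0
    term = 1.0
    partial = 1.0
    for k in range(1, n):
        term *= s / k  # s^k / k!
        partial += term
    return 1.0 - math.exp(-s) * partial


def compare(x, n):
    """Return (exact, normal approximation, one-term Edgeworth) at x."""
    return exact_cdf_exp(x, n), norm_cdf(x), edgeworth_one_term(x, n, 2.0)
```

Calling `compare(0.0, 10)` shows the effect: the plain normal approximation misses the exact value by several hundredths, while the corrected value is much closer.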
