Semi-parametric estimation Large-Time Scaling (LRD). Fourier vs Wavelets


  1. WISP- 2004 - E. Moulines 1/46-1
     Semi-parametric estimation Large-Time Scaling (LRD). Fourier vs Wavelets
     E. Moulines (ENST), C. Hurvich (NYU), P. Soulier (U. Paris X), F. Roueff (ENST), M. Taqqu (BU)

  2. The need for sound statistical methods
     The richness of traffic is such that one is always in need of more powerful data gathering and processing infrastructures on the one hand, and statistical analysis methods on the other. For existing estimation techniques, the most urgent requirement is increasing their robustness to nonstationarities of various types, which will always be present, despite the luxury of huge data sets which allow apparently stationary subsets to be selected. Closely related to this is the need for formal hypothesis tests to more rigorously select between competing conclusions, and closely related in turn is the need for reliable confidence intervals to be computable, computed, and used intelligently.
     "Self-Similar Traffic and Network Dynamics", by Erramilli, Roughan, Veitch, Willinger (Proc. IEEE 2002)

  3. OUTLINE OF THE TALK
     • Today, I will survey methods to detect and assess large time scaling (definition to come)...
     • I will not spend time explaining why large time scaling is important or which plausible models explain large time scaling properties (read the excellent survey paper mentioned in the first slide!)
     • As you might know, the statisticians come too few and too late... (like the US cavalry)! I hope the issue is still of some importance to some of you!

  4. • Introduction
     • Fourier Methods
     • Wavelet Methods
     • Pros and Cons and conclusion

  5. FRACTIONAL MODELS
     • A covariance stationary process $\{X_t\}$ is said to be fractional if its spectral density is given by
       $$f(x) = |1 - e^{ix}|^{-2d} f^*(x), \qquad d < 1/2,$$
       where $f^*$ is continuous at zero frequency.
     • Allowing $d$ to take non-integer values produces a fundamental change in the correlation structure of the process as compared to the correlation structure of a standard time series... The covariance coefficients decay hyperbolically:
       $$\rho(\tau) := \mathrm{Cov}(X_\tau, X_0) = O(\tau^{-1+2d}) \quad \text{as } \tau \to \infty.$$
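
To see why $d$ alone governs the behaviour of $f$ near zero frequency, the factor $|1 - e^{ix}|$ can be expanded explicitly. The short derivation below is a sketch that uses only the continuity of $f^*$ at zero stated in the definition, plus the extra assumption $f^*(0) > 0$ (needed for the asymptotic equivalence to be meaningful).

```latex
\begin{align*}
|1 - e^{ix}|^2 &= (1 - \cos x)^2 + \sin^2 x = 2 - 2\cos x = 4 \sin^2(x/2), \\
f(x) &= |2 \sin(x/2)|^{-2d} f^*(x) \;\sim\; |x|^{-2d} f^*(0)
  \quad \text{as } x \to 0 \quad (\text{assuming } f^*(0) > 0).
\end{align*}
```

In particular, for $0 < d < 1/2$ the spectral density has an integrable pole at the origin, while for $d = 0$ it reduces to $f^*$.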

  6. SEMI-PARAMETRIC ESTIMATION
     • In the semi-parametric setting (SPS), a full parametric model is not specified for the "smooth part" $f^*$ of the spectral density: $f^*$ is considered as an infinite-dimensional nuisance parameter.
     • Two distinct approaches:
       1. Local-to-zero methods: estimators that estimate $d$ and $f^*(0)$ and which are consistent without any restrictions on $f^*$ away from zero, apart from integrability on $[-\pi, +\pi]$.
       2. Global methods: estimators that jointly estimate $d$ and $f^*$ over the whole frequency range, and which are consistent over classes of functions implying "global" regularity conditions.

  7. OUTLINE OF THE TALK
     • The Semi-parametric setting
     • Fourier Methods
     • Wavelet Methods
     • Pros and Cons and conclusion

  8. PERIODOGRAM AT FOURIER FREQUENCIES
     • The oldest and most natural tool for spectral estimation is the periodogram (100 years Before Internet!)
     • Given an observation $X_1, \dots, X_n$, the ordinary discrete Fourier transform (DFT) and the periodogram are respectively defined as
       $$d^X_n(x) = (2\pi n)^{-1/2} \sum_{t=1}^{n} X_t e^{itx}, \qquad I^X_n(x) = |d^X_n(x)|^2 = (2\pi n)^{-1} \left| \sum_{t=1}^{n} X_t e^{itx} \right|^2.$$
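
The two definitions above translate almost literally into code. The following is a minimal sketch, not taken from the slides: it assumes the Fourier frequencies are $x_k = 2\pi k / n$ for $k = 1, \dots, \lfloor (n-1)/2 \rfloor$, and the function name `periodogram` is hypothetical. It uses a direct $O(n^2)$ sum for readability; an FFT would normally be used in practice.

```python
import cmath
import math


def periodogram(x):
    """Periodogram of the sequence x at the Fourier frequencies.

    Assumption (not stated on the slide): x_k = 2*pi*k/n, k = 1..floor((n-1)/2).
    Returns (freqs, ordinates) as two lists of equal length.
    """
    n = len(x)
    k_max = (n - 1) // 2
    freqs, ordinates = [], []
    for k in range(1, k_max + 1):
        xk = 2.0 * math.pi * k / n
        # DFT with the normalisation (2*pi*n)^(-1/2) * sum_{t=1}^{n} X_t e^{i t x}
        dft = sum(x[t - 1] * cmath.exp(1j * t * xk) for t in range(1, n + 1))
        dft /= math.sqrt(2.0 * math.pi * n)
        freqs.append(xk)
        # The periodogram is the squared modulus of the DFT
        ordinates.append(abs(dft) ** 2)
    return freqs, ordinates
```

For a pure cosine at the third Fourier frequency, all the mass sits in a single ordinate, which gives a quick sanity check of the normalisation.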

  9. PERIODOGRAM AT FOURIER FREQUENCIES
     Under miscellaneous weak dependence conditions,
     • the periodogram is an asymptotically unbiased estimate of the spectral density, i.e.
       $$E[I^X_n(x_k)] = f(x_k) + O(n^{-1}), \qquad 1 \le k \le \tilde{n},$$
       where the $O(n^{-1})$ term is uniform in $k$,
     • the periodogram ordinates are asymptotically uncorrelated,
       $$\mathrm{var}(I^X_n(x_k)) = f(x_k)^2 + O(n^{-1}), \qquad \mathrm{cov}(I^X_n(x_k), I^X_n(x_l)) = O(n^{-1}), \quad k \ne l,$$
       where the $O(n^{-1})$ term is uniform w.r.t. $k, l$.

  10. THE PERIODOGRAM OF A FRACTIONAL PROCESS: BAD NEWS
      • For LRD processes ($0 < d < 1/2$), none of the above-mentioned properties remains valid (Künsch (1986), Hurvich and Beltrao (1993))!
      • The bias is not vanishingly small: for any given $k \in \{1, \dots, [(n-1)/2]\}$,
        $$\lim_{n \to \infty} E[I^X_n(x_k)] / f(x_k) \ne 1.$$
      • The periodogram ordinates are not asymptotically uncorrelated: for any given $k, j$ with $1 \le k < j \le [(n-1)/2]$,
        $$\lim_{n \to \infty} \left| \mathrm{cov}\left( I^X_n(x_k)/f(x_k),\; I^X_n(x_j)/f(x_j) \right) \right| \ne 0.$$

  11. THE PERIODOGRAM OF A FRACTIONAL PROCESS: GOOD NEWS
      • Nevertheless, the bias is vanishingly small for frequencies sufficiently far away from zero:
        $$\left| E[I_n(x_k)/f(x_k)] - 1 \right| \le C k^{-1}.$$
      • The normalized periodogram ordinates are asymptotically uncorrelated:
        $$\left| \mathrm{cov}\left( I_n(x_k)/f(x_k),\; I_n(x_j)/f(x_j) \right) \right| \le C k^{-2d} (j-k)^{2d-1}, \qquad k < j.$$

  12. THE GEWEKE PORTER-HUDAK (GPH) ESTIMATOR
      • In the neighborhood of the zero frequency, $f(x) \approx |1 - e^{ix}|^{-2d} f^*(0)$. Therefore,
        $$\log f(x) \approx d\, g(x) + \log f^*(0), \qquad g(x) = -2 \log |1 - e^{ix}|.$$
      • Writing $\log I^X_n(x_k) = \log f(x_k) + \log\left( I^X_n(x_k)/f(x_k) \right)$ and plugging in the expression above,
        $$\log I^X_n(x_k) = d\, g(x_k) + c + \left\{ \log\left( I^X_n(x_k)/f(x_k) \right) - \gamma \right\},$$
        so that the constant $c$ absorbs $\log f^*(0) + \gamma$.
      • This suggests estimating $d$ as the regression coefficient associated with $g$:
        $$(\hat{d}_{GPH}(M), \hat{c}_{GPH}(M)) = \arg\min_{\bar{d}, \bar{c}} \sum_{k=1}^{M} \left( \log I^X_n(x_k) - \bar{d}\, g(x_k) - \bar{c} \right)^2.$$
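
Because the GPH criterion is an ordinary least-squares problem with a single regressor, the minimiser has a closed form. The sketch below is an illustration under stated assumptions: the function name `gph_estimate` is hypothetical, the periodogram ordinates are assumed to have been computed beforehand at frequencies sorted from lowest to highest, and the identity $|1 - e^{ix}| = 2|\sin(x/2)|$ is used to evaluate $g$.

```python
import math


def gph_estimate(freqs, ordinates, M):
    """Log-periodogram regression on the M lowest Fourier frequencies.

    freqs, ordinates: frequencies x_k (ascending) and periodogram values I(x_k).
    Returns (d_hat, c_hat), the slope and intercept of the regression of
    log I(x_k) on g(x_k) = -2 log|1 - e^{i x_k}|.
    """
    # g(x) = -2 log|1 - e^{ix}|, using |1 - e^{ix}| = 2|sin(x/2)|
    g = [-2.0 * math.log(abs(2.0 * math.sin(xk / 2.0))) for xk in freqs[:M]]
    y = [math.log(v) for v in ordinates[:M]]
    g_bar = sum(g) / M
    y_bar = sum(y) / M
    # Closed-form simple linear regression
    sxx = sum((gi - g_bar) ** 2 for gi in g)
    sxy = sum((gi - g_bar) * (yi - y_bar) for gi, yi in zip(g, y))
    d_hat = sxy / sxx
    c_hat = y_bar - d_hat * g_bar
    return d_hat, c_hat
```

When the ordinates are replaced by an exact fractional spectral density $C\,|1 - e^{ix}|^{-2d}$, the regression is noise-free and returns $d$ and $\log C$ up to rounding, which is a convenient check that the sign convention for $g$ is right.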

  13. [Figure: log-periodogram (dB) versus log-frequency, frequencies from $10^{-4}$ to $10^{0}$. Panel title: FARIMA(0,d,0), d = 0.4, φ = −0.9.]
      GPH estimator for a FARIMA(1,d,0) process, $(I - B)^d (1 - \phi B) X = Z$, $\phi = 0.9$. Blue line: log-periodogram. Green line: least-squares fit of the intercept.

  14. THE LOCAL WHITTLE ESTIMATOR (LWE)
      • Assume that $(d_{n,1}, \dots, d_{n,M})$ are independent zero-mean complex Gaussian random variables satisfying
        $$E|d_{n,k}|^2 = s_{n,k}, \qquad E\, d_{n,k}^2 = 0.$$
      • The negated log-likelihood of $(d_{n,1}, \dots, d_{n,M})$ is
        $$\sum_{k=1}^{M} \left\{ \log(s_{n,k}) + \frac{|d_{n,k}|^2}{s_{n,k}} \right\}.$$
      • Idea: approximate the log-likelihood of $(d^X_n(x_1), \dots, d^X_n(x_M))$ by
        $$-\sum_{k=1}^{M} \left\{ \log(f(x_k)) + \frac{I^X_n(x_k)}{f(x_k)} \right\}.$$
        Of course, this is not quite true (see the comments above), but we may nevertheless expect this approximation to yield sensible estimates.

  15. THE LOCAL WHITTLE ESTIMATOR
      The Local Whittle Estimator (LWE, subscripted GSE in the formulas) is defined as the minimizer
      $$(\hat{d}_{GSE}, \hat{C}_M) = \arg\min_{\bar{d}, \bar{C}} \; M^{-1} \sum_{k=1}^{M} \left\{ \log\left( \bar{C}\, |1 - e^{ix_k}|^{-2\bar{d}} \right) + \frac{I^X_n(x_k)}{\bar{C}\, |1 - e^{ix_k}|^{-2\bar{d}}} \right\},$$
      where $M$ is a bandwidth parameter.
      • Contrary to GPH, there is no closed-form solution...
      • However, the problem can be solved for $C$ for any given $d$, yielding a profile quasi-likelihood which depends only on the single parameter $d$... this is not a tough optimization problem!
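
The profiling step mentioned in the last bullet can be made explicit. For fixed $d$, minimising over $C$ gives $\hat{C}(d) = M^{-1} \sum_k I(x_k)\, |1 - e^{ix_k}|^{2d}$, and the criterion reduces (up to an additive constant) to $R(d) = \log \hat{C}(d) - 2d\, M^{-1} \sum_k \log |1 - e^{ix_k}|$. The sketch below minimises $R$ by golden-section search; the function name `local_whittle`, the search interval $[-0.49, 0.49]$ and the tolerance are assumptions made for illustration, not choices from the slides.

```python
import math


def local_whittle(freqs, ordinates, M, d_min=-0.49, d_max=0.49, tol=1e-8):
    """Local Whittle estimate of d from the M lowest Fourier frequencies.

    Returns (d_hat, C_hat). The search interval [d_min, d_max] is an
    illustrative assumption.
    """
    # lam_k = |1 - e^{i x_k}| = 2|sin(x_k / 2)|
    lam = [abs(2.0 * math.sin(xk / 2.0)) for xk in freqs[:M]]
    per = list(ordinates[:M])
    mean_log_lam = sum(math.log(l) for l in lam) / M

    def c_hat(d):
        # Minimiser over C for a given d
        return sum(ik * l ** (2.0 * d) for ik, l in zip(per, lam)) / M

    def profile(d):
        # Profile criterion R(d); it is convex in d (log-sum-exp minus linear)
        return math.log(c_hat(d)) - 2.0 * d * mean_log_lam

    # Golden-section search for the minimum of a unimodal function
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = d_min, d_max
    c = b - inv_phi * (b - a)
    e = a + inv_phi * (b - a)
    fc, fe = profile(c), profile(e)
    while b - a > tol:
        if fc < fe:
            b, e, fe = e, c, fc
            c = b - inv_phi * (b - a)
            fc = profile(c)
        else:
            a, c, fc = c, e, fe
            e = a + inv_phi * (b - a)
            fe = profile(e)
    d_hat = 0.5 * (a + b)
    return d_hat, c_hat(d_hat)
```

A one-dimensional bracketing search is enough here because the profile criterion is convex in $d$; any scalar minimiser would do equally well.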

  16. FEXP ESTIMATOR
      • Principle: estimate $d$ and the coefficients of a truncated expansion of $\log f^*$ on the cosine basis.
      • Define $h_0 = 1/\sqrt{2\pi}$ and $h_j(x) = \cos(jx)/\sqrt{\pi}$, $j \ge 1$. The log-periodogram regression estimator of $d$ is defined by
        $$(\hat{d}_{FEXP}(q), \hat{\theta}_0, \dots, \hat{\theta}_q) = \arg\min_{\bar{d}, \bar{\theta}_0, \dots, \bar{\theta}_q} \sum_{k=1}^{K} \left( \log I^X_n(x_k) - \bar{d}\, g(x_k) - \sum_{j=0}^{q} \bar{\theta}_j h_j(x_k) \right)^2.$$
      • The choice of the bandwidth parameter $M$ is replaced here by the choice of the truncation index $q$.
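
The FEXP criterion is again a linear least-squares problem, now with $q + 2$ regressors: $g$ and the cosine functions $h_0, \dots, h_q$. The following sketch solves the normal equations with a small Gaussian elimination routine so that it needs no external library; the names `fexp_estimate` and `solve_linear` are hypothetical, and all supplied frequencies are used (the role of $K$ in the formula above). For large $q$ a QR-based solver would be numerically safer than normal equations.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fexp_estimate(freqs, ordinates, q):
    """Regress log I(x_k) on g(x_k) and h_0(x_k), ..., h_q(x_k).

    Returns (d_hat, theta_hat) where theta_hat is a list of length q + 1.
    """
    rows = []
    for xk in freqs:
        # g(x) = -2 log|1 - e^{ix}| with |1 - e^{ix}| = 2|sin(x/2)|
        g = -2.0 * math.log(abs(2.0 * math.sin(xk / 2.0)))
        h = [1.0 / math.sqrt(2.0 * math.pi)]
        h += [math.cos(j * xk) / math.sqrt(math.pi) for j in range(1, q + 1)]
        rows.append([g] + h)
    y = [math.log(v) for v in ordinates]
    p = q + 2
    # Normal equations (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yk for r, yk in zip(rows, y)) for i in range(p)]
    coef = solve_linear(xtx, xty)
    return coef[0], coef[1:]
```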

  17. ASYMPTOTIC NORMALITY: GPH / GSE estimator
      • (Loc1) There exist a real $d < 1/2$, a square-summable sequence $\{\psi_j\}$ and a zero-mean unit-variance white noise $\{Z_t\}_{t \in \mathbb{Z}}$ such that
        $$X_t = (I - B)^{-d} Y_t \quad \text{and} \quad Y_t = \sum_{k=-\infty}^{\infty} \psi_k Z_{t-k}.$$
        We denote by $\hat{\psi}(x)$ the Fourier transform of $\{\psi_k\}$ and $f^* = |\hat{\psi}|^2$.
      • (Loc2) $1/L \le f^*(0) \le L$ and $|f^*(x) - f^*(0)| \le L |x|^{\beta}$, for $x \in [0, \Delta]$ and some $L > 0$.
      • (Loc3) The bandwidth $M = M_n$ is a non-decreasing function of $n$ (the sample size) which satisfies
        $$\lim_{n \to \infty} \left( M_n^{-1} + M_n\, n^{-\frac{2\beta}{1+2\beta}} \right) = 0.$$

  18. ASYMPTOTIC NORMALITY: GPH / GSE estimator
      Assume (Loc1-3).
      • If $Z$ is a martingale increment sequence (plus additional conditions), then
        $$\sqrt{M_n}\, (\hat{d}_{GSE}(M_n) - d) \to_d N(0, 1/4).$$
      • If $Z$ is Gaussian, or $Z$ is i.i.d. and satisfies moment conditions and Cramér's condition (non-lattice), then
        $$\sqrt{M_n}\, (\hat{d}_{GPH}(M_n) - d) \to_d N(0, \pi^2/24).$$
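
The two limit laws above lead directly to approximate confidence intervals for $d$: the estimator is roughly normal with variance $1/(4M)$ for GSE and $\pi^2/(24M)$ for GPH. The sketch below is a plain application of those variances under the stated assumptions (Loc1-3); the function name `confidence_interval` and the default 95% normal quantile are illustrative choices.

```python
import math

# Asymptotic variances of sqrt(M) * (d_hat - d) from the limit laws above
ASYMPTOTIC_VARIANCE = {
    "GSE": 0.25,
    "GPH": math.pi ** 2 / 24.0,
}


def confidence_interval(d_hat, M, estimator="GSE", z=1.959963984540054):
    """Approximate two-sided confidence interval for d.

    z is the standard normal quantile (default: 97.5%, i.e. a 95% interval).
    Valid only asymptotically, with bandwidth M satisfying (Loc3).
    """
    half_width = z * math.sqrt(ASYMPTOTIC_VARIANCE[estimator] / M)
    return d_hat - half_width, d_hat + half_width
```

With the same bandwidth, the GPH interval is wider than the GSE one by the factor $\sqrt{\pi^2/6} \approx 1.28$, which is the efficiency loss implied by the two variances.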
