

  1. Time Series Analysis
     Henrik Madsen, hm@imm.dtu.dk
     Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800 Kgs. Lyngby
     Reference: H. Madsen, Time Series Analysis, Chapman & Hall.

  2. Outline of the lecture
     Identification of univariate time series models, continued:
     - Estimation of model parameters, Sec. 6.4 (cont.)
     - Model order selection, Sec. 6.5
     - Model validation, Sec. 6.6

  3. Estimation methods (from the previous lecture)
     We have an appropriate model structure AR(p), MA(q), ARMA(p, q), or ARIMA(p, d, q) with p, d, and q known.
     Task: based on the observations, find appropriate values of the parameters.
     The book describes several methods:
     - Moment estimates
     - LS estimates
     - Prediction error estimates (conditional or unconditional)
     - ML estimates (conditional or unconditional/exact)

  4. Maximum likelihood estimates
     ARMA(p, q) process: $Y_t + \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$
     Notation: $\boldsymbol{\theta}^T = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q)$ and $\boldsymbol{Y}_t^T = (Y_t, Y_{t-1}, \ldots, Y_1)$.
     The likelihood function is the joint probability density of all observations for given values of $\boldsymbol{\theta}$ and $\sigma_\varepsilon^2$:
     $L(\boldsymbol{Y}_N; \boldsymbol{\theta}, \sigma_\varepsilon^2) = f(\boldsymbol{Y}_N \mid \boldsymbol{\theta}, \sigma_\varepsilon^2)$
     Given the observations $\boldsymbol{Y}_N$ we estimate $\boldsymbol{\theta}$ and $\sigma_\varepsilon^2$ as the values for which the likelihood is maximized.

  5. The likelihood function for ARMA(p, q) models
     The random variable $Y_N \mid \boldsymbol{Y}_{N-1}$ contains $\varepsilon_N$ as its only random component, and $\varepsilon_N$ is white noise at time $N$ and therefore does not depend on the past. The random variables $Y_N \mid \boldsymbol{Y}_{N-1}$ and $\boldsymbol{Y}_{N-1}$ are hence independent, so (see also page 3):
     $f(\boldsymbol{Y}_N \mid \boldsymbol{\theta}, \sigma_\varepsilon^2) = f(Y_N \mid \boldsymbol{Y}_{N-1}, \boldsymbol{\theta}, \sigma_\varepsilon^2)\, f(\boldsymbol{Y}_{N-1} \mid \boldsymbol{\theta}, \sigma_\varepsilon^2)$
     Repeating these arguments:
     $L(\boldsymbol{Y}_N; \boldsymbol{\theta}, \sigma_\varepsilon^2) = \left[ \prod_{t=p+1}^{N} f(Y_t \mid \boldsymbol{Y}_{t-1}, \boldsymbol{\theta}, \sigma_\varepsilon^2) \right] f(\boldsymbol{Y}_p \mid \boldsymbol{\theta}, \sigma_\varepsilon^2)$

  6. The conditional likelihood function
     Evaluation of $f(\boldsymbol{Y}_p \mid \boldsymbol{\theta}, \sigma_\varepsilon^2)$ requires special attention. It turns out that the conditional likelihood function
     $L(\boldsymbol{Y}_N; \boldsymbol{\theta}, \sigma_\varepsilon^2) = \prod_{t=p+1}^{N} f(Y_t \mid \boldsymbol{Y}_{t-1}, \boldsymbol{\theta}, \sigma_\varepsilon^2)$
     yields the same estimates as the exact likelihood function when many observations are available; for small samples there can be some difference.
     Software: the S-PLUS function arima.mle calculates conditional estimates; the R function arima calculates exact estimates.
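A minimal sketch of the R route, on a simulated ARMA(1,1) series like the one used later in the lecture (the model orders are taken as known here; note that R's and the book's sign conventions for the parameters differ):

```r
# Fit an ARMA(1,1) by (exact) maximum likelihood with R's arima().
set.seed(1)
y <- arima.sim(model = list(ar = -0.7, ma = 0.4), n = 500, sd = 0.25)

fit <- arima(y, order = c(1, 0, 1), include.mean = FALSE, method = "ML")
fit         # estimated coefficients with standard errors
fit$sigma2  # ML estimate of the innovation variance
```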

  7. Evaluating the conditional likelihood function
     Task: find the conditional densities for specified values of the parameters $\boldsymbol{\theta}$ and $\sigma_\varepsilon^2$.
     The mean of the random variable $Y_t \mid \boldsymbol{Y}_{t-1}$ is the 1-step forecast $\widehat{Y}_{t|t-1}$, and the prediction error $\varepsilon_t = Y_t - \widehat{Y}_{t|t-1}$ has variance $\sigma_\varepsilon^2$. We assume that the process is Gaussian:
     $f(Y_t \mid \boldsymbol{Y}_{t-1}, \boldsymbol{\theta}, \sigma_\varepsilon^2) = \frac{1}{\sigma_\varepsilon \sqrt{2\pi}} \exp\!\left( -\big(Y_t - \widehat{Y}_{t|t-1}(\boldsymbol{\theta})\big)^2 / 2\sigma_\varepsilon^2 \right)$
     And therefore:
     $L(\boldsymbol{Y}_N; \boldsymbol{\theta}, \sigma_\varepsilon^2) = (\sigma_\varepsilon^2\, 2\pi)^{-\frac{N-p}{2}} \exp\!\left( -\frac{1}{2\sigma_\varepsilon^2} \sum_{t=p+1}^{N} \varepsilon_t^2(\boldsymbol{\theta}) \right)$
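The conditional log-likelihood is a direct transcription of the last formula; a sketch (the vector eps is assumed to hold the prediction errors $\varepsilon_{p+1}, \ldots, \varepsilon_N$):

```r
# Conditional Gaussian log-likelihood from the one-step prediction errors.
cond_loglik <- function(eps, sigma2) {
  -length(eps) / 2 * log(2 * pi * sigma2) - sum(eps^2) / (2 * sigma2)
}
```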

  8. ML estimates
     The (conditional) ML estimate $\widehat{\boldsymbol{\theta}}$ is a prediction error estimate, since it is obtained by minimizing
     $S(\boldsymbol{\theta}) = \sum_{t=p+1}^{N} \varepsilon_t^2(\boldsymbol{\theta})$
     By differentiating w.r.t. $\sigma_\varepsilon^2$ it can be shown that the ML estimate of $\sigma_\varepsilon^2$ is $\widehat{\sigma}_\varepsilon^2 = S(\widehat{\boldsymbol{\theta}}) / (N - p)$.
     The estimate $\widehat{\boldsymbol{\theta}}$ is asymptotically "good", and its variance-covariance matrix is approximately $2\sigma_\varepsilon^2 \boldsymbol{H}^{-1}$, where $\boldsymbol{H}$ contains the 2nd-order partial derivatives of $S(\boldsymbol{\theta})$ at the minimum.
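A sketch of these two estimates for an AR(1) written in the book's convention $Y_t + \phi_1 Y_{t-1} = \varepsilon_t$, with optim's numerical Hessian standing in for $\boldsymbol{H}$ (simulated data; the function name S is illustrative):

```r
# Prediction-error / ML estimation for an AR(1), Y_t + phi * Y_{t-1} = eps_t,
# with the asymptotic variance 2 * sigma2 * H^{-1} from the Hessian of S.
set.seed(1)
y <- arima.sim(model = list(ar = 0.7), n = 500, sd = 0.25)  # phi = -0.7 here
N <- length(y)

S <- function(phi) sum((y[2:N] + phi * y[1:(N - 1)])^2)

opt     <- optim(0, S, method = "BFGS", hessian = TRUE)
sigma2  <- opt$value / (N - 1)               # S(theta-hat) / (N - p), p = 1
var_phi <- 2 * sigma2 * solve(opt$hessian)   # approximate variance of phi-hat
c(phi = opt$par, se = sqrt(var_phi))
```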

  9. Finding the ML estimates using the PE method
     1-step predictions:
     $\widehat{Y}_{t|t-1} = -\phi_1 Y_{t-1} - \cdots - \phi_p Y_{t-p} + 0 + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$
     If we use $\varepsilon_p = \varepsilon_{p-1} = \cdots = \varepsilon_{p+1-q} = 0$ we can find
     $\widehat{Y}_{p+1|p} = -\phi_1 Y_p - \cdots - \phi_p Y_1 + 0 + \theta_1 \varepsilon_p + \cdots + \theta_q \varepsilon_{p+1-q}$
     which gives us $\varepsilon_{p+1} = Y_{p+1} - \widehat{Y}_{p+1|p}$; we can then calculate $\widehat{Y}_{p+2|p+1}$ and $\varepsilon_{p+2}$, and so on until we have all the 1-step prediction errors we need.
     We use numerical optimization to find the parameters which minimize the sum of squared prediction errors.
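A minimal sketch of this recursion for an ARMA(1,1), with $S(\boldsymbol{\theta})$ minimized by optim (the names pe_errors and S are illustrative; the data-generating call matches the next slide):

```r
# One-step prediction errors for an ARMA(1,1) in the convention
# (1 + phi1 B) Y_t = (1 + theta1 B) eps_t, started with eps_1 = 0.
pe_errors <- function(par, y) {
  phi1 <- par[1]; theta1 <- par[2]
  N   <- length(y)
  eps <- numeric(N)                                   # eps[1] = 0 (conditioning)
  for (t in 2:N) {
    yhat   <- -phi1 * y[t - 1] + theta1 * eps[t - 1]  # 1-step forecast
    eps[t] <- y[t] - yhat                             # prediction error
  }
  eps[-1]
}

S <- function(par, y) sum(pe_errors(par, y)^2)        # sum of squared PEs

set.seed(1)
y   <- arima.sim(model = list(ar = -0.7, ma = 0.4), n = 500, sd = 0.25)
opt <- optim(c(0, 0), S, y = y)
opt$par   # (phi1, theta1); close to (0.7, 0.4) in this sign convention
```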

  10. $S(\boldsymbol{\theta})$ for $(1 + 0.7B) Y_t = (1 - 0.4B) \varepsilon_t$ with $\sigma_\varepsilon^2 = 0.25^2$
      Data: arima.sim(model=list(ar=-0.7, ma=0.4), n=500, sd=0.25)
      [Figure: contour plot of $S(\boldsymbol{\theta})$ with the MA parameter on the horizontal axis (-1.0 to 0.5) and the AR parameter on the vertical axis (-0.4 to 1.0); contour levels 30 to 45.]
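Reusing pe_errors and S from the sketch above, such a contour plot can be reproduced roughly like this (grid ranges chosen to match the figure's axes):

```r
# Evaluate S on a grid of AR and MA parameter values and draw contours.
set.seed(1)
y <- arima.sim(model = list(ar = -0.7, ma = 0.4), n = 500, sd = 0.25)

ar_grid <- seq(-0.4, 1.0, length.out = 60)
ma_grid <- seq(-1.0, 0.5, length.out = 60)
Sgrid <- outer(ma_grid, ar_grid,
               Vectorize(function(ma, ar) S(c(ar, ma), y)))
contour(ma_grid, ar_grid, Sgrid,
        xlab = "MA-parameter", ylab = "AR-parameter")
```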

  11. Moment estimates
      Given the model structure:
      - Find formulas for the theoretical autocorrelation or autocovariance as a function of the parameters in the model.
      - Estimate it, e.g. by calculating the SACF.
      - Solve the equations, using the lowest lags necessary.
      Complicated, and the general properties of the estimator are unknown! A small worked example for an MA(1) follows below.
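For an MA(1), $Y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1}$, the theoretical autocorrelation is $\rho(1) = \theta_1 / (1 + \theta_1^2)$, so equating it to the SACF at lag 1 gives a quadratic in $\theta_1$; a sketch (a real, invertible root requires $|\hat\rho(1)| \le 0.5$):

```r
# Moment estimate for an MA(1): solve r1 * theta^2 - theta + r1 = 0
# and keep the invertible root (|theta| < 1).
set.seed(1)
y  <- arima.sim(model = list(ma = 0.4), n = 500)
r1 <- acf(y, lag.max = 1, plot = FALSE)$acf[2]   # SACF at lag 1
theta1 <- (1 - sqrt(1 - 4 * r1^2)) / (2 * r1)
theta1   # should be near 0.4
```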

  12. Moment estimates for AR(p) processes
      In this case moment estimates are simple to find, due to the Yule-Walker equations (page 104). We simply plug the estimated autocorrelation function at lags 1 to p into
      $\begin{pmatrix} \hat\rho(1) \\ \hat\rho(2) \\ \vdots \\ \hat\rho(p) \end{pmatrix} = \begin{pmatrix} 1 & \hat\rho(1) & \cdots & \hat\rho(p-1) \\ \hat\rho(1) & 1 & \cdots & \hat\rho(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ \hat\rho(p-1) & \hat\rho(p-2) & \cdots & 1 \end{pmatrix} \begin{pmatrix} -\hat\phi_1 \\ -\hat\phi_2 \\ \vdots \\ -\hat\phi_p \end{pmatrix}$
      and solve w.r.t. the $\phi$'s. The function ar in S-PLUS and R uses this approach by default.
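A sketch with R's ar (Yule-Walker is its default method; the order is fixed rather than AIC-selected here for illustration):

```r
# Yule-Walker (moment) estimation of an AR(2).
set.seed(1)
y   <- arima.sim(model = list(ar = c(0.5, -0.3)), n = 500)
fit <- ar(y, aic = FALSE, order.max = 2, method = "yule-walker")
fit$ar   # note: ar() uses the convention Y_t = a_1 Y_{t-1} + ... + eps_t
```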

  13. Model building
      Input: data, theory, physical insight.
      1. Identification (specifying the model order)
      2. Estimation (of the model parameters)
      3. Model checking: is the model OK? If no, return to identification; if yes, use the model in applications (prediction, simulation, etc.)

  14. Validation of the model and extensions/reductions
      Residual analysis (Sec. 6.6.2): is it possible to detect problems with the residuals? (The 1-step prediction errors using the estimates, i.e. $\{\varepsilon_t(\widehat{\boldsymbol{\theta}})\}$, should be white noise.)
      If the SACF or the SPACF of $\{\varepsilon_t(\widehat{\boldsymbol{\theta}})\}$ points towards a particular ARMA structure, we can derive how the original model should be extended (Sec. 6.5.1).
      If the model passes the residual analysis, it makes sense to test null hypotheses about the parameters (Sec. 6.5.2).

  15. Residual analysis
      Plot $\{\varepsilon_t(\widehat{\boldsymbol{\theta}})\}$; do the residuals look stationary?
      Tests on the autocorrelation: if $\{\varepsilon_t(\widehat{\boldsymbol{\theta}})\}$ is white noise, then $\hat\rho_\varepsilon(k)$ is approximately Gaussian distributed with mean 0 and variance $1/N$.
      If the model fails, calculate the SPACF as well and see if an ARMA structure for the residuals can be derived (Sec. 6.5.1).
      Since $\hat\rho_\varepsilon(k_1)$ and $\hat\rho_\varepsilon(k_2)$ are independent (Eq. 6.4), the test statistic $Q = N \sum_{k=1}^{m} \hat\rho_\varepsilon^2(k)$ is approximately $\chi^2(m - n)$ distributed, where $n$ is the number of estimated parameters.
      S-PLUS: arima.diag('output from arima.mle')
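In R, the same portmanteau idea is available through Box.test, whose fitdf argument subtracts the $n$ estimated parameters from the degrees of freedom; a sketch using the ARMA(1,1) fit from earlier:

```r
# Portmanteau test on the residuals of an ARMA(1,1) fit.
set.seed(1)
y   <- arima.sim(model = list(ar = -0.7, ma = 0.4), n = 500, sd = 0.25)
fit <- arima(y, order = c(1, 0, 1), include.mean = FALSE)
res <- residuals(fit)

acf(res)   # SACF with approximate +/- 2/sqrt(N) bounds
Box.test(res, lag = 20, type = "Box-Pierce", fitdf = 2)   # df = m - n
```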

  16. Residual analysis (continued)
      Test for the number of changes of sign: in a series of length $N$ there are $N - 1$ possibilities for a change of sign. If the series is white noise (with mean zero), the probability of a change is $1/2$ and the changes are independent; the number of changes is therefore distributed as $\mathrm{Bin}(N - 1, 1/2)$.
      S-PLUS: binom.test(N-1, 'No. of changes')
      The test in the scaled cumulated periodogram of the residuals is done by plotting it and adding lines at $\pm K_\alpha / \sqrt{q}$, where $q = (N - 2)/2$ for $N$ even and $q = (N - 1)/2$ for $N$ odd. For $1 - \alpha$ confidence limits, $K_\alpha$ can be found in Table 6.2.
      S-PLUS (95% confidence interval): library(MASS); cpgram('residuals')
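A sketch of both tests in R, continuing from the residuals res in the sketch above (note that R's binom.test takes the count of changes first, then the number of trials):

```r
# Sign-change test and cumulated periodogram for the residuals.
library(MASS)                            # provides cpgram()

N <- length(res)                         # 'res' from the previous sketch
changes <- sum(diff(sign(res)) != 0)     # number of changes of sign
binom.test(changes, N - 1, p = 0.5)      # Bin(N - 1, 1/2) under white noise

cpgram(res)   # cumulated periodogram with 95% confidence lines
```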

  17. The sum of squared residuals depends on the model size
      [Figure: $S(\widehat{\boldsymbol{\theta}})$ plotted against $i$ = number of parameters ($i = 1, \ldots, 7$); the curve decreases as parameters are added. It is assumed that the models are nested.]
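A sketch reproducing such a curve with nested AR(i) fits (simulated data; the true order is 2, so the drop flattens after $i = 2$):

```r
# S(theta-hat) as a function of model size for nested AR(i) models.
set.seed(1)
y  <- arima.sim(model = list(ar = c(0.5, -0.3)), n = 500)
SS <- sapply(1:7, function(i)
  sum(residuals(arima(y, order = c(i, 0, 0), include.mean = FALSE))^2))
plot(1:7, SS, type = "b",
     xlab = "i = number of parameters", ylab = "S(theta-hat)")
```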

  18. Test of nested models
      The test essentially checks whether the reduction in SSE, $S_1 - S_2$, is large enough to justify the extra parameters in model 2 ($n_2$ parameters) compared to model 1 ($n_1$ parameters). The number of observations used is called $N$.
      If the vector $\boldsymbol{\theta}_{extra}$ denotes the extra parameters in model 2 compared to model 1, the test is formally
      $H_0: \boldsymbol{\theta}_{extra} = \boldsymbol{0}$ vs. $H_1: \boldsymbol{\theta}_{extra} \neq \boldsymbol{0}$
      If $H_0$ is true, it holds approximately that
      $\frac{(S_1 - S_2)/(n_2 - n_1)}{S_2/(N - n_2)} \sim F(n_2 - n_1,\, N - n_2)$
      (The likelihood ratio test is also a possibility.)
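A sketch of this F-test in R, comparing an AR(1) (model 1) against an ARMA(1,1) (model 2) on the simulated series from earlier:

```r
# F-test for the extra MA parameter: AR(1) vs ARMA(1,1).
set.seed(1)
y  <- arima.sim(model = list(ar = -0.7, ma = 0.4), n = 500, sd = 0.25)
N  <- length(y)
S1 <- sum(residuals(arima(y, order = c(1, 0, 0), include.mean = FALSE))^2)
S2 <- sum(residuals(arima(y, order = c(1, 0, 1), include.mean = FALSE))^2)
n1 <- 1; n2 <- 2

Fstat <- ((S1 - S2) / (n2 - n1)) / (S2 / (N - n2))
pf(Fstat, n2 - n1, N - n2, lower.tail = FALSE)  # small p-value favors model 2
```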
