Lecture 11 - Fitting ARIMA Models (10/10/2018)
Model Fitting
Fitting ARIMA

For an $ARIMA(p, d, q)$ model:

• Requires that the data be stationary after differencing.
• Handling $d$ is straightforward: just difference the original data $d$ times (leaving $n - d$ observations), $y'_t = \Delta^d y_t$ (see the sketch below).
• After differencing, fit an $ARMA(p, q)$ model to $y'_t$.
• To keep things simple we'll assume $w_t \overset{iid}{\sim} \mathcal{N}(0, \sigma^2_w)$.
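A minimal sketch of the differencing step in R (the example series and the choice d = 2 are illustrative assumptions, not from the slides):

# Difference the original data d times, then fit the ARMA(p, q) part to the result.
y = as.numeric(LakeHuron)            # any example series
d = 2
y_prime = diff(y, differences = d)   # y'_t = Delta^d y_t
length(y) - length(y_prime)          # d observations are lost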
MLE - Stationarity & iid normal errors

If both of these conditions are met, then the time series $y_t$ will also be normal.

In general, the vector $\mathbf{Y} = (y_1, y_2, \ldots, y_n)'$ will have a multivariate normal distribution with mean $\{\boldsymbol{\mu}\}_j = E(y_j) = E(y_t)$ and covariance $\boldsymbol{\Sigma}$ where $\{\boldsymbol{\Sigma}\}_{jk} = \gamma(j - k)$.

The joint density of $\mathbf{Y}$ is given by

$$f_{\mathbf{Y}}(\mathbf{Y}) = \frac{1}{(2\pi)^{n/2} \det(\boldsymbol{\Sigma})^{1/2}} \times \exp\left(-\frac{1}{2} (\mathbf{Y} - \boldsymbol{\mu})' \, \boldsymbol{\Sigma}^{-1} (\mathbf{Y} - \boldsymbol{\mu})\right)$$
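As an illustration, this joint density can be evaluated directly once the autocovariance function is known. A sketch, assuming the AR(1) autocovariance $\gamma(h) = \sigma^2_w \, \phi^{|h|} / (1 - \phi^2)$ from the next slide and the mvtnorm package (the function name is hypothetical):

# Exact MVN log-likelihood of an AR(1) series y at given parameter values.
ar1_loglik_mvn = function(y, mu, phi, sigma2_w) {
  n = length(y)
  Sigma = sigma2_w / (1 - phi^2) * phi^abs(outer(1:n, 1:n, "-"))  # {Sigma}_jk = gamma(j - k)
  mvtnorm::dmvnorm(y, mean = rep(mu, n), sigma = Sigma, log = TRUE)
}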
AR
Fitting AR(1)

$$y_t = \delta + \phi \, y_{t-1} + w_t$$

We need to estimate three parameters: $\delta$, $\phi$, and $\sigma^2_w$. We know

$$E(y_t) = \frac{\delta}{1 - \phi} \qquad Var(y_t) = \frac{\sigma^2_w}{1 - \phi^2} \qquad \gamma(h) = \frac{\sigma^2_w}{1 - \phi^2} \, \phi^{|h|}$$

Using these properties it is possible to write the distribution of $\mathbf{Y}$ as a MVN, but that does not make it easy to write down a (simplified) closed form for the MLE estimates of $\delta$, $\phi$, and $\sigma^2_w$.
Conditional Density

We can also rewrite the density as follows,

$$f_{\mathbf{Y}} = f(y_n, y_{n-1}, \ldots, y_2, y_1) = f(y_n \mid y_{n-1}, \ldots, y_2, y_1) \, f(y_{n-1} \mid y_{n-2}, \ldots, y_2, y_1) \cdots f(y_2 \mid y_1) \, f(y_1)$$
$$= f(y_n \mid y_{n-1}) \, f(y_{n-1} \mid y_{n-2}) \cdots f(y_2 \mid y_1) \, f(y_1)$$

where, with $\mu = \delta / (1 - \phi)$,

$$y_1 \sim \mathcal{N}\left(\mu, \; \frac{\sigma^2_w}{1 - \phi^2}\right) \qquad y_t \mid y_{t-1} \sim \mathcal{N}(\delta + \phi \, y_{t-1}, \; \sigma^2_w)$$

$$f_{y_t \mid y_{t-1}}(y_t) = \frac{1}{\sqrt{2\pi \, \sigma^2_w}} \exp\left(-\frac{1}{2} \, \frac{(y_t - \delta - \phi \, y_{t-1})^2}{\sigma^2_w}\right)$$
Log likelihood of AR(1)

$$\log f_{y_t \mid y_{t-1}}(y_t) = -\frac{1}{2}\left(\log 2\pi + \log \sigma^2_w + \frac{(y_t - \delta - \phi \, y_{t-1})^2}{\sigma^2_w}\right)$$

$$\ell(\delta, \phi, \sigma^2_w) = \log f_{\mathbf{Y}} = \log f(y_1) + \sum_{i=2}^{n} \log f_{y_i \mid y_{i-1}}(y_i)$$

$$= -\frac{1}{2}\left(\log 2\pi + \log \sigma^2_w - \log(1 - \phi^2) + \frac{(1 - \phi^2)(y_1 - \mu)^2}{\sigma^2_w}\right) - \frac{1}{2}\left((n - 1)\log 2\pi + (n - 1)\log \sigma^2_w + \sum_{i=2}^{n} \frac{(y_i - \delta - \phi \, y_{i-1})^2}{\sigma^2_w}\right)$$

$$= -\frac{1}{2}\left(n \log 2\pi + n \log \sigma^2_w - \log(1 - \phi^2) + \frac{1}{\sigma^2_w}\left((1 - \phi^2)(y_1 - \mu)^2 + \sum_{i=2}^{n}(y_i - \delta - \phi \, y_{i-1})^2\right)\right)$$
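A minimal sketch of maximizing this log likelihood numerically with optim(), putting $\sigma^2_w$ on the log scale so it stays positive (the starting values are assumptions, and `ar1` is the simulated series used on the following slides):

# Negative AR(1) log likelihood, following the final expression above.
# A production version would also constrain |phi| < 1.
ar1_nll = function(par, y) {
  delta = par[1]; phi = par[2]; sigma2_w = exp(par[3])
  n  = length(y)
  mu = delta / (1 - phi)
  0.5 * (n * log(2 * pi) + n * log(sigma2_w) - log(1 - phi^2) +
         ((1 - phi^2) * (y[1] - mu)^2 +
          sum((y[-1] - delta - phi * y[-n])^2)) / sigma2_w)
}

fit = optim(c(0, 0.5, 0), ar1_nll, y = as.numeric(ar1), method = "BFGS")
c(delta = fit$par[1], phi = fit$par[2], sigma2_w = exp(fit$par[3]))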
AR(1) Example

Simulated with $\phi = 0.75$, $\delta = 0.5$, and $\sigma^2_w = 1$:

[Figure: the simulated series ar1 (n = 500) with its sample ACF and PACF out to lag 25.]
Arima

ar1_arima = forecast::Arima(ar1, order = c(1,0,0))
summary(ar1_arima)

## Series: ar1
## ARIMA(1,0,0) with non-zero mean
##
## Coefficients:
##          ar1    mean
##       0.7312  1.8934
## s.e.  0.0309  0.1646
##
## sigma^2 estimated as 0.994:  log likelihood=-707.35
## AIC=1420.71   AICc=1420.76   BIC=1433.35
##
## Training set error measures:
##                       ME      RMSE       MAE       MPE     MAPE      MASE
## Training set 0.005333274 0.9950158 0.7997576 -984.9413 1178.615 0.9246146
##                     ACF1
## Training set -0.04437489
lm

d = data_frame(y = ar1 %>% strip_attrs(), t = seq_along(ar1))
ar1_lm = lm(y ~ lag(y), data = d)
summary(ar1_lm)

## Call:
## lm(formula = y ~ lag(y), data = d)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.2772 -0.6880  0.0785  0.6819  2.5704
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.52347    0.07328   7.144 3.25e-12 ***
## lag(y)       0.72817    0.03093  23.539  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9949 on 497 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.5272, Adjusted R-squared:  0.5262
## F-statistic: 554.1 on 1 and 497 DF,  p-value: < 2.2e-16
Bayesian AR(1) Model

ar1_model = "model{
  # likelihood
  y[1] ~ dnorm(delta/(1-phi), (sigma2_w/(1-phi^2))^-1)
  y_hat[1] ~ dnorm(delta/(1-phi), (sigma2_w/(1-phi^2))^-1)

  for (t in 2:length(y)) {
    y[t] ~ dnorm(delta + phi*y[t-1], 1/sigma2_w)
    y_hat[t] ~ dnorm(delta + phi*y[t-1], 1/sigma2_w)
  }

  # priors
  delta ~ dnorm(0, 1/1000)
  phi ~ dnorm(0, 1)
  tau ~ dgamma(0.001, 0.001)
  sigma2_w <- 1/tau
  mu <- delta/(1-phi)
}"
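A sketch of how this model might be run with rjags (assuming the rjags package is installed and `ar1` is the simulated series from the example above):

library(rjags)

# Compile the model, burn in, then draw the samples summarized on the next slides.
m = jags.model(textConnection(ar1_model),
               data = list(y = as.numeric(ar1)))
update(m, 1000)  # burn-in
samp = coda.samples(m, variable.names = c("delta", "phi", "sigma2_w"),
                    n.iter = 5000)
summary(samp)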
Chains

[Figure: MCMC trace plots of delta, phi, and sigma2_w over 5000 iterations.]
Posteriors

[Figure: posterior densities of delta, phi, and sigma2_w, with the ARIMA and lm point estimates and the true values overlaid.]
Predictions

[Figure: the observed series y with the lm, ARIMA, and Bayesian model predictions overlaid.]
Faceted

[Figure: the same predictions as above, faceted by model (y, lm, ARIMA, bayes).]
Fitting AR(p) - Lagged Regression

We can rewrite the density as follows:

$$f(\mathbf{Y}) = f(y_n, y_{n-1}, \ldots, y_2, y_1) = f(y_n \mid y_{n-1}, \ldots, y_{n-p}) \cdots f(y_{p+1} \mid y_p, \ldots, y_1) \, f(y_p, \ldots, y_1)$$

Regressing $y_t$ on $y_{t-p}, \ldots, y_{t-1}$ gets us an approximate solution (see the sketch below), but it ignores the $f(y_1, y_2, \ldots, y_p)$ part of the likelihood.

How much does this matter (vs. using the full likelihood)?

• If $p$ is near to $n$ then probably a lot
• If $p \ll n$ then probably not much
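A minimal sketch of this lagged-regression approximation for general $p$, using embed() to build the lag matrix (the helper name and the choice p = 2 are assumptions; for p = 1 this reproduces the lm() fit shown earlier):

# Approximate AR(p) fit: regress y_t on y_{t-1}, ..., y_{t-p}.
fit_ar_lm = function(y, p) {
  X = embed(as.numeric(y), p + 1)   # col 1 is y_t; cols 2..(p+1) are lags 1..p
  lm(X[, 1] ~ X[, -1])
}

coef(fit_ar_lm(ar1, 2))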
Fitting AR(p) - Method of Moments

Recall for an AR(p) process,

$$\gamma(0) = \sigma^2_w + \phi_1 \gamma(1) + \phi_2 \gamma(2) + \ldots + \phi_p \gamma(p)$$
$$\gamma(h) = \phi_1 \gamma(h - 1) + \phi_2 \gamma(h - 2) + \ldots + \phi_p \gamma(h - p)$$

We can rewrite the first equation in terms of $\sigma^2_w$,

$$\sigma^2_w = \gamma(0) - \phi_1 \gamma(1) - \phi_2 \gamma(2) - \ldots - \phi_p \gamma(p)$$

These are called the Yule-Walker equations.
Yule-Walker

These equations can be rewritten into matrix notation as follows

$$\underset{p \times p}{\boldsymbol{\Gamma}_p} \, \underset{p \times 1}{\boldsymbol{\phi}} = \underset{p \times 1}{\boldsymbol{\gamma}_p} \qquad\qquad \underset{1 \times 1}{\sigma^2_w} = \underset{1 \times 1}{\gamma(0)} - \underset{1 \times p}{\boldsymbol{\phi}'} \, \underset{p \times 1}{\boldsymbol{\gamma}_p}$$

where

$$\boldsymbol{\Gamma}_p = \{\gamma(j - k)\}_{j,k} \qquad \boldsymbol{\phi} = (\phi_1, \phi_2, \ldots, \phi_p)' \qquad \boldsymbol{\gamma}_p = (\gamma(1), \gamma(2), \ldots, \gamma(p))'$$

If we estimate the covariance structure from the data we obtain $\hat{\boldsymbol{\gamma}}_p$ and $\hat{\boldsymbol{\Gamma}}_p$, which we can plug in and solve for $\boldsymbol{\phi}$ and $\sigma^2_w$:

$$\hat{\boldsymbol{\phi}} = \hat{\boldsymbol{\Gamma}}_p^{-1} \hat{\boldsymbol{\gamma}}_p \qquad\qquad \hat{\sigma}^2_w = \hat{\gamma}(0) - \hat{\boldsymbol{\gamma}}_p' \, \hat{\boldsymbol{\Gamma}}_p^{-1} \hat{\boldsymbol{\gamma}}_p$$
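A sketch of solving these equations from the sample autocovariances (the function name is an assumption; stats::ar.yw implements the same idea):

# Yule-Walker estimates of phi and sigma2_w for an AR(p) model.
yule_walker = function(y, p) {
  g = acf(y, lag.max = p, type = "covariance", plot = FALSE)$acf[, 1, 1]
  Gamma_p = toeplitz(g[1:p])          # {Gamma_p}_jk = gamma-hat(j - k)
  gamma_p = g[2:(p + 1)]              # (gamma-hat(1), ..., gamma-hat(p))'
  phi_hat = solve(Gamma_p, gamma_p)
  sigma2_hat = g[1] - sum(phi_hat * gamma_p)
  list(phi = phi_hat, sigma2_w = sigma2_hat)
}

yule_walker(ar1, 1)   # compare with ar.yw(ar1, order.max = 1, aic = FALSE)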
ARMA