
Lecture 6: Discrete Time Series (9/21/2018) - PowerPoint PPT Presentation



  1. Lecture 6: Discrete Time Series (9/21/2018)

  2. Discrete Time Series

  3. Stationary Processes. A stochastic process (i.e. a time series) is considered to be strictly stationary if the properties of the process are not changed by a shift in origin. In the time series context this means that the joint distribution of {y_{t_1}, …, y_{t_n}} must be identical to the distribution of {y_{t_1 + k}, …, y_{t_n + k}} for any value of n and k.


  5. Weak Stationarity. Strict stationarity is unnecessarily strong / restrictive for many applications, so instead we often opt for weak stationarity, which requires the following:
     1. The process has finite variance: E(y_t^2) < ∞ for all t.
     2. The mean of the process is constant: E(y_t) = μ for all t.
     3. The second moment only depends on the lag: Cov(y_t, y_s) = Cov(y_{t+k}, y_{s+k}) for all t, s, k.
     When we say stationary in class we will almost always mean weakly stationary.


  7. Autocorrelation. For a stationary time series, where E(y_t) = μ and Var(y_t) = σ^2 for all t, we define the autocorrelation at lag k as
     ρ_k = Cor(y_t, y_{t+k}) = Cov(y_t, y_{t+k}) / √(Var(y_t) Var(y_{t+k})) = E((y_t − μ)(y_{t+k} − μ)) / σ^2
     This is also sometimes written in terms of the autocovariance function (γ_k), where
     γ_k = γ(t, t + k) = Cov(y_t, y_{t+k}) = γ(k)
     ρ_k = γ(t, t + k) / √(γ(t, t) γ(t + k, t + k)) = γ(k) / γ(0)
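In practice ρ_k is estimated by its plug-in sample analogue. A minimal sketch (the usual estimator, not code from the slides, and in Python rather than the lecture's R):

```python
import numpy as np

# Sample autocorrelation: replace the expectations in rho_k with averages,
#   r_k = sum_t (y_t - ybar)(y_{t+k} - ybar) / sum_t (y_t - ybar)^2
def sample_acf(y, max_lag):
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.sum(y * y)
    return np.array([1.0] + [np.sum(y[:-k] * y[k:]) / denom
                             for k in range(1, max_lag + 1)])

# White noise should show r_0 = 1 and r_k near 0 for every k > 0.
rng = np.random.default_rng(0)
print(np.round(sample_acf(rng.normal(size=10_000), 3), 2))
```

This is the quantity plotted in the ACF panels on the following slides.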


  9. Covariance Structure. Based on our definition of a (weakly) stationary process, it implies a covariance of the following structure,

     Σ = ⎛ γ(0)    γ(1)    γ(2)    ⋯  γ(n−1) ⎞
         ⎜ γ(1)    γ(0)    γ(1)    ⋯  γ(n−2) ⎟
         ⎜ γ(2)    γ(1)    γ(0)    ⋯  γ(n−3) ⎟
         ⎜  ⋮       ⋮       ⋮      ⋱    ⋮    ⎟
         ⎝ γ(n−1)  γ(n−2)  γ(n−3)  ⋯  γ(0)   ⎠

  10. Example - Random walk. Let y_t = y_{t−1} + w_t with y_0 = 0 and w_t ∼ N(0, 1).
      [Figure: a simulated random walk, y (−10 to 10) against t (0 to 1000)]
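A quick way to see why this process will fail the stationarity check on the next slides is to simulate it. A minimal Python sketch (the slides use R; this is an illustration, not the lecture's code): since y_t is a sum of t independent N(0, 1) increments, Var(y_t) = t, which violates the constant-second-moment requirement.

```python
import numpy as np

# Simulate many paths of y_t = y_{t-1} + w_t, y_0 = 0, w_t ~ N(0, 1),
# and estimate Var(y_t) at two time points across the paths.
rng = np.random.default_rng(0)

def random_walk(n, rng):
    w = rng.normal(0.0, 1.0, size=n)  # white-noise increments
    return np.cumsum(w)               # y_t = w_1 + ... + w_t

paths = np.array([random_walk(1000, rng) for _ in range(2000)])
var_100, var_900 = paths[:, 99].var(), paths[:, 899].var()
print(var_100, var_900)  # roughly 100 and 900: variance grows with t
```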

  11. ACF + PACF.
      [Figure: the rw$y series with its sample ACF and PACF for lags 0 to 50]

  12. Stationary? Is y_t stationary?

  13. Partial Autocorrelation - pACF. Given these types of patterns in the autocorrelation, we often want to examine the relationship between y_t and y_{t+k} with the (linear) dependence of y_t on y_{t+1} through y_{t+k−1} removed. This is done through the calculation of a partial autocorrelation (α(k)), which is defined as follows:
      α(0) = 1
      α(1) = ρ(1) = Cor(y_t, y_{t+1})
      ⋮
      α(k) = Cor(y_t − P_{t,k}(y_t), y_{t+k} − P_{t,k}(y_{t+k}))
      where P_{t,k}(y) is the projection of y onto the space spanned by y_{t+1}, …, y_{t+k−1}.
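One practical way to compute α(k) is via regression: regress y_t on its k most recent lags and take the coefficient on lag k, which is the lag-k partial autocorrelation. A hedged Python sketch (one common estimator, not the lecture's R code):

```python
import numpy as np

# Estimate alpha(k) as the coefficient on lag k when y_t is regressed on
# lags 1..k (an OLS-based partial autocorrelation estimate).
def pacf(y, max_lag):
    alphas = [1.0]  # alpha(0) = 1 by definition
    for k in range(1, max_lag + 1):
        # design matrix: intercept plus lags 1..k
        X = np.column_stack([y[k - j : len(y) - j] for j in range(1, k + 1)])
        X = np.column_stack([np.ones(len(X)), X])
        beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
        alphas.append(beta[-1])  # coefficient on lag k
    return np.array(alphas)

# Sanity check: for an AR(1) process y_t = 0.7 y_{t-1} + w_t, the PACF
# should be about 0.7 at lag 1 and near zero at higher lags.
rng = np.random.default_rng(1)
y = np.zeros(5000)
w = rng.normal(size=5000)
for t in range(1, 5000):
    y[t] = 0.7 * y[t - 1] + w[t]
print(np.round(pacf(y, 3), 2))
```

This "cut-off after lag p" behavior is what makes the PACF useful for identifying autoregressive structure in the examples that follow.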

  14. Example - Random walk with drift. Let y_t = δ + y_{t−1} + w_t with y_0 = 0 and w_t ∼ N(0, 1).
      [Figure: a simulated random walk with trend, y (0 to 80) against t (0 to 1000)]

  15. ACF + PACF.
      [Figure: the rwt$y series with its sample ACF and PACF for lags 0 to 50]

  16. Stationary? Is y_t stationary?

  17. Example - Moving Average. Let w_t ∼ N(0, 1) and y_t = w_{t−1} + w_t.
      [Figure: a simulated moving average series, y (−2 to 3) against t (0 to 100)]
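For this MA(1) process the autocorrelation can be worked out exactly: y_t and y_{t+1} share one w term, so ρ(1) = Cov(y_t, y_{t+1}) / Var(y_t) = σ^2 / 2σ^2 = 1/2, while y_t and y_{t+k} share no terms for k ≥ 2, so ρ(k) = 0 there. A small Python sketch (an illustration, not the slides' R code) confirms this numerically:

```python
import numpy as np

# Simulate y_t = w_{t-1} + w_t and estimate the lag-1 and lag-2
# autocorrelations; theory says rho(1) = 0.5 and rho(2) = 0.
rng = np.random.default_rng(2)
w = rng.normal(size=100_001)
y = w[:-1] + w[1:]  # y_t = w_{t-1} + w_t

acf1 = np.corrcoef(y[:-1], y[1:])[0, 1]
acf2 = np.corrcoef(y[:-2], y[2:])[0, 1]
print(round(acf1, 2), round(acf2, 2))  # approximately 0.5 and 0.0
```

This sharp cut-off in the ACF after lag 1 is the signature of an MA(1) process.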

  18. ACF + PACF.
      [Figure: the ma$y series with its sample ACF and PACF for lags 0 to 50]

  19. Stationary? Is y_t stationary?

  20. Autoregressive. Let w_t ∼ N(0, 1) and y_t = y_{t−1} − 0.9 y_{t−2} + w_t with y_t = 0 for t < 1.
      [Figure: a simulated autoregressive series, y (−4 to 4) against t (0 to 500)]
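Whether an AR(2) like this is stationary can be checked from its characteristic polynomial: the process is stationary when the roots of φ(z) = 1 − z + 0.9 z^2 lie outside the unit circle. A hedged Python sketch (this root check is standard AR theory, not something done on the slide):

```python
import numpy as np

# Simulate y_t = y_{t-1} - 0.9 y_{t-2} + w_t and check the AR(2)
# stationarity condition. Both roots of phi(z) = 1 - z + 0.9 z^2 are
# complex with modulus sqrt(1/0.9) ~ 1.05 > 1, so the process is
# stationary, with an oscillating, slowly damping ACF.
rng = np.random.default_rng(3)
n = 50_000
w = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = y[t - 1] - 0.9 * y[t - 2] + w[t]

roots = np.roots([0.9, -1.0, 1.0])  # roots of 0.9 z^2 - z + 1
print(np.abs(roots), y.var())       # moduli just above 1; finite variance
```

Because the root moduli are only slightly larger than 1, the ACF damps slowly while oscillating, which is the pseudo-periodic pattern visible on the next slide.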

  21. ACF + PACF.
      [Figure: the ar$y series with its sample ACF and PACF for lags 0 to 50]

  22. Example - Australian Wine Sales. Australian total wine sales by wine makers in bottles <= 1 litre, Jan 1980 - Aug 1994.
      aus_wine = readRDS("../data/aus_wine.rds")
      aus_wine
      ## # A tibble: 176 x 2
      ##     date sales
      ##    <dbl> <dbl>
      ##  1 1980  15136
      ##  2 1980. 16733
      ##  3 1980. 20016
      ##  4 1980. 17708
      ##  5 1980. 18019
      ##  6 1980. 19227
      ##  7 1980. 22893
      ##  8 1981. 23739
      ##  9 1981. 21133
      ## 10 1981. 22591
      ## # ... with 166 more rows

  23. Time series.
      [Figure: wine sales (20000 to 40000) against date (1980 to 1995)]

  24. Basic Model Fit.
      [Figure: sales against date with linear and quadratic model fits overlaid]

  25. Residuals.
      [Figure: lin_resid and quad_resid residual series against date, 1980 to 1995]

  26. Autocorrelation Plot.
      [Figure: the d$quad_resid series with its sample ACF and PACF for lags 0 to 35]

  27. [Figure: quad_resid plotted against its lagged values (lag_value) for lags 1 to 12]

  28. Autoregressive errors.
      ## Call:
      ## lm(formula = quad_resid ~ lag_12, data = d_ar)
      ##
      ## Residuals:
      ##      Min       1Q   Median       3Q      Max
      ## -12286.5  -1380.5     73.4   1505.2   7188.1
      ##
      ## Coefficients:
      ##              Estimate Std. Error t value Pr(>|t|)
      ## (Intercept)  83.65080  201.58416   0.415    0.679
      ## lag_12        0.89024    0.04045  22.006   <2e-16 ***
      ## ---
      ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
      ##
      ## Residual standard error: 2581 on 162 degrees of freedom
      ##   (12 observations deleted due to missingness)
      ## Multiple R-squared: 0.7493, Adjusted R-squared: 0.7478
      ## F-statistic: 484.3 on 1 and 162 DF, p-value: < 2.2e-16
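The fit above regresses the quadratic-model residuals on their own value 12 months earlier. A hedged Python sketch of the same idea (the wine data isn't loaded here, so it uses synthetic residuals with an assumed lag-12 AR structure; the slides do this in R with lm):

```python
import numpy as np

# Generate residuals with seasonal (lag-12) dependence, then regress the
# series on its own value 12 steps back, mirroring lm(quad_resid ~ lag_12).
rng = np.random.default_rng(4)
n = 176
resid = np.zeros(n)
eps = rng.normal(scale=2500, size=n)
for t in range(12, n):
    resid[t] = 0.9 * resid[t - 12] + eps[t]  # assumed lag-12 coefficient

y = resid[12:]    # quad_resid
x = resid[:-12]   # lag_12 (the first 12 rows are lost to missingness)
X = np.column_stack([np.ones_like(x), x])
(intercept, slope), *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(slope, 2))  # close to the 0.9 used to generate the data
```

The recovered slope plays the same role as the 0.89 lag_12 coefficient in the R output above.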

  29. Residual residuals.
      [Figure: residuals of the lag-12 regression, resid (−10000 to 5000) against date (1980 to 1995)]

  30. Residual residuals - acf.
      [Figure: the l_ar$residuals series with its sample ACF and PACF for lags 0 to 35]

  31. [Figure: resid plotted against its lagged values (lag_value) for lags 1 to 12]

  32. Writing down the model? So, is our EDA suggesting that we fit the following model?
      sales(t) = β_0 + β_1 t + β_2 t^2 + w_t
      where
      w_t = δ w_{t−12} + ε_t
      The model we actually fit is,
      sales(t) = β_0 + β_1 t + β_2 t^2 + β_3 sales(t − 12) + ε_t
