Lecture 6 - Discrete Time Series
Colin Rundel
02/06/2017
Discrete Time Series
Stationary Processes

A stochastic process (i.e. a time series) is considered to be strictly stationary if the properties of the process are not changed by a shift in origin. In the time series context this means that the joint distribution of $\{y_{t_1}, \ldots, y_{t_n}\}$ must be identical to the distribution of $\{y_{t_1 + k}, \ldots, y_{t_n + k}\}$ for any value of $n$ and $k$.
Weak Stationarity

Strict stationarity is too strong for most applications, so instead we often opt for weak stationarity, which requires the following:

1. The process has finite variance: $E(y_t^2) < \infty$ for all $t$
2. The mean of the process is constant: $E(y_t) = \mu$ for all $t$
3. The second moment only depends on the lag: $Cov(y_t, y_s) = Cov(y_{t+k}, y_{s+k})$ for all $t$, $s$, $k$

When we say stationary in class we almost always mean this version of weak stationarity.
Autocorrelation

For a stationary time series, where $E(y_t) = \mu$ and $Var(y_t) = \sigma^2$ for all $t$, we define the autocorrelation at lag $k$ as

$$\rho_k = Cor(y_t, y_{t+k}) = \frac{Cov(y_t, y_{t+k})}{\sqrt{Var(y_t)\,Var(y_{t+k})}} = \frac{E\big((y_t - \mu)(y_{t+k} - \mu)\big)}{\sigma^2}$$

This is also sometimes written in terms of the autocovariance function ($\gamma_k$),

$$\gamma_k = \gamma(t, t+k) = Cov(y_t, y_{t+k})$$

$$\rho_k = \frac{\gamma(t, t+k)}{\sqrt{\gamma(t,t)\,\gamma(t+k,t+k)}} = \frac{\gamma(k)}{\gamma(0)}$$
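To make the definition concrete, here is a minimal R sketch (not from the slides) that computes a lag-k sample autocorrelation by hand and checks it against acf(); the AR(1) series used for illustration is an arbitrary choice.

set.seed(1)
y <- arima.sim(model = list(ar = 0.7), n = 1000)  # an arbitrary stationary series

k  <- 5
n  <- length(y)
mu <- mean(y)

# Sample autocovariances, using the same 1/n normalization as acf()
gamma_k <- sum((y[1:(n - k)] - mu) * (y[(k + 1):n] - mu)) / n
gamma_0 <- sum((y - mu)^2) / n

rho_manual <- gamma_k / gamma_0
rho_acf    <- acf(y, lag.max = k, plot = FALSE)$acf[k + 1]  # index 1 is lag 0

c(manual = rho_manual, acf = rho_acf)  # the two values agree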
Covariance Structure

Our definition of a (weakly) stationary process implies a covariance matrix with the following structure,

$$\begin{pmatrix}
\gamma(0) & \gamma(1) & \gamma(2) & \gamma(3) & \cdots & \gamma(n) \\
\gamma(1) & \gamma(0) & \gamma(1) & \gamma(2) & \cdots & \gamma(n-1) \\
\gamma(2) & \gamma(1) & \gamma(0) & \gamma(1) & \cdots & \gamma(n-2) \\
\gamma(3) & \gamma(2) & \gamma(1) & \gamma(0) & \cdots & \gamma(n-3) \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\gamma(n) & \gamma(n-1) & \gamma(n-2) & \gamma(n-3) & \cdots & \gamma(0)
\end{pmatrix}$$
Example - Random walk

Let $y_t = y_{t-1} + w_t$ with $y_0 = 0$ and $w_t \sim N(0, 1)$. Is $y_t$ stationary?

[Figure: simulated random walk, y vs t for t = 0 to 1000]
ACF + PACF

[Figure: acf(rw$y) and pacf(rw$y) for lags 0 to 50]
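A minimal sketch of how this example might be generated in R (the seed is an assumption; the object name rw matches the series plotted above):

set.seed(20170206)  # arbitrary seed
n  <- 1000
w  <- rnorm(n)                             # w_t ~ N(0, 1)
rw <- data.frame(t = 1:n, y = cumsum(w))   # y_t = y_{t-1} + w_t with y_0 = 0

plot(rw$t, rw$y, type = "l", main = "Random walk")
acf(rw$y, lag.max = 50)    # decays very slowly, a hallmark of non-stationarity
pacf(rw$y, lag.max = 50)   # single large spike at lag 1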
Example - Random walk with drift

Let $y_t = \delta + y_{t-1} + w_t$ with $y_0 = 0$ and $w_t \sim N(0, 1)$. Is $y_t$ stationary?

[Figure: simulated random walk with drift, y vs t for t = 0 to 1000]
ACF + PACF

[Figure: acf(rwt$y) and pacf(rwt$y) for lags 0 to 50]
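The drift case only changes the simulation by a constant; a sketch, where delta = 0.1 is an assumed value roughly consistent with the trend in the figure:

set.seed(20170206)
n     <- 1000
delta <- 0.1
rwt   <- data.frame(t = 1:n, y = cumsum(delta + rnorm(n)))  # y_t = delta + y_{t-1} + w_t

plot(rwt$t, rwt$y, type = "l", main = "Random walk with drift")
acf(rwt$y, lag.max = 50)   # the trend dominates and the ACF again decays very slowly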
Example - Moving Average

Let $w_t \sim N(0, 1)$ and $y_t = \frac{1}{3}(w_{t-1} + w_t + w_{t+1})$. Is $y_t$ stationary?

[Figure: simulated moving average, y vs t for t = 0 to 100]
ACF + PACF

[Figure: acf(ma$y) and pacf(ma$y) for lags 0 to 50]
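A sketch of this simulation using stats::filter() for the centered three-term average (the object name ma matches the series above, the rest is assumed):

set.seed(20170206)
n <- 100
w <- rnorm(n + 2)   # extra draws so w_{t-1} and w_{t+1} exist at both ends
ma <- data.frame(
  t = 1:n,
  y = stats::filter(w, rep(1/3, 3))[2:(n + 1)]  # (w_{t-1} + w_t + w_{t+1}) / 3
)

plot(ma$t, ma$y, type = "l", main = "Moving Average")
acf(ma$y, lag.max = 50)   # in theory non-zero only at lags 0, 1, and 2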
Autoregression

Let $w_t \sim N(0, 1)$ and $y_t = y_{t-1} - 0.9\, y_{t-2} + w_t$ with $y_t = 0$ for $t < 1$. Is $y_t$ stationary?

[Figure: simulated autoregressive series, y vs t for t = 0 to 500]
ACF + PACF

[Figure: acf(ar$y) and pacf(ar$y) for lags 0 to 50]
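A sketch of the AR(2) recursion, written as an explicit loop (the seed and the object name ar are assumptions):

set.seed(20170206)
n <- 500
w <- rnorm(n)
y <- numeric(n)
y[1] <- w[1]            # uses y_0 = y_{-1} = 0
y[2] <- y[1] + w[2]     # uses y_0 = 0
for (t in 3:n)
  y[t] <- y[t - 1] - 0.9 * y[t - 2] + w[t]
ar <- data.frame(t = 1:n, y = y)

plot(ar$t, ar$y, type = "l", main = "Autoregressive")
acf(ar$y, lag.max = 50)    # oscillates and damps out slowly
pacf(ar$y, lag.max = 50)   # for an AR(2), cuts off after lag 2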
Example - Australian Wine Sales

Australian total wine sales by wine makers in bottles <= 1 litre, Jan 1980 - Aug 1994.

load(url("http://www.stat.duke.edu/~cr173/Sta444_Sp17/data/aus_wine.Rdata"))
aus_wine

## # A tibble: 176 × 2
##         date sales
##        <dbl> <dbl>
## 1   1980.000 15136
## 2   1980.083 16733
## 3   1980.167 20016
## 4   1980.250 17708
## 5   1980.333 18019
## 6   1980.417 19227
## 7   1980.500 22893
## 8   1980.583 23739
## 9   1980.667 21133
## 10  1980.750 22591
## # ... with 166 more rows
Time series

[Figure: monthly sales vs date, 1980 to 1995]
Basic Model Fit

[Figure: sales vs date, 1980 to 1995, with the fitted trend model(s) overlaid]
Residuals

[Figure: residuals vs date for the linear (resid_l) and quadratic (resid_q) trend fits, shown as two panels]
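The panels suggest a linear and a quadratic trend were fit to sales; a hedged sketch of those fits (the lm() calls are assumptions consistent with the implied model on the final slide; resid_l and resid_q match the panel labels):

l_lin  <- lm(sales ~ date, data = aus_wine)              # linear trend
l_quad <- lm(sales ~ date + I(date^2), data = aus_wine)  # quadratic trend

d <- aus_wine
d$resid_l <- residuals(l_lin)
d$resid_q <- residuals(l_quad)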
Autocorrelation Plot

[Figure: acf(d$resid_q) for lags 0 to 35]
Partial Autocorrelation Plot

[Figure: pacf(d$resid_q) for lags 0 to 35]
[Figure: scatterplots of resid_q against its lagged values (panels lag_01 through lag_12); the x-axis is lag_value]
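A sketch of how the lagged columns behind these panels might be constructed (d_ar matches the data set used in the regression on the next slide; the loop itself is an assumption):

d_ar <- d
for (k in 1:12) {
  col <- sprintf("lag_%02d", k)                        # lag_01, ..., lag_12
  d_ar[[col]] <- c(rep(NA, k), head(d$resid_q, -k))    # resid_q shifted back k months
}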
Autoregressive errors

## Call:
## lm(formula = resid_q ~ lag_12, data = d_ar)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -12286.5  -1380.5     73.4   1505.2   7188.1
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 83.65080  201.58416   0.415    0.679
## lag_12       0.89024    0.04045  22.006   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2581 on 162 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared: 0.7493, Adjusted R-squared: 0.7478
## F-statistic: 484.3 on 1 and 162 DF, p-value: < 2.2e-16
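The call that produces this summary (the formula and data come straight from the output above; only the object name l_ar, which reappears on the following slides, is inferred):

l_ar <- lm(resid_q ~ lag_12, data = d_ar)
summary(l_ar)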
Residual residuals

[Figure: residuals of the lag-12 model (l_ar) vs date]
Residual residuals - acf

[Figure: acf(l_ar$residuals) for lags 0 to 35]
[Figure: scatterplots of the l_ar residuals (resid) against their lagged values, panels lag_01 through lag_12]
Writing down the model?

So, is our EDA suggesting that we then fit the following model?

$$\text{sales}(t) = \beta_0 + \beta_1 t + \beta_2 t^2 + \beta_3 \, \text{sales}(t - 12) + \epsilon_t$$

Not quite: the EDA regressed the trend residuals, not the sales themselves, on their lag-12 values, so the implied model is

$$\text{sales}(t) = \beta_0 + \beta_1 t + \beta_2 t^2 + w_t$$

where

$$w_t = \delta \, w_{t-12} + \epsilon_t$$
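A hedged sketch (not from the slides) of fitting this trend-plus-seasonal-AR-error model in one step with stats::arima(); xreg carries the quadratic trend while the seasonal AR(1) at period 12 plays the role of w_t:

# center the dates so the trend columns are numerically well behaved
X <- cbind(t = aus_wine$date - 1980, t2 = (aus_wine$date - 1980)^2)

fit <- arima(aus_wine$sales,
             order    = c(0, 0, 0),
             seasonal = list(order = c(1, 0, 0), period = 12),
             xreg     = X)
fit   # the seasonal AR coefficient estimates delta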