
Marcel Dettling Institute for Data Analysis and Process Design - PowerPoint PPT Presentation


  1. Applied Time Series Analysis FS 2012 – Week 03 Marcel Dettling Institute for Data Analysis and Process Design Zurich University of Applied Sciences marcel.dettling@zhaw.ch http://stat.ethz.ch/~dettling ETH Zürich, March 5, 2012 Marcel Dettling, Zurich University of Applied Sciences 1

  2. Descriptive Decomposition
     It is convenient to describe non-stationary time series with a simple decomposition model:
     $X_t = m_t + s_t + E_t$ = trend + seasonal effect + stationary remainder
     The modelling can be done with:
     1) taking differences with an appropriate lag (= differencing)
     2) smoothing approaches (= filtering)
     3) parametric models (= curve fitting)
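As a rough illustration of approach 1), the following Python sketch (the slides themselves work in R) applies a lag-12 difference to a made-up monthly series. The series, the `difference` helper and all numbers are hypothetical and noise-free, chosen only so that the effect is easy to see.

```python
def difference(x, lag=1):
    """Return the lagged differences x[t] - x[t - lag]."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

# Hypothetical monthly series: linear trend plus a period-12 seasonal pattern.
season = [0.5, 0.3, 0.1, -0.1, -0.3, -0.5, -0.5, -0.3, -0.1, 0.1, 0.3, 0.5]
series = [0.02 * t + season[t % 12] for t in range(48)]

# A lag-12 difference cancels the seasonal effect; the linear trend
# collapses to the constant 12 * 0.02 = 0.24.
seasonal_diff = difference(series, lag=12)

# A further lag-1 difference removes that remaining constant.
trend_free = difference(seasonal_diff, lag=1)
```

Note that each differencing step shortens the series by `lag` observations, which is one reason the parametric approach can be attractive.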

  3. Parametric Modelling
     When to use?
     • Parametric modelling is often used if we have prior knowledge that the trend follows a functional form.
     • If the main goal of the analysis is forecasting, a trend in functional form may allow for easier extrapolation than a trend obtained via smoothing.
     • It can also be useful if we have a specific model in mind and want to infer it.
     Caution: correlated errors!

  4. Parametric Modelling: Example
     Maine unemployment data: Jan/1996 – Aug/2006
     [Figure: "Unemployment in Maine"; unemployment (%) against Time, 1996 to 2006]

  5. Modeling the Unemployment Data
     Most often, time series are parametrically decomposed by using regression models. For the trend, polynomial functions are widely used, whereas the seasonal effect is modelled with dummy variables (= a factor).
     $X_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \beta_3 t^3 + \beta_4 t^4 + \alpha_{i(t)} + E_t$
     where $t \in \{1, 2, \ldots, 128\}$ and $i(t) \in \{1, 2, \ldots, 12\}$
     Remark: choice of the polynomial degree is crucial!
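The model above can be sketched as an ordinary least squares fit. The Python code below is a minimal, standard-library-only illustration and not the course's own code (in R one would typically use `lm()`); the function names `design_row`, `solve` and `fit_ols`, the coefficient values and the noise-free synthetic data are all assumptions made for the example. Time is centered before taking powers, and January serves as the baseline month.

```python
def design_row(t_centered, month, degree=4):
    """One regression row: powers of centered time plus 11 month dummies."""
    powers = [t_centered ** p for p in range(degree + 1)]  # 1, t, ..., t^degree
    # January (month 1) is the baseline level, so it gets no dummy column.
    dummies = [1.0 if month == m else 0.0 for m in range(2, 13)]
    return powers + dummies

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_ols(times, months, values, degree=4):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    t_bar = sum(times) / len(times)
    rows = [design_row(t - t_bar, m, degree) for t, m in zip(times, months)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical, noise-free data on the same monthly grid as the Maine series.
true_beta = [5.0, -0.1, 0.05, 0.002, -0.001]
true_alpha = {m: 0.1 * (m - 1) for m in range(1, 13)}  # January effect is 0
times = [1996 + k / 12 for k in range(128)]
months = [k % 12 + 1 for k in range(128)]
t_bar = sum(times) / len(times)
values = [sum(b * (t - t_bar) ** p for p, b in enumerate(true_beta)) + true_alpha[m]
          for t, m in zip(times, months)]

# First 5 entries: trend coefficients; last 11: February..December effects.
coef = fit_ols(times, months, values)
```

The sketch deliberately returns coefficients only: with correlated errors, the usual OLS standard errors would not be trustworthy anyway.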

  6. Polynomial Order / OLS Fitting
     Estimation of the coefficients is done in a regression context. We can use the ordinary least squares algorithm, but:
     • an assumption is violated: $E_t$ is not uncorrelated
     • the estimated coefficients are still unbiased
     • standard errors (tests, CIs) can be wrong
     Which polynomial order is required?
     Eyeballing the series gives the minimum degree required for the polynomial: it is at least the number of maxima of the hypothesized trend, plus one.

  7. Important Hints for Fitting
     • The main predictor used in polynomial parametric modeling is the time of the observations. It can be obtained by typing time(maine).
     • To avoid numerical and collinearity problems, it is essential to center the time/predictors!
     • R sets the coefficient of the first factor level to 0; seasonality is thus expressed as surplus to the January value.
     • For visualization: when the trend must fit the data, we have to adjust, because the mean of the seasonal effect is usually different from zero!
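The effect of centering can be checked numerically. The Python sketch below (the slides use R) builds a time index in decimal years for 128 monthly observations starting in January 1996, which is what `time(maine)` would be expected to return if `maine` is a monthly `ts` object; that, and the helper name `correlation`, are assumptions. It then compares the correlation between the linear and quadratic predictors with and without centering.

```python
def correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Decimal years for 128 monthly observations from Jan 1996 (assumed layout).
times = [1996 + k / 12 for k in range(128)]
t_bar = sum(times) / len(times)
centered = [t - t_bar for t in times]

# Uncentered: t and t^2 are almost perfectly collinear.
raw_cor = correlation(times, [t ** 2 for t in times])

# Centered: on an equally spaced grid the two predictors are uncorrelated.
centered_cor = correlation(centered, [t ** 2 for t in centered])
```

Centering does not remove all collinearity among higher powers (for example between the centered $t^2$ and $t^4$), but it takes care of the worst of it.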

  8. Trend of O(4), O(5) and O(6)
     [Figure: "Unemployment in Maine"; unemployment (%) against Time, 1996 to 2006, with fitted trends of order O(4), O(5) and O(6)]

  9. Residual Analysis: O(4)
     [Figure: "Residuals vs. Time, O(4)"; residuals against Time, 1996 to 2006]

  10. Residual Analysis: O(5)
      [Figure: "Residuals vs. Time, O(5)"; residuals against Time, 1996 to 2006]

  11. Residual Analysis: O(6)
      [Figure: "Residuals vs. Time, O(6)"; residuals against Time, 1996 to 2006]

  12. Parametric Modeling: Remarks
      Some advantages and disadvantages:
      + trend and seasonal effect can be estimated
      + $\hat{m}_t$ and $\hat{s}_t$ are explicitly known, can be visualised
      + even some inference on trend/season is possible
      + time series keeps the original length
      - choice of a/the correct model is necessary/difficult
      - residuals are correlated: this is a model violation!
      - extrapolation of $\hat{m}_t$, $\hat{s}_t$ is not entirely obvious

  13. Where are we?
      For most of the rest of this course, we will deal with (weakly) stationary time series. They have the following properties:
      • $E[X_t] = \mu$
      • $Var(X_t) = \sigma^2$
      • $Cov(X_t, X_{t+h}) = \gamma_h$
      If a time series is non-stationary, we know how to decompose it into a deterministic part and a stationary, random part.
      Our forthcoming goals are:
      - understanding the dependency in a stationary series
      - modeling this dependency and generating forecasts

  14. Autocorrelation
      The aim of this section is to explore the dependency structure within a time series.
      Def: Autocorrelation
      $Cor(X_{t+k}, X_t) = \frac{Cov(X_{t+k}, X_t)}{\sqrt{Var(X_{t+k}) \cdot Var(X_t)}}$
      The autocorrelation is a dimensionless measure of the amount of linear association (collinearity) between the random variables $X_{t+k}$ and $X_t$.

  15. Autocorrelation Estimation
      Our next goal is to estimate the autocorrelation function (acf) from a realization of a weakly stationary time series.
      [Figure, top panel: "Luteinizing Hormone in Blood at 10min Intervals"; lh against Time. Bottom panel: "Autocorrelation Function"; ACF against lag]

  16. Autocorrelation Estimation: lag k>1
      Idea 1: Compute the sample correlation for all pairs $(x_s, x_{s+k})$
      [Figure: scatterplots of X_{s+k} against X_s, one panel per lag: k=2, cor=0.19; k=3, cor=-0.15; k=4, cor=-0.19; k=5, cor=-0.16; k=6, cor=-0.02; k=7, cor=-0.01; k=8, cor=0.01; k=9, cor=-0.17]

  17. Autocorrelation Estimation: lag k
      Idea 2: Plug-in estimate with sample covariance
      How does it work? → see blackboard…

  18. Autocorrelation Estimation: lag k
      Idea 2: Plug-in estimate with sample covariance
      $\hat{\rho}(k) = \frac{\hat{\gamma}(k)}{\hat{\gamma}(0)} = \frac{Cov(X_t, X_{t+k})}{Var(X_t)}$
      where $\hat{\gamma}(k) = \frac{1}{n} \sum_{s=1}^{n-k} (x_{s+k} - \bar{x})(x_s - \bar{x})$ and $\bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t$
      This is the standard approach in time series analysis for computing the acf.
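The plug-in estimate translates almost directly into code. The following Python sketch (the slides use R, where the built-in `acf()` function does this job) is one possible implementation; the function name `acf` and the toy input are chosen here for illustration only.

```python
def acf(x, max_lag):
    """Plug-in autocorrelation estimates rho_hat(k) for k = 0, ..., max_lag."""
    n = len(x)
    x_bar = sum(x) / n

    def gamma_hat(k):
        # Sum over the n - k available pairs, but divide by n (not n - k),
        # using the overall mean x_bar for both factors.
        return sum((x[s + k] - x_bar) * (x[s] - x_bar) for s in range(n - k)) / n

    gamma_0 = gamma_hat(0)
    return [gamma_hat(k) / gamma_0 for k in range(max_lag + 1)]

# Toy input, used only to check the arithmetic by hand.
rho = acf([1, 2, 3, 4, 5], max_lag=2)   # [1.0, 0.4, -0.1]
```

For the toy input the deviations from the mean are -2, -1, 0, 1, 2, so $\hat{\gamma}(0) = 10/5 = 2$, $\hat{\gamma}(1) = 4/5$ and $\hat{\gamma}(2) = -1/5$, which gives the values in the comment.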

  19. Comparison Idea 1 vs. Idea 2
      → see blackboard for some more information
      [Figure: "Comparison between lagged sample correlations and acf"; acf against lag (0 to 40), with one curve for the acf and one for the lagged sample correlations]
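One way to see why the two ideas diverge at large lags is to compute both on a short series. The Python sketch below is an illustration under stated assumptions: the function names and the ten data values are hypothetical, not taken from the lecture. At lag n - 2 only two pairs remain, so the lagged sample correlation is forced to +1 or -1, whereas the plug-in estimate, which always divides by the full-sample variance, stays small.

```python
def lagged_correlation(x, k):
    """Idea 1: Pearson correlation of the pairs (x[s], x[s + k]).

    Each sub-series gets its own mean and variance; both must be non-constant.
    """
    a, b = x[:len(x) - k], x[k:]
    m = len(a)
    mean_a, mean_b = sum(a) / m, sum(b) / m
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def plug_in_acf(x, k):
    """Idea 2: gamma_hat(k) / gamma_hat(0); the common factor 1/n cancels."""
    n = len(x)
    x_bar = sum(x) / n
    num = sum((x[s + k] - x_bar) * (x[s] - x_bar) for s in range(n - k))
    den = sum((v - x_bar) ** 2 for v in x)
    return num / den

# Hypothetical short series, for illustration only.
x = [2.4, 2.2, 2.1, 1.5, 2.3, 2.5, 2.0, 1.9, 1.7, 2.2]

idea_1 = lagged_correlation(x, 8)   # two pairs only: an extreme value
idea_2 = plug_in_acf(x, 8)          # shrunk towards zero
```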

  20. What is important about ACF estimation?
      - Correlations are never to be trusted without a visual inspection with a scatterplot.
      - The bigger the lag k, the fewer data pairs remain for estimating the acf at lag k.
      - Rule of thumb: the acf is only meaningful up to about
        a) lag $10 \cdot \log_{10}(n)$
        b) lag $n/4$
      - The estimated sample ACs can be highly correlated.
      - The correlogram is only meaningful for stationary series!!!
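The two rules of thumb are easy to evaluate for a given series length. The small Python helper below is a sketch; the name `max_meaningful_lag` and the example lengths are assumptions made here, not part of the lecture.

```python
import math

def max_meaningful_lag(n):
    """Return both rule-of-thumb lag limits: (10 * log10(n), n / 4)."""
    return 10 * math.log10(n), n / 4

# Example series lengths, chosen for illustration.
limits_100 = max_meaningful_lag(100)   # (20.0, 25.0)
limits_48 = max_meaningful_lag(48)     # roughly (16.8, 12.0)
```

For short series the n/4 rule is the stricter one, while for long series the logarithmic rule grows much more slowly and becomes the binding limit.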

  21. Correlogram
      A useful aid in interpreting a set of autocorrelation coefficients is the graph called correlogram, where the $\hat{\rho}(k)$ are plotted against the lag k.
      Interpreting the meaning of a set of autocorrelation coefficients is not always easy. The following slides offer some advice.
      [Figure: "Series lh"; ACF against Lag]
