Forecasting in R: Exponential smoothing in ETS form


  1. Forecasting in R: Exponential smoothing in ETS form

  2. Outline 1. Forecasting level series; 2. Simple Exponential Smoothing; 3. Introduction to ETS; 4. Local level model; 5. Trend and seasonal models; 6. Model estimation and selection.

  4. Introduction to ETS
     • Different types of time series.
     Let us understand the principles of extrapolative forecasting using series with a single component.

  5. Naïve forecast
     What is the simplest forecast you can think of for a time series? For example: what will the temperature in your room be in 5 minutes?
     $\hat{y}_{t+1} = y_t$
     [Figure: SKU A, Sales against Observation (1 to 60), two panels]
     • The forecast is a straight line → it is always equal to the last observation.
     • Is this a good forecast?

  6. Arithmetic mean
     Another approach would be to calculate the average and use this as a forecast. For example: calculate the average temperature in your room over all the years you have lived there...
     $\hat{y}_{t+1} = \frac{1}{t} \sum_{i=1}^{t} y_i$
     [Figure: SKU A, Sales against Observation (1 to 60)]
     • The average has a long memory, and the random movements of the noise will be cancelled out.
     • Is this a good forecast?
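
The two forecasts seen so far can be sketched in a few lines of plain Python. This is only an illustration: the function names and the `sales` values are made up for the example and do not come from the slides.

```python
def naive_forecast(y):
    """One-step-ahead naive forecast: repeat the last observation."""
    return y[-1]


def mean_forecast(y):
    """One-step-ahead forecast: arithmetic mean of all observations so far."""
    return sum(y) / len(y)


# Hypothetical sales history (illustrative values only).
sales = [420, 380, 450, 400, 350]

print("naive:", naive_forecast(sales))  # uses only the last point
print("mean: ", mean_forecast(sales))   # uses every point equally
```

The naive forecast has no memory beyond the last point, while the mean remembers everything; the methods on the following slides sit between these two extremes.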

  7. Simple Moving Average
     The Simple Moving Average allows us to select the appropriate memory (the length of the average), e.g. only consider the temperature over the last week.
     $\hat{y}_{t+1} = \frac{1}{k} \sum_{i=t-k+1}^{t} y_i$
     The simple moving average:
     • Has a single parameter k. This controls the length of the moving average and is also known as its order.
     • Its variable length allows us to control how reactive we are to new information and how robust we are against noise.
     • Gives equal importance to all k observations.
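
A minimal Python sketch of the simple moving average forecast follows. The function name `sma_forecast` and the data are assumptions made for illustration; they are not part of the slides.

```python
def sma_forecast(y, k):
    """One-step-ahead forecast: mean of the last k observations."""
    if not 1 <= k <= len(y):
        raise ValueError("k must be between 1 and the number of observations")
    window = y[-k:]          # the k most recent observations
    return sum(window) / k   # each of them gets the same weight 1/k


# Hypothetical sales history (illustrative values only).
sales = [420, 380, 450, 400, 350, 460]

for k in (1, 2, 3, len(sales)):
    print(f"MA({k}):", sma_forecast(sales, k))
```

With k = 1 the sketch reproduces the naive forecast, and with k equal to the sample size it reproduces the arithmetic mean, which is one way to see how the order k controls the memory of the method.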

  8. Simple Moving Average
     [Figure: SKU A, Sales against Observation (1 to 60); three panels showing MA(3), MA(6) and MA(12)]
     Which of the different length moving averages is the most appropriate for this SKU? We choose the one that gives us a smooth estimate of the level, here MA(12).

  9. Simple Moving Average
     [Figure: SKU A, Sales against Observation (1 to 60); three panels showing MA(12), MA(24) and MA(36)]
     Which of the different length moving averages is the most appropriate for this SKU? We do not need excessive moving average lengths. These will be far too insensitive to new information.

  10. Weighted Moving Average
      Should the weights be the same for all k observations? We can overcome this limitation by allowing a different weight for each observation in the average:
      $\hat{y}_{t+1} = \sum_{i=t-k+1}^{t} w_i y_i$, subject to $\sum_{i=t-k+1}^{t} w_i = 1$
      With the weighted moving average:
      • We can control the length of the average and the importance of each observation.
      • All weights must add up to 100% (or 1). Normally, the older the observation, the smaller the weight.
      • It has k+1 parameters: the length of the average and k weights.
      • The number of weights makes it very challenging to use in practice.
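
The weighted moving average can be sketched as below. The function name, the weight values and the data are illustrative assumptions; the only requirement taken from the slide is that the weights add up to 1.

```python
import math


def wma_forecast(y, weights):
    """One-step-ahead weighted moving average forecast.

    weights[0] applies to the oldest observation in the window and
    weights[-1] to the most recent one; the order k is len(weights).
    """
    k = len(weights)
    if k == 0 or k > len(y):
        raise ValueError("need between 1 and len(y) weights")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must add up to 1")
    window = y[-k:]
    return sum(w * obs for w, obs in zip(weights, window))


# Hypothetical sales history and weights (illustrative values only).
sales = [420, 380, 450, 400, 350, 460]
print(wma_forecast(sales, [0.2, 0.3, 0.5]))  # the newest point weighs most
```

Even in this small sketch the user has to supply every weight by hand, which is the practical difficulty the slide points out.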

  12. The Exponential Smoothing Concept
      Starting from the weighted moving average, we can construct a heuristic to select the weights easily, and consequently its order (k).

      Data      y_t    y_{t-1}   y_{t-2}   y_{t-3}   ...
      Weights   w_t    w_{t-1}   w_{t-2}   w_{t-3}   ...

      1. Make the more recent information more relevant: give it bigger weights.
      2. Remember! Weights must add up to 100% (or 1).
      → Take 50% for the first and then always take 50% of the remaining weight. (Sum of all weights ≈ 100%)

      Weights   w_t    w_{t-1}   w_{t-2}   w_{t-3}   w_{t-4}   w_{t-5}   w_{t-6}
                50%    25%       12.5%     6.25%     3.12%     1.56%     ≈ 0%

      → The length of the average is set automatically!

  13. The Exponential Smoothing Concept

      Weights   w_t    w_{t-1}   w_{t-2}   w_{t-3}   w_{t-4}   w_{t-5}   w_{t-6}
                50%    25%       12.5%     6.25%     3.12%     1.56%     ≈ 0%

      Only one parameter, the initial weight! Let this weight be alpha (α):
      $\alpha(1-\alpha)^0,\ \alpha(1-\alpha)^1,\ \alpha(1-\alpha)^2,\ \alpha(1-\alpha)^3,\ \alpha(1-\alpha)^4,\ \alpha(1-\alpha)^5,\ \alpha(1-\alpha)^6$
      Exponentially distributed weights.
      The exponential weighting scheme allows us to select reasonable weights and the length of the weighted moving average with a single parameter, α.
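
The weighting scheme above is easy to check numerically. The following Python sketch (the function name is our own choice) generates the weights α(1 − α)^j for a given α and confirms that, with α = 0.5, they halve at each step and their sum approaches 1.

```python
def exponential_weights(alpha, n):
    """First n weights alpha * (1 - alpha)**j, for j = 0 (newest) to n - 1."""
    return [alpha * (1 - alpha) ** j for j in range(n)]


weights = exponential_weights(0.5, 7)
for j, w in enumerate(weights):
    print(f"w_(t-{j}) = {w:.2%}")

# The part of the total weight not yet allocated is (1 - alpha)**n.
print("sum of the first 7 weights:", sum(weights))
```

The weight left over after n observations is (1 − α)^n, so a larger α makes the effective length of the average shorter.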

  14. The Exponential Smoothing Concept
      $\hat{y}_{t+1} = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \alpha(1-\alpha)^3 y_{t-3} + \cdots$
      $\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\left[\alpha y_{t-1} + \alpha(1-\alpha) y_{t-2} + \alpha(1-\alpha)^2 y_{t-3} + \cdots\right]$
      What is this?
      $\hat{y}_t = \alpha y_{t-1} + \alpha(1-\alpha) y_{t-2} + \alpha(1-\alpha)^2 y_{t-3} + \cdots$
      A simpler form of the model:
      $\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t$
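
The equivalence between the weighted sum and the recursive form can be verified numerically. The sketch below is an illustration under one extra assumption that the slide does not discuss: a finite series needs a starting forecast (`initial`), which keeps the leftover weight (1 − α)^t in the weighted-sum form. Function names and data are hypothetical.

```python
def ses_recursive(y, alpha, initial):
    """Recursive form: new forecast = alpha * actual + (1 - alpha) * old forecast."""
    forecast = initial
    for obs in y:
        forecast = alpha * obs + (1 - alpha) * forecast
    return forecast


def ses_weighted_sum(y, alpha, initial):
    """Weighted-sum form: exponentially decaying weights on past actuals."""
    t = len(y)
    total = sum(alpha * (1 - alpha) ** j * y[t - 1 - j] for j in range(t))
    # Assumption: the starting forecast receives the remaining weight.
    return total + (1 - alpha) ** t * initial


# Hypothetical sales history (illustrative values only).
sales = [420, 380, 450, 400, 350, 460]
a = ses_recursive(sales, alpha=0.3, initial=400)
b = ses_weighted_sum(sales, alpha=0.3, initial=400)
print(a, b)  # the two forms agree up to floating-point rounding
```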

  15. Simple Exponential Smoothing
      $\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t$
      The parameter α is called the smoothing parameter and is bounded between 0 and 1.
      The exponential smoothing formula can be read as: the forecast is α times the most recent observation and (1 − α) times all the previous information.
      • A low α implies that the forecast is mostly based on the previous information.
      • A high α implies that the forecast is mostly based on the latest information.
      Therefore, the smoothing parameter α controls how reactive the forecast is to new information.
      This form was proposed by Brown (1956). Much has changed since then...
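
To see how α controls reactivity, here is a small Python sketch of simple exponential smoothing applied to a made-up series with a jump in level. The function name `ses`, the choice of the first observation as the starting forecast, and the data are all assumptions made for the illustration.

```python
def ses(y, alpha, initial=None):
    """One-step-ahead SES forecasts.

    Returns len(y) + 1 values: element i is the forecast for y[i], and the
    last element is the forecast for the next, not yet observed, period.
    Assumption: the first forecast defaults to the first observation.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    forecast = y[0] if initial is None else initial
    forecasts = [forecast]
    for obs in y:
        forecast = alpha * obs + (1 - alpha) * forecast
        forecasts.append(forecast)
    return forecasts


# Hypothetical series: the level jumps from 100 to 200 halfway through.
series = [100] * 5 + [200] * 5

for alpha in (0.1, 0.5, 1.0):
    print(f"alpha = {alpha}: next forecast = {ses(series, alpha)[-1]:.3f}")
```

Five periods after the jump, the low-α forecast is still well below the new level, the higher α has almost caught up, and α = 1 simply returns the last observation, i.e. the naive forecast.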

  16. Simple Exponential Smoothing
      [Figure: SKU A, Sales against Observation (1 to 60); six panels with Alpha = 0.1, 0.3, 0.5, 0.7, 0.9 and 1.0. Annotations: "Noise is filtered"; "Noise is not filtered → Avoid"; Alpha 1.0 = Naïve]

  17. Simple Exponential Smoothing
      [Figure: SKU B, Sales against Observation (1 to 60); three panels with Alpha = 0.1, 0.3 and 0.5]
      • Alpha 0.1: in the presence of high noise or outliers we need to use low values of alpha to make our forecasts more robust.
      • Alpha 0.3: here the outlier strongly affects our forecast.
      • Alpha 0.5: here the effect of the outlier is even stronger.

  18. Simple Exponential Smoothing
      [Figure: SKU C, Sales against Observation (1 to 60); three panels with Alpha = 0.1, 0.3 and 0.5]
      • Alpha 0.1: a very low alpha makes our forecast too slow to adjust to the new level of sales.
      • Alpha 0.3: here the alpha achieves a good compromise between reactivity and robustness to noise.
      • Alpha 0.5: a very high alpha makes our forecast react very fast, but now it does not filter out noise adequately.

  19. Simple Exponential Smoothing
      We can formulate exponential smoothing in a different way:
      $\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t$
      $\hat{y}_{t+1} = \alpha y_t + \hat{y}_t - \alpha\hat{y}_t$
      $\hat{y}_{t+1} = \hat{y}_t + \alpha(y_t - \hat{y}_t)$
      The difference between the Actuals and the Forecast is the forecast error, $e_t = y_t - \hat{y}_t$:
      $\hat{y}_{t+1} = \hat{y}_t + \alpha e_t$
      This is known as the error correction form of exponential smoothing. Why is this useful? Let's find out after a short quiz...
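
The error correction form can be checked against the original form with a short Python sketch. The function names, the starting forecast and the data are assumptions made for the example.

```python
def ses_standard(y, alpha, initial):
    """Standard form: forecast = alpha * actual + (1 - alpha) * forecast."""
    forecast = initial
    for obs in y:
        forecast = alpha * obs + (1 - alpha) * forecast
    return forecast


def ses_error_correction(y, alpha, initial):
    """Error correction form: forecast = forecast + alpha * error."""
    forecast = initial
    errors = []
    for obs in y:
        error = obs - forecast             # e_t = actual - forecast
        errors.append(error)
        forecast = forecast + alpha * error
    return forecast, errors


# Hypothetical sales history (illustrative values only).
sales = [420, 380, 450, 400, 350, 460]
standard = ses_standard(sales, alpha=0.3, initial=400)
corrected, errors = ses_error_correction(sales, alpha=0.3, initial=400)
print(standard, corrected)  # identical up to floating-point rounding
print(errors[:3])           # one-step-ahead forecast errors
```

Besides giving the same forecasts, the second function returns the one-step-ahead errors as a by-product.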

  20. Forecasting Level Series, Quiz
      Please follow the link: http://etc.ch/V7Ss
      1. Which of the methods is more appropriate for the following data?

  21. Forecasting Level Series, Quiz
      Please follow the link: http://etc.ch/V7Ss
      2. Which of the methods is more appropriate for the following data (2nd example)?

  22. Forecasting Level Series, Quiz
      Please follow the link: http://etc.ch/V7Ss
      3. Which of the smoothing parameters is more appropriate for this data if we use SES?

  24. Introduction to ETS
      SES models the level of a time series, so we can write:
      $\hat{y}_{t+1} = l_t$
      By shifting the indices by one period we can now write:
      $y_t = l_{t-1} + e_t$    (1)
      $l_t = l_{t-1} + \alpha e_t$    (2)
      This leads us to the so-called State Space Models:
      • Eq. (1), the measurement equation, says that the observed actuals are the result of some structure ($l_t$) and noise ($e_t$).
      • Eq. (2), the transition equation, says that there is an unobserved process describing how the level of the time series evolves. In our case this is all the structure of the series.
      • We can have other components as well...
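
One way to get a feel for equations (1) and (2) is to generate data from them and then run the same two equations in reverse to recover the errors. The Python sketch below does this under assumptions of our own: the noise is drawn from a normal distribution, and the true α and initial level are known when filtering. The function names and parameter values are hypothetical.

```python
import random


def simulate_local_level(n, alpha, initial_level, sigma, seed=0):
    """Generate n observations from the local level model."""
    rng = random.Random(seed)
    level = initial_level
    y, errors = [], []
    for _ in range(n):
        e = rng.gauss(0.0, sigma)   # assumption: normally distributed noise
        y.append(level + e)         # measurement equation (1)
        level = level + alpha * e   # transition equation (2)
        errors.append(e)
    return y, errors


def filter_local_level(y, alpha, initial_level):
    """Recover the errors and the final level from the observed series."""
    level = initial_level
    errors = []
    for obs in y:
        e = obs - level             # equation (1) solved for e_t
        level = level + alpha * e   # equation (2)
        errors.append(e)
    return level, errors


y, true_errors = simulate_local_level(n=50, alpha=0.3, initial_level=100, sigma=10)
level, recovered = filter_local_level(y, alpha=0.3, initial_level=100)
print("forecast for the next period:", level)
print("largest error mismatch:", max(abs(a - b) for a, b in zip(true_errors, recovered)))
```

The filtering function is the error correction form of SES with the forecast renamed to the level; in practice α and the initial level are unknown and have to be estimated.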
