Forecasting in R: Exponential smoothing in ETS form
Outline 1. Forecasting level series; 2. Simple Exponential Smoothing; 3. Introduction to ETS; 4. Local level model; 5. Trend and seasonal models; 6. Model estimation and selection.
Introduction to ETS
• There are different types of time series.
Let us understand the principles of extrapolative forecasting using series with a single component.
Naïve forecast
What is the simplest forecast you can think of for a time series? For example: what will the temperature be like in your room after 5 minutes?
$$\hat{y}_{t+1} = y_t$$
[Figure: SKU A, Sales by Observation]
• The forecast is a straight line → always equal to the last observation.
• Is this a good forecast?
Arithmetic mean
Another approach would be to calculate the average and use this as a forecast. For example: calculate the average temperature in your room over all the years you have lived there…
$$\hat{y}_{t+1} = \frac{1}{t} \sum_{i=1}^{t} y_i$$
[Figure: SKU A, Sales by Observation]
• The average has long memory and the random movements of the noise will be cancelled out.
• Is this a good forecast?
Simple Moving Average
The Simple Moving Average allows us to select the appropriate memory (the length of the average), e.g. only consider the temperature over the last week.
$$\hat{y}_{t+1} = \frac{1}{k} \sum_{i=t-k+1}^{t} y_i$$
The simple moving average:
• Has a single parameter k. This controls the length of the moving average and is also known as its order.
• Its variable length allows us to control how reactive we are to new information and how robust we are against noise.
• Gives equal importance to all k observations.
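The naïve forecast, the arithmetic mean and the simple moving average differ only in how much history they average. The sketch below is written in Python purely for illustration; the function name `sma_forecast` and the sales numbers are assumptions, not part of the slides. It shows that k = 1 gives the naïve forecast and k = t gives the arithmetic mean.

```python
def sma_forecast(y, k):
    """One-step-ahead forecast: the mean of the last k observations."""
    if not 1 <= k <= len(y):
        raise ValueError("k must be between 1 and len(y)")
    window = y[-k:]
    return sum(window) / k


# Hypothetical sales history, for illustration only.
sales = [400, 380, 460, 440, 410, 430]

naive = sma_forecast(sales, 1)               # k = 1: the last observation
mean_all = sma_forecast(sales, len(sales))   # k = t: the arithmetic mean
ma3 = sma_forecast(sales, 3)                 # k = 3: mean of the last three

print(naive, mean_all, ma3)
```

A small k reacts quickly but passes noise through; a large k smooths the noise but adapts slowly.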
Simple Moving Average
[Figure: SKU A, Sales by Observation, three panels: MA(3), MA(6), MA(12)]
Which of the different length moving averages is the most appropriate for this SKU?
We choose the one that gives us a smooth estimate of the level, here MA(12).
Simple Moving Average
[Figure: SKU A, Sales by Observation, three panels: MA(12), MA(24), MA(36)]
Which of the different length moving averages is the most appropriate for this SKU?
We do not need excessive moving average lengths. These will be far too insensitive to new information.
Weighted Moving Average
Should the weights be the same for all k observations? We can overcome this limitation by allowing different weights for each observation in the average:
$$\hat{y}_{t+1} = \sum_{i=t-k+1}^{t} w_i y_i, \quad \text{subject to} \quad \sum_{i=t-k+1}^{t} w_i = 1$$
With the weighted moving average:
• We can control the length of the average and the importance of each observation.
• All weights must add up to 100% or 1. Normally the older the observation, the smaller the weight.
• It has k+1 parameters: the length of the average and k weights.
• The number of weights makes it very challenging to use in practice.
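As a minimal sketch of the weighted moving average (Python used only for illustration; `wma_forecast`, the weights and the sales values are hypothetical), the forecast is the weighted sum of the last k observations, with the weights checked to add up to 1:

```python
def wma_forecast(y, weights):
    """Weighted moving average forecast.

    weights[0] applies to the oldest of the last k observations,
    weights[-1] to the most recent one.
    """
    k = len(weights)
    if k > len(y):
        raise ValueError("need at least as many observations as weights")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must add up to 1")
    window = y[-k:]
    return sum(w * obs for w, obs in zip(weights, window))


# Hypothetical data and weights, for illustration only.
sales = [400, 380, 460, 440, 410, 430]
weights = [0.2, 0.3, 0.5]   # older observations get smaller weights

forecast = wma_forecast(sales, weights)   # 0.2*440 + 0.3*410 + 0.5*430
equal = wma_forecast(sales, [1 / 3] * 3)  # equal weights give the SMA
print(forecast, equal)
```

Every weight is a free parameter here, which is exactly what makes the method hard to tune in practice.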
The Exponential Smoothing Concept
Starting from the weighted moving average, we can construct a heuristic to easily select the weights and, consequently, the order (k).
Data: $y_t$, $y_{t-1}$, $y_{t-2}$, $y_{t-3}$, ...
Weights: $w_t$, $w_{t-1}$, $w_{t-2}$, $w_{t-3}$, ...
1. Make the more recent information more relevant: bigger weights.
2. Remember! Weights must add up to 100% (or 1).
→ Take 50% for the first and then always take 50% of the remaining weight. (Sum of all weights ≈ 100%)
Weights: $w_t$ = 50%, $w_{t-1}$ = 25%, $w_{t-2}$ = 12.5%, $w_{t-3}$ = 6.25%, $w_{t-4}$ = 3.12%, $w_{t-5}$ = 1.56%, $w_{t-6}$ ≈ 0%
→ The length of the average is set automatically!
The Exponential Smoothing Concept
Only one parameter, the initial weight! Let this weight be Alpha (α)...
$w_t$ = 50% = $\alpha(1-\alpha)^0$
$w_{t-1}$ = 25% = $\alpha(1-\alpha)^1$
$w_{t-2}$ = 12.5% = $\alpha(1-\alpha)^2$
$w_{t-3}$ = 6.25% = $\alpha(1-\alpha)^3$
$w_{t-4}$ = 3.12% = $\alpha(1-\alpha)^4$
$w_{t-5}$ = 1.56% = $\alpha(1-\alpha)^5$
$w_{t-6}$ ≈ 0% = $\alpha(1-\alpha)^6$
Exponentially distributed weights
The exponential weighting scheme allows us to select reasonable weights and the length of the weighted moving average with a single parameter, α.
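The weighting scheme can be checked numerically. The following sketch (Python for illustration only; `exp_weights` is a hypothetical helper name) generates the weights α(1 - α)^j and shows that their sum over n terms equals 1 - (1 - α)^n, so it approaches 1 as n grows.

```python
def exp_weights(alpha, n):
    """First n exponential weights: alpha * (1 - alpha)**j for j = 0..n-1."""
    return [alpha * (1 - alpha) ** j for j in range(n)]


alpha = 0.5
w = exp_weights(alpha, 7)
total = sum(w)                      # weight assigned to the 7 newest points
remaining = (1 - alpha) ** 7        # weight left for everything older

print(w)
print(total, remaining)
```

With α = 0.5 the weights start at 50%, 25%, 12.5%, matching the table above; a smaller α spreads the weight over a longer history.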
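The Exponential Smoothing Concept
$$\hat{y}_{t+1} = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \alpha(1-\alpha)^3 y_{t-3} + \dots$$
$$\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\left[\alpha y_{t-1} + \alpha(1-\alpha) y_{t-2} + \alpha(1-\alpha)^2 y_{t-3} + \dots\right]$$
What is this? The term in brackets is the previous forecast:
$$\hat{y}_t = \alpha y_{t-1} + \alpha(1-\alpha) y_{t-2} + \alpha(1-\alpha)^2 y_{t-3} + \dots$$
A simpler form of the model:
$$\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t$$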
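The equivalence of the expanded and the recursive form can be verified on a finite sample. This is a sketch under stated assumptions, in Python for illustration only: with finite data the recursion needs a starting forecast, here called `init` and set to the first observation, which is an assumption and not something the slides specify. The expanded form then carries the extra term (1 - α)^t times that starting value.

```python
def ses_recursive(y, alpha, init):
    """Forecast for t+1 by iterating f <- alpha*y_t + (1 - alpha)*f."""
    f = init
    for obs in y:
        f = alpha * obs + (1 - alpha) * f
    return f


def ses_expanded(y, alpha, init):
    """Same forecast written as an exponentially weighted sum."""
    t = len(y)
    weighted = sum(alpha * (1 - alpha) ** j * y[t - 1 - j] for j in range(t))
    return weighted + (1 - alpha) ** t * init


# Hypothetical data, for illustration only.
sales = [400, 380, 460, 440, 410, 430]
alpha = 0.3
init = sales[0]   # assumed starting forecast

recursive = ses_recursive(sales, alpha, init)
expanded = ses_expanded(sales, alpha, init)
print(recursive, expanded)
```

The recursive form needs only the latest observation and the previous forecast, which is why it is the one used in practice.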
Simple Exponential Smoothing
$$\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t$$
The parameter α is called the smoothing parameter and is bounded between 0 and 1.
The exponential smoothing formula can be read as: the forecast is α times the most recent observation plus (1 - α) times all the previous information.
• A low α implies that the forecast is mostly based on the previous information.
• A high α implies that the forecast is mostly based on the latest observation.
Therefore the smoothing parameter α controls how reactive the forecast is to new information.
This form was proposed by Brown (1956). Much has changed since then…
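The effect of α can be illustrated on a toy series with a level shift. The sketch below is in Python for illustration only; `ses_path`, the series and the choice to start from the first observation are all assumptions. It compares a low α, a high α, and α = 1, which reduces to the naïve forecast.

```python
def ses_path(y, alpha, init=None):
    """One-step-ahead SES forecasts [yhat_2, ..., yhat_{t+1}]."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    f = y[0] if init is None else init   # assumed starting forecast
    forecasts = []
    for obs in y:
        f = alpha * obs + (1 - alpha) * f
        forecasts.append(f)
    return forecasts


# Hypothetical series: the level jumps from 100 to 200 halfway through.
series = [100] * 5 + [200] * 5

slow = ses_path(series, 0.1)[-1]   # still far from the new level
fast = ses_path(series, 0.9)[-1]   # almost at the new level
naive = ses_path(series, 1.0)      # reproduces the observations

print(slow, fast, naive)
```

Five periods after the shift, the low-α forecast has closed only part of the gap to the new level, while the high-α forecast has essentially reached it.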
Simple Exponential Smoothing
[Figure: SKU A, Sales by Observation, six panels of SES with Alpha: 0.1, 0.3, 0.5, 0.7, 0.9 and 1.0. Panel annotations: "Noise is filtered"; "Noise is not filtered → Avoid"; Alpha: 1.0 "= Naive".]
Simple Exponential Smoothing
[Figure: SKU B, Sales by Observation, three panels of SES with Alpha: 0.1, 0.3 and 0.5]
• Alpha: 0.1. In the presence of high noise or outliers we need to use low values of alpha to make our forecasts more robust.
• Alpha: 0.3. Here the outlier strongly affects our forecast.
• Alpha: 0.5. Here the effect of the outlier is even stronger.
Simple Exponential Smoothing
[Figure: SKU C, Sales by Observation, three panels of SES with Alpha: 0.1, 0.3 and 0.5]
• Alpha: 0.1. A very low alpha makes our forecast too slow to adjust to the new level of sales.
• Alpha: 0.3. Here the alpha achieves a good compromise between reactivity and robustness to noise.
• Alpha: 0.5. A very high alpha makes our forecast react very fast, but now it does not filter out noise adequately.
Simple Exponential Smoothing
We can formulate exponential smoothing in a different way:
$$\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t$$
$$\hat{y}_{t+1} = \alpha y_t + \hat{y}_t - \alpha\hat{y}_t$$
$$\hat{y}_{t+1} = \hat{y}_t + \alpha(y_t - \hat{y}_t)$$
The difference between the Actuals and the Forecast is the forecast error:
$$\hat{y}_{t+1} = \hat{y}_t + \alpha e_t$$
This is known as the error correction form of exponential smoothing.
Why is this useful? Let's find out after a short quiz…
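A quick numerical check of the two forms, as a sketch in Python for illustration only (the function names and the numbers are hypothetical): one step of the standard form and one step of the error correction form give the same forecast.

```python
def ses_standard_step(y_t, f_t, alpha):
    """Standard form: alpha * actual + (1 - alpha) * previous forecast."""
    return alpha * y_t + (1 - alpha) * f_t


def ses_error_correction_step(y_t, f_t, alpha):
    """Error correction form: previous forecast + alpha * forecast error."""
    e_t = y_t - f_t
    return f_t + alpha * e_t


# Hypothetical values: actual 430, previous forecast 400, alpha 0.3.
standard = ses_standard_step(430, 400, 0.3)
corrected = ses_error_correction_step(430, 400, 0.3)
print(standard, corrected)
```

Here the forecast error is 30, and the forecast moves up by 0.3 × 30 = 9, from 400 to 409.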
Forecasting Level Series, Quiz Please, follow the link: http://etc.ch/V7Ss 1. Which of the methods is more appropriate for the following data?
Forecasting Level Series, Quiz Please, follow the link: http://etc.ch/V7Ss 2. Which of the methods is more appropriate for the following data (2nd example)?
Forecasting Level Series, Quiz Please, follow the link: http://etc.ch/V7Ss 3. Which of the smoothing parameters is more appropriate for this data if we use SES?
Introduction to ETS
SES models the level of a time series. So, we can write:
$$\hat{y}_{t+1} = l_t$$
By shifting the indices by 1 period we can now write:
$$y_t = l_{t-1} + e_t \quad (1)$$
$$l_t = l_{t-1} + \alpha e_t \quad (2)$$
This leads us to the so-called State Space Models:
• Eq. (1), the measurement equation: says that the observed actuals are the result of some structure ($l_{t-1}$) and noise ($e_t$).
• Eq. (2), the transition equation: says that there is an unobserved process describing how the level of the time series evolves. For our case this is all the structure of the series.
• We can have other components as well…
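Equations (1) and (2) can be run directly as a filter. The sketch below is in Python for illustration only; `local_level_filter`, the data and the initial level `l0` (set to the first observation) are assumptions, and this is not the estimation procedure of any particular package. It updates the level from the one-step error and confirms that the resulting levels coincide with the SES forecasts.

```python
def local_level_filter(y, alpha, l0):
    """Apply Eq. (1) and Eq. (2) to data; return levels l_1..l_t and errors."""
    level = l0
    levels, errors = [], []
    for obs in y:
        e = obs - level              # Eq. (1) rearranged: e_t = y_t - l_{t-1}
        level = level + alpha * e    # Eq. (2): l_t = l_{t-1} + alpha * e_t
        errors.append(e)
        levels.append(level)
    return levels, errors


# Hypothetical data, for illustration only.
sales = [400, 380, 460, 440, 410, 430]
alpha = 0.3
l0 = sales[0]   # assumed initial level

levels, errors = local_level_filter(sales, alpha, l0)

# SES in its standard form, started from the same value.
ses, f = [], l0
for obs in sales:
    f = alpha * obs + (1 - alpha) * f
    ses.append(f)

print(levels[-1], ses[-1])   # the forecast for t+1 is the last level
```

Because the forecast for t+1 equals the level at t, the state space form adds no new forecasts for level series; its value is that further components can be added as extra transition equations.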