Augmented Out-of-sample Comparison Method for Time Series Forecasting Techniques
Igor Ilic, Berk Gorgulu, Mucahit Cevik
Data Science Lab, Ryerson University
http://www.datasciencelab.ca
Background Methods Results Conclusions

INTRODUCTION
• Problems with current comparison techniques for time series models:
  - one-shot comparison
  - conventional methods for comparing regression methods (e.g., random train-test splits) lack robustness
  - testing usually does not reflect real-world situations
• An augmented out-of-sample model comparison method:
  - more flexible and robust technique
  - takes the spatio-temporal nature of time series into account
• Effectiveness of the method is tested using ARIMA, LSTM, and GRU models on Turkish electricity consumption data

Ilic et al. (Ryerson DSL) Augmented Out-of-sample Comparison

LITERATURE REVIEW
• Single interval tests (Valipour et al. (2013), Kane et al. (2014))
  - Single time step is used as the test data
  - Simple to implement and translates to real-world tests
  - Susceptible to lucky one-shot tests
• Multiple datasets (Zhang (2003), Merh et al. (2010))
  - Predict the future data points for many datasets
  - Computationally intensive, requires a lot of data
• Random Test Interval Sampling (Huang et al. (2015))
  - Single dataset is sufficient for testing
  - The predictions are not made for the immediate future, so it is not suitable for comparing methods that are data dependent (e.g., ARIMA)
• Augmented Training (Tashman (2000))
  - Rolling method for training a model
  - Fast ad hoc way to train a model with the best hyperparameters

AUGMENTED OUT-OF-SAMPLE COMPARISON METHOD
• The method takes a dataset, which is split into train and test sets, together with a forecast interval and a number of tests.
• After each test, the model is updated to include the test data.
• Because the model is only updated on a small portion of data, the entire model does not have to be retrained, as it would be in rolling-horizon methods
  ⇒ Leads to a significant speedup in testing

ALGORITHM
Algorithm 1: Augmented Out-Of-Sample Testing
Input : Dataset sorted by ascending date as D, algorithm as f, test interval length as ℓ, number of tests as n
Output: Array of predicted values and real values
  T, U ← TrainTestSplit(D);
  model ← TrainUsing(f, T);
  {C_j}, j = 1 ... n ← Split(U, n);
  results ← ∅;
  for i = 1 ... n do
      testingData ← RetrieveFirst(ℓ, C_i);
      testResults ← TestUsing(model, testingData);
      results ← results ∪ testResults;
      model ← UpdateUsing(model, C_i);
  end
  return results;
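
The loop in Algorithm 1 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `SeasonalNaive` is a hypothetical toy model standing in for ARIMA, LSTM, or GRU, the `train_fraction` parameter and the equal-sized chunking (with any remainder dropped) are assumed choices for TrainTestSplit and Split, and each test is stored as a (predicted, actual) pair.

```python
from typing import List, Sequence, Tuple


class SeasonalNaive:
    """Hypothetical toy model: repeats the most recent season of data."""

    def __init__(self, season: int) -> None:
        self.season = season
        self.history: List[float] = []

    def train(self, data: Sequence[float]) -> None:
        # Full (re)training: replace everything the model has seen.
        self.history = list(data)

    def update(self, data: Sequence[float]) -> None:
        # Incremental update: append new observations, no full retrain.
        self.history.extend(data)

    def predict(self, horizon: int) -> List[float]:
        # Forecast the next `horizon` points by cycling the last season.
        return [self.history[-self.season + (h % self.season)]
                for h in range(horizon)]


def augmented_out_of_sample(
    data: Sequence[float],
    model: SeasonalNaive,
    interval: int,
    n_tests: int,
    train_fraction: float = 0.5,  # assumed split rule, not from the slides
) -> List[Tuple[List[float], List[float]]]:
    """Return one (predicted, actual) pair per test interval."""
    split = int(len(data) * train_fraction)
    train, test = data[:split], data[split:]          # TrainTestSplit(D)
    model.train(train)                                # TrainUsing(f, T)

    chunk_len = len(test) // n_tests                  # Split(U, n)
    if interval > chunk_len:
        raise ValueError("test interval is longer than a chunk")

    results = []
    for i in range(n_tests):
        chunk = test[i * chunk_len:(i + 1) * chunk_len]
        actual = list(chunk[:interval])               # RetrieveFirst(l, C_i)
        predicted = model.predict(interval)           # TestUsing(model, ...)
        results.append((predicted, actual))
        model.update(chunk)                           # UpdateUsing(model, C_i)
    return results


# Usage on a perfectly periodic toy series with period 4.
series = [float(t % 4) for t in range(40)]
pairs = augmented_out_of_sample(series, SeasonalNaive(season=4),
                                interval=3, n_tests=4)
```

Each forecast is made for the points immediately after the data the model has already seen, and the model absorbs the whole chunk before the next test, which is what lets one dataset supply many realistic tests.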

EXPERIMENTAL SETUP
• Dataset:
  - The Turkish electricity dataset: five years' worth of hourly data
  - Daily and weekly seasonality
  - Covariates included: day of week, hour of day
  - Covariates excluded: holidays, weather information, electricity pricing
• Models:
  - Two RNN algorithms: LSTM and GRU
  - SARIMAX: Seasonal ARIMA with Regressors
  - A naive baseline model: use last week's data to predict current week
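
As a rough illustration of the setup above, the two included covariates and the naive baseline could be computed as below. The function names are hypothetical, and the sketch assumes a gap-free hourly series, so that "last week" means a lag of 7 × 24 = 168 observations.

```python
from datetime import datetime
from typing import List, Sequence, Tuple

HOURS_PER_WEEK = 7 * 24  # assumes hourly data with no missing timestamps


def calendar_covariates(timestamps: Sequence[datetime]) -> List[Tuple[int, int]]:
    """Return (day_of_week, hour_of_day) per timestamp; Monday is 0."""
    return [(ts.weekday(), ts.hour) for ts in timestamps]


def last_week_baseline(history: Sequence[float], horizon: int) -> List[float]:
    """Forecast the next `horizon` hours with the values from one week earlier."""
    if len(history) < HOURS_PER_WEEK:
        raise ValueError("need at least one full week of history")
    if horizon > HOURS_PER_WEEK:
        raise ValueError("this sketch only covers horizons up to one week")
    # The hour being forecast at step h sits 168 positions before its own
    # index, which is len(history) + h.
    start = len(history) - HOURS_PER_WEEK
    return list(history[start:start + horizon])


# Usage: 200 hours of dummy load values, forecasting the next 6 hours.
load = [float(t) for t in range(200)]
forecast = last_week_baseline(load, horizon=6)
covariates = calendar_covariates([datetime(2020, 1, 6, 13)])
```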

COMPARING MODEL PERFORMANCES - 1

COMPARING MODEL PERFORMANCES - 2

SUMMARY
• The augmented out-of-sample method alleviates many shortcomings of the standard approaches.
• By allowing more tests on the same dataset, in a manner that mirrors real-world training, augmented out-of-sample comparison can identify the best-performing algorithm more reliably.
• Our augmented out-of-sample model comparison method shows that neural networks outperform classical models such as ARIMA in predicting electricity consumption rates in Turkey.

SUMMARY RESULTS
Table: MAPE by Prediction Forecast Interval with 95% Error Bounds

Hours   Baseline (%)   SARIMAX (%)   LSTM (%)    GRU (%)
6       6.8 ± 1.8      2.6 ± 0.5     1.8 ± 0.4   1.6 ± 0.3
24      6.8 ± 0.9      5.4 ± 0.5     2.6 ± 0.2   1.9 ± 0.1
48      6.2 ± 0.6      7.9 ± 0.5     3.2 ± 0.2   3.3 ± 0.2

• RNN architectures outperform classical algorithms
• The gap widens as the forecast interval grows
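
For readers who want to produce a table of this shape, the sketch below shows one way to turn a list of (predicted, actual) test pairs into a mean MAPE with a 95% bound. The slides do not say how the error bounds were obtained; the normal-approximation interval used here (1.96 standard errors of the per-test MAPE) is an assumption, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * mean(errors)


def mape_with_bounds(
    tests: Sequence[Tuple[Sequence[float], Sequence[float]]],
) -> Tuple[float, float]:
    """Return (mean MAPE, 95% half-width) over (predicted, actual) test pairs.

    Assumption: the half-width is 1.96 * standard error of the per-test
    MAPE scores, i.e. a normal approximation.
    """
    scores: List[float] = [mape(actual, predicted) for predicted, actual in tests]
    if len(scores) < 2:
        raise ValueError("need at least two tests to estimate a bound")
    half_width = 1.96 * stdev(scores) / math.sqrt(len(scores))
    return mean(scores), half_width


# Usage with two made-up tests whose MAPE scores are 10% and 20%.
example_tests = [
    ([110.0, 180.0], [100.0, 200.0]),  # (predicted, actual)
    ([120.0, 160.0], [100.0, 200.0]),
]
center, bound = mape_with_bounds(example_tests)
```

Having many test intervals per model is what makes a bound like this meaningful; a single one-shot test yields one score and no estimate of spread.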