Machine Learning for Multi-step Ahead Forecasting of Volatility Proxies
Jacopo De Stefani, Ir. - jdestefa@ulb.ac.be
Prof. Gianluca Bontempi - gbonte@ulb.ac.be
Olivier Caelen, PhD - olivier.caelen@worldline.com
Dalila Hattab, PhD - dalila.hattab@equensworldline.com
MIDAS 2017 - ECML-PKDD, Hotel Aleksandar Palace, Skopje, FYROM
Monday 18th September, 2017
Problem overview
[Figure: first series, CAC40, 2012-01-02 to 2013-11-04 (last value 47.255): price chart with a volume panel (in 100,000s) and a Moving Average Convergence Divergence (12,26,9) panel.]
What is volatility?
Definition: Volatility is a statistical measure of the dispersion of returns for a given security or market index.
[Figure: returns $r_t$ against $t$ [days] over 100 days, contrasting high volatility with low volatility.]
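As a minimal illustration of the definition above, the sketch below computes log returns from a price series and takes their sample standard deviation. The function names and the toy prices are assumptions made for illustration; they do not come from the slides.

```python
import math
import statistics


def log_returns(prices):
    """Log returns r_t = ln(P_t / P_{t-1}) for a list of positive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def sample_volatility(prices):
    """Sample standard deviation of the log returns (one simple dispersion measure)."""
    return statistics.stdev(log_returns(prices))


# Hypothetical toy data: a calm series and a turbulent one.
calm = [100.0, 100.5, 100.2, 100.6, 100.4, 100.7]
turbulent = [100.0, 104.0, 97.0, 103.0, 96.0, 102.0]

print(sample_volatility(calm) < sample_volatility(turbulent))
```

The turbulent series has the larger dispersion of returns, which is what "high volatility" means in the definition.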
A closer look at the data - Volatility proxies
[Figure: intraday price $P_t$ against $t$ [days] over calendar day 0 and calendar day 1, marking for each day the opening ($P^o$), high ($P^h$), low ($P^l$) and closing ($P^c$) prices, the pre-opening period, and the day fractions $f$ and $1-f$. The values $P^o_t$, $P^h_t$, $P^l_t$, $P^c_t$ feed a volatility proxy that outputs $\sigma^P_t$.]
Models for volatility
[Diagram: taxonomy of volatility models]
◮ Past volatility
  ◮ Random Walk
  ◮ Average-based: HA, MA, ES, EWMA, STES
  ◮ Simple Regression: SR-AR, SR-TAR, SR-ARMA
◮ ARCH
  ◮ Symmetric: ARCH(q), GARCH(p,q)
  ◮ Asymmetric: EGARCH(p,q), GJR-GARCH(p,q), QGARCH(p,q)
  ◮ Extended: ST-GARCH(p,q), RS-GARCH(p,q), Component-GARCH(p,q), RGARCH(p,q)
◮ Machine Learning: NN, k-NN, SVR, in univariate and multivariate settings
The diagram marks regions as Established Research and Future Research.
Multistep ahead TS forecasting - Taieb [2014]
Definition: Given a univariate time series $\{y_1, \dots, y_T\}$ comprising $T$ observations, forecast the next $H$ observations $\{y_{T+1}, \dots, y_{T+H}\}$, where $H$ is the forecast horizon.
Hypotheses:
◮ Autoregressive model $y_t = m(y_{t-1}, \dots, y_{t-d}) + \varepsilon_t$ with lag order (embedding) $d$
◮ $\varepsilon_t$ is an i.i.d. stochastic noise term with $\mu_\varepsilon = 0$ and $\sigma^2_\varepsilon = \sigma^2$
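The autoregressive hypothesis turns forecasting into supervised learning: each target $y_t$ is paired with its $d$ preceding values. A minimal sketch of that embedding step follows; the function name `embed` and the toy series are hypothetical, chosen only for illustration.

```python
def embed(series, d):
    """Build (X, y) with X[i] = (y_{t-1}, ..., y_{t-d}) and y[i] = y_t."""
    X, y = [], []
    for t in range(d, len(series)):
        # Most recent lag first, matching m(y_{t-1}, ..., y_{t-d}).
        X.append([series[t - k] for k in range(1, d + 1)])
        y.append(series[t])
    return X, y


# Hypothetical toy series with T = 6 observations and lag order d = 3.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = embed(series, d=3)
for inputs, target in zip(X, y):
    print(inputs, "->", target)
```

Any regression learner fitted on `(X, y)` then plays the role of $m$.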
Multistep ahead forecasting for volatility
◮ State-of-the-art (NAR): input $[\sigma^P_{t-d} \cdots \sigma^P_{t-1}]$, model $m(\sigma^P)$, output $[\hat{\sigma}^P_t \cdots \hat{\sigma}^P_{t+H}]$ (1 input, 1 output)
◮ Proposed model (NARX): inputs $[\sigma^P_{t-d} \cdots \sigma^P_{t-1}]$ and $[\sigma^X_{t-d} \cdots \sigma^X_{t-1}]$, model $m(\sigma^P, \sigma^X)$, output $[\hat{\sigma}^P_t \cdots \hat{\sigma}^P_{t+H}]$ (2 inputs, 1 output)
◮ Future work: inputs $[\sigma^P_{t-d} \cdots \sigma^P_{t-1}], \cdots, [\sigma^{X_M}_{t-d} \cdots \sigma^{X_M}_{t-1}]$, model $m(\sigma^P, \cdots, \sigma^{X_M})$, outputs $[\hat{\sigma}^P_t \cdots \hat{\sigma}^P_{t+H}], \cdots, [\hat{\sigma}^{X_M}_t \cdots \hat{\sigma}^{X_M}_{t+H}]$ ($M+1$ inputs, $M+1$ outputs)
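Multistep ahead forecasting for volatility - Direct method
◮ A single model $f_h$ for each horizon $h$.
◮ The forecast at step $h$ is made using the $h$-th model.
◮ Dataset examples ($d = 3$, $h = 3$):

Direct NAR
  x = ($\sigma^P_3$, $\sigma^P_2$, $\sigma^P_1$), y = $\sigma^P_5$
  x = ($\sigma^P_4$, $\sigma^P_3$, $\sigma^P_2$), y = $\sigma^P_6$
  ...
  x = ($\sigma^P_{T-5}$, $\sigma^P_{T-6}$, $\sigma^P_{T-7}$), y = $\sigma^P_{T-2}$

Direct NARX
  x = ($\sigma^P_3$, $\sigma^P_2$, $\sigma^P_1$, $\sigma^X_3$, $\sigma^X_2$, $\sigma^X_1$), y = $\sigma^P_5$
  x = ($\sigma^P_4$, $\sigma^P_3$, $\sigma^P_2$, $\sigma^X_4$, $\sigma^X_3$, $\sigma^X_2$), y = $\sigma^P_6$
  ...
  x = ($\sigma^P_{T-5}$, $\sigma^P_{T-6}$, $\sigma^P_{T-7}$, $\sigma^X_{T-5}$, $\sigma^X_{T-6}$, $\sigma^X_{T-7}$), y = $\sigma^P_{T-2}$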
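The direct strategy can be sketched as follows for the NAR case. This is a minimal sketch under assumptions: the helper names are hypothetical, the learner is a plain k-nearest-neighbours average rather than the tuned models of the study, and the target is taken to be the value $h$ steps after the most recent input, which is one possible reading of the slide's indexing.

```python
def direct_dataset(series, d, h):
    """Pairs (x, y) for horizon h: x holds the d most recent values at time t
    (most recent first), y is the value h steps after t (assumed convention)."""
    X, y = [], []
    for t in range(d - 1, len(series) - h):
        X.append([series[t - k] for k in range(d)])
        y.append(series[t + h])
    return X, y


def knn_predict(X, y, query, k=3):
    """Average target of the k training inputs closest to query (squared Euclidean)."""
    order = sorted(
        range(len(X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], query)),
    )
    nearest = order[:k]
    return sum(y[i] for i in nearest) / len(nearest)


def direct_forecast(series, d, H, k=3):
    """One model per horizon h = 1..H, each queried with the last d observations."""
    query = [series[-1 - j] for j in range(d)]
    forecasts = []
    for h in range(1, H + 1):
        X, y = direct_dataset(series, d, h)  # separate training set per horizon
        forecasts.append(knn_predict(X, y, query, k))
    return forecasts


# Hypothetical toy series: a repeating pattern, so the continuation is known.
series = [1.0, 2.0, 3.0, 4.0] * 6
print(direct_forecast(series, d=3, H=4))
```

A NARX variant would only change `direct_dataset`, concatenating the lags of the external series to each input vector while keeping the same target.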
Experimental setup
Data: Volatility proxies $\sigma^X$, $\sigma^P$ from CAC40:
◮ Price based
  ◮ $\sigma^i$ family - Garman and Klass [1980]
◮ Return based
  ◮ GARCH(1,1) model - Hansen and Lunde [2005]
  ◮ Sample standard deviation
Models:
◮ Feedforward Neural Networks (NAR, NARX)
◮ k-Nearest Neighbours (NAR, NARX)
◮ Support Vector Regression (NAR, NARX)
◮ Naive (w/o $\sigma^X$)
◮ GARCH(1,1) (w/o $\sigma^X$)
◮ Average (w/o $\sigma^X$)
[Diagram: inputs $[\sigma^P_{t-d} \cdots \sigma^P_{t-1}]$ and $[\sigma^X_{t-d} \cdots \sigma^X_{t-1}]$ enter $m(\sigma^P, \sigma^X)$, which outputs $[\hat{\sigma}^P_t \cdots \hat{\sigma}^P_{t+H}]$: 2 TS input, 1 TS output.]
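To make the idea of a price-based proxy concrete, the sketch below computes two well-known range-based variance estimators from one OHLC bar: the high-low (Parkinson) estimator and the widely used simplified Garman-Klass estimator. Which of the slides' $\sigma^i$ proxies these correspond to is not stated here, so treat the mapping, the function names and the toy bars as assumptions.

```python
import math


def parkinson_variance(high, low):
    """High-low range estimator: ln(H/L)^2 / (4 ln 2)."""
    return math.log(high / low) ** 2 / (4.0 * math.log(2.0))


def garman_klass_variance(open_, high, low, close):
    """Simplified Garman-Klass estimator:
    0.5 * ln(H/L)^2 - (2 ln 2 - 1) * ln(C/O)^2."""
    hl = math.log(high / low)
    co = math.log(close / open_)
    return 0.5 * hl ** 2 - (2.0 * math.log(2.0) - 1.0) * co ** 2


# Hypothetical daily bars as (open, high, low, close).
bars = [
    (10.0, 10.2, 9.8, 10.1),
    (10.1, 10.4, 10.0, 10.3),
    (10.3, 10.35, 9.9, 10.0),
]

# One volatility proxy value per day (square root of the variance estimate).
proxies = [math.sqrt(garman_klass_variance(*bar)) for bar in bars]
print(proxies)
```

Both estimators use only information inside one trading day, which is why such proxies can be produced for every day of an OHLC series.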
Correlation meta-analysis (cf. Field [2001])
◮ 40 time series (CAC40)
◮ Time range: 05-01-2009 to 22-10-2014
◮ 1489 OHLC samples per TS
◮ Hierarchical clustering using Ward Jr [1963]
◮ All correlations are statistically significant
[Figure: correlation matrix, colour scale from −1 to 1, among Volume, $\sigma^1$, $\sigma^6$, $\sigma^4$, $\sigma^5$, $\sigma^2$, $\sigma^3$, $r_t$, $\sigma^0$, $\sigma^{SD,250}$, $\sigma^{SD,100}$, $\sigma^{SD,50}$ and $\sigma^G$.]
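A correlation meta-analysis pools one correlation coefficient per time series into a single estimate. The sketch below shows one common pooling scheme, averaging in Fisher's z space, which is equivalent to a fixed-effects average when all series have the same length. Whether the study used exactly this scheme is an assumption, and the function names and toy numbers are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def pooled_correlation(correlations):
    """Average correlations in Fisher z space, then map back with tanh.
    Assumes every coefficient comes from a sample of the same size."""
    zs = [math.atanh(r) for r in correlations]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical per-series correlations between two proxies.
per_series = [0.62, 0.71, 0.58, 0.66]
print(pooled_correlation(per_series))
```

The pooled matrix of such coefficients is what a hierarchical clustering step would then take as input.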
NARX forecaster - Results ANN
[Figures: two slides of result plots.]
NARX forecaster - Results KNN
[Figures: two slides of result plots.]
NARX forecaster - Results SVR
[Figures: two slides of result plots.]
Conclusions
◮ Correlation clustering among proxies belonging to the same family, i.e. $\sigma^i_t$ and $\sigma^{SD,n}_t$.
◮ All ML methods outperform the reference GARCH method, both in the single-input and the multiple-input configuration.
◮ The addition of an external regressor brings a statistically significant improvement only for $h > 8$ (paired t-test, pv = 0.05).
◮ No model appears to clearly outperform all the others at every horizon, but SVR generally performs better than ANN and k-NN.
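The paired t-test mentioned above compares two forecasters on the same test cases by looking at the differences of their errors. A minimal sketch of the test statistic follows; the function name and the error values are hypothetical. Turning the statistic into a p-value additionally needs the Student t distribution with $n-1$ degrees of freedom, which the Python standard library does not provide.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """t = mean(d) / (stdev(d) / sqrt(n)) for paired differences d = a - b."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Hypothetical forecast errors of two models on the same four series.
errors_nar = [1.0, 2.0, 3.0, 4.0]
errors_narx = [0.5, 1.0, 2.5, 3.0]

t = paired_t_statistic(errors_nar, errors_narx)
print(t)
```

A large positive value indicates that the first model's errors are consistently larger than the second's.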
Thank you for your attention! Any questions/comments? jacopo.de.stefani@ulb.ac.be Find the paper at:
Bibliography
Tim Bollerslev. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3):307–327, 1986.
Andy P Field. Meta-analysis of correlation coefficients: a Monte Carlo comparison of fixed- and random-effects methods. Psychological Methods, 6(2):161, 2001.
Mark B Garman and Michael J Klass. On the estimation of security price volatilities from historical data. Journal of Business, pages 67–78, 1980.
Peter R Hansen and Asger Lunde. A forecast comparison of volatility models: does anything beat a GARCH(1,1)? Journal of Applied Econometrics, 20(7):873–889, 2005.
Rob J Hyndman and Anne B Koehler. Another look at measures of forecast accuracy. International Journal of Forecasting, 22(4):679–688, 2006.
Souhaib Ben Taieb. Machine learning strategies for multi-step-ahead time series forecasting. PhD thesis, 2014.
Joe H Ward Jr. Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301):236–244, 1963.
Appendix
System overview
[Diagram: processing pipeline]
◮ Raw OHLC data
◮ Missing values imputation, producing imputed OHLC data
◮ Data preprocessing
◮ Proxy generation: $\sigma^i_t$, $\sigma^{SD}_t$, $\sigma^G_t$
◮ Correlation analysis
◮ Model identification: $m^*$, $\theta^*$
◮ Forecaster
User choices: model choice {ANN, KNN} and evaluation choice {RO, RW}.