
Volatility Forecasting with Sparse Bayesian Kernel Models

Peter Tiňo, University of Birmingham, UK
Nikolay Nikolaev, Goldsmiths College, University of London, UK
Xin Yao, University of Birmingham, UK

Some motivations

◗ Quantizing real-valued financial time series into symbolic streams and then applying predictive models to such sequences: good or bad?
◗ Careful quantization can reduce the noise component in the data while preserving the underlying predictable patterns in the stochastic process (Buhlmann 1998, Giles 1997, Schittenkopf 2002).
◗ Still a controversial topic (Lin 2004).
◗ Large comparative studies of various model classes used to predict daily volatility differences in order to trade (on a daily basis) straddles on the DAX and FTSE 100 indexes (Tino 2001, Schittenkopf 2002).

Previous work (Tino 2001, Schittenkopf 2002)

• (At-the-money) straddles traded based on predictions of daily (implied/historical) volatility differences in the underlying indexes. Backtesting, but in a realistic trading setting.
• Continuous models operating on the original real-valued sequences of volatility differences (feed-forward neural networks, mixture density networks, AR models).
• Symbolic models operating on the quantized sequences (fixed-order Markov models, variable memory length Markov models, fractal prediction machines).

Previous work cont’d

Two key observations:
• The quantization technique significantly improves the overall profit.
• Quantization into just two symbols representing the sign of daily volatility moves gave the best results.

This contribution: add another token to the ‘discretize vs. don’t discretize’ debate by applying carefully formulated continuous models, namely the Non-informative Prior Relevance Vector Machine (NPRVM), which is based on the Relevance Vector Machine (RVM) (Tipping, 2001).

Model formulation

Time series of scalar observables (volatility differences): $x_1, \ldots, x_t, \ldots, x_T$.

Predictive model operating on lagged delay vectors:

$$\hat{x}_{t+1} \equiv y_t = f(\mathbf{x}_t) = f(x_{t-(d-1)\tau}, x_{t-(d-2)\tau}, \ldots, x_t), \qquad (1)$$

where $d$ is the embedding dimension and $\tau$ is the delay time.

Generalized linear kernel regression formulation:

$$f(\mathbf{x}) = \sum_{n=1}^{M} w_n K(\mathbf{x}, \mathbf{x}_n), \qquad (2)$$

where $K(\cdot, \cdot)$ is the kernel basis function, e.g. $K(\mathbf{x}, \mathbf{x}_n) = \exp[-\|\mathbf{x} - \mathbf{x}_n\|^2 / (2 s^2)]$.
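
A minimal sketch of equations (1) and (2) in plain Python, assuming the series is held in a list. The function names `delay_vectors`, `gaussian_kernel` and `predict` are illustrative and do not come from the slides.

```python
import math


def delay_vectors(series, d, tau=1):
    """Build (inputs, targets) for equation (1).

    inputs[i]  = (x[t-(d-1)tau], ..., x[t-tau], x[t])
    targets[i] = x[t+1]
    """
    first_t = (d - 1) * tau  # earliest t with a full delay vector
    inputs, targets = [], []
    for t in range(first_t, len(series) - 1):
        inputs.append([series[t - (d - 1 - k) * tau] for k in range(d)])
        targets.append(series[t + 1])
    return inputs, targets


def gaussian_kernel(x, z, s2):
    """Gaussian kernel exp(-||x - z||^2 / (2 s^2)); s2 is the squared width."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * s2))


def predict(x, centres, weights, s2):
    """Equation (2): weighted sum of kernels centred on the retained points."""
    return sum(w * gaussian_kernel(x, c, s2) for w, c in zip(weights, centres))


inputs, targets = delay_vectors([0, 1, 2, 3, 4, 5], d=3, tau=1)
# inputs  -> [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
# targets -> [3, 4, 5]
```

Each delay vector ends at time t and is paired with the next observation, so a series of length T yields T - 1 - (d - 1)τ training pairs.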

RVM

The future values of $x$ are modeled as $x_{t+1} = f(\mathbf{x}_t) + \varepsilon_t$, where $\varepsilon_t$ is i.i.d. zero-mean Gaussian noise with (unknown) variance $\sigma^2$.

RVM:
• Start with basis functions centered on all given data points.
• ARD framework for weights $w_n$: the prior $p(\mathbf{w} \mid \alpha)$ over the $M$ weights is an $M$-dimensional Gaussian with zero mean and covariance matrix $\Gamma(\alpha) = \mathrm{diag}(\alpha_1^{-1}, \alpha_2^{-1}, \ldots, \alpha_M^{-1})$.

The hyperparameters $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_M)$ quantify the prior belief in the possible ranges of weight values. The hyperparameter $\beta$ quantifies the inverse variance of the output noise.

RVM cont’d

Posterior distribution over the weights:

$$p(\mathbf{w} \mid \mathbf{y}, \alpha, \beta^{-1}) = \frac{p(\mathbf{y} \mid \mathbf{x}, \mathbf{w}(\alpha), \beta^{-1})\, p(\mathbf{w} \mid \alpha)}{p(\mathbf{y} \mid \alpha, \beta^{-1})} \qquad (3)$$

The RVM re-estimates the weights and hyperparameters by maximizing the marginal likelihood of the hyperparameters, $p(\mathbf{y} \mid \mathbf{x}, \alpha, \beta^{-1})$. Some hyperparameters $\alpha_n$ grow, causing their corresponding weights $w_n$ to shrink toward zero. In practice, all training points $\mathbf{x}_n$ whose hyperparameter $\alpha_n$ exceeds a predefined threshold $\alpha_{MAX}$ are pruned from the model.
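
The pruning step can be sketched as a simple filter. This is an illustration only: the function name `prune_relevance_vectors` and the list-based representation of the model are assumptions, not part of the slides.

```python
def prune_relevance_vectors(centres, weights, alphas, alpha_max):
    """Keep only basis functions whose hyperparameter alpha_n is at most alpha_max."""
    keep = [n for n, alpha in enumerate(alphas) if alpha <= alpha_max]
    return (
        [centres[n] for n in keep],
        [weights[n] for n in keep],
        [alphas[n] for n in keep],
    )


# Toy model with three kernels; the middle one has a huge alpha (tiny weight).
centres, weights, alphas = prune_relevance_vectors(
    centres=[[0.0], [1.0], [2.0]],
    weights=[0.5, 1e-6, -0.3],
    alphas=[2.0, 1e9, 5.0],
    alpha_max=1e3,
)
# Only the first and third kernels survive.
```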

A stronger bias for sparseness ...

Figueiredo (2003): increase the pressure for model sparseness by considering a Laplacian prior (instead of a Gaussian one):

$$p(\mathbf{w} \mid \kappa) = \left(\frac{\kappa}{2}\right)^{M} \exp(-\kappa \|\mathbf{w}\|_1).$$

This can be motivated by assuming that each weight $w_n$ has a zero-mean Gaussian prior with variance $\alpha_n^{-1}$ (RVM) and that each variance $\alpha_n^{-1}$ has an exponential hyper-prior (hierarchical Bayes):

$$p(\alpha_n \mid \gamma) = \frac{\gamma}{2} \exp\left(-\frac{\gamma}{2 \alpha_n}\right).$$

The hyperparameter $\gamma$ controls the degree of model sparseness.
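
As a worked step behind this motivation, the Laplacian form can be recovered by integrating out the variance. The sketch below reads the exponential hyper-prior as a density over the variance, written $v_n = \alpha_n^{-1}$; the symbol $v_n$ is introduced only for this derivation and does not appear in the slides.

```latex
p(w_n \mid \gamma)
  = \int_0^\infty \mathcal{N}(w_n \mid 0, v_n)\,
    \frac{\gamma}{2} \exp\!\left(-\frac{\gamma v_n}{2}\right) dv_n
  = \frac{\sqrt{\gamma}}{2} \exp\!\left(-\sqrt{\gamma}\,|w_n|\right)
```

Taking the product over the $M$ independent weights gives a Laplacian prior of the form shown on the slide, with $\kappa = \sqrt{\gamma}$.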

Get rid of γ

Figueiredo (2003): replace the exponential hyper-prior on the variances $\alpha_n^{-1}$ by a non-informative Jeffreys hyper-prior

$$p(\alpha_n^{-1}) \propto \frac{1}{\alpha_n^{-1}}, \quad \text{hence} \quad p(\alpha_n) \propto \alpha_n.$$

Treat the variances $\alpha_n^{-1}$ as hidden data. EM-style weight update:

$$\hat{\mathbf{w}}(t) = \beta (\beta \mathbf{K}^T \mathbf{K} + \mathbf{A}(t))^{-1} \mathbf{K}^T \mathbf{y}, \qquad (4)$$

where $\mathbf{K}$ is the kernel design matrix with entries $K(\mathbf{x}_i, \mathbf{x}_j)$, $1 \le i, j \le N$, and

$$\mathbf{A}(t) = \mathrm{diag}(|w_1(t)|^{-2}, |w_2(t)|^{-2}, \ldots, |w_M(t)|^{-2}).$$

$\mathbf{A}$ plays the role of $\Gamma(\alpha)^{-1} = \mathrm{diag}(\{\alpha_n\})$ in RVM weight estimation. Effectively, the variances $\alpha_n^{-1}$ at time $t$ are estimated as $|w_n(t)|^2$.
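
One step of update (4) can be sketched as follows. The names `solve` and `em_weight_update` are hypothetical. The small Gaussian-elimination solver is included only to keep the example self-contained (a linear-algebra library would normally be used). The sketch also assumes every current weight is nonzero, since weights at zero make $|w_n|^{-2}$ undefined and would be pruned first.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented copy, so the caller's matrices are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def em_weight_update(K, y, w, beta):
    """New weights beta (beta K^T K + A)^{-1} K^T y with A = diag(|w_n|^-2)."""
    rows, cols = len(K), len(w)
    gram = [[sum(K[i][p] * K[i][q] for i in range(rows)) for q in range(cols)]
            for p in range(cols)]                      # K^T K
    kty = [sum(K[i][p] * y[i] for i in range(rows)) for p in range(cols)]  # K^T y
    lhs = [[beta * gram[p][q] + (abs(w[p]) ** -2 if p == q else 0.0)
            for q in range(cols)] for p in range(cols)]
    # Solving (beta K^T K + A) w_new = beta K^T y avoids forming the inverse.
    return solve(lhs, [beta * v for v in kty])


# Toy example with an identity design matrix (an assumption for illustration).
K = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1]
w_new = em_weight_update(K, y, w=[2.0, 0.1], beta=1.0)
```

With this toy design the large weight shrinks mildly (from 2 to 1.6), while the small one is driven much closer to zero (from 0.1 to roughly 0.001), which is the sparsifying behaviour the Jeffreys hyper-prior is meant to produce.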

Data

• DAX: daily closing values of the DAX (August 1991 – June 1998) and daily closing prices of call and put options on the DAX with different maturities and exercise prices. The first in-the-money and the first out-of-the-money call and put options maturing the next month are available. The at-the-money point is assumed to be the value of the DAX at that time. The prices of call and put options are added to obtain straddle prices.
• FTSE 100: intraday bid-ask prices of American options on the FTSE 100 at LIFFE (May 1991 – December 1995). Trading takes place at 3 pm on normal trading days and at 12 pm otherwise. The first quotes of call and put options maturing the next month, with the same strike price as close as possible to the current value of the FTSE 100, are extracted. For these (roughly at-the-money) options, the average of the bid and ask quotes is used as an approximation of the option price.

Trading strategy

Only straddles maturing the following month are traded (this avoids the influence of strong price movements towards the end of contracts).

Every trading day, predict the change in volatility for the next trading day. If volatility is predicted to increase, buy near-the-money straddles (strike price closest to the at-the-money point) worth a fixed amount of money; otherwise sell them.

On the next trading day, close the position and restart by predicting the next volatility change.

The investment is fixed but otherwise arbitrary, which facilitates the interpretation of results with respect to transaction costs.
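
The daily rule can be sketched as below. The function name `daily_profit` and its parameters are illustrative. The sketch ignores transaction costs, and treating a predicted change of exactly zero as "sell" follows the "otherwise" wording.

```python
def daily_profit(predicted_change, price_today, price_next, investment):
    """Profit of one buy-or-sell straddle trade closed on the next trading day."""
    units = investment / price_today  # straddles worth the fixed amount
    if predicted_change > 0:
        # Volatility expected to rise: buy today, sell tomorrow.
        return units * (price_next - price_today)
    # Otherwise: sell today, buy back tomorrow.
    return units * (price_today - price_next)


long_pnl = daily_profit(0.3, price_today=50.0, price_next=55.0, investment=1000.0)
short_pnl = daily_profit(-0.1, price_today=50.0, price_next=55.0, investment=1000.0)
# long_pnl is 100.0 and short_pnl is -100.0 for this price move.
```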

Dealing with non-stationarity

‘Sliding window technique’ (shift by 5 days). Within each window:
• Training set: 500 trading days. Several representatives from the class of NPRVM models, with different pruning cut values ($\alpha_{MAX} = 0.25$ and $\alpha_{MAX} = 0.5$) and kernel widths $s^2$ (input lag $d = 10$), are estimated.
• Validation set: 125 trading days. The accumulated profit of the estimated models is checked on the validation set; this is the criterion for selecting a model class representative.
• Test set: 5 trading days. The out-of-sample profit of the model class representative is determined on the test set.
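
A minimal sketch of the window bookkeeping, using the sizes quoted above as defaults. The generator name `sliding_windows` is an assumption, and it yields index ranges rather than data.

```python
def sliding_windows(n, train=500, valid=125, test=5, shift=5):
    """Yield (train, valid, test) index ranges over a series of length n."""
    start = 0
    while start + train + valid + test <= n:
        train_end = start + train
        valid_end = train_end + valid
        yield (
            range(start, train_end),
            range(train_end, valid_end),
            range(valid_end, valid_end + test),
        )
        start += shift  # shifting by the test length keeps test sets disjoint


windows = list(sliding_windows(640))
# Three windows fit; their test ranges are 625-629, 630-634 and 635-639.
```

Because the shift equals the test set length, consecutive test sets tile the series without overlap.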

Experimental setup

[Figure: the series of daily volatility differences is split into Train, Valid and Test windows; the test windows yield a series of daily test set profits, which is divided into block 1, block 2, block 3, ..., block n to give a series of average block-profits.]

Estimating significance of the results

The test sets are non-overlapping, so the test set profits are concatenated to form a large series of out-of-sample profits.

Divide the daily profits into disjoint blocks of length 60 and 40 for the DAX and FTSE series, respectively.

The average block profit can be assumed to be normally distributed (central limit theorem). The Jarque-Bera test did not reject the null hypothesis of a normal distribution at any reasonable significance level.

Subject the series of average block profits to t-tests.
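
The block averaging and the test statistic can be sketched as below. Two assumptions are made that the slides do not state: a trailing incomplete block is dropped, and the t-test is taken to be a one-sample test of the mean block profit against zero. The function names are illustrative, and the Jarque-Bera check is not reproduced here.

```python
import math
import statistics


def block_means(profits, block_len):
    """Average daily profits over disjoint blocks; a trailing partial block is dropped."""
    n_blocks = len(profits) // block_len
    return [
        statistics.mean(profits[i * block_len:(i + 1) * block_len])
        for i in range(n_blocks)
    ]


def t_statistic(sample, mu0=0.0):
    """One-sample t statistic for the hypothesis that the mean equals mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


means = block_means([1.0, 3.0, 2.0, 4.0, 3.0, 5.0], block_len=2)  # [2.0, 3.0, 4.0]
t = t_statistic(means)  # 3 * sqrt(3), with n - 1 = 2 degrees of freedom
```

The statistic would then be compared with a Student t distribution with one fewer degrees of freedom than the number of blocks.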

‘Simple’ and Compound models

Simple: pick one of the four trivial strategies (‘Always Sell’, ‘Always Buy’, ‘Copy the last trading decision’ and ‘Reverse the last trading decision’) based on the validation set profit. This eliminates the need for a training set potentially containing old (no longer relevant) data.

Compound models: ‘NPRVM+Simple’ makes predictions in the test week using either the more sophisticated model (NPRVM) or ‘Simple’, depending on which model gained more profit on the validation set.
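
The selection logic can be sketched as follows, given validation profits that have already been computed. The function names and the dictionary representation are assumptions. Breaking ties in favour of NPRVM is also an assumption, since the slides do not say what happens when the profits are equal.

```python
def select_simple(validation_profits):
    """Return the name of the trivial strategy with the highest validation profit."""
    return max(validation_profits, key=validation_profits.get)


def select_compound(nprvm_profit, simple_profits):
    """Choose between NPRVM and the best trivial strategy for the test week."""
    best_simple = select_simple(simple_profits)
    if nprvm_profit >= simple_profits[best_simple]:  # tie goes to NPRVM (assumption)
        return "NPRVM"
    return best_simple


# Made-up validation profits, for illustration only.
simple_profits = {
    "Always Sell": -3.0,
    "Always Buy": 4.0,
    "Copy the last trading decision": 1.5,
    "Reverse the last trading decision": -1.5,
}
choice_a = select_compound(5.0, simple_profits)  # "NPRVM"
choice_b = select_compound(2.0, simple_profits)  # "Always Buy"
```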
