A Bootstrap Stationarity Test for Predictive Regression Invalidity Robert Taylor University of Essex Co-authors: Iliyan Georgiev (University of Bologna) David Harvey (University of Nottingham) Stephen Leybourne (University of Nottingham) EFiC Conference, 7/7/2017 (Robert Taylor, Essex) EFiC Conference, 7/7/2017 1 / 61
Introduction
Predictive regressions play an important role in empirical economics. For example, under Granger causality a variable is not considered a cause of another variable if the former cannot predict the latter. In financial economics it is of interest whether current information on variables such as dividend yields or interest spreads contains information about future (excess) stock price returns (e.g. Campbell and Shiller, 1988, JF).
Introduction
Another linear rational expectations hypothesis that can be tested by predictive regression methods is the uncovered interest rate parity hypothesis (UIPH), which asserts that the expected change of future exchange rates is equal to the difference between (conformable) domestic and foreign interest rates. Here the predictive regression is of the change in the exchange rate minus the previous period interest rate differential, regressed onto the previous period interest rate differential. If the UIPH holds, interest rate differentials should have no predictive power in this regression, but the coefficient on interest rate differentials is often found to be significantly negative in practice (e.g. Froot and Thaler, 1990, Jn. Ec. Persp.).
Introduction
An important practical problem with performing such predictive regressions in financial applications is that in many cases the regressor is highly persistent, whereas the dependent variable is close to white noise. For example, stock price returns or exchange rate changes appear to be approximately white noise, whereas predictors like dividend yields or interest rate differentials exhibit persistence akin to that of a unit root or near unit root autoregressive process. As shown by Elliott and Stock (1994, ET), the conventional t-statistic in the predictive regression can suffer from severe size distortions in such cases.
Introduction
Consider testing $H_0: \beta = 0$ (i.e. $y_t$ is unpredictable by $x_{t-1}$) in the predictive regression
$$y_t = \alpha + \beta x_{t-1} + \epsilon_t$$
where $y_t$ is local-to-white noise (e.g. returns) and $x_t$ is local-to-unit root (e.g. dividend yield). A number of papers have focused on developing asymptotically valid tests of this hypothesis, allowing for an unknown local-to-unity parameter in $x_t$ and unknown correlation between $\epsilon_t$ and the innovations to the $x_t$ process, e.g.:
Cavanagh et al. (1995, ET) (Bonferroni bounds that yield conservative tests)
Campbell and Yogo (2006, JFE) (point optimal t-test and employing confidence belts)
Breitung and Demetrescu (2015, JoE) (variable addition and IV methods).
Introduction
However, suppose the true DGP is
$$y_t = \alpha + \delta z_{t-1} + \epsilon_t$$
where $z_t$ is some other local-to-unit root process uncorrelated with $x_t$. In this case, testing $H_0: \beta = 0$ in
$$y_t = \alpha + \beta x_{t-1} + \epsilon_t$$
can result in an asymptotically over-sized test. This over-size can be interpreted as a tendency to find a spurious predictor of $y_t$: it is incorrectly concluded that $x_{t-1}$ can be used to predict $y_t$ when in actuality $y_t$ is only predictable by $z_{t-1}$. Such spurious predictive regression possibilities were highlighted by Ferson et al. (2003a,b, JF, Jnl. Inv. Man.) (using simulation) and Deng (2014, J Fin. Ectrx) (using an asymptotic analysis).
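The over-sizing described on this slide can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the Gaussian innovations, zero starting values, the two-sided 1.96 critical value, and the function names `ols_tstat` and `spurious_rejection_rate` are all assumptions made for illustration, with $\delta = g_z / T$ used as a local-to-zero coefficient.

```python
import numpy as np

def ols_tstat(y, x_lag):
    """Conventional t-statistic on the slope in an OLS regression of y on (1, x_lag)."""
    X = np.column_stack([np.ones_like(x_lag), x_lag])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (len(y) - 2)          # residual variance estimate
    cov = s2 * np.linalg.inv(X.T @ X)          # homoskedastic OLS covariance
    return coef[1] / np.sqrt(cov[1, 1])

def spurious_rejection_rate(T=200, g_z=50.0, c=0.0, reps=300, seed=0):
    """Share of replications in which |t| > 1.96 although y depends only on z.

    Assumed design (not from the slides): i.i.d. N(0,1) innovations,
    x_0 = z_0 = 0, a common local-to-unity parameter c, and delta = g_z / T.
    """
    rng = np.random.default_rng(seed)
    rho = 1.0 - c / T
    rejections = 0
    for _ in range(reps):
        e = rng.standard_normal((T + 1, 3))
        x = np.zeros(T + 1)
        z = np.zeros(T + 1)
        for t in range(1, T + 1):
            x[t] = rho * x[t - 1] + e[t, 0]
            z[t] = rho * z[t - 1] + e[t, 1]
        y = (g_z / T) * z[:-1] + e[1:, 2]      # true DGP uses z_{t-1}, not x_{t-1}
        rejections += abs(ols_tstat(y, x[:-1])) > 1.96
    return rejections / reps
```

Calling `spurious_rejection_rate(g_z=0.0)` gives a baseline in which $y_t$ is pure noise, which can be compared with the rate obtained for a non-zero `g_z`.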
Introduction
In this paper we show theoretically the potential for spurious predictive regression to arise in the context of a model where $x_t$ and $z_t$ follow similar but uncorrelated persistent processes (modelled as local-to-unity autoregressions), while modelling the coefficient on $z_{t-1}$ as being local-to-zero. We find that spurious rejections in favour of $y_t$ being predicted by $x_{t-1}$ can occur very frequently. It is important therefore to be able to identify whether or not the potential predictive regression of $y_t$ on $x_{t-1}$ is mis-specified due to omission of a relevant predictor $z_{t-1}$.
Introduction
We propose a test for predictive regression invalidity based on the following:
If $y_t = \alpha + \beta x_{t-1} + \epsilon_t$ is the true DGP, the persistent component of $y_t$ is present in the regression of $y_t$ on $x_{t-1}$, and the residuals will be stationary.
If $y_t = \alpha + \delta z_{t-1} + \epsilon_t$ is the true DGP, the persistent component of $y_t$ is not present in the regression of $y_t$ on $x_{t-1}$, and the residuals will be persistent.
So, any remaining persistence in the residuals from the regression of $y_t$ on $x_{t-1}$ must be due to $z_{t-1}$, signalling invalidity of a predictive regression that employs $x_{t-1}$. Our proposed test therefore tests for persistence in the residuals from a regression of $y_t$ on $x_{t-1}$, adapting the co-integration tests of Shin (1994) and Leybourne and McCabe (1994, JBES), which are variants of the stationarity test of Kwiatkowski et al. (1992, JoE) (KPSS).
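A KPSS-type statistic on the residuals from the regression of $y_t$ on a constant and $x_{t-1}$ might be computed as in the sketch below. The slide does not give the exact form of the statistic, so the scaling used here (squared partial sums of residuals over $T^2$ times the simple residual variance, with no long-run variance correction) and the function name `kpss_type_stat` are assumptions.

```python
import numpy as np

def kpss_type_stat(y, x_lag):
    """KPSS-type statistic from OLS residuals of y on (1, x_lag).

    Assumed form: T^{-2} * sum_t (sum_{i<=t} e_i)^2 / (T^{-1} * sum_t e_t^2),
    i.e. the short-run residual variance is used as the scale estimate.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(x_lag, dtype=float)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    T = len(y)
    partial_sums = np.cumsum(resid)
    sigma2_hat = resid @ resid / T
    return (partial_sums @ partial_sums) / (T**2 * sigma2_hat)
```

Large values point towards persistent residuals. Because the statistic is a ratio of quadratic forms in the residuals, it does not change when `y` is rescaled or shifted by a constant.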
Introduction
A difficulty is that under our null (predictive regression validity), our proposed test has a limit distribution that still depends on the local-to-unity parameter in the process for $x_t$. This makes it very difficult to control the size of the test, since the local-to-unity parameter cannot be consistently estimated. We show that a fixed regressor wild bootstrap procedure (cf. Hansen, 2000, JoE) that conditions on $x_{t-1}$ can be implemented to yield an asymptotically size-controlled testing strategy. This procedure is also robust to a wide range of non-stationary error volatility patterns, which is potentially important for applications to financial and economic time series.
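One plausible reading of a fixed regressor wild bootstrap is sketched below: the observed $x_{t-1}$ is held fixed, bootstrap data are built by multiplying the OLS residuals by i.i.d. standard normal draws, and the statistic is recomputed on each bootstrap sample. The slide does not spell out these steps, so the choice of residuals, the N(0,1) multipliers, the form of the statistic, and the names `wild_bootstrap_pvalue`, `_resid` and `_stat` are all assumptions.

```python
import numpy as np

def _resid(y, X):
    """OLS residuals of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def _stat(resid):
    """Assumed KPSS-type statistic: sum of squared partial sums / (T * sum of squares)."""
    T = len(resid)
    partial_sums = np.cumsum(resid)
    return (partial_sums @ partial_sums) / (T * (resid @ resid))

def wild_bootstrap_pvalue(y, x_lag, n_boot=499, seed=0):
    """Return (statistic, bootstrap p-value), conditioning on the observed x_lag."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(x_lag, dtype=float)])
    e_hat = _resid(y, X)
    stat = _stat(e_hat)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Wild bootstrap draw: residuals times N(0,1) multipliers, which
        # preserves the pattern of residual magnitudes over time.
        y_star = e_hat * rng.standard_normal(len(y))
        # Fixed regressor step: regress on the same X as in the original sample.
        boot[b] = _stat(_resid(y_star, X))
    return stat, float(np.mean(boot >= stat))
```

The p-value is the share of bootstrap statistics at least as large as the observed one, so small values indicate more residual persistence than the bootstrap null distribution generates.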
The Predictive Regression Model
The DGP we consider for observed $y_t$ is
$$y_t = \alpha_y + \beta_x x_{t-1} + \beta_z z_{t-1} + \epsilon_{yt}, \quad t = 1, \ldots, T$$
where
$$x_t = \alpha_x + s_{x,t}, \quad z_t = \alpha_z + s_{z,t}, \quad t = 0, \ldots, T$$
$$s_{x,t} = \rho_x s_{x,t-1} + \epsilon_{xt}, \quad s_{z,t} = \rho_z s_{z,t-1} + \epsilon_{zt}, \quad t = 1, \ldots, T$$
where $\rho_x := 1 - c_x T^{-1}$ and $\rho_z := 1 - c_z T^{-1}$, with $c_x, c_z \geq 0$, so that $x_t$ and $z_t$ are persistent unit root or local-to-unit root autoregressive processes. In order to examine the asymptotic local power of the test procedures, we let $\beta_x := g_x T^{-1}$ and $\beta_z := g_z T^{-1}$, so that when $g_x$ and/or $g_z$ are non-zero, $y_t$ is a persistent, but local-to-noise process.
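A minimal simulation sketch of this DGP follows. The slide does not fix the innovation distribution or the initial conditions, so i.i.d. Gaussian $e_t$, $D_t = I$, and $s_{x,0} = s_{z,0} = 0$ are assumptions, as is the function name `simulate_dgp`.

```python
import numpy as np

def simulate_dgp(T, c_x=0.0, c_z=0.0, g_x=0.0, g_z=0.0,
                 alpha_y=0.0, alpha_x=0.0, alpha_z=0.0, H=None, seed=0):
    """Simulate y_1..y_T and x_0..x_T, z_0..z_T from the slide's DGP.

    Assumptions (not from the slides): e_t i.i.d. N(0, I_3), D_t = I,
    and zero initial conditions s_{x,0} = s_{z,0} = 0.
    """
    rng = np.random.default_rng(seed)
    H = np.eye(3) if H is None else np.asarray(H, dtype=float)
    # Row t-1 holds [eps_xt, eps_zt, eps_yt] = (H e_t)'.
    eps = rng.standard_normal((T, 3)) @ H.T
    rho_x, rho_z = 1.0 - c_x / T, 1.0 - c_z / T
    s_x = np.zeros(T + 1)
    s_z = np.zeros(T + 1)
    for t in range(1, T + 1):
        s_x[t] = rho_x * s_x[t - 1] + eps[t - 1, 0]
        s_z[t] = rho_z * s_z[t - 1] + eps[t - 1, 1]
    x = alpha_x + s_x
    z = alpha_z + s_z
    # beta_x = g_x / T and beta_z = g_z / T are the local-to-zero coefficients.
    y = alpha_y + (g_x / T) * x[:-1] + (g_z / T) * z[:-1] + eps[:, 2]
    return y, x, z
```

For example, `simulate_dgp(200, c_x=5.0, c_z=5.0, g_z=20.0)` produces a sample in which $y_t$ loads only on $z_{t-1}$, the case in which a regression on $x_{t-1}$ is mis-specified.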
The Predictive Regression Model
The innovation vector $\epsilon_t := [\epsilon_{xt}, \epsilon_{zt}, \epsilon_{yt}]'$ is taken to satisfy the following conditions:
$$\begin{bmatrix} \epsilon_{xt} \\ \epsilon_{zt} \\ \epsilon_{yt} \end{bmatrix} = H D_t e_t$$
where $e_t$ is a $3 \times 1$ vector martingale difference sequence (m.d.s.) with $\sigma_t := E(e_t e_t' \mid \mathcal{F}_{t-1})$ satisfying
$$T^{-1} \sum_{t=1}^{T} \sigma_t \xrightarrow{p} E(e_t e_t') = I_3,$$
$$H := \begin{bmatrix} h_{11} & 0 & 0 \\ 0 & h_{22} & 0 \\ h_{31} & h_{32} & h_{33} \end{bmatrix}, \quad D_t := \begin{bmatrix} d_{1t} & 0 & 0 \\ 0 & d_{2t} & 0 \\ 0 & 0 & d_{3t} \end{bmatrix}$$
with $HH'$ strictly positive definite and the $d_{it}$ satisfying $d_{it} := d_i(t/T)$, $i = 1, 2, 3$, with $d_i(\cdot)$ non-stochastic. The structure of $H$ imposes zero correlation between $\epsilon_{xt}$ and $\epsilon_{zt}$, while $\epsilon_{yt}$ can be correlated with $\epsilon_{xt}$ and/or $\epsilon_{zt}$.
The Predictive Regression Model
Stationary conditional heteroskedasticity is permitted. Unconditional heteroskedasticity is also permitted via the time-varying matrix $D_t$. Assumptions on $D_t$ allow for, e.g., single or multiple variance or covariance shifts, variances which follow a broken trend, smooth transition variance shifts, etc. In the unconditionally homoskedastic case, it is useful to let $D_t = I$ (w.l.o.g.) and define the innovation variance-covariance matrix as
$$HH' =: \Omega := \begin{bmatrix} \sigma_x^2 & 0 & \sigma_{xy} \\ 0 & \sigma_z^2 & \sigma_{zy} \\ \sigma_{xy} & \sigma_{zy} & \sigma_y^2 \end{bmatrix}$$
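The mapping from $H$ to $\Omega$ in the homoskedastic case can be checked numerically. The entries of `H` below are hypothetical values chosen only for illustration, and `omega_from_H` is an illustrative helper name.

```python
import numpy as np

def omega_from_H(H):
    """Innovation covariance matrix Omega = H H' for the case D_t = I."""
    H = np.asarray(H, dtype=float)
    return H @ H.T

# Hypothetical H with the zero pattern from the slides (rows: x, z, y).
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.5, 0.0],
              [-0.6, 0.3, 0.8]])
Omega = omega_from_H(H)

# The zero pattern of H forces Cov(eps_x, eps_z) = 0, while eps_y may be
# correlated with both: sigma_xy = h11 * h31 and sigma_zy = h22 * h32.
sigma_xz = Omega[0, 1]
sigma_xy = Omega[0, 2]
sigma_zy = Omega[1, 2]

# With a positive diagonal, H is the lower-triangular Cholesky factor of Omega.
H_recovered = np.linalg.cholesky(Omega)
```

Going the other way, a matrix $H$ with this zero pattern can be obtained from a given $\Omega$ with $\sigma_{xz} = 0$ by taking its Cholesky factor, which is convenient when calibrating simulations to target correlations.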
The Predictive Regression Model
Under our assumptions, the following weak convergence result holds:
$$T^{-1/2} \sum_{t=1}^{\lfloor Tr \rfloor} \epsilon_t \xrightarrow{w} \begin{bmatrix} M_{\eta x}(r) \\ M_{\eta z}(r) \\ M_{\eta y}(r) \end{bmatrix} := H \begin{bmatrix} \int_0^r d_1(s)\, dB_1(s) \\ \int_0^r d_2(s)\, dB_2(s) \\ \int_0^r d_3(s)\, dB_3(s) \end{bmatrix}$$
$$= \begin{bmatrix} h_{11} \{\int_0^1 d_1(s)^2 ds\}^{1/2} & 0 & 0 \\ 0 & h_{22} \{\int_0^1 d_2(s)^2 ds\}^{1/2} & 0 \\ h_{31} \{\int_0^1 d_1(s)^2 ds\}^{1/2} & h_{32} \{\int_0^1 d_2(s)^2 ds\}^{1/2} & h_{33} \{\int_0^1 d_3(s)^2 ds\}^{1/2} \end{bmatrix} \begin{bmatrix} B_{\eta 1}(r) \\ B_{\eta 2}(r) \\ B_{\eta 3}(r) \end{bmatrix}$$
with $[B_1(r), B_2(r), B_3(r)]'$ a $3 \times 1$ vector of independent standard Brownian motion processes and
$$B_{\eta i}(r) := \left\{ \int_0^1 d_i(s)^2 ds \right\}^{-1/2} \int_0^r d_i(s)\, dB_i(s), \quad i = 1, 2, 3.$$
The $B_{\eta i}(r)$ are variance-transformed Brownian motions (Brownian motion under a modification of the time domain).
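The "modification of the time domain" can be made concrete through the variance of $B_{\eta i}(r)$, which from the definition above equals $\int_0^r d_i(s)^2 ds / \int_0^1 d_i(s)^2 ds$. The sketch below evaluates this quantity on a grid with a midpoint rule; the name `variance_profile`, the grid size, and the example volatility functions are assumptions for illustration.

```python
import numpy as np

def variance_profile(d, n_grid=1000):
    """Approximate eta(r) = int_0^r d(s)^2 ds / int_0^1 d(s)^2 ds.

    `d` must accept a NumPy array of points in (0, 1). The result has
    n_grid + 1 entries, giving eta at r = 0, 1/n_grid, ..., 1.
    """
    s = (np.arange(n_grid) + 0.5) / n_grid        # midpoints of the grid cells
    d2 = np.asarray(d(s), dtype=float) ** 2
    return np.concatenate([[0.0], np.cumsum(d2) / d2.sum()])

# Homoskedastic case: eta(r) = r, so the time scale is unchanged.
eta_const = variance_profile(lambda s: np.ones_like(s))

# Hypothetical one-time volatility shift from 1 to 3 at mid-sample:
# the first half of the sample carries only 10% of the total variance.
eta_shift = variance_profile(lambda s: np.where(s < 0.5, 1.0, 3.0))
```

When $d_i(\cdot)$ is constant the profile is the identity and $B_{\eta i}$ reduces to a standard Brownian motion; a late upward volatility shift compresses the early part of the sample on the transformed time scale.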