High-dimensional functional time series forecasting

Yuan Gao, Han Lin Shang, Yanrong Yang

Yuan Gao: Research School of Finance, Actuarial Studies and Statistics, Australian National University, Building 26C, Kingsley Street, Acton ACT 2601, Australia, e-mail: u5758483@anu.edu.au
Han Lin Shang: Research School of Finance, Actuarial Studies and Statistics, Australian National University, e-mail: hanlin.shang@anu.edu.au
Yanrong Yang: Research School of Finance, Actuarial Studies and Statistics, Australian National University, e-mail: yanrong.yang@anu.edu.au

Abstract

In this paper, we consider the problem of modeling and forecasting mortality rate data of $N$ correlated populations simultaneously. The data are treated as functional time series, and techniques from functional data analysis (FDA) are used. We propose a two-fold dimension reduction approach in which $N$ populations of functions are reduced to a low-dimensional matrix using dynamic functional principal component analysis and factor modeling. Scalar time series models can be fitted to the elements of the resulting matrix, and forecast functions can be constructed. The proposed method is easy to implement, especially when the number of populations $N$ is large. We show the superiority of our approach through both simulation studies and an application to Japanese mortality rate data.

1 Introduction

Functional data are considered as realizations of smooth random curves. When curves are collected sequentially, they form a functional time series $X_t(u)$, $u \in I$. To deal with infinite-dimensional functions, there is a demand for efficient data reduction techniques. Functional principal component analysis (FPCA) is the most commonly used approach that serves this purpose. FPCA performs an eigendecomposition of the underlying covariance function. As in multivariate principal component analysis, most of the variance structure is captured in a vector of principal component scores.

There is extensive literature on FPCA. The monographs Ramsay & Silverman (2002, 2005) provide a comprehensive account of the methodology and applications of functional data analysis and FPCA. Other papers include Hall et al. (2006) and Hall & Hosseini-Nasab (2006) on theoretical properties, Yao et al. (2005) on sparse longitudinal data, and Locantore et al. (1999) and Viviani et al. (2005) on some interesting applications.

Existing FPCA methods have been developed for independent observations, which is a serious weakness when dealing with time series data. In this paper, we adopt a dynamic FPCA approach (Hörmann et al. 2015, Panaretos & Tavakoli 2013), in which serial dependence between the curves is taken into account. With dynamic FPCA, a functional time series is reduced to a vector time series whose individual component processes are mutually uncorrelated principal component scores.

It is often the case that we collect a vector of $N$ functions at a single time point $t$. If these $N$ functions are assumed to be correlated, multivariate functional models should be considered. Classical multivariate FPCA concatenates the multiple functions into one and performs univariate FPCA (Ramsay & Silverman 2005). Chiou et al. (2014) suggested normalizing each random function as a preliminary step before concatenation. Berrendero et al. (2011) studied a functional version of principal component analysis, where multivariate functional data are reduced to one or two functions rather than vectors. However, existing models for multivariate functional data either fail to handle data with a large $N$ (as in the classical FPCA approach) or are hard to implement in practice (as in Berrendero et al. 2011).

We propose in this paper a two-fold dimension reduction model to model and forecast high-dimensional functional time series. By high-dimensional, we mean that the dimension $N$ of the functional time series is allowed to be as large as, or even larger than, the length $T$ of the observed functional time series.
The dimension reduction process is straightforward and easy to implement:

1) Dynamic functional principal component analysis is performed separately on each of the $N$ functional time series, resulting in $N$ sets of principal component scores of low dimension $K$ (typically less than 5).
2) The first principal component scores from each of the $N$ functional time series are combined into an $N \times T$ matrix. Multivariate principal component analysis is then conducted to further reduce the dimension to an $r \times T$ matrix ($r \ll N$). The same is done for the second, the third, and so on up to the $K$th principal component scores. The vector of $N$ functional time series is thereby reduced to $rK$ scalar series, which we call factors.
3) A scalar time series model is fitted to each factor to produce forecasts. The forecast factors can then be used to construct forecast functions.

The proposed dimension reduction model essentially uses a matrix of small dimension ($r \times K$) to represent the covariation of the original $N$ functional time series. Elements of the reduced matrix are uncorrelated, so it is adequate to model each element with a scalar time series model.

In the second step above, multivariate principal component analysis should be tailored to dependent data settings. We adopt factor models, which are frequently used for dimension reduction. Early applications of factor analysis to multiple time series include Anderson (1963), Priestley et al. (1974) and Brillinger (1981). Time series in high-dimensional settings where $N \to \infty$ together with $T$ are studied in Chamberlain (1983), Bai (2003) and Lam et al. (2011). Among these, we adopt an approach similar to that of Lam et al. (2011), where the model is conceptually simple and asymptotic properties are established.

We are interested in applying the proposed model to mortality rate forecasting. There is existing literature in which age-specific mortality rates are modeled as functional data.
Hyndman & Ullah (2007) proposed forecasting mortality and fertility rates with a robust functional time series model. Hyndman et al. (2013) introduced a product-ratio model to forecast multiple populations. However, none of the existing models known to us is applicable when the number of populations is large. In this paper, we focus on Japanese sub-national mortality rates, where data are available for each of the 47 prefectures. The data set contains yearly age- and sex-specific mortality rates over a span of 41 years, from 1975 to 2015. We show the superiority of our model, which forecasts the 47 populations simultaneously, by comparing it with the independent functional time series model.
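As a rough illustration of the three-step procedure outlined earlier in this section, the following Python sketch reduces $N$ sets of discretized curves to $rK$ factors and produces one-step-ahead forecast curves. It is a simplified sketch under stated assumptions, not the authors' implementation: ordinary (static) PCA stands in for both dynamic FPCA and the factor model estimation, an AR(1) model fitted by least squares stands in for the scalar time series model, and all function names (`fpca_scores`, `factor_reduce`, `ar1_forecast`, `forecast_curves`) are hypothetical.

```python
import numpy as np

def fpca_scores(X, K):
    """Step 1 (simplified): static PCA of one population's curves.
    X has shape (T, G): T curves observed on a grid of G points.
    Assumption: static PCA is used here in place of dynamic FPCA."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    basis = Vt[:K].T                  # (G, K) orthonormal basis vectors
    scores = (X - mu) @ basis         # (T, K) principal component scores
    return mu, basis, scores

def factor_reduce(S, r):
    """Step 2 (simplified): reduce an (N, T) score matrix to r factors.
    Assumption: PCA on the sample covariance replaces factor model estimation."""
    m = S.mean(axis=1, keepdims=True)
    Sc = S - m
    _, V = np.linalg.eigh(Sc @ Sc.T / S.shape[1])   # eigenvalues ascending
    A = V[:, ::-1][:, :r]             # (N, r) loadings for the top r directions
    F = A.T @ Sc                      # (r, T) factor series
    return m, A, F

def ar1_forecast(f):
    """Step 3 (simplified): one-step AR(1) forecast of a scalar series,
    with the coefficient estimated by least squares without intercept."""
    phi = (f[1:] @ f[:-1]) / (f[:-1] @ f[:-1])
    return phi * f[-1]

def forecast_curves(data, K=2, r=2):
    """data has shape (N, T, G). Returns (N, G) one-step-ahead forecast curves."""
    N, T, G = data.shape
    parts = [fpca_scores(data[i], K) for i in range(N)]
    forecast = np.stack([p[0] for p in parts])       # start from the mean curves
    for k in range(K):
        # Collect the k-th scores of all N populations into an N x T matrix.
        S = np.stack([p[2][:, k] for p in parts])
        m, A, F = factor_reduce(S, r)
        f_next = np.array([ar1_forecast(F[j]) for j in range(r)])
        s_next = m[:, 0] + A @ f_next                # forecast k-th scores, shape (N,)
        for i in range(N):
            forecast[i] += s_next[i] * parts[i][1][:, k]
    return forecast
```

With data stored as an array of shape `(N, T, G)`, calling `forecast_curves(data, K=2, r=2)` returns one forecast curve per population. The values of `K` and `r` here are illustrative only; the text states only that $K$ is typically less than 5 and that $r \ll N$.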
The remainder of the paper is organized as follows. In Section 2, more detailed background on dynamic FPCA is introduced and the two-fold dimension reduction model is proposed. We present simulation studies in Section 3, and in Section 4 we apply our model to Japanese age- and sex-specific mortality rate data. Conclusions and discussion are given in Section 5.

2 Research methods

In this section, we introduce dynamic FPCA and the factor model used in the two-fold dimension reduction process.

2.1 Dynamic functional principal component analysis

In this first step, dynamic FPCA is performed on each population separately. We consider a stationary $N$-dimensional functional time series $\{X_t : t \in \mathbb{Z}\}$, where $X_t = (X_t^1(u), \ldots, X_t^N(u))^\top$ and $X_t^i(u)$ stands for the function from the $i$th population at time $t$. Each $X_t^i$ takes values in the space $H := L^2(I)$ of real-valued square-integrable functions on $I$. The space $H$ is a Hilbert space, equipped with the inner product $\langle x, y \rangle := \int_I x(u) y(u) \, du$. For each $i = 1, \ldots, N$, we assume $X_t^i$ has a continuous mean function $\mu^i(u)$ and an auto-covariance function at lag $h$, $\gamma_h^i(u, v)$, where

$$\mu^i(u) = E[X_t^i(u)], \qquad \gamma_h^i(u, v) = \mathrm{cov}[X_t^i(u), X_{t+h}^i(v)]. \tag{1}$$

The long-run covariance function is defined as

$$c^i(u, v) = \sum_{h=-\infty}^{\infty} \gamma_h^i(u, v). \tag{2}$$

Using $c^i(u, v)$ as a kernel, we define the operator $C^i$ by

$$C^i(x)(u) = \int_I c^i(u, v) x(v) \, dv, \qquad u, v \in I. \tag{3}$$

The kernel is symmetric and non-negative definite. Thus, by Mercer's Theorem, the operator $C^i$ admits an eigendecomposition

$$C^i(x) = \sum_{k=1}^{\infty} \lambda_k^i \langle x, \upsilon_k^i \rangle \upsilon_k^i, \tag{4}$$

where $(\lambda_k^i : k \geq 1)$ are the eigenvalues of $C^i$ in descending order and $(\upsilon_k^i : k \geq 1)$ are the corresponding normalized eigenfunctions. By the Karhunen-Loève Theorem, $X_t^i$ can be represented as

$$X_t^i(u) = \sum_{k=1}^{\infty} \beta_{t,k}^i \upsilon_k^i(u). \tag{5}$$
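To make the construction in equations (1) to (5) concrete, the following Python sketch discretizes a single population's curves on an equally spaced grid, estimates the long-run covariance kernel, and computes its leading eigenfunctions and scores. Several ingredients are assumptions of this sketch rather than part of the text above: the infinite sum in (2) is truncated at a finite bandwidth with Bartlett weights, the curves are centered by the sample mean before the scores are computed, integrals are approximated by Riemann sums with grid spacing `du`, and the function names are hypothetical.

```python
import numpy as np

def long_run_covariance(X, bandwidth):
    """Estimate c(u, v) on the grid from curves X of shape (T, G).
    Assumption: Bartlett-weighted sum of sample autocovariances, |h| <= bandwidth."""
    T = X.shape[0]
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / T                           # lag-0 covariance
    for h in range(1, bandwidth + 1):
        # Entry (u, v) estimates cov[X_t(u), X_{t+h}(v)], as in equation (1).
        gamma_h = Xc[:-h].T @ Xc[h:] / T
        weight = 1.0 - h / (bandwidth + 1)
        # The lag -h autocovariance is the transpose of the lag h one.
        C += weight * (gamma_h + gamma_h.T)
    return C

def dynamic_fpca(X, du, K, bandwidth):
    """Leading K eigenpairs of the discretized operator in (3) and the scores in (5)."""
    mu = X.mean(axis=0)
    C = long_run_covariance(X, bandwidth)
    # Multiplying by du turns the matrix product into a Riemann sum for (3).
    w, V = np.linalg.eigh(C * du)
    order = np.argsort(w)[::-1][:K]             # eigenvalues in descending order
    lam = w[order]
    ups = V[:, order] / np.sqrt(du)             # so that sum(ups_k**2) * du == 1
    scores = (X - mu) @ ups * du                # beta_{t,k} = <X_t - mu, ups_k>
    return mu, lam, ups, scores
```

Truncating the expansion in (5) at $K$ terms gives the approximation `mu + scores @ ups.T`, which reproduces the observed curves exactly when `K` equals the number of grid points.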