Statistical downscaling by EOFVAR-X models
Jiang, Ci-Ren (Institute of Statistical Science, Academia Sinica)
Chen, Lu-Hung* (Institute of Statistics, NCHU)
July 17, 2017
Outline
• Introduction
• EOFVAR-X model
• Real experiments
• Conclusions and discussions
Introduction
Statistical downscaling
• Station-wise multiple linear model (and its variants)
• Bayesian spatial(-temporal) models
• Neural networks
• Support vector regression models
• etc.
Station-wise multiple linear model
Denote by $Y_t = [Y_t(s_1), \ldots, Y_t(s_m)]'$ the observations from $m$ stations and by $X_t = [X_t(\tilde{s}_1), \ldots, X_t(\tilde{s}_n)]'$ the model outputs on $n$ grid points. A station-wise multiple linear model considers
$$Y_t(s_i) = X_t' \beta_i + \varepsilon_t$$
• Independence assumptions between:
  1. different stations $Y_t(s_i)$ and $Y_t(s_j)$
  2. different observational time points $Y_t(s_i)$ and $Y_{t'}(s_i)$
• But shouldn't a meteorological variable be spatially and temporally correlated?
Principal regression by Benestad et al. (2015)
• Apply principal components analysis to both observations $Y_t$ and model outputs $X_t$. That is, let
$$Y_t = \sum_{k=1}^{K} \gamma_{t,k} \phi_k \quad \text{and} \quad X_t = \sum_{\ell=1}^{L} \chi_{t,\ell} \psi_\ell. \qquad (1)$$
• The spatial relationships of $Y_t$ and $X_t$ are retained in the eigenvectors $\phi_k$'s and $\psi_\ell$'s, respectively.
• By the orthonormal property of the $\phi_k$'s, the pc scores $\gamma_{t,k}$'s are now uncorrelated with each other, and it is safe to consider
$$\gamma_{t,k} = b_{k,0} + \sum_{\ell=1}^{L} b_{k,\ell} \chi_{t,\ell} + e_t \qquad (2)$$
for each $k = 1, 2, \ldots, K$.
Principal regression by Benestad et al. (2015)
• Denote by $\hat{b}_{k,0}, \hat{b}_{k,1}, \ldots, \hat{b}_{k,L}$ the estimates of $b_{k,0}, b_{k,1}, \ldots, b_{k,L}$. The downscaling of $X_t$ can be accomplished by
$$\hat{Y}_t = \sum_{k=1}^{K} \hat{\gamma}_{t,k} \phi_k \quad \text{with} \quad \hat{\gamma}_{t,k} = \hat{b}_{k,0} + \sum_{\ell=1}^{L} \hat{b}_{k,\ell} \chi_{t,\ell}.$$
• Empirical studies in Benestad et al. (2015) suggest that PCR with $K = 4$ and $L = 4$ generally performs better and is more robust than station-wise MLR.
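The prediction step above can be sketched in a few lines of plain Python. This is only an illustration of the two formulas, not the authors' implementation: the function name `pcr_downscale` and all numbers are hypothetical, and any centering of the data before PCA is left out because the slide's equations contain no mean term.

```python
def pcr_downscale(chi_t, b0, b, phi):
    """Predict station values from model-output pc scores.

    chi_t: list of L pc scores of the model output at time t.
    b0:    list of K estimated intercepts (b-hat_{k,0}).
    b:     K x L nested list of estimated slopes (b-hat_{k,l}).
    phi:   list of K eigenvectors, each a list of m station loadings.
    Returns (gamma_hat, y_hat).
    """
    n_scores = len(chi_t)
    # gamma-hat_{t,k} = b-hat_{k,0} + sum_l b-hat_{k,l} * chi_{t,l}
    gamma_hat = [
        b0[k] + sum(b[k][l] * chi_t[l] for l in range(n_scores))
        for k in range(len(b0))
    ]
    # Y-hat_t = sum_k gamma-hat_{t,k} * phi_k, evaluated station by station
    n_stations = len(phi[0])
    y_hat = [
        sum(gamma_hat[k] * phi[k][i] for k in range(len(phi)))
        for i in range(n_stations)
    ]
    return gamma_hat, y_hat


# Toy example with K = L = 2 and m = 2 stations (made-up numbers).
phi = [[1.0, 0.0], [0.0, 1.0]]
gamma_hat, y_hat = pcr_downscale(
    chi_t=[1.0, 2.0], b0=[1.0, 0.0], b=[[2.0, 0.0], [0.0, 3.0]], phi=phi
)
```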
EOFVAR-X model
Hypothesis
• Are temporal relationships also useful for statistical downscaling?
EOFVAR-X model
• The pc scores $\gamma_t = [\gamma_{t,1}, \gamma_{t,2}, \ldots, \gamma_{t,K}]'$ and $\chi_t = [\chi_{t,1}, \chi_{t,2}, \ldots, \chi_{t,L}]'$ are temporally correlated due to the nature of $Y_t$ and $X_t$.
• Thus, they can be treated as vector time series.
• For simplicity, we adopt the simplest model for analyzing vector time series, the vector autoregressive model:
$$\gamma_{t,k} = \alpha_{k,0} + \sum_{\kappa=1}^{K} \alpha_{k,\kappa,1} \gamma_{t-1,\kappa} + \cdots + \sum_{\kappa=1}^{K} \alpha_{k,\kappa,p} \gamma_{t-p,\kappa} + \sum_{\ell=1}^{L} \rho_{k,\ell,0} \chi_{t,\ell} + \sum_{\ell=1}^{L} \rho_{k,\ell,1} \chi_{t-1,\ell} + \cdots + \sum_{\ell=1}^{L} \rho_{k,\ell,q} \chi_{t-q,\ell} + u_{t,k} \qquad (3)$$
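To make the structure of equation (3) concrete, the sketch below assembles the regressors for each time point: an intercept, the $p$ lagged vectors of $\gamma$, and the current plus $q$ lagged vectors of $\chi$. The helper name `varx_design` and the toy numbers are hypothetical; the coefficients would then be estimated by least squares on these rows, which is not shown here.

```python
def varx_design(gamma, chi, p, q):
    """Build regressors and targets for equation (3).

    gamma: list of T score vectors (each of length K) for the observations.
    chi:   list of T score vectors (each of length L) for the model outputs.
    Returns (rows, targets), where rows[i] holds the regressors for time t
    and targets[i] is gamma[t], for t = max(p, q), ..., T - 1.
    """
    if len(gamma) != len(chi):
        raise ValueError("gamma and chi must cover the same time points")
    start = max(p, q)  # earliest t for which all lags are available
    rows, targets = [], []
    for t in range(start, len(gamma)):
        row = [1.0]  # intercept term alpha_{k,0}
        for lag in range(1, p + 1):  # gamma_{t-1}, ..., gamma_{t-p}
            row.extend(gamma[t - lag])
        for lag in range(0, q + 1):  # chi_t, chi_{t-1}, ..., chi_{t-q}
            row.extend(chi[t - lag])
        rows.append(row)
        targets.append(list(gamma[t]))
    return rows, targets


# Toy example with K = 2, L = 1, p = 1, q = 1 (made-up numbers).
gamma = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
chi = [[10.0], [20.0], [30.0]]
rows, targets = varx_design(gamma, chi, p=1, q=1)
```

Each row has $1 + Kp + L(q + 1)$ entries. With `p=0` and `q=0` a row reduces to the intercept and $\chi_t$ only.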
EOFVAR-X model
• The lags $p$ and $q$ in equation (3) are tuning parameters and can be selected by model selection, e.g. cross-validation, AIC, BIC, etc.
• Equation (2) by Benestad et al. (2015) is a special case of equation (3) with $\alpha_{k,\kappa,1} = \alpha_{k,\kappa,2} = \cdots = \alpha_{k,\kappa,p} = 0$ and $\rho_{k,\ell,1} = \rho_{k,\ell,2} = \cdots = \rho_{k,\ell,q} = 0$ for all $k$, $\kappa$, and $\ell$.
• Thus our hypothesis can be verified by hypothesis testing or model selection.
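The special-case claim can be checked by substituting the zero constraints into equation (3). The identification of the remaining coefficients with those of equation (2) in the last line is our reading of the notation, not something stated on the slide:

```latex
\begin{align*}
\gamma_{t,k}
  &= \alpha_{k,0}
   + \sum_{j=1}^{p} \sum_{\kappa=1}^{K} \underbrace{\alpha_{k,\kappa,j}}_{=\,0} \gamma_{t-j,\kappa}
   + \sum_{\ell=1}^{L} \rho_{k,\ell,0}\, \chi_{t,\ell}
   + \sum_{j=1}^{q} \sum_{\ell=1}^{L} \underbrace{\rho_{k,\ell,j}}_{=\,0} \chi_{t-j,\ell}
   + u_{t,k} \\
  &= \alpha_{k,0} + \sum_{\ell=1}^{L} \rho_{k,\ell,0}\, \chi_{t,\ell} + u_{t,k},
\end{align*}
% which has the form of equation (2) with
% b_{k,0} = \alpha_{k,0}, \quad b_{k,\ell} = \rho_{k,\ell,0}, \quad
% and u_{t,k} playing the role of the error term e_t.
```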
Implementation details
1. Impute possible missing data in $Y_t$ by DINEOF (Beckers and Rixen, 2003, Data Interpolating Empirical Orthogonal Functions), provided by the R package sinkr.
2. Apply PCA to both $X_t$ and $Y_t$ as described in equation (1).
3. Estimate the coefficients in equation (3) (by the R package MTS).
Real experiments
Data sets
Daily mean temperatures of the following two countries are used:
1. Germany
  • 254 stations from ECAD, 1979-2016.
  • NCEP-DOE Reanalysis 2 (lon 5°-16°, lat 47°-56°)
  • Rolling cross-validation from 2001/1/1.
2. Taiwan
  • 7 stations from ASOS, 2000-2016.
  • NCEP-DOE Reanalysis 2 (lon 118°-123°, lat 21°-26°)
  • Rolling cross-validation from 2011/1/1.
As suggested by Benestad et al. (2015), we set $K = L = 4$ in both experiments.
Rolling cross-validation
Figure 1: Figure from https://robjhyndman.com/hyndsight/tscv/
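A rolling cross-validation scheme of this kind can be sketched as below. The slides do not state the forecast horizon or whether the training window expands or slides, so this sketch assumes an expanding window and non-overlapping test blocks; the function name `rolling_origin_splits` and its parameters are hypothetical.

```python
def rolling_origin_splits(n_obs, first_test, horizon=1):
    """Yield (train_indices, test_indices) pairs for rolling cross-validation.

    Assumptions (not taken from the slides): the training window expands to
    include every observation before the forecast origin, and the origin
    advances by `horizon` observations after each split.
    """
    if not 0 < first_test < n_obs:
        raise ValueError("first_test must lie strictly inside the series")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    origin = first_test
    while origin < n_obs:
        stop = min(origin + horizon, n_obs)
        # Train on everything before the origin, test on the next block.
        yield list(range(origin)), list(range(origin, stop))
        origin = stop


# Six observations, with the first test point at index 3.
splits = list(rolling_origin_splits(n_obs=6, first_test=3))
```

In the experiments, `first_test` would correspond to the index of the stated start date (2001/1/1 or 2011/1/1) within the daily series.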
Results
Table 1: Cross-validated RMSE and Pearson correlation coefficients (mean ± sd)
            Germany                       Taiwan
            RMSE          Correlation     RMSE          Correlation
PCR         1.39 ± 0.53   0.83 ± 0.11     1.44 ± 0.75   0.80 ± 0.37
EOFVAR-X    1.14 ± 0.49   0.88 ± 0.08     1.00 ± 0.57   0.83 ± 0.33
The source code and notebooks for the experiments are available at the project's GitHub page: https://github.com/chenlu-hung/eofvarx
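The two metrics reported in Table 1 could be computed along the following lines. This is a sketch only: the slides do not say whether the mean ± sd is taken across stations or across cross-validation folds, so the `summarize` helper simply assumes one value per station, and all function names are hypothetical.

```python
import math
from statistics import mean, stdev


def rmse(obs, pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(mean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def pearson(obs, pred):
    """Pearson correlation coefficient between two equally long sequences."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def summarize(values):
    """Return (mean, sample sd); assumes one value per station."""
    return mean(values), stdev(values)


# Toy example: two stations with made-up observations and predictions.
obs = {"a": [1.0, 2.0, 3.0], "b": [1.0, 2.0, 3.0]}
pred = {"a": [1.0, 2.0, 3.0], "b": [2.0, 3.0, 4.0]}
rmse_mean, rmse_sd = summarize([rmse(obs[s], pred[s]) for s in obs])
```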
Conclusions and discussions
Conclusions and discussions
1. We developed an EOFVAR-X model to incorporate both the spatial and temporal relationships in a meteorological variable.
2. Our experimental results on two different countries suggest that spatial and temporal relationships are both useful for statistical downscaling.
3. Some future directions:
  • More sophisticated time series models can be considered, e.g., ARIMA, GARCH, etc.
  • Simultaneous downscaling of multiple meteorological variables.
  • Nonlinear models (e.g. deep learning).
Thank you!
References
Beckers, J.-M. and Rixen, M. (2003). EOF calculations and data filling from incomplete oceanographic datasets. Journal of Atmospheric and Oceanic Technology.
Benestad, R. E., Chen, D., Mezghani, A., Fan, L., and Parding, K. (2015). On using principal components to represent stations in empirical–statistical downscaling. Tellus A.