Data assimilation with stochastic model reduction Fei Lu Department of Mathematics, Johns Hopkins Joint with: Alexandre J. Chorin (UC Berkeley) Kevin K. Lin (U. of Arizona) Xuemin Tu (U. of Kansas) Nov. 21, 2017 Banff International Research Station 1 / 24
Motivation: weather/climate prediction High-dimensional Discrete partial Prediction Full system data x ′ = f ( x ) + U ( x , y ) , Observe only Forecast y ′ = g ( x , y ) . { x ( nh ) } N n = 1 . x ( t ) , t ≥ Nh . Complex full systems: ECMWF: 16 km horizontal ◮ can only afford to resolve x ′ = f ( x ) grid → 10 9 freedoms ◮ y : unresolved variables (subgrid-scales) Discrete partial observations: missing i.c. Ensemble prediction: need many simulations The Lorenz 96 system Wilks 2005 2 / 24
Motivation: weather/climate prediction High-dimensional Discrete partial Prediction Full system data x ′ = f ( x ) + U ( x , y ) , Observe only Forecast y ′ = g ( x , y ) . { x ( nh ) } N n = 1 . x ( t ) , t ≥ Nh . Complex full systems: ECMWF: 16 km horizontal ◮ can only afford to resolve x ′ = f ( x ) grid → 10 9 freedoms ◮ y : unresolved variables (subgrid-scales) Discrete partial observations: missing i.c. Ensemble prediction: need many simulations → Develop a reduced model that quantifies the model error U ( x , y ) The Lorenz 96 system captures key statistical + dynamical properties Wilks 2005 3 / 24
Outline Stochastic model reduction (A first step: reduction from simulated data) ◮ Discrete-time stochastic parametrization (NARMA) ◮ Application to chaotic dynamical systems Data assimilation with the reduced model (An intermediate step: NARMA + noisy data → state estimation and prediction) 4 / 24
Stochastic model reduction x ′ = f ( x ) + U ( x , y ) , y ′ = g ( x , y ) . Data { x ( nh ) } N n = 1 Memory effects (Mori, Zwanzig, Chorin, Ghil, Majda, Wilks, . . . ) Takens Theorem: delay embedding Mori-Zwanzig formalism: “generalized Langevin equation” � t dx ˙ dt = f ( x ) + K ( x ( t − s ) , s ) ds + W t , ���� � �� � 0 � �� � noise Markov term memory Goal: a non-Markovian stochastic reduced system for x 5 / 24
Differential system or discrete-time system? X ′ = f ( X ) + Z ( t , ω ) X n + 1 = X n + R h ( X n ) + Z n informative, neat messy Inference 1 likelihood Discretization 2 error correction by data − − − − − − − − − 1 Talay, Mattingly, Stuart, Higham, Milstein,Tretyakov, . . . 2 Brockwell, Sørensen, Pokern, Wiberg, Samson,. . . 6 / 24
Discrete-time stochastic parametrization NARMA( p , q ) X n = X n − 1 + R h ( X n − 1 ) + Z n , Z n = Φ n + ξ n , p r s q � � � � Φ n = a j X n − j + b i , j P i ( X n − j ) + c j ξ n − j j = 1 j = 1 i = 1 j = 1 � �� � � �� � Auto-Regression Moving Average R h ( X n − 1 ) from a numerical scheme for x ′ ≈ f ( x ) Φ n depends on the past Tasks: Structure derivation: terms and orders ( p , r , s , q ) in Φ n – physical laws, asymptotic behavior, discretization Parameter estimation: a j , b i , j , c j , and σ – conditional likelihood methods 7 / 24
Application to chaotic dynamics systems Example I: the Lorenz 96 system A chaotic dynamical system (a simplified atmospheric model) � dt x k = x k − 1 ( x k + 1 − x k − 2 ) − x k + 10 − 1 d y k , j , J j dt y k , j = 1 d ε [ y k , j + 1 ( y k , j − 1 − y k , j + 2 ) − y k , j + x k ] , where x ∈ R 18 , y ∈ R 360 . ǫ = 0 . 5 → no scale separation. Find a reduced system for x ∈ R 18 based on ➣ Data { x ( nh ) } N n = 1 ➣ d dt x k ≈ x k − 1 ( x k + 1 − x k − 2 ) − x k + 10. Wilks 2005 8 / 24
� NARMA: x n − 1 + R h ( x n − 1 ) + z n ; z n = Φ n + ξ n , x n = p d x p q � � � � b j , l ( x n − j ) l + Φ n c j R h ( x n − j ) + d j ξ n − j . = a + j = 1 l = 1 j = 1 j = 1 � 1 , h = 0 . 01 ; p = 2 , d x = 3 ; q = 0 , h = 0 . 05 . 9 / 24
� NARMA: x n − 1 + R h ( x n − 1 ) + z n ; z n = Φ n + ξ n , x n = p d x p q � � � � b j , l ( x n − j ) l + Φ n c j R h ( x n − j ) + d j ξ n − j . = a + j = 1 l = 1 j = 1 j = 1 � 1 , h = 0 . 01 ; p = 2 , d x = 3 ; q = 0 , h = 0 . 05 . � Polynomial autoregression (POLYAR) 3 d = x k − 1 ( x k + 1 − x k − 2 ) − x k + 10 + U , dt x k U = P ( x k ) + η k , with d η k ( t ) = φη k ( t ) + dB k ( t ) . where P ( x ) = � d x j = 0 a j x j . Optimal d x = 5. 10 / 24
Long-term statistics Empirical probability density function (PDF) L96 L96 0.12 POLYAR POLYAR 0.1 NARMA NARMA 0.1 0.08 0.08 PDF PDF 0.06 0.06 0.04 0.04 0.02 0.02 h=0.01 h=0.05 − 5 0 5 10 − 5 0 5 10 X 1 X 1 Empirical autocorrelation function (ACF) 1 L96 L96 h=0.05 h=0.01 POLYAR POLYAR 0.8 NARMA NARMA 0.6 0.5 0.4 ACF ACF 0.2 0 0 − 0.2 − 0.5 0 2 4 6 8 10 0 2 4 6 8 10 time time 11 / 24
Prediction ( h = 0 . 05 ) A typical ensemble forecast: 10 POLYAR 5 X 1 0 − 5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 time 10 NARMAX 5 X 1 0 − 5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 time forecast trajectories in cyan true trajectory in blue 12 / 24
Prediction ( h = 0 . 05 ) RMSE of many forecasts: A typical ensemble forecast: 20 18 10 POLYAR 16 POLYAR 5 X 1 14 0 12 RMSE − 5 10 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 NARMAX time 8 6 10 NARMAX 4 5 2 X 1 0 0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 lead time − 5 Forecast time: 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 time T ≈ 1 . 25 POLYAR: forecast trajectories in cyan NARMA: T ≈ 2 . 5 (Full model: T ≈ 2 . 5 true trajectory in blue “Best” forecast time achieved! ) 13 / 24
Example II: the Kuramoto-Sivashinsky equation v t + v xx + v xxxx + vv x = 0 , t > 0 , x ∈ [ 0 , 2 πν ] , periodic . Spatio-temporally chaotic solved with 128 Fourier modes 14 / 24
Example II: the Kuramoto-Sivashinsky equation v t + v xx + v xxxx + vv x = 0 , t > 0 , x ∈ [ 0 , 2 πν ] , periodic . Spatio-temporally chaotic Problem setting: ν = 3 . 43 Observing only 5 Fourier modes to predict their evolution Reduced models: the truncated system not accurate Discrete-time sto. paramtrization a : derive structure from inertial manifold → an effective NARMA model a Lu-Lin-Chorin17 solved with 128 Fourier modes 15 / 24
Key point 1: long-term statistics ↔ Large time behavior of PDE 4 Inertial manifolds M : - finite-dimensional, positively invariant manifolds - exponentially attracts all trajectories Let v = u + w . Rewrite the KSE: On M , w = ψ ( u ) du dt = Au + Pf ( u + w ) du dt = PAu + Pf ( u + ψ ( u )) . dw dt = Aw + Qf ( u + w ) Approximate inertial manifolds (AIMs): approximate w = ψ ( u ) dw dt ≈ 0 ⇒ w ≈ A − 1 Qf ( u + w ) , Fixed point: ψ 0 = 0; ψ n + 1 = A − 1 Qf ( u + ψ n ) . Key point 2: parametrize the AIM AIM with 5 modes: unstable (An accurate AIM requires m = dim ( u ) to be large! ) use the terms; estimate their coefficients from data → an effective NARMA model 16 / 24
NARMA with AIMs 5 The AIMs hint at how the high modes depend on the low modes: � � � A − 1 Qf ( u ) ˆ k ⇒ ˆ v l ˆ ˆ | k | > K : v k ≈ ψ 1 , k = v k ≈ c k v k − l . 1 ≤| l | , | k − l |≤ K � u n j , 1 ≤ j ≤ K ; u n � j = i � K l = j − K u n l u n K < j ≤ 2 K . j − l , A discrete-time stochastic system: ( p = 2 , q = 1) u n + 1 = u n k + hR h k ( u n ) + hz n k , k z n k = Φ n k + ξ n k , p K q � � � Φ n b k , j u n − j u n u n j + K − k + c k , ( K + 1 ) R h k ( u n ) + d k , j ξ n − j c k , j � j + K � k ( θ k ) = µ k + + . k k j = 0 j = 1 j = 1 17 / 24
Long-term statistics: Data Data Truncated system Truncated system 0.8 NARMA NARMA 0 10 0.6 ACF pdf 0.4 0.2 − 2 10 0 − 0.2 − 0.4 − 0.2 0 0.2 0.4 0.6 0 10 20 30 40 50 Real v 4 time probability density function auto-correlation function 18 / 24
Prediction A typical forecast: RMSE of many forecasts: 0.5 the truncated system 15 v 4 0 − 0.5 RMSE the truncated system 10 20 40 60 80 0.4 NARMA 0.2 5 NARMA 0 v 4 − 0.2 − 0.4 0 20 40 60 80 20 40 60 80 time t lead time Forecast time: the truncated system: T ≈ 5 the NARMA system: T ≈ 50 19 / 24
Part II: Data assimilation with the reduced model x ′ = f ( x ) + U ( x , y ) , y ′ = g ( x , y ) . Noisy data: x ( nh ) + W ( n ) , n = 1 , 2 , . . . Data assimilation: state estimation and prediction EnKF: ensemble from forward model + Kalman update Assume: we can simulate the full system offline → use the solution to quantify model error U ( x , y ) by tuning inflation and localization of EnKF deriving a NARMA reduced model 20 / 24
The Lorenz 96 system Estimate and predict x based on ➣ Noisy Data z ( n ) = x ( nh ) + W ( n ) ➣ Forward models L96x: d dt x k ≈ x k − 1 ( x k + 1 − x k − 2 ) − x k + 10 (account for the model error by IL in EnKF ) NARMA (account for the model error by parametrization in the forward model) Wilks 2005 21 / 24
Relative error of state estimation L96x + IL 0.03 NARMA Full model + IL 0.025 relative error 0.02 0.015 0.01 0.005 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 std of observation noise Relative error for different observation noises. (ensemble size: =1000 for L96x and NARMA; =10 for the full model) 22 / 24
Recommend
More recommend