Data assimilation with stochastic model reduction Fei Lu - PowerPoint PPT Presentation

Data assimilation with stochastic model reduction Fei Lu Department of Mathematics, Johns Hopkins Joint with: Alexandre J. Chorin (UC Berkeley) Kevin K. Lin (U. of Arizona) Xuemin Tu (U. of Kansas) Nov. 21, 2017 Banff International Research Station 1 / 24

Motivation: weather/climate prediction High-dimensional Discrete partial Prediction Full system data x ′ = f ( x ) + U ( x , y ) , Observe only Forecast y ′ = g ( x , y ) . { x ( nh ) } N n = 1 . x ( t ) , t ≥ Nh . Complex full systems: ECMWF: 16 km horizontal ◮ can only afford to resolve x ′ = f ( x ) grid → 10 9 freedoms ◮ y : unresolved variables (subgrid-scales) Discrete partial observations: missing i.c. Ensemble prediction: need many simulations The Lorenz 96 system Wilks 2005 2 / 24

Motivation: weather/climate prediction High-dimensional Discrete partial Prediction Full system data x ′ = f ( x ) + U ( x , y ) , Observe only Forecast y ′ = g ( x , y ) . { x ( nh ) } N n = 1 . x ( t ) , t ≥ Nh . Complex full systems: ECMWF: 16 km horizontal ◮ can only afford to resolve x ′ = f ( x ) grid → 10 9 freedoms ◮ y : unresolved variables (subgrid-scales) Discrete partial observations: missing i.c. Ensemble prediction: need many simulations → Develop a reduced model that quantifies the model error U ( x , y ) The Lorenz 96 system captures key statistical + dynamical properties Wilks 2005 3 / 24

Outline Stochastic model reduction (A first step: reduction from simulated data) ◮ Discrete-time stochastic parametrization (NARMA) ◮ Application to chaotic dynamical systems Data assimilation with the reduced model (An intermediate step: NARMA + noisy data → state estimation and prediction) 4 / 24

Stochastic model reduction x ′ = f ( x ) + U ( x , y ) , y ′ = g ( x , y ) . Data { x ( nh ) } N n = 1 Memory effects (Mori, Zwanzig, Chorin, Ghil, Majda, Wilks, . . . ) Takens Theorem: delay embedding Mori-Zwanzig formalism: “generalized Langevin equation” � t dx ˙ dt = f ( x ) + K ( x ( t − s ) , s ) ds + W t , �� 0 � �� noise Markov term memory Goal: a non-Markovian stochastic reduced system for x 5 / 24

Differential system or discrete-time system? X ′ = f ( X ) + Z ( t , ω ) X n + 1 = X n + R h ( X n ) + Z n informative, neat messy Inference 1 likelihood Discretization 2 error correction by data − − − − − − − − − 1 Talay, Mattingly, Stuart, Higham, Milstein,Tretyakov, . . . 2 Brockwell, Sørensen, Pokern, Wiberg, Samson,. . . 6 / 24

Discrete-time stochastic parametrization NARMA( p , q ) X n = X n − 1 + R h ( X n − 1 ) + Z n , Z n = Φ n + ξ n , p r s q � � � � Φ n = a j X n − j + b i , j P i ( X n − j ) + c j ξ n − j j = 1 j = 1 i = 1 j = 1 � �� Auto-Regression Moving Average R h ( X n − 1 ) from a numerical scheme for x ′ ≈ f ( x ) Φ n depends on the past Tasks: Structure derivation: terms and orders ( p , r , s , q ) in Φ n – physical laws, asymptotic behavior, discretization Parameter estimation: a j , b i , j , c j , and σ – conditional likelihood methods 7 / 24

Application to chaotic dynamics systems Example I: the Lorenz 96 system A chaotic dynamical system (a simplified atmospheric model) � dt x k = x k − 1 ( x k + 1 − x k − 2 ) − x k + 10 − 1 d y k , j , J j dt y k , j = 1 d ε [ y k , j + 1 ( y k , j − 1 − y k , j + 2 ) − y k , j + x k ] , where x ∈ R 18 , y ∈ R 360 . ǫ = 0 . 5 → no scale separation. Find a reduced system for x ∈ R 18 based on ➣ Data { x ( nh ) } N n = 1 ➣ d dt x k ≈ x k − 1 ( x k + 1 − x k − 2 ) − x k + 10. Wilks 2005 8 / 24

� NARMA: x n − 1 + R h ( x n − 1 ) + z n ; z n = Φ n + ξ n , x n = p d x p q � � � � b j , l ( x n − j ) l + Φ n c j R h ( x n − j ) + d j ξ n − j . = a + j = 1 l = 1 j = 1 j = 1 � 1 , h = 0 . 01 ; p = 2 , d x = 3 ; q = 0 , h = 0 . 05 . 9 / 24

� NARMA: x n − 1 + R h ( x n − 1 ) + z n ; z n = Φ n + ξ n , x n = p d x p q � � � � b j , l ( x n − j ) l + Φ n c j R h ( x n − j ) + d j ξ n − j . = a + j = 1 l = 1 j = 1 j = 1 � 1 , h = 0 . 01 ; p = 2 , d x = 3 ; q = 0 , h = 0 . 05 . � Polynomial autoregression (POLYAR) 3 d = x k − 1 ( x k + 1 − x k − 2 ) − x k + 10 + U , dt x k U = P ( x k ) + η k , with d η k ( t ) = φη k ( t ) + dB k ( t ) . where P ( x ) = � d x j = 0 a j x j . Optimal d x = 5. 10 / 24

Long-term statistics Empirical probability density function (PDF) L96 L96 0.12 POLYAR POLYAR 0.1 NARMA NARMA 0.1 0.08 0.08 PDF PDF 0.06 0.06 0.04 0.04 0.02 0.02 h=0.01 h=0.05 − 5 0 5 10 − 5 0 5 10 X 1 X 1 Empirical autocorrelation function (ACF) 1 L96 L96 h=0.05 h=0.01 POLYAR POLYAR 0.8 NARMA NARMA 0.6 0.5 0.4 ACF ACF 0.2 0 0 − 0.2 − 0.5 0 2 4 6 8 10 0 2 4 6 8 10 time time 11 / 24

Prediction ( h = 0 . 05 ) A typical ensemble forecast: 10 POLYAR 5 X 1 0 − 5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 time 10 NARMAX 5 X 1 0 − 5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 time forecast trajectories in cyan true trajectory in blue 12 / 24

Prediction ( h = 0 . 05 ) RMSE of many forecasts: A typical ensemble forecast: 20 18 10 POLYAR 16 POLYAR 5 X 1 14 0 12 RMSE − 5 10 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 NARMAX time 8 6 10 NARMAX 4 5 2 X 1 0 0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 lead time − 5 Forecast time: 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 time T ≈ 1 . 25 POLYAR: forecast trajectories in cyan NARMA: T ≈ 2 . 5 (Full model: T ≈ 2 . 5 true trajectory in blue “Best” forecast time achieved! ) 13 / 24

Example II: the Kuramoto-Sivashinsky equation v t + v xx + v xxxx + vv x = 0 , t > 0 , x ∈ [ 0 , 2 πν ] , periodic . Spatio-temporally chaotic solved with 128 Fourier modes 14 / 24

Example II: the Kuramoto-Sivashinsky equation v t + v xx + v xxxx + vv x = 0 , t > 0 , x ∈ [ 0 , 2 πν ] , periodic . Spatio-temporally chaotic Problem setting: ν = 3 . 43 Observing only 5 Fourier modes to predict their evolution Reduced models: the truncated system not accurate Discrete-time sto. paramtrization a : derive structure from inertial manifold → an effective NARMA model a Lu-Lin-Chorin17 solved with 128 Fourier modes 15 / 24

Key point 1: long-term statistics ↔ Large time behavior of PDE 4 Inertial manifolds M : - finite-dimensional, positively invariant manifolds - exponentially attracts all trajectories Let v = u + w . Rewrite the KSE: On M , w = ψ ( u ) du dt = Au + Pf ( u + w ) du dt = PAu + Pf ( u + ψ ( u )) . dw dt = Aw + Qf ( u + w ) Approximate inertial manifolds (AIMs): approximate w = ψ ( u ) dw dt ≈ 0 ⇒ w ≈ A − 1 Qf ( u + w ) , Fixed point: ψ 0 = 0; ψ n + 1 = A − 1 Qf ( u + ψ n ) . Key point 2: parametrize the AIM AIM with 5 modes: unstable (An accurate AIM requires m = dim ( u ) to be large! ) use the terms; estimate their coefficients from data → an effective NARMA model 16 / 24

NARMA with AIMs 5 The AIMs hint at how the high modes depend on the low modes: � � � A − 1 Qf ( u ) ˆ k ⇒ ˆ v l ˆ ˆ | k | > K : v k ≈ ψ 1 , k = v k ≈ c k v k − l . 1 ≤| l | , | k − l |≤ K � u n j , 1 ≤ j ≤ K ; u n � j = i � K l = j − K u n l u n K < j ≤ 2 K . j − l , A discrete-time stochastic system: ( p = 2 , q = 1) u n + 1 = u n k + hR h k ( u n ) + hz n k , k z n k = Φ n k + ξ n k , p K q � � � Φ n b k , j u n − j u n u n j + K − k + c k , ( K + 1 ) R h k ( u n ) + d k , j ξ n − j c k , j � j + K � k ( θ k ) = µ k + + . k k j = 0 j = 1 j = 1 17 / 24

Long-term statistics: Data Data Truncated system Truncated system 0.8 NARMA NARMA 0 10 0.6 ACF pdf 0.4 0.2 − 2 10 0 − 0.2 − 0.4 − 0.2 0 0.2 0.4 0.6 0 10 20 30 40 50 Real v 4 time probability density function auto-correlation function 18 / 24

Prediction A typical forecast: RMSE of many forecasts: 0.5 the truncated system 15 v 4 0 − 0.5 RMSE the truncated system 10 20 40 60 80 0.4 NARMA 0.2 5 NARMA 0 v 4 − 0.2 − 0.4 0 20 40 60 80 20 40 60 80 time t lead time Forecast time: the truncated system: T ≈ 5 the NARMA system: T ≈ 50 19 / 24

Part II: Data assimilation with the reduced model x ′ = f ( x ) + U ( x , y ) , y ′ = g ( x , y ) . Noisy data: x ( nh ) + W ( n ) , n = 1 , 2 , . . . Data assimilation: state estimation and prediction EnKF: ensemble from forward model + Kalman update Assume: we can simulate the full system offline → use the solution to quantify model error U ( x , y ) by tuning inflation and localization of EnKF deriving a NARMA reduced model 20 / 24

The Lorenz 96 system Estimate and predict x based on ➣ Noisy Data z ( n ) = x ( nh ) + W ( n ) ➣ Forward models L96x: d dt x k ≈ x k − 1 ( x k + 1 − x k − 2 ) − x k + 10 (account for the model error by IL in EnKF ) NARMA (account for the model error by parametrization in the forward model) Wilks 2005 21 / 24

Relative error of state estimation L96x + IL 0.03 NARMA Full model + IL 0.025 relative error 0.02 0.015 0.01 0.005 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 std of observation noise Relative error for different observation noises. (ensemble size: =1000 for L96x and NARMA; =10 for the full model) 22 / 24

Data assimilation with stochastic model reduction Fei Lu - PowerPoint PPT Presentation

Data assimilation with stochastic model reduction Fei Lu Department of Mathematics, Johns Hopkins Joint with: Alexandre J. Chorin (UC Berkeley) Kevin K. Lin (U. of Arizona) Xuemin Tu (U. of Kansas) Nov. 21, 2017 Banff International Research

Data Assimilation with Stochastic Model Reduction of Chaotic Systems Fei Lu Department of

Assimilation of AIRS Data at NRL Assimilation of AIRS Data at NRL Benjamin Ruston, Clay

The Joint Effort for Data assimilation Integration (JEDI) OOPS Structure, Data flow Joint Center

or Flow-Adaptive Background-Error-Covariance Modeling for Data Assimilation using Information

Soil moisture data assimilation: Soil moisture data assimilation: Error modeling, adaptive

Use of AIRS data in the Joint Center for Satellite Data Assimilation Lars Peter Riishojgaard

Functional Data Assimilation with White-Noise Data Error and Applications to Assimilation of

Introduction to Data Assimilation Ma elle Nodet team.inria.fr/moise/maelle Universit e de

Data assimilation and observing Data assimilation and observing strategies working group

Assimilation of 3D Radar Data and Derived Objects on the Convective Scale with an Ensemble-based

Assimilation of Advanced Assimilation of Advanced InfraRed nfraRed Sounder (AIRS) observations

On a Data Assimilation Method coupling Kalman Filtering, MCRE Concept and PGD Model Reduction for

The effects of data selection on The effects of data selection on the assimilation of AIRS data

Data Assimilation in a Two Neuron Network Anna Miller University of California San Diego

Using Model Reduction in Data Assimilation Met Office Website Nancy Nichols, Amos Lawless The

Stochastic Processes Will Perkins March 7, 2013 Stochastic Processes Q: What is a Stochastic

Paper Reviewed (1) Chris Stolte, Diane Tang, Pat Hanrahan Query, Analysis, and

Creating Adaptive Policies: From Concept to Practical Tools January 30 th 2018 OCCIAR Webinar

G Geo-Spatial Technologies: Sensor S ti l T h l i S Web Enablement Web Enablement Dr K R

Description of the bias introduced by the transition from Conventional Manual Measurements to

Creating Open Science Friendly Environment Joseph W Matiko Dar Es Salaam Institute of Technology,

AZ imuth EL evation CON trol for DUAL AXIS DRIVEN POSITIONING of photovoltaic panels and

Linux Kernel Power Management (PM) Framework for ARM 64-bit Processors L.Pieralisi 21/8/2014 -

Lecture 6. June 14 2019 Recap The last lecture gave an overview of the cancer world, describing