
Data Assimilation with Stochastic Model Reduction of Chaotic Systems - PowerPoint PPT Presentation



  1. Data Assimilation with Stochastic Model Reduction of Chaotic Systems
     Fei Lu, Department of Mathematics, Johns Hopkins University
     Joint work with: Alexandre J. Chorin (UC Berkeley), Kevin K. Lin (U. of Arizona), Xuemin Tu (U. of Kansas)
     Snowbird, SIAM-DS, May 23, 2019

  2. Model error from sub-grid scales
     ECMWF: 16 km horizontal grid → 10^9 degrees of freedom
     [Figures: ECMWF global model grid (from Wikipedia, ECMWF); the Lorenz 96 system (Wilks 2005)]

  3. Model error from sub-grid scales
     High-dimensional full system:  x′ = f(x) + U(x, y),   y′ = g(x, y).
     Discrete partial data:         observe only {x(nh)}_{n=1}^N.
     Prediction:                    forecast x(t), t ≥ Nh.
     ECMWF: 16 km horizontal grid → 10^9 degrees of freedom
     High-dimensional multiscale full systems:
     ◮ can only afford to resolve x′ = f(x)
     ◮ y: unresolved variables (subgrid scales)
     Discrete noisy observations: missing initial conditions
     Ensemble prediction: need many simulations
     [Figure: the Lorenz 96 system (Wilks 2005)]

  4. Model error from sub-grid scales (continued; same setting as slide 3)
     → How to account for the model error U(x, y)?

  5. Given a high-dimensional multiscale full system: x′ = f(x) + U(x, y),   y′ = g(x, y).
     Ensemble prediction: can afford to resolve x′ = f(x) online.
     Accounting for the model error U(x, y) from subgrid scales:
     Indirect approaches: correct the forecast ensemble
     ◮ e.g. inflation and localization [1], relaxation/bias correction [2]
     ◮ applied in the assimilation step; the deficiency in the forecast model remains
     Direct approach: improve the forecast model
     ◮ parametrization methods [3], non-Markovian parametrizations [4]
     ◮ random perturbations, averaging + homogenization [5]
     [1] Mitchell-Houtekamer (00), Hamill-Whitaker (05), Anderson (07)
     [2] Zhang et al. (04), Dee-Da Silva (98)
     [3] Palmer, Arnold et al. (01, 13), Wilks (05), Meng-Zhang (07), Danforth-Kalnay-Li et al. (08, 09), Berry-Harlim (14), Mitchell-Carrassi (15), Van Leeuwen et al. (18)
     [4] Chorin et al. (00-15), Majda-Timofeyev-Harlim et al. (03-13), Chekroun-Kondrashov-Ghil (11, 15), Crommelin-Vanden-Eijnden (08), Gottwald et al. (15)
     [5] Hamill-Whitaker (05), Houtekamer et al. (09), Pavliotis-Stuart (08), Gottwald et al. (12-13)

  6. Outline
     1. Stochastic model reduction (reduction from simulated data)
        ◮ Discrete-time stochastic parametrization (NARMA)
     2. Data assimilation with the reduced model (noisy data + reduced model → state estimation and prediction)

  7. Stochastic model reduction
     x′ = f(x) + U(x, y),   y′ = g(x, y).   Data: {x(nh)}_{n=1}^N
     Why stochastic reduced models?
     ◮ The system is "ergodic": (1/N) Σ_{n=1}^N F(x(nh)) → ∫ F(x) μ(dx) as N → ∞
     ◮ U(x, y) acts like a stochastic force

  8. Stochastic model reduction (continued)
     ◮ Memory effects (Mori, Zwanzig, Chorin, Kubo, Majda, Wilks, Ghil, ...)
     Mori-Zwanzig formalism → generalized Langevin equation:
        dx/dt = E[RHS | x]  (Markov term)  +  ∫_0^t K(x(s), t − s) ds  (memory)  +  W_t  (noise)
     Fluctuation-dissipation theory → hypoelliptic SDEs:
        dX = a(X, Y) dt + Y,   dY = b(X, Y) dt + c(X, Y) dW
     Parametrization: multi-layer stochastic models
     Goal: develop a non-Markovian stochastic reduced system for x

  9. Discrete-time stochastic parametrization: NARMA(p, q)
     X_n = X_{n−1} + R_h(X_{n−1}) + Z_n,   Z_n = Φ_n + ξ_n,
     Φ_n = Σ_{j=1}^p a_j X_{n−j}  +  Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n−j})  +  Σ_{j=1}^q c_j ξ_{n−j}
           [auto-regression]                                                     [moving average]
     R_h(X_{n−1}) comes from a numerical scheme for x′ ≈ f(x)
     Φ_n depends on the past
     Tasks:
     ◮ Structure derivation: terms and orders (p, r, s, q) in Φ_n
     ◮ Parameter estimation: a_j, b_{i,j}, c_j, and σ
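     As an illustration, a minimal sketch of how one step of the NARMA(p, q) recursion above could be simulated; the basis functions P_i, the coefficient arrays, and the noise level σ are placeholders, not values from the talk.

```python
import numpy as np

def narma_step(history_X, history_xi, a, b, c, R_h, P, sigma, rng):
    """One NARMA(p, q) update: X_n = X_{n-1} + R_h(X_{n-1}) + Phi_n + xi_n.

    history_X  : past states [X_{n-1}, X_{n-2}, ...] (length >= p and >= r)
    history_xi : past noise terms [xi_{n-1}, ...]    (length >= q)
    a, c       : auto-regressive / moving-average coefficients, shapes (p,), (q,)
    b          : polynomial coefficients, shape (r, s)
    R_h        : one-step map from a numerical scheme for x' ~ f(x)
    P          : list of s basis functions P_i
    """
    xi_n = sigma * rng.standard_normal()                     # new noise term
    phi = sum(a[j] * history_X[j] for j in range(len(a)))    # auto-regression
    phi += sum(b[j][i] * P[i](history_X[j])                  # polynomial terms
               for j in range(b.shape[0]) for i in range(b.shape[1]))
    phi += sum(c[j] * history_xi[j] for j in range(len(c)))  # moving average
    X_n = history_X[0] + R_h(history_X[0]) + phi + xi_n
    return X_n, xi_n
```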

  10. Overview
      x′ = f(x) + U(x, y),   y′ = g(x, y).   Data: {x(nh)}_{n=1}^N
      Discrete-time stochastic parametrization (NARMA):
      X_n = X_{n−1} + R_h(X_{n−1}) + Z_n,   Z_n = Φ_n + ξ_n,
      Φ_n = Σ_{j=1}^p a_j X_{n−j} + Σ_{j=1}^q c_j ξ_{n−j} + Σ_{j=1}^r Σ_{i=1}^s b_{i,j} P_i(X_{n−j}).
      1. compute R_h(x)
      2. derive the structure
      3. estimate the parameters
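      For step 3, a minimal illustrative sketch, assuming q = 0 (no moving-average terms) so that the coefficients can be fit by ordinary least squares on the one-step residuals Z_n = X_n − X_{n−1} − R_h(X_{n−1}); this is a simplification, not necessarily the estimator used in the talk.

```python
import numpy as np

def fit_nar_least_squares(X, R_h, p, basis):
    """Least-squares fit of the NARMA coefficients in the q = 0 (pure NAR) case.

    X     : observed states X_0, ..., X_N, shape (N+1,)
    R_h   : one-step map from a numerical scheme for x' ~ f(x)
    p     : number of lags (used for both the AR and the polynomial terms here)
    basis : list of basis functions P_i applied to lagged states
    Returns the coefficient vector and the residual standard deviation sigma.
    """
    rows, targets = [], []
    for n in range(p, len(X)):
        z_n = X[n] - X[n - 1] - R_h(X[n - 1])           # one-step residual Z_n
        lags = [X[n - j] for j in range(1, p + 1)]       # X_{n-1}, ..., X_{n-p}
        feats = lags + [P(x) for x in lags for P in basis]
        rows.append(feats)
        targets.append(z_n)
    A, z = np.asarray(rows), np.asarray(targets)
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)         # ordinary least squares
    sigma = np.std(z - A @ coef)                          # residual noise level
    return coef, sigma
```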

  11. Application to the chaotic Lorenz 96 system
      A chaotic dynamical system (a simplified atmospheric model):
      d/dt x_k = x_{k−1}(x_{k+1} − x_{k−2}) − x_k + 10 − (1/J) Σ_j y_{k,j},
      d/dt y_{k,j} = (1/ε) [y_{k,j+1}(y_{k,j−1} − y_{k,j+2}) − y_{k,j} + x_k],
      where x ∈ R^18, y ∈ R^360.
      Find a reduced system for x ∈ R^18 based on
      ➣ data {x(nh)}_{n=1}^N
      ➣ d/dt x_k ≈ x_{k−1}(x_{k+1} − x_{k−2}) − x_k + 10.
      [Figure: the Lorenz 96 system (Wilks 2005)]
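      A minimal sketch of the right-hand side of this two-scale system, treating the fast variables as one cyclic chain of K·J components (the usual convention for the two-scale Lorenz 96 model); K = 18, J = 360/18 = 20, and F = 10 follow the slide, while the value of ε is a placeholder assumption.

```python
import numpy as np

def l96_two_scale_rhs(x, y, K=18, J=20, F=10.0, eps=0.5):
    """Right-hand side of the two-scale Lorenz 96 system.

    x : slow variables, shape (K,)
    y : fast variables, shape (K*J,), stored as one cyclic chain so that
        y_{k,J+1} wraps around to y_{k+1,1}.
    """
    # slow variables: advection - damping + forcing - coupling to the mean of y
    dx = (np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2))
          - x + F - y.reshape(K, J).mean(axis=1))
    # fast variables: same structure with reversed advection, coupled to x_k
    x_rep = np.repeat(x, J)
    dy = (np.roll(y, -1) * (np.roll(y, 1) - np.roll(y, -2))
          - y + x_rep) / eps
    return dx, dy
```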

  12. NARMA: x_n = x_{n−1} + R_h(x_{n−1}) + z_n;   z_n = Φ_n + ξ_n,
      Φ_n = a + Σ_{j=1}^p Σ_{l=1}^{d_x} b_{j,l} (x_{n−j})^l + Σ_{j=1}^p c_j R_h(x_{n−j}) + Σ_{j=1}^q d_j ξ_{n−j}.
      p = 2, d_x = 3;   q = 1 for h = 0.01,   q = 0 for h = 0.05.

  13. (NARMA model as on slide 12.)
      Polynomial autoregression (POLYAR) [6]:
      d/dt x_k = x_{k−1}(x_{k+1} − x_{k−2}) − x_k + 10 + U,
      U = P(x_k) + η_k,   with dη_k(t) = φ η_k(t) dt + dB_k(t),
      where P(x) = Σ_{j=0}^{d_x} a_j x^j. Optimal d_x = 5.
      [6] Wilks 05: an MLR model in atmospheric science
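      A minimal sketch of one time step of the POLYAR reduced model, assuming Euler-Maruyama time stepping; the values of φ, the noise amplitude, and the polynomial coefficients are placeholders to be replaced by fitted values.

```python
import numpy as np

def polyar_step(x, eta, dt, a_poly, phi=-1.0, noise_std=1.0, rng=None):
    """One Euler-Maruyama step of the POLYAR reduced model.

    x      : slow state, shape (K,)
    eta    : colored-noise state eta_k, shape (K,)
    a_poly : polynomial coefficients a_0, ..., a_{d_x} of P(x), low degree first
    """
    rng = np.random.default_rng() if rng is None else rng
    P = np.polyval(a_poly[::-1], x)                  # P(x_k) = sum_j a_j x_k^j
    # truncated Lorenz 96 right-hand side plus the stochastic correction U
    rhs = (np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2))
           - x + 10.0 + P + eta)
    x_new = x + dt * rhs
    # Ornstein-Uhlenbeck update for the colored noise: d eta = phi*eta dt + dB
    eta_new = eta + dt * phi * eta + noise_std * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x_new, eta_new
```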

  14. Long-term statistics
      [Figures: empirical probability density functions (PDF) and empirical autocorrelation functions (ACF) of X_1 for the full L96 model, POLYAR, and NARMA, at h = 0.01 and h = 0.05.]

  15. Prediction (h = 0.05)
      [Figure: a typical ensemble forecast of X_1 for POLYAR and NARMAX; forecast trajectories in cyan, true trajectory in blue.]

  16. Prediction (h = 0.05)
      [Figures: a typical ensemble forecast of X_1 and the RMSE of many forecasts versus lead time, for POLYAR and NARMAX; forecast trajectories in cyan, true trajectory in blue.]
      Forecast time:  POLYAR: T ≈ 1.25;  NARMA: T ≈ 2.5  (full model: T ≈ 2.5).
      "Best" forecast time achieved!

  17. Outline
      1. Stochastic model reduction (reduction from simulated data)
         ◮ Discrete-time stochastic parametrization (NARMA)
      2. Data assimilation with the reduced model (noisy data + reduced model → state estimation and prediction)

  18. Data assimilation with the reduced model
      x′ = f(x) + U(x, y),   y′ = g(x, y).   Noisy data: x(nh) + W(n), n = 1, 2, ...
      Data assimilation:
      ◮ estimate the state of a forward model
      ◮ make predictions (by ensembles of solutions)
      The widely used method: ensemble Kalman filters (EnKF)
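      A minimal sketch of one analysis step of a perturbed-observation EnKF, assuming the observation operator is the identity with additive Gaussian noise, matching z(n) = x(nh) + W(n); it illustrates the method class, not the exact configuration (inflation, localization) used in the talk.

```python
import numpy as np

def enkf_analysis(X_f, z, obs_std, rng=None):
    """One perturbed-observation EnKF analysis step (identity observation operator).

    X_f     : forecast ensemble, shape (n_ens, dim); each row is one member
    z       : observation vector, shape (dim,), model: z = x + W
    obs_std : standard deviation of the observation noise W(n)
    Returns the analysis ensemble, shape (n_ens, dim).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_ens, dim = X_f.shape
    A = X_f - X_f.mean(axis=0)                         # ensemble anomalies
    P_f = A.T @ A / (n_ens - 1)                        # sample forecast covariance
    R = obs_std**2 * np.eye(dim)                       # observation error covariance
    K = P_f @ np.linalg.inv(P_f + R)                   # Kalman gain for H = identity
    # perturbed observations: each member assimilates its own noisy copy of z
    Z = z + obs_std * rng.standard_normal((n_ens, dim))
    X_a = X_f + (Z - X_f) @ K.T                        # analysis update per member
    return X_a
```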

  19. The Lorenz 96 system
      Estimate and predict x based on
      ➣ noisy data z(n) = x(nh) + W(n)
      ➣ forward models:
        ◮ L96x, the truncated model d/dt x_k ≈ x_{k−1}(x_{k+1} − x_{k−2}) − x_k + 10
          (account for the model error by inflation and localization (IL) in the EnKF)
        ◮ NARMA (account for the model error by parametrization in the forward model)
      [Figure: the Lorenz 96 system (Wilks 2005)]

  20. Relative error of state estimation
      [Figure: relative error of state estimation versus the standard deviation of the observation noise, for L96x + IL, NARMA, and the full model + IL. Ensemble size: 1000 for L96x and NARMA; 10 for the full model.]

  21. RMSE of state prediction
      [Figure: RMSE of 10^4 ensemble forecasts versus time, for L96x + IL, NARMA, and the full model + IL. Ensemble size: 1000 for L96x and NARMA; 10 for the full model.]
      Summary: the stochastic reduced model improves the performance of DA.

  22. Summary and ongoing work
      Accounting for the model error U(x, y) from subgrid scales in
      x′ = f(x) + U(x, y),   y′ = g(x, y),   with data {x(nh)}_{n=1}^N.
      Stochastic model reduction by discrete-time stochastic parametrization:
      full system → (inference) "X′ = f(X) + Z(t, ω)" → (discretization) "X_{n+1} = X_n + R_h(X_n) + Z_n"
      ◮ simplifies the inference from data
      ◮ incorporates memory flexibly
      The effective reduced model (NARMA):
      ◮ captures key statistical-dynamical features
      ◮ makes medium-range forecasts
      Improving the forecast model → improves the performance of DA for prediction
