
Stochastic model reduction: from nonlinear Galerkin to parametric inference
Fei Lu, Department of Mathematics, Johns Hopkins
Joint work with: Alexandre J. Chorin (UC Berkeley), Kevin K. Lin (U. of Arizona)
May 22, 2019, SIAM DS19, Snowbird

Consider dissipative PDEs in operator form:
$$v_t = Av + B(v) + f,$$
where $A$ is self-adjoint and $B$ is nonlinear.
Examples:
Burgers: $v_t = \nu v_{xx} - v v_x + f(x,t)$
Kuramoto-Sivashinsky: $v_t = -v_{xx} - \nu v_{xxxx} - v v_x$

To resolve the equation by Fourier-Galerkin (with periodic boundary conditions):
$$\frac{d}{dt}\hat v_k = -q_k^{\nu}\,\hat v_k + \frac{ik}{2}\sum_{|l|\le N,\ |k-l|\le N} \hat v_l\,\hat v_{k-l} + \hat f_k(t).$$
Need: $N \gtrsim 5/\nu$ Fourier modes, $dt \sim 1/N$.
E.g. $\nu = 10^{-4}$: spatial grid $= 5\times 10^{4}$, time steps $= 5T\times 10^{4}$.
We are mainly interested in large scales, $K \ll N$.
Question: a reduced model for $(\hat v_{1:K})$?
Reduce spatial dimension + increase time step-size.
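A minimal sketch of the right-hand side of the truncated Fourier-Galerkin system, using the sign convention of the equation on this slide. Two things are assumptions and not from the slides: the function name `galerkin_rhs`, and the choice $q_k^{\nu} = \nu k^2$ (Burgers on a $2\pi$-periodic domain). The quadratic sum is a discrete convolution, and convolving the truncated arrays enforces $|l|\le N$, $|k-l|\le N$ automatically.

```python
import numpy as np

def galerkin_rhs(v_hat, nu, f_hat=None):
    """Right-hand side of the truncated Fourier-Galerkin system.

    v_hat holds the modes k = -N..N (length 2N+1).
    Assumption: q_k^nu = nu * k**2 (Burgers, 2*pi-periodic domain).
    """
    N = (len(v_hat) - 1) // 2
    k = np.arange(-N, N + 1)
    # Full convolution covers modes -2N..2N; the slice keeps k = -N..N.
    conv = np.convolve(v_hat, v_hat)[N:3 * N + 1]
    rhs = -nu * k**2 * v_hat + 0.5j * k * conv
    if f_hat is not None:
        rhs = rhs + f_hat
    return rhs
```

The same function with $N$ replaced by $K$ gives the truncated system that the reduced models are compared against.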

Motivation: data assimilation in weather/climate prediction
High-dimensional full system: $x' = f(x) + U(x,y)$, $y' = g(x,y)$.
Discrete partial data: observe only $\{x(nh)\}_{n=1}^{N}$.
Prediction: forecast $x(t)$, $t \ge Nh$.
High-dimensional multiscale full chaotic/ergodic systems:
◮ can only afford to resolve $x' = f(x)$ online
◮ $y$: unresolved variables (subgrid scales)
Discrete noisy observations: missing initial condition
Ensemble prediction: need many simulations

$x' = f(x) + U(x,y)$, $y' = g(x,y)$. Data: $\{x(nh)\}_{n=1}^{N}$.
Objective: develop a closed reduced model of $x$ that
captures key statistical + dynamical properties;
use it for online state estimation and prediction.
[Approximate the stochastic process $(x(t), t > 0)$ in distribution.]

Various efforts in closure model reduction:
Direct constructions:
◮ nonlinear/Petrov-Galerkin: $y(t) = F(x(t))$
◮ Mori-Zwanzig formalism (memory) → statistical approximation by a non-Markov process
◮ relaxation approximations
◮ linear response / filtering / feedback control
◮ ...
Inference/data-driven ROM:
◮ hypoelliptic SDEs, GLEs and SDDEs
◮ discrete-time (time series) models
◮ data-driven: POD, DMD, Koopman operator
◮ nonparametric inference
◮ machine learning (NNs)
◮ ...

Inference-based model reduction
SDEs or time series – dynamical models

Differential system or discrete-time system?
Differential system: $X' = f(X) + Z(t,\omega)$ (informative)
Discrete-time system: $X_{n+1} = X_n + R_h(X_n) + Z_n$ (non-intrusive)
Inference [1]: likelihood
Discretization [2]: error correction by data
[1] Brockwell, Sørensen, Pokern, Wiberg, Samson, ...
[2] Milstein, Tretyakov, Talay, Mattingly, Stuart, Higham, ...

Discrete-time stochastic parametrization
NARMA($p$, $q$) [Chorin-Lu (15)]
$$X_n = X_{n-1} + R_h(X_{n-1}) + Z_n, \qquad Z_n = \Phi_n + \xi_n,$$
$$\Phi_n = \underbrace{\sum_{j=1}^{p} a_j X_{n-j} + \sum_{j=1}^{r}\sum_{i=1}^{s} b_{i,j} P_i(X_{n-j})}_{\text{Auto-Regression}} + \underbrace{\sum_{j=1}^{q} c_j \xi_{n-j}}_{\text{Moving Average}}$$
$R_h(X_{n-1})$ comes from a numerical scheme for $x' \approx f(x)$.
$\Phi_n$ depends on the past.
NARMAX in system identification: $Z_n = \Phi(Z, X) + \xi_n$.
Tasks:
Structure derivation: terms and orders $(p, r, s, q)$ in $\Phi_n$.
Parameter estimation: $a_j$, $b_{i,j}$, $c_j$, and $\sigma$, by conditional MLE.
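A minimal sketch of the parameter-estimation task in the special case $q = 0$ (no moving-average part). With Gaussian $\xi_n$ the conditional MLE then reduces to ordinary least squares on the scheme residual $Z_n = X_n - X_{n-1} - R_h(X_{n-1})$. The function name `fit_nar`, the simplification $r = p$, the cubic basis function, and all numerical values are assumptions chosen for illustration, not taken from the slides.

```python
import numpy as np

def fit_nar(X, R_h, p, basis):
    """Least-squares fit of
    Z_n = sum_j a_j X_{n-j} + sum_{i,j} b_{i,j} P_i(X_{n-j}) + xi_n
    for a scalar series X, using the same number of lags p in both sums.
    Returns the coefficients (ordered lag by lag) and the noise std."""
    X = np.asarray(X, dtype=float)
    # Residual of the numerical scheme; Z[m] corresponds to n = m + 1.
    Z = X[1:] - X[:-1] - R_h(X[:-1])
    n_idx = np.arange(p, len(X))
    cols = []
    for j in range(1, p + 1):
        lag = X[n_idx - j]
        cols.append(lag)                        # a_j term
        cols.extend(P(lag) for P in basis)      # b_{i,j} terms
    A = np.column_stack(cols)
    target = Z[n_idx - 1]
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    sigma = np.std(target - A @ coef)
    return coef, sigma

# Synthetic check (hypothetical values): data generated by the model itself.
rng = np.random.default_rng(0)
h, a_true, b_true, sigma_true = 0.1, 0.05, -0.02, 0.5
R_h = lambda x: -h * x                          # forward Euler step for x' = -x
X = np.zeros(20000)
for n in range(1, len(X)):
    x = X[n - 1]
    X[n] = x + R_h(x) + a_true * x + b_true * x**3 + sigma_true * rng.standard_normal()

coef, sigma = fit_nar(X, R_h, p=1, basis=[lambda x: x**3])
print(coef, sigma)
```

With $q > 0$ the noise terms $\xi_{n-j}$ are not observed, so the likelihood has to be evaluated recursively and the fit is no longer a single linear solve.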

Model reduction for dissipative PDEs by parametric inference

Kuramoto-Sivashinsky: $v_t = -v_{xx} - \nu v_{xxxx} - v v_x$
Burgers: $v_t = \nu v_{xx} - v v_x + f(x,t)$
Goal: a closed model for $(\hat v_{1:K})$, $K = 2K_0 \ll N$.
$$\frac{d}{dt}\hat v_k = -q_k^{\nu}\,\hat v_k + \frac{ik}{2}\sum_{|l|\le K,\ |k-l|\le K} \hat v_l\,\hat v_{k-l} + \hat f_k(t) + \frac{ik}{2}\sum_{|l|>K \text{ or } |k-l|>K} \hat v_l\,\hat v_{k-l}$$
View $(\hat v_{1:K}) \sim x$, $(\hat v_{k>K}) \sim y$:
$x' = f(x) + U(x,y)$, $y' = g(x,y)$.
TODO: represent the effects of the high modes on the low modes.


Derivation of a parametric form (KSE)
Let $v = u + w$. In operator form, $v_t = Av + B(v)$:
$$\frac{du}{dt} = PAu + PB(u) + [PB(u+w) - PB(u)]$$
$$\frac{dw}{dt} = QAw + QB(u+w)$$
Nonlinear Galerkin: approximate inertial manifold (IM) [1]
$$\frac{dw}{dt} \approx 0 \ \Rightarrow\ w \approx -A^{-1}QB(u+w) \ \Rightarrow\ w \approx \psi(u)$$
Need: spectral gap condition; $\dim(u) > K$: parametrization with time delay (Lu-Lin17)
A time series (NARMA) model of the form
$$u^n_k = R^{\delta}_k(u^{n-1}) + g^n_k + \Phi^n_k,$$
with $\Phi^n_k := \Phi^n_k(u^{n-p:n-1}, f^{n-p:n-1})$ in the form of
$$\Phi^n_k = \sum_{j=1}^{p}\Big( c^{v}_{k,j}\,u^{n-j}_k + c^{R}_{k,j}\,R^{\delta}_k(u^{n-j}) + c^{w}_{k,j} \sum_{\substack{|k-l|\le K,\ K<|l|\le 2K \\ \text{or } |l|\le K,\ K<|k-l|\le 2K}} \tilde u^{n-1}_l\,\tilde u^{n-j}_{k-l} \Big)$$
[1] Foias, Constantin, Temam, Sell, Jolly, Kevrekidis, Titi et al. (88-94)

Test setting: $\nu = 3.43$, $N = 128$, $dt = 0.001$
Reduced model: $K = 5$, $\delta = 100\,dt$
3 unstable modes, 2 stable modes
Long-term statistics:
[Figure: probability density function of Real $v_4$ (left) and auto-correlation function versus time (right), comparing Data, Truncated system, and NARMA]

Prediction
[Figure: a typical forecast of $v_4$ versus time $t$ for the truncated system and for NARMA (left); RMSE of many forecasts versus lead time for the truncated system and for NARMA (right)]
Forecast time:
the truncated system: $T \approx 5$
the NARMA system: $T \approx 50$ ($\approx 2$ Lyapunov time)


Derivation of a parametric form: stochastic Burgers
Let $v = u + w$. In operator form:
$$\frac{du}{dt} = PAu + PB(u) + Pf + [PB(u+w) - PB(u)]$$
$$\frac{dw}{dt} = QAw + QB(u+w)$$
Spectral gap: Burgers? (likely not)
$w(t)$ is not a function of $u(t)$, but a functional of its path.
Integration instead:
$$w(t) = e^{QAt}w(0) + \int_0^t e^{QA(t-s)}\,[QB(u(s)+w(s))]\,ds$$
$$w^n \approx c_0\,QB(u^n) + c_1\,QB(u^{n-1}) + \cdots + c_p\,QB(u^{n-p})$$
Linear-in-parameter approximation, with $B(v) = -(v^2)_x/2$:
$$PB(u+w) - PB(u) = -P[(uw)_x + (w^2)_x/2] \approx -P[(uw)_x] + \text{noise} \approx \sum_{j=0}^{p} c_j\,P[(u^n\,QB(u^{n-j}))_x] + \text{noise}$$
(signs and constant factors are absorbed into the coefficients $c_j$).
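A minimal scalar sketch of why the variation-of-constants integral leads to a finite linear combination of past forcing values. One high mode is replaced by $w' = -\lambda w + g(t)$, where $\lambda > 0$ stands in for the dissipation rate of $QA$ on that mode and $g$ stands in for $QB(u + w)$. The names `memory_coefficients` and `w_from_history`, the trapezoid rule, and the values of $\lambda$, $\delta$, $p$ are assumptions for illustration only.

```python
import numpy as np

def memory_coefficients(lam, delta, p):
    """Trapezoid-rule weights c_j for the integral of exp(-lam*s) g(t - s)
    over lags s = j*delta, j = 0..p."""
    c = delta * np.exp(-lam * delta * np.arange(p + 1))
    c[0] *= 0.5
    c[-1] *= 0.5
    return c

def w_from_history(g_hist, c):
    """w^n ~ sum_j c_j g^{n-j}, with g_hist = [g^n, g^{n-1}, ..., g^{n-p}]."""
    return float(np.dot(c, g_hist))

# Check against the exact solution for g(t) = sin(t), w(0) = 0.
lam, delta, p = 5.0, 0.02, 100
c = memory_coefficients(lam, delta, p)
t = 10.0
g_hist = np.sin(t - delta * np.arange(p + 1))
approx = w_from_history(g_hist, c)
exact = (lam * np.sin(t) - np.cos(t) + np.exp(-lam * t)) / (1 + lam**2)
print(approx, exact)
```

Here the weights come from quadrature; in the reduced model the $c_j$ are instead free parameters estimated from data, which lets them also absorb the discretization and truncation errors.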

A time series (NARMA) model of the form
$$u^n_k = R^{\delta}_k(u^{n-1}) + f^n_k + g^n_k + \Phi^n_k,$$
with $\Phi^n_k := \Phi^n_k(u^{n-p:n-1}, f^{n-p:n-1})$ in the form of
$$\Phi^n_k = \sum_{j=1}^{p}\Big( c^{v}_{k,j}\,u^{n-j}_k + c^{R}_{k,j}\,R^{\delta}_k(u^{n-j}) + c^{w}_{k,j} \sum_{\substack{|k-l|\le K,\ K<|l|\le 2K \\ \text{or } |l|\le K,\ K<|k-l|\le 2K}} \tilde u^{n-1}_l\,\tilde u^{n-j}_{k-l} \Big)$$

Numerical tests: $\nu = 0.05$, $K_0 = 4$ → random shocks
Full model: $N = 128$, $dt = 0.005$
Reduced model: $K = 8$, $\delta = 20\,dt$
[Figure: energy spectrum versus wavenumber (1 to 8), comparing True, Truncated, and NAR]

Cross-ACF of energy (4th moments!)
[Figure: $\mathrm{cov}(|u_2|^2, |u_k|^2)$ versus time lag for $k = 1, \dots, 8$, comparing True, Truncated, and NAR]
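A minimal sketch of how a cross-covariance of mode energies can be estimated from a trajectory, to make explicit that the statistic involves fourth moments of the modes. The function names, the biased sample estimator, and the lag convention $\mathrm{cov}(a_t, b_{t+\mathrm{lag}})$ are assumptions; the slides do not specify the estimator used.

```python
import numpy as np

def cross_cov(a, b, max_lag):
    """Sample cross-covariance cov(a_t, b_{t+lag}) for lag = 0..max_lag."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    n = len(a)
    return np.array([np.mean(a[:n - lag] * b[lag:]) for lag in range(max_lag + 1)])

def energy_cross_cov(u_ref, u_k, max_lag):
    """Cross-covariance of the energies |u_ref|^2 and |u_k|^2 of two
    complex Fourier-mode time series."""
    return cross_cov(np.abs(u_ref)**2, np.abs(u_k)**2, max_lag)
```

Calling `energy_cross_cov(u[:, 1], u[:, k - 1], max_lag)` on an array `u` of shape (time, modes) would give one panel of such a figure, with lags measured in units of the sampling step.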

Trajectory prediction in response to force
[Figure: panels "Abs of Mode k" for $k = 1, \dots, 8$ versus time (0 to 25), comparing True, Truncated, and NAR]

Summary and ongoing work
$x' = f(x) + U(x,y)$, $y' = g(x,y)$. Data: $\{x(nh)\}_{n=1}^{N}$
Inference-based stochastic model reduction:
non-intrusive time series (NARMA)
parametrize projections on path space
Inference: "$X' = f(X) + Z(t,\omega)$" → Discretization → Inference: "$X_{n+1} = X_n + R_h(X_n) + Z_n$"
→ Effective stochastic reduced model for prediction

Open problems:
model reduction: model selection, post-processing
theoretical understanding of the approximation
◮ distance between the two stochastic processes?
