Data-driven model reduction for stochastic Burgers equations
Fei Lu, Department of Mathematics, Johns Hopkins University
Joint work with: Alexandre J. Chorin (UC Berkeley) and Kevin K. Lin (U. of Arizona)
2nd Symposium on Machine Learning and Dynamical Systems, September 2020
Consider a stochastic Burgers equation
$$v_t = \nu v_{xx} - v v_x + f(x,t), \qquad x \in [0, 2\pi], \ \text{periodic BC}$$

N-mode Fourier-Galerkin, for $k = 1, \dots, N$:
$$\frac{d\hat v_k}{dt} = -\nu k^2\, \hat v_k - \frac{ik}{2} \sum_{|l|\le N,\ |k-l|\le N} \hat v_l\, \hat v_{k-l} + \hat f_k(t)$$

Need $N \gg 1/\nu$ and $dt \sim 1/N$ (CFL) → costly: $\nu = 10^{-4}$ → $N \sim 10^4$, time steps $= 10^4\, T$. To simulate $10^4$ time units, we need $10^8$ time steps!

Interested in: efficient simulation of $(\hat v_{1:K})$, $K \ll N$.
Question: a reduced closure model for $(\hat v_{1:K})$?
Space-time reduction: reduce the spatial dimension + increase the time-step size.
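To make the cost estimate concrete, here is a minimal pseudo-spectral sketch (not the speaker's code) of one explicit Euler step of the N-mode Galerkin system; the resolution, viscosity, forcing, and time step are illustrative assumptions, and dealiasing and a careful treatment of the stochastic forcing are omitted.

```python
# Minimal sketch of one explicit Euler step for the N-mode Fourier-Galerkin
# discretization of v_t = nu*v_xx - v*v_x + f (illustrative parameters only).
import numpy as np

N = 128                                   # number of grid points / modes (assumed)
nu = 0.05                                 # viscosity (assumed)
x = 2 * np.pi * np.arange(N) / N
k = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers
dt = 1.0 / N                              # CFL-type step size, dt ~ 1/N

def galerkin_step(v_hat, f_hat):
    """Advance the Fourier coefficients v_hat by one explicit Euler step."""
    v = np.fft.ifft(v_hat).real
    nonlin_hat = 0.5j * k * np.fft.fft(v * v)          # Fourier coefficients of (v^2/2)_x = v*v_x
    rhs = -nu * k**2 * v_hat - nonlin_hat + f_hat      # dissipation - advection + forcing
    return v_hat + dt * rhs

# usage: smooth initial field, crude stand-in for the random forcing f(x, t)
v_hat = np.fft.fft(np.sin(x))
f_hat = np.fft.fft(0.1 * np.random.randn(N))
v_hat = galerkin_step(v_hat, f_hat)
```

With $\nu = 10^{-4}$ the same loop would need $N \sim 10^4$ modes and roughly $10^8$ such steps for $10^4$ time units, which is the cost the reduced model is meant to avoid.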
Motivation: data assimilation with ensemble prediction

$x' = F(x) + U(x,y)$   (resolved scales $\hat v_{1:K}$)
$y' = G(x,y)$   (subgrid scales $\hat v_{K+1:N}$)

Data assimilation: partial noisy observation → prediction; missing initial conditions → ensemble prediction; we can only afford to resolve $x' = F(x)$.

Objective: develop a closure reduced model of $x$ that
- captures key statistical + dynamical properties
- can be used for ensemble simulations
Closure modeling, model error UQ, subgrid parametrization

Direct constructions:
- PCA/POD, DMD, Koopman [Holmes, Lumley, Marsden, Mezic, Wilcox, Kutz, Rowley ...]
- nonlinear Galerkin [Foias, Jolly, Kevrekidis, Titi ...]
- ROM closure [Farhat, Carlberg, Iliescu, Wang ...]
- moment closure [Levermore, Morokoff ...]
- Equation-free [Kevrekidis ...]

Inference / data-driven ROM:
- stochastic models: SDEs/GLEs, Mori-Zwanzig formalism; memory → non-Markov process [Chorin, Hald, Kupferman, Stinis, Li, Darve, E, Karniadakis, Venturi, Duraisamy ...]
- time series models [Chorin/Majda/Ghil groups]
- manifold / machine learning [...]
Inference-based model reduction
$x' = F(x) + U(x,y)$, $y' = G(x,y)$.   Data: $\{x(nh)\}_{n=1}^{N}$

KEY: approximate the distribution of the stochastic process.

Approximate the discrete-time forward map $x_n = F_n(x_{1:n-1})$:
- curse of dimensionality
- parametric inference: use the structure of the map
Discrete-time stochastic parametrization: NARMA(p, q) [Chorin-Lu15]
$$X_n = X_{n-1} + R^h(X_{n-1}) + Z_n, \qquad Z_n = \Phi_n + \xi_n,$$
$$\Phi_n = \underbrace{\sum_{j=1}^{p} a_j X_{n-j} + \sum_{j=1}^{r}\sum_{i=1}^{s} b_{i,j}\, P_i(X_{n-j})}_{\text{Auto-Regression}} + \underbrace{\sum_{j=1}^{q} c_j\, \xi_{n-j}}_{\text{Moving Average}}$$

- $R^h(X_{n-1})$ comes from a numerical scheme for $x' \approx F(x)$
- $\Phi_n$ depends on the past → NARMAX in system identification: $Z_n = \Phi(Z, X) + \xi_n$

Tasks:
- Structure derivation: terms and orders $(p, r, s, q)$ in $\Phi_n$;
- Parameter estimation: $a_j$, $b_{i,j}$, $c_j$, and $\sigma$, by conditional MLE.
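A minimal sketch of how one step of such a NARMA recursion could be simulated once the coefficients are known; the basis functions $P_i$ are taken to be monomials purely for illustration, and all names and signatures below are assumptions rather than the implementation used in the cited papers.

```python
# One step of X_n = X_{n-1} + R_h(X_{n-1}) + Z_n with Z_n = Phi_n + xi_n
# (minimal sketch; coefficients a, b, c, sigma are assumed already estimated).
import numpy as np

def narma_step(x_hist, xi_hist, R_h, a, b, c, sigma, rng):
    """x_hist = [X_{n-1}, X_{n-2}, ...], xi_hist = [xi_{n-1}, xi_{n-2}, ...]."""
    phi = sum(a[j] * x_hist[j] for j in range(len(a)))                 # auto-regression
    phi += sum(b[i][j] * x_hist[j] ** (i + 2)                          # illustrative P_i: monomials
               for i in range(len(b)) for j in range(len(b[i])))
    phi += sum(c[j] * xi_hist[j] for j in range(len(c)))               # moving average
    xi = sigma * rng.standard_normal()                                 # new noise increment
    x_new = x_hist[0] + R_h(x_hist[0]) + phi + xi
    return x_new, xi
```

In practice the structure (which terms enter $\Phi_n$ and the orders $p, r, s, q$) is chosen first, and only then are the coefficients fit by conditional likelihood, as listed under Tasks above.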
Example: the two-layer Lorenz 96 model
- NARMA reproduces statistics: ACF, PDF [Chorin-Lu15 PNAS]
- NARMA improves data assimilation [Lu-Tu-Chorin17 MWR]
Model reduction for dissipative PDEs: nonlinear Galerkin → parametric inference
Kuramoto-Sivashinsky: $v_t = -v_{xx} - \nu v_{xxxx} - v v_x$
Burgers: $v_t = \nu v_{xx} - v v_x + f(x,t)$

Goal: a closed model for $(\hat v_{1:K})$, $K \ll N$.
$$\frac{d\hat v_k}{dt} = -q^{\nu}_k\, \hat v_k - \frac{ik}{2} \sum_{|l|\le K,\ |k-l|\le K} \hat v_l\, \hat v_{k-l} + \hat f_k(t) \;-\; \frac{ik}{2} \sum_{|l|>K \ \text{or}\ |k-l|>K} \hat v_l\, \hat v_{k-l}$$
where $q^{\nu}_k$ is the symbol of the linear operator ($\nu k^2$ for Burgers, $\nu k^4 - k^2$ for KSE).

View $(\hat v_{1:K}) \sim x$, $(\hat v_{k>K}) \sim y$: $x' = F(x) + U(x,y)$, $y' = G(x,y)$.
TODO: represent the effects of the high modes on the low modes.
Derivation of a parametric form (KSE): $v_t = -v_{xx} - \nu v_{xxxx} - v v_x$

Let $v = u + w$. In operator form $v_t = Av + B(v)$:
$$\frac{du}{dt} = PAu + PB(u) + [PB(u+w) - PB(u)], \qquad \frac{dw}{dt} = QAw + QB(u+w)$$

Nonlinear Galerkin: approximate inertial manifold (IM)¹
$$\frac{dw}{dt} \approx 0 \;\Rightarrow\; w \approx -A^{-1} QB(u+w) \;\Rightarrow\; w \approx \psi(u)$$
Need: spectral gap condition; $\dim(u) > K$: parametrization with time delay (Lu-Lin-Chorin17).

A time series (NARMA) model of the form
$$u^n_k = R^{\delta}_k(u^{n-1}) + g^n_k + \Phi^n_k,$$
with $\Phi^n_k := \Phi^n_k(u^{n-p:n-1}, g^{n-p:n-1})$ in the form
$$\Phi^n_k = \sum_{j=1}^{p} \Big( c^v_{k,j}\, u^{n-j}_k + c^R_{k,j}\, R^{\delta}_k(u^{n-j}) + c^w_{k,j} \sum_{\substack{|k-l|\le K,\ K<|l|\le 2K \\ \text{or}\ |l|\le K,\ K<|k-l|\le 2K}} u^{n-1}_l\, u^{n-j}_{k-l} \Big)$$

KEY: high modes = functions of the low modes.

¹ Foias, Constantin, Temam, Sell, Jolly, Kevrekidis, Titi et al. (1988-94)
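For orientation, a minimal sketch of the fixed-point reading of the approximate inertial manifold $w \approx \psi(u)$; the function names, the diagonal representation of $QA$, and the plain fixed-point iteration are illustrative assumptions (convergence is exactly what the spectral gap condition is needed for).

```python
# Sketch: approximate the unresolved modes w as a function psi(u) of the
# resolved modes by iterating  w <- -(QA)^{-1} QB(u + w).
import numpy as np

def aim_parametrization(u, A_q, QB, n_iter=10):
    """A_q: diagonal of the (invertible) linear operator on the unresolved modes.
    QB(u, w): unresolved-mode part of the quadratic term B(u + w)."""
    w = np.zeros_like(A_q, dtype=complex)
    for _ in range(n_iter):
        w = -QB(u, w) / A_q          # one fixed-point update of  w = psi(u)
    return w
```

When the gap condition fails, or when only $K$ modes are kept, this static map is replaced by the time-delayed parametrization $\Phi^n_k$ above.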
Test setting (KSE): $\nu = 3.43$, $N = 128$, $dt = 0.001$. Reduced model: $K = 5$, $\delta = 100\, dt$ (3 unstable modes, 2 stable modes).

Long-term statistics:
[Figure: probability density function and auto-correlation function of $\mathrm{Re}\,\hat v_4$, comparing data, the truncated system, and NARMA]
Prediction

[Figure: a typical forecast of $\hat v_4$ and the RMSE of many forecasts vs. lead time, for the truncated system and NARMA]

Forecast time: the truncated system $T \approx 5$; the NARMA system $T \approx 50$ ($\approx 2$ Lyapunov times).
Derivation of parametric form: stochastic Burgers $v_t = \nu v_{xx} - v v_x + f(x,t)$

Let $v = u + w$. In operator form:
$$\frac{du}{dt} = PAu + PB(u) + Pf + [PB(u+w) - PB(u)], \qquad \frac{dw}{dt} = QAw + QB(u+w) + Qf$$

Spectral gap: Burgers? (likely not)
$w(t)$ is not a function of $u(t)$, but a functional of its path.

Integration instead (Duhamel):
$$w(t) = e^{QAt} w(0) + \int_0^t e^{QA(t-s)}\, QB\big(u(s)+w(s)\big)\, ds$$
$$w^n \approx c_0\, QB(u^n) + c_1\, QB(u^{n-1}) + \cdots + c_p\, QB(u^{n-p})$$

Linear-in-parameter approximation:
$$PB(u+w) - PB(u) = P[(2uw + w^2)_x]/2 \approx P[(uw)_x] + \text{noise} \approx \sum_{j=0}^{p} c_j\, P\big[(u^n\, QB(u^{n-j}))_x\big] + \text{noise}$$

KEY: high modes = functionals of the paths of the low modes.
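A minimal sketch (assumed data layout, not the authors' code) of the two ingredients this derivation produces: the high-mode part $QB(u)$ of the quadratic term, supported on $K < |m| \le 2K$, and the delayed closure feature $P[(u^n\, QB(u^{n-j}))_x]_k$ for a resolved wavenumber $k$. The resolved modes are stored as a dict $\{k : \hat u_k\}$ for $|k| \le K$.

```python
# Delayed closure features for the Burgers reduced model (illustrative sketch).
import numpy as np

def QB(u, K):
    """High-mode part (K < |m| <= 2K) of B(u) = -(u^2/2)_x in Fourier space."""
    w = {}
    for m in list(range(-2 * K, -K)) + list(range(K + 1, 2 * K + 1)):
        conv = sum(u[l] * u[m - l] for l in range(-K, K + 1)
                   if l != 0 and m - l != 0 and abs(m - l) <= K)
        w[m] = -0.5j * m * conv
    return w

def closure_feature(u_now, u_lag, K):
    """P[(u_now * QB(u_lag))_x]_k for the resolved modes k = 1, ..., K."""
    w = QB(u_lag, K)
    feat = np.zeros(K, dtype=complex)
    for k in range(1, K + 1):
        conv = sum(u_now[l] * w[k - l] for l in range(-K, K + 1)
                   if l != 0 and K < abs(k - l) <= 2 * K)
        feat[k - 1] = 1j * k * conv
    return feat
```

Because the unknown coefficients $c_j$ multiply these features linearly, they can be estimated from data by linear regression, as sketched after the next slide.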
A time series (NARMA) model of the form
$$u^n_k = R^{\delta}_k(u^{n-1}) + f^n_k + g^n_k + \Phi^n_k,$$
with $\Phi^n_k := \Phi^n_k(u^{n-p:n-1}, f^{n-p:n-1})$ in the form
$$\Phi^n_k = \sum_{j=1}^{p} \Big( c^v_{k,j}\, u^{n-j}_k + c^R_{k,j}\, R^{\delta}_k(u^{n-j}) + c^w_{k,j} \sum_{\substack{|k-l|\le K,\ K<|l|\le 2K \\ \text{or}\ |l|\le K,\ K<|k-l|\le 2K}} u^{n-1}_l\, u^{n-j}_{k-l} \Big)$$
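Since $\Phi^n_k$ is linear in the coefficients $(c^v_{k,j}, c^R_{k,j}, c^w_{k,j})$, estimation reduces to a linear least-squares problem per wavenumber (for this purely autoregressive form with Gaussian noise, conditional MLE and least squares coincide). A rough sketch under assumed interfaces: `traj` holds resolved-mode snapshots from full-model data, `R_delta` is one step of the truncated scheme, and `conv_feature(u_now, u_lag, k)` returns the $k$-th convolution feature from the previous sketch; the deterministic forcing term $f^n_k$ is omitted for brevity.

```python
# Least-squares estimation of the NAR coefficients for one wavenumber k
# (illustrative sketch; forcing term omitted).
import numpy as np

def fit_nar_coeffs(traj, k, p, R_delta, conv_feature):
    rows, targets = [], []
    for n in range(p, len(traj)):
        feats = []
        for j in range(1, p + 1):
            u_lag = traj[n - j]
            feats += [u_lag[k],                                  # c^v_{k,j} term
                      R_delta(u_lag)[k],                         # c^R_{k,j} term
                      conv_feature(traj[n - 1], u_lag, k)]       # c^w_{k,j} term
        rows.append(feats)
        targets.append(traj[n][k] - R_delta(traj[n - 1])[k])     # residual the closure should explain
    A, y = np.array(rows), np.array(targets)
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs    # ordered as (c^v_{k,1}, c^R_{k,1}, c^w_{k,1}, c^v_{k,2}, ...)
```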
Numerical tests: $\nu = 0.05$, $K_0 = 4$ → random shocks.
Full model: $N = 128$, $dt = 0.005$. Reduced model: $K = 8$, $\delta = 20\, dt$.
[Figure: energy spectrum vs. wavenumber ($k = 1, \dots, 8$) for the true, truncated, and NAR models]
[Figure: temporal cross-correlations of the mode energies, $\mathrm{cov}(|u_2|^2, |u_k|^2)$ vs. time lag for $k = 1, \dots, 8$, comparing the true, truncated, and NAR models]
Cross-ACF of energy (4th moments!)
[Figure: trajectories of modes $k = 1, \dots, 8$ over 25 time units, comparing the true, truncated, and NAR models]
Trajectory prediction in response to force
Space-time reduction:
- how small can $K$ (spatial dimension) be?
- how large can $\delta$ (time-step size) be?

CFL number: $|u|\,\dfrac{dt}{dx} \sim |u|\, N\, dt \sim |u|\, K\, \delta$
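A quick back-of-the-envelope check of this comparison, using the numbers from the $\nu = 0.05$ test ($N = 128$, $dt = 0.005$, $K = 8$, $\delta = 20\, dt$) and an assumed $O(1)$ solution amplitude:

```python
# Effective CFL numbers: full model (|u| N dt) vs reduced model (|u| K delta).
u_scale = 1.0                       # typical |u| (assumed O(1))
N, dt = 128, 0.005                  # full model
K, delta = 8, 20 * 0.005            # reduced model
print(u_scale * N * dt)             # 0.64  (full model)
print(u_scale * K * delta)          # 0.8   (reduced model)
```

The two CFL numbers are comparable, but the reduced model uses 16x fewer modes and 20x larger time steps; the open question on this slide is how far $K$ can shrink and $\delta$ can grow before the closure breaks down.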
Summary and ongoing work

Inference-based stochastic model reduction: non-intrusive, time series (NARMA), parametrizes projections on path space.

- Full model: $x' = f(x) + U(x,y)$, $y' = g(x,y)$; data $\{x(nh)\}_{n=1}^{N}$
- Reduced equation: "$X' = f(X) + Z(t,\omega)$"
- Discretization: "$X_{n+1} = X_n + R^h(X_n) + Z_n$"
- Inference: $x_n = F_n(x_{1:n-1}) \approx \mathbb{E}[x_n \mid x_{1:n-1}] \approx \sum_k c_k \Phi^k_{n-p:n-1}$

→ Effective stochastic reduced model for prediction
Open problems:
- general dissipative systems + model selection
- post-processing to predict shocks
- theoretical understanding of the approximation:
  - optimal on the basis space in $L^2$ (Lin-Lu 19)
  - distance between the two stochastic processes?