Data-driven model reduction for stochastic Burgers equations


  1. Data-driven model reduction for stochastic Burgers equations. Fei Lu, Department of Mathematics, Johns Hopkins University. Joint work with: Alexandre J. Chorin (UC Berkeley) and Kevin K. Lin (U. of Arizona). 2nd Symposium on Machine Learning and Dynamical Systems, September 2020.

  2. Consider a stochastic Burgers equation
$v_t = \nu v_{xx} - v v_x + f(x,t)$, $x \in [0, 2\pi]$, periodic BC.
N-mode Fourier-Galerkin, for $k = 1, \dots, N$:
$\frac{d\hat v_k}{dt} = -\nu k^2 \hat v_k - \frac{ik}{2} \sum_{|l|\le N,\, |k-l|\le N} \hat v_l \hat v_{k-l} + \hat f_k(t)$
Need $N \gg 1/\nu$ and $dt \sim 1/N$ (CFL), which is costly: $\nu = 10^{-4}$ gives $N \sim 10^4$ and $10^4\,T$ time steps, so simulating $10^4$ time units takes $10^8$ time steps!
Interested in: efficient simulation of $(\hat v_{1:K})$, $K \ll N$.
Question: a reduced closure model of $(\hat v_{1:K})$?
Space-time reduction: reduce the spatial dimension + increase the time step size.
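The N-mode Galerkin system above can be sketched directly in code. A minimal Python illustration (the helper name and the direct $O(N^2)$ convolution are illustrative choices; practical solvers evaluate the nonlinear term pseudo-spectrally with the FFT):

```python
import numpy as np

def galerkin_rhs(v_hat, nu, f_hat):
    """Right-hand side of the N-mode Fourier-Galerkin ODEs for Burgers:
    d v_k/dt = -nu k^2 v_k - (ik/2) sum_{|l|<=N, |k-l|<=N} v_l v_{k-l} + f_k.
    v_hat: complex modes k = 1..N; conjugate symmetry supplies k = -N..-1.
    """
    N = len(v_hat)
    # full spectrum, index j -> mode j - N, with v_0 = 0 and v_{-k} = conj(v_k)
    full = np.concatenate([np.conj(v_hat[::-1]), [0.0], v_hat])
    rhs = np.zeros(N, dtype=complex)
    for k in range(1, N + 1):
        conv = sum(full[l + N] * full[(k - l) + N]
                   for l in range(-N, N + 1) if abs(k - l) <= N)
        rhs[k - 1] = -nu * k**2 * v_hat[k - 1] - 0.5j * k * conv + f_hat[k - 1]
    return rhs
```

The quadratic convolution is what makes each step expensive for large N, which is the cost the reduced model is meant to avoid.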

  3. Motivation: data assimilation with ensemble prediction
$x' = F(x) + U(x, y)$  (resolved scales $\hat v_{1:K}$)
$y' = G(x, y)$  (subgrid scales $\hat v_{K+1:N}$)
Data assimilation: partial noisy observations → prediction; missing initial conditions → ensemble prediction; we can only afford to resolve $x' = F(x)$.

  4. Motivation: data assimilation with ensemble prediction
$x' = F(x) + U(x, y)$  (resolved scales $\hat v_{1:K}$)
$y' = G(x, y)$  (subgrid scales $\hat v_{K+1:N}$)
Data assimilation: partial noisy observations → prediction; missing initial conditions → ensemble prediction; we can only afford to resolve $x' = F(x)$.
Objective: develop a reduced closure model of $x$ that captures key statistical and dynamical properties and can be used for ensemble simulations.

  5. Closure modeling, model error UQ, subgrid parametrization
Direct constructions: PCA/POD, DMD, Koopman [Holmes, Lumley, Marsden, Mezic, Wilcox, Kutz, Rowley, ...]; nonlinear Galerkin [Foias, Jolly, Kevrekidis, Titi, ...]
Inference / data-driven ROM: ROM closure [Farhat, Carlberg, Iliescu, Wang, ...]; moment closure [Levermore, Morokoff, ...]
Stochastic models: SDEs/GLEs, Mori-Zwanzig formalism, time series models [Chorin/Majda/Ghil groups]; memory → non-Markovian processes [Chorin, Hald, Kupferman, Stinis, Li, Darve, E, Karniadakis, Venturi, Duraisamy, ...]
Equation-free [Kevrekidis, ...]; manifold / machine learning [...]

  6. Inference-based model reduction

  7. $x' = F(x) + U(x, y)$, $y' = G(x, y)$. Data: $\{x(nh)\}_{n=1}^N$.
KEY: approximate the distribution of the stochastic process.
Approximate the discrete-time forward map $x_n = F_n(x_{1:n-1})$: curse of dimensionality → parametric inference, using the structure of the map.

  8. Discrete-time stochastic parametrization: NARMA(p, q) [Chorin-Lu15]
$X_n = X_{n-1} + R_h(X_{n-1}) + Z_n$, $\quad Z_n = \Phi_n + \xi_n$,
$\Phi_n = \underbrace{\sum_{j=1}^{p} a_j X_{n-j} + \sum_{j=1}^{s}\sum_{i=1}^{r} b_{i,j} P_i(X_{n-j})}_{\text{autoregression}} + \underbrace{\sum_{j=1}^{q} c_j \xi_{n-j}}_{\text{moving average}}$
$R_h(X_{n-1})$ comes from a numerical scheme for $x' \approx F(x)$; $\Phi_n$ depends on the past.
NARMAX in system identification: $Z_n = \Phi(Z, X) + \xi_n$.
Tasks: structure derivation, i.e. the terms and orders $(p, r, s, q)$ in $\Phi_n$; parameter estimation of $a_j$, $b_{i,j}$, $c_j$, and $\sigma$ by conditional MLE.
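The estimation step can be illustrated for a scalar model. With no moving-average part (q = 0) and Gaussian noise, the conditional MLE reduces to ordinary least squares; the sketch below is a minimal version of that case (the function name, the cubic basis, and the test dynamics are hypothetical choices, not the talk's actual setup):

```python
import numpy as np

def fit_nar(X, Rh, p):
    """Fit a scalar NARMA(p, 0) model
        X_n = X_{n-1} + R_h(X_{n-1}) + Phi_n + xi_n,
        Phi_n = sum_j a_j X_{n-j} + b_j X_{n-j}**3   (cubic as an example basis)
    by least squares; for q = 0 with Gaussian xi_n this coincides with the
    conditional MLE. Returns the coefficients and the noise level sigma.
    """
    Z = X[p:] - X[p - 1:-1] - Rh(X[p - 1:-1])    # residual the model must explain
    cols = [X[p - j:len(X) - j] for j in range(1, p + 1)]
    cols += [X[p - j:len(X) - j] ** 3 for j in range(1, p + 1)]
    A = np.stack(cols, axis=1)
    coef, *_ = np.linalg.lstsq(A, Z, rcond=None)
    sigma = np.std(Z - A @ coef)
    return coef, sigma
```

With a moving-average part (q > 0) the residuals depend on the coefficients, so the conditional likelihood has to be maximized iteratively rather than in one least-squares solve.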

  9. Example: the two-layer Lorenz 96 model. NARMA reproduces statistics (ACF, PDF) [Chorin-Lu15PNAS]; NARMA improves data assimilation [Lu-Tu-Chorin17MWR].

  10. Model reduction for dissipative PDEs: nonlinear Galerkin → parametric inference

  11. Kuramoto-Sivashinsky: $v_t = -v_{xx} - \nu v_{xxxx} - v v_x$. Burgers: $v_t = \nu v_{xx} - v v_x + f(x,t)$.
Goal: a closed model for $(\hat v_{1:K})$, $K \ll N$.
$\frac{d\hat v_k}{dt} = -q_\nu(k)\, \hat v_k - \frac{ik}{2}\sum_{|l|\le K,\, |k-l|\le K} \hat v_l \hat v_{k-l} + \hat f_k(t) - \frac{ik}{2}\sum_{|l|>K \text{ or } |k-l|>K} \hat v_l \hat v_{k-l}$
Viewing $(\hat v_{1:K}) \sim x$ and $(\hat v_{k>K}) \sim y$: $x' = F(x) + U(x, y)$, $y' = G(x, y)$.
TODO: represent the effect of the high modes on the low modes.

  12. Derivation of a parametric form (KSE): $v_t = -v_{xx} - \nu v_{xxxx} - v v_x$
Let $v = u + w$. In operator form $v_t = Av + B(v)$:
$\frac{du}{dt} = PAu + PB(u) + [PB(u+w) - PB(u)]$
$\frac{dw}{dt} = QAw + QB(u+w)$
Nonlinear Galerkin, i.e. an approximate inertial manifold (AIM)¹: $\frac{dw}{dt} \approx 0 \Rightarrow w \approx -(QA)^{-1} QB(u+w) \Rightarrow w \approx \psi(u)$.
Needs: a spectral gap condition; $\dim(u) > K$. Parametrization with time delay (Lu-Lin-Chorin17): a time series (NARMA) model of the form
$u_k^n = u_k^{n-1} + R_\delta(u^{n-1})_k + g_k^n + \Phi_k^n$,
with $\Phi_k^n := \Phi_k^n(u^{n-p:n-1}, g^{n-p:n-1})$ in the form
$\Phi_k^n = \sum_{j=1}^{p} \Big( c_{k,j}^v u_k^{n-j} + c_{k,j}^R R_\delta(u^{n-j})_k + c_{k,j}^w \sum_{\substack{|k-l|\le K,\ K<|l|\le 2K \\ \text{or } |l|\le K,\ K<|k-l|\le 2K}} u_l^{n-j} u_{k-l}^{n-j} \Big)$
KEY: high modes = functions of the low modes.
¹ Foias, Constantin, Temam, Sell, Jolly, Kevrekidis, Titi et al. (1988-94)
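The AIM fixed point $w \approx -(QA)^{-1}QB(u+w)$ can be solved by simple iteration when the nonlinearity is a contraction. A minimal sketch for the KS spectral symbol (the function name, the iteration count, and the small test setting are hypothetical; it assumes $k^2 - \nu k^4 \ne 0$ for all $k > K$):

```python
import numpy as np

def aim_w(u_hat, K, N, nu, n_iter=5):
    """Approximate-inertial-manifold closure w ≈ ψ(u) for the KSE.
    Setting dw/dt ≈ 0 in  dw/dt = QAw + QB(u+w)  gives the fixed point
    w = -(QA)^{-1} QB(u+w), solved here by iteration from w = 0.
    Modes k = 1..N are stored; u holds modes 1..K; conjugates give k < 0.
    """
    a = np.array([k**2 - nu * k**4 for k in range(1, N + 1)])  # KS symbol (diag of A)

    def B_hat(v_hat):
        # B(v) = -v v_x  ->  B_k = -(ik/2) sum_l v_l v_{k-l}
        full = np.concatenate([np.conj(v_hat[::-1]), [0.0], v_hat])
        out = np.zeros(N, dtype=complex)
        for k in range(1, N + 1):
            conv = sum(full[l + N] * full[(k - l) + N]
                       for l in range(-N, N + 1) if abs(k - l) <= N)
            out[k - 1] = -0.5j * k * conv
        return out

    v_hat = np.concatenate([u_hat, np.zeros(N - K, dtype=complex)])
    w = np.zeros(N - K, dtype=complex)
    for _ in range(n_iter):
        v_hat[K:] = w
        Qb = B_hat(v_hat)[K:]   # projection Q keeps modes K+1..N
        w = -Qb / a[K:]         # w = -(QA)^{-1} QB(u+w)
    return w
```

This is exactly the step that fails for stochastic Burgers on slide 15: without a spectral gap, no such static map $u \mapsto w$ exists, which motivates the time-delay terms in $\Phi_k^n$.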

  13. Test setting (KSE): $\nu = 3.43$, $N = 128$, $dt = 0.001$; reduced model: $K = 5$, $\delta = 100\,dt$ (3 unstable modes, 2 stable modes).
Long-term statistics [figure: probability density function and auto-correlation function of $\mathrm{Re}\,\hat v_4$, comparing data, the truncated system, and NARMA].

  14. Prediction [figure: a typical forecast of $\hat v_4$, and the RMSE of many forecasts vs. lead time, for the truncated system and NARMA].
Forecast time: the truncated system, $T \approx 5$; the NARMA system, $T \approx 50$ ($\approx 2$ Lyapunov times).

  15. Derivation of a parametric form: stochastic Burgers, $v_t = \nu v_{xx} - v v_x + f(x,t)$
Let $v = u + w$. In operator form:
$\frac{du}{dt} = PAu + PB(u) + Pf + [PB(u+w) - PB(u)]$
$\frac{dw}{dt} = QAw + QB(u+w) + Qf$
Spectral gap for Burgers? Likely not. $w(t)$ is not a function of $u(t)$, but a functional of its path. Integrate instead (Duhamel):
$w(t) = e^{QAt} w(0) + \int_0^t e^{QA(t-s)} [QB(u(s) + w(s))]\, ds$
$w^n \approx c_0\, QB(u^n) + c_1\, QB(u^{n-1}) + \cdots + c_p\, QB(u^{n-p})$
Linear-in-parameter approximation:
$PB(u+w) - PB(u) = -P[(uw)_x + \tfrac12 (w^2)_x] \approx -P[(uw)_x] + \text{noise} \approx \sum_{j=0}^{p} c_j P[(u^n\, QB(u^{n-j}))_x] + \text{noise}$,
with signs and constants absorbed into the $c_j$.
KEY: high modes = functionals of the paths of the low modes.
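The path-space closure $w^n \approx \sum_j c_j\, QB(u^{n-j})$ can be illustrated numerically. A sketch (the lag weights `c` are placeholders for coefficients that would be estimated from data; $B(u) = -u u_x$ is evaluated pseudo-spectrally, and $Q$ keeps the band of high modes generated by products of the retained modes):

```python
import numpy as np

def qb_memory(u_hist, c, K, M=64):
    """Sketch of the closure  w^n ≈ sum_j c_j QB(u^{n-j}).
    B(u) = -u u_x is computed pseudo-spectrally on M grid points; Q keeps
    the high modes k = K+1..2K (the range produced by products of modes
    1..K). u_hist: low-mode coefficient arrays (modes 1..K), newest first;
    c: lag weights (placeholders, to be estimated). Needs M >= 4K.
    """
    w = np.zeros(K, dtype=complex)                # modes K+1..2K
    kvec = np.arange(M // 2 + 1)
    for c_j, u_hat in zip(c, u_hist):
        spec = np.zeros(M // 2 + 1, dtype=complex)
        spec[1:K + 1] = u_hat * M                 # rfft scaling convention
        u = np.fft.irfft(spec, n=M)               # u(x) on the grid
        ux = np.fft.irfft(1j * kvec * spec, n=M)  # u_x(x)
        B = np.fft.rfft(-u * ux) / M              # Fourier modes of -u u_x
        w += c_j * B[K + 1:2 * K + 1]
    return w
```

Since products of modes $|l| \le K$ are exactly band-limited to $|k| \le 2K$, a grid with $M \ge 4K$ avoids aliasing in the quadratic term.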

  16. A time series (NARMA) model of the form
$u_k^n = u_k^{n-1} + R_\delta(u^{n-1})_k + f_k^n + g_k^n + \Phi_k^n$,
with $\Phi_k^n := \Phi_k^n(u^{n-p:n-1}, f^{n-p:n-1})$ in the form
$\Phi_k^n = \sum_{j=1}^{p} \Big( c_{k,j}^v u_k^{n-j} + c_{k,j}^R R_\delta(u^{n-j})_k + c_{k,j}^w \sum_{\substack{|k-l|\le K,\ K<|l|\le 2K \\ \text{or } |l|\le K,\ K<|k-l|\le 2K}} u_l^{n-j} u_{k-l}^{n-j} \Big)$

  17. Numerical tests: $\nu = 0.05$, forcing on $K_0 = 4$ modes → random shocks.
Full model: $N = 128$, $dt = 0.005$; reduced model: $K = 8$, $\delta = 20\,dt$.
[figure: energy spectrum vs. wavenumber for the true, truncated, and NAR models]

  18. [figure: cross-ACFs of the mode energies, $\mathrm{cov}(|u_\cdot|^2, |u_k|^2)$ for $k = 1, \dots, 8$, vs. time lag, comparing the true, truncated, and NAR models] Cross-ACF of energy (4th moments!).

  19. [figure: trajectories of modes $k = 1, \dots, 8$ over $t \in [0, 25]$ for the true, truncated, and NAR models] Trajectory prediction in response to force.

  20. Space-time reduction: how small can $K$ (the spatial dimension) be? How large can $\delta$ (the time-step size) be?
CFL number: $|u|\frac{dt}{dx} \sim |u| N\, dt \sim |u| K \delta$.
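Assuming the per-step cost scales with the number of retained modes, the CFL relation above implies a work reduction of roughly $(N/K)(\delta/dt)$ per simulated time unit. A back-of-the-envelope check for the Burgers test setting (closure overhead and FFT log factors ignored):

```python
# Cost comparison implied by the CFL constraint |u| N dt ~ |u| K delta:
# per-step cost ~ number of modes, steps per time unit ~ 1/step size,
# so the work per simulated time unit drops by (N/K) * (delta/dt).
N, dt = 128, 0.005          # full model (Burgers test setting)
K, delta = 8, 20 * dt       # reduced model
speedup = (N / K) * (delta / dt)
print(speedup)              # 320.0
```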

  21. Summary and ongoing work
$x' = F(x) + U(x, y)$, $y' = G(x, y)$. Data: $\{x(nh)\}_{n=1}^N$.
Inference-based stochastic model reduction: non-intrusive time series (NARMA) models that parametrize projections on path space.
Continuum view: "$X' = f(X) + Z(t, \omega)$"; discretization: "$X_{n+1} = X_n + R_h(X_n) + Z_n$".
Inference: $x_n = F_n(x_{1:n-1}) \approx E[x_n \mid x_{1:n-1}] \approx \sum_k c_k \Phi_k(x_{n-p:n-1})$
→ an effective stochastic reduced model for prediction.

  22. Open problems: general dissipative systems + model selection; post-processing to predict shocks; theoretical understanding of the approximation: ◮ optimality on the basis space in $L^2$ (Lin-L.19); ◮ distance between the two stochastic processes?
