
Performance tuning of Newton-GMRES methods for discontinuous Galerkin discretization of the Navier-Stokes equations



  1. Introduction Background Numerical Experiments Conclusion
     Performance tuning of Newton-GMRES methods for discontinuous Galerkin discretization of the Navier-Stokes equations
     Matthew J. Zahr and Per-Olof Persson
     Stanford University; University of California, Berkeley; Lawrence Berkeley National Lab
     25th June 2013, San Diego, CA
     43rd AIAA Fluid Dynamics Conference and Exhibit
     Zahr and Persson DG Performance Tuning

  2. Outline
     1 Introduction
     2 Background
       - ODE Scheme
       - Newton Prediction
       - Jacobian Recycling
       - GMRES Tolerance
     3 Numerical Experiments
       - Experiment 1: ODE Scheme
       - Experiment 2: Newton Prediction
       - Experiment 3: Jacobian Recycling
       - Experiment 4: GMRES Tolerance
     4 Conclusion

  3. Motivation
     - Low-order methods perform poorly for problems where high numerical accuracy is required:
       - Wave propagation (e.g. aeroacoustics)
       - Turbulent flow (e.g. drag & transition prediction)
       - Non-linear interactions (e.g. fluid-structure coupling)
     - High-order discontinuous Galerkin methods are attractive options: low dissipation, stabilization, complex geometries
     - Parallel computers are required for realistic problems because of the high computational and storage costs of DG

  4. Motivation
     - Fundamental properties of Discontinuous Galerkin (DG) methods (tabulated on the slide for FVM, FDM, FEM, and DG):
       1) High-order/low dispersion
       2) Unstructured meshes
       3) Stability for conservation laws
     - However, several problems remain to be resolved:
       - High CPU/memory requirements (compared to FVM or high-order FDM)
       - Low tolerance to under-resolved features
       - High-order geometry representation and mesh generation
     - The challenge is to make DG competitive for real-world problems

  5. Semi-discrete Equations
     Discretization of the Navier-Stokes equations with DG-FEM:
     $$M \dot{u}(t) = r(t, u(t))$$
     where $M \in \mathbb{R}^{N \times N}$ is the block diagonal mass matrix, $u \in \mathbb{R}^{N}$ is the time-dependent state vector arising from the DG-FEM discretization, and $r : \mathbb{R}_{+} \times \mathbb{R}^{N} \to \mathbb{R}^{N}$ is the spatially-discretized nonlinearity of the Navier-Stokes equations.

  6. Implicit Time Integration
     - Implicit solvers are typically required because of CFL restrictions from viscous effects, low Mach numbers, and adaptive/anisotropic grids:
       - Backward differentiation formulas
       - Runge-Kutta methods
     - Jacobian matrices are large even at p = 2 or p = 3, however:
       - They are required for non-trivial preconditioners
       - They are very expensive to recompute
     - Therefore, we consider matrix-based Newton-Krylov solvers

  7. Backward Differentiation Formulas (BDF)
     $$M u^{(n+1)} - \left( \sum_{i=0}^{n} \alpha_i M u^{(i)} + \kappa \Delta t\, r(t_{n+1}, u^{(n+1)}) \right) = 0$$
     - BDF1 (Backward Euler): $\alpha^{1} = \begin{bmatrix} 0 & \cdots & 0 & 1 \end{bmatrix}$, $\kappa_{1} = 1$
     - BDF2: $\alpha^{2} = \begin{bmatrix} 0 & \cdots & 0 & -1/3 & 4/3 \end{bmatrix}$, $\kappa_{2} = 2/3$
     - BDF3: $\alpha^{3} = \begin{bmatrix} 0 & \cdots & 0 & 2/11 & -9/11 & 18/11 \end{bmatrix}$, $\kappa_{3} = 6/11$
     - BDF23: $\alpha^{23} = \tau \alpha^{2} + (1 - \tau) \alpha^{3}$, $\kappa_{23} = \tau \kappa_{2} + (1 - \tau) \kappa_{3}$
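
The BDF update above can be sketched in a few lines. This is a minimal illustration, assuming a linear right-hand side r(t, u) = A u so that the implicit equation reduces to one linear solve instead of a Newton iteration. The names `BDF_COEFFS`, `bdf23_coeffs`, and `bdf_step`, the matrix `A`, and the scalar test problem are hypothetical and do not come from the slides.

```python
import numpy as np

# (alpha, kappa) per scheme; alpha is ordered oldest -> newest, as on the slide.
BDF_COEFFS = {
    "BDF1": (np.array([1.0]), 1.0),
    "BDF2": (np.array([-1.0 / 3.0, 4.0 / 3.0]), 2.0 / 3.0),
    "BDF3": (np.array([2.0 / 11.0, -9.0 / 11.0, 18.0 / 11.0]), 6.0 / 11.0),
}

def bdf23_coeffs(tau):
    """Blend BDF2 and BDF3: alpha23 = tau*alpha2 + (1 - tau)*alpha3."""
    a2, k2 = BDF_COEFFS["BDF2"]
    a3, k3 = BDF_COEFFS["BDF3"]
    a2_padded = np.concatenate(([0.0], a2))  # align both vectors on the newest state
    return tau * a2_padded + (1.0 - tau) * a3, tau * k2 + (1.0 - tau) * k3

def bdf_step(M, A, history, dt, alpha, kappa):
    """Solve M u_new - (sum_i alpha_i M u_i + kappa*dt*A u_new) = 0 (assumes r = A u)."""
    recent = history[-len(alpha):]
    rhs = M @ sum(a * u for a, u in zip(alpha, recent))
    return np.linalg.solve(M - kappa * dt * A, rhs)

# Assumed test problem: u' = -u with M = I, integrated to t = 1 with BDF2.
M = np.eye(1)
A = -np.eye(1)
dt = 0.01
alpha, kappa = BDF_COEFFS["BDF2"]
history = [np.array([np.exp(-k * dt)]) for k in range(2)]  # exact starting values
for _ in range(99):
    history.append(bdf_step(M, A, history, dt, alpha, kappa))
bdf2_error = abs(history[-1][0] - np.exp(-1.0))

alpha23, kappa23 = bdf23_coeffs(0.5)
```

For the nonlinear Navier-Stokes residual, the linear solve would be replaced by the Newton-GMRES iteration the talk is about; only the coefficient handling carries over unchanged.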

  8. BDF23_3: 3rd Order, A-stable BDF
     - Define $u_{23}$ as
       $$u_{23} = \alpha^{23}_{n} u^{(n)} + \alpha^{23}_{n-1} u^{(n-1)} + \alpha^{23}_{n-2} u^{(n-2)}$$
     - Solve the nonlinear Backward Cauchy-Euler (BCE) equation $R(u_i) = 0$, where
       $$R(u_i) = M u_i - \left( M u_{23} + \kappa_{23} \Delta t\, r(t_{n+1}, u_i) \right)$$
     - Define $u_{33}$ as
       $$u_{33} = \alpha^{3}_{n} u^{(n)} + \alpha^{3}_{n-1} u^{(n-1)} + \alpha^{3}_{n-2} u^{(n-2)} - \delta (u_i - u_{23})$$
     - Solve the nonlinear BCE equation $R(u^{(n+1)}) = 0$, where
       $$R(u^{(n+1)}) = M u^{(n+1)} - \left( M u_{33} + \kappa_{33} \Delta t\, r(t_{n+1}, u^{(n+1)}) \right)$$

  9. Diagonally-Implicit Runge Kutta (DIRK)
     - Standard formulation (k-form):
       $$u^{(n+1)} = u^{(n)} + \sum_{i=1}^{s} b_i k_i$$
       $$M k_i = \Delta t\, r\left( t_n + c_i \Delta t,\ u^{(n)} + \sum_{j=1}^{i} a_{ij} k_j \right)$$
     - Alternate formulation (u-form):
       $$u^{(n+1)} = u^{(n)} + \Delta t \sum_{j=1}^{s} b_j M^{-1} r(t_n + c_j \Delta t, \bar{u}_j)$$
       $$M \bar{u}_i = M u^{(n)} + \Delta t \sum_{j=1}^{i} a_{ij}\, r(t_n + c_j \Delta t, \bar{u}_j)$$
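
The k-form above can be illustrated with a small sketch, again assuming a linear right-hand side r(t, u) = A u so that each stage equation is a linear solve. The slides do not give the DIRK1, DIRK2, or DIRK3 tableaux; the two-stage, second-order SDIRK coefficients with gamma = 1 - 1/sqrt(2) used here are an assumed stand-in, and `dirk_step_linear` is a hypothetical name.

```python
import numpy as np

def dirk_step_linear(M, A, u, dt, a, b):
    """One DIRK step in k-form for M u' = A u; each stage is a linear solve."""
    s = len(b)
    k = []
    for i in range(s):
        # Known part of the stage state: u + sum_{j<i} a_ij k_j
        stage = u + sum((a[i][j] * k[j] for j in range(i)), np.zeros_like(u))
        # M k_i = dt*A (stage + a_ii k_i)  ->  (M - dt*a_ii*A) k_i = dt*A stage
        k.append(np.linalg.solve(M - dt * a[i][i] * A, dt * (A @ stage)))
    return u + sum(b[i] * k[i] for i in range(s))

# Assumed tableau: 2-stage, 2nd-order SDIRK (not taken from the slides).
gamma = 1.0 - 1.0 / np.sqrt(2.0)
a = [[gamma, 0.0], [1.0 - gamma, gamma]]
b = [1.0 - gamma, gamma]

# Assumed test problem: u' = -u with M = I, integrated to t = 1.
M = np.eye(1)
A = -np.eye(1)
u = np.array([1.0])
dt = 0.01
for _ in range(100):
    u = dirk_step_linear(M, A, u, dt, a, b)
dirk_error = abs(u[0] - np.exp(-1.0))
```

Because the tableau is lower triangular, the stages are solved one at a time, and with a constant diagonal entry the same matrix M - dt*a_ii*A appears in every stage.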

  10. Newton Prediction
      - Accurate predictions for Newton's method may result in fewer nonlinear iterations
      - Extrapolation using a Lagrange polynomial:
        - Construct a polynomial of degree p from p + 1 points in the solution history
        - Use the polynomial to predict the solution at the next time step
        - Constant (LAG0), linear (LAG1), quadratic (LAG2)
      - Extrapolation using a Hermite polynomial:
        - Construct a polynomial of degree 2p - 1 from p points in the history of the solution and its derivative
        - Use the polynomial to predict the solution at the next time step
        - Linear (HERM1), cubic (HERM2), quintic (HERM3)
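
The Lagrange predictors can be sketched as follows. This is a minimal illustration, not the authors' implementation: `lagrange_predict` is a hypothetical helper that evaluates the interpolating polynomial through the last `order + 1` stored states at the new time, and the quadratic test data are made up for the example.

```python
import numpy as np

def lagrange_predict(times, states, t_next, order):
    """Extrapolate to t_next with the degree-`order` polynomial
    through the last order + 1 stored states (LAG0, LAG1, LAG2, ...)."""
    ts = times[-(order + 1):]
    us = states[-(order + 1):]
    prediction = np.zeros_like(us[0], dtype=float)
    for i, (ti, ui) in enumerate(zip(ts, us)):
        # Lagrange basis polynomial for node i, evaluated at t_next
        weight = 1.0
        for j, tj in enumerate(ts):
            if j != i:
                weight *= (t_next - tj) / (ti - tj)
        prediction += weight * ui
    return prediction

# Assumed history sampled from u(t) = t^2 + 1 at t = 0, 1, 2.
times = [0.0, 1.0, 2.0]
states = [np.array([t * t + 1.0]) for t in times]
lag0 = lagrange_predict(times, states, 3.0, order=0)  # repeats the last state
lag1 = lagrange_predict(times, states, 3.0, order=1)  # linear extrapolation
lag2 = lagrange_predict(times, states, 3.0, order=2)  # exact for a quadratic
```

With uniform time steps the weights reduce to the familiar formulas 2u_n - u_{n-1} for LAG1 and 3u_n - 3u_{n-1} + u_{n-2} for LAG2.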

  11. Jacobian Recycling
      - For matrix-based methods, every nonlinear iteration requires a Jacobian evaluation
      - Jacobian assembly is at least 10× as expensive as a residual evaluation
      - Re-using Jacobians yields inexact Newton directions:
        - May require more Newton iterations per time step
        - Enables re-use of the preconditioner
        - Reduces the number of Jacobian evaluations and preconditioner computations
      - Recompute the Jacobian when the corresponding Newton step fails to reduce the nonlinear residual
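
The recycling rule in the last bullet can be sketched as below. This is an illustration under stated assumptions, not the authors' solver: a dense direct solve stands in for preconditioned GMRES, and `newton_recycled` and the scalar test residual are hypothetical.

```python
import numpy as np

def newton_recycled(residual, jacobian, u0, J=None, tol=1e-10, max_iter=50):
    """Newton iteration that keeps a stored Jacobian until a step fails to
    reduce the residual norm, then recomputes it and retries the step."""
    u = np.asarray(u0, dtype=float)
    n_jac = 0
    fresh = J is None  # a Jacobian handed in by the caller is treated as stale
    if fresh:
        J = jacobian(u)
        n_jac += 1
    r = residual(u)
    rnorm = np.linalg.norm(r)
    for _ in range(max_iter):
        if rnorm <= tol:
            break
        u_trial = u - np.linalg.solve(J, r)  # stand-in for a GMRES solve
        r_trial = residual(u_trial)
        rnorm_trial = np.linalg.norm(r_trial)
        if rnorm_trial < rnorm:
            u, r, rnorm = u_trial, r_trial, rnorm_trial  # accept, keep J
            fresh = False
        elif not fresh:
            J = jacobian(u)  # stale Jacobian failed: recompute and retry
            n_jac += 1
            fresh = True
        else:
            raise RuntimeError("step with a fresh Jacobian did not reduce the residual")
    return u, J, n_jac

# Assumed test problem: R(u) = u^2 - 2, with root sqrt(2).
residual = lambda u: u ** 2 - 2.0
jacobian = lambda u: np.diag(2.0 * u)

u_sol, J_used, n_jac = newton_recycled(residual, jacobian, [1.5])
# Start from a deliberately bad stored Jacobian to trigger one recomputation.
u_stale, _, n_jac_stale = newton_recycled(residual, jacobian, [1.5], J=np.array([[-3.0]]))
```

In a time-stepping loop the returned `J` (and any preconditioner built from it) would be passed into the next step, which is where the savings described above come from.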

  12. GMRES Tolerance
      - When using GMRES to solve $Ax = b$, a common convergence criterion is
        $$\| Ax - b \|_2 \le \mathrm{Gtol}\, \| b \|_2$$
      - Small GMRES tolerance → search directions "close" to Newton directions
        - More GMRES iterations per Newton step, fewer Newton iterations
      - Large GMRES tolerance → search directions may be far from Newton directions
        - Fewer GMRES iterations per Newton step, more Newton iterations
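
The effect of Gtol on the linear solve can be seen with a small, unpreconditioned GMRES written out in NumPy. This is a teaching sketch under stated assumptions: the function `gmres`, the random test matrix, and the two tolerances are illustrative, the right-hand side is assumed nonzero, and the residual is recomputed explicitly at every iteration for clarity rather than efficiency.

```python
import numpy as np

def gmres(A, b, gtol, max_iter=None):
    """Unrestarted GMRES from x0 = 0; stops once ||A x - b||_2 <= gtol * ||b||_2."""
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    beta = np.linalg.norm(b)  # assumes b != 0
    Q = np.zeros((n, max_iter + 1))         # orthonormal Krylov basis
    H = np.zeros((max_iter + 1, max_iter))  # upper Hessenberg matrix
    Q[:, 0] = b / beta
    x = np.zeros(n)
    for m in range(1, max_iter + 1):
        w = A @ Q[:, m - 1]
        for i in range(m):  # modified Gram-Schmidt
            H[i, m - 1] = Q[:, i] @ w
            w -= H[i, m - 1] * Q[:, i]
        H[m, m - 1] = np.linalg.norm(w)
        # Minimize ||beta*e1 - H y|| over the current Krylov subspace
        e1 = np.zeros(m + 1)
        e1[0] = beta
        y = np.linalg.lstsq(H[:m + 1, :m], e1, rcond=None)[0]
        x = Q[:, :m] @ y
        if np.linalg.norm(A @ x - b) <= gtol * beta or H[m, m - 1] < 1e-14:
            return x, m
        Q[:, m] = w / H[m, m - 1]
    return x, max_iter

# Assumed test system: a well-conditioned perturbation of 4*I.
rng = np.random.default_rng(0)
A = 4.0 * np.eye(50) + 0.1 * rng.standard_normal((50, 50))
b = rng.standard_normal(50)

x_loose, iters_loose = gmres(A, b, gtol=1e-2)
x_tight, iters_tight = gmres(A, b, gtol=1e-8)
```

The loose tolerance stops after fewer Krylov iterations and returns a less accurate solution; inside a Newton loop that less accurate solution is the inexact search direction described above.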

  13. Euler Vortex
      Figure: Euler Vortex: mesh (degree p = 4) and solution (density) at $t_0 = \sqrt{10^2 + 5^2}$

  14. Viscous flow over NACA wing at high angle of attack
      Figure: NACA Wing: mesh (degree p = 4) and solution (Mach) at $t_0 = 5.01$

  15. Experiment 1: ODE Scheme
      Figure: Error (mass matrix norm) vs. CPU time (sec) for the Euler Vortex and the NACA Wing, comparing BDF2, BDF23, BDF23_3, DIRK1, DIRK2, and DIRK3. Settings for both panels: LAG2, Jacobian Recomputation, Gtol = $10^{-5}$.
      - BDF23_3 is cheaper than DIRK3 for high accuracy
      - BDF23 has the same slope but a better offset than BDF2

  16. Experiment 2: Newton Prediction
      Figure: Error (mass matrix norm) vs. CPU time (sec) for the Euler Vortex and the NACA Wing, comparing LAG0, LAG1, LAG2, HERM1, HERM2, and HERM3. Settings for both panels: BDF23, Jacobian Recomputation, Gtol = $10^{-5}$.
      - LAG0 is a poor predictor
      - LAG1, LAG2, HERM1, and HERM2 are comparable predictors
      - LAG2 is a good predictor for all ∆t considered
      - High-order extrapolation may not be a good idea
