experiences using 2decomp fft to solve partial
play

Experiences using 2decomp&fft to solve Partial Differential - PowerPoint PPT Presentation

Experiences using 2decomp&fft to solve Partial Differential Equations using Fourier Spectral Methods Sudarshan Balakrishnan, Abdullah H. Bargash, Gong Chen, Brandon Cloutier, Ning Li, Dave Malicke, Benson Muite, Michael Quell, Paul Rigge,


  1. Experiences using 2decomp&fft to solve Partial Differential Equations using Fourier Spectral Methods Sudarshan Balakrishnan, Abdullah H. Bargash, Gong Chen, Brandon Cloutier, Ning Li, Dave Malicke, Benson Muite, Michael Quell, Paul Rigge, Damian San Rom´ an Alerigi Mamdouh Solimani, Andre Souza, Abdulaziz S.Thiban, Mark Van Moer, Jeremy West Tartu ¨ Ulikool benson.muite@ut.ee http://math.ut.ee/˜benson http: //en.wikibooks.org/wiki/Parallel_Spectral_Numerical_Methods

  2. Outline Project Aim Fourier Series The Heat Equation The Allen Cahn Equation The Gray-Scott Equations Nonlinear Schr¨ odinger Equation Navier-Stokes Equation Maxwell’s Equations Possible Further Work

  3. Project Aim Teaching tool for use in mathematics and computer science courses from 1st year undergraduate to postgraduate level Research tool: investigate partial differential equations, investigate computer performance Help paper reproducibility and verifiability Reduce coding time Rely on 2decomp&fft for MPI parallelization ( http://2decomp.org ) Material has been tested in a course on Multivariable calculus and on an introduction to partial differential equations

  4. Fourier series: Separation of Variables 1 dy = y (1) dt dy = dt (2) y � dy � = dt (3) y ln y + a = t + b (4) e ln y + a e t + b = (5) e ln y e a e t e b = (6) e b e a e t y = (7) ce t . y ( t ) = (8)

  5. Fourier series: Separation of Variables 2 u t = u xx (9) Suppose u = X ( x ) T ( t ) d 2 X d T d t ( t ) d x 2 ( x ) T ( t ) = X ( x ) = − C , (10) Solving each of these separately and then using linearity we get a general solution � � � α n exp ( − C n t ) sin ( C n x ) + β n exp ( − C n t ) cos ( C n x ) n (11)

  6. Fourier series: Separation of Variables 3 How do we find a particular solution? Suppose u ( x , t = 0 ) = f ( x ) Suppose u ( 0 , t ) = u ( 2 π, t ) and u x ( 0 , t ) = u x ( 2 π, t ) then recall � 2 π � π m = n sin ( nx ) sin ( mx ) = m � = n , (12) 0 0 � 2 π � π m = n cos ( nx ) cos ( mx ) = m � = n , (13) 0 0 � 2 π cos ( nx ) sin ( mx ) = 0 . (14) 0

  7. Fourier series: Separation of Variables 4 So if � f ( x ) = α n sin ( nx ) + β n cos ( nx ) . (15) n then � 2 π f ( x ) sin ( nx ) d x 0 α n = (16) � 2 π sin 2 ( nx ) d x 0 � 2 π f ( x ) cos ( nx ) d x 0 β n = . (17) � 2 π cos 2 ( nx ) d x 0 and � exp ( − n 2 t ) [ α n sin ( nx ) + β n cos ( nx )] u ( x , t ) = (18) n The Fast Fourier Transform allows one to find good approximations to α n and β n when the solution is found at a finite number of evenly spaced grid points

  8. The 1D Heat Equation: Finding derivatives and timestepping Let � ˆ u ( x ) = u k exp ( ikx ) (19) k then d ν u � ( ik ) ν ˆ d x ν = u k . (20) Consider u t = u xx , which is approximated by ∂ ˆ u k α ( ik ) 2 ˆ = u k (21) ∂ t u n + 1 ˆ − ˆ u n k k α ( ik ) 2 ˆ u n + 1 = (22) k h u n + 1 ( 1 − α h ( ik ) 2 ) u n ˆ ˆ = (23) k k u n ˆ u n + 1 k ˆ = ( 1 − α h ( ik ) 2 ) . (24) k

  9. The 2D Allen Cahn Equation Consider u t = ǫ ( u xx + u yy ) + u − u 3 , which is approximated by ∂ ˆ u � ( ik x ) 2 + ( ik y ) 2 � u − ˆ u + ˆ ˆ u 3 = ǫ (25) ∂ t u n + 1 − ˆ u n ˆ � ( ik x ) 2 + ( ik y ) 2 � ˆ ˆ u n + 1 + ˆ u n − ( u n ) 3 (26) = ǫ h

  10. The 3D Gray-Scott Equations ∂ u = D u ∆ u + α ( 1 − u ) − uv 2 , (27) ∂ t ∂ v = D v ∆ v − β v + uv 2 . (28) ∂ t Solved using a splitting method (More information on splitting for this at http://arxiv.org/abs/1310.3901 ) http://web.student.tuwien.ac.at/˜e1226394/

  11. The 2D nonlinear Schr¨ odinger Equation i ψ t + ψ xx + ψ yy = | ψ | 2 ψ Solved using Fast Fourier Transform and splitting

  12. The 2D nonlinear Schr¨ odinger Equation Table: Computation times in seconds for 20 time steps of 10 − 5 for a Fourier split step method for the cubic nonlinear Schr¨ odinger equation on [ − 5 π, 5 π ] 2 . Grid GPU GPU GPU Xeon Phi CPU Size (Cuf) (Cuda) (OpenACC) (61 cores) (1 core) 256 2 0.00802 0.0116 0.0130 0.0122 0.442 512 2 0.0234 0.0315 0.0369 0.0291 1.94 1024 2 0.0851 0.105 0.132 0.118 12.7 2048 2 0.334 0.415 0.527 0.422 57.2 4096 2 1.49 2.02 2.30 1.626 329

  13. The Real Cubic Klein-Gordon Equation u tt − ∆ u + u = | u | 2 u (29) � 1 2 | u t | 2 + 1 2 | u | 2 + 1 2 |∇ u | 2 − 1 4 | u | 4 d x E ( u , u t ) = (30) u n + 1 − 2 u n + u n − 1 − ∆ u n + 1 + 2 u n + u n − 1 + u n + 1 + 2 u n + u n − 1 ( δ t ) 2 4 4 (31) � 2 u n � u n � � = ± (32) Parallelization done using 2decomp library for FFT and processing independent loops Other time stepping algorithms possible, including splitting

  14. Simulations and Videos by Brian Leu, Albert Liu, and Parth Sheth 10 2 Measured Ideal Computation Time (s) 10 1 10 0 10 4 10 5 10 6 Number of Cores Figure: Strong scaling on Mira for a 4096 3 discretization. http://www-personal.umich.edu/˜alberliu/ http://www-personal.umich.edu/˜brianleu/ http://www-personal.umich.edu/˜pssheth/

  15. The 2D and 3D Navier Stokes Equation Consider incompressible case only � ∂ u � ∂ t + u · ∇ u = −∇ p + µ ∆ u (33) ρ ∇ · u = 0 . (34) p pressure, µ viscosity, ρ , density 2D u ( x , y ) = ( u ( x , y ) , v ( x , y )) 3D u ( x , y , z ) = ( u ( x , y , z ) , v ( x , y , z ) , w ( x , y , z ))

  16. 2D Vorticity-Streamfunction Formulation ω = ∇ × u = ∂ v ∂ x − ∂ u ∂ y = − ∆ ψ � ∂ω � ∂ t + u ∂ω ∂ x + v ∂ω ρ = µ ∆ ω (35) ∂ y and ∆ ψ = − ω. (36)

  17. Time Discretization � ω n + 1 , k + 1 − ω n (37) ρ δ t u n + 1 , k ∂ω n + 1 , k + v n + 1 , k ∂ω n + 1 , k + u n ∂ω n ∂ x + v n ∂ω n + 1 � �� 2 ∂ x ∂ y ∂ y = µ � ω n + 1 , k + 1 + ω n � 2 ∆ , and ∆ ψ n + 1 , k + 1 = − ω n + 1 , k + 1 , (38) u n + 1 , k + 1 = ∂ψ n + 1 , k + 1 v n + 1 , k + 1 = − ∂ψ n + 1 , k + 1 , . (39) ∂ y ∂ x Fixed point iteration used to obtain nonlinear terms

  18. Example Videos http://www-personal.umich.edu/˜cloutbra/ research.html Simulations on a single NVIDIA Fermi GPU about 20 times faster than a 16 core CPU

  19. 3D Equivalent Formulation Simplification of equation with periodic boundary conditions � ∂ u � ρ ∂ t + u · ∇ u = −∇ p + µ ∆ u (40) ∇ · u = 0 (41) so � ∂ u � �� ∇ · ∂ t + u · ∇ u = ∇ · [ −∇ p + µ ∆ u ] (42) ρ ρ ∇ · ( u · ∇ u ) = − ∆ p (43) p = ∆ − 1 [ ∇ · ( u · ∇ u )] (44) so � ∂ u ∆ − 1 [ ∇ · ( u · ∇ u )] � � � ρ ∂ t + u · ∇ u = − ρ ∇ + µ ∆ u (45)

  20. 3D Equivalent Formulation - Implicit Midpoint Time Discretization � u n + 1 , j + 1 − u n + u n + 1 , j + u n � u n + 1 , j + u n �� ρ · ∇ δ t 2 2 ( u n + 1 , j + u n ) · ∇ ( u n + 1 , j + u n ) ∆ − 1 � � � ��� = ρ ∇ ∇ · 4 + µ ∆ u n + 1 , j + 1 + u n , 2 Video of Taylor Green Vortex http://vimeo.com/87981782

  21. 3D Equivalent Formulation - Carpenter-Kennedy Discretization 1: procedure R UNGE –K UTTA ( u ) h = 0 2: u = u n 3: for k = 1 → 5 do 4: h ← g ( u ) + β k h 5: µ = 0 . 5 δ t ( α k + 1 − α k ) 6: v − µ l ( v ) = u + γ k δ t h + µ l ( u ) 7: u ← v 8: end for 9: u n + 1 = u 10: 11: end procedure

  22. Performance δ t = 0 . 005 for 512 3 and δ t = 0 . 01 for 256 3 grid points. For IMR scheme, fixed point iteration procedure was stopped once the difference between two successive iterates was less than 10 − 10 in l ∞ norm of velocity fields. Core Hours Method Grid Size Cores Time Steps Time (s) Timestep 256 3 IMR 512 1000 4060 0.578 512 3 IMR 1024 500 9899 5.68 512 3 CK 4096 2000 7040 4.0 Table: Performance of Fourier pseudospectral code on Shaheen. IMR is an abbreviation for implicit midpoint rule and CK is an abbreviation for Carpenter–Kennedy.

  23. Kinetic Energy Evolution Kinetic Energy 0.13 0.12 0.11 Energy 0.1 IMR 256 0.09 IMR 512 CK 512 0.08 Reference 0 2 4 6 8 10 Time Figure: KE of solutions are so close they are almost indistinguishable

  24. Kinetic Energy Dissipation Rate Kinetic Energy Dissipation Rate 0.014 IMR 256 0.012 IMR 512 Energy Dissipation Rate CK 512 0.01 Reference 0.008 0.006 0.004 0.002 0 0 2 4 6 8 10 Time Figure: Plot during the initial stage, where flow is essentially inviscid and laminar. Fully developed turbulent flow is observed around t max ≈ 8.

  25. Kinetic Energy Dissipation Rate Kinetic Energy Dissipation Rate − 4 15x 10 IMR 256 − Reference Energy Dissipation Rate IMR 512 − Reference 10 CK 512 − Reference 5 0 − 5 0 2 4 6 8 10 Time Figure: Difference in kinetic energy dissipation rates between the current discretizations and the reference solution.

  26. Vorticity Figure: Square of the vorticity in the plane centered at ( π, 0 , 0 ) with normal vector ( 1 , 0 , 0 ) .

  27. Discrete energy equality for midpoint rule � T � u ( t = T ) � 2 l 2 − � u ( t = 0 ) � 2 �∇ u � 2 l 2 = − µ l 2 d t 0 N − 1 l 2 = − µ 2 � U n + U n + 1 �� � � � u N � 2 l 2 − � u 0 � 2 � ∇ l 2 δ t . � � 4 � n = 0

Recommend


More recommend