Parallel Time-Domain Boundary Element Method for 3-Dimensional Wave Equation
Space-Time Methods for PDEs, RICAM Linz, November 10, 2016
D. Lukáš, M. Merta, J. Zapletal, and A. Veit
VŠB–Technical University of Ostrava, Czech Rep.; University of Chicago; industry
email: dalibor.lukas@vsb.cz
Outline
• Parallel fast BEM and applications
• Boundary integral formulation of sound-hard scattering
• Time-domain boundary element method
• Parallelization, preconditioning, numerical experiments
• Conclusion, outlook, references
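Parallel fast BEM and applications
Laplace equation in an unbounded domain
Let $\Omega \subset \mathbb{R}^3$ be a Lipschitz domain and $\Omega_e := \mathbb{R}^3 \setminus \overline{\Omega}$:
\[
\begin{aligned}
-\Delta u(\tilde{x}) &= 0, && \tilde{x} \in \Omega_e,\\
\gamma_N u(x) := \frac{du}{dn}(x) &= g(x), && x \in \Gamma := \partial\Omega,\\
|u(\tilde{x})| &= O(1/|\tilde{x}|), && |\tilde{x}| \to \infty.
\end{aligned}
\]
Representation formula
\[
\forall \tilde{x} \in \Omega_e: \quad
u(\tilde{x}) = -\int_\Gamma \gamma_N u(y)\, G(\tilde{x},y)\, dS(y)
+ \int_\Gamma u(y)\, \gamma_{N,y} G(\tilde{x},y)\, dS(y),
\qquad G(\tilde{x},y) := \frac{1}{4\pi|\tilde{x}-y|}.
\]
Each of the two integrals satisfies $\Delta(\cdot) = 0$ in $\Omega_e$ and $|\cdot| = O(1/|\tilde{x}|)$.
An indirect method
Find an auxiliary double-layer density $\phi: \Gamma \to \mathbb{R}$ such that
\[
\gamma_N \int_\Gamma \phi(y)\, \gamma_{N,y} G(x,y)\, dS(y) = g(x), \quad x \in \Gamma,
\]
and then evaluate
\[
u(\tilde{x}) = \int_\Gamma \phi(y)\, \gamma_{N,y} G(\tilde{x},y)\, dS(y), \quad \tilde{x} \in \Omega_e.
\]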
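As a small illustration of the kernel used in the representation formula, the following sketch evaluates $G$ and its normal derivative with respect to $y$, and provides a finite-difference Laplacian to check numerically that $G(\cdot, y)$ is harmonic away from $y$. The function names and the step size are illustrative assumptions, not part of the talk.

```python
import math

def G(x, y):
    """Laplace fundamental solution 1 / (4*pi*|x - y|) in 3D."""
    r = math.dist(x, y)
    return 1.0 / (4.0 * math.pi * r)

def dG_dn_y(x, y, n):
    """Normal derivative of G with respect to y:
    n(y) . grad_y G = n(y) . (x - y) / (4*pi*|x - y|^3)."""
    r = math.dist(x, y)
    dot = sum(ni * (xi - yi) for ni, xi, yi in zip(n, x, y))
    return dot / (4.0 * math.pi * r ** 3)

def laplacian_fd(f, x, h=1e-3):
    """Second-order central finite-difference Laplacian of f at the point x."""
    total = 0.0
    for d in range(3):
        xp = list(x)
        xm = list(x)
        xp[d] += h
        xm[d] -= h
        total += f(xp) - 2.0 * f(x) + f(xm)
    return total / h ** 2

# Usage: the Laplacian of G(., y) should vanish (up to discretization error)
# at any point different from y.
source = [0.0, 0.0, 0.0]
point = [1.0, 0.5, -0.3]
residual = laplacian_fd(lambda p: G(p, source), point)
```

The double-layer kernel `dG_dn_y` is the quantity integrated against the density $\phi$ in the indirect method.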
Parallel fast BEM and applications
Shape optimization of a DC electromagnet, FEM-BEM coupling
[Figure: geometry of the electromagnet. Labels: $\Omega_o$: focusing optics; $\Omega_e$: air; $\Gamma$: boundary; $\Omega_i$: ferromagnetic yoke; $\Omega_m$: sample; coil $J$.]
L., Postava, Životský: J Magn Magn Mater ’10, Math Comput Simulat ’12
Parallel fast BEM and applications
Acoustics of a railway wheel → H-matrices, ACA/FMM
Parallel fast BEM and applications
Parallel fast BEM using cyclic graph decompositions
[Figure: cyclic graph decomposition; vertices 0 to 12, subgraphs $G_0$, $G_1$, label "rotate".]
L., Kovář, Kovářová, Merta: Numer Alg ’15
Solution to the system of 2.7M DOFs on 273 cores in 16 minutes.
Parallel fast BEM and applications
Electromagnetic forming of plates with Fraunhofer IWU Chemnitz, FEM-BEM
Time 3e−05 s, $T_{half}$ 6e−05 s, $J_{max}$ 9.48183e+06 A/m², $H_{max}$ 61.1986 A/m, $F_{max}$ 720.694 N/m³.
[Figure: field plot in the (r, z) plane; axes r [m] and z [m].]
Parallel fast BEM and applications
Structural health monitoring of aircraft with Honeywell Int.
using 3D anisotropic mixed elements (TD-NNS) by [Pechstein (Sinwel), Schöberl ’12].
[Figure labels: crack, piezo-actuator, piezo-sensor.]
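Boundary integral formulation of sound-hard scattering
Sound-hard scattering
Given the scatterer $\Omega$ and the causal incident wave $u_{inc}$ (satisfying $\ddot{u} - \Delta u = 0$), we look for the scattered field $u$:
\[
\begin{aligned}
\ddot{u} - \Delta u &= 0 && \text{in } \Omega_e \times [0,T],\\
u(\cdot,0) = \dot{u}(\cdot,0) &= 0 && \text{in } \Omega_e,\\
\frac{\partial u}{\partial n} &= -\frac{\partial u_{inc}}{\partial n} && \text{on } \Gamma \times [0,T].
\end{aligned}
\]
[Figure: scatterer $\Omega$ with boundary $\Gamma$ and incident wave $u_{inc}$.]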
Boundary integral formulation of sound-hard scattering
Double-layer indirect boundary integral method
We search for $u$ in the form of the retarded double-layer potential
\[
u(x,t) = -\frac{1}{4\pi} \int_\Gamma \frac{n(y)\cdot(x-y)}{\|x-y\|}
\left( \frac{\phi(y,t-\|x-y\|)}{\|x-y\|^2} + \frac{\dot\phi(y,t-\|x-y\|)}{\|x-y\|} \right) dS(y),
\]
which satisfies the wave equation and the initial conditions. It remains to fulfill the Neumann boundary condition
\[
(W\phi)(x,t) := \lim_{\Omega_e \ni \tilde{x} \to x \in \Gamma} n(x)\cdot\nabla_{\tilde{x}} u(\tilde{x},t) = g(x,t) \quad \text{on } \Gamma\times[0,T],
\]
where $g := -\partial u_{inc}/\partial n$.
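The integrand of the retarded double-layer potential (without the prefactor $-1/(4\pi)$) is the normal derivative with respect to $y$ of the retarded single-layer kernel $\phi(t - r)/r$, $r = \|x - y\|$. The sketch below checks this identity by finite differences for a density that depends on time only. The function names are illustrative, and the smooth test density used in the usage lines is an assumption chosen for the check; it is not causal.

```python
import math

def retarded_dl_kernel(x, y, n, t, phi, dphi):
    """Integrand of the retarded double-layer potential without -1/(4*pi):
    n(y).(x - y)/r * (phi(t - r)/r^2 + dphi(t - r)/r), with r = |x - y|."""
    r = math.dist(x, y)
    dot = sum(ni * (xi - yi) for ni, xi, yi in zip(n, x, y))
    return dot / r * (phi(t - r) / r ** 2 + dphi(t - r) / r)

def retarded_sl_kernel(x, y, t, phi):
    """Retarded single-layer kernel phi(t - r) / r."""
    r = math.dist(x, y)
    return phi(t - r) / r

def normal_derivative_fd(x, y, n, t, phi, h=1e-5):
    """Central finite difference of the single-layer kernel along n at y."""
    yp = [yi + h * ni for yi, ni in zip(y, n)]
    ym = [yi - h * ni for yi, ni in zip(y, n)]
    return (retarded_sl_kernel(x, yp, t, phi)
            - retarded_sl_kernel(x, ym, t, phi)) / (2.0 * h)

# Usage with a hypothetical test density phi(t) = sin(t), dphi(t) = cos(t).
x_eval = [1.0, 0.5, -0.3]
y_src = [0.1, 0.0, 0.2]
normal = [0.0, 0.0, 1.0]
closed_form = retarded_dl_kernel(x_eval, y_src, normal, 2.0, math.sin, math.cos)
finite_diff = normal_derivative_fd(x_eval, y_src, normal, 2.0, math.sin)
```

Both values should agree up to the finite-difference error, which is one way to see where the $1/r^2$ and $1/r$ terms in the potential come from.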
Boundary integral formulation of sound-hard scattering
Weak boundary integral formulation [Bamberger, Ha Duong ’86]
Find $\phi \in V$ such that
\[
a(\xi,\phi) = b(\xi) \quad \forall \xi \in V,
\]
where
\[
\begin{aligned}
a(\xi,\phi) := \int_0^T \int_\Gamma \int_\Gamma \Big[ & \frac{n(x)\cdot n(y)}{4\pi\|x-y\|}\, \dot\xi(x,t)\, \ddot\phi(y,t-\|x-y\|) \\
& + \frac{\operatorname{curl}_\Gamma \dot\xi(x,t)\cdot\operatorname{curl}_\Gamma \phi(y,t-\|x-y\|)}{4\pi\|x-y\|} \Big]\, dS(y)\, dS(x)\, dt,
\end{aligned}
\]
\[
b(\xi) := \int_0^T \int_\Gamma g(x,t)\, \dot\xi(x,t)\, dS(x)\, dt.
\]
Time-domain boundary element method
Discrete ansatz
Replace $V$ by a finite-dimensional subspace $V_{h,\Delta t}$ spanned by the tensor product of $N$ temporal and $M$ spatial basis functions:
\[
\phi_{h,\Delta t}(x,t) := \sum_{l=1}^{N} \sum_{j=1}^{M} \alpha_l^j\, \varphi_j(x)\, b_l(t).
\]
We arrive at the $(NM)\times(NM)$ block linear system
\[
\begin{pmatrix}
A_{1,1} & \dots & A_{1,N}\\
\vdots & \ddots & \vdots\\
A_{N,1} & \dots & A_{N,N}
\end{pmatrix}
\begin{pmatrix} \alpha_1\\ \vdots\\ \alpha_N \end{pmatrix}
=
\begin{pmatrix} b_1\\ \vdots\\ b_N \end{pmatrix},
\]
where
\[
(A_{k,l})_{i,j} := a\big(\varphi_i(x)\, b_k(t),\, \varphi_j(y)\, b_l(t)\big), \qquad
(b_k)_i := b\big(\varphi_i(x)\, b_k(t)\big), \qquad
(\alpha_l)_j := \alpha_l^j.
\]
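Time-domain boundary element method
Matrix: a deeper look
\[
\begin{aligned}
(A_{k,l})_{i,j} ={} & \int_{\operatorname{supp}\varphi_i} \int_{\operatorname{supp}\varphi_j}
\frac{n(x)\cdot n(y)}{4\pi\|x-y\|}\, \varphi_i(x)\, \varphi_j(y)\, \Psi_{k,l}(\|x-y\|)\, dS(y)\, dS(x) \\
& + \int_{\operatorname{supp}\varphi_i} \int_{\operatorname{supp}\varphi_j}
\frac{\operatorname{curl}_\Gamma\varphi_i(x)\cdot\operatorname{curl}_\Gamma\varphi_j(y)}{4\pi\|x-y\|}\, \widetilde\Psi_{k,l}(\|x-y\|)\, dS(y)\, dS(x),
\end{aligned}
\]
where
\[
\Psi_{k,l}(r) := \int_0^T \dot b_k(t)\, \ddot b_l(t-r)\, dt, \qquad
\widetilde\Psi_{k,l}(r) := \int_0^T \dot b_k(t)\, b_l(t-r)\, dt.
\]
Piecewise smooth time-ansatz → expensive quadrature due to the nontrivial intersection of the light cone $\operatorname{supp}\Psi_{k,l}$, $\operatorname{supp}\widetilde\Psi_{k,l}$ with $\operatorname{supp}\varphi_i \times \operatorname{supp}\varphi_j$. [El Gharib ’99], [Stephan, Maischak, Ostermann ’08]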
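Time-domain boundary element method
$C^\infty$-smooth (partition of unity) temporal basis [Sauter, Veit ’12, ’14]
• allows for Sauter-Schwab quadrature over $\operatorname{supp}\varphi_i \times \operatorname{supp}\varphi_j$
• and for higher-order approximation in time.
[Figure: temporal basis functions $b_0$, $b_1$, $b_2$, $b_3$ plotted against t [s].]
\[
\begin{aligned}
(A_{k,l})_{i,j} ={} & \int_{\operatorname{supp}\varphi_i} \int_{\operatorname{supp}\varphi_j}
\frac{n(x)\cdot n(y)}{4\pi\|x-y\|}\, \varphi_i(x)\, \varphi_j(y)\, \Psi_{k,l}(\|x-y\|)\, dS(y)\, dS(x) \\
& + \int_{\operatorname{supp}\varphi_i} \int_{\operatorname{supp}\varphi_j}
\frac{\operatorname{curl}_\Gamma\varphi_i(x)\cdot\operatorname{curl}_\Gamma\varphi_j(y)}{4\pi\|x-y\|}\, \widetilde\Psi_{k,l}(\|x-y\|)\, dS(y)\, dS(x),
\end{aligned}
\]
where now
\[
\Psi_{k,l}(r) := \int_0^T \dot b_k(t)\, \ddot b_l(t-r)\, dt \in C^\infty(\mathbb{R}), \qquad
\widetilde\Psi_{k,l}(r) := \int_0^T \dot b_k(t)\, b_l(t-r)\, dt \in C^\infty(\mathbb{R}).
\]
To accelerate the assembly, $\Psi$ and $\widetilde\Psi$ are replaced by piecewise Chebyshev interpolants.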
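The idea of replacing a smooth function of the distance by a piecewise Chebyshev interpolant can be sketched as follows. This is a minimal illustration under stated assumptions: `psi_stand_in` is a hypothetical smooth function standing in for $\Psi_{k,l}$ (the actual temporal basis is not reproduced here), and the number of pieces and the polynomial degree are arbitrary choices. The interpolant is evaluated with the barycentric formula for Chebyshev nodes of the first kind.

```python
import math

def cheb_nodes(a, b, p):
    """p + 1 Chebyshev nodes of the first kind mapped to [a, b]."""
    return [0.5 * (a + b) + 0.5 * (b - a)
            * math.cos((2 * j + 1) * math.pi / (2 * p + 2))
            for j in range(p + 1)]

def bary_weights(p):
    """Barycentric weights for Chebyshev nodes of the first kind."""
    return [(-1) ** j * math.sin((2 * j + 1) * math.pi / (2 * p + 2))
            for j in range(p + 1)]

def build_piecewise_interpolant(f, a, b, pieces, p):
    """Sample f once at Chebyshev nodes on equal subintervals of [a, b] and
    return a function that evaluates the piecewise degree-p interpolant."""
    width = (b - a) / pieces
    weights = bary_weights(p)
    table = []
    for m in range(pieces):
        lo = a + m * width
        nodes = cheb_nodes(lo, lo + width, p)
        table.append((nodes, [f(s) for s in nodes]))

    def evaluate(r):
        # Locate the subinterval containing r (clamped to the valid range).
        m = min(max(int((r - a) / width), 0), pieces - 1)
        nodes, values = table[m]
        num = 0.0
        den = 0.0
        for xj, fj, wj in zip(nodes, values, weights):
            if r == xj:
                return fj  # r coincides with a node
            c = wj / (r - xj)
            num += c * fj
            den += c
        return num / den

    return evaluate

def psi_stand_in(r):
    """Hypothetical smooth stand-in for Psi_{k,l}(r); not the talk's function."""
    return math.exp(-4.0 * (r - 1.0) ** 2)

# Usage: tabulate once, then evaluate cheaply inside the spatial quadrature.
psi_approx = build_piecewise_interpolant(psi_stand_in, 0.0, 2.0, pieces=4, p=8)
value = psi_approx(0.73)
```

The point of such a table is that the time integral defining $\Psi_{k,l}$ is computed only at the interpolation nodes, while the spatial quadrature calls the cheap interpolant many times.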
Time-domain boundary element method
Convergence of $\|\phi_{h,\Delta t}(x,\cdot) - \phi_{analytical}(x,\cdot)\|_{L^2(0,T)}$ on the sphere [Veit ’12]
$\Omega$ the unit sphere, $\phi_{analytical}$ a spherical harmonic function, $x \in \Gamma$.
[Figure: two log-log plots of the error against the number of timesteps, one for 1st-order and one for 2nd-order time-basis functions, with reference slopes $1/N$ and $1/N^2$.]
Time-domain boundary element method
Matrix structure
The matrix is sparse and has a block-Hessenberg structure.
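One way to see why a block-Hessenberg pattern can arise is the following sketch. It assumes, purely for illustration, that the temporal basis function $b_l$ is supported on $[t_{l-1}, t_{l+1}]$ with equidistant $t_l = l\,\Delta t$; the actual supports of the smooth basis may differ. Under this assumption the block $A_{k,l}$ can be nonzero only if the supports of $b_k(t)$ and $b_l(t - r)$ overlap for some distance $r \in [0, \operatorname{diam}\Gamma]$, that is, if $(k - l + 2)\Delta t > 0$ and $(k - l - 2)\Delta t < \operatorname{diam}\Gamma$.

```python
def block_pattern(num_steps, dt, diameter):
    """Boolean pattern of potentially nonzero blocks A[k][l].

    Assumption (hypothetical): b_l is supported on [(l-1)*dt, (l+1)*dt].
    The supports of b_k(t) and b_l(t - r) overlap for some r in [0, diameter]
    iff (k - l + 2)*dt > 0 and (k - l - 2)*dt < diameter.
    """
    pattern = []
    for k in range(num_steps):
        row = []
        for l in range(num_steps):
            d = k - l
            row.append((d + 2) * dt > 0.0 and (d - 2) * dt < diameter)
        pattern.append(row)
    return pattern

def render(pattern):
    """Render the pattern as text rows, 'x' for a potentially nonzero block."""
    return ["".join("x" if entry else "." for entry in row) for row in pattern]

# Usage: 8 time steps, dt = 0.5, scatterer diameter 1.0 (illustrative values).
rows = render(block_pattern(8, 0.5, 1.0))
```

In this sketch only one block superdiagonal is populated (the Hessenberg part), and the lower bandwidth grows with $\operatorname{diam}\Gamma / \Delta t$. The pattern depends only on $k - l$, which reflects the equidistant time grid.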
Parallelization, preconditioning, numerical experiments
Parallel implementation
• For equidistant time stepping only the blue parts have to be assembled.
• We employ a hybrid MPI-OpenMP model:
  – pairs of triangles corresponding to nonzero entries are evenly distributed to MPI nodes,
  – on each node the assembly (quadrature) is performed using OpenMP.
• We use up to 64 Intel Xeon E5 nodes (1024 cores) of the cluster Anselm, VŠB-TU Ostrava.
Parallelization, preconditioning, numerical experiments
Parallel implementation
Blocks are distributed among MPI processes (nodes). For each block, every process does the following:
1. Precompute the sparsity pattern of the block.
2. Distribute pairs of elements among computational nodes using MPI.
3. On each computational node, assemble its contribution to the block in shared memory using OpenMP.
4. Gather the data on the MPI rank(s) owning the block.
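The steps above can be mimicked in a serial sketch that shows only the data flow: an even split of element pairs, a local assembly per rank, and a gather. It does not use MPI or OpenMP; the ranks are simulated by a loop, and `split_evenly`, `assemble_block`, and the `entry` callback are hypothetical names introduced for illustration.

```python
def split_evenly(items, num_ranks):
    """Split items into num_ranks contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), num_ranks)
    chunks = []
    start = 0
    for rank in range(num_ranks):
        size = base + (1 if rank < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

def assemble_block(pairs, num_ranks, entry):
    """Simulate steps 2 to 4 for one block.

    pairs : element pairs (i, j) from the precomputed sparsity pattern (step 1)
    entry : callback computing the contribution of one pair (stands in for
            the quadrature)
    """
    gathered = {}
    for chunk in split_evenly(pairs, num_ranks):      # step 2: distribute pairs
        # Step 3: local assembly; in a hybrid code this loop would be the
        # candidate for shared-memory (OpenMP-style) parallelization.
        local = {pair: entry(*pair) for pair in chunk}
        gathered.update(local)                        # step 4: stands in for a gather
    return gathered

# Usage with a toy pattern of 15 element pairs on 4 simulated ranks.
toy_pairs = [(i, j) for i in range(5) for j in range(3)]
block = assemble_block(toy_pairs, 4, lambda i, j: 10 * i + j)
```

Splitting the pairs rather than whole rows keeps the chunk sizes within one pair of each other, which is the sense in which the distribution is even in this sketch.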
Parallelization, preconditioning, numerical experiments
Weak parallel scalability of the assembly
Submarine surface decomposed into 5604 triangles, 40 time steps, 80 time DOFs