

  1. Automatic Domain Decomposition for a Black-Box PDE Solver
     Torsten Adolph and Willi Schönauer
     Forschungszentrum Karlsruhe, Institute for Scientific Computing, Karlsruhe, Germany
     torsten.adolph@iwr.fzk.de, willi.schoenauer@iwr.fzk.de
     http://www.fzk.de/iwr
     http://www.rz.uni-karlsruhe.de/rz/docs/FDEM/Literatur

  2. Motivation
     Numerical solution of non-linear systems of partial differential equations (PDEs):
     • Finite Difference Method (FDM)
     • Finite Element Method (FEM)
     • Finite Volume Method (FVM)
     The Finite Difference Element Method (FDEM) combines the advantages of FDM and FEM: an FDM on an unstructured FEM grid.

  3. Objectives
     • Elliptic and parabolic non-linear systems of PDEs
     • 2-D and 3-D with arbitrary geometry
     • Arbitrary non-linear boundary conditions (BCs)
     • Subdomains with different PDEs
     • Robustness
     • Black-box (PDEs/BCs and domain)
     • Error estimate
     • Order control/mesh refinement
     • Efficient parallelization

  4. Difference formulas of order q on an unstructured grid
     Polynomial approach of order q with m coefficients:
     2-D: m = (q+1)·(q+2)/2
     3-D: m = (q+1)·(q+2)·(q+3)/6
     [Figure: rings of nodes around a central node 0 on a triangular grid; for q = 2 the difference star uses the m = 6 nodes 0, 1, ..., 5 = m−1, and larger rings are added for q = 4 and q = 6.]
     Influence polynomial P_{q,i} = 1 at node i, 0 at the other nodes → u_d, u_{x,d}, u_{y,d}, u_{xx,d}, u_{yy,d}, u_{xy,d}
     Search for nodes in rings (up to order q+Δq) → m+r nodes
     Selection of m appropriate nodes by a special algorithm
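The coefficient counts translate one-to-one into code. A minimal Python helper (the function name is my choice), transcribing the two formulas above:

```python
def n_coeff(q, dim=2):
    """Number m of coefficients of a full polynomial of order q."""
    if dim == 2:
        return (q + 1) * (q + 2) // 2
    return (q + 1) * (q + 2) * (q + 3) // 6

# m = 6, 15, 28 for q = 2, 4, 6 in 2-D: the sizes of the stars above.
print([n_coeff(q) for q in (2, 4, 6)])       # [6, 15, 28]
print([n_coeff(q, dim=3) for q in (2, 4)])   # [10, 35]
```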

  5. Discretization error estimate
     e.g. for u_x:  u_x = u_{x,d,q} + \bar{d}_{x,q} = u_{x,d,q+2} + \bar{d}_{x,q+2}
     → \bar{d}_{x,q} = u_{x,d,q+2} − u_{x,d,q} + \bar{d}_{x,q+2}, giving the estimate d_{x,q} = u_{x,d,q+2} − u_{x,d,q}
     Error equation: Pu ≡ P(t, x, y, u, u_t, u_x, u_y, u_{xx}, u_{yy}, u_{xy}) = 0
     Linearization by Newton-Raphson; discretization with error estimates d_t, d_x, ... and linearization in d_t, d_x, ... →
     Δu_d = Δu_{Pu} + Δu_{D_t} + Δu_{D_x} + Δu_{D_y} + Δu_{D_xy}   (level of solution)
          = Q_d^{-1} · [(Pu)_d + D_t + {D_x + D_y + D_xy}]          (level of equation)
     Only apply the Newton correction Δu_{Pu}:  Q_d · Δu_{Pu} = (Pu)_d
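The same order-comparison idea as a hedged one-dimensional illustration (function, point and step size are my choices, not from the talk): the difference between the order-2 and the order-4 discretization of u_x estimates the discretization error of the order-2 formula.

```python
import math

h, x = 0.1, 1.0
u = math.sin

ux_q  = (u(x + h) - u(x - h)) / (2 * h)                                   # order q = 2
ux_q2 = (-u(x + 2*h) + 8*u(x + h) - 8*u(x - h) + u(x - 2*h)) / (12 * h)   # order q+2 = 4

d_x = ux_q2 - ux_q              # error estimate d_{x,q}
true_err = math.cos(x) - ux_q   # exact error, since u_x = cos(x)
print(d_x, true_err)            # ~8.98e-04 vs. ~9.00e-04
```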

  6. Problem: black-box for PDEs and domain
     User input:
     • any system of PDEs
     • any unstructured FEM grid, 2-D and 3-D
     • local mesh refinement
     • (sliding) dividing lines
     • subdomains with different PDEs
     [Figure: domain split by dividing lines into subdomains with different PDE systems P(1), P(2), P(3).]
     Solution: 1-D DD with overlap

  7. Re-sorting of the nodes
     [Figure: node numbering of an 8×8 example domain as delivered by the mesh generator (left) and the objective, a numbering obtained by re-sorting for the x-coordinate (right), distributed over 4 processors.]
     Difference star with the mesh-generator numbering: all 4 processors involved.
     Difference star with the re-sorted numbering: only 2 (neighbouring) processors involved.
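A minimal sketch of the renumbering in Python (the function name and tie-breaking by y are my assumptions): sort the node numbers by x-coordinate, and a compact difference star can then only reference nodes on neighbouring processors.

```python
def resort_by_x(coords):
    """Renumber nodes so that node numbers ascend with the x-coordinate
    (ties broken by y). Returns old_to_new: old node number -> new one.
    Cutting the new numbering into np equal slabs yields the 1-D DD."""
    order = sorted(range(len(coords)),
                   key=lambda i: (coords[i][0], coords[i][1]))
    old_to_new = [0] * len(coords)
    for new, old in enumerate(order):
        old_to_new[old] = new
    return old_to_new

# Example: 4 nodes as numbered by a mesh generator, renumbered by x.
print(resort_by_x([(0.9, 0.0), (0.1, 0.0), (0.5, 1.0), (0.5, 0.0)]))
# -> [3, 0, 2, 1]
```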

  8. Algorithm for global sorting of the nodes I
     • Start with local sorting of the nodes (heapsort)
     • Needs 2·(np−1) steps on np processors
     • Steps i ∈ {1, ..., np−1}: sorting, until the first sorted nodes are received by proc. np
     • Steps i ∈ {np, ..., 2·(np−1)}: sorting; proc. np sends sorted nodes to processors 1 to np−1
     • Always send to the right neighbour processor (except for processor np)
     • Always receive from the left neighbour processor
     • Step after receiving nodes: merging (= local sorting)
     • Up to np/2 processors active in parallel
     • Communication via MPI
     A serial sketch of the pipeline is given below.
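A serial Python simulation of the pipeline, under the assumption that phase one merges all nodes onto proc. np and phase two deals the globally sorted sequence back out in slabs; the simulation runs the merges one after another, whereas on the machine up to np/2 processors work concurrently.

```python
import heapq

def global_sort(chunks):
    """Serial simulation of the pipelined global sort.
    chunks[ip] holds the node keys (e.g. x-coordinates) on processor ip."""
    np_ = len(chunks)
    # Local sorting first (heapsort in FDEM; Python's sort stands in).
    local = [sorted(c) for c in chunks]
    # Steps 1 .. np-1: send the sorted nodes to the right neighbour,
    # which merges them with its own (merge = local sort of sorted runs).
    for ip in range(np_ - 1):
        local[ip + 1] = list(heapq.merge(local[ip], local[ip + 1]))
        local[ip] = []
    # Steps np .. 2*(np-1): proc. np sends every processor its slab
    # of the globally sorted sequence.
    merged, k = local[-1], -(-len(local[-1]) // np_)   # k = ceil(len/np)
    return [merged[i * k:(i + 1) * k] for i in range(np_)]

print(global_sort([[5, 1], [7, 2], [8, 0], [6, 3]]))
# -> [[0, 1], [2, 3], [5, 6], [7, 8]]
```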

  9. Algorithm for global sorting of the nodes II
     Formal description of step i (illustration on the next slide); ip is the processor number:
     i odd:
       merge:   (i+3)/2 ≤ ip ≤ min(i, np)
       send:    (i+1)/2 ≤ ip ≤ min(i, np)
       receive: (i+3)/2 ≤ ip ≤ min(i, np)
     i even:
       merge:   i/2 + 1 ≤ ip ≤ min(i, np)
       send:    i/2 + 1 ≤ ip ≤ min(i, np)
       receive: i/2 + 2 ≤ ip ≤ min(i, np)
     receive additionally: ip = i+1 for i ∈ {1, ..., np−1}, and ip = i−np+1 for i ∈ {np, ..., 2·(np−1)}
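A small helper that evaluates these conditions for a given step (the translation of the description into code is mine and should be checked against the original formulation):

```python
def active_procs(i, np_):
    """Processors that merge, send and receive in step i,
    following the formal description above (ip is 1-based)."""
    if i % 2 == 1:   # i odd
        lo_merge, lo_send, lo_recv = (i + 3) // 2, (i + 1) // 2, (i + 3) // 2
    else:            # i even
        lo_merge, lo_send, lo_recv = i // 2 + 1, i // 2 + 1, i // 2 + 2
    hi = min(i, np_)
    merge = set(range(lo_merge, hi + 1))
    send = set(range(lo_send, hi + 1))
    recv = set(range(lo_recv, hi + 1))
    # Additional receiver: the next processor in the pipeline (phase 1)
    # or the processor getting its final slab from proc. np (phase 2).
    recv.add(i + 1 if i <= np_ - 1 else i - np_ + 1)
    return merge, send, recv

for i in (1, 2, 3):
    print(i, active_procs(i, np_=8))
```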

  10. Illustration of the global re-sorting algorithm
      [Figure: schedule of the algorithm for np = 8. For each of the 14 steps and each of the 8 processors the action is shown: S (send), M (merge), R (receive). The sorted nodes travel in a pipeline towards processor 8, which from step np = 8 on returns the final slabs to processors 1 to 7.]

  11. Distribution of the elements
      [Figure: the mesh file written by the mesh generator is distributed over the processors ip_1, ..., ip_np; the (static) old data stay in place while the new data are shifted in a ring to the right.]
      The processor that owns the leftmost node of an element becomes the element owner.
      → Execution of 2 ring shifts:
      1st: determination of the owner of the leftmost node
      2nd: storing of the element numbers on the owning processors
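A minimal mpi4py sketch of the first ring shift (mpi4py stands in for the MPI calls; the toy data and the ownership rule node % np are illustrative assumptions). Each rank inspects the chunk it currently holds, claims the elements whose leftmost node it owns, and passes the chunk to the right neighbour; after np shifts every chunk has visited every rank.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
ip, np_ = comm.Get_rank(), comm.Get_size()
right, left = (ip + 1) % np_, (ip - 1) % np_

# Toy chunk of the mesh file: (element number, its leftmost node).
chunk = [(10 * ip + k, 3 * ip + k) for k in range(3)]
mine = []

for _ in range(np_):
    # Claim the elements whose leftmost node this rank owns; the second
    # ring shift, which stores the element numbers on the owners, works
    # the same way.
    mine += [e for (e, node) in chunk if node % np_ == ip]
    chunk = comm.sendrecv(chunk, dest=right, source=left)

print(f"rank {ip} owns elements {mine}")
```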

  12. Distribution of the boundary/DL/SDL nodes (DL: dividing line, SDL: sliding dividing line)
      [Figure: as for the elements, the data from the mesh file are shifted in a ring to the right; the (static) old data stay in place.]
      Compare the received node numbers of the boundary/DL/SDL nodes to the node numbers of the own nodes
      → store the matching node numbers in the arrays for the boundary/DL/SDL nodes.
      Send the non-matching node numbers on to the right neighbour processor.
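One matching step as a minimal sketch (the names are illustrative):

```python
def match_ring_step(incoming, own_nodes):
    """Keep the boundary/DL/SDL node numbers this processor owns;
    the rest is forwarded to the right neighbour in the next step."""
    own = set(own_nodes)
    matched = [n for n in incoming if n in own]
    forward = [n for n in incoming if n not in own]
    return matched, forward

matched, forward = match_ring_step([3, 17, 42], own_nodes=range(10, 20))
print(matched, forward)   # [17] [3, 42]
```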

  13. Illustration of the 1-D DD (np = 4)
      [Figure: example domains cut by the 1-D domain decomposition into 4 strips in the x-direction, one strip per processor.]

  14. Overlap
      Computation of the right-hand side and of the matrix Q_d is local (without communication)
      → store the necessary nodes and elements of the neighbour processors on proc. ip.
      [Figure: the nodes necessary on proc. ip are its own nodes plus an overlap strip on procs ip−1 and ip+1, the overlap processors of proc. ip.]
      Width of the overlap: compute the mean edge length h_mean, choose a safety factor a_overlap, then
      1st criterion (enough nodes): x_{overlap,1} = 0.5 · a_overlap · h_mean · (√(m(q+Δq)) − 1), where m(q+Δq) is the number of coefficients for order q+Δq
      2nd criterion (enough rings): x_{overlap,2} = a_overlap · h_mean · (q+Δq)
      x_overlap = max(x_{overlap,1}, x_{overlap,2})
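The two criteria as a direct transcription (the default safety factor a_overlap = 1.5 is an assumption, not from the talk):

```python
from math import sqrt

def overlap_width(h_mean, q, dq, a_overlap=1.5, dim=2):
    """Overlap width x_overlap from the two criteria above.
    m is the number of polynomial coefficients for order q + dq."""
    qq = q + dq
    if dim == 2:
        m = (qq + 1) * (qq + 2) // 2
    else:
        m = (qq + 1) * (qq + 2) * (qq + 3) // 6
    x1 = 0.5 * a_overlap * h_mean * (sqrt(m) - 1)   # 1st: enough nodes
    x2 = a_overlap * h_mean * qq                    # 2nd: enough rings
    return max(x1, x2)

print(overlap_width(h_mean=0.05, q=2, dq=2))   # order control up to q+2
```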

  15. Illustration of the overlap
      x_left = x_min − x_overlap,  x_right = x_max + x_overlap
      [Figure: processor strips ip2 to ip7; for proc. 4 the own interval [x_min,4, x_max,4] is extended by x_overlap on both sides to [x_left,4, x_right,4].]

  16. Bandwidth optimization
      [Figure: matrix sparsity pattern before and after re-sorting.]
      Bandwidth:
        Full                  2253
        Before re-sorting     2154
        After re-sorting       185
        With SSP BO            112
      SSP BO: own improved Cuthill-McKee bandwidth optimizer
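FDEM's SSP BO is its own improved Cuthill-McKee; as a hedged stand-in, SciPy's reverse Cuthill-McKee shows the effect of such a renumbering on a scrambled grid matrix (grid size and seed are arbitrary choices):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.csgraph import reverse_cuthill_mckee

def bandwidth(A):
    """Half bandwidth of the sparse matrix A: max |i - j| over nonzeros."""
    i, j = A.nonzero()
    return int(np.abs(i - j).max())

# 5-point stencil pattern on a 30x30 grid, scrambled by a random
# numbering (standing in for an unfavourable mesh-generator numbering).
n, N = 30, 30 * 30
A = diags([4 * np.ones(N), -np.ones(N - 1), -np.ones(N - 1),
           -np.ones(N - n), -np.ones(N - n)],
          [0, 1, -1, n, -n], format="csr")
p = np.random.default_rng(0).permutation(N)
A = A[p][:, p]

perm = reverse_cuthill_mckee(A, symmetric_mode=True)
B = A[perm][:, perm]
print(bandwidth(A), bandwidth(B))   # large before, near the grid width after
```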

  17. Summary
      • Black-box PDE solver FDEM (URL: http://www.rz.uni-karlsruhe.de/rz/docs/FDEM/Literatur)
      • User input: any PDE system, any domain, 2-D and 3-D
      • Global re-sorting algorithm for the nodes
      • Elements sent in ring shifts to the owning processors
      → 1-D DD with overlap
      • Computation of the linear system of equations purely local
      • Efficient parallelization with MPI
      • Built-in bandwidth optimizer
      This 1-D DD is simple, robust and efficient!
