Partitioned Multi-Physics on Distributed Data via preCICE



SLIDE 1

Universität Stuttgart, Technische Universität München

SPPEXA APM 2016

Partitioned Multi-Physics on Distributed Data via preCICE

Hans-Joachim Bungartz, Florian Lindner, Miriam Mehl, Klaudius Scheufele, Alexander Shukaev, Benjamin Uekermann

Universität Stuttgart, Technische Universität München, January 25, 2016

Uekermann et al.: Partitioned Multi-Physics on Distributed Data via preCICE SPPEXA APM 2016, January 25, 2016 1

SLIDE 2

Multi-Physics and Exa-Scale

◮ many exciting applications need multi-physics
◮ more compute power ⇒ more physics?
◮ many sophisticated, scalable, single-physics, legacy codes
◮ our goal: minimally invasive coupling, no deterioration of scalability

SLIDE 3

Our Example: Fluid-Structure-Acoustic Interaction

Structure - Fluid - Acoustic

◮ FEAP - FASTEST - Ateles
◮ OpenFOAM - OpenFOAM - Ateles
◮ glue-software: preCICE

SLIDE 4

Our Example: Fluid-Structure-Acoustic Interaction

[Figure: coupling diagram of fluid (F), structure (S), and acoustic (A) participants over several time steps, with many small acoustic steps per fluid-structure step]

◮ implicit or explicit coupling
◮ subcycling

SLIDE 5

Agenda

This talk: glue-software preCICE
Next talk (Verena Krupp): application perspective

SLIDE 6

Agenda

This talk: glue-software preCICE
Next talk (Verena Krupp): application perspective

1. Very brief introduction to preCICE
2. Realization on Distributed Data
3. Performance on Distributed Data
SLIDE 7

preCICE

◮ precise Code Interaction Coupling Environment
◮ developed in Munich and Stuttgart
◮ library approach, minimally invasive
◮ high-level API in C++, C, Fortran77/90/95, Fortran2003
◮ once adapter written ⇒ plug and play
◮ https://github.com/precice/precice

SLIDE 8

Big Picture

[Figure: architecture of a coupled run. Each participant runs in parallel; every rank links the adapter. Rank 1 of each participant (Solver A.1, Solver B.1) acts as master and handles steering, equation coupling, communication, and data mapping; ranks A.2 to A.N and B.2 to B.M act as slaves.]

SLIDE 9

Coupled Solvers

Ateles (APES)   CF, A       in-house (U Siegen)
Alya System     IF, S       in-house (BSC)
Calculix        S           open-source (A*STAR)
CARAT           S           in-house (TUM STATIK)
COMSOL          S           commercial
EFD             IF          in-house (TUM SCCS)
FASTEST         IF          in-house (TU Darmstadt)
Fluent          IF          commercial
OpenFOAM        CF, IF, S   open-source (TU Delft)
Peano           IF          in-house (TUM SCCS)
SU2             CF          open-source
SLIDE 10

preCICE API

turn_on()
while time_loop ≠ done do
    solve_timestep()
end while
turn_off()

SLIDE 11

preCICE API

turn_on()
precice_create("FLUID", "precice_config.xml", index, size)
precice_initialize()
while time_loop ≠ done do
    solve_timestep()
end while
turn_off()
precice_finalize()

SLIDE 12

preCICE API

turn_on()
precice_create("FLUID", "precice_config.xml", index, size)
precice_initialize()
while time_loop ≠ done do
    while precice_action_required(readCheckPoint) do
        solve_timestep()
        precice_advance()
    end while
end while
turn_off()
precice_finalize()

SLIDE 13

preCICE API

turn_on()
precice_create("FLUID", "precice_config.xml", index, size)
precice_set_vertices(meshID, N, pos(dim*N), vertIDs(N))
precice_initialize()
while time_loop ≠ done do
    while precice_action_required(readCheckPoint) do
        solve_timestep()
        precice_advance()
    end while
end while
turn_off()
precice_finalize()

SLIDE 14

preCICE API

turn_on()
precice_create("FLUID", "precice_config.xml", index, size)
precice_set_vertices(meshID, N, pos(dim*N), vertIDs(N))
precice_initialize()
while time_loop ≠ done do
    while precice_action_required(readCheckPoint) do
        precice_write_bvdata(stresID, N, vertIDs, stres(dim*N))
        solve_timestep()
        precice_advance()
        precice_read_bvdata(displID, N, vertIDs, displ(dim*N))
    end while
end while
turn_off()
precice_finalize()
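To make the calling pattern concrete, here is a runnable Python sketch with a stub class standing in for the library. `PreciceStub` and all method names are hypothetical, mirroring the slide's pseudocode rather than the real preCICE bindings; the stub simply pretends three coupling iterations are required, and the outer time loop is omitted for brevity.

```python
# Hypothetical stand-in for the preCICE library, only to make the call
# sequence on the slide executable; NOT the real API.
class PreciceStub:
    def __init__(self, name, config, rank, size):
        self.calls = []
        self.remaining = 3                  # pretend: 3 coupling iterations

    def set_vertices(self, mesh_id, n, pos, vert_ids):
        self.calls.append("set_vertices")

    def initialize(self):
        self.calls.append("initialize")

    def action_required(self, action):
        return self.remaining > 0           # keep iterating until "converged"

    def write_bvdata(self, data_id, n, vert_ids, values):
        self.calls.append("write")

    def advance(self):
        self.calls.append("advance")
        self.remaining -= 1

    def read_bvdata(self, data_id, n, vert_ids, values):
        self.calls.append("read")

    def finalize(self):
        self.calls.append("finalize")

precice = PreciceStub("FLUID", "precice_config.xml", 0, 1)
precice.set_vertices(0, 2, [0.0] * 6, [0, 1])   # register coupling vertices
precice.initialize()
while precice.action_required("readCheckPoint"):
    precice.write_bvdata(1, 2, [0, 1], [0.0] * 6)   # send stresses
    # ... solve_timestep() would run the solver here ...
    precice.advance()
    precice.read_bvdata(2, 2, [0, 1], [0.0] * 6)    # receive displacements
precice.finalize()
print(precice.calls)
```

The point of the pattern is that the solver's main loop survives almost unchanged: the adapter only wraps the timestep with write/advance/read calls.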

SLIDE 15

preCICE Config
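No configuration text was captured for this slide. Purely as an illustration, a minimal FSI-style configuration might look as follows; the tag names are taken from the present-day preCICE (v2) documentation and will differ from the 2016 release shown in the talk, so treat every element, attribute, and value here as an assumption rather than as the slide's content.

```xml
<?xml version="1.0"?>
<precice-configuration>
  <solver-interface dimensions="3">
    <data:vector name="Stress"/>
    <data:vector name="Displacement"/>

    <mesh name="FluidMesh">
      <use-data name="Stress"/>
      <use-data name="Displacement"/>
    </mesh>
    <mesh name="StructureMesh">
      <use-data name="Stress"/>
      <use-data name="Displacement"/>
    </mesh>

    <participant name="FLUID">
      <use-mesh name="FluidMesh" provide="yes"/>
      <use-mesh name="StructureMesh" from="STRUCTURE"/>
      <write-data name="Stress" mesh="FluidMesh"/>
      <read-data name="Displacement" mesh="FluidMesh"/>
      <mapping:nearest-neighbor direction="write"
        from="FluidMesh" to="StructureMesh" constraint="conservative"/>
      <mapping:nearest-neighbor direction="read"
        from="StructureMesh" to="FluidMesh" constraint="consistent"/>
    </participant>
    <participant name="STRUCTURE">
      <use-mesh name="StructureMesh" provide="yes"/>
      <write-data name="Displacement" mesh="StructureMesh"/>
      <read-data name="Stress" mesh="StructureMesh"/>
    </participant>

    <m2n:sockets from="FLUID" to="STRUCTURE"/>

    <coupling-scheme:serial-implicit>
      <participants first="FLUID" second="STRUCTURE"/>
      <max-time value="1.0"/>
      <time-window-size value="1e-3"/>
      <max-iterations value="50"/>
      <exchange data="Stress" mesh="StructureMesh" from="FLUID" to="STRUCTURE"/>
      <exchange data="Displacement" mesh="StructureMesh" from="STRUCTURE" to="FLUID"/>
      <acceleration:IQN-ILS>
        <data name="Displacement" mesh="StructureMesh"/>
        <initial-relaxation value="0.1"/>
      </acceleration:IQN-ILS>
    </coupling-scheme:serial-implicit>
  </solver-interface>
</precice-configuration>
```

Note how the config, not the adapter code, decides the coupling scheme, mapping, and acceleration; that is what makes the coupled setup "plug and play".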

SLIDE 16

Communication on Distributed Data

[Figure: parallel partitions of solver A and solver B meeting at the coupling surface]

SLIDE 17

Communication on Distributed Data

[Figure: acceptor ranks of A and requestor ranks of B establishing 1:N connections in arbitrary order]

◮ kernel: 1:N communication based on either TCP/IP or MPI ports
  ⇒ no deadlocks at initialization (independent of order at B)
◮ asynchronous communication (preferred over threads)
  ⇒ no deadlocks at communication
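The deadlock-free initialization idea can be illustrated with a toy TCP version in Python (illustrative only; the real kernel is C++ and also supports MPI ports, and all names here are mine): requestors connect in arbitrary order, and the acceptor serves them as they arrive.

```python
import socket
import threading

# One acceptor serves N requestor ranks, in whatever order they show up,
# so initialization cannot deadlock.
N = 3
received = []
srv = socket.socket()
srv.bind(("127.0.0.1", 0))          # let the OS pick a free port
srv.listen(N)                       # listening BEFORE requestors start
port = srv.getsockname()[1]

def acceptor():
    for _ in range(N):
        conn, _ = srv.accept()      # accept connections in ANY order
        received.append(int(conn.recv(16).decode()))
        conn.close()

def requestor(rank):
    c = socket.socket()
    c.connect(("127.0.0.1", port))
    c.send(str(rank).encode())      # announce this rank's identity
    c.close()

t = threading.Thread(target=acceptor)
t.start()
for rank in (2, 0, 1):              # deliberately scrambled start order
    threading.Thread(target=requestor, args=(rank,)).start()
t.join()
srv.close()
print(sorted(received))  # → [0, 1, 2]
```

Because the acceptor never assumes an arrival order, scrambling the requestor start order changes nothing, which is exactly the property claimed on the slide.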

SLIDE 18

Interpolation on Distributed Data

Projection-based Interpolation

◮ first or second order
◮ example: consistent mapping from B to A
◮ parallelization: almost trivial

[Figure: nearest neighbour mapping vs. nearest projection mapping between meshes B and A]
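The "almost trivial" consistent nearest-neighbour variant can be sketched in a few lines of Python (an illustrative dense-distance version; the function name and data layout are mine, not preCICE's): every vertex of mesh A looks up its nearest vertex on mesh B and copies that value.

```python
import numpy as np

def nearest_neighbor_consistent(pos_b, val_b, pos_a):
    # Consistent NN mapping: each A-vertex copies the value of its nearest
    # B-vertex (dense pairwise distance matrix, for clarity only).
    d = np.linalg.norm(pos_a[:, None, :] - pos_b[None, :, :], axis=2)
    return val_b[np.argmin(d, axis=1)]

# 1D example: B carries values 10, 20, 30 at x = 0, 1, 2
pos_b = np.array([[0.0], [1.0], [2.0]])
val_b = np.array([10.0, 20.0, 30.0])
pos_a = np.array([[0.1], [1.6]])
print(nearest_neighbor_consistent(pos_b, val_b, pos_a))  # → [10. 30.]
```

Parallelization is almost trivial because each A-vertex only needs the B-vertices in its geometric neighbourhood, with no global solve involved.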

SLIDE 19

Interpolation on Distributed Data

Projection-based Interpolation

◮ first or second order
◮ example: consistent mapping from B to A
◮ parallelization: almost trivial

[Figure: nearest neighbour mapping vs. nearest projection mapping between meshes B and A]

Radial Basis Function Interpolation

◮ higher order
◮ parallelization: far from trivial (realized with PETSc)
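For intuition, a serial RBF interpolation can be sketched as follows (illustrative only: a Gaussian basis, a dense solve, and names of my choosing; preCICE assembles and solves the distributed system with PETSc, which is where the parallelization difficulty lives).

```python
import numpy as np

def rbf_interpolate(pos_b, val_b, pos_a, shape=1.0):
    # Gaussian radial basis functions centred at the B-vertices; the
    # interpolation weights w solve the dense system  Phi(B, B) w = val_b,
    # after which values at A are  Phi(A, B) w.
    phi = lambda r: np.exp(-(shape * r) ** 2)
    w = np.linalg.solve(phi(np.abs(pos_b[:, None] - pos_b[None, :])), val_b)
    return phi(np.abs(pos_a[:, None] - pos_b[None, :])) @ w

pos_b = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
val_b = pos_b ** 2                    # sample a smooth function on mesh B
print(rbf_interpolate(pos_b, val_b, np.array([0.25, 1.75])))
```

Unlike the nearest-neighbour mapping, every B-vertex contributes to every interpolated value through the weight solve, so the coupling surface cannot be treated rank-locally; that global system is what makes the parallel version far from trivial.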

SLIDE 20

Fixed-Point Acceleration on Distributed Data

Anderson Acceleration (IQN-ILS)

Find x ∈ D ⊂ R^n with H(x) = x,  H : D → R^n

initial value x^0
x̃^0 = H(x^0)  and  R^0 = x̃^0 − x^0
x^1 = x^0 + 0.1 · R^0
for k = 1, 2, … do
    x̃^k = H(x^k)  and  R^k = x̃^k − x^k
    V^k = [ΔR^k_0, …, ΔR^k_{k−1}]  with  ΔR^k_i = R^i − R^k
    W^k = [Δx̃^k_0, …, Δx̃^k_{k−1}]  with  Δx̃^k_i = x̃^i − x̃^k
    decompose V^k = Q^k U^k
    solve the first k lines of  U^k α = −(Q^k)^T R^k
    Δx̃ = W^k α
    x^{k+1} = x̃^k + Δx̃
end for
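The scheme above can be sketched in Python. This toy version recomputes a fresh QR decomposition in every iteration and caps the number of difference columns at the problem dimension so the triangular solve stays well-posed; both are simplifications of mine (the incremental QR update used in practice follows on the next slide).

```python
import numpy as np

def iqn_ils(H, x0, iters=30, tol=1e-10):
    # Quasi-Newton inverse least-squares (Anderson) acceleration for the
    # fixed point H(x) = x, following the slide's notation.
    xt = H(x0)
    Xt, R = [xt], [xt - x0]
    x = x0 + 0.1 * R[0]                      # relaxed first step
    for k in range(1, iters + 1):
        xt = H(x)
        Rk = xt - x
        Xt.append(xt)
        R.append(Rk)
        if np.linalg.norm(Rk) < tol:
            break
        # difference columns, capped at dim(x0) -- a simplification
        cols = range(max(0, k - len(x0)), k)
        V = np.column_stack([R[i] - Rk for i in cols])
        W = np.column_stack([Xt[i] - xt for i in cols])
        Q, U = np.linalg.qr(V)               # decompose V = Q U
        alpha = np.linalg.solve(U, -Q.T @ Rk)
        x = xt + W @ alpha                   # x_{k+1} = x~_k + W alpha
    return x

# toy contraction: componentwise cosine; fixed point ~ 0.739085
x = iqn_ils(np.cos, np.array([0.0, 1.0]))
print(np.round(x, 6))
```

On this toy problem the accelerated iteration converges in a handful of steps, whereas the plain fixed-point iteration x ← cos(x) would still carry an error of about 3·10⁻⁴ after 20 steps.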

SLIDE 21

Fixed-Point Acceleration on Distributed Data

[Figure: inserting a new column v into the QR decomposition V̂ = Q̂ R̂, yielding V = Q R]

Insert Column

In:  Q̂ ∈ R^{n×m},  R̂ ∈ R^{m×m},  v ∈ R^n
Out: Q ∈ R^{n×(m+1)},  R ∈ R^{(m+1)×(m+1)}

for j = 1 … m do
    r(j) = Q̂(:, j)^T v
    v = v − r(j) · Q̂(:, j)
end for
r(m+1) = ‖v‖_2
Q(:, m+1) = r(m+1)^{−1} · v
R = [ r | (R̂ stacked on a zero row) ]
Givens rotations G_{i,i+1} s.t. R = G_{1,2} … G_{m,m+1} R is an upper triangle
Q = Q G^T_{m,m+1} … G^T_{1,2}
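A runnable sketch of this update, assuming (as in the IQN-ILS context) that the new column v is inserted in front of V̂; the function name and the 'h'-for-hat spelling are mine.

```python
import numpy as np

def qr_insert_first(Qh, Rh, v):
    # Update the thin QR factorisation when a new column v is inserted in
    # front: [v, Vh] = Q R, given Vh = Qh Rh.
    n, m = Qh.shape
    r = np.empty(m + 1)
    v = v.astype(float).copy()
    for j in range(m):                       # modified Gram-Schmidt step
        r[j] = Qh[:, j] @ v
        v -= r[j] * Qh[:, j]
    r[m] = np.linalg.norm(v)
    Q = np.column_stack([Qh, v / r[m]])
    R = np.zeros((m + 1, m + 1))
    R[:, 0] = r                              # new first column
    R[:m, 1:] = Rh                           # old R shifted right, zero row below
    for i in range(m, 0, -1):                # Givens rotations G_{m,m+1} ... G_{1,2}
        a, b = R[i - 1, 0], R[i, 0]
        h = np.hypot(a, b)
        c, s = a / h, b / h
        G = np.eye(m + 1)
        G[i - 1, i - 1] = c; G[i - 1, i] = s
        G[i, i - 1] = -s;    G[i, i] = c
        R = G @ R                            # restore the upper triangle
        Q = Q @ G.T                          # keep Q R invariant, Q orthonormal
    return Q, R

Vh = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
Qh, Rh = np.linalg.qr(Vh)
v = np.array([1.0, 2.0, 3.0])
Q, R = qr_insert_first(Qh, Rh, v)
print(np.allclose(Q @ R, np.column_stack([v, Vh])))  # → True
```

The payoff is cost: inserting a column this way is O(nm) instead of the O(nm²) of recomputing the decomposition from scratch, which matters when the factorisation is updated every coupling iteration.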

SLIDE 22

Performance Tests, Initialization

#DOFs at interface: 2.6 · 10⁵, strong scaling

[Plot: initialization time (ms, log scale, 10⁰ to 10⁵) broken down into phases I-V, communication (com), IQN-ILS (ILS), and nearest-neighbour mapping (NN), for p = 128, 256, 512, 1024, 2048]

SLIDE 23

Performance Tests, Work per Timestep

#DOFs at interface: 2.6 · 10⁵, strong scaling

[Plot: time per timestep (ms) for communication (com), IQN-ILS (ILS), and nearest-neighbour mapping (NN), for p = 128, 256, 512, 1024, 2048]

SLIDE 24

Performance Tests, Traveling Pulse

DG solver Ateles, Euler equations; #DOFs: total 5.9 · 10⁹, at interface 1.1 · 10⁷; NN mapping and communication; strong scaling from 128 to 16384 processors per participant.

[Figure: traveling-pulse setup with two domains, Ateles Left and Ateles Right, coupled via preCICE]

Joint work with V. Krupp et al. (Universität Siegen)

SLIDE 25

Performance Tests, Work per Timestep

[Plot: time per timestep (ms, log scale) over NPP = 2⁷ … 2¹⁴ processes per participant.
 Compute (Ateles): 68303, 35590, 18395, 8389, 4070, 1954, 1002, 544
 Advance (preCICE): 606, 175, 28, 59, 15, 20, 18, 5]

SLIDE 26

Summary

[Plot: time per timestep, Compute (Ateles) vs. Advance (preCICE); same data as Slide 25]

SLIDE 27

Summary

Structure - Fluid - Acoustic

[Plot: time per timestep, Compute (Ateles) vs. Advance (preCICE); same data as Slide 25]

SLIDE 28

Summary

Structure - Fluid - Acoustic

[Figure: preCICE architecture with master and slave ranks; same as Slide 8]

[Plot: time per timestep, Compute (Ateles) vs. Advance (preCICE); same data as Slide 25]

SLIDE 29

Summary

Structure - Fluid - Acoustic

[Figure: preCICE architecture with master and slave ranks; same as Slide 8]

[Code: preCICE API calling sequence; same as Slide 14]

[Plot: time per timestep, Compute (Ateles) vs. Advance (preCICE); same data as Slide 25]

SLIDE 30

Summary

Structure - Fluid - Acoustic

[Figure: preCICE architecture with master and slave ranks; same as Slide 8]

[Code: preCICE API calling sequence; same as Slide 14]

[Plot: time per timestep, Compute (Ateles) vs. Advance (preCICE); same data as Slide 25]