Partitioned Multi-Physics on Distributed Data via preCICE


  1. Partitioned Multi-Physics on Distributed Data via preCICE
     Hans-Joachim Bungartz, Florian Lindner, Miriam Mehl, Klaudius Scheufele, Alexander Shukaev, Benjamin Uekermann
     Universität Stuttgart, Technische Universität München
     SPPEXA APM 2016, January 25, 2016

  2. Multi-Physics and Exa-Scale
     ◮ many exciting applications need multi-physics
     ◮ more compute power ⇒ more physics?
     ◮ many sophisticated, scalable, single-physics legacy codes
     ◮ our goal: minimally invasive coupling, no deterioration of scalability

  3. Our Example: Fluid-Structure-Acoustic Interaction
     (figure: acoustic, fluid, and structure domains)
     ◮ FEAP - FASTEST - Ateles
     ◮ OpenFOAM - OpenFOAM - Ateles
     ◮ glue software: preCICE

  4. Our Example: Fluid-Structure-Acoustic Interaction
     (figure: coupling timeline with acoustic (A), fluid (F), and structure (S) time steps)
     ◮ implicit or explicit coupling
     ◮ subcycling

  5. Agenda
     This talk: glue software preCICE
     Next talk (Verena Krupp): application perspective

  6. Agenda
     This talk: glue software preCICE
     Next talk (Verena Krupp): application perspective
     1. Very brief introduction to preCICE
     2. Realization on Distributed Data
     3. Performance on Distributed Data

  7. preCICE
     ◮ precise Code Interaction Coupling Environment
     ◮ developed in Munich and Stuttgart
     ◮ library approach, minimally invasive
     ◮ high-level API in C++, C, Fortran77/90/95, Fortran2003
     ◮ once an adapter is written ⇒ plug and play
     ◮ https://github.com/precice/precice

  8. Big Picture
     (figure: architecture diagram; the two participants run as parallel processes SOLVER A.1 ... A.N and SOLVER B.1 ... B.M, each wrapped by an ADAPTER; one process per participant acts as MASTER, the remaining ones as SLAVES; preCICE provides EQUATION COUPLING, COMMUNICATION, and DATA MAPPING between the participants)

  9. Coupled Solvers
     solver           fields      origin
     Ateles (APES)    CF, A       in-house (U Siegen)
     Alya System      IF, S       in-house (BSC)
     Calculix         S           open-source (A*STAR)
     CARAT            S           in-house (TUM STATIK)
     COMSOL           S           commercial
     EFD              IF          in-house (TUM SCCS)
     FASTEST          IF          in-house (TU Darmstadt)
     Fluent           IF          commercial
     OpenFOAM         CF, IF, S   open-source (TU Delft)
     Peano            IF          in-house (TUM SCCS)
     SU2              CF          open-source

  10. preCICE API
      turn_on()
      while time_loop ≠ done do
        solve_timestep()
      end while
      turn_off()

  11. preCICE API
      turn_on()
      precice_create("FLUID", "precice_config.xml", index, size)
      precice_initialize()
      while time_loop ≠ done do
        solve_timestep()
      end while
      turn_off()
      precice_finalize()

  12. preCICE API
      turn_on()
      precice_create("FLUID", "precice_config.xml", index, size)
      precice_initialize()
      while time_loop ≠ done do
        while precice_action_required(readCheckPoint) do
          solve_timestep()
          precice_advance()
        end while
      end while
      turn_off()
      precice_finalize()

  13. preCICE API
      turn_on()
      precice_create("FLUID", "precice_config.xml", index, size)
      precice_set_vertices(meshID, N, pos(dim*N), vertIDs(N))
      precice_initialize()
      while time_loop ≠ done do
        while precice_action_required(readCheckPoint) do
          solve_timestep()
          precice_advance()
        end while
      end while
      turn_off()
      precice_finalize()

  14. preCICE API
      turn_on()
      precice_create("FLUID", "precice_config.xml", index, size)
      precice_set_vertices(meshID, N, pos(dim*N), vertIDs(N))
      precice_initialize()
      while time_loop ≠ done do
        while precice_action_required(readCheckPoint) do
          precice_write_bvdata(stresID, N, vertIDs, stres(dim*N))
          solve_timestep()
          precice_advance()
          precice_read_bvdata(displID, N, vertIDs, displ(dim*N))
        end while
      end while
      turn_off()
      precice_finalize()
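
For a rough idea of what the pseudocode above maps to in an actual adapter, here is a minimal C++ sketch. It is not taken from the talk: it assumes the SolverInterface API of later (v2-era) preCICE releases, so names such as setMeshVertices, writeBlockVectorData, and the checkpoint action constants may differ from the version presented here, and the mesh and data names ("FluidMesh", "Stresses", "Displacements") as well as the vertex count are illustrative placeholders.

    #include <precice/SolverInterface.hpp>
    #include <string>
    #include <vector>

    int main() {
      // Illustrative solver-side data; in a real adapter these come from the
      // solver and from MPI (rank and size of this participant).
      const int rank = 0, size = 1;
      const int dim = 3, N = 100;              // number of coupling vertices
      std::vector<double> pos(dim * N), stress(dim * N), displ(dim * N);

      // precice_create("FLUID", "precice_config.xml", index, size)
      precice::SolverInterface precice("FLUID", "precice-config.xml", rank, size);

      // precice_set_vertices(meshID, N, pos, vertIDs)
      const int meshID = precice.getMeshID("FluidMesh");
      std::vector<int> vertIDs(N);
      precice.setMeshVertices(meshID, N, pos.data(), vertIDs.data());
      const int stressID = precice.getDataID("Stresses", meshID);
      const int displID  = precice.getDataID("Displacements", meshID);

      // precice_initialize()
      double dt = precice.initialize();

      // time loop with implicit coupling iterations (checkpointing)
      const std::string writeCp = precice::constants::actionWriteIterationCheckpoint();
      const std::string readCp  = precice::constants::actionReadIterationCheckpoint();
      while (precice.isCouplingOngoing()) {
        if (precice.isActionRequired(writeCp)) {
          // save the solver state here
          precice.markActionFulfilled(writeCp);
        }
        precice.writeBlockVectorData(stressID, N, vertIDs.data(), stress.data());
        // solve_timestep() of the solver goes here (possibly with subcycling)
        dt = precice.advance(dt);
        precice.readBlockVectorData(displID, N, vertIDs.data(), displ.data());
        if (precice.isActionRequired(readCp)) {
          // coupling iteration not yet converged: restore the saved solver state
          precice.markActionFulfilled(readCp);
        }
      }

      // precice_finalize()
      precice.finalize();
      return 0;
    }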

  15. preCICE Config
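
The slide showed the XML configuration file in which the coupling itself is declared: participants, meshes, exchanged data, mapping, communication, coupling scheme, and acceleration. Since the file is not reproduced in this transcript, the following is only a rough illustrative sketch; element names follow later (v2-era) preCICE conventions and may not match the version presented in the talk, and all participant, mesh, and data names are the same made-up ones as in the adapter sketch above.

    <?xml version="1.0"?>
    <precice-configuration>
      <solver-interface dimensions="3">
        <data:vector name="Stresses"/>
        <data:vector name="Displacements"/>

        <mesh name="FluidMesh">
          <use-data name="Stresses"/>
          <use-data name="Displacements"/>
        </mesh>
        <mesh name="StructureMesh">
          <use-data name="Stresses"/>
          <use-data name="Displacements"/>
        </mesh>

        <participant name="FLUID">
          <use-mesh name="FluidMesh" provide="yes"/>
          <use-mesh name="StructureMesh" from="STRUCTURE"/>
          <write-data name="Stresses" mesh="FluidMesh"/>
          <read-data name="Displacements" mesh="FluidMesh"/>
          <mapping:nearest-neighbor direction="read" from="StructureMesh"
                                    to="FluidMesh" constraint="consistent"/>
        </participant>
        <participant name="STRUCTURE">
          <use-mesh name="StructureMesh" provide="yes"/>
          <use-mesh name="FluidMesh" from="FLUID"/>
          <write-data name="Displacements" mesh="StructureMesh"/>
          <read-data name="Stresses" mesh="StructureMesh"/>
          <mapping:nearest-neighbor direction="read" from="FluidMesh"
                                    to="StructureMesh" constraint="consistent"/>
        </participant>

        <m2n:sockets from="FLUID" to="STRUCTURE"/>

        <coupling-scheme:serial-implicit>
          <participants first="FLUID" second="STRUCTURE"/>
          <max-time value="1.0"/>
          <time-window-size value="1e-3"/>
          <max-iterations value="50"/>
          <exchange data="Stresses" mesh="StructureMesh" from="FLUID" to="STRUCTURE"/>
          <exchange data="Displacements" mesh="StructureMesh" from="STRUCTURE" to="FLUID"/>
          <relative-convergence-measure data="Displacements" mesh="StructureMesh" limit="1e-4"/>
          <acceleration:IQN-ILS>
            <data name="Displacements" mesh="StructureMesh"/>
            <initial-relaxation value="0.1"/>
          </acceleration:IQN-ILS>
        </coupling-scheme:serial-implicit>
      </solver-interface>
    </precice-configuration>

The initial-relaxation value of 0.1 corresponds to the relaxed first step x^1 = x^0 + 0.1 · R^0 of the IQN-ILS algorithm shown later in the talk.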

  16. Communication on Distributed Data
      (figure: two parallel solvers A and B meeting at a common coupling surface)

  17. Communication on Distributed Data
      (figure: acceptor A, ranks 0 to 4, and requestor B, ranks 0 to 3, connected by point-to-point channels)
      ◮ kernel: 1:N communication based on either TCP/IP or MPI ports ⇒ no deadlocks at initialization (independent of the connection order at B)
      ◮ asynchronous communication (preferred over threads) ⇒ no deadlocks during communication; see the sketch below
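
To illustrate why asynchronous, non-blocking operations avoid ordering deadlocks, here is a small sketch (plain MPI, not preCICE code) of one rank exchanging data with several connected remote ranks over an inter-communicator: all receives and sends are posted first and completed together, so progress does not depend on the order in which the other side handles its channels. The setup of the inter-communicator via MPI ports (MPI_Open_port / MPI_Comm_accept / MPI_Comm_connect) or TCP/IP sockets is omitted, and the function name is made up.

    #include <mpi.h>
    #include <vector>

    // Exchange one double with each connected remote rank without imposing an
    // order: post all non-blocking receives and sends, then wait for all of them.
    void exchangeAsync(MPI_Comm intercomm,
                       const std::vector<int>& remoteRanks,
                       const std::vector<double>& sendBuf,
                       std::vector<double>& recvBuf) {
      std::vector<MPI_Request> requests;
      requests.reserve(2 * remoteRanks.size());

      for (std::size_t i = 0; i < remoteRanks.size(); ++i) {
        MPI_Request req;
        MPI_Irecv(&recvBuf[i], 1, MPI_DOUBLE, remoteRanks[i], 0, intercomm, &req);
        requests.push_back(req);
      }
      for (std::size_t i = 0; i < remoteRanks.size(); ++i) {
        MPI_Request req;
        MPI_Isend(&sendBuf[i], 1, MPI_DOUBLE, remoteRanks[i], 0, intercomm, &req);
        requests.push_back(req);
      }
      // Completion can happen in any order on either side; no deadlock.
      MPI_Waitall(static_cast<int>(requests.size()), requests.data(),
                  MPI_STATUSES_IGNORE);
    }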

  18. Interpolation on Distributed Data
      Projection-based interpolation
      ◮ first or second order
      ◮ example: consistent mapping from B to A
      ◮ parallelization: almost trivial
      (figure: nearest-neighbour mapping and nearest-projection mapping between meshes A and B)

  19. Interpolation on Distributed Data
      Projection-based interpolation
      ◮ first or second order
      ◮ example: consistent mapping from B to A (see the sketch after this slide)
      ◮ parallelization: almost trivial
      (figure: nearest-neighbour mapping and nearest-projection mapping between meshes A and B)
      Radial basis function interpolation
      ◮ higher order
      ◮ parallelization: far from trivial (realized with PETSc)
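
As a concrete reference for the simplest case, a consistent nearest-neighbour mapping from mesh B to mesh A can be sketched as follows (plain C++, serial, brute-force search; this is an illustration of the idea, not how preCICE stores or searches its distributed meshes).

    #include <array>
    #include <cstddef>
    #include <limits>
    #include <vector>

    using Point = std::array<double, 3>;

    // Consistent nearest-neighbour mapping: every vertex of mesh A takes the
    // value of its closest vertex on mesh B.
    std::vector<double> mapConsistentNearestNeighbour(
        const std::vector<Point>& meshA,
        const std::vector<Point>& meshB,
        const std::vector<double>& valuesB) {
      std::vector<double> valuesA(meshA.size());
      for (std::size_t i = 0; i < meshA.size(); ++i) {
        double best = std::numeric_limits<double>::max();
        std::size_t bestJ = 0;
        for (std::size_t j = 0; j < meshB.size(); ++j) {
          double d2 = 0.0;
          for (int c = 0; c < 3; ++c) {
            const double diff = meshA[i][c] - meshB[j][c];
            d2 += diff * diff;
          }
          if (d2 < best) { best = d2; bestJ = j; }
        }
        valuesA[i] = valuesB[bestJ];
      }
      return valuesA;
    }

In the distributed setting each rank of A only needs the part of mesh B close to its own portion of the coupling surface, which is why this family of mappings parallelizes almost trivially.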

  20. Fixed-Point Acceleration on Distributed Data
      Anderson acceleration (IQN-ILS)
      Find x ∈ D ⊂ R^n with H(x) = x, H: D → R^n
      initial value x^0
      x̃^0 = H(x^0) and R^0 = x̃^0 − x^0
      x^1 = x^0 + 0.1 · R^0
      for k = 1, 2, ... do
        x̃^k = H(x^k) and R^k = x̃^k − x^k
        V^k = [ΔR^k_0, ..., ΔR^k_{k−1}] with ΔR^k_i = R^i − R^k
        W^k = [Δx̃^k_0, ..., Δx̃^k_{k−1}] with Δx̃^k_i = x̃^i − x̃^k
        decompose V^k = Q^k U^k
        solve the first k lines of U^k α = −(Q^k)^T R^k
        Δx̃ = W^k α
        x^{k+1} = x̃^k + Δx̃
      end for
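
The core of each iteration is a small least-squares solve in the k columns of V^k, applied to distributed vectors. As a point of reference, a serial, dense sketch of the update (using Eigen, with made-up function and variable names; not the distributed preCICE implementation) could look like this:

    #include <Eigen/Dense>

    // One IQN-ILS update: solve the least-squares problem V * alpha ≈ -R via a
    // QR decomposition of V, then correct the fixed-point result x~^k.
    Eigen::VectorXd iqnIlsUpdate(const Eigen::MatrixXd& V,       // n x k residual differences
                                 const Eigen::MatrixXd& W,       // n x k value differences
                                 const Eigen::VectorXd& R,       // current residual R^k
                                 const Eigen::VectorXd& xTilde)  // x~^k = H(x^k)
    {
      Eigen::VectorXd alpha = V.colPivHouseholderQr().solve(-R);
      return xTilde + W * alpha;   // x^{k+1} = x~^k + W^k * alpha
    }

In preCICE the tall and skinny matrices V^k and W^k refer to distributed interface data, so rather than recomputing the QR decomposition from scratch in every iteration, it is updated column by column, as sketched on the next slide.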

  21. Fixed-Point Acceleration on Distributed Data
      (figure: QR factorization V = Q [R; 0] with a new column v inserted)
      Insert column
      In: Q̂ ∈ R^{n×m}, R̂ ∈ R^{m×m}, v ∈ R^n
      Out: Q ∈ R^{n×(m+1)}, R ∈ R^{(m+1)×(m+1)}
      for j = 1 ... m do
        r(j) = Q̂(:,j)^T v
        v = v − r(j) · Q̂(:,j)
      end for
      r(m+1) = ‖v‖_2
      Q(:, m+1) = r(m+1)^{−1} · v
      R̄ = [ R̂  r ; 0 ]
      Givens rotations G_{i,j} such that R = G_{1,2} ··· G_{m,m+1} R̄ is upper triangular
      Q = Q G_{m,m+1} ··· G_{1,2}
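
For orientation, the orthogonalization loop of this column update can be sketched serially with Eigen as follows (appending the new column at the end, which already yields an upper-triangular R; the Givens rotations from the slide are needed when the column is inserted at an interior position and are omitted here; the function name is made up).

    #include <Eigen/Dense>

    // Extend an economy-size QR factorization Q (n x m, orthonormal columns),
    // R (m x m, upper triangular) by one new column v via modified Gram-Schmidt.
    // Assumes Q has n rows and v is not already in the span of Q's columns.
    void appendColumn(Eigen::MatrixXd& Q, Eigen::MatrixXd& R, Eigen::VectorXd v) {
      const Eigen::Index n = Q.rows();
      const Eigen::Index m = Q.cols();
      Eigen::VectorXd r(m + 1);
      for (Eigen::Index j = 0; j < m; ++j) {   // orthogonalize v against Q(:, j)
        r(j) = Q.col(j).dot(v);
        v -= r(j) * Q.col(j);
      }
      r(m) = v.norm();
      Q.conservativeResize(n, m + 1);          // keep old columns, add the new one
      Q.col(m) = v / r(m);
      R.conservativeResize(m + 1, m + 1);
      R.row(m).setZero();                      // new bottom row
      R.col(m) = r;                            // new last column
    }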

  22. Performance Tests, Initialization
      #DOFs at the interface: 2.6 · 10^5, strong scaling
      (plot: initialization time [ms], log scale, for p = 128, 256, 512, 1024, and 2048 processes; categories I to V, com, ILS, NN)

  23. Performance Tests, Work per Timestep
      #DOFs at the interface: 2.6 · 10^5, strong scaling
      (plot: time [ms] for p = 128, 256, 512, 1024, and 2048 processes; categories com, ILS, NN)

  24. Performance Tests, Traveling Pulse
      DG solver Ateles, Euler equations; #DOFs: 5.9 · 10^9 in total, 1.1 · 10^7 at the interface; NN mapping and communication; strong scaling from 128 to 16384 processors per participant.
      (figure: two coupled Ateles domains, left and right)
      Joint work with V. Krupp et al. (Universität Siegen)

  25. Performance Tests, Work per Timestep
      (plot: Compute (Ateles) vs. Advance (preCICE), time [ms] on a log scale, over the number of processes per participant (NPP), 2^7 to 2^14)

  26. Summary
      (recap: the work-per-timestep scaling plot, Compute (Ateles) vs. Advance (preCICE) over the number of processes per participant)

  27. Summary
      (recap: the work-per-timestep scaling plot and the fluid-structure-acoustic application with acoustic, fluid, and structure domains)

  28. Summary
      (recap: the work-per-timestep scaling plot, the fluid-structure-acoustic application, and the preCICE big-picture architecture with adapters, master/slave processes, equation coupling, communication, and data mapping)
