
Parallel Runtime Environments with Cloud Database: Performance Study

Parallel Runtime Environments with Cloud Database: Performance Study for HMM with Adaptive Sampling. D. Roehm, R. S. Pavel, T. C. Germann, A. L. McPherson, and C. Junghans. Los Alamos National Laboratory, NM, USA. May 8, 2015.


  1. Parallel Runtime Environments with Cloud Database: Performance Study for HMM with Adaptive Sampling
     - D. Roehm, R. S. Pavel, T. C. Germann, A. L. McPherson, and C. Junghans
     - Los Alamos National Laboratory, NM, USA
     - May 8, 2015
     - UNCLASSIFIED (LA-UR-14-29231)

  2. Motivation (talk sections: Introduction, Implementation, Results, Summary)
     - Material modeling: time and length scale challenge
     - Micro-structure matters, but is computationally expensive
     - HMM (heterogeneous multiscale method): combination of macro- and micro-scale simulations
     - Adaptive sampling techniques: take micro-structure into account when necessary
     - Prediction (kriging) based on database values instead of executing MD simulation
     - Model problem: laser impact on a copper plate

  3. Elastodynamics
     - On macro level: conservation laws for mass, momentum, and energy:
       - $\rho\,\partial_t A - \nabla q = 0$
       - $\partial_t q - \nabla \cdot \tau = 0$
       - $\partial_t e + \nabla \cdot j = 0$
     - Non-oscillatory central scheme (predictor-corrector)
     - Continuum mechanics ⇒ conservation PDEs in Lagrangian coordinates
     - Evolution of deformation, momentum and energy
     - On micro level: take strain, momentum density, and energy density computed by the finite volume solver and return stress, momentum, and energy density flux
     - Stress and energy flux evaluated with MD

  4. Database: Redis
     - Key-value storage
     - High performance
     - Support for distribution/cluster mode
     - NoSQL
     - Redis: open-source, networked, in-memory, key-value data store
     - Users of Redis: GitHub, Twitter, Stack Overflow, Craigslist, ... (info: http://redis.io)
     - Locality-aware hashes:
       - A range along all seven dimensions of our conserved vector
       - Truncated hash based upon the specified range
       - Sort values by distance to requested input
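
The locality-aware hashing on the Redis slide can be illustrated with a minimal sketch. It assumes that "truncated hash" means snapping each of the seven components to a grid cell of fixed width, so that nearby vectors share a key. The names `bucket_key`, `BucketStore`, and `cell_width` are hypothetical, and a plain Python dict stands in for Redis; none of this is taken from the CoHMM code.

```python
import math
from collections import defaultdict


def bucket_key(vector, cell_width):
    """Truncate every component to its grid cell so nearby vectors share a key.

    Assumption: the "specified range" is modeled as a uniform cell width.
    """
    cells = [math.floor(x / cell_width) for x in vector]
    return ":".join(str(c) for c in cells)


class BucketStore:
    """In-memory stand-in for the key-value store (a dict, not Redis)."""

    def __init__(self, cell_width):
        self.cell_width = cell_width
        self.buckets = defaultdict(list)

    def put(self, vector, result):
        """Store a result under the truncated key of its input vector."""
        key = bucket_key(vector, self.cell_width)
        self.buckets[key].append((tuple(vector), result))

    def neighbors(self, query):
        """Return entries from the query's bucket, sorted nearest first."""
        query = tuple(query)
        entries = self.buckets.get(bucket_key(query, self.cell_width), [])
        return sorted(entries, key=lambda entry: math.dist(entry[0], query))


# Usage with seven-dimensional inputs, as on the slide.
store = BucketStore(cell_width=0.1)
store.put((0.11,) * 7, "result A")
store.put((0.18,) * 7, "result B")
store.put((0.95,) * 7, "result C")
for vector, result in store.neighbors((0.17,) * 7):
    print(result, round(math.dist(vector, (0.17,) * 7), 3))
```

One limitation of this simple scheme is that a query near a cell boundary does not see close neighbors stored in the adjacent cell; a fuller version would also probe neighboring keys.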

  5. Kriging
     - "Optimally predicting", originated in geostatistics
     - Prediction of $Z(s_0)$ at a certain position in the high-dimensional space by computing a weighted average of the known vectors in the neighborhood of the point:
       $Z'(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i)$
     - The weights $\lambda_i$ are chosen to minimize the mean-square error $E\left[\left(Z(s_0) - Z'(s_0)\right)^2\right]$ at location $s_0$
     - Calculates an error of the prediction at the same time
     - Store simulation results in key-value database
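
The weighted-average prediction and its error estimate can be sketched in a few lines. This is a minimal ordinary-kriging example under stated assumptions: scalar values instead of the vectors used in the talk, a Gaussian covariance model with a hypothetical `length` parameter, and a hand-written linear solver to stay dependency-free. The slides do not say which kriging variant or covariance model the authors use.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def covariance(p, q, length=1.0):
    """Gaussian covariance between two points (an assumed model)."""
    h2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-h2 / length ** 2)


def krige(points, values, s0, length=1.0):
    """Ordinary kriging: return (prediction, variance) at location s0."""
    n = len(points)
    # Covariance matrix bordered by ones; the extra unknown is the Lagrange
    # multiplier that enforces sum(weights) == 1.
    a = [[covariance(points[i], points[j], length) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [covariance(p, s0, length) for p in points] + [1.0]
    solution = solve(a, b)
    weights, mu = solution[:n], solution[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = (covariance(s0, s0, length)
                - sum(w * c for w, c in zip(weights, b[:n])) - mu)
    return prediction, variance


# Usage: predict between known samples and inspect the error estimate.
known_points = [(0.0,), (1.0,), (2.0,)]
known_values = [1.0, 3.0, 2.0]
print(krige(known_points, known_values, (0.5,)))
```

In an adaptive sampling setting, the returned variance is the natural quantity to compare against a tolerance when deciding whether the prediction is trustworthy or a full MD evaluation is needed.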

  6. Implementation
     - Macrosolver frameworks (github.com/exmatex/CoHMM):
       - Charm++ 6.6.0 (University of Illinois: charm.cs.uiuc.edu)
       - Intel CnC 1.0.002 (Intel: icnc.github.io)
       - OpenMP
       - Libcircle v0.2.1 (github.com/hpc/libcircle)
     - MD miniapp CoMD (github.com/exmatex/CoMD), serial C version
     - Compilers and libraries:
       - GCC 4.8.x
       - ICPC 15.0.0
       - Boost 1.55
       - BLAS and LAPACK

  7. Schematics (figure)

  8. Flat Wave Test Case
     - Figure: color bar on the right-hand side: strain; color bar on top: type of call

  9. Flat Wave Test Case
     - Figure: colors indicate the type of call

  10. Adaptive Sampling Performance
      - Figure: Hits [%] versus HMM step / N, by call type (CoMD, Kriging, Database, CoMD Duplicates, Kriging Database, Kriging Duplicates)
      - Overall less than 5% of CoMD calls ⇒ speedup of 25
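
The link between the CoMD call fraction and the quoted speedup can be made concrete with a back-of-the-envelope model. The slides do not state how the speedup was computed, so the sketch below is an assumption: it treats every non-CoMD call (database hit or kriging prediction) as costing a fixed fraction of a CoMD call. The function name `estimated_speedup` and both parameters are hypothetical.

```python
def estimated_speedup(md_fraction, cheap_cost_ratio=0.0):
    """Estimate the speedup relative to running MD for every call.

    md_fraction: fraction of calls that still run the MD simulation.
    cheap_cost_ratio: assumed cost of a non-MD call relative to an MD call.
    """
    if not 0.0 < md_fraction <= 1.0:
        raise ValueError("md_fraction must be in (0, 1]")
    if cheap_cost_ratio < 0.0:
        raise ValueError("cheap_cost_ratio must be non-negative")
    relative_cost = md_fraction + (1.0 - md_fraction) * cheap_cost_ratio
    return 1.0 / relative_cost


# Usage: compare a few MD fractions when non-MD calls are treated as free.
for fraction in (0.05, 0.04, 0.10):
    print(fraction, estimated_speedup(fraction))
```

Under this model with negligible non-MD cost, a 5% CoMD fraction corresponds to a speedup of about 20, and a speedup of 25 corresponds to a fraction of 4%, which is consistent with the "less than 5%" wording on the slide.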

  11. Adaptive Sampling Performance: Flat Wave
      - Figure: Time per Task / CoMD Time versus HMM step / N, by call type (CoMD, CoMD Database, Kriging Database, Kriging)
      - Absolute time per task

  12. Circular Impact Test Case
      - Figure: Strain [MPa] over x [0..50] and y [0..50]; call types: CoMD, C. Dupl., DB, Kr. DB, Kr., Kr. Dupl.

  13. Adaptive Sampling Performance: Circular Impact
      - Figure: Hits [%] versus HMM step / N, by call type (CoMD calls, CoMD Database, Kriging DB, Kriging)
      - Save approx. 10% of calls long term ⇒ speedup of 2.5

  14. Framework Performance: Circular Impact Analytic
      - Figure: Time [s] versus HMM step for Charm++, Intel CnC, and OpenMP
      - 16 cores, single database, shared memory

  15. Framework Performance: Circular Impact
      - Figure: Time [s] versus HMM step for Charm++, Intel CnC, and OpenMP
      - 16 cores, single database, shared memory

  16. Framework Performance: Flat Wave
      - Figure: Time [s] versus HMM step for Charm++, Intel CnC, Libcircle, and OpenMP
      - 48 cores, single database, shared memory

  17. Framework Performance: Flat Wave
      - Figure: Time [s] versus HMM step for Charm++ old, Intel CnC old, Libcircle, Charm++ new, and Intel CnC new
      - 144 cores, single database

  18. Framework Performance: Circular Impact
      - Figure: Time [s] versus HMM step for Charm++, Intel CnC, and Libcircle
      - 480 cores with single database

  19. Framework Pros and Cons (based on preliminary results!)
      - Charm++
        - More complex to implement (.ci files)
        - Great platform support, but uncommon build system
        - Good performance
        - Good documentation and support on mailing list
      - Intel CnC
        - Straightforward to implement
        - Needs Intel MPI or MPICH
        - Good performance with optimization efforts
        - Good documentation for basics
        - Tuners mainly undocumented
      - Libcircle
        - Trivial to implement
        - Great platform support
        - Performance NOT comparable
        - Manual serialization of input and output data

  20. Summary and Outlook
      - Implemented Distributed Database Kriging for Adaptive Sampling (D²KAS) for HMM (elastodynamics) using different frameworks
      - Our adaptive scheme achieves a speedup of 2.5-25 [1]
      - Enables inclusion of defects, crystal domains or phase boundaries
      - One code base: Charm++, CnC, OpenMP, Libcircle (github.com/exmatex/CoHMM)
      - Figure: color bar on the right-hand side: strain; color bar on top: type of call
      - [1] Comp. Phys. Comm. 192, 138 (2015)

  21. Thanks to
      - Phil Miller (Audience, UI-UC)
      - Frank Schlimbach (Intel)
      - This work was supported by the Los Alamos Information Science & Technology Center (IS&T) Co-Design Summer School, the U.S. Department of Energy (DOE), Office of Advanced Scientific Computing Research (ASCR) through the Exascale Co-Design Center for Materials in Extreme Environments (ExMatEx), and the Center for Nonlinear Studies (CNLS).

  22. The End: Thank you for your attention!
