Porting the Plasma Simulation PIConGPU to Heterogeneous Architectures with Alpaka

René Widera 1, Erik Zenker 1,2, Guido Juckeland 1, Benjamin Worpitz 1,2, Axel Huebl 1,2, Andreas Knüpfer 2, Wolfgang E. Nagel 2, Michael Bussmann 1

1 Helmholtz-Zentrum Dresden – Rossendorf
2 Technische Universität Dresden
PIConGPU

◾ Electron Acceleration with Lasers: Compact X-Ray Sources
◾ Ion Acceleration with Lasers: Tumor Therapy
◾ Plasma Instabilities: Astrophysics

Member of the Helmholtz Association
René Widera, Erik Zenker, Guido Juckeland · Computational Radiation Physics · www.hzdr.de/crp · { r.widera, e.zenker, g.juckeland }@hzdr.de
Domain Decomposition ─ Field and Particle Domain

[Figure: positively (+) and negatively (─) charged particles distributed over the simulation domain]

◾ Moving particles create fields
◾ Fields act back on particles
◾ Particles change cells
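The three statements above describe a loop that repeats every time step. The following is a minimal 1D sketch of that loop under strong simplifying assumptions: nearest-cell current deposition, a one-line field update standing in for a real field solver, and periodic boundaries. The names `Particle`, `cellOf`, and `step` are hypothetical and are not part of PIConGPU.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 1D particle for illustration; not a PIConGPU type.
struct Particle {
    double x;       // position in units of the cell width, 0 <= x < number of cells
    double v;       // velocity in cells per unit time
    double charge;  // +1 or -1 in this toy model
};

// Index of the cell that currently contains the particle.
std::size_t cellOf(const Particle& p) {
    return static_cast<std::size_t>(std::floor(p.x));
}

// One simplified cycle: deposit current, update field, push particles.
void step(std::vector<Particle>& particles, std::vector<double>& field, double dt) {
    assert(!field.empty());
    const double length = static_cast<double>(field.size());

    // 1. Moving particles create fields: deposit charge * velocity into the particle's cell.
    std::vector<double> current(field.size(), 0.0);
    for (const Particle& p : particles) {
        assert(cellOf(p) < field.size());
        current[cellOf(p)] += p.charge * p.v;
    }

    // 2. Simplified field update (assumption: dE/dt = -J per cell, no full Maxwell solver).
    for (std::size_t c = 0; c < field.size(); ++c)
        field[c] -= dt * current[c];

    // 3. Fields act back on particles, and particles may change cells.
    for (Particle& p : particles) {
        p.v += dt * p.charge * field[cellOf(p)];
        p.x += dt * p.v;
        p.x -= length * std::floor(p.x / length);  // periodic wrap into [0, length)
        if (p.x >= length) p.x = 0.0;              // guard against floating-point round-up
    }
}
```

After a call to `step`, `cellOf` may return a different index for a particle than before. This is the "particles change cells" case, and it is what forces a per-cell particle data structure to be updated after every push.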
Creating Vectorized Data Structures for Particles and Fields

[Figure: cells 1 to 4 of the field domain and the particles (+) they contain in the particle domain]

Field Domain
◾ chunked in supercells
◾ line-wise aligned

Particle Domain
◾ fixed-size frames
◾ struct of aligned arrays
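One way to picture "fixed-size frames" stored as a "struct of aligned arrays" is sketched below. The frame size of 256, the 64-byte alignment, the chosen attributes, and the names `ParticleFrame` and `Supercell` are assumptions for illustration. The slides do not give these values, and this is not PIConGPU's actual type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t kFrameSize = 256;  // assumed number of particle slots per frame

// Struct of arrays: each attribute is stored contiguously and aligned,
// so neighbouring slots can be loaded with vector instructions.
struct alignas(64) ParticleFrame {
    alignas(64) std::array<float, kFrameSize> posX{};
    alignas(64) std::array<float, kFrameSize> posY{};
    alignas(64) std::array<float, kFrameSize> momX{};
    alignas(64) std::array<float, kFrameSize> momY{};
    std::size_t count = 0;  // number of occupied slots
};
static_assert(alignof(ParticleFrame) == 64, "frame must be 64-byte aligned");

// A supercell owns the frames holding the particles located in its cells.
struct Supercell {
    std::vector<std::unique_ptr<ParticleFrame>> frames;

    // Append a particle; open a new frame when the last one is full.
    void add(float x, float y, float px, float py) {
        if (frames.empty() || frames.back()->count == kFrameSize)
            frames.push_back(std::make_unique<ParticleFrame>());
        ParticleFrame& f = *frames.back();
        f.posX[f.count] = x;
        f.posY[f.count] = y;
        f.momX[f.count] = px;
        f.momY[f.count] = py;
        ++f.count;
    }

    std::size_t numParticles() const {
        std::size_t n = 0;
        for (const auto& f : frames) n += f->count;
        return n;
    }
};
```

Because every frame has the same size, memory can be allocated and reused in uniform blocks, and only the last frame of a supercell is ever partially filled.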
Algorithm Driven Cache Strategy

[Figure: cells 1 to 4 and their particles, shown in global memory and in shared memory]
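A plausible reading of this slide is that the field values of one supercell are staged from global memory into fast shared memory once, and all particles of that supercell then read from the staged copy. The sketch below shows the pattern in plain C++, with a small local array standing in for GPU shared memory. The function name `gatherField` and the supercell size of 4 cells (chosen to match the four cells drawn) are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kCellsPerSupercell = 4;  // assumed, matches the four cells on the slide

// Sums the field values seen by the particles of one supercell.
// globalField   : field values of the whole domain ("global memory")
// supercell     : index of the supercell to process
// particleCells : for each particle, its cell index local to the supercell
double gatherField(const std::vector<double>& globalField,
                   std::size_t supercell,
                   const std::vector<std::size_t>& particleCells) {
    const std::size_t first = supercell * kCellsPerSupercell;
    assert(first + kCellsPerSupercell <= globalField.size());

    // Stage the supercell's field values once; this array stands in for shared memory.
    std::array<double, kCellsPerSupercell> cache{};
    for (std::size_t i = 0; i < kCellsPerSupercell; ++i)
        cache[i] = globalField[first + i];

    // Every particle now reads from the cache instead of the global array.
    double sum = 0.0;
    for (std::size_t cell : particleCells) {
        assert(cell < kCellsPerSupercell);
        sum += cache[cell];
    }
    return sum;
}
```

The benefit grows with the number of particles per supercell: the global array is read once per cell, while the cached copy is read once per particle.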
High Utilization of Threads

[Figure: a thread block with threads 1 to 4 working on cells 1 to 4 in shared memory, backed by global memory]
Task-Parallel Execution of Kernels + Asynchronous Communication
PIConGPU ─ Scales up to 16,384 GPUs

[Figure, left: strong scaling. Speedup versus number of GPUs (both axes 1 to 10,000) for the series 1 to 32, 8 to 256, 64 to 2048, 512 to 16384, and 4096 to 16384 GPUs, compared with ideal scaling]
[Figure, right: weak scaling efficiency. Efficiency in % (axis 80 to 105) versus number of GPUs (1 to 10,000), PIConGPU compared with ideal scaling]

◾ Efficiency >95%
◾ 6.9 PFlop/s (SP)
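For reading the two plots, the usual textbook definitions of the plotted quantities are given below. They are an assumption, since the slides do not state how speedup and efficiency were normalized. Here T(N) is the run time on N GPUs and N_0 is the smallest GPU count of a series (for example N_0 = 8 for the series "8 to 256").

```latex
\begin{align}
  % Strong scaling: total problem size is fixed while N grows
  S(N) &= \frac{T(N_0)}{T(N)}, &
  S_{\text{ideal}}(N) &= \frac{N}{N_0} \\
  % Weak scaling: problem size grows in proportion to N
  E_{\text{weak}}(N) &= \frac{T(N_0)}{T(N)} \times 100\,\%, &
  E_{\text{ideal}} &= 100\,\%
\end{align}
```

Under these definitions, a weak scaling efficiency above 95% means that a run on N GPUs with an N/N_0 times larger problem takes less than about 5.3% longer than the baseline run.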
More Physics, More Computations, More Power!

[Figure: particles (+, ─) carrying atomic states s 1 and s 2]

\[
\underbrace{\begin{pmatrix}
t_{1,1} & t_{1,2} & t_{1,3} & \cdots & t_{1,n} \\
t_{2,1} & \ddots  &         &        & \vdots  \\
t_{3,1} &         & \ddots  &        & \vdots  \\
\vdots  &         &         & \ddots & \vdots  \\
t_{n,1} & \cdots  &         & \cdots & t_{n,n}
\end{pmatrix}}_{\text{atom-physical effects}}
\underbrace{\begin{pmatrix}
s_{1,1} \\ s_{1,2} \\ s_{1,3} \\ \vdots \\ s_{n,m}
\end{pmatrix}}_{\text{old atom state}}
=
\underbrace{\begin{pmatrix}
s_{1,1} \\ s_{1,2} \\ s_{1,3} \\ \vdots \\ s_{n,m}
\end{pmatrix}}_{\text{new atom state}}
\]

Really Big Data Task
◾ Random access to large amounts of data (> 100 GB)
◾ A good job for powerful CPUs
◾ Efficient CPU/GPU cooperation
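The state update on this slide has the shape of a matrix-vector product: the matrix of atom-physical effects applied to the old atom state gives the new atom state. A minimal sketch of that product follows, assuming a dense, row-major transition matrix. The name `applyTransitions` is hypothetical, and the slides do not say how PIConGPU stores or applies the matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes newState = T * oldState for an n x n matrix T stored row-major in t.
std::vector<double> applyTransitions(const std::vector<double>& t,
                                     const std::vector<double>& oldState) {
    const std::size_t n = oldState.size();
    assert(t.size() == n * n);

    std::vector<double> newState(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            newState[i] += t[i * n + j] * oldState[j];  // entry t(i,j) couples old state j to new state i
    return newState;
}
```

Each output entry reads a full matrix row, so the matrix is accessed far more often than the state vector. With a table beyond 100 GB and lookups that depend on each particle's state, these reads are effectively random accesses into a table that is too large for GPU memory, which fits the slide's point that this part is a good job for CPUs.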