An Evaluation of UPC in the Ludwig Application
Alan Gray, EPCC, The University of Edinburgh
CUG 2009, Atlanta
Introduction
• Modern HPC architectures comprise multiple nodes connected via an interconnect
• Applications must utilise these multiple nodes to solve a single problem
  – A mechanism is needed for each process to acquire remote data
• Message passing (MPI) has become the de facto standard
  – Complex coding is needed to manage the message passing
  – Performance overheads arise from the underlying two-way communication
• Novel PGAS languages offer intuitive access to remote data
  – Potentially increasing productivity and performance in HPC
• UPC is (arguably) the most mature and portable PGAS language today
Introduction (cont.)
• AIM: evaluate UPC as a replacement for MPI within a real application (LUDWIG) and measure performance
• Full conversion is beyond the scope of this work
  – But UPC and MPI can co-exist, so the area of interest can be targeted
• UPC is fully supported at the hardware level on the Cray X2
  – This study uses the X2 component of HECToR (112 processors)
  – UPC will be fully supported on the XT after the upgrade to the Gemini interconnect
UPC
• Consider a simplistic case: 8 elements distributed between 2 processes, where updates require neighbouring values
• Regular C array (local): int p[6];
• UPC shared array (global): shared [8/THREADS] int s[8];
  – (a minimal sketch of this example follows below)
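• A minimal UPC sketch of this example, written for illustration (it is not taken from the slides): it assumes a static THREADS environment, so that 8/THREADS is a compile-time block size, and uses a second array s_new so the neighbour update is free of ordering issues.

#include <upc.h>
#include <stdio.h>

/* Global shared arrays, block-distributed across the threads. */
shared [8/THREADS] int s_old[8];
shared [8/THREADS] int s_new[8];

int main(void)
{
    int i;

    /* Each thread initialises the elements it has affinity to. */
    upc_forall (i = 0; i < 8; i++; &s_old[i])
        s_old[i] = i;

    upc_barrier;

    /* Neighbour update: s_old[i-1] may live on another thread, but it is
     * read with ordinary array syntax -- no explicit message passing. */
    upc_forall (i = 1; i < 8; i++; &s_new[i])
        s_new[i] = s_old[i] + s_old[i-1];

    upc_barrier;
    if (MYTHREAD == 0)
        printf("s_new[7] = %d\n", s_new[7]);
    return 0;
}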
LUDWIG
• LUDWIG uses lattice Boltzmann models to simulate the hydrodynamics of complex fluids (mixtures of fluids, solids/fluids) in 3D
  – Jean-Christophe Desplat, Dublin Institute for Advanced Studies
  – Kevin Stratford, Mike Cates, The University of Edinburgh
  – Applications include personal care products, e.g. shampoo
LUDWIG
• Original code:
  – Halo cells are only accessed in the propagation stage
LUDWIG Conversion
• The main data structure is the array site[], where
  – each element corresponds to a lattice site
  – and consists of a struct containing the physical variables
• Original code, propagation section: updates require values from neighbouring sites

  Loop over index
  …
  site[index].f[0] = site[index-1].f[0] + …;
  …

• Halo cells and message-passing halo-swap routines are therefore required (a hedged sketch of this approach follows below)
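• An illustrative sketch only, with a 1D decomposition and invented names (Site, NVEL, NLOCAL, halo_swap are not the real LUDWIG data structures or routines): each element of site[] is a struct of physical variables, the propagation update needs a value from the neighbouring site, and so the local array carries two halo cells that must be filled by explicit MPI halo swaps before each propagation step.

#include <mpi.h>

#define NVEL   19                     /* velocities per site (assumed)      */
#define NLOCAL 256                    /* sites owned by each rank (assumed) */

typedef struct { double f[NVEL]; } Site;

static Site site[NLOCAL + 2];         /* owned sites plus 2 halo cells      */

/* Exchange halo cells with the left/right neighbours (periodic). */
static void halo_swap(MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    int left   = (rank - 1 + size) % size;
    int right  = (rank + 1) % size;
    int nbytes = (int) sizeof(Site);

    /* Send my last owned site right; receive my left halo from the left. */
    MPI_Sendrecv(&site[NLOCAL], nbytes, MPI_BYTE, right, 0,
                 &site[0],      nbytes, MPI_BYTE, left,  0,
                 comm, MPI_STATUS_IGNORE);
    /* Send my first owned site left; receive my right halo from the right. */
    MPI_Sendrecv(&site[1],          nbytes, MPI_BYTE, left,  1,
                 &site[NLOCAL + 1], nbytes, MPI_BYTE, right, 1,
                 comm, MPI_STATUS_IGNORE);
}

/* Propagation: each update pulls a value from the neighbouring site.  The
 * loop runs backwards so the neighbour is read before being overwritten. */
static void propagate(void)
{
    for (int index = NLOCAL; index >= 1; index--)
        site[index].f[0] = site[index - 1].f[0] /* + ... other terms */;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    halo_swap(MPI_COMM_WORLD);        /* fill halo cells with remote data */
    propagate();                      /* then update using neighbours     */
    MPI_Finalize();
    return 0;
}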
LUDWIG Conversion
• Strategy: mirror site with a UPC shared structure s_site
• New functionality:
  – sindex[index]: mapping of local (site) to global (s_site) index
  – put_site_in_shared(): copy data local -> shared
  – get_site_from_shared(): copy data shared -> local
• This allows a specific area of the application to be targeted
  – Propagation section adapted to work with shared arrays (see the sketch below)

  Loop over index
  …
  s_site[sindex[index]].f[0] = s_site[sindex[index-1]].f[0] + …;
  …

• No halo cells or halo swaps are needed; remote accesses are done directly
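• A hedged sketch of this strategy: the names Site, NVEL, NSITES, the sindex[] convention and the helper bodies are my assumptions rather than the actual LUDWIG code, and the sketch writes the propagation result into the local array to stay free of cross-thread write-ordering issues (the real code updates the shared array itself, as shown above). A static THREADS environment is assumed so that NSITES/THREADS is a compile-time constant.

#include <upc.h>

#define NVEL   19                       /* velocities per site (assumed)      */
#define NSITES 1024                     /* global number of lattice sites     */
#define NLOCAL (NSITES / THREADS)       /* sites per thread                   */

typedef struct { double f[NVEL]; } Site;

shared [NLOCAL] Site s_site[NSITES];    /* shared mirror of the local arrays  */
Site site[NLOCAL + 2];                  /* local sites (+2 former halo slots) */
int  sindex[NLOCAL + 2];                /* local index -> global shared index */

/* Copy this thread's local data into the shared mirror. */
void put_site_in_shared(void)
{
    for (int index = 1; index <= NLOCAL; index++)
        upc_memput(&s_site[sindex[index]], &site[index], sizeof(Site));
}

/* Copy this thread's data back from the shared mirror. */
void get_site_from_shared(void)
{
    for (int index = 1; index <= NLOCAL; index++)
        upc_memget(&site[index], &s_site[sindex[index]], sizeof(Site));
}

/* Propagation reads neighbouring sites directly from the shared mirror,
 * even when they live on a remote thread; no halo swap is involved. */
void propagate_shared(void)
{
    upc_barrier;                        /* all threads have published data    */
    for (int index = 1; index <= NLOCAL; index++)
        site[index].f[0] = s_site[sindex[index - 1]].f[0] /* + ... */;
}

int main(void)
{
    /* Local -> global mapping, including the two neighbour slots that
     * replace the old halo cells (periodic boundaries assumed). */
    for (int index = 0; index <= NLOCAL + 1; index++)
        sindex[index] = (MYTHREAD * NLOCAL + index - 1 + NSITES) % NSITES;

    /* Arbitrary initial data for the sites this thread owns. */
    for (int index = 1; index <= NLOCAL; index++)
        for (int v = 0; v < NVEL; v++)
            site[index].f[v] = 1.0 * sindex[index];

    put_site_in_shared();               /* publish local data                 */
    propagate_shared();                 /* update using direct remote reads   */
    put_site_in_shared();               /* publish the updated values         */
    return 0;
}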
LUDWIG Conversion
• Modified LUDWIG code:
Performance results
Performance results
• The naïve adaptation has a substantial negative impact on performance
• The underlying communication is not the cause of this
• Dereferencing shared pointers is more costly than dereferencing regular C pointers
• Optimised version: access memory through regular C pointers where possible
  – Obtained by casting from shared pointers (see the sketch below)
  – Boundary updates must still use shared array accesses to get remote data
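• A hedged sketch of the casting optimisation (illustrative names and sizes, not the actual LUDWIG code; a static THREADS environment is assumed): a pointer-to-shared that refers to data with affinity to the calling thread can be cast to an ordinary C pointer, so the interior updates use cheap local dereferences, while the boundary value is still fetched through the shared array.

#include <upc.h>

#define NVEL   19                      /* velocities per site (assumed)   */
#define NSITES 1024                    /* global number of lattice sites  */
#define NLOCAL (NSITES / THREADS)      /* sites per thread                */

typedef struct { double f[NVEL]; } Site;

shared [NLOCAL] Site s_site[NSITES];

void propagate_optimised(int first)
{
    /* The data in this thread's block has affinity to MYTHREAD and is
     * contiguous in local memory, so the cast to a plain pointer is legal. */
    Site *local = (Site *) &s_site[first];

    /* Fetch the remote neighbour's value through the shared array before
     * any thread overwrites it (periodic wrap assumed). */
    double f_left = s_site[(first - 1 + NSITES) % NSITES].f[0];
    upc_barrier;                       /* all remote reads are complete   */

    /* Interior sites: regular C pointer dereferences only.  The loop runs
     * backwards so each neighbour value is read before being overwritten. */
    for (int i = NLOCAL - 1; i > 0; i--)
        local[i].f[0] = local[i - 1].f[0] /* + ... */;

    /* Boundary site uses the value fetched from the remote thread. */
    local[0].f[0] = f_left /* + ... */;
}

int main(void)
{
    int first = MYTHREAD * NLOCAL;     /* start of this thread's block    */

    /* Each thread initialises its own block. */
    for (int i = 0; i < NLOCAL; i++)
        for (int v = 0; v < NVEL; v++)
            s_site[first + i].f[v] = 1.0 * (first + i);

    upc_barrier;                       /* initial data is published       */
    propagate_optimised(first);
    return 0;
}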
Performance results
Conclusions
• UPC allows intuitive access to remote data
  – Potentially increasing performance and productivity in HPC
• LUDWIG was adapted to utilise UPC functionality
  – Focusing on a key section
  – Shared structures remove the need for complicated halo swaps
• The naïve adaptation shows significant performance degradation
  – Due to sensitivity to costly shared-pointer operations
• The optimised version uses regular C pointers to access data where possible
  – It performs similarly to (but slightly worse than) the MPI version
  – The remaining degradation is likely due to the remaining shared-pointer operations
• It would be interesting to test on a larger system (including a future Cray XT)