Composable Parallel Libraries in Charm++
Phil Miller∗, Laxmikant V. Kalé∗
Parallel Programming Laboratory, Department of Computer Science
University of Illinois at Urbana-Champaign
∗ {mille121, kale}@illinois.edu
SIAM PP12: 15 February 2012

Charm++ Programming Model
Object-based
    Express logic via indexed collections of interacting objects (both data and tasks)
Over-decomposed
    Expose more parallelism than available processors

Charm++ Programming Model
Message-Driven
    Trigger computation by invoking remote entry methods
Non-blocking, Asynchronous
    Implicitly overlapped data transfer
Runtime-Assisted
    Scheduling, observation-based adaptivity, load balancing, composition, etc.
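The message-driven idea can be illustrated without the Charm++ runtime. The following is a minimal sketch in plain C++, not Charm++ code: `Scheduler`, `Cell`, and `relayAcross` are hypothetical names invented for this illustration. Sending a message only enqueues an invocation, and a scheduler loop later runs it, so the sender never blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a message-driven runtime: a queue of pending
// entry-method invocations that a scheduler loop drains one at a time.
class Scheduler {
public:
    // "Sending a message" only enqueues work; the caller never blocks.
    void send(std::function<void()> entryMethod) {
        queue_.push_back(std::move(entryMethod));
    }

    // Run queued invocations until none remain; returns how many ran.
    std::size_t run() {
        std::size_t executed = 0;
        while (!queue_.empty()) {
            std::function<void()> next = std::move(queue_.front());
            queue_.pop_front();
            next();  // may enqueue further messages
            ++executed;
        }
        return executed;
    }

private:
    std::deque<std::function<void()>> queue_;
};

// An object whose work is triggered only by incoming messages.
struct Cell {
    int received = 0;
};

// Deliver one message to cell 0; each cell then forwards a message to its
// right-hand neighbour. Returns the number of invocations executed.
std::size_t relayAcross(std::vector<Cell>& cells) {
    Scheduler sched;
    std::function<void(std::size_t)> deliver = [&](std::size_t i) {
        assert(i < cells.size());
        cells[i].received += 1;
        if (i + 1 < cells.size()) {
            sched.send([&deliver, i] { deliver(i + 1); });
        }
    };
    if (!cells.empty()) {
        sched.send([&deliver] { deliver(0); });
    }
    return sched.run();
}
```

In a real runtime the queue would be fed by the network as well as by local sends, which is how data transfer overlaps with computation.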

Charm++ Capabilities
Promotes natural expression of parallelism
Supports modularity
Overlaps communication and computation
Automatically balances load
Automatically handles heterogeneous systems
Adapts to reduce energy consumption
Tolerates component failures
For more info: http://charm.cs.illinois.edu/why/

Separation of Concerns
Application developers focus on their algorithms and data
Libraries should
    ◮ not tie users' hands
    ◮ share resources seamlessly
    ◮ overlap
    ◮ manage their own performance
Strong runtime makes it possible!

LU: Capabilities
Composable library
    ◮ Modular program structure
    ◮ Seamless execution structure (interleaved modules)
Block-centric
    ◮ Algorithm from a block's perspective
    ◮ Agnostic of processor-level considerations
Separation of concerns
    ◮ Domain specialist codes algorithm
    ◮ Systems specialist codes tuning, resource mgmt, etc.

                        Lines of Code           Module-specific
Module                  CI     C++    Total     Commits
Factorization           517    419    936       472/572  83%
Mem. Aware Sched.       9      492    501       86/125   69%
Mapping                 10     72     82        29/42    69%

LU: Capabilities
Flexible data placement
    ◮ Don't mind client's layout - transposition is cheap
    ◮ Variations don't impose on client
    ◮ Can improve performance
Memory-constrained dynamic lookahead [1]

[1] Lifflander et al., IPDPS 2012
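The slides do not say which placements the library's Mapping module provides. As an illustration of what a pluggable block-to-processor mapping can look like, here is a sketch of a 2D block-cyclic placement, a layout commonly used for dense LU. The function name `blockCyclicPE` and the `peRows` x `peCols` processor grid are assumptions made for this example.

```cpp
#include <cassert>

// Hypothetical mapping function: place block (blockRow, blockCol) on a
// peRows x peCols processor grid in 2D block-cyclic fashion. PEs are
// numbered row-major within the grid.
int blockCyclicPE(int blockRow, int blockCol, int peRows, int peCols) {
    assert(blockRow >= 0 && blockCol >= 0);
    assert(peRows > 0 && peCols > 0);
    const int gridRow = blockRow % peRows;  // cycle through grid rows
    const int gridCol = blockCol % peCols;  // cycle through grid columns
    return gridRow * peCols + gridCol;
}
```

Because the algorithm is written from a block's perspective, swapping this function for another placement would not require changes to the factorization logic.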

LU: Performance
Weak Scaling: (N such that matrix fills 75% memory)
[Plot: Total TFlop/s (0.1 to 100) vs. Number of Cores (128 to 8192). Series: theoretical peak on XT5; weak scaling on XT5. Point labels: 65.7%, 67.4%, 66.2%, 67.4%, 67.1%, 67%.]
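The sizing rule "N such that matrix fills 75% memory" can be made concrete with a small calculation. This sketch assumes double-precision entries and the standard leading-order LU operation count of (2/3)N^3. The function names are hypothetical, and reading the plot's percentage labels as fraction of theoretical peak is an assumption, not something the slide states.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Largest matrix order N such that an N x N matrix of doubles occupies at
// most `fraction` of `totalBytes` (assumes 8-byte double-precision entries).
std::uint64_t matrixOrderForMemory(double totalBytes, double fraction) {
    assert(totalBytes >= 0.0 && fraction > 0.0 && fraction <= 1.0);
    const double elements = fraction * totalBytes / sizeof(double);
    return static_cast<std::uint64_t>(std::floor(std::sqrt(elements)));
}

// Leading-order floating-point operation count for dense LU: (2/3) N^3.
double luFlops(double n) {
    return 2.0 * n * n * n / 3.0;
}

// Achieved flop rate divided by the machine's peak rate.
double fractionOfPeak(double n, double seconds, double peakFlopsPerSec) {
    assert(seconds > 0.0 && peakFlopsPerSec > 0.0);
    return luFlops(n) / seconds / peakFlopsPerSec;
}
```

Under weak scaling, `totalBytes` grows with the number of cores, so N grows roughly with the square root of the core count.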

LU: Performance
... and strong scaling too! (N=96,000)
[Plot: Total TFlop/s (0.1 to 100) vs. Number of Cores (128 to 8192). Series: theoretical peak on XT5; weak scaling on XT5; theoretical peak on BG/P; strong scaling on BG/P. Point labels: 31.6%, 40.8%, 45%, 60.3%.]

Parallel IO
MPI-IO is selfish, still demands dedicated nodes
Overlap IO in-line with the application!

Parallel IO Architecture
[Figure]

Parallel IO Implementation notes
Forward data to selected processors for stripe-disjoint access
Buffer to write whole stripes (not in results shown)
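The second note, buffering so that only whole stripes are written, might look roughly like the following. This is a sketch under stated assumptions rather than the library's code: `StripeBuffer` is a hypothetical class, and it assumes forwarded pieces never overlap, so counting received bytes is enough to detect a complete stripe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical per-stripe buffer held by the PE responsible for a stripe.
class StripeBuffer {
public:
    explicit StripeBuffer(std::size_t stripeBytes)
        : data_(stripeBytes), filled_(0) {
        assert(stripeBytes > 0);
    }

    // Copy one forwarded piece into place. Assumes pieces do not overlap.
    // Returns true once the whole stripe has arrived and can be written.
    bool add(std::size_t offsetInStripe, const char* src, std::size_t len) {
        assert(src != nullptr);
        assert(offsetInStripe + len <= data_.size());
        std::memcpy(data_.data() + offsetInStripe, src, len);
        filled_ += len;
        return complete();
    }

    bool complete() const { return filled_ == data_.size(); }

    const std::vector<char>& bytes() const { return data_; }

private:
    std::vector<char> data_;
    std::size_t filled_;
};
```

Once `add` returns true, the owning PE could issue a single write for the full stripe instead of one write per piece.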

Parallel IO Implementation

    // Split a write into per-stripe pieces and forward each piece to the
    // PE selected for that stripe.
    void Manager::write(Token token, const char *data, size_t bytes, size_t offset) {
      Options &opts = files[token].opts;
      do {
        // Stripe containing the current offset, and the PE selected for it
        size_t stripe = offset / opts.peStripe;
        int pe = opts.basePE + stripe * opts.skipPEs;
        // Send no further than the end of the current stripe
        size_t bytesToSend = min(bytes, opts.peStripe - offset % opts.peStripe);
        thisProxy[pe].write_forwardData(token, data, bytesToSend, offset);
        data += bytesToSend;
        offset += bytesToSend;
        bytes -= bytesToSend;
      } while (bytes > 0);
    }
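The splitting arithmetic in `Manager::write` can be checked outside Charm++. The sketch below reuses the field names `peStripe`, `basePE`, and `skipPEs` from the slide. The `Chunk` struct and `splitWrite` function are hypothetical, and the sketch records each piece instead of sending it through a proxy. It also uses a `while` loop rather than `do ... while`, so a zero-byte write produces no pieces.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Field names follow the slide's code; the struct layout is assumed.
struct Options {
    std::size_t peStripe;  // bytes of the file handled per stripe
    int basePE;            // PE responsible for stripe 0
    int skipPEs;           // PE stride between consecutive stripes
};

// One forwarded piece: destination PE, file offset, and length.
struct Chunk {
    int pe;
    std::size_t offset;
    std::size_t bytes;
};

// Split [offset, offset + bytes) at stripe boundaries and compute the
// destination PE of each piece, mirroring the loop on the slide.
std::vector<Chunk> splitWrite(const Options& opts, std::size_t bytes,
                              std::size_t offset) {
    assert(opts.peStripe > 0);
    std::vector<Chunk> chunks;
    while (bytes > 0) {
        const std::size_t stripe = offset / opts.peStripe;
        const int pe = opts.basePE + static_cast<int>(stripe) * opts.skipPEs;
        // Never cross the end of the current stripe in one piece.
        const std::size_t n =
            std::min(bytes, opts.peStripe - offset % opts.peStripe);
        chunks.push_back({pe, offset, n});
        offset += n;
        bytes -= n;
    }
    return chunks;
}
```

For example, with a 100-byte stripe, `basePE` 2, and `skipPEs` 3, a 250-byte write at offset 50 splits into pieces of 50, 100, and 100 bytes destined for PEs 2, 5, and 8.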

Parallel IO
[Figure]

Conclusion
Parallel libraries needn't be call and return
Need to respect resource bounds
Applications can find other work to do
Let developers fully utilize system resources
