The U.S. D.O.E. Exascale Computing Project – Goals and Challenges
Paul Messina, ECP Director
Big Simulation and Big Data Workshop, Indiana University
January 9, 2017
www.ExascaleProject.org
What is the Exascale Computing Project (ECP)?
• As part of the National Strategic Computing Initiative (NSCI), ECP was established to accelerate delivery of a capable exascale computing system that integrates hardware and software capability to deliver approximately 50 times more performance than today’s 20-petaflops machines on mission-critical applications.
  – DOE is a lead agency within NSCI, along with DoD and NSF
  – Deployment agencies: NASA, FBI, NIH, DHS, NOAA
• ECP’s work encompasses
  – applications,
  – system software,
  – hardware technologies and architectures, and
  – workforce development
  to meet scientific and national security mission needs.
Exascale Computing Project, www.exascaleproject.org
What is the Exascale Computing Project?
• A collaborative effort of two US Department of Energy (DOE) organizations:
  – Office of Science (DOE-SC)
  – National Nuclear Security Administration (NNSA)
• A 7-year project to accelerate the development of a capable exascale ecosystem
  – Led by DOE laboratories
  – Executed in collaboration with academia and industry
  – Emphasizing sustained performance on relevant applications
Callout: A capable exascale computing system will have a well-balanced ecosystem (software, hardware, applications).
Exascale Computing Project Goals
• Foster application development: Develop scientific, engineering, and large-data applications that exploit the emerging, exascale-era computational trends caused by the end of Dennard scaling and Moore’s law
• Ease of use: Create software that makes exascale systems usable by a wide variety of scientists and engineers across a range of applications
• Rich exascale ecosystem: Enable by 2021 and 2023 at least two diverse computing platforms with up to 50× more computational capability than today’s 20 PF systems, within a similar size, cost, and power footprint
• US HPC leadership: Help ensure continued American leadership in architecture, software, and applications to support scientific discovery, energy assurance, stockpile stewardship, and nonproliferation programs and policies
What is a capable exascale computing system?
A capable exascale computing system requires an entire computational ecosystem that:
• Delivers 50× the performance of today’s 20 PF systems, supporting applications that deliver high-fidelity solutions in less time and address problems of greater complexity
• Operates in a power envelope of 20–30 MW
• Is sufficiently resilient (perceived fault rate: ≤ 1/week)
• Includes a software stack that supports a broad spectrum of applications and workloads
Callout: This ecosystem will be developed using a co-design approach to deliver new software, applications, platforms, and computational science capabilities at heretofore unseen scale.
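The figures on this slide imply a few derived targets. The sketch below works them out as back-of-the-envelope arithmetic. It assumes that "performance" can be read as a floating-point rate (the slide itself stresses application performance, not peak FLOPS), and all variable and function names are illustrative, not from the talk.

```python
# Back-of-the-envelope targets implied by the slide's numbers.
# Assumption: "performance" is treated as a floating-point rate (FLOP/s).
PETA = 1e15
EXA = 1e18

baseline_flops = 20 * PETA   # "today's 20 PF systems"
speedup = 50                 # "50x the performance"
max_faults_per_week = 1      # "perceived fault rate: <= 1/week"

target_flops = speedup * baseline_flops


def gflops_per_watt(flops, megawatts):
    """Energy efficiency needed to run `flops` within `megawatts` of power."""
    watts = megawatts * 1e6
    return flops / watts / 1e9


# Efficiency required at each end of the 20-30 MW power envelope.
efficiency_at_20mw = gflops_per_watt(target_flops, 20)
efficiency_at_30mw = gflops_per_watt(target_flops, 30)

# A fault rate of at most one per week means at least this many hours
# between faults as perceived by the user.
min_hours_between_faults = 7 * 24 / max_faults_per_week

if __name__ == "__main__":
    print(f"Target rate: {target_flops / EXA:.1f} exaFLOP/s")
    print(f"Efficiency at 20 MW: {efficiency_at_20mw:.1f} GFLOP/s per watt")
    print(f"Efficiency at 30 MW: {efficiency_at_30mw:.1f} GFLOP/s per watt")
    print(f"Minimum hours between faults: {min_hours_between_faults:.0f}")
```

Under these assumptions, 50× a 20 PF baseline is 1 exaFLOP/s, which at 20–30 MW corresponds to roughly 33–50 GFLOP/s per watt, and the resilience goal translates to at least 168 hours between perceived faults.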
ECP has formulated a holistic approach that uses co-design and integration to achieve capable exascale
• Application Development: Science and mission applications
• Software Technology: Scalable and productive software stack
• Hardware Technology: Hardware technology elements
• Exascale Systems: Integrated exascale supercomputers
[Figure: software stack diagram. Components shown: Applications; Co-Design; Correctness; Visualization; Data Analysis; Programming models, development environment, and runtimes; Math libraries and Frameworks; Tools; Resilience; Workflows; System Software (resource management, threading, scheduling, monitoring, and control); Data management, I/O, and file system; Memory and Burst buffer; Node OS, runtimes; Hardware interface]
ECP’s work encompasses applications, system software, hardware technologies and architectures, and workforce development
The ECP Plan of Record
• A 7-year project that follows the holistic/co-design approach and runs through 2023 (including 12 months of schedule contingency)
• Enable an initial exascale system based on advanced architecture and delivered in 2021
• Enable capable exascale systems, based on ECP R&D, delivered in 2022 and deployed in 2023 as part of NNSA and SC facility upgrades
• Acquisition of the exascale systems is outside the ECP scope and will be carried out by DOE-SC and NNSA-ASC facilities
What is an exascale advanced architecture?
[Figure: Computing Capability versus Time (2017–2027). Labels: "Evolution of today’s architectures is on this trajectory"; "First exascale advanced architecture system"; "Capable exascale systems"; capability markers at 5X and 10X]
Reaching the Elevated Trajectory will require Advanced and Innovative Architectures
To reach the elevated trajectory, advanced architectures must be developed that make a big leap in:
  – Parallelism
  – Memory and Storage
  – Reliability
  – Energy Consumption
The exascale advanced architecture will also need to solve emerging data science and machine learning problems, in addition to the traditional modeling and simulation applications.
Callout: The exascale advanced architecture developments benefit all future U.S. systems on the higher trajectory.
High-level ECP technical project schedule
[Figure: schedule chart spanning FY 2016–2026. Rows: Application Development; Software Technology; Hardware Technology; NRE System #1; NRE System #2; Testbeds; Site Prep #1; Exascale System #1; Site Prep #2; Exascale System #2. Annotations: "R&D before facilities procure first system"; "Targeted development for known exascale architectures"; "Facilities activities outside ECP"]
ECP WBS
1. Exascale Computing Project: Paul Messina
• 1.1 Project Management: Kathlyn Boudwin
  – 1.1.1 Project Planning and Management: Kathlyn Boudwin
  – 1.1.2 Project Controls & Risk Management: Monty Middlebrook
  – 1.1.3 Business Management: Dennis Parton
  – 1.1.4 Procurement Management: Willy Besancenez
  – 1.1.5 Information Technology and Quality Management: Doug Collins
  – 1.1.6 Communications & Outreach: Mike Bernhardt
  – 1.1.7 Integration: Julia White
• 1.2 Application Development: Doug Kothe
  – 1.2.1 DOE Science and Energy Apps: Andrew Siegel
  – 1.2.2 DOE NNSA Applications: Bert Still
  – 1.2.3 Other Agency Applications: Doug Kothe
  – 1.2.4 Developer Training and Productivity: Ashley Barker
  – 1.2.5 Co-Design and Integration: Phil Colella
• 1.3 Software Technology: Rajeev Thakur
  – 1.3.1 Programming Models and Runtimes: Rajeev Thakur
  – 1.3.2 Tools: Jeff Vetter
  – 1.3.3 Mathematical and Scientific Libraries and Frameworks: Mike Heroux
  – 1.3.4 Data Management and Workflows: Rob Ross
  – 1.3.5 Data Analytics and Visualization: Jim Ahrens
  – 1.3.6 System Software: Martin Schulz
  – 1.3.7 Resilience and Integrity: Al Geist
  – 1.3.8 Co-Design and Integration: Rob Neely
• 1.4 Hardware Technology: Jim Ang
  – 1.4.1 PathForward Vendor Node and System Design: Bronis de Supinski
  – 1.4.2 Design Space Evaluation: John Shalf
  – 1.4.3 Co-Design and Integration: Jim Ang
  – 1.4.4 LeapForward Vendor Node and System Design: TBD
• 1.5 Exascale Systems: Terri Quinn
  – 1.5.1 NRE: Terri Quinn
  – 1.5.2 Testbeds: Terri Quinn
  – 1.5.3 Co-design and Integration: Susan Coghlan
Science and Industry Councils
• The ECP is in the process of establishing two advisory bodies:
  – An Industry Council composed of ~20 representatives from end-user industries and software vendors
  – A Science Council composed of computer scientists, applied mathematicians, and computational scientists
ECP application, co-design center, and software project awards