Automatic MPI application transformation with ASPhALT


  1. Automatic MPI application transformation with ASPhALT. Anthony Danalis, Lori Pollock, Martin Swany, University of Delaware.

  2. Problem (talk outline: Motivation, Overview, Transformation, Automation, Evaluation, Future Work).

  3. Overall Research Goal. Requirements: ✔ achieve high-performance communication (have your cake); ✔ simplify the MPI code developers write (and eat it too).

  4. Overall Research Goal (cont.). The automatic cake-making machine.

  5. Overall Research Goal (cont.). Proposed solution: an automatic system that transforms simple communication code into efficient code.

  6. Overall Research Goal (cont.). Side effect: enables legacy parallel MPI applications to scale, even if they were written without any knowledge of this system.

  7. Overall Research Goal. Cluster layers: Application, Runtime Libraries, Operating System / Network. ASPhALT: information from multiple layers contributes to source optimization.

  8. Our Framework: ASPhALT. The original application source code passes through a source-to-source optimizer/analyzer, which produces optimized source code; an existing compiler then builds the executable. The analyzer draws on the low-level communication API, system benchmarks, and system parameters, i.e. information from the application, runtime-library, and operating system / network layers.

  9.-14. "Prepushing" Transformation (a step-by-step animation of the transformation).
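  Slides 9-14 build up the "prepushing" idea: start sending each block of results as soon as it is computed, instead of issuing one blocking transfer after the whole loop (this is the pattern the code on slide 18 makes explicit). A minimal Fortran 90 + MPI sketch of that pattern follows; the array sizes, the block size K, and the two-rank producer/consumer layout are assumptions made for this illustration, not details taken from the slides.

    program prepush_sketch
      use mpi
      implicit none
      integer, parameter :: NX = 64, N = 1024, K = 128      ! assumed problem and block sizes
      real(8)            :: a(NX, N)
      integer            :: reqs(N/K), ierr, rank, t, i
      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      reqs = MPI_REQUEST_NULL
      if (rank == 0) then                                   ! producer (run with at least 2 ranks)
         do t = 1, N, K
            do i = t, t + K - 1
               a(:, i) = real(i, 8)                         ! stands in for kernel( a(:,i), ... )
            end do
            ! prepush: start sending this block while later blocks are still being computed
            call MPI_Isend(a(1, t), NX*K, MPI_DOUBLE_PRECISION, 1, t, &
                           MPI_COMM_WORLD, reqs((t - 1)/K + 1), ierr)
         end do
      else if (rank == 1) then                              ! consumer: pre-post matching receives
         do t = 1, N, K
            call MPI_Irecv(a(1, t), NX*K, MPI_DOUBLE_PRECISION, 0, t, &
                           MPI_COMM_WORLD, reqs((t - 1)/K + 1), ierr)
         end do
      end if
      call MPI_Waitall(N/K, reqs, MPI_STATUSES_IGNORE, ierr)
      call MPI_Finalize(ierr)
    end program prepush_sketch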

  15. Communication Aggregation vs. Performance. Traditional approach: low overhead + communication aggregation + high bandwidth = high performance.

  16. Communication Aggregation vs. Performance (cont.). Why does our communication segmentation work? It has high overhead, but the overhead falls on the network, not the CPU; the fine-grain communication has lower effective bandwidth, but the transfers are overlapped with the application (i.e. the CPU is not idle), so performance remains high.
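  As a back-of-the-envelope illustration of this trade-off (the numbers are invented for exposition, not measurements from the paper): suppose a phase computes for 80 ms and then needs a single aggregated transfer that takes 20 ms, so the blocking version costs about 100 ms. If the same data is instead pushed in ten segments whose combined transfer time grows to 30 ms because of per-message overhead, but every segment except the last is hidden behind the remaining computation, the visible cost is roughly 80 ms plus the 3 ms tail of the final segment, about 83 ms, despite the higher overhead and lower effective bandwidth.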

  17. Transformer Prototype.

  18. Fortran Semantics vs. MPI Semantics

  Before ASPhALT:

    sArray[ NX, NY ]
    DO I = 1, N
       kernel( sArray[ :, I ], ... )
    END DO
    synchrnsTransfer( sArray[ :, : ], rArray[ :, : ] )

  After ASPhALT:

    sArray[ NX, NY ], rArray[ NX, NY ]
    DO T = 1, N, K
       DO P = 1, NPROC
          S = F( NX, P, NPROC )
          E = G( NX, P, NPROC )
          asynchRecvInit( rArray[ S:E, T:T+K-1 ], req[ T/K ] )
       END DO
       DO I = T, T+K-1
          kernel( sArray[ :, I ], ... )
       END DO
       DO P = 1, NPROC
          S = F( NX, P, NPROC )
          E = G( NX, P, NPROC )
          asynchSendInit( sArray[ S:E, T:T+K-1 ] )
       END DO
       IF( T/K > D ) THEN
          wait( request[ T/K - D ] )
       END IF
    END DO
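  The calls asynchRecvInit, asynchSendInit and wait above stand for non-blocking communication initiation and completion; in MPI terms they map naturally onto MPI_Irecv, MPI_Isend and MPI_Wait. As a sketch only (this wrapper and its interface are assumed for illustration and are not the ASPhALT runtime API), a send-initiation call could expand to:

    ! Hypothetical wrapper: what a call like asynchSendInit( sArray[S:E, T:T+K-1] )
    ! could expand to in plain MPI. The interface is assumed for illustration only.
    subroutine asynch_send_init(buf, count, dest, tag, req)
      use mpi
      implicit none
      real(8), intent(in)  :: buf(*)     ! contiguous data to transfer
      integer, intent(in)  :: count, dest, tag
      integer, intent(out) :: req        ! request handle, completed later by MPI_Wait
      integer              :: ierr
      call MPI_Isend(buf, count, MPI_DOUBLE_PRECISION, dest, tag, MPI_COMM_WORLD, req, ierr)
    end subroutine asynch_send_init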

  19.-20. Fortran Semantics vs. MPI Semantics (the same code as slide 18, annotated to show which statements execute on processes P1 and P2).

  21. Fortran Semantics vs. MPI Semantics

  After ASPhALT, the transfer calls operate directly on array slices (as on slide 18). After the FORTRAN compiler, each array slice passed to a call becomes an implicit copy into a contiguous temporary:

    TEMP1[ : ] = rArray[ S:E, T:T+K-1 ]
    asynchRecvInit( TEMP1[ : ] )
    rArray[ S:E, T:T+K-1 ] = TEMP1[ : ]

    TEMP2[ : ] = sArray[ S:E, T:T+K-1 ]
    asynchSend( TEMP2[ : ] )
    sArray[ S:E, T:T+K-1 ] = TEMP2[ : ]

  An array slice implies an implicit copy so that the data is contiguous.

  22. Fortran Semantics vs. MPI Semantics: Potential Problems. In the compiler-generated code:

    TEMP1[ : ] = rArray[ S:E, T:T+K-1 ]
    asynchRecvInit( TEMP1[ : ] )
    rArray[ S:E, T:T+K-1 ] = TEMP1[ : ]

  TEMP1 is copied back into rArray, but the data has not arrived yet.

    TEMP2[ : ] = sArray[ S:E, T:T+K-1 ]
    asynchSend( TEMP2[ : ] )
    sArray[ S:E, T:T+K-1 ] = TEMP2[ : ]

  Data-flow analysis allows the F90 compiler to re-define TEMP2 after this copy, but the data has not departed yet.
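  One standard MPI way to avoid compiler-generated temporaries like TEMP1 and TEMP2 altogether, noted here as general background rather than as what ASPhALT implements, is to describe the strided array section with an MPI derived datatype, so the original array is handed to the non-blocking call in place and there is no copy-in/copy-out to race against the transfer. A minimal sketch, with the routine name and interface assumed for illustration:

    ! Sketch only: the routine name and interface are assumptions, not ASPhALT's API.
    subroutine isend_section(sArray, NX, S, E, T, K, dest, tag, req)
      use mpi
      implicit none
      integer, intent(in)  :: NX, S, E, T, K, dest, tag
      real(8), intent(in)  :: sArray(NX, *)
      integer, intent(out) :: req
      integer              :: sectype, ierr
      ! Describe sArray(S:E, T:T+K-1): K blocks of (E-S+1) elements, NX elements apart,
      ! so MPI reads the strided data in place and no temporary copy is generated.
      call MPI_Type_vector(K, E - S + 1, NX, MPI_DOUBLE_PRECISION, sectype, ierr)
      call MPI_Type_commit(sectype, ierr)
      call MPI_Isend(sArray(S, T), 1, sectype, dest, tag, MPI_COMM_WORLD, req, ierr)
      call MPI_Type_free(sectype, ierr)  ! safe once the send is started; it still completes normally
    end subroutine isend_section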
