A C++/CUDA DSL for Object-oriented Programming with Structure-of-Arrays Layout

Matthias Springer
Tokyo Institute of Technology
CGO 2018, ACM Student Research Competition

AOS vs. SOA

● AOS: Array of Structures

    struct Body {
      float pos_x, pos_y, vel_x, vel_y;

      void move(float dt) {
        pos_x += vel_x * dt;
        pos_y += vel_y * dt;
      }
    };

    Body bodies[128];

● SOA: Structure of Arrays

    float pos_x[128], pos_y[128], vel_x[128], vel_y[128];

    void move(int id, float dt) {
      pos_x[id] += vel_x[id] * dt;
      pos_y[id] += vel_y[id] * dt;
    }

SOA: Good for caching, vectorization, parallelization

CGO'18 SRC: A C++/CUDA DSL for OOP with SOA
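The caching and vectorization claim can be made concrete by measuring how far apart the pos_x values of two neighboring bodies lie in each layout. The following is a minimal sketch, not code from the slides: the names BodyAos, kNumBodies, bodies_aos, pos_x_soa, aos_stride, soa_stride and check_strides are illustrative assumptions, and only the field layout is taken from the slide.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative AOS struct with the same four fields as the slide's Body.
struct BodyAos {
  float pos_x, pos_y, vel_x, vel_y;
};

constexpr std::size_t kNumBodies = 128;

BodyAos bodies_aos[kNumBodies];  // AOS: one struct per body
float pos_x_soa[kNumBodies];     // SOA: one array per field (only pos_x shown)

// Bytes between the pos_x values of two neighboring bodies in AOS layout.
std::size_t aos_stride() {
  const char* first = reinterpret_cast<const char*>(&bodies_aos[0].pos_x);
  const char* second = reinterpret_cast<const char*>(&bodies_aos[1].pos_x);
  return static_cast<std::size_t>(second - first);
}

// Bytes between the pos_x values of two neighboring bodies in SOA layout.
std::size_t soa_stride() {
  const char* first = reinterpret_cast<const char*>(&pos_x_soa[0]);
  const char* second = reinterpret_cast<const char*>(&pos_x_soa[1]);
  return static_cast<std::size_t>(second - first);
}

// In AOS, a loop over pos_x skips over a whole struct per step;
// in SOA, the values are contiguous.
void check_strides() {
  assert(aos_stride() == sizeof(BodyAos));
  assert(soa_stride() == sizeof(float));
}
```

A loop that reads only pos_x therefore uses every byte of each cache line it loads in the SOA layout, while in the AOS layout the other three fields of each body are loaded alongside it.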


AOS vs. SOA (continued)

Drawbacks of the hand-written SOA version:

● IDs instead of pointers
● No member-of-object / member-of-pointer operator
● No constructors, no new keyword
● No inheritance
● No virtual function calls

Embedded C++ DSL

    class Body : public SOA<Body> {
     public:
      INITIALIZE_CLASS

      float_ pos_x = 0.0;
      float_ pos_y = 0.0;
      float_ vel_x = 1.0;
      float_ vel_y = 1.0;

      Body(float x, float y) : pos_x(x), pos_y(y) {}

      void move(float dt) {
        pos_x = pos_x + vel_x * dt;
        pos_y = pos_y + vel_y * dt;
      }
    };

    HOST_STORAGE(Body, 128);

Use this class like any other C++ class:

    void create_and_move() {
      Body* b = new Body(1.0, 2.0);
      b->move(0.5);
      assert(b->pos_x == 1.5);
    }

Embedded C++ DSL (continued)

“Parallel” API (CPU+GPU), using the same Body class and HOST_STORAGE(Body, 128) declaration:

    Body* q = Body::make(10, 1.0, 2.0);
    forall(&Body::move, q, 10, 0.5);
    forall(&Body::move, 0.5);
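The slides do not show how forall is implemented. As a rough illustration of the calling convention only, the sketch below applies a member function to a run of consecutive objects sequentially on the CPU. Everything here is an assumption: PlainBody is an ordinary struct standing in for the DSL class, and forall_sketch and move_ten_bodies are hypothetical names. The DSL's own forall presumably runs in parallel and works on SOA storage.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain C++ stand-in for the DSL class (ordinary AOS struct).
struct PlainBody {
  float pos_x = 0.0f, pos_y = 0.0f, vel_x = 1.0f, vel_y = 1.0f;

  void move(float dt) {
    pos_x += vel_x * dt;
    pos_y += vel_y * dt;
  }
};

// Sequential sketch: call method(args...) on `count` consecutive objects
// starting at `first`. No parallelism and no SOA storage are involved here.
template <typename T, typename... Params, typename... Args>
void forall_sketch(void (T::*method)(Params...), T* first, std::size_t count,
                   Args... args) {
  for (std::size_t i = 0; i < count; ++i) {
    (first[i].*method)(args...);
  }
}

// Moves ten bodies placed at (1.0, 2.0) by dt = 0.5 and returns the
// x position of the last one.
float move_ten_bodies() {
  PlainBody bodies[10];
  for (PlainBody& b : bodies) {
    b.pos_x = 1.0f;
    b.pos_y = 2.0f;
  }
  forall_sketch(&PlainBody::move, bodies, 10, 0.5);
  return bodies[9].pos_x;
}
```

Passing the member function as a pointer-to-member keeps the call site close to the slide's notation: the function to apply comes first, followed by the range and the arguments to forward.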

Implementation Outline

Field declarations of the Body class:

    float_ pos_x = 0.0;
    float_ pos_y = 0.0;
    float_ vel_x = 1.0;
    float_ vel_y = 1.0;

During assignment of a float to a field, or conversion of a field to float: calculate the physical memory location inside the buffer.

Storage behind HOST_STORAGE(Body, 128), sized for 128 objects of 16 bytes each:

    char buffer[128 * 16];


Implementation Outline (continued)

e.g.: float x = b127->vel_x;

[Figure: location of the value inside buffer, composed of the beginning of the field's array plus the offset into that array]

float_ is a macro:

    float_ vel_x;   =>   Field<float, 8> vel_x;

The macro keeps track of field offsets.

Implementation Outline (continued)

“Fake” pointers encode IDs:

    int Body::id() {
      return (int) this;
    }
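The address calculation outlined above can be sketched as follows. This is not the DSL's implementation. To stay within defined C++ behavior, the sketch carries the object ID in an explicit proxy object instead of encoding it in a fake this pointer. The names kCapacity, kObjectSize, field_address, FieldRef and write_then_read_vel_x are hypothetical. The layout rule is also an assumption, chosen to match the figure and the 128 * 16 byte buffer: field arrays are stored one after another, so the array of a field with AOS byte offset Offset begins at Offset * capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 4-byte float");

constexpr std::size_t kCapacity = 128;   // as in HOST_STORAGE(Body, 128)
constexpr std::size_t kObjectSize = 16;  // four float fields of 4 bytes each
char buffer[kCapacity * kObjectSize];

// Address of the field with AOS byte offset `Offset` for object `id`.
template <typename T, std::size_t Offset>
char* field_address(std::size_t id) {
  assert(id < kCapacity);
  char* array_begin = buffer + Offset * kCapacity;  // beginning of array
  return array_begin + id * sizeof(T);              // offset into array
}

// Hypothetical proxy for one field of one object. It stores the ID
// explicitly, whereas the slides derive it from a fake pointer.
template <typename T, std::size_t Offset>
class FieldRef {
 public:
  explicit FieldRef(std::size_t id) : id_(id) {}

  // Conversion to T reads the value from the SOA buffer.
  operator T() const {
    T value;
    std::memcpy(&value, field_address<T, Offset>(id_), sizeof(T));
    return value;
  }

  // Assignment of a T writes the value into the SOA buffer.
  FieldRef& operator=(T value) {
    std::memcpy(field_address<T, Offset>(id_), &value, sizeof(T));
    return *this;
  }

 private:
  std::size_t id_;
};

// Round trip for the slide's example: object 127, field vel_x (offset 8).
float write_then_read_vel_x() {
  FieldRef<float, 8> vel_x(127);
  vel_x = 2.5f;
  float x = vel_x;
  return x;
}
```

Because Offset and kCapacity are compile-time constants, the start of each field array is a constant as well, which is the kind of expression a compiler can fold into a single indexed load.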

Performance Evaluation

    float codegen_test(Body* ptr) {
      return ptr->vel_x;
    }

Same performance (and assembly code) as hand-written SOA code (gcc 5.4.0, clang 3.8).
→ Compilers can understand and optimize this code (mainly constant folding).

    0000000000400690 <_Z11codegen_testP9Body>:
      400690: 8b 04 bd 60 10 60 00    mov    0x601060(,%rdi,4),%eax
      400697: c3                      retq
      400698: 0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
      40069f: 00
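For comparison, a hand-written SOA counterpart of codegen_test might look like the sketch below. The slides do not show this code, so kNumBodies, vel_x, codegen_test_handwritten and read_last_velocity are assumed names. The single indexed load in the disassembly above (constant base plus four times the index) has the shape one would expect from an array access like this one.

```cpp
#include <cassert>

constexpr int kNumBodies = 128;
float vel_x[kNumBodies];  // hand-written SOA: one global array per field

// Hand-written counterpart of codegen_test: the object is identified by its
// ID, and the field access is a plain array lookup.
float codegen_test_handwritten(int id) {
  assert(id >= 0 && id < kNumBodies);
  return vel_x[id];
}

// Stores a velocity for the last body and reads it back through the accessor.
float read_last_velocity() {
  vel_x[kNumBodies - 1] = 1.0f;
  return codegen_test_handwritten(kNumBodies - 1);
}
```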

Performance Evaluation (continued)

    forall(&Body::move, 0.5);

Compiler hints are necessary for auto-vectorization:

● gcc: constexpr “hints”
● clang: No luck so far (problems with alias analysis)

[Figure panels: CPU, GPU]

Related Work

● ASX: Array of Structures eXtended
  Robert Strzodka. Abstraction for AoS and SoA Layout in C++. In GPU Computing Gems Jade Edition, pp. 429-441, 2012.
● SoAx
  Holger Homann, Francois Laenen. SoAx: A generic C++ Structure of Arrays for handling particles in HPC code. Comp. Phys. Comm., Vol. 224, pp. 325-332, 2018.
● Intel SPMD Compiler (ispc)
  Matt Pharr, William R. Mark. ispc: A SPMD compiler for high-performance CPU programming. In Innovative Parallel Computing (InPar), 2012.

Summary

● Embedded C++/CUDA DSL for SOA Layout
● OOP Features (pointers instead of IDs, member function calls, constructors, ...)
● Notation close to standard C++
● Implemented in C++, no external tools required
● Challenges/Future Work: Compiler optimizations (ROSE Compiler), inheritance, virtual function calls
