programmability in the era of parallel computing

Programmability in the Era of Parallel Computing - Per Stenström



  1. Programmability in the Era of Parallel Computing. Per Stenström, Department of Computer Science and Engineering, Chalmers University of Technology, Sweden

  2. Multicore Scaling [Figure: cores/chip over time, with axis marks at 1990, 2000, 2014, and 2020 and labels for 1 core, 16 cores, and 100s of cores; the later points are marked as predictions] By 2020, several hundred powerful cores/chip. Source: “Computer Performance: Game Over or Next Level”, IEEE Computer, Jan 2011

  3. Programmability

  4. High-Productivity Software Design in the Multi/Many-core Era • Plug & play: end user • Productivity programming languages (e.g. C/C++, Java): productivity programmers • System-near programming: efficiency programmers

  5. High-Productivity Software Stack for Multi/Many-core Systems • Software components oblivious to parallelism: productivity programmers • Runtime with parallelism capabilities: efficiency-only programmers • Hardware primitives (e.g. TM): computer architects • The level of abstraction increases from the hardware primitives up to the software components

  6. Topic 1: Task-based programming models

  7. Task-based Dataflow Programming Models

         #pragma css task output(a)
         void TaskA(float a[M][M]);
         #pragma css task input(a)
         void TaskB(float a[M][M]);
         #pragma css task input(a)
         void TaskC(float a[M][M]);

     [Diagram: task graph in which TaskA feeds TaskB and TaskC]
     • Programmer annotations for task dependences
     • Annotations used by the run-time for scheduling
     • Dataflow task graph constructed dynamically
     Hypothesis: Programmers focus on extracting parallelism, the system delivers performance. BUT: Is this a good idea?
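The annotation style on this slide can be approximated with OpenMP task dependences, which play a similar role to the `css` pragmas. The following is a minimal sketch, not the slide's own runtime: it assumes OpenMP 4.0 or later, picks a hypothetical size `M = 4`, and adds hypothetical result parameters (`sum`, `max`) to `TaskB` and `TaskC` so the effect is observable. The helper name `run_task_graph` is also an assumption.

```c
#include <stddef.h>

#define M 4 /* hypothetical size; the slide leaves M unspecified */

/* Producer: writes every element of a (the slide's output(a)). */
static void TaskA(float a[M][M]) {
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < M; j++)
            a[i][j] = (float)(i * M + j);
}

/* Consumer: reads a and stores the sum of its elements (input(a)).
   The sum parameter is an assumption added for illustration. */
static void TaskB(float a[M][M], float *sum) {
    float s = 0.0f;
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < M; j++)
            s += a[i][j];
    *sum = s;
}

/* Consumer: reads a and stores its largest element (input(a)).
   The max parameter is an assumption added for illustration. */
static void TaskC(float a[M][M], float *max) {
    float m = a[0][0];
    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < M; j++)
            if (a[i][j] > m)
                m = a[i][j];
    *max = m;
}

/* Hypothetical driver: creates the three tasks and lets the runtime
   order them from the declared dependences on a. */
void run_task_graph(float *sum, float *max) {
    static float a[M][M];

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: a) /* TaskA produces a */
        TaskA(a);

        #pragma omp task depend(in: a)  /* TaskB waits for TaskA */
        TaskB(a, sum);

        #pragma omp task depend(in: a)  /* TaskC waits for TaskA, may overlap TaskB */
        TaskC(a, max);

        #pragma omp taskwait            /* wait for all three tasks */
    }
}
```

Built with an OpenMP-enabled compiler (for example `-fopenmp` with GCC or Clang), the runtime may run `TaskB` and `TaskC` concurrently once `TaskA` has finished. Without OpenMP the pragmas are ignored and the three calls simply run in program order, which gives the same result.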

  8. Topic 2: Transactional memory

  9. Transactional Memory (TM)
     • Transactional memory semantics:
       – Atomicity, consistency, and isolation
       – Tx_begin/Tx_end primitives
     • Allows for concurrency inside critical sections
     • Software implementations are too slow
     • Hardware implementations are complex but have been adopted (IBM Blue Gene, Intel Haswell)
     • 100s of papers in the open literature; design space fairly well understood
     [Diagram: two transactions, TX 1 and TX 2, access A (W A, R A); a commit leads to a data conflict, followed by re-execution (R A)]
     Hypothesis: Simplifies things for programmers, but is this a good idea?
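The execute, commit, conflict, and re-execute cycle from this slide can be illustrated on a single memory word with C11 atomics. This is only a sketch of the optimistic pattern, not a transactional memory implementation and not the API of any TM system: the names `tx_cell`, `tx_update`, and `deposit_five` are hypothetical, and conflicts are detected by comparing values rather than by tracking read and write sets.

```c
#include <stdatomic.h>

/* Hypothetical one-word shared cell standing in for transactional data. */
typedef struct {
    _Atomic long value;
} tx_cell;

/* Hypothetical update function applied inside the "transaction". */
static long deposit_five(long balance) {
    return balance + 5;
}

/* Runs update() optimistically on the cell and returns the number of
   re-executions that were needed before the commit succeeded. */
static int tx_update(tx_cell *c, long (*update)(long)) {
    int retries = 0;
    /* Plays the role of Tx_begin: take a snapshot of the data. */
    long snapshot = atomic_load(&c->value);
    for (;;) {
        /* Speculative work on the private snapshot. */
        long result = update(snapshot);
        /* Plays the role of Tx_end: commit only if the cell still holds
           the snapshot value; otherwise another thread committed first. */
        if (atomic_compare_exchange_strong(&c->value, &snapshot, result))
            return retries;
        /* Conflict: snapshot now holds the fresh value, so re-execute. */
        retries++;
    }
}
```

Calling `tx_update(&account, deposit_five)` from several threads leaves the cell with every deposit applied, with losers of a conflict redoing their work. A real TM system generalizes this to many locations at once, which is where the hardware and software complexity mentioned on the slide comes from.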
