Technische Universität München
Parallel Programming and High-Performance Computing
Part 1: Introduction
Dr. Ralf-Peter Mundani, CeSIM / IGSSE / CiE, Technische Universität München
1 Introduction: General Remarks

- Ralf-Peter Mundani
  - email mundani@tum.de, phone 289-25057, room 3181 (city centre)
  - consultation hour: by appointment
- Atanas Atanasov
  - email atanasoa@in.tum.de, phone 289-18615, room 02.05.036
- lecture (2 SWS): weekly, Tuesday, 14:00-15:30, room 02.07.23
- exercise (1 SWS): fortnightly, Wednesday, 08:30-10:00, room 02.07.23
- materials: http://www5.in.tum.de/
1 Introduction: General Remarks

- content
  - part 1: introduction
  - part 2: high-performance networks
  - part 3: foundations
  - part 4: programming memory-coupled systems
  - part 5: programming message-coupled systems
  - part 6: dynamic load balancing
  - part 7: examples of parallel algorithms
1 Introduction: Overview

- motivation
- hardware excursion
- supercomputers
- classification of parallel computers
- levels of parallelism
- quantitative performance evaluation

"I think there is a world market for maybe five computers."
(Thomas Watson, chairman of IBM, 1943)
1 Introduction: Motivation

- numerical simulation: from phenomena to predictions
  - starting point: a physical phenomenon or technical process
  1. modelling: determination of parameters, expression of relations
  2. numerical treatment: model discretisation, algorithm development
  3. implementation: software development, parallelisation
  4. visualisation: illustration of abstract simulation results
  5. validation: comparison of results with reality
  6. embedding: insertion into the working process
  - disciplines involved along this pipeline: mathematics, computer science, and the application domain
1 Introduction: Motivation

- why numerical simulation?
  - because experiments are sometimes impossible
    - e.g. life cycle of galaxies, weather forecast, terror attacks such as the bomb attack on the WTC (1993)
  - because experiments are sometimes not welcome
    - e.g. avalanches, nuclear tests, medicine
1 Introduction: Motivation

- why numerical simulation? (cont'd)
  - because experiments are sometimes very costly and time-consuming
    - e.g. protein folding, material sciences, the Mississippi basin model (Jackson, MS)
  - because experiments are sometimes more expensive than a simulation
    - e.g. aerodynamics, crash tests
1 Introduction: Motivation

- why parallel programming and HPC?
  - complex problems (especially the so-called "grand challenges") demand more computing power
    - climate or geophysics simulation (e.g. tsunami)
    - structure or flow simulation (e.g. crash test)
    - development systems (e.g. CAD)
    - large data analysis (e.g. the Large Hadron Collider at CERN)
    - military applications (e.g. cryptanalysis)
    - ...
  - performance increase due to
    - faster hardware, more memory ("work harder")
    - more efficient algorithms, optimisation ("work smarter")
    - parallel computing ("get some help")
1 Introduction: Motivation

- objectives (assuming all resources were available N times over)
  - throughput: compute N problems simultaneously
    - running N instances of a sequential program with different data sets ("embarrassing parallelism"); e.g. SETI@home
    - drawback: limited resources of single nodes
  - response time: compute one problem in a fraction (1/N) of the time
    - running one instance (i.e. N processes) of a parallel program for jointly solving a problem; e.g. finding prime numbers (see the sketch below)
    - drawback: writing a parallel program; communication
  - problem size: compute one problem with N-times larger data
    - running one instance (i.e. N processes) of a parallel program, using the sum of all local memories to compute larger problem sizes; e.g. the iterative solution of systems of linear equations (SLE)
    - drawback: writing a parallel program; communication
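As a concrete illustration of the response-time objective, here is a minimal sketch (not from the slides) of N processes jointly counting prime numbers with MPI, which is only introduced in part 5. The limit of 1,000,000 and the interleaved distribution of candidates are assumptions made purely for illustration.

```c
/* Minimal sketch of the "response time" objective: N MPI processes
 * jointly count the primes below LIMIT. Compile with an MPI wrapper,
 * e.g. mpicc, and run with mpirun -np 4 ./primes.                     */
#include <mpi.h>
#include <stdio.h>

#define LIMIT 1000000L   /* assumed problem size for illustration */

static int is_prime(long n)
{
    if (n < 2) return 0;
    for (long d = 2; d * d <= n; ++d)
        if (n % d == 0) return 0;
    return 1;
}

int main(int argc, char **argv)
{
    int rank, size;
    long local = 0, total = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* each of the N processes tests every N-th candidate */
    for (long n = 2 + rank; n < LIMIT; n += size)
        local += is_prime(n);

    /* combine the partial counts on process 0 */
    MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("%ld primes below %ld (using %d processes)\n",
               total, (long)LIMIT, size);

    MPI_Finalize();
    return 0;
}
```

Ideally, doubling the number of processes halves the response time; the price, as the slide notes, is writing the parallel program and paying for the communication (here the final reduction).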
1 Introduction: Overview

- motivation
- hardware excursion
- supercomputers
- classification of parallel computers
- levels of parallelism
- quantitative performance evaluation
1 Introduction: Hardware Excursion

- definition of parallel computers
  - "A collection of processing elements that communicate and cooperate to solve large problems" (Almasi and Gottlieb, 1989)
- possible appearances of such processing elements
  - specialised units (e.g. the steps of a vector pipeline)
  - parallel features in modern monoprocessors (instruction pipelining, superscalar architectures, VLIW, multithreading, multicore, ...)
  - several uniform arithmetical units (e.g. the processing elements of array computers or GPUs)
  - complete stand-alone computers connected via LAN (workstation or PC clusters, so-called virtual parallel computers)
  - parallel computers or clusters connected via WAN (so-called metacomputers)
1 Introduction: Hardware Excursion

- reminder: arithmetic logical unit (ALU)
  - schematic layout of the (classical 32-bit) ALU: two operand registers A and B feed the ALU, the result is written to register C, i.e. C ← A ⊗ B for an arithmetic operation ⊗; the registers are connected to main memory via a 32-bit data bus
1 Introduction: Hardware Excursion

- reminder: memory hierarchy
  - levels, from fast and small to slow and large (access speed decreases, capacity increases): register, cache, main memory, background memory, archive memory
  - data moves between the levels at different granularities: single accesses (register/cache), blocks (cache/main memory, see the illustration below), pages (main memory/background memory), serial accesses (background/archive memory)
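A small, hedged illustration (not part of the slide) of why the block-wise transfers between cache and main memory matter: traversing a matrix along its storage order reuses each cached block, traversing across it does not. The size N = 2048 is an arbitrary choice, merely large enough that the matrix does not fit into a typical cache.

```c
/* Cache illustration: C stores the matrix row by row, so the row-wise
 * loop touches consecutive addresses and reuses every cached block,
 * while the column-wise loop jumps N doubles per access and triggers
 * far more (slow) block transfers from main memory.                   */
#include <stdio.h>
#include <stdlib.h>

#define N 2048   /* assumed size, chosen to exceed typical cache sizes */

int main(void)
{
    double *a = calloc((size_t)N * N, sizeof *a);
    double sum = 0.0;
    if (a == NULL) return 1;

    /* row-wise traversal: cache friendly, consecutive accesses */
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            sum += a[i * N + j];

    /* column-wise traversal: cache unfriendly, stride-N accesses */
    for (int j = 0; j < N; ++j)
        for (int i = 0; i < N; ++i)
            sum += a[i * N + j];

    printf("%f\n", sum);
    free(a);
    return 0;
}
```

Timing the two loop nests separately typically shows the column-wise version to be several times slower, although the exact factor depends on the machine.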
1 Introduction: Hardware Excursion

- instruction pipelining
  - instruction execution involves several operations
    1. instruction fetch (IF)
    2. decode (DE)
    3. fetch operands (OP)
    4. execute (EX)
    5. write back (WB)
    which are executed successively
  - hence, only one part of the CPU is busy at any given moment: instruction N + 1 starts its IF stage only after instruction N has completed all five stages
1 Introduction: Hardware Excursion

- instruction pipelining (cont'd)
  - observation: while one stage of an instruction is being processed, the hardware for all other stages is idle
  - hence, multiple instructions can be overlapped in execution: instruction pipelining (similar to assembly lines)
  - advantage: no additional hardware necessary
  - timing sketch: in every cycle a new instruction enters the pipeline, so instructions N, N + 1, N + 2, ... pass through IF, DE, OP, EX, WB shifted by one stage each (a short estimate of the gain follows below)
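A short back-of-the-envelope estimate, under the simplifying assumptions (not stated on the slide) of one clock cycle per stage and no pipeline stalls:

```latex
% k pipeline stages, N instructions, one cycle per stage, no stalls
T_{\text{sequential}} = k \cdot N, \qquad
T_{\text{pipelined}}  = k + (N - 1), \qquad
S(N) = \frac{k \cdot N}{k + N - 1} \;\xrightarrow{\;N \to \infty\;}\; k
% example: k = 5, N = 100  =>  500 cycles vs. 104 cycles, speedup ~ 4.8
```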
1 Introduction: Hardware Excursion

- superscalar
  - a CPU containing several ALUs might execute several instructions in parallel, with static or dynamic (i.e. out-of-order execution) scheduling
  - timing sketch: in a 2-way superscalar pipeline, two instructions (N and N + 1) enter the pipeline per cycle and pass through IF, DE, OP, EX, WB side by side, followed by N + 2 and N + 3 one cycle later, and so on up to instruction N + 9 in the sketch (a short estimate follows below)
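Extending the estimate above to an idealised superscalar pipeline that issues w instructions per cycle (again assuming no dependencies or stalls, which real programs rarely satisfy):

```latex
% k stages, issue width w, N instructions, no dependencies or stalls
T_{\text{superscalar}} \approx k + \left\lceil \tfrac{N}{w} \right\rceil - 1
% example matching the sketch: k = 5, w = 2, N = 10
% => 5 + 5 - 1 = 9 cycles, instead of 14 (scalar pipeline) or 50 (sequential)
```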
1 Introduction: Hardware Excursion

- very long instruction word (VLIW)
  - in contrast to superscalar architectures, the compiler groups instructions that can be executed in parallel during compilation (pipelining is still possible)
  - a VLIW instruction bundles several slots (instr. 1 to instr. 4 in the sketch) that operate on a shared set of registers
  - advantage: no additional hardware logic necessary
  - drawback: the slots cannot always be filled, so unused slots are padded with dummy operations (NOPs); see the fragment below
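A small C fragment may illustrate the drawback; the 4-slot width matches the sketch but is otherwise an assumption, and the grouping is of course done by the VLIW compiler, not by the programmer:

```c
/* Illustration for a hypothetical 4-slot VLIW: the compiler packs
 * independent operations into one long instruction word.              */

void independent(int *a, int *b, int *c, int *d)
{
    /* four independent additions: a 4-slot VLIW compiler could place
     * them into a single bundle  [ add | add | add | add ]            */
    a[0] += 1;
    b[0] += 2;
    c[0] += 3;
    d[0] += 4;
}

int dependent(int x)
{
    /* a chain of dependent operations: every step needs the previous
     * result, so most slots of each bundle stay empty (NOP filling)   */
    x = x + 1;
    x = x * 3;
    x = x - 5;
    return x;
}
```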