Technische Universität München
Parallel Programming and High-Performance Computing
Part 1: Introduction
Dr. Ralf-Peter Mundani, CeSIM / IGSSE
1 Introduction: General Remarks

• materials: http://www5.in.tum.de/lehre/vorlesungen/parhpp/SS08/
• Dr. Ralf-Peter Mundani
  – email mundani@tum.de, phone 289-25057, room 3181 (city centre)
  – consultation hour: Tuesday, 4:00–6:00 pm (room 02.05.058)
• Ioan Lucian Muntean
  – email muntean@in.tum.de, phone 289-18692, room 02.05.059
• lecture (2 SWS)
  – weekly
  – Tuesday, start at 12:15 pm, room 02.07.023
• exercises (1 SWS)
  – fortnightly
  – Wednesday, start at 4:45 pm, room 02.07.023

Dr. Ralf-Peter Mundani · Parallel Programming and High-Performance Computing · Summer Term 2008
General Remarks (cont'd)

• content
  – part 1: introduction
  – part 2: high-performance networks
  – part 3: foundations
  – part 4: programming memory-coupled systems
  – part 5: programming message-coupled systems
  – part 6: dynamic load balancing
  – part 7: examples of parallel algorithms
Overview

• motivation
• classification of parallel computers
• levels of parallelism
• quantitative performance evaluation

"I think there is a world market for maybe five computers."
(Thomas Watson, chairman of IBM, 1943)
Motivation

• numerical simulation: from phenomena to predictions
  – starting point: a physical phenomenon or technical process
  1. modelling: determination of parameters, expression of relations
  2. numerical treatment: model discretisation, algorithm development
  3. implementation: software development, parallelisation
  4. visualisation: illustration of abstract simulation results
  5. validation: comparison of results with reality
  6. embedding: insertion into the working process
  – disciplines involved: mathematics, computer science, application
Motivation (cont'd)

• why parallel programming and HPC?
  – complex problems (especially the so-called "grand challenges") demand more computing power
    • climate or geophysics simulation (e. g. tsunami)
    • structure or flow simulation (e. g. crash test)
    • development systems (e. g. CAD)
    • large data analysis (e. g. Large Hadron Collider at CERN)
    • military applications (e. g. cryptanalysis)
    • …
  – performance increase due to
    • faster hardware, more memory ("work harder")
    • more efficient algorithms, optimisation ("work smarter")
    • parallel computing ("get some help")
Motivation (cont'd)

• objectives (in case all resources were available N times)
  – throughput: compute N problems simultaneously
    • running N instances of a sequential program with different data sets ("embarrassing parallelism"); e. g. SETI@home
    • drawback: limited resources of single nodes
  – response time: compute one problem in a fraction (1/N) of the time
    • running one instance (i. e. N processes) of a parallel program for jointly solving a problem; e. g. finding prime numbers
    • drawback: writing a parallel program; communication
  – problem size: compute one problem with N-times larger data
    • running one instance (i. e. N processes) of a parallel program, using the sum of all local memories for computing larger problem sizes; e. g. iterative solution of systems of linear equations (SLE)
    • drawback: writing a parallel program; communication
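The throughput objective above can be sketched in a few lines. This is an illustrative example (not from the slides): a sequential kernel (here, hypothetically, counting primes by trial division) is run as N independent instances on N different data sets via Python's multiprocessing module; the instances never communicate, which is exactly what makes the problem "embarrassingly parallel".

```python
# Sketch of the "throughput" objective: N instances of a sequential
# program, each with its own data set, running in parallel without
# any communication between them.
from multiprocessing import Pool

def count_primes(limit: int) -> int:
    """Sequential kernel: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    data_sets = [10_000, 20_000, 30_000, 40_000]  # hypothetical inputs
    with Pool(processes=4) as pool:
        # one independent instance of the kernel per data set
        results = pool.map(count_primes, data_sets)
    print(results)  # → [1229, 2262, 3245, 4203]
```

Note the drawback the slide mentions: each instance is still limited by the resources (CPU speed, memory) of the single node it runs on.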
Classification of Parallel Computers

• definition: "A collection of processing elements that communicate and cooperate to solve large problems" (Almasi and Gottlieb, 1989)
• possible appearances of such processing elements
  – specialised units (e. g. the steps of a vector pipeline)
  – parallel features in modern monoprocessors (superscalar architectures, instruction pipelining, VLIW, multithreading, multicore, …)
  – several uniform arithmetical units (e. g. the processing elements of array computers)
  – processors of a multiprocessor computer (i. e. the actual parallel computers)
  – complete stand-alone computers connected via LAN (workstation or PC clusters, so-called virtual parallel computers)
  – parallel computers or clusters connected via WAN (so-called metacomputers)
Classification of Parallel Computers (cont'd)

• reminder: dual core, quad core, manycore, and multicore
  – observation: increasing frequency (and thus core voltage) over the past years
  – problem: thermal power dissipation increases linearly with the frequency and with the square of the core voltage
• reminder: dual core, quad core, manycore, and multicore (cont'd)
  – a 25% reduction in frequency (and thus core voltage) leads to an approx. 50% reduction in dissipation
  [chart: dissipation and performance, normal CPU vs. reduced CPU]
• reminder: dual core, quad core, manycore, and multicore (cont'd)
  – idea: installation of two cores per die with the same dissipation as a single-core system
  [chart: dissipation and performance, single core vs. dual core]
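The dual-core argument above can be checked with a back-of-the-envelope calculation. Assuming, as the slides do, that dynamic power dissipation scales as P ~ f · V² and that the core voltage V scales proportionally with the frequency f (so P ~ f³), while performance scales as f:

```python
# Back-of-the-envelope check of the frequency/voltage scaling argument.
# Assumptions: dynamic power P ~ f * V^2, core voltage V ~ f (so P ~ f^3),
# and performance ~ f.

def relative_power(f_scale: float) -> float:
    """Power relative to the baseline when frequency and voltage scale by f_scale."""
    return f_scale * f_scale ** 2  # P ~ f * V^2 with V ~ f

f_scale = 0.75  # 25 % reduction in frequency and core voltage

single_power = relative_power(f_scale)  # ~0.42 of baseline (roughly the "50 %" of the slide)
single_perf = f_scale                   # 0.75 of baseline

# dual core: two reduced cores on one die
dual_power = 2 * single_power           # ~0.84, still below one full-speed core
dual_perf = 2 * single_perf             # 1.5x a full-speed single core

print(f"reduced single core: power {single_power:.2f}, perf {single_perf:.2f}")
print(f"dual core:           power {dual_power:.2f}, perf {dual_perf:.2f}")
```

So two cores at 75 % frequency dissipate less than one full-speed core yet deliver 1.5 times its nominal performance, which is precisely why the industry moved to multicore designs instead of ever-higher clock rates.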
Classification of Parallel Computers (cont'd)

• commercial parallel computers
  – manufacturers: starting from 1983, big players and small start-ups (see table; "out of business": no longer in the parallel business)
  – names have been coming and going rapidly
  – in addition: several manufacturers of vector computers and non-standard architectures

  company                  country   year  status in 2003
  Sequent                  U.S.      1984  acquired by IBM
  Intel                    U.S.      1984  out of business
  Meiko                    U.K.      1985  bankrupt
  nCUBE                    U.S.      1985  out of business
  Parsytec                 Germany   1985  out of business
  Alliant                  U.S.      1985  bankrupt
• commercial parallel computers (cont'd)

  company                  country   year  status in 2003
  Encore                   U.S.      1986  out of business
  Floating Point Systems   U.S.      1986  acquired by SUN
  Myrias                   Canada    1987  out of business
  Ametek                   U.S.      1987  out of business
  Silicon Graphics         U.S.      1988  active
  C-DAC                    India     1991  active
  Kendall Square Research  U.S.      1992  bankrupt
  IBM                      U.S.      1993  active
  NEC                      Japan     1993  active
  SUN Microsystems         U.S.      1993  active
  Cray Research            U.S.      1993  active
Classification of Parallel Computers (cont'd)

• arrival of clusters
  – in the late eighties, PCs became a commodity market with rapidly increasing performance, mass production, and decreasing prices
  – growing attractiveness for parallel computing
  – 1994: Beowulf, the first parallel computer built completely out of commodity hardware
    • NASA Goddard Space Flight Centre
    • 16 Intel DX4 processors
    • multiple 10 Mbit Ethernet links
    • Linux with GNU compilers
    • MPI library
  – 1996: Beowulf cluster performing more than 1 GFlops
  – 1997: a 140-node cluster performing more than 10 GFlops
• arrival of clusters (cont'd)
  – 2005: InfiniBand cluster at TUM
    • 36 Opteron nodes (quad boards)
    • 4 Itanium nodes (quad boards)
    • 4 Xeon nodes (dual boards) for interactive tasks
    • InfiniBand 4× switch, 96 ports
    • Linux (SuSE and Redhat)