Multi-Core Computing
Instructor: Hamid Sarbazi-Azad
Department of Computer Engineering, Sharif University of Technology
Fall 2014
Programming Models: Pthreads
- A POSIX standard for threads, targeting general-purpose multi-core processors.
- Implementations are available on many Unix-like POSIX-conformant operating systems such as FreeBSD, NetBSD, OpenBSD, GNU/Linux, Mac OS X and Solaris; DR-DOS and Microsoft Windows implementations also exist.
- A set of C programming language types, functions and constants.
- There are around 100 Pthreads procedures, all prefixed "pthread_".
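A minimal sketch of the Pthreads C interface, assuming a POSIX system; the worker function, thread count, and file name in the compile line are illustrative and not taken from the slides.

    #include <pthread.h>
    #include <stdio.h>

    #define NUM_THREADS 4   /* illustrative thread count */

    /* Worker executed by each thread; receives its index via the void* argument. */
    static void *worker(void *arg) {
        long id = (long)arg;
        printf("hello from thread %ld\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t threads[NUM_THREADS];

        /* pthread_create: one of the ~100 procedures prefixed "pthread_" */
        for (long i = 0; i < NUM_THREADS; i++)
            pthread_create(&threads[i], NULL, worker, (void *)i);

        /* pthread_join: wait for every thread to finish */
        for (int i = 0; i < NUM_THREADS; i++)
            pthread_join(threads[i], NULL);

        return 0;
    }

Built with something like gcc -pthread hello_pthreads.c (file name hypothetical).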

OpenMP

What is OpenMP?
- Open specification for Multi-Processing.
- A "standard" API for defining multi-threaded shared-memory programs.
- openmp.org – talks, examples, forums, etc.
- A high-level API made up of preprocessor (compiler) directives (~80%), library calls (~19%), and environment variables (~1%); see the sketch after this slide.

A Programmer's View of OpenMP
- OpenMP is a portable, threaded, shared-memory programming specification with "light" syntax.
- OpenMP will:
  - Allow a programmer to separate a program into serial regions and parallel regions
  - Provide synchronization constructs
- OpenMP will not:
  - Parallelize code automatically
  - Guarantee speedup
  - Provide freedom from data races
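A minimal C sketch of the three ingredient kinds listed above: a compiler directive, a library call, and an environment variable. The array size and loop body are illustrative only; compile with an OpenMP-capable compiler (e.g. gcc -fopenmp).

    #include <omp.h>     /* library calls such as omp_get_max_threads() */
    #include <stdio.h>

    #define N 1000       /* illustrative problem size */

    int main(void) {
        static double a[N];

        /* Directive: distribute the loop iterations over a team of threads. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            a[i] = 2.0 * i;

        /* Library call: query how many threads the runtime would use. */
        printf("max threads: %d\n", omp_get_max_threads());

        /* Environment variable: set OMP_NUM_THREADS=4 before running
           to change the team size without recompiling. */
        return 0;
    }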

Motivation
- Thread libraries are hard to use: Pthreads/Solaris threads have many library calls for initialization, synchronization, thread creation, condition variables, etc.
- The programmer must code with multiple threads in mind, and synchronization between threads introduces a new dimension of program correctness.
- Wouldn't it be nice to write serial programs and somehow parallelize them "automatically"?
- OpenMP can parallelize many serial programs with relatively few annotations that specify parallelism and independence (see the sketch after these lists).
- It is not automatic: you can still make errors in your annotations.

Motivation (cont'd)
- Good performance and scalability, if you do it right.
- De-facto standard: an OpenMP program is portable and is supported by a large number of compilers.
- Requires little programming effort and allows the program to be parallelized incrementally.
- Maps naturally onto a multicore architecture: OpenMP threads are lightweight, and each OpenMP thread in the program can be executed by a hardware thread.
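As referenced above, a C sketch of the "few annotations" idea: a serial sum becomes parallel by adding one pragma whose reduction clause states how partial results combine. The data and sizes are illustrative, and the comment notes where a wrong annotation would bite.

    #include <stdio.h>

    #define N 1000000    /* illustrative problem size */

    int main(void) {
        static double x[N];
        for (int i = 0; i < N; i++) x[i] = 1.0;

        double sum = 0.0;

        /* The only change to the serial code: one annotation stating that
           iterations are independent and that 'sum' is a + reduction.
           A wrong annotation (e.g. omitting the reduction clause) would
           silently introduce a data race, as the slides warn. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += x[i];

        printf("sum = %f\n", sum);
        return 0;
    }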

Fork/Join Parallelism
- Initially, only the master thread is active.
- The master thread executes the sequential code.
- Fork: the master thread creates or awakens additional threads to execute the parallel code.
- Join: at the end of the parallel code, the created threads die or are suspended (see the sketch below).

The OpenMP Execution Model
[Figure: the OpenMP fork/join execution model]
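A short C sketch of the fork/join pattern, assuming an OpenMP compiler: only the master runs the serial parts, a team runs the parallel region, and control joins back to the master afterwards. The printed messages are illustrative.

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        printf("serial region: only the master thread runs this\n");

        /* Fork: the master creates (or awakens) a team of threads. */
        #pragma omp parallel
        {
            printf("parallel region: thread %d of %d\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }   /* Join: implicit barrier; the extra threads are suspended or retired. */

        printf("serial region again: back to the master thread only\n");
        return 0;
    }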

Programming Models: OpenMP vs. Pthreads
- OpenMP is generally best suited to data-parallel applications with evident loop-level parallelism.
- It may be easier to debug and performance-tune than direct use of Pthreads.
- OpenMP implementations do use Pthreads when running on Linux systems.
- OpenMP's greatest attributes are its portability and the simplicity it brings to parallel programming.
- For simple loops, OpenMP and Pthreads achieve similar speedups, but OpenMP is many times easier to program.

OpenMP vs. Pthreads (cont'd)
- A critical question when programming with Pthreads: how many threads will be available at run time?
- There are ways of extracting this information from the system at run time and dynamically creating the appropriate number of threads, but the process can be messy and, with Hyper-Threading Technology, error-prone (see the sketch below).
- OpenMP figures out the correct number of threads and distributes the work automatically.
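A hedged C sketch of the thread-count question: with Pthreads the program itself has to discover how many processors are available (here via POSIX sysconf, an assumption about the platform) and size its own thread pool, while the OpenMP runtime picks a default team size on its own.

    #include <pthread.h>
    #include <unistd.h>   /* sysconf(_SC_NPROCESSORS_ONLN), POSIX-specific */
    #include <stdio.h>
    #include <stdlib.h>

    static void *worker(void *arg) { (void)arg; return NULL; }

    int main(void) {
        /* Pthreads: the programmer queries the machine and sizes the pool.
           With Hyper-Threading this count is logical processors, not cores,
           one reason the slides call the approach error-prone. */
        long n = sysconf(_SC_NPROCESSORS_ONLN);
        if (n < 1) n = 1;
        pthread_t *pool = malloc((size_t)n * sizeof *pool);

        for (long i = 0; i < n; i++)
            pthread_create(&pool[i], NULL, worker, NULL);
        for (long i = 0; i < n; i++)
            pthread_join(pool[i], NULL);
        free(pool);

        /* OpenMP: no such bookkeeping; the runtime chooses the team size. */
        #pragma omp parallel
        {
            /* work distributed automatically */
        }
        printf("done\n");
        return 0;
    }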

OpenMP vs. Pthreads (cont'd)
- Code containing OpenMP pragmas compiles as single-threaded code if the compiler does not support OpenMP, and as multithreaded code if it does.
- OpenMP is not general enough to be used for all kinds of parallelism; it is best equipped with pragmas for the loop-level parallelism often found in compute-intensive workloads.
- Pthreads is universal and can be used for any type of parallelism.

Programming Models: OpenMP vs. Pthreads
- Not all loops can be threaded: a loop whose iterations depend on results of earlier iterations cannot safely be parallelized.
- OpenMP does not analyze code correctness, so it cannot detect such a dependency (see the sketch below).
- OpenMP requires that developers have made their code thread-safe.
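A C sketch of a loop OpenMP cannot safely thread: each iteration reads the previous iteration's result, and adding the pragma would still compile and run, just with wrong answers, because OpenMP does not check for such dependences. The _OPENMP check mirrors the "compiles as single-threaded code" point; sizes and messages are illustrative.

    #include <stdio.h>

    #define N 16   /* illustrative size */

    int main(void) {
    #ifdef _OPENMP
        /* Defined only when the compiler supports (and enables) OpenMP;
           otherwise the pragmas are ignored and the code stays serial. */
        printf("compiled with OpenMP support\n");
    #endif

        int a[N];
        a[0] = 1;

        /* Loop-carried dependence: iteration i needs a[i-1] from iteration i-1.
           Putting "#pragma omp parallel for" on this loop would be accepted by
           the compiler but would produce nondeterministic, wrong results. */
        for (int i = 1; i < N; i++)
            a[i] = a[i - 1] + 1;

        printf("a[N-1] = %d\n", a[N - 1]);
        return 0;
    }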

IBM Cell
- The IBM Cell/BE SDK targets the Cell/BE heterogeneous multi-core system architecture: 1 PPE and 8 SPEs.
- C/C++ language extensions.
- A GNU-based C/C++ compiler targeting the SPE and PPE.
- An assembly language specification.
- IBM XLC C/C++ auto-vectorization (auto-SIMD) for the SPE and for PPE multimedia-extension code.
- A full system simulator.

CUDA
- Compute Unified Device Architecture.
- A parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce.
- Provides CUDA-accelerated libraries, compiler directives, and extensions to industry-standard programming languages, including C, C++ and Fortran.

QUESTIONS?
