Lecture 5.2: Parallel Memory Models
EN 600.320/420/620
Instructor: Randal Burns
12 February 2018
Department of Computer Science, Johns Hopkins University
Shared Memory Systems
Large class defined by memory model
– And thus, the programming model
Shared-memory programming
– Threads exchange information through reads and writes to memory
– Synchronization constructs to control sharing
– Easy-to-use abstraction
Examples
– OpenMP, Java, pthreads
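A minimal OpenMP sketch (not from the original slides) of the shared-memory model: threads read a common array and combine their partial sums, with the reduction clause providing the synchronization instead of explicit locks. Compile with gcc -fopenmp.

/* OpenMP shared-memory sketch: threads communicate through shared data;
 * the reduction clause synchronizes updates to the shared sum. */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    /* All threads read the shared array a[]; their partial sums are
     * combined safely by the reduction, not by hand-written locking. */
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f (threads available: %d)\n", sum, omp_get_max_threads());
    return 0;
}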
Symmetric Multi-Processor (SMP)
Shared memory MIMD system
– All processors can address all memory
Symmetric access to memory
– A performance statement, not just addressability
SMPs have scaling limits
On symmetry
– SMP is not symmetric to caches
– Multi-core (symmetric to L2, not L1)
https://computing.llnl.gov/tutorials/parallel_comp/
NUMA: Non-Uniform Memory Access
Shared memory MIMD systems
Latency and bandwidth to physical memory differ
– By address and location
Same programming semantics as SMP
https://computing.llnl.gov/tutorials/parallel_comp/
RB's Take on NUMA
Very difficult to program
– The tools don't help the programmer account for non-uniformity
– Easy to write programs that work correctly
– More difficult to write programs that run fast
But, all multicore is NUMA
– Even SMPs today have NUMA properties
– Because of the cache hierarchy
New programming tools to help in Linux
– hwloc: https://www.open-mpi.org/projects/hwloc/
– libnuma and numactl: http://oss.sgi.com/projects/libnuma/
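A minimal libnuma sketch (an illustration, assuming the libnuma development headers are installed; link with -lnuma): it pins the calling thread to NUMA node 0 and allocates its buffer from node 0, so memory accesses stay local instead of crossing the interconnect.

/* libnuma sketch: keep computation and its memory on the same NUMA node. */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return EXIT_FAILURE;
    }

    printf("highest NUMA node: %d\n", numa_max_node());

    /* Run this thread on node 0 and allocate 64 MB of node-0 memory. */
    numa_run_on_node(0);
    size_t size = 64UL * 1024 * 1024;
    char *buf = numa_alloc_onnode(size, 0);
    if (buf == NULL) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return EXIT_FAILURE;
    }

    memset(buf, 0, size);   /* touch the pages so they are actually placed on node 0 */
    numa_free(buf, size);
    return EXIT_SUCCESS;
}

The same placement policy can be applied to an unmodified binary from the shell, e.g. numactl --cpunodebind=0 --membind=0 ./app.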
Message Passing
Book calls these distributed memory machines
– This term is deceptive to me
Each processor/node has its own private memory
Nodes synchronize actions and exchange data by sending messages to each other
https://computing.llnl.gov/tutorials/parallel_comp/
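A minimal MPI sketch (an illustration, not from the slides) of the model: there is no shared address space, so rank 0 must explicitly send a value that rank 1 explicitly receives. Compile with mpicc; run with mpirun -np 2.

/* Message-passing sketch: all data movement between processes is explicit. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}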
Programming Message Passing
MPI
– The "assembly language" of supercomputing
– Libraries that allow for collective operations, synchronization, etc.
– Explicit handling of data distribution and inter-process communication
Map/reduce and other cloud systems
– New paradigm that emerged from Google
– Divide computation into data-parallel and data-dependent portions
– Better abstraction of HW. More restrictive.
– MapReduce, Hadoop, Spark, GraphLab, etc.
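A sketch of one of the collective operations the slide mentions (my example, not the course's prescribed code): every rank computes a local value and MPI_Reduce combines them on rank 0, loosely analogous to the reduce half of map/reduce but with explicit ranks and communicators.

/* MPI collective sketch: each rank contributes a partial result;
 * MPI_Reduce sums them onto rank 0. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* "Map" step: each rank computes a local value from its share of the work. */
    int local = rank + 1;

    /* "Reduce" step: a collective operation combines the local values. */
    int total = 0;
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d ranks = %d\n", size, total);

    MPI_Finalize();
    return 0;
}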
Hybrid Architectures
When a message-passing machine has SMP parallelism at each of its nodes
– Book is behind on this trend: every machine is a hybrid
How to program
– MPI: ignore the SMP aspects
– MPI + (OpenMP, pthreads, Java, CUDA, OpenCL): expensive, hard to maintain
– Automated compilation
https://computing.llnl.gov/tutorials/parallel_comp/
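A hedged sketch of the common MPI + OpenMP hybrid style (an assumption about how one would combine them, not code from the course): MPI carries communication between nodes while OpenMP threads parallelize the work inside each node. Compile with mpicc -fopenmp.

/* Hybrid sketch: message passing across processes, threads within a process. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided;
    /* Request FUNNELED: only the main thread will make MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Shared-memory (OpenMP) work inside each MPI process. */
    double local = 0.0;
    #pragma omp parallel for reduction(+ : local)
    for (int i = 0; i < 1000000; i++)
        local += 1.0;

    /* Message-passing step across processes. */
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("total = %f\n", total);

    MPI_Finalize();
    return 0;
}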