An Analysis of SMP Memory Allocators: MapReduce on Large Shared-Memory Systems
Robert Döbbelin, Thorsten Schütt, Alexander Reinefeld
Zuse Institute Berlin
September 10, 2012
1 / 11
SGI Altix UltraViolet (UV) 1000
[Diagram: one blade with two Intel Xeon X7560 sockets (8 cores each), 32 GB DDR3 RAM per socket, QPI between the sockets, and a HUB with NUMAlink5 connections to the other blades]
- 32 blades in one rack
- 2 × 8 cores per blade
- 64 GB memory per blade
- QPI for memory on the same blade
- inter-blade communication via NUMAlink5
2 / 11
Memory allocation
First-touch policy
When a process requests memory from the OS:
- the thread gets an (unmapped) virtual address
- a page fault occurs on the first touch
- the OS allocates the physical page on the NUMA node on which the accessing thread is running
Once a virtual address is mapped, this mapping persists until the page is released to the OS.
3 / 11
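A minimal sketch of how the first-touch policy is typically exploited (not from the slides; the OpenMP loop, array name, and size are illustrative assumptions): initializing data with the same thread layout that later accesses it makes each thread fault in "its" pages, so they end up on that thread's NUMA node.

    // Sketch: first-touch NUMA placement via parallel initialization (assumed example).
    #include <omp.h>
    #include <cstdlib>
    #include <cstdio>

    int main() {
        const size_t n = 1UL << 26;   // 64 Mi doubles, ~512 MiB (illustrative size)
        double *a = static_cast<double *>(malloc(n * sizeof(double)));

        // malloc only reserves virtual addresses; no physical pages exist yet.
        // The parallel initialization loop causes each thread to first-touch
        // its chunk, so the OS places those pages on that thread's NUMA node.
        #pragma omp parallel for schedule(static)
        for (size_t i = 0; i < n; ++i)
            a[i] = 0.0;

        // Later passes with the same static schedule hit mostly local memory.
        #pragma omp parallel for schedule(static)
        for (size_t i = 0; i < n; ++i)
            a[i] += 1.0;

        printf("%f\n", a[0]);
        free(a);
        return 0;
    }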
Memory allocation
Successive malloc / free operations
[Diagram: a process with Threads A-D on top of the user-level allocator and the OS]
- malloc: Thread A gets a virtual page and touches it
- free: the page may be released to the allocator's cache
- malloc: Thread D gets this page
Thread D got remote memory from the allocator!
4 / 11
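A minimal sketch of this scenario (not from the slides): Thread A first-touches a buffer and frees it; Thread D then allocates and may receive the already-mapped pages, which stay on A's node. Whether this reproduces depends on the allocator and its caching; the use of numa_run_on_node, move_pages, and the buffer size are illustrative assumptions (requires libnuma, e.g. g++ -O2 -pthread remote.cpp -lnuma).

    // Sketch: an allocator cache can hand pages first-touched on node 0 to a
    // thread running on node 1 (assumed example; outcome is allocator-dependent).
    #include <numa.h>
    #include <numaif.h>
    #include <thread>
    #include <cstring>
    #include <cstdio>
    #include <cstdlib>

    static int node_of(void *p) {
        int status = -1;
        // With nodes == NULL, move_pages() only reports the node of each page.
        move_pages(0, 1, &p, nullptr, &status, 0);
        return status;
    }

    int main() {
        if (numa_available() < 0 || numa_max_node() < 1) {
            fprintf(stderr, "need at least two NUMA nodes\n");
            return 1;
        }
        const size_t sz = 64 << 10;  // small enough to stay on the heap (assumption)
        void *first = nullptr;

        std::thread a([&] {
            numa_run_on_node(0);             // "Thread A" on node 0
            first = malloc(sz);
            memset(first, 1, sz);            // first touch: pages land on node 0
            printf("A touched buffer on node %d\n", node_of(first));
            free(first);                     // block may go to the allocator cache
        });
        a.join();

        std::thread d([&] {
            numa_run_on_node(1);             // "Thread D" on node 1
            void *p = malloc(sz);            // may receive the cached, mapped block
            memset(p, 2, sz);                // no page fault if already mapped
            printf("D got %s buffer on node %d\n",
                   p == first ? "the same" : "a different", node_of(p));
            free(p);
        });
        d.join();
        return 0;
    }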
MapReduce
MapReduce workflow
[Diagram: map tasks feeding an optional combine step, a shuffle step, and reduce tasks]
MapReduce stages:
- map: apply the map-function to the input
- combine (optional)
- shuffle: merge partitions
- reduce: apply the reduce-function to all kv-pairs with the same key
- size of buffers unknown a priori
- iterative MapReduce: the output of one MR step is the input for the next
5 / 11
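A minimal, single-node, word-count style sketch of these stages (not the authors' MR-Search code; all names and types are illustrative assumptions):

    // Sketch: the map / shuffle / reduce stages from the slide (assumed example).
    #include <map>
    #include <string>
    #include <sstream>
    #include <utility>
    #include <vector>
    #include <iostream>

    using KV = std::pair<std::string, long>;

    // map: apply the map-function to the input, emitting kv-pairs
    static std::vector<KV> map_stage(const std::string &line) {
        std::vector<KV> out;
        std::istringstream in(line);
        for (std::string w; in >> w; ) out.emplace_back(w, 1);
        return out;
    }

    // shuffle: merge partitions so that all values of one key come together
    static std::map<std::string, std::vector<long>>
    shuffle_stage(const std::vector<std::vector<KV>> &partitions) {
        std::map<std::string, std::vector<long>> groups;
        for (const auto &part : partitions)
            for (const auto &kv : part) groups[kv.first].push_back(kv.second);
        return groups;
    }

    // reduce: apply the reduce-function to all kv-pairs with the same key
    static long reduce_stage(const std::vector<long> &vals) {
        long sum = 0;
        for (long v : vals) sum += v;
        return sum;
    }

    int main() {
        std::vector<std::string> input = {"a b a", "b c"};
        std::vector<std::vector<KV>> partitions;
        for (const auto &line : input) partitions.push_back(map_stage(line));
        for (const auto &g : shuffle_stage(partitions))
            std::cout << g.first << " " << reduce_stage(g.second) << "\n";
        return 0;
    }

The sizes of the intermediate vectors and shuffled groups are not known a priori, so each iteration triggers many concurrent allocations; this is where the choice of allocator matters.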
MapReduce
How to speed things up?
- Memory allocators for SMPs (tbbmalloc): fast concurrent allocations
- Memory reuse (reuse): reuse buffers for subsequent MapReduce iterations
- Memory preallocation (prealloc): allocate the needed amount of memory for each buffer up front
6 / 11
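A minimal sketch of the reuse and prealloc ideas (not the authors' code; the struct and function names are illustrative assumptions): each thread keeps its intermediate buffer across MapReduce iterations instead of freeing it, and optionally reserves the expected size before the first iteration, so the underlying pages stay mapped on the thread's own node.

    // Sketch: per-thread buffers kept across MR iterations (assumed example).
    #include <vector>
    #include <cstddef>

    struct ThreadBuffers {
        std::vector<char> partition;   // intermediate kv-data of this thread

        // reuse: clear() keeps the capacity (and the mapped, node-local pages)
        void reset() { partition.clear(); }

        // prealloc: reserve the expected amount once, before the first iteration
        void preallocate(std::size_t bytes) { partition.reserve(bytes); }
    };

    void iterate(ThreadBuffers &buf /*, input of this MR step ... */) {
        buf.reset();                   // do NOT free: pages stay on the local node
        // ... map / shuffle / reduce fill buf.partition ...
    }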
Evaluation
[Plot: relative speedup (0-8) vs. number of threads (7, 15, 31, 61, 127) for glibc, reuse, tbbmalloc, tbb_pool, prealloc]
MR-Search with various allocators. Speedup is relative to glibc.
Significant speedup if more than one blade is used.
7 / 11
Evaluation
[Plot: sent data (GByte, left axis) and time (s, right axis) per allocator: glibc, reuse, tbbmalloc, tbb_pool, prealloc]
NUMA traffic and runtime with various allocators (127 threads).
Traffic on NUMAlink traced with Performance Co-Pilot.
TBB does not prevent remote memory.
8 / 11
Evaluation
[Plot: speedup vs. number of cores (up to 500) for perfect speedup, tbbmalloc, prealloc (MR only), reuse, prealloc, glibc]
Scalability with various allocators.
9 / 11
Evaluation
[Plot: speedup vs. number of cores (up to 500) for perfect speedup, MPI on a cluster, MPI on the UV, and OpenMP on the UV with prealloc (MR only)]
Comparing scalability: OpenMP vs. explicit message passing.
10 / 11
Summary
It is not easy to write scalable code for large SMPs:
- large variability of memory access costs on large SMPs
- allocators for SMPs help to increase scalability
- but they do not prevent remote memory
- the programmer needs to keep track of memory location (if possible)
11 / 11