Distributed Shared Persistent Memory (SoCC ’17) Yizhou Shan, Yiying Zhang
Persistent Memory (PM/NVM)
• Byte addressable, persistent
• Low latency, large capacity, cost effective
(Figure: memory hierarchy with CPU and cache on top, PM alongside DRAM below.)
Many PM Systems, but All on a Single Machine
• Local memory models
  – NV-Heaps [ASPLOS '11], Mnemosyne [ASPLOS '11]
  – Memory Persistency [ISCA '14], Synchronous Ordering [MICRO '16]
• Local file systems
  – BPFS [SOSP '09], PMFS [EuroSys '14], SCMFS [SC '11], HiNFS [EuroSys '16]
• Local transaction/logging systems
  – NVWAL [ASPLOS '16], SCT/DCT [ASPLOS '16], Kamino-Tx [EuroSys '17]
Moving PM into Datacenters
• PM fits the datacenter
  – Applications require a lot of memory,
  – need fast access to persistent data,
  – and want low monetary cost
• Challenges
  – Handle node failures
  – Ensure good performance and scalability
  – Provide an easy-to-use abstraction
How to Use PM in Distributed Environments?
• As distributed memory?
• As distributed storage?
• Mojim [Zhang et al., ASPLOS '15]
  – First PM work in a distributed environment
  – Efficient PM replication
  – But far from a full-fledged distributed NVM system
Resource Allocation in Datacenters
(Figure: VM1/App1 and Container1/App2 allocated cores and 3 GB / 4 GB slices of Node 1's 8 GB main memory, while Node 2's cores and memory sit separately.)
Resource Utilization in Production Clusters
• Google production cluster trace data: https://github.com/google/cluster-data
• Unused resources coexist with waiting/killed jobs because of physical-node constraints
Q1: How to achieve better resource utilization? Use remote memory
Distributed (Remote) Memory
(Figure: App2 on Node 1 transparently uses remote main memory on Node 2 in addition to Node 1's local memory.)
Modern Datacenter Applications Have Significant Memory Sharing
• Examples: PowerGraph, TensorFlow
Q2: How to scale out parallel applications? Distributed shared memory
What about Persistence?
• Data persistence is useful
  – Many existing data storage systems ➡ performance
  – Memory-based, long-running applications ➡ checkpointing
Q3: How to provide data persistence?
Distributed Shared Persistent Memory (DSPM): a significant step towards using PM in datacenters
DSPM: A One-Layer Approach
• (Distributed) memory features
  – Native memory load/store interface: local or remote (transparent)
  – Pointers and in-memory data structures
  – Supports memory read/write sharing
• (Distributed) storage features
  – Persistent naming
  – Data durability and reliability
• Benefits of both memory and storage: no redundant layers, no data marshaling/unmarshaling
Hotpot: A Kernel-Level RDMA-Based DSPM System
• Easy to use
• Native memory interface
• Fast, scalable
• Flexible consistency levels
• Data durability & reliability
Hotpot Architecture
Hotpot Code Example

    /* Open a dataset named 'boilermaker' */
    int fd = open("/mnt/hotpot/boilermaker", O_CREAT | O_RDWR, 0644);

    /* Map it into the application's virtual address space */
    void *base = mmap(0, 40960, PROT_WRITE, MAP_PRIVATE, fd, 0);

    /* First access: Hotpot fetches the page from a remote node */
    *(int *)base = 9;

    /* Later accesses: direct memory loads/stores */
    memset(base, 0x27, PAGE_SIZE);

    /* Commit data: make it coherent, durable, and replicated */
    msync(base, 40960, MSYNC_HOTPOT);
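To illustrate the "pointers and in-memory data structures" point, here is a minimal sketch (not from the talk) that builds a small linked list directly inside the mapped dataset and commits it with the same msync call; the struct layout and the 40960-byte size are illustrative, and MSYNC_HOTPOT is the Hotpot flag shown in the example above.

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>

    struct node {
        int          value;
        struct node *next;   /* ordinary virtual-address pointer stored
                              * inside the persistent, shared region */
    };

    void build_list(void)
    {
        int fd = open("/mnt/hotpot/boilermaker", O_CREAT | O_RDWR, 0644);
        struct node *head = mmap(0, 40960, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE, fd, 0);

        /* Build a two-node list entirely in DSPM: no serialization,
         * no marshaling into a separate storage format. */
        head->value = 1;
        head->next = head + 1;
        head->next->value = 2;
        head->next->next = NULL;

        /* Commit point: make the modified pages coherent, durable,
         * and replicated, as in the example above. */
        msync(head, 40960, MSYNC_HOTPOT);
    }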
How to efficiently add P to "DSM"?
• Distributed shared memory
  – Caches remote memory on demand for fast local access
  – Multiple redundant copies
• Distributed storage systems
  – Actively add more redundancy to provide data reliability
• One-layer principle: integrate the two forms of redundancy with morphable page states
Morphable Page States
• A PM page can serve different purposes, possibly at different times:
  – as a locally cached copy to improve performance
  – as a redundant data page to improve data reliability
(Figure: Node 2 accesses page 3; pages 1–4 are spread across Node 1 and Node 2, with page 3 now present on both nodes.)
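As a rough illustration of the idea (the names below are assumptions, not Hotpot's internal data structures), per-page metadata could record which role a PM page currently plays, so the same physical page morphs between roles instead of being copied:

    /* Minimal sketch of morphable page states (assumed names). */
    enum page_role {
        PAGE_INVALID,       /* no valid data for this page on this node     */
        PAGE_CACHED_COPY,   /* fetched on demand; speeds up local accesses  */
        PAGE_REDUNDANT,     /* committed replica; counts toward reliability */
        PAGE_OWNER,         /* this node holds the primary committed copy   */
    };

    struct dspm_page_meta {
        unsigned long  dataset_offset;  /* page's offset within the dataset */
        enum page_role role;            /* current role; may change at a commit */
        int            dirty;           /* modified since the last commit?  */
    };

    /* When Node 2 reads page 3 it gets a PAGE_CACHED_COPY; after a commit
     * the same physical page can be treated as PAGE_REDUNDANT, so the
     * cached data also serves as a replica without an extra copy. */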
How to efficiently add P to "DSM"?
• When to make cached copies coherent?
• When to make data durable and reliable?
• Observations
  – Data-store applications have well-defined commit points
  – Commit points: the time to make data persistent
  – Visible to storage devices => can be made visible to other nodes
• Exploit application behavior: make data coherent only at commit points
Commit Point
(Figure: three nodes, each with a CPU cache and PM; dirty data A' in a CPU cache is committed, after which A' resides in the PM of multiple nodes.)
• durable
• coherent
• reliable
• single-node and distributed consistency
• two consistency modes: single/multiple writer
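A minimal sketch of what a commit point could do on the committing node. The helpers pm_flush, replicate_to_peers, and update_coherence are hypothetical placeholders, not Hotpot's real API; they stand for the three guarantees listed above.

    /* Hypothetical helpers -- illustration only, not Hotpot functions. */
    void pm_flush(void *page_addr);                  /* persist to local PM   */
    void replicate_to_peers(void *page_addr, int n); /* push to n replicas    */
    void update_coherence(void *page_addr);          /* reconcile remote copies */

    struct committed_page {
        void *addr;      /* start of a page in the mapped dataset */
        int   dirty;     /* modified since the last commit?       */
    };

    /* For every dirty page: flush to local PM (durable), push to peers
     * (reliable), and reconcile other nodes' cached copies (coherent). */
    int dspm_commit(struct committed_page *pages, int npages, int degree)
    {
        for (int i = 0; i < npages; i++) {
            if (!pages[i].dirty)
                continue;
            pm_flush(pages[i].addr);                    /* durability  */
            replicate_to_peers(pages[i].addr, degree);  /* reliability */
            update_coherence(pages[i].addr);            /* coherence   */
            pages[i].dirty = 0;
        }
        return 0;
    }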
Flexible Coherence Levels
• Multiple Reader Multiple Writer (MRMW)
  – Allows multiple concurrent dirty copies
  – Great parallelism, but weaker consistency
  – Three-phase commit protocol
• Multiple Reader Single Writer (MRSW)
  – Allows only one dirty copy
  – Trades parallelism for stronger consistency
  – Single-phase commit protocol
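A rough sketch contrasting the two commit paths. The phase semantics and helpers such as broadcast_prepare are assumptions for illustration, not the paper's exact protocol messages; the point is only that MRMW needs extra rounds to reconcile multiple dirty copies, while MRSW commits in one.

    struct commit_set;   /* the dirty pages and nodes involved in one commit */

    /* Hypothetical message helpers -- illustration only. */
    int  broadcast_prepare(struct commit_set *cs);   /* stage data, take locks */
    void broadcast_commit(struct commit_set *cs);    /* apply and replicate    */
    void broadcast_release(struct commit_set *cs);   /* drop locks             */
    int  broadcast_abort(struct commit_set *cs);     /* roll back staged data  */
    int  push_to_replicas(struct commit_set *cs);    /* one-round replication  */

    /* MRMW: several nodes may hold dirty copies of the same pages, so the
     * commit runs in three phases to detect conflicts before applying. */
    int commit_mrmw(struct commit_set *cs)
    {
        if (!broadcast_prepare(cs))
            return broadcast_abort(cs);
        broadcast_commit(cs);
        broadcast_release(cs);
        return 0;
    }

    /* MRSW: only one dirty copy can exist, so a single round that pushes
     * the committed pages to their replicas is enough. */
    int commit_mrsw(struct commit_set *cs)
    {
        return push_to_replicas(cs);
    }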
MongoDB Results
• Modified MongoDB with ~120 LOC; uses the MRMW mode
• Compared against tmpfs, PMFS, Mojim, and Octopus using YCSB
Conclusion
• One-layer approach: challenges and benefits
• Hotpot: a kernel-level RDMA-based DSPM system
• Hides complexity behind a simple abstraction
• Calls for attention to using PM in datacenters
• Many open problems remain in distributed PM!
Thank You! Questions?
Get Hotpot at: https://github.com/WukLab/Hotpot
wuklab.io