Virtualize and Share Non-Volatile Memory in User Space
Chih Chieh Chou, Jaemin Jung, A. L. Narasimha Reddy, Paul Gratz, and Doug Voigt
May 23, 2019
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING

Outline
• Introduction
• Motivation and Goal
• Architecture
• Conclusions
• Acknowledgements

Introduction
• Non-volatile memory (NVM) has become a promising storage device because of several properties:
– Byte-addressability
– Non-volatility
– Low latency
– Low idle power (except for NVDIMM)
[Figure: HPE 8GB NVDIMM Single Rank x4 DDR4-2133 Module]

Introduction
• Unlike DRAM and disk, there is no agreement so far on how to deploy NVM, that is, in which layer of the memory hierarchy to put it
[Figure: candidate memory hierarchies combining cache, DRAM, NVM, and disk]
1. Use DRAM as a cache of NVM (without non-volatility)
2. Use NVM as a cache of disk (without byte-addressability)
• Can we do more?

Challenge
• When NVM is attached directly to the memory bus as a DIMM, it sits under the CPU caches, and data still held in the caches is “not persistent” after power cycling
• Write ordering is needed (solution: logs and transactions)
[Figure: the cache holds updated values A’, B’, C’ while NVM holds A, B, C; the system crashes after only some values have reached NVM, leaving A’, B, C’ in NVM]

Motivations
• Several prior works focus on building a specific file system tailored for NVM
• SCMFS (SC’11), NOVA (FAST’16, MSST’17), Strata (SOSP’17)
– Users are limited to those file systems
– No concurrency
– System calls are too expensive and squander the low latency provided by NVM
• Handling almost everything in user space provides much better performance
• Intel SPDK (https://spdk.io): user-space, polling-based NVMe driver
– Ultra-low-latency (ULL) SSDs: Intel Optane SSD, Samsung Z-NAND

Motivations
• SNIA NVM Programming Model / Intel PMDK (https://pmem.io/pmdk/)
– Use the mmap interface to access NVM
• Virtualize and share NVM between processes, like virtual memory (mmap)
– Virtual NVM capacity can exceed the physically available capacity
• Leverage a storage device as the final destination of data
• Leverage DRAM as a cache
– Performance: better latency; avoids log searching
– Limited write lifetime of PCM: reduces writes to NVM

Our Goals
• User space
– Library
• Transactional interface
– Log
• mmap-like access
• Virtualization and sharing of NVM
– Leverage a storage device
• DRAM cache

Methodology
• Leverage the existing mmap function
• Integrate DRAM, NVM, and SSD to provide virtual NVM
– Treat (DRAM + NVM + SSD) as one large NVM pool
– Its performance is very close to that of NVM (or DRAM)

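As a rough illustration of the mmap-style access this approach builds on, the sketch below maps a file as a shared, writable region using only standard POSIX calls. It is not vNVML code: the helper names map_shared_file and store_and_sync are hypothetical, and an ordinary file stands in for what would typically be a file on a DAX-capable file system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `size` bytes of the file at `path` as a shared,
 * writable region. The file is created and extended if needed.
 * Returns NULL on failure. */
void *map_shared_file(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: store a string through the mapping with plain memory
 * writes, then ask the kernel to flush the region to the backing file. */
int store_and_sync(void *region, size_t size, const char *msg)
{
    assert(region != NULL);
    size_t n = strlen(msg) + 1;
    if (n > size)
        return -1;
    memcpy(region, msg, n);
    return msync(region, size, MS_SYNC);
}
```

Because the mapping is MAP_SHARED, a second process that maps the same file sees the same bytes, which is the property the sharing goal relies on.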
Methodology
• User-space library: vNVML
– Applications access NVM only through vNVML
– Supports concurrent access by multiple processes
– Allocates (virtual) NVM regardless of the actual NVM size
[Figure: App1 and App2 accessing DRAM, NVM, and storage through vNVML]

Example
ptr = nv_allocate(filepath, filesize, mode);
tid = nv_txbegin();                                  // TX starts
x = *ptr;                                            // read
y = *(ptr + sizeof(x));                              // read
x = 1;
y = 2;
nv_write(tid, ptr, &x, sizeof(x));                   // write
nv_write(tid, ptr + sizeof(x), &y, sizeof(y));       // write
nv_commit(tid);                                      // TX commits

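To make the begin/write/commit pattern concrete, here is a minimal in-memory sketch of a redo-style transaction: writes are staged in a log and reach their destination only at commit. This is an illustration under stated assumptions, not the vNVML implementation. All toy_* names and the fixed-size limits are hypothetical, and the log lives in ordinary volatile memory; a real NVM log would also need cache-line flushes and ordering before the commit is considered durable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8   /* hypothetical limit on writes per transaction */
#define TOY_MAX_PAYLOAD 64  /* hypothetical limit on bytes per write */

/* One staged write: destination, length, and a private copy of the data. */
struct toy_entry {
    void *dst;
    size_t len;
    unsigned char data[TOY_MAX_PAYLOAD];
};

/* A transaction is an append-only list of staged writes. */
struct toy_tx {
    struct toy_entry entries[TOY_MAX_ENTRIES];
    size_t count;
};

void toy_txbegin(struct toy_tx *tx)
{
    tx->count = 0;
}

/* Stage a write in the log; the destination is NOT modified yet. */
int toy_write(struct toy_tx *tx, void *dst, const void *src, size_t len)
{
    if (tx->count == TOY_MAX_ENTRIES || len > TOY_MAX_PAYLOAD)
        return -1;
    struct toy_entry *e = &tx->entries[tx->count++];
    e->dst = dst;
    e->len = len;
    memcpy(e->data, src, len);
    return 0;
}

/* Apply all staged writes in order, then empty the log. */
void toy_commit(struct toy_tx *tx)
{
    for (size_t i = 0; i < tx->count; i++)
        memcpy(tx->entries[i].dst, tx->entries[i].data, tx->entries[i].len);
    tx->count = 0;
}

/* Drop all staged writes without touching any destination. */
void toy_abort(struct toy_tx *tx)
{
    tx->count = 0;
}
```

In this sketch a reader of the destination region sees old values until toy_commit runs, which mirrors the idea that uncommitted writes stay in the log.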
Components
[Figure: the virtual address space contains the log buffer, metadata, cache, file data, and an SHM object, mapped through mmap and DAX shared mappings onto NVM and a storage file]
Limitations/challenges:
1. The file system must support mmap
2. Virtual addresses cannot be stored in NVM

Components
[Figure: the virtual address spaces of Process 1 and Process 2 each contain the log buffer, metadata, cache, file data, and an SHM object, mapped onto NVM and a storage file]

Data access flow
1. Write
2. Commit
3. Read
4. Write back to SSD
[Figure: steps 1 to 4 across the NVM log, NVM cache, metadata, the private mmap of File A, and File A on storage]

R/W flows
• DRAM as read-only cache
• Limitation: read-committed transactions
• NVM as log buffer and cache
• Write-only cache
• Two background threads
– 1. Update the logs to the write cache
– 2. Update the write cache to storage
[Figure: reads (R) go to the DRAM read cache; writes (W) go to the NVM log buffer, then to the NVM write cache (1), then to storage (2)]

Log structure
• Legend: tid = log object; P = page object (log page)
• N open lists and a committed list
• A transaction leaves its open list when it is (1) committed or (2) aborted
[Figure: open lists and the committed list, each a chain of log objects (tid) with attached log pages (P)]
Limitation: a transaction that writes first should commit first (only when writing the same object)

NVM Cache Management
[Figure: page transitions among the free list, dirty list, and clean list]
Transitions labeled in the figure:
• Log content adoption: after commit
• Cache hit
• Cache miss
• Writeback: when over 30% of pages are dirty

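The writeback trigger can be sketched as a simple threshold check over per-page states. This is a hypothetical reading of the figure, not the library's code: the names toy_cache, adopt_log, and writeback_all are assumptions, eviction on a cache miss is left out, and the storage I/O itself is elided.

```c
#include <assert.h>
#include <stddef.h>

enum page_state { PAGE_FREE, PAGE_DIRTY, PAGE_CLEAN };

struct toy_cache {
    enum page_state *pages; /* state of each cache page */
    size_t total;           /* number of pages in the cache */
    size_t dirty;           /* number of pages currently dirty */
};

/* Writeback is due when strictly more than 30% of all pages are dirty. */
int writeback_due(const struct toy_cache *c)
{
    return c->dirty * 10 > c->total * 3;
}

/* Committed log content is adopted into page i: a free or clean page
 * becomes dirty; a page that is already dirty stays dirty. */
void adopt_log(struct toy_cache *c, size_t i)
{
    if (c->pages[i] != PAGE_DIRTY) {
        c->pages[i] = PAGE_DIRTY;
        c->dirty++;
    }
}

/* Mark every dirty page clean (the write to storage is elided here).
 * Returns the number of pages that were written back. */
size_t writeback_all(struct toy_cache *c)
{
    size_t n = 0;
    for (size_t i = 0; i < c->total; i++) {
        if (c->pages[i] == PAGE_DIRTY) {
            c->pages[i] = PAGE_CLEAN;
            n++;
        }
    }
    c->dirty = 0;
    return n;
}
```

A background thread could poll writeback_due after each adoption and call writeback_all when it returns nonzero; with 10 pages, the fourth dirty page is the one that crosses the 30% threshold.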
Shared files
[Figure: log objects (tid) on the committed list in the NVM log buffer are “digested” by a background thread into the shared mmap of File A (N1 to N4) and written to storage with msync; reads (R) and writes (W) go through DRAM and NVM]
Limitations:
1. A transaction that writes first should commit first (only when writing the same object)
2. All writes of a transaction must go to the same shared file

Recovery after crashing
• Dirty pages: check dirty bits
• Logs of the committed list: leverage the CPU's 8-byte atomicity (pointer writes)
• Insert: (a) => (b) => (c)
• Delete: (c) => (b) => (a)
[Figure: three states (a), (b), (c) of a linked list with nodes A, B, and C]

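One way to read the insert and delete sequences is as a linked-list update in which the only step that changes what is reachable is a single pointer store. The sketch below assumes that reading: node C is inserted between A and B, and states (a), (b), (c) are "C unlinked", "C points at B", and "A points at C". The struct layout and function names are hypothetical, and the cache-line flushes and fences that persistence on real NVM would need after each store are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
    int tid;                     /* hypothetical payload */
    _Atomic(struct node *) next; /* 8-byte pointer on 64-bit targets */
};

void node_init(struct node *n, int tid)
{
    n->tid = tid;
    atomic_init(&n->next, (struct node *)NULL);
}

/* Insert c directly after a. */
void insert_after(struct node *a, struct node *c)
{
    struct node *b = atomic_load(&a->next);
    atomic_store(&c->next, b); /* (a) => (b): c points at b, not yet reachable */
    atomic_store(&a->next, c); /* (b) => (c): one pointer store publishes c */
}

/* Unlink and return the node directly after a, or NULL if there is none. */
struct node *remove_after(struct node *a)
{
    struct node *c = atomic_load(&a->next);
    if (c == NULL)
        return NULL;
    atomic_store(&a->next, atomic_load(&c->next)); /* (c) => (b): bypass c */
    atomic_store(&c->next, (struct node *)NULL);   /* (b) => (a): detach c */
    return c;
}

/* Count the nodes reachable from head. */
size_t list_length(struct node *head)
{
    size_t n = 0;
    for (struct node *p = head; p != NULL; p = atomic_load(&p->next))
        n++;
    return n;
}
```

At every intermediate state the list reachable from A is either the old one or the new one, so a crash between steps leaves a list that recovery can still walk.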
Experiment methodology
• YCSB + MongoDB + library
– YCSB generates read/write traffic (workload) to MongoDB
• Fixed-size records: 64KB
• 100K records per experiment
– MongoDB accesses the NVM through the library
– Baseline: MongoDB writes its files directly to NVM, with journaling/msync disabled
• Platform settings:
– 12GB DRAM, 12GB emulated NVM
– CPU: 4 cores
– 4 MongoDB instances run concurrently
[Figure: YCSB (workload) on MongoDB on vNVML, over DRAM, NVM, and storage]

Evaluation
• Assuming the NVM size is fixed, how should it be partitioned between the log buffer and the cache?
• How does vNVML perform compared to other libraries?

Results of fixed cache size
• NVM cache size is 4GB; the record number is the size of the data set in MongoDB
[Figure: normalized throughput (0 to 1) vs. record count, for log sizes 2G, 1G, 512M, and 128M. Panels: Uniform, W/R=95/5; Zipfian, W/R=95/5. Record counts: 10000 (0.6GB; 2.4GB for 4 instances), 15000 (0.9GB; 3.6GB for 4), 20000 (1.22GB; 4.88GB for 4), 25000 (1.5GB; 6.0GB for 4), 30000 (1.8GB; 7.2GB for 4)]

Results of fixed cache size
• NVM cache size is 4GB
[Figure: normalized throughput (0 to 1) vs. record count, for log sizes 2G, 1G, 512M, and 128M. Panels: Uniform, W/R=50/50; Zipfian, W/R=50/50. Record counts: 10000 (0.6GB; 2.4GB for 4 instances), 15000 (0.9GB; 3.6GB for 4), 20000 (1.22GB; 4.88GB for 4), 25000 (1.5GB; 6.0GB for 4), 30000 (1.8GB; 7.2GB for 4)]

Results of fixed cache size
• NVM cache size is 4GB
[Figure: normalized throughput (0 to 1) vs. record count, for log sizes 2G, 1G, 512M, and 128M. Panels: Zipfian, W/R=30/70; Uniform, W/R=30/70. Record counts: 10000 (0.6GB; 2.4GB for 4 instances), 15000 (0.9GB; 3.6GB for 4), 20000 (1.22GB; 4.88GB for 4), 25000 (1.5GB; 6.0GB for 4), 30000 (1.8GB; 7.2GB for 4)]
