native os support for persistent memory with regions



  1. NATIVE OS SUPPORT FOR PERSISTENT MEMORY WITH REGIONS
     Mohammad Chowdhury (mchow017@fiu.edu), Raju Rangaswami (raju@cs.fiu.edu)
     Florida International University

  2. PERSISTENT MEMORY (PM)
     • Hybrid characteristics of memory and storage
     Memory: • Volatile • Byte-addressable access • Fast
     Storage: • Non-volatile/Persistent • Block I/O access • Slow
     Persistent Memory: • Non-volatile/Persistent • Byte-addressable access • Fast
     Read/write latency: 4X-10X of memory
     5/19/2017

  3. PM CHALLENGES
     • PM is directly accessible by CPU
     • BUT …
     • Caches and memory controller sit between PM and CPU
     • Caches write dirty pages to DRAM/PM according to cache eviction policy
     • Memory controller optimizes performance by reordering the updates
     PM resident data can be corrupted after a system failure if ordering of updates is violated.

  4. PM CHALLENGES: THE COSTS OF ORDERING
     • Ordering requires cache line flushes, barriers, and ADR (asynchronous DRAM refresh)
     • Increased cost of operations
     • More redundant metadata → more ordering required
     • GOAL → reduce ordering requirements

  5. PM CHALLENGES: ATOMIC DATA DURABILITY
     [Figure: timeline t0 to t3 in which processes P1, P2, and P3 update shared data and P3 calls msync; it contrasts the versions held in memory and in PM. Annotations: Version “4”, “All good!!”, “shouldn’t be here”.]
     Requirements:
     1. Make data atomically durable (ALL or NONE)
     2. Revert back to initial state in case of failure

  6. PM OPPORTUNITIES: SHARED CONSISTENCY
     DAX/Regular MMAP (MAP_SHARED): P1 and P2 have cache coherent visibility of PM
     • Updates immediately reflected in process address spaces
     • NOT atomically durable
     NOVA ATOMIC MMAP (MAP_ATOMIC → MAP_PRIVATE): P1 and P2 each work on a private copy
     • Updates only visible to the process
     • Atomically durable
     • Forfeits sharing/cache coherency support
     Requirements:
     1. Updates should be visible to all the shared processes
     2. Should support atomic durability of all updates across a shared region

  7. PM OPPORTUNITIES: SIMPLE MEMORY-LIKE TRANSACTIONS
     Program A:
       Allocate persistent Obj1;
       Allocate persistent Obj2;
       Begin Transaction
         Obj1 operations
       End transaction
       Begin Transaction
         Obj2 operations
       End transaction
     Programmers:
     1. Must track all updates to persistent objects
     2. Must annotate individual transactions
     Program B:
       A = mmap(PM);
       Allocate objects Obj1, Obj2 from mapped area
       Operations involving Obj1, Obj2.
       Sync()
       More operations on both Obj1, Obj2
       Sync()
     Programmers simply call Sync() to persist all updates in a mapped area.
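The Program B pattern can be approximated on an ordinary file with Python's standard `mmap` module, whose `flush()` method writes mapped changes back to the file. This is only a sketch under stated assumptions: it maps a regular temporary file rather than PM, it gives no atomic durability, and the names `OBJ1_OFF`, `OBJ2_OFF`, `run_program_b`, `read_objects`, and `demo` are hypothetical.

```python
import mmap
import os
import struct
import tempfile

PAGE = mmap.PAGESIZE
OBJ1_OFF, OBJ2_OFF = 0, 8  # hypothetical offsets of Obj1 and Obj2 in the mapped area


def run_program_b(path):
    """Update two 64-bit objects through a mapping, calling flush() as Sync()."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), PAGE) as area:
            # Operations involving Obj1 and Obj2: plain memory writes, no annotations.
            struct.pack_into("<Q", area, OBJ1_OFF, 41)
            struct.pack_into("<Q", area, OBJ2_OFF, 1)
            area.flush()  # first Sync()

            # More operations on both objects, then a second Sync().
            (obj1,) = struct.unpack_from("<Q", area, OBJ1_OFF)
            (obj2,) = struct.unpack_from("<Q", area, OBJ2_OFF)
            struct.pack_into("<Q", area, OBJ1_OFF, obj1 + obj2)
            struct.pack_into("<Q", area, OBJ2_OFF, obj2 + 1)
            area.flush()


def read_objects(path):
    """Re-read both objects from the file, as a later run would."""
    with open(path, "rb") as f:
        return struct.unpack("<QQ", f.read(16))


def demo():
    """Create a one-page backing file, run Program B, and return (Obj1, Obj2)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * PAGE)
    finally:
        os.close(fd)
    try:
        run_program_b(path)
        return read_objects(path)
    finally:
        os.remove(path)
```

The program never tracks which objects changed; each `flush()` covers the whole mapped area, which is the programming model the slide argues for.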

  8. APPLICATION REQUIREMENTS FOR PM
     A PM based application needs:
     • Arbitrary & Unordered Allocation
     • Mapped Data Consistency
     • Persistent Namespace
     • Consistent Sharing Support
     • Simple Memory Like Transactions

  9. CONTEMPORARY SOLUTIONS
     • DAX File Systems: NOVA, EXT4-DAX, PMFS
     • Memory Subsystem: OS
     • Persistent Heaps: Mnemosyne, NV-Heaps, LibpmemObj
     • Regular File Sys.: EXT4, BTRFS, etc.
     • Atomic Msync: Failure Atomic Msync (EXT4-JBD)
     • Replication: Mojim, RDMA

  10. CONTEMPORARY SOLUTIONS
      [Table: each solution compared against the PM application requirements]
      Solutions: Region System, DAX File Systems, Memory Subsystem, Persistent Heaps, Regular File Sys., Atomic Msync, Replication
      Requirements: Arbitrary and Unordered Allocation, Consistent Sharing Support, Simple Memory Like Transactions, Mapped Data Consistency, Persistent Namespace
      Legend: Mapped Data Consistency (Partial)

  11. REGION SYSTEM
      We present “Region System”, a kernel subsystem, to support persistent memory to achieve the following goals:
      • Minimize unwanted latency in the persistent memory access path;
      • Provide users with direct and consistent access to shared persistent memory; and
      • Demonstrate modifications of the existing applications for optimized usage.

  12. REDEFINED OS MEMORY/STORAGE STACK
      • NOT intended as replacement for File Systems or Memory Subsystem
      • RS should serve as a core “Persistent Memory Support System” usable by applications, file systems, and other kernel subsystems.

  13. ARCHITECTURE
      • Region: Collection of persistent pages
      • PPAGES: 4KB PM pages

  14. CONSISTENCY STATES
      Current | Snapshot | State
      0       | 0        | No Ppage
      0       | y        | Invalid – There can not be a snapshot without current
      x       | 0        | Un-synced page, mapped to the address space
      x       | y        | x == y, page in synced state
      x       | y        | x != y, page in unsynced state, “y” is the consistent version
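The state table can be read as a small decision function. The sketch below assumes that the current and snapshot entries are page identifiers and that 0 means "no page"; the function name `ppage_state` and the returned labels are hypothetical, not part of the Region System.

```python
def ppage_state(current, snapshot):
    """Classify a ppage from its (current, snapshot) pair; 0 is assumed to mean 'no page'."""
    if current == 0 and snapshot == 0:
        return "no ppage"
    if current == 0:
        # A snapshot cannot exist without a current page.
        return "invalid"
    if snapshot == 0:
        return "unsynced, no snapshot yet"
    if current == snapshot:
        return "synced"
    # Current differs from snapshot: the snapshot is the consistent version.
    return "unsynced, snapshot is consistent"
```

For example, `ppage_state(7, 7)` reports a synced page, while `ppage_state(7, 5)` reports an unsynced page whose consistent version is the snapshot.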

  15. REGION SYSTEM (RS) INTERFACE
      Class                 | System Call
      Namespace             | region_d open (char region_name, flags f)
                            | int close (region_d rd)
                            | int delete (region_d rd)
      Allocation            | ppage_no alloc_ppage (region_d rd)
                            | int free_ppage (region_d rd, ppage_no ppn)
      Mapping & Consistency | vaddr pmmap(vaddr va, region_d rd, ppage_no, int nbytes, flags f)
                            | int pmunmap(vaddr va)
                            | pmsync(vaddr va)
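To make the namespace and allocation classes concrete, here is a toy, in-memory Python model. Only the call names `open`, `delete`, `alloc_ppage`, and `free_ppage` come from the table; the class `ToyRegionSystem`, the create-on-open behaviour, the 0/-1 return convention, and the helper `ppages` are assumptions made for illustration, and nothing here is persistent.

```python
class ToyRegionSystem:
    """In-memory stand-in for the namespace and allocation calls (not the kernel API)."""

    def __init__(self):
        self._by_name = {}   # region name -> region descriptor
        self._regions = {}   # region descriptor -> set of allocated ppage numbers
        self._next_rd = 1
        self._next_ppn = 1

    def open(self, region_name):
        """Return the descriptor for region_name, creating the region on first use."""
        if region_name not in self._by_name:
            self._by_name[region_name] = self._next_rd
            self._regions[self._next_rd] = set()
            self._next_rd += 1
        return self._by_name[region_name]

    def delete(self, rd):
        """Remove a region and its ppages; 0 on success, -1 if rd is unknown."""
        if rd not in self._regions:
            return -1
        del self._regions[rd]
        self._by_name = {n: d for n, d in self._by_name.items() if d != rd}
        return 0

    def alloc_ppage(self, rd):
        """Allocate one ppage in the region and return its number."""
        ppn = self._next_ppn
        self._next_ppn += 1
        self._regions[rd].add(ppn)
        return ppn

    def free_ppage(self, rd, ppn):
        """Free any ppage of the region, in any order; -1 if it is not allocated."""
        if ppn not in self._regions.get(rd, ()):
            return -1
        self._regions[rd].remove(ppn)
        return 0

    def ppages(self, rd):
        """Hypothetical helper: list the region's allocated ppage numbers."""
        return sorted(self._regions[rd])
```

Freeing the middle page of three leaves the other two allocated, which is the arbitrary and unordered allocation behaviour the deck lists as a requirement.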

  16. METADATA OPERATIONS
      • Persistent Operations
        • Modifies persistent metadata
      • Volatile Operations
        • No updates to persistent metadata
      • Persistent operations are designed to achieve atomic durability

  17. METADATA OPERATION COMPARISON
      [Figure: metadata operation comparison with two groups, Persistent Operations and Volatile Operations; bar annotations: 1.1x, 2.8x, 2.2x, 1.25x, 2.3x]

  18. MAPPED DATA CONSISTENCY CHALLENGES
      • Avoid Unwanted Durability
        • Applications want to make updates durable only upon an msync() invocation.
        • But updates may be made durable in PM before an msync() call.
        • In case of a failure, the mapped PM area will contain uncommitted data.
      • Protecting the Sync
        • During a sync operation no application should be allowed to write to mapped PM → difficult to achieve due to direct CPU access.

  19. ATOMIC DURABILITY WITH PMSYNC
      1. Identify the dirty pages
      2. Write protect the pages
      3. Flush dirty cache lines
      4. Copy-on-write protection for future writes to a sync’ed page
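The four steps above can be mimicked in a toy user-space model. This is a sketch under assumptions, not the kernel implementation: `ToyPage`, `pmsync`, and `crash` are hypothetical names, the cache flush is a no-op, the sync is treated as indivisible, and a crash is modelled simply as falling back to each page's snapshot.

```python
class ToyPage:
    """One ppage with a current version and a snapshot (last synced version)."""

    def __init__(self, data=b""):
        self.current = bytearray(data)
        self.snapshot = bytes(data)
        self.write_protected = True  # a synced page starts write-protected

    def write(self, data):
        if self.write_protected:
            # Step 4: the first write after a sync goes to a fresh copy,
            # so the snapshot stays intact.
            self.current = bytearray(self.snapshot)
            self.write_protected = False
        self.current[:] = data


def pmsync(pages):
    """Make all dirty pages durable together; returns how many were synced."""
    dirty = [p for p in pages if not p.write_protected]  # step 1: identify dirty pages
    for p in dirty:
        p.write_protected = True                         # step 2: write-protect them
    for p in dirty:
        # Step 3 would flush the dirty cache lines here (no-op in this model),
        # after which the flushed version becomes the new snapshot.
        p.snapshot = bytes(p.current)
    return len(dirty)


def crash(pages):
    """Model a failure: every page reverts to its snapshot."""
    for p in pages:
        p.current = bytearray(p.snapshot)
        p.write_protected = True
```

Writes made after the last `pmsync` disappear on `crash`, while everything synced earlier survives, which is the ALL or NONE behaviour required earlier in the deck.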

  20. AVOIDING COW PROPAGATION
      [Figure: page trees under “Conventional CoW” and “Region System CoW”, each showing copy-on-write for page 9; the Region System tree keeps current (c) and snapshot (s) entries per page]
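One way to read the figure is to count the pages each scheme has to copy. The sketch below is an interpretation rather than the authors' code: it assumes that conventional CoW must copy every ancestor of a modified page so that parents can point at the new copies, and that a parent holding separate current and snapshot entries can be updated in place. The tree in `PARENT` and both function names are hypothetical.

```python
# Hypothetical page tree: child page number -> parent page number (page 1 is the root).
PARENT = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3, 8: 4, 9: 4, 10: 5}


def conventional_cow_copies(page, parent):
    """Pages copied when each ancestor must be rewritten to reference the new copy."""
    copies = [page]
    while page in parent:
        page = parent[page]
        copies.append(page)
    return copies


def per_page_cow_copies(page):
    """Pages copied when the parent keeps current and snapshot entries for each child."""
    # Only the modified page is copied; the parent's current entry changes in place.
    return [page]
```

With this tree, a write to page 9 copies pages 9, 4, 2, and 1 under the conventional scheme but only page 9 under the per-page scheme.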

  21. PMSYNC EXAMPLE
      [Figure: pmsync on region A. Labels: rs_root; rnode A (locked) and rnode B; PM ppages with current (c) and snapshot (s) entries; volatile page tables, TLB/MMU, and cache; tasks Y and Z with their mm and vma structures; CPU 1 and CPU 2.]
      Steps shown: 1. IPI; 2. Write protect (and wait); 3. IPI returns; 4. Flush cache line for E2; 5. PMSYNC_IN_PROGRESS; 6. Change s; 7. PMSYNC_COMPLETE

  22. PMSYNC COMPARISON WITH EXT4-DAX
      [Figure: latency (μs) versus file/region size]

  23. LIBPMEM-REGION
      Non-transactional pmem-flush:
      • All or None policy does not work
      • A portion of the updates can be lost
      Outcome:
      1. Add atomic durability guarantee to libpmem
      2. Reduce risk factor for libraries built on top of libpmem

  24. LIBPMEM COMPARISONS

  25. LIBPMEM COMPARISONS

  26. SUMMARY
      • Region System Features
        • Provides arbitrary and unordered allocation and de-allocation
        • Minimizes ordering requirements by eliminating redundancy
        • Provides transparent sharing and atomic durability of mapped data with competitive performance
        • Usable by file systems, applications, libraries, and other kernel subsystems or modules.
      • Source code will be made public soon!

  27. Thanks! Questions?
