
HOOP: Efficient Hardware-Assisted Out-of-Place Update for Non-Volatile Memory



  1. HOOP: Efficient Hardware-Assisted Out-of-Place Update for Non-Volatile Memory. Miao Cai, Chance Coats, Jian Huang. Systems Platform Research Group.

  2. Non-Volatile Memory is a Revolutionary Technology: close-to-DRAM performance, data durability, byte addressability. New and emerging NVMs offer promising properties and are becoming popular.

  3. Memory Persistency Challenge: A Well-Known Problem. Volatile processor cache; out-of-order execution; performance vs. persistency. Ensuring memory persistency with commodity architecture is challenging!

  4. State-of-the-Art Approach: Redo/Undo Logging. Undo logging; redo logging. Undo/redo logging causes double writes on the critical path.

  5. State-of-the-Art Approach: Shadow Paging. Page copy. Optimized shadow paging still suffers from frequent data flushes.

  6. State-of-the-Art Approach: Log-structured NVM. Log; index. Software-based log-structured NVM (LSNVM) suffers from long access latency.

  7. A Summary of State-of-the-Art Approaches: logging, shadow paging, log-structured NVM. Memory persistency overheads: double writes, frequent flushes, long critical-path latency.

  8. Our Approach: Hardware-assisted Out-Of-Place (HOOP) Update. Reduced write traffic with data coalescing and packing; no requirement on persistence ordering; transparent support of atomic data durability.

  9. Challenges of Supporting Out-Of-Place Update: lightweight indirection layer; limited resource in memory controller; efficient garbage collection.

  10. Address Remapping for Supporting Out-of-Place Update. The memory controller keeps a mapping table holding physical-to-physical address mappings. Insert mapping entry: upon a write to the OOP region. Delete mapping entry: upon data migration from OOP to home (GC), or upon a read from the OOP region. [Diagram: processor cache (load/store); memory controller with mapping table; NVM with home region and OOP region; GC between the two regions.]

  11. Data Packing in the Memory Controller for Improved Performance. Many applications update data at a fine granularity. [Diagram: processor cache (load/store); memory controller with mapping table and OOP data buffer, whose entries carry the home address; NVM with home region and OOP region (OOP blocks, each with a head).]

  12. Ensuring Persistence Ordering in the Memory Controller. Events marked on the diagram: data packing for a memory slice is done; the end of a transaction (e.g., Tx_end). [Diagram: processor cache (load/store); memory controller with mapping table and OOP data buffer; NVM with home region and OOP region.]

  13. Efficient Garbage Collection for Improved Memory Utilization. Linked memory slices; load stale data during GC. [Diagram: processor cache (load/store); memory controller with mapping table, OOP data buffer, and eviction buffer; NVM with home region and OOP region (OOP blocks, each with a head); GC between the two regions.]

  14. Handling Crash Consistency Upon Failures. [Diagram: processor cache (load/store); memory controller with mapping table, OOP data buffer, and eviction buffer; NVM with home region and OOP region (OOP blocks, each with a head).]

  15. Put It All Together. [Diagram: cores issue loads and stores to their L1 caches; L1 misses go to the last-level cache; last-level cache misses go to the memory controller (mapping table, OOP data buffer, eviction buffer), which accesses the NVM home region and OOP region.]

  16. Evaluation. HOOP implementation: processor simulator McSimA+ (OoO cores, 2.5 GHz, 32 KB L1, 256 KB L2, 2 MB LLC); NVM simulator (read/write = 50/150 ns, 512 GB). Benchmarks: synthetic workloads (Vector, HashMap, Queue, RB-Tree, B-Tree) and real-world workloads (YCSB, TPCC).

  17. Improving Transaction Throughput with HOOP. [Figure: normalized speedup on Vector, Queue, RBTree, Btree, HashMap, YCSB, TPCC for Optimized Redo, Optimized Undo, Optimized Shadow Paging, Log-Structured NVM, Logless Atomic Durability, HOOP, and Ideal.] HOOP is close to the performance of a system without any persistence enforcement.

  18. Reducing Critical-Path Latency with HOOP. [Figure: normalized latency on Vector, Queue, RBTree, Btree, HashMap, YCSB, TPCC for Ideal, Optimized Redo, Optimized Undo, Optimized Shadow Paging, Log-Structured NVM, Logless Atomic Durability, and HOOP.] HOOP achieves the lowest latency compared to state-of-the-art approaches.

  19. Reducing Write Traffic with HOOP. [Figure: normalized write traffic on Vector, Queue, RBTree, Btree, HashMap, YCSB, TPCC for Ideal, Optimized Redo, Optimized Undo, Optimized Shadow Paging, Log-Structured NVM, Logless Atomic Durability, and HOOP.] HOOP reduces write traffic by up to 2.1x compared to logging approaches.

  20. HOOP Summary: 1.7x performance speedup for data-intensive apps; 2.1x reduction of write amplification.

  21. Thanks! Miao Cai, Chance Coats, Jian Huang. Systems Platform Research Group, University of Illinois at Urbana-Champaign.
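
Slides 10 to 13 describe the memory-controller data path: updates are buffered and packed, written out of place, tracked through a mapping table, and later migrated to the home region by garbage collection. The following is a minimal Python sketch of that flow, not the hardware design from the talk. The class `ToyHoopController`, its method names, the dictionary-based regions, and the rule that a slice is written only at `tx_end` are all assumptions made for illustration; the slides give no slice format, buffer size, or GC policy. The sketch also removes mapping entries only during migration and does not model removal on reads.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHoopController:
    """Toy software model of an out-of-place update path (illustrative only)."""
    home: dict = field(default_factory=dict)     # home region: address -> data
    oop: list = field(default_factory=list)      # OOP region: list of packed slices
    mapping: dict = field(default_factory=dict)  # home address -> index of newest slice
    buffer: dict = field(default_factory=dict)   # OOP data buffer: address -> newest data
    slices_written: int = 0                      # writes to the OOP region
    home_writes: int = 0                         # writes to the home region (GC only)

    def store(self, addr, data):
        # Coalescing: a repeated update to the same address inside one
        # transaction overwrites the buffered copy instead of adding a write.
        self.buffer[addr] = data

    def tx_end(self):
        # Packing: all buffered updates go into one slice, which is appended
        # to the OOP region. The home copies are left untouched.
        if not self.buffer:
            return
        self.oop.append(dict(self.buffer))
        self.slices_written += 1
        slot = len(self.oop) - 1
        for addr in self.buffer:
            self.mapping[addr] = slot  # insert mapping entry on write to OOP
        self.buffer.clear()

    def load(self, addr):
        # Newest data first: buffer, then OOP region via the mapping table,
        # then the home region.
        if addr in self.buffer:
            return self.buffer[addr]
        slot = self.mapping.get(addr)
        if slot is not None:
            return self.oop[slot][addr]
        return self.home.get(addr)

    def gc(self):
        # Migration: copy only the newest version of each remapped address
        # to its home location, then delete the mapping entry.
        for addr, slot in list(self.mapping.items()):
            self.home[addr] = self.oop[slot][addr]
            self.home_writes += 1
            del self.mapping[addr]
        self.oop.clear()  # every live version has been migrated


mc = ToyHoopController()
mc.store(0x100, "a1")
mc.store(0x100, "a2")   # coalesced with the previous store
mc.store(0x140, "b1")
mc.tx_end()             # one packed slice for the whole transaction
print(mc.load(0x100))   # a2, served from the OOP region
mc.gc()
print(mc.load(0x100))   # a2, now served from the home region
```

In this toy model the transaction above costs one slice write on the commit path, and the home region is written only later by `gc`, which is the intuition behind avoiding the double writes of logging. The counters are bookkeeping for the sketch and say nothing about the measured results on slides 17 to 19.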
