A Write-friendly Hashing Scheme for Non-volatile Memory Systems


  1. A Write-friendly Hashing Scheme for Non-volatile Memory Systems. Pengfei Zuo and Yu Hua, Huazhong University of Science and Technology, China.

  2. Non-volatile Memory
     - NVMs are expected to replace DRAM and SRAM.

     | Property         | SRAM | DRAM | PCM     | RRAM | STT-RAM |
     |------------------|------|------|---------|------|---------|
     | Non-volatile     | N    | N    | Y       | Y    | Y       |
     | Read (ns)        | 1    | 10   | 20~70   | 10   | 2~20    |
     | Write (ns)       | 1    | 10   | 150~220 | 50   | 5~35    |
     | Standby power    | High | High | Low     | Low  | Low     |
     | Scalability (nm) | 20   | 20   | 5       | 11   | 32      |
     | Endurance (10^N) | > 15 | > 15 | 7~8     | 8~10 | 12~15   |

     - NVMs vs. DRAM and SRAM: non-volatile, high scalability, and low standby power, but limited endurance and asymmetric read/write properties.

  3. Rethinking Data Structures on NVMs
     - How could in-memory and in-cache data structures be modified to adapt efficiently to NVMs?
     - Previous work mainly focuses on tree-based structures: CDDS-tree (FAST 2011), NV-tree (FAST 2015), wB+-tree (VLDB 2015), FP-tree (SIGMOD 2016), and Write Optimal Radix Tree (FAST 2017).
     - Hash tables are also widely used in main memory and caches: main memory databases; in-memory key-value stores, e.g., Memcached and Redis; in-cache indexes (ICS 2014, MICRO 2015).

  4. Existing Hashing Schemes on NVMs
     [Figure: four hash tables with cells 0 to 7. (a) Chained Hashing: insertion and deletion cause extra writes. (b) Linear Probing: deletion causes extra writes. (c) 2-choice Hashing, with hash functions h1(x) and h2(x): low space utilization (~35%). (d) Cuckoo Hashing, with hash functions h1(x) and h2(x): insertion evicts items, causing extra writes.]
     - Our design goal: minimize NVM writes while ensuring high performance.

  5. Our Scheme: Path Hashing
     - Three techniques: position sharing, double-path hashing, and path shortening.
     - A novel hash-collision resolution method without extra NVM writes.
     - Delivers high performance in space utilization and request latency.

  6. Our Scheme: Path Hashing (position sharing)
     [Figure: cells 0 to 15 at Level 4, with Levels 3, 2, 1, and 0 below.]
     - The Level 4 cells are addressable by the hash functions.
     - The cells in the lower levels are un-addressable, shared standby cells.

  7. Our Scheme: Path Hashing (position sharing)
     [Figure: cells 0 to 15 at Level 4, with Levels 3, 2, 1, and 0 below.]
     - Insertion and deletion require no extra modifications or data movement.
     - Problem: one path can handle at most L hash collisions.

  8. Our Scheme: Path Hashing (double-path hashing)
     [Figure: item x mapped by h1(x) and h2(x) to two of the cells 0 to 15 at Level 4, with Levels 3 to 0 below.]
     - Two different hash functions compute two paths for each item, which yields high space utilization.
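The insertion, query, and deletion behaviour described on the position-sharing and double-path slides can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `PathHashTable`, the `writes` counter, the use of salted SHA-256 as stand-ins for h1 and h2, and the level-by-level probing order are all assumptions made for the example. The levels are stored in one flat array, top level first, as on the physical storage slide.

```python
import hashlib


class PathHashTable:
    """Toy path hashing table (hypothetical names; see assumptions above)."""

    def __init__(self, top_level=4, reserved_levels=5):
        self.top_level = top_level
        self.reserved_levels = reserved_levels
        # Level sizes are 2^L, 2^(L-1), ... for the reserved levels only.
        size = sum(1 << (top_level - d) for d in range(reserved_levels))
        self.cells = [None] * size
        self.writes = 0  # counts cell writes, to show none are "extra"

    def _leaf(self, key, salt):
        # Assumed stand-in for h1/h2: a salted SHA-256 reduced to a leaf index.
        digest = hashlib.sha256(salt + repr(key).encode()).digest()
        return int.from_bytes(digest[:8], "big") % (1 << self.top_level)

    def _path(self, leaf):
        # Walk from the addressable top-level cell down the shared standby cells.
        offset, width, pos = 0, 1 << self.top_level, leaf
        for _ in range(self.reserved_levels):
            yield offset + pos
            offset += width
            width //= 2
            pos //= 2

    def _candidates(self, key):
        # Interleave the two paths level by level; skip a cell they share.
        p1 = self._path(self._leaf(key, b"h1"))
        p2 = self._path(self._leaf(key, b"h2"))
        for a, b in zip(p1, p2):
            yield a
            if b != a:
                yield b

    def insert(self, key, value):
        target = None
        for idx in self._candidates(key):
            cell = self.cells[idx]
            if cell is None:
                if target is None:
                    target = idx  # remember the first empty cell
            elif cell[0] == key:
                target = idx      # key already present: overwrite in place
                break
        if target is None:
            return False          # both paths are full
        self.cells[target] = (key, value)
        self.writes += 1          # exactly one cell is written
        return True

    def query(self, key):
        for idx in self._candidates(key):
            cell = self.cells[idx]
            if cell is not None and cell[0] == key:
                return cell[1]
        return None

    def delete(self, key):
        for idx in self._candidates(key):
            cell = self.cells[idx]
            if cell is not None and cell[0] == key:
                self.cells[idx] = None  # clear the cell; nothing else moves
                self.writes += 1
                return True
        return False
```

In this sketch each successful insertion or deletion touches a single cell, and no other item is relocated, which is the property the slides call "no extra writes". An insertion fails only when every cell on both paths is occupied.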

  9. Our Scheme: Path Hashing (double-path hashing)
     [Figure: item x mapped by h1(x) and h2(x) to two of the cells 0 to 15 at Level 4, with Levels 3 to 0 below.]
     - Problem: each query may probe many nodes when the tree is tall.

  10. Our Scheme: Path Hashing (path shortening)
      [Figure: cells 0 to 15 at Level 4, with Levels 3, 2, 1, and 0 below.]
      - Observation: the bottom levels provide only a few standby positions while lengthening the read path.
      - Path shortening: remove multiple levels at the bottom.

  11. Our Scheme: Path Hashing (path shortening)
      [Figure: cells 0 to 15 at Level 4, with only Levels 3 and 2 remaining below.]
      - Observation: the bottom levels provide only a few standby positions while lengthening the read path.
      - Path shortening: remove multiple levels at the bottom.
      - Evaluation: reserving only a small number of levels still achieves high space utilization.
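The observation behind path shortening can be checked with a little arithmetic on the 16-leaf tree from the figure. The helper name `reserved_cells` is hypothetical, and the sketch only counts cells; it says nothing about the measured utilization reported later in the deck.

```python
def reserved_cells(top_level, reserved_levels):
    """Cells kept when only the top `reserved_levels` levels are reserved.

    The top level holds 2**top_level cells and each lower level holds half
    as many, as in the inverted binary tree drawn on the slides.
    """
    return sum(1 << (top_level - d) for d in range(reserved_levels))


full_tree = reserved_cells(4, 5)   # Levels 4..0: 16 + 8 + 4 + 2 + 1 = 31 cells
shortened = reserved_cells(4, 3)   # Levels 4..2: 16 + 8 + 4 = 28 cells
removed = full_tree - shortened    # 3 cells dropped, path length 5 -> 3
```

For this small tree, dropping the two bottom levels gives up 3 of 31 cells while cutting the number of cells probed per path from 5 to 3.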

  12. Physical Storage Structure of Path Hashing
      [Figure: cells 0 to 15 at Level 4, with Levels 3 and 2 below, laid out level by level in a single array.]

  13. Physical Storage Structure of Path Hashing
      [Figure: cells 0 to 15 at Level 4, with Levels 3 and 2 below, laid out level by level in a single array A.]
      - Example: the path starting at cell 4 occupies A[4] (Level 4), A[16+4/2] (Level 3), and A[16+8+4/2/2] (Level 2).
      - No pointers are needed.
      - The nodes in a path can be accessed in parallel for insertion, query, and deletion.
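The index arithmetic in the A[4], A[16+4/2], A[16+8+4/2/2] example generalizes as sketched below. The function name `path_indices` and its parameters are hypothetical; the layout (top level first, each level half the width of the one before) is taken from the slide's example.

```python
def path_indices(leaf, top_level, reserved_levels):
    """Return the flat-array indices of the cells on the path from `leaf`."""
    indices = []
    offset = 0              # where the current level starts in the array
    width = 1 << top_level  # number of cells in the current level
    pos = leaf              # position of the path's cell within that level
    for _ in range(reserved_levels):
        indices.append(offset + pos)
        offset += width     # the next level is stored right after this one
        width //= 2         # and has half as many cells
        pos //= 2           # two neighbouring cells share one standby cell
    return indices


# Slide example: leaf 4, top level 4, three reserved levels.
print(path_indices(4, 4, 3))  # [4, 18, 25]
```

Because every index is computed directly from the leaf position, no pointers are stored and all cells of a path are known before any of them is read, which is what allows them to be accessed in parallel.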

  14. Experimental Configurations
      - Gem5: a full-system simulator.
      - NVMain: a main memory simulator for NVMs.
      - Datasets: Random Number, Document Word, Fingerprint.

  15. NVM Writes
      [Figure: two bar charts, RandomNum and DocWord. Y-axis: number of written lines; x-axis: load factor (0.6 and 0.8). Schemes: Chained, Linear, 2-choice, Cuckoo, Path. Off-scale bars are labeled 7.3 (RandomNum) and 14.2 (DocWord).]
      - Path hashing causes no extra writes.

  16. Space Utilization
      [Figure: bar chart of space utilization ratio (0% to 100%) on RandomNum, DocWord, and Fingerprint. Schemes: Chained, 2-choice, Cuckoo, Path.]
      - Path hashing achieves a space utilization ratio of up to 95%.

  17. Reserved Levels vs. Space Utilization
      [Figure: space utilization ratio (30% to 100%) against the number of reserved levels (3 to 25). Series: RandomNum (L = 22), DocWord (L = 23), Fingerprint (L = 24).]
      - Reserving only a small number of levels still achieves a high space utilization ratio.

  18. Request Latency
      [Figure: three charts of insertion, deletion, and query latency (us) against load factor (0.6 and 0.8). Insertion schemes: Chained, Linear, 2-choice, Cuckoo, Path. Deletion schemes: Chained, Linear, P-2-choice, P-Cuckoo, Path. Query schemes: Chained, Linear, P-2-choice, P-Cuckoo, Path, P-Path. One off-scale bar is labeled 15.3.]

  19. Conclusion
      - Existing mainstream hashing schemes usually cause many extra writes to NVMs.
      - We propose a write-friendly hashing scheme, path hashing, which incurs no extra writes while delivering high performance. It combines position sharing, double-path hashing, and path shortening.
      - Experimental results on gem5 with NVMain: no extra writes, up to 95% space utilization ratio, and low request latency.

  20. Thanks! Q&A
      - Open-source code: https://github.com/Pfzuo/Path-Hashing
      - E-mail: pfzuo@hust.edu.cn
      - Homepage: http://pfzuo.github.io/about/
