  1. HMEH: Write-Optimal Extendible Hashing for Hybrid DRAM-NVM Memory
     Xiaomin Zou¹, Fang Wang¹*, Dan Feng¹, Jianxi Chen¹, Chaojie Liu¹, Fan Li¹, Nan Su²
     ¹Huazhong University of Science and Technology, China
     ²Shandong Massive Information Technology Research Institute, China

  2. Outline
     • Background and motivation
     • Our Work: HMEH
     • Performance Evaluation
     • Conclusion

  3. Background: Non-Volatile Memory (NVM)
     • NVM is expected to complement or replace DRAM as main memory, sitting below the CPU cache hierarchy (e.g., Intel Optane DC Persistent Memory)
     • Pros: non-volatile, large capacity, high performance, low standby power
     • Cons: limited write endurance, asymmetric read/write properties

  4. Background: NVM-based hash structures
     • Hashing structures are widely used in storage systems: main-memory databases, in-cache indexes, in-memory key-value stores
     • Previous work is insufficient for real NVM devices: PFHT [INFLOW 2015], Path hashing [MSST 2017], Level hashing [OSDI 2018], CCEH [FAST 2019]

  5. Motivation: The design of hashing structures
     • Static vs. dynamic hashing structures
     • Static hashing: resizing the hash table is cost-inefficient, since all items must be rehashed
     • Dynamic hashing: each lookup needs an extra directory access, and the read latency of Optane DCPMM is high
     [Figure: a static hash table rehashing all items vs. a dynamic structure whose directory points to buckets]

  6. Motivation: High overhead for data consistency
     • Data consistency guarantee: the volatile/non-volatile boundary lies between the CPU cache and NVM
     • Cache lines are evicted arbitrarily, so memory writes can reach NVM out of program order
     • Example: a program stores a value and then its key, but eviction persists the key first; a crash at that point leaves a key without its value, i.e., an inconsistent state
     • Fix: Flush (write cache lines back) and Fence (order the flushes) between the two stores enforce the intended order, but these instructions are expensive (a sketch of the pattern follows below)
     [Figure: animation of the two stores being reordered on the way to NVM, a crash causing inconsistency, and Flush/Fence restoring the order]
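
  A minimal sketch of the store-flush-fence pattern described above, using x86
  persistence intrinsics. The helper names (persist, store_kv_ordered) are
  illustrative assumptions, not code from the HMEH paper.

      /* Assumes x86 with CLFLUSHOPT support (compile with -mclflushopt). */
      #include <immintrin.h>
      #include <stdint.h>

      /* Write the cache line holding addr back to NVM and order it
       * before any later stores. */
      static void persist(const void *addr) {
          _mm_clflushopt((void *)addr);
          _mm_sfence();
      }

      /* Store the value, persist it, then store the key: after a crash,
       * a visible key implies its value is already durable. */
      static void store_kv_ordered(uint64_t *key_slot, uint64_t key,
                                   uint64_t *val_slot, uint64_t val) {
          *val_slot = val;
          persist(val_slot);   /* the expensive flush + fence */
          *key_slot = key;
          persist(key_slot);
      }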

  7. Motivation: High overhead for data consistency
     • Evaluation with/without Fence and Flush on Optane DCPMM: when the Fence and Flush instructions are removed, the throughputs of CCEH [FAST 2019], Level hashing [OSDI 2018], linear hashing, and cuckoo hashing improve by 20.3% to 29.1%
     • Our goals: high-performance dynamic hashing with low data-consistency overhead and fast recovery

  8. Our Scheme: HMEH
     • HMEH: extendible hashing for hybrid DRAM-NVM memory
     • A flat-structured directory (FS-directory) in DRAM for fast access, plus a radix-tree directory (RT-directory) in NVM for recovery
     • Lookup path: directory → segment → cacheline-sized bucket; the hash key is split into a segment index and a bucket index (see the sketch below)
     [Figure: both directories pointing to segments in NVM, each segment holding four buckets of key/value slots]
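
  A sketch of how the directory lookup might slice the hash key; the bit
  widths and names here are assumptions for illustration, not taken from
  the HMEH code.

      #include <stdint.h>

      #define BUCKET_BITS 2   /* 4 buckets per segment, as in the figure */
      #define BUCKETS_PER_SEGMENT (1u << BUCKET_BITS)

      typedef struct segment segment_t;

      /* The top global_depth bits of the hash pick a directory entry
       * (global_depth >= 1 assumed). */
      static segment_t *find_segment(segment_t **fs_directory,
                                     unsigned global_depth, uint64_t hash) {
          uint64_t seg_idx = hash >> (64 - global_depth);
          return fs_directory[seg_idx];
      }

      /* The low bits pick the cacheline-sized bucket inside the segment. */
      static unsigned find_bucket(uint64_t hash) {
          return (unsigned)(hash & (BUCKETS_PER_SEGMENT - 1));
      }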

  9. HMEH: Two directories
     • Flat-structured directory vs. radix-tree directory: the radix tree is NVM-friendly, so HMEH exploits the RT-directory to rebuild the FS-directory upon recovery
     • With global depth G, a segment of local depth L is pointed to by 2^(G-L) directory entries (see the sketch below)
     [Figure: a directory with global depth 3, entries 000-111, pointing to segments with local depths 1, 2, and 3]
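
  A tiny illustration of the 2^(G-L) aliasing rule (the function name is
  assumed, not from the paper):

      /* Number of flat-directory entries that point to one segment. */
      static unsigned entries_per_segment(unsigned G, unsigned L) {
          return 1u << (G - L);   /* e.g., G = 3, L = 1 -> 4 entries */
      }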

  10. HMEH: Low data-consistency overhead
     • Cross-KV mechanism: split a KV item into several pieces and store the key and value pieces alternately (K1, V1, K2, V2, ...), each as an atomic 8-byte block
     • Because every 8-byte store is atomic, the item stays recoverable even if a crash hits mid-write, which avoids lots of Flush and Fence instructions (a sketch follows below)
     [Figure: animation of pieces K1, V1, K2, V2 moving from the CPU cache to NVM; after a crash, the interleaved pieces already in NVM remain consistent]
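
  A sketch of how a Cross-KV store could look, assuming a 16-byte key and a
  16-byte value split into two 8-byte pieces each; the layout and names are
  my reading of the slides, not code from the paper.

      #include <immintrin.h>
      #include <stdint.h>
      #include <stddef.h>

      #define PIECES 2                 /* 16 B key + 16 B value -> 2 pieces each */

      typedef struct {
          uint64_t blk[2 * PIECES];    /* interleaved: K1 V1 K2 V2 */
      } crosskv_slot_t;                /* 32 B: one cache line if 64 B-aligned */

      static void crosskv_store(crosskv_slot_t *slot,
                                const uint64_t key[PIECES],
                                const uint64_t val[PIECES]) {
          for (size_t i = 0; i < PIECES; i++) {
              slot->blk[2 * i]     = key[i];   /* each 8 B store is atomic on x86 */
              slot->blk[2 * i + 1] = val[i];
          }
          /* One flush + fence for the whole item, instead of one per store. */
          _mm_clflushopt(slot);
          _mm_sfence();
      }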

  11. HMEH: Improve load factor
     • Resolve hash collisions with two techniques (sketched below)
     • Linear probing: probe up to 4 buckets (256 bytes, the access granularity of Intel Optane DCPMM)
     • Stash: a non-addressable area that stores colliding items
     [Figure: a hashed key probing the buckets of several segments, with overflow items falling into each segment's stash]

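  A sketch of the probe path this slide describes; the bucket and stash
  sizes are assumptions for illustration.

      #include <stdint.h>
      #include <stdbool.h>

      #define SLOTS_PER_BUCKET 4
      #define PROBE_BUCKETS    4       /* 4 x 64 B buckets = 256 B probe window */
      #define NUM_BUCKETS      64
      #define STASH_SLOTS      16

      typedef struct { uint64_t key, val; } slot_t;

      typedef struct {
          slot_t bucket[NUM_BUCKETS][SLOTS_PER_BUCKET];
          slot_t stash[STASH_SLOTS];   /* overflow area for colliding items */
      } segment_t;

      static bool segment_get(const segment_t *seg, unsigned bidx,
                              uint64_t key, uint64_t *val) {
          /* Probe up to PROBE_BUCKETS consecutive cacheline-sized buckets. */
          for (unsigned p = 0; p < PROBE_BUCKETS; p++) {
              const slot_t *b = seg->bucket[(bidx + p) % NUM_BUCKETS];
              for (unsigned s = 0; s < SLOTS_PER_BUCKET; s++)
                  if (b[s].key == key) { *val = b[s].val; return true; }
          }
          /* Items that overflowed the probe window live in the stash. */
          for (unsigned s = 0; s < STASH_SLOTS; s++)
              if (seg->stash[s].key == key) { *val = seg->stash[s].val; return true; }
          return false;
      }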

  12. HMEH: Optimistic concurrency
     • A mutex and a version number protect the directories, so reads are lock-free
     • Fine-grained locks are taken only for segment splits
     • Compare-and-swap instructions claim slots within buckets (a sketch of the read and insert paths follows below)
     [Figure: two segments whose bucket slots are updated via compare-and-swap]
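
  A sketch of what the lock-free read and CAS-based slot claim could look
  like, using C11 atomics; the version-validation loop and the EMPTY/BUSY
  sentinels are assumptions for illustration, not HMEH's actual code.

      #include <stdatomic.h>
      #include <stdint.h>
      #include <stdbool.h>

      #define SLOTS 16
      #define EMPTY 0u                  /* assumed: 0 is never a valid key  */
      #define BUSY  UINT64_MAX          /* assumed: reserved while inserting */

      typedef struct {
          atomic_uint_fast64_t version; /* bumped around each segment split */
          _Atomic uint64_t     key[SLOTS];
          uint64_t             val[SLOTS];
      } table_t;

      /* Lock-free read: probe, then re-check the version; retry if a split
       * (which bumps the version) raced with the probe. */
      static bool lock_free_read(table_t *t, uint64_t key, uint64_t *val) {
          for (;;) {
              uint_fast64_t v = atomic_load(&t->version);
              bool found = false;
              for (int i = 0; i < SLOTS; i++)
                  if (atomic_load(&t->key[i]) == key) {
                      *val = t->val[i];
                      found = true;
                      break;
                  }
              if (atomic_load(&t->version) == v)
                  return found;         /* no split interfered: result is valid */
          }
      }

      /* Insert: reserve the slot with CAS, write the value, publish the key. */
      static bool claim_slot(table_t *t, int i, uint64_t key, uint64_t val) {
          uint64_t expected = EMPTY;
          if (!atomic_compare_exchange_strong(&t->key[i], &expected, BUSY))
              return false;             /* another writer took this slot */
          t->val[i] = val;
          atomic_store(&t->key[i], key); /* readers can match it only now */
          return true;
      }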
