WiscKey: Separating Keys from Values in SSD-Conscious Storage



  1. WiscKey: Separating Keys from Values in SSD-Conscious Storage. Lanyue Lu, Thanumalayan Pillai, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau (University of Wisconsin-Madison)

  2-4. Key-Value Stores
  Key-value stores are important
  ➡ web indexing, e-commerce, social networks
  ➡ various key-value store designs: hash tables, B-trees, log-structured merge-trees (LSM-trees)
  LSM-tree based key-value stores are popular
  ➡ optimized for write-intensive workloads
  ➡ widely deployed: BigTable and LevelDB at Google; HBase, Cassandra and RocksDB at Facebook

  5-8. Why LSM-trees?
  Good for hard drives
  ➡ batch writes and write sequentially
  ➡ high sequential throughput
  ➡ sequential access up to 1000x faster than random
  Not optimal for SSDs
  ➡ large write/read amplification
  ➡ wastes device resources
  ➡ does not exploit the unique characteristics of SSDs: fast random reads, internal parallelism

  9-16. Our Solution: WiscKey
  Separate keys from values
  ➡ decouple sorting and garbage collection
  ➡ harness SSD's internal parallelism for range queries
  ➡ online and lightweight garbage collection
  ➡ minimize I/O amplification and remain crash consistent
  [diagram: keys kept in the LSM-tree, values kept in a separate Value Log]
  Performance of WiscKey
  ➡ 2.5x to 111x for loading, 1.6x to 14x for lookups

  17. Background Key-Value Separation Challenges and Optimizations Evaluation Conclusion

  18-27. LSM-trees: Insertion (LevelDB)
  [animated diagram: in memory, a memtable and an immutable memtable; on disk, a log and levels L0 (8MB), L1 (10MB), L2 (100MB), ..., L6 (1TB); an inserted KV pair is appended to the log, inserted into the memtable, later flushed to L0, and compacted down the levels]
  1. Write sequentially
  2. Sort data for quick lookups
  3. Sorting and garbage collection are coupled
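The write path above can be condensed into a short sketch. This is only an illustration of the steps the slides describe (log append, memtable insert, flush, compaction trigger), not LevelDB's code; the class name, sizes, and threshold are made up.

# Toy sketch of the LSM-tree (LevelDB-style) write path; illustrative only.
class TinyLSM:
    MEMTABLE_LIMIT = 4  # entries; real systems flush by size (a few MB)

    def __init__(self):
        self.log = []        # write-ahead log: sequential appends
        self.memtable = {}   # in-memory buffer (a dict stands in here)
        self.levels = [[] for _ in range(7)]  # L0..L6, lists of sorted runs

    def put(self, key, value):
        self.log.append((key, value))   # step 1: append to the on-disk log
        self.memtable[key] = value      # step 2: insert into the memtable
        if len(self.memtable) >= self.MEMTABLE_LIMIT:
            self._flush()               # later steps: flush and compaction

    def _flush(self):
        run = sorted(self.memtable.items())  # sort keys before writing to L0
        self.levels[0].append(run)           # a new sorted run lands in L0
        self.memtable.clear()
        self.log.clear()                     # flushed entries no longer need the log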

  28-35. LSM-trees: Lookup (LevelDB)
  [animated diagram: a lookup for key K first checks the in-memory memtable, then searches the on-disk levels from L0 down through L1 to L6 until the key is found]
  1. Random reads
  2. Travel many levels for a large LSM-tree
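The read path can be sketched in the same toy model as the insertion sketch above (a dict memtable, levels holding sorted runs). Real LevelDB also checks an immutable memtable and uses file metadata and Bloom filters; here a linear scan stands in for the binary search over a sorted table.

# Toy sketch of the LSM-tree read path; illustrative only.
def lsm_get(memtable, levels, key):
    if key in memtable:              # newest data lives in memory
        return memtable[key]
    for level in levels:             # then search L0, L1, ..., L6 in order
        for run in reversed(level):  # within a level, newer runs shadow older
            for k, v in run:         # each on-disk probe is a random read
                if k == key:
                    return v
    return None                      # key not found at any level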

  36-39. I/O Amplification in LSM-trees
  Random load: a 100 GB database; random lookup: 100,000 lookups
  [bar chart: amplification ratio on the 100 GB database, write amplification 14 and read amplification 327]
  Problems: large write amplification, large read amplification
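To make the ratios concrete, a back-of-the-envelope calculation follows. Only the 14x and 327x figures come from the slide; the 1 KB value size used for the lookup total is an assumption.

# Rough arithmetic behind the amplification ratios above (value size assumed).
user_data_gb = 100                    # randomly loaded database size
write_amp = 14                        # device writes / user writes (from slide)
print(write_amp * user_data_gb)       # -> 1400 GB actually written to the SSD

lookups = 100_000
value_kb = 1                          # assumed entry size, for illustration
read_amp = 327                        # device reads / data returned (from slide)
device_reads_mb = lookups * value_kb * read_amp / 1024
print(round(device_reads_mb))         # -> ~31934 MB read to return ~98 MB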

  40. Background Key-Value Separation Challenges and Optimizations Evaluation Conclusion

  41-47. Key-Value Separation
  Main idea: only keys are required to be sorted
  Decouple sorting and garbage collection
  [diagram: on the SSD device, the value is appended to the Value Log while the LSM-tree stores only the key and the value's address (k, addr)]
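A minimal sketch of this layout follows, with a plain dict standing in for the key LSM-tree and a byte array for the value log; the names and structure are illustrative, not the paper's implementation.

# Toy sketch of WiscKey-style key-value separation; illustrative only.
class TinyWiscKey:
    def __init__(self):
        self.index = {}          # key -> (offset, length); the small LSM-tree
        self.vlog = bytearray()  # value log: values appended sequentially

    def put(self, key, value: bytes):
        offset = len(self.vlog)
        self.vlog += value                      # append the value to the vLog
        self.index[key] = (offset, len(value))  # index keeps only key + address

    def get(self, key):
        if key not in self.index:
            return None
        offset, length = self.index[key]        # cheap search in a small index
        return bytes(self.vlog[offset:offset + length])  # one random SSD read

Because the index holds only keys and small addresses, compaction moves far less data, which is where the loading gains on the following slides come from.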

  48-53. Random Load (load a 100 GB database)
  [chart: throughput (MB/s), LevelDB vs. WiscKey; key: 16B, value: 64B to 256KB]
  LevelDB: large write amplification (12 to 16), only 2 MB/s to 4.1 MB/s
  WiscKey: small write amplification due to key-value separation, up to 111x higher throughput
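A rough, assumption-laden illustration of why the gap grows with value size: compaction in LevelDB rewrites whole key-value pairs many times, while WiscKey rewrites only keys and addresses and writes each value once. The sizes and the rewrite factor below are assumptions chosen to mirror the slide's numbers.

# Per-entry bytes written to the device under each design (all sizes assumed).
key, value, addr = 16, 1024, 16     # bytes; 1 KB value chosen for illustration
rewrite_factor = 14                 # times an entry is rewritten by compaction
leveldb_bytes = rewrite_factor * (key + value)         # whole pair rewritten
wisckey_bytes = value + rewrite_factor * (key + addr)  # value once + tiny index
print(leveldb_bytes, wisckey_bytes)  # -> 14560 vs. 1472 bytes per 1 KB entry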

  54-57. LevelDB vs. WiscKey: number of files per level (file limit in parentheses)
  Level        LevelDB files   WiscKey files
  L0           9               7
  L1 (5)       30              11
  L2 (50)      365             127
  L3 (500)     2184            460
  L4 (5000)    15752           0
  L5 (50000)   23733           0
  L6 (500000)  0               0
  Large LSM-tree (LevelDB): intensive compaction ➡ repeated reads/writes ➡ stalls foreground I/Os; many levels ➡ travel several levels for each lookup
  Small LSM-tree (WiscKey): less compaction, fewer levels to search, and better caching

  58. Random Lookup
  [chart: random-lookup throughput (MB/s), LevelDB vs. WiscKey; key: 16B, value: 64B to 256KB]
