Position: Synergetic Effects of Software and Hardware Parameters on the LSM System



  1. Position: Synergetic Effects of Software and Hardware Parameters on the LSM System
     Authors: Jinghuan Yu, Heejin Yoon*, Sam H. Noh*, Young-ri Choi*, Chun Jason Xue *

  2. Log Structured Merge-tree (LSM)
     • Designed specifically for HDDs and write-intensive workloads. Does the working principle of LSM still fit newer storage media?
     • Periodic compaction with varying resource occupation. What is the critical factor deciding performance?

  3. Performance Features of Each Medium
     SATA SSD
     • Limited bandwidth
     • Poor parallel performance
     • Unstable performance due to foreground garbage collection
     NVMe SSD
     • Highest sequential bandwidth
     • Strong parallel performance, with higher requirement for CPU
     • Performance is strongly affected by write granularity
     PMM
     • Relatively low latency
     • Stable performance at any degree of parallelism
     • Strong wear lifetime without GC

     Media Type    Average Access Latency (µs)
     SATA SSD      37.78
     NVMe SSD      11.77
     PMM           2.61
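The latency figures on this slide can be compared directly. The snippet below is a minimal sketch that only restates the three averages from the table; the names `latency_us` and `latency_ratio` are illustrative and not from the deck.

```python
# Average access latency in microseconds, copied from the slide's table.
latency_us = {"SATA SSD": 37.78, "NVMe SSD": 11.77, "PMM": 2.61}


def latency_ratio(slower, faster, table=latency_us):
    """Return how many times longer `slower` takes than `faster` on average."""
    return table[slower] / table[faster]


if __name__ == "__main__":
    for media in ("SATA SSD", "NVMe SSD"):
        print(f"{media} vs PMM: {latency_ratio(media, 'PMM'):.1f}x")
```

On these numbers, the average access latency of SATA SSD is roughly 14.5 times that of PMM, and that of NVMe SSD roughly 4.5 times.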

  4. Performance Comparison of Devices in RocksDB
     [Figure: throughput (kOps/sec) on SATA SSD, NVMe SSD, and PMM with 2, 4, and 8 CPUs, for operation batch sizes of 16 MB, 32 MB, 64 MB, and 128 MB]
     • Best throughput
     • Not sensitive to the number of CPUs or batch size
     • Increasing number of CPUs causes IO congestion, decreasing performance
     • Performance increase tends to be stable as the number of CPUs increases
     • Throughput difference is far from the bandwidth comparison
     • With fixed CPUs, benefits from larger batch size
     • Suffers from larger batch size

  5. Existing Solutions vs. Our Targets
     Heterogeneous Storage
     • Existing solutions: rule-based selection; size-based scaling; unified configuration
     • Our targets: auto scaling; driven by workload; device oriented
     Parameter Tuning
     • Existing solutions: offline tuning; based on statistics data
     • Our targets: online tuning; based on quantitative analysis
     Resource Utilization
     • Existing solutions: disk utilization first; lazy scheduling; based on device features
     • Our targets: both CPU and disk utilization; smooth, effective, and predictable

  6. Characteristic analysis and design points

  7. Performance Traits of SATA SSD
     Strength
     • Effective for bulk single-thread write workloads
     Weakness
     • Serious IO congestion during multi-thread compaction
     Design Opportunities
     • Single-queue continuous write
     • Large-grained operations
     [Figure: Cumulative IO Time in Compaction Runs. IO time (s) on SATA SSD with 2, 4, and 8 CPUs, for operation batch sizes of 16 MB, 32 MB, 64 MB, and 128 MB]
     On SATA SSD, IO time increases dramatically as the number of CPUs increases.
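The "single-queue continuous write" opportunity can be illustrated with a small sketch: several producer threads hand their batches to one queue, and a single writer thread drains it, so the device only ever sees one write stream. This is an assumption about what the bullet could look like in practice, not the authors' implementation; `run_single_queue_writer` and the in-memory `written` list (standing in for the device) are hypothetical.

```python
import queue
import threading


def run_single_queue_writer(batches, num_producers=4):
    """Funnel batches from several producer threads through one writer thread."""
    pending = queue.Queue()
    written = []  # stands in for the device; only the writer thread appends

    def writer():
        # One consumer: writes are issued strictly one after another.
        while True:
            item = pending.get()
            if item is None:  # sentinel: all producers are done
                break
            written.append(item)

    def producer(chunk):
        for batch in chunk:
            pending.put(batch)

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()

    producers = [
        threading.Thread(target=producer, args=(batches[i::num_producers],))
        for i in range(num_producers)
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()

    pending.put(None)
    writer_thread.join()
    return written
```

Every batch reaches the writer exactly once; the order across producers is not fixed, only the fact that writes are serialized.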

  8. Performance Traits of NVMe SSD
     Strength
     • High bandwidth
     Weaknesses
     • Larger batch size decreases the performance
     Design Opportunities
     • Quicksand effect: quicker devices make the data sink too quickly, which decreases performance
     • Improve the pipeline of compaction work
     [Figure: Performance of NVMe SSD with different bandwidth (limited by cgroup), by operation batch size. Series: bandwidth = 400 MB, 800 MB, 1200 MB, 1600 MB, and unlimited bandwidth]
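The slide says the bandwidth was limited by cgroup but gives no configuration. As a hedged illustration only, the helper below builds a line in the cgroup v2 `io.max` format (`MAJ:MIN wbps=<bytes per second>`). It assumes cgroup v2, assumes the slide's limits are per-second write bandwidth, and treats 1 MB as 1024 × 1024 bytes; the function name `io_max_line` and the device numbers `259:0` are placeholders, not values from the deck.

```python
def io_max_line(major, minor, write_mb_per_s=None):
    """Build a cgroup v2 io.max entry that caps write bandwidth for one device.

    write_mb_per_s=None produces "max", which removes the write limit.
    """
    if write_mb_per_s is None:
        limit = "max"
    else:
        # Assumption: 1 MB is taken as 1024 * 1024 bytes.
        limit = str(write_mb_per_s * 1024 * 1024)
    return f"{major}:{minor} wbps={limit}"


if __name__ == "__main__":
    # 259:0 is a placeholder device number; look up the real one with `lsblk`.
    for mb in (400, 800, 1200, 1600, None):
        print(io_max_line(259, 0, mb))
```

Each printed line would be written to the `io.max` file of the cgroup that runs the database process.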

  9. Performance Traits of PMM
     Strength
     • Stable parallel performance
     Weaknesses
     • More sensitive to slow L0 compactions, which can be solved by changing the size ratio between L0 and L1 files
     Design Opportunities
     • More flexible data structures
     • Can be used directly as a memory expansion
     • Non-volatile, free of consistency overhead
     [Figure: by operation batch size]
     Size ratio here means (total size of L0 files) / (total size of L1 files), controlled by compaction scheduling parameters such as WAL
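The size ratio defined on this slide is a plain quotient of two totals. A minimal sketch, assuming the per-file sizes of both levels are already known (for example from the store's own statistics); `l0_l1_size_ratio` is an illustrative name, not a RocksDB API.

```python
def l0_l1_size_ratio(l0_file_sizes, l1_file_sizes):
    """Return (total size of L0 files) / (total size of L1 files)."""
    total_l1 = sum(l1_file_sizes)
    if total_l1 == 0:
        raise ValueError("L1 is empty; the size ratio is undefined")
    return sum(l0_file_sizes) / total_l1


if __name__ == "__main__":
    # Example sizes in MB, chosen only for illustration.
    print(l0_l1_size_ratio([64, 64], [256]))
```

Any unit works as long as both lists use the same one, since the ratio is dimensionless.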

  10. DOTA: Device Oriented Tuning Advisor
      Solutions and challenges:
      • Online modeling
      • Workload adapting
      • Data placement and migration
      • Online tuning
      • Global resource management
      • Thread pool and resource allocation
      • Amplification reduction and data reuse
      • Environment detecting and monitoring

  11. Email Address: jinghuayu2-c@my.cityu.edu.hk
      Github link: https://github.com/supermt/utils_for_lsm.git
