
A Low-Latency Multi-Version Key-Value Store Using B-tree on an FPGA-CPU Platform


  1. A Low-Latency Multi-Version Key-Value Store Using B-tree on an FPGA-CPU Platform
     Yuchen Ren, Jinyu Xie, Yunhui Qiu, Hankun Lv, Wenbo Yin, Lingli Wang
     State Key Laboratory of ASIC and System, Fudan University
     Bowei Yu, Hua Chen, Xianjun He, Zhijian Liao, Xiaozhong Shi
     IT R&D Dept., Chengdu Research Institute, Huawei Technologies Co., Ltd.
     FPL’19, Barcelona, September 11th, 2019

  2. Introduction - Background
     Performance & power consumption
     • CPU-based: low (power) efficiency of CPU-centric memory hierarchy
     • RDMA-based: limited flexibility and efficiency of RDMA
     • FPGA-based
     Multi-Version KVS (Key-Value Store)
     • Key → (Version 1, Value 1), (Version 2, Value 2), (Version ..., Value ...)
     * RDMA: Remote Direct Memory Access

  3. Introduction - Contribution
     Design
     • a low-latency multi-version in-memory KVS
     • FPGA-CPU heterogeneous architecture
     Storage
     • keys
       – hash table (Cuckoo hashing)
       – FPGA board
     • version-value pairs (VVPs)
       – B-trees
       – host memory
     Operation
     • get, put, delete, CAS, getPredecessor
       – bypassing the CPU
     • range query
       – with the help of the CPU
     * CAS: Compare and Swap
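The operation list on this slide can be read as the following software model. It is only a minimal sketch of plausible semantics: the class name, the method signatures, the choice that getPredecessor returns the pair with the largest version strictly below the requested one, and the inclusive bounds of the range query are all assumptions, not details given in the slides. The FPGA/host split is ignored entirely.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class MultiVersionKVS:
    """Toy model (hypothetical): each key owns a sorted set of versions."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[int]] = {}       # key -> sorted versions
        self._values: Dict[str, Dict[int, bytes]] = {}  # key -> version -> value

    def put(self, key: str, version: int, value: bytes) -> None:
        vals = self._values.setdefault(key, {})
        if version not in vals:
            bisect.insort(self._versions.setdefault(key, []), version)
        vals[version] = value

    def get(self, key: str, version: int) -> Optional[bytes]:
        return self._values.get(key, {}).get(version)

    def delete(self, key: str, version: int) -> bool:
        vals = self._values.get(key, {})
        if version not in vals:
            return False
        del vals[version]
        self._versions[key].remove(version)
        return True

    def cas(self, key: str, version: int, expected: bytes, new: bytes) -> bool:
        """Compare and swap: overwrite only if the stored value equals `expected`."""
        vals = self._values.get(key, {})
        if version not in vals or vals[version] != expected:
            return False
        vals[version] = new
        return True

    def get_predecessor(self, key: str, version: int) -> Optional[Tuple[int, bytes]]:
        """Assumed semantics: pair with the largest version strictly below `version`."""
        vers = self._versions.get(key, [])
        i = bisect.bisect_left(vers, version)
        if i == 0:
            return None
        v = vers[i - 1]
        return v, self._values[key][v]

    def range_query(self, key: str, lo: int, hi: int) -> List[Tuple[int, bytes]]:
        """Assumed semantics: all pairs with lo <= version <= hi, in version order."""
        vers = self._versions.get(key, [])
        start = bisect.bisect_left(vers, lo)
        end = bisect.bisect_right(vers, hi)
        return [(v, self._values[key][v]) for v in vers[start:end]]
```

In this model a point operation touches one key and one version, while a range query returns a variable-length result, which may be one reason the slide treats the two groups differently.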

  4. Architecture

  5. Architecture - Network Offload Engine

  6. Architecture - Key-Value Store Engine

  7. Architecture - First-level indexing by key
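The slide itself carries only a diagram, but the contribution slide names Cuckoo hashing as the key index. A minimal software sketch of two-table cuckoo hashing follows. The table size, the kick limit, the BLAKE2b-based hash functions, and the idea that the stored value is a handle to the key's version index are assumptions for illustration, not details of the presented hardware.

```python
import hashlib
from typing import Any, Callable, List, Optional, Tuple


def _salted_hash(salt: bytes) -> Callable[[bytes], int]:
    """Build one hash function per table from BLAKE2b with a distinct salt."""
    def h(key: bytes) -> int:
        digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "little")
    return h


class CuckooHashTable:
    """Two tables, two hash functions; every key has exactly two candidate slots."""

    def __init__(self, slots_per_table: int = 64, max_kicks: int = 32,
                 h1: Optional[Callable[[Any], int]] = None,
                 h2: Optional[Callable[[Any], int]] = None) -> None:
        self.n = slots_per_table
        self.max_kicks = max_kicks
        self.hashes = (h1 or _salted_hash(b"table0"), h2 or _salted_hash(b"table1"))
        self.tables: List[List[Optional[Tuple[Any, Any]]]] = [
            [None] * self.n, [None] * self.n]

    def _slot(self, table: int, key: Any) -> int:
        return self.hashes[table](key) % self.n

    def lookup(self, key: Any) -> Optional[Any]:
        # At most two probes, which keeps lookup cost bounded.
        for t in (0, 1):
            entry = self.tables[t][self._slot(t, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key: Any, value: Any) -> bool:
        # Update in place if the key is already stored.
        for t in (0, 1):
            s = self._slot(t, key)
            entry = self.tables[t][s]
            if entry is not None and entry[0] == key:
                self.tables[t][s] = (key, value)
                return True
        # Otherwise place the entry, evicting occupants to their other table.
        entry, t = (key, value), 0
        for _ in range(self.max_kicks):
            s = self._slot(t, entry[0])
            entry, self.tables[t][s] = self.tables[t][s], entry
            if entry is None:
                return True
            t = 1 - t
        # Kick limit reached: this sketch drops the last evicted entry.
        # A real design would rehash or keep a stash instead.
        return False
```

For example, `CuckooHashTable().insert(b"user:42", 7)` stores a hypothetical handle `7` that `lookup(b"user:42")` returns after at most two probes.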

  8. Architecture - Second-level indexing by version
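This slide is also a diagram only. As a rough illustration of indexing versions with a B-tree, the sketch below walks a small hand-built tree for an exact lookup and for a predecessor lookup, visiting one node per level. The node layout, the function names, and the strict "less than" predecessor rule are assumptions; insertion, node splitting, and the actual hardware node format are not modeled.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BTreeNode:
    versions: List[int]                      # sorted versions stored in this node
    values: List[bytes]                      # values[i] belongs to versions[i]
    children: List["BTreeNode"] = field(default_factory=list)  # empty for a leaf


def search(root: BTreeNode, version: int) -> Tuple[Optional[bytes], int]:
    """Exact lookup. Returns (value or None, number of nodes visited)."""
    node: Optional[BTreeNode] = root
    visited = 0
    while node is not None:
        visited += 1
        i = bisect.bisect_left(node.versions, version)
        if i < len(node.versions) and node.versions[i] == version:
            return node.values[i], visited
        node = node.children[i] if node.children else None
    return None, visited


def predecessor(root: BTreeNode, version: int) -> Optional[Tuple[int, bytes]]:
    """Pair with the largest stored version strictly below `version`, if any."""
    best: Optional[Tuple[int, bytes]] = None
    node: Optional[BTreeNode] = root
    while node is not None:
        i = bisect.bisect_left(node.versions, version)
        if i > 0:
            # Any candidate found deeper lies above this one, so overwriting is safe.
            best = (node.versions[i - 1], node.values[i - 1])
        node = node.children[i] if node.children else None
    return best


def leaf(versions: List[int]) -> BTreeNode:
    """Helper for the example: a leaf whose values are b'v<version>'."""
    return BTreeNode(versions, [b"v%d" % v for v in versions])


# Hand-built two-level tree: root [10, 20] over leaves [1, 5], [12, 15], [25, 30].
root = BTreeNode([10, 20], [b"v10", b"v20"],
                 [leaf([1, 5]), leaf([12, 15]), leaf([25, 30])])
```

Because a B-tree stores entries in internal nodes as well as leaves, `search(root, 10)` finishes at the root, while `search(root, 15)` needs a second node visit.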

  9. Implementation
     FPGA platform
     • Xilinx KCU105
     Frequency
     • KVSE (Key-Value Store Engine): 120MHz
     • DMA: 250MHz
     • DDR4: 300MHz
     • NOE (Network Offload Engine): 156.25MHz
     PC
     • Intel i5-2400 quad-core CPU
     • 12GB DDR3
     • 256GB SSD
     • CentOS 7
     Setup diagram labels: PC, PCIe gen3 x8, Xilinx KCU105, 2GB DDR4, two 10GbE

  10. Evaluation - Key-Value Store Message Generator in FPGA hardware

  11. Evaluation - Results
      • Latency increases almost linearly
      • KVSE is the bottleneck
      * Kops: Thousand operations per second

  12. Conclusion
      Comparison (latency, get operation)
      • Our KVS: < 8μs (within a B-tree of 5 levels)
      • Hybrid FPGA approach: ≈ 75μs (within a B+-tree of 5 levels)
      • Many software-based KVS systems: > 1ms (with versioning support)
      Future work
      • Optimize the system architecture of our multi-version KVS.
      • Expand to a distributed KVS by setting up multiple storage hosts.
      * Hybrid FPGA approach: D. Heinrich, S. Werner, M. Stelzner, C. Blochwitz, T. Pionteck and S. Groppe, “Hybrid FPGA approach for a B+ tree in a semantic Web database system,” 2015 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), Bremen, 2015, pp. 1-8.

  13. Thanks!
      Contact: ycren18@fudan.edu.cn, wbyin@fudan.edu.cn, llwang@fudan.edu.cn
