Caribou: Intelligent Distributed Storage
Zsolt István, David Sidler, Gustavo Alonso
Systems Group, Department of Computer Science, ETH Zurich

Rack-scale thinking
[Diagram: compute nodes and storage nodes connected through a ToR switch]
In the Cloud
• + Provisioning
• + Independent scalability
• - Data movement bottleneck
In an Appliance

Storage Design Options
Compute > Bandwidth (Oracle Exadata, IBM PureData)
• + Full-fledged
• - SW+HW overhead
• - Large footprint
Compute < Bandwidth (Samsung YourSQL, Wisconsin SmartSSD)
• - Outside management
• + No-overhead access
• + Small footprint
Compute ~ Bandwidth (Deuteronomy, Kinetic Drives, …, BlueCache, …)
• Features similar to software
• Balanced design

What is Caribou?
Intelligent Distributed Storage with FPGAs
[Diagram: clients connected through a 10Gbps switch to several Caribou nodes]
• Easy integration on commodity network
• Random access to tuples & in-storage scans
• Selection predicate pushdown
• Data replicated consistently to nodes
• Extensible (open-source) design

FPGA 101
Field Programmable Gate Array
• Reprogrammable hardware
• Large number of configurable logic blocks
• Tight integration, massive parallelism
• Network/App co-design
FPGA Innovation…

Inside a Caribou node
The pipeline runs at the same speed as the network (line-rate).
Software clients connect through the 10Gbps switch and use a key-value interface (single-key lookup or scanning).
• Network (TCP/IP): 1000s of connections, SW clients
• Replication: primary/backup, atomic broadcast
• Key-value management: cuckoo hash table, slab memory allocation, bitmap indexes
• Processing: conditionals, regex, decompression
• DRAM
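
The key-value management block on this slide names a cuckoo hash table. As a rough software sketch of that data structure only (not the FPGA design), the following toy table gives every key two candidate slots and evicts an occupant when both are taken. The class name, the SHA-256 based hash functions, the table size, and the eviction limit are all illustrative assumptions.

```python
import hashlib


class CuckooTable:
    """Toy two-choice cuckoo hash table (software sketch, assumed parameters)."""

    def __init__(self, num_slots=64, max_kicks=32):
        self.num_slots = num_slots
        self.max_kicks = max_kicks
        self.slots = [None] * num_slots  # each slot holds (key, value) or None

    def _positions(self, key):
        # Derive two candidate slots from one digest (assumed hash choice).
        digest = hashlib.sha256(key.encode()).digest()
        h1 = int.from_bytes(digest[:4], "big") % self.num_slots
        h2 = int.from_bytes(digest[4:8], "big") % self.num_slots
        return h1, h2

    def get(self, key):
        # A lookup inspects at most two slots, so its cost is bounded.
        for pos in self._positions(key):
            entry = self.slots[pos]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def put(self, key, value):
        # Update in place if the key is already stored.
        for pos in self._positions(key):
            entry = self.slots[pos]
            if entry is not None and entry[0] == key:
                self.slots[pos] = (key, value)
                return True
        # Otherwise use an empty candidate slot if there is one.
        for pos in self._positions(key):
            if self.slots[pos] is None:
                self.slots[pos] = (key, value)
                return True
        # Both candidates are full: evict and relocate, up to max_kicks times.
        item = (key, value)
        pos = self._positions(key)[0]
        for _ in range(self.max_kicks):
            item, self.slots[pos] = self.slots[pos], item  # item is now the evicted entry
            h1, h2 = self._positions(item[0])
            pos = h2 if pos == h1 else h1  # the evicted entry's other slot
            if self.slots[pos] is None:
                self.slots[pos] = item
                return True
        # Gave up: the entry still in hand is not stored. A real store would
        # resize, rehash, or reject the insert before displacing anything.
        return False
```

Because a lookup touches at most two slots, its latency is predictable, which is the kind of property that suits a fixed-rate pipeline.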

Throughput of random access to storage

Random access response times
• Response times comparable to SW on Infiniband, but Caribou uses commodity networking
[Figure: response time [us] (0 to 60) versus value size [B] (0 to 256) for Get, Put/Update, and Put/Update (Replicated)]

Operator push-down
The filtering circuits are parameterized at runtime, with no overhead.
SELECT … FROM customer
WHERE age<35 AND purchases>2
AND address LIKE "%Luzern%CH%"
• Multiple comparisons to constants (conjunction)
• Substrings or regular expression matching [1]
• Can filter compressed data (LZ77)
• Extensible pipeline design
[1] D. Sidler, Zs. István, M. Ewaida, G. Alonso. Accelerating Pattern Matching Queries in Hybrid CPU-FPGA Architectures. 2017 ACM SIGMOD/PODS Conference (SIGMOD'17).
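
To make the idea of a runtime-parameterized filter concrete, here is a minimal software sketch of the example query's WHERE clause: a conjunction of comparisons against constants plus one pattern match. It only mirrors the logic; the function names, the row layout as a dictionary, and the sample rows are assumptions, and only the % wildcard of LIKE is handled.

```python
import operator
import re

# Comparison operators a filter can be parameterized with (assumed set).
COMPARATORS = {"<": operator.lt, ">": operator.gt, "==": operator.eq}


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern that uses only % wildcards into a regex."""
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile("^" + ".*".join(parts) + "$", re.DOTALL)


def make_filter(comparisons, like=None):
    """Build a keep/drop predicate from parameters supplied at runtime.

    comparisons: list of (field, operator symbol, constant), combined with AND.
    like: optional (field, LIKE pattern) pair.
    """
    compiled = None if like is None else (like[0], like_to_regex(like[1]))

    def keep(row):
        for field, op, constant in comparisons:
            if not COMPARATORS[op](row[field], constant):
                return False
        if compiled is not None:
            field, regex = compiled
            if regex.match(row[field]) is None:
                return False
        return True

    return keep


# Parameters corresponding to the WHERE clause on the slide.
keep = make_filter(
    comparisons=[("age", "<", 35), ("purchases", ">", 2)],
    like=("address", "%Luzern%CH%"),
)

# Hypothetical sample rows.
rows = [
    {"age": 29, "purchases": 5, "address": "Seestrasse 1, 6003 Luzern, CH"},
    {"age": 41, "purchases": 7, "address": "Seestrasse 2, 6003 Luzern, CH"},
    {"age": 30, "purchases": 3, "address": "Bahnhofplatz 4, 8001 Zurich, CH"},
]
selected = [row for row in rows if keep(row)]
```

Changing the query only changes the arguments to `make_filter`; the code that evaluates rows stays the same, which is the software analogue of reparameterizing a circuit rather than rebuilding it.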

Exploiting Parallelism
[Figure: values read from DRAM pass through a transform stage (LZ77, Value to Value') and predicate stages (comparison, regular expressions) that output a keep bit (1 or 0) per value. Several LZ77 and Regex cores are instantiated side by side. Axis labels: Complexity, Throughput.]
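
The structure in this figure can be imitated in software as a rough sketch: each value is decompressed, matched against a pattern, and reduced to a keep bit, with several workers handling values side by side. Everything here is a stand-in chosen for illustration: `zlib` (DEFLATE, which builds on LZ77) replaces the LZ77 core, Python's `re` replaces the regex core, and a thread pool replaces the replicated hardware cores. Python threads show the dispatch structure only and do not reproduce hardware parallelism.

```python
import re
import zlib
from concurrent.futures import ThreadPoolExecutor


def keep_bit(compressed_value, regex):
    """Transform (decompress) one value, then evaluate the predicate on it."""
    value = zlib.decompress(compressed_value)
    return 1 if regex.search(value) else 0


def scan(compressed_values, pattern, num_cores=4):
    """Return one keep bit per stored value, in storage order."""
    regex = re.compile(pattern)
    with ThreadPoolExecutor(max_workers=num_cores) as pool:
        # map() hands values to the workers and yields results in input order.
        return list(pool.map(lambda v: keep_bit(v, regex), compressed_values))


# Hypothetical stored values, compressed before being written.
stored = [
    zlib.compress(b"Seestrasse 1, 6003 Luzern, CH"),
    zlib.compress(b"Bahnhofplatz 4, 8001 Zurich, CH"),
    zlib.compress(b"Pilatusstrasse 9, 6003 Luzern, CH"),
]
bits = scan(stored, rb"Luzern.*CH")
```

The result is a list with one bit per value, comparable to the 1/0 "Keep?" column in the figure.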

Scan and filter
Choice of filter and value size does not impact scan rate.
• Scan rate in GB/s is the same regardless of value size
[Figure annotations: one region bound by the network/client, the other bound by the filter performance]
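
One way to reason about the two regimes named on this slide is a simple back-of-the-envelope model, offered here as an assumption rather than as the authors' analysis: the filter caps how fast stored data can be scanned, and the network caps how fast matching data can be returned. The 1.25 GB/s default follows from the 10Gbps link mentioned earlier; the filter rate in the example is a hypothetical number.

```python
def scan_rate_gb_s(selectivity, filter_rate_gb_s, network_gb_s=1.25):
    """Estimate the sustainable scan rate under an assumed two-bound model.

    selectivity: fraction of scanned bytes that pass the filter (0.0 to 1.0).
    filter_rate_gb_s: rate at which the filter can consume stored data
        (hypothetical value supplied by the caller).
    network_gb_s: link capacity; 10 Gbps / 8 = 1.25 GB/s.
    """
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be between 0 and 1")
    if selectivity == 0.0:
        return filter_rate_gb_s  # nothing is returned, so only the filter limits
    # Returned bytes per second = selectivity * scan rate <= network capacity.
    return min(filter_rate_gb_s, network_gb_s / selectivity)


network_bound = scan_rate_gb_s(selectivity=1.0, filter_rate_gb_s=5.0)
filter_bound = scan_rate_gb_s(selectivity=0.1, filter_rate_gb_s=5.0)
```

In this model a scan that returns everything is limited by the link, while a highly selective scan is limited by the filter.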

Near Data Processing without Surprises
Filtering can be combined with random access reads as well.

“The Times They Are A-Changin'”
In-Storage Processing
• Stand-alone boards, MPSoC (ARM+FPGA)
• Add NVMe flash, non-volatile memory
• Explore different KVS (memcached, redis, …)
In-Network Processing
• Microsoft Catapult NICs
• Work on streaming data
• Distributed service in the cloud
Accelerator
• Intel Xeon+FPGA
• Offload computation without partitioning or copying data

Time to Explore…
Data movement bottleneck on many levels
Caribou – Intelligent Distributed Storage
• Software-like service in a small footprint
• Balanced design with “right amount” of compute
Caribou – Platform to Explore Near-data Processing
• Open source, modular and portable
• Data processing operators applicable on other HW platforms
https://github.com/fpgasystems/caribou
https://www.systems.ethz.ch/fpga/
zsolt.istvan@inf.ethz.ch