Learning Data Systems Components
Tim Kraska <kraska@mit.edu>
Work partially done at Google [Disclaimer: I am NOT talking on behalf of Google]
Topics: comments on social media, sorting, joins, trees, hash maps, Bloom filters


  1. Does It Work?
  Setup: 200M records of map data (e.g., restaurant locations); index on longitude; Intel E5 CPU with 32GB RAM; no GPU/TPUs; no special SIMD optimization (there is a lot of potential).

  Type           Config                    Lookup   Speedup vs. BTree   Size       Size vs. BTree
  BTree          page size: 128            260 ns   1.00X               12.98 MB   1.00X
  Learned index  2nd stage size: 10,000    222 ns   1.17X                0.15 MB   0.01X
  Learned index  2nd stage size: 50,000    162 ns   1.60X                0.76 MB   0.05X
  Learned index  2nd stage size: 100,000   144 ns   1.67X                1.53 MB   0.12X
  Learned index  2nd stage size: 200,000   126 ns   2.06X                3.05 MB   0.23X

  In short: 60% faster at 1/20th the space, or 17% faster at 1/100th the space.
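  For readers who have not seen the underlying structure, below is a minimal sketch of the two-stage recursive model index (RMI) lookup these numbers come from. The linear models, the min_err/max_err bound fields, and all names are illustrative assumptions, not the exact implementation.

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Two-stage RMI sketch: stage 1 routes to a stage-2 model; stage 2
    // predicts a position with known error bounds; a bounded binary
    // search inside [pos+min_err, pos+max_err] finds the key.
    struct LinearModel {
        double slope = 0.0, intercept = 0.0;
        int64_t min_err = 0, max_err = 0;  // worst-case prediction error
        double predict(double key) const { return slope * key + intercept; }
    };

    struct RMI {
        LinearModel root;                 // stage 1
        std::vector<LinearModel> leaves;  // stage 2
        std::vector<double> keys;         // sorted key column

        int64_t lookup(double key) const {  // index of key, or -1
            int64_t m = std::clamp<int64_t>((int64_t)root.predict(key),
                                            0, (int64_t)leaves.size() - 1);
            const LinearModel& leaf = leaves[(size_t)m];
            int64_t pos = (int64_t)leaf.predict(key);
            int64_t lo = std::clamp<int64_t>(pos + leaf.min_err, 0,
                                             (int64_t)keys.size() - 1);
            int64_t hi = std::clamp<int64_t>(pos + leaf.max_err, lo,
                                             (int64_t)keys.size() - 1);
            auto first = keys.begin() + lo, last = keys.begin() + hi + 1;
            auto it = std::lower_bound(first, last, key);
            return (it != last && *it == key) ? it - keys.begin() : -1;
        }
    };

  The error bounds are what make the approach safe: however wrong the model is, the final bounded search still returns the correct position.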

  2. You Might Have Seen Certain Blog Posts

  3. [Figure: index size (MB, log scale from 0.5 to 256) vs. lookup time (ns, 0 to 350) for FAST, a lookup table, a fixed-size read-optimized B-Tree with interpolation search, and the learned index; lower and further left ("Better") wins.]

  4. My Own Comparison

  5. A Comparison To ARTful Indexes (Radix Tree)
  Viktor Leis, Alfons Kemper, Thomas Neumann: The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases. ICDE 2013
  Experimental setup:
  • Dense: continuous keys from 0 to 256M
  • Sparse: 256M keys where each bit is equally likely 0 or 1

  6. A Comparison To ARTful Indexes (Radix Tree)
  Viktor Leis, Alfons Kemper, Thomas Neumann: The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases. ICDE 2013
  Experimental setup: continuous (dense) keys from 0 to 256M
  Reported lookup throughput: 10M/s ≈ 100 ns per lookup (1)
  Size: not measured, but the paper reports an overhead of ≈ 8 bytes per key (dense, best case): 256M × 8 bytes ≈ 1953 MB (1)
  (1) Numbers from the paper
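  Spelled out (reading 256M as 256 × 10⁶ keys and the slide's MB as binary, 2²⁰ bytes), the space estimate is:

    $256 \times 10^{6}\ \text{keys} \times 8\ \tfrac{\text{B}}{\text{key}}
      = 2.048 \times 10^{9}\ \text{B}
      = \frac{2.048 \times 10^{9}}{2^{20}}\ \text{MB}
      \approx 1953\ \text{MB}$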

  7. Learned Index
  Generated code:
  Record lookup(key) { return data[0 + 1 * key]; }

  8. Learned Index
  Generated code:
  Record lookup(key) { return data[key]; }

  9. Learned Index
  Generated code:
  Record lookup(key) { return data[key]; }
  Lookup latency: 10 ns (learned index) vs. 100 ns (ART), i.e., one order of magnitude better.
  Space: ~0 MB vs. 1953 MB: "infinitely" better :)
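  Why the generated code degenerates to a single array access: for perfectly dense keys 0..N-1 the learned CDF is the identity line (slope 1, intercept 0), so the prediction is exact and no correction search is needed. A minimal sketch, with all names assumed:

    #include <cassert>
    #include <cstdint>
    #include <vector>

    struct Record { int64_t key; /* payload ... */ };

    // For dense keys 0..N-1 the fitted model is position = 0 + 1 * key,
    // so the lookup compiles down to a single array access.
    Record lookup(const std::vector<Record>& data, int64_t key) {
        int64_t pos = 0 + 1 * key;  // slope 1, intercept 0: exact model
        assert(pos >= 0 && pos < (int64_t)data.size());
        return data[(size_t)pos];
    }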

  10. ?

  11. What about Updates and Inserts?

  12. What about Updates and Inserts? Alex Galakatos, Michael Markovitch, Carsten Binnig, Rodrigo Fonseca, Tim Kraska: 
 A-Tree: A Bounded Approximate Index Structure https://arxiv.org/abs/1801.10207

  13. The Simple Approach: Delta Indexing for Updates
  New updates go into a delta index. Training the simple multi-variate regression model can be done in one pass over the data, as sketched below.
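  As one concrete reading of "one pass over the data", the sketch below fits a simple linear regression of position against key in a single scan using closed-form least squares; the function names are assumptions for illustration.

    #include <cstddef>
    #include <vector>

    struct Fit { double slope, intercept; };

    // Least-squares fit of position ~ slope * key + intercept, computed
    // in one pass by accumulating the four sufficient statistics.
    // Assumes at least two distinct keys.
    Fit fit_one_pass(const std::vector<double>& sorted_keys) {
        double n = (double)sorted_keys.size();
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (size_t i = 0; i < sorted_keys.size(); ++i) {
            double x = sorted_keys[i], y = (double)i;  // position is the label
            sx += x; sy += y; sxx += x * x; sxy += x * y;
        }
        double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        return {slope, (sy - slope * sx) / n};
    }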

  14. Leverage the Distribution

  15. Leverage the Distribution for Appends
  New inserts (e.g., timestamps) arrive over time. If the learned model can generalize to the inserts, insert complexity is O(1), not O(log N); see the sketch below.
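  A minimal sketch of the O(1) fast path, under the assumption that the model generalizes so that a new maximal key is predicted at the end of the array (names hypothetical):

    #include <cstdint>
    #include <vector>

    struct Model {
        double slope, intercept;
        double predict(double k) const { return slope * k + intercept; }
    };

    // If the learned CDF generalizes, a new maximal key (e.g., a fresh
    // timestamp) is predicted at the array end, so inserting it is a
    // plain append: O(1) instead of O(log N).
    void insert_append(std::vector<double>& keys, const Model& m, double key) {
        if ((int64_t)m.predict(key) >= (int64_t)keys.size())
            keys.push_back(key);  // O(1) fast path
        // else: fall back to a sorted insert or a delta index (omitted)
    }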

  16. Updates/Inserts
  • Less beneficial, as the data still has to be stored sorted
  • Idea: leave space in the array where more updates/inserts are expected (sketched below)
  • This can also be done with traditional trees
  • But the error of learned indexes should increase with O(…) per node in the RMI, whereas traditional indexes grow with O(…)
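  A minimal sketch of the "leave space" idea above: a gapped array where an insert lands at the model-predicted slot and shifts elements only up to the nearest gap. The layout and names are assumptions for illustration.

    #include <cstddef>
    #include <optional>
    #include <vector>

    // Gapped array: every slot is optional. An insert lands at the
    // model-predicted position; on collision it shifts elements right
    // only until the nearest gap, which stays cheap where gaps remain.
    bool insert_gapped(std::vector<std::optional<double>>& slots,
                       size_t predicted, double key) {
        for (size_t i = predicted; i < slots.size(); ++i) {
            if (!slots[i]) {                  // found a gap
                for (size_t j = i; j > predicted; --j)
                    slots[j] = slots[j - 1];  // shift into the gap
                slots[predicted] = key;
                return true;
            }
        }
        return false;  // no gap to the right: time to re-train / re-layout
    }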

  17. Still at the Beginning!
  • Can we provide bounds for inserts?
  • When to retrain?
  • How to retrain models on the fly?
  • …

  18. Fundamental Algorithms & Data Structures Join Tree Sorting Hash-Map Bloom-Filter Cache Policy Scheduling Range-Filter Priority Queue …..

  19. Fundamental Algorithms & Data Structures Join Tree Sorting Hash-Map Bloom-Filter Cache Policy Scheduling Range-Filter Priority Queue …..

  20. Hash Map
  [Diagram: the key is fed to a model instead of a hash function.]
  Goal: Reduce Conflicts
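  One way to realize a learned hash function, as the slide suggests: scale the model's CDF estimate to the table size, so keys spread according to their actual distribution. A sketch, with the linear CDF model as an assumption:

    #include <algorithm>
    #include <cstddef>

    struct CdfModel {
        double slope, intercept;  // assumed linear CDF model
        double cdf(double key) const {
            return std::clamp(slope * key + intercept, 0.0, 1.0);
        }
    };

    // Hash = model-predicted CDF scaled to the table size: if the model
    // tracks the real key distribution, keys land nearly uniformly and
    // conflicts drop.
    size_t learned_hash(const CdfModel& m, double key, size_t num_slots) {
        return std::min((size_t)(m.cdf(key) * num_slots), num_slots - 1);
    }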

  21. Hash Map – Results: 25%-70% reduction in hash-map conflicts.

  22. You Might Have Seen Certain Blog Posts

  23. Independent of Hash-Map Architecture

  24. Hash Map – Example Results

  Type                                                              Time    Utilization
  Stanford AVX Cuckoo, 4 byte value                                 31 ns   99%
  Stanford AVX Cuckoo, 20 byte record, standard hash                43 ns   99%
  Commercial Cuckoo, 20 byte record, standard hash                  90 ns   95%
  In-place chained hash map, 20 byte record, learned hash functions 35 ns   100%

  25. Fundamental Algorithms & Data Structures Join Tree Sorting Hash-Map Bloom-Filter Cache Policy Scheduling Range-Filter Priority Queue …..

  26. Fundamental Algorithms & Data Structures Join Tree Sorting Hash-Map Bloom-Filter Cache Policy Scheduling Range-Filter Priority Queue …..

  27. Bloom Filter – Approach 1
  [Diagram: "Is this key in my set?" goes to a model first, which answers "maybe" or "no"; keys the model rejects are re-checked by a traditional Bloom filter, so there are no false negatives.]
  36% space improvement over a Bloom filter at the same false-positive rate; a sketch follows.
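  A sketch of approach 1: a learned classifier answers first, and a small overflow Bloom filter, built from exactly the true keys the model rejects, restores the zero-false-negative guarantee. The classifier, sizes, and hashing here are placeholders, not the talk's implementation.

    #include <bitset>
    #include <functional>
    #include <string>

    // Plain Bloom filter used as the overflow structure.
    struct OverflowBloom {
        std::bitset<1 << 16> bits;  // placeholder size
        size_t h(const std::string& k, char seed) const {
            return std::hash<std::string>{}(k + seed) % bits.size();
        }
        void add(const std::string& k) {
            bits.set(h(k, '1')); bits.set(h(k, '2')); bits.set(h(k, '3'));
        }
        bool maybe(const std::string& k) const {
            return bits.test(h(k, '1')) && bits.test(h(k, '2')) &&
                   bits.test(h(k, '3'));
        }
    };

    // Approach 1: the model answers first; true keys it rejects were
    // inserted into the overflow filter at build time, so false
    // negatives are impossible and the filter can be much smaller.
    struct LearnedBloom {
        std::function<double(const std::string&)> model;  // P(key in set)
        double threshold = 0.5;
        OverflowBloom overflow;  // holds true keys with score < threshold

        bool maybe_contains(const std::string& k) const {
            if (model(k) >= threshold) return true;  // model says "maybe"
            return overflow.maybe(k);                // catch model misses
        }
    };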

  28. Bloom Filter – Approach 2 (Future Work)
  [Diagram: the key is routed through a model alongside Hash Function 1, Hash Function 2, and Hash Function 3, i.e., the model acts as one of the hash functions; a sketch follows.]
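  A compact sketch of approach 2 under one possible reading of the diagram: the model serves as one of the Bloom filter's "hash functions", placing its bit according to the key's predicted distribution instead of at random. All names and sizes are assumed.

    #include <bitset>
    #include <cstddef>
    #include <functional>
    #include <string>

    struct HybridBloom {
        std::bitset<1 << 16> bits;  // placeholder size
        std::function<double(const std::string&)> model;  // score in [0,1]

        size_t model_bit(const std::string& k) const {
            return (size_t)(model(k) * (bits.size() - 1));
        }
        size_t hash_bit(const std::string& k, char seed) const {
            return std::hash<std::string>{}(k + seed) % bits.size();
        }
        void add(const std::string& k) {
            bits.set(hash_bit(k, '1'));
            bits.set(hash_bit(k, '2'));
            bits.set(model_bit(k));  // the model's "hash"
        }
        bool maybe(const std::string& k) const {
            return bits.test(hash_bit(k, '1')) &&
                   bits.test(hash_bit(k, '2')) &&
                   bits.test(model_bit(k));
        }
    };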

  29. Fundamental Algorithms & Data Structures Join Tree Sorting Hash-Map Bloom-Filter Cache Policy Scheduling Range-Filter Priority Queue …..

  30. Future Work
  How would you design your algorithms/data structures if you have a model (CDF) for the empirical data distribution?

  31. Future Work Join Tree Sorting Hash-Map Bloom-Filter Cache Policy Scheduling Range-Filter Priority Queue …..

  32. Future work: Multi-Dim Indexes

  33. Future work: Data Cubes

  34. How Would You Design Your Algorithms/Data Structures If You Have a Model for the Empirical Data Distribution?
  Other Database Components:
  • Cardinality Estimation
  • Cost Model
  • Query Scheduling
  • Storage Layout
  • Query Optimizer
  • …

  35. Related Work
  • Succinct Data Structures → most related, but succinct data structures are usually carefully, manually tuned for each use case
  • B-Trees with interpolation search → arbitrary worst-case performance
  • Perfect Hashing → connected to our hash-map approach, but perfect hash functions usually grow in size with N
  • Mixture-of-Experts Models → used as part of our solution
  • Adaptive Data Structures / Cracking → orthogonal problem
  • Locality-Sensitive Hashing (LSH) (e.g., learned by NN) → has nothing to do with learned structures

  36. Locality-Sensitive Hashing (LSH)
  Thanks to Alkis for the analogy.

  37. Summary
  How would you design your algorithms/data structures if you have a model (CDF) for the empirical data distribution?

  38. Adapts To Your Data

  39. Big Potential For TPUs/GPUs

  40. Can Lower the Complexity Class
  [Diagram: time or space falling from O(N²) to O(N) to O(log N) to O(1) as N grows; e.g., the lookup collapses to data_array[lookup_key - 900].]

  41. Warning Not An Almighty Solution
