
Hybrid Indexes

Reducing the Storage Overhead of Main-Memory OLTP Databases with Hybrid Indexes. Huanchen Zhang, David G. Andersen, Andrew Pavlo, Michael Kaminsky, Lin Ma, Rui Shen. Parallel Data Laboratory, Carnegie Mellon University.


  1. Reducing the Storage Overhead of Main-Memory OLTP Databases with Hybrid Indexes. Huanchen Zhang, David G. Andersen, Andrew Pavlo, Michael Kaminsky, Lin Ma, Rui Shen. Parallel Data Laboratory, Carnegie Mellon University.


  5. Part I: Initial Exploration of Hybrid Indexes [SIGMOD’16]

  6. You are running out of memory. Buy more?

  9. [Figure: TPC-C on H-Store, memory limit = 5GB. Top panel: throughput vs. transactions executed. Bottom panel: memory (GB) vs. transactions executed, broken down into disk tuples, in-memory tuples, and indexes.]


  11. The better way: use memory more efficiently

  12. Indexes are LARGE

      Benchmark   % space for index   Hybrid Index
      TPC-C       58%                 34%
      Voter       55%                 41%
      Articles    34%                 18%

  13. Our Contributions [SIGMOD’16]
      - The hybrid index architecture
      - The Dual-Stage Transformation
      - Applied to 4 index structures: B+tree, Skip List, Masstree, Adaptive Radix Tree (ART)
      - Performance
      - Space: 30–70%

  14. Did we solve this problem? Stay tuned. [Figure: TPC-C on H-Store; throughput (txn/s) vs. transactions executed.]

  15. How do hybrid indexes achieve memory savings? Static

  16. Hybrid Index: a dual-stage architecture [Diagram: dynamic stage, static stage]

  17. Inserts are batched in the dynamic stage [Diagram: write, dynamic stage, merge, static stage]

  18. Reads search the stages in order

  19. A Bloom filter improves read performance [Diagram: read, dynamic stage, static stage]

  20. Memory-efficient, skew-aware [Diagram: read, write, merge, dynamic stage, static stage]
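
The dual-stage flow described in the slides above (writes batched in a dynamic stage, reads checking the stages in order, a Bloom filter to skip a stage, and a periodic merge) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `TinyBloom`, `HybridIndex`, `merge_ratio`, and `min_merge_size` are hypothetical, a Python dict stands in for the dynamic index, sorted lists stand in for the compact static stage, and the sketch assumes the Bloom filter covers the keys of the dynamic stage.

```python
from bisect import bisect_left


class TinyBloom:
    """Toy Bloom filter (hypothetical helper, not tuned for real use)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        return [hash((i, key)) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))

    def clear(self):
        self.bits = 0


class HybridIndex:
    """Dual-stage index sketch: small dynamic stage plus sorted static stage."""

    def __init__(self, merge_ratio=10, min_merge_size=4):
        self.merge_ratio = merge_ratio        # assumed size-ratio trigger
        self.min_merge_size = min_merge_size  # assumed floor for tiny indexes
        self.dynamic = {}                     # absorbs all writes
        self.bloom = TinyBloom()              # assumed to cover dynamic keys
        self.static_keys = []                 # sorted, read-only between merges
        self.static_vals = []

    def insert(self, key, value):
        self.dynamic[key] = value
        self.bloom.add(key)
        threshold = max(self.min_merge_size,
                        len(self.static_keys) // self.merge_ratio)
        if len(self.dynamic) >= threshold:
            self.merge()

    def get(self, key):
        # Stage 1: dynamic stage, skipped when the filter rules the key out.
        if self.bloom.might_contain(key) and key in self.dynamic:
            return self.dynamic[key]
        # Stage 2: binary search in the static stage.
        i = bisect_left(self.static_keys, key)
        if i < len(self.static_keys) and self.static_keys[i] == key:
            return self.static_vals[i]
        return None

    def merge(self):
        # Blocking merge: rebuild the static arrays, newer entries win.
        merged = dict(zip(self.static_keys, self.static_vals))
        merged.update(self.dynamic)
        items = sorted(merged.items())
        self.static_keys = [k for k, _ in items]
        self.static_vals = [v for _, v in items]
        self.dynamic.clear()
        self.bloom.clear()
```

The merge here rebuilds the whole static stage in one blocking step, which matches the single-threaded setting of Part I only in spirit; a real static stage would use a compact tree layout rather than Python lists.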

  21. The Dual-Stage Transformation [Diagram: dynamic stage, merge, static stage]

  23. The Dynamic-to-Static Rules: Compaction, Reduction, Compression

  26. Compaction: minimize # of memory blocks [Figure: a tree over keys 1–12 with values a–n, shown before and after compaction; the compacted tree has separator keys 3, 6, 9 over leaves 1 2 3, 4 5 6, 7 8 9, 10 11 12.]

  28. Reduction: minimize structural overhead [Figure: the compacted tree (separator keys 3, 6, 9; keys 1–12; values a–n), shown in the last build next to the original tree.]
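
As a rough illustration of the Compaction and Reduction rules, the sketch below repacks half-full leaves into full ones and then flattens them into contiguous arrays, so that a lookup computes offsets instead of following pointers. The input layout, the helper names (`compact`, `reduce_to_arrays`, `lookup`), and the fanout of 3 are assumptions chosen to resemble the figure (separator keys 3, 6, 9); duplicate keys and the Compression rule are not handled.

```python
from bisect import bisect_left


def compact(leaves, fanout):
    """Compaction: repack partially filled leaves into full leaves."""
    entries = [entry for leaf in leaves for entry in leaf]
    return [entries[i:i + fanout] for i in range(0, len(entries), fanout)]


def reduce_to_arrays(packed_leaves):
    """Reduction: drop per-node pointers by flattening into arrays.

    Returns (separators, keys, vals). Each separator is the largest key
    of a leaf, for every leaf except the last one.
    """
    keys = [k for leaf in packed_leaves for k, _ in leaf]
    vals = [v for leaf in packed_leaves for _, v in leaf]
    separators = [leaf[-1][0] for leaf in packed_leaves[:-1]]
    return separators, keys, vals


def lookup(separators, keys, vals, fanout, key):
    """Find the leaf via the separators, then search inside that leaf."""
    leaf = bisect_left(separators, key)   # index of the leaf that may hold key
    lo = leaf * fanout                    # leaf offset is computed, not stored
    hi = min(lo + fanout, len(keys))
    i = bisect_left(keys, key, lo, hi)
    if i < hi and keys[i] == key:
        return vals[i]
    return None


# Hypothetical dynamic-stage leaves, each only two-thirds full at fanout 3.
dynamic_leaves = [
    [(1, "a"), (2, "b")], [(3, "c"), (4, "d")], [(5, "e"), (6, "f")],
    [(7, "g"), (8, "h")], [(9, "i"), (10, "j")], [(11, "k"), (12, "l")],
]
packed = compact(dynamic_leaves, fanout=3)
separators, keys, vals = reduce_to_arrays(packed)
```

Six leaves shrink to four, and the only structural metadata left is the short separator array.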

  31. The merge routine is a blocking process [Diagram: dynamic stage, merge, static stage; annotated "? Size %"]

  33. Did we solve this problem? [Figure: TPC-C on H-Store with B+tree; throughput (txn/s) vs. transactions executed.]

  34. Yes, we improved the DBMS’s capacity! [Figure: TPC-C on H-Store, B+tree vs. Hybrid; throughput (txn/s) vs. transactions executed.]

  35. [Figure: TPC-C on H-Store, B+tree vs. Hybrid. Top panels: throughput (txn/s) vs. transactions executed. Bottom panels: memory (GB) vs. transactions executed, broken down into disk tuples, in-memory tuples, and indexes.] Take Away: memory saved by indexes; larger working set in memory; higher throughput.

  41. Part I Recap
      - The hybrid index architecture: GENERAL
      - The Dual-Stage Transformation: PRACTICAL
      - Applied to 4 index structures (B+tree, Skip List, Masstree, Adaptive Radix Tree (ART)): USEFUL

  42. Part II: Concurrent hybrid indexes with non-blocking merge

  43. Building a Concurrent Hybrid Index? [Diagram: write, dynamic stage, merge, static stage]

  45. Use concurrent data structures for the dynamic stage

  46. The static stage is perfectly concurrent by default

  47. Challenge: an efficient non-blocking merge algorithm

  48. Merge Algorithm Requirements
      - Non-blocking: all existing items are accessible during merge; new items can still enter
      - Efficient: fast; bounded temporary memory use

  49. Naïve Solution 1: Coarse-grained Locking [Diagram: write, dynamic stage, merge, static stage]

  51. The intermediate stage unblocks write traffic [Diagram: write, dynamic stage, freeze, intermediate stage, merge, static stage]
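
One plausible reading of the freeze step, sketched below under stated assumptions: freezing turns the current dynamic stage into a read-only intermediate stage and installs a fresh, empty dynamic stage, so writes continue while the intermediate stage is merged. The class `StagedIndex` and its methods are hypothetical, plain dicts stand in for all three stages, and the sketch is single-threaded (no synchronization is shown).

```python
class StagedIndex:
    """Three-stage routing sketch: dynamic -> intermediate -> static."""

    def __init__(self):
        self.dynamic = {}         # absorbs all new writes
        self.intermediate = None  # frozen snapshot being merged (read-only)
        self.static = {}          # stand-in for the compact static stage

    def insert(self, key, value):
        # Writes never touch the intermediate or static stage.
        self.dynamic[key] = value

    def freeze(self):
        """Start a merge: freeze the dynamic stage, open a new empty one."""
        if self.intermediate is not None:
            raise RuntimeError("a merge is already in progress")
        self.intermediate, self.dynamic = self.dynamic, {}

    def finish_merge(self):
        """Publish the merged static stage and drop the intermediate stage."""
        if self.intermediate is None:
            raise RuntimeError("no merge in progress")
        merged = dict(self.static)
        merged.update(self.intermediate)  # newer entries win
        self.static = merged
        self.intermediate = None

    def get(self, key):
        # Search newest to oldest so the most recent value is returned.
        for stage in (self.dynamic, self.intermediate, self.static):
            if stage is not None and key in stage:
                return stage[key]
        return None
```

Between `freeze()` and `finish_merge()`, inserts land in the new dynamic stage and reads still see every item, which is the write-unblocking property the slide title names.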

  54. How do we unblock reads during merge? [Diagram: intermediate stage, merge, static stage]

  55. Naïve Solution 2: Full Copy-on-write

  56. Key Observation: merged-in items in the static stage will NOT be accessed until the intermediate stage is deleted. Merge incrementally!

  57. Our Solution: Incremental Copy-on-write with Rapid GC
      - When can we safely reclaim the garbage? When no thread still holds a reference to it!
      - Thread-local counters $C_1, C_2, C_3, \ldots, C_n$, with $C_{\min}$ and $C_{\max}$
      - Counter increment: $++C_i = \mathrm{MAX}(C_i, C_{\max}) + 1$
      - GC condition: $C_{\min} >$ garbage tag
      [Diagram labels: parent, new, old]
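
The counter rule and GC condition on this slide can be simulated in a few lines. The sketch below fills in details the slide does not state, so treat them as assumptions: a thread bumps its counter when it starts an operation, retired garbage is tagged with the current maximum counter, and `RapidGC`, `enter`, `retire`, and `collect` are hypothetical names. It is a single-threaded simulation with no atomics or memory ordering.

```python
class RapidGC:
    """Single-threaded simulation of counter-based garbage reclamation."""

    def __init__(self, num_threads):
        self.counters = [0] * num_threads  # C_1 .. C_n, one per thread
        self.garbage = []                  # list of (tag, object) pairs

    def enter(self, tid):
        """Assumed hook at the start of an operation by thread `tid`.

        Mirrors the slide's rule: ++C_i = MAX(C_i, C_max) + 1.
        """
        c_max = max(self.counters)
        self.counters[tid] = max(self.counters[tid], c_max) + 1

    def retire(self, obj):
        """Unlink `obj` and tag it with the current C_max (an assumption)."""
        self.garbage.append((max(self.counters), obj))

    def collect(self):
        """Reclaim every object whose tag satisfies C_min > garbage tag."""
        c_min = min(self.counters)
        reclaimed = [obj for tag, obj in self.garbage if c_min > tag]
        self.garbage = [(tag, obj) for tag, obj in self.garbage
                        if not c_min > tag]
        return reclaimed
```

Under these assumptions, once every thread's counter exceeds the tag, every thread has started a new operation after the object was unlinked, so none can still hold a reference to it. A thread that never starts a new operation would hold back reclamation in this sketch.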

  63. A Quick Recap of the Merge Algorithm
      - The intermediate stage separates writes from the merge process
      - The incremental merge algorithm with rapid GC is non-blocking and space-efficient

  64. What we are building now [Diagram labels: Non-blocking Merge; Compact Radix Tree; and, in successive builds, Bwtree, Skiplist, Masstree]

  70. Part III: Super-compact static stage

  71. Go “crazy” on space-efficiency: Succinct Data Structures
      - Space is Z + o(Z), where Z is the information-theoretic lower bound
      - Still allow for efficient query operations
      Example bit sequence: 100011010000101…
      - rank_1(x) = # of 1’s up to position x
      - select_1(x) = position of the x-th occurrence of 1
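
To make the two primitives concrete, here is a small sketch of rank and select over the example bit string. It assumes 0-indexed positions, an inclusive rank, and a 1-indexed occurrence count for select; the slide does not pin these conventions down. The class `BitVector` is hypothetical, and the prefix-count array it stores is deliberately naive: it is not succinct, since a real implementation would keep only o(n) bits of auxiliary data.

```python
from bisect import bisect_left
from itertools import accumulate


class BitVector:
    """Naive rank/select over a bit string (illustrative, not succinct)."""

    def __init__(self, bits: str):
        self.bits = bits
        # prefix[i] = number of 1s in bits[0..i], inclusive
        self.prefix = list(accumulate(int(b) for b in bits))

    def rank1(self, x: int) -> int:
        """Number of 1s up to and including position x (0-indexed)."""
        return self.prefix[x]

    def select1(self, k: int) -> int:
        """Position (0-indexed) of the k-th 1, counting from k = 1."""
        total = self.prefix[-1] if self.prefix else 0
        if k < 1 or k > total:
            raise ValueError("k is out of range")
        # First position whose prefix count reaches k.
        return bisect_left(self.prefix, k)


bv = BitVector("100011010000101")
print(bv.rank1(7))    # 4: the 1s at positions 0, 4, 5, 7
print(bv.select1(4))  # 7: the fourth 1 sits at position 7
```

With these conventions the two operations invert each other on set bits: `rank1(select1(k)) == k` for every valid `k`.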

  72. Encoding Radix Tree [Figure: a radix tree whose nodes are shown with label strings (such as $ab, $lnr$a, $iio$, i$n, $$) alongside bit sequences.]

  73. Memory Savings with the New Encoding [Figure: bar chart of memory (MB) for ART vs. Our Encoding, on 50M email keys with average length = 20 bytes; annotated with 84%.]

  74. The Takeaway Message: Hybrid indexes can save precious memory resources with minimal performance penalty.

  75. Toll-Free Hotline: 1-844-88-CMUDB

  76. Back-up Slides

  77. Latency (ms)

      Percentile   B+tree   Hybrid
      50%          10       10
      99%          50       52
      MAX          115      611
