
SLIK: Scalable Low-Latency Indexes for a Key-Value Store



  1. SLIK: Scalable Low-Latency Indexes for a Key-Value Store Ankita Kejriwal (With Arjun Gopalan, Ashish Gupta, Greg Hill, Zhihao Jia, Stephen Yang and John Ousterhout) PlatformLab

  2. Hypothesis A key value store can support highly consistent secondary indexes while operating at low latency and large scale. SLIK Slide 2

  3. Introduction
     ● SLIK: Scalable Low-latency Indexes for a Key-value Store
       § Enables multiple secondary keys for each object
       § Allows lookups and range queries on these keys
     ● Key design features:
       § Scalability using independent partitioning
       § Consistency with minimal performance overheads using an ordered write approach
     ● Performance
       § 11-13 µs indexed reads
       § 29-37 µs writes/overwrites of objects with one indexed attribute
       § Linear throughput increase with increasing number of partitions
     ● Feedback welcome!

  4. Talk Outline ● Motivation ● Data Model and API ● Design ● Performance ● Related Work ● Summary

  5. Talk Outline ● Motivation ● Data Model and API ● Design ● Performance ● Related Work ● Summary

  6. Motivation [Figure: diagram positioning SLIK relative to traditional RDBMSs and NoSQL systems. Labels on the diagram: + consistency, + data models, + scalability, + low latency, - data models, - consistency. Systems shown: HBase, Espresso, Megastore, Spanner, H-Store, HyperDex, MongoDB, MySQL, PNUTS, Tao, SLIK.]

  7. Talk Outline ● Motivation ● Data Model and API ● Design ● Performance ● Related Work ● Summary

  8. Object Format [Diagram: tables contain objects; each object consists of a key and a value blob.]

  9. Object Format [Diagram: tables contain objects; each object consists of Num Keys, Key[0], Key[1], Key[2], …, and a value blob. Key[0] is the primary key.]

  10. Object Format and API
     createIndex(tableId, indexId, indexType)
     dropIndex(tableId, indexId)
     write(tableId, keys, value)
     IndexLookup(tableId, indexId, keyRange) ⇒ objects in a sorted order via streaming interface
     [Diagram: tables contain objects; each object consists of Num Keys, Key[0], Key[1], Key[2], …, and a value blob. Key[0] is the primary key.]
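
The API on this slide can be illustrated with a small single-process model. This is only a sketch under stated assumptions, not RAMCloud's or SLIK's implementation: the class name `Table`, the snake_case method names, the choice that `index_id` i indexes Key[i], and passing the key range as two bounds are all hypothetical, and the `tableId` and `indexType` parameters are omitted.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy in-memory table with multi-key objects (illustrative only)."""
    objects: dict = field(default_factory=dict)  # primary key -> (keys, value)
    indexes: dict = field(default_factory=dict)  # index_id -> sorted [(secondary key, primary key)]

    def create_index(self, index_id):
        # Assumption: index_id i indexes Key[i]; existing objects are indexed too.
        self.indexes[index_id] = sorted(
            (keys[index_id], pk)
            for pk, (keys, _) in self.objects.items()
            if index_id < len(keys))

    def drop_index(self, index_id):
        self.indexes.pop(index_id, None)

    def write(self, keys, value):
        pk = keys[0]  # Key[0] is the primary key
        old = self.objects.get(pk)
        for index_id, entries in self.indexes.items():
            if old is not None and index_id < len(old[0]):
                entries.remove((old[0][index_id], pk))  # drop the stale entry
            if index_id < len(keys):
                bisect.insort(entries, (keys[index_id], pk))
        self.objects[pk] = (keys, value)

    def index_lookup(self, index_id, first_key, last_key):
        # Yield objects one at a time in secondary-key order (inclusive range).
        entries = self.indexes[index_id]
        start = bisect.bisect_left(entries, (first_key,))
        for sk, pk in entries[start:]:
            if sk > last_key:
                break
            yield self.objects[pk]


table = Table()
table.create_index(1)
table.write(["foo", "Bob"], b"v1")
table.write(["bar", "Alice"], b"v2")
table.write(["baz", "Sam"], b"v3")
names_before = [keys[1] for keys, _ in table.index_lookup(1, "A", "C")]

table.write(["foo", "Zed"], b"v4")  # overwrite changes the secondary key
names_after = [keys[1] for keys, _ in table.index_lookup(1, "A", "C")]
```

The generator in `index_lookup` stands in for the streaming interface: callers receive objects in sorted secondary-key order without the whole result being materialized first.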

  11. Talk Outline ● Motivation ● Data Model and API ● Design ● Performance ● Related Work ● Summary

  12. Design Goals • Scalable distributed system • Consistency expected from a centralized system (with minimal performance overheads)

  13. Design Goals • Scalable distributed system • Consistency expected from a centralized system (with minimal performance overheads)

  14. Index Partitioning: Colocation Approach
     • Colocate index entries and objects
     • One of the keys used to partition the table’s objects and indexes
     • No particular association between index partitions and index key ranges
     [Figure: three indexlets, each colocated with a tablet. Indexlets hold the keys {1, 2, 13, 20, 23, 55}, {3, 9, 11, 15, 24, 45, 60}, and {5, 14, 31, 89}; each tablet holds the objects with the same keys, unsorted.]

  15. Index Partitioning: Colocation Approach
     • Colocate index entries and objects
     • One of the keys used to partition the table’s objects and indexes
     • No particular association between index partitions and index key ranges
     Example query: Objects with “age” between 11 and 14
     [Figure: three indexlets, each colocated with a tablet. Indexlets hold the keys {1, 2, 13, 20, 23, 55}, {3, 9, 11, 15, 24, 45, 60}, and {5, 14, 31, 89}; each tablet holds the objects with the same keys, unsorted.]
     Not Scalable!

  16. Index Partitioning: Independent Partitioning
     • Partition each index and table independently
     • Partition each index according to sort order for that index
     [Figure: two indexlets that together hold the sorted key sequence 1, 2, 3, 5, 9, 11, 13, 14, 15, 20, 23, 24, 31, 45, 55, 60, 89; three tablets hold the objects with keys {1, 2, 13, 20, 23, 55}, {3, 9, 11, 15, 24, 45, 60}, and {5, 14, 31, 89}, unsorted.]

  17. Index Partitioning: Independent Partitioning
     • Partition each index and table independently
     • Partition each index according to sort order for that index
     Example query: Objects with “age” between 11 and 14
     [Figure: two indexlets that together hold the sorted key sequence 1, 2, 3, 5, 9, 11, 13, 14, 15, 20, 23, 24, 31, 45, 55, 60, 89; three tablets hold the objects with keys {1, 2, 13, 20, 23, 55}, {3, 9, 11, 15, 24, 45, 60}, and {5, 14, 31, 89}, unsorted.]
     Scalable!
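
The difference between the two approaches for the example query can be sketched by counting how many indexlets a range lookup has to contact. The colocated indexlet contents are taken from the figure; the split point between the two independent indexlets is an assumption (the slide shows only the sorted sequence), and the function names are hypothetical. Fetching the matching objects from the tablets afterwards is not modeled.

```python
# Indexlet contents from the colocation figure (one indexlet per tablet).
COLOCATED = [
    [1, 2, 13, 20, 23, 55],
    [3, 9, 11, 15, 24, 45, 60],
    [5, 14, 31, 89],
]

# Same keys, partitioned by sort order into two indexlets.
# Assumption: the split point is chosen here for illustration only.
INDEPENDENT = [
    [1, 2, 3, 5, 9, 11, 13, 14],
    [15, 20, 23, 24, 31, 45, 55, 60, 89],
]


def colocated_lookup(indexlets, lo, hi):
    """No association with key ranges: every indexlet must be queried."""
    contacted = len(indexlets)
    hits = sorted(k for part in indexlets for k in part if lo <= k <= hi)
    return hits, contacted


def independent_lookup(indexlets, lo, hi):
    """Each indexlet covers a contiguous key range: skip those that cannot match."""
    hits, contacted = [], 0
    for part in indexlets:
        if part[0] <= hi and part[-1] >= lo:  # key ranges overlap
            contacted += 1
            hits.extend(k for k in part if lo <= k <= hi)
    return hits, contacted


colocated_result = colocated_lookup(COLOCATED, 11, 14)
independent_result = independent_lookup(INDEPENDENT, 11, 14)
```

Both calls find the keys 11, 13, and 14, but the colocated lookup contacts all three indexlets while the independent lookup contacts one. With colocation the number of contacted servers grows with the cluster size regardless of how small the result is.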

  18. Index Partitioning [Figure: lookup latency (µs) vs. number of servers (0 to 80) for four series: Colocation size 1, Independent size 1, Colocation size 10, Independent size 10. Data labels on the plot range from 8.3 µs to 89.7 µs.] Latency for IndexLookup: single table with one index with varying num indexlets. Each object: pk 30 bytes, sk 30 bytes, val 100 bytes

  19. Index Partitioning [Figure: throughput (10^3 lookups/sec) vs. number of indexlets (0 to 10) for two series: Independent Partitioning and Colocation. Data labels on the plot range from 357 to 5069.] Throughput for IndexLookup: single table with one index with varying num indexlets. Queried via multiple clients. Each object: pk 30 bytes, sk 30 bytes, val 100 bytes

  20. Design Goals
     • Scalable distributed system:
       • Use independent partitioning
       • But: indexed object writes: distributed operations
     • Consistency expected from a centralized system (with minimal performance overheads):
       • If an object contains a given secondary key, then an index lookup with that key will return the object
       • If an object is returned by index lookup, then this object contains a secondary key for that index within the specified range

  21. Consistency
     • Consistency properties:
       • If an object contains a given secondary key, then an index lookup with that key will return the object
       • If an object is returned by index lookup, then this object contains a secondary key for that index within the specified range
     • Solution:
       • Longer index lifespan (via ordered writes)
       • Object data is ground truth and index entries serve as hints
     [Timeline: write object, modify object, remove object, each with its own commit point. Object Foo (pk) first has secondary key Bob, then Sam. Index entries Bob -> Foo and Sam -> Foo are shown alongside the object versions.]
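
The ordered-write idea can be sketched as follows, using the Foo/Bob/Sam example from the timeline. This is a simplified single-process model under assumptions, not SLIK's code: the class `Store` and its methods are hypothetical, each object has exactly one secondary key, and the write order (add the new index entry, write the object, then remove the old entry) is one reading of "longer index lifespan".

```python
class Store:
    """Toy model: objects are ground truth, index entries are hints."""

    def __init__(self):
        self.objects = {}   # primary key -> secondary key
        self.index = set()  # (secondary key, primary key) entries

    def write(self, pk, sk):
        old_sk = self.objects.get(pk)
        self.index.add((sk, pk))              # 1. add the new index entry first
        self.objects[pk] = sk                 # 2. commit point: write the object
        if old_sk is not None and old_sk != sk:
            self.index.discard((old_sk, pk))  # 3. remove the stale entry last

    def remove(self, pk):
        old_sk = self.objects.pop(pk, None)   # commit point: remove the object
        if old_sk is not None:
            self.index.discard((old_sk, pk))  # index entry outlives the object

    def lookup(self, sk):
        # Filter hints against the objects: keep only current matches.
        return sorted(pk for (k, pk) in self.index
                      if k == sk and self.objects.get(pk) == sk)


store = Store()
store.write("Foo", "Bob")

store.index.add(("Sam", "Foo"))  # simulate a modify interrupted after step 1
mid_bob = store.lookup("Bob")
mid_sam = store.lookup("Sam")

store.write("Foo", "Sam")        # the modify runs to completion
after_bob = store.lookup("Bob")
after_sam = store.lookup("Sam")
```

Before the commit point the extra entry Sam -> Foo is filtered out because the object still says Bob; after it, the object says Sam and the leftover Bob entry would be filtered the same way until it is removed. At every step both consistency properties from the slide hold in this model.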

  22. Talk Outline ● Motivation ● Data Model and API ● Design ● Performance ● Related Work ● Summary

  23. Performance Implemented SLIK in RAMCloud ● Distributed in-memory key-value storage system ● Designed for large-scale applications ● Optimized to operate at lowest possible latency

  24. Performance Questions: ● Does SLIK meet the low latency goal? ● Does SLIK meet the scalability goal? ● How does the performance of indexing with SLIK compare to other state-of-the-art systems?

  25. Performance Systems we compared:
     ● HyperDex:
       § Spaces containing objects
       § Objects have primary key and multiple attributes
       § Data (and indexes) partitioned using hyperspace hashing
       § Each index contains all object data
     ● H-Store:
       § Main memory database
       § SQL+ACID
       § Data (and indexes) partitioned based on specified attribute
       § Many parameters for tuning
     ● Got assistance from developers to tune for each test
     ● Examples: txn_incoming_delay, partitioning column

  26. Lookup Latency [Figure (a): lookup latency (µs, log scale) vs. size of index (# objects, 10^0 to 10^6) for five series: H-Store SK Partitioned, H-Store PK Partitioned, HyperDex, SLIK TCP, SLIK. Data labels on the plot range from 10.2 µs to 1024.35 µs.] Single table with one index having a single partition. Each object: pk 30 bytes, sk 30 bytes, val 100 bytes

  27. Overwrite Latency [Figure (c): overwrite latency (µs, log scale) vs. size of index (# objects, 10^0 to 10^6) for five series: H-Store SK Partitioned, H-Store PK Partitioned, HyperDex, SLIK TCP, SLIK. Data labels on the plot range from 31.4 µs to 1048.86 µs.] Single table with one index having a single partition. Each object: pk 30 bytes, sk 30 bytes, val 100 bytes

  28. Multiple Secondary Indexes [Figure: latency (µs, log scale) vs. number of indexes (0 to 10) for six series: H-Store via Sk1, H-Store via Pk, H-Store via SkX, HyperDex, SLIK TCP, SLIK. Data labels on the plot range from 33.0 µs to 1620.92 µs.] Single table with varying num indexes, each having a single partition. Each object: pk 30 bytes, sk 30 bytes, val 100 bytes
