small index large table

Small Index Large Table Author: Hyeontaek Lim, Bin Fan, D. G. - PowerPoint PPT Presentation



  1. Small Index Large Table Authors: Hyeontaek Lim, Bin Fan, D. G. Andersen, M. Kaminsky Presenter: Xiaoyu Zhang

  2. Motivation ● To achieve low latency, key-value storage systems make aggressive use of DRAM-based indexes to avoid the bottleneck caused by disk operations. ● DRAM is 8X more expensive and uses 25X more power per bit than flash. ● DRAM capacity is growing more slowly than disk or flash.

  3. SILT KEY-VALUE STORAGE SYSTEM ● SILT Key-Value Storage System ● Basic Storage Design: LogStore, HashStore, SortedStore ● Extending SILT Functionality

  4. Question (1) “Figure 1: The memory overhead and lookup performance of SILT and the recent key-value stores. For both axes, smaller is better.” Explain the positions of FAWN-DS, SkimpyStash, BufferHash, and SILT on the graph.

  5. SILT KEY-VALUE STORAGE SYSTEM ● LogStore handles inserts (PUTs) and deletes. ● HashStore is an on-flash hash table that does not require an in-memory index to locate entries. ● SortedStore holds more than 80% of the total entries.

  6. SILT KEY-VALUE STORAGE SYSTEM ● Keys are first inserted into the LogStore, where an in-memory hash table maps each key to its offset in the on-flash log. ● Once full, the LogStore is converted into a more memory-efficient HashStore. ● Finally, SILT merges several HashStores in bulk with the older version of the SortedStore.
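The life cycle on this slide can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not SILT's implementation: plain dicts stand in for the LogStore and HashStores, a sorted list stands in for the SortedStore, and the names `SiltSketch`, `LOGSTORE_CAPACITY`, `MERGE_THRESHOLD`, and the tombstone `DELETED` are hypothetical. It also assumes that lookups consult the newest store first.

```python
import bisect

LOGSTORE_CAPACITY = 4   # hypothetical: entries before a LogStore is frozen
MERGE_THRESHOLD = 2     # hypothetical: HashStores accumulated before a bulk merge
DELETED = object()      # assumed tombstone marking a deleted key


class SiltSketch:
    def __init__(self):
        self.log_store = {}      # writable store; receives every put and delete
        self.hash_stores = []    # frozen, read-only stores (oldest first)
        self.sorted_store = []   # (key, value) pairs sorted by key

    def put(self, key, value):
        self.log_store[key] = value
        if len(self.log_store) >= LOGSTORE_CAPACITY:
            self._convert()

    def delete(self, key):
        self.put(key, DELETED)

    def _convert(self):
        # Freeze the full LogStore as a HashStore and start a fresh LogStore.
        self.hash_stores.append(self.log_store)
        self.log_store = {}
        if len(self.hash_stores) >= MERGE_THRESHOLD:
            self._merge()

    def _merge(self):
        # Bulk-merge all HashStores into the SortedStore; newer entries win.
        merged = dict(self.sorted_store)
        for store in self.hash_stores:      # oldest first
            merged.update(store)
        self.sorted_store = sorted(
            (k, v) for k, v in merged.items() if v is not DELETED)
        self.hash_stores = []

    def get(self, key):
        # Newest data first: LogStore, then HashStores, then SortedStore.
        for store in [self.log_store] + self.hash_stores[::-1]:
            if key in store:
                value = store[key]
                return None if value is DELETED else value
        i = bisect.bisect_left(self.sorted_store, (key,))
        if i < len(self.sorted_store) and self.sorted_store[i][0] == key:
            return self.sorted_store[i][1]
        return None
```

With these toy thresholds, ten puts leave eight entries in the sorted list and two in the writable store, which mirrors how most entries end up in the SortedStore.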

  7. Questions (2) Two design goals of SILT are low read amplification and low write amplification. Use any KV store we have studied as an example to show how these amplifications are produced.

  8. Questions (3) Describe SILT’s structure using Figure 2 (Architecture of SILT). Compared with LevelDB, SILT has only three levels. What is the concern with a multi-level KV store when it has too few levels?

  9. SILT KEY-VALUE STORAGE SYSTEM LogStore ● Partial-key cuckoo hashing is used for the in-memory index; the tag stored in each slot is the key's alternative bucket index. ● To keep the index compact, it stores a short tag instead of the actual key, and the tag filters out unnecessary flash reads. ● Moving a key to its alternative bucket to displace another key would otherwise be very costly, since the full key would have to be read from flash. ● 4-way set-associative hash table
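The bullets above can be illustrated with a small partial-key cuckoo table. This is a minimal sketch under assumptions, not the authors' code: the class `LogStoreSketch`, the SHA-1 based `candidate_buckets`, the table size, and the displacement bound are all hypothetical, and a Python list stands in for the on-flash log. Each slot holds `(tag, offset)`, where the tag is the entry's other candidate bucket, so a displaced entry can be moved without reading its key back from the log.

```python
import hashlib
import random

NUM_BUCKETS = 16         # illustrative; a real table is far larger
WAYS = 4                 # 4-way set-associative buckets, as on the slide
MAX_DISPLACEMENTS = 32   # illustrative bound before the table counts as full


def candidate_buckets(key: bytes):
    """Two candidate bucket indices for a key (hypothetical hash choice)."""
    d = hashlib.sha1(key).digest()
    return (int.from_bytes(d[:4], "big") % NUM_BUCKETS,
            int.from_bytes(d[4:8], "big") % NUM_BUCKETS)


class LogStoreSketch:
    def __init__(self):
        self.log = []                                   # stands in for the flash log
        self.table = [[] for _ in range(NUM_BUCKETS)]   # bucket -> [(tag, offset)]
        self.homeless = None    # offset of the entry left over when the table fills

    def _find(self, key):
        h1, h2 = candidate_buckets(key)
        for bucket, tag in ((h1, h2), (h2, h1)):
            for slot, (t, off) in enumerate(self.table[bucket]):
                # The log ("flash") is read only when the tag matches.
                if t == tag and self.log[off][0] == key:
                    return bucket, slot
        return None

    def put(self, key, value):
        """Append to the log and index the entry; False means the table is full."""
        if self.homeless is not None:
            raise RuntimeError("store is full; convert it before inserting more")
        off = len(self.log)
        self.log.append((key, value))
        hit = self._find(key)
        if hit:                                  # overwrite: repoint the slot
            bucket, slot = hit
            self.table[bucket][slot] = (self.table[bucket][slot][0], off)
            return True
        bucket, tag = candidate_buckets(key)
        for _ in range(MAX_DISPLACEMENTS):
            for b, t in ((bucket, tag), (tag, bucket)):
                if len(self.table[b]) < WAYS:
                    self.table[b].append((t, off))
                    return True
            # Both candidates are full: evict a victim. Its tag names its
            # alternative bucket, so no log read is needed to move it.
            slot = random.randrange(WAYS)
            victim_tag, victim_off = self.table[bucket][slot]
            self.table[bucket][slot] = (tag, off)
            bucket, tag, off = victim_tag, bucket, victim_off
        self.homeless = off
        return False

    def get(self, key):
        hit = self._find(key)
        if hit:
            bucket, slot = hit
            return self.log[self.table[bucket][slot][1]][1]
        if self.homeless is not None and self.log[self.homeless][0] == key:
            return self.log[self.homeless][1]
        return None
```

A `put` that returns `False` corresponds to the point where the insertion gives up after the maximum number of displacements and the store would be frozen for conversion.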

  10. From Bin Fan, 2013

  11. Questions (4) Use Figure 3 (Design of LogStore: an in-memory cuckoo hash table (index and filter)) to describe how a PUT request and a GET request are served in a LogStore. In particular, explain how the tag is used in a LogStore.

  12. SILT KEY-VALUE STORAGE SYSTEM HashStore (memory efficient) ● Merging each LogStore directly into a much larger SortedStore causes high write amplification, while keeping many LogStores around incurs memory overhead. ● Solution: convert each full LogStore into a HashStore, then perform a bulk merge of several HashStores. ● Advantage: eliminates the index and reorders the on-flash (key, value) pairs to save memory. ● Hash filter: an efficient in-memory filter to reject queries.

  13. Questions (5) Use Figure 4 to explain how a LogStore is converted into a HashStore.

  14. Question (6) “Once a LogStore fills up (e.g., the insertion algorithm terminates without finding any vacant slot after a maximum number of displacements in the hash table), SILT freezes the LogStore and converts it into a more memory-efficient data structure.” Compared to LogStore, what’s the advantage of HashStore? Why doesn’t SILT create HashStore at the beginning (without first creating LogStore)?

  15. SILT KEY-VALUE STORAGE SYSTEM SortedStore (kv entries sorted by key on flash) ● HashStore -> sorted -> SortedStore ● Trie structure: each leaf node represents one key, and the shortest unique prefix of each key serves as the index. ● The trie guarantees a correct index lookup for stored keys, but says nothing about the presence of a key.

  16. SILT KEY-VALUE STORAGE SYSTEM a) The pair of numbers in each internal node denotes the number of leaf nodes in its left and right subtries. b) A recursive representation of the trie. c) Its entropy-coded representation used by SortedStore.
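The recursive representation in (b) can be sketched as follows. This is a minimal illustration, assuming keys are distinct, fixed-length bit strings; the function names `build_repr`, `skip`, and `lookup` and the sample keys are hypothetical, and the entropy coding of (c) is left out. Each internal node contributes the number of leaves in its left subtrie, followed by the representations of its left and right subtries; leaves and empty subtries contribute nothing.

```python
def build_repr(keys, depth=0):
    """Recursive trie representation: [leaves on the left] + left + right."""
    if len(keys) <= 1:
        return []
    left = [k for k in keys if k[depth] == "0"]
    right = [k for k in keys if k[depth] == "1"]
    return [len(left)] + build_repr(left, depth + 1) + build_repr(right, depth + 1)


def skip(rep, pos, n):
    """Return the position just past the subtrie with n leaves that starts at pos."""
    if n <= 1:
        return pos
    n_left = rep[pos]
    pos = skip(rep, pos + 1, n_left)
    return skip(rep, pos, n - n_left)


def lookup(key, rep, n):
    """Index of key among the n sorted keys, assuming the key is stored.

    For a key that is not stored, some index is still returned, so the
    caller has to compare the entry found at that index with the key.
    """
    pos, index, depth = 0, 0, 0
    while n > 1:
        n_left = rep[pos]
        pos += 1
        if key[depth] == "0":
            n = n_left                      # descend into the left subtrie
        else:
            pos = skip(rep, pos, n_left)    # jump over the left subtrie
            index += n_left                 # every left leaf sorts before the key
            n -= n_left
        depth += 1
    return index


keys = ["00010", "00101", "01110", "10010", "10100", "10111", "11010", "11101"]
rep = build_repr(keys)
print(rep)                            # [3, 2, 1, 3, 1, 1, 1]
print(lookup("10100", rep, len(keys)))  # 4
```

Only the counts are stored, so the index carries no key bits at all; that is why a lookup finds the right position for a stored key but cannot tell whether an arbitrary key is present.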

  17. Questions (7) “When fixed-length key-value entries are sorted by key on flash, a trie for the shortest unique prefixes of the keys serves as an index for these sorted data.” While a SortedStore is fully sorted, could you comment on the cost of merging a HashStore with a SortedStore? Compare this cost to the major compaction cost for LevelDB?

  18. SILT KEY-VALUE STORAGE SYSTEM ● Tradeoffs: we need to balance write amplification, read amplification, and memory amplification. For example, using larger tags reduces read amplification by reducing the false positive rate or the number of HashStores. However, the HashStores then consume more DRAM due to the larger tags.
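The tag-size tradeoff can be made concrete with a rough estimate. This is a back-of-the-envelope sketch, not a formula taken from the slides: it assumes tags behave like uniformly random values, that a lookup compares the tag against every slot of two 4-way buckets, and that each false positive costs one extra flash read. The function names and the `occupancy` parameter are hypothetical.

```python
def false_positive_bound(tag_bits, ways=4, candidate_buckets=2):
    """Union bound on a spurious tag match in one filter lookup."""
    return min(1.0, ways * candidate_buckets / 2 ** tag_bits)


def expected_flash_reads(tag_bits, num_filter_stores):
    """One read for the store that holds the key, plus expected false positives."""
    return 1 + num_filter_stores * false_positive_bound(tag_bits)


def filter_bits_per_entry(tag_bits, occupancy):
    """DRAM bits spent on tags per stored entry at a given table occupancy."""
    return tag_bits / occupancy


for bits in (8, 12, 15):
    print(bits,
          false_positive_bound(bits),
          expected_flash_reads(bits, num_filter_stores=4),
          filter_bits_per_entry(bits, occupancy=0.9))
```

Under these assumptions, each extra tag bit halves the false positive bound while adding a fixed amount of DRAM per entry, which is the balance the slide describes.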

  19. SILT KEY-VALUE STORAGE SYSTEM ● The bottom right graph shows the memory consumed by SILT. The bottom left graph omits the intermediate HashStore and thus needs twice as much memory as SILT. The top right graph instead omits the SortedStore and consumes four times as much memory. The top left graph uses only the basic LogStore and needs 10X as much memory as SILT.

  20. SILT KEY-VALUE STORAGE SYSTEM ● LogStore construction: entry-by-entry insertion, 90% of write bandwidth ● LogStore-to-HashStore conversion involves bulk data reads and writes ● Merging into the SortedStore involves an external sort over all entries of the HashStores

  21. Conclusion ● SILT combines multiple stores to balance the use of memory, storage, and computation, forming a memory-efficient and high-performance storage system. ● The integrated system uses partial-key cuckoo hashing and entropy-coded tries to drastically reduce the amount of memory needed while providing high write speed and high read throughput. ● On average, SILT uses only 0.7 bytes of memory per entry it stores and makes only 1.01 flash reads to serve a lookup, which completes within 400 microseconds.
