The Bw-Tree: A B-tree for New Hardware Platforms
Presented by: Sidharth Raj
An Alternate Title
“The Bw-Tree: A Latch-free, Log-structured B-tree for Multi-core Machines with Large Main Memories and Flash Storage”
BW = “Buzz Word”
Bw-Tree Architecture (diagram annotation: “Focus of this talk”)
B-Tree Layer
• API
• B-tree search/update logic
• In-memory pages only
Cache Layer
• Logical page abstraction for the B-tree layer
• Brings pages from flash to RAM as necessary
Flash Layer
• Sequential writes to log-structured storage
• Flash garbage collection
1. “Uni-core speed will at best increase modestly, thus we need to get better at exploiting a large number of cores by addressing at least two important aspects: …” What are the aspects that have to be addressed with multi-cores? Why are the aspects not discussed in the KV stores we’ve studied so far?
• The two aspects discussed in the paper are:
  • Multi-core CPUs mandate high concurrency.
  • Good multi-core processor performance depends on high cache hit ratios.
6. Read Section II.C “Delta Updating”, and explain how a tree node is updated.
• Create a delta record describing the update and prepend it to the existing page state.
• Install the delta record's memory address in the mapping table using CAS.
• If the CAS succeeds, the delta record's address becomes the page's new physical address.
• The same technique is used for both data changes (e.g., inserting a record) and management changes (e.g., flushing a page to storage).
Delta Updates
Figure: mapping table (columns PID and Physical Address) whose entry for P points to a delta chain: Δ: Delete record 48 → Δ: Insert record 50 → Page P
• Each update to a page produces a new address (the delta)
• The delta physically points to the existing “root” of the page
• Install the delta address in the physical address slot of the mapping table using compare and swap
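The delta-update steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `BasePage`, `Delta`, `MappingTable`, `apply_update`, and `contains` are hypothetical names, Python object identity stands in for a physical address, a lock merely emulates the atomicity of the hardware CAS instruction, and retrying after a failed CAS is an assumption of this sketch.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class BasePage:
    keys: frozenset          # records stored in the consolidated page


@dataclass(frozen=True)
class Delta:
    op: str                  # "insert" or "delete"
    key: int
    next: object             # prior page state (another Delta or the BasePage)


class MappingTable:
    """PID -> physical address (here: a Python object reference)."""

    def __init__(self):
        self._slots = {}
        self._atomic = threading.Lock()   # emulates CAS atomicity only

    def install(self, pid, page):
        self._slots[pid] = page

    def get(self, pid):
        return self._slots[pid]

    def cas(self, pid, expected, new):
        """Replace the slot only if it still holds `expected` (identity check)."""
        with self._atomic:
            if self._slots.get(pid) is not expected:
                return False
            self._slots[pid] = new
            return True


def apply_update(table, pid, op, key):
    """Prepend a delta to the page and install its address with CAS."""
    while True:
        current = table.get(pid)
        delta = Delta(op, key, current)   # delta points at the existing state
        if table.cas(pid, current, delta):
            return delta
        # CAS failed: the page changed underneath us, so retry (assumption).


def contains(table, pid, key):
    """Read a page by walking the delta chain down to the base page."""
    node = table.get(pid)
    while isinstance(node, Delta):
        if node.key == key:
            return node.op == "insert"
        node = node.next
    return key in node.keys


table = MappingTable()
table.install("P", BasePage(frozenset({48, 49})))
apply_update(table, "P", "insert", 50)
apply_update(table, "P", "delete", 48)
print(contains(table, "P", 50), contains(table, "P", 48), contains(table, "P", 49))
```

The base page is never modified; each update only adds a record in front of it and swings the single mapping table slot.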
2. “Having the new node state at a new storage location permits us to use the atomic compare and swap instructions to update state.” Study the CAS instruction and explain how it enables latch-free (lock-free) memory access.
• In the Bw-tree design, threads almost never block.
• Instead of latches, the Bw-tree installs state changes with the compare-and-swap (CAS) instruction.
• A node update is performed as a delta update.
• When the new address is installed, CAS compares the address the updater previously read for page P with the address currently in the mapping table. If they match, the instruction overwrites the mapping table entry with the address of the delta record; otherwise the CAS fails and the entry is left unchanged.
• All pointers between nodes are PIDs, so the CAS on the mapping table is the ONLY physical pointer change.
Compare and Swap
• Atomic instruction that compares the contents of a memory location M to a given value V
• If the values are equal, installs the new given value V’ in M
• Otherwise the operation fails
Example: CompareAndSwap(address, compare value, new value), with M initially 20
• CompareAndSwap(&M, 20, 30) succeeds: M holds 20, so M becomes 30
• CompareAndSwap(&M, 20, 40) then fails (X): M now holds 30, not 20
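The semantics on this slide can be mimicked in a few lines. Python has no user-level CAS instruction, so this sketch assumes a hypothetical `Cell` class whose lock only emulates the atomicity that the hardware provides in a single instruction; it shows the compare-then-install behavior, not a latch-free implementation.

```python
import threading


class Cell:
    """A single memory location M (hypothetical helper for illustration)."""

    def __init__(self, value):
        self.value = value
        self._atomic = threading.Lock()   # stands in for hardware atomicity

    def compare_and_swap(self, expected, new):
        """Install `new` only if the cell still holds `expected`."""
        with self._atomic:
            if self.value != expected:
                return False              # contents changed; nothing is written
            self.value = new
            return True


m = Cell(20)
first = m.compare_and_swap(20, 30)    # M holds 20, so 30 is installed
second = m.compare_and_swap(20, 40)   # M now holds 30, so this attempt fails
print(first, second, m.value)         # prints: True False 30
```

A writer whose CAS fails learns that another writer changed M first and can re-read the location before trying again.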
3. “Further, the Bw-tree performs node updates via “delta updates” (attaching the update to an existing page), not via update-in-place (updating the existing page memory).” What does the delta actually refer to?
• Update-in-place:
  • Read the page
  • Process the change
  • Write back
• The Bw-tree instead uses delta updates: it prepends the update to the existing page and installs it using CAS.
• Delta: a record describing the change.
• Avoiding update-in-place reduces CPU cache invalidation, resulting in a higher cache hit ratio.
5. “The mapping table severs the connection between physical location and inter-node links.” Explain the benefit of introducing the indirection (the mapping table).
• The cache layer maintains the mapping table, which maps logical pages to physical pages.
• Logical pages are identified by a “Page Identifier” (PID).
• The mapping table translates a PID into either:
  • a flash offset, or
  • a memory pointer
• This lets the physical location of a Bw-tree node change without requiring the location change to propagate to the root of the tree.
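A small sketch of this translation, under stated assumptions: `FlashOffset`, `MemoryPointer`, `resolve`, the PID 7, and the offset 4096 are all hypothetical, a dictionary stands in for flash storage, and the slot is updated by plain assignment where the real system would use CAS.

```python
from dataclasses import dataclass


@dataclass
class FlashOffset:
    offset: int          # position of the page in flash storage


@dataclass
class MemoryPointer:
    page: dict           # stands in for an in-memory page


# Hypothetical flash contents: offset -> stored page.
flash = {4096: {"keys": [10, 20, 30]}}

# Mapping table: PID -> current physical location of the page.
mapping_table = {7: FlashOffset(4096)}

# A parent node links to its child by PID only.
parent = {"child_pids": [7]}


def resolve(pid):
    """Translate a PID to an in-memory page, loading it from flash if needed."""
    entry = mapping_table[pid]
    if isinstance(entry, FlashOffset):
        page = dict(flash[entry.offset])   # copy the page into RAM
        entry = MemoryPointer(page)
        mapping_table[pid] = entry         # only this slot changes
    return entry.page


page = resolve(parent["child_pids"][0])
print(page["keys"], parent)
```

The page moved from flash to memory, yet the parent still holds the same PID, so nothing above the page had to be rewritten.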
4. “We use PIDs in the Bw-tree to link the nodes of the tree. For instance, all downward “search” pointers between Bw-tree nodes are PIDs, not physical pointers.” Explain how the tree is stored on disk and/or in memory.
• PIDs bridge the gap between logical pages and physical pages.
• The mapping table translates PIDs as required.
• Bw-tree nodes are logical: they do not occupy fixed physical locations.
• All pointers between nodes are PIDs.
• A PID identifies a node of the tree.
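To make the PID-linked structure concrete, here is a hedged sketch of a search that descends through PIDs. The `Inner` and `Leaf` classes, the PIDs 1 to 3, and the stored records are invented for illustration, and the table slot is replaced by plain assignment rather than CAS.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Inner:
    separators: list     # sorted separator keys
    child_pids: list     # len(separators) + 1 PIDs, never physical pointers


@dataclass
class Leaf:
    records: dict        # key -> value


# Mapping table: PID -> node object (the node's current physical location).
mapping_table = {
    1: Inner(separators=[50], child_pids=[2, 3]),
    2: Leaf({10: "a", 48: "b"}),
    3: Leaf({50: "c", 90: "d"}),
}
ROOT_PID = 1


def search(key):
    """Descend from the root, translating each child PID via the mapping table."""
    node = mapping_table[ROOT_PID]
    while isinstance(node, Inner):
        idx = bisect_right(node.separators, key)   # keys >= separator go right
        node = mapping_table[node.child_pids[idx]]
    return node.records.get(key)


before = search(90)

# Give the leaf with PID 3 a new physical location (a new object).
old_leaf = mapping_table[3]
mapping_table[3] = Leaf({**old_leaf.records, 60: "e"})

print(before, search(60), mapping_table[ROOT_PID].child_pids)
```

The root's `child_pids` list is untouched by the relocation, which is the property the mapping table is meant to provide.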
Questions?