Meerkat: Multicore-Scalable Replicated Transactions Following the Zero Coordination Principle
Distributed storage systems are getting faster
[Figure: peak throughput (million txns/s) vs. # cores for three classes of systems: disk-based with in-kernel networking; in-memory with in-kernel networking; in-memory with kernel-bypass. Annotations: "critical section"; "Can we achieve this?"]
The Zero-Coordination Principle (ZCP): When transactions do not conflict:
• No writes to memory shared with other cores (P1)
• No cross-replica coordination (P2)
Common ways in which existing systems violate ZCP
Replication:
• agreement on log order
• contention on the log
Concurrency control:
• centralized timestamp management
• contention on the list of active/validated transactions
Other systems vs. Meerkat
Replication:
• Other systems: agreement on log order. Meerkat: decentralized agreement on transaction status.
• Other systems: contention on the log. Meerkat: per-core record of transactions.
Concurrency control:
• Other systems: centralized timestamp management. Meerkat: clients pick the commit timestamp.
• Other systems: contention on the list of active/validated transactions. Meerkat: key-parallel OCC.
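The two Meerkat-side ideas in this comparison, client-picked commit timestamps and a per-core record of transactions, can be sketched in a few lines. This is an illustrative sketch only, not Meerkat's implementation: the names `Timestamp`, `pick_commit_timestamp`, `Replica`, and `core_for` are hypothetical, the client-id tie-breaker is an assumption, and Python lists stand in for per-core memory.

```python
import time
from typing import NamedTuple


class Timestamp(NamedTuple):
    """Commit timestamp chosen by a client (hypothetical layout)."""
    clock: int      # local clock reading in nanoseconds
    client_id: int  # assumed tie-breaker so two clients never pick equal timestamps


def pick_commit_timestamp(client_id, now_ns=time.time_ns):
    """The client reads its own clock; no central timestamp service is contacted."""
    return Timestamp(now_ns(), client_id)


class Replica:
    """One replica holding a separate transaction record per core."""

    def __init__(self, n_cores):
        # One private dict per core, so recording a transaction never
        # touches a structure shared with other cores.
        self.records = [dict() for _ in range(n_cores)]

    def core_for(self, txn_id):
        # Assumed routing rule: a transaction is always handled by the same core.
        return txn_id % len(self.records)

    def record(self, txn_id, status):
        self.records[self.core_for(txn_id)][txn_id] = status
```

Because tuples compare field by field, two clients that read the same clock value still obtain distinct, ordered timestamps.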
Meerkat’s approach
Get rid of the log! Use a decentralized approach instead.
Meerkat’s decentralized approach
• Decentralized OCC
  - client picks a commit timestamp using loosely synchronized clocks
  - replicas independently check for conflicts
• Fast, decentralized consensus
  - client learns the fate of the transaction (fast path)
  - client proposes to commit the transaction (slow path) only if OCC checks successful at a majority
Correctness comes from quorum intersection + pairwise conflict checks; see paper
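The flow above can be sketched as a replica-side conflict check plus a client-side decision over the replies. This is a simplified sketch under stated assumptions, not the protocol from the paper: the per-key metadata (`KeyState`), the exact validation rules, the outcome labels, and the `fast_quorum` parameter are all hypothetical; the slides only say that the slow path proposes commit if the OCC checks succeed at a majority.

```python
from dataclasses import dataclass, field


@dataclass
class KeyState:
    """Per-key metadata at one replica (assumed layout)."""
    write_ts: int = 0  # timestamp of the latest committed write
    read_ts: int = 0   # largest timestamp of a committed read


@dataclass
class Txn:
    commit_ts: int                                 # picked by the client
    read_set: dict = field(default_factory=dict)   # key -> version (write_ts) observed
    write_set: dict = field(default_factory=dict)  # key -> new value


def validate(store, txn):
    """Simplified OCC check that one replica runs on its own state."""
    for key, seen_version in txn.read_set.items():
        state = store.get(key, KeyState())
        # The version read must still be the latest committed one,
        # and the transaction must be ordered after it.
        if state.write_ts != seen_version or txn.commit_ts <= state.write_ts:
            return False
    for key in txn.write_set:
        state = store.get(key, KeyState())
        # A write must be ordered after every committed read and write of the key.
        if txn.commit_ts <= state.read_ts or txn.commit_ts <= state.write_ts:
            return False
    return True


def decide(votes, n_replicas, fast_quorum):
    """Client-side decision from replica votes (True = OCC check passed)."""
    majority = n_replicas // 2 + 1
    ok = sum(1 for v in votes if v)
    failed = len(votes) - ok
    if ok >= fast_quorum:
        return "commit (fast path)"
    if failed >= fast_quorum:
        return "abort (fast path)"
    if ok >= majority:
        return "propose commit (slow path)"
    return "propose abort (slow path)"
```

With three replicas and an assumed fast quorum of three, `decide([True, True, False], 3, 3)` falls back to the slow path and proposes commit, since the check passed at a majority.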
Other systems vs. Meerkat
Replication (Meerkat: ZCP ✓):
• Other systems: agreement on log order. Meerkat: decentralized agreement on transaction status.
• Other systems: contention on the log. Meerkat: per-core record of transactions.
Concurrency control (Meerkat: ZCP ✓):
• Other systems: centralized timestamp management. Meerkat: clients pick the commit timestamp.
• Other systems: contention on the list of active/validated transactions. Meerkat: key-parallel OCC.
Meerkat also has some nice performance properties
• Low latency (no leader)
  - commits transactions in 1 RTT (in the absence of conflicts and failures)
  - waits for replies from the fastest replicas
• Read from any replica
  - balance the workload
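To make "waits for replies from the fastest replicas" concrete: if a client sends to all replicas at once and needs a quorum of replies, its commit latency is set by the slowest replica inside that quorum, not by the slowest replica overall. The helper below is a hypothetical illustration; the function name and the `quorum` parameter are assumptions, and the slides do not state the quorum size.

```python
def fast_path_latency(replica_rtts_ms, quorum):
    """Latency until `quorum` replies have arrived, given one RTT per replica."""
    if not 1 <= quorum <= len(replica_rtts_ms):
        raise ValueError("quorum must be between 1 and the number of replicas")
    # The client can proceed as soon as the quorum-th fastest reply arrives.
    return sorted(replica_rtts_ms)[quorum - 1]
```

For example, with round-trip times of 5 ms, 1 ms, and 3 ms and an assumed quorum of two, the client proceeds after 3 ms and never waits for the 5 ms replica.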
Prototypes
System       No cross-processor coordination   No cross-replica coordination
KuaFu++      X                                 X
TAPIR        X                                 ✓
Meerkat-PB   ✓                                 X
Meerkat      ✓                                 ✓
Meerkat scales near-linearly under low contention (uniform)
[Figure. Workload: short txns (YCSB-T), 1 mil keys/core. Annotations: leader bottleneck; log contention; contention on the validation list]
Meerkat performs well for low to medium contention
[Figure. Workload: short txns (YCSB-T), 1 mil keys/core, 64 hyperthreads. Annotations: expensive aborts; more slow paths]