Dyad: Reordered Consensus Message [Figure: leader SmartNIC running Consensus - Data. Labels: Request 1 & 2, PCIe, Ordered Log, Prepare Handler, Network, Prepareok, Replica; annotation: "Majority prepareok for request 1"] 43
Dyad: Response and Commit [Figure: leader SmartNIC running Consensus - Data. Labels: PCIe, Response, Ordered Log, Response Handler, Network, Commit, Client, Replica; annotation: "Update log meta-data"] 44
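The leader-side behavior on the two slides above (prepareok messages arriving out of order, a commit once a majority is reached) can be sketched in Python. This is a minimal illustration, not the SmartNIC implementation: the class and method names are hypothetical, and it assumes that the leader counts itself toward the majority and that commits are released in sequence order.

```python
from collections import defaultdict


class LeaderLog:
    """Toy model of leader-side quorum tracking (all names are hypothetical)."""

    def __init__(self, num_replicas):
        # Assumption: the leader counts itself toward the majority.
        self.remote_acks_needed = num_replicas // 2
        self.acks = defaultdict(set)  # sequence number -> replica ids seen
        self.next_commit = 1          # lowest sequence number not yet committed

    def on_prepareok(self, seq, replica_id):
        """Record one prepareok; return sequence numbers that can now commit."""
        self.acks[seq].add(replica_id)
        ready = []
        # Assumption: commits are released strictly in sequence order, so a
        # quorum for request 2 waits until request 1 also has its quorum.
        while (self.next_commit in self.acks
               and len(self.acks[self.next_commit]) >= self.remote_acks_needed):
            ready.append(self.next_commit)
            self.next_commit += 1
        return ready
```

With 5 replicas, two remote prepareok messages plus the leader form a majority; a quorum for request 2 that arrives first is held back until request 1 is also acknowledged.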
Dyad: Timestamp Server with 5 replicas. ~2 million messages processed on the NIC ➢ Reduces latency by up to 76% and improves throughput by 5.8x 45
Dyad - Replica Data Operations
1. Ordering
2. Replication
3. Ordered execution
[Figure: message flow between Client, Leader, Replica 1, and Replica 2. Messages: request, prepare, prepareok, commit, response to host. Labels: PCIe, Protocol processing, Context switch, Application, SmartNIC]
46
Dyad: Ordering on Replica SmartNIC
● Ordering and Logging:
➢ Logs are ordered by the sequence number in the prepare message
➢ Prepare messages are processed and dropped on the SmartNIC
● Ordered Execution:
➢ Commit messages are forwarded to the host processor
➢ The SmartNIC appends the request to the commit message
47
Dyad: Logging on Replica SmartNIC [Figure: replica SmartNIC running Consensus - Data. Labels: PCIe, Ordered Log, Prepare Handler, Network, Prepare 2, Prepareok 2, Leader; annotation: "Log request using sequence number"] 48
Dyad: Ordered Execution on the Replica [Figure: replica SmartNIC running Consensus - Data. Labels: Commit 1, PCIe, Ordered Log, Commit Handler, Network, Leader; annotation: "Verify order of received commit"] 49
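The replica-side data path described above (log each prepare by its sequence number, answer with prepareok, then verify commit order and forward the commit with the request attached) could look roughly like the following Python sketch. The names are hypothetical, and the handling of an out-of-order commit (here: simply not forwarding it) is an assumption; the slides do not specify it.

```python
class ReplicaNic:
    """Toy model of the replica SmartNIC data path (names are hypothetical)."""

    def __init__(self):
        self.log = {}            # sequence number -> request payload
        self.last_committed = 0  # highest sequence number forwarded to the host

    def on_prepare(self, seq, request):
        """Log the request under its sequence number; the prepare stays on the NIC."""
        self.log[seq] = request
        return ("prepareok", seq)

    def on_commit(self, seq):
        """Verify commit order, then build the message forwarded to the host."""
        if seq != self.last_committed + 1 or seq not in self.log:
            # Assumption: an out-of-order or unknown commit is not forwarded.
            return None
        self.last_committed = seq
        # The logged request is appended to the commit message.
        return ("commit", seq, self.log[seq])
```

Prepares may arrive in any order, but the host only sees commits in sequence order, each carrying its request.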
Dyad: Timestamp Server with 5 replicas ➢ Reduces latency by 30 μs 50
Dyad: Consensus Latency (timestamp server, 5 replicas)
System        Consensus latency (μs)   % reduction
VR            350                      N/A
VR-batching   409                      N/A
Dyad-Leader   48                       86%
Dyad-All      17                       95%
51
Dyad: CPU Usage, Timestamp Server ➢ Reduces CPU usage by up to 70% on the leader 52
Dyad: Control Operations [Figure: replica stack. Host: Application, Consensus - Control, BSD socket, Linux epoll, Protocol Processing. PCIe. SmartNIC: Consensus - Data. Network] 53
Dyad: Application Failures ➢ 92% of catastrophic failures are due to software [1] [Figure: replica stack. Host: Application, Consensus - Control, BSD socket, Linux epoll, Protocol Processing. PCIe. SmartNIC: Consensus - Data. Network; annotation: "Fail-stop failure"] [1] Simple Testing Can Prevent Most Critical Failures, OSDI'14 54
Dyad: Detecting Application Failures [Figure: replica stack. Host: Application, Consensus - Control, BSD socket, Linux epoll, Protocol Processing. SmartNIC: Consensus - Data. Network. Labels: Request, Response, Host RTT] 55
Dyad: Detecting Application Failures
● Measure the host RTT for each request
● Compute a weighted average of the host RTTs
● Detect a failure when a response does not arrive within the host RTT threshold
56
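The three detection steps above can be illustrated with a short Python sketch. The slides do not give the weighting scheme or how the threshold is derived, so the exponentially weighted average, the `alpha` weight, and the `factor` multiplier below are all assumptions chosen for illustration; the class name is hypothetical.

```python
class HostFailureDetector:
    """Toy host-RTT failure detector (names and parameters are hypothetical)."""

    def __init__(self, alpha=0.125, factor=4.0):
        self.alpha = alpha     # assumed weight of the newest RTT sample
        self.factor = factor   # assumed threshold = factor * average RTT
        self.avg_rtt_us = None

    def record_response(self, rtt_us):
        """Fold one measured host RTT into the weighted average."""
        if self.avg_rtt_us is None:
            self.avg_rtt_us = float(rtt_us)
        else:
            self.avg_rtt_us = ((1 - self.alpha) * self.avg_rtt_us
                               + self.alpha * rtt_us)

    def threshold_us(self):
        """Return the current host RTT threshold, or None before any sample."""
        if self.avg_rtt_us is None:
            return None
        return self.factor * self.avg_rtt_us

    def is_failed(self, elapsed_us):
        """True if a pending request has waited longer than the threshold."""
        threshold = self.threshold_us()
        return threshold is not None and elapsed_us > threshold
```

A larger `factor` trades slower detection for fewer false positives when the host is merely slow rather than failed.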
Application Recovery - VR [Figure: three replicas (Replica 1 - Leader, Replica 2, Replica 3), each with Application, Consensus, BSD socket, Linux epoll, Protocol Processing, PCIe, and NIC, connected by a data center network (μs RTT). Labels: Application Restart, Log Transfer, Client Requests] 57
Dyad: Application Recovery
● Recovery using logs on the SmartNIC
● Two-stage recovery:
➢ Recover logs from the SmartNIC
➢ Recover remaining logs from other replicas
58
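The two-stage recovery above can be sketched as follows. This is only an illustration under stated assumptions: logs are modeled as dictionaries keyed by sequence number, sequence numbers start at 1, and the function and parameter names are hypothetical.

```python
def recover(nic_log, peer_log, last_seq):
    """Rebuild the application log in two stages.

    nic_log:  entries still held by the local SmartNIC (seq -> request)
    peer_log: entries available from another replica (seq -> request)
    last_seq: highest sequence number that must be recovered
    Returns the recovered log and the sequence numbers fetched remotely.
    """
    # Stage 1: recover everything the local SmartNIC still holds.
    recovered = dict(nic_log)
    # Stage 2: fetch only the remaining entries from another replica.
    missing = [seq for seq in range(1, last_seq + 1) if seq not in recovered]
    for seq in missing:
        recovered[seq] = peer_log[seq]
    return recovered, missing
```

Only the entries in `missing` cross the network, which is why recovering from the local SmartNIC first can shorten recovery.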
Dyad: Application Recovery 400MB of data received ➢ Dyad reduces recovery time by up to 67% 59
Dyad: SmartNIC Failure [Figure: replica stack. Host: Application, Consensus - Control, BSD socket, Linux epoll, Protocol Processing. PCIe. SmartNIC: Consensus - Data. Network] 60
Dyad: System Failure ➢ 8%: hardware faults, misconfigs [1] [Figure: replica stack. Host: Application, Consensus - Control, BSD socket, Linux epoll, Protocol Processing. PCIe. SmartNIC: Consensus - Data. Network] [1] Simple Testing Can Prevent Most Critical Failures, OSDI'14 61
Dyad: System Recovery
● SmartNIC Failure:
➢ Detected on the host using heartbeat/client messages
➢ Existing VR recovery: fetch remaining logs from other replicas
● System Failure:
➢ Existing VR recovery: fetch logs from other replicas
➢ Dyad supports logging to disk from the host (Raft)
62
Dyad: Reliable Connection
● Dyad supports Raft:
➢ Uses TCP connections to replicas
➢ The TCP stack specifically decodes Raft headers and payload
➢ The host application logs client commands to disk for persistence
63
Dyad: Raft Latency ➢ Improves latency by up to 62% 64
Dyad: Ease of Use
● Memcached:
➢ Enable consensus for Memcached
■ ~100 lines of code for data operations on the replica
➢ Evaluate impact on latency and throughput
65
Dyad: Memcached Throughput ➢ Provides consensus with ~7% reduction in throughput 66
Dyad: Memcached Latency ➢ Provides consensus with ~16% increase in latency 67
Dyad: Untangling Logically-Coupled Consensus
● Motivation
● Background
● Overview
● Design and Evaluation
● Conclusion
68
Dyad: Conclusion
● SmartNIC abstraction for consensus
● Data operations performed on the SmartNIC
● Control operations performed on the host
● Enables consensus as a service on SmartNICs
69
Thesis: Conclusion
● Xps - Extensible Protocol Stack:
➢ Abstraction in kernel, user space, and SmartNIC
● Latr - lazy TLB shootdown:
➢ Kernel mechanism for TLB shootdown
System abstractions and optimizations are needed at different levels of the software stack to reduce the latency and improve the throughput of current data-center applications.
70
Thank you! 71
Backup Slides 72
Arrakis 73
Redis comparison with Arrakis 74
Latr - Apache 75
Latr - Apache latency 76
User-Space Stacks 77
User Space: Protocol processing
System    Latency (μs)   Mitigation
mTCP      ~23            Batching
IX        ~12            Batching
Arrakis   ~2.6 - 6.3     None
78
VR: IX batching with 3 Replicas 79
Context Switch 80
VR - Leader Context Switch 81
Dyad - Parallelism 82
Dyad: Application Parallelism
● Without SmartNIC:
➢ Sequence numbers are available in the prepareok message
➢ Multi-threaded execution uses the sequence number
● Dyad:
➢ Requests are ordered without containing the sequence number
➢ The SmartNIC appends the sequence number to the client request
83
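One way an application could use the appended sequence number for multi-threaded execution is sketched below in Python. This is an interpretation rather than the design from the slides: it assumes requests can be executed independently and that only the order in which results are applied matters. The function name and the `(seq, request)` pair format are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_in_parallel(tagged_requests, handler, workers=4):
    """Run handler on each request concurrently, then order results by seq.

    tagged_requests: iterable of (seq, request) pairs, in any arrival order;
    seq stands for the sequence number appended by the SmartNIC.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {seq: pool.submit(handler, request)
                   for seq, request in tagged_requests}
        # Results are collected in sequence-number order, regardless of
        # which worker thread finishes first.
        return [(seq, futures[seq].result()) for seq in sorted(futures)]
```

For example, `execute_in_parallel([(2, "b"), (1, "a")], str.upper)` yields the results for sequence numbers 1 and 2 in that order.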
Dyad: Parallelism Timestamp Server ➢ Improves throughput by up to 2.1x 84
Reading Logs 85
Dyad: Log Read Throughput ➢ Log read throughput ~256 MB with 16 threads 86
Direct Cost Formula 87
Cost of Consensus - Direct and Indirect ➢ Consensus overhead increases with the number of replicas 88
VR Recovery Data Transfer 89
Application Recovery - VR data transfer
Replicas   Log size (MB)   Data transferred (MB)
3          100             200
5          100             400
7          100             600
90
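The figures in the table are consistent with the recovering replica receiving one copy of the log from each of the other replicas. That relation is inferred from the three rows, not stated on the slide, and the function name below is hypothetical:

```python
def vr_recovery_transfer_mb(replicas, log_size_mb):
    """Data received during VR recovery, assuming (inferred from the table)
    one full log copy from each of the other replicas."""
    return (replicas - 1) * log_size_mb


# Reproduce the table rows for a 100 MB log.
for n in (3, 5, 7):
    print(n, vr_recovery_transfer_mb(n, 100))
```

Under this reading, the data transferred grows linearly with the number of replicas for a fixed log size.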
False Positives RTT 91
Dyad: False Positives with Timestamp Server ➢ RTT = ~96 μs 92
SmartNIC - Netronome 93
SmartNIC: Memory Hierarchy and Latency 94
Recovery Example 95
Dyad - Recovery Phase1 [Figure: three replicas (Replica 1 - Leader, Replica 2, Replica 3), each with Application, Consensus, BSD socket, Linux epoll, Protocol Processing, PCIe, and SmartNIC, connected by a data center network (μs RTT). Labels: Application Restart, log entries 1, 2 over PCIe, numbered log entries (1 to 3) on the SmartNICs, Client Requests] 96
Dyad - Recovery Phase2 [Figure: three replicas (Replica 1 - Leader, Replica 2, Replica 3), each with Application, Consensus, BSD socket, Linux epoll, Protocol Processing, PCIe, and NIC, connected by a data center network (μs RTT). Labels: Application Restart, Log Transfer of log entry 3, numbered log entries (1 to 3) on the NICs, Client Requests] 97
Raft - Logging to Disk 98
Dyad: Raft Latency with disk logging ➢ Improves latency by up to 46% 99
Dyad - Future Work 100