Securing RDMA for High-Performance Datacenter Storage Systems
Anna Kornfeld Simpson, Adriana Szekeres (Paul G. Allen School of Computer Science & Engineering, University of Washington), Jacob Nelson, Irene Zhang (Microsoft Research)
Remote Direct Memory Access (RDMA) provides CPU-bypass over the datacenter network with only a few microseconds of latency.
[Figure: RDMA over Converged Ethernet (RoCEv2) packet: Ethernet header, IP header, UDP header, RDMA header and data. Source: RoCEv2 spec, InfiniBand Trade Association, 2014.]
[Figure: Abstracted RDMA portion of a RoCEv2 packet: queue pair info, remote memory address and r_key, payload.]
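The abstracted packet on this slide can be made concrete with a small sketch. The field widths and the names `pack_rdma` and `unpack_rdma` are assumptions chosen for illustration. Real RoCEv2 headers carry more fields than the three shown here.

```python
import struct

# Hypothetical, simplified layout for the abstracted RDMA portion of a packet:
# queue pair number (4 bytes), remote memory address (8 bytes), r_key (4 bytes),
# followed by the payload. This is not the actual RoCEv2 wire format.
RDMA_HEADER = struct.Struct("!IQI")


def pack_rdma(qp_num: int, remote_addr: int, r_key: int, payload: bytes) -> bytes:
    """Serialize the abstracted fields in network byte order."""
    return RDMA_HEADER.pack(qp_num, remote_addr, r_key) + payload


def unpack_rdma(packet: bytes):
    """Split a packet back into (qp_num, remote_addr, r_key, payload)."""
    qp_num, remote_addr, r_key = RDMA_HEADER.unpack_from(packet)
    return qp_num, remote_addr, r_key, packet[RDMA_HEADER.size:]


packet = pack_rdma(qp_num=17, remote_addr=0x7F0000001000, r_key=0x1234, payload=b"value")
fields = unpack_rdma(packet)
```

Nothing in this layout is encrypted or authenticated, so anyone who can see the bytes can read the remote address and the r_key directly.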
Example RDMA System: Pilaf (2013): Put (SEND)
[Figure: Clients and a Server (CPU, Memory, NIC), with steps numbered 1 to 5.]
Pilaf (2013): Unlike Put, Get is CPU-bypassing
RDMA not designed for datacenter security needs
Security weaknesses discovered over past 2 decades (see Section 2 of paper for citations):
• Confidentiality: packet in plaintext
• Integrity: no packet integrity check or authentication
• Availability: denial of service
• Side channels: non-random r_keys and more
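To illustrate the missing integrity check, here is a minimal sketch of what a per-packet authentication tag buys. It uses HMAC-SHA256 from the Python standard library as a stand-in. This is an assumption for illustration and not something RDMA provides, and it adds integrity only, not confidentiality. The `key`, `seal`, and `open_sealed` names and the textual request format are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-connection key; how it would be distributed is out of scope.
key = secrets.token_bytes(32)
TAG_LEN = 32  # length of an HMAC-SHA256 digest in bytes


def seal(packet: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so the receiver can detect tampering."""
    return packet + hmac.new(key, packet, hashlib.sha256).digest()


def open_sealed(sealed: bytes):
    """Return the packet if the tag verifies, otherwise None."""
    packet, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(key, packet, hashlib.sha256).digest()
    return packet if hmac.compare_digest(tag, expected) else None


original = b"READ addr=0x1000 len=64"
sealed = seal(original)
# An attacker rewrites the target address but cannot recompute the tag.
tampered = b"READ addr=0x9000 len=64" + sealed[-TAG_LEN:]
```

With the tag in place, `open_sealed(sealed)` returns the original request and `open_sealed(tampered)` returns `None`. Without any tag, the receiver has no way to tell the two requests apart.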
We analyzed recent distributed storage systems built on RDMA and discovered additional systems design challenges that remain even after security fundamentals are fixed.
• Can RDMA-based storage systems provide security at least as good as pre-RDMA datacenter security best practices?
• We analyzed: Pilaf, FaRM, HERD, DrTM, FaSST, Octopus, Hyperloop, DrTM+H
Threat Model = Compromised Storage Client
[Figure: Clients, one of them issuing Bad(), and a Server (CPU, Memory, NIC).]
VLANs/virtualization do not help! A compromised client only needs to see its own network traffic to spoof RDMA.
Challenge 1: no auditability/logging on reads
[Figure: Clients and a Server (CPU, Memory, NIC), with steps numbered 1 to 3. Adversary does CPU-bypassing READ. What data was exfiltrated?]
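The auditability gap can be sketched with a toy model. The `Server` class and its methods are hypothetical and are not taken from Pilaf or any other system analyzed here. The only point is that a log written by server code never sees a read that the NIC serves on its own.

```python
class Server:
    """Toy model of a storage server; not any real system's implementation."""

    def __init__(self):
        self.memory = {}     # stands in for registered server memory
        self.audit_log = []  # written only by code running on the server CPU

    def handle_put(self, client, key, value):
        # Two-sided path: the server CPU runs this handler, so it can log.
        self.audit_log.append(("PUT", client, key))
        self.memory[key] = value

    def nic_read(self, key):
        # One-sided path: the NIC serves the read and no server code runs,
        # so nothing is appended to the audit log.
        return self.memory.get(key)


server = Server()
server.handle_put("alice", "k1", b"secret")
leaked = server.nic_read("k1")  # the adversary's CPU-bypassing READ
```

After this runs, the adversary holds the value, while `server.audit_log` contains only the earlier Put. The server has no record from which to answer "what data was exfiltrated?".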
Challenge 2: Design Implications of Storage Logic Location: RPC and Concurrency
[Figure: DrTM (2016) Put, with steps numbered 4 to 6.]
Challenge 3: Separating different users’ data
• Single big remote memory registration -> attacker access to all user data
• Vendor-suggested solution (protection domains) is a poor performance fit for storage systems with multiple storage clients that all want to access the same data
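The first bullet can be illustrated with a minimal model of the access check on a registered region. The `Region` type, the `allowed` function, and the addresses are assumptions for illustration, not the actual NIC logic or a vendor API.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Region:
    """A registered memory region [base, base + length) guarded by an r_key."""
    base: int
    length: int
    r_key: int


def allowed(region: Region, r_key: int, addr: int, size: int) -> bool:
    """Model of the check: correct r_key and the access stays in bounds."""
    return (r_key == region.r_key
            and region.base <= addr
            and addr + size <= region.base + region.length)


# One big registration covering both users' data (hypothetical addresses).
whole = Region(base=0, length=2048, r_key=secrets.randbits(32))
# Alternative: one registration per user, each with its own r_key.
alice = Region(base=0, length=1024, r_key=secrets.randbits(32))
bob = Region(base=1024, length=1024, r_key=secrets.randbits(32))

# A client holding the single shared r_key can reach Bob's bytes.
reach_with_whole = allowed(whole, whole.r_key, addr=1500, size=16)
# A client holding only Alice's r_key cannot, in the per-user layout.
reach_with_alice = allowed(alice, alice.r_key, addr=1500, size=16)
```

The per-user split only shows what finer-grained permissions would look like. It does not address the second bullet, where multiple clients legitimately need access to the same data.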
Ingredients for more secure CPU-bypass systems
Security Fundamentals
• High throughput AEAD cryptography for datapath (e.g. DTLS)
• Centralized key management
System Design Challenges
• Logging strategy that does not rely on client
• Alternatives to unreliable RPC
• Finer-grained permissions on remote data access
Source: Zookeeper Project
Lots of big open questions for future research!
• Wishlist for features to help support application security when building systems that use CPU-bypassing RDMA?
• Wishlist for securing non-user-facing datacenter tasks?
• How do we get these better features baked in? Changing the RDMA standard?
Thank you for watching! Questions? Email Anna: aksimpso@cs.washington.edu