SuperMem: Enabling Application-transparent Secure Persistent Memory with Low Overheads
Pengfei Zuo (1,2), Yu Hua (1), Yuan Xie (2)
(1) Huazhong University of Science and Technology, China; (2) University of California at Santa Barbara, USA
52nd IEEE/ACM International Symposium on Microarchitecture (MICRO), 2019
[Figure: DRAM and persistent memory, with the label "Low power". Images from the Internet.]
Two Key Challenges for Persistent Memory
• Persistence: the core and cache are volatile while persistent memory is non-volatile, so a crash can leave memory inconsistent
• Security: persistent memory is non-volatile, so sensitive data (e.g., username, password) stays in memory
Gap between persistence and security:
• clflush, mfence, and logging for crash consistency
• Memory encryption for data security
Encryption incurs a new inconsistency problem
Counter Mode Encryption
[Figure: on a write-back from the CPU cache, the line's counter from the counter cache is fed to the AES engine to produce a one-time pad; the pad is XORed with the data to produce the encrypted data. The updated counter and the encrypted data are both written back to NVM.]
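The flow on this slide (counter into the cipher, one-time pad out, XOR with the data) can be illustrated with a minimal sketch. It is not the hardware design: SHAKE-256 from the Python standard library stands in for the AES engine, and the key, the line address, and the 64-byte line size are assumptions made for illustration.

```python
import hashlib

LINE_SIZE = 64  # assumed memory line size in bytes


def one_time_pad(key: bytes, line_addr: int, counter: int) -> bytes:
    """Derive a pad from (key, address, counter).

    Stand-in only: real counter-mode memory encryption uses an AES engine;
    SHAKE-256 is used here so the sketch needs no third-party library.
    """
    seed = key + line_addr.to_bytes(8, "little") + counter.to_bytes(8, "little")
    return hashlib.shake_256(seed).digest(LINE_SIZE)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_line(key: bytes, line_addr: int, counter: int, plaintext: bytes) -> bytes:
    """XOR the line with the pad; applying it twice with the same counter decrypts."""
    return xor_bytes(plaintext, one_time_pad(key, line_addr, counter))


decrypt_line = encrypt_line  # XOR with the same pad is its own inverse

key = b"demo-key"                      # hypothetical key
addr = 0x40                            # hypothetical line address
plain = b"username,password".ljust(LINE_SIZE, b"\x00")

old_counter = 7
new_counter = old_counter + 1          # the counter is incremented on every write-back
cipher = encrypt_line(key, addr, new_counter, plain)

recovered = decrypt_line(key, addr, new_counter, cipher)
garbage = decrypt_line(key, addr, old_counter, cipher)  # stale counter gives the wrong pad
```

Decryption only succeeds with the exact counter value used for encryption, which is why the counter must be persisted together with the data.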
Crash Inconsistency Caused by Encryption
[Figure: the CPU cache and the counter cache write back the encrypted data and the updated counter to NVM separately; successive builds mark two crash cases, CASE 1 and CASE 2, and a clflush issued to the CPU cache.]
• Data and its counter cannot reach NVM at the same time
• clflush and mfence cannot operate on the counter cache
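To make the inconsistency concrete, the toy model below persists only one of the two items (data or counter) before a simulated crash and then tries to decrypt. It is a sketch under assumptions: the pad function is a SHAKE-256 stand-in for AES, the `ToyNVM` class and the values are hypothetical, and the two orderings shown are not claimed to match the slide's CASE 1 and CASE 2 labels one to one.

```python
import hashlib

LINE_SIZE = 64


def pad(counter: int) -> bytes:
    # Stand-in for the AES-generated one-time pad (assumption, not the real engine).
    return hashlib.shake_256(b"demo-key" + counter.to_bytes(8, "little")).digest(LINE_SIZE)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


old = b"old value".ljust(LINE_SIZE, b"\x00")
new = b"new value".ljust(LINE_SIZE, b"\x00")


class ToyNVM:
    """Holds one encrypted line and its counter, both consistent at start."""

    def __init__(self) -> None:
        self.counter = 1
        self.data = xor(old, pad(self.counter))


def read_after_crash(nvm: ToyNVM) -> bytes:
    # Recovery decrypts whatever data is in NVM with whatever counter is in NVM.
    return xor(nvm.data, pad(nvm.counter))


# Only the encrypted data reached NVM; the updated counter was lost in the counter cache.
nvm_data_only = ToyNVM()
nvm_data_only.data = xor(new, pad(2))
data_only = read_after_crash(nvm_data_only)

# Only the updated counter reached NVM; the encrypted data was lost in the CPU cache.
nvm_counter_only = ToyNVM()
nvm_counter_only.counter = 2
counter_only = read_after_crash(nvm_counter_only)

# Both reached NVM: the line decrypts correctly.
nvm_both = ToyNVM()
nvm_both.data = xor(new, pad(2))
nvm_both.counter = 2
both = read_after_crash(nvm_both)
```

In both partial cases the decrypted line is neither the old nor the new value, so logging at the application level cannot repair it.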
Existing Solutions (Write-back Counter Cache)
Prior work: [Awad et al., ASPLOS’16] [Liu et al., HPCA’18] [Ye et al., MICRO’18] [Zuo et al., MICRO’18]
• Large battery backup: expensive
• Software-level modification: new programming primitives such as counter_cache_writeback() and CounterAtomic; portability limitation
• Error correction: long recovery time
[Figure labels: CPU cache, counter cache, app, battery, check, unencrypted, encrypted]
SuperMem: Secure and Persistent Memory
• Exploit a write-through counter cache
  – No large battery backup
  – No software-level modifications
  – No need to correct counters
  – Drawback: double writes
• A counter write coalescing scheme
  – Reduce the number of write requests
• A cross-bank counter storage scheme
  – Speed up memory writes
Asynchronous DRAM refresh (ADR): cache lines reaching the write queue can be considered durable.
SuperMem: Secure and Persistent Memory
Application-transparent:
• Write-through counter cache (guarantees consistency)
• Counter write coalescing (reduces writes)
• Cross-bank counter storage (speeds up writes)
Write-through Counter Cache
• Ensure that data and its counter reach the write queue at the same time
  – Write through the counter cache
  – Add a register
[Figure, without the register: CPU: Flu(A), Ret(A); memory controller: Read(Ac), Ac++, Enc(A), Ack(A); write queue: App(Ac), App(A)]
[Figure, with the register: CPU: Flu(A), Ret(A); memory controller: Read(Ac), Ac++, Enc(A), Ack(A); register: Sto(Ac), Sto(A); write queue: App(Ac+A)]
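The role of the register can be sketched as a two-step flush: the counter and the encrypted data are first staged in a volatile register, then appended to the ADR-protected write queue in one step. The class below is a toy model under assumptions, not the memory controller's actual logic: the class and method names are hypothetical, encryption is a placeholder string, and counters are tracked per line address for simplicity.

```python
from collections import deque


class ToyMemoryController:
    def __init__(self) -> None:
        self.counter_cache = {}      # volatile, write-through
        self.register = None         # volatile staging register
        self.write_queue = deque()   # ADR-protected: treated as durable

    def stage(self, addr: str, data: str) -> None:
        """Read(Ac), Ac++, Enc(A), then Sto(Ac) and Sto(A) into the register."""
        counter = self.counter_cache.get(addr, 0) + 1
        self.counter_cache[addr] = counter
        encrypted = f"Enc({data},ctr={counter})"   # placeholder for real encryption
        self.register = (("ctr", addr, counter), ("data", addr, encrypted))

    def commit(self) -> str:
        """App(Ac+A): counter and data enter the write queue together, then Ack(A)."""
        self.write_queue.extend(self.register)
        self.register = None
        return "Ack"

    def crash(self) -> list:
        """Volatile state is lost; only the write queue contents survive."""
        self.counter_cache.clear()
        self.register = None
        return list(self.write_queue)


# Crash after staging but before the append: neither item is durable.
mc_crashed = ToyMemoryController()
mc_crashed.stage("A", "v1")
survivors_before_append = mc_crashed.crash()

# Normal flush: both items become durable together.
mc_ok = ToyMemoryController()
mc_ok.stage("A", "v1")
ack = mc_ok.commit()
survivors_after_append = mc_ok.crash()
```

Either both the counter and the data survive a crash or neither does, which is the property the slide asks for.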
Cross-bank Counter Storage
• SingleBank: counters are stored in a contiguous area in NVM [ASPLOS’15, ASPLOS’16, HPCA’18]
[Figure: banks 0 to 7; Data 0, Data 1, and Data 2 sit in different banks of the data area, while Ctr 0, Ctr 1, and Ctr 2 share one bank of the counter area, which becomes the bottleneck]
Cross-bank Counter Storage
• SameBank: stores the counter of each data line in the same bank as the data, which incurs 2X write latency
[Figure: Ctr 0 with Data 0, Ctr 1 with Data 1, Ctr 2 with Data 2; data-area banks 0 1 2 3 4 5 6 7 map to counter-area banks 0 1 2 3 4 5 6 7]
Cross-bank Counter Storage
• XBank: stores each data line and its counter in different banks to leverage bank parallelism
[Figure: Data 0, Data 1, Data 2 and Ctr 0, Ctr 1, Ctr 2 sit in different banks; data-area banks 0 1 2 3 4 5 6 7 map to counter-area banks 4 5 6 7 0 1 2 3]
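The three placements can be compared with small mapping functions from a data line's bank to its counter's bank. This is a sketch: the 8 banks and the XBank offset of half the bank count follow the bank IDs shown in the figure, while the function names and the fixed counter bank chosen for SingleBank are assumptions for illustration.

```python
NUM_BANKS = 8  # bank IDs 0..7 as in the figure


def counter_bank_single(data_bank: int, counter_area_bank: int = 7) -> int:
    """SingleBank: every counter lands in the same bank (bank ID is hypothetical)."""
    return counter_area_bank


def counter_bank_same(data_bank: int) -> int:
    """SameBank: the counter lives in the data's own bank."""
    return data_bank


def counter_bank_xbank(data_bank: int) -> int:
    """XBank: the counter lives half the banks away, matching 4 5 6 7 0 1 2 3."""
    return (data_bank + NUM_BANKS // 2) % NUM_BANKS


single = [counter_bank_single(b) for b in range(NUM_BANKS)]
same = [counter_bank_same(b) for b in range(NUM_BANKS)]
xbank = [counter_bank_xbank(b) for b in range(NUM_BANKS)]
```

Here `xbank` evaluates to `[4, 5, 6, 7, 0, 1, 2, 3]`: a data write and its counter write never target the same bank, and counter writes for different data banks spread over different banks.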
Locality-aware Counter Write Coalescing
• Spatial locality of counter storage
  – All counters of a page are stored in one counter line
  A page (64 lines): Line 1, Line 2, Line 3, Line 4, …, Line 64
  A counter line (64B): M, m1, m2, m3, m4, …, m64
• Spatial locality of log and data writes
  [Figure: a log entry or the transaction data covers several lines of the page]
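The page-to-counter-line relationship can be written as a small address calculation. The sketch assumes 64-byte lines and 64 lines per page as on the slide; the function name is hypothetical, and slots are 0-indexed here (slot 0 corresponds to m1).

```python
LINE_SIZE = 64        # bytes per line
LINES_PER_PAGE = 64   # lines per page, so a page is 4096 bytes


def counter_slot(byte_addr: int) -> tuple[int, int]:
    """Return (counter line index, slot within that counter line) for an address."""
    line = byte_addr // LINE_SIZE
    return line // LINES_PER_PAGE, line % LINES_PER_PAGE


# Four consecutive lines, e.g. a 256B log entry starting at a page boundary.
slots = [counter_slot(0x1000 + i * LINE_SIZE) for i in range(4)]
counter_lines_touched = {page for page, _ in slots}
```

All four lines map to the same counter line, which is the locality that counter write coalescing relies on.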
Locality-aware Counter Write Coalescing
• An example of writing 4 lines (A, B, C, D) within a page
  Write queue after flushing A, B, C, D from the cache: Dc D Cc C Bc B Ac A
  Ac: M m1' m2 m3 m4 … m64
  Bc: M m1' m2' m3 m4 … m64
  Cc: M m1' m2' m3' m4 … m64
  Dc: M m1' m2' m3' m4' … m64
• Coalescing counter writes in the write queue
Locality-aware Counter Write Coalescing (CWC)
• Coalescing counter writes in the write queue
  Without CWC, the write queue holds: Dc D Cc C Bc B Ac A
  With CWC, the write queue holds: Dc D C B A
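Because each later counter write to the same counter line already carries the earlier updates, an older pending counter write can be dropped when a newer one arrives. The sketch below models that idea with a plain Python list as the write queue; the entry layout and the function name are assumptions, and the hardware mechanism for locating and removing queue entries is not modeled.

```python
def write_page_lines(names, coalesce: bool) -> list:
    """Flush the given lines of one page; return the resulting write queue, oldest first."""
    queue = []
    minors = [0] * 64          # the 64 per-line counters of this page's counter line
    page = 0                   # all lines belong to one page in this example
    for slot, name in enumerate(names):
        minors[slot] += 1      # the write bumps this line's counter
        if coalesce:
            # Drop any pending, older write to the same counter line.
            queue = [e for e in queue if e["kind"] != "ctr" or e["page"] != page]
        queue.append({"kind": "data", "label": name})
        queue.append({"kind": "ctr", "page": page, "label": name + "c",
                      "minors": tuple(minors)})
    return queue


without_cwc = [e["label"] for e in write_page_lines("ABCD", coalesce=False)]
with_cwc_queue = write_page_lines("ABCD", coalesce=True)
with_cwc = [e["label"] for e in with_cwc_queue]
```

Listed oldest first, `without_cwc` is `['A', 'Ac', 'B', 'Bc', 'C', 'Cc', 'D', 'Dc']` and `with_cwc` is `['A', 'B', 'C', 'D', 'Dc']`, matching the two queues on the slide (which are drawn newest first). The surviving `Dc` entry carries all four counter updates.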
Performance Evaluation
• Model NVM using gem5 and NVMain
Comparisons:
  Unsec: an unencrypted NVM
  WB: an ideal write-back scheme
  WT: a write-through scheme
  WT+CWC: a write-through scheme with CWC
  WT+XBank: a write-through scheme with XBank
  SuperMem
Benchmarks:
  Array: randomly swapping entries
  Queue: randomly enqueueing and dequeueing
  B-tree: inserting random KVs
  Hash Table: inserting random KVs
  RB-tree: inserting random KVs
Transaction Execution Latency – Single-core
[Figure: normalized execution latency of Unsec, WB, WT, WT+CWC, WT+XBank, and SuperMem on Array, Queue, B-tree, Hash Table, and RB-tree; panels: transaction size 256B and transaction size 4KB]
• SuperMem achieves performance comparable to a secure NVM with an ideal write-back cache (WB)
Transaction Execution Latency – Multi-core
[Figure: normalized execution latency of Unsec, WB, WT, WT+CWC, WT+XBank, and SuperMem on Array, Queue, B-tree, Hash Table, and RB-tree; panels: 2 programs and 8 programs]
• SuperMem achieves performance comparable to a secure NVM with an ideal write-back cache (WB)
The Number of Write Requests
[Figure: normalized number of writes for Unsec, WB, WT, and SuperMem on Array, Queue, B-tree, Hash Table, and RB-tree; panels: transaction size 256B, 1KB, and 4KB]
• SuperMem reduces write requests by up to 50% using the CWC scheme
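A back-of-the-envelope model shows where a bound of about 50% comes from. It is a best-case sketch, not the measured result: it assumes that all written lines fall in one page and that every older counter write is still pending in the write queue when the next one arrives, so only one counter write remains. The function names are hypothetical.

```python
def write_requests(lines_in_page: int, coalesce: bool) -> int:
    """Data writes plus counter writes for lines that share one counter line."""
    data_writes = lines_in_page
    counter_writes = 1 if coalesce else lines_in_page
    return data_writes + counter_writes


def reduction(lines_in_page: int) -> float:
    """Fraction of write requests removed by coalescing in this best case."""
    return 1 - write_requests(lines_in_page, True) / write_requests(lines_in_page, False)


four_lines = (write_requests(4, False), write_requests(4, True))     # 256B of lines
full_page = (write_requests(64, False), write_requests(64, True))    # 4KB of lines
```

For 4 lines the count drops from 8 to 5, and for a full 64-line page from 128 to 65, so the reduction approaches but never reaches 50% under these assumptions.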
Conclusion
Problem
• Memory encryption incurs a crash inconsistency issue
Existing Work
• Uses a write-back counter cache
• Requires a large battery backup, software-level modification, or error correction
Our Solution
• SuperMem: exploits a write-through counter cache
• No large battery backup, software-level modification, or error correction
• Counter write coalescing to reduce writes
• Cross-bank counter storage to speed up writes
Thanks! Q&A