Microarchitectural Attacks: Protecting Cloud Accelerators
By Ahmad “Daniel” Moghimi, PhD Candidate, Worcester Polytechnic Institute (WPI), @danielmgmi
OUTLINE
▪ Summary of Recent Contributions:
  ▪ Microarchitecture → MemJam
  ▪ Intel SGX → CacheZoom
  ▪ Intel EPID → CacheQuote
  ▪ Speculation → Spoiler
  ▪ Mitigation → MicroWalk
▪ Shared FPGA-CPU Hardware Security
▪ Proposal
▪ Lab Equipment/Setup
▪ Ongoing Work
Microarchitecture (Memory)
μ Arch Attacks: Data Dependency
1  add %ebx, %eax
2  sub %eax, %edx
3  xor %ecx, %ecx
4  add %eax, %edi
5  sub %ecx, %edi
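The data dependencies among the five instructions on this slide can be worked out mechanically. The following is a minimal sketch, not taken from the talk: it assumes AT&T operand order (`op src, dst`) and a simplified model in which every two-operand ALU instruction reads both operands and writes the destination. The names `PROGRAM` and `raw_dependencies` are illustrative.

```python
import re

# The five instructions from the slide, in AT&T syntax: "op src, dst".
PROGRAM = [
    "add %ebx, %eax",
    "sub %eax, %edx",
    "xor %ecx, %ecx",
    "add %eax, %edi",
    "sub %ecx, %edi",
]

def raw_dependencies(program):
    """Map each 1-based instruction number to the earlier instructions it reads from."""
    last_writer = {}  # register -> number of the most recent instruction that wrote it
    deps = {}
    for num, text in enumerate(program, start=1):
        src, dst = re.findall(r"%(\w+)", text)
        # Simplifying assumption: the instruction reads src and dst, then writes dst.
        deps[num] = {last_writer[r] for r in (src, dst) if r in last_writer}
        last_writer[dst] = num
    return deps

DEPS = raw_dependencies(PROGRAM)
for num, producers in DEPS.items():
    print(num, PROGRAM[num - 1], "depends on", sorted(producers))
```

Under this model, instructions 2 and 4 wait on instruction 1 (through `%eax`), and instruction 5 waits on instructions 3 and 4, while instruction 3 can issue independently.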
μ Arch Attacks: Pipelined Memory Exec
1  add %ebx, %eax
2  sub %eax, %edx
3  xor %ecx, %ecx
4  add %eax, %edi
5  sub %ecx, %edi
Stages: Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Write Back (WB)
[Figure: pipeline diagram built up over four animation steps, showing the five instructions overlapping in the IF, ID, EX, and WB stages]
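The overlap of instructions across the four stages can be tabulated with a short sketch. It assumes an idealized in-order pipeline with one instruction entering per cycle and no stalls, which is simpler than real hardware; `pipeline_schedule` is a name made up for this illustration.

```python
STAGES = ["IF", "ID", "EX", "WB"]

def pipeline_schedule(n_instructions, stages=STAGES):
    """Return rows[i][c]: the stage instruction i occupies in cycle c, or '' if none."""
    total_cycles = n_instructions + len(stages) - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows

SCHEDULE = pipeline_schedule(5)
for i, row in enumerate(SCHEDULE, start=1):
    print(i, " ".join(f"{cell:2}" for cell in row))
```

In this idealized model, five instructions finish in 5 + 4 - 1 = 8 cycles instead of 20, which is why dependencies that force a stall or a replay become visible in timing.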
μ Arch Attacks: 4K Aliasing False Dependency
▪ Memory loads/stores are executed out of order and speculatively
▪ The dependency is verified after the execution!
    mov %eax, (%ebx)    (store)
    mov (%ecx), %edx    (load)
[Figure: the store and the load both execute, then the check "Dependent?" answers "Yes"]
▪ 4K Aliasing: Addresses that are a multiple of 4K apart (same low 12 bits) are assumed dependent
▪ The load and its dependent instructions are re-executed because of the false dependency
▪ Virtual-to-physical address translation → Memory disambiguation
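The aliasing condition can be written as a one-line predicate. This is a sketch of the rule as stated on the slide (two different addresses whose low 12 bits match), not a model of any specific processor; `is_4k_aliased` is an illustrative name.

```python
PAGE_OFFSET_MASK = 0xFFF  # low 12 bits of an address

def is_4k_aliased(store_addr, load_addr):
    """True if the two addresses differ but share the same low 12 bits."""
    same_offset = (store_addr & PAGE_OFFSET_MASK) == (load_addr & PAGE_OFFSET_MASK)
    return same_offset and store_addr != load_addr

print(is_4k_aliased(0x1000, 0x2000))  # same page offset, different pages
print(is_4k_aliased(0x1000, 0x2004))  # different page offsets
```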
μ Arch Attacks – Hyperthreading 4K Aliasing
[Figure, four animation steps: one core with two hyperthreads, Thread A and Thread B. Thread B repeatedly executes and times loads from 0xFECD1 through 0xFECD8. In the first step Thread A is idle. In the following steps Thread A concurrently issues repeated stores to 0x12ABCDEF, then to 0x12ABC200, then to 0x12ABC.]
MemJam
MemJam – Intra Cache Line Resolution
[Figure: address bits. Least 12 bits (Virtual Address = Physical Address); rest of the bits (Virtual != Physical). Labels: L1 Cache Attacks, L2/LLC Cache Attacks]
▪ Conflicted intra-cache line leakage (4-byte granularity)
▪ Higher time correlates → Memory accesses with the same bits 3 to 12
▪ 4 bits of intra-cache-line leakage
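To see where the 4 bits come from, an address can be split into the fields the slide refers to. The sketch below assumes 64-byte cache lines and 4 KiB pages, and it reads the slide's "bits 3 to 12" as 1-indexed, that is, bits 2 to 11 when counting from zero. Both function names are illustrative.

```python
def split_address(addr):
    """Split an address into page offset, cache line within the page, and 4-byte word."""
    return {
        "page_offset": addr & 0xFFF,         # low 12 bits, same in virtual and physical
        "line_in_page": (addr >> 6) & 0x3F,  # which of the 64 cache lines in the page
        "word_in_line": (addr >> 2) & 0xF,   # which 4-byte word inside the line (4 bits)
    }

def same_word_conflict(a, b):
    """Model of a 4-byte-granular 4K conflict: bits 2..11 (0-indexed) match."""
    return ((a ^ b) & 0xFFC) == 0

print(split_address(0x12ABC234))
```

A classic cache attack resolves only `line_in_page`; the additional `word_in_line` field is the 4-bit intra-cache-line information the slide describes.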
MemJam Attack
[Figure: a CPU with two cores, each with two hyperthreads (HT). One thread repeatedly executes a load/compute sequence ("Execute", "Execute Again") alongside the Encryption Service.]
▪ Higher time if there are more 4K conflicts
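The relation "more 4K conflicts, higher time" can be illustrated with a toy cost model. Everything here is an assumption made for illustration: the `base` and `penalty` cycle counts are invented, the victim offsets are hypothetical, and real measurements are noisy rather than exact.

```python
def modeled_time(attacker_offset, victim_offsets, base=100, penalty=5):
    """Toy cost model: each victim access whose bits 2..11 match adds a fixed penalty."""
    conflicts = sum(1 for v in victim_offsets if ((v ^ attacker_offset) & 0xFFC) == 0)
    return base + penalty * conflicts

def most_conflicted_word(victim_offsets, line_base=0x200):
    """Scan the 16 four-byte words of one cache line and return the slowest one."""
    candidates = [line_base + 4 * w for w in range(16)]
    return max(candidates, key=lambda off: modeled_time(off, victim_offsets))

victim = [0x224, 0x226, 0x225, 0x210, 0x224]  # hypothetical victim page offsets
print(hex(most_conflicted_word(victim)))
```

In this model the attacker sweeps its own address across the words of a line and keeps the one with the highest time, which corresponds to the word the victim touched most often.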
Constant time AES – Safe2Encrypt_RIJ128
▪ Scatter-gather implementation of AES
▪ 256-byte S-Box – 4 cache lines
▪ Cache independent access pattern
▪ Implemented and distributed as part of Intel products
  ▪ Intel SGX Linux Software Development Kit (SDK)
  ▪ Intel IPP Cryptography Library
[Figure: the S-Box occupies 4 cache lines of 64 bytes each (A, B, C, D); an S-Box lookup gathers bytes from the lines into a local buffer]
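A scatter-gather lookup of this kind can be sketched as follows. This is not Intel's code: it is a minimal model that reads one byte from each of the 4 lines at the same offset and keeps only the wanted one with a mask. `TABLE` is a placeholder table, not the AES S-Box.

```python
LINE_SIZE = 64
NUM_LINES = 4

def scatter_gather_lookup(table, index):
    """Read one byte from each of the 4 lines and keep only the one that is wanted.

    Returns (value, touched), where touched lists every table position read.
    """
    offset = index % LINE_SIZE  # position inside a cache line
    wanted_line = index // LINE_SIZE
    value, touched = 0, []
    for line in range(NUM_LINES):
        pos = line * LINE_SIZE + offset
        touched.append(pos)
        mask = -(line == wanted_line) & 0xFF  # 0xFF for the wanted line, else 0x00
        value |= table[pos] & mask
    return value, touched

TABLE = [(7 * i + 3) % 256 for i in range(256)]  # placeholder table, not the AES S-box
print(scatter_gather_lookup(TABLE, 0x9A))
```

Every lookup touches all 4 lines, so the pattern is independent of the index at cache-line granularity. The offset inside the line still equals `index % 64`, which is the kind of intra-line information a 4-byte-granular channel could observe.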
MemJam Attack on Safe2Encrypt_RIJ128
[Figure: 4 cache lines of 64 bytes each and the local buffer]
Intel SGX
INTEL SOFTWARE GUARD EXTENSIONS (SGX)
▪ Trusted Execution Environment (TEE)
▪ Enclave: Hardware protected user-level software module
  ▪ Loaded by the user program
  ▪ Mapped by the Operating System
  ▪ Authenticated and Encrypted by CPU
▪ Memory accesses are protected by the hardware
MemJam Attack on SGX
CacheZoom: Controlled Cache Attack on SGX
1. Isolate the target and victim cache
2. Stabilize the processor frequency
3. Perform the attack in small execution steps by interrupting the victim
4. Measure and filter the remaining noise
CacheZoom: Interrupted Cache Attack
[Figure, animation: the L1D cache sets 0 to 63, with the victim's program counter (PC) advancing a little on each pass]
Step 1: Attacker primes all the L1D sets
Step 2: Victim executes some code
Step 3: Attacker interrupts the execution pipeline
Step 4: Attacker probes the access times → Go to step 1
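The four-step loop can be simulated with a very small model. The sketch below makes strong simplifying assumptions that are not from the talk: one line per set (a real L1D is set-associative), no noise, and an interrupt modeled simply as the boundary between victim steps. The victim addresses are hypothetical.

```python
NUM_SETS = 64   # L1D sets shown on the slide (0..63)
LINE_SIZE = 64  # bytes per cache line

def set_index(addr):
    """L1D set selected by an address under the assumed geometry."""
    return (addr // LINE_SIZE) % NUM_SETS

def prime_probe_trace(victim_steps):
    """For each interrupted victim step, return the sorted L1D sets the victim touched."""
    trace = []
    for accesses in victim_steps:
        owner = ["attacker"] * NUM_SETS        # Step 1: prime every set
        for addr in accesses:                  # Step 2: victim runs briefly
            owner[set_index(addr)] = "victim"  # and evicts the attacker's line
        # Step 3: the interrupt is modeled by the end of this victim step.
        evicted = [s for s in range(NUM_SETS) if owner[s] != "attacker"]  # Step 4: probe
        trace.append(evicted)
    return trace

steps = [[0x1000, 0x1040], [0x10C0], []]  # hypothetical victim addresses per step
print(prime_probe_trace(steps))
```

The shorter each victim step is, the fewer sets change between prime and probe, which is why interrupting the victim frequently sharpens the trace.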
CacheQuote
CacheQuote Attack
▪ Quoting Enclave:
  ▪ Built-in enclave by Intel that implements the EPID signature scheme
  ▪ Attests the integrity of user-provided enclaves
▪ EPID implementation was not constant-time
CacheQuote Attack
▪ Loop iteration leaks Leading Zero Bits
▪ CacheZoom to accurately measure
▪ Feed the short vectors to a lattice and
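How a loop iteration count can reveal leading zero bits is shown by the following sketch. It is not the EPID implementation: `loop_iterations` is a hypothetical bit-by-bit loop that starts at the highest set bit of a value, so a shorter value yields fewer iterations.

```python
def loop_iterations(scalar):
    """Iterations of a hypothetical bit-by-bit loop that starts at the top set bit."""
    count = 0
    while scalar:
        scalar >>= 1
        count += 1
    return count

def leading_zero_bits(scalar, width):
    """Leading zeros of scalar when viewed as a width-bit value."""
    return width - loop_iterations(scalar)

print(leading_zero_bits(0x0F, 8))
```

An observer who can count iterations, for example through a cache trace, learns `leading_zero_bits` without ever seeing the value itself.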
Memory Speculation
Speculative Memory Accesses
Spoiler on Spoiler Attack
MicroWalk: Finding μ Arch Sources in Binaries
▪ Detecting leakages based on binary instrumentation and mutual information analysis
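Mutual information analysis can be illustrated with an empirical estimate over paired samples of a secret and an observed trace. This is a generic textbook-style sketch, not MicroWalk's implementation; the trace labels "A" and "B" are placeholders for whatever an instrumentation tool would record.

```python
from collections import Counter
from math import log2

def mutual_information(secrets, observations):
    """Empirical mutual information I(S; O) in bits from paired samples."""
    n = len(secrets)
    p_s = Counter(secrets)
    p_o = Counter(observations)
    p_so = Counter(zip(secrets, observations))
    return sum(
        (c / n) * log2((c / n) / ((p_s[s] / n) * (p_o[o] / n)))
        for (s, o), c in p_so.items()
    )

secrets = [0, 1, 0, 1, 0, 1, 0, 1]
constant_trace = ["A"] * 8                               # same trace for every secret
leaky_trace = ["A" if s == 0 else "B" for s in secrets]  # trace depends on the secret

print(mutual_information(secrets, constant_trace))
print(mutual_information(secrets, leaky_trace))
```

A trace that is identical for every secret carries no information about it, while a trace that fully determines a one-bit secret carries one bit.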
Accelerators in the Cloud
Side-channel Threats: Shared FPGA-CPU Platforms
▪ FPGAs on the cloud can boost applications
  ▪ Optimized application-specific hardware configuration
  ▪ e.g., real-time Artificial Intelligence
▪ New Attack Surface:
  ▪ Accelerator Function Units (AFUs) placed on the FPGA can be used to interact with the CPU or other AFUs for malicious purposes.
    ▪ AFU to AFU Attack
    ▪ AFU to HPS Attack
    ▪ AFU to CPU Attack
    ▪ CPU to AFU Attack
    ▪ Across VMs?
Shared FPGA-CPU Platforms
Attack Vectors
▪ Rowhammer
▪ DMA/IOMMU
▪ Cache Attacks
▪ Trojan Bitstreams
▪ FPGA-centric Attacks
▪ Cold Boot
What is interesting about FPGA-CPU in the Cloud?
▪ Infancy, Attack/Defense Playground (Intel SGX in 2015)
▪ Customizable Hardware → More Devastating Attacks
  ▪ E.g., design your own timers, direct access to memory interface, etc.
▪ Complex Threat Model