Understanding the Security of Discrete GPUs
Zhiting Zhu (1), Sangman Kim (1), Yuri Rozhanski (2), Yige Hu (1), Emmett Witchel (1), Mark Silberstein (2)
(1) The University of Texas at Austin
(2) Technion-Israel Institute of Technology
Outline
● Can GPUs improve the security of a computing system?
  ○ PixelVault
  ○ Attacking PixelVault
● Can GPUs subvert the security of a computing system?
  ○ GPU driver attack
  ○ GPU microcode attack
  ○ IOMMU mitigation
Can GPUs improve the security of a computing system?
Motivation: dedicated hardware resources
● Independent computational resources
● Independent memory system
● Physically partitioned from the CPU
[Diagram: CPU connected over the PCIe bus to a GPU with multiple SMs, registers, and global memory]
Can discrete GPUs enhance the security of a computing system?
[Diagram: CPU connected over the PCIe bus to a GPU; secret data is held in GPU registers and in GPU global memory]
PixelVault (CCS 14)
● Runs AES/RSA encryption on the GPU.
● Encryption (Enc) keys are encrypted by a master key and stored in GPU memory.
● The master key is stored in a GPU register.
● Prevents any adversary from accessing the registers.
[Diagram: plaintext and ciphertext pass between the CPU and the GPU; the registers are labeled with the master key and an Enc key, global memory with the Enc key]
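The two-level key hierarchy described on this slide can be illustrated with a small sketch. This is a toy model, not PixelVault's code: the names `wrap`, `unwrap`, and `global_memory` are hypothetical, and an HMAC-SHA256 keystream is swapped in for AES so that the example needs only the Python standard library. It must not be used as real encryption.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (HMAC-SHA256 in counter mode), a stand-in for AES."""
    out = b""
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return out[:length]


def wrap(master_key: bytes, enc_key: bytes) -> bytes:
    """Encrypt an Enc key under the master key; returns nonce + ciphertext."""
    nonce = os.urandom(16)
    stream = _keystream(master_key, nonce, len(enc_key))
    return nonce + bytes(a ^ b for a, b in zip(enc_key, stream))


def unwrap(master_key: bytes, blob: bytes) -> bytes:
    """Recover an Enc key from its wrapped form."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(master_key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


# The master key plays the role of the value held only in a GPU register.
master_key = os.urandom(32)
enc_key = os.urandom(16)

# "Global memory" only ever holds the wrapped (encrypted) Enc key.
global_memory = {"enc_key_0": wrap(master_key, enc_key)}

# Only code that holds the master key can recover the Enc key.
recovered = unwrap(master_key, global_memory["enc_key_0"])
```

In this model, reading `global_memory` alone reveals nothing useful, which is why the design depends on the master key in the register staying out of the attacker's reach.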
Threat model
● The system boots from a trusted configuration and sets up the PixelVault execution environment on the GPU.
● After setup, the attacker may have full control over the platform.
  ○ Can execute code at any privilege level.
  ○ Has access to all platform hardware.
● Attack goal: steal keys from the GPU.
Threat model
Security guarantees depend on several NVIDIA GPU characteristics.
● Some of these characteristics are well known and confirmed.
● Some are experimentally validated.
● Others are only assumed to be correct.
  ○ Experimentally verify them.
Assumptions about NVIDIA GPUs
Assumption | PixelVault safety property | Attack
A running GPU kernel cannot be stopped and debugged. | Secures register contents from a CPU-based debugger. | Debugger API.
GPU registers can't be read after kernel termination. | Cannot get the master key after kernel termination. | Concurrent kernel.
Can't replace code of a GPU kernel executing from the instruction cache. | Cannot replace PixelVault code without stopping the kernel. | Flush the instruction cache using MMIO registers.
Assumption: A running GPU kernel cannot be stopped and debugged.
CUDA 4.2:
● Compiled with explicit debug support.
● Insert breakpoints before the kernel is running.
CUDA 5.0 and newer:
● Stop a running kernel and inspect all GPU registers via the debugger API.
CUDA Stream
● A sequence of operations on a GPU device.
● Every CUDA kernel is invoked on an independent stream.
● Streams share the same address space.
PixelVault
[Diagram: on the GPU, a computation stream and a data transfer stream, each drawn with a register block; the CPU sits alongside]
Assumption: GPU registers can't be read after kernel termination.
Attack: If GPU kernel B is invoked in parallel with running kernel A, A's register state can be retrieved using the debugger API even after A terminates, as long as B is still running.
[Diagram: streams A and B, each with a register block, shown before and after A terminates; the debugger API reads the registers]
Loading a program into the GPU
[Diagram build: CPU, chipset, and GPU connected by the PCIe bus. The program starts on the CPU side, is copied into GPU global memory, and is then loaded into the GPU instruction cache.]
If the CPU writes to GPU instructions in memory while the GPU is running
[Diagram: the CPU writes a program over the PCIe bus into GPU global memory while the GPU keeps running the program copy held in its instruction cache]
No public API for flushing the instruction cache.
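The role of the instruction cache in this assumption, and why a flush defeats it, can be pictured with a toy model. The sketch below is only an illustration under simplifying assumptions: the class `ToyGPU` and its methods are hypothetical, the cache holds a whole program instead of cache lines, and `flush_icache` stands in for the writes to MMIO registers named as the attack.

```python
class ToyGPU:
    """Toy model: global memory plus an instruction cache filled at launch."""

    def __init__(self):
        self.global_memory = {}  # program name -> code, writable by the CPU
        self.icache = {}         # program name -> cached copy of the code

    def load_program(self, name, code):
        # The CPU copies (or overwrites) a program in GPU global memory.
        self.global_memory[name] = code

    def launch(self, name):
        # Launching a kernel pulls its code into the instruction cache.
        self.icache[name] = self.global_memory[name]

    def step(self, name):
        # A running kernel fetches from the cache; on a miss it refills
        # the cache from global memory.
        if name not in self.icache:
            self.icache[name] = self.global_memory[name]
        return self.icache[name]

    def flush_icache(self):
        # Hypothetical stand-in for the MMIO-triggered flush.
        self.icache.clear()


gpu = ToyGPU()
gpu.load_program("pixelvault", "original code")
gpu.launch("pixelvault")

# The CPU overwrites the instructions in memory while the kernel runs.
gpu.load_program("pixelvault", "attacker code")
before_flush = gpu.step("pixelvault")  # still served from the cache

# Once the cache is flushed, the next fetch picks up the new code.
gpu.flush_icache()
after_flush = gpu.step("pixelvault")
```

In the model, the overwrite in memory has no effect until the flush, which mirrors why the safety property holds only as long as no party can flush the cache.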
Discussion
● Security guarantees rely on proprietary hardware and software that are poorly documented in public, often on purpose.
  ○ Some MMIO registers that flush the GPU instruction cache are not documented as flushing the cache.
  ○ Private debugger API.
● Manufacturers are free to change what is implemented in software and what is implemented in hardware across generations.
  ○ Debugger API.
● Manufacturers can change the architecture in ways that invalidate the security of GPU-based systems.
● Discrete GPUs cannot enhance the security of the computing system.
GPU as a host for stealthy malware
1. Threat model
2. GPU driver attack
3. GPU microcode attack
4. IOMMU mitigation