Understanding the Security of Discrete GPUs


  1. Understanding the Security of Discrete GPUs
     Zhiting Zhu¹, Sangman Kim¹, Yuri Rozhanski², Yige Hu¹, Emmett Witchel¹, Mark Silberstein²
     ¹The University of Texas at Austin, ²Technion - Israel Institute of Technology

  2. Outline
     ● Can GPUs improve the security of a computing system?
       ○ PixelVault
       ○ Attacking PixelVault
     ● Can GPUs subvert the security of a computing system?
       ○ GPU driver attack
       ○ GPU microcode attack
       ○ IOMMU mitigation

  6. Can GPUs improve the security of a computing system?
     Motivation: dedicated hardware resources
     ● Independent computational resources
     ● Independent memory system
     ● Physically partitioned from the CPU
     [Diagram: CPU connected over the PCIe bus to a GPU containing SMs, registers, and global memory]

  10. Can discrete GPUs enhance the security of a computing system?
      [Diagram: CPU and GPU connected by the PCIe bus; secret data held in GPU registers and in GPU global memory]

  16. PixelVault (CCS '14)
      ● Runs AES/RSA encryption on the GPU.
      ● Encryption (Enc) keys are encrypted by a master key and stored in GPU global memory.
      ● The master key is stored in a GPU register.
      ● Prevents any adversary from accessing the registers.
      [Diagram: the CPU sends plaintext to the GPU and receives ciphertext; the master key sits in a GPU register, the Enc keys in global memory]
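
The key hierarchy on this slide can be sketched as a small simulation. This is only an illustration under stated assumptions: the class `ToyGpu`, its methods, and the SHA-256 based `xor_crypt` stream cipher are hypothetical stand-ins chosen so the example runs with the Python standard library. PixelVault itself uses AES/RSA on real GPU hardware, and nothing here is its actual code.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustration only, not AES)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the toy keystream; the same call encrypts and decrypts."""
    stream = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyGpu:
    """Hypothetical model: a register holding the master key, plus global memory."""

    def __init__(self) -> None:
        # Models the GPU register that is assumed to be unreadable from outside.
        self._register_master_key = secrets.token_bytes(32)
        # Models GPU global memory, which the attacker is assumed able to read.
        self.global_memory: dict[str, tuple[bytes, bytes]] = {}

    def install_key(self, name: str, enc_key: bytes) -> None:
        """Wrap an Enc key under the master key and store it in global memory."""
        nonce = secrets.token_bytes(16)
        wrapped = xor_crypt(self._register_master_key, nonce, enc_key)
        self.global_memory[name] = (nonce, wrapped)

    def encrypt(self, name: str, nonce: bytes, plaintext: bytes) -> bytes:
        """Unwrap the Enc key inside the 'GPU' and return only the ciphertext."""
        wrap_nonce, wrapped = self.global_memory[name]
        enc_key = xor_crypt(self._register_master_key, wrap_nonce, wrapped)
        return xor_crypt(enc_key, nonce, plaintext)
```

In this model an attacker who dumps `global_memory` sees only wrapped keys, so the design stands or falls with the secrecy of the register contents, which is the assumption the following slides attack.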

  18. Threat model
      ● The system boots from a trusted configuration and sets up the PixelVault execution environment on the GPU.
      ● After setup, the attacker can have full control over the platform.
        ○ Can execute code at any privilege level.
        ○ Has access to all platform hardware.
      ● Attack goal: steal keys from the GPU.

  19. Threat model
      Security guarantees depend on several NVIDIA GPU characteristics.
      ● Some of these characteristics are well known and confirmed.
      ● Some are experimentally validated.
      ● Others are only assumed to be correct.
        ○ Experimentally verify these.

  20. Assumptions about NVIDIA GPUs
      Assumption | PixelVault safety property | Attack
      A running GPU kernel cannot be stopped and debugged. | Secure register contents from CPU-based debugger. | Debugger API.
      GPU registers can’t be read after kernel termination. | Cannot get the master key after kernel termination. | Concurrent kernel.
      Can’t replace code of GPU kernel executing from instruction cache. | Cannot replace PixelVault code without stopping the kernel. | Flush instruction cache using MMIO registers.

  22. Assumption: A running GPU kernel cannot be stopped and debugged.
      CUDA 4.2:
      ● Compiled with explicit debug support.
      ● Insert breakpoints before the kernel is running.
      CUDA 5.0 and newer:
      ● Stop a running kernel and inspect all GPU registers via the debugger API.

  24. CUDA stream
      ● A sequence of operations on a GPU device.
      ● Every CUDA kernel is invoked on an independent stream.
      ● Streams share the same address space.

  25. PixelVault
      [Diagram: on the GPU, a computation stream and a data transfer stream, each with its own registers, alongside the CPU]

  27. Assumption: GPU registers can’t be read after kernel termination.
      Attack: If GPU kernel B is invoked in parallel with running kernel A, A’s register state can be retrieved using the debugger API even after A terminates, as long as B is still running.
      [Diagram: streams A and B, each with registers; the debugger API reads the registers]

  28-31. Loading a program into the GPU
      [Diagram sequence: the program starts on the CPU side, is copied through the chipset and PCIe bus into GPU global memory, and then also appears in the GPU instruction cache]

  32-33. If the CPU writes to GPU instructions in memory while the GPU is running
      [Diagram sequence: the CPU writes a program over the PCIe bus into GPU global memory while a copy of the program is already in the GPU instruction cache]

  34. No public API for flushing the instruction cache.
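
Why the instruction cache matters can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyGpuWithICache` and its methods are hypothetical names, programs are modelled as Python callables, and `flush_icache` merely stands in for the undocumented MMIO register write mentioned in the talk. It is not a real driver or CUDA interface.

```python
from typing import Callable


class ToyGpuWithICache:
    """Hypothetical model of GPU global memory plus an instruction cache."""

    def __init__(self) -> None:
        self.global_memory: dict[str, Callable[[], str]] = {}
        self.icache: dict[str, Callable[[], str]] = {}

    def load_program(self, name: str, code: Callable[[], str]) -> None:
        """The CPU writes program code into GPU global memory."""
        self.global_memory[name] = code

    def run(self, name: str) -> str:
        """Execute from the instruction cache, filling it from memory on a miss."""
        if name not in self.icache:
            self.icache[name] = self.global_memory[name]
        return self.icache[name]()

    def flush_icache(self) -> None:
        """Stand-in for the MMIO write that flushes the instruction cache."""
        self.icache.clear()


gpu = ToyGpuWithICache()
gpu.load_program("kernel", lambda: "original")
print(gpu.run("kernel"))   # prints "original" and fills the cache

# The CPU overwrites the code in global memory while the kernel is cached.
gpu.load_program("kernel", lambda: "patched")
print(gpu.run("kernel"))   # still prints "original": the cached copy is used

gpu.flush_icache()
print(gpu.run("kernel"))   # prints "patched": the code is refetched from memory
```

In this model the overwritten code takes effect only after a flush, which is why the assumption that no flush mechanism exists carries so much weight for PixelVault.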

  36-39. Discussion
      ● Security guarantees rely on proprietary hardware and software whose public documentation is poor, often on purpose.
        ○ Some MMIO registers that flush the GPU instruction cache are not documented as flushing the cache.
        ○ Private debugger API.
      ● Manufacturers are free to change what is implemented in software and what is implemented in hardware across generations.
        ○ Debugger API
      ● Manufacturers can change the architecture in ways that invalidate the security of GPU-based systems.
      ● Discrete GPUs cannot enhance the security of the computing system.

  40. GPU as a host for stealthy malware
      1. Threat model
      2. GPU driver attack
      3. GPU microcode attack
      4. IOMMU mitigation
