PixelVault: Using GPUs for Securing Cryptographic Operations




  1. PixelVault: Using GPUs for Securing Cryptographic Operations
     Giorgos Vasiliadis (gvasil@ics.forth.gr)
     Elias Athanasopoulos (elathan@ics.forth.gr)
     Michalis Polychronakis (mikepo@cs.columbia.edu)
     Sotiris Ioannidis (sotiris@ics.forth.gr)

  2. How SSL/TLS works
     • Secure Sockets Layer (SSL/TLS) is a de facto standard for secure communication
       – Authentication, confidentiality, integrity
     [Diagram: client/server handshake. Client initiates handshake; server responds with its certificate; client sends secret (RSA decryption on the server); server and client create keys; secure data exchange (AES cipher).]
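The key-exchange step in the diagram above can be sketched as follows. This is a toy illustration only, assuming textbook RSA with tiny parameters and a hash-based key derivation; the function names and the derivation are hypothetical and are not the actual TLS key schedule.

```python
import hashlib

# Toy RSA parameters (textbook-sized and insecure; for illustration only).
# N = 61 * 53 and E * D = 1 (mod 3120).
N, E, D = 3233, 17, 2753


def client_send_secret(premaster: int) -> int:
    """Client encrypts the secret with the server's public key (E, N)."""
    return pow(premaster, E, N)


def server_recover_secret(ciphertext: int) -> int:
    """Server-side RSA decryption; this step needs the private exponent D."""
    return pow(ciphertext, D, N)


def derive_session_key(premaster: int) -> bytes:
    """Hypothetical derivation of a 16-byte symmetric key from the secret."""
    return hashlib.sha256(premaster.to_bytes(2, "big")).digest()[:16]
```

The private exponent used in `server_recover_secret` and the symmetric key returned by `derive_session_key` are the kinds of secrets the following slides are concerned with.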

  3. Motivation
     • Secret keys may remain unencrypted in CPU registers, RAM, etc.
       – Memory attacks
       – DMA/Firewire attacks
       – Heartbleed attack
       – …

  4. PixelVault Overview
     • Runs encryption securely outside CPU/RAM
     • Only on-chip memory of GPU is used as storage
     • Secret keys are never observed from host
     [Diagram: the x86 host CPU passes PLAINTEXT to the graphics card, which performs ENCRYPT and returns CIPHERTEXT.]

  5. Cryptographic Processing with GPUs
     • GPU-accelerated SSL
       – [CryptoGraphics, CT-RSA’05]
       – [Harrison et al., Sec’08]
       – [SSLShader, NSDI’11]
       – …
     • High-performance
     • Cost-effective
     [Diagram: SSH server, Web server, IMAP server → OpenSSL stub → GPU]

  6. Cryptographic Processing with GPUs
     • GPU-accelerated SSL
       – [CryptoGraphics, CT-RSA’05]
       – [Harrison et al., Sec’08]
       – [SSLShader, NSDI’11]
       – …
     • High-performance
     • Cost-effective
     [Diagram: SSH server, Web server, IMAP server → OpenSSL stub → GPU]
     Can we also make it secure?

  7. Implementation Challenges
     • How to isolate GPU execution?
     • Who holds the keys?
     • Where is the code?

  8. Implementation Challenges
     • How to isolate GPU execution?
     • Who holds the keys?
     • Where is the code?

  9. GPU as a coprocessor
     • Typically handled by the host
       – Load parameters, launch GPU kernel, transfer data, etc.
     • Not secure for our purposes
       – Crypto keys have to be transferred every time

  10. Autonomous GPU execution
     • Force GPU kernel to run indefinitely
       – i.e., using an infinite while loop
     • Cannot rely on the typical parameter-passing execution of GPU kernels
       – Instead, we allocate a memory segment that is shared between CPU/GPU

  11. Shared Memory between CPU/GPU
     • Page-locked memory
       – Accessed by the GPU directly, via DMA
       – Cannot be swapped to disk
     • Processing requests are issued through this shared memory space
     [Diagram: SSH server, Web server, IMAP server → OpenSSL stub → Shared Memory Segment → GPU]

  12. Shared Memory between CPU/GPU
     • GPU continuously monitors the shared space for new requests
     [Diagram: SSH server, Web server, IMAP server → OpenSSL stub → Shared Memory Segment → GPU]

  13. Shared Memory between CPU/GPU
     • When a new request is available, it is transferred to the memory space of the GPU
     [Diagram: the Shared Memory Segment holds a REQUEST with fields msg#, offsets[msg#], keyIDs[msg#], msg_buf[], which is copied to the GPU.]

  14. Shared Memory between CPU/GPU
     • The request is processed by the GPU
     [Diagram: REQUEST (msg#, offsets[msg#], keyIDs[msg#], msg_buf[]) and RESPONSE (msg#, offsets[msg#], keyIDs[msg#], enc_msg_buf[]), shown with the Shared Memory Segment.]

  15. Shared Memory between CPU/GPU
     • When processing is finished, the host is notified by setting the response parameter fields accordingly
     [Diagram: the GPU writes a RESPONSE with fields msg#, offsets[msg#], keyIDs[msg#], enc_msg_buf[] into the Shared Memory Segment.]

  16. Autonomous GPU execution
     • Non-preemptive execution
     • Only the output block is being written back to host memory
     [Diagram: SSH server, Web server, IMAP server → OpenSSL stub → Shared Memory Segment → GPU (non-preemptive exec)]
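The request/response exchange described in the slides above can be simulated on the host as a rough sketch. A Python thread stands in for the never-terminating GPU kernel and a plain object stands in for the page-locked segment. The field names loosely follow the slides (`msg_count` for msg#); the `stop` flag, the function names, and the XOR "cipher" are hypothetical placeholders, not PixelVault's code.

```python
import threading
import time


class SharedSegment:
    """Stand-in for the page-locked segment shared between CPU and GPU."""

    def __init__(self):
        self.request = None   # msg_count, offsets, key_ids, msg_buf
        self.response = None  # msg_count, offsets, key_ids, enc_msg_buf
        self.stop = False     # hypothetical: lets a demo end the "infinite" loop


def toy_encrypt(data: bytes, key_id: int) -> bytes:
    """Placeholder transform (XOR with the key id); NOT real encryption."""
    return bytes(b ^ (key_id & 0xFF) for b in data)


def gpu_kernel(seg: SharedSegment) -> None:
    """Plays the GPU: poll for a request, process it, publish the response."""
    while not seg.stop:
        req = seg.request
        if req is None:
            time.sleep(0.001)  # keep monitoring the shared space
            continue
        seg.request = None
        bounds = req["offsets"] + [len(req["msg_buf"])]
        out = bytearray()
        for i, key_id in enumerate(req["key_ids"]):
            out += toy_encrypt(req["msg_buf"][bounds[i]:bounds[i + 1]], key_id)
        # Only the output buffer is written back; keys never leave this thread.
        seg.response = {"msg_count": req["msg_count"], "offsets": req["offsets"],
                        "key_ids": req["key_ids"], "enc_msg_buf": bytes(out)}


def host_submit(seg: SharedSegment, messages, key_ids) -> dict:
    """Plays the OpenSSL stub: pack messages, post a request, wait for the reply."""
    offsets, buf = [], bytearray()
    for message in messages:
        offsets.append(len(buf))
        buf += message
    seg.response = None
    seg.request = {"msg_count": len(messages), "offsets": offsets,
                   "key_ids": list(key_ids), "msg_buf": bytes(buf)}
    while seg.response is None:
        time.sleep(0.001)
    return seg.response
```

To try it, start `gpu_kernel` in a daemon thread, call `host_submit(seg, [b"abc", b"de"], [1, 2])`, then set `seg.stop = True`. Note that the host refers to keys only by ID; it never passes key material through the segment.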

  17. Implementation Challenges
     • How to isolate GPU execution?
     • Who holds the keys?
     • Where is the code?

  18. Who holds the keys?
     • GPUs contain different memory hierarchies of …
       – different sizes, and …
       – different characteristics
     [Diagram: GPU with multiprocessors 1 to N, each containing Regs, Shared Memory, Cache, and SPs; Global Memory on the GPU; CPU (Host) and Host Memory alongside.]

  19. Who holds the keys?
     • GPUs contain different memory hierarchies of …
       – different sizes, and …
       – different characteristics
     [Diagram: GPU with multiprocessors 1 to N, each containing Regs, Shared Memory, Cache, and SPs; Global Memory on the GPU; CPU (Host) and Host Memory alongside.]
     Callout: Off-chip global memory. No protection; data can be acquired by the CPU directly.

  20. Who holds the keys?
     • GPUs contain different memory hierarchies of …
       – different sizes, and …
       – different characteristics
     [Diagram: GPU with multiprocessors 1 to N, each containing Regs, Shared Memory, Cache, and SPs; Global Memory on the GPU; CPU (Host) and Host Memory alongside.]
     Callout: On-chip memories

  21. Who holds the keys?
     • GPUs contain different memory hierarchies of …
       – different sizes, and …
       – different characteristics
     [Diagram: GPU with multiprocessors 1 to N, each containing Regs, Shared Memory, Cache, and SPs; Global Memory on the GPU; CPU (Host) and Host Memory alongside.]
     Callout (shared memory): Comparable with scratchpad RAM in other architectures. Unfortunately, its contents can be acquired by a subsequent GPU kernel.

  22. Who holds the keys?
     • GPUs contain different memory hierarchies of …
       – different sizes, and …
       – different characteristics
     [Diagram: GPU with multiprocessors 1 to N, each containing Regs, Shared Memory, Cache, and SPs; Global Memory on the GPU; CPU (Host) and Host Memory alongside.]
     Callout (cache): Many different caches (L1-L3, texture, constant). Unfortunately, the data stored there cannot be managed by the programmer.

  23. Who holds the keys?
     • GPUs contain different memory hierarchies of …
       – different sizes, and …
       – different characteristics
     [Diagram: GPU with multiprocessors 1 to N, each containing Regs, Shared Memory, Cache, and SPs; Global Memory on the GPU; CPU (Host) and Host Memory alongside.]
     Callout (registers): Not fully-addressable. Reset to zero on each GPU kernel execution.

  24. Keeping secrets on GPU registers
     • Secret keys are loaded on GPU registers at an early stage of the bootstrapping phase
       – Preferably from an external storage device
     • Unfortunately, the number of available registers in current GPU models is small
       – Enough for a single/few secret keys, but what about multi-homing servers?

  25. Support for an arbitrary number of keys
     • We can use a separate KeyStore array that holds an arbitrary number of secret keys
       – Encrypted keys are stored in GPU RAM (KeyStore)
       – Each key is decrypted in registers during encryption/decryption
     [Diagram: the GPU Registers File holds the Master Key; an Enc’ed Key is copied from the KeyStore to registers, where it becomes the Dec’ed Key.]
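The KeyStore idea on this slide can be sketched as follows, as a minimal illustration rather than PixelVault's implementation. The dictionary stands in for the KeyStore array in GPU RAM and the local variable inside `use` stands in for registers. The SHA-256 keystream is a stand-in chosen so the sketch needs only the standard library; the slide does not say which cipher wraps the keys, and all names here are hypothetical.

```python
import hashlib


def keystream(master_key: bytes, key_id: int, length: int) -> bytes:
    """Toy keystream built from SHA-256 (a stand-in for a real cipher)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(master_key + key_id.to_bytes(4, "big")
                              + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_with_keystream(master_key: bytes, key_id: int, data: bytes) -> bytes:
    """Wrap or unwrap a key; XOR makes the two directions identical."""
    stream = keystream(master_key, key_id, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class KeyStore:
    """Holds any number of secret keys, each stored only in wrapped form."""

    def __init__(self, master_key: bytes):
        self._master_key = master_key  # plays the role of the register-resident key
        self.entries = {}              # plays the role of the array in GPU RAM

    def add(self, key_id: int, secret_key: bytes) -> None:
        self.entries[key_id] = xor_with_keystream(self._master_key, key_id, secret_key)

    def use(self, key_id: int, operation):
        # The unwrapped key exists only in this local variable ("registers").
        key = xor_with_keystream(self._master_key, key_id, self.entries[key_id])
        return operation(key)
```

The design point is that only the master key needs protected storage; everything in `entries` can sit in unprotected memory because it is useless without that key.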

  26. Implementation Challenges
     • How to isolate GPU execution?
     • Who holds the keys?
     • Where is the code?

  27. Where is the code?
     • GPU code is initially stored in global device memory for the GPU to execute it
       – An adversary could replace it with a malicious version
     [Diagram: Global Device Memory]

  28. Preventing code modification attacks
     • Three levels of instruction caching (icache)
       – 4KB, 8KB, and 32KB, respectively
       – Hardware-managed
     • Opportunity: Load the code to the icache, and then erase it from global device memory
       – The code runs indefinitely from the icache
       – Not possible to be flushed or modified

  29. PixelVault Crypto Suite
     • AES-128
     • RSA-1024

  30. AES Implementation
     • The key and all intermediate states are stored in GPU registers
       – 16 bytes for the key
       – 16 bytes for the round key
       – 16 bytes for the input/output block
     • The only data that is written back to global, off-chip device memory is the output block
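The budget above adds up to 48 bytes, or twelve registers if registers are assumed to be 32 bits wide. Keeping a single 16-byte round key is possible because each AES-128 round key can be derived from the previous one. The sketch below shows that standard key-schedule step; whether PixelVault derives round keys exactly this way is an assumption, and the function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list:
    """Build the AES S-box: field inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def next_round_key(prev: bytes, rcon: int) -> bytes:
    """Derive the next 16-byte round key from the previous one."""
    # RotWord + SubWord on the last word, with the round constant mixed in.
    t = [SBOX[prev[13]] ^ rcon, SBOX[prev[14]], SBOX[prev[15]], SBOX[prev[12]]]
    out = bytearray(16)
    for i in range(16):
        out[i] = prev[i] ^ (t[i] if i < 4 else out[i - 4])
    return bytes(out)


def walk_round_keys(key: bytes):
    """Yield round keys 1..10 while holding only one round key at a time."""
    round_key, rcon = key, 1
    for _ in range(10):
        round_key = next_round_key(round_key, rcon)
        yield round_key
        rcon = gf_mul(rcon, 2)
```

With the example key from FIPS-197 Appendix A.1 (`2b7e151628aed2a6abf7158809cf4f3c`), the first value yielded is `a0fafe1788542cb123a339392a6c7605`.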

  31. RSA Implementation
     • During exponentiation, each thread needs three temporary values of (n + 2) words each, where n is the size of the key in bits
       – 408 words for 1024-bit keys
     • Unfortunately, there is not always enough space to hold all three temporary values in registers
       – Store the three temporary values in shared memory (i.e. scratchpad memory)
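The slide's figures do not line up if n is taken literally in bits. One reading that does reproduce 408, offered here as an interpretation rather than something the slide states, is that n counts 32-bit words and that 408 is a byte count: 3 × (32 + 2) = 102 words = 408 bytes. The word size and the function name below are assumptions.

```python
WORD_BITS = 32  # assumption: 32-bit words; the slide does not state the word size


def rsa_temp_footprint(key_bits: int, temporaries: int = 3):
    """Return (words, bytes) needed for the temporaries of (n + 2) words each."""
    n_words = key_bits // WORD_BITS
    words = temporaries * (n_words + 2)
    return words, words * WORD_BITS // 8
```

Under these assumptions `rsa_temp_footprint(1024)` gives 102 words, i.e. 408 bytes per thread, which helps explain why the temporaries are placed in shared memory rather than registers.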

  32. Performance Evaluation
     • Hardware setup
       – 2x Intel Xeon E5520 Quad-core CPUs at 2.27GHz
       – 12GB of RAM
       – GeForce GTX480
     • Comparison against the standard OpenSSL implementation
       – No AES-NI support

  33. AES-128 CBC Performance
     [Figure: two panels, Encryption and Decryption. Y-axis: Throughput (Gbit/s); x-axis: Number of Messages (1, 16, 64, 128, 1024, 4096). Series: GPU, PixelVault, PixelVault (w/ KeyStore), CPU. Annotations: "Up to 13% overhead on GPU execution" and "Up to 20% overhead on GPU execution".]
