PixelVault: Using GPUs for Securing Cryptographic Operations
Giorgos Vasiliadis
gvasil@ics.forth.gr

Motivation
• Secret keys may remain unencrypted in CPU registers, RAM, etc.
  – Memory disclosure attacks
    • Heartbleed
  – DMA/FireWire attacks
  – Physical attacks
    • Cold-boot attacks
  – …

PixelVault Overview
• Runs encryption securely outside CPU/RAM
• Only on-chip memory of GPU is used as storage
• Secret keys are never observed from host
[Figure: the host CPU sends PLAINTEXT to the graphics card, which runs the CIPHER and returns CIPHERTEXT]

Cryptographic Processing with GPUs
• GPU-accelerated SSL
  – [CryptoGraphics, CT-RSA'05]
  – [Harrison et al., Sec'08]
  – [SSLShader, NSDI'11]
  – …
• High-performance
• Cost-effective
[Figure: SSH, Web, and IMAP servers call an OpenSSL stub that offloads work to the GPU]

Can we also make it secure?

Implementation Challenges
• How to isolate GPU execution?
• Who holds the keys?
• Where is the code?

GPU as a coprocessor
• Typically handled by the host
  – Load parameters, launch GPU program, transfer data, etc.
• Not secure for our purposes
  – Crypto keys have to be transferred every time

Autonomous GPU execution
• Force GPU program to run indefinitely
  – i.e., using an infinite while loop
• GPUs are non-preemptive
  – No other program can run at the same time
• We use a shared memory segment for communication between the CPU and the GPU

Shared Memory between CPU/GPU
• Page-locked memory
  – Accessed by the GPU directly, via DMA
  – Cannot be swapped to disk
• Processing requests are issued through this shared memory space
• GPU continuously monitors the shared space for new requests
• When a new request is available, it is transferred to the memory space of the GPU
  – REQUEST fields: msg#, offsets[msg#], keyIDs[msg#], msg_buf[]
• The request is processed by the GPU
• When processing is finished, the host is notified by setting the response parameter fields accordingly
  – RESPONSE fields: msg#, offsets[msg#], keyIDs[msg#], enc_msg_buf[]
[Figure: SSH, Web, and IMAP servers call an OpenSSL stub, which exchanges REQUEST and RESPONSE records with the GPU through the shared memory segment]

Autonomous GPU execution
• Non-preemptive execution
• Only the output block is being written back to host memory
[Figure: input and output blocks in the shared memory segment; non-preemptive execution on the GPU]

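The request/response exchange over the shared memory segment can be sketched as a small host-side simulation. This is a minimal sketch in plain Python, not the PixelVault implementation. The `SharedSegment` class, the `request_ready` and `response_ready` flags, the `max_polls` bound, and `toy_cipher` (a byte-wise XOR placeholder, not a real cipher) are assumptions made for illustration; only the field names `offsets`, `keyIDs`, `msg_buf`, and `enc_msg_buf` come from the slides. On real hardware the polling loop would live inside a GPU kernel that never returns.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SharedSegment:
    """Stand-in for the page-locked segment visible to both CPU and GPU."""
    # REQUEST fields (written by the host)
    request_ready: bool = False      # hypothetical flag, not in the slides
    offsets: List[int] = field(default_factory=list)
    keyIDs: List[int] = field(default_factory=list)
    msg_buf: bytes = b""
    # RESPONSE fields (written by the device)
    response_ready: bool = False     # hypothetical flag, not in the slides
    enc_msg_buf: bytes = b""


def toy_cipher(data: bytes, key_id: int) -> bytes:
    """Placeholder transform (XOR with a per-key byte); NOT real encryption."""
    pad = (key_id * 37 + 1) % 256
    return bytes(b ^ pad for b in data)


def host_submit(seg, messages, key_ids):
    """Host side: pack messages back to back and publish the request."""
    seg.offsets = []
    buf = bytearray()
    for m in messages:
        seg.offsets.append(len(buf))   # start offset of this message
        buf += m
    seg.msg_buf = bytes(buf)
    seg.keyIDs = list(key_ids)
    seg.response_ready = False
    seg.request_ready = True


def device_poll(seg, max_polls=1000):
    """Device side: poll the segment for work.

    The bound exists only so the simulation terminates; the autonomous
    GPU program described in the slides loops forever.
    """
    for _ in range(max_polls):
        if not seg.request_ready:
            continue
        local = seg.msg_buf                      # copy request into "device" memory
        ends = seg.offsets[1:] + [len(local)]    # end offset of each message
        out = bytearray()
        for start, end, kid in zip(seg.offsets, ends, seg.keyIDs):
            out += toy_cipher(local[start:end], kid)
        seg.enc_msg_buf = bytes(out)             # only the output block is written back
        seg.request_ready = False
        seg.response_ready = True                # notify the host
        return True
    return False
```

A host would call `host_submit(seg, [b"hello", b"world!"], [1, 2])` and then wait for `seg.response_ready` before reading `seg.enc_msg_buf`. Note that the request carries key IDs rather than key material, which matches the goal that secret keys never cross the shared segment.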
Who holds the keys?
• GPUs contain different memory hierarchies of …
  – different sizes, and …
  – different characteristics
• Off-chip global memory
  – No protection; data can be acquired by the CPU directly.
• On-chip memories
  – Shared memory: comparable with scratchpad RAM in other architectures. Unfortunately, its contents can be acquired by a subsequent GPU program.
  – Caches: many different data caches (L1-L3, texture, constant). Unfortunately, the data stored there cannot be managed by the programmer.
  – Registers: reset to zero on each GPU kernel execution.
[Figure: GPU with multiprocessors 1 to N, each containing registers, shared memory, cache, and SPs, next to global memory; the host side shows the CPU and host memory]

Keeping secrets on GPU registers
• Secret keys are loaded on GPU registers at an early stage of the bootstrapping phase
  – Remain there as long as the autonomous GPU program is running
• Unfortunately, the number of available registers in current GPU models is small
  – Enough for a single key or a few secret keys, but what if we want to store more?

Support for an arbitrary number of keys
• We can use a separate KeyStore array that holds an arbitrary number of secret keys
  – Encrypted keys are stored in GPU global device memory
  – Each key is decrypted in registers during encryption/decryption
[Figure: an encrypted key from the KeyStore is copied to the GPU register file, where the master key turns it into the decrypted key]

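The KeyStore idea can be illustrated with a short sketch: only wrapped (encrypted) keys are kept in the large, unprotected store, and a key is unwrapped with the master key just before use. This is an illustration under stated assumptions, not the PixelVault code. The `KeyStore` class, its `add` and `unwrap` methods, the per-entry nonce, and the `keystream` helper are all hypothetical, and the SHA-256 counter-mode keystream is a toy stand-in for whatever cipher actually protects the stored keys. In the real system the unwrapped key would exist only in GPU registers; Python cannot model that placement.

```python
import hashlib
import os


def keystream(master_key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode; a stand-in, not AES."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(
            master_key + nonce + counter.to_bytes(4, "big")
        ).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class KeyStore:
    """Holds wrapped keys only; models the array kept in global device memory."""

    def __init__(self):
        self._entries = []  # list of (nonce, wrapped_key); no plaintext keys

    def add(self, master_key: bytes, secret_key: bytes) -> int:
        """Wrap secret_key under master_key, store it, and return its key ID."""
        nonce = os.urandom(16)
        pad = keystream(master_key, nonce, len(secret_key))
        self._entries.append((nonce, xor_bytes(secret_key, pad)))
        return len(self._entries) - 1

    def unwrap(self, master_key: bytes, key_id: int) -> bytes:
        """Recover the plaintext key; on the GPU this would happen in registers."""
        nonce, wrapped = self._entries[key_id]
        return xor_bytes(wrapped, keystream(master_key, nonce, len(wrapped)))
```

With this layout only the master key has to fit in the scarce protected storage, while the number of stored keys is bounded by the size of the unprotected array.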
Where is the code?
• GPU code is initially stored in global device memory for the GPU to execute it
  – An adversary could replace it with a malicious version
[Figure: global device memory]

Prevent GPU code modification attacks
• Three levels of instruction caching (icache)
  – 4KB, 8KB, and 32KB, respectively
  – Hardware-managed
• Opportunity: Load the code to the icache, and then erase it from global device memory
  – The code runs indefinitely from the icache
  – The cached code cannot be flushed or modified

PixelVault Crypto Suite
• Currently implemented algorithms
  – AES-128
  – RSA-1024
• Implemented completely using on-chip memory (i.e. registers, scratchpad memory)
  – The only data that is written back to global, off-chip device memory is the output block

AES-128 CBC Performance
[Figure: two panels, Encryption and Decryption; throughput (Gbit/s) vs. number of messages (1, 16, 64, 128, 1024, 4096) for GPU, PixelVault, PixelVault (w/ KeyStore), and CPU]
• Figure annotations: up to 13% overhead on GPU execution; up to 20% overhead on GPU execution
• 3x-4x faster than CPU for a sufficient number of messages
  – CPU: Intel Nehalem, single core (2.27GHz)

RSA 1024-bit Decryption

#Msgs   CPU      GPU [25]   PixelVault   PixelVault (w/ KeyStore)
1       1632.7   15.5       15.3         14.3
16      1632.7   242.2      240.4        239.2
64      1632.7   954.9      949.9        939.6
112     1632.7   1659.5     1652.4       1630.3
128     1632.7   1892.3     1888.3       1861.7
1024    1632.7   10643.2    10640.8      9793.1
4096    1632.7   17623.5    17618.3      14998.8
8192    1632.7   24904.2    24896.1      21654.4

• PixelVault adds a 1%-15% overhead over the default GPU-accelerated RSA

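The overhead figure can be checked against the table with a few lines of arithmetic. The sketch below assumes the table values are throughputs (higher is better; the slides do not state the unit) and defines overhead as the relative loss against the GPU column; the function names are illustrative.

```python
# Rows copied from the RSA 1024-bit decryption table:
# (#msgs, cpu, gpu, pixelvault, pixelvault_with_keystore)
ROWS = [
    (1, 1632.7, 15.5, 15.3, 14.3),
    (16, 1632.7, 242.2, 240.4, 239.2),
    (64, 1632.7, 954.9, 949.9, 939.6),
    (112, 1632.7, 1659.5, 1652.4, 1630.3),
    (128, 1632.7, 1892.3, 1888.3, 1861.7),
    (1024, 1632.7, 10643.2, 10640.8, 9793.1),
    (4096, 1632.7, 17623.5, 17618.3, 14998.8),
    (8192, 1632.7, 24904.2, 24896.1, 21654.4),
]


def overhead_pct(baseline: float, value: float) -> float:
    """Relative loss of `value` against `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


def overheads(rows):
    """Map #msgs -> (PixelVault overhead, PixelVault w/ KeyStore overhead)."""
    return {
        n: (overhead_pct(gpu, pv), overhead_pct(gpu, pvk))
        for n, _cpu, gpu, pv, pvk in rows
    }


def first_batch_beating_cpu(rows, column: int):
    """Smallest #msgs at which the given column exceeds the CPU column."""
    for row in rows:
        if row[column] > row[1]:
            return row[0]
    return None


if __name__ == "__main__":
    for n, (pv, pvk) in overheads(ROWS).items():
        print(f"{n:5d} msgs: PixelVault {pv:5.2f}%  w/ KeyStore {pvk:5.2f}%")
```

Under these assumptions the KeyStore variant's overhead ranges from roughly 1.2% (16 messages) to roughly 14.9% (4096 messages), which is consistent with the 1%-15% stated on the slide. The GPU and plain PixelVault columns overtake the single CPU core at 112 messages, and the KeyStore variant at 128.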