
Automatic Virtualization of Accelerators

Hangchen Yu, Arthur Michener Peters, Amogh Akshintala, Christopher J. Rossbach. HotOS’19, 13 May 2019.


  1. Automatic Virtualization of Accelerators. Hangchen Yu, Arthur Michener Peters, Amogh Akshintala, Christopher J. Rossbach. HotOS’19, 13 May 2019

  2. Accelerators are not virtualized
     • Cloud computing relies on virtualization
       – Consolidation, Elasticity, …
     • Most resources virtualized
       – CPUs, Memory, I/O devices
     • Except Accelerators
       – Dedicated to VMs
       – Underutilized
     (Figure labels: VM 1, VM 2)
     H. Yu, A. M. Peters, A. Akshintala and C. J. Rossbach, AvA, HotOS’19

  3. Explosion of Accelerators and APIs
     • 15 new accelerators in 3.5 years
       – Google TPUs, Intel QuickAssist, …
     • Many important APIs
       – TensorFlow, TF Lite, QuickAssist, ...
     • Accelerator stacks
       – proprietary
       – highly-specialized

  4. Traditional Technology Stacks (layer diagram, top to bottom)
     • Application
     • Public API: Stream, Socket, DNS, …
     • User-mode Library
     • Standard OS Interfaces: File, Socket, syscall, …
     • Driver
     • Hardware Interface: MMIO, INTR, …
     • Hardware

  5. Accelerator Silos (layer diagram, top to bottom)
     • Layers: Application, Public API, User-mode Library, Proprietary Interfaces, Driver, Hardware Interface, Hardware
     • Interface labels shown: API, ioctl, MMIO, INTR, DMA

  6. Accelerator Silos (build of the same layer diagram)
     • Layers: Application, Public API, User-mode Library, Proprietary Interfaces, Driver, Hardware Interface, Hardware
     • Interface labels shown: API, ioctl, MMIO, INTR, DMA
     • Added in this build: "Silo" labels, with the API and Hardware boxes repeated

  7. API Remoting (diagram components)
     • VM: Application, Custom User-mode Library
     • Custom API Server
     • Silo: Vendor Library, Vendor Driver, Accelerator
     • Hypervisor
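The forwarding path in the API remoting diagram can be sketched in C. This is a minimal illustration, not code from the talk: `remote_call_t`, `vendor_add`, `api_server_dispatch`, and `guest_add` are hypothetical names, and a direct function call stands in for the transport between the VM and the API server.

```c
#include <stdint.h>

/* Hypothetical wire format for one remoted call. */
typedef struct {
    uint32_t call_id;   /* which API function the guest invoked */
    int32_t  args[2];   /* marshaled scalar arguments */
    int32_t  result;    /* filled in by the API server */
} remote_call_t;

enum { CALL_VENDOR_ADD = 1 };   /* hypothetical call identifier */

/* Stand-in for a function exported by the vendor library on the host. */
int32_t vendor_add(int32_t a, int32_t b) { return a + b; }

/* Host side: the API server unmarshals the call and invokes the vendor library. */
int api_server_dispatch(remote_call_t *call) {
    switch (call->call_id) {
    case CALL_VENDOR_ADD:
        call->result = vendor_add(call->args[0], call->args[1]);
        return 0;
    default:
        return -1;      /* unknown call */
    }
}

/* Guest side: the custom user-mode library keeps the vendor signature
 * and forwards the call instead of touching the device. */
int32_t guest_add(int32_t a, int32_t b) {
    remote_call_t call = { CALL_VENDOR_ADD, { a, b }, 0 };
    /* Assumption: a real system would ship `call` across the VM boundary here. */
    if (api_server_dispatch(&call) != 0)
        return -1;
    return call.result;
}
```

An application in the VM would call `guest_add(2, 3)` exactly as it would call the vendor function; only the library behind the symbol changes.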

  8. SVGA2: Para-virtual GPU (diagram components)
     • VM: Application, Custom User-mode Library, Custom Guest Driver
     • Custom Virtual GPU
     • Custom API Server
     • Silo: Vendor Library, Vendor Driver, Hardware
     • Hypervisor

  9. CAvA: Compiler for Automatic Virtualization of Accelerators (diagram components)
     • Toolchain: CUDA.h → CAvA → Para-virtual API Stack
     • VM: Application, Generated User-mode Library, AvA Guest Driver
     • AvA Virtual Device
     • Generated API Server
     • Silo: Vendor Library, Vendor Driver, Accelerators
     • Hypervisor

  10. AvA Toolchain
      Flow: Accelerator API.h → CAvA → Skeletal API Spec. → Developer → API Specification → CAvA → Para-virtual API Stack

  11. AvA Toolchain
      Flow: Accelerator API.h → CAvA → Skeletal API Spec. → Developer → API Specification → CAvA → Para-virtual API Stack
      Accelerator API.h:
          ncStatus_t ncGlobalGetOption(int option, void *data, int *dataLength);

  12. AvA Toolchain
      Flow: Accelerator API.h → CAvA → Skeletal API Spec. → Developer → API Specification → CAvA → Para-virtual API Stack
      Accelerator API.h:
          ncStatus_t ncGlobalGetOption(int option, void *data, int *dataLength);
      Skeletal API Spec.:
          ava_argument(data) {
              ava_input;
              ava_output;
              ava_buffer(1);
          }

  13. AvA Toolchain
      Flow: Accelerator API.h → CAvA → Skeletal API Spec. → Developer → API Specification → CAvA → Para-virtual API Stack
      Accelerator API.h:
          ncStatus_t ncGlobalGetOption(int option, void *data, int *dataLength);
      Skeletal API Spec.:
          ava_argument(data) {
              ava_input;
              ava_output;
              ava_buffer(1);
          }
      Refined annotation for the same argument:
          ava_argument(data) {
              ava_output;
              ava_buffer(*dataLength);
          }
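The two annotation blocks on this slide differ in transfer direction and buffer size, and a short C sketch can show what that difference means for marshaling. Everything here is an assumption made for illustration: `arg_spec_t` and the helper functions are hypothetical, the buffer length is treated as a byte count, and the value of `*dataLength` is read once at call time.

```c
#include <stddef.h>

/* Hypothetical in-memory form of one argument annotation. */
typedef struct {
    int    is_input;    /* copy guest -> API server before the call */
    int    is_output;   /* copy API server -> guest after the call */
    size_t length;      /* buffer length (assumed to be in bytes) */
} arg_spec_t;

/* Skeletal form from the slide: ava_input; ava_output; ava_buffer(1); */
arg_spec_t skeletal_data_spec(void) {
    arg_spec_t s = { 1, 1, 1 };
    return s;
}

/* Second form from the slide: ava_output; ava_buffer(*dataLength); */
arg_spec_t refined_data_spec(const int *dataLength) {
    arg_spec_t s = { 0, 1, (size_t)*dataLength };
    return s;
}

/* Bytes that must travel in each direction for this argument. */
size_t bytes_to_server(arg_spec_t s) { return s.is_input ? s.length : 0; }
size_t bytes_to_guest(arg_spec_t s)  { return s.is_output ? s.length : 0; }
```

Under these assumptions, the second form sends nothing to the server for `data` and copies back as many bytes as the caller's `*dataLength` indicates, whereas the skeletal form copies a single element both ways.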

  14. AvA Toolchain
      Flow: Accelerator API.h → CAvA → Skeletal API Spec. → Developer → API Specification → CAvA → Para-virtual API Stack
      Accelerator API.h:
          ncStatus_t ncGlobalGetOption(int option, void *data, int *dataLength);
      Skeletal API Spec.:
          ava_argument(data) {
              ava_input;
              ava_output;
              ava_buffer(1);
          }
      Refined annotation for the same argument:
          ava_argument(data) {
              ava_output;
              ava_buffer(*dataLength);
          }
      Para-virtual API Stack (generated code):
          ncStatus_t ncGlobalGetOption(...) {
              ...
              cmd = new_command(...);
              cmd->api_id = MVNC_API;
              cmd->command_id = NC_GLOBAL_GET_OPTION;
              ...
              send_command(cmd);
              wait_for_reply(cmd);
              ...
          }
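The generated stub on this slide elides most of its body. The following C sketch shows one way such a guest-side stub could fit together; it is not the code CAvA emits. The `command_t` layout, the ID values, the `int` stand-in for `ncStatus_t`, and the mock `send_command` that answers immediately with a made-up option value are all assumptions.

```c
#include <stdlib.h>
#include <string.h>

typedef int ncStatus_t;   /* stand-in; the real type comes from the API header */
enum { MVNC_API = 1, NC_GLOBAL_GET_OPTION = 7 };   /* hypothetical ID values */

/* Hypothetical command record shared by guest stub and API server. */
typedef struct {
    int api_id, command_id;
    int option;
    int data_length;             /* in: capacity, out: bytes written */
    unsigned char payload[64];   /* reply buffer carried with the command */
    ncStatus_t status;
    int done;
} command_t;

command_t *new_command(void) { return calloc(1, sizeof(command_t)); }

/* Mock transport: pretends the API server ran the call right away. */
void send_command(command_t *cmd) {
    int value = 42;              /* made-up option value */
    if (cmd->data_length >= (int)sizeof value) {
        memcpy(cmd->payload, &value, sizeof value);
        cmd->data_length = (int)sizeof value;
        cmd->status = 0;
    } else {
        cmd->status = -1;        /* caller's buffer is too small */
    }
    cmd->done = 1;
}

/* With the mock transport the reply is already there when this is called. */
void wait_for_reply(command_t *cmd) { while (!cmd->done) { } }

ncStatus_t ncGlobalGetOption(int option, void *data, int *dataLength) {
    command_t *cmd = new_command();
    if (!cmd) return -1;
    cmd->api_id = MVNC_API;
    cmd->command_id = NC_GLOBAL_GET_OPTION;
    cmd->option = option;
    cmd->data_length = *dataLength < 64 ? *dataLength : 64;
    send_command(cmd);
    wait_for_reply(cmd);
    ncStatus_t status = cmd->status;
    if (status == 0) {           /* output-only buffer: copy back after the call */
        memcpy(data, cmd->payload, (size_t)cmd->data_length);
        *dataLength = cmd->data_length;
    }
    free(cmd);
    return status;
}
```

The stub copies `data` only on the way back, which matches an output-only annotation sized by `*dataLength`; a real transport would replace the mock `send_command` and block in `wait_for_reply`.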

  15. Preliminary Experiences
      • APIs
        – OpenCL, CUDA, HIP, TensorFlow C, NCSDK, QAT, …
      • Devices
        – GPUs, Intel Movidius NCS, Intel QAT, FPGA applications, …
      • Overhead measurements
        – 5.6% for CUDA; 7% for OpenCL, excluding myocyte (call-intensive)
        – Compare to 100× for GPUvm

  16. Preliminary Development Effort
      Columns: Type; APIs; LoC (#Funcs); Time; Difficulty
      • GPUvm: Full-virt; 1 API; 20 000 LoC; Years; ★★★★
      • SVGA2: Para-virt; 2 APIs; MANY! LoC; Years; ★★★★
      • GvirtuS: API Remoting; 7 APIs; CUDA: 3 011 (71), OpenCL: 2 531 (22), ~60 / Function; Months; ★★
      • AvA: Automatic Para-virtual API Remoting; 9 APIs; CUDA: 221 (16), OpenCL: 835 (37), ~20 / Function; Days; ★
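The per-function figures on this slide are consistent with pooled averages over the CUDA and OpenCL rows. This is an inference from the numbers; the slide does not say how the figures were computed. For GvirtuS and AvA respectively:

```latex
\[
\frac{3011 + 2531}{71 + 22} = \frac{5542}{93} \approx 59.6,
\qquad
\frac{221 + 835}{16 + 37} = \frac{1056}{53} \approx 19.9
\]
```

Both values round to the ~60 and ~20 lines per function shown on the slide.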

  17. Continuing Work
      • Inferring more information about the API
        – API documentation
        – Programs which use the API
        – API implementations

  18. AvA: Automatic Virtualization of Accelerators
      • Silos → API remoting is the only viable accelerator virtualization technique
      • AvA
        – Compensates for:
          • Compatibility, with automation
          • Interposition, with hypervisor-mediated transport
        – A single developer can virtualize a new API/device in days
      Thank you && Debate
