April 4-7, 2016 | Silicon Valley | UNVEILING THE IMPACT OF TIME SLICING WITH NVIDIA GRID | Erik Bohnhorst, GRID Performance Architect, 04/04/2016
AGENDA: GRID vGPU resource sharing (simplified); Time slicing works; Using “benchmarks” to evaluate GRID; Impact of realistic recommendations
GRID vGPU RESOURCE SHARING (Tesla GPU, simplified view): Each vGPU (vGPU-1, vGPU-2, … vGPU-n) is assigned a fixed range of framebuffer for its exclusive use (VM-1 FB, VM-2 FB, … VM-n FB). The GPU engines (Graphics/Compute, Video Encode, Video Decode and Framebuffer Copy Engine) are time sliced and can execute in parallel. Each vGPU has exclusive access to the entire engine (all CUDA cores) during its time slice. [Diagram: time slices t=1, t=2, … t=16 distributed across the vGPUs.]
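The sharing model on this slide can be illustrated with a toy Python sketch. It is not NVIDIA's scheduler. The round-robin order, the 8192 MB board size, the 2048 MB profile size and all names are assumptions made for illustration, since the slide does not state the scheduling policy.

```python
from dataclasses import dataclass


@dataclass
class VGpu:
    name: str
    fb_start_mb: int  # first MB of this vGPU's private framebuffer range
    fb_size_mb: int   # size of the range; it is never shared with other vGPUs


def partition_framebuffer(total_fb_mb, profile_mb):
    """Split the framebuffer into fixed, non-overlapping ranges, one per vGPU."""
    count = total_fb_mb // profile_mb
    return [VGpu(f"vGPU-{i + 1}", i * profile_mb, profile_mb) for i in range(count)]


def round_robin_schedule(vgpus, num_slices):
    """Assumed policy: time slice t belongs to exactly one vGPU, in rotation."""
    return [vgpus[t % len(vgpus)].name for t in range(num_slices)]


# Hypothetical board: 8192 MB of framebuffer carved into 2048 MB profiles.
vgpus = partition_framebuffer(total_fb_mb=8192, profile_mb=2048)
schedule = round_robin_schedule(vgpus, num_slices=8)
print(schedule)
```

The framebuffer is divided in space (each range has one owner for the vGPU's lifetime), while the engine is divided in time (each slice has one owner, who gets the whole engine).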
EXAMPLE OF SINGLE VM TESTING: ViewPerf 12, Catia viewset. [Charts: scores; framebuffer usage in MB.] The 2Q, 4Q and 8Q vGPU profiles perform equally (sufficient framebuffer + the entire time of the 3D engine).
WHY TIME SLICING WORKS
BUT WHAT IF THE RESOURCE IS UNDER “LOAD”? JT2GO: zooming/panning/rotating
EXAMPLE OF A GPU “HEAVY” TASK: Performance utilization during 22 seconds. [Chart phases: zooming, panning, rotating.] The GPU seems to be constantly utilized during zooming, panning and rotating.
EXAMPLE OF A GPU “HEAVY” TASK: Performance utilization during 1 second. [Chart series: GPU, CPU.] Lots of unused time in between spikes.
CLOSE LOOK AT A RESOURCE UNDER “LOAD”: [Chart annotations mark the time a different process/virtual machine can use the GPU.] NVIDIA uses ultra high-end GPUs with GRID for maximum available time.
EXAMPLE OF 6 ACTIVE USERS: 6 users running “heavy” tasks.
EXAMPLE OF MANY ACTIVE USERS: Many users running “heavy” tasks.
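The reasoning behind these slides, that bursty users leave gaps other users can fill, can be sketched as simple arithmetic. The 100 ms of GPU time per user per second is a hypothetical figure, not a value measured in the talk. The model also ignores context-switch overhead and assumes bursts never collide, so it gives an optimistic upper bound.

```python
GPU_BUDGET_MS = 1000  # GPU time available in one second of wall-clock time


def total_demand_ms(busy_ms_per_user, num_users):
    """GPU time requested per second when all users run their heavy task."""
    return busy_ms_per_user * num_users


def users_before_saturation(busy_ms_per_user):
    """Largest user count whose combined bursts still fit into one second."""
    return GPU_BUDGET_MS // busy_ms_per_user


# Hypothetical: each "heavy" user keeps the GPU busy for 100 ms per second.
busy_ms = 100
for users in (1, 6, 12):
    demand = total_demand_ms(busy_ms, users)
    state = "fits" if demand <= GPU_BUDGET_MS else "contended"
    print(f"{users:2d} users -> {demand:4d} ms of GPU time per second ({state})")
```

Under this assumption, six active users still leave idle GPU time, and contention only starts once the combined bursts exceed the one-second budget.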
WHY “BENCHMARKS” DON’T APPLY FOR EVALUATION: Benchmark: VP12 Catia viewset (synthetic workload). Human workflow: GPU heavy process (zooming).
WHY “BENCHMARKS” DON’T APPLY FOR EVALUATION: Unrealistic users per host recommendation using benchmarks versus realistic users per host recommendation using human workflows. [Diagrams: performance over time for each case, with per-user timelines marked Active and Idle.]
IMPACT OF USING BENCHMARKS: Realistic users per host (UPH) recommendations can only be generated with human workflows. Evaluate GRID by working with a small group of real end users on GRID vGPU and monitoring their workflows. [Diagram: cycle of Run, Monitor, Configure/Change.]
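One way the monitoring step could feed a recommendation is to turn sampled GPU utilization into an "active fraction" per user. The sketch below assumes the samples have already been collected by whatever monitoring tool is in use. The sample values, the 5% threshold and the function name are hypothetical.

```python
def active_fraction(samples_pct, threshold_pct=5):
    """Share of samples in which GPU utilization exceeded the threshold."""
    if not samples_pct:
        raise ValueError("need at least one utilization sample")
    active = sum(1 for s in samples_pct if s > threshold_pct)
    return active / len(samples_pct)


# Hypothetical GPU utilization samples (percent) from one real user's session.
samples = [0, 0, 85, 90, 0, 0, 0, 70, 0, 0]
print(f"active fraction: {active_fraction(samples):.0%}")
```

A benchmark would report close to 100% here, which is why it understates how many real users can share one GPU.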
IMPACT OF REALISTIC RECOMMENDATIONS
                                  “BENCHMARKS”   REAL USERS
Cost per Server                   $30,000        $40,000
Users per Host                    8              16
Software Costs                    Per User       Per User
Cost per User (Server Hardware)   $3,750         $2,500
“Cost per User” drops “significantly” when evaluating GRID with real users.
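The cost-per-user figures on this slide follow from dividing the server cost by the users per host. A short Python check, using only the numbers from the slide (the function name is illustrative, and software costs are left out because the slide lists them as per user in both cases):

```python
def cost_per_user(server_cost_usd, users_per_host):
    """Server hardware cost carried by each user on the host."""
    if users_per_host <= 0:
        raise ValueError("users_per_host must be positive")
    return server_cost_usd / users_per_host


benchmark_based = cost_per_user(30_000, 8)    # sizing derived from benchmarks
real_user_based = cost_per_user(40_000, 16)   # sizing derived from real users
saving_pct = (benchmark_based - real_user_based) / benchmark_based * 100

print(f"benchmarks: ${benchmark_based:,.0f} per user")
print(f"real users: ${real_user_based:,.0f} per user")
print(f"reduction:  {saving_pct:.1f}%")
```

Even though the server in the real-user case costs more, doubling the users per host lowers the hardware cost per user by about a third.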
SUMMARY: Time slicing the 3D engine allows sharing based on actual need, for great performance at scale. Leveraging benchmarks results in unrealistic recommendations with too few users per host. Too-low users per host recommendations create unrealistic TCO/ROI assumptions. Too-high TCO assumptions (and the correspondingly low ROI) could delay or kill the project.
THANK YOU. JOIN THE NVIDIA DEVELOPER PROGRAM AT developer.nvidia.com/join