
EVALUATING WINDOWS 10 LEARN WHY YOUR USERS NEED GPU ACCELERATION - PowerPoint PPT Presentation

May 8-11 2017 | Silicon Valley. EVALUATING WINDOWS 10: LEARN WHY YOUR USERS NEED GPU ACCELERATION. Jason Kyungho Lee, Sr Performance Engineer, NVIDIA GRID @ NVIDIA; Hari Sivaraman, Staff Engineer @ VMware.


  1. May 8-11 2017 | Silicon Valley. EVALUATING WINDOWS 10: LEARN WHY YOUR USERS NEED GPU ACCELERATION. Jason Kyungho Lee, Sr Performance Engineer, NVIDIA GRID @ NVIDIA; Hari Sivaraman, Staff Engineer @ VMware

  2. AGENDA
     • Introduction
     • Latest Announcements
     • Windows 10 vs. Windows 7
     • Performance Testing
     • Summary

  3. TESLA LINEUP FOR GRID: The most powerful data center GPUs targeted at graphics virtualization

     |                       | M10                               | M6                      | M60                               |
     |-----------------------|-----------------------------------|-------------------------|-----------------------------------|
     | GPU                   | Quad Mid-level Maxwell            | Single High-end Maxwell | Dual High-end Maxwell             |
     | CUDA Cores            | 2560 (640 per GPU)                | 1536                    | 4096 (2048 per GPU)               |
     | Memory Size           | 32 GB GDDR5 (8 GB per GPU)        | 8 GB GDDR5              | 16 GB GDDR5 (8 GB per GPU)        |
     | H.264 1080p30 streams | 28                                | 18                      | 36                                |
     | Max vGPU instances    | 64                                | 16                      | 32                                |
     | Form Factor           | PCIe 3.0 Dual Slot (rack servers) | MXM (blade servers)     | PCIe 3.0 Dual Slot (rack servers) |
     | Power                 | 225W                              | 100W (75W opt)          | 240W / 300W (225W opt)            |
     | Thermal               | passive                           | bare board              | active / passive                  |
     | Optimized for         | USER DENSITY                      | BLADE                   | PERFORMANCE                       |
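     The memory and instance rows of the lineup can be combined into a rough per-VM figure. The sketch below is not from the deck: it assumes each board's memory is split evenly across the maximum number of vGPU instances and that 1 GB equals 1024 MB. The names `BOARDS` and `frame_buffer_per_instance_mb` are made up for illustration.

```python
# Figures copied from the lineup table: total memory (GB) and max vGPU instances per board.
BOARDS = {
    "M10": {"memory_gb": 32, "max_vgpu": 64},
    "M6": {"memory_gb": 8, "max_vgpu": 16},
    "M60": {"memory_gb": 16, "max_vgpu": 32},
}


def frame_buffer_per_instance_mb(memory_gb: int, max_vgpu: int) -> float:
    """Frame buffer per vGPU instance (MB), assuming an even split at maximum density."""
    return memory_gb * 1024 / max_vgpu


per_instance = {name: frame_buffer_per_instance_mb(**spec) for name, spec in BOARDS.items()}
for name, mb in per_instance.items():
    print(f"{name}: {mb:.0f} MB per vGPU instance at maximum density")
```

     Under these assumptions all three boards come out at the same per-instance frame buffer at maximum density, so the boards differ in how many such instances they host rather than in how small an instance can be.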

  4. LATEST ANNOUNCEMENTS

  5. LATEST ANNOUNCEMENTS
     Instant Clone Support (VMware Horizon 7.1)
     • Allows ultra fast provisioning of virtual machines
     • NVIDIA is the only GPU vendor supported
     High Availability Support (VMware vSphere 6.5)
     • vSphere 6.5 supports HA for NVIDIA GRID vGPU enabled virtual machines
     Multi Monitor support with Blast Extreme H.264 HW (VMware Horizon 7.1)
     • Offload the H.264 encode to the NVIDIA GPU for improved and predictable UX
     Related sessions:
     • S7763 - DELIVER A TRANSFORMATIVE 3D GRAPHICS USER EXPERIENCE WITH VMWARE HORIZON, BLAST EXTREME ADAPTIVE TRANSPORT, AND NVIDIA GRID
     • S7429 - EXPERT AND CUSTOMER ROUNDTABLE: REAL-WORLD TALES OF GPU-ACCELERATED DESKTOPS AND APPS - IMPLEMENTERS SHARE BEST PRACTICES

  6. WINDOWS 10

  7. WINDOWS 10 NEW CHANGES
     • Visually compelling Modern UI / Menu with transparency
     • Modern UI cannot be disabled; the assumption is that you have a GPU on Windows 10
     • GPU accelerated Virtual desktop / Task view / Alt-TAB preview
     • Video playback GPU acceleration in the default media player
     • GPU accelerated font (DPI) and display scaling with ultra high definition resolution
     • Windows Display Driver Model (WDDM) 2.0 / DirectX 12 supported
     • Microsoft Edge GPU acceleration

  8. WINDOWS 10 REQUIRES MORE RESOURCES FOR IMPROVED USER EXPERIENCE
     [Chart 1: "Windows 10 requires more CPU cycles". CPU host utilization % over time, Windows 7 vs. Windows 10. Callout: 15% more CPU utilization.]
     [Chart 2: "Windows 10 requires more GPU frame buffer". Bars: Windows 7 (single 1920x1080), Windows 10 (single 1920x1080), Windows 10 (single 2560x1600), Windows 10 (dual 1920x1080).]
     64 x Tesla M10-1B VMs on a host running LoginVSI knowledge worker workload

  9. WINDOWS START BUTTON EXPERIENCE: This is Side-by-Side

  10. PERFORMANCE TESTING

  11. TEST SETUP - SUBJECTIVE USER TESTING
      • Two identical servers run LoginVSI Knowledge Worker to create a realistic customer environment
      • CPU utilization of the hosts is around 60-80%
      • Testers don’t know which session is GPU accelerated
      • Testers do the same tasks on both systems
      • Access devices (thin client, monitor, mouse, keyboard) are the same, with a single screen and 1080p resolution
      • Predefined scenarios plus freestyle at the end. Scenarios include browsing, YouTube, creation of PowerPoint, Google Maps, and WebGL

  12. CPU ONLY VS. NVIDIA GRID: GPU with NVENC provides an average positive increase to UX of 34%
      [Bar chart, User Experience Scale from 0.0 to 5.0, higher is better. Series: Horizon 7 with PCoIP - No GPU; Horizon 7 with Blast Extreme and H.264 HW. Per-scenario gains shown on the chart: +13%, +55%, +26%, +13%, +133%, +68%, +30%, +20%, +5%, +6%, +21%, +9%, +19%, +65%.]
      User Experience Scale:
      1 Unacceptable, unusable - fire someone in IT!
      2 Barely useable, borderline, but I’ll get tired of this soon
      3 Tolerable, I guess I can make do
      4 Pretty good for a virtual desktop
      5 Outstanding - as good (or almost) as physical
      Testing ran on two identical systems. The CPU system was loaded up to 60-80% utilization; the GPU system ran the same workload.
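      The 34% headline can be sanity-checked against the per-scenario gains printed on the chart. This is only a sketch: it assumes the headline is a plain arithmetic mean of the fourteen visible labels, which the slide does not state, and `GAINS_PERCENT` is an illustrative name.

```python
from statistics import mean

# Per-scenario gains (%) as printed on the slide; the scenario names are not
# recoverable from the transcript, so the values are kept as an unlabeled list.
GAINS_PERCENT = [13, 55, 26, 13, 133, 68, 30, 20, 5, 6, 21, 9, 19, 65]

average_gain = mean(GAINS_PERCENT)
print(f"{len(GAINS_PERCENT)} scenarios, mean gain {average_gain:.1f}%")
```

      The mean of the visible labels is close to the headline figure; the small difference may come from rounding or from averaging the underlying scores rather than the percentages.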

  13. CLICK TO PHOTON: What it is and why it matters
      • Click-to-Photon is more than network latency
      • Click-to-Photon is a key metric that contributes to the overall user experience
      • Click-to-Photon defines how interactive/snappy the solution is
      • Click-to-Photon measures the overall latency from the user perspective
      • Click-to-Photon measures the time from the mouse click until the action is visible to the user; this includes the latency of USB device processing, rendering the frame, displaying the frame, etc.
      • Click-to-Photon in remote environments (VDI, etc.) additionally includes encode latency, network latency and decode latency
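      The components listed above can be read as a latency budget. The following is a minimal sketch of that idea, not a measurement method from the deck: `LatencyBudget` and every millisecond value in it are hypothetical, and it assumes the network is crossed twice (click to the server, frame back to the client).

```python
from dataclasses import dataclass


@dataclass
class LatencyBudget:
    """Per-stage latencies in milliseconds (all figures used below are hypothetical)."""

    input_device_ms: float           # USB device processing
    render_ms: float                 # rendering the frame
    display_ms: float                # displaying the frame
    encode_ms: float = 0.0           # remote sessions only
    decode_ms: float = 0.0           # remote sessions only
    network_one_way_ms: float = 0.0  # remote sessions only

    def click_to_photon_ms(self) -> float:
        """Sum every stage between the mouse click and the visible result."""
        local = self.input_device_ms + self.render_ms + self.display_ms
        # Assumption: the click travels to the server and the frame travels back,
        # so the network is crossed twice.
        remote = self.encode_ms + self.decode_ms + 2 * self.network_one_way_ms
        return local + remote


local_pc = LatencyBudget(input_device_ms=8, render_ms=16, display_ms=16)
remote_vdi = LatencyBudget(input_device_ms=8, render_ms=16, display_ms=16,
                           encode_ms=5, decode_ms=5, network_one_way_ms=50)

print(local_pc.click_to_photon_ms())    # local stages only
print(remote_vdi.click_to_photon_ms())  # local stages plus encode, decode and two network crossings
```

      Even with made-up numbers, the structure shows why network latency alone understates what the user feels: encode and decode add to the same budget as the local stages.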

  14. CLICK TO PHOTON SIMPLIFIED: CLICK-TO-PHOTON CAPTURES THE OVERALL LATENCY
      [Flow diagram]
      Access Device: Mouse button released -> Mouse click processed -> Packetized and encoded -> Packet transmitted
      Network Latency on the WAN (e.g. 50ms)
      Server: Packet Received -> Packet Decoded -> Mouse click processed -> Application -> New Frame rendered -> Frame Captured via NVIDIA NVFBC -> Frame Encoded via NVIDIA NVENC -> Frame transmitted
      Network Latency on the WAN (e.g. 50ms)
      Access Device: Packet Received -> Packet Decoded -> Frame displayed

  15. CLICK TO PHOTON SIMPLIFIED: CLICK-TO-PHOTON CAPTURES THE OVERALL LATENCY
      [Same access device / WAN / server flow diagram as on the previous slide, with a CLICK-TO-PHOTON LATENCY label added. Network Latency on the WAN (e.g. 50ms) is shown twice.]

  16. CLICK TO PHOTON SIMPLIFIED: CLICK-TO-PHOTON CAPTURES THE OVERALL LATENCY
      [Same access device / WAN / server flow diagram again, with both a CLICK-TO-PHOTON LATENCY label and a separate Network Latency label. Network Latency on the WAN (e.g. 50ms) is shown twice.]

  17. CLICK TO PHOTON LATENCY: Blast Extreme with NVENC decreases latency up to 140ms at <1ms network latency
      [Bar chart, latency in ms on a 0-300 axis, lower is better. Bars: Local PC with Integrated GPU; Blast Extreme No GPU - JPEG/PNG; Blast Extreme M10-1B - JPEG/PNG; Blast Extreme No GPU - H.264 Software; Blast Extreme M10-1B - H.264 Software; Blast Extreme M10-1B - H.264 Hardware. Data labels on the chart: 185, 165, 155, 125, 107, 65.]

  18. CLICK TO PHOTON LATENCY: Comparing latency of single VM and at scale at <1ms network latency
      [Bar chart, latency in ms on a 0-300 axis, lower is better. Series: Idle, 1 VM; Scale, 64 VMs. Bars: Local PC with Integrated GPU; Blast Extreme No GPU - JPEG/PNG; Blast Extreme M10-1B - JPEG/PNG; Blast Extreme No GPU - H.264 Software; Blast Extreme M10-1B - H.264 Software; Blast Extreme M10-1B - H.264 Hardware. Data labels on the chart: 250, 240, 170, 160, 185, 110, 165, 155, 125, 65, 107.]
      63 x Tesla M10-1B VMs on a host running LoginVSI knowledge worker workload and 1 additional VM measuring latency
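      The two latency charts can be compared using only their printed data labels. The sketch below assumes that labels appearing on the at-scale chart but not on the single-VM chart belong to the "Scale, 64 VMs" series. Which bar each label belongs to is not recoverable from the transcript, so no per-configuration mapping is attempted, and the variable names are illustrative.

```python
# Data labels (ms) as they appear in the transcript of the two latency slides.
SINGLE_VM_LABELS = [185, 165, 155, 125, 107, 65]
AT_SCALE_CHART_LABELS = [250, 240, 170, 160, 185, 110, 165, 155, 125, 65, 107]

# Assumption: labels that are new on the second chart are the "Scale, 64 VMs" series.
scale_only = [v for v in AT_SCALE_CHART_LABELS if v not in SINGLE_VM_LABELS]

largest_gap_ms = max(scale_only) - min(scale_only)
print(f"at-scale labels: {scale_only}")
print(f"largest gap between at-scale labels: {largest_gap_ms} ms")
```

      Under that assumption the spread between the slowest and fastest at-scale labels matches the "up to 140ms" headline on the preceding latency slide, which suggests the headline refers to the loaded host rather than the idle single-VM case.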

  19. HOST CPU OFFLOADING: Blast Extreme decreases CPU utilization on the host, up to 42%
      [Chart with two vertical axes (0-90000 and 0-100), lower is better. Series: NOGPU-PCoIP, GPU-PCoIP, NoGPU-JPEG, GPU-JPEG, NOGPU-Blast-H.264 CPU, GPU-BLAST-H.264CPU, GPU-BLAST-NVENC.]
      63 x Tesla M10-1B VMs on a host running LoginVSI knowledge worker workload and 1 additional VM measuring latency

  20. GUEST VM, REMOTING PROCESS CPU OFFLOADING: Blast Extreme decreases CPU utilization on the VM
      [Chart: remoting process utilization (PCoIP_server.exe or BlastW.exe) in the guest VM, as percent of one CPU core's time, plotted over time, lower is better. Series: NOGPU-PCoIP, GPU-PCoIP, NoGPU-JPEG, GPU-JPEG, NOGPU-Blast-H.264 CPU, GPU-BLAST-H.264CPU, GPU-BLAST-NVENC.]
      63 x Tesla M10-0B VMs on a host running LoginVSI knowledge worker workload and 1 additional VM measuring latency

  21. VIDEO PLAYBACK: Up to 52% improved user experience due to GRID vGPU and H.264. FPS is remoted FPS.

  22. VIDEO PLAYBACK
      [Chart 1: Average FPS for a set of videos, FPS vs. #VM. Chart 2: Total FPS for a set of videos, FPS vs. #VM. Series in both charts: JPG + vGPU, HW-H264 + vGPU, JPG-NO vGPU, SW-H264.]

  23. VIDEO PLAYBACK
      [Chart: CPU-Util (%) for a set of videos vs. #VM. Series: JPG + vGPU, HW-H264 + vGPU, JPG-NO vGPU, SW-H264.]

  24. VIDEOS

  25. POWERPOINT ANIMATION: This is Side-by-Side

  26. VIDEO PLAYBACK AND OFFLOADING CPU: This is Side-by-Side

  27. SUMMARY

  28. WINDOWS 10 IS DIFFERENT: Windows 10 is Microsoft’s most graphical operating system
      Windows 10 differs from Windows 7
      • Requires more CPU resources
      • Leverages the GPU more
      NVIDIA GRID vGPU
      • Improves user experience (as Microsoft intended)
      • Reduces Click-to-Photon latency (snappy user interaction)
      • Predictable and consistent user experience
      • Reduces CPU cycles to allow higher user density
      6/9/2017

  29. May 8-11 2017 | Silicon Valley THANK YOU
