use tesla to provide first gpu vm service in china

Use Tesla to provide first GPU VM Service in China Feng Zhu - PowerPoint PPT Presentation



  1. Use Tesla to provide first GPU VM Service in China • Feng Zhu • Focus • Service • Neutrality

  2. Outline • UCloud Introduction • K80 GPU VM • P40 GPU VM • UCloud GPU PaaS Service: UAI-Service • UCloud GPU ecosystem

  3. About UCloud • Top 3 IaaS provider in China • Founded in 2012 • HQ in Shanghai • Served 50,000+ enterprises

  4. Data Centers • 14 Global Regions • Frankfurt, LA, BJ2, DC, BJ1, SH1, Seoul, SH2, ZJ, GZ, HK, TW, Bangkok, SG

  5. UCloud Product Line • Product diagram (labels not recoverable from the extraction)

  6. GPU Timeline • 2012: UCloud founded • 2015.11: K80 GPU VM • 2016.2: K80 GPU Physical Machine • 2017.5: P40 GPU VM • 2017.?: P40 GPU Physical Machine

  7. GPU Decision: Virtualization • Comparison of PCI pass-through and GRID (table text not recoverable from the extraction)

  8. VM Advantage • Flexible VM configuration • CPU, memory, disk size, and GPU count are all configurable • Flexible SDN networking • All mainstream OSes supported (Windows/Linux) • CentOS 6.5/CentOS 7.0/Ubuntu 14.04/Ubuntu 12.04/Gentoo 2.2/Win 2008/Win 2012 • Fast deployment • Based on a self-defined image, 1000 VMs can be deployed in 1 minute

  9. VM Performance Degradation • With pass-through technology there is almost no degradation • Virtualization vs. bare metal: 99.3% vs. 100% on the first metric and 98% vs. 100% on the second (metric names not recoverable)

  10. UCloud GPU Virtualization – DL test • Caffe performance (Ubuntu)
      • Columns: case, iters, GPU (secs), CPU (secs), speedup
      • Case 1 (name not recoverable): 10000, 253.9, 900.3, 3.5
      • cifar10: 5000, 374.2, 2931.0, 7.8
      • Case 3 (name not recoverable): 10000, 3138.9, 17634.8, 5.6
      • Case 4 (name not recoverable): 50000, 2584.2, 8937.1, 3.5

  11. UCloud GPU Virtualization – DL test (2) • Theano/Keras (Ubuntu)
      • Columns: case, iters, GPU (secs), CPU (secs), speedup
      • Case 1 (name not recoverable): 20000, 47, 239, 5.1
      • Case 2 (name not recoverable): 20000, 11, 563, 51.2
      • Case 3 (name not recoverable): 20000, 27, 236, 8.7
      • Case 4 (name not recoverable): 45000, 5, 32, 6.4
      • cifar10_cnn.py: 50000, 198, 2670, 13.5
      • Case 6 (name not recoverable): 950, 9, 22, 2.4
      • mnist_cnn.py: 60000, 23, 1332, 57.9
      • mnist_irnn.py: 60000, 270, 451, 1.7
      • mnist_mlp.py: 60000, 1, 5, 5.0
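
The Speedup column in the two benchmark slides is consistent with CPU time divided by GPU time, rounded to one decimal. A minimal Python sketch of that calculation follows. The function name is ours, and the sample timings (563 s on CPU, 11 s on GPU) are our reading of the row listed with a speedup of 51.2, so treat them as an assumption rather than a quoted measurement.

```python
def speedup(cpu_secs: float, gpu_secs: float) -> float:
    """Return how many times faster the GPU run is, rounded to one decimal."""
    if gpu_secs <= 0:
        raise ValueError("gpu_secs must be positive")
    return round(cpu_secs / gpu_secs, 1)


# Assumed timings for one Theano/Keras case: 563 s on CPU, 11 s on GPU.
print(speedup(563, 11))
```

The same function can be applied to any row to check that the listed speedup matches the two timing columns.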

  12. K80 Physical Machine Hardware Specification • GPU: Tesla K80 • CPU: Intel E5 2630 x2 • Memory: 192G • Disk: 2T (type not recoverable) • A fifth row is not recoverable

  13. VM Configuration - K80 • CPU: 4, 8, or 16 cores • Memory: 8G, 16G, 32G, or 64G • Disk: 100G to 1T

  14. Flexible VM Saves Cost
      • Configuration: Fixed vs. Flexible
      • CPU: 16 vs. 4
      • Memory: 96G vs. 8G
      • Disk: 1T vs. 100G
      • GPU: 1 vs. 1
      • Price: 4,300 vs. 2,200 RMB/month
      • USD price: $615/M vs. $315/M

  15. GPU VM Features • VPC: VM networking • Image: self-defined images, deploy 1000 VMs in 1 min • Snapshot: data backup • DataArk: 24-hour continuous data protection, can roll back to any second • Hotfix: kernel patch without system shutdown • Rescale: resize CPU/memory/disk anytime

  16. Storage Solution • VM Disk: local SSD disk • UDisk: NAS, no limit on device numbers • UFS: NFS file system • UFile: object storage • UArchive: low-cost cloud archive

  17. Create GPU VM

  18. Create GPU VM

  19. Create GPU VM

  20. P40 Physical Machine Hardware Specification • GPU: Tesla P40 x4 • CPU: Intel E5 2650 x2 • Memory: 256G • Disk: 3T (type not recoverable) • A fifth row is not recoverable

  21. VM Configuration - P40 • CPU: 4, 8, 16, 24, or 32 cores • Memory: 8G, 16G, 32G, 64G, 96G, or 128G • Disk: 100G to 1T

  22. P40 Price
      • Configuration: Spec 1 vs. Spec 2
      • CPU: 4 vs. 32
      • Memory: 8G vs. 128G
      • Disk: 100G vs. 1T
      • GPU: 1 vs. 4
      • Price: 4,000 vs. 18,200 RMB/month
      • USD price: $570/M vs. $2,600/M
      • Bar chart comparing K80 and P40 prices (values not recoverable)

  23. UAI-Service Overview • Architecture diagram; recoverable labels include Keras, TensorFlow, MXNet, and Resources

  24. Distributed Training Layout • Layered diagram with Features, Storage, and Resources; recoverable framework labels are TensorFlow, Keras, and MXNet

  25. Distributed Training Process • Three-step process diagram across Storage and Resources (step labels not recoverable)

  26. Online Inference Layout • User access: SDK/Web • Online Inference System: Deploy, Running, Eval • Frameworks: TensorFlow, Keras, MXNet • Test Env • Storage: /task1/code, /data, /ckpt, /log • Docker images • Resources: CPU, FPGA, GPU

  27. Online Inference Process • Steps: 1. Upload, 2. Test & Eval, 3. Deploy • Components: SDK/Web, Test Env, Tester, Perf, ULB, Storage (/task1/code, /ckpt), Docker, Resource

  28. Online Inference API/SDK • Operations: Deploy, AB Test, Scalable, Perf report, Model update, Rollback • Serving path: User, ULB, Service instances in Docker, Resource
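
The slide names AB Test, Model update, and Rollback but does not show how they work. The Python sketch below is one way such operations could fit together. The `ModelRouter` class, its method names, and the version strings are all hypothetical and are not part of the UAI-Service API.

```python
import random
from typing import List, Optional


class ModelRouter:
    """Hypothetical traffic splitter for A/B testing model versions."""

    def __init__(self, stable: str) -> None:
        self.history: List[str] = [stable]  # stable versions, oldest first
        self.candidate: Optional[str] = None
        self.candidate_share: float = 0.0

    def start_ab_test(self, candidate: str, share: float) -> None:
        """Send a fraction `share` of requests to the candidate version."""
        if not 0.0 <= share <= 1.0:
            raise ValueError("share must be between 0 and 1")
        self.candidate = candidate
        self.candidate_share = share

    def route(self) -> str:
        """Pick the model version that serves the next request."""
        if self.candidate is not None and random.random() < self.candidate_share:
            return self.candidate
        return self.history[-1]

    def promote(self) -> None:
        """Model update: the candidate becomes the new stable version."""
        if self.candidate is None:
            raise RuntimeError("no candidate under test")
        self.history.append(self.candidate)
        self.candidate = None
        self.candidate_share = 0.0

    def rollback(self) -> None:
        """Return to the previous stable version."""
        if len(self.history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self.history.pop()
        self.candidate = None
        self.candidate_share = 0.0


router = ModelRouter("model-v1")
router.start_ab_test("model-v2", share=0.1)
served = [router.route() for _ in range(1000)]
print(served.count("model-v2"))  # roughly 100, varies per run
router.promote()
router.rollback()
print(router.route())  # model-v1
```

Keeping a history of stable versions makes rollback a single pop; a real service would also need to drain in-flight requests, which this sketch ignores.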

  29. GPU Scenario • Deep Learning: Advertisement CTR, Face Recognition, Voice Recognition • HPC: Gene Sequencing, Weather Forecasting • Rendering: Picture\Film\ACG Rendering, Online Rendering, Maya, 3Dmax • Simulator: Unity

  30. GPU Scenario • Training (Compute-Intensive): Big Data feeds the Neural Network Model on multiple GPUs • Online Service (Compute-Sensitive): User Input passes through the Neural Network Model on a GPU to produce the Output • Examples: Advertisement CTR, Face Recognition, Voice Recognition

  31. GPU Scenario: Example • CTR (click-through rate) estimation (bullet text not recoverable from the extraction)

  32. GPU Scenario: Example • CTR (click-through rate) estimation • Raw features: x=[Weekday=Wednesday, Gender=Male, City=Shanghai] • One-hot encoding: x=[0,0,1,0,0,0,0 0,1 0,0,1,0…0] • The CTR Estimate Model outputs the Percent of Click, e.g. 25%
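
The encoding on this slide can be sketched in Python as follows. The category orderings, the four-city vocabulary, and the use of a logistic model are assumptions chosen so that the first thirteen entries match the slide; the slide does not say which model UCloud's customers use, and the weights here are placeholders, not trained values.

```python
import math
from typing import List, Sequence

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
GENDERS = ["Female", "Male"]
# Hypothetical, shortened city vocabulary; a real one would be far longer.
CITIES = ["Beijing", "Guangzhou", "Shanghai", "Shenzhen"]


def one_hot(value: str, vocabulary: Sequence[str]) -> List[int]:
    """Return a vector with a single 1 at the position of `value`."""
    if value not in vocabulary:
        raise ValueError(f"unknown category: {value}")
    return [1 if item == value else 0 for item in vocabulary]


def encode(weekday: str, gender: str, city: str) -> List[int]:
    """Concatenate the one-hot vectors of the three features."""
    return (one_hot(weekday, WEEKDAYS)
            + one_hot(gender, GENDERS)
            + one_hot(city, CITIES))


def predict_ctr(x: Sequence[int], weights: Sequence[float], bias: float) -> float:
    """Logistic model (an assumption): estimated probability of a click."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


x = encode("Wednesday", "Male", "Shanghai")
print(x)  # [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0]

# Placeholder parameters: zero weights and a bias chosen to give about 25%.
weights = [0.0] * len(x)
bias = math.log(0.25 / 0.75)
print(predict_ctr(x, weights, bias))
```

One-hot vectors like this become very wide once real vocabularies are used, which is part of why CTR models are compute-heavy.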

  33. Thank You • www.ucloud.cn
