
  1. Optimization of Deep Learning Applications in Highly Heterogeneous Distributed System 515030910586 F1503024 Last Edited: May 17, 2018 Report Date: Apr. 20, 2018

  2. Outline ◦ Introduction ◦ Related Works ◦ Proposed Framework ◦ Experiments ◦ Next Step ◦ References

  3. Introduction ◦ Fast development of AI ◦ Social Hotspot: AlphaGo series [1] ◦ Government Support [2] ◦ SJTU AI Research [3] ◦ Business Interest: HomePod (Apple), Amazon Echo

  4. Introduction ‘Optimization of Deep Learning Applications in Highly Heterogeneous Distributed System’ ◦ The importance of Deep Learning in AI ◦ Computer Vision ◦ NLP ◦ Computer Architecture ◦ Network ◦ …… ◦ Requires a huge amount of computation ◦ Requires a huge amount of data

  5. Introduction Traditional DL Computation Platform ◦ Researcher: Desktop + high-performance GPU ◦ Company: Server cluster / data center ◦ Advantages: High performance density ◦ Disadvantages: No portability; high usage cost; high maintenance cost

  6. Introduction Traditional DL Computation Platform ◦ Commercial products: DL applications based on Cloud Computing ◦ Advantages: Low hardware requirement on the client side; advantages of Cloud Computing ◦ Disadvantages: Poor user experience (network latency); privacy issues

  7. Introduction Traditional DL Data Source ◦ Publicly available datasets ◦ Advantages: Convenient for comparing different algorithms; easy to obtain ◦ Disadvantages: Out of date; far from end users ◦ Private datasets (in companies, hospitals, etc.) ◦ Advantages: Large scale; close to production workflow ◦ Disadvantages: Not publicly available; limited research value

  8. Introduction New computational platform and data source: smart phones and IoT devices ◦ Increasing processing power of smart phones and smart devices. ◦ A giant data source formed by enormous numbers of IoT devices (smart phones). ◦ Low network latency within a LAN structure.

  9. Introduction New computational platform and data source: smart phones and IoT devices [figure]

  10. Introduction New Challenges ◦ As data producers, mobile and IoT devices are called end devices. 1. Enormous data generated by end devices, exceeding their computational capacity. 2. Limited power budget on end devices. 3. Limited internal storage for both program and model.

  11. Introduction Summary ◦ End devices (mobile phones and IoT devices) have the potential to perform DL applications, and many companies (Qualcomm, Apple, etc.) are working on it.

  12. Related Works ◦ Cloud Computing, Edge Computing, Fog Computing ◦ Internet of Things (IoT) ◦ Highly Heterogeneous Distributed System ◦ S. Teerapittayanon, et al. [6]

  13. Related Works Cloud Computing vs. Edge/Fog Computing [7] ◦ Features of Cloud Computing: ◦ High-speed interconnection between workers ◦ Virtualization and high scalability ◦ Service abstraction (IaaS / PaaS / SaaS) ◦ …… ◦ Motivation of Edge/Fog Computing: ◦ Serve smart devices and IoT devices ◦ Improve processing efficiency ◦ Improve Quality of Service (QoS)

  14. Related Works Internet of Things (IoT) [8] ◦ Many scenarios for IoT: ◦ Smart Home, Smart Retail, Smart City, ◦ Smart Agriculture, Smart Transportation, ◦ …… ◦ Typical IoT devices: ◦ Sensors ◦ Embedded communication devices ◦ Storage / computation middleware

  15. Related Works ‘Optimization of Deep Learning Applications in Highly Heterogeneous Distributed System’ ◦ Distributed system with Cloud, Edge, and End devices [7] 1. Highly heterogeneous: ◦ Optimization space both locally and globally 2. Controllable communication: ◦ Manage and exploit the communication latency 3. Towards ubiquitous computing: ◦ Smart collection of computational resources

  16. Related Works Distributed Deep Neural Network (DDNN) [6] (S. Teerapittayanon et al.) ◦ Advantages ◦ A novel framework for discussing DL applications on heterogeneous distributed systems. ◦ Reduces cloud workload via local exits. ◦ Disadvantages ◦ Lack of experiments on multiple devices ◦ Lack of discussion of communication latency ◦ Lack of general DL application test cases (only multi-view tracking) ◦ Lack of discussion of computing capabilities (all devices use BNN)

  17. Related Works Distributed Deep Neural Network (DDNN) [6] (S. Teerapittayanon et al.) [figure]
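
The local-exit idea above can be made concrete with a short sketch. The snippet below is a minimal, hypothetical PyTorch rendering, assuming the end-device network is split into shared `features` layers and a local `classifier` head, with a fixed softmax-confidence threshold deciding between exiting locally and forwarding intermediate features to the cloud; these names and the threshold are assumptions, not details from [6].

```python
# Minimal sketch of a DDNN-style local exit [6] for a single sample.
# Assumes `end_net` exposes `features` (shared lower layers) and
# `classifier` (local exit head); the 0.9 threshold is illustrative.
import torch

EXIT_THRESHOLD = 0.9

def infer(x, end_net, cloud_net):
    feats = end_net.features(x)               # runs on the end device
    logits = end_net.classifier(feats)        # local exit head
    conf, pred = torch.softmax(logits, dim=-1).max(dim=-1)
    if conf.item() >= EXIT_THRESHOLD:
        return pred.item()                    # local exit: cloud untouched
    # Low confidence: forward intermediate features, not raw data.
    return cloud_net(feats).argmax(dim=-1).item()
```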

  18. Related Works Summary ◦ Utilize the heterogeneous distributed framework (end devices + edge devices + cloud devices) to perform DL applications, building on and improving DDNN.

  19. Proposed Framework ‘Optimization of Deep Learning Applications in Highly Heterogeneous Distributed System’ ◦ Propose a framework with 3 types of heterogeneity: 1. Computing Node Heterogeneity 2. Neural Network Heterogeneity 3. Deep Learning Task Heterogeneity

  20. Proposed Framework ‘Optimization of Deep Learning Applications in Highly Heterogeneous Distributed System’ ◦ Optimization Targets 1. Each device has locally optimal performance: DL performance, power consumption, model size. 2. The overall system has globally optimal performance: response time, load balancing. 3. Achieve scalability, robustness, and failure recovery.

  21. Proposed Framework Computing Node Heterogeneity ◦ Cloud computing → Distribute workload to each node ◦ Benefits: ◦ Avoid long response delay between the end user and the cloud ◦ Avoid potential privacy leakage ◦ Make full use of nearby computing resources

  22. Proposed Framework Neural Network Heterogeneity ◦ Same NN structure on all nodes (as in DDNN) → Optimized NN for each node type ◦ Benefits: ◦ Choose a different NN structure for each node ◦ Make full use of available hardware resources ◦ Able to reach a local optimum for performance, speed, model size, power consumption, …… e.g., end devices use MobileNet [9] (resource-oriented); the cloud device uses ResNet [4] (performance-oriented)
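
As a hedged illustration of this slide, the sketch below selects an off-the-shelf torchvision model per node class; the "end"/"edge"/"cloud" labels and the choice of ResNet-18 for edge nodes are assumptions, not part of the proposed framework.

```python
# Sketch of neural network heterogeneity: load a network matched to the
# node's hardware class. Standard torchvision constructors; the class
# labels and the edge-node choice (ResNet-18) are illustrative.
import torchvision.models as models

def build_model(device_class, num_classes=1000):
    if device_class == "end":      # phones, IoT: resource-oriented [9]
        return models.mobilenet_v2(num_classes=num_classes)
    if device_class == "edge":     # gateway: a middle ground (assumption)
        return models.resnet18(num_classes=num_classes)
    return models.resnet50(num_classes=num_classes)   # cloud: accuracy [4]
```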

  23. Proposed Framework Deep Learning Task Heterogeneity ◦ Same task for all nodes → Different subtask for each node ◦ Benefits: ◦ Choose different DL tasks according to the hardware and the DNN loaded ◦ Consider the network latency between nodes ◦ Reach a global optimum for response time, overall performance, …… e.g., in some cases, giving the user the classification ‘cat’ with very short delay is better than returning ‘American Bobtail cat’ after a longer wait.
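
The ‘cat’ example can be sketched in a few lines: the end device's fast coarse answer is returned whenever the cloud's fine-grained answer would miss the latency budget. The function and its deadline handling are hypothetical, not from the slides.

```python
# Sketch of task heterogeneity: fast coarse answer vs. slower fine answer.
def respond(end_pred, cloud_pred=None, within_deadline=False):
    """end_pred: coarse label from the end device (e.g. 'cat');
    cloud_pred: fine-grained label from the cloud (e.g. 'American Bobtail')."""
    if cloud_pred is not None and within_deadline:
        return cloud_pred          # the refined answer arrived in time
    return end_pred                # otherwise keep the short-delay answer

print(respond("cat"))                              # -> cat
print(respond("cat", "American Bobtail", True))    # -> American Bobtail
```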

  24. Proposed Framework Scheduling Algorithm ◦ Baseline 1 – ‘End’ scheme ◦ All data are sent to end devices in sequential order ◦ Lowest precision but highest speed ◦ Does not utilize the distributed system

  25. Proposed Framework Scheduling Algorithm ◦ Baseline 1 – ‘End’ scheme ◦ Baseline 2 – ‘End-Cloud’ scheme ◦ Slower, but higher performance ◦ ‘Jam’ (congestion) on the cloud device

  26. Proposed Framework Scheduling Algorithm ◦ Baseline 1 – ‘End’ scheme ◦ Baseline 2 – ‘End-Cloud’ scheme ◦ Proposed – ‘Mapping’ scheme

  27. Proposed Framework Scheduling Algorithm ◦ Proposed – ‘Mapping’ scheme ◦ Find the ‘balance’ point of the distributed system ◦ High accuracy ◦ ‘Jam’ on the cloud is less likely to happen
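
One way to realize the ‘Mapping’ scheme, sketched under strong assumptions: per-node service times (including network latency) are known, and each request is assigned greedily to the node with the earliest estimated finish time, so the cloud only receives work when it actually wins. Node names and timings below are made up.

```python
# Greedy earliest-finish-time mapping: an illustrative 'Mapping' scheme.
import heapq

def map_requests(requests, service_time):
    """service_time: node name -> seconds per request (incl. network)."""
    heap = [(0.0, name) for name in service_time]   # (est. finish, node)
    heapq.heapify(heap)
    plan = []
    for req in requests:
        finish, node = heapq.heappop(heap)
        finish += service_time[node]                # queue req on this node
        plan.append((req, node))
        heapq.heappush(heap, (finish, node))
    return plan

# Six requests, two end devices, one fast-but-shared cloud node (made up).
print(map_requests(range(6), {"end1": 0.08, "end2": 0.08, "cloud": 0.05}))
```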

  28. Proposed Framework Privacy Protection ◦ Encryption module ◦ Encrypt on End/Edge devices ◦ Send only processed data
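
A minimal sketch of the encryption module, assuming the third-party `cryptography` package and a stand-in feature extractor: only processed, encrypted data ever leaves the end/edge device.

```python
# Sketch of the privacy module. Requires `pip install cryptography`;
# `extract_features` is a hypothetical stand-in for the local DNN layers.
from cryptography.fernet import Fernet

key = Fernet.generate_key()      # in practice, provisioned per device pair
cipher = Fernet(key)

def extract_features(raw_image: bytes) -> bytes:
    return raw_image[:64]        # placeholder for running local DNN layers

def prepare_payload(raw_image: bytes) -> bytes:
    features = extract_features(raw_image)
    return cipher.encrypt(features)   # the raw image is never transmitted
```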

  29. Proposed Framework Fault Tolerance ◦ Without / With device state monitoring [diagram: assignments of Dev1, Dev2, Dev3 over Round 1 and Round 2]

  30. Proposed Framework Fault Tolerance ◦ Without / With device state monitoring [diagram: assignments of Dev1, Dev2, Dev3 over Rounds 1–4, with the list of online devices per round]
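
The monitoring idea can be sketched with a simple heartbeat protocol (an assumption; the slides do not specify the mechanism): each round assigns tasks only to devices seen recently, so when Dev2 and Dev3 drop out, later rounds fall back to Dev1 instead of stalling.

```python
# Heartbeat-based device state monitoring (illustrative timeout and names).
import time

HEARTBEAT_TIMEOUT = 5.0
_last_seen = {}                          # device name -> last heartbeat time

def heartbeat(device):
    _last_seen[device] = time.time()

def online_devices():
    now = time.time()
    return sorted(d for d, t in _last_seen.items()
                  if now - t < HEARTBEAT_TIMEOUT)

def schedule_round(tasks):
    devs = online_devices()
    if not devs:
        return {}                        # no healthy device: hold the round
    return {t: devs[i % len(devs)] for i, t in enumerate(tasks)}

for d in ("Dev1", "Dev2", "Dev3"):       # Round 1: all devices report in
    heartbeat(d)
print(schedule_round(["t0", "t1", "t2"]))
```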
