Recent Trends in Computer Vision and Deep Learning Systems

  1. Recent Trends in Computer Vision and Deep Learning Systems Yangqing Jia Lead Researcher and Manager of AI Platform, Facebook

  2. Computer Vision

  3. AlexNet So it begins.

  4. VGGNet Punch it.

  5. GoogLeNet We must go deeper.

  6. ResNet And we took the word seriously

  7. ResNet And we took the word seriously

  8. ResNeXt We totally see it coming

  9. Pushing the Performance [Chart, values by model: ScSVM 28.2, AlexNet 16.4, VGGNet 7.3, GoogLeNet 6.7, ResNet 3.57, ResNeXt 3.03]

  10. Why is it challenging? Gradients, as one example. [Plot: curves labeled exploding, ideal, and vanishing; x-axis: depth, 1 to 15]
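
The effect named on this slide can be illustrated with a toy model. The following is a minimal sketch, assuming a chain of scalar linear layers that all share one weight (an illustration only, not a model from the talk): the gradient through `depth` layers is `weight ** depth`, so it explodes, stays at 1, or vanishes depending on whether the weight is above, at, or below 1. The function name `chain_gradient` and the weights 1.5, 1.0, and 0.5 are hypothetical choices.

```python
def chain_gradient(weight: float, depth: int) -> float:
    """Gradient of the output with respect to the input for a chain of
    `depth` scalar linear layers y = weight * x, via the chain rule."""
    grad = 1.0
    for _ in range(depth):
        grad *= weight  # each layer multiplies the gradient by its weight
    return grad


if __name__ == "__main__":
    # Depths match the x-axis ticks on the slide.
    for depth in (1, 3, 5, 7, 9, 11, 13, 15):
        print(
            depth,
            chain_gradient(1.5, depth),  # weight > 1: exploding
            chain_gradient(1.0, depth),  # weight = 1: ideal
            chain_gradient(0.5, depth),  # weight < 1: vanishing
        )
```

Real networks use matrices and nonlinearities rather than a single scalar, but the same multiplicative build-up across layers is what makes very deep models hard to train.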

  11. Deep Learning Systems

  12. "SAP" - Scalability

  13. Scalability Run fast, run far “How do I train on multiple GPUs and machines?” - Probably the most frequent question we got from Caffe users

  14. Scalability Run fast, run far 1.2 million = (# of images in ImageNet1K) = (# of new images @FB every 5 mins in 2013) = (# of AI jobs per month @FB)

  15. Scalability Run fast, run far L1 L2 L3 L3b L2b L1b U3 U2 U1

  16. Scalability Run fast, run far L1 L2 L3 L3b L2b L1b R3 R2 R1 U3 U2 U1

  17. Scalability Run fast, run far L1 L2 L3 L3b L2b L1b R3 R2 R1 U3 U2 U1 L1 L2 L3 L3b L2b L1b R3 R2 R1 U3 U2 U1

  18. Scalability Run fast, run far L1 L2 L3 L3b L2b L1b R3 U3 R2 U2 R1 U1 L1 L2 L3 L3b L2b L1b R3 U3 R2 U2 R1 U1

  19. The Return of MPI "I'm your father", said Allreduce. Allreduce: tree based, O(M log N); ring based, O(M); etc.
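
To make the ring-based cost on this slide concrete, here is a minimal single-process sketch of a ring allreduce (sum). It is an assumption-laden simulation, not the implementation used by MPI, NCCL, or Caffe2: workers are plain Python lists, "sending" is a list copy, and the names `ring_allreduce` and `exchange` are hypothetical. Each buffer of M elements is split into N chunks; a reduce-scatter phase of N - 1 steps is followed by an all-gather phase of N - 1 steps, so each worker sends fewer than 2M elements no matter how many workers there are.

```python
from typing import List, Tuple


def ring_allreduce(buffers: List[List[float]]) -> Tuple[List[List[float]], int]:
    """Simulate a ring allreduce (sum) over N workers in one process.

    Returns the per-worker results and the number of elements worker 0
    sent, which stays below 2 * M regardless of N.
    """
    n = len(buffers)
    m = len(buffers[0])
    assert all(len(b) == m for b in buffers), "all buffers must have equal length"
    data = [list(b) for b in buffers]  # work on copies, leave inputs untouched
    # Chunk c covers indices bounds[c] .. bounds[c + 1] - 1 of every buffer.
    bounds = [c * m // n for c in range(n + 1)]
    sent_by_worker_0 = 0

    def exchange(chunk_of, accumulate: bool) -> None:
        nonlocal sent_by_worker_0
        # Snapshot every outgoing message first so all sends are "simultaneous".
        outgoing = []
        for rank in range(n):
            c = chunk_of(rank)
            outgoing.append((c, data[rank][bounds[c]:bounds[c + 1]]))
        sent_by_worker_0 += len(outgoing[0][1])
        for rank, (c, payload) in enumerate(outgoing):
            dst = (rank + 1) % n  # right-hand neighbour on the ring
            for offset, value in enumerate(payload):
                i = bounds[c] + offset
                data[dst][i] = data[dst][i] + value if accumulate else value

    # Phase 1, reduce-scatter: afterwards worker r holds the full sum of chunk (r + 1) % n.
    for step in range(n - 1):
        exchange(lambda rank: (rank - step) % n, accumulate=True)
    # Phase 2, all-gather: pass the finished chunks around the ring.
    for step in range(n - 1):
        exchange(lambda rank: (rank + 1 - step) % n, accumulate=False)
    return data, sent_by_worker_0


if __name__ == "__main__":
    workers = [[float(r * 10 + i) for i in range(8)] for r in range(4)]
    result, sent = ring_allreduce(workers)
    print(result[0], sent)
```

The per-worker traffic depends on M only, which is why the ring variant is attractive when the gradient buffer is large and the number of workers grows.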

  20. And so we scale

  21. "SAP" - Arithmetics

  22. Quantized Computation Forget about float, the world is bigger. [Bit layouts: float, 8 exponent + 23 mantissa bits; fp16, 5 exponent + 10 mantissa bits; fixed16, 16 bits; fixed8, 8 bits]
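
As a rough illustration of what moving away from 32-bit float costs in precision, the sketch below rounds a value to fp16 with Python's standard `struct` module and quantizes a list of floats to signed 8-bit integers. The 8-bit scheme shown (one symmetric scale per list, values clamped to [-127, 127]) is an assumption chosen for simplicity; the slide does not say which fixed-point scheme is meant, and the function names are hypothetical.

```python
import struct
from typing import List, Tuple


def to_fp16_and_back(x: float) -> float:
    """Round x to IEEE 754 half precision (1 sign, 5 exponent, 10 mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_fixed8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric linear quantization to signed 8-bit integers in [-127, 127].

    Returns the integer codes and the scale such that value ~= code * scale.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_fixed8(codes: List[int], scale: float) -> List[float]:
    """Map 8-bit codes back to approximate real values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.1, -0.5, 0.25, 1.0]
    codes, scale = quantize_fixed8(weights)
    print(codes, scale)
    print(dequantize_fixed8(codes, scale))
    print(to_fp16_and_back(0.1))
```

With this scheme the round-trip error of each value is at most half of `scale`, so the usable precision depends directly on the largest magnitude in the list.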

  23. Why do we care? Battery life is life. [Chart, cost per operation, units not given: float add 0.9, float mul 4.0, fp16 add 0.4, fp16 mul 1.0, fixed16 add 0.05, fixed8 mul 0.2, fixed8 add 0.03]
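
The per-operation numbers on this slide can be combined into a back-of-the-envelope comparison. The sketch below assumes, as a simplification not stated in the talk, that one multiply-accumulate costs one multiply plus one add in the same format; real hardware often accumulates in a wider type. Only the formats for which the slide lists both an add and a mul are included, and the names `COST` and `mac_cost` are hypothetical.

```python
# Per-operation costs as listed on the slide (units are not given there).
COST = {
    "float": {"add": 0.9, "mul": 4.0},
    "fp16": {"add": 0.4, "mul": 1.0},
    "fixed8": {"add": 0.03, "mul": 0.2},
}


def mac_cost(fmt: str) -> float:
    """Cost of one multiply-accumulate, assumed to be one mul plus one add."""
    return COST[fmt]["mul"] + COST[fmt]["add"]


if __name__ == "__main__":
    baseline = mac_cost("float")
    for fmt in COST:
        print(fmt, mac_cost(fmt), baseline / mac_cost(fmt))
```

Under that assumption, a float multiply-accumulate costs about 3.5 times as much as an fp16 one and about 21 times as much as a fixed8 one.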

  24. How does it perform? Source: Nvidia https://devblogs.nvidia.com/parallelforall/mixed-precision-programming-cuda-8/

  25. Why does it matter for cars? 250 watts: 10 -> 20 TFlops. 10 watts: 0.7 -> 1.5 TFlops.

  26. "SAP" - Portability

  27. Portable System One software to rule them all, and... AI Math and Algorithms Deployment Platforms

  28. Portable System Cloud, Mobile, IoT, Cars, Drones, Coffee makers
      auto predictor = caffe2::Predictor(model_file)
      public class Predictor implements Model Caffe2ModelInterface;

  29. The Land of Deep Learning System Not as complex as a car, but still.
      Applications: Caffe, Torch, TF, etc...
      DataBases: LevelDB, RocksDB, Hadoop, Amazon S3, your old disk
      Core Math: Eigen, CuDNN, NNPack, THNN, MKL
      Comms: NCCL, MPI, ZeroMQ, Redis, ...
      Low Level: CUDA, OpenGL, OpenCL, Vulkan, ...
      Compilers

  30. Thank you! Recent Trends in Computer Vision and Deep Learning Systems Yangqing Jia
