
Scaling-Up Deep Learning for Autonomous Vehicles
Jose M. Alvarez, NVIDIA AI-Infra | San Jose 2019

AI-Infra team. One of our top goals: industry-grade deep learning to take AV perception DNNs into production, tested in multiple


  1. Accuracy vs Efficiency (for large datasets)

  2. Accuracy vs Efficiency. [Figure: 2013 / 2015 / 2016 milestones, training vs. testing]

  3. Accuracy vs Efficiency: Efficient Training of DNNs. Goal: maximize training resources while obtaining a deployment-friendly network.

  4. Over-parameterization

  5. Accuracy vs Efficiency. Capacity, non-linearity, number of parameters: stacking two 3x3 convolutions gives the same receptive field as one 5x5 convolution, with more non-linearity and fewer parameters.

  6. Accuracy vs Efficiency. [Figure: validation accuracy of a 3x3-based ConvNet (orange) and the equivalent 5x5-based ConvNet (blue)] https://blog.sicara.com/about-convolutional-layer-convolution-kernel-9a7325d34f7d

  7. Accuracy vs Efficiency. Capacity, non-linearity, number of parameters, FLOPS: an n x n filter can be decomposed into a [1 x n] followed by an [n x 1] filter, keeping the same receptive field while adding non-linearity and reducing parameters and FLOPS.

  8. Accuracy vs Efficiency: Filter Decompositions for Real-time Semantic Segmentation (sketched below). [Alvarez and Petersson], DecomposeMe: Simplifying ConvNets for End-to-End Learning, arXiv 2016; [Romera, Alvarez et al.], Efficient ConvNet for Real-Time Semantic Segmentation, IEEE-IV 2017, T-ITS 2018.
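As a rough illustration of the decomposition idea (a hand-written sketch in the spirit of DecomposeMe, not the paper's reference code; channel sizes and the mid-layer width are assumptions):

```python
import torch
import torch.nn as nn

# Sketch: replace a k x k convolution with a [1 x k] conv followed by
# a [k x 1] conv, with a non-linearity in between. Channel sizes and
# the mid width are illustrative assumptions.
class Decomposed1DBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, mid_ch=None):
        super().__init__()
        mid_ch = mid_ch or out_ch
        self.horizontal = nn.Conv2d(in_ch, mid_ch, (1, k), padding=(0, k // 2))
        self.act = nn.ReLU(inplace=True)
        self.vertical = nn.Conv2d(mid_ch, out_ch, (k, 1), padding=(k // 2, 0))

    def forward(self, x):
        return self.vertical(self.act(self.horizontal(x)))

# Weights grow as O(k) instead of O(k^2):
full = nn.Conv2d(64, 64, 3, padding=1)
dec = Decomposed1DBlock(64, 64, k=3)
print(sum(p.numel() for p in full.parameters()))  # 36928
print(sum(p.numel() for p in dec.parameters()))   # 24704
```

The saving grows with k (a 7x7 kernel shrinks from 49 to 14 weight positions per channel pair), which is why the decomposition pays off for real-time segmentation.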

  9. Accuracy vs Efficiency: Filter Decompositions for Real-time Semantic Segmentation. Cityscapes dataset (19 classes, 7 categories).

     Train mode    Pixel accuracy   Class IoU   Category IoU
     Scratch       94.7%            70.0%       86.0%
     Pre-trained   95.1%            71.5%       86.9%

     Forward-pass time, Cityscapes (19 classes):

     Device      Resolution   Time      FPS
     Tegra TX1   512x256      85 ms     11.8
     Tegra TX1   1024x512     310 ms    3.2
     Tegra TX1   2048x1024    1240 ms   0.8
     Titan X     512x256      8 ms      125.0
     Titan X     1024x512     24 ms     41.7
     Titan X     2048x1024    89 ms     11.2

     [Romera, Alvarez et al.], Efficient ConvNet for Real-Time Semantic Segmentation, IEEE-IV 2017, T-ITS 2018.

  10. Accuracy vs Efficiency. [Romera, Alvarez et al.], Efficient ConvNet for Real-Time Semantic Segmentation, IEEE-IV 2017, T-ITS 2018.

  11. Accuracy vs Efficiency: Efficient Training of DNNs. Goal: maximize training resources while obtaining a deployment-friendly network.

  12. Accuracy vs Efficiency: Efficient Training of DNNs. Goal: maximize training resources while obtaining a deployment-friendly network.

  13. Accuracy vs Efficiency: Common Approach. Train a large model (trading off accuracy against computational cost), prune it with regularization at the parameter level, optimize the promising model for a specific application and for specific hardware, then deploy. [Pipeline: TRAIN -> Prune / Optimize -> DEPLOY]

  14. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Instead of train-then-prune, train the large model and prune it jointly, then optimize for specific hardware and deploy. [Pipeline: Joint Train / Prune -> DEPLOY]

  15. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. [Figure: convolutional layer with a group of filters (5x1x3x3) marked as removed and the rest to be kept]

  16. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Common approach: [equation not captured in transcript]. [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  17. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Our approach (sketched below): [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.
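A minimal sketch of the joint train-and-prune idea via group sparsity, in the spirit of the NIPS 2016 paper above; the penalty weighting, optimizer handling, and pruning threshold are illustrative assumptions, not the paper's exact proximal formulation:

```python
import torch
import torch.nn as nn

# Group Lasso over each output filter: the penalty drives entire
# filters to zero during training so they can be removed afterwards.
def group_sparsity_penalty(model):
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            # One group per output filter: shape (out_ch, in_ch*k*k).
            groups = m.weight.flatten(start_dim=1)
            penalty = penalty + groups.norm(dim=1).sum()  # sum of L2 norms
    return penalty

model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(64, 64, 3, padding=1))
x, y = torch.randn(8, 3, 32, 32), torch.randn(8, 64, 32, 32)
task_loss = nn.functional.mse_loss(model(x), y)
loss = task_loss + 1e-4 * group_sparsity_penalty(model)  # lambda is a guess
loss.backward()
# After training, filters whose L2 norm falls below a threshold are pruned.
```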

  18. Classification Results

  19. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Quantitative results on the ImageNet dataset: 1.2 million training images and 50,000 validation images split into 1,000 categories; between 5,000 and 30,000 training images per class; no data augmentation other than random flips. [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  20. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Quantitative results on ImageNet: train an over-parameterized architecture with up to 768 neurons per layer (Dec8-768). [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  21. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Quantitative results on ImageNet. [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  22. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Quantitative results on the ICDAR character recognition dataset. [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  23. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Quantitative results on the ICDAR character recognition dataset: train an over-parameterized architecture with up to 512 neurons per layer (Dec3-512). [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  24. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. Quantitative results on the ICDAR character recognition dataset. [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  25. Accuracy vs Efficiency: Joint Training and Pruning of Deep Networks. [Figure: Dec8 architecture, layers Dec1 through Dec8-2 with skip connections and a final FC-1000 layer] [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  26. Accuracy vs Efficiency. [Bar chart: initial vs. learned number of neurons for each layer L1v through L8-2h of the Dec8 architecture; y-axis 0-600]

  27. Accuracy vs Efficiency. [Same chart: the learned neuron counts are substantially lower than the initial ones, with no drop in accuracy]

  28. KITTI Object Detection Results

  29. Accuracy vs Efficiency: Object Detection on KITTI. [Pipeline: TRAIN a promising model -> Prune / Optimize for a specific application]

  30. Accuracy vs Efficiency: Object Detection on KITTI. [Pipeline: the separate Train -> Prune / Optimize stages replaced by Joint Train / Pruning]

  31. Accuracy vs Efficiency: Compression-aware Training of DNNs. [Figure: convolutional layer with a group of filters (5x1x3x3) marked as removed and the rest to be kept] [Alvarez and Salzmann], Learning the number of neurons in Neural Nets, NIPS 2016; [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  32. Accuracy vs Efficiency: Compression-aware Training of DNNs. Uncorrelated filters should maximize the use of each parameter / kernel. [Figure: cross-correlation of Gabor filters]
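To make the intuition concrete, a small illustrative sketch (my own, not from the talk) that measures pairwise correlation between the filters of a conv layer; high off-diagonal values indicate redundant filters:

```python
import torch
import torch.nn as nn

# Normalized cross-correlation between all pairs of filters in a
# conv layer; filters that duplicate each other correlate strongly.
def filter_correlation(conv):
    w = conv.weight.flatten(start_dim=1)          # (out_ch, in_ch*k*k)
    w = w - w.mean(dim=1, keepdim=True)
    w = w / (w.norm(dim=1, keepdim=True) + 1e-8)  # unit-norm rows
    return w @ w.t()                              # (out_ch, out_ch)

conv = nn.Conv2d(3, 64, 3)
corr = filter_correlation(conv)
off_diag = corr - torch.eye(64)
print(off_diag.abs().max())   # worst-case pairwise correlation
print(off_diag.abs().mean())  # average redundancy across filter pairs
```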

  33. Accuracy vs Efficiency: Compression-aware Training of DNNs. Weak points of decorrelation-based approaches: significantly larger training time (prohibitive at large scale); usually drops in accuracy; orthogonal filters are difficult to compress in post-processing. [P. Rodríguez, J. Gonzàlez, G. Cucurull, J. M. Gonfaus, X. Roca], Regularizing CNNs with Locally Constrained Decorrelations, ICLR 2017.

  34. Accuracy vs Efficiency: Compression-aware Training of DNNs. [Figure: convolutional layer with a group of filters (5x1x3x3) marked as removed and the rest to be kept]

  35. Accuracy vs Efficiency: Compression-aware Training of DNNs. [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  36. Accuracy vs Efficiency: Compression-aware Training of DNNs. Our approach (sketched below): [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.
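A minimal sketch of the compression-aware idea, assuming a differentiable nuclear-norm penalty on each reshaped weight matrix so layers stay low-rank and can be compressed by truncated SVD after training; the paper's exact formulation (a proximal treatment) differs, and the lambda here is a guess:

```python
import torch
import torch.nn as nn

# Penalize the nuclear norm (sum of singular values) of each conv
# layer's weight matrix, which encourages low-rank, compressible layers.
def nuclear_norm_penalty(model):
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            mat = m.weight.flatten(start_dim=1)  # (out_ch, in_ch*k*k)
            penalty = penalty + torch.linalg.svdvals(mat).sum()
    return penalty

model = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(32, 32, 3, padding=1))
x, y = torch.randn(4, 3, 32, 32), torch.randn(4, 32, 32, 32)
loss = nn.functional.mse_loss(model(x), y) \
     + 1e-3 * nuclear_norm_penalty(model)        # lambda is a guess
loss.backward()  # svdvals is differentiable, so this trains end to end
```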

  37. Classification Results

  38. Accuracy vs Efficiency: Compression-aware Training of DNNs. Quantitative results on ImageNet using ResNet-50*. [Figure: modified bottleneck block: 256-d input -> 1x1, 64 -> ReLU -> 3x1, 64 -> ReLU -> 1x3, 64 -> ReLU -> 1x1, 256] [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.
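In code, the decomposed bottleneck from the figure looks roughly like this (a sketch assuming the channel sizes shown on the slide; batch norm and the residual connection are omitted for brevity):

```python
import torch.nn as nn

# Decomposed ResNet-50 bottleneck: the standard 3x3 conv is replaced
# by a 3x1 + 1x3 pair; 256-d in/out, 64 channels inside, per the slide.
bottleneck = nn.Sequential(
    nn.Conv2d(256, 64, 1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, (3, 1), padding=(1, 0)), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, (1, 3), padding=(0, 1)), nn.ReLU(inplace=True),
    nn.Conv2d(64, 256, 1),
)
```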

  39. Training Efficiency (side benefit)

  40. Accuracy vs Efficiency: Compression-aware Training of DNNs. [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  41. Accuracy vs Efficiency: Compression-aware Training of DNNs. [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  42. Accuracy vs Efficiency: Compression-aware Training of DNNs. Up to 70% training speed-up (similar accuracy). [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  43. Accuracy vs Efficiency: Compression-aware Training of DNNs. Is over-parameterization needed? Observations: additional training parameters initially help the optimizer; small models are explicitly constrained, so the same training regime may not be a fair comparison; other optimizers give slightly better results when training compact networks from scratch. [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  44. Accuracy vs Efficiency: Compression-aware Training of DNNs. As the number of parameters decreases, the number of layers increases; data movement may become more significant than the compute savings. [Alvarez and Salzmann], Compression-aware training of DNN, NIPS 2017.

  45. Accuracy vs Efficiency (more on over-parameterization)

  46. Accuracy vs Efficiency. The same levers, with one more: capacity, non-linearity, number of parameters, and now number of layers, all for the same receptive field.

  47. ExpandNets: Exploiting Linear Redundancy

  48. ExpandNets. [Figure: compact network (11x11 and 5x5 convolutions, 224x224 input) alongside an expanded counterpart built from 3x3, 64-channel convolutions] [Guo, Alvarez, Salzmann], ExpandNets: Exploiting Linear Redundancy to Train Small Networks, arXiv 2018.

  49. ExpandNets. [Guo, Alvarez, Salzmann], ExpandNets: Exploiting Linear Redundancy to Train Small Networks, arXiv 2018.

  50. ExpandNets. The expand-then-contract mechanics are sketched below. [Guo, Alvarez, Salzmann], ExpandNets: Exploiting Linear Redundancy to Train Small Networks, arXiv 2018.
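A minimal sketch of the expand/contract mechanics, assuming the purely linear 1x1 -> 3x3 -> 1x1 expansion with no biases (channel sizes are made up; the paper covers additional expansion types): train the expanded stack, then algebraically fold it back into a single 3x3 conv for deployment.

```python
import torch
import torch.nn as nn

# Expanded stand-in for one 3x3 conv: no non-linearities between the
# three convs, so the whole stack is still a single linear map.
c_in, c_mid, c_out = 16, 64, 32
expand = nn.Sequential(
    nn.Conv2d(c_in, c_mid, 1, bias=False),              # expand channels
    nn.Conv2d(c_mid, c_mid, 3, padding=1, bias=False),  # spatial 3x3
    nn.Conv2d(c_mid, c_out, 1, bias=False),             # project back
)

# Contract: compose the three linear maps into one 3x3 kernel.
w1 = expand[0].weight.squeeze(-1).squeeze(-1)  # (c_mid, c_in)
w2 = expand[1].weight                          # (c_mid, c_mid, 3, 3)
w3 = expand[2].weight.squeeze(-1).squeeze(-1)  # (c_out, c_mid)
w = torch.einsum('ob,bauv,ai->oiuv', w3, w2, w1)

contracted = nn.Conv2d(c_in, c_out, 3, padding=1, bias=False)
with torch.no_grad():
    contracted.weight.copy_(w)

# The contracted conv computes the same function as the expanded stack.
x = torch.randn(2, c_in, 28, 28)
print(torch.allclose(expand(x), contracted(x), atol=1e-4))  # True
```

Because the expansion is linear, the deployed network keeps exactly the compact architecture and cost; only training sees the extra parameters.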

  51. Classification Results

  52. ExpandNets. [Figure: small AlexNet-style network: input (3 channels) -> Conv1 (64) -> Conv2 (192) -> Conv3 (384) -> Conv4 (N) -> Conv5 (N)]

      ImageNet accuracy:

      Width     Baseline   Expanded
      N = 128   46.72%     49.66%
      N = 256   54.08%     55.46%
      N = 512   58.35%     58.75%

      [Guo, Alvarez, Salzmann], ExpandNets: Exploiting Linear Redundancy to Train Small Networks, arXiv 2018.

  53. ExpandNets. MobileNetV2: The Next Generation of On-Device Computer Vision Networks.

      Model                  Top-1     Top-5
      MobileNetV2            70.78%    91.47%
      MobileNetV2-expanded   74.85%    92.15%

      [Guo, Alvarez, Salzmann], ExpandNets: Exploiting Linear Redundancy to Train Small Networks, arXiv 2018.

  54. ExpandNets. MobileNetV2: The Next Generation of On-Device Computer Vision Networks.

      Model                                    Top-1     Top-5
      MobileNetV2                              70.78%    91.47%
      MobileNetV2-expanded                     74.85%    92.15%
      MobileNetV2-expanded-nonlinear           74.17%    91.61%
      MobileNetV2-expanded (nonlinear init)    75.46%    92.58%

      [Guo, Alvarez, Salzmann], ExpandNets: Exploiting Linear Redundancy to Train Small Networks, arXiv 2018.

  55. ExpandNets beyond classification
