APPLICATIONS OF DEEP LEARNING TO COMPUTER VISION AND COMPUTER GRAPHICS Mike Houston
Practical DEEP LEARNING Examples: Image Classification, Object Detection, Localization, Speech Recognition, Speech Translation, Action Recognition, Scene Understanding, Natural Language Processing, Breast Cancer Cell Mitosis Detection, Pedestrian Detection, Traffic Sign Recognition, Volumetric Brain Image Segmentation
What is DEEP LEARNING? Input Result
Deep Learning Framework. Training: Forward Propagation predicts “turtle”; Backward Propagation computes a weight update to nudge from “turtle” towards “dog”; Repeat. Inference: the Trained Neural Net Model outputs “cat”. [Diagram labels: Tree, Cat, Dog]
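The training and inference loop on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the method of any framework named in the talk: it assumes a single-layer softmax classifier, a made-up six-point dataset, and hypothetical names (`LABELS`, `DATA`, `forward`, `train`, `predict`) chosen only for this example.

```python
import math

LABELS = ["cat", "dog", "turtle"]  # hypothetical classes, echoing the slide

# Made-up 2-feature inputs paired with the index of their correct label.
DATA = [
    ([1.0, 0.0], 0),
    ([0.9, 0.1], 0),
    ([0.0, 1.0], 1),
    ([0.1, 0.9], 1),
    ([-1.0, -1.0], 2),
    ([-0.9, -1.1], 2),
]


def forward(weights, biases, x):
    """Forward propagation: linear scores followed by softmax probabilities."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(data, n_classes, n_features, lr=0.5, epochs=200):
    """Repeat forward propagation, backward propagation, and weight update."""
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):  # "Repeat"
        for x, target in data:
            probs = forward(weights, biases, x)
            for k in range(n_classes):
                # Backward propagation: the gradient of the cross-entropy loss
                # with respect to score k is (probability - one-hot target).
                grad = probs[k] - (1.0 if k == target else 0.0)
                # Weight update: nudge the scores towards the correct label.
                for j in range(n_features):
                    weights[k][j] -= lr * grad * x[j]
                biases[k] -= lr * grad
    return weights, biases


def predict(weights, biases, x):
    """Inference: forward pass only, returning the most probable label."""
    probs = forward(weights, biases, x)
    return LABELS[probs.index(max(probs))]


weights, biases = train(DATA, n_classes=3, n_features=2)
print(predict(weights, biases, [0.95, 0.05]))
```

Training touches the weights on every example, while inference runs only the forward pass on a fixed model; the deep networks discussed in the talk follow the same loop with many more layers and parameters.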
Making a vehicle classifier PICKUP SUV SUV
The “Big Bang” In Deep Learning: Algorithms, Data, Compute Capability
Medical Research: Detecting Mitosis in Breast Cancer Cells (IDSIA); Predicting the Toxicity of New Drugs (Johannes Kepler University); Understanding Gene Mutation to Prevent Disease (University of Toronto)
Captioning “Automated Image Captioning with ConvNets and Recurrent Nets” — Andrej Karpathy, Fei-Fei Li
Why Are GPUs Good for Deep Learning? Neural Networks and GPUs: Inherently Parallel, Matrix Operations, FLOPS. GPUs deliver: same or better prediction accuracy, faster results, smaller footprint, lower power. [Chart, 2010 to 2014: values 110, 60, 4, 0, 0 and 28%, 26%, 16%, 12%, 7%; example labels: person, bird, dog, frog, chair]
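The "inherently parallel" and "matrix operations" points can be made concrete with a small sketch. It assumes a plain linear layer without bias or activation; the helper names (`matmul`, `layer_one_sample`) and the numbers are hypothetical. Applying the layer to a whole batch is a single matrix product, and every output element is an independent dot product, which is the kind of work a GPU can spread across many threads.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def layer_one_sample(x, w):
    """Outputs of a linear layer for a single input vector x."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x))
            for j in range(len(w[0]))]


# Hypothetical batch of 3 inputs with 2 features each.
batch = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
# Hypothetical weights mapping 2 input features to 2 outputs.
w = [[0.5, -1.0], [2.0, 0.0]]

# Processing the samples one at a time ...
per_sample = [layer_one_sample(x, w) for x in batch]
# ... gives the same result as one matrix product over the whole batch.
batched = matmul(batch, w)

print(batched == per_sample)
```

No output element depends on any other, so all of them can in principle be computed at the same time; this pure-Python version computes them sequentially and only illustrates the structure of the work.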
GPU-Accelerated Deep Learning START-UPS
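GPU-Accelerated Deep Learning Frameworks (http://developer.nvidia.com/deeplearning)
CAFFE: Domain: Deep Learning Framework; cuDNN: R2; Multi-GPU: In Progress; License: BSD-2; Interface(s): Text-based definition files, Python, MATLAB
TORCH: Domain: Scientific Computing Framework; cuDNN: R2; Multi-GPU: In Progress; License: GPL; Interface(s): Python, Lua, MATLAB
THEANO: Domain: Math Expression Compiler; cuDNN: R2; Multi-GPU: In Progress; License: BSD; Interface(s): Python
CUDA-CONVNET2: Domain: Deep Learning Application; cuDNN: --; License: Apache 2.0; Interface(s): C++
KALDI: Domain: Speech Recognition Toolkit; cuDNN: --; Multi-GPU: (nnet2); Multi-CPU: (nnet2); License: Apache 2.0; Interface(s): C++, Shell scripts
[Remaining Multi-GPU, Multi-CPU, and Embedded (TK1) marks not recoverable]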
DIGITS
DIGITS: Deep Learning GPU Training System for Data Scientists. User Interface: Process Data, Configure DNN, Monitor Progress, Visualize Layers. Frameworks: Theano, Torch, Caffe. Libraries: cuDNN, cuBLAS, CUDA. GPU HW: GPU, Multi-GPU, GPU Cluster, Cloud. Features: Design DNNs, Visualize activations, Manage multiple trainings
DIGITS Process Data Configure DNN Monitor Progress Visualize Layers Test Image
DIGITS DEVBOX World’s fastest GPU Max GPU out of a plug Multi-GPU training & inference
Production Automotive Pipeline
TEGRA X1 CLASSIFICATION Performance: AlexNet, IMAGES / SECOND, Tegra K1 vs. Tegra X1 [bar chart, axis 0 to 100; bar values not recoverable]
Project DAVE (DARPA Autonomous Vehicle): DNN-based self-driving robot; Training data by human driver; No hand-coded CV algorithms. [Chart: IMAGENET CHALLENGE, Accuracy %, 2010 to 2014; labels: DNN 84%, CV 74%, 72%]
TRAINING DATA 225K Images
DAVE IN ACTION
Active Learning: Data Scientist; DIGITS (Train): Solver, Network, Dashboard; Model; Drive PX (Deploy) in Vehicle: Classification, Detection, Segmentation
Deep Learning and Vision/Graphics
Street Number Detection [Goodfellow 2014]
Object Classification [Krizhevsky 2012]
Image Retrieval [Krizhevsky 2012]
Pose Estimation [Toshev, Szegedy 2014]
Object Detection [Huval et al. 2015]
Face Recognition [Taigman et al. 2014]
Action Recognition [Simonyan et al. 2014]
Playing Games [Mnih et al. 2013]
Semantic Segmentation [Farabet et al. 2013]
Super Resolution [Dong et al. 2014]
Ray Tracing – Monte Carlo Denoising [Kalantari et al. 2015]
“Dreams” [Mordvintsev et al. 2015]
“Dreams” [Mordvintsev et al. 2015]