
STARTING A DEEP LEARNING PROJECT
Bryan Catanzaro, 11 May 2017

  1. STARTING A DEEP LEARNING PROJECT Bryan Catanzaro, 11 May 2017

  2. Supervised learning (learning from tagged data)
     Input X maps to an output tag Y: for example, X is an image and Y is a yes/no answer to “Is it a coffee mug?”
     Learning X ➡ Y mappings is hugely useful. (Andrew Ng)

  3. EXAMPLE X -> Y MAPPINGS
     Image classification
     Speech recognition
     Speech synthesis
     Recommendation systems
     Natural language understanding
     Most surprisingly: these mappings can generalize.

  4. DEEP NEURAL NET
     A very simple universal approximator: one layer applies a linear transform followed by a nonlinearity, and stacking such layers gives a deep neural net.
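As an illustration of that building block, here is a minimal NumPy sketch of a network with one hidden layer (a linear transform, a ReLU nonlinearity, then a linear readout). The layer sizes and random untrained weights are assumptions chosen purely for illustration, not anything from the talk.

    import numpy as np

    # Sketch of "one layer + nonlinearity"; sizes (3 inputs, 16 hidden, 1 output)
    # and random untrained weights are illustrative assumptions.
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(16, 3)), np.zeros(16)
    W2, b2 = rng.normal(size=(1, 16)), np.zeros(1)

    def relu(z):
        return np.maximum(z, 0.0)        # the nonlinearity

    def forward(x):
        hidden = relu(W1 @ x + b1)       # one layer: linear transform + nonlinearity
        return W2 @ hidden + b2          # linear readout; stacking layers makes it deep

    print(forward(np.array([0.5, -1.0, 2.0])))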

  5. WHY DEEP LEARNING
     [Chart: accuracy vs. data and compute; deep learning keeps improving with scale while many previous methods plateau.]
     Scale matters: millions to billions of parameters.
     Data matters: regularize using more data.
     Productivity matters: it’s simple, so we can make tools.
     Deep learning is most useful for large problems.

  6. SUCCESSFUL DEEP LEARNING
     What characteristics do successful deep learning applications share?
     How should you prepare to use deep learning?

  7. 1. DATASET
     Deep learning requires large datasets; without one, it isn’t likely to succeed.
     What is large? Typically thousands to millions of examples.
     Labels are a huge hassle: getting someone to decide the “right” answer can be hard.
     If a dataset requires skilled labor to produce labels, this limits scale.

  8. 2. REUSE
     Making deep neural networks is expensive: computation, data acquisition, engineering time.
     So deep learning makes sense if a model can be reused.
     If small changes to the problem invalidate the model, it’s not a good fit.
     For example, if a model has to be retrained for each level of a videogame, it is hard to deploy.

  9. 3. FEASIBILITY
     Can you describe the problem as an X -> Y mapping (speech recognition, image classification)?
     Or does it require “strong AI”, a “magic goes here” step?
     What level of accuracy is required for the application to succeed?

  10. 4. PAYOFF
     Deep learning generally needs a big payoff to justify the investment.
     If you had an oracle for this problem, what would change? What is the speed-of-light opportunity?
     Self-driving cars: $T market opportunity. Cafeteria menu predictor: ???

  11. 5. FAULT TOLERANCE
     Every statistical method fails at times.
     Plan for occasional failure: guard rails, heuristics.
     “All models are wrong, but some are useful.” -- George Box

  12. TRAINING, VALIDATION, TEST SET
     Dataset division, rule of thumb: 0.6 / 0.2 / 0.2 for train / validation / test.
     Training set: bang on this data all you want.
     Validation set: check periodically during training (are we overfitting?).
     Test set: evaluate progress rarely (weekly).
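A minimal sketch of that 0.6 / 0.2 / 0.2 division using scikit-learn; the synthetic data and the two-step split are assumptions for illustration.

    import numpy as np
    from sklearn.model_selection import train_test_split

    # Stand-in dataset: 1,000 examples, 20 features, binary labels.
    X = np.random.randn(1000, 20)
    y = np.random.randint(0, 2, size=1000)

    # Hold out 20% as the test set, then split the remainder 75/25,
    # which yields the 0.6 / 0.2 / 0.2 rule of thumb from the slide.
    X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

    print(len(X_train), len(X_val), len(X_test))   # 600 200 200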

  13. OVERFITTING
     Neural networks can memorize details of the training set, which can lead to loss of generalization; in other words, failure.
     It often looks like this: training loss goes down while validation loss goes up.
     Your network is probably too big, or your data is too small.
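To see the symptom concretely, here is a small self-contained illustration (not from the talk) of the “network too big, data too small” case: a large MLP fit to pure-noise labels memorizes the training set while doing no better than chance on held-out data.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Illustrative assumption: random features with random labels, so there is
    # nothing real to learn and high training accuracy can only be memorization.
    rng = np.random.default_rng(0)
    X_train, y_train = rng.normal(size=(50, 20)), rng.integers(0, 2, 50)
    X_val,   y_val   = rng.normal(size=(200, 20)), rng.integers(0, 2, 200)

    # A network that is far too big for 50 examples.
    net = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=2000, random_state=0)
    net.fit(X_train, y_train)

    print("train accuracy:", net.score(X_train, y_train))   # close to 1.0 (memorized)
    print("val accuracy:  ", net.score(X_val, y_val))        # close to 0.5 (chance)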

  14. MAKING YOUR TEST SET
     Garbage in, garbage out: there are many choices when partitioning a dataset into train, validation, and test, and it is critical to do this right.
     The training set should be representative of the test set, but cannot include it.
     If you don’t set up your test set to prove generalization, you will get overfitting.
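One common way to keep a test set representative, sketched below under the assumption of an imbalanced binary label: a stratified split preserves the class balance in both partitions, and the partitions are disjoint, so no test example leaks into training.

    import numpy as np
    from sklearn.model_selection import train_test_split

    # Stand-in data with an imbalanced label (about 10% positives).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))
    y = (rng.random(1000) < 0.1).astype(int)

    # stratify=y keeps the positive rate the same in train and test.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)

    print("train positive rate:", y_train.mean())
    print("test positive rate: ", y_test.mean())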

  15. THE EXTERNAL TRAINING LOOP
     What happens if you peek at your test set too often? Survival of the fittest, evolution: overfitting, like it or not.
     This is why competitions have rules: you can’t test your model too often.
     Use a hierarchy of test sets.

  16. PRECISION & RECALL
     For a binary classifier:
     Precision: when you said you found it, how often were you right?
     Recall: what percentage of the true things did you find?
     There is a fundamental tradeoff: if you only care about precision, always say no; if you only care about recall, always say yes.
     Area under the curve summarizes the tradeoff.
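A small sketch of these definitions on made-up labels and scores; the numbers are arbitrary assumptions, and moving the threshold trades precision against recall exactly as described above.

    import numpy as np
    from sklearn.metrics import precision_score, recall_score, roc_auc_score

    # Hypothetical ground truth and classifier scores for a binary detector.
    y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    scores = np.array([0.9, 0.4, 0.7, 0.2, 0.6, 0.1, 0.8, 0.3])
    y_pred = (scores >= 0.5).astype(int)   # the threshold sets the tradeoff

    # Precision: of the items we flagged, how many were right?
    print("precision:", precision_score(y_true, y_pred))
    # Recall: of the true items, how many did we find?
    print("recall:   ", recall_score(y_true, y_pred))
    # Sweeping the threshold traces a curve; its area summarizes the tradeoff.
    print("AUC:      ", roc_auc_score(y_true, scores))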

  17. ACCURACY
     Before starting a project, figure out what success looks like; this can be surprisingly hard to pin down.
     There are lots of ways to measure it: area under the curve, specificity/sensitivity, mean average precision.
     First thing to do: get a test set and figure out how to measure accuracy.

  18. CAN SOMETHING SIMPLER WORK?
     [Workflow: make a test set, then try a simple model, then try deep learning.]
     After you have a test set and an accuracy metric, try a very simple model (linear regression, logistic regression, random forest).
     This gives you a baseline on which to improve; if the simple thing is already good enough, you’ve won!
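A baseline sketch along those lines: a logistic regression on a stock scikit-learn dataset (a stand-in for your own data) produces the number a deep model would have to beat.

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Stand-in dataset; substitute your own X, y and accuracy metric.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # A very simple model: logistic regression.
    baseline = LogisticRegression(max_iter=5000)
    baseline.fit(X_train, y_train)

    # If this number already meets your accuracy target, you've won.
    print("baseline test accuracy:", baseline.score(X_test, y_test))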

  19. DATA CULTURE
     Data is often undervalued; we need to preserve as much of it as possible, because years down the road it could be useful.
     All of us should think of ways to build up data. Labels are especially useful (feedback, sorting, etc.).
     It would be great for Nvidia to have centralized data stores so others could experiment.

  20. HOW DO I GET STARTED?
     Take a machine learning class! (DLI)
     Learn a framework: TensorFlow, Torch, Caffe, CNTK, MXNet, Keras, Theano.
     Brainstorm useful X -> Y mappings.
     Bias towards action: experiment! Try it out!
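As a starting point with one of those frameworks, here is a minimal Keras (TensorFlow) example that learns a toy X -> Y mapping; the synthetic data and tiny architecture are assumptions purely for illustration.

    import numpy as np
    from tensorflow import keras

    # Toy X -> Y mapping: the label is 1 when the features sum to a positive number.
    X = np.random.randn(1000, 8).astype("float32")
    y = (X.sum(axis=1) > 0).astype("float32")

    # A tiny two-layer network.
    model = keras.Sequential([
        keras.layers.Input(shape=(8,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(X, y, epochs=5, batch_size=32, verbose=0)

    loss, acc = model.evaluate(X, y, verbose=0)
    print("training accuracy:", acc)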

  21. BIAS TOWARDS EXPERIMENTATION
     Deep learning is an empirical field: it’s hard to know whether an idea will work. Some, surprisingly, do; some, surprisingly, don’t.
     Once you have convinced yourself you’ve framed the problem appropriately, start trying things out.

  22. CONCLUSION
     We’re all excited about deep learning. As you think about your own DL applications, consider:
     1. Dataset
     2. Reuse
     3. Feasibility
     4. Payoff
     5. Fault Tolerance
     Make a test set and figure out how to measure accuracy. Experiment! Try it out!
