CS 4803 / 7643: Deep Learning


CS 4803 / 7643: Deep Learning
Website: http://www.cc.gatech.edu/classes/AY2020/cs7643_spring/
Piazza: https://piazza.com/gatech/spring2020/cs4803dl7643a/
Staff mailing list (personal questions): cs4803-7643-staff@lists.gatech.edu
Gradescope:


  1. Deep Learning = Hierarchical Compositionality
  Low-Level Feature → Mid-Level Feature → High-Level Feature → Trainable Classifier → “car”
  Feature visualization of a convolutional net trained on ImageNet, from [Zeiler & Fergus 2013]. Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
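
A sketch of hierarchical compositionality as plain function composition, assuming toy scalar features and hand-picked weights (every function and constant here is hypothetical, for illustration only):

```python
import math

def low_level(x):
    # Edge-like response: a simple non-linearity (ReLU).
    return max(0.0, x)

def mid_level(h):
    # Combines low-level responses into a part-like feature.
    return max(0.0, 2.0 * h - 1.0)

def high_level(h):
    # Squashes part responses into an object-like feature.
    return math.tanh(h)

def classifier(h):
    # Trainable linear layer + sigmoid: P("car" | input).
    return 1.0 / (1.0 + math.exp(-(3.0 * h - 1.0)))

def predict(x):
    # The whole model is a cascade: classifier(high(mid(low(x)))).
    return classifier(high_level(mid_level(low_level(x))))
```

Deep learning replaces the hand-picked constants above with parameters that are all learned jointly from data.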

  2. So what is Deep (Machine) Learning? A few different ideas:
  • (Hierarchical) Compositionality – Cascade of non-linear transformations; multiple layers of representations
  • End-to-End Learning – Learning (goal-driven) representations; learning feature extraction
  • Distributed Representations – No single neuron “encodes” everything; groups of neurons work together
  (C) Dhruv Batra & Zsolt Kira 43

  3. Traditional Machine Learning
  VISION: hand-crafted features (SIFT/HOG, fixed) → your favorite classifier (learned) → “car”
  SPEECH: hand-crafted features (MFCC, fixed) → your favorite classifier (learned) → \ˈdēp\
  NLP: “This burrito place is yummy and fun!” → hand-crafted features (Bag-of-words, fixed) → your favorite classifier (learned) → “+”
  44 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  4. Feature Engineering SIFT Spin Images HoG Textons and many many more…. (C) Dhruv Batra & Zsolt Kira 45

  5. Traditional Machine Learning (more accurately)
  VISION: SIFT/HOG (fixed) → K-Means/pooling (unsupervised, “learned”) → classifier (supervised) → “car”
  SPEECH: MFCC (fixed) → Mixture of Gaussians (unsupervised, “learned”) → classifier (supervised) → \ˈdēp\
  NLP: “This burrito place is yummy and fun!” → Parse Tree Syntactic (fixed) → n-grams (unsupervised, “learned”) → classifier (supervised) → “+”
  (C) Dhruv Batra & Zsolt Kira 46 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  6. Deep Learning = End-to-End Learning
  VISION: SIFT/HOG (fixed) → K-Means/pooling (unsupervised, “learned”) → classifier (supervised) → “car”
  SPEECH: MFCC (fixed) → Mixture of Gaussians (unsupervised, “learned”) → classifier (supervised) → \ˈdēp\
  NLP: “This burrito place is yummy and fun!” → Parse Tree Syntactic (fixed) → n-grams (unsupervised, “learned”) → classifier (supervised) → “+”
  (C) Dhruv Batra & Zsolt Kira 47 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  7. “Shallow” vs Deep Learning
  • “Shallow” models: hand-crafted Feature Extractor (fixed) → “Simple” Trainable Classifier (learned)
  • Deep models: Trainable Feature-Transform / Classifier → Trainable Feature-Transform / Classifier → Trainable Feature-Transform / Classifier (Learned Internal Representations)
  Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  8. So what is Deep (Machine) Learning? A few different ideas:
  • (Hierarchical) Compositionality – Cascade of non-linear transformations; multiple layers of representations
  • End-to-End Learning – Learning (goal-driven) representations; learning feature extraction
  • Distributed Representations – No single neuron “encodes” everything; groups of neurons work together
  (C) Dhruv Batra & Zsolt Kira 49

  9. Distributed Representations Toy Example • Local vs Distributed (C) Dhruv Batra & Zsolt Kira 50 Slide Credit: Moontae Lee

  10. Distributed Representations Toy Example • Can we interpret each dimension? (C) Dhruv Batra & Zsolt Kira 51 Slide Credit: Moontae Lee

  11. Ideal Feature Extractor (C) Dhruv Batra & Zsolt Kira 52

  12. Power of distributed representations! Local Distributed (C) Dhruv Batra & Zsolt Kira 53 Slide Credit: Moontae Lee
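
One way to see the point of this slide: with n neurons, a local (one-hot) code can represent only n concepts, while a binary distributed code can in principle index up to 2^n activation patterns. A toy sketch (the shapes and attribute dimensions are made up):

```python
# Local (one-hot) code: one neuron per concept.
local = {
    "circle":   (1, 0, 0),
    "square":   (0, 1, 0),
    "triangle": (0, 0, 1),
}

# Distributed code: concepts share dimensions
# (hypothetical attributes: "has corners", "has four sides").
distributed = {
    "circle":   (0, 0),
    "square":   (1, 1),
    "triangle": (1, 0),
}

def capacity_local(n_neurons):
    return n_neurons        # one concept per neuron

def capacity_distributed(n_neurons):
    return 2 ** n_neurons   # one concept per activation pattern
```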

  13. Power of distributed representations! • United States:Dollar :: Mexico:? (C) Dhruv Batra & Zsolt Kira 54 Slide Credit: Moontae Lee
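
Analogies like the one above are commonly answered with embedding arithmetic: vec(Dollar) − vec(United States) + vec(Mexico) should land near vec(Peso). A toy sketch with made-up 2-D vectors (real systems use word2vec-style embeddings with hundreds of dimensions):

```python
# Hypothetical 2-D embeddings: dim 0 ~ "is the USA", dim 1 ~ "is a currency".
vecs = {
    "united_states": (1.0, 0.0),
    "dollar":        (1.0, 1.0),
    "mexico":        (0.0, 0.0),
    "peso":          (0.0, 1.0),
}

def analogy(a, b, c):
    # Compute b - a + c, then take the nearest remaining word.
    query = tuple(vb - va + vc
                  for va, vb, vc in zip(vecs[a], vecs[b], vecs[c]))
    candidates = [w for w in vecs if w not in (a, b, c)]
    return min(candidates,
               key=lambda w: sum((x - y) ** 2
                                 for x, y in zip(vecs[w], query)))

print(analogy("united_states", "dollar", "mexico"))  # peso
```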

  14. ThisPlusThat.me – Image Credit: http://insightdatascience.com/blog/thisplusthat_a_search_engine_that_lets_you_add_words_as_vectors.html (C) Dhruv Batra & Zsolt Kira 55

  15. So what is Deep (Machine) Learning? A few different ideas:
  • (Hierarchical) Compositionality – Cascade of non-linear transformations; multiple layers of representations
  • End-to-End Learning – Learning (goal-driven) representations; learning feature extraction
  • Distributed Representations – No single neuron “encodes” everything; groups of neurons work together
  (C) Dhruv Batra & Zsolt Kira 56

  16. Benefits of Deep/Representation Learning • (Usually) Better Performance – “Because gradient descent is better than you” – Yann LeCun • New domains without “experts” – RGBD – Multi-spectral data – Gene-expression data – Unclear how to hand-engineer (C) Dhruv Batra & Zsolt Kira 57

  17. “Expert” intuitions can be misleading • “Every time I fire a linguist, the performance of our speech recognition system goes up” – Fred Jelinek, IBM ’98 (C) Dhruv Batra & Zsolt Kira 58

  18. Benefits of Deep/Representation Learning • Modularity! • Plug and play architectures! (C) Dhruv Batra & Zsolt Kira 59

  19. Differentiable Computation Graph Any DAG of differentiable modules is allowed! (C) Dhruv Batra & Zsolt Kira 60 Slide Credit: Marc'Aurelio Ranzato

  20. (C) Dhruv Batra & Zsolt Kira 61

  21. Logistic Regression as a Cascade Given a library of simple functions, compose them into a complicated function (C) Dhruv Batra & Zsolt Kira 62 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  22. Logistic Regression as a Cascade Given a library of simple functions, compose them into a complicated function (C) Dhruv Batra & Zsolt Kira 63 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
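
The slide's idea in code: given a library of simple functions (dot product, sigmoid, negative log-likelihood), logistic regression is just their composition. A minimal sketch in plain Python:

```python
import math

# Library of simple functions.
def linear(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def nll(p, y):
    # Negative log-likelihood for a binary label y in {0, 1}.
    return -math.log(p if y == 1 else 1.0 - p)

# Composed into a more complicated function: the logistic-regression loss.
def loss(w, b, x, y):
    return nll(sigmoid(linear(w, b, x)), y)
```

For example, with w = [0.5, -0.3], b = 0.1, and x = [1.0, 2.0], the score z is 0, so p = 0.5 and the loss is log 2 for either label.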

  23. Key Computation: Forward-Prop (C) Dhruv Batra & Zsolt Kira 64 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  24. Key Computation: Back-Prop (C) Dhruv Batra & Zsolt Kira 65 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
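
Back-prop runs the same cascade in reverse, multiplying local gradients via the chain rule. For a sigmoid followed by negative log-likelihood, the gradient at the score simplifies to the well-known p − y; a hand-derived sketch (the result is standard, the variable names are mine):

```python
import math

def forward_backward(w, b, x, y):
    # Forward-prop: compute and cache intermediate values.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    loss = -math.log(p if y == 1 else 1.0 - p)
    # Back-prop: dL/dz = p - y for sigmoid + NLL, then the
    # chain rule gives the parameter gradients.
    dz = p - y
    dw = [dz * xi for xi in x]
    db = dz
    return loss, dw, db
```

A finite-difference check (perturb each parameter by ±ε and compare the loss change against the analytic gradient) is the usual way to validate such hand-written gradients.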

  25. Differentiable Computation Graph Any DAG of differentiable modules is allowed! (C) Dhruv Batra & Zsolt Kira 66 Slide Credit: Marc'Aurelio Ranzato

  26. Visual Dialog Model #1 Late Fusion Encoder Slide Credit: Abhishek Das


  34. Yes it works, but how? (C) Dhruv Batra & Zsolt Kira 89

  35. Outline • What is Deep Learning, the field, about? • What is this class about? • What to expect? – Logistics • FAQ (C) Dhruv Batra & Zsolt Kira 90


  37. What is this class about? (C) Dhruv Batra & Zsolt Kira 92

  38. What is this class about? • Introduction to Deep Learning • Goal: – After finishing this class, you should be ready to get started on your first DL research project. • Convolutional Neural Networks (CNNs) • Recurrent Neural Networks (RNNs) • Deep Reinforcement Learning • Generative Models (VAEs, GANs) • Target Audience: – Senior undergrads, MS-ML, and new PhD students • Note: Materials largely follow those developed by Dhruv Batra but with slight modifications (C) Dhruv Batra & Zsolt Kira 93

  39. What this class is NOT • NOT the target audience: – Advanced grad students already working in ML/DL areas – People looking to understand the latest and greatest cutting-edge research (e.g. GANs, AlphaGo, etc.) – Undergraduate/Masters students looking to graduate with a DL class on their resume. • NOT the goal: – Teaching a toolkit. “Intro to TensorFlow/PyTorch” – Intro to Machine Learning (C) Dhruv Batra & Zsolt Kira 94

  40. Caveat • This is an ADVANCED Machine Learning class – This should NOT be your first introduction to ML – You will need a formal class; not just self-reading/Coursera – Taking these concurrently does not count! – If you took CS 7641/ISYE 6740/CSE 6740 @GT, you’re in the right place – If you took an equivalent class elsewhere, see the list of topics taught in CS 7641 to be sure. (C) Dhruv Batra & Zsolt Kira 95

  41. Prerequisites • Intro Machine Learning – Classifiers, regressors, loss functions, MLE, MAP • Linear Algebra – Matrix multiplication, eigenvalues, positive semi-definiteness… • Calculus – Multi-variate gradients, hessians, jacobians… If you do not have these prerequisites, consider dropping! • This is for your benefit, as well as the benefit of others (C) Dhruv Batra & Zsolt Kira 96

  42. Prerequisites • Intro Machine Learning – Classifiers, regressors, loss functions, MLE, MAP • Linear Algebra – Matrix multiplication, eigenvalues, positive semi-definiteness… • Calculus – Multi-variate gradients, hessians, jacobians… (C) Dhruv Batra & Zsolt Kira 97

  43. Prerequisites • Intro Machine Learning – Classifiers, regressors, loss functions, MLE, MAP • Linear Algebra – Matrix multiplication, eigenvalues, positive semi-definiteness… • Calculus – Multi-variate gradients, hessians, jacobians… • Programming! – Homeworks will require Python, C++! – Libraries/Frameworks: PyTorch – HW1 (pure python + PyTorch), HW2-4 (PyTorch) – Your language of choice for project (C) Dhruv Batra & Zsolt Kira 98

  44. Course Information • Instructor: Zsolt Kira – zkira@gatech – Location: 222 CCB • I will always be available; just contact me or come to office hours • My job is to: – Teach the course such that you learn a lot – Provide any support needed towards that – Have fun and develop a passion for these topics (C) Dhruv Batra and Zsolt Kira 99

  45. Course Information • Instructor: Zsolt Kira – zkira@gatech – Location: CODA room S1181B • Incoming Ph.D. students: Zubair Irshad, Ben Wilson, James Smith (C) Dhruv Batra & Zsolt Kira 100

  46. Current TAs
  • Sameer Dharur – MS-CS student – https://www.linkedin.com/in/sameerdharur/
  • Rahul Duggal – 2nd year CS PhD student – http://www.rahulduggal.com/
  • Patrick Grady – 2nd year Robotics PhD student – https://www.linkedin.com/in/patrick-grady
  • Jiachen Yang – 2nd year MSCSE student – https://www.cc.gatech.edu/~jyang462/
  • Anishi Mehta – 2nd year ML PhD – https://www.linkedin.com/in/anishimehta
  • Yinquan Lu – MSCS student – https://www.cc.gatech.edu/~jyang462/
  More TAs coming soon! (C) Dhruv Batra & Zsolt Kira 101

  47. Organization & Deliverables • PS0 (2%) + 4 homeworks (78%) – PS0 is a warm-up, graded pass/fail – Do it! – In general, PS/HWs are a mix of theory and implementation – First real one goes out next week • Start early, Start early, Start early! • Final project (20%) – Projects done in groups of 3-4 • (Bonus) Class Participation (up to 3%) – Top contributors to discussions (mainly on Piazza) – Ask questions, answer questions (C) Dhruv Batra & Zsolt Kira 102

  48. New Element: FB Co-Teaching! • Several elements including: – Guest Lectures – 6 in-class lectures by FB • Data wrangling • Embeddings and world2vec • Self-attention and transformers • Language modeling and translation • Large-scale systems • Fairness, privacy, ethics – Assignments – Volunteers developing some new elements for assignments – Project ideas – Instructors will provide ideas for real-world projects and possible (surrogate/public) data sources that mirror some of the challenges they are working on (C) Dhruv Batra & Zsolt Kira 103

  49. Late Days • “Free” Late Days – 7 late days for the semester • Use for HWs • Cannot use for project related deadlines – After free late days are used up: • 25% penalty for each late day (C) Dhruv Batra & Zsolt Kira 104

  50. PS0 • Out today; due 01/14 – Available on website (will show up on Canvas today) • Grading: pass/fail – <=80% means that you might not be prepared for the class – Consider dropping or talk to me if that’s the case! • Topics – Probability, calculus, convexity, proving things (C) Dhruv Batra & Zsolt Kira 105

  51. Project • Goal – Chance to try Deep Learning – Encouraged to apply it to your research (computer vision, NLP, robotics, …) – Must be done this semester. – Can combine with other classes with separate thrusts • get permission from both instructors; delineate the different parts – Extra credit for shooting for a publication – Teams of 3-4 people • Undergraduates and graduates on separate teams • Contributions of each member must be explained and cannot just be report writing, etc. • Main categories – Application/Survey • Compare a bunch of existing algorithms on a new application domain of your interest – Formulation/Development • Formulate a new model or algorithm for a new or old problem – Theory • Theoretically analyze an existing algorithm (C) Dhruv Batra & Zsolt Kira 106

  52. Computing • Major bottleneck – GPUs • Options – Your own / group / advisor’s resources – Google Cloud credits • $50 in credits to every registered student, courtesy of Google – Google Colaboratory allows free TPU access!! • https://colab.research.google.com/notebooks/welcome.ipynb – Minsky cluster in IC (C) Dhruv Batra & Zsolt Kira 107
