

  1. CS 4803 / 7643: Deep Learning
  Topics:
  – Backpropagation
  – Vector/Matrix/Tensor math
  – Deriving vectorized gradients for ReLU
  Zsolt Kira, Georgia Tech

  2. Administrivia • PS1/HW1 out • Start thinking about project topics/teams (C) Dhruv Batra & Zsolt Kira 2

  3. Do the Readings! (C) Dhruv Batra & Zsolt Kira 3

  4. Recap from last time (C) Dhruv Batra & Zsolt Kira 4

  5. Gradient Descent Pseudocode
  for i in {0, …, num_epochs}:
      for x, y in data:
          ŷ = f_w(x)                 # forward pass: model prediction
          L = loss(ŷ, y)             # compute loss
          compute ∂L/∂w              # backward pass: gradient wrt parameters
          w := w − α ∂L/∂w           # update parameters with learning rate α
  Some design decisions:
  • How many examples to use to calculate the gradient per iteration?
  • What should alpha (the learning rate) be?
  • Should it be constant throughout?
  • How many epochs to run?
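
A minimal runnable version of this loop in Python/NumPy. The slide leaves the model and loss unspecified, so a linear model with squared-error loss is assumed here purely for illustration:

```python
import numpy as np

def sgd(data, w, alpha=0.01, num_epochs=10):
    """Plain stochastic gradient descent: one example per update."""
    for epoch in range(num_epochs):
        for x, y in data:
            y_hat = w @ x                   # forward pass (assumed linear model)
            dL_dyhat = 2.0 * (y_hat - y)    # local gradient of squared-error loss
            dL_dw = dL_dyhat * x            # chain rule: d(w @ x)/dw = x
            w = w - alpha * dL_dw           # parameter update
    return w

# Usage: recover y = 2*x0 - x1 from noise-free samples
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
Y = X @ np.array([2.0, -1.0])
print(sgd(list(zip(X, Y)), w=np.zeros(2)))  # approaches [2, -1]
```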

  6. How to Simplify?
  • Calculating gradients for large functions is complicated.
  • Step 1: Decompose the function and compute local gradients for each part!
  • Step 2: Apply a generic algorithm that computes gradients locally and uses the chain rule to propagate them across the computation graph (sketched in code below).
  (C) Dhruv Batra & Zsolt Kira 6
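
As a concrete instance of these two steps, here is a hand-decomposition of the sigmoid σ(x) = 1 / (1 + e^(−x)), a hypothetical example chosen because every piece has a trivial local gradient:

```python
import math

def sigmoid_forward_backward(x):
    # Step 1: decompose into primitive ops, each with an easy local gradient
    a = -x              # da/dx = -1
    b = math.exp(a)     # db/da = exp(a)
    c = 1.0 + b         # dc/db = 1
    f = 1.0 / c         # df/dc = -1/c^2
    # Step 2: chain the local gradients backward through the graph
    df_dc = -1.0 / c**2
    df_db = df_dc * 1.0
    df_da = df_db * math.exp(a)
    df_dx = df_da * -1.0
    return f, df_dx

f, g = sigmoid_forward_backward(0.5)
print(f, g)           # sigmoid value and its gradient
print(f * (1 - f))    # matches the closed form sigma(x) * (1 - sigma(x))
```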

  7. Computational Graph Any DAG of differentiable modules is allowed! (C) Dhruv Batra & Zsolt Kira 7 Slide Credit: Marc'Aurelio Ranzato

  8. Key Computation: Forward-Prop (C) Dhruv Batra & Zsolt Kira 8 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  9. Key Computation: Back-Prop (C) Dhruv Batra & Zsolt Kira 9 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
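
The original slides show this as a diagram of modules passing values forward and gradients backward; a minimal sketch of the same pattern in NumPy (the Linear/ReLU modules here are illustrative assumptions, not taken from the deck):

```python
import numpy as np

class Linear:
    def __init__(self, W):
        self.W = W
    def forward(self, x):           # forward-prop: compute output, cache input
        self.x = x
        return self.W @ x
    def backward(self, grad_out):   # back-prop: local gradient times upstream gradient
        self.dW = np.outer(grad_out, self.x)   # gradient wrt parameters
        return self.W.T @ grad_out             # gradient wrt input, passed downstream

class ReLU:
    def forward(self, x):
        self.x = x
        return np.maximum(0, x)
    def backward(self, grad_out):
        return grad_out * (self.x > 0)  # routes gradient only where input > 0

# Forward pass runs modules left-to-right; backward runs them in reverse.
rng = np.random.default_rng(0)
layers = [Linear(rng.normal(size=(3, 4))), ReLU(), Linear(rng.normal(size=(1, 3)))]
h = rng.normal(size=4)
for layer in layers:
    h = layer.forward(h)
grad = np.ones(1)                   # pretend dLoss/dOutput = 1
for layer in reversed(layers):
    grad = layer.backward(grad)
```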

  10. Neural Network Training • Step 1: Compute Loss on mini-batch [F-Pass] (C) Dhruv Batra & Zsolt Kira 10 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  13. Neural Network Training
  • Step 1: Compute Loss on mini-batch [F-Pass]
  • Step 2: Compute gradients wrt parameters [B-Pass]
  (C) Dhruv Batra & Zsolt Kira 13 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun

  16. Neural Network Training
  • Step 1: Compute Loss on mini-batch [F-Pass]
  • Step 2: Compute gradients wrt parameters [B-Pass]
  • Step 3: Use gradient to update parameters
  (C) Dhruv Batra & Zsolt Kira 16 Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
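
These three steps map one-to-one onto a modern framework's training step; a minimal PyTorch-style sketch, where the model, data, and hyperparameters are all placeholder choices:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(16, 10)        # one mini-batch of placeholder data
y = torch.randn(16, 1)

loss = loss_fn(model(x), y)    # Step 1: compute loss on mini-batch [F-Pass]
opt.zero_grad()
loss.backward()                # Step 2: compute gradients wrt parameters [B-Pass]
opt.step()                     # Step 3: use gradients to update parameters
```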

  17. Backpropagation: a simple example
  e.g. x = -2, y = 5, z = -4
  Chain rule: downstream gradient = upstream gradient × local gradient
  Want: the gradient of the output with respect to x, y, and z
  17 Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
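
The extracted slide omits the function being differentiated; assuming the standard CS 231n example f(x, y, z) = (x + y) · z, which matches the values above, the chain-rule arithmetic works out as:

```python
# Forward pass
x, y, z = -2.0, 5.0, -4.0
q = x + y              # q = 3
f = q * z              # f = -12

# Backward pass: downstream = upstream * local gradient
df_df = 1.0                 # gradient at the output
df_dq = df_df * z           # local gradient of f = q*z wrt q is z  -> -4
df_dz = df_df * q           # local gradient wrt z is q             ->  3
df_dx = df_dq * 1.0         # local gradient of q = x + y is 1      -> -4
df_dy = df_dq * 1.0         #                                       -> -4
print(df_dx, df_dy, df_dz)  # -4.0 -4.0 3.0
```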

  18. Patterns in backward flow
  • add gate: gradient distributor
  • max gate: gradient router
  • mul gate: gradient switcher
  Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
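
A small sketch making each pattern concrete (function names are illustrative): the add gate copies the upstream gradient to both inputs, the max gate routes it to the winning input only, and the mul gate scales each input's gradient by the other input's value:

```python
def add_backward(grad_out):
    # distributor: both inputs receive the upstream gradient unchanged
    return grad_out, grad_out

def max_backward(grad_out, a, b):
    # router: only the larger input receives the gradient
    return (grad_out, 0.0) if a > b else (0.0, grad_out)

def mul_backward(grad_out, a, b):
    # switcher: each input's gradient is scaled by the *other* input
    return grad_out * b, grad_out * a

print(add_backward(2.0))            # (2.0, 2.0)
print(max_backward(2.0, 3.0, 1.0))  # (2.0, 0.0)
print(mul_backward(2.0, 3.0, 1.0))  # (2.0, 6.0)
```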

  19. Summary
  • We will have a composed non-linear function as our model
  – Several portions will have parameters
  • We will use (stochastic/mini-batch) gradient descent with a loss function to define our objective
  • Rather than analytically deriving gradients for a complex function, we will modularize computation
  – Backpropagation = Gradient Descent + Chain Rule
  • Now:
  – Work through the mathematical view
  – Vectors, matrices, and tensors
  – Next time: Can the computer do this for us automatically?
  • Read:
  – https://explained.ai/matrix-calculus/index.html
  – https://www.cc.gatech.edu/classes/AY2020/cs7643_fall/slides/L5_gradients_notes.pdf
  (C) Dhruv Batra & Zsolt Kira 19

  20. Matrix/Vector Derivatives Notation
  • Read:
  – https://explained.ai/matrix-calculus/index.html
  – https://www.cc.gatech.edu/classes/AY2020/cs7643_fall/slides/L5_gradients_notes.pdf
  • Matrix/Vector Derivatives Notation
  • Vector Derivative Example
  • Extension to Tensors
  • Chain Rule: Composite Functions
  – Scalar Case
  – Vector Case
  – Jacobian view
  – Graphical view
  – Tensors
  • Logistic Regression Derivatives
  (C) Dhruv Batra & Zsolt Kira 20
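
As a preview of the Jacobian view applied to the deck's ReLU topic: because ReLU acts elementwise, its Jacobian is diagonal, so the vectorized backward pass reduces to a mask. A minimal NumPy sketch (names and shapes are illustrative):

```python
import numpy as np

def relu_forward(X):
    return np.maximum(0, X)

def relu_backward(dOut, X):
    # The elementwise Jacobian is diagonal (1 where X > 0, else 0),
    # so multiplying by it collapses to an elementwise mask.
    return dOut * (X > 0)

X = np.array([[1.0, -2.0], [-0.5, 3.0]])
dOut = np.ones_like(X)             # pretend upstream gradient of all ones
print(relu_backward(dOut, X))      # [[1. 0.] [0. 1.]]
```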

  21. (C) Dhruv Batra & Zsolt Kira 21
