CS 4803 / 7643: Deep Learning Topics: – Specifying Layers – Forward & Backward autodifferentiation – (Beginning of) Convolutional neural networks Zsolt Kira Georgia Tech
Projects • We will release a set of project ideas for those who are having trouble coming up with them – NEW: Facebook will develop a set of project ideas as well, with some of their researchers involved – Should be released early/mid Feb. (C) Dhruv Batra & Zsolt Kira 2
Types of Projects • Application – Image to cooking recipe – Voice cloning – Etc. – You need to modify/implement something; it is not enough to just take code and run it • Rigorous empirical report – Have a hypothesis or something you're trying to test – Perform rigorous experiments with many (reasonable) conditions that would support/refute your hypothesis • Reproduction of a paper – Not just running online code! • Theory • Novel method – (not necessary for the project, but good if your goal is publishing!)
Recap from last time
Modularized implementation: forward / backward API. Example computational graph: z = x * y (x, y, z are scalars) Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
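The forward/backward API for the z = x * y graph above can be sketched as a small gate class; this is a minimal illustration, not the course's staff code:

```python
# Minimal sketch of the modularized forward/backward API for a
# multiply gate z = x * y with scalar inputs.
class MultiplyGate:
    def forward(self, x, y):
        # Cache the inputs; they are needed to compute local gradients.
        self.x, self.y = x, y
        return x * y

    def backward(self, dz):
        # Chain rule: dL/dx = dL/dz * y, dL/dy = dL/dz * x.
        dx = dz * self.y
        dy = dz * self.x
        return dx, dy

gate = MultiplyGate()
z = gate.forward(3.0, -4.0)   # z = -12.0
dx, dy = gate.backward(2.0)   # upstream gradient dL/dz = 2.0
```

During training, a framework calls `forward` on every gate in topological order and `backward` in reverse order, passing each gate the upstream gradient.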
Example: Forward mode AD. Computational graph with +, sin(·), and × nodes over inputs x1, x2. https://www.cc.gatech.edu/classes/AY2020/cs7643_spring/slides/autodiff_forward_reverse.pdf
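The graph's nodes (+, sin, ×) suggest a function such as f(x1, x2) = x1·x2 + sin(x1); assuming that example (the exact function is in the linked notes), forward-mode AD can be sketched with dual numbers, where each value carries its derivative along:

```python
import math

# Forward-mode AD sketch using dual numbers: each quantity is a pair
# (value, derivative), and every operation propagates both.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)
    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def dsin(d):
    # sin node: d(sin u)/dx = cos(u) * u'
    return Dual(math.sin(d.val), math.cos(d.val) * d.dot)

def f(x1, x2):
    # Assumed example function: f = x1 * x2 + sin(x1)
    return x1 * x2 + dsin(x1)

# Seeding x1's derivative to 1 makes one forward pass compute df/dx1.
x1, x2 = Dual(2.0, 1.0), Dual(3.0, 0.0)
out = f(x1, x2)   # out.dot holds df/dx1 = x2 + cos(x1)
```

Note the cost: one forward sweep yields the derivative with respect to a single input, so n inputs require n sweeps.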
Example: Reverse mode AD. The same computational graph (+, sin(·), ×) over inputs x1, x2, with gradients propagated from the output backwards. https://www.cc.gatech.edu/classes/AY2020/cs7643_spring/slides/autodiff_forward_reverse.pdf
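Reverse mode works the graph in the opposite direction: one forward pass records intermediate values, then a single backward sweep delivers the gradient with respect to every input at once. A hand-unrolled sketch for the same assumed function f(x1, x2) = x1·x2 + sin(x1):

```python
import math

# Reverse-mode AD sketch, with the tape unrolled by hand.
def f_and_grads(x1, x2):
    # Forward pass: compute and record intermediates.
    a = x1 * x2        # multiply node
    b = math.sin(x1)   # sin node
    y = a + b          # add node (the output)
    # Backward pass, starting from dy/dy = 1:
    da, db = 1.0, 1.0          # add node passes the gradient through
    dx1 = da * x2              # multiply node: d(a)/d(x1) = x2
    dx2 = da * x1              # multiply node: d(a)/d(x2) = x1
    dx1 += db * math.cos(x1)   # sin node: d(b)/d(x1) = cos(x1); gradients sum
    return y, dx1, dx2

y, dx1, dx2 = f_and_grads(2.0, 3.0)
```

This is why reverse mode (backpropagation) is the default for deep learning: losses are scalar outputs of functions with millions of inputs, and one backward sweep gives all the gradients.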
Fully Connected Layer Example: 200x200 image, 40K hidden units -> ~2B parameters! - Spatial correlation is local - This wastes resources, and we don't have enough training samples anyway. Slide Credit: Marc'Aurelio Ranzato
Locally Connected Layer Example: 200x200 image, 40K hidden units, "filter" size 10x10 -> 4M parameters. Note: this parameterization is good when the input image is registered (e.g., face recognition). Slide Credit: Marc'Aurelio Ranzato
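The parameter counts on these two slides follow from simple arithmetic; as a sanity check (the slides' "~2B" presumably rounds up the 1.6B weight count once biases and channels are included):

```python
# Back-of-the-envelope parameter counts from the slides.
# Fully connected: each of 40K hidden units connects to all 200*200 pixels.
fc_params = 200 * 200 * 40_000       # 1.6 billion weights (~2B on the slide)

# Locally connected: each hidden unit sees only one 10x10 patch.
local_params = 40_000 * 10 * 10      # 4 million weights
```

Sharing each 10x10 filter across all locations (the convolutional layer, next slide) shrinks this further to just 100 weights per filter.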
Locally Connected Layer STATIONARITY? Statistics are similar at all locations Slide Credit: Marc'Aurelio Ranzato
Convolutional Layer Share the same parameters across different locations (assuming the input is stationary): convolutions with learned kernels Slide Credit: Marc'Aurelio Ranzato
How to implement this (and efficiently)?
Convolutions for mathematicians • An operation on two functions to produce a third function • E.g. an input and a kernel (weighting function) (C) Peter Anderson 16
Convolutions! • Math vs. CS vs. programming viewpoints
Convolutions for mathematicians • One dimension: y(t) = Σ_a x(a) w(t − a) • Two dimensions: Y(i, j) = Σ_m Σ_n X(m, n) W(i − m, j − n)
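The one-dimensional sum y(t) = Σ_a x(a) w(t − a) translates directly into a double loop; a minimal sketch of the "full" discrete convolution:

```python
# Direct implementation of the 1D discrete convolution
# y[t] = sum_a x[a] * w[t - a]  ("full" mode: every overlap counted).
def conv1d_full(x, w):
    n, m = len(x), len(w)
    y = [0.0] * (n + m - 1)
    for t in range(len(y)):
        for a in range(n):
            if 0 <= t - a < m:          # only index w where it is defined
                y[t] += x[a] * w[t - a]
    return y

y = conv1d_full([1, 2, 3], [0, 1, 0.5])
```

Note the flipped index w(t − a): as a slides forward, the kernel is read backwards. This flip is exactly what distinguishes convolution from cross-correlation.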
Convolutions for programmers • Iterate over the kernel instead of the image • Implement cross-correlation instead of convolution • Later - implementation as matrix multiplication
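The first two bullets can be sketched together: loop over the (small) kernel rather than the image, and skip the kernel flip (cross-correlation), which is what most deep learning libraries actually compute:

```python
import numpy as np

# Programmer's view: iterate over the kernel, not the image, and implement
# cross-correlation (no kernel flip). Each kernel entry contributes one
# shifted, scaled copy of the image to the output.
def cross_correlate2d_valid(x, w):
    H, W = x.shape
    kh, kw = w.shape
    y = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(kh):                  # kh*kw iterations total,
        for j in range(kw):              # regardless of image size
            y += w[i, j] * x[i:i + y.shape[0], j:j + y.shape[1]]
    return y

x = np.arange(16, dtype=float).reshape(4, 4)
w = np.array([[1.0, 0.0], [0.0, 1.0]])
y = cross_correlate2d_valid(x, w)
```

Since the learned kernel's orientation is arbitrary, dropping the flip changes nothing about what the network can represent.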
Discrete Convolution! • Very similar to correlation, but associative (Figure: 1D convolution and 2D convolution with a filter)
A note on sizes: convolving an N×N input with an m×m filter in 'valid' mode gives an (N−m+1)×(N−m+1) output. MATLAB to the rescue! • conv2(x, w, 'valid')
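The same size rule can be checked in one line with NumPy's `np.convolve` (the 1D analogue of MATLAB's `conv2` with the `'valid'` flag):

```python
import numpy as np

# 'valid' convolution: input length N, filter length m -> output length N - m + 1.
N, m = 7, 3
x = np.ones(N)
w = np.ones(m)
y = np.convolve(x, w, mode="valid")   # each output sums m ones -> 3.0
```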
Convolutional Layer (animation over several slides: the learned kernel sliding across the input) Slide Credit: Marc'Aurelio Ranzato
Convolution Explained • http://setosa.io/ev/image-kernels/ • https://github.com/bruckner/deepViz
Convolutional Layer Learn multiple filters. E.g.: 200x200 image, 100 filters, filter size 10x10 -> 10K parameters Slide Credit: Marc'Aurelio Ranzato
Convolution Layer: a 32x32x3 image -> preserve spatial structure (height 32, width 32, depth 3) Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
Convolution Layer: 32x32x3 image, 5x5x3 filter. Convolve the filter with the image, i.e. "slide over the image spatially, computing dot products" Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
Convolution Layer: filters always extend the full depth of the input volume. 32x32x3 image, 5x5x3 filter. Convolve the filter with the image, i.e. "slide over the image spatially, computing dot products" Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
Convolution Layer: 32x32x3 image, 5x5x3 filter. 1 number: the result of taking a dot product between the filter and a small 5x5x3 chunk of the image (i.e. a 5*5*3 = 75-dimensional dot product + bias) Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
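One output value of the layer is exactly this 75-dimensional dot product plus a bias; a sketch with random data (the window position and bias value are arbitrary):

```python
import numpy as np

# One conv-layer output: dot product between a 5x5x3 filter and the
# 5x5x3 image chunk under it, plus a bias (75 multiplies + 1 add).
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))
filt = rng.standard_normal((5, 5, 3))
bias = 0.1   # arbitrary bias for illustration

chunk = image[0:5, 0:5, :]           # top-left 5x5x3 window
out = np.sum(chunk * filt) + bias    # == filt.ravel() @ chunk.ravel() + bias
```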
Convolution Layer: convolve (slide) the 5x5x3 filter over all spatial locations of the 32x32x3 image, producing a 28x28x1 activation map Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
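Repeating that dot product at every spatial position gives the full activation map; a naive (slow, but clear) sketch showing where the 28 comes from:

```python
import numpy as np

# Sliding one kxk filter over an HxW image gives an
# (H - k + 1) x (W - k + 1) activation map: 32 - 5 + 1 = 28.
def conv_single_filter(image, filt, bias=0.0):
    H, W, _ = image.shape
    k = filt.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+k, j:j+k, :] * filt) + bias
    return out

image = np.zeros((32, 32, 3))
filt = np.ones((5, 5, 3))
amap = conv_single_filter(image, filt)   # shape (28, 28)
```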
Convolution Layer: consider a second, green 5x5x3 filter; convolving it over all spatial locations gives a second 28x28x1 activation map Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
Convolution Layer: for example, if we had 6 5x5 filters, we'd get 6 separate activation maps. We stack these up to get a "new image" of size 28x28x6! Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
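Stacking the six maps can be sketched directly (again the naive loop version, for clarity rather than speed):

```python
import numpy as np

# Six independent 5x5x3 filters over a 32x32x3 image -> six 28x28 maps,
# stacked into a 28x28x6 volume that the next layer treats as its input.
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))
filters = rng.standard_normal((6, 5, 5, 3))

maps = []
for f in filters:
    out = np.zeros((28, 28))
    for i in range(28):
        for j in range(28):
            out[i, j] = np.sum(image[i:i+5, j:j+5, :] * f)
    maps.append(out)
volume = np.stack(maps, axis=-1)   # shape (28, 28, 6)
```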
Im2Col Figure Credit: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
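The im2col trick from the figure unrolls every k×k×C image patch into a column of one big matrix, so that convolution reduces to a single matrix multiply; a minimal sketch:

```python
import numpy as np

# im2col sketch: each kxkxC patch of the image becomes one column.
# For a 32x32x3 image and k=5: 75 rows, 28*28 = 784 columns.
def im2col(image, k):
    H, W, C = image.shape
    cols = []
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            cols.append(image[i:i+k, j:j+k, :].ravel())
    return np.stack(cols, axis=1)   # shape (k*k*C, num_patches)

image = np.arange(32 * 32 * 3, dtype=float).reshape(32, 32, 3)
cols = im2col(image, 5)
```

The cost is memory: overlapping patches duplicate pixels, inflating the data by roughly k² for stride 1. Production implementations accept this because it feeds a highly tuned GEMM.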
General Matrix Multiply (GEMM) Figure Credit: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
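Putting the two figures together: flatten the filters into rows, im2col the image into columns, and one GEMM computes every activation map at once. A sketch under the same 32x32x3 / six 5x5x3-filter example:

```python
import numpy as np

# Convolution as GEMM: (num_filters x k*k*C) @ (k*k*C x num_patches).
def conv_as_gemm(image, filters):
    num_f, k, _, C = filters.shape
    H, W, _ = image.shape
    cols = []
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            cols.append(image[i:i+k, j:j+k, :].ravel())
    patches = np.stack(cols, axis=1)        # im2col: (k*k*C, P)
    weights = filters.reshape(num_f, -1)    # flattened filters: (num_f, k*k*C)
    out = weights @ patches                 # the single big GEMM: (num_f, P)
    return out.reshape(num_f, H - k + 1, W - k + 1).transpose(1, 2, 0)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))
filters = rng.standard_normal((6, 5, 5, 3))
volume = conv_as_gemm(image, filters)       # shape (28, 28, 6)
```

This is why convolution libraries spend most of their time inside a BLAS matrix-multiply routine.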
Preview: a ConvNet is a sequence of convolution layers interspersed with activation functions, e.g. 32x32x3 input -> CONV, ReLU (6 5x5x3 filters) -> 28x28x6 Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
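The shape bookkeeping for such a sequence is easy to track; a sketch (the second layer's 10 filters are a hypothetical continuation, not from the slide):

```python
import numpy as np

# Shape tracking through a sequence of conv + ReLU layers ('valid' mode).
def conv_output_shape(in_shape, num_filters, k):
    H, W, _ = in_shape
    return (H - k + 1, W - k + 1, num_filters)

def relu(x):
    # Elementwise nonlinearity between conv layers.
    return np.maximum(0.0, x)

shape = (32, 32, 3)
shape = conv_output_shape(shape, 6, 5)    # slide's layer: (28, 28, 6)
shape = conv_output_shape(shape, 10, 5)   # hypothetical 2nd conv: (24, 24, 10)
```

ReLU leaves shapes unchanged; only the conv layers shrink the spatial extent and reset the depth to the filter count.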