Lecture 7: Convolutional Networks (Justin Johnson, September 23, 2020)
Reminder: A2 due this Friday, 9/25/2020.
Autograder Late Tokens
- Our late policy (from the syllabus): 3 free late days; after that, late work gets a 25% penalty per day.
- This was difficult to implement in autograder.io, so we will keep track of free late days and penalties outside of autograder.io.
- We increased autograder.io late tokens to 1000 per student; this does not mean you can turn everything in a month late!
Last Time: Backpropagation
Represent complex expressions as computational graphs. The forward pass computes outputs; the backward pass computes gradients. During the backward pass, each node in the graph receives upstream gradients and multiplies them by local gradients to compute downstream gradients.
[Figure: computational graph with inputs x and W, a multiply node producing scores s, a hinge loss, a regularizer R, and a sum node producing the total loss L.]
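As a quick illustration of that rule (a minimal sketch of my own, not from the slides): at a single multiply node, each downstream gradient is the node's local gradient times the upstream gradient.

```python
# Minimal sketch of backprop at one multiply node f(x, w) = x * w.
def multiply_forward(x, w):
    return x * w

def multiply_backward(x, w, grad_upstream):
    # Local gradients: df/dx = w, df/dw = x.
    grad_x = w * grad_upstream   # downstream gradient w.r.t. x
    grad_w = x * grad_upstream   # downstream gradient w.r.t. w
    return grad_x, grad_w
```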
Problem: So far our classifiers f(x,W) = Wx don't respect the spatial structure of images!
[Figure: a 2x2 image with pixel values 56, 231, 24, 2 is stretched into a column of 4 numbers; in the two-layer network, the input image is stretched into a 3072-vector x, mapped by W1 to a 100-dimensional hidden layer h and by W2 to 10 output scores s.]
Problem: So far our classifiers f(x,W) = Wx don't respect the spatial structure of images!
Solution: Define new computational nodes that operate on images!
Components of a Fully-Connected Network
Fully-connected layers and activation functions (x -> h -> s).
Components of a Convolutional Network
Fully-connected layers, activation functions, convolution layers, pooling layers, and normalization, e.g. $y_{i,j} = \dfrac{x_{i,j} - \mu_j}{\sqrt{\sigma_j^2 + \varepsilon}}$.
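As a rough mapping (my own assumption of PyTorch as the framework; the slide is framework-agnostic), each component corresponds to a standard torch.nn module:

```python
import torch.nn as nn

# One standard torch.nn module per component of a convolutional network.
fully_connected = nn.Linear(3072, 10)             # fully-connected layer
activation      = nn.ReLU()                       # activation function
convolution     = nn.Conv2d(3, 6, kernel_size=5)  # convolution layer
pooling         = nn.MaxPool2d(kernel_size=2)     # pooling layer
normalization   = nn.BatchNorm2d(6)               # normalization layer
```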
Fully-Connected Layer
A 32x32x3 image is stretched into a 3072 x 1 vector. Input: 3072, output: 10, weights W: 10 x 3072.
Each output is 1 number: the result of taking a dot product between a row of W and the input (a 3072-dimensional dot product).
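A minimal numpy sketch of this layer, with shapes taken from the slide (the random values and the bias term are illustrative assumptions on my part):

```python
import numpy as np

image = np.random.randn(3, 32, 32)   # 32x32x3 image (channels first here)
x = image.reshape(3072)              # stretch to a 3072-vector
W = np.random.randn(10, 3072)        # 10 x 3072 weights
b = np.random.randn(10)              # bias (not shown on the slide)

scores = W @ x + b                   # each score: 3072-dim dot product with one row of W
print(scores.shape)                  # (10,)
```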
Convolution Layer
A 3x32x32 image: preserve the spatial structure (width 32, height 32, depth / 3 channels).
Convolve a 3x5x5 filter with the image, i.e. "slide over the image spatially, computing dot products." Filters always extend the full depth of the input volume.
Convolution Layer
1 number: the result of taking a dot product between the 3x5x5 filter w and a small 3x5x5 chunk of the image x (i.e. a 3*5*5 = 75-dimensional dot product, plus a bias): $w^T x + b$.
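A minimal numpy sketch of that single number (the top-left chunk and the bias value are arbitrary choices for illustration):

```python
import numpy as np

image = np.random.randn(3, 32, 32)   # 3x32x32 input
w = np.random.randn(3, 5, 5)         # one 3x5x5 filter
b = 0.1                              # scalar bias for this filter (illustrative value)

chunk = image[:, 0:5, 0:5]           # a 3x5x5 chunk at the top-left corner
value = np.sum(w * chunk) + b        # 75-dimensional dot product, plus bias
```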
Convolution Layer
Convolve (slide) the 3x5x5 filter over all spatial locations of the 3x32x32 image to get a 1x28x28 activation map.
Convolution Layer
Consider repeating with a second (green) 3x5x5 filter: convolving over all spatial locations gives two 1x28x28 activation maps.
Convolution Layer
Consider 6 filters, each 3x5x5 (a 6x3x5x5 filter tensor), plus a 6-dim bias vector. Convolving the 3x32x32 image gives 6 activation maps, each 1x28x28; stack the activations to get a 6x28x28 output image!
Another view: the output is a 28x28 grid with a 6-dim vector at each point.
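A minimal (deliberately slow) numpy sketch of that computation, looping over one filter and one spatial location at a time:

```python
import numpy as np

image   = np.random.randn(3, 32, 32)     # 3x32x32 input
filters = np.random.randn(6, 3, 5, 5)    # 6 filters, each 3x5x5
bias    = np.random.randn(6)             # 6-dim bias vector

H_out = W_out = 32 - 5 + 1               # 28
out = np.zeros((6, H_out, W_out))        # 6x28x28 output image
for f in range(6):                       # one activation map per filter
    for i in range(H_out):
        for j in range(W_out):
            chunk = image[:, i:i+5, j:j+5]                    # 3x5x5 chunk
            out[f, i, j] = np.sum(filters[f] * chunk) + bias[f]
```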
Convolution Layer
A batch of images (2x3x32x32), convolved with the 6x3x5x5 filters (and the 6-dim bias vector), gives a batch of outputs (2x6x28x28).
Convolution Layer
In general: a batch of images N x C_in x H x W, convolved with C_out filters of size C_in x K_w x K_h (a C_out x C_in x K_w x K_h weight tensor) plus a C_out-dim bias vector, gives a batch of outputs N x C_out x H' x W'.
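A minimal sketch of the general shapes using PyTorch's nn.Conv2d (my choice of library; the slide itself is framework-agnostic):

```python
import torch
import torch.nn as nn

N, C_in, H, W = 2, 3, 32, 32
C_out, K = 6, 5

x = torch.randn(N, C_in, H, W)                 # batch of images
conv = nn.Conv2d(C_in, C_out, kernel_size=K)   # weight: C_out x C_in x K x K, bias: C_out
y = conv(x)
print(y.shape)                                 # torch.Size([2, 6, 28, 28]) = N x C_out x H' x W'
```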
Stacking Convolutions
Q: What happens if we stack two convolution layers?
A: We get another convolution! (Recall that y = W2 W1 x is still a linear classifier.) So we insert an activation function (ReLU) between convolutions.
Input: N x 3 x 32 x 32 -> Conv (W1: 6x3x5x5, b1: 6), ReLU -> first hidden layer: N x 6 x 28 x 28 -> Conv (W2: 10x6x3x3, b2: 10), ReLU -> second hidden layer: N x 10 x 26 x 26 -> Conv (W3: 12x10x3x3, b3: 12), ReLU -> ...
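A minimal PyTorch sketch of this stack (the library choice is mine); the comments show the weight shapes from the slide. Without the ReLU layers, the three convolutions would compose into a single equivalent convolution, just as W2 W1 collapses to one matrix.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=5),     # W1: 6x3x5x5,   b1: 6
    nn.ReLU(),
    nn.Conv2d(6, 10, kernel_size=3),    # W2: 10x6x3x3,  b2: 10
    nn.ReLU(),
    nn.Conv2d(10, 12, kernel_size=3),   # W3: 12x10x3x3, b3: 12
    nn.ReLU(),
)

x = torch.randn(1, 3, 32, 32)
print(model(x).shape)   # torch.Size([1, 12, 24, 24]); hidden shapes are 6x28x28 and 10x26x26
```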
What do convolutional filters learn?
- Linear classifier: one template per class.
- MLP: a bank of whole-image templates.
- First-layer conv filters: local image templates (often learns oriented edges, opposing colors). Example: AlexNet's first layer has 64 filters, each 3x11x11.
A closer look at spatial dimensions
Example: the first conv layer above (W1: 6x3x5x5, b1: 6, followed by ReLU) maps an N x 3 x 32 x 32 input to an N x 6 x 28 x 28 first hidden layer.
A closer look at spatial dimensions
Input: 7x7, filter: 3x3. Sliding the filter over all positions gives output: 5x5.
In general: input W, filter K -> output W - K + 1.
Problem: feature maps "shrink" with each layer!
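A tiny sketch of that formula, applied to the filter sizes from the stacked example above:

```python
def conv_output_size(W, K):
    """Output size for a W x W input and K x K filter, stride 1, no padding."""
    return W - K + 1

size = 32
for K in (5, 3, 3):          # filter sizes from the Conv-ReLU stack above
    size = conv_output_size(size, K)
    print(size)              # 28, 26, 24 -- the maps shrink with each layer
```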
A closer look at spatial dimensions
Solution: padding. Add zeros around the input (the 7x7 input is shown with a border of 0s).
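A minimal PyTorch sketch of zero padding (the padding-size formula here is standard background, not taken from this slide): with padding P the output size becomes W - K + 1 + 2P, so P = (K - 1) / 2 keeps the spatial size unchanged for odd K.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)
conv = nn.Conv2d(3, 6, kernel_size=5, padding=2)   # P = (5 - 1) / 2 = 2 zeros on each side
print(conv(x).shape)                               # torch.Size([1, 6, 32, 32]): size preserved
```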