Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) CMSC 678 UMBC
Recap from last time…
Feed-Forward Neural Network: Multilayer Perceptron
h = F(W_0 x + b_0)
ŷ = G(W_1 h + b_1)
F: (non-linear) activation function
G: output activation function (softmax for classification, identity for regression)
Information/computation flows forward only: no self-loops (no recurrence or reuse of weights)
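A minimal NumPy sketch of this forward pass (the layer sizes, weight names, and the choice of tanh for F are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def mlp_forward(x, W0, b0, W1, b1):
    """One hidden layer: h = F(W0 x + b0), y_hat = G(W1 h + b1)."""
    h = np.tanh(W0 @ x + b0)              # F: (non-linear) activation
    scores = W1 @ h + b1
    exp = np.exp(scores - scores.max())   # G: softmax (classification)
    return exp / exp.sum()                # for regression, G would be the identity

# illustrative shapes: 4 inputs, 3 hidden units, 2 output classes
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W0, b0 = rng.normal(size=(3, 4)), np.zeros(3)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(2)
print(mlp_forward(x, W0, b0, W1, b1))
```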
Flavors of Gradient Descent: "Online", Minibatch, and "Batch"
"Online": set t = 0; pick a starting value θ_t; until converged: for each example i in the full data: (1) compute loss l on x_i; (2) get gradient g_t = l'(x_i); (3) get scaling factor ρ_t; (4) set θ_{t+1} = θ_t − ρ_t · g_t; (5) set t += 1.
Minibatch: set t = 0; pick a starting value θ_t; until converged: get batch B ⊂ full data set; set g_t = 0; for example(s) i in B: (1) compute loss l on x_i; (2) accumulate gradient g_t += l'(x_i); then get scaling factor ρ_t; set θ_{t+1} = θ_t − ρ_t · g_t; set t += 1.
"Batch": set t = 0; pick a starting value θ_t; until converged: set g_t = 0; for example(s) i in the full data: (1) compute loss l on x_i; (2) accumulate gradient g_t += l'(x_i); then get scaling factor ρ_t; set θ_{t+1} = θ_t − ρ_t · g_t; set t += 1.
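A hedged NumPy sketch of the minibatch flavor; the gradient function, the learning-rate schedule for ρ_t, and the epoch-based stopping rule are illustrative placeholders (batch_size=1 or batch_size=len(data) recovers the "online" and "batch" flavors):

```python
import numpy as np

def minibatch_gradient_descent(data, grad_fn, theta, batch_size=32, lr=0.1, epochs=10):
    """Minibatch gradient descent over examples x_i, following the loop above."""
    t = 0
    for _ in range(epochs):                        # stand-in for "until converged"
        np.random.shuffle(data)
        for start in range(0, len(data), batch_size):
            B = data[start:start + batch_size]     # get batch B
            g = np.zeros_like(theta)               # set g_t = 0
            for x_i in B:
                g += grad_fn(theta, x_i)           # accumulate gradient
            rho = lr / np.sqrt(t + 1)              # scaling factor rho_t (one choice)
            theta = theta - rho * g                # theta_{t+1} = theta_t - rho_t * g_t
            t += 1
    return theta
```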
Dropout: Regularization in Neural Networks
Randomly ignore "neurons" (hidden units h_i) during training
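A minimal NumPy sketch of (inverted) dropout applied to a hidden layer; the drop probability of 0.5 is an assumed, typical value:

```python
import numpy as np

def dropout(h, p_drop=0.5, training=True):
    """Randomly zero out hidden units h_i during training. Inverted dropout:
    the surviving units are rescaled, so nothing changes at test time."""
    if not training:
        return h
    mask = (np.random.rand(*h.shape) >= p_drop).astype(h.dtype)
    return h * mask / (1.0 - p_drop)
```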
tanh Activation
tanh_s(x) = 2 / (1 + exp(−2 · s · x)) − 1 = 2 σ_s(x) − 1
(curves shown for s = 10, 1, 0.5)
Rectifier Activations
relu(x) = max(0, x)
softplus(x) = log(1 + exp(x))
leaky_relu(x) = 0.01x if x < 0, x if x ≥ 0
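The activations above written out directly in NumPy; the scaled tanh follows the slide's definition, and the leaky-ReLU slope of 0.01 matches the formula shown:

```python
import numpy as np

def tanh_s(x, s=1.0):
    # tanh_s(x) = 2 / (1 + exp(-2 * s * x)) - 1; s controls the steepness
    return 2.0 / (1.0 + np.exp(-2.0 * s * x)) - 1.0

def relu(x):
    return np.maximum(0.0, x)

def softplus(x):
    return np.log1p(np.exp(x))            # log(1 + exp(x)), a smooth relu

def leaky_relu(x):
    return np.where(x < 0, 0.01 * x, x)   # 0.01x for x < 0, x otherwise
```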
Outline
Convolutional Neural Networks: What is a convolution? Multidimensional convolutions; Typical convnet operations; Deep convnets
Recurrent Neural Networks: Types of recurrence; A basic recurrent cell; BPTT: backpropagation through time
Dot Product
x · y = Σ_i x_i y_i
Convolution: Modified Dot Product Around a Point
(x ∗ y)[i] = Σ_{k < K} x[i + k] · y[k]
(convolution/cross-correlation)
1-D Convolution: Modified Dot Product Around a Point
(x ∗ y)[i] = Σ_k x[i + k] · y[k]
x: input ("image"); y: kernel; x ∗ y: feature map
(convolution/cross-correlation)
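A direct NumPy sketch of the 1-D case (no padding, stride 1; the example input and kernel are made up for illustration):

```python
import numpy as np

def conv1d(x, kernel):
    """Feature map f[i] = sum_k x[i + k] * kernel[k] (cross-correlation)."""
    K = len(kernel)
    return np.array([np.dot(x[i:i + K], kernel)
                     for i in range(len(x) - K + 1)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # input ("image")
kernel = np.array([1.0, 0.0, -1.0])        # kernel
print(conv1d(x, kernel))                   # feature map: [-2. -2. -2.]
```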
2-D Convolution
Three choices control how the kernel is applied to the input ("image"):
width: the shape of the kernel (often square)
stride(s): how many spaces to move the kernel between applications; with stride=1 the kernel visits every position, with stride=2 it skips every other starting position
padding: how to handle input/kernel shape mismatches at the borders, e.g., pad with 0s (one option). "Same": input.shape == output.shape; "different": input.shape ≠ output.shape
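A hedged NumPy sketch of a 2-D convolution with stride and zero padding; here "same" pads with 0s so that, at stride 1, input and output shapes match, and "valid" applies no padding (the function and argument names are assumptions, not a standard API):

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding="valid"):
    kh, kw = kernel.shape
    if padding == "same":                         # pad with 0s (one option)
        ph, pw = (kh - 1) // 2, (kw - 1) // 2
        image = np.pad(image, ((ph, kh - 1 - ph), (pw, kw - 1 - pw)))
    H, W = image.shape
    out_h, out_w = (H - kh) // stride + 1, (W - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):                        # slide the kernel over the image
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

img = np.arange(25.0).reshape(5, 5)
k = np.ones((3, 3)) / 9.0
print(conv2d(img, k, stride=2).shape)             # (2, 2): the stride skips positions
print(conv2d(img, k, padding="same").shape)       # (5, 5): same shape as the input
```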
From fully connected to convolutional networks
A fully connected layer over the image
Slide credit: Svetlana Lazebnik
From fully connected to convolutional networks
A convolutional layer: learned weights produce a feature map from the image
Slide credit: Svetlana Lazebnik
Convolution as feature extraction
Filters/kernels applied to the input each produce a feature map
Slide credit: Svetlana Lazebnik
From fully connected to convolutional networks
A convolutional layer followed by a non-linearity and/or pooling feeds the next layer
Slide adapted from Svetlana Lazebnik
Outline
Convolutional Neural Networks: What is a convolution? Multidimensional convolutions; Typical convnet operations; Deep convnets
Recurrent Neural Networks: Types of recurrence; A basic recurrent cell; BPTT: backpropagation through time; Solving the vanishing gradient problem
Key operations in a CNN
Input image → convolution (learned) → non-linearity → spatial pooling → feature maps
Slide credit: Svetlana Lazebnik, R. Fergus, Y. LeCun
Key operations in a CNN: non-linearity
Example: Rectified Linear Unit (ReLU)
Slide credit: Svetlana Lazebnik, R. Fergus, Y. LeCun
Key operations in a CNN: spatial pooling
Example: max pooling over the feature maps
Slide credit: Svetlana Lazebnik, R. Fergus, Y. LeCun
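A small NumPy sketch of the non-linearity and pooling steps applied to one feature map; the 2x2 window with stride 2 is an assumed, typical pooling choice:

```python
import numpy as np

def relu(fmap):
    return np.maximum(0.0, fmap)

def max_pool(fmap, size=2, stride=2):
    """Spatial max pooling: keep the largest value in each size x size window."""
    H, W = fmap.shape
    out_h, out_w = (H - size) // stride + 1, (W - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = fmap[r:r + size, c:c + size].max()
    return out

fmap = np.array([[-1.,  2.,  0.,  3.],
                 [ 4., -2.,  1.,  0.],
                 [ 0.,  1., -3.,  2.],
                 [ 1.,  0.,  2., -1.]])
print(max_pool(relu(fmap)))   # [[4. 3.] [1. 2.]]
```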
Design principles
Reduce filter sizes (except possibly at the lowest layer) and factorize filters aggressively
Use 1x1 convolutions to reduce and expand the number of feature maps judiciously
Use skip connections and/or create multiple paths through the network
Slide credit: Svetlana Lazebnik
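A minimal NumPy sketch of two of these principles: 1x1 convolutions that shrink and then restore the number of feature maps around some transformation, and a skip connection adding a second path; all names and shapes here are illustrative assumptions:

```python
import numpy as np

def conv1x1(fmaps, W):
    """1x1 convolution: at each spatial position, mix the C_in feature maps
    into C_out maps with a matrix W of shape (C_out, C_in)."""
    return np.tensordot(W, fmaps, axes=([1], [0]))   # (C_in, H, W) -> (C_out, H, W)

def residual_block(fmaps, transform, W_reduce, W_expand):
    """Skip connection: output = input + expand(transform(reduce(input)))."""
    reduced = conv1x1(fmaps, W_reduce)          # 1x1 conv: fewer feature maps
    transformed = transform(reduced)            # e.g., a 3x3 convolution + non-linearity
    expanded = conv1x1(transformed, W_expand)   # 1x1 conv: back to the original count
    return fmaps + expanded                     # second path through the network
```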
LeNet-5
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE 86(11): 2278–2324, 1998.
Slide credit: Svetlana Lazebnik
ImageNet
~14 million labeled images, 20k classes; images gathered from the Internet; human labels via Amazon MTurk
ImageNet Large-Scale Visual Recognition Challenge (ILSVRC): 1.2 million training images, 1000 classes
www.image-net.org/challenges/LSVRC/
Slide credit: Svetlana Lazebnik
Slide credit: Svetlana Lazebnik http://www.inference.vc/deep-learning-is-easy/
AlexNet: ILSVRC 2012 winner
Similar framework to LeNet but: max pooling and ReLU nonlinearity; more data and a bigger model (7 hidden layers, 650K units, 60M params); GPU implementation (50x speedup over CPU, two GPUs for a week); dropout regularization
A. Krizhevsky, I. Sutskever, and G. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012. Slide credit: Svetlana Lazebnik
GoogLeNet Szegedy et al., 2015 Slide credit: Svetlana Lazebnik
GoogLeNet: Auxiliary Classifiers at Sub-levels
Idea: try to make each sub-layer good (in its own way) at the prediction task
Szegedy et al., 2015. Slide credit: Svetlana Lazebnik
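One way to read this idea as a training objective (a sketch, not the paper's exact formulation; the 0.3 weight on the auxiliary losses is only an assumed value):

```python
def total_loss(main_loss, aux_losses, aux_weight=0.3):
    """Sum the final classifier's loss with down-weighted losses from auxiliary
    classifiers attached at intermediate layers, so the gradients also push
    each sub-level to be predictive on its own."""
    return main_loss + aux_weight * sum(aux_losses)
```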
GoogLeNet: an alternative view
Szegedy et al., 2015. Slide credit: Svetlana Lazebnik