CS 4803 / 7643: Deep Learning


  1. CS 4803 / 7643: Deep Learning. Topics: – Forward and backward through conv – (Beginning of) convolutional neural network (CNN) architectures. Zsolt Kira, Georgia Tech

  2. Administrative • PS1/HW1 due Feb 11th! (C) Dhruv Batra & Zsolt Kira 2

  3. Example: Reverse mode AD (figure: a small computation graph with inputs x1, x2 and nodes *, sin(·), +). (C) Dhruv Batra 3
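
The expression on the slide is not recoverable from this transcript; assuming, for illustration, the common textbook example f(x1, x2) = x1*x2 + sin(x1), which uses exactly these operations, a minimal reverse-mode sketch in Python (function name hypothetical) looks like this:

    import math

    # Assumed example (not necessarily the one on the slide): f(x1, x2) = x1*x2 + sin(x1)
    def f_and_grad(x1, x2):
        # Forward pass: evaluate each node and keep the intermediates.
        a = x1 * x2          # multiply node
        b = math.sin(x1)     # sin node
        f = a + b            # add node

        # Reverse pass: start from df/df = 1 and walk the graph backwards.
        df = 1.0
        da = df              # the + node passes the gradient to both inputs
        db = df
        dx1 = da * x2 + db * math.cos(x1)   # x1 feeds both * and sin, so its gradients sum
        dx2 = da * x1
        return f, (dx1, dx2)

    print(f_and_grad(0.5, 2.0))   # f = 1 + sin(0.5); df/dx1 = 2 + cos(0.5); df/dx2 = 0.5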

  4. Duality in Fprop and Bprop: a SUM in FPROP corresponds to a COPY in BPROP, and a COPY (fan-out) in FPROP corresponds to a SUM in BPROP. (C) Dhruv Batra and Zsolt Kira 4
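
The duality can be seen in a few lines of plain Python (an illustrative sketch, not the slide's example): a value that is copied to several consumers in the forward pass receives the sum of their gradients in the backward pass, while a sum node simply copies the incoming gradient to each of its inputs.

    # Forward: x fans out (COPY) into two branches that are then added (SUM).
    def forward(x):
        a = 3.0 * x          # branch 1
        b = x * x            # branch 2
        return a + b         # SUM node

    def backward(x, dy):
        # SUM in fprop -> the incoming gradient is COPIED to both branches.
        da, db = dy, dy
        # COPY (fan-out) of x in fprop -> the branch gradients are SUMMED.
        return 3.0 * da + 2.0 * x * db

    print(forward(2.0))        # 3*2 + 2*2 = 10
    print(backward(2.0, 1.0))  # d/dx (3x + x^2) at x = 2 -> 3 + 4 = 7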

  5. Convolutions for programmers • Iterate over the kernel instead of the image • Implement cross-correlation instead of convolution • Later: implementation as matrix multiplication. (C) Peter Anderson 5
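
A minimal NumPy sketch of the "iterate over the kernel" idea (an illustration, not the course's reference code): each kernel tap contributes one shifted, scaled slice of the image, so the loops run over the small kernel rather than over every output pixel. Note that, as the slide suggests, this computes cross-correlation.

    import numpy as np

    def xcorr2d_valid(image, kernel):
        """'Valid' 2D cross-correlation, looping over kernel taps instead of output pixels."""
        H, W = image.shape
        kH, kW = kernel.shape
        oH, oW = H - kH + 1, W - kW + 1
        out = np.zeros((oH, oW))
        for i in range(kH):                      # only kH * kW iterations
            for j in range(kW):
                out += kernel[i, j] * image[i:i + oH, j:j + oW]
        return out

    img = np.arange(49, dtype=float).reshape(7, 7)
    box = np.ones((3, 3)) / 9.0                  # simple box filter
    print(xcorr2d_valid(img, box).shape)         # (5, 5)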

  6. Discrete convolution! Very similar to correlation, but associative. (Figure: 1D convolution and 2D convolution with a filter.)
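
The formulas behind the figure labels, reconstructed in standard notation (not copied from the slide):

    $(x * w)[n] = \sum_{k} x[k]\, w[n-k]$                        (1D convolution)
    $(I * K)[i,j] = \sum_{m}\sum_{n} I[m,n]\, K[i-m,\, j-n]$      (2D convolution)
    $(I \star K)[i,j] = \sum_{m}\sum_{n} I[i+m,\, j+n]\, K[m,n]$  (2D cross-correlation)

Convolution flips the filter relative to cross-correlation, which is what makes it associative.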

  7.–22. Convolutional Layer (figure-only slides stepping through the convolution operation). (C) Dhruv Batra & Zsolt Kira. Slide Credit: Marc'Aurelio Ranzato

  23. Convolution Layer: filters always extend the full depth of the input volume. E.g., a 32x32x3 image and a 5x5x3 filter: convolve the filter with the image, i.e. "slide over the image spatially, computing dot products". Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  24.–28. A closer look at spatial dimensions: 7x7 input (spatially), assume 3x3 filter => 5x5 output. Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  29.–31. A closer look at spatial dimensions: 7x7 input (spatially), assume 3x3 filter applied with stride 2 => 3x3 output! Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  32.–33. A closer look at spatial dimensions: 7x7 input (spatially), assume 3x3 filter applied with stride 3? Doesn't fit! Cannot apply a 3x3 filter on a 7x7 input with stride 3. Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  34. Output size: (N - F) / stride + 1. E.g., N = 7, F = 3: stride 1 => (7 - 3)/1 + 1 = 5; stride 2 => (7 - 3)/2 + 1 = 3; stride 3 => (7 - 3)/3 + 1 = 2.33 :\ (doesn't fit). Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
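
A tiny Python helper (hypothetical name, not from the slides) that reproduces these three cases:

    def conv_output_size(N, F, stride):
        """Spatial output size for an NxN input and FxF filter, no padding.
        Returns None when the filter does not step evenly across the input."""
        if (N - F) % stride != 0:
            return None
        return (N - F) // stride + 1

    for s in (1, 2, 3):
        print(s, conv_output_size(7, 3, s))   # 1 -> 5, 2 -> 3, 3 -> None (doesn't fit)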

  35.–36. In practice, it is common to zero-pad the border. E.g., a 7x7 input, 3x3 filter applied with stride 1, padded with a 1-pixel border => what is the output? 7x7 output! (Recall: (N - F) / stride + 1, now with the padded N.) Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  37. In general, it is common to see CONV layers with stride 1, filters of size FxF, and zero-padding of (F-1)/2, which preserves the spatial size. E.g., F = 3 => zero-pad with 1; F = 5 => zero-pad with 2; F = 7 => zero-pad with 3. Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
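
As a quick check of the rule above (standard formula, consistent with the worked example later in the deck): with zero padding $P$ the output size is $\frac{N + 2P - F}{S} + 1$, and with $S = 1$ and $P = (F-1)/2$ this equals $N + (F-1) - F + 1 = N$, so the spatial size is preserved.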

  38. Convolutional Layer: learn multiple filters. E.g.: 200x200 image, 100 filters of size 10x10 => 10K parameters (100 filters x 100 weights each). (C) Dhruv Batra & Zsolt Kira 38 Slide Credit: Marc'Aurelio Ranzato

  39. For example, if we had six 5x5 filters, we'd get 6 separate activation maps: the convolution layer maps the 32x32x3 input to six 28x28 activation maps. We stack these up to get a "new image" of size 28x28x6! Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
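
A one-line check of the 32x32x3 -> 28x28x6 claim, assuming PyTorch (the slide itself shows no code):

    import torch
    import torch.nn as nn

    x = torch.randn(1, 3, 32, 32)                 # one 32x32x3 image in NCHW layout
    conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)  # six 5x5x3 filters
    print(conv(x).shape)                          # torch.Size([1, 6, 28, 28])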

  40. General Matrix Multiply (GEMM) (C) Dhruv Batra & Zsolt Kira 40 Figure Credit: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
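
A rough sketch of the im2col idea behind the GEMM formulation (an illustrative reconstruction, not the code from the figure): unroll each receptive field into a column, and the whole convolution becomes a single matrix multiply.

    import numpy as np

    def conv2d_as_gemm(x, w):
        """x: (C, H, W) input, w: (K, C, F, F) filters; stride 1, no padding."""
        C, H, W = x.shape
        K, _, F, _ = w.shape
        oH, oW = H - F + 1, W - F + 1
        cols = np.empty((C * F * F, oH * oW))     # im2col buffer: one column per output location
        for idx, (i, j) in enumerate((i, j) for i in range(oH) for j in range(oW)):
            cols[:, idx] = x[:, i:i + F, j:j + F].ravel()
        out = w.reshape(K, -1) @ cols             # the GEMM
        return out.reshape(K, oH, oW)

    x = np.random.randn(3, 7, 7)
    w = np.random.randn(10, 3, 3, 3)
    print(conv2d_as_gemm(x, w).shape)             # (10, 5, 5)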

  41.–42. Examples time: input volume 32x32x3, 10 5x5 filters with stride 1, pad 2. Output volume size? (32 + 2*2 - 5)/1 + 1 = 32 spatially, so 32x32x10. Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  43.–44. Examples time: input volume 32x32x3, 10 5x5 filters with stride 1, pad 2. Number of parameters in this layer? Each filter has 5*5*3 + 1 = 76 params (+1 for the bias) => 76*10 = 760. Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
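
Both answers can be checked with a small helper (hypothetical, not from the slides):

    def conv_layer_stats(N, C_in, K, F, S, P):
        """Output spatial size and parameter count for K FxF filters on an NxNxC_in input."""
        out = (N + 2 * P - F) // S + 1
        params = K * (F * F * C_in + 1)           # +1 for each filter's bias
        return out, params

    print(conv_layer_stats(32, 3, 10, 5, 1, 2))   # (32, 760)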

  45. (Figure: summary of the conv layer, defining the hyperparameters used on the next slide: K = number of filters, F = their spatial extent, S = stride, P = amount of zero padding.) Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  46. Common settings: K = (powers of 2, e.g. 32, 64, 128, 512) - F = 3, S = 1, P = 1 - F = 5, S = 1, P = 2 - F = 5, S = 2, P = ? (whatever fits) - F = 1, S = 1, P = 0 Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  47. Example: CONV layer in Torch Torch is licensed under BSD 3-clause. Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  48. Preview: a ConvNet is a sequence of convolution layers, interspersed with activation functions. E.g., CONV with six 5x5x3 filters + ReLU maps a 32x32x3 input to a 28x28x6 output. Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n

  49. Backprop through Conv (C) Dhruv Batra Image Credit: Yann LeCun, Kevin Murphy 49
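
The slide is figure-only; as a reminder of the standard result (not transcribed from the slide), for a 1D cross-correlation $y[i] = \sum_a w[a]\, x[i+a]$ the chain rule gives

    $\frac{\partial L}{\partial w[a]} = \sum_i \frac{\partial L}{\partial y[i]}\, x[i+a]$   (a cross-correlation of $x$ with the upstream gradient)
    $\frac{\partial L}{\partial x[j]} = \sum_a w[a]\, \frac{\partial L}{\partial y[j-a]}$   (a "full" convolution of the upstream gradient with $w$)

so both gradients are themselves convolutions, and backprop through a conv layer reuses the same machinery as the forward pass.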

  50. Preview: a ConvNet is a sequence of convolutional layers, interspersed with activation functions. E.g.: 32x32x3 -> CONV (six 5x5x3 filters) + ReLU -> 28x28x6 -> CONV (ten 5x5x6 filters) + ReLU -> 24x24x10 -> ... Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
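
A minimal sketch of this stack, assuming PyTorch (shapes only, untrained):

    import torch
    import torch.nn as nn

    net = nn.Sequential(
        nn.Conv2d(3, 6, kernel_size=5),  nn.ReLU(),   # 32x32x3  -> 28x28x6
        nn.Conv2d(6, 10, kernel_size=5), nn.ReLU(),   # 28x28x6  -> 24x24x10
    )
    print(net(torch.randn(1, 3, 32, 32)).shape)       # torch.Size([1, 10, 24, 24])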

  51. Convolutional Neural Networks (C) Dhruv Batra Image Credit: Yann LeCun, Kevin Murphy 51

  52. The architecture of LeNet-5
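
The slide shows only the LeNet-5 diagram; a minimal PyTorch sketch of the commonly cited layer sequence (an approximation: the original network uses sigmoid/tanh units and average-pooling-style subsampling rather than ReLU and max pooling):

    import torch
    import torch.nn as nn

    lenet5 = nn.Sequential(
        nn.Conv2d(1, 6, 5),  nn.ReLU(), nn.MaxPool2d(2),   # 32x32x1 -> 28x28x6 -> 14x14x6
        nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),   # 14x14x6 -> 10x10x16 -> 5x5x16
        nn.Flatten(),
        nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
        nn.Linear(120, 84),         nn.ReLU(),
        nn.Linear(84, 10),                                  # 10 digit classes
    )
    print(lenet5(torch.randn(1, 1, 32, 32)).shape)          # torch.Size([1, 10])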

  53. Handwriting Recognition Example

  54. Translation Invariance

  55. Some Rotation Invariance

  56. Some Scale Invariance

  57. Case Studies
  • There are several generations of ConvNets:
  – 2012-2014: AlexNet, ZFNet, VGGNet
    • Conv-ReLU, pooling, fully connected, softmax
    • Deeper ones (VGGNet) tend to do better
  – 2014: fully-convolutional networks for semantic segmentation
    • Matrix outputs rather than just one probability distribution
  – 2014-2016: fully-convolutional networks for classification
    • Fewer parameters, faster than comparable first-generation networks
    • GoogLeNet, ResNet
  – 2014-2016: detection layers (proposals); caption generation (combine with RNNs for language)
