

  1. DATA130006 Text Management and Analysis: Basis of CNN and RNN. Zhongyu Wei, School of Data Science, Fudan University. Dec. 27th, 2017

  2. Linear score function

  3. Neural networks: Architectures

  4. Neuron

  5. Activation Functions

  6. Formal Definition of Neural Network § Definition: § $L$: number of layers; § $n_l$: number of neurons in the $l$-th layer (the size of the hidden state); § $f_l(\cdot)$: activation function in the $l$-th layer; § $W^{(l)} \in \mathbb{R}^{n_l \times n_{l-1}}$: weight matrix between the $(l-1)$-th and the $l$-th layer; § $b^{(l)} \in \mathbb{R}^{n_l}$: bias vector between the $(l-1)$-th and the $l$-th layer; § $z^{(l)} \in \mathbb{R}^{n_l}$: state vector of the neurons in the $l$-th layer; § $a^{(l)} \in \mathbb{R}^{n_l}$: activation vector of the neurons in the $l$-th layer. Forward computation: $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$, $a^{(l)} = f_l(z^{(l)})$

  7. Example feed-forward computation of a neural network
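
A minimal NumPy sketch of the feed-forward computation defined on slide 6, iterating $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$, $a^{(l)} = f_l(z^{(l)})$ layer by layer; the layer sizes and the choice of tanh/softmax activations are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def forward(x, weights, biases, activations):
    """Feed-forward pass: z^(l) = W^(l) a^(l-1) + b^(l), a^(l) = f_l(z^(l))."""
    a = x                                    # a^(0) is the input vector
    for W, b, f in zip(weights, biases, activations):
        z = W @ a + b                        # state vector of the l-th layer
        a = f(z)                             # activation vector of the l-th layer
    return a

# Hypothetical 2-layer network: 4 inputs -> 5 hidden units -> 3 outputs
rng = np.random.default_rng(0)
weights = [rng.standard_normal((5, 4)), rng.standard_normal((3, 5))]
biases = [np.zeros(5), np.zeros(3)]
activations = [np.tanh, lambda z: np.exp(z) / np.exp(z).sum()]  # tanh, then softmax

print(forward(rng.standard_normal(4), weights, biases, activations))
```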

  8. Outline § Forward Neural Networks § Convolutional Neural Networks

  9. Fully Connected Layer

  10. Convolutional Neural Networks

  11. Convolution Layer

  12. Convolution Layer

  13. Convolution Layer

  14. Convolution Layer

  15. Convolution Layer

  16. Convolution Layer § Consider a second, green filter

  17. Convolution Layer

  18. Convolutional Neural Network § ConvNet is a sequence of Convolution Layers, interspersed with activation functions

  19. Convolutional Neural Network § ConvNet is a sequence of Convolution Layers, interspersed with activation functions
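
Slides 18-19 describe a ConvNet as a sequence of convolution layers interspersed with activation functions; a minimal PyTorch sketch of such a stack is below. The channel counts and filter sizes (6, 10, 16 filters of size 5×5) are illustrative assumptions.

```python
import torch
import torch.nn as nn

# A ConvNet as a stack of convolution layers interleaved with ReLU activations.
convnet = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=5),    # 32x32x3 input -> 28x28x6 activation volume
    nn.ReLU(),
    nn.Conv2d(6, 10, kernel_size=5),   # 28x28x6 -> 24x24x10
    nn.ReLU(),
    nn.Conv2d(10, 16, kernel_size=5),  # 24x24x10 -> 20x20x16
    nn.ReLU(),
)

x = torch.randn(1, 3, 32, 32)          # one 32x32 RGB image
print(convnet(x).shape)                # torch.Size([1, 16, 20, 20])
```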

  20. VGG Net Visualization

  21. Example of Spatial dimensions

  22. Example - Convolution

  23. Example - Convolution

  24. Example - Convolution

  25. Example - Convolution

  26. Example - Convolution

  27. Example - Convolution

  28. Example - Convolution

  29. Example - Convolution

  30. Example - Convolution

  31. Example - Convolution

  32. Example - Convolution

  33. Padding

  34. Padding

  35. Convolutional Neural Networks

  36. Examples: § Input volume: 32×32×3 § 10 filters of size 5×5 with stride 1, pad 2 § What is the volume size of the output? § (32 + 2*2 − 5)/1 + 1 = 32 spatially, so 32×32×10 § How about the number of parameters? § Each filter has 5*5*3 + 1 = 76 params (+1 for the bias) § → 76*10 = 760
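
A quick check of this arithmetic in Python; the helper functions are hypothetical, but they implement the output-size formula (W + 2P − F)/S + 1 and the per-filter parameter count F·F·C + 1 used above.

```python
def conv_output_size(width, filter_size, stride, pad):
    """Spatial output size: (W + 2P - F) / S + 1."""
    return (width + 2 * pad - filter_size) // stride + 1

def conv_num_params(filter_size, in_channels, num_filters):
    """Each filter has F*F*C weights plus one bias."""
    return (filter_size * filter_size * in_channels + 1) * num_filters

# 32x32x3 input volume, 10 filters of size 5x5, stride 1, pad 2
print(conv_output_size(32, 5, 1, 2))   # 32 -> output volume is 32x32x10
print(conv_num_params(5, 3, 10))       # 760 parameters
```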

  37. Fully Connected Layers V.S. Convolutional Layer

  38. Pooling Layer § Makes the representations smaller and more manageable § Operates over each activation map independently

  39. Pooling Layer § It is common to periodically insert a pooling layer in-between successive convolutional layers § Progressively reduce the spatial size of the representation § Reduce the amount of parameters and computation in the network § Avoid overfitting

  40. Max Pooling
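
A minimal NumPy sketch of 2×2 max pooling with stride 2 over a single activation map; the example values are made up for illustration.

```python
import numpy as np

def max_pool_2x2(a):
    """2x2 max pooling with stride 2: keep the max of each 2x2 block."""
    h, w = a.shape
    return a.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

activation_map = np.array([[1, 1, 2, 4],
                           [5, 6, 7, 8],
                           [3, 2, 1, 0],
                           [1, 2, 3, 4]])
print(max_pool_2x2(activation_map))
# [[6 8]
#  [3 4]]
```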

  41. Alpha Go

  42. Alpha Go

  43. General Neural Architectures for NLP 1. Represent the words/features with dense vectors (embeddings) via a lookup table 2. Concatenate the vectors 3. Multi-layer neural networks § Classification § Matching § Ranking R. Collobert et al., “Natural language processing (almost) from scratch”
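
A minimal PyTorch sketch of the pipeline in this slide: an embedding lookup, concatenation of the window of vectors, and a multi-layer network producing classification scores. The vocabulary size, embedding dimension, window size, and hidden size are assumptions for illustration.

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, window, hidden, num_classes = 10_000, 50, 5, 100, 3  # assumed sizes

embedding = nn.Embedding(vocab_size, emb_dim)        # 1. lookup table of dense word vectors
mlp = nn.Sequential(                                 # 3. multi-layer neural network
    nn.Linear(window * emb_dim, hidden),
    nn.Tanh(),
    nn.Linear(hidden, num_classes),                  # e.g. classification scores
)

word_ids = torch.randint(0, vocab_size, (1, window)) # a window of 5 word ids
vectors = embedding(word_ids)                        # (1, 5, 50)
features = vectors.view(1, -1)                       # 2. concatenate the vectors -> (1, 250)
print(mlp(features).shape)                           # torch.Size([1, 3])
```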

  44. CNN for Sentence Modeling § Input: a sentence of length $n$ § After the lookup layer, $Y = [y_1, y_2, \dots, y_n] \in \mathbb{R}^{d \times n}$ § Variable-length input § Convolution § Pooling
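
A minimal sketch of convolution plus max-over-time pooling on the sentence matrix $Y$, which turns a variable-length input into a fixed-size vector (PyTorch; the filter width, number of filters, and dimensions are assumptions).

```python
import torch
import torch.nn as nn

d, n, num_filters, width = 50, 7, 100, 3       # assumed embedding dim, sentence length, filters

sentence = torch.randn(1, d, n)                # Y in R^{d x n}, the output of the lookup layer
conv = nn.Conv1d(in_channels=d, out_channels=num_filters, kernel_size=width)

features = torch.relu(conv(sentence))          # (1, 100, n - width + 1): one map per filter
pooled, _ = features.max(dim=2)                # max-over-time pooling handles variable length
print(pooled.shape)                            # torch.Size([1, 100]): fixed-size sentence vector
```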

  45. CNN for Sentence Modeling

  46. Sentiment Analysis using CNN

  47. Outline § Forward Neural Networks § Convolutional Neural Networks § Recurrent Neural Networks

  48. Recurrent Neural Networks: Process Sequences § Vanilla (image classification) § Image captioning § Machine translation § Sequence labeling

  49. Recurrent Neural Network

  50. Recurrent Neural Network § We can process a sequence of vectors x by applying a recurrent formula at every time step § Notice: the same function and the same set of parameters are used at every time step.

  51. (Vanilla) Recurrent Neural Network § The state consists of a single “hidden” vector $h$: $h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t)$, $y_t = W_{hy} h_t$
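
A minimal NumPy sketch of the vanilla RNN recurrence, applying the same function and the same parameters at every time step; the input and hidden dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 10, 20                       # assumed sizes

W_xh = 0.01 * rng.standard_normal((hidden_dim, input_dim))
W_hh = 0.01 * rng.standard_normal((hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

def rnn_step(h_prev, x_t):
    """One time step: h_t = tanh(W_hh h_{t-1} + W_xh x_t + b)."""
    return np.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)

# Process a sequence of vectors x, re-using the same weights at every step.
h = np.zeros(hidden_dim)
for x_t in rng.standard_normal((5, input_dim)):      # a sequence of 5 input vectors
    h = rnn_step(h, x_t)
print(h.shape)                                        # (20,)
```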

  52. Unfolded RNN: Computational Graph

  53. Unfolded RNN: Computational Graph § Re-use the same weight matrix at every time-step

  54. RNN Computational Graph

  55. Sequence to Sequence § Many-to-one + one-to-many
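
A minimal PyTorch sketch of this decomposition: a many-to-one encoder RNN summarizes the source sequence into a single hidden vector, and a one-to-many decoder RNN unrolls from it. GRU cells, the zero start token, and all dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn

in_dim, hid_dim, out_dim = 8, 16, 8                  # assumed dimensions
encoder = nn.GRU(in_dim, hid_dim, batch_first=True)  # many-to-one: read the source sequence
decoder_cell = nn.GRUCell(out_dim, hid_dim)          # one-to-many: generate the target sequence
readout = nn.Linear(hid_dim, out_dim)

src = torch.randn(1, 6, in_dim)                      # source sequence of length 6
_, h = encoder(src)                                  # final hidden state summarizes the input
h = h.squeeze(0)                                     # (1, hid_dim)

y = torch.zeros(1, out_dim)                          # start token (assumed all-zeros here)
outputs = []
for _ in range(4):                                   # unroll 4 decoding steps
    h = decoder_cell(y, h)
    y = readout(h)
    outputs.append(y)
print(torch.stack(outputs, dim=1).shape)             # torch.Size([1, 4, 8])
```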

  56. Sequence to Sequence

  57. Attention Mechanism

  58. Example: Character-level Language Model

  59. Example: Character-level Language Model

  60. Example: Character-level Language Model

  61. Example: Character-level Language Model
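
Slides 58-61 walk through the character-level language model example; below is a minimal NumPy sketch of the idea: one-hot input characters, a recurrent hidden state, and a softmax over the vocabulary predicting the next character. The toy text "hello" and all sizes are assumptions, not the slides' exact example.

```python
import numpy as np

text = "hello"                                        # assumed toy training text
chars = sorted(set(text))                             # vocabulary: ['e', 'h', 'l', 'o']
char_to_ix = {c: i for i, c in enumerate(chars)}
vocab, hidden = len(chars), 8

rng = np.random.default_rng(0)
W_xh = 0.01 * rng.standard_normal((hidden, vocab))
W_hh = 0.01 * rng.standard_normal((hidden, hidden))
W_hy = 0.01 * rng.standard_normal((vocab, hidden))

h = np.zeros(hidden)
for cur, nxt in zip(text, text[1:]):                  # target: the next character
    x = np.zeros(vocab)
    x[char_to_ix[cur]] = 1.0                          # one-hot input character
    h = np.tanh(W_xh @ x + W_hh @ h)                  # recurrent hidden state
    scores = W_hy @ h                                 # unnormalized scores over the vocabulary
    probs = np.exp(scores) / np.exp(scores).sum()     # softmax: P(next char | history)
    print(cur, "->", chars[int(probs.argmax())], "target:", nxt)
```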

  62. Example Image Captioning

  63. Example Image Captioning

  64. Example Image Captioning

  65. Example Image Captioning

  66. Example Image Captioning

  67. Example Image Captioning

  68. Example Image Captioning

  69. Example Image Captioning

  70. Example Image Captioning

  71. Example Image Captioning

  72. Example Image Captioning

  73. Example Image Captioning
