DATA130006 Text Management and Analysis: Basis of CNN and RNN 魏忠钰 School of Data Science, Fudan University Dec. 27th, 2017
Linear score function
Neural networks: Architectures
Neuron
Activation Functions
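The activation-function slide is image-only; as a reference, here are a few standard activation functions in NumPy (the selection is mine, these are the usual ones such decks cover):

```python
import numpy as np

def sigmoid(z):                 # squashes inputs to (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):                    # squashes inputs to (-1, 1)
    return np.tanh(z)

def relu(z):                    # max(0, z), the common default in ConvNets
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))
```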
Formal Definition of Neural Network
§ Definition:
§ $L$: number of layers;
§ $n_m$: number of neurons in the $m$-th layer (the size of the hidden state);
§ $g_m(\cdot)$: activation function of the $m$-th layer;
§ $W^{(m)} \in \mathbb{R}^{n_m \times n_{m-1}}$: weight matrix between the $(m-1)$-th and $m$-th layers;
§ $c^{(m)} \in \mathbb{R}^{n_m}$: bias vector between the $(m-1)$-th and $m$-th layers;
§ $A^{(m)} \in \mathbb{R}^{n_m}$: state vector of the neurons in the $m$-th layer;
§ $a^{(m)} \in \mathbb{R}^{n_m}$: activation vector of the neurons in the $m$-th layer.
The forward computation is $A^{(m)} = W^{(m)} a^{(m-1)} + c^{(m)}$, $a^{(m)} = g_m(A^{(m)})$.
Example feed-forward computation of a neural network
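The slide's worked example is image-only; here is a minimal NumPy sketch of the forward recurrence just defined (layer sizes and the tanh choice are illustrative assumptions):

```python
import numpy as np

def forward(x, weights, biases, activations):
    """Compute a^(m) = g_m(W^(m) a^(m-1) + c^(m)) layer by layer."""
    a = x
    for W, c, g in zip(weights, biases, activations):
        A = W @ a + c        # state vector of the m-th layer
        a = g(A)             # activation vector of the m-th layer
    return a

# Illustrative 2-layer network: 3 inputs -> 4 hidden (tanh) -> 2 outputs (identity)
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
biases = [np.zeros(4), np.zeros(2)]
activations = [np.tanh, lambda z: z]

print(forward(np.array([1.0, -0.5, 0.2]), weights, biases, activations))
```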
Outline § Forward Neural Networks § Convolutional Neural Networks
Fully Connected Layer
Convolutional Neural Networks
Convolution Layer
Convolution Layer § Consider a second, green filter
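The convolution slides are image-only; this naive NumPy sketch shows what a single filter computes: slide it across the input and take a dot product at each spatial position (single channel, stride 1, no padding; all sizes are illustrative):

```python
import numpy as np

def conv2d_single(x, w, b=0.0):
    """Naive valid convolution (really cross-correlation, as in ConvNets)."""
    H, W = x.shape
    k, _ = w.shape
    out = np.empty((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+k, j:j+k] * w) + b  # dot product with the filter
    return out

x = np.arange(25, dtype=float).reshape(5, 5)   # 5x5 input
w = np.ones((3, 3)) / 9.0                      # 3x3 averaging filter
print(conv2d_single(x, w).shape)               # (3, 3) activation map
```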
Convolutional Neural Network § A ConvNet is a sequence of convolution layers interspersed with activation functions
VGG Net Visualization
Example of Spatial dimensions
Example - Convolution
Padding
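As a quick illustration (the sizes are mine), NumPy's np.pad adds the zero border, after which a 3×3 filter no longer shrinks the map:

```python
import numpy as np

x = np.arange(9, dtype=float).reshape(3, 3)
x_padded = np.pad(x, pad_width=1, mode='constant')  # zero border of width 1
print(x_padded.shape)  # (5, 5): a 3x3 filter now produces a 3x3 output again
```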
Convolutional Neural Networks
Examples: § Input volume: 32×32×3 § 10 filters of size 5×5 with stride 1, pad 2 § What is the output volume size? § (32 + 2·2 − 5)/1 + 1 = 32 spatially, so 32×32×10 § How many parameters? § Each filter has 5·5·3 + 1 = 76 params (+1 for the bias) → 76·10 = 760
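The same arithmetic as a small helper function (the function name is illustrative; the formula (W + 2P − F)/S + 1 is the standard one applied above):

```python
def conv_output_and_params(in_size, in_depth, num_filters, filter_size, stride, pad):
    """Spatial output size and parameter count of a conv layer."""
    out = (in_size + 2 * pad - filter_size) // stride + 1
    params_per_filter = filter_size * filter_size * in_depth + 1  # +1 for bias
    return out, num_filters * params_per_filter

# The slide's example: 32x32x3 input, ten 5x5 filters, stride 1, pad 2
out, params = conv_output_and_params(32, 3, 10, 5, 1, 2)
print(out, params)  # 32, 760 -> output volume 32x32x10
```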
Fully Connected Layer vs. Convolutional Layer
Pooling Layer § Makes the representations smaller and more manageable § Operates over each activation map independently:
Pooling Layer § It is common to periodically insert a pooling layer between successive convolutional layers § Progressively reduces the spatial size of the representation § Reduces the number of parameters and the amount of computation in the network § Helps to avoid overfitting
Max Pooling
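A minimal sketch of 2×2 max pooling with stride 2 over a single activation map (the input values are illustrative):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Max pooling over a single activation map."""
    H, W = x.shape
    out_h, out_w = (H - size) // stride + 1, (W - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]], dtype=float)
print(max_pool(x))  # [[6. 8.] [3. 4.]]
```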
AlphaGo
General Neural Architectures for NLP § 1. Represent the words/features as dense vectors (embeddings) via a lookup table § 2. Concatenate the vectors § 3. Feed them through multi-layer neural networks for: § Classification § Matching § Ranking § R. Collobert et al., "Natural Language Processing (Almost) from Scratch"
CNN for Sentence Modeling § Input: a sentence of length $n$ § After the lookup layer, $Y = [y_1, y_2, \ldots, y_n] \in \mathbb{R}^{d \times n}$ § Variable-length input § Convolution § Pooling
CNN for Sentence Modeling
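A sketch of the convolution-plus-pooling pipeline above for a sentence matrix $Y \in \mathbb{R}^{d \times n}$: each filter spans all $d$ embedding dimensions over a window of $h$ words, and max-over-time pooling maps the variable length $n$ to a fixed-size feature vector. The window width and filter count are illustrative assumptions:

```python
import numpy as np

def sentence_cnn_features(Y, filters, biases):
    """Convolution over word windows, then max-over-time pooling."""
    d, n = Y.shape
    feats = []
    for W, b in zip(filters, biases):      # each W has shape (d, h)
        h = W.shape[1]
        scores = [np.tanh(np.sum(W * Y[:, i:i+h]) + b) for i in range(n - h + 1)]
        feats.append(max(scores))          # max over time: fixed size per filter
    return np.array(feats)

rng = np.random.default_rng(1)
d, n = 8, 6                                # embedding dim, sentence length
Y = rng.normal(size=(d, n))
filters = [rng.normal(size=(d, 3)) for _ in range(4)]  # four width-3 filters
print(sentence_cnn_features(Y, filters, np.zeros(4)).shape)  # (4,)
```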
Sentiment Analysis using CNN
Outline § Forward Neural Networks § Convolutional Neural Networks § Recurrent Neural Networks
Recurrent Neural Networks: Process Sequences § One-to-one: vanilla image classification § One-to-many: image captioning § Many-to-many: machine translation, sequence labeling
Recurrent Neural Network
Recurrent Neural Network § We can process a sequence of vectors x by applying a recurrence formula at every time step: $h_t = f_W(h_{t-1}, x_t)$ § Notice: the same function and the same set of parameters are used at every time step.
(Vanilla) Recurrent Neural Network § The state consists of a single "hidden" vector h: $h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t)$, $y_t = W_{hy} h_t$
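The recurrence as a minimal NumPy step; note that the loop reuses the same weights at every time step (all sizes are illustrative):

```python
import numpy as np

def rnn_step(h_prev, x, W_hh, W_xh, b):
    """One vanilla RNN step: h_t = tanh(W_hh h_{t-1} + W_xh x_t + b)."""
    return np.tanh(W_hh @ h_prev + W_xh @ x + b)

rng = np.random.default_rng(2)
H, D = 5, 3                                   # hidden size, input size
W_hh, W_xh, b = rng.normal(size=(H, H)), rng.normal(size=(H, D)), np.zeros(H)

h = np.zeros(H)
for x in rng.normal(size=(4, D)):             # the SAME weights at every time step
    h = rnn_step(h, x, W_hh, W_xh, b)
print(h.shape)  # (5,)
```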
Unfolded RNN: Computational Graph
Unfolded RNN: Computational Graph § Re-use the same weight matrix at every time-step
RNN Computational Graph
Sequence to Sequence § Many-to-one (encoder) + one-to-many (decoder)
Sequence to Sequence
Attention Mechanism
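The attention slide's figure is image-only, so the following is a generic dot-product attention sketch rather than the specific variant shown: score each encoder state against the current decoder state, softmax the scores, and return the weighted sum as a context vector:

```python
import numpy as np

def dot_product_attention(decoder_h, encoder_hs):
    """Weight encoder states by softmax(decoder_h . encoder_h_i)."""
    scores = encoder_hs @ decoder_h                 # (n,) alignment scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax over positions
    return weights @ encoder_hs, weights            # context vector, weights

rng = np.random.default_rng(3)
encoder_hs = rng.normal(size=(6, 5))                # 6 source positions, hidden size 5
context, w = dot_product_attention(rng.normal(size=5), encoder_hs)
print(context.shape, w.sum())  # (5,) and weights summing to ~1
```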
Example: Character-level Language Model
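A hedged sketch of the setup: one-hot characters in, a softmax over the vocabulary out, and each sampled character fed back as the next input. The tiny "hello" vocabulary follows the classic example; the weights here are untrained, so the output is gibberish:

```python
import numpy as np

vocab = ['h', 'e', 'l', 'o']                       # the classic "hello" vocabulary
V, H = len(vocab), 8
rng = np.random.default_rng(4)
W_hh, W_xh, W_hy = (rng.normal(scale=0.1, size=s)
                    for s in [(H, H), (H, V), (V, H)])

def sample(seed_char, length):
    """Feed each sampled character back in as the next input."""
    x = np.eye(V)[vocab.index(seed_char)]          # one-hot encoding
    h, out = np.zeros(H), [seed_char]
    for _ in range(length):
        h = np.tanh(W_hh @ h + W_xh @ x)
        p = np.exp(W_hy @ h); p /= p.sum()         # softmax over characters
        idx = rng.choice(V, p=p)
        out.append(vocab[idx])
        x = np.eye(V)[idx]
    return ''.join(out)

print(sample('h', 10))  # untrained weights -> random-looking characters
```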
Example: Image Captioning
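The captioning slides are image-only; one common pattern such slides depict (an assumption here, not a transcription) is to inject CNN image features into the recurrence and decode words until an END token:

```python
import numpy as np

def caption_step(h_prev, x_word, v_img, W_hh, W_xh, W_ih, b):
    """One decoding step conditioned on CNN image features v_img:
    h_t = tanh(W_hh h_{t-1} + W_xh x_t + W_ih v + b)."""
    return np.tanh(W_hh @ h_prev + W_xh @ x_word + W_ih @ v_img + b)

rng = np.random.default_rng(5)
H, D, F = 6, 4, 10                      # hidden, word-embedding, image-feature sizes
W_hh, W_xh, W_ih = (rng.normal(size=s) for s in [(H, H), (H, D), (H, F)])
h = caption_step(np.zeros(H), rng.normal(size=D), rng.normal(size=F),
                 W_hh, W_xh, W_ih, np.zeros(H))
print(h.shape)  # (6,)
```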