  1. Recurrent Neural Networks CS 6956: Deep Learning for NLP

  2. Overview 1. Modeling sequences 2. Recurrent neural networks: An abstraction 3. Usage patterns for RNNs 4. BiDirectional RNNs 5. A concrete example: The Elman RNN 6. The vanishing gradient problem 7. Long short-term memory units

  4. What can we do with such an abstraction? 1. The encoder: Convert a sequence into a feature vector for subsequent classification 2. A generator: Produce a sequence using an initial state 3. A transducer: Convert a sequence into another sequence 4. A conditioned generator (or an encoder-decoder): Combine 1 and 2
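
Each of these four patterns builds on the same abstraction: a recurrence folds each input into a running state, and an output function reads a value off each state. A minimal sketch, assuming PyTorch (the class and parameter names below are illustrative, not the course's reference code):

```python
# A minimal sketch of the RNN abstraction, assuming PyTorch; module and
# dimension names are illustrative, not the course's reference code.
import torch
import torch.nn as nn

class AbstractRNN(nn.Module):
    def __init__(self, input_dim, state_dim, output_dim):
        super().__init__()
        # R: folds the previous state and the current input into the next state
        self.R = nn.Linear(input_dim + state_dim, state_dim)
        # O: reads an output off each state
        self.O = nn.Linear(state_dim, output_dim)

    def forward(self, inputs, s0):
        # inputs: (seq_len, input_dim); s0: (state_dim,)
        state, outputs = s0, []
        for x_t in inputs:
            state = torch.tanh(self.R(torch.cat([x_t, state])))
            outputs.append(self.O(state))
        # the four patterns differ in which of these they use and where the
        # loss attaches: all outputs, only the final state, or both
        return torch.stack(outputs), state
```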

  5-8. 1. An Encoder: Convert a sequence into a feature vector for subsequent classification. [Figure: starting from an initial state, the RNN reads "I like cake"; a neural network takes the final state and a loss is computed on its prediction.] Example: encode a sentence or a phrase into a feature vector for a classification task such as sentiment classification.
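
As a concrete illustration of the encoder pattern, here is a minimal sketch assuming PyTorch; the toy vocabulary, dimensions, and two-class sentiment label are assumptions for the example:

```python
# A sketch of the encoder pattern, assuming PyTorch; the toy vocabulary,
# dimensions, and the two-class sentiment label are illustrative assumptions.
import torch
import torch.nn as nn

vocab = {"I": 0, "like": 1, "cake": 2}
embed = nn.Embedding(len(vocab), 16)
rnn = nn.RNN(input_size=16, hidden_size=32, batch_first=True)
classifier = nn.Linear(32, 2)                    # e.g. negative / positive

tokens = torch.tensor([[vocab[w] for w in ["I", "like", "cake"]]])
_, final_state = rnn(embed(tokens))              # only the final state is kept
logits = classifier(final_state.squeeze(0))      # feature vector -> class scores
loss = nn.functional.cross_entropy(logits, torch.tensor([1]))  # toy gold label
```

Only the final state is used as the feature vector; the per-token outputs are discarded.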

  9-12. 2. A Generator: Produce a sequence using an initial state. [Figure: starting from an initial state, the RNN emits "I like cake" one token at a time and a loss is computed on the outputs; the inputs are either empty (∅) or, alternatively, the previous output becomes the current input.] Examples: text generation tasks.
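
A hedged sketch of the generator pattern, again assuming PyTorch; the vocabulary size, the start symbol at index 0, and greedy argmax decoding are illustrative choices. Here the previous output is fed back as the current input, as the slide suggests:

```python
# A sketch of the generator pattern, assuming PyTorch: generation starts from
# an initial state and the previous output is fed back as the current input.
# The vocabulary size, start symbol, and greedy argmax decoding are assumptions.
import torch
import torch.nn as nn

vocab_size, emb_dim, hid_dim = 100, 16, 32
embed = nn.Embedding(vocab_size, emb_dim)
cell = nn.RNNCell(emb_dim, hid_dim)
readout = nn.Linear(hid_dim, vocab_size)

state = torch.zeros(1, hid_dim)            # the initial state
token = torch.tensor([0])                  # assume index 0 is a start symbol
generated = []
for _ in range(3):                         # emit a short sequence
    state = cell(embed(token), state)
    token = readout(state).argmax(dim=-1)  # previous output -> current input
    generated.append(token.item())
```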

  13-14. 3. A Transducer: Convert a sequence into another sequence. [Figure: starting from an initial state, the RNN reads "I like cake" and outputs a part-of-speech tag (Pronoun, Verb, Noun) for each word; a loss is computed on the outputs.]
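
A sketch of the transducer pattern under the same PyTorch assumption: the RNN emits one output per input token, here a part-of-speech tag for each word of "I like cake". The vocabulary, tag set, and gold labels are toy assumptions:

```python
# A sketch of the transducer pattern, assuming PyTorch: one output per input
# token, here a part-of-speech tag per word of "I like cake". The vocabulary,
# tag set, and gold labels are toy assumptions.
import torch
import torch.nn as nn

vocab = {"I": 0, "like": 1, "cake": 2}
tags = {"Pronoun": 0, "Verb": 1, "Noun": 2}

embed = nn.Embedding(len(vocab), 16)
rnn = nn.RNN(input_size=16, hidden_size=32, batch_first=True)
tagger = nn.Linear(32, len(tags))

tokens = torch.tensor([[0, 1, 2]])         # "I like cake"
outputs, _ = rnn(embed(tokens))            # one state per input token
logits = tagger(outputs)                   # (batch, seq_len, num_tags)
gold = torch.tensor([[0, 1, 2]])           # Pronoun, Verb, Noun
loss = nn.functional.cross_entropy(logits.view(-1, len(tags)), gold.view(-1))
```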

  15-17. 4. A Conditioned Generator (or an encoder-decoder): First encode a sequence, then generate another one. [Figure: the encoder first reads "I like cake" from an initial state; the decoder then produces a different sequence, मला आवडतो केक (Marathi for "I like cake").] Example: a building block for neural machine translation.
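
A sketch of the encoder-decoder, assuming PyTorch: the encoder's final state initializes the decoder, which then generates the target sequence one token at a time. Vocabularies, sizes, the start symbol, and the greedy decoding loop are illustrative assumptions; a real machine translation system would also attach a loss over the target tokens during training:

```python
# A sketch of the conditioned generator (encoder-decoder), assuming PyTorch:
# the encoder's final state initializes the decoder, which generates the target
# sequence token by token. Vocabularies, sizes, the start symbol, and the
# greedy loop are illustrative assumptions.
import torch
import torch.nn as nn

emb_dim, hid_dim, src_vocab, tgt_vocab = 16, 32, 100, 120
src_embed = nn.Embedding(src_vocab, emb_dim)
tgt_embed = nn.Embedding(tgt_vocab, emb_dim)
encoder = nn.RNN(emb_dim, hid_dim, batch_first=True)
decoder = nn.RNNCell(emb_dim, hid_dim)
readout = nn.Linear(hid_dim, tgt_vocab)

src = torch.tensor([[0, 1, 2]])            # "I like cake" as toy indices
_, h = encoder(src_embed(src))             # encode the source sequence

state = h.squeeze(0)                       # decoder starts from the encoding
token = torch.tensor([0])                  # assume index 0 is a start symbol
translation = []
for _ in range(4):
    state = decoder(tgt_embed(token), state)
    token = readout(state).argmax(dim=-1)  # feed the prediction back in
    translation.append(token.item())
```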

  18. Stacking RNNs • A commonly seen usage pattern • An RNN takes an input sequence and produces an output sequence • The input to an RNN can itself be the output of an RNN – stacked RNNs, also called deep RNNs • Two or more layers often seem to improve prediction performance
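
A sketch of stacking, assuming PyTorch: the built-in num_layers argument and the explicit two-layer version below express the same idea, that one RNN's output sequence is the next RNN's input sequence. Sizes are illustrative:

```python
# A sketch of stacked ("deep") RNNs, assuming PyTorch: the output sequence of
# one RNN layer becomes the input sequence of the next. Sizes are illustrative.
import torch
import torch.nn as nn

x = torch.randn(1, 3, 16)                  # a toy (batch, seq_len, dim) input

# Built-in stacking: two layers inside one module
stacked = nn.RNN(input_size=16, hidden_size=32, num_layers=2, batch_first=True)
out, _ = stacked(x)

# The same idea written out: one RNN's outputs feed the next RNN
layer1 = nn.RNN(16, 32, batch_first=True)
layer2 = nn.RNN(32, 32, batch_first=True)
h1, _ = layer1(x)                          # output sequence of layer 1
h2, _ = layer2(h1)                         # becomes the input sequence of layer 2
```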
