Sequence-to-Sequence Learning with Neural Networks
Ilya Sutskever, Oriol Vinyals, Quoc V. Le, NIPS 2014
Introduced by Graham Neubig, NAIST, 2014-11-01
Review: Recurrent Neural Networks
Perceptron
● A single unit computes a weighted sum of features and applies an activation function:
  y = f( Σ_i w_i · φ_i(x) )
[Figure: example features for a sentence, e.g. φ“A”=1, φ“site”=1, φ“located”=1, φ“Maizuru”=1, φ“,”=2, φ“in”=1, φ“Kyoto”=1, φ“priest”=0, φ“black”=0, each multiplied by a weight and summed to produce the output y]
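A minimal Python sketch of the prediction step above; the weight values and the sign activation are illustrative choices, not taken from the slide.

def sign(score):
    # a simple choice of activation f
    return 1 if score >= 0 else -1

def predict(weights, features):
    # weights, features: dicts mapping feature name -> value
    score = sum(weights.get(name, 0.0) * value
                for name, value in features.items())
    return sign(score)

# hypothetical weights; feature values as in the figure
weights = {"Kyoto": 2.0, "site": 0.5, "priest": -1.0}
features = {"A": 1, "site": 1, "located": 1, "Maizuru": 1,
            ",": 2, "in": 1, "Kyoto": 1, "priest": 0, "black": 0}
print(predict(weights, features))  # -> 1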
Neural Net
● Combine multiple perceptrons: the outputs of several units become the inputs of the next layer
[Figure: the same nine features (φ“A”=1, φ“site”=1, …, φ“black”=0) feeding a layer of hidden units, which feed a single output y]
● Learning of complex functions becomes possible
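A sketch of that idea as a one-hidden-layer network in numpy; all sizes and the tanh activation are illustrative assumptions.

import numpy as np

def forward(x, W1, b1, W2, b2):
    h = np.tanh(W1 @ x + b1)   # hidden layer: several perceptron-like units
    y = np.tanh(W2 @ h + b2)   # output unit combines the hidden units
    return y

rng = np.random.default_rng(0)
x = rng.normal(size=9)                        # e.g. the nine features above
W1, b1 = rng.normal(size=(4, 9)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
print(forward(x, W1, b1, W2, b2))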
Recurrent Neural Nets
[Figure: an RNN reading “I can eat an apple </s>” one word at a time, carrying a hidden state across steps and predicting the next word at each step]
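A minimal sketch of one step of a vanilla (Elman-style) RNN in numpy; the shapes and tanh non-linearity are standard but illustrative.

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # the new hidden state mixes the current input with the previous state
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

D, H = 8, 16
rng = np.random.default_rng(0)
W_xh, W_hh, b_h = rng.normal(size=(H, D)), rng.normal(size=(H, H)), np.zeros(H)
h = np.zeros(H)
for x_t in rng.normal(size=(6, D)):  # six steps, e.g. “I can eat an apple </s>”
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)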
Long Short-Term Memory [Hochreiter+ 97]
● Problem: RNNs suffer from the vanishing gradient problem
● Solution: create gated units that decide when to activate (sketch below)
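One LSTM step as a numpy sketch. The packed-weight layout and gate ordering are our own conventions; the gates and the additive cell update are the standard formulation.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    # W: (4H, D+H) packed weights for the four gates; b: (4H,)
    H = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i = sigmoid(z[0:H])       # input gate: how much new content to write
    f = sigmoid(z[H:2*H])     # forget gate: how much old cell state to keep
    o = sigmoid(z[2*H:3*H])   # output gate: how much of the cell to expose
    g = np.tanh(z[3*H:4*H])   # candidate cell update
    c_new = f * c + i * g     # additive update eases the vanishing gradient
    h_new = o * np.tanh(c_new)
    return h_new, c_new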
Dialogue State Tracking with RNNs [Henderson+ 14]
[Figure legend: f = features, s = slot, m = memory, p = probability distribution over goals]
Sequence-to-Sequence Learning with Neural Networks
Task: Machine Translation
● Mapping from an input sentence to an output sentence
  Input: 太郎が花子を訪問した。
  Output: Taro visited Hanako.
Traditional Method: Phrase-based MT
● Translate phrases, then reorder them
  Today | I will give | a lecture on | machine translation | .
  今日は、 | を行います | の講義 | 機械翻訳 | 。
  → reordered: 今日は、 | 機械翻訳 | の講義 | を行います | 。
  → output: 今日は、機械翻訳の講義を行います。
● Requires alignment, phrase extraction, scoring (phrase, reordering), NP-hard decoding, and tuning
Proposed Method: Memorize Sequence, Generate Sequence
● Encode the input sentence into a fixed-length vector, then generate the output from it
● Decode with left-to-right beam search (a beam size of 2 was largely sufficient; sketch below)
● Can also be used to rerank another system's n-best lists
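A minimal sketch of left-to-right beam search over a seq2seq decoder. The step function is a hypothetical stand-in for the trained model: given an output prefix, it returns log-probabilities over the output vocabulary.

def beam_search(step, vocab, eos, beam_size=2, max_len=50):
    beams = [([], 0.0)]                       # (prefix, log-probability)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix and prefix[-1] == eos:  # keep finished hypotheses
                candidates.append((prefix, score))
                continue
            logprobs = step(prefix)           # model's next-word distribution
            for word, lp in zip(vocab, logprobs):
                candidates.append((prefix + [word], score + lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(p[-1] == eos for p, _ in beams):
            break
    return beams[0][0]                        # best-scoring hypothesis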
Proposed Method: Reversal Trick
● Feed the source sentence in reverse order, e.g. “A B C” becomes “C B A”
● This places the first source words close to the first target words, introducing short-term dependencies that make optimization easier
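The trick itself is a one-line preprocessing step; a sketch of how it might be applied to source tokens (the function name is ours):

def prepare_source(tokens):
    # reverse the source so its first words sit closest to the decoder
    return list(reversed(tokens))

print(prepare_source(["A", "B", "C"]))  # -> ['C', 'B', 'A']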
Experimental Setup
● Network details
  ● 160,000-word input and 80,000-word output vocabularies (all other words mapped to UNK)
  ● 4 hidden LSTM layers of 1,000 cells each
  ● 1,000-dimensional word representations
● Training
  ● Stochastic gradient descent
  ● 8 GPUs (1 for each hidden layer, 4 for the output layer)
  ● 6,300 words per second, 10 days total
● Data details
  ● ~340M words of English-French data from WMT14
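The same setup collected as a plain Python config dict, with values copied from the list above; the key names are our own.

config = {
    "src_vocab_size": 160_000,   # all other words mapped to UNK
    "tgt_vocab_size": 80_000,
    "lstm_layers": 4,
    "cells_per_layer": 1000,
    "embedding_dim": 1000,
    "optimizer": "sgd",
    "gpus": 8,                   # 1 per hidden layer, 4 for the output
    "training_data": "~340M words, WMT14 English-French",
}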
Results
[Results table not preserved in this extraction]
Learned Phrase Representations
[Figure: visualization of learned phrase representations, not preserved in this extraction]
Effect of Length
[Figure: performance as a function of sentence length, not preserved in this extraction]
Examples/Problems with UNK
[Example translations not preserved in this extraction]
Addressing the Rare Word Problem in Neural Machine Translation [Luong+ 14]
● Copyable model: label unknown words so target unks can be copied from aligned source unks
  en: The unk_1 portico in unk_2 …
  fr: Le unk_n unk_1 de unk_2 …
● Positional all model: label every word with its relative alignment position (i = j - d, target word f_j aligned to source word e_i)
  en: The <unk> portico in <unk> …
  fr: Le pos_0 <unk> pos_-1 <unk> pos_1 de pos_n <unk> pos_-1 …
● Positional unk model: label only unk words with positions
  en: The <unk> portico in <unk> …
  fr: Le unkpos_1 unkpos_-1 de unkpos_1 …
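A sketch of the post-processing these positional annotations enable (our reading of the slide): each output token unkpos_d points back to source position i = j - d, whose word can be dictionary-translated or copied verbatim. The function name and dictionary argument are illustrative.

import re

def replace_unks(output_tokens, source_tokens, dictionary=None):
    dictionary = dictionary or {}
    result = []
    for j, tok in enumerate(output_tokens):
        m = re.fullmatch(r"unkpos_(-?\d+)", tok)
        if m:
            i = j - int(m.group(1))      # aligned source position: i = j - d
            if 0 <= i < len(source_tokens):
                src = source_tokens[i]
                result.append(dictionary.get(src, src))  # translate, else copy
                continue
        result.append(tok)
    return result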
Results with PosUnk
[Results not preserved in this extraction]