Neural Turing Machines
Tristan Deleu
June 23, 2016 · @tristandeleu
Deep Learning
The building blocks
[Diagram: layer types combine into networks that map data to predictions]
• Layers: convolutional, fully connected, recurrent, others
• Applications: object recognition, speech recognition, object detection, language processing, image segmentation
Examples
• Object detection → predictions: face detection
• Speech recognition → predictions: automatic speech recognition
• Image segmentation → predictions: image segmentation
Examples
• Language processing → predictions: machine translation
• Language processing → predictions: sentiment analysis
• Object recognition + language processing → predictions: image captioning
Frameworks
Theano, Lasagne, Torch, TensorFlow, Caffe, Keras, Neon, MXNet, Chainer, CNTK
Theano + Lasagne
https://github.com/Lasagne/Lasagne/blob/master/examples/mnist.py
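For reference, a condensed sketch in the spirit of that mnist.py example (layer sizes and hyperparameters are illustrative, not the exact file contents): a small MLP defined with Lasagne and compiled into a Theano training function.

```python
import theano
import theano.tensor as T
import lasagne

# Symbolic inputs: flattened 28x28 images and integer class labels
input_var = T.matrix('inputs')
target_var = T.ivector('targets')

# A small MLP: 784 -> 500 (ReLU) -> 10 (softmax)
l_in = lasagne.layers.InputLayer((None, 784), input_var=input_var)
l_hid = lasagne.layers.DenseLayer(
    l_in, num_units=500, nonlinearity=lasagne.nonlinearities.rectify)
l_out = lasagne.layers.DenseLayer(
    l_hid, num_units=10, nonlinearity=lasagne.nonlinearities.softmax)

# Cross-entropy loss and momentum updates, compiled by Theano
prediction = lasagne.layers.get_output(l_out)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var).mean()
params = lasagne.layers.get_all_params(l_out, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params, learning_rate=0.01)
train_fn = theano.function([input_var, target_var], loss, updates=updates)
```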
Neural Turing Machines
Recurrent Neural Network
[Diagram: an LSTM unrolled over time: at each step t, LSTM_t consumes the input x_t and the previous hidden state h_{t-1}, and produces the new hidden state h_t and the output y_t]
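A minimal sketch of this unrolled recurrence in Lasagne (shapes are illustrative assumptions):

```python
import lasagne

# Input: a batch of sequences, shape (batch, time, features); sizes are arbitrary
l_in = lasagne.layers.InputLayer((None, None, 8))
# h_t is computed from x_t and h_{t-1} inside the LSTM layer
l_lstm = lasagne.layers.LSTMLayer(l_in, num_units=100)
# A dense layer applied at every time step produces the outputs y_1 .. y_T
l_out = lasagne.layers.DenseLayer(
    lasagne.layers.ReshapeLayer(l_lstm, (-1, 100)), num_units=8)
```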
Memory-augmented Networks
[Diagram: a neural network answering a query about "BOAT" against a knowledge base of facts: "Boats float on water", "You can't sail against the wind", "Boats do not fly", ...]
• Inspired by neuroscience
• Memory-augmented networks add an external memory to a neural network to act as a knowledge base
• They keep track of intermediate computations: e.g. the story needed to answer the question in QA problems (Memory Networks & Dynamic Memory Networks)
Memory-augmented Networks
• Memory Networks
• Dynamic Memory Networks
• Neural GPU
• Neural Stack/Queue/DeQue
• Stack-augmented RNN
Turing Machine
[Diagram: a binary tape 0 1 0 1 0 0 1 1 1 0 with a head in state q0; each (current state, read) pair determines an operation (new state, write)]

Current state | Read | New state | Write
q0            | 0    | q1        | 0
q0            | 1    | q0        | 0
q1            | 0    | q0        | 1
q1            | 1    | q1        | 0
...
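A toy Python simulator of the transition table above (hypothetical: head moves are omitted from the table, so this toy machine simply advances right each step):

```python
# Transition table: (current state, read symbol) -> (new state, write symbol)
table = {
    ('q0', 0): ('q1', 0),
    ('q0', 1): ('q0', 0),
    ('q1', 0): ('q0', 1),
    ('q1', 1): ('q1', 0),
}

state, pos = 'q0', 0
tape = [0, 1, 0, 1, 0, 0, 1, 1, 1, 0]
while pos < len(tape):
    # Read the symbol under the head, then apply the operation (new state, write)
    state, tape[pos] = table[(state, tape[pos])]
    pos += 1  # this toy machine always moves right
print(tape)
```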
Neural Turing Machine
[Same tape and transition table, now with an Input and an Output; one entry of the table is replaced by "?": in an NTM the transition behavior is not hand-written but learned from input/output examples]
Heads
[Diagram: a Turing machine head reads and writes a single cell of the tape at a time, while an NTM head addresses the memory matrix M_t through a normalized weighting w_t over all locations]
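Because the weighting w_t is soft, every read and write is differentiable. A minimal numpy sketch of the mechanics from Graves et al. (2014), not the NTM-Lasagne API: content-based addressing, the blurry read r_t = Σ_i w_t(i) M_t(i), and the erase-then-add write.

```python
import numpy as np

def content_addressing(M, key, beta):
    # w_t(i) is proportional to exp(beta * cosine_similarity(key, M_t(i)))
    sim = M @ key / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
    e = np.exp(beta * sim)
    return e / e.sum()

def read(M, w):
    # Blurry read over all locations: r_t = sum_i w_t(i) * M_t(i)
    return w @ M

def write(M, w, erase, add):
    # Erase then add: M_t(i) = M_{t-1}(i) * (1 - w_t(i) * e_t) + w_t(i) * a_t
    return M * (1 - np.outer(w, erase)) + np.outer(w, add)
```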
Neural Turing Machine
[Diagram: the NTM unrolled over time with a feedforward controller FF_t: at each step the controller receives the input x_t and the previous read vector, interacts with the memory (M_{t-1} to M_t) through read heads (producing r_t) and write heads, and emits the output y_t]
Neural Turing Machine
[Same architecture with an LSTM controller LSTM_t in place of the feedforward network, giving the controller an internal state that persists across time steps]
Neural Turing Machine
[Diagram: the overall architecture: the input feeds the controller, which produces the output; the controller interacts with the memory through its write heads and read heads]
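Putting the pieces together, a schematic single NTM time step, re-using the read/write/addressing helpers from the sketch above; the random projection matrices stand in for a trained controller and head parameterization, so everything here is illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)
N, W, X = 16, 8, 4  # memory locations, cell width, input size

# Random projections standing in for a trained controller and head parameters
W_h = rng.randn(32, X + W) * 0.1
W_key = rng.randn(W, 32) * 0.1
W_y = rng.randn(X, 32 + W) * 0.1

def ntm_step(x, r_prev, M):
    # The controller sees the current input and the previous read vector
    h = np.tanh(W_h @ np.concatenate([x, r_prev]))
    key = W_key @ h                      # head parameters emitted by the controller
    w = content_addressing(M, key, 5.0)  # one shared weighting for both heads, for brevity
    M = write(M, w, erase=np.full(W, 0.5), add=key)  # write head updates the memory
    r = read(M, w)                       # read head returns the new read vector
    y = W_y @ np.concatenate([h, r])     # output from controller state and read
    return y, r, M

M = rng.randn(N, W) * 0.1
y, r, M = ntm_step(rng.randn(X), np.zeros(W), M)
```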
Open-source Library
medium.com/snips-ai
github.com/snipsco/ntm-lasagne
NTM-Lasagne
Algorithmic Tasks
• Goal: learn full algorithms from input/output examples alone; since the data distribution P(X, Y) is known, we can generate as much training data as we need
• Strong generalization: generalize beyond the data the NTM has seen during training, e.g. to longer sequences
Copy task
[Diagram: the input sequence is presented, followed by an EOS marker; the expected outputs then reproduce the input sequence]
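Since the distribution is fully known, examples can be generated on the fly. A hypothetical copy-task generator (the channel layout is an assumption, not the repo's exact format):

```python
import numpy as np

def copy_example(length, width=8):
    seq = np.random.randint(0, 2, (length, width))
    x = np.zeros((2 * length + 1, width + 1))  # one extra channel for EOS
    y = np.zeros((2 * length + 1, width))
    x[:length, :width] = seq       # present the sequence
    x[length, width] = 1           # EOS marker on its own channel
    y[length + 1:] = seq           # target: reproduce the sequence after EOS
    return x, y
```

Calling copy_example(5) yields an 11-step input/target pair; lengths can be sampled anew for every batch.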
Training
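One way to set up training on such generated data, sketched here with a plain LSTM baseline rather than the NTM itself (all layer sizes and names are illustrative; the conclusion below notes the NTM generalizes better than this kind of baseline):

```python
import theano
import theano.tensor as T
import lasagne

inputs = T.tensor3('inputs')    # (batch, time, 9): 8 bits + EOS channel
targets = T.tensor3('targets')  # (batch, time, 8)

l_in = lasagne.layers.InputLayer((None, None, 9), input_var=inputs)
l_lstm = lasagne.layers.LSTMLayer(l_in, num_units=100)
l_out = lasagne.layers.DenseLayer(
    lasagne.layers.ReshapeLayer(l_lstm, (-1, 100)),
    num_units=8, nonlinearity=lasagne.nonlinearities.sigmoid)

# Per-step binary cross-entropy between predicted and target bits
prediction = lasagne.layers.get_output(l_out)
loss = lasagne.objectives.binary_crossentropy(
    prediction, targets.reshape((-1, 8))).mean()
params = lasagne.layers.get_all_params(l_out, trainable=True)
updates = lasagne.updates.adam(loss, params, learning_rate=1e-3)
train_fn = theano.function([inputs, targets], loss, updates=updates)
```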
Copy task
[Result figures: the trained NTM on the copy task, including generalization to sequences of length 120 and 150]
Repeat Copy task
[Diagram: the input sequence plus a repeat count (e.g. x5) and an EOS marker; the expected outputs reproduce the sequence the requested number of times]
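A companion generator for the repeat copy task, again with an illustrative channel layout: the repeat count takes the place of the EOS bit, and the target ends with an end-of-output marker.

```python
import numpy as np

def repeat_copy_example(length, repeats, width=8):
    seq = np.random.randint(0, 2, (length, width))
    total = length + 1 + repeats * length + 1
    x = np.zeros((total, width + 1))
    y = np.zeros((total, width + 1))
    x[:length, :width] = seq
    x[length, width] = repeats  # repeat count in place of the EOS bit
    y[length + 1:length + 1 + repeats * length, :width] = np.tile(seq, (repeats, 1))
    y[-1, width] = 1            # end-of-output marker
    return x, y
```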
Repeat Copy task
[Result figures for the repeat copy task]
Associative Recall task
[Diagram: a sequence of items is presented as inputs; given a query item, the expected output is the item that followed it in the sequence]
Associative Recall task
[Result figures for the associative recall task]
Priority Sort task
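In the priority sort task (Graves et al., 2014), each input vector carries a scalar priority and the target is the inputs reordered by priority. A minimal illustrative generator:

```python
import numpy as np

def priority_sort_example(n_items, width=8):
    seq = np.random.randint(0, 2, (n_items, width)).astype(float)
    priorities = np.random.uniform(-1, 1, n_items)  # one scalar priority per vector
    x = np.column_stack([seq, priorities])
    y = seq[np.argsort(-priorities)]  # target: vectors sorted by decreasing priority
    return x, y
```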
bAbI tasks
bAbI tasks
[Example story, with entities and locations highlighted (John, Mary, Sandra; garden, hallway, bathroom):]
Mary went to the garden. John went to the garden. Mary went back to the hallway. Sandra journeyed to the bathroom. John went to the hallway. Mary went to the bathroom.
Conclusion
• The NTM is able to learn algorithms from examples alone
• It shows better generalization than other recurrent architectures, e.g. LSTMs
• Fully differentiable structure; drawback: generalization is still not quite perfect
• A new take on Artificial Intelligence: teaching machines things we can do, the same way we would learn them

Resources
• Theano: http://deeplearning.net/software/theano/
• Lasagne: http://lasagne.readthedocs.io/en/latest/
• NTM-Lasagne: https://github.com/snipsco/ntm-lasagne

June 23, 2016 · @tristandeleu
Thank you · June 23, 2016 · @tristandeleu