  1. Neural Monkey: A Natural Language Processing Toolkit
     Jindřich Helcl, Jindřich Libovický, Tom Kocmi, Dušan Variš, Tomáš Musil, Ondřej Cífka, Ondřej Bojar
     March 19, 2019, GTC 2019
     Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics

  2. NLP Toolkit Overview
     Why do we need NLP toolkits?
     • No need to re-implement everything from scratch.
     • Re-use of published (trained) components.
     • Often difficult design decisions already made.
     • Usually, published results indicate the reliability of the toolkit.

  3. NLP Toolkit Overview
     NLP research libraries and toolkits can be categorized as:
     • Math libraries (matrix- or tensor-level, usually symbolic)
       • TensorFlow, (py)Torch, Theano
     • Neural network abstraction APIs (handle individual NN layers)
       • Keras
     • Higher-level toolkits (work with encoders, decoders, etc.)
       • Neural Monkey, AllenNLP, Sockeye
     • Specialized applications
       • Marian or tensor2tensor for NMT, Kaldi for ASR, etc.
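To make the distinction between the first two levels concrete, here is a minimal Python sketch of the same dense layer written once at the tensor level and once at the layer level; the shapes and variable names are purely illustrative and not taken from any of the toolkits above.

    import tensorflow as tf

    # Tensor level (math library): you manage shapes, parameters, and the
    # matrix multiplication yourself.
    x = tf.random.normal([32, 128])                # batch of 32 examples, 128 features
    w = tf.Variable(tf.random.normal([128, 256]))  # weight matrix
    b = tf.Variable(tf.zeros([256]))               # bias vector
    h = tf.tanh(tf.matmul(x, w) + b)               # one hidden layer, built by hand

    # Layer level (NN abstraction API): the same layer as a single object
    # that owns its parameters.
    layer = tf.keras.layers.Dense(256, activation="tanh")
    h2 = layer(x)

A higher-level toolkit such as Neural Monkey goes one step further and composes whole encoders and decoders, as the configuration example later in the talk shows.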

  4. Neural Monkey
     • Open-source toolkit for NLP tasks
     • Suited for research and education
     • Three (overlapping) user groups considered:
       • Students
       • Researchers
       • Newcomers to deep learning

  5. Development
     Source code: https://github.com/ufal/neuralmonkey
     • Implemented in Python 3.6 using TensorFlow
     • GPU support via CUDA and cuDNN, thanks to TensorFlow
     • Actively developed, with GitHub as the main communication platform

  6. Used in Research
     • Multimodal translation (Charles University, ACL 2017, WMT 2018)
     • Bandit learning (Heidelberg University, ACL 2017)
     • Graph convolutional encoders (University of Amsterdam, EMNLP 2017)
     • Non-autoregressive translation (Charles University, EMNLP 2018)
     [Figure: dependency tree of the German sentence “ein Mann schläft auf einem grünen Sofa in einem grünen Raum .”, illustrating the graph convolutional encoder]

  7. Goals
     1. Code readability
     2. Modularity along research concepts
     3. Up-to-date building blocks
     4. Fast prototyping


  11. Usage
     • Neural Monkey experiments are defined in INI configuration files.
     • Once the config is ready, run with:
       neuralmonkey-train config.ini
     • Inference from a trained model uses a second config for data:
       neuralmonkey-run config.ini data.ini
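As an illustration, a minimal data config for the second command could look like the sketch below. It reuses the dataset syntax from the configuration example later in the talk; the test_datasets key and the file path are assumptions for illustration, not copied from the slides.

    [main]
    test_datasets=[<test_data>]

    [test_data]
    class=dataset.load
    series=["source"]
    data=["data/test.en"]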

  12. Abstractions in Neural Monkey
     • Compositional design
       • High-level abstractions derived from low-level ones
     • (High-level) abstractions aligned with literature
       • Encoder, decoder, etc.
     • Separation between model definition and usage
       • “Model parts” define the network architecture
       • “Graph executors” define what to compute in the TF session
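The same define-then-execute separation can be shown in plain TensorFlow 1.x (the version Neural Monkey builds on). The sketch below is illustrative only; these are not Neural Monkey's actual classes, just the underlying pattern that model parts and graph executors wrap.

    import numpy as np
    import tensorflow as tf  # assumes TensorFlow 1.x

    # Model definition ("model parts"): build the graph; nothing is computed yet.
    inputs = tf.placeholder(tf.float32, [None, 128], name="inputs")
    logits = tf.layers.dense(inputs, 10, name="classifier")
    predictions = tf.argmax(logits, axis=1)

    # Usage ("graph executors"): choose which tensors to evaluate
    # and run them in a TF session.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        batch = np.random.rand(32, 128).astype(np.float32)
        preds = sess.run(predictions, feed_dict={inputs: batch})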

  13. Example Use-Case: Machine Translation
     • Bahdanau et al., 2015
     • Encoder: bidirectional GRU
     • Decoder: GRU with single-layer feed-forward attention
     [Figure: attention-based encoder-decoder with inputs x_1, …, x_4, encoder states h_0, …, h_4, attention weights α_0, …, α_4, and decoder states s_{i-1}, s_i, s_{i+1} producing outputs ~y_i, ~y_{i+1}]
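For reference, the attention on this slide is the standard mechanism from Bahdanau et al. (2015): a single-layer feed-forward network scores each encoder state h_j against the previous decoder state s_{i-1}, and the normalized scores mix the encoder states into a context vector:

    e_{ij} = v_a^\top \tanh\left(W_a s_{i-1} + U_a h_j\right), \qquad
    \alpha_{ij} = \frac{\exp(e_{ij})}{\sum_k \exp(e_{ik})}, \qquad
    c_i = \sum_j \alpha_{ij} h_j

The context vector c_i is then fed, together with the embedding of the previous output, into the decoder GRU to produce the next state s_i.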

  14. Simple MT Configuration Example

     General training configuration:

     [main]
     output="output_dir"
     batch_size=64
     epochs=20
     train_dataset=<train_data>
     val_dataset=<val_data>
     trainer=<my_trainer>
     runners=[<my_runner>]
     evaluation=[("target", evaluators.BLEU)]
     logging_period=500
     validation_period=5000

     Loading training and validation data:

     [train_data]
     class=dataset.load
     series=["source", "target"]
     data=["data/train.en", "data/train.de"]

     [val_data]
     class=dataset.load
     series=["source", "target"]
     data=["data/val.en", "data/val.de"]

     Loading vocabularies:

     [en_vocabulary]
     class=vocabulary.from_wordlist
     path="en_vocab.tsv"

     [de_vocabulary]
     class=vocabulary.from_wordlist
     path="de_vocab.tsv"

     GRU encoder configuration:

     [my_encoder]
     class=encoders.SentenceEncoder
     vocabulary=<en_vocabulary>
     data_id="source"
     embedding_size=600
     rnn_size=500

     GRU decoder and attention configuration:

     [my_attention]
     class=attention.Attention
     encoder=<my_encoder>
     state_size=500

     [my_decoder]
     class=decoders.Decoder
     encoders=[<my_encoder>]
     attentions=[<my_attention>]
     vocabulary=<de_vocabulary>
     data_id="target"
     embedding_size=600
     rnn_size=1000
     max_output_len=20

     Trainer and runner:

     [my_trainer]
     class=trainers.CrossEntropyTrainer
     decoders=[<my_decoder>]
     clip_norm=1.0

     [my_runner]
     class=runners.GreedyRunner
     decoder=<my_decoder>
     output_series="target"
