Recurrent Neural Networks using TensorFlow
Jindřich Libovický
December 5, 2018
B4M36NLP Introduction to Natural Language Processing
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics
The Task
Train a model that correctly decides whether ‘i’ or ‘y’ should be written in a Czech sentence.
Prepare Python Environment
    virtualenv -p python3 env
    source env/bin/activate
    pip3 install tensorflow
    pip3 install jupyter
• create a new environment
• activate the environment
• install TensorFlow and Jupyter
Alternatively, using Anaconda
    wget http://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
    bash Anaconda3-4.2.0-Linux-x86_64.sh
    export PATH=$PATH:$HOME/anaconda3/bin
    conda create -n tf python=3.5 anaconda
    source activate tf
    pip3 install tensorflow
    pip3 install jupyter
• download and install Anaconda
• create a new environment
• activate the environment
• install TensorFlow 0.12
Download the Lab Notebook
    wget http://ufallab.ms.mff.cuni.cz/~libovicky/ctu_lab.ipynb
    jupyter notebook
• download the notebook
• run Jupyter in the same directory
The Data
The text:
    aristotelés dále určil poloměr země, kterí ale odhadl na dvojnásobek…
    v aristotelovském modelu země stojí a měsíc se sluncem a hvězdami krouží…
    mišlenki aristotelovi rozvinul ve 2. století našeho letopočtu klaudios…
Correct solution:
    00001000000000000000000000000000001000000100000000000000000000001000000100000…
    02000002000100000000200000100000000000000001000000000000000000000001000000000…
    00000000000000000010000000000001000002000000000000000000020000000000000000000…
1 = ‘i’, 2 = ‘y’, 0 = ‘others’
• 500k sentences from Czech Wikipedia (in general, the more, the better)
• only characters from the Czech alphabet; sentence-split and lower-cased
• randomly shuffled, with separate validation data
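The encoding above can be illustrated with a minimal Python sketch. It is not the lab's actual preprocessing code. The helper name `make_example` is hypothetical, and the treatment of the accented letters (‘í’ labelled 1, ‘ý’ replaced by ‘í’ and labelled 2) is an assumption, although it agrees with the sample text and labels shown on the slide.

```python
# Hypothetical helper, not taken from the lab notebook.
# Assumption: accented í/ý are handled like i/y (í -> label 1, ý -> label 2).
Y_TO_I = {"y": "i", "ý": "í"}
I_CHARS = {"i", "í"}


def make_example(sentence):
    """Turn a correctly spelled, lower-cased sentence into (input_text, labels).

    Every y/ý is replaced by i/í in the input text; the labels record the
    original spelling: 1 = 'i', 2 = 'y', 0 = any other character.
    """
    chars, labels = [], []
    for ch in sentence:
        if ch in Y_TO_I:
            chars.append(Y_TO_I[ch])
            labels.append(2)
        elif ch in I_CHARS:
            chars.append(ch)
            labels.append(1)
        else:
            chars.append(ch)
            labels.append(0)
    return "".join(chars), labels


text, labels = make_example("myšlenky aristotelovy rozvinul")
print(text)                        # mišlenki aristotelovi rozvinul
print("".join(map(str, labels)))   # 020000020001000000002000001000
```

The printed label string matches the beginning of the second label row on the slide, which appears to belong to the sentence starting with “mišlenki aristotelovi”.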
Baselines
Accuracy: 70 %, 80 %, 90 % (in the order of the baselines listed)
• just leave ‘i’ everywhere
• simple rules: ‘y’ after ‘h’, ‘k’, ‘r’ and for words starting with ‘v’
• remember the most frequent spelling for each word
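The three baselines can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code behind the reported numbers: all function names are hypothetical, accuracy is assumed to be measured only on ‘i’/‘y’ positions, words are assumed to be separated by single spaces, and the rule “for words starting with ‘v’” is read as “an ‘i’ directly after a word-initial ‘v’”.

```python
from collections import Counter, defaultdict

I_CHARS = {"i", "í"}  # assumption: the input text contains only i/í, never y/ý


def all_i(text):
    """Baseline 1: predict 'i' (label 1) at every i/í position."""
    return [1 if ch in I_CHARS else 0 for ch in text]


def simple_rules(text):
    """Baseline 2: 'y' after h, k, r, or directly after a word-initial 'v'.

    The reading of the 'v' rule is an assumption about the slide's wording.
    """
    labels = []
    for pos, ch in enumerate(text):
        if ch not in I_CHARS:
            labels.append(0)
            continue
        prev = text[pos - 1] if pos > 0 else " "
        before_prev = text[pos - 2] if pos > 1 else " "
        word_initial_v = prev == "v" and before_prev == " "
        labels.append(2 if prev in {"h", "k", "r"} or word_initial_v else 1)
    return labels


def train_most_frequent(examples):
    """Baseline 3, training: remember the most frequent labelling of each word."""
    counts = defaultdict(Counter)
    for text, labels in examples:
        start = 0
        for word in text.split(" "):
            counts[word][tuple(labels[start:start + len(word)])] += 1
            start += len(word) + 1  # skip the space after the word
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def most_frequent(text, table):
    """Baseline 3, prediction: look each word up, fall back to all-'i'."""
    labels = []
    for n, word in enumerate(text.split(" ")):
        if n > 0:
            labels.append(0)  # label of the separating space
        labels.extend(table.get(word, all_i(word)))
    return labels


def accuracy(predicted, gold):
    """Share of correct decisions on i/y positions only (an assumption)."""
    pairs = [(p, g) for p, g in zip(predicted, gold) if g != 0]
    return sum(p == g for p, g in pairs) / len(pairs)


text = "mišlenki aristotelovi rozvinul"
gold = [int(c) for c in "020000020001000000002000001000"]
print(accuracy(all_i(text), gold))         # 0.4
print(accuracy(simple_rules(text), gold))  # 0.4
table = train_most_frequent([(text, gold)])
print(most_frequent(text, table) == gold)  # True (seen in training)
```

On a single sentence the numbers are of course not meaningful; the sketch only shows how each baseline maps an input string to a label sequence of the same length.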
Learning curves
[Figure: accuracy (0.5 to 1) plotted against the number of training sentences (log scale, 500 to 5,000,000) for four systems: baseline, simple rules, most frequent spelling, and recurrent network.]