TensorFlow: neural networks lab Paolo Dragone and Andrea Passerini paolo.dragone@unitn.it passerini@disi.unitn.it Machine Learning Dragone, Passerini (DISI) TensorFlow Machine Learning 1 / 28
Introduction - TensorFlow
- TensorFlow is a Python package
- Numerical computation using data flow graphs
- Developed by Google for machine learning and deep neural network research
- Installation and documentation: https://www.tensorflow.org/
Introduction - Outline
- MNIST dataset
  - https://www.tensorflow.org/versions/master/tutorials/mnist/beginners/index.html#the-mnist-data
- MNIST for ML Beginners
  - https://www.tensorflow.org/versions/master/tutorials/mnist/beginners/index.html
- Deep MNIST for Experts
  - https://www.tensorflow.org/versions/master/tutorials/mnist/pros/index.html
Introduction - On the lab computers
- To use TensorFlow on the lab computers, open the terminal in Menu → Others → TensorFlow.
- On Ubuntu 12.04: run the script run_me_on_ubuntu1204.sh
MNIST dataset
- Dataset of handwritten digits
- Each image has 28 × 28 = 784 pixels
- Training set: 60k images; test set: 10k images
MNIST dataset - Importing MNIST
- Use the given input_data.py script
- Train: 55k, validation: 5k, test: 10k
- mnist is a Python object (it has attributes and methods)
  - mnist.train.images
  - mnist.train.labels
  - mnist.train.next_batch(n)
  - ...
MNIST dataset - Data representation
- The 28 × 28 = 784 pixels of each image are represented as a vector
- The 10 classes are represented with the one-hot encoding
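The two representations above can be sketched in plain Python. This is only an illustration: the helper names `one_hot` and `flatten` are made up here and are not part of the lab's input_data.py script.

```python
def one_hot(label, num_classes=10):
    """Return a vector with 1.0 at index `label` and 0.0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def flatten(image):
    """Concatenate the rows of a 2-D image into a single vector."""
    return [pixel for row in image for pixel in row]


image = [[0.0] * 28 for _ in range(28)]  # a blank 28 x 28 image
vector = flatten(image)
print(len(vector))  # 784
print(one_hot(3))   # 1.0 at index 3, 0.0 elsewhere
```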
Softmax regressions
- Look at an image and give probabilities for it being each digit
- Evidence that an image is a particular class i:
  $$\mathrm{evidence}_i = \sum_j W_{ij} x_j + b_i \quad \forall i$$
- Softmax to shape the evidence as a probability distribution over the i cases:
  $$\mathrm{softmax}_i = \frac{\exp(\mathrm{evidence}_i)}{\sum_j \exp(\mathrm{evidence}_j)} \quad \forall i$$
Softmax regressions
- [Figure: schematic view]
- [Figure: vectorized version]
- Compactly:
  $$\hat{y} = \mathrm{softmax}(Wx + b)$$
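As a rough, library-free sketch of the compact formula, the forward pass can be written with plain lists. The names `softmax` and `predict` and the toy values of `W`, `b` and `x` are assumptions made for illustration; subtracting the maximum before exponentiating is a common numerical-stability trick that does not change the result.

```python
import math


def softmax(evidence):
    """Turn a list of evidence values into a probability distribution."""
    shift = max(evidence)  # subtracting the max avoids overflow in exp
    exps = [math.exp(e - shift) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    """Compute softmax(Wx + b) for a weight matrix W given as a list of rows."""
    evidence = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(W, b)
    ]
    return softmax(evidence)


# Toy example with 3 classes and 2 input features (not MNIST-sized).
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, -1.0]
x = [2.0, 1.0]
y_hat = predict(W, b, x)
print(y_hat)
```

Here the evidence is [2, 1, 2], so classes 0 and 2 receive the same probability and class 1 a lower one.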
Softmax regressions - Implementation: model
- Import TensorFlow
- Define the placeholders
- Define the variables
- Define the softmax layer
Softmax regressions - Implementation: optimization
- Define the cost (cross-entropy): $H_y(\hat{y}) = -\sum_i y_i \log \hat{y}_i$
- Define the training algorithm
- Start a new session (up to this point nothing has been computed)
- Initialize the variables
- Train the model
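To make the cost and the training step concrete, here is a minimal pure-Python sketch of the cross-entropy and of one gradient-descent update on a single example. It is not the lab's TensorFlow code: the function names, the learning rate and the toy data are assumptions. It relies on the standard result that, for softmax followed by cross-entropy, the gradient with respect to the evidence is ŷ − y.

```python
import math


def softmax(evidence):
    shift = max(evidence)  # numerical stability
    exps = [math.exp(e - shift) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    return softmax([sum(w * x_j for w, x_j in zip(row, x)) + b_i
                    for row, b_i in zip(W, b)])


def cross_entropy(y, y_hat):
    """H_y(y_hat) = -sum_i y_i * log(y_hat_i), skipping terms with y_i = 0."""
    return -sum(t * math.log(p) for t, p in zip(y, y_hat) if t > 0)


def sgd_step(W, b, x, y, learning_rate):
    """One gradient-descent update of W and b on a single example."""
    delta = [p - t for p, t in zip(predict(W, b, x), y)]  # dH/d(evidence)
    W_new = [[w - learning_rate * d * x_j for w, x_j in zip(row, x)]
             for row, d in zip(W, delta)]
    b_new = [b_i - learning_rate * d for b_i, d in zip(b, delta)]
    return W_new, b_new


# Toy problem: 3 classes, 2 features, true class is 1.
W = [[0.0, 0.0] for _ in range(3)]
b = [0.0, 0.0, 0.0]
x, y = [1.0, 2.0], [0.0, 1.0, 0.0]

loss_before = cross_entropy(y, predict(W, b, x))
W, b = sgd_step(W, b, x, y, learning_rate=0.1)
loss_after = cross_entropy(y, predict(W, b, x))
print(loss_before, loss_after)  # the loss decreases after the update
```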
Softmax regressions - Implementation: evaluation
- Evaluate the accuracy (it should be around 0.91)
- Plot the model weights (plotter.py)
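Accuracy here means the fraction of images whose most probable predicted class matches the one-hot label. A small sketch of that computation, with made-up predictions and hypothetical helper names, could look like this:

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of examples where the predicted class equals the true class."""
    correct = sum(1 for p, t in zip(predictions, labels)
                  if argmax(p) == argmax(t))
    return correct / len(labels)


predictions = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6], [0.5, 0.4, 0.1]]
labels = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(accuracy(predictions, labels))  # 0.75 (the last example is misclassified)
```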
Deep convolutional net - Deep architectures
- 0.91 accuracy on MNIST is NOT good!
- State-of-the-art performance is 0.9979
- Let's refine our model:
  - 2 convolutional layers
  - alternated with 2 max pool layers
  - ReLU layer (with dropout)
  - Softmax regressions
- Accuracy target: 0.99
Deep convolutional net - Convolutional layer
- Broadly used in image classification
- Local connectivity (width, height, depth)
- Spatial arrangement (stride, padding)
- Parameter sharing
[Figure: network architecture, from bottom to top: inputs x_1 ... x_784, convolutional, max pool, convolutional, max pool, ReLU (dropout), softmax, outputs ŷ_0 ... ŷ_9]
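The three ideas above can be illustrated with a naive single-channel convolution in plain Python. This is a sketch under simplifying assumptions (square image, square kernel, one input and one output channel), and the function names are invented for the example. The output size follows the usual formula (W − F + 2P) / S + 1 for input size W, filter size F, padding P and stride S.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size: (W - F + 2P) / S + 1."""
    span = input_size - filter_size + 2 * padding
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not fit the input with this stride/padding")
    return span // stride + 1


def conv2d(image, kernel, stride=1, padding=0):
    """Slide one shared kernel over a zero-padded square image."""
    size, k = len(image), len(kernel)
    padded = [[0.0] * (size + 2 * padding) for _ in range(size + 2 * padding)]
    for r in range(size):
        for c in range(size):
            padded[r + padding][c + padding] = image[r][c]
    out = conv_output_size(size, k, stride, padding)
    # Each output value only sees a k x k patch (local connectivity) and
    # every patch uses the same kernel weights (parameter sharing).
    return [[sum(kernel[i][j] * padded[r * stride + i][c * stride + j]
                 for i in range(k) for j in range(k))
             for c in range(out)]
            for r in range(out)]


ones_image = [[1.0] * 3 for _ in range(3)]
ones_kernel = [[1.0] * 3 for _ in range(3)]
result = conv2d(ones_image, ones_kernel, stride=1, padding=1)
print(result)  # corners see 4 pixels, edges 6, the centre 9

# Assumed example: a 5 x 5 filter with padding 2 keeps a 28 x 28 image at 28 x 28.
print(conv_output_size(28, 5, stride=1, padding=2))
```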
Deep convolutional net - Max pooling layer
- Commonly used after convolutional layer(s)
- Reduce spatial size
- Avoid overfitting
- Max pooling 2 × 2 is very common
[Figure: network architecture, from inputs x_1 ... x_784 to outputs ŷ_0 ... ŷ_9]
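A 2 × 2 max pooling with stride 2 keeps only the largest value of each non-overlapping 2 × 2 block. The sketch below assumes a single-channel image with even width and height; `max_pool_2x2` is an illustrative name.

```python
def max_pool_2x2(image):
    """Keep the maximum of every non-overlapping 2 x 2 block."""
    rows, cols = len(image), len(image[0])
    return [[max(image[r][c], image[r][c + 1],
                 image[r + 1][c], image[r + 1][c + 1])
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]


image = [[1, 2, 5, 6],
         [3, 4, 7, 8],
         [9, 10, 13, 14],
         [11, 12, 15, 16]]
pooled = max_pool_2x2(image)
print(pooled)  # [[4, 8], [12, 16]]
```

The 4 × 4 input becomes 2 × 2, which is the reduction of spatial size mentioned above.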
Deep convolutional net - ReLU (and Dropout)
- Fully connected layer
- Rectified Linear Unit activation
- Dropout randomly excludes neurons to avoid overfitting
[Figure: network architecture, from inputs x_1 ... x_784 to outputs ŷ_0 ... ŷ_9]
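A minimal sketch of the two operations, applied to a list of activations. The dropout shown is the "inverted" variant, which rescales the surviving units by 1 / keep_prob so that the expected activation is unchanged; this choice, like the function names and the keep probability of 0.5, is an assumption made for the example.

```python
import random


def relu(values):
    """Rectified Linear Unit: max(0, v) element-wise."""
    return [max(0.0, v) for v in values]


def dropout(values, keep_prob, rng):
    """Zero each unit with probability 1 - keep_prob, rescale the survivors."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # seeded only to make the run repeatable
activations = relu([-1.5, 0.0, 2.0, 3.0])
dropped = dropout(activations, keep_prob=0.5, rng=rng)
print(activations)  # [0.0, 0.0, 2.0, 3.0]
print(dropped)      # each entry is either 0.0 or twice the activation
```

Dropout is normally active only during training; at evaluation time keep_prob is set to 1.0, which leaves the activations untouched.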
Deep convolutional net - Softmax layer
- Look at a feature configuration (coming from the layers below) and give probabilities for it being each digit
- Evidence that a feature configuration is a particular class i:
  $$\mathrm{evidence}_i = \sum_j W_{ij} x_j + b_i \quad \forall i$$
- Softmax to shape the evidence as a probability distribution over the i cases:
  $$\mathrm{softmax}_i = \frac{\exp(\mathrm{evidence}_i)}{\sum_j \exp(\mathrm{evidence}_j)} \quad \forall i$$
[Figure: network architecture, from inputs x_1 ... x_784 to outputs ŷ_0 ... ŷ_9]
Deep convolutional net - Implementation: preparation
- Imports
- Load data
- Placeholders and data reshaping
Note: reshaping is needed for convolution and max pooling
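The reshaping step turns each flat 784-element vector back into a 28 × 28 image with one channel, so that the convolution can see neighbouring pixels. A plain-Python sketch, assuming row-major pixel order and using the made-up name `to_image`:

```python
def to_image(vector, height=28, width=28):
    """Reshape a flat vector into height x width x 1 (rows, columns, channels)."""
    if len(vector) != height * width:
        raise ValueError("vector length does not match the image size")
    return [[[vector[r * width + c]] for c in range(width)]
            for r in range(height)]


vector = [float(i) for i in range(784)]
image = to_image(vector)
print(len(image), len(image[0]), len(image[0][0]))  # 28 28 1
```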
Deep convolutional net - Implementation: weights initialization
- Define functions to initialize the variables of the model
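The slide's own code is not reproduced here, so the following is only a plausible sketch. It assumes the scheme commonly used with ReLU networks: small random weights (a normal distribution truncated at two standard deviations) to break symmetry, and a slightly positive constant bias to avoid units that never activate. The function names, the standard deviation 0.1 and the bias value 0.1 are all assumptions.

```python
import random


def truncated_normal(rng, stddev):
    """Sample from N(0, stddev^2), re-drawing values beyond 2 standard deviations."""
    while True:
        value = rng.gauss(0.0, stddev)
        if abs(value) <= 2.0 * stddev:
            return value


def weight_variable(shape, rng, stddev=0.1):
    """Matrix of small random weights with the given (rows, cols) shape."""
    rows, cols = shape
    return [[truncated_normal(rng, stddev) for _ in range(cols)]
            for _ in range(rows)]


def bias_variable(size, value=0.1):
    """Vector of identical, slightly positive biases."""
    return [value] * size


weights = weight_variable((3, 4), random.Random(0))
biases = bias_variable(4)
print(len(weights), len(weights[0]), biases)
```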
Deep convolutional net - Implementation: convolution and pooling
- Define functions (to keep the code cleaner)
Deep convolutional net - Implementation: model (convolution and max pooling)
- 1st layer: convolutional with max pooling
- 2nd layer: convolutional with max pooling
Shrinking
- Each 2 × 2 max pooling halves the width and height of the image
- After 2 layers we have moved from 28 × 28 to 7 × 7
- For each point we have 64 features
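The shrinking arithmetic can be checked in a few lines. The sketch assumes that the convolutions themselves preserve the spatial size (through padding), so only the pooling changes it; `pooled_size` is an illustrative helper.

```python
def pooled_size(size, times, window=2):
    """Side length after applying `times` non-overlapping poolings."""
    for _ in range(times):
        size //= window
    return size


side = pooled_size(28, times=2)      # 28 -> 14 -> 7
features = 64                        # features per point after the 2nd layer
flat_length = side * side * features
print(side, flat_length)  # 7 3136
```

The value 7 × 7 × 64 = 3136 is the length of the flat vector that a following fully connected layer receives.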
Deep convolutional net - Implementation: model (ReLU, dropout and softmax)
- 3rd layer: ReLU
- Reshape: since we are switching back to fully connected layers, we need to reshape the input as a flat vector
- 3rd layer: add dropout
- 4th layer: softmax (output)
Deep convolutional net - Implementation: optimization and evaluation
- Define the cost (cross-entropy): $H_y(\hat{y}) = -\sum_i y_i \log \hat{y}_i$
- Define the training algorithm
- Start a new session (up to this point nothing has been computed)
- Initialize the variables
- Define the accuracy before training (for monitoring)
Deep convolutional net - Implementation: optimization and evaluation (continued)
- Train the model (may take a while)
- Evaluate the accuracy (it should be around 0.99)
Assignment
The third ML assignment is to compare the performance of the deep convolutional network when layers are removed. For this assignment you need to adapt the code of the complete deep architecture. By removing one layer at a time while keeping all the others, you can evaluate the change in performance of the neural network in classifying the MNIST dataset.
Note: the only coding required is to modify the shape and/or size of the input vectors of the layers. The output of each layer has to remain the same.
The report has to contain a short introduction on the methodologies used in the deep architecture shown during the lab (convolution, max pooling, ReLU, dropout, softmax).
Assignment - Steps
1. Remove the 1st layer: convolutional and max pooling
2. Train and test the network
3. Remove the 2nd layer: convolutional and max pooling
4. Train and test the network
5. Remove the 3rd layer: ReLU with dropout
6. Train and test the network
Computation: training a model on a quad-core CPU takes 30-45 minutes. You may want to use the computers in the lab.
Assignment - Submission
- After completing the assignment, submit it via email
- Send an email to paolo.dragone@unitn.it (cc: passerini@disi.unitn.it)
- Subject: tensorflowSubmit2016
- Attachment: id_name_surname.zip containing:
  - the Python code
    - model_no1.py (model without the 1st layer)
    - model_no2.py (model without the 2nd layer)
    - model_no3.py (model without the 3rd layer)
  - the report (PDF format)
Note:
- No group work
- This assignment is mandatory in order to enroll in the oral exam
References
- https://www.tensorflow.org/
- http://cs231n.github.io/convolutional-networks/
- https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf