
Introduction to Deep Models Part I: Classifiers and Generative Networks



  1. TensorFlow Workshop 2018. Introduction to Deep Models, Part I: Classifiers and Generative Networks. Nick Winovich, Department of Mathematics, Purdue University. July 2018.

  2. Outline: 1. Classifier Networks (Classifier Models, Softmax Function, MNIST Example); 2. Generative Networks (Generative Models, Adversarial Training, Selected Examples).

  5. Classifier Models Classifier models aim to assign labels/classes to input data; the inputs to classifier models typically correspond to structured data characterized by high-level properties or features (e.g. an image of a handwritten digit mapped to the label “2”). The MNIST database of handwritten digits comprises images of handwritten digits along with the corresponding labels: http://yann.lecun.com/exdb/mnist/
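As a quick illustration (not part of the slides), the MNIST images and labels can also be loaded directly through TensorFlow; this sketch assumes tf.keras is available (TensorFlow 1.4 or later) rather than downloading from the URL above:

```python
import tensorflow as tf

# Load the MNIST images (28x28 grayscale) and their integer labels 0-9.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Rescale pixel intensities from [0, 255] to [0, 1] for training.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

print(x_train.shape, y_train.shape)   # (60000, 28, 28) (60000,)
```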

  6. Automating Feature Discovery Defining features manually may be feasible for some simple cases, such as handwritten digits (e.g. symmetry, pixel intensity, etc.); for more complex problems, however, finding a sufficient set of features by hand is often impractical.
- When the input data is spatially organized, convolution layers can be used to identify local patterns and extract features (see the sketch below)
- Hidden layers can be used in series to form a hierarchy of features and expand the receptive fields of the convolutional layers
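A minimal sketch of these two points, assuming the TensorFlow 1.x tf.layers API (the layer sizes are illustrative, not taken from the workshop code): stacked convolution layers detect local patterns, and pooling between them expands the receptive field of the deeper layers.

```python
import tensorflow as tf

def feature_extractor(images):
    # images: [batch, 28, 28, 1] grayscale inputs
    # The first convolution layer detects local patterns (edges, strokes).
    h1 = tf.layers.conv2d(images, filters=16, kernel_size=3,
                          padding="same", activation=tf.nn.relu)
    h1 = tf.layers.max_pooling2d(h1, pool_size=2, strides=2)

    # The second layer combines low-level patterns into higher-level features;
    # pooling enlarges the effective receptive field of each unit.
    h2 = tf.layers.conv2d(h1, filters=32, kernel_size=3,
                          padding="same", activation=tf.nn.relu)
    h2 = tf.layers.max_pooling2d(h2, pool_size=2, strides=2)
    return h2   # [batch, 7, 7, 32] feature maps
```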

  8. The Softmax Function The softmax function can be used to convert a collection of real values [z_1, ..., z_N] into a set of probability values [p_1, ..., p_N]. These probabilities sum to 1 by construction, and can be used to select a class using the onehot encoding format. It is defined by:
\[ \mathrm{softmax}\big([z_1,\dots,z_N]\big) \,=\, \left[ \frac{\exp(z_1)}{\sum_n \exp(z_n)}, \;\dots,\; \frac{\exp(z_N)}{\sum_n \exp(z_n)} \right] \,=\, [p_1,\dots,p_N] \]
The values {z_n} are referred to as logits and are typically produced by the final network layer without using any activation. In general, the values {p_n} should not be computed directly, since efficient fused operations are available with better numerical stability.
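For illustration only (not from the slides), here is a direct softmax computation using the standard max-subtraction trick for numerical stability; in practice the fused TensorFlow operation shown on the next slide should be preferred:

```python
import numpy as np

def softmax(z):
    # Subtracting the maximum logit leaves the result unchanged,
    # but prevents overflow in exp() for large logit values.
    z = z - np.max(z)
    e = np.exp(z)
    return e / np.sum(e)

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())   # probabilities summing to 1
```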

  9. Softmax Cross Entropy Given a onehot encoded output [y_1, ..., y_N] and a set of class probabilities [p_1, ..., p_N], we define the cross entropy by:
\[ H\big([p_1,\dots,p_N],\,[y_1,\dots,y_N]\big) \,=\, -\sum_{n=1}^{N} y_n \cdot \log(p_n) \]
In particular, if the true class corresponds to k (so that y_k = 1 and y_n = 0 for all n ≠ k), the expression for the cross entropy reduces to:
\[ H\big([p_1,\dots,p_N] \,\big|\, y_k = 1\big) \,=\, -\log(p_k) \]
By construction, this value is zero precisely when p_k = 1. Note: In TensorFlow, use the original logits with the fused operation tf.nn.softmax_cross_entropy_with_logits_v2.
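A minimal usage sketch of the fused operation (TensorFlow 1.x; the placeholder shapes assume a 10-class problem such as MNIST):

```python
import tensorflow as tf

# Placeholders standing in for a batch of onehot labels and raw logits.
labels = tf.placeholder(tf.float32, shape=[None, 10])
logits = tf.placeholder(tf.float32, shape=[None, 10])

# The fused op applies the softmax internally in a numerically stable way,
# so the logits are passed in directly without any activation.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(
    labels=labels, logits=logits)

# Average over the batch to obtain a scalar training loss.
loss = tf.reduce_mean(cross_entropy)
```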

  11. Classifier Network: MNIST Example A simple example of a classifier network trained on the MNIST database is provided in the TensorFlow Examples repository: https://github.com/nw2190/TensorFlow_Examples The file Models/01_Classifier.py defines the classifier model and training procedure. This also uses the supplementary files:
- utils.py : writes the *.tfrecords files for training/validation
- flags.py : specifies model hyperparameters/training settings
- layers.py : defines custom network layers for the model
- misc.py : defines the parse function (sketched below) and early stopping hook
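As a rough idea of what such a parse function can look like (a sketch only; the actual feature keys, shapes, and pipeline settings are defined by the repository's utils.py and misc.py, not reproduced here):

```python
import tensorflow as tf

def _parse_example(serialized):
    # Hypothetical feature specification; the real keys and dtypes depend
    # on how the *.tfrecords files were written.
    features = tf.parse_single_example(serialized, {
        "image": tf.FixedLenFeature([], tf.string),
        "label": tf.FixedLenFeature([], tf.int64),
    })
    image = tf.decode_raw(features["image"], tf.uint8)
    image = tf.reshape(tf.cast(image, tf.float32) / 255.0, [28, 28, 1])
    label = tf.one_hot(features["label"], depth=10)
    return image, label

# Build an input pipeline from a records file (filename assumed).
dataset = tf.data.TFRecordDataset("train.tfrecords").map(_parse_example)
dataset = dataset.shuffle(10000).batch(64).repeat()
```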

  14. Generative Models Generative models aim to produce “realistic” examples from a target dataset; more generally, these models aim to sample from the underlying distribution which characterizes a given dataset. [figure: noise drawn from N(0, I) mapped by the generator to an image of a handwritten digit] For example, we may consider a distribution representative of all the ways in which digits can be written by hand; a generative model can then be trained to generate images of these digits.

  15. Example: Fully-Connected Generator [diagram: a noise vector passed through fully-connected layers to produce a generated image]
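A sketch of such a fully-connected generator in TensorFlow 1.x (the layer widths and activations are illustrative assumptions, not the diagrammed architecture):

```python
import tensorflow as tf

def fc_generator(z):
    # z: [batch, noise_dim] noise vectors sampled from N(0, I)
    h = tf.layers.dense(z, 256, activation=tf.nn.relu)
    h = tf.layers.dense(h, 512, activation=tf.nn.relu)
    # Sigmoid output keeps pixel values in [0, 1], matching MNIST images.
    x = tf.layers.dense(h, 28 * 28, activation=tf.nn.sigmoid)
    return tf.reshape(x, [-1, 28, 28, 1])

z = tf.random_normal([64, 100])    # a batch of 64 noise vectors
fake_images = fc_generator(z)      # [64, 28, 28, 1] generated images
```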

  16. Example: Convolutional Generator
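A corresponding convolutional sketch (again with illustrative sizes): the noise vector is first projected to a small grid of feature maps, which transposed convolutions then upsample to the full image resolution.

```python
import tensorflow as tf

def conv_generator(z):
    # Project the noise vector onto a 7x7 grid of 64 feature maps.
    h = tf.layers.dense(z, 7 * 7 * 64, activation=tf.nn.relu)
    h = tf.reshape(h, [-1, 7, 7, 64])
    # Transposed convolutions upsample 7x7 -> 14x14 -> 28x28.
    h = tf.layers.conv2d_transpose(h, 32, kernel_size=4, strides=2,
                                   padding="same", activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(h, 1, kernel_size=4, strides=2,
                                   padding="same", activation=tf.nn.sigmoid)
    return x   # [batch, 28, 28, 1] generated images

fake_images = conv_generator(tf.random_normal([64, 100]))
```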

  18. Distinguishing “Real” from “Fake” In order to actually train a generative model, however, we must have a way to accurately quantify how “real” the generated data is with respect to the target data distribution. For example, consider assigning a loss to a generated “digit” [image omitted]. We might decide to assess predictions manually, having a group of individuals subjectively assign a loss; this is clearly not scalable.

  19. Discriminators A more practical approach is to instead define an additional network component designed to handle the loss quantification automatically. These components, referred to as discriminators, are trained to distinguish between real and generated data. In particular, discriminators are designed to take structured data as inputs and produce probability values p ∈ (0, 1) reflecting how confident the network is that the given data is real rather than fake. Discriminators are typically trained to assign values p ≈ 1 to data from the training set and values p ≈ 0 to generated data. Reference: Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y., 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems (pp. 2672-2680).

  20. Example: Fully-Connected Discriminator [diagram: an input image passed through fully-connected layers to a “Real or Fake” output]
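A sketch of a fully-connected discriminator (illustrative sizes): the image is flattened and mapped to a single logit, with a sigmoid producing the probability p ∈ (0, 1).

```python
import tensorflow as tf

def fc_discriminator(images):
    # images: [batch, 28, 28, 1] real or generated inputs
    h = tf.layers.flatten(images)
    h = tf.layers.dense(h, 256, activation=tf.nn.leaky_relu)
    logit = tf.layers.dense(h, 1)        # raw, unactivated logit
    return tf.nn.sigmoid(logit), logit   # probability p in (0, 1), plus logit
```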

  21. Example: Convolutional Discriminator
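And a convolutional variant (illustrative sizes), where strided convolutions downsample the image before the final real/fake logit:

```python
import tensorflow as tf

def conv_discriminator(images):
    # Strided convolutions downsample 28x28 -> 14x14 -> 7x7.
    h = tf.layers.conv2d(images, 32, kernel_size=4, strides=2,
                         padding="same", activation=tf.nn.leaky_relu)
    h = tf.layers.conv2d(h, 64, kernel_size=4, strides=2,
                         padding="same", activation=tf.nn.leaky_relu)
    h = tf.layers.flatten(h)
    logit = tf.layers.dense(h, 1)        # single real/fake logit
    return tf.nn.sigmoid(logit), logit
```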

  22. Adversarial Training As the discriminator learns to distinguish between real and generated data, the network's generator component aims to deceive it by producing increasingly realistic data. To accomplish this, we define separate loss functions for the generator and discriminator components of the network:
- The generator loss L_G is defined so that the loss is minimized when the discriminator's prediction D(ỹ) on generated data ỹ is equal to 1 (i.e. the discriminator believes the data is real).
- The discriminator loss L_D is defined so that the loss is minimized when the discriminator assigns the correct labels: i.e. D(ỹ) = 0 for generated data ỹ and D(y) = 1 for real data y.

  23. Defining the Adversarial Loss Functions Setting y ∼ Real and ỹ ∼ Generated, we define the losses by:
\[ L_G \,=\, -\log\big(D(\tilde{y})\big) \]
\[ L_D \,=\, -\log\big(D(y)\big) \,-\, \log\big(1 - D(\tilde{y})\big) \]
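Translated directly into TensorFlow 1.x (a sketch; D_real and D_fake stand in for the discriminator's outputs D(y) and D(ỹ) on real and generated batches, and in practice the equivalent sigmoid-cross-entropy form on the raw logits is more numerically stable):

```python
import tensorflow as tf

# Stand-ins for the discriminator probabilities on real and generated data;
# in a real model these would come from the discriminator, not placeholders.
D_real = tf.placeholder(tf.float32, shape=[None, 1])   # D(y)
D_fake = tf.placeholder(tf.float32, shape=[None, 1])   # D(y_tilde)

eps = 1e-8   # keeps the logarithms finite if probabilities saturate at 0 or 1
L_G = -tf.reduce_mean(tf.log(D_fake + eps))
L_D = -tf.reduce_mean(tf.log(D_real + eps) + tf.log(1.0 - D_fake + eps))
```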

  24. Adversarial Training Algorithm
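Since the algorithm figure is not reproduced here, the sketch below pulls the previous pieces together into a minimal alternating training loop (TensorFlow 1.x; the network sizes, learning rate, and the sample_real_batch data helper are illustrative assumptions, not the workshop implementation):

```python
import numpy as np
import tensorflow as tf

def generator(z):
    # Fully-connected generator mapping noise to 28x28 images.
    h = tf.layers.dense(z, 256, tf.nn.relu, name="g_fc1")
    x = tf.layers.dense(h, 28 * 28, tf.nn.sigmoid, name="g_fc2")
    return tf.reshape(x, [-1, 28, 28, 1])

def discriminator(images):
    # Fully-connected discriminator; AUTO_REUSE shares the weights between
    # the real-data and generated-data branches.
    h = tf.layers.flatten(images)
    h = tf.layers.dense(h, 256, tf.nn.leaky_relu, name="d_fc1", reuse=tf.AUTO_REUSE)
    return tf.layers.dense(h, 1, tf.nn.sigmoid, name="d_fc2", reuse=tf.AUTO_REUSE)

z = tf.placeholder(tf.float32, [None, 100])
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
D_real, D_fake = discriminator(x), discriminator(generator(z))

eps = 1e-8
L_G = -tf.reduce_mean(tf.log(D_fake + eps))
L_D = -tf.reduce_mean(tf.log(D_real + eps) + tf.log(1.0 - D_fake + eps))

# Each loss is minimized only over its own network's variables.
g_vars = [v for v in tf.trainable_variables() if v.name.startswith("g_")]
d_vars = [v for v in tf.trainable_variables() if v.name.startswith("d_")]
g_step = tf.train.AdamOptimizer(2e-4).minimize(L_G, var_list=g_vars)
d_step = tf.train.AdamOptimizer(2e-4).minimize(L_D, var_list=d_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for it in range(10000):
        images = sample_real_batch(64)    # assumed helper returning real images
        noise = np.random.randn(64, 100).astype(np.float32)
        sess.run(d_step, {x: images, z: noise})   # discriminator update
        sess.run(g_step, {z: noise})              # generator update
```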
