
Deep Learning - Theory and Practice Linear Regression, Least Squares - PowerPoint PPT Presentation



  1. Deep Learning - Theory and Practice: Linear Regression, Least Squares; Classification and Logistic Regression (20-02-2020). Course page: http://leap.ee.iisc.ac.in/sriram/teaching/DL20/ Contact: deeplearning.cce2020@gmail.com

  2. Least Squares for Classification ❖ K-class classification problem ❖ With 1-of-K (one-hot) target encoding, least-squares regression gives a closed-form linear classifier. Bishop - PRML book (Chap 4)
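The 1-of-K least-squares classifier on this slide can be sketched in a few lines of NumPy. This is an illustrative example, not taken from the slides; the toy data, class means, and seed are my own choices. The weights come from the closed-form pseudo-inverse solution, and prediction picks the class with the largest linear output.

```python
import numpy as np

# Toy data: N points in 2-D, K = 3 classes with shifted means (assumed setup).
rng = np.random.default_rng(0)
N, D, K = 90, 2, 3
labels = rng.integers(0, K, size=N)
X = rng.normal(size=(N, D)) + 4.0 * np.eye(K)[labels, :D]  # separate the classes

T = np.eye(K)[labels]                 # N x K one-hot (1-of-K) target matrix
Xb = np.hstack([np.ones((N, 1)), X])  # prepend a bias column

# Closed-form least-squares solution: W = pinv(Xb) @ T
W = np.linalg.pinv(Xb) @ T
pred = np.argmax(Xb @ W, axis=1)      # assign the class with the largest output
accuracy = np.mean(pred == labels)
```

As Bishop (Chap. 4) notes, this classifier is simple but sensitive to outliers, which motivates logistic regression on the next slide.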

  3. Logistic Regression ❖ 2-class logistic regression ❖ Maximum likelihood solution ❖ K-class logistic regression ❖ Maximum likelihood solution Bishop - PRML book (Chap 4)
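A minimal sketch of the 2-class maximum-likelihood fit (assumed details: toy Gaussian data, fixed learning rate, batch gradient ascent). The key fact is that the gradient of the log-likelihood is X^T (t - sigma(Xw)), so ascending it is equivalent to minimizing the cross-entropy error.

```python
import numpy as np

# Toy 2-class data (assumed): two Gaussian blobs with targets t in {0, 1}.
rng = np.random.default_rng(1)
N = 200
X = np.vstack([rng.normal(-2, 1, (N // 2, 2)), rng.normal(2, 1, (N // 2, 2))])
t = np.r_[np.zeros(N // 2), np.ones(N // 2)]
Xb = np.hstack([np.ones((N, 1)), X])          # bias feature

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
w = np.zeros(3)
lr = 0.1
for _ in range(500):
    y = sigmoid(Xb @ w)                       # predicted posteriors
    w += lr * Xb.T @ (t - y) / N              # ascend the log-likelihood

acc = np.mean((sigmoid(Xb @ w) > 0.5) == t)
```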

  4. Typical Error Surfaces ❖ Typical error surface as a function of the parameters (weights and biases)

  5. Learning with Gradient Descent ❖ Error surface close to a local optimum

  6. Learning Using Gradient Descent

  7. Parameter Learning • Solving a non-convex optimization. • Iterative solution. • Depends on the initialization. • Converges to a local optimum. • Requires a judicious choice of learning rate.
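The last bullet, the judicious choice of learning rate, can be illustrated on a toy quadratic error surface (my own example, not from the slides). For f(w) = 0.5 w^T A w, gradient descent converges only when the learning rate is below 2 divided by the largest eigenvalue of A; above that threshold the steepest direction oscillates and diverges.

```python
import numpy as np

# Toy quadratic error surface (assumed): f(w) = 0.5 * w^T A w, minimum at 0.
A = np.diag([1.0, 10.0])                       # ill-conditioned: eigenvalues 1 and 10
grad = lambda w: A @ w                         # gradient of the quadratic

def descend(lr, steps=100):
    w = np.array([1.0, 1.0])
    for _ in range(steps):
        w = w - lr * grad(w)                   # gradient-descent update
    return np.linalg.norm(w)                   # distance from the optimum

good = descend(lr=0.05)   # 0.05 < 2/10: converges toward the optimum
bad = descend(lr=0.25)    # 0.25 > 2/10: the steep direction diverges
```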

  8. Least Squares versus Logistic Regression Bishop - PRML book (Chap 4)

  9. Least Squares versus Logistic Regression Bishop - PRML book (Chap 4)

  10. Neural Networks

  11. Perceptron Algorithm ❖ Perceptron model [McCulloch & Pitts, 1943; Rosenblatt, 1957] ❖ Targets are binary classes {-1, +1} ❖ What if the data is not linearly separable?
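A sketch of the classic perceptron update on separable toy data (the blobs and seed are my own illustrative choices). On each misclassified point the rule is w <- w + t_n x_n with targets in {-1, +1}; the algorithm is guaranteed to converge only when the data are linearly separable, which is exactly the question the slide raises.

```python
import numpy as np

# Toy linearly separable data (assumed): two blobs, targets in {-1, +1}.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
t = np.r_[-np.ones(20), np.ones(20)]
Xb = np.hstack([np.ones((40, 1)), X])          # absorb the bias into w

w = np.zeros(3)
for _ in range(100):                           # epochs over the data
    mistakes = 0
    for x_n, t_n in zip(Xb, t):
        if t_n * (w @ x_n) <= 0:               # misclassified (or on the boundary)
            w += t_n * x_n                     # perceptron update
            mistakes += 1
    if mistakes == 0:                          # converged: every point correct
        break

acc = np.mean(np.sign(Xb @ w) == t)
```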

  12. Multi-layer Perceptron ❖ Multi-layer Perceptron [Rumelhart et al., 1986] ❖ Non-linear activation function (tanh, sigmoid) in place of the hard thresholding function

  13. Neural Networks ❖ Multi-layer Perceptron [Rumelhart et al., 1986] ❖ Non-linear activation function (tanh, sigmoid) in place of the hard thresholding function ❖ Useful for non-linear decision boundaries: non-linear class separation can be realized given enough data.
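The layered structure described above can be sketched as a single forward pass (the layer sizes and random weights here are my own illustrative assumptions, not from the slides): a tanh hidden layer followed by a sigmoid output, which is what gives the network a non-linear decision boundary.

```python
import numpy as np

# Assumed toy architecture: 2 inputs -> 4 hidden units (tanh) -> 1 output (sigmoid).
rng = np.random.default_rng(3)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)   # input-to-hidden weights and biases
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # hidden-to-output weights and biases

def forward(x):
    h = np.tanh(W1 @ x + b1)                    # non-linear hidden activation
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2))) # sigmoid output in (0, 1)

y = forward(np.array([0.5, -1.0]))
```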

  14. Neural Networks ❖ Types of non-linearities: tanh, sigmoid, ReLU ❖ Cost functions: Cross-Entropy and Mean Square Error, measured against the desired outputs
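The non-linearities and cost functions named on this slide are all one-liners in NumPy; a small sketch (the epsilon clipping in the cross-entropy is my own addition to avoid log(0)):

```python
import numpy as np

# The three non-linearities from the slide.
tanh = np.tanh
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
relu = lambda a: np.maximum(0.0, a)

# Cost functions: y are network outputs, d are the desired outputs.
def mean_square_error(y, d):
    return np.mean((y - d) ** 2)

def cross_entropy(y, d, eps=1e-12):             # binary cross-entropy
    y = np.clip(y, eps, 1 - eps)                # clip to avoid log(0)
    return -np.mean(d * np.log(y) + (1 - d) * np.log(1 - y))

a = np.array([-1.0, 0.0, 2.0])
```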

  15. Learning Posterior Probabilities with NNs Choice of target function • Softmax function for classification • Softmax produces positive values that sum to 1 • Allows the interpretation of outputs as posterior probabilities
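The two softmax properties listed above (positive values that sum to 1) are easy to verify in code; subtracting the maximum before exponentiating is a standard numerical-stability trick and does not change the result:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))   # shift by the max for numerical stability
    return e / e.sum()          # positive values that sum to 1

# Outputs can then be read as posterior probabilities over the K classes.
p = softmax(np.array([2.0, 1.0, 0.1]))
```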
