  1. Neural Networks. Oskar Taubert (SCC), 15.01.2020. KIT, The Research University in the Helmholtz Association. www.kit.edu

  2. Neural Network Concept: a (very) remotely brain-inspired computational system; a directed graph encoding an ordered system of simple mathematical transformations; successor of the perceptron concept (i.e. logistic regression); a more complicated 'fit', i.e. a universal function approximator; usually supervised machine learning.

  3. Motivation: automation of tasks machines are traditionally bad at: image recognition, natural language processing, and so on; image-processing challenges, e.g. character recognition (MNIST).

  4. Perceptron: decide whether an image x depicts a 0: o = Θ(w · x + b), where the output is o ∈ {0, 1}, the input is x ∈ ℝ^|Pixels|, and the parameters are w ∈ ℝ^|Pixels| and b ∈ ℝ.

  5. Perceptron (cont.): the same model, o = Θ(w · x + b). Problem: non-linear decision boundaries; a single perceptron can only separate classes with a hyperplane, so it fails whenever the classes are not linearly separable.
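As a concrete illustration, a minimal numpy sketch of the decision rule o = Θ(w · x + b); the image, weights, and bias below are placeholder values, since real parameters come from training:

```python
import numpy as np

def perceptron(x, w, b):
    """Step function of the affine score: o = Theta(w . x + b)."""
    return int(np.heaviside(w @ x + b, 1))

# Hypothetical example: a 28x28 image flattened to |Pixels| = 784.
rng = np.random.default_rng(0)
x = rng.random(784)           # stand-in for a flattened image
w = rng.standard_normal(784)  # in practice, learned parameters
b = 0.0
print(perceptron(x, w, b))    # 0 or 1: "does x depict a 0?"
```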

  6. XOR, step 1: an OR-gate built from a threshold unit, h₁ = Θ(x₁ + x₂ − 1). (Figure: its decision line in the x₁-x₂ unit square.)

  7. XOR, step 2: add a NAND-gate, h₂ = Θ(−x₁ − x₂ + 1.5). (Figure: both decision lines in the x₁-x₂ unit square.)

  8. XOR, step 3: an AND-gate combines the two, ŷ = Θ(h₁ + h₂ − 1.5); together the three linear gates implement XOR, which no single perceptron can. (Figure: decision lines in x-space and in h-space.)
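The gate construction from slides 6 to 8 can be verified directly; a minimal sketch, assuming threshold units with Θ(0) = 1 and the sign-corrected NAND bias:

```python
import numpy as np

def theta(z):
    return np.heaviside(z, 1)  # step function; Theta(0) = 1 here

def xor(x1, x2):
    h1 = theta(x1 + x2 - 1.0)    # OR-gate
    h2 = theta(-x1 - x2 + 1.5)   # NAND-gate
    return theta(h1 + h2 - 1.5)  # AND-gate of the two hidden units

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, int(xor(a, b)))  # prints 0, 1, 1, 0: XOR
```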

  9. Multilayer Perceptron: h = g(V · x + c), o = f(W · h + b), with input x ∈ ℝ^|Pixels|, hidden layer h ∈ ℝ³, output o ∈ ℝ¹⁰, and parameters V ∈ ℝ^(3×|Pixels|), c ∈ ℝ³, W ∈ ℝ^(10×3), b ∈ ℝ¹⁰. (Figure: fully connected graph with input nodes x₀ … xₙ, hidden nodes h₀, h₁, h₂, and output nodes o₀, o₁, ….)

  10. Multilayer Perceptron, open questions: what are f and g? How do we find values for W and V? (Figure: the same network graph.)
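A minimal forward pass for the network on slide 9, assuming sigmoid for both f and g (the slide deliberately leaves f and g open):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, V, c, W, b):
    h = sigmoid(V @ x + c)   # hidden layer: h = g(V.x + c)
    o = sigmoid(W @ h + b)   # output layer: o = f(W.h + b)
    return o, h

# Shapes from the slide: |Pixels| = 784 (e.g. MNIST), 3 hidden units, 10 outputs.
rng = np.random.default_rng(0)
x = rng.random(784)
V, c = 0.01 * rng.standard_normal((3, 784)), np.zeros(3)
W, b = 0.01 * rng.standard_normal((10, 3)), np.zeros(10)
o, _ = forward(x, V, c, W, b)
print(o.shape)  # (10,)
```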

  11. Training: parameters = {W, V, b, c}; error measure E(o, t) = (o − t)².

  12. Training (cont.): gradient descent on the error, parameters ← parameters − λ · ∂E/∂parameters.

  13. Training (cont.): the gradients for the output layer follow from the chain rule, with f′ evaluated at the pre-activation W · h + b: ∂E/∂W = (∂E/∂o) · f′ · hᵀ and ∂E/∂b = (∂E/∂o) · f′.

  14. Training (cont.): one layer further down, the same chain rule reaches V, with g′ evaluated at V · x + c: ∂E/∂V = (∂E/∂o) · f′ · W · g′ · xᵀ, read left to right as the error signal passing back through f, through the weights W, through g, down to the input x.

  15. Training, more general: for a deep network, E(t, f_n(W_n · f_{n−1}(… f_1(W_1 · x)))), the gradient of each layer factors through an error signal δ: ∂E/∂Wᵢ = δᵢ · hᵢ₋₁, with the recursion δᵢ₋₁ = δᵢ · Wᵢ · f′ᵢ₋₁ evaluated at Wᵢ₋₁ · hᵢ₋₂. This recursion is backpropagation.
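Putting slides 11 to 15 together, a sketch of one gradient-descent step with the backpropagation worked out by hand, again assuming sigmoid activations so that f′ = f · (1 − f):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, t, V, c, W, b, lam=0.1):
    """One update: parameters <- parameters - lambda * dE/dparameters,
    for E(o, t) = sum((o - t)^2) on the two-layer net from slide 9."""
    h = sigmoid(V @ x + c)
    o = sigmoid(W @ h + b)
    delta_o = 2.0 * (o - t) * o * (1.0 - o)    # dE/do times f'
    delta_h = (W.T @ delta_o) * h * (1.0 - h)  # back through W, then g'
    W -= lam * np.outer(delta_o, h)            # dE/dW = delta_o . h^T
    b -= lam * delta_o
    V -= lam * np.outer(delta_h, x)            # dE/dV = delta_h . x^T
    c -= lam * delta_h
    return float(np.sum((o - t) ** 2))

# One step on a dummy example (shapes as on slide 9):
rng = np.random.default_rng(0)
x, t = rng.random(784), np.eye(10)[0]          # input and one-hot target
V, c = 0.01 * rng.standard_normal((3, 784)), np.zeros(3)
W, b = 0.01 * rng.standard_normal((10, 3)), np.zeros(10)
print(train_step(x, t, V, c, W, b))
```

Repeated over many (x, t) pairs this drives the error down, which is all that 'training' means here.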

  16. Error functions. Regression: MSE, KL divergence. Classification: cross entropy, NLL loss. Segmentation: hinge losses, overlap/dissimilarity losses.
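For reference, the two most common losses above in a few lines of numpy; this is a sketch of the definitions, not any particular library's API:

```python
import numpy as np

def mse(o, t):
    """Mean squared error: the standard regression loss."""
    return np.mean((o - t) ** 2)

def cross_entropy(p, t, eps=1e-12):
    """Cross entropy for one-hot targets t and predicted probabilities p."""
    return -np.sum(t * np.log(p + eps))
```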

  17.–19. Convolutions. (Figures: step-by-step illustration of a convolution filter sliding over an image; © Machine Learning Guru.)
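Since the convolution figures do not survive the text export, here is a minimal numpy sketch of what such a layer computes: a small filter slid across the image (strictly speaking cross-correlation, which is what deep-learning 'convolutions' compute):

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D convolution: slide the kernel over the image, no padding."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

img = np.zeros((5, 5)); img[:, 2:] = 1.0  # image with a vertical edge
edge = np.array([[1.0, 0.0, -1.0]] * 3)   # simple vertical-edge filter
print(conv2d(img, edge))                  # strongest response along the edge
```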

  20. Activation Functions: activation functions f(x) introduce non-linearity, e.g. the sigmoid. Other non-linear choices: tanh(x), relu(x) = max(0, x), softmax_i(x) = exp(x_i) / Σ_j exp(x_j), etc. Some have better numerical properties, e.g. they avoid the vanishing gradient. (Figure: plots of sigmoid, tanh, ReLU, and SeLU over x ∈ [−2, 2].)
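These activations are one-liners in numpy; a minimal sketch, including the usual numerically stable form of softmax:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / np.sum(e)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), np.tanh(z), relu(z), softmax(z), sep="\n")
```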

  21. Regularization. (Figure: polynomial fits of degree 0, 1, 3, and 9 to the same data, illustrating underfitting and overfitting.)

  22. Regularization techniques: early stopping, weight decay, weight sharing, dropout, batch normalization, data augmentation, more data. (Figure: training and test loss J over epochs; the test loss reaches an optimum and then rises again, motivating early stopping.)
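Of these, early stopping maps most directly onto the loss curves in the figure; a minimal sketch in which train_epoch, eval_loss, and the toy model are hypothetical stand-ins for a real training setup:

```python
import random

def train_epoch(model):         # stand-in for one pass over the training data
    model["w"] += random.uniform(-0.1, 0.1)

def eval_loss(model, val_set):  # stand-in: pretend the held-out loss is |w - 1|
    return abs(model["w"] - 1.0)

model, val_set, max_epochs, patience = {"w": 0.0}, None, 100, 5
best, bad = float("inf"), 0
for epoch in range(max_epochs):
    train_epoch(model)
    loss = eval_loss(model, val_set)
    if loss < best:
        best, bad = loss, 0     # validation loss still improving
    else:
        bad += 1
        if bad >= patience:     # rose for `patience` epochs in a row: stop
            break
print(epoch, best)
```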

  23. Hyperparameters: chosen by guessing, experience, or non-gradient-based optimization: grid search, random search, particle swarm, genetic algorithms.
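Random search, one of the non-gradient-based options listed, fits in a few lines; train_and_score below is a hypothetical stand-in for a full train-plus-validate run:

```python
import random

def train_and_score(lam, hidden):  # hypothetical: would train a model and
    return (lam - 0.01) ** 2 + (hidden - 30) ** 2 / 1e4  # return validation loss

space = {"lam": [1e-3, 1e-2, 1e-1], "hidden": [3, 10, 30, 100]}
best = None
for _ in range(20):
    cfg = {k: random.choice(v) for k, v in space.items()}
    score = train_and_score(**cfg)
    if best is None or score < best[0]:
        best = (score, cfg)        # keep the best configuration seen so far
print(best)
```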

  24. Out of Scope: residual models, generative models, recurrent models, attention models, and lots more; reinforcement learning (next week).

  25. Sources: http://nyu-cds.sparksites.io/wp-content/uploads/2015/10/header_4@2x.png and https://github.com/Markus-Goetz/gks-2019/blob/solutions/slides/slides.pdf
