
Perceptrons. Steven J Zeil, Old Dominion Univ., Fall 2010.



  1. Perceptrons. Steven J Zeil, Old Dominion Univ., Fall 2010.

  2. Outline:
     1 Introduction: Neural Networks
     2 The Perceptron: Using Perceptrons; Training
     3 Multilayer Perceptrons: Structure
     4 Training MLPs: Backpropagation; Improving Convergence; Overtraining; Tuning Network Size
     5 Applying MLPs: Structuring Networks; Dimensionality Reduction; Time Delay Neural Networks; Recurrent Networks

  3. Neural Networks. Networks of processing units (neurons) with connections (synapses) between them. Large number of neurons: ~10^10. Large connectivity: ~10^5 connections per neuron. Parallel processing. Robust.

  4. Computing via NN. Not so much an attempt to imitate the brain as an approach inspired by it. A model for massively parallel processing. Simplest building block: the perceptron.

  5. The Perceptron (section outline).

  6. Perceptron (Rosenblatt, 1962). The w_i are connection weights. Output: y = w^T x, where w = [w_0, w_1, ..., w_d]^T and x = [1, x_1, x_2, ..., x_d]^T is the augmented input vector.
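
     As a concrete illustration (a minimal sketch, not from the slides, assuming NumPy and made-up weights), the perceptron output with an augmented input vector:

    import numpy as np

    def perceptron_output(w, x):
        """y = w^T x for an augmented input: prepend x0 = 1 so that w0 acts as the bias."""
        x_aug = np.concatenate(([1.0], x))
        return float(w @ x_aug)

    # Hypothetical example with d = 2 inputs: w = [w0, w1, w2]
    w = np.array([0.5, -1.0, 2.0])
    print(perceptron_output(w, np.array([3.0, 1.0])))   # 0.5 - 3.0 + 2.0 = -0.5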

  7. Basic Uses. y = w^T x + w_0. Linear regression; linear discriminant between 2 classes. Use multiple perceptrons for K > 2 classes.

  8. Perceptron Output Functions. Many perceptrons have a "post-processing" function at the output node. A common choice is the threshold: y = 1 if w^T x > 0, and y = 0 otherwise. Useful for classification.
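
     Continuing the sketch above (an assumed helper, not from the slides), the thresholded output is just a comparison against zero:

    def perceptron_classify(w, x):
        """Return 1 if w^T x > 0 (augmented x), else 0."""
        return 1 if perceptron_output(w, x) > 0 else 0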

  9. Sigmoid Output Functions. Useful when we need differentiation or the ability to estimate posterior probabilities. y = sigmoid(o) = 1 / (1 + exp(-w^T x)).
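
     A minimal sketch of the sigmoid output, building on the helpers above (the posterior-estimate interpretation is the slide's; the code itself is an assumption):

    def sigmoid(o):
        """Logistic sigmoid 1 / (1 + exp(-o))."""
        return 1.0 / (1.0 + np.exp(-o))

    def perceptron_posterior(w, x):
        """Estimate P(C1 | x) as sigmoid(w^T x)."""
        return sigmoid(perceptron_output(w, x))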

  10. K Classes. o_i = w_i^T x. Use softmax: y_i = exp(o_i) / sum_k exp(o_k). Choose C_i if y_i = max_k y_k.
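
     A hedged sketch of the K-class case, with W holding one weight row w_i per class (the max-subtraction is a standard numerical-stability detail, not on the slide):

    import numpy as np

    def softmax(o):
        """y_i = exp(o_i) / sum_k exp(o_k)."""
        e = np.exp(o - np.max(o))   # subtract max(o) for numerical stability
        return e / e.sum()

    def classify_k(W, x):
        """Compute o_i = w_i^T x for each class, then choose C_i with the largest y_i."""
        x_aug = np.concatenate(([1.0], x))
        y = softmax(W @ x_aug)      # W is K x (d+1)
        return int(np.argmax(y)), y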

  11. Training. Perceptrons allow online (incremental) training rather than the usual batch training: there is no need to store the whole sample, and the weights adjust to slow changes in the problem domain. This is an incremental form of gradient descent: update after each training instance, moving against the error gradient. LMS update: Δw_ij^t = η (r_i^t − y_i^t) x_j^t, where η is the learning factor; its size controls the rate of convergence and stability.
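
     A minimal sketch of one online LMS step (function name and the learning-factor default are assumptions; W is a K x (d+1) weight matrix as in the K-class example above):

    import numpy as np

    def lms_update(W, x, r, eta=0.1):
        """One incremental update on a single instance (x, r)."""
        x_aug = np.concatenate(([1.0], x))
        y = W @ x_aug                        # current (linear) outputs
        W += eta * np.outer(r - y, x_aug)    # Δw_ij = eta * (r_i - y_i) * x_j
        return W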

  12. Update Rule: Regression. The error function is E^t(w | x^t, r^t) = (1/2)(r^t − w^T x^t)^2, with gradient components ∂E^t/∂w_j = −(r^t − w^T x^t) x_j^t = −(r^t − y^t) x_j^t. Therefore, moving against the gradient, Δw_j^t = η (r^t − y^t) x_j^t.

  13. Update Rule: Classification. Δw_ij^t = η (r_i^t − y_i^t) x_j^t. For K = 2, y^t = sigmoid(w^T x^t) leads to the same update function as for regression; for K > 2, softmax leads to the same update as well.

  14. Example: Learning Boolean Functions. A spreadsheet example demonstrates that perceptrons can learn linearly separable functions (AND, OR, NAND, ...) but cannot learn XOR (Minsky & Papert, 1969). This result nearly halted all work on neural networks until 1982.

  15. Multilayer Perceptrons (section outline).

  16. Multilayer Perceptrons. Adds one or more hidden layers (Rumelhart et al., 1986):
      y_i = v_i^T z = sum_{h=1}^{H} v_ih z_h + v_i0
      z_h = sigmoid(w_h^T x) = 1 / (1 + exp(-(sum_{j=1}^{d} w_hj x_j + w_h0)))
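
     A minimal NumPy sketch of this forward computation (the names W, V and the bias-handling convention are assumptions, not the slides' notation):

    import numpy as np

    def mlp_forward(W, V, x):
        """One-hidden-layer MLP forward pass.
        W: H x (d+1) first-layer weights w_hj; V: K x (H+1) second-layer weights v_ih."""
        x_aug = np.concatenate(([1.0], x))
        z = 1.0 / (1.0 + np.exp(-(W @ x_aug)))    # z_h = sigmoid(w_h^T x)
        z_aug = np.concatenate(([1.0], z))        # prepend bias unit for v_i0
        y = V @ z_aug                             # y_i = v_i^T z
        return y, z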

  17. Learning XOR.

  18. MLP as a Universal Approximator. Any function with continuous inputs and outputs can be approximated by an MLP. Given two hidden layers, one can be used to divide the input domain into regions and the other to compute a piecewise linear regression function over them. The hidden layers may need to be arbitrarily wide.

  19. Training MLPs (section outline).

  20. Training MLPs: Backpropagation.
      y_i = v_i^T z = sum_{h=1}^{H} v_ih z_h + v_i0
      z_h = sigmoid(w_h^T x)
     Given the z values, we could train the v as we do a single-layer perceptron:
      Δv_h = η Σ_t (r^t − y^t) z_h^t

  21. Backpropagation (cont.). For the first-layer weights, apply the chain rule:
      Δw_hj = −η ∂E/∂w_hj
      ∂E/∂w_hj = Σ_t (∂E^t/∂y^t)(∂y^t/∂z_h)(∂z_h/∂w_hj)
               = Σ_t −(r^t − y^t) v_h (∂z_h/∂w_hj)
               = Σ_t −(r^t − y^t) v_h z_h^t (1 − z_h^t) x_j^t
     so Δw_hj = η Σ_t (r^t − y^t) v_h z_h^t (1 − z_h^t) x_j^t

  22. Backpropagation Algorithm.
      Initialize all v_ih and w_hj to rand(-0.01, 0.01)
      repeat
        for all (x^t, r^t) in X, in random order do
          for h = 1, ..., H do
            z_h <- sigmoid(w_h^T x^t)
          end for
          for i = 1, ..., K do
            y_i <- v_i^T z
          end for
          for i = 1, ..., K do
            Δv_i <- η (r_i^t − y_i^t) z
          end for
          for h = 1, ..., H do
            Δw_h <- η (Σ_i (r_i^t − y_i^t) v_ih) z_h (1 − z_h) x^t
          end for
          for i = 1, ..., K do
            v_i <- v_i + Δv_i
          end for
          for h = 1, ..., H do
            w_h <- w_h + Δw_h
          end for
        end for
      until convergence
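
     A runnable NumPy sketch of this loop (a fixed epoch count stands in for the convergence test; the seed, η, and H values are assumptions):

    import numpy as np

    def train_mlp(X, R, H=3, eta=0.1, epochs=1000, seed=0):
        """Online backpropagation for a one-hidden-layer MLP.
        X: N x d inputs, R: N x K targets. Returns W (H x (d+1)) and V (K x (H+1))."""
        rng = np.random.default_rng(seed)
        N, d = X.shape
        K = R.shape[1]
        W = rng.uniform(-0.01, 0.01, size=(H, d + 1))    # w_hj
        V = rng.uniform(-0.01, 0.01, size=(K, H + 1))    # v_ih
        for _ in range(epochs):                          # stand-in for "until convergence"
            for t in rng.permutation(N):                 # instances in random order
                x = np.concatenate(([1.0], X[t]))        # augmented input
                z = 1.0 / (1.0 + np.exp(-(W @ x)))       # z_h = sigmoid(w_h^T x)
                z_aug = np.concatenate(([1.0], z))
                y = V @ z_aug                            # y_i = v_i^T z
                err = R[t] - y                           # r_i^t - y_i^t
                dV = eta * np.outer(err, z_aug)          # Δv_i = η (r_i - y_i) z
                # Δw_h = η (Σ_i (r_i - y_i) v_ih) z_h (1 - z_h) x
                dW = eta * np.outer((err @ V[:, 1:]) * z * (1.0 - z), x)
                V += dV
                W += dW
        return W, V

    # Hypothetical usage: the XOR data from the earlier slide
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    R = np.array([[0], [1], [1], [0]], dtype=float)
    W, V = train_mlp(X, R, H=3, eta=0.5, epochs=5000)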

  23. Applying Backpropagation. Batch learning: make multiple passes over the entire sample, updating v and w after each complete pass; each pass is called an epoch. Online learning: one pass, with a smaller η.
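
     For contrast, a minimal sketch of one batch epoch, accumulating the updates over the whole sample before applying them once (function and variable names are assumptions; conventions follow the training sketch above):

    import numpy as np

    def batch_epoch(W, V, X, R, eta=0.01):
        """One epoch of batch backpropagation: sum per-instance updates, apply once."""
        dW, dV = np.zeros_like(W), np.zeros_like(V)
        for t in range(X.shape[0]):
            x = np.concatenate(([1.0], X[t]))
            z = 1.0 / (1.0 + np.exp(-(W @ x)))
            z_aug = np.concatenate(([1.0], z))
            err = R[t] - V @ z_aug
            dV += eta * np.outer(err, z_aug)
            dW += eta * np.outer((err @ V[:, 1:]) * z * (1.0 - z), x)
        return W + dW, V + dV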

  24. Example of Batch Learning.

  25. Multiple Hidden Levels. Multiple hidden levels are possible; backpropagation generalizes to any number of levels.

  26. Improving Convergence. Momentum: attempts to damp out oscillations by averaging in the "trend" of prior updates:
      Δw_i^t = −η ∂E^t/∂w_i + α Δw_i^{t−1},   0.5 ≤ α < 1.0
     Adaptive learning rate: keep η large while learning is still making progress, decreasing it later:
      Δη = +a if E^{t+τ} < E^t; Δη = −bη otherwise
     Note that the increase is arithmetic, but the decrease is geometric.
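
     A minimal sketch of both rules (parameter defaults are assumptions; grad is the gradient ∂E^t/∂w computed elsewhere, e.g. by backpropagation):

    def momentum_step(w, grad, prev_delta, eta=0.1, alpha=0.9):
        """Momentum: Δw^t = -η ∂E^t/∂w + α Δw^(t-1); returns (new w, Δw^t)."""
        delta = -eta * grad + alpha * prev_delta
        return w + delta, delta

    def adapt_eta(eta, err_after, err_before, a=0.01, b=0.5):
        """Adaptive learning rate: arithmetic increase while the error keeps
        falling, geometric decrease (by a factor 1 - b) otherwise."""
        return eta + a if err_after < err_before else eta - b * eta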

  27. Overtraining. MLPs are subject to overtraining, partly due to the large number of parameters, but it is also a function of training time: the w_i start near zero, so in effect those parameters are ignored at first.
