The Perceptron Algorithm


  1. The Perceptron Algorithm. Perceptron (Frank Rosenblatt, 1957):
     • First learning algorithm for neural networks;
     • Originally introduced for character classification, where each character is represented as an image.

  2. Perceptron (contd.)
     • Total input to the output node: $\sum_{j=1}^{n} w_j x_j$
     • The output unit applies the activation function (Heaviside step):
       $H(x) = \begin{cases} 1 & \text{if } x \geq 0 \\ 0 & \text{if } x < 0 \end{cases}$
     Perceptron: Learning Algorithm
     • Goal: define a learning algorithm for the weights in order to compute a mapping from the inputs to the outputs;
     • Example: a two-class character recognition problem.
       – Training set: a set of images representing either the character 'a' or the character 'b' (supervised learning);
       – Learning task: learn the weights so that when a new unlabelled image comes in, the network can predict its label;
       – Settings: class 'a' → 1 (class C1), class 'b' → 0 (class C2); n input units (each the intensity level of a pixel) and 1 output unit, so the perceptron needs to learn $f: \mathbb{R}^n \to \{0, 1\}$.
     (A small code sketch of this forward computation follows.)
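To make the forward pass concrete, here is a minimal Python sketch of the computation described above; the function and variable names, and the example numbers, are my own rather than taken from the slides.

```python
import numpy as np

def heaviside(z):
    # Activation function H(z): 1 if z >= 0, else 0.
    return 1 if z >= 0 else 0

def perceptron_output(w, x, theta):
    # Total input to the output node, sum_j w_j * x_j, compared with the threshold theta.
    total_input = np.dot(w, x)
    return heaviside(total_input - theta)

# Tiny made-up example: three "pixel" inputs, arbitrary weights and threshold.
w = np.array([0.4, -0.2, 0.7])
x = np.array([1.0, 0.0, 1.0])
print(perceptron_output(w, x, theta=0.5))   # -> 1, since 1.1 - 0.5 >= 0
```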

  3. Perceptron: Learning Algorithm
     The algorithm proceeds as follows:
     • Initial random setting of the weights;
     • The input is a random sequence $\{x_k\}_{k \in \mathbb{N}}$;
     • For each element of class C1, if the output is 1 (correct) do nothing, otherwise update the weights;
     • For each element of class C2, if the output is 0 (correct) do nothing, otherwise update the weights.
     A bit more formally:
     • $x = (x_1, x_2, \ldots, x_n)$, $w = (w_1, w_2, \ldots, w_n)$, and $\theta$ is the threshold of the output unit;
     • $w^T x = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n$;
     • The output is 1 if $w^T x - \theta \geq 0$.
     To eliminate the explicit dependence on $\theta$, set $\hat{x} = (x_1, \ldots, x_n, 1)$ and $\hat{w} = (w_1, \ldots, w_n, -\theta)$; the output is then 1 if
     $\hat{w}^T \hat{x} = \sum_{i=1}^{n+1} \hat{w}_i \hat{x}_i \geq 0$
     (see the sketch below).
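A quick sketch of the threshold elimination (assumed names, made-up numbers): appending a constant 1 to the input and $-\theta$ to the weight vector leaves the decision unchanged.

```python
import numpy as np

w = np.array([0.4, -0.2, 0.7])    # original weights
theta = 0.5                        # threshold of the output unit
x = np.array([1.0, 0.0, 1.0])

# Augmented vectors: x_hat = (x_1, ..., x_n, 1), w_hat = (w_1, ..., w_n, -theta).
x_hat = np.append(x, 1.0)
w_hat = np.append(w, -theta)

# The two decision rules are identical: w.x - theta >= 0  <=>  w_hat . x_hat >= 0.
print(np.dot(w, x) - theta >= 0)     # True
print(np.dot(w_hat, x_hat) >= 0)     # True
```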

  4. Perceptron: Learning Algorithm
     • We want to learn values of the weights so that the perceptron correctly discriminates elements of C1 from elements of C2.
     • Given an input x, if x is classified correctly the weights are unchanged; otherwise:
       $w' = w + x$ if an element of class C1 (label 1) was classified as in C2
       $w' = w - x$ if an element of class C2 (label 0) was classified as in C1
     • 1st case: $x \in C_1$ but was classified in $C_2$.
       The correct answer is 1, which corresponds to $\hat{w}^T \hat{x} \geq 0$; we have instead $\hat{w}^T \hat{x} < 0$.
       We want to move closer to the correct answer, i.e. $w^T x < w'^T x$, which holds iff $w^T x < (w + x)^T x$.
       Since $(w + x)^T x = w^T x + x^T x = w^T x + \|x\|^2$ and $\|x\|^2 \geq 0$, the condition is verified.
       (A numeric check follows.)
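As a numeric sanity check of the 1st case (made-up values, not from the slides): adding x to w raises $w^T x$ by exactly $\|x\|^2$, moving the misclassified point toward the positive side.

```python
import numpy as np

w = np.array([-0.3, 0.1, -0.5])   # current augmented weights (arbitrary)
x = np.array([1.0, 0.0, 1.0])     # misclassified element of class C1

before = np.dot(w, x)             # -0.8 < 0: wrongly assigned to C2
w_new = w + x                     # update rule for this case
after = np.dot(w_new, x)          # before + ||x||^2 = -0.8 + 2.0 = 1.2

print(before, after)                              # -0.8 1.2
print(np.isclose(after, before + np.dot(x, x)))   # True
```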

  5. Perceptron: Learning Algorithm
     Recall the update rule:
       $w' = w + x$ if an element of class C1 (label 1) was classified as in C2
       $w' = w - x$ if an element of class C2 (label 0) was classified as in C1
     • 2nd case: $x \in C_2$ but was classified in $C_1$.
       The correct answer is 0, which corresponds to $\hat{w}^T \hat{x} < 0$; we have instead $\hat{w}^T \hat{x} \geq 0$.
       We want to move closer to the correct answer, i.e. $w^T x > w'^T x$, which holds iff $w^T x > (w - x)^T x$.
       Since $(w - x)^T x = w^T x - x^T x = w^T x - \|x\|^2$ and $\|x\|^2 \geq 0$, the condition is verified.
     The previous rule lets the network move closer to the correct answer whenever it makes an error.
     • In summary:
       1. A random sequence $x_1, x_2, \ldots, x_k, \ldots$ is generated such that $x_i \in C_1 \cup C_2$.
       2. If $x_k$ is correctly classified, then $w_{k+1} = w_k$; otherwise
          $w_{k+1} = w_k + x_k$ if $x_k \in C_1$
          $w_{k+1} = w_k - x_k$ if $x_k \in C_2$
       (A sketch of the complete loop follows.)
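A minimal sketch of the complete learning loop summarized above, using the augmented vectors. The function name, the epoch cap, and the random initialization are my own choices, not prescribed by the slides.

```python
import numpy as np

def train_perceptron(X, y, max_epochs=100, seed=0):
    # X: (m, n) array of inputs; y: array of 0/1 labels (1 = class C1, 0 = class C2).
    # Returns the augmented weight vector w_hat, whose last entry plays the role of -theta.
    X_hat = np.hstack([X, np.ones((X.shape[0], 1))])              # append the constant input 1
    w = np.random.default_rng(seed).normal(size=X_hat.shape[1])   # initial random weights
    for _ in range(max_epochs):
        errors = 0
        for x, label in zip(X_hat, y):
            output = 1 if np.dot(w, x) >= 0 else 0
            if output == label:
                continue                        # correctly classified: do nothing
            w = w + x if label == 1 else w - x  # update rule
            errors += 1
        if errors == 0:                         # a full pass with no mistakes: stop
            break
    return w
```

The epoch cap is only a safeguard for non-separable data; for linearly separable inputs the loop exits early, as the convergence theorem on the next slide guarantees.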

  6. Perceptron: Learning Algorithm
     Does the learning algorithm converge?
     Convergence theorem: regardless of the initial choice of weights, if the two classes are linearly separable, i.e. there exists $\hat{w}$ such that
     $\hat{w}^T \hat{x} \geq 0$ if $x \in C_1$ and $\hat{w}^T \hat{x} < 0$ if $x \in C_2$,
     then the learning rule will find such a solution after a finite number of steps.
     Representational Power of Perceptrons
     • Marvin Minsky and Seymour Papert, "Perceptrons", 1969: "The perceptron can solve only problems with linearly separable classes."
     • Examples of linearly separable Boolean functions: AND, OR (a small convergence demo on OR follows).
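To illustrate the theorem on one of the linearly separable examples above, here is a self-contained sketch (my own setup, not from the slides) that runs the update rule on the OR truth table until an error-free pass.

```python
import numpy as np

# OR truth table: linearly separable, so the rule must converge in finitely many steps.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1])                     # OR labels
X_hat = np.hstack([X, np.ones((4, 1))])        # augmented inputs
w = np.zeros(3)                                 # the theorem holds for any initial weights

for epoch in range(100):
    errors = 0
    for x, label in zip(X_hat, y):
        output = 1 if w @ x >= 0 else 0
        if output != label:
            w = w + x if label == 1 else w - x
            errors += 1
    if errors == 0:
        # Reaches an error-free pass quickly; here w_hat ends up at [1, 1, -1].
        print("error-free pass at epoch", epoch + 1, "-> w_hat =", w)
        break
```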

  7. Representational Power of Perceptrons
     [Figures: a perceptron with weights 1, 1 and bias -1.5 computes the AND function; a perceptron with weights 1, 1 and bias -0.5 computes the OR function.]
     • Example of a Boolean function that is not linearly separable: XOR.
     The XOR function cannot be computed by a perceptron.
     (A quick check of both figures, and of the XOR impossibility, follows.)
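The figures' hand-built perceptrons can be checked directly, and a short algebraic argument (the standard one, not spelled out on the slide) shows why no choice of weights handles XOR. The helper names below are mine.

```python
def step(z):
    return 1 if z >= 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Perceptrons from the figures: weights (1, 1) with bias -1.5 (AND) or -0.5 (OR).
for name, bias, target in [("AND", -1.5, lambda a, b: a & b),
                           ("OR",  -0.5, lambda a, b: a | b)]:
    ok = all(step(a + b + bias) == target(a, b) for a, b in inputs)
    print(name, "reproduced:", ok)             # True for both

# XOR: suppose H(w1*a + w2*b + b0) computed it. Then we would need
#   w1 + b0 >= 0 and w2 + b0 >= 0   (output 1 on (1,0) and (0,1)),
#   b0 < 0 and w1 + w2 + b0 < 0     (output 0 on (0,0) and (1,1)).
# Adding the first two gives w1 + w2 + b0 >= -b0 > 0, contradicting the last line,
# so no single perceptron computes XOR.
```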
