  1. Natural Language Understanding. Lecture 2: Revision of neural networks and backpropagation. Adam Lopez (alopez@inf.ed.ac.uk), School of Informatics, University of Edinburgh, 19 January 2018. Credits: Mirella Lapata and Frank Keller.

  2. Biological neural networks • A neuron receives inputs and combines them in the cell body. • If the combined input reaches a threshold, the neuron may fire (produce an output). • Some inputs are excitatory, while others are inhibitory.

  3. The relationship of artificial neural networks to the brain: "While the brain metaphor is sexy and intriguing, it is also distracting and cumbersome to manipulate mathematically." (Goldberg 2015)

  4. The perceptron: an artificial neuron. Developed by Frank Rosenblatt in 1957. [Diagram: inputs x_1 ... x_n with weights w_1 ... w_n feed a summation unit Σ followed by an activation function f, which produces the output y.]
     Input function: u(x) = Σ_{i=1}^{n} w_i x_i
     Activation function (threshold): y = f(u(x)) = 1 if u(x) > θ, 0 otherwise
     Activation state: 0 or 1 (or −1 or 1)

  5. The perceptron: an artificial neuron (continued). • Inputs are in the range [0, 1], where 0 is "off" and 1 is "on". • Weights can be any real number (positive or negative).
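
As a concrete illustration of the input and activation functions defined on slide 4, here is a minimal sketch in Python (the function and variable names are mine, not from the slides):

```python
def perceptron(x, w, theta):
    """Threshold perceptron: x is a list of inputs in [0, 1],
    w a list of real-valued weights, theta the threshold."""
    u = sum(w_i * x_i for w_i, x_i in zip(w, x))  # input function u(x)
    return 1 if u > theta else 0                  # activation function f
```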

  6. Perceptrons can represent logic functions: AND. Weights w_1 = w_2 = 0.5; activation: if Σ ≥ 1 then 1 else 0.
     Truth table (x_1, x_2, x_1 AND x_2): (0, 0) → 0; (0, 1) → 0; (1, 0) → 0; (1, 1) → 1.
     Example input (0, 1): 0 · 0.5 + 1 · 0.5 = 0.5 < 1, so the output is 0.
     Example input (1, 1): 1 · 0.5 + 1 · 0.5 = 1 ≥ 1, so the output is 1.
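
A quick check of this AND perceptron over the full truth table (a sketch; the helper name is mine):

```python
def and_perceptron(x1, x2):
    # Weights 0.5 and 0.5 with the rule "if the sum >= 1 then 1 else 0".
    return 1 if 0.5 * x1 + 0.5 * x2 >= 1 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), "->", and_perceptron(x1, x2))
# Prints 0 for (0,0), (0,1), (1,0) and 1 for (1,1), matching the truth table.
```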

  7. Perceptrons can represent logic functions: OR. Weights w_1 = w_2 = 0.5; activation: if Σ ≥ 0.5 then 1 else 0.
     Truth table (x_1, x_2, x_1 OR x_2): (0, 0) → 0; (0, 1) → 1; (1, 0) → 1; (1, 1) → 1.
     Example input (0, 1): 0 · 0.5 + 1 · 0.5 = 0.5 ≥ 0.5, so the output is 1.
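
The same check works for OR; only the threshold changes (again a sketch with my own helper name):

```python
def or_perceptron(x1, x2):
    # Same weights as AND, but the rule is "if the sum >= 0.5 then 1 else 0".
    return 1 if 0.5 * x1 + 0.5 * x2 >= 0.5 else 0

assert [or_perceptron(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 1]
```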

  8. How would you represent NOT(OR)? Perceptron for NOT(OR): weights ???; activation: if Σ ≥ ??? then 1 else 0.
     Truth table (x_1, x_2, NOT(x_1 OR x_2)): (0, 0) → 1; (0, 1) → 0; (1, 0) → 0; (1, 1) → 0.
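
The slide leaves this as an exercise; one possible answer (my choice, not taken from the slides) uses negative weights, so that any active input pushes the sum below the threshold:

```python
def nor_perceptron(x1, x2):
    # Weights -0.5 and -0.5 with threshold 0: the unit fires
    # only when both inputs are 0 (sum stays at 0).
    return 1 if -0.5 * x1 + -0.5 * x2 >= 0 else 0

assert [nor_perceptron(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [1, 0, 0, 0]
```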

  9. Perceptrons are linear classifiers. [Diagram: a perceptron with bias input x_0 = −1 and weight w_0, inputs x_1 ... x_n with weights w_1 ... w_n, whose output y is computed from Σ_{i=0}^{n} w_i x_i.]

  10. Perceptrons are linear classifiers, i.e., they can only separate points with a hyperplane (in two dimensions, a straight line). [Figure: two classes of points in the plane separated by a straight line.]

  11. Perceptron can learn logic functions from examples. Give some examples to the perceptron:
     N | input x    | target t | output o
     1 | (0,1,0,0)  | 1        | 0
     2 | (1,0,0,0)  | 0        | 0
     3 | (0,1,1,1)  | 0        | 1
     4 | (1,0,1,0)  | 0        | 1
     5 | (1,1,1,1)  | 1        | 0
     6 | (0,1,0,0)  | 1        | 1
     ...
     • Input: a vector of 1's and 0's, i.e., a feature vector. • Target: the desired output, a 1 or 0. • How do we efficiently find the weights and threshold?

  12. Learning. Q_1: Choosing weights and threshold θ for the perceptron is not easy! What's an effective way to learn the weights and threshold from examples? A_1: We use a learning algorithm that adjusts the weights and threshold based on examples. http://www.youtube.com/watch?v=vGwemZhPlsA&feature=youtu.be

  13. Simplify by converting θ into a weight:
     Σ_{i=1}^{n} w_i x_i > θ
     Σ_{i=1}^{n} w_i x_i − θ > 0
     w_1 x_1 + w_2 x_2 + ... + w_n x_n − θ > 0
     w_1 x_1 + w_2 x_2 + ... + w_n x_n + θ · (−1) > 0
     [Diagram: the perceptron with an extra input x_0 = −1 whose weight is w_0 = θ.]

  14. Simplify by converting θ into a weight (continued). Let x_0 = −1 be a new input whose weight is w_0 = θ. Now our activation function is: y = f(u(x)) = 1 if u(x) > 0, 0 otherwise, where u(x) = Σ_{i=0}^{n} w_i x_i.
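
In code, the conversion amounts to prepending a constant −1 input and treating θ as just another weight (a sketch; the names are mine):

```python
def perceptron_with_bias(x, w):
    # w[0] plays the role of the threshold theta; x_0 = -1 is the bias input.
    xb = [-1.0] + list(x)                      # prepend x_0 = -1
    u = sum(wi * xi for wi, xi in zip(w, xb))  # u(x) = sum_{i=0}^{n} w_i x_i
    return 1 if u > 0 else 0                   # threshold is now fixed at 0

# For example, AND with w_0 = 0.75 (slightly below 1 because the test is
# now strict; this particular value is my choice, not from the slides):
assert perceptron_with_bias([1, 1], [0.75, 0.5, 0.5]) == 1
assert perceptron_with_bias([0, 1], [0.75, 0.5, 0.5]) == 0
```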

  15. Learn by adjusting weights whenever output ≠ target. Intuition: classification depends on the sign (+ or −) of u(x). If the output disagrees with the target, adjust the weights to move u(x) toward the correct side of 0.
     • o = 0 and t = 0: don't adjust weights.
     • o = 0 and t = 1: u(x) was too low. Make it bigger!
     • o = 1 and t = 0: u(x) was too high. Make it smaller!
     • o = 1 and t = 1: don't adjust weights.
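
This slide gives the intuition but not the update formula; a minimal sketch of the standard perceptron learning rule (w_i ← w_i + η (t − o) x_i), which implements exactly the four cases above. The function and parameter names, learning rate, and toy dataset are my own:

```python
def train_perceptron(examples, n_inputs, eta=0.1, epochs=20):
    # Weights include w[0], the threshold weight for the bias input x_0 = -1.
    w = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        for x, t in examples:
            xb = [-1.0] + list(x)
            u = sum(wi * xi for wi, xi in zip(w, xb))
            o = 1 if u > 0 else 0
            # (t - o) is +1 when u(x) was too low, -1 when too high,
            # and 0 when output == target (no adjustment).
            for i in range(len(w)):
                w[i] += eta * (t - o) * xb[i]
    return w

# Learn AND from its truth table:
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(data, n_inputs=2)
```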
