
Lecture 4: Backpropagation and Neural Networks part 1



  1. Lecture 4: Backpropagation and Neural Networks part 1. Fei-Fei Li & Andrej Karpathy & Justin Johnson. 13 Jan 2016

  2. Administrative: A1 is due Jan 20 (Wednesday); ~150 hours left. Warning: Jan 18 (Monday) is a holiday (no class/office hours). Also note: lectures are non-exhaustive; read the course notes for completeness. I’ll hold make-up office hours on Wed Jan 20, 5pm @ Gates 259.

  3. Where we are... scores function; SVM loss; data loss + regularization; want: the gradient of the loss with respect to the weights W

  4. Optimization (image credits to Alec Radford)

  5. Gradient Descent. Numerical gradient: slow :(, approximate :(, easy to write :). Analytic gradient: fast :), exact :), error-prone :(. In practice: derive the analytic gradient, then check your implementation with the numerical gradient.
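The "derive analytically, check numerically" advice on slide 5 can be sketched as below. This is a minimal illustration, not the course's own checker: the loss `f`, the centered-difference formula, the step size `h`, and the relative-error measure are all assumptions chosen for the example.

```python
def f(w):
    """Hypothetical loss used only for this sketch: sum of squares plus a cubic term."""
    return sum(wi ** 2 for wi in w) + w[0] ** 3


def analytic_grad(w):
    """Hand-derived gradient of f: fast and exact, but easy to get wrong."""
    g = [2.0 * wi for wi in w]
    g[0] += 3.0 * w[0] ** 2
    return g


def numerical_grad(func, w, h=1e-5):
    """Centered finite differences: slow and approximate, but easy to write."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += h
        w_minus[i] -= h
        grad.append((func(w_plus) - func(w_minus)) / (2.0 * h))
    return grad


def max_relative_error(a, b):
    """Largest per-coordinate relative difference between two gradients."""
    return max(abs(p - q) / max(1e-12, abs(p) + abs(q)) for p, q in zip(a, b))


w = [0.5, -1.5, 2.0]
err = max_relative_error(analytic_grad(w), numerical_grad(f, w))
print("max relative error:", err)
```

A small relative error suggests the analytic gradient is implemented correctly; a large one usually points to a bug in the hand-derived code rather than in the numerical estimate.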

  6. Computational Graph: x and W feed a multiply node (*) that produces the scores s; s feeds the hinge loss; the hinge loss and the regularization term R (computed from W) are added (+) to give the total loss L.
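As a rough sketch, the forward pass through the graph on slide 6 can be written with one line per node. The multiclass hinge loss with margin 1, the L2 form of R, the weighting `lam`, and the sample numbers are assumptions made for illustration; the slide only names the nodes.

```python
def forward(x, W, y, lam):
    """Forward pass: (x, W) -> * -> s -> hinge loss -> + R -> L."""
    # Multiply node: one score per class, s = W x.
    s = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]
    # Hinge node: multiclass SVM loss with margin 1 (assumed form).
    data_loss = sum(max(0.0, s_j - s[y] + 1.0) for j, s_j in enumerate(s) if j != y)
    # R node: L2 regularization computed from W alone (assumed form).
    R = sum(w_ij ** 2 for row in W for w_ij in row)
    # Add node: total loss.
    L = data_loss + lam * R
    return s, data_loss, R, L


# Hypothetical 2-feature input, 3 classes, correct class index 0.
x = [1.0, 2.0]
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
s, data_loss, R, L = forward(x, W, y=0, lam=0.1)
print(s, data_loss, R, L)
```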

  7. Convolutional Network (AlexNet): input image, weights, loss

  8. Neural Turing Machine: input tape, loss

  9. Neural Turing Machine

  10-21. e.g. x = -2, y = 5, z = -4. Want: the gradient of the output with respect to each input x, y, and z, computed backward through the graph with the chain rule.
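The function itself did not survive in the slide text. The sketch below assumes it is f(x, y, z) = (x + y) * z, the small example used in the accompanying course notes, and walks the chain rule backward from the output using the slide's values. The intermediate name `q` is introduced here for illustration.

```python
# Values from the slides.
x, y, z = -2.0, 5.0, -4.0

# Forward pass (assumed function: f = (x + y) * z).
q = x + y          # add node: 3
f = q * z          # multiply node: -12

# Backward pass, starting from df/df = 1 and applying the chain rule.
df_df = 1.0
df_dq = z * df_df      # multiply node: local gradient w.r.t. q is z
df_dz = q * df_df      # multiply node: local gradient w.r.t. z is q
df_dx = 1.0 * df_dq    # add node: local gradient is 1
df_dy = 1.0 * df_dq    # add node: local gradient is 1

print(f, df_dx, df_dy, df_dz)
```

Under that assumption the gradients are -4 for x, -4 for y, and 3 for z.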

  22-27. A single gate f: activations flow forward through f; on the backward pass, the gate multiplies its “local gradient” by the gradients arriving from its output and passes the result back to its inputs.
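One way to picture the gate on these slides is as an object with a forward and a backward method. The class below is a hypothetical sketch for a multiply gate; the class and method names are not from the lecture.

```python
class MultiplyGate:
    """A gate z = x * y that caches its inputs for the backward pass."""

    def forward(self, x, y):
        # Store the activations: the local gradients depend on them.
        self.x, self.y = x, y
        return x * y

    def backward(self, dz):
        # Chain rule: [local gradient] * [gradient arriving from the output].
        dx = self.y * dz   # dz/dx = y
        dy = self.x * dz   # dz/dy = x
        return dx, dy


gate = MultiplyGate()
z = gate.forward(3.0, -4.0)
dx, dy = gate.backward(2.0)   # 2.0 stands in for the gradient from above
print(z, dx, dy)
```

The gate never needs to know what the rest of the graph computes; it only needs its own inputs and the gradient handed to it from its output.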

  28-36. Another example:

  37. Another example: (-1) * (-0.20) = 0.20

  38. Another example:

  39. Another example: [local gradient] x [its gradient]: [1] x [0.2] = 0.2 and [1] x [0.2] = 0.2 (both inputs!)

  40. Another example:

  41. Another example: [local gradient] x [its gradient]: x0: [2] x [0.2] = 0.4; w0: [-1] x [0.2] = -0.2
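The expression for this example is missing from the slide text. The sketch below assumes it is the two-input sigmoid neuron f = 1 / (1 + exp(-(w0*x0 + w1*x1 + w2))). The values w0 = 2 and x0 = -1 are read off slide 41; w1 = -3, x1 = -2, and w2 = -3 are assumptions taken from the commonly cited version of this example. Each line of the backward pass is one gate applying [local gradient] x [its gradient].

```python
import math

# w0 and x0 follow slide 41; w1, x1, w2 are assumed values.
w0, x0 = 2.0, -1.0
w1, x1 = -3.0, -2.0
w2 = -3.0

# Forward pass, one gate per line.
a = w0 * x0          # -2
b = w1 * x1          # 6
c = a + b            # 4
d = c + w2           # 1
e = -1.0 * d         # -1
g = math.exp(e)      # about 0.37
h = g + 1.0          # about 1.37
f = 1.0 / h          # about 0.73

# Backward pass, in reverse order.
df = 1.0
dh = (-1.0 / h ** 2) * df   # 1/x gate: about -0.53
dg = 1.0 * dh               # +1 gate: gradient passes through
de = math.exp(e) * dg       # exp gate: about -0.20
dd = -1.0 * de              # *-1 gate: (-1) * (-0.20) = 0.20
dc = 1.0 * dd               # add gate: 0.2 to both inputs
dw2 = 1.0 * dd
da = 1.0 * dc               # add gate again: 0.2 to both inputs
db = 1.0 * dc
dw0 = x0 * da               # multiply gate: [-1] x [0.2] = -0.2
dx0 = w0 * da               # multiply gate: [2] x [0.2] = 0.4
dw1 = x1 * db
dx1 = w1 * db

print(round(f, 2), round(dw0, 2), round(dx0, 2), round(dw1, 2), round(dx1, 2), round(dw2, 2))
```

The slides round intermediate values to two decimals, so the unrounded results differ slightly (for instance, dx0 is about 0.39 rather than exactly 0.4).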

  42. Sigmoid function and sigmoid gate

  43. Sigmoid function and sigmoid gate: (0.73) * (1 - 0.73) = 0.2
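The shortcut on slide 43 follows from the derivative of the sigmoid, which can be written in terms of its own output. The derivation below is the standard one, reconstructed here because the slide's formula did not survive extraction; the last line plugs in the slide's rounded output of 0.73.

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}

\frac{d\sigma(x)}{dx}
  = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}}
  = \left(\frac{1 + e^{-x} - 1}{1 + e^{-x}}\right)\left(\frac{1}{1 + e^{-x}}\right)
  = \bigl(1 - \sigma(x)\bigr)\,\sigma(x)

\sigma(x) \approx 0.73 \;\Rightarrow\; \frac{d\sigma}{dx} \approx (0.73)(1 - 0.73) \approx 0.2
```

This lets the whole chain of elementary gates that make up the sigmoid be treated as one gate whose local gradient needs only the forward output.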
