Common recognition tasks

  1. Common recognition tasks. Adapted from slides by L. Lazebnik and Fei-Fei Li.

  2. Image classification and tagging • outdoor • mountains • city • Asia • Lhasa • … Adapted from slides by L. Lazebnik and Fei-Fei Li.

  3. Object detection • find pedestrians. Adapted from slides by L. Lazebnik and Fei-Fei Li.

  4. Activity recognition • walking • shopping • rolling a cart • sitting • talking • … Adapted from slides by L. Lazebnik and Fei-Fei Li.

  5. Semantic segmentation. Adapted from slides by L. Lazebnik and Fei-Fei Li.

  6. Semantic segmentation: every pixel is labeled with its class (sky, mountain, building, tree, lamp, umbrella, person, market stall, ground). Adapted from slides by L. Lazebnik and Fei-Fei Li.

  7. Image description. This is a busy street in an Asian city. Mountains and a large palace or fortress loom in the background. In the foreground, we see colorful souvenir stalls and people walking around and shopping. One person in the lower left is pushing an empty cart, and a couple of people in the middle are sitting, possibly posing for a photograph. Adapted from slides by L. Lazebnik and Fei-Fei Li.

  8. The statistical learning framework • Apply a prediction function to a feature representation of the image to get the desired output: f(image) = “apple”, f(image) = “tomato”, f(image) = “cow”. Slide from L. Lazebnik.

  9. The statistical learning framework: y = f(x), where x is the feature representation, f the prediction function, and y the output. • Training: given a training set of labeled examples {(x_1, y_1), …, (x_N, y_N)}, estimate the prediction function f by minimizing the prediction error on the training set. • Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x). Slide from L. Lazebnik.
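
To make slide 9's y = f(x) concrete, here is a minimal Python sketch in which f is a 1-nearest-neighbor predictor over toy feature vectors; the choice of predictor and the data are illustrative assumptions, not something the slides prescribe.

    import numpy as np

    # "Training" for 1-nearest-neighbor: just memorize the labeled examples {(x_i, y_i)}.
    def train(X_train, y_train):
        return X_train, y_train

    # Prediction function f: return the label of the closest training example.
    def f(x, model):
        X_train, y_train = model
        distances = np.linalg.norm(X_train - x, axis=1)
        return y_train[np.argmin(distances)]

    # Toy feature vectors and labels (hypothetical data).
    X_train = np.array([[1.0, 0.2], [0.9, 0.1], [0.1, 0.8]])
    y_train = np.array(["apple", "apple", "cow"])

    model = train(X_train, y_train)
    print(f(np.array([0.95, 0.15]), model))   # -> "apple"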

  10. Steps • Training: training images → image features → training (using the training labels) → learned model. • Testing: test image → image features → learned model → prediction. Slide credit: D. Hoiem.

  11. “Classic” recognition pipeline: image pixels → feature representation → trainable classifier → class label. • Hand-crafted feature representation • Off-the-shelf trainable classifier. Slide from L. Lazebnik.
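
One possible instantiation of this pipeline, under the assumption that the hand-crafted features are HOG descriptors and the off-the-shelf classifier is a linear SVM (neither is named on the slide), might look like this:

    import numpy as np
    from skimage.feature import hog
    from sklearn.svm import LinearSVC

    # Hand-crafted feature representation: one HOG vector per (same-sized) grayscale image.
    def extract_features(images):
        return np.array([hog(im, pixels_per_cell=(8, 8)) for im in images])

    # Toy stand-in data; real images and labels would go here.
    train_images = [np.random.rand(64, 64) for _ in range(10)]
    train_labels = np.array([0, 1] * 5)
    test_images = [np.random.rand(64, 64) for _ in range(2)]

    # Off-the-shelf trainable classifier.
    clf = LinearSVC().fit(extract_features(train_images), train_labels)
    predictions = clf.predict(extract_features(test_images))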

  12. Neural networks for images: the image feeds into a fully connected layer. Slide from L. Lazebnik.

  13. Neural networks for images. Slide from L. Lazebnik.

  14. Neural networks for images: a feature map computed from the image with learned weights. Slide from L. Lazebnik.

  15. Neural networks for images: another feature map from another set of learned weights. Slide from L. Lazebnik.

  16. Convolution as feature extraction: convolving the image with a bank of K filters produces K feature maps. Slide from L. Lazebnik.
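
A minimal numpy sketch of this idea (stride 1, no padding; like most deep-learning "convolutions", it is actually cross-correlation):

    import numpy as np

    # Correlate one grayscale image with a bank of K filters -> K feature maps.
    def conv2d(image, filters):
        K, F, _ = filters.shape
        H, W = image.shape
        out = np.zeros((K, H - F + 1, W - F + 1))
        for k in range(K):
            for i in range(H - F + 1):
                for j in range(W - F + 1):
                    out[k, i, j] = np.sum(image[i:i + F, j:j + F] * filters[k])
        return out

    image = np.random.rand(32, 32)
    filters = np.random.randn(4, 3, 3)   # a bank of K = 4 filters, each 3 x 3
    feature_maps = conv2d(image, filters)
    print(feature_maps.shape)            # (4, 30, 30)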

  17. Convolutional layer: K filters applied to the image yield K feature maps. Spatial resolution: (roughly) the same if a stride of 1 is used, reduced to 1/S if a stride of S is used. Slide from L. Lazebnik.

  18. Convolutional layer: on K input feature maps, each filter has size F x F x K; a bank of L such filters (convolutional layer + ReLU) produces L feature maps in the next layer. Slide from L. Lazebnik.
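
The same layer expressed in PyTorch (an illustrative framework choice; the sizes K = 16, L = 32, F = 3 below are made up):

    import torch
    import torch.nn as nn

    K, L, F = 16, 32, 3
    layer = nn.Sequential(
        # L filters of size F x F x K map K input feature maps to L output maps;
        # padding=1 keeps the spatial resolution for F = 3, stride 1.
        nn.Conv2d(in_channels=K, out_channels=L, kernel_size=F, padding=1),
        nn.ReLU(),
    )

    x = torch.randn(1, K, 28, 28)   # a batch of one: K feature maps of size 28 x 28
    y = layer(x)
    print(y.shape)                  # torch.Size([1, 32, 28, 28])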

  19. CNN pipeline: input image → convolution (learned) → non-linearity → spatial pooling → feature maps; the convolution stage is illustrated on an input feature map. Source: R. Fergus, Y. LeCun.

  20. CNN pipeline (continued). Source: R. Fergus, Y. LeCun; Stanford CS231n.

  21. CNN pipeline: the spatial pooling stage takes the max (or average) value. Source: R. Fergus, Y. LeCun.

  22. Max pooling layer: an F x F pooling filter with stride S keeps the max value in each window, turning K feature maps into K feature maps at 1/S of the resolution. Usually: F = 2 or 3, S = 2. Slide from L. Lazebnik.
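
A small numpy sketch of that pooling operation, assuming F = S = 2 and a single feature map:

    import numpy as np

    # F x F max pooling with stride S: keep the max value in each window.
    def max_pool(feature_map, F=2, S=2):
        H, W = feature_map.shape
        out = np.zeros((H // S, W // S))
        for i in range(H // S):
            for j in range(W // S):
                out[i, j] = feature_map[i * S:i * S + F, j * S:j * S + F].max()
        return out

    fm = np.arange(16, dtype=float).reshape(4, 4)
    print(max_pool(fm))   # [[ 5.  7.] [13. 15.]] -- half the resolution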

  23. Summary: CNN pipeline. Softmax layer: P(c | x) = exp(w_c · x) / Σ_{k=1}^{C} exp(w_k · x). Slide from L. Lazebnik.
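
The softmax layer written out in numpy (the max-subtraction is a standard numerical-stability trick, not part of the formula):

    import numpy as np

    # P(c | x) = exp(w_c . x) / sum_k exp(w_k . x), computed for all classes at once.
    def softmax_probs(W, x):
        scores = W @ x                   # one score w_c . x per class
        scores = scores - scores.max()
        e = np.exp(scores)
        return e / e.sum()

    W = np.random.randn(5, 10)           # toy sizes: C = 5 classes, 10-dim features x
    x = np.random.randn(10)
    p = softmax_probs(W, x)
    print(p, p.sum())                     # probabilities that sum to 1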

  24. Training of multi-layer networks • Find the network weights x that minimize the prediction loss between the true and estimated labels of the training examples (here x denotes the weight vector, y_i a training example, and z_i its label): F(x) = Σ_i m(y_i, z_i; x) • Possible losses (for binary problems): • Quadratic loss: m(y_i, z_i; x) = (g_x(y_i) − z_i)^2 • Log-likelihood loss: m(y_i, z_i; x) = −log Q_x(z_i | y_i) • Hinge loss: m(y_i, z_i; x) = max(0, 1 − z_i g_x(y_i)). Slide from L. Lazebnik.
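
For a single training pair with label z_i in {-1, +1}, the three losses can be written out as below; the linear score g_x(y_i) = x · y_i and the logistic model for Q_x are illustrative assumptions made only to render the formulas runnable.

    import numpy as np

    def g(x, y_i):                        # predictor g_x(y_i): a linear score (assumed)
        return x @ y_i

    def quadratic_loss(x, y_i, z_i):      # (g_x(y_i) - z_i)^2
        return (g(x, y_i) - z_i) ** 2

    def log_likelihood_loss(x, y_i, z_i):
        # -log Q_x(z_i | y_i), with Q_x modeled as a logistic function of the score
        return np.log1p(np.exp(-z_i * g(x, y_i)))

    def hinge_loss(x, y_i, z_i):          # max(0, 1 - z_i * g_x(y_i))
        return max(0.0, 1.0 - z_i * g(x, y_i))

    x = np.array([0.5, -0.2]); y_i = np.array([1.0, 2.0]); z_i = 1
    print(quadratic_loss(x, y_i, z_i), log_likelihood_loss(x, y_i, z_i), hinge_loss(x, y_i, z_i))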

  25. Convolutional networks: conv (filters) → subsample → conv (filters) → subsample → linear (weights). Slide from B. Hariharan.

  26. Empirical Risk Minimization. Convolutional network objective: min_θ (1/N) Σ_{i=1}^{N} L(h(x_i; θ), y_i). Gradient descent update: θ^(t+1) = θ^(t) − λ (1/N) Σ_{i=1}^{N} ∇ L(h(x_i; θ), y_i). Slide from B. Hariharan.
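
A toy run of that update rule, assuming h(x; θ) = θ · x and a quadratic loss (both assumptions; the slide does not fix them):

    import numpy as np

    # Gradient of L = (theta . x - y)^2 with respect to theta.
    def grad_L(theta, x, y):
        return 2.0 * (theta @ x - y) * x

    X = np.random.randn(100, 3)
    true_theta = np.array([1.0, -2.0, 0.5])
    Y = X @ true_theta

    theta, lam = np.zeros(3), 0.1
    for t in range(200):
        avg_grad = np.mean([grad_L(theta, x, y) for x, y in zip(X, Y)], axis=0)
        theta = theta - lam * avg_grad    # theta^(t+1) = theta^(t) - lambda * average gradient
    print(theta)                          # approaches [1.0, -2.0, 0.5]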

  27. Computing the gradient of the loss ∇_θ L(h(x; θ), y): writing z = h(x; θ), ∇_θ L(z, y) = (∂L(z, y)/∂z)(∂z/∂θ). Slide from B. Hariharan.

  28. Convolutional networks (recap): conv (filters) → subsample → conv (filters) → subsample → linear (weights). Slide from B. Hariharan.

  29. The gradient of convnets: a chain x → f_1 → z_1 → f_2 → z_2 → f_3 → z_3 → f_4 → z_4 → f_5 → z_5 = z, with weights w_1, …, w_5 feeding f_1, …, f_5. Slide from B. Hariharan.

  30. The gradient of convnets: ∂z/∂w_5 = ∂f_5(z_4, w_5)/∂w_5. Slide from B. Hariharan.

  31-36. The gradient of convnets: ∂z/∂w_4 = (∂z/∂z_4)(∂z_4/∂w_4) = (∂z/∂z_4) ∂f_4(z_3, w_4)/∂w_4. Slide from B. Hariharan.

  37-39. The gradient of convnets: ∂z/∂w_3 = (∂z/∂z_3)(∂z_3/∂w_3). Slide from B. Hariharan.

  40-41. The gradient of convnets: ∂z/∂z_3 = (∂z/∂z_4)(∂z_4/∂z_3), and again ∂z/∂w_3 = (∂z/∂z_3)(∂z_3/∂w_3). Slide from B. Hariharan.

  42. The gradient of convnets: a recurrence going backward! ∂z/∂z_2 = (∂z/∂z_3)(∂z_3/∂z_2), ∂z/∂w_2 = (∂z/∂z_2)(∂z_2/∂w_2). Slide from B. Hariharan.

  43. The gradient of convnets: Back-Propagation. Slide from B. Hariharan.

  44. Backpropagation for a sequence of functions • Each “function” has a “forward” and a “backward” module • Forward module for f_i: takes z_{i-1} and the weight w_i as input, produces z_i as output • Backward module for f_i: takes g(z_i) as input, produces g(z_{i-1}) and g(w_i) as output, with g(w_i) = g(z_i) ∂z_i/∂w_i and g(z_{i-1}) = g(z_i) ∂z_i/∂z_{i-1}. Slide from B. Hariharan.
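
A scalar numpy sketch of slides 29-44, assuming each stage is f_i(z, w) = tanh(w * z) (an arbitrary choice, just to have concrete derivatives): the backward pass carries g(z_i) = ∂z/∂z_i down the chain and emits g(w_i) = ∂z/∂w_i at every stage.

    import numpy as np

    def forward(x, weights):
        zs = [x]                                 # zs[i] holds z_i, with z_0 = x
        for w in weights:
            zs.append(np.tanh(w * zs[-1]))       # z_i = f_i(z_{i-1}, w_i) = tanh(w_i * z_{i-1})
        return zs

    def backward(zs, weights):
        g_z = 1.0                                # g(z_n) = dz/dz_n = 1 for the final output
        g_w = [0.0] * len(weights)
        for i in reversed(range(len(weights))):
            local = 1.0 - zs[i + 1] ** 2         # d tanh(a)/da evaluated at a = w_i * z_{i-1}
            g_w[i] = g_z * local * zs[i]         # g(w_i) = g(z_i) * dz_i/dw_i
            g_z = g_z * local * weights[i]       # g(z_{i-1}) = g(z_i) * dz_i/dz_{i-1}
        return g_w

    x, ws = 0.7, [1.5, -0.8, 2.0, 0.3, 1.1]      # input and weights w_1 ... w_5
    print(backward(forward(x, ws), ws))          # dz/dw_1 ... dz/dw_5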

  45. Backpropagation for a sequence of functions: forward pass, z_{i-1} and w_i enter f_i, which outputs z_i. Slide from B. Hariharan.

  46. Backpropagation for a sequence of functions: backward pass, g(z_i) enters f_i, which outputs g(z_{i-1}) and g(w_i). Slide from B. Hariharan.

  47. Computation graph - functions • Each node implements two functions: • a “forward”, which computes the output given the input • a “backward”, which computes the derivative of z w.r.t. the input, given the derivative of z w.r.t. the output. Slide from B. Hariharan.
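
A toy node of that kind, using multiplication as the (illustrative) operation:

    # Forward: output from inputs. Backward: input gradients from the output gradient.
    class MultiplyNode:
        def forward(self, a, b):
            self.a, self.b = a, b           # cache the inputs for the backward pass
            return a * b

        def backward(self, dz_dout):
            # given dz/d(output), return dz/da and dz/db
            return dz_dout * self.b, dz_dout * self.a

    node = MultiplyNode()
    d = node.forward(3.0, 4.0)              # d = 12.0
    print(node.backward(1.0))               # (4.0, 3.0) = (dz/da, dz/db) when z = d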

  48. Computation graphs: a node f_i takes inputs a, b, c and produces output d. Slide from B. Hariharan.

  49. Computation graphs: in the backward pass, f_i receives ∂z/∂d and produces ∂z/∂a, ∂z/∂b, and ∂z/∂c. Slide from B. Hariharan.

  52. Neural network frameworks Slide from B. Hariharan
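
As one concrete example of what such a framework provides, here is a minimal PyTorch model wiring together the pieces from this deck (convolution, ReLU, max pooling, a linear layer, softmax); all sizes are made-up assumptions:

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, 10),        # assumes 32 x 32 RGB inputs and 10 classes
    )

    x = torch.randn(4, 3, 32, 32)         # a batch of 4 RGB images
    probs = torch.softmax(model(x), dim=1)
    print(probs.shape)                    # torch.Size([4, 10])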

  53. Image classification and tagging • outdoor • mountains • city • Asia • Lhasa • … Adapted from Fei-Fei Li
