
CSE 152: Computer Vision, Hao Su. Lecture 7: Neural Networks



  1. CSE 152: Computer Vision Hao Su Lecture 7: Neural Networks

  2. Review of Filters: From Linear to Non-linear

  3. Image filtering (Linear case). A 3x3 box filter (all ones, normalized by 1/9) is slid over an image that is zero everywhere except a bright square of 90-valued pixels (with a small hole) and one stray 90 pixel; each output pixel is the average of the 3x3 input neighborhood under the kernel. [Number grids from the slide animation omitted.] Credit: S. Seitz

  4. Image filtering (Linear case). The animation begins: at the first window position overlapping the bright square, one 90 falls in the 3x3 window, so the output is 90/9 = 10. Credit: S. Seitz

  5. Image filtering (Linear case). Next window position: two 90s in the window, output 20. Credit: S. Seitz

  6. Image filtering (Linear case). Next window position: three 90s in the window, output 30. Credit: S. Seitz

  7. Image filtering (Linear case). The animation continues: the output row so far reads 10, 20, 30, 30. Credit: S. Seitz

  8. Image filtering (Linear case). The completed output: a smoothed version of the input, with values ramping 10, 20, 30, ..., 90 across the square's edges and the stray pixel spread into small 10s. Credit: S. Seitz
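The box filtering stepped through above can be sketched in a few lines of NumPy. One assumption from reading the slides: the all-ones kernel is normalized by 1/9, since a window containing a single 90 produces 10.

```python
import numpy as np

def box_filter(image, k=3):
    """Apply a k x k box (mean) filter with zero padding, as on the slides."""
    pad = k // 2
    padded = np.pad(image, pad, mode="constant")
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

# A small version of the slide's example: a bright 90-valued square on black.
img = np.zeros((6, 6))
img[1:5, 1:5] = 90
smoothed = box_filter(img)
```

Interior pixels of the square stay at 90, while edge and corner outputs fall off to values like 10, 20, 30, exactly as in the animation; passing `k=5` or `k=7` reproduces the heavier smoothing on the later slides.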

  9. Reducing salt-and-pepper noise with box filters of size 3x3, 5x5, and 7x7. What's wrong with the results?

  10. Median filter (Non-linear). What advantage does median filtering have over box filtering? Robustness to outliers. Source: K. Grauman

  11. Median filter (Non-linear): results at 3x3, 5x5, and 7x7, Gaussian and median shown side by side. Source: M. Hebert

  12. Gaussian vs. median filtering at 3x3, 5x5, and 7x7.
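A minimal sketch of the median filter, assuming zero padding at the borders. It makes the robustness point concrete: an isolated salt (255) or pepper (0) pixel is replaced outright by the local median, whereas a box filter would only smear it.

```python
import numpy as np

def median_filter(image, k=3):
    """k x k median filter (non-linear); borders handled by zero padding."""
    pad = k // 2
    padded = np.pad(image, pad, mode="constant")
    out = np.zeros_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

# A flat 50-valued patch with one salt and one pepper pixel.
img = np.full((7, 7), 50.0)
img[3, 3] = 255.0   # salt
img[2, 4] = 0.0     # pepper
clean = median_filter(img)
```

At the salt pixel the median filter restores exactly 50, while a 3x3 box filter there would give (7*50 + 255 + 0)/9, about 67.2: the outlier leaks into the average but not into the median.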

  13. Neural Networks A General Framework from Linear to Non-linear

  14. Image Classification: a core task in computer vision. Given a fixed set of discrete labels {dog, cat, truck, plane, ...}, assign one to the image (here: cat). Image by Nikita, licensed under CC-BY 2.0.

  15. The Problem: Semantic Gap. What the computer sees: an image is just a big grid of numbers in [0, 255], e.g. 800 x 600 x 3 (3 RGB channels). Image by Nikita, licensed under CC-BY 2.0.

  16. Challenges: Viewpoint variation. All pixels change when the camera moves! Image by Nikita, licensed under CC-BY 2.0.

  17. Challenges: Illumination. (Images CC0 1.0 public domain.)

  18. Challenges: Deformation. Images by Tom Thai, sare bear, and Umberto Salvagnin, licensed under CC-BY 2.0.

  19. Challenges: Occlusion. Image by jonsson licensed under CC-BY 2.0; other images CC0 1.0 public domain.

  20. Challenges: Background clutter. (Images CC0 1.0 public domain.)

  21. Challenges: Intraclass variation. (Image CC0 1.0 public domain.)

  22. Linear Classification

  23. Recall CIFAR-10: 50,000 training images and 10,000 test images; each image is 32x32x3.

  24. Parametric Approach. An image (an array of 32x32x3 = 3072 numbers) is mapped by f(x, W) to 10 numbers giving the class scores; W holds the parameters (weights).

  25. Parametric Approach: Linear Classifier. f(x, W) = Wx.

  26. Parametric Approach: Linear Classifier, with shapes: x is 3072x1 and W is 10x3072, so f(x, W) = Wx is 10x1 (one score per class).

  27. Parametric Approach: Linear Classifier, adding a bias: f(x, W) = Wx + b, where b is 10x1.

  28. Example with an image of 4 pixels and 3 classes (cat/dog/ship). Stretch the pixels into a column x = (56, 231, 24, 2). With W = [[0.2, -0.5, 0.1, 2.0], [1.5, 1.3, 2.1, 0.0], [0.0, 0.25, 0.2, -0.3]] and b = (1.1, 3.2, -1.2), the slide reports scores -96.8 (cat), 437.9 (dog), and 61.95 (ship).
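The slide's arithmetic can be checked directly. One caveat: carrying the ship row through Wx + b gives 57.75 + 4.8 - 0.6 - 1.2 = 60.75; the slide's 61.95 appears to be the ship row's Wx before the bias is added.

```python
import numpy as np

# The slide's toy example: 4-pixel image, 3 classes (cat / dog / ship).
W = np.array([[0.2, -0.5, 0.1, 2.0],
              [1.5,  1.3, 2.1, 0.0],
              [0.0, 0.25, 0.2, -0.3]])
b = np.array([1.1, 3.2, -1.2])
x = np.array([56, 231, 24, 2])   # image stretched into a column

scores = W @ x + b               # f(x, W) = Wx + b
# cat: -96.8, dog: 437.9, ship: 60.75
```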

  29. Example with an image of 4 pixels and 3 classes (cat/dog/ship): Algebraic Viewpoint, f(x, W) = Wx.

  30. Example with an image of 4 pixels and 3 classes (cat/dog/ship), Algebraic Viewpoint: f(x, W) = Wx, with the same W, b, and input image as above, yielding the scores -96.8, 437.9, 61.95.

  31. Interpreting a Linear Classifier

  32. Interpreting a Linear Classifier: Geometric Viewpoint. f(x, W) = Wx + b treats the image (an array of 32x32x3 = 3072 numbers) as a point in 3072-dimensional space; each class score is a linear function over that space. Plot created using Wolfram Cloud. Cat image by Nikita, licensed under CC-BY 2.0.

  33. Hard cases for a linear classifier. Three examples: (1) Class 1 is the first and third quadrants, Class 2 the second and fourth quadrants; (2) Class 1 is 1 <= L2 norm <= 2, Class 2 everything else; (3) Class 1 has three modes, Class 2 is everything else.

  34. Linear Classifier: Three Viewpoints. Algebraic: f(x, W) = Wx. Visual: one template per class. Geometric: hyperplanes cutting up space.

  35. How the Human Brain Learns • In the human brain, a typical neuron collects signals from others through a host of fine structures called dendrites. • The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches. • At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons.

  36. A Simple Neuron • An artificial neuron is a device with many inputs and one output.

  37. Element of Neural Network. A neuron with inputs a_1, ..., a_K computes the weighted sum z = a_1 w_1 + a_2 w_2 + ... + a_K w_K + b, then applies an activation function σ to give the output a = σ(z). The w_i are the weights and b is the bias.

  38. Neural Network. Neurons are organized into layers: an input layer (x_1, ..., x_N), hidden layers 1 through L, and an output layer (y_1, ..., y_M), with each neuron in one layer connected to every neuron in the next. "Deep" means many hidden layers.
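A forward pass through such a network is just a loop of matrix multiplies and activations. A minimal sketch with sigmoid activations throughout; the 2-3-2 shape and random weights are illustrative only, not from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Forward pass through a fully connected network:
    each layer computes a = sigma(W @ a_prev + b)."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# Hypothetical 2-3-2 network with arbitrary weights, for illustration.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((3, 2)), rng.standard_normal((2, 3))]
bs = [np.zeros(3), np.zeros(2)]
y = forward(np.array([1.0, -1.0]), Ws, bs)
```

Adding more (W, b) pairs to the lists deepens the network; "deep" on the slide corresponds to many iterations of this loop.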

  39. Example of Neural Network. With inputs (1, -1): the first neuron has weights (1, -2) and bias 1, so z = 4 and σ(4) ≈ 0.98; the second has weights (-1, 1) and bias 0, so z = -2 and σ(-2) ≈ 0.12. Sigmoid function: σ(z) = 1 / (1 + e^(-z)).
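Reading the slide's diagram as two sigmoid neurons with inputs (1, -1), weights (1, -2) and (-1, 1), and biases 1 and 0 reproduces its outputs:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, -1.0])       # inputs from the slide
W = np.array([[1.0, -2.0],      # first neuron:  z = 1*1 + (-1)*(-2) + 1 = 4
              [-1.0, 1.0]])     # second neuron: z = 1*(-1) + (-1)*1 + 0 = -2
b = np.array([1.0, 0.0])

out = sigmoid(W @ x + b)        # approximately [0.98, 0.12]
```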

  40. Activation functions: Sigmoid, tanh, ReLU, Leaky ReLU, ELU, Maxout.
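The listed activations (except Maxout, which takes several linear inputs rather than a single pre-activation z) can be sketched as simple elementwise functions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))       # squashes to (0, 1)

def tanh(z):
    return np.tanh(z)                     # squashes to (-1, 1)

def relu(z):
    return np.maximum(0.0, z)             # zero for negative inputs

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)  # small negative slope

def elu(z, alpha=1.0):
    return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))  # smooth negatives
```

All five operate elementwise, so they can be applied directly to the layer pre-activations W @ a + b from the earlier slides.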
