
Introduction to Recognition: Computer Vision CS 543 / ECE 549



  1. Introduction to Recognition. Computer Vision CS 543 / ECE 549, University of Illinois. Many slides from D. Hoiem, L. Lazebnik.

  2. Outline
  • Overview of image and region categorization
    – Task description
    – What is a category?
  • Example of a spatial pyramid bag-of-words scene categorizer
  • Key concepts: features and classification
  • Deep convolutional neural networks (CNNs)

  3. Recognition as 3D Matching: "instance" recognition vs. "category-level" recognition. Example: recognizing solid objects by alignment with an image (Huttenlocher and Ullman, IJCV 1990).

  4. Detection, semantic segmentation, instance segmentation: the main recognition tasks are image classification, object detection, semantic segmentation, and instance segmentation.

  5. "Classic" recognition pipeline: image pixels → feature representation → trainable classifier → class label.

  6. Overview. Training: training images and training labels → image features → classifier training → trained classifier. Testing: test image → image features → trained classifier → prediction (e.g., "Outdoor").

  7. Classifiers: nearest neighbor. Given training examples from class 1 and class 2 and a test example, f(x) = label of the training example nearest to x.
  • All we need is a distance or similarity function for our inputs.
  • No training required!
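The rule on this slide can be sketched in a few lines of numpy; the toy 2D data below is hypothetical, chosen only to make the nearest-neighbor lookup visible:

```python
import numpy as np

def nearest_neighbor_predict(X_train, y_train, x):
    # f(x) = label of the training example nearest to x (Euclidean distance).
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

# Toy data: class 0 clustered near the origin, class 1 near (5, 5).
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
y = np.array([0, 0, 1, 1])

print(nearest_neighbor_predict(X, y, np.array([0.5, 0.2])))  # → 0
```

Note there is no training step at all; the "model" is just the stored training set plus a distance function.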

  8. K-nearest neighbor classifier. • Which classifier is more robust to outliers? Credit: Andrej Karpathy, http://cs231n.github.io/classification/
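The slide's question can be answered with a small experiment; the data below is a hypothetical toy set in which one mislabeled point (an outlier of class 1) sits inside the class-0 cluster, so 1-NN is fooled while 3-NN outvotes it:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k):
    # Majority vote among the k training examples nearest to x.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest_labels = y_train[np.argsort(dists)[:k]]
    return Counter(nearest_labels.tolist()).most_common(1)[0][0]

# Class-0 cluster near the origin, with one class-1 outlier at (0.4, 0.3).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [0.4, 0.3], [5.0, 5.0], [6.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

query = np.array([0.5, 0.4])
print(knn_predict(X, y, query, k=1))  # → 1 (fooled by the outlier)
print(knn_predict(X, y, query, k=3))  # → 0 (outlier is outvoted)
```

This is the sense in which k-NN with k > 1 is more robust to outliers than 1-NN.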

  9. Linear classifiers. • Find a linear function to separate the classes: f(x) = sgn(w · x + b).
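The decision rule f(x) = sgn(w · x + b) is a one-liner; the weights and bias below are hypothetical (in practice they are learned, e.g., by an SVM):

```python
import numpy as np

def linear_classify(w, b, x):
    # f(x) = sgn(w . x + b): +1 on one side of the hyperplane, -1 on the other.
    return 1 if np.dot(w, x) + b >= 0 else -1

w = np.array([1.0, 1.0])  # hypothetical learned weights
b = -1.0                  # hypothetical learned bias

print(linear_classify(w, b, np.array([2.0, 2.0])))  # 2 + 2 - 1 = 3  → +1
print(linear_classify(w, b, np.array([0.0, 0.0])))  # 0 + 0 - 1 = -1 → -1
```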

  10. Nonlinear SVMs
  • Linearly separable dataset in 1D.
  • Non-separable dataset in 1D.
  • We can map the data to a higher-dimensional space, e.g., x → (x, x²).
  Slide credit: Andrew Moore
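The mapping idea can be checked directly: the 1D dataset below (a hypothetical example with positives at the extremes) cannot be split by any single threshold on x, but after mapping x → (x, x²) a linear rule on the x² coordinate separates it perfectly:

```python
import numpy as np

# 1D dataset that no single threshold on x separates:
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([1, -1, -1, -1, 1])  # positive class at the extremes

# Map to the higher-dimensional space (x, x^2).
phi = np.stack([x, x**2], axis=1)

# In the new space, thresholding the x^2 coordinate is a linear classifier.
pred = np.where(phi[:, 1] >= 2.5, 1, -1)
print(pred)  # → [ 1 -1 -1 -1  1], matching y exactly
```

This is the intuition behind kernel SVMs: do the separation in the mapped space without ever computing phi explicitly.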

  11. Bag of features 1. Extract local features 2. Learn “visual vocabulary” 3. Quantize local features using visual vocabulary 4. Represent images by frequencies of “visual words”
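Steps 3 and 4 of the recipe above can be sketched with numpy; the two-word vocabulary and the local features below are hypothetical (in practice the vocabulary is learned by k-means in step 2, over descriptors such as SIFT):

```python
import numpy as np

def quantize(features, vocab):
    # Step 3: assign each local feature to its nearest visual word.
    dists = np.linalg.norm(features[:, None, :] - vocab[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def bow_histogram(features, vocab):
    # Step 4: represent the image by normalized visual-word frequencies.
    words = quantize(features, vocab)
    hist = np.bincount(words, minlength=len(vocab)).astype(float)
    return hist / hist.sum()

# Hypothetical 2-word vocabulary and 4 local features from one image.
vocab = np.array([[0.0, 0.0], [10.0, 10.0]])
feats = np.array([[0.1, 0.2], [9.8, 10.1], [0.3, -0.1], [10.2, 9.9]])

print(bow_histogram(feats, vocab))  # → [0.5 0.5]
```

The resulting fixed-length histogram is what gets fed to the classifier, regardless of how many local features the image produced.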

  12. Digit Classification Case Study

  13. The MNIST database of handwritten digits (Yann LeCun & Corinna Cortes)
  • A training set of 60,000 examples (6,000 per digit) and a test set of 10,000 examples.
  • Each digit is a 28 × 28 pixel grayscale image. The digit itself occupies the central 20 × 20 pixels, and its center of mass lies at the center of the box.

  14. Bias-Variance Trade-off. [Plot: error rate on MNIST vs. number of training examples (10^2 to 10^5) for four feature/kernel combinations: Gradient, Int; Gradient, Linear; Raw, Poly; Raw, RBF.]

  15. Bias and Variance

  16. Bias-Variance Trade-off Performance as a function of model complexity (SVM)

  17. Model Selection

  18. Bias-Variance Trade-off As a function of dataset size

  19. Generalization error (fixed classifier). [Plot: training and testing error vs. number of training examples; the gap between the two curves is the generalization error.]
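The gap on this slide is easy to reproduce: a 1-NN classifier has zero training error by construction (every training point is its own nearest neighbor), while its test error on overlapping classes is generally positive. The synthetic two-Gaussian data below is a hypothetical stand-in for the slide's fixed classifier setting:

```python
import numpy as np

def nn_error(X_train, y_train, X_eval, y_eval):
    # Error rate of a 1-NN classifier on an evaluation set.
    errs = 0
    for x, label in zip(X_eval, y_eval):
        pred = y_train[np.argmin(np.linalg.norm(X_train - x, axis=1))]
        errs += pred != label
    return errs / len(y_eval)

rng = np.random.default_rng(0)
n = 100
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2)) + 1.5 * y[:, None]  # two overlapping classes

X_tr, y_tr, X_te, y_te = X[:50], y[:50], X[50:], y[50:]
train_err = nn_error(X_tr, y_tr, X_tr, y_tr)  # 0: each point is its own NN
test_err = nn_error(X_tr, y_tr, X_te, y_te)   # typically > 0 here
```

The difference `test_err - train_err` is exactly the generalization error the slide's plot depicts, and it shrinks as the training set grows.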

  20. Features vs Classifiers. [Plot: error rate on MNIST vs. number of training examples (10^2 to 10^5) for four feature/kernel combinations: Gradient, Int; Gradient, Linear; Raw, Poly; Raw, RBF.]

  21. What are the right features? They depend on what you want to know!
  • Object (shape): local shape info, shading, shadows, texture
  • Scene (geometric layout): linear perspective, gradients, line segments
  • Material properties (albedo, feel, hardness): color, texture
  • Action (motion): optical flow, tracked points

  22. Stuff vs. objects. • E.g., recognizing cloth fabric vs. recognizing cups.

  23. Feature Design Process
  1. Start with a model.
  2. Look at errors on the development set.
  3. Think of features that can improve performance.
  4. Develop a new model; test whether the new features help.
  5. If not happy, go to step 1.
  6. "Ablations": simplify the system, prune out features that don't help anymore in the presence of other features.

  24. Features vs Classifiers. [Plot: error rate on MNIST vs. number of training examples (10^2 to 10^5) for four feature/kernel combinations: Gradient, Int; Gradient, Linear; Raw, Poly; Raw, RBF.]

  25. "Classic" recognition pipeline: image pixels → feature representation → trainable classifier → class label.

  26. Categorization involves features and a classifier. Training: training images and training labels → image features → classifier training → trained classifier. Testing: test image → image features → trained classifier → prediction (e.g., "Outdoor").

  27. New training setup with moderate-sized datasets. Initialize CNN features from a dataset similar to the task with millions of labeled examples; then tune the CNN features and train the classifier (neural network) on the new training images and labels, yielding a trained classifier.
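The setup on this slide (pretrained features plus a newly trained classifier head) can be sketched without a deep-learning framework. Everything below is a hypothetical stand-in: a frozen random projection plays the role of the pretrained CNN, and a logistic-regression head is trained on top of it by gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained CNN: fixed ("frozen") ReLU features.
W_pre = rng.normal(size=(8, 4))

def features(x):
    return np.maximum(x @ W_pre, 0.0)  # frozen; not updated during training

# Small labeled dataset for the new task (synthetic).
X = rng.normal(size=(64, 8))
y = (X[:, 0] > 0).astype(float)

# Train only a logistic-regression head on top of the frozen features.
F = features(X)
w, b = np.zeros(F.shape[1]), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))   # sigmoid predictions
    w -= 0.5 * (F.T @ (p - y)) / len(y)      # gradient step on the head only
    b -= 0.5 * np.mean(p - y)

acc = np.mean((1.0 / (1.0 + np.exp(-(F @ w + b))) > 0.5) == (y == 1))
```

Full fine-tuning would additionally update `W_pre` with a small learning rate; with only a moderate-sized labeled set, training just the head as above is the cheaper and often safer option.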
