
Lecture 2: Object Detection, Professor Fei-Fei Li, Stanford Vision Lab



  1. Lecture 2: Object Detection. Professor Fei-Fei Li, Stanford Vision Lab. 29-Mar-11. Lecture 2 - Fei-Fei Li

  2. What will we learn today? • Visual recognition overview – Representation – Learning – Recognition • Implicit Shape Model – Representation – Recognition – Experiments and results

  3. What are the different visual recognition tasks?

  4. Categorization vs Single instance recognition: Does this image contain the Chicago Macy’s building?

  5. Categorization vs Single instance recognition: Where is the crunchy nut?

  6. Applications of computer vision • Recognizing landmarks in mobile platforms + GPS

  7. Classification: Does this image contain a building? [yes/no] Yes!

  8. Classification: Is this a beach?

  9. Image search • Organizing photo collections

  10. Detection: Does this image contain a car? [where?] car

  11. Detection: Which objects does this image contain? [where?] Building, clock, person, car

  12. Detection: Accurate localization (segmentation): clock

  13. Detection: Estimating object semantic & geometric attributes • Object: Building, 45º pose, 8-10 meters away; it has bricks • Object: Person, back; 1-2 meters away • Object: Police car, side view, 4-5 m away

  14. Applications of computer vision • Surveillance • Assistive technologies • Computational photography • Assistive driving • Security

  15. Activity or event recognition: What are these people doing?

  16. Visual Recognition • Design algorithms that are capable of – Classifying images or videos – Detecting and localizing objects – Estimating semantic and geometrical attributes – Classifying human activities and events • Why is this challenging?

  17. How many object categories are there?

  18. Challenges: viewpoint variation (Michelangelo, 1475-1564)

  19. Challenges: illumination (image credit: J. Koenderink)

  20. Challenges: scale

  21. Challenges: deformation

  22. Challenges: occlusion (Magritte, 1957)

  23. Challenges: background clutter (Kilmeny Niland, 1995)

  24. Challenges: intra-class variation

  25. Some early works on object categorization • Turk and Pentland, 1991 • Belhumeur, Hespanha, & Kriegman, 1997 • Schneiderman & Kanade, 2004 • Viola and Jones, 2000 • Amit and Geman, 1999 • LeCun et al., 1998 • Belongie and Malik, 2002 • Agarwal and Roth, 2002 • Poggio et al., 1993

  26. Basic issues • Representation – How to represent an object category; which classification scheme? • Learning – How to learn the classifier, given training data • Recognition – How the classifier is to be used on novel data

  27. Representation – Building blocks: Sampling strategies • Interest operators • Dense, uniformly • Randomly • Multiple interest operators (Image credits: L. Fei-Fei, E. Nowak, J. Sivic)
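The dense and random sampling strategies listed above can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function names `dense_grid` and `random_points`, the image size, and the `step` parameter are all assumptions chosen for the example. Interest operators are left out because they depend on image content.

```python
import random

def dense_grid(width, height, step):
    """Patch centers on a regular grid, one every `step` pixels (dense, uniform)."""
    return [(x, y)
            for y in range(step // 2, height, step)
            for x in range(step // 2, width, step)]

def random_points(width, height, n, seed=0):
    """`n` patch centers drawn uniformly at random over the image."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    return [(rng.randrange(width), rng.randrange(height)) for _ in range(n)]

# Hypothetical 64x48 image: sample the same number of patches both ways.
grid = dense_grid(64, 48, step=16)
rand = random_points(64, 48, n=len(grid))
```

Both functions return patch centers only; a descriptor would then be computed around each center.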

  28. Representation – Appearance only or location and appearance

  29. Representation – Invariances • Viewpoint • Illumination • Occlusion • Scale • Deformation • Clutter • etc.

  30. Basic issues • Representation – How to represent an object category; which classification scheme? • Learning – How to learn the classifier, given training data • Recognition – How the classifier is to be used on novel data

  31. Learning • Learning parameters: What are you maximizing? Likelihood (generative) or performance on the train/validation set (discriminative)

  32. Learning • Learning parameters: What are you maximizing? Likelihood (generative) or performance on the train/validation set (discriminative) • Level of supervision – Manual segmentation; bounding box; image labels; noisy labels • Batch/incremental • Priors

  33. Learning • Learning parameters: What are you maximizing? Likelihood (generative) or performance on the train/validation set (discriminative) • Level of supervision – Manual segmentation; bounding box; image labels; noisy labels • Batch/incremental • Priors • Training images – Issue of overfitting – Negative images for discriminative methods
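One common way to write the two objectives named on the Learning slides is given here as a sketch under assumed notation: training pairs $(x_i, y_i)$, parameters $\theta$, a classifier $f_\theta$, and a loss $\ell$. None of these symbols appear on the slides.

```latex
\begin{align}
\theta^{*}_{\text{gen}}  &= \arg\max_{\theta} \sum_{i=1}^{N} \log p(x_i, y_i \mid \theta) \\
\theta^{*}_{\text{disc}} &= \arg\min_{\theta} \sum_{i=1}^{N} \ell\bigl(f_{\theta}(x_i),\, y_i\bigr)
\end{align}
```

The first line fits a likelihood model of the data. The second optimizes classification performance directly, with the loss measured on the training or validation set.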

  34. Basic issues • Representation – How to represent an object category; which classification scheme? • Learning – How to learn the classifier, given training data • Recognition – How the classifier is to be used on novel data

  35. Recognition – Recognition task: classification, detection, etc.

  36. Recognition – Recognition task – Search strategy: Sliding windows (Viola & Jones, 2001) • Simple • Computational complexity (x, y, S, θ, N of classes) – BSW by Lampert et al. 08 – Also, Alexe et al. 10

  37. Recognition – Recognition task – Search strategy: Sliding windows (Viola & Jones, 2001) • Simple • Computational complexity (x, y, S, θ, N of classes) – BSW by Lampert et al. 08 – Also, Alexe et al. 10 • Localization • Objects are not boxes

  38. Recognition – Recognition task – Search strategy: Sliding windows (Viola & Jones, 2001) • Simple • Computational complexity (x, y, S, θ, N of classes) – BSW by Lampert et al. 08 – Also, Alexe et al. 10 • Localization • Objects are not boxes • Prone to false positives – Non-max suppression: Canny ’86 … Desai et al., 2009
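The sliding-window search and the non-max suppression step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any cited paper: it scans a single scale and orientation (the slides note that the full search also ranges over S, θ, and the number of classes), uses a common greedy form of suppression, and replaces a trained classifier with a hypothetical `toy_score` that measures overlap with a known target box.

```python
def sliding_windows(width, height, win, stride):
    """Yield square (x1, y1, x2, y2) windows of side `win` at one scale."""
    for y in range(0, height - win + 1, stride):
        for x in range(0, width - win + 1, stride):
            yield (x, y, x + win, y + win)

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def nms(detections, iou_thresh=0.5):
    """Greedy suppression: visit boxes by descending score and keep a box
    only if it overlaps no already-kept box by more than `iou_thresh`."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) <= iou_thresh for k in kept):
            kept.append((box, score))
    return kept

def toy_score(box, target=(16, 16, 48, 48)):
    """Stand-in for a classifier (assumption): overlap with a known target."""
    return iou(box, target)

windows = list(sliding_windows(64, 64, win=32, stride=8))
detections = [(w, toy_score(w)) for w in windows if toy_score(w) > 0.5]
kept = nms(detections)
```

In this toy setup the windows next to the target also score above the threshold, which is the false-positive behaviour the slide warns about; suppression leaves only the best-scoring window.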

  39. Recognition – Recognition task – Search strategy – Attributes • Category: car; Azimuth = 225º; Zenith = 30º • It has metal; it is glossy; has wheels • References: Savarese, 2007; Sun et al. 2009; Liebelt et al. ’08, ’10; Farhadi et al. 09; Lampert et al. 09; Wang & Forsyth 09

  40. Recognition – Recognition task – Search strategy – Attributes – Context • Semantic: Torralba et al. 03; Rabinovich et al. 07; Gupta & Davis 08; Heitz & Koller 08; L-J Li et al. 08; Yao & Fei-Fei 10 • Geometric: Hoiem et al. 06; Gould et al. 09; Bao, Sun, Savarese 10

  41. Basic issues • Representation – How to represent an object category; which classification scheme? • Learning – How to learn the classifier, given training data • Recognition – How the classifier is to be used on novel data

  42. What will we learn today? • Visual recognition overview – Representation – Learning – Recognition • Implicit Shape Model – Representation – Recognition – Experiments and results

  43. Implicit Shape Model (ISM) • Basic ideas – Learn an appearance codebook – Learn a star-topology structural model [Figure: star model with parts x1 to x6] • Features are considered independent given obj. center • Algorithm: probabilistic Gen. Hough Transform – Exact correspondences → Prob. match to object part – NN matching → Soft matching – Feature location on obj. → Part location distribution – Uniform votes → Probabilistic vote weighting – Quantized Hough array → Continuous Hough space (Source: Bastian Leibe)

  44. Implicit Shape Model: Basic Idea • Visual vocabulary is used to index votes for object position [a visual word = “part”]. [Figure: training image; visual codeword with displacement vectors] B. Leibe, A. Leonardis, and B. Schiele, Robust Object Detection with Interleaved Categorization and Segmentation, International Journal of Computer Vision, Vol. 77(1-3), 2008. (Source: Bastian Leibe)

  45. Implicit Shape Model: Basic Idea • Objects are detected as consistent configurations of the observed parts (visual words). [Figure: test image] B. Leibe, A. Leonardis, and B. Schiele, Robust Object Detection with Interleaved Categorization and Segmentation, International Journal of Computer Vision, Vol. 77(1-3), 2008. (Source: Bastian Leibe)
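The voting idea on the ISM slides can be illustrated with a toy sketch: each feature is softly matched to codebook entries, and each matched entry casts weighted votes for the object center through its stored displacement vectors. Everything specific here is an assumption made for illustration, including the two-word `CODEBOOK`, the 2-D descriptors, the Gaussian matching weight, and the names `match_probs` and `vote`. For brevity the votes are binned on a coarse grid, whereas the slides state that ISM uses a continuous Hough space.

```python
import math
from collections import defaultdict

# Hypothetical codebook: word -> (prototype descriptor, displacements to center).
CODEBOOK = {
    "wheel": ((1.0, 0.0), [(20, -10), (-20, -10)]),
    "roof":  ((0.0, 1.0), [(0, 15)]),
}

def match_probs(descriptor, sigma=0.5):
    """Soft matching: probability of each word given the descriptor,
    from Gaussian-weighted distances to the prototypes (an assumption)."""
    w = {name: math.exp(-math.dist(descriptor, proto) ** 2 / (2 * sigma ** 2))
         for name, (proto, _) in CODEBOOK.items()}
    total = sum(w.values())
    return {name: v / total for name, v in w.items()}

def vote(features, cell=5):
    """Accumulate probabilistically weighted votes for the object center.
    Votes are binned into `cell`-pixel cells (a simplification)."""
    acc = defaultdict(float)
    for (x, y), desc in features:
        for name, p_word in match_probs(desc).items():
            offsets = CODEBOOK[name][1]
            for dx, dy in offsets:
                key = (round((x + dx) / cell), round((y + dy) / cell))
                acc[key] += p_word / len(offsets)  # split weight over offsets
    return acc

# Toy test image: two wheel-like features and one roof-like feature.
features = [((30, 50), (1.0, 0.0)), ((70, 50), (1.0, 0.0)), ((50, 25), (0.0, 1.0))]
acc = vote(features)
best = max(acc, key=acc.get)
center = (best[0] * 5, best[1] * 5)  # back to pixel coordinates
```

Each feature distributes a total vote mass of 1, so a strong peak appears only where several parts agree on the same center, which is the "consistent configuration" the last slide describes.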
