generic object recognition

Generic object recognition
Wed, April 6
Kristen Grauman



1. What does recognition involve?
   • Verification: is that a lamp?
   • Detection: are there people?
   • Identification: is that Potala Palace?
   • Object categorization: mountain, tree, building, banner, street lamp, vendor, people
   Source: Fei-Fei Li, Rob Fergus, Antonio Torralba

2. Scene and context categorization
   • outdoor, city, ...

   Instance-level recognition problem
   • "John's car"

   Generic categorization problem

   Object categorization: task description
   • "Given a small number of training images of a category, recognize a-priori unknown instances of that category and assign the correct category label."
   • Which categories are feasible visually? ("Fido", German shepherd, dog, animal, living being)

   Visual object categories
   • Basic-level categories in human categorization [Rosch 76, Lakoff 87]:
     – The highest level at which category members have similar perceived shape
     – The highest level at which a single mental image reflects the entire category
     – The level at which human subjects are usually fastest at identifying category members
     – The first level named and understood by children
     – The highest level at which a person uses similar motor actions for interaction with category members
   • Basic-level categories in humans seem to be defined predominantly visually.
   • There is evidence that humans (usually) start with basic-level categorization before doing identification; basic-level categorization is easier and faster for humans than object identification.
   • How does this transfer to automatic classification algorithms?
   • Example hierarchy: abstract levels (animal, ..., quadruped, ...), basic level (dog, cat, cow), individual level (German shepherd, Doberman, "Fido")
   Slide credit: K. Grauman, B. Leibe

3. How many object categories are there?
   • Biederman 1987
   Source: Fei-Fei Li, Rob Fergus, Antonio Torralba

   Other types of categories
   • Functional categories, e.g. chairs = "something you can sit on"
   • Ad-hoc categories, e.g. "something you can find in an office environment"
   Slide credit: K. Grauman, B. Leibe

   Why recognition?
   • Recognition is a fundamental part of perception (e.g., robots, autonomous agents)
   • Organize and give access to visual content: connect to information, detect trends and themes

   Posing visual queries
   • Example systems: Yeh et al., MIT; Belhumeur et al.; Kooaba, Bay & Quack et al.

4. Autonomous agents able to detect objects
   • http://www.darpa.mil/grandchallenge/gallery.asp

   Finding visually similar objects

   Discovering visual patterns and auto-annotation
   • Objects, categories, actions
   • Credits: Sivic & Zisserman; Lee & Grauman; Gammeter et al.; T. Berg et al.; Wang et al.

   Challenges: robustness
   • Illumination, object pose, clutter, occlusions, viewpoint, intra-class appearance
   • Realistic scenes are crowded, cluttered, and have overlapping objects.
   Slide credit: Kristen Grauman

5. Challenges: importance of context
   Slide credit: Fei-Fei, Fergus & Torralba

   Challenges: complexity
   • Thousands to millions of pixels in an image
   • 3,000-30,000 human-recognizable object categories
   • 30+ degrees of freedom in the pose of articulated objects (humans)
   • Billions of images indexed by Google Image Search
   • 18 billion+ prints produced from digital camera images in 2004
   • 295.5 million camera phones sold in 2005
   • About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]

   Challenges: learning with minimal supervision
   • Training data can range from more supervision to less.

   What works most reliably today
   • Reading license plates, zip codes, checks
   • Fingerprint recognition
   Source: Lana Lazebnik

6. What works most reliably today
   • Reading license plates, zip codes, checks
   • Fingerprint recognition
   • Face detection
   • Recognition of flat textured objects (CD covers, book covers, etc.)
   Source: Lana Lazebnik

   Generic category recognition: basic framework
   • Build/train object model
     – Choose a representation
     – Learn or fit parameters of model / classifier
   • Generate candidates in new image
   • Score the candidates
   (A window-based sketch of this pipeline follows after this page.)

   Generic category recognition: representation choice
   • Window-based
   • Part-based

   Supervised classification
   • Given a collection of labeled examples, come up with a function that will predict the labels of new examples (e.g., training examples labeled "four" or "nine", then a novel input to label).
   • How good is some function we come up with to do the classification? It depends on the total risk: the mistakes made and the cost associated with those mistakes.
   • Consider the two-class (binary) decision problem:
     – L(4 → 9): loss of classifying a 4 as a 9
     – L(9 → 4): loss of classifying a 9 as a 4
   • The risk of a classifier s is its expected loss:
     R(s) = Pr(4 → 9 | using s) · L(4 → 9) + Pr(9 → 4 | using s) · L(9 → 4)
   • We want to choose a classifier so as to minimize this risk. (A small risk-estimation sketch also follows after this page.)
   Slide credit: Kristen Grauman
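As a rough illustration of the window-based version of the basic framework above, the following is a minimal Python sketch. The names extract_features and classifier, the window size, stride, and score threshold are placeholder assumptions, not anything specified in the lecture; the model is assumed to be built and trained already, so the sketch only generates and scores candidate windows in a new image.

    def detect(image, extract_features, classifier,
               window=(64, 64), step=16, thresh=0.5):
        # image: 2-D array (e.g., a grayscale NumPy image).
        # extract_features: maps an image patch to a feature vector
        #   (the chosen representation).
        # classifier: maps that feature vector to a confidence score.
        H, W = image.shape[:2]
        h, w = window
        detections = []
        for y in range(0, H - h + 1, step):          # generate candidate windows
            for x in range(0, W - w + 1, step):
                patch = image[y:y + h, x:x + w]
                score = classifier(extract_features(patch))  # score each candidate
                if score > thresh:
                    detections.append((x, y, w, h, score))
        return detections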

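A similarly minimal sketch of the risk computation from the supervised classification slide, assuming NumPy, hypothetical loss values, and some already-trained predict function that returns the label 4 or 9 for a feature vector:

    import numpy as np

    L_4_TO_9 = 1.0   # hypothetical cost of labeling a true "4" as "9"
    L_9_TO_4 = 5.0   # hypothetical cost of labeling a true "9" as "4"

    def empirical_risk(predict, xs, ys):
        # Estimate R(s) = Pr(4 -> 9 | using s) * L(4 -> 9)
        #               + Pr(9 -> 4 | using s) * L(9 -> 4)
        # on a labeled sample, taking each error probability as the fraction
        # of all examples on which that kind of mistake occurs.
        preds = np.array([predict(x) for x in xs])
        ys = np.asarray(ys)
        pr_4_to_9 = np.mean((ys == 4) & (preds == 9))
        pr_9_to_4 = np.mean((ys == 9) & (preds == 4))
        return pr_4_to_9 * L_4_TO_9 + pr_9_to_4 * L_9_TO_4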
7. Supervised classification
   • The optimal classifier will minimize total risk.
   • At the decision boundary, either choice of label yields the same expected loss (picture the two posteriors P(4 | x) and P(9 | x) plotted against the feature value x).
   • If we choose class "four" at the boundary, the expected loss is:
     P(class is 9 | x) · L(9 → 4) + P(class is 4 | x) · L(4 → 4) = P(class is 9 | x) · L(9 → 4)
     (a correct classification incurs no loss, so L(4 → 4) = 0).
   • If we choose class "nine" at the boundary, the expected loss is:
     P(class is 4 | x) · L(4 → 9)
   • So the best decision boundary is at the point x where
     P(class is 9 | x) · L(9 → 4) = P(class is 4 | x) · L(4 → 9)
   • To classify a new point, choose the class with the lowest expected loss; i.e., choose "four" if
     P(4 | x) · L(4 → 9) > P(9 | x) · L(9 → 4)
   • How do we evaluate these probabilities?
   Slide credit: Kristen Grauman

   Basic probability
   • X is a random variable.
   • P(X) is the probability that X achieves a certain value; it is described by a probability distribution (discrete X) or density function (continuous X), i.e., a PDF.
   • Conditional probability P(X | Y): the probability of X given that we already know Y.
   Source: Steve Seitz

   Example: learning skin colors
   • We can represent a class-conditional density with a histogram (a "non-parametric" distribution): for feature x = hue, P(x | skin) records the percentage of skin pixels in each hue bin, and P(x | not skin) does the same for non-skin pixels.
   • Now we get a new image and want to label each pixel as skin or non-skin.
   • What's the probability we care about to do skin detection? (A sketch combining the histogram densities with the minimum-expected-loss rule follows below.)
   Slide credit: Kristen Grauman
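To make the last two slides concrete, here is a loose Python/NumPy sketch that builds hue histograms as the class-conditional densities and then labels a pixel with the minimum-expected-loss rule. The bin count, skin prior, and loss values are illustrative assumptions, not values from the lecture, and the use of Bayes' rule anticipates how the posteriors would be obtained from the histograms.

    import numpy as np

    N_BINS = 32  # assumed number of hue histogram bins

    def fit_hue_histogram(hues):
        # Non-parametric class-conditional density P(x | class):
        # the fraction of that class's training pixels in each hue bin.
        hist, _ = np.histogram(hues, bins=N_BINS, range=(0.0, 1.0))
        return hist / max(hist.sum(), 1)

    def label_pixel(hue, p_hue_given_skin, p_hue_given_not_skin,
                    prior_skin=0.3, loss_miss=1.0, loss_false_alarm=1.0):
        # Choose the label with the lowest expected loss at this pixel.
        # Posteriors come from Bayes' rule (the shared normalizer P(x) cancels);
        # the prior and the two losses are assumed values.
        b = min(int(hue * N_BINS), N_BINS - 1)
        post_skin = p_hue_given_skin[b] * prior_skin              # proportional to P(skin | x)
        post_not = p_hue_given_not_skin[b] * (1.0 - prior_skin)   # proportional to P(not skin | x)
        # Expected loss of saying "skin" is post_not * loss_false_alarm;
        # expected loss of saying "not skin" is post_skin * loss_miss.
        return "skin" if post_skin * loss_miss > post_not * loss_false_alarm else "not skin"

    # Hypothetical usage, with hue values scaled to [0, 1]:
    # p_skin = fit_hue_histogram(skin_training_hues)
    # p_not = fit_hue_histogram(non_skin_training_hues)
    # label = label_pixel(0.05, p_skin, p_not)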
