9.54 Class 16: Features for recognition - supervised, unsupervised and innate


  1. 9.54 Class 16: Features for recognition - supervised, unsupervised and innate. Shimon Ullman + Tomaso Poggio; Danny Harari + Daniel Zysman + Darren Seibert

  2. Visual recognition

  3. The initial input is just image intensities

  4. Object Categories: we perceive the world in terms of objects and classes, with large variability within each class.

  5. Individual Recognition

  6. Object parts (labeled on a car image): window, mirror, door knob, headlight, bumper, front wheel, back wheel

  7. Categorization: dealing with class variability

  8. Class vs. non-class: natural for the brain, difficult computationally

  9. Unsupervised Classification

  10. Features and Classifiers

  11. Features and Classifiers

  12. Image features feed a classifier. Generic features range from simple (wavelets) to complex (geons).

  13. Visual Class: Similar Configurations of Shared Image Components

  14. What will be the optimal image building blocks for the class?

  15. Optimal class components? Large features are too rare; small features are found everywhere. Find features that carry the highest amount of information.

  16. Mutual Information I(C,F). MI is defined as the difference between the class entropy and the conditional entropy of the class given a feature: I(F,C) = H(C) − H(C|F). Entropy: H(C) = −Σ_{c∈C} P(c) log P(c). Conditional entropy: H(C|F) = Σ_{f∈F} p(f) H(C|F=f) = −Σ_{f∈F} Σ_{c∈C} p(f) P(c|f) log P(c|f).

  17. Mutual Information I(C,F), a binary example. Class labels: 1 1 0 1 0 1 0 0; feature detections: 1 0 0 1 1 1 0 0. I(F,C) = H(C) − H(C|F).

  18. Computing MI from Examples. Mutual information can be measured from examples: with 100 faces and 100 non-faces, a feature detected 44 times on faces and 6 times on non-faces gives H(C) = 1 and H(C|F) = 0.8475, so the mutual information is 0.1525 bits. There are simple neural-network approximations.
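
Slide 18's numbers can be reproduced directly from the detection counts; here is a minimal Python sketch (mine, not from the lecture):

```python
import math

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(n_pos, n_neg, hits_pos, hits_neg):
    """I(C;F) for a binary class C and binary feature F, estimated from
    how often the feature fires on positive and negative examples."""
    n = n_pos + n_neg
    p_f1 = (hits_pos + hits_neg) / n                     # P(F = 1)
    p_c_f1 = [hits_pos / (hits_pos + hits_neg),          # P(c | F = 1)
              hits_neg / (hits_pos + hits_neg)]
    miss_pos, miss_neg = n_pos - hits_pos, n_neg - hits_neg
    p_c_f0 = [miss_pos / (miss_pos + miss_neg),          # P(c | F = 0)
              miss_neg / (miss_pos + miss_neg)]
    h_c = entropy([n_pos / n, n_neg / n])                # H(C)
    h_c_f = p_f1 * entropy(p_c_f1) + (1 - p_f1) * entropy(p_c_f0)  # H(C|F)
    return h_c - h_c_f

# Slide 18's example: 100 faces and 100 non-faces; the fragment fires
# on 44 faces and 6 non-faces.
print(mutual_information(100, 100, 44, 6))   # ~0.153 bits (slide: 0.1525)
```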

  19. Optimal classification features. Theoretically, maximizing the delivered information minimizes the classification error: Error = H − I(C;F). In practice, informative object components can be identified in training images.

  20. Selecting Fragments. [Plot: mutual information vs. detection threshold for candidate face fragments such as forehead, hairline, mouth, eye, nose, nosebridge, long_hairline, chin, and twoeyes.] 'Imprinting' many receptive fields and selecting a subset.

  21. Adding a New Fragment (avoiding redundancy by max-min selection). Compare each new fragment Fi to all previously selected fragments Fk, and add the one that maximizes the additional information: max_i min_k ΔMI(Fi, Fk). This is a competition between units with similar responses; a sketch follows below.
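
A minimal sketch of that greedy max-min loop, assuming a helper delta_mi(f_new, f_old) that returns the additional class information f_new delivers beyond f_old (the helper and its None convention are hypothetical, not from the lecture):

```python
def max_min_select(candidates, delta_mi, n_fragments):
    """Greedy max-min fragment selection.
    delta_mi(f, None) is taken to be f's plain mutual information
    with the class."""
    # Seed with the single most informative fragment.
    selected = [max(candidates, key=lambda f: delta_mi(f, None))]
    pool = [f for f in candidates if f is not selected[0]]
    while pool and len(selected) < n_fragments:
        # Score each candidate by its worst-case gain over the selected
        # set, so near-duplicates of an existing fragment score low.
        best = max(pool, key=lambda f: min(delta_mi(f, s) for s in selected))
        selected.append(best)
        pool.remove(best)
    return selected
```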

  22. Highly Informative Face Fragments: optimal receptive fields for faces. Ullman et al., Nature Neuroscience 2002.

  23. Informative class features: horse-class features, car-class features

  24. Informative fragments with positions: declare the class when Σ_k w_k F_k > θ, summed over all fragments detected within their designated regions.
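
Slide 24's decision rule is a thresholded weighted sum; a small sketch with hypothetical weights and detections:

```python
import numpy as np

def fragment_classify(detections, weights, theta):
    """detections[k] is 1 if fragment k was found anywhere inside its
    designated region, else 0; declare the class present when the
    weighted sum of detections exceeds the threshold theta."""
    return float(np.dot(weights, detections)) > theta

# Hypothetical numbers: three fragments, two of them detected.
print(fragment_classify([1, 0, 1], [0.9, 0.5, 0.7], theta=1.2))  # True
```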

  25. Star model: detected fragments 'vote' for the object-center location, and the location with the maximal vote is selected. In variations, a popular state-of-the-art scheme.
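
A sketch of the voting step, assuming each fragment stores a typical offset to the object center (the data layout here is my own illustration):

```python
import numpy as np

def star_model_center(detections, shape):
    """Each detected fragment casts a vote at its predicted center
    location; the object center is the cell with the most votes.
    detections: ((row, col), (d_row, d_col)) pairs, where the second
    pair is the fragment's stored offset to the object center."""
    votes = np.zeros(shape)
    for (r, c), (dr, dc) in detections:
        cr, cc = r + dr, c + dc
        if 0 <= cr < shape[0] and 0 <= cc < shape[1]:
            votes[cr, cc] += 1          # votes could also be MI-weighted
    return np.unravel_index(np.argmax(votes), shape)
```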

  26. Image parts informative for classification: Ullman & Sali 1999; Agarwal & Roth 2002; Fergus, Perona & Zisserman 2003.

  27. Variability of Airplanes Detected

  28. Image representation for recognition: the HoG descriptor. Dalal, N. & Triggs, B., Histograms of Oriented Gradients for Human Detection.
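
A minimal sketch of the descriptor's core computation, per-cell orientation histograms of gradient magnitude; the full Dalal & Triggs pipeline additionally normalizes overlapping blocks of cells, which is omitted here:

```python
import numpy as np

def hog_cells(img, cell=8, n_bins=9):
    """Per-cell orientation histograms, the core of the HoG descriptor
    (block normalization omitted for brevity)."""
    img = img.astype(float)
    gr, gc = np.gradient(img)                     # gradients along rows, cols
    mag = np.hypot(gr, gc)                        # gradient magnitude
    ang = np.rad2deg(np.arctan2(gr, gc)) % 180    # unsigned orientation
    bins = (ang // (180 / n_bins)).astype(int) % n_bins
    n_r, n_c = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((n_r, n_c, n_bins))
    for i in range(n_r):
        for j in range(n_c):
            win = (slice(i * cell, (i + 1) * cell),
                   slice(j * cell, (j + 1) * cell))
            for b in range(n_bins):
                hist[i, j, b] = mag[win][bins[win] == b].sum()
    return hist
```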

  29. Object model using HoG

  30. fMRI: Functional Magnetic Resonance Imaging

  31. Looking for Class Features in the Brain: fMRI. Lerner, Epshtein, Ullman & Malach, JCON 2008.

  32. Class-fragments and Activation. Malach et al., 2008.

  33. EEG

  34. Informative Fragments: an ERP Study. Harel, Ullman, Epshtein & Bentin.

  35. ERP: face features. [ERP traces at posterior-temporal sites, left and right hemispheres, for fragments at five MI levels (MI 1 through MI 5), over 0 to 600 milliseconds.] Harel, Ullman, Epshtein & Bentin, Vis Res 2007.

  36. Features for object segregation: Innate mechanisms for unsupervised learning

  37. Object Segregation. [Image with Object 1, Object 2, and the background labeled.]

  38. Object segregation is learned [Kellman & Spelke 1983; Spelke 1990; Kestenbaum et al. 1987]. Even basic Gestalt cues are initially missing in 5-month-olds [Schmidt et al. 1986].

  39. Object segregation is learned: adults.

  40. It all begins with motion

  41. It all begins with motion: grouping by common motion precedes figural goodness [Spelke 1990, review], and motion discontinuities provide an early cue for occlusion boundaries [Granrud et al. 1984].

  42. Our model: motion-based segregation, which is accurate and complete, trains two static cues. Boundary: motion discontinuities teach local occlusion cues (general, but noisy and with incomplete boundaries). Global: common motion teaches object form (object-specific, but inaccurate). Together these yield static segregation. Dorfman, Harari & Ullman, CogSci 2013.

  43. Boundary: intensity edges?

  44. Boundary occlusion cues: T-junctions, convexity, extremal edges [Ghose & Palmer 2010].

  45. Global: a familiar object.

  46. How does it actually work?

  47. Motion: a moving object.

  48. Motion supplies the figure/ground labels; the Boundary and Global cues are initially unknown.

  49. Boundary: informative boundary features. These need many examples for good results (1000+).

  50. Boundary prediction: figure or ground? On a novel object against a novel background: 78% success, using 100,000 training examples. A training sketch follows below.
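
As a rough illustration of slide 50's setup (not the authors' actual pipeline), one could train any standard classifier on motion-labeled boundary patches; here sklearn's logistic regression on synthetic stand-in features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-in data: each row describes one boundary patch through a few
# hypothetical cue responses (T-junction, convexity, extremal-edge
# scores, ...); the label says on which side of the edge the figure
# lies. In the model these labels come for free from motion.
X = rng.normal(size=(100_000, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
clf = LogisticRegression(max_iter=1000).fit(X[:80_000], y[:80_000])
print(clf.score(X[80_000:], y[80_000:]))   # held-out figure/ground accuracy
```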

  51. Boundary cues applied over the entire image separate figure from background.

  52. Global: learning an object. A standard object-recognition algorithm learns local features and their relative locations.

  53. Global: detection.

  54. Combining information sources (figure vs. background): the Boundary cue alone is noisy and incomplete, the Global cue alone is inaccurate, but combined they are accurate and complete.

  55. More complex algorithms: default GrabCut vs. GrabCut with the segregation cue [Rother et al. 2004].

  56. More complex algorithms: default GrabCut vs. GrabCut with the segregation cue, second example [Rother et al. 2004].
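
Slides 55-56 suggest seeding a stronger segmentation algorithm with the segregation output. A sketch using OpenCV's GrabCut [Rother et al. 2004], where the segregation mask replaces the usual user-drawn rectangle; the function name and mask convention are my own, not from the lecture:

```python
import cv2
import numpy as np

def grabcut_with_segregation(img, seed_mask, iters=5):
    """Seed OpenCV's GrabCut from a segregation mask instead of a user
    rectangle. seed_mask is nonzero where the motion and boundary cues
    say 'figure'; those pixels become probable foreground, the rest
    probable background."""
    mask = np.where(seed_mask > 0, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
    bgd = np.zeros((1, 65), np.float64)      # GMM state buffers
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(img, mask, None, bgd, fgd, iters, cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD))   # final figure mask
```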

  57. Object segregation - summary. Static segregation is learned from motion, via two simple mechanisms. Boundary: motion discontinuities → occlusion boundaries (this needs a rich library of cues, including extremal edges). Global: common motion → object form. The two mechanisms work in synergy. This is enough to get started; adult segregation is much more complex.

  58. Summary. Features are important for many visual tasks, such as object recognition and segregation. Features can be learned in a supervised manner given labeled examples. Features can also be learned in an unsupervised manner, using statistical regularities or domain-specific cues.
