Detecting Categories in News Video Using Image Features


  1. Detecting Categories in News Video Using Image Features Slav Petrov, Arlo Faria, Pascal Michaillat, Alex Berg, Andreas Stolcke, Dan Klein, Jitendra Malik

  2. System Overview (diagram)
     Source Video feeds three pipelines:
     • Images -> GB -> SVM
     • Audio -> MFCC -> GMM
     • ASR -> TFIDF -> SVM
     Stage labels: Feature extraction; Primary systems; Higher-level systems
     Other diagram labels: Category correlation; Sequential context; combination; 1-best selection

  3. Image Features in TrecVid ’05
     • IBM: Color Histogram; Co-occurrence Texture; Color Correlogram; Wavelet Texture Grid; Color Moments; Edge Histogram Layout
     • CMU (local): Color Histograms (in different color spaces); Texture Histograms; Edge Histograms
     • Columbia (part based model): Color; Size; Texture; Spatial Relation
     • Tsinghua (local and global): Color Auto-Correlograms; Color Moments; Color Coherence Vectors; Edge Histograms; Color Histograms; Wavelet Texture

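  4. Image Features in TrecVid ’05 (comparison table)
     Systems compared (columns): Columbia, Berkeley, Tsinghua, CMU, IBM
     Check marks per feature row:
       Color    Histograms      ✓ ✓ ✓ ✓
       Color    Moments         ✓ ✓
       Color    Correlograms    ✓ ✓
       Texture  Histograms      ✓ ✓ ✓
       Texture  Wavelets        ✓
       Edge Histograms          ✓ ✓ ✓
       Shape                    ✓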

  5. Exemplars for Recognition
     • Use exemplars for recognition
     • Compare query image and each exemplar using shape cues
     (Figure: Database of Exemplars; Query Image)

  6. Finding similar patches (figure: Query; Exemplar)

  7. Geometric Blur (Local Appearance Descriptor) [Berg & Malik, CVPR’01]
     • Compute sparse channels from image
     • Extract a patch in each channel
     • Apply spatially varying blur and sub-sample
     • Descriptor is robust to small affine distortions
     (Figure labels: ~ Idealized signal; Geometric Blur Descriptor)

  8. GB in Practice
     • In practice, compute discrete blur levels for the whole image and sample as needed for each feature location.
     (Figure: Horizontal Channel and Vertical Channel, shown with Increasing Blur)
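A minimal sketch of the "discrete blur levels, sample as needed" idea, simplified to a 1-D signal. It assumes the blur applied to each sample grows linearly with its distance from the feature location; the function names, the Gaussian kernel, and the `alpha` slope are illustrative choices, not details taken from the slides.

```python
import math

def gaussian_blur_1d(signal, sigma):
    """Blur a 1-D signal with a truncated, normalized Gaussian (borders clamped)."""
    if sigma <= 0:
        return list(signal)
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    norm = sum(kernel)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in zip(offsets, kernel):
            j = min(max(i + k, 0), n - 1)  # clamp indices at the borders
            acc += weight * signal[j]
        out.append(acc / norm)
    return out

def geometric_blur_descriptor(signal, center, sample_offsets, sigmas, alpha=0.5):
    """Sample around `center`, reading each sample from a precomputed blur level.

    The level used for a sample is the one whose sigma is closest to
    alpha * |offset|, so samples far from the center are blurred more.
    """
    # Precompute every blur level once for the whole signal.
    levels = {s: gaussian_blur_1d(signal, s) for s in sigmas}
    n = len(signal)
    descriptor = []
    for off in sample_offsets:
        target_sigma = alpha * abs(off)
        sigma = min(sigmas, key=lambda s: abs(s - target_sigma))
        pos = min(max(center + off, 0), n - 1)
        descriptor.append(levels[sigma][pos])
    return descriptor
```

In a 2-D setting the same lookup would be repeated per channel (for example the horizontal and vertical channels), with the per-channel samples concatenated into one descriptor.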

  9. Comparing Images [Berg, Berg & Malik, CVPR’05]
     • Sample 200 GB features from edge points
     • Dissimilarity from A to B is given by a formula (not preserved in this transcript), where the F x are the GB features.
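A minimal sketch of a directed dissimilarity between two sets of descriptors. It assumes each feature of A is matched to its nearest feature in B and the distances are averaged; that form, the Euclidean distance, and the function names are assumptions for illustration and may not match the formula shown on the slide.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def directed_dissimilarity(feats_a, feats_b):
    """Mean over features of A of the distance to the nearest feature of B.

    Assumed form for illustration; it is asymmetric in general.
    """
    if not feats_a or not feats_b:
        raise ValueError("both feature sets must be non-empty")
    total = 0.0
    for fa in feats_a:
        total += min(euclidean(fa, fb) for fb in feats_b)
    return total / len(feats_a)
```

Because the measure is directed, `directed_dissimilarity(a, b)` and `directed_dissimilarity(b, a)` can differ; a symmetric score could be obtained by summing the two directions.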

  10. Caltech 101 Dataset
      • Object Recognition Benchmark
      • 101 Categories: stereotypical pose; little clutter; objects centered; one object per image

  11. Caltech 101 Results [Zhang, Berg, Maire & Malik, CVPR’06] (results figure; annotation: uses GB features)

  12. Primal features for SVM
      • Compare to 50 prototypes from each class
      • Use distances as feature vector for an SVM
      (Figure: a Query is compared against the Prototypes to give a Feature Vector, e.g. 0.9, 0.1, 0.8, 0.7, 0.7)
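A minimal sketch of turning prototype distances into an SVM input vector. The function name, the dictionary layout, and the pluggable `dissimilarity` argument are hypothetical; the slides only state that distances to 50 prototypes per class are used as the feature vector.

```python
def prototype_feature_vector(query, prototypes_by_class, dissimilarity):
    """Concatenate the query's dissimilarity to every prototype, class by class.

    prototypes_by_class: dict mapping class label -> list of prototype items.
    dissimilarity: callable (query, prototype) -> float.
    """
    vector = []
    # A fixed class order keeps each dimension tied to the same prototype
    # for every query.
    for label in sorted(prototypes_by_class):
        for proto in prototypes_by_class[label]:
            vector.append(dissimilarity(query, proto))
    return vector
```

With 50 prototypes per class and C classes, every image is described by a vector of 50 x C distances, regardless of how many local features it contains.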

  13. SVM features interpretation
      • Slices of the Kernel Matrix (figure: query q against prototypes t_i, t_j, t_k)
      • Fixed-points in a higher dimensional vector space (figure: q relative to t_i, t_j, t_k)

  14. SVM Specifics
      • SVM light package
      • Same parameters for all categories:
        • Linear kernel
        • Default regularization parameter
        • Asymmetric cost doubling the weight of positive examples
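A minimal sketch of what an asymmetric cost means for a linear soft-margin SVM: hinge-loss violations on positive examples are weighted twice as heavily as those on negatives. The function name and the value `c=1.0` are placeholders (the package's actual default regularization value is not given in the slides), and this evaluates the objective only; it is not the SVM light solver.

```python
def asymmetric_svm_objective(w, b, examples, c=1.0, pos_weight=2.0):
    """Linear soft-margin SVM objective with heavier penalties on positives.

    examples: iterable of (features, label) pairs with label in {+1, -1}.
    c: regularization trade-off (placeholder value, not a package default).
    pos_weight: factor applied to the hinge loss of positive examples.
    """
    regularizer = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for x, y in examples:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge = max(0.0, 1.0 - y * score)
        loss += (pos_weight if y > 0 else 1.0) * hinge
    return regularizer + c * loss
```

Up-weighting positives is a common choice when positive shots for a category are much rarer than negative ones, since it discourages the classifier from simply predicting "negative" everywhere.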

  15. Results ’05
      • Berkeley-Shape mAP = 0.38
      • Best ’05 (IBM) mAP = 0.34
      Results ’06 (chart; categories shown: Computer-TV-Screen, Meeting, Sports, Car; series: Best, Berkeley-Shape, Median)
      • mAP = 0.11
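A minimal sketch of how a mean average precision (mAP) figure can be computed from ranked results, assuming binary relevance labels per category and normalization by the number of relevant items in each ranked list. The official TRECVID evaluation may differ in details such as list truncation, so this is for orientation only.

```python
def average_precision(ranked_labels):
    """Average precision of a ranked list of 0/1 relevance labels (best first)."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant item
    return precision_sum / hits if hits else 0.0

def mean_average_precision(ranked_lists):
    """Mean of the per-category average precisions."""
    if not ranked_lists:
        raise ValueError("need at least one ranked list")
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)
```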

  16. Limitations
      • Several objects per image
      • Features do not capture: different scales; color

  17. Conclusions
      • Shape is an important cue for object recognition.
      • System that uses shape features only can have competitive performance.
      • Shape features are orthogonal to features used in the past.

  18. Thank You! petrov@eecs.berkeley.edu
