Columbia University TRECVID-2006 High-Level Feature Extraction


  1. Columbia University TRECVID-2006 High-Level Feature Extraction Shih-Fu Chang, Winston Hsu, Wei Jiang, Lyndon Kennedy, Dong Xu, Akira Yanagawa, and Eric Zavesky Digital Video and Multimedia Lab, Columbia University http://www.ee.columbia.edu/dvmm

  2. Overview – 5 methods & 6 submitted runs
     5 methods: (1) baseline, (2) context-based concept fusion, (3) lexicon-spatial pyramid matching, (4) text feature, (5) event detection
     Diagram grouping label: Visual-based
     6 runs: baseline, context, LSPM, text, visual_concept adaptive, multi-model_concept adaptive

  3. Overview – performance
     [Bar chart: MAP (axis 0 to 0.16) for runs A_CL1_1 through A_CL6_6; run labels: multi-model_concept adaptive, visual_concept adaptive, text, LSPM, context, baseline; annotations: visual-text, best all, visual-based, best visual]
     Every method contributes incrementally to the final detection:
     • context > baseline: context-based concept fusion (CBCF) improves baseline
     • LSPM > context: lexicon-spatial pyramid matching (LSPM) further improves detection
     • text > LSPM: text features improve visual

  4. Overview – performance
     [Bar chart: MAP (axis 0 to 0.16) for runs A_CL1_1 through A_CL6_6; run labels: multi-model_concept adaptive, visual_concept adaptive, text, spatial pyramid, context, baseline; annotations: visual-text, best all, visual-based, best visual]
     • visual_concept adaptive > LSPM (also > context > baseline): best-of-visual selection works
     • text > multi-model_concept adaptive: best-of-all selection does not work well, probably due to overfitting of the text tool

  5. Outline – New Algorithms • Baseline • Context-based concept fusion (CBCF) • Lexicon-spatial pyramid matching (LSPM) • Text features • Event detection

  6. Outline – New Algorithms • Baseline • Context-based concept fusion (CBCF) • Lexicon-spatial pyramid matching (LSPM) • Text features • Event detection

  7. Individual Methods: (1) Baseline
     Average fusion of two SVM baseline classification results
     Based on 3 visual features:
     • color moments over 5x5 fixed grid partitions
     • Gabor texture
     • edge direction histogram from the whole image
     Classifier 1: fixed/global features (color, texture, …, edge) fed to Support Vector Machines (SVM)
     Captures coarse local features, layout, and global appearance

  8. Individual Methods: (1) Baseline
     Average fusion of two SVM baseline classification results
     Based on 3 visual features:
     • color moments over 5x5 fixed grid partitions
     • Gabor texture
     • edge direction histogram from the whole image
     Classifier 2: fixed/global features (color, texture, …, edge) fed to an ensemble classifier
     Features and models available for download soon!
     Yanagawa et al., Tech. Rep., Columbia Univ., 2006, http://www.ee.columbia.edu/dvmm/newPublication.htm
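The grid color-moment feature and the average fusion step on the two baseline slides can be illustrated with a short sketch. The slides do not say which moments are used, so this assumes the first three (mean, standard deviation, and the signed cube root of the third central moment) per channel; the function names `channel_moments`, `grid_color_moments`, and `average_fusion` are hypothetical and this is not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # three color channels; the color space is left open


def channel_moments(values: Sequence[float]) -> Tuple[float, float, float]:
    """Mean, standard deviation, and signed cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, math.sqrt(var), skew


def grid_color_moments(image: List[List[Pixel]], grid: int = 5) -> List[float]:
    """Concatenate per-channel moments for every cell of a fixed grid x grid partition.

    Assumes the image has at least `grid` rows and columns so no cell is empty.
    """
    height, width = len(image), len(image[0])
    features: List[float] = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for channel in range(3):
                features.extend(channel_moments([p[channel] for p in cell]))
    return features


def average_fusion(scores_a: Sequence[float], scores_b: Sequence[float]) -> List[float]:
    """Average the per-shot scores of two classifiers (equal weights assumed)."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]
```

With three moments over three channels and 25 cells, this sketch yields a 225-dimensional vector; the true dimensionality of the authors' feature may differ.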

  9. Outline – New Algorithms • Baseline • Context-based concept fusion (CBCF) • Lexicon-spatial pyramid matching (LSPM) • Text features • Event detection

  10. Outline – New Algorithms • Baseline • Context-based concept fusion (CBCF) • Lexicon-spatial pyramid matching (LSPM) • Text features • Event detection

  11. Individual Methods: (2) CBCF
      Background on Context Fusion
      • Hard/specific concept: the “Government-Leader” detector faces large variance in appearance (different person, different view)
      • Generic concepts: “Face” detector, “outdoor” detector
      • Context-based model: uses the context information (Outdoor, Face) to label Government-Leader as - or +

  12. Individual Methods: (2) CBCF
      Formulation
      • Independent detectors (outdoor, government-leader, face) output P(outdoor|image), P(government-leader|image), P(face|image)
      • A context-based model (Naphade et al. 2002) takes these outputs and produces updated estimates of P(government-leader|image), P(face|image), P(outdoor|image)

  13. Individual Methods: (2) CBCF
      Our approach: Discriminative + Generative
      • Observation: for image $I$, the outputs of the outdoor, government-leader, and face detectors, P(outdoor|image), P(government-leader|image), P(face|image), form the observations $x_1, x_2, x_3$
      • Conditional Random Field over concept nodes $C_1, C_2, C_3$ (Jiang, Chang, et al., ICIP 2006); the diagram also shows outdoor, airplane, office
      • Updated posteriors: $p(y_1 = 1 \mid X)$, $p(y_2 = 1 \mid X)$, $p(y_3 = 1 \mid X)$, the updated estimates of P(government-leader|image), P(face|image), P(outdoor|image)

  14. Individual Methods: (2) CBCF
      Our approach: Discriminative + Generative
      • Observation: for image $I$, the outputs of the outdoor, government-leader, and face detectors, P(outdoor|image), P(government-leader|image), P(face|image), form the observations $x_1, x_2, x_3$
      • Conditional Random Field over concept nodes $C_1, C_2, C_3$, with an objective iteratively minimized by boosting:
        $$\min\; J = -\prod_{I} \prod_{C_i} p(y_i = 1 \mid X)^{(1+y_i)/2} \; p(y_i = -1 \mid X)^{(1-y_i)/2}$$
      • Updated posteriors: $p(y_1 = 1 \mid X)$, $p(y_2 = 1 \mid X)$, $p(y_3 = 1 \mid X)$, the updated estimates of P(government-leader|image), P(face|image), P(outdoor|image)

  15. Individual Methods: (2) CBCF
      During each iteration t, two SVM classifiers are trained for each concept:
      1. Using input independent detection results
      2. Using updated posteriors from iteration t-1
      Classifier 2 keeps updating through the iterations and captures inter-conceptual influences
      Without classifier 2, the procedure is traditional AdaBoost
      Objective, iteratively minimized by boosting:
      $$\min\; J = -\prod_{I} \prod_{C_i} p(y_i = 1 \mid X)^{(1+y_i)/2} \; p(y_i = -1 \mid X)^{(1-y_i)/2}$$
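To make the boosted-CRF objective concrete, the sketch below evaluates $J$ for a small batch, reading the two products as running over images and concepts with labels $y_i \in \{+1, -1\}$. It uses the fact that $p(y_i = -1 \mid X) = 1 - p(y_i = 1 \mid X)$ for a binary label. The function name `cbcf_objective` and the nested-list layout are assumptions for illustration; the boosting procedure that minimizes $J$ is not reproduced here.

```python
from typing import Sequence


def cbcf_objective(posteriors: Sequence[Sequence[float]],
                   labels: Sequence[Sequence[int]]) -> float:
    """Negative joint likelihood J over images (rows) and concepts (columns).

    posteriors[n][i] is p(y_i = 1 | X) for image n and concept i.
    labels[n][i] is the ground-truth label, +1 or -1.
    """
    likelihood = 1.0
    for p_row, y_row in zip(posteriors, labels):
        for p, y in zip(p_row, y_row):
            # The exponent (1 + y) / 2 is 1 for a positive label and 0 for a
            # negative one, so exactly one of the two factors is active.
            likelihood *= p ** ((1 + y) / 2) * (1.0 - p) ** ((1 - y) / 2)
    return -likelihood
```

A perfect set of posteriors gives $J = -1$, the minimum; in practice one would likely work with the negative log-likelihood to avoid underflow over many images.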

  16. Individual Methods: (2) CBCF
      Database & lexicon for context
      • Predefined lexicon to provide context (observation) -- 374 concepts from the LSCOM ontology: airplane, building, car, boat, person, outdoor, sports, etc.
      • Independent detector -- our baseline
      • Test concepts (update posteriors) -- the 39 concepts defined by NIST

  17. Individual Methods: (2) CBCF
      Experimental results over the TRECVID 2005 development set
      [Bar chart: AP (axis 0 to 1.2) per concept for concepts 1 to 39; series: independent detector, context-based fusion (Boosted CRF)]
      24 concepts improve, 15 degrade

  18. Selective Application of Context
      • Not every concept classification benefits from context-based fusion
      Consistent with previous context-based fusion:
      • IBM: no more than 8 out of 17 concepts gained performance [Amir et al., TRECVID Workshop, 2003]
      • Mediamill: 80 out of 101 concepts [Snoek et al., TRECVID Workshop, 2005]
      • Is there a way to predict when it works?

  19. Predict When Context Helps
      Why may CBCF not help every concept?
      • Complex inter-conceptual relationships vs. limited training samples
      • Strong classifiers may suffer from fusion with weak context
      Avoid using CBCF for $C_i$ if $C_i$ is strong and has weak context
      Use CBCF for concept $C_i$ if $C_i$ is weak or has strong context:
      $$E(C_i) > \lambda \;\; \text{(weak concept)} \qquad \text{or} \qquad \frac{\sum_{C_j,\, j \neq i} I(C_i; C_j)\, E(C_j)}{\sum_{C_j,\, j \neq i} I(C_i; C_j)} < \beta \;\; \text{(strong context)}$$
      • $I(C_i; C_j)$ -- mutual information between $C_i$ and $C_j$
      • $E(C_i)$ -- error rate of independent detector for $C_i$
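The selection rule on this slide can be sketched in a few lines. The code assumes a precomputed symmetric mutual-information matrix and a list of per-concept error rates; the names `context_quality` and `use_cbcf`, and the choice to treat a concept with no informative context as having arbitrarily poor context, are assumptions for illustration rather than details from the slides.

```python
from typing import Sequence


def context_quality(i: int,
                    mutual_info: Sequence[Sequence[float]],
                    error_rate: Sequence[float]) -> float:
    """Mutual-information-weighted average error rate of the other concepts.

    Smaller values mean the context around concept i is both related and reliable.
    """
    others = [j for j in range(len(error_rate)) if j != i]
    weight_sum = sum(mutual_info[i][j] for j in others)
    if weight_sum == 0:
        # Assumption: no related concepts at all counts as the worst possible context.
        return float("inf")
    weighted_error = sum(mutual_info[i][j] * error_rate[j] for j in others)
    return weighted_error / weight_sum


def use_cbcf(i: int,
             mutual_info: Sequence[Sequence[float]],
             error_rate: Sequence[float],
             lam: float,
             beta: float) -> bool:
    """Apply CBCF when the concept is weak (E > lam) or its context is strong (quality < beta)."""
    is_weak_concept = error_rate[i] > lam
    has_strong_context = context_quality(i, mutual_info, error_rate) < beta
    return is_weak_concept or has_strong_context
```

Sweeping `lam` and `beta` changes how many concepts are selected, which is the knob the next slide varies.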

  20. Predict When Context Helps
      Change parameters to predict different numbers of concepts:
      # predicted | # concepts improved | precision of prediction | MAP gain
      39          | 24                  | 62%                     | 3.0%
      20          | 15                  | 75%                     | 9.5%
      16          | 14                  | 88%                     | 14%
      9           | 9                   | 100%                    | 7.2%
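The precision figures on this slide appear to be the fraction of predicted concepts that actually improved. A quick arithmetic check under that reading, with hypothetical names `ROWS` and `precision_percent`:

```python
import math

# (number of concepts predicted to benefit, number that actually improved)
ROWS = [(39, 24), (20, 15), (16, 14), (9, 9)]


def precision_percent(predicted: int, improved: int) -> int:
    """Improved / predicted as a percentage, rounded half up to a whole number."""
    return math.floor(improved / predicted * 100 + 0.5)


percentages = [precision_percent(predicted, improved) for predicted, improved in ROWS]
```

Under this reading the computed values agree with the 62%, 75%, 88%, and 100% shown on the slide.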

  21. Example Fighter_Combat Military Individual House . . .

  22. Independent Detector Example

  23. Context-based concept fusion Example

  24. Context-based concept fusion Example: House

  25. Context-based concept fusion Example: positive frames are moved forward with the help of Fighter_Combat

  26. Context-Based Fusion + Baseline
      TRECVID 2005 development set
      [Bar chart: per-concept scores (axis 0 to 1) for concepts 1 to 16; series: baseline, context; run labels: R6, R5]
      All get improved! MAP gain: 14%

  27. Context-Based Fusion + Baseline
      TRECVID 2006 evaluation, 4 concepts
      [Bar chart: AP (axis 0 to 0.3) for concepts 1 to 4; series: baseline, context]
      Similar to results over the TRECVID 2005 set!

  28. Discussion
      Quality of context (the smaller the better):
      $$\frac{\sum_{C_j,\, j \neq i} I(C_i; C_j)\, E(C_j)}{\sum_{C_j,\, j \neq i} I(C_i; C_j)}$$
      • Concepts with performance improved: 3.23
      • Concepts with performance degraded: 4.17
      Adding context – strong relationship and robust

  29. Outline – New Algorithms • Baseline • Context-based concept fusion (CBCF) • Lexicon-spatial pyramid matching (LSPM) • Text features • Event detection

  30. Outline – New Algorithms • Baseline • Context-based concept fusion (CBCF) • Lexicon-spatial pyramid matching (LSPM) • Text features • Event detection

  31. Individual Methods: (3) LSPM
      • Local features (SIFT) plus spatial layout (example regions: sky, tree, water)
      • Spatial Pyramid Matching (SPM) [Lazebnik et al., CVPR 2006]: multi-resolution histogram matching in the spatial domain over bags-of-features
      • Open question: what is the appropriate size for the visual lexicon?
      • Lexicon-Spatial Pyramid Matching (LSPM): SPM matching guided by multi-resolution lexicons
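For readers unfamiliar with SPM, the sketch below shows the standard spatial pyramid match kernel of Lazebnik et al. with histogram intersection, which LSPM builds on. It does not implement the lexicon-guided extension, whose details are not given here. Features are assumed to be already quantized into visual-word indices with coordinates normalized to [0, 1]; the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Feature = Tuple[float, float, int]  # (x, y, visual word index), x and y in [0, 1]


def level_histogram(features: Iterable[Feature], level: int) -> Counter:
    """Count visual words per cell of a 2^level x 2^level spatial grid."""
    cells = 2 ** level
    hist: Counter = Counter()
    for x, y, word in features:
        cx = min(int(x * cells), cells - 1)  # clamp x == 1.0 into the last cell
        cy = min(int(y * cells), cells - 1)
        hist[(cx, cy, word)] += 1
    return hist


def intersection(h1: Counter, h2: Counter) -> int:
    """Histogram intersection: sum of bin-wise minima (missing bins count as 0)."""
    return sum(min(count, h2[key]) for key, count in h1.items())


def spatial_pyramid_kernel(feats_a: Iterable[Feature],
                           feats_b: Iterable[Feature],
                           levels: int = 2) -> float:
    """Weighted sum of intersections; finer grid levels receive larger weights."""
    feats_a, feats_b = list(feats_a), list(feats_b)
    total = 0.0
    for level in range(levels + 1):
        if level == 0:
            weight = 1 / 2 ** levels
        else:
            weight = 1 / 2 ** (levels - level + 1)
        total += weight * intersection(level_histogram(feats_a, level),
                                       level_histogram(feats_b, level))
    return total
```

Two images with the same visual words in the same places score higher than two images with the same words in different places, which is the spatial sensitivity that a plain bag-of-features histogram lacks.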

  32. Individual Methods: (3) LSPM
      [Diagram: SIFT features mapped to lexicon level 0 with words t_1, t_2, t_3, t_4, t_5, ..., t_n, and to lexicon level 1 with words t_1_1, t_1_2, t_2_1, t_2_2, t_3_1, t_3_2, t_4_1, t_4_2, t_5_1, t_5_2, ..., t_n_1, t_n_2]
