TRECVID2008: MCG-ICT-CAS Concept Detection Based on LDA-SVM

Sheng Tang (唐胜), Jin-Tao Li (李锦涛), Ming Li (李明), Cheng Xie (谢呈), Yi-Zhi Liu (刘毅志), Kun Tao (陶焜), Shao-Xi Xu (徐邵稀)
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, 100080
Email: ts@ict.ac.cn  Tel: 8610-62600617

For the TRECVID 2008 concept detection task, we focus on:
• To improve training efficiency and to explore the knowledge shared between concepts or hidden sub-domains, we propose a novel method based on Latent Dirichlet Allocation: LDA-based Multiple-SVM (LDA-SVM);
• Early fusion of texture, edge and color features, TECM: TF*IDF weights based on SIFT features + Edge Histogram + Color Moments;
• Introduction of Pseudo Relevance Feedback (PRF) into our concept detection system to make the re-trained models more adaptive to the test data.
1 LDA-SVM

1.1 Flowchart of LDA-SVM

Fig 1 Flowchart of LDA-SVM (LDA assigns each keyframe a TRV over topics T1 ... TN; per-topic classifiers SVM 1 ... SVM N score the keyframe, and their outputs are fused)

1.2 Topic-simplex Representation Vector (TRV)

Fig 2 TRV of frames in a topic

1.3 Our Novelties

• Sample's separability-keeping strategy during training
Unlike multi-bag SVM, we use only the positive samples within the current topic, instead of all positive samples in the whole training set, so as to retain the samples' separability, and we ignore topics with too few positive samples.

• TRV-weight-based fusion strategy during testing
When testing a keyframe for a given concept, we adopt its TRV as the weight vector, instead of an equal-weighting strategy, to combine the SVM outputs of the topic models.
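The TRV-weight-based fusion strategy above can be sketched as follows. This is a minimal illustration, not the team's implementation: the function name `trv_weighted_fusion` and the example numbers are hypothetical; the inputs are assumed to be a keyframe's LDA topic mixture and the decision values of the per-topic SVMs.

```python
def trv_weighted_fusion(trv, topic_scores):
    """Combine per-topic SVM outputs for one keyframe.

    trv          : the keyframe's Topic-simplex Representation Vector,
                   i.e. its topic mixture from LDA (non-negative, sums to 1).
    topic_scores : decision values of the N per-topic concept SVMs.

    Each topic model's score is weighted by how strongly the keyframe
    belongs to that topic, instead of weighting all models equally.
    """
    return sum(w * s for w, s in zip(trv, topic_scores))

# Hypothetical keyframe dominated by topic 0 (3 topics):
score = trv_weighted_fusion([0.7, 0.2, 0.1], [1.5, -0.3, 0.2])
```

A keyframe that barely belongs to a topic thus contributes little of that topic model's (possibly unreliable) score, which is the point of using the TRV rather than equal weights.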
2 System Overview

2.1 Early Fusion

Early fusion of texture, edge and color features, TECM (890 dims), an abbreviation of the combination of:
• TF*IDF weights based on SIFT features (345 dims)
• Edge Histogram (320 dims)
• Color Moments (225 dims)

2.2 Novel LDA-SVM Detection Method

• LDA clustering
After quantization of the TF*IDF weights, we use Latent Dirichlet Allocation to cluster all the keyframes into 20 topics according to the maximum element of the TRV of each keyframe.
• SVM Training
Sample's separability-keeping strategy: for all the 20 concepts, we get 344 models after removing 56 topics with no more than 1 positive sample.
• SVM Test
TRV-weight-based fusion strategy.

2.3 Pseudo Relevance Feedback (PRF)

Unlike existing PRF techniques in text and video retrieval, we propose two preliminary strategies that exploit the visual features of the positive training samples to improve the quality of pseudo positive samples:
• Similarity-based method
Select pseudo positive samples by computing the feature similarities between the top-retrieved examples and the positive training samples after every retrieval pass.
• Detector-based method
Select pseudo positive samples through an overall evaluation of their positions in the ranked lists from several detectors.

2.4 Object-based Features

We train models with object-based TF*IDF features computed within the labeled rectangles of positive training samples. However, the result is not good, because such object-based features are unavailable for the test samples.
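The similarity-based PRF strategy in 2.3 can be sketched as below. This is a sketch under assumed details: cosine similarity is used as the feature similarity and a fixed threshold decides acceptance; the poster specifies neither, and all function and variable names here are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def select_pseudo_positives(top_retrieved, positives, threshold=0.8):
    """Keep a top-ranked test sample as a pseudo positive only if its
    features are close enough to at least one labeled positive sample.

    top_retrieved : list of (sample_id, feature_vector) pairs from the
                    top of the current retrieval run.
    positives     : feature vectors of the positive training samples.
    """
    selected = []
    for sample_id, feat in top_retrieved:
        if max(cosine(feat, p) for p in positives) >= threshold:
            selected.append(sample_id)
    return selected
```

The accepted pseudo positives would then be added to the positive set before re-training, which is the adaptation step that 2.3 describes.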
3 Annotation & Experiments

3.1 Annotation of Training Data

To encourage researchers to propose methods that extract features from objects rather than the whole frame, we divided the 20 concepts into two groups:
(1) Object-related concepts
(2) Scene-related concepts

Fig 3 The interface for our annotation

3.2 InfAP of Our Runs

HLF run   InfAP   Description
A_ICT_1   0.048   Visual Baseline
A_ICT_2   0.038   LocalizationClassifier
A_ICT_3   0.065   TECM_LDA_SVM
A_ICT_4   0.037   TECM_LDA_SVM_PRF
A_ICT_5   0.076   TECM_LDA_SVM+Baseline
A_ICT_6   0.078   Fusion All

3.3 Result Analysis

• Effective: 35.4% improvement (run 3 vs. run 1)
• Efficient:
  • Topic size is much smaller
  • Samples in each topic are of higher separability
  • SVM training is very efficient: only about 20 minutes for all 344 models on our cluster server (dual-core 1.8 GHz × 15)
  • Employing all samples in each topic for cross-validation becomes practicable (about 12 hours for all 344 models on our cluster server)

3.4 Conclusion

⑴ Early fusion into TECM, clustering via LDA, the sample's separability-keeping strategy, and the TRV-weight-based fusion strategy together contribute to the high efficiency and effectiveness of our proposed method.
⑵ The method for determining the number of hidden topics should be studied carefully for further improvement.
⑶ The PRF method is not stable, since the introduction of pseudo positive samples may ruin the separability of topic samples.
⑷ More frames per shot should be used for the test data.
⑸ The LIG annotations should be combined with ours to remove false annotations.