New Challenges in Semantic Concept Detection

  1. New Challenges in Semantic Concept Detection
     M.-F. Weng, C.-K. Chen, Y.-H. Yang, R.-E. Fan, Y.-T. Hsieh, Y.-Y. Chuang, W. H. Hsu, and C.-J. Lin, National Taiwan University

     Preliminaries for Semantic Concept Detection
     • What are the preliminaries for building a semantic concept detection system?
       – A lexicon of well-defined concepts
       – Training resources
         • Video data
         • Annotations
         • Features
       – Tools
         • Tagging or labeling tools (e.g., the CMU and IBM tools)
         • Feature extractors
         • Machine learning tools (e.g., LIBSVM)
         • Tools tailored for semantic concept detection

  2. Semantic Concepts
     Concept lexicons: LSCOM (449 concepts), Columbia374, MediaMill 101, TRECVID-2006 LSCOM-Lite (39 concepts), TRECVID-2005 (10 concepts)

     Video Data Sets
     [Bar chart: video length (hours) of the development and test sets for TRECVID 2003–2007. TV 07: Sound and Vision video (100-hour devel. and test sets); TV 05/06: multi-lingual broadcast news video (~330 hours); TV 03/04: broadcast news video (~190 hours)]

  3. Annotations
     [Bar chart: video length (hours) covered by the common annotations and the NIST truth judgments for the TV 05/06/07 development and test sets]

     Features, Detectors, Scores
     – MediaMill Baseline: visual and text features; 5 sets of 101 classifiers; scores for the TV 05/06 datasets
     – Columbia374: EDH, GBR, and GCM features; Columbia374 detectors; scores for the TV 06/07 datasets
     – VIREO-374: color moment, wavelet texture, and keypoint features; VIREO-374 detectors; scores for the TV 07 dataset

  4. Available Resources
     Well-defined concepts; training resources (video data, annotations, features); tools (tagging tools, feature extractors, machine learning tools, tailored tools)
     • Concept definitions are sufficient
     • Training resources are plentiful
     • No feature extractors or tailored tools are available

     The New Challenges
     • Challenge 1: Easy and Efficient Tools
       – L datasets, M concepts, and N features imply L×M×N classifiers (see the example below)
       – Each classifier has many parameters to consider
       – Time is very limited for validating each parameter and for training all classifiers
     • Challenge 2: Resource Exploitation or Reuse
       – Resources are precious
       – Existing resources are potentially useful for a new dataset
       – Plentiful resources have not been fully utilized
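     As a rough, purely illustrative sense of scale (these numbers are not taken from the slides): with L = 2 datasets, M = 374 concepts, and N = 6 feature sets, L × M × N = 2 × 374 × 6 = 4,488 classifiers would have to be trained and validated.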

  5. Facing the New Challenges
     • Challenge 1:
       – Extended LIBSVM to improve training efficiency
       – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
     • Challenge 2:
       – Reused classifiers trained on past data to improve accuracy through late aggregation
       – Exploited contextual relationships and temporal dependencies in the annotations to boost accuracy

     Facing the New Challenges
     • Challenge 1: Easy and Efficient Tools
       – Extended LIBSVM to improve training efficiency
       – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
     • Challenge 2:
       – Reused classifiers trained on past data to improve accuracy through late aggregation
       – Exploited contextual relationships and temporal dependencies in the annotations to boost accuracy

  6. A Tailored Toolkit
     • We extended LIBSVM in three aspects for semantic concept detection:
       – Using dense data representations
       – Exploiting the parallelism of independent concepts, features, and SVM model parameters
       – Narrowing the parameter search to a safe range
     • Overall, the training time of our baseline was reduced from roughly 14 days to about 3 days (a parallel-training sketch follows after the outline below)

     Facing the New Challenges
     • Challenge 1:
       – Extended LIBSVM to improve training efficiency
       – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
     • Challenge 2: Resource Exploitation or Reuse
       – Reused classifiers trained on past data to improve accuracy through late aggregation
       – Exploited contextual relationships and temporal dependencies in the annotations to boost accuracy
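     The slides give no code for the toolkit, so the following is only a minimal sketch of the parallelism idea, assuming LIBSVM's Python interface (svm_read_problem and svm_train from libsvm.svmutil). The concept list, the per-concept "<concept>.train" file names, and the narrowed (C, gamma) grid are placeholders, not the toolkit's actual values.

```python
# Sketch: grid-search each concept's SVM in parallel over a narrowed
# (C, gamma) range, using the LIBSVM Python interface.
# Hypothetical layout: one "<concept>.train" file per concept in LIBSVM format.
import itertools
from multiprocessing import Pool

from libsvm.svmutil import svm_read_problem, svm_train

CONCEPTS = ["sports", "weather", "maps", "explosion"]  # placeholder lexicon
C_RANGE = [1, 4, 16]                                   # narrowed grid (illustrative)
GAMMA_RANGE = [0.03125, 0.125, 0.5]                    # narrowed grid (illustrative)

def cross_validate(task):
    """Return 5-fold cross-validation accuracy for one (concept, C, gamma)."""
    concept, c, gamma = task
    y, x = svm_read_problem(f"{concept}.train")
    accuracy = svm_train(y, x, f"-c {c} -g {gamma} -v 5 -q")
    return concept, c, gamma, accuracy

if __name__ == "__main__":
    tasks = list(itertools.product(CONCEPTS, C_RANGE, GAMMA_RANGE))
    with Pool() as pool:                 # concepts and parameters are independent
        results = pool.map(cross_validate, tasks)

    best = {}                            # best (C, gamma) per concept
    for concept, c, gamma, accuracy in results:
        if concept not in best or accuracy > best[concept][2]:
            best[concept] = (c, gamma, accuracy)
    for concept, (c, gamma, accuracy) in best.items():
        print(f"{concept}: C={c} gamma={gamma} CV accuracy={accuracy:.2f}%")
```

     Because every (concept, feature, parameter) combination is independent, this kind of task-level parallelism needs no change to LIBSVM itself; the narrowed grid simply keeps the number of tasks per classifier small.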

  7. Reuse Past Data
     • Early aggregation
       – Must re-train the classifiers
       – Causes considerable training time
     • Late aggregation
       – Simple and direct
       – May be biased

     Late Aggregation
     • We adopt late aggregation to reuse existing classifiers through two strategies (see the sketch below):
       – Equally average aggregation
         • Simply average the scores of the past and newly trained classifiers
       – Concept-dependent weighted aggregation
         • Use concept-dependent weights to aggregate the classifiers
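     The slides describe the two strategies only at this level, so here is a minimal sketch of what they could look like on per-shot score arrays; the score values and the example weight w=0.3 are made up for illustration.

```python
import numpy as np

def average_aggregation(old_scores, new_scores):
    """Equally average aggregation: mean of the past and new classifier scores."""
    return (np.asarray(old_scores, float) + np.asarray(new_scores, float)) / 2.0

def weighted_aggregation(old_scores, new_scores, w):
    """Concept-dependent weighted aggregation: w is chosen per concept
    (e.g., from validation performance of the reused classifier)."""
    return w * np.asarray(old_scores, float) + (1.0 - w) * np.asarray(new_scores, float)

# Example scores for five shots of one concept.
old = [0.20, 0.70, 0.55, 0.10, 0.80]   # classifier reused from past data
new = [0.30, 0.60, 0.40, 0.05, 0.90]   # classifier trained on the new data
print(average_aggregation(old, new))
print(weighted_aggregation(old, new, w=0.3))   # weight picked per concept
```

     Choosing the weight per concept on held-out data is one way to realize the "concept-dependent" part while avoiding the retraining cost of early aggregation.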

  8. Aggregation Benefits
     [Bar chart: per-concept infAP of the TV 07 classifiers under average vs. weighted aggregation. Overall improvement ratio: average aggregation 22%, weighted aggregation 30%]

     Facing the New Challenges
     • Challenge 1:
       – Extended LIBSVM to improve training efficiency
       – Developed an efficient and easy-to-use toolkit tailored for semantic concept detection
     • Challenge 2: Resource Exploitation or Reuse
       – Reused classifiers trained on past data to improve accuracy through late aggregation
       – Exploited contextual relationships and temporal dependencies in the annotations to boost accuracy

  9. Observation in Annotations
     A sequence of video shots, annotated with a lexicon of concepts:
       car      1 1 1 1 1 1 1 1
       outdoor  1 1 1 1 1 1 1 1
       urban    1 1 0 1 1 1 1 1
       building 1 0 0 1 1 1 1 0
       sky      0 1 1 1 0 1 1 0
       people   0 0 0 1 0 1 1 0
     • Temporal dependency along the shot sequence; contextual relationships among the concepts (a small co-occurrence sketch follows below)

     Post-processing Framework
     [Framework diagram with three phases (detecting, mining, and processing); its components include feature extraction, concept detection, shot ranking, video segmentation, annotation/sequence mining, temporal dependency, temporal filter design, temporal filtering, unsupervised fusion, contextual reranking, and combination]
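     As one way to make the "contextual relationship" observation concrete, the sketch below counts pairwise concept co-occurrences in the annotation matrix shown above; the counting scheme is illustrative only and is not claimed to be the authors' actual mining step.

```python
import numpy as np

# Binary annotation matrix from the slide: rows are concepts, columns are shots.
concepts = ["car", "outdoor", "urban", "building", "sky", "people"]
annotations = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1],   # car
    [1, 1, 1, 1, 1, 1, 1, 1],   # outdoor
    [1, 1, 0, 1, 1, 1, 1, 1],   # urban
    [1, 0, 0, 1, 1, 1, 1, 0],   # building
    [0, 1, 1, 1, 0, 1, 1, 0],   # sky
    [0, 0, 0, 1, 0, 1, 1, 0],   # people
])

# Contextual relationship: how often two concepts are labeled in the same shot.
cooccurrence = annotations @ annotations.T
n_shots = annotations.shape[1]
for i in range(len(concepts)):
    for j in range(i + 1, len(concepts)):
        print(f"{concepts[i]:>8} & {concepts[j]:<8}: {cooccurrence[i, j]} of {n_shots} shots")
```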

  10. Temporal Filtering
     [The post-processing framework diagram from the previous slide, repeated]

     Temporal Dependency
     • Different concepts show different levels of dependency at different temporal distances (see the sketch below)
       – e.g., sports, weather, maps, explosion
     [Line chart of a chi-square test: χ²_k versus temporal distance k = 1…20 for sports, weather, maps, and explosion]
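     One plausible reading of the χ²_k curve is a chi-square test on the 2×2 contingency table of a concept's labels at shots t and t+k. The sketch below computes that statistic on a synthetic label sequence; the Markov-style sequence and the distances tested are assumptions, not the slide's data.

```python
import numpy as np

def temporal_chi2(labels, k):
    """Chi-square statistic between a concept's labels at shots t and t+k."""
    labels = np.asarray(labels)
    a, b = labels[:-k], labels[k:]
    # Observed 2x2 contingency table of (label at t, label at t+k).
    observed = np.array([[np.sum((a == i) & (b == j)) for j in (0, 1)]
                         for i in (0, 1)], dtype=float)
    # Expected counts under independence, then the chi-square statistic.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
    return float(np.sum((observed - expected) ** 2 / expected))

# Hypothetical label sequence with short runs of positives, so that nearby
# shots are correlated and the dependency decays with temporal distance.
rng = np.random.default_rng(0)
state, labels = 0, []
for _ in range(2000):
    if rng.random() < 0.1:          # occasionally switch between runs of 0s and 1s
        state = 1 - state
    labels.append(state)

for k in (1, 5, 10, 20):
    print(f"temporal distance {k:2d}: chi2 = {temporal_chi2(labels, k):8.1f}")
```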

  11. Temporal Filter
     [Diagram: an SVM classifier outputs P(l_t = 1 | x_t) for each shot in a window x_{t−k}, …, x_{t−1}, x_t, x_{t+1}, …, x_{t+k}; these probabilities are combined with symmetric weights w_k, …, w_1, w_0, w_1, …, w_k]

     P̂(l_t = 1) = Σ_{k=−d…d} w_{|k|} · P(l_{t+k} = 1 | x_{t+k})

     Filtering Prediction
     • A sequence of shots for predicting sports (an application sketch follows below)
       – Classifier prediction results: [bar chart of the raw scores for 20 consecutive shots]
       – After temporal filtering: [bar chart of the filtered scores for the same shots, annotated "Rank rose" and "Misclassification picked up"]
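     A small sketch of applying such a filter to a concept's per-shot probabilities, assuming the symmetric window and weights shown above; the window size, weight values, and raw scores are illustrative only, not the authors' settings.

```python
import numpy as np

def temporal_filter(scores, weights):
    """Re-estimate P(l_t = 1) as a weighted sum of the classifier's probabilities
    over a symmetric window: weights[k] multiplies the scores at distance k
    on both sides of shot t (weights[0] is the center tap w_0)."""
    scores = np.asarray(scores, dtype=float)
    d = len(weights) - 1
    padded = np.pad(scores, d, mode="edge")         # repeat the border shots
    filtered = weights[0] * scores
    for k in range(1, d + 1):
        left = padded[d - k:d - k + len(scores)]    # P(l_{t-k} = 1 | x_{t-k})
        right = padded[d + k:d + k + len(scores)]   # P(l_{t+k} = 1 | x_{t+k})
        filtered = filtered + weights[k] * (left + right)
    return filtered

# Illustrative per-shot probabilities for "sports" and decaying weights.
raw = np.array([0.1, 0.7, 0.6, 0.1, 0.8, 0.75, 0.05, 0.7, 0.65, 0.6])
weights = [0.4, 0.2, 0.1]          # w_0, w_1, w_2; all taps sum to 1.0
print(np.round(temporal_filter(raw, weights), 3))
```

     Smoothing over neighboring shots is what lets a missed positive surrounded by high-scoring shots rise in rank, as the "Filtering Prediction" slide illustrates.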
