Context-based Visual Concept Detection Using Domain Adaptive Semantic Diffusion
Yu-Gang Jiang ‡, Jun Wang ‡, Shih-Fu Chang ‡, Chong-Wah Ngo †
† VIREO Research Group (VIREO), City University of Hong Kong
‡ Digital Video and Multimedia Lab (DVMM), Columbia University
NIST TRECVID Workshop, Nov. 2009

Overview: framework
[Framework diagram. Components: Local Feature, Global Feature, SVM Classifiers, Domain Adaptive Semantic Diffusion, VIREO-374 (374 LSCOM concept detectors). Diagram labels: 6, 5, 1-4]

Overview: performance
[Plot: Mean Inferred Average Precision (0.00 to 0.25) across 222 system runs; highlighted runs: local feature alone, local + global features, DASD]
• Local feature is still the most powerful component (MAP=0.150)
• Global features help a little bit (MAP=0.156)
• DASD further contributes incrementally to the final detection

Local feature representation
[Figure: SIFT feature space]
Chang et al., TRECVID 2008; Jiang, Yang, Ngo & Hauptmann, IEEE TMM, to appear

Context-based concept detection
[Framework diagram. Components: Local Feature, Global Feature, SVM Classifiers, DASD: Domain Adaptive Semantic Diffusion, VIREO-374 (374 LSCOM concept detectors). Diagram labels: 6, 5, 1-4]

DASD - motivation
• Most existing methods aim at the assignment of concept labels individually
  – but concepts do not occur in isolation!
[Example image with co-occurring labels: military personnel, smoke, building, explosion_fire, vehicle, road, outdoor]

DASD - motivation
• Most existing methods aim at the assignment of concept labels individually
  – but concepts do not occur in isolation!
• Domain change between training and testing data was not considered
[Example images: Broadcast News Videos vs. Documentary Videos]

DASD - overview
[Figure: example keyframes with detection scores for the concepts road, vehicle, sky and water]
Jiang, Wang, Chang & Ngo, ICCV 2009

DASD - overview
• Domain adaptive semantic diffusion (DASD)
  – Semantic graph
    • Nodes are concepts
    • Edges represent concept correlation
  – Graph diffusion
    • Smooth concept detection scores w.r.t. the concept correlation
[Figure: semantic graph over road, vehicle, water and sky, with edge weights and per-concept detection scores]

DASD - formulation
• Energy function
[Equation not recovered from the slide. Annotations: detection score of concept c_i on test samples; concept affinity]

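The energy function itself is an image on the slide and cannot be recovered from this text. Purely as a hedged illustration, a graph-smoothness energy consistent with the two annotations (detection scores and concept affinities) could take the form below. The symbols g(c_i), W_ij and m are assumptions introduced here, and the published formulation (Jiang, Wang, Chang & Ngo, ICCV 2009) may differ, for example by normalizing each score vector.

```latex
E(g, W) = \frac{1}{2} \sum_{i,j=1}^{m} W_{ij} \, \bigl\lVert g(c_i) - g(c_j) \bigr\rVert^{2}
```

Here g(c_i) stands for the vector of detection scores of concept c_i over the test samples and W_ij for the affinity between concepts c_i and c_j. Under this assumed form, E is small when strongly related concepts receive similar scores.
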
DASD - formulation (cont.)
• Gradually smoothing the function brings the detection scores into accordance with the concept relationships
[Equation not recovered from the slide. Annotation: detection score smoothing process]

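The smoothing update on this slide is not recoverable from the text, so the following is only a minimal sketch of how such a process might look. It assumes an unnormalized smoothness energy, symmetric non-negative affinities, and plain gradient descent; the function names, step size, iteration count, and toy numbers are all hypothetical and are not taken from the authors' method or code.

```python
import numpy as np

def smoothness_energy(G, W):
    """Assumed energy: 0.5 * sum_ij W[i, j] * ||G[i] - G[j]||^2,
    where G is an (m concepts x n shots) score matrix."""
    diff = G[:, None, :] - G[None, :, :]          # pairwise differences, shape (m, m, n)
    return 0.5 * float(np.sum(W * np.sum(diff ** 2, axis=2)))

def diffuse_scores(G, W, step=0.05, iters=20):
    """Gradient descent on the assumed energy with respect to the scores."""
    G = np.asarray(G, dtype=float).copy()
    L = np.diag(W.sum(axis=1)) - W                # graph Laplacian of symmetric W
    for _ in range(iters):
        G -= step * 2.0 * (L @ G)                 # dE/dG = 2 L G for this energy
    return G

# Hypothetical toy data: rows are road, vehicle, sky; columns are three shots.
W = np.array([[0.0, 0.8, 0.1],
              [0.8, 0.0, 0.1],
              [0.1, 0.1, 0.0]])
G0 = np.array([[0.9, 0.2, 0.1],
               [0.3, 0.1, 0.2],
               [0.2, 0.8, 0.9]])
G1 = diffuse_scores(G0, W)
print("energy before:", smoothness_energy(G0, W))
print("energy after: ", smoothness_energy(G1, W))
```

In this toy setup the weak vehicle score on the first shot is pulled up by the strongly related road concept. The iteration count is kept small on purpose: running this particular update to convergence would flatten every concept to the same score on each shot.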
DASD - formulation (cont.)
• Graph adaptation
[Equation not recovered from the slide. Annotation: graph adaptation process]

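The graph adaptation update is likewise missing from the extracted text. One plausible reading, shown here strictly as an assumption, is a gradient step on the affinities themselves: edges whose two concepts disagree on the test-domain scores lose weight faster than edges whose concepts agree. The function name, step size, and clipping to non-negative values are hypothetical choices for this sketch, which uses a simplified unnormalized energy and therefore can only lower affinities.

```python
import numpy as np

def adapt_graph(W, G, step=0.1):
    """One gradient step on the affinities for the assumed energy
    E = 0.5 * sum_ij W[i, j] * ||G[i] - G[j]||^2 (W[i, j] treated as free variables)."""
    diff = G[:, None, :] - G[None, :, :]          # pairwise differences, shape (m, m, n)
    grad = 0.5 * np.sum(diff ** 2, axis=2)        # dE/dW[i, j]
    W_new = W - step * grad
    np.fill_diagonal(W_new, 0.0)                  # no self-affinity
    return np.clip(W_new, 0.0, None)              # keep affinities non-negative

# Hypothetical toy data: rows are road, vehicle, sky; columns are four test shots.
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
G = np.array([[0.9, 0.8, 0.1, 0.2],
              [0.8, 0.9, 0.2, 0.1],
              [0.1, 0.2, 0.9, 0.8]])
W_adapted = adapt_graph(W, G)
print(W_adapted)
```

In the toy data, road and vehicle score alike on the test shots, so their edge barely changes, while the road-sky edge is reduced noticeably.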
Graph adaptation - example
[Figure: semantic graph over WEAPON, CLOUDS, DESERT, CAR, SKY, VEHICLE and PARKING_LOT, with edge affinities shown at iterations 0, 4, 8, 12, 16 and 20 as the graph adapts from the broadcast news video domain to the documentary video domain]

Experiments on TV ’05-’07
• Baseline detectors
  – VIREO-374
• Graph construction:
  – Ground-truth labels on TRECVID 2005
[Figure: example concepts for TRECVID 05/06 (Broadcast News Videos) and TRECVID 07 (Documentary Videos). Labels shown: WALKING, MAP, SPORTS, WEATHER, OFFICE, BUS, CLASSROOM, PEOPLE-MARCHING, CORP. LEADER, DESERT, MOUNTAIN, WATER, NIGHT TIME, TELEPHONE, EXPLOSION-FIRE, TRUCK, BUILDING, ANIMAL, TWO PEOPLE, STREET, POLICE, MILITARY]

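The slide says only that the graph is built from ground-truth labels on TRECVID 2005. A minimal sketch of one way to turn such labels into concept affinities is given below; using the Pearson correlation between binary label vectors is an assumption made for illustration, and the function name and toy labels are hypothetical.

```python
import numpy as np

def concept_affinity(labels):
    """Pearson correlation between the binary label vectors of each concept pair.

    labels: (m concepts x N training shots) array of 0/1 ground-truth annotations.
    Every concept needs at least one positive and one negative shot,
    otherwise its correlation is undefined.
    """
    W = np.corrcoef(np.asarray(labels, dtype=float))  # rows are treated as variables
    np.fill_diagonal(W, 0.0)                          # drop self-correlation
    return W

# Hypothetical annotations: rows are road, vehicle, sky; columns are six shots.
labels = np.array([[1, 1, 0, 0, 1, 0],
                   [1, 1, 0, 0, 0, 0],
                   [0, 0, 1, 1, 0, 1]])
W = concept_affinity(labels)
print(np.round(W, 2))
```

Concepts that tend to be annotated on the same shots get a positive affinity and concepts that rarely co-occur get a negative one. How negative values are handled during diffusion is a separate design choice that this sketch does not address.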
Results on TV ’05-’07
• Performance gain on TRECVID 05-07

  Datasets                   TRECVID-2005   2006     2007
  # of evaluated concepts    39             20       20
  Baseline (MAP)             0.166          0.154    0.099
  SD                         11.8%          15.6%    12.1%
  DASD                       11.9%          17.5%    16.2%

• SD: semantic diffusion (without graph adaptation)
• Consistent improvement over all 3 data sets
• DASD: domain adaptive semantic diffusion
• Graph adaptation further improves the performance

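To make the table easier to read, the short calculation below converts the reported DASD gains into implied MAP values. It assumes the percentages are relative improvements over the baseline MAP, which the slide suggests ("performance gain") but does not state explicitly. The derived values are illustrative and are not figures reported on the slide.

```python
def map_after_gain(baseline_map, relative_gain_percent):
    """MAP implied by a relative gain over the baseline (assumed interpretation)."""
    return baseline_map * (1.0 + relative_gain_percent / 100.0)

# (baseline MAP, DASD gain in percent) as listed in the table above.
dasd_results = {
    "TRECVID-2005": (0.166, 11.9),
    "TRECVID-2006": (0.154, 17.5),
    "TRECVID-2007": (0.099, 16.2),
}

implied_map = {
    name: round(map_after_gain(baseline, gain), 3)
    for name, (baseline, gain) in dasd_results.items()
}
for name, value in implied_map.items():
    print(name, value)
```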
Results on TV ’05-’07 (cont.)
[Plot: Average Precision (0 to 0.5) on TRECVID 2006 test data, Baseline vs. Semantic Graph Diffusion]
Comparison with the state of the art

  TRECVID   Jiang et al   Aytar et al   Weng et al   DASD
  2005      2.2%          4.0%          N/A          11.9%
  2006      N/A           N/A           16.7%        17.5%

Results on TRECVID ’09
[Plot: results of runs A_vireo.localglobal_5 and A_vireo.dasd20fcs_2 (vertical axis 0 to 0.4); annotated gains: 10%, 5%, 30%]

Results on TRECVID ’09 (cont.)
• Quality of contextual detectors (VIREO-374)
[Plot: Mean Inferred Average Precision (0.00 to 0.25) across 222 system runs; annotated DASD performance gains: 5% (TV09), 16% (TV07), 18% (TV06); context detectors: VIREO-374]

DASD - computational time
• Complexity is O(mn)
  – m: # concepts; n: # video shots
• Only 2 milliseconds per shot/keyframe!

          TRECVID 05   TRECVID 06   TRECVID 07
  SD      59s          84s          12s
  DASD    89s          165s         28s

Summary
• A well-designed approach using local features achieves good results for concept detection.
• Context information is helpful!
  – Domain adaptive semantic diffusion
    • effective for enhancing concept detection accuracy
    • can alleviate the effect of data domain changes
    • highly efficient!
  – Future directions include:
    • detector reliability: diffusion over directed graph
    • web data annotation: utilize contextual information to improve the quality of tags
  – Source code available for download from DVMM lab research page