  1. On the use of semantic features for the semantic indexing task. Bahjat Safadi, Nadia Derbas, Abdelkader Hamadi, Mateusz Budnik, Philippe Mulhem and Georges Quénot (UJF-LIG), and many other people from the IRIM group of GDR 720 ISIS. 10 November 2014

  2. Outline
  • System overview
  • Semantic features
  • Contrast experiments
  • Engineered versus learned features
  • Conclusion

  3. Main runs scores 2014 (from NIST). [Bar chart of mean InfAP over all submitted runs, median = 0.206. Legend: non-LIG submitted runs in 2013 against 2014 testing data (progress runs); LIG submitted runs in 2014 against 2014 testing data (main runs); LIG (Quaero) submitted runs in 2013 against 2014 testing data (progress runs).]

  4. Basic classification pipeline: Text / Audio / Image → Descriptor extraction → Classification → Late fusion → Classification score
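As a rough illustration of this pipeline, here is a minimal Python sketch assuming one classifier per descriptor type and late fusion by weighted averaging of per-descriptor scores; the scikit-learn SVM, the data layout and the equal fusion weights are illustrative assumptions, not the actual IRIM implementation.

import numpy as np
from sklearn.svm import SVC

def train_per_descriptor(descriptors, labels):
    """Train one classifier per descriptor type (text, audio, image).
    descriptors: dict name -> (n_shots, dim) matrix; labels: (n_shots,)."""
    models = {}
    for name, X in descriptors.items():
        clf = SVC(kernel='rbf', probability=True)   # placeholder classifier
        clf.fit(X, labels)
        models[name] = clf
    return models

def late_fusion(models, descriptors, weights=None):
    """Average per-descriptor classification scores into one score per shot."""
    names = list(models)
    if weights is None:
        weights = {n: 1.0 / len(names) for n in names}   # equal weights
    return sum(weights[n] * models[n].predict_proba(descriptors[n])[:, 1]
               for n in names)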

  5. LIG/Quaero/IRIM classification pipeline + hierarchical fusion [Strat et al., ECCV/IFCVCR workshop 2012, Springer 2014]: Text / Audio / Image → Descriptor extraction → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Classification score
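One possible reading of this two-level fusion in code: scores of classifier/descriptor variants are first averaged within each descriptor family, and the per-family scores are then fused at a higher level. The grouping structure and plain averaging are assumptions; the cited work uses more elaborate weighted, hierarchical fusion.

import numpy as np

def hierarchical_fusion(variant_scores, groups):
    """Two-level fusion of per-shot scores.
    variant_scores: dict variant_name -> (n_shots,) score array
    groups:         dict family_name  -> list of variant names"""
    family_scores = {
        fam: np.mean([variant_scores[v] for v in members], axis=0)
        for fam, members in groups.items()           # fusion within each family
    }
    return np.mean(list(family_scores.values()), axis=0)   # higher-level fusion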

  6. LIG/Quaero/IRIM classification pipeline + temporal re-ranking [Safadi et al., CIKM 2011; Wang et al., TV 2009]: update shot scores considering other shots’ scores for a same concept. Text / Audio / Image → Descriptor extraction → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Re-ranking (re-scoring) → Classification score
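A minimal sketch of temporal re-scoring under the assumption that relevant shots cluster in time: each shot's score for a concept is mixed with the best score among its temporal neighbours in the same video. The window size and mixing weight alpha are illustrative, not the values from the cited papers.

import numpy as np

def temporal_rescore(scores, window=2, alpha=0.7):
    """scores: 1-D array of per-shot scores for one concept within one video."""
    rescored = np.empty_like(scores, dtype=float)
    for i in range(len(scores)):
        lo, hi = max(0, i - window), min(len(scores), i + window + 1)
        neighbours = np.concatenate([scores[lo:i], scores[i + 1:hi]])
        support = neighbours.max() if neighbours.size else 0.0
        rescored[i] = alpha * scores[i] + (1 - alpha) * support   # neighbour boost
    return rescored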

  7. LIG/Quaero/IRIM classification pipeline + descriptor optimization [Safadi et al., MTAP 2014]: combination of PCA-based dimensionality reduction and pre- and post-power transformations. Text / Audio / Image → Descriptor extraction → Descriptor transformation → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Re-ranking (re-scoring) → Classification score
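The descriptor optimization step can be sketched as a power ("root") transformation applied before PCA-based dimensionality reduction and again after it; the exponents and the target dimension below are placeholder values, not the ones tuned in [Safadi et al., MTAP 2014].

import numpy as np
from sklearn.decomposition import PCA

def optimize_descriptor(X, pre_power=0.5, post_power=0.5, dim=256):
    """X: (n_samples, n_features) raw descriptor matrix."""
    Xp = np.sign(X) * np.abs(X) ** pre_power          # pre-PCA power transform
    pca = PCA(n_components=dim, whiten=True).fit(Xp)  # dimensionality reduction
    Z = pca.transform(Xp)
    return np.sign(Z) * np.abs(Z) ** post_power       # post-PCA power transform

In practice the PCA would be fitted on the training descriptors only and reused unchanged on the test descriptors.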

  8. LIG/Quaero/IRIM classification pipeline + conceptual feedback [Hamadi et al., MTAP 2014]: Text / Audio / Image → Descriptor extraction → Descriptor transformation → Conceptual feedback → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Re-ranking (re-scoring) → Classification score

  9. LIG/Quaero/IRIM classification pipeline + conceptual re-ranking [Hamadi et al., MTAP 2014]: update concept scores considering other concepts’ scores for a same shot. Text / Audio / Image → Descriptor extraction → Descriptor transformation → Conceptual feedback → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Re-ranking (re-scoring) → Classification score
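A hedged sketch of conceptual re-scoring: a shot's score for one concept is adjusted with a weighted combination of its scores for related concepts (e.g. "boat" supported by "water"). The relation matrix and the mixing weight beta are assumptions; the cited work derives concept relations from annotation statistics.

import numpy as np

def conceptual_rescore(shot_scores, relation, beta=0.8):
    """shot_scores: (n_shots, n_concepts) initial scores.
    relation: (n_concepts, n_concepts), relation[i, j] = influence of
    concept j on concept i, each row assumed to sum to 1."""
    context = shot_scores @ relation.T                # contextual score per shot
    return beta * shot_scores + (1 - beta) * context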

  10. LIG/Quaero/IRIM classification pipeline + semantic descriptors [TRECVid 2013 and 2014]: Text / Audio / Image → Descriptor extraction → Descriptor transformation → Conceptual feedback → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Re-ranking (re-scoring) → Classification score

  11. Conceptual feedback: unfolded graph. [Diagram: the full pipeline (Descriptor extraction → Descriptor transformation → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Re-ranking (re-scoring)) is run twice on the Text / Audio / Image inputs; the classification score of iteration 0 (original) is reused as an input for iteration 1 (feedback).]

  12. Conceptual feedback: semantic descriptor (computed only once), shared components. [Diagram: iteration 0 and iteration 1 share the standard descriptor extraction processing (Image / Audio / Text → Descriptor extraction → Descriptor transformation → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Re-ranking (re-scoring)); the classification score of iteration 0 serves as a semantic descriptor for iteration 1.]

  13. Semantic descriptor: general case. Any classification system, using any approach, trained on any annotated data, for any target concept set, can be used for semantic descriptor extraction from the Image / Audio / Text input (model vectors [Smith et al., ICME 2003]); the resulting semantic descriptor then follows the standard descriptor processing (Descriptor extraction → Descriptor transformation → Classification → Descriptors and classifier variants fusion → Higher level hierarchical fusion → Re-ranking (re-scoring) → Classification score).
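In code, the model-vector idea amounts to concatenating the outputs of a bank of concept classifiers into a descriptor that the standard pipeline then treats like any other; the classifier-bank interface below (scikit-learn style decision_function) is an assumption for illustration.

import numpy as np

def semantic_descriptor(keyframe_features, concept_classifiers):
    """keyframe_features: low-level descriptor of one key frame, shape (dim,).
    concept_classifiers: models trained on some source concept set
    (e.g. ImageNet classes), each exposing decision_function()."""
    scores = [clf.decision_function(keyframe_features[None, :])[0]
              for clf in concept_classifiers]
    return np.asarray(scores)        # one dimension per source concept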

  14. Semantic descriptors trained on ImageNet
  • Fisher Vector based descriptors [Perronnin, IJCV 2013]:
  - XEROX/ilsvrc2010: vectors of 1000 scores trained on ILSVRC10 and applied to key frames, kindly produced by Florent Perronnin from Xerox (XRCE)
  - XEROX/imagenet10174: same with 10174 concept scores trained on ImageNet
  • Deep learning based descriptors, computed by Eurecom and LIG using the Berkeley caffe tool [Jia et al., 2013]:
  - EUR/caffe1000: vectors of 1000 scores trained on ILSVRC12 and applied to key frames, fusing the outputs for 10 variants of each input image
  - LIG/caffe1000b: same with a different version of the tool and using only one variant of each input image
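The "10 variants of each input image" fusion can be sketched as averaging the 1000-dimensional class-score vector over ten crops (four corners plus centre, each mirrored), which is the usual AlexNet recipe; the exact variant set used for EUR/caffe1000 is not specified here, so this is an assumption.

import numpy as np

def ten_crop_scores(image, crop_size, forward):
    """image: H x W x 3 array; forward(crop) -> 1000-dim class-score vector."""
    H, W = image.shape[:2]
    c = crop_size
    offsets = [(0, 0), (0, W - c), (H - c, 0), (H - c, W - c),
               ((H - c) // 2, (W - c) // 2)]          # 4 corners + centre
    scores = []
    for y, x in offsets:
        crop = image[y:y + c, x:x + c]
        scores.append(forward(crop))
        scores.append(forward(crop[:, ::-1]))         # horizontal mirror
    return np.mean(scores, axis=0)                    # fused 1000-dim descriptor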

  15. “Quasi-semantic” descriptors from deep learning and ImageNet [Krizhevsky et al., 2012]
  • 7 hidden layers, 650K units, 630M connections, 60M parameters
  • GPU implementation (50× speed-up over CPU)
  • Trained on two GPUs for a week
  [Diagram: network architecture with the last layers labelled fc5, fc6, fc7 and the 1000-class output b1000]

  16. “Quasi-semantic” descriptors from deep learning and ImageNet
  • Deep learning based descriptors, computed by LIG using the Berkeley caffe tool [Jia et al., 2013] (a minimal extraction sketch follows below):
  - LIG/caffe_fc7b_4096: 4096 values of the last hidden layer (non convolutional)
  - LIG/caffe_fc6b_4096: 4096 values of the last-but-one hidden layer (non convolutional)
  - LIG/caffe_fc5b_43264: 43264 values of the last-but-two hidden layer (convolutional, 13×13×256)
  • Not strictly semantic, as these are not classification scores, but close to the semantic level
  • Expected to perform better than the last (output) layer:
  - No (or less) information loss due to the targeting of different and/or unrelated target concepts
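A minimal pycaffe sketch of extracting such a hidden-layer descriptor; the layer names ('fc6', 'fc7') follow the BVLC reference model, and the prototxt/caffemodel paths and preprocessing are placeholders, not the exact LIG setup.

import numpy as np
import caffe

# Placeholder model files for a BVLC reference (AlexNet-like) network.
net = caffe.Net('deploy.prototxt', 'bvlc_reference.caffemodel', caffe.TEST)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))          # HWC -> CHW

def hidden_layer_descriptor(image_path, layer='fc7'):
    """Return the activations of one hidden layer as a flat descriptor."""
    image = caffe.io.load_image(image_path)
    net.blobs['data'].data[0] = transformer.preprocess('data', image)
    net.forward()
    return net.blobs[layer].data[0].copy().ravel()    # e.g. 4096 values for fc7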

  17. Local semantic descriptors trained on TRECVid 2003
  • Scores for 15 TRECVid 2003 concepts (sky, building, water, greenery ...) on image patches, trained using local annotations [Ayache et al., IVP 2007]
  - LIG/percepts*: computed at various resolutions in a pyramidal way, aggregated by concatenation (see the sketch below)
  - Computed using local color and texture descriptors
  • No longer state of the art
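A hedged sketch of the pyramidal aggregation: patch-level scores for the 15 concepts are pooled per cell at several grid resolutions and the cell vectors are concatenated. The grid levels, the mean pooling, and the assumption that the grids divide the patch map evenly are illustrative, not the cited configuration.

import numpy as np

def pyramidal_percepts(patch_scores, levels=(1, 2, 4)):
    """patch_scores: (rows, cols, 15) map of local concept scores."""
    rows, cols, k = patch_scores.shape
    parts = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                cell = patch_scores[i * rows // n:(i + 1) * rows // n,
                                    j * cols // n:(j + 1) * cols // n]
                parts.append(cell.mean(axis=(0, 1)))   # k scores per cell
    return np.concatenate(parts)     # (1 + 4 + 16) * 15 = 315 values here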

  18. Experiments
  • Use of SIN 2013 development data only (no tuning on SIN 2013 test data) and various components using ImageNet annotated data → D type submissions
  • Evaluation on SIN 2013 and 2014 test data
  • Use of a combination of kNN and MSVM for classification [Safadi, RIAO 2010]
  • Use of uploader information: multiplicative factor at the video level, weighted at 10%, provided by Eurecom [Niaz, TV 2012]
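The classifier combination and uploader prior might look like the following sketch, where kNN and MSVM scores are averaged and then corrected by a per-video uploader factor damped to a 10% contribution; the exact form of the uploader factor in [Niaz, TV 2012] is not given here, so it is an assumption.

import numpy as np

def combine_scores(knn_scores, msvm_scores, uploader_factor, w_uploader=0.1):
    """All inputs are per-shot arrays for one concept; uploader_factor holds,
    for each shot, the uploader prior of the video it belongs to."""
    base = 0.5 * (knn_scores + msvm_scores)           # kNN + MSVM combination
    # Multiplicative, video-level uploader correction weighted at 10%.
    return base * ((1 - w_uploader) + w_uploader * uploader_factor)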

  19. Performance of “low-level” descriptors. [Bar chart: MAP 2013 and MAP 2014 (scale 0 to 0.3) for 13 low-level “engineered” descriptors, including LIRIS OC-LBP, LIG opponent SIFT, CEALIST pyramidal bag of SIFT, ETIS color (lab BoW) and texture (wavelets), LISTIC SIFT with retina masking, and ETIS VLAT (vector of locally aggregated tensors).]

  20. Performance of semantic descriptors. [Bar chart: MAP 2013 and MAP 2014 (scale 0 to 0.3) for the 13 “low-level” “engineered” descriptors and the semantic descriptors: Xerox semantic features ILSVRC 1000, Xerox semantic features ImageNet 10174, Xerox semantic features (fused), Caffe semantic features LIG, Caffe semantic features Eurecom, Caffe semantic features output layer (fused), Caffe quasi-semantic hidden layer 5 (43264), Caffe quasi-semantic hidden layer 6 (4096), Caffe quasi-semantic hidden layer 7 (4096), Caffe semantic features last two hidden layers…, LIG/concepts first iteration (includes Xerox), and LIG/concepts second iteration (includes Xerox).]
