
Combining text/image in WikipediaMM task 2009
Christophe Moulin, Cécile Barat, Cédric Lemaître, Mathias Géry, Christophe Ducottet, Christine Largeron
Laboratoire Hubert Curien (LaHC), Saint-Étienne, France
October 1st 2009

Outline
1 Model overview
  - Textual vector space model
  - Visual vocabulary
  - Combining text and image modalities
2 Experiments
3 Conclusion and future work

Model overview
A textual/visual model based on the bag-of-words approach.
[Diagram: documents → indexing (bag-of-words approach) → combining, with weights α and (1 − α)]

Textual vector space model: textual vocabulary creation
Main steps of the textual bag-of-words creation:
stop words filtering → Porter stemming → bag of words creation
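
The three steps named on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the stop list is a tiny made-up one, and `toy_stem` is a crude suffix stripper standing in for Porter stemming (it is not the Porter algorithm, which a real system would take from a library).

```python
import re
from collections import Counter

# Tiny illustrative stop list (assumption; the slides do not give the real one).
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def toy_stem(word):
    """Crude suffix stripper standing in for Porter stemming (NOT the Porter algorithm)."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_words(text, stem=toy_stem):
    """Tokenize, drop stop words, stem, and count the remaining terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [stem(t) for t in tokens if t not in STOP_WORDS]
    return Counter(kept)


bag = bag_of_words("The bridges and the bridge in the paintings")
print(bag)
```

The `stem` parameter is left pluggable so that a real Porter stemmer could be passed in without changing the rest of the pipeline.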

Textual vector space model: textual vector weighting
Salton-based tf.idf weighting [1]: bag of words → vector of tf.idf weights
$w_{i,j} = tf_{i,j} \cdot idf_j$ [2]
- $tf_{i,j}$: representativeness (of term $j$ in document $i$)
- $idf_j$: discrimination power (of term $j$)
[1]: Salton et al., A vector space model for automatic indexing, 1975
[2]: Robertson et al., Okapi at TREC-3, 1994
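
A minimal sketch of this weighting follows. The slide cites an Okapi-based formulation [2] but does not give its exact form, so the code assumes the simplest common variant: tf is the raw term count and idf is log(N / df). Both choices are assumptions made for illustration, as are the two toy documents.

```python
import math
from collections import Counter


def idf_weights(doc_bags):
    """idf_j = log(N / df_j), a common variant assumed here for illustration."""
    n_docs = len(doc_bags)
    df = Counter(term for bag in doc_bags for term in bag)
    return {term: math.log(n_docs / count) for term, count in df.items()}


def tfidf_vector(bag, idf):
    """w_{i,j} = tf_{i,j} * idf_j, with tf taken as the raw count."""
    return {term: tf * idf[term] for term, tf in bag.items() if term in idf}


docs = [
    Counter({"bridge": 2, "river": 1}),
    Counter({"bridge": 1, "painting": 3}),
]
idf = idf_weights(docs)
weights = tfidf_vector(docs[1], idf)
print(weights)
```

A term that occurs in every document ("bridge" here) gets idf 0 and so carries no weight, which is the "discrimination power" idea in its plainest form.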

Textual vector space model: exploiting the text around an image
Two sources of text: metadata + text extracted from the original Wikipedia articles
- metadata of the Wikipedia image used in ImageCLEFwiki
- original Wikipedia article (n characters around the image)

Visual vocabulary: visual representation
Similar to the text representation, using a visual codebook [3].
- Visual vocabulary creation: descriptors → visual vocabulary
- Image representation: descriptors → projection → bag of visual words → vector of tf.idf weights
[3]: Jurie et al., Creating efficient codebooks for visual recognition, 2005
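
The projection step on this slide can be illustrated as a nearest-neighbour assignment: each descriptor is mapped to the closest visual word and the assignments are counted. The sketch assumes a Euclidean distance and a hand-written two-dimensional codebook; how the actual vocabulary was built is not stated on the slide and is not reproduced here.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def bag_of_visual_words(descriptors, codebook):
    """Count how many descriptors fall on each visual word."""
    return Counter(nearest_word(d, codebook) for d in descriptors)


# Hypothetical 2-D codebook and descriptors (real ones have 6 or 128 dimensions).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (4.0, 6.0), (1.1, 0.8)]
visual_bag = bag_of_visual_words(descriptors, codebook)
print(visual_bag)
```

The resulting counts play the role of term frequencies, so the same tf.idf weighting used for text can then be applied to them.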

Visual vocabulary: visual features computation
Two different descriptors are used:
- regular partitioning (16 × 16 cells):
  - meanstd (6 dimensions; 9350 visual words)
  - sift 2 (128 dimensions; 9630 visual words)
- interest regions based on the MSER detector:
  - sift 1 (128 dimensions; 9303 visual words)
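
One plausible reading of the meanstd descriptor is sketched below: for each cell of a regular grid, the mean and the standard deviation of each of three colour channels, giving 6 dimensions. The slide only states the dimensionality, so the channel layout, the use of the population standard deviation, and the grid size used in the toy example are all assumptions.

```python
from statistics import mean, pstdev


def meanstd_descriptor(cell):
    """6-dim descriptor: per-channel means followed by per-channel standard deviations."""
    channels = list(zip(*cell))  # three tuples: all R, all G, all B values
    return [mean(c) for c in channels] + [pstdev(c) for c in channels]


def grid_cells(image, rows, cols):
    """Split an image (list of pixel rows) into rows x cols rectangular cells."""
    height, width = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            yield [
                image[y][x]
                for y in range(r * height // rows, (r + 1) * height // rows)
                for x in range(c * width // cols, (c + 1) * width // cols)
            ]


# 4x4 toy image: left half black, right half white, split into a 2x2 grid here.
image = [[(0, 0, 0), (0, 0, 0), (255, 255, 255), (255, 255, 255)] for _ in range(4)]
descriptors = [meanstd_descriptor(cell) for cell in grid_cells(image, 2, 2)]
print(descriptors)
```

Calling `grid_cells(image, 16, 16)` on a full-size image would give the 16 × 16 partitioning, if that figure refers to the number of cells rather than to their pixel size.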

Combining text and image modalities: score matching
Distance computed between query and document vectors:
- score 1: query weighted with tf, document weighted with tf.idf
- score 2: query weighted with tf.idf, document weighted with tf.idf
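
The difference between the two scores lies only in how the query vector is weighted. The sketch below makes that concrete using cosine similarity as the matching function; the slide does not name the measure, so cosine is an assumption, and the idf values and term counts are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


idf = {"bridge": 0.5, "river": 2.0}      # hypothetical idf values
query_tf = {"bridge": 1, "river": 1}     # hypothetical query term counts
doc_tf = {"bridge": 3, "river": 1}       # hypothetical document term counts

query_tfidf = {t: tf * idf[t] for t, tf in query_tf.items()}
doc_tfidf = {t: tf * idf[t] for t, tf in doc_tf.items()}

score1 = cosine(query_tf, doc_tfidf)     # query: tf,     document: tf.idf
score2 = cosine(query_tfidf, doc_tfidf)  # query: tf.idf, document: tf.idf
print(score1, score2)
```

With score 2 the rare query term ("river") dominates the match, whereas with score 1 both query terms count equally on the query side.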

Combining text and image modalities: model overview
Linear combination of textual and visual scores, with weights α and (1 − α).
α is fixed globally, using ImageCLEFwiki 2008.
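
A minimal sketch of the linear combination follows. It assumes that α weights the visual score and (1 − α) the textual score, which is consistent with the small α values reported in the results but is not spelled out on the slide. Treating a document that is missing from one modality as scoring 0 there is also an assumption, and all scores are hypothetical.

```python
def combine(text_scores, visual_scores, alpha):
    """alpha * visual + (1 - alpha) * text for every document seen in either modality."""
    docs = set(text_scores) | set(visual_scores)
    return {
        d: alpha * visual_scores.get(d, 0.0) + (1 - alpha) * text_scores.get(d, 0.0)
        for d in docs
    }


text_scores = {"doc1": 0.80, "doc2": 0.78}                  # hypothetical scores
visual_scores = {"doc1": 0.10, "doc2": 0.90, "doc3": 0.40}  # hypothetical scores
combined = combine(text_scores, visual_scores, alpha=0.025)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```

Even with a small α, the visual score can reorder documents whose textual scores are close (doc2 overtakes doc1 here), and documents without any textual match (doc3) still enter the ranking.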

Experiments: global results
rank  participant/score  text      image              map     num ret  num rel ret
1     deuceng            TXT       -                  0.2397  43052    1351
5     lahc/score 2       100 char  meanstd (α=0.025)  0.2178  44993    1213
6     lahc/score 2       50 char   meanstd (α=0.025)  0.2148  44993    1218
14    lahc/score 2       metadata  sift 2 (α=0.084)   0.1903  44993    1212
15    lahc/score 2       100 char  -                  0.1890  38004    1205
16    lahc/score 2       50 char   -                  0.1880  37041    1198
20    lahc/score 2       metadata  meanstd (α=0.025)  0.1845  44993    1208
21    lahc/score 2       metadata  sift 1 (α=0.012)   0.1807  44995    1200
24    lahc/score 2       metadata  meanstd (α=0.015)  0.1792  44993    1213
33    lahc/score 2       metadata  -                  0.1667  35611    1192
44    lahc/score 1       metadata  -                  0.1432  35611    1164
52    lahc/score 2       metadata  sift 2             0.0365  619      142
53    lahc/score 2       metadata  meanstd            0.0338  574      76
54    lahc/score 2       metadata  sift 1             0.0321  637      120
57    sztaki             -         IMG                0.0068  44993    80
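
The relative gains quoted in the later slides can be checked directly from the MAP values in this table. The short computation below uses only numbers from the table; the helper name `relative_gain` is introduced here for illustration.

```python
def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# MAP values copied from the results table (all score 2 runs).
map_metadata = 0.1667          # text: metadata, no image
map_100char = 0.1890           # text: 100 char, no image
map_100char_meanstd = 0.2178   # text: 100 char, image: meanstd

gain_text = relative_gain(map_metadata, map_100char)
gain_visual = relative_gain(map_100char, map_100char_meanstd)
print(f"additional text: +{gain_text:.1f}%")
print(f"visual information: +{gain_visual:.1f}%")
```

On these figures the gain from additional text comes to about 13% and the gain from visual information to about 15%; the slides quote 15% for both.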

Experiments: textual results
[Plot comparing four runs: score 1 (map: 0.1432), score 2 (map: 0.1667), score 2 50 char (map: 0.1880), score 2 100 char (map: 0.1890)]
Improvements provided by additional text (15%)

Experiments: textual+visual results
[Plot comparing four runs: score 2 (map: 0.1667), score 2 sift 1, α=0.012 (map: 0.1807), score 2 meanstd, α=0.025 (map: 0.1845), score 2 sift 2, α=0.084 (map: 0.1903)]
sift 2 > meanstd > sift 1

Experiments: best results
[Plot comparing four runs: score 2 50 char (map: 0.1880), score 2 100 char (map: 0.1890), score 2 50 char + meanstd (map: 0.2148), score 2 100 char + meanstd (map: 0.2178)]
Improvements provided by visual information (15%)

Conclusion
Improvement of last year's model. What works:
- Text around the image in the original Wikipedia articles (+15%).
- Addition of visual features (MSER+sift): color/texture complementarity.
- Text-image combination (+15%).

Future work
- Combination with more than one visual descriptor.
- Other fusion methods.
- Learn α for each query.
