Cross Language Image Retrieval ImageCLEF 2010 Henning Müller 1 , Theodora Tsikrika 2 , Steven Bedrick 3 , Barbara Caputo 4 , Henrik Christensen 5 , Marco Fornoni 4 , Mark Huiskes 6 , Jayashree Kalpathy-Cramer 3 , Jana Kludas 7 , Stefanie Nowak 8 , Adrian Popescu 9 , Andrzej Pronobis 10 1 University of Applied Sciences Western Switzerland (HES-SO), Sierre, Switzerland 2 Centrum Wiskunde & Informatica, Amsterdam, The Netherlands 3 Oregon Health and Science University (OHSU), Portland, OR, USA 4 Idiap Research Institute, Martigny, Switzerland 5 Georgia Institute of Technology, Atlanta, USA 6 Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands 7 University of Geneva, Switzerland 8 Fraunhofer Institute for Digital Media Technology, Ilmenau, Germany 9 Institut Telecom/Telecom Bretagne, Brest, France 10 Centre for Autonomous Systems, The Royal Institute of Technology, Stockholm, Sweden
ImageCLEF History
ImageCLEF 2010 • General overview o news, participation, management • Tasks o Medical Image Retrieval o Wikipedia Retrieval o Photo Annotation o Robot Vision • Conclusions
News - ImageCLEF Book! ImageCLEF: Experimental Evaluation in Visual Information Retrieval The Information Retrieval Series, Vol. 32 Müller, H.; Clough, P.; Deselaers, Th.; Caputo, B. (Eds.) 1st Edition, 2010, 495 pages Contents • Basic concepts (6 chapters) history, datasets, topic development, relevance assessments, evaluation, fusion approaches • Task reports (7 chapters) • Participants' reports (11 chapters) • External perspectives on ImageCLEF (3 chapters)
News - ImageCLEF @ ICPR! • ImageCLEF contest @ ICPR 2010 o ICPR: major event in pattern recognition (Aug 2010) o ImageCLEF contest: Oct 2009 - April 2010 o ImageCLEF 2009 test collections o 4 tasks photo annotation robot vision information fusion for medical image retrieval interactive photo retrieval (showcase event) o 76 registrations, 30 submitted results, 14 presented o Half had not previously participated in ImageCLEF o Largest contest at ICPR!
News - ImageCLEF 2010 • Medical Image Retrieval o new subtask: modality detection o larger image collection, more case-based topics • Wikipedia Retrieval o new, larger image collection o multilingual annotations and topics o Wikipedia articles containing the images provided • Photo Annotation o new concepts added o crowdsourcing for image annotation o multi-modal approaches • Robot Vision o new image collection o unknown places as category
Participation • Total: o 2010: 112 groups registered, 47 submitted results o 2009: 84 groups registered, 40 submitted results • Tasks o Medical Image Retrieval: 16 groups o Wikipedia Retrieval: 13 groups o Photo Annotation: 17 groups o Robot Vision: 7 groups
ImageCLEF Management • Online management system for participants o registration, collection access, result submission • ImageCLEF web site o Unique access point to all information on tasks & events o Access to test collections from previous years o Use of a content-management system so that all 15 organisers can edit directly o Highly appreciated! 2000-3000 unique visits per month, >10,000 page views, very international access http://www.imageclef.org/
Medical Image Retrieval Task
Tasks proposed • Modality detection task o purely visual task, training set with modalities given o one of seven modalities had to be assigned to all images • Image-based retrieval task o clear information need for a single image o topics are based on a survey, 3 languages, example images • Case-based retrieval task o full case description from teaching file as example but without diagnosis, including several image examples o unit for retrieval is a complete case or article, closer to clinical routine
Setup • Database with journal articles and 77,506 images o very good annotations o all in English • Image-based topics generated from a survey among clinicians using a retrieval system o OHSU, Portland, OR o selection based on available images • Case-based topics used a teaching file as source • Relevance judgements performed by clinicians in Portland, OR, USA o double judgements to assess inter-judge agreement and ambiguity o several sets of qrels, but the system ranking remains stable
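The stability claim above (system rankings change little across qrel sets) is typically checked with a rank correlation coefficient such as Kendall's tau. A minimal sketch, assuming each ranking is a list of system identifiers ordered by performance and that there are no ties (this is an illustration, not the organisers' actual analysis code):

```python
def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two orderings of the same systems.

    rank_a, rank_b: lists of system identifiers, best system first.
    Returns a value in [-1, 1]; 1 means identical rankings.
    """
    pos_b = {s: i for i, s in enumerate(rank_b)}
    n = len(rank_a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # The pair (rank_a[i], rank_a[j]) is concordant if rank_b
            # orders it the same way.
            if pos_b[rank_a[i]] < pos_b[rank_a[j]]:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

A tau close to 1 between rankings produced with different qrel sets supports the "ranking remains stable" observation.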
Participation • 51 registrations, 16 groups submitting results o AUEB (Greece) o Bioingenium (Colombia)* o Computer Aided Medical Procedures (Germany)* o Gigabioinformatics (Belgium)* o IRIT (France) o ISSR (Egypt) o ITI, NIH (USA) o MedGIFT (Switzerland) o OHSU (USA) o RitsMIP (Japan)* o Sierre, HES-SO (Switzerland) o SINAI (Spain) o UAIC (Romania)* o UESTC (China)* o UIUC-IBM (USA)* o Xerox (France)* o *=new groups • Fusion task at ICPR with another five participants
Example of a case-based topic Immunocompromised female patient who received an allogeneic bone marrow transplantation for acute myeloid leukemia. The chest X-ray shows a left retroclavicular opacity. On CT images, a ground glass infiltrate surrounds the round opacity. CT1 shows a substantial nodular alveolar infiltrate with a peripheral anterior air crescent. CT2, taken after 6 months of antifungal treatment, shows a residual pulmonary cavity with thickened walls.
Results • Modality detection task (purely visual) was very popular o performance of over 90% is very high o CT and MRI are confused with each other (hard task) • Text-based retrieval is much better than visual retrieval for both image-based and case-based topics o difference is smaller for the case-based topics o more research on visual techniques needs to be fostered • Early precision can be improved using visual techniques • Fusion of visual and textual retrieval remains hard but can improve performance o fusion works really well when different systems are used • Interactive retrieval and relevance feedback are rarely used o ICPR session on interactive retrieval was even cancelled
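The fusion point above can be illustrated with a minimal late-fusion sketch: a CombSUM-style weighted linear combination of min-max normalised text and visual scores. The weights and normalisation here are illustrative assumptions, not the method used by any particular participant:

```python
def normalise(scores):
    """Min-max normalise a {doc_id: score} dict to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 0.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def fuse(text_scores, visual_scores, w_text=0.7, w_visual=0.3):
    """Weighted linear combination (CombSUM-style) of two runs.

    Documents missing from one run contribute a score of 0 for
    that modality. Weights are illustrative, not tuned values.
    """
    t, v = normalise(text_scores), normalise(visual_scores)
    docs = set(t) | set(v)
    fused = {d: w_text * t.get(d, 0.0) + w_visual * v.get(d, 0.0)
             for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Weighting text more heavily reflects the observation above that text-based retrieval dominates; the visual score then mainly reranks early precision.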
Wikipedia Retrieval Task
Wikipedia Retrieval Task • History: o 2008-2009: wikipediaMM task @ ImageCLEF o 2006-2007: MM track @ INEX • Description: o ad-hoc image retrieval o collection of Wikipedia images large-scale heterogeneous user-generated multilingual annotations o diverse multimedia information needs • Aim: o investigate mono-media and multi-modal retrieval approaches focus on fusion/combination of evidence from different modalities o attract researchers from both text and visual retrieval communities o support participation through provision of appropriate resources
Wikipedia Retrieval Collection • Image collection o 237,434 Wikipedia images o wide variety, global scope • Annotations o user-generated highly heterogeneous, varying length, noisy o semi-structured o multi-lingual (English, German, French) 10% of images with annotations in 3 languages 24% of images with annotations in 2 languages 62% of images with annotations in 1 language 4% of images with annotations in an unidentified language or no annotations • Wikipedia articles containing the images in the collection • Low-level features provided by CEA-LIST o cime: border/interior classification algorithm o telp: texture + colour o bag of visual words
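The bag-of-visual-words feature listed above can be sketched as follows: local descriptors are quantised to their nearest codebook centroid and counted into a histogram. This is a generic NumPy illustration of the technique, not the actual CEA-LIST implementation (whose codebook and descriptors are not described here):

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """L1-normalised bag-of-visual-words histogram.

    descriptors: (n, d) array of local descriptors from one image.
    codebook:    (k, d) array of visual-word centroids.
    Returns a length-k histogram summing to 1.
    """
    # Squared Euclidean distance from every descriptor to every centroid.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    # Assign each descriptor to its nearest visual word.
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

In practice the codebook would be learned offline (e.g. with k-means over descriptors sampled from the collection), and the histograms compared with a standard vector-space or histogram distance.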
Wikipedia Retrieval Topics
<topic>
  <number>68</number>
  <title xml:lang="en">historic castle</title>
  <title xml:lang="de">historisches schloss</title>
  <title xml:lang="fr">château fort historique</title>
  <image>3691767116_caa1648fee.jpg</image>
  <image>4155315506_545e3dc590.jpg</image>
  <narrative>We like to find pictures of historic castles. The castle should be of the robust, well-fortified kind. Palaces and chateaus are not relevant.</narrative>
</topic>
• topics range from easy (e.g. 'postage stamps') to difficult, highly semantic topics (e.g. 'paintings related to cubism')
• challenging for current state-of-the-art retrieval algorithms
• topic statistics: number of topics: 70; average # of images/topic: 1.68; average # of terms/topic: 2.7; average # of relevant images: 252.3
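A topic in this format can be read with a few lines of standard-library XML parsing. This sketch assumes the element names shown in the example above; the official topic DTD may differ in details:

```python
import xml.etree.ElementTree as ET

# Sample topic in the format of the example slide (one image kept short).
TOPIC = """<topic>
  <number>68</number>
  <title xml:lang="en">historic castle</title>
  <title xml:lang="de">historisches schloss</title>
  <title xml:lang="fr">château fort historique</title>
  <image>3691767116_caa1648fee.jpg</image>
  <image>4155315506_545e3dc590.jpg</image>
  <narrative>We like to find pictures of historic castles.</narrative>
</topic>"""

# ElementTree expands xml:lang to its full namespace form.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

root = ET.fromstring(TOPIC)
number = root.findtext("number").strip()
titles = {t.get(XML_LANG): t.text.strip() for t in root.findall("title")}
images = [i.text.strip() for i in root.findall("image")]
narrative = root.findtext("narrative").strip()
```

The per-language titles are what make the multilingual runs in the next slide possible: a monolingual run uses one `title` element, a multilingual run combines several.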
Wikipedia Retrieval Participation • 43 groups registered • 13 groups submitted a total of 127 runs o by modality: 48 textual, 7 visual, 72 mixed o by language: 41 monolingual, 79 multilingual o 23 with relevance feedback, 18 with query expansion, 1 with both
Wikipedia Retrieval Results Conclusions: • best performing run: a multi-modal, multi-lingual approach • 8 groups with mono-media and multi-modal runs o for 4 groups multi-modal runs outperform mono-media runs o combination of modalities remains a challenge • many (successful) query/document expansion submissions • topics with named entities are easier and benefit from textual approaches • topics with semantic interpretation and visual variation are more difficult
Photo Annotation Task
Task Description • Automated annotation of 93 visual concepts in photos • Flickr photos selected by interestingness (MIR Flickr set): o Training set: 8,000 photos + Flickr user tags + EXIF data + ground truth o Test set: 10,000 photos + Flickr user tags + EXIF data • 3 configurations: o Textual information (EXIF tags, Flickr user tags) o Visual information (photos) o Multi-modal information (all) • Evaluation: o Average Precision (AP) o Example-based F-measure (F-ex) o Ontology Score with Flickr Context Similarity (OS-FCS)
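The first two evaluation measures above can be sketched in a few lines; this is a minimal illustration assuming binary ground truth per concept, not the official evaluation code (which also handles ties and the OS-FCS ontology score):

```python
def average_precision(ranked, relevant):
    """AP for one concept: photos ranked by classifier confidence,
    `relevant` is the set of photos truly showing the concept."""
    hits, precisions = 0, []
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / len(relevant) if relevant else 0.0

def example_based_f(predicted, truth):
    """Example-based F-measure for one photo: F1 between the
    predicted and ground-truth concept sets."""
    if not predicted and not truth:
        return 1.0
    tp = len(predicted & truth)
    if tp == 0:
        return 0.0
    p, r = tp / len(predicted), tp / len(truth)
    return 2 * p * r / (p + r)
```

AP is averaged over the 93 concepts (mean average precision), while the example-based F-measure is averaged over the 10,000 test photos, so the two reward different behaviour: per-concept ranking quality versus per-photo label-set accuracy.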