iCLEF 2009 overview
Tags: image_search, multilinguality, interactivity, log_analysis, web2.0
Julio Gonzalo, Víctor Peinado, Paul Clough & Jussi Karlgren
CLEF 2009, Corfu
What CLIR researchers assume: User needs information. Machine searches. User is happy (or not).
But finding is a matter of two: one fast and stupid, one smart and slow. Room for collaboration!
“Users screw things up”
• Can’t be reset
• Differences between systems disappear
• Differences between interactive systems too!
• Who needs QA systems when you have a search engine and a user?
But CLIR is different
Help!
iCLEF methodology: hypothesis-driven
• Hypothesis
• Reference & contrastive systems, topics, users
• Latin-square pairing between system/topic/user
• Features:
  • Hypothesis-based (vs. operational)
  • Controlled (vs. ecological)
  • Deductive (vs. inductive)
  • Sound
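The Latin-square pairing mentioned on this slide can be illustrated with a minimal sketch. It assumes equal numbers of users, topics and systems and uses a simple cyclic square; the function names and the example labels are hypothetical, and the actual iCLEF designs may have used different block sizes.

```python
def latin_square(n):
    """Cyclic n x n Latin square: cell (row, col) holds (row + col) mod n."""
    return [[(row + col) % n for col in range(n)] for row in range(n)]


def assign_systems(users, topics, systems):
    """Map each (user, topic) pair to a system.

    Assumption of this sketch: as many users and topics as systems, so every
    user sees every system exactly once and every topic is searched with
    every system exactly once.
    """
    n = len(systems)
    if len(users) != n or len(topics) != n:
        raise ValueError("sketch needs equal numbers of users, topics and systems")
    square = latin_square(n)
    return {
        (users[row], topics[col]): systems[square[row][col]]
        for row in range(n)
        for col in range(n)
    }


# Hypothetical labels, for illustration only.
plan = assign_systems(
    ["user1", "user2", "user3"],
    ["topic1", "topic2", "topic3"],
    ["reference", "contrastive_a", "contrastive_b"],
)
for (user, topic), system in sorted(plan.items()):
    print(user, topic, system)
```

Balancing systems across both users and topics in this way is what lets a difference in outcome be attributed to the system rather than to a particular user or topic.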
iCLEF 2001-2005: tasks
On newswire:
• Cross-Language Document Selection
• Cross-Language query formulation and refinement
• Cross-Language Question Answering
On image archives:
• Cross-Language Image search
Practical outcome!
iCLEF 2001-2005: problems
• Unrealistic search scenario, user sample opportunistic
• Experimental design not cost-effective
• Only one aspect of CLIR at a time
• High cost of recruiting, training, observing users
Pick a document for “saffron”
Pick an illustration for “saffron”
Flickr
iCLEF 2006
Topics:
• Ad hoc: find as many photographs of (different) European parliaments as possible.
• Creative: find five illustrations for this article about saffron in Italy.
• Visual: what is the name of the beach this crab is lying on?
Methodology:
• Participants must propose their own methodology and experiment design.
Explored issues
User’s behaviour:
• How do users deal with native/passive/unknown languages?
• Do they actually use CLIR facilities when available?
User’s perceptions:
• Satisfaction (all tasks)
• Completeness (creative, ad hoc)
• Quality (creative)
Search effectiveness:
• How many facets were retrieved? (creative, ad hoc)
• Was the image found? (visual)
iCLEF 2008/2009
• Produce reusable dataset: search log analysis task.
• Much larger set of users: online game.
iCLEF 2008/2009: Log Analysis
Online game: see this image? Find it! (in any of six languages)
• Game interface features multilingual (ML) search assistance
• Users register with a language profile
Dataset: rich search log
• All search interactions
• Explicit success/failure
• Post-search questionnaires
Queries
• Easy to find with the appropriate tags (typically 3 tags)
• Hint mechanism (first target language, then tags)
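To make the shape of such a log concrete, here is a minimal sketch of one search session with the hint order described on the slide (target language first, then tags). All class and field names are hypothetical, and releasing one tag per hint is an assumption; the slide only states the order of the hints.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:
    image_id: str
    language: str      # language of the image annotations
    tags: List[str]    # tags that retrieve the image (typically 3)


@dataclass
class Session:
    user: str
    target: Target
    queries: List[str] = field(default_factory=list)  # logged search interactions
    hints_given: int = 0
    found: Optional[bool] = None  # explicit success/failure; None while open

    def next_hint(self) -> Optional[str]:
        """First hint reveals the target language, later hints one tag each (assumed)."""
        if self.hints_given == 0:
            hint = f"language: {self.target.language}"
        elif self.hints_given <= len(self.target.tags):
            hint = f"tag: {self.target.tags[self.hints_given - 1]}"
        else:
            return None  # nothing left to reveal
        self.hints_given += 1
        return hint


def success_rate(sessions: List[Session]) -> float:
    """Share of closed sessions in which the image was found (0.0 if none are closed)."""
    closed = [s for s in sessions if s.found is not None]
    if not closed:
        return 0.0
    return sum(1 for s in closed if s.found) / len(closed)
```

A record of this kind keeps the three ingredients listed above together: the interactions, the explicit outcome, and (not modelled here) the post-search questionnaire.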
Simultaneous search in six languages
Boolean search with translations
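As an illustration of what a Boolean search with translations might look like, the sketch below ORs each query term with its translations and ANDs the resulting groups. This AND-of-ORs structure, the function name and the toy dictionary are assumptions for illustration; the slide does not specify how the game interface actually built its queries.

```python
def boolean_query(terms, translations):
    """Build an AND-of-ORs query: each term is ORed with its translations."""
    groups = []
    for term in terms:
        # Keep the original term first and drop duplicate translations.
        variants = list(dict.fromkeys([term, *translations.get(term, [])]))
        if len(variants) > 1:
            groups.append("(" + " OR ".join(variants) + ")")
        else:
            groups.append(term)
    return " AND ".join(groups)


# Toy dictionary, hypothetical and for illustration only.
toy_translations = {
    "saffron": ["azafrán", "safran", "safran"],
    "field": ["campo"],
}
print(boolean_query(["saffron", "field"], toy_translations))
# prints: (saffron OR azafrán OR safran) AND (field OR campo)
```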
Relevance feedback
Assisted query translation
User profiles
User rank (Hall of Fame)
Group rank
Hint mechanism
Language skills bias in 2008 [charts: native languages (DE, EN, ES, FR, IT, NL, other); language skills in English (native, active, passive, unknown)]
Language skills bias in 2008: target language was for the user… active, passive or unknown [pie chart; shares shown: 31%, 55%, 14%]
Selection of topics (images)
• No English annotations (new for 2009)
• Not buried in search results
• Visual cues
• No named entities
Harvested logs
2008:
• 312 users / 41 teams
• 5101 complete search sessions
• Linguistics students, photography fans, IR researchers from industry and academia, monitored groups, other.
2009:
• 130 users / 18 teams
• 2410 complete search sessions
• CS & linguistics students, photography fans, IR researchers from industry and academia, monitored groups, other.
Language skills bias in 2009: target language was for the user… active, passive or unknown [pie chart; shares shown: 0%, 1%, 99%]
Log statistics
Distribution of users
Native languages / Language skills / Interface
Language skills (II): Spanish, English
Language skills (III): Dutch, German
Language skills (and IV): Italian, French
Participants (I): log analysis
University of Alicante
• Goal: correlation between lexical ambiguity in queries and search success
• Methodology: analysis of full search log
UAIC
• Goal: correlations between several search parameters and search success
• Methodology: own set of users, search log analysis
UNED
• Goal: correlation between search strategies and search success
• Methodology: analysis of full search log
SICS
• Goal: study confidence and satisfaction from search logs
• Methodology: analysis of full search log
Participants (II): other strategies
Manchester Metropolitan University
• Goal: focus on users’ trust and confidence to reveal their perceptions of the task.
• Methodology: own set of users, own set of queries, training, observational study, retrospective thinking aloud, questionnaires.
University of North Texas
• Goal: understanding challenges when searching images that have multilingual annotations.
• Methodology: own set of users, training, questionnaires, interviews, observational analysis.
Discussion
• 2008+2009 logs = “iCLEF legacy”
  • 442 users with heterogeneous language skills
  • 7511 search sessions with questionnaires
• iCLEF has been a success in terms of providing insights into interactive CLIR
• … and a failure in terms of gaining followers?
So long!
And now… the iCLEF Bender Awards