To Re-rank or to Re-query: Can Visual Analytics Solve This Dilemma?
E. Di Buccio (1), M. Dussin (1), N. Ferro (1), I. Masiero (1), G. Santucci (2), G. Tino (2)
(1) University of Padua, Padova, Italy
(2) Sapienza University of Rome, Rome, Italy
Second International Conference of the Cross Language Evaluation Forum, CLEF2011
September 21, 2011, Amsterdam, The Netherlands
IR System Failure Analysis
• Objective: understanding the factors affecting the performance of an IR system
• Problem: complexity of the analysis task
  ‐ Example: RIA Workshop [HarmanEt2009] (28 people, 6 weeks, 11-40 hours per topic)
• How to address this complexity?
[HarmanEt2009] Harman, D., Buckley, C.: Overview of the Reliable Information Access Workshop. Information Retrieval 12, 615-641 (2009)
Supporting Failure Analysis
• Provide analysts with
  ‐ Methodologies
  ‐ Tools
• Previous approaches
  ‐ Beadplots [BanksEt1999]
  ‐ Query Performance Analyzer [SormunenEt2002]
  ‐ VisualVectora [JärvelinEt2008]
  ‐ Potential for Personalization Curve [TeevanEt2010]
[BanksEt1999] Banks, D., Over, P., Zhang, N.-F.: Blind Men and Elephants: Six Approaches to TREC Data. Information Retrieval 1, 7-34 (1999)
[SormunenEt2002] Sormunen, E., Hokkanen, S., Kangaslampi, P., Pyy, P., Sepponen, B.: Query performance analyzer: a web-based tool for IR research and instruction. In Proceedings of SIGIR 2002, p. 450, ACM, New York (2002)
[JärvelinEt2008] Järvelin, K., Vähämöttönen, I., Keskustalo, H., Kekäläinen, J.: VisualVectora: An Interactive Visualization Tool for Cumulated Gain-based Retrieval Experiments. In Proceedings of ECIR '08, Glasgow, UK (2008)
[TeevanEt2010] Teevan, J., Dumais, S.T., Horvitz, E.: Potential for Personalization. ACM TOCHI, 17, 1-31 (2010)
Proposed Solution
• Visual Analytics-based approach
• Quantify gain/loss with respect to the optimal and the ideal ranking
Analytical Model
• Ranked result list representation
  ‐ Vector representation [JärvelinEt2002]
  ‐ GT: ground truth function (values in {0,1,…,k})
  ‐ DF: discounting function

  V     GT(V)   DF
  id1   3       3
  id2   1       1
  id3   2       1.26
  id4   3       1.50
  …     …       …

• Two analytical measures introduced:
  ‐ R_Pos: the relative position of the documents in V with respect to their position in the optimal ranking O
  ‐ Δ_Gain(i): the difference between DF at rank i of the experiment vector and of the optimal vector
[JärvelinEt2002] Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM TOIS, 20, 422-446 (2002)
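A minimal sketch of how R_Pos could be computed from the ground-truth grades of a ranked list. The slide does not give a formula, so the reading used here is an assumption: the optimal ranking O is taken to be the same documents sorted by decreasing grade, and a document is considered well placed when its rank falls inside the interval of ranks that its grade occupies in O. The function name `r_pos`, the 0-based ranks and the sign convention (positive when ranked above the interval, negative when below) are hypothetical choices made for illustration.

```python
def r_pos(gt_v):
    """Signed offset of each document from the rank interval that its
    relevance grade occupies in the optimal ranking O (0-based ranks).

    Assumption: O is gt_v sorted by decreasing grade."""
    optimal = sorted(gt_v, reverse=True)

    # First and last rank at which each grade appears in O.
    first, last = {}, {}
    for rank, grade in enumerate(optimal):
        first.setdefault(grade, rank)
        last[grade] = rank

    offsets = []
    for rank, grade in enumerate(gt_v):
        if rank < first[grade]:
            # Ranked above (earlier than) its optimal interval: positive offset.
            offsets.append(first[grade] - rank)
        elif rank > last[grade]:
            # Ranked below (later than) its optimal interval: negative offset.
            offsets.append(last[grade] - rank)
        else:
            # Inside its optimal interval.
            offsets.append(0)
    return offsets


# Grades of id1..id4 from the example table on the slide.
print(r_pos([3, 1, 2, 3]))  # -> [0, 2, 0, -2]
```

With these four documents, id2 sits two ranks too high and id4 two ranks too low, while id1 and id3 are already inside their optimal intervals.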
Analytical Model Visualisation

        Experiment vector V              Optimal vector O
  Rank  GT(V)  DF     DCG     Δ_Gain     GT(O)  DF     DCG
  1     3      3.00    3.00    0.00      3      3.00    3.00
  2     1      1.00    4.00   -2.00      3      3.00    6.00
  3     2      1.26    5.26   -0.63      3      1.89    7.89
  4     3      1.50    6.76    0.00      3      1.50    9.39
  5     2      0.86    7.62    0.00      2      0.86   10.25
  6     2      0.77    8.40    0.00      2      0.77   11.03
  7     3      1.07    9.47    0.36      2      0.71   11.74
  8     2      0.67   10.13    0.00      2      0.67   12.41
  9     0      0.00   10.13   -0.32      1      0.32   12.72
  10    1      0.30   10.43    0.00      1      0.30   13.02
  11    0      0.00   10.43    0.00      0      0.00   13.02
  12    3      0.84   11.27    0.84      0      0.00   13.02

Legend entries: ok, above, below; ok, loss, local gain
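The numbers on this slide can be reproduced with a short sketch. The discount used here is inferred from the slide's values and is an assumption: the grade at rank i (1-based) is divided by max(1, log2(i)), so ranks 1 and 2 are not discounted. The optimal vector is assumed to be the experiment's grades sorted in decreasing order. The function names are hypothetical.

```python
import math
from itertools import accumulate


def discounted_gains(grades):
    """Discounted gain per rank: grade / max(1, log2(rank)), rank starting at 1.

    Assumption: base-2 discount with no discount at ranks 1 and 2,
    inferred from the slide's values."""
    return [g / max(1.0, math.log2(rank))
            for rank, g in enumerate(grades, start=1)]


def delta_gain(grades):
    """Per-rank difference between the discounted gain of the experiment
    and that of the optimal vector (same grades, sorted decreasingly)."""
    df_exp = discounted_gains(grades)
    df_opt = discounted_gains(sorted(grades, reverse=True))
    return [e - o for e, o in zip(df_exp, df_opt)]


# GT(V) column from the slide.
gt_v = [3, 1, 2, 3, 2, 2, 3, 2, 0, 1, 0, 3]

dcg_exp = list(accumulate(discounted_gains(gt_v)))
dcg_opt = list(accumulate(discounted_gains(sorted(gt_v, reverse=True))))
deltas = delta_gain(gt_v)

print(round(dcg_exp[-1], 2))          # -> 11.27
print(round(dcg_opt[-1], 2))          # -> 13.02
print([round(d, 2) for d in deltas])
```

Negative Δ_Gain entries mark ranks where the experiment contributes less discounted gain than the optimal vector; positive entries mark ranks where a highly graded document appears later than it would in the optimal ranking.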
Failure Analysis Approach
• τ: Kendall Tau rank correlation among gain vectors
• Analysis through (τ_ideal-opt, τ_opt-exp) pairs
  ‐ High τ_ideal-opt and low τ_opt-exp: possible re-ranking
  ‐ Low or negative τ_ideal-opt: possible re-query
• Ranking curves: more in-depth investigation on a per-topic basis by examining the gap among ranking curves
• R_Pos and Δ_Gain: analysis on a per-document basis using the R_Pos and Δ_Gain vectors (e.g. examining a document by clicking on the corresponding entry)
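The pair-based rule above can be sketched as follows. The slides do not say which Kendall Tau variant is used or where "high" and "low" begin, so both are assumptions here: the sketch uses the tau-b variant (which handles tied gain values), and the thresholds `high` and `low` are hypothetical defaults chosen only so that the three case-study pairs in this deck fall on the side the slides assign them to.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall tau-b between two equal-length vectors (O(n^2) pair scan)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                 # tied in both vectors: ignored by tau-b
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = math.sqrt((concordant + discordant + only_y_tied)
                      * (concordant + discordant + only_x_tied))
    # Undefined when one vector is constant; NaN is returned in that case.
    return (concordant - discordant) / denom if denom else math.nan


def suggest(tau_ideal_opt, tau_opt_exp, high=0.8, low=0.5):
    """Map a (tau_ideal-opt, tau_opt-exp) pair to a hint.

    `high` and `low` are hypothetical thresholds, not values from the slides."""
    if tau_ideal_opt >= high and tau_opt_exp < low:
        return "re-rank"
    if tau_ideal_opt < high:
        return "re-query"
    return "no clear indication"


for pair in [(0.88, 0.07), (0.99, 0.24), (0.59, 0.45)]:
    print(pair, suggest(*pair))
```

The hint is only a starting point: the deck pairs it with a per-topic look at the ranking curves and a per-document look at R_Pos and Δ_Gain.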
Experimentation
• Experimentation carried out on TREC data
  ‐ Document corpora of the TREC7 Adhoc Test Collection
  ‐ Subset of the TREC7 Adhoc topics re-assessed in [JärvelinEt2002]
  ‐ Graded relevance judgments gathered in [JärvelinEt2002]
• DCG
  ‐ trec_eval implementation with log_x(i+1)
Case Study (re-ranking): (τ_ideal-opt, τ_opt-exp) = (0.88, 0.07)
Case Study (re-ranking): (τ_ideal-opt, τ_opt-exp) = (0.88, 0.07) and (0.99, 0.24)
Case Study (re-query): (τ_ideal-opt, τ_opt-exp) = (0.59, 0.45)
Concluding Remarks
• Visual Analytics integrated in IR Evaluation
  ‐ helps explore the quality of ranked result lists
  ‐ helps point out the location and the magnitude of ranking errors
• Future Work
  ‐ Extending the approach to the comparison of multiple experiments
  ‐ Allowing for more complex forms of interaction with the curves and the R_Pos and Δ_Gain vectors
  ‐ Automatic extraction of features from misplaced documents and visualization of the relationships among misplaced documents
Questions?